r - Can I pool imputed random effect model estimates using the mi package? -
it appears mi package has had pretty big rewrite @ point within past couple of years.
the "old" way of doing things well-outlined in following tutorial: http://thomasleeper.com/rcourse/tutorials/mi.html
the "new" way of doing things (sticking leeper's simulation demo) looks this:
#load mi library(mi) #set seed set.seed(10) #simulate data (with observations missing) x1 <- runif(100, 0, 5) x2 <- rnorm(100) y <- 2*x1 + 20*x2 + rnorm(100) mydf <- cbind.data.frame(x1, x2, y) mydf$x1[sample(1:nrow(mydf), 20, false)] <- na mydf$x2[sample(1:nrow(mydf), 10, false)] <- na # convert missing_data.frame mydf_mdf <- missing_data.frame(mydf) # impute mydf_imp <- mi(mydf_mdf) although function names have changed, pretty similar "old" way of doing things.
the biggest change (from vantage) replacement of following "old" functions
lm.mi(formula, mi.object, ...)
glm.mi(formula, mi.object, family = gaussian, ...)
bayesglm.mi(formula, mi.object, family = gaussian, ...)
polr.mi(formula, mi.object, ...)
bayespolr.mi(formula, mi.object, ...)
lmer.mi(formula, mi.object, rescale=false, ...)
glmer.mi(formula, mi.object, family = gaussian, rescale=false, ...).
previously, user compute model each imputed dataset using 1 of these functions , pool results using mi.pooled() (or coef.mi() if following leeper example).
in current version of mi (i have v1.0 installed), these last steps appear have been combined single function, pool(). pool() function appears read family , link function assigned variable during imputation process above , estimate model bayesglm using specified formula shown below.
# run models on imputed data , pool results summary(pool(y ~ x1 + x2, mydf_imp)) ## ## call: ## pool(formula = y ~ x1 + x2, data = mydf_imp) ## ## deviance residuals: ## min 1q median 3q max ## -1.98754 -0.40923 0.03393 0.46734 2.13848 ## ## coefficients: ## estimate std. error t value pr(>|t|) ## (intercept) -0.34711 0.25979 -1.336 0.215 ## x1 2.07806 0.08738 23.783 1.46e-13 *** ## x2 19.90544 0.11068 179.844 < 2e-16 *** ## --- ## signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (dispersion parameter gaussian family taken 0.7896688) ## ## null deviance: 38594.916 on 99 degrees of freedom ## residual deviance: 76.598 on 97 degrees of freedom ## aic: 264.74 ## ## number of fisher scoring iterations: 7 this looks coming close recovering our simulated beta values (2 , 20). in other words, behaving expected.
let's take larger set of data naively simulated random effect sake of getting grouping variable.
mydf2 <- data.frame(x1 = rep(runif(100, 0, 5), 20) ,x2 = rep(rnorm(100, 0, 2.5), 20) ,group_var = rep(1:20, each = 100) ,noise = rep(rnorm(100), 20)) mydf2$y <- 2*mydf2$x1 + 20*mydf2$x2 + mydf2$noise mydf2$x1[sample(1:nrow(mydf2), 200, false)] <- na mydf2$x2[sample(1:nrow(mydf2), 100, false)] <- na # convert missing_data.frame mydf2_mdf <- missing_data.frame(mydf2) show(mydf2_mdf) ## object of class missing_data.frame 2000 observations on 5 variables ## ## there 4 missing data patterns ## ## append '@patterns' missing_data.frame access corresponding pattern every observation or perhaps use table() ## ## type missing method model ## x1 continuous 200 ppd linear ## x2 continuous 100 ppd linear ## group_var continuous 0 <na> <na> ## noise continuous 0 <na> <na> ## y continuous 0 <na> <na> ## ## family link transformation ## x1 gaussian identity standardize ## x2 gaussian identity standardize ## group_var <na> <na> standardize ## noise <na> <na> standardize ## y <na> <na> standardize since missing_data.frame() appears intepreting group_var continuous, use change() function mi reassign "un" "unordered categorical" , proceed above.
mydf2_mdf <- change(mydf2_mdf, y = "group_var", = "type", = "un" ) # impute mydf2_imp <- mi(mydf2_mdf) now, unless version 1.0 of mi has removed functionality of previous versions (i.e. functionality available lmer.mi , glmer.mi), assume addition of random effect in formula should point pool() appropriate lme4 function. however, initial error message suggests not case.
# run models on imputed data , pool results summary(pool(y ~ x1 + x2 + (1|group_var), mydf2_imp)) ## warning in ops.factor(1, group_var): '|' not meaningful factors ## warning in ops.factor(1, group_var): '|' not meaningful factors ## error in if (prior.scale[j] < min.prior.scale) {: missing value true/false needed following warning message , extracting integers out of factor me estimate, results suggest pool() still estimating fixed-effect model bayesglm , holding attempted random-effect constant.
summary(pool(y ~ x1 + x2 + (1|as.numeric(as.character(group_var))), mydf2_imp)) ## ## call: ## pool(formula = y ~ x1 + x2 + (1 | as.numeric(as.character(group_var))), ## data = mydf2_imp) ## ## deviance residuals: ## min 1q median 3q max ## -1.93633 -0.69923 0.01073 0.56752 2.12167 ## ## coefficients: ## estimate std. error t value ## (intercept) 1.383e-01 2.596e+02 0.001 ## x1 1.995e+00 1.463e-02 136.288 ## x2 2.000e+01 8.004e-03 2499.077 ## 1 | as.numeric(as.character(group_var))true -3.105e-08 2.596e+02 0.000 ## pr(>|t|) ## (intercept) 1 ## x1 <2e-16 *** ## x2 <2e-16 *** ## 1 | as.numeric(as.character(group_var))true 1 ## --- ## signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (dispersion parameter gaussian family taken 0.8586836) ## ## null deviance: 5384205.2 on 1999 degrees of freedom ## residual deviance: 1713.9 on 1996 degrees of freedom ## aic: 5377 ## ## number of fisher scoring iterations: 4 my questions are:
- is possible generate pooled random effects estimates using
mi?, , - if yes, how?
you can specify fun argument pool() function change estimator. in case, summary(pool(y ~ x1 + x2 + (1|as.numeric(as.character(group_var))), data = mydf2_imp, fun = lmer)). may or may not work, legal syntax. if fails, can use complete function created completed data.frames, call lmer on each, , average results yourself, dfs <- complete(mydf2_imp) estimates <- lapply(dfs, fun = lme4, formula = y ~ x1 + x2 + (1|as.numeric(as.character(group_var)))) rowmeans(sapply(estimates, fun = fixef))
Comments
Post a Comment