r - Can I pool imputed random effect model estimates using the mi package? -
it appears mi
package has had pretty big rewrite @ point within past couple of years.
the "old" way of doing things well-outlined in following tutorial: http://thomasleeper.com/rcourse/tutorials/mi.html
the "new" way of doing things (sticking leeper's simulation demo) looks this:
#load mi library(mi) #set seed set.seed(10) #simulate data (with observations missing) x1 <- runif(100, 0, 5) x2 <- rnorm(100) y <- 2*x1 + 20*x2 + rnorm(100) mydf <- cbind.data.frame(x1, x2, y) mydf$x1[sample(1:nrow(mydf), 20, false)] <- na mydf$x2[sample(1:nrow(mydf), 10, false)] <- na # convert missing_data.frame mydf_mdf <- missing_data.frame(mydf) # impute mydf_imp <- mi(mydf_mdf)
although function names have changed, pretty similar "old" way of doing things.
the biggest change (from vantage) replacement of following "old" functions
lm.mi(formula, mi.object, ...)
glm.mi(formula, mi.object, family = gaussian, ...)
bayesglm.mi(formula, mi.object, family = gaussian, ...)
polr.mi(formula, mi.object, ...)
bayespolr.mi(formula, mi.object, ...)
lmer.mi(formula, mi.object, rescale=false, ...)
glmer.mi(formula, mi.object, family = gaussian, rescale=false, ...)
.
previously, user compute model each imputed dataset using 1 of these functions , pool results using mi.pooled()
(or coef.mi()
if following leeper example).
in current version of mi
(i have v1.0 installed), these last steps appear have been combined single function, pool()
. pool()
function appears read family , link function assigned variable during imputation process above , estimate model bayesglm
using specified formula shown below.
# run models on imputed data , pool results summary(pool(y ~ x1 + x2, mydf_imp)) ## ## call: ## pool(formula = y ~ x1 + x2, data = mydf_imp) ## ## deviance residuals: ## min 1q median 3q max ## -1.98754 -0.40923 0.03393 0.46734 2.13848 ## ## coefficients: ## estimate std. error t value pr(>|t|) ## (intercept) -0.34711 0.25979 -1.336 0.215 ## x1 2.07806 0.08738 23.783 1.46e-13 *** ## x2 19.90544 0.11068 179.844 < 2e-16 *** ## --- ## signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (dispersion parameter gaussian family taken 0.7896688) ## ## null deviance: 38594.916 on 99 degrees of freedom ## residual deviance: 76.598 on 97 degrees of freedom ## aic: 264.74 ## ## number of fisher scoring iterations: 7
this looks coming close recovering our simulated beta values (2 , 20). in other words, behaving expected.
let's take larger set of data naively simulated random effect sake of getting grouping variable.
mydf2 <- data.frame(x1 = rep(runif(100, 0, 5), 20) ,x2 = rep(rnorm(100, 0, 2.5), 20) ,group_var = rep(1:20, each = 100) ,noise = rep(rnorm(100), 20)) mydf2$y <- 2*mydf2$x1 + 20*mydf2$x2 + mydf2$noise mydf2$x1[sample(1:nrow(mydf2), 200, false)] <- na mydf2$x2[sample(1:nrow(mydf2), 100, false)] <- na # convert missing_data.frame mydf2_mdf <- missing_data.frame(mydf2) show(mydf2_mdf) ## object of class missing_data.frame 2000 observations on 5 variables ## ## there 4 missing data patterns ## ## append '@patterns' missing_data.frame access corresponding pattern every observation or perhaps use table() ## ## type missing method model ## x1 continuous 200 ppd linear ## x2 continuous 100 ppd linear ## group_var continuous 0 <na> <na> ## noise continuous 0 <na> <na> ## y continuous 0 <na> <na> ## ## family link transformation ## x1 gaussian identity standardize ## x2 gaussian identity standardize ## group_var <na> <na> standardize ## noise <na> <na> standardize ## y <na> <na> standardize
since missing_data.frame()
appears intepreting group_var
continuous, use change()
function mi
reassign "un"
"unordered categorical" , proceed above.
mydf2_mdf <- change(mydf2_mdf, y = "group_var", = "type", = "un" ) # impute mydf2_imp <- mi(mydf2_mdf)
now, unless version 1.0 of mi
has removed functionality of previous versions (i.e. functionality available lmer.mi
, glmer.mi
), assume addition of random effect in formula should point pool()
appropriate lme4
function. however, initial error message suggests not case.
# run models on imputed data , pool results summary(pool(y ~ x1 + x2 + (1|group_var), mydf2_imp)) ## warning in ops.factor(1, group_var): '|' not meaningful factors ## warning in ops.factor(1, group_var): '|' not meaningful factors ## error in if (prior.scale[j] < min.prior.scale) {: missing value true/false needed
following warning message , extracting integers out of factor me estimate, results suggest pool()
still estimating fixed-effect model bayesglm
, holding attempted random-effect constant.
summary(pool(y ~ x1 + x2 + (1|as.numeric(as.character(group_var))), mydf2_imp)) ## ## call: ## pool(formula = y ~ x1 + x2 + (1 | as.numeric(as.character(group_var))), ## data = mydf2_imp) ## ## deviance residuals: ## min 1q median 3q max ## -1.93633 -0.69923 0.01073 0.56752 2.12167 ## ## coefficients: ## estimate std. error t value ## (intercept) 1.383e-01 2.596e+02 0.001 ## x1 1.995e+00 1.463e-02 136.288 ## x2 2.000e+01 8.004e-03 2499.077 ## 1 | as.numeric(as.character(group_var))true -3.105e-08 2.596e+02 0.000 ## pr(>|t|) ## (intercept) 1 ## x1 <2e-16 *** ## x2 <2e-16 *** ## 1 | as.numeric(as.character(group_var))true 1 ## --- ## signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (dispersion parameter gaussian family taken 0.8586836) ## ## null deviance: 5384205.2 on 1999 degrees of freedom ## residual deviance: 1713.9 on 1996 degrees of freedom ## aic: 5377 ## ## number of fisher scoring iterations: 4
my questions are:
- is possible generate pooled random effects estimates using
mi
?, , - if yes, how?
you can specify fun
argument pool()
function change estimator. in case, summary(pool(y ~ x1 + x2 + (1|as.numeric(as.character(group_var))), data = mydf2_imp, fun = lmer))
. may or may not work, legal syntax. if fails, can use complete
function created completed data.frames, call lmer
on each, , average results yourself, dfs <- complete(mydf2_imp) estimates <- lapply(dfs, fun = lme4, formula = y ~ x1 + x2 + (1|as.numeric(as.character(group_var)))) rowmeans(sapply(estimates, fun = fixef))
Comments
Post a Comment