Imputation based on Heckman model for multilevel data.
Imputes outcome and predictor variables that follow an MNAR mechanism according to Heckman's model and come from a multilevel database such as individual participant data.
Usage
mice.impute.2l.2stage.heckman(
y,
ry,
x,
wy = NULL,
type,
pmm = FALSE,
ypmm = NULL,
meta_method = "reml",
...
)
Arguments
- y
Vector to be imputed
- ry
Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. The ry generally distinguishes the observed (TRUE) and missing values (FALSE) in y.
- x
Numeric design matrix with length(y) rows with predictors for y. Matrix x may have no missing values.
- wy
Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.
- type
Type of the variable in the prediction model. 0: No predictor, 1: Predictor in both the outcome and selection models, -2: Cluster id (study id), -3: Predictor only in the selection model, -4: Predictor only in the outcome model.
- pmm
Logical. Whether predictive mean matching is applied; it can be used only for missing continuous variables. FALSE (default) or TRUE.
- ypmm
Vector of donor values of y used for predictive mean matching. If ypmm is not provided, the observed values of y are used.
- meta_method
Meta-analysis estimation method for random effects: "ml" (maximum likelihood), "reml" (restricted maximum likelihood) or "mm" (method of moments).
- ...
Other named arguments. Not used.
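The type codes are set per column of the predictor matrix. The following is a minimal sketch, using base R only and hypothetical variable names (study, y, x1, z), of how the row for an incomplete variable might be coded. In practice the starting matrix would usually come from make.predictorMatrix(), as in the Examples.

```r
# Hypothetical variables: a cluster id, an incomplete outcome, a covariate,
# and a variable used only in the selection model.
# Only the numeric codes follow the type list above.
vars <- c("study", "y", "x1", "z")

# Start from an all-ones matrix: every variable predicts every other one.
pred <- matrix(1, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))

pred[, "study"] <- -2  # cluster id (study id)
pred[, "z"]     <- -3  # predictor only in the selection model
diag(pred)      <- 0   # a variable does not predict itself

# Row "y" holds the types used when y is imputed.
print(pred["y", ])
```

Here x1 keeps the code 1, so it enters both the outcome and the selection model for y.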
Details
Imputes systematically and sporadically missing binary and continuous univariate variables that follow an MNAR mechanism according to the Heckman selection model and come from a clustered dataset. The imputation method uses a two-stage approach in which the Heckman model parameters at the cluster level are estimated using the copula method.
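To illustrate what an MNAR mechanism under a Heckman selection model looks like, the sketch below simulates the textbook continuous case in base R: an outcome equation and a selection equation whose error terms are correlated. All names and parameter values (rho, the coefficients, the sample size) are assumptions chosen for illustration, and the code does not reproduce the copula-based estimation used by the imputation method.

```r
set.seed(123)
n   <- 10000
rho <- 0.8                      # assumed correlation between the two error terms

x <- rnorm(n)                   # predictor in both equations
z <- rnorm(n)                   # predictor only in the selection equation

# Correlated standard-normal errors: cor(e_sel, e_out) is approximately rho.
e_sel <- rnorm(n)
e_out <- rho * e_sel + sqrt(1 - rho^2) * rnorm(n)

y_full <- 1 + 0.5 * x + e_out           # outcome equation
ry     <- (0.5 + x + z + e_sel) > 0     # selection equation: TRUE = observed

y <- ifelse(ry, y_full, NA)             # outcome with MNAR missingness

# With rho > 0 the outcome errors of the observed cases no longer average
# to zero, which is what makes the missingness non-ignorable.
mean_error_observed <- mean(e_out[ry])
cat("mean outcome error among observed cases:",
    round(mean_error_observed, 2), "\n")
```

Setting rho to 0 in this sketch makes the missingness depend only on x and z, so the mean outcome error among observed cases is close to zero.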
Note
Missing binary variables should be included as two-level factor variables. When the cluster variable is not defined in the predictor matrix as "-2", the imputation method is based on a simple Heckman model, i.e. without taking into account the hierarchical structure. If the Heckman model cannot be estimated at the study level, the imputation method falls back to the simple Heckman model.
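As a minimal sketch of the first point, a binary variable stored as numeric codes can be converted to a two-level factor before imputation. The data frame and column name here are hypothetical.

```r
# Hypothetical data: a binary variable coded 1/2 with missing values.
dat <- data.frame(hyp = c(1, 2, NA, 1, 2, NA))

# Convert to a factor; NA entries stay missing and do not become a level.
dat$hyp <- factor(dat$hyp, levels = c(1, 2))

str(dat$hyp)
```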
Examples
# example code
library(mice)
#>
#> Attaching package: ‘mice’
#> The following object is masked from ‘package:stats’:
#>
#> filter
#> The following objects are masked from ‘package:base’:
#>
#> cbind, rbind
pred <- make.predictorMatrix(nhanes)
pred[, "age"] <- -3
mice(nhanes, pred = pred, meth = "2l.2stage.heckman")
#>
#> iter imp variable
#> 1 1 bmi
#> No group variable has been provided, the Heckman imputation model will be applied globally to the dataset.
#> The Heckman model cannot be estimated marginally, so systematically missing groups will be imputed with the Heckman model based on the full dataset.
#> hyp
#> No group variable has been provided, the Heckman imputation model will be applied globally to the dataset.
#> The Heckman model cannot be estimated marginally, so systematically missing groups will be imputed with the Heckman model based on the full dataset.
#> Error in mice.impute.2l.2stage.heckman(y = c(2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1), ry = c(`1` = FALSE, `2` = TRUE, `3` = TRUE, `4` = FALSE, `5` = TRUE, `6` = FALSE, `7` = TRUE, `8` = TRUE, `9` = TRUE, `10` = FALSE, `11` = FALSE, `12` = FALSE, `13` = TRUE, `14` = TRUE, `15` = TRUE, `16` = FALSE, `17` = TRUE, `18` = TRUE, `19` = TRUE, `20` = TRUE, `21` = FALSE, `22` = TRUE, `23` = TRUE, `24` = TRUE, `25` = TRUE), x = structure(c(1, 2, 1, 3, 1, 3, 1, 1, 2, 2, 1, 2, 3, 2, 1, 1, 3, 2, 1, 3, 1, 1, 1, 3, 2, -31632.6652863162, 22.7, -52357.3794743866, -64121.0089968081, 20.4, -51517.1201986113, 22.5, 30.1, 22, -57679.0213659062, -31630.9844455429, -66641.78670761, 21.7, 28.7, 29.6, -52077.2930598841, 27.2, 26.3, 35.3, 25.5, -51517.120092814, 33.2, 27.5, 24.9, 27.4, 113, 187, 187, 229, 113, 184, 118, 187, 238, 206, 113, 238, 206, 204, 204, 186, 284, 199, 218, 187, 184, 229, 131, 284, 186), dim = c(25L, 3L), dimnames = list(c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25"), c("age", "bmi", "chl"))), wy = c(`1` = TRUE, `2` = FALSE, `3` = FALSE, `4` = TRUE, `5` = FALSE, `6` = TRUE, `7` = FALSE, `8` = FALSE, `9` = FALSE, `10` = TRUE, `11` = TRUE, `12` = TRUE, `13` = FALSE, `14` = FALSE, `15` = FALSE, `16` = TRUE, `17` = FALSE, `18` = FALSE, `19` = FALSE, `20` = FALSE, `21` = TRUE, `22` = FALSE, `23` = FALSE, `24` = FALSE, `25` = FALSE), type = c(age = -3, bmi = 1, chl = 1)): There is insufficient information to impute the Heckman model at the marginal or study level.