mice: Multivariate Imputation by Chained Equations

Multiple imputation

> library(mice)
> data("boys")
> data("fdd")
> data("fdgs")
> data("nhanes")

Version: 2.25


Function name Summary
appendbreak Appends specified break to the data
as.mids Converts a multiply imputed dataset (long format) into a 'mids' object
as.mira Create a 'mira' object from repeated analyses
boys Growth of Dutch boys
bwplot.mids Box-and-whisker plot of observed and imputed data
cbind.mids Columnwise combination of a 'mids' object.
cc Complete cases
cci Complete case indicator
ccn Complete cases n
complete Creates imputed data sets from a 'mids' object
densityplot.mids Density plot of observed and imputed data
extractBS Extract broken stick estimates from a 'lmer' object
fdd SE Fireworks disaster data
fdgs Fifth Dutch growth study 2009
fico Fraction of incomplete cases among cases with observed
flux Influx and outflux of multivariate missing data patterns
fluxplot Fluxplot of the missing data pattern
getfit Extracts fit objects from 'mira' object
glm.mids Generalized linear model for 'mids' object
ibind Combine imputations fitted to the same data
ic Incomplete cases
ici Incomplete case indicator
icn Incomplete cases n
is.mids Check for 'mids' object
is.mipo Check for 'mipo' object
is.mira Check for 'mira' object
leiden85 Leiden 85+ study
lm.mids Linear regression for 'mids' object
mammalsleep Mammal sleep data
md.pairs Missing data pattern by variable pairs
md.pattern Missing data pattern
mdc Graphical parameter for missing data plots.
mice Multivariate Imputation by Chained Equations (MICE)
mice.impute.2l.norm Imputation by a two-level normal model
mice.impute.2l.pan Imputation by a two-level normal model using 'pan'
mice.impute.2lonly.mean Imputation of the mean within the class
mice.impute.2lonly.norm Imputation at level 2 by Bayesian linear regression
mice.impute.2lonly.pmm Imputation at level 2 by predictive mean matching
mice.impute.cart Imputation by classification and regression trees
mice.impute.fastpmm Imputation by fast predictive mean matching
mice.impute.lda Imputation by linear discriminant analysis
mice.impute.logreg Imputation by logistic regression
mice.impute.logreg.boot Imputation by logistic regression using the bootstrap
mice.impute.mean Imputation by the mean
mice.impute.norm Imputation by Bayesian linear regression
mice.impute.norm.boot Imputation by linear regression, bootstrap method
mice.impute.norm.nob Imputation by linear regression (non Bayesian)
mice.impute.norm.predict Imputation by linear regression, prediction method
mice.impute.passive Passive imputation
mice.impute.pmm Imputation by predictive mean matching
mice.impute.polr Imputation by polytomous regression - ordered
mice.impute.polyreg Imputation by polytomous regression - unordered
mice.impute.quadratic Imputation of quadratic terms
mice.impute.rf Imputation by random forests
mice.impute.ri Imputation by the random indicator method for nonignorable data
mice.impute.sample Imputation by simple random sampling
mice.mids Multivariate Imputation by Chained Equations (Iteration Step)
mice.theme Set the theme for the plotting Trellis functions
mids-class Multiply imputed data set ('mids')
mids2mplus Export 'mids' object to Mplus
mids2spss Export 'mids' object to SPSS
mipo-class Multiply imputed pooled analysis ('mipo')
mira-class Multiply imputed repeated analyses ('mira')
nelsonaalen Cumulative hazard rate or Nelson-Aalen estimator
nhanes NHANES example - all variables numerical
nhanes2 NHANES example - mixed numerical and discrete variables
norm.draw Draws values of beta and sigma by Bayesian linear regression
pattern Datasets with various missing data patterns
plot.mids Plot the trace lines of the MICE algorithm
pool Multiple imputation pooling
pool.compare Compare two nested models fitted to imputed data
pool.r.squared Pooling: R squared
pool.scalar Multiple imputation pooling: univariate version
popmis Hox pupil popularity data with missing popularity scores
pops Project on preterm and small for gestational age infants (POPS)
potthoffroy Potthoff-Roy data
print.mids Print a 'mids' object
quickpred Quick selection of predictors from the data
rbind.mids Rowwise combination of a 'mids' object.
selfreport Self-reported and measured BMI
squeeze Squeeze the imputed values to be within specified boundaries.
stripplot.mids Stripplot of observed and imputed data
summary.mira Summary of a 'mira' object
supports.transparent Supports semi-transparent foreground colors?
tbc Terneuzen birth cohort
version Echoes the package version number
walking Walking disability data
windspeed Subset of Irish wind speed data
with.mids Evaluate an expression in multiple imputed datasets
xyplot.mids Scatterplot of observed and imputed data

boys

Growth dataset of Dutch boys (contains missing values)

> data("boys")
> boys %>% {
+   print(class(.))
+   dplyr::glimpse(.)
+ }
[1] "data.frame"
Observations: 748
Variables: 9
$ age <dbl> 0.035, 0.038, 0.057, 0.060, 0.062, 0.068, 0.068, 0.071, 0....
$ hgt <dbl> 50.1, 53.5, 50.0, 54.5, 57.5, 55.5, 52.5, 53.0, 55.1, 54.5...
$ wgt <dbl> 3.650, 3.370, 3.140, 4.270, 5.030, 4.655, 3.810, 3.890, 3....
$ bmi <dbl> 14.54, 11.77, 12.56, 14.37, 15.21, 15.11, 13.82, 13.84, 12...
$ hc  <dbl> 33.7, 35.0, 35.2, 36.7, 37.3, 37.0, 34.9, 35.8, 36.8, 38.0...
$ gen <ord> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ phb <ord> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ tv  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ reg <fctr> south, south, south, south, south, south, south, west, we...

cc

Removes incomplete data: returns only the complete cases (entries without missing values).

> nhanes$bmi
 [1]   NA 22.7   NA   NA 20.4   NA 22.5 30.1 22.0   NA   NA   NA 21.7 28.7
[15] 29.6   NA 27.2 26.3 35.3 25.5   NA 33.2 27.5 24.9 27.4
> nhanes$bmi %>% cc()
 [1] 22.7 20.4 22.5 30.1 22.0 21.7 28.7 29.6 27.2 26.3 35.3 25.5 33.2 27.5
[15] 24.9 27.4
> cc(nhanes[, 2, drop = FALSE], drop = FALSE)
    bmi
2  22.7
5  20.4
7  22.5
8  30.1
9  22.0
13 21.7
14 28.7
15 29.6
17 27.2
18 26.3
19 35.3
20 25.5
22 33.2
23 27.5
24 24.9
25 27.4

cci

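Checks for missing values: the complete case indicator is TRUE for every entry without missing values and FALSE otherwise.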

> nhanes %>% cci()
    1     2     3     4     5     6     7     8     9    10    11    12 
FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE 
   13    14    15    16    17    18    19    20    21    22    23    24 
 TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE 
   25 
 TRUE
> nhanes$bmi %>% cci()
 [1] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE
[12] FALSE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
[23]  TRUE  TRUE  TRUE

ccn

Counts the complete cases, i.e. how many rows contain no missing values (not how many cells are missing).

> ccn(nhanes)
[1] 13
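
For intuition, the three helpers cc, cci and ccn can be approximated with base R's complete.cases(). The sketch below is not the mice implementation. It retypes the first six rows of nhanes by hand (the names `d`, `is_complete`, `complete_rows` and `n_complete` are made up for this example) so that it runs without the package.

```r
# First six rows of nhanes, typed in by hand so the sketch needs no packages.
d <- data.frame(
  age = c(1, 2, 1, 3, 1, 3),
  bmi = c(NA, 22.7, NA, NA, 20.4, NA),
  hyp = c(NA, 1, 1, NA, 1, NA),
  chl = c(NA, 187, 187, NA, 113, 184)
)

# Rough analogue of cci(): TRUE for rows with no missing value.
is_complete <- complete.cases(d)

# Rough analogue of cc(): keep only the complete rows.
complete_rows <- d[is_complete, , drop = FALSE]

# Rough analogue of ccn(): number of complete rows.
n_complete <- sum(is_complete)

print(is_complete)
print(complete_rows)
print(n_complete)
```

Only rows 2 and 5 of this small subset are fully observed, so the count is 2 here, in the same way that ccn(nhanes) reports 13 for the full 25 rows.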

complete

> mice(nhanes) %>% mice::complete()

 iter imp variable
  1   1  bmi  hyp  chl
  1   2  bmi  hyp  chl
  1   3  bmi  hyp  chl
  1   4  bmi  hyp  chl
  1   5  bmi  hyp  chl
  2   1  bmi  hyp  chl
  2   2  bmi  hyp  chl
  2   3  bmi  hyp  chl
  2   4  bmi  hyp  chl
  2   5  bmi  hyp  chl
  3   1  bmi  hyp  chl
  3   2  bmi  hyp  chl
  3   3  bmi  hyp  chl
  3   4  bmi  hyp  chl
  3   5  bmi  hyp  chl
  4   1  bmi  hyp  chl
  4   2  bmi  hyp  chl
  4   3  bmi  hyp  chl
  4   4  bmi  hyp  chl
  4   5  bmi  hyp  chl
  5   1  bmi  hyp  chl
  5   2  bmi  hyp  chl
  5   3  bmi  hyp  chl
  5   4  bmi  hyp  chl
  5   5  bmi  hyp  chl
   age  bmi hyp chl
1    1 22.0   1 131
2    2 22.7   1 187
3    1 22.0   1 187
4    3 24.9   1 218
5    1 20.4   1 113
6    3 22.5   2 184
7    1 22.5   1 118
8    1 30.1   1 187
9    2 22.0   1 238
10   2 27.5   1 204
11   1 30.1   2 187
12   2 22.5   1 187
13   3 21.7   1 206
14   2 28.7   2 204
15   1 29.6   1 229
16   1 27.4   1 131
17   3 27.2   2 284
18   2 26.3   2 199
19   1 35.3   1 218
20   3 25.5   2 229
21   1 33.2   2 199
22   1 33.2   1 229
23   1 27.5   1 131
24   3 24.9   1 184
25   2 27.4   1 186
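
The call above returns only the first imputed data set. As a hedged sketch of how the second argument of complete() can select other views, the example below extracts the first completed data set and the stacked "long" format. The values m = 3, maxit = 2 and seed = 1 are arbitrary choices for a quick run, and the code is skipped when mice is not installed.

```r
# Runs only when the mice package is available.
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  # Small, quiet run; m, maxit and seed are arbitrary illustration values.
  imp <- mice(nhanes, m = 3, maxit = 2, seed = 1, printFlag = FALSE)

  # First completed data set: observed values plus the first set of imputations.
  first <- mice::complete(imp, 1)

  # All completed data sets stacked on top of each other.
  long <- mice::complete(imp, "long")

  print(anyNA(first))  # no missing values should remain
  print(dim(long))     # m copies of the 25 rows
}
```

Writing mice::complete() explicitly, as the original call does, avoids clashes with other packages that also export a function named complete.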

fdd

> fdd %>% {
+   print(class(.))
+   dplyr::glimpse(.)
+ }
[1] "data.frame"
Observations: 52
Variables: 65
$ id     <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14, 15, 16, 17, 18,...
$ trt    <fctr> E, C, E, C, E, C, C, C, C, C, E, C, E, C, C, E, E, E, ...
$ pp     <fctr> Y, N, N, Y, Y, Y, Y, N, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, ...
$ trtp   <dbl> 3, 0, NA, 4, 4, 4, 0, 0, 7, 4, 4, 4, 4, 3, 4, 4, 5, 3, ...
$ sex    <fctr> F, F, M, F, M, F, M, F, M, M, F, F, F, F, F, M, F, M, ...
$ etn    <fctr> OT, NL, NL, OT, OT, NL, OT, OT, NL, NL, NL, OT, OT, OT...
$ age    <dbl> 6, 8, 4, 4, 14, 10, 11, 6, 17, 18, 10, 17, 9, 14, 14, 8...
$ trauma <dbl> 4, 4, 2, 4, 3, 1, NA, 4, 3, 2, 2, 3, 3, 1, 3, 1, 1, 2, ...
$ prop1  <dbl> 30.96774, 58.00000, 26.66667, 36.00000, 30.00000, 14.00...
$ prop2  <dbl> 35.00000, NA, 30.00000, 17.00000, 15.00000, 3.00000, 36...
$ prop3  <dbl> 46.00000, NA, 23.46667, 11.00000, 13.00000, 4.00000, 40...
$ crop1  <dbl> NA, 45, NA, NA, 9, 7, 26, NA, 22, 18, NA, 24, 28, 38, 3...
$ crop2  <dbl> NA, NA, NA, NA, 7, 1, 22, NA, 10, 4, NA, 5, 6, 14, 18, ...
$ crop3  <dbl> NA, NA, NA, NA, 2, 2, 27, NA, 19, 10, NA, 2, 2, 25, 13,...
$ masc1  <dbl> NA, NA, NA, NA, 57, 23, 54, NA, 34, 35, NA, 34, 51, 71,...
$ masc2  <dbl> NA, NA, NA, NA, 20, 6, 46, NA, 31, 13, NA, 13, 25, 26, ...
$ masc3  <dbl> NA, NA, NA, NA, 6, 11, 37, NA, 18, 7, NA, 9, 24, 55, 33...
$ cbcl1  <dbl> NA, NA, NA, 74, 96, 38, 79, 42, 28, 36, 44, 16, NA, NA,...
$ cbcl3  <dbl> NA, NA, NA, 32, 39, 11, 94, 48, NA, 43, 28, 0, NA, NA, ...
$ prs1   <dbl> 21, 29, NA, 21, 23, 17, 24, 15, 19, 22, 20, 21, 25, 21,...
$ prs2   <dbl> 25, NA, 15, 16, 10, 3, 19, 12, 13, 8, 7, 9, 3, 11, 9, 1...
$ prs3   <dbl> 21, NA, 21, 9, NA, 3, 18, 21, NA, NA, 16, NA, 0, NA, NA...
$ ypa1   <dbl> 14.000000, NA, 3.000000, 13.000000, 8.000000, 8.000000,...
$ ypb1   <dbl> 4.000000, NA, 0.750000, 4.000000, 8.000000, 9.000000, N...
$ ypc1   <dbl> 18.000000, NA, 9.000000, 16.000000, 11.000000, 15.00000...
$ yp1    <dbl> 36, NA, 13, 33, 27, 32, NA, 24, 48, 45, 26, 37, 25, 39,...
$ ypa2   <dbl> 13.000000, NA, 4.000000, 9.000000, 5.000000, 3.000000, ...
$ ypb2   <dbl> 8.00000, NA, 3.00000, 7.00000, 3.00000, 3.00000, 15.533...
$ ypc2   <dbl> 14, NA, 12, 11, 8, 9, 13, 6, 10, 13, 8, 3, 11, 9, 9, 15...
$ yp2    <dbl> 35, NA, 19, 27, 16, 15, 39, 13, 23, 33, 17, 7, 27, 21, ...
$ ypa3   <dbl> 13.000000, NA, 0.000000, 5.000000, 1.000000, 2.000000, ...
$ ypb3   <dbl> 10.000000, NA, 0.750000, 5.000000, 3.000000, 2.000000, ...
$ ypc3   <dbl> 15, NA, 12, 10, 7, 9, 16, 9, NA, 11, 9, 3, 1, 12, 7, 15...
$ yp3    <dbl> 38, NA, 13, 20, 11, 13, 39, 35, NA, 36, 14, 3, 1, 34, 1...
$ yca1   <dbl> NA, 13, NA, NA, 8, 0, 9, NA, 11, 4, NA, 12, 10, 20, 7, ...
$ ycb1   <dbl> NA, 19, NA, NA, 8, 2, 18, NA, 10, 11, NA, 10, 17, 12, 1...
$ ycc1   <dbl> NA, 13, NA, NA, 10, 6, 14, NA, 14, 13, NA, 11, 16, 18, ...
$ yc1    <dbl> NA, 45, NA, NA, 26, 8, 41, NA, 35, 28, NA, 33, 43, 50, ...
$ yca2   <dbl> NA, NA, NA, NA, 0, 0, 7, NA, 7, 2, NA, 2, NA, 0, 7, 2, ...
$ ycb2   <dbl> NA, NA, NA, NA, 3.000, 0.000, 11.000, NA, 6.000, 6.000,...
$ ycc2   <dbl> NA, NA, NA, NA, 3, 1, 8, NA, 14, 7, NA, 5, NA, 4, 7, 9,...
$ yc2    <dbl> NA, NA, NA, NA, 6, 1, 26, NA, 27, 15, NA, 8, NA, 8, 21,...
$ yca3   <dbl> NA, NA, NA, NA, 0, 0, 9, NA, 1, 1, NA, 1, 1, 7, 2, 3, 1...
$ ycb3   <dbl> NA, NA, NA, NA, 0, 0, 10, NA, 6, 7, NA, 3, 4, 12, 5, 3,...
$ ycc3   <dbl> NA, NA, NA, NA, 4, 2, 12, NA, 7, 5, NA, 5, 2, 16, 3, 10...
$ yc3    <dbl> NA, NA, NA, NA, 4, 2, 31, NA, 14, 13, NA, 9, 7, 35, 10,...
$ ypf1   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0...
$ ypf2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ypf3   <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ypp1   <dbl> 1, 0, 0, 0, 1, 0, NA, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, ...
$ ypp2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ypp3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ycf1   <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1...
$ ycf2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ycf3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ycp1   <dbl> NA, 1, NA, NA, 1, 0, 1, NA, 1, 0, NA, 1, 1, 1, 1, 1, 1,...
$ ycp2   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ycp3   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ cbin1  <dbl> NA, NA, NA, 16, 24, 14, 24, 16, 13, 14, 22, 7, NA, NA, ...
$ cbin3  <dbl> NA, NA, NA, 7, 6, 0, 28, 19, NA, 19, 14, 0, NA, NA, 7, ...
$ cbex1  <dbl> NA, NA, NA, 31, 36, 7, 21, 8, 3, 2, 13, 1, NA, NA, 6, 1...
$ cbex3  <dbl> NA, NA, NA, 16, 20, 1, 33, 11, NA, 5, 7, 0, NA, NA, 2, ...
$ bir1   <dbl> NA, 19, NA, NA, 17, 3, 14, NA, 6, 19, NA, 11, 19, 25, 2...
$ bir2   <dbl> NA, NA, NA, NA, 13, 1, 15, NA, 4, 10, NA, 3, 12, 5, 13,...
$ bir3   <dbl> NA, NA, NA, NA, 2, 2, 17, NA, 2, 19, NA, 4, 3, 21, 5, 1...

fdgs

> fdgs %>% {
+   print(class(.))
+   dplyr::glimpse(.)
+ }
[1] "data.frame"
Observations: 10,030
Variables: 8
$ id    <dbl> 100001, 100003, 100004, 100005, 100006, 100018, 100027, ...
$ reg   <fctr> West, West, West, West, West, East, West, West, City, N...
$ age   <dbl> 13.095140, 13.817933, 13.971253, 13.982204, 13.522245, 1...
$ sex   <fctr> boy, boy, boy, girl, girl, boy, boy, boy, boy, boy, boy...
$ hgt   <dbl> 175.5, 148.4, 159.9, 159.7, 160.3, 157.8, 175.3, 184.0, ...
$ wgt   <dbl> 75.0, 40.0, 42.0, 46.5, 47.8, 39.7, 66.7, 80.7, 35.5, 55...
$ hgt.z <dbl> 1.751, -2.292, -1.000, -0.743, -0.414, 2.025, 0.879, 2.1...
$ wgt.z <dbl> 2.410, -1.494, -1.315, -0.783, -0.355, 0.823, 1.291, 2.4...

md.pairs

Displays the missing data pattern for each pair of variables

> md.pairs(nhanes)
$rr
    age bmi hyp chl
age  25  16  17  15
bmi  16  16  16  13
hyp  17  16  17  14
chl  15  13  14  15

$rm
    age bmi hyp chl
age   0   9   8  10
bmi   0   0   0   3
hyp   0   1   0   3
chl   0   2   1   0

$mr
    age bmi hyp chl
age   0   0   0   0
bmi   9   0   1   2
hyp   8   0   0   1
chl  10   3   3   0

$mm
    age bmi hyp chl
age   0   0   0   0
bmi   0   9   8   7
hyp   0   8   8   7
chl   0   7   7  10
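
The four matrices count, for each pair of variables, how many rows have both observed (rr), the row variable observed and the column variable missing (rm), the reverse (mr), and both missing (mm). A minimal base-R sketch of that bookkeeping, assuming a hand-typed copy of the first six rows of nhanes (the names `d`, `obs`, `mis` and `pairs` are made up for this example):

```r
# First six rows of nhanes, typed in by hand so the sketch needs no packages.
d <- data.frame(
  age = c(1, 2, 1, 3, 1, 3),
  bmi = c(NA, 22.7, NA, NA, 20.4, NA),
  hyp = c(NA, 1, 1, NA, 1, NA),
  chl = c(NA, 187, 187, NA, 113, 184)
)

obs <- 1 * (!is.na(d))  # 1 = observed, 0 = missing
mis <- 1 - obs          # 1 = missing, 0 = observed

# Cross-products count the rows in which both indicators are 1.
pairs <- list(
  rr = t(obs) %*% obs,  # both observed
  rm = t(obs) %*% mis,  # row variable observed, column variable missing
  mr = t(mis) %*% obs,  # row variable missing, column variable observed
  mm = t(mis) %*% mis   # both missing
)

print(pairs)
```

For every pair of variables the four counts add up to the number of rows, which is a quick consistency check on output such as the one shown above (there the four matrices sum to 25 in every cell).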

md.pattern

Displays the missing data patterns

> md.pattern(nhanes)
   age hyp bmi chl   
13   1   1   1   1  0
 1   1   1   0   1  1
 3   1   1   1   0  1
 1   1   0   0   1  2
 7   1   0   0   0  3
     0   8   9  10 27
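
In this output each row is one pattern (1 = observed, 0 = missing), the leftmost number is how many rows of the data share that pattern, the rightmost number is how many variables are missing in it, and the bottom row gives the number of missing values per variable. The same tabulation can be sketched in base R. This is an illustration on a hand-typed copy of the first six rows of nhanes, not the package's own code, and the names `d`, `obs`, `pattern_key`, `pattern_counts` and `missing_per_variable` are made up for this example.

```r
# First six rows of nhanes, typed in by hand so the sketch needs no packages.
d <- data.frame(
  age = c(1, 2, 1, 3, 1, 3),
  bmi = c(NA, 22.7, NA, NA, 20.4, NA),
  hyp = c(NA, 1, 1, NA, 1, NA),
  chl = c(NA, 187, 187, NA, 113, 184)
)

obs <- 1 * (!is.na(d))  # 1 = observed, 0 = missing

# Collapse each row into a key such as "1011" (column order: age, bmi, hyp, chl).
pattern_key <- apply(obs, 1, paste, collapse = "")

# Number of rows per pattern (the left margin of md.pattern).
pattern_counts <- table(pattern_key)

# Number of missing values per variable (the bottom row of md.pattern).
missing_per_variable <- colSums(is.na(d))

print(pattern_counts)
print(missing_per_variable)
```

Unlike md.pattern, this sketch does not reorder the columns or rows by the amount of missing data.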

mice

Multiple imputation

Arguments

  • data
  • m
  • method
  • predictorMatrix
  • visitSequence
  • form
  • post
  • defaultMethod
  • maxit
  • diagnostics
  • printFlag
  • seed
  • imputationMethod
  • defaultImputationMethod
  • data.init
  • ...
> nhanes %>% mice()

 iter imp variable
  1   1  bmi  hyp  chl
  1   2  bmi  hyp  chl
  1   3  bmi  hyp  chl
  1   4  bmi  hyp  chl
  1   5  bmi  hyp  chl
  2   1  bmi  hyp  chl
  2   2  bmi  hyp  chl
  2   3  bmi  hyp  chl
  2   4  bmi  hyp  chl
  2   5  bmi  hyp  chl
  3   1  bmi  hyp  chl
  3   2  bmi  hyp  chl
  3   3  bmi  hyp  chl
  3   4  bmi  hyp  chl
  3   5  bmi  hyp  chl
  4   1  bmi  hyp  chl
  4   2  bmi  hyp  chl
  4   3  bmi  hyp  chl
  4   4  bmi  hyp  chl
  4   5  bmi  hyp  chl
  5   1  bmi  hyp  chl
  5   2  bmi  hyp  chl
  5   3  bmi  hyp  chl
  5   4  bmi  hyp  chl
  5   5  bmi  hyp  chl
Multiply imputed data set
Call:
mice(data = .)
Number of multiple imputations:  5
Missing cells per column:
age bmi hyp chl 
  0   9   8  10 
Imputation methods:
  age   bmi   hyp   chl 
   "" "pmm" "pmm" "pmm" 
VisitSequence:
bmi hyp chl 
  2   3   4 
PredictorMatrix:
    age bmi hyp chl
age   0   0   0   0
bmi   1   0   1   1
hyp   1   1   0   1
chl   1   1   1   0
Random generator seed value:  NA
> nhanes %>% mice() %>% class()

 iter imp variable
  1   1  bmi  hyp  chl
  1   2  bmi  hyp  chl
  1   3  bmi  hyp  chl
  1   4  bmi  hyp  chl
  1   5  bmi  hyp  chl
  2   1  bmi  hyp  chl
  2   2  bmi  hyp  chl
  2   3  bmi  hyp  chl
  2   4  bmi  hyp  chl
  2   5  bmi  hyp  chl
  3   1  bmi  hyp  chl
  3   2  bmi  hyp  chl
  3   3  bmi  hyp  chl
  3   4  bmi  hyp  chl
  3   5  bmi  hyp  chl
  4   1  bmi  hyp  chl
  4   2  bmi  hyp  chl
  4   3  bmi  hyp  chl
  4   4  bmi  hyp  chl
  4   5  bmi  hyp  chl
  5   1  bmi  hyp  chl
  5   2  bmi  hyp  chl
  5   3  bmi  hyp  chl
  5   4  bmi  hyp  chl
  5   5  bmi  hyp  chl
[1] "mids"
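
The printed object above reports "Random generator seed value:  NA", so repeated runs give different imputations. As a hedged sketch of how a few of the listed arguments fit together, the example below sets m, maxit, method, seed and printFlag explicitly. The specific values are arbitrary illustration choices, and the code is skipped when mice is not installed.

```r
# Runs only when the mice package is available.
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  # m: number of imputed data sets, maxit: number of iterations,
  # method: imputation method applied to the incomplete columns,
  # seed: makes the random draws repeatable, printFlag: silences the iteration log.
  imp_a <- mice(nhanes, m = 3, maxit = 5, method = "pmm",
                seed = 123, printFlag = FALSE)
  imp_b <- mice(nhanes, m = 3, maxit = 5, method = "pmm",
                seed = 123, printFlag = FALSE)

  # With the same seed both runs should produce the same imputed values.
  same_draws <- identical(imp_a$imp, imp_b$imp)

  print(imp_a$m)             # number of imputations
  print(dim(imp_a$imp$bmi))  # one row per missing bmi value, one column per imputation
  print(same_draws)
}
```

The imputed values are stored per variable in the `imp` component of the 'mids' object, while the observed data stay untouched in `data`.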

nhanes

A small example dataset containing many missing values

> data("nhanes")
> nhanes %>% {
+   print(class(.))
+   dplyr::as_data_frame(.) %>% head(.)
+ }
[1] "data.frame"
# A tibble: 6 × 4
    age   bmi   hyp   chl
  <dbl> <dbl> <dbl> <dbl>
1     1    NA    NA    NA
2     2  22.7     1   187
3     1    NA     1   187
4     3    NA    NA    NA
5     1  20.4     1   113
6     3    NA    NA   184

pool

> imp <- mice(nhanes)

 iter imp variable
  1   1  bmi  hyp  chl
  1   2  bmi  hyp  chl
  1   3  bmi  hyp  chl
  1   4  bmi  hyp  chl
  1   5  bmi  hyp  chl
  2   1  bmi  hyp  chl
  2   2  bmi  hyp  chl
  2   3  bmi  hyp  chl
  2   4  bmi  hyp  chl
  2   5  bmi  hyp  chl
  3   1  bmi  hyp  chl
  3   2  bmi  hyp  chl
  3   3  bmi  hyp  chl
  3   4  bmi  hyp  chl
  3   5  bmi  hyp  chl
  4   1  bmi  hyp  chl
  4   2  bmi  hyp  chl
  4   3  bmi  hyp  chl
  4   4  bmi  hyp  chl
  4   5  bmi  hyp  chl
  5   1  bmi  hyp  chl
  5   2  bmi  hyp  chl
  5   3  bmi  hyp  chl
  5   4  bmi  hyp  chl
  5   5  bmi  hyp  chl
> fit <- with(data = imp, expr = lm(bmi ~ hyp + chl))
> pool(fit)
Call: pool(object = fit)

Pooled coefficients:
(Intercept)         hyp         chl 
22.42887115 -1.11753665  0.02849406 

Fraction of information about the coefficients missing due to nonresponse: 
(Intercept)         hyp         chl 
  0.2251261   0.3821697   0.3744282
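
Pooling combines the estimates from the m fitted models with Rubin's rules: the pooled estimate is the mean of the per-imputation estimates, and its total variance is the average within-imputation variance plus an inflated between-imputation variance. The base-R sketch below illustrates this for a single scalar estimate. The function name `pool_scalar_sketch` and the input numbers are hypothetical, and it is not the code that pool() runs.

```r
# Rubin's rules for one scalar estimate.
# q: estimates from the m imputed data sets, u: their squared standard errors.
pool_scalar_sketch <- function(q, u) {
  m <- length(q)
  stopifnot(m >= 2, length(u) == m)

  qbar <- mean(q)                    # pooled estimate
  ubar <- mean(u)                    # within-imputation variance
  b <- var(q)                        # between-imputation variance
  total <- ubar + (1 + 1 / m) * b    # total variance of the pooled estimate
  lambda <- (1 + 1 / m) * b / total  # share of variance due to missing data

  list(qbar = qbar, ubar = ubar, b = b, t = total, lambda = lambda)
}

# Hypothetical estimates and variances from m = 3 imputed data sets.
res <- pool_scalar_sketch(q = c(1, 2, 3), u = c(0.5, 0.5, 0.5))
print(res)
```

The fraction of missing information printed by pool() is closely related to `lambda` but additionally adjusts for the degrees of freedom, so the two numbers are not expected to match exactly.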