broom: Convert Statistical Analysis Objects into Tidy Data Frames
Tidy up the output objects of statistical analyses
- CRAN: http://cran.r-project.org/web/packages/broom/index.html
- GitHub: https://github.com/dgrtwo/broom
- Vignettes:
> library(broom)
Attaching package: 'broom'
The following object is masked from 'package:modelr':
bootstrap
> library(dplyr)
Version: 0.4.2
Function | Summary |
---|---|
Arima_tidiers | Tidying methods for ARIMA modeling of time series |
aareg_tidiers | Tidiers for aareg objects |
acf_tidiers | Tidying method for the acf function |
anova_tidiers | Tidying methods for anova and AOV objects |
auc_tidiers | Tidiers for objects from the AUC package |
augment | Augment data according to a tidied model |
augment_columns | Add fitted values, residuals, and other common outputs to an augment call |
biglm_tidiers | Tidiers for biglm and bigglm objects |
binDesign_tidiers | Tidy a binDesign object |
binWidth_tidiers | Tidy a binWidth object |
boot_tidiers | Tidying methods for bootstrap computations |
bootstrap | Set up bootstrap replicates of a dplyr operation |
broom | Convert Statistical Analysis Objects into Tidy Data Frames |
btergm_tidiers | Tidying method for a bootstrapped temporal exponential random graph model |
cch_tidiers | Tidiers for case-cohort data |
compact | Remove NULL items in a vector or list |
confint.geeglm | Confidence interval for 'geeglm' objects |
confint_tidy | Calculate confidence interval as a tidy data frame |
coxph_tidiers | Tidiers for coxph object |
cv.glmnet_tidiers | Tidiers for glmnet cross-validation objects |
data.frame_tidiers | Tidiers for data.frame objects |
ergm_tidiers | Tidying methods for an exponential random graph model |
felm_tidiers | Tidying methods for models with multiple group fixed effects |
finish_glance | Add logLik, AIC, BIC, and other common measurements to a glance of a prediction |
fitdistr_tidiers | Tidying methods for fitdistr objects from the MASS package |
fix_data_frame | Ensure an object is a data frame, with rownames moved into a column |
gam_tidiers | Tidying methods for a generalized additive model (gam) |
gamlss_tidiers | Tidying methods for gamlss objects |
geeglm_tidiers | Tidying methods for generalized estimating equations models |
glance | Construct a single row summary "glance" of a model, fit, or other object |
glm_tidiers | Tidying methods for a glm object |
glmnet_tidiers | Tidiers for LASSO or elasticnet regularized fits |
gmm_tidiers | Tidying methods for generalized method of moments "gmm" objects |
htest_tidiers | Tidying methods for an htest object |
inflate | Expand a dataset to include all factorial combinations of one or more variables |
insert_NAs | Insert a row of NAs into a data frame wherever another data frame has NAs |
kappa_tidiers | Tidy a kappa object from a Cohen's kappa calculation |
kde_tidiers | Tidy a kernel density estimate object from the ks package |
kmeans_tidiers | Tidying methods for kmeans objects |
list_tidiers | Tidiers for return values from functions that aren't S3 objects |
lm_tidiers | Tidying methods for a linear model |
lme4_tidiers | Tidying methods for mixed effects models |
lmodel2_tidiers | Tidiers for linear model II objects from the lmodel2 package |
loess_tidiers | Augmenting methods for loess models |
matrix_tidiers | Tidiers for matrix objects |
mclust_tidiers | Tidying methods for Mclust objects |
mcmc_tidiers | Tidying methods for MCMC (Stan, JAGS, etc.) fits |
mle2_tidiers | Tidy mle2 maximum likelihood objects |
multcomp_tidiers | Tidying methods for objects produced by 'multcomp' |
multinom_tidiers | Tidying methods for multinomial logistic regression models |
nlme_tidiers | Tidying methods for mixed effects models |
nls_tidiers | Tidying methods for a nonlinear model |
optim_tidiers | Tidiers for lists returned from optim |
orcutt_tidiers | Tidiers for Cochrane Orcutt object |
plm_tidiers | Tidiers for panel regression linear models |
poLCA_tidiers | Tidiers for poLCA objects |
prcomp_tidiers | Tidying methods for principal components analysis via 'prcomp' |
process_ergm | Helper function to process a tidied ergm object |
process_geeglm | Helper function to process a tidied geeglm object |
process_lm | Helper function to process a tidied lm object |
process_rq | Helper function for tidy.rq and tidy.rqs |
pyears_tidiers | Tidy person-year summaries |
rcorr_tidiers | Tidying methods for rcorr objects |
ridgelm_tidiers | Tidying methods for ridgelm objects from the MASS package |
rlm_tidiers | Tidying methods for an rlm (robust linear model) object |
rowwise_df_tidiers | Tidying methods for rowwise_dfs from dplyr, for tidying each row and recombining the results |
rq_tidiers | Tidying methods for quantile regression models |
rstanarm_tidiers | Tidying methods for an rstanarm model |
sexpfit_tidiers | Tidy an expected survival curve |
smooth.spline_tidiers | Tidying methods for smooth.spline objects |
sp_tidiers | Tidying methods for classes from the sp package |
sparse_tidiers | Tidy a sparseMatrix object from the Matrix package |
summary_tidiers | Tidiers for summaryDefault objects |
survfit_tidiers | Tidy survival curve fits |
survreg_tidiers | Tidiers for a parametric regression survival model |
svd_tidiers | Tidying methods for singular value decomposition |
tidy | Tidy the result of a test into a summary data.frame |
tidy.NULL | Tidy on a NULL input |
tidy.TukeyHSD | Tidy a TukeyHSD object |
tidy.coeftest | Tidying methods for coeftest objects |
tidy.default | Default tidying method |
tidy.density | Tidy a density object |
tidy.dist | Tidy a distance matrix |
tidy.ftable | Tidy an ftable object |
tidy.manova | Tidy a MANOVA object |
tidy.map | Tidy method for map objects |
tidy.numeric | Tidy atomic vectors |
tidy.pairwise.htest | Tidy a pairwise hypothesis test |
tidy.power.htest | Tidy a power.htest |
tidy.spec | Tidy a spec object |
tidy.table | Tidy a table object |
tidy.ts | Tidy a ts timeseries object |
unrowname | Strip rownames from an object |
xyz_tidiers | Tidiers for x, y, z lists suitable for persp, image, etc. |
zoo_tidiers | Tidying methods for a zoo object |
augment
> mtcars %>% group_by(cyl) %>%
+ do(fit = lm(wt ~ mpg + qsec + gear, .)) %>%
+ augment(fit) %>%
+ head() %>%
+ knitr::kable()
cyl | wt | mpg | qsec | gear | .fitted | .se.fit | .resid | .hat | .sigma | .cooksd | .std.resid |
---|---|---|---|---|---|---|---|---|---|---|---|
4 | 2.320 | 22.8 | 18.61 | 4 | 2.46334473610 | 0.141857133685 | -0.143344736098 | 0.197297497135 | 0.338715701818 | 0.015421799919 | -0.500972686568 |
4 | 3.190 | 24.4 | 20.00 | 4 | 2.63356011079 | 0.119955227097 | 0.556439889210 | 0.141077439790 | 0.242723102392 | 0.145126019805 | 1.879969736352 |
4 | 3.150 | 22.8 | 22.90 | 4 | 3.39278059524 | 0.298922321177 | -0.242780595235 | 0.876064146800 | 0.199323867070 | 8.240036829579 | -2.159360232454 |
4 | 2.200 | 32.4 | 19.47 | 4 | 1.86408215293 | 0.173846815730 | 0.335917847066 | 0.296314357955 | 0.303757385696 | 0.165508650040 | 1.253872394196 |
4 | 1.615 | 30.4 | 18.52 | 4 | 1.82192616046 | 0.147540125770 | -0.206926160460 | 0.213422161384 | 0.331544809908 | 0.036203130352 | -0.730557079876 |
4 | 1.835 | 33.9 | 19.90 | 4 | 1.83449503768 | 0.209860981278 | 0.000504962315 | 0.431799975737 | 0.344955958298 | 0.000000835906 | 0.002097577191 |
> bootnls_aug <- mtcars %>% bootstrap(100) %>%
+ do(augment(nls(mpg ~ k / wt + b, ., start=list(k=1, b=0)), .))
> library(ggplot2)
> ggplot(bootnls_aug, aes(wt, mpg)) + geom_point() +
+ geom_line(aes(y=.fitted, group=replicate), alpha=.2)
> smoothspline_aug <- mtcars %>% bootstrap(100) %>%
+ do(augment(smooth.spline(.$wt, .$mpg, df=4), .))
>
> ggplot(smoothspline_aug, aes(wt, mpg)) + geom_point() +
+ geom_line(aes(y=.fitted, group=replicate), alpha=.2)
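For a single model without grouping, the following is a minimal sketch of what `augment()` returns. It assumes only that broom is installed. The object names `fit` and `aug` are illustrative and do not appear in the examples above. The result has one row per observation, and the `.resid` column equals the observed response minus `.fitted`.

```r
library(broom)

# Fit one linear model on the full mtcars data (no grouping).
fit <- lm(wt ~ mpg + qsec + gear, data = mtcars)

# augment() returns the model data with per-observation columns
# such as .fitted and .resid appended.
aug <- augment(fit)

# One row per observation in the original data.
nrow(aug) == nrow(mtcars)

# The residual is the observed response minus the fitted value;
# this difference should be numerically zero.
max(abs((aug$wt - aug$.fitted) - aug$.resid))
```

Because the output keeps the original columns, it can be passed straight to a plotting function for residual diagnostics.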
bootstrap
Bootstrap resampling
> set.seed(2014)
> mtcars %>% bootstrap(100) %>%
+ do(nls(mpg ~ k / wt + b, .,
+ start = list(k = 1, b = 0)) %>% tidy())
Source: local data frame [200 x 6]
Groups: replicate [100]
replicate term estimate std.error statistic
<int> <chr> <dbl> <dbl> <dbl>
1 1 k 46.63250189869 4.02601640314 11.582789842164
2 1 b 4.36063670161 1.53826660819 2.834773035050
3 2 k 54.18247616275 4.96497634197 10.912937430288
4 2 b 1.00486839665 1.89719556867 0.529659890229
5 3 k 43.25721216486 3.56485961800 12.134338178812
6 3 b 4.83351040793 1.29790905951 3.724074789762
7 4 k 48.53135108848 4.45688317168 10.889078582281
8 4 b 3.50934563498 1.68897933879 2.077790742843
9 5 k 52.60509851090 5.66269038595 9.289771279292
10 5 b 3.34437976627 2.30041096264 1.453818391838
# ... with 190 more rows, and 1 more variables: p.value <dbl>
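To make explicit what each replicate involves, here is a rough base R sketch of the same idea: resample rows with replacement, refit the model, tidy the coefficients. It is not how `bootstrap()` is implemented, and the names `one_replicate` and `boot_coefs` are hypothetical helpers introduced only for illustration.

```r
library(broom)

set.seed(2014)

# One bootstrap replicate: resample rows with replacement,
# refit the nls model, and return its tidied coefficients.
one_replicate <- function(i) {
  resampled <- mtcars[sample(nrow(mtcars), replace = TRUE), ]
  fit <- nls(mpg ~ k / wt + b, data = resampled,
             start = list(k = 1, b = 0))
  cbind(replicate = i, as.data.frame(tidy(fit)))
}

# 100 replicates, two coefficients (k and b) each.
boot_coefs <- do.call(rbind, lapply(1:100, one_replicate))

# Percentile interval of each coefficient across the replicates.
aggregate(estimate ~ term, data = boot_coefs,
          FUN = quantile, probs = c(0.025, 0.975))
```

The resulting data frame has the same long layout as the output above (one row per replicate and term), so the spread of `estimate` within each `term` can be summarised directly.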
compact
Remove NULL elements from a list
> list("A", "B", NULL, "D") %>% broom:::compact()
[[1]]
[1] "A"
[[2]]
[1] "B"
[[3]]
[1] "D"
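`compact` is not exported, which is why the call above needs `:::`. A base R expression gives the same result for this input without reaching into the package namespace. This is a sketch of an equivalent, not the package's own implementation, and the names `x` and `kept` are illustrative.

```r
x <- list("A", "B", NULL, "D")

# Keep only the elements for which is.null() is FALSE.
kept <- Filter(Negate(is.null), x)

# Three elements remain: "A", "B", "D".
length(kept)
```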
data.frame_tidiers
glance
> mtcars %>% group_by(cyl) %>%
+ do(fit = lm(wt ~ mpg + qsec + gear, .)) %>%
+ glance(fit) %>%
+ head() %>%
+ knitr::kable()
cyl | r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual |
---|---|---|---|---|---|---|---|---|---|---|---|
4 | 0.779913093309 | 0.685590133298 | 0.319367260098 | 8.26853921061 | 0.010597767245 | 4 | -0.566856603363 | 11.1337132067 | 13.1231895707 | 0.713968127756 | 7 |
6 | 0.969994653934 | 0.939989307868 | 0.087294251135 | 32.32739431824 | 0.008743763030 | 4 | 10.102267469856 | -10.2045349397 | -10.4749841944 | 0.022860858844 | 3 |
8 | 0.652127757744 | 0.547766085067 | 0.510687079831 | 6.24872849407 | 0.011614604664 | 4 | -8.101858384279 | 26.2037167686 | 29.3990034166 | 2.608012935067 | 10 |
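Since `glance()` returns exactly one row per model, the rows can be stacked to compare candidate models. The following is a minimal sketch. The two model formulas and the names `fits` and `summaries` are chosen here for illustration only.

```r
library(broom)

# Two nested candidate models on the full data.
fits <- list(
  small = lm(wt ~ mpg, data = mtcars),
  full  = lm(wt ~ mpg + qsec + gear, data = mtcars)
)

# glance() gives a one-row summary per model; label each row
# with the model name and stack them.
summaries <- do.call(rbind, lapply(names(fits), function(nm) {
  cbind(model = nm, as.data.frame(glance(fits[[nm]])))
}))

# Show a few fit statistics next to each other.
summaries[, c("model", "r.squared", "AIC", "BIC")]
```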
insert_NAs
tidy
> Orange %>% group_by(Tree) %>%
+ do(cor.test(.$age, .$circumference) %>% tidy())
Source: local data frame [5 x 9]
Groups: Tree [5]
Tree estimate statistic p.value parameter
<ord> <dbl> <dbl> <dbl> <int>
1 3 0.988176587129 14.4118806470 0.0000290104593668 5
2 1 0.985467542479 12.9725812675 0.0000485190172612 5
3 5 0.987737642292 14.1468609718 0.0000317709260945 5
4 2 0.987362434577 13.9312907259 0.0000342504117564 5
5 4 0.984460969608 12.5357483679 0.0000573308969016 5
# ... with 4 more variables: conf.low <dbl>, conf.high <dbl>,
# method <fctr>, alternative <fctr>
> Orange %>% group_by(Tree) %>%
+ do(lm(age ~ circumference, data=.) %>% tidy())
Source: local data frame [10 x 6]
Groups: Tree [5]
Tree term estimate std.error statistic
<ord> <chr> <dbl> <dbl> <dbl>
1 3 (Intercept) -209.51232149301 85.268290402704 -2.457095369258
2 3 circumference 12.03888487911 0.835344475434 14.411880647031
3 1 (Intercept) -264.67343750000 98.620556898818 -2.683755251672
4 1 circumference 11.91924542683 0.918802910620 12.972581267502
5 5 (Intercept) -54.48409709432 76.886278786109 -0.708632254734
6 5 circumference 8.78713197900 0.621136519012 14.146860971848
7 2 (Intercept) -132.43972525629 83.131414589342 -1.593136913530
8 2 circumference 7.79522500189 0.559547938180 13.931290725948
9 4 (Intercept) -76.51367061555 88.294375729161 -0.866574682517
10 4 circumference 7.16984173775 0.571951632031 12.535748367901
# ... with 1 more variables: p.value <dbl>
> mtcars %>% group_by(am) %>%
+ do(lm(wt ~ mpg + qsec + gear, .) %>% tidy())
Source: local data frame [8 x 6]
Groups: am [2]
am term estimate std.error statistic
<dbl> <chr> <dbl> <dbl> <dbl>
1 0 (Intercept) 4.9175462315902 1.3966567480290 3.5209411607613
2 0 mpg -0.1918891414413 0.0442832937022 -4.3332174596550
3 0 qsec 0.0919136106783 0.0983206685615 0.9348350862848
4 0 gear 0.1465375354790 0.3681936308209 0.3979904137730
5 1 (Intercept) 4.2830702774164 3.4585995805749 1.2383828129374
6 1 mpg -0.1009831979373 0.0294340870096 -3.4308248767678
7 1 qsec 0.0398316482863 0.1511213514406 0.2635739285459
8 1 gear -0.0228832969429 0.3487822570884 -0.0656091199534
# ... with 1 more variables: p.value <dbl>
> regressions <- mtcars %>% group_by(cyl) %>%
+ do(fit = lm(wt ~ mpg + qsec + gear, .))
> regressions %>% tidy(fit)
Source: local data frame [12 x 6]
Groups: cyl [3]
cyl term estimate std.error statistic
<dbl> <chr> <dbl> <dbl> <dbl>
1 4 (Intercept) -0.77266239762184 2.2278802640408 -0.34681504661315
2 4 mpg -0.08183156858565 0.0238178703631 -3.43572147040065
3 4 qsec 0.21665171541651 0.0759077293445 2.85414564876753
4 4 gear 0.26746961839298 0.2445417311863 1.09375858711481
5 6 (Intercept) -7.78580829042293 3.3548493219313 -2.32076243768252
6 6 mpg 0.04328328231298 0.0519672400109 0.83289553772489
7 6 qsec 0.42199831465506 0.0913681739538 4.61865763967518
8 6 gear 0.63832001855981 0.2052414340504 3.11009334695533
9 8 (Intercept) 0.00597157772867 4.2746051056959 0.00139698933141
10 8 mpg -0.17692456055530 0.0557085166832 -3.17589788938953
11 8 qsec 0.36940581266784 0.1930955273526 1.91307285949380
12 8 gear 0.14276241610120 0.3166412723714 0.45086483840854
# ... with 1 more variables: p.value <dbl>
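The `lm` tidier can also report confidence intervals for the coefficients. Here is a small sketch using the `conf.int` and `conf.level` arguments of `tidy()` on a single ungrouped fit. The names `fit` and `coefs` are illustrative.

```r
library(broom)

# One model on the whole Orange data set (no grouping by Tree).
fit <- lm(age ~ circumference, data = Orange)

# conf.int = TRUE adds conf.low and conf.high columns.
coefs <- tidy(fit, conf.int = TRUE, conf.level = 0.95)

coefs[, c("term", "estimate", "conf.low", "conf.high")]
```

The same arguments can be passed inside the `do()` calls shown above to get per-group intervals.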