dplyr: A Grammar of Data Manipulation
データ操作のためのパッケージ
- CRAN: http://cran.r-project.org/web/packages/dplyr/index.html
- GitHub: https://github.com/hadley/dplyr
- Vignettes:
> library(dplyr)
バージョン: 0.6.0
関数名 | 概略 |
---|---|
all_equal |
Flexible equality comparison for data frames |
all_vars |
Apply predicate to all variables |
arrange |
Arrange rows by variables |
arrange_all |
Arrange rows by a selection of variables |
as.table.tbl_cube |
Coerce a 'tbl_cube' to other data structures |
as.tbl_cube |
Coerce an existing data structure into a 'tbl_cube' |
auto_copy |
Copy tables to same source, if necessary |
band_members |
Band membership |
between |
Do values in a numeric vector fall in specified range? |
bind |
Efficiently bind multiple data frames by row and column |
case_when |
A general vectorised if |
coalesce |
Find first non-missing element |
compute |
Force computation of a database query |
copy_to |
Copy a local data frame to a remote src |
cumall |
Cumulativate versions of any, all, and mean |
desc |
Descending order |
distinct |
Select distinct/unique rows |
do |
Do anything |
dplyr-package |
dplyr: a grammar of data manipulation |
explain |
Explain details of a tbl |
filter |
Return rows with matching conditions |
filter_all |
Filter within a selection of variables |
funs |
Create a list of functions calls. |
group_by |
Group by one or more variables |
group_by_all |
Group by a selection of variables |
groups |
Return grouping variables |
ident |
Flag a character vector as SQL identifiers |
if_else |
Vectorised if |
join |
Join two tbls together |
join.tbl_df |
Join data frame tbls |
lead-lag |
Lead and lag. |
mutate |
Add new variables |
n |
The number of observations in the current group. |
n_distinct |
Efficiently count the number of unique values in a set of vector |
na_if |
Convert values to NA |
nasa |
NASA spatio-temporal data |
near |
Compare two numeric vectors |
nth |
Extract the first, last or nth value from a vector |
order_by |
A helper function for ordering window function output |
pull |
Pull out a single variable |
ranking |
Windowed rank functions. |
recode |
Recode values |
rowwise |
Group input by rows |
sample |
Sample n rows from a table |
scoped |
Operate on a selection of variables |
select |
Select/rename variables by name |
select_all |
Select and rename a selection of variables |
select_helpers |
Select helpers |
setops |
Set operations |
slice |
Select rows by position |
sql |
SQL escaping. |
src_dbi |
Source for database backends |
starwars |
Starwars characters |
storms |
Storm tracks data |
summarise |
Reduces multiple values down to a single value |
summarise_all |
Summarise and mutate multiple columns. |
tally |
Count/tally observations by group |
tbl |
Create a table from a data source |
tbl_cube |
A data cube tbl |
top_n |
Select top (or bottom) n rows (by value) |
vars |
Select variables |
all_equal
データフレームの差分を確認する
Arguments
- target, current... 比較対象のデータフレーム
- ignore_col_order
- ignore_row_order
- convert
- ...
> # データフレーム内のデータの並びを変更する関数を定義
> scramble <- function(x) x[sample(nrow(x)), sample(ncol(x))]
>
> # 既定の挙動では行や列の並びが異なっていてもデータが等しければTRUEを返す
> all_equal(mtcars, scramble(mtcars))
[1] TRUE
> # 行の並びを考慮して差分を確認する
> all_equal(mtcars, scramble(mtcars), ignore_col_order = FALSE)
[1] "Same column names, but different order"
all_vars / any_vars
arrange
変数の値による並び替え
> arrange(mtcars, cyl, disp) %>% head()
mpg cyl disp hp drat wt qsec vs am gear carb
1 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
2 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
3 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
4 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
5 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
6 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
> arrange(mtcars, desc(disp)) %>% head()
mpg cyl disp hp drat wt qsec vs am gear carb
1 10.4 8 472 205 2.93 5.250 17.98 0 0 3 4
2 10.4 8 460 215 3.00 5.424 17.82 0 0 3 4
3 14.7 8 440 230 3.23 5.345 17.42 0 0 3 4
4 19.2 8 400 175 3.08 3.845 17.05 0 0 3 2
5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
6 14.3 8 360 245 3.21 3.570 15.84 0 0 3 4
as_data_frame
リストをデータフレームに変換
> list(x = 1:500, y = runif(500), z = 500:1) %>% as_data_frame()
# A tibble: 500 x 3
x y z
<int> <dbl> <int>
1 1 0.7483994697686 500
2 2 0.3320873326156 499
3 3 0.9686443330720 498
4 4 0.7768150032498 497
5 5 0.9165990713518 496
6 6 0.5212027428206 495
7 7 0.0969684200827 494
8 8 0.0219207596965 493
9 9 0.1633499194868 492
10 10 0.5003534886055 491
# ... with 490 more rows
between
指定した範囲内に数値があるか評価する
> x <- seq(1:10)
> dplyr::between(x, 2, 4)
[1] FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> x[dplyr::between(x, 2, 4)]
[1] 2 3 4
> iris %>% dplyr::filter(., between(Sepal.Width, 3.4, 3.8)) %>% head()
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 5.0 3.6 1.4 0.2 setosa
3 4.6 3.4 1.4 0.3 setosa
4 5.0 3.4 1.5 0.2 setosa
5 5.4 3.7 1.5 0.2 setosa
6 4.8 3.4 1.6 0.2 setosa
bind
行と列の結合(共通の変数を持っていない場合でも結合する)
> X <- data_frame(x = 1:5)
> Y <- data_frame(y = 10:6)
> bind_rows(X, Y)
# A tibble: 10 x 2
x y
<int> <int>
1 1 NA
2 2 NA
3 3 NA
4 4 NA
5 5 NA
6 NA 10
7 NA 9
8 NA 8
9 NA 7
10 NA 6
> bind_cols(X, Y)
# A tibble: 5 x 2
x y
<int> <int>
1 1 10
2 2 9
3 3 8
4 4 7
5 5 6
> combine(X, Y)
Error in combine_all(args): Columns of class data.frame not supported
copy_to
ローカル環境のデータフレームをリモートのデータソースにコピーする
Arguments
- dest
- df
- name
- ...
> copy_to(dest, df, name = deparse(substitute(df)), ...)
data_frame
データフレームの構築(base::data.frame
の強化版)
> a <- 1:5
> data_frame(a, b = a * 2)
# A tibble: 5 x 2
a b
<int> <dbl>
1 1 2
2 2 4
3 3 6
4 4 8
5 5 10
> data_frame(a, b = a * 2, c = 1)
# A tibble: 5 x 3
a b c
<int> <dbl> <dbl>
1 1 2 1
2 2 4 1
3 3 6 1
4 4 8 1
5 5 10 1
> data_frame(x = runif(10), y = x * 2)
# A tibble: 10 x 2
x y
<dbl> <dbl>
1 0.980605010875 1.961210021749
2 0.236343062948 0.472686125897
3 0.977456531487 1.954913062975
4 0.350839339662 0.701678679325
5 0.779177646386 1.558355292771
6 0.331191585399 0.662383170798
7 0.113374165725 0.226748331450
8 0.994318356970 1.988636713941
9 0.760566237383 1.521132474765
10 0.613929431653 1.227858863305
desc
降順並び替え
> desc(1:10)
[1] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
> factor(letters) %>% desc()
[1] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14 -15 -16 -17
[18] -18 -19 -20 -21 -22 -23 -24 -25 -26
do
> by_cyl <- group_by(mtcars, cyl)
> do(by_cyl, head(., 2))
# A tibble: 6 x 11
# Groups: cyl [3]
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
2 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
3 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
4 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
6 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
> models <- by_cyl %>% do(mod = lm(mpg ~ disp, data = .))
> models %>% {
+ print(.)
+ class(.)
+ }
Source: local data frame [3 x 2]
Groups: <by row>
# A tibble: 3 x 2
cyl mod
* <dbl> <list>
1 4 <S3: lm>
2 6 <S3: lm>
3 8 <S3: lm>
[1] "rowwise_df" "tbl_df" "tbl" "data.frame"
> summarise(models, rsq = summary(mod)$r.squared)
# A tibble: 3 x 1
rsq
<dbl>
1 0.6484051395593
2 0.0106260400812
3 0.2701577716900
> models %>% do(data.frame(coef = coef(.$mod)))
Source: local data frame [6 x 1]
Groups: <by row>
# A tibble: 6 x 1
coef
* <dbl>
1 40.87195532165298
2 -0.13514181455960
3 19.08198741874837
4 0.00360511850745
5 22.03279891365333
6 -0.01963409491264
> models %>% do(data.frame(
+ var = names(coef(.$mod)),
+ coef(summary(.$mod)))
+ )
Source: local data frame [6 x 5]
Groups: <by row>
# A tibble: 6 x 5
var Estimate Std..Error t.value
* <fctr> <dbl> <dbl> <dbl>
1 (Intercept) 40.87195532165298 3.58960539961242 11.386197303488
2 disp -0.13514181455960 0.03317160810475 -4.074020594143
3 (Intercept) 19.08198741874837 2.91399289197608 6.548398752547
4 disp 0.00360511850745 0.01555711478988 0.231734390094
5 (Intercept) 22.03279891365333 3.34524111451680 6.586311168436
6 disp -0.01963409491264 0.00931592587348 -2.107583849346
# ... with 1 more variables: Pr...t.. <dbl>
> models <- by_cyl %>% do(mod_linear = lm(mpg ~ disp, data = .),
+ mod_quad = lm(mpg ~ poly(disp, 2), data = .))
> compare <- models %>% do(aov = anova(.$mod_linear, .$mod_quad))
>
> Orange %>% group_by(Tree) %>% do(broom::tidy(lm(age ~ circumference, data = .)))
# A tibble: 10 x 6
# Groups: Tree [5]
Tree term estimate std.error statistic
<ord> <chr> <dbl> <dbl> <dbl>
1 3 (Intercept) -209.51232149301 85.268290402704 -2.457095369258
2 3 circumference 12.03888487911 0.835344475434 14.411880647031
3 1 (Intercept) -264.67343750000 98.620556898818 -2.683755251672
4 1 circumference 11.91924542683 0.918802910620 12.972581267502
5 5 (Intercept) -54.48409709432 76.886278786109 -0.708632254734
6 5 circumference 8.78713197900 0.621136519012 14.146860971848
7 2 (Intercept) -132.43972525629 83.131414589342 -1.593136913530
8 2 circumference 7.79522500189 0.559547938180 13.931290725948
9 4 (Intercept) -76.51367061555 88.294375729161 -0.866574682517
10 4 circumference 7.16984173775 0.571951632031 12.535748367901
# ... with 1 more variables: p.value <dbl>
group_by
水準をグループごとにまとめる
> group_by(mtcars, cyl, vs)
# A tibble: 32 x 11
# Groups: cyl, vs [5]
mpg cyl disp hp drat wt qsec vs am gear carb
* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
# ... with 22 more rows
group_indices
水準のグループに対して新たなIDを付与する
> group_indices(mtcars, cyl)
[1] 2 2 1 2 3 2 3 1 1 2 2 3 3 3 3 3 3 1 1 1 1 3 3 3 3 1 1 1 3 2 3 1
group_size
グループのサイズを得る
> group_size(mtcars)
[1] 32
> mtcars$cyl %>% factor()
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
Levels: 4 6 8
> group_by(mtcars, cyl) %>% group_size()
[1] 11 7 14
join / inner* / left / right_ / full* / semi / anti_
結合の方法
inner_join
... 内部結合left_join
... 左外部結合right_join
... 右外部結合full_join
... 完全外部結合semi_join
... 準結合anti_join
... 逆結合
lead-lag / lead / lag
与えた値より前や後ろの値を出力する
Arguments
- x... ベクター
- n
- default
- order_by
> lead(1:10, n = 1)
[1] 2 3 4 5 6 7 8 9 10 NA
> lag(1:10, n = 4)
[1] NA NA NA NA 1 2 3 4 5 6
> set.seed(71)
> x <- runif(5)
> cbind(ahead = lead(x), x, behind = lag(x))
ahead x behind
[1,] 0.555103868479 0.332928059390 NA
[2,] 0.327369962120 0.555103868479 0.332928059390
[3,] 0.211666960036 0.327369962120 0.555103868479
[4,] 0.316121358424 0.211666960036 0.327369962120
[5,] NA 0.316121358424 0.211666960036
> # http://stackoverflow.com/questions/33147714/return-value-of-next-occurrence-based-on-multiple-columns
> readr::read_delim("Firm Time Date
+ A 08:00 1/1
+ B 09:00 1/1
+ A 09:30 1/1
+ B 10:00 1/1",
+ delim = " ") %>%
+ group_by(Firm, Date) %>%
+ mutate(NextTime = lead(as.character(Time), default = "16:30"))
# A tibble: 4 x 4
# Groups: Firm, Date [2]
Firm Time Date NextTime
<chr> <time> <chr> <chr>
1 A 08:00:00 1/1 09:30:00
2 B 09:00:00 1/1 10:00:00
3 A 09:30:00 1/1 16:30
4 B 10:00:00 1/1 16:30
n
グループ内での観測値を数える
> data("flights", package = "nycflights13")
Error in find.package(package, lib.loc, verbose = verbose): there is no package called 'nycflights13'
> flights %>%
+ group_by(carrier) %>%
+ summarise(carriers = n()) %>%
+ sample_n(5)
Error in eval(lhs, parent, parent): object 'flights' not found
near
> sqrt(2) ^ 2 == 2
[1] FALSE
> near(sqrt(2) ^ 2, 2)
[1] TRUE
nth / first / last
> x <- 1:10
> nth(x, 4)
[1] 4
> first(x)
[1] 1
> last(x)
[1] 10
sample / sample_n / sample_frac
テーブルのデータから幾つかをサンプリングする
Arguments
- tbl
- size... サンプル
- replace
- weight
> sample_n(mtcars, 5)
mpg cyl disp hp drat wt qsec vs am gear carb
Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
slice
任意の行を抽出する
> slice(mtcars, 1L)
# A tibble: 1 x 11
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
> dplyr::slice(iris, 2:4)
# A tibble: 3 x 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fctr>
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
src_sqlite
sqliteデータベースとの接続
Arguments
- path
- create... 真偽値。FALSEの場合、パスは既存のものである必要がある。
- src
- from
- ...
> src_sqlite(path, create = FALSE)
tally / count
Arguments
- x
- wt
- sort
- ..., vars
> data("Batting", package = "Lahman")
Error in find.package(package, lib.loc, verbose = verbose): there is no package called 'Lahman'
> batting_tbl <- tbl_df(Batting)
Error in as_data_frame(data): object 'Batting' not found
> tally(group_by(batting_tbl, yearID))
Error in group_by(batting_tbl, yearID): object 'batting_tbl' not found
> tally(group_by(batting_tbl, yearID), sort = TRUE)
Error in group_by(batting_tbl, yearID): object 'batting_tbl' not found
> # Multiple tallys progressively roll up the groups
> plays_by_year <- tally(group_by(batting_tbl, playerID, stint), sort = TRUE)
Error in group_by(batting_tbl, playerID, stint): object 'batting_tbl' not found
> tally(plays_by_year, sort = TRUE)
Error in "n" %in% names(x): object 'plays_by_year' not found
> tally(tally(plays_by_year))
Error in "n" %in% names(x): object 'plays_by_year' not found
> # This looks a little nicer if you use the infix %>% operator
> batting_tbl %>% group_by(playerID) %>% tally(sort = TRUE)
Error in eval(lhs, parent, parent): object 'batting_tbl' not found
> # count is even more succinct - it also does the grouping for you
> batting_tbl %>% count(playerID)
Error in eval(lhs, parent, parent): object 'batting_tbl' not found
> batting_tbl %>% count(playerID, wt = G)
Error in eval(lhs, parent, parent): object 'batting_tbl' not found
> batting_tbl %>% count(playerID, wt = G, sort = TRUE)
Error in eval(lhs, parent, parent): object 'batting_tbl' not found
tbl / is.tbl / as.tbl
データからテーブルを作成する
> as.tbl(mtcars)
# A tibble: 32 x 11
mpg cyl disp hp drat wt qsec vs am gear carb
* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
# ... with 22 more rows
tbl_dt
{data.table}
のdata.tableオブジェクトの作成
> tbl_dt(mtcars) %>% {
+ print(class(.))
+ data.table::as.data.table(.) %>% class() %>% print()
+ }
Error in tbl_dt(mtcars): could not find function "tbl_dt"
tbl_df
データフレームテーブルの生成
> tbl_df(mtcars) %>% {
+ print(class(.))
+ as.data.frame(.) %>% class()
+ }
[1] "tbl_df" "tbl" "data.frame"
[1] "data.frame"
ranking
順位に関する関数群
> x <- c(5, 1, 3, 2, 2, NA)
> row_number(x) # 同位値を考慮しない
[1] 5 1 4 2 3 NA
> min_rank(x) # 同位値を考慮する
[1] 5 1 4 2 2 NA
> dense_rank(x) # 欠損値を無視
[1] 4 1 3 2 2 NA
> percent_rank(x) # 0から1の割合
[1] 1.00 0.00 0.75 0.25 0.25 NA
> cume_dist(x) # 順位割合の積み上げ
[1] 1.0 0.2 0.8 0.6 0.6 NA
> ntile(x, 2) # 指定した群数に分割
[1] 2 1 2 1 1 NA
> data_frame(size = c(100, 10, 1, 0.1, 0.01,
+ 100, 75, 50, 25, 0.1,
+ 100, 1, 1, 1, 1,
+ 20, 20, 20, 20, 20),
+ type = rep(c("log", "uniform", "dominant", "equality"),
+ each = 5)) -> DF.dummy
> DF.dummy %>% dplyr::group_by(type) %>%
+ mutate(row.num = row_number(size),
+ rank.min = min_rank(size),
+ rank.dens = dense_rank(size),
+ rank.pct = percent_rank(size),
+ cum.dst = cume_dist(size),
+ ntile = ntile(size, 50)) %>% kable()
size | type | row.num | rank.min | rank.dens | rank.pct | cum.dst | ntile |
---|---|---|---|---|---|---|---|
100.00 | log | 3 | 5 | 5 | 1.00 | 1.0 | 41 |
10.00 | log | 4 | 4 | 4 | 0.75 | 0.8 | 31 |
1.00 | log | 1 | 3 | 3 | 0.50 | 0.6 | 21 |
0.10 | log | 5 | 2 | 2 | 0.25 | 0.4 | 11 |
0.01 | log | 2 | 1 | 1 | 0.00 | 0.2 | 1 |
100.00 | uniform | 2 | 5 | 5 | 1.00 | 1.0 | 41 |
75.00 | uniform | 1 | 4 | 4 | 0.75 | 0.8 | 31 |
50.00 | uniform | 3 | 3 | 3 | 0.50 | 0.6 | 21 |
25.00 | uniform | 5 | 2 | 2 | 0.25 | 0.4 | 11 |
0.10 | uniform | 4 | 1 | 1 | 0.00 | 0.2 | 1 |
100.00 | dominant | 4 | 5 | 2 | 1.00 | 1.0 | 41 |
1.00 | dominant | 5 | 1 | 1 | 0.00 | 0.8 | 1 |
1.00 | dominant | 2 | 1 | 1 | 0.00 | 0.8 | 11 |
1.00 | dominant | 1 | 1 | 1 | 0.00 | 0.8 | 21 |
1.00 | dominant | 3 | 1 | 1 | 0.00 | 0.8 | 31 |
20.00 | equality | 5 | 1 | 1 | 0.00 | 1.0 | 1 |
20.00 | equality | 1 | 1 | 1 | 0.00 | 1.0 | 11 |
20.00 | equality | 4 | 1 | 1 | 0.00 | 1.0 | 21 |
20.00 | equality | 3 | 1 | 1 | 0.00 | 1.0 | 31 |
20.00 | equality | 2 | 1 | 1 | 0.00 | 1.0 | 41 |
src_mysql
MySQLデータベースに接続する
> src_mysql(dbname, host = NULL, port = 0L, user = "root", password = "",
+ ...)
src_postgres
PostgreSQLに接続する
src_sqlite
SQLiteデータベースに接続する
sql
top_n
上位・下位からの行数を指定した抽出
> df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))
> df %>% top_n(2)
Selecting by x
x
1 10
2 6
メモ
NSEのvignettesより
{dplyr}
の主要な関数(filter()
, mutate()
, summarise()
, arrange()
, select()
そしてgroup_by()