gdata: Various R Programming Tools for Data Manipulation

A collection of convenient functions for data manipulation in R.

> library(gdata)

Version: 2.17.0


Function name       Summary
Args                Describe Function Arguments
ConvertMedUnits     Convert medical measurements between International Standard (SI) and US 'Conventional' Units.
MedUnits            Table of conversions between International Standard (SI) and US 'Conventional' Units for common medical measurements.
aggregate.table     Defunct Functions in Package 'gdata'
ans                 Value of Last Evaluated Expression
bindData            Bind two data frames into a multivariate data frame
case                Map elements of a vector according to the provided 'cases'
cbindX              Column-bind objects with different number of rows
centerText          Center Text Strings
combine             Combine R Objects With a Column Labeling the Source
drop.levels         Drop unused factor levels
duplicated2         Determine Duplicate Elements
elem                Display Information about Elements in a Given Object
env                 Describe All Loaded Environments
first               Return first or last element of an object
frameApply          Subset analysis on data frames
gdata-package       Various R programming tools for data manipulation
getYear             Get date/time parts from date and time objects
humanReadable       Print Byte Size in Human Readable Format
installXLSXsupport  Install perl modules needed for read.xls to support Excel 2007+ XLSX format
interleave          Interleave Rows of Data Frames or Matrices
is.what             Run Multiple is.* Tests on a Given Object
keep                Remove All Objects, Except Those Specified
left                Return the leftmost or rightmost columns of a matrix or dataframe
ll                  Describe Objects or Elements
ls.funs             List function objects
mapLevels           Mapping levels
matchcols           Select columns names matching certain criteria
nPairs              Number of variable pairs
nobs                Compute the Number of Non-missing Observations
object.size         Report the Space Allocated for Objects
read.xls            Read Excel files
rename.vars         Remove or rename variables in a dataframe
reorder.factor      Reorder the Levels of a Factor
resample            Consistent Random Samples and Permutations
sheetCount          Count or list sheet names in Excel spreadsheet files.
startsWith          Determine if a character string "starts with" the specified characters.
trim                Remove leading and trailing spaces from character strings
trimSum             Trim a vector such that the last/first value represents the sum of trimmed values
unknownToNA         Change unknown values to NA and vice versa
unmatrix            Convert a matrix into a vector, with appropriate names
upperTriangle       Extract or replace the upper/lower triangular portion of a matrix
wideByFactor        Create multivariate data by a given factor
write.fwf           Write object in fixed width format
xlsFormats          Check which file formats are supported by read.xls

Args

Get information about a function's arguments.

> Args(scan)
             value                                    

file ""
what double()
nmax -1
n -1
sep ""
quote if (identical(sep, "\n")) "" else "'\""
dec "."
skip 0
nlines 0
na.strings "NA"
flush FALSE
fill FALSE
strip.white FALSE
quiet FALSE
blank.lines.skip TRUE
multi.line TRUE
comment.char ""
allowEscapes FALSE
fileEncoding ""
encoding "unknown"
text
skipNul FALSE

> Args(legend, sort = TRUE) %>% knitr::kable(format = "markdown")
       value              

adj c(0, 0.5)
angle 45
bg par("bg")
border "black"
box.col par("fg")
box.lty par("lty")
box.lwd par("lwd")
bty "o"
cex 1
col par("col")
density NULL
fill NULL
horiz FALSE
inset 0
legend
lty
lwd
merge do.lines && has.pch
ncol 1
pch
plot TRUE
pt.bg NA
pt.cex cex
pt.lwd lwd
seg.len 2
text.col par("col")
text.font NULL
text.width NULL
title NULL
title.adj 0.5
title.col text.col
trace FALSE
x
x.intersp 1
xjust 0
xpd
y NULL
y.intersp 1
yjust 1
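
For comparison, base R exposes the same name/default pairs through formals(). The following is a minimal sketch using only base R, not gdata's implementation of Args(); the variable name arg_list is made up for illustration.

```r
# formals() returns a function's arguments as a named list of default expressions
arg_list <- formals(scan)

# Argument names, in declaration order
print(names(arg_list))

# Individual defaults are stored as constants or unevaluated expressions
print(arg_list$sep)            # the empty-string default of sep
print(deparse(arg_list$what))  # the default of what, shown as text
```

Arguments without a default (such as text in the scan listing) appear in the list with an empty value, which is why their rows are blank in the tables above.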

ans

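Get the value of the last evaluated expression; equivalent to .Last.value. Note that the output below does not show 2. It was apparently captured in a rendered (non-interactive) document, where .Last.value still held the result of an earlier top-level call. In an interactive session, ans() would be expected to return 2 here.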

> 1 * 2
[1] 2
> ans()
 [1] "Rgitbook"  "gdata"     "stats"     "graphics"  "grDevices"
 [6] "utils"     "datasets"  "ggplot2"   "pipeR"     "magrittr" 
[11] "knitr"     "devtools"  "methods"   "base"

bindData

Bind two data frames into a single multivariate data frame.

ref) base::merge()

centerText

Center a text string for display.

> cat(centerText("One Line Test"), "\n\n")
                               One Line Test
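
The idea of centering can be sketched in base R by left-padding the string with half of the unused width. The helper center_text below is hypothetical and written only for illustration; it is not gdata's implementation and may differ from centerText() in details such as rounding.

```r
# Hypothetical helper: left-pad each string so it sits in the middle of `width` columns
center_text <- function(x, width = getOption("width")) {
  # Number of leading spaces; never negative when the string is wider than `width`
  pad <- pmax(0, floor((width - nchar(x)) / 2))
  paste0(strrep(" ", pad), x)
}

cat(center_text("One Line Test", width = 40), "\n")
```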

drop.levels

> iris$Species[1] %>% droplevels()
[1] setosa
Levels: setosa
> list(f = factor(c("A", "B", "C", "D"))[1:3], 
+      i = 1:3, 
+      c = c("A", "B", "D")) %>% drop.levels()
$f
[1] A B C
Levels: A B C

$i
[1] 1 2 3

$c
[1] "A" "B" "D"
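
Subsetting a factor keeps levels that no longer occur in the data, and that is what these functions clean up. A minimal base-R sketch of the effect, using droplevels() as in the first example above (the variable names are illustrative):

```r
# Taking the first three elements keeps all four levels
f <- factor(c("A", "B", "C", "D"))[1:3]
print(levels(f))        # "D" is still listed although no element uses it

# Dropping unused levels leaves only those that actually occur
f_clean <- droplevels(f)
print(levels(f_clean))
```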

duplicated2

Detect duplicated elements (marks every occurrence of a duplicate, not only the later or the earlier one).

ref) base::duplicated()

> iris[duplicated(iris), ]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
143          5.8         2.7          5.1         1.9 virginica
> iris[duplicated(iris, fromLast = TRUE), ]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
102          5.8         2.7          5.1         1.9 virginica
> iris[duplicated2(iris), ]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
102          5.8         2.7          5.1         1.9 virginica
143          5.8         2.7          5.1         1.9 virginica
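
The same result can be reproduced in base R by combining the forward and backward scans shown above. This is a sketch of the idea, not necessarily how duplicated2() is implemented; dup_any is an illustrative name.

```r
# Mark a row if it repeats an earlier row OR is repeated by a later one
dup_any <- duplicated(iris) | duplicated(iris, fromLast = TRUE)

# Row numbers of every member of a duplicated pair
print(which(dup_any))
print(iris[dup_any, ])
```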

env

Information about the currently loaded environments.

> env(unit = "KB", digits = 2)
   Environment       Objects KB      
1  .GlobalEnv          15      321.40
2  package:remoji       6       49.21
3  package:Rgitbook     9      347.62
4  package:gdata       67      628.23
5  tools:rstudio      490     3637.53
6  package:stats      452    33299.22
7  package:graphics    88     9431.13
8  package:grDevices  108     5144.52
9  package:utils      207    18529.83
10 package:datasets   104      594.75
11 package:ggplot2    373     5869.58
12 package:pipeR        3       48.78
13 package:magrittr    37      419.59
14 package:knitr       91     1162.25
15 package:devtools   127      961.98
16 JapanEnv             1        7.51
17 package:methods    379    21892.74
18 Autoloads            1        0.00
19 package:base      1308    31465.50
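
A rough base-R approximation of the first two columns can be built from search() and ls(). This sketch assumes that counting the objects listed by ls() is close enough for illustration; it does not compute sizes, and the counts may not match env() exactly.

```r
# Names of all environments on the search path, in lookup order
env_names <- search()

# Count the visible objects in each environment, addressed by search-path position
n_objects <- vapply(
  seq_along(env_names),
  function(pos) length(ls(as.environment(pos))),
  integer(1)
)

env_summary <- data.frame(Environment = env_names, Objects = n_objects)
print(env_summary)
```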

first / last

Get the first and last elements of an object.

> x <- 1:4
> first(x)
[1] 1
> last(x)
[1] 4
> d_l <- list(a = 1, b = 2, c = 3)
> first(d_l)
$a
[1] 1
> df_x <- data.frame(a = 1:2, b = 3:4, c = 5:6)
> first(df_x)
  a b c
1 1 3 5
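
For the cases shown above, base R's head() and tail() with n = 1 give comparable results. This is a sketch for comparison only, without any claim about how first() and last() are implemented.

```r
x <- 1:4
print(head(x, 1))     # first element of a vector
print(tail(x, 1))     # last element of a vector

df_x <- data.frame(a = 1:2, b = 3:4, c = 5:6)
print(head(df_x, 1))  # first row of a data frame
print(tail(df_x, 1))  # last row of a data frame
```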

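getYear / getMonth / getDay / getHour / getMin / getSec

Extract individual parts (year, month, day, hour, minute, second) from a date-time object, here the current time.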

> t <- Sys.time()
> getYear(t)
[1] "2015"
> getMonth(t)
[1] "10"
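
The character results above look like the output of format() with a single conversion specification. A minimal base-R sketch under that assumption, using a fixed timestamp (chosen here only for reproducibility) instead of Sys.time():

```r
# A fixed date-time in UTC so the result does not depend on when the code runs
d <- as.POSIXct("2015-10-05 12:34:56", tz = "UTC")

# Each part is extracted as a character string, as in the output above
year  <- format(d, "%Y")
month <- format(d, "%m")
day   <- format(d, "%d")
hour  <- format(d, "%H")

print(c(year = year, month = month, day = day, hour = hour))
```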

humanReadable

Arguments

  • x
  • standard
  • units
  • digits
  • width
  • sep
  • justify
> humanReadable(2^32 - 1, standard = "Unix")
[1] "4.0 G"
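
The "Unix"-style result above can be imitated by dividing by 1024 until the value drops below 1024 and then appending a one-letter unit. The helper human_size is hypothetical and covers only this one style; it is a sketch of the arithmetic, not gdata's implementation.

```r
# Hypothetical helper: scale a byte count by powers of 1024 and attach a unit letter
human_size <- function(bytes, digits = 1) {
  units <- c("B", "K", "M", "G", "T", "P")
  i <- 1
  # Move to the next unit while the value still has four or more integer digits
  while (bytes >= 1024 && i < length(units)) {
    bytes <- bytes / 1024
    i <- i + 1
  }
  paste(formatC(bytes, format = "f", digits = digits), units[i])
}

print(human_size(2^32 - 1))
print(human_size(1024))
```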

nobs

Count the non-missing observations (useful for checking how many values are missing).

> x <- c(1, 2, 3, 5, NA, 6, 7, 1, NA)
> length(x)
[1] 9
> gdata::nobs(x)
[1] 7
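
For a plain vector, the same count can be obtained in base R with is.na(). A minimal sketch with illustrative variable names:

```r
x <- c(1, 2, 3, 5, NA, 6, 7, 1, NA)

# Non-missing and missing counts add up to length(x)
n_present <- sum(!is.na(x))
n_missing <- sum(is.na(x))

print(c(present = n_present, missing = n_missing))
```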

read.xls

> path <- system.file("xls", "iris.xls", package = "gdata")
> df_iris <- gdata::read.xls(path)
> dim(df_iris)
[1] 150   5

sheetCount / sheetNames

> path <- system.file("xls", "iris.xls", package = "gdata")
> sheetCount(path)
[1] 1
> sheetNames(path)
[1] "iris.xls"