outliers: Tests for outliers

Outlier detection

> library(outliers)

Version: 0.14


Function        Description
chisq.out.test  Chi-squared test for outlier
cochran.test    Test for outlying or inlying variance
dixon.test      Dixon tests for outlier
grubbs.test     Grubbs tests for one or two outliers in data sample
outlier         Find value with largest difference from the mean
qcochran        Critical values and p-values for Cochran outlying variance test
qdixon          Critical values and p-values for Dixon tests
qgrubbs         Calculate critical values and p-values for Grubbs tests
qtable          Interpolate tabularized distribution
rm.outlier      Remove the value(s) most differing from the mean
scores          Calculate scores of the sample

chisq.out.test

> set.seed(71)
> x <- rnorm(100)
> chisq.out.test(x)

    chi-squared test for outlier

data:  x
X-squared = 8.9672, p-value = 0.002749
alternative hypothesis: highest value 3.12588239182329 is an outlier
> chisq.out.test(x, opposite = TRUE)

    chi-squared test for outlier

data:  x
X-squared = 4.3196, p-value = 0.03768
alternative hypothesis: lowest value -2.20359620952698 is an outlier
> # boxplot(x)
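
For reference, the statistic reported above appears to be the squared deviation of the most extreme value from the sample mean, divided by the variance and compared against a chi-squared distribution with one degree of freedom. The following is a minimal base-R sketch under that assumption; `chisq_outlier_sketch` is a hypothetical helper written for illustration and is not part of the outliers package.

```r
# Hypothetical helper, not part of the outliers package.
# Assumes the variance is estimated from the sample itself.
chisq_outlier_sketch <- function(x, variance = var(x)) {
  x <- x[!is.na(x)]
  dev <- abs(x - mean(x))      # distance of each value from the mean
  i <- which.max(dev)          # position of the most extreme value
  stat <- dev[i]^2 / variance  # squared deviation scaled by the variance
  list(
    value = x[i],
    statistic = stat,
    p.value = pchisq(stat, df = 1, lower.tail = FALSE)
  )
}

res <- chisq_outlier_sketch(c(1, 2, 3, 4, 100))
res$value      # the value farthest from the mean
res$statistic
res$p.value
```

Because the variance is computed from the same sample that contains the suspect value, the result is best read as a rough screening aid.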

grubbs.test

Arguments

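  • x: numeric vector of data values
  • opposite: logical; if TRUE, test the extreme value on the opposite side of the mean instead of the most deviant one
  • type: 10 tests for one outlier (the default), 11 for two outliers on opposite tails, 20 for two outliers in the same tail
  • two.sided: logical; if TRUE, treat the test as two-sided
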
> set.seed(1234)
> x <- rnorm(10)
> grubbs.test(x)

    Grubbs test for one outlier

data:  x
G = 1.97080, U = 0.52047, p-value = 0.1323
alternative hypothesis: lowest value -2.34569770262935 is an outlier
> grubbs.test(x, type = 20)

    Grubbs test for two outliers

data:  x
U = 0.3836, p-value = 0.2459
alternative hypothesis: lowest values -2.34569770262935 , -1.20706574938542 are outliers
> grubbs.test(x, type = 11)

    Grubbs test for two opposite outliers

data:  x
G = 3.44460, U = 0.32364, p-value = 0.195
alternative hypothesis: -2.34569770262935 and 1.08444117668306 are outliers
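
As a rough illustration of the one-outlier case, G can be computed as the largest absolute deviation from the mean divided by the sample standard deviation, and a one-sided p-value can be obtained through the usual conversion to a t statistic with n - 2 degrees of freedom. The sketch below assumes that standard formulation; `grubbs_one_sketch` is a hypothetical helper and may differ in detail from what `grubbs.test` does internally.

```r
# Hypothetical helper, not part of the outliers package.
# Covers only the single-outlier case and needs at least 3 observations.
grubbs_one_sketch <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  dev <- abs(x - mean(x))
  i <- which.max(dev)
  g <- dev[i] / sd(x)          # Grubbs statistic G
  # Convert G to a squared t statistic with n - 2 degrees of freedom.
  t2 <- g^2 * n * (n - 2) / ((n - 1)^2 - n * g^2)
  # One-sided p-value, multiplied by n and capped at 1.
  p <- min(1, n * pt(sqrt(t2), df = n - 2, lower.tail = FALSE))
  list(value = x[i], G = g, p.value = p)
}

res <- grubbs_one_sketch(c(-2, -1, 0, 1, 10))
res$value
res$G
res$p.value
```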

outlier

Find the value that deviates most from the mean

Arguments

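  • x: numeric vector of data values
  • opposite: logical; if TRUE, return the extreme value on the opposite side of the mean
  • logical: logical; if TRUE, return a logical vector marking the position of the outlier instead of its value
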
> set.seed(1234)
> y <- rnorm(100)
> outlier(y)
[1] 2.548991
> outlier(y, opposite = TRUE)
[1] -2.345698
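
The behavior shown above can be approximated in base R: compare how far the maximum and the minimum lie from the mean, return the farther one, and let `opposite` flip the choice. This is a sketch under that assumption; `outlier_sketch` is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper, not part of the outliers package.
outlier_sketch <- function(x, opposite = FALSE, logical = FALSE) {
  m <- mean(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  lo <- min(x, na.rm = TRUE)
  # Pick the extreme that lies farther from the mean; `opposite` flips it.
  pick_low <- xor((hi - m) < (m - lo), opposite)
  target <- if (pick_low) lo else hi
  # Return either the value itself or a mask marking where it occurs.
  if (logical) x == target else target
}

v <- c(1, 2, 3, 4, 100)
outlier_sketch(v)                   # farthest value from the mean
outlier_sketch(v, opposite = TRUE)  # extreme value on the other side
outlier_sketch(v, logical = TRUE)   # logical mask of the outlier position
```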

rm.outlier

Remove outliers

> library(magrittr)  # provides the %>% pipe
> y %>% length()
[1] 100
> rm.outlier(y) %>% length()
[1] 99
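
A minimal sketch of the same idea in base R, showing both dropping the most deviant value and replacing it instead. `rm_outlier_sketch` and its `fill` behavior are assumptions made for illustration (here the replacement is the mean of the remaining values); check the package documentation for the exact semantics of `rm.outlier`.

```r
# Hypothetical helper, not part of the outliers package.
rm_outlier_sketch <- function(x, fill = FALSE) {
  i <- which.max(abs(x - mean(x)))  # position of the most deviant value
  if (fill) {
    x[i] <- mean(x[-i])             # assumed: replace with mean of the rest
    x
  } else {
    x[-i]                           # drop the value, shortening the vector
  }
}

v <- c(1, 2, 3, 4, 100)
removed <- rm_outlier_sketch(v)
filled <- rm_outlier_sketch(v, fill = TRUE)
length(removed)
filled
```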