identifyr: Clean Unique Identifiers

> library(identifyr)

Version: 0.1


Function   Description
clean_id   Clean ID
clean_pb   Clean PB Numbers
clean_x    Clean X Numbers

clean_id

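Cleans and standardizes data within a data frame. The selected columns are brought to a consistent type, and values that look like missing data (for example "None", "na", or "NOT IN APS") are converted to NA.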

> df <- data.frame(pbnum = c("PB123", "PB 0034", "  5678 ", "None"),
+                  status = c("Active", "Closed", "Closed", "Active"),
+                  xnum = c("X00123", "9512", "na", "NOT IN APS"))
> clean_id(df, cols = c(1, 3), FUN = c("clean_pb", "clean_x"))
     pbnum status      xnum
1 PB000123 Active X00000123
2 PB000034 Closed X00009512
3 PB005678 Closed      <NA>
4     <NA> Active      <NA>
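
The call above pairs each column index in cols with a function name in FUN. A minimal sketch of how that kind of dispatch could work is given here; sketch_clean_id and toy_trim_upper are hypothetical names invented for illustration and are not the package's actual source.

```r
# Hypothetical stand-in for clean_id(); not taken from identifyr.
# Applies the i-th named function to the i-th selected column.
sketch_clean_id <- function(df, cols, FUN) {
  stopifnot(length(cols) == length(FUN))
  for (i in seq_along(cols)) {
    cleaner <- match.fun(FUN[[i]])                  # look the function up by name
    df[[cols[i]]] <- cleaner(as.character(df[[cols[i]]]))
  }
  df
}

# Toy cleaner used only in this illustration: trim whitespace, upper-case.
toy_trim_upper <- function(x) toupper(trimws(x))

toy <- data.frame(a = c(" ab ", "cd"),
                  b = c("x", "y"),
                  c = c("ef ", " gh"),
                  stringsAsFactors = FALSE)
res <- sketch_clean_id(toy, cols = c(1, 3),
                       FUN = c("toy_trim_upper", "toy_trim_upper"))
print(res)
```

Columns that are not listed in cols (here b) pass through unchanged, which matches how status is left alone in the example above.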

clean_pb

clean_x

> clean_x("X6789")
[1] "X00006789"
> clean_x("1234")
[1] "X00001234"
> clean_x("No #")
[1] NA
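
Judging only from the outputs shown in this document, an X number appears to be normalized to "X" followed by eight zero-padded digits, and anything that is not purely numeric after removing the prefix becomes NA. The sketch below reproduces that behavior under those assumptions; sketch_clean_x is a hypothetical name, and the eight-digit width is inferred from the examples rather than confirmed by the package.

```r
# Hypothetical re-implementation for illustration; not the package's clean_x().
# Assumed format (inferred from the examples): "X" + 8 zero-padded digits.
sketch_clean_x <- function(x, width = 8) {
  x <- toupper(gsub("\\s+", "", as.character(x)))   # drop whitespace, upper-case
  digits <- sub("^X", "", x)                        # strip an optional leading "X"
  ok <- !is.na(digits) & grepl("^[0-9]+$", digits) & nchar(digits) <= width
  out <- rep(NA_character_, length(x))              # non-numeric values stay NA
  pad <- strrep("0", width - nchar(digits[ok]))     # left-pad with zeros
  out[ok] <- paste0("X", pad, digits[ok])
  out
}

res_x <- sketch_clean_x(c("X6789", "1234", "No #"))
print(res_x)
```

The same inputs used with clean_x above ("X6789", "1234", "No #") yield "X00006789", "X00001234", and NA with this sketch.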