Thursday, 19 September 2013

Combine factors with NAs

I have a data frame built from a character matrix, and it contains numerous
NAs. I would like to create a new variable that combines all non-NA strings
in each row into one, so that from
(df = data.frame(matrix(c("A", "B", "C", NA, NA, "E", NA, "D", "A", "C",
"B", "C", NA, "C", "A"), ncol = 3)))
    X1   X2   X3
1    A    E    B
2    B <NA>    C
3    C    D <NA>
4 <NA>    A    C
5 <NA>    C    A
I would get
    X1   X2   X3 newvar
1    A    E    B  A:B:E
2    B <NA>    C    B:C
3    C    D <NA>    C:D
4 <NA>    A    C    A:C
5 <NA>    C    A    A:C
Notice that the letters within each combination are sorted alphabetically,
so the last two rows both give "A:C" rather than "A:C" and "C:A".
I've tried
within(df, newvar <- factor(X1:X2:X3))
which gives
    X1   X2   X3 newvar
1    A    E    B  A:E:B
2    B <NA>    C   <NA>
3    C    D <NA>   <NA>
4 <NA>    A    C   <NA>
5 <NA>    C    A   <NA>
but a single NA in a row turns the whole combination into NA, and the
letters in the first row come out unsorted.
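
One possible way to get the desired result is sketched below. It is not part of the original post: it assumes the columns are factors, as in the output above, and it uses a hypothetical helper named combine_row. The idea is to work row by row, drop the NAs, sort what is left, and paste the pieces together with ":".

```r
# Rebuild the example data. stringsAsFactors = TRUE is set explicitly so the
# columns are factors, as in the question, regardless of the R version.
df <- data.frame(matrix(c("A", "B", "C", NA, NA, "E", NA, "D", "A", "C",
                          "B", "C", NA, "C", "A"), ncol = 3),
                 stringsAsFactors = TRUE)

# combine_row is a hypothetical helper name, not from the original post.
# It drops the NAs in one row, sorts the remaining strings, and joins them
# with ":". A row made up entirely of NAs would give the empty string "".
combine_row <- function(r) paste(sort(r[!is.na(r)]), collapse = ":")

# apply() converts the data frame to a character matrix and calls
# combine_row once per row (MARGIN = 1).
df$newvar <- factor(unname(apply(df, 1, combine_row)))
df
```

Because each row is sorted before pasting, rows 4 and 5 end up with the same level. Wrapping the result in factor() is optional and only keeps newvar the same type as the other columns.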
