get values from factor using matrix - r

I want to extract some values from a factor, but instead of having a single vector of indexes I have a data.matrix with multiple columns, all containing indexes.
I have:
f<-factor(c('a','b','c','a'))
d<-data.matrix(data.frame(t=c(1,3,2),u=c(2,3,4)))
> f[d]
[1] a c b b c a
But instead of having all return values in one vector, I would like to maintain the data.matrix structure like this:
[,1][,2]
a b
c c
b a
How can I do this in an elegant way?

I'm not sure of the definition of elegant, but you are essentially looking for dim().
You can do this in a cryptic way with dim<-(), like this:
`dim<-`(f[d], dim(d))
## [,1] [,2]
## [1,] a b
## [2,] c c
## [3,] b a
## Levels: a b c
Less cryptic would be the following (though note the slight difference in the result).
matrix(f[d], ncol = ncol(d))
## [,1] [,2]
## [1,] "a" "b"
## [2,] "c" "c"
## [3,] "b" "a"
If you're after a data.frame, then try:
as.data.frame.matrix(`dim<-`(f[d], dim(d)))
## V1 V2
## 1 a b
## 2 c c
## 3 b a
Or data.frame(matrix(f[d], ncol = ncol(d))).
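As a further sketch (not part of the answer above), the column names of d can be carried over too by indexing the level labels as a plain character vector and rebuilding the array. The name res is only illustrative, and the result is a character matrix rather than a factor.

```r
f <- factor(c("a", "b", "c", "a"))
d <- data.matrix(data.frame(t = c(1, 3, 2), u = c(2, 3, 4)))

# as.character(f) is a plain vector, so d is used as a flat index
# (column by column); array() then restores the shape and dimnames of d.
res <- array(as.character(f)[d], dim = dim(d), dimnames = dimnames(d))
res
```

If the factor structure matters, the `dim<-` variant shown earlier is probably the better fit.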

Related

Compare two matrices, keeping values in one matrix that are TRUE in the other

This seems like an easy task, but I have not found a solution in R after searching here and elsewhere. I have two matrices, one with string values and another with logical values.
a <- matrix(c(
"A", "B", "C"
))
b <- matrix(c(
T, F, T
))
> b
[,1]
[1,] TRUE
[2,] FALSE
[3,] TRUE
> a
[,1]
[1,] "A"
[2,] "B"
[3,] "C"
I need to create a third matrix that keeps values in the first that are TRUE in the second, and leaving NA on the remainder, like so:
> C
[,1]
[1,] "A"
[2,] NA
[3,] "C"
How do I achieve the above result?
C <- matrix(a[ifelse(b, T, NA)], ncol = ncol(a))
Here is an alternative that simply assigns NA wherever b is FALSE:
a[b==FALSE] <- NA
[,1]
[1,] "A"
[2,] NA
[3,] "C"
Using which:
c<-a
c[which(b==FALSE)]<-NA
a <- a[b] might also work, depending on how you want the result: it drops the elements where b is FALSE instead of replacing them with NA, and returns a plain vector.
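Two more base R variants, sketched here under the assumption that a and b have the same dimensions, leave a itself untouched. The names C1 and C2 are only illustrative.

```r
a <- matrix(c("A", "B", "C"))
b <- matrix(c(TRUE, FALSE, TRUE))

# replace() works on a copy of a and sets NA where b is FALSE.
C1 <- replace(a, !b, NA)

# ifelse() takes its shape from b and picks a where b is TRUE, NA otherwise.
C2 <- ifelse(b, a, NA)

C1
C2
```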

give names to values in a matrix in R

I am not sure if this is possible, but it seems like it should be. I want a matrix whose elements have names, just as you can do with a vector like this:
v = 1:10
names(v) = LETTERS[1:10]
result:
A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10
I've tried to create a matrix and use the same syntax:
m = matrix(v, nrow=2, ncol=5)
names(m) = letters[1:10]
but the result is not what i hoped for.
result:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
attr(,"names")
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
I don't want the values and the names to be two separate entities. Is there a way to do this without any libraries, or at all?
Thank you

Display identical columns in R dataframe

Suppose I have the following dataframe :
df <- data.frame(A=c(1,2,3),B=c("a","b","c"),C=c(2,1,3),D=c(1,2,3),E=c("a","b","c"),F=c(1,2,3))
> df
A B C D E F
1 1 a 2 1 a 1
2 2 b 1 2 b 2
3 3 c 3 3 c 3
I want to filter out the columns that are identical. I know that I can do it with
DuplCols <- df[duplicated(as.list(df))]
UniqueCols <- df[ ! duplicated(as.list(df))]
In the real world my dataframe has more than 500 columns, I do not know how many identical columns of each kind I have, and I do not know the names of the columns. However, each column name is unique (as in df). My desired result is (optimally) a dataframe in which each row stores the column names of one group of identical columns. The number of columns in the DesiredResult dataframe is the size of the largest group of identical columns in the original dataframe, and where a group has fewer columns, NA should be stored:
> DesiredResult
X1 X2 X3
1 A D F
2 B E NA
3 C NA NA
(With "identical column of the same kind" I mean the following: in df the columns A, D, F are identical columns of the same kind and B, E are identical columns of the same kind.)
You can use unique and then test with %in% where each unique column matches, to extract the column names.
tt <- lapply(unique(as.list(df)), function(x) {colnames(df)[as.list(df) %in% list(x)]})
tt
#[[1]]
#[1] "A" "D" "F"
#
#[[2]]
#[1] "B" "E"
#
#[[3]]
#[1] "C"
t(sapply(tt, "length<-", max(lengths(tt)))) #As a matrix; wrap in as.data.frame() for a data.frame
# [,1] [,2] [,3]
#[1,] "A" "D" "F"
#[2,] "B" "E" NA
#[3,] "C" NA NA
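To end up with an actual data.frame, the padded groups can be bound row-wise and converted. This is only a sketch building on the approach above; the names groups, n, padded and result are illustrative. Using do.call(rbind, ...) instead of t(sapply(...)) is a deliberate choice here so that the shape stays a matrix even when every group contains a single column.

```r
df <- data.frame(A = c(1, 2, 3), B = c("a", "b", "c"), C = c(2, 1, 3),
                 D = c(1, 2, 3), E = c("a", "b", "c"), F = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# One character vector of column names per group of identical columns.
groups <- lapply(unique(as.list(df)),
                 function(x) colnames(df)[as.list(df) %in% list(x)])

# Pad every group with NA up to the size of the largest group.
n <- max(lengths(groups))
padded <- do.call(rbind, lapply(groups, `length<-`, n))

result <- as.data.frame(padded, stringsAsFactors = FALSE)
result
```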

How to create a factor from a binary indicator matrix?

Say I have the following matrix mat, which is a binary indicator matrix for the levels A, B, and C for a set of 5 observations:
mat <- matrix(c(1,0,0,
1,0,0,
0,1,0,
0,1,0,
0,0,1), ncol = 3, byrow = TRUE)
colnames(mat) <- LETTERS[1:3]
> mat
A B C
[1,] 1 0 0
[2,] 1 0 0
[3,] 0 1 0
[4,] 0 1 0
[5,] 0 0 1
I want to convert that into a single factor such that the output is equivalent to fac defined as:
> fac <- factor(rep(LETTERS[1:3], times = c(2,2,1)))
> fac
[1] A A B B C
Levels: A B C
Extra points if you get the labels from the colnames of mat, but a set of numeric codes (e.g. c(1,1,2,2,3)) would also be acceptable as desired output.
Elegant solution with matrix multiplication (and shortest up to now):
as.factor(colnames(mat)[mat %*% 1:ncol(mat)])
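To see why the multiplication works, it may help to look at the intermediate result. Because each row contains exactly one 1, multiplying by the vector of column positions yields the position of that 1 in every row. The name idx is only illustrative, and the sketch assumes exactly one 1 per row.

```r
mat <- matrix(c(1,0,0,
                1,0,0,
                0,1,0,
                0,1,0,
                0,0,1), ncol = 3, byrow = TRUE)
colnames(mat) <- LETTERS[1:3]

# Each row sums 1*col1 + 2*col2 + 3*col3, i.e. the index of its single 1.
idx <- as.vector(mat %*% 1:ncol(mat))
idx

# Use those positions to look up the labels.
as.factor(colnames(mat)[idx])
```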
This solution makes use of the arr.ind=TRUE argument of which, returning the matching positions as array locations. These are then used to index the colnames:
> factor(colnames(mat)[which(mat==1, arr.ind=TRUE)[, 2]])
[1] A A B B C
Levels: A B C
Decomposing into steps:
> which(mat==1, arr.ind=TRUE)
row col
[1,] 1 1
[2,] 2 1
[3,] 3 2
[4,] 4 2
[5,] 5 3
Use the values of the second column, i.e. which(...)[, 2] and index colnames:
> colnames(mat)[c(1, 1, 2, 2, 3)]
[1] "A" "A" "B" "B" "C"
And then convert the result to a factor.
One way is to replicate the names out by row number and index directly with the matrix, then wrap that with factor to restore the levels:
factor(rep(colnames(mat), each = nrow(mat))[as.logical(mat)])
[1] A A B B C
Levels: A B C
If this is from model.matrix, the colnames have fac prepended, and so this should work the same but removing the extra text:
factor(gsub("^fac", "", rep(colnames(mat), each = nrow(mat))[as.logical(mat)]))
You could use something like this:
lvls<-apply(mat, 1, function(currow){match(1, currow)})
fac<-factor(lvls, 1:3, labels=colnames(mat))
Here is another one
factor(rep(colnames(mat), colSums(mat)))
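Yet another possibility, sketched here as an alternative rather than taken from the answers above, is base R's max.col(), which returns the column position of the largest value in each row. Passing levels explicitly keeps a level even if its column happens to contain no 1 at all. The name fac2 is illustrative.

```r
mat <- matrix(c(1,0,0,
                1,0,0,
                0,1,0,
                0,1,0,
                0,0,1), ncol = 3, byrow = TRUE)
colnames(mat) <- LETTERS[1:3]

# ties.method = "first" makes the result deterministic
# (the default breaks ties at random).
fac2 <- factor(colnames(mat)[max.col(mat, ties.method = "first")],
               levels = colnames(mat))
fac2
```

Unlike the colSums variant, this keeps the original row order even when the rows are not sorted by level.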

Creating tuples from two vectors

If I have two vectors of the same length, A<-c(5,10) and B<-c(7,13), how can I easily turn these two vectors into a single tuple vector, i.e. c((5,7),(10,13))?
Others have mentioned lists. I see other possibilities:
cbind(A, B) # a 2x2 matrix with A and B as columns, so each row is one pair
rbind(A, B) # a 2x2 matrix with A and B as rows, which could also be added to an array with `abind`
It is also possible to preserve their "origins":
AB <- cbind(A=A, B=B)
array(c(AB,AB+10), c(2,2,2) )
, , 1
[,1] [,2]
[1,] 5 7
[2,] 10 13
, , 2
[,1] [,2]
[1,] 15 17
[2,] 20 23
> library(abind) # abind() comes from the abind package, not base R
> abind( array(c(AB,AB+10), c(2,2,2) ), AB+20)
, , 1
A B
[1,] 5 7
[2,] 10 13
, , 2
A B
[1,] 15 17
[2,] 20 23
, , 3
A B
[1,] 25 27
[2,] 30 33
Your tuple vector c((5,7),(10,13)) is not valid syntax. However, your phrasing makes me think you have something like Python's zip in mind. How do you want your tuples represented in R? R has a heterogeneous (recursive) type, list, and a homogeneous type, vector; there are no scalar types (that is, types that hold just a single value), only vectors of length 1 (somewhat of an oversimplification).
If you want your tuples to be rows of a matrix (all the same type, which they are here):
cbind(A,B)
If you want a list of vectors
mapply(c, A, B, SIMPLIFY=FALSE)
If you want a list of lists (which is what you would need if A and B are not the same type)
mapply(list, A, B, SIMPLIFY=FALSE)
Putting this all together:
> A<-c(5,10)
> B<-c(7,13)
>
> cbind(A,B)
A B
[1,] 5 7
[2,] 10 13
> mapply(c, A, B, SIMPLIFY=FALSE)
[[1]]
[1] 5 7
[[2]]
[1] 10 13
> mapply(list, A, B, SIMPLIFY=FALSE)
[[1]]
[[1]][[1]]
[1] 5
[[1]][[2]]
[1] 7
[[2]]
[[2]][[1]]
[1] 10
[[2]][[2]]
[1] 13
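As a small usage sketch, Map() is a base R shorthand for mapply(..., SIMPLIFY=FALSE), and the resulting pairs can be pulled out with [[ ]]. The name pairs is only illustrative.

```r
A <- c(5, 10)
B <- c(7, 13)

# Map(c, A, B) pairs up the i-th elements of A and B.
pairs <- Map(c, A, B)

pairs[[1]]     # first pair:  5 and 7
pairs[[2]]     # second pair: 10 and 13
pairs[[2]][1]  # first component of the second pair
```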
I'm not certain this is exactly what you're looking for, but:
list(A, B)
which gives you a structure like this:
> str(list(A, B))
List of 2
$ : num [1:2] 5 10
$ : num [1:2] 7 13
and is literally represented like this:
dput(list(A, B))
list(c(5, 10), c(7, 13))
... which is about as close to the suggested end result as you can get, I think.
A list in R is essentially a vector of whatever you want it to be.
If that isn't what you're looking for, it might be helpful if you could expand on what exactly you'd like to do with this vector.
I see what you want to accomplish (because I had the same problem)!
Why not use complex numbers? They are essentially two-dimensional numbers, and they are a built-in data type in R with all the necessary methods available:
A <- complex(real=5,imaginary=10)
B <- complex(real=7,imaginary=13)
c(A,B)
## [1] 5+10i 7+13i
matrix(c(A,B),ncol=1)
## [,1]
## [1,] 5+10i
## [2,] 7+13i
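Since complex() is vectorized, the two original vectors can also be paired element by element in one call, and Re() and Im() recover the components. This is a sketch that assumes both vectors are numeric; the name z is illustrative.

```r
A <- c(5, 10)
B <- c(7, 13)

# Real parts come from A, imaginary parts from B: one "pair" per element.
z <- complex(real = A, imaginary = B)
z

Re(z)  # first components of the pairs
Im(z)  # second components of the pairs
```

Note that this only works for pairs of numbers, and arithmetic on z follows complex-number rules rather than elementwise tuple rules.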
