I have a vector of colnames that is as long as the number of rows in a data frame:
> x <- data.frame(a=c(1,2,3), b=c(3,2,1), c=c(5,6,4))
> cols <- c("c", "a", "b")
> x
  a b c
1 1 3 5
2 2 2 6
3 3 1 4
Now I want to extract from x the column cols[i] for each row i of x, that is 5, 2, 1 in this case. I have tried to create a matrix of TRUE and FALSE values depending on the match:
> A <- matrix(rep(colnames(x),nrow(x)), nrow=nrow(x), ncol=ncol(x), byrow=TRUE) == cols
> A
      [,1]  [,2]  [,3]
[1,] FALSE FALSE  TRUE
[2,]  TRUE FALSE FALSE
[3,] FALSE  TRUE FALSE
This looks correct, but when I use it as an index, the values come back in column order rather than in row order:
> x[A]
[1] 2 1 5
Does someone know of the proper way to solve this indexing problem?
x <- data.frame(a=c(1,2,3), b=c(3,2,1), c=c(5,6,4))
cols <- c("c", "a", "b")
sapply(seq_along(cols), function(i) x[i, cols[i]])
[1] 5 2 1
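A vectorised alternative to the sapply() loop, offered here as a sketch rather than as part of the answer above, is to index with a two-column matrix of (row, column) pairs. The name idx is only illustrative, and the approach assumes all columns of x share one type, because the data frame is treated as a matrix for this kind of indexing.

```r
x <- data.frame(a = c(1, 2, 3), b = c(3, 2, 1), c = c(5, 6, 4))
cols <- c("c", "a", "b")

# One (row, column) pair per row of x; match() turns column names into positions
idx <- cbind(seq_len(nrow(x)), match(cols, names(x)))

# Matrix indexing picks exactly one cell per pair, in row order
res <- x[idx]
res
```

Unlike the logical matrix A, which is read column by column, the pairs in idx are visited in the order they are listed, so the result follows the rows of x.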
Is it possible to convert a matrix containing 6 rows and 3 columns into 6 vectors,
so that each row becomes a vector?
m = matrix( c(2, 4, 3, 1, 5, 7,4,8,9,4,5,0,2,5,7,6,1,8), nrow=6,ncol=3)
m
     [,1] [,2] [,3]
[1,]    2    4    2
[2,]    4    8    5
[3,]    3    9    7
[4,]    1    4    6
[5,]    5    5    1
[6,]    7    0    8
One option is to split the matrix by row, which creates a list of 'n' vectors, where 'n' is the number of rows of the original matrix:
lst <- split(m, row(m))
NOTE: It is better to create a list than to have many separate objects in the global environment. Also, it is not clear why this is needed.
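To make the split() call concrete, the sketch below shows what the list contains and compares it with an explicit loop over the rows. The name lst2 is only illustrative and does not come from the answer.

```r
m <- matrix(c(2, 4, 3, 1, 5, 7, 4, 8, 9, 4, 5, 0, 2, 5, 7, 6, 1, 8), nrow = 6, ncol = 3)

# row(m) labels every cell with its row number, so split() groups cells by row
lst <- split(m, row(m))
lst[["1"]]   # the first row of m as a plain vector

# The same rows extracted one at a time, for comparison
lst2 <- lapply(seq_len(nrow(m)), function(i) m[i, ])
```

The elements of lst are named "1" to "6" after the row numbers, while lst2 is unnamed; apart from that the two lists hold the same vectors.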
You could try this example to get the idea; it collapses each column into a comma-separated string:
> b <- matrix(1:20, nrow = 2, ncol = 10)
> sapply(1:ncol(b), function(i) paste(b[,i],collapse=","))
[1] "1,2" "3,4" "5,6" "7,8" "9,10" "11,12" "13,14" "15,16"
[9] "17,18" "19,20"
A solution with lapply:
lapply(as.data.frame(t(m)), c)
I have a matrix that I would like to subset quickly using two criteria: 1) the column names of a second matrix match the row names of the first, and 2) the value in the second matrix is FALSE.
m
  [,1]
A    1
B    2
C    3
D    4
E    5
tf
         E     B    A
[1,] FALSE FALSE TRUE
The output should be
m2
  [,1]
E    5
B    2
Because 'tf' has only a single row, subsetting its columns with the negated logical matrix returns a named vector, since subsetting ([) uses drop=TRUE by default. Extract the names from that vector and use them as the row index to subset 'm'. Here drop=FALSE keeps the result a matrix, because 'm' has only a single column.
m[names(tf[,!tf]), , drop=FALSE]
#  [,1]
#E    5
#B    2
data
m <- matrix(1:5, nrow=5, 1, dimnames=list(LETTERS[1:5], NULL))
tf <- matrix(c(FALSE, FALSE, TRUE), ncol=3, dimnames=list(NULL, c("E", "B", "A")))
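An equivalent way to get the same rows, shown here only as a sketch, is to index the column names of 'tf' directly instead of going through names(tf[,!tf]). The name keep is illustrative, and the approach assumes 'tf' has exactly one row, as in the question.

```r
m <- matrix(1:5, nrow = 5, ncol = 1, dimnames = list(LETTERS[1:5], NULL))
tf <- matrix(c(FALSE, FALSE, TRUE), ncol = 3, dimnames = list(NULL, c("E", "B", "A")))

# Column names of tf whose entry is FALSE
keep <- colnames(tf)[!tf]

# Use them as row names of m; drop = FALSE keeps the one-column matrix shape
m2 <- m[keep, , drop = FALSE]
m2
```

The rows of m2 come out in the order of the names in keep, not in the original row order of m.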
Say I have matrices one and two:
> one <- matrix(1:9, nrow=3, ncol=3, dimnames=list(c("X","Y","Z"), c("A", "B", "C")))
> one
  A B C
X 1 4 7
Y 2 5 8
Z 3 6 9
> two <- matrix(1:9, nrow=3, ncol=3, dimnames=list(c("X","Y","Z"), c("WRONG", "B", "C")))
> two
  WRONG B C
X     1 4 7
Y     2 5 8
Z     3 6 9
Is there a command that can produce a logical value to verify whether the column and row names of matrix one are the same as those in matrix two?
You are looking for identical(). For row names:
identical(rownames(one), rownames(two))
# [1] TRUE
And the same for colnames(). For all dimnames(), same thing:
identical(dimnames(one), dimnames(two))
# [1] FALSE
For rows and columns individually, at the same time:
Map(identical, dimnames(one), dimnames(two))
# [[1]]
# [1] TRUE
#
# [[2]]
# [1] FALSE
Update: In response to your comment, for multiple matrices you may try
length(unique(lapply(list(one, two, three), dimnames))) == 1
If this returns FALSE, you know that at least one set of dimnames is different.
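The multi-matrix check can be wrapped in a small function. This is a sketch under assumptions: the matrix three is not defined anywhere in the question, so it is made up here as a copy of one, and same_dimnames is a hypothetical helper name.

```r
one   <- matrix(1:9, nrow = 3, dimnames = list(c("X", "Y", "Z"), c("A", "B", "C")))
two   <- matrix(1:9, nrow = 3, dimnames = list(c("X", "Y", "Z"), c("WRONG", "B", "C")))
three <- one  # hypothetical third matrix, not part of the original question

# Hypothetical helper: TRUE only when every matrix passed in has identical dimnames
same_dimnames <- function(...) {
  length(unique(lapply(list(...), dimnames))) == 1
}

same_dimnames(one, two, three)  # two has a different column name
same_dimnames(one, three)       # these two agree
```

The check only says whether all sets of dimnames agree; it does not say which matrix differs.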
If there is a need to identify this for each row and column, you could do this
cbind(unlist(dimnames(one)), unlist(dimnames(one)) %in% unlist(dimnames(two)))
#     [,1] [,2]
#row1 "X"  "TRUE"
#row2 "Y"  "TRUE"
#row3 "Z"  "TRUE"
#col1 "A"  "FALSE"
#col2 "B"  "TRUE"
#col3 "C"  "TRUE"
Or else another alternative would be
do.call(`%in%`, list(dimnames(one), dimnames(two)))
# for rows and columns separately
# [1] TRUE FALSE
Is it possible to have named rows and columns in matrices?
For example:
[,a] [,b]
[a,] 1 , 2
[b,] 3 , 4
Is it even reasonable to have such a thing for exploring the data?
Sure. Use dimnames:
> a <- matrix(1:4, nrow = 2)
> a
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> dimnames(a) <- list(c("A", "B"), c("AA", "BB"))
> a
  AA BB
A  1  3
B  2  4
With dimnames, you can provide a list of (first) rownames and (second) colnames for your matrix. Alternatively, you can specify rownames(x) <- whatever and colnames(x) <- whatever.
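As a short sketch of the rownames()/colnames() route, and of why names help when exploring data, the example below sets the names separately and then indexes by name. The second matrix b is only there to show that names can also be supplied when the matrix is created.

```r
a <- matrix(1:4, nrow = 2)
rownames(a) <- c("A", "B")
colnames(a) <- c("AA", "BB")

a["A", "BB"]  # a single cell, addressed by row and column name
a["B", ]      # a whole row, returned as a named vector

# Names can also be given at creation time through the dimnames argument
b <- matrix(1:4, nrow = 2, dimnames = list(c("A", "B"), c("AA", "BB")))
```

Indexing by name keeps working if rows or columns are later reordered, which positional indexing does not.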
Suppose a matrix has some missing data recorded as NA.
How can I delete the rows that contain NA?
Can I use na.rm?
na.omit() will take matrices (and data frames) and return only those rows with no NA values whatsoever - it takes complete.cases() one step further by deleting the FALSE rows for you.
> x <- data.frame(c(1,2,3), c(4, NA, 6))
> x
  c.1..2..3. c.4..NA..6.
1          1           4
2          2          NA
3          3           6
> na.omit(x)
  c.1..2..3. c.4..NA..6.
1          1           4
3          3           6
As far as I know, na.rm is only an argument of certain functions, such as mean(), so it cannot delete rows by itself. I would go with complete.cases: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/complete.cases.htm
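To illustrate what na.rm does inside a function, here is a minimal example with mean(); the vector v is made up for the illustration.

```r
v <- c(1, NA, 3)

mean(v)                # NA, because one value is missing
mean(v, na.rm = TRUE)  # the NA is ignored before averaging
```

The argument changes how that one call treats missing values; v itself still contains the NA afterwards.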
Let's say you have the following 3x3 matrix:
x <- matrix(c(1:8, NA), 3, 3)
> x
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6   NA
Then you can get the complete cases of this matrix with
y <- x[complete.cases(x),]
> y
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
The complete.cases() function returns a logical vector that says, for each row, whether or not the case is complete:
> complete.cases(x)
[1] TRUE TRUE FALSE
You then use this vector to index the rows of the matrix x, and the trailing "," says that you want all columns.
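One edge case is worth a sketch: if only a single row is complete, plain row indexing drops the result to a vector. The matrix below is made up to show this, and drop = FALSE is one way to keep the matrix shape.

```r
# Only the first row of this 3 x 2 matrix has no NA
x <- matrix(c(1, NA, NA, 4, 5, NA), nrow = 3)

y_vec <- x[complete.cases(x), ]                # collapses to a plain vector
y_mat <- x[complete.cases(x), , drop = FALSE]  # stays a 1 x 2 matrix
```

Code that later calls nrow() or indexes by column is safer with the drop = FALSE form.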
If you want to remove rows that contain NA values, you can use apply() to run a quick check on each row. For example, if your matrix is x:
goodIdx <- apply(x, 1, function(r) !any(is.na(r)))
newX <- x[goodIdx,]
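The same row filter can be written without apply(). The sketch below is a swapped-in alternative, not the method from the answer: it counts the NA values per row with rowSums(). The name goodIdx2 is illustrative.

```r
x <- matrix(c(1:8, NA), 3, 3)

# The apply() version from above, for comparison
goodIdx <- apply(x, 1, function(r) !any(is.na(r)))

# Vectorised version: a row is good when it contains zero NA values
goodIdx2 <- rowSums(is.na(x)) == 0

newX <- x[goodIdx2, , drop = FALSE]
```

Both index vectors select the same rows; the rowSums() form avoids calling an R function once per row.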