R - Is there any way to extract multiple columns from a matrix by column name?
For example, given the matrix below:
A B C D E
A 1 3 5 7 9
B 2 4 6 8 10
I want to extract the submatrix with columns C, D and E, like:
C D E
A 5 7 9
B 6 8 10
thanks.
As long as the matrix has column names (returned by colnames(m)) you can use them to index the columns you'd like to extract.
m[, c("C", "D", "E")]
# C D E
# A 5 7 9
# B 6 8 10
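For completeness, here is a minimal self-contained sketch of the same idea. It rebuilds the matrix from the question (the construction with matrix() and dimnames is my assumption about how m was created) and also shows drop = FALSE, which keeps a single selected column as a matrix rather than a vector.

```r
# Rebuild the example matrix from the question (values are filled column by column)
m <- matrix(1:10, nrow = 2,
            dimnames = list(c("A", "B"), c("A", "B", "C", "D", "E")))

# Select several columns by name
sub <- m[, c("C", "D", "E")]

# A single column is simplified to a vector unless drop = FALSE is given
one <- m[, "C", drop = FALSE]
```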
Say we have a data frame with a factor (Group) that is a grouping variable for a list of IDs:
set.seed(123)
data <- data.frame(Group = factor(sample(5,10, replace = T)),
ID = c(1:10))
In this example, the IDs belong to one of 5 Groups, labeled 1:5. We simply want to replace 1:5 with A:E. In other words, if Group == 1, we want to change it to A, if Group == 2, we want to change it to B, and so on. What is the simplest way to achieve this?
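You can assign new labels with the labels= argument by calling factor once again. Note that the labels are matched to the existing levels by position; the names in the list only document the intended mapping.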
data$Group1 <- factor(data$Group, labels=list("1"="A", "2"="B", "3"="C", "4"="D", "5"="E"))
## more succinct:
data$Group2 <- factor(data$Group, labels=setNames(list("A", "B", "C", "D", "E"), 1:5))
data
# Group ID Group1 Group2
# 1 3 1 C C
# 2 3 2 C C
# 3 2 3 B B
# 4 2 4 B B
# 5 3 5 C C
# 6 5 6 E E
# 7 4 7 D D
# 8 1 8 A A
# 9 2 9 B B
# 10 3 10 C C
This works for labels in general; if capital letters are indeed wanted, see @RonakShah's solution.
You can use R's built-in constant LETTERS:
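As a hedged sketch of the relabeling approach: since the labels are applied by position, a plain character vector is enough, and spelling out levels = 1:5 (my addition, not part of the answer above) keeps the mapping stable even if a group happens to be absent from the data. Renaming the levels of a copy gives the same result.

```r
set.seed(123)
data <- data.frame(Group = factor(sample(5, 10, replace = TRUE)), ID = 1:10)

# Labels are applied to the levels by position: 1 -> A, 2 -> B, ...
data$Group1 <- factor(data$Group, levels = 1:5,
                      labels = c("A", "B", "C", "D", "E"))

# Equivalent here: rename the levels of a copy in place
data$Group2 <- data$Group
levels(data$Group2) <- c("A", "B", "C", "D", "E")
```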
data$new_group <- LETTERS[data$Group]
data
# Group ID new_group
#1 3 1 C
#2 3 2 C
#3 2 3 B
#4 2 4 B
#5 3 5 C
#6 5 6 E
#7 4 7 D
#8 1 8 A
#9 2 9 B
#10 3 10 C
Created a new column (new_group) here for comparison purposes. You can overwrite the same column if you wish to.
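One caveat worth sketching: indexing with a factor uses its integer codes, not the printed values. That is fine above because the levels are exactly 1:5, but with gaps in the levels the two differ. The vector g below is a made-up example, not data from the question.

```r
# Hypothetical factor whose levels are "3" and "5" (codes 1 and 2)
g <- factor(c(3, 5, 3))

by_code  <- LETTERS[g]                               # indexes by the codes 1, 2, 1
by_value <- LETTERS[as.integer(as.character(g))]     # indexes by the values 3, 5, 3
```

Converting through as.character first is the safer choice when the levels are not guaranteed to be 1, 2, 3, and so on.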
I have a CSV with these data:
List Rank.A List Rank.B List Rank.C
a 4 a 8 b 3
b 5 e 5 e 9
c 7 f 5 r 1
I want to create a new CSV with a single List column containing each unique value once, plus the three columns "Rank.A", "Rank.B" and "Rank.C". If a List value has no entry for one of the ranks, that cell should be left blank. I want the data in this format:
List Rank.A Rank.B Rank.C
a 4 8
b 5 3
c 7
e 5 9
f 5
r 1
Can you please help me with that?
A base R option using split.default (to split your data.frame by columns) and Reduce + merge to combine data into a single data.frame.
Reduce(
function(x, y) merge(x, y, all = TRUE),
split.default(df, rep(1:(ncol(df) / 2), each = 2)))
# List Rank.A Rank.B Rank.C
# 1 a 4 8 NA
# 2 b 5 NA 3
# 3 c 7 NA NA
# 4 e NA 5 9
# 5 f NA 5 NA
# 6 r NA NA 1
Note that this assumes that you always have pairs of columns (List, Rank.x) in your original data.
Sample data
df <- read.table(text =
"List Rank.A List Rank.B List Rank.C
a 4 a 8 b 3
b 5 e 5 e 9
c 7 f 5 r 1", header = T, check.names = F)
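Since the question asks for a new CSV with blank cells, here is a sketch that puts the pieces together and writes the result out. The temporary output file and the explicit by = "List" are my choices for illustration; write.csv's na = "" argument is what turns the NA values into empty cells.

```r
df <- read.table(text = "List Rank.A List Rank.B List Rank.C
a 4 a 8 b 3
b 5 e 5 e 9
c 7 f 5 r 1", header = TRUE, check.names = FALSE)

# Split into (List, Rank.x) pairs, then full-join the pairs on List
pairs <- split.default(df, rep(seq_len(ncol(df) / 2), each = 2))
res <- Reduce(function(x, y) merge(x, y, by = "List", all = TRUE), pairs)

# Write the result; na = "" leaves missing ranks blank instead of "NA"
out <- tempfile(fileext = ".csv")   # hypothetical output location
write.csv(res, out, row.names = FALSE, na = "")
```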
Given a data.frame:
foo <- data.frame(ID=1:10, x=1:10)
rownames(foo) <- LETTERS[1:10]
I would like to reorder a subset of rows, defined by their row names. However, I would like to swap the row names of foo as well. I can do
sel <- c("D", "H") # rows to reorder
foo[sel,] <- foo[rev(sel),]
sel.wh <- match(sel, rownames(foo))
rownames(foo)[sel.wh] <- rownames(foo)[rev(sel.wh)]
but that is long and complicated. Is there a simpler way?
We can replace the sel values in rownames with the reverse of sel.
x <- rownames(foo)
foo[replace(x, x %in% sel, rev(sel)), ]
# ID x
#A 1 1
#B 2 2
#C 3 3
#H 8 8
#E 5 5
#F 6 6
#G 7 7
#D 4 4
#I 9 9
#J 10 10
Not as concise as Ronak Shah's answer, but you could also use order.
# extract row names
temp <- row.names(foo)
# swap the selected names within the vector
temp[which(temp %in% sel)] <- temp[rev(which(temp %in% sel))]
# reset order of data.frame
foo[order(temp),]
ID x
A 1 1
B 2 2
C 3 3
H 8 8
E 5 5
F 6 6
G 7 7
D 4 4
I 9 9
J 10 10
As noted in the comments, this relies on the row names following a lexicographical order. In instances where this is not true, we can use match.
# set up
set.seed(1234)
foo <- data.frame(ID=1:10, x=1:10)
row.names(foo) <- sample(LETTERS[1:10])
sel <- c("D", "H")
Now, the rownames are
# initial data.frame
foo
ID x
B 1 1
F 2 2
E 3 3
H 4 4
I 5 5
D 6 6
A 7 7
G 8 8
J 9 9
C 10 10
# grab row names
temp <- row.names(foo)
# reorder vector containing row names
temp[which(temp %in% sel)] <- temp[rev(which(temp %in% sel))]
Using match along with order:
foo[order(match(row.names(foo), temp)),]
ID x
B 1 1
F 2 2
E 3 3
D 6 6
I 5 5
H 4 4
A 7 7
G 8 8
J 9 9
C 10 10
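Another way to look at it, sketched here as an alternative rather than as what either answer above does: build an integer index, reverse the positions of the selected rows inside it, and subset once. This does not depend on the row names being sorted. The names idx and pos are just illustrative.

```r
foo <- data.frame(ID = 1:10, x = 1:10)
rownames(foo) <- LETTERS[1:10]
sel <- c("D", "H")   # rows to reorder

idx <- seq_len(nrow(foo))          # identity ordering
pos <- match(sel, rownames(foo))   # positions of the selected rows
idx[pos] <- rev(pos)               # reverse those positions within the index
swapped <- foo[idx, ]              # rows and row names move together
```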
Your data frame is small, so you can duplicate it and then overwrite the values of each row:
footmp <- data.frame(foo)
foo[4,] <- footmp[8,]
foo[8,] <- footmp[4,]
I have data to analyse that is presented in the form of a list (just one row and MANY columns).
A B C D E F G H I
1 2 3 4 5 6 7 8 9
Is there a way to tell R to split this list every x items and get something as seen below (the columns C D E F G H I are virtually the same as A B)?
A B
1 2
3 4
5 6
7 8
9
If the number of columns is a multiple of 'x', then we unlist the dataset, and use matrix to create the expected output.
as.data.frame(matrix(unlist(df1), ncol=2, dimnames=list(NULL, c("A", "B")) , byrow=TRUE))
If the number of columns is not a multiple of 'x', then
x <- 2
gr <- as.numeric(gl(ncol(df1), x, ncol(df1)))
lst <- split(unlist(df1), gr)
do.call(rbind, lapply(lst, `length<-`, max(lengths(lst))))
# A B
# 1 1 2
# 2 3 4
# 3 5 6
# 4 7 8
# 5 9 NA
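Both cases can also be handled in one go by padding the vector with NA up to the next multiple of x before reshaping. This is a sketch of that alternative; the construction of df1 is my assumption about what the one-row data looks like.

```r
# Assumed input: one row, nine columns
df1 <- data.frame(A = 1, B = 2, C = 3, D = 4, E = 5, F = 6, G = 7, H = 8, I = 9)
x <- 2

v <- unlist(df1, use.names = FALSE)
length(v) <- x * ceiling(length(v) / x)   # pad with NA up to a multiple of x

# Fill row by row, reusing the first x column names
res <- as.data.frame(matrix(v, ncol = x, byrow = TRUE,
                            dimnames = list(NULL, names(df1)[seq_len(x)])))
```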
In R, I want to match and merge two matrices.
For example,
> A
ID a b c d e f g
1 ex 3 8 7 6 9 8 4
2 am 7 5 3 0 1 8 3
3 ple 8 5 7 9 2 3 1
> B
col1
1 a
2 c
3 e
4 f
Then, I want to match the header of matrix A against the first column of matrix B.
The final result should be a matrix like below.
> C
ID a c e f
1 ex 3 7 9 8
2 am 7 3 1 8
3 ple 8 7 2 3
*(My original data has more than 500 columns and more than 20,000 rows.)
Are there any tips for that? Would really appreciate your help.
*In addition, if matrix B looks like this instead:
> B
col1 col2 col3 col4
1 a c e f
How can I make matrix C in this case?
You want:
A[, c('ID', B[, 1])]
For the second case, you want to use row number 1 of the second matrix, instead of its first column.
A[, c('ID', B[1, ])]
If B is a data.frame instead of a matrix, the syntax changes somewhat. You can use B$col1 instead of B[, 1]. To select by row, you need to convert the result to a vector, because selecting a row of a data.frame again gives a data.frame; that is, you need unlist(B[1, ]).
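A small sketch of the data.frame case, using the values from the question. The object names B_col and B_row are made up to keep the two layouts of B apart, and the as.character calls are a precaution in case the columns were read in as factors.

```r
A <- data.frame(ID = c("ex", "am", "ple"),
                a = c(3, 7, 8), b = c(8, 5, 5), c = c(7, 3, 7), d = c(6, 0, 9),
                e = c(9, 1, 2), f = c(8, 8, 3), g = c(4, 3, 1))

B_col <- data.frame(col1 = c("a", "c", "e", "f"))                    # names in one column
B_row <- data.frame(col1 = "a", col2 = "c", col3 = "e", col4 = "f")  # names in one row

# Names stored in a column: use the column directly
C1 <- A[, c("ID", as.character(B_col$col1))]

# Names stored in a row: flatten the one-row data.frame to a vector first
C2 <- A[, c("ID", as.character(unlist(B_row[1, ])))]
```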
If A and B are data frames, you can also subset with a logical test on the column names:
cbind(ID = A$ID, A[names(A) %in% B$col1])