Reordering selected columns of a data.table in R

I have a data.table dt which contains 500 columns. I need to pick 6 columns, say (a, c, k, m, n, o), from the data table and move them to the front of it.
Is there any way of doing this?

We can create a vector of the columns of interest ('nm1'), then concatenate that with the column names that are not found in 'nm1' (using setdiff). In data.table, for subsetting columns by a character vector, we use with = FALSE.
nm1 <- c('a', 'c', 'k', 'm', 'n', 'o')
dt[, c(nm1, setdiff(names(dt), nm1)), with=FALSE]
Another option is setcolorder, but the above method is more convenient if you do not want to change the column order in the original dataset.
NOTE: No external packages used.
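As a minimal sketch of the difference between the two approaches, the toy table below (the four-column dt and the two-column nm1 are made up for illustration) is first subset into a reordered copy and then reordered in place with setcolorder. It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical small table standing in for the 500-column one
dt <- data.table(b = 1:2, a = 3:4, d = 5:6, c = 7:8)
nm1 <- c("a", "c")
new_order <- c(nm1, setdiff(names(dt), nm1))

# Take a real copy of the names, since setcolorder() works by reference
before <- copy(names(dt))

# Subsetting returns a reordered copy; dt itself keeps its column order
dt2 <- dt[, new_order, with = FALSE]
names(dt2)
names(dt)

# setcolorder() reorders the columns of dt in place, without copying the data
setcolorder(dt, new_order)
names(dt)
```

On a wide table, the in-place version avoids copying the data, while the subsetting version leaves the original untouched.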

Whether dealing with data.frames or data.tables, I would suggest loading "data.table" and using setcolorder.
Paired up with moveMe from my "SOfun" package, you have a very flexible means of reordering columns.
Loading package and creating sample data:
library(SOfun)
library(data.table)
DT <- as.data.table(as.list(setNames(1:26, letters)))
DF <- setDF(copy(DT))
DT
# a b c d e f g h i j k l m n o p q r s t u v w x y z
# 1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
DF
# a b c d e f g h i j k l m n o p q r s t u v w x y z
# 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Moving columns:
setcolorder(DT, moveMe(names(DT), "a,c,k,m,n,o first"))
DT
# a c k m n o b d e f g h i j l p q r s t u v w x y z
# 1: 1 3 11 13 14 15 2 4 5 6 7 8 9 10 12 16 17 18 19 20 21 22 23 24 25 26
setcolorder(DF, moveMe(names(DF), "a,c,k,m,n,o first"))
DF
# a c k m n o b d e f g h i j l p q r s t u v w x y z
# 1 1 3 11 13 14 15 2 4 5 6 7 8 9 10 12 16 17 18 19 20 21 22 23 24 25 26
Beyond "first", you also have "last", "before", and "after".
setcolorder(DF, moveMe(names(DF), "a,c,k,m,n,o first; l,e,q,r,w last"))
DF
# a c k m n o b d f g h i j p s t u v x y z l e q r w
# 1 1 3 11 13 14 15 2 4 6 7 8 9 10 16 19 20 21 22 24 25 26 12 5 17 18 23

Related

Extract specific columns based on a character value in a row

I have a data that contains character in the first row like this
J K L M N O P
A T F T F F F T
B 14 15 10 2 3 4 78
C 10 47 15 9 6 12 12
D 17 44 17 1 0 15 11
E 3 12 14 3 2 15 17
I want to extract only the columns that contain the value "T" in row A,
so the result I want is this:
J L P
A T T T
B 14 10 78
C 10 15 12
D 17 17 11
E 3 14 17
Also, as a second step, I want to know how to do the same thing using two conditions. For example: extract all columns that contain the value "T" in row A and the value 17 in row D, so the result will be:
J L
A T T
B 14 10
C 10 15
D 17 17
E 3 14
Thank you
Here is your answer.
> df <- df[, df["A",] == "T" & df["D",] == 17]
You can use a logical index to filter columns. It supports logical conditions, and you can combine them with &.
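A self-contained sketch of the same idea follows. The data frame is a cut-down, hypothetical version of the one in the question (four columns, all stored as character because row A holds letters), and unlist() is used here to turn each row into a plain named vector before comparing; that step is an assumption of this sketch, not part of the answer above.

```r
# Hypothetical cut-down version of the question's data; every column is
# character because row "A" holds letters
df <- data.frame(
  J = c("T", "14", "10", "17", "3"),
  K = c("F", "15", "47", "44", "12"),
  L = c("T", "10", "15", "17", "14"),
  P = c("T", "78", "12", "11", "17"),
  row.names = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# One logical value per column: TRUE where row "A" equals "T"
keep_T <- unlist(df["A", ]) == "T"
res1 <- df[, keep_T, drop = FALSE]

# Two conditions combined with &; "17" is compared as a string here
keep_both <- keep_T & unlist(df["D", ]) == "17"
res2 <- df[, keep_both, drop = FALSE]

names(res1)
names(res2)
```

drop = FALSE keeps the result a data frame even if only one column matches.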

R: Change name of variable using for loop

I have a data frame and vectors that contain variable names. For each vector, I calculate the sum of the variables it names, and I want to put the result in a new variable with a different name.
Let's say I have the following data and three vectors:
>data
Name A B C D E
r1 1 5 12 21 15
r2 2 4 7 10 9
r3 5 15 6 9 6
r4 7 8 0 7 18
And I have these vectors, which are generated in a for loop and stored in the variable vec:
V1 <- c("A","B","C")
V2 <- c("B","D")
V3 <- c("D","E")
Edit 1:
These vectors are generated in a for loop, and I don't know in advance which vectors will be generated or which elements they will contain; this is just an example. I want to calculate the sum of the variables in each vector and store the result in a new variable in my data frame.
The issue is that I don't know how to give a new name to each variable created (the one that contains the sum for each vector).
data$column[j] <- rowSums(all_data_Second_program[,vec])
j <- j+1
To obtain this result for example
Name A B C Column1 D Column2 E Column3
r1 1 5 12 18 21 26 15 36
r2 2 4 7 13 10 14 9 19
r3 5 15 6 26 9 24 6 15
r4 7 8 0 15 7 15 18 25
But I didn't obtain this result.
Please tell me if you need any more information or clarification.
Can you please tell me how to do that?
Put the vectors in a list and then you can use rowSums in lapply -
list_vec <- list(c("A","B","C"), c("B","D"), c("D","E"))
new_cols <- paste0('Column', seq_along(list_vec))
data[new_cols] <- lapply(list_vec, function(x) rowSums(data[x]))
data
# Name A B C D E Column1 Column2 Column3
#1 r1 1 5 12 21 15 18 26 36
#2 r2 2 4 7 10 9 13 14 19
#3 r3 5 15 6 9 6 26 24 15
#4 r4 7 8 0 7 18 15 15 25
We may use a for loop
for(i in 1:3) {
data[[paste0('Column', i)]] <- rowSums(data[get(paste0('V', i))],
na.rm = TRUE)
}
-output
> data
Name A B C D E Column1 Column2 Column3
1 r1 1 5 12 21 15 18 26 36
2 r2 2 4 7 10 9 13 14 19
3 r3 5 15 6 9 6 26 24 15
4 r4 7 8 0 7 18 15 15 25
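Since the vectors are generated at run time, one possible variant (a sketch, not either answerer's code) is to collect them in a named list, so the new column names come from the list names and no get() lookup is needed. The list name vecs and its element names are assumptions made for this example.

```r
# Data from the question
data <- data.frame(
  Name = paste0("r", 1:4),
  A = c(1, 2, 5, 7),
  B = c(5, 4, 15, 8),
  C = c(12, 7, 6, 0),
  D = c(21, 10, 9, 7),
  E = c(15, 9, 6, 18)
)

# Hypothetical named list: each name is the new column, each element
# the columns to sum
vecs <- list(
  Column1 = c("A", "B", "C"),
  Column2 = c("B", "D"),
  Column3 = c("D", "E")
)

# Loop over the names, so any number of vectors is handled
for (nm in names(vecs)) {
  data[[nm]] <- rowSums(data[vecs[[nm]]], na.rm = TRUE)
}
data
```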

Operation of matrix in R based on dimension names

I have a symmetric matrix s defined as:
s<-matrix(1:25,5)
s[lower.tri(s)] = t(s)[lower.tri(s)]
dimnames(s) <- list(LETTERS[1:5], LETTERS[1:5])
s
A B C D E
A 1 6 11 16 21
B 6 7 12 17 22
C 11 12 13 18 23
D 16 17 18 19 24
E 21 22 23 24 25
In addition, there is a vector t defined as:
t <- seq(1,10)
names(t) <- c('C_A', 'E_A', 'E_B', 'E_C', 'E_D', 'D_A', 'D_B', 'D_C', 'C_B', 'A_B')
Now I would like to add the elements of t to the upper and lower triangular elements of s in such a way that the element of t with the name 'C_A' is added to the elements of s with row and column names of 'C' and 'A' (or 'A' and 'C'), the element of t with the name 'E_A' is added to the elements of s with row and column names of 'E' and 'A' (or 'A' and 'E'), etc. For example, both s['A','B'] and s['B','A'] should be added by t['A_B'], and similarly for all other off-diagonal elements. Do nothing for the diagonals.
What is an elegant way to achieve this?
This is not especially elegant but:
s<-matrix(1:25,5)
s[lower.tri(s)] = t(s)[lower.tri(s)]
dimnames(s) <- list(LETTERS[1:5], LETTERS[1:5])
t <- seq(1,10)
names(t) <- c('C_A', 'E_A', 'E_B', 'E_C', 'E_D', 'D_A', 'D_B', 'D_C', 'C_B', 'A_B')
s[t(do.call(cbind, strsplit(names(t), split = "_")))] <-
s[t(do.call(cbind, strsplit(names(t), split = "_")))] + t
s
#> A B C D E
#> A 1 16 11 16 21
#> B 6 7 12 17 22
#> C 12 21 13 18 23
#> D 22 24 26 19 24
#> E 23 25 27 29 25
To add to the mirrored [j,i]th elements as well, just call it again with the index columns reversed:
s[t(do.call(cbind, strsplit(names(t), split = "_")))[,2:1]] <-
s[t(do.call(cbind, strsplit(names(t), split = "_")))[,2:1]] + t
s
#> A B C D E
#> A 1 16 12 22 23
#> B 16 7 21 24 25
#> C 12 21 13 26 27
#> D 22 24 26 19 29
#> E 23 25 27 29 25
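The indexing trick used above may be easier to see on a smaller case. In this sketch (the 3 x 3 matrix m and the index matrix idx are made up for illustration), a two-column character matrix selects one cell per row, using the first column as the row name and the second as the column name.

```r
# Small named matrix, all zeros
m <- matrix(0, 3, 3, dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Each row of idx names one cell: (row name, column name)
idx <- rbind(c("C", "A"),
             c("B", "A"))

# Add to the cells ["C","A"] and ["B","A"]
m[idx] <- m[idx] + c(1, 2)

# Swapping the two index columns addresses the mirrored cells
m[idx[, 2:1]] <- m[idx[, 2:1]] + c(1, 2)
m
```

This only works when the matrix has dimnames and every name in the index matrix matches one of them.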
Use outer to create row/col and col/row indexes, then overwrite the corresponding values of s:
sel1 <- match(names(t), outer(rownames(s),colnames(s), function(x,y) paste(x,y,sep="_")))
sel2 <- match(names(t), outer(rownames(s),colnames(s), function(x,y) paste(y,x,sep="_")))
s[sel1] <- s[sel1]+t
s[sel2] <- s[sel2]+t
# A B C D E
#A 1 16 12 22 23
#B 16 7 21 24 25
#C 12 21 13 26 27
#D 22 24 26 19 29
#E 23 25 27 29 25
Err: Will fix: You can use matrix indexing with a two-column matrix of character values for row and col:
nt <- strsplit(names(t), "_")
dnt <- data.frame(n=t, t(data.frame(nt)))
s[ as.matrix(dnt[-1]) ] <- s[ as.matrix(dnt[-1]) ] + t
s
#-----------
A B C D E
A 1 16 11 16 21
B 6 7 12 17 22
C 12 21 13 18 23
D 22 24 26 19 24
E 23 25 27 29 25
s[as.matrix(dnt[c(3,2)])] <- s[as.matrix(dnt[c(3,2)])] + t
s
#----------
A B C D E
A 1 16 13 28 25
B 26 7 30 31 28
C 12 21 13 34 31
D 22 24 26 19 34
E 23 25 27 29 25
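A plain for loop also works; note that each value has to be added to the existing element rather than assigned over it:
t1 <- as.list(t)
for(i in 1:length(t1))
{
nam <- unlist(strsplit(names(t1[i]), "_"))
# add to both the [row, col] and the mirrored [col, row] element
s[nam[1],nam[2]] <- s[nam[1],nam[2]] + t[[i]]
s[nam[2],nam[1]] <- s[nam[2],nam[1]] + t[[i]]
}
> s
A B C D E
A 1 16 12 22 23
B 16 7 21 24 25
C 12 21 13 26 27
D 22 24 26 19 29
E 23 25 27 29 25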

Combining a list of named vectors without mangling the names

How do I combine a list of named vectors? I need to split a vector of integers (with characters for names) for use with parallel::parSapply() and combine them back again. Example code:
text <- 1:26
names(text) <- letters
n <- 4
text <- split(text, cut(1:length(text),breaks=n,labels=1:n))
# text <- parSapply(..., text, ...) would go here in the actual code
However, the names get mangled when I use unlist to convert the data back into a named vector:
> unlist(text)
1.a 1.b 1.c 1.d 1.e 1.f 1.g 2.h 2.i 2.j 2.k 2.l 2.m 3.n 3.o 3.p 3.q 3.r 3.s 4.t 4.u 4.v
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
4.w 4.x 4.y 4.z
23 24 25 26
What I'm looking for is the following result (except that it should work with any value of n):
> c(text[[1]],text[[2]],text[[3]],text[[4]])
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
One option without changing the structure of 'text' would be to replace the names of the unlisted vector (unlist(text)) with the names of the objects within the list elements.
setNames(unlist(text), unlist(sapply(text, names)))
# a b c d e f g h i j k l m n o p q r s t u v w x y z
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Or if it is okay to remove the names of the 'text' object, set the names of 'text' to NULL and then unlist
unlist(setNames(text, NULL))
# a b c d e f g h i j k l m n o p q r s t u v w x y z
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
You can remove the list element names first; then no compound names will be created.
> names(text) <- NULL
> do.call(c, text)
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Same as
> unlist(text)
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Or, as @RichardScriven pointed out in the comments, you can do it as follows without removing the names in the source variable: do.call("c", c(text, use.names = FALSE))
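A short round-trip check of the idea, using a differently named vector x so that the original is kept for comparison (x, chunks and combined are names chosen for this sketch):

```r
# Named integer vector, split into n chunks as in the question
x <- setNames(1:26, letters)
n <- 4
chunks <- split(x, cut(seq_along(x), breaks = n, labels = 1:n))

# Dropping the list-level names ("1".."4") before unlisting keeps the
# inner names intact
combined <- unlist(unname(chunks))

names(combined)
```

Because unname() returns a modified copy, chunks itself keeps its names, and the same call works for any value of n.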

matrix addition - Multiple unique identifiers

I am trying to add elements from two different matrices. Each matrix has three unique identifiers, as below:
Matrix A:
A B C D E F G H
1 x 1 2 10 11 12 13 10
2 y 1 2 11 12 14 12 13
3 y 1 3 12 10 11 12
The second matrix looks like this:
A B C D E F G H
1 x 1 2 20 14 17 10 10
2 y 1 2 11 12 14 12 13
3 y 1 3 17 10 19 12
Please note that the variables A, B, and D form unique identifiers for each of the participants.
I would like to write code that takes these identifiers into account when summing the matrix values.
You should put your data in the long format.
library(reshape2)
dat.l <- melt(dat,id=c('A','B','D'))
dat1.l <- melt(dat1,id=c('A','B','D'))
Then you just sum the value columns:
dat.l$value = dat.l$value + dat1.l$value
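The sum above relies on both long tables having their rows in the same order. If that is not guaranteed, a different technique is to join on the identifiers first. The following base R sketch uses merge() instead of melt(); the two small data frames are made up for illustration, with A, B and D as the identifier columns and E and F as the value columns.

```r
# Hypothetical inputs; rows of dat1 are deliberately in a different order
dat  <- data.frame(A = c("x", "y", "y"), B = c(1, 1, 1), D = c(2, 2, 3),
                   E = c(10, 11, 12), F = c(11, 12, 10))
dat1 <- data.frame(A = c("y", "x", "y"), B = c(1, 1, 1), D = c(3, 2, 2),
                   E = c(17, 20, 11), F = c(10, 14, 12))

keys <- c("A", "B", "D")
vals <- setdiff(names(dat), keys)

# Join on the identifiers; value columns get the suffixes .x and .y
m <- merge(dat, dat1, by = keys, suffixes = c(".x", ".y"))

# Add each matched pair of value columns
res <- m[keys]
for (v in vals) {
  res[[v]] <- m[[paste0(v, ".x")]] + m[[paste0(v, ".y")]]
}
res
```

By default merge() keeps only identifier combinations present in both inputs, so unmatched rows are dropped rather than summed against the wrong row.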
