How can I combine the values of the columns? Like this:
Expected output:
Another option using rbind.
d <- seq(1, 27)
m <- matrix(d, nrow=3, byrow=TRUE)
Then m would look as follows (mimicking your input data):
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 2 3 4 5 6 7 8 9
[2,] 10 11 12 13 14 15 16 17 18
[3,] 19 20 21 22 23 24 25 26 27
Then you can call rbind on chunks of your input; for bigger matrices you might want to do this in a for-loop:
mnew <- data.frame(rbind(m[, 1:3], m[, 4:6], m[, 7:9]))
names(mnew) <- c('V1', 'V2', 'V3')
That yields the desired output:
V1 V2 V3
1 1 2 3
2 10 11 12
3 19 20 21
4 4 5 6
5 13 14 15
6 22 23 24
7 7 8 9
8 16 17 18
9 25 26 27
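For wider matrices, the chunking can be generated instead of typed out by hand. The following is only a sketch, assuming the number of columns is a multiple of the chunk width; the names width, starts and chunks are made up for this illustration:

```r
# Same example matrix as above
m <- matrix(seq(1, 27), nrow = 3, byrow = TRUE)

width <- 3                               # assumed chunk width (hypothetical name)
stopifnot(ncol(m) %% width == 0)         # chunks must divide the columns evenly

# Start column of every chunk: 1, 4, 7
starts <- seq(1, ncol(m), by = width)

# Cut out each chunk of `width` columns, then stack the chunks vertically
chunks <- lapply(starts, function(s) m[, s:(s + width - 1), drop = FALSE])
mnew <- as.data.frame(do.call(rbind, chunks))
names(mnew) <- paste0("V", seq_len(width))
mnew
```

Using lapply with do.call(rbind, ...) avoids growing the result inside a for-loop, but a loop would work just as well for a handful of chunks.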
First you must decide which columns you want to combine. Assuming that dat is the name of your data frame, you can do it like this:
dt <- data.frame(c(dat$V1,dat$V4,dat$V7),c(dat$V2,dat$V5,dat$V8),c(dat$V3,dat$V6,dat$V9))
Then rename your columns using names(dt) <- c("V1","V2","V3")
Let's say your basic data frame is named df1.
Using rbind, as Orhan already said, you have to split your data frame and rename the columns:
a <- df1[1:3]
names(a) <- c("V1","V2", "V3")
b <- df1[4:6]
names(b) <- c("V1","V2", "V3")
c <- df1[7:9]
names(c) <- c("V1","V2", "V3")
df2<- rbind(a,b,c)
This can easily be done by converting df to a matrix and then using rbind:
df <- as.matrix(df)
df <- rbind(df[,1:3],df[,4:6],df[,7:9])
You then have the option of converting it back to a data.frame using as.data.frame().
How can I calculate the mean of every n vectors (columns) of a df, creating a new data frame with the results?
I expect to get:
column 1: mean(V1, V2),
column 2: mean(V3, V4),
column 3: mean(V5, V6),
and so forth.
data
df <- data.frame(v1=1:6,V2=7:12,V3=13:18,v4=19:24,v5=25:30,v6=31:36)
Here is a base R option:
n <- 2 # Mean across every n = 2 columns
do.call(cbind, lapply(seq(1, ncol(df), by = n), function(idx) rowMeans(df[c(idx, idx + 1)])))
# [,1] [,2] [,3]
#[1,] 4 16 28
#[2,] 5 17 29
#[3,] 6 18 30
#[4,] 7 19 31
#[5,] 8 20 32
#[6,] 9 21 33
This returns a matrix rather than a data.frame (which makes more sense here since you're dealing with "all-numeric" data).
Explanation: The idea is a non-overlapping sliding window approach. seq(1, ncol(df), by = n) creates the start indices of the columns (here: 1, 3, 5). We then loop over those indices idx and calculate the row means of df[c(idx, idx + 1)]. This returns a list which we then cbind into a matrix.
As a minor modification, you can also predefine a data.frame with the right dimensions and then skip the do.call(cbind, ...) step by having R do an implicit list to data.frame typecast.
out <- data.frame(matrix(NA, ncol = ncol(df) / n, nrow = nrow(df)))
out[] <- lapply(seq(1, ncol(df), by = n), function(idx) rowMeans(df[c(idx, idx + 1)]))
# X1 X2 X3
#1 4 16 28
#2 5 17 29
#3 6 18 30
#4 7 19 31
#5 8 20 32
#6 9 21 33
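Note that the window c(idx, idx + 1) in both snippets above only covers n = 2. A possible generalisation to any window width is sketched below; it assumes ncol(df) is a multiple of n, and the names starts and mean_1, mean_2, ... are invented for this example:

```r
df <- data.frame(v1 = 1:6, V2 = 7:12, V3 = 13:18, v4 = 19:24, v5 = 25:30, v6 = 31:36)

n <- 3                                   # mean across every n = 3 columns
stopifnot(ncol(df) %% n == 0)            # assumption: columns split evenly

# Start index of every non-overlapping window: 1, 4
starts <- seq(1, ncol(df), by = n)

# Row means over columns idx .. idx + n - 1 for each window
means <- lapply(starts, function(idx) rowMeans(df[idx:(idx + n - 1)]))

out <- as.data.frame(do.call(cbind, means))
names(out) <- paste0("mean_", seq_along(starts))
out
```

With n <- 2 this should reproduce the three columns shown above; only the window expression idx:(idx + n - 1) differs from the original code.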
You may try:
dummy <- data.frame(
v1 = c(1:10),
v2 = c(1:10),
v3 = c(1:10),
v4 = c(1:10),
v5 = c(1:10),
v6 = c(1:10)
)
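nvec_mean <- function(df, n){
  # The number of columns must be a multiple of n
  if (ncol(df) %% n != 0){
    stop("ncol(df) must be a multiple of n")
  }
  # Row i of m holds the column indices of the i-th group
  m <- matrix(seq_len(ncol(df)), ncol = n, byrow = TRUE)
  res <- c()
  for (i in seq_len(nrow(m))){
    v <- rowMeans(df[, m[i, ], drop = FALSE])
    res <- cbind(res, v)
  }
  colnames(res) <- seq_len(nrow(m))
  res
}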
nvec_mean(dummy,3)
1 2
[1,] 1 1
[2,] 2 2
[3,] 3 3
[4,] 4 4
[5,] 5 5
[6,] 6 6
[7,] 7 7
[8,] 8 8
[9,] 9 9
[10,] 10 10
If you didn't want rowMeans or result is not what you wanted, please let me know.
Simple(?) version
df <- data.frame(v1=1:6,V2=7:12,V3=13:18,v4=19:24,v5=25:30,v6=31:36)
n <- 2
res <- c()
# Row i of m holds the column indices of the i-th group of n columns
m <- matrix(seq_len(ncol(df)), ncol = n, byrow = TRUE)
for (i in seq_len(nrow(m))){
  v <- rowMeans(df[, m[i, ]])
  res <- cbind(res, v)
}
res
v v v
[1,] 4 16 28
[2,] 5 17 29
[3,] 6 18 30
[4,] 7 19 31
[5,] 8 20 32
[6,] 9 21 33
I would like to index a vector inside a list within a list, and generate a new dataframe that contains that specific vector in each of the lists in every row. I was previously considering using a for loop to do so
a = list(odds = c(1,3,5,7), evens = c(2,4,6,8), name = "name1")
b = list(odds = c(9,11,13,15), evens = c(10,12,14,16), name = "name2")
c = list(odds = c(17,19,21,23), evens = c(18,20,22,24), name = "name3")
d = list(a,b,c)
output = data.frame()
for (i in seq_along(d)) {
  output <- rbind(output, d[[i]]$odds)
}
The expected output is as follows:
# X1 X3 X5 X7
# 1 1 3 5 7
# 2 9 11 13 15
# 3 17 19 21 23
However, since I need to do this kind of indexing all the time when I handle data, I was wondering if there is a less convoluted way of doing it. Is there perhaps a cleaner method using the lapply and rbind functions to avoid looping? I could not figure out how to index the required vector.
Apologies if the question is poorly formatted, this is my first time posting on a coding forum.
You can use:
res <- data.frame(t(sapply(d, `[[`, 'odds')))
# X1 X2 X3 X4
#1 1 3 5 7
#2 9 11 13 15
#3 17 19 21 23
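Since the question asked specifically about lapply and rbind, the same result can also be spelled out in two steps. This is a sketch in base R; odds_rows is a name made up here:

```r
a <- list(odds = c(1, 3, 5, 7), evens = c(2, 4, 6, 8), name = "name1")
b <- list(odds = c(9, 11, 13, 15), evens = c(10, 12, 14, 16), name = "name2")
c <- list(odds = c(17, 19, 21, 23), evens = c(18, 20, 22, 24), name = "name3")
d <- list(a, b, c)

# Step 1: pull the "odds" vector out of every inner list
odds_rows <- lapply(d, function(el) el$odds)

# Step 2: stack the vectors as rows and convert to a data frame
output <- as.data.frame(do.call(rbind, odds_rows))
output
```

This assumes every odds vector has the same length; otherwise rbind would recycle the shorter ones with a warning.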
We can use
library(dplyr)
library(purrr)
d %>%
transpose %>%
pluck('odds') %>%
invoke(rbind, .)
# [,1] [,2] [,3] [,4]
#[1,] 1 3 5 7
#[2,] 9 11 13 15
#[3,] 17 19 21 23
This can also be used, although it is very similar to the one posted by @akrun. The map family of functions also accepts an integer or character vector in place of an anonymous function or formula. In that case it serves as an extractor function by index (integer) or name (character), and internally it calls pluck, as in @akrun's solution. You can verify this with as_mapper("odds").
library(purrr)
# We use big bang operator to splice the list of arguments and then
# use exec to apply `rbind` function to the spliced list.
exec(rbind, !!!map(d, "odds"))
[,1] [,2] [,3] [,4]
[1,] 1 3 5 7
[2,] 9 11 13 15
[3,] 17 19 21 23
Another base R option using simplify2array
> t(do.call(cbind, simplify2array(d)["odds", ]))
[,1] [,2] [,3] [,4]
[1,] 1 3 5 7
[2,] 9 11 13 15
[3,] 17 19 21 23
x <- matrix (1:20,ncol=4)
rownames(x) <-c(letters[1:5])
x
[,1] [,2] [,3] [,4]
a 1 6 11 16
b 2 7 12 17
c 3 8 13 18
d 4 9 14 19
e 5 10 15 20
Now I would like to obtain the names of rows in which every element is greater than 3, i.e. "d" and "e"
One way to do this is to generate an index using apply and all({some expression}) and use that to subset your rownames. In this case:
idx <- apply(x, 1, function(row) all(row > 3))
rownames(x)[idx]
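A vectorised variant of the same idea, offered as a sketch rather than a replacement: compare the whole matrix at once and count the TRUE values per row. The name keep is made up for this example:

```r
x <- matrix(1:20, ncol = 4)
rownames(x) <- letters[1:5]

# x > 3 is a logical matrix; a row qualifies when all ncol(x) entries are TRUE
keep <- rowSums(x > 3) == ncol(x)
rownames(x)[keep]
```

On large matrices this tends to be faster than apply, because the comparison and the row sums are done without calling an R function once per row.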
I need to create a list of "N" vectors of length "L" that begin at number "B". If I specify that N=3, L=4 and B=5, I would need a list of the following three vectors:
5, 6, 7, 8
9, 10, 11, 12
13, 14, 15, 16
I can do it manually one by one, but I sometimes have 20 or 30 vectors to create, always with different lengths.
I would appreciate it if someone could give me a hand with this.
Cheers
Carlos
If you are happy with matrix as an output...
N <- 3
L <- 4
B <- 5
x <- seq(from = B, to = B + N * L - 1)
y <- matrix(x, nrow = N, byrow = TRUE)
y
# [,1] [,2] [,3] [,4]
# [1,] 5 6 7 8
# [2,] 9 10 11 12
# [3,] 13 14 15 16
Taking the matrix to list via transposition and data.frame...
as.list(as.data.frame(t(y)))
# $V1
# [1] 5 6 7 8
#
# $V2
# [1] 9 10 11 12
#
# $V3
# [1] 13 14 15 16
I'm showing it this way partly because I've never liked the coercion of numbers to colnames; there are certainly other ways to handle that. The transposition can be dropped if you set y <- matrix(x, nrow = L) instead, and the as.list can be dropped too, because technically a data.frame is a list.
as.data.frame(y)
# V1 V2 V3
# 1 5 9 13
# 2 6 10 14
# 3 7 11 15
# 4 8 12 16
You can use split() to get a list output.
split(seq(B, B + L*N - 1), (seq_len(L*N) - 1) %/% L)
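The question also mentions vectors of different lengths. If that means each vector may have its own length, one possible sketch is to pass a vector of lengths and build the grouping with rep. The function name make_vectors and its arguments are hypothetical, not part of any package:

```r
# B:    first number of the sequence
# lens: length of each vector, e.g. c(4, 4, 4) for N = 3, L = 4
make_vectors <- function(B, lens) {
  values <- seq(from = B, length.out = sum(lens))
  # Group label i is repeated lens[i] times: 1,1,1,1,2,2,2,2,...
  groups <- rep(seq_along(lens), times = lens)
  unname(split(values, groups))
}

equal_len <- make_vectors(5, rep(4, 3))  # the N = 3, L = 4, B = 5 case
mixed_len <- make_vectors(5, c(2, 3))    # vectors of length 2 and 3
equal_len
mixed_len
```

The equal-length call should give the three vectors from the question; the second call shows how consecutive numbers are dealt out when the lengths differ.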
I have found read.csv("file.csv")$V1, which may help to split an exported table into columns, but my data is organised row by row, so I would like to record the elements into a vector sequentially, from element[1][1] -> ... -> element[n][n]. Any thoughts on how this could be accomplished in R?
Update:
Once imported mydata looks like:
dat <- matrix(1:27, nrow = 3)
dat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 10 13 16 19 22 25
[2,] 2 5 8 11 14 17 20 23 26
[3,] 3 6 9 12 15 18 21 24 27
The desired output would be a vector: c(1, 2, 3, 4, 5, 6, 7, ...)
With the code I provided, a simple solution could be to just extract the row, but that looks too easy, so maybe I missed something.
new_dat <- dat[1, ]
new_dat
[1] 1 4 7 10 13 16 19 22 25
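If the goal is the single vector c(1, 2, 3, ...) described in the question, flattening the whole matrix may be closer to what was asked than extracting one row. A sketch, assuming dat is the matrix shown above; the names by_column and by_row are made up here:

```r
dat <- matrix(1:27, nrow = 3)

# R stores matrices column by column, so as.vector walks down each column
by_column <- as.vector(dat)

# Transposing first walks along each row instead
by_row <- as.vector(t(dat))

by_column
by_row
```

For this particular dat, the column-wise order is the one that gives 1, 2, 3, ...; for data that really is laid out row by row, the transposed version would be the one to use.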
Edit
My solution works, but it is not efficient. Here is an improved loop version, so you can store the objects separately with a single command.
First define elements that will be the name of the objects:
val <- c(1:3)
nam <- "new_dat_"
Then extract all the rows with the loop:
for (i in seq_len(nrow(dat))) { assign(paste0(nam, val[i]), dat[i, ]) }
After that, call ls() and you should see three objects named "new_dat_1", "new_dat_2" and "new_dat_3" (alongside helper variables such as val), each of which contains one row of your dat. This solution can be very helpful if you have to extract several rows and not just one. It leads to this output:
new_dat_3
[1] 3 6 9 12 15 18 21 24 27
new_dat_1
[1] 1 4 7 10 13 16 19 22 25
new_dat_2
[1] 2 5 8 11 14 17 20 23 26
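As an alternative to creating separate objects with assign, the rows can be kept together in one named list. This is only a sketch of that option; rows is a name chosen for this example:

```r
dat <- matrix(1:27, nrow = 3)

# One list element per row of dat
rows <- lapply(seq_len(nrow(dat)), function(i) dat[i, ])
names(rows) <- paste0("new_dat_", seq_len(nrow(dat)))

rows$new_dat_3
```

A list keeps the workspace tidy and can be passed straight to lapply or sapply, whereas objects created by assign have to be looked up by name again with get.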