I have a vector of integers that I want to split into chunks of 3, sort each chunk, and then put the sorted chunks back into a single integer vector.
as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
First step - split like this:
16,9,2
17,10,3
18,11,4
19,12,5
20,13,6
21,14,7
22,15,8
Second step - order:
2,9,16
3,10,17
4,11,18
5,12,19
6,13,20
7,14,21
8,15,22
Third step - put back into integer vector:
2,9,16,3,10,17,4,11,18,5,12,19,6,13,20,7,14,21,8,15,22
With matrix + sort:
x <- as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
c(apply(matrix(x, ncol = 3, byrow = T), 1, sort))
#[1] 2 9 16 3 10 17 4 11 18 5 12 19 6 13 20 7 14 21 8 15 22
Or with split + gl:
unlist(lapply(split(x, gl(length(x) / 3, 3)), sort))
Another, shorter approach with split + rev (this only works when every chunk is already in descending order, so that reversing gives the same result as sorting):
c(do.call(rbind, rev(split(x, 1:3))))
#[1] 2 9 16 3 10 17 4 11 18 5 12 19 6 13 20 7 14 21 8 15 22
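To make that caveat concrete, here is a small sketch with a made-up vector y (not from the question) whose chunks are not in descending order. The matrix + sort approach still sorts each chunk, while the split + rev shortcut only reorders the interleaved groups:

```r
# y is a hypothetical example; its chunks are (3, 1, 2) and (6, 4, 5)
y <- c(3L, 1L, 2L, 6L, 4L, 5L)

# matrix + sort: every row (chunk) is sorted explicitly
sorted_chunks <- c(apply(matrix(y, ncol = 3, byrow = TRUE), 1, sort))
sorted_chunks
#[1] 1 2 3 4 5 6

# split + rev: only reverses the order of the three interleaved groups
rev_shortcut <- c(do.call(rbind, rev(split(y, 1:3))))
rev_shortcut
#[1] 2 1 3 5 4 6
```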
No {dplyr} required here.
x <- as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
spl.x <- split(x, ceiling(seq_along(x)/3)) # split the vector
spl.x <- lapply(spl.x, sort) # sort each element of the list
Reduce(c, spl.x) # Reduce list to vector
Second line (splitting) is from this answer: https://stackoverflow.com/a/3321659/2433233
This also works if the length of your original vector is not a multiple of 3. In that case the last list element is simply shorter.
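As a quick check of the uneven case, here is a minimal sketch with a made-up vector of length 8 (the names x8 and groups are only for illustration):

```r
# hypothetical input whose length (8) is not a multiple of 3
x8 <- c(5L, 1L, 3L, 9L, 7L, 8L, 2L, 6L)

# group labels 1 1 1 2 2 2 3 3: the last group has only two elements
groups <- ceiling(seq_along(x8) / 3)

# sort within each group and flatten back to a plain vector
result <- unlist(lapply(split(x8, groups), sort), use.names = FALSE)
result
#[1] 1 3 5 7 8 9 2 6
```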
Here is one way to do steps in order:
vector=as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
chunk <- 3
n <- length(vector)
r <- rep(1:ceiling(n/chunk),each=chunk)[1:n]
list_of3 <- split(vector,r)
# > list_of3
# $`1`
# [1] 16 9 2
#
# $`2`
# [1] 17 10 3
#
# $`3`
# [1] 18 11 4
#
# $`4`
# [1] 19 12 5
#
# $`5`
# [1] 20 13 6
#
# $`6`
# [1] 21 14 7
#
# $`7`
# [1] 22 15 8
sorted_list<- lapply(list_of3, function(x)sort(x))
final_vector <- unname(unlist(sorted_list))
final_vector
# > final_vector
# [1] 2 9 16 3 10 17 4 11 18 5 12 19 6 13 20 7 14 21 8 15 22
Here is one way to do it:
v <- as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
res <- split(v, 0:(length(v)-1) %/%3)
unlist(lapply(res, sort), use.names = FALSE)
You can put your data into a 3 column matrix by row, sort rowwise, transpose and convert back to vector:
v <- as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
m <- matrix(v, ncol = 3, byrow = TRUE)
c(t(matrix(m[order(row(m), m)], nrow(m), byrow = TRUE)))
[1] 2 9 16 3 10 17 4 11 18 5 12 19 6 13 20 7 14 21 8 15 22
Something like this goes through every step:
library(magrittr) # provides the %>% pipe (also loaded by dplyr)
v = as.integer(c(16,9,2,17,10,3,18,11,4,19,12,5,20,13,6,21,14,7,22,15,8))
v2 = v %>% matrix(ncol = 3, byrow = TRUE)
# [,1] [,2] [,3]
# [1,] 16 9 2
# [2,] 17 10 3
# [3,] 18 11 4
# [4,] 19 12 5
# [5,] 20 13 6
# [6,] 21 14 7
# [7,] 22 15 8
v3 = v2[, rev(seq_len(ncol(v2)))]
# [,1] [,2] [,3]
# [1,] 2 9 16
# [2,] 3 10 17
# [3,] 4 11 18
# [4,] 5 12 19
# [5,] 6 13 20
# [6,] 7 14 21
# [7,] 8 15 22
v4 = v3 %>% t %>% as.vector # transpose first so the values are read row by row
# [1] 2 9 16 3 10 17 4 11 18 5 12 19 6 13 20 7 14 21 8 15 22
Suppose we have a column x containing a vector of 21 numbers:
x
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
I want to split it into multiple columns following a flexible pattern.
The numbers are set in advance (n could be 3 or 4 or ...):
n=3, n1=2, n2=3, n3=2, ... The total number of columns is determined by n.
With n=3, column 1 has n*n1 rows, column 2 has n*n2 rows, and column 3 has n*n3 rows (these numbers can be variables).
The final output is (this is the n=3 case, but my final goal is that n could be 4, 5, ...):
1  7 16
2  8 17
3  9 18
4 10 19
5 11 20
6 12 21
  13
  14
  15
If n is set as n=2, n1=3, n2=4, the single column would contain 14 numbers, c(1:14). (In practice I do not know in advance how many columns need to be created; the number of columns is input by the user.)
Then what I want to get is n=2 columns:
1  7
2  8
3  9
4 10
5 11
6 12
  13
  14
I am trying to have the columns created automatically from these variables.
Many thanks.
We can create a grouping variable with rep and then split on it:
split(df1$x, rep(1:3, c(6, 9, 6)))
#$`1`
#[1] 1 2 3 4 5 6
#$`2`
#[1] 7 8 9 10 11 12 13 14 15
#$`3`
#[1] 16 17 18 19 20 21
A function can be created with arguments, 'n', and additional arguments with ...
f1 <- function(dat, n, ...) {
rgrp <- n * c(...)
split(dat[[1]][seq_len(sum(rgrp))], rep(seq_len(n), rgrp))
}
f1(df1, 2, 3, 4)
#$`1`
#[1] 1 2 3 4 5 6
#$`2`
#[1] 7 8 9 10 11 12 13 14
f1(df1, 3, 2, 3, 2)
#$`1`
#[1] 1 2 3 4 5 6
#$`2`
#[1] 7 8 9 10 11 12 13 14 15
#$`3`
#[1] 16 17 18 19 20 21
If the user submits a vector and we don't have n, we can get n from the length of the vector:
f1 <- function(dat, vec) {
n <- length(vec)
rgrp <- n * vec
split(dat[[1]][seq_len(sum(rgrp))], rep(seq_len(n), rgrp))
}
f1(df1, 3:4)
If the user inputs 'n1', 'n2' as separate arguments, we can use ...
f1 <- function(dat, ...) {
vec <- c(...)
n <- length(vec)
rgrp <- n * vec
split(dat[[1]][seq_len(sum(rgrp))], rep(seq_len(n), rgrp))
}
f1(df1, 3, 4)
data
df1 <- structure(list(x = 1:21), class = "data.frame", row.names = c(NA,
-21L))
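Since the question asks for columns rather than a list, one possible follow-up is to pad the shorter groups with NA and bind them side by side. This is only a sketch: pad_to_columns is a hypothetical helper that is not part of the answer above, and f1 is repeated here so the block runs on its own.

```r
df1 <- data.frame(x = 1:21)

# f1 as defined above: n is taken from the number of values passed in ...
f1 <- function(dat, ...) {
  vec <- c(...)
  n <- length(vec)
  rgrp <- n * vec
  split(dat[[1]][seq_len(sum(rgrp))], rep(seq_len(n), rgrp))
}

# hypothetical helper: pad every group with NA up to the longest group,
# then let sapply bind the equal-length vectors into a matrix
pad_to_columns <- function(lst) {
  max_len <- max(lengths(lst))
  sapply(lst, function(v) c(v, rep(NA, max_len - length(v))))
}

cols <- pad_to_columns(f1(df1, 3, 4))
cols
```

The result is a matrix with one column per group, where the first column holds 1 to 6 followed by two NA values and the second column holds 7 to 14.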
I have a 3D data matrix (df) of the shape[1:1000,1:221,1:2],
a reproducible example is the following:
d <- as.data.frame( matrix( 1:(5*2*3), 10, 3))
df = array( unlist(d), dim=c(5, 2, 3))
df
, , 1
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
, , 2
[,1] [,2]
[1,] 11 16
[2,] 12 17
[3,] 13 18
[4,] 14 19
[5,] 15 20
, , 3
[,1] [,2]
[1,] 21 26
[2,] 22 27
[3,] 23 28
[4,] 24 29
[5,] 25 30
the first dimension is trails, and the second dimension is outcomes, and the third dimension is people.
For each person, I want to get a graph like the following (an Excel plot for the first person, df[,,1])
I want to have such a plot for each person displayed on the same page, but I am stuck on how to achieve this using ggplot.
Using your data, you can first re-organize your array in a dataframe (there are maybe easier ways to achieve this part):
final_df = NULL
nb_person = 3
trail = NULL
person = NULL
for(i in 1:nb_person) {
final_df = rbind(final_df, df[,,i])
trail = c(trail, 1:dim(df[,,i])[1])
person = c(person,rep(i,dim(df[,,i])[1]))
}
final_df = data.frame(final_df)
colnames(final_df) = c("start","end")
final_df$trail = trail
final_df$person = person
start end trail person
1 1 6 1 1
2 2 7 2 1
3 3 8 3 1
4 4 9 4 1
5 5 10 5 1
6 11 16 1 2
7 12 17 2 2
8 13 18 3 2
9 14 19 4 2
10 15 20 5 2
11 21 26 1 3
12 22 27 2 3
13 23 28 3 3
14 24 29 4 3
15 25 30 5 3
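As a possible loop-free alternative to the reorganizing step, the array can be given named dimnames and flattened directly into long format with base R's as.data.frame.table. This is a sketch, not the method used above; the dimension names trail, Variable and person, the labels "start" and "end", and the object name long_df are chosen here to mirror the columns in this answer. Note that it yields the long format in one step (so no pivot_longer is needed) and that the three index columns come back as factors.

```r
# same values as the example array in the question
df <- array(1:30, dim = c(5, 2, 3))

# name the three dimensions; the labels are assumptions matching this answer
dimnames(df) <- list(trail = 1:5,
                     Variable = c("start", "end"),
                     person = 1:3)

# one row per array cell: columns trail, Variable, person (factors) and value
long_df <- as.data.frame.table(df, responseName = "value")
head(long_df, 3)
```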
Then, you can reshape it using pivot_longer function from the package tidyr (if you install and load tidyverse, both tidyr and ggplot2 will be installed and loaded).
library(tidyverse)
final_df_reshaped <- final_df %>% pivot_longer(., -c(trail,person),names_to = "Variable",values_to = "value")
# A tibble: 30 x 4
trail person Variable value
<int> <int> <chr> <int>
1 1 1 start 1
2 1 1 end 6
3 2 1 start 2
4 2 1 end 7
5 3 1 start 3
6 3 1 end 8
7 4 1 start 4
8 4 1 end 9
9 5 1 start 5
10 5 1 end 10
# … with 20 more rows
Alternative using gather for older versions of tidyr
If you have an older version of tidyr (below 1.0.0), you should use gather instead of pivot_longer. (more information here: https://cmdlinetips.com/2019/09/pivot_longer-and-pivot_wider-in-tidyr/)
final_df_reshaped <- final_df %>% gather(., -c(trail,person), key = "Variable",value = "value")
And plot it using this code:
ggplot(final_df_reshaped, aes(x = Variable, y = value, group = as.factor(trail), color = as.factor(trail)))+
geom_point()+
geom_line() +
facet_grid(.~person)+
scale_x_discrete(limits = c("start","end"))
Does this answer your question?
If you have to do that for 220 different people, I'm not sure it will make a really readable plot. Maybe you should think of another way to plot it or to extract the useful information.
This seems really basic but I can't figure it out. How do you add two arrays together in R by column name? For example:
a<-matrix(1:9,ncol=3)
colnames(a)<-c("A","B","C")
a
# A B C
#[1,] 1 4 7
#[2,] 2 5 8
#[3,] 3 6 9
b <-matrix(10:18,ncol=3)
colnames(b)<-c("C","B","D")
b
# C B D
#[1,] 10 13 16
#[2,] 11 14 17
#[3,] 12 15 18
I would like to add them together in such a way to yield:
# A B C D
#[1,] 1 17 17 16
#[2,] 2 19 19 17
#[3,] 3 21 21 18
I suppose I could add extra columns to both matrices but it seems like there would be a one line command to accomplish this. Thanks!
Using xtabs, after melting a combined table to a long data.frame:
xtabs(Freq ~ ., data=as.data.frame.table(cbind(a,b)))
# Var2
#Var1 A B C D
# A 1 17 17 16
# B 2 19 19 17
# C 3 21 21 18
The row names are just the default labels taken from LETTERS, because the input matrices have no row names.
We could use melt/acast from reshape2 after cbinding both the 'a' and 'b' matrices (inspired by @thelatemail's post).
library(reshape2)
acast(melt(cbind(a,b)), Var1~Var2, value.var='value', sum)
# A B C D
#1 1 17 17 16
#2 2 19 19 17
#3 3 21 21 18
Or we find the column names that are common to both matrices with intersect, and the column names that are found in one matrix but not the other with setdiff. We subset both matrices by the common names and add them together, then cbind the remaining columns of 'a' and 'b' based on the setdiff output.
nm1 <- intersect(colnames(a), colnames(b))
nm2 <- setdiff(colnames(a), colnames(b))
nm3 <- setdiff(colnames(b), colnames(a))
cbind(a[,nm2, drop=FALSE], a[,nm1]+b[,nm1], b[,nm3,drop=FALSE])
# A B C D
#[1,] 1 17 17 16
#[2,] 2 19 19 17
#[3,] 3 21 21 18
Another option would be to create another matrix with all the unique columns in 'a' and 'b', and then fill in the values:
nm <- union(colnames(a), colnames(b))
m1 <- matrix(0, ncol=length(nm), nrow=nrow(a), dimnames=list(NULL, nm))
m1[,colnames(a)] <- a
m1[,colnames(b)] <- m1[,colnames(b)] +b
m1
# A B C D
#[1,] 1 17 17 16
#[2,] 2 19 19 17
#[3,] 3 21 21 18
We could also cbind both the matrices and use tapply to get the sum after grouping with column and row indices
m2 <- cbind(a, b)
t(tapply(m2,list(colnames(m2)[col(m2)], row(m2)), FUN=sum))
Or we loop through 'nm' and get the sum
sapply(nm, function(i) rowSums(m2[,colnames(m2) ==i, drop=FALSE]))
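If this is needed repeatedly, the union-based approach above could be wrapped in a small function. This is a sketch under the assumption that both matrices have the same number of rows and unique column names within each matrix; the name add_by_colname is made up for illustration.

```r
# hypothetical helper wrapping the union-of-column-names approach
add_by_colname <- function(a, b) {
  stopifnot(nrow(a) == nrow(b))
  nm <- union(colnames(a), colnames(b))
  # start from zeros so columns missing in one matrix contribute nothing
  out <- matrix(0, nrow = nrow(a), ncol = length(nm),
                dimnames = list(NULL, nm))
  out[, colnames(a)] <- out[, colnames(a)] + a
  out[, colnames(b)] <- out[, colnames(b)] + b
  out
}

a <- matrix(1:9, ncol = 3)
colnames(a) <- c("A", "B", "C")
b <- matrix(10:18, ncol = 3)
colnames(b) <- c("C", "B", "D")

res <- add_by_colname(a, b)
res
#      A  B  C  D
#[1,] 1 17 17 16
#[2,] 2 19 19 17
#[3,] 3 21 21 18
```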
I have a dataframe that looks like this:
x<-data.frame(a=6, b=5:1, c=7, d=10:6)
> x
a b c d
1 6 5 7 10
2 6 4 7 9
3 6 3 7 8
4 6 2 7 7
5 6 1 7 6
I am trying to get the sums of columns a & b and c&d in another data frame that should look like:
> new
ab cd
1 11 17
2 10 16
3 9 15
4 8 14
5 7 13
I've tried the rowSums() function but it returns the sum of ALL the columns per row, and I tried rowSums(x[c(1,2), c(3,4)]) but nothing works. Please help!!
You can use rowSums on a column subset.
As a data frame:
data.frame(ab = rowSums(x[c("a", "b")]), cd = rowSums(x[c("c", "d")]))
# ab cd
# 1 11 17
# 2 10 16
# 3 9 15
# 4 8 14
# 5 7 13
As a matrix:
cbind(ab = rowSums(x[1:2]), cd = rowSums(x[3:4]))
For a wider data frame, you can also use sapply over a list of column subsets.
sapply(list(1:2, 3:4), function(y) rowSums(x[y]))
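If the list of column subsets is named, sapply carries those names over as column names of the result. A minimal sketch (the names col_groups and sums are only for illustration):

```r
x <- data.frame(a = 6, b = 5:1, c = 7, d = 10:6)

# named list: each element lists the columns to add up
col_groups <- list(ab = c("a", "b"), cd = c("c", "d"))

# one rowSums call per group; the list names become the column names
sums <- sapply(col_groups, function(cols) rowSums(x[cols]))
as.data.frame(sums)
```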
For all pairwise column combinations:
y <- combn(ncol(x), 2L, function(y) rowSums(x[y]))
colnames(y) <- combn(names(x), 2L, paste, collapse = "")
y
# ab ac ad bc bd cd
# [1,] 11 13 16 12 15 17
# [2,] 10 13 15 11 13 16
# [3,] 9 13 14 10 11 15
# [4,] 8 13 13 9 9 14
# [5,] 7 13 12 8 7 13
Here's another option:
> sapply(split.default(x, 0:(length(x)-1) %/% 2), rowSums)
0 1
[1,] 11 17
[2,] 10 16
[3,] 9 15
[4,] 8 14
[5,] 7 13
The 0:(length(x)-1) %/% 2 step creates a sequence of groups of 2 that can be used with split. It will also handle odd numbers of columns (treating the final column as a group of its own). Since there's a different default split "method" for data.frames that splits by rows, you need to specify split.default to split the columns into groups.
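To illustrate the odd-column case, here is a sketch that adds a hypothetical fifth column e to the example data and also builds readable column names from the grouped variable names (x5, grp and sums5 are names chosen here for illustration):

```r
# x5 extends the question's data with a made-up fifth column e
x5 <- data.frame(a = 6, b = 5:1, c = 7, d = 10:6, e = 1:5)

# group labels 0 0 1 1 2: the last column ends up in a group of its own
grp <- 0:(length(x5) - 1) %/% 2

# split the columns by group and sum each group row-wise
sums5 <- sapply(split.default(x5, grp), rowSums)

# name each result column after the variables it combines
colnames(sums5) <- sapply(split(names(x5), grp), paste, collapse = "")
sums5
```

Here the third column is just e itself, since a group with a single column sums to that column.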