I have a function that first generates a list of vectors (generated by using lapply), and then cbinds it to a column vector. I thought this would produce a dataframe. However, it produces a list of lists.
The cbind function isn't working as I thought it would.
Here's a small example of what the function is generating
col_test <- c(1, 2, 1, 1, 2)
lst_test <- list(c(1, 2 , 3), c(2, 2, 2), c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))
a_df <- cbind(col_test, lst_test)
Typing
> a_df[1,]
gives the output
$`col_test`
[1] 1
$lst_test
[1] 1 2 3
I'd like the data frame to be
[,1] [,2] [,3] [,4]
[1,] 1 1 2 3
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
[5,] 2 1 1 1
How do I get it into this form?
data.frame(col_test,t(as.data.frame(lst_test)))
do.call(rbind, Map(c, col_test, lst_test))
# [,1] [,2] [,3] [,4]
#[1,] 1 1 2 3
#[2,] 2 2 2 2
#[3,] 1 1 1 2
#[4,] 1 1 2 2
#[5,] 2 1 1 1
col_test <- c(1, 2, 1, 1, 2)
lst_test <- list(c(1, 2 , 3), c(2, 2, 2), c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))
Load dplyr, then name the sublists so we can use bind_rows:
library(dplyr)
names(lst_test) <- seq_along(lst_test)
lst_test1 <- bind_rows(lst_test)
bind_rows binds by column in this case, so we need to transpose the result:
lst_test_pivot <- t(lst_test1)
This gives us a matrix, so we need to convert it back to a data frame:
lst_test_pivot_df <- as.data.frame(lst_test_pivot)
Now cbind works as expected:
cbind(col_test, lst_test_pivot_df)
which produces
col_test V1 V2 V3
1 1 1 2 3
2 2 2 2 2
3 1 1 1 2
4 1 1 2 2
5 2 1 1 1
This should do the trick. Note that we are using do.call so that the individual elements of lst_test are sent as parameters to cbind, which prevents cbind from creating a list-of-lists. t is used to transpose the resulting matrix to your preferred orientation, and finally, one more cbind with col_test inserts that data as well.
library(tidyverse)
mat.new <- do.call(cbind, lst_test) %>%
t %>%
cbind(col_test, .) %>%
unname
[,1] [,2] [,3] [,4]
[1,] 1 1 2 3
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
[5,] 2 1 1 1
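For comparison, here is a minimal base R sketch of the same idea without the pipe. Each element of lst_test becomes one row via do.call(rbind, ...), and the result is then bound to col_test. The names rows and mat_base are made up for this illustration.

```r
col_test <- c(1, 2, 1, 1, 2)
lst_test <- list(c(1, 2, 3), c(2, 2, 2), c(1, 1, 2), c(1, 2, 2), c(1, 1, 1))

# stack the list elements as the rows of a 5 x 3 matrix
rows <- do.call(rbind, lst_test)

# prepend col_test as the first column and drop the dimnames
mat_base <- unname(cbind(col_test, rows))
mat_base
```

If a data frame is needed rather than a matrix, as.data.frame(mat_base) should convert it.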
I am attempting to generate a matrix where each column represents a sequence of factors in R. The factors can assume the values 1, 2, 3 or 4. Each sequence has 13 elements, giving a total of 4^13 potential sequences. However, only a specific subset of these potential sequences is considered valid. The logic is as follows:
A sequence can start at any factor
If a sequence starts at 4, its second element can be less than or equal to 4
Once an element drops below 4, subsequent entries must be weakly increasing
If a sequence starts with 1, 2, or 3 it must be weakly increasing
So for example, the sequence (1,2,3,3,3.....3) is valid. The sequence (4,4,1,1,2,4,4....4) is also valid. The sequence (4,1,2,3,1,1....1) is not, since it is not weakly increasing after the first drop from 4 to 1.
At the moment, I have code to combine the 2&3 factors and generate this matrix. The process involves generating a matrix of all possible sequences and then filtering down based on the above logic. This is highly inefficient, but I can post it if necessary. This process also cannot be generalized to a four factor model, as the 4^13 potential sequences overwhelm my machine.
If any of you can offer insight into how I might generate these valid sequences, it would be greatly appreciated. Thank you.
I am assuming that once a gradually increasing vector reaches 4, it cannot jump back down again to a lower value the way that it can if 4 is the first number (if it can, the code is actually easier).
The following function generates compatible sequences, essentially using switch to implement a Markov chain
generate_seq <- function(n)
{
  x <- numeric(n)
  x[1] <- sample(4, 1)
  had_a_four <- FALSE
  for(i in seq(n - 1)) {
    if(!had_a_four)
    {
      # no climb to 4 yet: a leading 4 may be followed by any level
      x[i + 1] <- switch(x[i], sample(1:2, 1, prob = c(3, 1)),
                               sample(2:3, 1, prob = c(3, 1)),
                               sample(3:4, 1, prob = c(3, 1)),
                               sample(4, 1))
    }
    else
    {
      # the sequence has climbed to 4: a 4 must now be followed by 4
      x[i + 1] <- switch(x[i], sample(1:2, 1, prob = c(3, 1)),
                               sample(2:3, 1, prob = c(3, 1)),
                               sample(3:4, 1, prob = c(3, 1)),
                               4)
    }
    if(x[i + 1] == 4 && !all(x[1:(i+1)] == 4)) had_a_four <- TRUE
  }
  x
}
And we can generate a 4-column matrix like this:
set.seed(4)
matrix(replicate(4, generate_seq(13)), ncol = 4)
#> [,1] [,2] [,3] [,4]
#> [1,] 4 4 1 1
#> [2,] 3 3 1 2
#> [3,] 3 4 2 3
#> [4,] 3 4 2 3
#> [5,] 4 4 2 4
#> [6,] 4 4 3 4
#> [7,] 4 4 3 4
#> [8,] 4 4 3 4
#> [9,] 4 4 4 4
#> [10,] 4 4 4 4
#> [11,] 4 4 4 4
#> [12,] 4 4 4 4
#> [13,] 4 4 4 4
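To check sequences against the rules, here is a minimal validator sketch. It assumes the interpretation stated in this answer: a run of leading 4s is allowed, and everything after that run must be weakly increasing. The function name is_valid_seq is made up for this illustration.

```r
is_valid_seq <- function(x) {
  # TRUE for the run of leading 4s, FALSE from the first non-4 onwards
  leading_fours <- cumsum(x != 4) == 0
  # what remains after the leading 4s must be weakly increasing
  rest <- x[!leading_fours]
  !is.unsorted(rest)
}

is_valid_seq(c(1, 2, 3, 3, 3))        # weakly increasing from the start
is_valid_seq(c(4, 4, 1, 1, 2, 4, 4))  # drops once after the leading 4s
is_valid_seq(c(4, 1, 2, 3, 1, 1))     # drops a second time
```

Applied column-wise to a generated matrix m, all(apply(m, 2, is_valid_seq)) would confirm that every column follows the rules.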
I think you can use RcppAlgos to do this efficiently by generating the combinations for vectors of length 1:n (where it's assumed that the shorter vectors are left padded to length 13 with 4) :
library(RcppAlgos)
get_combos <- function(n) {
  unique(do.call(rbind, sapply(rev(seq(n)), function(x)
    do.call(
      cbind, c(rep(4, n - x), list(comboGeneral(1:4, x, TRUE)))
    ))))
}
res <- get_combos(13)
head(res)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] 1 1 1 1 1 1 1 1 1 1 1 1 1
[2,] 1 1 1 1 1 1 1 1 1 1 1 1 2
[3,] 1 1 1 1 1 1 1 1 1 1 1 1 3
[4,] 1 1 1 1 1 1 1 1 1 1 1 1 4
[5,] 1 1 1 1 1 1 1 1 1 1 1 2 2
[6,] 1 1 1 1 1 1 1 1 1 1 1 2 3
nrow(res)
[1] 2367
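As a sanity check on the row count, the number of valid sequences can also be computed directly, under the same reading of the rules (a run of leading 4s, then a weakly increasing tail). This is only a counting sketch, and count_valid is a made-up helper name. It uses the fact that there are choose(m + k - 1, k - 1) weakly increasing sequences of length m over k levels.

```r
count_valid <- function(n, k = 4) {
  # number of weakly increasing sequences of length m over levels 1..k
  n_weak <- function(m) choose(m + k - 1, k - 1)
  # j leading k's followed by a weakly increasing tail of length n - j
  # that starts below k (so the all-k tail is excluded)
  j <- seq_len(n - 1)
  n_weak(n) + sum(n_weak(n - j) - 1)
}

count_valid(13)
```

For n = 13 and four levels this gives 2367, which agrees with nrow(res) above.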
I have a matrix A:
1 0 2
1 3 1
I want to choose all columns which contain the number 1. My desired output:
1 2
1 1
DATA
A = matrix(c(1, 1, 0, 3, 2, 1), nrow = 2)
A
# [,1] [,2] [,3]
#[1,] 1 0 2
#[2,] 1 3 1
1
A[,colSums(A == 1, na.rm = TRUE) > 0, drop = FALSE]
# [,1] [,2]
#[1,] 1 2
#[2,] 1 1
2
A[,apply(A, 2, function(x) any(x == 1)), drop = FALSE]
# [,1] [,2]
#[1,] 1 2
#[2,] 1 1
I have a data table where I want to replace negative values with the positive value from the previous row of the same column. For example:
1 2 3 4
2 -3 -2 3
should be
1 2 3 4
2 2 3 3
Thanks!
Since there are no answers from more experienced guys, here is what I've come up with.
# I'm reconstructing your example:
n <- matrix(c(1, 2, 2, -3, 3, -2, 4, 3), nrow = 2)
n
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 -3 -2 3
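changeMat <- function(mat) {
  new_mat <- mat
  # matrices are indexed column by column, so for entries below the
  # first row, mat[i-1] is the entry one row up in the same column
  for(i in seq_along(mat))
    if(mat[i] < 0) new_mat[i] <- mat[i-1]
  return(new_mat)
}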
changeMat(n)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 2 3 3
I checked that for a data.table object dt, changeMat(as.matrix(dt)) works properly.
Anyway, I am pretty sure that there must be a smarter way...
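One possibly smarter way is to drop the loop and use linear indexing. The sketch below assumes the first row contains no negative values, so that every negative entry has a row above it. change_mat_vec is a made-up name for this illustration.

```r
n <- matrix(c(1, 2, 2, -3, 3, -2, 4, 3), nrow = 2)

change_mat_vec <- function(mat) {
  # assumption: the first row has no negatives
  stopifnot(all(mat[1, ] >= 0))
  # linear (column-major) positions of the negative entries
  idx <- which(mat < 0)
  # idx - 1 is the entry one row up in the same column
  mat[idx] <- mat[idx - 1]
  mat
}

change_mat_vec(n)
```

Like the loop version, this copies the original value from the row above, so two consecutive negatives in one column would still leave a negative behind.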
I have a vector, say vec1, and another vector named vec2 as follows:
vec1 = c(4,1)
# [1] 4 1
vec2 = c(5,3,2)
# [1] 5 3 2
What I'm looking for is all possible combinations of vec1 and vec2 while the order of the vectors' elements is kept. That is, the resultant matrix should be like this:
> res
[,1] [,2] [,3] [,4] [,5]
[1,] 4 1 5 3 2
[2,] 4 5 1 3 2
[3,] 4 5 3 1 2
[4,] 4 5 3 2 1
[5,] 5 4 1 3 2
[6,] 5 4 3 1 2
[7,] 5 4 3 2 1
[8,] 5 3 4 1 2
[9,] 5 3 4 2 1
[10,] 5 3 2 4 1
# res=structure(c(4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 1, 5, 5, 5, 4, 4, 4,
# 3, 3, 3, 5, 1, 3, 3, 1, 3, 3, 4, 4, 2, 3, 3, 1, 2, 3, 1, 2, 1,
# 2, 4, 2, 2, 2, 1, 2, 2, 1, 2, 1, 1), .Dim = c(10L, 5L))
There is no repetition allowed for two vectors. That is, all rows of the resultant matrix have unique elements.
I'm actually looking for the most efficient way. One way to tackle this problem is to generate all possible permutations of length n which grows factorially (n=5 here) and then apply filtering. But it's time-consuming as n grows.
Is there an efficient way to do that?
Try this one:
nv1 <- length(vec1)
nv2 <- length(vec2)
n <- nv1 + nv2
result <- combn(n,nv1,function(v) {z=integer(n);z[v]=vec1;z[-v]=vec2;z})
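The idea is to produce all combinations of indices at which to put the elements of vec1. Each column of result is one arrangement, so use t(result) to get the row layout shown in the question.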
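A self-contained sketch of this approach with the vectors from the question, plus a check that every arrangement keeps the internal order of both vectors. The names res and keeps_order are made up for this illustration, and the check relies on the two vectors sharing no values.

```r
vec1 <- c(4, 1)
vec2 <- c(5, 3, 2)
nv1 <- length(vec1)
n <- nv1 + length(vec2)

# one column per choice of positions for vec1;
# transpose so that each row is one arrangement
res <- t(combn(n, nv1, function(v) {
  z <- numeric(n)
  z[v] <- vec1
  z[-v] <- vec2
  z
}))

# dropping the other vector's elements from a row must leave the original vector
keeps_order <- apply(res, 1, function(r)
  identical(r[r %in% vec1], vec1) && identical(r[r %in% vec2], vec2))

nrow(res)
all(keeps_order)
```

The number of rows is choose(n, nv1), which is 10 here, so nothing has to be generated and then filtered out.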
Not as elegant as Marat Talipov's solution, but you can do:
# get the ordering per vector
cc <- c(order(vec1, decreasing = TRUE), order(vec2, decreasing = TRUE) + length(vec1))
cc
[1] 1 2 3 4 5
# permutation to get all "order-combinations"
library(combinat)
m <- do.call(rbind, permn(cc))
# remove unsorted per vector, only if both vectors are correct set TRUE for both:
gr <- apply(m, 1, function(x){
  !is.unsorted(x[x < (length(vec1)+1)]) & !is.unsorted(x[x > (length(vec1))])
})
# result, exchange the order index with the vector elements:
t(apply(m[gr, ], 1, function(x, y) y[x], c(vec1, vec2)))
[,1] [,2] [,3] [,4] [,5]
[1,] 4 1 5 3 2
[2,] 4 5 3 1 2
[3,] 4 5 3 2 1
[4,] 4 5 1 3 2
[5,] 5 4 1 3 2
[6,] 5 4 3 2 1
[7,] 5 4 3 1 2
[8,] 5 3 4 1 2
[9,] 5 3 4 2 1
[10,] 5 3 2 4 1
A = matrix(c(1,2,3, 0, 2, 2, 0,2 ,3), nrow=3, ncol=3)
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 2 2 2
[3,] 3 2 3
B = matrix(c(1,2,3, 1, 4, 2, 2,2 ,1), nrow=3, ncol=3)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 4 2
[3,] 3 2 1
C = (A + B) / (number of nonzero entries at each position)
C = matrix(c(1,2,3, 1, 3, 2, 2,2 ,2), nrow=3, ncol=3)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 3 2
[3,] 3 2 2
I need do it for a list of N matrices (mat_vect[[i]]):
list_mat_vect[[i]] <- assign(paste("a", i, sep = ""), mat_vect[[i]])
Sum the matrices and take the mean:
mat_sum_mean = Reduce("+", list_mat_vect) / length(file_list)
This divides every cell by the total count, including the matrices where that cell is zero. I don't want that.
You can do
(A+B)/((A!=0) + (B!=0))
to get
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 3 2
[3,] 3 2 2
Here != tests for inequality with zero, returning TRUE or FALSE. When we add those up, each TRUE is treated as 1 and each FALSE as 0.
You can do this with a list of matrices as well
list_mat_vect<-list(A,B)
Reduce("+", list_mat_vect) / Reduce("+", lapply(list_mat_vect, function(x) x!=0))
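One caveat: a cell that is zero in every matrix gives 0/0, which is NaN in R. The sketch below wraps the same computation in a function and sets such cells to 0. Keeping them at 0 is an assumption about what is wanted, and nonzero_mean is a made-up name for this illustration.

```r
nonzero_mean <- function(mats) {
  # element-wise sum over all matrices
  total <- Reduce(`+`, mats)
  # element-wise count of nonzero entries
  count <- Reduce(`+`, lapply(mats, function(m) m != 0))
  out <- total / count
  # assumption: cells that are zero in every matrix should stay 0, not NaN
  out[count == 0] <- 0
  out
}

A <- matrix(c(1, 2, 3, 0, 2, 2, 0, 2, 3), nrow = 3, ncol = 3)
B <- matrix(c(1, 2, 3, 1, 4, 2, 2, 2, 1), nrow = 3, ncol = 3)
nonzero_mean(list(A, B))
```

With A and B from the question this should reproduce the matrix C shown there.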