R piping (%>%) does not work with replicate function - r

I am trying to learn the pipe operator (%>%).
When I rewrite the following line of code to use the pipe, it no longer works as expected.
---- R code -- original version -----
set.seed(1014)
replicate(6,sample(1:8))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 3 7 4 5 1
[2,] 2 8 4 2 4 2
[3,] 5 4 8 5 8 5
[4,] 3 1 2 1 1 7
[5,] 4 6 3 7 7 3
[6,] 6 5 1 3 3 8
[7,] 8 7 5 8 6 6
[8,] 7 2 6 6 2 4
---- R code - recoded with the pipe ----
> sample(1:8) %>% replicate(6,.)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 7 7 7 7 7 7
[2,] 3 3 3 3 3 3
[3,] 2 2 2 2 2 2
[4,] 1 1 1 1 1 1
[5,] 5 5 5 5 5 5
[6,] 4 4 4 4 4 4
[7,] 8 8 8 8 8 8
[8,] 6 6 6 6 6 6
Notice that with the pipe the sampling is not repeated: every column
contains the same vector.

That's to be expected. replicate expects an expression, but when you use the pipe operator as is, you just pass the result of the call to sample() to replicate. So you get the same result 6 times.
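The difference can be sketched in base R without the pipe. This is only an illustration; the names x, same and fresh are made up for the example. Passing a stored value repeats that value, while passing the call itself lets replicate re-evaluate it each time:

```r
set.seed(1014)

# sample() runs once here; x is now a fixed vector
x <- sample(1:8)

# replicate() re-evaluates the expression `x`, which is always the same vector
same <- replicate(6, x)

# here the expression is the call itself, so sample() runs six times
fresh <- replicate(6, sample(1:8))

same
fresh
```

Piping sample(1:8) into replicate(6, .) behaves like the first case, because the left-hand side has already been evaluated when replicate sees it.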
You have to use quote() to pass the expression to replicate instead of the result, but you shouldn't forget to evaluate each of the repetitions of that expression.
quote(sample(c(1:10,-99),6,rep=TRUE)) %>%
replicate(6, .) %>%
sapply(eval)
Gives:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 5 2 10 10 9 2
[2,] 4 3 1 3 -99 1
[3,] 10 2 3 8 2 4
[4,] -99 1 6 2 10 3
[5,] 8 -99 1 9 4 6
[6,] 4 10 8 1 -99 8
What happens here:
the piping sends an expression to replicate without evaluating it.
replicate replicates that expression and returns a list with 6 times that expression but without evaluating it.
sapply(eval) goes through the list and executes each expression in that list.
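The same three steps can be written out without the pipe, which may make them easier to follow. This is a sketch under two assumptions: it uses sample(1:8) from the question rather than the call above, and it sets simplify = FALSE explicitly so that the intermediate result is certain to be a list. The names expr, exprs and res are illustrative only:

```r
# step 1: capture the call without evaluating it
expr <- quote(sample(1:8))

# step 2: replicate the unevaluated call; simplify = FALSE keeps a plain list
exprs <- replicate(6, expr, simplify = FALSE)

# step 3: evaluate each copy, so sample() runs once per list element
res <- sapply(exprs, eval)

is.call(exprs[[1]])
dim(res)
```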
In your previous question (i.e. when using data.frame), you could have done eg:
quote(sample(c(1:10,-99),6,rep=TRUE)) %>%
replicate(6, .) %>%
data.frame
Now the function data.frame would force the expressions to be executed, but you also end up with terrible variable names, i.e. the expression itself.
If you want to learn more about the issues here, you'll have to dive into what is called "lazy evaluation" and how that is dealt with exactly by the pipe operator. But in all honesty, I really don't see any advantage of using the pipe operator in this case. It's not even more readable.
As per Frank's comment: You can use a mixture of piping and nesting of functions to avoid the sapply. But for that, you have to contain the nested functions inside a code block or the pipe operator won't process it correctly:
quote(sample(c(1:10,-99),6,rep=TRUE)) %>% {
replicate(6, eval(.)) }
Very interesting, but imho not really useful...

Related

R Function to slice a vector/matrix in a rolling manner

I have once seen this function but can't remember its name now. The function performs a rolling-slice of the input vector/matrix and outputs a matrix with 1 dimension higher. Here is what the function does:
rolling_slice <- function(v,window){
rows = length(v)-window+1
m <- matrix(0,rows,window)
for(i in 1:rows){m[i,] <- v[i:(i+window-1)]}
return(m)
}
A sample output with a vector input looks like this:
> v <- 1:10
> rolling_slice(v,3)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 4
[3,] 3 4 5
[4,] 4 5 6
[5,] 5 6 7
[6,] 6 7 8
[7,] 7 8 9
[8,] 8 9 10
The reason I am looking for it is that I want to speed up rolling-window operations in R, and I hope this function could help by pre-indexing the input data.
I just discovered the base R function embed and now it is one of my favorite things:
> numcol <- 3
> embed(1:10, numcol)
[,1] [,2] [,3]
[1,] 3 2 1
[2,] 4 3 2
[3,] 5 4 3
[4,] 6 5 4
[5,] 7 6 5
[6,] 8 7 6
[7,] 9 8 7
[8,] 10 9 8
It basically does exactly what you describe by making a matrix of rolling windows of your data, with the second input being the window size. If order matters you can reverse the columns using:
embed(1:10, numcol)[ , numcol:1]
Sounds like zoo::rollapply()/rapply() or the roll*() functions are what you need.
What is your actual end application: rolling means, medians, a weighted sum, a filter, a rolling standard deviation, something else? I doubt that your end application is simply taking a sliding-window slice. There's no point in generating a huge, unnecessary temporary data structure, as it will kill memory and performance.
Also, for performance, this sounds like a case where data.table's sequential access will beat dplyr/tibbles/tidyverse. What data structure are you using?
You could do this vectorized in base R:
window <- 3
m <- diag(length(v)-window+1)
(row(m)+col(m)-1)[,1:window]
# [,1] [,2] [,3]
# [1,] 1 2 3
# [2,] 2 3 4
# [3,] 3 4 5
# [4,] 4 5 6
# [5,] 5 6 7
# [6,] 6 7 8
# [7,] 7 8 9
# [8,] 8 9 10
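Note that the matrix above holds the window indices, which coincide with the values only because v is 1:10. A hedged sketch of the same idea that actually indexes into v follows; rolling_index is a hypothetical helper name, and it assumes v is a plain vector with window <= length(v):

```r
rolling_index <- function(v, window) {
  rows <- length(v) - window + 1
  # idx[i, j] = i + j - 1, the position in v of the j-th element of window i
  idx <- outer(seq_len(rows), seq_len(window) - 1L, "+")
  # v[idx] drops the dimensions, so restore them
  matrix(v[idx], nrow = rows)
}

rolling_index(c(10, 20, 30, 40, 50), 3)
```

Building the index with outer avoids the square diag matrix, so it should also work when the window is wider than the number of rows.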

I would like to fill a matrix with values of a list

I have a list of 6 with 10 values in each. I would like to fill a 10x6 (10 rows, 6 columns) matrix with these values. I've tried some things but it's not working. I'm sure there must be an easy way to do it, but I haven't found it yet. Could anyone please help?
Here is some example data:
l = lapply(1:6, rep, 10)
Then use do.call with cbind to bind the list elements together as columns (see ?do.call):
do.call(cbind, l)
and you get a matrix:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
[2,] 1 2 3 4 5 6
[3,] 1 2 3 4 5 6
[4,] 1 2 3 4 5 6
[5,] 1 2 3 4 5 6
[6,] 1 2 3 4 5 6
[7,] 1 2 3 4 5 6
[8,] 1 2 3 4 5 6
[9,] 1 2 3 4 5 6
[10,] 1 2 3 4 5 6
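When all list elements have the same length, a couple of other base R idioms should give the same matrix. This is only a sketch; the names m1, m2 and m3 are illustrative:

```r
l <- lapply(1:6, rep, 10)

# bind the list elements as columns
m1 <- do.call(cbind, l)

# flatten the list and refill column by column
m2 <- matrix(unlist(l), ncol = length(l))

# sapply simplifies equal-length results into a matrix
m3 <- sapply(l, identity)

dim(m1)
```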

mapply of lists of matrix and vector works poorly in a specific case

The function mapply() does not appear to work properly in the following case:
a <- list(matrix(1:8,4,2),matrix(1:9,3,3))
b <- list(1:4,1:3)
mapply(a,b,FUN=cbind)
that gives the following matrix
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3 3
[4,] 4 4
[5,] 5 5
[6,] 6 6
[7,] 7 7
[8,] 8 8
[9,] 1 9
[10,] 2 1
[11,] 3 2
[12,] 4 3
instead of the following (expected) result:
[[1]]
[,1] [,2] [,3]
[1,] 1 5 1
[2,] 2 6 2
[3,] 3 7 3
[4,] 4 8 4
[[2]]
[,1] [,2] [,3] [,4]
[1,] 1 4 7 1
[2,] 2 5 8 2
[3,] 3 6 9 3
Can anybody help me in understanding if something in my code is wrong? Thank you!
Make sure to set SIMPLIFY = FALSE:
mapply(a,b,FUN=cbind, SIMPLIFY=FALSE)
Otherwise mapply tries to simplify everything into a single compatible result. In your case, because the return value of each call had 12 elements (a 4x3 matrix and a 3x4 matrix), it put those two results side by side in a 12x2 matrix, with the values of the first matrix in the first column and those of the second matrix in the second column.
Alternatively you can use
Map(cbind, a, b)
which always returns a list. (Map is also nice because if a has names it will use those names in the resulting list which isn't useful in this case, but may be useful in others.)
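A short sketch comparing the two behaviours on the data from the question. The names simplified, as_list and named are illustrative, and the element names "first" and "second" are invented for the example:

```r
a <- list(matrix(1:8, 4, 2), matrix(1:9, 3, 3))
b <- list(1:4, 1:3)

# default SIMPLIFY = TRUE: both results have 12 elements, so they become columns
simplified <- mapply(cbind, a, b)

# Map never simplifies, so each cbind result keeps its own shape
as_list <- Map(cbind, a, b)

dim(simplified)
lapply(as_list, dim)

# names on the first argument carry over to the result
names(a) <- c("first", "second")
named <- Map(cbind, a, b)
names(named)
```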

Move columns by one [duplicate]

I would like to move the columns in a matrix by one to the right.
Input <- data.frame(read.csv2 ....)
The matrix looks like:
1 2 3 4
1 2 3 4
1 2 3 4
and should be like:
4 1 2 3
4 1 2 3
4 1 2 3
I googled it but I couldn't find anything.
Thanks for your help!
This looks pretty good: Moving columns within a data.frame() without retyping
Although the answer in the comments works for a one-column shift to the right, it's fiddly to extend that approach to other shifts and directions.
It boils down to generating the vector of the order of columns that you want to return, and then subsetting columns.
So your original question boils down to generating c(4,1,2,3). There's a handy function in the magic package that can do this:
> install.packages("magic") # if you don't have it
> magic::shift(1:4,1)
[1] 4 1 2 3
So:
> Data[,magic::shift(1:ncol(Data),1)]
[,1] [,2] [,3] [,4]
[1,] 13 1 5 9
[2,] 14 2 6 10
[3,] 15 3 7 11
[4,] 16 4 8 12
answers your original question. This is then easy to extend to shifts by more than one, or negative (leftward) shifts:
> Data[,magic::shift(1:ncol(Data),-2)]
[,1] [,2] [,3] [,4]
[1,] 9 13 1 5
[2,] 10 14 2 6
[3,] 11 15 3 7
[4,] 12 16 4 8
Of course, the right way is now to create a matrix shift function:
> mshift = function(m,n=1){m[,magic::shift(1:ncol(m),n)]}
which you can check:
> mshift(Data,1)
[,1] [,2] [,3] [,4]
[1,] 13 1 5 9
[2,] 14 2 6 10
[3,] 15 3 7 11
[4,] 16 4 8 12
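If you would rather not install a package for this, the column order can also be computed with modular arithmetic in base R. A minimal sketch, where shift_cols is a hypothetical name and Data is assumed to be matrix(1:16, 4, 4), which is consistent with the output shown above:

```r
shift_cols <- function(m, n = 1) {
  k <- ncol(m)
  # new column j takes old column ((j - 1 - n) mod k) + 1
  ord <- ((seq_len(k) - 1 - n) %% k) + 1
  # drop = FALSE keeps a matrix even when m has a single column
  m[, ord, drop = FALSE]
}

Data <- matrix(1:16, 4, 4)
shift_cols(Data, 1)
shift_cols(Data, -2)
```

Because %% in R returns a non-negative result for a positive modulus, negative (leftward) shifts and shifts larger than the number of columns wrap around as well.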

Vectorizing rep and seq in R

I am trying to accomplish two things. First, if I have a vector 1:5, I want to get a matrix (or two vectors) indicating the unique combinations of these elements, including a number paired with itself but counting each pair only once regardless of order.
Right now I can do this using a matrix:
foo <- matrix(1:5,5,5)
cbind(foo[upper.tri(foo,diag=TRUE)],foo[lower.tri(foo,diag=TRUE)])
[,1] [,2]
[1,] 1 1
[2,] 1 2
[3,] 2 3
[4,] 1 4
[5,] 2 5
[6,] 3 2
[7,] 1 3
[8,] 2 4
[9,] 3 5
[10,] 4 3
[11,] 1 4
[12,] 2 5
[13,] 3 4
[14,] 4 5
[15,] 5 5
But there has to be a simpler way. I tried to use Vectorize on seq but this gives me an error:
cbind(Vectorize(seq,"from")(1:5,5),Vectorize(seq,"to")(5,1:5))
Error in Vectorize(seq, "from") :
must specify formal argument names to vectorize
The second thing I want to do: if I have a list of vectors, bar, I want a vector containing the index of each list element, repeated as many times as that element has entries. I can do this with:
unlist(apply(rbind(1:length(bar),sapply(bar,length)),2,function(x)rep(x[1],x[2])))
[1] 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
But again there must be an easier way. I tried Vectorize again here but with the same error:
Vectorize(rep,"each")(1:length(bar),each=sapply(bar,length))
Error in Vectorize(rep, "each") :
must specify formal argument names to vectorize
To your first question: what about the simple combn() function in base:
> combn(1:5,2)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 1 1 2 2 2 3 3 4
[2,] 2 3 4 5 3 4 5 4 5 5
If you need a matrix arranged like the one you made, just transpose it with t(), as in t(combn(1:5,2)).
Note: this will not give you the pairs of an element with itself, such as (1,1) or (2,2), but you can easily add those to the matrix.
> unlist(lapply(1:5, seq, from=1))
[1] 1 1 2 1 2 3 1 2 3 4 1 2 3 4 5
> unlist(lapply(1:5, seq, 5))
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
and
> bar = lapply(1:5, seq, from=1)
> rep(seq_along(bar), sapply(bar, length))
[1] 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5
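As for why the Vectorize attempts in the question fail, the most likely reason is that the argument names given are not formal arguments of the functions as R sees them. A small sketch to check this, with Map shown as one alternative that passes the arguments positionally:

```r
# seq() is a generic whose only formal argument is `...`
names(formals(seq))          # "..."

# rep() is a primitive; its visible signature is (x, ...), so "each" is not a formal
names(formals(args(rep)))    # "x" "..."

# Map() calls seq(from, to) element by element without needing formal names
unlist(Map(seq, 1:5, 5))
```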
A faster variation of Martin Morgan's solution to the first part:
rep(1:5,5:1)
[1] 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5
unlist(lapply(1:5,function(x) x:5))
[1] 1 2 3 4 5 2 3 4 5 3 4 5 4 5 5
Roughly 7 and 3 times faster respectively.
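Putting these two vectors side by side gives each unordered pair exactly once, including a number paired with itself. A small sketch, where n, first, second and pairs are illustrative names:

```r
n <- 5

# 1 appears n times, 2 appears n - 1 times, and so on
first <- rep(seq_len(n), n:1)

# for each i, the partners i, i + 1, ..., n
second <- unlist(lapply(seq_len(n), function(i) i:n))

pairs <- cbind(first, second)
nrow(pairs)
```

The number of rows should be n * (n + 1) / 2, which is 15 for n = 5.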
I'm not sure I follow what you mean in the second part, but the following seems to fit your description:
lapply(bar,function(x) rep(x,length(x)))
