How to write a function that uses the single output from another function as the starting point for new analysis? - r

I'm having trouble writing a function that calls another function and uses the output as the basis for running new analysis in a loop (or equivalent). For example, let's say function 1 creates this output: 10. The second function would take that as a starting point to run new analysis. The single data point from the second output would then be the basis for the next round of analysis, and so on.
Here's a simple example. The question is how to create a for loop for this. Or perhaps there's a more efficient way using lapply. In any case, the first function might be as follows:
f.1 <-function(x) {
x
a <-seq(x,by=1,length.out=5)
a.1 <-tail(a,1)
}
The second function, which calls the first function, could run as follows:
f.2 <-function(x) {
f.1 <-function(x) {
a <-seq(x,by=1,length.out=5)
a.1 <-tail(a,1)
}
z <-f.1(x)
y=z+1
seq(y,by=1,length.out=5)
}
How can I modify f.2() so that it re-runs that computation using the previous output as the basis for the next round of analysis? To be precise, f.1(10) outputs:
[1] 14
In turn, f.2(10) results in:
[1] 15 16 17 18 19
How can I rewrite f.2() so that it automatically computes f.2(19) on the next iteration, and continues to do so for several loops? In the process, I'd like to collect the outputs in a separate file for review. Thanks much!

The magrittr library (which is used most notably by dplyr) makes this type of chaining somewhat simple. First, define the functions,
f.1 <- function(x) {
  # build the 5-element sequence starting at x
  a <- seq(x, by=1, length.out=5)
  # return its last element visibly
  tail(a, 1)
}
f.2 <-function(x) {
y <- x+1
seq(y, by=1, length.out=5)
}
then
library(magrittr)
f.1(10) %>% f.2
# [1] 15 16 17 18 19
As @BondedDust mentioned, you could use Reduce, although it normally applies the same function over and over, so you just need to flip the most common use case:
Reduce(function(x,f) f(x), list(f.1, f.2), init=10)
# [1] 15 16 17 18 19
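To repeat the f.1-then-f.2 step several times, each round starting from the last value of the previous one, Reduce can also accumulate the intermediate results. This is only a sketch: the helper name step, the variable names, and the choice of 3 rounds are assumptions made for illustration, not part of the original answer.

```r
f.1 <- function(x) tail(seq(x, by = 1, length.out = 5), 1)
f.2 <- function(x) seq(x + 1, by = 1, length.out = 5)

# hypothetical helper: one round, seeded by the last value of the previous round
# (the second argument is just the round counter and is not used)
step <- function(prev, i) f.2(f.1(tail(prev, 1)))

# accumulate = TRUE keeps every intermediate result; [-1] drops the initial 10
rounds <- Reduce(step, seq_len(3), init = 10, accumulate = TRUE)[-1]

# stack the rounds into a matrix, one row per round
round_matrix <- do.call(rbind, rounds)
round_matrix
```

Each row of round_matrix holds one round, so the first row is the same 15 to 19 sequence shown above.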

You can try this with two arguments for f.2. The first argument, x, is the starting value, and n is the number of iterations to run. The function returns a matrix with n rows and 5 columns, one row per iteration.
f.2 <- function(x, n) {
  # one row per iteration; named out so that it does not mask base::c()
  out <- matrix(nrow=n, ncol=5)
  for (i in seq_len(n)) {
    # f.1() is assumed to be defined beforehand, so it is not redefined here
    z <- f.1(x)
    y <- z + 1
    out[i, ] <- seq(y, by=1, length.out=5)
    # the last value of this round seeds the next round
    x <- out[i, 5]
  }
  return(out)
}
The output of
f <- f.2(10, 10) ##initialising x with 10 and running 10 loops
f
      [,1] [,2] [,3] [,4] [,5]
 [1,]   15   16   17   18   19
 [2,]   24   25   26   27   28
 [3,]   33   34   35   36   37
 [4,]   42   43   44   45   46
 [5,]   51   52   53   54   55
 [6,]   60   61   62   63   64
 [7,]   69   70   71   72   73
 [8,]   78   79   80   81   82
 [9,]   87   88   89   90   91
[10,]   96   97   98   99  100
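The question also asks to collect the outputs in a separate file for review. A minimal sketch of that, assuming a CSV file is acceptable: the two-row matrix res stands in for the result of the loop, and the temporary file path is a placeholder you would swap for a real one.

```r
# stand-in for the matrix returned by the loop (first two rounds only)
res <- matrix(c(15:19, 24:28), nrow = 2, byrow = TRUE)

# placeholder path; replace with wherever the review file should live
out_file <- tempfile(fileext = ".csv")

# one row per round; row names are dropped to keep the file compact
write.csv(res, out_file, row.names = FALSE)

# read the file back to confirm what was written
back <- as.matrix(read.csv(out_file))
back
```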

Related

What does sapply do for given function

I am still learning R. Kindly, I'd like to understand this function:
sapply(M[,-1], function(x) x^2)
Where M is a matrix. It looks like it is squaring every element in M. Can someone provide a brief example of how this line functions?
Thank you
The apply family of functions in R comes in several variants, depending on the use case:
1. When you want to apply a function to the rows or columns of a matrix, use apply().
2. When you want to apply a function to each element of a list in turn and get a list back, use lapply().
3. When you want to apply a function to each element of a list in turn but get a vector back rather than a list, use sapply().
In your case, yes: it squares every value except those in the first column of the matrix and returns them as a vector. See below:
M <- matrix(seq(10,25), 4, 4) # example 4 by 4 matrix
M
     [,1] [,2] [,3] [,4]
[1,]   10   14   18   22
[2,]   11   15   19   23
[3,]   12   16   20   24
[4,]   13   17   21   25
M[,-1]
     [,1] [,2] [,3]
[1,]   14   18   22
[2,]   15   19   23
[3,]   16   20   24
[4,]   17   21   25
sapply(M[,-1], function(x) x^2)
[1] 196 225 256 289 324 361 400 441 484 529 576 625
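Because sapply walks over the matrix one element at a time, the result loses the matrix shape. As a point of comparison (a sketch, not part of the original answer), plain arithmetic on the matrix gives the same numbers and keeps the dimensions; the names by_sapply and by_arith are made up for illustration.

```r
M <- matrix(seq(10, 25), 4, 4)

# element by element: the result is a plain vector of 12 values
by_sapply <- sapply(M[, -1], function(x) x^2)

# vectorised arithmetic: same values, but still a 4 x 3 matrix
by_arith <- M[, -1]^2

dim(by_sapply)  # NULL, because sapply returned a vector
dim(by_arith)
```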

R Calculate big NOR matrix

I have a big square matrix in R:
norMat <- matrix(NA, nrow=1024, ncol=1024)
This empty matrix needs to be filled with the sum of all equal bits of all matrix index pairs.
So I need to calculate the logical NOR for i (rowIndex) and j (colIndex) and sum the result, e.g.:
sum(intToBits(2)==intToBits(3))
Currently, I have this function, which fills the matrix:
norMatrix <- function()
{
matDim=1024
norMat <<- matrix(NA, nrow=matDim, ncol=matDim)
for(i in 0:(matDim-1)) {
for(j in 0:(matDim-1)) {
norMat[i+1,j+1] = norsum(i,j)
}
}
return(norMat)
}
And here's the norsum function:
norsum <- function(bucket1, bucket2)
{
res = sum(intToBits(bucket1)==intToBits(bucket2))
return(res)
}
Is this an efficient solution to fill the matrix?
I'm in doubt since on my machine this takes over 5 minutes.
I suggest this is a great opportunity for the *apply functions. Here's one solution that's a bit faster than 5 minutes.
First, proof of concept, non-square solely for clarity of dimensions.
nc <- 5
nr <- 6
mtxi <- sapply(seq_len(nc), intToBits)
mtxj <- sapply(seq_len(nr), intToBits)
sapply(1:nc, function(i) sapply(1:nr, function(j) sum(mtxi[,i] == mtxj[,j])))
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   32   30   31   30   31
# [2,]   30   32   31   30   29
# [3,]   31   31   32   29   30
# [4,]   30   30   29   32   31
# [5,]   31   29   30   31   32
# [6,]   29   31   30   31   30
Assuming that these are correct, the full meal deal:
n <- 1024
mtx <- sapply(seq_len(n), intToBits)
system.time(
ret <- sapply(1:n, function(i) sapply(1:n, function(j) sum(mtx[,i] == mtx[,j])))
)
#    user  system elapsed
#    3.25    0.00    3.36
You don't technically need to pre-calculate mtxi and mtxj. Though intToBits does not introduce much overhead, I think it's silly to recalculate every time.
My system is reasonable (i7 6600U CPU @ 2.60GHz), win10_64, R-3.3.2 ... nothing too fancy.
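A different technique that may be worth trying, offered as a sketch rather than a measured improvement: once the bits are stored as a 0/1 numeric matrix, the count of equal bits for every pair can be computed with two matrix cross-products, with no loop over i and j. The names bits and equal_bits are made up here, n is kept small for illustration, and the columns follow the question's 0-based indices (0 to n - 1) rather than seq_len(n).

```r
n <- 8  # small size for illustration; the question uses 1024

# 32 x n matrix of 0/1 values; column k holds the bits of the integer k - 1
bits <- sapply(0:(n - 1), function(i) as.integer(intToBits(i)))

# crossprod(bits) counts positions where both bits are 1,
# crossprod(1 - bits) counts positions where both bits are 0
equal_bits <- crossprod(bits) + crossprod(1 - bits)

# row 3, column 4 compares the integers 2 and 3
equal_bits[3, 4]
sum(intToBits(2) == intToBits(3))
```

The last two lines should agree, which is a quick check that the cross-product form counts the same thing as norsum.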

Output vector of loop function r

I'm trying to create an output vector from a loop, containing one result from each iteration.
out=NULL
for (i in 1:5) {
out<-cbind(out,sample(1:100, 1)) #placeholderfunction
for (i in 1:5) {out[i]<- i+1}
}
The good side: my result contains the correct values. The bad side: it comes out as a matrix and I don't know why.
> out
out
[1,] 2 71 14 46 96
[2,] 3 71 14 46 96
[3,] 4 71 14 46 96
[4,] 5 71 14 46 96
[5,] 6 71 14 46 96
What i want would be something like:
> out
out
[1,] 2 71 14 46 96
Probably it is just a small step from where I stand, but I just can't figure it out. Maybe someone could help?
(And yes, I could just remove the extra rows, but I would like my code clean.)
Thanks!
OK, by looking at the problem again at this scale I found it: a superfluous line (the inner for loop). Without it:
> out=NULL
> for (i in 1:5) {
+ out<-cbind(out,sample(1:100, 1))
+ }
> out
     [,1] [,2] [,3] [,4] [,5]
[1,]   63   98   78   43   19
What about this
out <- sample(100,5)
Update
I see why I got a -1: the OP wants to construct a vector with a for loop. As a word of caution, creating a vector in this manner is usually not a good idea; for example, my code above is both simpler and faster than the OP's code. That notwithstanding, if you want to generate a vector of random numbers with a for loop, use this approach:
my.loop <- function(l){
out_1 <- numeric(l)
for (i in 1:l) {
out_1[i] <- sample(1:100, 1)
}
out_1
}
This performs better than the OP's approach below because it preallocates the result vector instead of growing it with cbind on every iteration:
op.loop <- function(l){
out_2 = NULL
for (i in 1:l) {
out_2 <- cbind(out_2, sample(1:100, 1))
}
out_2
}
For fun I timed the two approaches:
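A comparison along these lines could be run with base R's system.time. This is a sketch: the loop length of 1e4 is an arbitrary assumption, the two functions are repeated so the block runs on its own, and no timings are quoted because they vary by machine.

```r
my.loop <- function(l) {
  out_1 <- numeric(l)                       # preallocated result vector
  for (i in seq_len(l)) out_1[i] <- sample(1:100, 1)
  out_1
}

op.loop <- function(l) {
  out_2 <- NULL                             # grown by one column per iteration
  for (i in seq_len(l)) out_2 <- cbind(out_2, sample(1:100, 1))
  out_2
}

n_iter <- 1e4  # arbitrary size chosen for illustration
t_prealloc <- system.time(v <- my.loop(n_iter))
t_grow <- system.time(m <- op.loop(n_iter))

# elapsed seconds for each approach
c(prealloc = t_prealloc[["elapsed"]], grow = t_grow[["elapsed"]])

# cbind always builds a matrix, which is why the OP's result was one
is.matrix(v)
is.matrix(m)
```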

R - Apply function with different argument value for each row/column of a matrix

I am trying to apply a function to each row or column of a matrix, but I need to pass a different argument value for each row.
I thought I was familiar with lapply, mapply etc... But probably not enough.
As a simple example :
> a<-matrix(1:100,ncol=10);
> a
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1   11   21   31   41   51   61   71   81    91
 [2,]    2   12   22   32   42   52   62   72   82    92
 [3,]    3   13   23   33   43   53   63   73   83    93
 [4,]    4   14   24   34   44   54   64   74   84    94
 [5,]    5   15   25   35   45   55   65   75   85    95
 [6,]    6   16   26   36   46   56   66   76   86    96
 [7,]    7   17   27   37   47   57   67   77   87    97
 [8,]    8   18   28   38   48   58   68   78   88    98
 [9,]    9   19   29   39   49   59   69   79   89    99
[10,]   10   20   30   40   50   60   70   80   90   100
Let's say I want to apply a function to each row, I would do :
apply(a, 1, myFunction);
However my function takes an argument, so :
apply(a, 1, myFunction, myArgument);
But if I want my argument to take a different value for each row, I cannot find the right way to do it.
If I define a 'myArgument' with multiple values, the whole vector will obviously be passed to each call of 'myFunction'.
I think that I would need a kind of hybrid between apply and the multivariate mapply. Does it make sense ?
One 'dirty' way to achieve my goal is to split the matrix by rows (or columns), use mapply on the resulting list and merge the result back to a matrix :
do.call(rbind, Map(myFunction, split(a,row(a)), as.list(myArgument)));
I had a look at sweep, aggregate, and all the *apply variations, but I couldn't find the perfect match for my need. Did I miss it?
Thank you for your help.
You can use sweep to do that.
a <- matrix(rnorm(100),10)
rmeans <- rowMeans(a)
a_new <- sweep(a,1,rmeans,`-`)
rowMeans(a_new)
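The same idea works for any per-row argument, not just row means. As a sketch with made-up names (row_arg, by_sweep, by_mapply), dividing each row by its own value via sweep gives the same matrix as the split-and-mapply route:

```r
a <- matrix(1:12, nrow = 3)
row_arg <- c(1, 2, 3)  # hypothetical per-row argument, one value per row

# MARGIN = 1 lines the values of row_arg up with the rows of a
by_sweep <- sweep(a, 1, row_arg, `/`)

# the split-by-row route, for comparison
by_mapply <- t(mapply(function(x, y) x / y, split(a, row(a)), row_arg))

all.equal(by_sweep, unname(by_mapply))
```

Note that sweep only fits when the function is a vectorised binary operation such as `-` or `/`; for an arbitrary function of a row and an argument, the mapply route is the more general one.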
I don't think there are any great answers, but you can somewhat simplify your solution by using mapply, which handles the "rbind" part for you, assuming your function always returns a vector of the same size (also, Map is really just mapply):
a <- matrix(1:80,ncol=8)
myFun <- function(x, y) (x - mean(x)) * y
myArg <- 1:nrow(a)
t(mapply(myFun, split(a, row(a)), myArg))
I know the topic is quite old, but I had the same issue and I solved it this way:
# Original matrix
a <- matrix(runif(n=100), ncol=5)
# Different value for each row
v <- runif(n=nrow(a))
# Result matrix -> Add a column with the row number
o <- cbind(1:nrow(a), a)
fun <- function(x, v) {
idx <- 2:length(x)
i <- x[1]
r <- x[idx] / v[i]
return(r)
}
o <- t(apply(o, 1, fun, v=v))
By adding a column with the row number to the left of the original matrix, the index of the needed value from the argument vector can be received from the first column of the data matrix.

Apply over all columns and rows of two different dataframes in R

I am trying to apply a function over all rows and columns of two dataframes, but I don't know how to do it with apply.
I think the following script explains what I intend to do and the way I tried to solve it. Any advice would be warmly appreciated! Please note that simplefunction is only an example, to keep things simple.
# some data and a function
df1<-data.frame(name=c("aa","bb","cc","dd","ee"),a=sample(1:50,5),b=sample(1:50,5),c=sample(1:50,5))
df2<-data.frame(name=c("aa","bb","cc","dd","ee"),a=sample(1:50,5),b=sample(1:50,5),c=sample(1:50,5))
simplefunction<-function(a,b){a+b}
# apply on a single row
simplefunction(df1[1,2],df2[1,2])
# apply over all colums
apply(?)
## apply over all columns and rows
# create df to receive results
df3<-df2
# loop it
for (i in 2:5)df3[i]<-apply(?)
My first mapply answer!! For your simple example you have...
mapply( FUN = `+` , df1[,-1] , df2[,-1] )
#       a  b  c
# [1,] 60 35 75
# [2,] 57 39 92
# [3,] 72 71 48
# [4,] 31 19 85
# [5,] 47 66 58
You can extend it like so...
mapply( FUN = function(x,y,z,etc){ simplefunctioncodehere} , df1[,-1] , df2[,-1] , ... other dataframes here )
The dataframes are passed to the function in order, so in this example the columns of df1 arrive as x, the columns of df2 as y, and z and etc take the columns of whatever other dataframes you specify, in that order. Hopefully that makes sense. Because a dataframe is a list of columns, mapply calls the function once per column position: first with the first column of every dataframe, then with the second column of every dataframe, and so on. The function therefore has to be vectorised over the values in each column, as `+` is.
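To see what mapply actually hands to the function, it helps to return the length of each argument. This is a small sketch on made-up data (the toy dataframes and the names lens and res are assumptions for illustration):

```r
df1 <- data.frame(a = 1:3, b = 4:6)
df2 <- data.frame(a = 7:9, b = 10:12)

# each call receives one whole column from each dataframe
lens <- mapply(function(x, y) length(x), df1, df2)
lens

# a vectorised function then gives one result column per input column
res <- mapply(function(x, y) x + y, df1, df2)
res
```

lens has one entry per column, each equal to the number of rows, which shows that the function is called per column rather than per cell.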
You can also use Reduce:
set.seed(45) # for reproducibility
Reduce(function(x,y) { x + y}, list(df1[, -1], df2[,-1]))
#    a  b  c
# 1 53 22 23
# 2 64 28 91
# 3 19 56 51
# 4 38 41 53
# 5 28 42 30
You can just do :
df1[,-1] + df2[,-1]
Which gives :
   a  b  c
1 52 24 37
2 65 63 62
3 31 90 89
4 90 35 33
5 51 33 45

Resources