How to show replicate number when using sapply in R? - r

Consider the function below:
f = function(i) i^2
Now we find the output of f for an input vector of length 1000 (or, equivalently, run f for 1000 replications) with:
output = c()
for (i in 1:1000) output[i] = f(i)
When running time-consuming functions, we might like to know which replication we are on. So we can use:
output = c()
for (i in 1:1000) {output[i] = f(i); cat("Replicate=", i, "\n")}
This gives the replicate number at the end of each replication. Now what if we use sapply instead of for:
output = sapply(1:1000, function(i) f(i))
How can we see which replicate we are on while using sapply? Note that I tried adding cat("Replicate=", i, "\n") to the definition of f. This shows the replicate number, but only at the end of the entire run and not at the end of each replicate.

You say you have tried it, but this code works just fine for me:
result <- sapply(1:1000, function(x) {
print(x) # cat works too
return(x^2)
})
You may have forgotten the curly brackets! :-)
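If the messages still appear only at the end of the run, one possible cause is console output buffering (for example in the Windows Rgui). A minimal sketch, assuming that is the issue: call flush.console() after each cat(). The value of n and the variable names are illustrative, not from the question.

```r
f <- function(i) i^2
n <- 10  # illustrative; the question uses 1000

output <- sapply(seq_len(n), function(i) {
  result <- f(i)
  cat("Replicate =", i, "\n")
  flush.console()  # ask the console to display buffered output now
  result           # value collected by sapply
})
```

On consoles that do not buffer output, flush.console() has no visible effect, so it should be harmless to leave in.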

Related

Putting user-defined functions on a list in a for loop

I have problems storing user-defined functions in an R list when they are added to it in a for loop.
I have to define some segment-specific functions based on some parameters, so I create the functions and put them on a list while looping through the segments with a for loop. The problem is that I get the same function everywhere in the resulting list.
The code looks like this:
n <- 100
segmenty <- 1:n
segment_functions <- list()
for (i in segmenty){
segment_functions[[i]] <- function(){return(i)}
}
When I run the code, what I get is the same function (the last one created in the loop) for all indexes:
## for all k
segment_functions[[k]]()
[1] 100
There is no problem when I put the functions on the list manually, e.g.
segment_functions[[1]] <- function(){return(1)}
segment_functions[[2]] <- function(){return(2)}
segment_functions[[3]] <- function(){return(3)}
works just fine.
I honestly have no idea what's wrong. Could you help?
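All the functions created in the loop look up i in the same environment, so they all see its final value. You need a factory function that calls force, so that i is evaluated when each function is created and assigned into the list: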
n <- 100
segmenty <- 1:n
segment_functions <- list()
f <- function(i) { force(i); function() return(i) }
for (i in segmenty){
segment_functions[[i]] <- f(i)
}
I'd use lapply and capture i in a closure of the wrapper:
segment_functions <- lapply(1:100, function(i) function() i)
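To see why the loop version fails, here is a small sketch: every function created in the loop refers to the same variable i, so all of them return its final value. Wrapping the body in local() is another way to give each function its own copy. The names broken, fixed and j are illustrative only.

```r
n <- 3

# Every closure refers to the same loop variable i.
broken <- list()
for (i in seq_len(n)) {
  broken[[i]] <- function() i
}

# local() creates a fresh environment per iteration holding its own copy j.
fixed <- list()
for (i in seq_len(n)) {
  fixed[[i]] <- local({
    j <- i
    function() j
  })
}

broken_values <- sapply(broken, function(g) g())  # every element equals n
fixed_values  <- sapply(fixed, function(g) g())   # 1, 2, ..., n
```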

is it possible to generate a list of seq() for loop in r?

I am very new to R and I have a problem with looping using seq() and list. I have searched the Q&A on SO but have yet to find the same problem. I apologize if this is a duplicate.
I know the basics of generating a sequence of numbers and of building a list, but I am wondering whether we can generate a list of sequences inside a loop.
This is an example of my code:
J <- seq(50,200,50) # I actually want 1:J for each value, i.e. 1:50, 1:100, etc.
K <- seq(10,100,10) # same as above
set.seed(1234)
for (i in J) {
for (j in K){
f <- rnorm(i + 1) # I would like the f values collected in a list; since J has 4 values, could the list follow that structure?
}
}
I tried using both the sequence and list functions, but I keep getting messages like these.
With print(i) inside the loop, the output is:
[1]1
.
.
.
[1]50
Warning message:
In 1:(seq(50, 200, 50)) :
numerical expression has 6 elements: only the first used
for (i in 1:list(seq(50,200,50)))
Error in 1:list(seq(50, 200, 50)) : NA/NaN argument
Can such loop combinations be performed? Could you please guide me on this? Thank you very much.
I am not yet sure what you are asking, but is this what you are looking for? It was difficult to post this as a comment.
J <- seq(50,200,50)
l1 <- vector(length = length(J), mode = "list")
for (i in seq_along(J)){ # you know of seq_along() right?
l1[[i]] = rnorm(J[i])
}
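If the goal is a list holding the sequences 1:50, 1:100, and so on, one option is lapply with seq_len. This is a sketch of one reading of the question; seqs is an illustrative name.

```r
J <- seq(50, 200, 50)

# One list element per value of J: 1:50, 1:100, 1:150, 1:200
seqs <- lapply(J, seq_len)

lengths(seqs)  # number of elements in each sequence
```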
For the second question, where you want lists (over J) of lists (over K) of matrices: please note that <<- has never been good practice, but for now this is what I could come up with!
Note: to understand what is actually happening, use debug mode, i.e. after defining func, call debug(func); calling func will then step through its execution line by line.
l1 <- vector(length = length(J), mode = "list")
l2 <- vector(length = length(K), mode = "list")
func <- function(x){
l1[[x]] <- l2
func1 <- function(y) {
l1[[x]][[y]] <<- matrix(rnorm(J[x]*K[y]),
ncol = J[x],
nrow = K[y])
}
lapply(seq_along(l1[[x]]),func1)
}
lapply(seq_along(l1), func)
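The same nested structure can also be built without <<- by returning values from nested lapply calls. A minimal sketch, assuming the goal is a list over J whose elements are lists over K of matrices with K[y] rows and J[x] columns; nested is an illustrative name.

```r
J <- seq(50, 200, 50)
K <- seq(10, 100, 10)
set.seed(1234)

# Outer lapply walks over J, inner lapply over K; each call returns its result,
# so no assignment into an outer variable is needed.
nested <- lapply(J, function(j) {
  lapply(K, function(k) {
    matrix(rnorm(j * k), nrow = k, ncol = j)
  })
})

dim(nested[[1]][[2]])  # rows = K[2], columns = J[1]
```

Because the result is assigned to nested, the matrices are kept instead of only being printed.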

Relooping a function over its own output

I have defined a function which I want to reapply to its own output multiple times. I tried
replicate(1000,myfunction)
but realised that this is just applying my function to my initial input 1000 times, rather than applying my function to the new output each time. In effect what I desire is:
function(function(...function(x_0)...))
1000 times over and being able to see the changes at each stage.
I have previously defined b as a certain vector of length 7.
b_0=b
C=matrix(0,7,1000)
for(k in 1:1000){
b_k=myfun(b_(k-1))
}
C=rbind(b_k)
C
Is this the right idea behind what I want?
You could use Reduce for this. For example:
add_two <- function(a) a+2
ignore_current <- function(f) function(a,b) f(a)
Reduce(ignore_current(add_two), 1:10, init=4)
# 24
Normally Reduce expects to iterate over a set of new values, but in this case I use ignore_current to drop the sequence value (1:10) so that parameter is just used to control the number of times we repeat the process. This is the same as
add_two(add_two(add_two(add_two(add_two(add_two(add_two(add_two(add_two(add_two(4))))))))))
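To see the value at every stage, as the question asks, Reduce accepts accumulate = TRUE. A short sketch reusing add_two and ignore_current (redefined here so the block is self-contained); stages is an illustrative name.

```r
add_two <- function(a) a + 2
ignore_current <- function(f) function(a, b) f(a)

# accumulate = TRUE keeps the initial value and every intermediate result,
# so stages has one element more than the sequence 1:10.
stages <- Reduce(ignore_current(add_two), 1:10, accumulate = TRUE, init = 4)
```

When each result is a vector of length greater than one, such as the length-7 vector in the question, Reduce returns a list with one element per stage.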
For a purely functional approach, use Compose from the functional package:
library(functional)
f = Reduce(Compose, replicate(100, function(x) x+2))
#> f(2)
#[1] 202
But this solution does not work when n is too large. Very interesting.
A loop would work just fine here.
apply_fun_n_times <- function(input, fun, n){
for(i in seq_len(n)){ # seq_len(n) also handles n = 0 correctly
input <- fun(input)
}
return(input)
}
addone <- function(x){x+1}
apply_fun_n_times(1, addone, 3)
which gives
> apply_fun_n_times(1, addone, 3)
[1] 4
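If every intermediate result should be kept, as with the matrix C in the question, the same kind of loop can write each stage into a preallocated matrix. A minimal sketch; iterate_and_record and the toy function halve are hypothetical names, not from the question.

```r
iterate_and_record <- function(input, fun, n) {
  # One column per stage; column 1 holds the starting value.
  stages <- matrix(NA_real_, nrow = length(input), ncol = n + 1)
  stages[, 1] <- input
  for (k in seq_len(n)) {
    stages[, k + 1] <- fun(stages[, k])
  }
  stages
}

halve <- function(x) x / 2       # toy stand-in for myfun
b <- c(64, 32, 16, 8, 4, 2, 1)   # toy vector of length 7
C <- iterate_and_record(b, halve, 3)
```

Column k + 1 of C then holds the result after k applications of the function.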
You can try a recursive function:
rec_func <- function(input, i=1000) {
if (i == 0) {
return(input)
} else {
input <- myfunc(input)
i <- i - 1
rec_func(input, i)
}
}
example
myfunc <- function(item) {item + 1}
> rec_func(1, i=1000)
[1] 1001

R Cross Validation

I'm doing cross validation, so I wanted to split the data into 10 folds. Somebody has posted the following code.
f_K_fold <- function(Nobs,K=10){
rs <- runif(Nobs)
id <- seq(Nobs)[order(rs)]
k <- as.integer(Nobs * seq(1, K-1) / K)
k <- matrix(c(0, rep(k, each=2), Nobs), ncol = 2, byrow = TRUE)
k[,1] <- k[,1]+1
l <- lapply(seq.int(K), function(x, k, d)
list(train=d[!(seq(d) %in% seq(k[x, 1],k[x, 2]))],
test=d[seq(k[x,1],k[x,2])]),
k=k,d=id)
return(l)
}
However, I don't really understand what the lapply is doing. Could someone explain it to a newbie? I'd appreciate it.
It's really unfortunate that the code formatting in this example is horrible, since having properly formatted code can aid in understanding the code and catching mistakes.
The last three lines can be viewed as an anonymous function passed to lapply. lapply in essence "climbs" a list and for each list element, applies that (anonymous) function. In the example below, I've disambiguated the lines into a not so anonymous function and a call to lapply.
notSoanonymousFunction <- function(x, k, d) {
list(train = d[!(seq(d) %in% seq(k[x,1],k[x,2]))],
test = d[seq(k[x,1],k[x,2])])
}
l <- lapply(seq.int(K), FUN = notSoanonymousFunction, k = k, d = id)
If you look at ?lapply, you'll notice that there are no k or d arguments. However, these arguments do belong to our notSoanonymousFunction, and lapply takes it in via the ... argument.
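A small sketch of how lapply forwards extra arguments through ...; add_offset and its arguments are made up for illustration.

```r
# lapply(X, FUN, ...) calls FUN(X[[i]], ...) for each element of X.
add_offset <- function(x, offset, scale) (x + offset) * scale

res <- lapply(1:3, add_offset, offset = 10, scale = 2)
# Equivalent to add_offset(1, 10, 2), add_offset(2, 10, 2), add_offset(3, 10, 2)
```

In the same way, k = k and d = id in the original call are handed to the function for every element of seq.int(K).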
As a mental exercise for you, I will show you one more trick for learning what a function is doing. If you need to see what is happening inside the function, place a browser() call inside it and run it. In your case, this would look like this:
notSoanonymousFunction <- function(x, k, d) {
browser()
list(train = d[!(seq(d) %in% seq(k[x,1],k[x,2]))],
test = d[seq(k[x,1],k[x,2])])
}
Once you run this, your console should say something along the lines of
Browser[1] >
You are now effectively inside the function. You can move to the next line by typing n, run the rest of the chunk with c, and quit the browser altogether with Q (see ?browser). You can view and manipulate objects ad libitum. Try checking your workspace with ls() to see which objects are inside the function. You can bet your family farm that there will be objects x, k and d.
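Another way to understand the function, without the debugger, is to run its first few lines on a small input and look at k. A sketch with Nobs = 20 and K = 4 (values chosen only for illustration): each row of k holds the first and last position of one test fold.

```r
Nobs <- 20
K <- 4

k <- as.integer(Nobs * seq(1, K - 1) / K)   # fold boundaries: 5 10 15
k <- matrix(c(0, rep(k, each = 2), Nobs), ncol = 2, byrow = TRUE)
k[, 1] <- k[, 1] + 1
# Rows of k are now (1, 5), (6, 10), (11, 15), (16, 20)

fold_sizes <- k[, 2] - k[, 1] + 1
```

These positions index the shuffled ids, so each fold is a random subset of the observations.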

Loops and variable scope in R

I have the following for loop in R:
v = c(1,2,3,4)
s = create.some.complex.object()
for (i in v){
print(i)
s = some.complex.function.that.updates.s(s)
}
# s here has the right content.
Needless to say, this loop is horribly slow in R.
I tried to write it in functional style:
lapply(v, function(i){
print(i)
s = some.complex.function.that.updates.s(s)
})
# s wasn't updated.
But this doesn't work, because s is passed by value and not by reference.
I only need the result of the last iteration, not all of the intermediate steps.
How do I formulate the first loop in R-style?
lapply(v, function(i){
print(i)
s = some.complex.function.that.updates.s(s)
return(s)
})
The result will be a list holding the object s produced for each value of v. The explicit return is optional: the assignment is the last operation performed by the function, so its value would be returned anyway.
If you can't afford to create it many times then there are not a lot of options. It is also hard to say without seeing the object that you are operating on. If the object is growing/appending, you could collect the intermediate results and do the appending at the end. If it is actually mutating, you should try to get away from pass-by-value semantics and use reference classes (http://www.inside-r.org/r-doc/methods/ReferenceClasses). Then the function that modifies it will actually be a method that you just call n times.
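A minimal sketch of the reference-class idea, assuming the state can be wrapped in an object with a method that updates it. The class name Accumulator, its field value, and the method add are hypothetical stand-ins for the complex object in the question.

```r
library(methods)

Accumulator <- setRefClass(
  "Accumulator",
  fields = list(value = "numeric"),
  methods = list(
    add = function(i) {
      # Inside a method, <<- modifies the object's field in place.
      value <<- value + i
      invisible(.self)
    }
  )
)

s <- Accumulator$new(value = 1)
for (i in c(1, 2, 3, 4)) s$add(i)
s$value
```

The object s is modified in place by each call, so nothing has to be reassigned inside the loop.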
Is the loop itself really the problem? Or is it rather the time the execution of some.complex.function.that.updates.s needs?
Some R programmers will jump through hoops to avoid loops, but have a look at this example:
f <- function(a) a/1.001
loop <- function(n) { s = (1/f(1)^n); for (i in 1:n) s <- f(s); s}
system.time(loop(1E7))
user system elapsed
7.011 0.030 7.008
This is 0.7 microseconds (on a MacBook Pro) per call of a very trivial function in a loop.
v = c(1,2,3,4)
s = create.some.complex.object()
lapply(v, function(i){
print(i)
s <<- some.complex.function.that.updates.s(s)
}) |> invisible()
Use of the <<- operator can sometimes get you into trouble and is (somewhat) discouraged, but when I want to mimic a for loop with side-effects this is a pattern I have found useful.
v = c(1,2,3,4)
s = create.some.complex.object()
lapply(v, function(i){
print(i)
assign('s', some.complex.function.that.updates.s(s), envir = .GlobalEnv)
}) |> invisible()
Using assign lets you avoid the <<- operator, although <<- is significantly faster than calling the assign function. For performance reasons in more intensive applications it can be worth replacing sequential for loops with lapply: in the toy benchmarks below, the median execution time of the lapply version was several hundred times lower. Keep in mind that lapply is not a vectorized operation and that benchmarks of code this small mostly measure overhead, so time your real workload before drawing conclusions. Here are the toy benchmarks:
v <- c(1, 2, 3, 4)
microbenchmark::microbenchmark({
s <- 1
lapply(v, function(i) {
s <<- s + i
})
}, times = 1e4, unit = 'microseconds')
Median: ~ 4 microseconds
v <- c(1, 2, 3, 4)
microbenchmark::microbenchmark({
s <- 1
for(i in v) {
s <- s + i
}
}, times = 1e4, unit = 'microseconds')
Median: ~ 1488 microseconds
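As an alternative to the <<- and assign patterns, the state can be threaded through Reduce, which returns only the final value. A sketch using the same toy update (s + i) as the benchmarks; update_s is a hypothetical stand-in for some.complex.function.that.updates.s.

```r
v <- c(1, 2, 3, 4)
update_s <- function(s, i) s + i  # toy stand-in for the real update

# Reduce calls update_s(1, v[1]), feeds the result back in with v[2], and so on.
s_final <- Reduce(update_s, v, 1)
```

This keeps the update free of side effects, which matches the requirement that only the result of the last iteration is needed.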

Resources