R functions counting elements - r

I would like to print the element number of the list that is going through a function. For example if there are 10 elements in the list, I would like a counter that will go from 0-10 as the function goes through each element.
a = length(url)
func0 = function(url){
a = a-1
print(a)
}
cc = lapply(url, func0)
However this does not work. Please let me know what I'm doing wrong.

Your function modifies a local copy of the variable a, so it prints 9 ten times. To get the behaviour you want, replace the assignment operator = with the global assignment operator <<- (by the way, why not <-? The equals sign is usually used for function arguments).
func0 = function(url){
a <<- a-1
print(a)
}
It will work, but the common recommendation is to avoid the global assignment operator in your code.
As an alternative I can suggest to check the package pbapply, which adds progress bar to the *apply functions.
require(pbapply)
pblapply(url, func1)
Here func1 stands for the function you want to apply to each element of the list.
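If the goal is only to know which element is being processed, one option that avoids <<- is to iterate over positions instead of elements. This is a minimal sketch, not the answer's own method; the urls list and the nchar() call are hypothetical placeholders for the real data and the real per-element work.

```r
# Hypothetical input; the question's `url` object is not shown.
urls <- list("a.com", "bb.org", "ccc.net")

# Iterate over positions rather than elements, so the index is available.
res <- lapply(seq_along(urls), function(i) {
  message("Processing element ", i, " of ", length(urls))
  nchar(urls[[i]])  # placeholder for the real per-element work
})
```

Because the index i is an argument of the anonymous function, no state has to be shared between calls.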

Related

Provide multiple function arguments by one variable

When working with packages like openxlsx, I often find myself writing repetitive code such as defining the wb and sheet arguments with the same values.
To respect the DRY principle, I would like to define one variable that contains multiple arguments. Then, when I call a function, I should be able to provide said variable to define multiple arguments.
Example:
foo <- list(a=1,b=2,c=3)
bar <- function(a,b,c,d) {
return(a+b+c+d)
}
bar(foo, d=4) # should return 10
How should the foo() function be defined to achieve this?
Apparently you are just looking for do.call, which allows you to create and evaluate a call from a function and a list of arguments.
do.call(bar, c(foo, d = 4))
#[1] 10
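The same pattern applies to the repeated wb and sheet arguments mentioned in the question. The sketch below uses write_value, a hypothetical stand-in function, rather than any real openxlsx call, so it only illustrates how the shared arguments can live in one list.

```r
# Hypothetical stand-in for a function with repeated wb/sheet arguments.
write_value <- function(wb, sheet, value) {
  paste(wb, sheet, value, sep = "/")
}

# Shared arguments are defined once ...
common <- list(wb = "book1", sheet = "Sheet1")

# ... and combined with the call-specific ones at call time.
out <- do.call(write_value, c(common, list(value = 42)))
out  # "book1/Sheet1/42"
```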
How should the foo() function be defined to achieve this?
You've got it slightly backwards. Rather than trying to wrangle the output of foo into something that bar can accept, write foo so that it takes input in a form that is convenient to you. That is, create a wrapper function that provides all the boilerplate arguments that bar requires, without you having to specify them manually.
Example:
bar <- function(a, b, c, d) {
return(a+b+c+d)
}
call_bar <- function(d=4) {
bar(1, 2, 3, d)
}
call_bar(42) # shorter than writing bar(1, 2, 3, 42)
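If several such wrappers are needed, the idea can be generalised with a small function factory. This is only a sketch; preset is a hypothetical helper name, not part of base R.

```r
bar <- function(a, b, c, d) {
  a + b + c + d
}

# Hypothetical helper: returns a version of f with some arguments fixed.
preset <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

bar_abc <- preset(bar, a = 1, b = 2, c = 3)
bar_abc(d = 4)  # 10
bar_abc(42)     # 48, the unnamed value fills the remaining argument d
```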
I discovered a solution using rlang::exec.
First, we must have a function to structure the dots:
getDots <- function(...) {
out <- sapply(as.list(match.call())[-1], function(x) eval(parse(text=deparse(x))))
return(out)
}
Then we must have a function that executes our chosen function, feeding in our static parameters as a list (a, b, and c), in addition to d.
execute <- function(FUN, ...) {
dots <-
getDots(...) %>%
rlang::flatten()
out <- rlang::exec(FUN, !!!dots)
return(out)
}
Then calling execute(bar, abc, d=4) returns 10, as it should do.
Alternatively, we can write bar %>% execute(abc, d=4).
Let me give you an example!
How to get two or more return values from a function
Method 1: Use global variables. A function can modify several global variables, and the changes remain visible to the caller after it returns, which is equivalent to returning multiple values.
Method 2: Pass an array as a parameter. Changes the function makes to the array's contents, such as sorting or arithmetic on its elements, are still in effect when control returns to the caller, so this also returns a set of values.
Method 3: Use pointer parameters. The principle is the same as in Method 2, because an array name is itself the address of the array's first element.
Method 4: If you know C++, you can use reference parameters.
You can try these four methods here. The problem seemed similar to me, so I am offering them in the hope that they help!

inverting an index using clusters

This code is about inverting an index using clusters.
Unfortunately I do not understand the line with recognize<-...
I know that the function Vectorize applies the inner function element-wise, but I do not understand the inner function here.
The parameters (uniq, test) are not defined, how can we apply which then? Also why is there a "uniq" as text right after?
slots <- as.integer(Sys.getenv("NSLOTS"))
cl <- makeCluster(slots, type = "PSOCK")
inverted_index4<-function(x){
y <- unique(x)
recognize <- Vectorize(function(uniq,text) which(text %in% uniq),"uniq",SIMPLIFY = F)
y2 <- parLapply(cl, y, recognize, x)
unlist(y2,recursive=FALSE)
}
The Vectorize() function simply creates a new, element-wise (vectorised) version of the custom function function(uniq,text) which(text %in% uniq).
The "uniq" string names the argument of that function that you want to iterate over. You can then pass a vector of length greater than one for uniq and get back a list with one element per element of uniq, each holding the output of the function for that value.
I would suggest the author make the code a little clearer and better commented. The Vectorize call also does not necessarily need to be inside the function.
Note that parLapply() comes from the parallel package. It calls recognize once for each element of y, and the extra argument x is passed through to recognize as its second argument, text, so text does not need to exist in the global environment.
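To see what the vectorised function returns, it can be called directly, without a cluster. This is a minimal sketch on a made-up input vector x; the function definition is the one from the question.

```r
# Made-up input; any vector with repeated values works.
x <- c("a", "b", "a", "c", "b")

# Same definition as in the question: only `uniq` is iterated over,
# while `text` is passed whole to every call.
recognize <- Vectorize(function(uniq, text) which(text %in% uniq),
                       "uniq", SIMPLIFY = FALSE)

# One list element per unique value, holding the positions of that value in x.
idx <- recognize(unique(x), x)
idx$a  # 1 3
idx$b  # 2 5
idx$c  # 4
```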

How to use lapply in R to evaluate elements of a list?

My apologies if this has been answered somewhere else. I've defined two functions in R and then nested them with good results. Now I would like to evaluate these two nested functions by changing a variable in the second function. I've tried creating a list for the changing variable and then using lapply to evaluate each element, but I'm getting an error.
My code looks something like this:
# First function
FirstFun <- function(a, b, c, d) {
answer1 <- (a + b)/(1-(0.2*(c/d))-(0.8*(c/d)^2))
return(answer1)
}
# First function evaluated
FirstFun(13,387,1728,1980)
# Second function
SecondFun <- function(answer1,c,d) {
answer2 <- answer1*(1-(0.2*(c/d))-(0.8*(c/d)^2))
return(answer2)
}
# Nested function evaluated
SecondFun(FirstFun(13,387,1728,1980),1728,1980)
# Nested function evaluated with elements of a list
c <- list(0:1980)
lapply(c, SecondFun(FirstFun(13,387,1728,1980),c,1980))
If I understand you correctly, you are looking for:
SecondFun(FirstFun(13,387,1728,1980),0:1980,1980)
or maybe this :
SecondFun(FirstFun(13,387,1728,0:1980),0:1980,1980)
both return a numeric vector of length 1981.
Two things:
1. There is no need for a list; a range works.
2. Calling a variable 'c' is a bad idea: c is the name of a base R function.
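For completeness, the lapply call in the question fails because its second argument must be a function, whereas the question passes the result of calling SecondFun. A sketch of a working lapply version next to the vectorised call is shown below; c_values and base are names chosen here for illustration.

```r
FirstFun <- function(a, b, c, d) {
  (a + b) / (1 - (0.2 * (c / d)) - (0.8 * (c / d)^2))
}
SecondFun <- function(answer1, c, d) {
  answer1 * (1 - (0.2 * (c / d)) - (0.8 * (c / d)^2))
}

base <- FirstFun(13, 387, 1728, 1980)
c_values <- 0:1980  # a plain range, not a list, and not named `c`

# lapply needs a function; wrap the call in an anonymous function.
via_lapply <- lapply(c_values, function(ci) SecondFun(base, ci, 1980))

# The arithmetic is already vectorised, so this gives the same numbers.
vectorised <- SecondFun(base, c_values, 1980)

all.equal(unlist(via_lapply), vectorised)  # TRUE
```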

Use of the <<- operator in R [duplicate]

I just finished reading about scoping in the R intro, and am very curious about the <<- assignment.
The manual showed one (very interesting) example for <<-, which I feel I understood. What I am still missing is the context of when this can be useful.
So what I would love to read from you are examples (or links to examples) on when the use of <<- can be interesting/useful. What might be the dangers of using it (it looks easy to lose track of), and any tips you might feel like sharing.
<<- is most useful in conjunction with closures to maintain state. Here's a section from a recent paper of mine:
A closure is a function written by another function. Closures are
so-called because they enclose the environment of the parent
function, and can access all variables and parameters in that
function. This is useful because it allows us to have two levels of
parameters. One level of parameters (the parent) controls how the
function works. The other level (the child) does the work. The
following example shows how we can use this idea to generate a family of
power functions. The parent function (power) creates child functions
(square and cube) that actually do the hard work.
power <- function(exponent) {
function(x) x ^ exponent
}
square <- power(2)
square(2) # -> [1] 4
square(4) # -> [1] 16
cube <- power(3)
cube(2) # -> [1] 8
cube(4) # -> [1] 64
The ability to manage variables at two levels also makes it possible to maintain the state across function invocations by allowing a function to modify variables in the environment of its parent. The key to managing variables at different levels is the double arrow assignment operator <<-. Unlike the usual single arrow assignment (<-) that always works on the current level, the double arrow operator can modify variables in parent levels.
This makes it possible to maintain a counter that records how many times a function has been called, as the following example shows. Each time new_counter is run, it creates an environment, initialises the counter i in this environment, and then creates a new function.
new_counter <- function() {
i <- 0
function() {
# do something useful, then ...
i <<- i + 1
i
}
}
The new function is a closure, and its environment is the enclosing environment. When the closures counter_one and counter_two are run, each one modifies the counter in its enclosing environment and then returns the current count.
counter_one <- new_counter()
counter_two <- new_counter()
counter_one() # -> [1] 1
counter_one() # -> [1] 2
counter_two() # -> [1] 1
It helps to think of <<- as equivalent to assign (if you set the inherits parameter in that function to TRUE). The benefit of assign is that it allows you to specify more parameters (e.g. the environment), so I prefer to use assign over <<- in most cases.
Using <<- and assign(x, value, inherits=TRUE) means that "enclosing environments of the supplied environment are searched until the variable 'x' is encountered." In other words, it will keep going through the environments in order until it finds a variable with that name, and it will assign it to that. This can be within the scope of a function, or in the global environment.
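A small sketch of the assign() form inside a closure follows; make_accumulator is a hypothetical name used only for this illustration. One difference worth keeping in mind is that assign(..., inherits = TRUE) starts its search in the current environment, while <<- starts in the parent, so the two behave alike only when the variable is not also defined locally.

```r
make_accumulator <- function() {
  total <- 0
  function(x) {
    # `total` is not defined in this function's own frame, so the search
    # continues to the enclosing environment and updates it there.
    assign("total", total + x, inherits = TRUE)
    total
  }
}

acc <- make_accumulator()
acc(5)   # 5
acc(10)  # 15
```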
In order to understand what these functions do, you need to also understand R environments (e.g. using search).
I regularly use these functions when I'm running a large simulation and I want to save intermediate results. This allows you to create the object outside the scope of the given function or apply loop. That's very helpful, especially if you have any concern about a large loop ending unexpectedly (e.g. a database disconnection), in which case you could lose everything in the process. This would be equivalent to writing your results out to a database or file during a long running process, except that it's storing the results within the R environment instead.
My primary warning with this: be careful because you're now working with global variables, especially when using <<-. That means that you can end up with situations where a function is using an object value from the environment, when you expected it to be using one that was supplied as a parameter. This is one of the main things that functional programming tries to avoid (see side effects). I avoid this problem by assigning my values to unique variable names (using paste with a set of unique parameters) that are never used within the function, but just used for caching and in case I need to recover later on (or do some meta-analysis on the intermediate results).
One place where I used <<- was in simple GUIs using tcl/tk. Some of the initial examples have it, as you need to make a distinction between local and global variables for statefulness. See for example
library(tcltk)
demo(tkdensity)
which uses <<-. Otherwise I concur with Marek :) -- a Google search can help.
On this subject I'd like to point out that the <<- operator will behave strangely when applied (incorrectly) within a for loop (there may be other cases too). Given the following code:
fortest <- function() {
mySum <- 0
for (i in c(1, 2, 3)) {
mySum <<- mySum + i
}
mySum
}
you might expect that the function would return the expected sum, 6, but instead it returns 0, with a global variable mySum being created and assigned the value 3. I can't fully explain what is going on here but certainly the body of a for loop is not a new scope 'level'. Instead, it seems that R looks outside of the fortest function, can't find a mySum variable to assign to, so creates one and assigns the value 1, the first time through the loop. On subsequent iterations, the RHS in the assignment must be referring to the (unchanged) inner mySum variable whereas the LHS refers to the global variable. Therefore each iteration overwrites the value of the global variable to that iteration's value of i, hence it has the value 3 on exit from the function.
Hope this helps someone - this stumped me for a couple of hours today! (BTW, just replace <<- with <- and the function works as expected).
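The behaviour follows from where <<- begins its search: it skips the current function's own frame and starts in the enclosing environment, and a for loop does not create a new frame. The sketch below contrasts the two working variants; the function names are made up for illustration.

```r
# <- inside the loop updates the local variable, as noted above.
fortest_local <- function() {
  mySum <- 0
  for (i in c(1, 2, 3)) {
    mySum <- mySum + i
  }
  mySum
}

# <<- reaches mySum only when a function boundary separates the
# assignment from the variable it is meant to modify.
fortest_nested <- function() {
  mySum <- 0
  add_all <- function() {
    for (i in c(1, 2, 3)) {
      mySum <<- mySum + i  # found in fortest_nested's environment
    }
  }
  add_all()
  mySum
}

fortest_local()   # 6
fortest_nested()  # 6
```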
f <- function(n, x0) {x <- x0; replicate(n, (function(){x <<- x+rnorm(1)})())}
plot(f(1000,0),typ="l")
The <<- operator can also be useful for Reference Classes when writing Reference Methods. For example:
myRFclass <- setRefClass(Class = "RF",
fields = list(A = "numeric",
B = "numeric",
C = function() A + B))
myRFclass$methods(show = function() cat("A =", A, "B =", B, "C =",C))
myRFclass$methods(changeA = function() A <<- A*B) # note the <<-
obj1 <- myRFclass(A = 2, B = 3)
obj1
# A = 2 B = 3 C = 5
obj1$changeA()
obj1
# A = 6 B = 3 C = 9
I use it in order to change inside map() an object in the global environment.
a = c(1,0,0,1,0,0,0,0)
Say I want to obtain a vector which is c(1,2,3,1,2,3,4,5), that is if there is a 1, let it 1, otherwise add 1 until the next 1.
map(
.x = seq(1,(length(a))),
.f = function(x) {
a[x] <<- ifelse(a[x]==1, a[x], a[x-1]+1)
})
a
[1] 1 2 3 1 2 3 4 5
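For this particular task the same vector can also be obtained without modifying a global object. The sketch below swaps in a different technique, base R's cumsum() and ave(), and assumes the vector starts with a 1, as in the example.

```r
a <- c(1, 0, 0, 1, 0, 0, 0, 0)

# Every 1 starts a new group; cumsum() turns the 1s into group ids.
groups <- cumsum(a == 1)

# Within each group, number the positions 1, 2, 3, ...
res <- ave(a, groups, FUN = seq_along)
res  # 1 2 3 1 2 3 4 5
```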

Convert character vector to numeric vector in R for value assignment?

I have:
z = data.frame(x1=a, x2=b, x3=c, etc)
I am trying to do:
for (i in 1:10)
{
paste(c('N'),i,sep="") -> paste(c('z$x'),i,sep="")
}
Problems:
paste(c('z$x'),i,sep="") yields "z$x1", "z$x2" instead of calling the actual values. I need the expression to be evaluated. I tried as.numeric, eval. Neither seemed to work.
paste(c('N'),i,sep="") yields "N1", "N2". I need the expression to be merely used as name. If I try to assign it a value such as paste(c('N'),5,sep="") -> 5, ie "N5" -> 5 instead of N5 -> 5, I get target of assignment expands to non-language object.
This task is pretty trivial since I can simply do:
N1 = x1...
N2 = x2...
etc, but I want to learn something new
I'd suggest using something like for( i in 1:10 ) z[,i] <- N[,i]...
BUT, since you said you want to learn something new, you can play around with parse and substitute.
NOTE: these little tools are funny, but experienced users (not me) avoid them.
This is called "computing on the language". It's very interesting, and it helps understanding the way R works. Let me try to give an intro:
The basic language construct is a constant, like a numeric or character vector. It is trivial because it is not different from its "unevaluated" version, but it is one of the building blocks for more complicated expressions.
The (officially) basic language object is the symbol, also known as a name. It's nothing but a pointer to another object, i.e., a token that identifies another object which may or may not exist. For instance, if you run x <- 10, then x is a symbol that refers to the value 10. In other words, evaluating the symbol x yields the numeric vector 10. Evaluating a nonexistent symbol yields an error.
A symbol looks like a character string, but it is not. You can turn a string into a symbol with as.symbol("x").
The next language object is the call. This is a recursive object, implemented as a list whose elements are either constants, symbols, or other calls. The first element must not be a constant, because it must evaluate to the real function that will be called. The other elements are the arguments to this function.
If the first argument does not evaluate to an existing function, R will throw either Error: attempt to apply non-function or Error: could not find function "x" (if the first argument is a symbol that is undefined or points to something other than a function).
Example: the code line f(x, y+z, 2) will be parsed as a list of 4 elements, the first being f (as a symbol), the second being x (another symbol), the third another call, and the fourth a numeric constant. The third element y+z, is just a function with two arguments, so it parses as a list of three names: '+', y and z.
Finally, there is also the expression object, that is a list of calls/symbols/constants, that are meant to be evaluated one by one.
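These objects can be inspected directly with quote() and as.symbol(). A short sketch of the f(x, y+z, 2) example described above:

```r
x <- 10
sym <- as.symbol("x")  # a symbol built from a string
eval(sym)              # 10, evaluating the symbol yields its value

cl <- quote(f(x, y + z, 2))  # an unevaluated call
length(cl)                   # 4: f, x, y + z, 2
class(cl[[1]])               # "name"
class(cl[[3]])               # "call"
as.list(cl[[3]])             # the three parts of the inner call: `+`, y, z
class(cl[[4]])               # "numeric"
```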
You'll find lots of information here:
https://github.com/hadley/devtools/wiki/Computing-on-the-language
OK, now let's get back to your question :-)
What you have tried does not work because the output of paste is a character string, and the assignment function expects as its first argument something that evaluates to a symbol, to be either created or modified. Alternatively, the first argument can also evaluate to a call associated with a replacement function. These are a little trickier, but they are handled by the assignment function itself, not by the parser.
The error message you see, target of assignment expands to non-language object, is triggered by the assignment function, precisely because your target evaluates to a string.
We can fix that building up a call that has the symbols you want in the right places. The most "brute force" method is to put everything inside a string and use parse:
parse(text=paste('N',i," -> ",'z$x',i,sep=""))
Another way to get there is to use substitute:
substitute(x -> y, list(x=as.symbol(paste("N",i,sep="")), y=substitute(z$w, list(w=paste("x",i,sep="")))))
The inner substitute creates the calls z$x1, z$x2 etc. The outer substitute puts this call as the target of the assignment, and the symbols N1, N2 etc as the values.
parse results in an expression, and substitute in a call. Both can be passed to eval to get the same result.
Just one final note: I repeat that all this is intended as a didactic example, to help understanding the inner workings of the language, but it is far from good programming practice to use parse and substitute, except when there is really no alternative.
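To make the comparison concrete, here is a sketch that runs the parse/eval version and then the more conventional route with get() and [[<-. The vectors N1, N2 and the two-column data frame z are made-up stand-ins for the data in the question.

```r
# Made-up data standing in for the question's objects.
N1 <- c(10, 20, 30)
N2 <- c(1, 2, 3)
z <- data.frame(x1 = numeric(3), x2 = numeric(3))

# Metaprogramming route: build the text "N1 -> z$x1" and evaluate it.
for (i in 1:2) {
  eval(parse(text = paste0("N", i, " -> z$x", i)))
}
z_eval <- z

# Conventional route: look objects up by name and index columns by name.
z <- data.frame(x1 = numeric(3), x2 = numeric(3))
for (i in 1:2) {
  z[[paste0("x", i)]] <- get(paste0("N", i))
}

all(z == z_eval)  # TRUE
```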
A data.frame is a named list. It is usually good practice, and idiomatically R-ish, not to have lots of objects in the global environment, but to keep related (or similar) objects in lists and to use lapply etc.
You could use list2env to multiassign the named elements of your list (the columns in your data.frame) to the global environment
DD <- data.frame(x = 1:3, y = letters[1:3], z = 3:1)
list2env(DD, envir = parent.frame())
## <environment: R_GlobalEnv>
## ta da, x, y and z now exist within the global environment
x
## [1] 1 2 3
y
## [1] a b c
## Levels: a b c
z
## [1] 3 2 1
I am not exactly sure what you are trying to accomplish. But here is a guess:
### Create a data.frame using the alphabet
data <- data.frame(x = 'a', y = 'b', z = 'c')
### Create a numerical index corresponding to the letter position in the alphabet
index <- match(unlist(data[1, ]), letters)
### Use an 'lapply' to apply a function to every element in 'index'; creates a list
val <- lapply(index, function(x) {
paste('N', x, sep = '')
})
### Assign names to our list
names(val) <- names(data)
### Observe the result
val$x
