R: Reference list item within the same list

In R, we can reference items created within that same list, i.e.:
list(a = a <- 1, b = a)
I am curious if there is a way to write a function which takes the place of a = a <- 1. That is, if something like
`%=%` <- function(x,y) {
envir <- environment()
char_x <- deparse(substitute(x))
assign(char_x, y, parent.env(envir))
unlist(lapply(setNames(seq_along(x),char_x), function(T) y))
}
# does not work
list(a%=%1, b=a)
is possible in R (i.e. returns the list given above)?
edit: I think this boils down to asking, 'can we call list with a language object that preserves all aspects of manually coding list?' (specifically, assigns the list's names attribute the left-hand side of the language element).
It seems to me that the code below shows that such a solution is hopeless.
my_call <- do.call(substitute, list(expr(expr = {x = y}), list(x=quote(a), y=1)))
equals <- languageEl(my_call, which = 1)
str(equals)
do.call(list, list(equals))

Welp, the clever folk behind tibble have figured this out in their lst() function (also in package dplyr)
library(dplyr)
lst(a=1, b=a, c=c(3,4), d=c)
What a useful feature!
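For readers who want the same behaviour without a package, here is a minimal base R sketch. The helper name seq_list is hypothetical, and this is not how tibble implements lst(); it only illustrates the idea of capturing the arguments unevaluated and evaluating them one at a time, so that earlier elements are visible to later ones.

```r
# seq_list() is a hypothetical helper, not part of base R or tibble.
seq_list <- function(...) {
  # Capture the arguments without evaluating them
  exprs <- as.list(substitute(list(...)))[-1]
  nms <- names(exprs)
  if (is.null(nms)) nms <- rep("", length(exprs))

  # Scratch environment in which earlier results become visible
  env <- new.env(parent = parent.frame())
  out <- vector("list", length(exprs))

  for (i in seq_along(exprs)) {
    val <- eval(exprs[[i]], env)
    out[i] <- list(val)  # list() wrapper keeps NULL results in place
    if (nzchar(nms[i])) assign(nms[i], val, envir = env)
  }
  names(out) <- nms
  out
}

res <- seq_list(a = 1, b = a + 1, c = c(3, 4), d = c)
str(res)
```

Each named element is assigned into the scratch environment as soon as it is computed, which is why b can refer to a and d picks up the element c rather than the function c().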

Related

What does declaring function at the start of a line do?

I encountered this code:
res <- lapply(strsplit(s, "\n")[[1]],
(function (str) paste(rev(strsplit(str, "")[[1]]), collapse = "")))
The second line reverses each of the strings produced by the split on the first line.
How does it do that? Namely, what does calling 'function' at the start do?
lapply applies a function to each element of a list and returns the results as a list. It takes the form lapply(list_data, some_function). So, for instance, if I have a list of integer vectors and want to find out how many integers are in each element, I would run:
list_data <- list(list1 = 1:5,
list2 = 6:10,
list3 = 11:30)
lapply(list_data, length)
The function here is length, which is built into R. When no existing function does what you want, say you want to apply your own formula to each value in the list, you can define your own. The function keyword creates a new function, and it can be written inline (an anonymous function) without ever being given a name. Like so:
lapply(list_data, function(x) x^2+4-x^3)
The function here computes x^2+4-x^3, which does not exist as a named function in R itself.
So in your example, your data is strsplit(s, "\n")[[1]], and lapply applies the function paste(rev(strsplit(str, "")[[1]]), collapse = "") to each element of it.
Note that in my example, I put function(x) - your example puts function(str) - what's in the parentheses doesn't matter and is user defined. For example lapply(list_data, function(str) str^2+4-str^3) will return the same thing as lapply(list_data, function(x) x^2+4-x^3)
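To make the equivalence concrete, here is a small sketch that writes the reversing function once with a name and once inline. The input string s and the name rev_string are made up for illustration; they are not from the question.

```r
# Illustrative input: two lines of text separated by a newline
s <- "abc\ndef"
lines <- strsplit(s, "\n")[[1]]

# Named version: define the function first, then pass it to lapply
rev_string <- function(str) paste(rev(strsplit(str, "")[[1]]), collapse = "")
named <- lapply(lines, rev_string)

# Anonymous version: the same function written inline
anon <- lapply(lines, function(str) paste(rev(strsplit(str, "")[[1]]), collapse = ""))

identical(named, anon)
```

The extra parentheses around (function (str) ...) in the code from the question are optional grouping; they do not change what is passed to lapply.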
Please note that broad "learning" style questions like this are not exactly what this site is for, and this question will likely get removed and/or receive some negative feedback. Since you are new to this site and to R, I'm providing this answer but I would not be surprised if the question is removed. Just trying to help both you and the SO community!

How can I create a function using variables in a dataframe

I'm sure the question is a bit dumb (sorry)... I'm trying to create a function using different variables I have stored in a data frame. The function looks like this:
mlr_turb <- function(Cond_in, Flow_in, pH_in, pH_out, Turb_in, nm250_i, nm400_i, nm250_o, nm400_o){
Coag = (+0.032690 + 0.090289*Cond_in + 0.003229*Flow_in - 0.021980*pH_in - 0.037486*pH_out
+0.016031*Turb_in -0.026006*nm250_i +0.093138*nm400_o - 0.397858*nm250_o - 0.109392*nm400_o)/0.167304
return(Coag)
}
m4_turb <- mlr_turb(dataset)
The problem comes when I try to run my function on a data frame (whose columns have the same names as the arguments). It doesn't detect my variables and shows this message:
Error in mlr_turb(dataset) :
argument "Flow_in" is missing, with no default
But that column does exist in the data frame, as do all the other variables.
I think I have misplaced or am missing something in the function that would let it take the variables from the dataset. I have searched a lot but have not found an answer...
No dumb questions!
I think you're looking for do.call. This function allows you to unpack values into a function as arguments. Here's a really simple example.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15))
# unpack the values into the function using do.call
do.call('myFun', myData)
Output:
[1] 0.3765084 0.6902654 0.9557522 1.1833122 1.3805309
You meet a standard problem when writing R that is related to the question of standard evaluation (SE) vs non standard evaluation (NSE). If you need more elements, you can have a look at this blog post I wrote
I think the most convenient way to write function using variables is to use variable names as arguments of the function.
Let's take @Muon's example again.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
The question is where R should find the values behind the names x, y and z. Inside a function, R first looks in the function's own environment (here x, y and z are defined as parameters), then in the environment where the function was defined (usually the global environment), and then in the attached packages.
myFun expects vectors. If you pass a column name instead, you will get an error. What if you want to pass a column name? You must tell R that the name should be looked up in a data frame. For instance, you can do something like this:
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z"){
result <- (df[,col1] + df[,col2])/df[,col3]
return(result)
}
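As a usage sketch for the column-name approach, the following reuses the small data frame from the earlier answer. The name myFunCols is hypothetical (chosen so it does not clash with myFun), and it uses [[ ]] rather than [, ], which is an assumption on my part: [[ ]] always returns a plain vector, including for tibbles.

```r
# Hypothetical variant of the function above, taking column names as strings
myFunCols <- function(df, col1 = "x", col2 = "y", col3 = "z") {
  # [[ ]] extracts a single column as a vector
  (df[[col1]] + df[[col2]]) / df[[col3]]
}

myData <- data.frame(x = 1:5,
                     y = (1:5) * pi,
                     z = 11:15,
                     a = 6:10)  # extra column, simply ignored

res_default <- myFunCols(myData)                 # uses columns x, y, z
res_custom  <- myFunCols(myData, "a", "x", "z")  # any columns can be chosen
res_default
```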
You can go much further in this direction with the data.table package. If you start writing functions that need to use variables from a data frame, I recommend having a look at this package.
I like Muon's answer, but I couldn't get it to work if there are columns in the data.frame not in the function. Using the with() function is a simple way to make this work as well...
#Code from Muon:
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15),
a=6:10) #adding a var not used in myFun
# unpack the values into the function using do.call
do.call('myFun', myData)
#generates an error for the unused "a" column
#using with() function:
with(myData, myFun(x, y, z))
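Another option, sketched here under the assumption that every argument of the function has a matching column, is to keep do.call but pass only the columns the function actually declares. The argument names come from formals(), which is base R.

```r
myFun <- function(x, y, z) {
  (x + y) / z
}

myData <- data.frame(x = 1:5,
                     y = (1:5) * pi,
                     z = 11:15,
                     a = 6:10)  # column not used by myFun

# Keep only the columns whose names match myFun's arguments
wanted <- names(formals(myFun))
via_docall <- do.call(myFun, myData[wanted])

# Same result as the with() approach
via_with <- with(myData, myFun(x, y, z))
all.equal(via_docall, via_with)
```

This would not help with a function that takes ..., since formals() then gives no useful list of column names.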

Name of a list not the contents

I have a list containing 18 elements called bx2.
I want to use bx2 in a function,
XlsMaker <- function(x) {
library("XLConnect")
a <- length(x)
b <- paste0(x,".xlsx")
for (i in 1:a){
writeWorksheetToFile(b, data = x[[i]], sheet = names(x[i]))
}
}
but when I put bx2 into the function it pulls in all the elements of the list rather than just the name of the list.
Is it possible to re-write the function so that b becomes bx2.xlsx?
The line b <- paste0(x,".xlsx") is wrong. That calls paste0 on the object x itself which is not at all what you want to do. You want to call it on the name of the object.
This in general opens a can of worms because objects can have two different names in two different places. Consider: the object named bx2 in your global environment is now named x within the function's scope. If you only want to call this function from the top level (e.g. for interactive use), you can safely get the name of the object from the parent environment (the environment you called the function from) by replacing that line with:
x_name <- deparse(substitute(x))
b <- paste0(x_name, ".xlsx")
The substitute function returns the unevaluated expression that was supplied as the argument x at the call site (here the symbol bx2), as a special name object. The deparse function converts this name into a character vector of length one.
The reason I said this is only safe to use at the top level is that substitute can return surprising or unintended results if not used carefully. Hadley Wickham goes into detail on this point in his book.
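A short sketch of the kind of surprise meant here. The function names get_name and wrapper are made up for illustration. Called directly, the helper sees the caller's variable name; called through another function, it sees that function's parameter name instead.

```r
# Hypothetical helper: return the expression passed as x, as a string
get_name <- function(x) deparse(substitute(x))

bx2 <- list(1:3, 4:6)

direct <- get_name(bx2)      # called from the top level

wrapper <- function(y) get_name(y)
indirect <- wrapper(bx2)     # called through another function

direct
indirect
```

Here direct is "bx2" but indirect is "y", so a file name built this way would silently become y.xlsx when the function is called from inside a wrapper.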
I think you just want to deparse the parameter name
XlsMaker <- function(x) {
varname <- deparse(substitute(x))
library("XLConnect")
a <- length(x)
b <- paste0(varname ,".xlsx")
for (i in 1:a){
writeWorksheetToFile(b, data = x[[i]], sheet = names(x[i]))
}
}
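# named list of data frames, so that names(x[i]) gives a sheet name
bx2 <- list(first = data.frame(v = 1:3), second = data.frame(v = 4:6))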
XlsMaker(bx2)

R: How to use as.call with vectors as optional parameters?

I'm trying to write a wrapper for a function in order to use lists as input. I cannot change the function itself, therefore I need a workaround outside of it. I use as.call() and it works without optional arguments, but I fail to make it work when I have vectors as optional arguments.
Example:
# function I cannot change
func <- function(..., opt=c(1,2)) {
cl <- match.call(expand.dots = FALSE)
names <- lapply(cl[[2]],as.character)
ev <- parent.frame()
classes <- unlist(lapply(names,function(name){class(get(name,envir=ev))}))
print(c(opt,names, classes))
}
a <- structure(1:3, class="My_Class")
b <- structure(letters[1:3], class="My_Class")
lst <- list(a, b)
names(lst) <- c("a","b")
# Normal result
func(a,b,opt=c(3,4))
# This should give the same but it doesn't
call <- as.call(append(list(func), list(names(lst), opt=c(3,4))))
g <- eval(call, lst)
Instead of a list as the optional argument, I also tried c(), but this doesn't work either. Does anybody have a suggestion or a help page? ?call wasn't too clear about my problem.
(I already asked a previous question to the topic here: R: How to use list elements like arguments in ellipsis? , but left out the detail about the optional parameter and cannot figure it out now.)
This produces the same result for me under both versions
call <- as.call(c(list(quote(func)), lapply(names(lst), as.name), list(opt=c(3,4))))
g <- eval(call, lst)
EDIT: as per Hadley's suggestions in comments.
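To see why the original attempt differs, it can help to inspect the two calls without evaluating them. The sketch below uses a placeholder function name f and a simplified lst; it only builds and prints the call objects.

```r
lst <- list(a = 1:3, b = letters[1:3])

# Attempt from the question: names(lst) is passed as ONE argument,
# a character vector, so the call has a single unnamed argument
bad <- as.call(list(quote(f), names(lst), opt = c(3, 4)))

# Version from the answer: each name becomes its own symbol argument
good <- as.call(c(list(quote(f)),
                  lapply(names(lst), as.name),
                  list(opt = c(3, 4))))

print(bad)
print(good)
```

Because func looks its arguments up by name with get(), it needs the symbols a and b as separate arguments, which is what lapply(names(lst), as.name) provides. Using quote(func) instead of the function object itself also keeps the printed call readable.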

zipping lists in R

As a guideline, I prefer to apply functions to the elements of a list using lapply or *ply (from plyr) rather than explicitly iterating through them. However, this only works well when I have to process one list at a time. When the function takes multiple arguments, I usually write a loop.
I was wondering if it's possible to have a cleaner construct, still functional in nature. One possible approach could be to define a function similar to Python's zip(x, y), which takes the input lists and returns a list whose i-th element is list(x[[i]], y[[i]]), and then apply the function to this list. But my question is whether I am using the cleanest approach or not. I am not worried about performance optimization, but rather clarity/elegance.
Below is the naive example.
A <- as.list(0:9)
B <- as.list(0:9)
f <- function(x, y) x^2+y
OUT <- list()
for (n in 1:10) OUT[[n]] <- f(A[[n]], B[[n]])
OUT
[[1]]
[1] 0
[[2]]
[1] 2
...
And here is the zipped example (which could be extended to arbitrary arguments):
zip <- function(x, y){
stopifnot(length(x)==length(y))
z <- list()
for (i in seq_along(x)){
z[[i]] <- list(x[[i]], y[[i]])
}
z
}
E <- zip(A, B)
lapply(E, function(x) f(x[[1]], x[[2]]))
[[1]]
[1] 0
[[2]]
[1] 2
...
I think you're looking for mapply:
‘mapply’ is a multivariate version of ‘sapply’. ‘mapply’ applies
‘FUN’ to the first elements of each ... argument, the second
elements, the third elements, and so on. Arguments are recycled
if necessary.
For your example, use mapply(f, A, B)
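Applied to the data from the question, this might look as follows. mapply simplifies its result to a vector where it can, while Map (a base R wrapper around mapply that does not simplify) always returns a list, which matches the output of the original loop more closely.

```r
A <- as.list(0:9)
B <- as.list(0:9)
f <- function(x, y) x^2 + y

# Simplified to a numeric vector
out_vec <- mapply(f, A, B)

# Kept as a list, like the loop and the zip() version in the question
out_list <- Map(f, A, B)

out_vec
```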
I came across a similar problem today. After learning how to use mapply, I now know how to solve it.
mapply is so cool!!
Here is an example:
en = c("cattle", "chicken", "pig")
zh = c("牛", "鸡", "猪")
dict <- new.env(hash = TRUE)
Add <- function(key, val) dict[[key]] <- val
mapply(Add, en, zh)
## cattle chicken pig
## "牛" "鸡" "猪"
I think you could do this with what I call an 'implicit loop' (this name does not hit it fully, but whatever), taking into account that you can loop over vectors within *apply:
OUT <- lapply(1:10, function(x) (A[[x]]^2 + B[[x]]))
or
OUT <- lapply(1:10, function(x) f(A[[x]], B[[x]]))
Note that you then could also use vapply (or sapply) for output managing (i.e. if you don't want a list).
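A sketch of the vapply variant mentioned above, using the A, B and f from the question. seq_along(A) is used in place of 1:10 so the index follows the length of the list; that substitution is my own choice, not part of the answer.

```r
A <- as.list(0:9)
B <- as.list(0:9)
f <- function(x, y) x^2 + y

# vapply checks that every result is a single numeric value
out <- vapply(seq_along(A), function(i) f(A[[i]], B[[i]]), numeric(1))
out
```

Unlike sapply, vapply fails with an error if any result does not match the declared template, which makes the output type predictable.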
(by the way, I am not getting what you want with the zip function, so I am sorry, if I missed your point.)

Resources