Pass string variable into R function - r

I want to pass a variable name into a function and can't seem to do it. Simply...
library (reshape)
test <- function(x) {
cast(data, x ~ ., length)
}
test(ageg)
I get this kickback.
Error: Casting formula contains variables not found in molten data: x
I know it's simple, but I can't find the answer. I want it to simply run
cast(data, ageg ~ ., length)

Try this:
test <- function (x) cast(data, as.formula(paste0(x , " ~ .")), length)
What you are trying to do is build a formula on the fly. However, a formula is passed on as a quoted part of the language (IIRC), so your x is not evaluated; cast looks for a variable literally named x in your data.
This version instead evaluates x inside paste0 to create a character string, then converts that string to a formula with as.formula. Note that x must therefore be passed as a string, i.e. test("ageg") rather than test(ageg).
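A minimal sketch of the same string-to-formula idea, using base R's aggregate in place of cast so that it runs without the reshape package. The data frame dat, its columns, and the helper count_by are made up for illustration:

```r
# Hypothetical data: one grouping column and one value column
dat <- data.frame(ageg = c("a", "a", "b"), value = c(1, 2, 3))

# x is a column name given as a string; the formula is assembled from it
count_by <- function(data, x) {
  f <- as.formula(paste0("value ~ ", x))
  aggregate(f, data = data, FUN = length)
}

res <- count_by(dat, "ageg")
res
```

Because the column name arrives as an ordinary string, the same function can be reused for any grouping column without touching the formula by hand.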

Related

Creating a function for GWR maps

I have written a function for GWR maps. The code runs fine outside a function, but when I wrap it in a function I get an error. I was wondering if anyone could help, thank you!
#a = polygon shapefile
#b = dependent variable of shapefile
#c = explanatory variable 1
#d = explanatory variable 2
GWR_map <- function(a,b,c,d){
GWRbandwidth <- gwr.sel(a$b ~ a$c+a$d, a,adapt=T)
gwr.model = gwr(a$b ~ a$c+a$d, data = a, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
gwr.model
}
GWR_map(OA.Census,"Qualification", "Unemployed", "White_British")
The above code produces the following error:
Error in model.frame.default(formula = a$b ~ a$c + a$d, data = a, drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'a$b'
You can't use function parameters with $: a$b looks for a component literally named b rather than using the value stored in b, so it returns NULL. Use the [[x]] notation instead. It should look like this:
GWR_map <- function(a,b,c,d){
GWRbandwidth <- gwr.sel(a[[b]] ~ a[[c]]+a[[d]], a,adapt=T)
gwr.model = gwr(a[[b]] ~ a[[c]]+a[[d]], data = a, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
gwr.model
}
The R help docs (section 6.2 on lists) explain this difference well:
Additionally, one can also use the names of the list components in double square brackets, i.e., Lst[["name"]] is the same as Lst$name. This is especially useful, when the name of the component to be extracted is stored in another variable as in
x <- "name"; Lst[[x]]
It is very important to distinguish Lst[[1]] from Lst[1]. ‘[[...]]’ is the operator used to select a single element, whereas ‘[...]’ is a general subscripting operator. Thus the former is the first object in the list Lst, and if it is a named list the name is not included. The latter is a sublist of the list Lst consisting of the first entry only. If it is a named list, the names are transferred to the sublist.
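The difference described in the quoted passage can be checked directly. This is a small sketch; the list contents and the helper col_mean are invented for illustration:

```r
Lst <- list(name = "Fred", wife = "Mary")
x <- "name"

Lst$x      # NULL: looks for a component literally called "x"
Lst[[x]]   # "Fred": uses the value stored in x

# The same applies to a column name passed into a function
col_mean <- function(df, col) mean(df[[col]])
col_mean(data.frame(Unemployed = c(2, 4, 6)), "Unemployed")   # 4
```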

How can I create a function using variables in a dataframe

I'm sure the question is a bit dumb (sorry)... I'm trying to create a function that uses different variables I have stored in a dataframe. The function looks like this:
mlr_turb <- function(Cond_in, Flow_in, pH_in, pH_out, Turb_in, nm250_i, nm400_i, nm250_o, nm400_o){
Coag = (+0.032690 + 0.090289*Cond_in + 0.003229*Flow_in - 0.021980*pH_in - 0.037486*pH_out
+0.016031*Turb_in -0.026006*nm250_i +0.093138*nm400_o - 0.397858*nm250_o - 0.109392*nm400_o)/0.167304
return(Coag)
}
m4_turb <- mlr_turb(dataset)
The problem appears when I try to run my function on a dataframe (whose columns have the same names as the arguments). It doesn't detect my variables and shows this message:
Error in mlr_turb(dataset) :
argument "Flow_in" is missing, with no default
But the column is actually there, along with all the other variables.
I think I have misplaced or omitted something in the function that would let it take the variables from the dataset. I have searched a lot but have not found an answer...
No dumb questions!
I think you're looking for do.call. This function allows you to unpack values into a function as arguments. Here's a really simple example.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15))
# unpack the values into the function using do.call
do.call('myFun', myData)
Output:
[1] 0.3765084 0.6902654 0.9557522 1.1833122 1.3805309
You have run into a standard problem in R, related to standard evaluation (SE) versus non-standard evaluation (NSE). If you need more detail, you can have a look at this blog post I wrote.
I think the most convenient way to write a function that uses variables is to take the variable names as arguments of the function.
Let's take @Muon's example again.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
The question is where R should find the values behind the names x, y and z. In a function, R first looks within the function environment (here x, y and z are defined as parameters), then in the global environment, and then in the attached packages.
In myFun, R expects vectors. If you give a column name, you will get an error. What if you want to give a column name? You must tell R that the name you gave refers to a value in the scope of a dataframe. For instance, you can do something like this:
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z"){
result <- (df[,col1] + df[,col2])/df[,col3]
return(result)
}
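For illustration, this is how the column-name version might be called, reusing Muon's myData. The sketch uses [[ ]] rather than df[, col]; both return a vector for a plain data.frame, and [[ ]] also does so for tibbles. The names res_default and res_swapped are only for this example:

```r
myData <- data.frame(x = 1:5, y = (1:5) * pi, z = 11:15)

# Column names arrive as strings and are looked up inside df
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z") {
  (df[[col1]] + df[[col2]]) / df[[col3]]
}

res_default <- myFun(myData)                          # (x + y) / z
res_swapped <- myFun(myData, col1 = "z", col3 = "x")  # (z + y) / x
```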
You can go much further in this direction with the data.table package. If you start writing functions that need to use variables from a dataframe, I recommend having a look at this package.
I like Muon's answer, but I couldn't get it to work when the data.frame has columns that are not arguments of the function. Using the with() function is a simple way to make this work as well...
#Code from Muon:
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15),
a=6:10) #adding a var not used in myFun
# unpack the values into the function using do.call
do.call('myFun', myData)
#generates an error for the unused "a" column
#using with() function:
with(myData, myFun(x, y, z))
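Another possible way to keep do.call working with extra columns is to pass only the columns that match the function's argument names. This is a sketch under the assumption that every argument of the function has a same-named column; wanted and res are names chosen for the example:

```r
myFun <- function(x, y, z) (x + y) / z
myData <- data.frame(x = 1:5, y = (1:5) * pi, z = 11:15, a = 6:10)

# Keep only the columns whose names match myFun's arguments
wanted <- names(formals(myFun))   # "x" "y" "z"
res <- do.call(myFun, myData[wanted])
res
```

This gives the same values as with(myData, myFun(x, y, z)) and avoids typing out a long argument list.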

R Function - assign LMER to dynamic variable name

To create a more compact script, I am trying to create my first function.
The general function is:
f.mean <- function(var, fig, datafile){
require(lme4)
change <- as.symbol(paste("change", var, sep=""))
base <- as.symbol(paste("baseline", var, sep = ""))
x <- substitute(lmer(change ~ base + (1|ID), data=datafile))
out<-eval(x)
name <- paste(fig,".", var, sep="")
as.symbol(name) <- out
}
}
The purpose of this function is to input var, fig and datafile and to output a new variable named fig.var containing out (eval of LMER).
Apparently it is difficult to 'change' the variable name on the left side of the <-.
What we have tried so far:
- assign(name, out)
- as.symbol(name) <<- out
- makeActiveBinding("y",function() x, .GlobalEnv)
- several rename options to rename out to the specified var name
Can someone help me to assign the out value to this 'run' specific variable name? All other suggestions are welcome as well.
As @Roland comments, in R (as in any language) one should avoid indirect environment manipulators such as assign, attach, list2env, <<-, and others, which are difficult to debug and break the usual flow of programming with explicitly defined objects and methods.
Additionally, avoid flooding your global environment with potentially hundreds or thousands of similarly structured objects that may require environment mining with ls, mget, or eapply. Simply use one large container, such as a list of named elements, which is more manageable and makes code more maintainable.
Specifically, be direct in assigning objects: pass string literals (var, fig) or objects (datafile) as function parameters and have the function return values. For many inputs, build lists with lapply or Map (a wrapper to mapply) to retain the needed objects. Consider the adjustment below, which builds a formula from string literals and passes it into your model, returning the result at the end.
f.mean <- function(var, fig, datafile){
require(lme4)
myformula <- as.formula(paste0("change", var, " ~ baseline", var, " + (1|ID)"))
x <- lmer(myformula, data=datafile)
return(x)
}
var_list <- # ... list/vector of var character literals
fig_list <- # ... list/vector of fig character literals
# BUILD AND NAME LIST OF LMER OUTPUTS
# MoreArgs must be a list of the arguments held constant across calls
lmer_list <- setNames(Map(f.mean, var_list, fig_list, MoreArgs=list(datafile=df)),
paste0(fig_list, ".", var_list))
# IDENTIFY NEEDED var*fig* BY ELEMENT NAME OF LARGER CONTAINER
lmer_list$fig1.var1
lmer_list$fig2.var2
lmer_list$fig3.var3
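The Map and setNames pattern above can be tried end to end with a small sketch. To stay in base R it swaps lmer for lm and drops the (1|ID) random effect; the data frame df, its columns, and the helper fit_one are invented for illustration and only mimic the change<var> / baseline<var> naming from the question:

```r
# Hypothetical data following the "change<var>" / "baseline<var>" naming
set.seed(1)
df <- data.frame(
  changeA = rnorm(20), baselineA = rnorm(20),
  changeB = rnorm(20), baselineB = rnorm(20)
)

# Fit one model per variable; lm() stands in for lmer() here
fit_one <- function(var, fig, datafile) {
  f <- as.formula(paste0("change", var, " ~ baseline", var))
  lm(f, data = datafile)
}

var_list <- c("A", "B")
fig_list <- c("fig1", "fig2")

# One named list holds every fit instead of separate global variables
fits <- setNames(
  Map(fit_one, var_list, fig_list, MoreArgs = list(datafile = df)),
  paste0(fig_list, ".", var_list)
)

names(fits)
```

Individual fits are then retrieved by name, for example fits[["fig1.A"]].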

converting a string into a data frame name

In functions such as plotmeans there is an argument that specifies the data frame to use, data=. I would like to construct the name of the data frame to be used using paste0 or something similar, df <- paste0("results", i), where i is a number to get (say) "results04". If I then use data=df, I get an error saying that data= expects a variable, not a string. Is there any way to convert the string into a form that data= will accept? data=results04 without the quotes, of course, works.
Thanks for any suggestions or pointers.
The answer would have been obvious to one with more R experience, but let me put it here for others: use the get() function, so for instance
df <- paste0("results", i)
plotmeans(a ~ b, data=get(df))
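A self-contained sketch of the get() approach, with lm standing in for plotmeans so that it runs in base R. The data frame results04 and its contents are made up; sprintf is used here because paste0("results", 4) would give "results4" without the leading zero:

```r
# Hypothetical data frame whose name will be built as a string
results04 <- data.frame(a = c(1, 2, 3, 4), b = c("u", "u", "v", "v"))

i <- 4
df_name <- sprintf("results%02d", i)   # "results04"

# get() looks up the object that has this name
fit <- lm(a ~ b, data = get(df_name))

# Alternative: keep the data frames in a named list and index by name
results <- list(results04 = results04)
fit2 <- lm(a ~ b, data = results[[df_name]])
```

Keeping the data frames in a named list avoids the name lookup altogether and makes it easy to loop over them.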

R: substitute pattern in formula for a variable name

I have a general function that calls an expression containing a formula, and I would like to pass this function to various environments that each store specific variables, modifying the parts of the formula designated by a specific pattern.
Here is an example:
# Let's assume I have an environment storing a variable
env <- new.env()
env$..M.. <- "Sepal.Length"
# And a function that calls an expression
func <- function() summary(lm(..M.. ~ Species, data = iris))$r.squared
# And let's assume I am trying to evaluate it within the environment
environment(func) <- env
# And I would like to have some method that makes it evaluate as:
summary(lm(Sepal.Length ~ Species, data = iris))$r.squared
So far I have come up with a very dirty solution based on deparsing the function to a string, substituting with gsub, and then parsing it back. It goes like this:
tfunc <- paste(deparse(func), collapse = "")
tfunc <- gsub("\\.\\.M\\.\\.", env$..M.., tfunc, perl = TRUE)
tfunc <- eval(parse(text = tfunc))
So yes, it works, but I would like to find a cleaner method, that would somewhat magically substitute this ..M.. pattern into Sepal.Length without a need for all this parsing and deparsing.
So I would really appreciate some help and hints for that problem.
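One possible cleaner route, offered as a sketch rather than a definitive answer, is to rewrite the function body with substitute, which replaces symbols in a language object without going through strings. It assumes the placeholder always appears as the bare symbol ..M.. and that env stores the replacement as a column name string:

```r
env <- new.env()
env$..M.. <- "Sepal.Length"
func <- function() summary(lm(..M.. ~ Species, data = iris))$r.squared

# Replace the placeholder symbol ..M.. in the body with the symbol named in env.
# do.call is needed because substitute() does not evaluate its first argument.
body(func) <- do.call(
  substitute,
  list(body(func), list(..M.. = as.name(env$..M..)))
)

body(func)
r2 <- func()
```

After the replacement the body refers to Sepal.Length directly, so no parsing or deparsing is involved.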
