Converting from a Formula object to a list - r

In R, I would like to iterate over a formula object. R automatically converts a formula to a parse tree, so I see no reason why I shouldn't be able to iterate.
For example, f <- ~x + y has elements f[[1]] = ~ and f[[2]] = x + y. However, for(v in f) print(toString(v)) does not output
[1] "~"
[1] "+, x, y"
as I would expect it to. Instead, it gives the error invalid for() loop sequence.
If I need to do it manually, I could always use for(i in 1:length(f)) print(toString(f[[i]])) which does produce the correct output. However, I would like to know why the first method does not work.
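A likely explanation, offered as a sketch rather than a definitive answer: for() accepts a vector (including a list or an expression), a pairlist, or NULL, and a formula is a language object, which is none of these. Coercing the formula with as.list() first gives for() something it can walk. The variable name parts below is only illustrative.

```r
f <- ~ x + y

# A formula is stored as a call ("language" object), not as a vector or list,
# which is why for() rejects it directly.
print(typeof(f))

# as.list() turns the call into an ordinary list of its components.
parts <- as.list(f)

for (v in parts) print(toString(v))
```

Indexing with f[[i]] works on the original object because [[ is defined for calls, while for() does not perform that coercion itself.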

Related

Evaluation of one input in a function with multiple inputs in R

I am trying to do the following but can't figure it out. Could someone please help me?
f <- expression(x^3+4*y)
df <- D(f,'x')
x <-0
df0 <- eval(df)
df0 should be a function of y!
If you take the derivative of f with respect to x you get 3 * x^2. The 4*y is a constant as far as x is concerned. So you don't have a function of y as such, your df is a constant as far as y is concerned (although it is a function of x).
Assigning to x doesn't change df; it remains the expression 3 * x^2 and is still a function of x if you wanted to treat it as such.
If you want to substitute a variable in an expression, then substitute() is what you are looking for.
> substitute(3 * x^2, list(x = 0))
3 * 0^2
It is a blind substitution with no simplification of the expression. We might have expected zero here, but we get 3 * 0^2, and that is what you get.
Unfortunately, substituting in an expression you have in a variable is a bit cumbersome, since substitute() thinks its first argument is the verbatim expression, so you get
> substitute(df, list(x = 0))
df
The expression is df, there is no x in that so nothing is substituted, and you just get df back.
You can get around that with two substitutions and an eval:
> df0 <- eval(
+ substitute(substitute(expr, list(x = 0)),
+ list(expr = df)))
> df0
3 * 0^2
> eval(df0)
[1] 0
The outermost substitute() puts the value of df into expr, so you get the right expression there, and the inner substitute() changes the value of x.
There are nicer functions for manipulating expressions in the Tidyverse, but I don't remember them off the top of my head.
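For what it's worth, base R offers a shorter route than the nested substitute() calls. The sketch below assumes df holds the call returned by D(); it uses do.call() to hand the stored call to substitute(), and also shows that eval() can take the value of x directly if only the number is needed.

```r
f  <- expression(x^3 + 4 * y)
df <- D(f, "x")                      # the call 3 * x^2

# do.call() places the call stored in df into substitute()'s first argument,
# so the x inside it is replaced rather than the symbol df.
df0 <- do.call("substitute", list(df, list(x = 0)))
print(df0)
print(eval(df0))

# If only the value is needed, eval() accepts a list of variable values.
print(eval(df, list(x = 0)))
```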

Creating a function for GWR maps

I have created a function for GWR maps. The code works well when I run it outside the function, but when I wrap it in a function I get an error. I was wondering if anyone could help, thank you!
#a = polygon shapefile
#b = Dependent variable of shapefile
#c = Explanatory variable 1
#d = Explanatory variable 2
GWR_map <- function(a,b,c,d){
GWRbandwidth <- gwr.sel(a$b ~ a$c+a$d, a,adapt=T)
gwr.model = gwr(a$b ~ a$c+a$d, data = a, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
gwr.model
}
GWR_map(OA.Census,"Qualification", "Unemployed", "White_British")
The above code produces the following error:
Error in model.frame.default(formula = a$b ~ a$c + a$d, data = a, drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'a$b'
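You can't use function parameters with $, because $ does not evaluate the name on its right: a$b looks for a column literally called b, finds none, and returns NULL. Use the [[x]] notation instead, which does evaluate its argument. Your function should look like this: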
GWR_map <- function(a,b,c,d){
GWRbandwidth <- gwr.sel(a[[b]] ~ a[[c]]+a[[d]], a,adapt=T)
gwr.model = gwr(a[[b]] ~ a[[c]]+a[[d]], data = a, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
gwr.model
}
The R help docs (section 6.2 on lists) explain this difference well:
Additionally, one can also use the names of the list components in double square brackets, i.e., Lst[["name"]] is the same as Lst$name. This is especially useful, when the name of the component to be extracted is stored in another variable as in
x <- "name"; Lst[[x]]
It is very important to distinguish Lst[[1]] from Lst[1]. ‘[[...]]’ is the operator used to select a single element, whereas ‘[...]’ is a general subscripting operator. Thus the former is the first object in the list Lst, and if it is a named list the name is not included. The latter is a sublist of the list Lst consisting of the first entry only. If it is a named list, the names are transferred to the sublist.
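The difference can be seen on a small stand-in data frame (the values below are made up for illustration; only the column names come from the question). The sketch also shows reformulate() from the stats package, which builds a formula from column names held in strings. Whether gwr.sel() and gwr() behave better with such a formula is an assumption that has not been checked here.

```r
# Hypothetical toy data standing in for the shapefile's attribute table.
a <- data.frame(Qualification = c(10, 20, 30),
                Unemployed    = c(1, 2, 3),
                White_British = c(5, 6, 7))
b <- "Qualification"

# $ takes its right-hand side literally: there is no column named "b".
print(is.null(a$b))

# [[ evaluates b first, so it finds the Qualification column.
print(a[[b]])

# Building the formula from strings instead of indexing inside it.
fml <- reformulate(c("Unemployed", "White_British"), response = b)
print(fml)
```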

How can I create a function using variables in a dataframe

I'm sure the question is a bit dumb (sorry)... I'm trying to create a function using different variables I have stored in a dataframe. The function looks like this:
mlr_turb <- function(Cond_in, Flow_in, pH_in, pH_out, Turb_in, nm250_i, nm400_i, nm250_o, nm400_o){
Coag = (+0.032690 + 0.090289*Cond_in + 0.003229*Flow_in - 0.021980*pH_in - 0.037486*pH_out
+0.016031*Turb_in -0.026006*nm250_i +0.093138*nm400_o - 0.397858*nm250_o - 0.109392*nm400_o)/0.167304
return(Coag)
}
m4_turb <- mlr_turb(dataset)
The problem arises when I try to run my function on a dataframe (whose columns have the same names as the arguments). It doesn't detect my variables and shows this message:
Error in mlr_turb(dataset) :
argument "Flow_in" is missing, with no default
But the column is actually there, along with all the other variables.
I think I have misplaced or left out something in the function that would let it take the variables from the dataset. I have searched a lot about this but have not found an answer...
No dumb questions!
I think you're looking for do.call. This function allows you to unpack values into a function as arguments. Here's a really simple example.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15))
# unpack the values into the function using do.call
do.call('myFun', myData)
Output:
[1] 0.3765084 0.6902654 0.9557522 1.1833122 1.3805309
You have run into a standard problem in R, related to standard evaluation (SE) versus non-standard evaluation (NSE). If you need more detail, you can have a look at this blog post I wrote.
I think the most convenient way to write a function that uses variables is to pass the variable names as arguments of the function.
Let's take @Muon's example again.
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
The question is where R should find the values behind the names x, y and z. In a function, R first looks in the function's own environment (here x, y and z are defined as parameters), then in the environment where the function was defined (typically the global environment), and then in the attached packages.
In myFun, R expects vectors. If you pass a column name instead, you will get an error. What if you want to pass a column name? You must tell R that the name should be looked up within the dataframe. You can, for instance, do something like this:
myFun <- function(df, col1 = "x", col2 = "y", col3 = "z"){
result <- (df[,col1] + df[,col2])/df[,col3]
return(result)
}
You can go much further in this direction with the data.table package. If you start writing functions that need to use variables from a dataframe, I recommend having a look at this package.
I like Muon's answer, but I couldn't get it to work when the data.frame has columns that are not arguments of the function. Using the with() function is a simple way to make this work as well...
#Code from Muon:
# a simple function that takes x, y and z as arguments
myFun <- function(x, y, z){
result <- (x + y)/z
return(result)
}
# a simple data frame with columns x, y and z
myData <- data.frame(x=1:5,
y=(1:5)*pi,
z=(11:15),
a=6:10) #adding a var not used in myFun
# unpack the values into the function using do.call
do.call('myFun', myData)
#generates an error for the unused "a" column
#using with() function:
with(myData, myFun(x, y, z))
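If do.call() is still preferred, one possible workaround (a sketch, not part of the answers above) is to keep only the columns whose names match the function's arguments before unpacking. The name needed is just illustrative.

```r
myFun <- function(x, y, z) {
  (x + y) / z
}

myData <- data.frame(x = 1:5,
                     y = (1:5) * pi,
                     z = 11:15,
                     a = 6:10)   # extra column not used by myFun

# formals() lists the function's arguments; subsetting the data frame by
# those names drops the extra column before do.call() unpacks it.
needed <- names(formals(myFun))
res <- do.call(myFun, myData[needed])
print(res)
```

This assumes every argument of the function has a column of the same name; a missing column would make the subsetting step fail.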

Creating call objects to compare to formula elements

I would like to create an object from a string to compare with an element of a formula.
For example, in the following:
# note that f does not exist
myForm <- y ~ f(x)
theF <- myForm[[3]]
fString <- "f(x)"
How can I compare fString to theF?
If I know the string is "f(x)" I can manually enter the following
cheating <- as.call(quote(f(x)))
identical(theF, cheating)
which works (it gives TRUE), but I want to be able to take the string "f(x)" as an argument (e.g. maybe it's "g(x)").
The real point of this question is for me to understand better how to work with call objects and quote function.
parse(text = s) converts text, s, to an expression and e[[1]] extracts the call object from a length 1 expression e. theF is a call object so putting these together we have:
identical(theF, parse(text = fString)[[1]])
## TRUE
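The same comparison can be wrapped in a small helper so the string arrives as an argument. This is a sketch: the helper name matches_call is made up, and str2lang() assumes R 3.6.0 or later (parse(text = s)[[1]] does the same job on older versions).

```r
# Hypothetical helper: TRUE when the call object equals the parsed string.
matches_call <- function(call_obj, s) {
  identical(call_obj, str2lang(s))
}

myForm <- y ~ f(x)      # f does not need to exist
theF   <- myForm[[3]]

print(matches_call(theF, "f(x)"))
print(matches_call(theF, "g(x)"))

# Going the other way, deparse() turns the call back into a string.
print(deparse(theF))
```

Comparing parsed calls rather than strings means differences in whitespace, such as "f( x )", do not matter.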
Note that formulas do nothing on their own in R.
A formula is just an unevaluated call to ~ (with class "formula" and an environment attached), which prints like
y ~ f(x)
It is then up to the functions that accept formulas to interpret it.
Check coplot for an example implementation.

R - extract variable names from unevaluated expression

Assume the following model is written in a text file by someone not familiar with R, as follows:
goal1 = dec1_g1 + dec2_g1 + dec3_g1
goal2 = min(dec1_g2, dec2_g2, dec3_g2)
goal3 = dec1_g3 - dec2_g3 - dec3_g3
...
I need to be able to parse the text file with the model and evaluate any one line without having to assign values to the dec variables from the remaining lines of the model. While the parse function creates an unevaluated expression exp that can be queried and evaluated in parts as eval(exp[1]), eval(exp[2]), I haven't found a way to do something like eval(exp['goal1']).
Question: is there a way to parse the model without evaluating it and create a list with elements named by the left-hand sides of the model expressions, e.g.
model = list(
"goal1" = expression(goal1 = dec1_g1 + dec2_g1 + dec3_g1),
"goal2" = expression(goal2 = min(dec1_g2, dec2_g2, dec3_g2)),
"goal3" = expression(goal3 = dec1_g3 - dec2_g3 - dec3_g3),
...
)
Motivation: I want to be able to load the model from within R code, parse it, and evaluate it expression by expression, assigning the correct values to the dec variables depending on the goal that's being evaluated.
The "left hand side" of expression(x=y+z) is actually the name of the argument you're passing to expression(), whose value is the (unevaluated) call y + z. So it's not a part of the expression, but is returned as the name of the list element (an expression is a list of calls, usually unnamed):
> as.list(expression(x=y+z))
$x
y + z
> names(expression(x=y+z))
[1] "x"
If, OTOH, you use the formula constructor ~, then you get the LHS as a part of the expression:
> as.list(expression(x~y+z))
[[1]]
x ~ y + z
And you can get to it by selecting the second element of the call:
> expression(x~y+z)[[1]]
x ~ y + z
> expression(x~y+z)[[1]][[1]]
`~`
> expression(x~y+z)[[1]][[2]]
x
Note: in the last line, x is a symbol.
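The same indexing applies to lines written with =, since parse() turns each into a call whose second element is the left-hand side and whose third is the right-hand side. The sketch below builds the named list the question asks for; the model text is inlined as a character vector (an assumption, in practice it might come from readLines()), and the values passed to eval() are made up.

```r
model_text <- c(
  "goal1 = dec1_g1 + dec2_g1 + dec3_g1",
  "goal2 = min(dec1_g2, dec2_g2, dec3_g2)",
  "goal3 = dec1_g3 - dec2_g3 - dec3_g3"
)

# Parse without evaluating; each element is a call to `=`.
exprs <- as.list(parse(text = model_text))

# Element 2 of each call is the left-hand side symbol, element 3 the right-hand side.
lhs   <- vapply(exprs, function(e) as.character(e[[2]]), character(1))
model <- setNames(lapply(exprs, function(e) e[[3]]), lhs)

# Variable names needed by one goal, and evaluation of that goal alone.
print(all.vars(model$goal1))
print(eval(model$goal2, list(dec1_g2 = 3, dec2_g2 = 1, dec3_g2 = 2)))
```

Only the variables of the goal being evaluated have to be supplied, so the other lines of the model never need values.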

Resources