Creating call objects to compare to formula elements - r

I would like to create an object from a string to compare with an element of a formula.
For example, in the following:
# note that f does not exist
myForm <- y ~ f(x)
theF <- myForm[[3]]
fString <- "f(x)"
How can I compare fString to theF?
If I know the string is "f(x)" I can manually enter the following
cheating <- as.call(quote(f(x)))
identical(theF, cheating)
which works (it gives TRUE), but I want to be able to take the string "f(x)" as an argument (e.g. maybe it's "g(x)").
The real point of this question is for me to understand better how to work with call objects and the quote function.

parse(text = s) converts text, s, to an expression and e[[1]] extracts the call object from a length 1 expression e. theF is a call object so putting these together we have:
identical(theF, parse(text = fString)[[1]])
## TRUE
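To take the string as an argument, the comparison can be wrapped in a small function. This is only a sketch: matches_rhs is a made-up name, and str2lang (available in R 3.6.0 and later) is used as a shorthand for parse(text = s)[[1]].

```r
# Hypothetical helper: does the right-hand side of a formula match the code in string s?
matches_rhs <- function(form, s) {
  # form[[3]] is the right-hand side of a two-sided formula
  identical(form[[3]], str2lang(s))
}

myForm <- y ~ f(x)              # f does not need to exist
matches_rhs(myForm, "f(x)")     # TRUE
matches_rhs(myForm, "g(x)")     # FALSE
matches_rhs(myForm, "f( x )")   # TRUE, whitespace is lost when the string is parsed
```

Because both sides are language objects, the comparison is on structure, not on the spelling of the string.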

Note that formulas do nothing on their own in R.
All a formula does is store its expression unevaluated, as a language object such as
y ~ f(x)
It is then up to the functions that accept formulas to interpret it.
See coplot for an example implementation.
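A formula can be taken apart like any other language object, which may help when comparing its pieces. A small sketch, reusing the formula from the question:

```r
myForm <- y ~ f(x)   # f does not need to exist, nothing is evaluated

class(myForm)        # "formula"
typeof(myForm)       # "language"
myForm[[1]]          # `~`
myForm[[2]]          # y
myForm[[3]]          # f(x)
all.vars(myForm)     # "y" "x"
```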

Related

What's the difference between substitute and quote in R

In the official docs, it says:
substitute returns the parse tree for the (unevaluated) expression
expr, substituting any variables bound in env.
quote simply returns its argument. The argument is not evaluated and
can be any R expression.
But when I try:
> x <- 1
> substitute(x)
x
> quote(x)
x
It looks like both quote and substitute return the expression that's passed as an argument to them.
So my question is, what's the difference between substitute and quote, and what does it mean to "substituting any variables bound in env"?
Here's an example that may help you to easily see the difference between quote() and substitute(), in one of the settings (processing function arguments) where substitute() is most commonly used:
f <- function(argX) {
list(quote(argX),
substitute(argX),
argX)
}
suppliedArgX <- 100
f(argX = suppliedArgX)
# [[1]]
# argX
#
# [[2]]
# suppliedArgX
#
# [[3]]
# [1] 100
R has lazy evaluation, so the identity of a variable name token is a little less clear than in other languages. This is used in libraries like dplyr where you can write, for instance:
summarise(mtcars, total_cyl = sum(cyl))
We can ask what each of these tokens means: summarise and sum are defined functions, mtcars is a defined data frame, total_cyl is a keyword argument for the function summarise. But what is cyl?
> cyl
Error: object 'cyl' not found
It isn't anything! Well, not yet. R doesn't evaluate it right away, but treats it as an expression to be evaluated later in an environment other than the global environment your command line is working in, specifically one where the columns of mtcars are defined. Somewhere in the guts of dplyr, something like this is happening:
> substitute(cyl, mtcars)
[1] 6 6 4 6 8 ...
Suddenly cyl means something. That's what substitute is for.
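To make the idea concrete, here is a rough sketch of a function that captures its argument and evaluates it with the columns of a data frame in scope. This is not how dplyr is actually implemented; my_summarise is a made-up name used only for illustration.

```r
# Hypothetical mini version of summarise(): evaluate expr with the columns of data in scope
my_summarise <- function(data, expr) {
  captured <- substitute(expr)           # the unevaluated code, e.g. sum(cyl)
  eval(captured, data, parent.frame())   # look up names in data first, then in the caller
}

my_summarise(mtcars, sum(cyl))   # same value as sum(mtcars$cyl)
```

Here substitute() only captures the code; eval() is what gives cyl a meaning, by looking it up in mtcars.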
So what is quote for? Well sometimes you want your lazily-evaluated expression to be represented somewhere else before it's evaluated, i.e. you want to display the actual code you're writing without any (or only some) values substituted. The docs you quoted explain this is common for "informative labels for data sets and plots".
So, for example, you could create a quoted expression, and then both print the unevaluated expression in your chart to show how you calculated and actually calculate with the expression.
expr <- quote(x + y)
print(expr) # x + y
eval(expr, list(x = 1, y = 2)) # 3
Note that substitute can do this expression trick too, while giving you the option to substitute values into only part of the expression. So its features are a superset of quote's.
expr <- substitute(x + y, list(x = 1))
print(expr) # 1 + y
eval(expr, list(y = 2)) # 3
Maybe this section of the documentation will help somewhat:
Substitution takes place by examining each component of the parse tree
as follows: If it is not a bound symbol in env, it is unchanged. If it
is a promise object, i.e., a formal argument to a function or
explicitly created using delayedAssign(), the expression slot of the
promise replaces the symbol. If it is an ordinary variable, its value
is substituted, unless env is .GlobalEnv in which case the symbol is
left unchanged.
Note the final bit, and consider this example:
e <- new.env()
assign(x = "a",value = 1,envir = e)
> substitute(a,env = e)
[1] 1
Compare that with:
> quote(a)
a
So there are two basic situations when the substitution will occur: when we're using it on an argument of a function, and when env is some environment other than .GlobalEnv. So that's why your particular example was confusing.
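The three cases from the documentation can be put side by side. This is a sketch with made-up names (show_arg, e); the global environment is passed explicitly so the result does not depend on where the code is run.

```r
# 1. Promise: a function argument is replaced by the code the caller supplied
show_arg <- function(arg) substitute(arg)
show_arg(u + v)                   # u + v

# 2. Ordinary variable in a non-global environment: replaced by its value
e <- new.env()
assign("a", 1, envir = e)
substitute(a + b, e)              # 1 + b  (b is not bound in e, so it is left alone)

# 3. Global environment: nothing is substituted, same result as quote(a + b)
substitute(a + b, globalenv())    # a + b
```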
For another comparison with quote, consider modifying the myplot function in the examples section to be:
myplot <- function(x, y)
plot(x, y, xlab = deparse(quote(x)),
ylab = deparse(quote(y)))
and you'll see that quote really doesn't do any substitution.
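The same comparison without any plotting, as a quick sketch (the two function names are invented for the example):

```r
label_with_quote <- function(x) deparse(quote(x))            # always the literal name x
label_with_substitute <- function(x) deparse(substitute(x))  # the code the caller wrote

label_with_quote(sin(1:10))        # "x"
label_with_substitute(sin(1:10))   # "sin(1:10)"
```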
Regarding your question why GlobalEnv is treated as an exception for substitute, it is just a heritage of S. From The R language definition (https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Substitutions):
The special exception for substituting at the top level is admittedly peculiar. It has been inherited from S and the rationale is most likely that there is no control over which variables might be bound at that level so that it would be better to just make substitute act as quote.

Passing character as variable name in R

I am trying to solve an implicit equation in R using multiroot function from package rootSolve.
I am reading the implicit equation from a text file using parse. Also, the variable to be solved for is read from a text file as a character.
For using multiroot,
multiroot(function, initial_guess, ...)
we have to generate a function from the read equation. I did this by
fun <- function(op) {fun <- eval(expr.im)}
op = as.name(opim.names)
where expr.im is the read implicit equation as an expression from the text file, and opim.names is the variable to be solved for, as character.
But the problem arises when I pass the variable op to be solved as a symbol to the function. It gives an error saying that the object
"variable to be solved for" not found.
I think that the variable symbol is not being passed correctly in the function.
Please tell me how to do it correctly.
Since a lot of stuff is going on in my code, I cannot post the whole thing here.
Let me just state a small example for it.
var.name = "x1" # This is what I read from the text file #
var.sym = as.name(var.name)
func <- function(var.sym){
func = x1^2 # the expression x1^2 is also read from a text file #
} # I am trying to solve the implicit equation x1^2 = 0 #
initial_guess = 1
root = multiroot(f=func, start = initial_guess)
As requested by nicola here's what I want -
I have a text file giving me the name of the variable and its initial guess.
I read the variable name (say "x") and the initial guess value (say 1) into variables var (character) and guess(numeric).
I also have another text file containing the following equation -
x^3-1
I read this as an expression in the variable expr.
I want to find the solution to the implicit equation expr.
(The text files can have different names of variables and correspondingly an implicit expression in another file)
As you know, for using the multiroot function, we need to have a function.
The problem is I am not able to pass the variable name stored in var to the function.
Any further clarification will be given if asked.
You can build your function in the following way.
#input: function expression, variable names and initial guess
expr<-"x^3-1"
var.name<-"x"
initial.guess<-2
#we build an "empty" function
func<-function() {}
#now we set the formal arguments and the body of the function
formals(func)<-eval(parse(text=paste0("alist(",paste(var.name,collapse="=,"),"=)")))
body(func)<-parse(text=expr)
#we can see that func is correctly defined
func
#function (x)
#x^3 - 1
#now we can call multiroot
multiroot(func,initial.guess)
#$root
#[1] 1
#$f.root
#[1] 3.733019e-08
#$iter
#[1] 6
#$estim.precis
#[1] 3.733019e-08
You need a little more care if you are dealing with functions of more than one variable. See ?multiroot to check how to pass the arguments. Basically, you have to pass just one argument, which is a vector of all the function's arguments. It shouldn't be difficult if you take some time to see how I built func. If you are dealing exclusively with functions of one variable, you should use base R's uniroot function.
I could not fully understand the description, but to answer the title, you can try this:
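For several variables, one possible approach is to skip building formals altogether and evaluate the parsed equations against a named list made from the single vector argument. The sketch below makes assumptions: make_system is a made-up helper, the two equations are invented for the example, and str2lang requires R 3.6.0 or later. The multiroot call is left commented out because it needs the rootSolve package.

```r
# Hypothetical helper: turn equation strings and variable names into f(v),
# where v is one numeric vector holding all the variables
make_system <- function(expr_texts, var_names) {
  exprs <- lapply(expr_texts, str2lang)   # parse each equation once
  function(v) {
    vals <- as.list(setNames(v, var_names))   # e.g. list(x = 2, y = 1)
    # evaluate every equation with the variables bound to the values in v
    vapply(exprs, function(e) eval(e, vals, baseenv()), numeric(1))
  }
}

f <- make_system(c("x + y - 3", "x - y - 1"), c("x", "y"))
f(c(2, 1))   # 0 0, so x = 2, y = 1 solves both equations

# With rootSolve installed, f could then be passed on:
# rootSolve::multiroot(f, start = c(1, 1))
```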
a = "random_string"
b = "a"
eval(parse(text = b))
[1] "random_string"
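When the string is just the name of an existing variable, get() does the same lookup without parsing any code. A minimal sketch:

```r
a <- "random_string"
b <- "a"

get(b)           # "random_string", the value of the variable whose name is stored in b
as.name(b)       # a, the symbol itself, not yet evaluated
eval(as.name(b)) # "random_string" again
```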

Using a single object to pass multiple arguments to a function?

Let's say I have a function that can't be altered, like:
add.these <- function(x,y,z) {
x + y + z
}
And I want to pass all three arguments as a single object. How do I pass this single object through to the function so it evaluates them as separate inputs?
The ideal result would be something like args <- list(x,y,z), and add.these(args) returns the result.
It's a simple question that's been bothering me but I've stupidly been unable to figure it out. The actual use case is that the function has variable numbers of arguments it requires depending on the desired outputs, and I want to pass these through as a list or something.
Are you looking for do.call?
> args=list(1,2,3)
> do.call(add.these,args)
[1] 6
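do.call also matches list names to argument names, which is handy when the set of arguments varies. A sketch with an invented, order-sensitive function (weighted is not from the question):

```r
weighted <- function(x, y, z) x - y * z   # hypothetical function where argument order matters

do.call(weighted, list(10, 2, 3))               # 4, matched by position
do.call(weighted, list(z = 3, x = 10, y = 2))   # 4, matched by name
```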

Function argument as name of variable/array in R

I'd like to be able to create a vector with R but determine its name within the function call. In SAS I would use the macro language to perform a loop but with R, I can't find how to refer to a variable by name e.g. (Obviously this does not work but it describes what I'd like to do)
fun <- function(X, vectorName) {
paste(vectorName) <- 1:X
}
I'd like to be able to call fun(5, v) and get a vector v = c(1,2,3,4,5) out the end.
Although this is possible, it's not something you should do. A function should only have a return value, which you then can assign, e.g.:
v <- seq_len(5)
Or if you have to pass a variable name programmatically:
myname <- "w"
assign(myname, seq_len(5))
(Though I can't think of a reason why you'd need that.)
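If the name really is only known at run time, a named list usually does the job without creating variables on the fly. A sketch, where the list name vectors is an assumption made for the example:

```r
fun <- function(X) seq_len(X)   # return the vector, let the caller decide where it goes

v <- fun(5)                     # the usual way: assign the return value

vectors <- list()               # container for results whose names are computed
vectorName <- "w"
vectors[[vectorName]] <- fun(5)
vectors$w                       # 1 2 3 4 5
```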

Convert character vector to numeric vector in R for value assignment?

I have:
z = data.frame(x1=a, x2=b, x3=c, etc)
I am trying to do:
for (i in 1:10)
{
paste(c('N'),i,sep="") -> paste(c('z$x'),i,sep="")
}
Problems:
paste(c('z$x'),i,sep="") yields "z$x1", "z$x2" instead of calling the actual values. I need the expression to be evaluated. I tried as.numeric, eval. Neither seemed to work.
paste(c('N'),i,sep="") yields "N1", "N2". I need the expression to be merely used as name. If I try to assign it a value such as paste(c('N'),5,sep="") -> 5, ie "N5" -> 5 instead of N5 -> 5, I get target of assignment expands to non-language object.
This task is pretty trivial since I can simply do:
N1 = x1...
N2 = x2...
etc, but I want to learn something new
I'd suggest using something like for( i in 1:10 ) z[,i] <- N[,i]...
BUT, since you said you want to learn something new, you can play around with parse and substitute.
NOTE: these little tools are funny, but experienced users (not me) avoid them.
This is called "computing on the language". It's very interesting, and it helps understanding the way R works. Let me try to give an intro:
The basic language construct is a constant, like a numeric or character vector. It is trivial because it is not different from its "unevaluated" version, but it is one of the building blocks for more complicated expressions.
The (officially) basic language object is the symbol, also known as a name. It's nothing but a pointer to another object, i.e., a token that identifies another object which may or may not exist. For instance, if you run x <- 10, then x is a symbol that refers to the value 10. In other words, evaluating the symbol x yields the numeric vector 10. Evaluating a nonexistent symbol yields an error.
A symbol looks like a character string, but it is not. You can turn a string into a symbol with as.symbol("x").
The next language object is the call. This is a recursive object, structured like a list whose elements are either constants, symbols, or other calls. The first element must not be a constant, because it must evaluate to the actual function that will be called. The other elements are the arguments to this function.
If the first element does not evaluate to an existing function, R will throw either Error: attempt to apply non-function or Error: could not find function "x" (if the first element is a symbol that is undefined or points to something other than a function).
Example: the code line f(x, y+z, 2) will be parsed as a list of 4 elements, the first being f (as a symbol), the second being x (another symbol), the third another call, and the fourth a numeric constant. The third element, y+z, is just a call to a function with two arguments, so it parses as a list of three names: '+', y and z.
Finally, there is also the expression object, that is a list of calls/symbols/constants, that are meant to be evaluated one by one.
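These building blocks can be inspected directly at the console. A short sketch using the example call from above:

```r
cl <- quote(f(x, y + z, 2))   # nothing is evaluated, so f, x, y, z need not exist

length(cl)         # 4
class(cl)          # "call"
class(cl[[1]])     # "name"     (the symbol f)
class(cl[[2]])     # "name"     (the symbol x)
class(cl[[3]])     # "call"     (y + z)
class(cl[[4]])     # "numeric"  (the constant 2)
as.list(cl[[3]])   # a list holding `+`, y and z
as.symbol("x")     # x, a symbol built from a string

ex <- expression(a + b, c)    # an expression object holding a call and a symbol
class(ex)          # "expression"
length(ex)         # 2
```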
You'll find lots of information here:
https://github.com/hadley/devtools/wiki/Computing-on-the-language
OK, now let's get back to your question :-)
What you have tried does not work because the output of paste is a character string, and the assignment function expects as its first argument something that evaluates to a symbol, to be either created or modified. Alternatively, the first argument can also evaluate to a call associated with a replacement function. These are a little trickier, but they are handled by the assignment function itself, not by the parser.
The error message you see, target of assignment expands to non-language object, is triggered by the assignment function, precisely because your target evaluates to a string.
We can fix that building up a call that has the symbols you want in the right places. The most "brute force" method is to put everything inside a string and use parse:
parse(text=paste('N',i," -> ",'z$x',i,sep=""))
Another way to get there is to use substitute:
substitute(x -> y, list(x=as.symbol(paste("N",i,sep="")), y=substitute(z$w, list(w=paste("x",i,sep="")))))
The inner substitute creates the calls z$x1, z$x2 etc. The outer substitute puts this call as the target of the assignment, and the symbols N1, N2 etc. as the values.
parse results in an expression, and substitute in a call. Both can be passed to eval to get the same result.
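Here is a runnable sketch of both routes inside a loop. The data (z1, z2, N1, N2) are invented for the example, and the inner substitute uses as.symbol so that the call reads z2$x1 rather than z2$"x1"; both forms evaluate the same way.

```r
z1 <- data.frame(x1 = 0, x2 = 0)   # filled in via parse
z2 <- data.frame(x1 = 0, x2 = 0)   # filled in via substitute
N1 <- 10
N2 <- 20

for (i in 1:2) {
  # Route 1: build the code as text, parse it into an expression, evaluate it
  e <- parse(text = paste0("N", i, " -> z1$x", i))
  eval(e)

  # Route 2: build the call directly from symbols
  target <- substitute(z2$w, list(w = as.symbol(paste0("x", i))))
  cl <- substitute(x -> y, list(x = as.symbol(paste0("N", i)), y = target))
  eval(cl)
}

z1$x1   # 10
z2$x2   # 20
```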
Just one final note: I repeat that all this is intended as a didactic example, to help understanding the inner workings of the language, but it is far from good programming practice to use parse and substitute, except when there is really no alternative.
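For comparison, the same kind of bookkeeping can usually be done with [[ and a named list, with no computing on the language at all. A sketch with invented data, copying the columns x1, x2 into list elements N1, N2:

```r
z <- data.frame(x1 = 1:3, x2 = 4:6)

N <- list()
for (i in 1:2) {
  # [[ accepts a computed name on both sides of the assignment
  N[[paste0("N", i)]] <- z[[paste0("x", i)]]
}

names(N)   # "N1" "N2"
N$N2       # 4 5 6
```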
A data.frame is a named list. It is usually good practice, and idiomatic R, not to have lots of objects in the global environment, but to keep related (or similar) objects in lists and to use lapply etc.
You could use list2env to multi-assign the named elements of your list (the columns in your data.frame) to the global environment:
DD <- data.frame(x = 1:3, y = letters[1:3], z = 3:1)
list2env(DD, envir = parent.frame())
## <environment: R_GlobalEnv>
## ta da, x, y and z now exist within the global environment
x
## [1] 1 2 3
y
## [1] a b c
## Levels: a b c
z
## [1] 3 2 1
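If the goal is only to refer to the columns by name, it may be enough to evaluate code inside the data frame, or to unpack it into a separate environment instead of the global one. A sketch (scratch is a made-up name):

```r
DD <- data.frame(x = 1:3, z = 3:1)

with(DD, x + z)              # 4 4 4, columns are visible only inside with()

scratch <- list2env(DD, envir = new.env())   # unpack into a fresh environment
ls(scratch)                  # "x" "z"
get("x", envir = scratch)    # 1 2 3
```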
I am not exactly sure what you are trying to accomplish. But here is a guess:
### Create a data.frame using the alphabet
data <- data.frame(x = 'a', y = 'b', z = 'c')
### Create a numerical index corresponding to the letter position in the alphabet
index <- which(tolower(letters[1:26]) == data[1, ])
### Use an 'lapply' to apply a function to every element in 'index'; creates a list
val <- lapply(index, function(x) {
paste('N', x, sep = '')
})
### Assign names to our list
names(val) <- names(data)
### Observe the result
val$x

Resources