Evaluate a CallExpression - abstract-syntax-tree

I have a large code base with a bunch of obfuscated expressions, and I am trying to simplify them:
let x = "1+1".split().join("") // x = 2
The result should be just let x = 2, which can easily be obtained by running eval on the right-hand side.
Is it possible to evaluate a call expression (with arguments) during the transformation process?

Related

Pass vector as command line argument in Julia

Is there a way to pass a vector variable as a command line argument in Julia? In my case I have two arguments: an integer and a vector of integers. While I can easily parse the first argument, I didn't find any pleasant way to parse a vector. For now I simply set the vector to be v = parse.(Int, ARGS[2:end]), but it is quite confusing since the items of the vector are treated as separate arguments. Is there some special syntax for such cases?
I think your current solution is fine as it is, and in line with the way many command-line tools do things.
If you really want to pass your whole array as one command-line argument, you'll have to:
somehow make sure it is correctly parsed as one argument by the shell
parse it as an array within Julia
Both steps vary depending on the syntax you want to use.
Two examples:
Example 1: Julia-like syntax
shell$ julia myscript.jl 42 "[1,2,3]"
i = 42
v = [1, 2, 3]
We can take advantage of the Julia parser being able to parse such arrays (but let's be careful not to evaluate arbitrary Julia code input by the user):
# First argument: an integer
i = parse(Int, ARGS[1])
# Second argument, a Vector{Int} in Julia-like format: "[1, 2, 3]"
v = let expr = Meta.parse(ARGS[2])
    @assert expr.head == :vect
    Int.(expr.args)
end
@show i
@show v
Example 2: space- or comma-separated values
shell$ julia myscript.jl 42 "1,2,3"
i = 42
v = [1, 2, 3]
Here, we can use DelimitedFiles to parse the array (change the delimiter to whatever you like):
# First argument: an integer
i = parse(Int, ARGS[1])
# Second argument, a Vector{Int} as comma-separated values
using DelimitedFiles
v = reshape(readdlm(IOBuffer(ARGS[2]), ',', Int), :)
# set the delimiter here ^
@show i
@show v

Accessing values in expression using a macro

I'm wondering whether it's possible to define a macro that can modify the values of an expression only if the values are of a specific type?
Here's a minimal example:
mutable struct Special
    x::Int
end
f1(s, n::Special) = println("f1", s, n)
f2(s, n::Special) = println("f2", s, n)
x1 = Special(3)
x2 = Special(5)
expr = :(
    f1("this is f1", x1),
    f2("this is f2", x2)
)
Now a macro might be able to examine the values of the arguments to the functions, determine that x1 and x2 are of type Special, run some function to modify their values, say by changing 3 to 4 and 5 to 2 (it might involve comparing two values), then pass the expression back to the caller. The final result would be equivalent to calling:
f1("this is f1", 4)
f2("this is f2", 2)
I found that it's possible to access the values in a macro via:
eval(eval(filter(x -> typeof(eval(x)) == Special, expr.args[1].args))[1]).x
=> 3
Although this works, it looks wrong, and I might either be doing it wrong or trying to do something too far out...
No, you should never try to check types or values inside macros. Using eval to figure out the type or value of something in a macro may work in very limited situations, but it'll break in almost every real use. Instead, just have the macro insert a call to a generic function — that's where Julia excels at picking apart types (as method dispatch) and values (within the method):
munge_special(x::Special) = Special(x.x + 42)
munge_special(x) = x
macro do_something_special(x)
    return :(munge_special($(esc(x))))
end
julia> @do_something_special Special(2)
Special(44)
julia> @do_something_special 3
3

How can I interpret user input as a function in Julia?

I've been using the following function to take in user input for something I'm writing in Julia:
function input(prompt::AbstractString = "")
    println(prompt * " ")
    chomp(readline())
end
In my particular case, the input that I'm taking in is in the form of equations such as "y = x^2". After the input() function passes it to me as an ASCIIString, I then use the parse() function to convert it to an Expression:
:(y = x^2)
As an Expression, I can use the .args attribute to do things like counting the number of variables and returning the unique variables, all of which has worked fine. Now, I need to be able to evaluate the right side of the expression as the Function f(x) = x^2. To do so, I began writing the following function (which has some pretty major flaws):
function evalExpression()
    L = [1,2,3,4]
    equation = parse(input("Enter an equation"))
    f = equation.args[2].args[2]
    for i in L
        x = i
        value = eval(f)
        println(value)
    end
end
This function has two problems that I haven't been able to resolve. The first is that it gives me an UndefVarError for x when I try to run it right now; that's more or less expected. The second is that unless I knew that the user would input a function of only x, I would have no way of figuring out what the variables I needed to assign were. I wrote a recursive function that can take in an expression and return all its variables in the form of [:x, :y, etc.], but I cannot assign :x to a number to evaluate the function--I need to assign it just to x, and I cannot figure out how to access that. Is there anything that I can use to access the variables I need? Or a different approach I could take?
Thank you!
When I run the following:
function evalExpression()
    L = [1,2,3,4]
    equation = parse(input("Enter an equation"))
    global x
    for i in L
        x = i
        f = equation.args[2].args[2]
        value = eval(f)
        println(value)
    end
end
and then enter y = x*x, I get
evalExpression()
Enter an equation
y = x*x
1
2
3
4
What was missing, at least for x as a variable, is declaring it as a global. eval always evaluates in global scope, so the parsed expression can only access global variables.
So once your recursive function correctly collects the variables, what you probably need to do is create them globally. Maybe
eval(parse("$variable = 0"))
will do.

parameter passing mechanism in R

The following function is used to multiply a sequence 1:x by y
f1<-function(x,y){return (lapply(1:x, function(a,b) b*a, b=y))}
It looks like a represents the element in the sequence 1:x, but I do not understand this parameter passing mechanism. Other OO languages, like Java or C++, have call by reference or call by value.
Short answer: R is call by value. Long answer: it can do both.
Call By Value, Lazy Evaluation, and Scoping
You'll want to read through the R language definition for more details.
R mostly uses call by value but this is complicated by its lazy evaluation:
So you can have a function:
f <- function(x, y) {
  x * 3
}
If you pass two big matrices as x and y, only x is ever evaluated inside f, because y is never used.
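As a minimal sketch of the lazy-evaluation point, assuming nothing beyond base R: the second argument below is an expression that would raise an error if it were ever evaluated, yet the call succeeds because f never touches y. The variable name result is only for illustration.

```r
f <- function(x, y) {
  x * 3
}

# y is never used inside f, so its promise is never forced
# and the stop() call is never run.
result <- f(2, stop("y was evaluated"))
print(result)
```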
But you can also access variables in parent environments of f:
y <- 5
f <- function(x) {
  x * y
}
f(3) # 15
Or even:
y <- 5
f <- function() {
  x <- 3
  g <- function() {
    x * y
  }
}
f() # returns function g()
f()() # returns 15
Call By Reference
There are two ways for doing call by reference in R that I know of.
One is by using Reference Classes, one of the three object oriented paradigms of R (see also: Advanced R programming: Object Oriented Field Guide)
The other is to use the bigmemory and bigmatrix packages (see The bigmemory project). This allows you to create matrices in memory (underlying data is stored in C), returning a pointer to the R session. This allows you to do fun things like accessing the same matrix from multiple R sessions.
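The Reference Classes route can be sketched as follows. This is only an illustration: the class name Counter, its field count, and the helper bump are hypothetical and not taken from any package. The point is that a function modifying the object is modifying the caller's object, not a copy.

```r
library(methods)  # Reference Classes live in the methods package

# Counter is a hypothetical class used only for illustration
Counter <- setRefClass(
  "Counter",
  fields = list(count = "numeric"),
  methods = list(
    increment = function() {
      # <<- updates the field of this object in place
      count <<- count + 1
    }
  )
)

# bump() changes its argument; with an ordinary vector the caller
# would not see the change, but a reference object is shared.
bump <- function(obj) {
  obj$increment()
  invisible(NULL)
}

counter <- Counter$new(count = 0)
bump(counter)
bump(counter)
print(counter$count)
```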
To multiply a vector x by a constant y just do
x * y
The (some prefix)apply functions all work very similarly: you map a function over every element of your vector, list, matrix and so on:
x = 1:10
x.squared = sapply(x, function(elem)elem * elem)
print(x.squared)
[1] 1 4 9 16 25 36 49 64 81 100
It gets better with matrices and data frames because you can now apply a function over all rows or columns, and collect the output. Like this:
m = matrix(1:9, ncol = 3)
# The 1 below means apply over rows, 2 would mean apply over cols
row.sums = apply(m, 1, function(some.row) sum(some.row))
print(row.sums)
[1] 12 15 18
If you're looking for a simple way to multiply a sequence by a constant, definitely use @Fernando's answer or something similar. I'm assuming you're just trying to determine how parameters are being passed in this code.
lapply calls its second argument (in your case function(a, b) b*a) with each of the values of its first argument 1, 2, ..., x. Those values will be passed as the first parameter to the second argument (so, in your case, they will be argument a).
Any additional parameters to lapply after the first two, in your case b=y, are passed to the function by name. So if you called your inner function fxn and y were 4, then your invocation of lapply would be making calls like fxn(1, b=4), fxn(2, b=4), .... The parameters are passed by value.
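To make that concrete, here is a small sketch that gives the inner function the name fxn (as in the explanation above) and assumes y is 4. It compares what lapply returns with the same calls written out by hand; the names via_lapply and by_hand are only for illustration.

```r
# The anonymous function from the question, given a name
fxn <- function(a, b) b * a

y <- 4  # assumed value, matching the b=4 in the explanation

# lapply passes each element of 1:3 as the first argument (a)
# and forwards b = y to every call by name.
via_lapply <- lapply(1:3, fxn, b = y)

# The same calls written out explicitly
by_hand <- list(fxn(1, b = y), fxn(2, b = y), fxn(3, b = y))

print(identical(via_lapply, by_hand))
```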
You should read the help of lapply to understand how it works. Read this excellent answer to get a good explanation of the different xxpply family functions.
From the help of lapply:
lapply(X, FUN, ...)
Here FUN is applied to each element of X, and ... refers to:
... optional arguments to FUN.
Since FUN has an optional argument b, we replace the ... by , b=y.
You can see it as syntactic sugar that emphasizes the fact that argument b is optional compared to argument a. If the 2 arguments are symmetric, it may be better to use mapply.

Convert character vector to numeric vector in R for value assignment?

I have:
z = data.frame(x1=a, x2=b, x3=c, etc)
I am trying to do:
for (i in 1:10)
{
  paste(c('N'),i,sep="") -> paste(c('z$x'),i,sep="")
}
Problems:
paste(c('z$x'),i,sep="") yields "z$x1", "z$x2" instead of calling the actual values. I need the expression to be evaluated. I tried as.numeric and eval. Neither seemed to work.
paste(c('N'),i,sep="") yields "N1", "N2". I need the expression to be used merely as a name. If I try to assign it a value such as paste(c('N'),5,sep="") -> 5, i.e. "N5" -> 5 instead of N5 -> 5, I get target of assignment expands to non-language object.
This task is pretty trivial since I can simply do:
N1 = x1...
N2 = x2...
etc, but I want to learn something new
I'd suggest using something like for( i in 1:10 ) z[,i] <- N[,i]...
BUT, since you said you want to learn something new, you can play around with parse and substitute.
NOTE: these little tools are funny, but experienced users (not me) avoid them.
This is called "computing on the language". It's very interesting, and it helps understanding the way R works. Let me try to give an intro:
The basic language construct is a constant, like a numeric or character vector. It is trivial because it is not different from its "unevaluated" version, but it is one of the building blocks for more complicated expressions.
The (officially) basic language object is the symbol, also known as a name. It's nothing but a pointer to another object, i.e., a token that identifies another object which may or may not exist. For instance, if you run x <- 10, then x is a symbol that refers to the value 10. In other words, evaluating the symbol x yields the numeric vector 10. Evaluating a nonexistent symbol yields an error.
A symbol looks like a character string, but it is not. You can turn a string into a symbol with as.symbol("x").
The next language object is the call. This is a recursive object, implemented as a list whose elements are either constants, symbols, or other calls. The first element must not be a constant, because it must evaluate to the real function that will be called. The other elements are the arguments to this function.
If the first element does not evaluate to an existing function, R will throw either Error: attempt to apply non-function or Error: could not find function "x" (if the first element is a symbol that is undefined or points to something other than a function).
Example: the code line f(x, y+z, 2) will be parsed as a list of 4 elements, the first being f (as a symbol), the second being x (another symbol), the third another call, and the fourth a numeric constant. The third element, y+z, is just a function with two arguments, so it parses as a list of three names: '+', y and z.
Finally, there is also the expression object, that is a list of calls/symbols/constants, that are meant to be evaluated one by one.
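These building blocks can be inspected directly from the console. The following is a minimal sketch using only base R (quote, as.symbol, eval, expression); the variable names sym, cl, third and ex are made up for illustration.

```r
x <- 10

# A symbol is not a string: evaluating it looks up the object it names
sym <- as.symbol("x")
print(is.symbol(sym))
print(eval(sym))

# quote() returns the call without evaluating it
cl <- quote(f(x, y + z, 2))
print(length(cl))          # number of elements: f, x, y + z, 2
print(is.symbol(cl[[1]]))  # the function position holds the symbol f
print(is.call(cl[[3]]))    # y + z is itself a call
print(is.numeric(cl[[4]])) # 2 is a constant

# The nested call y + z is a list of three names: `+`, y and z
third <- as.list(cl[[3]])
print(third)

# An expression object holds several calls, evaluated one by one;
# eval() returns the value of the last one.
ex <- expression(a <- 1, a + 1)
print(eval(ex))
```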
You'll find lots of information here:
https://github.com/hadley/devtools/wiki/Computing-on-the-language
OK, now let's get back to your question :-)
What you have tried does not work because the output of paste is a character string, and the assignment function expects as its first argument something that evaluates to a symbol, to be either created or modified. Alternatively, the first argument can also evaluate to a call associated with a replacement function. These are a little trickier, but they are handled by the assignment function itself, not by the parser.
The error message you see, target of assignment expands to non-language object, is triggered by the assignment function, precisely because your target evaluates to a string.
We can fix that by building up a call that has the symbols you want in the right places. The most "brute force" method is to put everything inside a string and use parse:
parse(text=paste('N',i," -> ",'z$x',i,sep=""))
Another way to get there is to use substitute:
substitute(x -> y, list(x=as.symbol(paste("N",i,sep="")), y=substitute(z$w, list(w=paste("x",i,sep="")))))
The inner substitute creates the calls z$x1, z$x2 etc. The outer substitute puts this call as the target of the assignment, and the symbols N1, N2 etc. as the values.
parse results in an expression, and substitute in a call. Both can be passed to eval to get the same result.
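A rough end-to-end sketch of both routes follows, under the assumption of a two-column data frame z and two vectors N1 and N2 (toy data invented here, not from the question). Each loop builds the assignment for every i and passes it to eval; the copies z_parse and z_subst are only there to compare the results.

```r
# Toy data, assumed for illustration
N1 <- c(1, 2)
N2 <- c(3, 4)

# Route 1: build the code as a string and parse it
z <- data.frame(x1 = c(0, 0), x2 = c(0, 0))
for (i in 1:2) {
  e <- parse(text = paste("N", i, " -> ", "z$x", i, sep = ""))
  eval(e)
}
z_parse <- z

# Route 2: build the call directly with substitute
z <- data.frame(x1 = c(0, 0), x2 = c(0, 0))
for (i in 1:2) {
  cl <- substitute(
    x -> y,
    list(
      x = as.symbol(paste("N", i, sep = "")),
      y = substitute(z$w, list(w = paste("x", i, sep = "")))
    )
  )
  eval(cl)
}
z_subst <- z

print(is.expression(e))  # parse gives an expression
print(is.call(cl))       # substitute gives a call
print(identical(z_parse, z_subst))
```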
Just one final note: I repeat that all this is intended as a didactic example, to help understanding the inner workings of the language, but it is far from good programming practice to use parse and substitute, except when there is really no alternative.
A data.frame is a named list. It is usually good practice, and idiomatic R, not to have lots of objects in the global environment, but to keep related (or similar) objects in lists and to use lapply etc.
You could use list2env to multi-assign the named elements of your list (the columns in your data.frame) to the global environment:
DD <- data.frame(x = 1:3, y = letters[1:3], z = 3:1)
list2env(DD, envir = parent.frame())
## <environment: R_GlobalEnv>
## ta da, x, y and z now exist within the global environment
x
## [1] 1 2 3
y
## [1] a b c
## Levels: a b c
z
## [1] 3 2 1
I am not exactly sure what you are trying to accomplish. But here is a guess:
### Create a data.frame using the alphabet
data <- data.frame(x = 'a', y = 'b', z = 'c')
### Create a numerical index corresponding to the letter position in the alphabet
index <- which(tolower(letters[1:26]) == data[1, ])
### Use an 'lapply' to apply a function to every element in 'index'; creates a list
val <- lapply(index, function(x) {
  paste('N', x, sep = '')
})
### Assign names to our list
names(val) <- names(data)
### Observe the result
val$x

Resources