Calling a variable name of a dataframe inside functions in R using "$"

I have a dataframe (master) with some variables whose names I have stored in the lists below:
cont<-list("Quantity","Amt_per_qty","Trans_tax","Total_trans_amt")
catg<-list("Gender","Region_code","SubCategory")
I am trying to create a function that accesses these variables from the dataframe and performs some operation on them. Although x and val in the function below seem to resolve, how can I access the variables using the $ sign inside the function?
univar<-function (x){
for (val in cont){
print (val)
n<-nrow(x$val) }
print (n) }
univar(master)
It returns NULL. I even tried n<-nrow(x[,val]), but that doesn't seem to work either.

i) x[val] returns a data.frame
ii) x[,val,drop = TRUE] returns a vector
iii) x[[val]] returns a vector. The advantage of this form is that it also works with data.tables
To get the number of rows, use n <- nrow(x) or n <- length(x[[val]])
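A minimal sketch of the difference, using a small made-up data frame in place of master (the column values and the helper name count_values are assumptions for illustration only). x$val looks for a column literally named val, whereas x[[val]] uses the string stored in val:

```r
# Hypothetical stand-in for `master`; the values are invented for illustration.
master <- data.frame(Quantity = c(1, 2, 3), Trans_tax = c(0.1, 0.2, 0.3))
cont <- c("Quantity", "Trans_tax")

count_values <- function(x, vars) {
  # x[[val]] looks up the column whose name is stored in val;
  # x$val would look for a column literally called "val" and give NULL.
  sapply(vars, function(val) length(x[[val]]))
}

counts <- count_values(master, cont)
print(counts)
print(is.null(master$val))  # TRUE: there is no column named "val"
```

Because x[[val]] is a vector, length() is used here; nrow() of a vector is NULL.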

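The OP stored the column names in a list; it can be unlisted into a character vector and then used with [ or [[. Note also that x[[val]] is a vector, so nrow() returns NULL for it; use length() instead.
cont <- unlist(cont)
univar <- function(x) {
  for (val in cont) {
    print(val)
    n <- length(x[[val]])  # x[[val]] is a vector, so nrow() would give NULL
  }
  print(n)
}
univar(master)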

Related

Reading global variables

All I can find is how to write to global variables, but not how to read them.
Example of incorrect code:
v = 0;
test <- function(v) {
v ->> global_v;
v <<- global_v + v;
}
test(1);
print(v);
This yields 2 because v ->> global_v treats v as the local variable v which is equal to 1. What can I replace that line with for global_v to get the 0 from the global v?
I'm asking of course about solutions different to "use different variable names".
You can use with(globalenv(), v) to evaluate v in the global environment rather than the function. with constructs an environment from its first argument, and evaluates the subsequent arguments in that environment. globalenv() returns the global environment. Putting those together, your function would become this:
test <- function(v) {
v <<- with(globalenv(), v) + v;
}
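A sketch of an equivalent alternative, assuming the same v and test as in the question: get() with envir = globalenv() starts the lookup in the global environment, so the function argument v does not shadow the global one.

```r
v <- 0

test <- function(v) {
  # Look up v starting in the global environment, skipping the argument v.
  global_v <- get("v", envir = globalenv())
  v <<- global_v + v
}

test(1)
print(v)  # 1
```

This assumes the code is run at top level, so that the first v really lives in the global environment.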

Why does this happen when a user-defined R function does not return a value?

The function shown below contains no return statement. However, after executing it, I can confirm that the result is assigned to d as expected.
Any suggestions in this regard will be appreciated.
Code
# plotly and dplyr are installed; txhousing is a ggplot2 dataset
library(plotly)
library(dplyr)
accumulate_by <- function(dat, var) {
var <- lazyeval::f_eval(var, dat)
lvls <- plotly:::getLevels(var)
dats <- lapply(seq_along(lvls), function(x) {
cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
})
dplyr::bind_rows(dats)
}
d <- txhousing %>%
filter(year > 2005, city %in% c("Abilene", "Bay Area")) %>%
accumulate_by(~date)
In the function, the last assignment creates 'dats', and the last expression, bind_rows(dats), is what the function returns. We don't need an explicit return statement. If two objects need to be returned, we can place them in a list.
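A minimal sketch of both points, with made-up function names (add_two and summarise_vec are not from the question):

```r
# The value of the last evaluated expression is the function's result.
add_two <- function(x) {
  x + 2
}

# To hand back several objects, wrap them in a named list.
summarise_vec <- function(x) {
  list(total = sum(x), average = mean(x))
}

res <- summarise_vec(c(1, 2, 3))
print(add_two(3))    # 5
print(res$total)     # 6
print(res$average)   # 2
```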
In some languages like Python, generators are used for memory efficiency: they yield values one at a time instead of creating the whole output in memory. Consider two functions in Python:
def get_square(n):
    result = []
    for x in range(n):
        result.append(x**2)
    return result
When we run it
get_square(4)
#[0, 1, 4, 9]
The same function can be written as a generator. Instead of returning a list, it yields one value at a time:
def get_square(n):
    for x in range(n):
        yield x**2
Running the function
get_square(4)
#<generator object get_square at 0x0000015240C2F9E8>
By casting with list, we get the same output
list(get_square(4))
#[0, 1, 4, 9]
There is always a return :) You just don't have to be explicit about it.
All R expressions return something, including control structures and user-defined functions. (Control structures are just functions, by the way, so you can just remember that everything is a value or a function call, and everything evaluates to a value.)
For functions, the return value is the last expression evaluated in the execution of the function. So, for
f <- function(x) 2 + x
when you call f(3) you will invoke the function + with two parameters, 2 and x. These evaluate to 2 and 3, respectively, so `+`(2, 3) evaluates to 5, and that is the result of f(3).
When you call the return function -- and remember, this is a function -- you just leave the control-flow of a function early. So,
f <- function(x) {
if (x < 0) return(0)
x + 2
}
works as follows: When you call f, it will call the if function to figure out what to do in the first statement. The if function will evaluate x < 0 (which means calling the function < with parameters x and 0). If x < 0 is true, if will evaluate return(0). If it is false, it will evaluate its else part (which, because if has a special syntax when it comes to functions, isn't shown, but is NULL). If x < 0 is not true, f will evaluate x + 2 and return that. If x < 0 is true, however, the if function will evaluate return(0). This is a call to the function return, with parameter 0, and that call will terminate the execution of f and make the result 0.
Be careful with return. It is a function so
f <- function(x) {
if (x < 0) return;
x + 2
}
is perfectly valid R code, but it will not return when x < 0. The if call will just evaluate to the function return but not call it.
The return function is also a little special in that it can return from the parent call of control structures. Strictly speaking, return isn't evaluated in the frame of f in the examples above, but from inside the if calls. It is just handled specially so that it can return from f.
With non-standard evaluation this isn't always the case.
With this function
f <- function(df) {
with(df, if (any(x < 0)) return("foo") else return("bar"))
"baz"
}
you might think that
f(data.frame(x = rnorm(10)))
should return either "foo" or "bar". After all, we return in either case in the if statement. However, the if statement is evaluated inside with, and it doesn't work that way. The function will return "baz".
For non-local returns like that, you need to use callCC, and then it gets more technical (as if this wasn't technical enough).
If you can, try to avoid return completely and rely on functions returning the last expression they evaluate.
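As a small sketch of that advice, the earlier f can be written without return by letting the if/else expression itself be the last value (the name f_no_return is made up here):

```r
f_no_return <- function(x) {
  # if/else is an expression; its value becomes the function's result.
  if (x < 0) 0 else x + 2
}

print(f_no_return(-1))  # 0
print(f_no_return(3))   # 5
```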
Update
Just to follow up on the comment below about loops. When you call a loop, you will most likely call one of the built-in primitive functions. And, yes, they return NULL. But you can write your own, and they will follow the rule that they return the last expression they evaluate. You can, for example, implement for in terms of while like this:
`for` <- function(itr_var, seq, body) {
  itr_var <- as.character(substitute(itr_var))
  body <- substitute(body)
  e <- parent.frame()
  j <- 1
  while (j <= length(seq)) {  # <= so that the last element is visited too
    assign(x = itr_var, value = seq[[j]], envir = e)
    eval(body, envir = e)
    j <- j + 1
  }
  "foo"
}
This function will definitely return "foo", so this
for(i in 1:5) { print(i) }
evaluates to "foo". If you want it to return NULL, you have to be explicit about it (or just let the return value be the result of the while loop; if that is the primitive while, it returns NULL).
The point I want to make is that the rule that functions return the last expression they evaluate has to do with how the functions are defined, not how you call them. The loops use non-standard evaluation, so the last expression in the loop body you provide might or might not be the last value they evaluate. For the primitive loops, it is not.
Except for their special syntax, there is nothing magical about loops. They follow the rules all functions follow. With non-standard evaluation it can get a bit tricky to work out from a function call what the last expression they will evaluate might be, because the function body looks like it is what the function evaluates. It is, to a degree, if the function is sensible, but the loop body is not the function body. It is a parameter. If it wasn't for the special syntax, and you had to provide loop bodies as normal parameters, there might be less confusion.

Write similar which function

Sorry for the lack of clarity.
myfunction should return the indices of the elements in a vector that satisfy a condition
myfunction <- function(vector,condition)
{
seq_along(vector)[vector == condition]
}
myfunction(vector == condition)
Error: object 'conditions' not found
I'm not sure exactly what you want your function to perform. Does it need to show which elements in a vector satisfy a condition (which is what which(vector == 10) would do)? If that is your intent, can you just do something like:
myfunction <- function(vector, condition){
which(vector == condition)
}
In any case, as far as I'm aware, you can't put a test condition in the parameter definitions of your function.
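One hedged way to get closer to what the question seems to want is to pass the condition as a function (a predicate) rather than as an expression. The sketch below is an illustration only; my_which and its arguments are hypothetical names, not from the question:

```r
my_which <- function(vec, predicate) {
  # Apply the predicate to get a logical vector, then keep the TRUE positions.
  which(predicate(vec))
}

idx <- my_which(c(5, 10, 15, 10), function(v) v == 10)
print(idx)  # 2 4
```

Any test that returns a logical vector of the same length can be passed this way, for example function(v) v > 7.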

filling an array recursively in R language

I have a multidimensional array (B_matrix) that I need to fill up with some random values. Since the dimension depends on two parameters K and C that are user defined, I cannot use nested loop to fill the array, so I have decided to fill it up recursively.
The problem with the recursive function (fillUp) is that even though the array is declared outside the function, the array is set to NULL after the function is run.
B_dim = rep(2,((K+1+C)*2))
B_matrix = array( dim = B_dim, dimnames = NULL)
string = c()
fillUp<-function(level, string ){
if (level>=1){
for(i in c(1,2)){
Recall(level-1, c(string, i))
}
}else{
B_matrix[string] = 1;
}
}
fillUp(length(B_dim), string)
> sum( B_matrix == 1)
[1] NA
I'm new to R, so I'm not sure if the "global" declaration allows fillUp to change the values of the matrix.
Edit:
Note that the line
B_matrix[string] = 1;
is just a test case, and the original idea is to assign some random value that depends on the position of the array element.
Edit2:
Based on what @Bridgeburners hinted, I'm almost there. Replacing B_matrix[string] = 1 with
assign('str', matrix(string,1), envir=.GlobalEnv)
assign('hl', B_half_length, envir=.GlobalEnv)
rul <-runif(1, 0, sum(str[1:hl]))
with( .GlobalEnv,B_matrix[str] <- rul)
I get the error (last line):
Error in eval(expr, envir, enclos) : object 'rul' not found
The problem, I guess, is that I'm working with variables from two different environments at the same time. I don't know how to proceed here.
This option doesn't work either
assign('str',matrix(string,1), envir=.GlobalEnv)
assign('hl', B_half_length, envir=.GlobalEnv)
assign('ru', runif(1, 0, sum(str[1:hl])), envir=.GlobalEnv)
with( .GlobalEnv,B_matrix[str] <- ru)
Note: no visible binding for global variable 'ru'
Edit3:
I've finally solved it:
assign('str',matrix(string,1), envir=.GlobalEnv)
with( .GlobalEnv, B_matrix[str] <- runif(1, 0, sum(str[1:B_half_length])-B_half_length+1) )
where B_half_length is a global variable
Whenever a process is working within a function, it's working in a different environment. The object "B_matrix" is defined in the global environment. Since you're nesting environments (2*(K+C+1) times), the assignment inside the function only changes a local copy and you're not impacting the original object. If you simply replace the line
B_matrix[string] = 1;
with
assign('str', matrix(string,1), envir=.GlobalEnv)
with(.GlobalEnv,B_matrix[str] <- 1)
your code will work. You simply need to specify which environment your expression is working in. (In the first line you're passing the local value of "string" to a global object named "str".)
Note also that indexing an array with a vector doesn't select a single element.
That is, "B_matrix[2,2,2,2,2,2]" is not the same as "B_matrix[c(2,2,2,2,2,2)]".
But it works with a one-row matrix, which is why "string" is converted with matrix(string,1).
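A small sketch of that difference, on a 2 x 2 x 2 array chosen only for illustration (arr and idx are made-up names):

```r
# A small 2 x 2 x 2 array filled with zeros.
arr <- array(0, dim = c(2, 2, 2))
idx <- c(2, 1, 2)

# A plain vector is treated as three separate linear positions (2, 1 and 2).
n_selected <- length(arr[idx])
print(n_selected)  # 3

# A one-row matrix is treated as a single (i, j, k) coordinate.
arr[matrix(idx, nrow = 1)] <- 1

print(arr[2, 1, 2])  # 1
print(sum(arr))      # 1
```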
What you want can be achieved with the following line of code once you have initialised your B_matrix array:
B_matrix[] <- runif(length(B_matrix))

Logging and writing error messages to a dataframe

I intend to record the errors raised while calling functions in my R code in a dataframe (ERR_LOG, say). I want to use 'try' to identify errors, if any, while calling a function. The dataframe (ERR_LOG) will have the following columns:
Time : The time at which the function was called (Sys.time)
Loc : The function call for which this error was recorded (name of the function)
Desc : Description of the error which R throws at us (error message in R)
Example :
First I would like to initialize a blank dataframe 'ERR_LOG' with these columns
Then write the function
f <- function(a){
x <- a*100
return(x)
}
Now I put the output of the call to 'f' in 'chk'
chk <- try(f())
The above call gives the error 'Error in a * 100 : 'a' is missing' (description of the error)
Check
if(inherits(chk,'try-error'))
{then I want to populate ERR_LOG and stop the code execution}
How can this be done in R?
Use tryCatch instead of try.
Then, inside tryCatch(), use the argument error=function(e){}.
e will have an element named message, which is what you are looking for.
Use the following call with browser to explore e$message:
x <- tryCatch(stop("This is your error message"), error=function(e) {browser()})
Note that your function need not be anonymous.
MyErrorParser <- function(e) {
  m <- e$message
  if (grepl("something", m)) {
    # handle this particular kind of error here
  }
  m  # placeholder: return whatever the caller should receive
}
## THEN
tryCatch(stop("This is a test"), error=MyErrorParser)
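Putting this together with the ERR_LOG layout from the question, a minimal sketch might look like the following. The helper log_call and its arguments are hypothetical names introduced here, and appending rows with rbind is just one possible choice:

```r
# Empty log with the three columns described in the question.
ERR_LOG <- data.frame(Time = Sys.time()[0],
                      Loc = character(),
                      Desc = character(),
                      stringsAsFactors = FALSE)

f <- function(a) {
  a * 100
}

# Hypothetical helper: evaluate `expr`; on error, append a row to ERR_LOG.
log_call <- function(expr, loc) {
  tryCatch(expr, error = function(e) {
    ERR_LOG <<- rbind(ERR_LOG,
                      data.frame(Time = Sys.time(),
                                 Loc = loc,
                                 Desc = conditionMessage(e),
                                 stringsAsFactors = FALSE))
    NULL  # value handed back to the caller when an error occurred
  })
}

chk <- log_call(f(), "f")
print(ERR_LOG$Loc)
print(ERR_LOG$Desc)
```

To halt after logging, as the question asks, the handler could end with stop(e) instead of NULL, which re-raises the original error.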
