Vectorize Environment Access in R

So I have created an environment (which I am trying to use as a hashtable).
To clarify I'm accessing the values stored in the environment with this:
hash[["uniqueIDString"]] ## hash takes a uniqueID and returns a
## dataframe subset that is precomputed
I also have a function called func which returns some subset of the rows returned by hash. It works fine for single calls but it isn't vectorized so I can't use it in a transform which is kind of vital.
The following doesn't work:
df <- transform(df,FOO = func(hash[[ID]])$FOO)
It gives me an error about wrong arguments for the hash, which I presume is because it's passing a whole vector of IDs to my environment and the environment doesn't know what to do with it.
EDIT: The exact error is:
Error in hash[[ID]] :
wrong arguments for subsetting an environment
EDIT: With Rob's suggestion I receive the following error:
Error in hash[ID] :
object of type 'environment' is not subsettable
EDIT: For clarification I'm attempting to use the hash lookup in the context of a transform where the values in the ID column are looked up in the hashtable and passed to func so that the output can become a new column.

I use environments as hash tables a lot. The way to retrieve the values corresponding to multiple keys is to use mget:
hash <- new.env()
hash[['one']] <- 'this is one'
hash[['two']] <- 'this is two'
mget(c('one', 'two'), envir = hash)
which returns a list
$one
[1] "this is one"
$two
[1] "this is two"
If you need the output as a vector, use unlist:
unlist(mget(c('one', 'two'), envir = hash))
gives you
one two
"this is one" "this is two"
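UPDATE: If your IDs come from a data frame, pass the column to mget as a character vector. With stringsAsFactors = TRUE (the default before R 4.0.0) the column is a factor, which mget does not accept, so convert it with as.character:
df <- data.frame(id=c('one', 'two'))
unlist(mget(as.character(df$id), envir = hash))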
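For the transform use case in the question, one possible sketch is to look up all IDs with mget and then apply func to each result. The contents of hash, the body of func, and the ID values below are made up for illustration; only the mget and vapply calls are the point.

```r
# Hypothetical hash table: each key maps to a small precomputed data frame.
hash <- new.env()
hash[["a"]] <- data.frame(FOO = c(1, 2))
hash[["b"]] <- data.frame(FOO = c(10, 20))

# Hypothetical stand-in for func: keep only the first row of the subset.
func <- function(d) d[1, , drop = FALSE]

df <- data.frame(ID = c("a", "b", "a"), stringsAsFactors = FALSE)

# Look up every ID in one call, then apply func to each looked-up data frame.
looked_up <- mget(df$ID, envir = hash)
df$FOO <- vapply(looked_up, function(d) func(d)$FOO, numeric(1),
                 USE.NAMES = FALSE)
df
```

vapply is used rather than sapply so that the result is guaranteed to be one numeric value per ID; if func can return several rows, a list column or a merge would be a better fit.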

Related

How do you solve "could not find function "deparse<-" | "as.name<-" | "eval<-"" errors when trying to dynamically name dataframes in R? [duplicate]

I am using R to parse a list of strings in the form:
original_string <- "variable_name=variable_value"
First, I extract the variable name and value from the original string and convert the value to numeric class.
parameter_value <- as.numeric("variable_value")
parameter_name <- "variable_name"
Then, I would like to assign the value to a variable with the same name as the parameter_name string.
variable_name <- parameter_value
What is/are the function(s) for doing this?
assign is what you are looking for.
assign("x", 5)
x
[1] 5
but buyer beware.
See R FAQ 7.21
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f
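As I read it, the FAQ entry referenced above also notes that indexing a list by name is often easier than creating variables from strings. A minimal sketch of that alternative, reusing the name=value input format from the question (the names pairs and values are illustrative, not from the original post):

```r
original_string <- c("x=123", "y=456")

# Split each "name=value" string into its two parts.
pairs <- strsplit(original_string, "=", fixed = TRUE)

# Store the numeric values in a list, named by the parsed parameter names.
values <- lapply(pairs, function(p) as.numeric(p[2]))
names(values) <- vapply(pairs, function(p) p[1], character(1))

values[["x"]]
values[["y"]]
```

Nothing is written into the global environment here, so a parsed name can never silently overwrite an existing object.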
You can use do.call:
do.call("<-",list(parameter_name, parameter_value))
There is another simple solution found there:
http://www.r-bloggers.com/converting-a-string-to-a-variable-name-on-the-fly-and-vice-versa-in-r/
To convert a string to a variable:
x <- 42
eval(parse(text = "x"))
[1] 42
And the opposite:
x <- 42
deparse(substitute(x))
[1] "x"
The function you are looking for is get():
assign ("abc",5)
get("abc")
Confirming that the memory address is identical:
getabc <- get("abc")
pryr::address(abc) == pryr::address(getabc)
# [1] TRUE
Reference: R FAQ 7.21 How can I turn a string into a variable?
Use x=as.name("string"). You can then use x to refer to the variable named string, for example by evaluating it with eval(x).
I don't know if this answers your question correctly.
Use strsplit to parse your input and, as Greg mentioned, assign to create the variables.
original_string <- c("x=123", "y=456")
pairs <- strsplit(original_string, "=")
lapply(pairs, function(x) assign(x[1], as.numeric(x[2]), envir = globalenv()))
ls()
assign is good, but I have not found a function for referring back to the variable you've created in an automated script. (as.name seems to work the opposite way). More experienced coders will doubtless have a better solution, but this solution works and is slightly humorous perhaps, in that it gets R to write code for itself to execute.
Say I have just assigned value 5 to x (var.name <- "x"; assign(var.name, 5)) and I want to change the value to 6. If I am writing a script and don't know in advance what the variable name (var.name) will be (which seems to be the point of the assign function), I can't simply put x <- 6 because var.name might have been "y". So I do:
var.name <- "x"
#some other code...
assign(var.name, 5)
#some more code...
#write a script file (1 line in this case) that works with whatever variable name
write(paste0(var.name, " <- 6"), "tmp.R")
#source that script file
source("tmp.R")
#remove the script file for tidiness
file.remove("tmp.R")
x will be changed to 6, and if the variable name was anything other than "x", that variable will similarly have been changed to 6.
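The same update can probably be done without a temporary script file, since assign() overwrites an existing object and get() reads one back by name. A small sketch, using the var.name variable from the answer above:

```r
var.name <- "x"
assign(var.name, 5)

# Overwrite the value through the name stored in var.name.
assign(var.name, 6)

# Read-modify-write: get() fetches the current value by name.
assign(var.name, get(var.name) + 1)
get(var.name)
```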
I was working with this a few days ago and noticed that you sometimes need the get() function to print the contents of your variable.
For example:
varnames = c('jan', 'feb', 'march')
file_names = list.files('path to multiple csv files saved on drive', full.names = TRUE)
assign(varnames[1], read.csv(file_names[1])) # This will assign the variable
From there, if you try to print the variable varnames[1], it returns 'jan'.
To work around this, you need to do
print(get(varnames[1]))
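A self-contained version of this pattern might look like the sketch below. The CSV files are generated in a temporary directory purely so the example can run; the file contents and the name all_months are invented.

```r
varnames <- c("jan", "feb")

# Create two small CSV files in a temporary directory (illustration only).
tmp_dir <- tempfile("csv_demo")
dir.create(tmp_dir)
for (nm in varnames) {
  write.csv(data.frame(month = nm, value = 1:2),
            file.path(tmp_dir, paste0(nm, ".csv")), row.names = FALSE)
}
file_names <- file.path(tmp_dir, paste0(varnames, ".csv"))

# Assign each file to a variable whose name comes from varnames.
for (i in seq_along(varnames)) {
  assign(varnames[i], read.csv(file_names[i]))
}

print(varnames[1])        # prints the string "jan"
print(get(varnames[1]))   # prints the data frame stored in jan

# mget() fetches several objects by name at once, as a named list.
all_months <- mget(varnames)
```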
If you want to convert a string to a variable inside the body of a function, but want the variable to be global:
test <- function() {
  do.call("<<-",list("vartest","xxx"))
}
test()
vartest
[1] "xxx"
Maybe I didn't understand your problem right, because of the simplicity of your example. To my understanding, you have a series of instructions stored in character vectors, and those instructions are very close to being properly formatted, except that you'd like to cast the right member to numeric.
If my understanding is right, I would like to propose a slightly different approach, that does not rely on splitting your original string, but directly evaluates your instruction (with a little improvement).
original_string <- "variable_name=\"10\"" # Your original instruction, but with an actual numeric on the right, stored as character.
library(magrittr) # Or library(tidyverse), but that seems like overkill if the point is just to import the pipe operator
eval(parse(text=paste(eval(original_string), "%>% as.numeric")))
print(variable_name)
#[1] 10
Basically, what we are doing is that we 'improve' your instruction variable_name="10" so that it becomes variable_name="10" %>% as.numeric, which is an equivalent of variable_name=as.numeric("10") with magrittr pipe-stream syntax. Then we evaluate this expression within current environment.
Hope that helps someone who'd wander around here 8 years later ;-)
Other than assign, another way to assign a value to a string-named object is to access .GlobalEnv directly.
# Equivalent
assign('abc',3)
.GlobalEnv$'abc' = 3
Accessing .GlobalEnv gives some flexibility, and my use case was assigning values to a string-named list. For example,
.GlobalEnv$'x' = list()
.GlobalEnv$'x'[[2]] = 5 # works
var = 'x'
.GlobalEnv[[glue::glue('{var}')]][[2]] = 5 # programmatic names from glue()

Parameterize name of output dataframe in global environment, assigned to from a function

Trying to pass into a function what I want it to name the dataframe it creates, then save it to global environment.
I am trying to automate creating dataframes that are subsets of other dataframes by filtering for a value; since I'm creating 43 of these, I'm writing a function that can automatically:
a) subset rows containing a certain string into its own data.frame then
b) name a dataframe after that string and save it to my global environment. (The string in a) is also the suffix I want it to name the data.frame after in b))
I can do a) fine but am having trouble with b).
Say I have a dataset which includes a column named "Team" (detailing whose team that member belongs to):
original.df <- read_csv("../original_data_set")
I create a function to split that dataset according to values in one of its columns...
split.function <- function(string){
  x <- original.df
  as.name(string) <<- filter(x, str_detect(`Team`, string))
}
... then save the dataframe with the name:
split.function('Team.Curt')
I keep getting:
> Error in as.name(x) <<- filter(y, str_detect(`Receiving Committee`, x)) :
object 'x' not found
But I just want to see Team.Curt saved as a data.frame in my global environment when I do this with rows including the term Team.Curt
You can use assign to create objects based on a string:
split.function <- function(string){
  x <- original.df
  assign(string, filter(x, str_detect(`Team`, string)), envir = .GlobalEnv)
}
Here, envir = .GlobalEnv is used to assign the value to the global environment.
Both <- and <<- assignments require that the statement hardcodes the object name. Since you want to parameterize the name, as in your case, you must use assign().
<<- is merely a variant of <- that can be used inside a function, and does a bottom-up search of environments until it either reaches the top (.GlobalEnv) or finds an existing object of that name. In your case that's unnecessary and slightly dangerous, since if an object of that name existed in some environment halfway up the hierarchy, you'd pick it up and assign to it instead.
So just use assign(..., envir = .GlobalEnv) instead.
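To make the search behaviour of <<- concrete, here is a small sketch; the function and object names are invented. The inner assignment modifies the object in the enclosing function rather than creating one in .GlobalEnv.

```r
enclosing_demo <- function() {
  team_data <- "defined in the enclosing function"
  inner <- function() {
    # <<- searches upward from inner(): it finds team_data in the enclosing
    # function's environment and modifies that one, not a global object.
    team_data <<- "modified by inner()"
  }
  inner()
  team_data
}

result <- enclosing_demo()
result

# No global object called team_data was created by the call above.
created_globally <- exists("team_data", envir = globalenv(), inherits = FALSE)
created_globally
```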
But both <<- or assigning directly into .GlobalEnv within functions are strongly discouraged as being disasters in waiting, or "life by a volcano" (burns-stat.com/pages/Tutor/R_inferno.pdf). See the caveats at Assign multiple objects to .GlobalEnv from within a function. tidyverse is probably a better approach for managing multiple dataframes.
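One way to avoid global assignment altogether is to keep the subsets in a single named list instead of creating 43 separate objects. The sketch below uses base R's grepl with fixed = TRUE in place of str_detect; the sample data and the name by_team are invented for illustration.

```r
# Made-up stand-in for the data set in the question.
original.df <- data.frame(
  Team  = c("Team.Curt", "Team.Ann", "Team.Curt"),
  Score = c(1, 2, 3),
  stringsAsFactors = FALSE
)

teams <- unique(original.df$Team)

# One data frame per team, collected in a list named after the team strings.
by_team <- lapply(teams, function(s) {
  original.df[grepl(s, original.df$Team, fixed = TRUE), , drop = FALSE]
})
names(by_team) <- teams

by_team[["Team.Curt"]]
```

Each subset is then reachable by name as by_team[["Team.Curt"]], and the whole collection can be passed to lapply for further processing.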

How to open an R data file in R window

I have some data in R that I intend to analyze. However, the file is not displaying the data. Instead, it only shows the name of a variable in the data. The following is the procedure I used to load the data and the output produced.
load("C:\Users\user\AppData\Local\Temp\1_29_923-Macdonell.RData")
data=load("C:\Users\user\AppData\Local\Temp\1_29_923-Macdonell.RData")
data
[1] "HeightFinger"
How do I get to view the data?
If you read ?load, it says that the return value of load is:
A character vector of the names of objects created, invisibly.
This suggests (but admittedly does not state) that the true work of the load command is by side-effect, in that it inserts the objects into an environment (defaulting to the current environment, often but not always .GlobalEnv). You should immediately have access to them from where you called load(...).
For instance, if I can guess at variables you might have in your rda file:
x
# Error: object 'x' not found
# either one of these on windows, NOT BOTH
dat = load("C:\\Users\\user\\AppData\\Local\\Temp\\1_29_923-Macdonell.RData")
dat = load("C:/Users/user/AppData/Local/Temp/1_29_923-Macdonell.RData")
dat
# [1] "x" "y" "z"
x
# [1] 42
If you want them to be not stored in the current environment, you can set up an environment to place them in. (I use parent=emptyenv(), but that's not strictly required. There are some minor ramifications to not including that option, none of them earth-shattering.)
myenv <- new.env(parent = emptyenv())
dat = load("C:/Users/user/AppData/Local/Temp/1_29_923-Macdonell.RData",
           envir = myenv)
dat
# [1] "x" "y" "z"
x
# Error: object 'x' not found
ls(envir = myenv)
# [1] "x" "y" "z"
From here you can get at your data in any number of ways:
ls.str(myenv) # similar in concept to str() but for environments
# x : num 42
# y : num 1
# z : num 2
myenv$x
# [1] 42
get("x", envir = myenv)
# [1] 42
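Applied to the situation in the question, where load() reported a single object named HeightFinger, the data can be viewed either by that name or through the character vector that load() returned. The sketch below builds a stand-in file first so that it can run anywhere; the columns of HeightFinger are invented.

```r
# Stand-in for the real file: save a small data frame under the same name.
HeightFinger <- data.frame(height = c(60, 65), finger = c(10.5, 11.2))
path <- tempfile(fileext = ".RData")
save(HeightFinger, file = path)
rm(HeightFinger)

# load() restores the object and returns only its name.
loaded_names <- load(path)
loaded_names

# The data itself is available under the restored name ...
head(HeightFinger)

# ... or via get(), using the name that load() returned.
head(get(loaded_names[1]))
```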
Side note:
You may have noticed that I used dat as my variable name instead of data. Though you are certainly allowed to use that, it can bite you if you use variable names that match existing variables or functions. For instance, all of your code will work just fine as long as you load your data. If, however, you run some of your code without pre-loading your objects into your data variable, you'll likely get an error such as:
mean(data$x)
# Error in data$x : object of type 'closure' is not subsettable
That error message is not immediately self-evident. The problem is that if not previously defined as in your question, then data here refers to the function data. In programming terms, a closure is a special type of function, so the error really should have said:
# Error in data$x : object of type 'function' is not subsettable
meaning that though dat can be subsetted and dat$x means something, you cannot use the $ subset method on a function itself. (You can't do mean$x when referring to the mean function, for example.) Regardless, even though this here-modified error message is less confusing, it is still not clearly telling you what/where the problem is located.
Because of this, many seasoned programmers will suggest you use unique variable names (perhaps more than just x :-). If you use my suggestion and name it dat instead, then the mistake of not preloading your data will instead error with:
mean(dat$x)
# Error in mean(dat$x) : object 'dat' not found
which is a lot more meaningful and easier to troubleshoot.
There are two ways to save R objects, and you've got them mixed up. In the first way, you save() any collection of objects in an environment to a file. When you load() that file, those objects are re-created with their original names in your current environment. This is how R saves and restores workspaces.
The second way stores (serializes) a single R object into a file with the saveRDS() function, and recreates it in your environment with the readRDS() function. If you don't assign the results of readRDS(), it'll just print to your screen and drift away.
Examples below:
# Make a simple dataframe
testdf <- data.frame(x = 1:10,
y = rnorm(10))
# Save it out using the save() function
savedir <- tempdir()
savepath <- file.path(savedir, "saved.Rdata")
save(testdf, file = savepath)
# Delete it
rm(testdf)
# Load without assigning - and it's back in your environment
load(savepath)
testdf
# But if you assign the results of load, you just get the name of the object
wrong <- load(savepath)
wrong
# Compare with the RDS:
rds_path <- file.path(savedir, "testdf.rds")
saveRDS(testdf, file = rds_path)
rm(testdf)
testdf <- readRDS(file = rds_path)
testdf
Why the two different approaches? The save()-environment approach is good for creating a checkpoint of your entire environment that you can restore later - that's what R uses it for - but that's about it. It's too easy for such an environment to get cluttered, and if an object you load() has the same name as an object in your current environment, it will overwrite that object:
testdf$z <- "blah"
load(savepath)
testdf # testdf$z is gone
The RDS method lets you assign the name on read, as you're looking to do here. It's a little more annoying to save multiple objects, sure, but you probably shouldn't be saving objects very often anyway - recreating objects from scratch is the best way to ensure that your R code does what you think it does.

input variables in R functions from third party libraries

I am a reasonably proficient python programmer messing around with some R.
On this website, for the third party library ICC, I'm confused about input variables for the function ICCest.
Located here:
http://www.inside-r.org/packages/cran/ICC/docs/ICCest
I can use:
ICCest(Chick, weight, data=ChickWeight, CI.type="S")
And I got this to work. Chick and weight are column names for the data frame variable called ChickWeight. All is well and good.
Except, what type of variables are "Chick" and "weight"? They aren't in my R namespace. They aren't strings because they don't have quotes around them.
Doing:
ICCest(Chick, "weight", data=ChickWeight, CI.type="S")
yields:
In ICCest(Chick, "weight", data = ChickWeight, CI.type = "S") :
passing a character string to 'y' is deprecated since ICC vesion 2.3.0 and will not be supported in future versions. The argument to 'y' should either be an unquoted column name of 'data' or an object
So again, in my nice friendly Python land you can't pass in unquoted character strings that are not objects in your namespace, so I am quite confused.
What is happening here?
You can take a look at the function's code by typing ICCest (without the parentheses):
> ICCest
Object with tracing code, class "functionWithTrace"
Original definition:
function (x, y, data = NULL, alpha = 0.05, CI.type = c("THD", "Smith")){
    square <- function(z) {
        z^2
    }
    icall <- list(y = substitute(y), x = substitute(x))
    if (is.character(icall$y)) {
        warning("passing a character string to 'y' is deprecated since ICC vesion 2.3.0 and will not be supported in future versions. The argument to 'y' should either be an unquoted column name of 'data' or an object")
        if (missing(data))
            stop("Supply either the unquoted name of the object containing 'y' or supply both 'data' and then 'y' as an unquoted column name to 'data'")
        icall$y <- eval(as.name(y), data, parent.frame())
    } ...
What happens after the square function block is that the inputs are captured with substitute and stored in icall as a parse tree, which you can think of as a set of unevaluated expressions. So there's no error when you pass plain weight without the quotation marks, because at this point there hasn't been an attempt to evaluate the expressions yet. (I'm a bit unsure about this last statement. I hope someone can confirm whether it is technically correct.)
Inside the if block (where your warning is raised), you can see that they are using eval to update the local variable icall$y. What eval does is essentially evaluating an expression within an environment. Specifically, in the environment of a dataframe, the column names are considered part of the environment.
Now it says in the documentation, that eval takes an expression as its first input. This is why y is cast to an object with as.name before being passed to eval (remember that we are in the if block for string input y)
eval(expr, envir = parent.frame(),...)
And expressions and strings are different in R. So in the last line of code shown above, the y input (here, weight) is being evaluated in the data environment --which, here, is ChickWeight.
To get a better feeling, try this:
> eval(weight, ChickWeight)
Error in eval(weight, ChickWeight) : object 'weight' not found
But if you make an unevaluated expression first, it will work:
> expr <- quote(weight)
> eval(expr, ChickWeight)
Here, quote is doing roughly the same thing as substitute in the 4th line of the function. See the documentation for quote and substitute for more.
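The same mechanism can be sketched in a small function of your own. The name col_mean is hypothetical and this is not how ICCest is implemented beyond the substitute/eval idea shown above: the argument is captured unevaluated and then evaluated with the data frame's columns in scope.

```r
col_mean <- function(col, data) {
  # Capture the argument as an unevaluated expression (e.g. the symbol weight).
  expr <- substitute(col)
  # Evaluate it with the columns of data visible, falling back to the caller.
  values <- eval(expr, data, parent.frame())
  mean(values)
}

# weight is not an object in the workspace; it is found as a column of ChickWeight.
col_mean(weight, ChickWeight)
```

ChickWeight ships with R in the datasets package, so this should run in a fresh session.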
Why are you passing your y as a quoted string? The function doesn't appear to require quoted strings for variable names. Doing
str(ChickWeight)
will give you the types for the variables. They aren't in a 'name space' because they are variable names in the data.frame ChickWeight.
