R: creating a function that creates names of variables

I want to create a function that creates a variable whose name depends on its argument.
I tried:
a <- function(x){ assign(paste("train",x,sep=""),4) }
But when I call a(3), for example, nothing happens. What's wrong?
Thanks for your help.
Edit: I will be more specific, as requested.
I want to do feature selection: the idea is to use a function to generate different subsets of features, generate a training set for each subset, and then pass the output of this function to another function (say, lm()) to test each training set. The number of subsets/training sets is variable, and I don't know how to store them so that I can reuse them later.

You need to assign the variable within the global environment (or whichever environment you want the variable to live in).
> a <- function(x) { assign(paste('train', x, sep = ''), 4, envir = .GlobalEnv) }
> ls()
[1] "a"
> a(1)
> ls()
[1] "a" "train1"
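For the feature-selection use case described in the question's edit, a named list is often easier to reuse than variables created with assign(). The following is only a minimal sketch: the built-in mtcars data, the feature subsets, and the helper name make_train are assumptions for illustration and do not come from the question.

```r
# Hypothetical feature subsets; in practice these would come from
# whatever selection procedure generates them.
subsets <- list(
  train1 = c("wt", "hp"),
  train2 = c("wt", "qsec", "drat")
)

# Build one training set per subset, always keeping the response column.
make_train <- function(features, data = mtcars, response = "mpg") {
  data[, c(response, features), drop = FALSE]
}
train_sets <- lapply(subsets, make_train)

# Fit a model on each stored training set; list names are preserved.
fits <- lapply(train_sets, function(d) lm(mpg ~ ., data = d))
names(fits)
```

Because everything lives in one list, the number of training sets can vary freely and each one can be retrieved later by name, for example train_sets[["train2"]].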

Related

Global assignment of list elements within a function, using a variable name for the list

The title is a bit loose. I would like to create a function that takes a list name as a variable, and assigns elements to that list. So far, I can access the list's elements via its name in the my_fun function. I can then modify the list within the function (using the function parameter), and print to verify that the assignment has worked. Of course, I'm not saving any of this to the global environment, so when I call the list outside of the function, it hasn't changed.
my_list <- list('original_element')
my_fun <- function(some_list) {
# check that I can access the list using the parameter name
print(some_list[[1]])
# modify the list
some_list[[1]] <- 'element_assigned_in_function'
# check that the list was modified
print(some_list[[1]])
}
my_fun(my_list)
# [1] "original_element"
# [1] "element_assigned_in_function"
# of course, my_list hasn't been modified in the global environment
my_list
# [[1]]
# [1] "original_element"
My question is, how could you assign new elements to this list within the function and store them in the global environment? If I simply try to make the assignment global with <<-, the following error is thrown:
Error in some_list[[1]] <<- "element_assigned_in_function" :
object 'some_list' not found
I've tried to use assign and set envir = .GlobalEnv, but that doesn't work and I don't think assign can be used with list elements. I suspect that there may be some quoting/unquoting/use of expressions to get this to work, but I haven't been able to figure it out. Thank you.
First of all, to be clear, I do not encourage assigning values to global variables from inside a function. Ideally, a function should return the value you want to change, and the caller should assign it. However, purely for demonstration, here is a way to change the contents of the list from inside the function:
my_fun <- function(some_list) {
list_name <- deparse(substitute(some_list))
some_list[[1]] <- 'element_assigned_in_function'
assign(list_name, some_list, .GlobalEnv)
}
my_list <- list('original_element')
my_fun(my_list)
my_list
#[[1]]
#[1] "element_assigned_in_function"
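The recommended alternative mentioned at the start of this answer, returning the modified list and reassigning it in the caller, could look like the sketch below. The function name set_first and its value argument are hypothetical names chosen for illustration.

```r
# Return the modified list instead of writing into .GlobalEnv.
set_first <- function(some_list, value) {
  some_list[[1]] <- value
  some_list
}

my_list <- list("original_element")
# The caller decides where the result is stored.
my_list <- set_first(my_list, "element_assigned_in_function")
my_list[[1]]
```

This keeps the function free of side effects, so it behaves the same regardless of which environment it is called from.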

Assign a variable in R using another variable

I have to run tens of different permutations with the same structure but different base names for the output. To avoid having to keep replacing the whole character names within each formula, I was hoping to create a variable and then use the paste function to build the name of the output from it.
Example:
var<-"Patient1"
variable(paste0("cells_", var, sep="")) <- WhichCells(object=test, expression = test > 0, idents=c("patient1","patient2"))
The expected output would be a variable called "cells_Patient1"
Then for subsequent runs, I would just copy and paste these 2 lines and change var <- "Patient1" to var <- "Patient2".
(Please note that I am oversimplifying the WhichCells step above: it entails about 10 steps, and I would rather not have to replace "Patient1" with "Patient2" using search and replace.)
Unfortunately, I am unable to create the variable "cells_Patient1" using the above command. I am getting the following error:
Error in variable(paste0("cells_", var, sep = "")) <-
WhichCells(object = test, : target of assignment expands to
non-language object
Browsing Stack Overflow, I couldn't find a solution. My understanding of the error is that R can't assign an object to a variable whose name is not a constant. Is there a way around this?
1) Use assign like this:
var <- "Patient1"
assign(paste0("cells_", var), 3)
cells_Patient1
## [1] 3
2) Environment: this also works.
e <- .GlobalEnv
e[[ paste0("cells_", var) ]] <- 3
cells_Patient1
3) List: alternatively, it might be better to store these values in a list:
cells <- list()
cells[[ var ]] <- 3
cells[[ "Patient1" ]]
## [1] 3
Then we could easily iterate over all such variables. Replace sqrt with any suitable function.
lapply(cells, sqrt)
## $Patient1
## [1] 1.732051
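Building on option 3, the copy-and-paste step per patient can be replaced by a single loop over the patient names. This is a hedged sketch: process_patient is a hypothetical stand-in for the multi-step WhichCells() pipeline, which is not reproduced here.

```r
patients <- c("Patient1", "Patient2")

# Hypothetical placeholder for the real per-patient pipeline.
process_patient <- function(id) {
  paste0("result_for_", id)
}

# Naming the input vector makes lapply() return a list named by patient.
cells <- lapply(stats::setNames(patients, patients), process_patient)
cells[["Patient2"]]
```

Adding a patient then only requires extending the patients vector, not editing the pipeline code.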

Make sure that R functions don't use global variables

I'm writing some code in R and have around 600 lines of functions right now and want to know if there is an easy way to check, if any of my functions is using global variables (which I DON'T want).
For example it could give me an error if sourcing this code:
example_fun<-function(x){
y=x*c
return(y)
}
x=2
c=2
y=example_fun(x)
WARNING: Variable c is accessed from global workspace!
Solution to the problem with the help of @Hugh:
install.packages("codetools")
library("codetools")
x = as.character(lsf.str())
which_global=list()
# seq_along() is safe even when no functions are defined (x is empty)
for (i in seq_along(x)) {
  which_global[[x[i]]] <- codetools::findGlobals(get(x[i]), merge = FALSE)$variables
}
Results will look like this:
> which_global
$arrange_vars
character(0)
$cal_flood_curve
[1] "..count.." "FI" "FI_new"
$create_Flood_CuRve
[1] "y"
$dens_GEV
character(0)
...
For a given function like example_fun, you can use package codetools:
codetools::findGlobals(example_fun, merge = FALSE)$variables
#> [1] "c"
To collect all functions, see the question "Is there a way to get a vector with the name of all functions that one could use in R?"
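To get the failure behaviour asked for in the question, the findGlobals() result can be wrapped in a small check that stops when a function refers to free variables. This is only a sketch; assert_no_globals is a hypothetical helper name, not part of codetools. Note that it would also flag deliberate uses of constants such as pi.

```r
# Stop with an error if a function refers to free (non-local) variables.
assert_no_globals <- function(fun) {
  vars <- codetools::findGlobals(fun, merge = FALSE)$variables
  if (length(vars) > 0) {
    stop("free variables used: ", paste(vars, collapse = ", "))
  }
  invisible(TRUE)
}

uses_global <- function(x) {
  y <- x * c   # 'c' is neither an argument nor assigned locally
  y
}
no_global <- function(x) {
  y <- x * 2
  y
}

assert_no_globals(no_global)   # passes silently
result <- tryCatch(assert_no_globals(uses_global),
                   error = function(e) conditionMessage(e))
result
```

A check like this could be run over every name returned by lsf.str() after sourcing a script.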
What about emptying your global environment and running the function? If an object from the global environment were to be used in the function, you would get an error, e.g.
V <- 100
my.fct <- function(x){return(x*V)}
> my.fct(1)
[1] 100
#### clearing global environment & re-running my.fct <- function... ####
> my.fct(1)
Error in my.fct(1) : object 'V' not found

Why does my R function have knowledge of variables that are not given as arguments? [duplicate]

Is there any way to throw a warning (and fail) if a global variable is used within an R function? I think that is much safer and prevents unintended behaviour, e.g.
sUm <- 10
sum <- function(x,y){
sum = x+y
return(sUm)
}
due to the "typo" in return the function will always return 10. Instead of returning the value of sUm it should fail.
My other answer is more about what approach you can take inside your function. Now I'll provide some insight on what to do once your function is defined.
To ensure that your function is not using global variables when it shouldn't be, use the codetools package.
library(codetools)
sUm <- 10
f <- function(x, y) {
sum = x + y
return(sUm)
}
checkUsage(f)
This will print the message:
<anonymous> local variable ‘sum’ assigned but may not be used (:1)
To see if any global variables were used in your function, you can compare the output of the findGlobals() function with the variables in the global environment.
> findGlobals(f)
[1] "{" "+" "=" "return" "sUm"
> intersect(findGlobals(f), ls(envir=.GlobalEnv))
[1] "sUm"
That tells you that the global variable sUm was used inside f() when it probably shouldn't have been.
There is no way to permanently change how variables are resolved because that would break a lot of functions. The behavior you don't like is actually very useful in many cases.
If a variable is not found in a function, R will check the environment where the function was defined for such a variable. You can change this environment with the environment() function. For example
environment(sum) <- baseenv()
sum(4,5)
# Error in sum(4, 5) : object 'sUm' not found
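This works because baseenv() is the environment of the base package: it contains only base R objects, and its enclosure is the empty environment, so variables in the global environment are not visible from it. However, note that you also lose access to your own functions with this method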
myfun<-function(x,y) {x+y}
sum <- function(x,y){sum = myfun(x, y); return(sUm)}
environment(sum)<-baseenv()
sum(4,5)
# Error in sum(4, 5) : could not find function "myfun"
because in a functional language such as R, functions are just regular variables that are also scoped in the environment in which they are defined and would not be available in the base environment.
You would manually have to change the environment for each function you write. Again, there is no way to change this default behavior because many of the base R functions and functions defined in packages rely on this behavior.
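The effect of resetting a function's environment can be checked without aborting a script by catching the error. A minimal sketch, using the hypothetical function name add instead of masking base sum():

```r
sUm <- 10

add <- function(x, y) {
  total <- x + y
  sUm            # typo: refers to the global, not the local 'total'
}
add(4, 5)        # returns 10, the global value

# Re-point the function at the base environment so that lookups
# no longer pass through the environment where it was defined.
environment(add) <- baseenv()
msg <- tryCatch(add(4, 5), error = function(e) conditionMessage(e))
msg
```

After the reassignment the call fails and msg holds the error text naming the missing object, while operators such as + still resolve because they live in base.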
Using get is a way:
sUm <- 10
sum <- function(x,y){
sum <- x+y
#with inherits = FALSE below the variable is only searched
#in the specified environment in the envir argument below
get('sUm', envir = environment(), inherits=FALSE)
}
Output:
> sum(1,6)
Error in get("sUm", envir = environment(), inherits = FALSE) :
object 'sUm' not found
Even with the correct name in the get() call, the lookup is still restricted to the function's own environment. If two variables share a name, one inside the function and one in the global environment, the function will always use the local one and never the global one:
sum <- 10
sum2 <- function(x,y){
sum <- x+y
get('sum', envir = environment(), inherits=FALSE)
}
> sum2(1,7)
[1] 8
You can check whether the variable's name appears in the list of global variables. Note that this is imperfect if the global variable in question has the same name as an argument to your function.
if (deparse(substitute(var)) %in% ls(envir=.GlobalEnv))
stop("Do not use a global variable!")
The stop() function will halt execution of the function and display the given error message.
Another way (or style) is to keep all global variables in a special environment:
with( globals <- new.env(), {
# here define all "global variables"
sUm <- 10
mEan <- 5
})
# or add a variable by using $
globals$another_one <- 42
Then the function won't be able to get them:
sum <- function(x,y){
sum = x+y
return(sUm)
}
sum(1,2)
# Error in sum(1, 2) : object 'sUm' not found
But you can always use them with globals$:
globals$sUm
[1] 10
To manage the discipline, you can check if there is any global variable (except functions) outside of globals:
setdiff(ls(), union(lsf.str(), "globals"))

how to access the name of a dataframe in R

My question is: how can I get the name of a data frame, not its column names?
For example, if d is my data frame, I want a function that returns the exact name "d" rather than the result of names(d).
Thank you so much!
Update:
The reason why I am asking this is because I want to write a function to generate several plots at one time. I need to change the main of the plots in order to distinguish them. My function looks like
fct=function(data){
cor_Max = cor(data)
solution=fa(r = cor_Max, nfactors = 1, fm = "ml")
return(fa.diagram(solution,main=names(data)))
}
How can I change the main in the function correspondingly to the data's name?
You can use the fact that R allows you to obtain the text representation of an expression:
getName <- function(x) deparse(substitute(x))
print(getName(d))
# [1] "d"
objects() will list all of the objects in your environment. Note that names(), as used in your question, provides the column names of the data frame.
I read your question to say that you are looking for the name of the data frame, not the column names. So you're looking for the name passed to the data argument of fct. If so, perhaps something like the following would help
fct <- function(data){
cor_Max <- cor(data)
# as.character(sys.call()) returns the function name followed by the argument values
# so the value of the "data" argument is the second element in the char vector
main <- as.character(sys.call())[2]
print(main)
}
This is a bit ad hoc but maybe it would work for your case.
The most accepted way to do this is as Robert showed, with deparse(substitute(x)).
But you could try something with match.call()
f <- function(x){
m <- match.call()
list(x, as.character(m))
}
> y <- 25
> f(y)
# [[1]]
# [1] 25
#
# [[2]]
# [1] "f" "y"
Now you've got both the value of y and its name, "y" inside the function environment. You can use as.character(m)[-1] to retrieve the object name passed to the argument x
So, your function can use this as a name, for example, like this:
fct <- function(data){
m <- match.call()
plot(cyl ~ mpg, data, main = as.character(m)[-1])
}
> fct(mtcars)
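The same idea works with deparse(substitute()), which the answers above describe as the most accepted approach. The sketch below only builds the title string so that it can be run without any plotting packages; title_for and the "Correlations for" prefix are hypothetical choices for illustration.

```r
# Build a plot title from the expression passed as `data`.
title_for <- function(data) {
  data_name <- deparse(substitute(data))
  paste("Correlations for", data_name)
}

title_for(mtcars)
```

Inside the original fct(), the resulting string could be passed as the main argument of the plotting call. This captures the expression as written at the call site, so it is most useful when the function is called directly with a named object.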
