Character to unquoted - r

Sometimes data files come in .rdata objects. These are annoying compared to .rds files because the objects have predefined names. In my case, I want to rename the object automatically and get rid of the wrongly named version. Simple somewhat contrived example:
#make a new iris with a bad name
badnameiris = iris
#save it to a file
save(badnameiris, file = "iris.rdata")
#remove badname version from global envir
rm(badnameiris)
#read iris from file
irisname = load("iris.rdata")
#this variable is not iris, but the name of the variable it was assigned to
irisname
[1] "badnameiris"
#it's easy to use the right name with get()
goodnameiris = get(irisname)
#but harder to get rid of the wrong one with rm()
rm(irisname)
The last line does not work as intended because rm() expects a bare name as input, so it removes the character vector irisname itself rather than the object it names. I realize one can actually use the list argument in rm(), but suppose one could not.
How does one in general convert from character to unquoted for these purposes?
I tried the rlang functions, but these are for non-standard evaluation as used in a tidyverse context. I tried as.name(), as suggested here. That does not work either. Most questions I could find on this topic relate to the tidyverse, but I'm working in a base R context.
(An alternative solution above is to make a function that utilizes the destruction of the local environment to remove the unwanted copy of the object.)

Just use do.call:
x <- 1
s <- "x"
do.call(rm, list(s))
ls()
#[1] "s"
Or compute on the language:
eval(bquote(rm(.(s))))
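To make the "compute on the language" idea a bit more concrete, here is a minimal sketch. It works in a scratch environment (the name e is my own choice, not from the question) so nothing in your workspace gets touched. as.name() converts the string to a symbol, and bquote() splices that symbol into an unevaluated call, which eval() then runs.

```r
# Scratch environment (hypothetical name `e`) holding two objects
e <- new.env()
assign("x", 1, envir = e)
assign("y", 2, envir = e)
s <- "x"

# as.name() turns the string into a symbol; bquote() splices it into a call
cl <- bquote(rm(.(as.name(s)), envir = e))
print(cl)   # shows the constructed call with a bare name

# Evaluating the call removes x from the scratch environment
eval(cl)

remaining <- ls(e)
```

The same pattern should work for any function that insists on a bare name, as long as the call is built first and evaluated afterwards.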

Related

Is there a way to delete existing variables in my R environment via the terminal? [duplicate]

I would like to remove some data from the workspace. I know the "Clear All" button will remove all data. However, I would like to remove just certain data.
For example, I have these data frames in the data section:
data
data_1
data_2
data_3
I would like to remove data_1, data_2 and data_3, while keeping data.
I tried data_1 <- data_2 <- data_3 <- NULL, which does remove the data (I think), but still keeps it in the workspace area, so it is not fully what I would like to do.
You'll find the answer by typing ?rm
rm(data_1, data_2, data_3)
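As for why assigning NULL did not do what you wanted: the name stays bound, only its value changes. A small sketch, run in a scratch environment (e is a name I made up for illustration) so it does not touch your workspace:

```r
# Scratch environment standing in for the workspace
e <- new.env()
assign("data_1", data.frame(a = 1), envir = e)

# Assigning NULL keeps the name bound; it now simply points at NULL
assign("data_1", NULL, envir = e)
still_there <- exists("data_1", envir = e, inherits = FALSE)

# rm() removes the binding itself
rm("data_1", envir = e)
gone <- !exists("data_1", envir = e, inherits = FALSE)
```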
A useful way to remove a whole set of similarly named objects:
rm(list = ls()[grep("^tmp", ls())])
thereby removing all objects whose name begins with the string "tmp".
Edit: Following Gsee's comment, making use of the pattern argument:
rm(list = ls(pattern = "^tmp"))
Edit: Answering Rafael's comment, one way to retain only a subset of objects is to name the data you want to retain with a specific pattern. For example, if you wanted to remove all objects whose names do not start with paper, you would issue the following command:
rm(list = grep("^paper", ls(), value = TRUE, invert = TRUE))
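If you want to try the pattern-based commands without risking your real workspace, something like the following sketch may help. The environment e and the object names are made up for the demonstration; the only change from the commands above is the extra envir argument.

```r
# Scratch environment with a few dummy objects
e <- new.env()
for (nm in c("tmp_a", "tmp_b", "paper_1", "other")) assign(nm, 0, envir = e)

# Remove everything whose name starts with "tmp"
rm(list = ls(e, pattern = "^tmp"), envir = e)
after_tmp <- ls(e)

# Keep only objects whose name starts with "paper"
rm(list = grep("^paper", ls(e), value = TRUE, invert = TRUE), envir = e)
after_keep <- ls(e)
```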
The following command removes all objects, including those whose names begin with a dot:
rm(list=ls(all=TRUE))
In RStudio, ensure the Environment tab is in Grid (not List) mode.
Tick the object(s) you want to remove from the environment.
Click the broom icon.
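You can use the apropos function, which finds objects by partial name. Note that apropos() searches every attached package as well as the workspace, so rm() will warn about any match that does not live in your workspace.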
rm(list = apropos("data_"))
Use the following command
remove(list=c("data_1", "data_2", "data_3"))
If you want to keep just one of a group of variables, you can build a character vector of the other names and remove those. The rm function can be used to remove all the variables apart from "data". Here is the script:
0->data
1->data_1
2->data_2
3->data_3
#check variables in workspace
ls()
rm(list=setdiff(ls(), "data"))
#check remaining variables in workspace after deletion
ls()
#note: if you just use rm(list) then R will attempt to remove the "list" variable.
list=setdiff(ls(), "data")
rm(list)
ls()
paste0("data_",seq(1,3,1))
# makes multiple data.frame names with sequential number
rm(list=paste0("data_",seq(1,3,1)))
# above code removes data_1~data_3
If you're using RStudio, please consider never using the rm(list = ls()) approach!* Instead, you should build your workflow around frequently employing the Ctrl+Shift+F10 shortcut to restart your R session. This is the fastest way to both nuke the current set of user-defined variables AND to clear loaded packages, devices, etc. The reproducibility of your work will increase markedly by adopting this habit.
See this excellent thread on RStudio Community (h/t #kierisi) for a more thorough discussion (the main gist is captured by what I've stated already).
I must admit my own first few years of R coding featured script after script starting with the rm "trick" -- I'm writing this answer as advice to anyone else who may be starting out their R careers.
*of course there are legitimate uses for this -- much like attach -- but beginning users will be much better served (IMO) crossing that bridge at a later date.
To clear all data:
click on Misc > Remove all objects.
You're good to go.
To clear the console:
click on Edit > Clear console.
No need for any code.
Adding one more way, using ls() and remove()
ls() returns a vector of character strings giving the names of the objects in the specified environment.
Create a list of objects you want to remove from the environment using ls() and then use remove() to remove it.
remove(list = ls()[ls() != "data"])
You can also use the tidyverse (stringr):
library(stringr) # provides str_subset() and re-exports %>%
# to remove specific object(s)
rm(list = ls() %>% str_subset("xxx"))
# or to keep specific object(s)
rm(list = setdiff(ls(), ls() %>% str_subset("xxx")))
Maybe this can help as well
remove(list = c(ls()[!ls() %in% c("what", "to", "keep", "here")] ) )

Assign read.csv with some set parameters to a name, in order to pass it to a function

I want to read multiple files. To do this I use a generic function read_list
read_list(file_list, read_fun)
Assigning different read function to the argument read_fun I can read different kinds of files, i.e. read.csv for reading csv files, read_dta for STATA files, etc.
Now, I need to read some csv files where the first four lines need to be skipped. Thus, instead of passing read.csv as an argument to read_list, I would like to pass read.csv with the skip argument set to 4. Is it possible to do this in R? I tried
my_read_csv <- function(...){
read.csv(skip = 4, ...)
}
It seems to work, but I would like to confirm that this is the right way to do it. I think that functions being objects in R is a fantastic and very powerful feature of the language, but I'm not very familiar with R closures and scoping rules, thus I don't want to inadvertently make some big mistake.
You can simply rewrite your read_list to add the ... (dots) argument at the end of its signature and then replace the call read_fun(file) with read_fun(file, ...).
This will allow you to write the following syntax:
read_list(files, read.csv, skip = 4)
which will be equivalent to using your current read_list with a custom read function:
read_list(files, function(file)read.csv(file, skip = 4))
Also, be aware that read_list sounds an awful lot like a "reinvent the wheel" function. If you describe the behaviour of read_list a little more, I can expand.
Possible alternatives may be
read_list <- function(files, read_fun, ...)lapply(files, read_fun, ...)
# in this case read_list is identical to lapply
read_list <- function(files, read_fun, ...)do.call(rbind, lapply(files, read_fun, ...))
# This will rbind() all the files to one data.frame
I'm not sure if read_list is specialized to your specific task in some way but you can use lapply along with read.csv to read a list of files:
# generate fake file names
files <- paste0('file_', 1:10, '.csv')
# Read files using lapply
dfs <- lapply(files, read.csv, skip = 4)
The third argument of lapply is ... which allows you to pass additional arguments to the function you're applying. In this case, we can use ... to pass the skip = 4 argument to read.csv
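For what it's worth, here is a small self-contained sketch comparing the two approaches on temporary files. The helper make_file, the file contents, and the lapply-based read_list are all my own assumptions for illustration, since the real read_list was not shown. Both routes should give the same result.

```r
# Write two small CSV files, each preceded by four junk lines
# (make_file is a hypothetical helper for this demonstration)
make_file <- function(id) {
  path <- tempfile(fileext = ".csv")
  writeLines(c("junk 1", "junk 2", "junk 3", "junk 4",
               "id,value", paste0(id, ",", id * 10)), path)
  path
}
files <- vapply(1:2, make_file, character(1))

# Assumed stand-in for read_list: forwards extra arguments via ...
read_list <- function(files, read_fun, ...) lapply(files, read_fun, ...)

# Route 1: pass skip = 4 through the dots
via_dots <- read_list(files, read.csv, skip = 4)

# Route 2: the wrapper from the question with skip fixed at 4
my_read_csv <- function(...) read.csv(skip = 4, ...)
via_wrapper <- read_list(files, my_read_csv)

unlink(files)
```

So the wrapper in the question is a perfectly reasonable way to do it; the dots version just saves you from defining a new function for every parameter combination.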

Use character string as object name in operations

I am all too inexperienced in programming generally and R specifically so please forgive me if what I have is bad coding.
The problem I am trying to solve is to load many separate csv files into R, tidy up the input a bit, perform a few operations on the resulting objects and eventually plot the results of those operations. The way I have tried to solve it is to use a vector of strings which echoes the object names to call the objects in question. This does not work.
Below is a bit of code which after loading the data does not work.
files=list.files('foldername',pattern="*.csv",full.names=F) #Make a list of files
filen=str_extract(files, '.*(?=\\.csv)') #Pretty the file names for object names
for (i in 1:length(files)){
assign(paste(filen[i]),read.csv(paste(files[i]))) #Load the files
as.object(filen[i])=as.object(filen[i])[,order(names(ATCN_21))] # pseudocode line
as.object(filen[i])=operation(as.object(filen[i]),parameter 1, parameter 2, etc) #More pseudocode
}
where operation may be a plot command or an arbitrary function such as rbind, colnames, whatever you may fancy.
In other words: I need some way to use string i in vector filen exactly as if it were an object name. How can I do this?
The solution: Lists. (Thank you, Pierre)
library(stringr) #for str_extract
files=list.files('foldername',pattern="\\.csv$",full.names=T) #Make a list of files (full paths so read.csv can find them)
filen=str_extract(basename(files), '.*(?=\\.csv)') #Pretty the file names for object names
lst=list()
for (i in seq_along(files)){
lst[[i]]=read.csv(files[i]) #Load the files
names(lst)[i]<-filen[i] #Name the entries
lst[[i]]=lst[[i]][,order(names(lst[[i]]))] #Sort the columns by name
lst[[i]]=operation(lst[[i]]) #operation is a placeholder for your own function
}
Thank you for helping a clueless n00b.
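The same list-based idea can also be written with lapply instead of an explicit loop. This is only a sketch under my own assumptions: it builds two throwaway CSV files in a temporary folder (csv_dir and the file names are invented) so that it runs on its own, and it uses tools::file_path_sans_ext from base R instead of str_extract.

```r
# Create a temporary folder with two small CSV files (demo data only)
csv_dir <- tempfile("csvdir")
dir.create(csv_dir)
write.csv(data.frame(b = 1:2, a = 3:4), file.path(csv_dir, "first.csv"),
          row.names = FALSE)
write.csv(data.frame(z = 5, y = 6), file.path(csv_dir, "second.csv"),
          row.names = FALSE)

# Read every CSV into one named list
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv)
names(tables) <- tools::file_path_sans_ext(basename(files))

# Apply the same operation to each table: here, sort columns by name
tables <- lapply(tables, function(d) d[, order(names(d)), drop = FALSE])

unlink(csv_dir, recursive = TRUE)
```

Each table is then available as tables$first, tables[["second"]], and so on, which covers most cases where one would otherwise reach for assign() and get().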

plyr rename function not working

I can't figure out why this version of plyr's rename function isn't working.
I have a dataframe where I have a single column that ends up being named seq(var_slcut_trucknumber_min, var_slcut_trucknumber_max) because I made it like this:
df_metbal_slcut <- as.data.frame(seq(var_slcut_trucknumber_min,var_slcut_trucknumber_max))
The terms var_slcut_trucknumber_min and var_slcut_trucknumber_max are defined as the min and max of another column.
However, when trying to rename it by the following code,
var_temp <- names(df_metbal_slcut)
df_metbal_slcut <- rename(df_metbal_slcut, c(var_temp="trucknumber"))
I get an error as follows:
The following `from` values were not present in `x`: var_temp
I don't understand why. I know that I can easily do this as colnames(df_metbal_slcut)[1] <- "trucknumber", but I'm an R n00b, and I was looking at a data manipulation tutorial that said that learning plyr was the way to go, so here I am stuck on this.
Try this instead:
df_metbal_slcut <- rename(df_metbal_slcut, setNames("trucknumber",var_temp))
The reason it wasn't working was that c(var_temp = "trucknumber") creates a named vector with the name var_temp, which is not what you were intending. When creating named objects using the tag = value syntax, R won't evaluate variables. It assumes that you literally want the name to be var_temp.
More broadly, it might make sense to give the column a sensible name when you first create the data frame, again using setNames.
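A quick base R sketch of the difference, which does not need plyr at all. The value "old_column" is just a stand-in I chose for whatever names(df_metbal_slcut) returns in your session.

```r
var_temp <- "old_column"   # stand-in for the real column name

# tag = value syntax: the tag is taken literally, the variable is not evaluated
literal <- c(var_temp = "trucknumber")

# setNames() evaluates its second argument, so the variable's value becomes the name
computed <- setNames("trucknumber", var_temp)

# Naming the column at creation time, so no rename is needed afterwards
df <- setNames(data.frame(seq(1, 3)), "trucknumber")
```

Inspecting names(literal) and names(computed) shows the two different names, and names(df) shows the column already called trucknumber.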
