lapply function to remove a list of objects [duplicate] - r

My memory is getting clogged by a bunch of intermediate files (call them temp1, temp2, etc.), and I would like to know if it is possible to remove them from memory without doing repeated rm calls (i.e. rm(temp1), rm(temp2))?
I tried rm(list(temp1, temp2, etc.)), but that doesn't seem to work.

Make the list a character vector (not a vector of names)
rm(list = c('temp1','temp2'))
or
rm(temp1, temp2)

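Another solution is rm(list = ls(pattern = "temp")), which removes every object whose name matches the pattern. The pattern is an unanchored regular expression, so "temp" also matches names such as attempt; use "^temp" to match only names that start with temp.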
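A small sketch of the difference between a loose and an anchored pattern. It runs in a scratch environment (env, an assumption for this example) so nothing in the real workspace is touched; the object names are made up:

```r
# Scratch environment so the demo does not touch the real workspace
env <- new.env()
assign("temp1", 1, envir = env)
assign("temp2", 2, envir = env)
assign("attempt", 3, envir = env)  # contains "temp" but is not a temp object
assign("keep", 4, envir = env)

# Unanchored pattern: matches "temp" anywhere in the name
loose <- ls(env, pattern = "temp")

# Anchored pattern: only "temp" followed by digits
strict <- ls(env, pattern = "^temp[0-9]+$")

# Remove only the strictly matched objects
rm(list = strict, envir = env)
remaining <- ls(env)
```

Printing ls(pattern = ...) before passing it to rm() is a cheap way to check what is about to be deleted.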

Or using regular expressions
"rmlike" <- function(...) {
names <- sapply(
match.call(expand.dots = FALSE)$..., as.character)
names = paste(names,collapse="|")
Vars <- ls(1)
r <- Vars[grep(paste("^(",names,").*",sep=""),Vars)]
rm(list=r,pos=1)
}
rmlike(temp)

Another variation you can try (expanding on @mnel's answer)
if you have many temp'x'.
Here, n is the number of temp variables present:
rm(list = paste0("temp", 1:n))
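A hedged sketch of the numbered-name variant, again in a scratch environment (env and the value of n are assumptions for the example). seq_len(n) is used instead of 1:n because 1:n counts down to 0 when n is 0, and intersect() skips names that do not exist so rm() does not warn:

```r
env <- new.env()
for (i in 1:3) assign(paste0("temp", i), i, envir = env)
assign("result", 42, envir = env)

n <- 5  # deliberately larger than the number of temp objects present

# Build the candidate names, then keep only those that actually exist
targets <- paste0("temp", seq_len(n))
targets <- intersect(targets, ls(env))

rm(list = targets, envir = env)
left <- ls(env)
```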

Related

How to use a loop to change a variable [duplicate]

Is there any way to use a loop to write this code? Each line of code is identical except for the species name:
ensembl_hsapiens <- useMart("ensembl",
dataset = "hsapiens_gene_ensembl")
ensembl_mouse <- useMart("ensembl",
dataset = "mmusculus_gene_ensembl")
ensembl_chicken <- useMart("ensembl",
dataset = "ggallus_gene_ensembl")
Here's an approach. Note that using a loop (or a loop-equivalent construct) to populate the global environment isn't often a good idea. But it's what you asked for.
There's nothing special about useMart, so I'll make up a nonsense function that takes two character arguments:
foo <- function(x, y) {
nchar(paste(x, y))
}
Here are the species names. I'll use them for the object names as well.
species <- c("hsapiens", "mmusculus", "ggallus")
Now, you want to create three named objects in the global environment. You can use the assign function for this, noting that you use pos = 1 (the global environment) because each call of the function inside lapply runs in its own environment.
lapply(species, function(s) assign(paste0("ensembl_", s),
foo("ensemble", paste0(s, "_gene_ensembl")),
pos = 1))
This gives you what you want. You can replace foo with useMart.
Now, is this a good idea? Perhaps not. I would be more inclined to keep the objects themselves in a list.
objs <- lapply(species, function(s) foo("ensemble", paste0(s, "_gene_ensembl")))
names(objs) <- paste0("ensembl_", species)
You can access them using statements like objs$ensembl_hsapiens or objs[["ensembl_hsapiens"]]
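A minimal sketch of the list approach with the names attached up front. make_dataset is a hypothetical stand-in for useMart (so the example runs without biomaRt), and the short names human, mouse and chicken are assumptions chosen for illustration:

```r
# Named character vector: the names become the names of the result list
species <- c(human = "hsapiens", mouse = "mmusculus", chicken = "ggallus")

# Hypothetical stand-in for useMart("ensembl", dataset = ...)
make_dataset <- function(s) paste0(s, "_gene_ensembl")

# lapply keeps the names of its input
marts <- lapply(species, make_dataset)

mouse_mart <- marts$mouse
all_names <- names(marts)
```

Because lapply preserves names, no separate names(objs) <- ... step is needed.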

Generating Multiple Variables Dynamically [duplicate]

I keep running into situations where I want to dynamically create variables using a for loop (or similar / more efficient construct using dplyr perhaps). However, it's unclear to me how to do it right now.
For example, the below shows a construct that I would intuitively expect to generate 10 variables assigned numbers 1:10, but it doesn't work.
for (i in 1:10) {paste("variable",i,sep = "") = i}
The error
Error in paste("variable", i, sep = "") = i :
target of assignment expands to non-language object
Any thoughts on what method I should use to do this? I assume there are multiple approaches (including a more efficient dplyr method). Full disclosure: I'm relatively new to R and really appreciate the help. Thanks!
I've run into this problem myself many times. The solution is the assign command.
for(i in 1:10){
assign(paste("variable", i, sep = ""), i)
}
If you wanted to get everything into one vector, you could use sapply. The following code would give you a vector from 1 to 10, and the names of each item would be "variable i," where i is the value of each item. This may not be the prettiest or most elegant way to use the apply family for this, but I think it ought to work well enough.
var.names <- function(x){
a <- x
names(a) <- paste0("variable", x)
return(a)
}
variables <- sapply(X = 1:10, FUN = var.names)
This sort of approach seems to be favored because it keeps all of those variables tucked away in one object, rather than scattered all over the global environment. This could make calling them easier in the future, preventing the need to use get to scrounge up variables you'd saved.
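A short sketch of the "one object" idea using a named list, which also works when the values are not all the same type. The names variables and scratch are assumptions for this example; list2env is shown only for the case where separate objects are truly required:

```r
# Build a named list instead of ten separate objects
variables <- setNames(as.list(1:10), paste0("variable", 1:10))

# Look up a value by a computed name
i <- 3
third <- variables[[paste0("variable", i)]]

# If separate objects are really needed, copy the list into an environment.
# A scratch environment is used here; globalenv() would target the workspace.
scratch <- new.env()
invisible(list2env(variables, envir = scratch))
seventh <- get("variable7", envir = scratch)
```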
No need to use a loop: you can build the code as a character string with paste0, turn it into an unevaluated expression with parse, and finally evaluate it with eval.
eval(parse(text = paste0("variable", 1:10, "=",1:10, collapse = ";") ))
The code you have is really no more useful than a vector of elements:
x<-1
for(i in 2:10){
x<-c(x,i)
}
(Obviously, this example is trivial; you could just use x<-1:10 and be done. I assume there's a reason you need to do non-vectorized calculations on each variable.)

How can I make a list of all dataframes that are in my global environment?

I am trying to use rbind on them. But I need a list of all the dataframes that are already in my global environment. How can I do it?
Code I used to import the 20 csv files in a directory. Basically, have to combine into a single dataframe.
temp = list.files(pattern = "*.csv")
for (i in 1:length(temp)) assign(temp[i], read.csv(temp[i]))
This function should return a proper list with all the data.frames as elements
dfs <- Filter(function(x) is(x, "data.frame"), mget(ls()))
then you can rbind them with
do.call(rbind, dfs)
Of course it's awfully silly to have a bunch of data.frames lying around that are so related that you want to rbind them. It sounds like they probably should have been in a list in the first place.
I recommend you stay away from assign(); needing it is usually a sign that something is going wrong. Try
temp <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
dfs <- lapply(temp, read.csv)
that should return a list straight away.
From your posted code, I would recommend you start a new R session, and read the files in again with the following code
do.call(rbind, lapply(list.files(pattern = "\\.csv$"), read.csv))
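An end-to-end sketch of reading files straight into a list and binding them. The two CSV files and their columns (id, value) are invented for the example and written to a temporary directory, so the code runs anywhere:

```r
# Create a fresh temporary directory with two small CSV files
dir <- tempfile("csv_demo")
dir.create(dir)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(dir, "b.csv"), row.names = FALSE)

# Read every CSV into one list; no assign() and no loose objects
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
dfs <- lapply(files, read.csv)
names(dfs) <- basename(files)  # keep track of where each piece came from

# Stack them into a single data frame
combined <- do.call(rbind, dfs)
```

rbind requires the data frames to have the same column names, so this assumes all files share one layout.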
The ls function lists all things in your environment. The get function gets a variable with a given name. You can use the class function to get the class of a variable.
If you put them all together, you can do this:
ls()[sapply(ls(), function(x) class(get(x))) == 'data.frame']
which will return a character vector of the data.frames in the current environment.
If you only have data.frames with the same number of columns and the same column names in your global environment, the following should work (non-data.frame objects don't matter):
do.call(rbind, eapply(.GlobalEnv,function(x) if(is.data.frame(x)) x))
This is a slight improvement on MentatOfDune's answer, which does not catch data.frames with multiple classes:
ls()[grepl('data.frame', sapply(ls(), function(x) class(get(x))))]
To improve MentatOfDune's answer (great username by the way):
ls()[sapply(ls(), function(x) any(class(get(x)) == 'data.frame'))]
or even more robust:
ls()[sapply(ls(), function(x) is.data.frame(get(x)))]
This also supports tibbles (created with dplyr for example), because they contain multiple classes, where data.frame is one of them.
A readable version to get TRUEs and FALSEs using the native pipe (R 4.1 and higher); mget is used so the objects come back as a list and are not simplified:
ls() |> mget() |> sapply(is.data.frame)
Finally super, super robust, also for package developers:
ls()[sapply(ls(), function(x) is.data.frame(eval(parse(text = x), envir = globalenv())))]
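The checks above can be compared in a throwaway environment. This is a sketch under assumptions: env and the object names are invented, and the multi-class object only mimics a tibble's class vector so that the tibble package is not needed:

```r
env <- new.env()
env$plain <- data.frame(x = 1:2)
env$numbers <- 1:3
# Mimic a tibble-style class vector without needing the tibble package
env$multi <- structure(data.frame(y = 1:2),
                       class = c("tbl_df", "tbl", "data.frame"))

# Fetch all objects at once as a named list
objs <- mget(ls(env), envir = env)

# class(x) == "data.frame" gives one value per class, not a single TRUE/FALSE
n_results <- length(class(env$multi) == "data.frame")

# is.data.frame() gives exactly one logical per object
is_df <- vapply(objs, is.data.frame, logical(1))
df_names <- names(is_df)[is_df]
```

vapply with logical(1) is used here instead of sapply so the result is always a logical vector, even when the environment is empty.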

calling data frames in a for loop by a vector

I have some data.frames
dat1=read.table...
dat2=read.table...
dat3=read.table...
And I would like to count the rows of each data set.
The names are saved like this (I cannot change that): vector=c("dat1","dat2","dat3...)
p <- vector(numeric, length=1:length(dat))
counting <- function(x) {for (i in 1:x){
p[i]<-nrow(dat[i])}
return(p)
}
This is not working because the input for nrow is a character, but I need an integer(?)
Thx for help
You can use get for this, but be careful! Reading the tables into a list instead is the more R-ish way:
file.names <- list.files()
dat <- lapply(file.names, read.table)
Then you have all the conveniences of lapply and the apply family at your disposal, e.g.:
lapply(dat, nrow)
The solution using get (also, vector is a bad variable name since it's the name of a very important function):
lapply(vector, function(x) nrow(get(x)))
Your method fails since there is no object called dat to index into. The for loop could look like:
p = NULL
for(v in vector) {
p <- c(p, nrow(get(v)))
}
But that technique is poor form for lotsa reasons...
If you want to determine properties of items you know to be in the .GlobalEnv, this works:
> sapply( c("A","B"), function(objname) nrow(.GlobalEnv[[objname]]) )
A B
5 4
You could substitute any character vector for c("A","B"). If the object is not in the global environment it just returns NULL, so it's reasonably robust.
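A sketch of the same lookup with mget, which fetches several objects by name in one call. The scratch environment env and the two small data frames are assumptions for the example; with real workspace objects, envir = globalenv() would be used instead:

```r
env <- new.env()
env$dat1 <- data.frame(x = 1:5)
env$dat2 <- data.frame(x = 1:4)

# Names stored as strings, as in the question
object_names <- c("dat1", "dat2")

# mget returns a named list of the objects; vapply guarantees an integer vector
counts <- vapply(mget(object_names, envir = env), nrow, integer(1))
```

Unlike the .GlobalEnv[[objname]] form, mget raises an error if a name does not exist, which may be preferable when a missing data set should not pass silently.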

