How to use a loop to change a variable [duplicate] - r

This question already has an answer here:
BioMart: Is there a way to easily change the species for all of my code?
(1 answer)
Closed 4 years ago.
Is there any way to use a loop to write this code? Each line of code is identical except from the species name
ensembl_hsapiens <- useMart("ensembl",
dataset = "hsapiens_gene_ensembl")
ensembl_mouse <- useMart("ensembl",
dataset = "mmusculus_gene_ensembl")
ensembl_chicken <- useMart("ensembl",
dataset = "ggallus_gene_ensembl")

Here's an approach. Note that using a loop (or a loop-equivalent construct) to populate the global environment isn't often a good idea. But it's what you asked for.
There's nothing special about useMart, so I'll make up a nonsense function that takes two character arguments:
foo <- function(x, y) {
nchar(paste(x, y))
}
Here are the species names. I'll use them for the object names as well.
species <- c("hsapiens", "mmusculus", "ggallus")
Now, you want to create three named objects in the global environment. You can use the assign function for this, noting that you use pos=2 because each loop of lapply is done in its own environment.
lapply(species, function(s) assign(paste0("ensembl_", s),
foo("ensemble", paste0(s, "_gene_ensembl")),
pos = 1))
This gives you what you want. You can replace foo use useMart.
Now, is this a good idea? Perhaps not. I would be more inclined to keep the objects themselves in a list.
objs <- lapply(species, function(s) foo("ensemble", paste0(s, "_gene_ensembl")))
names(objs) <- paste0("ensemble_", species)
You can access them using statements like objs$ensemble_hsapiens or objs[["ensemble_hsapiens"]]

Related

Loop for on dataset name in R

This topic has been covered numerous times I see but I can't really get the answer I'm looking for. Thus, here I go.
I am trying to do a loop to create variables in 5 data sets that have similar names as such:
Ech_repondants_nom_1
Ech_repondants_nom_2
Ech_repondants_nom_3
Ech_repondants_nom_4
Ech_repondants_nom_5
Below if the code that I have tried:
list <- c(1:5)
for (i in list) {
Ech_repondants_nom_[[i]]$sec = as.numeric(Ech_repondants_nom_[[i]]$interviewtime)
Ech_repondants_nom_[[i]]$min = round(Ech_repondants_nom_[[i]]$sec/60,1)
Ech_repondants_nom_[[i]]$heure = round(Ech_repondants_nom_[[i]]$min/60,1)
}
Any clues why this does not work?
cheers!
These are object names and not list elements to subset as Ech_repondants_nom_[[i]]. We may need to get the object by paste i.e.
get(paste0("Ech_repondants_nom_", i)$sec
but, then if we need to update the original object, have to call assign. Instead of all this, it can be done more easily if we load the datasets into a list and loop over the list with lapply
lst1 <- lapply(mget(paste0("Ech_repondants_nom_", 1:5)), function(dat)
within(dat, {sec <- as.numeric(interviewtime);
min <- round(sec/60, 1);
heure <- round(min/60, 1)}))
It may be better to keep it as a list, but if we need to update the original object, use list2env
list2env(lst1, .GlobalEnv)
Ech_repondants_nom_[[i]]
Isn't actually selcting that dataframe because you can't call objects like that. Try creating a function that takes a dataframe as an argument then iterating through the dataframes
changing_time_stamp<-function(df){
df$sec = as.numeric(df$interviewtime)
df$min = round(df$sec/60,1)
df$heure = round(df$min/60,1)
for (i in list) {
changing_time_stamp(i)
}
EDIT: I fixed some of the variable names in the function

get() not working for column in a data frame in a list in R (phew)

I have a list of data frames. I want to use lapply on a specific column for each of those data frames, but I keep throwing errors when I tried methods from similar answers:
The setup is something like this:
a <- list(*a series of data frames that each have a column named DIM*)
dim_loc <- lapply(1:length(a), function(x){paste0("a[[", x, "]]$DIM")}
Eventually, I'll want to write something like results <- lapply(dim_loc, *some function on the DIMs*)
However, when I try get(dim_loc[[1]]), say, I get an error: Error in get(dim_loc[[1]]) : object 'a[[1]]$DIM' not found
But I can return values from function(a[[1]]$DIM) all day long. It's there.
I've tried working around this by using as.name() in the dim_loc assignment, but that doesn't seem to do the trick either.
I'm curious 1. what's up with get(), and 2. if there's a better solution. I'm constraining myself to the apply family of functions because I want to try to get out of the for-loop habit, and this name-as-list method seems to be preferred based on something like R- how to dynamically name data frames?, but I'd be interested in other, more elegant solutions, too.
I'd say that if you want to modify an object in place you are better off using a for loop since lapply would require the <<- assignment symbol (<- doesn't work on lapply`). Like so:
set.seed(1)
aList <- list(cars = mtcars, iris = iris)
for(i in seq_along(aList)){
aList[[i]][["newcol"]] <- runif(nrow(aList[[i]]))
}
As opposed to...
invisible(
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <<- runif(nrow(aList[[x]]))
})
)
You have to use invisible() otherwise lapply would print the output on the console. The <<- assigns the vector runif(...) to the new created column.
If you want to produce another set of data.frames using lapply then you do:
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <- runif(nrow(aList[[x]]))
return(aList[[x]])
})
Also, may I suggest the use of seq_along(list) in lapply and for loops as opposed to 1:length(list) since it avoids unexpected behavior such as:
# no length list
seq_along(list()) # prints integer(0)
1:length(list()) # prints 1 0.

lapply function to remove a list of objects [duplicate]

My memory is getting clogged by a bunch of intermediate files (call them temp1, temp2, etc.), and I would like to know if it is possible to remove them from memory without doing repeated rm calls (i.e. rm(temp1), rm(temp2))?
I tried rm(list(temp1, temp2, etc.)), but that doesn't seem to work.
Make the list a character vector (not a vector of names)
rm(list = c('temp1','temp2'))
or
rm(temp1, temp2)
An other solution rm(list=ls(pattern="temp")), remove all objects matching the pattern.
Or using regular expressions
"rmlike" <- function(...) {
names <- sapply(
match.call(expand.dots = FALSE)$..., as.character)
names = paste(names,collapse="|")
Vars <- ls(1)
r <- Vars[grep(paste("^(",names,").*",sep=""),Vars)]
rm(list=r,pos=1)
}
rmlike(temp)
Another variation you can try is (expanding #mnel's answer)
if you have many temp'x'.
Here, "n" could be the number of temp variables present:
rm(list = c(paste("temp",c(1:n),sep="")))

Generating Multiple Variables Dynamically [duplicate]

This question already has answers here:
How to assign values to dynamic names variables
(2 answers)
Closed 7 years ago.
I keep running into situations where I want to dynamically create variables using a for loop (or similar / more efficient construct using dplyr perhaps). However, it's unclear to me how to do it right now.
For example, the below shows a construct that I would intuitively expect to generate 10 variables assigned numbers 1:10, but it doesn't work.
for (i in 1:10) {paste("variable",i,sep = "") = i}
The error
Error in paste("variable", i, sep = "") = i :
target of assignment expands to non-language object
Any thoughts on what method I should use to do this? I assume there are multiple approaches (including a more efficient dplyr method). Full disclosure: I'm relatively new to R and really appreciate the help. Thanks!
I've run into this problem myself many times. The solution is the assign command.
for(i in 1:10){
assign(paste("variable", i, sep = ""), i)
}
If you wanted to get everything into one vector, you could use sapply. The following code would give you a vector from 1 to 10, and the names of each item would be "variable i," where i is the value of each item. This may not be the prettiest or most elegant way to use the apply family for this, but I think it ought to work well enough.
var.names <- function(x){
a <- x
names(a) <- paste0("variable", x)
return(a)
}
variables <- sapply(X = 1:10, FUN = var.names)
This sort of approach seems to be favored because it keeps all of those variables tucked away in one object, rather than scattered all over the global environment. This could make calling them easier in the future, preventing the need to use get to scrounge up variables you'd saved.
No need to use a loop, you can create character expression with paste0 and then transform it as uneveluated expression with parse, and finally evaluate it with eval.
eval(parse(text = paste0("variable", 1:10, "=",1:10, collapse = ";") ))
The code you have is really no more useful than a vector of elements:
x<-1
for(i in 2:10){
x<-c(x,i)
}
(Obviously, this example is trivial, could just use x<-1:10 and be done. I assume there's a reason you need to do non-vectored calculations on each variable).

How can I assign a function to a variable that its name should be made by paste function? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to name variables on the fly in R?
I have 10 objects of type list named: list1, list2, ..., list10. And I wanted to use the output from my function FUN to replace the first index of these list variables through a for loop like this:
for (i in 1:10){
eval(parse(test=(paste("list",i,sep=""))))[[1]] <- FUN()
}
but this does not work. I also used lapply in this way and again it is wrong:
lapply(eval(parse(text=(paste("list",i,sep=""))))[[1]], FUN)
Any opinion would be appreciated.
This is FAQ 7.21. The most important part of that FAQ is the last part that says not to do it that way, but to put everything into a list and operate on the list.
You can put your objects into a list using code like:
mylists <- lapply( 1:10, function(i) get( paste0('list',i) ) )
Then you can do the replacement with code like:
mylists <- lapply( mylists, function(x) { x[[1]] <- FUN()
x} )
Now if you want to save all the lists, or delete all the lists, you just have a single object that you need to work with instead of looping through the names again. If you want to do something else to each list then you just use lapply or sapply on the overall list in a single easy step without worrying about the loop. You can name the elements of the list to match the original names if you want and access them that way as well. Keeping everything of interest in a single list will also make your code safer, much less likely to accidentilly overwrite or delete another object.
You probably want something like
for (i in 1:10) {
nam <- paste0("list", i) ## built the name of the object
tmp <- get(nam) ## fetch the list by name, store in tmp
tmp[[1]] <- FUN() ## alter 1st component of tmp using FUN()
assign(nam, value = tmp) ## assign tmp back to the current list
}
as the loop.

Resources