How to save multiple variables with 1 line of code in R?

I have 7 large Seurat objects, saved as sn1, sn2, sn3 ... sn7.
I am trying to run ScaleData on all 7 samples. I could write the same two lines 7 times:
all.genes <- rownames(sn1)
snN1<-ScaleData(sn1, features = all.genes)
all.genes <- rownames(sn2)
snN2<-ScaleData(sn2, features = all.genes)
all.genes <- rownames(sn3)
snN3<-ScaleData(sn3, features = all.genes)
.
.
.
This would work perfectly. Since I still have to use all 7 samples for quite a while, I thought I'd save time and write a for loop to do the job in one go, but I am unable to save the variables and get the error "Error in { : target of assignment expands to non-language object".
This is what I tried:
samples<-c("sn1", "sn2", "sn3", "sn4", "sn5", "sn6", "sn7")
list<-c("snN1", "snN2", "snN3", "snN4", "snN5", "snN6", "snN7")
for (i in samples) {
all.genes <- rownames(get(i))
list[1:7]<-ScaleData(get(i), features = all.genes)
}
How do I have to write the code so that it creates the variables snN1, snN2, snN3... and saves the scaled data from sn1, sn2, sn3... to each respective new variable?

I think the error is in this line: list[1:7]<-ScaleData(get(i), features = all.genes). You are telling the for loop to assign the output of ScaleData to the 7 strings stored in list, which makes no sense. You are probably looking for the function assign(), but its use is recommended only in very specific contexts.
Also, there are better methods than for loops in R, for example lapply() and related functions. I recommend wrapping the steps you want to apply in a custom function and then calling lapply() to apply it to every object, as a for loop would, and store the results in a list. To pass every 'snX' variable as input, put the objects themselves in a list.
# Custom function
custom_scale <- function(x){
  all.genes <- rownames(x)
  ScaleData(x, features = all.genes) # the last expression is the return value
}
# Create a list that points to every variable
samples <- list(sn1, sn2, sn3, sn4, sn5, sn6, sn7) # Note I'm not using characters, I'm referencing the actual variables.
# Use lapply to iterate over the list and apply the custom function, saving the results in a list
scaled_Data_list <- lapply(samples, custom_scale)
This should work; however, without example data I can't test it.
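One drawback of building the list with list(sn1, ..., sn7) is that the result is unnamed. Below is a minimal sketch of a name-preserving variant using mget(). So that it runs without Seurat, small matrices stand in for the Seurat objects and a hypothetical scale_all() (base scale()) stands in for ScaleData(); both are assumptions for illustration only.

```r
# Toy stand-ins for the Seurat objects (assumption: the real sn1..sn7 already exist)
set.seed(1)
sn1 <- matrix(rnorm(6), nrow = 3, dimnames = list(paste0("gene", 1:3), NULL))
sn2 <- matrix(rnorm(6), nrow = 3, dimnames = list(paste0("gene", 1:3), NULL))

# Hypothetical stand-in for ScaleData(x, features = rownames(x))
scale_all <- function(x) scale(x)

# mget() collects the objects into a list named "sn1", "sn2", ...
samples <- mget(paste0("sn", 1:2))

# lapply() keeps those names; rename them to snN1, snN2 to mirror the question
scaled <- lapply(samples, scale_all)
names(scaled) <- sub("^sn", "snN", names(scaled))

names(scaled)
```

Each result is then available as scaled$snN1, scaled$snN2 and so on, without creating separate variables.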

Here is how to do it using a loop and assign. I removed some redundant code/variables as this can always be a source of error. However, I agree with RobertoT that storing such data in a list and using lapply is a good idea.
samples <- paste0('sn', 1:7)
for (sn in samples) {
sn.data <- get(sn)
assign(sub('n', 'nN', sn),
ScaleData(sn.data, features=rownames(sn.data)))
}

Related

Retrieve Seurat object name in R during a for loop

I'm working on single cell rna-seq on Seurat and I'm trying to make a for() loop over Seurat objects to draw several heatmaps of average gene expression.
for(i in c(seuratobject1, seuratobject2, seuratobject3)){
cluster.averages <- data.frame(AverageExpression(i, features = genelist))
cluster.averages$rowmeans <- rowMeans(cluster.averages)
genelist.new <- as.list(rownames(cluster.averages))
cluster.averages <- cluster.averages[order(cluster.averages$rowmeans),]
HMP.ordered <- DoHeatmap(i, features = genelist.new, size = 3, draw.lines = T)
ggsave(HMP.ordered, file=paste0(i, ".HMP.ordered.png"), width=7, height=30)
}
The ggsave line does not work because paste0() receives i as a Seurat object. Hence my question: how do I get ggsave() to use the name of my Seurat object stored in "i"?
I tried substitute(i) and deparse(substitute(i)) without success.
Short answer: you can’t.
Long answer: using substitute or similar to try to get i’s name will give you … i. (This is different for function arguments, where substitute(arg) gives you the call’s argument expression.)
You need to use a named vector instead. Ideally you’d have your Seurat objects inside a list to begin with. But to create such a list on the fly, you can use get:
names = c('seuratobject1', 'seuratobject2', 'seuratobject3')
for(i in names) {
cluster.averages <- data.frame(AverageExpression(get(i), features = genelist))
# … rest is identical …
}
That said, I generally advocate strongly against the use of get and against treating the local environment as a data structure. Lists and vectors are designed to be used in this situation instead.
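A minimal sketch of the list-based approach, assuming the objects are kept in a named list from the start. Plain numeric vectors stand in for the Seurat objects here, and the plotting calls are only indicated in comments, so the names below are illustrative rather than taken from Seurat.

```r
# Toy stand-ins for the Seurat objects (assumption: any R objects work here)
objects <- list(
  seuratobject1 = c(1, 2, 3),
  seuratobject2 = c(4, 5, 6)
)

# Iterate over the names: objects[[nm]] is the object, nm is its label
outfiles <- character(0)
for (nm in names(objects)) {
  obj <- objects[[nm]]                            # would go to AverageExpression()/DoHeatmap()
  outfiles[nm] <- paste0(nm, ".HMP.ordered.png")  # would be the ggsave() file name
}

outfiles
```

Looping over names(objects) gives access to both the label and the object, which is what the original loop over the objects themselves could not provide.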

How can I use a for loop for this process in R

I have a data frame that includes 43 different countries.
To summarize my data frame: the row names look like (AUS1, AUS2, AUS3, ... BRA1, BRA2, ... GER1, GER2...GER56), and there is a variable called Country which holds the country codes.
I need to find each country's export value. I can compute them separately, but that takes a lot of time because I have 14 different years. Thus, I want to use a for loop. However, I cannot find a way to write a for loop for the process below.
This is my code to find the exports of a single country.
##AUT
AUT <- filter(wiot, wiot$Country == "AUT")
exportAUT <- sum(AUT$TOT) - sum(select(AUT, starts_with("AUT")))
##BEL
BEL <- filter(wiot, wiot$Country == "BEL")
exportBEL <- sum(BEL$TOT) - sum(select(BEL, starts_with("BEL")))
Trying to create individually named objects for this set of results is the path to madness in R. Instead, create a list with a more generic name and then put the results in the "leaves" (individual elements) inside the list:
export <- list()
for (i in unique(as.character(wiot$Country))) {
  rows <- wiot[wiot$Country == i, ] # rows belonging to country i
  export[[i]] <- sum(rows$TOT) - sum(select(rows, starts_with(i)))
  # or maybe: export[[i]] <- sum(rows$TOT) - sum(rows[grepl(i, names(rows))])
}
This is a guess, since I'm not able to figure out how the rows and columns are referenced in your data.frame object. It would be much easier to debug this if you provided a less ambiguous description of the data object named wiot. Use either the output of str(wiot) or show output of dput(head(wiot))
Consider base R's by to build a named set of export calculations, one entry per country:
export_list <- by(wiot, wiot$Country, function(sub)
  sum(sub$TOT) - sum(select(sub, starts_with(as.character(sub$Country[1]))))
)
export_list[["AUT"]]
export_list[["BEL"]]
export_list[["GER"]]
...
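Since the real structure of wiot is not shown, here is a self-contained sketch on a hypothetical miniature of it, using only base R (split() and sapply()). The column names AUT1 and BEL1 and all values are invented for illustration; the assumption is that a country's own columns start with its country code.

```r
# Hypothetical miniature of wiot: one row per sector, one column per destination
wiot <- data.frame(
  Country = c("AUT", "AUT", "BEL", "BEL"),
  AUT1 = c(1, 2, 3, 4),
  BEL1 = c(5, 6, 7, 8),
  stringsAsFactors = FALSE
)
wiot$TOT <- wiot$AUT1 + wiot$BEL1   # row totals

# Export of one country: total output minus deliveries to its own columns
export_one <- function(d) {
  code <- as.character(d$Country[1])
  own  <- startsWith(names(d), code)   # columns whose name starts with the country code
  sum(d$TOT) - sum(d[, own, drop = FALSE])
}

# split() makes one data frame per country; sapply() returns a named vector
export <- sapply(split(wiot, wiot$Country), export_one)
export
```

In this toy data the result is 11 for AUT and 7 for BEL, which matches the sums of the foreign columns for each country's rows.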

How to subset rows with strings

I want to use a function to repeatedly create subsets with different names.
For example, suppose I have 5 random vectors.
number1<-sample(1:10, 3)
number2<-sample(1:10, 3)
number3<-sample(1:10, 3)
number4<-sample(1:10, 3)
number5<-sample(1:10, 3)
Then, I will use these vectors to select rows from the raw data set (i.e. a data frame).
testset1<-raw[number1,]
testset2<-raw[number2,]
testset3<-raw[number3,]
testset4<-raw[number4,]
testset5<-raw[number5,]
Writing out each command takes a lot of space in the manuscript. I'm trying to shorten these commands by using a function.
However, I found it hard to use variables inside a function to build the object names as text. For example, it is easy to use variables like this.
mean_function<-function(x){
mean(x)
}
But, I want to use function like this.
testset "number with 1-5" <-raw[number"number 1-5",]
I would really appreciate your help.
You don't need to create a function for this task, simply use lapply to loop over the list of elements produced by mget(), then set some names and finally put all results in the global environment:
rowSelected <-lapply(mget(paste0("number", 1:5)), function(x) raw[x, ])
names(rowSelected) <- paste0("testset", 1:5)
list2env(rowSelected, envir = .GlobalEnv)
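If separate testset1..testset5 variables are not strictly required, the same idea can stay entirely inside lists. A minimal sketch, assuming a hypothetical raw data frame with 10 rows (the columns id and value are invented for illustration):

```r
set.seed(42)
raw <- data.frame(id = 1:10, value = letters[1:10])   # hypothetical raw data

# One list of index vectors instead of number1..number5
numbers <- replicate(5, sample(1:10, 3), simplify = FALSE)

# One named list of subsets instead of testset1..testset5
testsets <- lapply(numbers, function(idx) raw[idx, ])
names(testsets) <- paste0("testset", 1:5)

testsets$testset1   # same rows as raw[numbers[[1]], ]
```

Keeping the subsets in one list makes it straightforward to apply a later step to all of them with another lapply() call.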

get() not working for column in a data frame in a list in R (phew)

I have a list of data frames. I want to use lapply on a specific column for each of those data frames, but I keep throwing errors when I tried methods from similar answers:
The setup is something like this:
a <- list(*a series of data frames that each have a column named DIM*)
dim_loc <- lapply(1:length(a), function(x){paste0("a[[", x, "]]$DIM")})
Eventually, I'll want to write something like results <- lapply(dim_loc, *some function on the DIMs*)
However, when I try get(dim_loc[[1]]), say, I get an error: Error in get(dim_loc[[1]]) : object 'a[[1]]$DIM' not found
But I can return values from function(a[[1]]$DIM) all day long. It's there.
I've tried working around this by using as.name() in the dim_loc assignment, but that doesn't seem to do the trick either.
I'm curious 1. what's up with get(), and 2. if there's a better solution. I'm constraining myself to the apply family of functions because I want to try to get out of the for-loop habit, and this name-as-list method seems to be preferred based on something like R- how to dynamically name data frames?, but I'd be interested in other, more elegant solutions, too.
I'd say that if you want to modify an object in place you are better off using a for loop, since lapply would require the <<- assignment operator (<- doesn't work inside lapply). Like so:
set.seed(1)
aList <- list(cars = mtcars, iris = iris)
for(i in seq_along(aList)){
aList[[i]][["newcol"]] <- runif(nrow(aList[[i]]))
}
As opposed to...
invisible(
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <<- runif(nrow(aList[[x]]))
})
)
You have to use invisible() otherwise lapply would print the output on the console. The <<- assigns the vector runif(...) to the new created column.
If you want to produce another set of data.frames using lapply then you do:
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <- runif(nrow(aList[[x]]))
return(aList[[x]])
})
Also, may I suggest the use of seq_along(list) in lapply and for loops as opposed to 1:length(list) since it avoids unexpected behavior such as:
# no length list
seq_along(list()) # prints integer(0)
1:length(list()) # prints 1 0.
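As for what's up with get(): it looks up a single object by its name and does not parse or evaluate an expression, so a string such as "a[[1]]$DIM" is treated as one (non-existent) object name. A minimal sketch of the usual alternative, passing the data frames themselves to lapply(); the toy list a and the use of mean() are assumptions for illustration.

```r
# Toy list of data frames, each with a DIM column (assumed structure)
a <- list(
  data.frame(DIM = c(10, 20, 30)),
  data.frame(DIM = c(5, 15))
)

# There is an object named "a", but none literally named "a[[1]]$DIM",
# which is why get("a[[1]]$DIM") reports "object not found"
found <- exists("a[[1]]$DIM")
found

# Pass the data frames themselves and pick the column inside the function
results <- lapply(a, function(df) mean(df$DIM))
results
```

No strings or get() are needed: lapply() hands each data frame to the function, which can then use df$DIM directly.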

Convert A List Object into a Useable Matrix Name (R)

I want to be able to use a loop to perform the same function on a group of data sets without having to recall the names of all of the data sets individually. For example, say I have the following matrices:
a<-matrix(1:5,nrow=5,ncol=2)
b<-matrix(6:10,nrow=5,ncol=2)
c<-matrix(11:15,nrow=5,ncol=2)
I define a vector of set names:
SetNames<- c("a","b","c")
Then I want to sum the second column of all of the matrices without having to call each matrix name. Basically, I would like to be able to call SetNames[1] and have the program return 'a' as USEABLE text which can be used to call apply(a[2],2,sum).
If apply(SetNames[1][2],2,sum) worked, that would be the basic syntax I was looking for, however I would replace the 1 with a variable I can increase in a loop.
sapply can do that.
sapply(SetNames, function(z) {
dfz <- get(z)
sum(dfz[,2])
})
# a b c
# 15 40 65
Notice that get() is used here to dynamically access a variable.
A less compact way of writing this would be
sumRowTwo <- function(z) {
dfz <- get(z)
sum(dfz[,2])
}
sapply(SetNames, sumRowTwo)
and now you can play around with sumRowTwo (which, despite its name, sums the second column) and see what e.g.
sumRowTwo("a")
returns
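A variant of the same idea is to fetch all matrices at once with mget(), which returns a named list, and then work on the list. This is a minimal sketch using the matrices from the question:

```r
a <- matrix(1:5, nrow = 5, ncol = 2)
b <- matrix(6:10, nrow = 5, ncol = 2)
c <- matrix(11:15, nrow = 5, ncol = 2)
SetNames <- c("a", "b", "c")

# mget() looks up every name and returns a list named a, b, c
sets <- mget(SetNames)

# Sum the second column of each matrix; the names carry over to the result
col2_sums <- sapply(sets, function(m) sum(m[, 2]))
col2_sums
```

After the mget() call the rest of the code only deals with the list, so the name lookup happens in one place.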
