I want to be able to use a loop to perform the same function on a group of data sets without having to recall the names of all of the data sets individually. For example, say I have the following matrices:
a<-matrix(1:5,nrow=5,ncol=2)
b<-matrix(6:10,nrow=5,ncol=2)
c<-matrix(11:15,nrow=5,ncol=2)
I define a vector of set names:
SetNames<- c("a","b","c")
Then I want to sum the second column of all of the matrices without having to call each matrix by name. Basically, I would like to be able to call SetNames[1], have the program return 'a' as USABLE text which can be used to call apply(a[2],2,sum).
If apply(SetNames[1][2],2,sum) worked, that would be the basic syntax I was looking for; however, I would replace the 1 with a variable I can increase in a loop.
sapply can do that.
sapply(SetNames, function(z) {
  dfz <- get(z)
  sum(dfz[,2])
})
#  a  b  c
# 15 40 65
Notice that get() is used here to look up a variable dynamically from its name, given as a string.
A less compact way of writing this would be:
# despite its name, sumRowTwo sums the second column
sumRowTwo <- function(z) {
  dfz <- get(z)
  sum(dfz[,2])
}
sapply(SetNames, sumRowTwo)
Now you can play around with sumRowTwo and see what, for example,
sumRowTwo("a")
returns.
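As a variation on the get() approach, base R's mget() fetches several objects by name in one call and returns them as a named list, so the lookup and the calculation can be kept separate. A minimal sketch using the matrices from the question; the names sets and col2_sums are only illustrative:

```r
a <- matrix(1:5, nrow = 5, ncol = 2)
b <- matrix(6:10, nrow = 5, ncol = 2)
c <- matrix(11:15, nrow = 5, ncol = 2)
SetNames <- c("a", "b", "c")

# mget() looks up every name and returns list(a = ..., b = ..., c = ...)
sets <- mget(SetNames)

# Sum the second column of each matrix in the list
col2_sums <- sapply(sets, function(m) sum(m[, 2]))
col2_sums
#  a  b  c
# 15 40 65
```

Once the matrices live in a list, any further step can loop over sets directly and never needs the variable names again.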
I have 7 large Seurat objects, saved as sn1, sn2, sn3 ... sn7
I am trying to run ScaleData on all 7 samples. I could write the same lines 7 times as:
all.genes <- rownames(sn1)
snN1<-ScaleData(sn1, features = all.genes)
all.genes <- rownames(sn2)
snN2<-ScaleData(sn2, features = all.genes)
all.genes <- rownames(sn3)
snN3<-ScaleData(sn3, features = all.genes)
.
.
.
This would work perfectly. Since I still have to use all 7 samples for quite a while, I thought I'd save time and write a for loop to do the job with one line of code, but I am unable to save the variables, and I get the error "Error in { : target of assignment expands to non-language object".
This is what I tried:
samples<-c("sn1", "sn2", "sn3", "sn4", "sn5", "sn6", "sn7")
list<-c("snN1", "snN2", "snN3", "snN4", "snN5", "snN6", "snN7")
for (i in samples) {
all.genes <- rownames(get(i))
list[1:7]<-ScaleData(get(i), features = all.genes)
}
How do I have to format the code so that it creates the variables snN1, snN2, snN3... and saves the scaled data from sn1, sn2, sn3... to each respective new variable?
I think the error is in this line: list[1:7]<-ScaleData(get(i), features = all.genes). On every iteration you are telling the for loop to assign the output of ScaleData to the 7 character strings stored in list, which makes no sense. I think you are looking for the function assign(), but its use is recommended only in very specific contexts.
Also, there are better methods than for loops in R, for example apply() and related functions. I recommend wrapping the steps you want to apply in a custom function, and then calling lapply() to apply it to every object in turn, as a for loop would, and store the results in a list. To pass every 'snX' variable as input, you can collect them in a list.
# Custom function
custom_scale <- function(x){
  all.genes <- rownames(x)
  ScaleData(x, features = all.genes)  # the last expression is the return value
}
# Apply the custom function and store the results in a list
# Create a list that holds every variable
samples = list(sn1, sn2, sn3, sn4, sn5, sn6, sn7) # Note I'm not using characters, I'm referencing the actual variables.
# Use lapply to iterate over the list and apply the custom function, saving the results as a list
scaled_Data_list = lapply(samples, custom_scale)
This should work; however, without example data I can't test it.
Here is how to do it using a loop and assign. I removed some redundant code/variables as this can always be a source of error. However, I agree with RobertoT that storing such data in a list and using lapply is a good idea.
samples <- paste0('sn', 1:7)
for (sn in samples) {
sn.data <- get(sn)
assign(sub('n', 'nN', sn),
ScaleData(sn.data, features=rownames(sn.data)))
}
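The two answers above can be combined: look the objects up by name with mget(), process them with lapply(), and keep the results in a named list. The sketch below is only an illustration that runs without Seurat: the two small matrices and base R's scale() are stand-ins (assumptions) for the real Seurat objects and ScaleData().

```r
# Toy stand-ins for the Seurat objects; scale() stands in for ScaleData()
sn1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
sn2 <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 3)
samples <- paste0("sn", 1:2)

# Fetch the objects by name, process each one, and rename the results
scaled <- lapply(mget(samples), scale)
names(scaled) <- sub("n", "nN", names(scaled))
names(scaled)
# "snN1" "snN2"

# Optional: copy the list elements into separate variables snN1, snN2
invisible(list2env(scaled, envir = environment()))
exists("snN1")
```

Keeping the results in the list (scaled$snN1, scaled[["snN2"]]) avoids assign() entirely; list2env() is only needed if later code expects separate variables.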
I have this parameter:
L_inf <- seq(17,20,by=0.1)
and this function:
fun <- function(x){
L_inf*(1-exp(-B*(x-0)))}
I would like to apply this function over a range of values of L_inf.
I tried with a for loop, like this:
A <- matrix() # maybe 10 col and 31 row or vice versa
for (i in L_inf){
A[i] <- fun(1:10)
}
But R responds: longer object length is not a multiple of shorter object length.
My expected output is a matrix (or data frame, or maybe a list) with 10 results (fun(1:10)) for each value of the vector L_inf (length = 31).
How can I do this?
You are trying to put a vector of 10 elements into a single matrix cell. You want to assign it to a matrix row instead (you can access the ith row with A[i,]). For that, i has to be an integer row index, so loop over seq_along(L_inf) rather than over the values of L_inf, and create A with its final dimensions before the loop.
But a for loop is more verbose than needed here, and it is quite straightforward to use one of the "apply" functions. The most basic one, lapply, returns a list (which is the most versatile container since there is basically no constraint on what it holds).
Here sapply is an apply function which tries to simplify its result to a convenient data structure. In this case, since all results have the same length (10), sapply will simplify the result to a matrix with one column per value of L_inf (10 rows, 31 columns).
Note that I modified your function to make it depend explicitly on L_inf. Otherwise it picks up the whole L_inf vector from the global environment and will not do what you think it should do (see the keyword "closures" if you want more info).
L_inf_range <- seq(17,20,by=0.1)
B <- 1
fun <- function(x, L_inf) {
L_inf*(1-exp(-B*(x-0)))
}
sapply(L_inf_range, function(L) fun(1:10, L_inf=L))
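For comparison, here is a sketch of the for-loop version done row by row, next to the sapply call. It reuses L_inf_range, B and fun from the answer; the names A and S are only illustrative. The loop fills one row per value of L_inf, while sapply returns one column per value, so the two results are transposes of each other.

```r
L_inf_range <- seq(17, 20, by = 0.1)
B <- 1
fun <- function(x, L_inf) {
  L_inf * (1 - exp(-B * (x - 0)))
}

# Preallocate a 31 x 10 matrix and fill it one row at a time
A <- matrix(NA_real_, nrow = length(L_inf_range), ncol = 10)
for (i in seq_along(L_inf_range)) {
  A[i, ] <- fun(1:10, L_inf = L_inf_range[i])
}

# sapply puts each result in a column instead
S <- sapply(L_inf_range, function(L) fun(1:10, L_inf = L))

dim(A)  # 31 10
dim(S)  # 10 31
all.equal(A, t(S))
```

Use t(S) if you prefer one row per value of L_inf.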
I am implementing k-means in R.
In a loop, I am initiating several vectors that will be used to store values that belong to a particular cluster, as seen here:
for(i in 1:k){
assign(paste("cluster",i,sep=""),vector())
}
I then want to add to a particular "cluster" vector, depending on the value I get for the variable getIndex. So if getIndex is equal to 2 I want to add the variable minimumDistance to the vector called cluster2. This is what I am attempting to do:
minimumDistance <- min(distanceList)
getIndex <- match(minimumDistance,distanceList)
clusterName <- paste("cluster",getIndex,sep="")
name <- c(name, minimumDistance)
But obviously the above code does not work because in order to append to a vector that I'm naming I need to use assign as I do when I instantiate the vectors. But I do not know how to use assign, when using paste, when also appending to a vector.
I cannot use the index such as vector[i] because I don't know what index of that particular vector I want to add to.
I need to use the vector <- c(vector,newItem) format but I do not know how to do this in R. Or if there is any other option I would greatly, greatly appreciate it. If I were using Python I would simply use paste and then use append but I can't do that in R. Thank you in advance for your help!
You can do something like this:
out <- list()
for (i in 1:nclust) {
  # assign some data (in this case a list; replace ... with your own data) to a cluster
  assign(paste0("N_", i), list(...))
  # here I put all the clusters' data in a list,
  # but you could use a similar statement to do further data manipulation,
  # i.e. if you've used a common syntax (here "N_" <index>) to name your elements,
  # you can use get to retrieve them using the same syntax
  out[[i]] <- get(paste0("N_", i))
}
If you want a more complete code example, emclustr::em_clust_mvn looks like a similar problem.
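Applied to the k-means question, appending to a cluster chosen at run time could look like the sketch below. The values in distanceList are made up for illustration, and which.min() is used in place of match(min(...), ...) since it returns the same index. Option 1 keeps the clusters in a list; option 2 keeps the separate cluster1, cluster2, ... variables from the question and combines get() with assign().

```r
k <- 3
distanceList <- c(4.2, 1.5, 3.3)  # made-up distances, one per cluster centre

minimumDistance <- min(distanceList)
getIndex <- which.min(distanceList)  # index of the closest centre

# Option 1: a list with one slot per cluster, appended to by index
clusters <- vector("list", k)
clusters[[getIndex]] <- c(clusters[[getIndex]], minimumDistance)

# Option 2: separate variables; read the old value with get(), write with assign()
for (i in 1:k) {
  assign(paste0("cluster", i), vector())
}
clusterName <- paste0("cluster", getIndex)
assign(clusterName, c(get(clusterName), minimumDistance))
cluster2
```

The list version is usually easier to work with afterwards, for example lengths(clusters) gives the size of every cluster in one call.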
I would like to write an expression in R where I can modify some part of it depending on some other variable. For example, let's say that I want to create a lot of new columns in my data frame using the following basic expression:
MyDataFrame$ColumnNumber1 <- 33
If I wanted to create a lot of these, I could just type them out explicitly
MyDataFrame$ColumnNumber2 <- 52
MyDataFrame$ColumnNumber3 <- 12
...
and so on. However, this would be impractical if I have too many of them. Therefore, I'd like to have some way of replacing a part of the expression with something that's generated through a variable. Let's for example say that there was an operator Input() that would do this. Then I could write a loop looking like this:
for(i in 1:1000)
{
MyDataFrame$ColumnNumberInput(i) <- 32*i
}
where the Input(i) part would be replaced with whatever number the counter was at at the moment, which in turn would generate an expression.
I know that I can do this by using:
eval(parse(text=paste("MyDataFrame$","Column","Number","1","<- 32*i",sep="")))
but this becomes impossible to use if the expression is too complicated or long, or has other things like this nested inside of it.
Use [[]]:
for(i in 1:1000)
{
MyDataFrame[[i]] <- 32*i
}
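[[]] is the programmatic equivalent of $: it accepts a column position or a column name held in a variable.
You may then reference the data structure as:
MyDataFrame[[14]] <- MyDataObjFourteen
MyDataFrame[[14]]   # the 14th column, by position
MyDataFrame$`14`    # only if a column is literally named "14"; a bare MyDataFrame$14 is a syntax error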
You could also use strings, like so:
MyDataFrame[["SomeString"]] <- 32
MyDataFrame$SomeString
# 32
EDIT 1:
To do something similar, in a more general way try:
assign("someArbitraryString", 3)
someArbitraryString
# 3
as.name(someArbitraryString)
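To get the column names asked for in the question (ColumnNumber1, ColumnNumber2, ...), the name can be built with paste0() and passed to [[ ]] as a string. A minimal sketch; the starting data frame with an id column and the loop length of 5 are assumptions for illustration:

```r
MyDataFrame <- data.frame(id = 1:3)  # hypothetical starting data frame

for (i in 1:5) {
  col <- paste0("ColumnNumber", i)   # "ColumnNumber1", "ColumnNumber2", ...
  MyDataFrame[[col]] <- 32 * i       # creates or overwrites that column
}

names(MyDataFrame)
MyDataFrame$ColumnNumber3
```

Because the name is an ordinary string, this stays readable however complicated the right-hand side gets, and no eval(parse()) is needed.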
I am trying to put together a function that will loop through a given data frame in blocks and return a new data frame containing stuff calculated from the original. The length of x will be different each time and the actual problem will have more loops in the function. I am new-ish to R and have not been able to find anything helpful (I don't think using a list will help).
func<-function(x){
tmp # need to declare this here?
for (i in 1:dim(x)[1]){
tmp[i]<-ave(x[i,]) # add things to it
}
return(tmp)
}
df<-cbind(rnorm(10),rnorm(10))
means<-func(df)
This code does not work but I hope it gets across what I want to do. thanks!
Do you mean you want to loop through each row of df and return a data frame with the calculated values?
You may want to look into the apply function:
df <- cbind(rnorm(10),rnorm(10))
# apply(df,1,FUN) does FUN(df[i,])
# e.g. mean of each row:
apply(df,1,mean)
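If you do want to keep the explicit loop from the question, the result vector has to exist before it is filled. A sketch under the assumption that the per-row calculation is a mean (so mean() replaces the ave() call in the question); set.seed() is only there to make the run reproducible:

```r
func <- function(x) {
  tmp <- numeric(nrow(x))          # preallocate one slot per row
  for (i in seq_len(nrow(x))) {
    tmp[i] <- mean(x[i, ])         # fill slot i from row i
  }
  tmp
}

set.seed(1)
df <- cbind(rnorm(10), rnorm(10))
means <- func(df)

# Same values as the apply() call and as the built-in rowMeans()
all.equal(means, apply(df, 1, mean))
all.equal(means, rowMeans(df))
```

numeric(nrow(x)) adapts to whatever number of rows x has, so the function works for inputs of different lengths.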
For more complicated looping, like performing some operation on a per-factor basis, I strongly recommend the package plyr and its function ddply. Quick example:
library(plyr)
df <- data.frame( gender=c('M','M','F','F'), height=c(183,176,157,168) )
# find mean height *per gender*
ddply(df,.(gender), function(x) c(height=mean(x$height)))
# returns:
#   gender height
# 1      F  162.5
# 2      M  179.5
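If you would rather not install a package, the same per-group mean can be computed in base R. A sketch using the same example data; the name per_gender is only illustrative:

```r
df <- data.frame(gender = c("M", "M", "F", "F"),
                 height = c(183, 176, 157, 168))

# aggregate() returns a data frame with one row per group
per_gender <- aggregate(height ~ gender, data = df, FUN = mean)
per_gender
#   gender height
# 1      F  162.5
# 2      M  179.5

# tapply() returns a named vector instead
tapply(df$height, df$gender, mean)
```

aggregate() is the closer match to ddply here because it also returns a data frame.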