for loop and apply to calculate means - r

I want to calculate the means of 32 vectors in a list. I thought this code would do the job:
for(i in sequence(length(means16list))){
mat.means16 <- apply(means16list, 1, mean)
}
where means16list contains 32 numeric vectors and mat.means16 should contain the means. mat.means16 is a 4 x 4 matrix defined in a previous step.
Maybe I have not understood how loops work yet.
Can someone help?
Cheers

mat.means16 is being overwritten on every iteration; you should create a list and store the results there instead.
There are likely better ways to do this if you post example data. I'm assuming each list element is a matrix and you want its rowMeans(); for plain numeric vectors rowMeans() fails, so use mean instead.
results <- lapply(means16list, rowMeans)

Unless I misunderstood, why not just:
# Sample data
set.seed(2017)
means16list <- lapply(1:32, function(x) runif(10))
# Return a list of the sample means
lapply(means16list, mean)
# Return a vector of the sample means
sapply(means16list, mean)
I don't see the point of the for loop: lapply and sapply already loop over every element of your list and apply the function mean to each one.
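To see why the original loop fails: its body never uses i, so every iteration would repeat the same call, and apply() itself stops with an error because a list has no dim attribute. Below is a minimal sketch of a working loop, assuming each list element is a plain numeric vector. The sample data and the result names means16 and means16.v are made up for illustration.

```r
# Hypothetical sample data: a list of 32 numeric vectors
set.seed(2017)
means16list <- lapply(1:32, function(x) runif(10))

# Preallocate the result, then fill one slot per iteration
means16 <- numeric(length(means16list))
for (i in seq_along(means16list)) {
  means16[i] <- mean(means16list[[i]])
}

# vapply() does the same in one call and checks that
# every result is a single number
means16.v <- vapply(means16list, mean, numeric(1))
all.equal(means16, means16.v)
```

The loop is only worth keeping if each iteration needs extra work; otherwise sapply or vapply is shorter and harder to get wrong.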

Related

R: Iterating Parameter Arguments from List for Random Generation For Loop

I'm new to the forum and to R, so please forgive the sloppy code.
In short, I am trying to draw from a normal distribution whose parameters are taken iteratively from two lists, inside a for loop that generates a 30x10000 matrix of random samples using these parameters.
The first list (List1) is a collection of numeric vectors. The second list (List2) holds the corresponding values I would like to use for the standard deviation argument in rnorm, i.e. the standard deviation for vector 1 from List1 is Value1 in List2.
set.seed(1500) #set up random gen
var1 = rnorm(1:1000, mean = #mean of vector(i) from list1, sd = #value(i) from List2)
sample(var1,size=1)
X = matrix(ncol = 30, nrow = 10000)
for(j in 1:length(var1)){ #simulates data using parameters set by rnorm var1 function
for(i in 1:10000){
X[i, j] = sample(var1, 1)
}
}
Here's the original post where this code is inspired from.
Cheers!
It seems mapply() would help you:
# First, turn list1 into a list of means.
dist.means = lapply(list1, mean)
lapply() executes a function for every element of a list. mapply() works in a very similar way but takes multiple lists.
samples = mapply(rnorm, 30*10000, dist.means, list2, SIMPLIFY = FALSE)
A little more explanation: mapply() runs rnorm() multiple times. On the first call, it uses the first element of the first list as the first argument, the first element of the second list as the second argument, and so on. So in our case it will run rnorm(30*10000, dist.means[[1]], list2[[1]]), then rnorm(30*10000, dist.means[[2]], list2[[2]]), and so on, and store the outputs in a list.
Note that I use a small trick here. The first argument is a single number, 30*10000. When you give mapply() arguments of different lengths, it recycles the shorter ones, i.e. it repeats them until they have the same length as the longest one.
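As a rough end-to-end sketch of the approach above, the following uses made-up contents for list1 and list2 (three vectors and three standard deviations, purely for illustration) and reshapes each draw into the 10000 x 30 matrix the question asks for. The reshaping step and the name X are assumptions, not part of the original answer.

```r
# Hypothetical inputs: three numeric vectors and one sd per vector
set.seed(1500)
list1 <- list(rnorm(50, mean = 0), rnorm(50, mean = 5), rnorm(50, mean = 10))
list2 <- list(1, 2, 0.5)

dist.means <- lapply(list1, mean)

# One call to rnorm() per (mean, sd) pair; the sample size is recycled
samples <- mapply(rnorm, 30 * 10000, dist.means, list2, SIMPLIFY = FALSE)

# Reshape each draw into a 10000 x 30 matrix
X <- lapply(samples, matrix, nrow = 10000, ncol = 30)

length(X)    # one matrix per parameter pair
dim(X[[1]])  # rows, columns
```

SIMPLIFY = FALSE keeps the result as a list; without it mapply() would bind the equal-length draws into a single matrix with one column per parameter pair.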
Hope that helps

need to assign variables some values in a loop in R

I need to assign variables some values in a loop
Eg:
abc_1<-
abc_2<-
abc_3<-
.....
something like:
for(i in 1:20)
{
paste("abc",i,sep="_")<-some calculated value
}
I have tried to use paste as above, but it doesn't work.
How could this be done? Thanks
assign() and paste0() should help you.
For example:
object_names <- paste0("abc_", 1:20)
for (i in 1:20){
  assign(object_names[i], runif(40))
}
assign() takes the string in object_names[i] and binds the value given as the second argument to that name. When you pass a numeric vector to paste0(), it returns a character vector with one concatenated string per element of the numeric vector.
edit:
As Gregor says below, this is much better to do in a list because:
It will be faster.
When making a large number of things you probably want to do the same thing to each of them. lapply() is very good at this.
For example:
N <- 20
# create random numbers in list
abcs <- lapply(1:N,function(i) runif(40))
# multiply each vector in the list by 10
abc.mult <- lapply(abcs, function(v) v * 10)
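If the abc_1, abc_2, ... labels matter, a named list keeps them without creating twenty separate variables. This is a small sketch under that assumption; the name abc.means is invented here for illustration.

```r
N <- 20
# Same labels the question asked for: abc_1, abc_2, ..., abc_20
object_names <- paste0("abc_", 1:N)

# Build the values in one pass and attach the names
abcs <- setNames(lapply(1:N, function(i) runif(40)), object_names)

# Look up one element by name instead of calling get("abc_3")
head(abcs[["abc_3"]])

# Apply the same operation to every element; the names are carried along
abc.means <- sapply(abcs, mean)
abc.means["abc_3"]
```

Everything stays in one object, so it can be saved, passed to a function, or looped over without building variable names from strings.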

How to apply operation and sum over columns in R?

I want to apply some operations to the values in a number of columns, and then sum the results of each row across columns. I can do this using:
x <- data.frame(sample=1:3, a=4:6, b=7:9)
x$a2 <- x$a^2
x$b2 <- x$b^2
x$result <- x$a2 + x$b2
but this will become arduous with many columns, and I'm wondering if anyone can suggest a simpler way. Note that the dataframe contains other columns that I do not want to include in the calculation (in this example, column sample is not to be included).
Many thanks!
I would simply subset the columns of interest and apply everything directly on the matrix using the rowSums function.
x <- data.frame(sample=1:3, a=4:6, b=7:9)
# select the columns by index and apply your function
x$result <- rowSums(x[,c(2,3)]^2)
This of course assumes your function is vectorized. If it is not, you would need some apply variant (of which you are seeing many here). That said, you can still use rowSums as shown next. Note that I use sapply, which here also returns a matrix.
# random custom function
myfun <- function(x){
return(x^2 + 3)
}
rowSums(sapply(x[,c(2,3)], myfun))
I would suggest converting the data set into the 'long' format, grouping it by sample, and then calculating the result. Here is the solution using data.table:
library(data.table)
melt(setDT(x),id.vars = 'sample')[,sum(value^2),by=sample]
#    sample  V1
# 1:      1  65
# 2:      2  89
# 3:      3 117
You can easily replace value^2 by any function you want.
You can use the apply function, selecting the columns you need with c(i1, i2, ...).
apply(x[, c(2, 3)]^2, 1, sum)
If you want to apply a function named somefunction to some of the columns, whose indices or column names are in the vector col_indices, and then sum the results, you can do:
# if somefunction is vectorized:
x$results <- apply(x[, col_indices], 1, function(r) sum(somefunction(r)))
# if not:
x$results <- apply(x[, col_indices], 1, function(r) sum(sapply(r, somefunction)))
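To make the placeholders concrete, here is a small sketch with assumed definitions: col_indices is taken to be the column names "a" and "b", and somefunction is an arbitrary vectorized example. Neither definition comes from the answer above.

```r
x <- data.frame(sample = 1:3, a = 4:6, b = 7:9)

# Hypothetical stand-ins for the placeholders used above
col_indices <- c("a", "b")
somefunction <- function(v) v^2 + 3

# Vectorized route: transform the selected columns at once, then sum each row
x$results <- rowSums(somefunction(x[, col_indices]))

# Row-by-row route with apply(); it should give the same numbers
by_row <- apply(x[, col_indices], 1, function(r) sum(somefunction(r)))
all(x$results == by_row)
```

Selecting by name rather than by position keeps the code correct if columns are later added or reordered.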
I want to come at this one from a "no extensions" R POV.
It's important to remember what kind of data structure you are working with. Data frames are actually lists of vectors: each column is itself a vector. So you can use the handy-dandy lapply function to apply a function to the desired columns in the list/data frame.
I'm going to define the function as the square, as you have above, but of course this can be any function of any complexity (so long as it takes a vector as input and returns a vector of the same length). If it doesn't, it won't fit into the original data.frame!
The steps below are extra pedantic to show each little bit, but obviously they can be compressed into one or two steps. Note that I only retain the row sums of the squared columns, given that you might want to save space in memory if you are working with lots and lots of data.
1. Create the data; define the function.
2. Grab the columns you want as a separate (temporary) data.frame.
3. Apply the function to the data.frame/list you just created.
4. lapply returns a list, so if you intend to retain it separately, make it a temporary data.frame. This is not necessary.
5. Calculate the sums of the rows of the temporary data.frame and append them as a new column in x.
6. Remove the temporary data.frame.
Code:
x <- data.frame(sample=1:3, a=4:6, b=7:9); square <- function(x) x^2 #step 1
x[2:3] #Step 2
temp <- data.frame(lapply(x[2:3], square)) #step 3 and step 4
x$squareRowSums <- rowSums(temp) #step 5
rm(temp) #step 6
Here is another apply solution:
cols <- c("a", "b")
x <- data.frame(sample=1:3, a=4:6, b=7:9)
x$result <- apply(x[, cols], 1, function(x) sum(x^2))

How to assign one value to a specific matrix entry, where the matrix must be called with a variable name

I have the following problem: I have a huge list of matrices with unique names that share the same dimensions. I calculate some values that I now want to assign to a certain matrix entry, e.g. [3,4]. Because I have so many matrices, I created a list with the names those matrices should have and then used assign() to create all of them (empty). I would now like to access individual matrices by their variable names in order to assign different values to certain entries. I only know the commands assign() and eval(parse()), but didn't manage to get them working. I tried several things without success:
assign(x=MatrixNameList[i][3,4],value=z)
assign(x=MatrixNameList[i],value=z)[3,4]
eval(parse(text=MatrixNameList[i]))[3,4]<-z
assign(x=eval(parse(text=MatrixNameList[i]))[3,4] ,value=z)
So I am wondering if there is a possibility for what I want to do. The structure of my code is a simple loop:
Matrix1 <- Matrix2 <- matrix(nrow=3,ncol=4)
MatrixNameList <- c('Matrix1', 'Matrix2')
for (i in 1:length(MatrixNameList)){
z <- calculatedValue <- 4 # different for the single matrices
assign... ?
eval(parse... ?
}
I hope I was able to clearly point out my problem. Thanks in advance,
Eric
Use get:
get(MatrixNameList[1]) # retrieves matrix called "Matrix1"
However, you're better off collecting all those matrices into one object. Something like this should get you started.
Matrices <- lapply(MatrixNameList, get)
You can assign values like the following:
MatrixList <- list(Matrix1, Matrix2)
names(MatrixList) <- MatrixNameList
MatrixList[[1]][2,3] <- 4
# OR:
MatrixList$Matrix1[2,3] <- 4
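Putting this together with the loop from the question, a sketch might look like the following. The value i * 4 is only a placeholder for whatever is calculated per matrix, and mget() is used here as a shortcut for collecting the named objects into a named list. The second loop shows the get/modify/assign route in case the standalone variables really must be updated.

```r
Matrix1 <- Matrix2 <- matrix(nrow = 3, ncol = 4)
MatrixNameList <- c("Matrix1", "Matrix2")

# Collect the existing matrices into one named list
MatrixList <- mget(MatrixNameList)

for (i in seq_along(MatrixNameList)) {
  z <- i * 4  # placeholder for the value calculated for matrix i
  MatrixList[[MatrixNameList[i]]][3, 4] <- z
}
MatrixList$Matrix2[3, 4]

# Alternative: update the standalone variables through a local copy
for (i in seq_along(MatrixNameList)) {
  m <- get(MatrixNameList[i])
  m[3, 4] <- i * 4
  assign(MatrixNameList[i], m)
}
Matrix1[3, 4]
```

The attempts in the question fail because assign() expects a plain name as its first argument, not an indexing expression, which is why the entry has to be changed on a copy that is then written back.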

Means from a list of data frames in R

I am relatively new to R and have a complicated situation to solve. I have uploaded a list of over 1000 data frames into R and called this list x. What I want to do is take certain data frames and take the mean and variance of the entire data frames (excluding the first column of each) and save these into two separate vectors. For example I wish to take the mean and variance of every third data frame in the list starting from element (3) and going to element (54).
So what I ultimately want are two vectors:
meanvector=c(mean(data frame(3)), mean(data frame(6)),..., mean(data frame(54)))
variancevector=c(var(data frame (3)), var(data frame (6)), ..., var(data frame(54)))
This problem is way above my knowledge level. I think I can do this effectively with some sort of loop, but I do not know how to go about writing such a loop. Any help would be much appreciated! Thank you in advance.
You can use lapply and pass the indices as follows:
ids <- seq(3, 54, by = 3)
out <- do.call(rbind, lapply(ids, function(idx) {
  # drop the first column and flatten the rest into one numeric vector
  vals <- unlist(x[[idx]][, -1])
  c(mean = mean(vals), var = var(vals))
}))
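If x is a list of 1000 data frames, you can use lapply to return the means and variances for a subset of this list.
# every third data frame from element 3 to element 54, as in the question
ix = seq(3, 54, by = 3)
lapply(x[ix], function(df){
  # exclude the first column and flatten the rest to a numeric vector;
  # mean() on a data frame returns NA and var() would return a covariance matrix
  vals <- unlist(df[-1])
  c(mean(vals), var(vals))
})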
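Since the question asks for two separate vectors, here is a sketch that produces them directly. The list x is simulated here as a stand-in for the real data (60 small data frames with an id column first), and the helper name flat is invented for illustration.

```r
# Hypothetical stand-in for the real data: 60 small data frames,
# each with an id column followed by two numeric columns
set.seed(1)
x <- lapply(1:60, function(k) data.frame(id = 1:5, u = rnorm(5), v = rnorm(5)))

ids <- seq(3, 54, by = 3)

# Flatten every column except the first into one numeric vector per data frame
flat <- lapply(x[ids], function(df) unlist(df[-1], use.names = FALSE))

meanvector     <- vapply(flat, mean, numeric(1))
variancevector <- vapply(flat, var, numeric(1))

length(meanvector)  # one entry per selected data frame
```

This treats all remaining cells of a data frame as one pooled sample, which is one reading of "the mean and variance of the entire data frame"; per-column statistics would need colMeans() or sapply(df[-1], var) instead.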

Resources