applying function to a series of items in a list of lists - r

I have a function that needs to be applied to the same item in a list which is in a list pool. it is as follows:
myFUN<-function(x) {
myRESULT<-sqrt(sd(x)/50)
return(myRESULT)
}
i need to apply myFUN to a list such as:
big.list[[i]][["smaller.list"]][["smallest.list"]]
here, the thing is that there are 1500 different i in big.list and each of them have smaller.list and smallest.list in them. in other words, smallest.list is a list in smaller.list while smaller.list is a list in i and i is a list in big.list. also is are numbers from 1 to 1500.
I need to apply myFUN to each of the is and get the mean of them.

An option is to loop through the list, extract the smaller.list and smallest.list, apply the function. and then get the mean
mean(sapply(big.list, function(x) myFUN(x[["smaller.list"]][["smallest.list"]])))

Related

Lapply on a list of a list

This is all about a code in R.
I have seperated a big data file "All_data.csv" in smaller data of individuals in a particular year.
So the list looks like this:
All individuals is a list of 25 individuals. If you would then take the first element of that list you get:
Indivudual 1: dataframe_year1, dataframe_year2.
If you take the second element you get for instance:
Individual 2: dataframe_year1, dataframe_year2, dataframe_year3.
etc. so the lists in the lists differ in their length.
Now I want to do a (analysis) function on the dataframes, I do not need to store the output again in the list per se.
I solved it with doing an lapply on the list All_data, with a function defined by myself which also calls lapply again and then my analysis function. But I was wondering if there was another way. Because it seems a bit inefficient to do.
split <- function (All_data)
{
#function that splits files by date and individual
#returns list of individuals and within that list is another list of dataframes. Called All_individuals
}
Make_analysis <- function (All_individuals)
{
Listfiles <- split (All_individuals)
HRE <- lapply (Listfiles, Doall)
}
Analysis <- function (files)
{
...
}
function calls:
lapply (All_data, Make_analysis)
Could anyone help?
Also is this the best way to go if I would want to parallise the analysis with RSlurm to run it on a HPC? Then I could change lapply with slurm map right?
My function in itself works but it seems very inefficient. Would like some tips on how to make code more efficient. Also on how to parallise it with Rslurm.

multiplying two lists of matrices in continuos form

I have two lists of matrices and I want to multiply the first element of the first list with the first element of the second list and so on, without writing every operatios due to may be a large number of elements on each list (both lists have the same length)
this is what I mean
'(colSums(R1*t(M1))),(colSums(R2*t(M2))),...(colSums(Rn*t(Mn)))'
Do I need to create an extra list?
Although first I must be able to transpose the matrices of one of the lists before multiplying them. The results will be used for easier operations.
I already tried to use indexes and loops and doesn't work,
first tried to transpose matrices in one list like this (M is one of the lists and the other is named R, M contains M1,M2,..Mn and the same for list R)
The complete operation looks like this:
'for (i in 1:length(M)){Mt<-list(t(M[[i]]))}'
and only applies it to the last element.
The full operation looks like this:
'(cbind((colSums(R1*t(M1))),(colSums(R2*t(M2))),...(colSums(Rn*t(Mn))))'
any step of these will be useful
you could use the rlist package.
The function
list.apply(.data, .fun, ...)
will apply a function to each list element.
You can find documentation at [https://cran.r-project.org/web/packages/rlist/rlist.pdf][1].

How do I apply a function with multiple arguments over a list?

For example, I want to apply the intersect() function to every element (dataset) of a list. I want each element compared to this aother dataset, data1.
I know I can use a for loop, but I was thinking that I could use lapply. However, I need to hold one of the arguments constant. How can I do so?
This doesn't work:
> lapply(list(winnepennninckx, brunner), intersect(,selectG))

Using R lapply on a list to remove elements from the list

I have a list I want to process to remove elements not meeting certain criteria. I'd use a loop but that is too slow so I assume lapply might be better.
assuming the last thing function y does is give a value of 1 or 2 to variable x how would I modify
lapply(list,functiony, z=valueA, a=valueB) to give back a list of only the elements of the list that are of value x=1?
Try mylist[sapply(mylist, functiony, z=valueA, a=valueB)==1]

Performing column select over multiple dataframes

I have looked around a lot for this answer, they get close but no cigar. I am trying to perform a selection of columns over multiple dataframes. I can do this and return a list, but I wish to preserve the dataframes in the global environment. I want to keep the dataframes separate for ease of use and visibility in Rstudio. For example I am selecting columns based on their name as so, for one dataframe:
E07 <- E07[,c("Block","Name","F635.Mean","F532.Mean","B635.Mean","B532")]
I have x amount of data frames listed in dflist so I have written this function:
columnselect<-function(df){df[,c("Block","Name","F635.Mean","F532.Mean","B635.Mean","B532")];df}
I then wish to apply this over the dflist as so:
lapply(X=dflist,FUN=columnselect)
This returns the function over the dflist however the data tables remain unchanged. How do I apply the function over multiple dataframes without returning them in a list.
Many thanks
M
Your function returns the data frames unchanged because this is the last thing evaluated in your function. Instead of:
columnselect<-function(df){
df[,c("Block","Name","F635.Mean","F532.Mean","B635.Mean","B532")]
df}
It should be:
columnselect<-function(df){
df[,c("Block","Name","F635.Mean","F532.Mean","B635.Mean","B532")]
}
Having the last df in your function simply returned the full df that you passed in the function.
As for the second question that you would like to have the data.frames in the global environment rather than in the list (which is bad practice just so you know; it is always better to keep those in the list) you need the list2env function i.e.:
mylist <- lapply(X=dflist,FUN=columnselect)
list2env(mylist, envir = globalenv())
Using this the data.frames in the global environment will be updated.

Resources