Extract xyz values from list of lists in R

I want to extract values from a list of (named) lists in R.
Example
The data looks as follows:
data <- list('1' = list(x = c(1,2,3), y = c(2,3,4), z = c(2,3,7)),
             '2' = list(x = c(2,3,4,5), y = c(3,4,5,6), z = c(1,2,3,5)))
From a specified list (e.g., '1'), I would like to extract all the first/second/etc elements from the lists. The choice for the index of the element should be random.
For example, if I want to sample from the first list (i.e., '1'), I generate a random index and extract the x, y, and z values corresponding to that random index. Say the index is 2, then the elements should be x=2, y=3, and z=3.
Approach
I thought a function should be able to do the job. The first step was to call the list from the function:
This works:
x <- function(i){
  data$`1`
}
x(1)
But this doesn't:
x <- function(i){
  data$`i`
}
x(1)
Question
How do I call a list of named lists from within the function? And what is the most convenient way to sample data corresponding to the selected index?

Do you need something like this?
get_elements <- function(data, i) {
  # select the main list
  tmp <- data[[i]]
  # check the length of each sub-list, take the minimum,
  # and sample one index from 1 to that number
  rand_int <- sample(min(lengths(tmp)), 1)
  # select that element from each sub-list
  sapply(tmp, `[[`, rand_int)
}
get_elements(data, 1)
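As for why the second version in the question returns NULL: `$` does not evaluate what follows it, so data$`i` looks for an element literally named "i". `[[` does evaluate its argument. A minimal sketch of the difference, reusing the question's data; pick_at is a hypothetical helper name, and the index is fixed at 2 here so the result is reproducible:

```r
data <- list('1' = list(x = c(1,2,3), y = c(2,3,4), z = c(2,3,7)),
             '2' = list(x = c(2,3,4,5), y = c(3,4,5,6), z = c(1,2,3,5)))

i <- "1"
data$i      # NULL: looks for an element literally named "i"
data[[i]]   # the sub-list named "1"

# hypothetical helper: take element idx from every vector in the chosen sub-list
pick_at <- function(data, name, idx) {
  vapply(data[[name]], function(v) v[[idx]], numeric(1))
}

pick_at(data, "1", 2)
# x y z
# 2 3 3
```

Note that data[[1]] (numeric) selects by position while data[["1"]] (character) selects by name; the two happen to coincide for this data.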

If I understood your problem correctly, one solution uses the "purrr" package:
library(purrr)
# list "name"
i <- '1'
# index
j <- 2
# to get the needed info as a list:
purrr::map(data[[i]], ~ .x[j])
# to get the needed info as a data.frame:
purrr::map_df(data[[i]], ~ .x[j])
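If you would rather not add a dependency, the same extraction can be sketched in base R. This assumes the same data, i and j as above; as_list and as_df are just illustrative names:

```r
data <- list('1' = list(x = c(1,2,3), y = c(2,3,4), z = c(2,3,7)),
             '2' = list(x = c(2,3,4,5), y = c(3,4,5,6), z = c(1,2,3,5)))
i <- '1'
j <- 2

# as a list: element j of each vector in sub-list i
as_list <- lapply(data[[i]], function(v) v[j])

# as a one-row data.frame with columns x, y, z
as_df <- as.data.frame(as_list)
```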

Related

R Convert loop into function

I would like to clean up my code a bit and start to use more functions for my everyday computations (where I would normally use for loops). I have an example of a for loop that I would like to make into a function. The problem I am having is in how to step through the constraint vectors without a loop. Here's what I mean;
## represents spectral data
set.seed(11)
df <- data.frame(Sample = 1:100, replicate(1000, sample(0:1000, 100, rep = TRUE)))
## feature ranges by column number
frm <- c(438,563,953,963)
to <- c(548,803,1000,993)
nm <- c("WL890", "WL1080", "WL1400", "WL1375")
WL.ps <- list()
for (i in 1:length(frm)){
  ## finds the minimum value within the range constraints and returns the corresponding column name
  WL <- colnames(df[frm[i]:to[i]])[apply(df[frm[i]:to[i]],1,which.min)]
  WL.ps[[i]] <- WL
}
new.df <- data.frame(WL.ps)
colnames(new.df) <- nm
The part where I iterate through the 'frm' and 'to' vector values is what I'm having trouble with. How does one go from frm[1] to frm[2] and so on in a function (apply or otherwise)?
Any advice would be greatly appreciated.
Thank you.
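You could write a function that returns the column name of the minimum value in each row for a given range of columns. I have used max.col on the negated data instead of apply(df, 1, which.min), since max.col works on the whole matrix at once and is typically faster than apply. Note that max.col breaks ties at random by default; ties.method = "first" makes it agree with which.min.
apply_fun <- function(data, x, y) {
  cols <- x:y
  # negate so that the row maximum corresponds to the original row minimum
  names(data[cols])[max.col(-data[cols], ties.method = "first")]
}
Apply this function using Map: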
WL.ps <- Map(apply_fun, frm, to, MoreArgs = list(data = df))
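To get from the Map result back to the data frame the question builds, something like the following should work. It is a scaled-down sketch: the 5 x 10 data, the two column ranges and the names rangeA/rangeB are made up for illustration, and ties.method = "first" is used so the result agrees with which.min:

```r
set.seed(11)
df <- data.frame(Sample = 1:5, replicate(10, sample(0:1000, 5, rep = TRUE)))
frm <- c(2, 6)                 # illustrative column ranges
to  <- c(5, 11)
nm  <- c("rangeA", "rangeB")   # illustrative names

apply_fun <- function(data, x, y) {
  cols <- x:y
  names(data[cols])[max.col(-data[cols], ties.method = "first")]
}

# Map walks frm and to in parallel; data is passed unchanged on every call
WL.ps <- Map(apply_fun, frm, to, MoreArgs = list(data = df))
new.df <- setNames(data.frame(WL.ps, stringsAsFactors = FALSE), nm)
```

Each column of new.df then holds, per row, the name of the column with the smallest value inside that range.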

How to call a different value for each element in a list in R

I have a list with 29 data frames.
I am trying to do a simple transformation with ifelse(), that looks something like this:
with(df, ifelse(col1 > x, col1 <- col1-y, col1<-col1+y))
The one thing I can't seem to get is how to change that x and y value so that a different value is used for each data frame in the list.
Here's a quick reproducible example of what I've got so far .. but I want to call different values for x and y from a data frame (e.g. info)
df.1 <- data.frame("df"=rep(c(1), times=4),"length"=c(10:7))
df.2 <- data.frame("df"=rep(c(2),times=4),"length"=c(8:11))
df.3 <- data.frame("df"=rep(c(3),times=4),"length"=c(9:12))
list <- list(df.1,df.2,df.3)
info <- data.frame(x=rep(c(8.5,9.5,10.5)), y=rep(c(1,1.5,2)))
# using static number for x & y but wanting these to be grabbed from the above df and change
# for each list
x <- 8
y <- 1
lapply(list, function(df) {
  df <- with(df, ifelse(length > x,
                        length <- length-y,
                        length <- length+y)) })
Any and all help/insight is appreciated!
Edited to add clarification:
I would like the rows to match up with lists.
E.g. Row 1 in Info (x=8.5, y=1) is used in the function and applied just to the first data frame in the list (df.1).
When you need to iterate over more than one vector or list in parallel, use mapply (or Map) instead of lapply.
mapply(
  function(df, x, y) {
    #print("df")
    #print(df)
    #print("x")
    #print(x)
    #print("y")
    #print(y)
    # subtract y above the threshold x, add y otherwise
    with(df, ifelse(length > x, length - y, length + y))
  },
  list,
  info$x,
  info$y
)
I've left some debugging code commented out, which can be enabled if you want to see how it works.
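If you want to keep the data frames themselves (with the length column updated) rather than get back a matrix of values, a Map-based variant is one option. This is only a sketch: the list is called dfs here to avoid masking base::list, and adjusted is an illustrative name:

```r
df.1 <- data.frame(df = rep(1, 4), length = 10:7)
df.2 <- data.frame(df = rep(2, 4), length = 8:11)
df.3 <- data.frame(df = rep(3, 4), length = 9:12)
dfs  <- list(df.1, df.2, df.3)
info <- data.frame(x = c(8.5, 9.5, 10.5), y = c(1, 1.5, 2))

# row k of info is applied to the k-th data frame
adjusted <- Map(function(d, x, y) {
  d$length <- ifelse(d$length > x, d$length - y, d$length + y)
  d
}, dfs, info$x, info$y)

adjusted[[1]]$length
# 9 8 9 8
```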

How to split and rearrange list

I have a list of year_month data frames.
They are like this
List = c( A2017_1, A2017_2,....A2017_12, A2018_1, ... A2018_12, ..... ) and so on.
I want to rearrange this list with the months, like this:
month_1 = c(A2017_1, A2018_1, A2019_1, ....)
month_2 = c(A2017_2, A2018_2, A2019_2, ....)
.
.
... and so on.
This is what I tried.
for (x in 1:12){
  LF <- emp_yymm[grep(str_c('+_', x,'$'), names(emp_yymm))]
  LFF <-append(LFF, as.list(LF))
  names(LFF)[x]<- str_c('mon_',x)
}
And it failed.
You should really start with some reproducible example, but below is the solution I'm giving assuming some things about your data.
name <- paste0("A",rep(c("2017","2018","2019","2020"), each = 12), "_",rep(1:12, 4))
L <- vector("list",48)
names(L) <- name
Final <- vector("list",0)
for(x in 1:12){
  indx <- grep(pattern = paste0("_",x,"$"), x = names(L)) # find index of names ending in _x
  ord <- order(names(L)[indx]) # correct order
  tmp <- L[indx[ord]] # make list subset
  Final <- c(Final,tmp) # build list
}
I'm assuming that the element names of the list are the Year_month values and you are just trying to reorder the list. If you are instead trying to make one separate list per month (12 in total), you can dynamically name variables with the assign function...
Building a list as I have isn't always best. It is usually better to specify the final length first.
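library(stringr)
for (x in 1:12){
  # match names ending in _x; the leading '+' in the original pattern has nothing to repeat
  LF <- emp_yymm[grep(str_c('_', x, '$'), names(emp_yymm))]
  if(x<10){
    assign(str_c('mon_0',x),LF)
  }
  else{
    assign(str_c('mon_',x),LF)
  }
}
monlist<-mget(ls(pattern="mon_\\d+"))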
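Another way to get one list per month without a loop or assign is split(). The sketch below assumes, as above, that the list names end in _<month>; the placeholder contents and the names month and by_month are made up for illustration:

```r
name <- paste0("A", rep(2017:2020, each = 12), "_", rep(1:12, 4))
L <- setNames(as.list(seq_along(name)), name)   # placeholder contents

# month number taken from everything after the last underscore
month <- as.integer(sub(".*_", "", names(L)))

# one sub-list per month, elements kept in their original (year) order
by_month <- split(L, month)
names(by_month) <- sprintf("mon_%02d", as.integer(names(by_month)))

names(by_month[["mon_01"]])
# "A2017_1" "A2018_1" "A2019_1" "A2020_1"
```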

Assigning a variable to pasted name of column in R

I have a few data frames with the names:
Meanplots1,
Meanplots2,
Meanplots3 etc.
I am trying to write a for loop to do a series of equations on each data frame.
I am attempting to use the paste0 function.
What I want to happen is for x to be a column of each data set. So the code should work like this line:
x <- Meanplots1$PAR
However, since I want to put this in a for loop I want to format it like this:
for (i in 1:3){
  x <- paste0("Meanplots",i,"$PAR")
  Dmodel <- nls(y ~ ((a*x)/(b + x )) - c, data = dat, start = list(a=a,b=b,c=c))
}
What this does is assign the character string "Meanplots1$PAR" to x, not the actual column. Any idea how to fix this?
We can collect all the data frames into a list with mget
lst1 <- mget(ls(pattern = '^Meanplots\\d+$'))
then loop over the list with lapply and apply the model
DmodelLst <- lapply(lst1, function(dat) nls(y ~ ((a* PAR)/(b + PAR )) - c,
data = dat, start = list(a=a,b=b,c=c)))
Replace 'x' with the column name 'PAR'.
In the OP's loop, preallocate an empty list to store the output ('Outlst'), use get to fetch the object whose name paste0 builds, then apply the formula with the unquoted column name, i.e. 'PAR'
Outlst <- vector("list", 3)
# newdata must use the predictor name from the formula (PAR)
ndat <- data.frame(PAR = seq(0,2000,100))
for(i in 1:3) {
  dat <- get(paste0("Meanplots", i))
  modeltmp <- nls(y ~ ((a*PAR)/(b + PAR )) - c,
                  data = dat, start = list(a=a,b=b,c=c))
  MD <- data.frame(predict(modeltmp, newdata = ndat))
  MD[,2] <- ndat$PAR
  names(MD) <- c("Photo","PARi")
  Outlst[[i]] <- MD
}
Now, we extract the output of each list element
Outlst[[1]]
Outlst[[2]]
instead of creating multiple objects in the global environment
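To see the underlying difference in isolation, without the nls fit: paste0 only ever builds text, while get and mget look up the objects with those names. A small sketch with two made-up data frames (the values are invented, only the names follow the question):

```r
Meanplots1 <- data.frame(PAR = c(0, 100, 200), y = c(1, 2, 3))
Meanplots2 <- data.frame(PAR = c(0, 150, 300), y = c(2, 4, 5))

x_string <- paste0("Meanplots", 1, "$PAR")    # just the text "Meanplots1$PAR"
x_column <- get(paste0("Meanplots", 1))$PAR   # the actual column

# all data frames at once, as a named list
plots <- mget(paste0("Meanplots", 1:2))
par_means <- vapply(plots, function(d) mean(d$PAR), numeric(1))
```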

Passing list element names as a variable to functions within lapply

I have a named list of data and a custom function I want to apply to the data:
#Some example data
d.list <- list(a = c(1,2,3), b = c(4,5,6), c = c(7,8,9))
#A simple function to process some data, write an output graph, and return an output
myfun <- function(data, f.name) {
  y <- list()
  y[1] <- data[1] + data[2] + data[3]
  y[2] <- data[3] - data[1]/ data[2]
  y[3] <- data[2] * data[3] - data[1]
  svg(filename = f.name, width = 7, height = 5, pointsize = 12)
  plot.new()
  plot(data, y)
  dev.off()
  return(y)
}
I now want to iterate this over my list using sapply and get a saved image file for each iteration, with the file name set as the list element name (e.g. a.svg, b.svg, c.svg in the example above), along with the data frame containing results of the calculations. When I run this:
#Iterate over the example data using sapply
res <- t(sapply(d.list, function(x) myfun(data=x,
f.name=paste(names(d.list), ".svg", sep = ""))))
I get the expected data frame:
  [,1]  [,2] [,3]
a    6 2.500    5
b   15 5.200   26
c   24 8.125   65
but I only end up with one file in the target directory: "a.svg"
How can I pass the list element names through correctly as a parameter to the function I'm calling in sapply?
If you need to iterate over two vectors at the same time (both your data and the file names), use mapply (or Map)
res <- t(mapply(myfun, d.list, paste0(names(d.list), ".svg")))
The OP's code looped over each element of 'd.list' but evaluated names(d.list) inside every iteration, so the whole vector c("a", "b", "c") was pasted into the file name each time. Instead, we can loop over the names of 'd.list'. That way we get each individual name, and the matching list element by subsetting.
lapply(names(d.list), function(x) paste0(x, ".svg"))
If we are using the OP's function
lapply(names(d.list), function(x) myfun(data= d.list[[x]],
f.name = paste0(x, ".svg")))
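A quick way to check the pairing without writing any files is to swap in a function that only reports what it was given. In this sketch describe is a hypothetical stand-in for myfun (no plotting), and all_names and res are illustrative names:

```r
d.list <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# hypothetical stand-in for myfun: reports the file name and the sum of the data
describe <- function(data, f.name) paste(f.name, sum(data))

# one file name per list element
all_names <- paste0(names(d.list), ".svg")

# mapply walks d.list and all_names in parallel
res <- mapply(describe, d.list, all_names)
res[["b"]]
# "b.svg 15"
```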
