Loop over a list in R - r

I want to perform an operation on each data frame in a list. Specifically, I want to run the Kolmogorov–Smirnov (KS) test on one column of each data frame. I am using the code below, but it is not working:
PDF_mean <- matrix(nrow = length(siteNumber), ncol = 4)
PDF_mean <- data.frame(PDF_mean)
names(PDF_mean) <- c("station","normal","gamma","gev")
listDF <- mget(ls(pattern="DSF_moments_"))
length(listDF)
i <- 1
for (i in length(listDF)) {
PDF_mean$station[i] <- siteNumber[i]
PDF_mean$normal[i] <- ks.test(list[i]$mean,"pnorm")$p.value
PDF_mean$gev[i] <- ks.test(list[i]$mean,"pgev")$p.value
PDF_mean$gamma[i] <- ks.test(list[i]$mean,"gamma")$p.value
}
Any help?

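The loop should run over seq_along(listDF) (or 1:length(listDF), although seq_along is safer) rather than length(listDF). length(listDF) is a single value, so the loop body runs only once, for the last index. The code also refers to list[i], but the list is named listDF and a single data frame is extracted with listDF[[i]]: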
for (i in seq_along(listDF)) {
  PDF_mean$station[i] <- listDF[[i]]$siteNumber
  PDF_mean$normal[i] <- ks.test(listDF[[i]]$mean, "pnorm")$p.value
  PDF_mean$gev[i] <- ks.test(listDF[[i]]$mean, "pgev")$p.value
  PDF_mean$gamma[i] <- ks.test(listDF[[i]]$mean, "gamma")$p.value
}
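A minimal, self-contained sketch of the same loop on simulated data may help to check the pattern. Everything in it is an assumption made for illustration: the three data frames, their names, and the choice to estimate the distribution parameters from each sample. Note that ks.test expects the name of a cumulative distribution function, so the gamma case is written with "pgamma" (which needs a shape argument) rather than "gamma". The GEV case is left out because pgev is not part of base R and would have to come from an add-on package such as evd.

```r
set.seed(1)

# Hypothetical stand-in for the question's data: three data frames,
# each with a numeric column `mean`.
listDF <- list(
  DSF_moments_A = data.frame(mean = rnorm(50, mean = 10, sd = 2)),
  DSF_moments_B = data.frame(mean = rgamma(50, shape = 4, rate = 0.5)),
  DSF_moments_C = data.frame(mean = rnorm(50, mean = 5, sd = 1))
)

# One row per data frame; the p-value columns are filled in by the loop.
PDF_mean <- data.frame(
  station = names(listDF),
  normal  = NA_real_,
  gamma   = NA_real_
)

for (i in seq_along(listDF)) {
  x <- listDF[[i]]$mean

  # Normal CDF with parameters estimated from the sample (an assumption)
  PDF_mean$normal[i] <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value

  # Gamma CDF with method-of-moments estimates (also an assumption)
  shape <- mean(x)^2 / var(x)
  rate  <- mean(x) / var(x)
  PDF_mean$gamma[i] <- ks.test(x, "pgamma", shape = shape, rate = rate)$p.value
}

PDF_mean
```

Because the parameters are estimated from the same sample that is tested, the p-values tend to be too large, so they are better read as a rough comparison between candidate distributions than as exact tests.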

Related

How to create multiple repeating data frame or matrices in R?

I am trying to make multiple data frames (df_1, ..., df_N) with the same structure.
For now, I have made them all individually, but I imagine there is a more efficient way to write the code.
Below are the matrices I created (only three for now, but there may be more than 100 later on):
quantileMatrix_1 <- matrix(NA, nrow = ncol(outDf_1), ncol = 3)
for (jj in 1:ncol(outDf_1)) {
  quantiles <- outDf_1[, jj] %>% quantile(probs = c(.5, .025, .975))
  quantileMatrix_1[jj, ] <- quantiles
}
quantileMatrix_2 <- matrix(NA, nrow = ncol(outDf_2), ncol = 3)
for (jj in 1:ncol(outDf_2)) {
  quantiles <- outDf_2[, jj] %>% quantile(probs = c(.5, .025, .975))
  quantileMatrix_2[jj, ] <- quantiles
}
quantileMatrix_3 <- matrix(NA, nrow = ncol(outDf_3), ncol = 3)
for (jj in 1:ncol(outDf_3)) {
  quantiles <- outDf_3[, jj] %>% quantile(probs = c(.5, .025, .975))
  quantileMatrix_3[jj, ] <- quantiles
}
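I would use an outer for loop over the data frames and store each resulting matrix in a list:
my_matrix <- list()
for (d in 1:100) {
  # Look up outDf_1, outDf_2, ... by name so each pass uses its own data frame
  outDf_d <- get(paste0("outDf_", d))
  quantileMatrix_d <- matrix(NA, nrow = ncol(outDf_d), ncol = 3)
  for (jj in 1:ncol(outDf_d)) {
    quantiles <- outDf_d[, jj] %>% quantile(probs = c(.5, .025, .975))
    quantileMatrix_d[jj, ] <- quantiles
  }
  my_matrix[[d]] <- quantileMatrix_d
}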
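As an alternative sketch, the same result can be built without explicit loops by keeping the data frames in a list and applying one function to each. The data frames below are simulated stand-ins for outDf_1, outDf_2 and outDf_3, and the names outDf_list, probs and quantileMatrices are made up for this example.

```r
set.seed(42)

# Hypothetical stand-ins for outDf_1, outDf_2, outDf_3
outDf_list <- list(
  outDf_1 = data.frame(a = rnorm(100), b = rnorm(100)),
  outDf_2 = data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100)),
  outDf_3 = data.frame(a = runif(100), b = runif(100))
)

probs <- c(.5, .025, .975)

# One quantile matrix per data frame: rows are columns of the data frame,
# columns are the requested quantiles.
quantileMatrices <- lapply(outDf_list, function(df) {
  # sapply returns one column per variable; transpose so each row is a variable
  t(sapply(df, quantile, probs = probs))
})

quantileMatrices$outDf_2
```

With existing objects named outDf_1, ..., outDf_N in the workspace, the list could be collected with mget(paste0("outDf_", 1:N)) instead of being written out by hand.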

Loop-generated list of data frames not being joined by rbind properly

I have a table with samples of data named Sample_1, Sample_2, etc. I take user input as a string for which samples are wanted (Sample_1,Sample_3,Sample_5). Then after parsing the string, I have a for-loop which I pass each sample name to and the program filters the original dataset for the name and creates a DF with calculations. I then append the DF to a list after each iteration of the loop and at the end, I rbind the list for a complete DF.
sampleloop <- function(samplenames) {
  data <- unlist(strsplit(samplenames, ","))
  temp = list()
  for(inc in 1:length(data)) {
    df <- CT[CT[["Sample_Name"]] == data[inc],]
    ........
    tempdf = goitemp
    temp[inc] <- tempdf
  }
  newdf <- do.call(rbind.data.frame, temp)
}
The inner function on its own produces the correct wanted output. However, with the loop the function produces the following wrong DF if the input is "Sample_3,Sample_9":
I'm wondering if it has something to do with the rbind?
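The issue seems to be the use of [ instead of [[ to assign to the list element. temp[inc] <- tempdf tries to fit a whole data frame (itself a list of columns) into a single slot, so only its first column is kept, whereas temp[[inc]] <- tempdf stores the data frame as one element: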
sampleloop <- function(samplenames) {
  data <- unlist(strsplit(samplenames, ","))
  temp <- vector('list', length(data))
  for(inc in seq_along(data)) {
    df <- CT[CT[["Sample_Name"]] == data[inc],]
    ........
    tempdf <- goitemp
    temp[[inc]] <- tempdf
  }
  newdf <- do.call(rbind.data.frame, temp)
  return(newdf)
}
The difference can be noted with the reproducible example below
lst1 <- vector('list', 5)
lst2 <- vector('list', 5)
for(i in 1:5) {
  lst1[i] <- data.frame(col1 = 1:5, col2 = 6:10)
  lst2[[i]] <- data.frame(col1 = 1:5, col2 = 6:10)
}
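To see what each form of assignment actually stores, the elements can be inspected directly. The following is a small sketch along the same lines, shortened to a single assignment per list; the warning from the [ assignment is silenced only to keep the output short.

```r
lst1 <- vector("list", 2)
lst2 <- vector("list", 2)
df <- data.frame(col1 = 1:5, col2 = 6:10)

# `[` treats the data frame as a list of two columns and tries to fit it
# into one slot: R warns and keeps only the first column.
suppressWarnings(lst1[1] <- df)

# `[[` stores the whole data frame as a single list element.
lst2[[1]] <- df

class(lst1[[1]])  # "integer"
class(lst2[[1]])  # "data.frame"
```

This is why rbind over the first kind of list combines bare column vectors instead of data frames.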

R loop to create data frames with 2 counters

What I want is to create 60 data frames with 500 rows in each. I tried the below code and, while I get no errors, I am not getting the data frames. However, when I do a View on the as.data.frame, I get the view, but no data frame in my environment. I've been trying for three days with various versions of this code:
getDS <- function(x){
  for(i in 1:3){
    for(j in 1:30000){
      ID_i <- data.table(x$ID[j: (j+500)])
    }
  }
  as.data.frame(ID_i)
}
getDS(DATASETNAME)
We can use outer (on a small example)
out1 <- c(outer(1:3, 1:3, Vectorize(function(i, j) list(x$ID[j:(j + 5)]))))
lapply(out1, as.data.table)
--
The issue in the OP's function is that ID_i is overwritten on every pass through the loop, so only the last value survives. To keep all of them, we can initialize a list and store each result in it:
getDS <- function(x) {
  ID_i <- vector('list', 3)
  for(i in 1:3) {
    for(j in 1:3) {
      ID_i[[i]][[j]] <- data.table(x$ID[j:(j + 5)])
    }
  }
  ID_i
}
do.call(c, getDS(x))
data
x <- data.table(ID = 1:50)
I'm not sure the description matches the code, so I'm a little unsure what the desired result is. That said, it is usually not helpful to split a data.table because the built-in by-processing makes it unnecessary. If for some reason you do want to split into a list of data.tables you might consider something along the lines of
getDS <- function(x, n = 5, size = nrow(x)/n, column = "ID", reps = 3) {
  x <- x[1:(n*size), ..column]
  index <- rep(1:n, each = size)
  replicate(reps, split(x, index),
            simplify = FALSE)
}
getDS(data.table(ID = 1:20), n = 5)
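If the goal is the one stated in the question (60 data frames of 500 rows each), a base-R sketch could look like the following. It assumes non-overlapping, consecutive chunks, which may or may not be what the original loop intended, and the data frame x, chunk_size, chunk_id and chunks are made-up names for the example.

```r
# Hypothetical data with 60 * 500 = 30000 rows
x <- data.frame(ID = seq_len(30000))

chunk_size <- 500

# Rows 1-500 get id 1, rows 501-1000 get id 2, and so on
chunk_id <- ceiling(seq_len(nrow(x)) / chunk_size)

# split() on a data frame returns a named list of data frames
chunks <- split(x, chunk_id)

length(chunks)         # number of pieces
sapply(chunks, nrow)   # rows in each piece
```

The pieces stay together in one list (chunks[[1]], chunks[[2]], ...), which is usually easier to work with than 60 separate objects in the environment.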

How to input data into data frame using nested for loop in R

Using the following code, I can print the values iterating each for loop.
for(i in 5:12)
{
  for(j in 5:12)
  {
    for(k in 5:12)
    {
      for(l in 5:12)
      {
        cat(i,j,k,l,'\n')
      }
    }
  }
}
Now I want to store the output in a data frame df with 4 numeric columns (a, b, c, d). The only code I know is the following, but it has just a single for loop in it.
f3 <- function(n){
  df <- data.frame(x = numeric(n), y = numeric(n))
  for(i in 1:n){
    df$x[i] <- i
    df$y[i] <- i
  }
  df
}
How can I put data into a data frame while using nested for loops? Thank you.
You should try expand.grid, which builds every combination of its arguments without any loops:
a <- 5:12
df <- expand.grid(a,a,a,a)
names(df) <- c("a","b","c","d")
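For comparison, here is a sketch of the explicit nested-loop version next to expand.grid. One detail worth knowing is that expand.grid varies its first argument fastest, while the nested loops vary the innermost variable (l, stored as column d) fastest, so the arguments are passed in reverse order here to get the same row order. The names m, df_loop and df_grid are made up for this example.

```r
a <- 5:12
n <- length(a)^4

# Preallocate a matrix and fill one row per iteration of the innermost loop
m <- matrix(NA_integer_, nrow = n, ncol = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))
row <- 0
for (i in a) {
  for (j in a) {
    for (k in a) {
      for (l in a) {
        row <- row + 1
        m[row, ] <- c(i, j, k, l)
      }
    }
  }
}
df_loop <- as.data.frame(m)

# expand.grid varies its first argument fastest, so pass d first
# and then put the columns back in a, b, c, d order.
g <- expand.grid(d = a, c = a, b = a, a = a)
df_grid <- g[, c("a", "b", "c", "d")]

nrow(df_loop)
head(df_loop, 3)
```

Filling a preallocated matrix and converting it once at the end avoids growing or modifying a data frame row by row inside the loops.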

subsetting a list of data frames using a for loop

Why does the last statement ("a <- ...") give me a subset of that data frame within the list, while automating the process with a for loop over all data frames in the list produces all kinds of warnings and not the answer I am looking for?
time <- c(1:20)
temp <- c(2,3,4,5,6,2,3,4,5,6,2,3,4,5,6,2,3,4,5,6)
data <- data.frame(time,temp)
tmp <- c(1,diff(data[[2]]))
tmp2 <- tmp < 0
tmp3 <- cumsum(tmp2)
data1 <- split(data, tmp3)
#this does not work. I want to automate the successful process below through all data frames in the list "data1"
for(i in 1:length(data1)){
  finale[i] <- subset(data1[[i]], data1[[i]][,2] > 3)
}
#this works to give me a part of what I want
a <- subset(data1[[1]], data1[[1]][,2] >3)
You may want to try lapply:
lapply(data1, function(x) subset(x, x[,2]>3))
Same result using a for loop
finale <- vector("list", length(data1))
for(i in 1:length(data1)){
  finale[[i]] <- subset(data1[[i]], data1[[i]][,2] > 3)
}
This works because finale is created before the loop with a type (list) and a length. Your loop failed because you never declared what finale should be.
You are trying to store a data.frame (a 2D object) in an element of a vector (a 1D object). Define finale as a list and assign with [[i]], and the code will work:
time <- c(1:20)
temp <- c(2,3,4,5,6,2,3,4,5,6,2,3,4,5,6,2,3,4,5,6)
data <- data.frame(time,temp)
tmp <- c(1,diff(data[[2]]))
tmp2 <- tmp < 0
tmp3 <- cumsum(tmp2)
data1 <- split(data, tmp3)
# define finale as a list before the loop, then fill it one data frame at a time
finale <- vector(mode='list')
for(i in 1:length(data1)){
  finale[[i]] <- subset(data1[[i]], data1[[i]][,2] > 3) # Use [[i]] instead of [i]
}
To save all in 1 data.frame:
finale <- do.call(rbind, finale)
