I want to exclude/copy rows/columns of multiple dataframes within a list in a list.
The code doesn't work yet. Maybe somebody here knows what to do.
Zelllysate_extr <- list()
#defining the list
Zelllysate_extr$X0809P3_extr <- X0809P3_extr
#defining the list within the list
X0809P3_extr = lapply(Zelllysate_colr[["X0809P3"]], function(x) {
as.data.frame(x) <- Zelllysate_colr[["X0809P3_colr"]][2:1500, 1 & 3:4]
return(x)
})
#defining the list for the dataframes to place in; 2:1500, 1 & 3:4 are the rows and columns to copy
thanks
Instead of iterating over the list elements themselves, iterate over the indices of the list.
X0809P3_extr = lapply(1: length(Zelllysate_colr[["X0809P3"]]), function(x) {
Zelllysate_colr[["X0809P3_colr"]][[x]][2:1500, c(1, 3:4)]
})
Inside lapply you need neither return() nor an assignment: the value of the last expression in the function is what lapply collects for each element.
I'm assuming that Zelllysate_colr[["X0809P3"]] is a list within the list Zelllysate_colr.
If this doesn't work, you'll have to share some of your data. Most of the time the output from dput(head(dataObject)) is enough, but I think you're working with lists of lists, so that might not be enough to see the structure. You can read about how to ask great questions to get great answers quickly.
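For illustration, here is a minimal sketch of the same idea on made-up data. The names outer_list, inner and make_df are hypothetical stand-ins, and the row range is shortened to fit the toy frames. It also shows why the column selection has to be c(1, 3:4): in R, 1 & 3:4 is a logical AND that evaluates to c(TRUE, TRUE), not a set of column numbers.

```r
# Toy stand-in for Zelllysate_colr: a list that holds a list of data frames.
# All names and sizes here are hypothetical.
make_df <- function() data.frame(a = 1:10, b = 11:20, c = 21:30, d = 31:40)
outer_list <- list(inner = list(make_df(), make_df(), make_df()))

# Loop over the positions of the inner list and subset each data frame:
# rows 2 to 5, columns 1, 3 and 4.
extracted <- lapply(seq_along(outer_list[["inner"]]), function(i) {
  outer_list[["inner"]][[i]][2:5, c(1, 3:4)]
})

# Iterating over the elements directly gives the same result here.
extracted2 <- lapply(outer_list[["inner"]], function(df) df[2:5, c(1, 3:4)])

# `&` is a logical operator, so this is not a column index.
wrong_index <- 1 & 3:4
```

seq_along() is used instead of 1:length(...) because it returns an empty sequence for an empty list rather than c(1, 0).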
Related
I know this topic has appeared on SO a few times, but the examples were often more complicated, and I would like an answer (or a set of possible solutions) to this simple situation. I am still wrapping my head around R and programming in general. Here I want to apply lapply or a simple loop to data, which is a list of three lists of vectors.
data1 <- list(rnorm(100),rnorm(100),rnorm(100))
data2 <- list(rnorm(100),rnorm(100),rnorm(100))
data3 <- list(rnorm(100),rnorm(100),rnorm(100))
data <- list(data1,data2,data3)
Now, I want to obtain the list of means for each vector. The result would be a list of three elements (lists).
I only know how to obtain the list of outcomes for a single list of vectors, either with a loop:
for (i in 1:length(data1)){
means <- lapply(data1,mean)
}
or by:
lapply(data1,mean)
and I know how to get all the means using rapply:
rapply(data,mean)
The problem is that rapply does not maintain the list structure.
Help and possibly some tips/explanations would be much appreciated.
We can loop through the list of list with a nested lapply/sapply
lapply(data, sapply, mean)
It is otherwise written as
lapply(data, function(x) sapply(x, mean))
Or if you need the output with the list structure, a nested lapply can be used
lapply(data, lapply, mean)
Or with rapply, where the how argument controls the kind of output we get.
rapply(data, mean, how='list')
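To see how these variants differ, here is a small sketch that rebuilds data as in the question and compares the shapes of the results. The seed and the names vec_means, list_means and rap_means are only for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
data1 <- list(rnorm(100), rnorm(100), rnorm(100))
data2 <- list(rnorm(100), rnorm(100), rnorm(100))
data3 <- list(rnorm(100), rnorm(100), rnorm(100))
data <- list(data1, data2, data3)

vec_means  <- lapply(data, sapply, mean)         # list of 3 numeric vectors
list_means <- lapply(data, lapply, mean)         # list of 3 lists of single numbers
rap_means  <- rapply(data, mean, how = "list")   # same nesting as list_means

# Inspect the top level of each result.
str(vec_means, max.level = 1)
str(list_means, max.level = 1)
```

The nested lapply and rapply with how = "list" both keep the original nesting, while the sapply variant simplifies each inner list to a numeric vector.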
If we are using a for loop, we may need to create an object to store the results.
res <- vector('list', length(data))
for(i in seq_along(data)){
for(j in seq_along(data[[i]])){
res[[i]][[j]] <- mean(data[[i]][[j]])
}
}
I have 5 lists called "nightclub", "hospital", "bar", "attraction", "social_facility", all of which contain a data frame called osm_points. I want to create a new list of 5 data frames, named after the original lists, that contain only the 3 columns "osm_id", "name", "addr.postcode", with no NA values in "addr.postcode". Below is my attempted code. I do not know another way to subset lists without $ (which gives me an error about an atomic vector) or without the square brackets. Let me know if you have some advice.
vectors <- c("osm_id","name","addr.postcode")
features <- c("nightclub", "hospital", "bar", "attraction", "social_facility")
datasets <- list()
n <- 0
for (i in features){
n <- n + 1
datasets[[n]] <- paste(i)[["osm_points"]][!is.na(paste(i)[["osm_points"]][["addr.postcode"]]), variables]
}
I managed to do this operation without a for loop (below), but I want to be able to code better and do it all in one operation. Thanks so much for your help.
nightclub1 <- nightclub$osm_points[!is.na(nightclub$osm_points$addr.postcode), variables]
Thanks !!
Try this code using lapply :
result <- lapply(mget(features), function(x)
x$osm_points[!is.na(x$osm_points$addr.postcode), vectors])
result should be a list of 5 data frames, one for each entry in features, containing only the columns in vectors and no NA values in addr.postcode.
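As a rough check of this approach, the sketch below runs the same lapply over two made-up lists. The contents of nightclub and bar, including the extra column other, are hypothetical stand-ins for the real OSM data.

```r
# Hypothetical stand-ins for two of the lists; the real ones hold OSM data.
nightclub <- list(osm_points = data.frame(
  osm_id = c("1", "2", "3"),
  name = c("A", "B", "C"),
  addr.postcode = c("10115", NA, "10117"),
  other = 1:3
))
bar <- list(osm_points = data.frame(
  osm_id = c("4", "5"),
  name = c("D", "E"),
  addr.postcode = c(NA, "20095"),
  other = 1:2
))

vectors  <- c("osm_id", "name", "addr.postcode")
features <- c("nightclub", "bar")

# mget() looks the objects up by name and returns them as a named list.
result <- lapply(mget(features), function(x) {
  pts <- x$osm_points
  pts[!is.na(pts$addr.postcode), vectors]
})

names(result)
```

Because mget returns a named list, the elements of result carry the names of the original lists.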
You need to put your lists into a list to have something to iterate over.
As you have it, features is a character vector. In your first iteration i is "nightclub", a string. paste(i) changes nothing, it is still "nightclub". So your code becomes "nightclub"[["osm_points"]]..., but you need nightclub[["osm_points"]].
## make a list of lists
list_of_lists <- mget(features)
## then you can do
for(i in seq_along(list_of_lists)) {
datasets[[i]] <- list_of_lists[[i]][["osm_points"]]...
}
Substitute list_of_lists[[i]] wherever you currently have paste(i).
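A small sketch of both points, on a made-up nightclub list (the data and the loop variable k are assumptions for illustration): indexing the string fails, while indexing the object found by mget works. The filter inside the loop is the one from the question.

```r
# Hypothetical toy list standing in for the real data.
nightclub <- list(osm_points = data.frame(
  osm_id = c("1", "2"),
  name = c("A", "B"),
  addr.postcode = c("10115", NA)
))
vectors  <- c("osm_id", "name", "addr.postcode")
features <- c("nightclub")

i <- features[1]
is_string <- is.character(paste(i))  # still just the string "nightclub"

# Indexing the string by name fails; capture the error instead of stopping.
err <- tryCatch(paste(i)[["osm_points"]], error = function(e) e)

# Looking the objects up by name gives a named list to iterate over.
list_of_lists <- mget(features)
datasets <- list()
for (k in seq_along(list_of_lists)) {
  pts <- list_of_lists[[k]][["osm_points"]]
  datasets[[k]] <- pts[!is.na(pts[["addr.postcode"]]), vectors]
}
names(datasets) <- names(list_of_lists)
```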
I have a list of 185 data frames called WaFramesNumeric. Each dataframe has several hundred columns and thousands of rows. I want to edit every data frame, so that it leaves all numeric columns as well as any non-numeric columns that I specify.
Using:
for(i in seq_along(WaFramesNumeric)) {
WaFramesNumeric[[i]] <- WaFramesNumeric[[i]][,sapply(WaFramesNumeric[[i]],is.numeric)]
}
successfully makes each dataframe contain only its numeric columns.
I've tried to amend this with lines to add specific columns. I have tried:
for (i in seq_along(WaFramesNumeric)) {
a <- WaFramesNumeric[[i]]$Device_Name
WaFramesNumeric[[i]] <- WaFramesNumeric[[i]][,sapply(WaFramesNumeric[[i]],is.numeric)]
cbind(WaFramesNumeric[[i]],a)
}
and in an attempt to call the column numbers of all integer columns as well as the specific ones and then combine based on that:
for (i in seq_along(WaFramesNumeric)) {
f <- which(sapply(WaFramesNumeric[[i]],is.numeric))
m <- match("Cost_Center",colnames(WaFramesNumeric[[i]]))
n <- match("Device_Name",colnames(WaFramesNumeric[[i]]))
combine <- c(f,m,n)
WaFramesNumeric[[i]][,i,combine]
}
These all return errors and I am stumped as to how I could do this. WaFramesNumeric is a copy of another list of dataframes (WaFramesNumeric <- WaFramesAll) and so I also tried adding the specific columns from the WaFramesAll but this was not successful.
I appreciate any advice you can give and I apologize if any of this is unclear.
You are mistakenly assuming that the value of the last command in a for loop is kept. It is not: since you never assign the result of cbind (or of the indexing of WaFramesNumeric[[i]]) to anything, it is silently discarded.
Additionally, you are over-indexing your data.frame in the third code block. First, it's using i within the data.frame, even though i is an index within the list of data.frames, not the frame itself. Second (perhaps caused by this), you are trying to index three dimensions of a 2D frame. Just change the last indexing from [,i,combine] to either [,combine] or [combine].
Third problem (though perhaps not seen yet) is that match will return NA if nothing is found. Indexing a frame with an NA returns an error (try mtcars[,NA] to see). I suggest that you can replace match with grep: it returns integer(0) when nothing is found, which is what you want in this case.
for (i in seq_along(WaFramesNumeric)) {
f <- which(sapply(WaFramesNumeric[[i]], is.numeric))
m <- grep("Cost_Center", colnames(WaFramesNumeric[[i]]))
n <- grep("Device_Name", colnames(WaFramesNumeric[[i]]))
combine <- c(f,m,n)
WaFramesNumeric[[i]] <- WaFramesNumeric[[i]][combine]
}
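The difference between match and grep can be seen on a plain character vector. The column names below are made up. One caveat with grep: it treats the pattern as a regular expression and matches substrings, so anchoring the pattern (or comparing with ==) may be safer when column names share a prefix.

```r
cols <- c("Device_Name", "value", "Device_Name_Old")  # hypothetical column names

no_match_m <- match("Cost_Center", cols)  # NA: no such column
no_match_g <- grep("Cost_Center", cols)   # integer(0): adds nothing to the index

# grep matches substrings, so both Device_Name columns are found.
loose  <- grep("Device_Name", cols)       # 1 3
strict <- grep("^Device_Name$", cols)     # 1
exact  <- which(cols == "Device_Name")    # 1, and integer(0) when absent

# Combining with integer(0) leaves the other indices unchanged.
combine <- c(2L, no_match_g, strict)      # 2 1
```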
I'm not sure what you mean by "an attempt to call the column numbers of all integer columns...", but if you want to go through a list of data frames, select the columns that satisfy some function, and also keep columns given by name, you can do it like this:
df <- data.frame(a=rnorm(20), b=rnorm(20), c=letters[1:20], d=letters[1:20], stringsAsFactors = FALSE)
WaFramesNumeric <- rep(list(df), 2)
Selector <- function(data, select_func, select_names) {
select_func <- match.fun(select_func)
idx_names <- match(select_names, colnames(data))
idx_names <- idx_names[!is.na(idx_names)]
idx_func <- which(sapply(data, select_func))
idx <- unique(c(idx_func, idx_names))
return(data[, idx, drop = FALSE])  # drop = FALSE keeps a data frame even when only one column is selected
}
res <- lapply(X = WaFramesNumeric, FUN = Selector, select_names=c("c"), select_func = is.numeric)
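A shorter variant of the same idea, offered as an alternative sketch rather than as the answer's method, builds one logical mask per data frame. The names frames and keep_names are hypothetical. Names that do not exist in a frame are simply ignored by %in%, and the original column order is preserved.

```r
df <- data.frame(a = rnorm(20), b = rnorm(20),
                 c = letters[1:20], d = letters[1:20],
                 stringsAsFactors = FALSE)
frames <- rep(list(df), 2)

keep_names <- c("c", "not_a_column")  # the second name is deliberately absent

res <- lapply(frames, function(x) {
  # TRUE for numeric columns and for columns requested by name
  keep <- sapply(x, is.numeric) | names(x) %in% keep_names
  x[keep]  # single-bracket list-style indexing always returns a data frame
})

names(res[[1]])  # "a" "b" "c"
```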
I have a set of data.frame objects in nested lists, and I want to group them by the name of each data.frame object. Because the data.frame objects are placed in a different order in each nested list, I have difficulty grouping them into a new list. I tried the transpose method from the purrr package on CRAN, but it did not give the result I expected. Does anyone know a trick for doing this sort of grouping more efficiently? Thanks a lot.
example:
res_1 <- list(con=list(a.con_1=airquality[1:4,], b.con_1=iris[2:5,], c.con_1=ChickWeight[3:7,]),
dis=list(a.dis_1=airquality[5:7,], b.dis_1=iris[8:11,], c.dis_1=ChickWeight[12:17,]))
res_2 <- list(con=list(b.con_2=iris[7:11,], a.con_2=airquality[4:9,], c.con_2=ChickWeight[2:8,]),
dis=list(b.dis_2=iris[2:5,], a.dis_2=airquality[1:3,], c.dis_2=ChickWeight[12:15,]))
res_3 <- list(con=list(c.con_3=ChickWeight[10:15,], a.con_3=airquality[2:9,], b.con_3=iris[12:19,]),
dis=list(c.dis_3=ChickWeight[2:7,], a.dis_3=airquality[13:16,], b.dis_3=iris[2:7,]))
desired output:
group1_New <- list(con=list(a.con_1, a.con_2, a.con_3),
dis=list(a.dis_1, a.dis_2, a.dis_3))
group2_New <- list(con=list(b.con_1, b.con_2, b.con_3),
dis=list(b.dis_1, b.dis_2, b.dis_3))
group3_New <- list(con=list(c.con_1, c.con_2, c.con_3),
dis=list(c.dis_1, c.dis_2, c.dis_3))
Here is a twice nested for loop that creates the desired structure. There is likely a more efficient method.
# put the nested lists into a list:
myList <- list(res_1, res_2, res_3)
# make a copy of the list to preserve the structure for the new list
myList2 <- myList
for(i in seq_len(length(myList))) {
# get ordering of inner list names
myOrder <- rank(names(myList[[c(i,2)]]))
for(j in seq_len(length(myList[[i]]))) {
for(k in seq_len(length(myList[[c(i, j)]]))) {
# reorder content
myList2[[c(myOrder[k], j, i)]] <- myList[[c(i, j, k)]]
# rename element
names(myList2[[c(myOrder[k], j)]])[i] <- names(myList[[c(i, j)]])[k]
}
}
}
If desired, you could extract the list items after the loops.
The key to this solution is the realization that if you put these lists into a list, the result can be achieved by selectively reversing the indices of the list items. By selectively, I mean that I incorporate rank on the data.frame names to find the proper order for the inner-most loop.
In addition to reordering the data.frames as desired, I included a line to properly reset the names within the list.
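For comparison, here is a loop-free sketch of the same regrouping. It is an alternative, not the method above, and it relies on one assumption: sorting the element names alphabetically lines up the a.*, b.* and c.* data frames in every inner list, as it does for the names in the question. The names all_res, sorted, regroup and groups are hypothetical.

```r
res_1 <- list(con = list(a.con_1 = airquality[1:4, ], b.con_1 = iris[2:5, ], c.con_1 = ChickWeight[3:7, ]),
              dis = list(a.dis_1 = airquality[5:7, ], b.dis_1 = iris[8:11, ], c.dis_1 = ChickWeight[12:17, ]))
res_2 <- list(con = list(b.con_2 = iris[7:11, ], a.con_2 = airquality[4:9, ], c.con_2 = ChickWeight[2:8, ]),
              dis = list(b.dis_2 = iris[2:5, ], a.dis_2 = airquality[1:3, ], c.dis_2 = ChickWeight[12:15, ]))
res_3 <- list(con = list(c.con_3 = ChickWeight[10:15, ], a.con_3 = airquality[2:9, ], b.con_3 = iris[12:19, ]),
              dis = list(c.dis_3 = ChickWeight[2:7, ], a.dis_3 = airquality[13:16, ], b.dis_3 = iris[2:7, ]))
all_res <- list(res_1, res_2, res_3)

# Sort every inner list by element name so that a.*, b.*, c.* line up.
sorted <- lapply(all_res, function(r) lapply(r, function(x) x[order(names(x))]))

# For position g (1 = a, 2 = b, 3 = c), collect that element from each res_*.
regroup <- function(g) {
  lapply(c(con = "con", dis = "dis"), function(part) {
    # single brackets keep the element name; c() joins the one-element lists
    picked <- lapply(sorted, function(r) r[[part]][g])
    do.call(c, picked)
  })
}
groups <- lapply(1:3, regroup)

names(groups[[1]]$con)
```

groups[[1]], groups[[2]] and groups[[3]] then correspond to group1_New, group2_New and group3_New in the question.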
I'm a bit new to R and this site has been an amazing help to me in answering a lot of questions. However, I’ve come across a problem, have exhausted all options to find a solution on my own, and am in need of some help.
I am trying to write a code where I create multiple data frames (or matrices) INSIDE the loop and loop it 5000 times. On each loop I would like the variable to change so I can retrieve the data for each loop at a later point.
Also, I would like to be able to repeat this method for other data frames and in creating these new data frames, it draws upon other data frames based on the iteration it is on.
I have tried to find a solution to this and it seems that it could be either the for loop or apply function, but I am not sure as to how I could execute it. As an example of what I would like to see:
for (i in 1:10) {
df.a[i] <- data.frame (…information...)
df.b[i] <- data.frame (...information...)
df.c[i] <- data.frame (new.col.A=df.a[i]$column1, new.col.B=df.b[i]$column2)
}
Then, after having run the loop, if I were to write df.c3 I would find the data frame created in the loop on the third iteration which has data from iteration 3 in df.a and df.b.
The ‘closest’ I have come to getting what I thought I needed was by doing this:
df.a = seq (1, 10, by=1)
df.b = seq (1,10, by=1)
df.c = seq (1,10, by=1)
for (i in 1:10) {
df.a[[i]] <- data.frame (...information)
...
}
But this typically results in an error of: "number of items to replace is not a multiple of replacement length".
So I'm not sure what else I could do and really hope someone is able to help out.
Create objects df.x as empty lists:
df.a <- list()
df.b <- list()
df.c <- list()
Then access (and write to) individual data frames using double square brackets:
for (i in 1:10) {
df.a[[i]] <- data.frame(...)
df.b[[i]] <- data.frame(...)
df.c[[i]] <- data.frame(new.col.A=df.a[[i]]$column1, new.col.B=df.b[[i]]$column2)
}
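Here is a runnable version of the pattern, with made-up contents in place of the elided data.frame(...) calls. The column names column1 and column2 come from the question; the values and the variable n are assumptions. The lists are preallocated with vector("list", n), which is optional, and starting from list() as above works as well.

```r
n <- 10
df.a <- vector("list", n)  # preallocated lists of length n
df.b <- vector("list", n)
df.c <- vector("list", n)

for (i in seq_len(n)) {
  # placeholder contents, made up for illustration
  df.a[[i]] <- data.frame(column1 = i * 1:3)
  df.b[[i]] <- data.frame(column2 = i + 1:3)
  # combine the data frames from the same iteration
  df.c[[i]] <- data.frame(new.col.A = df.a[[i]]$column1,
                          new.col.B = df.b[[i]]$column2)
}

# The data frame built in the third iteration:
df.c[[3]]
```

The result of any iteration is then retrieved by position, so df.c[[3]] takes the place of the df.c3 mentioned in the question.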