I have a list of tables, list_table_Tanzania, which I want to convert into a list of matrices, list_matrix_Tanzania. I tried to run this for loop, but I got only one output.
for (i in 1:length(list_table_Tanzania)) {
list_matrix_Tanzania<-as.matrix(list_table_Tanzania[[i]], rownames.force = NA)
}
Does somebody know how I could get as many matrices as there are tables in my list?
You are overwriting the value of list_matrix_Tanzania in each iteration.
Try with lapply
list_matrix_Tanzania <- lapply(list_table_Tanzania, as.matrix, rownames.force = NA)
Doing it with a for loop is also possible: you first have to initialise your list with list_matrix_Tanzania <- vector("list", length(list_table_Tanzania)) and then, inside the for loop, assign the result of the ith iteration to list_matrix_Tanzania[[i]].
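A minimal sketch of that for loop version follows. The two small data frames are hypothetical stand-ins for the tables in list_table_Tanzania, which the question does not show; the assumption is that the real elements behave like data frames.

```r
# Hypothetical stand-in for list_table_Tanzania (the real data is not shown)
list_table_Tanzania <- list(
  data.frame(x = 1:3, y = 4:6),
  data.frame(x = 7:9, y = 10:12)
)

# Preallocate the output list so every iteration has its own slot
list_matrix_Tanzania <- vector("list", length(list_table_Tanzania))

for (i in seq_along(list_table_Tanzania)) {
  # Assign into element i instead of overwriting the whole object
  list_matrix_Tanzania[[i]] <- as.matrix(list_table_Tanzania[[i]], rownames.force = NA)
}

# One matrix per input table
length(list_matrix_Tanzania)
```

Using seq_along() rather than 1:length() also keeps the loop from running when the input list is empty.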
Edit: To remove some columns from the lapply result, you can modify it to something like
list_matrix_Tanzania <- lapply(list_table_Tanzania, function(d) {
result <- as.matrix(d, rownames.force = NA)
v <- 1:2 # Replace this with the indices of the columns you wish to remove
result[, -v]
})
Like the title says, I wish to use lapply instead of a for loop to parse data from a data frame and put it into an empty data frame. My motivation is that the data frame I'm parsing contains thousands of genes and I've read that the apply functions are faster at iterating through large tables.
### My data table ###
rawCounts <- data.frame(ensembl_gene_id_version = c('ENSG00000000003.15', 'ENSG00000000005.6', 'ENSG00000000419.14'),
HS1 = c(1133, 0, 1392),
HS2 = c(900, 0, 1155),
HS3 = c(1251, 0, 2011),
HS4 = c(785, 0, 1022),
stringsAsFactors = FALSE)
## Function
extract_counts <- function(df, esdbid){
counts <- data.frame()
plyr::ldply(esdbid, function(i) {counts <- df[grep(pattern = i, x = df),] %>% rbind()})
return(counts)
}
## Call the first one
extract_counts(df = rawCounts, esdbid = c('ENSG00000000003.15'))
I want this to return a data frame, so I used the plyr::ldply function from this post - Extracting outputs from lapply to a dataframe
However, it isn't returning anything. Eventually I want to scale up my esdbid vector to include multiple values; such as any combination of gene IDs to quickly retrieve the gene counts.
Strangely, when I run this in the console it appears to work as intended for a vector of length 1, i.e.;
esdbid <- 'ENSG00000000003.15'
plyr::ldply(esdbid, function(i) {counts <- rawCounts[grep(pattern = i, x = rawCounts),] %>% rbind()})
Returns a data frame with the correct values. However, when I increase the length of the vector it returns only the first value for each row. For example if esdbid <- c('ENSG00000000003.15', 'ENSG00000000005.6', 'ENSG00000000419.14') then the console code will return the values for ENSG00000000003.15 three times.
Maybe subset can handle this more effectively?
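extract_counts <- function(.data, esdbid) {
  # Match against the ID column only: grepl() on a whole data frame
  # coerces it to one string per column, not one per row.
  # fixed = TRUE treats the "." in the ID literally instead of as a regex wildcard.
  subset(.data, grepl(esdbid, .data$ensembl_gene_id_version, fixed = TRUE))
}
esdbid <- "ENSG00000000003.15"
rawCounts |> extract_counts(esdbid)
Then you can use lapply if you want a list with all data frame subsets:
lapply(
  unique(rawCounts$ensembl_gene_id_version),
  function(id) { rawCounts |> extract_counts(id) }
)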
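Since the question asks for several gene IDs at once and the IDs are complete strings rather than patterns, exact matching with %in% may be simpler than one grepl call per ID. This is a sketch of that alternative, not the approach used above; extract_counts_exact is a hypothetical helper name, and the data is a shortened copy of rawCounts from the question.

```r
# Shortened copy of the question's data (only two count columns kept)
rawCounts <- data.frame(
  ensembl_gene_id_version = c("ENSG00000000003.15", "ENSG00000000005.6", "ENSG00000000419.14"),
  HS1 = c(1133, 0, 1392),
  HS2 = c(900, 0, 1155),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose ID is exactly one of `ids`
extract_counts_exact <- function(.data, ids) {
  .data[.data$ensembl_gene_id_version %in% ids, , drop = FALSE]
}

wanted <- c("ENSG00000000003.15", "ENSG00000000419.14")
counts <- extract_counts_exact(rawCounts, wanted)
counts
```

Rows come back in the order they appear in the data frame, not in the order of the requested IDs.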
I have a list of dataframes that I would like to multiply by the corresponding elements of a vector.
The first dataframe in the list would be multiplied by the first element of the vector, and so on, producing another list of dataframes holding the multiplied values.
I tried to do this with a loop, but was unsuccessful. I also tried to imagine something using map or lapply, but I couldn't.
for(i in vec){
for(j in listdf){
listdf2 <- i*listdf[[j]]
}
}
Error in listdf[[j]] : invalid subscript type 'list'
Any idea how to solve this?
*Vector and the List of Dataframes have the same length.
Use Map :
listdf2 <- Map(`*`, listdf, vec)
In purrr, this can be done using map2:
listdf2 <- purrr::map2(listdf, vec, `*`)
If you are interested in a for loop solution, you need just one loop:
listdf2 <- vector('list', length(listdf))
for (i in seq_along(vec)) {
listdf2[[i]] <- listdf[[i]] * vec[i]
}
data
vec <- c(4, 3, 5)
df <- data.frame(a = 1:5, b = 3:7)
listdf <- list(df, df, df)
I have a list of multiple matrices. I can transform an item of this list into a dataframe using this code:
as.data.frame(list_of_matrices[i])
But how can I do the same in an automatic way for all indexes (i)?
I tried:
a <- data.frame()
for(i in 1:length(list_of_matrices)){
dataframes[i] <- as.data.frame(list_of_matrices[i])
}
but it didn't work:
Error in `[[<-.data.frame`(`*tmp*`, i, value = list(X1 = 1:102, X2 = c(2L, :
replacement has 102 rows, data has 0
In the OP's code, we need [[ instead of [, because [ returns a list of length 1 rather than the matrix itself:
for(i in seq_along(list_of_matrices)){
list_of_matrices[[i]] <- as.data.frame(list_of_matrices[[i]])
}
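To illustrate the difference between [ and [[, here is a small sketch. The two matrices are made-up example data, and the results are written to a separate, preallocated list named dataframes (the name used in the question's loop) instead of overwriting the input.

```r
# Made-up example data standing in for list_of_matrices
list_of_matrices <- list(matrix(1:6, nrow = 3), matrix(7:12, nrow = 3))

# `[` keeps the list wrapper, `[[` extracts the element itself
is.list(list_of_matrices[1])      # a list of length 1
is.matrix(list_of_matrices[[1]])  # the matrix stored in that list

# Preallocate a separate output list, then fill it slot by slot
dataframes <- vector("list", length(list_of_matrices))
for (i in seq_along(list_of_matrices)) {
  dataframes[[i]] <- as.data.frame(list_of_matrices[[i]])
}
```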
If we need multiple objects in the global env (not recommended), either assign or list2env should work. After naming the list with custom names or letters (a, b, c, ...), use list2env
names(list_of_matrices) <- letters[seq_along(list_of_matrices)]
list2env(list_of_matrices, .GlobalEnv)
Now, we check for
head(a)
head(b)
Another option is assign within the loop itself
for(i in seq_along(list_of_matrices)) {
assign(letters[i], as.data.frame(list_of_matrices[[i]]))
}
head(a)
head(b)
NOTE: We assume that the length of list_of_matrices is at most 26; otherwise, the names have to come from something other than the built-in letters.
Try this:
# Example list of matrices
mat_list <- list(
matrix(runif(20), 4, 5),
matrix(runif(20), 4, 5)
)
# Convert to list of df
df_list <- lapply(mat_list, as.data.frame)
I need to apply a list of indices to a list of dataframes with a one-to-one mapping: the first element of the list of indices applies to the first dataframe only, and so on. The indices refer to rows in the dataframes.
A list of complementary dataframes also needs to be created by selecting the rows not mentioned in the indices list.
Here is some sample data:
set.seed(1)
A <- data.frame(matrix(rnorm(40,0,1), nrow = 10))
B <- data.frame(matrix(rnorm(40,2,3), nrow = 10))
C <- data.frame(matrix(rnorm(40,3,4), nrow = 10))
dflis <- list(A,B,C)
# Create a sample row index
ix <- lapply(lapply(dflis,nrow), sample, size = 6)
So far I have managed this working but ugly-looking code:
dflis.train <- lapply(seq_along(dflis), function(x) dflis[[x]][ix[[x]],])
dflis.test <- lapply(seq_along(dflis), function(x) dflis[[x]][-ix[[x]],])
Can someone suggest something better, more elegant?
Use Map/mapply instead of the univariate lapply, so that you can iterate over both objects and apply a function, like:
Map(function(d,r) d[r,], dflis, ix)
Or if you want to be fancy:
Map(`[`, dflis, ix, TRUE)
This matches the result of your original code:
identical(
Map(function(d,r) d[r,], dflis, ix),
lapply(seq_along(dflis), function(x) dflis[[x]][ix[[x]],])
)
#[1] TRUE
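The complementary (test) list from the question can be built the same way with negative indices. This is a minimal sketch on a smaller, made-up version of the sample data; drop = FALSE is added so the result stays a data frame even if only one column is present.

```r
set.seed(1)
# Smaller made-up version of dflis from the question
dflis <- list(
  data.frame(matrix(rnorm(40), nrow = 10)),
  data.frame(matrix(rnorm(40), nrow = 10))
)
# Six sampled row indices per data frame
ix <- lapply(dflis, function(d) sample(nrow(d), size = 6))

# Rows listed in ix go to train, all remaining rows go to test
dflis.train <- Map(function(d, r) d[r, , drop = FALSE], dflis, ix)
dflis.test  <- Map(function(d, r) d[-r, , drop = FALSE], dflis, ix)
```

Note that negative indexing assumes each index vector is non-empty; with an empty vector, d[-r, ] selects no rows instead of all of them.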
I have a nested list (FCdat.pp.con[[x]][[y]]) in which separate columns hold the values of a matrix (for each x and nested y). Now I want to build the matrices by writing to dataframes.
However, I don't succeed in giving the matrices the correct name in a double for loop. At the moment, I only succeeded in making matrices with information about "x" but not with information about "y". Please help me out or suggest another method to do this.
for (x in 1:36){
for (y in 1:5){
cells <- c(0,
FCdat.pp.con[[x]][[y]]$A12,
FCdat.pp.con[[x]][[y]]$A13,
FCdat.pp.con[[x]][[y]]$A14,
FCdat.pp.con[[x]][[y]]$A21,0,
FCdat.pp.con[[x]][[y]]$A23,
FCdat.pp.con[[x]][[y]]$A24,
FCdat.pp.con[[x]][[y]]$A31,
FCdat.pp.con[[x]][[y]]$A32,
0,
FCdat.pp.con[[x]][[y]]$A34,
FCdat.pp.con[[x]][[y]]$A41,
FCdat.pp.con[[x]][[y]]$A42,
FCdat.pp.con[[x]][[y]]$A43,
0)
rnames <- c("ALG1","ALG2","ALG3","ALG4")
cnames <- c("ALG1","ALG2","ALG3","ALG4")
Ind.matr <- t(matrix(cells, nrow=4, ncol = 4))
dimnames(Ind.matr)=list(rnames, cnames)
assign(paste0("Indmatr", x), Ind.matr)
}
}
Add this before the loop begins: result <- lapply(1:36, function(.)vector("list",5))
Then replace assign(paste0("Indmatr", x), Ind.matr) with result[[x]][[y]] <- Ind.matr.
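Put together, the nested-list approach could look like the sketch below. FCdat.pp.con is not available here, so the cell values are placeholders and the dimensions are shrunk to 2 x 3 instead of 36 x 5; n_x, n_y and alg are names introduced only for this illustration. matrix(..., byrow = TRUE) is used in place of t(matrix(...)), which fills the matrix in the same order.

```r
# Shrunk, hypothetical dimensions (the question uses 36 and 5)
n_x <- 2
n_y <- 3

# One list of n_y empty slots for every x
result <- lapply(seq_len(n_x), function(.) vector("list", n_y))

alg <- paste0("ALG", 1:4)

for (x in seq_len(n_x)) {
  for (y in seq_len(n_y)) {
    # Placeholder values; in the question these come from FCdat.pp.con[[x]][[y]]
    cells <- seq_len(16) * x + y
    Ind.matr <- matrix(cells, nrow = 4, ncol = 4, byrow = TRUE,
                       dimnames = list(alg, alg))
    # Store by position instead of creating a named global variable
    result[[x]][[y]] <- Ind.matr
  }
}

# Matrix for x = 2, y = 3
result[[2]][[3]]
```

Each matrix is then addressed by both indices, so no information about y is lost.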