How to split and rearrange list - r

I have a list of year_month data frames.
They look like this:
List = c( A2017_1, A2017_2,....A2017_12, A2018_1, ... A2018_12, ..... ) and so on.
I want to rearrange this list by month, like this:
month_1 = c(A2017_1, A2018_1, A2019_1, ....)
month_2 = c(A2017_2, A2018_2, A2019_2, ....)
.
.
... and so on.
This is what I tried.
for (x in 1:12){
LF <- emp_yymm[grep(str_c('+_', x,'$'), names(emp_yymm))]
LFF <-append(LFF, as.list(LF))
names(LFF)[x]<- str_c('mon_',x)
}
And it failed.

You should really start with a reproducible example, but here is a solution that assumes a few things about your data.
name <- paste0("A", rep(c("2017","2018","2019","2020"), each = 12), "_", rep(1:12, 4))
L <- vector("list", 48)
names(L) <- name
Final <- vector("list", 0)
for (x in 1:12) {
  indx <- grep(pattern = paste0("_", x, "$"), x = names(L)) # find index of things ending in _num
  ord <- order(names(L)[indx]) # correct order
  tmp <- L[indx[ord]] # make list subset
  Final <- c(Final, tmp) # build list
}
I'm assuming that the element names of the list are the Year_month values and that you are just trying to reorder the list. If you are instead trying to make 12 separate lists, one per month, you can dynamically name variables with the assign function...
Growing a list inside a loop, as I have done here, isn't always best. It is usually better to create the list at its final length first.
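As a rough sketch of that last point, the result can be created at its final length (12 slots, one per month) and filled in place. The names by_month and month_1 ... month_12 are made up for illustration, and the placeholder values stand in for the real data frames:

```r
# Hypothetical stand-in data: 48 named elements, A2017_1 ... A2020_12
name <- paste0("A", rep(2017:2020, each = 12), "_", rep(1:12, 4))
L <- setNames(as.list(seq_along(name)), name)

# Preallocate the result at its final length instead of growing it
by_month <- vector("list", 12)
names(by_month) <- paste0("month_", 1:12)

for (x in 1:12) {
  # names ending in "_<x>"; the "$" anchor keeps "_1" from matching "_11"
  indx <- grep(paste0("_", x, "$"), names(L))
  # sort by name so the years come out in ascending order
  by_month[[x]] <- L[indx[order(names(L)[indx])]]
}
```

Each element of by_month is then itself a list holding that month's entries for every year, for example by_month$month_2.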

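library(stringr)
for (x in 1:12) {
  # keep the elements whose names end in "_<x>" (no leading '+' in the pattern)
  LF <- emp_yymm[grep(str_c('_', x, '$'), names(emp_yymm))]
  # zero-pad the month so the objects are named mon_01, ..., mon_12
  assign(sprintf('mon_%02d', x), LF)
}
# collect the per-month objects into one named list
monlist <- mget(ls(pattern = "^mon_\\d+$"))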

Related

Extract xyz values from list of lists in R

I want to extract values from a list of (named) lists in R.
Example
The data looks as follows:
data <- list('1' = list(x = c(1,2,3), y = c(2,3,4), z = c(2,3,7)),
'2' = list(x = c(2,3,4,5), y = c(3,4,5,6), z = c(1,2,3,5)))
From a specified list (e.g., '1'), I would like to extract all the first/second/etc elements from the lists. The choice for the index of the element should be random.
For example, if I want to sample from the first list (i.e., '1'), I generate a random index and extract the x, y, and z values corresponding to that random index. Say the index is 2, then the elements should be x=2, y=3, and z=3.
Approach
I thought a function should be able to do the job. The first step was to call the list from the function:
This works:
x <- function(i){
data$`1`
}
x(1)
But this doesn't:
x <- function(i){
data$`i`
}
x(1)
Question
How do I call a list of named lists from within the function? And what is the most convenient way to sample data corresponding to the selected index?
Do you need something like this?
get_elements <- function(data, i) {
#select the main list
tmp <- data[[i]]
#Check the length of each sublist, select minimum value
#and sample 1 number from 1 to that number
rand_int <- sample(min(lengths(tmp)), 1)
#select that element from each sub-list
sapply(tmp, `[[`, rand_int)
}
get_elements(data, 1)
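A minimal sketch of why the `$` version in the question fails while `[[` works: `$` takes the name literally and never evaluates the variable, whereas `[[` evaluates its argument first. The index j is fixed here instead of random, purely so the result is reproducible:

```r
data <- list('1' = list(x = c(1,2,3), y = c(2,3,4), z = c(2,3,7)),
             '2' = list(x = c(2,3,4,5), y = c(3,4,5,6), z = c(1,2,3,5)))

i <- "1"
by_dollar  <- data$`i`   # looks for an element literally named "i", so NULL
by_bracket <- data[[i]]  # evaluates i, then looks up the element named "1"

j <- 2                   # fixed index (assumption; the question wants it random)
picked <- sapply(by_bracket, `[[`, j)
```

Here picked is a named vector with x = 2, y = 3 and z = 3, matching the example given in the question.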
If I understood your problem correctly a solution would be with the "purrr" package:
library(purrr)
# list "name"
i <- '1'
# index
j <- 2
# to get the needed info as a list:
purrr::map(data[[i]], ~ .x[j])
# to get the needed info as a data.frame:
purrr::map_df(data[[i]], ~ .x[j])

How can I transform a list in multiple dataframes in R?

I have a list of multiple matrices. I can transform an item of this list into a dataframe using this code:
as.data.frame(list_of_matrices[i])
But how can I do the same in an automatic way for all indexes (i)?
I tried:
a <- data.frame()
for(i in 1:length(list_of_matrices)){
dataframes[i] <- as.data.frame(list_of_matrices[i])
}
but it didn't work:
Error in `[[<-.data.frame`(`*tmp*`, i, value = list(X1 = 1:102, X2 = c(2L, :
replacement has 102 rows, data has 0
In the OP's code, we need [[ instead of [, because [ returns a list of length 1 rather than the matrix itself.
for(i in seq_along(list_of_matrices)){
list_of_matrices[[i]] <- as.data.frame(list_of_matrices[[i]])
}
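The difference between [ and [[ can be checked on a small, made-up list (mats is a hypothetical stand-in for list_of_matrices):

```r
mats <- list(matrix(1:6, 2, 3), matrix(1:4, 2, 2))

single_bracket <- mats[1]    # still a list, of length 1, holding the matrix
double_bracket <- mats[[1]]  # the matrix itself

is.list(single_bracket)      # TRUE
is.matrix(double_bracket)    # TRUE

# converting every element without an explicit loop
dfs <- lapply(mats, as.data.frame)
```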
If we need multiple objects in the global environment (not recommended), either assign or list2env should work. After naming the list with custom names or letters (a, b, c, ...), use list2env
names(list_of_matrices) <- letters[seq_along(list_of_matrices)]
list2env(list_of_matrices, .GlobalEnv)
Now we can check the new objects:
head(a)
head(b)
Another option is assign within the loop itself
for(i in seq_along(list_of_matrices)) {
  assign(letters[i], as.data.frame(list_of_matrices[[i]]))
}
head(a)
head(b)
NOTE: We assume that the length of list_of_matrices is at most 26; otherwise the names have to come from something other than the built-in letters.
Try this:
# Example list of matrices
mat_list <- list(
matrix(runif(20), 4, 5),
matrix(runif(20), 4, 5)
)
# Convert to list of df
df_list <- lapply(mat_list, as.data.frame)

R function doesn't return value

I'm working on a Kaggle Kernel relating to FIFA 19 data (https://www.kaggle.com/karangadiya/fifa19) and trying to create a function which adds up numbers in a column.
The column has values like 88+2 (class - character)
The desired result would be 90 (class - integer)
I tried to create a function in order to transform such multiple columns
add_fun <- function(x){
a <- strsplit(x, "\\+")
for (i in 1:length(a)){
a[[i]] <- as.numeric(a[[i]])
}
for (i in 1:length(a)){
a[[i]] <- a[[i]][1] + a[[i]][2]
}
x <- as.numeric(unlist(a))
}
This works perfectly fine when I manually transform each column but the function won't return the desired results. Can someone sort this out?
Read the CSV data into df, then extract the four required columns:
dff <- df[, c("LS","ST", "RS","LW")]
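def_fun <- function(x){
  # x is a single string such as "88+2"; split it on the plus sign
  a <- strsplit(x, '\\+')
  # strsplit returns a list of length 1 here, so sum its only element
  b <- sum(as.numeric(a[[1]]))
  return(b)
}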
Then apply the operations on the required columns
for (i in 1: ncol(dff)){
dff[i] <- apply(dff[i], 1, FUN = def_fun)
}
You can cbind this data frame with the original one and drop the original columns.
I hope this helps.
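One likely reason the function in the question seems to return nothing: its last expression is an assignment, and R returns the value of an assignment invisibly, so nothing is printed unless the result is stored or wrapped in print(). A vectorised sketch that ends in a plain expression instead (the sample ratings are made up; the function name is reused from the question):

```r
add_fun <- function(x) {
  # split each string on the literal "+"
  parts <- strsplit(x, "+", fixed = TRUE)
  # sum the pieces of each string; the last expression is the return value
  vapply(parts, function(p) as.integer(sum(as.numeric(p))), integer(1))
}

ratings <- c("88+2", "75+3", "60+0")  # hypothetical sample values
totals <- add_fun(ratings)
```

Because the function is vectorised, it could be applied to several columns at once with something like dff[] <- lapply(dff, add_fun), assuming every selected column is a character vector in this format.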

Create list with named objects in R and retrieve parts of the objects from this list

I have several dataframes (full data and reduced data) and now I want to do a whole lot of analysing with kmeans and hclust. I want to be able to work in a loop and store the results in a list where I can retrieve (parts of) the stored objects based on their names. The reason is that in R-Markdown there is no good way to create new objects (and no, assign is NOT a good option to do so).
So the idea is that I make several kmeans objects in a for-loop on several dataframes and put them in a list. But I can't seem to store them in such a way that I can name these objects. In my list everything gets cluttered up. See my example.
When retrieving (parts of) the objects from the desired list, I also don't know how to address those parts (see my last part).
set.seed(4711)
df <- data.frame(matrix(sample(0:6, 120, replace = TRUE), ncol = 15, nrow = 8))
list_of_kmeans_objects <- list()
for (i in 2:4){
list_of_kmeans_objects <- c(list_of_kmeans_objects, kmeans(df, centers = i))
}
Now I have a cluttered list of 27 items. But what I want is a list of items that are also named. My desired list would be:
C2_kmeans_df <- kmeans(df, centers = 2)
C3_kmeans_df <- kmeans(df, centers = 3)
C4_kmeans_df <- kmeans(df, centers = 4)
desired_list_of_kmeans <- list(C2_kmeans_df, C3_kmeans_df, C4_kmeans_df)
names(desired_list_of_kmeans)[1] <- "C2_kmeans_df"
names(desired_list_of_kmeans)[2] <- "C3_kmeans_df"
names(desired_list_of_kmeans)[3] <- "C4_kmeans_df"
Once I have this list, my last problem is how to extract, for example,
C3_kmeans_df$cluster #or
C4_kmeans_df$tot.withinss
from this list, using the names of the objects in the desired list?
Here is an option using lapply and setNames.
idx <- 2:4
out <- setNames(object = lapply(idx, function(i) kmeans(df, centers = i)),
nm = paste0("C", idx, "_kmeans_df"))
Check the names
names(out)
# [1] "C2_kmeans_df" "C3_kmeans_df" "C4_kmeans_df"
Access cluster
out$C2_kmeans_df$cluster
# [1] 2 1 2 1 2 1 2 1
In your present for loop, c() splices the individual components of each kmeans object into list_of_kmeans_objects instead of adding the object as a single element.
The following code should do what you want:
list_of_kmeans_objects <- list()
aaa <- 0
for (i in 2:4) {
aaa <- aaa+1
list_of_kmeans_objects[[aaa]] <- kmeans(df, centers=i)
names(list_of_kmeans_objects)[aaa] <- paste0("C", i, "_kmeans_df")
}
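As a sketch of the retrieval part of the question, the stored objects can be addressed with [[ and a name held in a variable, and one component can be pulled out of every object with sapply. This variant also keeps the c() call from the question but wraps each kmeans object in list() so it is added as one element; the variable names out, wanted and wss are made up for illustration:

```r
set.seed(4711)
df <- data.frame(matrix(sample(0:6, 120, replace = TRUE), ncol = 15, nrow = 8))

idx <- 2:4
out <- list()
for (i in idx) {
  # list() keeps the kmeans object as ONE element when combined with c()
  out <- c(out, list(kmeans(df, centers = i)))
}
names(out) <- paste0("C", idx, "_kmeans_df")

# look up an element by a name stored in a variable
wanted <- "C3_kmeans_df"
clusters_c3 <- out[[wanted]]$cluster

# extract the same component from every stored object
wss <- sapply(out, function(k) k$tot.withinss)
```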

Using R to loop through vector and copy some sequences to data.frame

I want to search through a vector for the sequence of strings "hello" "world". When I find this sequence, I want to copy it, including the 10 elements before and after, as a row in a data.frame to which I'll apply further analysis.
My problem: I get an error "new column would leave holes after existing columns". I'm new to coding, so I'm not sure how to manipulate data.frames. Maybe I need to create rows in the loop?
This is what I have:
df = data.frame()
i <- 1
for(n in 1:length(v))
{
if(v[n] == 'hello' & v[n+1] == 'world')
{
df[i,n-11:n+11] <- v[n-10:n+11]
i <- i+1
}
}
Thanks!
Maybe this helps:
indx <- which(v1[-length(v1)]=='hello'& v1[-1]=='world')
lst <- Map(function(x,y) {s1 <- seq(x,y)
v1[s1[s1>0 & s1 <= length(v1)]]}, indx-10, indx+11)
len <- max(sapply(lst, length))
d1 <- as.data.frame(do.call(rbind,lapply(lst, `length<-`, len)))
data
set.seed(496)
v1 <- sample(c(letters[1:3], 'hello', 'world'), 100, replace=TRUE)
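A probable contributor to the problem in the question's loop is operator precedence: in R, `:` binds more tightly than binary `+` and `-`, so an expression like n-10:n+11 is evaluated as n - (10:n) + 11 and not as a window around n. A small sketch with an arbitrary n:

```r
n <- 15

wrong <- n-10:n+11      # parsed as n - (10:n) + 11, a descending sequence
right <- (n-10):(n+11)  # the intended window: 10 before n through 11 after

length(wrong)  # 6
length(right)  # 22
```

With n set to 15, wrong runs from 16 down to 11, while right runs from 5 up to 26.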
