R: Function that uses variable dataframe names from a vector [duplicate]


This question already has answers here:
How to convert certain columns only to numeric?
(4 answers)
Make a list from ls(pattern="") [R]
(1 answer)
Closed 2 years ago.
I have a variable number of data frames (the count depends on a previous operation). Their names are stored in a separate character vector:
> list.industries
[1] "misc" "machinery" "electronics" "drugs" "chemicals"
Now I want to convert every column after the 4th to numeric. Because the number of data frames, and therefore their names, changes from run to run, is there a way to do this automatically?
I tried:
for (i in 1:length(list.industries)) {
paste0(list.industries) <- lapply(paste0(list.industries)[,4:ncol(paste0(list.industries))] , as.numeric)
}
The idea is for the loop to substitute each data frame name from the vector list.industries and convert that data frame's columns to numeric.
Is there a way to refer to a data frame through a name stored in a vector?
Thanks!

You can use mget to fetch the data frames as a named list, convert every column from the 4th onward to numeric, and return each data frame.
new_data <- lapply(mget(list.industries), function(x) {
x[, 4:ncol(x)] <- lapply(x[, 4:ncol(x)], as.numeric)
x
})
new_data is a named list of data frames. If you want the changes reflected in the original data frames, use list2env:
list2env(new_data, .GlobalEnv)
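A minimal self-contained sketch of the approach above. The two data frames and their columns are invented stand-ins for the asker's data, and a private environment is used instead of .GlobalEnv (an assumption made here so the example does not touch your workspace).

```r
# Toy stand-ins for the asker's data frames (contents are invented).
env <- new.env()
env$misc  <- data.frame(id = 1:2, a = c("x", "y"), b = c("u", "v"),
                        v1 = c("1", "2"), v2 = c("3.5", "4"),
                        stringsAsFactors = FALSE)
env$drugs <- data.frame(id = 3:4, a = c("p", "q"), b = c("r", "s"),
                        v1 = c("10", "20"), v2 = c("0.5", "7"),
                        stringsAsFactors = FALSE)
list.industries <- c("misc", "drugs")

# Fetch the data frames by name as a named list and convert them.
new_data <- lapply(mget(list.industries, envir = env), function(x) {
  cols <- seq_len(ncol(x)) >= 4            # TRUE for the 4th column onward
  x[cols] <- lapply(x[cols], as.numeric)   # convert only those columns
  x
})

# Write the converted data frames back under their original names.
list2env(new_data, envir = env)

sapply(env$misc, class)
```

Selecting with a logical vector instead of 4:ncol(x) also behaves sensibly when a data frame has fewer than four columns. If the columns are factors rather than character, as.numeric returns the integer codes, so convert through as.character first.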

You could use this fragment (untested):
one_df <- function(x) {
dat <- get(x)
for (i in seq(4, ncol(dat))) dat[,i] <- as.numeric(dat[,i])
return(dat)
}
ans <- lapply(list.industries, one_df)
So in short: you are looking for get.

Related

How can I access a data frame using string and modify the data frame? [duplicate]

This question already has answers here:
Assign multiple objects to .GlobalEnv from within a function
(4 answers)
Closed 2 years ago.
I have several data frames, such as df01, df02 and df03.
Each data frame has three columns, c("A", "B", "C").
I want to write a for loop to modify each column for each data frame. I tried:
for (df in c("df01", "df02", "df03")) {
for (col in c("A", "B", "C")) {
get(df)[[col]] <- 0
}
}
I learned from this post that we cannot assign value to the result of the get() function in R.
I also tried
assign(df[[col]], 0)
But this also does not work. The assign() function only assigns a value to a name, but here df[[col]] is not a name, but a column.
How can I fix this?

You could get the dataframes in a list and use lapply to change the columns
df_vec <- c("df01","df02","df03")
col_vec <- c("A","B","C")
result <- lapply(mget(df_vec), function(x) {x[col_vec] <- 0;x})
To reflect these changes in the original data frames, use list2env:
list2env(result, .GlobalEnv)
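If you prefer to keep the explicit for loop from the question, one possible pattern is to copy each data frame out with get, modify the copy, and store it back with assign. This is a sketch under assumptions: the data frames are toy examples and live in a private environment (env) rather than .GlobalEnv.

```r
# Toy data frames in a private environment (contents are invented).
env <- new.env()
env$df01 <- data.frame(A = 1:2, B = 3:4, C = 5:6)
env$df02 <- data.frame(A = 7:8, B = 9:10, C = 11:12)

for (df in c("df01", "df02")) {
  tmp <- get(df, envir = env)      # copy the data frame out by name
  for (col in c("A", "B", "C")) {
    tmp[[col]] <- 0                # modify the copy, not the get() result
  }
  assign(df, tmp, envir = env)     # store the modified copy under the same name
}

env$df01
```

This works because assign receives a plain name (the string in df) and a complete object, which is exactly what assign(df[[col]], 0) in the question did not provide.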

Create vectors from a contingency table [duplicate]

This question already has answers here:
Split a large dataframe into a list of data frames based on common value in column
(3 answers)
Closed 2 years ago.
I have a contingency table of meteorological stations and frequency of occurrence. I used logical indexing to create separate vectors like below (b1:b5) from the table. However there has to be a simpler way, perhaps from the apply family. Can someone provide such an example, thanks.
mf1<-c("USW00023047","USW00013966","USC00416740","USC00413828", "USC00414982", "USC00414982", "USW00013966", "USW00013966", "USW00003927",
"USW00003927", "USC00412019", "USC00411596", "USW00012960", "USW00012960", "USW00012960", "USW00012960", "USW00012960", "USC00417327",
"USC00417327", "USC00418433", "USC00417743", "USC00419499", "USC00419847", "USR0000TCLM", "USR0000TCOL", "USW00012921", "USW00012921",
"USW00012970", "USW00012921", "USW00012921", "USW00012924")
table(mf1)
dfcont<-as.data.frame(table(mf1))
a<-dfcont$mf1
b1<-a[dfcont$Freq < 6]
b2<-a[dfcont$Freq == 2]
b3<-a[dfcont$Freq == 3]
b4<-a[dfcont$Freq == 4]
b5<-a[dfcont$Freq == 5]

You can use split:
temp <- split(as.character(dfcont$mf1), dfcont$Freq)
This gives you a list of vectors in temp. It is usually better to keep the data in a list, but if you want separate vectors, assign names to the list elements and use list2env:
names(temp) <- paste0('b', seq_along(temp))
list2env(temp, .GlobalEnv)
You would now have b1, b2 etc in your global environment.

I couldn't find anything simpler than
tbl <- table(mf1)
sp <- split(names(tbl), tbl)
If the names need to be b*, assign by pasting the "b" as a prefix to the current names.
names(sp) <- paste0('b', names(sp))
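The two naming schemes in these answers can differ when some frequencies do not occur. A small sketch with invented station IDs (s1, s2, s3), where no station occurs exactly three times:

```r
# Invented data: s1 occurs once, s2 twice, s3 four times (no frequency 3).
mf1 <- c("s1", "s2", "s2", "s3", "s3", "s3", "s3")
dfcont <- as.data.frame(table(mf1))

# Name by position: b1, b2, b3 regardless of the actual frequencies.
by_pos <- split(as.character(dfcont$mf1), dfcont$Freq)
names(by_pos) <- paste0("b", seq_along(by_pos))

# Name by frequency: the suffix is the frequency itself.
by_freq <- split(as.character(dfcont$mf1), dfcont$Freq)
names(by_freq) <- paste0("b", names(by_freq))

names(by_pos)
names(by_freq)
```

With the data in the question every frequency from 1 to 5 occurs, so both schemes give the same names there. Prefixing the existing names is the safer choice if the suffix is meant to be the frequency.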

How to loop through a vector of data frame names to print first columns of the df's? [duplicate]

This question already has answers here:
How to extract certain columns from a list of data frames
(3 answers)
Closed 2 years ago.
Here x is a vector holding the names of data frames. I am trying to print the first column of each data frame named in the vector. So far I have tried the following, but neither attempt works.
x = (c('Ethereum,another Df..., another DF...,'))
for (i in x){
print(i[,1])
}
sapply(toString(Ethereum), function(i) print(i[1]))

You can try this
x <- c('Ethereum','anotherDf',...)
for (i in x){
print(get(i)[,1])
}

You can use mget to get the data frames in a list and then use lapply to extract the first column of each data frame in the list.
data <- lapply(mget(x), `[`, 1)
#Use `[[` to get it as vector.
#data <- lapply(mget(x), `[[`, 1)
Similar solution using purrr::map :
data <- purrr::map(mget(x), `[`, 1)
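A runnable sketch of the mget approach with toy data. The data frame Bitcoin, the column names, and the private environment env are all assumptions made for illustration; only the name Ethereum comes from the question.

```r
# Toy data frames in a private environment (contents are invented).
env <- new.env()
env$Ethereum <- data.frame(price = c(10, 12), volume = c(100, 90))
env$Bitcoin  <- data.frame(price = c(50, 55), volume = c(20, 25))

x <- c("Ethereum", "Bitcoin")   # names of the data frames, as strings

# `[[` returns each first column as a plain vector, named by data frame.
first_cols <- lapply(mget(x, envir = env), `[[`, 1)

# Print each first column together with the data frame it came from.
for (nm in names(first_cols)) {
  cat(nm, ":", first_cols[[nm]], "\n")
}
```

Note that x must contain one string per data frame name. A single string with the names separated by commas, as in the question, is one name as far as get and mget are concerned.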

Creating Subset data frames in R within For loop [duplicate]

This question already has answers here:
Split a large dataframe into a list of data frames based on common value in column
(3 answers)
Closed 4 years ago.
What I am trying to do is filter a larger data frame into 78 unique data frames based on the value of the first column in the larger data frame. The only way I can think of doing it properly is by applying the filter() function inside a for() loop:
for (i in 1:nrow(plantline))
{x1 = filter(rawdta.df, Plant_Line == plantline$Plant_Line[i])}
The issue is I don't know how to create a new data frame, say x2, x3, x4... every time the loop runs.
Can someone tell me if that is possible or if I should be trying to do this some other way?

There must be many duplicates for this question
split(rawdta.df, rawdta.df$Plant_Line)
will create a list of data.frames.
However, depending on your use case, splitting the large data.frame into pieces might not be necessary as grouping can be used.
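To illustrate the point about grouping, here is a base R sketch that computes a per-group mean twice: once by splitting first, and once directly with tapply. The column height and the values are invented; only the names rawdta.df and Plant_Line come from the question.

```r
# Toy version of the large data frame (the height column is invented).
rawdta.df <- data.frame(Plant_Line = c("A", "A", "B", "B", "C"),
                        height     = c(1, 3, 2, 4, 10))

# Option 1: split into a named list of data frames, then summarise each piece.
pieces <- split(rawdta.df, rawdta.df$Plant_Line)
means_split <- sapply(pieces, function(d) mean(d$height))

# Option 2: grouped summary without creating separate data frames.
means_grouped <- tapply(rawdta.df$height, rawdta.df$Plant_Line, mean)

means_split
means_grouped
```

Both give one mean per plant line. The grouped form avoids managing dozens of separate objects when all you need is a per-group result.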

You could use split -
# creates a list of dataframes into 78 unique data frames based on
# the value of the first column in the larger data frame
lst = split(large_data_frame, large_data_frame$first_column)
# takes the dataframes out of the list into the global environment
# although it is not suggested since it is difficult to work with 78
# dataframes
list2env(lst, envir = .GlobalEnv)
The names of the dataframes will be the same as the value of the variables in the first column.

It would be easier if we could see the dataframes....
I propose something nevertheless. You can create a list of dataframes:
dataframes <- vector("list", nrow(plantline))
for (i in 1:nrow(plantline)){
dataframes[[i]] = filter(rawdta.df, Plant_Line == plantline$Plant_Line[i])
}

You can use assign :
for (i in 1:nrow(plantline))
{assign(paste0("x", i), filter(rawdta.df, Plant_Line == plantline$Plant_Line[i]))}
Alternatively, you can save your results in a list:
X <- list()
for (i in 1:nrow(plantline))
{X[[i]] = filter(rawdta.df, Plant_Line == plantline$Plant_Line[i])}

Would be easier with sample data. by would be my favorite.
d <- data.frame(plantline = rep(LETTERS[1:3], 4),
x = 1:12,
stringsAsFactors = FALSE)
l <- by(d, d$plantline, data.frame)
print(l$A)
print(l$B)

Solution using plyr:
ma <- cbind(x = 1:10, y = (-4:5)^2, z = 1:2)
ma <- as.data.frame(ma)
library(plyr)
dlply(ma, "z") # you split ma by the column named z

Initializing data frame with columns from a list of names in R [duplicate]

This question already has answers here:
How to initialize empty data frame (lot of columns at the same time) in R
(2 answers)
Closed 6 years ago.
I have a list of column names, such as
names = c("a","b")
I'd like to make an empty data frame with the column names taken from names, with 1 row where all values are NA.
"a" "b"
NA NA
I've tried something like this:
d = data.frame()
for(i in seq(1,length(names))) {
d[,toString(names[i])] = NA
}
Doesn't seem to work

We can replicate NA by the length of names into a list, set the names of the list with names and convert to a data.frame
data.frame(setNames(rep(list(NA), length(names)), names))
Or another option is read.csv
read.csv(text=paste(rep(NA, length(names)), collapse=","),
header=FALSE,col.names = names)

This will also do:
df <- as.data.frame(matrix(rep(NA, length(names)), nrow=1))
names(df) <- names
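A short check of what these approaches produce, as a sketch. The vector is renamed to col_names here (a choice made for this example) so that it does not shadow the base function names().

```r
col_names <- c("a", "b")

# One NA per column name, turned into a one-row data frame.
d <- data.frame(setNames(rep(list(NA), length(col_names)), col_names))

# Matrix-based variant: a 1 x n matrix of NA with the column names attached.
d2 <- as.data.frame(matrix(NA, nrow = 1, ncol = length(col_names),
                           dimnames = list(NULL, col_names)))

dim(d)
names(d2)
```

The columns start out as logical NA. R converts a column to another type when you assign a value of that type to it.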
