A post on here a day back has me wondering how to assign values to multiple objects in the global environment from within a function. This is my attempt using lapply (assign may be safer than <<- but I have never actually used it and am not familiar with it).
#fake data set
df <- data.frame(
x.2=rnorm(25),
y.2=rnorm(25),
g=rep(factor(LETTERS[1:5]), 5)
)
#split it into a list of data frames
LIST <- split(df, df$g)
#pre-allot 5 objects in R with class data.frame()
V <- W <- X <- Y <- Z <- data.frame()
#attempt to assign the data frames in the LIST to the objects just created
lapply(seq_along(LIST), function(x) c(V, W, X, Y, Z)[x] <<- LIST[[x]])
Please feel free to shorten any/all parts of my code to make this work (or work better/faster).
Update of 2018-10-10:
The most succinct way to carry out this specific task is to use list2env() like so:
## Create an example list of five data.frames
df <- data.frame(x = rnorm(25),
g = rep(factor(LETTERS[1:5]), 5))
LIST <- split(df, df$g)
## Assign them to the global environment
list2env(LIST, envir = .GlobalEnv)
## Check that it worked
ls()
## [1] "A" "B" "C" "D" "df" "E" "LIST"
Original answer, demonstrating use of assign()
You're right that assign() is the right tool for the job. Its envir argument gives you precise control over where assignment takes place -- control that is not available with either <- or <<-.
So, for example, to assign the value of X to an object named NAME in the global environment, you would do:
assign("NAME", X, envir = .GlobalEnv)
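To see what the envir argument buys you, here is a minimal sketch that assigns into a scratch environment instead of .GlobalEnv, so it leaves your workspace alone. The names target and make_in_env are made up for illustration and are not part of the question or answer.

```r
# `target` and `make_in_env` are hypothetical names used only for this sketch.
target <- new.env()

make_in_env <- function(env) {
  local_only <- 1                  # `<-` binds in the function's own frame
  assign("kept", 2, envir = env)   # assign() binds in the environment passed in
  invisible(NULL)
}

make_in_env(target)
exists("kept", envir = target, inherits = FALSE)        # TRUE
exists("local_only", envir = target, inherits = FALSE)  # FALSE
```

Passing .GlobalEnv as env would give the behaviour described above.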
In your case:
df <- data.frame(x = rnorm(25),
g = rep(factor(LETTERS[1:5]), 5))
LIST <- split(df, df$g)
NAMES <- c("V", "W", "X", "Y", "Z")
lapply(seq_along(LIST),
function(x) {
assign(NAMES[x], LIST[[x]], envir=.GlobalEnv)
}
)
ls()
[1] "df" "LIST" "NAMES" "V" "W" "X" "Y" "Z"
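If you want custom names but would rather avoid the explicit loop, the two approaches can probably be combined by renaming the list first. This is only a sketch; target is a hypothetical scratch environment, so swap in .GlobalEnv to match the answer above.

```r
df <- data.frame(x = rnorm(25),
                 g = rep(factor(LETTERS[1:5]), 5))
LIST <- split(df, df$g)
NAMES <- c("V", "W", "X", "Y", "Z")

# `target` is a hypothetical scratch environment; use .GlobalEnv for the real thing.
target <- new.env()

# Rename the list elements, then let list2env() create one object per element.
list2env(setNames(LIST, NAMES), envir = target)

ls(target)
## [1] "V" "W" "X" "Y" "Z"
```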
I think this question can have a nice crossover with this one: Can lists be created that name themselves based on input object names?
Say you want to apply the same modification to a set of objects on the fly. list2env() requires a named list, though, and you don't want to type the object names out a second time. Borrowing the namedList function from that question and combining it with Josh O'Brien's answer:
> namedList <- function(...) {
+ L <- list(...)
+ snm <- sapply(substitute(list(...)), deparse)[-1]
+ if (is.null(nm <- names(L))) nm <- snm
+ if (any(nonames <- nm=="")) nm[nonames] <- snm[nonames]
+ setNames(L ,nm)
+ }
>
> df_1 <- data.frame(x = 1)
> df_2 <- data.frame(x = 2)
> df_3 <- data.frame(x = 3)
>
> list2env(lapply(namedList(df_1, df_2, df_3), function(x) {
+ x <- cbind.data.frame(x, y = "B")
+ }), envir = .GlobalEnv)
<environment: R_GlobalEnv>
>
> df_1
x y
1 1 B
> df_2
x y
1 2 B
> df_3
x y
1 3 B
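If the object names are already available as a character vector, mget() returns a named list directly, so the helper may not be needed. A sketch under that assumption, using a hypothetical scratch environment called work in place of the global environment:

```r
# `work` is a hypothetical scratch environment standing in for .GlobalEnv.
work <- new.env()
work$df_1 <- data.frame(x = 1)
work$df_2 <- data.frame(x = 2)
work$df_3 <- data.frame(x = 3)

nms <- c("df_1", "df_2", "df_3")

# mget() fetches the objects as a list named after `nms`;
# lapply() keeps those names, so list2env() can write the results back.
updated <- lapply(mget(nms, envir = work), function(d) cbind(d, y = "B"))
list2env(updated, envir = work)

names(work$df_2)
## [1] "x" "y"
```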
If you have a list of object names and file paths you can also use mapply:
object_names <- c("df_1", "df_2", "df_3")
file_paths <- list.files({path}, pattern = "\\.csv$", full.names = TRUE)
mapply(function(df_name, file)
assign(df_name, read.csv(file), envir=.GlobalEnv),
object_names,
file_paths)
I used list.files() to construct a vector of all the .csv files in a
specific directory. But file_paths could be written or constructed in any way.
If the files you want to read in are in the current working
directory, then file_paths could be replaced with a character vector of
file names.
In the code above, you need to replace {path} with a
string of the desired directory's path.
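For what it's worth, reading the files into one named list is often easier to work with than separate global objects. The sketch below writes two throwaway CSV files to a temporary directory so it can run anywhere; dir_path and the file names are invented for the example.

```r
# `dir_path` and the two file names are hypothetical, created only for this demo.
dir_path <- file.path(tempdir(), "csv_demo")
dir.create(dir_path, showWarnings = FALSE)
write.csv(data.frame(a = 1:2), file.path(dir_path, "first.csv"), row.names = FALSE)
write.csv(data.frame(a = 3:4), file.path(dir_path, "second.csv"), row.names = FALSE)

# "\\.csv$" matches a literal ".csv" at the end of the file name.
file_paths <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)

# Derive the object names from the file names instead of typing them by hand.
object_names <- tools::file_path_sans_ext(basename(file_paths))
tables <- setNames(lapply(file_paths, read.csv), object_names)

names(tables)
## [1] "first"  "second"
```

If you still want them as loose objects afterwards, list2env(tables, envir = .GlobalEnv) should do it.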
This demonstrates how to split out a nested dataframe into objects in the global environment with tidyverse functions:
library(tidyverse)
library(palmerpenguins)
penguins %>%
group_nest(species) %>%
deframe() %>%
list2env(.GlobalEnv)
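A rough base R equivalent, in case you'd rather not load the tidyverse. It uses the built-in iris data instead of penguins, and a hypothetical scratch environment target in place of .GlobalEnv. One difference to be aware of: as far as I know group_nest() drops the grouping column from the nested data by default, whereas split() keeps it.

```r
# `target` is a hypothetical scratch environment; use .GlobalEnv to mirror the code above.
target <- new.env()

# split() returns a list named by the factor levels, which list2env() accepts as-is.
list2env(split(iris, iris$Species), envir = target)

ls(target)
## [1] "setosa"     "versicolor" "virginica"
```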
Related
for(i in 1:3){
names <- c("n1","n2","n3")
assign(paste0("mydf",i), data.frame(matrix("", nrow = 3, ncol = 3)))
}
I tried the code shown below but it didn't work.
for(i in 1:3){
names <- c("n1","n2","n3")
assign(paste0("mydf",i), names(data.frame(matrix("", nrow = 3, ncol = 3)))[1:3] <- names)
}
What's your solution? Thanks in advance.
This is the approach I would take. The following script not only changes the column names but also creates 3 dataframes in the global environment, much like your original script.
for (i in 1:3){
noms <- c("n1","n2","n3") # create the names in order the columns appear in the dataframe
df_ <- data.frame(matrix("", nrow = 3, ncol = 3)) # create the dataframe
df_nom <- paste("mydf", i, sep = "") # create the dataframe name
colnames(df_) <- noms # assign the names to the columns
assign(df_nom, df_) # bind the dataframe to the name stored in df_nom
}
1) Normally one puts the data frames in a list, but if you really want to put them into the current environment then do the following. If you want the global environment, replace the first line with e <- .GlobalEnv; if you want to create a list instead (preferable), use e <- list().
# define 3 data frames
e <- environment() # or e <- .GlobalEnv or e <- list()
nms <- paste0("mydf", 1:3)
for(nm in nms) e[[nm]] <- data.frame(matrix("", 3, 3))
# change their column names
for(nm in nms) names(e[[nm]]) <- c("n1", "n2", "n3")
2) Even better if we want lists is:
L <- Map(function(x) data.frame(matrix("", 3, 3)), paste0("mydf", 1:3))
L[] <- lapply(L, `names<-`, c("n1", "n2", "n3")) # change col names
Converting
Note that we can convert a list L to data frames that are loose in the environment using one of these depending on which environment you want to put the list components into.
list2env(L, environment())
list2env(L, .GlobalEnv)
and we can go the other way using mget, where e is environment() or .GlobalEnv depending on what we need. We can omit the e argument if the data frames are in the current environment.
L <- mget(nms, e)
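Putting the two directions together, here is a small round-trip sketch. It uses a fresh environment created with new.env() so nothing in your workspace gets overwritten; that choice is mine and not part of the answer.

```r
# Build the named list of data frames and set their column names.
nms <- paste0("mydf", 1:3)
L <- setNames(lapply(nms, function(nm) data.frame(matrix("", 3, 3))), nms)
L[] <- lapply(L, `names<-`, c("n1", "n2", "n3"))

# list -> environment (a fresh one here, assumed for safety)
e <- new.env()
list2env(L, e)

# environment -> list
back <- mget(nms, envir = e)

names(back)
## [1] "mydf1" "mydf2" "mydf3"
names(back$mydf2)
## [1] "n1" "n2" "n3"
```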
You can use get to get the data.frame by name, update the names and assign it back.
nNames <- c("n1","n2","n3")
for(i in 1:3) {
D <- paste0("mydf",i)
tt <- get(D)
names(tt) <- nNames
assign(D, tt)
}
names(mydf1)
#[1] "n1" "n2" "n3"
Alternatively the names could already be set when creating the matrix by using dimnames:
nNames <- c("n1","n2","n3")
for(i in 1:3) {
assign(paste0("mydf", i),
data.frame(matrix("", 3, 3, dimnames=list(NULL, nNames))))
}
names(mydf1)
#[1] "n1" "n2" "n3"
I have a set of datasets that end with .fin. I would like to create a list and merge them using
ls(pattern = ".fin")
"A.fin" "B.fin" "C.fin" "D.fin" "E.fin" "F.fin" "G.fin" "H.fin" "I.fin"
"J.fin" "K.fin" "L.fin" "M.fin" "N.fin"
I would like to go from the line and code above to the line below beginning with list, like list(ls(pattern = ".fin")); however, this only returns a list containing a single character vector of the data set names. I have also tried list(get(ls(pattern = ".fin"))) and list(eval(parse(text = ls(pattern = ".fin")))), to no avail.
list(ls(pattern = ".fin")) ### <- REPLACE THIS SOMEHOW %>%
Reduce(function(dtf1,dtf2) full_join(dtf1,dtf2,by="i"), .)
You can use mget:
mget(ls(pattern = ".fin"))
A.fin <- c(1,2,3)
B.fin <- c(4,5,6)
mget(ls(pattern = ".fin"))
#$A.fin
#[1] 1 2 3
#$B.fin
#[1] 4 5 6
get is not vectorized so you should "loop" over whatever ls() is returning. You can do that either
sapply(ls(pattern = ".fin"), FUN = get)
or the long way
xy <- ls(pattern = ".fin")
mylist <- vector("list", length(xy))
for (i in 1:length(mylist)) {
mylist[[i]] <- get(xy[i])
}
or use mget(ls(pattern = ".fin")).
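For completeness, a sketch of the whole pipeline from the question. Two things here are my assumptions rather than the asker's code: base R's merge() with all = TRUE stands in for dplyr's full_join(), and the toy data frames live in a hypothetical scratch environment called work. The pattern is written as "\\.fin$" because an unescaped "." matches any character.

```r
# `work` and the three toy data frames are hypothetical, made up for this sketch.
work <- new.env()
work$A.fin <- data.frame(i = 1:3, a = c(10, 20, 30))
work$B.fin <- data.frame(i = 2:4, b = c("x", "y", "z"))
work$C.fin <- data.frame(i = c(1L, 4L), c = c(TRUE, FALSE))

# Names ending in a literal ".fin"
fin_names <- ls(work, pattern = "\\.fin$")

# mget() turns the names into a named list; Reduce() merges them pairwise.
# merge(..., all = TRUE) keeps unmatched rows from both sides (a full outer join).
merged <- Reduce(function(d1, d2) merge(d1, d2, by = "i", all = TRUE),
                 mget(fin_names, envir = work))

names(merged)
## [1] "i" "a" "b" "c"
nrow(merged)
## [1] 4
```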
I've been trying to create a function that can pipe the name of a variable into the names of columns for a data.frame that will be created in that function.
For example:
#Create variable
Var1 <- c(1,1,2,2,3,3)
#Create function that dummy codes the variable and renames them
rename_dummies <- function(x){
m <- model.matrix(~factor(x))
colnames(m)[1] <- "Dummy1"
colnames(m)[2] <- "Dummy2"
colnames(m)[3] <- "Dummy3"
m <<- data.frame(m)
}
rename_dummies(Var1)
Now, what can I add to this function so that "Var1" is automatically placed in front of "Dummy" in each of the variable names? Ideally I would end up with 3 variables that look like this...
> names(m)
[1] "Var1_Dummy1" "Var1_Dummy2" "Var1_Dummy3"
Try the below code. The key is in deparse(substitute). I also modified your function to not use the global assignment operator <<-, which is poor practice.
Var1 <- c(1,1,2,2,3,3)
#Create function that dummy codes the variable and renames them
rename_dummies <- function(x){
nm = deparse(substitute(x))
m <- model.matrix(~factor(x))
colnames(m)[1] <- "Dummy1"
colnames(m)[2] <- "Dummy2"
colnames(m)[3] <- "Dummy3"
m <- data.frame(m)
names(m) <- paste(nm, names(m), sep = "_")
m
}
rename_dummies(Var1)
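If the variable might have more or fewer than three levels, the hard-coded column indices could be replaced with seq_len(ncol(m)). This is a sketch of that variation, not the original answer's code. Note that, as in the original, the first column is the intercept from model.matrix().

```r
rename_dummies <- function(x) {
  nm <- deparse(substitute(x))                    # name of the variable as passed in
  m <- as.data.frame(model.matrix(~ factor(x)))   # first column is the intercept
  names(m) <- paste0(nm, "_Dummy", seq_len(ncol(m)))
  m
}

Var1 <- c(1, 1, 2, 2, 3, 3)
names(rename_dummies(Var1))
## [1] "Var1_Dummy1" "Var1_Dummy2" "Var1_Dummy3"
```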