file.choose() analogue for objects in R

Is there an analogue of file.choose() in R that works with objects inside R
(elements of vectors, objects in environments, etc.)?
I just need a dialog window like the one file.choose() opens, where I can choose elements of a vector, for example.
For example, I have a data frame with 3 columns.
length(unique(df$column2))
[1] 3
Then I write
df <- filter(df, column2 %in% MyMagicFunction())
and a window appears in which I choose the elements I want =)

I guess you are working in the plain R console (i.e. not RStudio).
You can use choose.files() (available in R on Windows only) for that purpose after populating a folder with placeholder files, one per variable:
myfunction <- function(df) {
  # Return the components of a path, last component first
  split_path <- function(path) {
    rev(setdiff(strsplit(path, "/|\\\\")[[1]], ""))
  }
  # One folder per data frame, named after the object passed in
  tmpdir <- file.path("c:/temp", deparse(substitute(df)))
  dir.create(tmpdir, showWarnings = FALSE, recursive = TRUE)
  # Create one empty placeholder file per column
  for (ivar in names(df)) {
    cat("", file = file.path(tmpdir, ivar))
  }
  selvar <- choose.files(default = paste(tmpdir, "*", sep = "/"), caption = "Variable",
                         multi = FALSE)
  varname <- split_path(selvar)[1]
  unlink(file.path(tmpdir, "*")) # remove the placeholder files
  print(varname) # to be replaced by your function exploiting df and varname such as mean(df[,varname])
}
Then:
> doit <- myfunction(iris)
[1] "Sepal.Length"
As noted in the code comment, you have to put your own function call inside myfunction.
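If the goal is only to pick values from a vector, utils::select.list() may be a more direct route than placeholder files. The following is a minimal sketch, not a drop-in replacement: choose_values is a hypothetical helper name, the data frame is made up, and the fallback of returning every value when no interactive session is available is an assumption added so the code also runs in batch mode.

```r
# Hypothetical helper: let the user pick values of a vector in a dialog.
choose_values <- function(x, title = "Choose values") {
  choices <- as.character(unique(x))
  # Assumption: without an interactive session there is no dialog,
  # so keep every value instead of failing.
  if (!interactive()) return(choices)
  utils::select.list(choices, multiple = TRUE, title = title, graphics = TRUE)
}

# Made-up data frame with three distinct values in column2.
df <- data.frame(column1 = 1:6, column2 = rep(c("a", "b", "c"), 2))

picked <- choose_values(df$column2)
result <- df[df$column2 %in% picked, ]
```

With dplyr loaded, the last line could equally be written as filter(df, column2 %in% choose_values(df$column2)).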

Related

Adding a new column with filenames for the list of files in a for loop

I have time series data, stored as .txt files in daily subfolders under monthly folders.
setwd(".../2018/Jan")
parent.folder <-".../2018/Jan"
sub.folders <- list.dirs(parent.folder, recursive=TRUE)[-1] #To read the sub-folders under parent folder
r.scripts <- file.path(sub.folders)
A_2018 <- list()
for (j in seq_along(r.scripts)) {
A_2018[[j]] <- dir(r.scripts[j],"\\.txt$")}
From these .txt files, I removed those that I don't want to use in further analysis, using the following code.
trim_to_two <- function(x) {
runs = rle(gsub("^L1_\\d{4}_\\d{4}_","",x))
return(cumsum(runs$lengths)[which(runs$lengths > 2)] * -1)
}
A_2018_new <- list()
for (j in seq_along(A_2018)) {
A_2018_new[[j]] <- A_2018[[j]][trim_to_two(A_2018[[j]])]
}
Then I want to row-bind all the .txt files in a for loop. Before that, I would like to remove some lines from each .txt file and add a new column holding the file name. The following is my code.
for (i in 1:length(A_2018_new)) {
for (j in 1:length(A_2018_new[[i]])){
filename <- paste(str_sub(A_2018_new[[i]][j], 1, 14))
assign(filename, read_tsv(complete_file_name, skip = 14, col_names = FALSE),
)
Y <- r.scripts %>% str_sub(46, 49)
MD <- r.scripts %>% str_sub(58, 61)
HM <- filename %>% str_sub(9, 12)
Turn <- filename %>% str_sub(14, 14)
time_minute <- paste(Y, MD, HM, sep="-")
Map(cbind, filename, SampleID = names(filename))
}
}
But I didn't get my desired output. I tried adapting other examples. Could anyone explain what my code is missing?

Your code seems overly complex for what it is doing. However, your problem is not 100% clear (e.g. what is the pattern in your file names that determines what to import and what not?). Here are some pointers that would greatly simplify the code and likely avoid the issue you are having.
Use lapply() or map() from the purrr package to iterate instead of a for loop. The benefit is that it places the different data frames in a list and you don't need to assign multiple data frames into their own objects in the environment. Since you tagged the tidyverse, we'll use the purrr functions.
library(tidyverse)
You could for instance retrieve the txt file paths, using something like
txt_files <- list.files(path = 'data/folder/', pattern = "\\.txt$", full.names = TRUE) # Remove the files you don't want here, with whatever logic applies
and then use map() with read_tsv() from readr like so:
mydata <- map(txt_files, read_tsv)
Then for your manipulation, you can again use lapply() or map() to apply that manipulation to each data frame. The easiest way is to create a custom function, and then apply it to each data frame:
my_func <- function(df, filename) {
df |>
filter(...) |> # Whatever logic applies here
mutate(filename = filename)
}
and then use map2() to apply this function, iterating through the data and filenames, and then list_rbind() to bind the data frames across the rows.
mydata_output <- map2(mydata, txt_files, my_func) |>
list_rbind()
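For readers without the tidyverse, the same read, tag, and bind pattern can be sketched in base R. The file names and contents below are made up and written to a temporary folder purely so the example runs on its own; read_one is a hypothetical helper name.

```r
# Write two small tab-separated files to a temporary folder (made-up data).
dir_path <- file.path(tempdir(), "txt_demo")
dir.create(dir_path, showWarnings = FALSE)
write.table(data.frame(a = 1:2, b = 3:4), file.path(dir_path, "L1_0101.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(a = 5:6, b = 7:8), file.path(dir_path, "L1_0102.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Collect the full paths of every .txt file in the folder.
txt_files <- list.files(dir_path, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file and record where it came from.
read_one <- function(path) {
  df <- read.delim(path)
  df$filename <- basename(path)
  df
}

# Read every file into a list, then stack the data frames row-wise.
combined <- do.call(rbind, lapply(txt_files, read_one))
```

Any per-file filtering would go inside read_one, before the data frame is returned.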

Looping over lists, extracting certain elements and delete the list?

I am trying to write efficient code that opens data files each containing a list, extracts one element of the list, stores it in a data frame, and then deletes the object before opening the next file.
My idea is to do this with a loop. Unfortunately, I am quite new to loops and don't know how to write the code.
I have managed to open the data sets using the following code:
for(i in 1995:2015){
objects = paste("C:/Users/...",i,"agg.rda", sep=" ")
load(objects)
}
The problem is that each data set is extremely large and R cannot hold all of them at once. Therefore, I am now trying to extract one element from each list, called tab_<<i value>>_agg[["A"]] (for example tab_1995_agg[["A"]]), then delete the object and move on to the next i (the i values are years).
I have tried the following code, but it does not work:
for(i in unique(1995:2015)){
objects = paste("C:/Users/...",i,"agg.rda", sep=" ")
load(objects)
tmp = cat("tab",i,"_agg[[\"A\"]]" , sep = "")
y <- rbind(y, tmp)
rm(list=objects)
}
I apologize for any silly mistake (or question) and greatly appreciate any help.

Here's a possible solution using a function to rename the object you're loading in. I took the loadRData function from another source. It makes this a bit more approachable because you can load the object under a different name.
First, create some data for a reproducible example.
tab2000_agg <-
list(
A = 1:5,
b = 6:10
)
tab2001_agg <-
list(
A = 1:5,
d = 6:10
)
save(tab2000_agg, file = "2000_agg.rda")
save(tab2001_agg, file = "2001_agg.rda")
rm(tab2000_agg, tab2001_agg)
Using your loop idea.
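loadRData <- function(fileName){
  # load the file into this function's environment and return the object,
  # whatever name it was saved under
  load(fileName)
  get(ls()[ls() != "fileName"])
}
y <- list()
for(i in 2000:2001){
  objects <- paste0(i, "_agg.rda")
  data_list <- loadRData(objects)
  tmp <- data_list[["A"]]
  y[[as.character(i)]] <- tmp # index by name; y[[2000]] would pad the list with NULLs
  rm(data_list)
}
y <- do.call(rbind, y)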
You could also turn it into a function rather than use a loop.
# named extract_A to avoid masking base::getElement()
extract_A <- function(year){
  objects <- paste0(year, "_agg.rda")
  data_list <- loadRData(objects)
  tmp <- data_list[["A"]]
  return(tmp)
}
y <- lapply(2000:2001, extract_A)
y <- do.call(rbind, y)
Created on 2022-01-14 by the reprex package (v2.0.1)
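A variant of the same idea, sketched under the assumption that each .rda file holds exactly one object: load() accepts an envir argument and returns the names of the objects it loaded, so each file can be read into a throwaway environment that is discarded when the function returns. The data and the helper name extract_element are made up for illustration.

```r
# Made-up example data, saved to a temporary folder.
tmp <- tempdir()
for (year in 2000:2001) {
  obj_name <- paste0("tab", year, "_agg")
  assign(obj_name, list(A = 1:5 * year, b = 6:10))
  save(list = obj_name, file = file.path(tmp, paste0(year, "_agg.rda")))
  rm(list = obj_name)
}

# Hypothetical helper: load one file into a private environment and
# return a single element of the list stored in it.
extract_element <- function(year, element = "A", dir = tmp) {
  env <- new.env()
  loaded <- load(file.path(dir, paste0(year, "_agg.rda")), envir = env)
  env[[loaded[1]]][[element]] # assumes one object per file
}

years <- 2000:2001
y <- do.call(rbind, lapply(years, extract_element))
rownames(y) <- years
```

Because the large list only ever lives inside env, it becomes eligible for garbage collection as soon as extract_element returns, and no explicit rm() is needed.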

Adding new columns and column names in a loop in R

I have a loop to read in a series of .csv files
for (i in 1:3)
{
nam <- paste0("A_tree", i)
assign(nam, read.csv(sprintf("/Users/sethparker/Documents/%d_tree_from_data.txt", i), header = FALSE))
}
This works fine and generates a series of data frames comparable to this example data:
A_tree1 <- data.frame(cbind(c(1:5),c(1:5),c(1:5)))
A_tree2 <- data.frame(cbind(c(2:6),c(2:6),c(2:6)))
A_tree3 <- data.frame(cbind(c(3:10),c(3:10),c(3:10)))
What I want to do is add column names, and populate 2 new columns with data (month and model run). My current successful approach is to do this individually, like this:
colnames(A_tree1) <- c("GPP","NPP","LA")
A_tree1$month <- seq.int(nrow(A_tree1))
A_tree1$run <- c("1")
colnames(A_tree2) <- c("GPP","NPP","LA")
A_tree2$month <- seq.int(nrow(A_tree2))
A_tree2$run <- c("2")
colnames(A_tree3) <- c("GPP","NPP","LA")
A_tree3$month <- seq.int(nrow(A_tree3))
A_tree3$run <- c("3")
This is extremely inefficient for the number of _tree objects I have. Attempts to modify the loop with paste0() or sprintf() to incorporate these desired manipulations have resulted in Error: target of assignment expands to non-language object. I think I understand why this error is appearing based on reading other posts (Error in <my code> : target of assignment expands to non-language object). Is it possible to do what I want within my for loop? If not, how could I automate this better?

You can use lapply:
library(dplyr) # for %>% and bind_rows()
n <- 3 # total number of files to read
l <- lapply(1:n, function(i) {
  # paste0() does the same job as sprintf() here; I just prefer it
  # import the data for index i
  r <- read.csv(
    paste0("/Users/sethparker/Documents/", i, "_tree_from_data.txt"),
    header = FALSE
  )
  # name the columns, then add the new ones
  colnames(r) <- c("GPP", "NPP", "LA")
  r$month <- seq.int(nrow(r))
  r$run <- i
  return(r)
})
# lapply returns a list; to stack the tables into one data frame,
# pipe the list into bind_rows() (dplyr package)
l %>%
  bind_rows()
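The same transformation can be tried without any files by starting from the example data in the question. This is a base R sketch; label_run is a hypothetical helper name, and run is stored as a character string to match the question's manual version.

```r
# The example data from the question, held in a list instead of
# three separate objects.
trees <- list(
  data.frame(cbind(1:5, 1:5, 1:5)),
  data.frame(cbind(2:6, 2:6, 2:6)),
  data.frame(cbind(3:10, 3:10, 3:10))
)

# Hypothetical helper: name the columns and add month and run.
label_run <- function(df, run) {
  colnames(df) <- c("GPP", "NPP", "LA")
  df$month <- seq_len(nrow(df))
  df$run <- as.character(run)
  df
}

# Apply the helper to each data frame together with its position in the list.
trees <- Map(label_run, trees, seq_along(trees))

# Optionally stack everything into a single data frame.
all_runs <- do.call(rbind, trees)
```

Keeping the data frames in a list avoids assign() entirely, which is what sidesteps the "target of assignment expands to non-language object" error.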

R : name of an object stored in a variable

I have this little problem in R: I loaded a dataset, modified it and stored it in the variable "mean". Then I used another variable, "dataset", also containing this dataset:
data<-read.table()
[...modification on data...]
mean<-data
dataset<-mean
I used the variable "dataset" in some other functions of my script, and at the end I want to store the result in a file named "table_mean.csv".
Of course, neither the command write.csv(tabCorr,file=paste("table_",dataset,".csv",sep=""))
nor the one with ...,quote(dataset)... does what I want.
Does anyone know how I can retrieve "mean" (as a string) from "dataset"?
(The aim would be that I could use this script for other purposes simply changing e.g. dataset<-variance)
Thank you in advance !

I think you are trying to do something like the following code does:
data1 <- 1:4
data2 <- 4:8
## Configuration ###
useThisDataSet <- "data2" # Change to "data1" to use other dataset.
currentDataSet <- get(x = useThisDataSet)
## Your data analysis.
result <- fivenum(currentDataSet)
## Save results.
write.csv(x = result, file = paste0("table_", useThisDataSet, ".csv"))
However, a better alternative would be to wrap your code into a function and pass in your data:
doAnalysis <- function(data, name) {
result <- fivenum(data)
write.csv(x = result, file = paste0("table_", name, ".csv"))
}
doAnalysis(data1, "data1")
If you always want to use the name of the object passed into the function as part of the filename, we can use non-standard evaluation to save some typing:
doAnalysisShort <- function(data) {
result <- fivenum(data)
write.csv(x = result, file = paste0("table_", deparse(substitute(data)), ".csv"))
}
doAnalysisShort(data1)
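Another option, sketched here as an assumption about what fits the use case, is to keep the datasets in a named list and select one by name, so the same string drives both the lookup and the file name without get() or substitute(). The output goes to a temporary folder only to keep the example self-contained.

```r
# Keep the candidate datasets in a named list.
datasets <- list(data1 = 1:4, data2 = 4:8)

# Configuration: the name selects the data and labels the output file.
use_this <- "data2"

result <- fivenum(datasets[[use_this]])

# Written to tempdir() for illustration; use your own folder in practice.
out_file <- file.path(tempdir(), paste0("table_", use_this, ".csv"))
write.csv(x = result, file = out_file)
```

Switching to the other dataset is then a one-word change to use_this.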

Save elements of a list to ".Rda" file inside a function

As an example, I have a data frame with columns name, n, mean and sd. How do I extract the elements of a list and save them into a single .rda file? The file should contain the generated datasets themselves, not the list.
random.r <- function(df, filename) {
save.random <- function(name, n, mean, sd) {
rn <- rnorm(n=n, mean=mean, sd=sd)
assign(deparse(name), rn)
}
rlist <- sapply(1:nrow(df), function(x)
save.random(df$name[x], df$n[x],df$mean[x],df$sd[x],simplify = FALSE))
save(list = rlist, file = paste(filename,".Rda",sep=""), envir = .GlobalEnv)
}
Cheers

The trick is to tell R where to find the objects referred to in save. To do this, provide the list itself as an environment:
save(list=names(rlist), file=..., envir=as.environment(rlist))
Note also that list must be a vector of object names, so this should be names(rlist), not simply rlist, since the latter is a list of numeric vectors.
The following is a modification of your random.r, which works as you had intended. At the end of this post I also provide simplified code that achieves the same.
random.r <- function(df, filename) {
save.random <- function(name, n, mean, sd) {
rnorm(n=n, mean=mean, sd=sd)
}
rlist <- setNames(lapply(1:nrow(df), function(x) {
save.random(df$name[x], df$n[x], df$mean[x], df$sd[x])
}), df$name)
save(list = names(rlist), file = paste0(filename, ".rda"),
envir = as.environment(rlist))
}
The key changes above are the specification of names(rlist) as the list (vector) of element names that you want to save, and as.environment(rlist) as the environment in which you want R to search for objects with those names. Note also that I've used setNames to correctly assign elements of df$name as the names of the resulting elements of rlist.
A simplified version would be:
rlist <- setNames(mapply(rnorm, d$n, d$mean, d$sd, SIMPLIFY = FALSE), d$name) # SIMPLIFY = FALSE keeps a list even if all n are equal
save(list=names(rlist), file='~/../Desktop/foo.rda',
envir=as.environment(rlist))
where d is your data.frame. Here, mapply is a handy shortcut; it steps through the vectors d$n, d$mean and d$sd simultaneously, performing rnorm each time.
The simplified code can of course be wrapped into a function if you require, e.g.:
f <- function(x, filename) {
  rlist <- setNames(mapply(rnorm, x$n, x$mean, x$sd, SIMPLIFY = FALSE), x$name)
save(list=names(rlist), file=paste0(filename, '.rda'),
envir=as.environment(rlist))
}
d <- data.frame(name=LETTERS, n=sample(100, 26), mean=runif(26), sd=runif(26),
stringsAsFactors=FALSE)
f(d, '~/../Desktop/foo')
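To check that the file really contains the individual vectors rather than the list, it can be loaded back into a fresh environment. This is a small sketch with made-up parameters that writes to a temporary file instead of the Desktop path used above.

```r
set.seed(1)

# Made-up parameters: two datasets named "a" and "b".
d <- data.frame(name = c("a", "b"), n = c(3, 5), mean = c(0, 10), sd = c(1, 2),
                stringsAsFactors = FALSE)

# One random vector per row of d, named after d$name.
rlist <- setNames(mapply(rnorm, d$n, d$mean, d$sd, SIMPLIFY = FALSE), d$name)

# Save the list elements as separate objects.
rda_file <- tempfile(fileext = ".rda")
save(list = names(rlist), file = rda_file, envir = as.environment(rlist))

# Load into an empty environment; load() returns the restored object names.
check_env <- new.env()
loaded <- load(rda_file, envir = check_env)
```

After this, loaded holds the names "a" and "b", and check_env$a and check_env$b are the numeric vectors themselves.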
