R: name of an object stored in a variable

I have this little problem in R: I loaded a dataset, modified it and stored it in the variable "mean". Then I used another variable, "dataset", which also contains this dataset:
data<-read.table()
[...modification on data...]
mean<-data
dataset<-mean
I used the variable "dataset" in some other functions of my script, etc., and at the end I want to store the result in a file named "table_mean.csv".
Of course, neither the command write.csv(tabCorr,file=paste("table_",dataset,".csv",sep=""))
nor the one with ...,quote(dataset)... does what I want.
Does anyone know how I can retrieve "mean" (as a string) from "dataset"?
(The aim is that I could reuse this script for other purposes simply by changing e.g. dataset<-variance)
Thank you in advance !

I think you are trying to do something like the following code does:
data1 <- 1:4
data2 <- 4:8
## Configuration ###
useThisDataSet <- "data2" # Change to "data1" to use other dataset.
currentDataSet <- get(x = useThisDataSet)
## Your data analysis.
result <- fivenum(currentDataSet)
## Save results.
write.csv(x = result, file = paste0("table_", useThisDataSet, ".csv"))
However, a better alternative would be to wrap your code into a function and pass in your data:
doAnalysis <- function(data, name) {
result <- fivenum(data)
write.csv(x = result, file = paste0("table_", name, ".csv"))
}
doAnalysis(data1, "data1")
If you always want to use the name of the object passed into the function as part of the filename, we can use non-standard evaluation to save some typing:
doAnalysisShort <- function(data) {
result <- fivenum(data)
write.csv(x = result, file = paste0("table_", deparse(substitute(data)), ".csv")) # deparse() turns the captured argument into a single string
}
doAnalysisShort(data1)
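Another way to get at the same goal, sketched here as an assumption rather than something from the question, is to keep the datasets in a named list, so that the name is ordinary data and never has to be recovered from a variable. The list contents, the helper output_file() and the use of tempdir() are hypothetical choices for illustration:

```r
# Hypothetical datasets, stored under the names we want in the file names.
datasets <- list(
  mean     = c(1, 2, 3, 4),
  variance = c(4, 5, 6, 7, 8)
)

# Build the output file name from a dataset name.
output_file <- function(name) paste0("table_", name, ".csv")

# Run the same analysis on every dataset; the list names are preserved.
summaries <- lapply(datasets, fivenum)

# Write one CSV per dataset (into a temporary directory for this sketch).
out_dir <- tempdir()
for (name in names(summaries)) {
  write.csv(x = summaries[[name]], file = file.path(out_dir, output_file(name)))
}
```

Switching to another dataset then means picking another list element by name, not renaming variables.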

Related

How to work with nested for loops in R with same list?

I am trying to do some R coding for my project. I have to read some .csv files from one directory into R and assign each one to a data frame named like df_subject1_activity1. I have tried nested loops, but they are not working.
For example, my directory is named "Test" and contains six .csv files:
subject1activity1.csv,
subject1activity2.csv,
subject1activity3.csv,
subject2activity1.csv,
subject2activity2.csv,
subject2activity3.csv
Now I want to write code that loads these .csv files into R and assigns the data frame names as follows:
subject1activity1 = df_subject1_activity1
subject1activity2 = df_subject1_activity2
.... and so on, using a for loop.
My expected output is:
df_subject1_activity1
df_subject1_activity2
df_subject1_activity3
df_subject2_activity1
df_subject2_activity2
df_subject2_activity3
I have tried the following code:
setwd(dirname(getActiveDocumentContext()$path))
new_path <- getwd()
new_path
data_files <- list.files(pattern=".csv") # Identify file names
data_files
for(i in 1:length(data_files)) {
for(j in 1:4){
assign(paste0("df_subj",i,"_activity",j)
read.csv2(paste0(new_path,"/",data_files[i]),sep=",",header=FALSE))
}
}
I am not getting the desired output.
I am new to R; can anyone please help?
Thanks
One solution is to use the vroom package (https://www.tidyverse.org/blog/2019/05/vroom-1-0-0/), e.g.
library(tidyverse)
library(vroom)
library(fs)
files <- fs::dir_ls(glob = "subject*.csv") # the file names in the question contain no underscore
data <- purrr::map(files, ~vroom::vroom(.x))
list2env(data, envir = .GlobalEnv)
# You can also combine all the dataframes if they have the same columns, e.g.
library(data.table)
concat <- data.table::rbindlist(data, fill = TRUE)
You are almost there. As always, if you are unsure, it is never a bad idea to code clearly using more lines.
data_files <- list.files(pattern = "\\.csv$", full.names = TRUE) # Identify file names
data_files
for (data_file in data_files) {
  ## check that the data file matches our expected pattern:
  if (!grepl("subject[0-9]activity[0-9]", basename(data_file))) {
    warning("skipping file ", basename(data_file))
    next
  }
  ## start creating the variable name from the filename
  ## remove the .csv extension
  var.name <- sub("\\.csv$", "", basename(data_file), ignore.case = TRUE)
  ## prepend 'df' and introduce underscores:
  ## gsub() looks for literal 'subject' and 'activity' and, if found, adds an underscore in front of it
  var.name <- paste0("df", gsub("(subject|activity)", "_\\1", var.name))
  ## now read the file (same sep and header arguments as in the question)
  data.from.file <- read.csv2(data_file, sep = ",", header = FALSE)
  ## and assign it to our variable name
  assign(var.name, data.from.file)
}
I don't have your files to test with, but should the above fail, you should be able to run the code line by line and easily see where it starts to go wrong.
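If you would rather not create variables with assign() at all, the same renaming can be sketched with a named list. This is only an illustration under assumptions: make_df_name() and read_activity_files() are hypothetical helpers, and the files are assumed to be comma-separated without a header row, as in the question.

```r
# Turn a path like "Test/subject1activity2.csv" into "df_subject1_activity2".
make_df_name <- function(path) {
  stem <- tools::file_path_sans_ext(basename(path))
  paste0("df", gsub("(subject|activity)", "_\\1", stem))
}

# Read every matching file in `dir` into one named list of data frames.
read_activity_files <- function(dir = ".") {
  files <- list.files(dir,
                      pattern = "^subject[0-9]+activity[0-9]+\\.csv$",
                      full.names = TRUE)
  frames <- lapply(files, read.csv, header = FALSE)
  names(frames) <- vapply(files, make_df_name, character(1), USE.NAMES = FALSE)
  frames
}
```

A call such as frames <- read_activity_files("Test") would then give access to a single data frame via frames$df_subject1_activity1 or frames[["df_subject1_activity1"]].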

Config file in a csv (or txt) format

I want to create a config file. In an R file it would look like the following:
#file:config.R
min_birthday_year <- 1920
max_birthday <- Sys.Date() %m+% months(9)
min_startdate_year <- 2010
max_startdate_year <- 2022
And in the main script I would do: source("config.R").
However, now I want to read the config data from a .csv file. Does anyone have any idea how to do this? The file could also be in .txt format.
First thing I would suggest is looking into the config package.
It allows you to specify variables in a yaml text file. I haven't used it but it seems pretty neat and looks like it may be a good solution.
If you don't want to use that, then if your csv is something like this, with var names in one column and values in the next:
min_birthday_year,1920
max_birthday,Sys.Date() %m+% months(9)
min_startdate_year,2010
max_startdate_year,2022
then you could do something like this:
# Read in the file
# assuming that names are in one column and values in another
# will create vars using vals from second col with names from first
config <- read.table("config.csv", sep = ",")
# mapply with assign, with var names in one vector and values in the other
# eval(parse()) call to evaluate value as an expression - needed to evaluate the Sys.Date() thing.
# tryCatch in case you add a string value to the csv at some point, which will throw an error in the `eval` call
mapply(
function(x, y) {
z <- tryCatch(
eval(parse(text = y)),
error = function(e) y
)
assign(x, z, inherits = TRUE)
},
config[[1]],
config[[2]]
)
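If evaluating arbitrary text from a file feels too risky, a more conservative variant can be sketched that reads the same two-column layout into a named list and only converts plain literals. The function name read_config() is a hypothetical choice, and this sketch deliberately leaves expressions such as Sys.Date() %m+% months(9) as strings instead of evaluating them:

```r
# Read "name,value" lines into a named list without evaluating any code.
read_config <- function(path) {
  raw <- read.csv(path, header = FALSE, col.names = c("key", "value"),
                  colClasses = "character", strip.white = TRUE)
  # Convert each value on its own: numbers become numeric, the rest stay strings.
  values <- lapply(raw$value, type.convert, as.is = TRUE)
  names(values) <- raw$key
  values
}
```

With cfg <- read_config("config.csv"), settings are accessed as cfg$min_birthday_year and so on, which keeps them out of the global environment.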

file.choose() analogue for objects in R

Is there an analogue to the file.choose() function in R that works with objects inside R
(elements of vectors, objects in environments, etc.)?
I just need a dialog window like the one in file.choose(), where I can choose elements of a vector, for example.
For example:
I have a data frame with 3 columns.
length(unique(df$column2))
[1] 3
Then I write
df<- filter(df, column2 %in% MyMagicFunction() )
Then I see a window where I choose the right elements =)
I guess you are working in the plain R console (i.e. not RStudio).
You can use a file chooser for that purpose after populating a directory with some fake files. The code below uses choose.files(), which is only available on Windows:
myfunction <- function(df){
split_path <- function(path) {
rev(setdiff(strsplit(path,"/|\\\\")[[1]], ""))
}
tmpdir <- file.path("c:/temp",substitute(df))
dir.create(tmpdir,showWarnings =FALSE)
for (ivar in names(df)){
cat("", file=file.path(tmpdir,ivar))
}
selvar <- choose.files(default = paste(tmpdir,"*",sep="/"), caption = "Variable",
multi = FALSE)
varname <- split_path(selvar)[1]
unlink(file.path(tmpdir,"*"))
print(varname) # to be replaced by your function exploiting df and varname such as mean(df[,varname])
}
then:
> doit <- myfunction(iris)
[1] "Sepal.Length"
As said in the comment, you have to define your own function call within myfunction.
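Base R also ships utils::select.list(), which shows a selection dialog (or a text menu) for the elements of a character vector, so no fake files are needed. Here is a minimal sketch; the wrapper choose_values() and its chooser argument are hypothetical and only exist so that the dialog can be swapped out when running non-interactively:

```r
# Let the user pick among the unique values of a vector.
# `chooser` defaults to the dialog from the utils package.
choose_values <- function(x, title = "Select values",
                          chooser = utils::select.list) {
  choices <- as.character(unique(x))
  chooser(choices, multiple = TRUE, title = title)
}

# Interactive use, assuming a data frame df with a column named column2:
# df <- df[df$column2 %in% choose_values(df$column2), ]
```

Whether a graphical window or a text menu appears depends on the platform and on the graphics argument of select.list().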

function that returns a value stored as a variable in an RData file (without global vars)

I want to get a specific variable value from a stored RData file. Oftentimes in R sample code, the data set is loaded into global variables.
I want to avoid any global variables and instead write a function that returns the value of a variable stored in an RData file. (This also makes it more explicit which variable is needed.)
How can I write a function that returns a value stored as a variable in an RData file (without using any global variables)?
(My attempt is the function getVariableFromRData below, but it is a bit cumbersome and perhaps not correct.)
xx <- pi # to ensure there is some data
save(list = ls(all = TRUE), file= "all.RData")
rm(xx)
getVariableFromRData <- function(dataName, varName) {
e <- new.env()
load(dataName, envir=e)
if(varName %in% ls(e)) {
resultVar <- e[[varName]]
return(resultVar)
} else {
stop (paste0("!! Error: varname (", varName,
") not found in RData (", dataName, ")!"))
}
}
yy <- getVariableFromRData("all.RData", "xx")
Your solution looks decent. Compare it with a function I wrote (based on an old SO question) to modify an .RData file:
resave<- function (..., list = character(), file)
{
previous <- load(file)
var.names <- c(list, as.character(substitute(list(...)))[-1L])
for (var in var.names) assign(var, get(var, envir = parent.frame()))
save(list = unique(c(previous, var.names)), file = file)
}
So strictly speaking you don't need a new environment: you can just query the output of load to see if the desired variable name is there.
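Checking against the return value of load() could look roughly like the sketch below. The function name read_rdata_var() is hypothetical, and keeping a private environment here is my own assumption: it is only there so that objects in the file cannot overwrite the function's arguments.

```r
# Return one object from an .RData file without touching the global environment.
read_rdata_var <- function(file, var_name) {
  env <- new.env(parent = emptyenv())
  # load() invisibly returns the names of all objects it restored.
  loaded <- load(file, envir = env)
  if (!var_name %in% loaded) {
    stop("variable '", var_name, "' not found in '", file, "'")
  }
  env[[var_name]]
}
```

Unlike a check with ls(e), the vector returned by load() also contains objects whose names start with a dot.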

Executing function on objects of name 'i' within for-loop in R

I am still pretty new to R and very new to for-loops and functions, but I searched quite a bit on Stack Overflow and couldn't find an answer to this question. So here we go.
I'm trying to create a script that will (1) read in multiple .csv files and (2) apply a function that strips Twitter handles from the URLs in these files and does some other things to them. I have developed scripts for these two tasks separately, so I know that most of my code works, but something goes wrong when I try to combine them. I prepare for doing so using the following code:
# specify directory for your files and replace 'file' with the first, unique part of the
# files you would like to import
mypath <- "~/Users/you/data/"
mypattern <- "file+.*csv"
# Get a list of the files
file_list <- list.files(path = mypath,
pattern = mypattern)
# List of names to be given to data frames
data_names <- str_match(file_list, "(.*?)\\.")[,2]
# Define function for preparing datasets
handlestripper <- function(data){
data$handle <- str_match(data$URL, "com/(.*?)/status")[,2]
data$rank <- c(1:500)
names(data) <- c("dateGMT", "url", "tweet", "twitterid", "rank")
data <- data[,c(4, 1:3, 5)]
}
That all works fine. The problem comes when I try to execute the function handlestripper() within the for-loop.
# Read in data
for(i in data_names){
filepath <- file.path(mypath, paste(i, ".csv", sep = ""))
assign(i, read.delim(filepath, colClasses = "character", sep = ","))
i <- handlestripper(i)
}
When I execute this code, I get the following error: Error in data$URL : $ operator is invalid for atomic vectors. I know that this means that my function is being applied to the string I called from within the vector data_names, but I don't know how to tell R that, in this last line of my for-loop, I want the function applied to the objects of name i that I just created using the assign command, rather than to i itself.
Inside your loop, you can change this:
assign(i, read.delim(filepath, colClasses = "character", sep = ","))
i <- handlestripper(i)
to
tmp <- read.delim(filepath, colClasses = "character", sep = ",")
assign(i, handlestripper(tmp))
I think you should make as few get and assign calls as you can, but there's nothing wrong with indexing your loop with names as you are doing. I do it all the time, anyway.
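To avoid get and assign entirely, the files can also be collected in a named list. The following is a sketch under assumptions, not the asker's code: read_and_prepare() is a hypothetical helper, and its prepare argument stands in for a function such as handlestripper.

```r
# Read each file, apply a preparation function, and return a named list.
read_and_prepare <- function(paths, prepare = identity) {
  frames <- lapply(paths, function(p) {
    prepare(read.csv(p, colClasses = "character"))
  })
  # Name each element after its file, without directory or extension.
  names(frames) <- tools::file_path_sans_ext(basename(paths))
  frames
}

# Possible use with the objects from the question:
# tweets <- read_and_prepare(file.path(mypath, file_list), handlestripper)
```

Each prepared data frame is then available as tweets[["<file name>"]], and further steps can loop over the list directly.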
