I am trying to create a simple GUI in R. In this GUI I want to have two combobox, 1) List of data frame from the current workspace 2) Names of variable corresponding to the data frame selected in first combobox. I am trying to use gWidgets library but I am not sure how to do it. Can anyone in this forum help me? Thanks
I want SourceVar as a combo box by taking variable names from inData argument
## list data frames in an environment
lsDF <- function(envir=.GlobalEnv) {
varNames <- ls(envir=envir)
dfs <- sapply(varNames, function(i) inherits(get(i,envir=envir),"data.frame"))
varNames[dfs]
}
testFun <- function(inData,sourceVar,targetVar,outData)
{
inData[[targetVar]] <- as.factor(inData[[sourceVar]])
assign(outData, inData, envir = .GlobalEnv)
}
lst <- list()
lst$action <- list(beginning="toFactor(",ending=")")
lst$arguments$inData <- list(type="gcombobox",lsDF())
lst$arguments$sourceVar <- list(type="gedit")
lst$arguments$targetVar <- list(type="gedit")
lst$arguments$outData <- list(type="gedit")
ggenericwidget(lst, container=gwindow("Recode to Factor"))
I wouldn't use ggenericwidget (it isn't supported in gWidgets2). Instead create a combobox along the lines of cb <- gcombobox(character(0), cont=...) then to populate it call its [<- method: e.g., cb[] <- lsDf().
Here is a pattern for you to work with:
library(gWidgets2)
data(mtcars)
testdf <- data.frame(a=1:3, b= 1:3)
## list data frame names in envir
dfnms <- names(Filter(is.data.frame, mget(ls())) )
lsNms <- function(d, envir=.GlobalEnv) names(get(d, envir=envir))
w <- gwindow("two combos")
g <- glayout(cont=w)
g[1,1] <- "Data frames:"
g[1,2] <- (dfs <- gcombobox(dfnms, cont=g))
g[2,1] <- "Variables:"
g[2,2] <- (vnames <- gcombobox(character(0), cont=g))
addHandlerChanged(dfs, handler=function(...) vnames[] <- lsNms(svalue(dfs)))
Related
I am trying to create an efficient code that opens data files containing a list, extracts one element within the list, stores it in a data frame and then deletes this object before opening the next one.
My idea is doing this using loops. Unfortunately, I am quite new in learning how to do this using loops, and don't know how write the code.
I have managed to open the data-sets using the following code:
for(i in 1995:2015){
objects = paste("C:/Users/...",i,"agg.rda", sep=" ")
load(objects)
}
The problem is that each data-set is extremely large and R cannot open all of them at once. Therefore, I am now trying to extract an element within each list called: tab_<<i value >>_agg[["A"]] (for example tab_1995_agg[["A"]]), then delete the object and iterate over each i (which are different years).
I have tried using the following code but it does not work
for(i in unique(1995:2015)){
objects = paste("C:/Users/...",i,"agg.rda", sep=" ")
load(objects)
tmp = cat("tab",i,"_agg[[\"A\"]]" , sep = "")
y <- rbind(y, tmp)
rm(list=objects)
}
I apologize for any silly mistake (or question) and greatly appreciate any help.
Here’s a possible solution using a function to rename the object you’re loading in. I got loadRData from here. The loadRData function makes this a bit more approachable because you can load in the object with a different name.
Create some data for a reproducible example.
tab2000_agg <-
list(
A = 1:5,
b = 6:10
)
tab2001_agg <-
list(
A = 1:5,
d = 6:10
)
save(tab2000_agg, file = "2000_agg.rda")
save(tab2001_agg, file = "2001_agg.rda")
rm(tab2000_agg, tab2001_agg)
Using your loop idea.
loadRData <- function(fileName){
load(fileName)
get(ls()[ls() != "fileName"])
}
y <- list()
for(i in 2000:2001){
objects <- paste("", i, "_agg.rda", sep="")
data_list <- loadRData(objects)
tmp <- data_list[["A"]]
y[[i]] <- tmp
rm(data_list)
}
y <- do.call(rbind, y)
You could also turn it into a function rather than use a loop.
getElement <- function(year){
objects <- paste0("", year, "_agg.rda")
data_list <- loadRData(objects)
tmp <- data_list[["A"]]
return(tmp)
}
y <- lapply(2000:2001, getElement)
y <- do.call(rbind, y)
Created on 2022-01-14 by the reprex package (v2.0.1)
I have a loop to read in a series of .csv files
for (i in 1:3)
{
nam <- paste0("A_tree", i)
assign(nam, read.csv(sprintf("/Users/sethparker/Documents/%d_tree_from_data.txt", i), header = FALSE))
}
This works fine and generates a series of files comparable to this example data
A_tree1 <- data.frame(cbind(c(1:5),c(1:5),c(1:5)))
A_tree2 <- data.frame(cbind(c(2:6),c(2:6),c(2:6)))
A_tree3 <- data.frame(cbind(c(3:10),c(3:10),c(3:10)))
What I want to do is add column names, and populate 2 new columns with data (month and model run). My current successful approach is to do this individually, like this:
colnames(A_tree1) <- c("GPP","NPP","LA")
A_tree1$month <- seq.int(nrow(A_tree1))
A_tree1$run <- c("1")
colnames(A_tree2) <- c("GPP","NPP","LA")
A_tree2$month <- seq.int(nrow(A_tree2))
A_tree2$run <- c("2")
colnames(A_tree3) <- c("GPP","NPP","LA")
A_tree3$month <- seq.int(nrow(A_tree3))
A_tree3$run <- c("3")
This is extremely inefficient for the number of _tree objects I have. Attempts to modify the loop with paste0() or sprintf() to incorporate these desired manipulations have resulted in Error: target of assignment expands to non-language object. I think I understand why this error is appearing based on reading other posts (Error in <my code> : target of assignment expands to non-language object). Is it possible to do what I want within my for loop? If not, how could I automate this better?
You can use lapply:
n <- index #(include here the total index)
l <- lapply(1:n, function(i) {
# this is the same of sprintf, but i prefer paste0
# importing data on each index i
r <- read.csv(
paste0("/Users/sethparker/Documents/", i, "_tree_from_data.txt"),
header = FALSE
)
# creating add columns
r$month <- seq.int(nrow(r))
r$run <- i
return(r)
})
# lapply will return a list for you, if you desire to append tables
# include a %>% operator and a bind_rows() call (dplyr package)
l %>%
bind_rows() # like this
I have been stacking this work for quite long time, tried different approaches but couldn't succeed.
what I want is to apply following 4 functions to 30 different data (data1,2,3,...data30) within for loop or whatsoever in R. These datasets have same (10) column numbers and different rows.
This is the code I wrote for first data (data1). It works well.
for(i in 1:nrow(data1)){
data1$simp <-diversity(data1$sp, "simpson")
data1$shan <-diversity(data1$sp, "shannon")
data1$E <- E(data1$sp)
data1$D <- D(data1$sp)
}
I want to apply this code for other 29 data in order not to repeat the process 29 times.
Following code what I am trying to do now. But still not right.
data.list <- list(data1, data2,data3,data4,data5)
for(i in data.list){
data2 <- NULL
i$simp <-diversity(i$sp, "simpson")
i$shan <-diversity(i$sp, "shannon")
i$E <- E(i$sp)
i$D <- D(i$sp)
data2 <- rbind(data2, i)
print(data2)
}
So I wanna ask how I can tell R to apply functions to other 29 data?
Thanks in advance!
You can do this with Map.
fun <- function(DF){
for(i in 1:nrow(DF)){
DF$simp <-diversity(DF$sp, "simpson")
DF$shan <-diversity(DF$sp, "shannon")
DF$E <- E(DF$sp)
DF$D <- D(DF$sp)
}
DF
}
result.list <- Map(fun, data.list)
Or, if you don't want to have a function fun in the .GlobalEnv, with lapply.
result.list <- lapply(data.list, function(DF){
for(i in 1:nrow(DF)){
DF$simp <-diversity(DF$sp, "simpson")
DF$shan <-diversity(DF$sp, "shannon")
DF$E <- E(DF$sp)
DF$D <- D(DF$sp)
}
DF
})
If I understand the question, it you're ultimately asking about your 'data2' variable and how to merge these all together? I think the issue you're having is that you're setting data2 <- NULL with each loop iteration. The proposed solution below moves this definition outside the loop and the call to rbind() should now append all your data frames together to return the consolidated dataset.
data.list <- list(data1, data2,data3,data4,data5) #all 29 can go here
data2 <- NULL
for(i in data.list){
i$simp <-diversity(i$sp, "simpson")
i$shan <-diversity(i$sp, "shannon")
i$E <- E(i$sp)
i$D <- D(i$sp)
data2 <- rbind(data2, i)
}
print(data2)
I am assuming that your data1, ..., dataN are files stored in a directory and you're reading them one at a time. Also they have the same header.
What you can do is to import them one at a time and then perform the operations you want, as you mentioned:
files <- list.files(directoryPath) #maybe you can grep() some specific files
for (f in files){
data <- read.table(f) #choose header, sep and so on...
for(i in 1:nrow(data)){
data$simp <-diversity(data$sp, "simpson")
data$shan <-diversity(data$sp, "shannon")
data$E <- E(data$sp)
data$D <- D(data$sp)
}
}
be careful that you must be in the working directory or you must add a path to the filename while reading the tables (i.e. paste(path, f, sep=""))
There are plenty of options, here's one using only base functions:
data.list <- list(data1, data2, data3, data4, data5)
changed_data <- lapply(data.list, function(my_data) {
my_data$simp <-diversity(my_data$sp, "simpson")
my_data$shan <-diversity(my_data$sp, "shannon")
my_data$E <- E(my_data$sp)
my_data$D <- D(my_data$sp)
my_data})
I have a for loop that loops through a list of urls,
url_list <- c('http://www.irs.gov/pub/irs-soi/04in21id.xls',
'http://www.irs.gov/pub/irs-soi/05in21id.xls',
'http://www.irs.gov/pub/irs-soi/06in21id.xls',
'http://www.irs.gov/pub/irs-soi/07in21id.xls',
'http://www.irs.gov/pub/irs-soi/08in21id.xls',
'http://www.irs.gov/pub/irs-soi/09in21id.xls',
'http://www.irs.gov/pub/irs-soi/10in21id.xls',
'http://www.irs.gov/pub/irs-soi/11in21id.xls',
'http://www.irs.gov/pub/irs-soi/12in21id.xls',
'http://www.irs.gov/pub/irs-soi/13in21id.xls',
'http://www.irs.gov/pub/irs-soi/14in21id.xls',
'http://www.irs.gov/pub/irs-soi/15in21id.xls')
dowloads an excel file from each one assigns it to a dataframe and performs a set of data cleaning operations on it.
library(gdata)
for (url in url_list){
test <- read.xls(url)
cols <- c(1,4:5,97:98)
test <- test[-(1:8),cols]
test <- test[1:22,]
test <- test[-4,]
test$Income <-test$Table.2.1...Returns.with.Itemized.Deductions..Sources.of.Income..Adjustments..Itemized.Deductions.by.Type..Exemptions..and.Tax..Items..by.Size.of.Adjusted.Gross.Income..Tax.Year.2015..Filing.Year.2016.
test$Total_returns <- test$X.2
test$return_dollars <- test$X.3
test$charitable_deductions <- test$X.95
test$charitable_deduction_dollars <- test$X.96
test[1:5] <- NULL
}
My problem is that the loop simply writes over the same dataframe for each iteration through the loop. How can I have it assign each iteration through the loop to a data frame with a different name?
Use assign. This question is a duplicate of this post: Change variable name in for loop using R
For your particular case, you can do something like the following:
for (i in 1:length(url_list)){
url = url_list[i]
test <- read.xls(url)
cols <- c(1,4:5,97:98)
test <- test[-(1:8),cols]
test <- test[1:22,]
test <- test[-4,]
test$Income <-test$Table.2.1...Returns.with.Itemized.Deductions..Sources.of.Income..Adjustments..Itemized.Deductions.by.Type..Exemptions..and.Tax..Items..by.Size.of.Adjusted.Gross.Income..Tax.Year.2015..Filing.Year.2016.
test$Total_returns <- test$X.2
test$return_dollars <- test$X.3
test$charitable_deductions <- test$X.95
test$charitable_deduction_dollars <- test$X.96
test[1:5] <- NULL
assign(paste("test", i, sep=""), test)
}
You could write to a list:
result_list <- list()
for (i_url in 1:length(url_list)){
url <- url_list[i_url]
...
result_list[[i_url]] <- test
}
You can also name the list
names(result_list) <- c("df1","df2","df3",...)
Here's another approach with lapply instead of for loops which will write all resulting data.frames as separate list items which can then be re-named (if needed).
url_list <- c('http://www.irs.gov/pub/irs-soi/04in21id.xls',
...
'http://www.irs.gov/pub/irs-soi/15in21id.xls')
readURLFunc <- function(z){
test <- readxl::read_xls(z)
...
test[1:5] <- NULL
return(test)}
data_list <- lapply(url_list, readURLFunc)
I try to solve a problem from a question I have previously posted looping inside list in r
Is there a way to get the name of a dataframe that is on a list of dataframes?
I have listed a serie of dataframes and to each dataframe I want to apply myfunction. But I do not know how to get the name of each dataframe in order to use it on nameofprocesseddf of myfunction.
Here is the way I get the list of my dataframes and the code I got until now. Any suggestion how I can make this work?
library(missForest)
library(dplyr)
myfunction <- function (originaldf, proceseddf, nonproceseddf, nameofprocesseddf=character){
NRMSE <- nrmse(proceseddf, nonproceseddf, originaldf)
comment(nameofprocesseddf) <- nameofprocesseddf
results <- as.data.frame(list(comment(nameofprocesseddf), NRMSE))
names(results) <- c("Dataset", "NRMSE")
return(results)
}
a <- data.frame(value = rnorm(100), cat = c(rep(1,50), rep(2,50)))
da1 <- data.frame(value = rnorm(100,4), cat2 = c(rep(2,50), rep(3,50)))
dataframes <- dir(pattern = ".txt")
list_dataframes <- llply(dataframes, read.table, header = T, dec=".", sep=",")
n <- length(dataframes)
# Here is where I do not know how to get the name of the `i` dataframe
for (i in 1:n){
modified_list <- llply(list_dataframes, myfunction, originaldf = a, nonproceseddf = da1, proceseddf = list_dataframes[i], nameof processeddf= names(list_dataframes[i]))
write.table(file = sprintf("myfile/%s_NRMSE20%02d.txt", dataframes[i]), modified_list[[i]], row.names = F, sep=",")
}
as a matter of fact, the name of a data frame is not an attribute of the data frame. It's just an expression used to call the object. Hence the name of the data frame is indeed 'list_dataframes[i]'.
Since I assume you want to name your data frame as the text file is named without the extension, I propose you use something like (it require the library stringr) :
nameofprocesseddf = substr(dataframes[i],start = 1,stop = str_length(dataframes[i])-4)