Convert entire dataframe dynamically - r

I have some dataframe allchemicals that generates n columns. I want to convert each of the columns in the dataframe so that it updates in real time. I have tried
convertedframes<-reactiveValues(cc0=0, mgl=0, ngl=0, ugl=0)
outputallconcs<-reactiveValues(chemicals=0)
observe({
convertedframes$cc0<-allchemicalscc0()
convertedframes$mgl<-allchemicals()*1
convertedframes$ngl<-allchemicals()*1000
convertedframes$ugl<-allchemicals()*1000000
})
observeEvent(input$run_button, {
req(allchemicals())
if(input$OCUnits=="c/c0"){
outputallconcs$chemicals<-convertedframes$cc0
}
if(input$OCunits=="mg/L"){
outputallconcs$chemicals<-convertedframes$mgl
}
if(input$OCunits=="ng/L"){
outputallconcs$chemicals<-convertedframes$ngl
}
if(input$OCunits=="ug/L"){
outputallconcs$chemicals<-convertedframes$ugl
}
})
But this leaves me with the error Warning: Error in if: argument is of length zero
When I do output$sum<-renderTable(outputallconcs$chemicals) I see the output is the dataframe that I want. A similar method works fine with just a single column, because I can reference the one column name; things seem more complicated with a varying number of columns. Is there an easy way to do this? I apologize that this is not a reproducible example: generating these dataframes takes hundreds of lines of code, which didn't seem necessary to share.
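One thing that may be worth checking: "argument is of length zero" is what `if` reports when its condition is empty, and the first test reads input$OCUnits while the others read input$OCunits, so one of those is probably NULL. Beyond that, multiplying a numeric data frame by a scalar converts every column at once, however many there are. Below is a minimal sketch of that idea. The helper convert_units and its factor table are hypothetical (not from the original code), and it assumes the values are stored in mg/L with the usual mg to ug to ng scaling, which differs from the multipliers in the code above. The c/c0 case is left out because it depends on allchemicalscc0().

```r
# Hypothetical helper: scale every column of a numeric data frame at once.
# Assumes the stored values are in mg/L.
convert_units <- function(df, unit) {
  factors <- c("mg/L" = 1, "ug/L" = 1e3, "ng/L" = 1e6)
  if (is.null(unit) || !unit %in% names(factors)) {
    stop("unknown unit: ", unit)
  }
  df * factors[[unit]]
}

chem <- data.frame(a = c(1, 2), b = c(0.5, 4))  # made-up data
converted <- convert_units(chem, "ug/L")
```

Inside the app this could be called as convert_units(allchemicals(), input$OCunits) within a reactive, assuming OCunits is the actual input ID.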

Related

Create list from column and filter from the resulting list

The below code allows simple filtering of a list:
#Filter to applicable codes only
ICS_List <- "QMM|QJG|QH8|QM7|QUE|QHG"
EofEMSOAs <- EofEMSOAs[grep(ICS_List, EofEMSOAs$Code),]
What I am looking to do instead is take all data from the column from another dataframe within a project and use the grep function to filter for values contained within that column - there could be hundreds so typing a manual list is not practical.
I have tried the below, but it results in the error 'argument 'pattern' has length > 1 and only the first element will be used'. It seems that using dplyr in this way does not create the same output as manually typing in a list, so I only get one result.
#To filter from required dataframe 'EofEMSOAsIMD'
EofEMSOAsCodeListOnly <- dplyr::pull(EofEMSOAsIMD, "Area Code")
EofEMSOAsFinalList <- EofEMSOAs[grep(EofEMSOAsCodeListOnly, EofEMSOAs$msoa11cd),]
Could anyone please amend the above so it does work using similar logic to the code at top of this question, namely 1. List created from column 2. Dataframe filtered for matches to that list? Thank you.
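A minimal sketch of two ways this might be done, using made-up stand-ins for the two data frames (the codes and the value column are invented for illustration). grep expects a single pattern string, so the column values can be joined with "|" to mimic the hand-typed list; if the codes should match exactly, %in% avoids regular expressions altogether.

```r
# Made-up stand-ins for the two data frames in the question
EofEMSOAsIMD <- data.frame(`Area Code` = c("E02000001", "E02000003"),
                           check.names = FALSE)
EofEMSOAs <- data.frame(msoa11cd = c("E02000001", "E02000002", "E02000003"),
                        value = 1:3)

# 1. List created from the column
codes <- EofEMSOAsIMD[["Area Code"]]

# 2a. Same logic as the manual list: one pattern joined with "|"
pattern <- paste(codes, collapse = "|")
by_grep <- EofEMSOAs[grep(pattern, EofEMSOAs$msoa11cd), ]

# 2b. Exact matching, no regular expression needed
by_in <- EofEMSOAs[EofEMSOAs$msoa11cd %in% codes, ]
```

With hundreds of codes the %in% version is probably the safer choice, since it cannot match a code that merely contains another code as a substring.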

Combining many vectors into one larger vector (in an automated way)

I have a list of identifiers as follows:
url_num <- c('85054655', '85023543', '85001177', '84988480', '84978776', '84952756', '84940316', '84916976', '84901819', '84884081', '84862066', '84848942', '84820189', '84814935', '84808144')
And from each of these I'm creating a unique variable:
for (id in url_num){
assign(paste('test_', id, sep = ""), FUNCTION GOES HERE)
}
This leaves me with my variables which are:
test_85054655, test_85023543, etc.
Each of them holds the correct output from the function (I've checked), however my next step is to combine them into one big vector which holds all of these created variables as a separate element in the vector. This is easy enough via:
c(test_85054655,test_85023543,test_85001177,test_84988480,test_84978776,test_84952756,test_84940316,test_84916976,test_84901819,test_84884081,test_84862066,test_84848942,test_84820189,test_84814935,test_84808144)
However, as I update the original 'url_num' vector with new identifiers, I'd also have to come down to the above chunk and update this too!
Surely there's a more automated way I can setup the above chunk?
Maybe some sort of concat() function in the original for-loop which just adds each created variable straight into an empty vector right then and there?
So far I've just been trying to list all the variable names and somehow get the output to be in an acceptable format to get thrown straight into the c() function.
for (id in url_num){
cat(as.name(paste('test_', id, ",", sep = "")))
}
...which results in:
test_85054655,test_85023543,test_85001177,test_84988480,test_84978776,test_84952756,test_84940316,test_84916976,test_84901819,test_84884081,test_84862066,test_84848942,test_84820189,test_84814935,test_84808144,
This is close to the output I'm looking for but because it's using the cat() function it's essentially a print statement and its output can't really get put anywhere. Not to mention I feel like this method I've attempted is wrong to begin with and there must be something simpler I'm missing.
Thanks in advance for any help you guys can give me!
Troy
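One possible approach, sketched under assumptions: instead of creating one variable per identifier with assign, the results can be kept in a named list and flattened with unlist. Here my_function is only a placeholder for the function elided in the question, and url_num is shortened for the example.

```r
url_num <- c('85054655', '85023543', '85001177')  # shortened for the example

# Placeholder for the function elided in the question
my_function <- function(id) nchar(id)

# One list element per identifier, named like the original variables
results <- lapply(url_num, my_function)
names(results) <- paste0('test_', url_num)

# Flatten into a single vector; this follows url_num automatically
combined <- unlist(results)
```

If the test_ variables already exist, mget(paste0('test_', url_num)) should collect them into the same kind of named list without rerunning the function.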

In R, I am trying to make a for loop that will cycle through variable names and perform functions on them

I have variables that are named team.1, team.2, team.3, and so forth.
First of all, I would like to know how to go through each of these and assign a data frame to each one. So team.1 would have data from one team, then team.2 would have data from a second team. I am trying to do this for about 30 teams, so instead of typing the code out 30 times, is there a way to cycle through each with a counter or something similar?
I have tried things like
vars = list(sprintf("team.x%s", 1:33))
to create my variables, but then I have no luck assigning anything to them.
Along those same lines, I would like to be able to run a function I made for cleaning and sorting the individual data sets on all of them at once.
For this, I have tried a for loop
for (j in 1:33) {
assign(paste("team.",j, sep = ""), cleaning1(paste("team.",j, sep =""), j))
}
where cleaning1 is my function, which takes two arguments and is called like this:
cleaning1(team.1, 1)
This produces the error message
Error in who[, -1] : incorrect number of dimensions
So obviously I am hoping the loop would count through my data sets, and also input my function calls and reassign my datasets with the newly cleaned data.
Is something like this possible? I am a complete newbie, so the more basic, the better.
Edit:
cleaning1:
cleaning1 = function (who, year) {
who[,-1]
who$SeasonEnd = rep(year, nrow(who))
who = (who[-nrow(who),])
who = tbl_df(who)
for (i in 1:nrow(who)) {
if ((str_sub(who$Team[i], -1)) == "*") {
who$Playoffs[i] = 1
} else {
who$Playoffs[i] = 0
}
}
who$Team = gsub("[[:punct:]]",'',who$Team)
who = who[c(27:28,2:26)]
return(who)
}
This works just fine when I run it on the data sets I have compiled myself.
To run it though, I have to go through and reassign each data set, like this:
team.1 = cleaning1(team.1, 1)
team.2 = cleaning1(team.2, 2)
So, I'm trying to find a way to automate that part of it.
I think your problem would be better solved by using a list of data frames instead of many variables containing one data frame each.
You do not say where you get your data from, so I am not sure how you would create the list. But assuming you have your data frames already stored in the variables team.1 etc., you could generate the list with
team.list <- list(team.1, team.2, ...,team.33)
where the dots stand for the variables that I did not write explicitly (you will have to do that). This is tedious, of course, and could be simplified as follows
team.list <- do.call(list,mget(paste0("team.",1:33)))
The paste0 command creates the variable names as strings, mget converts them to the actual objects, and do.call applies the list command to these objects.
Now that you have all your data in a list, it is much easier to apply a function on all of them. I am not quite sure how the year argument should be used, but from your example, I assume that it just runs from 1 to 33 (let me know, if this is not true and I'll change the code). So the following should work:
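team.list.cleaned <- mapply(cleaning1, team.list, 1:33, SIMPLIFY = FALSE)
It will go through the elements of team.list and 1:33 in parallel and apply the function cleaning1 with each pair of elements as its arguments. SIMPLIFY = FALSE stops mapply from trying to collapse the returned data frames into a matrix, so the result will again be a list containing the output of each call, i.e.,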
list( cleaning1(team.list[[1]],1), cleaning1(team.list[[2]],2), ...)
Since you are new to R, I strongly recommend that you read the help on the apply commands (apply, lapply, tapply, mapply). They are very useful, and once you get used to them, you will use them all the time...
There is probably also a simple way to directly generate the list of data frames using lapply. As an example: if the data frames are read in from files and you have the file names stored in a character vector file.names, then something along the lines of
team.list <- lapply(file.names,read.table)
might work.
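To make the list-based approach concrete, here is a small self-contained sketch. The two data frames and toy_clean are invented stand-ins for the real team data and for cleaning1, and SIMPLIFY = FALSE is passed so that mapply returns a plain list of data frames rather than trying to simplify the result.

```r
# Toy stand-ins for the team data frames and the cleaning function
team.1 <- data.frame(Team = c("A*", "B"), Wins = c(10, 5))
team.2 <- data.frame(Team = c("C", "D*"), Wins = c(7, 8))
toy_clean <- function(who, year) {
  who$SeasonEnd <- year
  who
}

# Collect the variables into a named list, then clean each with its index
team.list <- mget(paste0("team.", 1:2))
team.list.cleaned <- mapply(toy_clean, team.list, 1:2, SIMPLIFY = FALSE)
```

Individual results can then be reached by name or position, for example team.list.cleaned[["team.2"]] or team.list.cleaned[[2]].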

Files details from folder

I'd like to loop through a list of files and record detailed info about them (size, no. of rows, means of columns).
I just started with storing the info in a data frame:
df<-data.frame()
all <-list.files(pattern=".csv")
for (i in all){
file<-read.csv(i)
filas<-nrow(file)
cols<-ncol(file)
info<-c(i,filas,cols)
df<-rbind(df,i,filas,cols)
}
but it triggers an error caused by the 'i' variable, which is just a file name. What am I doing wrong?
Thanks in advance, p.
Don't use for loops. Rather, use lapply in combination with do.call to obtain your desired result. Try:
do.call(rbind,lapply(all,function(x) {y<-read.csv(x); c(file=x, filas=nrow(y), cols=ncol(y))}))
Your approach was failing because, in order for rbind to work, you need two data.frames with the same number of columns. You initially created an empty data.frame (with 0 columns), and this couldn't be row-bound to a vector of length 3 (assuming that you want a row for each file showing file name, number of rows and number of columns). If you really want to use a for loop, you should do something like:
for (i in seq_along(all)) {
  file <- read.csv(all[i])
  # one single-row data frame per file
  info <- data.frame(file = all[i], filas = nrow(file), cols = ncol(file))
  if (i == 1) df <- info else df <- rbind(df, info)
}
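As a self-contained illustration of the lapply plus do.call pattern, the sketch below writes two small CSV files to a temporary folder (the folder name, file names and contents are made up) and collects one row per file. It builds a data.frame per file instead of a c() vector, so the numeric columns stay numeric, and it adds the file size that the question also asked about.

```r
# Write two small CSV files to a temporary folder so the example is self-contained
dir <- file.path(tempdir(), "csv_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3, y = 4:6), file.path(dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 1:5), file.path(dir, "b.csv"), row.names = FALSE)

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# One single-row data frame per file, stacked at the end
info <- do.call(rbind, lapply(files, function(f) {
  d <- read.csv(f)
  data.frame(file = basename(f), size = file.size(f),
             filas = nrow(d), cols = ncol(d))
}))
```

Column means could be added in the same way, for instance by computing colMeans on the numeric columns of d inside the function, though files with different columns would then need extra handling before rbind.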

how to prompt user to remove multiple columns using the readline() in R

I am trying to write a code that allows the user to decide how many columns to remove from a table in R. The steps I am trying to perform are as follows:
1) print the column headers of the table
2) ask the user if they want to remove any columns. If the answer is yes, proceed to remove columns. This is in a loop, in case the user wants to remove multiple columns.
3) once the user is done removing columns, I want the modified table (with unwanted columns removed) to be returned so that it can be used later in script.
4) if the user does not want to remove any columns at all, they can just proceed, and the table is returned with no columns missing.
I am having 2 major issues/questions with my code as I currently have it:
1) the loop only works once (only one column is removed). The loop does run (it keeps prompting me if I keep answering "Y"), but in the end the returned object only has 1 column removed (the first column I removed when the loop began). I tried to find a way to have the user enter multiple inputs using readline, but the answers I found did not really help me.
2) If I don't want to remove any columns and I enter "no" the first time I'm prompted for input, something very strange happens: what is returned is a table with the first column removed.
I am still a newbie at coding, and I realize this may not be the best way to do what I want to do. I appreciate any advice/feedback!
my_data<-read.table(file.choose(),header=TRUE)
print(names(my_data))
for (column in my_data) {
remove_columns<-readline("Would you like to remove any columns? \n")
if(remove_columns=="Y" || remove_columns=="y") {
my_data_new<-my_data[,-!names(my_data) %in% c(readline("Which columns would you like to remove? \n"))]
} else {
return(my_data_new)
}}
I think you're looking for a while loop:
my_data <- read.table(file.choose(), header = TRUE)
print(names(my_data))
while (TRUE) {
  remove_columns <- readline("Would you like to remove any columns? \n")
  if (remove_columns == "Y" || remove_columns == "y") {
    # keep only the columns whose names do not match the user's input
    to_remove <- readline("Which columns would you like to remove? \n")
    my_data <- my_data[, !names(my_data) %in% to_remove, drop = FALSE]
  } else {
    break
  }
}
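To let the user remove several columns in one answer, the column-dropping step could be pulled into a small helper that splits a comma-separated reply. This is only a sketch: drop_columns is a hypothetical name, and the comma-separated input format is an assumption rather than something readline enforces. Keeping the logic in a function also makes it testable without an interactive session.

```r
# Hypothetical helper: drop the columns named in a comma-separated string
drop_columns <- function(df, answer) {
  to_remove <- trimws(strsplit(answer, ",")[[1]])
  unknown <- setdiff(to_remove, names(df))
  if (length(unknown) > 0) {
    warning("ignoring unknown columns: ", paste(unknown, collapse = ", "))
  }
  # logical index: TRUE for every column that should be kept
  df[, !names(df) %in% to_remove, drop = FALSE]
}

demo <- data.frame(a = 1:2, b = 3:4, c = 5:6)  # made-up data
trimmed <- drop_columns(demo, "a, c")
```

In the interactive loop this might be used as my_data <- drop_columns(my_data, readline("Which columns would you like to remove? \n")).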
