I am trying to write a loop in R, but I think my syntax is not correct because it does not create the new objects. Here is a simplified example of what I am trying to do:
for (i in 1:8) {
List_i <-List
colsToGrab_i <-grep(predefinedRegex_i, colnames(List_i$table))
List_i$table <- List_i$table[,predefinedRegex_i]
}
I have created regular expressions predefinedRegex_1 to predefinedRegex_8, which grep should use for the search.
The loop creates an object called "List_i" and then fails to find "predefinedRegex_i".
I have tried putting quotes around the "i", putting $ in front of the i, and using [i], but none of these work.
Any help much appreciated. Thank you.
Using @RyanGrammel's answer below:
# Create the regular expressions for grabbing column groups 1-7
g_1 <- "DC*"
g_2 <- "BN_._X.*"
g_3 <- "BN_a*"
g_4 <- "BN_b*"
g_5 <- "BN_a_X.*"
g_6 <- "BN_b_X.*"
g_7 <- "BN_._Y.*"
for (i in 1:7) {  # only g_1 to g_7 are defined above
  # copy the table, then keep only the columns whose names match pattern g_i
  assign(x = paste("tableA_", i, sep=""), value = BigList$tableA)
  assign(x = paste("Forgrep_", i, sep=""), value = colnames(get(x = paste("tableA_", i, sep=""))))
  assign(x = paste("grab_", i, sep=""), value = grep(get(x = paste("g_", i, sep="")), get(x = paste("Forgrep_", i, sep=""))))
  assign(x = paste("tableA_", i, sep=""), value = BigList$tableA[, get(x = paste("grab_", i, sep=""))])
}
This loop is repeated for each table inside "BigList".
I found I could not extract column names from
get(x = paste("BigList_", i, "$tableA", sep=""))
or from
get(x = paste("BigList_", i, "[[2]]", sep=""))
so it was easier to extract the tables first. I will now write a loop to repack the lists.
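The two get() attempts above fail because get() looks up an object by its exact name; it does not evaluate `$` or `[[`. A minimal sketch of the difference, assuming a hypothetical list BigList_1 that holds a data frame element tableA (toy data, not from the original post):

```r
# Hypothetical toy list standing in for one of the BigList_i objects
BigList_1 <- list(tableA = data.frame(DC1 = 1:3, BN_a_X1 = 4:6))

i <- 1

# get() takes a plain object name, so fetch the list first ...
lst <- get(paste0("BigList_", i))

# ... and only then extract the element with $ or [[ ]]
cols <- colnames(lst$tableA)
same_cols <- colnames(lst[["tableA"]])

# No object is literally named "BigList_1$tableA"
found <- exists(paste0("BigList_", i, "$tableA"))

print(cols)
print(found)
```

Fetching the list by name and then subsetting it avoids having to unpack the tables into separate variables first.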
Problem
Your syntax is off: you don't seem to understand how exactly R deals with variable names.
for(i in 1:10) name_i <- 1
The above code doesn't assign name_1, name_2, ..., name_10. It assigns "name_i" over and over again.
To create a list, you call 'list()', not List.
Creating a variable List_i in a loop doesn't assign List_1, List_2, ..., List_8.
It repeatedly assigns the same object to the name 'List_i'. Think about it: if R named variables the way you tried to, it would be equally likely to name your variables L1st_1, L2st_2, ... See 'Solution' for some valid R code that does something similar.
'predefinedRegex_i' isn't interpreted as an attempt to get the variables 'predefinedRegex_1', 'predefinedRegex_2', and so on.
However, get(paste0("predefinedRegex_", i)) is interpreted in this way. Just make sure i actually has a value when using this. See below.
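To see the point about names concretely, here is a small sketch (the pattern value is made up for illustration):

```r
# The loop index is never substituted into a variable name:
# this assigns to one variable, literally called name_i, ten times.
for (i in 1:10) name_i <- i

print(name_i)                      # holds the value from the last iteration
name_1_exists <- exists("name_1")  # the loop created no name_1
print(name_1_exists)

# Building the name as a string and passing it to get() does work
predefinedRegex_1 <- "^DC"         # hypothetical pattern
i <- 1
pattern <- get(paste0("predefinedRegex_", i))
print(pattern)
```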
Solution:
In general, use this to dynamically assign variables (List_1, List_2,..)
assign(x = paste0("prefix_", i), value = i)
If i is equal to 199, this code assigns the value 199 to the variable prefix_199.
In general, use this to dynamically get the variables you assigned using the above snippet of code.
get(x = paste0("prefix_", i))
If i is equal to 199, this code gets the variable prefix_199.
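Putting the two calls together for the task in the question might look like the sketch below. The data frame `tbl`, the two patterns and the names `table_1`, `table_2` are assumptions made up for illustration. Keeping the results in one list is often easier than creating many numbered variables, so that variant is shown as well.

```r
# Hypothetical table and patterns (not from the original post)
tbl <- data.frame(DC1 = 1:2, DC2 = 3:4, BN_a = 5:6)
predefinedRegex_1 <- "^DC"
predefinedRegex_2 <- "^BN"

for (i in 1:2) {
  pattern <- get(paste0("predefinedRegex_", i))            # fetch predefinedRegex_i
  cols <- grep(pattern, colnames(tbl))                     # matching column positions
  assign(paste0("table_", i), tbl[, cols, drop = FALSE])   # create table_1, table_2
}

# Alternative: one list indexed by position instead of numbered variables
patterns <- c("^DC", "^BN")
tables <- lapply(patterns, function(p) tbl[, grep(p, colnames(tbl)), drop = FALSE])

print(colnames(table_1))
print(colnames(tables[[2]]))
```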
That should solve the crux of your problem; if you need any further help feel free to ask for clarification here, or contact me via my Twitter Feed.
Related
I have the following double loop:
indexnames = c(a, b, c, d, etc.)
# with
# length(indexnames) = 87
# class(indexnames) = "character"
# (indexnames = indexes I want to add in a column)
files = c(aname, bname, cname, dname, etc.)
# with
# length(files) = 87
# class(files) = "character"
# (files = name of files in the global environment)
Now I want to loop through the two lists and add to files[1] a column named "index" containing indexnames[1], and so on for the other elements. I implemented this the following way:
for(i in files){
for(j in indexnames){
files[i] = cbind(Index = indexnames[j], files[i])
}
}
When I run this, I get a message saying there were 50 or more warnings.
What am I doing wrong?
Appreciating any help, thanks.
You need to use get() and assign() functions to get the behavior you want.
You also don't have to name your loop variables i or j; a loop is easier to debug if its variables have more readable names. Still, let's look at the inner part of your loop.
files[i]
Given that files is a vector, you cannot select a specific element by its value this way (nor would you want to, since it is just a vector holding the names of objects). Instead, make "i" cycle through a numeric vector, 'for(i in 1:87)':
for (i in 1:87) {
  assign(files[i], `[[<-`(get(files[i]), 'index', value = indexnames[i]))
}
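A self-contained sketch of the same idea, assuming two hypothetical data frames `aname` and `bname` in the global environment (the data and index values are made up):

```r
# Hypothetical stand-ins for the 87 objects and their index labels
aname <- data.frame(value = 1:3)
bname <- data.frame(value = 4:6)
files <- c("aname", "bname")
indexnames <- c("a", "b")

# One loop is enough: files[i] and indexnames[i] belong together
for (i in seq_along(files)) {
  df <- get(files[i])          # fetch the data frame by its name
  df$index <- indexnames[i]    # add the constant index column
  assign(files[i], df)         # write it back under the same name
}

print(aname)
print(bname)
```

Splitting the step into get, modify and assign does the same work as the one-line `[[<-` call but is easier to inspect while debugging.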
I found some help in this answer:
How to use `assign()` or `get()` on specific named column of a dataframe?
I'm trying to run a loop with multiple dataframes. I'm using the gather function from tidyr and I want to use as argument the index of the loop, i, along with a word, deaths.
I've been trying:
gather(data[[i]], "year", paste("deaths", i, sep="_"), 2:ncol(data[[i]]))
However, every time I try that, it returns "Error: Must supply a symbol or a string as argument".
I read somewhere that tidyr evaluates things in a non-standard way and that the alternative is gather_, which uses standard evaluation.
However, the command
gather_(data[[i]], "year", paste("deaths", i, sep="_"), 2:ncol(data[[i]]))
returns "Error: Only strings can be converted to symbols".
However, I thought the paste command was already returning a string.
Does anyone know a fix?
Here is the full error:
"<error>
message: Only strings can be converted to symbols
class: `rlang_error`
backtrace:
-tidyr::gather_(...)
-tidyr:::gather_.data.frame(...)
-rlang::syms(gather_cols)
-rlang:::map(x, sym)
-base::lapply(.x, .f, ...)
-rlang:::FUN(X[[i]], ...)
Call `summary(rlang::last_error())` to see the full backtrace"
The full code:
require(datasus)
require(tidyr)
data_list <- list()
for(i in 1:2){
data_list[[i]] <- sim_inf10_mun(linha = "Município", coluna = "Ano do Óbito", periodo = c(1996:2016), municipio = "all",
capitulo_cid10 = i)
data_list[[i]] <- data.frame(data_list[i])
data_list[[i]] <- data_list[[i]][-1,]
data_list[[i]] <- data_list[[i]][,-ncol(data_list[[i]])]
data_list[[i]] <- gather(data_list[[i]], "ano", "deaths_01_i", 2:ncol(data_list[[i]]))
names(data_list[[i]])[1]<-"cod_mun"
data_list[[i]] <- transform(data_list[[i]], cod_mun = substr(cod_mun, 1, 6))
data_list[[i]] <- transform(data_list[[i]], ano = substr(ano, 2, 5))
}
This returns a panel dataset exactly the way I want, with a (municipality code, year) identifier on each line and a value. My problem is that the name of the value column is "deaths_01_i", which is to be expected since it is in quotation marks, whereas I wanted it to change with the loop index. That is why I tried to build it with paste.
I know I can just change the variable name by adding a line names(data_list[[i]])[3]<-paste("deaths_01",i,sep="_"), but the problem made me want to understand the code better.
Some words are in Portuguese but they are irrelevant to the problem. I also changed the range of the loop to avoid time issues.
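For reference, the backtrace above ends in rlang::syms(gather_cols), which suggests that the numeric column positions, rather than the pasted name, may be what gather_ rejected. Independently of tidyr, the renaming workaround mentioned above can be sketched in base R. The data frames here are made-up stand-ins for the datasus output:

```r
# Hypothetical long-format tables, as they might look after gather()
data_list <- list(
  data.frame(cod_mun = c("110001", "110002"), ano = c("1996", "1996"), deaths_01_i = c(3, 5)),
  data.frame(cod_mun = c("110001", "110002"), ano = c("1996", "1996"), deaths_01_i = c(1, 0))
)

for (i in seq_along(data_list)) {
  # paste() builds an ordinary string, which names<- accepts directly
  names(data_list[[i]])[3] <- paste("deaths_01", i, sep = "_")
}

print(names(data_list[[1]]))
print(names(data_list[[2]]))
```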
I have hardcoded this:
s79t5 <- read.csv("filename1.csv", header = TRUE)
s81t2 <- read.csv("filename2.csv", header = TRUE)
etc.
subsets79t5 <- subset(s79t5, Tags!='')
subsets81t2 <- subset(s81t2, Tags!='')
...
subsets100t5 <- subset(s100t5, Tags!='')
Now I need to soft-code it. I am almost there:
sessions <- c('s79t5', 's81t2', 's88t2', 's90t3', 's96t3', 's98t4', 's100t5')
for (i in 1:length(sessions)) {
jFileName <- c(as.character(sessions[i]))
j <- data.frame(jFileName)
subset <- subset(j, j$Tags!='')
assign(paste("subset", jFileName, sep = ""), data.frame(subset))
}
Just throwing an answer here to close this question. Discussion was in the comments.
You need the get function in your line: j <- data.frame(jFileName)
It should be: j <- as.data.frame(get(jFileName))
The get function looks among your existing objects for one whose name matches the character string you gave it (in this case, the value of jFileName) and returns that object. I then make sure it is a data frame with as.data.frame.
Previously you were essentially telling R to make a data frame out of a character string. With get you are now referencing your actual dataset.
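A runnable sketch of the corrected loop, using two small made-up data frames in place of the CSV files (the column `n` and all values are hypothetical):

```r
# Hypothetical session tables (the real ones come from read.csv)
s79t5 <- data.frame(Tags = c("x", "", "y"), n = 1:3, stringsAsFactors = FALSE)
s81t2 <- data.frame(Tags = c("", "z"), n = 1:2, stringsAsFactors = FALSE)
sessions <- c("s79t5", "s81t2")

for (name in sessions) {
  j <- get(name)                                         # the data frame, not its name
  assign(paste0("subset", name), subset(j, Tags != ""))  # creates subsets79t5, subsets81t2
}

print(subsets79t5)
print(subsets81t2)
```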
I am trying to print the "result" of using table function, but when I tried to use the code here, I got something very strange:
for (i in 1:4){
print (table(paste("group",i,"$", "BMI_obese",sep=""), paste("group",i,"$","A1.1", sep="")))
}
This is the result in R output:
group1$A1.1
group1$BMI_obese 1
group2$A1.1
group2$BMI_obese 1
group3$A1.1
group3$BMI_obese 1
group4$A1.1
group4$BMI_obese 1
But when I type out the statement without typing inside the loop:
table(group2$BMI_obese, group2$A1.1)
I got what I want:
1 2 3 4 5
0 51 20 9 8 0
1 37 20 15 6 4
Does anyone know which part of my for loop code is not correct or can be modified to fit my purpose of printing the loop table result?
Hi all, I now have another problem. I am trying to add an inner loop that takes the column name as an argument, because I would like to loop through multiple columns for each group's data (i.e. for group1, I would like tables of BMI_obese vs A1.1, BMI_obese vs A1.2, ..., BMI_obese vs A1.15). This is my code, but it is not working. I think it does not recognize A1.1, A1.2, ... as columns of group1, group2, group3 and group4, and treats them as strings instead. I am not sure how to fix it:
for (i in 2:4) {
for (j in c("A1.1","A1.2"))
{
print(with(get(paste0("group", i)),table(BMI_obese,j)))
}
}
I keep getting this error message:
Error in table(BMI_obese, j) : all arguments must have the same length
Okay, you are trying to construct a variable name using paste and then build a table from it. You are simply passing the name of the variable to table, not the variable object itself. For this sort of approach you want to use get():
for (i in 1:4) {
  print(with(get(paste0("group", i)), table(BMI_obese, A1.1)))
}
#example saving as a list (using lapply rather than for loop)
group1 <- data.frame(x=LETTERS[1:10], y=(1:10)[sample(10, replace=TRUE)])
group2 <- data.frame(x=LETTERS[1:10], y=(1:10)[sample(10, replace=TRUE)])
result <- lapply(1:2, function(i) with(get(paste0("group", i)), table(x, y)))
#look at first six rows of each:
head(result[[1]])
head(result[[2]])
#example illustrating fetching objects from a string name
data(mtcars)
head(with(get("mtcars"), table(disp, cyl)))
head(with(get("mtcars"), table(disp, "cyl")))
#Error in table(disp, "cyl") : all arguments must have the same length
head(with(get("mtcars"), table(disp, get("cyl"))))
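The same idea carries over to the follow-up question, where the column name arrives as a string in j. Below is a sketch with made-up group data (all values are hypothetical). It uses [[j]] to look the column up by name, which table(BMI_obese, j) does not do:

```r
# Hypothetical data for two groups
group1 <- data.frame(BMI_obese = c(0, 1, 0, 1), A1.1 = c(1, 2, 1, 2), A1.2 = c(3, 3, 4, 4))
group2 <- data.frame(BMI_obese = c(1, 1, 0, 0), A1.1 = c(1, 1, 2, 2), A1.2 = c(5, 5, 5, 6))

results <- list()
for (i in 1:2) {
  df <- get(paste0("group", i))
  for (j in c("A1.1", "A1.2")) {
    # df[[j]] is the column whose name is stored in the string j
    tab <- table(BMI_obese = df$BMI_obese, df[[j]])
    results[[paste0("group", i, "_", j)]] <- tab
    print(tab)
  }
}
```

Storing each table in a named list keeps the results available after the loop, in addition to printing them.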
You could also use a combination of eval and parse like this:
x1 <- c(sample(10, 100, replace = TRUE))
y1 <- c(sample(10, 100, replace = TRUE))
table(eval(parse(text = paste0("x", 1))),
eval(parse(text = paste0("y", 1))))
But I'd also say it is not the nicest practice to access variables that way...
You are using the wrong types. See the difference:
table(group2$BMI_obese, group2$A1.1)
and
table(paste(...),paste(...))
So what type does paste return? Certainly some string.
EDIT:
paste(...) was not meant to be syntactically correct but an abbreviation for paste("group",i,"$", "BMI_obese",sep=""), or whatever you paste together.
paste(...) returns a string. If you pass that result to table, you get a table of strings (the unexpected result that you got). What you want to do is access the variables or fields whose names are returned by your paste(...). Just add an eval(parse(text = ...)) around your paste, as Daniel said, like this:
for (i in 1:4){
print(table(eval(parse(text = paste("group",i,"$", "BMI_obese",sep=""))), eval(parse(text = paste("group",i,"$","A1.1", sep="")))))
}
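One detail worth checking when evaluating pasted text: eval() applied to a character string just returns the string, so the text has to go through parse() first, or the object can be fetched with get(). A small sketch with a made-up data frame group1:

```r
# Hypothetical data
group1 <- data.frame(BMI_obese = c(0, 1, 1), A1.1 = c(1, 1, 2))
i <- 1
expr_text <- paste("group", i, "$", "BMI_obese", sep = "")

as_string <- eval(expr_text)                     # still the string "group1$BMI_obese"
as_column <- eval(parse(text = expr_text))       # the actual column
via_get   <- get(paste0("group", i))$BMI_obese   # same column, without parsing text

print(as_string)
print(as_column)
```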
I have the following dynamic list created with the names cluster_1, cluster_2... like so:
observedUserShifts <- vector("list")
cut <- 2
for (i in 1:cut) {
assign(paste('cluster_', i, sep=''), subset(sortedTestRTUser, cluster==i))
observedUserShifts[[i]] <- mean(cluster_1$shift_length_avg)
}
Notice that I have cut=2, so the 'assign' function dynamically creates 2 objects named cluster_1 and cluster_2.
I want to refer to each of these objects within the for loop. Notice that I have hard-coded cluster_1 in the for loop (2nd line inside the for loop). How do I change this so that it is not hard-coded?
I tried:
> observedUserShifts[[i]] <- mean((paste('cluster_','k',sep='')$shift_length_avg)
+ )
Error in paste("cluster_", "k", sep = "")$shift_length_avg :
$ operator is invalid for atomic vectors
Agree this is suboptimal coding practice, but to answer the specific question, use get:
for (i in 1:cut) {
assign(paste('cluster_', i, sep=''), subset(sortedTestRTUser, cluster==i))
observedUserShifts[[i]] <-
mean( get(paste('cluster_', i, sep='') )[['shift_length_avg']] )
}
Notice that instead of using $ I chose to use [[ with a quoted column name.
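As an illustration of the list-based style that avoids assign and get altogether, the per-cluster means can also be computed without creating cluster_1, cluster_2, ... at all. This is only a sketch with made-up data standing in for sortedTestRTUser:

```r
# Hypothetical stand-in for sortedTestRTUser
sortedTestRTUser <- data.frame(
  cluster = c(1, 1, 2, 2),
  shift_length_avg = c(4, 6, 8, 10)
)

# split() returns a list with one data frame per cluster value
clusters <- split(sortedTestRTUser, sortedTestRTUser$cluster)

# One mean per cluster, kept in a list as in the original code
observedUserShifts <- lapply(clusters, function(d) mean(d[["shift_length_avg"]]))

print(observedUserShifts)
```

The list elements are named after the cluster values, so observedUserShifts[["1"]] corresponds to what the loop stores for i = 1.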