Dynamic Variables in R

I am using RStudio (ver. 3.0.2) and am new to R.
For my task, I would like to create the variables below in an R script and assign each a value.
AA_1_11 <- 1
AA_2_22 <- 2
AA_3_33 <- 3
AA_4_44 <- 4
BB_1_11 <- 5
BB_2_22 <- 6
BB_3_33 <- 7
BB_4_44 <- 8
Rather than statically writing the 8 lines above, I intend to generate them dynamically so that the code stays useful when the variables expand in future, i.e. AA, BB could later extend to CC, DD, etc.
I tried the code below:
char_list <- c("AA", "BB")
num_list <- c("1_11", "2_22", "3_33", "4_44")
i=1
char_num_array <-vector()
for (char in char_list) {
for (num in num_list) {
char_num_array[i] <- paste(char,num, sep = '_')
i <- i + 1
}
}
(char_num_array)
for (j in 1:length(char_num_array)){
char_num_array[j] <- j
}
Though I can create all the variable names as strings, I have no idea why the actual variables are not created in RStudio, as shown in the picture below.
Hope you can guide me on this.
Update:
Thanks to all for your help. I really wanted to make separate variables so I can monitor the changes in their values in RStudio, as the variables will continually be used for other manipulations, i.e. fitting, refitting, etc.
Thanks to JDB. I have modified the code as below, where i could be replaced with another mathematical expression.
char_list <- c("AA", "BB")
num_list <- c("1_11", "2_22", "3_33", "4_44")
i <- 1
for (char in char_list) {
for (num in num_list) {
assign(paste(char,num, sep = '_'), i)
i <- i + 1
}
}

You don't need a loop for this. The preferred "R way" to do this type of operation is to keep all these variables in a list. Along with keeping the global environment "clean", this makes it easier to refer to the variables, and also makes it easier to perform operations on them all at once later.
For example, for this operation you can do
char_list <- c("AA", "BB")
num_list <- c("1_11", "2_22", "3_33", "4_44")
x <- setNames(
as.list(1:8),
paste(rep(char_list, each=4), num_list, sep="_")
)
Now the variables you want are all stored in x
names(x)
# [1] "AA_1_11" "AA_2_22" "AA_3_33" "AA_4_44" "BB_1_11"
# [6] "BB_2_22" "BB_3_33" "BB_4_44"
and can be accessed by name with $, by name or number with [ and [[, with with(), etc...
x$AA_1_11
# [1] 1
x[[5]]
# [1] 5
with(x, c(AA_1_11, BB_4_44))
# [1] 1 8
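Since the question mentions that AA, BB may later grow to CC, DD, here is a minimal sketch of the same list approach without the hard-coded 4 and 1:8. The extra "CC" entry and the doubling step are only illustrative assumptions, not part of the original answer.

```r
char_list <- c("AA", "BB", "CC")   # "CC" is a hypothetical future extension
num_list  <- c("1_11", "2_22", "3_33", "4_44")

# Build every name from the two vectors; nothing depends on their lengths
var_names <- paste(rep(char_list, each = length(num_list)), num_list, sep = "_")

# One value per name, numbered in order as in the question
x <- setNames(as.list(seq_along(var_names)), var_names)

# Operate on all "variables" at once, e.g. double each value
doubled <- lapply(x, function(v) v * 2)
```

Adding another prefix to char_list then only changes the first line.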

At no point does your code tell R to create those variables. Your first loop DOES succeed in creating a character vector of the names you desire, but it never assigns values to variables with those names. Your second loop only overwrites the elements of char_num_array itself with the numbers 1 to 8 (stored as strings).
R is pretty good at dynamic variable creation, however. Why do you want to assign the same value to a bunch of different variables? Could you create a data frame, for example, with these variables?
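# run this after the first loop only, while char_num_array still holds the names
array <- data.frame(variables = numeric(length(char_num_array)))
rownames(array) <- char_num_array
array
#         variables
# AA_1_11         0
# AA_2_22         0
# AA_3_33         0
# AA_4_44         0
# BB_1_11         0
# BB_2_22         0
# BB_3_33         0
# BB_4_44         0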
If you really want to create these as separate variables, you can use assign():
for(item in char_num_array) assign(item, 1)
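If separate variables are wanted but the values already live in a named list, base R's list2env() can create them in one call. The sketch below is only an illustration: it uses a scratch environment (the name env is an assumption) so it does not touch your workspace; passing envir = globalenv() instead should make the variables appear in the global environment.

```r
# Named list of values, built as in the list-based answer
vals <- setNames(
  as.list(1:8),
  paste(rep(c("AA", "BB"), each = 4), c("1_11", "2_22", "3_33", "4_44"), sep = "_")
)

# Scratch environment so the sketch does not clutter the global environment
env <- new.env()

# Create one variable per list element, named after the element
list2env(vals, envir = env)

ls(env)                        # names of the variables that were created
get("BB_1_11", envir = env)    # read one of them back by name
```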

Related

Loop Changing to Matrix then Running tests

I have a dataframe with ~9000 rows of human coded data in it, two coders per item so about 4500 unique pairs. I want to break the dataset into each of these pairs, so ~4500 dataframes, run a kripp.alpha on the scores that were assigned, and then save those into a coder sheet I have made. I cannot get the loop to work to do this.
I can get it to work individually, using this:
example.m <- as.matrix(example.m)
s <- kripp.alpha(example.m)
example$alpha <- s$value
However, when trying a loop I am getting either "Error in get(v) : object 'NA' not found" when running this:
for (i in items) {
v <- i
v <- v[c("V1","V2")]
v <- assign(v, as.matrix(get(v)))
s <- kripp.alpha(v)
i$alpha <- s$value
}
Or I am getting "In i$alpha <- s$value : Coercing LHS to a list" when running:
for (i in items) {
i.m <- i[c("V1","V2")]
i.m <- as.matrix(i.m)
s <- kripp.alpha(i.m)
i$alpha <- s$value
}
Here is an example set of data. Items is a list of individual dataframes.
l <- as.data.frame(matrix(c(4,3,3,3,1,1,3,3,3,3,1,1),nrow=2))
t <- as.data.frame(matrix(c(4,3,4,3,1,1,3,3,1,3,1,1),nrow=2))
items <- c("l","t")
I am sure this is a basic question, but what I want is for each file, i, to add a column with the alpha score at the end. Thanks!
Your problem is that items holds the names of the data frames as strings, not the data frames themselves, so inside the loop i is just "l" or "t". You'd need to get() each object by name (and assign() the result back) to make your current approach work.
Here's another solution
library("irr") # For kripp.alpha
# Produce the data
l <- as.data.frame(matrix(c(4,3,3,3,1,1,3,3,3,3,1,1),nrow=2))
t <- as.data.frame(matrix(c(4,3,4,3,1,1,3,3,1,3,1,1),nrow=2))
# Collect the data as a list right away
items <- list(l, t)
Now you can sapply() directly over the elements in the list.
sapply(items, function(v) {
kripp.alpha(as.matrix(v[c("V1","V2")]))$value
})
which produces
[1] 0.0 -0.5
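The question also asks for the score to be stored as a new column in each data frame. A minimal sketch of that step follows; add_score and score_fun are hypothetical names, and mean() is only a stand-in scorer so the example runs without the irr package. With irr loaded, something like function(m) kripp.alpha(m)$value could presumably be passed instead.

```r
# Example data from the question
l <- as.data.frame(matrix(c(4,3,3,3,1,1,3,3,3,3,1,1), nrow = 2))
t <- as.data.frame(matrix(c(4,3,4,3,1,1,3,3,1,3,1,1), nrow = 2))

# A named list keeps track of which result belongs to which data frame
items <- list(l = l, t = t)

# Hypothetical helper: score columns V1 and V2, store the result in a new column
add_score <- function(df, score_fun) {
  df$alpha <- score_fun(as.matrix(df[c("V1", "V2")]))
  df
}

# mean() is a stand-in scorer, NOT Krippendorff's alpha
items <- lapply(items, add_score, score_fun = function(m) mean(m))
```

Each element of items is then the original data frame plus an alpha column.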

How to batch process data frames with different dimensions but the same name pattern in R

In my R environment, I already have some variables, named:
id_01_r
id_02_l
id_05_l
id_06_r
id_07_l
id_09_1
id_11_l
So the pattern is id_ followed by two digits, then _ and either r or l.
Each of them is a data frame, but with a different dim() output.
There are also other variables in the environment, so first I need to extract these data frames. For this, I was going to use:
> a <- list(ls()[grep("id*",ls())]) # a small sample that just matches id*, I know
But this puts them all into one element, so I don't think it's a good way:
> length(a)
[1] 1
I know how to read them in, as below, but I'm confused about how to extract them and apply the same processing to each.
i_set <- Sys.glob(paths='mypath/////id*.txt')
for (i in i_set) {
assign(substring(i, startx, endx),read.table(file=i,header=F))
}
Here, the key point is that I want to run the same series of data-processing steps on each of these data frames. Given this, what can I do instead of handling them one by one?
Thanks for your kind consideration.
Here is an example:
id_01_r <- iris
id_02_l <- mtcars
foo <- 42
vars <- grep("^id_\\d{2}_[rl]$", ls(), value = TRUE)
# [1] "id_01_r" "id_02_l"
process_data <- function(df) {
dim(df)
}
processed_data <- lapply(
mget(vars),
process_data
)
# $id_01_r
# [1] 150 5
#
# $id_02_l
# [1] 32 11
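An alternative is to skip assign() at read time and load the files straight into a named list. The sketch below is only an illustration under stated assumptions: it writes two small stand-in files to a temporary directory (the real path in the question is not known), and the names tmp, files, and frames are hypothetical.

```r
# Create a scratch directory with two stand-in files named like the question's
tmp <- tempfile("frames")
dir.create(tmp)
write.table(head(iris, 3), file.path(tmp, "id_01_r.txt"),
            row.names = FALSE, col.names = FALSE)
write.table(head(mtcars, 2), file.path(tmp, "id_02_l.txt"),
            row.names = FALSE, col.names = FALSE)

# Find the files and read each one into a list element
files  <- Sys.glob(file.path(tmp, "id_*.txt"))
frames <- lapply(files, read.table, header = FALSE)

# Name each element after its file, minus directory and extension
names(frames) <- tools::file_path_sans_ext(basename(files))

# Apply the same processing to every data frame
dims <- lapply(frames, dim)
```

The list can then be processed with lapply() exactly like the mget() result above.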

How would you write this using apply family of functions in R? Should you?

Here is my R Script that works just fine:
perc.rank <- function(x) trunc(rank(x)) / length(x) * 100.0
library(dplyr)
setwd("~/R/xyz")
datFm <- read.csv("yellow_point_02.csv")
datFm <- filter(datFm, HRA_ClassHRA_Final != -9999)
quant_cols <- c("CL_GammaRay_Despiked_Spline_MLR", "CT_Density_Despiked_Spline_FinalMerged",
"HRA_PC_1HRA_Final", "HRA_PC_2HRA_Final","HRA_PC_3HRA_Final",
"SRES_IMGCAL_SHIFT2VL_Slab_SHIFT2CL_DT", "Ultrasonic_DT_Despiked_Spline_MLR")
# add an extra column to datFm to store the quantile value
for (column_name in quant_cols) {
datFm[paste(column_name, "quantile", sep = "_")] <- NA
}
# initialize an empty dataframe with the new column names appended
newDatFm <- datFm[0,]
# get the unique values for the hra classes
hraClassNumV <- sort(unique(datFm$HRA_ClassHRA_Final))
# loop through the vector and create currDatFm and append it to newDatFm
for (i in hraClassNumV) {
currDatFm <- filter(datFm, HRA_ClassHRA_Final == i)
for (column_name in quant_cols) {
currDatFm <- within(currDatFm,
{
CL_GammaRay_Despiked_Spline_MLR_quantile <- perc.rank(currDatFm$CL_GammaRay_Despiked_Spline_MLR)
CT_Density_Despiked_Spline_FinalMerged_quantile <- perc.rank(currDatFm$CT_Density_Despiked_Spline_FinalMerged)
HRA_PC_1HRA_Final_quantile <- perc.rank(currDatFm$HRA_PC_1HRA_Final)
HRA_PC_2HRA_Final_quantile <- perc.rank(currDatFm$HRA_PC_2HRA_Final)
HRA_PC_3HRA_Final_quantile <- perc.rank(currDatFm$HRA_PC_3HRA_Final)
SRES_IMGCAL_SHIFT2VL_Slab_SHIFT2CL_DT_quantile <- perc.rank(currDatFm$SRES_IMGCAL_SHIFT2VL_Slab_SHIFT2CL_DT)
Ultrasonic_DT_Despiked_Spline_MLR_quantile <- perc.rank(currDatFm$Ultrasonic_DT_Despiked_Spline_MLR)
}
)
}
newDatFm <- rbind(newDatFm, currDatFm)
}
newDatFm <- newDatFm[order(newDatFm$Core_Depth),]
# head(newDatFm, 10)
write.csv(newDatFm, file = "Ricardo_quantiles.csv")
I have a few questions though. Every R book or video that I have read or watched recommends using the 'apply' family of language constructs over the classic 'for' loop, stating that apply is much faster.
So the first question is: how would you write it using apply (or tapply or some other apply)?
Second, is this really true though that apply is much faster than for? The csv file 'yellow_point_02.csv' has approx. 2500 rows. This script runs almost instantly on my Macbook Pro which has 16 Gig of memory.
Third, see the 'quant_cols' vector? I created it so that I could write a generic loop (for column_name in quant_cols) ... but I could not make it work. So I hard-coded the column names suffixed with '_quantile' and called 'perc.rank' many times. Is there a way this could be made dynamic? I tried the 'paste' approach that I have in my script, but that did not work.
On the positive side though, R seems awesome in its ability to cut through the 'Data Wrangling' tasks with very few statements.
Thanks for your time.
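On the third question, one possible way to make the column names dynamic is to index the data frame with [[ and a pasted name, and to let base R's ave() apply perc.rank within each class. This is a sketch under assumptions, not a drop-in replacement: the small data frame and the column names a and b are stand-ins for the real CSV and its columns.

```r
# Percentile rank helper, copied from the question
perc.rank <- function(x) trunc(rank(x)) / length(x) * 100.0

# Hypothetical stand-in data; the real script reads yellow_point_02.csv
datFm <- data.frame(
  HRA_ClassHRA_Final = c(1, 1, 2, 2),
  a = c(10, 20, 5, 1),
  b = c(3, 2, 8, 9)
)
quant_cols <- c("a", "b")  # stand-ins for the real column names

for (column_name in quant_cols) {
  new_name <- paste(column_name, "quantile", sep = "_")
  # ave() applies perc.rank within each class and keeps the original row order
  datFm[[new_name]] <- ave(datFm[[column_name]],
                           datFm$HRA_ClassHRA_Final,
                           FUN = perc.rank)
}
```

Because ave() returns values in the original row order, the per-class filter() and rbind() steps should no longer be needed in this form.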

How to add a value to an existing variable from inside a loop?

I want to add a computed value to an existing vector from within a loop, where the vector is referenced by name inside the loop. That is, I'm looking for a function similar to assign() that lets me add values to existing variables rather than create new ones.
Example:
Say I have 3 variables:
sp=3
for(i in 1:sp){
name<-paste("sp",i,sep="")
assign(name,rnorm(5))
}
And now I want to access the last value in each of the variables, double it, and append the result to the vector:
for(i in 1:sp){
name<-paste("sp",i,sep="")
name[6]<-name[5]*2
}
The problem here is that "name" is a string. How can R identify it as a variable name and access it?
What you are asking for is something like this:
get(name)
In your code it would look like this:
v <- 1:10
var <- "v"
for (i in v){
tmp <- get(var)       # fetch the vector by its name
tmp[6] <- tmp[5]*2    # modify the copy
assign(var, tmp)      # write it back under the same name
}
v
# [1] 1 2 3 4 5 10 7 8 9 10
Does that help you in any way?
However, I agree with the other answer, that lists and the lapply/sapply-functions are better suited!
This is how you can do this with a list:
sp=3
mylist <- vector(mode = "list", length = sp) #initialize a list
names(mylist) <- paste0("sp",seq_len(sp)) #set the names
for(i in 1:sp){
mylist[[i]] <- rnorm(5)
}
for(i in 1:sp){
mylist[[i]] <- c(mylist[[i]], mylist[[i]][5] * 2)
}
mylist
#$sp1
#[1] 0.6974563 0.7714190 1.1980534 0.6011610 -1.5884306 -3.1768611
#
#$sp2
#[1] -0.2276942 0.2982770 0.5504381 -0.2096708 -1.9199551 -3.8399102
#
#$sp3
#[1] 0.235280995 0.276813498 0.002567075 -0.774551774 0.766898045 1.533796089
You can then access the list elements as described in help("["), i.e., mylist$sp1, mylist[["sp1"]], etc.
Of course, this is still very inefficient code and it could be improved a lot. E.g., since all three variables are of same type and length, they really should be combined into a matrix, which could be filled with one call to rnorm and which would also allow doing the second operation with vectorized operations.
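The matrix variant mentioned above might look like the following sketch. The seed and the name m are assumptions added only for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
sp <- 3

# One call to rnorm fills all three "variables", one per column
m <- matrix(rnorm(5 * sp), nrow = 5,
            dimnames = list(NULL, paste0("sp", seq_len(sp))))

# Vectorized second step: double row 5 of every column and append it as row 6
m <- rbind(m, m[5, ] * 2)
```

A single column is then available as m[, "sp1"].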
@Roland is absolutely right and you absolutely should use a list for this type of problem. It's cleaner and easier to work with. Here's another way of working with what you have (it can be easily generalised):
sp <- replicate(3, rnorm(5), simplify=FALSE)
names(sp) <- paste0("sp", 1:3)
sp
#$sp1
#[1] -0.3723205 1.2199743 0.1226524 0.7287469 -0.8670466
#
#$sp2
#[1] -0.5458811 -0.3276503 -1.3031100 1.3064743 -0.7533023
#
#$sp3
#[1] 1.2683564 0.9419726 -0.5925012 -1.2034788 -0.6613149
newsp <- lapply(sp, function(x){x[6] <- x[5]*2; x})
newsp
#$sp1
#[1] -0.3723205 1.2199743 0.1226524 0.7287469 -0.8670466 -1.7340933
#
#$sp2
#[1] -0.5458811 -0.3276503 -1.3031100 1.3064743 -0.7533023 -1.5066046
#
#$sp3
#[1] 1.2683564 0.9419726 -0.5925012 -1.2034788 -0.6613149 -1.3226297
EDIT: If you are truly, sincerely dedicated to doing this despite being recommended otherwise, you can do it this way (with sp = 3 and the separate variables sp1, sp2, sp3 from the question):
for(i in 1:sp){
name<-paste("sp",i,sep="")
assign(name, `[<-`(get(name), 6, `[`(get(name), 5) * 2))
}

Adding data frames as list elements (using for loop)

I have in my environment a series of data frames called EOG. There is one for each year between 2006 and 2012. Like, EOG2006, EOG2007...EOG2012. I would like to add them as elements of a list.
First, I am trying to know if this is possible. I read the official R guide and a couple of R programming manuals but I didn't find explicit examples about that.
Second, I would like to do this using a for loop. Unfortunately, the code I used to do the job is wrong and I am going crazy trying to fix it.
for (j in 2006:2012){
z<-j
sEOG<-paste("EOG", z, sep="")
dEOG<-get(paste("EOG", z, sep=""))
lsEOG<-list()
lsEOG[[sEOG]]<-dEOG
}
This returns a list with one single element. Where is the mistake?
You keep reinitializing the list inside the loop. You need to move lsEOG<-list() outside the for loop.
lsEOG<-list()
for (j in 2006:2012){
z <- j
sEOG <- paste("EOG", z, sep="")
dEOG <- get(paste("EOG", z, sep=""))
lsEOG[[sEOG]] <-dEOG
}
Also, you can use j directly in the paste functions:
sEOG <- paste("EOG", j, sep="")
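For reference, base R's mget() can collect the objects by name without an explicit loop. This is a minimal sketch: the one-column data frames created in its first line are hypothetical stand-ins for the real EOG2006 to EOG2012.

```r
# Stand-in data frames, only so the sketch runs on its own
for (yr in 2006:2012) assign(paste0("EOG", yr), data.frame(year = yr))

# mget() looks up each name and returns a list named after the objects
lsEOG <- mget(paste0("EOG", 2006:2012))

names(lsEOG)
```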
I had the same question, but felt that the OP's initial code was a bit opaque for R beginners. So, here is perhaps a bit clearer example of how to create data frames in a loop and add them to a list which I just now figured out by playing around in the R shell:
> dfList <- list() ## create empty list
>
> for ( i in 1:5 ) {
+ x <- rnorm( 4 )
+ y <- sin( x )
+ dfList[[i]] <- data.frame( x, y ) ## create and add new data frame
+ }
>
> length( dfList ) ## 5 data frames in list
[1] 5
>
> dfList[[1]] ## print 1st data frame
x y
1 -0.3782376 -0.3692832
2 -1.3581489 -0.9774756
3 1.2175467 0.9382535
4 -0.7544750 -0.6849062
>
> dfList[[2]] ## print 2nd data frame
x y
1 -0.1211670 -0.1208707
2 -1.5318212 -0.9992406
3 0.8790863 0.7701564
4 1.4014124 0.9856888
>
> dfList[[2]][4,2] ## in 2nd data frame, print element in row 4 column 2
[1] 0.9856888
>
For R beginners like me, note that double brackets are required to access the ith data frame. Basically, double brackets on a list extract the element itself, while single brackets return a sub-list containing that element.
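A small sketch of how the two kinds of brackets behave on a list; the names lst, sub_list, and element are made up for the illustration.

```r
lst <- list(a = data.frame(x = 1:2), b = 1:3)

sub_list <- lst[1]     # single brackets: still a list, of length 1
element  <- lst[[1]]   # double brackets: the data frame itself

class(sub_list)
class(element)
```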
If the data frames are saved as objects, you can find them with apropos("^EOG", ignore.case=FALSE) and then store them in the list with a loop:
list.EOG <- apropos("^EOG", ignore.case=FALSE) # find objects whose names start with EOG (case sensitive)
lsEOG <- list() # create the empty list to fill
for (j in seq_along(list.EOG)){
lsEOG[[j]] <- get(list.EOG[j]) # add each data frame as one element of the list
}
To add the name of each one to the list, you can use:
names(lsEOG) <- list.EOG

Resources