Search and term variables by character string - r

I have a problem that may be simple, but I can't solve it.
I have two lists. List A is empty and list B has several named columns. I want to select a column of B by the name stored in a variable and put it in list A, something like this example:
A<-list()
B<-list()
VAR<-"a"
B$a<-c(1:10)
B$b<-c(10:20)
B$c<-c(20:30)
#This of course doesn't work...
A$VAR<-B$VAR

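You can extract a list entry with B[[VAR]] and add a new entry to a list the same way (A[[VAR]] <- newEntry). Wrapping the name in get(), as in A[[get("VAR")]], also works but is redundant, because get("VAR") simply returns the value of VAR:
A[[VAR]] <- B[[VAR]]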
## A list
# $a
# [1] 1 2 3 4 5 6 7 8 9 10
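The difference between the two accessors can be sketched as follows, reusing the objects from the question: `$` takes the name after it literally, while `[[ ]]` evaluates its argument first.

```r
A <- list()
B <- list(a = 1:10, b = 10:20, c = 20:30)
VAR <- "a"

# `$` looks for an element literally called "VAR", which B does not have
is.null(B$VAR)

# `[[ ]]` evaluates VAR to "a", so this copies B$a into A$a
A[[VAR]] <- B[[VAR]]
names(A)
```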

Related

renaming a specific column of each dataframe in a list as part of a loop

Basically, I would like to create multiple data frames and attach them to a list (all within a loop) after renaming their second columns. Below is the sample code.
The problem is that I want to rename the second column of each data frame to the loop variable, but I haven't managed to yet.
### create a blank list
temp_list<-vector("list",)
### create a vector with values to use in the loop
temp_years<-c("2010","2011")
### loop to generate the dataframes with name i
##add to the list above, then rename column 2
### of each dataframe to the loop variable (i).
for (i in temp_years){
temp_df<-data.frame(coltest11=runif(4),coltest12=runif(4))
temp_list[[i]]<-temp_df
names(temp_list[[i]][2])<-i
}
desired output in terms of column titles:
$`2010`
coltest11 **2010**
1 0.4781636 0.28835747
2 0.1413173 0.84415993
3 0.6564438 0.01405185
4 0.3046113 0.83951115
$`2011`
coltest11 **2011**
1 0.8050338 0.2284567
2 0.3049061 0.8308597
3 0.2920562 0.8118845
4 0.3452323 0.9222456
You can try this:
### create a blank list
temp_list<-list()
### create a vector with values to use in the loop
temp_years<-c("2010","2011")
#Loop
for (i in temp_years){
temp_df<-data.frame(coltest11=runif(4),coltest12=runif(4))
temp_list[[i]]<-temp_df
names(temp_list[[i]])[2]<-i
}
Output:
$`2010`
coltest11 2010
1 0.1481673 0.5234788
2 0.4055919 0.5426163
3 0.2353523 0.5847577
4 0.5258541 0.6792990
$`2011`
coltest11 2011
1 0.4292431 0.7717647
2 0.2160180 0.4033482
3 0.8142830 0.6944202
4 0.5900886 0.4449840
Add colnames(temp_df)[2] <- i to your loop, before the data frame is stored in the list:
temp_list<-vector("list",)
temp_years<-c("2010","2011")
for (i in temp_years){
temp_df<-data.frame(coltest11=runif(4),coltest12=runif(4))
colnames(temp_df)[2] <- i
temp_list[[i]]<-temp_df
}
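Why the position of `[2]` matters can be sketched on a small stand-alone data frame (the object `df` and the variables `after_wrong` and `after_right` are made up for illustration). Putting `[2]` inside `names()` renames a temporary one-column copy, while putting it outside replaces the second element of the names of the data frame itself.

```r
df <- data.frame(coltest11 = 1:2, coltest12 = 3:4)

# Renames a temporary one-column copy; when that copy is written back
# into column 2, the data frame keeps its existing column name
names(df[2]) <- "2010"
after_wrong <- names(df)

# Replaces the second element of the names vector of df itself
names(df)[2] <- "2010"
after_right <- names(df)

after_wrong
after_right
```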

Calling & creating new columns based on string

I have searched quite a bit and not found a question that addresses this issue, but if this has been answered, forgive me; I am still quite green when it comes to coding in general. I have a data frame with a large number of variables that I would like to combine in a loop to create new variables, based on names I've put in a second data frame. The data frame formulas holds the names of the columns to create and to read in the main data frame data:
USDb = c(1,2,3)
USDc = c(4,5,6)
EURb = c(7,8,9)
EURc = c(10,11,12)
data = data.frame(USDb, USDc, EURb, EURc)
Now I'd like to create a new column data$USDa as defined by
data$USDa = data$USDb - data$USDc
and so on for EUR and other variables. This is easy enough to do manually, but I'd like to create a loop that pulls the names from formulas, something like this:
a = c("USDa", "EURa")
b = c("USDb", "EURb")
c = c("USDc", "EURc")
formulas = data.frame(a,b,c)
for (i in 1:length(formulas[,a])){
data$formulas[i,a] = data$formulas[i,b] - data$formulas[i,c]
}
Obviously data$formulas[i,a] returns NULL, so I tried data$paste0(formulas[i,a]), which returns Error: attempt to apply non-function
How can I get these strings to be recognized as variables in this way? Thanks.
There are simpler ways to do this, but I'll stick to most of your code as a means of explanation. Your code should work so long as you edit your for loop to the following:
for (i in 1:length(formulas[,"a"])){
data[formulas[i,"a"]] = data[formulas[i,"b"]] - data[formulas[i,"c"]]
}
formulas[,a] won't work because a is already defined as the character vector c("USDa", "EURa"), so R looks for columns with those names in formulas rather than for the column "a". Use formulas[, "a"] instead if you want all rows from column "a" in the data frame formulas.
data$formulas literally searches for a column called "formulas" in the data frame data. Instead, you want to write data[formulas[i, "a"]], indexing formulas so that a single column name is passed to data.
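One of the simpler variants can be sketched with `[[ ]]` and `seq_len()`. The `stringsAsFactors = FALSE` argument is an addition here, so that the names stay character strings on R versions before 4.0 as well.

```r
data <- data.frame(USDb = c(1, 2, 3), USDc = c(4, 5, 6),
                   EURb = c(7, 8, 9), EURc = c(10, 11, 12))
formulas <- data.frame(a = c("USDa", "EURa"),
                       b = c("USDb", "EURb"),
                       c = c("USDc", "EURc"),
                       stringsAsFactors = FALSE)

# For each row of formulas: new column a = column b - column c
for (i in seq_len(nrow(formulas))) {
  data[[formulas$a[i]]] <- data[[formulas$b[i]]] - data[[formulas$c[i]]]
}
data
```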
Logic: iterate over the rows of formulas using apply (which is a for loop internally) and do the calculation each row describes, here column b minus column c as in the question:
x = apply(formulas, 1, function(x) data[[x[2]]] - data[[x[3]]])
colnames(x) = formulas$a
x
# USDa EURa
#[1,] -3 -3
#[2,] -3 -3
#[3,] -3 -3
cbind(data, x)
# USDb USDc EURb EURc USDa EURa
#1 1 4 7 10 -3 -3
#2 2 5 8 11 -3 -3
#3 3 6 9 12 -3 -3
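Another option is split with sapply. Each split element holds the b and c names of one row, in that order, so Reduce computes b minus c:
sapply(setNames(split.default(as.matrix(formulas[-1]),
row(formulas[-1])), formulas$a), function(x) Reduce(`-`, data[x]))
# USDa EURa
#[1,] -3 -3
#[2,] -3 -3
#[3,] -3 -3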

combine list elements based on element names

How can I combine this list of vectors by element names?
L1 <- list(F01=c(1,2,3,4),F02=c(10,20,30),F01=c(5,6,7,8,9),F02=c(40,50))
So as to get:
results <- list(F01=c(1,2,3,4,5,6,7,8,9),F02=c(10,20,30,40,50))
I tried to apply the solution from "merge lists by elements names", but I can't figure out how to adapt it to my situation.
sapply(unique(names(L1)), function(x) unname(unlist(L1[names(L1)==x])), simplify=FALSE)
$F01
[1] 1 2 3 4 5 6 7 8 9
$F02
[1] 10 20 30 40 50
You can achieve the same result using the map function from purrr:
library(purrr)
map(unique(names(L1)), ~ flatten_dbl(L1[names(L1) == .x])) %>%
set_names(unique(names(L1)))
The first line merges the elements with matching names, while the last line names the elements of the new list accordingly.
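A base R alternative, sketched here with `split()` (the name `combined` is made up for illustration): group the list elements by name, then flatten each group. Note that `split()` orders the groups by sorted name, whereas the `unique(names(L1))` approaches keep the order of first appearance; for this input both give F01 then F02.

```r
L1 <- list(F01 = c(1, 2, 3, 4), F02 = c(10, 20, 30),
           F01 = c(5, 6, 7, 8, 9), F02 = c(40, 50))

# Group elements that share a name, then flatten each group into one vector
combined <- lapply(split(L1, names(L1)), unlist, use.names = FALSE)
combined
```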

Change multiple dataframes in a loop

I have, for example, these three datasets (in my case, there are many more, with a lot of variables):
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))
To each data frame I want to add a variable resulting from a transformation of an existing variable in that data frame. I would like to do this in a loop. For example:
datasets <- c("data_frame1","data_frame2","data_frame3")
vars <- c("a","b","c")
for (i in datasets){
for (j in vars){
# here I need a code that create a new variable with transformed values
# I thought this would work, but it didn't...
get(i)$new_var <- log(get(i)[,j])
}
}
Do you have some valid suggestions about that?
Moreover, it would be great if the new column names (in this case new_var) could also be assigned from a character string, so I could create the new variables in another for loop nested in the other two.
I hope my explanation of the problem is not too tangled.
Thanks in advance.
You can put your dataframes in a list and use lapply to process them one by one. So no need to use a loop in this case.
For example you can do this :
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))
ll <- list(data_frame1,data_frame2,data_frame3)
lapply(ll,function(df){
df$log_a <- log(df$a) ## new column with the log a
df$tans_col <- df$a+df$b+df$c ## new column with sums of some columns or any other
## transformation
### .....
df
})
The first data frame becomes:
[[1]]
a b c log_a tans_col
1 1 3 4 0.0000000 8
2 5 6 4 1.6094379 15
3 3 1 1 1.0986123 5
4 3 5 9 1.0986123 17
5 2 5 2 0.6931472 9
I had the same need and wanted to change also the columns in my actual list of dataframes.
I found a great method here (the purrr::map2 method in the question works for dataframes with different columns), followed by
list2env(list_of_dataframes ,.GlobalEnv)
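The second part of the question, naming the new columns from a character string, can be sketched by combining `lapply` with `[[<-`. The named list `dfs` and the `log_` prefix are choices made for this illustration, and only two of the data frames are used to keep it short.

```r
dfs <- list(
  data_frame1 = data.frame(a = c(1, 5, 3, 3, 2), b = c(3, 6, 1, 5, 5), c = c(4, 4, 1, 9, 2)),
  data_frame2 = data.frame(a = c(6, 0, 9, 1, 2), b = c(2, 7, 2, 2, 1), c = c(8, 4, 1, 9, 2))
)
vars <- c("a", "b", "c")

dfs <- lapply(dfs, function(df) {
  for (j in vars) {
    # Build the new column name as a string; note that log(0) gives -Inf
    df[[paste0("log_", j)]] <- log(df[[j]])
  }
  df
})
names(dfs$data_frame1)
```

Because `lapply` returns a new list, the result has to be assigned back (here to `dfs`). With a named list like this one, `list2env(dfs, .GlobalEnv)` would then overwrite the global data frames of the same names.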

Converting a list to a "two or more objects" argument in R

I have to call a function in R that takes "2 or more objects" as an input, so the function definition is:
function(..., all = TRUE, <other named parameters>)
where ... is defined as 2 or more objects
The issue I have is that my objects are in a list, and the number of objects varies according to what I want to do. So if my list has 3 elements, for example, I would have to do:
function(list[[1]], list [[2]], list[[3]])
How can I do that generically, regardless of the number of elements in my list?
You can use do.call, which takes a list of arguments and calls the function with them. E.g. for rbind:
X <- list(A=1:3,B=4:6,C=7:9)
do.call(rbind,X)
[,1] [,2] [,3]
A 1 2 3
B 4 5 6
C 7 8 9
Mind you, if you need extra arguments, you should add them to the list as well. See e.g.:
X <- list(A=list(A1=1:2,A2=3:4),B=list(B1=5:6,B2=7:8))
do.call(c,X) # Returns a list
do.call(c,X,recursive=TRUE) # Gives an error
do.call(c,c(X,list(recursive=TRUE)))
A.A11 A.A12 A.A21 A.A22 B.B11 B.B12 B.B21 B.B22
1 2 3 4 5 6 7 8
An example would be helpful, but I'm pretty sure you're looking for do.call:
do.call(function, c(list, list(all=TRUE, <other named parameters>)))
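Since the actual function is not named in the question, here is a minimal runnable sketch of the same pattern with `paste` standing in for it and `sep` standing in for the named parameter (both are assumptions for illustration, as is the list `parts`).

```r
parts <- list(c("a", "b"), c("x", "y"), c("1", "2"))

# Append the named argument to the list of objects, then call the function;
# equivalent to paste(parts[[1]], parts[[2]], parts[[3]], sep = "-")
joined <- do.call(paste, c(parts, list(sep = "-")))
joined
```

The call works unchanged for a list of any length, which is the point of using `do.call` here.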
