R assign as dataframe and add to list - r

I'm looping through a vector of prefixes. In the loop I assign a dataframe for each prefix, and I want to add each one to a list.
This is the code I have. It initializes the dataframes and they get the correct names, but how can I add them to the list?
prefix = c("green", "red", "orange")
diff_list = list()
for (i in 1:length(prefix)) {
tmp_name = paste(prefix[i], "_diff_tbl", sep = "")
assign(tmp_name, data.frame())
diff_list[prefix[i]] = ???
}

I am not sure whether the code below achieves your goal:
diff_list <- setNames(lapply(prefix,function(x) data.frame()),paste(prefix,"_diff_tbl", sep = ""))
and
list2env(setNames(lapply(prefix,function(x) data.frame()),paste(prefix,"_diff_tbl", sep = "")),envir = .GlobalEnv)

You can use rep() or replicate() to copy an object many times and store them into a list.
## option 1
diff_list <- setNames(rep(list(data.frame()), length(prefix)), paste0(prefix, "_diff_tbl"))
and
## option 2 (simplify = FALSE keeps the result a plain list)
diff_list <- setNames(replicate(length(prefix), data.frame(), simplify = FALSE), paste0(prefix, "_diff_tbl"))
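If you would rather keep the original loop, a minimal sketch is to write straight into the list and skip assign() altogether. The loop variable p is just an illustrative name; the double brackets matter because they store the whole dataframe as one list element.

```r
prefix <- c("green", "red", "orange")
diff_list <- list()

for (p in prefix) {
  # build the element name, e.g. "green_diff_tbl"
  tmp_name <- paste0(p, "_diff_tbl")
  # [[ ]] stores the dataframe as a single, named list element
  diff_list[[tmp_name]] <- data.frame()
}

names(diff_list)
```

Each dataframe can then be reached as diff_list[["green_diff_tbl"]] or diff_list$green_diff_tbl.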

Related

Using lapply with gsub to replace word in dataframe using another dataframe as 'dictionary'

I have a dataframe called data in which I want to replace some words in specific columns, A and B.
I have a second dataframe called dict that plays the role of a dictionary/hash, containing the words to look for and the values to replace them with.
I think it could be done with purrr's map(), but I want to use apply. It's for a package and I don't want to have to load another package.
The following code is not working, but it gives you the idea. I'm stuck.
columns <- c("A", "B" )
data[columns] <- lapply(data[columns], function(x){x}) %>% lapply(dict, function(y){
gsub(pattern = y[,2], replacement = y[,1], x)})
This works for a single word, but I'm not able to pass the list of changes contained in the dictionary.
data[columns] <- lapply(data[columns], gsub, pattern = "FLT1", replacement = "flt1")
@Gregor_Thomas is right: you need a for loop so that the replacements accumulate, otherwise you only replace one value at a time.
df <- data.frame("A"=c("PB1","PB2","OK0","OK0"),"B"=c("OK3","OK4","PB1","PB2"))
dict <- data.frame("pattern"=c("PB1","PB2"), "replacement"=c("OK1","OK2"))
apply(df[,c("A","B")],2, FUN=function(x) {
for (i in 1:nrow(dict)) {
x <- gsub(pattern = dict$pattern[i], replacement = dict$replacement[i],x)
}
return(x)
})
Or, if your dict is too long, you can generate the sequence of gsub calls you need, using paste0 as a code generator:
paste0("df[,'A'] <- gsub(pattern = '", dict$pattern,"', replacement = '", dict$replacement,"',df[,'A'])")
It generates all the gsub lines for the "A" column :
"df[,'A'] <- gsub(pattern = 'PB1', replacement = 'OK1',df[,'A'])"
"df[,'A'] <- gsub(pattern = 'PB2', replacement = 'OK2',df[,'A'])"
Then you evaluate the code and wrap it in a lapply for the various columns :
lapply(c("A","B"), FUN = function(v) { eval(parse(text=paste0("df[,'", v,"'] <- gsub(pattern = '", dict$pattern,"', replacement = '", dict$replacement,"',df[,'",v,"'])"))) })
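It's ugly, but it saves writing the loop by hand. Note that the assignments are evaluated inside the anonymous function, so the global df is not modified; lapply returns the replaced columns as a list, which you still have to assign back to df yourself.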
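For comparison, here is a sketch that stays close to the lapply form from the question and writes the result back into the dataframe, without eval(parse()). It assumes the patterns are literal strings rather than regular expressions, hence fixed = TRUE; stringsAsFactors = FALSE is only there so the example behaves the same on older R versions.

```r
df <- data.frame(A = c("PB1", "PB2", "OK0", "OK0"),
                 B = c("OK3", "OK4", "PB1", "PB2"),
                 stringsAsFactors = FALSE)
dict <- data.frame(pattern = c("PB1", "PB2"),
                   replacement = c("OK1", "OK2"),
                   stringsAsFactors = FALSE)
columns <- c("A", "B")

# lapply returns one modified vector per column; assigning to df[columns]
# puts them back into the dataframe
df[columns] <- lapply(df[columns], function(x) {
  for (i in seq_len(nrow(dict))) {
    # each pass works on the result of the previous one
    x <- gsub(dict$pattern[i], dict$replacement[i], x, fixed = TRUE)
  }
  x
})

df
```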
Edit: for an exact match between df and dict, you should probably use a boolean selection with == instead of gsub().
(I don't use match() here because it returns only the first match.)
df <- data.frame("A"=c("PB1","PB2","OK0","OK0","OK"),"B"=c("OK3","OK4","PB1","PB2","AB"))
dict <- data.frame("pattern"=c("PB1","PB2","OK"), "replacement"=c("OK1","OK2","ZE"))
apply(df[,c("A","B")],2, FUN=function(x) {
for (i in 1:nrow(dict)) {
x[x==dict$pattern[i]] <- dict$replacement[i]
}
return(x)
})
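The exact-match version can also be sketched with a named lookup vector and written back into the dataframe. The name lookup is hypothetical, introduced only for this example. Unlike the loop, each value is looked up once, so a replacement is never itself replaced by a later dictionary row.

```r
df <- data.frame(A = c("PB1", "PB2", "OK0", "OK0", "OK"),
                 B = c("OK3", "OK4", "PB1", "PB2", "AB"),
                 stringsAsFactors = FALSE)
dict <- data.frame(pattern = c("PB1", "PB2", "OK"),
                   replacement = c("OK1", "OK2", "ZE"),
                   stringsAsFactors = FALSE)

# named vector: names are the patterns, values are the replacements
lookup <- setNames(dict$replacement, dict$pattern)

columns <- c("A", "B")
df[columns] <- lapply(df[columns], function(x) {
  hit <- x %in% names(lookup)          # TRUE where the whole value is a pattern
  x[hit] <- unname(lookup[x[hit]])     # replace only exact matches
  x
})

df
```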

Using paste and sum inside a for-loop

I need to compare a character string to multiple others and tried to do it the following way:
empty = character(0)
ps_2 = c("h2","h3")
ps_3 = c("h3", "h4")
visible = ("h2")
i = 2
ps_t = empty
ps_t <- append(ps_t, sum(visible %in% paste("ps_", i, sep="")))
The intention is to write a loop instead of i = 2, in order to cycle through ps_2, ps_3, ...
However, I think it's not working because paste() returns the string "ps_2" instead of the character vector stored under the name ps_2.
How can I fix this?
Thanks for the time and effort!
Kind regards,
A fellow datafanatic!
The function you need is get(), which gets the value of the object.
ps_t <- NULL
sapply(2:3, function(i) append(ps_t, sum(visible %in% get(paste0("ps_", i)))))
Or simply:
sapply(2:3, function(i) sum(visible %in% get(paste0("ps_", i))))
Output
[1] 1 0
You can use eval in R to convert the string to a variable name. You can find the solution here.
Here's what your code will look like:
ps_t <- c(0, (sum(visible %in% eval(parse(text = paste("ps_", i, sep=""))))))
It will give you a numeric vector.
OR
You can use get.
ps_t <- append(0, sum(visible %in% get(paste("ps_", i, sep = ""))))
ps_t
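An alternative that avoids building variable names is to keep the vectors in one named list and iterate over it. This is only a sketch; the list name ps is an assumption, not something from the question.

```r
# store the vectors in a list instead of separate ps_2, ps_3 variables
ps <- list(ps_2 = c("h2", "h3"),
           ps_3 = c("h3", "h4"))
visible <- "h2"

# one count per list element; vapply checks each result is a single integer
ps_t <- vapply(ps, function(p) sum(visible %in% p), integer(1))
ps_t
```

The result is a named integer vector, so ps_t["ps_2"] gives the count for that element.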

R adding values to vector in for loop

I'm working in R & using a for loop to iterate through my data:
pos = c(1256:1301,6542:6598)
sd_all = NULL
for (i in pos){
nameA = paste("A", i, sep = "_")
nameC = paste("C", i, sep = "_")
resA = assign(nameA, unlist(lapply(files, function(x) x$percentageA[x$position==i])))
resC = assign(nameC, unlist(lapply(files, function(x) x$percentageC[x$position==i])))
sd_A = sd(resA)
sd_C = sd(resC)
sd_all = ?
}
Now I want to generate a vector called 'sd_all' that contains the standard deviations of resA & resC. I cannot just do 'sd_all = c(sd(resA), sd(resC))', because then I only keep the values for one element of 'pos'. I want to do it for all values in 'pos', of course.
It looks like you'd be best served with sd_all as a list object. That way you can insert each of your 2 values ( sd(resA) and sd(resC) ).
Initialising a list is simple (this would replace the second line of your code):
sd_all <- list()
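Then you can insert both values into a single list element like so (this would replace the last line in your for loop). Indexing by as.character(i) instead of i avoids padding the list with empty elements, since i takes values such as 1256:
sd_all[[ as.character(i) ]] <- c( sd( resA ), sd( resC ) )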
After your loop, you can then insert this list as a column in a data.frame if that's what you'd like to do.
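As a sketch of the same idea without the assign() calls, sapply can return both standard deviations per position and collect them in a matrix. The contents of files below are made-up toy data standing in for the real input, which the question does not show.

```r
# toy stand-in for `files`: three dataframes with the columns used in the question
files <- list(
  data.frame(position = c(1256, 1257), percentageA = c(10, 20), percentageC = c(1, 2)),
  data.frame(position = c(1256, 1257), percentageA = c(14, 26), percentageC = c(3, 2)),
  data.frame(position = c(1256, 1257), percentageA = c(12, 23), percentageC = c(5, 2))
)
pos <- c(1256, 1257)

# one column per position from sapply, transposed to one row per position
sd_all <- t(sapply(pos, function(i) {
  resA <- unlist(lapply(files, function(x) x$percentageA[x$position == i]))
  resC <- unlist(lapply(files, function(x) x$percentageC[x$position == i]))
  c(sd_A = sd(resA), sd_C = sd(resC))
}))
rownames(sd_all) <- pos

sd_all
```

as.data.frame(sd_all) turns the matrix into a dataframe if that is more convenient.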

Search-and-replace on a set of columns - getting an error trying to gsub

This is a follow-up to this question: Search-and-replace on a list of strings - gsub eapply?
I have the following code:
library(quantmod)
library(stringr)
stockData <- new.env()
stocksLst <- c("AAB.TO", "BBD-B.TO", "BB.TO", "ZZZ.TO")
nrstocks = length(stocksLst)
startDate = as.Date("2016-09-01")
for (i in 1:nrstocks) {
getSymbols(stocksLst[i], env = stockData, src = "yahoo", from = startDate)
}
stockData = as.list(stockData)
names(stockData) = gsub("[.].*$", "", names(stockData))
names(stockData) = gsub("-", "", names(stockData))
symbolsLstCl <- ls(stockData)
The last post got me this far and I greatly appreciate the help. Now, I am trying to do a similar replace for the column names as quantmod includes the symbol name in the columns:
colnames(stockData$ZZZ)
# [1] "ZZZ.TO.Open" "ZZZ.TO.High" "ZZZ.TO.Low" "ZZZ.TO.Close" "ZZZ.TO.Volume" "ZZZ.TO.Adjusted"
I can easily update one of the xts objects using colnames, but I want to include this in a loop so I can do it to all. This is what I had tried, but it fails:
eval(parse(text = paste0("colnames(stockData$", symbolsLstCl[i], ")"))) <- eval(parse(text = (paste0("str_replace(colnames(stockData$", symbolsLstCl[i], "), ", "\".TO\", ", "\"\")"))))
Which I find strange, as if I use this (where the left side is hard-coded), it works:
colnames(stockData$ZZZ) <- eval(parse(text = (paste0("str_replace(colnames(stockData$", symbolsLstCl[i], "), ", "\".TO\", ", "\"\")"))))
I have the sneaking suspicion that there is a much better way to update all of the columns for each element in these lists.. any suggestions are appreciated. Thanks, Adam
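# fixed = TRUE matches the literal text ".TO"; as a regex, "." would match any character
allnames <- lapply(stockData,
function(x) names(x) = gsub(".TO", "", names(x), fixed = TRUE))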
# replace column names
for (i in 1:length(stockData)) {
names(stockData[[i]]) <- allnames[[i]]
}
# print all column names
for (i in 1:length(stockData)) {
print(names(stockData[[i]]))
}
[1] "AAB.Open" "AAB.High" "AAB.Low" "AAB.Close" "AAB.Volume" "AAB.Adjusted"
[1] "BBD-B.Open" "BBD-B.High" "BBD-B.Low" "BBD-B.Close" "BBD-B.Volume" "BBD-B.Adjusted"
[1] "ZZZ.Open" "ZZZ.High" "ZZZ.Low" "ZZZ.Close" "ZZZ.Volume" "ZZZ.Adjusted"
[1] "BB.Open" "BB.High" "BB.Low" "BB.Close" "BB.Volume" "BB.Adjusted"
Edited: the output was not correct just now.
I suppose this is what you hope to get.
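The two loops can also be folded into a single lapply that renames the columns and returns each modified object. The sketch below uses small matrices as stand-ins for the xts objects, so it runs without quantmod or a network connection. It also sidesteps the eval(parse()) attempt from the question, which appears to fail because the left-hand side of an assignment has to be something R can assign into, not the value returned by eval().

```r
# stand-ins for the downloaded objects; only the column names matter here
stockData <- list(
  AAB = matrix(0, nrow = 1, ncol = 2,
               dimnames = list(NULL, c("AAB.TO.Open", "AAB.TO.Close"))),
  ZZZ = matrix(0, nrow = 1, ncol = 2,
               dimnames = list(NULL, c("ZZZ.TO.Open", "ZZZ.TO.Close")))
)

stockData <- lapply(stockData, function(x) {
  # drop the first literal ".TO" from every column name
  colnames(x) <- sub(".TO", "", colnames(x), fixed = TRUE)
  x  # return the whole object, not just the names
})

lapply(stockData, colnames)
```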

R: Use function arguments as names for list sub elements

Here is a simplified example of what I am trying to do
set.seed(1)
a <- rnorm(10)
b <- rnorm(10)
asdf<-function(vec1,vec2){
mylist <- list(sums = c(vec1 = sum(a), vec2 = sum(b)),
products = c(vec1 = prod(a), vec2 = prod(b)))
return(mylist)
}
asdf(a,b)
Here is the output:
$sums
vec1 vec2
1.322028 2.488450
$products
vec1 vec2
0.0026236813 0.0003054751
The names of the list elements are based on the names I specified when defining the function, not the actual inputs used in the function. This makes sense, in general, but I would like to know how to change this behavior for a specific problem
My desired output, given inputs a and b would be
$sums
a b
1.322028 2.488450
$products
a b
0.0026236813 0.0003054751
Whatever the inputs are, be they c(1,2,3,3,3,123) and c(2,1,1,5,7,1) or rnorm(10) and rpois(10), should be returned in the output.
I know how to do the renaming after the function is done, but I want the naming to happen
within the function. I've been looking at some other questions on SO, but haven't had anything work out quite right.
A few things I've tried without success.
asdf<-function(vec1,vec2){
name1<- deparse(substitute(vec1))
name2<- deparse(substitute(vec2))
mylist <- list(sums = c(name1 = sum(a), name2 = sum(b)),
products = c(name1 = prod(a), name2 = prod(b)))
return(mylist)
}
asdf<-function(vec1,vec2){
mylist <- list(sums = c(name1 = sum(a), name2 = sum(b)),
products = c(name1 = prod(a), name2 = prod(b)))
assign(names(mylist(vec1,vec2)$sums,
c(deparse(substitute(vec1)),deparse(substitute(vec2)))))
return(mylist)
}
It seems I may need to use get or assign or match.call, but I'm out of my league here.
I feel a bit like a dunce reading some of these help pages. If I don't know enough to understand the help pages, well, I'm not nearly as good at R as I thought I was.
Use substitute to capture the names, then setNames to set them.
asdf<-function(vec1,vec2){
nms <- as.character(c(substitute(vec1), substitute(vec2)))
mylist <- list(sums = c(vec1 = sum(vec1), vec2 = sum(vec2)),
products = c(vec1 = prod(vec1), vec2 = prod(vec2)))
# return
lapply(mylist, setNames, nms)
}
asdf(a,b)
You can put the setNames right into the list() call above, but that might make the code too cumbersome to read.
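A compact sketch of the same approach, using deparse(substitute()) and operating on the arguments themselves rather than on the global a and b. The question's first attempt gave the wrong names because c(name1 = ...) uses the literal text name1 as the name instead of the value stored in name1. The inputs x and y are made up for the example.

```r
asdf <- function(vec1, vec2) {
  # capture the expressions the caller passed, as text
  nms <- c(deparse(substitute(vec1)), deparse(substitute(vec2)))
  list(
    sums     = setNames(c(sum(vec1), sum(vec2)), nms),
    products = setNames(c(prod(vec1), prod(vec2)), nms)
  )
}

x <- c(1, 2, 3)
y <- c(4, 5)
res <- asdf(x, y)
res
```

Calling asdf(rnorm(10), rpois(10, 1)) would label the elements with the text of those calls.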
