R: if else statement is handling column as whole vector

I have a data set where I want to calculate the 6 month return of stocks with tq_get (see example below)
Dataset called top
ticker 6month
AKO.A
BIG
BGFV
Function
library(tidyverse)
library(dplyr)
library(tidyquant)
library(riingo)
calculate <- function (x) {
(tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted/tq_get(x, get = "tiingo", from = before, to = before)$adjusted)-1
}
top[2] <- lapply(top[1], function(x) calculate(x))
Unfortunately, some tickers have no data, which causes an error when simply using lapply or mutate, because the resulting vector is shorter (fewer rows) than the existing dataset. Resolving this with tryCatch did not work.
I now want to work around this by using is_supported_ticker() from the riingo package to check whether each ticker is available:
calculate <- function (x) {
if (is_supported_ticker(x, type = "tiingo") == TRUE) {
(tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted/tq_get(x, get = "tiingo", from = before, to = before)$adjusted)-1
}
else {
NA
}
}
top[2] <- lapply(top[1], function(x) calculate(x))
But now I receive the error message x ticker must be length 1, but is actually length 3.
I assume this is based on the fact that the whole first column of my dataset is used as input for is_supported_ticker() instead of row by row. How can I resolve this issue?

Glancing at the documentation, it looks like tq_get supports multiple symbols; only is_supported_ticker goes one at a time. So you should probably check all the tickers to see whether they are supported, and then call tq_get once on all the supported ones. Something like this (untested, as I don't have any of these packages):
calculate <- function (x) {
supported = sapply(x, is_supported_ticker, type = "tiingo")
result = rep(NA, length(x))
result[supported] =
(
tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted /
tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
) - 1
return(result)
}
It worries me that before and yesterday aren't function arguments - they're just assumed to be there in the global environment. I'd suggest passing them in as arguments to calculate(), like this:
calculate <- function (x, before, yesterday) {
supported = sapply(x, is_supported_ticker, type = "tiingo")
result = rep(NA, length(x))
result[supported] =
(
tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted /
tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
) - 1
return(result)
}
# then calling it
calculate(top$ticker, before = <...>, yesterday = <...>)
This way you can pass values in for before and yesterday on the fly. If they are objects in your global environment, you can simply use calculate(top$ticker, before, yesterday), but it gives you freedom to vary those arguments without redefining those names in your global environment.
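The length-1 error in the question arises because top[1] is a one-column data frame, so lapply hands the whole ticker column to the function in a single call. Below is a minimal sketch of applying a length-1 check element by element instead. The two *_stub functions are hypothetical stand-ins for is_supported_ticker() and the tq_get() ratio, with made-up values, so the example runs without an API key.

```r
# Hypothetical stand-ins so the sketch runs offline (not part of riingo/tidyquant).
is_supported_stub <- function(ticker) {
  stopifnot(length(ticker) == 1)          # mimics the length-1 requirement
  ticker %in% c("BIG", "BGFV")
}
fetch_return_stub <- function(ticker) {
  c(BIG = 0.10, BGFV = -0.05)[[ticker]]   # made-up returns
}

# Works on exactly one ticker, so a plain if/else is safe here.
calculate_one <- function(ticker) {
  if (is_supported_stub(ticker)) fetch_return_stub(ticker) else NA_real_
}

top <- data.frame(ticker = c("AKO.A", "BIG", "BGFV"), stringsAsFactors = FALSE)

# vapply() walks over the elements of the column, one ticker per call.
top$return_6m <- vapply(top$ticker, calculate_one, numeric(1), USE.NAMES = FALSE)
print(top)
```

Iterating over top$ticker (a vector) rather than top[1] (a data frame) is what makes each call see a single ticker.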

Related

R : How to create objects with a function which name and value depend on an argument, and that these objects are found in the global environment?

I have the following situation: I have several dataframes, and for each one I would like to create 2 dataframes according to the value of one of the columns (log2FoldChange > 1 and log2FoldChange < -1).
For this I use the following code:
DJ29_T0_Overexpr = DJ29_T0[which(DJ29_T0$log2FoldChange > 1),]
DJ29_T0_Underexpr = DJ29_T0[which(DJ21_T0$log2FoldChange < -1),]
DJ29_T0 being one of my dataframes.
First problem: the sign for the dataframe where log2FoldChange < -1 is not taken into account.
But the main problem is at the time of making the function, I wrote the following:
spliteOverUnder <- function(res){
nm <-deparse(substitute(res))
assign(paste(nm,"_Overexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) > 1),])
assign(paste(nm,"_Underexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) < -1),])
}
Which I then ran with :
spliteOverUnder(DJ29_T0)
No error message, but my objects are not exported to my global environment. I tried return(paste(nm, "_Overexpr", sep = "")), but it only returns the object name, not the associated dataframe.
Using paste() forces the use of assign(), so I can't do :
spliteOverUnder <- function(res){
nm <-deparse(substitute(res))
paste(nm,"_Overexpr", sep="") <<- res[which(as.numeric(as.character(res$log2FoldChange)) > 1),]
paste(nm,"_Underexpr", sep="") <<- res[which(as.numeric(as.character(res$log2FoldChange)) < -1),]
}
spliteOverUnder(DJ24_T0)
I encounter the following error:
Error in paste(nm, "_Overexpr", sep = "") <<- res[which(as.numeric(as.character(res$log2FoldChange)) > :
could not find function "paste<-"
If you've encountered this difficulty before, I'd appreciate a little help.
And once the function works, if you know how to use a for loop over a list containing all my dataframes to apply this function to each of them, I'd welcome that too.
Thanks
When assigning, use the pos argument to hoist the new objects out of the function.
function(){
assign(x = ..., value = ...,
pos = 1 ## see below
)
}
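... where pos = 1 is the first entry on the search path, i.e. the global environment (.GlobalEnv). The default, pos = -1, assigns into the environment assign() is called from, which here is the function's local environment.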
edit
A general function to create the split dataframes in your global environment follows. However, you might rather want to save the new dataframes (from within the function) or just forward them to downstream functions than cram your workspace with intermediary objects.
splitOverUnder <- function(the_name_of_the_frame){
df <- get(the_name_of_the_frame)
df$cat <- cut(df$log2FoldChange,
breaks = c(-Inf, -1, 1, Inf),
labels = c('underexpr', 'normal', 'overexpr')
)
split_data <- split(df, df$cat)
sapply(c('underexpr', 'overexpr'),
function(n){
new_df_name <- paste(the_name_of_the_frame, n, sep = '_')
assign(x = new_df_name,
value = split_data[[n]], ## [[n]] looks up the value of n; $n would look for an element literally named "n"
envir = .GlobalEnv
)
}
)
}
## say, df1 and df2 are your initial dataframes to split:
sapply(c('df1', 'df2'), function(n) splitOverUnder(n))
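As a sketch of the alternative mentioned above (forwarding the split dataframes instead of creating global objects), the function below returns both subsets in a list. The name split_over_under, the threshold argument and the toy data are illustrative assumptions, not part of the original code.

```r
# Split a data frame into over- and under-expressed rows and return both.
split_over_under <- function(df, threshold = 1) {
  lfc <- as.numeric(as.character(df$log2FoldChange))
  list(
    overexpr  = df[which(lfc >  threshold), , drop = FALSE],
    underexpr = df[which(lfc < -threshold), , drop = FALSE]
  )
}

# Toy data standing in for the real result tables (values are made up).
frames <- list(
  df1 = data.frame(gene = c("a", "b", "c"), log2FoldChange = c(2.5, -0.3, -1.7)),
  df2 = data.frame(gene = c("d", "e"),      log2FoldChange = c(-2.0, 1.2))
)

# One named list entry per input frame, each holding two data frames.
splits <- lapply(frames, split_over_under)
splits$df1$underexpr
```

Keeping the input frames in a named list also answers the looping part of the question: lapply visits every frame, and the results stay addressable by name, e.g. splits$df2$overexpr.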

Modify default cenboxplot (censored boxplot)

I'd like to use the censored boxplot in the R package NADA but I want to reorder the X-axis.
library(NADA)
data(Golden)
#this should reorder the factor and change the x-axis but does not
Golden$DosageGroup <-factor(Golden$DosageGroup, levels=c("Low","High"))
cenboxplot(Golden$Blood, Golden$BloodCen, Golden$DosageGroup)
The help says the output is the default boxplot method but I cannot seem to get it to work.
PS - similar to this post but no answers were given
Modifying a function in a loaded existing package consists of several steps:
get the code and store its environment
cenboxplot # just typing the name of a function should bring up its code.
# this appears
function (obs, cen, group, log = TRUE, range = 0, ...)
{
if (log)
log = "y"
else log = ""
if (missing(group))
ret = boxplot(cenros(obs, cen), log = log, range = range,
...)
else {
modeled = numeric()
groups = character()
for (i in levels(as.factor(group))) {
mod = suppressWarnings(cenros(obs[group == i], cen[group ==
i])$modeled)
grp = rep(i, length(mod))
modeled = c(modeled, mod)
groups = c(groups, grp)
}
# problem with levels of the `groups` object
boxplot(modeled ~ as.factor(groups), log = log, range = range,
...)
ret = data.frame(ros.model = modeled, group = groups)
}
abline(h = max(obs[cen]))
invisible(ret)
}
<environment: 0x55a8acb2d708>
cbp_env <- environment(cenboxplot)
figure out where the function is deficient and make a copy ready to fix the problem:
It's because the constructed groups object doesn't inherit the levels from the group argument. When I'm looking at the code, I select the console output starting with the function name and ending just before the <environment ....> designation, and then paste that back to the console. I then put an assignment arrow (<-) right after the function name.
Modify the code before hitting enter (or copy it to an editor if it's going to require major surgery). After the code is modified, assign the new value to the existing name (or to a new name, at your discretion).
The modification that succeeds: put this line just below the curly brace that ends the for loop (which is also just before the boxplot call):
groups=factor(groups, levels=levels(group)) # adhere to user's intent
assign the same environment to the new version as the old version had:
environment(cenboxplot) <- cbp_env # which was stored above.
Now running your code yields a boxplot with the groups in the requested order (Low, then High).
There are alternatives to the fname <- old_fname; environment(fname) <- environment(old_fname) strategy: there is reassignInPackage in the R.utils package, and apparently you can also do environment(cenboxplot) <- asNamespace('NADA').
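The level-ordering problem described above can be reproduced in base R without NADA. This sketch mimics only the label-building part of the cenboxplot loop (the toy group vector is an assumption) and shows why the user's level order is lost and how the added line restores it.

```r
# The user's factor, with an explicit (non-alphabetical) level order.
group <- factor(c("High", "Low", "High", "Low"), levels = c("Low", "High"))

# cenboxplot rebuilds the labels as a plain character vector in its for loop.
groups <- character()
for (i in levels(as.factor(group))) {
  groups <- c(groups, rep(i, sum(group == i)))
}

# as.factor() on a character vector sorts the levels alphabetically.
levels(as.factor(groups))
# → "High" "Low"

# Passing the original levels keeps the user's order.
levels(factor(groups, levels = levels(group)))
# → "Low" "High"
```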

R: Not able to trycatch error with lapply

I have a table of stocks in R for which I want to calculate the 6-month return using tq_get and the tiingo API. I wanted to use lapply to fill my table, but some tickers are not available on tiingo (or may be wrong), which raises an error. Because of this error, the assigned data has fewer rows than the existing data and lapply does not work. I tried to resolve this with tryCatch, but it still does not work. What is missing?
today <- Sys.Date()
yesterday <- as.Date(today) - days(1)
before <- as.Date(today) - months(6)
tiingo_api_key('<my API key>')
calculate <- function (x) {
((tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted)/(tq_get(x, get = "tiingo", from = before, to = before)$adjusted)-1)
}
top10[20] <- lapply(top10[1], calculate(x) tryCatch(calculate(x), error=function(e) NA))
The second argument of lapply must be a function, so wrap the tryCatch call in an anonymous function. tryCatch then evaluates calculate(x) and returns NA if it raises an error. This should work.
# Old version vvvvvv function call in wrong place
top10[20] <- lapply(top10[1], calculate(x) tryCatch(calculate(x), error=function(e) NA))
# Corrected version
top10[20] <- lapply(top10[1], function(x) tryCatch(calculate(x), error=function(e) NA))
EDIT: I just saw that @rawr already suggested this in a comment. I only added a brief explanation of the function.
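Note that lapply(top10[1], ...) still passes the whole column to calculate() in one call, so a single error would replace the entire result with one NA. A minimal sketch of catching errors per ticker follows; calculate_stub() is a hypothetical replacement for calculate() with made-up returns, used only so the example runs without the tiingo API.

```r
# Hypothetical stand-in for calculate(): fails for an unknown ticker.
calculate_stub <- function(ticker) {
  known <- c(BIG = 0.10, BGFV = -0.05)    # made-up returns
  if (!ticker %in% names(known)) stop("ticker not found: ", ticker)
  known[[ticker]]
}

tickers <- c("AKO.A", "BIG", "BGFV")

# Each ticker gets its own tryCatch, so one failure yields exactly one NA.
returns <- vapply(
  tickers,
  function(x) tryCatch(calculate_stub(x), error = function(e) NA_real_),
  numeric(1)
)
returns
```

The result has one entry per ticker, so it can be assigned to a data frame column without a row-count mismatch.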
A workaround that avoids the error message is to check each ticker with is_supported_ticker() from the riingo package:
calculate <- function (x) {
supported = sapply(x, is_supported_ticker, type = "tiingo")
result = rep(NA, length(x))
result[supported] =
(
tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted /
tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
) - 1
return(result)
}

How to create a data frame with Rblpapi subscribe function

I'm sorry this example won't be reproducible by those who aren't Bloomberg users.
For the others, I'm using Rblpapi and its subscribe function. I would like to create something like a data frame, a matrix or an array and fill it with values that are streamed by the subscription.
Assuming your BBComm component is up and running, my example says:
require(Rblpapi)
con <- blpConnect()
securities <- c('SX5E 07/20/18 C3400 Index',
'SX5E 07/20/18 C3450 Index',
'SX5E 07/20/18 C3500 Index')
I would like to fill a 3 x 2 matrix with these fields:
fields <- c('BID', 'ASK')
I guess I can create a matrix like this with almost no performance overhead:
mat <- matrix(data = NA,
nrow = 3,
ncol = 2)
Now I use subscribe and its argument fun for filling purposes, so something like this (albeit ugly to see and likely inefficient):
i <- 1
subscribe(securities = securities,
fields = fields,
fun = function(x){
if (i > length(securities))
i <<- 1
tryCatch(
expr = {
mat[i, 1] <<- x$data$BID
mat[i, 2] <<- x$data$ASK
i <<- i + 1
},
error = function(e){
message(e)
},
finally = {}
)
})
Result:
Error in subscribe_Impl(con, securities, fields, fun, options, identity) :
Evaluation error: number of items to replace is not a multiple of replacement length.
Of course, this doesn't work because I don't really know how to use indexing on streamed data. The $ operator seems fine for retrieving data points by name, as I did with BID and ASK, but I cannot find a way to figure out which values refer to, say, securities[1] or securities[2]. It seems that I get a stream of numeric values that are indistinguishable from one another, because I cannot tell which security each value belongs to.
Using an index on x$data$BID[1] throws the same error.
Your code looks fine; the only thing that does not work is x$data$BID. Change it to x$data["BID"] and then you can store the value. I am working with your code, and this is my result:
fields=c("TIME","LAST_PRICE", "BID", "ASK")
blpConnect()
i <- 1
subscribe(securities = securities,
fields = fields, options = "interval=60",
fun = function(x){
if (i > length(securities))
i <<- 1
tryCatch(
expr = {
tim <- x$data["TIME"]
last <<- x$data["LAST_PRICE"]
ask <<- x$data["ASK"]
bid <<- x$data["BID"]
i <<- i + 1
},
error = function(e){
message(e)
},
finally = {}
)
print(cbind(tim$TIME,last$LAST_PRICE,ask$ASK, bid$BID))
})
result
A good way to take a look at the result object from the subscribe function is:
subscribe(securities=c("AAPL US Equity"),
fields=c("LAST_PRICE"),
fun=function(x) print(str(x)))
From there you can work your way into the data:
subscribe(securities=c("AAPL US Equity", "INTC US Equity"),
fields=c("LAST_PRICE","BID","ASK"),
fun=function(x) {
if (!is.null(x$data$MKTDATA_EVENT_TYPE) && x$data$MKTDATA_EVENT_TYPE == "TRADE" && exists("LAST_PRICE", where = x$data)) {
print(data.frame(Ticker = x$topic, DateTime = x$data$TRADE_UPDATE_STAMP_RT, Trade = x$data$LAST_PRICE))
}
})
I only printed the data.frame here. The data can be processed or stored directly using the fun argument of subscribe.
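To address the original problem of telling the securities apart, one option is to name the matrix rows after the securities and index by x$topic, which the answer above uses as the ticker. The sketch below runs without Bloomberg: the messages are simulated, their list layout (topic, data) is an assumption based on the fields used above, and the security names are placeholders. It also assumes an update may carry only some of the fields, which would explain the replacement-length error.

```r
securities <- c("SEC_A Index", "SEC_B Index", "SEC_C Index")   # placeholder names
fields <- c("BID", "ASK")

# Rows are named after the securities so updates can be placed by topic.
mat <- matrix(NA_real_, nrow = length(securities), ncol = length(fields),
              dimnames = list(securities, fields))

# Callback in the shape subscribe() expects for its fun argument.
on_update <- function(x) {
  for (f in fields) {
    value <- x$data[[f]]
    # Skip fields that are absent from this update (assumed to be possible).
    if (!is.null(value)) mat[x$topic, f] <<- as.numeric(value)
  }
}

# Simulated messages standing in for the live stream.
on_update(list(topic = "SEC_B Index", data = list(BID = 10.5)))
on_update(list(topic = "SEC_A Index", data = list(BID = 9.9, ASK = 10.1)))
mat
```

With a live connection, on_update would be passed as fun = on_update; whether the real messages match this layout should be checked first with the str(x) approach shown above.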

Paste multiple elements in R

I wrote this code using a for loop. (The main purpose of this code is to list different webpages, which are obtained via httr and rvest.)
r = "asdgkjkhdf"
t = "osrt"
all = c()
for(i in 1:400)
{
y = paste(r, i, sep = '')
d = paste(y, t, sep = '')
all = c(all, d)
}
all
I get results like these (the pasted numbers accumulate in each result):
[1]asdgkjkhdf1osrt
[2]asdgkjkhdf12osrt
[3]asdgkjkhdf123osrt
[4]asdgkjkhdf1234osrt
...
But I want results like these, regardless of how many numbers I put in the for() loop:
[1]asdgkjkhdf1osrt
[2]asdgkjkhdf2osrt
...
[400]asdgkjkhdf400osrt
What should I change to get the results above?
Should I use paste(substr(), substr(), sep='')?
If you really want to use a for-statement you can use the following
r = "asdgkjkhdf"
t = "osrt"
all = c()
for (idx in 1:400)
all = c(all, paste0(r, idx, t))
However, in R you should generally prefer vectorised code over for loops, since a loop like this is less readable and, because the vector is regrown on every iteration, slower. The solution without the for loop (given by Roland in the comments) is
all <- paste0(r, 1:400, t)
Note that paste0("string") is just shorthand for paste("string", sep = '').
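As a small check of the vectorised approach, this sketch builds the 400 strings in one call and compares the result with an equivalent sprintf() call. The variable names all_urls and all_urls2 are illustrative.

```r
r <- "asdgkjkhdf"
t <- "osrt"

# paste0() recycles the length-1 strings against the whole index vector.
all_urls <- paste0(r, 1:400, t)

# sprintf() is an equivalent vectorised alternative.
all_urls2 <- sprintf("%s%d%s", r, 1:400, t)

head(all_urls, 3)
length(all_urls)
```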

Resources