Query on a back-testing strategy in R - an Indian trader's perspective

There is documentation for backtesting in R on GitHub (https://timtrice.github.io/backtesting-strategies/).
I have a query about two lines of code mentioned in this document (https://timtrice.github.io/backtesting-strategies/using-quantstrat.html#settings-and-variables).
First line
Sys.setenv(TZ = "UTC")
Second line
currency('USD')
As you can see, the first line sets the system time zone (here UTC) and the second line sets the currency in which trading occurs (here US dollars). I am an Indian trader and my job is to do back-testing with equity data for Indian companies. I use the quantstrat and quantmod packages along with their dependencies. The data is downloaded from Yahoo Finance through R.
What arguments should an Indian trader pass to these two functions (Sys.setenv and currency)? The currency of the Indian market is INR (Indian Rupee), and Indian time is GMT+5:30.
I tried passing the argument "GMT+5:30" to Sys.setenv and it returned an error. When I passed plain "GMT" there was no error, but Indian time is GMT+5:30.

I found the answer. To determine the time zone, type OlsonNames() in R; you will get a comprehensive list of time zones, from which you can choose the one matching your location. For me (an Indian trader) it is Sys.setenv(TZ = "Asia/Kolkata"). For the currency, set it as currency("INR"). I thank Ilya Kipnis for helping me arrive at this solution.
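Putting it together, here is a minimal sketch of the settings block an Indian trader might use; the ticker "RELIANCE.NS" and the start date are only illustrative, and ".NS" is Yahoo Finance's suffix for NSE-listed stocks:
library(quantstrat)               # also loads quantmod and FinancialInstrument
Sys.setenv(TZ = "Asia/Kolkata")   # Indian Standard Time (UTC+5:30), taken from OlsonNames()
currency("INR")                   # book the portfolio in Indian Rupees
getSymbols("RELIANCE.NS", src = "yahoo", from = "2015-01-01")   # illustrative NSE ticker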

Related

Quantmod getSymbols systematically returns missing value on Chinese stocks

Today (2019-02-27) I discovered that the stock prices of almost all Chinese companies listed in Shanghai/Shenzhen cannot be completely downloaded by the getSymbols function in quantmod; it always generates a warning message about missing data. Neither US companies nor Chinese companies listed in the US were affected. As far as I can remember, this is the first time I have encountered this issue, and I am wondering which part of the process went wrong: the Yahoo Finance database or getSymbols? The examples I tried are some of the biggest companies, so I assume their stock data should be fully available.
> getSymbols("BABA") ### Alibaba listed in US, not affected
[1] "BABA"
> getSymbols("BILI")
[1] "BILI"
> getSymbols("0700.hk") ### Tencent listed in HK, affected.
[1] "0700.HK"
Warning message:
0700.hk contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
> getSymbols("601398.SS")
[1] "601398.SS"
Warning message:
601398.SS contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
> getSymbols("601318.SS")
[1] "601318.SS"
Warning message:
601318.SS contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
It is a Yahoo issue. If you look at the December 2011 data for Tencent on the historical data tab of Yahoo, you can see that Yahoo doesn't have the data for the 24th and the 31st of December, which are two of the three records with missing data; the other is 2008-08-22.
Note that the default request with getSymbols for Yahoo starts at 2007-01-01, so you could change that to a more recent date. But it is free data; you cannot expect the same data quality as from other data providers, and this happens with Yahoo for other tickers as well.
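To illustrate that suggestion, a small sketch: request a shorter history with the from argument and repair any leftover gaps yourself (the start date below is arbitrary):
library(quantmod)
tencent <- getSymbols("0700.HK", src = "yahoo", from = "2012-01-01", auto.assign = FALSE)
tencent <- na.approx(tencent)   # or na.omit() / na.locf(), as the quantmod warning itself suggests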
Yes, as mentioned above by @phiver, the data quality of the Yahoo Finance database is not satisfactory. Meanwhile, Google Finance stopped providing data to quantmod in March 2018, so I was looking for another data source within the quantmod framework.
I found that the Tiingo database started supporting quantmod as Google Finance exited.
https://www.r-bloggers.com/goodbye-google-hello-tiingo/
Go to the Tiingo website to create an account; then you will have your API key.
Use getSymbols.tiingo(ticker, api.key = "your key") to download data.
By the way, the tickers of Chinese stocks are a bit different in getSymbols.tiingo compared with getSymbols: you don't need to indicate the stock exchange (SS or SZ).
getSymbols("000001.SS")
getSymbols.tiingo("000001",api.key="xxxxx")
You might also want to store your API key; I recommend creating a snippet, which is the most efficient way I have found so far. Further details can be found in another answer of mine on how to store an API key in RStudio.
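As a sketch of one alternative to a snippet: keep the key out of your scripts entirely, for example in ~/.Renviron, and read it at run time. The variable name TIINGO_API_KEY below is just a convention, not something quantmod requires, and getSymbols(..., src = "tiingo", api.key = ...) is assumed to be equivalent to the getSymbols.tiingo() call above:
# In ~/.Renviron (one line, no quotes):  TIINGO_API_KEY=xxxxxxxxxxxxxxxx
library(quantmod)
key <- Sys.getenv("TIINGO_API_KEY")   # read the key at run time
stopifnot(nzchar(key))                # fail early if the key was not set
prices <- getSymbols("000001", src = "tiingo", api.key = key, auto.assign = FALSE)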

quantstrat trading strategy with two symbols: signals in one, buy with prices of another symbol

Is it possible to set up a trading strategy in R using one symbol (for example QQQ) to generate signals but buy/sell another symbol (for example QLD)?
The trading strategy with QQQ is already built; how can I include the quotes of QLD for buying and selling when a signal occurs on the QQQ quotes?
Thank you for your help.
Merge the QLD OHLC data (or QLD tick data, if available) onto an xts object that contains the QQQ indicators and signals.
Use na.locf where necessary to fill forward price data.
Treat this merged object as one symbol in quantstrat (and add other pairs of data as other symbols if you wish), even though the price data is for QLD and the indicators are for QQQ.
When you get an entry signal in your QQQ market data, you can execute at QLD prices; if the data object has a "Close" column, quantstrat will fill on it.
If these instructions don't make sense, make a full reproducible example for someone to help you with.
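A minimal sketch of that merge step, assuming daily Yahoo data; the 50-day SMA is only a stand-in for whatever QQQ indicator your strategy actually uses, and the quantstrat setup itself is omitted:
library(quantmod)
getSymbols(c("QQQ", "QLD"), src = "yahoo", from = "2015-01-01")
qqq_sma <- SMA(Cl(QQQ), n = 50)          # indicator computed on QQQ
colnames(qqq_sma) <- "QQQ.SMA50"
combined <- merge(OHLCV(QLD), qqq_sma)   # QLD prices plus the QQQ indicator in one xts object
combined <- na.locf(combined)            # fill forward so every bar has an executable QLD price
# 'combined' can then be registered as a single symbol in quantstrat, with rules filling on its QLD Close column.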

Rblpapi BDH to get historical fundamental data

My objective is to get fundamental data from Bloomberg via Rblpapi. Say you wanted to compare QoQ and YoY revenue per share for AMD stock: the last reporting period (date: 12/26/15) versus one year before (date: 12/27/14).
# To get data for last reporting period you could
last_report_dt = bdp("AMD US Equity", "MOST_RECENT_PERIOD_END_DT")
rev_yrly_cur = bdh("AMD US Equity","REVENUE_PER_SH",last_report_dt,last_report_dt, opt=c("periodicitySelection"="YEARLY"))
rev_qtrly_cur = bdh("AMD US Equity","REVENUE_PER_SH",last_report_dt,last_report_dt, opt=c("periodicitySelection"="QUARTERLY"))
Question is how to get the reporting date for the year before (12/27/2014) programmatically (I have many tickers) so I can get revenue for that period and compare.
Any suggestions or workarounds are welcome.
Try something along the lines of:
bdp("AMD US Equity","REVENUE_PER_SH", override_fields = "EQY_FUND_RELATIVE_PERIOD", override_values = "-1FY")
This means: get the value for the previous financial year. Other examples of values you can override with are "-1FQ" and "-1CQ", meaning the previous financial quarter and the previous calendar quarter, respectively.
Also, if you want to test easily, you can use the Excel API or FLDS on the Bloomberg Terminal. The formula to test this with the Excel API is:
=BDP($E8,F$7,"DX243=-3FQ")
Using overrides is the solution:
bdp("AMD US Equity","REVENUE_PER_SH",overrides=c("EQY_FUND_RELATIVE_PERIOD"="-1FQ"))

How do I use Quantmod to query Yahoo for the existence of a stock symbol

I've been using quantmod in R, with calls like:
getSymbols(allsymbols, src = 'yahoo', warnings = TRUE)
However, I've got a file of over 8,000 stocks I want to query, and a lot of them aren't valid in Yahoo's data source.
So I'd like to check them against a list of existing stocks and then adjust the list of symbols to include only the valid ones.
I've gone through the quantmod documentation and can't find anything there. Maybe it's not possible within quantmod, but perhaps there's another way of doing it?
I did an extensive amount of research on this a while ago and unfortunately didn't get anywhere. The only list you can get effortlessly is through the stockSymbols() function in the TTR package, which returns a limited list of almost 7,000 ticker symbols traded on AMEX, NASDAQ, and NYSE, from Yahoo! Finance:
require(TTR)
tickersList <- stockSymbols()
FYI there's been a detailed discussion about this here.
Hope it helps.
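In practice, the usual workaround is to probe each symbol with getSymbols() inside tryCatch() and keep only the ones that download cleanly; a rough sketch follows (the symbol list is illustrative, and for 8,000 tickers you may want a pause between requests to avoid rate limits):
library(quantmod)
symbol_exists <- function(sym) {
  out <- tryCatch(getSymbols(sym, src = "yahoo", auto.assign = FALSE),
                  error = function(e) NULL)   # failed downloads return NULL
  !is.null(out)
}
allsymbols   <- c("AAPL", "MSFT", "NOTAREALTICKER")                       # illustrative list
validsymbols <- allsymbols[vapply(allsymbols, symbol_exists, logical(1))]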

How to download intraday stock market data with R

All,
I'm looking to download stock data from either Yahoo or Google at 15 to 60 minute intervals, with as much history as I can get. I've come up with a crude solution as follows:
library(RCurl)
tmp <- getURL('https://www.google.com/finance/getprices?i=900&p=1000d&f=d,o,h,l,c,v&df=cpct&q=AAPL')
tmp <- strsplit(tmp,'\n')
tmp <- tmp[[1]]
tmp <- tmp[-c(1:8)]
tmp <- strsplit(tmp,',')
tmp <- do.call('rbind',tmp)
tmp <- apply(tmp,2,as.numeric)
tmp <- tmp[!apply(tmp, 1, function(x) any(is.na(x))), ]
Given the amount of data I'm looking to import, I worry that this could be computationally expensive. I also don't, for the life of me, understand how the time stamps are coded by Yahoo and Google.
So my question is twofold: what's a simple, elegant way to quickly ingest data for a series of stocks into R, and how do I interpret the time stamps in the Google/Yahoo files I would be using?
I will try to answer the timestamp question first. Please note this is my interpretation and I could be wrong.
Using the link in your example, https://www.google.com/finance/getprices?i=900&p=1000d&f=d,o,h,l,c,v&df=cpct&q=AAPL, I get the following data:
EXCHANGE%3DNASDAQ
MARKET_OPEN_MINUTE=570
MARKET_CLOSE_MINUTE=960
INTERVAL=900
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN,VOLUME
DATA=
TIMEZONE_OFFSET=-300
a1357828200,528.5999,528.62,528.14,528.55,129259
1,522.63,528.72,522,528.6499,2054578
2,523.11,523.69,520.75,522.77,1422586
3,520.48,523.11,519.6501,523.09,1130409
4,518.28,520.579,517.86,520.34,1215466
5,518.8501,519.48,517.33,517.94,832100
6,518.685,520.22,518.63,518.85,565411
7,516.55,519.2,516.55,518.64,617281
...
...
Note the first value of the first column, a1357828200; my intuition was that this has something to do with POSIXct. Hence a quick check:
> as.POSIXct(1357828200, origin = '1970-01-01', tz='EST')
[1] "2013-01-10 14:30:00 EST"
So my intuition seems to be correct, but the time seems to be off. Now we have one more piece of information in the data: TIMEZONE_OFFSET=-300. If we offset our timestamps by this amount (in minutes), we should get:
as.POSIXct(1357828200-300*60, origin = '1970-01-01', tz='EST')
[1] "2013-01-10 09:30:00 EST"
Note that I didn't know which day's data you had requested, but a quick check on Google Finance reveals that those were indeed the price levels on 10 Jan 2013.
The remaining values in the first column seem to be some sort of offset from the value in the first row.
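To make that decoding concrete, here is a rough parser for the DATE column, assuming (as in the header above) that INTERVAL gives the bar length in seconds, rows beginning with "a" carry an absolute Unix timestamp, and bare integers are offsets from the most recent such anchor:
library(zoo)   # for na.locf
decode_google_times <- function(date_col, interval = 900, tz_offset_min = -300) {
  is_anchor  <- grepl("^a", date_col)
  anchor_val <- ifelse(is_anchor, as.numeric(sub("^a", "", date_col)), NA_real_)
  anchor_val <- na.locf(anchor_val)                                     # carry the last anchor forward
  offset     <- ifelse(is_anchor, 0, suppressWarnings(as.numeric(date_col)))
  secs       <- anchor_val + offset * interval + tz_offset_min * 60     # shift to exchange-local time
  as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
}
# e.g. decode_google_times(c("a1357828200", "1", "2")) gives 09:30, 09:45 and 10:00 on 2013-01-10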
So downloading and standardizing the data ended up being much more of a bear than I figured it would be: about 150 lines of code. The problem is that while Google provides the past 50 trading days of data for all exchange-traded stocks, the time stamps within those days are not standardized: an index of '1', for example, could refer to either the first or the second time increment on the first trading day in the data set. Even worse, stocks that trade at low volumes only have entries where a transaction was recorded. For a high-volume stock like AAPL that's no problem, but for low-volume small caps it means that your series will be missing much if not most of the data. This was problematic because I need all the stock series to lie neatly on top of each other for the analysis I'm doing.
Fortunately, there is still a general structure to the data. Using this link:
https://www.google.com/finance/getprices?i=1800&p=1000d&f=d,o,h,l,c,v&df=cpct&q=AAPL
and changing the stock ticker at the end will give you the past 50 trading days at half-hourly increments. POSIX time stamps, very helpfully decoded by @geektrader, appear in the timestamp column at three-week intervals. Though the timestamp indexes don't invariably correspond in a convenient 1:1 manner (I almost suspect this was intentional on Google's part), there is a pattern. For example, for the half-hourly series that I looked at, the first trading day of every three-week increment uniformly has timestamp indexes running in the 1:15 neighborhood. This could be 1:13, 1:14, or 2:15; it all depends on the stock. I'm not sure what the 14th and 15th entries are: I suspect they are either daily summaries or after-hours trading info. The point is that there's no consistent pattern you can bank on. The first stamp in a trading day, sadly, does not always contain the opening data, and the same goes for the last entry and the closing data. I found that the only way to know what actually represents the trading data is to compare the numbers to the series on Google Finance.

After days of futilely trying to figure out how to pry a 1:1 mapping pattern from the data, I settled on a "ballpark" strategy. I scraped AAPL's data (a very heavily traded stock) and set its timestamp indexes within each trading day as the reference values for the entire market. All days had a minimum of 13 increments, corresponding to the 6.5-hour trading day, but some had 14 or 15; where this was the case I just truncated by taking the first 13 indexes. From there I used a while loop to progress through the downloaded data of each stock ticker and compare its timestamp indexes within a given trading day to the AAPL timestamps. I kept the overlap, gap-filled the missing data, and cut out the non-overlapping portions.
Sounds like a simple fix, but for low-volume stocks with sparse transaction data there were literally dozens of special cases I had to bake in, and lots of data to interpolate. I got some pretty bizarre results for a few of these that I know are incorrect. For high-volume, mid- and large-cap stocks, however, the solution worked brilliantly: for the most part the series synced up very neatly with the AAPL data and matched their Google Finance profiles perfectly.
There's no way around the fact that this method introduces some error, and I still need to fine-tune it for sparse small caps. That said, shifting a series by half an hour or gap-filling a single time increment introduces a very minor amount of error relative to the overall movement of the market and the stock. I am confident that the data set I have is "good enough" to let me get relevant answers to some of the questions I have. Getting this stuff commercially costs literally thousands of dollars.
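For what it's worth, the alignment idea described above can be sketched with xts: build the reference time grid from the high-volume series and merge each sparse series onto it, filling forward. The object names are purely illustrative and both inputs are assumed to be xts objects indexed by the decoded POSIXct timestamps:
library(xts)
align_to_reference <- function(sparse, ref) {
  grid    <- xts(, order.by = index(ref))        # empty xts carrying only the reference index
  aligned <- merge(grid, sparse, join = "left")  # keep only the reference timestamps
  na.locf(aligned)                               # gap-fill missing bars with the last observation
}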
Thoughts or suggestions?
Why not load the data from Quandl? E.g.
library(Quandl)
Quandl('YAHOO/AAPL')
Update: sorry, I have just realized that only daily data is fetched with Quandl, but I'll leave my answer here as Quandl is really easy to query in similar cases.
For the timezone offset, try:
as.POSIXct(1357828200, origin = '1970-01-01', tz=Sys.timezone(location = TRUE))
(The tz will automatically adjust according to your location)
