Rbbg - Reshaping Time Series Data in R

Is there a better way of reshaping dataframe data?
temp <- bdh(conn,c("AUDUSD Curncy","EURUSD Curncy"),"PX_LAST","20110101")
gives
head(temp)
ticker date PX_LAST
1 AUDUSD Curncy 2011-01-01 NA
2 AUDUSD Curncy 2011-01-02 NA
3 AUDUSD Curncy 2011-01-03 1.0205
4 AUDUSD Curncy 2011-01-04 1.0040
5 AUDUSD Curncy 2011-01-05 1.0014
6 AUDUSD Curncy 2011-01-06 0.9969
and
tail(temp)
ticker date PX_LAST
2127 EURUSD Curncy 2013-11-26 1.3557
2128 EURUSD Curncy 2013-11-27 1.3570
2129 EURUSD Curncy 2013-11-28 1.3596
2130 EURUSD Curncy 2013-11-29 1.3591
2131 EURUSD Curncy 2013-11-30 NA
2132 EURUSD Curncy 2013-12-01 NA
In other words, the series are simply stacked vertically, and further processing is needed before they are usable. How can I regroup these data into one column per ticker, i.e.
head(temp)
AUDUSD.Curncy EURUSD.Curncy
2011-01-01 NA NA
2011-01-02 NA NA
2011-01-03 1.0205 1.3375
2011-01-04 1.0040 1.3315
2011-01-05 1.0014 1.3183
2011-01-06 0.9969 1.3028
None of the reshaping questions I googled covered the kind of reshaping I wanted. I have implemented my own piecemeal solution (given below), but for learning's sake: is there a more elegant solution?

You could try read.zoo. Use index.column to specify which column holds the index/time, and split to specify the column to reshape by. The result is a zoo time series:
library(zoo)
z <- read.zoo(text = "ticker date PX_LAST
1 AUDUSD 2011-01-01 NA
2 AUDUSD 2011-01-02 NA
3 AUDUSD 2011-01-03 1.0205
4 AUDUSD 2011-01-04 1.0040
5 AUDUSD 2011-01-05 1.0014
6 AUDUSD 2011-01-06 0.9969
2127 EURUSD 2013-11-26 1.3557
2128 EURUSD 2013-11-27 1.3570
2129 EURUSD 2013-11-28 1.3596
2130 EURUSD 2013-11-29 1.3591
2131 EURUSD 2013-11-30 NA
2132 EURUSD 2013-12-01 NA", index.column = "date", split = "ticker")
z
# AUDUSD EURUSD
# 2011-01-01 NA NA
# 2011-01-02 NA NA
# 2011-01-03 1.0205 NA
# 2011-01-04 1.0040 NA
# 2011-01-05 1.0014 NA
# 2011-01-06 0.9969 NA
# 2013-11-26 NA 1.3557
# 2013-11-27 NA 1.3570
# 2013-11-28 NA 1.3596
# 2013-11-29 NA 1.3591
# 2013-11-30 NA NA
# 2013-12-01 NA NA
str(z)
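
For comparison, the same long-to-wide pivot can be done with base R's reshape(), no zoo required. A minimal sketch on a hypothetical cut of the temp data frame:

```r
# Hypothetical stand-in for the long-format temp returned by bdh()
temp <- data.frame(
  ticker  = rep(c("AUDUSD Curncy", "EURUSD Curncy"), each = 3),
  date    = rep(as.Date("2011-01-03") + 0:2, 2),
  PX_LAST = c(1.0205, 1.0040, 1.0014, 1.3375, 1.3315, 1.3183)
)
# Pivot: one row per date, one PX_LAST column per ticker
wide <- reshape(temp, idvar = "date", timevar = "ticker", direction = "wide")
# reshape() names the new columns "PX_LAST.<ticker>"; strip the prefix
names(wide) <- sub("^PX_LAST\\.", "", names(wide))
wide
```

Unlike read.zoo, this keeps a plain data frame, which may be preferable if you don't need time-series semantics downstream.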

This is exactly why we have created the RbbgExtension package. It is a wrapper around the Rbbg package that handles many issues when dealing with financial data - issues we have come across in our daily work with backtesting trading strategies etc. for a financial institution.
As you can see below, the output is an xts object; if the query spans multiple tickers and multiple fields, however, the output will be an array - you can read about why in the documentation.
We have made the package open source and publicly available on GitHub. Use the devtools function install_github("pgarnry/RbbgExtension") to install it. It has a few dependencies, including Rbbg.
> require(RbbgExtension)
Loading required package: RbbgExtension
>
> tickers <- c("AUDUSD", "EURUSD")
>
> prices <- HistData(tickers = tickers,
+ type = "Curncy",
+ fields = "PX_LAST",
+ startdate = "20110101")
R version 3.1.2 (2014-10-31)
rJava Version 0.9-6
Rbbg Version 0.5.3
Java environment initialized successfully.
Looking for most recent blpapi3.jar file...
Adding C:\blp\API\APIv3\JavaAPI\v3.7.1.1\lib\blpapi3.jar to Java classpath
Bloomberg API Version 3.7.1.1
> class(prices)
[1] "xts" "zoo"
> head(prices)
AUDUSD EURUSD
2011-01-03 1.0168 1.3361
2011-01-04 1.0051 1.3308
2011-01-05 0.9995 1.3149
2011-01-06 0.9944 1.3003
2011-01-07 0.9959 1.2907
2011-01-10 0.9956 1.2951
> tail(prices)
AUDUSD EURUSD
2015-01-26 0.7925 1.1238
2015-01-27 0.7937 1.1381
2015-01-28 0.7889 1.1287
2015-01-29 0.7762 1.1320
2015-01-30 0.7762 1.1291
2015-02-02 0.7806 1.1351

Rbbg's blh (now bdh) returns this awkward stacked format; the wrapper below outputs the time series correctly.
library(xts)

bdhx <- function(conn, securities, start_date, end_date = NULL,
                 fields = "PX_LAST", override_fields = NULL, overrides = NULL) {
  temp <- bdh(conn = conn, securities = securities, fields = fields,
              start_date = start_date, end_date = end_date,
              override_fields = override_fields)
  if (colnames(temp)[1] == "date") {
    # single security: bdh already returns one row per date
    res <- as.xts(temp)[, -1]
    colnames(res) <- securities
  } else {
    # multiple securities: split the stacked result and merge by date
    cn <- unique(temp[, 1])
    fil <- temp[, 1] == cn[1]
    res <- xts(temp[fil, 3], as.Date(temp[fil, 2]))
    colnames(res) <- securities[1]
    for (i in seq_along(cn)[-1]) {
      fil <- temp[, 1] == cn[i]
      temp2 <- xts(temp[fil, 3], as.Date(temp[fil, 2]))
      colnames(temp2) <- securities[i]
      res <- merge.xts(res, temp2)
    }
  }
  res
}

Related

Dividing table without using split in R

I have a datatable for a time period of 21 days with data measured every 10 seconds which looks like
TimeStamp ActivePower CurrentL1 GeneratorRPM RotorRPM WindSpeed
2017-03-05 00:00:10 2183.650 1201.0 1673.90 NA 10.60
2017-03-05 00:00:20 2216.200 1224.0 1679.70 NA 11.00
2017-03-05 00:00:30 2176.500 1203.5 NA 16.05 11.90
---
2017-03-25 23:59:40 2024.20 1150.0 1687.00 16.15 10.35
2017-03-25 23:59:50 1959.05 1106.0 1661.15 15.90 8.65
2017-03-26 00:00:00 1820.55 1038.0 1665.70 15.80 9.20
I want to divide it into 30-minute blocks. My colleague said I shouldn't use the split function, since there can be timestamps with no data, and that I should instead construct the 30-minute intervals manually.
I have done this so far:
library(data.table)
library(dplyr)
library(tidyr)
datei <- file.choose()
data_csv <- fread(datei)
datatable1 <- as.data.table(data_csv)
datatable1 <- datatable1[turbine=="UTHA02",]
datatable1[, TimeStamp:=as.POSIXct(get("_time"), tz="UTC")]
setkey(datatable1, TimeStamp)
startdate <- datatable1[1,TimeStamp]
enddate <- datatable1[nrow(datatable1), TimeStamp]
durationForInterval <- 30*60 #in seconds
curr <- startdate
datatable1[TimeStamp >= curr & TimeStamp < curr + durationForInterval]
So I manually made a 30 minute interval duration and got the first interval
time ActivePower CurrentL1 GeneratorRPM RotorRPM WindSpeed
1: 2017-03-05 00:00:10 2183.65 1201.0 1673.90 NA 10.60
2: 2017-03-05 00:00:20 2216.20 1224.0 1679.70 NA 11.00
3: 2017-03-05 00:00:30 2176.50 1203.5 NA 16.05 11.90
4: 2017-03-05 00:00:40 2267.95 1256.5 1685.85 NA 10.60
5: 2017-03-05 00:00:50 2533.15 1408.0 1693.30 16.20 12.40
---
176: 2017-03-05 00:29:20 2750.35 1531.0 1694.40 16.20 11.45
177: 2017-03-05 00:29:30 2930.40 1630.5 1668.25 NA 12.65
178: 2017-03-05 00:29:40 2459.55 1367.0 1680.25 15.90 12.15
179: 2017-03-05 00:29:50 2713.80 1508.5 1681.15 16.20 12.25
180: 2017-03-05 00:30:00 2395.20 1333.0 1667.75 16.00 11.75
But I could only do it for the first interval and I don't know how to do it for the rest. Is there something I am missing, or am I overthinking this? Any help is appreciated!
This will create a column interval with a unique value for every 30 minutes.
datatable1[, interval := as.integer(TimeStamp, units = "secs") %/% (60L*30L)]
You could split on that column or use it for grouping operations.
split(datatable1, datatable1$interval) # or split(datatable1, by = "interval")
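
Once the interval column exists, per-block statistics reduce to a grouped aggregation. A minimal sketch on made-up wind data (column names borrowed from the question):

```r
library(data.table)

# Hypothetical sample: one hour of readings, one every 10 minutes
dt <- data.table(
  TimeStamp = as.POSIXct("2017-03-05 00:00:10", tz = "UTC") + seq(0, 3000, by = 600),
  WindSpeed = c(10.60, 11.00, 11.90, 10.60, 12.40, 11.45)
)
# Integer division of epoch seconds assigns each row to its 30-minute block
dt[, interval := as.integer(TimeStamp) %/% (60L * 30L)]
# One mean wind speed per block
dt[, .(meanWind = mean(WindSpeed, na.rm = TRUE)), by = interval]
```

Because the interval is derived from the timestamp itself, gaps in the data simply produce blocks with fewer rows (or no row at all) rather than breaking the grouping.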

Merging two as.POSIXct (digits.sec=2) timestamped datasets with irregular frequency - specific data issue

I am trying to merge two datasets which have irregular timestamp frequencies. I have followed an example in another post to try to make this work, but it still won't merge. The example I tried to follow is "adding data from an irregular time series to a time series with 5-min timesteps".
Data set 1
Words <- as.character(c("2016-08-30 15:04:51.97", "2016-08-30 15:04:53.70",
"2016-08-30 15:04:54.26", "2016-08-30 15:04:56.00",
"2016-08-30 15:04:56.55", "2016-08-30 15:04:58.29",
"2016-08-30 15:04:58.85", "2016-08-30 15:05:00.59",
"2016-08-30 15:05:01.15", "2016-08-30 15:05:02.89",
"2016-08-30 15:05:03.45", "2016-08-30 15:05:05.19",
"2016-08-30 15:05:05.75", "2016-08-30 15:05:07.49",
"2016-08-30 15:05:08.04"))
op <- options(digits.secs = 2)
op
Date <- as.POSIXct(Words)
TagID <- rep(2297.2, 15)
Xaxis <- as.numeric(c(13.738267, 13.76611, 13.728986, 13.70624, 13.722799,
13.696131, 13.707635, 13.683349, 13.688462, 13.690102,
13.67994, 13.680669, 13.684442, 13.676477, 13.678154))
Yaxis <- as.numeric(c(14.670887, 14.630401, 14.684383, 14.68586, 14.69338,
14.686517, 14.694365, 14.677797, 14.681285, 14.687439,
14.675471, 14.678207, 14.681899, 14.674103, 14.675745))
Zaxis <- as.numeric(c(10.106183, 10.198599, 10.075378, 10.057535, 10.054841,
10.049604, 10.042946, 10.057003, 10.054044, 10.043906,
10.058976, 10.054471, 10.050245, 10.059166, 10.057288))
Data1 <- data.frame(Date, TagID, Xaxis, Yaxis, Zaxis)
Dataset 2
Words2 <- as.character(c("2016-08-30 15:05:01.55", "2016-08-30 15:10:01.56"))
Date <- as.POSIXct(Words2)
Speed <- c(0.385031168, 0.389179907)
Direction <- c(239.5721794,229.063366)
Data2 <- data.frame(Date, Speed, Direction)
The merged datasets should look like this:
# Date TagID Xaxis Yaxis Zaxis Speed Direction
# 1: 2016-08-30 15:04:51.97 2297.2 13.73827 14.67089 10.10618 NA NA
# 2: 2016-08-30 15:04:53.70 2297.2 13.76611 14.63040 10.19860 NA NA
# 3: 2016-08-30 15:04:54.25 2297.2 13.72899 14.68438 10.07538 NA NA
# 4: 2016-08-30 15:04:56.00 2297.2 13.70624 14.68586 10.05753 NA NA
# 5: 2016-08-30 15:04:56.54 2297.2 13.72280 14.69338 10.05484 NA NA
# 6: 2016-08-30 15:04:58.28 2297.2 13.69613 14.68652 10.04960 NA NA
# 7: 2016-08-30 15:04:58.84 2297.2 13.70763 14.69436 10.04295 NA NA
# 8: 2016-08-30 15:05:00.58 2297.2 13.68335 14.67780 10.05700 NA NA
# 9: 2016-08-30 15:05:01.15 2297.2 13.68846 14.68129 10.05404 0.385031 239.5722
# 10: 2016-08-30 15:05:02.89 2297.2 13.69010 14.68744 10.04391 NA NA
# 11: 2016-08-30 15:05:03.45 2297.2 13.67994 14.67547 10.05898 NA NA
# 12: 2016-08-30 15:05:05.19 2297.2 13.68067 14.67821 10.05447 NA NA
# 13: 2016-08-30 15:05:05.75 2297.2 13.68444 14.68190 10.05025 NA NA
# 14: 2016-08-30 15:05:07.49 2297.2 13.67648 14.67410 10.05917 NA NA
# 15: 2016-08-30 15:05:08.03 2297.2 13.67815 14.67574 10.05729 NA NA
Convert the dataframes to datatables and merge them together:
#Merge datasets
library(data.table)
Data1.dt <- data.table(Data1, key="Date")[,Date2:=Date]
Data2.dt <- data.table(Data2)
NewData <- Data1.dt[Data2.dt, list(Date=Date2, Speed, Direction), roll=-Inf][
Data1.dt, list(Date, TagID, Xaxis, Yaxis, Zaxis, Speed, Direction)]
#does not work - error message but this worked in the other example
#Try again
Data1.dt <- data.table(Data1, key="Date")
Data2.dt <- data.table(Data2)
NewData2 <- Data1.dt[Data2.dt, on="Date", list(Date, Speed, Direction), roll=-Inf][
Data1.dt, list(Date, TagID, Xaxis, Yaxis, Zaxis, Speed, Direction)]
#Merges but does not carry the data with it
What am I missing to make it merge the two datasets and carry the data with it?
Note that the person who posted the original example had a similar problem, but that turned out to be a version issue; I am using a later version, so that shouldn't be the problem here.
R version 3.2.4 Revised (2016-03-16 r70336) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
data.table version 1.10.4
Thanks in advance for your help.
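
For what it's worth, one rolling-join pattern that produces the layout shown above is to match each Data2 reading to its nearest Data1 timestamp and then join the matches back onto Data1. A sketch on trimmed, hypothetical stand-ins for Data1/Data2 (the 2-second tolerance is an assumption, chosen so the second Data2 reading, which lies minutes outside Data1's range, stays unmatched):

```r
library(data.table)
options(digits.secs = 2)

# Trimmed, hypothetical stand-ins for the question's Data1 / Data2
Data1 <- data.frame(
  Date  = as.POSIXct(c("2016-08-30 15:05:00.59", "2016-08-30 15:05:01.15",
                       "2016-08-30 15:05:02.89")),
  TagID = 2297.2,
  Xaxis = c(13.683349, 13.688462, 13.690102)
)
Data2 <- data.frame(
  Date      = as.POSIXct(c("2016-08-30 15:05:01.55", "2016-08-30 15:10:01.56")),
  Speed     = c(0.385031168, 0.389179907),
  Direction = c(239.5721794, 229.063366)
)

Data1.dt <- data.table(Data1)[, Date1 := Date]  # keep Data1's own timestamp
Data2.dt <- data.table(Data2)

# For each Data2 reading, find the nearest Data1 timestamp
matched <- Data1.dt[Data2.dt, .(Date = Date1, Speed, Direction),
                    on = "Date", roll = "nearest"]
# Discard matches more than (say) 2 seconds away, so readings far outside
# Data1's range stay unmatched
matched <- matched[abs(as.numeric(Date) - as.numeric(Data2.dt$Date)) <= 2]
# Right join back onto Data1 so every Data1 row survives, with NA elsewhere
NewData <- matched[Data1.dt[, !"Date1"], on = "Date"]
```

The intermediate Date1 column is only there so the nearest Data1 timestamp survives the join (the join column otherwise takes Data2's values), which sidesteps the fractional-second exact-matching problem entirely.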

Calling a list of tickers in quantmod using R

I want to get some data from a list of Chinese stocks using quantmod.
The list is like below:
002705.SZ -- 002730.SZ (within this range some tickers do not correspond to any stock; for example, there is no stock called 002720.SZ)
300357.SZ -- 300402.SZ
603188.SS
603609.SS
603288.SS
603306.SS
603369.SS
I want to write a loop to run all these stocks to get the data from each of them and save them into one data frame.
This should get you started.
library(quantmod)
library(stringr) # for str_pad
stocks <- paste(str_pad(2705:2730,width=6,side="left",pad="0"),"SZ",sep=".")
get.stock <- function(s) {
  s <- try(Cl(getSymbols(s, auto.assign = FALSE)), silent = TRUE)
  if (inherits(s, "xts")) return(s)  # successful fetch
  NULL                               # fetch failed (e.g. no such ticker)
}
result <- do.call(cbind,lapply(stocks,get.stock))
head(result)
# X002705.SZ.Close X002706.SZ.Close X002707.SZ.Close X002708.SZ.Close X002709.SZ.Close X002711.SZ.Close X002712.SZ.Close X002713.SZ.Close
# 2014-01-21 15.25 27.79 NA 17.26 NA NA NA NA
# 2014-01-22 14.28 28.41 NA 16.56 NA NA NA NA
# 2014-01-23 13.65 27.78 33.62 15.95 19.83 NA 36.58 NA
# 2014-01-24 15.02 30.56 36.98 17.55 21.81 NA 40.24 NA
# 2014-01-27 14.43 31.26 40.68 18.70 23.99 26.34 44.26 NA
# 2014-01-28 14.18 30.01 44.75 17.66 25.57 28.97 48.69 NA
This takes advantage of the fact that getSymbols(...) returns either an xts object, or a character string with an error message if the fetch fails.
Note that cbind(...) for xts objects aligns according to the index, so it acts like merge(...).
This produces an xts object, not a data frame. To convert this to a data.frame, use:
result.df <- data.frame(date=index(result),result)

Add 1 business day to a date in R

I have a Date object in R and would like to add 1 business day to this date. If the result is a holiday, I would like the date to be incremented to the next non-holiday date. Let's assume I mean NYSE holidays. How can I do this?
Example:
mydate = as.Date("2013-12-24")
mydate + 1 #this is a holiday so I want this to roll over to the 26th instead
I might use a combo of timeDate::isBizday() and a roll=-Inf join to set up a data.table lookup calendar, like this:
library(data.table)
library(timeDate)
## Set up a calendar for 2013 & 2014
cal <- data.table(date=seq(from=as.Date("2013-01-01"), by=1, length=730),
key="date")
cal2 <- copy(cal)
cal2[,nextBizDay:=date+1]
cal2 <- cal2[isBizday(as.timeDate(nextBizDay)),]
cal <- cal2[cal,,roll=-Inf]
## Check that it works
x <- as.Date("2013-12-21")+1:10
cal[J(x),]
# date nextBizDay
# 1: 2013-12-22 2013-12-23
# 2: 2013-12-23 2013-12-24
# 3: 2013-12-24 2013-12-26
# 4: 2013-12-25 2013-12-26
# 5: 2013-12-26 2013-12-27
# 6: 2013-12-27 2013-12-30
# 7: 2013-12-28 2013-12-30
# 8: 2013-12-29 2013-12-30
# 9: 2013-12-30 2013-12-31
# 10: 2013-12-31 2014-01-01
## Or perhaps:
lu <- with(cal, setNames(nextBizDay, date))
lu[as.character(x[1:6])]
# 2013-12-22 2013-12-23 2013-12-24 2013-12-25 2013-12-26 2013-12-27
# "2013-12-23" "2013-12-24" "2013-12-26" "2013-12-26" "2013-12-27" "2013-12-30"
lubridate will not help you here, as it has no notion of business days.
At least two packages do, and they both have a financial bent:
RQuantLib has exchange calendars for many exchanges (but it is a pretty large package)
timeDate also has calendars
Both packages have decent documentation which will permit you to set this up from working examples.
A third option (for simple uses) is to just store a local calendar out a few years and use that.
Edit: Here is a quick RQuantLib example:
R> library(RQuantLib)
R> adjust(calendar="TARGET", dates=Sys.Date()+2:6, bdc = 0)
2013-12-22 2013-12-23 2013-12-24 2013-12-25 2013-12-26
"2013-12-23" "2013-12-23" "2013-12-24" "2013-12-27" "2013-12-27"
R>
It just moves the given day (from argument dates) forward to the next biz day.
holidayNYSE(year = getRmetricsOptions("currentYear")) lists the NYSE holidays; also check out isHoliday from the timeDate package.
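
Putting the timeDate pieces together, here is a small helper (a sketch; nextNyseBizDay is a made-up name) that adds one business day and rolls over weekends and NYSE holidays:

```r
library(timeDate)

# Step forward one calendar day at a time until we land on a NYSE business day
nextNyseBizDay <- function(d) {
  repeat {
    d <- d + 1
    hols <- holidayNYSE(as.integer(format(d, "%Y")))
    if (isBizday(as.timeDate(d), holidays = hols)) return(d)
  }
}

nextNyseBizDay(as.Date("2013-12-24"))  # skips Christmas, lands on 2013-12-26
```

For bulk use you would precompute the holiday calendar once rather than calling holidayNYSE() on every step, but the loop above keeps the logic explicit.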

Add months to IDate column of data.table in R

I have been using data.table for practically everything I was using data.frames for, as it is much, much faster on big in-memory data (several million rows). However, I'm not quite sure how to add days or months to an IDate column without using apply (which is very slow).
A minimal example:
dates = c("2003-01-01", "2003-02-01", "2003-03-01", "2003-06-01", "2003-12-01",
"2003-04-01", "2003-05-01", "2003-07-01", "2003-09-01", "2003-08-01")
dt = data.table(idate1=as.IDate(dates))
Now, let's say I want to create a column with dates 6 months ahead. Normally, for a single IDate, I would do this:
seq(dt$idate1[1],by="6 months",length=2)[2]
But this won't work as from= must be of length 1:
dt[,idate2:=seq(idate1,by="6 months",length=2)[2]]
Is there an efficient way of doing it to create column idate2 in dt?
Thanks a lot,
RR
One way is to use the mondate package: add the months, then convert back to an IDate object.
require(mondate)
dt = data.table(idate1=as.IDate(dates))
dt[, idate2 := as.IDate(mondate(as.Date(idate1)) + 6)]
# idate1 idate2
# 1: 2003-01-01 2003-07-01
# 2: 2003-02-01 2003-08-02
# 3: 2003-03-01 2003-09-01
# 4: 2003-06-01 2003-12-02
# 5: 2003-12-01 2004-06-01
# 6: 2003-04-01 2003-10-02
# 7: 2003-05-01 2003-11-01
# 8: 2003-07-01 2004-01-01
# 9: 2003-09-01 2004-03-02
# 10: 2003-08-01 2004-02-01
There may well be better solutions, though.
You can use lubridate's %m+% operator:
library(lubridate)
dt[, idate2 := as.IDate(idate1 %m+% months(6))]
idate1 idate2
1: 2003-01-01 2003-07-01
2: 2003-02-01 2003-08-01
3: 2003-03-01 2003-09-01
4: 2003-06-01 2003-12-01
5: 2003-12-01 2004-06-01
6: 2003-04-01 2003-10-01
7: 2003-05-01 2003-11-01
8: 2003-07-01 2004-01-01
9: 2003-09-01 2004-03-01
10: 2003-08-01 2004-02-01
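
For completeness, a dependency-free base-R sketch: since seq() only accepts a length-1 from=, loop over indices and recombine with c() (slower than the packages above, but fine for moderate data):

```r
dates <- as.Date(c("2003-01-01", "2003-02-01", "2003-12-01"))  # toy input

# seq.Date shifts by calendar months; do.call(c, ...) keeps the Date class
plus6m <- do.call(c, lapply(seq_along(dates), function(i)
  seq(dates[i], by = "6 months", length.out = 2)[2]))
plus6m
# [1] "2003-07-01" "2003-08-01" "2004-06-01"
```

Note this matches lubridate's day-preserving behaviour rather than mondate's day-count arithmetic; wrap the result in as.IDate() to assign it back into the data.table column.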