Sum xts elements on a list row by row - r

I have an xts object called data which contains 5-minute returns for the period from 2015-01-01 17:00:00 to 2015-12-31 17:00:00. Each trading day starts at 17:00:00 and finishes the next day at the same time, for a total of 288 intraday returns per day [(24 hours * 60 minutes) / 5 minutes = 288]. The first and last returns are:
head(data, 5)
DPRICE
2015-01-01 17:00:00 0.000000e+00
2015-01-01 17:05:00 9.797714e-05
2015-01-01 17:10:00 2.027022e-04
2015-01-01 17:15:00 2.735798e-04
2015-01-01 17:20:00 7.768653e-05
tail(data, 5)
DPRICE
2015-12-31 16:40:00 0.0001239429
2015-12-31 16:45:00 0.0001272704
2015-12-31 16:50:00 0.0010186764
2015-12-31 16:55:00 0.0006841370
2015-12-31 17:00:00 0.0002481227
I am trying to standardize the data by their average absolute value for each 5-minute intra-day interval, following McMillan and Speight, "Daily FX Volatility Forecasts" (2012).
The mathematical formula is:
My code is:
library(xts)
std_data = abs(data) #create absolute returns
D <- split(std_data, "days") #splits data to days
mts.days <- lapply(seq_along(D) - 1, function(i) {
if (i > 0) rbind(D[[i]]["T17:00:00/T23:55:00"], D[[i + 1]]["T00:00:00/T16:55:00"])
}) #creates a list with 365 elements each containing 288 unique returns
dummy = mapply(sum, mts.days) #add the first,second... observations from each element
With this code I create a list with 365 xts elements each having dimensions
> dim(mts.days[[2]])
[1] 288 1
I want to add the same observations from each element to create the denominator of the function above.

I don't understand your request, but will give it a shot nevertheless.
## generate bogus data
library(quantmod)
set.seed(123)
ndays <- 3
ndatperday <- 288
data <- cumsum(do.call("rbind", lapply(13:15, function(dd){
xts(rnorm(ndatperday)/1e4,
seq(as.POSIXct(paste0("2016-08-",dd," 17:00:00")),
length = ndatperday, by = 300))
})))
colnames(data) <- "DPRICE"
## calculate percentage returns
ret <- ROC(data, type="discrete")
## this is probably not what you need: returns divided by the overall mean
ret/mean(abs(ret), na.rm=TRUE)
## I suspect indeed that you need returns divided by the daily mean return
library(dplyr)
ret.df <- data.frame(ret)
## create a factor identifying the 3 days of bogus data
ret.df$day <- rep(paste0("2016-08-",13:15),each=ndatperday)
## compute daily mean return
dail <- ret.df %>%
group_by(day) %>%
summarise(mean=mean(abs(DPRICE), na.rm=TRUE))
## attach daily mean returns to the days they actually are associated to
ret.df <- ret.df %>% left_join(dail)
## normalize
ret.df$DPRICE <- ret.df$DPRICE/ret.df$mean
%%%%%%%%%
Second shot: after reading the paper (http://onlinelibrary.wiley.com/doi/10.1002/for.1222/full) I might have understood what you were after:
library(quantmod)
library(dplyr)
set.seed(123)
## generate bogus 5-min series
ndays <- 365
ndatperday <- 288
data <- as.xts(zoo(0.1+cumsum(rt(ndays*ndatperday, df=3))/1e4,
seq(as.POSIXct("2015-01-01 17:00"),
as.POSIXct("2015-12-31 17:00"), by=300)))
colnames(data) <- "DPRICE"
## calculate 5-min percentage returns
ret <- ROC(data, type="discrete")
## create a factor identifying the 5-minute intra-day interval
ret.df <- as.data.frame(ret)
ret.df$intra5 <- strftime(index(ret), format="%H:%M")
## compute mean returns (over the year) for each of the 288 5-minute intra-day intervals
dail <- ret.df %>%
group_by(intra5) %>%
summarise(mean=mean(abs(DPRICE), na.rm=TRUE))
## attach mean returns to each datapoint
ret.df <- ret.df %>% left_join(dail)
## normalize
ret.df$DPRICE <- ret.df$DPRICE/ret.df$mean
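The same idea can be sketched in base R alone. This is only a minimal illustration under stated assumptions: the returns are simulated, the time zone is set to UTC, and the names times, slot and std_ret are made up for the example. Each return is divided by the mean absolute return of its own 5-minute slot, taken across all days.

```r
set.seed(123)
n_days <- 5        # assumption: a few simulated days instead of a full year
n_per_day <- 288   # 5-minute intervals per 24-hour trading day

# regular 5-minute grid starting at 17:00 (time zone is an assumption)
times <- seq(as.POSIXct("2015-01-01 17:00:00", tz = "UTC"),
             by = 300, length.out = n_days * n_per_day)
ret <- rnorm(length(times), sd = 1e-4)   # simulated 5-minute returns

# label each observation with its intra-day slot, e.g. "17:05"
slot <- format(times, "%H:%M")

# mean absolute return of each slot, repeated for every row of that slot
slot_mean_abs <- ave(abs(ret), slot, FUN = mean)

# standardized returns
std_ret <- ret / slot_mean_abs
```

After this step the absolute standardized returns average to 1 within every slot, which is a quick sanity check on the grouping.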

Related

R filtering/selecting data by POSIXct time and a condition

I have measured temperature at a high time resolution of 10 minutes on different urban tree species whose reactions should be compared, and I am especially interested in periods of heat. The task I fail to do on my dataset is to select complete days based on a maximum value, e.g. every day with at least one measurement above 30 °C should be subsetted from my data frame in full.
Below is a reproducible example that should illustrate my problem:
In my Measurings data frame I have calculated a column indicating whether each individual measurement is above or below 30 °C. I wanted to use that column to tell other functions whether they should pick a day or not when producing a new data frame. If the value is above 30 °C at any time of the day, I want to include that whole day, from 00:00 to 23:59, in the new data frame for further analyses.
start <- as.POSIXct("2018-05-18 00:00", tz = "CET")
tseq <- seq(from = start, length.out = 1000, by = "hours")
Measurings <- data.frame(
Time = tseq,
Temp = sample(20:35,1000, replace = TRUE),
Variable1 = sample(1:200,1000, replace = TRUE),
Variable2 = sample(300:800,1000, replace = TRUE)
)
Measurings$heat30 <- ifelse(Measurings$Temp > 30,"heat", "normal")
Measurings$otheroption30 <- ifelse(Measurings$Temp > 30,"1", "0")
The example yields a data frame analogous to the structure of my data:
head(Measurings)
Time Temp Variable1 Variable2 heat30 otheroption30
1 2018-05-18 00:00:00 28 56 377 normal 0
2 2018-05-18 01:00:00 23 65 408 normal 0
3 2018-05-18 02:00:00 29 78 324 normal 0
4 2018-05-18 03:00:00 24 157 432 normal 0
5 2018-05-18 04:00:00 32 129 794 heat 1
6 2018-05-18 05:00:00 25 27 574 normal 0
So how do I subset to get a new data frame containing every day on which at least one entry is marked as "heat"?
I know that, for example, dplyr::filter could filter the individual entries (row 5 in the head of the example). But how can I tell it to take the whole day 2018-05-18?
I am quite new to analyzing data with R, so I would appreciate any suggestions for a working solution to my problem. dplyr is what I have been using for quite a few tasks, but I am open to whatever works.
Thanks a lot, Konrad
Create a variable that specifies the day (dropping hours, minutes, etc.). Then iterate over the unique dates and keep only those subsets whose heat30 column contains "heat" at least once:
library(dplyr)
Measurings <- Measurings %>% mutate(Time2 = format(Time, "%Y-%m-%d"))
newdf <- lapply(unique(Measurings$Time2), function(x){
rr <- Measurings %>% filter(Time2 == x) # select date x
# keep the day only if its heat30 vector contains "heat" at least once;
# days without heat return NULL, which bind_rows() drops
if(any(rr$heat30 == "heat")) rr else NULL
}) %>% bind_rows()
Below is one possible solution using the dataset provided in the question. Please note that this is not a great example as all days will probably include at least one observation marked as over 30 °C (i.e. there will be no days to filter out in this dataset but the code should do the job with the actual one).
# import packages
library(dplyr)
library(stringr)
# break the time stamp into Day and Hour
time_df <- as_data_frame(str_split(Measurings$Time, " ", simplify = T))
# name the columns
names(time_df) <- c("Day", "Hour")
# create a new measurement data frame with separate Day and Hour columns
new_measurings_df <- bind_cols(time_df, Measurings[-1])
# form the new data frame by filtering the days marked as heat
new_df <- new_measurings_df %>%
filter(Day %in% new_measurings_df$Day[new_measurings_df$heat30 == "heat"])
To be more precise, you are creating a random sample of 1000 hourly observations with temperatures between 20 and 35, spread across about 42 days. As a result, it is very likely that every single day will have at least one observation marked as over 30 °C in your example. Additionally, it is always good practice to set a seed to ensure reproducibility.
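A grouped filter is another way to express "keep the whole day if any reading exceeds the threshold". The sketch below is only an illustration: the seed, the reduced set of columns, the Day column and the threshold of 34 are assumptions made so that some days actually drop out of the simulated data.

```r
library(dplyr)

set.seed(42)  # fixed seed so the simulated data are reproducible
start <- as.POSIXct("2018-05-18 00:00", tz = "CET")
Measurings <- data.frame(
  Time = seq(from = start, length.out = 1000, by = "hour"),
  Temp = sample(20:35, 1000, replace = TRUE)
)

threshold <- 34  # assumption: raised from 30 for illustration only

hot_days <- Measurings %>%
  mutate(Day = as.Date(Time, tz = "CET")) %>%  # calendar day of each reading
  group_by(Day) %>%
  filter(any(Temp > threshold)) %>%            # keep all rows of qualifying days
  ungroup()
```

Because filter() is evaluated per group, any() looks at one day at a time and the rows of a day are kept or dropped together.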

Calculate difference between timestamps

I am trying to find the difference between two timestamps.
The code:
survey <- data.frame(date=c("07/2012","07/2012"),tx_start=c("01/2012","01/2012"))
survey$date_diff <- as.Date(as.character(survey$date), format="%m/%Y")-
as.Date(as.character(survey$tx_start), format="%m/%Y")
survey
I expected the new column to hold the difference, but I get NA.
The results:
> survey
date tx_start date_diff
1 07/2012 01/2012 NA days
2 07/2012 01/2012 NA days
What should I use instead of as.Date to handle month/year values?
Update based on comment of Gregor:
> survey <- data.frame(date=c("07/2012","07/2012"),tx_start=c("01/2012","01/2012"))
> survey$date <- as.Date(paste0("01/", as.character(survey$date)), "%d/%m/%Y")
> survey$tx_start <- as.Date(paste0("01/", as.character(survey$tx_start)), "%d/%m/%Y")
> survey$date_diff <- as.Date(survey$date, format="%d/%m/%Y")-
+ as.Date(survey$tx_start, format="%d/%m/%Y")
> survey
date tx_start date_diff
1 2012-07-01 2012-01-01 182 days
2 2012-07-01 2012-01-01 182 days
I usually convert my dates to POSIXct format. Subtracting two POSIXct values directly returns a difftime object whose units are chosen automatically; the difftime() function in base R lets you set the units explicitly:
survey <- data.frame(date=c("07/2012","07/2012"),tx_start=c("01/2012","01/2012"))
# Dates are finicky, add a day so that conversion will work
survey$date2 <- paste0("01/",survey$date)
survey$tx_start2 <- paste0("01/",survey$tx_start)
# conversion
survey$date2 <- as.POSIXct(x=survey$date2,format="%d/%m/%Y")
survey$tx_start2 <- as.POSIXct(x=survey$tx_start2,format="%d/%m/%Y")
# take the difference
survey$date_diff <- with(survey,difftime(time1=date2,time2=tx_start2,units="hours"))
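If the goal is a difference in months rather than days or hours, one possible approach is to count calendar months from the POSIXlt fields. The sketch below is an illustration only; the variable names are made up, and prepending day 01 is the same workaround used in the thread, needed because a "%m/%Y" string carries no day and as.Date() typically returns NA for it.

```r
survey <- data.frame(date     = c("07/2012", "07/2012"),
                     tx_start = c("01/2012", "01/2012"))

# prepend a day so the strings describe complete dates
end_date   <- as.Date(paste0("01/", survey$date),     format = "%d/%m/%Y")
start_date <- as.Date(paste0("01/", survey$tx_start), format = "%d/%m/%Y")

# difference in days
days_diff <- as.numeric(end_date - start_date)

# difference in whole calendar months from the POSIXlt year and month fields
lt_end   <- as.POSIXlt(end_date)
lt_start <- as.POSIXlt(start_date)
months_diff <- 12 * (lt_end$year - lt_start$year) + (lt_end$mon - lt_start$mon)

survey$days_diff   <- days_diff    # 182 for both rows, as in the update above
survey$months_diff <- months_diff  # 6 for both rows
```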

Create 10,000 date data.frames with fake years based on 365 days window

Here my time period range:
start_day = as.Date('1974-01-01', format = '%Y-%m-%d')
end_day = as.Date('2014-12-21', format = '%Y-%m-%d')
df = as.data.frame(seq(from = start_day, to = end_day, by = 'day'))
colnames(df) = 'date'
I need to create 10,000 data.frames, each with different fake years of 365 days. This means that each of the 10,000 data.frames needs to have a different start and end of year.
In total df has 14,965 days, which divided by 365 gives 41 years. In other words, df needs to be grouped 10,000 times differently into 41 years (of 365 days each).
The start of each year has to be random, so it can be 1974-10-03, 1974-08-30, 1976-01-03, etc., and the remaining dates at the end of df need to be recycled from the start.
The grouped fake years need to appear in a 3rd col of the data.frames.
I would put all the data.frames into a list but I don't know how to create the function which generates 10,000 different year's start dates and subsequently group each data.frame with a 365 days window 41 times.
Can anyone help me?
@gringer gave a good answer but it solved only 90% of the problem:
dates.df <- data.frame(replicate(10000, seq(sample(df$date, 1),
length.out=365, by="day"),
simplify=FALSE))
colnames(dates.df) <- 1:10000
What I need is 10,000 columns with 14,965 rows made by dates taken from df which need to be eventually recycled when reaching the end of df.
I tried to change length.out = 14965 but R does not recycle the dates.
Another option could be to change length.out = 1 and eventually add the remaining df rows for each column by maintaining the same order:
dates.df <- data.frame(replicate(10000, seq(sample(df$date, 1),
length.out=1, by="day"),
simplify=FALSE))
colnames(dates.df) <- 1:10000
How can I add the remaining df rows to each col?
The seq method also works if the to argument is unspecified, so it can be used to generate a specific number of days starting at a particular date:
> seq(from=df$date[20], length.out=10, by="day")
[1] "1974-01-20" "1974-01-21" "1974-01-22" "1974-01-23" "1974-01-24"
[6] "1974-01-25" "1974-01-26" "1974-01-27" "1974-01-28" "1974-01-29"
When used in combination with replicate and sample, I think this will give what you want in a list:
> replicate(2,seq(sample(df$date, 1), length.out=10, by="day"), simplify=FALSE)
[[1]]
[1] "1985-07-24" "1985-07-25" "1985-07-26" "1985-07-27" "1985-07-28"
[6] "1985-07-29" "1985-07-30" "1985-07-31" "1985-08-01" "1985-08-02"
[[2]]
[1] "2012-10-13" "2012-10-14" "2012-10-15" "2012-10-16" "2012-10-17"
[6] "2012-10-18" "2012-10-19" "2012-10-20" "2012-10-21" "2012-10-22"
Without the simplify=FALSE argument, it produces an array of integers (i.e. R's internal representation of dates), which is a bit trickier to convert back to dates. A slightly more convoluted way to do this and still produce Date output is to use data.frame on the unsimplified replicate result. Here's an example that will produce a 10,000-column data frame with 365 dates in each column (takes about 5s to generate on my computer):
dates.df <- data.frame(replicate(10000, seq(sample(df$date, 1),
length.out=365, by="day"),
simplify=FALSE));
colnames(dates.df) <- 1:10000;
> dates.df[1:5,1:5];
1 2 3 4 5
1 1988-09-06 1996-05-30 1987-07-09 1974-01-15 1992-03-07
2 1988-09-07 1996-05-31 1987-07-10 1974-01-16 1992-03-08
3 1988-09-08 1996-06-01 1987-07-11 1974-01-17 1992-03-09
4 1988-09-09 1996-06-02 1987-07-12 1974-01-18 1992-03-10
5 1988-09-10 1996-06-03 1987-07-13 1974-01-19 1992-03-11
To get the date wraparound working, a slight modification can be made to the original data frame, pasting a copy of itself on the end:
df <- as.data.frame(c(seq(from = start_day, to = end_day, by = 'day'),
seq(from = start_day, to = end_day, by = 'day')));
colnames(df) <- "date";
This is easier to code for downstream; the alternative being a double seq for each result column with additional calculations for the start/end and if statements to deal with boundary cases.
Now, instead of doing date arithmetic, the result columns are subsets of the original data frame (where the arithmetic is already done): each one starts at a date in the first half of the frame and takes the next 14965 values. I'm using nrow(df)/2 instead of the literal number to keep the code more generic:
dates.df <-
as.data.frame(lapply(sample.int(nrow(df)/2, 10000),
function(startPos){
df$date[startPos:(startPos+nrow(df)/2-1)];
}));
colnames(dates.df) <- 1:10000;
>dates.df[c(1:5,(nrow(dates.df)-5):nrow(dates.df)),1:5];
1 2 3 4 5
1 1988-10-21 1999-10-18 2009-04-06 2009-01-08 1988-12-28
2 1988-10-22 1999-10-19 2009-04-07 2009-01-09 1988-12-29
3 1988-10-23 1999-10-20 2009-04-08 2009-01-10 1988-12-30
4 1988-10-24 1999-10-21 2009-04-09 2009-01-11 1988-12-31
5 1988-10-25 1999-10-22 2009-04-10 2009-01-12 1989-01-01
14960 1988-10-15 1999-10-12 2009-03-31 2009-01-02 1988-12-22
14961 1988-10-16 1999-10-13 2009-04-01 2009-01-03 1988-12-23
14962 1988-10-17 1999-10-14 2009-04-02 2009-01-04 1988-12-24
14963 1988-10-18 1999-10-15 2009-04-03 2009-01-05 1988-12-25
14964 1988-10-19 1999-10-16 2009-04-04 2009-01-06 1988-12-26
14965 1988-10-20 1999-10-17 2009-04-05 2009-01-07 1988-12-27
This takes a bit less time now, presumably because the date values have been pre-calculated.
Try this one, using subsetting instead:
start_day = as.Date('1974-01-01', format = '%Y-%m-%d')
end_day = as.Date('2014-12-21', format = '%Y-%m-%d')
date_vec <- seq.Date(from=start_day, to=end_day, by="day")
Now, I create a vector long enough so that I can use easy subsetting later on:
date_vec2 <- rep(date_vec,2)
Now, create the random start dates for 100 instances (replace this with 10000 for your application):
random_starts <- sample(1:14965, 100)
Now, create a list of dates by simply subsetting date_vec2 with your desired length:
dates <- lapply(random_starts, function(x) date_vec2[x:(x+14964)])
date_df <- data.frame(dates)
names(date_df) <- 1:100
date_df[1:5,1:5]
1 2 3 4 5
1 1997-05-05 2011-12-10 1978-11-11 1980-09-16 1989-07-24
2 1997-05-06 2011-12-11 1978-11-12 1980-09-17 1989-07-25
3 1997-05-07 2011-12-12 1978-11-13 1980-09-18 1989-07-26
4 1997-05-08 2011-12-13 1978-11-14 1980-09-19 1989-07-27
5 1997-05-09 2011-12-14 1978-11-15 1980-09-20 1989-07-28
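The question also asks for a column that groups each recycled sequence into fake years of 365 days. A minimal sketch for a single random start follows; the seed and the names start_pos, idx and fake are illustrative assumptions, and modular indexing is used here in place of the doubled vector to wrap around the end of the date range.

```r
start_day <- as.Date("1974-01-01")
end_day   <- as.Date("2014-12-21")
date_vec  <- seq(start_day, end_day, by = "day")
n <- length(date_vec)                 # 14965 days = 41 * 365

set.seed(1)                           # assumption: fixed seed for the example
start_pos <- sample.int(n, 1)         # random position of the first fake year

# positions start_pos, start_pos + 1, ..., wrapping back to 1 after n
idx <- ((start_pos - 1 + 0:(n - 1)) %% n) + 1

fake <- data.frame(
  date      = date_vec[idx],
  fake_year = rep(seq_len(n / 365), each = 365)  # 1..41, 365 rows each
)
```

Wrapping this in lapply() over 10,000 sampled start positions would give the list of data frames described in the question.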

Convert continuous time-series data into daily-hourly representation using R

I have time-series data in xts representation as
library(xts)
xtime <-timeBasedSeq('2015-01-01/2015-01-30 23')
df <- xts(rnorm(length(xtime),30,4),xtime)
Now I want to calculate the correlation between different days, and hence I want to represent df in matrix form as:
To achieve this I used
p_mat= split(df,f="days",drop=FALSE,k=1)
Using this I get a list of days, but I am not able to arrange this list in matrix form. Also I used
p_mat<- df[.indexday(df) %in% c(1:30) & .indexhour(df) %in% c(1:24)]
With this I do not get any output.
Also I tried to use rollapply(), but was not able to arrange it properly.
Could I get help forming the matrix using xts/zoo objects?
Maybe you could use something like this:
#convert to a data.frame with an hour column and a day column
df2 <- data.frame(value = df,
hour = format(index(df), '%H:%M:%S'),
day = format(index(df), '%Y:%m:%d'),
stringsAsFactors=FALSE)
#then use xtabs which outputs a matrix in the format you need
tab <- xtabs(value ~ day + hour, df2)
Output:
hour
day 00:00:00 01:00:00 02:00:00 03:00:00 04:00:00 05:00:00 06:00:00 07:00:00 08:00:00 09:00:00 10:00:00 11:00:00 12:00:00
2015:01:01 28.15342 35.72913 27.39721 29.17048 28.42877 28.72003 28.88355 31.97675 29.29068 27.97617 35.37216 29.14168 29.28177
2015:01:02 23.85420 28.79610 27.88688 27.39162 29.77241 22.34256 34.70633 23.34011 28.14588 25.53632 26.99672 38.34867 30.06958
2015:01:03 37.47716 31.70040 29.04541 34.23393 33.54569 27.52303 38.82441 28.97989 24.30202 29.42240 30.83015 39.23191 30.42321
2015:01:04 24.13100 32.08409 29.36498 35.85835 26.93567 28.27915 26.29556 29.29158 31.60805 27.07301 33.32149 25.16767 25.80806
2015:01:05 32.16531 29.94640 32.04043 29.34250 31.68278 28.39901 24.51917 33.95135 36.07898 28.76504 24.98684 32.56897 29.82116
2015:01:06 18.44432 27.43807 32.28203 29.76111 29.60729 32.24328 25.25417 34.38711 29.97862 32.82924 34.13643 30.89392 26.48517
2015:01:07 34.58491 20.38762 32.29096 31.49890 28.29893 33.80405 28.44305 28.86268 33.42964 36.87851 31.08022 28.31126 25.24355
2015:01:08 33.67921 31.59252 28.36989 35.29703 27.19507 27.67754 25.99571 27.32729 33.78074 31.73481 34.02064 28.43953 31.50548
2015:01:09 28.46547 36.61658 36.04885 30.33186 32.26888 25.90181 31.29203 34.17445 30.39631 28.18345 27.37687 29.85631 34.27665
2015:01:10 30.68196 26.54386 32.71692 28.69160 23.72367 28.53020 35.45774 28.66287 32.93100 33.78634 30.01759 28.59071 27.88122
2015:01:11 32.70907 31.51985 29.22881 36.31157 32.38494 25.30569 29.37743 22.32436 29.21896 19.63069 35.25601 27.45783 28.28008
2015:01:12 29.96676 30.51542 29.41650 29.34436 37.05421 33.05035 34.44572 26.30717 30.65737 34.61930 29.77391 21.48256 31.37938
2015:01:13 33.46089 34.29776 37.58262 27.58801 28.43653 28.33511 28.49737 28.53348 28.81729 35.76728 27.20985 28.44733 32.61015
2015:01:14 22.96213 32.27889 36.44939 23.45088 26.88173 27.43529 27.27547 21.86686 32.00385 23.87281 29.90001 32.37194 29.20722
2015:01:15 28.30359 30.94721 20.62911 33.84679 27.58230 26.98849 23.77755 24.18443 30.22533 32.03748 21.60847 25.98255 32.14309
2015:01:16 23.52449 29.56138 31.76356 35.40398 24.72556 31.45754 30.93400 34.77582 29.88836 28.57080 25.41274 27.93032 28.55150
2015:01:17 25.56436 31.23027 25.57242 31.39061 26.50694 30.30921 28.81253 25.26703 30.04517 33.96640 36.37587 24.50915 29.00156
...and so on
Here's one way to do it using a helper function that will account for days that do not have 24 observations.
library(xts)
xtime <- timeBasedSeq('2015-01-01/2015-01-30 23')
set.seed(21)
df <- xts(rnorm(length(xtime),30,4), xtime)
tHourly <- function(x) {
# initialize result matrix for all 24 hours
dnames <- list(format(index(x[1]), "%Y-%m-%d"),
paste0("H", 0:23))
res <- matrix(NA, 1, 24, dimnames = dnames)
# transpose day's rows and set colnames
tx <- t(x)
colnames(tx) <- paste0("H", .indexhour(x))
# update result object and return
res[,colnames(tx)] <- tx
res
}
# split on days, apply tHourly to each day, rbind results
p_mat <- split(df, f="days", drop=FALSE, k=1)
p_list <- lapply(p_mat, tHourly)
p_hmat <- do.call(rbind, p_list)
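Once the data sit in a day-by-hour matrix, the between-day correlation asked about in the question can be obtained by correlating the rows. The sketch below uses base R only and simulated hourly data; the UTC time zone and the names times, mat and day_cor are assumptions for illustration, and tapply() with mean is used simply because each day/hour cell holds exactly one value here.

```r
set.seed(21)
# 30 days of hourly observations (time zone is an assumption)
times <- seq(as.POSIXct("2015-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = 30 * 24)
vals <- rnorm(length(times), 30, 4)

day  <- format(times, "%Y-%m-%d")
hour <- format(times, "%H")

# 30 x 24 matrix: one row per day, one column per hour
mat <- tapply(vals, list(day, hour), mean)

# cor() correlates columns, so transpose to correlate days with each other
day_cor <- cor(t(mat))   # 30 x 30 matrix
```

If some days have missing hours, cor() accepts use = "pairwise.complete.obs" to skip the missing pairs.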

R xts irregular time series need regular 5 minute intervals but only for trading days

I have an irregular time series of all trades of a given ETF over a span of 4 years:
> head(BKF.xts)
BKF.xts
2008-01-02 09:30:01 59.870
2008-01-02 09:38:04 59.710
2008-01-02 09:39:51 59.612
2008-01-02 09:51:16 59.640
2008-01-02 10:06:08 59.500
> tail(BKF.xts)
BKF.xts
2011-12-30 15:59:23 36.26
2011-12-30 15:59:53 36.26
2011-12-30 15:59:56 36.27
2011-12-30 15:59:57 36.27
2011-12-30 15:59:58 36.27
2011-12-30 16:00:00 36.33
What I would like is to have the prices at every 5-minute interval for ALL trading days. Because I am dealing with ETFs, it's possible that on some dates the market was open but the ETF did not trade, so there will be no data for that date in my sample. However, I need my final time series to account for all trading days. I have downloaded daily data for the same period so that I have another time series of every trading day. Not sure if that helps.
Also, if there is no trade at a particular 5-minute time stamp, I would like the price of the most recent trade that took place. So for the data I posted above, what I would want is:
> head(BKF.xts)
BKF.xts
2008-01-02 09:35:00 59.870
2008-01-02 09:40:00 59.612
2008-01-02 09:45:00 59.612
2008-01-02 09:50:00 59.640
2008-01-02 09:55:00 59.640
Any help is greatly appreciated.
As mentioned in a previous question,
you can use to.period to have the last value in each 5-minute period,
align.time to replace the timestamps with the end of each period,
cbind to add the missing periods (with a missing value)
and na.locf to replace the missing values.
# Sample data
library(quantmod)
days <- seq(Sys.Date(), by=1, length=20)
days <- days[ ! format(days, "%A") %in% c("Saturday", "Sunday") ]
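timestamps <- ISOdatetime(
# date parts taken with base R, so no extra package is needed for year(), month(), day()
as.integer(format(days, "%Y")), as.integer(format(days, "%m")), as.integer(format(days, "%d")),
9, 0, 0 # You may want/need to add the timezone
)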
timestamps <- timestamps[-2]
x <- lapply(timestamps, function(u) sort(u + sample(60*60*8,200)))
x <- do.call(c, x)
x <- xts(rnorm(length(x)), x)
# Value at the end of each 5-minute period
y <- to.minutes5(x)
y <- Cl(y)
y <- align.time(y, 5*60)
# All 5-minute periods, between 9am (excluded)
# and 5pm (included) for each day
z <- lapply(timestamps, function(u) u + 5*60*(1:(12*8)))
z <- do.call(c, z)
z <- cbind(y, xts(, z))
# Fill in missing values
z <- na.locf(z)
Thanks, I actually figured it out on my own after enough trial and error and discovering the xts subset function. Here's what I did:
#BKF here is my data set
BKF<-xts(BKF$PRICE,order.by=BKF$DATE)
colnames(BKF)=c("Price")
BKF<-to.minutes5(BKF)
BKF<-align.time(BKF,5*60)
#create a regular time series that has values for each 5 minute interval and use cbind to merge with my data
tmp<-xts(,seq.POSIXt(start(BKF),end(BKF),by="5 mins"))
BKF<-cbind(tmp,BKF)
# subset data from 9:30am to 4:00pm and replace NA's with last observation
BKF<-BKF["T09:30:00/T16:00:00"]
BKF<-na.locf(BKF)
# SP here is daily S&P data for the same sample period
SP<-xts(order.by=as.Date(td$Date,tz="",format="%y-%m-%d"))
# Subset observations for all trading days according to the daily S&P data
test<-bt[as.Date(index(bt),tz="")%in%as.Date(index(td),tz="")]
Done.
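The "most recent trade at each 5-minute stamp" rule can also be sketched without xts, using findInterval() to look up the last trade at or before each grid point. This is only an illustration on the five trades from the head of the question; the UTC time zone and the names trade_times, grid and regular are assumptions. Note that under this rule the 09:50:00 stamp carries 59.612, because the 59.640 trade happens at 09:51:16.

```r
# trades copied from the head of the question (time zone is an assumption)
trade_times <- as.POSIXct(c("2008-01-02 09:30:01", "2008-01-02 09:38:04",
                            "2008-01-02 09:39:51", "2008-01-02 09:51:16",
                            "2008-01-02 10:06:08"), tz = "UTC")
prices <- c(59.870, 59.710, 59.612, 59.640, 59.500)

# regular 5-minute grid
grid <- seq(as.POSIXct("2008-01-02 09:35:00", tz = "UTC"),
            as.POSIXct("2008-01-02 09:55:00", tz = "UTC"), by = "5 min")

# index of the last trade at or before each grid point
# (would be 0 for grid points earlier than the first trade; none occur here)
idx <- findInterval(as.numeric(grid), as.numeric(trade_times))

regular <- data.frame(time = grid, price = prices[idx])
```

Restricting the grid to trading days would then be a matter of building it only for the dates present in the daily series.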
