I am using this code to create candlesticks in plotly. However, it contains a loop which is very inefficient (38 seconds to loop through 10K observations). It also uses the rbind function, which means the date has to be converted to numeric and then back again, which doesn't appear to be straightforward considering it's a date with time.
The loop I'm trying to replace with a more efficient function is:
for(i in 1:nrow(prices)){
x <- prices[i, ]
# For high / low
mat <- rbind(c(x[1], x[3]),
c(x[1], x[4]),
c(NA, NA))
plot.base <- rbind(plot.base, mat)
}
The output is a two-column object: the first row holds the 1st (date) and 3rd columns of the input data, the second row holds the 1st and 4th columns, and the third row is two NAs. The NAs are important later on for the plotting.
What is the most efficient way to achieve this?
Minimal reproducible example:
library(quantmod)
prices <- getSymbols("MSFT", auto.assign = F)
# Convert to dataframe
prices <- data.frame(time = index(prices),
open = as.numeric(prices[,1]),
high = as.numeric(prices[,2]),
low = as.numeric(prices[,3]),
close = as.numeric(prices[,4]),
volume = as.numeric(prices[,5]))
# Create line segments for high and low prices
plot.base <- data.frame()
for(i in 1:nrow(prices)){
x <- prices[i, ]
# For high / low
mat <- rbind(c(x[1], x[3]),
c(x[1], x[4]),
c(NA, NA))
plot.base <- rbind(plot.base, mat)
}
Edit:
dput(head(prices))
structure(list(time = structure(c(13516, 13517, 13518, 13521,
13522, 13523), class = "Date"), open = c(29.91, 29.700001, 29.629999,
29.65, 30, 29.799999), high = c(30.25, 29.969999, 29.75, 30.1,
30.18, 29.889999), low = c(29.4, 29.440001, 29.450001, 29.530001,
29.73, 29.43), close = c(29.860001, 29.809999, 29.639999, 29.93,
29.959999, 29.66), volume = c(76935100, 45774500, 44607200, 50220200,
44636600, 55017400)), .Names = c("time", "open", "high", "low",
"close", "volume"), row.names = c(NA, 6L), class = "data.frame")
I would be wary of a tutorial that grows an object in a loop. That's one of the slowest operations you can do in programming. (It's like buying a shelf that has exactly the room needed for your books and then replacing the shelf every time you buy a new book.)
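To make the shelf analogy concrete, here is a minimal sketch on made-up data (the names `grown`, `pieces` and `bound` are only for illustration). It contrasts growing a data frame inside the loop with collecting the pieces in a preallocated list and binding them once at the end. Both give the same rows; the second avoids copying the whole result on every iteration.

```r
n <- 500

# Pattern to avoid: the result is copied and extended on every iteration
grown <- data.frame()
for (i in seq_len(n)) {
  grown <- rbind(grown, data.frame(id = i, sq = i^2))
}

# Usually better: preallocate a list, fill it, and bind once at the end
pieces <- vector("list", n)
for (i in seq_len(n)) {
  pieces[[i]] <- data.frame(id = i, sq = i^2)
}
bound <- do.call(rbind, pieces)
```

When the whole result can be computed from vectors directly, skipping the loop altogether is usually faster still.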
Use subsetting like this:
res <- data.frame(date = rep(prices[, 1], each = 3),
y = c(t(prices[,c(3:4)])[c(1:2, NA),])) #transpose, subset, make to vector
res[c(FALSE, FALSE, TRUE), 1] <- NA
# date y
#1 2007-01-03 30.25
#2 2007-01-03 29.40
#3 <NA> <NA>
#4 2007-01-04 29.97
#5 2007-01-04 29.44
#6 <NA> <NA>
#7 2007-01-05 29.75
#8 2007-01-05 29.45
#9 <NA> <NA>
#10 2007-01-08 30.10
#11 2007-01-08 29.53
#12 <NA> <NA>
#13 2007-01-09 30.18
#14 2007-01-09 29.73
#15 <NA> <NA>
#16 2007-01-10 29.89
#17 2007-01-10 29.43
#18 <NA> <NA>
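An equivalent way to build the same interleaved layout, sketched here on the first three rows of the dput above (prices rounded for brevity), is to stack high, low and NA as the rows of a matrix and flatten it column by column. This is a base R sketch under those assumptions, not necessarily faster than the subsetting approach above.

```r
# First three rows of the example data (values rounded)
prices <- data.frame(
  time = as.Date(c("2007-01-03", "2007-01-04", "2007-01-05")),
  high = c(30.25, 29.97, 29.75),
  low  = c(29.40, 29.44, 29.45)
)
n <- nrow(prices)

res <- data.frame(
  date = rep(prices$time, each = 3),
  # rbind() makes a 3 x n matrix; c() reads it column by column,
  # giving high1, low1, NA, high2, low2, NA, ...
  y = c(rbind(prices$high, prices$low, NA))
)

# Blank out the date on every third row so it lines up with the NA price
res$date[seq(3, 3 * n, by = 3)] <- NA
```

Because the dates are never pushed through rbind, they keep their Date class and no conversion to numeric and back is needed.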
I would like to construct the annualized volatility of returns for a panel data set in R. I have monthly returns (%) per firm (entity) for a large dataset.
I would like to construct the five-year average of the annualized volatility of monthly returns, per year (t+5) and per firm.
Constructing this measure by itself is not difficult, but I would like to do it in R so that it groups by firm and year. I am thankful for any help.
The data looks like this:
library(xts)
library(PerformanceAnalytics)
library(quantmod)
library(lubridate)
library(data.table)
library(stringr)
# let's fetch some real-world panel data in a similar format to that cited by OP
symbols <- c('GOOG', 'AAPL', 'AMZN')
quantmod::getSymbols(symbols,
auto.assign = TRUE,
from = Sys.time() - years(20),
periodicity = 'monthly')
lapply(symbols, function(x) {
tmp <- get(x, envir = .GlobalEnv)
tmp$Return <- CalculateReturns(Ad(tmp), method = 'discrete')
tmp$LogReturn <- CalculateReturns(Ad(tmp), method = 'log')
assign(x, tmp, envir = .GlobalEnv)
}) |> invisible()
panel_data_df <- lapply(symbols, function(x) {
tmp <- get(x, envir = .GlobalEnv)
df <- data.frame(Symbol = x,
Date = index(tmp),
Return = round(tmp$Return * 1e2, 2) |>
sprintf(fmt = '%s%%') |>
str_replace_all('NA%', NA_character_),
LogReturn = tmp$LogReturn)
df
}) |>
rbindlist() |>
as.data.frame()
head(panel_data_df)
Symbol Date Return LogReturn
1 GOOG 2004-09-01 <NA> NA
2 GOOG 2004-10-01 47.1% 0.38593415
3 GOOG 2004-11-01 -4.54% -0.04649014
4 GOOG 2004-12-01 5.94% 0.05770476
5 GOOG 2005-01-01 1.47% 0.01457253
6 GOOG 2005-02-01 -3.9% -0.03978529
# now let's calculate the 5 year mean of annualized monthly volatility
metrics_df <- split(panel_data_df, panel_data_df$Symbol) |>
lapply(function(x) {
df_xts <- xts(x$LogReturn, order.by = as.POSIXct(x$Date))
stddev_1yr <- period.apply(df_xts,
endpoints(df_xts, 'years', 1),
StdDev.annualized)
stddev_1yr_5yr_mean <- period.apply(stddev_1yr,
endpoints(stddev_1yr, 'years', 5),
mean)
stddev_1yr_5yr_mean_df <- as.data.frame(stddev_1yr_5yr_mean)
colnames(stddev_1yr_5yr_mean_df) <- 'StDevAnn5YrMean'
stddev_1yr_5yr_mean_df$Date <- rownames(stddev_1yr_5yr_mean_df) |>
str_split('\\s') |>
sapply('[', 1)
rownames(stddev_1yr_5yr_mean_df) <- NULL
stddev_1yr_5yr_mean_df$Symbol <- x$Symbol[ 1 ]
stddev_1yr_5yr_mean_df
}) |> rbindlist() |> as.data.frame()
panel_data_df <- merge(panel_data_df,
metrics_df,
by = c('Symbol', 'Date'),
all = TRUE)
head(panel_data_df, 50)
Symbol Date Return LogReturn StDevAnn5YrMean
1 AAPL 2002-11-01 <NA> NA NA
2 AAPL 2002-12-01 -7.55% -0.078484655 NA
3 AAPL 2003-01-01 0.21% 0.002089444 NA
4 AAPL 2003-02-01 4.53% 0.044272032 NA
5 AAPL 2003-03-01 -5.8% -0.059709353 NA
6 AAPL 2003-04-01 0.57% 0.005642860 NA
7 AAPL 2003-05-01 26.23% 0.232938925 NA
8 AAPL 2003-06-01 6.18% 0.060001124 NA
9 AAPL 2003-07-01 10.6% 0.100732953 NA
[ ... ]
26 AAPL 2004-12-01 -3.95% -0.040325449 NA
27 AAPL 2004-12-31 <NA> NA 0.2947654
28 AAPL 2005-01-01 19.41% 0.177392802 NA
29 AAPL 2005-02-01 16.67% 0.154188206 NA
30 AAPL 2005-03-01 -7.11% -0.073765972 NA
[ ... ]
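For comparison, here is a minimal base R sketch of the same idea on synthetic data. It assumes returns are stored as decimals (not percent strings) and that annualizing monthly volatility means multiplying the standard deviation by sqrt(12). The names `panel`, `vol` and `roll_mean5` are made up for the illustration. Unlike the block-wise 5-year grouping above, it uses a trailing 5-year window, so each firm-year from the fifth onwards gets a value.

```r
set.seed(42)

# Synthetic panel: 2 firms, 7 years of monthly returns (decimals)
panel <- expand.grid(month = 1:12, year = 2010:2016, firm = c("A", "B"))
panel$ret <- rnorm(nrow(panel), mean = 0.01, sd = 0.05)

# Annualized volatility per firm and calendar year: sd of monthly returns * sqrt(12)
vol <- aggregate(ret ~ firm + year, data = panel,
                 FUN = function(r) sd(r) * sqrt(12))
names(vol)[3] <- "vol_ann"
vol <- vol[order(vol$firm, vol$year), ]

# Trailing 5-year mean; NA until five years of history are available
roll_mean5 <- function(v) {
  if (length(v) < 5) return(rep(NA_real_, length(v)))
  as.numeric(stats::filter(v, rep(1 / 5, 5), sides = 1))
}
vol$vol_ann_5y <- ave(vol$vol_ann, vol$firm, FUN = roll_mean5)
```

The result has one row per firm and year, so it can be merged back onto the monthly panel by firm and year if needed.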
I'm trying to calculate how long one person stays in a homeless shelter using R. The homeless shelter has two different types of check-ins, one for overnight stays and another for long-term stays. I would like to shape the data to get an EntryDate and ExitDate for every stay which does not have at least a one-day break.
Here are what the data currently look like:
PersonalID EntryDate ExitDate
1 2016-12-01 2016-12-02
1 2016-12-03 2016-12-04
1 2016-12-16 2016-12-17
1 2016-12-17 2016-12-18
1 2016-12-18 2016-12-19
2 2016-10-01 2016-10-20
2 2016-10-21 2016-10-22
3 2016-09-01 2016-09-02
3 2016-09-20 2016-09-21
Ultimately, I'm trying to get the above data to represent continuous ranges so that I can calculate the total length of stay by participant.
For example, the above data would become:
PersonalID EntryDate ExitDate
1 2016-12-01 2016-12-04
1 2016-12-16 2016-12-19
2 2016-10-01 2016-10-22
3 2016-09-01 2016-09-02
3 2016-09-20 2016-09-21
Here is an ugly solution. It is probably possible to do something cleaner, but it works. This solution should also be debugged with real data (I have added one line to your example to cover more situations).
d <- read.table(text = '
PersonalID EntryDate ExitDate
1 2016-12-01 2016-12-02
1 2016-12-03 2016-12-04
1 2016-12-16 2016-12-17
1 2016-12-17 2016-12-18
1 2016-12-18 2016-12-19
2 2016-10-01 2016-10-20
2 2016-10-21 2016-10-22
3 2016-09-01 2016-09-02
3 2016-09-20 2016-09-21
4 2016-09-20 2016-09-21
', header = TRUE)
#' transform to Date format
d$EntryDate <- as.Date(as.character(d$EntryDate))
d$ExitDate <- as.Date(as.character(d$ExitDate))
summary(d)
#' Reorder to be sure that the ExitDate / Entry date are in chronological order
d <- d[order(d$PersonalID, d$EntryDate),]
#' Add a column that will store the number of days between one exit and the next entry
d$nbdays <- 9999
# Split to have a list with dataframe for each ID
d <- split(d, d$PersonalID)
d
for(i in 1:length(d)) {
# Compute number of days between one exit and the next entry (only if there are
# more than one entry)
if(nrow(d[[i]])>1) {
d[[i]][-1,"nbdays"] <- d[[i]][2:nrow(d[[i]]),"EntryDate"] -
d[[i]][1:(nrow(d[[i]])-1),"ExitDate"]
}
x <- d[[i]] # store a copy of the data to lighten the syntax
# Entry dates for which the gap since the previous exit is more than 1 day (including the first one)
entr <- x[x$nbdays>1,"EntryDate"]
# Exit dates just before cases where nbdays are > 1 and includes the last exit date.
# We use unique to avoid picking 2 times the last exit
whichexist <- unique(c(c(which(x$nbdays > 1)-1)[-1],nrow(x)))
exit <- x[whichexist,"ExitDate"]
d[[i]] <- data.frame(
PersonalID = x[1,1],
EntryDate = entr,
ExitDate = exit
)
}
# paste the elements of this list into one data.frame
do.call(rbind, d)
Here is a solution using dplyr.
library(dplyr)
d = structure(list(PersonalID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 3L,
3L), EntryDate = structure(c(17136, 17138, 17151, 17152, 17153,
17075, 17095, 17045, 17064), class = "Date"), ExitDate = structure(c(17137,
17139, 17152, 17153, 17154, 17094, 17096, 17046, 17065), class = "Date")), class = "data.frame", .Names = c("PersonalID",
"EntryDate", "ExitDate"), row.names = c(NA, -9L))
First create a temporary dataframe to hold all the dates between entry and exit date:
d2 = d %>%
rowwise() %>%
do(data.frame(PersonalID = .$PersonalID, Present = seq(.$EntryDate, .$ExitDate, by = 'day'))) %>%
unique %>% ## remove double dates when exit and re-entry occur on the same day
ungroup()
Then look for all the consecutive dates, with some inspiration from https://stackoverflow.com/a/14868742/827766
d2 %>%
group_by(PersonalID) %>%
mutate(delta = c(1, diff(as.Date(Present)))) %>%
group_by(PersonalID, stay = cumsum(delta!=1)) %>%
summarize(EntryDate = min(Present), ExitDate = max(Present)) %>%
subset(select = -c(stay))
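The same grouping can also be sketched in base R without expanding every stay into individual days. The idea is to flag a row as the start of a new stay when the person changes or when the gap since the previous exit is more than one day, and then take a running count of those flags. This is only a sketch on the question's example data; it assumes stays for one person do not nest inside each other, and the column names `stay` and `Days` are made up here.

```r
d <- data.frame(
  PersonalID = c(1, 1, 1, 1, 1, 2, 2, 3, 3),
  EntryDate = as.Date(c("2016-12-01", "2016-12-03", "2016-12-16", "2016-12-17",
                        "2016-12-18", "2016-10-01", "2016-10-21", "2016-09-01",
                        "2016-09-20")),
  ExitDate = as.Date(c("2016-12-02", "2016-12-04", "2016-12-17", "2016-12-18",
                       "2016-12-19", "2016-10-20", "2016-10-22", "2016-09-02",
                       "2016-09-21"))
)
d <- d[order(d$PersonalID, d$EntryDate), ]

# Previous row's exit date and whether the previous row is the same person
prev_exit <- c(as.Date(NA), head(d$ExitDate, -1))
same_person <- c(FALSE, head(d$PersonalID, -1) == tail(d$PersonalID, -1))

# A new stay starts with a new person or after a gap of more than one day
new_stay <- !same_person | as.numeric(d$EntryDate - prev_exit) > 1
d$stay <- cumsum(new_stay)

# Collapse each stay to its first entry and last exit
merged <- do.call(rbind, lapply(split(d, d$stay), function(s) {
  data.frame(PersonalID = s$PersonalID[1],
             EntryDate = min(s$EntryDate),
             ExitDate = max(s$ExitDate))
}))
rownames(merged) <- NULL

# Length of each continuous stay in days
merged$Days <- as.numeric(merged$ExitDate - merged$EntryDate)
```

Summing `Days` by `PersonalID` (for example with `aggregate(Days ~ PersonalID, merged, sum)`) then gives the total length of stay per participant.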
I'm interested in expanding a data frame so that periods with missing data, measured in the data's own time unit, appear as rows with missing values across any number of columns.
Example
The problem can be easily illustrated with a simple example.
Data
The generated data contains some time series observations with dates missing at random.
# Data generation
# Seed
set.seed(1)
# Size
sizeDf <- 10
# Populate data frame
dta <- data.frame(
dates = seq(
from = Sys.Date() - (sizeDf - 1),
to = Sys.Date(),
by = 1
),
varA = runif(n = sizeDf),
varB = runif(n = sizeDf),
varC = runif(n = sizeDf)
)
# Delete rows
dta <-
dta[-sample(1:sizeDf, replace = TRUE, size = round(sqrt(sizeDf), 0)),]
Preview
>> dta
dates varA varB varC
1 2016-07-28 0.26550866 0.2059746 0.93470523
2 2016-07-29 0.37212390 0.1765568 0.21214252
3 2016-07-30 0.57285336 0.6870228 0.65167377
4 2016-07-31 0.90820779 0.3841037 0.12555510
7 2016-08-03 0.94467527 0.7176185 0.01339033
8 2016-08-04 0.66079779 0.9919061 0.38238796
9 2016-08-05 0.62911404 0.3800352 0.86969085
10 2016-08-06 0.06178627 0.7774452 0.34034900
Key characteristics
From the perspective of the proposed analysis, the key characteristics are:
The date units, days in that case
Randomly missing dates
Missing dates
seq(
from = Sys.Date() - (sizeDf - 1),
to = Sys.Date(),
by = 1
)[!(seq(
from = Sys.Date() - (sizeDf - 1),
to = Sys.Date(),
by = 1
) %in% dta$dates)]
"2016-08-01" "2016-08-02"
Desired results
The newly created data frame should look like that:
>> dtaNew
dates varA varB varC
1 2016-07-28 0.3337749 0.32535215 0.8762692
2 2016-07-29 0.4763512 0.75708715 0.7789147
3 2016-07-30 0.8921983 0.20269226 0.7973088
4 2016-07-31 0.8643395 0.71112122 0.4552745
5 2016-08-01 NA NA NA
6 2016-08-02 NA NA NA
7 2016-08-03 0.9606180 0.14330438 0.6049333
8 2016-08-04 0.4346595 0.23962942 0.6547239
9 2016-08-05 0.7125147 0.05893438 0.3531973
10 2016-08-06 0.3999944 0.64228826 0.2702601
This is simply obtained with:
dtaNew[dtaNew$dates %in% missDates, 2:4] <- NA
where missDates is taken from the previous seq.
Attempts
Creating vector with all the dates is simple:
allDates <- seq(from = min(dta$dates), to = max(dta$dates), by = 1)
but obviously I cannot just push it to the data frame:
>> dta$allDates <- allDates
Error in `$<-.data.frame`(`*tmp*`, "allDates", value = c(17010, 17011, :
replacement has 10 rows, data has 8
The possible solution could use the loop that would push the row with NA values to the data frame row by row for each of the dates identified as missing but this is grossly inefficient and messy.
To sum up, I'm interested in achieving the following:
Expanding the data frame with all the dates following the same unit. I.e. for missing daily data days are added, for missing quarterly data quarters are added.
I would like to then push the NA values across all the columns in the data frame for where the missing date was found
If I understand your question, you can use rbind.fill from the plyr package to get your desired output:
sizeDf <- 10
# Populate data frame
dta <- data.frame(
dates = seq(
from = Sys.Date() - (sizeDf - 1),
to = Sys.Date(),
by = 1
),
varA = runif(n = sizeDf),
varB = runif(n = sizeDf),
varC = runif(n = sizeDf)
)
# Delete rows
dta <-dta[-sample(1:sizeDf, replace = TRUE, size = round(sqrt(sizeDf), 0)),]
#Get missing dates
missing_dates <- seq(from=min(dta$dates), to=max(dta$dates), by=1)[!(seq(from=min(dta$dates), to=max(dta$dates), by=1) %in% dta$dates)]
#Create the new dataset by using plyr's rbind.fill function
dta_new <- plyr::rbind.fill(dta,data.frame(dates=missing_dates))
#Order the data by the dates column
dta_new <- dta_new[order(dta_new$dates),]
#Print it
print(dta_new, row.names = F, right = F)
dates varA varB varC
2016-07-28 0.837859418 0.2966637 0.61245244
2016-07-29 0.144884547 0.9284294 0.11033990
2016-07-30 NA NA NA
2016-07-31 NA NA NA
2016-08-01 0.003167049 0.9096805 0.29239470
2016-08-02 0.574859760 0.1466993 0.69541969
2016-08-03 NA NA NA
2016-08-04 0.748639215 0.9602836 0.67681826
2016-08-05 0.983939562 0.4867804 0.35270309
2016-08-06 0.383366957 0.2241982 0.09244522
I hope this helps.
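A base R alternative, sketched here on a small made-up data frame with two days missing, is to build a one-column data frame of all dates and left-join the observed data onto it with merge(). Rows with no match get NA in every other column automatically. The values in `dta` below are placeholders, not the random numbers from the question.

```r
# Eight observed days; 2016-08-01 and 2016-08-02 are missing
dta <- data.frame(
  dates = as.Date("2016-07-28") + c(0:3, 6:9),
  varA = c(0.27, 0.37, 0.57, 0.91, 0.94, 0.66, 0.63, 0.06),
  varB = c(0.21, 0.18, 0.69, 0.38, 0.72, 0.99, 0.38, 0.78)
)

# Every date between the first and last observation, in the data's unit
allDates <- data.frame(dates = seq(min(dta$dates), max(dta$dates), by = "day"))

# Left join: keep all dates, fill unmatched rows with NA
dtaNew <- merge(allDates, dta, by = "dates", all.x = TRUE)
```

For other units, only the `by` argument of seq() needs to change, for example `by = "quarter"` for quarterly data, provided the existing dates fall on the same grid.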
Given a data set like the one below, I would like to count how many times a particular hour of the day (00:00, 01:00, ..., 22:00, 23:00) falls completely within any of the given intervals.
The date of occurrence doesn't matter. Just the overall count.
### This code is to create a data set similar to the one I am using.
### This is a function I found on here to generate random times
latemail <- function(N, st="2012/01/01", et="2012/12/31") {
st <- as.POSIXct(as.Date(st))
et <- as.POSIXct(as.Date(et))
dt <- as.numeric(difftime(et,st,unit="sec"))
ev <- sort(runif(N, 0, dt))
rt <- st + ev
}
set.seed(123)
startTimes <- latemail(5)
endTimes <- startTimes +18000
my_data <- data.frame(startTimes, endTimes)
> my_data
start end
1 2012-04-14 16:10:44 2012-04-14 21:10:44
2 2012-05-28 23:38:16 2012-05-29 04:38:16
3 2012-10-14 10:33:10 2012-10-14 15:33:10
4 2012-11-17 23:13:56 2012-11-18 04:13:56
5 2012-12-08 22:29:36 2012-12-09 03:29:36
So that hopefully helps give you an idea of what I am working with.
Ideally the output would be a dataset with one variable for the hour, and another for the count of occurrences. Like this
hour count
1 00:00 3
2 01:00 3
3 etc ?
How to do this in different increments (say 15 minutes) would also be great to know.
Thank you!
Here is my attempt. I am sure there are better ways of doing this. Given the comments above, I did the following. First, I extracted the hour using ifelse; as you described in your comment, I rounded the hour up or down here. Using transmute, I built a string containing the hours. In some cases, the start hour can be larger than the end hour (in this case the record crosses dates). To deal with that, I used setdiff(), c(), and toString(). Using separate() I split the hours into columns. I wanted to use cSplit() from the splitstackshape package, but I got an error message, so I chose separate() here. Once I had all the hours separated, I reshaped the data using gather() and finally counted the hours with count(). filter() was employed to remove NA cases. I hope this will help you to some extent.
** Data **
structure(list(startTimes = structure(c(1328621832.79254, 1339672345.94964,
1343434566.9641, 1346743867.55964, 1355550696.37895), class = c("POSIXct",
"POSIXt")), endTimes = structure(c(1328639832.79254, 1339690345.94964,
1343452566.9641, 1346761867.55964, 1355568696.37895), class = c("POSIXct",
"POSIXt"))), .Names = c("startTimes", "endTimes"), row.names = c(NA,
-5L), class = "data.frame")
# startTimes endTimes
#1 2012-02-07 22:37:12 2012-02-08 03:37:12
#2 2012-06-14 20:12:25 2012-06-15 01:12:25
#3 2012-07-28 09:16:06 2012-07-28 14:16:06
#4 2012-09-04 16:31:07 2012-09-04 21:31:07
#5 2012-12-15 14:51:36 2012-12-15 19:51:36
library(dplyr)
library(tidyr)
# Round the start up and the end down whenever the time is not exactly on the hour
mutate(my_data, start = ifelse(as.numeric(format(startTimes, "%M")) > 0 | as.numeric(format(startTimes, "%S")) > 0,
as.numeric(format(startTimes, "%H")) + 1,
as.numeric(format(startTimes, "%H"))),
end = ifelse(as.numeric(format(endTimes, "%M")) > 0 | as.numeric(format(endTimes, "%S")) > 0,
as.numeric(format(endTimes, "%H")) - 1,
as.numeric(format(endTimes, "%H"))),
start = replace(start, which(start == 24), 0),
end = replace(end, which(end == -1), 23)) %>%
rowwise() %>%
transmute(hour = ifelse(start < end, toString(seq.int(start, end, by = 1)),
toString(c(setdiff(seq(0, 23, by = 1), seq.int(end, start, by = 1)),
start, end)))) %>%
separate(hour, paste("hour", 1:24, sep = "."), ", ", extra = "merge") %>%
gather(foo, hour) %>%
count(hour) %>%
filter(complete.cases(hour))
# hour n
#1 0 2
#2 1 1
#3 10 1
#4 11 1
#5 12 1
#6 13 1
#7 15 1
#8 16 1
#9 17 2
#10 18 2
#11 19 1
#12 2 1
#13 20 1
#14 21 1
#15 22 1
#16 23 2
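For the "different increments" part of the question, here is a base R sketch that works on epoch seconds instead of hour numbers. The function name `count_full_slots` and its arguments are made up for this illustration. It rounds each start up and each end down to the slot grid and lists every slot that fits completely in between. It assumes the time zone's UTC offset is a multiple of the slot length (the example uses UTC) and that at least one interval contains a full slot.

```r
count_full_slots <- function(start, end, slot_secs = 3600, tz = "UTC") {
  s <- as.numeric(start)  # seconds since 1970-01-01 UTC
  e <- as.numeric(end)
  labels <- unlist(Map(function(si, ei) {
    first <- ceiling(si / slot_secs) * slot_secs            # first slot starting at or after start
    last  <- floor(ei / slot_secs) * slot_secs - slot_secs  # last slot ending at or before end
    if (last < first) return(character(0))                  # no complete slot in this interval
    slots <- as.POSIXct(seq(first, last, by = slot_secs),
                        origin = "1970-01-01", tz = tz)
    format(slots, "%H:%M")
  }, s, e))
  as.data.frame(table(slot = labels), responseName = "count")
}

starts <- as.POSIXct(c("2012-04-14 16:10:44", "2012-05-28 23:38:16",
                       "2012-11-17 23:13:56"), tz = "UTC")
ends <- starts + 18000  # five hours later, as in the question

hourly  <- count_full_slots(starts, ends)                   # 60-minute slots
quarter <- count_full_slots(starts, ends, slot_secs = 900)  # 15-minute slots
```

Because the slot labels are zero-padded strings such as "00:00" and "17:00", the resulting table sorts in clock order, unlike the numeric-as-character hours in the output above.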
I couldn't find a solution to this on the net. The two xts objects match in number of rows and columns. Still, I get the following error from the merge operation: "number of items to replace is not a multiple of replacement length".
Following is the R code along with the printed output at interim steps. I am a bit new to R, so if you notice any steps in the program that could be done better, please advise me on those as well. Thanks.
> # LOAD THE SPY DATA AND CREATE A DATA FRAME WITH RETURN COLUMN
> library(quantmod)
> library(PerformanceAnalytics)
> getSymbols("SPY", src='yahoo', index.class=c("POSIXt","POSIXct"), from='2002-01-01')
> SPY<-to.monthly(SPY)
> SPY.ret<-Return.calculate(SPY$SPY.Close)
> print(head(SPY.ret))
SPY.Close
Jan 2002 NA
Feb 2002 -0.018098831
Mar 2002 0.029868840
Apr 2002 -0.059915390
May 2002 -0.005951292
Jun 2002 -0.080167070
> index(SPY.ret) = as.Date(index(SPY)) # Convert to Date format as xts index is a Date.
> colnames(SPY.ret) <- "SPY"
> print(head(SPY.ret))
SPY
2002-01-01 NA
2002-02-01 -0.018098831
2002-03-01 0.029868840
2002-04-01 -0.059915390
2002-05-01 -0.005951292
2002-06-01 -0.080167070
> #LOAD THE TRADE FILE & CREATE A DATA FRAME WITH PROFIT COLUMN
> trades = as.xts(read.zoo(file="Anvi/CSV/ARS_EW_R2_SPDR.csv", index.column="Exit.time", format="%m/%d/%Y", header=TRUE, sep=","))
Warning message:
In zoo(rval3, ix) :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
> df = trades$Profit
> print(head(df))
Profit
2003-09-30 " 0.079734219"
2004-01-31 " 0.116722585"
2004-03-31 " 0.060347888"
2004-04-30 " 0.100379816"
2004-07-31 " 0.084048027"
2004-07-31 " 0.018710103"
> df$Profits = as.numeric(trades$Profit)
> df = df$Profit #Inefficient way to convert Profit column to numeric?
> print(head(df))
Profit
2003-09-30 0.07973422
2004-01-31 0.11672259
2004-03-31 0.06034789
2004-04-30 0.10037982
2004-07-31 0.08404803
2004-07-31 0.01871010
> df = aggregate(df, by=index(df))
> colnames(df) = "Profit"
> print(head(df))
Profit
2003-09-30 0.07973422
2004-01-31 0.11672259
2004-03-31 0.06034789
2004-04-30 0.10037982
2004-07-31 0.10275813
2004-11-30 0.02533904
>
> #MERGE THE SPY RET AND TRADE RESULTS DATA FRAMES
> temp = head(df)
> temp1 = head(SPY.ret)
> print(temp)
Profit
2003-09-30 0.07973422
2004-01-31 0.11672259
2004-03-31 0.06034789
2004-04-30 0.10037982
2004-07-31 0.10275813
2004-11-30 0.02533904
> print(temp1)
SPY
2002-01-01 NA (Note: I tried replacing NA with 0 but still same error).
2002-02-01 -0.018098831
2002-03-01 0.029868840
2002-04-01 -0.059915390
2002-05-01 -0.005951292
2002-06-01 -0.080167070
> mdf = merge(x=temp, y=temp1, all=TRUE)
Error in z[match0(index(a), indexes), ] <- a[match0(indexes, index(a)), :
number of items to replace is not a multiple of replacement length
>
What I am trying to do above is merge the objects such that the resulting object's index is the UNION of both indexes and it has two columns, "SPY" and "Profit". The empty cells in each column of the merged object should be filled with 0.
aggregate returns a zoo object, not an xts object. That means the zoo method of merge is being dispatched instead of the xts method. Your code works fine if both objects are xts objects.
temp <-
structure(c(0.07973422, 0.11672259, 0.06034789, 0.10037982, 0.10275813,
0.02533904), .Dim = c(6L, 1L), index = structure(c(12325, 12448,
12508, 12538, 12630, 12752), class = "Date"), class = "zoo",
.Dimnames = list(NULL, "Profit"))
temp1 <-
structure(c(NA, -0.018098831, 0.02986884, -0.05991539, -0.005951292,
-0.08016707), .Dim = c(6L, 1L), index = structure(c(1009864800,
1012543200, 1014962400, 1017640800, 1020229200, 1022907600), tzone = "",
tclass = "Date"), .indexCLASS = "Date", tclass = "Date", .indexTZ = "",
tzone = "", .Dimnames = list(NULL, "SPY"), class = c("xts", "zoo"))
merge(temp, temp1) # error
merge(as.xts(temp), temp1, fill=0) # works, filled with zeros
# Profit SPY
# 2002-01-01 0.00000000 NA
# 2002-02-01 0.00000000 -0.018098831
# 2002-03-01 0.00000000 0.029868840
# 2002-04-01 0.00000000 -0.059915390
# 2002-05-01 0.00000000 -0.005951292
# 2002-06-01 0.00000000 -0.080167070
# 2003-09-30 0.07973422 0.000000000
# 2004-01-31 0.11672259 0.000000000
# 2004-03-31 0.06034789 0.000000000
# 2004-04-30 0.10037982 0.000000000
# 2004-07-31 0.10275813 0.000000000
# 2004-11-30 0.02533904 0.000000000