min and max over time range on each day of xts - r

I have an xts object with intraday OHLC price data covering several years. I'd like to write a function that calculates the min and max values between 04:00:00 and 05:00:00 each day and adds them as columns to the xts object. I'm not very familiar with manipulating xts objects. Can anyone point me in the right direction? Here's the head of the xts object.
Open High Low Close Volume
2017-01-01 00:00:00 968.29 968.76 966.74 966.97 106562
2017-01-01 00:05:00 966.97 967.00 966.89 966.89 13731
2017-01-01 00:10:00 966.89 966.89 964.86 964.86 124137
2017-01-01 00:15:00 964.86 964.99 964.80 964.80 3001
2017-01-01 00:20:00 964.80 964.80 964.80 964.80 0
2017-01-01 00:25:00 964.80 965.09 964.54 964.91 48000
2017-01-01 00:30:00 964.91 965.01 964.91 965.01 2501
2017-01-01 00:35:00 965.01 967.82 965.57 967.82 71501
2017-01-01 00:40:00 967.82 967.82 967.08 967.08 50
2017-01-01 00:45:00 967.08 967.40 967.40 967.40 50
2017-01-01 00:50:00 967.40 968.08 967.40 968.08 14000
2017-01-01 00:55:00 968.08 968.08 966.89 968.00 1008
2017-01-01 01:00:00 968.00 968.10 968.00 968.10 1002
2017-01-01 01:05:00 968.10 968.10 967.62 967.62 5200
2017-01-01 01:10:00 967.62 967.70 966.29 966.29 35476
2017-01-01 01:15:00 966.29 966.29 966.28 966.28 3068
2017-01-01 01:20:00 966.28 966.66 965.00 965.00 30471
2017-01-01 01:25:00 965.00 965.01 964.00 964.00 77884
2017-01-01 01:30:00 964.00 964.76 964.76 964.76 500
2017-01-01 01:35:00 964.76 967.48 964.69 965.00 134129
2017-01-01 01:40:00 965.00 965.00 963.67 963.67 59676
2017-01-01 01:45:00 963.67 963.67 963.67 963.67 0
2017-01-01 01:50:00 963.67 964.56 963.66 964.55 5531
2017-01-01 01:55:00 964.55 963.43 963.40 963.40 3000
2017-01-01 02:00:00 963.40 964.60 963.40 964.60 1301
2017-01-01 02:05:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:10:00 964.60 964.60 964.00 964.11 49954
2017-01-01 02:15:00 964.11 964.60 964.59 964.60 5000
2017-01-01 02:20:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:25:00 964.60 964.60 964.51 964.51 2000
2017-01-01 02:30:00 964.51 964.51 964.51 964.51 0
2017-01-01 02:35:00 964.51 964.51 963.23 963.99 16667
2017-01-01 02:40:00 963.99 963.99 963.65 963.66 10000
2017-01-01 02:45:00 963.66 964.26 963.16 964.26 75500
2017-01-01 02:50:00 964.26 964.26 964.26 964.26 0
2017-01-01 02:55:00 964.26 964.26 964.26 964.26 0
2017-01-01 03:00:00 964.26 964.61 963.98 964.61 13000
2017-01-01 03:05:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:10:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:15:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:20:00 964.61 964.82 964.48 964.82 16666
2017-01-01 03:25:00 964.82 965.00 963.99 964.97 50500
2017-01-01 03:30:00 964.97 964.97 964.02 964.02 56000
2017-01-01 03:35:00 964.02 964.29 964.29 964.29 500
2017-01-01 03:40:00 964.29 963.53 963.52 963.52 24000
2017-01-01 03:45:00 963.52 963.52 963.43 963.43 16500
2017-01-01 03:50:00 963.43 963.67 963.42 963.42 25002
2017-01-01 03:55:00 963.42 963.42 961.69 961.69 84507
2017-01-01 04:00:00 961.69 961.69 960.90 960.93 57909
2017-01-01 04:05:00 960.93 960.93 960.93 960.93 0
2017-01-01 04:10:00 960.93 961.19 961.19 961.19 400
2017-01-01 04:15:00 961.19 962.09 961.19 962.09 7001
2017-01-01 04:20:00 962.09 962.09 962.09 962.09 0
2017-01-01 04:25:00 962.09 962.10 961.14 961.14 32000
2017-01-01 04:30:00 961.14 961.14 960.93 960.93 41900
2017-01-01 04:35:00 960.93 961.94 960.93 961.64 640
2017-01-01 04:40:00 961.64 961.71 961.64 961.71 1
2017-01-01 04:45:00 961.71 962.00 961.90 961.99 5499
2017-01-01 04:50:00 961.99 961.99 961.99 961.99 0
2017-01-01 04:55:00 961.99 961.99 961.99 961.99 1
2017-01-01 05:00:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:05:00 961.99 961.99 961.99 961.99 40
2017-01-01 05:10:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:15:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:20:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:25:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:30:00 961.99 962.10 961.99 962.10 1382
2017-01-01 05:35:00 962.10 968.84 962.10 968.84 122909
2017-01-01 05:40:00 968.84 968.86 963.78 965.53 161263
2017-01-01 05:45:00 965.53 964.81 963.11 963.81 18021
2017-01-01 05:50:00 963.81 964.39 963.85 964.39 40006
2017-01-01 05:55:00 964.39 964.47 964.00 964.47 39966
2017-01-01 06:00:00 964.47 964.47 964.47 964.47 0

You can do this by filtering on the hour of the index and then applying the period.max and period.min functions. The results are placed in the last record of the chosen hour on each day. The example below uses intraday MSFT data and computes the max and min between 15:00 and 16:00.
library(xts)
# max of high values between 15 and 16. (excluding 16:00)
msft$max <- period.max(msft$high[.indexhour(msft) == 15], endpoints(msft$high[.indexhour(msft) == 15], on = "hour"))
# min of low values between 15 and 16. (excluding 16:00)
msft$min <- period.min(msft$low[.indexhour(msft) == 15], endpoints(msft$low[.indexhour(msft) == 15], on = "hour"))
head(msft[8:24], 16)
open high low close volume max min
2020-01-23 14:50:00 166.180 166.2300 166.090 166.1050 87934 NA NA
2020-01-23 14:55:00 166.105 166.2200 166.103 166.1700 92280 NA NA
2020-01-23 15:00:00 166.160 166.3500 166.160 166.3400 114359 NA NA
2020-01-23 15:05:00 166.335 166.3400 166.285 166.2850 102633 NA NA
2020-01-23 15:10:00 166.290 166.3050 166.170 166.2550 125558 NA NA
2020-01-23 15:15:00 166.250 166.2750 166.210 166.2400 103938 NA NA
2020-01-23 15:20:00 166.230 166.2500 166.180 166.2350 99649 NA NA
2020-01-23 15:25:00 166.240 166.3000 166.225 166.2850 93846 NA NA
2020-01-23 15:30:00 166.270 166.4164 166.175 166.3600 183154 NA NA
2020-01-23 15:35:00 166.360 166.5000 166.320 166.4600 177178 NA NA
2020-01-23 15:40:00 166.450 166.4650 166.380 166.3800 112174 NA NA
2020-01-23 15:45:00 166.385 166.4050 166.290 166.3875 152806 NA NA
2020-01-23 15:50:00 166.382 166.5200 166.362 166.4500 205667 NA NA
2020-01-23 15:55:00 166.450 166.6900 166.305 166.6700 508469 166.69 166.16
2020-01-23 16:00:00 166.660 166.7200 166.589 166.7200 934090 NA NA
2020-01-24 09:35:00 167.510 167.5300 166.890 166.8918 1152646 NA NA
data:
msft <- structure(c(166.224, 166.29, 166.29, 166.2456, 166.165, 166.1446,
166.1601, 166.18, 166.105, 166.16, 166.335, 166.29, 166.25, 166.23,
166.24, 166.27, 166.36, 166.45, 166.385, 166.382, 166.45, 166.66,
167.51, 167.03, 167.265, 167.325, 167.37, 167.16, 167.405, 167.35,
167.31, 167.39, 167.17, 167.1, 166.845, 167.03, 167.1223, 167.125,
167.21, 167.34, 167.235, 167.3, 167.37, 167.1977, 166.9814, 166.8499,
166.99, 166.93, 166.83, 166.64, 166.775, 166.85, 166.71, 166.6838,
166.46, 166.35, 165.765, 166.2269, 166.01, 166.19, 166.13, 166.31,
166.36, 166.42, 166.3682, 165.99, 166.1328, 165.85, 165.74, 165.8439,
165.655, 165.5434, 165.47, 165.3227, 165.0627, 165.03, 165.2546,
165.14, 165.1, 164.91, 164.75, 164.65, 164.53, 164.81, 164.8979,
164.6, 164.89, 164.94, 165.03, 165.12, 165.17, 165.24, 165.4,
165.335, 165.2734, 164.985, 164.9, 164.61, 164.93, 165.18, 166.315,
166.29, 166.3, 166.265, 166.22, 166.2201, 166.2, 166.23, 166.22,
166.35, 166.34, 166.305, 166.275, 166.25, 166.3, 166.4164, 166.5,
166.465, 166.405, 166.52, 166.69, 166.72, 167.53, 167.34, 167.39,
167.495, 167.47, 167.48, 167.4251, 167.3699, 167.42, 167.41,
167.2, 167.1, 167.03, 167.21, 167.23, 167.255, 167.35, 167.35,
167.33, 167.405, 167.38, 167.25, 167.01, 167, 167.02, 167.02,
166.8384, 166.9056, 166.86, 166.94, 166.75, 166.6844, 166.47,
166.42, 166.22, 166.4049, 166.221, 166.2003, 166.3749, 166.3999,
166.43, 166.43, 166.375, 166.175, 166.16, 165.96, 165.93, 165.86,
165.671, 165.64, 165.49, 165.4, 165.08, 165.27, 165.26, 165.34,
165.12, 165, 164.825, 164.765, 164.82, 164.89, 165, 164.89, 164.99,
165.041, 165.293, 165.23, 165.27, 165.44, 165.6046, 165.37, 165.295,
165.18, 164.93, 164.945, 165.185, 165.24, 166.22, 166.225, 166.25,
166.15, 166.145, 166.13, 166.1015, 166.09, 166.103, 166.16, 166.285,
166.17, 166.21, 166.18, 166.225, 166.175, 166.32, 166.38, 166.29,
166.362, 166.305, 166.589, 166.89, 167.03, 167.22, 167.32, 167.225,
167.16, 167.2, 167.23, 167.2801, 167.145, 167.05, 166.84, 166.77,
167, 167.1, 167.02, 167.18, 167.23, 167.223, 167.28, 167.1843,
166.862, 166.85, 166.821, 166.8121, 166.85, 166.55, 166.6303,
166.69, 166.7, 166.54, 166.4, 166.31, 165.76, 165.74, 165.8966,
165.91, 166.07, 166.09, 166.171, 166.32, 166.22, 165.96, 165.97,
165.82, 165.73, 165.72, 165.64, 165.49, 165.45, 165.32, 165.045,
164.89, 164.91, 165.09, 165.1, 164.91, 164.74, 164.53, 164.529,
164.53, 164.735, 164.59, 164.54, 164.88, 164.938, 165.01, 165.0792,
165.12, 165.22, 165.335, 165.263, 164.88, 164.89, 164.58, 164.58,
164.87, 164.87, 166.29, 166.275, 166.26, 166.16, 166.145, 166.155,
166.19, 166.105, 166.17, 166.34, 166.285, 166.255, 166.24, 166.235,
166.285, 166.36, 166.46, 166.38, 166.3875, 166.45, 166.67, 166.72,
166.8918, 167.27, 167.325, 167.371, 167.2251, 167.4, 167.34,
167.29, 167.3988, 167.2, 167.1047, 166.86, 167.025, 167.11, 167.12,
167.2, 167.345, 167.23, 167.29, 167.37, 167.1916, 167.0027, 166.85,
167, 166.94, 166.85, 166.64, 166.7738, 166.85, 166.72, 166.68,
166.4672, 166.3512, 165.79, 166.21, 165.9969, 166.18, 166.14,
166.2968, 166.36, 166.43, 166.36, 165.99, 166.13, 165.83, 165.73,
165.8405, 165.65, 165.545, 165.48, 165.33, 165.05, 165.0227,
165.26, 165.1425, 165.101, 164.91, 164.74, 164.6581, 164.5292,
164.805, 164.89, 164.59, 164.8801, 164.9498, 165.04, 165.12,
165.16, 165.2302, 165.4, 165.34, 165.28, 164.987, 164.89, 164.605,
164.94, 165.185, 165.04, 158120, 165333, 101115, 78491, 123999,
76037, 82733, 87934, 92280, 114359, 102633, 125558, 103938, 99649,
93846, 183154, 177178, 112174, 152806, 205667, 508469, 934090,
1152646, 558627, 277325, 321651, 255494, 333848, 272126, 395463,
194593, 211910, 193131, 242112, 210240, 193265, 139617, 204182,
179146, 159259, 237888, 410982, 213787, 233082, 188071, 193742,
132377, 118994, 264247, 182490, 109514, 138164, 221052, 194127,
169059, 458214, 247712, 169523, 115531, 161259, 263230, 155536,
82474, 87549, 109057, 101772, 130642, 171988, 117235, 134507,
236662, 219303, 217698, 219808, 420288, 208087, 149358, 197435,
218090, 267667, 320279, 422434, 340478, 273866, 258938, 212451,
268017, 323657, 267686, 214060, 222314, 293731, 288867, 219687,
304733, 251063, 425450, 455311, 741208, 1429645),
.Dim = c(100L, 5L),
.Dimnames = list(NULL, c("open", "high", "low", "close", "volume")),
index = structure(c(1579785300, 1579785600, 1579785900, 1579786200, 1579786500,
1579786800, 1579787100, 1579787400, 1579787700, 1579788000,
1579788300, 1579788600, 1579788900, 1579789200, 1579789500,
1579789800, 1579790100, 1579790400, 1579790700, 1579791000,
1579791300, 1579791600, 1579854900, 1579855200, 1579855500,
1579855800, 1579856100, 1579856400, 1579856700, 1579857000,
1579857300, 1579857600, 1579857900, 1579858200, 1579858500,
1579858800, 1579859100, 1579859400, 1579859700, 1579860000,
1579860300, 1579860600, 1579860900, 1579861200, 1579861500,
1579861800, 1579862100, 1579862400, 1579862700, 1579863000,
1579863300, 1579863600, 1579863900, 1579864200, 1579864500,
1579864800, 1579865100, 1579865400, 1579865700, 1579866000,
1579866300, 1579866600, 1579866900, 1579867200, 1579867500,
1579867800, 1579868100, 1579868400, 1579868700, 1579869000,
1579869300, 1579869600, 1579869900, 1579870200, 1579870500,
1579870800, 1579871100, 1579871400, 1579871700, 1579872000,
1579872300, 1579872600, 1579872900, 1579873200, 1579873500,
1579873800, 1579874100, 1579874400, 1579874700, 1579875000,
1579875300, 1579875600, 1579875900, 1579876200, 1579876500,
1579876800, 1579877100, 1579877400, 1579877700, 1579878000),
tzone = "",
tclass = c("POSIXct", "POSIXt")),
class = c("xts", "zoo"))
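The MSFT example above writes each value only into the last bar of the chosen hour. If the goal is a column that carries the 04:00 to 05:00 range on every row of the same day, one possible approach is sketched below in base R. The toy data and the names idx, high, low, max_col and min_col are assumptions made for illustration, not part of the original answer; with an xts object, index(x) and the High/Low columns would take their place.

```r
# Toy data (an assumption, not the asker's data): two days of 30-minute bars.
idx  <- as.POSIXct("2017-01-01 00:00:00", tz = "UTC") + seq(0, by = 1800, length.out = 96)
set.seed(42)
high <- 960 + runif(96, 0, 10)
low  <- high - runif(96, 0, 2)

day  <- format(idx, "%Y-%m-%d")        # calendar day of each bar, in the index time zone
hour <- as.integer(format(idx, "%H"))  # hour of day of each bar
in_window <- hour == 4                 # 04:00:00 up to, but excluding, 05:00:00

# One value per day, computed only from the bars inside the window
win_max <- tapply(high[in_window], day[in_window], max)
win_min <- tapply(low[in_window],  day[in_window], min)

# Look up each bar's day so every row of that day carries the daily value;
# days without any bar in the window get NA
max_col <- as.vector(win_max)[match(day, names(win_max))]
min_col <- as.vector(win_min)[match(day, names(win_min))]

head(data.frame(time = idx, high, low, max_col, min_col), 12)
```

Grouping on a formatted day string keeps the sketch independent of xts; the same two vectors could then be assigned as new columns of the xts object.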


How to plot lagged data against other data in R

I would like to lag one variable by, say, 10 time steps and plot it against the other variable, which remains unchanged. I would like to do this for various lags to see whether there is a time period over which the first variable influences the other. The data are daily, and after lagging I separate out the Dec-Feb data only. The problem is that the plot and the correlation between the lagged variable and the other data come out the same as the non-lagged plot and correlation every time. I am not sure how to achieve this.
A sample of my data frame "data" can be seen below.
Date x y
14158 2017-10-05 1.913918e+00 -0.1538234614
14159 2017-10-06 1.479714e+00 -0.1937094170
14160 2017-10-07 8.783669e-01 -0.1703790211
14161 2017-10-08 5.706581e-01 -0.1294144428
14162 2017-10-09 4.979405e-01 -0.0666569815
14163 2017-10-10 3.233477e-01 0.0072006102
14164 2017-10-11 3.057630e-01 0.0863445067
14165 2017-10-12 5.877673e-01 0.1097707831
14166 2017-10-13 1.208526e+00 0.1301967193
14167 2017-10-14 1.671705e+00 0.1728109268
14168 2017-10-15 1.810979e+00 0.2264911145
14169 2017-10-16 1.426651e+00 0.2702958315
14170 2017-10-17 1.241140e+00 0.3242637704
14171 2017-10-18 8.997498e-01 0.3879727861
14172 2017-10-19 5.594161e-01 0.4172990825
14173 2017-10-20 3.980254e-01 0.3915170864
14174 2017-10-21 2.138538e-01 0.3249736995
14175 2017-10-22 3.926440e-01 0.2224834840
14176 2017-10-23 2.268644e-01 0.0529143372
14177 2017-10-24 5.664923e-01 -0.0081443464
14178 2017-10-25 6.167520e-01 0.0312073984
14179 2017-10-26 7.751882e-02 0.0043897693
14180 2017-10-27 -5.634851e-02 -0.0726825266
14181 2017-10-28 -2.122061e-01 -0.1711305549
14182 2017-10-29 -8.500991e-01 -0.2068581639
14183 2017-10-30 -1.039685e+00 -0.2909120824
14184 2017-10-31 -3.057745e-01 -0.3933633317
14185 2017-11-01 -1.288774e-01 -0.3726346136
14186 2017-11-02 -5.608007e-03 -0.2425754386
14187 2017-11-03 4.853990e-01 -0.0503543980
14188 2017-11-04 5.822672e-01 0.0896130098
14189 2017-11-05 8.491505e-01 0.1299151006
14190 2017-11-06 1.052999e+00 0.0749888307
14191 2017-11-07 1.170470e+00 0.0287317882
14192 2017-11-08 7.919862e-01 0.0788187381
14193 2017-11-09 4.574565e-01 0.1539981316
14194 2017-11-10 4.552032e-01 0.2034393145
14195 2017-11-11 -3.621350e-01 0.2077476707
14196 2017-11-12 -8.053965e-01 0.1759558604
14197 2017-11-13 -8.307459e-01 0.1802858410
14198 2017-11-14 -9.421325e-01 0.2175529008
14199 2017-11-15 -9.880204e-01 0.2392924580
14200 2017-11-16 -7.448127e-01 0.2519253751
14201 2017-11-17 -8.081435e-01 0.2614254732
14202 2017-11-18 -1.216806e+00 0.2629971336
14203 2017-11-19 -1.122674e+00 0.3469995055
14204 2017-11-20 -1.242597e+00 0.4553094014
14205 2017-11-21 -1.294885e+00 0.5049438231
14206 2017-11-22 -9.325514e-01 0.4684133163
14207 2017-11-23 -4.632281e-01 0.4071673624
14208 2017-11-24 -9.689322e-02 0.3710270269
14209 2017-11-25 4.704467e-01 0.4126721465
14210 2017-11-26 8.682453e-01 0.3745057653
14211 2017-11-27 5.105564e-01 0.2373454931
14212 2017-11-28 4.747265e-01 0.1650783370
14213 2017-11-29 5.905379e-01 0.2632154120
14214 2017-11-30 4.083787e-01 0.3888834762
14215 2017-12-01 3.451736e-01 0.5008047592
14216 2017-12-02 5.161312e-01 0.5388177242
14217 2017-12-03 7.109279e-01 0.5515360710
14218 2017-12-04 4.458635e-01 0.5127537202
14219 2017-12-05 -3.986610e-01 0.3896493238
14220 2017-12-06 -5.968253e-01 0.1095843268
14221 2017-12-07 -1.604398e-01 -0.2455506506
14222 2017-12-08 -4.384744e-01 -0.5801038215
14223 2017-12-09 -7.255016e-01 -0.8384627087
14224 2017-12-10 -9.691828e-01 -0.9223171538
14225 2017-12-11 -1.140588e+00 -0.8177806761
14226 2017-12-12 -1.956622e-01 -0.5250998474
14227 2017-12-13 -1.083792e-01 -0.3430768534
14228 2017-12-14 -8.016345e-02 -0.3163476104
14229 2017-12-15 8.899266e-01 -0.2813253830
14230 2017-12-16 1.322833e+00 -0.2545953062
14231 2017-12-17 1.547972e+00 -0.2275373110
14232 2017-12-18 2.164907e+00 -0.3217205817
14233 2017-12-19 2.276258e+00 -0.5773412429
14234 2017-12-20 1.862291e+00 -0.7728091393
14235 2017-12-21 1.125083e+00 -0.9099696881
14236 2017-12-22 7.737118e-01 -1.2441963604
14237 2017-12-23 7.863508e-01 -1.4802661587
14238 2017-12-24 4.313111e-01 -1.4111320559
14239 2017-12-25 -8.814799e-02 -1.0024805520
14240 2017-12-26 -3.615127e-01 -0.4943077147
14241 2017-12-27 -5.011363e-01 -0.0308588186
14242 2017-12-28 -8.474088e-01 0.3717555895
14243 2017-12-29 -7.283247e-01 0.8230450219
14244 2017-12-30 -4.566981e-01 1.2495961116
14245 2017-12-31 -4.577034e-01 1.4805369230
14246 2018-01-01 1.946166e-01 1.5310004017
14247 2018-01-02 5.203149e-01 1.5384595802
14248 2018-01-03 5.024570e-02 1.4036679018
14249 2018-01-04 -7.065297e-01 1.0749574137
14250 2018-01-05 -8.741815e-01 0.7608524752
14251 2018-01-06 1.589530e-01 0.7891084646
14252 2018-01-07 8.632378e-01 1.1230358751
I am using
lagged <- lag(ts(x), k=10)
This is so the tsp isn't ignored. However, when I do
cor(data$x, data$y)
and
cor(lagged, data$y)
the result is the same, where I would have expected it to be different. How do I get this lag to work so that I can then go ahead and separate by date?
Many thanks!
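A likely reason the two correlations match: lag() applied to a ts object shifts only the time attribute (tsp) and leaves the values in the same positions, while cor() compares values position by position and ignores tsp. The sketch below illustrates this with simulated data and a hypothetical helper, shift(), that moves the values themselves. The data, the helper name and the lag of 10 are assumptions for illustration only.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
# Simulated y that follows x with a delay of 10 steps, plus a little noise
y <- c(rep(0, 10), x[1:(n - 10)]) + rnorm(n, sd = 0.1)

# stats::lag() on a ts changes the time base, not the stored values
lagged_ts <- stats::lag(ts(x), k = 10)
identical(as.numeric(lagged_ts), x)   # TRUE: same values in the same order

# Hypothetical helper: push the values k positions later, padding with NA
shift <- function(v, k) c(rep(NA, k), head(v, -k))
x_lag10 <- shift(x, 10)

cor(x, y)                              # unshifted correlation
cor(x_lag10, y, use = "complete.obs")  # correlation after shifting the values
```

Shifting the values inside the data frame first, and only then filtering to Dec-Feb, keeps each lagged value paired with the correct date.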

dealing with "missing" times when setting data to xts

I have some data that looks like the following:
Dates Open Close
1000 06/06/2019 0:05 244.599 244.524
1001 06/06/2019 0:04 244.592 244.599
1002 06/06/2019 0:03 244.564 244.592
1003 06/06/2019 0:02 244.809 244.564
1004 06/06/2019 0:01 244.849 244.809
1005 06/06/2019 245.080 244.849
1006 05/06/2019 23:59 245.092 245.080
1007 05/06/2019 23:58 245.253 245.092
1008 05/06/2019 23:57 244.858 245.253
1009 05/06/2019 23:56 244.643 244.863
1010 05/06/2019 23:55 244.720 244.643
Row 1005 doesn't have a time stamp. I tried to convert my dates to POSIXlt format.
data$Dates <- gsub("/", "-", data$Dates)
data$Dates <- as.POSIXlt(strptime(data$Dates, format="%d-%m-%Y %H:%M"))
Now my data looks like:
Dates Open Close
1000 2019-06-06 00:05:00 244.599 244.524
1001 2019-06-06 00:04:00 244.592 244.599
1002 2019-06-06 00:03:00 244.564 244.592
1003 2019-06-06 00:02:00 244.809 244.564
1004 2019-06-06 00:01:00 244.849 244.809
1005 <NA> 245.080 244.849
1006 2019-06-05 23:59:00 245.092 245.080
1007 2019-06-05 23:58:00 245.253 245.092
1008 2019-06-05 23:57:00 244.858 245.253
1009 2019-06-05 23:56:00 244.643 244.863
1010 2019-06-05 23:55:00 244.720 244.643
I am wondering whether there is a way to convert the entries that have no hour or minute data. This only occurs at 0:00.
Data:
data <- structure(list(Dates = c("06/06/2019 0:05", "06/06/2019 0:04",
"06/06/2019 0:03", "06/06/2019 0:02", "06/06/2019 0:01", "06/06/2019",
"05/06/2019 23:59", "05/06/2019 23:58", "05/06/2019 23:57", "05/06/2019 23:56",
"05/06/2019 23:55"), Open = c(244.599, 244.592, 244.564, 244.809,
244.849, 245.08, 245.092, 245.253, 244.858, 244.643, 244.72),
Close = c(244.524, 244.599, 244.592, 244.564, 244.809, 244.849,
245.08, 245.092, 245.253, 244.863, 244.643)), row.names = 1000:1010, class = "data.frame")
EDIT:
I just thought that perhaps I should first split the column into two (one for dates and another for times), fill in the blank cells in the second column with 0:00, and paste them back together.
parse_date_time in the lubridate package will successively check alternative formats until it succeeds if you give it a vector of formats. The separators and percent signs can be omitted from the format strings.
library(lubridate)
parse_date_time(data$Dates, c("dmYHM", "dmY"), tz = "")
giving:
[1] "2019-06-06 00:05:00 EDT" "2019-06-06 00:04:00 EDT"
[3] "2019-06-06 00:03:00 EDT" "2019-06-06 00:02:00 EDT"
[5] "2019-06-06 00:01:00 EDT" "2019-06-06 00:00:00 EDT"
[7] "2019-06-05 23:59:00 EDT" "2019-06-05 23:58:00 EDT"
[9] "2019-06-05 23:57:00 EDT" "2019-06-05 23:56:00 EDT"
[11] "2019-06-05 23:55:00 EDT"
Using dplyr, one possibility could be:
library(dplyr)
data %>%
  mutate(Dates = ifelse(nchar(Dates) == 10, paste(Dates, "0:00", sep = " "), Dates),
         Dates = as.POSIXct(Dates, format = "%d/%m/%Y %H:%M"))
Dates Open Close
1 2019-06-06 00:05:00 244.599 244.524
2 2019-06-06 00:04:00 244.592 244.599
3 2019-06-06 00:03:00 244.564 244.592
4 2019-06-06 00:02:00 244.809 244.564
5 2019-06-06 00:01:00 244.849 244.809
6 2019-06-06 00:00:00 245.080 244.849
7 2019-06-05 23:59:00 245.092 245.080
8 2019-06-05 23:58:00 245.253 245.092
9 2019-06-05 23:57:00 244.858 245.253
10 2019-06-05 23:56:00 244.643 244.863
11 2019-06-05 23:55:00 244.720 244.643
Here, for rows that contain only the 10-character date, it appends 0:00 before parsing.
The same with base R:
data$Dates <- ifelse(nchar(data$Dates) == 10, paste(data$Dates, "0:00", sep = " "), data$Dates)
as.POSIXct(data$Dates, format = "%d/%m/%Y %H:%M")
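Another base R option, sketched here as an assumption rather than as part of the answers above, is to parse the strings twice, once with the full format and once with the date-only format, and fill the failures of the first pass from the second. The three sample strings and the choice of tz = "UTC" are illustrative.

```r
dates <- c("06/06/2019 0:01", "06/06/2019", "05/06/2019 23:59")

# First pass: full date-time format; the date-only entry becomes NA
full <- as.POSIXct(dates, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Second pass: date-only format, which yields midnight of that day
date_only <- as.POSIXct(dates, format = "%d/%m/%Y", tz = "UTC")

# Fill only the entries the first pass could not parse
missing_time <- is.na(full)
parsed <- full
parsed[missing_time] <- date_only[missing_time]

format(parsed, "%Y-%m-%d %H:%M")
```

This avoids relying on the string length being exactly 10 characters, at the cost of parsing the vector twice.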

endpoints are always in UTC/GMT

I think the endpoints are generated by converting the index to UTC. I want the endpoints at the change of every hour (9:00, 10:00, etc.) or at intervals I define (say 9:30, 10:30, etc.).
In the example below, object 'a' is in UTC and its endpoints fall at 4:55, 5:55, etc.; for object 'b' they fall at 10:25, 11:25, etc. I am looking for a solution that gives endpoints at the specified times, irrespective of time zone.
Is there any simple mechanism to do so?
> head(a)
Open High Low Close
2008-01-01 04:30:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 04:35:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 04:40:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 04:45:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 04:50:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(a)
TZ
"UTC"
> a[endpoints(a,on="hours")]
Open High Low Close
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 05:55:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 06:55:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 07:55:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 08:55:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 09:55:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 04:55:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 05:55:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 06:55:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 07:05:00 6107.40 6107.40 6093.70 6094.80
> head(b)
Open High Low Close
2008-01-01 10:00:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 10:05:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 10:10:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 10:15:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 10:20:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(b)
TZ
"Asia/Kolkata"
> b[endpoints(b,on="hours")]
Open High Low Close
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 11:25:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 12:25:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 13:25:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 14:25:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 15:25:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 10:25:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 11:25:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 12:25:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 12:35:00 6107.40 6107.40 6093.70 6094.80
>
Thanks & Regds
Siva Sunku
endpoints always calculates offsets in UTC. There's nothing you can do to change that as an end user. Because Asia/Kolkata is UTC+5:30, the UTC hour boundaries fall on the local half hour. You can work around this by comparing the 1-hour endpoints with the 30-minute endpoints.
x <- .xts(1:12, seq(0, by=600, length.out=12), tzone="Asia/Kolkata")
x[endpoints(x, "hours")]
# [,1]
# 1970-01-01 06:20:00 6
# 1970-01-01 07:20:00 12
hourEndpointsTZ30 <- function(x) {
h <- endpoints(x, "hours", 1)
m <- endpoints(x, "minutes", 30)
c(0, setdiff(m, h), last(m))
}
x[hourEndpointsTZ30(x)]
# [,1]
# 1970-01-01 05:50:00 3
# 1970-01-01 06:50:00 9
# 1970-01-01 07:20:00 12
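A different way to get the same positions, shown here only as a base R sketch and not as xts's own mechanism, is to label each timestamp with its local wall-clock hour and mark the rows where that label changes. The names idx, local_hour and ends are assumptions for illustration; the timestamps are the same twelve used in the answer.

```r
# Same 12 ten-minute timestamps as in the answer, displayed in Asia/Kolkata
idx <- as.POSIXct(seq(0, by = 600, length.out = 12),
                  origin = "1970-01-01", tz = "Asia/Kolkata")

# Label every observation with its local date and hour
local_hour <- format(idx, "%Y-%m-%d %H")

# A period ends wherever the label differs from the next observation's label
changes <- which(local_hour[-1] != local_hour[-length(local_hour)])

# Same layout as endpoints(): leading 0, then the last row of each period
ends <- c(0, changes, length(idx))
ends
# 0 3 9 12

format(idx[ends[-1]], "%H:%M")
# "05:50" "06:50" "07:20"
```

The resulting vector has the same shape as the output of endpoints(), so it could be passed as INDEX to functions such as period.apply.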

Troubles in applying the zoo aggregate function to a time series

We have the following function to compute monthly returns from a daily series of prices:
PricesRet = diff(Prices)/lag(Prices,k=-1)
tail(PricesRet)
# Monthly simple returns
MonRet = aggregate(PricesRet+1, as.yearmon, prod)-1
tail(MonRet)
The problem is that it returns wrong values. Take, for example, the simple return for February 2013: the function returns -0.003517301, while it should be -0.01304773.
Why does that happen?
Here are the last prices observations:
> tail(Prices,30)
Prices
2013-01-22 165.5086
2013-01-23 165.2842
2013-01-24 168.4845
2013-01-25 170.6041
2013-01-28 169.7373
2013-01-29 169.8724
2013-01-30 170.6554
2013-01-31 170.7210
2013-02-01 173.8043
2013-02-04 172.2145
2013-02-05 172.8400
2013-02-06 172.8333
2013-02-07 171.3586
2013-02-08 170.5602
2013-02-11 171.2172
2013-02-12 171.4126
2013-02-13 171.8687
2013-02-14 170.7955
2013-02-15 171.2848
2013-02-19 170.9482
2013-02-20 171.6355
2013-02-21 170.0300
2013-02-22 169.9319
2013-02-25 170.9035
2013-02-26 168.6822
2013-02-27 168.5180
2013-02-28 168.4935
2013-03-01 169.6546
2013-03-04 169.3076
2013-03-05 169.0579
Here are price returns:
> tail(PricesRet,50)
PricesRet
2012-12-18 0.0055865274
2012-12-19 -0.0015461900
2012-12-20 -0.0076140194
2012-12-23 0.0032656346
2012-12-26 0.0147750923
2012-12-27 0.0013482760
2012-12-30 -0.0004768131
2013-01-01 0.0128908541
2013-01-02 -0.0047646818
2013-01-03 0.0103372029
2013-01-06 -0.0024547278
2013-01-07 -0.0076920352
2013-01-08 0.0064368720
2013-01-09 0.0119663301
2013-01-10 0.0153828814
2013-01-13 0.0050590540
2013-01-14 -0.0053324785
2013-01-15 -0.0027043105
2013-01-16 0.0118840383
2013-01-17 -0.0005876459
2013-01-21 -0.0145541598
2013-01-22 -0.0013555548
2013-01-23 0.0193624621
2013-01-24 0.0125802978
2013-01-27 -0.0050807744
2013-01-28 0.0007959058
2013-01-29 0.0046096266
2013-01-30 0.0003844082
2013-01-31 0.0180603867
2013-02-03 -0.0091473127
2013-02-04 0.0036322298
2013-02-05 -0.0000390941
2013-02-06 -0.0085320734
2013-02-07 -0.0046591956
2013-02-10 0.0038517581
2013-02-11 0.0011412046
2013-02-12 0.0026607502
2013-02-13 -0.0062440496
2013-02-14 0.0028645616
2013-02-18 -0.0019651341
2013-02-19 0.0040206637
2013-02-20 -0.0093543648
2013-02-21 -0.0005764665
2013-02-24 0.0057176118
2013-02-25 -0.0129979321
2013-02-26 -0.0009730782
2013-02-27 -0.0001453191
2013-02-28 0.0068911863
2013-03-03 -0.0020455332
2013-03-04 -0.0014747845
The result of the function is instead:
> tail(data.frame(MonRet))
MonRet
ott 2012 -0.000848156
nov 2012 0.009833881
dic 2012 0.033406884
gen 2013 0.087822700
feb 2013 -0.023875638
mar 2013 -0.003517301
Your returns are wrong. The return for 2013-01-23 should be:
> 165.2842/165.5086-1
[1] -0.001355821
but you have 0.0193624621. I suspect this is because Prices is an xts object, not a zoo object. lag.xts breaks with the convention of lag.ts and lag.zoo, in which k=1 implies a "lag" of (t+1), in favor of the more common convention in which k=1 implies a "lag" of (t-1).
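To check the expected numbers independently of any lag() convention, the returns can be recomputed from plain numeric vectors. The sketch below uses the prices listed in the question (31 January through 28 February 2013); the variable names are assumptions for illustration.

```r
# Closing prices from the question: 2013-01-31 followed by every February date
p <- c(170.7210, 173.8043, 172.2145, 172.8400, 172.8333, 171.3586, 170.5602,
       171.2172, 171.4126, 171.8687, 170.7955, 171.2848, 170.9482, 171.6355,
       170.0300, 169.9319, 170.9035, 168.6822, 168.5180, 168.4935)

# Daily simple return: today's price over the previous day's price, minus 1
daily <- p[-1] / p[-length(p)] - 1

# Compounding the daily returns gives the monthly simple return
feb_ret <- prod(1 + daily) - 1
feb_ret          # about -0.01305, the value the asker expected

# The single-day check from the answer, for 2013-01-23
165.2842 / 165.5086 - 1
```

If Prices is indeed an xts object, as the answer suspects, the analogous expression would be Prices / lag(Prices, k = 1) - 1, since lag.xts with k = 1 returns the previous observation.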

Date time am/pm in R

I'm having an issue with the datetime field of a timeseries:
> CO1temp[163:169,]
Date OPEN HIGH LOW CLOSE
163 7/11/2011 11:45:00 PM 116.30 116.30 116.09 116.18
164 7/11/2011 11:50:00 PM 116.16 116.78 116.13 116.70
165 7/11/2011 11:55:00 PM 116.69 116.83 116.51 116.65
166 7/12/2011 116.65 116.79 116.44 116.50
167 7/12/2011 12:05:00 AM 116.50 116.60 116.39 116.47
168 7/12/2011 12:10:00 AM 116.49 116.55 116.38 116.52
169 7/12/2011 12:15:00 AM 116.52 116.67 116.39 116.44
As you can see, the midnight time (row 166) is missing its time component, which creates an NA when I create my xts object:
CO1 <- as.xts(CO1temp[, 2:5], order.by = as.POSIXct(CO1temp[,1],format='%m/%d/%Y %r'),frequency="5 minutes")
> CO1[163:169,]
OPEN HIGH LOW CLOSE
2011-07-11 23:45:00 116.30 116.30 116.09 116.18
2011-07-11 23:50:00 116.16 116.78 116.13 116.70
2011-07-11 23:55:00 116.69 116.83 116.51 116.65
<NA> 116.65 116.79 116.44 116.50
2011-07-12 00:05:00 116.50 116.60 116.39 116.47
2011-07-12 00:10:00 116.49 116.55 116.38 116.52
2011-07-12 00:15:00 116.52 116.67 116.39 116.44
This later leads to more problems when I want to analyze this time series.
?strptime is quite specific about it:
The default for the format methods is "%Y-%m-%d %H:%M:%S" if any component has a time component which is not midnight, and "%Y-%m-%d" otherwise.
However my datetime is not in the standard format.
I would greatly appreciate any help.
This is something of a hack, but it works.
Just append "12:00:00 AM" to every element of your date vector: entries that lack the time information will then be parsed correctly, and for entries that already include a time the appended text is simply ignored, so only the original time is read.
CO1 <- as.xts(CO1temp[, 2:5],
order.by = as.POSIXct(paste(CO1temp$Date,"12:00:00 AM", sep=" "),
format='%m/%d/%Y %r'),
frequency="5 minutes")
CO1
OPEN HIGH LOW CLOSE
2011-07-11 23:45:00 116.30 116.30 116.09 116.18
2011-07-11 23:50:00 116.16 116.78 116.13 116.70
2011-07-11 23:55:00 116.69 116.83 116.51 116.65
2011-07-12 00:00:00 116.65 116.79 116.44 116.50
2011-07-12 00:05:00 116.50 116.60 116.39 116.47
2011-07-12 00:10:00 116.49 116.55 116.38 116.52
2011-07-12 00:15:00 116.52 116.67 116.39 116.44
That being said, if your data frame looks like this because you have already run strptime on it, then your date column is already a date-time object (strptime returns POSIXlt, which as.POSIXct() converts if needed), and the following should work directly:
as.xts(CO1temp[, 2:5], order.by = CO1temp$Date, frequency = "5 minutes")
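The padding trick works because strptime-style parsing stops once the format has been satisfied and ignores any trailing text. A minimal base R sketch of that behaviour follows; the three sample strings and tz = "UTC" are assumptions for illustration, and the format spells out %I:%M:%S %p, which ?strptime documents as the input equivalent of %r. Parsing AM/PM depends on the locale, so this assumes an English or C locale.

```r
dates <- c("7/11/2011 11:55:00 PM", "7/12/2011", "7/12/2011 12:05:00 AM")

# Append a midnight time to every element, including those that already have one
padded <- paste(dates, "12:00:00 AM")

# Parsing consumes the first time it finds and ignores the rest of the string
parsed <- as.POSIXct(padded, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
# "2011-07-11 23:55:00" "2011-07-12 00:00:00" "2011-07-12 00:05:00"
```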
