min and max over time range on each day of xts - r
I have an xts object with intraday OHLC price data covering several years. I'd like to write a function that calculates the min and max value between 04:00:00 and 05:00:00 for every day and adds them as columns to the xts object. I'm not really familiar with manipulating xts objects. Can anyone point me in the right direction? Here is the head of the xts object:
Open High Low Close Volume
2017-01-01 00:00:00 968.29 968.76 966.74 966.97 106562
2017-01-01 00:05:00 966.97 967.00 966.89 966.89 13731
2017-01-01 00:10:00 966.89 966.89 964.86 964.86 124137
2017-01-01 00:15:00 964.86 964.99 964.80 964.80 3001
2017-01-01 00:20:00 964.80 964.80 964.80 964.80 0
2017-01-01 00:25:00 964.80 965.09 964.54 964.91 48000
2017-01-01 00:30:00 964.91 965.01 964.91 965.01 2501
2017-01-01 00:35:00 965.01 967.82 965.57 967.82 71501
2017-01-01 00:40:00 967.82 967.82 967.08 967.08 50
2017-01-01 00:45:00 967.08 967.40 967.40 967.40 50
2017-01-01 00:50:00 967.40 968.08 967.40 968.08 14000
2017-01-01 00:55:00 968.08 968.08 966.89 968.00 1008
2017-01-01 01:00:00 968.00 968.10 968.00 968.10 1002
2017-01-01 01:05:00 968.10 968.10 967.62 967.62 5200
2017-01-01 01:10:00 967.62 967.70 966.29 966.29 35476
2017-01-01 01:15:00 966.29 966.29 966.28 966.28 3068
2017-01-01 01:20:00 966.28 966.66 965.00 965.00 30471
2017-01-01 01:25:00 965.00 965.01 964.00 964.00 77884
2017-01-01 01:30:00 964.00 964.76 964.76 964.76 500
2017-01-01 01:35:00 964.76 967.48 964.69 965.00 134129
2017-01-01 01:40:00 965.00 965.00 963.67 963.67 59676
2017-01-01 01:45:00 963.67 963.67 963.67 963.67 0
2017-01-01 01:50:00 963.67 964.56 963.66 964.55 5531
2017-01-01 01:55:00 964.55 963.43 963.40 963.40 3000
2017-01-01 02:00:00 963.40 964.60 963.40 964.60 1301
2017-01-01 02:05:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:10:00 964.60 964.60 964.00 964.11 49954
2017-01-01 02:15:00 964.11 964.60 964.59 964.60 5000
2017-01-01 02:20:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:25:00 964.60 964.60 964.51 964.51 2000
2017-01-01 02:30:00 964.51 964.51 964.51 964.51 0
2017-01-01 02:35:00 964.51 964.51 963.23 963.99 16667
2017-01-01 02:40:00 963.99 963.99 963.65 963.66 10000
2017-01-01 02:45:00 963.66 964.26 963.16 964.26 75500
2017-01-01 02:50:00 964.26 964.26 964.26 964.26 0
2017-01-01 02:55:00 964.26 964.26 964.26 964.26 0
2017-01-01 03:00:00 964.26 964.61 963.98 964.61 13000
2017-01-01 03:05:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:10:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:15:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:20:00 964.61 964.82 964.48 964.82 16666
2017-01-01 03:25:00 964.82 965.00 963.99 964.97 50500
2017-01-01 03:30:00 964.97 964.97 964.02 964.02 56000
2017-01-01 03:35:00 964.02 964.29 964.29 964.29 500
2017-01-01 03:40:00 964.29 963.53 963.52 963.52 24000
2017-01-01 03:45:00 963.52 963.52 963.43 963.43 16500
2017-01-01 03:50:00 963.43 963.67 963.42 963.42 25002
2017-01-01 03:55:00 963.42 963.42 961.69 961.69 84507
2017-01-01 04:00:00 961.69 961.69 960.90 960.93 57909
2017-01-01 04:05:00 960.93 960.93 960.93 960.93 0
2017-01-01 04:10:00 960.93 961.19 961.19 961.19 400
2017-01-01 04:15:00 961.19 962.09 961.19 962.09 7001
2017-01-01 04:20:00 962.09 962.09 962.09 962.09 0
2017-01-01 04:25:00 962.09 962.10 961.14 961.14 32000
2017-01-01 04:30:00 961.14 961.14 960.93 960.93 41900
2017-01-01 04:35:00 960.93 961.94 960.93 961.64 640
2017-01-01 04:40:00 961.64 961.71 961.64 961.71 1
2017-01-01 04:45:00 961.71 962.00 961.90 961.99 5499
2017-01-01 04:50:00 961.99 961.99 961.99 961.99 0
2017-01-01 04:55:00 961.99 961.99 961.99 961.99 1
2017-01-01 05:00:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:05:00 961.99 961.99 961.99 961.99 40
2017-01-01 05:10:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:15:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:20:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:25:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:30:00 961.99 962.10 961.99 962.10 1382
2017-01-01 05:35:00 962.10 968.84 962.10 968.84 122909
2017-01-01 05:40:00 968.84 968.86 963.78 965.53 161263
2017-01-01 05:45:00 965.53 964.81 963.11 963.81 18021
2017-01-01 05:50:00 963.81 964.39 963.85 964.39 40006
2017-01-01 05:55:00 964.39 964.47 964.00 964.47 39966
2017-01-01 06:00:00 964.47 964.47 964.47 964.47 0
You can do this by filtering on the hour of the index and then using the period.max and period.min functions. The result is written to the last record of the chosen hour on each day; all other rows get NA. The example below uses intraday MSFT data and computes the max and min values between 15:00 and 16:00.
library(xts)
# max of high values between 15 and 16. (excluding 16:00)
msft$max <- period.max(msft$high[.indexhour(msft) == 15], endpoints(msft$high[.indexhour(msft) == 15], on = "hour"))
# min of low values between 15 and 16. (excluding 16:00)
msft$min <- period.min(msft$low[.indexhour(msft) == 15], endpoints(msft$low[.indexhour(msft) == 15], on = "hour"))
head(msft[8:24], 16)
open high low close volume max min
2020-01-23 14:50:00 166.180 166.2300 166.090 166.1050 87934 NA NA
2020-01-23 14:55:00 166.105 166.2200 166.103 166.1700 92280 NA NA
2020-01-23 15:00:00 166.160 166.3500 166.160 166.3400 114359 NA NA
2020-01-23 15:05:00 166.335 166.3400 166.285 166.2850 102633 NA NA
2020-01-23 15:10:00 166.290 166.3050 166.170 166.2550 125558 NA NA
2020-01-23 15:15:00 166.250 166.2750 166.210 166.2400 103938 NA NA
2020-01-23 15:20:00 166.230 166.2500 166.180 166.2350 99649 NA NA
2020-01-23 15:25:00 166.240 166.3000 166.225 166.2850 93846 NA NA
2020-01-23 15:30:00 166.270 166.4164 166.175 166.3600 183154 NA NA
2020-01-23 15:35:00 166.360 166.5000 166.320 166.4600 177178 NA NA
2020-01-23 15:40:00 166.450 166.4650 166.380 166.3800 112174 NA NA
2020-01-23 15:45:00 166.385 166.4050 166.290 166.3875 152806 NA NA
2020-01-23 15:50:00 166.382 166.5200 166.362 166.4500 205667 NA NA
2020-01-23 15:55:00 166.450 166.6900 166.305 166.6700 508469 166.69 166.16
2020-01-23 16:00:00 166.660 166.7200 166.589 166.7200 934090 NA NA
2020-01-24 09:35:00 167.510 167.5300 166.890 166.8918 1152646 NA NA
data:
msft <- structure(c(166.224, 166.29, 166.29, 166.2456, 166.165, 166.1446,
166.1601, 166.18, 166.105, 166.16, 166.335, 166.29, 166.25, 166.23,
166.24, 166.27, 166.36, 166.45, 166.385, 166.382, 166.45, 166.66,
167.51, 167.03, 167.265, 167.325, 167.37, 167.16, 167.405, 167.35,
167.31, 167.39, 167.17, 167.1, 166.845, 167.03, 167.1223, 167.125,
167.21, 167.34, 167.235, 167.3, 167.37, 167.1977, 166.9814, 166.8499,
166.99, 166.93, 166.83, 166.64, 166.775, 166.85, 166.71, 166.6838,
166.46, 166.35, 165.765, 166.2269, 166.01, 166.19, 166.13, 166.31,
166.36, 166.42, 166.3682, 165.99, 166.1328, 165.85, 165.74, 165.8439,
165.655, 165.5434, 165.47, 165.3227, 165.0627, 165.03, 165.2546,
165.14, 165.1, 164.91, 164.75, 164.65, 164.53, 164.81, 164.8979,
164.6, 164.89, 164.94, 165.03, 165.12, 165.17, 165.24, 165.4,
165.335, 165.2734, 164.985, 164.9, 164.61, 164.93, 165.18, 166.315,
166.29, 166.3, 166.265, 166.22, 166.2201, 166.2, 166.23, 166.22,
166.35, 166.34, 166.305, 166.275, 166.25, 166.3, 166.4164, 166.5,
166.465, 166.405, 166.52, 166.69, 166.72, 167.53, 167.34, 167.39,
167.495, 167.47, 167.48, 167.4251, 167.3699, 167.42, 167.41,
167.2, 167.1, 167.03, 167.21, 167.23, 167.255, 167.35, 167.35,
167.33, 167.405, 167.38, 167.25, 167.01, 167, 167.02, 167.02,
166.8384, 166.9056, 166.86, 166.94, 166.75, 166.6844, 166.47,
166.42, 166.22, 166.4049, 166.221, 166.2003, 166.3749, 166.3999,
166.43, 166.43, 166.375, 166.175, 166.16, 165.96, 165.93, 165.86,
165.671, 165.64, 165.49, 165.4, 165.08, 165.27, 165.26, 165.34,
165.12, 165, 164.825, 164.765, 164.82, 164.89, 165, 164.89, 164.99,
165.041, 165.293, 165.23, 165.27, 165.44, 165.6046, 165.37, 165.295,
165.18, 164.93, 164.945, 165.185, 165.24, 166.22, 166.225, 166.25,
166.15, 166.145, 166.13, 166.1015, 166.09, 166.103, 166.16, 166.285,
166.17, 166.21, 166.18, 166.225, 166.175, 166.32, 166.38, 166.29,
166.362, 166.305, 166.589, 166.89, 167.03, 167.22, 167.32, 167.225,
167.16, 167.2, 167.23, 167.2801, 167.145, 167.05, 166.84, 166.77,
167, 167.1, 167.02, 167.18, 167.23, 167.223, 167.28, 167.1843,
166.862, 166.85, 166.821, 166.8121, 166.85, 166.55, 166.6303,
166.69, 166.7, 166.54, 166.4, 166.31, 165.76, 165.74, 165.8966,
165.91, 166.07, 166.09, 166.171, 166.32, 166.22, 165.96, 165.97,
165.82, 165.73, 165.72, 165.64, 165.49, 165.45, 165.32, 165.045,
164.89, 164.91, 165.09, 165.1, 164.91, 164.74, 164.53, 164.529,
164.53, 164.735, 164.59, 164.54, 164.88, 164.938, 165.01, 165.0792,
165.12, 165.22, 165.335, 165.263, 164.88, 164.89, 164.58, 164.58,
164.87, 164.87, 166.29, 166.275, 166.26, 166.16, 166.145, 166.155,
166.19, 166.105, 166.17, 166.34, 166.285, 166.255, 166.24, 166.235,
166.285, 166.36, 166.46, 166.38, 166.3875, 166.45, 166.67, 166.72,
166.8918, 167.27, 167.325, 167.371, 167.2251, 167.4, 167.34,
167.29, 167.3988, 167.2, 167.1047, 166.86, 167.025, 167.11, 167.12,
167.2, 167.345, 167.23, 167.29, 167.37, 167.1916, 167.0027, 166.85,
167, 166.94, 166.85, 166.64, 166.7738, 166.85, 166.72, 166.68,
166.4672, 166.3512, 165.79, 166.21, 165.9969, 166.18, 166.14,
166.2968, 166.36, 166.43, 166.36, 165.99, 166.13, 165.83, 165.73,
165.8405, 165.65, 165.545, 165.48, 165.33, 165.05, 165.0227,
165.26, 165.1425, 165.101, 164.91, 164.74, 164.6581, 164.5292,
164.805, 164.89, 164.59, 164.8801, 164.9498, 165.04, 165.12,
165.16, 165.2302, 165.4, 165.34, 165.28, 164.987, 164.89, 164.605,
164.94, 165.185, 165.04, 158120, 165333, 101115, 78491, 123999,
76037, 82733, 87934, 92280, 114359, 102633, 125558, 103938, 99649,
93846, 183154, 177178, 112174, 152806, 205667, 508469, 934090,
1152646, 558627, 277325, 321651, 255494, 333848, 272126, 395463,
194593, 211910, 193131, 242112, 210240, 193265, 139617, 204182,
179146, 159259, 237888, 410982, 213787, 233082, 188071, 193742,
132377, 118994, 264247, 182490, 109514, 138164, 221052, 194127,
169059, 458214, 247712, 169523, 115531, 161259, 263230, 155536,
82474, 87549, 109057, 101772, 130642, 171988, 117235, 134507,
236662, 219303, 217698, 219808, 420288, 208087, 149358, 197435,
218090, 267667, 320279, 422434, 340478, 273866, 258938, 212451,
268017, 323657, 267686, 214060, 222314, 293731, 288867, 219687,
304733, 251063, 425450, 455311, 741208, 1429645),
.Dim = c(100L, 5L),
.Dimnames = list(NULL, c("open", "high", "low", "close", "volume")),
index = structure(c(1579785300, 1579785600, 1579785900, 1579786200, 1579786500,
1579786800, 1579787100, 1579787400, 1579787700, 1579788000,
1579788300, 1579788600, 1579788900, 1579789200, 1579789500,
1579789800, 1579790100, 1579790400, 1579790700, 1579791000,
1579791300, 1579791600, 1579854900, 1579855200, 1579855500,
1579855800, 1579856100, 1579856400, 1579856700, 1579857000,
1579857300, 1579857600, 1579857900, 1579858200, 1579858500,
1579858800, 1579859100, 1579859400, 1579859700, 1579860000,
1579860300, 1579860600, 1579860900, 1579861200, 1579861500,
1579861800, 1579862100, 1579862400, 1579862700, 1579863000,
1579863300, 1579863600, 1579863900, 1579864200, 1579864500,
1579864800, 1579865100, 1579865400, 1579865700, 1579866000,
1579866300, 1579866600, 1579866900, 1579867200, 1579867500,
1579867800, 1579868100, 1579868400, 1579868700, 1579869000,
1579869300, 1579869600, 1579869900, 1579870200, 1579870500,
1579870800, 1579871100, 1579871400, 1579871700, 1579872000,
1579872300, 1579872600, 1579872900, 1579873200, 1579873500,
1579873800, 1579874100, 1579874400, 1579874700, 1579875000,
1579875300, 1579875600, 1579875900, 1579876200, 1579876500,
1579876800, 1579877100, 1579877400, 1579877700, 1579878000),
tzone = "",
tclass = c("POSIXct", "POSIXt")),
class = c("xts", "zoo"))
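For the window asked about in the question (04:00 to 05:00), the same idea can be sketched on synthetic data. This is only a minimal sketch and not part of the answer above: the series x, its random values, and the column names max_0400 and min_0400 are made up for illustration, and the index is assumed to be in UTC so that hour and day boundaries are unambiguous.

```r
library(xts)

# Hypothetical 5-minute series covering two days, indexed in UTC
idx <- seq(as.POSIXct("2017-01-01 00:00:00", tz = "UTC"),
           as.POSIXct("2017-01-02 23:55:00", tz = "UTC"),
           by = "5 min")
set.seed(1)
x <- xts(cbind(High = 100 + runif(length(idx)),
               Low  =  99 + runif(length(idx))),
         order.by = idx)

# Bars stamped 04:00 through 04:55 on every day (05:00 itself is excluded)
win <- x[.indexhour(x) == 4]

# One group per calendar day within the window
ep <- endpoints(win, on = "days")
daily <- merge(period.max(win$High, ep), period.min(win$Low, ep))
colnames(daily) <- c("max_0400", "min_0400")

# Outer join back onto the full series: the values sit on each day's
# 04:55 bar and every other row is NA
out <- merge(x, daily)
```

Building the small daily object first and merging it back keeps the window logic in one place; changing the hour filter is enough to target a different window.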
Related
How to plot lagged data against other data in R
I would like to lag one variable by, say, 10 time steps and plot it against the other variable which remains the same. I would like to do this for various lags to see if there is a time period that the first variable influences the other. The data I have is daily and after lagging I am separating into Dec-Feb data only. The problem I am having is the plot and correlation between the lagged variable and the other data is coming out the same as the non-lagged plot and correlation every time. I am not sure how to achieve this. A sample of my data frame "data" can be seen below.
Date x y
14158 2017-10-05 1.913918e+00 -0.1538234614
14159 2017-10-06 1.479714e+00 -0.1937094170
14160 2017-10-07 8.783669e-01 -0.1703790211
14161 2017-10-08 5.706581e-01 -0.1294144428
14162 2017-10-09 4.979405e-01 -0.0666569815
14163 2017-10-10 3.233477e-01 0.0072006102
14164 2017-10-11 3.057630e-01 0.0863445067
14165 2017-10-12 5.877673e-01 0.1097707831
14166 2017-10-13 1.208526e+00 0.1301967193
14167 2017-10-14 1.671705e+00 0.1728109268
14168 2017-10-15 1.810979e+00 0.2264911145
14169 2017-10-16 1.426651e+00 0.2702958315
14170 2017-10-17 1.241140e+00 0.3242637704
14171 2017-10-18 8.997498e-01 0.3879727861
14172 2017-10-19 5.594161e-01 0.4172990825
14173 2017-10-20 3.980254e-01 0.3915170864
14174 2017-10-21 2.138538e-01 0.3249736995
14175 2017-10-22 3.926440e-01 0.2224834840
14176 2017-10-23 2.268644e-01 0.0529143372
14177 2017-10-24 5.664923e-01 -0.0081443464
14178 2017-10-25 6.167520e-01 0.0312073984
14179 2017-10-26 7.751882e-02 0.0043897693
14180 2017-10-27 -5.634851e-02 -0.0726825266
14181 2017-10-28 -2.122061e-01 -0.1711305549
14182 2017-10-29 -8.500991e-01 -0.2068581639
14183 2017-10-30 -1.039685e+00 -0.2909120824
14184 2017-10-31 -3.057745e-01 -0.3933633317
14185 2017-11-01 -1.288774e-01 -0.3726346136
14186 2017-11-02 -5.608007e-03 -0.2425754386
14187 2017-11-03 4.853990e-01 -0.0503543980
14188 2017-11-04 5.822672e-01 0.0896130098
14189 2017-11-05 8.491505e-01 0.1299151006
14190 2017-11-06 1.052999e+00 0.0749888307
14191 2017-11-07 1.170470e+00 0.0287317882
14192 2017-11-08 7.919862e-01 0.0788187381
14193 2017-11-09 4.574565e-01 0.1539981316
14194 2017-11-10 4.552032e-01 0.2034393145
14195 2017-11-11 -3.621350e-01 0.2077476707
14196 2017-11-12 -8.053965e-01 0.1759558604
14197 2017-11-13 -8.307459e-01 0.1802858410
14198 2017-11-14 -9.421325e-01 0.2175529008
14199 2017-11-15 -9.880204e-01 0.2392924580
14200 2017-11-16 -7.448127e-01 0.2519253751
14201 2017-11-17 -8.081435e-01 0.2614254732
14202 2017-11-18 -1.216806e+00 0.2629971336
14203 2017-11-19 -1.122674e+00 0.3469995055
14204 2017-11-20 -1.242597e+00 0.4553094014
14205 2017-11-21 -1.294885e+00 0.5049438231
14206 2017-11-22 -9.325514e-01 0.4684133163
14207 2017-11-23 -4.632281e-01 0.4071673624
14208 2017-11-24 -9.689322e-02 0.3710270269
14209 2017-11-25 4.704467e-01 0.4126721465
14210 2017-11-26 8.682453e-01 0.3745057653
14211 2017-11-27 5.105564e-01 0.2373454931
14212 2017-11-28 4.747265e-01 0.1650783370
14213 2017-11-29 5.905379e-01 0.2632154120
14214 2017-11-30 4.083787e-01 0.3888834762
14215 2017-12-01 3.451736e-01 0.5008047592
14216 2017-12-02 5.161312e-01 0.5388177242
14217 2017-12-03 7.109279e-01 0.5515360710
14218 2017-12-04 4.458635e-01 0.5127537202
14219 2017-12-05 -3.986610e-01 0.3896493238
14220 2017-12-06 -5.968253e-01 0.1095843268
14221 2017-12-07 -1.604398e-01 -0.2455506506
14222 2017-12-08 -4.384744e-01 -0.5801038215
14223 2017-12-09 -7.255016e-01 -0.8384627087
14224 2017-12-10 -9.691828e-01 -0.9223171538
14225 2017-12-11 -1.140588e+00 -0.8177806761
14226 2017-12-12 -1.956622e-01 -0.5250998474
14227 2017-12-13 -1.083792e-01 -0.3430768534
14228 2017-12-14 -8.016345e-02 -0.3163476104
14229 2017-12-15 8.899266e-01 -0.2813253830
14230 2017-12-16 1.322833e+00 -0.2545953062
14231 2017-12-17 1.547972e+00 -0.2275373110
14232 2017-12-18 2.164907e+00 -0.3217205817
14233 2017-12-19 2.276258e+00 -0.5773412429
14234 2017-12-20 1.862291e+00 -0.7728091393
14235 2017-12-21 1.125083e+00 -0.9099696881
14236 2017-12-22 7.737118e-01 -1.2441963604
14237 2017-12-23 7.863508e-01 -1.4802661587
14238 2017-12-24 4.313111e-01 -1.4111320559
14239 2017-12-25 -8.814799e-02 -1.0024805520
14240 2017-12-26 -3.615127e-01 -0.4943077147
14241 2017-12-27 -5.011363e-01 -0.0308588186
14242 2017-12-28 -8.474088e-01 0.3717555895
14243 2017-12-29 -7.283247e-01 0.8230450219
14244 2017-12-30 -4.566981e-01 1.2495961116
14245 2017-12-31 -4.577034e-01 1.4805369230
14246 2018-01-01 1.946166e-01 1.5310004017
14247 2018-01-02 5.203149e-01 1.5384595802
14248 2018-01-03 5.024570e-02 1.4036679018
14249 2018-01-04 -7.065297e-01 1.0749574137
14250 2018-01-05 -8.741815e-01 0.7608524752
14251 2018-01-06 1.589530e-01 0.7891084646
14252 2018-01-07 8.632378e-01 1.1230358751
I am using
lagged <- lag(ts(x), k=10)
This is so the tsp isn't ignored. However, when I do cor(data$x, data$y) and cor(lagged, data$y) the result is the same, where I would have thought it would have been different. How do I get this lag to work before I can go ahead separate via date? Many thanks!
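One likely reason for the identical correlations: lag() on a ts object only shifts the time attributes (tsp) and leaves the stored values untouched, while cor() looks only at the values. A minimal sketch of this, and of shifting the values by hand instead, follows. The vectors x and y are synthetic stand-ins rather than the data frame from the question, and the pairing of x[t] with y[t + k] is just one possible reading of "lag x by k steps".

```r
set.seed(42)
n <- 100
k <- 10

# Synthetic stand-ins: y follows x with a delay of k steps plus a little noise
x <- rnorm(n)
y <- c(rep(0, k), head(x, -k)) + rnorm(n, sd = 0.1)

# lag() on a ts object moves the time base, not the numbers
lagged <- stats::lag(ts(x), k = k)
same_values <- identical(as.numeric(lagged), x)

# So the correlation is unchanged, because cor() ignores the time attributes
cor_plain <- cor(x, y)
cor_ts    <- cor(lagged, y)

# Shifting by hand: pair x at time t with y at time t + k
cor_shift <- cor(x[1:(n - k)], y[(k + 1):n])
```

With a data frame of dates, the shifted column would be built on the full series first (padding with NA), and the Dec-Feb subset taken afterwards, so the lag is not computed across the gaps between winters.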
dealing with "missing" times when setting data to xts
I have some data which looks like the following;
Dates Open Close
1000 06/06/2019 0:05 244.599 244.524
1001 06/06/2019 0:04 244.592 244.599
1002 06/06/2019 0:03 244.564 244.592
1003 06/06/2019 0:02 244.809 244.564
1004 06/06/2019 0:01 244.849 244.809
1005 06/06/2019 245.080 244.849
1006 05/06/2019 23:59 245.092 245.080
1007 05/06/2019 23:58 245.253 245.092
1008 05/06/2019 23:57 244.858 245.253
1009 05/06/2019 23:56 244.643 244.863
1010 05/06/2019 23:55 244.720 244.643
Where row 1005 doesn't have a time stamp. I try to set my dates to POSIXlt format.
data$Dates <- gsub("/", "-", data$Dates)
data$Dates <- as.POSIXlt(strptime(data$Dates, format="%d-%m-%Y %H:%M"))
Now my data looks like:
Dates Open Close
1000 2019-06-06 00:05:00 244.599 244.524
1001 2019-06-06 00:04:00 244.592 244.599
1002 2019-06-06 00:03:00 244.564 244.592
1003 2019-06-06 00:02:00 244.809 244.564
1004 2019-06-06 00:01:00 244.849 244.809
1005 <NA> 245.080 244.849
1006 2019-06-05 23:59:00 245.092 245.080
1007 2019-06-05 23:58:00 245.253 245.092
1008 2019-06-05 23:57:00 244.858 245.253
1009 2019-06-05 23:56:00 244.643 244.863
1010 2019-06-05 23:55:00 244.720 244.643
I am just wondering if there is a way around converting the times with no Hour or Minute data. It only occurs on the hour 0:00
Data:
data <- structure(list(Dates = c("06/06/2019 0:05", "06/06/2019 0:04",
"06/06/2019 0:03", "06/06/2019 0:02", "06/06/2019 0:01", "06/06/2019",
"05/06/2019 23:59", "05/06/2019 23:58", "05/06/2019 23:57",
"05/06/2019 23:56", "05/06/2019 23:55"),
Open = c(244.599, 244.592, 244.564, 244.809, 244.849, 245.08,
245.092, 245.253, 244.858, 244.643, 244.72),
Close = c(244.524, 244.599, 244.592, 244.564, 244.809, 244.849,
245.08, 245.092, 245.253, 244.863, 244.643)),
row.names = 1000:1010, class = "data.frame")
EDIT: I just thought perhaps I should first split the column into two (one for dates and another for times) fill in the blank cells in the second column with 0:00 and paste back together.
parse_date_time in the lubridate package will successively check alternative formats until it succeeds if you give it a vector of formats. The separators and percent signs can be omitted from the format strings.
library(lubridate)
parse_date_time(data$Dates, c("dmYHM", "dmY"), tz = "")
giving:
[1] "2019-06-06 00:05:00 EDT" "2019-06-06 00:04:00 EDT"
[3] "2019-06-06 00:03:00 EDT" "2019-06-06 00:02:00 EDT"
[5] "2019-06-06 00:01:00 EDT" "2019-06-06 00:00:00 EDT"
[7] "2019-06-05 23:59:00 EDT" "2019-06-05 23:58:00 EDT"
[9] "2019-06-05 23:57:00 EDT" "2019-06-05 23:56:00 EDT"
[11] "2019-06-05 23:55:00 EDT"
Using dplyr, one possibility could be:
library(dplyr)
data %>%
  mutate(Dates = ifelse(nchar(Dates) == 10, paste(Dates, "0:00", sep = " "), Dates),
         Dates = as.POSIXct(Dates, format = "%d/%m/%Y %H:%M"))
Dates Open Close
1 2019-06-06 00:05:00 244.599 244.524
2 2019-06-06 00:04:00 244.592 244.599
3 2019-06-06 00:03:00 244.564 244.592
4 2019-06-06 00:02:00 244.809 244.564
5 2019-06-06 00:01:00 244.849 244.809
6 2019-06-06 00:00:00 245.080 244.849
7 2019-06-05 23:59:00 245.092 245.080
8 2019-06-05 23:58:00 244.253 245.092
9 2019-06-05 23:57:00 244.858 245.253
10 2019-06-05 23:56:00 244.643 244.863
11 2019-06-05 23:55:00 244.720 244.643
Here, for rows whose Dates value is only 10 characters long (a date with no time), it appends 0:00 to the date before parsing. The same with base R:
data$Dates <- ifelse(nchar(data$Dates) == 10, paste(data$Dates, "0:00", sep = " "), data$Dates)
as.POSIXct(data$Dates, format = "%d/%m/%Y %H:%M")
endpoints are always in UTC/GMT
I think the endpoints are generated by converting the index to UTC. I want the endpoints at the change of every hour, 9:00/10:00 etc. (or at my own defined intervals, say at 9:30/10:30 etc.). In the example below, object 'a' is in UTC and endpoints are created at 4:55, 5:55 etc.; for object 'b' they are at 10:25, 11:25 etc. I am looking for a solution that gives endpoints in the specified way, irrespective of timezone. Is there any simple mechanism to do so?
> head(a)
Open High Low Close
2008-01-01 04:30:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 04:35:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 04:40:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 04:45:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 04:50:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(a)
TZ
"UTC"
> a[endpoints(a,on="hours")]
Open High Low Close
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 05:55:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 06:55:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 07:55:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 08:55:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 09:55:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 04:55:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 05:55:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 06:55:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 07:05:00 6107.40 6107.40 6093.70 6094.80
> head(b)
Open High Low Close
2008-01-01 10:00:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 10:05:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 10:10:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 10:15:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 10:20:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(b)
TZ
"Asia/Kolkata"
> b[endpoints(b,on="hours")]
Open High Low Close
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 11:25:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 12:25:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 13:25:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 14:25:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 15:25:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 10:25:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 11:25:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 12:25:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 12:35:00 6107.40 6107.40 6093.70 6094.80
Thanks & Regards, Siva Sunku
endpoints always calculates offsets in UTC. There's nothing you can do to change that as an end-user. But you could work-around it by comparing the 1-hour endpoints with the 30-minute endpoints.
x <- .xts(1:12, seq(0, by=600, length.out=12), tzone="Asia/Kolkata")
x[endpoints(x, "hours")]
# [,1]
# 1970-01-01 06:20:00 6
# 1970-01-01 07:20:00 12
hourEndpointsTZ30 <- function(x) {
  h <- endpoints(x, "hours", 1)
  m <- endpoints(x, "minutes", 30)
  c(0, setdiff(m, h), last(m))
}
x[hourEndpointsTZ30(x)]
# [,1]
# 1970-01-01 05:50:00 3
# 1970-01-01 06:50:00 9
# 1970-01-01 07:20:00 12
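Another way to sketch the same workaround is to compute the endpoints on a copy of the series whose index has been shifted by the timezone offset and labelled as UTC, so that UTC hour boundaries on the copy coincide with local hour boundaries on the original. This is only an illustration under an explicit assumption: offset_secs is hard-coded to the Asia/Kolkata offset of 5.5 hours, which is fixed all year; a timezone with daylight saving would need a per-observation offset instead. The names shifted, offset_secs and ep_local are made up for this sketch.

```r
library(xts)

# Same toy series as in the answer: 10-minute bars shown in Asia/Kolkata time
x <- .xts(1:12, seq(0, by = 600, length.out = 12), tzone = "Asia/Kolkata")

# Assumption: a constant UTC+05:30 offset with no daylight saving
offset_secs <- 5.5 * 3600

# Copy whose UTC clock reads the same as the local clock of x
shifted <- .xts(seq_len(nrow(x)), .index(x) + offset_secs, tzone = "UTC")

# Hour endpoints of the copy are row positions that also apply to x
ep_local <- endpoints(shifted, on = "hours")
x[ep_local]
```

Because the result is a vector of row positions, it can be passed to period.apply on the original object in the usual way.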
Troubles in applying the zoo aggregate function to a time series
We have the following code to compute monthly returns from a daily series of prices:
PricesRet = diff(Prices)/lag(Prices,k=-1)
tail(PricesRet)
# Monthly simple returns
MonRet = aggregate(PricesRet+1, as.yearmon, prod)-1
tail(MonRet)
The problem is that it returns wrong values. Take for example the simple return for the month of Feb 2013: the function returns -0.003517301 while it should have been -0.01304773. Why does that happen? Here are the last price observations:
> tail(Prices,30)
Prices
2013-01-22 165.5086
2013-01-23 165.2842
2013-01-24 168.4845
2013-01-25 170.6041
2013-01-28 169.7373
2013-01-29 169.8724
2013-01-30 170.6554
2013-01-31 170.7210
2013-02-01 173.8043
2013-02-04 172.2145
2013-02-05 172.8400
2013-02-06 172.8333
2013-02-07 171.3586
2013-02-08 170.5602
2013-02-11 171.2172
2013-02-12 171.4126
2013-02-13 171.8687
2013-02-14 170.7955
2013-02-15 171.2848
2013-02-19 170.9482
2013-02-20 171.6355
2013-02-21 170.0300
2013-02-22 169.9319
2013-02-25 170.9035
2013-02-26 168.6822
2013-02-27 168.5180
2013-02-28 168.4935
2013-03-01 169.6546
2013-03-04 169.3076
2013-03-05 169.0579
Here are the price returns:
> tail(PricesRet,50)
PricesRet
2012-12-18 0.0055865274
2012-12-19 -0.0015461900
2012-12-20 -0.0076140194
2012-12-23 0.0032656346
2012-12-26 0.0147750923
2012-12-27 0.0013482760
2012-12-30 -0.0004768131
2013-01-01 0.0128908541
2013-01-02 -0.0047646818
2013-01-03 0.0103372029
2013-01-06 -0.0024547278
2013-01-07 -0.0076920352
2013-01-08 0.0064368720
2013-01-09 0.0119663301
2013-01-10 0.0153828814
2013-01-13 0.0050590540
2013-01-14 -0.0053324785
2013-01-15 -0.0027043105
2013-01-16 0.0118840383
2013-01-17 -0.0005876459
2013-01-21 -0.0145541598
2013-01-22 -0.0013555548
2013-01-23 0.0193624621
2013-01-24 0.0125802978
2013-01-27 -0.0050807744
2013-01-28 0.0007959058
2013-01-29 0.0046096266
2013-01-30 0.0003844082
2013-01-31 0.0180603867
2013-02-03 -0.0091473127
2013-02-04 0.0036322298
2013-02-05 -0.0000390941
2013-02-06 -0.0085320734
2013-02-07 -0.0046591956
2013-02-10 0.0038517581
2013-02-11 0.0011412046
2013-02-12 0.0026607502
2013-02-13 -0.0062440496
2013-02-14 0.0028645616
2013-02-18 -0.0019651341
2013-02-19 0.0040206637
2013-02-20 -0.0093543648
2013-02-21 -0.0005764665
2013-02-24 0.0057176118
2013-02-25 -0.0129979321
2013-02-26 -0.0009730782
2013-02-27 -0.0001453191
2013-02-28 0.0068911863
2013-03-03 -0.0020455332
2013-03-04 -0.0014747845
The result of the function is instead:
> tail(data.frame(MonRet))
MonRet
ott 2012 -0.000848156
nov 2012 0.009833881
dic 2012 0.033406884
gen 2013 0.087822700
feb 2013 -0.023875638
mar 2013 -0.003517301
Your returns are wrong. The return for 2013-01-23 should be:
> 165.2842/165.5086-1
[1] -0.001355821
but you have 0.0193624621. I suspect this is because Prices is an xts object, not a zoo object. lag.xts deliberately breaks with the convention of lag.ts and lag.zoo, where k=1 means a "lag" of (t+1), in favor of the more common convention where k=1 means a "lag" of (t-1). If so, lag(Prices, k=-1) gives you the next observation rather than the previous one.
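The difference between the two conventions can be seen on a tiny series. This is a minimal sketch with made-up prices (100, 110, 121, so every daily return is 10%); the object names p_xts, p_zoo, ret_xts and ret_zoo are hypothetical and not from the question.

```r
library(xts)  # attaches zoo as well

# Made-up prices growing by 10% per step
p_xts <- xts(c(100, 110, 121), order.by = as.Date("2013-01-01") + 0:2)
p_zoo <- as.zoo(p_xts)

# xts convention: k = 1 aligns each date with the previous observation
prev_xts <- lag(p_xts, k = 1)

# zoo (and ts) convention: k = -1 is needed to get the previous observation
prev_zoo <- lag(p_zoo, k = -1)

# Simple returns computed with the sign of k that matches each class
ret_xts <- diff(p_xts) / lag(p_xts, k = 1)    # first row is NA
ret_zoo <- diff(p_zoo) / lag(p_zoo, k = -1)   # first date is dropped
```

Checking which class the price object has (for example with class(Prices)) before choosing the sign of k avoids dividing by the wrong observation.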
Date time am/pm in R
I'm having an issue with the datetime field of a time series:
> CO1temp[163:169,]
Date OPEN HIGH LOW CLOSE
163 7/11/2011 11:45:00 PM 116.30 116.30 116.09 116.18
164 7/11/2011 11:50:00 PM 116.16 116.78 116.13 116.70
165 7/11/2011 11:55:00 PM 116.69 116.83 116.51 116.65
166 7/12/2011 116.65 116.79 116.44 116.50
167 7/12/2011 12:05:00 AM 116.50 116.60 116.39 116.47
168 7/12/2011 12:10:00 AM 116.49 116.55 116.38 116.52
169 7/12/2011 12:15:00 AM 116.52 116.67 116.39 116.44
As you can see, the midnight time (row 166) is not showing properly, which creates an NA when I create my xts object:
CO1 <- as.xts(CO1temp[, 2:5], order.by = as.POSIXct(CO1temp[,1],format='%m/%d/%Y %r'),frequency="5 minutes")
> CO1[163:169,]
OPEN HIGH LOW CLOSE
2011-07-11 23:45:00 116.30 116.30 116.09 116.18
2011-07-11 23:50:00 116.16 116.78 116.13 116.70
2011-07-11 23:55:00 116.69 116.83 116.51 116.65
<NA> 116.65 116.79 116.44 116.50
2011-07-12 00:05:00 116.50 116.60 116.39 116.47
2011-07-12 00:10:00 116.49 116.55 116.38 116.52
2011-07-12 00:15:00 116.52 116.67 116.39 116.44
This later leads to more problems when I want to analyze this time series. ?strptime is quite specific about it: The default for the format methods is "%Y-%m-%d %H:%M:%S" if any component has a time component which is not midnight, and "%Y-%m-%d" otherwise. However, my datetime is not in the standard format. I would greatly appreciate any help.
This is kind of a hack, but it works. You just have to append "12:00:00 AM" to your vector of dates: the entries that lack the time information will then be read correctly, and for the entries that already have a time the appended text is simply ignored and only the time that was already there is read.
CO1 <- as.xts(CO1temp[, 2:5], order.by = as.POSIXct(paste(CO1temp$Date,"12:00:00 AM", sep=" "), format='%m/%d/%Y %r'), frequency="5 minutes")
CO1
OPEN HIGH LOW CLOSE
2011-07-11 23:45:00 116.30 116.30 116.09 116.18
2011-07-11 23:50:00 116.16 116.78 116.13 116.70
2011-07-11 23:55:00 116.69 116.83 116.51 116.65
2011-07-12 00:00:00 116.65 116.79 116.44 116.50
2011-07-12 00:05:00 116.50 116.60 116.39 116.47
2011-07-12 00:10:00 116.49 116.55 116.38 116.52
2011-07-12 00:15:00 116.52 116.67 116.39 116.44
That being said, if you ended up with your data frame as it is after using strptime, then your date column is already in POSIXct format and the following should work directly:
as.xts(CO1temp[, 2:5], order.by = CO1temp$Date, frequency = "5 minutes")
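A slightly more explicit variant of the same idea patches only the entries that have no time part, instead of relying on the trailing text being ignored. The sketch below uses base R only; the vector dates is a made-up three-row excerpt, the timezone is assumed to be UTC, and the format spells out %I:%M:%S %p rather than %r. Parsing AM/PM depends on the locale, so this assumes an English or C locale.

```r
# Made-up excerpt: the middle entry has a date but no time
dates <- c("7/11/2011 11:55:00 PM", "7/12/2011", "7/12/2011 12:05:00 AM")

# An entry with no colon is assumed to carry no time part
has_time <- grepl(":", dates, fixed = TRUE)
dates[!has_time] <- paste(dates[!has_time], "12:00:00 AM")

# 12-hour clock with AM/PM marker (locale-dependent; assumes English/C locale)
parsed <- as.POSIXct(dates, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
```

The resulting vector can be passed as order.by when building the xts object, in the same way as in the answer above.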