How to fix the error "wrong embedding dimension" in the cajolst R function?

When I try to use the cajolst function from the urca package, I get a strange error.
Could you please tell me how to resolve it?
result <- urca::cajolst(data, trend = FALSE, K = 2, season = NULL)
Error in embed(diff(x), K) : wrong embedding dimension.
dates A G
2016-11-30 0 0
2016-12-01 -3.53 3.198
2016-12-02 -2.832 8.703
2016-12-04 -2.666 7.799
2016-12-05 -0.54 7.701
2016-12-06 -1.296 4.685
2016-12-07 -1.785 -4.587
2016-12-08 -6.834 -3.696
2016-12-09 -9.624 -5.461
2016-12-11 -11.374 -0.423
2016-12-12 -6.037 -1.614
2016-12-13 -5.934 -3.231
2016-12-14 -7.279 1.072
2016-12-15 -7.859 -4.823
2016-12-16 -15.132 10.838
2016-12-19 -15.345 11.5
2016-12-20 -15.673 6.639
2016-12-21 -15.391 11.162
2016-12-22 -14.357 7.032
2016-12-23 -14.99 12.355
2016-12-26 -15.626 10.944
2016-12-27 -12.297 10.215
2016-12-28 -13.967 5.957
2016-12-29 -12.946 3.446
2016-12-30 -19.681 10.274
2017-01-02 -18.24 8.781
2017-01-03 -16.83 1.116
2017-01-04 -18.189 -0.036
2017-01-05 -15.897 -1.441
2017-01-06 -20.196 -8.534
2017-01-09 -14.57 -28.768
2017-01-10 -13.27 -29.821
2017-01-11 -8.85 -38.881
2017-01-12 -6.375 -50.885
2017-01-13 -8.056 -51.321
2017-01-16 -5.217 -63.619
2017-01-17 -4.75 -39.163
2017-01-18 3.505 -46.309
2017-01-19 10.939 -45.825
2017-01-20 9.248 -42.973
2017-01-23 9.532 -33.396
2017-01-24 4.235 -31.38
2017-01-25 -1.885 -19.21
2017-01-26 -5.027 -15.74
2017-01-27 0.015 -23.029
2017-01-30 -0.685 -30.773
2017-01-31 -2.692 -25.544
2017-02-01 -2.654 -17.912
2017-02-02 4.002 -43.309
2017-02-03 4.813 -52.627
2017-02-06 7.049 -49.965
2017-02-07 10.003 -40.568
2017-02-08 8.996 -39.828
2017-02-09 7.047 -41.19
2017-02-10 7.656 -50.853
2017-02-13 4.986 -41.318
2017-02-14 8.493 -51.946
2017-02-15 12.547 -59.538
2017-02-16 10.327 -54.496
2017-02-17 7.09 -57.571
2017-02-20 11.633 -54.91
2017-02-21 12.664 -51.597
2017-02-22 16.103 -57.819
2017-02-23 14.25 -51.336
2017-02-24 7.794 -54.898
2017-02-27 15.27 -55.754
2017-02-28 19.984 -58.37
2017-03-01 23.899 -70.73
2017-03-02 16.63 -56.29
2017-03-03 16.443 -55.858
2017-03-06 17.901 -59.377
2017-03-07 19.067 -64.383
2017-03-08 17.219 -57.829
2017-03-09 15.694 -55.022
2017-03-10 17.351 -60.431
2017-03-13 18.945 -59.79
2017-03-14 20.001 -64.848
2017-03-15 23.852 -73.806
2017-03-16 22.697 -64.191
2017-03-17 26.892 -65.328
2017-03-20 29.221 -72.764
2017-03-21 25.165 -53.427
2017-03-22 22.998 -51.676
2017-03-23 20.072 -40.57
2017-03-24 20.758 -43.654
2017-03-27 20.062 -33.672
2017-03-28 22.066 -47.184
2017-03-29 22.363 -54.57
2017-03-30 20.684 -48.199
2017-03-31 17.056 -40.887
2017-04-03 19.12 -39.618
2017-04-04 16.359 -37.1
2017-04-05 18.643 -32.734
2017-04-06 14.708 -30.455
2017-04-07 8.403 -33.553
2017-04-10 6.072 -29.048
2017-04-11 5.186 -20.696
2017-04-12 4.248 -20.924
2017-04-13 12.803 -31.075
2017-04-14 12.566 -29.768
2017-04-17 14.065 -28.906
2017-04-18 14.5 4.121
2017-04-19 13.865 8.835
2017-04-20 16.126 6.191
2017-04-21 17.591 3.77
2017-04-24 22.3 -2.497
2017-04-25 22.731 7.408
2017-04-26 19.146 18.45
2017-04-27 19.052 25.541
2017-04-28 21.889 26.878
2017-05-01 27.323 14.362
2017-05-02 29.93 17.525
2017-05-03 19.835 29.856
2017-05-04 19.683 36.72
2017-05-05 13.545 41.055
2017-05-08 14.165 43.544
2017-05-09 11.325 49.978
2017-05-10 10.143 47.072
2017-05-11 13.718 38.901
2017-05-12 14.216 36.017
2017-05-15 13.701 33.797
2017-05-16 13.505 33.867
2017-05-17 13.456 38.004
2017-05-18 12.613 37.758
2017-05-19 11.166 40.367
2017-05-22 12.221 34.022
2017-05-23 13.682 29.793
2017-05-24 10.05 26.701
2017-05-25 10.122 31.394
2017-05-26 7.592 20.073
2017-05-29 6.796 23.809
2017-05-30 9.638 16.1
2017-05-31 7.983 29.043
2017-06-01 3.594 39.557
2017-06-02 8.763 27.863
2017-06-05 12.157 22.397
2017-06-06 13.383 19.053
2017-06-07 20.52 17.449
2017-06-08 19.534 -1.615
2017-06-09 16.011 -1.989
2017-06-12 9.153 -9.294
2017-06-13 4.295 -0.897
2017-06-14 9.743 -9.818
2017-06-15 10.386 -8.255
2017-06-16 11.983 -12.522
2017-06-19 9.513 -12.931
2017-06-20 10.298 -21.024
2017-06-21 11.087 -11.801
2017-06-22 4.472 -9.048
2017-06-23 9.416 -9.592
2017-06-26 9.686 -12.006
2017-06-27 6.424 -2.632
2017-06-28 3.062 -1.016
2017-06-29 5.593 -0.825
2017-06-30 3.531 0.914
2017-07-03 3.208 -2.596
2017-07-04 -6.373 4.289
2017-07-05 -5.149 5.917
2017-07-06 -6.104 12.75
2017-07-07 -9.565 1.615
2017-07-10 -8.961 -0.053
2017-07-11 -4.065 -8.541
2017-07-12 -10.133 -11.286
2017-07-13 -6.223 -15.181
2017-07-14 -1.524 -14.396
2017-07-17 -1.613 -14.61
2017-07-18 5.781 -35.473
2017-07-19 8.243 -44.186
2017-07-20 7.665 -49.857
2017-07-21 0.485 -41.286
2017-07-24 -0.638 -39.127
2017-07-25 0.767 -40.952
2017-07-26 3.566 -44.388
2017-07-27 6.834 -42.543
2017-07-28 1.306 -37.657
2017-07-31 5.839 -34.048
2017-08-01 5.838 -28.939
2017-08-02 7.298 -26.566
2017-08-03 6.804 -32.876
2017-08-04 8.989 -38.618
2017-08-07 8.862 -36.676
2017-08-08 8.234 -40.893
2017-08-09 7.39 -35.16
2017-08-10 8.593 -35.555
2017-08-11 7.253 -35.175
2017-08-14 5.593 -33.644
2017-08-15 4.528 -37.82
2017-08-16 6.752 -53.217
2017-08-17 6.284 -49.252
2017-08-18 4.765 -55.602
2017-08-21 3.905 -54.32
2017-08-22 1.76 -57.853
2017-08-23 0.406 -58.925
2017-08-24 -2.438 -58.098
2017-08-25 -0.791 -56.682
2017-08-28 2.173 -51.278
2017-08-29 2.523 -54.353
2017-08-30 4.482 -46.325
2017-08-31 0.246 -52.567
2017-09-01 -4.214 -53.636
2017-09-04 -4.548 -52.735
2017-09-05 -1.781 -50.421
2017-09-06 -10.463 -51.122
2017-09-07 -13.119 -52.433
2017-09-08 -11.716 -43.493
2017-09-11 -16.15 -43.142
2017-09-12 -12.478 -29.335
2017-09-13 -16.457 -31.697
2017-09-14 -14.615 -15.13
2017-09-15 -13.911 3.023

One issue is that the 'dates' column is passed to the function along with the numeric series. Secondly, if no seasonal effect is needed, season can be set to FALSE; otherwise specify an integer value.
library(urca)
out <- cajolst(data[-1], trend = FALSE, K = 2, season = FALSE)
If there is a seasonal effect and the data are quarterly, the value would be 4:
out1 <- cajolst(data[-1], trend = FALSE, K = 2, season = 4)
out1
#####################################################
# Johansen-Procedure Unit Root / Cointegration Test #
#####################################################
#The value of the test statistic is: 3.6212 13.2233
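Since the test is meant to run on numeric series, it can help to check the input before calling cajolst. The following is a minimal base-R sketch, not part of the original answer: the small data frame df reuses the first three rows of the question's data, and num_only is a name chosen here for illustration.

```r
# First three rows of the question's data, rebuilt by hand for illustration.
df <- data.frame(dates = c("2016-11-30", "2016-12-01", "2016-12-02"),
                 A = c(0, -3.53, -2.832),
                 G = c(0, 3.198, 8.703))

# With the date column included, the matrix form of the data is not numeric.
is.numeric(as.matrix(df))          # FALSE

# Keep only the numeric columns; here this selects the same columns as df[-1].
num_only <- df[sapply(df, is.numeric)]
is.numeric(as.matrix(num_only))    # TRUE
```

Selecting by type rather than by position keeps working if the date column is not the first one.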
data:
data <- structure(list(dates = c("2016-11-30", "2016-12-01", "2016-12-02",
"2016-12-04", "2016-12-05", "2016-12-06", "2016-12-07", "2016-12-08",
"2016-12-09", "2016-12-11", "2016-12-12", "2016-12-13", "2016-12-14",
"2016-12-15", "2016-12-16", "2016-12-19", "2016-12-20", "2016-12-21",
"2016-12-22", "2016-12-23", "2016-12-26", "2016-12-27", "2016-12-28",
"2016-12-29", "2016-12-30", "2017-01-02", "2017-01-03", "2017-01-04",
"2017-01-05", "2017-01-06", "2017-01-09", "2017-01-10", "2017-01-11",
"2017-01-12", "2017-01-13", "2017-01-16", "2017-01-17", "2017-01-18",
"2017-01-19", "2017-01-20", "2017-01-23", "2017-01-24", "2017-01-25",
"2017-01-26", "2017-01-27", "2017-01-30", "2017-01-31", "2017-02-01",
"2017-02-02", "2017-02-03", "2017-02-06", "2017-02-07", "2017-02-08",
"2017-02-09", "2017-02-10", "2017-02-13", "2017-02-14", "2017-02-15",
"2017-02-16", "2017-02-17", "2017-02-20", "2017-02-21", "2017-02-22",
"2017-02-23", "2017-02-24", "2017-02-27", "2017-02-28", "2017-03-01",
"2017-03-02", "2017-03-03", "2017-03-06", "2017-03-07", "2017-03-08",
"2017-03-09", "2017-03-10", "2017-03-13", "2017-03-14", "2017-03-15",
"2017-03-16", "2017-03-17", "2017-03-20", "2017-03-21", "2017-03-22",
"2017-03-23", "2017-03-24", "2017-03-27", "2017-03-28", "2017-03-29",
"2017-03-30", "2017-03-31", "2017-04-03", "2017-04-04", "2017-04-05",
"2017-04-06", "2017-04-07", "2017-04-10", "2017-04-11", "2017-04-12",
"2017-04-13", "2017-04-14", "2017-04-17", "2017-04-18", "2017-04-19",
"2017-04-20", "2017-04-21", "2017-04-24", "2017-04-25", "2017-04-26",
"2017-04-27", "2017-04-28", "2017-05-01", "2017-05-02", "2017-05-03",
"2017-05-04", "2017-05-05", "2017-05-08", "2017-05-09", "2017-05-10",
"2017-05-11", "2017-05-12", "2017-05-15", "2017-05-16", "2017-05-17",
"2017-05-18", "2017-05-19", "2017-05-22", "2017-05-23", "2017-05-24",
"2017-05-25", "2017-05-26", "2017-05-29", "2017-05-30", "2017-05-31",
"2017-06-01", "2017-06-02", "2017-06-05", "2017-06-06", "2017-06-07",
"2017-06-08", "2017-06-09", "2017-06-12", "2017-06-13", "2017-06-14",
"2017-06-15", "2017-06-16", "2017-06-19", "2017-06-20", "2017-06-21",
"2017-06-22", "2017-06-23", "2017-06-26", "2017-06-27", "2017-06-28",
"2017-06-29", "2017-06-30", "2017-07-03", "2017-07-04", "2017-07-05",
"2017-07-06", "2017-07-07", "2017-07-10", "2017-07-11", "2017-07-12",
"2017-07-13", "2017-07-14", "2017-07-17", "2017-07-18", "2017-07-19",
"2017-07-20", "2017-07-21", "2017-07-24", "2017-07-25", "2017-07-26",
"2017-07-27", "2017-07-28", "2017-07-31", "2017-08-01", "2017-08-02",
"2017-08-03", "2017-08-04", "2017-08-07", "2017-08-08", "2017-08-09",
"2017-08-10", "2017-08-11", "2017-08-14", "2017-08-15", "2017-08-16",
"2017-08-17", "2017-08-18", "2017-08-21", "2017-08-22", "2017-08-23",
"2017-08-24", "2017-08-25", "2017-08-28", "2017-08-29", "2017-08-30",
"2017-08-31", "2017-09-01", "2017-09-04", "2017-09-05", "2017-09-06",
"2017-09-07", "2017-09-08", "2017-09-11", "2017-09-12", "2017-09-13",
"2017-09-14", "2017-09-15"), A = c(0, -3.53, -2.832, -2.666,
-0.54, -1.296, -1.785, -6.834, -9.624, -11.374, -6.037, -5.934,
-7.279, -7.859, -15.132, -15.345, -15.673, -15.391, -14.357,
-14.99, -15.626, -12.297, -13.967, -12.946, -19.681, -18.24,
-16.83, -18.189, -15.897, -20.196, -14.57, -13.27, -8.85, -6.375,
-8.056, -5.217, -4.75, 3.505, 10.939, 9.248, 9.532, 4.235, -1.885,
-5.027, 0.015, -0.685, -2.692, -2.654, 4.002, 4.813, 7.049, 10.003,
8.996, 7.047, 7.656, 4.986, 8.493, 12.547, 10.327, 7.09, 11.633,
12.664, 16.103, 14.25, 7.794, 15.27, 19.984, 23.899, 16.63, 16.443,
17.901, 19.067, 17.219, 15.694, 17.351, 18.945, 20.001, 23.852,
22.697, 26.892, 29.221, 25.165, 22.998, 20.072, 20.758, 20.062,
22.066, 22.363, 20.684, 17.056, 19.12, 16.359, 18.643, 14.708,
8.403, 6.072, 5.186, 4.248, 12.803, 12.566, 14.065, 14.5, 13.865,
16.126, 17.591, 22.3, 22.731, 19.146, 19.052, 21.889, 27.323,
29.93, 19.835, 19.683, 13.545, 14.165, 11.325, 10.143, 13.718,
14.216, 13.701, 13.505, 13.456, 12.613, 11.166, 12.221, 13.682,
10.05, 10.122, 7.592, 6.796, 9.638, 7.983, 3.594, 8.763, 12.157,
13.383, 20.52, 19.534, 16.011, 9.153, 4.295, 9.743, 10.386, 11.983,
9.513, 10.298, 11.087, 4.472, 9.416, 9.686, 6.424, 3.062, 5.593,
3.531, 3.208, -6.373, -5.149, -6.104, -9.565, -8.961, -4.065,
-10.133, -6.223, -1.524, -1.613, 5.781, 8.243, 7.665, 0.485,
-0.638, 0.767, 3.566, 6.834, 1.306, 5.839, 5.838, 7.298, 6.804,
8.989, 8.862, 8.234, 7.39, 8.593, 7.253, 5.593, 4.528, 6.752,
6.284, 4.765, 3.905, 1.76, 0.406, -2.438, -0.791, 2.173, 2.523,
4.482, 0.246, -4.214, -4.548, -1.781, -10.463, -13.119, -11.716,
-16.15, -12.478, -16.457, -14.615, -13.911), G = c(0, 3.198,
8.703, 7.799, 7.701, 4.685, -4.587, -3.696, -5.461, -0.423, -1.614,
-3.231, 1.072, -4.823, 10.838, 11.5, 6.639, 11.162, 7.032, 12.355,
10.944, 10.215, 5.957, 3.446, 10.274, 8.781, 1.116, -0.036, -1.441,
-8.534, -28.768, -29.821, -38.881, -50.885, -51.321, -63.619,
-39.163, -46.309, -45.825, -42.973, -33.396, -31.38, -19.21,
-15.74, -23.029, -30.773, -25.544, -17.912, -43.309, -52.627,
-49.965, -40.568, -39.828, -41.19, -50.853, -41.318, -51.946,
-59.538, -54.496, -57.571, -54.91, -51.597, -57.819, -51.336,
-54.898, -55.754, -58.37, -70.73, -56.29, -55.858, -59.377, -64.383,
-57.829, -55.022, -60.431, -59.79, -64.848, -73.806, -64.191,
-65.328, -72.764, -53.427, -51.676, -40.57, -43.654, -33.672,
-47.184, -54.57, -48.199, -40.887, -39.618, -37.1, -32.734, -30.455,
-33.553, -29.048, -20.696, -20.924, -31.075, -29.768, -28.906,
4.121, 8.835, 6.191, 3.77, -2.497, 7.408, 18.45, 25.541, 26.878,
14.362, 17.525, 29.856, 36.72, 41.055, 43.544, 49.978, 47.072,
38.901, 36.017, 33.797, 33.867, 38.004, 37.758, 40.367, 34.022,
29.793, 26.701, 31.394, 20.073, 23.809, 16.1, 29.043, 39.557,
27.863, 22.397, 19.053, 17.449, -1.615, -1.989, -9.294, -0.897,
-9.818, -8.255, -12.522, -12.931, -21.024, -11.801, -9.048, -9.592,
-12.006, -2.632, -1.016, -0.825, 0.914, -2.596, 4.289, 5.917,
12.75, 1.615, -0.053, -8.541, -11.286, -15.181, -14.396, -14.61,
-35.473, -44.186, -49.857, -41.286, -39.127, -40.952, -44.388,
-42.543, -37.657, -34.048, -28.939, -26.566, -32.876, -38.618,
-36.676, -40.893, -35.16, -35.555, -35.175, -33.644, -37.82,
-53.217, -49.252, -55.602, -54.32, -57.853, -58.925, -58.098,
-56.682, -51.278, -54.353, -46.325, -52.567, -53.636, -52.735,
-50.421, -51.122, -52.433, -43.493, -43.142, -29.335, -31.697,
-15.13, 3.023)), class = "data.frame", row.names = c(NA, -210L
))

Related

How to filter rows that occur sequentially that fit multiple conditions using dplyr

I am trying to filter my data on certain conditions using dplyr. The conditions apply to rows that occur sequentially. One condition applies to a first row (not necessarily the first row of the data frame), and I then want to check whether the following row (the second row) meets another set of conditions. If the conditions are met for the 1st and 2nd rows, I want to see the 1st, 2nd, 3rd, and 4th rows.
Here are the conditions that I want to filter for:
The Close of Row1 is greater than or equal to the Open of Row1
The Close of Row2 is greater than or equal to the Open of Row2
The Close of Row2 is less than or equal to the High of Row1
Here is an example of my data.
structure(list(Date = c("01/14/2022", "01/14/2022", "01/14/2022",
"01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022",
"01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022",
"01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022",
"01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022",
"01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022", "01/14/2022",
"01/14/2022", "01/14/2022", "01/14/2022", "01/18/2022", "01/18/2022",
"01/18/2022", "01/18/2022", "01/18/2022", "01/18/2022"), Time = c("08:05",
"08:10", "08:15", "08:20", "08:25", "08:30", "08:35", "08:40",
"08:45", "08:50", "08:55", "09:00", "09:05", "09:10", "09:15",
"09:20", "09:25", "09:30", "09:35", "09:40", "09:45", "09:50",
"09:55", "10:00", "10:05", "10:10", "10:15", "10:20", "10:25",
"10:30", "10:35", "09:00", "09:05", "09:10", "09:15", "09:20",
"09:25"), Open = c(4618.75, 4621.25, 4621, 4617, 4622, 4624.75,
4623.75, 4620.75, 4617.25, 4613.75, 4612, 4610, 4613.25, 4614,
4617.75, 4619, 4619.75, 4619.5, 4619.25, 4618.25, 4634.75, 4635.75,
4635.25, 4643.25, 4650.75, 4640.75, 4646, 4641.25, 4654.5, 4639.5,
4638, 4610.5, 4611.5, 4612, 4611.75, 4610, 4605.75), High = c(4621.75,
4623.75, 4623.25, 4625.5, 4625, 4625, 4625, 4621.75, 4620.25,
4617.75, 4612.5, 4614, 4614.5, 4619.75, 4621.25, 4623, 4621.5,
4622.5, 4624.25, 4638.75, 4640.5, 4645.75, 4644.25, 4652.5, 4653.5,
4649.5, 4651.75, 4655.75, 4655, 4642.75, 4640, 4612.25, 4612.75,
4612.5, 4612.5, 4610.5, 4608.75), Low = c(4612.75, 4617.5, 4616,
4617, 4620.5, 4620.5, 4616.75, 4616, 4611.75, 4610.25, 4606.75,
4607, 4609.5, 4614, 4616.5, 4616, 4616.25, 4617.75, 4614.5, 4615.25,
4629.25, 4633.5, 4633.25, 4642.5, 4635.75, 4640.5, 4639.5, 4641,
4638.75, 4633.75, 4631.5, 4609.5, 4609.75, 4609.25, 4608, 4604.5,
4604.75), Close = c(4621.25, 4620.75, 4616.75, 4622, 4624.5,
4623.75, 4620.75, 4617, 4613.5, 4612, 4609.75, 4613.25, 4614,
4617.5, 4619, 4619.75, 4619.5, 4619.25, 4618, 4635, 4635.5, 4635.25,
4643.5, 4651, 4641, 4646.25, 4641.5, 4654.75, 4639.5, 4637.75,
4639.5, 4611.5, 4612, 4611.5, 4609.5, 4605.75, 4608.75), Up = c(6712L,
3316L, 2396L, 3218L, 2246L, 2817L, 5079L, 3495L, 4783L, 4404L,
5390L, 5139L, 2908L, 3943L, 4140L, 4026L, 3068L, 6227L, 26196L,
31057L, 17725L, 20980L, 16256L, 16262L, 18580L, 12499L, 11163L,
13486L, 10349L, 11161L, 12024L, 1619L, 2010L, 1503L, 1772L, 2987L,
1731L), Down = c(6157L, 3075L, 2774L, 3197L, 2199L, 2564L, 5702L,
3750L, 4015L, 3527L, 5204L, 3302L, 3206L, 3767L, 3059L, 3899L,
2770L, 6792L, 24774L, 28216L, 18406L, 20660L, 15670L, 15362L,
20526L, 11039L, 11507L, 12231L, 11981L, 11810L, 12161L, 1552L,
1985L, 1763L, 2402L, 3947L, 1362L)), row.names = c(NA, -37L), class = "data.frame")
Here is the expected output.
Date Time Open High Low Close Up Down
12 01/14/2022 09:00 4610.00 4614.00 4607.00 4613.25 5139 3302
13 01/14/2022 09:05 4613.25 4614.50 4609.50 4614.00 2908 3206
14 01/14/2022 09:10 4614.00 4619.75 4614.00 4617.50 3943 3767
15 01/14/2022 09:15 4617.75 4621.25 4616.50 4619.00 4140 3059
32 01/18/2022 09:00 4610.50 4612.25 4609.50 4611.50 1619 1552
33 01/18/2022 09:05 4611.50 4612.75 4609.75 4612.00 2010 1985
34 01/18/2022 09:10 4612.00 4612.50 4609.25 4611.50 1503 1763
35 01/18/2022 09:15 4611.75 4612.50 4608.00 4609.50 1772 2402
If there are overlaps, this will print the overlapping rows only once (i.e. all unique rows in the set of all possibly-overlapping groups)
library(dplyr, warn.conflicts = FALSE)
df %>%
  mutate(cond = (Close >= Open) & lead(Close >= Open) & lead(Close) <= High) %>%
  slice(unique(c(outer(0:3, which(cond), '+'))))
#> Date Time Open High Low Close Up Down cond
#> 1 01/14/2022 08:20 4617.00 4625.50 4617.00 4622.00 3218 3197 TRUE
#> 2 01/14/2022 08:25 4622.00 4625.00 4620.50 4624.50 2246 2199 FALSE
#> 3 01/14/2022 08:30 4624.75 4625.00 4620.50 4623.75 2817 2564 FALSE
#> 4 01/14/2022 08:35 4623.75 4625.00 4616.75 4620.75 5079 5702 FALSE
#> 5 01/14/2022 09:00 4610.00 4614.00 4607.00 4613.25 5139 3302 TRUE
#> 6 01/14/2022 09:05 4613.25 4614.50 4609.50 4614.00 2908 3206 FALSE
#> 7 01/14/2022 09:10 4614.00 4619.75 4614.00 4617.50 3943 3767 TRUE
#> 8 01/14/2022 09:15 4617.75 4621.25 4616.50 4619.00 4140 3059 TRUE
#> 9 01/14/2022 09:20 4619.00 4623.00 4616.00 4619.75 4026 3899 FALSE
#> 10 01/14/2022 09:25 4619.75 4621.50 4616.25 4619.50 3068 2770 FALSE
#> 11 01/14/2022 09:30 4619.50 4622.50 4617.75 4619.25 6227 6792 FALSE
#> 12 01/14/2022 09:40 4618.25 4638.75 4615.25 4635.00 31057 28216 TRUE
#> 13 01/14/2022 09:45 4634.75 4640.50 4629.25 4635.50 17725 18406 FALSE
#> 14 01/14/2022 09:50 4635.75 4645.75 4633.50 4635.25 20980 20660 FALSE
#> 15 01/14/2022 09:55 4635.25 4644.25 4633.25 4643.50 16256 15670 FALSE
#> 16 01/14/2022 10:35 4638.00 4640.00 4631.50 4639.50 12024 12161 TRUE
#> 17 01/18/2022 09:00 4610.50 4612.25 4609.50 4611.50 1619 1552 TRUE
#> 18 01/18/2022 09:05 4611.50 4612.75 4609.75 4612.00 2010 1985 FALSE
#> 19 01/18/2022 09:10 4612.00 4612.50 4609.25 4611.50 1503 1763 FALSE
#> 20 01/18/2022 09:15 4611.75 4612.50 4608.00 4609.50 1772 2402 FALSE
Created on 2022-01-20 by the reprex package (v2.0.1)
This also works:
df %>%
  mutate(cond = (Close >= Open) & lead(Close >= Open) & lead(Close) <= High) %>%
  filter(purrr::reduce(1:3, ~ .x | lag(.x), .init = cond))
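The index arithmetic behind the outer() step in the first answer can be seen in isolation on a small example. This is a base-R sketch with a made-up logical vector cond; the names starts, windows and rows are chosen here for illustration, and the bounds check is an extra safeguard that the answer's code does not include.

```r
# Hypothetical flags: TRUE marks a row that starts a qualifying pair.
cond <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)

starts <- which(cond)                 # 2 5 6

# Each column holds one start row plus the three rows that follow it.
windows <- outer(0:3, starts, "+")

# Flatten and drop duplicates so overlapping windows list each row once.
rows <- unique(c(windows))

# Drop any index that would run past the end of the data.
rows <- rows[rows <= length(cond)]
rows                                  # 2 3 4 5 6 7 8 9
```

The windows starting at rows 5 and 6 overlap, which is why unique() is needed before the indices are handed to slice().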

Min and max over a time range on each day of an xts object

I have an xts object with intraday OHLC price data spanning several years. I'd like to write a function that calculates the min and max values between 04:00:00 and 05:00:00 each day and adds them as columns to the xts object. I'm not very familiar with manipulating xts objects. Can anyone point me in the right direction? Here is the head of the xts object.
Open High Low Close Volume
2017-01-01 00:00:00 968.29 968.76 966.74 966.97 106562
2017-01-01 00:05:00 966.97 967.00 966.89 966.89 13731
2017-01-01 00:10:00 966.89 966.89 964.86 964.86 124137
2017-01-01 00:15:00 964.86 964.99 964.80 964.80 3001
2017-01-01 00:20:00 964.80 964.80 964.80 964.80 0
2017-01-01 00:25:00 964.80 965.09 964.54 964.91 48000
2017-01-01 00:30:00 964.91 965.01 964.91 965.01 2501
2017-01-01 00:35:00 965.01 967.82 965.57 967.82 71501
2017-01-01 00:40:00 967.82 967.82 967.08 967.08 50
2017-01-01 00:45:00 967.08 967.40 967.40 967.40 50
2017-01-01 00:50:00 967.40 968.08 967.40 968.08 14000
2017-01-01 00:55:00 968.08 968.08 966.89 968.00 1008
2017-01-01 01:00:00 968.00 968.10 968.00 968.10 1002
2017-01-01 01:05:00 968.10 968.10 967.62 967.62 5200
2017-01-01 01:10:00 967.62 967.70 966.29 966.29 35476
2017-01-01 01:15:00 966.29 966.29 966.28 966.28 3068
2017-01-01 01:20:00 966.28 966.66 965.00 965.00 30471
2017-01-01 01:25:00 965.00 965.01 964.00 964.00 77884
2017-01-01 01:30:00 964.00 964.76 964.76 964.76 500
2017-01-01 01:35:00 964.76 967.48 964.69 965.00 134129
2017-01-01 01:40:00 965.00 965.00 963.67 963.67 59676
2017-01-01 01:45:00 963.67 963.67 963.67 963.67 0
2017-01-01 01:50:00 963.67 964.56 963.66 964.55 5531
2017-01-01 01:55:00 964.55 963.43 963.40 963.40 3000
2017-01-01 02:00:00 963.40 964.60 963.40 964.60 1301
2017-01-01 02:05:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:10:00 964.60 964.60 964.00 964.11 49954
2017-01-01 02:15:00 964.11 964.60 964.59 964.60 5000
2017-01-01 02:20:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:25:00 964.60 964.60 964.51 964.51 2000
2017-01-01 02:30:00 964.51 964.51 964.51 964.51 0
2017-01-01 02:35:00 964.51 964.51 963.23 963.99 16667
2017-01-01 02:40:00 963.99 963.99 963.65 963.66 10000
2017-01-01 02:45:00 963.66 964.26 963.16 964.26 75500
2017-01-01 02:50:00 964.26 964.26 964.26 964.26 0
2017-01-01 02:55:00 964.26 964.26 964.26 964.26 0
2017-01-01 03:00:00 964.26 964.61 963.98 964.61 13000
2017-01-01 03:05:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:10:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:15:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:20:00 964.61 964.82 964.48 964.82 16666
2017-01-01 03:25:00 964.82 965.00 963.99 964.97 50500
2017-01-01 03:30:00 964.97 964.97 964.02 964.02 56000
2017-01-01 03:35:00 964.02 964.29 964.29 964.29 500
2017-01-01 03:40:00 964.29 963.53 963.52 963.52 24000
2017-01-01 03:45:00 963.52 963.52 963.43 963.43 16500
2017-01-01 03:50:00 963.43 963.67 963.42 963.42 25002
2017-01-01 03:55:00 963.42 963.42 961.69 961.69 84507
2017-01-01 04:00:00 961.69 961.69 960.90 960.93 57909
2017-01-01 04:05:00 960.93 960.93 960.93 960.93 0
2017-01-01 04:10:00 960.93 961.19 961.19 961.19 400
2017-01-01 04:15:00 961.19 962.09 961.19 962.09 7001
2017-01-01 04:20:00 962.09 962.09 962.09 962.09 0
2017-01-01 04:25:00 962.09 962.10 961.14 961.14 32000
2017-01-01 04:30:00 961.14 961.14 960.93 960.93 41900
2017-01-01 04:35:00 960.93 961.94 960.93 961.64 640
2017-01-01 04:40:00 961.64 961.71 961.64 961.71 1
2017-01-01 04:45:00 961.71 962.00 961.90 961.99 5499
2017-01-01 04:50:00 961.99 961.99 961.99 961.99 0
2017-01-01 04:55:00 961.99 961.99 961.99 961.99 1
2017-01-01 05:00:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:05:00 961.99 961.99 961.99 961.99 40
2017-01-01 05:10:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:15:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:20:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:25:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:30:00 961.99 962.10 961.99 962.10 1382
2017-01-01 05:35:00 962.10 968.84 962.10 968.84 122909
2017-01-01 05:40:00 968.84 968.86 963.78 965.53 161263
2017-01-01 05:45:00 965.53 964.81 963.11 963.81 18021
2017-01-01 05:50:00 963.81 964.39 963.85 964.39 40006
2017-01-01 05:55:00 964.39 964.47 964.00 964.47 39966
2017-01-01 06:00:00 964.47 964.47 964.47 964.47 0
You can do this by filtering on the hour of the index and then using the period.max and period.min functions. The values are placed in the last record of the chosen hour. See the example below, which uses intraday MSFT data and computes the max and min values between 15:00 and 16:00.
library(xts)
# max of high values between 15 and 16. (excluding 16:00)
msft$max <- period.max(msft$high[.indexhour(msft) == 15], endpoints(msft$high[.indexhour(msft) == 15], on = "hour"))
# min of low values between 15 and 16. (excluding 16:00)
msft$min <- period.min(msft$low[.indexhour(msft) == 15], endpoints(msft$low[.indexhour(msft) == 15], on = "hour"))
head(msft[8:24], 16)
open high low close volume max min
2020-01-23 14:50:00 166.180 166.2300 166.090 166.1050 87934 NA NA
2020-01-23 14:55:00 166.105 166.2200 166.103 166.1700 92280 NA NA
2020-01-23 15:00:00 166.160 166.3500 166.160 166.3400 114359 NA NA
2020-01-23 15:05:00 166.335 166.3400 166.285 166.2850 102633 NA NA
2020-01-23 15:10:00 166.290 166.3050 166.170 166.2550 125558 NA NA
2020-01-23 15:15:00 166.250 166.2750 166.210 166.2400 103938 NA NA
2020-01-23 15:20:00 166.230 166.2500 166.180 166.2350 99649 NA NA
2020-01-23 15:25:00 166.240 166.3000 166.225 166.2850 93846 NA NA
2020-01-23 15:30:00 166.270 166.4164 166.175 166.3600 183154 NA NA
2020-01-23 15:35:00 166.360 166.5000 166.320 166.4600 177178 NA NA
2020-01-23 15:40:00 166.450 166.4650 166.380 166.3800 112174 NA NA
2020-01-23 15:45:00 166.385 166.4050 166.290 166.3875 152806 NA NA
2020-01-23 15:50:00 166.382 166.5200 166.362 166.4500 205667 NA NA
2020-01-23 15:55:00 166.450 166.6900 166.305 166.6700 508469 166.69 166.16
2020-01-23 16:00:00 166.660 166.7200 166.589 166.7200 934090 NA NA
2020-01-24 09:35:00 167.510 167.5300 166.890 166.8918 1152646 NA NA
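The same idea (keep only the rows in the chosen hour, then aggregate per day) can also be sketched without xts in base R. The data frame prices, its values, and the column names max_4to5 and min_4to5 are made up for illustration. Unlike the answer above, this sketch writes the daily value on every row of that day rather than only on the last record of the hour. As in the answer, a row stamped exactly 05:00:00 is excluded.

```r
# Made-up intraday records over two days, with UTC timestamps.
times <- as.POSIXct(c("2017-01-01 03:55:00", "2017-01-01 04:00:00",
                      "2017-01-01 04:30:00", "2017-01-01 05:00:00",
                      "2017-01-02 04:10:00", "2017-01-02 04:55:00",
                      "2017-01-02 06:00:00"), tz = "UTC")
prices <- data.frame(time = times,
                     High = c(10, 12, 15, 20, 7, 9, 30),
                     Low  = c( 9, 11, 13, 18, 5, 8, 25))

# Rows whose hour is 04, i.e. 04:00:00 up to 04:59:59.
in_window <- format(prices$time, "%H", tz = "UTC") == "04"
day <- as.Date(prices$time, tz = "UTC")

# One max of High and one min of Low per day, using the window rows only.
win_max <- tapply(prices$High[in_window], day[in_window], max)
win_min <- tapply(prices$Low[in_window], day[in_window], min)

# Map the daily values back onto every row of the matching day.
prices$max_4to5 <- as.vector(win_max[as.character(day)])
prices$min_4to5 <- as.vector(win_min[as.character(day)])
prices
```

A day with no records in the window would get NA in both new columns.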
data:
msft <- structure(c(166.224, 166.29, 166.29, 166.2456, 166.165, 166.1446,
166.1601, 166.18, 166.105, 166.16, 166.335, 166.29, 166.25, 166.23,
166.24, 166.27, 166.36, 166.45, 166.385, 166.382, 166.45, 166.66,
167.51, 167.03, 167.265, 167.325, 167.37, 167.16, 167.405, 167.35,
167.31, 167.39, 167.17, 167.1, 166.845, 167.03, 167.1223, 167.125,
167.21, 167.34, 167.235, 167.3, 167.37, 167.1977, 166.9814, 166.8499,
166.99, 166.93, 166.83, 166.64, 166.775, 166.85, 166.71, 166.6838,
166.46, 166.35, 165.765, 166.2269, 166.01, 166.19, 166.13, 166.31,
166.36, 166.42, 166.3682, 165.99, 166.1328, 165.85, 165.74, 165.8439,
165.655, 165.5434, 165.47, 165.3227, 165.0627, 165.03, 165.2546,
165.14, 165.1, 164.91, 164.75, 164.65, 164.53, 164.81, 164.8979,
164.6, 164.89, 164.94, 165.03, 165.12, 165.17, 165.24, 165.4,
165.335, 165.2734, 164.985, 164.9, 164.61, 164.93, 165.18, 166.315,
166.29, 166.3, 166.265, 166.22, 166.2201, 166.2, 166.23, 166.22,
166.35, 166.34, 166.305, 166.275, 166.25, 166.3, 166.4164, 166.5,
166.465, 166.405, 166.52, 166.69, 166.72, 167.53, 167.34, 167.39,
167.495, 167.47, 167.48, 167.4251, 167.3699, 167.42, 167.41,
167.2, 167.1, 167.03, 167.21, 167.23, 167.255, 167.35, 167.35,
167.33, 167.405, 167.38, 167.25, 167.01, 167, 167.02, 167.02,
166.8384, 166.9056, 166.86, 166.94, 166.75, 166.6844, 166.47,
166.42, 166.22, 166.4049, 166.221, 166.2003, 166.3749, 166.3999,
166.43, 166.43, 166.375, 166.175, 166.16, 165.96, 165.93, 165.86,
165.671, 165.64, 165.49, 165.4, 165.08, 165.27, 165.26, 165.34,
165.12, 165, 164.825, 164.765, 164.82, 164.89, 165, 164.89, 164.99,
165.041, 165.293, 165.23, 165.27, 165.44, 165.6046, 165.37, 165.295,
165.18, 164.93, 164.945, 165.185, 165.24, 166.22, 166.225, 166.25,
166.15, 166.145, 166.13, 166.1015, 166.09, 166.103, 166.16, 166.285,
166.17, 166.21, 166.18, 166.225, 166.175, 166.32, 166.38, 166.29,
166.362, 166.305, 166.589, 166.89, 167.03, 167.22, 167.32, 167.225,
167.16, 167.2, 167.23, 167.2801, 167.145, 167.05, 166.84, 166.77,
167, 167.1, 167.02, 167.18, 167.23, 167.223, 167.28, 167.1843,
166.862, 166.85, 166.821, 166.8121, 166.85, 166.55, 166.6303,
166.69, 166.7, 166.54, 166.4, 166.31, 165.76, 165.74, 165.8966,
165.91, 166.07, 166.09, 166.171, 166.32, 166.22, 165.96, 165.97,
165.82, 165.73, 165.72, 165.64, 165.49, 165.45, 165.32, 165.045,
164.89, 164.91, 165.09, 165.1, 164.91, 164.74, 164.53, 164.529,
164.53, 164.735, 164.59, 164.54, 164.88, 164.938, 165.01, 165.0792,
165.12, 165.22, 165.335, 165.263, 164.88, 164.89, 164.58, 164.58,
164.87, 164.87, 166.29, 166.275, 166.26, 166.16, 166.145, 166.155,
166.19, 166.105, 166.17, 166.34, 166.285, 166.255, 166.24, 166.235,
166.285, 166.36, 166.46, 166.38, 166.3875, 166.45, 166.67, 166.72,
166.8918, 167.27, 167.325, 167.371, 167.2251, 167.4, 167.34,
167.29, 167.3988, 167.2, 167.1047, 166.86, 167.025, 167.11, 167.12,
167.2, 167.345, 167.23, 167.29, 167.37, 167.1916, 167.0027, 166.85,
167, 166.94, 166.85, 166.64, 166.7738, 166.85, 166.72, 166.68,
166.4672, 166.3512, 165.79, 166.21, 165.9969, 166.18, 166.14,
166.2968, 166.36, 166.43, 166.36, 165.99, 166.13, 165.83, 165.73,
165.8405, 165.65, 165.545, 165.48, 165.33, 165.05, 165.0227,
165.26, 165.1425, 165.101, 164.91, 164.74, 164.6581, 164.5292,
164.805, 164.89, 164.59, 164.8801, 164.9498, 165.04, 165.12,
165.16, 165.2302, 165.4, 165.34, 165.28, 164.987, 164.89, 164.605,
164.94, 165.185, 165.04, 158120, 165333, 101115, 78491, 123999,
76037, 82733, 87934, 92280, 114359, 102633, 125558, 103938, 99649,
93846, 183154, 177178, 112174, 152806, 205667, 508469, 934090,
1152646, 558627, 277325, 321651, 255494, 333848, 272126, 395463,
194593, 211910, 193131, 242112, 210240, 193265, 139617, 204182,
179146, 159259, 237888, 410982, 213787, 233082, 188071, 193742,
132377, 118994, 264247, 182490, 109514, 138164, 221052, 194127,
169059, 458214, 247712, 169523, 115531, 161259, 263230, 155536,
82474, 87549, 109057, 101772, 130642, 171988, 117235, 134507,
236662, 219303, 217698, 219808, 420288, 208087, 149358, 197435,
218090, 267667, 320279, 422434, 340478, 273866, 258938, 212451,
268017, 323657, 267686, 214060, 222314, 293731, 288867, 219687,
304733, 251063, 425450, 455311, 741208, 1429645),
.Dim = c(100L, 5L),
.Dimnames = list(NULL, c("open", "high", "low", "close", "volume")),
index = structure(c(1579785300, 1579785600, 1579785900, 1579786200, 1579786500,
1579786800, 1579787100, 1579787400, 1579787700, 1579788000,
1579788300, 1579788600, 1579788900, 1579789200, 1579789500,
1579789800, 1579790100, 1579790400, 1579790700, 1579791000,
1579791300, 1579791600, 1579854900, 1579855200, 1579855500,
1579855800, 1579856100, 1579856400, 1579856700, 1579857000,
1579857300, 1579857600, 1579857900, 1579858200, 1579858500,
1579858800, 1579859100, 1579859400, 1579859700, 1579860000,
1579860300, 1579860600, 1579860900, 1579861200, 1579861500,
1579861800, 1579862100, 1579862400, 1579862700, 1579863000,
1579863300, 1579863600, 1579863900, 1579864200, 1579864500,
1579864800, 1579865100, 1579865400, 1579865700, 1579866000,
1579866300, 1579866600, 1579866900, 1579867200, 1579867500,
1579867800, 1579868100, 1579868400, 1579868700, 1579869000,
1579869300, 1579869600, 1579869900, 1579870200, 1579870500,
1579870800, 1579871100, 1579871400, 1579871700, 1579872000,
1579872300, 1579872600, 1579872900, 1579873200, 1579873500,
1579873800, 1579874100, 1579874400, 1579874700, 1579875000,
1579875300, 1579875600, 1579875900, 1579876200, 1579876500,
1579876800, 1579877100, 1579877400, 1579877700, 1579878000),
tzone = "",
tclass = c("POSIXct", "POSIXt")),
class = c("xts", "zoo"))

How to plot lagged data against other data in R

I would like to lag one variable by, say, 10 time steps and plot it against the other variable, which remains unchanged. I would like to do this for various lags to see whether there is a time period over which the first variable influences the other. The data I have are daily, and after lagging I subset to Dec-Feb data only. The problem I am having is that the plot and correlation between the lagged variable and the other data come out the same as the non-lagged plot and correlation every time. I am not sure how to achieve this.
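The procedure described above (shift one series by k steps, then compare it with the other) can be sketched in base R as follows. This is a minimal illustration on made-up vectors, not the asker's code; the helper name lag_by is hypothetical, and the vectors are built so that y follows x with a delay of two steps.

```r
# Hypothetical helper: shift a vector forward by k steps (k >= 1), padding with NA.
lag_by <- function(v, k) c(rep(NA, k), head(v, -k))

# Made-up series in which y[t] equals x[t - 2] from the third value onwards.
x <- c(1, 3, 2, 5, 4, 6, 8, 7)
y <- c(5, 9, 1, 3, 2, 5, 4, 6)

x_lag2 <- lag_by(x, 2)

# Correlation without and with the lag; rows containing NA are dropped.
cor_lag0 <- cor(x, y)
cor_lag2 <- cor(x_lag2, y, use = "complete.obs")

# Scatter plot of the lagged series against the unchanged one.
plot(x_lag2, y, xlab = "x lagged by 2 steps", ylab = "y")
```

In this toy case the lagged correlation differs from the unlagged one, so identical results for every lag would suggest that the lagged column is not the one being plotted or correlated.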
A sample of my data frame "data" can be seen below.
Date x y
14158 2017-10-05 1.913918e+00 -0.1538234614
14159 2017-10-06 1.479714e+00 -0.1937094170
14160 2017-10-07 8.783669e-01 -0.1703790211
14161 2017-10-08 5.706581e-01 -0.1294144428
14162 2017-10-09 4.979405e-01 -0.0666569815
14163 2017-10-10 3.233477e-01 0.0072006102
14164 2017-10-11 3.057630e-01 0.0863445067
14165 2017-10-12 5.877673e-01 0.1097707831
14166 2017-10-13 1.208526e+00 0.1301967193
14167 2017-10-14 1.671705e+00 0.1728109268
14168 2017-10-15 1.810979e+00 0.2264911145
14169 2017-10-16 1.426651e+00 0.2702958315
14170 2017-10-17 1.241140e+00 0.3242637704
14171 2017-10-18 8.997498e-01 0.3879727861
14172 2017-10-19 5.594161e-01 0.4172990825
14173 2017-10-20 3.980254e-01 0.3915170864
14174 2017-10-21 2.138538e-01 0.3249736995
14175 2017-10-22 3.926440e-01 0.2224834840
14176 2017-10-23 2.268644e-01 0.0529143372
14177 2017-10-24 5.664923e-01 -0.0081443464
14178 2017-10-25 6.167520e-01 0.0312073984
14179 2017-10-26 7.751882e-02 0.0043897693
14180 2017-10-27 -5.634851e-02 -0.0726825266
14181 2017-10-28 -2.122061e-01 -0.1711305549
14182 2017-10-29 -8.500991e-01 -0.2068581639
14183 2017-10-30 -1.039685e+00 -0.2909120824
14184 2017-10-31 -3.057745e-01 -0.3933633317
14185 2017-11-01 -1.288774e-01 -0.3726346136
14186 2017-11-02 -5.608007e-03 -0.2425754386
14187 2017-11-03 4.853990e-01 -0.0503543980
14188 2017-11-04 5.822672e-01 0.0896130098
14189 2017-11-05 8.491505e-01 0.1299151006
14190 2017-11-06 1.052999e+00 0.0749888307
14191 2017-11-07 1.170470e+00 0.0287317882
14192 2017-11-08 7.919862e-01 0.0788187381
14193 2017-11-09 4.574565e-01 0.1539981316
14194 2017-11-10 4.552032e-01 0.2034393145
14195 2017-11-11 -3.621350e-01 0.2077476707
14196 2017-11-12 -8.053965e-01 0.1759558604
14197 2017-11-13 -8.307459e-01 0.1802858410
14198 2017-11-14 -9.421325e-01 0.2175529008
14199 2017-11-15 -9.880204e-01 0.2392924580
14200 2017-11-16 -7.448127e-01 0.2519253751
14201 2017-11-17 -8.081435e-01 0.2614254732
14202 2017-11-18 -1.216806e+00 0.2629971336
14203 2017-11-19 -1.122674e+00 0.3469995055
14204 2017-11-20 -1.242597e+00 0.4553094014
14205 2017-11-21 -1.294885e+00 0.5049438231
14206 2017-11-22 -9.325514e-01 0.4684133163
14207 2017-11-23 -4.632281e-01 0.4071673624
14208 2017-11-24 -9.689322e-02 0.3710270269
14209 2017-11-25 4.704467e-01 0.4126721465
14210 2017-11-26 8.682453e-01 0.3745057653
14211 2017-11-27 5.105564e-01 0.2373454931
14212 2017-11-28 4.747265e-01 0.1650783370
14213 2017-11-29 5.905379e-01 0.2632154120
14214 2017-11-30 4.083787e-01 0.3888834762
14215 2017-12-01 3.451736e-01 0.5008047592
14216 2017-12-02 5.161312e-01 0.5388177242
14217 2017-12-03 7.109279e-01 0.5515360710
14218 2017-12-04 4.458635e-01 0.5127537202
14219 2017-12-05 -3.986610e-01 0.3896493238
14220 2017-12-06 -5.968253e-01 0.1095843268
14221 2017-12-07 -1.604398e-01 -0.2455506506
14222 2017-12-08 -4.384744e-01 -0.5801038215
14223 2017-12-09 -7.255016e-01 -0.8384627087
14224 2017-12-10 -9.691828e-01 -0.9223171538
14225 2017-12-11 -1.140588e+00 -0.8177806761
14226 2017-12-12 -1.956622e-01 -0.5250998474
14227 2017-12-13 -1.083792e-01 -0.3430768534
14228 2017-12-14 -8.016345e-02 -0.3163476104
14229 2017-12-15 8.899266e-01 -0.2813253830
14230 2017-12-16 1.322833e+00 -0.2545953062
14231 2017-12-17 1.547972e+00 -0.2275373110
14232 2017-12-18 2.164907e+00 -0.3217205817
14233 2017-12-19 2.276258e+00 -0.5773412429
14234 2017-12-20 1.862291e+00 -0.7728091393
14235 2017-12-21 1.125083e+00 -0.9099696881
14236 2017-12-22 7.737118e-01 -1.2441963604
14237 2017-12-23 7.863508e-01 -1.4802661587
14238 2017-12-24 4.313111e-01 -1.4111320559
14239 2017-12-25 -8.814799e-02 -1.0024805520
14240 2017-12-26 -3.615127e-01 -0.4943077147
14241 2017-12-27 -5.011363e-01 -0.0308588186
14242 2017-12-28 -8.474088e-01 0.3717555895
14243 2017-12-29 -7.283247e-01 0.8230450219
14244 2017-12-30 -4.566981e-01 1.2495961116
14245 2017-12-31 -4.577034e-01 1.4805369230
14246 2018-01-01 1.946166e-01 1.5310004017
14247 2018-01-02 5.203149e-01 1.5384595802
14248 2018-01-03 5.024570e-02 1.4036679018
14249 2018-01-04 -7.065297e-01 1.0749574137
14250 2018-01-05 -8.741815e-01 0.7608524752
14251 2018-01-06 1.589530e-01 0.7891084646
14252 2018-01-07 8.632378e-01 1.1230358751
I am using
lagged <- lag(ts(x), k=10)
This is so the tsp isn't ignored. However, when I do
cor(data$x, data$y)
and
cor(lagged, data$y)
the result is the same, whereas I expected it to be different. How do I get this lag to work so that I can go ahead and separate by date?
Many thanks!
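A possible explanation, offered as a sketch rather than a confirmed diagnosis: stats::lag() applied to a ts object only moves the time attributes (tsp) and leaves the values in the same positions, while cor() pairs values by position, so the correlation cannot change. Shifting the values themselves gives a different pairing. The helper lag_vec() and the toy data below are hypothetical and are not taken from the question.

```r
# Hypothetical helper: shift values k steps later, padding the start with NA.
lag_vec <- function(v, k) {
  stopifnot(k >= 0, k <= length(v))
  c(rep(NA, k), v[seq_len(length(v) - k)])
}

# Toy daily data standing in for the question's data frame "data".
set.seed(42)
n <- 150
toy <- data.frame(Date = seq(as.Date("2017-10-05"), by = "day", length.out = n),
                  x = rnorm(n))
# Assumption for the demo: y follows x with a 10-day delay plus noise.
toy$y <- 0.8 * lag_vec(toy$x, 10) + rnorm(n, sd = 0.3)

# stats::lag() on a ts leaves the values untouched; only tsp moves.
same_values <- all(as.numeric(stats::lag(ts(toy$x), k = 10)) == toy$x)

# Lag first, then keep December to February only.
toy$x_lag10 <- lag_vec(toy$x, 10)
winter <- toy[format(toy$Date, "%m") %in% c("12", "01", "02"), ]

cor_unlagged <- cor(winter$x, winter$y, use = "complete.obs")
cor_lagged   <- cor(winter$x_lag10, winter$y, use = "complete.obs")
plot(winter$x_lag10, winter$y, xlab = "x lagged 10 days", ylab = "y")
```

Lagging before subsetting matters here: the first December rows then still receive values from late November instead of NA.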

How to calculate daily fluid infusion volume with variable infusion rates

Working in R, I need to calculate daily infusion volume (mL) given a variable infusion rate (mL/hour).
My dataframe has two columns: date (year, month, day, hours, mins, sec) when the infusion rate was changed, and the new infusion rate (ml/hr). From these data I have calculated cumulative infusion volume for the entire study (~ 3 weeks duration). I now need to calculate infusion volume for every 24 hours, midnight to midnight. The first and last study days are less than 24 hours duration and are excluded.
I don't know how to approach my problem with infusion rates spanning across 24 hour time periods at midnight.
One thought was to generate a new data frame consisting of time in secs (from zero to end of study) and volume infused per second, then sum infusion volume every day. This of course will generate a large (unnecessary) dataframe (>1 million rows).
I am looking for direction on how to approach in R.
No code to share at this time. My dataframe is shared: https://drive.google.com/file/d/1YfZkuOStOxWIXrxklWEo1r46hjFQPIXM/view
DF <- structure(list(`date&time` = structure(c(1519043251, 1519047111,
1519049877, 1519050201, 1519053454, 1519054180, 1519060742, 1519062334,
1519083584, 1519108892, 1519114732, 1519118888, 1519127198, 1519140960,
1519142031, 1519150508, 1519161027, 1519167167, 1519206508, 1519206877,
1519222879, 1519278875, 1519290863, 1519293411, 1519314864, 1519317665,
1519334695, 1519364934, 1519364996, 1519378625, 1519384577, 1519428049,
1519495090, 1519541667, 1519544091, 1519551993, 1519594678, 1519626216,
1519650059, 1519658045, 1519712871, 1519722853, 1519726863, 1519744270,
1519786071, 1519787755, 1519788820, 1519789685, 1519791798, 1519801303,
1519801380, 1519809813, 1519815924, 1519826260, 1519830433, 1519833629,
1519841284, 1519857415, 1519885051, 1519885120, 1519885141, 1519887091,
1519939049, 1519939482, 1519945740, 1519971397, 1519975527, 1519987363,
1519988481, 1520004464, 1520033974, 1520093329, 1520179994, 1520204550,
1520233073, 1520237983, 1520238103, 1520241519, 1520241904, 1520263216,
1520290670, 1520349278, 1520370509, 1520406514, 1520436434, 1520447318,
1520456518, 1520461383, 1520501027, 1520522600, 1520542062, 1520590191,
1520618693, 1520621059, 1520626341, 1520627226, 1520630596, 1520637370,
1520664044, 1520676143, 1520689466, 1520717079, 1520724147, 1520754787,
1520788241, 1520806426, 1520818840, 1520829807, 1520839843, 1520839936,
1520891100, 1520897458, 1520921676, 1520933752), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), `infusion rate` = c(25.75, 30.75,
25.75, 25.81, 25.81, 25.75, 25.65, 25.65, 27.55, 18.47, 18.25,
16.25, 15.25, 13.25, 13.25, 15.25, 16.25, 15.25, 15.45, 12.45,
12.25, 12.45, 11.45, 11.5, 11.57, 13.57, 11.57, 10.57, 10.55,
11.55, 13.55, 13.52, 13.56, 13.64, 13.7, 13.67, 13.67, 13.65,
14.65, 14.61, 14.67, 14.69, 13.69, 13.67, 16.67, 21.67, 24.67,
29.67, 34.67, 29.67, 29.65, 24.65, 22.65, 19.65, 19.65, 17.65,
14.65, 14.63, 14.65, 15.65, 14.65, 15.65, 16.65, 15.65, 15.68,
15.71, 15.74, 15.81, 15.92, 15.89, 15.9, 15.94, 15.93, 14.94,
15.92, 16.03, 15.03, 15, 15.02, 14.96, 14.91, 14.93, 14.94, 14.94,
14.91, 14.92, 14.92, 14.92, 14.94, 14.95, 15.95, 14.95, 16.95,
19.95, 22.95, 25.95, 26.95, 26.93, 26.89, 23.89, 20.89, 18.89,
18.87, 16.87, 15.87, 15.87, 14.87, 17.87, 16.87, 16.98, 17.98,
16.98, 15.98, 0)), row.names = 2:115, class = "data.frame")
I need the output to be two columns of data; time in days and daily infusion volume.
One possible solution is to use the foverlaps() function from the data.table package. foverlaps() finds all overlapping intervals (ranges, periods) by an overlap join:
library(data.table)
# coerce to data.table
setDT(DF)
# rename column names (syntactically correct)
setnames(DF, make.names(names(DF)))
DF
# create intervals (ranges) of infusion periods
DF_ranges <- DF[, .(start = head(date.time, -1L),
end = tail(date.time, -1L),
inf.rate = head(infusion.rate, -1L))]
setkey(DF_ranges, start, end)
# create sequence of calendar days (starting at midnight)
day_seq <- DF[, seq(lubridate::floor_date(min(date.time), "day"),
max(date.time), "1 day")]
# create intervals of days (from midnight to midnight)
day_ranges <- data.table(start = day_seq, end = day_seq + as.difftime(1, units = "days"))
# find all overlapping intervals (overlap join)
ovl <- foverlaps(day_ranges, DF_ranges)
# compute duration of infusion periods within each day
ovl[, inf.hours := difftime(pmin(end, i.end), pmax(start, i.start), units = "hours")]
# compute infusion volume for each period
ovl[, inf.vol := inf.rate * as.double(inf.hours)]
# aggregate by day
ovl[, .(inf.vol.per.day = sum(inf.vol)), by = .(day = as.Date(i.start))][
# drop first and last day
-c(1L, .N)]
day inf.vol.per.day
1: 2018-02-20 455.7107
2: 2018-02-21 324.6403
3: 2018-02-22 293.5880
4: 2018-02-23 298.9512
5: 2018-02-24 324.7212
6: 2018-02-25 327.3658
7: 2018-02-26 338.3609
8: 2018-02-27 338.1620
9: 2018-02-28 507.9508
10: 2018-03-01 368.7672
11: 2018-03-02 379.4539
12: 2018-03-03 381.9141
13: 2018-03-04 381.5335
14: 2018-03-05 360.6198
15: 2018-03-06 358.0437
16: 2018-03-07 358.3588
17: 2018-03-08 361.6632
18: 2018-03-09 421.2107
19: 2018-03-10 567.7771
20: 2018-03-11 413.8286
21: 2018-03-12 403.4742
day inf.vol.per.day
The intermediate results are
DF_ranges
start end inf.rate
1: 2018-02-19 12:27:31 2018-02-19 13:31:51 25.75
2: 2018-02-19 13:31:51 2018-02-19 14:17:57 30.75
3: 2018-02-19 14:17:57 2018-02-19 14:23:21 25.75
4: 2018-02-19 14:23:21 2018-02-19 15:17:34 25.81
5: 2018-02-19 15:17:34 2018-02-19 15:29:40 25.81
---
109: 2018-03-12 07:30:43 2018-03-12 07:32:16 16.87
110: 2018-03-12 07:32:16 2018-03-12 21:45:00 16.98
111: 2018-03-12 21:45:00 2018-03-12 23:30:58 17.98
112: 2018-03-12 23:30:58 2018-03-13 06:14:36 16.98
113: 2018-03-13 06:14:36 2018-03-13 09:35:52 15.98
day_ranges
start end
1: 2018-02-19 2018-02-20
2: 2018-02-20 2018-02-21
3: 2018-02-21 2018-02-22
4: 2018-02-22 2018-02-23
5: 2018-02-23 2018-02-24
6: 2018-02-24 2018-02-25
7: 2018-02-25 2018-02-26
8: 2018-02-26 2018-02-27
9: 2018-02-27 2018-02-28
10: 2018-02-28 2018-03-01
11: 2018-03-01 2018-03-02
12: 2018-03-02 2018-03-03
13: 2018-03-03 2018-03-04
14: 2018-03-04 2018-03-05
15: 2018-03-05 2018-03-06
16: 2018-03-06 2018-03-07
17: 2018-03-07 2018-03-08
18: 2018-03-08 2018-03-09
19: 2018-03-09 2018-03-10
20: 2018-03-10 2018-03-11
21: 2018-03-11 2018-03-12
22: 2018-03-12 2018-03-13
23: 2018-03-13 2018-03-14
start end
foverlaps(day_ranges, DF_ranges)
start end inf.rate i.start i.end
1: 2018-02-19 12:27:31 2018-02-19 13:31:51 25.75 2018-02-19 2018-02-20
2: 2018-02-19 13:31:51 2018-02-19 14:17:57 30.75 2018-02-19 2018-02-20
3: 2018-02-19 14:17:57 2018-02-19 14:23:21 25.75 2018-02-19 2018-02-20
4: 2018-02-19 14:23:21 2018-02-19 15:17:34 25.81 2018-02-19 2018-02-20
5: 2018-02-19 15:17:34 2018-02-19 15:29:40 25.81 2018-02-19 2018-02-20
---
131: 2018-03-12 07:32:16 2018-03-12 21:45:00 16.98 2018-03-12 2018-03-13
132: 2018-03-12 21:45:00 2018-03-12 23:30:58 17.98 2018-03-12 2018-03-13
133: 2018-03-12 23:30:58 2018-03-13 06:14:36 16.98 2018-03-12 2018-03-13
134: 2018-03-12 23:30:58 2018-03-13 06:14:36 16.98 2018-03-13 2018-03-14
135: 2018-03-13 06:14:36 2018-03-13 09:35:52 15.98 2018-03-13 2018-03-14
ovl
start end inf.rate i.start i.end inf.hours inf.vol
1: 2018-02-19 12:27:31 2018-02-19 13:31:51 25.75 2018-02-19 2018-02-20 1.0722222 hours 27.609722
2: 2018-02-19 13:31:51 2018-02-19 14:17:57 30.75 2018-02-19 2018-02-20 0.7683333 hours 23.626250
3: 2018-02-19 14:17:57 2018-02-19 14:23:21 25.75 2018-02-19 2018-02-20 0.0900000 hours 2.317500
4: 2018-02-19 14:23:21 2018-02-19 15:17:34 25.81 2018-02-19 2018-02-20 0.9036111 hours 23.322203
5: 2018-02-19 15:17:34 2018-02-19 15:29:40 25.81 2018-02-19 2018-02-20 0.2016667 hours 5.205017
---
131: 2018-03-12 07:32:16 2018-03-12 21:45:00 16.98 2018-03-12 2018-03-13 14.2122222 hours 241.323533
132: 2018-03-12 21:45:00 2018-03-12 23:30:58 17.98 2018-03-12 2018-03-13 1.7661111 hours 31.754678
133: 2018-03-12 23:30:58 2018-03-13 06:14:36 16.98 2018-03-12 2018-03-13 0.4838889 hours 8.216433
134: 2018-03-12 23:30:58 2018-03-13 06:14:36 16.98 2018-03-13 2018-03-14 6.2433333 hours 106.011800
135: 2018-03-13 06:14:36 2018-03-13 09:35:52 15.98 2018-03-13 2018-03-14 3.3544444 hours 53.604022
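As an alternative that needs only base R (this is not the foverlaps() method above, just a sketch under stated assumptions): with a piecewise-constant rate the cumulative volume is piecewise linear in time, so it can be interpolated at each midnight and differenced. The toy times and rates below are made up for illustration, and the last rate is assumed to mark the end of the study.

```r
# Toy data (hypothetical): times at which the rate changed, and the new rate in mL/hour.
times <- as.POSIXct(c("2018-02-19 12:00:00", "2018-02-20 06:00:00",
                      "2018-02-21 18:00:00", "2018-02-22 03:00:00"), tz = "UTC")
rate  <- c(10, 20, 5, 0)

# Hours each rate was in effect (the last rate has no end time, so it is dropped).
hours <- as.numeric(diff(times), units = "hours")
# Cumulative volume at every rate change; it is linear between changes.
cum_vol <- c(0, cumsum(head(rate, -1) * hours))

# Midnights that fall inside the study period.
first_midnight <- as.POSIXct(trunc(min(times), units = "days")) + 86400
last_midnight  <- as.POSIXct(trunc(max(times), units = "days"))
midnights <- seq(first_midnight, last_midnight, by = "1 day")

# Linear interpolation of the cumulative volume at each midnight.
cum_at_midnight <- approx(as.numeric(times), cum_vol,
                          xout = as.numeric(midnights))$y

# Volume per complete day; the partial first and last days drop out automatically.
daily <- data.frame(day = as.Date(head(midnights, -1)),
                    volume_mL = diff(cum_at_midnight))
daily
```

For the toy data this gives 420 mL for 2018-02-20 (6 h at 10 mL/h plus 18 h at 20 mL/h) and 390 mL for 2018-02-21 (18 h at 20 mL/h plus 6 h at 5 mL/h).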

R: Calculate 12-month cumulative returns

I have a time series dataset, and I'd like to get rolling 12-month cumulative returns. Below is what my code and data look like:
df <- data.frame(
A = c(-0.0195, 0.0079, 0.0034, 0.0394, -0.0065, 0.0034, 0.0136, 0.0683, -0.0063, -0.0537, -0.0216, -0.0036, 0.0659, -0.0377, -0.0568, 0.0039, -0.0191, 0.0028),
B = c(-0.0211, 0.0021, 0.0014, 0.0358, 0.0009, 0.0153, 0.0071, 0.0658, 0.0033, -0.0542, -0.0261, 0.0064, 0.0665, -0.0304, -0.0507, 0.0089, NA, NA),
C= c(-0.0176, 0.0144, 0.0057, 0.0442, -0.0152, -0.0105, 0.0213, 0.0712, -0.0176, -0.0531, -0.0163, -0.0154, 0.0652, NA, NA, NA, NA, NA)
)
row.names(df) <- c("2016-10-31", "2016-09-30", "2016-08-31", "2016-07-31", "2016-06-30", "2016-05-31", "2016-04-30", "2016-03-31", "2016-02-29", "2016-01-31", "2015-12-31", "2015-11-30", "2015-10-31", "2015-09-30", "2015-08-31", "2015-07-31", "2015-06-30", "2015-05-31")
> df
A B C
2016-10-31 -0.0195 -0.0211 -0.0176
2016-09-30 0.0079 0.0021 0.0144
2016-08-31 0.0034 0.0014 0.0057
2016-07-31 0.0394 0.0358 0.0442
2016-06-30 -0.0065 0.0009 -0.0152
2016-05-31 0.0034 0.0153 -0.0105
2016-04-30 0.0136 0.0071 0.0213
2016-03-31 0.0683 0.0658 0.0712
2016-02-29 -0.0063 0.0033 -0.0176
2016-01-31 -0.0537 -0.0542 -0.0531
2015-12-31 -0.0216 -0.0261 -0.0163
2015-11-30 -0.0036 0.0064 -0.0154
2015-10-31 0.0659 0.0665 0.0652
2015-09-30 -0.0377 -0.0304 NA
2015-08-31 -0.0568 -0.0507 NA
2015-07-31 0.0039 0.0089 NA
2015-06-30 -0.0191 NA NA
2015-05-31 0.0028 NA NA
I want to get 12-month cumulative returns for each month using the formula prod(1+R)-1 (the product of 12 individual period returns, minus 1). The results should be (NA marks months without a full 12-month window):
A(1-Y) B(1-Y) C(1-Y)
2016-10-31 0.0198 0.0322 0.0052
2016-09-30 0.1086 0.1246 0.0898
2016-08-31 0.0585 0.0881 NA
2016-07-31 -0.0050 0.0316 NA
2016-06-30 -0.0390 0.0048 NA
2016-05-31 -0.0512 NA NA
2016-04-30 -0.0517 NA NA
B(1-Y) only has 5 cumulative returns because B has no data prior to 2015-07-31. Thus, it does not satisfy the condition of 12 months (there are only 11 months between 2015-07-31 and 2016-05-31).
I have tried Return.cumulative(df), but this function gives cumulative returns since inception, which is not what I am looking for. Any suggestion will be appreciated!
As A.Webb mentions, rollapply gives the desired result. You can also use a for loop.
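# rows are ordered newest first, so rows i..(i+11) cover month i and the 11 months before it
n <- nrow(df) - 11
a <- b <- c <- numeric(n)
for(i in seq_len(n)){
a[i] <- prod(1 + df$A[i:(i+11)]) - 1
b[i] <- prod(1 + df$B[i:(i+11)]) - 1
c[i] <- prod(1 + df$C[i:(i+11)]) - 1
}
# windows that contain an NA come out as NA
cum.returns <- data.frame(a,b,c)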
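The loop can also be written once and applied to every column. The following is a base R sketch, not the rollapply call mentioned above; roll_cum_return() is a hypothetical helper name and the toy returns are made up for illustration.

```r
# Hypothetical helper: compounded return over every window of `width` consecutive values.
roll_cum_return <- function(r, width = 12) {
  n <- length(r) - width + 1
  if (n < 1) return(numeric(0))
  vapply(seq_len(n),
         function(i) prod(1 + r[i:(i + width - 1)]) - 1,
         numeric(1))
}

# Toy monthly returns, newest first as in the question (values are made up).
toy <- data.frame(A = c(0.10, 0.10, -0.50, 0.20),
                  B = c(0.05, -0.02, 0.01, NA))

# Apply to every column; a window that contains NA yields NA.
res <- as.data.frame(lapply(toy, roll_cum_return, width = 2))
res
```

For the data in the question, as.data.frame(lapply(df, roll_cum_return)) should give one row per complete 12-month window, with NA wherever a column lacks a full window.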

Resources