dealing with "missing" times when setting data to xts - r
I have some data which looks like the following:
Dates Open Close
1000 06/06/2019 0:05 244.599 244.524
1001 06/06/2019 0:04 244.592 244.599
1002 06/06/2019 0:03 244.564 244.592
1003 06/06/2019 0:02 244.809 244.564
1004 06/06/2019 0:01 244.849 244.809
1005 06/06/2019 245.080 244.849
1006 05/06/2019 23:59 245.092 245.080
1007 05/06/2019 23:58 245.253 245.092
1008 05/06/2019 23:57 244.858 245.253
1009 05/06/2019 23:56 244.643 244.863
1010 05/06/2019 23:55 244.720 244.643
Row 1005 doesn't have a time stamp. I tried to convert my dates to POSIXlt format:
data$Dates <- gsub("/", "-", data$Dates)
data$Dates <- as.POSIXlt(strptime(data$Dates, format="%d-%m-%Y %H:%M"))
Now my data looks like:
Dates Open Close
1000 2019-06-06 00:05:00 244.599 244.524
1001 2019-06-06 00:04:00 244.592 244.599
1002 2019-06-06 00:03:00 244.564 244.592
1003 2019-06-06 00:02:00 244.809 244.564
1004 2019-06-06 00:01:00 244.849 244.809
1005 <NA> 245.080 244.849
1006 2019-06-05 23:59:00 245.092 245.080
1007 2019-06-05 23:58:00 245.253 245.092
1008 2019-06-05 23:57:00 244.858 245.253
1009 2019-06-05 23:56:00 244.643 244.863
1010 2019-06-05 23:55:00 244.720 244.643
I am wondering whether there is a way to convert the entries that have no hour or minute data. This only occurs at midnight (0:00).
Data:
data <- structure(list(Dates = c("06/06/2019 0:05", "06/06/2019 0:04",
"06/06/2019 0:03", "06/06/2019 0:02", "06/06/2019 0:01", "06/06/2019",
"05/06/2019 23:59", "05/06/2019 23:58", "05/06/2019 23:57", "05/06/2019 23:56",
"05/06/2019 23:55"), Open = c(244.599, 244.592, 244.564, 244.809,
244.849, 245.08, 245.092, 245.253, 244.858, 244.643, 244.72),
Close = c(244.524, 244.599, 244.592, 244.564, 244.809, 244.849,
245.08, 245.092, 245.253, 244.863, 244.643)), row.names = 1000:1010, class = "data.frame")
EDIT:
I just thought that perhaps I should first split the column into two (one for dates and another for times), fill in the blank cells in the second column with 0:00, and paste them back together.
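The split-and-paste idea from the edit could be sketched in base R roughly as follows. The names parts, date_part and time_part are illustrative, only three of the sample strings are used, and tz = "UTC" is an assumption made so the result does not depend on the local time zone.

```r
# A few of the Dates strings from the question.
dates <- c("06/06/2019 0:01", "06/06/2019", "05/06/2019 23:59")

# Split each string on the space: date part and (possibly absent) time part.
parts <- strsplit(dates, " ", fixed = TRUE)
date_part <- vapply(parts, function(p) p[1], character(1))

# Rows without a time part get "0:00" filled in.
time_part <- vapply(parts, function(p) if (length(p) > 1) p[2] else "0:00",
                    character(1))

# Paste back together and parse (tz = "UTC" is an assumption).
parsed <- as.POSIXct(paste(date_part, time_part),
                     format = "%d/%m/%Y %H:%M", tz = "UTC")
```

The parsed vector can then replace data$Dates before the data is turned into an xts object.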
parse_date_time in the lubridate package will successively check alternative formats until it succeeds if you give it a vector of formats. The separators and percent signs can be omitted from the format strings.
library(lubridate)
parse_date_time(data$Dates, c("dmYHM", "dmY"), tz = "")
giving:
[1] "2019-06-06 00:05:00 EDT" "2019-06-06 00:04:00 EDT"
[3] "2019-06-06 00:03:00 EDT" "2019-06-06 00:02:00 EDT"
[5] "2019-06-06 00:01:00 EDT" "2019-06-06 00:00:00 EDT"
[7] "2019-06-05 23:59:00 EDT" "2019-06-05 23:58:00 EDT"
[9] "2019-06-05 23:57:00 EDT" "2019-06-05 23:56:00 EDT"
[11] "2019-06-05 23:55:00 EDT"
Using dplyr, one possibility could be:
library(dplyr)
data %>%
  mutate(Dates = ifelse(nchar(Dates) == 10, paste(Dates, "0:00", sep = " "), Dates),
         Dates = as.POSIXct(Dates, format = "%d/%m/%Y %H:%M"))
Dates Open Close
1 2019-06-06 00:05:00 244.599 244.524
2 2019-06-06 00:04:00 244.592 244.599
3 2019-06-06 00:03:00 244.564 244.592
4 2019-06-06 00:02:00 244.809 244.564
5 2019-06-06 00:01:00 244.849 244.809
6 2019-06-06 00:00:00 245.080 244.849
7 2019-06-05 23:59:00 245.092 245.080
8 2019-06-05 23:58:00 245.253 245.092
9 2019-06-05 23:57:00 244.858 245.253
10 2019-06-05 23:56:00 244.643 244.863
11 2019-06-05 23:55:00 244.720 244.643
Here, for rows whose Dates string is only 10 characters long (a date with no time), it appends 0:00 to the date.
The same with base R:
data$Dates <- ifelse(nchar(data$Dates) == 10, paste(data$Dates, "0:00", sep = " "), data$Dates)
as.POSIXct(data$Dates, format = "%d/%m/%Y %H:%M")
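As a hedged variant of the approaches above, the check below looks for a trailing H:MM pattern instead of relying on the string being exactly 10 characters long, and then verifies that no NA remains and that the rows are one minute apart. The names has_time, fixed and gaps are illustrative, and tz = "UTC" is an assumption.

```r
# Dates column from the question, as character strings.
dates <- c("06/06/2019 0:05", "06/06/2019 0:04", "06/06/2019 0:03",
           "06/06/2019 0:02", "06/06/2019 0:01", "06/06/2019",
           "05/06/2019 23:59", "05/06/2019 23:58", "05/06/2019 23:57",
           "05/06/2019 23:56", "05/06/2019 23:55")

# Detect a trailing H:MM or HH:MM instead of counting characters.
has_time <- grepl("[0-9]{1,2}:[0-9]{2}$", dates)
fixed <- ifelse(has_time, dates, paste(dates, "0:00"))

# tz = "UTC" is an assumption; use the zone the data was recorded in.
parsed <- as.POSIXct(fixed, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Sanity checks: nothing failed to parse, and the sorted rows are
# exactly one minute apart, so midnight sits between 23:59 and 0:01.
stopifnot(!anyNA(parsed))
gaps <- as.numeric(diff(sort(parsed)), units = "mins")
stopifnot(all(gaps == 1))
```

Running the checks before building the xts object catches any remaining unparsed rows early, since an NA in the index would otherwise only show up later.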
Related
min and max over time range on each day of xts
I have an xts object with intraday OHLC price data over several years. I'd like to be able to write a function that calculates the min and max value between 04:00:00 and 05:00:00 every day and includes that as a column in the xts object. I'm not really familiar with manipulating xts objects. Can anyone point me in the right direction? Here's a head of the xts object.
Open High Low Close Volume
2017-01-01 00:00:00 968.29 968.76 966.74 966.97 106562
2017-01-01 00:05:00 966.97 967.00 966.89 966.89 13731
2017-01-01 00:10:00 966.89 966.89 964.86 964.86 124137
2017-01-01 00:15:00 964.86 964.99 964.80 964.80 3001
2017-01-01 00:20:00 964.80 964.80 964.80 964.80 0
2017-01-01 00:25:00 964.80 965.09 964.54 964.91 48000
2017-01-01 00:30:00 964.91 965.01 964.91 965.01 2501
2017-01-01 00:35:00 965.01 967.82 965.57 967.82 71501
2017-01-01 00:40:00 967.82 967.82 967.08 967.08 50
2017-01-01 00:45:00 967.08 967.40 967.40 967.40 50
2017-01-01 00:50:00 967.40 968.08 967.40 968.08 14000
2017-01-01 00:55:00 968.08 968.08 966.89 968.00 1008
2017-01-01 01:00:00 968.00 968.10 968.00 968.10 1002
2017-01-01 01:05:00 968.10 968.10 967.62 967.62 5200
2017-01-01 01:10:00 967.62 967.70 966.29 966.29 35476
2017-01-01 01:15:00 966.29 966.29 966.28 966.28 3068
2017-01-01 01:20:00 966.28 966.66 965.00 965.00 30471
2017-01-01 01:25:00 965.00 965.01 964.00 964.00 77884
2017-01-01 01:30:00 964.00 964.76 964.76 964.76 500
2017-01-01 01:35:00 964.76 967.48 964.69 965.00 134129
2017-01-01 01:40:00 965.00 965.00 963.67 963.67 59676
2017-01-01 01:45:00 963.67 963.67 963.67 963.67 0
2017-01-01 01:50:00 963.67 964.56 963.66 964.55 5531
2017-01-01 01:55:00 964.55 963.43 963.40 963.40 3000
2017-01-01 02:00:00 963.40 964.60 963.40 964.60 1301
2017-01-01 02:05:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:10:00 964.60 964.60 964.00 964.11 49954
2017-01-01 02:15:00 964.11 964.60 964.59 964.60 5000
2017-01-01 02:20:00 964.60 964.60 964.60 964.60 0
2017-01-01 02:25:00 964.60 964.60 964.51 964.51 2000
2017-01-01 02:30:00 964.51 964.51 964.51 964.51 0
2017-01-01 02:35:00 964.51 964.51 963.23 963.99 16667
2017-01-01 02:40:00 963.99 963.99 963.65 963.66 10000
2017-01-01 02:45:00 963.66 964.26 963.16 964.26 75500
2017-01-01 02:50:00 964.26 964.26 964.26 964.26 0
2017-01-01 02:55:00 964.26 964.26 964.26 964.26 0
2017-01-01 03:00:00 964.26 964.61 963.98 964.61 13000
2017-01-01 03:05:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:10:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:15:00 964.61 964.61 964.61 964.61 0
2017-01-01 03:20:00 964.61 964.82 964.48 964.82 16666
2017-01-01 03:25:00 964.82 965.00 963.99 964.97 50500
2017-01-01 03:30:00 964.97 964.97 964.02 964.02 56000
2017-01-01 03:35:00 964.02 964.29 964.29 964.29 500
2017-01-01 03:40:00 964.29 963.53 963.52 963.52 24000
2017-01-01 03:45:00 963.52 963.52 963.43 963.43 16500
2017-01-01 03:50:00 963.43 963.67 963.42 963.42 25002
2017-01-01 03:55:00 963.42 963.42 961.69 961.69 84507
2017-01-01 04:00:00 961.69 961.69 960.90 960.93 57909
2017-01-01 04:05:00 960.93 960.93 960.93 960.93 0
2017-01-01 04:10:00 960.93 961.19 961.19 961.19 400
2017-01-01 04:15:00 961.19 962.09 961.19 962.09 7001
2017-01-01 04:20:00 962.09 962.09 962.09 962.09 0
2017-01-01 04:25:00 962.09 962.10 961.14 961.14 32000
2017-01-01 04:30:00 961.14 961.14 960.93 960.93 41900
2017-01-01 04:35:00 960.93 961.94 960.93 961.64 640
2017-01-01 04:40:00 961.64 961.71 961.64 961.71 1
2017-01-01 04:45:00 961.71 962.00 961.90 961.99 5499
2017-01-01 04:50:00 961.99 961.99 961.99 961.99 0
2017-01-01 04:55:00 961.99 961.99 961.99 961.99 1
2017-01-01 05:00:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:05:00 961.99 961.99 961.99 961.99 40
2017-01-01 05:10:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:15:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:20:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:25:00 961.99 961.99 961.99 961.99 0
2017-01-01 05:30:00 961.99 962.10 961.99 962.10 1382
2017-01-01 05:35:00 962.10 968.84 962.10 968.84 122909
2017-01-01 05:40:00 968.84 968.86 963.78 965.53 161263
2017-01-01 05:45:00 965.53 964.81 963.11 963.81 18021
2017-01-01 05:50:00 963.81 964.39 963.85 964.39 40006
2017-01-01 05:55:00 964.39 964.47 964.00 964.47 39966
2017-01-01 06:00:00 964.47 964.47 964.47 964.47 0
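To make the request concrete, here is a minimal base R sketch of a per-day maximum and minimum inside a fixed clock-time window. It works on a plain data frame rather than an xts object, the bars are made-up values on a hypothetical 5-minute grid, and the window is taken as 04:00 up to but not including 05:00; all of these are assumptions for illustration.

```r
# Hypothetical 5-minute bars covering the window on two consecutive days.
idx <- seq(as.POSIXct("2017-01-01 03:50:00", tz = "UTC"),
           by = "5 min", length.out = 20)
idx <- c(idx, idx + 86400)
bars <- data.frame(time = idx,
                   High = seq_along(idx) + 0.5,
                   Low  = seq_along(idx) - 0.5)

# Keep rows whose clock time falls in [04:00, 05:00).
hhmm <- format(bars$time, "%H:%M", tz = "UTC")
win <- bars[hhmm >= "04:00" & hhmm < "05:00", ]

# One max of High and one min of Low per calendar day.
day <- as.Date(win$time, tz = "UTC")
daily <- data.frame(day = sort(unique(day)),
                    max_high = as.numeric(tapply(win$High, day, max)),
                    min_low  = as.numeric(tapply(win$Low, day, min)))
```

The daily data frame holds one row per day; merging it back onto the intraday rows by date would give the extra columns asked for.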
You can do this by filtering on the hours of the index and then using the period.max and period.min functions. The values will be put in the last record of the chosen hour. See the example below with intraday data of MSFT, showing max and min values between 15:00 and 16:00.
library(xts)
# max of high values between 15 and 16. (excluding 16:00)
msft$max <- period.max(msft$high[.indexhour(msft) == 15], endpoints(msft$high[.indexhour(msft) == 15], on = "hour"))
# min of low values between 15 and 16. (excluding 16:00)
msft$min <- period.min(msft$low[.indexhour(msft) == 15], endpoints(msft$low[.indexhour(msft) == 15], on = "hour"))
head(msft[8:24], 16)
open high low close volume max min
2020-01-23 14:50:00 166.180 166.2300 166.090 166.1050 87934 NA NA
2020-01-23 14:55:00 166.105 166.2200 166.103 166.1700 92280 NA NA
2020-01-23 15:00:00 166.160 166.3500 166.160 166.3400 114359 NA NA
2020-01-23 15:05:00 166.335 166.3400 166.285 166.2850 102633 NA NA
2020-01-23 15:10:00 166.290 166.3050 166.170 166.2550 125558 NA NA
2020-01-23 15:15:00 166.250 166.2750 166.210 166.2400 103938 NA NA
2020-01-23 15:20:00 166.230 166.2500 166.180 166.2350 99649 NA NA
2020-01-23 15:25:00 166.240 166.3000 166.225 166.2850 93846 NA NA
2020-01-23 15:30:00 166.270 166.4164 166.175 166.3600 183154 NA NA
2020-01-23 15:35:00 166.360 166.5000 166.320 166.4600 177178 NA NA
2020-01-23 15:40:00 166.450 166.4650 166.380 166.3800 112174 NA NA
2020-01-23 15:45:00 166.385 166.4050 166.290 166.3875 152806 NA NA
2020-01-23 15:50:00 166.382 166.5200 166.362 166.4500 205667 NA NA
2020-01-23 15:55:00 166.450 166.6900 166.305 166.6700 508469 166.69 166.16
2020-01-23 16:00:00 166.660 166.7200 166.589 166.7200 934090 NA NA
2020-01-24 09:35:00 167.510 167.5300 166.890 166.8918 1152646 NA NA
data:
msft <- structure(c(166.224, 166.29, 166.29, 166.2456, 166.165, 166.1446, 166.1601, 166.18, 166.105, 166.16, 166.335, 166.29, 166.25, 166.23, 166.24, 166.27, 166.36, 166.45, 166.385, 166.382, 166.45, 166.66, 167.51, 167.03, 167.265, 167.325,
167.37, 167.16, 167.405, 167.35, 167.31, 167.39, 167.17, 167.1, 166.845, 167.03, 167.1223, 167.125, 167.21, 167.34, 167.235, 167.3, 167.37, 167.1977, 166.9814, 166.8499, 166.99, 166.93, 166.83, 166.64, 166.775, 166.85, 166.71, 166.6838, 166.46, 166.35, 165.765, 166.2269, 166.01, 166.19, 166.13, 166.31, 166.36, 166.42, 166.3682, 165.99, 166.1328, 165.85, 165.74, 165.8439, 165.655, 165.5434, 165.47, 165.3227, 165.0627, 165.03, 165.2546, 165.14, 165.1, 164.91, 164.75, 164.65, 164.53, 164.81, 164.8979, 164.6, 164.89, 164.94, 165.03, 165.12, 165.17, 165.24, 165.4, 165.335, 165.2734, 164.985, 164.9, 164.61, 164.93, 165.18, 166.315, 166.29, 166.3, 166.265, 166.22, 166.2201, 166.2, 166.23, 166.22, 166.35, 166.34, 166.305, 166.275, 166.25, 166.3, 166.4164, 166.5, 166.465, 166.405, 166.52, 166.69, 166.72, 167.53, 167.34, 167.39, 167.495, 167.47, 167.48, 167.4251, 167.3699, 167.42, 167.41, 167.2, 167.1, 167.03, 167.21, 167.23, 167.255, 167.35, 167.35, 167.33, 167.405, 167.38, 167.25, 167.01, 167, 167.02, 167.02, 166.8384, 166.9056, 166.86, 166.94, 166.75, 166.6844, 166.47, 166.42, 166.22, 166.4049, 166.221, 166.2003, 166.3749, 166.3999, 166.43, 166.43, 166.375, 166.175, 166.16, 165.96, 165.93, 165.86, 165.671, 165.64, 165.49, 165.4, 165.08, 165.27, 165.26, 165.34, 165.12, 165, 164.825, 164.765, 164.82, 164.89, 165, 164.89, 164.99, 165.041, 165.293, 165.23, 165.27, 165.44, 165.6046, 165.37, 165.295, 165.18, 164.93, 164.945, 165.185, 165.24, 166.22, 166.225, 166.25, 166.15, 166.145, 166.13, 166.1015, 166.09, 166.103, 166.16, 166.285, 166.17, 166.21, 166.18, 166.225, 166.175, 166.32, 166.38, 166.29, 166.362, 166.305, 166.589, 166.89, 167.03, 167.22, 167.32, 167.225, 167.16, 167.2, 167.23, 167.2801, 167.145, 167.05, 166.84, 166.77, 167, 167.1, 167.02, 167.18, 167.23, 167.223, 167.28, 167.1843, 166.862, 166.85, 166.821, 166.8121, 166.85, 166.55, 166.6303, 166.69, 166.7, 166.54, 166.4, 166.31, 165.76, 165.74, 165.8966, 165.91, 166.07, 166.09, 166.171, 166.32, 166.22, 165.96, 
165.97, 165.82, 165.73, 165.72, 165.64, 165.49, 165.45, 165.32, 165.045, 164.89, 164.91, 165.09, 165.1, 164.91, 164.74, 164.53, 164.529, 164.53, 164.735, 164.59, 164.54, 164.88, 164.938, 165.01, 165.0792, 165.12, 165.22, 165.335, 165.263, 164.88, 164.89, 164.58, 164.58, 164.87, 164.87, 166.29, 166.275, 166.26, 166.16, 166.145, 166.155, 166.19, 166.105, 166.17, 166.34, 166.285, 166.255, 166.24, 166.235, 166.285, 166.36, 166.46, 166.38, 166.3875, 166.45, 166.67, 166.72, 166.8918, 167.27, 167.325, 167.371, 167.2251, 167.4, 167.34, 167.29, 167.3988, 167.2, 167.1047, 166.86, 167.025, 167.11, 167.12, 167.2, 167.345, 167.23, 167.29, 167.37, 167.1916, 167.0027, 166.85, 167, 166.94, 166.85, 166.64, 166.7738, 166.85, 166.72, 166.68, 166.4672, 166.3512, 165.79, 166.21, 165.9969, 166.18, 166.14, 166.2968, 166.36, 166.43, 166.36, 165.99, 166.13, 165.83, 165.73, 165.8405, 165.65, 165.545, 165.48, 165.33, 165.05, 165.0227, 165.26, 165.1425, 165.101, 164.91, 164.74, 164.6581, 164.5292, 164.805, 164.89, 164.59, 164.8801, 164.9498, 165.04, 165.12, 165.16, 165.2302, 165.4, 165.34, 165.28, 164.987, 164.89, 164.605, 164.94, 165.185, 165.04, 158120, 165333, 101115, 78491, 123999, 76037, 82733, 87934, 92280, 114359, 102633, 125558, 103938, 99649, 93846, 183154, 177178, 112174, 152806, 205667, 508469, 934090, 1152646, 558627, 277325, 321651, 255494, 333848, 272126, 395463, 194593, 211910, 193131, 242112, 210240, 193265, 139617, 204182, 179146, 159259, 237888, 410982, 213787, 233082, 188071, 193742, 132377, 118994, 264247, 182490, 109514, 138164, 221052, 194127, 169059, 458214, 247712, 169523, 115531, 161259, 263230, 155536, 82474, 87549, 109057, 101772, 130642, 171988, 117235, 134507, 236662, 219303, 217698, 219808, 420288, 208087, 149358, 197435, 218090, 267667, 320279, 422434, 340478, 273866, 258938, 212451, 268017, 323657, 267686, 214060, 222314, 293731, 288867, 219687, 304733, 251063, 425450, 455311, 741208, 1429645), .Dim = c(100L, 5L), .Dimnames = list(NULL, c("open", "high", "low", 
"close", "volume")), index = structure(c(1579785300, 1579785600, 1579785900, 1579786200, 1579786500, 1579786800, 1579787100, 1579787400, 1579787700, 1579788000, 1579788300, 1579788600, 1579788900, 1579789200, 1579789500, 1579789800, 1579790100, 1579790400, 1579790700, 1579791000, 1579791300, 1579791600, 1579854900, 1579855200, 1579855500, 1579855800, 1579856100, 1579856400, 1579856700, 1579857000, 1579857300, 1579857600, 1579857900, 1579858200, 1579858500, 1579858800, 1579859100, 1579859400, 1579859700, 1579860000, 1579860300, 1579860600, 1579860900, 1579861200, 1579861500, 1579861800, 1579862100, 1579862400, 1579862700, 1579863000, 1579863300, 1579863600, 1579863900, 1579864200, 1579864500, 1579864800, 1579865100, 1579865400, 1579865700, 1579866000, 1579866300, 1579866600, 1579866900, 1579867200, 1579867500, 1579867800, 1579868100, 1579868400, 1579868700, 1579869000, 1579869300, 1579869600, 1579869900, 1579870200, 1579870500, 1579870800, 1579871100, 1579871400, 1579871700, 1579872000, 1579872300, 1579872600, 1579872900, 1579873200, 1579873500, 1579873800, 1579874100, 1579874400, 1579874700, 1579875000, 1579875300, 1579875600, 1579875900, 1579876200, 1579876500, 1579876800, 1579877100, 1579877400, 1579877700, 1579878000), tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", "zoo"))
How to deal with one column of two formats and single class?
I have one column with two different formats but the same class 'factor'.
D$date
2009-05-12 11:30:00
2009-05-13 11:30:00
2009-05-14 11:30:00
2009-05-15 11:30:00
42115.652
2876
8765
class(D$date)
factor
What I need is to convert the numbers to dates.
D$date <- as.character(D$date)
D$date=ifelse(!is.na(as.numeric(D$date)), as.POSIXct(as.numeric(D$date) * (60*60*24), origin="1899-12-30", tz="UTC"), D$date)
Now the number was converted, but to a strange number: "1429630800". I tried without ifelse:
as.POSIXct(as.numeric(42115.652) * (60*60*24), origin="1899-12-30", tz="UTC")
[1] "2015-04-21 15:38:52 UTC"
It was converted nicely.
The problem is that you are mixing up classes in the true/false halves of your ifelse. You can fix this by adding as.character like this:
D$date = ifelse(!is.na(as.numeric(D$date)), as.character(as.POSIXct(as.numeric(D$date) * (60*60*24), origin="1899-12-30", tz="UTC")), D$date)
#D
# date
#1 2009-05-12 11:30:00
#2 2009-05-13 11:30:00
#3 2009-05-14 11:30:00
#4 2009-05-15 11:30:00
#5 2015-04-21 15:38:52
#6 1907-11-15 00:00:00
#7 1923-12-30 00:00:00
You can also create a function which transforms each value to POSIXct, then use lapply and do.call.
b <- c("2009-05-12 11:30:00", "2009-05-13 11:30:00", "2009-05-14 11:30:00", "2009-05-15 11:30:00", "42115.652", "2876", "8765")
foo <- function(x){
  if(!is.na(as.numeric(x))){
    as.POSIXct(as.numeric(x) * (60*60*24), origin="1899-12-30", tz="UTC")
  }else{
    as.POSIXct(x, origin="1899-12-30", tz="UTC")
  }
}
do.call("c", lapply(b, foo))
[1] "2009-05-12 13:30:00 CEST" "2009-05-13 13:30:00 CEST" "2009-05-14 13:30:00 CEST" "2009-05-15 13:30:00 CEST"
[5] "2015-04-21 17:38:52 CEST" "1907-11-15 01:00:00 CET" "1923-12-30 01:00:00 CET"
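A vectorised variant that keeps the POSIXct class without ifelse or a per-element loop might look like the sketch below. It fills a pre-allocated POSIXct vector by index; the names num, is_serial and out are illustrative, only three sample values are used, and treating the numbers as day counts from 1899-12-30 follows the origin used in the question.

```r
x <- c("2009-05-12 11:30:00", "42115.652", "2876")

# Values that parse as numbers are treated as day counts from 1899-12-30.
num <- suppressWarnings(as.numeric(x))
is_serial <- !is.na(num)

# Start from an all-NA POSIXct vector and fill each group by index,
# so the class is never stripped the way ifelse() strips it.
out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = "UTC")
out[!is_serial] <- as.POSIXct(x[!is_serial], tz = "UTC")
out[is_serial] <- as.POSIXct(num[is_serial] * 86400,
                             origin = "1899-12-30", tz = "UTC")
```

Because out is POSIXct throughout, it can be used directly as a time index without converting back from character.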
R time series missing values
I was working with a time series dataset having hourly data. The data contained a few missing values so I tried to create a dataframe (time_seq) with the correct time value and do a merge with the original data so the missing values become 'NA'.
> data
date value
7980 2015-03-30 20:00:00 78389
7981 2015-03-30 21:00:00 72622
7982 2015-03-30 22:00:00 65240
7983 2015-03-30 23:00:00 47795
7984 2015-03-31 08:00:00 37455
7985 2015-03-31 09:00:00 70695
7986 2015-03-31 10:00:00 68444
# converting the date in the data to POSIXct format.
> data$date <- format.POSIXct(data$date,'%Y-%m-%d %H:%M:%S')
# creating a dataframe with the correct sequence of dates.
> time_seq <- seq(from = as.POSIXct("2014-05-01 00:00:00"), to = as.POSIXct("2015-04-30 23:00:00"), by = "hour")
> df <- data.frame(date=time_seq)
> df
date
8013 2015-03-30 20:00:00
8014 2015-03-30 21:00:00
8015 2015-03-30 22:00:00
8016 2015-03-30 23:00:00
8017 2015-03-31 00:00:00
8018 2015-03-31 01:00:00
8019 2015-03-31 02:00:00
8020 2015-03-31 03:00:00
8021 2015-03-31 04:00:00
8022 2015-03-31 05:00:00
8023 2015-03-31 06:00:00
8024 2015-03-31 07:00:00
# merging with the original data
> a <- merge(data,df, x.by = data$date, y.by = df$date ,all=TRUE)
> a
date value
4005 2014-07-23 07:00:00 37003
4006 2014-07-23 07:30:00 NA
4007 2014-07-23 08:00:00 37216
4008 2014-07-23 08:30:00 NA
The values I get after merging are incorrect and they contain half-hourly values. What would be the correct approach for solving this? Why is the merge result in 30-minute intervals when both my dataframes are hourly?
PS: I looked into this question: "Fastest way for filling-in missing dates for data.table" and followed the steps but it didn't help.
You can use the padr package to solve this problem.
library(padr)
library(dplyr) # for the pipe operator
data %>% pad() %>% fill_by_value()
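For comparison, a base R sketch of the same gap-filling with merge is given below. Note that merge takes by, by.x and by.y (column names), and that both date columns need to be POSIXct in the same time zone for the keys to match. The sample values are made up, and the names grid and filled are illustrative.

```r
# Hypothetical hourly data with a gap between 23:00 and 02:00.
data <- data.frame(
  date  = as.POSIXct(c("2015-03-30 22:00:00", "2015-03-30 23:00:00",
                       "2015-03-31 02:00:00"), tz = "UTC"),
  value = c(10, 20, 30)
)

# Complete hourly grid over the observed range, in the same time zone.
grid <- data.frame(date = seq(min(data$date), max(data$date), by = "hour"))

# Left join on the shared column name; missing hours get NA in value.
filled <- merge(grid, data, by = "date", all.x = TRUE)
```

Keeping the grid on the left with all.x = TRUE means the result has exactly one row per hour, with NA where the original data had no observation.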
Finding date based on time
I was thinking of how to find the date (which does not exist in the table) based on time. Example: remember, I only have the time:
time = c("9:44","15:30","23:48","00:30","05:30", "15:30", "22:00", "00:45")
I know for a fact that the start date is 2014-08-28, but how do I get the date, which changes after midnight? The expected outcome would be:
9:44 2014-08-28
15:30 2014-08-28
23:48 2014-08-28
00:30 2014-08-29
05:30 2014-08-29
15:30 2014-08-29
22:00 2014-08-29
00:45 2014-08-30
Here's an example using the data.table package's ITime class, which enables you to manipulate time (upon converting time to this class you can subtract/add minutes/hours/etc.):
library(data.table)
time <- as.ITime(time)
Date <- as.IDate("2014-08-28") + c(0, cumsum(diff(time) < 0))
data.table(time, Date)
# time Date
# 1: 09:44:00 2014-08-28
# 2: 15:30:00 2014-08-28
# 3: 23:48:00 2014-08-28
# 4: 00:30:00 2014-08-29
# 5: 05:30:00 2014-08-29
# 6: 15:30:00 2014-08-29
# 7: 22:00:00 2014-08-29
# 8: 00:45:00 2014-08-30
Using the chron package, we assume that a later time is on the same day and an earlier time is on the next day:
library(chron)
date <- as.Date("2014-08-28") + cumsum(c(0, diff(times(paste0(time, ":00"))) < 0))
data.frame(time, date)
giving:
time date
1 9:44 2014-08-28
2 15:30 2014-08-28
3 23:48 2014-08-28
4 00:30 2014-08-29
5 05:30 2014-08-29
6 15:30 2014-08-29
7 22:00 2014-08-29
8 00:45 2014-08-30
Here's one way to do it:
time = c("9:44","15:30","23:48","00:30","05:30", "15:30", "22:00", "00:45")
# seconds since midnight for each "H:MM" string
times <- sapply(strsplit(time, ":", TRUE), function(x) sum(as.numeric(x) * c(3600, 60)))
# add one day each time the clock goes backwards
as.POSIXct("2014-08-28") + times + 60*60*24*cumsum(c(FALSE, diff(times) < 0))
# [1] "2014-08-28 09:44:00 CEST" "2014-08-28 15:30:00 CEST" "2014-08-28 23:48:00 CEST" "2014-08-29 00:30:00 CEST" "2014-08-29 05:30:00 CEST" "2014-08-29 15:30:00 CEST" "2014-08-29 22:00:00 CEST" "2014-08-30 00:45:00 CEST"
You can concatenate the system date with the time to get the result. For example, in Oracle we can get the date with time as:
to_char(sysdate,'DD-MM-RRRR')|| ' ' || To_char(sysdate,'HH:MIAM')
This will give, e.g., 12-09-2015 09:50 AM. For your requirement, use:
to_char(sysdate,'DD-MM-RRRR')|| ' 00:45'
and so on.
Date time am/pm in R
I'm having an issue with the datetime field of a time series:
> CO1temp[163:169,]
Date OPEN HIGH LOW CLOSE
163 7/11/2011 11:45:00 PM 116.30 116.30 116.09 116.18
164 7/11/2011 11:50:00 PM 116.16 116.78 116.13 116.70
165 7/11/2011 11:55:00 PM 116.69 116.83 116.51 116.65
166 7/12/2011 116.65 116.79 116.44 116.50
167 7/12/2011 12:05:00 AM 116.50 116.60 116.39 116.47
168 7/12/2011 12:10:00 AM 116.49 116.55 116.38 116.52
169 7/12/2011 12:15:00 AM 116.52 116.67 116.39 116.44
As you can see, the midnight time (line 166) is not showing properly, which creates an NA when I create my xts object:
CO1 <- as.xts(CO1temp[, 2:5], order.by = as.POSIXct(CO1temp[,1],format='%m/%d/%Y %r'),frequency="5 minutes")
> CO1[163:169,]
OPEN HIGH LOW CLOSE
2011-07-11 23:45:00 116.30 116.30 116.09 116.18
2011-07-11 23:50:00 116.16 116.78 116.13 116.70
2011-07-11 23:55:00 116.69 116.83 116.51 116.65
<NA> 116.65 116.79 116.44 116.50
2011-07-12 00:05:00 116.50 116.60 116.39 116.47
2011-07-12 00:10:00 116.49 116.55 116.38 116.52
2011-07-12 00:15:00 116.52 116.67 116.39 116.44
This later leads to more problems when I want to analyze this time series. ?strptime is quite specific about it:
The default for the format methods is "%Y-%m-%d %H:%M:%S" if any component has a time component which is not midnight, and "%Y-%m-%d" otherwise.
However, my datetime is not in the standard format. I would greatly appreciate any help.
This is kind of a hack, but it works. You just have to append "12:00:00 AM" to your vector of dates: those which are lacking the hour information will be read correctly, and in the dates that already have the hour information it will just be ignored and only the one that was already there will be read.
CO1 <- as.xts(CO1temp[, 2:5], order.by = as.POSIXct(paste(CO1temp$Date,"12:00:00 AM", sep=" "), format='%m/%d/%Y %r'), frequency="5 minutes")
CO1
OPEN HIGH LOW CLOSE
2011-07-11 23:45:00 116.30 116.30 116.09 116.18
2011-07-11 23:50:00 116.16 116.78 116.13 116.70
2011-07-11 23:55:00 116.69 116.83 116.51 116.65
2011-07-12 00:00:00 116.65 116.79 116.44 116.50
2011-07-12 00:05:00 116.50 116.60 116.39 116.47
2011-07-12 00:10:00 116.49 116.55 116.38 116.52
2011-07-12 00:15:00 116.52 116.67 116.39 116.44
That being said, if you ended up with your dataframe as it is after using strptime, then your date column is already in POSIXct format and therefore the following should work directly:
as.xts(CO1temp[, 2:5], order.by = CO1temp$Date, frequency = "5 minutes")
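The reason the hack works is that strptime-style parsing stops once the format has been satisfied and ignores any trailing characters. A small base R sketch of just that step is shown below. It spells out "%I:%M:%S %p" instead of "%r", uses tz = "UTC", and assumes a locale in which AM/PM are recognised; the vector x is a made-up sample in the question's format.

```r
x <- c("7/11/2011 11:55:00 PM", "7/12/2011", "7/12/2011 12:05:00 AM")

# Append a midnight time to every element. On rows that already carry a
# time, the appended text is trailing input and is ignored by the parser.
padded <- paste(x, "12:00:00 AM")

# Explicit 12-hour format; tz = "UTC" is an assumption for reproducibility.
parsed <- as.POSIXct(padded, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
```

The second element, which had no time at all, is parsed as midnight, while the other two keep their original times.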