Milliseconds separated by comma - r

I have the following data.frame called "data" (it is much larger, but I give only the first lines as an example):
Timestamp Weight Degrees
1 30-09-2016 11:45:00,000 38.19 40.00
2 01-10-2016 06:19:57,860 39.12 40.00
3 01-10-2016 06:20:46,393 42.11 41.00
I would like to convert the "Timestamp" to a date/time vector including milliseconds. This seems to be a problem because the milliseconds are separated by a comma.
Also, the data.frame has mode "list" and Timestamp has mode "character", which doesn't seem right.
I have tried data$Timestamp <- as.POSIXct(data$Timestamp,format='%d-%m-%Y %H:%M:%OS') but I only get "2016-09-30 11:42:00 UTC", without the milliseconds. The mode does become "numeric", however, which should be a step in the right direction. The only option I have set is options(digits.secs=3).
I'd really appreciate your help. Thank you in advance.

x = c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")
format(as.POSIXct(gsub(",", ".", x), format='%d-%m-%Y %H:%M:%OS'), '%d-%m-%Y %H:%M:%OS3')
#[1] "30-09-2016 11:45:00.000" "01-10-2016 06:19:57.859" "01-10-2016 06:20:46.392"
OR
x = c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")
#Converting to POSIXct
options(digits.secs=3)
y = as.POSIXct(gsub(",", ".", x), format='%d-%m-%Y %H:%M:%OS', tz = "UTC")
y
#[1] "2016-09-30 11:45:00.000 UTC" "2016-10-01 06:19:57.859 UTC" "2016-10-01 06:20:46.392 UTC"
#Converting to numeric
as.numeric(y)
#[1] 1475253900 1475320798 1475320846
#Converting numeric back to POSIXct
as.POSIXct(as.numeric(y), origin = "1970-01-01", tz = "UTC")
#[1] "2016-09-30 11:45:00.000 UTC" "2016-10-01 06:19:57.859 UTC" "2016-10-01 06:20:46.392 UTC"
OR
x = c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")
library(lubridate)
options(digits.secs=3)
dmy_hms(gsub(",", ".", x))
#[1] "2016-09-30 11:45:00.000 UTC" "2016-10-01 06:19:57.860 UTC" "2016-10-01 06:20:46.393 UTC"
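For the data frame in the question, the same idea can be applied to the column directly. The following is only a sketch: the data frame df is rebuilt here from the three sample rows, and the ms column is a hypothetical addition showing one way to recover the millisecond part without the floating-point truncation visible in the base R output above (.859 instead of .860).

```r
# Rebuild the sample rows from the question (df is an illustrative name)
df <- data.frame(
  Timestamp = c("30-09-2016 11:45:00,000",
                "01-10-2016 06:19:57,860",
                "01-10-2016 06:20:46,393"),
  Weight = c(38.19, 39.12, 42.11),
  Degrees = c(40, 40, 41),
  stringsAsFactors = FALSE
)

# Swap the decimal comma for a point, then parse with %OS (fractional seconds)
df$Timestamp <- as.POSIXct(sub(",", ".", df$Timestamp, fixed = TRUE),
                           format = "%d-%m-%Y %H:%M:%OS", tz = "UTC")

# Hypothetical helper column: milliseconds, rounded rather than truncated
df$ms <- round((as.numeric(df$Timestamp) %% 1) * 1000)
```

The Timestamp column is now of class POSIXct; whether the fraction is printed still depends on options(digits.secs) or on a format such as "%OS3".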

Related

I would like to extract the time from a character vector

I have date&time stamp as a character variable
"2018-12-13 11:00:01 EST" "2018-10-23 22:00:01 EDT" "2018-11-03 14:15:00 EDT" "2018-10-04 19:30:00 EDT" "2018-11-10 17:15:31 EST" "2018-10-05 13:30:00 EDT"
How can I strip the time from this character vector?
PS: Can someone please help? I have tried using strptime, but I am getting NA values as a result.
It's a bit unclear whether you want the date or time but if you want the date then as.Date ignores any junk after the date so:
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")
as.Date(x)
## [1] "2018-12-13" "2018-10-23"
would be sufficient to get a Date vector from the input vector x. No packages are used.
If you want the time then:
read.table(text = x, as.is = TRUE)[[2]]
## [1] "11:00:01" "22:00:01"
If you want a data frame with each part in a separate column then:
read.table(text = x, as.is = TRUE, col.names = c("date", "time", "tz"))
## date time tz
## 1 2018-12-13 11:00:01 EST
## 2 2018-10-23 22:00:01 EDT
I think the OP wants to extract the time from date-time variable (going by the title of the question).
x <- "2018-12-13 11:00:01 EST"
format(strptime(x, "%Y-%m-%d %H:%M:%S"), "%H:%M:%S")
[1] "11:00:01"
Another option:
library(lubridate)
format(ymd_hms(x, tz = "EST"), "%H:%M:%S")
[1] "11:00:01"
The package lubridate makes everything like this easy:
library(lubridate)
x <- "2018-12-13 11:00:01 EST"
as_date(ymd_hms(x))
You can use the as.Date function and specify the format
> as.Date("2018-12-13 11:00:01 EST", format="%Y-%m-%d")
[1] "2018-12-13"
If all values are in a vector:
x = c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
"2018-11-03 14:15:00 EDT", "2018-10-04 19:30:00 EDT",
"2018-11-10 17:15:31 EST", "2018-10-05 13:30:00 EDT")
> as.Date(x, format="%Y-%m-%d")
[1] "2018-12-13" "2018-10-23" "2018-11-03" "2018-10-04" "2018-11-10"
[6] "2018-10-05"
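If both the date and the time are needed, a base R alternative is to split each string on spaces. This is a minimal sketch that assumes every element has exactly the three space-separated fields shown in the question (date, time, zone abbreviation); the names parts, date_part, time_part and tz_part are illustrative.

```r
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")

# Split each string into its space-separated fields
parts <- strsplit(x, " ", fixed = TRUE)

# Pick the first, second and third field of every element
date_part <- as.Date(vapply(parts, `[`, character(1), 1))
time_part <- vapply(parts, `[`, character(1), 2)
tz_part   <- vapply(parts, `[`, character(1), 3)
```

Here time_part stays a character vector, since base R has no time-of-day class without a date.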

Converting Navy sunset/sunrise data into time

I have downloaded the 2015-2017 sunset/sunrise data from the Navy and I am trying to format it into dates and time to further use with other data I have. This is how my data set looks in R.
I have managed to convert the date into the right R data format. Yet, I still can't convert my rise/set column data from a number/integer format to a time data (as hh:mm).
Based on one internet source, I wrote the following code:
Sun2015$Srise<- format(strptime(Sun2015$Rise, format="%H:%M"))
However this gives NA in my data
OR
Sun2015$Srise<-str_pad(Sun2015$Rise, 4, pad="0")
Sun2015$Srise<-hour(hm(Sun2015$Srise))
Yet, I received the following warning:
Warning message: In .parse_hms(..., order = "HM", quiet = quiet) :
Some strings failed to parse.
Is there a better way to convert the columns into the right time format so that I can merge the date and time columns into date-time columns for sunset and sunrise?
Thank you in advance for your help.
You can convert your military times to zero-padded four-digit strings using sprintf("%04d", data) and go from there. For example, with the first 5 lines of your data:
# Sample of your data
Day <- c("1/1/2015", "1/2/2015", "1/3/2015", "1/4/2015", "1/5/2015")
Rise <- c(652,652,652,653,653)
Set <- c(1755,1756,1756,1757,1757)
sun2015 <- data.frame(Day, Rise, Set)
# Convert to 2400 style strings with leading zeroes where necessary
sun2015$Rise <- sprintf("%04d", sun2015$Rise)
sun2015$Set <- sprintf("%04d", sun2015$Set)
# Merge with your date
sun2015$day_rise <- as.POSIXct(paste0(sun2015$Day, " ",sun2015$Rise), format = "%m/%d/%Y %H%M", origin = "1970-01-01", tz = "UTC")
sun2015$day_set <- as.POSIXct(paste0(sun2015$Day, " ",sun2015$Set), format = "%m/%d/%Y %H%M", origin = "1970-01-01", tz = "UTC")
> sun2015$day_rise
[1] "2015-01-01 06:52:00 UTC" "2015-01-02 06:52:00 UTC" "2015-01-03 06:52:00 UTC" "2015-01-04 06:53:00 UTC"
[5] "2015-01-05 06:53:00 UTC"
> sun2015$day_set
[1] "2015-01-01 17:55:00 UTC" "2015-01-02 17:56:00 UTC" "2015-01-03 17:56:00 UTC" "2015-01-04 17:57:00 UTC"
[5] "2015-01-05 17:57:00 UTC"
You can adjust to the appropriate time zone if necessary.
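An alternative that skips the string padding is to treat the HHMM number arithmetically. The sketch below assumes, as in the sample above, that Rise holds whole numbers such as 652 meaning 06:52; midnight and rise_time are illustrative names.

```r
Day  <- c("1/1/2015", "1/2/2015")
Rise <- c(652, 653)

# Parse the date alone; this gives midnight at the start of each day
midnight <- as.POSIXct(Day, format = "%m/%d/%Y", tz = "UTC")

# Integer-divide by 100 for the hours, take the remainder for the minutes,
# and add both to midnight as seconds
rise_time <- midnight + (Rise %/% 100) * 3600 + (Rise %% 100) * 60
```

The same expression works for the Set column. As with the padded-string approach, tz should be changed if the table is not in UTC.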

Concatenating Dates into a single Column

The dates in my Submitted.on column come in different formats, so I resorted to converting each format separately and then combining the results into a new column using ifelse and as.Date. However, when I use strptime so that the new column also keeps the timestamp, I get a warning message and wrong values.
data$Submitted.on[0:5]
#[1] 02-06-02 0:00 03/30/2010 23:15:12
#[3] 11-05-09 6:28 07/29/2009 23:07:38
#[5] 07-10-05 0:00
vec1 = as.character(strptime(data$Submitted.on, "%m/%d/%Y %H:%M:%S"))
vec1[0:5]
#[1] NA "2010-03-30 23:15:12"
#[3] NA "2009-07-29 23:07:38"
#[5] NA
vec2 = as.character(strptime(data$Submitted.on, "%m-%d-%y %H:%M"))
vec2[0:5]
#[1] "2002-02-06 00:00:00" NA
#[3] "2009-11-05 06:28:00" NA
#[5] "2005-07-10 00:00:00"
data['new_format']=as.Date(ifelse(is.na(vec1),vec2,vec1))
data[0:5,'new_format']
#[1] "2002-02-06" "2010-03-30" "2009-11-05" "2009-07-29"
#[5] "2005-07-10"
Using as.Date works great, but when I use strptime to keep the timestamp as well, it gives a warning message.
data['new_format']=strptime(ifelse(is.na(vec1),vec2,vec1),"%Y-%m-%d %H:%M:%S")
#Warning message:
#In `[<-.data.frame`(`*tmp*`, "new_format", value = list(sec = c(0, :
#provided 11 variables to replace 1 variables
data[0:5,'new_format']
[1] 0 12 0 38 0
Any help on how to have the timestamp also will be of great help.
We can use parse_date_time from lubridate
library(lubridate)
parse_date_time(data$Submitted.on, guess_formats(data$Submitted.on,
c("mdy HMS", "mdy MS")))
#[1] "2002-02-06 00:00:00 UTC" "2010-03-30 23:15:12 UTC" "2009-11-05 00:06:28 UTC"
#[4] "2009-07-29 23:07:38 UTC" "2005-07-10 00:00:00 UTC"
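Note that the order "mdy MS" reads "6:28" as minutes and seconds, which is why the third value above comes out as 00:06:28; "mdy HM" would treat it as hours and minutes.
Regarding ifelse: we would advise against it here, because ifelse drops attributes such as the date-time class, and strptime returns a POSIXlt object, which is a list of components (sec, min, hour, ...). That list is also why assigning it to a single data frame column produces the "provided 11 variables" warning; wrapping the result in as.POSIXct() before assigning avoids it. Instead of ifelse, one can fill the missing values by indexing: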
v1 <- strptime(data$Submitted.on, "%m/%d/%Y %H:%M:%S", tz = "UTC")
v1[is.na(v1)] <- strptime(data$Submitted.on[is.na(v1)], "%m-%d-%y %H:%M", tz = "UTC")
v1
#[1] "2002-02-06 00:00:00 UTC" "2010-03-30 23:15:12 UTC" "2009-11-05 06:28:00 UTC"
#[4] "2009-07-29 23:07:38 UTC" "2005-07-10 00:00:00 UTC"
data
data <- structure(list(Submitted.on = c("02-06-02 0:00", "03/30/2010 23:15:12",
"11-05-09 6:28", "07/29/2009 23:07:38", "07-10-05 0:00")),
.Names = "Submitted.on", row.names = c(NA, -5L), class = "data.frame")
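The indexing idea generalises to any number of candidate formats. The following is a hedged sketch rather than an established function: parse_mixed is a hypothetical helper that tries each strptime format in turn and only fills elements that are still NA, returning POSIXct so the result can be stored in a data frame column.

```r
# Hypothetical helper: try several formats, keep the first one that parses
parse_mixed <- function(x, formats, tz = "UTC") {
  # Start with an all-NA POSIXct vector of the right length
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (f in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    # Elements that do not match format f simply stay NA
    out[todo] <- as.POSIXct(x[todo], format = f, tz = tz)
  }
  out
}

submitted <- c("02-06-02 0:00", "03/30/2010 23:15:12", "11-05-09 6:28",
               "07/29/2009 23:07:38", "07-10-05 0:00")
res <- parse_mixed(submitted, c("%m/%d/%Y %H:%M:%S", "%m-%d-%y %H:%M"))
```

The order of formats matters when a string could match more than one of them; here the two separators ("/" and "-") keep the formats from overlapping.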

Find previous hour and next hour in R

Suppose I pass "2015-01-01 01:50:50", then it should return "2015-01-01 01:00:00" and "2015-01-01 02:00:00". How to calculate these values in R?
Assuming your time were a variable "X", you can use round or trunc.
Try:
round(X, "hour")
trunc(X, "hour")
This would still require some work to determine whether the values had actually been rounded up or down (for round). So, if you don't want to have to think about that, you can consider using the "lubridate" package:
X <- structure(c(1430050590.96162, 1430052390.96162), class = c("POSIXct", "POSIXt"))
X
# [1] "2015-04-26 17:46:30 IST" "2015-04-26 18:16:30 IST"
library(lubridate)
ceiling_date(X, "hour")
# [1] "2015-04-26 18:00:00 IST" "2015-04-26 19:00:00 IST"
floor_date(X, "hour")
# [1] "2015-04-26 17:00:00 IST" "2015-04-26 18:00:00 IST"
I would go with the following wrapper using base R (you can specify your time zone using the tz argument within the strptime function)
Myfunc <- function(x){x <- strptime(x, format = "%F %H") ; c(x, x + 3600L)}
Myfunc("2015-01-01 01:50:50")
## [1] "2015-01-01 01:00:00 IST" "2015-01-01 02:00:00 IST"
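A variant of the base R approach that stays in POSIXct and makes the time zone explicit might look like this. It is a sketch only: hour_bounds is a hypothetical name, and tz = "UTC" is an assumed default that should be replaced by the zone the timestamps were recorded in.

```r
# Hypothetical helper: start of the current hour and start of the next hour
hour_bounds <- function(x, tz = "UTC") {
  t <- as.POSIXct(x, tz = tz)
  # trunc() drops minutes and seconds; convert back from POSIXlt to POSIXct
  lower <- as.POSIXct(trunc(t, units = "hours"))
  # Adding a numeric vector keeps the class and the time zone attribute
  lower + c(0, 3600)
}

b <- hour_bounds("2015-01-01 01:50:50")
```

For an input that falls exactly on the hour, the first value is the input itself, so "previous hour" here means the start of the hour containing the timestamp.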

fastPOSIXct equivalent for converting non-UTC to UTC

Hi, I have a character vector (rr) that is several million elements long. It holds date and time stamps in the format %Y-%m-%d %H:%M:%S, recorded in Australia/Sydney.
How do I get a POSIXct object (quickly) that represents this?
I have found fastPOSIXct in the fasttime package, but for this to be accurate, it requires the original character string to be in GMT/UTC (which mine is not); the result is then converted into the desired time zone using the tz argument...
> head(rr)
[1] "2009-05-01 10:01:00" "2009-05-01 10:02:00" "2009-05-01 10:03:00" "2009-05-01 10:04:00"
[5] "2009-05-01 10:05:00" "2009-05-01 10:06:00"
> as.POSIXct(head(rr),tz="Australia/Sydney")
[1] "2009-05-01 10:01:00 EST" "2009-05-01 10:02:00 EST" "2009-05-01 10:03:00 EST"
[4] "2009-05-01 10:04:00 EST" "2009-05-01 10:05:00 EST" "2009-05-01 10:06:00 EST"
The above line takes ages if doing it on the full set of data...so any speed improvements would be appreciated. Thanks.
Inspired by Dirk's answer to this question, I made this wrapper for handling a whole bunch of dates across the year:
library(fasttime)  # provides fastPOSIXct()

fastPOSIXct_generic <- function(x, mytz = "America/New_York")
{
  # Caution, read: ?DateTimeClasses
  stopifnot(is.character(x))
  # Parse quickly, treating the strings as if they were UTC
  times_UTC <- fastPOSIXct(x, tz = 'UTC')
  num_times <- as.numeric(times_UTC)
  # Offset between mytz and UTC, measured on the first element
  t1 <- as.POSIXct(x[1], tz = mytz)
  t2 <- as.POSIXct(x[1], tz = "UTC")
  offset <- as.numeric(difftime(t1, t2, units = "secs"))
  daylightoffset <- as.POSIXlt(t1)$isdst
  # For this first 'time' in t1 and t2, remove the possible impact of losing
  # one hour when clocks are set forward during the summer months:
  offset <- offset + daylightoffset * 3600
  num_times <- num_times + offset
  new_num_times <- as.POSIXct(num_times, tz = mytz, origin = '1970-01-01')
  # Take the DST hour back off for the elements that fall in DST
  new_num_times2 <- new_num_times - as.POSIXlt(new_num_times)$isdst * 3600
  return(new_num_times2)
}
# Test Sydney time
mm <- as.POSIXct(c("2015-03-15 15:00:00", "2015-4-10 15:00:00", "2014-10-01 15:00:00", "2015-10-15 15:00:00"), tz = "Australia/Sydney")
# "2015-03-15 15:00:00 AEDT" "2015-04-10 15:00:00 AEST" "2014-10-01 15:00:00 AEST" "2015-10-15 15:00:00 AEDT"
aus_stamps <- as.character(mm)
aus_back <- fastPOSIXct_generic(x = aus_stamps, mytz = "Australia/Sydney")
#"2015-03-15 15:00:00 AEDT" "2015-04-10 15:00:00 AEST" "2014-10-01 15:00:00 AEST" "2015-10-15 15:00:00 AEDT"
identical(mm, aus_back)
# TRUE
My use cases are nearly always UTC to America/New_York, where so far it has seemed to work fine. I don't know whether it works correctly for every time zone; I have only considered zones where DST moves the clock forward by one hour.
Here is one approach:
i) Lie to fastPOSIXct() and pretend the data was UTC; use it to parse the data into a vector x
ii) Compute an offset to UTC using your first data point:
R> d1 <- "2009-05-01 10:01:01" ## or use `head(rr,1)`
R> t1 <- as.POSIXct(d1,tz="Australia/Sydney")
R> t2 <- as.POSIXct(d1,tz="UTC")
R> offset <- as.numeric(difftime(t2, t1, units="secs"))
R> offset
[1] 36000
iii) Apply the offset value to your data, here by subtracting it from x, since the strings were read as if they were UTC -- that is a quick operation as POSIXct really is a numeric type with (fractional) seconds (since epoch) as its unit.
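Steps i) to iii) can be put together as follows. This is a sketch under two assumptions: base as.POSIXct(..., tz = "UTC") stands in for fastPOSIXct so that the example runs without the fasttime package, and all timestamps share the same UTC offset (a vector that crosses a daylight-saving change would need more than one offset). The names as_utc and fixed are illustrative.

```r
rr <- c("2009-05-01 10:01:00", "2009-05-01 10:02:00", "2009-05-01 10:03:00")

# Step i: parse as if the strings were UTC
# (fasttime::fastPOSIXct(rr, tz = "UTC") would take this role on large data)
as_utc <- as.POSIXct(rr, tz = "UTC")

# Step ii: offset between the UTC reading and the Sydney reading, in seconds
t1 <- as.POSIXct(rr[1], tz = "Australia/Sydney")
t2 <- as.POSIXct(rr[1], tz = "UTC")
offset <- as.numeric(difftime(t2, t1, units = "secs"))

# Step iii: shift the instants and label them with the real time zone
fixed <- as.POSIXct(as.numeric(as_utc) - offset,
                    origin = "1970-01-01", tz = "Australia/Sydney")
```

Formatting fixed with "%Y-%m-%d %H:%M:%S" should give back the original strings, which is a cheap sanity check to run on a sample before trusting the result on the full vector.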

Resources