Timestamp looks correct but appears wrong on ggplot2 - r

I have a time series of CO2 data with a UNIX timestamp that looks like this: 1658759863
e.g.
df <- data.frame(timestamp=seq(1653998631,1663998631,10),co2_ppm=runif(1000001))
I then convert to POSIXct and filter data only after 11:52:51
library(dplyr)
df <- df %>%
  mutate(timestamp = as.POSIXct(timestamp, origin = "1970-01-01", tz = "GMT")) %>%
  filter(timestamp > "2022-07-28 11:52:51")
attr(df$timestamp,"tzone")
[1] "GMT"
When I filter to plot only values after "11:52:00", it returns values after "10:52:00".
Why?

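This is a daylight saving time issue. The character string in filter() is converted to POSIXct using the session's local time zone, not the GMT of the column, so the cutoff is presumably being read as British Summer Time, one hour ahead of GMT. See the example below.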
s <- Sys.time()
as.POSIXct(as.integer(s), origin = "1970-01-01", tz = "GMT")
#> [1] "2022-07-28 15:05:57 GMT"
as.POSIXct(as.integer(s), origin = "1970-01-01", tz = "Europe/London")
#> [1] "2022-07-28 16:05:57 BST"
s
#> [1] "2022-07-28 16:05:57 BST"
Created on 2022-07-28 by the reprex package (v2.0.1)
R stores a date-time as the number of seconds since an origin; the time zone only controls how that number is parsed and printed. Though London and Greenwich are in the same time zone, London's official time is one hour ahead of GMT at this time of year (British Summer Time).
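A minimal base R sketch of the point above, assuming the cutoff from the question and two made-up timestamps (11:00 and 12:00 GMT). It shows that the same string, parsed in two time zones, gives two different instants, and that the filter result changes accordingly.

```r
# The same wall-clock string parsed in two time zones.
cutoff_gmt    <- as.POSIXct("2022-07-28 11:52:51", tz = "GMT")
cutoff_london <- as.POSIXct("2022-07-28 11:52:51", tz = "Europe/London")

# Show the London cutoff in GMT; the instant itself does not change.
attr(cutoff_london, "tzone") <- "GMT"   # prints as 10:52:51 GMT

# London is on BST (UTC+1) in July, so the two cutoffs are one hour apart.
offset_hours <- as.numeric(difftime(cutoff_gmt, cutoff_london, units = "hours"))

# Two illustrative timestamps (not from the original data): 11:00 and 12:00 GMT.
ts <- as.POSIXct("2022-07-28 11:00:00", tz = "GMT") + c(0, 3600)

kept_gmt    <- ts[ts > cutoff_gmt]      # only 12:00 GMT survives
kept_london <- ts[ts > cutoff_london]   # both survive: the cutoff is 10:52:51 GMT
```

Passing an explicit POSIXct cutoff such as cutoff_gmt to filter(), instead of a bare string, avoids depending on the session's time zone.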

Related

How to change a number into datetime format in R

I have a vector a = 40208.64507.
In Excel, I can automatically change a to a datetime, 2010/1/30 15:28:54, by clicking the Date type.
I tried some methods but I cannot get the same result in R as in Excel.
a = 40208.64507
# in Excel, a can change into: 2010/1/30 15:28:54
as.Date(a, origin = "1899-12-30")
lubridate::as_datetime(a, origin = "1899-12-30")
Is there any way to get the same results in R as in Excel?
Here are several ways. chron class is the closest to Excel in terms of internal representations -- they are the same except for origin -- and the simplest so we list that one first. We also show how to use chron as an intermediate step to get POSIXct.
Base R provides an approach which avoids package dependencies and lubridate might be used if you are already using it.
1) Add the appropriate origin using chron to get a chron datetime or convert that to POSIXct. Like Excel, chron works in days and fractions of a day, but chron uses the UNIX Epoch as origin whereas Excel uses the one shown below.
library(chron)
a <- 40208.64507
# chron date/time
ch <- as.chron("1899-12-30") + a; ch
## [1] (01/30/10 15:28:54)
# POSIXct date/time in local time zone (same clock time as Excel)
ct <- as.POSIXct(format(as.POSIXct(ch), tz = "UTC")); ct
## [1] "2010-01-30 15:28:54 EST"
# POSIXct date/time in UTC
as.POSIXct(format(ct), tz = "UTC")
## [1] "2010-01-30 15:28:54 UTC"
2) Using only base R convert the number to Date class using the indicated origin and then to POSIXct.
# POSIXct with local time zone (same clock time as Excel)
ct <- as.POSIXct(format(as.POSIXct(as.Date(a, origin = "1899-12-30")), tz = "UTC")); ct
## [1] "2010-01-30 15:28:54 EST"
# POSIXct with UTC time zone
as.POSIXct(format(ct), tz = "UTC")
## [1] "2010-01-30 15:28:54 UTC"
3) Using lubridate it is similar to base R so we can write
library(lubridate)
# local time zone
as_datetime(as_date(a, origin = "1899-12-30"), tz = "")
## [1] "2010-01-30 15:28:54 EST"
# UTC time zone
as_datetime(as_date(a, origin = "1899-12-30"))
## [1] "2010-01-30 15:28:54 UTC"
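As a further base R sketch, the serial number can also be turned into seconds directly. The helper name excel_to_posixct is hypothetical, and the 1899-12-30 origin is taken from the answer above (it is assumed to apply to serials after February 1900).

```r
# Hypothetical helper: Excel serial (days since 1899-12-30) to POSIXct in UTC.
excel_to_posixct <- function(serial) {
  # 86400 seconds per day; the UTC display keeps Excel's clock time.
  as.POSIXct(serial * 86400, origin = "1899-12-30", tz = "UTC")
}

a <- 40208.64507
dt <- excel_to_posixct(a)
dt_text <- format(dt, "%Y-%m-%d %H:%M:%S")
dt_text
```

For the value in the question, dt_text is "2010-01-30 15:28:54", the same clock time Excel shows.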

Date conversion in R number to date

I have this number 13132800000 and I know that it is a birthdate, "06/02/1970" or "02 June 1970".
How can I convert this number to this date in R?
I have no idea what kind of date format that is.
With
library(zoo)
as.Date(person$birthDate)
[1] "-5877641-06-23"
It looks like a timestamp in milliseconds.
Try
as.POSIXct( 13132800000/1000, origin = "1970-01-01", tz = "UTC" )
[1] "1970-06-02 UTC"
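A short sketch of why milliseconds is a plausible reading, using only base R. The variable names are illustrative; the check is simply that the number, divided by the milliseconds in a day, lands on a small whole number of days after the UNIX epoch.

```r
ms <- 13132800000

# Milliseconds per day = 24 * 60 * 60 * 1000.
days_since_epoch <- ms / 86400000   # 152 whole days, i.e. midnight exactly

# Convert to POSIXct (seconds since 1970-01-01 UTC), then drop the time part.
birth <- as.Date(as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "UTC"))
birth_text <- format(birth, "%Y-%m-%d")
birth_text
```

Calling as.Date() on the raw number instead treats it as a count of days, which is why the attempt in the question produced a nonsensical year.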

convert Julian day number to date time format yyyy-mm-dd hh:mm:ss in R

How can I convert a Julian day number to a date and time?
The origin is "2000-01-01" and I have two Julian day numbers, JDN (4822.178270, 4822.17840).
What are the equivalent date-times?
The code is
JDN <- c(4822.178270,4822.17840)
temp<- as.Date(JDN +0.5, origin=as.Date("2000-01-01 00:00:00")) # that gave only date as "2013-03-15" "2013-03-15" without time.
# my result should be:
"2013-03-15 16:16:42" "2013-03-15 16:16:53"
as.POSIXct('2000-01-01')+((JDN+0.5)*24*60*60)
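This should do it. Note that the result below leaves out the +0.5 day offset used in the question, so the times come out 12 hours earlier than the expected output: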
JDN <- c(4822.178270,4822.17840)
origin <- lubridate::ymd_hms('2000-01-01 00:00:00')
origin + JDN * 3600*24
#> [1] "2013-03-15 04:16:42 UTC" "2013-03-15 04:16:53 UTC"
Created on 2020-01-22 by the reprex package (v0.3.0)
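A base R sketch that applies the half-day offset from the question, assuming the day numbers count from midnight UTC on 2000-01-01 and that the +0.5 shift is what the data require. The variable names are illustrative.

```r
JDN <- c(4822.178270, 4822.17840)

# Days to seconds (86400 per day), shifted by half a day as in the question.
dt <- as.POSIXct((JDN + 0.5) * 86400, origin = "2000-01-01", tz = "UTC")

dt_text <- format(dt, "%Y-%m-%d %H:%M:%S")
dt_text
```

With the offset included, dt_text matches the expected values from the question, "2013-03-15 16:16:42" and "2013-03-15 16:16:53".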

standard deviation of time in a column in hr:min:sec format

In the question "average time in a column in hr:min:sec format" the following example is given:
Col_Time = c('03:08:20','03:11:30','03:22:18','03:27:39')
library(chron)
mean(times(Col_Time))
[1] 03:17:27
How can I get hr:min:sec as the result for the standard deviation? If I use the R function sd, the result looks like this:
sd(times(Col_Time))
[1] 0.006289466
sd is operating on the number internally representing the time (days for chron::times, seconds for hms and POSIXct, settable for difftime), which is fine. The only problem is that it is dropping the class from the result so it isn't printed nicely. The solution, then, is just to convert back to the time class afterwards:
x <- c('03:08:20','03:11:30','03:22:18','03:27:39')
chron::times(sd(chron::times(x)))
#> [1] 00:09:03
hms::as_hms(sd(hms::as_hms(x)))
#> 00:09:03.409836
as.POSIXct(sd(as.POSIXct(x, format = '%H:%M:%S')),
tz = 'UTC', origin = '1970-01-01')
#> [1] "1970-01-01 00:09:03 UTC"
as.difftime(sd(as.difftime(x, units = 'secs')),
units = 'secs')
#> Time difference of 543.4098 secs
You can use lubridate package. The hms function will convert time from characters to HMS format. Then use seconds to convert to seconds and calculate mean/sd. Finally, use seconds_to_period to get the result in HMS format.
library(lubridate)
Col_Time = c('03:08:20','03:11:30','03:22:18','03:27:39')
#Get the mean
seconds_to_period(mean(seconds(hms(Col_Time))))
# [1] "3H 17M 26.75S"
#Get the sd
seconds_to_period(sd(seconds(hms(Col_Time))))
#[1] "9M 3.40983612739285S"
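The same idea can be sketched without any packages: convert to seconds, summarise, and format back. The helper names to_seconds and to_hms are hypothetical, and the result is rounded to whole seconds.

```r
# Hypothetical helper: "HH:MM:SS" strings to seconds.
to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical helper: seconds (rounded) back to "HH:MM:SS".
to_hms <- function(secs) {
  secs <- as.integer(round(secs))
  sprintf("%02d:%02d:%02d", secs %/% 3600L, (secs %% 3600L) %/% 60L, secs %% 60L)
}

Col_Time <- c('03:08:20', '03:11:30', '03:22:18', '03:27:39')
secs <- to_seconds(Col_Time)

mean_hms <- to_hms(mean(secs))  # "03:17:27"
sd_hms   <- to_hms(sd(secs))    # "00:09:03"
```

This keeps the unit explicit (seconds throughout), which avoids guessing whether a given class stores days or seconds internally.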

R POSIXct returns NA with "03/12/2017 02:17:13"

I have a data set containing the following date, along with several others
03/12/2017 02:17:13
I want to put the whole data set into a data table, so I used read_csv and as.data.table to create DT which contained the date/time information in date.
Next I used
DT[, date := as.POSIXct(date, format = "%m/%d/%Y %H:%M:%S")]
Everything looked fine except I had some NA values where the original data had dates. The following expression returns an NA
as.POSIXct("03/12/2017 02:17:13", format = "%m/%d/%Y %H:%M:%S")
The question is why and how to fix.
Just use functions anytime() or utctime() from package anytime
R> library(anytime)
R> anytime("03/12/2017 02:17:13")
[1] "2017-03-12 01:17:13 CST"
R>
or
R> utctime("03/12/2017 02:17:13")
[1] "2017-03-11 20:17:13 CST"
R>
The real crux is that this time did not exist in North America due to DST. You could parse it as UTC, as UTC does not observe daylight saving time:
R> utctime("03/12/2017 02:17:13", tz="UTC")
[1] "2017-03-12 02:17:13 UTC"
R>
You can express that UTC time as Mountain time, but it gets you the previous day:
R> utctime("03/12/2017 02:17:13", tz="America/Denver")
[1] "2017-03-11 19:17:13 MST"
R>
Ultimately, you (as the analyst) have to decide which time zone the values were actually measured in. UTC would make sense; the others may need adjustment.
My solution is below but ways to improve appreciated.
The explanation for the NA is that in the mountain time zone in the US, that date and time is in the window of the switch to daylight savings where the time doesn't exist, hence NA. While the time zone is not explicitly specified, I guess R must be picking it up from the computer's time, which is in "America/Denver"
The solution is to explicitly state the date/time string is in UTC and then convert back as follows:
time.utc <- as.POSIXct("03/12/2017 02:17:13", format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
> time.utc
[1] "2017-03-12 02:17:13 UTC"
>
Next, add 6 hours to the UTC time. (Six hours is the difference between UTC and MDT; MST is 7 hours behind UTC.)
time.utc2 <- time.utc + 6 * 60 * 60
> time.utc2
[1] "2017-03-12 08:17:13 UTC"
>
Now convert to America/Denver time using daylight savings.
time.mdt <- format(time.utc2, usetz = TRUE, tz = "America/Denver")
> time.mdt
[1] "2017-03-12 01:17:13 MST"
>
Note that this is in standard time, because daylight savings doesn't start until 2 am.
If you change the original string from 2 am to 3 am, you get the following
> time.mdt
[1] "2017-03-12 03:17:13 MDT"
>
The hour between 2 and 3 is lost in the change from standard to daylight savings but the data are now correct.
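The missing hour can be seen directly with a small base R sketch. It assumes the system's time zone database includes "America/Denver" and uses two UTC instants one second apart, chosen to sit either side of the 2017 switch.

```r
# One second before and one second after the switch, expressed in UTC.
before <- as.POSIXct("2017-03-12 08:59:59", tz = "UTC")
after  <- before + 1

# Wall-clock time in Denver for each instant.
before_local <- format(before, "%H:%M:%S", tz = "America/Denver")  # "01:59:59"
after_local  <- format(after,  "%H:%M:%S", tz = "America/Denver")  # "03:00:00"

# UTC offset in effect at each instant.
before_offset <- format(before, "%z", tz = "America/Denver")  # "-0700" (MST)
after_offset  <- format(after,  "%z", tz = "America/Denver")  # "-0600" (MDT)
```

The local clock jumps from 01:59:59 straight to 03:00:00, so a string such as "03/12/2017 02:17:13" has no matching instant in that zone.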
