I have a large data frame with a column containing date-times, encoded as a factor variable. My Sys.timezone() is "Europe/Berlin". The date-times have this format:
2015-05-05 17:27:04+05:00
where +05:00 represents the timeshift from GMT. Importantly, I have multiple timezones in my dataset, so I cannot set a specific timezone and ignore the last 6 characters of the strings. This is what I tried so far:
# Test Date
test <- "2015-05-05 17:27:04+05:00"
# Removing the ":" to make it readable by %z
A <- paste(substr(test,1,22),substr(test,24,25),sep = "");A
# Returns
# "2015-05-05 17:27:04+0500"
output <- as.POSIXct(as.character(A, "%Y-%B-%D %H:%M:%S%z"))
# Returns
# "2015-05-05 17:27:04 CEST"
The output of "CEST" for +0500 is incorrect. Moreover, when I run this code on the whole column I see that every date is coded as CEST, regardless of the offset.
How can I keep the specified timezone when converting to POSIXct?
To make this easier, you can use the lubridate package.
For example:
library("lubridate")  # load the package
ymd_hms("2015-05-05 17:27:04+05:00", tz = "GMT")  # parse the string; the +05:00 offset is applied
[1] "2015-05-05 12:27:04 GMT"
The +05:00 offset is applied when parsing, so the instant is kept and shown in GMT. Finally:
as.POSIXct(ymd_hms("2015-05-05 17:27:04+05:00", tz = "GMT"), tz = "GMT")  # make sure the result is POSIXct, displayed in GMT
[1] "2015-05-05 12:27:04 GMT"
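For comparison, base R can also read the offset directly once the colon is removed. This is only a minimal sketch, not the method of the answer above; the second sample value and the names x, x_clean, parsed and offsets are made up for illustration. Note that a POSIXct vector carries a single tzone attribute, so the original offsets have to be kept separately if they are needed later.

```r
# Sample strings with two different offsets (the second one is hypothetical)
x <- c("2015-05-05 17:27:04+05:00", "2015-05-05 17:27:04-03:00")

# Remove the colon inside the trailing offset so that %z can read it
x_clean <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)

# %z applies each offset; tz = "UTC" only controls how the result is displayed
parsed <- as.POSIXct(x_clean, format = "%Y-%m-%d %H:%M:%S%z", tz = "UTC")

# Keep the original offsets separately, since POSIXct stores one time zone per vector
offsets <- sub("^.*([+-][0-9]{2}:[0-9]{2})$", "\\1", x)

format(parsed, "%Y-%m-%d %H:%M:%S")
# "2015-05-05 12:27:04" "2015-05-05 20:27:04"
```

If the column is a factor, wrap it in as.character() before calling sub().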
Related
I want to convert number of days to date with time:
> 15525.1+as.Date("1970-01-01")
[1] "2012-07-04" ## correct but no time
I tried this:
> apollo.fmt <- "%B %d, %Y, %H:%M:%S"
> as.POSIXct((15525.1+as.Date("1970-01-01")), format=apollo.fmt, tz="UTC")
[1] "2012-07-04 04:24:00 CEST"
but as you can see, the result is given in CEST, and I need it in UTC.
Any hints on this?
For the original conversion, refer to this question: Converting numeric time to datetime POSIXct format in R, and these pages: Date-times in R, Date-time conversions, and Converting excel dates (number) to R date-time object. Basically, it depends on your data source, the time origin for that data source (Excel, Apache etc.) and the units. For example, you may have the total time elapsed in seconds, minutes, hours or days since the time origin, and that origin differs between Excel and Apache. Once you have this information, you can use strptime or the origin argument to convert to R date-time objects.
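As a rough sketch of the origin-plus-units idea for the value in the question, assuming it is a count of days since 1970-01-01: convert the days to seconds and pass the origin and the time zone explicitly. The variable names are made up for illustration.

```r
days <- 15525.1                # days since 1970-01-01, taken from the question
secs <- round(days * 86400)    # 86400 seconds per day; round() guards against floating-point error

# Numeric input is interpreted as seconds since `origin`; tz sets the display time zone
dt <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

format(dt, "%Y-%m-%d %H:%M:%S")
# "2012-07-04 02:24:00"
```

This is the same instant as the 04:24:00 CEST shown in the question, only displayed in UTC.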
If you are only concerned with changing the timezone, you can use attr:
> u <- Sys.time()
> u
[1] "2017-12-21 09:01:35 EST"
> attr(u, "tzone") <- "UTC"
> u
[1] "2017-12-21 14:01:35 UTC"
You may want to check up on the valid timezones for your machine though. A good way to get a time-zone that works with your machine would be googleway::google_timezone. To get the coordinates for your location (or the location from where you're importing data), you can either look those up online or use ggmap::geocode() - useful if converting time stamps in data from different time zones.
I think the problem is as.POSIXct doesn't change anything if the time is already POSIXct, so the tz option has no effect.
Use attr as explained here
I'm relatively new to R and right now I'm struggling with converting my time data.
I have a list of values, which should be daily data (01/01/2007-31/12/2015)
str(timedata)
num [1:3103(1d)] 733043 733044 733045 733046 733047 ...
I would like to read the daily dates 01/01/07 02/01/07....
I tried to convert it with the function
as.POSIXct(timedata, origin = "2007-01-01", tz = "GMT")
but the result is wrong:
"2007-01-09 11:39:08 GMT" "2007-01-09 11:39:09 GMT" "2007-01-09 11:39:10 GMT"
"2007-01-09 11:39:11 GMT" "2007-01-09 11:39:12 GMT"...
Maybe someone could help me? I guess it should be possible to get my dates, but which function makes sense?
This gives a daily vector of dates from Jan 1, 2007 to Dec 31, 2015:
tsD <- seq(as.Date("2007-01-01"), as.Date("2015-12-31"), by = "day")
You could also coerce your numeric vector to Dates. But to do so you need the correct origin for your data. The values appear to be day counts from year 0 (the convention of MATLAB's datenum, in which 733043 is 2007-01-01) rather than the 1900-based serial numbers introduced by Lotus 1-2-3 and perpetuated by Microsoft Excel:
tsD <- as.Date(my.timedata, origin="0000-01-01")
I actually don't know how to do that in one step, because this origin leaves the result one day late, as the output shows. So I would subtract 1 from those values to get a corrected date series. Read ?as.Date
> tsD
[1] "2007-01-02" "2007-01-03" "2007-01-04" "2007-01-05" "2007-01-06"
If you want the dates to be displayed as 'dd-mm-yyyy' (or 'dd/mm/yyyy', by using "/" in the format string), read about formatting strings in ?strptime, but use the generic format function, which has a Date method:
> format(tsD, format="%d-%m-%Y")
[1] "02-01-2007" "03-01-2007" "04-01-2007" "05-01-2007" "06-01-2007"
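A sketch of a one-step conversion, under the assumption that 733043 is meant to be 2007-01-01 (the first day of the asker's range). On that assumption 1970-01-01 corresponds to day number 719529, which can simply be subtracted. The name offset_1970 is made up for illustration.

```r
timedata <- c(733043, 733044, 733045)   # first values shown in the question
offset_1970 <- 719529                   # assumed day number of 1970-01-01

# Shift the day counts so that they are relative to R's own origin
tsD <- as.Date(timedata - offset_1970, origin = "1970-01-01")

format(tsD, "%d/%m/%y")
# "01/01/07" "02/01/07" "03/01/07"
```

If the data come from a source with a different origin, the offset has to be adjusted accordingly.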
The date July 1, 2016, 1:15 pm and 43 seconds is given to me as the string 160701131543.
I have an entire column of these date-times in my data frame. How should I go about parsing this column into usable data?
You can use the as.POSIXct function and specify the format, in your case the format is year, month, day, hour, minute, second. Read more about formatting date and time data on the ?strptime help page.
as.POSIXct("160701131543", format = "%y%m%d%H%M%S")
[1] "2016-07-01 13:15:43 EDT"
The timezone can be changed with the 'tz' parameter.
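A short sketch of applying this to a whole column with an explicit time zone. The data frame df and the column names stamp and when are hypothetical, as is the second sample value.

```r
# Hypothetical data frame holding the compact time stamps as character strings
df <- data.frame(stamp = c("160701131543", "160702000000"),
                 stringsAsFactors = FALSE)

# as.POSIXct is vectorised; tz = "UTC" avoids depending on the local time zone
df$when <- as.POSIXct(df$stamp, format = "%y%m%d%H%M%S", tz = "UTC")

format(df$when, "%Y-%m-%d %H:%M:%S")
# "2016-07-01 13:15:43" "2016-07-02 00:00:00"
```

If the column was read in as a factor, convert it with as.character() first.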
Here is another option with lubridate. The default tz is "UTC". It can be changed by specifying the tz argument.
library(lubridate)
ymd_hms("160701131543")
#[1] "2016-07-01 13:15:43 UTC"
I'm trying to combine two vectors of dates into a single vector. I have been using dates with the lubridate package.
First I create two vectors of dates:
library(lubridate)
mydate <- mdy("04/01/2016")
mydate_range <- mydate + (1:12)*months(1)
anotherdate_range <- mdy("05/01/2017") + (1:12)*months(1)
Inspecting mydate_range and anotherdate_range these seem to have worked fine.
But then when I try to combine these into one vector things get weird.
combineddates <- c(mydate_range, anotherdate_range)
combineddates
[1] "2016-04-30 19:00:00 CDT" "2016-05-31 19:00:00 CDT" "2016-06-30 19:00:00 CDT"
The first date of combineddates is now "2016-04-30". Before I combined them using the c() function the first date of mydate_range was "2016-05-01".
Not sure why this changed. How should I join these date vectors?
The reason for the date change is the conversion due to time zone adjustments. 2016-04-30 19:00:00 CDT is the same as 2016-05-01 GMT. Most likely your initial sequence was in GMT and somewhere along the way it got converted to local time.
I find it best to define the time zone in your initial definition and it should stay consistent throughout.
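A minimal base R sketch of fixing the time zone up front, assuming the values are date-times (POSIXct). The names a, b and both are made up, and the sequences are shortened to three months each. The tzone attribute is set again after c() because some R versions drop it when combining.

```r
# Two monthly sequences, both created explicitly in UTC
a <- seq(as.POSIXct("2016-05-01", tz = "UTC"), by = "month", length.out = 3)
b <- seq(as.POSIXct("2017-06-01", tz = "UTC"), by = "month", length.out = 3)

both <- c(a, b)
attr(both, "tzone") <- "UTC"   # restore the time zone in case c() dropped it

format(both, "%Y-%m-%d")
# "2016-05-01" "2016-06-01" "2016-07-01" "2017-06-01" "2017-07-01" "2017-08-01"
```

If only the calendar day matters, as.Date(both, tz = "UTC") gives plain Dates that no longer shift with the local time zone.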
I have a file with nearly four thousand entries in a column formatted like this:
1/28/2015 14:13
How do I get R to read these as real numbers?
As @RomanLuštrik suggested:
mydate <- "1/28/2015 14:13"
# convert to date
strptime(mydate, "%m/%d/%Y %H:%M")
# [1] "2015-01-28 14:13:00 GMT"
# make it numeric
as.numeric(strptime(mydate, "%m/%d/%Y %H:%M"))
# [1] 1422454380
datestring <- "1/28/2015 14:13"  # your variable
x <- strptime(datestring, "%m/%d/%Y %H:%M")  # the format must be a quoted string
Just check out the ?strptime help page.
There is also the lubridate package, which has a lot of functions for changing formats.
For real numbers, use the as.POSIXct() function and wrap the result in as.numeric().
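Putting these suggestions together for a whole column, as a sketch: the vector stamps and its second value are hypothetical, and tz = "UTC" is an assumption about how the times should be interpreted.

```r
# Hypothetical column of time stamps in month/day/year hour:minute form
stamps <- c("1/28/2015 14:13", "12/3/2015 9:05")

# Parse to POSIXct with an explicit time zone, then take seconds since 1970-01-01
parsed <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M", tz = "UTC")
secs <- as.numeric(parsed)

secs[1]
# 1422454380
```

The numbers can be turned back into date-times with as.POSIXct(secs, origin = "1970-01-01", tz = "UTC").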