Converting time string with abbreviated timezone info in R - r

I have a data file that has time zone info in the datetime field. It looks something like this:
"Wed May 18 13:42:29 MDT 2016"
I'm trying to convert this to a POSIXct object. Unfortunately, %Z is supported only for output, so the abbreviation has to be hard-coded in the format string. For example:
> timestr <- "Wed May 18 13:42:29 MDT 2016"
> tm.posixct <- as.POSIXct(timestr, tz="US/Mountain", format = "%a %b %d %H:%M:%S MDT %Y")
My data file contains both MDT and MST, so I can't hard-code either one in the format string. Is there any way to use the time zone info in that field for the conversion, instead of doing string manipulation to remove it before passing the string to as.POSIXct?

Related

as.Date won't convert full written months properly [duplicate]

When I try to parse a timestamp in the following format: "Thu Nov 8 15:41:45 2012", only NA is returned.
I am using Mac OS X, R 2.15.2 and Rstudio 0.97.237. The language of my OS is Dutch: I presume this has something to do with it.
When I try strptime, NA is returned:
var <- "Thu Nov 8 15:41:45 2012"
strptime(var, "%a %b %d %H:%M:%S %Y")
# [1] NA
Neither does as.POSIXct work:
as.POSIXct(var, format = "%a %b %d %H:%M:%S %Y")
# [1] NA
I also tried as.Date on the string above but without %H:%M:%S components:
as.Date("Thu Nov 8 2012", "%a %b %d %Y")
# [1] NA
Any ideas what I could be doing wrong?
I think it is exactly as you guessed: strptime fails to parse your date-time string because of your locale. Your string contains both an abbreviated weekday (%a) and an abbreviated month name (%b). These conversion specifications are described in ?strptime:
Details
%a: Abbreviated weekday name in the current locale on this
platform
%b: Abbreviated month name in the current locale on this platform.
"Note that abbreviated names are platform-specific (although the
standards specify that in the C locale they must be the first three
letters of the capitalized English name:"
"Knowing what the abbreviations are is essential if you wish to use
%a, %b or %h as part of an input format: see the examples for
how to check."
See also
[...] locales to query or set a locale.
The issue of locales is relevant also for as.POSIXct, as.POSIXlt and as.Date.
From ?as.POSIXct:
Details
If format is specified, remember that some of the format
specifications are locale-specific, and you may need to set the
LC_TIME category appropriately via Sys.setlocale. This most often
affects the use of %b, %B (month names) and %p (AM/PM).
From ?as.Date:
Details
Locale-specific conversions to and from character strings are used
where appropriate and available. This affects the names of the days
and months.
Thus, if weekdays and month names in the string differ from those in the current locale, strptime, as.POSIXct and as.Date fail to parse the string correctly and NA is returned.
However, you may solve this issue by changing the locales:
# First save your current locale
loc <- Sys.getlocale("LC_TIME")
# Set correct locale for the strings to be parsed
# (in this particular case: English)
# so that weekdays (e.g "Thu") and abbreviated month (e.g "Nov") are recognized
Sys.setlocale("LC_TIME", "en_GB.UTF-8")
# or
Sys.setlocale("LC_TIME", "C")
#Then proceed as you intended
x <- "Thu Nov 8 15:41:45 2012"
strptime(x, "%a %b %d %H:%M:%S %Y")
# [1] "2012-11-08 15:41:45"
# Then set back to your old locale
Sys.setlocale("LC_TIME", loc)
With my personal locale I can reproduce your error:
Sys.setlocale("LC_TIME", loc)
# [1] "fr_FR.UTF-8"
strptime(var,"%a %b %d %H:%M:%S %Y")
# [1] NA
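The save/set/restore steps above can also be wrapped in a small helper so that the old locale comes back even if parsing throws an error. This is only a sketch: with_time_locale is a hypothetical name, not a base R function, and it assumes the "C" locale is available on your platform.

```r
# Hypothetical helper: evaluate an expression under a temporary LC_TIME locale.
with_time_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_TIME")
  # on.exit restores the previous locale even if expr fails
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", locale)
  force(expr)  # expr is lazy, so it is evaluated here, under the new locale
}

x <- "Thu Nov 8 15:41:45 2012"
parsed <- with_time_locale("C", strptime(x, "%a %b %d %H:%M:%S %Y", tz = "UTC"))
format(parsed, "%Y-%m-%d %H:%M:%S")
# [1] "2012-11-08 15:41:45"
```

Because the argument is evaluated lazily, the strptime call runs only after the locale has been switched, and the caller never touches Sys.setlocale directly.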
I was just working through the same problem and found this solution much cleaner: there is no need to change any system settings manually, because the lubridate package provides a wrapper that does it for you. All you have to do is set the locale argument:
library(lubridate)
date <- c("23. juni 2014", "1. november 2014", "8. marts 2014", "16. juni 2014", "12. december 2014", "13. august 2014")
# "Danish" is a Windows-style locale name; other platforms may need something like "da_DK.UTF-8"
dmy(date, locale = "Danish")
# [1] "2014-06-23" "2014-11-01" "2014-03-08" "2014-06-16" "2014-12-12" "2014-08-13"

Confusion regarding DateTime conversion in R

I am trying to convert a character string into a dateTime object in R
Sample data:
Following is the code that I am using for conversion
sample$Tweet_Timestamp <- lapply(sample$Tweet_Timestamp, function(x) as.POSIXct(strptime(x, "%a %b %d %H:%M:%S %z %Y")))
sample<-sample%>%unnest(Tweet_Timestamp)
The result I am getting is as follows:
In the result, we can see that the date has changed from 18 Feb to 19 Feb. I cannot understand why I am getting this result. Can someone help me decipher this?
The parsed instant is correct. as.POSIXct stores it as an absolute time and, by default, displays it in the time zone of your local system, so a late-evening UTC timestamp can show up as the next day. If you wish to keep the original time zone, add tz = "UTC" (Coordinated Universal Time).
For instance, the following code (using 1st row of your sample data):
as.POSIXct(strptime("Tue Feb 18 23:09:57 +0000 2014", "%a %b %d %H:%M:%S %z %Y", tz = "UTC"))
will produce the following output (without altering the time zone):
[1] "2014-02-18 23:09:57 UTC"
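To see that only the display changes and not the underlying instant, the same parsed value can be formatted in more than one time zone. This is a minimal sketch: the Asia/Kolkata zone is an assumption chosen for illustration (the asker's local zone is not stated), and the C locale is set so that the English day and month names parse.

```r
# English day/month names ("Tue", "Feb") need an English or C LC_TIME locale.
invisible(Sys.setlocale("LC_TIME", "C"))

# Parse once, pinning the time zone to UTC.
x <- as.POSIXct("Tue Feb 18 23:09:57 +0000 2014",
                format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

# Same instant, two displays.
utc_view   <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
local_view <- format(x, "%Y-%m-%d %H:%M:%S", tz = "Asia/Kolkata")  # assumed zone

utc_view
# [1] "2014-02-18 23:09:57"
local_view
# [1] "2014-02-19 04:39:57"
```

The second value lands on 19 Feb purely because that zone is 5 hours 30 minutes ahead of UTC, which matches the kind of date shift described in the question.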

Character string with timezone convert to date

I am trying to convert a vector of dates that I read from a csv file using read.table. They were read as a vector of character strings, and I am trying to convert it to a date vector using as.Date.
The date vector has elements of the below type
dateString
"Wed Dec 11 00:00:00 ICT 2013"
Trying to convert it with the command below gives an error:
as.Date(dateString,"%a %b %e %H:%M:%S %Z %Y")
Error in strptime(x, format, tz = "GMT") :
use of %Z for input is not supported
What would be the right format to use in strptime? or in as.Date?
Just use the anytime() function from the anytime package:
R> anytime::anytime("Wed Dec 11 00:00:00 ICT 2013")
[1] "2013-12-11 CST"
R>
There is also a utctime() variant that does not impose your local time zone, and much more. There have been a number of questions about it here by now, so just search.
And if you want a date, it works the same way:
R> anytime::anydate("Wed Dec 11 00:00:00 ICT 2013")
[1] "2013-12-11"
R>
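If adding a package is not an option, one base R workaround is to swap the abbreviation for a numeric UTC offset, which %z does accept on input. A rough sketch, assuming a hand-maintained lookup table: tz_offsets and parse_abbrev_time are hypothetical names, and the offsets listed are assumptions about what each abbreviation means in your data (some abbreviations, such as CST, are ambiguous).

```r
# Hypothetical lookup table: extend it with the abbreviations found in your data.
tz_offsets <- c(ICT = "+0700", MDT = "-0600", MST = "-0700")

parse_abbrev_time <- function(x, offsets = tz_offsets) {
  for (abbr in names(offsets)) {
    # Replace " ICT " with " +0700 " and so on; fixed = TRUE avoids regex surprises.
    x <- sub(paste0(" ", abbr, " "), paste0(" ", offsets[[abbr]], " "),
             x, fixed = TRUE)
  }
  # %z is supported for input, unlike %Z; the result is expressed in UTC.
  as.POSIXct(x, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")
}

# English day/month names need an English or C LC_TIME locale.
invisible(Sys.setlocale("LC_TIME", "C"))

parse_abbrev_time("Wed Dec 11 00:00:00 ICT 2013")
# [1] "2013-12-10 17:00:00 UTC"
```

The result is the correct instant shown in UTC; use format(..., tz = "Asia/Bangkok") or as.Date(..., tz = "Asia/Bangkok") if the local calendar date is what you need.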

converting utc format to required format using python

I am trying to convert a UTC date string to a required format using Python. I have the datetime in the format (Fri Dec 07 19:06:06 +0000 2012) and need to convert it to my format (2012-12-07 19:06:06:546 +0000).
Code:
created_at = "Fri Dec 07 19:06:06 +0000 2012"
d = datetime.strptime(created_at, '%a %b %d %H:%M:%S %z %Y')
date_object = d.strftime('%y-%m-%d %H:%M:%S')
result:
ValueError: 'z' is a bad directive in format '%a %b %d %H:%M:%S %z %Y'
The link below from the Python bug tracker says it is fixed, but I didn't understand what was fixed or in which version I can use %z:
http://bugs.python.org/issue6641
I am not able to use %z. Is there any other way to handle this?
Do it like this:
from datetime import datetime
TIMESTAMP = datetime.utcnow().strftime('%d/%m/%Y %H:%M:%S')
