Confusion regarding DateTime conversion in R - r

I am trying to convert a character string into a dateTime object in R
Sample data:
Following is the code that I am using for conversion
sample$Tweet_Timestamp <- lapply(sample$Tweet_Timestamp, function(x) as.POSIXct(strptime(x, "%a %b %d %H:%M:%S %z %Y")))
sample<-sample%>%unnest(Tweet_Timestamp)
The result I am getting is as follows:
In the result we can see that the date has changed from 18th Feb to 19th Feb. I cannot understand why I am getting this result. Can someone help me decipher this?

as.POSIXct automatically converts the date-time to the time zone of your local system. If you wish to keep the original time zone, you can do so by adding tz = "UTC" (Coordinated Universal Time).
For instance, the following code (using 1st row of your sample data):
as.POSIXct(strptime("Tue Feb 18 23:09:57 +0000 2014", "%a %b %d %H:%M:%S %z %Y", tz = "UTC"))
will produce the following output (without altering the time zone):
[1] "2014-02-18 23:09:57 UTC"
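To see that only the displayed time zone changes and not the underlying instant, the same values can be formatted in two zones. This is a minimal sketch: the second timestamp is made up for illustration, and "Asia/Kolkata" is only a hypothetical stand-in for the local zone, since the question does not say which zone the system uses. It also shows that as.POSIXct is vectorised, so the lapply()/unnest() step should not be needed.

```r
# Use the C locale so that %a and %b match English day and month names.
invisible(Sys.setlocale("LC_TIME", "C"))

# First value is from the question; the second is invented for illustration.
stamps <- c("Tue Feb 18 23:09:57 +0000 2014",
            "Wed Feb 19 00:15:00 +0000 2014")

# as.POSIXct works on the whole vector at once.
x <- as.POSIXct(stamps, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

# Same instants, rendered in two zones.
# "Asia/Kolkata" (UTC+05:30) is an assumed local zone, not taken from the question.
in_utc   <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
in_local <- format(x, "%Y-%m-%d %H:%M:%S", tz = "Asia/Kolkata")

print(in_utc)    # "2014-02-18 23:09:57" "2014-02-19 00:15:00"
print(in_local)  # "2014-02-19 04:39:57" "2014-02-19 05:45:00"
```

The first value moves from 18th Feb to 19th Feb in the assumed local zone, which is the kind of shift described in the question.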

Related

Timestamp conversion in R and calculating Time Difference between 2 Columns of different DFs

I need to calculate time difference in minutes/hours/days etc between 2 Date-Time columns of two dataframes, please find the details below
df1 <- data.frame (Name = c("Aks","Bob","Caty","David"),
timestamp = c("Mon Apr 1 14:23:09 1980", "Sun Jun 12 12:10:21 1975", "Fri Jan 5 18:45:10 1985", "Thu Feb 19 02:26:19 1990"))
df2 <- data.frame (Name = c("Aks","Bob","Caty","David"),
timestamp = c("Apr-01-1980 14:28:00","Jun-12-1975 12:45:10","Jan-05-1985 17:50:30","Feb-19-1990 02:28:00"))
I am having trouble converting df1$timestamp and df2$timestamp. as.POSIXct and as.Date are not working; I get the error "non-numeric argument to binary operator".
I need to calculate the time difference in minutes, hours, or days.
One approach is to use strptime and specify the appropriate directives in the datetime format:
df1$timestamp2 <- strptime(df1$timestamp, "%a %b %d %H:%M:%S %Y")
df2$timestamp2 <- strptime(df2$timestamp, "%b-%d-%Y %H:%M:%S")
In this case, you have:
%a abbreviated weekday name
%b abbreviated month name
%d day of the month
%H hour, 24-hour clock
%M minute
%S second
%Y year including century
Then you can use difftime to get the difference, and specify the units (in this case, difference expressed in hours):
difftime(df1$timestamp2, df2$timestamp2, units = "hours")
Output
Time differences in hours
[1] -0.08083333 -0.58027778 0.91111111 -0.02805556
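If a plain number is needed instead of a difftime object, the result can be converted with as.numeric and a units argument. A minimal sketch using only the first row of each data frame from the question; tz = "UTC" is an assumption here, chosen so that the result does not depend on the local zone.

```r
# Use the C locale so that %a and %b match English day and month names.
invisible(Sys.setlocale("LC_TIME", "C"))

# First row of df1 and df2 from the question; tz = "UTC" is an assumption.
t1 <- as.POSIXct("Mon Apr 1 14:23:09 1980",
                 format = "%a %b %d %H:%M:%S %Y", tz = "UTC")
t2 <- as.POSIXct("Apr-01-1980 14:28:00",
                 format = "%b-%d-%Y %H:%M:%S", tz = "UTC")

d <- difftime(t2, t1, units = "mins")

# Plain numbers in whichever unit is needed.
mins  <- as.numeric(d)                   # uses the units stored in d (minutes)
secs  <- as.numeric(d, units = "secs")
hours <- as.numeric(d, units = "hours")

print(mins)  # 4.85
print(secs)  # 291
```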
If your locale settings prevent the names from being read correctly, try:
# Store current locale
orig_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
# Convert to posix-timestamp
df1$timestamp <- as.POSIXct( df1$timestamp, format = "%a %b %d %H:%M:%S %Y")
df2$timestamp <- as.POSIXct( df2$timestamp, format = "%b-%d-%Y %H:%M:%S")
# Restore locale
Sys.setlocale("LC_TIME", orig_locale)
# Calculate difference
df2$timestamp - df1$timestamp
# Time differences in mins
# [1] 4.850000 34.816667 -54.666667 1.683333
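The store, set, and restore steps can also be wrapped in a small helper so that the locale is restored even if parsing fails. This is only a sketch: parse_c_locale is a hypothetical name, and the tz = "UTC" default is an assumption rather than part of the answer above.

```r
# Hypothetical helper: parse timestamps under the C locale, then restore
# the previous LC_TIME setting, even if an error occurs.
parse_c_locale <- function(x, format, tz = "UTC") {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.POSIXct(x, format = format, tz = tz)
}

# Usage with two values from df1 in the question.
x <- parse_c_locale(c("Mon Apr 1 14:23:09 1980", "Sun Jun 12 12:10:21 1975"),
                    format = "%a %b %d %H:%M:%S %Y")
parsed <- format(x, "%Y-%m-%d %H:%M:%S")
print(parsed)  # "1980-04-01 14:23:09" "1975-06-12 12:10:21"
```

Using on.exit keeps the restore next to the change, so the locale is reset on every exit path from the function.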

Convert Uncommon Date Format

Is there an easy way to transform this string into a meaningful date format?
date <- "Tue Apr 04 10:18:33 +0000 2017"
Right now I would use some regex and then the lubridate package. But I guess there is a less complicated way.
Try this:
as.POSIXct(date, format = "%a %b %d %H:%M:%S %z %Y")
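Note that %z reads the UTC offset and applies it, and that without a tz argument the result is printed in the local time zone. The sketch below makes both points visible. The second string, with a +0200 offset, is a made-up example, and tz = "UTC" is an assumption about the desired display zone.

```r
# Use the C locale so that %a and %b match English day and month names.
invisible(Sys.setlocale("LC_TIME", "C"))

fmt <- "%a %b %d %H:%M:%S %z %Y"

date    <- "Tue Apr 04 10:18:33 +0000 2017"  # from the question
shifted <- "Tue Apr 04 10:18:33 +0200 2017"  # hypothetical non-zero offset

# tz = "UTC" fixes the display zone; %z supplies the offset of each input.
x <- as.POSIXct(c(date, shifted), format = fmt, tz = "UTC")

as_utc  <- format(x, "%Y-%m-%d %H:%M:%S")
as_date <- as.Date(x, tz = "UTC")

print(as_utc)   # "2017-04-04 10:18:33" "2017-04-04 08:18:33"
print(as_date)  # "2017-04-04" "2017-04-04"
```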

Character string with timezone convert to date

I am trying to convert a vector of dates that I read from a csv file using read.table. These were read as a vector of character strings. I am trying to convert it to a date vector using as_date.
The date vector has elements of the below type
dateString
"Wed Dec 11 00:00:00 ICT 2013"
On trying to convert using the below command,
as.Date(dateString,"%a %b %e %H:%M:%S %Z %Y")
Error in strptime(x, format, tz = "GMT") :
use of %Z for input is not supported
What would be the right format to use in strptime? or in as.Date?
Just use the anytime() function from the anytime package:
R> anytime::anytime("Wed Dec 11 00:00:00 ICT 2013")
[1] "2013-12-11 CST"
R>
There is also a utctime() variant that does not impose your local time zone, and much more. By now there have also been a number of questions about it here, so just search.
And if you want a date, it works the same way:
R> anytime::anydate("Wed Dec 11 00:00:00 ICT 2013")
[1] "2013-12-11"
R>
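If adding a package is not an option, a base R workaround is possible: pull the zone abbreviation out of the string, map it to a time zone name, and parse the rest without %Z. This is only a sketch. The lookup table tz_lookup is hypothetical, and mapping ICT to "Asia/Bangkok" (Indochina Time, UTC+7) is an assumption about what the data means. The code handles a single string; a vector with mixed zones would need to be processed element by element.

```r
# Use the C locale so that %a and %b match English day and month names.
invisible(Sys.setlocale("LC_TIME", "C"))

dateString <- "Wed Dec 11 00:00:00 ICT 2013"

# Hypothetical lookup table; extend it for any other abbreviations in the data.
tz_lookup <- c(ICT = "Asia/Bangkok")

# Extract the abbreviation, then remove it from the string.
abbr    <- sub("^.* ([A-Z]{2,5}) [0-9]{4}$", "\\1", dateString)
cleaned <- sub(" [A-Z]{2,5} ([0-9]{4})$", " \\1", dateString)

zone <- tz_lookup[[abbr]]
x <- as.POSIXct(cleaned, format = "%a %b %d %H:%M:%S %Y", tz = zone)

# Pass the zone to as.Date as well, otherwise the date may be taken in UTC.
d <- as.Date(x, tz = zone)

print(format(x, "%Y-%m-%d %H:%M:%S"))              # "2013-12-11 00:00:00"
print(format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC"))  # "2013-12-10 17:00:00"
print(d)                                           # "2013-12-11"
```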

Converting time string with abbreviated timezone info in R

I have a data file that has timezone info in the datetime field. It looks something like this
"Wed May 18 13:42:29 MDT 2016"
I'm trying to convert this to a POSIXct object. Unfortunately %Z is allowed only for output. For example:
> timestr <- "Wed May 18 13:42:29 MDT 2016"
> tm.posixct <- as.POSIXct(timestr, tz="US/Mountain", format = "%a %b %d %H:%M:%S MDT %Y")
My data file contains both MDT and MST, so I can't hard-code either one in the format string. Is there any way to use the time zone info in that field for the conversion, instead of doing string manipulation and removing the time zone info before passing it to as.POSIXct?
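One possible workaround does involve the string manipulation the question hoped to avoid, but it is short. Since MDT and MST are the daylight and standard abbreviations of the same zone, the abbreviation can be dropped and the zone passed through tz, which then picks the correct offset for each date. This is a sketch under assumptions: the second timestamp is invented to show an MST value, and "America/Denver" is used as an equivalent of the "US/Mountain" zone from the question. The abbreviation is the only thing that separates the two readings of the repeated hour when clocks go back, so that information is lost here.

```r
# Use the C locale so that %a and %b match English day and month names.
invisible(Sys.setlocale("LC_TIME", "C"))

# First value is from the question; the second is a made-up MST example.
timestr <- c("Wed May 18 13:42:29 MDT 2016",
             "Mon Jan 18 13:42:29 MST 2016")

# Drop the abbreviation and let the zone determine the offset.
cleaned <- sub(" M[DS]T ", " ", timestr)
tm <- as.POSIXct(cleaned, tz = "America/Denver",
                 format = "%a %b %d %H:%M:%S %Y")

# MDT is UTC-6 and MST is UTC-7, so the UTC times differ by one hour.
in_utc <- format(tm, "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(in_utc)  # "2016-05-18 19:42:29" "2016-01-18 20:42:29"
```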

converting utc format to required format using python

I am trying to convert a UTC date string to a required format using Python. I have the datetime in the format (Fri Dec 07 19:06:06 +0000 2012) and need to convert it into my format (2012-12-07 19:06:06:546 +0000).
Code:
created_at = "Fri Dec 07 19:06:06 +0000 2012"
d = datetime.strptime(created_at, '%a %b %d %H:%M:%S %z %Y')
date_object = d.strftime('%y-%m-%d %H:%M:%S')
result:
ValueError: 'z' is a bad directive in format '%a %b %d %H:%M:%S %z %Y'
The link below from the Python bug tracker says it is fixed, but I didn't understand what was fixed and in which version I can use %z:
http://bugs.python.org/issue6641
I am not able to use %z. Is there any other way to handle this?
Do it like this:
from datetime import datetime
TIMESTAMP = datetime.utcnow().strftime('%d/%m/%Y %H:%M:%S')
