Convert date strings with R

I'm working with date strings in R. I have three differently formatted strings that represent date variables, which I got from scraping data on the web.
Is it possible to convert these three date strings into one universal format that makes it easier to perform logic on them with basic R code? Here is what the strings look like. Any help is greatly appreciated.
1. "Wed, Feb 7, 2017 7:30 pm"
2. "Wed Feb 7 08:00:04 2017"
3. "2017-02-7 13:06:14 PST" # Sys.time()
UPDATE: I now have a better understanding of as.POSIXct, but I still don't understand why this doesn't work:
as.POSIXct('02/15/2017, 10:00 PM', format = "%M/%D/%Y, %H:%M %r")

Your specific question has already been answered in the comments, but I would like to leave this as a general reference for other people who have similar problems and come across this question.
As d.b's comment shows, your date-time string can be parsed with this command:
as.POSIXct("Wed, Feb 7, 2017 7:30 pm", format = "%a, %b %d, %Y %I:%M %p")
The only difference between your first and second cases is the format string.
Here is a general guide to the format codes:
%a Abbreviated weekday
%A Full weekday
%b Abbreviated month
%B Full month
%c Locale-specific date and time
%d Decimal day of the month
%H Decimal hours (24 hour)
%I Decimal hours (12 hour)
%j Decimal day of the year
%m Decimal month
%M Decimal minute
%p Locale-specific AM/PM
%S Decimal second
%U Decimal week of the year (starting on Sunday)
%w Decimal Weekday (0=Sunday)
%W Decimal week of the year (starting on Monday)
%x Locale-specific Date
%X Locale-specific Time
%y 2-digit year
%Y 4-digit year
%z Offset from GMT
%Z Time zone (character)
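For the string in the UPDATE, a format built from the codes above would look roughly like the sketch below. %M means minutes, so the month needs %m; the day of the month is %d; and a 12-hour time with AM/PM wants %I together with %p rather than %H. The tz = "UTC" argument is my own assumption, added only so the result is reproducible, and the example assumes an English locale so that "PM" is recognised.

```r
# Assumes an English (or C) LC_TIME locale so that %p recognises "PM".
# tz = "UTC" is an assumption made only to keep the output reproducible.
x <- as.POSIXct("02/15/2017, 10:00 PM",
                format = "%m/%d/%Y, %I:%M %p",
                tz = "UTC")

# Show the parsed value on a 24-hour clock
parsed <- format(x, "%Y-%m-%d %H:%M")
print(parsed)
```

If the result is NA, the format string and the input disagree somewhere, so it is worth comparing them piece by piece.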
The same codes are also useful if you want to convert a date-time into a different output format:
x <- as.POSIXct( "2017-01-15")
format(x, "%a")
[1] "Sun"
format(x, "Week of the year: %W")
[1] "Week of the year: 02"
source: https://www.stat.berkeley.edu/~s133/dates.html

as.POSIXct("Wed, Feb 7, 2017 7:30 pm", format = "%a, %b %d, %Y %I:%M %p", tz="PST8PDT")
as.POSIXct("Wed Feb 7 08:00:04 2017", format = "%A %b %d %H:%M:%S %Y",tz="PST8PDT")
as.POSIXct("2017-02-7 13:06:14 PST", format = "%Y-%m-%d %H:%M:%S",tz="PST8PDT")
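If the three kinds of string are mixed together in one vector, one option is to try each format in turn and keep the first one that parses. The sketch below does that; parse_mixed is a hypothetical helper name of my own, not a base R function, and tz = "America/Los_Angeles" is an assumption based on the "PST" in the third string. It also assumes an English locale for the weekday and month names.

```r
# Hypothetical helper (not part of base R): try each format in order and
# keep the first successful parse for every element of x.
parse_mixed <- function(x, formats, tz = "UTC") {
  # Start with an all-NA POSIXct vector of the right length
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (f in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = f, tz = tz)
  }
  out
}

dates <- c("Wed, Feb 7, 2017 7:30 pm",
           "Wed Feb 7 08:00:04 2017",
           "2017-02-7 13:06:14 PST")

formats <- c("%a, %b %d, %Y %I:%M %p",   # 12-hour clock with am/pm
             "%a %b %d %H:%M:%S %Y",     # 24-hour clock, year last
             "%Y-%m-%d %H:%M:%S")        # ISO-like; trailing "PST" is ignored

# The time zone is an assumption; adjust it to wherever the data came from.
res <- parse_mixed(dates, formats, tz = "America/Los_Angeles")
print(res)
```

Once everything is POSIXct, comparisons such as res[1] > res[2] and arithmetic with difftime work on the whole vector.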

Related

Timestamp conversion in R and calculating Time Difference between 2 Columns of different DFs

I need to calculate the time difference in minutes, hours or days between two date-time columns in two data frames. The details are below:
df1 <- data.frame (Name = c("Aks","Bob","Caty","David"),
timestamp = c("Mon Apr 1 14:23:09 1980", "Sun Jun 12 12:10:21 1975", "Fri Jan 5 18:45:10 1985", "Thu Feb 19 02:26:19 1990"))
df2 <- data.frame (Name = c("Aks","Bob","Caty","David"),
timestamp = c("Apr-01-1980 14:28:00","Jun-12-1975 12:45:10","Jan-05-1985 17:50:30","Feb-19-1990 02:28:00"))
I am having trouble converting df1$timestamp and df2$timestamp. as.POSIXct and as.Date are not working, and I get the error "non-numeric argument to binary operator".
I need the time difference in minutes, hours or days.
One approach is to use strptime and give the appropriate directives in the date-time format:
df1$timestamp2 <- strptime(df1$timestamp, "%a %b %d %H:%M:%S %Y")
df2$timestamp2 <- strptime(df2$timestamp, "%b-%d-%Y %H:%M:%S")
In this case, you have:
%a abbreviated weekday name
%b abbreviated month name
%d day of the month
%H hour, 24-hour clock
%M minute
%S second
%Y year including century
Then you can use difftime to get the difference, and specify the units (in this case, difference expressed in hours):
difftime(df1$timestamp2, df2$timestamp2, units = "hours")
Output
Time differences in hours
[1] -0.08083333 -0.58027778 0.91111111 -0.02805556
If your locale settings prevent the strings from being read correctly, try:
# Store current locale
orig_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
# Convert to posix-timestamp
df1$timestamp <- as.POSIXct( df1$timestamp, format = "%a %b %d %H:%M:%S %Y")
df2$timestamp <- as.POSIXct( df2$timestamp, format = "%b-%d-%Y %H:%M:%S")
# Restore locale
Sys.setlocale("LC_TIME", orig_locale)
# Calculate difference
df2$timestamp - df1$timestamp
# Time differences in mins
# [1] 4.850000 34.816667 -54.666667 1.683333
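The "non-numeric argument to binary operator" error usually means the subtraction was attempted on character (or factor) columns before they were converted. Once both sides are date-times, the difference is a difftime object, and it can be turned into a plain number in whatever unit is needed. A small sketch using the first row of the data, with tz = "UTC" as my own assumption for reproducibility:

```r
# First row of each data frame, already written in an unambiguous format.
# tz = "UTC" is an assumption made only so the example is reproducible.
start <- as.POSIXct("1980-04-01 14:23:09", tz = "UTC")
end   <- as.POSIXct("1980-04-01 14:28:00", tz = "UTC")

d <- difftime(end, start, units = "mins")

# Convert the same difference to plain numbers in different units
mins  <- as.numeric(d)                    # uses the units stored in d
hours <- as.numeric(d, units = "hours")
days  <- as.numeric(d, units = "days")

print(c(mins = mins, hours = hours, days = days))
```

Stating the units explicitly avoids surprises, because a plain subtraction picks the unit automatically based on the size of the differences.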

Confusion regarding DateTime conversion in R

I am trying to convert a character string into a dateTime object in R
Sample data:
Following is the code that I am using for conversion
sample$Tweet_Timestamp <- lapply(sample$Tweet_Timestamp, function(x) as.POSIXct(strptime(x, "%a %b %d %H:%M:%S %z %Y")))
sample<-sample%>%unnest(Tweet_Timestamp)
The result I am getting is as follows:
In the result, the date has changed from 18 Feb to 19 Feb. I cannot understand why I am getting this result. Can someone help me work it out?
By default, as.POSIXct shows the date-time in the time zone of your local system. Your strings carry a +0000 offset, so a time late on 18 Feb in UTC can already be 19 Feb where you are. If you wish to keep the original time zone, add tz = "UTC" (Coordinated Universal Time).
For instance, the following code (using 1st row of your sample data):
as.POSIXct(strptime("Tue Feb 18 23:09:57 +0000 2014", "%a %b %d %H:%M:%S %z %Y", tz = "UTC"))
will produce the following output (without altering the time zone):
[1] "2014-02-18 23:09:57 UTC"
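To see that only the display changes and not the underlying instant, the same value can be printed in two zones. In the sketch below, "Asia/Kolkata" is purely an illustrative assumption; the question does not say which zone the system is in.

```r
# Parse the first sample row as UTC (assumes an English locale for "Tue"/"Feb")
x <- as.POSIXct(strptime("Tue Feb 18 23:09:57 +0000 2014",
                         "%a %b %d %H:%M:%S %z %Y", tz = "UTC"))

# Same instant, displayed in two different zones.
# "Asia/Kolkata" (UTC+05:30) is only an example of a zone ahead of UTC.
in_utc   <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
in_local <- format(x, "%Y-%m-%d %H:%M:%S", tz = "Asia/Kolkata")

print(in_utc)
print(in_local)
```

The second value falls on the 19th even though both lines describe the same moment.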

Formatting dates with R

I need to format a date variable with R. I get the current date with a simple date().
today <- date()
today
> "Mon Oct 10 1:00 2016"
I need to format this 'today' variable into a string with a specific format. Below is an example of what the string should look like.
string <- "10/10/2016 1:00 PM EDT"
string
> "10/10/2016 1:00 PM EDT"
So the question is how do you format a character string that looks like "Mon Oct 10 1:00 2016" into "10/10/2016 1:00 PM EDT".
I've tried working with strptime() and as.Date() functions but cannot figure out how to convert this string into a formatted date. Thanks for any help.
strftime(strptime(date(),
format = "%a %b %d %H:%M:%S %Y"),
format = "%m/%d/%Y %I:%M %p %Z")
See ?strptime for the full list of format codes.
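If the starting point is the current time rather than an existing string, it may be simpler to skip date() and format Sys.time() directly, since it is already a date-time object. A sketch, where the fixed instant and the "America/New_York" zone are my own assumptions chosen to mirror the example in the question:

```r
# Format the current time directly; no string parsing needed
now_string <- format(Sys.time(), "%m/%d/%Y %I:%M %p %Z")
print(now_string)

# A fixed instant, so the result does not depend on when this is run.
# The zone "America/New_York" is an assumption based on the "EDT" in the question.
fixed <- as.POSIXct("2016-10-10 13:00:00", tz = "America/New_York")
fixed_string <- format(fixed, "%m/%d/%Y %I:%M %p %Z")
print(fixed_string)
```

Note that %I pads the hour with a leading zero, so the hour appears as "01" rather than "1".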

Convert this Character string to Date in r

I have data in character format, such as " Mar 26, 2015 7:46:22 PM CDT ". I have to convert this into a Date format and also fetch the month, year and day separately.
Also, is it possible to convert it into an integer format?
Please advise.
You have to specify the format of the input string. Try:
as.POSIXct("Mar 26, 2015 7:46:22 PM CDT", format="%b %d, %Y %I:%M:%S %p")
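Once the string is parsed, the individual pieces can be pulled out with format() and turned into integers. The sketch below assumes that "CDT" means US Central time and therefore uses tz = "America/Chicago"; the trailing "CDT" in the string itself is not read by the format. An English locale is assumed for "Mar" and "PM".

```r
# tz = "America/Chicago" is an assumption based on the "CDT" suffix
x <- as.POSIXct("Mar 26, 2015 7:46:22 PM CDT",
                format = "%b %d, %Y %I:%M:%S %p",
                tz = "America/Chicago")

# Individual components as integers
year  <- as.integer(format(x, "%Y"))
month <- as.integer(format(x, "%m"))
day   <- as.integer(format(x, "%d"))

# Date only (calendar day in the same zone)
d <- as.Date(x, tz = "America/Chicago")

# Two possible "integer formats": a yyyymmdd number, or seconds since 1970-01-01 UTC
ymd  <- as.integer(format(x, "%Y%m%d"))
secs <- as.numeric(x)

print(c(year = year, month = month, day = day, ymd = ymd))
print(d)
```

Which integer form is useful depends on what it is for: yyyymmdd is readable, while seconds since the epoch are better for arithmetic.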

Converting time format in R

One of my time observations is "Wed Oct 31 13:42:46 2007".
I want to convert it to the format "2007-10-31 13:42:46" (removing "Wed").
How can I convert it?
As @thelatemail said:
x <- "Wed Oct 31 13:42:46 2007"
x <- as.POSIXct(x, format = "%a %b %d %T %Y", tz = "GMT")
# %a day of week, %b month of year, %d day of month as decimal, %T H:M:S, %Y year with century
x
[1] "2007-10-31 13:42:46 GMT"
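The value above is a POSIXct object, which prints with its time zone. If a plain character string in exactly the requested layout is needed, format() can produce it. A short sketch, assuming an English locale for "Wed" and "Oct":

```r
x <- as.POSIXct("Wed Oct 31 13:42:46 2007",
                format = "%a %b %d %T %Y", tz = "GMT")

# Convert back to a character string without the weekday or time zone
out <- format(x, "%Y-%m-%d %H:%M:%S")
print(out)
```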
