I am trying to convert the created_at string but it returns NA
as.POSIXct("Tue Jun 07 23:27:12 +0000 2016", format="%a %b %d %H:%M:%S +0000 %Y", tz="GMT")
[1] NA
Any idea what's going wrong? It seems fairly straightforward!
Parsing day and month names (%a, %b) depends on your LC_TIME locale. Mine is Slovene, so your example returns NA for me as well.
> as.POSIXct("Tue Jun 07 23:27:12 +0000 2016", format="%a %b %d %H:%M:%S +0000 %Y", tz="GMT")
[1] NA
However, if I change the date to Slovene (Tor = torek = Tuesday)
> as.POSIXct("Tor Jun 07 23:27:12 +0000 2016", format="%a %b %d %H:%M:%S +0000 %Y", tz="GMT")
[1] "2016-06-07 23:27:12 GMT"
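In short, change your LC_TIME locale to English and you're set. The locale name below is the Windows one; on Linux or macOS it is typically "en_US.UTF-8" or "C".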
> Sys.setlocale("LC_TIME", "English")
[1] "English_United States.1252"
> as.POSIXct("Tue Jun 07 23:27:12 +0000 2016", format="%a %b %d %H:%M:%S +0000 %Y", tz="GMT")
[1] "2016-06-07 23:27:12 GMT"
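If you would rather not leave the session in a different locale, the switch can be made temporary. The following is a minimal sketch, not a definitive implementation: parse_created_at is a hypothetical helper name, and it assumes the "C" locale (which uses English day and month names) is acceptable for parsing.

```r
# Hypothetical helper: parse Twitter-style timestamps under the "C" locale,
# then restore whatever LC_TIME setting was active before the call.
parse_created_at <- function(x, tz = "GMT") {
  old_locale <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old_locale), add = TRUE)
  Sys.setlocale("LC_TIME", "C")  # "C" uses English day and month names
  # %z consumes the "+0000" offset instead of matching it as a literal
  as.POSIXct(x, format = "%a %b %d %H:%M:%S %z %Y", tz = tz)
}

parsed <- parse_created_at("Tue Jun 07 23:27:12 +0000 2016")
format(parsed, "%Y-%m-%d %H:%M:%S", tz = "GMT")
```

Because the previous locale is restored in on.exit, the rest of the session keeps formatting dates in the user's own language.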
Here is a solution that doesn't involve changing your locale:
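library(dplyr)
library(magrittr)  # for set_colnames()

# Parses a single created_at string without relying on the locale.
# Note: returns a formatted character string, not a POSIXct object.
twitter_to_POSIXct <- function(x, timezone = Sys.timezone()) {
  x %>%
    strsplit("\\s+") %>%
    unlist %>%
    t %>%
    as.data.frame(stringsAsFactors = FALSE) %>%
    set_colnames(c("week_day", "month_abb",
                   "day", "hour", "tz",
                   "year")) %>%
    # month.abb is always English, so this lookup is locale-independent
    mutate(month_num = match(month_abb, month.abb)) %>%
    mutate(date_str = paste0(year, "-", month_num, "-", day, " ",
                             hour)) %>%
    # The "+0000" field is not a valid time zone name, so parse as UTC
    mutate(date = format(as.POSIXct(date_str, tz = "UTC"),
                         tz = timezone)) %>%
    pull(date)
}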
twitter_to_POSIXct("Tue Jun 07 23:27:12 +0000 2016")
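The same idea can be sketched in base R so that it works on a whole vector and returns POSIXct values. This is only an illustration under stated assumptions: twitter_to_utc is a hypothetical name, every element is assumed to have the six whitespace-separated fields shown above, and the offset is assumed to always be +0000.

```r
# Hypothetical helper: locale-independent parsing of Twitter-style timestamps.
# Assumes each string has six fields: weekday, month, day, time, offset, year,
# and that the offset is always "+0000" (UTC).
twitter_to_utc <- function(x) {
  parts <- do.call(rbind, strsplit(x, "\\s+"))
  # month.abb holds English abbreviations regardless of the current locale
  month_num <- match(parts[, 2], month.abb)
  iso <- sprintf("%s-%02d-%s %s", parts[, 6], month_num, parts[, 3], parts[, 4])
  as.POSIXct(iso, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
}

res <- twitter_to_utc(c("Tue Jun 07 23:27:12 +0000 2016",
                        "Mon Feb 01 09:11:55 +0000 2016"))
res
```

Empty strings would need to be filtered out first (for example with nzchar), since they do not split into six fields.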
Related
How can I correctly convert a string like "Fri Jul 26 10:58:25 CEST 2019" to a date? The time zone is not the same every time.
as.Date("Fri Jul 26 10:58:25 CEST 2019", "%a %b %d %H:%M:%S CEST %Y")
What is the correct placeholder for the timezone (CEST, CET, ...)? Or how can I ignore the timezone? It is not important for my use case.
Maybe I'm confused by the question but if time zone isn't important then just omit it?
as.Date("Fri Jul 26 10:58:25 2019", "%a %b %d %H:%M:%S %Y")
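Or, if you need it specified explicitly, pass a time zone name rather than an abbreviation (an abbreviation such as "CEST" is generally not recognized; "Europe/Berlin" is used here only as an example of a zone that observes CEST):
as.Date("Fri Jul 26 10:58:25 2019", "%a %b %d %H:%M:%S %Y", tz = "Europe/Berlin")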
To strip the time zone abbreviation (CEST, CET, ...) with gsub, where x is the input string:
as.Date(gsub("(:[[:digit:]][[:digit:]]) ([[:alpha:]]*)", "\\1", x), "%a %b %d %H:%M:%S %Y")
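As a worked example of the stripping approach, here is a small sketch on two inputs with different abbreviations. The second input string is made up for illustration, and the temporary switch to the "C" locale is an assumption added so that %a and %b match English names on any machine.

```r
x <- c("Fri Jul 26 10:58:25 CEST 2019",
       "Mon Jan 07 08:15:00 CET 2019")   # second value is illustrative only

# Drop the alphabetic token that sits between the time and the year
no_tz <- sub("([0-9]{2}:[0-9]{2}:[0-9]{2}) [[:alpha:]]+ ", "\\1 ", x)

old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")            # English names for %a and %b
dates <- as.Date(no_tz, format = "%a %b %d %H:%M:%S %Y")
Sys.setlocale("LC_TIME", old_locale)     # restore the previous setting

dates
```

Since the result is a Date, the time of day and the zone are discarded, which matches the "time zone is not important" requirement.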
This is my date to convert:
"16:00:00 CT 08 Apr 2018"
and this is my attempt:
x <- "16:00:00 CT 08 Apr 2018"
Sys.setlocale(category = "LC_ALL", locale = "English_United States.1252")
as.POSIXct(x, format = '%H:%M:%S %A %d %b %Y')
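and it returns NA.
%A matches a full weekday name, not a time zone abbreviation, so it cannot match "CT". Instead, we can put CT into the format as literal text: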
as.POSIXct(x, format = '%H:%M:%S CT %d %b %Y')
Alternatively, remove the string CT and then parse the rest:
as.POSIXct(paste(unlist(strsplit(x, " CT ")), collapse = " "), format = '%H:%M:%S %d %b %Y')
[1] "2018-04-08 16:00:00 GMT"
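Both variants above discard the zone and parse the time in the session's default time zone. If the zone matters, it can be supplied through tz. The sketch below assumes that "CT" means US Central Time ("America/Chicago"), which the question does not confirm, and that LC_TIME uses English month abbreviations.

```r
x <- "16:00:00 CT 08 Apr 2018"

# Replace the literal " CT " token with a single space
clean <- sub(" CT ", " ", x, fixed = TRUE)

# Assumption: "CT" stands for US Central Time
central <- as.POSIXct(clean, format = "%H:%M:%S %d %b %Y",
                      tz = "America/Chicago")

format(central, "%Y-%m-%d %H:%M:%S %Z")            # wall-clock time in Chicago
format(central, "%Y-%m-%d %H:%M:%S", tz = "UTC")   # same instant shown in UTC
```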
I have data of the form:
[1] "Mon Feb 01 09:11:55 +0000 2016" "Mon Feb 01 09:12:11 +0000 2016" ""
[4] "Mon Feb 01 09:14:25 +0000 2016" "" "Mon Feb 01 09:15:40 +0000 2016"
and I want to plot it using R.
I want an hourly plot of counts, so that all timestamps between 9 and 10 AM are counted in one bucket, and so on. The data span several days, but the date is unimportant; only the hour matters. I might also want to change the bucket width from an hour to, say, 30 minutes.
I've tried various things but I'm a little out of my depth and would be very grateful for a few basic steps to get it to work.
I've tried:
str <- strptime(dt, "%a %b %d %H:%M:%S %z %Y", tz = "GMT")
# head(str,3)
( dt.gmt <- as.POSIXct(str, tz = "GMT") )
format(dt.gmt, tz = "EST", usetz = TRUE)
hms <- format(dt.gmt , format = "%H:%M:%S")
hms<-as.numeric(hms)
head(hms,3)
hms <- table(cut(hms, breaks="hour"))
which gives the error:
Error in breaks + 1 : non-numeric argument to binary operator
I've also tried:
aggdata <-aggregate(hms, by=(hms), FUN=mean, na.rm=TRUE)
which gives:
Error in aggregate.data.frame(as.data.frame(x), ...) : 'by' must be a list
OK, I just tried this; maybe it can help you:
dt <- c("Mon Feb 01 09:11:55 +0000 2016", "Mon Feb 01 10:12:11 +0000 2016",
        "Mon Feb 01 09:21:55 +0000 2016")
df <- data.frame(time = dt,
                 id = c(1, 3, 2))
# Strip the leading weekday and the "+0000" offset, then parse the rest
df$time <- as.POSIXct(gsub("^.+? | \\+\\d{4}", "", df$time),
                      format = "%b %d %H:%M:%S %Y", tz = "GMT")
# Extract the hour of day as a two-digit string
df$hour <- format(df$time, format = "%H")
df
# Count the rows that fall in each hour
pivot <- aggregate(df$id, by = list(df$hour), FUN = length)
pivot
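For the 30-minute variant the question mentions, one possible sketch is to compute minutes since midnight and bucket them numerically. The sample timestamps below are made up for illustration, and bucket_width is a hypothetical parameter name. The time field is extracted with a regular expression, so this does not depend on the locale and tolerates the empty strings in the data.

```r
dt <- c("Mon Feb 01 09:11:55 +0000 2016", "Mon Feb 01 09:12:11 +0000 2016", "",
        "Mon Feb 01 09:44:25 +0000 2016", "", "Mon Feb 01 10:15:40 +0000 2016")
dt <- dt[nzchar(dt)]                     # drop empty entries

# Pull out the HH:MM:SS field; the date is ignored on purpose
hms <- regmatches(dt, regexpr("[0-9]{2}:[0-9]{2}:[0-9]{2}", dt))
parts <- do.call(rbind, strsplit(hms, ":", fixed = TRUE))
minutes <- as.integer(parts[, 1]) * 60L + as.integer(parts[, 2])

bucket_width <- 30L                      # minutes per bucket; 60L gives hourly
bucket_start <- (minutes %/% bucket_width) * bucket_width
labels <- sprintf("%02d:%02d", bucket_start %/% 60L, bucket_start %% 60L)

counts <- table(labels)
counts
```

The resulting table can be passed to barplot(counts) to get the plot of counts per bucket.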
I'm having some problems converting string vectors into date format. I have found a lot of information here and here but they haven't worked in my case (R 3.2.3).
> strptime("Fri Feb 05 14:10:10 +0000 2016",
format="%a %b %d %H:%M:%S %z %Y", tz="GMT")
[1] NA
I have a spreadsheet full of data with dates that look like this:
Mon Jul 16 15:20:22 +0000 2012
Is there a way to convert these to R dates (preferably PST) without using regular expression or is there no other way? I'd appreciate ideas on doing this conversion efficiently.
Sure, just use strptime() to parse time from strings:
R> strptime("Mon Jul 16 15:20:22 +0000 2012",
+ format="%a %b %d %H:%M:%S %z %Y")
[1] "2012-07-16 10:20:22 CDT"
R>
which uses my local timezone (CDT). If yours is Pacific, you can set it explicitly as in
R> strptime("Mon Jul 16 15:20:22 +0000 2012",
+ format="%a %b %d %H:%M:%S %z %Y", tz="America/Los_Angeles")
[1] "2012-07-16 08:20:22 PDT"
R>
which looks right with a 7 hour delta to UTC.
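To make the 7 hour delta concrete, here is a small sketch showing that the time zone only changes how an instant is displayed, not the instant itself. The timestamp is written in ISO form so that the example does not depend on the locale.

```r
utc <- as.POSIXct("2012-07-16 15:20:22", tz = "UTC")

# Display the same instant as Pacific wall-clock time
pacific <- format(utc, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles")
pacific

# Alternatively, keep a POSIXct and only change the zone it is displayed in
local_view <- utc
attr(local_view, "tzone") <- "America/Los_Angeles"
as.numeric(local_view) == as.numeric(utc)  # same instant, different display
```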
There's nearly a verbatim example of how to do this in the Examples section of ?strptime:
# ?strptime example:
## An RFC 822 header (Eastern Canada, during DST)
strptime("Tue, 23 Mar 2010 14:36:38 -0400", "%a, %d %b %Y %H:%M:%S %z")
# your data...
strptime("Mon Jul 16 15:20:22 +0000 2012", "%a %b %d %H:%M:%S %z %Y")
This can also be done with the lubridate package from the tidyverse:
library(lubridate)
parse_date_time("Mon Jul 16 15:20:22 +0000 2012", orders = "amdHMSzY")
which is what I prefer.