I am currently looking at some data I uploaded from the EPA website but the date is formatted a little funny: 20170101T2300-0500
I tried reformatting the date and time at once which did not work so I split the column by "T" and successfully reformatted the date but when I entered
Df$time<-strptime(as.character(Df$time),"%I%M%z")
Df$time<- format(Df$time, "%I:%M:%S")
The time column turned into NAs. I read that %z is the "offset to Greenwich" specifier but for output only, so I'm unsure whether I used it in the correct context.
You can parse the whole string directly (the result is printed in your local time zone):
strptime("20170101T2300-0500", "%Y%m%dT%H%M%z")
[1] "2017-01-01 22:00:00"
%Y - Year
%m - Month
%d - day
T - strict text delimiter
%H - Hours
%M - Minutes
%z - signed offset
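To get a result that does not depend on the machine's time zone, one option is to pass tz explicitly. The following is a minimal sketch assuming the string format from the question; the variable names are illustrative, and "Etc/GMT+5" (the tz database name for a fixed UTC-5 offset) is assumed to be available on your system.

```r
# Illustrative input in the EPA format from the question
stamp <- "20170101T2300-0500"

# Parse date, time and UTC offset in one step; store the instant in UTC
parsed <- as.POSIXct(stamp, format = "%Y%m%dT%H%M%z", tz = "UTC")

# 23:00 at UTC-5 is 04:00 UTC on the following day
format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Show only the clock time as it was written, i.e. at a fixed UTC-5 offset
# (note the reversed sign in the POSIX-style zone name)
format(parsed, "%H:%M", tz = "Etc/GMT+5")
```

Keeping the value as POSIXct and formatting it only for display avoids splitting the column on "T" in the first place.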
I currently have data as such:
[image: screenshot of the data]
I wish to change the 'date' column to date type. (it is now in character).
I have tried the code below but it gives me 'NA' as the result.
as.Date(data$Date, format = "%a %d-%m-%Y %I:%M %p")
Am I making a mistake in the way I am formatting the date to suit the format in my data?
Unfortunately, we don't have your data (we have a picture of your data, but the only way to use that is to transcribe it manually - best to include data as text in your questions).
Anyway, using a brief example in the same format:
dates <- c("Wed 25-Apr-2018 3:20 PM", "Thu 10-Mar-2022 10:53 AM")
We can get the dates by doing:
as.Date(dates, "%a %d-%b-%Y %I:%M %p")
#> [1] "2018-04-25" "2022-03-10"
Note though that this does not preserve the time. To keep it, use one of R's built-in date-time classes: strptime returns a POSIXlt object, which as.POSIXct() converts to POSIXct if needed. We can get this with:
strptime(dates, "%a %d-%b-%Y %I:%M %p")
#> [1] "2018-04-25 15:20:00 BST" "2022-03-10 10:53:00 GMT"
I think the main problem was that you were using %m for the month, but this only parses months in decimal number format. You need %b for text-abbreviations of months.
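A short sketch of the difference, assuming an English (or C) locale so that the weekday names, month abbreviations and AM/PM markers are recognised; the variable names are illustrative and tz = "UTC" is an arbitrary choice to make the result reproducible:

```r
dates <- c("Wed 25-Apr-2018 3:20 PM", "Thu 10-Mar-2022 10:53 AM")

# %m expects a numeric month, so "Apr" and "Mar" fail to parse and give NA
wrong <- as.Date(dates, format = "%a %d-%m-%Y %I:%M %p")

# %b matches the abbreviated month name; as.POSIXct keeps the time of day
right <- as.POSIXct(dates, format = "%a %d-%b-%Y %I:%M %p", tz = "UTC")

wrong
format(right, "%Y-%m-%d %H:%M")
```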
I've just read about the difference between POSIXlt and POSIXct and it is said that POSIXlt is a mixed text and character format like "May, 6 1985", "1990-9-1" or "1/20/2012". When I try such kind of things I get an error
as.POSIXlt("May, 6 1985")
# character string is not in a standard unambiguous format
(How) can dates in the formats quoted above be converted to POSIXlt? Here are sources saying that such formats work (if I understand them correctly): 1, 2.
Read ?strptime for all the details of specifying a time format. In this case you want %b (for month name), %d (day of month), %Y (4-digit year). (This will only work with an English locale setting as the month names are locale-specific.)
as.POSIXlt("May, 6 1985", format = "%b, %d %Y")
If you have input in mixed formats, you can use parse_date_time from the lubridate package.
x <- c("May, 6 1985", "1990-9-1", "1/20/2012")
y <- lubridate::parse_date_time(x, c("ymd", "mdy", "dmy"))
str(y)
# POSIXct[1:3], format: "1985-05-06" "1990-09-01" "2012-01-20"
Note the order of c("ymd", "mdy", "dmy"): it determines which format is tried first, and the first one that fits is used. For example, 6-1-2000 is valid as mdy before dmy is considered, so it is parsed as the first of June and not the sixth of January.
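The "first format that fits wins" idea can be sketched in base R. This is not lubridate's algorithm, only an illustration of why the order of candidate formats matters; parse_first_match is a hypothetical helper, and an English locale is assumed for %b.

```r
# Hypothetical helper: try each strptime format in turn and keep, for every
# element, the result of the first format that parses it.
parse_first_match <- function(x, formats, tz = "UTC") {
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (fmt in formats) {
    todo <- is.na(out)            # elements not parsed by an earlier format
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

x <- c("May, 6 1985", "1990-9-1", "1/20/2012")
mixed <- parse_first_match(x, c("%b, %d %Y", "%m/%d/%Y", "%Y-%m-%d"))
format(mixed, "%Y-%m-%d")

# The same ambiguous string gives different dates depending on format order
month_first <- parse_first_match("6-1-2000", c("%m-%d-%Y", "%d-%m-%Y"))
day_first   <- parse_first_match("6-1-2000", c("%d-%m-%Y", "%m-%d-%Y"))
format(c(month_first, day_first), "%Y-%m-%d")
```

Listing the most specific formats first keeps a permissive format from capturing strings it was not meant for.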
I currently have a vector of dates that are in the following format:
a <- c("Wednesday 26th May 2021","Thursday 27th May 2021")
I've tried to get it into ISO 8601 using the following:
as.Date(a, "%I %d%S %F %Y")
But I'm not 100% certain about the syntax of writing dates.
Any thoughts are appreciated!
You can remove the date suffixes and use as.Date -
# Added extra dates whose suffix is not "th".
a <- c("Wednesday 26th May 2021","Thursday 27th May 2021",
'Tuesday 1st June 2021', 'Monday 31st May 2021')
as.Date(sub('(?<=\\d)(th|rd|st|nd)', '', a, perl = TRUE), '%A %d %b %Y')
#[1] "2021-05-26" "2021-05-27" "2021-06-01" "2021-05-31"
Read ?strptime for different format specification.
If you are open to packages, lubridate::dmy works directly.
lubridate::dmy(a)
#[1] "2021-05-26" "2021-05-27" "2021-06-01" "2021-05-31"
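The lookbehind (?<=\\d) in the pattern is what keeps the substitution away from letters inside the day and month names. A small sketch of why it matters, using made-up dates and assuming an English locale for %A and %B:

```r
a <- c("Sunday 1st August 2021", "Monday 2nd August 2021", "Tuesday 3rd August 2021")

# Without the lookbehind, the first match can fall inside the weekday name
naive <- sub("(st|nd|rd|th)", "", a)
naive[1]   # the "nd" of "Sunday" was removed, not the "st" of "1st"

# With the lookbehind, only a suffix that directly follows a digit is removed
clean <- sub("(?<=\\d)(st|nd|rd|th)", "", a, perl = TRUE)
parsed <- as.Date(clean, format = "%A %d %B %Y")

clean
parsed
```

as.Date prints in ISO 8601 (YYYY-MM-DD) by default, so no further formatting is needed for that target.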
I have a date in a semi-odd format. It's written like:
Tue Oct 11 20:56:33 +0000 2016
The date is a string. It records the day and time a tweet was sent, as part of a data set. I want to find the hour at which each of my tweets was sent. What is the best way to work with this? Is there a way to convert it to a date type easily, or is there a way to work with the string directly? Obviously I'd like to do something like:
format(strptime(testTweets$Date, format = "%m %d %H:%M%S %Y", "%H"))
But I'm not sure how to work with the day of the week or the +0000. Thanks for the help!
The above comment beat me to it, but try the following code:
strptime(x, format = "%a %b %d %H:%M:%S %z %Y")
[1] "2016-10-11 20:56:33"
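To get from the parsed value to the hour the question asks for, one possible follow-up step is sketched below. It assumes an English locale for %a and %b; x stands in for testTweets$Date, and tz = "UTC" is chosen so the hour is reported in UTC rather than local time.

```r
x <- "Tue Oct 11 20:56:33 +0000 2016"

# Parse once, fixing the time zone so the result is the same on every machine
parsed <- as.POSIXct(x, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

# Extract the hour as an integer, e.g. for tabulating tweets per hour
hour <- as.integer(format(parsed, "%H"))
hour
```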
You only need to check the documentation for strptime to figure out the correct format mask to use:
https://www.rdocumentation.org/packages/base/versions/3.4.3/topics/strptime
I am working on "Localization Data for Person Activity Data Set" dataset from UCI and in this data set there is a column of date and time(both in one column) with following format:
27.05.2009 14:03:25:777
27.05.2009 14:03:25:183
27.05.2009 14:03:25:210
27.05.2009 14:03:25:237
...
I am wondering if there is any way to convert this column to a timestamp using R.
First of all, we need to replace the colon separating the milliseconds from the seconds with a dot, otherwise the final step won't work (thanks to Dirk Eddelbuettel for this one). Since R prints the result with its own separators anyway, I'll take the quick route and replace all the colons with dots:
x <- "27.05.2009 14:03:25:777" # this is a simplified version of your data
y <- gsub(":", ".", x) # this is your vector with the aforementioned substitution
By the way, this is how your vector should look after gsub:
> y
[1] "27.05.2009 14.03.25.777"
Now, in order to have it show the milliseconds, you first need to adjust an R option and then use a function called strptime, which will convert your date vector to POSIXlt, an R-friendly format. Just do the following:
> options(digits.secs = 3) # this tells R you want it to consider 3 digits for seconds.
> strptime(y, "%d.%m.%Y %H.%M.%OS") # parse y; the time separators are dots because gsub replaced the colons
[1] "2009-05-27 14:03:25.777"
I learned this nice trick here. This other answer also says you can skip the options setting and use, for example, strptime(y, "%d.%m.%Y %H.%M.%OS3"), but it doesn't work for me. Henrik noted that the function's help page, ?strptime, states that the %OS3 bit is OS-dependent. I'm using an updated Ubuntu 13.04, and using %OS3 yields NA.
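An alternative sketch that replaces only the final colon, so the usual colon-separated time format can be kept. The regular expression assumes every value ends in exactly three millisecond digits, and tz = "UTC" is an arbitrary choice. Here %OS is used for input and %OS3 only for output, which is how ?strptime documents them.

```r
x <- c("27.05.2009 14:03:25:777", "27.05.2009 14:03:25:183")

# Turn only the last colon (before the milliseconds) into a decimal point
y <- sub(":([0-9]{3})$", ".\\1", x)

# %OS reads seconds including the fractional part
parsed <- as.POSIXct(y, format = "%d.%m.%Y %H:%M:%OS", tz = "UTC")

# %OS3 prints three decimal places. The digits are truncated rather than
# rounded, so a value may display one millisecond lower than the input
# because of binary floating-point representation.
format(parsed, "%Y-%m-%d %H:%M:%OS3")

# Differences between timestamps keep sub-second precision
diff(rev(parsed))
```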
When using strptime (or other POSIX-related functions such as as.Date), keep in mind some of the most common conversions used (edited for brevity, as suggested by DWin. Complete list at strptime):
%a Abbreviated weekday name in the current locale.
%A Full weekday name in the current locale.
%b Abbreviated month name in the current locale.
%B Full month name in the current locale.
%d Day of the month as decimal number (01–31).
%H Hours as decimal number (00–23). Times such as 24:00:00 are accepted for input.
%I Hours as decimal number (01–12).
%j Day of year as decimal number (001–366).
%m Month as decimal number (01–12).
%M Minute as decimal number (00–59).
%p AM/PM indicator in the locale. Used in conjunction with %I and not with %H.
%S Second as decimal number (00–61), allowing for up to two leap-seconds (but POSIX-compliant implementations will ignore leap seconds).
%U Week of the year as decimal number (00–53) using Sunday as the first day of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.
%w Weekday as decimal number (0–6, Sunday is 0).
%W Week of the year as decimal number (00–53) using Monday as the first day of week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.
%y Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19.
%Y Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC).
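A few of these conversions in action, as a small sketch with made-up values. Only locale-independent specifiers are used, so the results should not depend on the language settings.

```r
d <- as.Date("2009-05-27")

day_of_year <- format(d, "%j")   # day of year, zero-padded to three digits
weekday_num <- format(d, "%w")   # weekday as a number, Sunday is 0

# Two-digit years on input: 68 is read as 2068, 69 as 1969
two_digit <- as.Date(c("01/02/68", "01/02/69"), format = "%d/%m/%y")
centuries <- format(two_digit, "%Y")

day_of_year
weekday_num
centuries
```

The %y pivot is a common source of surprises with historical data, so using %Y with four-digit years is the safer choice where the input allows it.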