I currently have data as such:
data
I wish to convert the 'Date' column to Date type (it is currently character).
I have tried the code below but it gives me 'NA' as the result.
as.Date(data$Date, format = "%a %d-%m-%Y %I:%M %p")
Am I making a mistake in the way I am formatting the date to suit the format in my data?
Unfortunately, we don't have your data (we have a picture of your data, but the only way to use that is to transcribe it manually - best to include data as text in your questions).
Anyway, using a brief example in the same format:
dates <- c("Wed 25-Apr-2018 3:20 PM", "Thu 10-Mar-2022 10:53 AM")
We can get the dates by doing:
as.Date(dates, "%a %d-%b-%Y %I:%M %p")
#> [1] "2018-04-25" "2022-03-10"
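Note, though, that this does not preserve the time. To keep it, you probably want one of R's built-in date-time classes, POSIXct or POSIXlt. strptime returns a POSIXlt object (wrap the call in as.POSIXct if you need POSIXct, for example for a data frame column):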
strptime(dates, "%a %d-%b-%Y %I:%M %p")
#> [1] "2018-04-25 15:20:00 BST" "2022-03-10 10:53:00 GMT"
I think the main problem was that you were using %m for the month, but this only parses months in decimal number format. You need %b for text-abbreviations of months.
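As a small check of that diagnosis, the sketch below parses the same two example strings with both format strings. It assumes an English (or C) locale, because %a, %b and %p match locale-specific names, and it pins tz = "UTC" only so that the result does not depend on the machine's time zone.

```r
dates <- c("Wed 25-Apr-2018 3:20 PM", "Thu 10-Mar-2022 10:53 AM")

# %m expects a numeric month, so "Apr" and "Mar" fail to match and give NA
wrong <- as.Date(dates, format = "%a %d-%m-%Y %I:%M %p")

# %b matches abbreviated month names; as.POSIXct keeps the time of day
# (tz = "UTC" is an assumption made here for reproducibility)
right <- as.POSIXct(dates, format = "%a %d-%b-%Y %I:%M %p", tz = "UTC")

print(wrong)
print(right)
```

If the real data lives in a data frame, the same call can be assigned back to the column, e.g. data$Date <- as.POSIXct(data$Date, format = ..., tz = ...).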
Related
I've just read about the difference between POSIXlt and POSIXct and it is said that POSIXlt is a mixed text and character format like "May, 6 1985", "1990-9-1" or "1/20/2012". When I try such kind of things I get an error
as.POSIXlt("May, 6 1985")
# character string is not in a standard unambiguous format
(How) can dates in the formats quoted above be converted to POSIXlt? Here are sources saying that such formats work (if I get them right): 1, 2.
Read ?strptime for all the details of specifying a time format. In this case you want %b (for month name), %d (day of month), %Y (4-digit year). (This will only work with an English locale setting as the month names are locale-specific.)
as.POSIXlt("May, 6 1985", format = "%b, %d %Y")
If your input mixes several formats, you can use parse_date_time from the lubridate package.
x <- c("May, 6 1985", "1990-9-1", "1/20/2012")
y <- lubridate::parse_date_time(x, c("ymd", "mdy", "dmy"))
str(y)
# POSIXct[1:3], format: "1985-05-06" "1990-09-01" "2012-01-20"
A note on c("ymd", "mdy", "dmy"): the order matters, because an ambiguous string is converted with the first order that fits it. For example, 6-1-2000 is valid as mdy before dmy is considered, so it is read as the first of June and not the sixth of January.
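The "first match wins" idea can also be sketched in base R, without lubridate. This is not how parse_date_time is implemented; it is a minimal illustration under the assumption of an English locale (for "May"), and parse_mixed is a hypothetical helper name introduced only for this example.

```r
# Hypothetical helper: try each format in turn and keep the first one
# that parses each element; elements that never parse stay NA.
parse_mixed <- function(x, formats) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                      # only elements not yet parsed
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

x <- c("May, 6 1985", "1990-9-1", "1/20/2012")
y <- parse_mixed(x, c("%b, %d %Y", "%Y-%m-%d", "%m/%d/%Y"))
print(y)

# The order of formats decides how an ambiguous string is read
month_first <- parse_mixed("6-1-2000", c("%m-%d-%Y", "%d-%m-%Y"))
day_first   <- parse_mixed("6-1-2000", c("%d-%m-%Y", "%m-%d-%Y"))
print(month_first)  # read as 1 June 2000
print(day_first)    # read as 6 January 2000
```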
For my analysis, I need to add days of the week to the dates in RStudio. My data starts on the first day of January ("2019-01-01 00:00:00") and the time step is 5 minutes, so the second timestamp is "2019-01-01 00:05:00", and so on to the last day of the year. Unfortunately, some rows are missing; for example, in one case the next reading after 2019-05-01 10:05:00 is 2019-05-17 23:05:00. How can I assign days of the week to my dates?
How does this work? First, we parse the string into R's date-time class (POSIXlt, which is what strptime returns), and then format it back out in the layout we want.
posix <- strptime(x = "2019-01-01 00:00:00",
format = "%Y-%m-%d %H:%M:%S")
strftime(posix, format = "%Y-%m-%d %H:%M:%S %A")
Here, we're essentially exporting it in the exact same format, plus the %A conversion specification, which corresponds to the full weekday name.
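Because the weekday is derived from each timestamp on its own, missing rows do not matter. A minimal sketch on a made-up vector with the gap described in the question follows; the column names and tz = "UTC" are assumptions for illustration, and the weekday names printed by %A depend on the locale, so a locale-independent numeric weekday is included as well.

```r
stamps <- c("2019-01-01 00:00:00", "2019-01-01 00:05:00",
            "2019-05-01 10:05:00", "2019-05-17 23:05:00")

df <- data.frame(
  timestamp = as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
)

# Full weekday name (locale-dependent text)
df$weekday <- format(df$timestamp, "%A")

# Numeric weekday, 0 = Sunday ... 6 = Saturday (locale-independent)
df$wday <- as.POSIXlt(df$timestamp)$wday

print(df)
```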
I have a date in a semi-odd format. It's written like:
Tue Oct 11 20:56:33 +0000 2016
The date is a string. It shows the day and time a tweet was sent, as recorded in a data set. I want to find the hour at which each of my tweets was sent. What is the best way to work with this? Is there a way to convert it to a date type easily, or is there a way to work with the string directly for what I want to do? Obviously I'd like to do something like:
format(strptime(testTweets$Date, format = "%m %d %H:%M%S %Y", "%H"))
But I'm not sure how to work with the day of the week or the +0000. Thanks for the help!
The above comment beat me to it, but try the following code:
strptime(x, format = "%a %b %d %H:%M:%S %z %Y")
[1] "2016-10-11 20:56:33"
You only need to check the documentation for strptime to figure out the correct format mask to use:
https://www.rdocumentation.org/packages/base/versions/3.4.3/topics/strptime
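To get from the parsed value to the hour the question asks about, one option is to format the result with %H. The sketch below uses two made-up tweet timestamps, assumes an English locale for %a and %b, and fixes tz = "UTC" so the hour is reported in UTC rather than in the machine's local time zone.

```r
tweets <- c("Tue Oct 11 20:56:33 +0000 2016",
            "Wed Oct 12 03:05:10 +0000 2016")

# %z consumes the +0000 offset; tz = "UTC" is an assumption for this example
sent <- as.POSIXct(tweets, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

# Hour of day as an integer, ready for table() or grouping
tweet_hour <- as.integer(format(sent, "%H"))
print(tweet_hour)
print(table(tweet_hour))
```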
I have a data frame with two text columns which are essentially time-stamps columns namely BEFORE and AFTER and the format is 12/29/2016 4:29:00 PM. I want to compare if the gap between the BEFORE and AFTER time for each row is more than 5 minutes or not. Which package in R will allow subtraction between time stamp?
No need for extra packages, using base R you can achieve the date time comparison.
First coerce your character to a valid date time structure. Here I am using POSIXct:
d1= as.POSIXct("12/29/2016 4:29:00 PM",format="%m/%d/%Y %H:%M:%S")
d2= as.POSIXct("12/30/2016 5:29:00 PM",format="%m/%d/%Y %H:%M:%S")
Then to get the difference in minutes you can use difftime:
difftime(d1,d2,units="mins")
Note that all of these functions are vectorized, so the same code will work with vectors.
Managing AM/PM
The format above silently ignores the AM/PM marker, because %H expects a 24-hour clock. To take care of AM/PM, add %p to the format, which should be used in conjunction with %I, not %H:
d1= as.POSIXct("12/28/2016 11:53:00 AM",
format="%m/%d/%Y %I:%M:%S %p")
d2= as.POSIXct("12/28/2016 12:03:00 PM",
format="%m/%d/%Y %I:%M:%S %p")
difftime(d2,d1,units="mins")
## Time difference of 10 mins
For more help on date/time formats, read ?strptime.
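Applied to the data frame from the question, the same steps might look like the sketch below. The rows are made up for illustration, the column names BEFORE and AFTER come from the question, and tz = "UTC" is an assumption (any single time zone works as long as both columns use it).

```r
df <- data.frame(
  BEFORE = c("12/29/2016 4:29:00 PM", "12/28/2016 11:53:00 AM"),
  AFTER  = c("12/29/2016 4:33:00 PM", "12/28/2016 12:03:00 PM"),
  stringsAsFactors = FALSE
)

fmt <- "%m/%d/%Y %I:%M:%S %p"   # 12-hour clock with AM/PM marker
before <- as.POSIXct(df$BEFORE, format = fmt, tz = "UTC")
after  <- as.POSIXct(df$AFTER,  format = fmt, tz = "UTC")

# Gap in minutes for every row, then flag gaps longer than 5 minutes
df$gap_mins   <- as.numeric(difftime(after, before, units = "mins"))
df$over_5_min <- df$gap_mins > 5

print(df)
```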
I have a data.frame (originally a CSV) in R with dates in the following 3 formats:
2011-06-02T17:16:05Z
2012-06-02T17:16:05-07:00
6/2/11 17:16:05
which is year-month-day-time. I don't quite know what the -07:00 is, as it seems to be the same for all timestamps (except for some where it is -08:00), but I guess it's some type of time zone offset.
I am not quite sure what format these are (does anyone know?), but I need to convert it to this format:
6/2/11 17:16:05
which is year-month-day-time
I would like to do this in such a way so that all the dates in the CSV (in one and the same row) is converted to the second format. How can I accomplish this in R?
The full dataset can be found here.
Here's another attempt, assuming your data is text to start with:
test <- c("2011-06-02T17:16:05Z","2012-06-02T17:16:05-07:00")
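# Anything after the matched format (the Z or the -07:00) is ignored, so the clock time is kept as written
format(as.POSIXct(test, format="%Y-%m-%dT%H:%M:%S", tz="UTC"), "%m/%d/%y %H:%M:%S")
[1] "06/02/11 17:16:05" "06/02/12 17:16:05"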
You can try the following, where myDates would be the column of dates:
format(strptime(myDates, format="%Y-%m-%dT%H:%M:%S"), format= "%m/%d/%Y %H:%M:%S")
[1] "06/02/2011 17:16:05" "06/02/2012 17:16:05"
or with a 2-digit year:
# Note the lower-case %y in the output format
format(strptime(myDates, format="%Y-%m-%dT%H:%M:%S"), format= "%m/%d/%y %H:%M:%S")
[1] "06/02/11 17:16:05" "06/02/12 17:16:05"
As for the Z, that indicates UTC, i.e. GMT (think: London in winter).
The -07:00 indicates an offset of 7 hours behind UTC (think: Colorado / MST, or Pacific Daylight Time); -08:00 is 8 hours behind.
Please see here for more reference
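If the offsets should be honoured rather than ignored, one possible approach is to rewrite them into the +hhmm form that %z reads and then parse. The sketch below is an illustration under assumptions: normalise_offset is a hypothetical helper name, the output is expressed in UTC, and the third format from the question (6/2/11 17:16:05, which carries no offset) is not handled here.

```r
# Hypothetical helper: turn "Z" into "+0000" and "-07:00" into "-0700"
normalise_offset <- function(x) {
  x <- sub("Z$", "+0000", x)
  sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)
}

stamps <- c("2011-06-02T17:16:05Z", "2012-06-02T17:16:05-07:00")

parsed <- as.POSIXct(normalise_offset(stamps),
                     format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

# Same instant, shown in UTC: the second stamp moves 7 hours forward
converted <- format(parsed, "%m/%d/%y %H:%M:%S", tz = "UTC")
print(converted)
```

Whether to keep the clock time as written or to convert everything to one time zone depends on the analysis; the two choices give different results for the second timestamp.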