How to change a column of characters into date form? - r

I currently have data as such:
data
I wish to change the 'date' column to date type. (it is now in character).
I have tried the code below but it gives me 'NA' as the result.
as.Date(data$Date, format = "%a %d-%m-%Y %I:%M %p")
Am I making a mistake in the way I am formatting the date to suit the format in my data?

Unfortunately, we don't have your data (we have a picture of your data, but the only way to use that is to transcribe it manually - best to include data as text in your questions).
Anyway, using a brief example in the same format:
dates <- c("Wed 25-Apr-2018 3:20 PM", "Thu 10-Mar-2022 10:53 AM")
We can get the dates by doing:
as.Date(dates, "%a %d-%b-%Y %I:%M %p")
#> [1] "2018-04-25" "2022-03-10"
Note though that this does not preserve the time, and to do this you probably want to use R's built in date-time format, POSIXct. We can get this with:
strptime(dates, "%a %d-%b-%Y %I:%M %p")
#> [1] "2018-04-25 15:20:00 BST" "2022-03-10 10:53:00 GMT"
I think the main problem was that you were using %m for the month, but this only parses months in decimal number format. You need %b for text-abbreviations of months.

Related

Can format like "May 17, 2017", "17/5/2017" or "17-5-17 05:24:39" be used in as.POSIXlt?

I've just read about the difference between POSIXlt and POSIXct and it is said that POSIXlt is a mixed text and character format like "May, 6 1985", "1990-9-1" or "1/20/2012". When I try such kind of things I get an error
as.POSIXlt("May, 6 1985")
# character string is not in a standard unambiguous format
(How) can we dates with format as quoted above put forward to POSIXlt? Here are sources saying that such format works (if I get them right): 1, 2.
Read ?strptime for all the details of specifying a time format. In this case you want %b (for month name), %d (day of month), %Y (4-digit year). (This will only work with an English locale setting as the month names are locale-specific.)
as.POSIXlt("May, 6 1985", format = "%b, %d %Y")
If you have mixed input of formates you can use parse_date_time from the lubridate package.
x <- c("May, 6 1985", "1990-9-1", "1/20/2012")
y <- lubridate::parse_date_time(x, c("ymd", "mdy", "dmy"))
str(y)
# POSIXct[1:3], format: "1985-05-06" "1990-09-01" "2012-01-20"
note on c("ymd", "mdy", "dmy") as this determines the order on first found first converted. Consider 6-1-2000 will encounter mdy as valid before dmy so this means it will be first of June and not sixth of January.

How to add days of the week to the existing date's column in Rstudio

For my analysis, I need to add days of the week to the dates in Rstudio. My data starts from the first day of January ( 2019-01-01 00:00:00) and time steps are 5 minutes therefore, the second term is "2019-01-01 00:05:00 to the last day of the year. Unfortunately, some rows are missing and for example, in one case next reading after 2019-05-01 10:05:00 can be 2019-05-17 23:05:00. How can I assign days of the week to my dates?
How's this work? First, we convert the date into the magic R format, and then extract it back in the format we want.
posix <- strptime(x = "2019-01-01 00:00:00",
format = "%Y-%m-%d %H:%M:%S")
strftime(posix, format = "%Y-%m-%d %H:%M:%S %A")
Here, we're essentially exporting it in the exact same format, plus the %A operator that corresponds to the weekday.

Extracting the hour from a date

I have a date in a semi-odd format. It's written like:
Tue Oct 11 20:56:33 +0000 2016
The date is a string. The date shows what time and day a tweet was sent as in a data set. I want to find what hour each of my tweets were sent out. What is the best way to work with this? Is there a way to convert it to a date type easily, or is there a way to work with that string with what I want to do? Obviously I'd like to something like:
format(strptime(testTweets$Date, format = "%m %d %H:%M%S %Y", "%H"))
But I'm not sure how to work with the day of the week or the +0000. Thanks for the help!
The above comment beat me to it, but try the following code:
strptime(x, format = "%a %b %d %H:%M:%S %z %Y")
[1] "2016-10-11 20:56:33"
You only need to check the documentation for strptime to figure out the correct format mask to use:
https://www.rdocumentation.org/packages/base/versions/3.4.3/topics/strptime

Adding/subtracting date-time columns in data-frames

I have a data frame with two text columns which are essentially time-stamps columns namely BEFORE and AFTER and the format is 12/29/2016 4:29:00 PM. I want to compare if the gap between the BEFORE and AFTER time for each row is more than 5 minutes or not. Which package in R will allow subtraction between time stamp?
No need for extra packages, using base R you can achieve the date time comparison.
First coerce your character to a valid date time structure. Here I am using POSIXct:
d1= as.POSIXct("12/29/2016 4:29:00 PM",format="%m/%d/%Y %H:%M:%S")
d2= as.POSIXct("12/30/2016 5:29:00 PM",format="%m/%d/%Y %H:%M:%S")
Then to get the difference in minutes you can use difftime:
difftime(d1,d2,units="mins")
Note the all those functions are vectorized So the same code will work with vectors.
manage PM/AM
to take care of PM/AM we should add %p to the format that should be used in conjonction with %I not %H:
d1= as.POSIXct("12/28/2016 11:53:00 AM",
format="%m/%d/%Y %I:%M:%S %p")
d2= as.POSIXct("12/28/2016 12:03:00 PM",
format="%m/%d/%Y %I:%M:%S %p")
difftime(d2,d1,units="mins")
## Time difference of 10 mins
for more help about date/time format you can read ?strptime.

Converting between date formats in R?

I have a data.frame (CSV originally) in R with dates are in the following 3 formats:
2011-06-02T17:16:05Z
2012-06-02T17:16:05-07:00
6/2/11 17:16:05
which is year-month-day-time. I don't quite know what the -07:00 is, as it seems to be the same for all timestamps (except for some where it is -08:00), but I guess it's some type of time zone offset.
I am not quite sure what format these are (does anyone know?), but I need to convert it to this format:
6/2/11 17:16:05
which is year-month-day-time
I would like to do this in such a way so that all the dates in the CSV (in one and the same row) is converted to the second format. How can I accomplish this in R?
The full dataset can be found here.
Here's another attempt, assuming your data is text to start with:
test <- c("2011-06-02T17:16:05Z","2012-06-02T17:16:05-07:00")
format(as.POSIXct(test,format="%Y-%m-%dT17:%H:%M"),"%m/%d/%y %H:%M")
[1] "06/02/11 16:05" "06/02/12 16:05"
You can try the following, where myDates would be the column of dates
format(strptime(myDates, format="%Y-%m-%dT17:%H:%M"), format= "%m/%d/%Y %H:%M")
[1] "06/02/2011 16:05" "06/02/2012 16:05"
or with 2-digit year
# Note the lower-case %y at the end
format(strptime(myDates, format="%Y-%m-%dT17:%H:%M"), format= "%m/%d/%y %H:%M")
[1] "06/02/11 16:05" "06/02/12 16:05"
As for the Z, that indicates GMT (think: London).
the -7:00 indicates 7 hours back from GMT (think: Colorado / MST etc)
Please see here for more reference

Resources