as.Date doesn't retain the time in R

I don't understand why this doesn't work:
as.Date('2001-10-26 10:00:00', format='%Y-%m-%d %H:%M:%S')
It returns
"2001-10-26"
but I expected it to be
"2001-10-26 10:00:00".
I don't need anything complicated; just to convert a string to a date with a timestamp. Not sure why the format argument isn't working.
Thanks.

R has several ways (expressed as classes) to deal with dates and times. In your first snippet you are using the as.Date function, which converts its argument to the Date class. The string you are providing also contains time elements that do not belong in a Date, so the function keeps only the elements it is able to handle. As you can read in the documentation:
... (dates)... They are always printed following the rules of the current Gregorian
calendar, even though that calendar was not in use long ago (it was
adopted in 1752 in Great Britain and its colonies).
The time information is essentially lost.
See here:
a <- as.Date('2001-10-26 10:00:00', format='%Y-%m-%d %H:%M:%S')
format(a, format='%Y-%m-%d %H:%M:%S')
[1] "2001-10-26 00:00:00"
If you want to keep the time information and not only the date, you have to use a class that is able to store it. The classes are POSIXct and POSIXlt: the first is just a (large) number that counts the seconds since 1970-01-01, and the second is a list whose elements store seconds, minutes, hours, days, months, etc.
The functions (in base R) you have to use are: strptime (as pointed out in the comments by @nongkrong), as.POSIXlt, or as.POSIXct.
There are other functions in other packages (like chron and lubridate) but putting aside special classes developed by single packages (like period in lubridate) the key classes are the ones I've just illustrated here.
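As a minimal sketch of the difference between the classes mentioned above, the string from the question can be parsed with as.POSIXct and strptime. The tz = "UTC" argument is an assumption added here so the result does not depend on the local time zone.

```r
# Parse the string from the question into a date-time class (POSIXct).
# tz = "UTC" is an assumption made here so the result is reproducible.
x <- as.POSIXct("2001-10-26 10:00:00", format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
shown <- format(x, "%Y-%m-%d %H:%M:%S")   # "2001-10-26 10:00:00"

# strptime() parses the same way but returns the list-like POSIXlt class.
y <- strptime("2001-10-26 10:00:00", format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
y$hour                                    # 10

# as.Date() on the same input keeps only the calendar date.
d <- as.Date("2001-10-26 10:00:00", format = "%Y-%m-%d %H:%M:%S")
```

POSIXct is usually the more convenient of the two for data frame columns, since it is a plain numeric vector underneath.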

Related

Strange behaviour with parse_time()

I am trying to parse a string representing a period consisting of minutes, seconds, and milliseconds. My preferred functions for this would come from the readr package, where seconds and milliseconds may be seen jointly as partial seconds. Apparently, within this package there is a silent assumption that minutes are represented as two digits, i.e. padded with zeros.
readr::parse_time("1:23.456", format="%M:%OS") # doesn't work
readr::parse_time("01:23.456", format="%M:%OS") # works
The ms function from lubridate handles this straight out of the box:
lubridate::ms("1:23.456")
Any workaround for this so I can use parse_time and other functions in readr without resorting to pad with zeros myself?
The format specifications can be shown here:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html and
https://readr.tidyverse.org/reference/parse_datetime.html
The issue here is that %M expects the minutes as a two-digit number ("00" to "59" in the strptime documentation). Note that this is the exact specification you passed, so it is part of the specified format rather than a silent assumption of the package. As far as I'm aware, there is no minutes specifier accepting a variable number of digits that can be passed in the format argument. (This is unlike the day specifier, which would accept a single digit.)
Lubridate's ms function uses a different method to parse time strings. It is more flexible because it can handle many time formats, including the one in the question, without a format being given.
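If neither lubridate nor manual padding is wanted, one possible workaround is sketched below in base R. The helper name min_sec_to_seconds is hypothetical (it is not part of readr or lubridate), and the sketch assumes every input has exactly one colon separating minutes from seconds.

```r
# Hypothetical helper (not part of readr or lubridate): convert "m:ss.sss"
# strings to a number of seconds, whatever the number of minute digits.
min_sec_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

secs <- min_sec_to_seconds(c("1:23.456", "01:23.456"))

# Alternative: left-pad a single-digit minute so the "%M:%OS" format matches.
padded <- sub("^([0-9]):", "0\\1:", c("1:23.456", "12:03.5"))
# padded is c("01:23.456", "12:03.5")
# readr::parse_time(padded, format = "%M:%OS")  # not run here; requires readr
```

The padding variant is vectorised, so it can be applied to a whole column before handing it to parse_time.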

R as.POSIXct conversion to as.Date

I'm dealing with a data.frame object that holds in the first column a 'Date' field. Initially, I converted all the dates (which came in 'Date' as plain string) to POSIXct dates applying as.POSIXct(), but then I realized that as.Date was sufficient to my purpose. So I apply the as.Date() function to the POSIXct dates and I get a strange result: all the dates are scaled back one day ('2020-07-02 01:00:00' --> '2020-07-01'). Also tried with as.Date.POSIXct() with the same result.
Is there something I've missed in the conversion? Is this type of conversion legitimate?
Thanks
Without knowing exactly what the data looks like, my best guess is that it has to do with the origin date used in the as.Date() call when you convert from POSIXct. Hadley explains here that POSIXct values are really just doubles representing the number of seconds since 1970-01-01 (POSIXlt stores the same instant as a list of components). According to ?as.Date, if you're converting a value of class numeric, you should supply a value to the origin argument.
You could just reload your raw data, and play around with as.Date() and as.POSIXct() and see how the origin argument plays into an accurate conversion.
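Besides the origin, the time zone is worth checking when experimenting: ?as.Date documents a tz argument for POSIXct input whose default is UTC, so a local time shortly after midnight can fall on the previous day in UTC. The sketch below assumes the data was recorded in a zone ahead of UTC; Europe/Rome is only an illustrative choice, not something stated in the question.

```r
# Europe/Rome is an assumed zone, chosen only because it is ahead of UTC.
x <- as.POSIXct("2020-07-02 01:00:00", tz = "Europe/Rome")

d_default <- as.Date(x)                      # date of the instant in UTC
d_local   <- as.Date(x, tz = "Europe/Rome")  # date in the zone of the data

format(d_default)  # "2020-07-01"
format(d_local)    # "2020-07-02"
```

Passing the same tz to as.Date() that was used when building the POSIXct values keeps the calendar day unchanged.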

How to convert date and time into a numeric value

As a new and self-taught R user I am struggling with converting date and time character values into numbers so that I can group unique combinations of data. I'm hoping someone has come across this before and knows how I might go about it.
I'd like to convert a field of DateTime data (30/11/2012 14:35) to a numeric version of the date and time (seconds from 1970 maybe??) so that I can back reference the date and time if needed.
I have searched the R help and online help and only seem to be able to find POSIXct and strptime, which seem to convert the other way in the examples I've seen.
I will need to apply the conversion to a large dataset so I need to set the formatting for a field not an individual value.
I have tried to modify some python code but to no avail...
Any help with this, including pointers to tools I should read about would be much appreciated.
You can do this with base R just fine, but there are some shortcuts for common date formats in the lubridate package. Your strings are day/month/year followed by hours and minutes, so the matching parser is dmy_hm():
library(lubridate)
d <- dmy_hm("30/11/2012 14:35")
> as.numeric(d)
[1] 1354286100
From ?POSIXct:
Class "POSIXct" represents the (signed) number of seconds since the
beginning of 1970 (in the UTC timezone) as a numeric vector.
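For the base R route, a minimal sketch applied to a whole column could look like this. The data frame df and its column names are made up for illustration, and tz = "UTC" is an assumption; the zone the data was actually recorded in should be used instead.

```r
# A small stand-in for a data frame column of day/month/year strings.
df <- data.frame(DateTime = c("30/11/2012 14:35", "01/12/2012 09:00"),
                 stringsAsFactors = FALSE)

# tz = "UTC" is an assumption; use the zone the data was recorded in.
df$parsed  <- as.POSIXct(df$DateTime, format = "%d/%m/%Y %H:%M", tz = "UTC")
df$seconds <- as.numeric(df$parsed)   # seconds since 1970-01-01 UTC

# Back-reference: turn the numbers into date-times again.
restored <- as.POSIXct(df$seconds, origin = "1970-01-01", tz = "UTC")
format(restored, "%d/%m/%Y %H:%M")
```

Because the numeric column is just seconds since the epoch, it can be used directly for grouping and converted back whenever the readable form is needed.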

Want only the time portion of a date-time object in R

I have a vector of times in R, all_symbols$Time and I am trying to find out how to get JUST the times (or convert the times to strings without losing information). I use
strptime(all_symbol$Time[j], format="%H:%M:%S")
which for some reason assumes the date is today and returns
[1] "2013-10-18 09:34:16"
Date and time formatting in R is quite annoying. I am trying to get the time only without adding too many packages (really any--I am on a school computer where I cannot install libraries).
Once you use strptime you will of necessity get a date-time object and the default behavior for no date in the format string is to assume today's date. If you don't like that you will need to prepend a string that is the date of your choice.
@James' suggestion is equivalent to what I was going to suggest:
format(all_symbol$Time[j], format="%H:%M:%S")
The only package I know of that has time classes (i.e time of day with no associated date value) is package:chron. However I find that using format as a way to output character values from POSIXt objects lends itself well to functions that require factor input.
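Put together, the strptime and format steps above can be sketched as follows, using only base R. The input string is the time shown in the question, and tz = "UTC" is an assumption added to keep the example reproducible.

```r
# strptime() fills in today's date, but format() can drop it again.
# tz = "UTC" is an assumption to keep the example reproducible.
x <- strptime("09:34:16", format = "%H:%M:%S", tz = "UTC")
time_only <- format(x, "%H:%M:%S")   # character string "09:34:16"
```

The result is a plain character vector, so it no longer carries any date.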
In the decade since this was written, a package named “hms” has appeared that provides a class for hours, minutes, and seconds:
hms: Pretty Time of Day
Implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.
Came across the same problem recently and found this and other posts R: How to handle times without dates? inspiring. I'd like to contribute a little for whoever has similar questions.
If you only want to use base R, take advantage of as.POSIXct(..., format = "...") to bring your date-time into a standard format (as.Date would drop the time). Then you can use substr on the formatted string to extract the time, e.g. substr("2013-10-01 01:23:45 UTC", 12, 16) gives you 01:23.
If you can use package lubridate, functions like mdy_hms will make life much easier. And substr works most of the time.
If you want to compare the time, it should work if they are in Date or POSIXt objects. If you only want the time part, maybe force it into numeric (you may need to transform it back later). e.g. as.numeric(hm("00:01")) gives 60, which means it's 60 seconds after 00:00:00. as.numeric(hm("23:59")) will give 86340.
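When lubridate cannot be installed, the same "seconds after midnight" idea can be sketched in base R. The helper name seconds_of_day is hypothetical, and it assumes the input strings are in HH:MM:SS form.

```r
# Hypothetical helper: seconds since midnight for "HH:MM:SS" strings.
seconds_of_day <- function(x) {
  lt <- strptime(x, format = "%H:%M:%S", tz = "UTC")
  lt$hour * 3600 + lt$min * 60 + lt$sec
}

s <- seconds_of_day(c("00:01:00", "23:59:00"))   # 60 and 86340
s[1] < s[2]                                      # times compare as numbers
```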

Date conversion query

I've come into possession of hundreds of ascii data files where the date and time are separate columns like so:
date time
1-Jan-08 23:05
I need to convert this to a usable R Date object, subtract 8 hours (timezone conversion from UTC to Pacific) and then turn it into unix time. I need to do this since the data are collected every evening (from 5pm through 2am the following morning). So if I were to use regular date/time format it would confound days (day1 spans two days when in fact it was just one evening of data collection). I'd like to consider each day's events separately.
Using unixtime will allow me to calculate time differences in events that occur each day (I will probably retain a date field in addition to the unix time). Can someone suggest an efficient way to do this?
Here is some data to use (this is in UTC)
dummy=data.frame(date="1-Jan-08",time="23:05")
Paste them together (which works vectorised) and then parse, e.g.
datetime <- paste(dummy$date, dummy$time)
parsed <- strptime(datetime, "%d-%b-%y %H:%M")
which you can also assign as columns in the data frame.
Edit: strptime() has an optional tz="" argument you can use.
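The whole pipeline from the question (parse as UTC, shift by 8 hours, convert to unix time) might be sketched as below. The second row and the column names unix_utc, unix_shifted, and evening are added here for illustration; the fixed 8-hour offset follows the question and ignores daylight saving. Note that %b matches month abbreviations in the current locale, so "Jan" assumes an English or C locale.

```r
dummy <- data.frame(date = c("1-Jan-08", "2-Jan-08"),
                    time = c("23:05", "01:30"),
                    stringsAsFactors = FALSE)

# Parse as UTC; %b matches month abbreviations in the current locale.
utc <- as.POSIXct(paste(dummy$date, dummy$time),
                  format = "%d-%b-%y %H:%M", tz = "UTC")

# Fixed 8-hour shift as described in the question (ignores daylight saving).
shifted <- utc - 8 * 3600

dummy$unix_utc     <- as.numeric(utc)
dummy$unix_shifted <- as.numeric(shifted)
# Both rows belong to the same evening once shifted: "2008-01-01".
dummy$evening      <- format(shifted, "%Y-%m-%d", tz = "UTC")
```

Grouping by the evening column then treats a session that runs past midnight UTC as a single day.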

Resources