Behaviour of as.Date when format = "" - r

I have a simple question.
How does the function as.Date work?
Can anyone explain why a simple "" changes everything in the code below?
as.Date("03/04/2019")
#[1] "0003-04-20"
as.Date("03/04/2019", "")
#[1] "2019-07-16"

This is directly from the help page which you view with ?as.Date:
## S3 method for class 'character'
as.Date(x, format, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"),
optional = FALSE, ...)
x: an object to be converted.
format: character string. If not specified, it will try tryFormats one by one on the first non-NA element, and give an error if none works. Otherwise, the processing is via strptime.
So, in the first case we do not provide format. In this case as.Date tries the following formats: "year-month-day" and "year/month/day". The second one "works" in this case so we get the following return:
as.Date("03/04/2019")
[1] "0003-04-20"
"03" is interpreted as the year, "04" as the month and "2019" as the day. Since the day can have at most two digits, only the first two digits ("20") are used.
In the second case we provide format ourselves:
as.Date("03/04/2019", format = "")
[1] "2019-07-16"
If format is provided, as.Date does not try any formats. Instead, the result of strptime("03/04/2019", format = "") is converted to a Date and returned.
In the help page of strptime you can find the following:
For strptime the input string need not specify the date completely: it
is assumed that unspecified seconds, minutes or hours are zero, and an
unspecified year, month or day is the current one.
So you provide a format, but it does not contain any conversion specifications, so nothing is read from the string and the current date is returned.
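A quick way to check this is to compare the result with Sys.Date(). This is a sketch under the assumption that the code is not run right around midnight, where the two values could briefly disagree.

```r
# With an empty format, no field is read from the input string,
# so every date component falls back to the current one.
parsed <- as.Date("03/04/2019", format = "")

# Expected to be today's date rather than anything from 2019.
parsed
Sys.Date()
```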
In any case, you can simply specify the format yourself:
as.Date("03/04/2019", format = "%d/%m/%Y")
[1] "2019-04-03"

Related

parse date time function order format

Format of original input data is like: "1975M01", the variable name is Month.
The way we put the input into POSIXct date-time object is: parse_date_time(Month, "%Y%m").
I wonder why the 'M' in "1975M01" is ignored: how can the function recognize the year and month and automatically skip the M in the middle?
You can try as.POSIXct(). The string has no day, though; we probably want the first day of the month, so just paste 01 onto the end of the string(s). Then include the "M" literally in the format= string and, of course, add %d:
x <- as.POSIXct(paste0('1975M01', '01'), format='%YM%m%d')
x
# [1] "1975-01-01 CET"
The result is a POSIXct object:
class(x)
# [1] "POSIXct" "POSIXt"
This should also work with lubridate, but I can't install it to check.
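The same idea works on a whole column at once, and as.Date() is enough when no time of day is needed. A minimal base R sketch; the sample values and variable names below are made up for illustration.

```r
# Hypothetical sample values in the "1975M01" style from the question.
month_codes <- c("1975M01", "1975M02", "1976M12")

# Append a dummy day "01" and match the literal "M" in the format string.
month_start <- as.Date(paste0(month_codes, "01"), format = "%YM%m%d")
month_start
# [1] "1975-01-01" "1975-02-01" "1976-12-01"
```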

R: why a '%b.%Y' date class is not "Date"?

Sometimes I work with data like this:
sep-2018
From date like this:
Sys.Date()
[1] "2018-09-21"
To have this result, I generally use:
format(Sys.Date(),'%b-%Y')
But its class is not a date:
class(format(Sys.Date(),'%b-%Y'))
[1] "character"
Why is it not a date? Is it possible to get it with class "Date", and how?
An external library like zoo gives the same thing:
library(zoo)
class(format(as.yearmon(format(Sys.Date()), "%Y-%m-%d"), "%b.%Y"))
[1] "character"
Using "%m.%Y" also gives a character string, but at least it does not create an ordering issue, for example.
The format command takes the date and outputs a printable string based on the format you provide. To quote the documentation:
An object of similar structure to x containing character representations of the
elements of the first argument x in a common format, and in the current
locale's encoding.
Also, a Date variable is stored as a numeric type internally (number of days since 1970-01-01)
dput(Sys.Date())
#structure(17795, class = "Date")
structure(0, class = "Date")
#[1] "1970-01-01"
So to pinpoint the date, you need day, month and year fields. If you don't have all three, it will probably return NA or an error. Similarly for time classes. If you don't have the data then you can just use some dummy values, and use format to print only the fields you want.
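The dummy-value approach can be sketched as follows: keep a real Date (here the first day of the month) for sorting and arithmetic, and call format() only for display. The fixed date stands in for Sys.Date() so the output is reproducible.

```r
d <- as.Date("2018-09-21")

# A Date is a day count since 1970-01-01 with a class attribute.
unclass(d)
# [1] 17795

# Replace the day with a dummy "01" but keep the Date class.
month_start <- as.Date(format(d, "%Y-%m-01"))
class(month_start)
# [1] "Date"

# Format only when printing; the underlying value still sorts correctly.
format(month_start, "%m.%Y")
# [1] "09.2018"
```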
As Rohit says, format doesn't output a Date object, but a string in the format of your choice.
To get a Date object from a string like "sep-2018" you could use readr::parse_date().
(my_date <- readr::parse_date("sep-2018", format = '%b-%Y'))
#[1] "2018-09-01"
class(my_date)
#[1] "Date"

error in getting the correct date using strptime in R

I'm using strptime to extract the date, and the result has the wrong year.
Where is the error in the code below?
strptime('8/29/2013 14:13', "%m/%d/%y")
[1] "2020-08-29 PDT"
What are other ways to extract the date and time as separate columns?
The data I have is in this format - 8/29/2013 14:13
I want to split this into two columns, one is 8/29/2013 and the other is 14:13.
You have a four-digit year, so you need to use %Y:
strptime('8/29/2013 14:13', "%m/%d/%Y" )
[1] "2013-08-29 CEST"
Do you really want date and time in separate columns? It is usually much easier to deal with a single date-time object.
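The difference between %y and %Y can be checked directly on the parsed fields. This is a small sketch; tz = "UTC" is an assumption added here only to make the result independent of the local time zone.

```r
s <- "8/29/2013 14:13"

# %y consumes only two digits ("20"), which is read as the year 2020.
wrong <- strptime(s, "%m/%d/%y", tz = "UTC")
wrong$year + 1900
# [1] 2020

# %Y consumes all four digits; adding %H:%M also keeps the time.
right <- strptime(s, "%m/%d/%Y %H:%M", tz = "UTC")
format(right, "%Y-%m-%d %H:%M")
# [1] "2013-08-29 14:13"
```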
Here's one possibility to separate time and date from the string.
For convenience, we could first convert the string into a POSIX object:
datetime <- '8/29/2013 14:13'
datetime.P <- as.POSIXct(datetime, format='%m/%d/%Y %H:%M')
Then we can use as.Date() to extract the date from this object and use format() to display it in the desired format:
format(as.Date(datetime.P),"%m/%d/%Y")
#[1] "08/29/2013"
To store the time separately we can use, e.g., the strftime() function:
strftime(datetime.P, '%H:%M')
#[1] "14:13"
Like format(), strftime() is vectorized, so if datetime is a vector containing several character strings with date and time in the format described in the OP, it can be applied to the whole vector at once. Wrapping it in sapply(), as in the example below, gives the same result but is not required.
Example
datetime <- c('8/29/2013 14:13', '9/15/2014 12:03')
datetime.P <- as.POSIXct(datetime, format='%m/%d/%Y %H:%M')
format(as.Date(datetime.P),"%m/%d/%Y")
#[1] "08/29/2013" "09/15/2014"
sapply(datetime.P, strftime, '%H:%M')
#[1] "14:13" "12:03"
Hope this helps.
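For the two-column layout asked about in the question, format() accepts the whole vector at once, so both pieces can go straight into a data frame. This is a sketch with a hypothetical data frame name (parts) and tz = "UTC" assumed for reproducibility. Formatting the date-time directly also avoids as.Date(), which by default converts a POSIXct value using UTC and can shift the date for local times near midnight.

```r
datetime <- c("8/29/2013 14:13", "9/15/2014 12:03")
parsed <- as.POSIXct(datetime, format = "%m/%d/%Y %H:%M", tz = "UTC")

# One column per piece, both derived from the same parsed object.
parts <- data.frame(
  date = format(parsed, "%m/%d/%Y"),
  time = format(parsed, "%H:%M"),
  stringsAsFactors = FALSE
)
parts$date
# [1] "08/29/2013" "09/15/2014"
parts$time
# [1] "14:13" "12:03"
```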

select rows by element components of timestamp

I have a vector made up of timestamps as POSIXlt, format: "2015-01-05 15:00:00", which I extracted from a timeframe.
I want to reassign the vector, dropping all elements where Minutes != 00.
I've tried
vector <- vector[format(vector, "%M") == 00,]
which creates the following error of missing argument
Error in lapply(X = x, FUN = "[", ..., drop = drop) :
argument is missing, with no default
Also tried
vector <- vector["%M""== 00]
which leaves the command open: the quotes are unbalanced, so R keeps waiting for more input.
Since POSIX time is stored as number of elapsed seconds since 1 Jan 1970, I guess that I could do this by excluding from my vector all elements which are not multiple of 3600. I rather not use this approach though. Thank you in advance, I'm new to R.
format() returns a character vector, not a numeric one, so you should compare it to "00". The comma is also not needed, as a vector has only one dimension.
vector <- vector[format(vector, "%M") == "00"]
You could try
v2[!v2$min]
#[1] "2015-01-05 15:00:00 EST" "2015-01-05 15:00:30 EST"
Or your command should also work without the comma, provided the comparison is made against the string "00".
data
v1 <- c("2015-01-05 15:00:00", "2015-01-05 15:45:00", "2015-01-05 15:00:30")
v2 <- strptime(v1, '%Y-%m-%d %H:%M:%S')
Using:
v2 <- v2[v2$min == 0]
I reassign v2, excluding all elements where the minutes are not 0.
This was suggested by @akrun.
It does the selection while keeping the data type as POSIX.
There were two issues with the first option in the initial code:
1. The function format() returns character.
2. There was a "," before the last "]", which meant that the function was expecting another argument. That does not make sense for a vector, as explained by @balint.
The second option initially submitted had a few syntax mistakes. The correct syntax is the one in this answer, as suggested by @akrun.
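The two working variants can be compared side by side on the sample data from this thread. This is a sketch; tz = "UTC" is an assumption added so the printed values do not depend on the local time zone.

```r
v1 <- c("2015-01-05 15:00:00", "2015-01-05 15:45:00", "2015-01-05 15:00:30")
v2 <- strptime(v1, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# The number 00 is converted to the string "0", which is not "00".
"00" == 00
# [1] FALSE

by_string <- v2[format(v2, "%M") == "00"]  # compare text with text
by_field  <- v2[v2$min == 0]               # compare the numeric minute field

format(by_field)
# [1] "2015-01-05 15:00:00" "2015-01-05 15:00:30"
```

Both selections keep the POSIXlt class; add `& v2$sec == 0` to the condition if only timestamps exactly on the hour mark should remain.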

Strptime not able to convert date and time

My attempt to format date like this results in NA NA. Neither the date nor the time is getting converted. What am I doing wrong?
x <- strptime(c("2013-12-12", "08:43:24.967"),"%Y-%m-%d %H:%M:%OS")
With the format string that you have supplied, strptime expects a vector of date-time strings. You have a vector containing date and time as separate vector elements. This is incorrect.
Instead of passing c("2013-12-12", "08:43:24.967") (two elements, date then time), you need to pass "2013-12-12 08:43:24.967" (one element, date-time).
The data you have can be put in the proper format with paste:
strptime(paste("2013-12-12", "08:43:24.967"),format="%Y-%m-%d %H:%M:%OS")
[1] "2013-12-12 08:43:24"
The fractional seconds aren't printed above, because the default is to not print them. But the expression does capture them (with the default options(digits.secs=NULL)). They would be printed with the proper format string for output, or a specification of the number of digits to print (e.g. options(digits.secs=3))
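Both points can be checked with a short sketch: the two-element input yields NA for each element, while the pasted string keeps its fractional seconds in the sec field. tz = "UTC" and the variable names are assumptions added for illustration.

```r
fmt <- "%Y-%m-%d %H:%M:%OS"

# Two separate elements: neither one matches the full format.
separate <- strptime(c("2013-12-12", "08:43:24.967"), fmt, tz = "UTC")
is.na(separate)
# [1] TRUE TRUE

# One pasted element: the fractional seconds are kept in the sec field.
joined <- strptime(paste("2013-12-12", "08:43:24.967"), fmt, tz = "UTC")
joined$sec

# %OS3 asks for three decimal places on output. The last digit can be
# off by one because the value is truncated rather than rounded.
format(joined, "%Y-%m-%d %H:%M:%OS3")
```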
You need to pass a single date-time string as the first argument to the strptime function, followed by the date format. If you do not need the milliseconds, remove that part from the date string and use %S instead of %OS in the format string.
You can use a statement like this:
strptime("2013-12-12 08:43:24", "%Y-%m-%d %H:%M:%S")
