R global date issue: all dates are converted to 1975 [duplicate]

I've encountered a strange issue in R recently:
I have noticed two related errors that I believe have to do with a global date setting in R, but I don't know what the issue is. First, when I use lubridate's "year" function, I get an origin error:
library(lubridate)
library(tidyverse)
year(2008)
Error in as.POSIXlt.numeric(x, tz = tz(x)) : 'origin' must be supplied
When I check origin
lubridate::origin
It says:
[1] "1970-01-01 UTC"
Which is what I was under the impression it is supposed to say. When I try to use as.Date, it says that everything is the year 1975:
as.Date(2008)
[1] "1975-07-02"
as.Date(1910)
[1] "1975-03-26"
However, if I use
ymd("2008-01-01") #it works fine:
[1] "2008-01-01"
I'm at a loss as to what to do - any advice? Thanks!

Well, using as.Date(2008) you pass not a string, but a number. Therefore it takes 2008 days from 1970-01-01, which is... 1975-07-02! :-) Check that with:
as.Date("1970-01-01") + 2008
#> [1] "1975-07-02"
Similarly 1975-03-26 is 1910 days after the origin, 1970-01-01:
as.numeric(as.Date("1975-03-26"))
#> [1] 1910
Note that passing only a year as a string takes today's day and month:
as.Date("2008", format = "%Y")
#> [1] "2008-07-14"
As for lubridate, year() extracts the year from an existing date object; it does not create a date from a number or a string:
year(as.Date("1970-01-01"))
#> [1] 1970
If you want to convert a year as a string to a date, use e.g. parse_date_time():
library(lubridate)
parse_date_time("2008", orders = "%Y")
#> [1] "2008-01-01 UTC"
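To make the distinction concrete, here is a minimal base-R sketch. It assumes the goal is January 1 of the given year, and the variable name yr is only illustrative. An explicit origin is passed so that the first call does not depend on how the R session handles a missing origin.

```r
yr <- 2008

# A bare number is read as a count of days since the origin
as.Date(yr, origin = "1970-01-01")
#> [1] "1975-07-02"

# Building a full date string first gives the intended year
d <- as.Date(paste0(yr, "-01-01"))
d
#> [1] "2008-01-01"

# The year can then be read back from the Date object
as.integer(format(d, "%Y"))
#> [1] 2008
```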

Related

How do you extract the year from a date using lubridate package in R?

I tried using the ymd() function on the column reign_start, but it gives this warning:
Warning: All formats failed to parse. No formats found.
[1] NA
The years are represented in B.C.
ymd() is for converting character representations of dates into variables of class Date with format YYYY-MM-DD. Your dates are already of class Date.
The lubridate function you want is year(). I'm not sure how that handles BC dates but I assume you would like 0026-01-16 to return 26.
So try year(reign_start). Here are some examples which do require ymd() first (your data doesn't).
library(lubridate)
year(ymd("0026-01-16"))
[1] 26
# a leading minus sign (for BC) also parses, but the year still comes back as 26
year(ymd("-0026-01-16"))
[1] 26
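If the column is already of class Date, a base-R cross-check is possible without lubridate. This is only a sketch: the reign_start values below are made up for illustration, and it relies on as.POSIXlt() storing the year as an offset from 1900.

```r
# Hypothetical stand-in for the reign_start column
reign_start <- as.Date(c("0026-01-16", "1066-12-25"))

# POSIXlt stores years since 1900, so add 1900 back
yrs <- as.POSIXlt(reign_start)$year + 1900
yrs
#> [1]   26 1066
```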

Convert string to Date in R: month abbreviated "dd-mmm-YYYY" [duplicate]

I have a vector of strings which I need to convert to dates. This only works for some of the dates which are in the format of "dd-mmm-YYYY".
Example: this works: strptime("21-Sep-2017", format = "%d-%b-%Y")
This does not work and returns NA: strptime("21-Dec-2017", format = "%d-%b-%Y")
What am I doing wrong or not seeing?
This is because your locale is probably one where December is not abbreviated as Dec. Without changing your session settings, you could simply do
lubridate::parse_date_time("21-Dec-2017", orders = "d-b-Y", locale = "us")
[1] "2017-12-21 UTC"
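A base-R alternative, sketched here under the assumption that the month abbreviations in the data are English, is to switch LC_TIME to the "C" locale only for the duration of the parse. The helper name parse_c_locale is hypothetical, not part of any package.

```r
# Hypothetical helper: parse dates with English month names,
# restoring the previous LC_TIME setting on exit
parse_c_locale <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(x, format = format)
}

res <- parse_c_locale(c("21-Sep-2017", "21-Dec-2017"), "%d-%b-%Y")
res
#> [1] "2017-09-21" "2017-12-21"
```

Restoring the locale with on.exit() keeps the rest of the session, such as printed month names, unchanged.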

Extract time from factor column in R

I would like to extract the time from a table column sd_data$start in R with the following characteristics:
str(sd_data$start)
Factor w/ 122 levels "01/03/2017 08:00",..: 1 2 5 10 12 14 18 19 20 21 ...
I found similar questions on the forum but so far all the answers have only given me NAs or blank values (00:00:00) so I see no other option than raise the question again specifically for my dataset.
I have managed to extract the dates and move them to a new column in the table with little effort and I am very surprised how difficult it is (for me at least) to do the same for hours, minutes and seconds. I must be overlooking something.
sd_data$start_date <- as.Date(sd_data$start,format='%d/%m/%Y')
sd_data$start_time <-
Thanks in advance for helping me to find the right lines of code to complete this task.
Here is an example of what I am trying to do and where I am failing to get the time out.
smpldata <- "01/03/2017 08:00"
smpltime <-as.Date(as.character(smpldata),format='%d/%m/%Y %M:%S')
smpltime
# [1] 08:00 = what I would like to see
# [1] "2017-03-01" = what I am seeing
It may help to convert the factor to character with as.character() before converting to a date, because the factor type is not always transformed well. Also include the time components in the format string, as suggested by Sotos. The data has no seconds, so the format stops at %M; a trailing :%S would not match and would give NA.
sd_data$start_date <-
as.Date(as.character(sd_data$start),
format='%d/%m/%Y %H:%M')
Another tip is to take a look at the lubridate package. It's very useful for this kind of task.
library(lubridate)
smpldata <- as.factor("01/03/2017 08:00")
(smpltime <-dmy_hm(as.character(smpldata)))
[1] "2017-03-01 08:00:00 UTC"
Here you still see the date. You can handle just the time for plots and other needs using hour() and minute().
hour(smpltime)
[1] 8
minute(smpltime)
[1] 0
Or you can use the format() function to get exactly what you want.
format(smpltime, "%H:%M:%S")
[1] "08:00:00"
format(smpltime, "%H:%M")
[1] "08:00"
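The same idea can be sketched in base R for a whole column. The two factor values below are made up to stand in for sd_data$start, and tz = "UTC" is an assumption chosen so that the result does not depend on the local time zone.

```r
# Hypothetical stand-in for sd_data$start
start <- factor(c("01/03/2017 08:00", "02/03/2017 14:30"))

# Convert the factor labels (not the integer codes) to date-times
parsed <- as.POSIXct(as.character(start),
                     format = "%d/%m/%Y %H:%M", tz = "UTC")

# Split into a Date column and a character time column
start_date <- as.Date(parsed)
start_time <- format(parsed, "%H:%M")

start_date
#> [1] "2017-03-01" "2017-03-02"
start_time
#> [1] "08:00" "14:30"
```

Note that start_time is a character vector here; base R has no time-only class, so the time is kept either as text or inside the full POSIXct value.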

Converting a chr to numeric [duplicate]

I am trying to convert a chr into a number. The number I am trying to convert is "20171023063155.557". When I use the as.numeric function, it gives me 20171023063155.559. I have tried a few different methods but cannot get it to convert correctly.
Any help would be much appreciated.
as.POSIXct("20171023063155.557", format = "%Y%m%d%H%M%OS")
[1] "2017-10-23 06:31:55 PDT"
as.POSIXct("20171023063155.557", format = "%Y%m%d%H%M%S")
[1] "2017-10-23 06:31:55 PDT"
Your string actually appears to be a timestamp. I would therefore suggest that you treat it as such. One option here would be to convert it to a date using as.POSIXct:
x <- "20171023063155.557"
y <- as.POSIXct(x, format = "%Y%m%d%H%M%OS")
With a POSIXct object in hand, you can now easily extract information about your timestamp, e.g.
weekdays(y, FALSE)
months(y, FALSE)
[1] "Monday"
[1] "October"
To verify that millisecond precision information has in fact been stored in the POSIXct object, we can call format to check:
format(y, "%Y-%m-%d %H:%M:%OS6")
[1] "2017-10-23 06:31:55.556999"
The problem is the "rounding" difference imposed by storing the value as a 64-bit double-precision floating point number (R's numeric type), together with the default number of significant digits R is configured to print:
x <- as.numeric("20171023063155.557")
x
# [1] 2.017102e+13
getOption("digits")
# 7
options(digits=22)
x
# [1] 20171023063155.55859375
So just change the digits option and you can see your number is converted (almost ;-) correctly...
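A rough way to see how coarse the stored value has to be is to compute the gap between neighbouring doubles at this magnitude. This is a sketch that assumes the usual IEEE 754 double with 53 significant bits; the names gap and frac are only illustrative.

```r
x <- as.numeric("20171023063155.557")

# Doubles carry 53 significant bits, so the gap between neighbouring
# values near x is 2^(exponent - 52)
gap <- 2^(floor(log2(x)) - 52)
gap
#> [1] 0.00390625

# The stored fraction can differ from .557 by a fraction of that gap
frac <- x - 20171023063155
abs(frac - 0.557) <= gap
#> [1] TRUE
```

With steps of about 0.0039 between representable values, .557 cannot be stored exactly, which is why the printed digits drift after the second decimal place.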

Formatting Unconventional Date

I'm having trouble formatting a list of dates in R. The conventional methods of formatting in R such as as.Date or as.POSIXct don't seem to be working.
I have dates in the format: 1012015
using
as.POSIXct(as.character(data$Start_Date), format = "%m%d%Y")
does not give me an error, but my date returns
"0015-10-12" because the month is not a two digit number.
Is there a way to change this into the correct date format?
The lubridate package can help with this:
lubridate::mdy(1012015)
[1] "2015-01-01"
The format looks ambiguous but the OP gave two hints:
He is using format = "%m%d%Y" in his own attempt, and
he argues the issue is because the month is not a two digit number
This uses only base R. The %08d specifies a number to be formatted into 8 characters with 0 fill giving in this case "01012015".
as.POSIXct(sprintf("%08d", 1012015), format = "%m%d%Y")
## [1] "2015-01-01 EST"
Note that if you don't have any hours/minutes/seconds it would be less error prone to use "Date" class since then the possibility of subtle time zone errors is eliminated.
as.Date(sprintf("%08d", 1012015), format = "%m%d%Y")
## [1] "2015-01-01"
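The same padding works on a whole vector. The sketch below uses made-up values in a hypothetical start_raw vector and assumes only the leading zero of the month was dropped.

```r
# Hypothetical raw values in mdY form with the leading zero dropped
start_raw <- c(1012015, 12312015)

# Left-pad to 8 digits, then parse as month, day, four-digit year
padded <- sprintf("%08d", start_raw)
padded
#> [1] "01012015" "12312015"

dates <- as.Date(padded, format = "%m%d%Y")
dates
#> [1] "2015-01-01" "2015-12-31"
```

If the day could also lose its leading zero (for example 112015), padding alone would not resolve the ambiguity and the values would need a different source format.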
