Some R code:
> dates <- as.Date(c('2020-01-01', '2020-01-02'))
> min(dates)
[1] "2020-01-01"
> max(dates)
[1] "2020-01-02"
> min(dates):max(dates)
[1] 18262 18263
> as.Date(min(dates):max(dates))
Error in as.Date.numeric(min(dates):max(dates)) :
'origin' must be supplied
> as.Date(min(dates):max(dates), origin="1970-01-01")
[1] "2020-01-01" "2020-01-02"
This shows that min and max are working as expected, but when I put them in a range, the dates turn into integers. How do I prevent that?
I can just use the "origin", but it seems like a hack.
Instead of using : and then converting the resulting numbers back to the Date class, use seq, which already has a method for the Date class:
seq(min(dates), max(dates), by = "1 day")
[1] "2020-01-01" "2020-01-02"
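To make the difference concrete: `:` works on the underlying day counts and returns plain numbers, while seq dispatches on the Date class and keeps it. A minimal sketch; the two-week range and the names by_day and by_week are only for illustration:

```r
dates <- as.Date(c("2020-01-01", "2020-01-15"))

# `:` returns plain numbers, so the Date class is gone
class(min(dates):max(dates))

# seq() has a Date method, so the result stays a Date
by_day  <- seq(min(dates), max(dates), by = "day")
by_week <- seq(min(dates), max(dates), by = "week")

class(by_day)
length(by_day)   # one element per day, both endpoints included
by_week          # every seventh day starting from min(dates)
```

The by argument also accepts strings such as "month" or "2 weeks", which is awkward to get right with integer arithmetic.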
I am working with an external package that's converting columns of a dataframe with the lubridate date type Date into numeric type. (Confirmed by running as.numeric() on the columns).
I'm wondering if there's a way to convert it back?
For example, if I have the date "01-01-2021" then running as.numeric on it returns -719143. How can I turn that back into "01-01-2021"?
Note that Date class is part of base R, not lubridate.
You probably assumed that the data was year/month/day by mistake. Using base R to eliminate lubridate as a problem we can replicate the question's result like this:
as.numeric(as.Date("01-01-2021", "%Y-%m-%d"))
## [1] -719143
Had we used day/month/year we would have gotten:
as.numeric(as.Date("01-01-2021", "%d-%m-%Y"))
## [1] 18628
or using lubridate
library(lubridate)
as.numeric(dmy("01-01-2021"))
## [1] 18628
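It would be best to fix the mistake that resulted in -719143. If you don't control that and are faced with an input of -719143, the function below recovers as much of the date as possible. Note that the bad parse kept only the first two digits of "2021" (as the day, 20), so the last two digits of the year are lost and the result is 2020-01-01 rather than 2021-01-01: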
# input x is numeric; result is Date class
fixup <- function(x) as.Date(format(.Date(x), "%y-%m-%d"), "%d-%m-%y")
fixup(-719143)
## [1] "2020-01-01"
Note that we can't tell from the question whether 01-01-2021 is supposed to represent day-month-year or month-day-year, so we assumed the first. If it represents the second, it should be obvious at this point how to proceed.
EDIT #2: It looks like the original data is being parsed as Jan 20, year 1, which might happen if the year-month-day columns were jumbled while being parsed:
as.numeric(as.Date("01-01-2021", format = "%Y-%m-%d", origin = "1970-01-01"))
[1] -719143
as.numeric(as.Date("0001-01-20", origin = "1970-01-01"))
[1] -719143
Is there a way to share an example of the raw data as you have it? e.g. dput(MY_DATA[1:10, DATE_COL])
EDIT: -719143 is about 1970 years of days, which can't be a coincidence, given that many date/time formats use 1970 as a baseline. I wonder if 01-01-2021 is being evaluated as an arithmetic expression equal to -2021, so that we're looking at perhaps 2021 seconds/days/[?] before year zero, which would be about 1970 years before the epoch...
-719143/(365)
[1] -1970.255
For instance, we can get something close with:
as.numeric(as.Date("0000-01-01", origin = "1970-01-01"))
[1] -719528
Original answer:
R treats a string describing a date as text:
x <- "01-01-2021"
class(x)
[1] "character"
We can convert it to a Date data type using these two equivalent commands:
base_dt <- as.Date(x, "%m-%d-%Y") # base R version
lubridt <- lubridate::mdy(x) # convenience lubridate function
identical(base_dt, lubridt)
[1] TRUE
Under the hood, a Date object in R is a numeric value with a flag telling R it's a date:
> typeof(lubridt) # What general type of data is it?
[1] "double" # --> numeric, stored as a double
> as.numeric(lubridt)
[1] 18628
> class(lubridt) # Does it have any special class attributes?
[1] "Date" # --> yes, it's a Date
> dput(lubridt) # How would we construct it from scratch?
structure(18628, class = "Date") # --> by giving 18628 a Date attribute
In R, a Date is encoded as the number of days since 1970 began:
> as.Date("1970-01-1") + as.numeric(lubridt)
[1] "2021-01-01"
We could convert it back to the original text using:
format(base_dt, "%m-%d-%Y")
[1] "01-01-2021"
identical(x, format(base_dt, "%m-%d-%Y"))
[1] TRUE
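Since a Date is just a day count with a class attribute, a numeric column that was produced correctly can be turned back into a Date by supplying the origin. A small sketch; the data frame df and its column names are made up to stand in for the package's output:

```r
# Hypothetical data frame standing in for what the external package returns
df <- data.frame(when = as.Date(c("2021-01-01", "2021-03-28")))
df$when_num <- as.numeric(df$when)   # the conversion the package performed

# Restore the class by telling R which day counts as day 0
df$restored <- as.Date(df$when_num, origin = "1970-01-01")

class(df$restored)
all(df$restored == df$when)
```

This only helps when the numbers are real day counts such as 18628. A value like -719143 means the text was parsed with the wrong format before it became numeric, and that has to be fixed at the parsing step.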
as.Date('28/3/2021', '%d/%m/%y')
gives output:
[1] "2020-03-28"
How do I write the code to ensure the year is correct i.e. 2021, not 2020?
Here is the solution and explanation:
> # %Y is for 4-digit years
> as.Date('28/3/2021', '%d/%m/%Y')
[1] "2021-03-28"
> # %y is for 2-digit years
> as.Date('28/3/21', '%d/%m/%y')
[1] "2021-03-28"
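The 2020 in the question comes from %y consuming only two digits: it reads "20" from "2021" and the trailing "21" is ignored. If a column mixes both year styles, one option is to pick the format from the length of the year field. A rough sketch; parse_dmy is a made-up helper, not a base R function, and it assumes "/" as the separator:

```r
parse_dmy <- function(x) {
  # Text after the last "/" is taken to be the year field
  year_field <- sub(".*/", "", x)
  four_digit <- nchar(year_field) == 4

  # Parse everything as a 2-digit year first, then redo the 4-digit entries
  out <- as.Date(x, format = "%d/%m/%y")
  out[four_digit] <- as.Date(x[four_digit], format = "%d/%m/%Y")
  out
}

parse_dmy(c("28/3/2021", "28/3/21"))
```

Both inputs should come back as the same date.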
I have a data.frame with a column containing:
values(20180213190133, 20180213190136, 20180213190173 , 20180213190193 , 20180213190213, 20180213190233, 20180213190333, 20180213190533, 20180213190733, 20180213190833, 20180213190833, 20180213190833, 201802131901833, 20180213191133, 20180213192133, 20180213194133, 20180213199133, 20180213199133, 20180213199133, 20180213199133, 20180213190136.... 1200 entries)
I want to convert this column, which is of type int, to Date.
I tried as.Date() and as.POSIXct(). Neither works; I get NA values.
Please let me know how I can convert this field from int to Date.
Thanks
Try this:
Input data
values<-c(20180213190133, 20180213190136, 20180213190173)
values_date<-as.Date(substr(as.character(values),start = 1,stop=8), format = "%Y%m%d")
> values_date
[1] "2018-02-13" "2018-02-13" "2018-02-13"
> class(values_date)
[1] "Date"
If you also want to keep the hour/minute/second, you can try this:
values_date<-as.POSIXlt(as.character(values), format = "%Y%m%d%H%M%S")
After this the class will be "POSIXlt" "POSIXt" rather than Date. There is, however, some strange data in your input: in the third number the last two digits are "73", which is not a valid value for seconds, so you get NA in the output.
values_date
[1] "2018-02-13 19:01:33 CET" "2018-02-13 19:01:36 CET" NA
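With 1200 entries it helps to list the values that fail to parse instead of spotting them by eye. A possible sketch; the names txt, parsed and bad are illustrative, and tz = "UTC" is an assumption to be replaced by the zone the data was recorded in:

```r
values <- c(20180213190133, 20180213190136, 20180213190173)

# sprintf("%.0f") writes the full digits with no risk of scientific notation
txt <- sprintf("%.0f", values)

# tz = "UTC" is an assumption, not something stated in the question
parsed <- as.POSIXct(txt, format = "%Y%m%d%H%M%S", tz = "UTC")

# Entries that could not be parsed, for inspection
bad <- is.na(parsed)
values[bad]
```

as.POSIXct is used here instead of as.POSIXlt because it is a plain numeric vector underneath and is easier to store in a data.frame column.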
I have a csv file (df) with dates such as "Mar-97, Apr-97...". After importing into R with read.csv and stringsAsFactors = F, class(dates) is character.
I have tried df$dates <- as.Date(df$Dates, format = "%d-%b-%y") and as.Date(df$Dates, format = "%b-%y"). The class is converted to Date, but it shows only NA values.
You can try the lubridate library:
library(lubridate)
> parse_date_time("Mar-97", "m y")
[1] "1997-03-01 UTC"
and it works on vectors too:
df=c("Mar 17","Apr 17")
> parse_date_time(df, "m y")
[1] "2017-03-01 UTC" "2017-04-01 UTC"
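The NA in the question comes from the missing day: a Date needs year, month and day, and "Mar-97" has no day. A base R alternative is to paste a fixed day in front before parsing. A minimal sketch, assuming English month abbreviations in the current locale (%b is locale-dependent):

```r
x <- c("Mar-97", "Apr-97")

# Prepend day 01 so the text describes a complete date
d <- as.Date(paste0("01-", x), format = "%d-%b-%y")
d

# Format back to the original month-year text when needed
format(d, "%b-%y")
```

Choosing the first of the month is arbitrary; any fixed day works as long as it is used consistently.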
This is what my timestamp looks like:
date1 = timestamp[1]
date1
[1] 07.01.2016 11:52:35
3455 Levels: 01.02.2016 00:11:52 01.02.2016 00:23:35 01.000:30:21 31.01.2016 23:16:18
When I try to extract the hour, I get an invalid 'trim' argument error. Thank you!
format(date1, "%I")
Error in format.default(structure(as.character(x), names = names(x), dim = dim(x), :
invalid 'trim' argument
How can I extract single components like the hour from this timestamp?
With base R:
format(strptime(x,format = '%d.%m.%Y %H:%M:%S'), "%H")
[1] "11"
data
x <- as.factor("07.01.2016 11:52:35")
You first need to parse the time
d = strptime(date1,format = '%d.%m.%Y %H:%M:%S')
Then use lubridate to extract parts like hour etc.
library(lubridate)
hour(d)
minute(d)
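The error itself appears to come from date1 being a factor: format() on a factor treats "%I" as its trim argument rather than as a date format. Once the value is parsed, the components are also available in base R as fields of the POSIXlt object, without lubridate. A small sketch; tz = "UTC" is an assumption:

```r
date1 <- factor("07.01.2016 11:52:35")

# Convert the factor to text, then parse; tz = "UTC" is an assumption
d <- strptime(as.character(date1), format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

d$hour            # hour as a number (0-23)
d$min             # minute as a number
format(d, "%H")   # hour as zero-padded text
```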