R: Using the as.Date function to convert YYYYMMDD to dates

Currently I am attempting to convert dates in the YYYYMMDD format to separate columns for year, month, and day. I know that using the as.Date function I can convert YYYYMMDD to YYYY-MM-DD, and work from there, however R is misinterpreting the dates and I'm not sure what to do. The function is converting the values into dates, but not correctly.
For example: R is converting '19030106' to '2019-03-01', when it should be '1903-01-06'. I'm not sure how to fix this, but this is the code I am using.
library(lubridate)
PrecipAll$Date <- as.Date(as.character(PrecipAll$YYYYMMDD), format = "%y%m%d")
YYYYMMDD is currently numeric, and I needed to include as.character in order for it to output a date at all, but if there are better solutions please help.
Additionally, if you have any tips on separating the corrected dates into separate Year, Month, and Date columns that would be greatly appreciated.

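With {lubridate}, try ymd() to parse the YYYYMMDD variable, regardless of whether it is numeric or character. Your as.Date call goes wrong because %y reads only a two-digit year ("19" becomes 2019); the four-digit specifier is %Y. You can also use {lubridate}'s year(), month(), and day() functions to get those components as numeric values.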
library(lubridate)
library(dplyr)  # provides mutate()
PrecipAll <- data.frame(YYYYMMDD = c(19030106, 19100207, 20001130))
# ymd() accepts numeric or character YYYYMMDD values
mutate(.data = PrecipAll,
       date = lubridate::ymd(YYYYMMDD),
       year = year(date),
       month_n = month(date),
       day_n = day(date))
  YYYYMMDD       date year month_n day_n
1 19030106 1903-01-06 1903       1     6
2 19100207 1910-02-07 1910       2     7
3 20001130 2000-11-30 2000      11    30
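For reference, the original as.Date call can also be repaired without lubridate. The following is a minimal base R sketch, assuming the same three sample values as in the answer above; the column names year, month, and day are illustrative choices, not part of the original data.

```r
# Sample values taken from the answer above
PrecipAll <- data.frame(YYYYMMDD = c(19030106, 19100207, 20001130))

# %Y reads a four-digit year; %y would read only "19" and yield 2019
PrecipAll$Date <- as.Date(as.character(PrecipAll$YYYYMMDD), format = "%Y%m%d")

# Extract the components as integers (column names are illustrative)
PrecipAll$year  <- as.integer(format(PrecipAll$Date, "%Y"))
PrecipAll$month <- as.integer(format(PrecipAll$Date, "%m"))
PrecipAll$day   <- as.integer(format(PrecipAll$Date, "%d"))

print(PrecipAll)
```

The only change to the original call is the capital Y in the format string.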

Related

I'm getting NA on date column while trying to change date time format to date only in R programming language

I want to merge two data frames in R using the date as the key. While trying to change the date-time format in one of the data frames to date only, I'm getting NA in the date column. Below is the date-time format which I want to change to mm dd yy only.
4/12/2016 0:00
This is the code chunk I used.
sleep_day <- sleep_day %>%
rename(date = sleepday) %>%
mutate(date = as.Date(date,format ="%m/%d/%Y %I:%M:%S %p" , tz=Sys.timezone()))
I am expecting the date column to change from date-time to date alone, i.e. from mm dd yy 00:00 to mm dd yy. Instead, the date column contains only NA.
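Your format is not correct: the string has no seconds and no AM/PM marker, so "%I:%M:%S %p" cannot match it. A format that mirrors the string:
test <- "4/12/2016 0:00"
as.Date(test, format = "%m/%d/%Y %H:%M", tz = Sys.timezone())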
will work. Look at ?strptime.
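To tie this back to the merge in the question, here is a small base R sketch. Both data frames are hypothetical stand-ins (the activity frame, the minutes_asleep and steps columns, and all values are invented for illustration); only the date string format comes from the question.

```r
# Hypothetical stand-ins for the two data frames being merged
sleep_day <- data.frame(
  sleepday = c("4/12/2016 0:00", "4/13/2016 0:00"),
  minutes_asleep = c(327, 384),
  stringsAsFactors = FALSE
)
activity <- data.frame(
  date = as.Date(c("2016-04-12", "2016-04-13")),
  steps = c(13162, 10735)
)

# The format mirrors the string: 24-hour %H and %M, no seconds, no AM/PM
sleep_day$date <- as.Date(sleep_day$sleepday, format = "%m/%d/%Y %H:%M")

# Both date columns are now class Date, so they can serve as the merge key
merged <- merge(sleep_day, activity, by = "date")
print(merged)
```

Converting both sides to class Date before merging avoids comparing a character column with a Date column.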
As a piece of advice, prefer the lubridate library, which has easy-to-use functions that parse many different formats:
library(lubridate)
mdy_hm(test)
# [1] "2016-04-12 UTC"

How do I initialize a date in R from integer values for year, month, and day

I have year, month, and day as separate numeric values and would like to create a date object. In most languages like C# or javascript I can do something like
let x=new Date(2010,11,19);
to initialize a date. What's the R equivalent of this? Obviously I can do
as.Date(paste0(year,'-',month,'-',day))
But it seems pointlessly dumb to convert to character in order to convert to date, right? Surely there is a way to cut out the unnecessary extra string conversion?
We can use ISOdate, which returns a date-time object (POSIXct) that can be converted to the Date class with as.Date
as.Date(ISOdate(year, month, day))
#[1] "2010-11-19"
data
year <- 2010
month <- 11
day <- 19
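The same call also works on whole vectors. A short sketch, with illustrative values that are not from the question, showing that an impossible date comes back as NA rather than raising an error. Note that R counts months from 1, so month 11 is November, whereas the JavaScript call in the question uses 0-based months.

```r
# Illustrative vectors; the third entry (30 February) is deliberately invalid
year  <- c(2010, 2011, 2012)
month <- c(11, 2, 2)
day   <- c(19, 28, 30)

# ISOdate() is vectorised and yields NA for an impossible date
dates <- as.Date(ISOdate(year, month, day))
print(dates)
```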

Converting string to date in R returns NAs

I have a column of my dataframe as
date
17-Feb
17-Mar
16-Dec
16-Nov
16-Sep
17-Feb
I am trying to convert it into a date column from string. I am using the following pieces of code:
as.Date(df$Date, format="%y-%b")
and
as.POSIXct(df$Date, format="%y-%b")
Both of them give NAs
I am getting the format from this link
The starting number is year. Sorry for the confusion.
I assume from your approach that the 17 and 16 refer to the year 2017 and 2016 respectively. You need to also specify the day of month. If you don't care about it, then set it to the 1st.
A slight modification to your code will work, by appending '-01' to the date then updating your format argument to reflect this:
df = data.frame(Date = c("17-Feb", "17-Mar", "16-Dec"))
as.Date(paste0(df$Date, "-01"), format="%y-%b-%d")
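One caveat: %b matches month abbreviations in the current locale, so the call above may return NA on a non-English system. A possible locale-independent sketch, assuming (as the answer does) that every two-digit year belongs to the 2000s, uses the built-in English constant month.abb instead; the parsed column name is illustrative.

```r
df <- data.frame(Date = c("17-Feb", "17-Mar", "16-Dec", "16-Nov", "16-Sep"),
                 stringsAsFactors = FALSE)

# Split "17-Feb" into the two-digit year and the month abbreviation
parts <- strsplit(df$Date, "-", fixed = TRUE)
yy    <- as.integer(vapply(parts, `[`, character(1), 1))
mon   <- match(vapply(parts, `[`, character(1), 2), month.abb)

# Assumption: every two-digit year belongs to the 2000s; day is set to the 1st
df$parsed <- as.Date(ISOdate(2000 + yy, mon, 1))
print(df)
```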

Converting variables in form of "2015M01" to date format in R?

I have a data frame df that simply looks like this:
month values
2012M01 99904
2012M02 99616
2012M03 99530
2012M04 99500
2012M05 99380
2012M06 99103
2013M01 98533
2013M02 97600
2013M03 96431
2013M04 95369
2013M05 94527
2013M06 93783
with month that was written in form of "M01", "M02"... and so on.
Now I want to convert this column to date format, is there a way to do it in R with lubridate?
I also want to select columns that contain one certain month from each year, like only March columns from all these years, what is the best way to do it?
The short answer is that dates require a year, month and day, so you cannot convert directly to a date format. You have 2 options.
Option 1: convert to a year-month format using zoo::as.yearmon.
library(zoo)
df$yearmon <- as.yearmon(df$month, "%YM%m")
# you can get e.g. month from that
months(df$yearmon[1])
# [1] "January"
Option 2: convert to a date by assuming that the day is always the first day of the month.
df$date <- as.Date(paste(df$month, "01", sep = "-"), "%YM%m-%d")
For selection (and I think you mean select rows, not columns), you already have everything you need. For example, to select only March 2013:
library(dplyr)
df %>% filter(month == "2013M03")
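To pick one month across all years, as the question also asks, the month can be read back from the converted date. A base R sketch using a four-row subset of the question's data; the march variable name is illustrative.

```r
# Subset of the data from the question
df <- data.frame(
  month  = c("2012M01", "2012M03", "2013M03", "2013M04"),
  values = c(99904, 99530, 96431, 95369),
  stringsAsFactors = FALSE
)

# Option 2 from above: assume the first day of each month
df$date <- as.Date(paste(df$month, "01", sep = "-"), "%YM%m-%d")

# Keep the March rows of every year
march <- df[format(df$date, "%m") == "03", ]
print(march)
```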
Something like this will get it:
raw <- "2012M01"
dt <- strptime(paste0(raw, "01"), format = "%YM%m%d")
dt will be a POSIXlt object. strptime needs a day of month whenever a month is specified (see ?strptime), so "01" is appended to make it a complete date.

R aggregate by %Y-%b as date

I have a data frame item_sold_time with an item_sold_time$SaleDate column of class "POSIXct" "POSIXt", with values like 2015-04-28 07:59:43.
I want to aggregate by month in year:
df2<-aggregate(list(Qty=item_sold_time$Qty), by=list(ISBN=item_sold_time$ISBN,DateYM=strftime(item_sold_time$SaleDate, format="%Y-%b")), FUN=sum)
The problem with strftime is that it converts the date into characters and I get an error if I try to convert it back into date.
I tried every combination of date formatting I could find in 2 days of searching. The date is ultimately meant to be used in this plot:
ggplot(df2, aes(x = DateYM, y=Qty))
Please help. Thanks
Set each date to (e.g.) the 15th of the month; then you can aggregate by date and plot as normal.
as.Date(format(item_sold_time$SaleDate, "%Y-%m-15"))
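Put together with the aggregate call from the question, this could look as follows. It is only a sketch: the sample rows are invented for illustration, and the plotting line is left as a comment so the example runs without ggplot2.

```r
# Hypothetical sample rows with the columns named in the question
item_sold_time <- data.frame(
  ISBN = c("A", "A", "A", "B"),
  Qty  = c(1, 2, 3, 4),
  SaleDate = as.POSIXct(c("2015-04-28 07:59:43", "2015-04-30 10:00:00",
                          "2015-05-02 09:00:00", "2015-04-01 12:00:00"),
                        tz = "UTC"),
  stringsAsFactors = FALSE
)

# Collapse every timestamp to the 15th of its month, keeping class Date
item_sold_time$DateYM <- as.Date(format(item_sold_time$SaleDate, "%Y-%m-15"))

# Same aggregate call as in the question, grouped on the Date column
df2 <- aggregate(list(Qty = item_sold_time$Qty),
                 by = list(ISBN = item_sold_time$ISBN,
                           DateYM = item_sold_time$DateYM),
                 FUN = sum)
print(df2)

# With ggplot2 loaded, the axis can still be labelled as year-month:
# ggplot(df2, aes(x = DateYM, y = Qty)) + geom_col() +
#   scale_x_date(date_labels = "%Y-%b")
```

Because DateYM stays a Date, the x axis remains continuous and is ordered chronologically instead of alphabetically.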
