I would like to retain my current date column in year-month format as date. It currently gets converted to chr format. I have tried as_datetime but it coerces all values to NA.
The format I am looking for is: "2017-01"
library(lubridate)
df<- data.frame(Date=c("2017-01-01","2017-01-02","2017-01-03","2017-01-04",
"2018-01-01","2018-01-02","2018-02-01","2018-03-02"),
N=c(24,10,13,12,10,10,33,45))
df$Date <- as_datetime(df$Date)
df$Date <- ymd(df$Date)
df$Date <- strftime(df$Date,format="%Y-%m")
Thanks in advance!
lubridate only handles dates, and dates have days. However, as alistaire mentions, you can floor them by month if you want to work monthly:
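library(tidyverse)
library(lubridate) # provides floor_date() and as_date(); older tidyverse versions do not attach it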
df_month <-
df %>%
mutate(Date = floor_date(as_date(Date), "month"))
If you e.g. want to aggregate by month, just group_by() and summarize().
df_month %>%
group_by(Date) %>%
summarize(N = sum(N)) %>%
ungroup()
#> # A tibble: 4 x 2
#> Date N
#> <date> <dbl>
#>1 2017-01-01 59
#>2 2018-01-01 20
#>3 2018-02-01 33
#>4 2018-03-01 45
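The same idea can be sketched in base R, assuming only that the values are already of class Date. The names month_start and month_label are illustrative and do not come from the answers above. The point is that the floored value stays a Date, while the "2017-01" label is a character vector that is only suitable for display:

```r
# A few of the sample dates from the question, parsed as Date
dates <- as.Date(c("2017-01-01", "2017-01-04", "2018-02-01", "2018-03-02"))

# Floor to the first day of the month; the result is still a Date
month_start <- as.Date(format(dates, "%Y-%m-01"))

# Build the year-month label for display only; this is a character vector
month_label <- format(month_start, "%Y-%m")

class(month_start)
class(month_label)
```

Grouping and sorting can then use month_start, and month_label can be produced at the last step for tables or axis labels.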
You can solve this with the zoo::as.yearmon() function. The solution follows:
library(tidyquant)
library(magrittr)
library(dplyr)
df <- data.frame(Date=c("2017-01-01","2017-01-02","2017-01-03","2017-01-04",
"2018-01-01","2018-01-02","2018-02-01","2018-03-02"),
N=c(24,10,13,12,10,10,33,45))
df %<>% mutate(Date = zoo::as.yearmon(Date))
You can use the cut function with breaks="month" to move every date to the first day of its month. Any dates within the same month will then share the same value in the newly created column.
This is useful for grouping the other variables in your data frame by month (essentially what you are trying to do). Note that cut creates a factor, but this can be converted back to a date, so you can still have a date class in your data frame.
You just can't get rid of the day in a date (because then it is no longer a date). Afterwards you can create a nice format for axes or tables. For example:
true_date <-
as.POSIXlt(
c(
"2017-01-01",
"2017-01-02",
"2017-01-03",
"2017-01-04",
"2018-01-01",
"2018-01-02",
"2018-02-01",
"2018-03-02"
),
format = "%F"
)
df <-
data.frame(
Date = cut(true_date, breaks = "month"),
N = c(24, 10, 13, 12, 10, 10, 33, 45)
)
## here df$Date is a 'factor'. You could use substr to create a formatted column
df$formated_date <- substr(df$Date, start = 1, stop = 7)
## and you can convert back to a date-time class (strptime returns POSIXlt). format = "%F" is the ISO 8601 standard date format
df$true_date <- strptime(x = as.character(df$Date), format = "%F")
str(df)
Related
I have a date format as follows: yyyymmdd. So 10 March 2022 is formatted as 20220310, with no separator between the day, month and year. Now I want to replace the column containing those dates with a column that contains only the year. Normally I would use the following code:
df <- df %>%
mutate(across(contains("Date"), ~(parse_date_time(., orders = c('ymd')))))
And then separate the column into three different columns with year, month and day, and then delete the month and day columns. But somehow the code above doesn't work. I hope someone can help me out.
Not as fancy, but you could simply get the year from a substring of the whole date:
df$Year <- as.numeric(substr(as.character(df$Date),1,4))
You can try this:
df$column_with_date <- as.integer(x = substr(x = df$column_with_date, start = 1, stop = 4))
The as.integer function is optional, but you could use it to save more space in memory.
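If validation matters, one alternative to taking a substring is to parse the value as a date first and then extract the year. This is a minimal sketch in base R; the sample values (including the deliberately invalid 20221345) are made up for illustration:

```r
# Hypothetical sample values in yyyymmdd form, stored as numbers
raw <- c(20220310, 20211231, 20221345)

# Convert to character first, then parse with an explicit format;
# values that are not real calendar dates become NA
parsed <- as.Date(as.character(raw), format = "%Y%m%d")

# Extract the year as an integer
year <- as.integer(format(parsed, "%Y"))
year
```

Unlike the substring approach, this returns NA for the invalid third value instead of silently reporting 2022.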
Your code works if the data is in the format below. You can use mutate_at with a list of year, month, and day to create the three columns like this:
df <- data.frame(Date = c("20220310"))
library(lubridate)
library(dplyr)
df %>%
mutate(across(contains("Date"), ~(parse_date_time(., orders = c('ymd'))))) %>%
mutate_at(vars(Date), list(year = year, month = month, day = day))
#> Date year month day
#> 1 2022-03-10 2022 3 10
Created on 2022-07-25 by the reprex package (v2.0.1)
I have a DF and I would like to create a column with YEAR and MONTH, but setting 2 digits for the month. See my code:
ID <- c(111,222,333,444,555)
DATE <- c(as.Date(c('10/10/2021','12/11/2021','30/12/2021','20/01/2022','25/02/2022') ,"%d/%m/%Y"))
DF_1 <- data.frame(ID, DATE)
Adding the YEAR and MONTH column:
DF_2 <- DF_1 %>%
mutate(YEAR_MONTH = paste(lubridate::year(DATE),
lubridate::month(DATE),
sep = ""))
As you can see, for IDs 444 and 555 the month has only one digit. I would like it to look like this:
ID <- c(111,222,333,444,555)
DATE <- c(as.Date(c('10/10/2021','12/11/2021','30/12/2021','20/01/2022','25/02/2022') ,"%d/%m/%Y"))
YEAR_MONTH <- c('202110','202111','202112','202201','202202')
DF_3 <- data.frame(ID, DATE, YEAR_MONTH)
How would I go about treating these months that are showing up with just one digit?
Grateful.
Instead of using lubridate's year() and month(), we can use format() directly, which returns the 4-digit year and 2-digit month as a string. The lubridate functions return numeric values, which cannot carry a leading zero.
library(dplyr)
DF_1 <- DF_1 %>%
mutate(YEAR_MONTH = format(DATE, "%Y%m"))
Or using base R
DF_1$YEAR_MONTH <- with(DF_1, format(DATE, "%Y%m"))
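To illustrate why pasting the numeric parts drops the zero, and how explicit padding would fix it, here is a small base R sketch. It uses as.POSIXlt() instead of lubridate to obtain the numeric year and month; the variable names are illustrative:

```r
dates <- as.Date(c("2021-10-10", "2022-01-20"))

# Numeric year and month components (POSIXlt stores years since 1900
# and months as 0-11)
lt <- as.POSIXlt(dates)
year_num <- as.integer(lt$year + 1900L)
month_num <- as.integer(lt$mon + 1L)

# Pasting numbers loses the leading zero of January
pasted <- paste0(year_num, month_num)

# sprintf() with %02d pads the month to two digits
padded <- sprintf("%d%02d", year_num, month_num)

pasted
padded
```

format(DATE, "%Y%m") remains the shorter route; sprintf() is mainly useful when the year and month are already available as separate numeric columns.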
I have a data frame with a column of numbers that represent a date. So 110190-1111 is ddmmyy-xxxx, where the x's don't matter. It is implicit that the century is 1900.
df <- c("110190-1111", "220391-1111", "241287-1111")
I would like to have it converted to.
c("1990-01-11", "1991-03-22", "1987-12-24")
I have removed the last 4 digits and the "-" with the following.
ID <- c("110190-1111", "220391-1111", "241287-1111")
df <- data.frame(ID)
df <- df %>% mutate(date=gsub("-.*", "", ID))
I have tried fiddling with the as.Date function with no luck. Any suggestions? Thanks.
as.Date ignores any trailing characters after the part that matches the format, so
df %>% mutate(Date = as.Date(ID, "%d%m%y"))
giving:
ID Date
1 110190-1111 1990-01-11
2 220391-1111 1991-03-22
3 241287-1111 1987-12-24
or using only base R:
transform(df, Date = as.Date(ID, "%d%m%y"))
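One caveat about %y: on input, R maps two-digit years 00 to 68 to 20xx and 69 to 99 to 19xx. The sample IDs all fall in the second range, so the result is correct, but an ID with a year such as 50 would be read as 2050. Since the question states that the century is always 1900, a hedged sketch is to insert "19" before the two-digit year and parse with %Y instead. The second ID below is made up for illustration:

```r
# The first ID is from the question; the second is a hypothetical
# ID with a two-digit year below 69
ids <- c("110190-1111", "010150-1111")

# Default %y handling: 50 is interpreted as 2050
default_parse <- as.Date(ids, "%d%m%y")

# Force the 1900s: build ddmm + "19" + yy, then parse a 4-digit year
with_century <- paste0(substr(ids, 1, 4), "19", substr(ids, 5, 6))
fixed_parse <- as.Date(with_century, "%d%m%Y")

default_parse
fixed_parse
```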
We can use dmy from lubridate
library(lubridate)
df$date <- dmy(df$date)
I have my dates formatted as 'YYYYMMDD' like '20150531' but now I want to separate my data into categories for the 7 days of the week by creating another variable called Day. How could I do this in R?
You can try weekdays() function from base R:
#Data
df <- data.frame(Date='20150531',stringsAsFactors = FALSE)
df$Weekday <- weekdays(as.Date(df$Date,'%Y%m%d'))
Output:
df
Date Weekday
1 20150531 Sunday
We can convert to Date class and then use format to get the abbreviated weekday name in base R
df1$Weekday <- format(as.Date(df1$date , '%Y%m%d'), '%a')
-output
df1
# date Weekday
#1 20150531 Sun
data
df1 <- data.frame(date = '20150531')
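Because the goal is to split the data into seven categories, it may help to store the day as a factor with a fixed Monday-to-Sunday order. Both weekdays() and the "%a" format return names in the current locale, so this sketch uses "%u" (weekday number, Monday = 1) and maps it to English labels. The second date and the day_labels vector are illustrative assumptions:

```r
# The first date is from the question; the second is added for illustration
dates <- as.Date(c("20150531", "20150601"), format = "%Y%m%d")

# "%u" gives the weekday as a number from 1 (Monday) to 7 (Sunday),
# independent of the locale
day_num <- as.integer(format(dates, "%u"))

# Map the numbers to labels and keep all seven levels in calendar order
day_labels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
Day <- factor(day_labels[day_num], levels = day_labels)

table(Day)
```

Keeping all seven levels means that days with no observations still appear, with a count of zero, in tables and plots.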
I am having trouble standardizing the date format to dd-Mon-YYYY (for example 23-Jul-2020). This is my current data and code:
Dataset
date
1 23/07/2020
2 22-Jul-2020
Current Output
df$date<-as.Date(df$date)
df$date = format(df$date, "%d-%b-%Y")
date
1 20-Jul-0022
2 <NA>
Desired Output
date
1 23-Jul-2020
2 22-Jul-2020
You can try this way
library(lubridate)
df$date <- dmy(df$date)
df$date <- format(df$date, format = "%d-%b-%Y")
# date
# 1 23-Jul-2020
# 2 22-Jul-2020
Data
df <- read.table(text = "date
1 23/07/2020
2 22-Jul-2020", header = TRUE)
I've saved your example data set as a data frame named df. I used group_by from dplyr to allow each date to be converted separately to the correct format.
library(tidyverse)
df %>%
group_by(date) %>%
mutate(date = as.Date(date, tryFormats = c("%d-%b-%Y", "%d/%m/%Y"))) %>%
mutate(date = format(date, "%d-%b-%Y"))
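The grouping is needed because as.Date() chooses a single entry of tryFormats based on the first non-NA element and then applies it to the whole vector. A base R sketch that avoids grouping is to parse with one format and fill the remaining NA values with a second format. It assumes an English locale, since "%b" matches month abbreviations in the current locale:

```r
raw <- c("23/07/2020", "22-Jul-2020")

# First pass: values that do not match dd/mm/yyyy become NA
parsed <- as.Date(raw, format = "%d/%m/%Y")

# Second pass: only the unparsed values are tried with dd-Mon-yyyy
unparsed <- is.na(parsed)
parsed[unparsed] <- as.Date(raw[unparsed], format = "%d-%b-%Y")

# Standardized output as text
formatted <- format(parsed, "%d-%b-%Y")
formatted
```

More formats can be handled by repeating the second pass, ordered from the most specific format to the least specific.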