Custom Date Transformation in R - r

Using R, how do we convert "yyyymm" to "yyyy-mm-01" across all the rows?
E.g. "201603" becomes "2016-03-01" (i.e. "yyyy-mm-dd" format).
PS: The day (dd = 01) is fixed for all 12 months, i.e. "2016-01-01", "2016-02-01", etc.

A simple paste solution:
x <- "201603"
paste0(substr(x, 1,4), "-", substr(x, 5,6), "-01")
[1] "2016-03-01"

If you want to convert to a Date:
as.Date(paste0("201603", "01"), format = "%Y%m%d")
This creates 2016-03-01 as a Date object rather than a character string.
To apply this to every row of a column named Date:
library(dplyr)
data <- data %>%
  mutate(Date = as.Date(paste0(Date, "01"), format = "%Y%m%d"))
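The same conversion can also be done without dplyr. The following is a minimal base R sketch; the data frame and its values are made up for illustration, and only the column name Date follows the answer above.

```r
# Hypothetical example data: a character column of "yyyymm" values.
data <- data.frame(Date = c("201601", "201602", "201603"),
                   stringsAsFactors = FALSE)

# Append the fixed day "01" as text, then parse the eight-digit string.
data$Date <- as.Date(paste0(data$Date, "01"), format = "%Y%m%d")

class(data$Date)   # the column is now of class Date
format(data$Date)  # character form, "yyyy-mm-dd"
```

Passing "01" as a string matters here: a bare 01 is the number 1, so the leading zero would be lost when pasting.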

An additional solution, using lubridate and stringr:
library(lubridate)
library(stringr)
x <- c("201603")
ymd(str_c(x,"01"))
[1] "2016-03-01"

Related

How to parse both of these date formats?

Dates formatted as 4/29/2016 are parsed correctly, but dates formatted as 6242016 and 2042016 are not parsed.
Does R think that some of the dates without the slash have day first instead of month?
I've tried including dmy in lubridate and it still doesn't work.
I've tried looking at Sys.getlocale("LC_TIME") and it gives me "English_United States.1252".
demo$date <- as.character(demo$date)
demo <- demo %>%
mutate(date = parse_date_time(date, "mdy"))
You can wrangle your dates all into the same format using stringr. Then convert to numeric and use lubridate to parse.
library(stringr)
library(lubridate)
dates <- c("6242016", "2042016", "4/29/2016")
dates <- str_remove_all(dates, "/")
dates <- as.numeric(dates)
lubridate::mdy(dates)
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
Alternatively, in base R, strip the slashes and zero-pad to eight digits before parsing:
as.Date(sprintf("%08d",
                as.numeric(gsub("/", "", c("6242016", "2042016", "4/29/2016")))),
        format = "%m%d%Y")
# [1] "2016-06-24" "2016-02-04" "2016-04-29"
This
as.Date("2042016", "%m%d%Y")
returns NA as opposed to
as.Date("02042016", "%m%d%Y")
This is because, without separators, the month has to be given as two digits (01-12); otherwise "20" is read as the month.
Try adding a leading zero for months in the range [1,9].
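The padding idea can be combined with separate handling for the slash-separated values. This is only a sketch: parse_mixed_mdy is a hypothetical helper name, and it assumes that digit-only values always carry a two-digit day and a four-digit year, so that the only thing missing is a leading zero on the month.

```r
# Hypothetical helper: parse month-day-year text that is either
# slash-separated ("4/29/2016") or digits only ("6242016").
parse_mixed_mdy <- function(x) {
  x <- as.character(x)
  out <- rep(as.Date(NA), length(x))
  has_slash <- grepl("/", x, fixed = TRUE)

  # Slash-separated values can be parsed directly.
  out[has_slash] <- as.Date(x[has_slash], format = "%m/%d/%Y")

  # Digit-only values: left-pad with zeros to eight characters (mmddyyyy).
  digits <- x[!has_slash]
  padded <- paste0(strrep("0", pmax(0, 8 - nchar(digits))), digits)
  out[!has_slash] <- as.Date(padded, format = "%m%d%Y")
  out
}

parse_mixed_mdy(c("6242016", "2042016", "4/29/2016"))
```

Parsing the slash form on its own keeps values such as 4/5/2016 correct, which simply deleting the slashes would not.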

How to convert a column as date with special character?

In my dataframe I have a date column and I would like to convert it from character to date in the format d/m/y.
The head of my data:
head(df$date)
[1] [17/Jun/2019:08:33:49 [17/Jun/2019:08:38:20 [17/Jun/2019:08:38:24 [17/Jun/2019:09:52:42
[5] [17/Jun/2019:09:52:44 [17/Jun/2019:09:52:45
I used this but it converts every value into NA
df$date = as.Date(df$date, "[%d%b%y")
Try:
df$date <- strptime(df$date, format = "[%d/%b/%Y:%H:%M:%S")
df$date <- as.Date(df$date)
Using a tidyverse approach, it looks like the dmy_hms() function accommodates that atypical first colon:
library(dplyr)
library(lubridate)
df <- df %>% mutate(date = dmy_hms(date), date = date(date))
Using your first value as an example:
date <- "17/Jun/2019:08:33:49"
date <- dmy_hms(date)
date
#[1] "2019-06-17 08:33:49 UTC"
date <- date(date) #or all in one line, date <- dmy_hms(date) %>% date()
date
#[1] "2019-06-17"
Assuming this is your input
x <- c("[17/Jun/2019:08:33:49", "[17/Jun/2019:08:38:20",
       "[17/Jun/2019:08:38:24", "[17/Jun/2019:09:52:42")
First convert it into POSIXct format and then to Date
as.Date(as.POSIXct(x, format = "[%d/%b/%Y:%T"))
#[1] "2019-06-17" "2019-06-17" "2019-06-17" "2019-06-17"
or any other format
format(as.POSIXct(x, format = "[%d/%b/%Y:%T"), "%d/%m/%Y")
#[1] "17/06/2019" "17/06/2019" "17/06/2019" "17/06/2019"
If you want to convert into Date object try this.
df$date = as.Date(df$date,format="[%d/%b/%Y:%H:%M:%S")
If you want to retain time as well, then try the following.
df$date = as.POSIXct(df$date,format="[%d/%b/%Y:%H:%M:%S")
Best wishes.
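The base R answers above can be put together in one small, self-contained sketch. It assumes the timestamps should be read as UTC, which the question does not state, and that the session locale recognises English month abbreviations such as "Jun" (the %b specifier is locale-dependent).

```r
# Sample values in the same shape as the question's data.
x <- c("[17/Jun/2019:08:33:49", "[17/Jun/2019:09:52:42")

# Keep the time: parse to POSIXct, with the literal "[" in the format.
stamp <- as.POSIXct(x, format = "[%d/%b/%Y:%H:%M:%S", tz = "UTC")

# Drop the time: convert to Date in the same time zone.
day <- as.Date(stamp, tz = "UTC")

# Render the Date as d/m/Y text if that display format is needed.
day_text <- format(day, "%d/%m/%Y")
```

Fixing the time zone in both calls avoids the date shifting by a day when the local zone differs from the one used for the conversion to Date.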

Removing time stamp to include just the month and day

I have a column called "rejected_at" in the data frame "Job_applications". The format is: "M/DD/YYYY H:M:S PM/AM". Now I want to create a new column that just has "M/DD". How would I do this?
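We can convert to a date-time class (POSIXct) and then extract the month and day. Note that the format must be passed by name (the second positional argument of as.POSIXct is the time zone), and that a 12-hour clock with AM/PM needs %I and %p:
v1 <- as.POSIXct(df1$rejected_at, format = "%m/%d/%Y %I:%M:%S %p")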
format(v1, "%d")
format(v1, "%m")
Or, if the format needed is %m/%d:
format(v1, "%m/%d")
Or using tidyverse
library(dplyr)
library(lubridate)
df1 %>%
mutate(rejected_at = mdy_hms(rejected_at),
day = day(rejected_at),
month = month(rejected_at))
Assuming the rejected_at column is just text, you could try using sub here:
x <- "12/30/2018 4:05:44 PM"
sub("/[^/]+$", "", x)
[1] "12/30"
Or, in your case:
Job_applications$new_col <- sub("/[^/]+$", "", Job_applications$rejected_at)
Edit:
If you also wanted to retain the year, then we can try using sub with a capture group targeting the full date:
sub("(\\d+/\\d+/\\d+).*", "\\1", x)
[1] "12/30/2018"
You can do it using lubridate
library(lubridate)
new_date <- mdy_hms("08/14/2017 09:59:06 PM")
new1 <- paste0(month(new_date),"/",day(new_date))
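For comparison, here is a base R sketch of the whole round trip on two sample strings. The sample values and the UTC time zone are assumptions, and %p relies on the session locale recognising "AM" and "PM".

```r
# Sample timestamps in the question's "M/DD/YYYY H:M:S PM/AM" shape.
x <- c("12/30/2018 4:05:44 PM", "8/14/2017 09:59:06 PM")

# %I is the 12-hour clock and %p the AM/PM marker.
stamp <- as.POSIXct(x, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Month/day with a zero-padded month ("08/14").
md_padded <- format(stamp, "%m/%d")

# Month/day without the leading zero ("8/14"), as in the "M/DD" request.
md <- sub("^0", "", md_padded)
```

Going through a parsed date-time also validates the input: values that do not match the format come back as NA instead of being silently truncated.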

How to convert a date to YYYYDDD?

I can't figure out how to turn Sys.Date() into a number in the format YYYYDDD, where DDD is the day of the year, i.e. Jan 1 would be 2016001 and Dec 31 would be 2016366 (2016 is a leap year).
Date <- Sys.Date() ## The Variable Date is created as 2016-01-01
SomeFunction(Date) ## Returns 2016001
You can just use the format function as follows:
format(Date, '%Y%j')
which gives:
[1] "2016161" "2016162" "2016163"
If you want to format it in other ways, see ?strptime for all the possible options.
Alternatively, you could use the year and yday functions from the data.table or lubridate packages and combine them with sprintf, zero-padding the day of the year to three digits (a plain paste0 would turn January 1 into "20161"):
library(data.table) # or: library(lubridate)
sprintf("%d%03d", year(Date), yday(Date))
which will give you the same result.
The values returned by both options are of class character. Wrap the above solutions in as.numeric() to get numeric values.
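The edge cases are worth checking explicitly, since the day of the year only needs padding early in the year. A small base R sketch, using fixed sample dates rather than Sys.Date() so the result is reproducible:

```r
# Sample dates: first day of the year, a mid-year day, last day of a leap year.
d <- as.Date(c("2016-01-01", "2016-06-09", "2016-12-31"))

# %j is the day of the year, always zero-padded to three digits.
as_char <- format(d, "%Y%j")

# Convert to numeric if a number rather than text is required.
as_num <- as.numeric(as_char)

as_char
as_num
```

Because %j always yields three digits, January 1 becomes 2016001, and December 31 of a leap year becomes 2016366.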
Used data:
> Date <- Sys.Date() + 1:3
> Date
[1] "2016-06-09" "2016-06-10" "2016-06-11"
> class(Date)
[1] "Date"
Here's one option with lubridate:
library(lubridate)
x <- Sys.Date()
#[1] "2016-06-08"
sprintf("%d%03d", year(x), yday(x))  # zero-pads the day of the year to three digits
#[1] "2016160"
This should work for creating a new column with the specified date format (%j is the zero-padded day of the year):
Date <- Sys.Date()
df$Month_Yr <- format(as.Date(df$Date), "%Y%j")
But, especially when working with larger data sets, it can be more efficient to do the following:
library(data.table)
setDT(df)[, NewDate := format(as.Date(Date), "%Y%j")]
Hope this helps. You may have to tinker if you only want one value and are not working with a data set.

Separate Date into week and year

Currently my dataframe has dates displayed in the 'Date' column as 01/01/2007 etc. I would like to convert these into a week/year value, i.e. 01/2007. Any ideas?
I have been trying things like this and getting nowhere...
enviro$Week <- strptime(enviro$Date, format= "%W/%Y")
You have to first convert to date, then you can convert back to the week of the year using format, for example:
### Converts character to date
test.date <- as.Date("10/10/2014", format="%m/%d/%Y")
### Extracts only Week of the year and year
format(test.date, format="Week number %W of %Y")
[1] "Week number 40 of 2014"
### Or if you prefer
format(test.date, format="%W/%Y")
[1] "40/2014"
So, in your case, you would do something like this:
enviro$Week <- format(as.Date(enviro$Date, format="%m/%d/%Y"), format= "%W/%Y")
But remember that the as.Date(enviro$Date, format="%m/%d/%Y") part is only necessary if your data is not already of class Date. If it is character, make sure the format string matches how your dates are actually written (for example "%d/%m/%Y" if the day comes first).
What is the class of enviro$Date? If it is already of class Date, format(enviro$Date, "%W/%Y") is all you need. Otherwise, note that splitting the string on "/" cannot give you the week: the second field of 01/01/2007 is the day (or month), not the week number. Parse the text first and let format compute the week:
d <- as.Date(as.character(enviro$Date), format = "%m/%d/%Y")
enviro$Week <- format(d, "%W/%Y")
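Which week number you get depends on the convention. The sketch below uses a made-up two-row enviro data frame and assumes the dates are written month first; it puts %W (weeks start on Monday) next to %U (weeks start on Sunday).

```r
# Hypothetical sample data in the question's format.
enviro <- data.frame(Date = c("01/01/2007", "10/10/2014"),
                     stringsAsFactors = FALSE)

# Parse the text once, then derive the week/year labels from the Date.
parsed <- as.Date(enviro$Date, format = "%m/%d/%Y")

enviro$Week    <- format(parsed, "%W/%Y")  # week of year, Monday as first day
enviro$WeekSun <- format(parsed, "%U/%Y")  # week of year, Sunday as first day

enviro
```

With %W, days before the first Monday of the year fall into week 00, so results near New Year can differ from ISO 8601 week numbering.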
