Cast text to timestamp - r

Let's say I have the following data frame.
date <- c('23/01/21 22:53','15/02/21 20:01', '05/03/21 07:49', '10/01/21 18:15', '09/03/21 12:53' )
id <- c(1:5)
df <- data_frame(id, date)
I tried to cast this from text to timestamp using the following code
df$date2 <- strptime(df$date, "%d/%m/%Y %H:%M")
In the result, the year shows as 0021 instead of 2021. Is there a way I can show the right year?
Any help is much appreciated.

You could do a replacement to convert the two-digit year to the correct four-digit value:
date <- c('23/01/21 22:53','15/02/21 20:01', '05/03/21 07:49', '10/01/21 18:15', '09/03/21 12:53')
# Insert the century before the two-digit year: "23/01/21 22:53" becomes "23/01/2021 22:53"
date <- paste0(substr(date, 1, 6),
               ifelse(as.numeric(substr(date, 7, 8)) < 30, "20", "19"),
               substr(date, 7, 14))
df <- data.frame(id = 1:5, date)
df$date2 <- strptime(df$date, "%d/%m/%Y %H:%M")
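A shorter route, sketched here as an alternative to the string replacement above, is to parse the two-digit year directly with %y instead of %Y. According to ?strptime, %y maps 00 to 68 onto 20xx and 69 to 99 onto 19xx. The use of as.POSIXct and the UTC time zone are assumptions made for this example, not part of the original answer.

```r
# Same sample data as in the question
date <- c('23/01/21 22:53', '15/02/21 20:01', '05/03/21 07:49',
          '10/01/21 18:15', '09/03/21 12:53')
df <- data.frame(id = 1:5, date = date)

# %y (lowercase) reads a two-digit year; %Y expects four digits
df$date2 <- as.POSIXct(df$date, format = "%d/%m/%y %H:%M", tz = "UTC")

# Inspect the parsed years
format(df$date2, "%Y")
```

If the data could contain years before 1969, the explicit century rule in the answer above gives more control over the cutoff.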

Related

Changing date formats from MM-YYYY to DD-MM-YYYY by creating random DD

I have a dataset that has a variable date_of_birth (MM-YYYY). I would like to change this format to DD-MM-YYYY by creating random DD for each observation.
df1 <- as.Date(paste0(df,"01/",MMYYYY),format="%d-%m-%Y")
dates <- c("02-1986", "03-1990")
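add_random_day <- function(date) {
  # Prepend a placeholder day so the string parses as a complete date
  date <- lubridate::dmy(paste0("01-", date))
  days_in_month <- lubridate::days_in_month(date)
  # Draw one day between 1 and the number of days in each month
  random_day <- sapply(days_in_month, sample, size = 1)
  lubridate::day(date) <- random_day
  date
}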
add_random_day(dates)
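For comparison, here is a minimal base R sketch of the same idea without lubridate. The helper name add_random_day_base is hypothetical, and the seed is set only to make the example reproducible.

```r
add_random_day_base <- function(x) {
  # Parse as the first day of each month
  first <- as.Date(paste0("01-", x), format = "%d-%m-%Y")
  # Days in each month: first day of the next month minus first day of this one
  n_days <- vapply(seq_along(first), function(i) {
    as.integer(seq(first[i], by = "month", length.out = 2)[2] - first[i])
  }, integer(1))
  # Shift each date by a random offset inside its own month
  first + vapply(n_days, function(n) sample.int(n, 1L), integer(1)) - 1L
}

set.seed(42)
out <- add_random_day_base(c("02-1986", "03-1990"))
# Render in the requested DD-MM-YYYY layout
format(out, "%d-%m-%Y")
```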

Convert date format from dd-mm-yyyy to dd/mm/yyyy

I would like to convert my date format from dd-mm-yyyy to dd/mm/yyyy.
Data:
date
1 22-Jul-2020
Current code:
format(as.Date(df$date, '%d:%m:%Y'), '%d/%m/%Y' )
[1] NA NA
Desired Output:
date
1 22/07/2020
The format passed to as.Date should match the input format: %d followed by -, then the abbreviated month name (%b), followed by - and the four-digit year (%Y).
df$date <- format(as.Date(df$date, '%d-%b-%Y'), '%d/%m/%Y' )
df$date
#[1] "22/07/2020"
data
df <- structure(list(date = "22-Jul-2020"), class = "data.frame", row.names = "1")
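One caveat: %b matches month abbreviations in the current locale, so "Jul" may fail to parse in a non-English session. The sketch below avoids that by looking the month up in base R's built-in month.abb vector. The second sample value is made up for illustration, and the approach assumes English three-letter abbreviations in the input.

```r
x <- c("22-Jul-2020", "03-Jan-2021")

# Split each string into day, month abbreviation and year
parts <- strsplit(x, "-", fixed = TRUE)
day <- as.integer(vapply(parts, `[`, character(1), 1))
mon <- match(vapply(parts, `[`, character(1), 2), month.abb)
yr  <- vapply(parts, `[`, character(1), 3)

# Reassemble as dd/mm/yyyy with zero-padded day and month
out <- sprintf("%02d/%02d/%s", day, mon, yr)
out
```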
You can try
library(lubridate)
df <- data.frame(date = c("22-Jul-2020"))
df$date <- dmy(df$date)
df$date <- format(df$date, format = "%d/%m/%Y")
# date
#1 22/07/2020

Convert Date formats in base R [duplicate]

Given two dates in a data frame that are in this format:
df <- tibble(date = c('25/05/95', '21/09/18'))
df$date <- as.Date(df$date)
How can I convert the dates into this format, date = c('1995-05-25', '2018-09-21'), with the year appearing first and in four-digit format, using only base R?
Here is my attempt, I successfully reversed the order, but still wasn't able to express the year in 4 digit format:
df <- tibble(date_orig = c('25/05/1995', '21/09/2018'))
df$date <- as.Date(df$date_orig)
year_date <- format(df$date, '%d')
month_date <- format(df$date, '%m')
day_date <- format(df$date, '%y')
df$newdate <- as.Date(paste(paste(year_date, month_date, sep = '-'), day_date, sep = '-'))
df$newdate_final <- as.Date(df$newdate, '%Y-%m-%d')
You need to know which format your dates follow and look up the matching conversion specifications in ?strptime to convert them to Date objects. Since your required output is the standard way R represents dates, you do not need format().
as.Date(df$date_orig, "%d/%m/%Y")
#[1] "1995-05-25" "2018-09-21"
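The first data frame in the question uses two-digit years, which %Y does not handle. A minimal sketch for that case, using %y instead (per ?strptime, 00 to 68 become 20xx and 69 to 99 become 19xx):

```r
# Two-digit years as in the first example of the question
date_short <- c('25/05/95', '21/09/18')

# %y expands the two-digit year to four digits
d <- as.Date(date_short, format = "%d/%m/%y")
format(d)
```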

"Week-Year" string to a meaningful date or numeric format that can be plotted

I have a set of data that includes dates in the "week-year" format, so "30-2010" represents the 30th week of 2010. I'm trying to plot the data, but need to convert the date values to a date format or a numeric value so that ggplot2 will use them as the labels on my x axis. Any ideas on how this can be done?
dte = "30-2010"
Check out ?week in the lubridate package. This seems to do what you want:
library(lubridate)
str <- "30-2010"
wk <- substr(str, 1, 2)
yr <- substr(str, 4, 7)
dt <- as.Date(paste0(yr, "-01-01"))
week(dt) <- as.numeric(wk)
dt
[1] "2010-07-23"
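The same result can be sketched in base R with plain date arithmetic. This assumes the convention lubridate's week() uses (week 1 starts on January 1 and each week is a block of seven days), not ISO 8601 weeks. The helper name week_year_to_date is hypothetical.

```r
week_year_to_date <- function(x) {
  # Split "WW-YYYY" into week number and year
  parts <- strsplit(x, "-", fixed = TRUE)
  wk <- as.integer(vapply(parts, `[`, character(1), 1))
  yr <- vapply(parts, `[`, character(1), 2)
  # January 1 of the year plus (week - 1) blocks of seven days
  as.Date(paste0(yr, "-01-01")) + (wk - 1L) * 7L
}

dates <- week_year_to_date(c("30-2010", "1-2011"))
dates
```

Because the result is a Date vector, ggplot2 can place it on a date axis directly.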

compare time intervals in R

Let's say I have a data frame consisting of 3 columns with dates:
index <- c("31.10.2012", "16.06.2012")
begin <- c("22.10.2012", "29.05.2012")
end <- c("24.10.2012", "17.06.2012")
index.new <- as.Date(index, format = "%d.%m.%Y")
begin.new <- as.Date(begin, format = "%d.%m.%Y")
end.new <- as.Date(end, format = "%d.%m.%Y")
data.frame(index.new, begin.new, end.new)
My problem: I want to select (subset) the rows where the interval between the begin and end dates falls within the 4 days before the index day. That is only the case in row 2.
Can you help me out here?
The way you express the problem is messy: in the first case dates.new[1] > dates.new[2], and in the second case dates.new[3] < dates.new[4]. Putting the intervals in order:
interval1 <- c(dates.new[2], dates.new[1])
interval2 <- c(dates.new[3], dates.new[4])
To check whether interval2 contains interval1:
all.equal(findInterval(interval1, interval2), c(1, 1))
Please let me know if this works and if it is what you want.
library("timeDate")
index <- c("31.10.2012", "16.06.2012")
begin <- c("22.10.2012", "29.05.2012")
end <- c("24.10.2012", "17.06.2012")
index.new <- as.Date(index, format = "%d.%m.%Y")
begin.new <- as.Date(begin, format = "%d.%m.%Y")
end.new <- as.Date(end, format = "%d.%m.%Y")
data <- data.frame(index.new, begin.new, end.new)
apply(data, 1, function(x){paste(x[1]) %in% paste(timeSequence(x[2], x[3], by = "day"))})
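A vectorised base R sketch is also possible. It rests on one interpretation of the question, which is an assumption: a row qualifies when its begin-to-end interval overlaps the window running from 4 days before the index day up to the index day itself. That reading matches the asker's expectation that only row 2 is selected.

```r
index.new <- as.Date(c("31.10.2012", "16.06.2012"), format = "%d.%m.%Y")
begin.new <- as.Date(c("22.10.2012", "29.05.2012"), format = "%d.%m.%Y")
end.new   <- as.Date(c("24.10.2012", "17.06.2012"), format = "%d.%m.%Y")
data <- data.frame(index.new, begin.new, end.new)

# Window of interest: the 4 days leading up to the index day
window_start <- data$index.new - 4

# Two intervals overlap when each one starts before the other ends
overlaps <- data$begin.new <= data$index.new & data$end.new >= window_start

selected <- data[overlaps, ]
selected
```

If the interval must lie entirely inside the window, the condition would instead be data$begin.new >= window_start & data$end.new <= data$index.new, which selects no rows for this sample data.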

Resources