I have a dataset in which all of the date variables are messed up. All of the columns are characters. They look like this:
name <- c("Ana", "Maria", "Rachel", "Julia")
date_of_birth <- c("9/8/1997", "22/3/1966", "24/10/1969", "25/6/2019")
data <- as.data.frame(cbind(name, date_of_birth))
I need to turn those dates into dd/mm/yyyy format. They are already in this order, but I need to add a leading zero when dd or mm has only one digit.
For example, “9/8/1997” should be “09/08/1997”.
We can try this:
> format(as.Date(date_of_birth, format = "%d/%m/%Y"), "%d/%m/%Y")
[1] "09/08/1997" "22/03/1966" "24/10/1969" "25/06/2019"
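To write the padded values back into the data frame, something like the following should work. This is a minimal sketch that assumes the column is called date_of_birth, as in the question, and that every value really is in day/month/year order.

```r
# Sample data from the question
data <- data.frame(
  name = c("Ana", "Maria", "Rachel", "Julia"),
  date_of_birth = c("9/8/1997", "22/3/1966", "24/10/1969", "25/6/2019")
)

# Parse as day/month/year, then print back with zero-padded day and month
parsed <- as.Date(data$date_of_birth, format = "%d/%m/%Y")
data$date_of_birth <- format(parsed, "%d/%m/%Y")

data$date_of_birth
```

Note that the column is still character after this step; keep `parsed` around if you need real Date values for calculations.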
Let's say I have a data frame df like this:
df<-data.frame(date=c(202203,202204,202205,202206))
202203 means March, 2022.
So all the values of the date column represent a year and month.
However, since I don't know the exact day, I want to append 01 to every value of the date column. That is, 202203 should become 20220301: March 1st, 2022.
My expected output is
df<-data.frame(date=c(20220301,20220401,20220501,20220601))
I tried to use gsub, but the output was not what I expected.
df$date <- as.numeric(paste0(df$date, "01"))
oh hang on, got a better one:
df$date <- df$date * 100 + 1
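You can also use paste0 in a loop.
For example:
out <- character(nrow(df))
for (i in seq_along(df$date)) {
  # append "01" to each year-month value
  out[i] <- paste0(df$date[i], "01")
}
df$date <- as.numeric(out)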
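If the end goal is a real date rather than the number 20220301, the pasted string can be parsed directly. A small sketch, assuming the values are always six digits in yyyymm form:

```r
df <- data.frame(date = c(202203, 202204, 202205, 202206))

# paste0 is vectorized, so no loop is needed; "%Y%m%d" matches yyyymmdd
df$date <- as.Date(paste0(df$date, "01"), format = "%Y%m%d")

df$date
```

This gives a Date column set to the first day of each month, which sorts and subtracts correctly.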
I have columns that are named "X1.1.21", "X12.31.20" etc.
I can get rid of all the "X"s by using the substring function:
names(df) <- substring(names(df), 2, 8)
I've been trying many different methods to change "1.1.21" into a date format in R, but I'm having no luck so far. How can I go about this?
R doesn't like column names that start with a number (hence the X in front of them). However, you can still force R to allow column names that start with a number by using check.names = FALSE while reading the data.
If you want to include date format as column names, you can use :
df <- data.frame(X1.1.21 = rnorm(5), X12.31.20 = rnorm(5))
names(df) <- as.Date(names(df), 'X%m.%d.%y')
names(df)
#[1] "2021-01-01" "2020-12-31"
However, note that they look like dates but are still of type 'character'
class(names(df))
#[1] "character"
So if you are going to use the column names for date calculations, you need to convert them to Date first.
as.Date(names(df))
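As one example of such a calculation, here is a rough sketch that puts the columns in chronological order. The helper name col_dates is just something I made up for illustration.

```r
df <- data.frame(X1.1.21 = rnorm(5), X12.31.20 = rnorm(5))

# Keep the parsed dates in a separate Date vector (col_dates is a made-up name)
col_dates <- as.Date(names(df), "X%m.%d.%y")
names(df) <- format(col_dates)

# Reorder the columns from oldest to newest
df <- df[order(col_dates)]
names(df)
```

Keeping the Date vector separately avoids re-parsing the names each time you need them.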
Hope you can help me out!
For every date in a column, I would like to get the range covering the 14 days before it. So for example, if the first date in my column is 29-04-2021, I would like to get the dates from 15-04-2021 until 29-04-2021. I found the function seq that does this, but to do this for all the values in my column I need to put the seq function in a for loop.
This is what I tried but the output is only the last row and the date format changed. This is my code (test_IIVAC$'Vacdate 1' is my column with the dates):
df <- data.frame()
for(i in 1:length(test_IIVAC$`Vacdate 1`)){
te <- as.Date(seq(test_IIVAC$`Vacdate 1`[i]-14, test_IIVAC$`Vacdate 1`[i], by = "day"))
df1 <- rbind(df, te)
}
Can anyone help me out in getting the ranges of all the dates in the column and place them in one dataframe with the Date format? The desired output would be:
[Image: desired output]
Thanks a bunch!
You can use any of the apply functions to generate a sequence of dates for each value and add it as a new list column in the original data frame.
test_IIVAC$dates <- lapply(test_IIVAC$`Vacdate 1`, function(x) seq(x - 14, x, by = 'day'))
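If you would rather have everything in one long data frame with proper Date columns, a sketch along these lines might do. The two sample dates and the column names vacdate and day are made up for illustration.

```r
# Hypothetical sample data; the real test_IIVAC has more columns
test_IIVAC <- data.frame(
  `Vacdate 1` = as.Date(c("2021-04-29", "2021-05-10")),
  check.names = FALSE
)

# One small data frame per vaccination date, each covering 15 days
ranges <- lapply(test_IIVAC$`Vacdate 1`, function(x) {
  data.frame(vacdate = x, day = seq(x - 14, x, by = "day"))
})

# Stack them; rbind keeps the Date class of both columns
long <- do.call(rbind, ranges)
head(long, 3)
```

Building the pieces in a list and binding once also avoids the problem in the original loop, where df1 was overwritten on every iteration.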
I have the following problem: I have a lot of numbers:
x <- c(200103, 200106,200109)
Actually those are dates and I want them in the format 2001.03, 2001.06, 2001.09 etc., i.e. I want to add a dot after the first four digits. Is there a simple way to do that in R?
You can capture the data in two groups, the first 4 characters and the next 2, and add "." between them.
x <- c(200103, 200106,200109)
sub('(.{4})(.{2})', '\\1.\\2', x)
#[1] "2001.03" "2001.06" "2001.09"
The standard way would be to convert to date and use format to get data in required format.
format(as.Date(paste0(x, "01"), '%Y%m%d'), '%Y.%m')
We could convert to yearmon class and then use format
library(zoo)
format(as.yearmon(as.character(x), "%Y%m"), "%Y.%m")
#[1] "2001.03" "2001.06" "2001.09"
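Since x is numeric, plain arithmetic is another option. This is just a sketch of an alternative, and it assumes every value is a six-digit yyyymm number:

```r
x <- c(200103, 200106, 200109)

# Integer division gives the year, the remainder gives the month
res <- sprintf("%d.%02d", x %/% 100, x %% 100)
res
#[1] "2001.03" "2001.06" "2001.09"
```

The result is character, same as with the other approaches.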
I would like to find the intersection of two dataframes based on the date column.
Previously, I have been using this command to find the intersect of a yearly date column (where the date only contained the year)
common_rows <-as.Date(intersect(df1$Date, df2$Date), origin = "1970-01-01")
But now my date column for df1 is of type date and looks like this:
1985-01-01
1985-04-01
1985-07-01
1985-10-01
My date column for df2 is also of type date and looks like this (notice the days are different)
1985-01-05
1985-04-03
1985-07-07
1985-10-01
The above command works fine when I keep the format like this (i.e. year, month and day). But since my days are different and I am interested in the monthly intersection, I dropped the days like this, which produces an error when I look for the intersection:
df1$Date <- format(as.Date(df1$Date), "%Y-%m")
common_rows <-as.Date(intersect(df1$Date, df2$Date), origin = "1970-01-01")
Error in charToDate(x) :
character string is not in a standard unambiguous format
Is there a way to find the intersection of the two datasets, based on the year and month, while ignoring the day?
The problem is the as.Date() call wrapping your final output: a string such as "1985-01" has no day, so as.Date() cannot parse it. If you are fine with plain strings (with both columns formatted as "%Y-%m"), use common_rows <- intersect(df1$Date, df2$Date). Otherwise, append a day before converting:
common_rows <- as.Date(paste0(intersect(df1$Date, df2$Date), "-01"))
Try this:
date1 <- c('1985-01-01','1985-04-01','1985-07-01','1985-10-01')
date2 <- c('1985-01-05','1985-04-03','1985-07-07','1985-10-01')
# keep only the year-month part (drop the day); substr is vectorized
date1 <- substr(date1, 1, 7)
date2 <- substr(date2, 1, 7)
print(intersect(date1, date2))
[1] "1985-01" "1985-04" "1985-07" "1985-10"
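Putting it together for actual Date columns, here is a rough sketch. The sample values are hypothetical (I changed one month in df2 so that not everything matches), and ym1, ym2 and df1_common are made-up names.

```r
# Hypothetical sample data with Date columns
df1 <- data.frame(Date = as.Date(c("1985-01-01", "1985-04-01", "1985-07-01", "1985-10-01")))
df2 <- data.frame(Date = as.Date(c("1985-01-05", "1985-04-03", "1985-08-07", "1985-10-01")))

# Year-month keys for both columns; the original Date columns stay untouched
ym1 <- format(df1$Date, "%Y-%m")
ym2 <- format(df2$Date, "%Y-%m")

common <- intersect(ym1, ym2)

# Back to Date (first of the month) if a Date object is needed
common_rows <- as.Date(paste0(common, "-01"))

# Rows of df1 whose month also appears in df2
df1_common <- df1[ym1 %in% common, , drop = FALSE]
```

Working on separate key vectors means you never overwrite df1$Date with strings, so the full dates are still available afterwards.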