Currently my dataframe has dates displayed in the 'Date' column as 01/01/2007 etc. I would like to convert these into a week/year value, i.e. 01/2007. Any ideas?
I have been trying things like this and getting nowhere...
enviro$Week <- strptime(enviro$Date, format= "%W/%Y")
You have to first convert to date, then you can convert back to the week of the year using format, for example:
### Converts character to date
test.date <- as.Date("10/10/2014", format="%m/%d/%Y")
### Extracts only Week of the year and year
format(test.date, format="Week number %W of %Y")
[1] "Week number 40 of 2014"
### Or if you prefer
format(test.date, format="%W/%Y")
[1] "40/2014"
So, in your case, you would do something like this:
enviro$Week <- format(as.Date(enviro$Date, format="%m/%d/%Y"), format= "%W/%Y")
But remember that the as.Date(enviro$Date, format="%m/%d/%Y") part is only necessary if your data is not already of class Date. If it is character, the format parameter must match how your dates are actually written (for example "%d/%m/%Y" if the day comes first).
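As a minimal sketch of the approach above, assuming the Date column holds character strings with the day first (dd/mm/yyyy). The sample values are made up for illustration. Note that %W counts weeks starting on Monday, and days before the first Monday of the year fall in week 00.

```r
# Hypothetical sample data; only the column name comes from the question
enviro <- data.frame(Date = c("01/01/2007", "08/01/2007", "31/12/2007"),
                     stringsAsFactors = FALSE)

# Parse the strings first (assumption: day/month/year order)
d <- as.Date(enviro$Date, format = "%d/%m/%Y")

# %W = week of the year (00-53), with Monday as the first day of the week
enviro$Week <- format(d, "%W/%Y")
enviro$Week
```

If weeks should start on Sunday instead, %U can be used in place of %W.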
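What is the class of enviro$Date? If it is of class Date there is probably a better way of doing this. Otherwise you can try the following, which simply copies the second field of the string, so it is only useful if that field already holds the week number: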
v <- strsplit(as.character(enviro$Date), split = "/")
weeks <- sapply(v, "[", 2)
years <- sapply(v, "[", 3)
enviro$Week <- paste(weeks, years, sep = "/")
Related
I am importing a CSV with a dates column, and some of the dates are prior to 1/1/1900. I am trying to create a column in my dataframe with the dates in a %m-%d-%Y type of format, but my efforts return empty cells. Below is an example. Thanks for any help.
id <- c(1,2,3)
dates <- c(44321, 1, "December 25, 1890")
df <- data.frame(id, dates)
View(df)
df$dates2 <- as.Date(df$dates, format = "%m-%d-%y")
View(df)
Desired Output:
The mixture of formats you have makes things slightly awkward, but ...
library(lubridate)
as_date(
  ifelse(
    is.na(as.numeric(dates)),
    mdy(dates),
    dmy("01-Jan-1900") + days(as.numeric(dates)-1)
  )
)
[1] "2021-05-06" "1900-01-01" "1890-12-25"
which seems reasonable.
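Are you sure about the conversion of 44321? [1900 was not a leap year, although 2000 was. Excel nevertheless treats 1900 as a leap year, so if 44321 is an Excel serial number it corresponds to 2021-05-05, one day earlier than the result above.]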
The as.numeric() calls are required because dates is coerced to character because of the final entry in the vector.
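If the numbers in the column are Excel serial dates (an assumption; the question does not say where they come from), a base R sketch that reproduces what Excel displays could look like this. Excel counts a nonexistent 29 February 1900, so serials from 61 onward are shifted by one extra day. The helper name excel_serial_to_date is made up for illustration.

```r
# Assumption: English month names; the C locale guarantees them for %B
invisible(Sys.setlocale("LC_TIME", "C"))

dates <- c("44321", "1", "December 25, 1890")

# Hypothetical helper: Excel serial number -> Date as Excel would display it
excel_serial_to_date <- function(n) {
  # Serial 1 is 1900-01-01; from serial 61 on, skip Excel's fictitious
  # 1900-02-29 (serial 60 itself has no real calendar counterpart)
  as.Date(ifelse(n < 61, n - 1, n - 2), origin = "1900-01-01")
}

num    <- suppressWarnings(as.numeric(dates))  # NA for the text entry
is_num <- !is.na(num)

out <- rep(as.Date(NA), length(dates))
out[is_num]  <- excel_serial_to_date(num[is_num])
out[!is_num] <- as.Date(dates[!is_num], format = "%B %d, %Y")

# The %m-%d-%Y layout asked for in the question
format(out, "%m-%d-%Y")
```

Keeping the result as a Date vector and calling format() only for display avoids the empty cells, because each entry is parsed with the rule that matches its own layout.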
I imported Excel data into R and I have a problem converting dates.
In R, my data are character and look like:
date<-c('1971-02-00 00:00:00', '1979-06-00 00:00:00')
I would like to convert character into date (MM/YYYY) but the '00' value used for days poses a problem and 'NA' are returned systematically.
It works when I manually replace '00' with '01' and then use as.yearmon, ymd and format. But I have lots of dates to change and I don't know how to change all my '00' into '01' in R.
# example data
date1<-c('1971-02-00 00:00:00', '1979-06-00 00:00:00')
# removing time -> doesn't work because of the '00' day
date1c<-format(strptime(date1, format = "%Y-%m-%d"), "%Y/%m/%d")
date1c<-format(strptime(date1, format = '%Y-%m'), '%Y/%m')
# trying to convert character into date -> doesn't work either
date1c<-ymd(date1)
date1c<-strptime(date1, format = "%Y-%m-%d %H:%M:%S")
date1c<-as.Date(date1, format="%Y-%m-%d %H:%M:%S")
date1c<-as.yearmon(date1, format='%Y%m')
# everything works if days are '01'
date2<-c('1971-02-01 00:00:00', '1979-06-01 00:00:00')
date2c<-as.yearmon(ymd(format(strptime(date2, format = "%Y-%m-%d"), "%Y/%m/%d")))
date2c
If you have an idea how to do this, or another way to solve my problem, I would be thankful!
Use gsub to replace -00 with -01.
date1<-c('1971-02-00 00:00:00', '1979-06-00 00:00:00')
date1 <- gsub("-00", "-01", date1)
date1c <-format(strptime(date1, format = "%Y-%m-%d"), "%Y/%m/%d")
> date1c
[1] "1971/02/01" "1979/06/01"
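A slightly more defensive variant, sketched under the assumption that every string starts with YYYY-MM-DD, anchors the pattern so that only a 00 in the day field is replaced and entries with a valid day are left alone. The third sample value is made up to show that case.

```r
date1 <- c("1971-02-00 00:00:00", "1979-06-00 00:00:00", "1985-10-15 00:00:00")

# Replace only a day field of 00 that directly follows the leading YYYY-MM
fixed <- sub("^([0-9]{4}-[0-9]{2})-00", "\\1-01", date1)

# as.Date() ignores the trailing time once the leading date has been parsed
as.Date(fixed, format = "%Y-%m-%d")
```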
Another possibility could be:
as.Date(paste0(substr(date1, 1, 9), "1"), format = "%Y-%m-%d")
[1] "1971-02-01" "1979-06-01"
Here it extracts the first nine characters, pastes it together with 1 and then converts it into a date object.
These alternatives each accept a vector input and produce a vector as output.
Date output
Each of these accepts a vector as input and produces a Date vector as output.
# 1. replace first occurrence of '00 ' with '01 ' and then convert to Date
as.Date(sub("00 ", "01 ", date1))
## [1] "1971-02-01" "1979-06-01"
# 2. convert to yearmon class and then to Date
library(zoo)
as.Date(as.yearmon(date1, "%Y-%m"))
## [1] "1971-02-01" "1979-06-01"
# 3. insert a 1 and then convert to Date
as.Date(paste(1, date1), "%d %Y-%m")
## [1] "1971-02-01" "1979-06-01"
yearmon output
Note that if you really are trying to represent just months and years then yearmon class directly represents such objects without the kludge of using an unused day of the month. Such objects are internally represented as a year plus a fraction of a year, i.e. year + 0 for January, year + 1/12 for February, etc. They display in a meaningful way, they sort in the expected manner and can be manipulated, e.g. take the difference between two such objects or add 1/12 to get the next month, etc. As with the others it takes a vector in and produces a vector out.
library(zoo)
as.yearmon(date1, "%Y-%m")
## [1] "Feb 1971" "Jun 1979"
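To make the "year plus a fraction of a year" representation concrete, here is a base R sketch that mimics it without zoo. The helper ym_num is made up for this illustration and is not how zoo itself is implemented.

```r
date1 <- c("1971-02-00 00:00:00", "1979-06-00 00:00:00")

# Hypothetical helper: year + (month - 1) / 12, e.g. February 1971 -> 1971 + 1/12
ym_num <- function(x) {
  year  <- as.integer(substr(x, 1, 4))
  month <- as.integer(substr(x, 6, 7))
  year + (month - 1) / 12
}

ym <- ym_num(date1)

# Difference between the two entries, expressed in whole months
months_apart <- round(12 * (ym[2] - ym[1]))
months_apart
```

Because the values are plain numbers on a common scale, sorting and subtraction behave as expected, which is the property the yearmon class builds on.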
character output
If you want character output rather than Date or yearmon output then these variations work and again accept a vector as input and produce a vector as output:
# 1. replace -00 and everything after that with a string having 0 characters
sub("-00.*", "", date1)
## [1] "1971-02" "1979-06"
# 2. convert to yearmon and then format that
library(zoo)
format(as.yearmon(date1, "%Y-%m"), "%Y-%m")
## [1] "1971-02" "1979-06"
# 3. convert to Date class and then format that
format(as.Date(paste(1, date1), "%d %Y-%m"), "%Y-%m")
## [1] "1971-02" "1979-06"
# 4. pick off the first 7 characters
substring(date1, 1, 7)
## [1] "1971-02" "1979-06"
I need to parse dates and have cases like "31/02/2018":
library(lubridate)
> dmy("31/02/2018", quiet = T)
[1] NA
This makes sense as the 31st of Feb does not exist. Is there a way to parse the string "31/02/2018" to e.g. 2018-02-28 ? So not to get an NA, but an actual date?
Thanks.
We can write a function, assuming the only problem is a day that exceeds the number of days in the month and that the format is always the same.
library(lubridate)
get_correct_date <- function(example_date) {
  # Split the string on "/" into 3 components (day, month, year)
  vecs <- as.numeric(strsplit(example_date, "/", fixed = TRUE)[[1]])
  # Number of days in that month of that year (so leap years are respected)
  last_day_of_month <- days_in_month(make_date(vecs[3], vecs[2], 1))
  # If the input day is higher than the number of days in that month,
  # replace it with the last day of that month
  if (vecs[1] > last_day_of_month)
    vecs[1] <- last_day_of_month
  # Paste the components back together and parse the corrected date
  dmy(paste0(vecs, collapse = "/"))
}
get_correct_date("31/02/2018")
#[1] "2018-02-28"
get_correct_date("31/04/2018")
#[1] "2018-04-30"
get_correct_date("31/05/2018")
#[1] "2018-05-31"
With a small modification you can handle other formats, or days that fall below the first of the month.
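For a whole column of such strings, a vectorised base R sketch could look like the following. It assumes every input is a well-formed "dd/mm/yyyy" string with a valid month, and the helper name clamp_dmy is made up for illustration. The month length is derived from the year as well, so 29 February is kept in leap years.

```r
# Hypothetical helper: parse "dd/mm/yyyy" strings, capping the day at month end
clamp_dmy <- function(x) {
  parts <- do.call(rbind, strsplit(x, "/", fixed = TRUE))
  day   <- as.integer(parts[, 1])
  month <- as.integer(parts[, 2])
  year  <- as.integer(parts[, 3])

  first_of_month <- as.Date(sprintf("%04d-%02d-01", year, month))

  # The last day of a month is the day before the first of the next month
  next_month <- ifelse(month == 12L, 1L, month + 1L)
  next_year  <- ifelse(month == 12L, year + 1L, year)
  last_day   <- as.integer(format(
    as.Date(sprintf("%04d-%02d-01", next_year, next_month)) - 1, "%d"))

  # Start from the first of the month and add the (capped) day offset
  first_of_month + (pmin(day, last_day) - 1L)
}

clamp_dmy(c("31/02/2018", "29/02/2020", "31/02/2020", "31/12/2018"))
```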
I have dates in a format like this:
5170301, which means 1st March 2017, with a 5 attached at the front.
I want to change the format of the date.
So can anyone help me split that 5 off the date?
We can use substring to read from the 2nd character onwards
v1 <- substring(df1$date, 2)
NOTE: It should work for numeric/character/factor class
Then we change it to Date class
v2 <- as.Date(v1, "%y%m%d")
and if needed change the format
format(v2, "%d %b %Y")
Or, as @thelatemail mentioned, the 5 can be included in the format (convert to character first if the column is numeric or a factor)
as.Date(as.character(df1$date), "5%y%m%d")
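Putting the steps together, here is a short sketch with made-up sample data in the shape described in the question (a numeric column named date in a data frame df1, both names taken from the answer above):

```r
# Hypothetical sample data: a leading 5 followed by yymmdd
df1 <- data.frame(date = c(5170301, 5161225))

# Drop the leading "5"; substring() coerces the numbers to character first
v1 <- substring(df1$date, 2)

# Two-digit year, month, day
v2 <- as.Date(v1, format = "%y%m%d")

# Same result with the literal 5 inside the format string;
# as.character() is needed here because the column is numeric
v3 <- as.Date(as.character(df1$date), format = "5%y%m%d")

# Any display layout can then be produced with format()
format(v2, "%d/%m/%Y")
```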
You can split it quite nicely with the stringr package
Split <- stringr::str_split_fixed(string=Column_Name, pattern="5", n=2)
This gives a two-column matrix: the first column is blank and the second holds the value after the leading "5" (170301).
You can then convert the second column to a date like so:
Date1 <- as.Date(Split[, 2], format = "%y%m%d")
I have a file with birthdays in %d%b%y format. Some examples:
# "01DEC71" "01AUG54" "01APR81" "01MAY81" "01SEP83" "01FEB59"
I tried to reformat the date as
o108$fmtbirth <- format(as.Date(o108$birth, "%d%b%y"), "%Y/%m/%d")
and this is the result
# "1971/12/01" "2054/08/01" "1981/04/01" "1981/05/01" "1983/09/01" "2059/02/01"
These are birthdays and I see 2054. From this page I see that year values between 00 and 68 are coded as 20 for century. Is there a way to toggle this, in my case I want only 00 to 12 to be coded as 20.
1) chron. chron uses a cutoff of 30 by default, so this will convert them. The steps are: convert to Date first (since chron can't read those sorts of dates), reformat to character with two-digit years in a format that chron can understand, and finally convert back to Date.
library(chron)
xx <- c("01AUG11", "01AUG12", "01AUG13") # sample data
as.Date(chron(format(as.Date(xx, "%d%b%y"), "%m/%d/%y")))
That gives a cutoff of 30 but we can get a cutoff of 13 using chron's chron.year.expand option:
library(chron)
options(chron.year.expand =
function (y, cut.off = 12, century = c(1900, 2000), ...) {
chron:::year.expand(y, cut.off = cut.off, century = century, ...)
}
)
and then repeating the original conversion. For example assuming we had run this options statement already we would get the following with our xx :
> as.Date(chron(format(as.Date(xx, "%d%b%y"), "%m/%d/%y")))
[1] "2011-08-01" "2012-08-01" "1913-08-01"
2) Date only. Here is an alternative that does not use chron. You might want to replace "2012-12-31" with Sys.Date() if the idea is that otherwise future dates are really to be set 100 years back:
d <- as.Date(xx, "%d%b%y")
as.Date(ifelse(d > "2012-12-31", format(d, "19%y-%m-%d"), format(d)))
EDIT: added Date only solution.
See response from related thread:
format(as.Date("65-05-14", "%y-%m-%d"), "19%y-%m-%d")
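# Keep the Date class for the comparison; format() would turn it into character
o108$fmtbirth <- as.Date(o108$birth, "%d%b%y")
# Dates in the future are assumed to belong to the previous century
o108$fmtbirth <- as.Date(ifelse(o108$fmtbirth > Sys.Date(),
                                format(o108$fmtbirth, "19%y-%m-%d"),
                                format(o108$fmtbirth)))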
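If a fixed cutoff is preferred over comparing with today's date, the same idea can be wrapped in a small base R helper. This is only a sketch: the name fix_century and its pivot argument are made up for illustration, and it assumes that two-digit years from 00 up to the pivot belong to the 2000s and all others to the 1900s.

```r
# Assumption: English month abbreviations; the C locale guarantees them for %b
invisible(Sys.setlocale("LC_TIME", "C"))

# Hypothetical helper: two-digit years 00..pivot become 20xx, the rest 19xx
fix_century <- function(x, fmt = "%d%b%y", pivot = 12) {
  d  <- as.Date(x, format = fmt)
  yy <- as.integer(format(d, "%y"))  # two-digit year as parsed
  full_year <- ifelse(yy <= pivot, 2000L, 1900L) + yy
  # Rebuild the date with the chosen century; unparseable input stays NA
  as.Date(sprintf("%d-%s", full_year, format(d, "%m-%d")), format = "%Y-%m-%d")
}

birth <- c("01DEC71", "01AUG54", "01FEB11")
fix_century(birth)
```

The result is still of class Date, so format(fix_century(birth), "%Y/%m/%d") gives the slash-separated layout from the question.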