I have a data frame 'rta' with a date variable (date of death) with data entered in multiple formats like DD/MM/YY, D/M/YY, DD/M/YY, D/MM/YY, DD/MM, D/MM, D/M, DD/M.
rta$date.of.death<-c('12/12/08' ,'1/10/08','4/3/08','24/5/08','23/4','11/11','1/12')
Luckily all the dates belong to the year 2008.
I want to convert this variable to a uniform DD/MM/YYYY format, for example 12/12/2008. How can I do that?
You could use this quick'n'dirty way:
rta <- data.frame(date.of.death=c('12/12/08', '1/10/08', '4/3/08',
                                  '24/5/08', '23/4', '11/11', '1/12'),
                  stringsAsFactors=FALSE)
# append '/08' to the dates without year
noYear <- grep('.+/.+/.+',rta$date.of.death,invert=TRUE)
rta$date.of.death[noYear] <- paste(rta$date.of.death[noYear],'08',sep='/')
# convert the strings into POSIXct dates
dates <- as.POSIXct(rta$date.of.death, format='%d/%m/%y')
# turn the dates into strings having format: DD/MM/YYYY
rta$date.of.death <- format(dates,format='%d/%m/%Y')
> rta$date.of.death
[1] "12/12/2008" "01/10/2008" "04/03/2008" "24/05/2008" "23/04/2008" "11/11/2008" "01/12/2008"
Note: this code assumes that no date has a four-digit year, e.g. 01/01/2008.
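If some values might carry a four-digit year after all, one way around that assumption is to rebuild each date from its parts instead of relying on a single format string. A minimal sketch in base R; normalise_dates() and its default_year argument are names made up here, and it assumes that two-digit years belong to the 2000s:

```r
# Hypothetical helper (not from the answer above): rebuild DD/MM/YYYY
# from the day, month and optional year fields of each string.
normalise_dates <- function(x, default_year = "2008") {
  parts <- strsplit(x, "/", fixed = TRUE)
  vapply(parts, function(p) {
    day   <- as.integer(p[1])
    month <- as.integer(p[2])
    year  <- if (length(p) < 3) {
      default_year               # no year given: fall back to the default
    } else if (nchar(p[3]) == 2) {
      paste0("20", p[3])         # assumption: two-digit years are 20xx
    } else {
      p[3]                       # already four digits
    }
    sprintf("%02d/%02d/%s", day, month, year)
  }, character(1))
}

normalise_dates(c("12/12/08", "1/10/08", "23/4", "01/01/2008"))
```

Because the fields are padded with sprintf(), this returns strings directly and never goes through a date class, so it will not catch impossible dates such as 31/02.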
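Try this. It appends "/08" to every value; as.Date() ignores any characters left over once the format has been matched, so values that already carry a year are unaffected: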
as.Date(paste0(rta$date.of.death, "/08"), "%d/%m/%y")
giving
[1] "2008-12-12" "2008-10-01" "2008-03-04" "2008-05-24" "2008-04-23"
[6] "2008-11-11" "2008-12-01"
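The result above is a Date vector, which prints in ISO format (YYYY-MM-DD). To get the DD/MM/YYYY strings asked for in the question, it can be passed through format(). A short sketch using the same sample data, held here in a standalone vector called dod (a name chosen for this example) rather than in a data frame column:

```r
dod <- c("12/12/08", "1/10/08", "4/3/08", "24/5/08", "23/4", "11/11", "1/12")

# as.Date() stops reading once the format is satisfied, so the extra
# "/08" on values that already have a year is ignored.
dates <- as.Date(paste0(dod, "/08"), "%d/%m/%y")

# Convert the Date vector back to character in DD/MM/YYYY form
uniform <- format(dates, "%d/%m/%Y")
uniform
```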
I would like to create a vector of dates between two specified moments in time with a step of 1 month, as described in this thread (Create a Vector of All Days Between Two Dates), to be then converted into factors for data visualization.
However, I'd like to have the dates in the YYYY-Mon format, i.e. 2010-Feb. So far I have only managed to get the dates in the standard format 2010-02-01, using code like this:
require(lubridate)
first <- ymd_hms("2010-02-07 15:00:00 UTC")
start <- ymd(floor_date(first, unit="month"))
last <- ymd_hms("2017-10-29 20:00:00 UTC")
end <- ymd(ceiling_date(last, unit="month"))
> start
[1] "2010-02-01"
> end
[1] "2017-11-01"
How can I change the format to YYYY-Mon?
You can use format() (the %>% pipe requires magrittr or dplyr to be loaded):
start %>% format('%Y-%b')
To create the vector, use seq():
seq(start, end, by = 'month') %>% format('%Y-%b')
Note: use a capital 'B' for the full month name: '%Y-%B'.
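Since the labels are meant to become factors for plotting, it may help to fix the level order explicitly; otherwise factor() sorts the labels alphabetically (2010-Apr before 2010-Feb). A small base R sketch, assuming a shorter date range than in the question and the "C" time locale for English month abbreviations:

```r
# Assumption: English month abbreviations are wanted
invisible(Sys.setlocale("LC_TIME", "C"))

start <- as.Date("2010-02-01")
end   <- as.Date("2010-06-01")
month_labels <- format(seq(start, end, by = "month"), "%Y-%b")

# Keep chronological order by passing the labels as the levels
month_factor <- factor(month_labels, levels = month_labels)
levels(month_factor)
```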
I can't figure out how to turn Sys.Date() into a number in the format YYYYDDD, where DDD is the day of the year, i.e. Jan 1 would be 2016001 and Dec 31 would be 2016366 (2016 is a leap year).
Date <- Sys.Date() ## The Variable Date is created as 2016-01-01
SomeFunction(Date) ## Returns 2016001
You can just use the format function as follows:
format(Date, '%Y%j')
which gives:
[1] "2016161" "2016162" "2016163"
If you want to format it in other ways, see ?strptime for all the possible options.
Alternatively, you could use the year and yday functions from the data.table or lubridate packages and combine them with sprintf, which zero-pads the day of the year to three digits (a plain paste0 would turn Jan 1 into "20161"):
library(data.table) # or: library(lubridate)
sprintf('%d%03d', year(Date), yday(Date))
which will give you the same result.
The values returned by both options are of class character. Wrap the above solutions in as.numeric() to get numeric values.
Used data:
> Date <- Sys.Date() + 1:3
> Date
[1] "2016-06-09" "2016-06-10" "2016-06-11"
> class(Date)
[1] "Date"
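To get an actual number, including for dates early in the year, the format() result can be wrapped in as.numeric(). A sketch with a hypothetical helper name, year_day(), tried on a few fixed dates instead of Sys.Date():

```r
# Hypothetical helper: YYYYDDD as a number; "%j" zero-pads the day of year
year_day <- function(d) as.numeric(format(d, "%Y%j"))

year_day(as.Date(c("2016-01-01", "2016-06-09", "2016-12-31")))
```

The padding has to happen while the value is still a string; converting year and day to numbers first and then gluing them together would lose the leading zeros.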
Here's one option with lubridate:
library(lubridate)
x <- Sys.Date()
#[1] "2016-06-08"
sprintf("%d%03d", year(x), yday(x))  # zero-pads the day, so Jan 1 gives "2016001"
#[1] "2016160"
This should work for creating a new column with the specified date format:
df <- data.frame(Date = Sys.Date())
df$Month_Yr <- format(as.Date(df$Date), "%Y%j")  # %j is the day of the year
But, especially when working with larger data sets, it is easier to do the following:
library(data.table)
setDT(df)[, NewDate := format(as.Date(Date), "%Y%j")]
Hope this helps. You may have to tinker if you only want one value and are not working with a data set.
I have the input a <- "[12/Dec/2014:05:45:10]".
a is not a timestamp, so I cannot use any date or time functions on it.
Now I want to split the above variable into two parts:
date --> 12/Dec/2014
time --> 05:45:10
Any help will be appreciated.
You can use gsub to put a space between the date and the time, and then use that to create two columns with read.table:
read.table(text=gsub('^\\[([^:]+):(.*)\\]', '\\1 \\2', a),
           sep="", col.names=c('Date', 'Time'))
#          Date     Time
# 1 12/Dec/2014 05:45:10
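If you only need the two pieces as strings, the same split can be done with sub() alone. A sketch under the assumption that the input always has the form [date:time], with the first colon separating the two; the variable names are chosen for this example:

```r
a <- "[12/Dec/2014:05:45:10]"

inner     <- gsub("^\\[|\\]$", "", a)     # strip the surrounding brackets
date_part <- sub(":.*$", "", inner)       # everything before the first colon
time_part <- sub("^[^:]*:", "", inner)    # everything after the first colon

c(date = date_part, time = time_part)
```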
Or you can use lubridate to convert it to a 'POSIXct' class
library(lubridate)
a1 <- dmy_hms(a)
a1
#[1] "2014-12-12 05:45:10 UTC"
If we need two columns with the specified format:
d1 <- data.frame(Date = format(a1, '%d/%b/%Y'), Time = format(a1, '%H:%M:%S'))
data
a <- "[12/Dec/2014:05:45:10]"
Code
a <- "[12/Dec/2014:05:45:10]"
Sys.setlocale("LC_TIME", "C") # needed only if your locale does not use English month abbreviations
as.POSIXlt(a, format = "[%d/%b/%Y:%H:%M:%S]")
Explanation
Depending on your locale settings, you may need to change LC_TIME so that the abbreviated English month names can be read. Then you can use as.POSIXlt together with the format string to convert your string into a date-time.
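Putting the explanation together, a sketch that switches the time locale only for the duration of the conversion and then extracts the two parts asked for. The variable names and the tz = "UTC" argument are choices made for this example, not part of the answer above:

```r
a <- "[12/Dec/2014:05:45:10]"

old_locale <- Sys.getlocale("LC_TIME")       # remember the current setting
invisible(Sys.setlocale("LC_TIME", "C"))     # "C" understands "Dec"

parsed    <- as.POSIXlt(a, format = "[%d/%b/%Y:%H:%M:%S]", tz = "UTC")
date_part <- format(parsed, "%d/%b/%Y")
time_part <- format(parsed, "%H:%M:%S")

invisible(Sys.setlocale("LC_TIME", old_locale))  # restore the original locale

c(date = date_part, time = time_part)
```

The format() calls are made before the locale is restored, because "%b" is locale-dependent on output as well as on input.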
Hey, I have some data aggregated at the quarter level, and there is a column that contains data like this:
> unique(data$fiscalyearquarter)
[1] "2012Q3" "2010Q3" "2012Q1" "2011Q4" "2012Q4" "2008Q1" "2008Q2" "2010Q4" "2010Q1"
[10] "2009Q2" "2012Q2" "2011Q3" "2013Q2" "2013Q1" "2011Q2" "2013Q4" "2009Q4" "2009Q3"
[19] "2011Q1" "2010Q2" "2013Q3" "2008Q4" "2009Q1" "2014Q1" "2008Q3" "2014Q2"
I am thinking about writing a function that turns a string into a timestamp.
Something like this: split the string into year and quarter, and then convert the quarter to a month (the middle of the quarter).
convert <- function(myinput = "2008Q2"){
  year <- substr(myinput, 1, 4)
  quarter <- substr(myinput, 6, 6)
  month <- 3 * as.numeric(quarter) - 1
  date <- as.Date(paste0(year, sprintf("%02d", month), '01'), '%Y%m%d')
  return(date)
}
I have to convert those strings to dates and then analyze them from there.
> convert("2010Q3")
[1] "2010-08-01"
Is there any way beyond my hard-coded solution to analyze time series data at the quarterly level?
If x is your vector and you're okay having the date be the first day of the quarter:
library(zoo)
as.Date(as.yearqtr(x))
If you want the date to be the first day of the second month of the quarter like your example, you could hack together something like this:
as.Date(format(as.Date(as.yearqtr(x))+40, "%Y-%m-01"))
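If you would rather not depend on zoo or on the +40 day offset, the month can also be computed directly. A base R sketch, assuming every value has the form YYYYQn; quarter_to_date() and its month_in_quarter argument are names invented for this example:

```r
# Hypothetical helper: first day of the chosen month (1, 2 or 3)
# within the quarter encoded in strings such as "2010Q3".
quarter_to_date <- function(x, month_in_quarter = 1) {
  year    <- as.integer(substr(x, 1, 4))
  quarter <- as.integer(substr(x, 6, 6))
  month   <- 3 * (quarter - 1) + month_in_quarter
  as.Date(sprintf("%04d-%02d-01", year, month))
}

quarter_to_date(c("2012Q1", "2011Q4"))         # first day of each quarter
quarter_to_date("2010Q3", month_in_quarter = 2) # middle month, as in the question
```

It is vectorised, so it can be applied to the whole fiscalyearquarter column in one call.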
Currently my data frame has dates displayed in the 'Date' column as 01/01/2007 etc. I would like to convert these into a week/year value, i.e. 01/2007. Any ideas?
I have been trying things like this and getting nowhere...
enviro$Week <- strptime(enviro$Date, format= "%W/%Y")
You have to first convert to date, then you can convert back to the week of the year using format, for example:
### Converts character to date
test.date <- as.Date("10/10/2014", format="%m/%d/%Y")
### Extracts only Week of the year and year
format(test.date, format="Week number %W of %Y")
[1] "Week number 40 of 2014"
### Or if you prefer
format(test.date, format="%W/%Y")
[1] "40/2014"
So, in your case, you would do something like this:
enviro$Week <- format(as.Date(enviro$Date, format="%m/%d/%Y"), format= "%W/%Y")
But remember that the as.Date(enviro$Date, format="%m/%d/%Y") part is only necessary if your data is not already of class Date. If it is character, make sure the format argument matches how the dates are actually written (for example "%d/%m/%Y" for day-first dates).
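As a quick check of what "%W" produces, here is a sketch on a few made-up dates, assuming they are written day first (DD/MM/YYYY). "%W" counts weeks starting on Monday, and any days before the first Monday of the year fall into week 00:

```r
# Made-up sample values; assumption: day/month/year order
dates  <- c("01/01/2007", "08/01/2007", "15/10/2007")
parsed <- as.Date(dates, format = "%d/%m/%Y")

# Week of the year (Monday as first day of the week) and year
week_year <- format(parsed, "%W/%Y")
week_year
```

"%U" works the same way but treats Sunday as the first day of the week; see ?strptime for the other week-related codes.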
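What is the class of enviro$Date? If it is of class Date there is probably a better way of doing this; otherwise you can split the strings as below. Note that this simply reuses the second field of each string (a day or month) together with the year, so it is not a true week-of-year number: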
v <- strsplit(as.character(enviro$Date), split = "/")
weeks <- sapply(v, "[", 2)
years <- sapply(v, "[", 3)
enviro$Week <- paste(weeks, years, sep = "/")