How to convert character to Date format in R? - r

How can I convert character values in the format below to Date?
YYYY.MM
The problem is the trailing zero after the decimal point for month 10 (October).
For example,
2012.10
appears in my input source data as
2012.1
with the zero after the decimal point missing. How do I restore it and convert the value to Date?

Since you have only year and month, you need to assign some value for day before you convert to date. In the example below, the day has arbitrarily been chosen as 15.
IF THE INPUT IS CHARACTER
dates = c("2012.10", "2012.01")
lubridate::ymd(paste0(year_month = dates, day = "15"))
#[1] "2012-10-15" "2012-01-15"
IF THE INPUT IS NUMERIC
dates = c(2012.10, 2012.01)
do.call(c, lapply(strsplit(as.character(dates), "\\."), function(d){
if(nchar(d[2]) == 1){
d[2] = paste0(d[2],"0")
}
lubridate::ymd(paste0(year = d[1], month = d[2], day = "15"))
}))
#[1] "2012-10-15" "2012-01-15"

The zoo package has a "yearmon" class for representing year and month without day. Internally it stores them as year + fraction, where fraction = 0 for Jan, 1/12 for Feb, 2/12 for Mar and so on, but it prints in a nicer format and sorts as expected. Assuming that your input, x, is numeric, convert it to a character string with a two-digit month and then apply as.yearmon with the appropriate format.
library(zoo)
x <- c(2012.1, 2012.01) # test data
as.yearmon(sprintf("%.2f", x), "%Y.%m")
## [1] "Oct 2012" "Jan 2012"
as.Date can be applied to convert a "yearmon" object to "Date" class if desired but normally that is not necessary.
as.Date(as.yearmon(sprintf("%.2f", x), "%Y.%m"))
## [1] "2012-10-01" "2012-01-01"

The code below uses the ymd() function from the lubridate package and sprintf() to coerce dates given in a numeric format
dates <- c(2012.1, 2012.01)
as well as dates given as a character string
dates <- c("2012.1", "2012.01")
where the part left of the decimal point specifies the year and the fractional part denotes the month.
lubridate::ymd(sprintf("%.2f", as.numeric(dates)), truncated = 1L)
[1] "2012-10-01" "2012-01-01"
The format specification %.2f tells sprintf() to use 2 decimal places.
The parameter truncated = 1L indicates that one date element is missing (day) and should be completed by the default value (the first day of the month). Alternatively, the day of the month can be directly specified in the format specification to sprintf():
lubridate::ymd(sprintf("%.2f-15", as.numeric(dates)))
[1] "2012-10-15" "2012-01-15"
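For comparison, the same conversion can be sketched in base R without any packages. This is only a minimal illustration, not part of the answers above: the helper name ym_to_date is hypothetical, and the day is arbitrarily fixed to the first of the month.

```r
# Hypothetical helper: convert YYYY.MM values (numeric or character) to Date.
ym_to_date <- function(x) {
  # "%.2f" restores the trailing zero, so 2012.1 becomes "2012.10"
  ym <- sprintf("%.2f", as.numeric(x))
  # Append an arbitrary day (the 1st) so that as.Date sees a complete date
  as.Date(paste0(ym, ".01"), format = "%Y.%m.%d")
}

ym_to_date(c(2012.1, 2012.01))
ym_to_date(c("2012.1", "2012.01"))
```

Both calls should give October and January 2012, because the character input is first coerced to numeric and then re-formatted with two decimals.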

Related

Mixed Date formats in R data frame

How do you work with a column of mixed date formats, for example 8/2/2020 and 2/7/2020, where all dates are in February?
I have tried zoo::as.Date(mixeddatescolumn, "%d/%m/%Y"). The first one is right but the second is wrong.
I have also tried the solutions in
"Fixing mixed date formats in data frame?", but that question seems different from what I am handling.
It is really tricky, even for a human, to know whether a date like '8/2/2020' is 8th February or 2nd August. However, we can leverage the fact that you know all these dates are in February: remove the "2" part of the date that represents the month, rearrange the date into one standard format, and then convert it to an actual Date object.
x <- c('8/2/2020','2/7/2020')
lubridate::mdy(paste0('2/', sub('2/', '', x, fixed = TRUE)))
#[1] "2020-02-08" "2020-02-07"
Or the same in base R:
as.Date(paste0('2/', sub('2/', '', x, fixed = TRUE)), "%m/%d/%Y")
Since we know that every date is in February, search for /2/ or /02/. If it is found, the middle number is the month; otherwise, the first number is the month. In either case set the format appropriately and use as.Date. No packages are used.
dates <- c("8/2/2020", "2/7/2020", "2/28/2000", "28/2/2000") # test data
as.Date(dates, ifelse(grepl("/0?2/", dates), "%d/%m/%Y", "%m/%d/%Y"))
## [1] "2020-02-08" "2020-02-07" "2000-02-28" "2000-02-28"
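The same idea can be sketched more defensively by parsing each string both ways and keeping whichever result lands in the known month. This is an illustrative variant, not one of the answers above; the function name parse_known_month and its month argument are hypothetical.

```r
# Hypothetical helper: parse d/m/Y or m/d/Y strings when the month is known.
parse_known_month <- function(x, month = 2) {
  month_of <- function(d) as.integer(format(d, "%m"))
  dmy <- as.Date(x, format = "%d/%m/%Y")
  mdy <- as.Date(x, format = "%m/%d/%Y")
  # Prefer the day-first reading only when it yields the expected month
  ok_dmy <- !is.na(dmy) & month_of(dmy) == month
  out <- mdy
  out[ok_dmy] <- dmy[ok_dmy]
  # Anything that still does not fall in the expected month becomes NA
  bad <- is.na(out) | month_of(out) != month
  out[bad] <- NA
  out
}

parse_known_month(c("8/2/2020", "2/7/2020", "2/28/2000", "28/2/2000"))
```

Strings that cannot be read as a date in the given month under either format come back as NA, which makes bad rows easy to spot.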

How to convert Hour Minutes character format into a POSIXlt format?

Current situation:
mydate <- "14:45"
class(mydate)
The current class of this value is character. I would like to convert it into a POSIXlt format.
I tried the strptime() function, but it unfortunately adds the full date to my hours when I actually only need Hours:Minutes
mydate <- strptime(mydate, format = "%H:%M")
What can I do to get a POSIXlt format containing only hours and minutes?
Thanks in advance for your answers!
POSIXlt and POSIXct always contain both date and time. You can use the chron "times" class to represent times less than 24:00:00.
library(chron)
tt <- times(paste(mydate, "00", sep = ":"))
tt
## [1] 14:45:00
times class objects are represented internally as a fraction of a day so, for example, adding 1/24 will add an hour.
tt + 1/24 # add one hour
## [1] 15:45:00
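If adding a package is not an option, one possible alternative is to store the time of day as minutes after midnight in base R. The sketch below is only an illustration; the helper names hm_to_minutes and minutes_to_hm are hypothetical and the input is assumed to be well-formed "HH:MM" strings.

```r
# Hypothetical helpers: represent "HH:MM" as integer minutes after midnight.
hm_to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.integer(p[1]) * 60L + as.integer(p[2]),
         integer(1))
}

minutes_to_hm <- function(m) {
  m <- m %% 1440L            # wrap around at 24 hours
  sprintf("%02d:%02d", m %/% 60L, m %% 60L)
}

mins <- hm_to_minutes("14:45")
minutes_to_hm(mins + 60L)    # add one hour
```

Arithmetic then happens on plain integers, and the value is only formatted back to Hours:Minutes for display.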
For me it works like this:
test <- "2016-04-10T12:21:25.4278624"
z <- as.POSIXct(test,format="%Y-%m-%dT%H:%M:%OS")
#output:
z
"2016-04-10 12:21:25 CEST"
The code is from here: R: convert date from character to datetime

How to Zero Pad Month and Day in R

I've read in an Excel file with dates formatted as m/d/yyyy, which R is reading as a factor. Trying the below returns all NA:
strptime(tvMid$calendar_date, format = "%Y-%m-%d")
It appears I need to zero pad the m and/or d conditionally, based on whether it is a single-digit month or day. What is the best approach to add these zeroes?
data.frame(
calendar_date = "5/13/2018"
) -> tvMid
strptime(tvMid$calendar_date, format = "%m/%d/%Y")
## [1] "2018-05-13 EDT"
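As the output above suggests, no padding is needed for parsing: the format string only has to match the input layout, and %m and %d accept single-digit values on input. If zero-padded text is still wanted, a minimal sketch is to parse first and then re-format; the sample values below are made up for illustration.

```r
# Illustrative data: a factor of m/d/yyyy strings, as described in the question
x <- factor(c("5/13/2018", "12/1/2018"))

# Parse using the layout the data actually has
d <- as.Date(as.character(x), format = "%m/%d/%Y")

# On output, %m and %d are always written with two digits
format(d, "%m/%d/%Y")
## [1] "05/13/2018" "12/01/2018"
```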

Unable to convert Month-Year string to Date in R

I'm using as.Date to convert a string like Aug-2002 to a Date object representing just the month of Aug 2002, or, if a day must be specified, Aug 1, 2002.
However
> as.Date(c('07-2002'), "%M-%Y")
[1] "2002-11-06"
> as.Date(c('Aug-2002'), "%b-%Y")
[1] NA
Why does the first line of code convert it to a different month and day? And the second one is NA?
I referred to this table for the formatting symbols.
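There are two separate problems. In the first call, %M means minutes, not month (that is %m), so only the year is taken from the string and the missing month and day are filled in from the current date. In the second call the month is parsed, but the string has no day value, and as.Date returns NA when a month is given without a day; for the same reason format="%m-%Y" will not work either. The options below solve both: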
as.Date(paste0('01-', c('07-2002')), format="%d-%m-%Y")
library(zoo) #this is a little more forgiving:
as.yearmon(c('07-2002'), "%m-%Y")
as.yearmon(c('Aug-2002'), "%b-%Y")
as.Date(as.yearmon(c('07-2002'), "%m-%Y"))
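Note that %b matches month abbreviations in the current locale, so a string like Aug-2002 may not parse on a non-English system. A base R sketch that sidesteps this by looking the abbreviation up in the built-in month.abb constant is shown below; the helper name parse_mon_year is hypothetical and the day is arbitrarily set to the 1st.

```r
# Hypothetical helper: parse "Mon-YYYY" strings with English abbreviations.
parse_mon_year <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  mon <- match(vapply(parts, `[`, "", 1), month.abb)  # "Aug" -> 8
  yr  <- as.integer(vapply(parts, `[`, "", 2))
  # Build a complete date string, using the 1st as the day
  as.Date(sprintf("%04d-%02d-01", yr, mon), format = "%Y-%m-%d")
}

parse_mon_year(c("Aug-2002", "Jan-2010"))
```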

From MMDD to day of the year in R

I have this .txt file:
http://pastebin.com/raw.php?i=0fdswDxF
First column (Date) shows date in month/day
So 0601 is the 1st of June
When I load this into R and I show the data, it removes the first 0 in the data.
So when loaded it looks like:
601
602
etc
For 1st of June, 2nd of June
For the months 10,11,12, it remains unchanged.
How do I change it back to 0601 etc.?
What I am trying to do is to change these days into the day of the year, for instance,
1st of January (0101) would be 1, and 31st of December would be 365.
There is no leap year to be considered.
I have the code to change this, if my data was shown as 0601 etc, but not as 601 etc.
copperNew$Date = as.numeric(as.POSIXct(strptime(paste0("2013",copperNew$Date), format="%Y%m%d")) -
as.POSIXct("2012-12-31"), units = "days")
Where Date of course is from the file linked above.
Please ask if you do not consider the description to be good enough.
You can use colClasses in the read.table function to keep the leading zero, then convert to POSIXlt and extract the day of the year. You are overcomplicating the process.
copperNew <- read.table("http://pastebin.com/raw.php?i=0fdswDxF", header=TRUE,
colClasses=c("character", "integer", rep("numeric", 3)))
# prepend a non-leap year; with format='%m%d' alone the current year would be used
tmp <- as.POSIXlt( paste0("2013", copperNew$Date), format='%Y%m%d' )
copperNew$Yday <- tmp$yday + 1 # yday is 0 on January 1st
The as.Date function is able to parse a string without a year (it assumes the current year), and strftime then computes the day of the year for you.
d<-as.Date("0201", format = "%m%d")
strftime(d, format="%j")
#[1] "032"
First you parse your string and obtain a Date object which represents your date (notice that it will add the current year, so if you want to count days for some specific year, add it to your string: as.Date("1988-0201", format = "%Y-%m%d")).
The strftime function will convert your Date to a POSIXlt object and return the day of the year as a character string. If you want the result to be a numeric value, you can do it like this: as.numeric(strftime(d, format = "%j")) (Thanks Gavin Simpson)
Convert it to POSIXlt using a year that is not a leap-year, then access the yday element and add 1 (because yday is 0 on January 1st).
strptime(paste0("2011","0201"),"%Y%m%d")$yday+1
# [1] 32
From start-to-finish:
x <- read.table("http://pastebin.com/raw.php?i=0fdswDxF",
colClasses=c("character",rep("numeric",5)), header=TRUE)
x$Date <- strptime(paste0("2011",x$Date),"%Y%m%d")$yday+1
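If the column has already been read as numbers and the leading zero is gone, it can also be restored after the fact with sprintf. The sketch below assumes a hypothetical vector mmdd of such values and uses 2011 as an arbitrary non-leap year.

```r
# Hypothetical input: MMDD values that lost their leading zero on import
mmdd <- c(101, 601, 602, 1231)

# "%04d" pads to four digits, so 601 becomes "0601"
padded <- sprintf("%04d", as.integer(mmdd))

# Attach a non-leap year, parse, and take the 1-based day of the year
yday <- as.POSIXlt(as.Date(paste0("2011", padded), format = "%Y%m%d"))$yday + 1
yday
## [1]   1 152 153 365
```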
In which language?
If it's something like C#, Java or Javascript, I'd follow these steps:
1-) parse a pair of integers from that column;
2-) create a datetime variable whose day and month are taken from the integers from step one. Set the year to some fixed value, or to the current year.
3-) create another datetime variable, whose date is the 1st of January of the same year as the one in step 2.
The number of the day is the difference in days between the datetime variables, + 1 day.
This one worked for me:
copperNew <- read.table("http://pastebin.com/raw.php?i=0fdswDxF",
header=TRUE, sep=" ", colClasses=c("character",
"integer",
rep("numeric", 3)))
copperNew$diff = difftime(as.POSIXct(strptime(paste0("2013",copperNew$Date),
format="%Y%m%d", tz="GMT")),
as.POSIXct("2012-12-31", tz="GMT"), units="days")
I had to specify the timezone (tz argument in as.POSIXct), otherwise I got two different timezones for the vectors I am subtracting and therefore non-integer days.
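One way to avoid the timezone question altogether is to work with the Date class, which carries no time of day and so is unaffected by daylight saving shifts. The following is a minimal sketch with made-up MMDD values, keeping 2013 as the non-leap year used in the question.

```r
# Illustrative MMDD strings with the leading zero intact
mmdd <- c("0101", "0601", "1231")

# Date objects have no timezone, so the difference is a whole number of days
d <- as.Date(paste0("2013", mmdd), format = "%Y%m%d")
day_of_year <- as.integer(d - as.Date("2012-12-31"))
day_of_year
## [1]   1 152 365
```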
