Date Transformation in R - r

I'm facing a very minor issue, but somehow can't resolve it.
When I import a CSV file that has a date column, the date comes in "%Y-%m-%d" format, but I want it in "%d-%m-%Y" format. I tried as.Date to transform it, but it's not working.
The data looks like this before importing:
Date Share_Val
21/01/2015 20
22/01/2015 19
23/01/2015 21
24/01/2015 23
25/01/2015 26
But when I import the file with read.csv, the data looks like the following:
Date Share_Val
01/21/2015 20
01/22/2015 19
01/23/2015 21
01/24/2015 23
01/25/2015 26
I tried lubridate, but it didn't help.
Sam's result comes out exactly the way I wanted. But when I try the following, it doesn't work:
data$date<-format(as.Date(data$date,"%m/%d/%Y"))
Can anybody please give me any suggestions?

See if this helps. Note the stringsAsFactors. If your Date field is a factor, you will need data$Date <- as.character(data$Date) first
data <- data.frame(Date = c("21/01/2015", "22/01/2015", "23/01/2015",
                            "24/01/2015", "25/01/2015"),
                   Share_Val = c(20, 19, 21, 23, 26),
                   stringsAsFactors = FALSE)
format(as.Date(data$Date, "%d/%m/%Y"), "%d-%m-%Y")
# [1] "21-01-2015" "22-01-2015" "23-01-2015" "24-01-2015" "25-01-2015"
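The same conversion can be sketched end to end starting from a CSV. This is only a minimal example under assumptions: the inline csv_text string is made up to stand in for the real file, the dates are assumed to be stored as dd/mm/yyyy, and Date_label is a hypothetical column name.

```r
# Hypothetical CSV content standing in for the real file (assumption).
csv_text <- "Date,Share_Val
21/01/2015,20
22/01/2015,19
23/01/2015,21"

# Read the date column as plain character rather than factor.
data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Parse using the format the strings are actually written in.
data$Date <- as.Date(data$Date, format = "%d/%m/%Y")

# Build a character label only when a specific display format is needed.
data$Date_label <- format(data$Date, "%d-%m-%Y")
```

Keeping the Date column as class Date and adding a separate character label means sorting and date arithmetic still work on data$Date.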

Too long for a comment.
I think you may be misunderstanding how Dates work in R. A variable (or column) of class Date is stored internally as the number of days since 1970-01-01. When you print a Date variable, it is displayed using the %Y-%m-%d format. The as.Date(...) function converts character to Date. The format=... argument controls how the character string is interpreted, not how the result is displayed, as in:
as.Date("02/05/2015", format="%m/%d/%Y")
# [1] "2015-02-05"
as.Date("02/05/2015", format="%d/%m/%Y")
# [1] "2015-05-02"
So in the first case the string is interpreted as 05 Feb, in the second 02 May. Note that in both cases the result is displayed (printed) in %Y-%m-%d format.
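To make the storage-versus-display point concrete, here is a small sketch. The variable names are only illustrative.

```r
# Parse a month/day/year string into a Date.
d <- as.Date("02/05/2015", format = "%m/%d/%Y")

# Internally the Date is just a count of days since 1970-01-01.
days_since_epoch <- as.numeric(d)

# format() returns a character string in whatever layout is requested;
# the Date object itself is unchanged.
label <- format(d, "%d-%m-%Y")

class(d)      # "Date"
class(label)  # "character"
```

So if the goal is a dd-mm-yyyy display, format() is applied at the point of output, and the result is character, not Date.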

Related

as.Date giving me NA's

I've tried everything in the thread "as.Date returning NA while converting from 'ddmmmyyyy'" to try to sort out my problem.
I'm using this command to turn a factor into a date:
cohort$doi <- as.Date(cohort$doi, format= "%Y/%m/%d")
All my dates are currently in the format: YYYY-MM-DD, so as far as I'm aware the above should work
I used this code yesterday to convert all my dates for various variables from a factor to a date. It worked yesterday and everything was fine. Today I opened my script and imported in my data, ran this command and viewed my data but all of the dates now say NA.
I've tried everything from previous threads (I looked at a few more than just the one I linked above) but nothing has so far worked. I'm not sure what to do now
Example of what doi column looks like:
1970-01-01
1970-02-02
1970-03-03
1970-04-04
The column is currently classed as a factor. When I run the code above, the column becomes a date but all the dates now say NA.
Other than closing R and opening it up again for today, I've done nothing else.
If you read the documentation for as.Date you will note the default format is %Y-%m-%d or %Y/%m/%d:
The default formats follow the rules of the ISO 8601 international standard which expresses a day as "2001-02-03".
In your code you have specified your dates are formatted by slashes, but your sample data shows they are formatted in the default format used by as.Date:
doi <- as.factor(c("1970-01-01",
"1970-02-02",
"1970-03-03",
"1970-04-04"))
as.Date(doi) # default format %Y-%m-%d
[1] "1970-01-01" "1970-02-02" "1970-03-03" "1970-04-04"
as.Date(doi, format = "%Y/%m/%d") # incorrect specification of your date format
[1] NA NA NA NA
as.Date("1970/01/01") # also a default format
[1] "1970-01-01"
Note: as.Date accepts character strings, factors, logical NA and objects of classes "POSIXlt" and "POSIXct".
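If the separator might differ between imports, one option is the tryFormats argument of as.Date (available in R 3.5.0 and later, as far as I know). This is a sketch, not a diagnosis of the original data; doi_chr, doi_date and n_failed are illustrative names.

```r
doi <- factor(c("1970-01-01", "1970-02-02", "1970-03-03", "1970-04-04"))

# Convert the factor to character explicitly so it is clear what gets parsed.
doi_chr <- as.character(doi)

# tryFormats are tried in order; the first one that parses the first
# non-NA element is then used for the whole vector.
doi_date <- as.Date(doi_chr, tryFormats = c("%Y-%m-%d", "%Y/%m/%d"))

# Count values that were present as text but failed to parse.
n_failed <- sum(is.na(doi_date) & !is.na(doi_chr))
```

Checking n_failed right after the conversion catches a format mismatch before the NAs spread through the rest of the script.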

Recode "date & time variable" into two separate variables

I'm a PhD student (not that experienced in R), and I'm trying to recode a string variable, called RecordedDate into two separate variables: a Date variable and a Time variable. I am using RStudio.
Example values are:
8/6/2018 18:56
7/26/2018 10:43
7/28/2018 8:36
I would like to use the first part of the value (example: 8/6/2018) to create a date variable, and the second part of the value (example: 18:56) to create a time variable.
I'm thinking the first step would be to write code that breaks this up into two variables based on some rule. Maybe I can separate everything before the space into the Date variable and everything after the space into the Time variable. I am not able to figure this out.
Then, I'm looking for code that would change Date from a "string" variable to a "date" type variable. I'm not sure if this is correct, but I'm thinking something like:
better_date <- as.Date(Date, "%m/%d/%Y")
Finally, I would like to change the Time variable to a "time" type format (if this exists). I'm not sure how to do this part either, but it should be something that indicates hours and minutes. This part is less important than getting the date variable.
Two immediate ways:
1. strsplit() on the white space
2. The proper way: parse, and then format back out.
Only 2. will guarantee you do not end up with hour 27 or minute 83 ...
Examples:
R> data <- c("8/6/2018 18:56", "7/26/2018 10:43", "7/28/2018 8:36")
R> strsplit(data, " ")
[[1]]
[1] "8/6/2018" "18:56"
[[2]]
[1] "7/26/2018" "10:43"
[[3]]
[1] "7/28/2018" "8:36"
R>
And:
R> data <- c("8/6/2018 18:56", "7/26/2018 10:43", "7/28/2018 8:36")
R> df <- data.frame(data)
R> df$pt <- anytime::anytime(df$data) ## anytime package used
R> df$time <- format(df$pt, "%H:%M")
R> df$day <- format(df$pt, "%Y-%m-%d")
R> df
             data                  pt  time        day
1  8/6/2018 18:56 2018-08-06 18:56:00 18:56 2018-08-06
2 7/26/2018 10:43 2018-07-26 10:43:00 10:43 2018-07-26
3  7/28/2018 8:36 2018-07-28 00:00:00 00:00 2018-07-28
R>
I often collect data in a data.frame (or data.table) and then add column by column.
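For comparison, the parse-then-format approach can also be sketched in base R with an explicit format string. This is my own variant, not part of the answer above, and it assumes the values are always month/day/year followed by hour:minute, and that treating them as UTC is acceptable.

```r
data <- c("8/6/2018 18:56", "7/26/2018 10:43", "7/28/2018 8:36")

# Parse once into a date-time; on input, single-digit months, days and
# hours are accepted by the %m, %d and %H conversions.
pt <- as.POSIXct(data, format = "%m/%d/%Y %H:%M", tz = "UTC")

df <- data.frame(
  data = data,
  day  = as.Date(pt, tz = "UTC"),  # Date class
  time = format(pt, "%H:%M"),      # character, zero-padded
  stringsAsFactors = FALSE
)
```

Base R has no dedicated time-of-day class, so time is kept as a character column here. Any value that does not match the format comes back as NA, which is easy to check with is.na(pt).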

Converting string to date in R returns NAs

I have a column of my dataframe as
date
17-Feb
17-Mar
16-Dec
16-Nov
16-Sep
17-Feb
I am trying to convert it into a date column from string. I am using the following pieces of code:
as.Date(df$Date, format="%y-%b")
and
as.POSIXct(df$Date, format="%y-%b")
Both of them give NAs
I am getting the format from this link
The starting number is year. Sorry for the confusion.
I assume from your approach that the 17 and 16 refer to the year 2017 and 2016 respectively. You need to also specify the day of month. If you don't care about it, then set it to the 1st.
A slight modification to your code will work, by appending '-01' to the date then updating your format argument to reflect this:
df = data.frame(Date = c("17-Feb", "17-Mar", "16-Dec"))
as.Date(paste0(df$Date, "-01"), format="%y-%b-%d")
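One caveat: %b matches abbreviated month names in the current locale, so the line above can still give NA on a non-English system. A locale-independent sketch, assuming the abbreviations are always the English ones in R's built-in month.abb (yy, mon and parsed are illustrative names):

```r
x <- c("17-Feb", "17-Mar", "16-Dec")

# Split "yy-Mon" into its two pieces.
parts <- strsplit(x, "-", fixed = TRUE)
yy  <- vapply(parts, `[`, character(1), 1)
mon <- match(vapply(parts, `[`, character(1), 2), month.abb)  # 1..12, or NA

# Rebuild a purely numeric string and parse it, using day 1 as a placeholder.
parsed <- as.Date(sprintf("%s-%02d-01", yy, mon), format = "%y-%m-%d")
```

An unrecognised month abbreviation ends up as NA in parsed rather than raising an error.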

Converting dates from imported CSV file

I'm importing time series data from a CSV file, and one of the columns contains dates in the format DD/MM/YYYY. The column's class is character, or factor if I set stringsAsFactors = TRUE. I convert the imported file to a data frame and then run the following:
df$Date <- as.Date(df$Date , "%d/%m/%y")
I get no error message, but the dates are all messed up in the format YYYYMMDD and all the YYYY are the year 2020...
Before:
10/09/2009
11/09/2009
14/09/2009
After:
2020-09-10
2020-09-11
2020-09-14
You are using %y when it should be %Y. See the documentation here.
%y
Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2004 and 2008 POSIX standards, but they do also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.
%Y
Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see http://en.wikipedia.org/wiki/0_(year). Note that the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.
Try running the code again so that the data frame is not modified by any previous attempt but this time use
df$Date <- as.Date(df$Date , "%d/%m/%Y")
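The difference is easy to reproduce on the sample values from the question. A small sketch (wrong and right are just illustrative names):

```r
x <- c("10/09/2009", "11/09/2009", "14/09/2009")

# %y consumes only the two digits "20" as the year (so 2020) and the
# trailing "09" is silently ignored.
wrong <- as.Date(x, format = "%d/%m/%y")

# %Y consumes the full four-digit year.
right <- as.Date(x, format = "%d/%m/%Y")

format(wrong)  # "2020-09-10" "2020-09-11" "2020-09-14"
format(right)  # "2009-09-10" "2009-09-11" "2009-09-14"
```

No warning is raised in the first case because trailing characters beyond the format are ignored on input.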
@Heroka is right.
If you ever need it, you could also use POSIXct objects (they store date-times down to the second).
Try this:
df$Date.time <- as.POSIXct(df$Date , format="%d/%m/%Y")
If you want the date and time in strings you can try the following:
df$Date.time <- format(as.POSIXct(df$Date , format="%d/%m/%Y"),format="%Y-%m-%d %H:%M")
or
df$Date <- format(as.POSIXct(df$Date , format="%d/%m/%Y"),format="%Y-%m-%d")

Correct date initially as factor in R

I have dates in a format like 31.3.14, for example. I do:
as.Date(gsub("\\.","-", "31.3.14"))
I get "0031-03-14", but what I need is 31-03-2014.
I need this for arbitrary dates, for example 31.3.99 (to get 31-03-1999).
So I don't know how to remove the 00 in front of 31 and add 20 in front of 14 to get 31-03-2014, and do the same for dates like 31.3.99.
Try wrapping as.Date with format
x <- c("31.3.14", "31.3.99")
format(as.Date(x, "%d.%m.%y"), "%d-%m-%Y")
# [1] "31-03-2014" "31-03-1999"
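Note that %y decides the century for you: on input, 00 to 68 become 20xx and 69 to 99 become 19xx. A quick sketch with two extra made-up values at the boundary (31.3.68 and 31.3.69 are not from the question):

```r
x <- c("31.3.14", "31.3.99", "31.3.68", "31.3.69")

# Two-digit years are expanded according to the 68/69 pivot.
parsed <- as.Date(x, format = "%d.%m.%y")

format(parsed, "%d-%m-%Y")
# "31-03-2014" "31-03-1999" "31-03-2068" "31-03-1969"
```

If the data can contain years that fall on the wrong side of that pivot, the century has to be corrected after parsing, since the two-digit string alone does not carry it.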

Resources