Need help turning a column from character to date in R

I'm trying to convert a column from character to date.
The current format in the column is "YearMmonth", as in "1990M01".
It is a really weird format, so I'm wondering how R reads this when I use as.Date. It is basically the year, a literal "M", then the month number. I know how to use the rest of the code; I just need to know how to describe this format to R.
I have tried using
df <- as.Date(df, "%YM%m", "%Y/%m")
df <- as.Date(paste0("01-", df), format = "%Y/%m/%d")
and a lot of others. The main problem is translating the character column.

There are several problems with the code in the question:
The first attempt does not have a day field.
The second attempt does, but it puts the day field first while the format says it is last.
The use of df suggests that the result is a data frame, yet the result is a Date class object, not a data frame.
Here are some approaches that work.
yearmon
Use as.yearmon to convert to a yearmon object or as.Date(as.yearmon(...)) to convert to a Date object. yearmon objects directly represent year and month without day so that may be preferable.
library(zoo)
as.yearmon("1990M01", "%YM%m")
## [1] "Jan 1990"
as.Date(as.yearmon("1990M01", "%YM%m"))
## [1] "1990-01-01"
Replacing the M with a minus would also work:
as.yearmon(chartr("M", "-", "1990M01"))
## [1] "Jan 1990"
Base R
A way that does not involve any packages is to append a day:
as.Date(paste("1990M01", 1), "%YM%m %d")
## [1] "1990-01-01"
or change the M to a minus and append a minus and a 1 in which case it is already in the default format and no format string is needed.
as.Date(sub("M(\\d+)", "-\\1-1", "1990M01"))
## [1] "1990-01-01"
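To apply the base R approach to a whole column rather than to a single string, something like the following sketch should work. The data frame df and its column period are made-up names for illustration; they are not taken from the question.

```r
# Hypothetical data frame; `period` holds strings such as "1990M01"
df <- data.frame(period = c("1990M01", "1990M02", "1991M12"),
                 stringsAsFactors = FALSE)

# Append a day so that as.Date has a complete date to parse,
# and assign the result back to the column, not to df itself
df$period <- as.Date(paste(df$period, 1), format = "%YM%m %d")

class(df$period)
## [1] "Date"
format(df$period)
## [1] "1990-01-01" "1990-02-01" "1991-12-01"
```

Assigning to df$period keeps df a data frame, which avoids the third problem listed above.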

Related

Convert an entire column from character to date class in R [duplicate]

I am using this dataset and I am looking to convert the ArrestDate column from character into dates so that I can work with dates for analysis.
I've first tried using mutate:
Date <- mutate(crime, ArrestDate = as.Date(ArrestDate, format= "%d.%m.%Y"))
however when I do this the entire ArrestDate column is changed to NAs.
Secondly I tried using strptime but for some reason it converts some dates fine and others to NA:
Date <-strptime(paste(crime$ArrestDate, sep=" "),"%d/%m/%Y")
crime2 <- cbind(Date, crime)
Anyone able to tell me what I am doing wrong or alternatively provide a better approach for this?
Thanks.
The lubridate package offers some very useful functions to transform strings into dates. In this case you can use mdy() since the format is '%m/%d/%Y' (as can be derived from the first record which is '12/31/2019').
library(dplyr)
library(lubridate)
crime %>%
  mutate(ArrestDate = mdy(ArrestDate))
Your first example also works once the format string uses '/' instead of '.' and puts the month before the day:
Date <- mutate(crime, ArrestDate = as.Date(ArrestDate, format= "%m/%d/%Y"))
class(Date$ArrestDate)
[1] "Date"
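The reason the strptime attempt converts some rows and not others is most likely the swapped day and month: with "%d/%m/%Y", a row only parses when its day is 12 or less, and then it parses to the wrong date. A small base R sketch of that effect follows; "12/31/2019" is the first record mentioned above, and the second value is invented for illustration.

```r
# "12/31/2019" comes from the answer above; the second value is invented
x <- c("12/31/2019", "01/02/2020")

# Wrong order: "31" is not a valid month, so the first element becomes NA;
# the second parses without complaint, but as 1 February instead of 2 January
wrong <- as.Date(x, format = "%d/%m/%Y")
wrong
## [1] NA           "2020-02-01"

# Correct order for month/day/year input
right <- as.Date(x, format = "%m/%d/%Y")
right
## [1] "2019-12-31" "2020-01-02"
```

The silently wrong second value is the more dangerous case, so it is worth checking a few converted rows against the original strings.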

How to convert date in format mm/yy from character to date class?

Just a brief (and hopefully simple!) question;
I have a column of dates such as 08/14 and 08/15. However the column class is character, and I was wondering if it is possible (and if so how) to convert this into a date class? I've tried lots of different ways and it either does nothing, or returns N/A values for the whole column. Hopefully there should be a simple line of code to fix this that I haven't come across yet?
Any help much appreciated!!
One option is to create a day by pasting and then use as.Date, because a date also needs day info
as.Date(paste0(c("08/14", "08/15"), "/01"), format = "%m/%y/%d")
#[1] "2014-08-01" "2015-08-01"
Or make use of as.yearmon from zoo
library(zoo)
as.Date(as.yearmon(c("08/14", "08/15"), "%m/%y"))
library(dplyr)
dates <- c("08/14", "10/13", "12/09") # month/year
# append a day, then parse with an explicit format
datesWithDay <- paste0(dates, "/01") %>%
  as.Date(format = "%m/%y/%d")
If you want to use the year-month combination as a factor for cohort analysis or something similar, you can always use as.yearmon() from the zoo package. tsibble and lubridate also have interesting functions for dealing with dates and times.
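One thing to watch with two-digit years is how the century is inferred. As documented in help("strptime"), %y on input maps 00 to 68 to 20xx and 69 to 99 to 19xx. The sketch below uses invented values to show where that cut-off falls.

```r
# Invented month/year strings around the two-digit-year cut-off
yy <- c("08/14", "08/68", "08/69", "08/99")

# Append a day, then parse; %y decides the century
parsed <- as.Date(paste0(yy, "/01"), format = "%m/%y/%d")

format(parsed, "%Y")
## [1] "2014" "2068" "1969" "1999"
```

If the data contains years on both sides of the cut-off, it is safer to rebuild a four-digit year before parsing.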

How to convert integer to date format in R?

I am trying to convert integer data from my data frame in R, to date format.
The data is under column named svcg_cycle within orig_svcg_filtered data frame.
The original data looks something like 200502, 200503, and so forth, and I expect to turn it into yyyy-mm-dd format.
I am trying to use this code:
as.Date(orig_svcg_filtered$svcg_cycle, origin = "2000-01-01")
but the output is not something that I expected:
[1] "2548-12-15" "2548-12-15" "2548-12-15" "2548-12-15" "2548-12-15"
while it is supposed to be 2005-02-01, 2005-03-01, and so forth.
How to solve this?
If you have
x <- c(200502, 200503)
Then
as.Date(x, origin = "2000-01-01")
tells R you want the dates that fall 200,502 and 200,503 days after 2000-01-01. From help("as.Date"):
as.Date will accept numeric data (the number of days since an epoch),
but only if origin is supplied.
So, integer data gives days after the origin supplied, not some sort of numeric code for the dates like 200502 for "2005-02-01".
What you want is
as.Date(paste(substr(x, 1, 4), substr(x, 5, 6), "01", sep = "-"))
# [1] "2005-02-01" "2005-03-01"
The
paste(substr(x, 1, 4), substr(x, 5, 6), "01", sep = "-")
part takes your integers and creates strings like
# [1] "2005-02-01" "2005-03-01"
Then as.Date() knows how to deal with them.
You could alternatively do something like
as.Date(paste0(x, "01"), format = "%Y%m%d")
# [1] "2005-02-01" "2005-03-01"
This converts each element to character, appends "01" for the day, and tells as.Date() what format to read the date from. (See help("as.Date") and help("strptime").)
I like to use regex to fix these kinds of string formatting issues. By default, as.Date only checks for a few standard date formats such as YYYY-MM-DD. origin is used when you have a numeric date (i.e. a count of days from some reference point), but in this case your date is not a numeric date; it is just a date written as a string of digits.
We simply split the year and month with a dash and add a day, in this case the first of the month, to make it a valid date (you must have a day to store it as a Date object in R). The regex captures the first four digits in group one and the final two digits in group two. We then combine the two groups, separated by a dash, along with the day.
as.Date(gsub("^(\\d{4})(\\d{2})", "\\1-\\2-01", x))
[1] "2005-02-01" "2005-03-01"
You don't need to specify format in this case, because YYYY-MM-DD is one of the standard formats as.Date checks. If you wanted to pass it explicitly, it would be format = "%Y-%m-%d".
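If this conversion is needed in several places, it could be wrapped in a small helper. The sketch below uses integer arithmetic instead of string slicing; the function name yyyymm_to_date is hypothetical, and it assumes every value is a six-digit yyyymm code.

```r
# Hypothetical helper: convert yyyymm codes (numeric or character) to the
# first day of that month
yyyymm_to_date <- function(x) {
  x <- as.integer(x)
  year  <- x %/% 100L   # 200502 -> 2005
  month <- x %% 100L    # 200502 -> 2
  # An explicit format makes impossible months come back as NA
  as.Date(sprintf("%04d-%02d-01", year, month), format = "%Y-%m-%d")
}

yyyymm_to_date(c(200502, 200503))
## [1] "2005-02-01" "2005-03-01"
yyyymm_to_date(200513)
## [1] NA
```

For the data frame in the question, this would be called as yyyymm_to_date(orig_svcg_filtered$svcg_cycle).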

Add characters to the end of every element in a column of a dataframe

I have a data frame with column 1, titled "Label", containing dates in the format "MM/DD/YEAR 00:00". I want to add one more set of times (another ':00') to the end of every element of this column so that the full format is "MM/DD/YEAR 00:00:00". I'm pretty new to R, any advice on how to do this?
If the Label column is character, then paste should work:
df$Label <- paste0(df$Label, ":00")
But note that most of R's date APIs (e.g. strptime, as.Date) can cope with a missing seconds component, as long as the format string matches the data. Depending on what you plan to do, you may not need this operation at all.
Edit:
If you want to preserve both the dates and times while obtaining a bona fide date-time type, then consider using strptime, which returns a POSIXlt object:
df$ts <- strptime("8/11/2017 8:55:00", "%m/%d/%Y %H:%M:%S")
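or, for a 12-hour clock with AM/PM (note %I rather than %H, because %p is only used together with %I):
df$ts <- strptime("8/11/2017 8:55 PM", "%m/%d/%Y %I:%M %p")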
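For a whole column, a sketch along the following lines parses the labels without seconds and only adds them when formatting. The data frame and its values are invented, tz = "UTC" is an assumption to keep the example independent of the local time zone, and as.POSIXct is used here instead of strptime because POSIXct is usually the more convenient class to store in a data frame.

```r
# Hypothetical data frame with "MM/DD/YEAR 00:00" style labels
df <- data.frame(Label = c("08/11/2017 08:55", "12/01/2017 20:30"),
                 stringsAsFactors = FALSE)

# Parse without a seconds field; tz = "UTC" is an assumption for this example
df$ts <- as.POSIXct(df$Label, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Seconds appear when formatting, without editing the original strings
format(df$ts, "%m/%d/%Y %H:%M:%S")
## [1] "08/11/2017 08:55:00" "12/01/2017 20:30:00"
```

The parsed column can then be sorted, differenced, or plotted directly, which the pasted strings cannot.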

R: Separate String Without Delimiter (like Fixed Width in Excel)

The dataset looks pretty much like this
I searched around but found only the function that needs a delimiter. I managed to import the file to R successfully with two columns.
Then I want to separate the DATE column into "Year", "Month", and "Date", so I want to have 4 columns in total. This is where I got stuck. The column doesn't have the usual "/" or "-" that normally comes with a date format.
Thanks for your help.
As @alistaire has shown, you can convert what you have to a date format that R recognises (replace the single character string below with your column vector df$DATE to work on the entire data frame):
date <- as.Date( '19981201', '%Y%m%d' )
date
[1] "1998-12-01"
From there, you can separate out your year, month, and day as you please.
year <- format( date, "%Y")
year
[1] "1998"
month <- format( date, "%m" )
month
[1] "12"
day <- format( date, "%d" )
day
[1] "01"
Of course, you could also skip the date step and just split the first 8 characters into 3 shorter strings (as @warmoverflow has suggested), but I'd recommend the above as probably better in practice, mostly because you'll be best off using the Date class for things like sorting and plotting, so it is a good idea to use it along the way too.
IN RESPONSE TO YOUR ANSWER/FOLLOW-UP-QUESTION:
Notice in the console output in step 3, the column vector is labeled as class int (integer). You probably need to make sure it's being fed to as.Date as character. It looks like that's what you tried to do in step 4, but by surrounding the vector reference in quotations, you pass the character string "v1$DATE", which R has no idea what to do with. Instead:
v1$date_v2 <- as.Date( as.character( v1$DATE ), format = "%Y%m%d" )
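Putting the steps together for a whole data frame might look like the sketch below. The values in v1 are invented; only the column name DATE and its integer class come from the follow-up described above.

```r
# Hypothetical data frame; DATE is stored as integer, as in the follow-up
v1 <- data.frame(DATE = c(19981201L, 19990315L))

# Convert to character first, then parse as yyyymmdd
d <- as.Date(as.character(v1$DATE), format = "%Y%m%d")

# Pull the pieces out as integer columns
v1$Year  <- as.integer(format(d, "%Y"))
v1$Month <- as.integer(format(d, "%m"))
v1$Day   <- as.integer(format(d, "%d"))

v1$Month
## [1] 12  3
```

A pure fixed-width split such as substr(as.character(v1$DATE), 1, 4) gives the same year strings, but going through as.Date also catches impossible dates, which come back as NA.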
