I am trying to convert integer data in my R data frame to date format.
The data is in a column named svcg_cycle within the orig_svcg_filtered data frame.
The original data looks something like 200502, 200503, and so forth, and I want to turn it into yyyy-mm-dd format.
I am trying to use this code:
as.Date(orig_svcg_filtered$svcg_cycle, origin = "2000-01-01")
but the output is not something that I expected:
[1] "2548-12-15" "2548-12-15" "2548-12-15" "2548-12-15" "2548-12-15"
while it is supposed to be 2005-02-01, 2005-03-01, and so forth.
How can I solve this?
If you have
x <- c(200502, 200503)
Then
as.Date(x, origin = "2000-01-01")
tells R you want the dates that fall 200,502 and 200,503 days after 2000-01-01. From help("as.Date"):
as.Date will accept numeric data (the number of days since an epoch),
but only if origin is supplied.
So, integer data gives days after the origin supplied, not some sort of numeric code for the dates like 200502 for "2005-02-01".
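To make the days-since-origin behaviour concrete, here is a minimal sketch using small, purely illustrative offsets (31 and 366 are my own example values, not from the question):

```r
# Numeric input is treated as a count of days added to the origin.
origin <- "2000-01-01"

d1 <- as.Date(31, origin = origin)   # 31 days after 2000-01-01
d2 <- as.Date(366, origin = origin)  # 2000 is a leap year, so this is the next New Year's Day

print(d1)  # "2000-02-01"
print(d2)  # "2001-01-01"

# 200502 days is roughly 549 years, which is why the result lands in the 2500s.
years_ahead <- 200502 / 365.25
print(round(years_ahead))  # 549
```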
What you want is
as.Date(paste(substr(x, 1, 4), substr(x, 5, 6), "01", sep = "-"))
# [1] "2005-02-01" "2005-03-01"
The
paste(substr(x, 1, 4), substr(x, 5, 6), "01", sep = "-")
part takes your integers and creates strings like
# [1] "2005-02-01" "2005-03-01"
Then as.Date() knows how to deal with them.
You could alternatively do something like
as.Date(paste0(x, "01"), format = "%Y%m%d")
# [1] "2005-02-01" "2005-03-01"
This just pastes an "01" onto each element (for the day), which also converts it to character, and tells as.Date() what format to read the date from. (See help("as.Date") and help("strptime").)
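Applied to the data frame from the question, this could look roughly like the sketch below. The data frame here is a small stand-in I made up, and svcg_date is a hypothetical name for the new column:

```r
# Hypothetical stand-in for the asker's data frame.
orig_svcg_filtered <- data.frame(svcg_cycle = c(200502L, 200503L, 200512L))

# Append "01" as the day, then parse as year-month-day without separators.
# svcg_date is an assumed column name, not one from the question.
orig_svcg_filtered$svcg_date <- as.Date(
  paste0(orig_svcg_filtered$svcg_cycle, "01"),
  format = "%Y%m%d"
)

str(orig_svcg_filtered)
```

Keeping the original svcg_cycle column alongside the new one makes it easy to check the conversion row by row.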
I like to use regex to fix these kinds of string formatting issues. By default, as.Date only checks for a few standard date formats such as YYYY-MM-DD. origin is used when you have a numeric date (i.e. a number of days from some reference point), but in this case your value is not a day count at all; it is just a date written as a run of digits.
We simply separate the year and month with a dash and add a day, in this case the first of the month, to make it a valid date (you must have a day to store it as a Date object in R). The regex captures the first four digits in group one and the final two digits in group two. We then combine the two groups, separated by dashes, along with the day.
as.Date(gsub("^(\\d{4})(\\d{2})", "\\1-\\2-01", x))
[1] "2005-02-01" "2005-03-01"
You don't need to specify format in this case, because YYYY-MM-DD is one of the standard formats as.Date checks. If you did want to spell it out, the argument would be format = "%Y-%m-%d".
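As another option not mentioned above, the year and month can be pulled apart with integer arithmetic instead of string operations. This is only a sketch, and it assumes every value is a six-digit yyyymm number:

```r
x <- c(200502, 200503)

year  <- as.integer(x %/% 100)  # integer division drops the last two digits
month <- as.integer(x %% 100)   # the remainder keeps the last two digits

# Rebuild a standard yyyy-mm-dd string with the first of the month as the day.
dates <- as.Date(sprintf("%04d-%02d-01", year, month))
print(dates)  # "2005-02-01" "2005-03-01"
```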
I'm trying to convert a column from character to date.
The current format in the column is "YearMmonth", as in "1990M01".
It is a really weird format, so I'm wondering how R reads this when I use as.Date. It is basically "Year, the letter M, month number". I know how to use the rest of the code; I just need to know how to express this format in R.
I have tried using
df <- as.Date(df, "%YM%m", "%Y/%m")
df <- as.Date(paste0("01-", df), format = "%Y/%m/%d")
and a lot of others. The main problem is translating the character column.
There are several problems with the code in the question:
The first attempt in the question does not have a day field.
The second attempt does, but it puts the day field first while the format says it is last.
Assigning to df suggests that the result is meant to be a data frame, yet as.Date returns a Date class object, not a data frame.
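The first two problems can be reproduced on a single string. In this sketch I substitute "1990M01" for the asker's df and drop the stray third argument from the first attempt; as far as I know both calls return NA rather than raising an error:

```r
# Month given without a day: the date is incomplete, so parsing yields NA.
no_day <- as.Date("1990M01", format = "%YM%m")

# Day pasted in front, but the format expects year/month/day with slashes.
wrong_order <- as.Date(paste0("01-", "1990M01"), format = "%Y/%m/%d")

print(no_day)
print(wrong_order)
```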
Here are some approaches that work.
yearmon
Use as.yearmon to convert to a yearmon object or as.Date(as.yearmon(...)) to convert to a Date object. yearmon objects directly represent year and month without day so that may be preferable.
library(zoo)
as.yearmon("1990M01", "%YM%m")
## [1] "Jan 1990"
as.Date(as.yearmon("1990M01", "%YM%m"))
## [1] "1990-01-01"
Replacing the M with a minus would also work:
as.yearmon(chartr("M", "-", "1990M01"))
## [1] "Jan 1990"
Base R
A way that does not involve any packages is to append a day:
as.Date(paste("1990M01", 1), "%YM%m %d")
## [1] "1990-01-01"
or change the M to a minus and append a minus and a 1, in which case the string is already in the default format and no format string is needed:
as.Date(sub("M(\\d+)", "-\\1-1", "1990M01"))
## [1] "1990-01-01"
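For a whole column rather than a single string, the base R approach might be applied as in the sketch below. The data frame and the column names period and date are assumptions for illustration only:

```r
# Hypothetical data frame; `period` is an assumed column name.
df <- data.frame(period = c("1990M01", "1990M02", "1991M12"),
                 stringsAsFactors = FALSE)

# Append a day so the parser has a complete date to work with.
df$date <- as.Date(paste(df$period, "01"), format = "%YM%m %d")

print(df)
```

Assigning to a new column keeps df a data frame, which avoids the third problem listed above.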
I have a data frame with column 1, titled "Label", containing dates in the format "MM/DD/YEAR 00:00". I want to add one more set of times (another ':00') to the end of every element of this column so that the full format is "MM/DD/YEAR 00:00:00". I'm pretty new to R, any advice on how to do this?
If the Label column is character, then paste should work:
df$Label <- paste0(df$Label, ":00")
But note that most of R's date APIs (e.g. strptime, as.Date) can cope with a missing seconds component. Depending on what you plan to do, you may not need this step at all.
Edit:
If you want to preserve both the dates and times while obtaining a bona fide date-time type, then consider using strptime, which returns a POSIXlt object:
df$ts <- strptime("8/11/2017 8:55:00", "%m/%d/%Y %H:%M:%S")
or
df$ts <- strptime("8/11/2017 8:55 PM", "%m/%d/%Y %I:%M %p")
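Putting both points together, here is a rough sketch that parses the strings without seconds and only adds ":00" when rendering them back to text. The sample values, the Label_full column name, and tz = "UTC" are my assumptions; I use as.POSIXct here because it is often more convenient than POSIXlt as a data frame column:

```r
# Hypothetical data frame mirroring the question's "MM/DD/YEAR 00:00" strings.
df <- data.frame(Label = c("8/11/2017 8:55", "12/01/2017 00:00"),
                 stringsAsFactors = FALSE)

# Parse without a seconds field; unspecified seconds are taken as zero.
# tz = "UTC" is an assumption to keep the example reproducible.
df$ts <- as.POSIXct(df$Label, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Render with an explicit seconds component when a string is needed.
df$Label_full <- format(df$ts, "%m/%d/%Y %H:%M:%S")
print(df$Label_full)  # "08/11/2017 08:55:00" "12/01/2017 00:00:00"
```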
I have dates in a character vector. I cannot easily convert to a date vector using as.Date, because not all of the strings have the form mm/dd/yyyy, thus giving me the ambiguous date error. Some strings have the form m/dd/yyyy (months 1:9).
Here's part of the vector:
data$Date <- c("8/26/2014","3/10/2014","9/25/2014","11/12/2014","8/4/2015")
Indicator to let me know which strings I need to add a zero to:
data$date <- grepl("[0-9]{2}/[0-9]{2}/[0-9]{4}", data$Date)
Attempt to add zeros through a conditional:
data$Date<-ifelse(data$date == "FALSE", paste0("0", data$Date), data$Date)
This doesn't work (I'm not familiar with paste). Are there any concise solutions to add a leading zero to single-digit months (m/dd/yyyy)? I'm guessing gsub or sub? I need all the strings to be in the form mm/dd/yyyy so I can convert them to a date vector.
data <- data.frame(Date=c("8/26/2014","3/10/2014","9/25/2014","11/12/2014","8/4/2015"))
as.Date(data$Date,format="%m/%d/%Y")
works fine for me with your data. Output is
"2014-08-26" "2014-03-10" "2014-09-25" "2014-11-12" "2015-08-04"
The dataset looks pretty much like this
I searched around but found only the function that needs a delimiter. I managed to import the file to R successfully with two columns.
Then I want to separate the DATE column into "Year", "Month", and "Date", so I want to have 4 columns in total. This is where I got stuck. The column doesn't have the usual "/" or "-" separators that normally come with a date format.
Thanks for your help.
As @alistaire has shown, you can convert what you have to a date format R recognises with the following (replace the single character string below with your column vector df$DATE to work on the entire data frame):
date <- as.Date( '19981201', '%Y%m%d' )
date
[1] "1998-12-01"
From there, you can separate out your year, month, and day as you please.
year <- format( date, "%Y")
year
[1] "1998"
month <- format( date, "%m" )
month
[1] "12"
day <- format( date, "%d" )
day
[1] "01"
Of course, you could also skip the date step and just split the first 8 characters into 3 shorter strings (as @warmoverflow has suggested), but I'd recommend the above as probably better in practice, mostly because the Date class is what you'll want for things like sorting and plotting, so it is a good idea to use it along the way too.
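For comparison, the string-splitting route might look like this sketch. The sample values are made up, and the results stay character strings rather than dates:

```r
date_chr <- c("19981201", "19990115")

# Slice fixed character positions: yyyy, mm, dd.
year  <- substr(date_chr, 1, 4)
month <- substr(date_chr, 5, 6)
day   <- substr(date_chr, 7, 8)

print(data.frame(year, month, day))
```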
IN RESPONSE TO YOUR ANSWER/FOLLOW-UP-QUESTION:
Notice in the console output in step 3, the column vector is labeled as class int (integer). You probably need to make sure it's being fed to as.Date as character. It looks like that's what you tried to do in step 4, but by surrounding the vector reference in quotations, you pass the character string "v1$DATE", which R has no idea what to do with. Instead:
v1$date_v2 <- as.Date( as.character( v1$DATE ), format = "%Y%m%d" )
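Combining the steps, an integer DATE column can be turned into the three extra columns roughly as follows. The data frame contents and the VALUE, Year, Month and Day column names are assumptions for illustration:

```r
# Hypothetical two-column data frame; VALUE is an assumed second column.
v1 <- data.frame(DATE = c(19981201L, 19981202L), VALUE = c(1.5, 2.5))

# Convert the integers to character before parsing them as dates.
v1$date_v2 <- as.Date(as.character(v1$DATE), format = "%Y%m%d")

# Pull the components back out as integers.
v1$Year  <- as.integer(format(v1$date_v2, "%Y"))
v1$Month <- as.integer(format(v1$date_v2, "%m"))
v1$Day   <- as.integer(format(v1$date_v2, "%d"))

print(v1)
```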
I have a column named time that contains the date and time in this format, e.g. 03/20/2016 14:24:47.153. I want to extract the time portion to produce a line graph (so I would like a sensible time interval on the x axis). How can I do this?
We can use sub to extract the "time" from the string.
sub("\\S+\\s+", "", str1)
#[1] "14:24:47.153"
Or convert to a date-time class and then format:
options(digits.secs=3)
format(strptime(str1, format= "%m/%d/%Y %H:%M:%OS"), "%H:%M:%OS3")
#[1] "14:24:47.153"
data
str1 <- "03/20/2016 14:24:47.153"
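Since the goal is a line graph, one possible follow-up is to turn the extracted time strings into seconds since midnight, which gives a plain numeric x axis. This is a sketch on made-up sample values, and secs is a hypothetical variable name:

```r
str1 <- c("03/20/2016 14:24:47.153", "03/20/2016 14:25:02.500")

# Drop everything up to and including the first run of whitespace.
time_chr <- sub("\\S+\\s+", "", str1)

# Convert "HH:MM:SS.sss" to seconds since midnight.
parts <- strsplit(time_chr, ":", fixed = TRUE)
secs <- vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))

print(time_chr)
print(secs)
```

Working from the string parts sidesteps any rounding of fractional seconds that can occur when formatting a date-time object.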