I am working with a data.frame whose dates are given in the following format: Aug 12, 2017.
class(data[,1]) = factor
How can I convert these into dates?
data[,1] <- as.Date.factor(data[,1], format = "%m.%d.%y") returns NAs.
I would suggest the lubridate package, which has very easy-to-use functions for working with dates. For example:
mdy("Aug 12,2017")
[1] "2017-08-12"
If your date is in YYYY-MM-DD format, you can use the ymd function. There are also other functions such as dmy, dmy_hms (for datetime), etc.
If your column is called my.date, you can do:
data$my.date <- mdy(data$my.date)
Alternatively, you can use the %<>% operator from magrittr to make your code even shorter:
data$my.date %<>% mdy
Use as.POSIXct (Base-R Solution):
as.POSIXct("Aug 12,2017", format="%b%d,%Y")
Output:
[1] "2017-08-12 CEST"
strptime could also work:
strptime("Aug 12,2017", "%b%d,%Y")
Output:
[1] "2017-08-12 UTC"
The second parameter for strptime is the format of the dates you have. For instance, if your dates are like this "1/5/2005", then the format would be:
format="%m/%d/%Y"
Hope it helps
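The format in the question, "%m.%d.%y", expects numeric months separated by dots, which does not match "Aug 12, 2017", hence the NAs. A minimal base-R sketch of the column conversion follows; the data frame and its my.date column are made up for illustration, and %b is matched against month names in the current locale, so an English locale is assumed.

```r
# Hypothetical data frame mimicking the question: a factor column of dates.
data <- data.frame(my.date = factor(c("Aug 12, 2017", "Sep 3, 2017")))

# %b = abbreviated month name, %d = day of month, %Y = four-digit year.
# Converting the factor to character first makes the intent explicit.
data$my.date <- as.Date(as.character(data$my.date), format = "%b %d, %Y")

class(data$my.date)
data
```

The result is a Date column, so sorting and date arithmetic work directly on it.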
Related
I want to convert strings such as "19-SEP-2022" to date. Is there any available function in R? Thank you.
Just for completeness, I want to add the parse_date_time function from the lubridate package. Without a doubt, the preferred answer here is that of @Marco Sandri:
library(lubridate)
x <- "19-SEP-2022"
x <- parse_date_time(x, "dmy")
x
[1] "2022-09-19 UTC"
class(x)
[1] "POSIXct" "POSIXt"
Yes, strptime can be used to parse strings into dates.
You could do something like strptime("19-SEP-2022", "%d-%b-%Y").
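On input, leading zeros are optional for %d, so the same format also works when the days are not zero-padded; %e differs from %d only on output, where it pads single-digit days with a space.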
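As a small sketch of the strptime-style format applied through as.Date (which returns a Date rather than a POSIXlt object): the second input string is an invented example, and month-name matching with %b depends on the current locale, so an English locale is assumed here.

```r
# Two sample strings; the second one is made up to show a non-padded day.
x <- c("19-SEP-2022", "5-Jan-2023")

# %d = day, %b = abbreviated month name, %Y = four-digit year.
d <- as.Date(x, format = "%d-%b-%Y")

d
class(d)
```

Using as.Date avoids carrying a time zone around when only the calendar date matters.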
A decade or so ago I started writing the anytime package because of the firm belief that for obvious date(time) patterns we should not need to specify patterns, or learn grammars.
I still use it daily, and so do a bunch of other CRAN users.
> anytime::anydate("19-SEP-2022")
[1] "2022-09-19"
So here we do exactly what you ask for: supply the string, return a date object.
This is the first line of my dataframe (with column names):
site, date, value
TEES, 20000314, 315
As you can see, the dates don't have separators (- or /), so I can't use as.Date. Thus, I need something like this:
TEES, 2000-03-14, 315
How do I do this? Presumably something with sub
Will this work:
as.Date(gsub('(\\d{4})(\\d{2})(\\d{2})','\\1-\\2-\\3',df$date))
[1] "2000-03-14"
Data:
df
site date value
1 TEES 20000314 315
You could use the ymd function from the lubridate package. This will automatically add "-" to separate YYYY-MM-DD and convert it to Date.
library(lubridate)
ymd(df$date)
# "2000-03-14"
You can use as.Date; you just need to specify the tryFormats argument:
as.Date("20000314", tryFormats = c("%Y%m%d"))
[1] "2000-03-14"
The default is to try these formats: c("%Y-%m-%d", "%Y/%m/%d"), which don't match your current structure so you have to tell it how to read your structure.
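When the format is known in advance, the format argument of as.Date does the same job. The sketch below assumes the date column was read in as a number (as the printed data frame suggests), which is why it is converted to character before parsing; the data frame is rebuilt by hand for illustration.

```r
# One-row data frame rebuilt from the question; date is numeric here.
df <- data.frame(site = "TEES", date = 20000314, value = 315)

# Convert to character first, then parse with an explicit format.
df$date <- as.Date(as.character(df$date), format = "%Y%m%d")

df
```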
We can use anydate from anytime
library(anytime)
anydate("20000314")
#[1] "2000-03-14"
I have data in YYYYMM format and I wish to convert it to YYYY-MM format.
Example: 201805 should become 2018-05.
How could I do this, please?
We can use as.yearmon from zoo to convert it to a yearmon object and then format it:
library(zoo)
format(as.yearmon(as.character(v1), "%Y%m"), "%Y-%m")
#[1] "2018-05"
data
v1 <- 201805
I like the idea of using actual dates here. If the day component does not matter to you, then you may arbitrarily set each of your dates to the first of the month. Then we can leverage R's date functions to handle the heavy lifting.
x <- "201805"
x <- paste0(x, "01")
x
[1] "20180501"
y <- format(as.Date(x, format = "%Y%m%d"), "%Y-%m-%d")
substr(y, 1, 7)
[1] "2018-05"
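The first-of-the-month idea can be wrapped in a small vectorized helper. This is only a sketch: yyyymm_to_dash is a hypothetical name, not a function from any package. One side benefit of going through real dates is that impossible months come back as NA instead of being reformatted silently.

```r
# Hypothetical helper: turn "YYYYMM" (character or numeric) into "YYYY-MM".
yyyymm_to_dash <- function(x) {
  # Append day "01" so the value can be parsed as a full date.
  d <- as.Date(paste0(as.character(x), "01"), format = "%Y%m%d")
  # Keep only the year and month on output.
  format(d, "%Y-%m")
}

yyyymm_to_dash(c("201805", "201912"))
yyyymm_to_dash(201813)  # month 13 does not exist, so this is NA
```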
You could use regular expressions:
data <- "201805"
sub("(\\d{4})", "\\1-", data)
[1] "2018-05"
Another variant, using only lookarounds:
sub("(?<=\\d{4})(?=\\d{2})", "-", data, perl=TRUE)
How about the following (I am assuming that the OP does not need to perform any checks on the variable's value here)?
val="201805"
sub("(..$)","-\\1",val)
Or, to perform the substitution on the last 2 digits only, try the following.
val="201805"
sub("(\\d{2}$)","-\\1",val)
[1] "2018-05"
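If some light checking is wanted after all, one option is to anchor the pattern at both ends so that only values made of exactly six digits are rewritten. A minimal sketch, with an invented vals vector for illustration:

```r
# Sample inputs: a well-formed value, an already converted one, a malformed one.
vals <- c("201805", "2018-05", "18051")

# ^ and $ force the whole string to be four digits followed by two digits;
# anything else is returned unchanged because sub() finds no match.
out <- sub("^(\\d{4})(\\d{2})$", "\\1-\\2", vals)

out
```

Unmatched values pass through untouched, so they can be inspected separately afterwards.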
Very similar to some of the others, but because I find the package useful I will mention it:
library(lubridate)
date <- "201805"
format(ymd(paste0(date,"01")), "%Y-%m")
Lubridate can make life easy if the formats start to vary.
Here is another option albeit a longer one:
library(tidyverse)
somestring<-"201805"
stringi::stri_sub(somestring,1,4)<-"-"
somestring1<-"201805"
somestring2<-substring(somestring1,1,4)
as.character.Date(paste0(somestring2,somestring))
Result:
"2018-05"
I'm using strptime to extract date and the result is a wrong year
Where is the error in the below code:
strptime('8/29/2013 14:13', "%m/%d/%y")
[1] "2020-08-29 PDT"
What are the other ways to extract date and time as separate columns?
The data I have is in this format - 8/29/2013 14:13
I want to split this into two columns, one is 8/29/2013 and the other is 14:13.
You have a four digit year so you need to use %Y
strptime('8/29/2013 14:13', "%m/%d/%Y" )
[1] "2013-08-29 CEST"
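The difference between the two conversion specifications can be seen side by side. In this sketch tz = "UTC" is passed only to make the result independent of the machine's time zone; it is an addition for illustration, not part of the original calls.

```r
s <- "8/29/2013 14:13"

# %y reads only two digits ("20"), which R maps to the year 2020;
# the rest of the string is ignored.
two_digit <- strptime(s, "%m/%d/%y", tz = "UTC")

# %Y reads the full four-digit year; %H:%M also picks up the time.
four_digit <- strptime(s, "%m/%d/%Y %H:%M", tz = "UTC")

format(two_digit, "%Y")
format(four_digit, "%Y-%m-%d %H:%M")
```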
Do you really want date and time in separate columns? It is usually much easier to deal with a single date-time object.
Here's one possibility to separate time and date from the string.
For convenience, we could first convert the string into a POSIX object:
datetime <- '8/29/2013 14:13'
datetime.P <- as.POSIXct(datetime, format='%m/%d/%Y %H:%M')
Then we can use as.Date() to extract the date from this object and use format() to display it in the desired format:
format(as.Date(datetime.P),"%m/%d/%Y")
#[1] "08/29/2013"
To store the time separately we can use, e.g., the strftime() function:
strftime(datetime.P, '%H:%M')
#[1] "14:13"
strftime() is vectorized, so the same calls work unchanged when datetime is a vector containing several character strings with date and time in the format described in the OP; no loop such as sapply() is needed.
Example
datetime <- c('8/29/2013 14:13', '9/15/2014 12:03')
datetime.P <- as.POSIXct(datetime, format='%m/%d/%Y %H:%M')
format(as.Date(datetime.P),"%m/%d/%Y")
#[1] "08/29/2013" "09/15/2014"
strftime(datetime.P, '%H:%M')
#[1] "14:13" "12:03"
Hope this helps.
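For the two-column layout asked about in the question, a sketch on a data frame could look like this. The column names stamp, date and time are invented, and tz = "UTC" is an assumption made so that the date part cannot shift when the time zone changes.

```r
# Hypothetical data frame with one combined date-time column.
df <- data.frame(stamp = c("8/29/2013 14:13", "9/15/2014 12:03"),
                 stringsAsFactors = FALSE)

# Parse once, with an explicit time zone.
parsed <- as.POSIXct(df$stamp, format = "%m/%d/%Y %H:%M", tz = "UTC")

# format() is vectorized, so each new column is filled in one call.
df$date <- format(parsed, "%m/%d/%Y")
df$time <- format(parsed, "%H:%M")

df
```

Note that both new columns are character strings; the parsed object is the one to keep for any further date arithmetic.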
Thanks for your help in advance. I am working with the getQuote function in the quantmod package, which returns the following data frame:
Is there a way to modify all the dates in the first column to exclude the time stamp, while retaining the data frame structure? I just want the "YYYY-MM-DD" in the first column. I know that if it were a vector of dates, I would use substr(df[,1],1,10). I have also looked into the apply function, with: apply(df[,1],1,substr,1,10).
Another option not mentioned yet:
tt <- getQuote("AAPL")
trunc(tt[,1], units='days')
This returns the date in POSIXlt. You can wrap it in as.POSIXct, if you want.
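Since getQuote needs a live data source, the behaviour of trunc can be tried on a hand-made timestamp instead. In this sketch the value of stamp and the UTC time zone are assumptions standing in for the first column of tt.

```r
# Stand-in for one entry of tt[,1].
stamp <- as.POSIXct("2013-01-16 02:52:00", tz = "UTC")

# trunc() drops the time of day and returns a POSIXlt object.
day_lt <- trunc(stamp, units = "days")
class(day_lt)

# Convert back to POSIXct if that class is needed, e.g. for a data frame column.
day_ct <- as.POSIXct(day_lt)
format(day_ct, "%Y-%m-%d %H:%M:%S")
```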
using ?strptime
tt <- getQuote("AAPL")
tt[,1]
[1] "2013-01-16 02:52:00 CET"
as.POSIXct(strptime(tt[,1],format ='%Y-%m-%d')) ## as.POSIXct because strptime returns POSIXlt
[1] "2013-01-16 CET"
EDIT
You can use the format argument of as.POSIXct, but you need to convert tt[,1] to character first.
as.POSIXct(as.character(tt[,1]),format ='%Y-%m-%d')
[1] "2013-01-16 CET"
I would do this with lubridate
library(quantmod) # provides getQuote
library(plyr)
library(lubridate)
tickers <- c("AAPL","AAJX","ABR")
df <- ldply(tickers, getQuote)
rownames(df) <- tickers
df[,"Trade Time"] <- paste(year(df[,"Trade Time"]),month(df[,"Trade Time"]),day(df[,"Trade Time"]),sep="-")
There might be a more elegant way of printing the date, but this is what came to me first.
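One thing to be aware of when pasting year, month and day together is that numeric components lose their leading zeros, whereas format() keeps them. The sketch below uses base R only and two invented timestamps in place of the "Trade Time" column.

```r
# Stand-ins for the "Trade Time" column.
stamps <- as.POSIXct(c("2013-01-16 02:52:00", "2013-11-05 16:00:00"), tz = "UTC")

# Pasting numeric components drops the zero padding.
unpadded <- paste(as.integer(format(stamps, "%Y")),
                  as.integer(format(stamps, "%m")),
                  as.integer(format(stamps, "%d")), sep = "-")

# format() produces the padded "YYYY-MM-DD" form in one call.
padded <- format(stamps, "%Y-%m-%d")

unpadded
padded
```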
You may just use gsub. No need to convert data type.
tt <- getQuote("AAPL")
tt[, 'Trade Time']<- gsub(" [0-9]{2}:[0-9]{2}:[0-9]{2}", "", tt[, 'Trade Time'])
It can be as simple as:
tt[,1]=as.Date(tt[,1])
(where tt is tt <- getQuote("AAPL"), as shown in the alternative answers)
The blank before the comma means "do all rows" and the 1 after the comma means "operate on (just) the first column".
I prefer this solution because it gives you a Date object, which must be exactly what you want if you are trying to strip off timestamps.
agstudy's answer gives you a date with a timezone, and that is going to bite you the first time you run your script in a different timezone. (Aside: I got some regressions in a unit test suite when I ran it in the U.K. while there at Christmas, due to a subtle timezone assumption in my test code.)
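The time zone matters for as.Date on a date-time as well, because the calendar date depends on which zone the instant is viewed in. A minimal sketch, using a hand-made timestamp just after midnight in Berlin (the value and the zone names are chosen for illustration only):

```r
# 00:30 in Berlin is still 23:30 of the previous day in UTC.
stamp <- as.POSIXct("2013-01-16 00:30:00", tz = "Europe/Berlin")

# Passing tz explicitly makes the intended calendar date unambiguous.
d_local <- as.Date(stamp, tz = "Europe/Berlin")
d_utc   <- as.Date(stamp, tz = "UTC")

d_local
d_utc
```

Stating tz explicitly in as.Date keeps the result the same regardless of where the script runs.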