Strptime seems to be missing something in this scenario:
aDateInPOSIXct <- strptime("2018-12-31", format = "%Y-%m-%d")
someText <- "asdf"
df <- data.frame(aDateInPOSIXct, someText, stringsAsFactors = FALSE)
bDateInPOSIXct <- strptime("2019-01-01", format = "%Y-%m-%d")
df[1,1] <- bDateInPOSIXct
Assignment of bDate to the dataframe fails with:
Error in as.POSIXct.numeric(value) : 'origin' must be supplied
And a warning:
provided 11 variables to replace 1 variables
I want to use both POSIXct dates and POSIXct date-times to compare this and that. It's way less work than manipulating character strings -- and POSIX takes care of the time zone issues. Unfortunately, I'm missing something.
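data.frame() stores the first date as POSIXct when it builds the column, but strptime returns a POSIXlt object, which is a list of components (hence the "11 variables" in the warning), so the later assignment fails. You only need to convert the result of each strptime call to POSIXct explicitly: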
aDateInPOSIXct <- as.POSIXct(strptime("2018-12-31", format = "%Y-%m-%d"))
someText <- "asdf"
df <- data.frame(aDateInPOSIXct, someText, stringsAsFactors = FALSE)
bDateInPOSIXct <- as.POSIXct(strptime("2019-01-01", format = "%Y-%m-%d"))
df[1,1] <- bDateInPOSIXct
Check the R documentation which says:
Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct".
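The difference between the two classes can be checked directly. The following is a minimal sketch in base R; the names lt, ct and when are made up for illustration and do not come from the question.

```r
# strptime() returns POSIXlt: a list of date-time components
lt <- strptime("2019-01-01", format = "%Y-%m-%d")
class(lt)
is.list(unclass(lt))      # TRUE: stored as a list

# as.POSIXct() turns it into a single number (seconds since 1970-01-01 UTC)
ct <- as.POSIXct(lt)
class(ct)
is.numeric(unclass(ct))   # TRUE: one number per date-time

# data.frame() stores date-times as POSIXct, so a POSIXct value fits a cell
df <- data.frame(when = as.POSIXct(strptime("2018-12-31", format = "%Y-%m-%d")),
                 someText = "asdf", stringsAsFactors = FALSE)
df[1, 1] <- ct
format(df[1, 1], "%Y-%m-%d")
```

Converting once, right after parsing, keeps every later assignment and comparison on the same class.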
My problem is that I am importing a CSV file, and trying to get R to recognize the date column as dates and format them as such.
So far I have managed to replace the format shown below, "#yyyy-mm-dd#", with the integer date value in R.
But when I check the class before and after the transformation, it still says "character".
I need the column to be recognized as a date class so that I can use it for forecasting.
DemandCSV <- read_csv("C:/Users/pth/Desktop/Care/Demand.csv")
nrow <- nrow(DemandCSV)
for(i in 1:nrow){
DemandCSV[i,1] <-as.Date(ymd(substr(DemandCSV[i,1], 2, 11)))
}
DemandCSV[,1] <- format(DemandCSV[,1], "%Y-%m-%d")
Figured out an inelegant solution (turns out it was not a solution)
DemandCSV <- read_csv("C:/Users/pth/Desktop/Care/Demand.csv")
nrow <- nrow(DemandCSV)
for(i in 1:nrow){
DemandCSV[i,1] <-as.Date(ymd(substr(DemandCSV[i,1], 2, 11)))
DemandCSV[i,1] <- format(as.Date(as.numeric(DemandCSV[i,1],origin = "01-01-1970")), "%Y-%m-%d")}
DemandCSV %>% pad %>% fill_by_value(0)
Does including the "#" in the format string solve your problem?
data <- c("#2019-09-23#", "#2019-09-24#", "#2019-09-25#")
a <- as.Date(data,format="#%Y-%m-%d#")
or
library(dplyr)  # provides mutate_at
DemandCSV <- data.frame(date=
c("#2019-09-23#", "#2019-09-24#", "#2019-09-25#"))
mutate_at(DemandCSV,"date",as.Date,format="#%Y-%m-%d#")
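A likely reason the loop in the question left the column as "character" is that it wrote one Date at a time into an existing character column. The sketch below illustrates this in base R on a made-up stand-in for the imported data; the column name date and the variable one_cell are assumptions for illustration.

```r
# Hypothetical stand-in for the imported CSV column
DemandCSV <- data.frame(date = c("#2019-09-23#", "#2019-09-24#", "#2019-09-25#"),
                        stringsAsFactors = FALSE)

# Writing a Date into one element of a character vector coerces the Date
# to character, so the vector keeps class "character"
one_cell <- DemandCSV$date
one_cell[1] <- as.Date(one_cell[1], format = "#%Y-%m-%d#")
class(one_cell)

# Replacing the whole column at once swaps in a genuine Date vector
DemandCSV$date <- as.Date(DemandCSV$date, format = "#%Y-%m-%d#")
class(DemandCSV$date)
```

Because as.Date is vectorised, no loop is needed for the conversion.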
It may be simpler to substitute out the # and rely on anydate from the anytime package.
Demo:
R> data <- c("#2019-09-23#", "#2019-09-24#", "#2019-09-25#")
R> anytime::anydate(gsub("#", "", data))
[1] "2019-09-23" "2019-09-24" "2019-09-25"
R>
I have a column of dates, exported from Excel as CSV and read into a data frame with the default settings of "import dataset..." "...from CSV", i.e. d<-read_csv(data.csv).
From the data frame I would like to create a zoo and/or xts object.
The data is:
30/04/2016
31/05/2016
30/06/2016
I get the following errors:
dates <- c('30/04/2016','31/05/2016','30/06/2016')
d <- dates
z <- read.zoo(d)
Error in read.zoo(d) : index has bad entry at data row 1
z <- read.zoo(d, FUN = as.Date())
Error in as.Date() : argument "x" is missing, with no default
z <- read.zoo(d, FUN = as.Date(format="%d/%m/%Y"))
Error in as.Date(format = "%d/%m/%Y") : argument "x" is missing,
with no default
Alternatively, if I read directly into zoo with the format argument, I get a different error:
ts.z <- read.zoo(d,index=1,tz='',format="%d/%m/%Y")
Error in read.zoo(d, index = 1, tz = "", format = "%d/%m/%Y") :
index has bad entry at data row 1
What are the bad entry row 1 errors? What are the correct ways to specify FUN = ? What are the correct input classes and distinctions for read.zoo?
From ?read.zoo about the file-parameter:
character string or strings giving the name of the file(s) which the
data are to be read from/written to. See read.table and
write.table for more information. Alternatively, in read.zoo, file
can be a connection or a data.frame (e.g., resulting from a previous
read.table call) that is subsequently processed to a "zoo" series.
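What is going wrong in your example is that d is neither a filename, a connection, nor a data.frame; it is a plain character vector. You will have to wrap it in data.frame(). Note also that FUN expects the function itself (as.Date), not a call to it (as.Date()), which is why the latter fails with 'argument "x" is missing'.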
A working example:
z <- read.zoo(data.frame(dates), FUN = as.Date, format='%d/%m/%Y')
which gives:
> z
2016-04-30
2016-05-31
2016-06-30
> class(z)
[1] "zoo"
Used input data:
dates <- c('30/04/2016','31/05/2016','30/06/2016')
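If you prefer to keep the format together with the conversion function, FUN can also be an anonymous function. This is only a sketch: it assumes the zoo package is installed, and the value column is made up so that the series has some data attached to the index.

```r
library(zoo)

# Hypothetical two-column input: dates as text plus a made-up value column
d <- data.frame(dates = c("30/04/2016", "31/05/2016", "30/06/2016"),
                value = c(1.5, 2.5, 3.5),
                stringsAsFactors = FALSE)

# FUN receives the index column; wrap as.Date so the format travels with it
z <- read.zoo(d, FUN = function(x) as.Date(x, format = "%d/%m/%Y"))

class(z)
class(index(z))
```

Either way, what read.zoo needs is a function it can call on the index column, not the result of calling that function with no input.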
I have a csv file containing financial data (i.e. dates with corresponding prices). My goal is to load these data in R and convert the dates from character data to dates. I tried the following:
data<-read.csv("data.csv",sep=";")
attach(data)
as.Date(Date,format="%Y-%b-%d") #'Date' is the column containing the dates
Unfortunately, this only leads to NAs in Date. Things that were proposed in other threads on this issue but did not help me:
reading in the csv file with 'stringsAsFactors=FALSE'
formatting the dates in Excel as dates
Here is a sample of my csv file:
Date;Open;High;Low;Close;Volume;Adj Close
30.10.2015;10842.51953;10850.58008;10748.7002;10850.13965;89270000;10850.13965
29.10.2015;10867.19043;10886.98047;10741.13965;10800.83984;122513100;10800.83984
28.10.2015;10728.16016;10848.41016;10691.62988;10831.95996;0;10831.95996
27.10.2015;10761.37012;10807.41016;10692.19043;10692.19043;0;10692.19043
26.10.2015;10791.17969;10863.08984;10756.83008;10801.33984;73091500;10801.33984
23.10.2015;10610.33008;10847.46973;10586.95996;10794.54004;0;10794.54004
22.10.2015;10213.00977;10508.25;10194.74023;10491.96973;107511600;10491.96973
21.10.2015;10185.41992;10277.58984;10107.91992;10238.09961;70021400;10238.09961
20.10.2015;10174.79981;10194.53027;10080.19043;10147.67969;67235200;10147.67969
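Your format argument was incorrect, which is usually the cause of NAs when coercing strings to Date objects: "%Y-%b-%d" describes text such as 2015-Oct-30 (in an English locale), whereas your dates are written as day.month.year. You can use this instead: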
R> as.Date(Df$Date, format = "%d.%m.%Y")
#[1] "2015-10-30" "2015-10-29" "2015-10-28" "2015-10-27" "2015-10-26"
#[6] "2015-10-23" "2015-10-22" "2015-10-21" "2015-10-20"
Instead of attach, you can use alternatives such as within to avoid qualifying your column names. For example,
Df <- within(Df, {
Date <- as.Date(Date, format = "%d.%m.%Y")
})
##
R> class(Df$Date)
#[1] "Date"
Data:
Df <- read.table(
text = "Date;Open;High;Low;Close;Volume;Adj Close
30.10.2015;10842.51953;10850.58008;10748.7002;10850.13965;89270000;10850.13965
29.10.2015;10867.19043;10886.98047;10741.13965;10800.83984;122513100;10800.83984
28.10.2015;10728.16016;10848.41016;10691.62988;10831.95996;0;10831.95996
27.10.2015;10761.37012;10807.41016;10692.19043;10692.19043;0;10692.19043
26.10.2015;10791.17969;10863.08984;10756.83008;10801.33984;73091500;10801.33984
23.10.2015;10610.33008;10847.46973;10586.95996;10794.54004;0;10794.54004
22.10.2015;10213.00977;10508.25;10194.74023;10491.96973;107511600;10491.96973
21.10.2015;10185.41992;10277.58984;10107.91992;10238.09961;70021400;10238.09961
20.10.2015;10174.79981;10194.53027;10080.19043;10147.67969;67235200;10147.67969",
header = TRUE, stringsAsFactors = FALSE, sep = ";")
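To see the failure mode in isolation, here is a small base R sketch on two of the sample values; the names raw, wrong and right are only for illustration. A format that does not match the text gives NA silently, so a quick check after the conversion is worthwhile.

```r
# Two sample values in the day.month.year layout of the CSV above
raw <- c("30.10.2015", "29.10.2015")

# A format that does not match the text yields NA rather than an error
wrong <- as.Date(raw, format = "%Y-%b-%d")
anyNA(wrong)

# The matching format parses every value
right <- as.Date(raw, format = "%d.%m.%Y")
anyNA(right)

# Cheap guard to run after any conversion (stops if parsing failed)
stopifnot(!anyNA(right))
```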
This seems like a simple enough function to write, but I think I'm misunderstanding the requirements for formal arguments / how R parses and evaluates a function.
I'm trying to write a function that converts any character vector of the form "%m/%d/%Y" (and belonging to data.frame df) to a date vector, and formats it as "%m/%d/%Y", as follows:
dateformat <- function(x) {
df$x <- (format(as.Date(df$x, format = "%m/%d/%Y"), "%m/%d/%Y"))
}
I was thinking that...
dateformat(a)
... would just take the "a" as the actual argument for x and plug it into the function, thus resolving as:
df$a <- (format(as.Date(df$a, format = "%m/%d/%Y"), "%m/%d/%Y"))
However, I get the following error when running dateformat(a):
Error in as.Date.default(df$x, format = "%m/%d/%Y") :
do not know how to convert 'df$x' to class “Date”
Can someone please explain why my understanding of formal/actual arguments and/or R function parsing/evaluation is incorrect? Thank you.
Update
Of course, for all the variables I want to convert to dates (e.g., df$a, df$b, df$c), I could just write
df$a <- (format(as.Date(df$a, format = "%m/%d/%Y"), "%m/%d/%Y"))
df$b <- (format(as.Date(df$b, format = "%m/%d/%Y"), "%m/%d/%Y"))
df$c <- (format(as.Date(df$c, format = "%m/%d/%Y"), "%m/%d/%Y"))
But I'm looking to improve my coding skills by making a more general function to which I could feed a vector of variables. For instance, what if I had df$a to df$z, all character variables that I wanted to convert to date variables? After I write a proper function, I'd like to then perhaps run it like so:
for (n in letters) {
dateformat(n)
}
First, the format(...) function returns a character vector, not a date, so if x is a string,
format(as.Date(x, format = "%m/%d/%Y"), "%m/%d/%Y")
converts x to date and then back to character, as in:
result <- format(as.Date("01/03/2014", format = "%m/%d/%Y"), "%m/%d/%Y")
result
# [1] "01/03/2014"
class(result)
# [1] "character"
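In practice this means keeping the column as Date for any computation and calling format() only when the value is displayed. A minimal base R sketch, with d, later and shown as illustrative names:

```r
d <- as.Date("01/03/2014", format = "%m/%d/%Y")

# A Date supports arithmetic and comparison
later <- d + 30
later > d

# format() is for display only; its result is plain text
shown <- format(d, "%m/%d/%Y")
class(d)
class(shown)
```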
Second, assigning to an object such as df inside a function (that is, using it on the LHS of an expression) causes R to create a local copy of that object in the scope of the function; the df in the global environment is left unchanged.
a <- 2
f <- function(x) a <- x
f(3)
a
# [1] 2
Here, we set a variable, a, to 2. Then in the function we create a new variable, a in the scope of the function, set it to x (3), and destroy it when the function returns. So in the global environment a is still 2.
If you insist on using a dateformat(...) function, this should work:
df <- data.frame(a=paste("01",1:10,"2014",sep="/"),
b=paste("02",11:20,"2014",sep="/"),
c=paste("03",21:30,"2014",sep="/"))
dateformat <- function(x) as.Date(df[[x]], format = "%m/%d/%Y")
for (n in letters[1:3]) df[[n]] <- dateformat(n)
sapply(df,class)
# a b c
# "Date" "Date" "Date"
This will be more efficient though:
df <- as.data.frame(lapply(df,as.Date,format="%m/%d/%Y"))
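As for the original error: `$` does not evaluate its right-hand side, so df$x looks for a column literally named "x", finds none, and hands NULL to as.Date. `[[` takes the column name as a string instead. The sketch below shows this and an alternative that passes the data frame in and returns the modified copy, rather than relying on a global df; convert_dates, data, cols and fmt are hypothetical names.

```r
# Hypothetical helper: converts the named columns and returns the new data frame
convert_dates <- function(data, cols, fmt = "%m/%d/%Y") {
  data[cols] <- lapply(data[cols], as.Date, format = fmt)
  data
}

df <- data.frame(a = paste("01", 1:3, "2014", sep = "/"),
                 b = paste("02", 11:13, "2014", sep = "/"),
                 stringsAsFactors = FALSE)

x <- "a"
is.null(df$x)             # TRUE: `$` looks for a column literally named "x"
identical(df[[x]], df$a)  # TRUE: `[[` uses the value stored in x

df <- convert_dates(df, c("a", "b"))
sapply(df, class)
```

Returning the data frame and reassigning it at the call site keeps the function free of side effects.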
Please help: I have a CSV file from a large database with a date column containing dates in various formats, such as 20080408, 2008/04/08, or 08/04/2008. How do I change these to a single format, dd/mm/yyyy, in R?
You can do it with failure tests via lubridate's mdy and ymd conversions as well (hence the suppressWarnings() calls). I don't think you're going to be able to ensure proper handling of things like "08/04/2008" if 08 is supposed to be the "day" component, though, given that the functions can't read minds.
library(lubridate)
dat <- c("20080408", "2008/04/08", "08/04/2008")
dat.1 <- unlist(lapply(dat, function(x) {
suppressWarnings(res <- mdy(x))
if (is.na(res)) { suppressWarnings(res <- ymd(x)) }
return(as.character(res))
}))
dat.1
## [1] "2008-04-08" "2008-04-08" "2008-08-04"
The following should work for your data.frame. You may need to convert your date column to class character so that the string-splitting function strsplit works correctly. After that, the loop simply counts the characters in the string before the first "/" character and picks the format accordingly.
Example:
df <- data.frame(DATE=as.character(c("20080408", "2008/04/08", "08/04/2008")), DATE2=as.Date(NA))
df$DATE=as.character(df$DATE)
for(i in seq(df$DATE)){
sp <- unlist(strsplit(df$DATE[i], "/"))
if(nchar(sp[1]) == 8){
df$DATE2[i] <- as.Date(df$DATE[i], format="%Y%m%d")
}
if(nchar(sp[1]) == 4){
df$DATE2[i] <- as.Date(df$DATE[i], format="%Y/%m/%d")
}
if(nchar(sp[1]) == 2){
df$DATE2[i] <- as.Date(df$DATE[i], format="%d/%m/%Y")
}
}
Result:
df
# DATE DATE2
#1 20080408 2008-04-08
#2 2008/04/08 2008-04-08
#3 08/04/2008 2008-04-08
You can read them as character values and convert them using as.Date.
x1 <- '20080408' ## class character (string)
x2 <- '2008/04/08'
x1.dt <- as.Date(x1, format='%Y%m%d')
x2.dt <- as.Date(x2, format='%Y/%m/%d') ## different format
format(c(x1.dt, x2.dt), format='%d/%m/%Y') ## you can render Date objects as text in any format you want
Check out ?strftime for all the formatting options.
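For a whole column with mixed layouts, the same idea can be applied per pattern without an element-by-element loop. This is a sketch in base R, not a general-purpose parser: parse_mixed_dates is a hypothetical helper, it only knows the three layouts from the question, and it assumes that strings like "08/04/2008" are day/month/year.

```r
# Hypothetical helper: choose the format from the shape of each string.
# Assumption: strings like "08/04/2008" are day/month/year.
parse_mixed_dates <- function(x) {
  formats <- c("^[0-9]{8}$"                   = "%Y%m%d",
               "^[0-9]{4}/[0-9]{2}/[0-9]{2}$" = "%Y/%m/%d",
               "^[0-9]{2}/[0-9]{2}/[0-9]{4}$" = "%d/%m/%Y")
  out <- rep(as.Date(NA), length(x))
  for (pattern in names(formats)) {
    hit <- grepl(pattern, x)
    out[hit] <- as.Date(x[hit], format = formats[[pattern]])
  }
  out  # strings that match no pattern stay NA
}

dat <- c("20080408", "2008/04/08", "08/04/2008")
parsed <- parse_mixed_dates(dat)

# One common output layout, dd/mm/yyyy, as text
format(parsed, "%d/%m/%Y")
```

Matching on the shape of the string first avoids relying on a failed parse to reject a layout, and any value left as NA points at a layout that still needs a rule.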