create date vector, change to factor and format - r

I have the following code:
gsub("-","/",paste(cut(seq(as.POSIXct(Sys.Date(),format="%d-%b-%y"), by = "-1 day", length.out = 10),"days"),collapse = ","))
The output:
"2019/03/20,2019/03/19,2019/03/18,2019/03/17,2019/03/16,2019/03/15,2019/03/14,2019/03/13,2019/03/12,2019/03/11"
However the desired result is
'20/03/2019','19/03/2019','18/03/2019','17/03/2019','16/03/2019','15/03/2019','14/03/2019','13/03/2019','12/03/2019','11/03/2019'
How can I accomplish that?
Regards

I'm not sure what you are trying to do, but you can generate the required output with:
format(Sys.Date() - 1:10, "%d/%m/%Y")
#[1] "20/03/2019" "19/03/2019" "18/03/2019" "17/03/2019" "16/03/2019" "15/03/2019"
# "14/03/2019" "13/03/2019" "12/03/2019" "11/03/2019"
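If the single-quoted, comma-separated string from the question is what is ultimately needed, the formatted dates can be wrapped with paste0(). This is only a minimal sketch: ref_date is a hypothetical fixed date used to make the result reproducible (swap in Sys.Date() for "today"), and 0:9 is used so that the sequence starts on the reference date itself.

```r
# Hypothetical fixed reference date; replace with Sys.Date() for the current day
ref_date <- as.Date("2019-03-20")

# Ten consecutive days counting backwards, formatted as dd/mm/yyyy
dates <- format(ref_date - 0:9, "%d/%m/%Y")

# Wrap each date in single quotes and join with commas
quoted <- paste0("'", dates, "'", collapse = ",")
quoted
```

Because Date objects support plain arithmetic, there is no need for seq(), cut() or gsub() here.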

Related

How to format a date by country in R?

I need a simple way to format dates according to different country formats. Ideally, I would set this up once and use it everywhere in the code.
Let's say for EN and FR formats it should be: YYYY-MM-DD (England) and DD-MM-YYYY (France)
library(lubridate) # provides today()
# This works but requires extra effort: the wrapper has to be called every time
format_date <- function(date_obs, country_code) {
  if (country_code == "en") result <- format(date_obs, format = "%Y-%m-%d")
  if (country_code == "fr") result <- format(date_obs, format = "%d-%m-%Y")
  result
}
format_date(today(), "en")
format_date(today(), "fr")
# I need this kind of solution
Sys.setlocale(date_format = '%d-%m-%Y')
print(today()) # <<- should be in French format
Thanks!
AFAIK there is no explicit way to get the preferred date format of a country in R. The only thing you can do is retrieve it yourself.
Using data from here, you can convert the date format into R's strptime format, and then use it to format your dates:
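library(dplyr)     # mutate(), select(), %>%
library(stringr)   # str_replace_all()
library(lubridate) # today()
date_format <- read.csv("https://gist.githubusercontent.com/mlconnor/1887156/raw/014a026f00d0a0a1f5646a91780d26a90781a169/country_date_formats.csv")
date_format <-
  date_format %>%
  # Translate the date patterns into strptime codes; replacements are applied in order
  mutate(Date.Format = str_replace_all(Date.Format, c("yyyy" = "%Y",
                                                      "MM" = "%m",
                                                      "(?<!M|%)M(?!M)" = "%-m",
                                                      "dd" = "%d",
                                                      # single "d": day of month without a leading zero
                                                      "(?<!d|%)d(?!d)" = "%e"))) %>%
  select(country = ISO639.2.Country.Code, date_format = Date.Format)
format_to_locale <- function(date, locale) format(date, date_format[date_format$country == locale, "date_format"])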
format_to_locale(today(), "FR")
#[1] "07/02/2023"
format_to_locale(today(), "US")
#[1] "02/ 7/2023"
This probably has some limitations, but it is a starting point.
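For the "one setup, use it everywhere" part of the question, one possible approach is to store the chosen format in an R option and read it from a small formatter. The sketch below is an assumption-laden illustration, not an established convention: the names date_formats, format_local and the option myapp.date_format are all hypothetical, and the lookup table is hard-coded instead of being read from the CSV above.

```r
# Hypothetical lookup table: country code -> strptime format
date_formats <- c(en = "%Y-%m-%d", fr = "%d-%m-%Y")

# Set the format once, e.g. at the top of a script
options(myapp.date_format = date_formats[["fr"]])

# Formatter that reads the option and falls back to ISO 8601 if it is unset
format_local <- function(date, fmt = getOption("myapp.date_format", "%Y-%m-%d")) {
  format(date, fmt)
}

d <- as.Date("2023-02-07")
format_local(d)                        # "07-02-2023"
format_local(d, date_formats[["en"]])  # "2023-02-07"
```

This does not change how print() displays dates; it only avoids passing the country code on every call.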

Changing date format when plotting xts object in R

Sadly, this answer here does not seem to work for me.
From what I saw in the documentation, the major.format parameter has been removed in the latest version, 0.10-1, whereas previous versions such as 0.9-7 still had it, which would have easily solved my problem.
It seems like too major a feature to be deprecated. Is there a new way to do this? It seems like something simple and easy, but I've been digging into this issue for hours without success.
In case the issue lies in my code, here is a snippet of what I'm using.
merra2 = read.table("C:/merra2.csv", header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
merra2$utc = as.POSIXct(merra2$utc, format = "%Y-%m-%d %H:%M:%S", tz="UTC")
merra2$m2_power = as.xts(x=merra2[,"m2_power"],order.by=merra2[,"utc"])
merra2$doy = as.xts(x=merra2[,"doy"],order.by=merra2[,"utc"])
plot.xts(merra2$m2_power, col="blue", lwd = 2, major.ticks="weeks", subset="2012-04-01/2014-04-01")
plot.xts(merra2$m2_power, col="blue", lwd = 2, major.ticks="months", subset="2012-04-01/2014-04-01")
And the input file contains something like:
utc,m2_power,doy
"1980-01-01 00:00:00",643.000,181.5000
"1980-01-01 01:00:00",643.000,181.4583
"1980-01-01 02:00:00",354.000,181.4167
If I add the major.format parameter, nothing changes, the axis stays the same.
Here is a reproducible example:
library(xts)
# Generate a sequence of dates
StartDate <- "2017-07-01"
EndDate <- "2018-07-05"
dates <- seq(as.POSIXct(StartDate, format = "%Y-%m-%d", tz = "UTC"),
             as.POSIXct(EndDate, format = "%Y-%m-%d", tz = "UTC"),
             by = "mins")
# Generate a sequence of x
x <- seq(1, length(dates))
# Create a data frame, renaming columns
df <- as.data.frame(cbind(format(dates, format = "%Y-%m-%d", tz = "UTC"), x))
colnames(df) <- c("Dates", "x")
# Redefine format
df$Dates <- as.POSIXct(df$Dates, format = "%Y-%m-%d", tz = "UTC")
df$x2 <- as.xts(x = as.numeric(df$x), order.by = df$Dates)
# Plot results
plot.xts(df$x2,
         col = "blue",
         lwd = 2,
         major.ticks = "weeks",
         major.format = TRUE,
         subset = "2017-08-01/2017-08-30")
If you change "major.ticks", the axis changes... Have you taken a look at the "utc" variable? What is the complete time interval?

Why does this code plot by day of the week?

Link to the data set which is a date and time column along with electricity usage columns
https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip
power1 <- read.csv(file = "c:/datasets/household_power_consumption.txt", stringsAsFactors=F, header = TRUE,
sep=";", dec = ".", na.strings="?", col.names = c("date1","time1","Global_active_power", "Global_reactive_power",
"Voltage","Global_intensity","Sub_metering_1","Sub_metering_2",
"Sub_metering_3"))
power1$date1 <- as.Date(power1$date1, format="%d/%m/%Y")
power2 <- subset(power1, subset=(date1 >= "2007-02-01" & date1 <= "2007-02-02"))
datetime1 <- paste(as.Date(power2$date1), power2$time1)
power2$Datetime <- as.POSIXct(datetime1)
plot(power2$Global_active_power~power2$Datetime, type="l", ylab="Global Active Power (kilowatts)", xlab="")
When I run the above, I get the graph I'm supposed to get, with the days of the week on the x-axis, even though when I run summary(), head() and str() I don't see anything in the data about days of the week.
I tried to add my own day column with mutate but it didn't work.
It also didn't work when I subset the data as follows. The subset was correct and contained only the data I needed, but it wouldn't plot with the date1 column or with the day-of-the-week column I created via mutate:
power2 <- subset(power1, subset=(as.Date(date1, format = "%d/%m/%Y") >= "2007-02-01"
& as.Date(date1, format = "%d/%m/%Y") <= "2007-02-02"))
I know that as.POSIXct will have all the metadata in there, but I don't understand why it plots by day of the week, without my asking, only once I combine the date and time columns into their own column.
When I run it like this, the combined date and time column is corrupted with the wrong year:
power11 <- read.csv(file = "c:/datasets/household_power_consumption.txt", stringsAsFactors=F, header = TRUE,
sep=";", dec = ".", col.names = c("date1","time1","Global_active_power", "Global_reactive_power",
"Voltage","Global_intensity","Sub_metering_1","Sub_metering_2",
"Sub_metering_3"))
#colClasses = c("Date", "character", "factor", "numeric","numeric","numeric","numeric","numeric","numeric"))
power22 <- subset(power11, subset=(as.Date(date1, format = "%d/%m/%Y") >= "2007-02-01"
& as.Date(date1, format = "%d/%m/%Y") <= "2007-02-02"))
datetime1 <- paste(as.Date(power22$date1), power22$time1)
power22$Datetime <- as.POSIXct(datetime1)
Maybe this link would be helpful:
http://earlh.com/blog/2009/07/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/
Add an argument to your plot() call: xaxt='n'
plot(power2$Global_active_power~power2$Datetime, type="l", ylab="Global Active Power (kilowatts)", xlab="", xaxt='n')
That tells plot() not to draw the x-axis labels. Then add an axis() call:
axis(side=1, at=power2$Datetime, labels=format(power2$Datetime, '%b-%y'))
I used '%b-%y' here, because that's what I saw on the site I referenced, but you would want to use the format code appropriate to your needs.
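As for why weekdays appear at all: when the x variable is a POSIXct, base graphics picks the tick label format from the time span being plotted, and for a span of a few days that can be abbreviated weekday names. The sketch below shows the xaxt = "n" plus axis() pattern on synthetic data. The series and the names datetime, power, ticks and tick_labels are hypothetical stand-ins for the household power data, and one tick per day is just one reasonable choice.

```r
# Synthetic two-day, minute-level series standing in for the power data
datetime <- seq(as.POSIXct("2007-02-01 00:00:00", tz = "UTC"),
                as.POSIXct("2007-02-02 23:59:00", tz = "UTC"),
                by = "min")
power <- sin(seq_along(datetime) / 200) + 2

# One tick per day, labelled with an explicit date format instead of weekdays
ticks <- seq(min(datetime), by = "day", length.out = 3)
tick_labels <- format(ticks, "%d/%m/%Y")

# Suppress the automatic x-axis, then draw a custom one
plot(datetime, power, type = "l", xaxt = "n", xlab = "",
     ylab = "Global Active Power (kilowatts)")
axis(side = 1, at = as.numeric(ticks), labels = tick_labels)
```

Placing a handful of ticks, not one per observation, keeps the axis readable when the data has thousands of rows.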

Format date in Datatable output

library(DT)
seq_dates <- data.frame(dates = as.Date("2017-01-01") + 1:6 * 100)
datatable(seq_dates) %>% formatDate(1, "toDateString")
I get a datatable in viewer pane displaying dates in following format "Mon May 22 2017".
Q - How can I format the date column as "MM-YY"?
If I do,
dplyr::mutate(seq_dates, dates = format(dates, format = "%b-%Y")) %>%
datatable()
I get the required date format, but in this second case column sorting doesn't work (the sorting is alphabetical rather than by date).
P.S - I'm implementing this on shiny.
Hi, in these cases I think the best solution is to add a dummy column with the dates in their original format and have the dates column sorted according to the values in the DUMMY column. This is quite easily done in DataTables. Example code below.
library(DT)
library(dplyr)
seq_dates <- data.frame(dates = as.Date("2017-01-01") + 1:6 * 100)
datatable(seq_dates %>% mutate(DUMMY = dates, dates = format(dates, format = "%b-%Y")),
          options = list(
            columnDefs = list(
              list(targets = 1, orderData = 2),   # sort column 1 by the values in column 2
              list(targets = 2, visible = FALSE)  # hide the DUMMY column
            )
          ))
For what it's worth (and using formatDate), the best that I can do is as follows:
datatable(seq_dates) %>%
formatDate(
columns = 1,
method = "toLocaleDateString",
params = list(
'en-US',
list(
year = 'numeric',
month = 'numeric')
)
)
And this yields date values like 4/2017 and 10/2017.
I've tried to find these parameter options (on GitHub and in the original DataTables documentation) but to no avail. The only example in DT uses the parameters short, long and numeric.
Converting "%b-%y" strings back to dates is not easy, as far as I can see...
If you're not too attached to displaying the "%b-%y" format, the easy way is to use the "%Y-%m" or "%y-%m" format, and the filter will work just fine:
library(DT)
seq_dates <- as.data.frame(seq(Sys.Date() - 100, Sys.Date(), by = "m"))
seq_dates <- format(seq_dates, format = "%y-%m")
datatable(seq_dates)
#resulting in
#1 2017-02
#2 2017-03
#3 2017-04
#4 2017-05
#or
#1 17-02
#2 17-03
#3 17-04
#4 17-05
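The reason the year-first format behaves well can be checked in base R without DT: once dates are formatted they are plain strings, and only a year-then-month layout makes alphabetical order coincide with chronological order. This is a minimal sketch reusing the dates from the question; the variable names are hypothetical.

```r
dates <- as.Date("2017-01-01") + 1:6 * 100

# Month-name-first strings: sorted as text, so not in chronological order
month_year <- format(dates, "%b-%Y")
sort(month_year)

# Year-first strings: text order and date order agree
year_month <- format(dates, "%Y-%m")
identical(sort(year_month), format(sort(dates), "%Y-%m"))
```

The month abbreviations produced by "%b" depend on the session locale, so the exact text order of month_year can differ between machines.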
There is a render method that you can use:
datatable( ...
options = list(..
columnDefs = list(..
list(targets = c(1), render = JS(
"function(data, type, row, meta) {",
"return type === 'display' ? new Date(data).toLocaleString() : data;",
"}")))))

Error in `cut` function

I'm trying to group the dates on which houses in San Francisco were sold by year. I'm using the following code:
geo_big$month <- as.Date(paste0(strftime(geo_big$date, format = "%Y-%m"), "-01"))
geo_big$date_r <- cut(geo_big$month, breaks = as.Date(c("2003-04-01", "2004-01-01", "2005-01-01", "2006-01-01", "2007-01-01", "2008-11-01")), include.lowest = TRUE, labels = as.Date(c("2003-01 - 2004-12", "2004-01 - 2004-12", "2005-01 - 2005-12", "2006-01 - 2006-12", "2007-01 - 2007-12", "2008-01 - 2008-11")))
And getting this message:
Error in charToDate(x) :
character string is not in a standard unambiguous format
Anyone know what's going on?
The error given should indicate to you that the issue is not cut but as.Date. (It's complaining about not being able to determine the format of the date.)
More specifically, the problem is what you have given as labels. There is no need to wrap those in as.Date.
The labels should be character, and c(.) with the quotation marks is sufficient for that.
As an aside, the code above can be cleaned up in a few areas.
Also, the lubridate package might be very useful to you.
# instead of:
geo_big$month <- as.Date(paste0(strftime(geo_big$date, format = "%Y-%m"), "-01"))
# you can use `floor_date`:
library(lubridate)
geo_big$month <- floor_date(geo_big$date, "month") # from the `lubridate` pkg
# instead of a giant cut statement,
# use variables for ease of reading and debugging
# (assumption: dmin and dmax are the first and last months in the data)
dmin <- min(geo_big$month)
dmax <- max(geo_big$month)
# bks <- as.Date(c("2003-04-01", "2004-01-01", "2005-01-01", "2006-01-01", "2007-01-01", "2008-11-01"))
# or:
bks <- c(dmin, seq.Date(ceiling_date(dmin, "year"), floor_date(dmax, "year"), by="year"), dmax) # still using library(lubridate)
# basing your labels on your breaks helps guard against human error & typos
lbls <- head(floor_date(bks, "year"), -1) # start of each interval's year; the last break is dropped
lbls <- paste(substr(lbls, 1, 7), substr(c(lbls[-1] - 1, dmax), 1, 7), sep=" - ") # dmax closes the last interval
# a cleaner, more readable `cut` statement
cut(geo_big$month, breaks=bks, include.lowest=TRUE, labels=lbls)
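To make the fix for the original error concrete without lubridate, here is a minimal base R sketch with plain character labels. The month vector is hypothetical data standing in for geo_big$month, and the breaks are an assumption chosen to give one interval per calendar year. Note that cut() expects one label per interval, that is, one fewer than the number of breaks.

```r
# Hypothetical sale months standing in for geo_big$month
month <- as.Date(c("2003-04-01", "2004-06-01", "2005-02-01", "2008-11-01"))

# Seven breaks define six intervals
bks <- as.Date(c("2003-04-01", "2004-01-01", "2005-01-01", "2006-01-01",
                 "2007-01-01", "2008-01-01", "2008-11-01"))

# Plain character labels, one per interval; no as.Date() around them
lbls <- c("2003-04 - 2003-12", "2004-01 - 2004-12", "2005-01 - 2005-12",
          "2006-01 - 2006-12", "2007-01 - 2007-12", "2008-01 - 2008-11")

date_r <- cut(month, breaks = bks, labels = lbls, include.lowest = TRUE)
as.character(date_r)
```

For Date input the intervals are closed on the left by default, and include.lowest = TRUE lets the final break (2008-11-01) fall into the last interval; without it that value would come out as NA.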
