I'm posting this because I've been having a small problem with my code. I want to forecast COVID cases in a province for the next 30 days using auto.arima() from the forecast package. Everything works, but when I plot the forecast, the x-axis labels appear as fractional years (e.g. 2020.2, 2020.4), whereas I want that axis labelled in YMD format. This is my code:
library(readxl)
library(ggplot2)
library(forecast)
data <- read_xlsx("C:/Users/XXXX/Documents/Casos ARIMA Ejemplo.xlsx")
provincia_1 <- ts(data$Provincia_1, frequency = 365, start = c(2020,64))
autoarima_provincia1 <- auto.arima(provincia_1)
forecast_provincia1 <- forecast(autoarima_provincia1, h = 30)
plot(forecast_provincia1, main = "Proyeccion Provincia 1", xlab = "Meses", ylab = "Casos Diarios")
When I plot the forecast, this is what appears (with the date-label problem described above):
The database is here:
https://github.com/pgonzalezp/Casos-Covid-provincias
Try creating a data.frame with your predictions in one column and the corresponding daily dates in the other, then plot it.
Build the date sequence from the first forecast date; for the "by" argument, please check the documentation at this link:
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/as.Date
# The series starts on day 64 of 2020 (2020-03-04), so the forecast
# begins one day after the last observation
first_forecast_date <- as.Date("2020-03-04") + length(provincia_1)
df <- data.frame(
date = seq(first_forecast_date, by = "day", length.out = 30),
pred_val = as.numeric(forecast_provincia1$mean)
)
with(df, plot(date, pred_val, type = "l"))
I got inspired from here:
R X-axis Date Labels using plot()
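Another option is to keep the ts plot and only replace its x-axis. The sketch below uses base R and a made-up series (the `cases` values are hypothetical, not the data from the question). It converts the decimal-year index of a daily series that starts at `c(2020, 64)` into calendar dates and uses them as tick labels. The same `axis()` call should also work after `plot(forecast_provincia1, xaxt = "n")`, provided the positions are taken from the forecast's time index.

```r
# Hypothetical daily series standing in for the real case counts
set.seed(1)
cases <- ts(cumsum(rpois(120, 5)), frequency = 365, start = c(2020, 64))

# Convert the decimal-year time index into calendar dates:
# (t - 2020) * 365 is the number of days elapsed since 2020-01-01
t_idx <- as.numeric(time(cases))
dates <- as.Date("2020-01-01") + round((t_idx - 2020) * 365)

# Draw the plot without its default x-axis, then add date labels
tick_pos <- seq(1, length(dates), by = 30)
plot(cases, xaxt = "n", xlab = "Date", ylab = "Cases")
axis(1, at = t_idx[tick_pos], labels = format(dates[tick_pos], "%Y-%m-%d"))
```

The rounding step guards against floating-point error in the time index; the mapping assumes one observation per consecutive day.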
I am a beginner at this and am really lost about it.
I would like to create a horizon chart that shows the percentage change in sales for the different towns using ggplot2 and R. Would anyone guide me in the approach I can take to create the chart?
The data that I have looks like this.
This is the type of chart I would like to do.
(source: https://harmoniccode.blogspot.com/2017/11/friday-fun-li-horizon-charts.html)
Thanks in advance for any help given!
Edit: here's a sample code of the data:
library(dplyr)      # %>% and mutate()
library(lubridate)  # as_date() and floor_date()
library(zoo)        # as.yearmon()
x <- data.frame(
"town" =c('sad','sad','sad','sad','happy','happy','happy','happy'),
"month"=c("2017-01","2017-02","2017-03","2017-04","2017-01","2017-02","2017-03","2017-04"),
"median_sales" = c(336500,355000,375000,395000,359000,361500,360000,375000),
"percentage_change" = c(NA,5.4977712,5.6338028,5.3333333,NA,0.6963788,-0.4149378, 4.1666667
))
x <-
x %>%
mutate(month = floor_date(as_date(as.yearmon(month)), "month"))
It would be helpful to give an example that will result in a reasonable plot, and to provide your example data as data rather than an image.
If you google 'horizon plot' the first answer should give you what you need.
Here is a simple example based on the data you gave:
library(latticeExtra)
sales.ts <- ts(matrix(sales$median_sales, ncol=2), names = c("sad", "happy"),
start = c(2017, 1), frequency = 365)
horizonplot(sales.ts)
I think this is correctly presenting your results, but again hard to tell as you haven't given a realistic dataset.
UPDATE: based on the data provided, this is the answer. Again, as you've only provided one time point a horizonplot is probably not what you want. They are designed to plot time series.
x.ts <- ts(matrix(x$median_sales, ncol=2), names = c("sad", "happy"),
start = c(2017, 1), frequency = 12)
horizonplot(x.ts)
I am forecasting with a combination of data sets from the fpp2 package and a forecasting function from the forecast package. The output of this forecasting is a list object, SNAIVE_MODELS_ALL. This object contains separate forecasts for two series: the first is Electricity and the second is Cement.
You can see the code below:
# CODE
library(fpp2)
library(dplyr)
library(forecast)
library(gridExtra)
library(ggplot2)
#INPUT DATA
mydata_qauselec <- qauselec
mydata_qcement <- window(qcement, start = 1956, end = c(2010, 2))
# Merging data
mydata <- cbind(mydata_qauselec, mydata_qcement)
colnames(mydata) <- c("Electricity", "Cement")
# Test Extract Name
mydata1 <- data.frame(mydata)
COL_NAMES <- names(mydata1)
rm(mydata_qauselec, mydata_qcement)
# FORECASTING HORIZON
forecast_horizon <- 12
# FORECASTING
BuildForecast <- function(Z, hrz = forecast_horizon) {
timeseries <- msts(Z, start = 1956, seasonal.periods = 4)
# Return the seasonal naive forecast for this series
snaive(timeseries, biasadj = TRUE, h = hrz)
}
frc_list <- lapply(X = mydata1, BuildForecast)
# FINAL FORECASTING
SNAIVE_MODELS_ALL<-lapply(frc_list, forecast)
My intention is to pass the SNAIVE_MODELS_ALL object to the autoplot function in order to get two plots like the picture below.
With the code below I draw both plots separately, but what I really want is to do this with autoplot and a function like apply (or something similar) that draws the two charts automatically, like the picture above. This is only a small example; in the real case I may have 5 or 10 charts.
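# Extract the individual forecasts from the list
SNAIVE_Electricity <- SNAIVE_MODELS_ALL$Electricity
SNAIVE_Cement <- SNAIVE_MODELS_ALL$Cement
#PLOT 1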
P_PLOT1<-autoplot(SNAIVE_Electricity,main = "Snaive Electricity forecast",xlab = "Year", ylab = "in billion kWh")+
autolayer(SNAIVE_Electricity,series="Data")+
autolayer(SNAIVE_Electricity$fitted,series="Forecasts")
# PLOT 2
P_PLOT2<-autoplot(SNAIVE_Cement,main = "Snaive Cement forecast",xlab = "Year", ylab = "in millions of tonnes")+
autolayer(SNAIVE_Cement,series="Data")+
autolayer(SNAIVE_Cement$fitted,series="Forecasts")
#UNION PLOTS (PLOT 1 AND PLOT 2)
SNAIVE_PLOT_ALL<-grid.arrange(P_PLOT1,P_PLOT2)
Can anybody help me with this code?
If I understand correctly, one of the difficulties is that each plot needs its own title and y label. One possible solution is to pass the plot title and y label as function arguments:
PlotForecast <- function(df_pl, main_pl, ylab_plt){
autoplot(df_pl,
main = main_pl,
xlab = "Year", ylab = ylab_plt)+
autolayer(df_pl,series="Data")+
autolayer(df_pl$fitted,series="Forecasts")
}
Prepare lists of the plot labels to be used with PlotForecast():
main_lst <- list("Snaive Electricity forecast", "Snaive Cement forecast")
ylab_lst <- list("in billion kWh", "in millions of tonnes")
Construct a list of plot-objects using a base Map() function:
PL_list <- Map(PlotForecast, df_pl = SNAIVE_MODELS_ALL, main_pl = main_lst,
ylab_plt= ylab_lst)
Then all we have to do is to call grid.arrange() with the plot list:
do.call(grid.arrange, PL_list)
Please note that main_lst and ylab_lst are created manually for demonstration purposes, which is not the best approach if you work with a lot of charts. Ideally, the labels should be generated automatically from the original SNAIVE_MODELS_ALL list.
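As a rough sketch of generating the labels automatically, the titles can be built from the series names, and the y labels can be looked up in a named vector. Here `series_names` is a hypothetical stand-in for `names(SNAIVE_MODELS_ALL)`, and `units_lookup` is an assumed helper that you would have to maintain yourself, since the units cannot be derived from the forecasts.

```r
# Stand-in for names(SNAIVE_MODELS_ALL); hypothetical, for illustration only
series_names <- c("Electricity", "Cement")

# Titles are derived directly from the series names
main_lst <- as.list(paste("Snaive", series_names, "forecast"))

# Units are not stored in the forecasts, so keep them in a named lookup
units_lookup <- c(Electricity = "in billion kWh",
                  Cement = "in millions of tonnes")
ylab_lst <- as.list(unname(units_lookup[series_names]))
```

Both lists can then be passed to Map() exactly as before, and adding a series only requires a new entry in the lookup.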
I'm trying to plot some time series data. My plot looks like the following:
I'm uncertain as to why it displays the date this way. I'm using R Markdown in RStudio. Below is my code:
agemployment<-read.csv("Employment-Level1.csv", header=TRUE)
Tried to change the class of Date:
as.Date(as.character(agemployment$Date),format="%m%d%Y")
That did nothing. Rest of code here:
attach(agemployment)
View(agemployment)
head(agemployment)
agemployment<-ts(agemployment,frequency=12,start=c(2008, 1))
plot(agemployment, col="black", main="Agriculture Employment Level",
ylab="Total Employment Level (Thousands)", ylim=c(0, 250),lwd=2,
xaxs="i", yaxs="i", lty=1)
This produces the above plot. I'm uncertain what I'm doing wrong. I would appreciate any help. Thank you!
EDIT:
Data here:
I suspect your issues are somehow driven by attach(); attaching data frames is generally not good practice. The following very simple code worked for me:
# small dataset from your example, I use package readr to load it as data frame
df = readr::read_csv("DATE,Employment
1/1/2008,1245
2/1/2008,1280
3/1/2008,1343
4/1/2008,1251
5/1/2008,1236
6/1/2008,1265")
ts <- ts(data = df$Employment, frequency = 12, start = c(2008, 1))
plot(ts)
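As a side note on the as.Date() attempt in the question, two things probably explain why it "did nothing": the result was never assigned back to the column, and the format string has no slashes while the dates (as shown in the sample above) look like 1/1/2008. A minimal sketch, using a few hypothetical values in that layout:

```r
# Dates in the same month/day/year layout as the sample data
dates_raw <- c("1/1/2008", "2/1/2008", "3/1/2008")

# A format without separators does not match the strings, so parsing fails
parsed_wrong <- as.Date(dates_raw, format = "%m%d%Y")

# Including the slashes in the format parses the dates correctly
parsed <- as.Date(dates_raw, format = "%m/%d/%Y")
```

In the original script the corrected call would also need to be assigned, for example back to the Date column of the data frame, before it has any effect.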
Using the file generated reproducibly in the Note at the end, read the file into a zoo object, making the index of class "yearmon" (representing year and month without day). Then plot it.
library(zoo)
z <- read.csv.zoo("Employment-Level1.csv", format = "%m/%d/%Y", FUN = as.yearmon)
plot(z)
or
library(ggplot2)
autoplot(z) + scale_x_yearmon()
(continued after plots)
If you wanted to convert z to a ts object or data frame:
tt <- as.ts(z)
DF <- fortify.zoo(z)
Note
Lines <- "DATE,Employment
1/1/2008,1245
2/1/2008,1280
3/1/2008,1343
4/1/2008,1251
5/1/2008,1236
6/1/2008,1265"
cat(Lines, file = "Employment-Level1.csv") # write out file
Realize that by providing an image in the question it means that everyone who answers must retype your data so in the future please provide the input data to questions in a reproducible form as we have done here.
I am trying to build a forecast plot in R, but in spite of trying many solutions I am unable to show dates on my x-axis.
My data is in the form of :
Datetime(MM/DD/YYYY) ConsumedSpace
01-01-2015 2488
02-01-2015 7484
03-01-2015 4747
Below is the forecast script I am using:
library(forecast)
library(calibrate)
# group searches by date
dataset <- aggregate(ConsumedSpace ~ Date, data = dataset, FUN= sum)
# create a time series based on day of week
ts <- ts(dataset$ConsumedSpace, frequency=6)
# pull out the seasonal, trend, and irregular components from the time series (train the forecast model)
decom <- stl(ts, s.window = "periodic")
#predict the next 7 days of searches
Pred <- forecast(decom)
# plot the forecast model
plot(Pred)
#text(Pred,ts ,labels = dataset$ConsumedSpace)
The output looks like this. As you can see, the x-axis is displayed in periods (numbers) rather than in date format.
Any help is highly appreciated.
Try passing the dates explicitly in your plot call: plot(x=Date, ...)
If that does not work, try:
timeline<-seq(from=your.first.date, to=your.last.date, by="week")
plot(x=...,y=..., xlab=NA, xaxt="n") # no x axis
axis.Date(1, at=(timeline), format=F, labels=TRUE) # Special axis
Edit :
Sorry, my first solution does not fit your time series. The problem is that a ts object contains no dates, only an index derived from "start" and "frequency". Here, the issue comes from your use of "frequency", which is supposed to specify the number of observations per unit of time, i.e. 4 for quarterly data, 12 for monthly data, and so on. Your unit of time is the week, with 6 open days, which is why the axis of your graph shows the week index. To get a more readable axis you can try this:
dmin<-as.Date("2015-01-01") # Starting date
# Dummy data
ConsumedSpace=rep(c(5488, 7484, 4747, 4900, 4747, 6548, 6548, 7400, 6300, 8484, 5161, 6161),2)
ts<-ts(ConsumedSpace, frequency=6)
decom <- stl(ts, s.window = "periodic")
Pred <- forecast(decom)
plot(Pred, xlab=NA, xaxt="n") # Plot with no axis
ticks<-seq(from=dmin, to= dmin+(length(time(Pred))-1)*7, by = 7) # Ticks sequency : ie weeks label
axis(1, at=time(Pred), labels=ticks) # axis with weeks label at weeks index
You have to use a 7-day interval for the week labels because of the closed day.
It's ugly but it works. There is surely a better way: look closely at your ts() call to specify that the data are daily, and adapt your forecasting function accordingly.
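A slightly tidier variant of the same idea, sketched in base R only, is to put one tick at each whole week index and label it with the date on which that week starts. The start date `dmin` and the data values are the dummy ones from above, and the sketch assumes that week 1 begins on `dmin`.

```r
dmin <- as.Date("2015-01-01")  # assumed date of the first observation

# Dummy data: 24 observations, 6 open days per week
ConsumedSpace <- rep(c(5488, 7484, 4747, 4900, 4747, 6548,
                       6548, 7400, 6300, 8484, 5161, 6161), 2)
series <- ts(ConsumedSpace, frequency = 6)

# One tick per whole week index covered by the series
week_at <- seq(floor(tsp(series)[1]), floor(tsp(series)[2]))

# Week k starts 7 calendar days after week k - 1
week_labels <- dmin + (week_at - 1) * 7

plot(series, xaxt = "n", xlab = "Week starting", ylab = "ConsumedSpace")
axis(1, at = week_at, labels = format(week_labels, "%Y-%m-%d"))
```

Because the ticks sit on integer week positions, each label lines up with the first open day of its week instead of being spread over every observation.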
I've been trying to write out an R script that will plot the date-temp series for a set of locations that are identified by a Deployment_ID.
Ideally, each page of the output pdf would have the name of the Deployment_ID (check), a graph with proper axes (check) and correct scaling of the x-axis to best show the date-temp series for that specific Deployment_ID (not check).
At the moment, the script makes a pdf that shows each ID over the full range of the dates in the date column (i.e. 1988-2010), instead of just the relevant dates (i.e. just 2005), which squishes the scatterplot down into uselessness.
I'm pretty sure it's something to do with how you define xlim, but I can't figure out how to have R access the date min and the date max for each factor as it draws the plots.
Script I have so far:
#Get CSV to read data from, change the file path and name
data <- read.csv(file.path("C:/Users/Person/Desktop", "SampleData.csv"))
#Make Date real date - must be in yyyy/mm/dd format from the csv to do so
data$Date <- as.Date(data$Date)
#Load lattice; run install.packages("lattice") first if you don't have it
library(lattice)
#Make the plots with lattice, this takes a while.
dataplot <- xyplot(data$Temp~data$Date|factor(data$Deployment_ID),
data=data,
stack = TRUE,
auto.key = list(space = "right"),
layout = c(1,1),
ylim = c(-10,40)
)
#make the pdf
pdf("Dataplots_SampleData.pdf", onefile = TRUE)
#print to the pdf? Not really sure how this works. Takes a while.
print(dataplot)
dev.off()
Use the scales argument. Give this a try:
dataplot <- xyplot(data$Temp~data$Date|factor(data$Deployment_ID),
data=data,
stack = TRUE,
auto.key = list(space = "right"),
layout = c(1,1),
scales= list( relation ="free")
)
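Note that relation = "free" applied to the whole scales list frees both axes. If you want to keep the fixed y range from the question while letting each panel choose its own date range, a possible variant is to free only the x scale. The sketch below uses a small made-up data frame (`d` and its values are hypothetical) with two deployments recorded in different years.

```r
library(lattice)

# Hypothetical sample: two deployments recorded in different years
d <- data.frame(
  Deployment_ID = rep(c("A", "B"), each = 10),
  Date = c(seq(as.Date("1988-06-01"), by = "day", length.out = 10),
           seq(as.Date("2005-06-01"), by = "day", length.out = 10)),
  Temp = c(seq(5, 14), seq(15, 24))
)

# Free only the x scale so each page spans its own dates,
# while ylim keeps the temperature axis identical on every page
p <- xyplot(Temp ~ Date | factor(Deployment_ID), data = d,
            layout = c(1, 1),
            ylim = c(-10, 40),
            scales = list(x = list(relation = "free")))
print(p)
```

Using the formula `Temp ~ Date` together with `data = d` also avoids repeating `data$` inside the call.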