Forecast plot with x axis labels as date - r

I have a dataset with revenue and date columns.
I fitted an ARIMA model to the revenue and plotted the forecast.
ts_data = ts(dataset$Revenue,frequency = 7)
arima.ts = auto.arima(ts_data)
pred = forecast(arima.ts,h=30)
plot(pred,xaxt="n")
The plot is drawn, but not the way I need it. My expectations are:
I need to display the predicted values in millions, like 13.1M.
I need to show dates on the x-axis instead of data point numbers.
I tried several links but couldn't crack it. These are the experiments I made:
I tried setting a start date and end date in ts_data, which doesn't work. My start date is "2019-09-27" and my end date is "2020-07-02".
I tried axis_date in the plot function, which doesn't work either.
Please help me crack this.
Thanks a lot.

You can specify axis tick values and labels with axis():
plot(pred, xaxt = "n", yaxt = "n") # suppress the default x and y axes
yl <- seq(-5000000, 15000000, by = 5000000) # positions of the y-axis ticks
axis(2, at = yl, labels = paste(yl/1000000, "M")) # 2 = left axis
You can set the positions of the y-axis ticks with at and the text shown at each tick with labels. To obtain values like 10 M, I used paste() to join the numbers with the letter M.
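If you want the exact form from the question (13.1M, one decimal, no space), sprintf() is one option. This is a minimal sketch: fmt_millions is a hypothetical helper name, and the empty stand-in plot only exists so the snippet runs without the original data.

```r
# Format raw values as millions with one decimal, e.g. 13100000 -> "13.1M".
# fmt_millions is a hypothetical helper, not part of base R or forecast.
fmt_millions <- function(x) sprintf("%.1fM", x / 1e6)

yl <- seq(-5000000, 15000000, by = 5000000)  # tick positions from the answer
yl_labels <- fmt_millions(yl)

# Empty stand-in plot; with real data this would be plot(pred, yaxt = "n").
plot(c(1, 10), range(yl), type = "n", yaxt = "n", xlab = "", ylab = "Revenue")
axis(2, at = yl, labels = yl_labels, las = 1)  # las = 1 keeps labels horizontal
```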
You can use the same method for the x-axis, even though more efficient methods probably exist. Not having your data available, I have used a generic spacing and generic labels only to give you the idea. Once you have set the desired positions, you can generate the sequence of dates associated with them (to generate a sequence of dates see https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/seq.Date):
axis(1, at = seq(1, 40, by = 1), labels = seq(as.Date("2019-09-27"), as.Date("2020-07-02"), by = "week")) # 1 = bottom axis
You can also change the format of the displayed dates with format(), for example labels = format(vector_of_date, "%Y-%b-%d") to get year-month (abbreviated name)-day.
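To make the mapping between tick positions and dates concrete, here is a rough sketch. It assumes the data are daily from 2019-09-27 to 2020-07-02, stored with ts(..., frequency = 7) and no start argument, and followed by a 30-day forecast as in the question. The simulated series and the 4-week tick spacing are placeholders.

```r
# Assumed from the question: daily data, frequency = 7, 30-day forecast.
start_date <- as.Date("2019-09-27")
end_date   <- as.Date("2020-07-02")
n_obs <- as.integer(end_date - start_date) + 1  # number of daily observations
h <- 30                                         # forecast horizon in days

# Calendar date of every plotted point (history plus forecast).
all_dates <- seq(start_date, by = "day", length.out = n_obs + h)

# ts() without a start argument begins at time 1; with frequency = 7
# each additional day advances the time index by 1/7.
tick_idx <- seq(1, n_obs + h, by = 28)          # one tick every 4 weeks
tick_at  <- 1 + (tick_idx - 1) / 7
tick_lab <- format(all_dates[tick_idx], "%Y-%m-%d")

# Simulated stand-in series; with real data this would be plot(pred, xaxt = "n").
set.seed(1)
ts_data <- ts(cumsum(rnorm(n_obs)), frequency = 7)
plot(ts_data, xaxt = "n", xlim = c(1, 1 + (n_obs + h - 1) / 7), xlab = "")
axis(1, at = tick_at, labels = tick_lab, las = 2, cex.axis = 0.7)
```

Because the labels are derived from the same index as the tick positions, changing the spacing in tick_idx keeps positions and dates in sync.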

Related

Plotting a legend using legend(x,y ....) but x axis are dates in R

I'm trying to put a legend on a line graph using legend(x, y, legend = c("", ""), ...). When I convert the dates to numeric data and use that for x, the legend plots, so I know the rest of it is right. But when x is a date, I'm not sure what to pass as x to get the legend to show on the graph.
Thanks

How to plot for repeating values in R

I am trying to plot an array in R with the same y-value for every x-value. If a value is NA, it shouldn't be plotted.
I tried the following plot which shows the histogram for all 10 values.
plot(c(1,2,NA,3,4,5,3,NA,2,4),type='h', ylim=c(0,4))
However, for the case below, when I try to control the y-values, the repeated values are not considered in the plot.
plot(c(1,2,NA,3,4,5,3,NA,2,4), rep(1,10),type='h', ylim=c(0,4))
Is this possible with plot function? Please suggest if the same can be done with an alternative.
Please look again at the help page of ?plot.
In your second line you plot the y value 1 at the x values 1 to 5. The plot you get is exactly the plot you asked for, which is not the plot you cared for. In the first plot, your values are interpreted as the y values, not the x values. The x values in the plot are just the indices in the first example.
If you want to get the lines not plotted at the NA values, just do:
x <- c(1,2,NA,3,4,5,3,NA,2,4)
plot(!is.na(x), type = 'h')
Now you plot a TRUE (which is a value of 1) whenever there is a value, and FALSE (which translates to 0) whenever there is none.
This is exactly the same as:
xx <- ifelse(is.na(x),0,1)
plot(xx, type = 'h')
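As a quick check of that equivalence, the sketch below builds both vectors and compares them. The names present and same_heights are only illustrative.

```r
x <- c(1, 2, NA, 3, 4, 5, 3, NA, 2, 4)

present <- !is.na(x)           # logical: TRUE where a value exists
xx <- ifelse(is.na(x), 0, 1)   # numeric 0/1 version of the same information

# Coercing the logical vector to numeric gives the same bar heights as xx.
same_heights <- all(as.numeric(present) == xx)

# Bars of height 1 at the indices that hold a value, nothing at the NA positions.
plot(as.numeric(present), type = "h", ylim = c(0, 4), xlab = "Index", ylab = "")
```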
On a side note: please do not call this a histogram. A histogram represents counts for bins; this doesn't even come close to that.
plot(!is.na(c(1,2,NA,3,4,5,3,NA,2,4)),type='h', ylim=c(0,4))

Change axis in R with different number of datas

I want to change the x-axis in my plot, but it doesn't work properly with axis(). The data in the plot are daily and I want to show only years. I tried axis(1, at = seq(1800, 1975, by = 25), las = 2), but that did not give the result I wanted. I hope someone understands me and can find a solution.
Without reproducible code it is not easy to see what the problem could be. I'll try a "quick and dirty" approach.
High-level plots are composed of elements that are themselves made of smaller parts. Separate drawing commands are therefore useful, because they allow finer control over the plotting procedure.
In practice, the first thing to do is plot "nothing".
plot(x, y, type = "n", xlab = "", ylab = "", axes = FALSE)
type = "n" causes the data to not be drawn. axes = FALSE suppresses the axes and the box around the plot. In spite of that, the plotting region is ready to show the data.
The main benefit is that now the plotting area is correctly dimensioned. Try now to add the desired x axis as you tried before.
points(x, y) # plots the data in the area
axis(1) # draws the x-axis; pass at and labels to use your own scale
axis(2) # draws the y-axis
title() # adds the desired titles; pass main, xlab, ylab
box() # draws the box surrounding the plot
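Put together, the steps might look like the sketch below. It assumes the x values are stored as Date objects covering 1800 to 1975, which the question does not confirm, and it uses simulated data. lines() stands in for points() only because there are many daily values.

```r
# Simulated daily series; the real data are not available.
set.seed(1)
days <- seq(as.Date("1800-01-01"), as.Date("1975-01-01"), by = "day")
y <- cumsum(rnorm(length(days)))

# 1. Set up the plotting region without drawing data, axes or box.
plot(days, y, type = "n", xlab = "", ylab = "", axes = FALSE)

# 2. Add the pieces one by one.
lines(days, y)                                    # the data
year_ticks <- seq(as.Date("1800-01-01"), as.Date("1975-01-01"), by = "25 years")
axis.Date(1, at = year_ticks, format = "%Y", las = 2)  # x-axis, years only
axis(2)                                           # default y-axis
title(main = "Daily values, 1800 to 1975")
box()
```

With Date values on the x-axis, axis.Date() places the ticks at real calendar positions, so no day-to-year conversion is needed.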
EDITED based on comment by @scoa
As a quick and dirty solution, you can simply enter the following line after your plot() line:
# This reads as: on the x-axis (1), place ticks from day 0 to
# day 63917 in steps of 9131 days (about 25 years), and draw the
# labels perpendicular to the axis (las = 2) for readability.
# EDITED: the labels shown at those tick positions are the years
# 1800 to 1975 in steps of 25.
axis(1, at = seq(0, 63917, by = 9131), las = 2, labels = seq(1800, 1975, by = 25))
For other parameters, check out ?axis. As @scoa mentioned, this is approximate. I have used 365.25 as a day-to-year conversion, which is not exact, but it should suffice for visual accuracy at the scale you have provided. If you need a precise conversion from days to years, you need to operate on your original data set before plotting.
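If the calendar origin of the day counter is known, the tick positions can be computed from real dates instead of a 365.25-day year. The sketch below assumes day 0 corresponds to 1800-01-01, which is a guess based on the labels used above. The empty plot is only a stand-in for the real one.

```r
# Assumption: day 0 in the data is 1800-01-01.
origin <- as.Date("1800-01-01")
years  <- seq(1800, 1975, by = 25)

# Exact number of days from the origin to 1 January of each labelled year.
year_starts <- as.Date(paste0(years, "-01-01"))
tick_at <- as.numeric(year_starts - origin)

# Empty stand-in plot spanning the same day range as the data.
plot(NA, xlim = c(0, 63917), ylim = c(0, 1), xaxt = "n", xlab = "", ylab = "")
axis(1, at = tick_at, labels = years, las = 2)
```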

Plot step-wise decrease in R with Categorical value on X-axis

I want to use a step plot to illustrate a process of elimination. I have a data frame containing the number of candidates remaining after each step; it looks like this:
Step    Candidates Count
1       26587
2       1761
3       849
4       130
The Step column is a categorical variable and I need to represent it with the names of the actual steps; I am using numbers because I have not been able to plot when the Step column contains text.
I was able to produce the following figure with the command
plot(df, type = "s")
The problem is the X axis: I need to either get rid of the decimals and add a legend to name each step or, preferably, figure out some way to put the names of the steps in the Step column and populate the axis automatically.
I also want to show the same graph on a log scale, but when I use:
plot(log(df), type = "s")
R gives me log values for both columns. This wouldn't be a problem if I could figure out how to plot the data with Step as a categorical variable but I just cannot figure out how.
My instinct is that this is a fairly simple problem but I've been struggling for most of this morning.
plot(df, type = "s", xaxt = "n", log = "y")
axis(1, at = 1:4, labels = paste("step", 1:4))
Use:
xaxt = "n" to suppress the x-axis ticks and labels
log = "y" to put the y-axis on a log scale
axis() to add the x-axis back, with the labels argument giving the text shown at the specified points
You may also want to tweak the labels on the y-axis.
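The same idea also works when the Step column holds text. In this sketch the step names are invented placeholders; the counts are the ones from the question. Storing the names as a factor with explicit levels keeps them in process order instead of alphabetical order.

```r
# Hypothetical step names; replace with the real ones.
step_names <- c("Applied", "Screened", "Interviewed", "Shortlisted")

df <- data.frame(
  Step = factor(step_names, levels = step_names),  # levels fix the order
  Candidates = c(26587, 1761, 849, 130)
)

# Integer codes of the factor give the x positions 1, 2, 3, 4.
x_pos <- as.integer(df$Step)

plot(x_pos, df$Candidates, type = "s", log = "y", xaxt = "n",
     xlab = "Step", ylab = "Candidates remaining (log scale)")
axis(1, at = x_pos, labels = levels(df$Step))
```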

How do I make R lattice xyplot ignore gaps in time and make a continuous timeseries plot?

I have an xts series that I'm trying to plot. This series consists of intra-day data for a month, with gaps in the data on the weekends. I use xyplot (lattice) in R to plot the time series and am very pleased with the results.
Unfortunately the plots keep the weekend gaps. I'd like to ignore the weekend gaps and make my time series plot continuous, and would appreciate it if someone pointed me in the right direction.
The current command is:
xyplot(close~MyTime, type='l', col='black',ylab='',xlab='', main='Test')
I tried JohnPaul's method and it 'nearly' works. The labels, while present, don't render correctly. The last label only goes up to the 3rd of January, while the actual data extends all the way to February. The command used was:
PlotOrd<-order(MyTime)
xyplot(close~PlotOrd, type='l', col='black',ylab='',xlab='', main='Close',scales=list( x=list( labels=MyTime)) )
If I understand this correctly, what you wish to do is have the weekends not appear in the plot at all. One way to do this is to make another vector giving the order in which you want close plotted, a vector that does not include weekends. Assuming that weekends are not included in MyTime, this should work:
PlotOrd<-order(MyTime)
xyplot(close~PlotOrd, type='l', col='black',ylab='',xlab='', main='Test')
This will give you the correct plot, but your x-axis tick labels will just be the numbers from PlotOrd. If you want to keep them as dates, add a scales argument for the x axis, like so:
xyplot(close~PlotOrd, type='l', col='black',ylab='',xlab='', main='Test',
scales=list( x=list( labels=MyTime )) )
EDIT
One way to control the axis labels is to use the at argument as well. It is kind of clunky here and I wish I could come up with a more elegant idea, but this will work:
xyplot(close~PlotOrd, type='l', col='black',ylab='',xlab='', main='Test',
scales=list( x=list(at=c(50,100,150,200), labels=MyTime[c(50,100,150,200)] )) )
This will put ticks at observations 50, 100, 150 and 200, and will use the corresponding values from MyTime as the labels. The downside is that you have to write in the ticks yourself. You could add some code to make a sequence of values. Say you want a label every 15 days. If you figure out how many measurements that corresponds to, you can make a sequence of numbers that far apart (call it MyTicks). Then feed it to at=MyTicks and labels=MyTime[MyTicks]. Still, it would be nicer to have lattice just pick the ticks for you...
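The MyTicks idea could be sketched as follows. The hourly weekday-only series is simulated, and the spacing of one label per 5 trading days (obs_per_label) is an arbitrary assumption; adjust it to the sampling rate of the real data.

```r
# Simulated hourly data with weekends removed; a stand-in for the real series.
set.seed(42)
all_hours <- seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
                 by = "hour", length.out = 24 * 35)
is_weekday <- as.POSIXlt(all_hours)$wday %in% 1:5  # 0 = Sunday, 6 = Saturday
MyTime <- all_hours[is_weekday]
close  <- cumsum(rnorm(length(MyTime)))

# Observation number; equal to order(MyTime) because MyTime is already sorted.
PlotOrd <- seq_along(MyTime)

# Assumed spacing: one label per 5 trading days of hourly observations.
obs_per_label <- 24 * 5
MyTicks  <- seq(1, length(MyTime), by = obs_per_label)
MyLabels <- format(MyTime[MyTicks], "%m-%d")

if (requireNamespace("lattice", quietly = TRUE)) {
  p <- lattice::xyplot(close ~ PlotOrd, type = "l", col = "black",
                       xlab = "", ylab = "", main = "Test",
                       scales = list(x = list(at = MyTicks, labels = MyLabels)))
  print(p)  # lattice objects are drawn when printed
}
```

Formatting the labels with format() before passing them to scales keeps them short enough to avoid overlapping.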
