Using the plot() function in R, I'm trying to produce a scatterplot of points of the form (SaleDate, SalePrice) = (saldt, salpr) from a time-series, cross-section real estate sales dataset in dataframe format. My problem concerns labels for the X-axis. Just about any series of annual labels would be adequate, e.g. 1999, 2000, ..., 2013 or 1999-01-01, ..., 2013-01-01. What I'm getting now is a single label, 2000, at what appears to be the proper location, which won't work.
The following is my call to plot():
plot(r12rgr0$saldt, r12rgr0$salpr/1000, type="p", pch=20, col="blue", cex.axis=.75,
xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01")),
ylim=c(100,650),
main="Heritage Square Sales Prices $000s 1990-2014",xlab="Sale Date",ylab="$000s")
The xlim and ylim are called out to bound the date and price ranges of the data to be plotted; note prices are plotted as $000s. r12rgr0$saldt really is a date; str(r12rgr0$saldt) returns:
Date[1:4190], format: "1999-10-26" "2013-07-06" "2003-08-25" NA NA "2000-05-24" ...
I have reviewed several threads here concerning similar questions, and see that the solution probably lies in turning off the default X-axis behavior and using axis.Date, but i) at my current level of R skill, I'm not sure I'd be able to solve the problem, and ii) I wonder why the plotting defaults are producing these rather puzzling (to me, at least) results.
Additional observations: The Y-axis labels are just fine: 100, 200, ..., 600. The general appearance of the scatterplot indicates that the requested date range is being observed and that the relative positions of the plotted points are correct. Replacing xlim=... as above with xlim=c("1999-01-01","2014-01-01")
or
xlim=c(as.numeric(as.character("1999-01-01")),as.numeric(as.character("2014-01-01")))
or
xlim=c(as.POSIXct("1999-01-01", format="%Y-%m-%d"),as.POSIXct("2014-01-01", format="%Y-%m-%d"))
all result in error messages.
With plots, it's very hard to reproduce results without sample data. Here's a sample I'll use:
dd<-data.frame(
saldt=seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by="6 mon"),
salpr = cumsum(rnorm(31))
)
A simple plot with
with(dd, plot(saldt, salpr))
produces a few year marks
If I wanted more control, I could use axis.Date as you alluded to:
with(dd, plot(saldt, salpr, xaxt="n"))
axis.Date(1, at=seq(min(dd$saldt), max(dd$saldt), by="30 mon"), format="%m-%Y")
which gives
Note that xlim only zooms in on part of the plot. It is not directly connected to the axis labels, but the axis labels will adjust to provide a "pretty" range covering the data that is plotted. Doing just
xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01"))
is the correct way to zoom the plot. No need for conversion to numeric or POSIXct.
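For the annual labels asked for in the question, one possible sketch combines xaxt="n" with axis.Date() and a yearly sequence. It rebuilds the sample dd from above; the name year_ticks and the seed are my own additions, and one tick per year is only an assumption about what reads well.

```r
# Sample data as in the answer above (seed added only for reproducibility)
set.seed(1)
dd <- data.frame(
  saldt = seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by = "6 months"),
  salpr = cumsum(rnorm(31))
)

# One tick per year across the requested x-range (hypothetical helper name)
year_ticks <- seq(as.Date("1999-01-01"), as.Date("2014-01-01"), by = "year")

# Suppress the default x-axis, then draw one label per year
plot(dd$saldt, dd$salpr, type = "p", pch = 20, col = "blue",
     xaxt = "n", xlim = range(year_ticks),
     xlab = "Sale Date", ylab = "Price")
axis.Date(1, at = year_ticks, format = "%Y", cex.axis = 0.75)
```

If sixteen labels turn out to be too crowded, by = "2 years" in the seq() call thins them out.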
If you are running a plot in real time and don't mind some warnings, you can just pass, e.g., format = "%Y-%m-%d" in the plot function. For instance:
plot(seq((Sys.Date()-9),Sys.Date(), 1), runif(10), xlab = "Date", ylab = "Random")
yields:
while:
plot(seq((Sys.Date()-9), Sys.Date(), 1), runif(10), format = "%Y-%m-%d", xlab = "Date", ylab = "Random")
yields:
with lots of warnings about format not being a graphical parameter.
I apologize if this is a messy question, but I don't know any better way to format it. I'm trying to get it so on the x-axis of my graph, all of the dates I have in my dataset are shown. Right now, it's just about half of them - I'll attach a picture of my graph. I would also like if there were more tick marks between each date. I used the lubridate() function to combine my date/time columns, so that's not an issue. I was wondering if there was a way to manipulate the axis tick marks even though these aren't typical numerical values. I'll attach my code below.
PS4T1$newdate <- with(PS4T1, as.POSIXct(paste(date, time), format="%m-%d-%Y %H:%M"))
plot(average ~ newdate, data=PS4T1, type="b", col="blue")
First, convert the date column with as.POSIXct. This matters because it determines which plot method is called; apparently you have already done that.
dat <- transform(dat, date=as.POSIXct(date))
Then, subset on the substrings where hours are e.g. '00'. Next plot without x-axis and build custom axis using axis and mtext.
st <- substr(dat$date, 12, 13) == '00'
plot(dat, type='b', col='blue', xaxt='n')
axis(1, dat$date[st], labels=FALSE)
mtext(strftime(dat$date[st], '%b %d'), 1, 1, at=dat$date[st])
Data:
set.seed(42)
dat <- data.frame(
date=as.character(seq.POSIXt(as.POSIXct('2021-06-22'), as.POSIXct('2021-06-29'), 'hour')),
v=runif(169)
)
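Because the character form of a date-time can differ between R versions (midnight values may print without a time part), selecting midnights by substring can be fragile. A hedged alternative sketch keeps the column as POSIXct throughout and tests the hour with format(); the tz = "UTC" setting and the use of axis.POSIXct() are my own choices, not part of the answer above.

```r
# Hourly series kept as POSIXct throughout (no character round-trip)
set.seed(42)
dat <- data.frame(
  date = seq(as.POSIXct("2021-06-22", tz = "UTC"),
             as.POSIXct("2021-06-29", tz = "UTC"), by = "hour"),
  v = runif(169)
)

# TRUE where the hour component is 00, i.e. at each midnight
st <- format(dat$date, "%H") == "00"

# Plot without the default x-axis, then label only the midnights
plot(dat$date, dat$v, type = "b", col = "blue", xaxt = "n",
     xlab = "", ylab = "v")
axis.POSIXct(1, at = dat$date[st], format = "%b %d")
```

Minor unlabelled ticks could be added with a second axis.POSIXct() call using labels = FALSE and a shorter tcl.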
I have a dataset with revenue and date columns.
I used arima to plot the data.
ts_data = ts(dataset$Revenue,frequency = 7)
arima.ts = auto.arima(ts_data)
pred = forecast(arima.ts,h=30)
plot(pred,xaxt="n")
When I plot the data, it produces plot like below.
My expectations are below,
I need to display values in Million for predicted values like 13.1M.
I need to show x-axis as date instead of data points numbers.
I tried several links but couldn't crack it. Below are the experiments I made,
I tried setting the start date and end date in ts_data, but that doesn't work either. My start date is "2019-09-27" and my end date is "2020-07-02".
I tried axis_date in the plot function, but that doesn't work either.
Please help me to crack the same.
Thanks a lot.
You can specify axis tick values and labels with axis()
plot(pred,xaxt="n", yaxt="n") # deactivate x and y axis
yl<-seq(-5000000,15000000,by=5000000) # position of y-axis ticks
axis(2, at=yl, labels=paste(yl/1000000, "M")) # 2 = left axis
You can specify the desired position of the y-axis ticks with at and the text to be associated with them with labels. To obtain values like 10 M, I have used the function paste to join the numbers with the letter M.
You could use the same method for the x-axis, even though more efficient methods probably exist. Not having the specific data available, I have used generic spacing and labels only to give you the idea. Once you have set the desired positions, you can generate the sequence of dates associated with them (to generate a sequence of dates see https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/seq.Date)
axis(1, at=seq(1,40,by=1), labels=seq(as.Date("2019-09-27"),as.Date("2020-07-02"),by="week")) # 1 = below axis
You can also change the format of the dates displayed with format(), for example labels=format(vector_of_date, "%Y-%b-%d") to have year-month(in letters)-day.
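To make the idea concrete without the asker's data, here is a minimal sketch on a made-up daily revenue series. It uses sprintf() to get labels such as 13.1M and time() to map tick positions back to calendar dates. The series, the four-week tick spacing, and the variable names are all assumptions; a forecast plotted from the same ts would continue on the same week-based time index.

```r
# Hypothetical daily revenue starting on the question's start date
set.seed(1)
dates   <- seq(as.Date("2019-09-27"), as.Date("2020-07-02"), by = "day")
revenue <- 1e7 + cumsum(rnorm(length(dates), sd = 2e5))
ts_data <- ts(revenue, frequency = 7)   # time index counts weeks: 1, 1 + 1/7, ...

plot(ts_data, xaxt = "n", yaxt = "n", xlab = "Date", ylab = "Revenue")

# y-axis: pretty tick positions, labelled in millions with one decimal
yl <- pretty(revenue)
axis(2, at = yl, labels = sprintf("%.1fM", yl / 1e6))

# x-axis: a tick every 4 weeks; each tick is labelled with its calendar date
idx <- seq(1, length(dates), by = 28)   # positions within the data
axis(1, at = time(ts_data)[idx], labels = format(dates[idx], "%Y-%m-%d"))
```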
I'm struggling with changing the x-axis (time) for my time series forecast plot. I have run many models and hit the same issue with each. I'm going to write the code for the model fit, forecast and the plot here for one of the models. First, here is my original time series. Note: I'm fitting my model on my training data, which covers 2008-2016, and testing it on my test data for the 11 months in 2017.
Data Split.
sal.ts <- window(sal.ts.original, start=c(2008,1), end=c(2016,12))
sal.test <- window(sal.ts.original, start=c(2017,1))
Now, the model.
sal.hw.mul <- HoltWinters(sal.ts, seasonal = "mult")
sal.hw.mul
fc.hwm <- forecast(sal.hw.mul, h=11)
fc.hwm
plot(fc.hwm, xlim=c(2017,2017+11/12), main = "Forecast from Multiplicative HW", xlab = "Year", ylab = "Total Sales, $M")
lines(sal.test,col='red', lwd=2)
legend("topleft", c("Actual", "Predicted"), col = c(2,4), lty = 1) # 2 = red (actual), 4 = blue (forecast)
Here's my forecast plot:
See that ugly 2017.0, 2017.2.... 2017.8? I want it to instead say 1,2,3,....11 for the 11 months of 2017.
Yes, I only want to plot my test data and forecast on it and not the whole series.
I am pretty sure my problem is around my use of the xlim function. I am using that xlim function to just plot the months of 2017 and if I don't use that then R plots the whole series from 2008-2017. I tried to play around with the axis function a lot by setting xaxt="n" in the plot command but still couldn't figure it out.
Let me know if you need more information from me. Any help will be appreciated.
Update, on someone's suggestion I tried to write a custom axis by setting xaxt = 'n' in my plot. Here's the change in code.
x <- seq(1,11,1)
fc.hwm <- forecast(sal.hw.mul, h=11)
fc.hwm
layout(1:1)
plot(fc.hwm, xaxt='n', xlim=c(2017,2017+11/12), main = "Forecast from Multiplicative HW", xlab = "Year", ylab = "Total Sales, $M")
axis(side=1, at= x, labels=c("1","2","3","4","5","6","7","8","9","10","11"))
lines(sal.test,col='red', lwd=2)
legend("topleft", c("Actual", "Predicted"), col = c(2,4), lty = 1) # 2 = red (actual), 4 = blue (forecast)
As you can see, it gets me halfway there. I can remove my current axis labels, but I am not able to draw a new axis. The new code doesn't even give me an error, or else I would have tried to debug it. It accepts my code but doesn't give me the desired output.
Here's an idea. I'm not sure what the data look like, but I'm guessing that you have a Date type for the date variable -- and that means that your "by" sequence of integer 1 to 11 might be placing those new labels outside the plot limits. Try using a Date sequence instead.
Change this:
x <- seq(1,11,1)
To something like this:
x <- seq.Date(as.Date("2017-01-01"), as.Date("2017-11-01"), "months")
I'm not sure how far into November your data go, so you might want to set that "to" Date in the sequence to December instead, so you can fully cover your November data points.
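If the series is instead a monthly ts object, as the window() calls in the question suggest, the x-axis is measured in decimal years rather than Date values, so the tick positions would need to be on that scale. A minimal sketch under that assumption, with a made-up series standing in for the forecast and test data:

```r
# Stand-in for the 2017 test data: a monthly ts with 11 observations
set.seed(1)
sal.test <- ts(100 + cumsum(rnorm(11)), start = c(2017, 1), frequency = 12)

# On a monthly ts time axis, month m of 2017 sits at 2017 + (m - 1)/12
x <- 2017 + (0:10) / 12

plot(sal.test, xaxt = "n", xlim = c(2017, 2017 + 11/12),
     xlab = "Month of 2017", ylab = "Total Sales, $M")
axis(side = 1, at = x, labels = 1:11)
```

The same at and labels values should carry over to a plot of the forecast object, since it shares the time scale of the underlying ts.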
Good day
I read on one of the posts here that "the function forecast::plot.forecast is not designed to be used with axis.Date or axis.POSIXct (which are not used in the package forecast)." This can be seen here: custom axis labels plotting a forecast in R
Nevertheless, they managed to use the forecast package and some code to get the correct axis labels. However, this example is for quarterly data. Also, this example here using 'as.POSIXlt' is for weekly data: Forecasting time series data
I've tried playing with the code but I can't get it to work for monthly data. So my axis labels are still wrong. I'm stumped. Please would you help advise how I get the axis labels to reflect correctly using the forecast package?
Example
library(forecast)
headcount<-c(2475,2468,2452,2464,2500,2548,2536,2565,2590,2608,2625,2663)
date<-c("2013/01/31","2013/02/28","2013/03/31","2013/04/30","2013/05/31","2013/06/30",
"2013/07/31","2013/08/31","2013/09/30","2013/10/31","2013/11/30","2013/12/31")
x<-data.frame(headcount,date)
t<-ts(x$headcount,start=c(2013,1),end=c(2013,12),frequency=12)
fit<-forecast(t,h=12)
plot(forecast(fit))
By doing this, the axis labels come out as 2013.0, 2013.5, 2014.5
I know this is only a year's worth of data. I'm just interested in how to fix the axis labels for monthly data.
Kind regards
D
Here's a possible solution using the links provided
plot(forecast(fit), axes = FALSE)
a <- seq(as.Date(date[1]) + 1, by = "months", length.out = length(date) + 11)
axis(1, at = as.numeric(a)/365.3 + 1970, labels = format(a, format = "%m/%Y"), cex.axis = 0.9)
axis(2, cex.axis = 0.9)
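The division by 365.3 above only approximates where each month falls on the ts time axis. As a hedged alternative, the positions can be computed exactly from the monthly frequency. The sketch below plots only the observed series with base R (no forecast package), and the names hc, at_pos and month_starts are my own.

```r
headcount <- c(2475, 2468, 2452, 2464, 2500, 2548,
               2536, 2565, 2590, 2608, 2625, 2663)
hc <- ts(headcount, start = c(2013, 1), frequency = 12)

# Exact positions for 24 months (12 observed + 12 forecast) in decimal years
at_pos <- 2013 + (0:23) / 12
# Matching calendar months, used only to build the labels
month_starts <- seq(as.Date("2013-01-01"), by = "month", length.out = 24)

# Label every third month to keep the axis readable
keep <- seq(1, 24, by = 3)
plot(hc, xlim = range(at_pos), axes = FALSE, xlab = "", ylab = "Headcount")
axis(1, at = at_pos[keep], labels = format(month_starts[keep], "%m/%Y"),
     cex.axis = 0.9)
axis(2, cex.axis = 0.9)
box()
```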
I'm creating a plot in R with dates as the xaxis. My frame has dates, no problem. I'm using custom date range - one that cuts off some of the earliest data by using a fixed start and extend slightly past the latest data by using a end determined by some other code. The range is ~47 days right now. That's all working fine.
My problem is that the xaxis label includes only a single label, "Feb" but I'd like to include at least 3 labels, if not 5.
starttime <- strptime("20110110", "%Y%m%d")
endtime <- strptime("20110226 1202", "%Y%m%d %H%M") #This is actually determined programmatically, but that's not important
xrange <- c(starttime, endtime)
yrange <- c(0, 100)
par(mar=par()$mar+c(0,0,0,7),bty="l")
plot(xrange, yrange, type="n", xlab="Submission Time", ylab="Best Score", main="Top Scores for each team over time")
#More code to loop and add a bunch of lines(), but it's not really relevant
The resulting graph looks like this:
I really just want better labels. I'm not too concerned about exactly what they are, but something with Month + Day, and at least 3 of them.
Try this. I changed your plot() statement a little and added two lines.
starttime <- strptime("20110110", "%Y%m%d")
endtime <- strptime("20110226 1202", "%Y%m%d %H%M")
#This is actually determined programmatically, but that's not important
xrange <- c(starttime, endtime)
yrange <- c(0, 100)
par(mar=par()$mar+c(0,0,0,7),bty="l")
#I added xaxt="n" to suppress the plotting of the x-axis
plot(xrange, yrange, type="n", xaxt="n", xlab="Submission Time", ylab="Best Score", main="Top Scores for each team over time")
#I added the following two lines to plot the x-axis with a label every 7 days
atx <- seq(starttime, endtime, by=7*24*60*60)
axis(1, at=atx, labels=format(atx, "%b\n%d"), padj=0.5)
#More code to loop and add a bunch of lines(), but it's not really relevant
In addition, look at axis.Date(), which is not an S3 generic, but can help set up the extra labels you want. I wrote a patch for this function that got incorporated several R versions ago, which allowed axes without labels. Here is an example taken from ?axis.Date:
random.dates <- as.Date("2001/1/1") + 70*sort(stats::runif(100))
plot(random.dates, 1:100)
# or for a better axis labelling
plot(random.dates, 1:100, xaxt="n")
axis.Date(1, at=seq(as.Date("2001/1/1"), max(random.dates)+6, "weeks"))
axis.Date(1, at=seq(as.Date("2001/1/1"), max(random.dates)+6, "days"),
labels = FALSE, tcl = -0.2)
which produces:
There is also Axis.Date(), which is an S3 method, so it can be called via Axis(dates.vec, ...) where dates.vec is the x-axis vector of dates. Arguments such as at can also be specified.
You can do that by hand if you suppress the x-axis annotation in the call to plot() and then use axis() with manually specified points and labels:
axis(1,at=axis.pos[axis.ind],labels=axis.txt[axis.ind])
using a set of indices axis.ind which selects from the x values and formatted labels. You can use strftime() for just about anything, e.g. '%d %b' should produce the day and a human-readable short month name, as in
R> strftime(Sys.Date(), "%d %b")
[1] "26 Feb"
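A self-contained sketch of that approach follows. The answer does not define axis.pos, axis.txt or axis.ind, so the definitions here are assumptions, as are the made-up daily data and the choice of one label every 7 days.

```r
# Hypothetical daily series over the question's date range
x <- seq(as.Date("2011-01-10"), as.Date("2011-02-26"), by = "day")
y <- seq(0, 100, length.out = length(x))

axis.pos <- x                          # candidate tick positions
axis.txt <- strftime(x, "%d %b")       # a formatted label for every candidate
axis.ind <- seq(1, length(x), by = 7)  # keep every 7th candidate

# Suppress the default x-axis, then draw only the selected ticks
plot(x, y, type = "l", xaxt = "n",
     xlab = "Submission Time", ylab = "Best Score")
axis(1, at = axis.pos[axis.ind], labels = axis.txt[axis.ind])
```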