R x axis date label only one value

I'm creating a plot in R with dates on the x-axis. My data frame has dates, no problem. I'm using a custom date range: one that cuts off some of the earliest data by using a fixed start and extends slightly past the latest data by using an end determined by some other code. The range is ~47 days right now. That's all working fine.
My problem is that the x-axis shows only a single label, "Feb", but I'd like at least 3 labels, if not 5.
starttime <- strptime("20110110", "%Y%m%d")
endtime <- strptime("20110226 1202", "%Y%m%d %H%M") #This is actually determined programmatically, but that's not important
xrange <- c(starttime, endtime)
yrange <- c(0, 100)
par(mar=par()$mar+c(0,0,0,7),bty="l")
plot(xrange, yrange, type="n", xlab="Submission Time", ylab="Best Score", main="Top Scores for each team over time")
#More code to loop and add a bunch of lines(), but it's not really relevant
The resulting graph looks like this:
I really just want better labels. I'm not too concerned about exactly what they are, but something with Month + Day, and at least 3 of them.

Try this. I changed your plot() statement a little and added two lines.
starttime <- strptime("20110110", "%Y%m%d")
endtime <- strptime("20110226 1202", "%Y%m%d %H%M")
#This is actually determined programmatically, but that's not important
xrange <- c(starttime, endtime)
yrange <- c(0, 100)
par(mar=par()$mar+c(0,0,0,7),bty="l")
#I added xaxt="n" to suppress the plotting of the x-axis
plot(xrange, yrange, type="n", xaxt="n", xlab="Submission Time", ylab="Best Score", main="Top Scores for each team over time")
#I added the following two lines to plot the x-axis with a label every 7 days
atx <- seq(starttime, endtime, by=7*24*60*60)
axis(1, at=atx, labels=format(atx, "%b\n%d"), padj=0.5)
#More code to loop and add a bunch of lines(), but it's not really relevant
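A variation on the same idea, as a minimal sketch: seq() on date-times also accepts by = "week", which avoids the hand-computed number of seconds. The fixed UTC time zone is an assumption made here for reproducibility; the code above uses strptime() without a time zone.

```r
# Assumption: a fixed UTC time zone, so the tick positions are reproducible
starttime <- as.POSIXct("2011-01-10 00:00", tz = "UTC")
endtime   <- as.POSIXct("2011-02-26 12:02", tz = "UTC")

# One tick per week, starting at the left edge of the plot
atx <- seq(starttime, endtime, by = "week")

# Empty plot without the default x-axis, then a custom weekly axis
plot(c(starttime, endtime), c(0, 100), type = "n", xaxt = "n",
     xlab = "Submission Time", ylab = "Best Score")
axis(1, at = atx, labels = format(atx, "%b %d"))
```

With this range the sequence contains seven weekly ticks, which is more than the three to five labels asked for; by = "2 weeks" would thin them out.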

In addition, look at axis.Date(), which is not an S3 generic, but can help set up the extra labels you want. I wrote a patch for this function that got incorporated several R versions ago, which allowed axes without labels. Here is an example taken from ?axis.Date:
random.dates <- as.Date("2001/1/1") + 70*sort(stats::runif(100))
plot(random.dates, 1:100)
# or for a better axis labelling
plot(random.dates, 1:100, xaxt="n")
axis.Date(1, at=seq(as.Date("2001/1/1"), max(random.dates)+6, "weeks"))
axis.Date(1, at=seq(as.Date("2001/1/1"), max(random.dates)+6, "days"),
labels = FALSE, tcl = -0.2)
which produces:
There is also Axis.Date(), which is an S3 method, so it can be called via Axis(dates.vec, ...), where dates.vec is the x-axis vector of dates. Arguments such as at can also be specified.
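A small sketch of the Axis() route, assuming a made-up vector of 70 consecutive dates (dates.vec and at.dates are illustrative names). The format argument is passed on through ... to axis.Date():

```r
# Hypothetical data: 70 consecutive days starting 2001-01-01
dates.vec <- as.Date("2001-01-01") + 0:69

# Tick positions every two weeks across the data range
at.dates <- seq(min(dates.vec), max(dates.vec), by = "2 weeks")

plot(dates.vec, seq_along(dates.vec), xaxt = "n",
     xlab = "Date", ylab = "Index")
# Axis() dispatches on the class of its first argument (here: Date)
Axis(dates.vec, side = 1, at = at.dates, format = "%d %b")
```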

You can do that by hand if you suppress the x-axis annotation in the call to plot() and then use axis() with manually specified points and labels:
axis(1,at=axis.pos[axis.ind],labels=axis.txt[axis.ind])
using a set of indices axis.ind which selects from the x values (axis.pos) and the formatted labels (axis.txt). You can use strftime() for just about anything, e.g. '%d %b' should produce the day and a human-readable short month, as in
R> strftime(Sys.Date(), "%d %b")
[1] "26 Feb"
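The answer does not define axis.pos, axis.txt or axis.ind. One way they might be built, as a sketch with made-up daily data (the values of x and y and the choice of every 7th point are assumptions):

```r
# Hypothetical daily data covering the question's date range
x <- seq(as.Date("2011-01-10"), as.Date("2011-02-26"), by = "day")
y <- seq_along(x)

axis.pos <- as.numeric(x)               # numeric positions on the x-axis
axis.txt <- strftime(x, "%d %b")        # formatted label for every position
axis.ind <- seq(1, length(x), by = 7)   # keep every 7th position

# Suppress the default x-axis, then draw the selected ticks and labels
plot(x, y, type = "l", xaxt = "n", xlab = "Date", ylab = "Value")
axis(1, at = axis.pos[axis.ind], labels = axis.txt[axis.ind])
```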

Related

Trying to add more axis marks in Base R with date/time format using lubridate()

I apologize if this is a messy question, but I don't know any better way to format it. I'm trying to get all of the dates in my dataset shown on the x-axis of my graph. Right now, only about half of them appear - I'll attach a picture of my graph. I would also like more tick marks between each date. I used the lubridate package to combine my date/time columns, so that's not an issue. I was wondering if there was a way to manipulate the axis tick marks even though these aren't typical numerical values. I'll attach my code below.
PS4T1$newdate <- with(PS4T1, as.POSIXct(paste(date, time), format="%m-%d-%Y %H:%M"))
plot(average ~ newdate, data=PS4T1, type="b", col="blue")
First, convert date with as.POSIXct. This matters because it determines which plot method is called; apparently you have already done that.
dat <- transform(dat, date=as.POSIXct(date))
Then, flag the rows where the hour is e.g. '00'. Extracting the hour with format() does not depend on how the date-time happens to be converted to text. Next, plot without the x-axis and build a custom axis using axis and mtext.
st <- format(dat$date, '%H') == '00'
plot(dat, type='b', col='blue', xaxt='n')
axis(1, dat$date[st], labels=FALSE)
mtext(strftime(dat$date[st], '%b %d'), 1, 1, at=dat$date[st])
Data:
set.seed(42)
dat <- data.frame(
date=as.character(seq.POSIXt(as.POSIXct('2021-06-22'), as.POSIXct('2021-06-29'), 'hour')),
v=runif(169)
)
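For the second request, more tick marks between the dates, one option is a second axis() call that draws short unlabeled ticks. This is a sketch on the same kind of hourly sample data; the UTC time zone, the 6-hour spacing and the names major and minor are assumptions chosen for illustration.

```r
set.seed(42)
# Hourly sample data for one week (UTC assumed for reproducibility)
times <- seq(as.POSIXct("2021-06-22", tz = "UTC"),
             as.POSIXct("2021-06-29", tz = "UTC"), by = "hour")
dat <- data.frame(date = times, v = runif(length(times)))

major <- seq(min(dat$date), max(dat$date), by = "day")      # labeled ticks
minor <- seq(min(dat$date), max(dat$date), by = "6 hours")  # short ticks

plot(v ~ date, data = dat, type = "b", col = "blue", xaxt = "n")
axis(1, at = major, labels = format(major, "%b %d"))
axis(1, at = minor, labels = FALSE, tcl = -0.25)
```

If some daily labels are missing, axis() has probably dropped them to avoid overlap; a smaller cex.axis or las = 2 leaves more room.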

change axis/scale for time series plot after forecast

I'm struggling with changing the x-axis (time) for my time series forecast plot. I have run many models but keep hitting the same issue. I'm going to write the code for the model fit, forecast and the plot here for one of the models. First here is my original time series. Note: I'm fitting my model on my training data that is from 2008-2016 and testing my model on my test data for the 11 months in 2017.
Data Split.
sal.ts <- window(sal.ts.original, start=c(2008,1), end=c(2016,12))
sal.test <- window(sal.ts.original, start=c(2017,1))
Now, the model.
sal.hw.mul <- HoltWinters(sal.ts, seasonal = "mult")
sal.hw.mul
fc.hwm <- forecast(sal.hw.mul, h=11)
fc.hwm
plot(fc.hwm, xlim=c(2017,2017+11/12), main = "Forecast from Multiplicative HW", xlab = "Year", ylab = "Total Sales, $M")
lines(sal.test,col='red', lwd=2)
legend("topleft", c("Actual", "Predicted"), col = c(2,4), lty = 1)
Here's my forecast plot:
See that ugly 2017.0, 2017.2.... 2017.8? I want it to instead say 1,2,3,....11 for the 11 months of 2017.
Yes, I only want to plot my test data and forecast on it and not the whole series.
I am pretty sure my problem is around my use of the xlim argument. I am using xlim to plot just the months of 2017; if I don't use it, R plots the whole series from 2008-2017. I tried to play around with the axis function a lot by setting xaxt="n" in the plot command but still couldn't figure it out.
Let me know if you need more information from me. Any help will be appreciated.
Update, on someone's suggestion I tried to write a custom axis by setting xaxt = 'n' in my plot. Here's the change in code.
x <- seq(1,11,1)
fc.hwm <- forecast(sal.hw.mul, h=11)
fc.hwm
layout(1:1)
plot(fc.hwm, xaxt='n', xlim=c(2017,2017+11/12), main = "Forecast from Multiplicative HW", xlab = "Year", ylab = "Total Sales, $M")
axis(side=1, at= x, labels=c("1","2","3","4","5","6","7","8","9","10","11"))
lines(sal.test,col='red', lwd=2)
legend("topleft", c("Actual", "Predicted"), col = c(2,4), lty = 1)
As you can see, it gets me halfway there. I can remove the current axis labels, but I am not able to draw a new axis. This new code doesn't even give me an error, or else I would have tried to debug it. R accepts my code but doesn't produce the desired output.
Here's an idea. I'm not sure what the data look like, but I'm guessing that you have a Date type for the date variable -- and that means that your "by" sequence of integer 1 to 11 might be placing those new labels outside the plot limits. Try using a Date sequence instead.
Change this:
x <- seq(1,11,1)
To something like this:
x <- seq.Date(as.Date("2017-01-01"), as.Date("2017-11-01"), "months")
I'm not sure how far into November your data go, so you might want to set that "to" Date in the sequence to December instead, so you can fully cover your November data points.
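A further possibility, hedged because the data are not shown: the series is created with window(), and the default axis shows 2017.0, 2017.2 and so on, which suggests a ts object whose x positions are decimal years rather than Dates. In that case the tick positions would need to be on that scale too. A minimal sketch with a made-up monthly series (sal.demo and tick.pos are hypothetical names, and the values are invented):

```r
# Hypothetical stand-in for the 11 test months of 2017
sal.demo <- ts(c(12, 14, 13, 15, 18, 17, 19, 21, 20, 22, 24),
               start = c(2017, 1), frequency = 12)

# time() returns the x positions in decimal years: 2017, 2017 + 1/12, ...
tick.pos <- as.numeric(time(sal.demo))

plot(sal.demo, xaxt = "n", xlab = "Month of 2017", ylab = "Total Sales, $M")
axis(side = 1, at = tick.pos, labels = 1:11)
```

On this scale, at = 1:11 lies far to the left of xlim = c(2017, 2017 + 11/12), which would explain why no axis appears and no error is raised.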

R X-axis Date Labels using plot()

Using the plot() function in R, I'm trying to produce a scatterplot of points of the form (SaleDate,SalePrice) = (saldt,salpr) from a time-series, cross-section real estate sales dataset in dataframe format. My problem concerns labels for the X-axis. Just about any series of annual labels would be adequate, e.g. 1999,2000,...,2013 or 1999-01-01,...,2013-01-01. What I'm getting now, a single label, 2000, at what appears to be the proper location, won't work.
The following is my call to plot():
plot(r12rgr0$saldt, r12rgr0$salpr/1000, type="p", pch=20, col="blue", cex.axis=.75,
xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01")),
ylim=c(100,650),
main="Heritage Square Sales Prices $000s 1990-2014",xlab="Sale Date",ylab="$000s")
The xlim and ylim are called out to bound the date and price ranges of the data to be plotted; note prices are plotted as $000s. r12rgr0$saldt really is a date; str(r12rgr0$saldt) returns:
Date[1:4190], format: "1999-10-26" "2013-07-06" "2003-08-25" NA NA "2000-05-24" xx
I have reviewed several threads here concerning similar questions, and see that the solution probably lies with turning off the default X-axis behavior and using axis.Date, but i) At my current level of R skill, I'm not sure I'd be able to solve the problem, and ii) I wonder why the plotting defaults are producing these rather puzzling (to me, at least) results?
Addl Observations: The Y-axis labels are just fine 100, 200,..., 600. The general appearance of the scatterplot indicates the called-for date ranges are being observed and the relative positions of the plotted points are correct. Replacing xlim=... as above with xlim=c("1999-01-01","2014-01-01")
or
xlim=c(as.numeric(as.character("1999-01-01")),as.numeric(as.character("2014-01-01")))
or
xlim=c(as.POSIXct("1999-01-01", format="%Y-%m-%d"),as.POSIXct("2014-01-01", format="%Y-%m-%d"))
all result in error messages.
With plots it's very hard to reproduce results without sample data. Here's a sample I'll use
dd<-data.frame(
saldt=seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by="6 mon"),
salpr = cumsum(rnorm(31))
)
A simple plot with
with(dd, plot(saldt, salpr))
produces a few year marks
If I wanted more control, I could use axis.Date as you alluded to
with(dd, plot(saldt, salpr, xaxt="n"))
axis.Date(1, at=seq(min(dd$saldt), max(dd$saldt), by="30 mon"), format="%m-%Y")
which gives
Note that xlim only zooms in on part of the plot. It is not directly connected to the axis labels, but the axis labels will adjust to provide a "pretty" range covering the data that is plotted. Doing just
xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01"))
is the correct way to zoom the plot. No need for conversion to numeric or POSIXct.
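To get the annual labels the question asks for, the two pieces can be combined: Date-valued xlim plus axis.Date() with one tick per year. This is a sketch on sample data of the same shape as above; the seed and the name yrs are assumptions.

```r
set.seed(1)
# Sample data: one sale every six months, 1999 to 2014
dd <- data.frame(
  saldt = seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by = "6 months"),
  salpr = cumsum(rnorm(31))
)

# One tick per year, 1999 through 2014
yrs <- seq(as.Date("1999-01-01"), as.Date("2014-01-01"), by = "year")

plot(dd$saldt, dd$salpr, pch = 20, col = "blue", xaxt = "n",
     xlim = range(yrs), xlab = "Sale Date", ylab = "Price")
axis.Date(1, at = yrs, format = "%Y", cex.axis = 0.75)
```

Depending on the plot width, axis() may skip some labels to avoid overlap; the tick marks are still drawn for every year.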
If you are running a plot in real time and don't mind some warnings, you can just pass, e.g., format = "%Y-%m-%d" in the plot function. For instance:
plot(seq((Sys.Date()-9),Sys.Date(), 1), runif(10), xlab = "Date", ylab = "Random")
yields:
while:
plot(seq((Sys.Date()-9), Sys.Date(), 1), runif(10), format = "%Y-%m-%d", xlab = "Date", ylab = "Random")
yields:
with lots of warnings about format not being a graphical parameter.

How to add second y axis to seqfplot with sequence frequency?

I'm working with TraMineR to do a sequence analysis of educational data. I can get R to produce a plot of the 10 most frequent sequences in the data using code similar to the following:
library(TraMineR)
##Loading the data
data(actcal)
##Creating the labels and defining the sequence object
actcal.lab <- c("> 37 hours", "19-36 hours", "1-18 hours", "no work")
actcal.seq <- seqdef(actcal, 13:24, labels=actcal.lab)
## 10 most frequent sequences in the data
actcal.freq <- seqtab(actcal.seq)
actcal.freq
## Plotting the object
seqfplot(actcal.seq, pbarw=FALSE, yaxis="pct", tlim=10:1, cex.legend=.75, withlegend="right")
However, I'd also like to have the frequencies of each sequence (which are in the object actcal.freq) along the right side of the plot. For example, the first sequence in the plot created by the code above represents 37.9% of the data (as the plot currently shows). Per the seqtab, this is 757 subjects. I'd like the number 757 to appear on the right y-axis (and so on for the other sequences).
Is this possible? I've played around with axis(side=4, ...) but never been able to get it to reproduce the spacing of the left y-axis.
OK. This is a bit of a mess, but the function resets the par setting if you include a legend by default, so you need to turn that off. Then you can set the axis a bit more easily, and then we can go back for the legend. This should work with your test data above.
#add padding to the right for axis and legend
par("mar"=c(5,4,4,8)+.1)
#plot w/o axis
seqfplot(actcal.seq, pbarw=FALSE, yaxis="pct", tlim=10:1, withlegend=FALSE)
#plot right axis with freqs
axis(4, at = seq(.7, by=1.2, length.out=length(attr(actcal.freq,"freq")$Freq)),
labels = rev(attr(actcal.freq,"freq")$Freq),
mgp = c(1.5, 0.5, 0), las = 1, tick = FALSE)
#now put the legend on
legend("right", legend=attr(actcal.seq, "labels"),
fill=attr(actcal.seq, "cpal"),
inset=-.3, bty="o", xpd=NA, cex=.75)
You may need to play a bit with the margins and especially the inset= parameter of the legend to get it placed correctly. I hope your real data isn't too different from this, because you really have to dig through the function to see how it does the formatting to get things to match up.

Alternative unit on plot()

I have two vectors (in a data frame) that I want to plot like this plot(df$timeStamp,df$value), which works nicely by itself. Now the plot is showing the timestamp in a pure numerical way as markers on the x axis.
When I format the vector of timestamps into a vector of "hh:mm:ss" strings, plot() complains (which makes sense, as the x-axis data is now a vector of strings).
Is there a way to say plot(x-vector, y-vector, label-x-vector) where the label-x-vector contains the elements to display along the x-axis?
The last part of your general question is done in two commands rather than one. If you look at ?plot.default (linked from ?plot) you'll see an option to leave off the x-axis altogether using the xaxt argument (xaxt = 'n'). Do that and then use axis to make the x-axis what you want (check ?axis). I don't know what format your timestamp is currently in so it's hard to help further.
In general it's...
plot(x_vector, y_vector, xaxt = 'n')
axis(1, at = x_vector, labels = label_x_vector)
(The help for plotting may be just about the messiest part of R-help but once you get used to looking at plot.default, axis, and par you'll start getting a better handle on things)
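As a concrete sketch of that two-step pattern, assume the timestamps are numeric seconds since 1970-01-01 UTC (the question does not say, so the data below are made up). The labels are built with format() and only every third one is drawn:

```r
# Hypothetical data: numeric timestamps, 5 minutes apart,
# starting at 2011-12-28 00:00:00 UTC
df <- data.frame(
  timeStamp = 1325030400 + seq(0, by = 300, length.out = 10),
  value     = (1:10)^2
)

# "hh:mm:ss" label for every timestamp
lab <- format(as.POSIXct(df$timeStamp, origin = "1970-01-01", tz = "UTC"),
              "%H:%M:%S")

idx <- seq(1, nrow(df), by = 3)  # label every third point

plot(df$timeStamp, df$value, xaxt = "n", xlab = "Time", ylab = "Value")
axis(1, at = df$timeStamp[idx], labels = lab[idx])
```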
The standard R plots are pretty good at doing what you want if you give them the correct information. If you can convert your timestamps to actual time objects (Date or POSIXct objects) then plot will tend to do the correct thing. Try the following examples:
tmp <- as.POSIXct( seq(0, length=10, by=60*5), origin='2011-12-28' )
tmp
plot( tmp, runif(10) )
tmp2 <- as.POSIXct( seq(0, length=10, by=60*60*5), origin='2011-12-28' )
tmp2
plot( tmp2, runif(10) )
tmp3 <- as.POSIXct( seq(0, length=10, by=60*60/2), origin='2011-12-28' )
tmp3
plot( tmp3, runif(10) )
In each case the tick labels are pretty meaningful, but if you would like a different format then you can follow @John's example and suppress the default axis, then use axis.POSIXct and specify what format you want.
The examples use equally spaced times (due to my laziness), but will work equally well for unequally spaced times.
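A sketch of that last suggestion, with the default axis suppressed and axis.POSIXct() given an explicit format. The UTC time zone, the seed and the choice of every third point as a tick are assumptions for illustration.

```r
set.seed(1)
# Ten times, 5 hours apart, starting at midnight UTC on 2011-12-28
tmp <- as.POSIXct("2011-12-28 00:00", tz = "UTC") +
  seq(0, by = 60 * 60 * 5, length.out = 10)

ticks <- tmp[seq(1, length(tmp), by = 3)]  # every third time gets a tick

plot(tmp, runif(10), xaxt = "n", xlab = "Time", ylab = "Value")
axis.POSIXct(1, at = ticks, format = "%d %b %H:%M")
```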
