Plot step-wise decrease in R with Categorical value on X-axis - r

I want to use a step plot to illustrate a process of elimination. I have a data frame containing the number of candidates remaining after each step; it looks like this:
Step Candidates Count
1 26587
2 1761
3 849
4 130
The Step column is a categorical variable and I need to represent it with the names of the actual steps; I am using numbers because I have not been able to plot when the Step column contains text.
I was able to produce the following figure with the command
plot(df, type = "s")
The problem is the X axis: I need to either get rid of the decimals and add a legend to name each step or, preferably, figure out some way to put the names of the steps in the Step column and populate the axis automatically.
I also want to show the same graph as a log but when I use:
plot(log(df), type = "s")
R gives me log values for both columns. This wouldn't be a problem if I could figure out how to plot the data with Step as a categorical variable but I just cannot figure out how.
My instinct is that this is a fairly simple problem but I've been struggling for most of this morning.

plot(df, type = "s", xaxt='n', log="y")
axis(1, at=1:4, labels=paste("step", 1:4))
Use
xaxt='n' to suppress the x-axis ticks and labels
log="y" to put the y-axis on a log scale
axis() to add the x-axis back, with the labels argument giving the text to draw at the positions passed in at
You may also want to tweak the labels on the y-axis.
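To populate the axis from a text Step column, one option is to plot against the row positions and take the labels from the column. A minimal sketch; the step names below are hypothetical, since the question only shows the counts:

```r
# Hypothetical step names; only the counts come from the question.
df <- data.frame(
  Step = c("Screened", "Eligible", "Interviewed", "Selected"),
  Candidates = c(26587, 1761, 849, 130),
  stringsAsFactors = FALSE
)

# Plot against the row positions 1..n, because type = "s" needs numeric x values.
pos <- seq_len(nrow(df))
plot(pos, df$Candidates, type = "s", log = "y", xaxt = "n",
     xlab = "Step", ylab = "Candidates remaining (log scale)")

# Draw the x-axis with one tick per step, labelled from the Step column.
axis(1, at = pos, labels = df$Step)
```

Only the y-axis is on a log scale here, so the step positions stay evenly spaced.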

Related

Forecast plot with x axis labels as date

I have a dataset like revenue and date.
I used arima to plot the data.
ts_data = ts(dataset$Revenue,frequency = 7)
arima.ts = auto.arima(ts_data)
pred = forecast(arima.ts,h=30)
plot(pred,xaxt="n")
When I plot the data, it produces plot like below.
My expectations are below,
I need to display values in Million for predicted values like 13.1M.
I need to show x-axis as date instead of data points numbers.
I tried several links but couldn't crack it. Below are the experiments I made:
I tried setting the start date and end date in ts_data, but that doesn't work either. My start date is "2019-09-27" and my end date is "2020-07-02".
I tried axis_date in the plot function, but that doesn't work either.
Please help me to crack this.
Thanks a lot.
You can specify axis tick values and labels with axis()
plot(pred,xaxt="n", yaxt="n") # deactivate x and y axis
yl<-seq(-5000000,15000000,by=5000000) # position of y-axis ticks
axis(2, at=yl, labels=paste(yl/1000000, "M")) # 2 = left axis
You can specify the desired positions of the y-axis ticks with at and the text to show at each of them with labels. To obtain values like 10 M I have used the function paste to join the numbers with the letter M.
You could use the same method for the x-axis, even though more efficient methods probably exist. Not having the specific data available, I have used generic spacing and labels only to give you the idea. Once you have set the desired positions you can generate the sequence of dates associated with them (to generate a sequence of dates see https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/seq.Date)
axis(1, at=seq(1,40,by=1), labels=seq(as.Date("2019-09-27"),as.Date("2020-07-02"),by="week")) # 1 = below axis
You can also change the format of the displayed dates with format(), for example labels=format(vector_of_date, "%Y-%b-%d") to get year-month (abbreviated name)-day.
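The same idea in a self-contained form, using base graphics only. This is a sketch under stated assumptions: the series is made-up placeholder data standing in for the forecast, and the weekly spacing from the asker's start date is assumed. paste0 is used so that the labels read like 13.1M without a space.

```r
# Placeholder series (made-up values) standing in for the forecast output.
start_date <- as.Date("2019-09-27")
y <- seq(8e6, 13.1e6, length.out = 40)

# Assumed: one observation per week, starting at start_date.
dates <- seq(start_date, by = "week", length.out = length(y))

# Draw the line without the default axes.
plot(seq_along(y), y, type = "l", xaxt = "n", yaxt = "n",
     xlab = "", ylab = "Revenue")

# y-axis: tick positions in raw units, labels in millions.
yl <- pretty(y)
axis(2, at = yl, labels = paste0(yl / 1e6, "M"), las = 1)

# x-axis: label every 8th point with its formatted date.
idx <- seq(1, length(y), by = 8)
axis(1, at = idx, labels = format(dates[idx], "%Y-%b-%d"), las = 2)
```

Labelling only a subset of the positions keeps the dates from overlapping.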

Change axis in R with different number of datas

I want to change the x-axis in my plot, but it doesn't work properly with axis(). The data in the plot are daily and I want to show only years. I hope someone understands me and can find a solution. This is how it looks now: (image omitted) and this is how it looks with the code axis(1, at = seq(1800, 1975, by = 25), las=2): (image omitted)
Without a reproducible example it is not easy to tell what the problem is, so here is a "quick and dirty" approach.
High-level plots are composed of elements that can each be drawn separately, so using separate drawing commands gives finer control over the plot.
In practice, the first thing to do is plot "nothing".
> plot(x, y, type = "n", xlab = "", ylab = "", axes = FALSE)
type = "n" causes the data not to be drawn. axes = FALSE suppresses the axes and the box around the plot. Even so, the plotting region is set up and ready to show the data.
The main benefit is that the plotting area is now correctly dimensioned. Now try adding the desired x axis as you did before.
> points(x, y) # Plots the data in the area
> axis(1) # Plots the x axis; pass at = and labels = for your own scale
> title() # Plots the desired titles
> box() # Draws the box surrounding the plot
EDITED based on comment by @scoa
As a quick and dirty solution, you can simply enter the following line after your plot() line:
# This reads as, on axis x (1), anchored at the first (day) value of 0
# and last (day) value of 63917 with 9131 day year increments (by)
# and labels (las) perpendicular (2) to axis (for readability)
# EDITED: and AT the anchor locations, put the labels
# 1800 (year) to 1975 (year) in 25 (year) increments
axis (1, at = seq(0, 63917, by = 9131), las=2, labels=seq(1800, 1975, by=25));
For other parameters, check out ?axis. As @scoa mentioned, this is approximate. I have used 365.25 as a day-to-year conversion, which is not quite right, but it should suffice for visual accuracy at the scale you have provided. If you need a precise conversion from days to years, you need to operate on your original data set before plotting.
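If the dates themselves are available, the tick positions can be computed from them instead of from a fixed days-per-year factor. A minimal sketch, assuming (this is not stated in the question) that the series is daily and that day 0 is 1800-01-01; the sine curve is placeholder data:

```r
# Assumption: the first observation (day 0) is 1 January 1800.
origin <- as.Date("1800-01-01")

# Years to label, and the day index of 1 January in each of them.
years <- seq(1800, 1975, by = 25)
tick_dates <- as.Date(paste0(years, "-01-01"))
at_days <- as.numeric(tick_dates - origin)

# Placeholder daily series covering the same span.
x <- 0:max(at_days)
plot(x, sin(x / 3000), type = "l", xaxt = "n", xlab = "", ylab = "")

# Put the year labels at the computed day positions.
axis(1, at = at_days, labels = years, las = 2)
```

Date arithmetic accounts for leap years, so the positions stay correct for other start dates and intervals as well.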

R Barplot: Re-Order X-Axis Labels

I have created a barplot that displays the number of incidents per hour. However, it looks like R sorts the x-axis labels as characters and not as numbers. I would like to have my x-axis ordered as 0, 1, 2, 3, 4, 5, ..., 23. Is there any way I can explicitly specify the order?
Update: More generally, if I want a specific order, what do I need to do? I have not found any option in the documentation of barplot() that allows the order of the labels to be specified explicitly.
# 5. Bar Plot for the Time (Hour) of Incidents
barplot(xtabs(~sample$Time),
main="Number of Incidents per Time Slot",
ylab="Frequency",
col=rainbow(nlevels(as.factor(sample$Time))),
las=2,
# cex.lab=0.50 This is for the x-axis Label,
cex.names = 0.80
)
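One possible approach, sketched here with made-up data because the original sample data frame is not available, is to turn the column into a factor with explicitly ordered levels before tabulating; barplot() then draws the bars in level order:

```r
# Made-up hours stored as text, which would sort as "0", "1", "10", "11", ...
set.seed(1)
time_chr <- as.character(sample(0:23, 500, replace = TRUE))

# Fix the order by declaring the levels explicitly.
time_fac <- factor(time_chr, levels = as.character(0:23))

# table() keeps the level order, and barplot() draws the bars in that order.
counts <- table(time_fac)
barplot(counts,
        main = "Number of Incidents per Time Slot",
        ylab = "Frequency",
        col = rainbow(nlevels(time_fac)),
        las = 2,
        cex.names = 0.80)
```

The same pattern works for any custom order: list the levels in the order the bars should appear.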

R histogram with numbers under bars

I had some problems while trying to plot a histogram to show the frequency of every value while plotting the value as well. For example, suppose I use the following code:
x <- sample(1:10,1000,replace=T)
hist(x,label=TRUE)
The result is a plot with labels over the bars, but with the frequencies of 1 and 2 merged into a single bar.
Apart from separating this bar into two bars for 1 and 2, I also need to put the values under each bar.
For example, with the code above the number 10 sits under the tick at the right edge of its bar, and I need the values plotted right under the bars.
Is there any way to do both in a single histogram with hist function?
Thanks in advance!
Calling hist silently returns information you can use to modify the plot. You can pull out the midpoints and the heights and use that information to put the labels where you want them. You can use the pos argument in text to specify where the label should be in relation to the point (thanks @rawr).
x <- sample(1:10,1000,replace=T)
## Histogram
info <- hist(x, breaks = 0:10)
with(info, text(mids, counts, labels=counts, pos=1))
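To also get each value printed under its own bar, the returned midpoints can be reused as tick positions. A sketch that swaps in half-integer breaks (an assumption, not part of the answer above) so that every bar is centred on its value:

```r
set.seed(42)
x <- sample(1:10, 1000, replace = TRUE)

# Breaks at 0.5, 1.5, ..., 10.5 centre each bar on an integer value.
# xaxt = "n" suppresses the default x-axis; labels = TRUE prints the counts.
info <- hist(x, breaks = seq(0.5, 10.5, by = 1), xaxt = "n", labels = TRUE,
             main = "Histogram of x")

# Write the values under the bars, at the bar midpoints.
axis(1, at = info$mids, labels = info$mids, tick = FALSE)
```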

Change axis labels with matplot in R

I'm trying to change the x axis in a matplot, but this command doesn't work:
TimePoints=1997:2011
matplot(t(DataMatrix),type='l',col="black",lwd=1,xlab="Anni",ylab="Rifiuti",main="Produzione rifiuti")
axis(side=1,at=TimePoints,labels=TimePoints)
With plot I used this without problems. How can I fix it?
Here you can find the objects: https://dl.dropboxusercontent.com/u/47720440/SOF.RData
I usually do this as follows:
Omit the axes altogether.
Add the axes with desired options one by one.
In R:
# Add argument axes=F to omit the axes
matplot(t(DataMatrix),type='l',col="black",lwd=1,xlab="Anni",ylab="Rifiuti",main="Produzione rifiuti",axes=F)
# Add Y-axis as is
axis(2)
# Add X-axis
# Note that your X-axis range is not in years but in the "column numbers",
# i.e. the X-axis range runs from 1 to 15 (the number of columns in your matrix)
# Possibly that's why your original code example did not work as expected?
axis(side=1,at=1:ncol(DataMatrix),labels=TimePoints)
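A different route to the same result is to pass the years as the x argument of matplot(), so the default axis is already in years. A minimal sketch with placeholder data, since the linked SOF.RData file is not reproduced here; the matrix shape (one row per series, one column per year) is assumed from the answer above:

```r
# Placeholder data: 5 series (rows) observed in 15 years (columns).
TimePoints <- 1997:2011
set.seed(1)
DataMatrix <- matrix(runif(5 * length(TimePoints), 300, 600), nrow = 5)

# Passing the years as x puts the x-axis on the year scale directly.
matplot(TimePoints, t(DataMatrix), type = "l", col = "black", lty = 1, lwd = 1,
        xlab = "Anni", ylab = "Rifiuti", main = "Produzione rifiuti")
```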
