How to add frequency & percentage on the same histogram in R?

Consider the following data set.
x <- c(2,2,2,4,4,4,4,5,5,7,7,8,8,9,10,10,1,1,0,2,3,3,5,6)
hist(x, nclass=10)
I want to have a histogram where the x-axis indicates the intervals & the y-axis on the left represents the frequency. In addition to this, I need another y-axis on the right side of the histogram representing the percentage of the intervals on the same plot. Even though the following graph is for two variables, more or less it looks like what I need (taken from Histogram of two variables in R).
Thanks in advance!

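You can add a vertical axis to the right side with axis(4, at = at), where at gives the points at which tick marks are drawn. These positions are read in the units of the existing y-axis (counts here), so passing hist(x,nclass=10)$density directly would crowd all the ticks near the bottom of the plot. For a percentage axis, put the ticks at count positions and supply the matching percentages through the labels argument, e.g. h <- hist(x,nclass=10); at <- pretty(h$counts); axis(4, at = at, labels = round(100*at/sum(h$counts), 1)).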
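A minimal sketch of one way to get both axes on the same plot, assuming base graphics and the data from the question. The axis titles, the margin values, and the helper names n and pct_ticks are illustrative choices, not part of the original answer.

```r
x <- c(2,2,2,4,4,4,4,5,5,7,7,8,8,9,10,10,1,1,0,2,3,3,5,6)

# Widen the right margin so the second axis and its title fit.
op <- par(mar = c(5, 4, 4, 5) + 0.1)

# Left axis: frequency (counts), which is what hist() draws by default here.
h <- hist(x, nclass = 10, ylab = "Frequency", main = "Frequency and percentage")

# Pick round percentage values, then convert them back to count units,
# because axis() places ticks in the coordinates of the existing plot.
n <- sum(h$counts)
pct_ticks <- pretty(c(0, 100 * max(h$counts) / n))
axis(4, at = pct_ticks * n / 100, labels = pct_ticks)
mtext("Percent", side = 4, line = 3)

par(op)
```

Ticks that fall above the top of the plotting region are simply not drawn, so the right axis stays aligned with the bars.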

Related

How to draw Probability Density Function using line plots in IDL?

I have a vector containing values and I would like to draw a probability density function (PDF) graph for the values contained. Let's say I have a vector given by b=[1,1,3,4,5,2,3,5,1,4,2,4,1,1,4,2,3,5]. I can draw a histogram as follows
cghistoplot, b, binsize=1, xtitle='values', ytitle='freq', /fill
However, I want to draw the pdf using a line plot, i.e. I want y values to be normalized by the number of values in the vector (18 here). I know I can use cgplot to make line plots, but it needs x- and y-values.
You can just use the HISTOGRAM built-in routine like the following:
hist = HISTOGRAM(b,BINSIZE=1,LOCATIONS=locs,MIN=0,MAX=6)
Then you can plot the results doing something like:
PLOT,locs,hist,PSYM=10,XTITLE='values',YTITLE='freq'
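This will show a quick and easy histogram form. If you want a line plot, just remove the PSYM keyword setting. To get the normalized values asked for, plot hist/FLOAT(N_ELEMENTS(b)) instead of hist; the FLOAT conversion avoids integer division.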

Plot a histogram with densities past initialization

This code will create two plots.
a = c(0,1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,6,6,6,6,6,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9)
b = hist(a, freq=FALSE)
dev.new()
plot(b)
The first is a histogram of the density (which I want to have). But if one would like to plot b later on, it will always be plotted as frequency.
Is there any chance to plot the histogram as density past initialisation?
You just need to change the argument in plot
plot(b,freq=FALSE)
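This works because the object returned by hist stores both the counts and the densities, and the freq argument of plot only selects which one is drawn. A small sketch, assuming the vector a from the question (the variable area is introduced here only for the check):

```r
a <- c(0,1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,6,6,6,6,6,
       7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9)

# Compute the histogram once, without drawing it.
b <- hist(a, plot = FALSE)

# Both representations live in the same object.
str(b[c("breaks", "counts", "density")])

# Densities integrate to 1 over the bins.
area <- sum(b$density * diff(b$breaks))

# Choose the representation at plotting time.
plot(b, freq = TRUE)   # frequencies
plot(b, freq = FALSE)  # densities
```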

R histogram with numbers under bars

I had some problems while trying to plot a histogram to show the frequency of every value while plotting the value as well. For example, suppose I use the following code:
x <- sample(1:10,1000,replace=T)
hist(x,label=TRUE)
The result is a plot with labels over the bar, but merging the frequencies of 1 and 2 in a single bar.
Apart from separating this bar into two, one for 1 and one for 2, I also need to put the values under each bar.
For example, with the code above the number 10 sits under the tick at the right edge of its bar, whereas I need the values plotted directly under the bars.
Is there any way to do both in a single histogram with hist function?
Thanks in advance!
Calling hist silently returns information you can use to modify the plot. You can pull out the midpoints and the heights and use that information to put the labels where you want them. You can use the pos argument in text to specify where the label should be in relation to the point (thanks @rawr)
x <- sample(1:10,1000,replace=TRUE)
## Histogram with one bin per value: (0,1], (1,2], ..., (9,10]
info <- hist(x, breaks = 0:10)
## Write each count just below the top of its bar (pos=1 means "below the point")
with(info, text(mids, counts, labels=counts, pos=1))
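To also get the values 1 to 10 centred under their bars, one option is to suppress the default x-axis and draw a new one at the bin midpoints. This is a sketch assuming base graphics; the seed and the use of labels = TRUE for the counts are illustrative choices.

```r
set.seed(42)  # arbitrary seed, only to make the example reproducible
x <- sample(1:10, 1000, replace = TRUE)

# One bin per integer value; xaxt = "n" suppresses the default x-axis,
# labels = TRUE prints the count above each bar.
info <- hist(x, breaks = 0:10, xaxt = "n", labels = TRUE)

# The bin (k-1, k] holds the value k and has midpoint k - 0.5,
# so labelling the midpoints with 1:10 centres each value under its bar.
axis(1, at = info$mids, labels = 1:10, tick = FALSE)
```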

Changing X-axis position in R barplot

I have two values of averages LR50<-c(424.8, 425.7). I want to plot these as a barplot barplot(LR50). Now, I don't need any of the information except from ylim=c(424,426.5). When I change the x-axis with axis(1,at=c(0,10), pos=424), part of the bars still falls below the axis.
How do I get the barplot to only plot from the new x-axis at y=424 and up?
You will need to set the xpd parameter to FALSE; this clips the bars to the plotting region.
barplot(LR50,ylim = c(424,426.5),xpd=FALSE)
axis(1,at=c(0,10),pos=424)

Problem with axis limits when plotting curve over histogram [duplicate]

Newbie here. I have a script to create graphs that has a bit that goes something like this:
png(Test.png)
ht=hist(step[i],20)
curve(insert_function_here,add=TRUE)
I essentially want to plot a curve of a distribution over a histogram. My problem is that the axis limits are apparently set by the histogram instead of the curve, so the curve sometimes goes beyond the Y axis limits. I have played with par("usr"), to no avail. Is there any way to set the axis limits based on the maximum values of either the histogram or the curve (or, alternatively, of the curve only)? In case this changes anything, this needs to be done within a for loop where multiple such graphs are plotted and within a series of subplots (par("mfrow")).
Inspired by other answers, this is what I ended up doing:
curve(insert_function_here)
boundsc=par("usr")
ht=hist(A[,1],20,plot=FALSE)
par(usr=c(boundsc[1:2],0,max(boundsc[4],max(ht$counts))))
plot(ht,add=TRUE)
It fixes the bounds based on the highest of either the curve or the histogram.
You could determine mx <- max(curve_vector, ht$counts) and set ylim=c(0, mx), but I rather doubt the code looks like that, since [] is not a proper parameter passing idiom and step is not an R plotting function, but rather a model selection function. So I am guessing this is code in Matlab or some other idiom. In R, try this:
set.seed(123)
png("Test.png")
ht=hist(rpois(20,1), plot=FALSE, breaks=0:10-0.1)
# better to offset to include discrete counts that would otherwise be at boundaries
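vals=round(ht$breaks[-length(ht$breaks)])  # integer value covered by each bin
lambda=weighted.mean(vals, ht$counts)      # sample mean, used as the Poisson rate
plot(round(ht$breaks), dpois(round(ht$breaks), lambda), # plot a Poisson density
     ylim=c(0, max(ht$density)+.1), type="l")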
plot(ht, freq=FALSE, add=TRUE) # plot the histogram
dev.off()
You could plot the curve first, then compute the histogram with plot=FALSE, and use the plot function on the histogram object with add=TRUE to add it to the plot.
Even better would be to calculate the highest y-value of the curve (there may be shortcuts to do this depending on the nature of the curve) and the highest bar in the histogram, and pass the larger of the two as the upper limit in the ylim argument when plotting the histogram.
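A sketch of that second approach, assuming a normal density as a stand-in for the curve and simulated data in place of the asker's. The names grid, curve_y and ymax are hypothetical.

```r
set.seed(1)
x <- rnorm(200)  # placeholder data; substitute your own vector

# Compute the histogram without drawing it.
ht <- hist(x, breaks = 20, plot = FALSE)

# Evaluate the curve on a grid spanning the histogram's range.
grid <- seq(min(ht$breaks), max(ht$breaks), length.out = 200)
curve_y <- dnorm(grid)  # stand-in for the real density function

# The upper y limit is the larger of the tallest bar and the curve's peak.
ymax <- max(ht$density, curve_y)

plot(ht, freq = FALSE, ylim = c(0, ymax))
lines(grid, curve_y)
```

Because the limit is computed per data set, the same lines can be repeated inside a for loop with par(mfrow = ...) and each panel gets its own suitable range.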
