R histogram: show the number of instances in each bin on the plot

I would like to see the number of instances for each bin show up on the graph as well
set.seed(1)
x <- rnorm(100)
hist(x)

Try this
set.seed(1)
x <- rnorm(100)
y <- hist(x, plot=FALSE)
plot(y, ylim=c(0, max(y$counts)+5))
text(y$mids, y$counts+3, y$counts, cex=0.75)
This draws the histogram with each bin's count printed just above its bar.
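A variant of the same idea, as a sketch: letting text() position the labels with pos = 3 avoids hand-tuning the vertical offset. The 10% headroom in ylim and the name h are arbitrary choices made here, not part of the answer above.

```r
set.seed(1)
x <- rnorm(100)

# Compute the bins without drawing anything yet
h <- hist(x, plot = FALSE)

# Leave about 10% headroom so the tallest bar's label is not clipped
plot(h, ylim = c(0, max(h$counts) * 1.1))

# pos = 3 places each label just above the point (bin midpoint, count)
text(h$mids, h$counts, labels = h$counts, pos = 3, cex = 0.75)
```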

Another, much simpler solution is to pass labels = TRUE to hist() itself. It prints the count on top of each bar in the histogram.
However, I would recommend always setting xlim and ylim explicitly for histogram plots, so that the labels are not clipped.
Code:
set.seed(1)
x <- rnorm(100)
hist(x, xlim = c(-3, 3), ylim = c(0, 30), labels = TRUE)

That happens automatically: the counts are shown on the y-axis on the left, which is labelled "Frequency".

Related

Making an R histogram plot from a saved hist() call

In R, one can do
x <- rnorm(100, 0, 1) # generate some fake data
hgram <- hist(x, plot=FALSE)
plot(hgram$mids, hgram$counts)
One can further specify a plot type, such as 'h' or 's'. However, these don't really come out looking like a proper histogram. How can one make a nice looking histogram this way?
I thought I'd add my input on making decent-looking histograms in R (using the "x" from your question).
Using Base R
# histogram with colors and labels
hist(x, main = "Histogram of Fake Data", xlab = "x (units of measure)", border = "blue", col = "green", prob = TRUE)
# add density
lines(density(x))
# add red line at 95th percentile
abline(v = quantile(x, .95), col = "red")
Using Plotly
install.packages("plotly")
library(plotly)
# basic Plotly histogram
plot_ly(x = x, type = "histogram")
The plotly result should open in a browser window with a variety of interactive controls. More plotly capabilities are available on their website at:
https://plot.ly/r/histograms/#normalized-histogram
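To address the saved hist() object from the question directly: the value returned by hist(x, plot = FALSE) has class "histogram", and calling plot() on the object itself (rather than on its mids and counts) draws proper bars. A minimal sketch; the titles and colour are placeholders.

```r
set.seed(1)
x <- rnorm(100, 0, 1)                # fake data, as in the question

# Compute the histogram without drawing it
hgram <- hist(x, plot = FALSE)

# plot() dispatches to the histogram method and draws real bars
plot(hgram, main = "Histogram of x", xlab = "x", col = "grey")

# The same object can be redrawn on the density scale
plot(hgram, freq = FALSE, main = "Histogram of x (density)", xlab = "x", col = "grey")
```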

R symbol background for custom symbols

I have an R plot where I use the values as symbols. The points also have error bars:
The problem is, obviously, that the error bars (I use arrows for that) cross through the numbers and that just looks ugly and makes them hard to read.
This is my code, any ideas?
x = c(45.58333, 89.83333, 114.03333,138.65000,161.50000,185.15000,191.50000)
y_mean = c(3.350000,6.450000,7.200000,7.033333,8.400000,7.083333,6.750000)
y_sd = c(0.1802776,0.1732051,0.2500000,0.2020726,0.3500000,0.2020726,0.1000000)
values = data.frame(x, y_mean, y_sd)
plot(values$x, values$y_mean, type="n")
arrows(values$x, values$y_mean - values$y_sd,
values$x, values$y_mean + values$y_sd,
length=0.05, angle=90,
code=3, col="red")
lines(values$x, values$y_mean, type="b",
pch=" ",
col="red", bg="white")
text(values$x, values$y_mean, label=round(values$y_mean), col="red")
EDIT:
I executed the exact code shown above as asked:
I would play with the horizontal justification and add small points to keep track of the original position
points(values$x, values$y_mean, pch=19, col="red", cex=0.5)
text(values$x, values$y_mean, label=round(values$y_mean), col="red", adj = -0.2)
One idea is to white out the plot content where the text will be drawn, before drawing the text. This can be done with rect(), although with this approach you risk whiting out the error bars entirely.
We can use strwidth() and strheight() to get the appropriate sizes for the whiteout rectangles.
x <- c(45.58333, 89.83333, 114.03333,138.65000,161.50000,185.15000,191.50000);
y_mean <- c(3.350000,6.450000,7.200000,7.033333,8.400000,7.083333,6.750000);
y_sd <- c(0.1802776,0.1732051,0.2500000,0.2020726,0.3500000,0.2020726,0.1000000);
xlim <- range(x);
ylim <- c(min(y_mean-y_sd),max(y_mean+y_sd));
plot(NA,xlim=xlim,ylim=ylim,xlab='x',ylab='y');
arrows(x,y_mean-y_sd,x,y_mean+y_sd,length=0.05,angle=90,code=3,col='red');
lines(x,y_mean,type='b',pch=' ',col='red',bg='white');
ls <- as.character(round(y_mean));
ex <- 0.4; ## whiteout expansion factor
lsw <- strwidth(ls); w <- lsw/2*(1+ex);
lsh <- strheight(ls); h <- lsh/2*(1+ex);
rect(x-w,y_mean-h,x+w,y_mean+h,col='white',border=NA);
text(x,y_mean,ls,col='red');
Just apply these changes:
plot(values$x, values$y_mean, type="n",
xlim = c(min(values$x), max(values$x) + 20),
ylim = c(min(values$y_mean)-1, max(values$y_mean)+1))
text(values$x, values$y_mean, label=round(values$y_mean), col="blue", pos = 3)

histogram for multiple variables in R

I want to make a histogram for multiple variables.
I used the following code :
set.seed(2)
dataOne <- runif(10)
dataTwo <- runif(10)
dataThree <- runif(10)
one <- hist(dataOne, plot=FALSE)
two <- hist(dataTwo, plot=FALSE)
three <- hist(dataThree, plot=FALSE)
plot(one, xlab="Beta Values", ylab="Frequency",
labels=TRUE, col="blue", xlim=c(0,1))
plot(two, col='green', add=TRUE)
plot(three, col='red', add=TRUE)
But the problem is that they cover each other, as shown below.
I just want them to be added to each other (showing the bars over each other) i.e. not overlapping/ not covering each other.
How can I do this ?
Try replacing your last three lines with:
plot(one, xlab = "Beta Values", ylab = "Frequency", col = "blue")
points(two$mids, two$counts, col = "green", pch = 19)
points(three$mids, three$counts, col = "red", pch = 19)
The first call has to be plot(). Calling plot() again without add=TRUE starts a new plot, which means you lose what was drawn first. Instead, add the other data sets to the existing plot, either as a scatter of bin counts with points() or as a line with lines().
It's not quite clear what you are looking for here.
One approach is to place the plots in separate plotting spaces:
par(mfcol = c(3, 1))
hist(dataOne, col="blue")
hist(dataTwo, col="green")
hist(dataThree, col="red")
par(mfcol = c(1, 1))
Is this what you're after?
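If the goal is bars that sit next to (or on top of) each other instead of covering each other, another option is to bin all three variables with the same breaks and hand the counts to barplot(). This is only a sketch: the bin width of 0.2 and the names breaks and counts are choices made here, not taken from the question.

```r
set.seed(2)
dataOne   <- runif(10)
dataTwo   <- runif(10)
dataThree <- runif(10)

# Shared bins, so that the counts of the three variables are comparable
breaks <- seq(0, 1, by = 0.2)

# One row of counts per variable, one column per bin
counts <- rbind(
  one   = hist(dataOne,   breaks = breaks, plot = FALSE)$counts,
  two   = hist(dataTwo,   breaks = breaks, plot = FALSE)$counts,
  three = hist(dataThree, breaks = breaks, plot = FALSE)$counts
)
colnames(counts) <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# beside = TRUE draws the bars side by side; beside = FALSE stacks them
barplot(counts, beside = TRUE, col = c("blue", "green", "red"),
        xlab = "Beta Values", ylab = "Frequency",
        legend.text = rownames(counts))
```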

how to break x-axis in a density plot

I want to plot a distribution together with a single value (drawn with abline) that is much smaller than the minimum value in my distribution, so the abline doesn't appear in the plot. How can I show both in the same plot, by manipulating the x-axis scale or maybe by inserting a break?
data <- rnorm(1000, -3500, 27)
estimate <- -80000
plot(density(data))
abline(v = estimate)
Here's a rough solution; it's not particularly pretty (note that it places the line at -8000 rather than the question's -80000):
library(plotrix)
d <- density(data)
gap.plot(c(-8000,d$x), c(0,d$y), gap=range(c(-7990,-3620)),
gap.axis="x", type="l", xlab="x", ylab="Density",
xtics=c(-8000,seq(-3600,-3300,by=100)))
abline(v=-8000, col="red", lwd=2)
Not exactly clear what is needed but this might be progress:
plot(density(data), xlim=range(c(data, estimate+10) ) )
abline(v = estimate, col='red')
The plotrix package also has functions for plotting with a broken axis.

Get plot() bounding box values

I'm generating numerous plots with xlim and ylim values that I'm calculating on a per-plot basis. I want to put my legend outside the plot area (just above the box around the actual plot), but I can't figure out how to get the maximum y-value of the box around my plot area.
Is there a method for even doing this? I can move the legend where I want it by manually changing the legend() x and y values, but this takes a LONG time for the amount of graphs I'm creating.
Thanks!
-JM
Here's a basic example illustrating what I think you're looking for using one of the code examples from ?legend.
#Construct some data and start the plot
x <- 0:64/64
y <- sin(3*pi*x)
plot(x, y, type="l", col="blue")
points(x, y, pch=21, bg="white")
#Grab the plotting region dimensions
rng <- par("usr")
#Call your legend with plot = FALSE to get its dimensions
lg <- legend(rng[1],rng[4], "sin(c x)", pch=21,
pt.bg="white", lty=1, col = "blue",plot = FALSE)
#Once you have the dimensions in lg, use them to adjust
# the legend position
#Note the use of xpd = NA to allow plotting outside plotting region
legend(rng[1],rng[4] + lg$rect$h, "sin(c x)", pch=21,
pt.bg="white", lty=1, col = "blue",plot = TRUE, xpd = NA)
The command par('usr') will return the coordinates of the bounding box, but you can also use the grconvertX and grconvertY functions. A simple example:
plot(1:10)
par(xpd=NA)
legend(par('usr')[1], par('usr')[4], yjust=0, legend='anything', pch=1)
legend( grconvertX(1, from='npc'), grconvertY(1, from='npc'), yjust=0,
xjust=1, legend='something', lty=1)
The oma, omd, and omi arguments of par() control the outer margins of the plot. They can be queried with par()$omd (and so on) and set, if needed, with par(oma = c(...)), where the vector holds four values, c(bottom, left, top, right); see ?par.
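For many plots with different axis limits, it may be simpler to avoid computing coordinates at all. The sketch below relies on the keyword and inset arguments of legend(): anchoring at "bottomleft" and shifting up by one full plot height (inset = c(0, 1)) should put the legend's bottom edge on the top border of the box. The names usr and lg are just illustrative.

```r
plot(1:10)

# par("usr") is c(x1, x2, y1, y2), so the top of the box is usr[4]
usr <- par("usr")

# Anchor at the bottom-left corner, then shift up by 100% of the plot height;
# xpd = NA allows drawing outside the plot region
lg <- legend("bottomleft", inset = c(0, 1), legend = "anything",
             pch = 1, xpd = NA)

# legend() invisibly returns its rectangle, which can be used to check placement
lg$rect$top - lg$rect$h   # bottom edge of the legend, equal to usr[4]
```

Because the position is given relative to the plot region, the same call can be reused unchanged after each plot().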
