Plot a histogram with densities past initialization - r

This code will create two plots.
a = c(0,1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,6,6,6,6,6,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9)
b = hist(a, freq=FALSE)
dev.new()
plot(b)
The first is a histogram of the density (which I want to have). But if one would like to plot b later on, it will always be plotted as frequency.
Is there any chance to plot the histogram as density past initialisation?

You just need to pass freq=FALSE to plot as well:
plot(b,freq=FALSE)
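This works because the object returned by hist() stores both the counts and the densities, regardless of which scale was drawn first. A minimal sketch to check this, reusing the data from the question (the name total_area is only illustrative and not part of the original answer):

```r
a <- c(0,1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,6,6,6,6,6,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9)

# compute the histogram without drawing it
b <- hist(a, plot = FALSE)

# both scales are stored in the object
b$counts
b$density

# on the density scale the bar areas sum to 1
total_area <- sum(b$density * diff(b$breaks))

# draw the stored histogram on the density scale
plot(b, freq = FALSE)
```

Since the scale is chosen at plot time, the same object can be drawn either way later on.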

Related

How to add frequency & percentage on the same histogram in R?

Consider the following data set.
x <- c(2,2,2,4,4,4,4,5,5,7,7,8,8,9,10,10,1,1,0,2,3,3,5,6)
hist(x, nclass=10)
I want to have a histogram where the x-axis indicates the intervals & the y-axis on the left represents the frequency. In addition to this, I need another y-axis on the right side of the histogram representing the percentage of the intervals on the same plot. Even though the following graph is for two variables, more or less it looks like what I need (taken from Histogram of two variables in R).
Thanks in advance!
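You can add a vertical axis to the right side with axis(4, at = at), where at are the points at which tick marks are to be drawn. The positions are given in the units of the left axis, so for a percentage axis, place the ticks at count values and label them with the matching percentages, for example h <- hist(x, nclass=10) followed by axis(4, at = h$counts, labels = round(100 * h$counts / sum(h$counts))).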
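A sketch of one way to get a frequency axis on the left and a percentage axis on the right, assuming the right-hand ticks should line up with the left-hand ones. The names count_ticks and pct_labels, the margin values, and the "Percent" label are illustrative choices, not part of the original answer:

```r
x <- c(2,2,2,4,4,4,4,5,5,7,7,8,8,9,10,10,1,1,0,2,3,3,5,6)

# widen the right margin so the second axis and its label fit
old_par <- par(mar = c(5, 4, 4, 5))

# left axis: frequency
h <- hist(x, nclass = 10)

# percentage of observations corresponding to each tick of the left axis
count_ticks <- axTicks(2)
pct_labels <- round(100 * count_ticks / sum(h$counts), 1)

# right axis: same tick positions, labelled in percent
axis(4, at = count_ticks, labels = pct_labels)
mtext("Percent", side = 4, line = 3)

# restore the previous margins
par(old_par)
```

Reusing the left axis tick positions keeps both scales aligned, so a bar can be read off against either axis.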

How to plot heatmap in Julia

I have a (21,100)-array and want to plot it as 2D-Histogram (heatmap).
If I plot it naively with histogram2d(A, nbins= 20) it only plots the first 21 points.
I tried to loop it, but then I had 100 histograms with 21 points.
Another idea would be to put the data in a (2100)-array but this seems like a bad idea.
Addition:
I have a scatter plot/data and want it shown as a heatmap. The more points in one bin the "darker" the color.
So I have 21 x-values each with 100 y-values.
Here is a typical heatmap plot:
using Plots
gr()
data = rand(21,100)
heatmap(1:size(data,1),
1:size(data,2), data',  # z must have size (length(y), length(x)), hence the transpose
c=cgrad([:blue, :white,:red, :yellow]),
xlabel="x values", ylabel="y values",
title="My title")

Make y-axis logarithmic in histogram using R [duplicate]

This question already has answers here:
Histogram with Logarithmic Scale and custom breaks
(7 answers)
Closed 5 years ago.
Hi, I'm making a histogram in R, but the values on the y-axis are so large that I need a logarithmic scale. My script is below:
hplot<-read.table("libl")
hplot
pdf("first_end")
hist(hplot$V1, breaks=24, xlim=c(0,250000000), ylim=c(0,2000000),main="first end mapping", xlab="Coordinates")
dev.off()
So how should I change my script?
thx
You can save the histogram data to tweak it before plotting:
set.seed(12345)
x = rnorm(1000)
hist.data = hist(x, plot=FALSE)
hist.data$counts = log10(hist.data$counts)
dev.new(width=4, height=4)
hist(x)
dev.new(width=4, height=4)
plot(hist.data, ylab='log10(Frequency)')
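One caveat with this approach is that an empty bin gives log10(0) = -Inf. A possible workaround, sketched here as an assumption rather than as part of the original answer, is to draw empty bins with zero height. Note that bins with exactly one count also have height zero on this scale, since log10(1) = 0. The extra outlier at 8 is added only to force some empty bins:

```r
set.seed(12345)
x <- c(rnorm(1000), 8)   # one far outlier so that some bins are empty

# compute the histogram without drawing it
h <- hist(x, plot = FALSE)
n_empty <- sum(h$counts == 0)

# log10 of an empty bin is -Inf; draw such bins with zero height instead
log_counts <- log10(h$counts)
log_counts[!is.finite(log_counts)] <- 0
h$counts <- log_counts

plot(h, ylab = "log10(Frequency)")
```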
Another option would be to use plot(density(hplot$V1), log="y").
It's not a histogram, but it shows just about the same information, and it avoids the illogical part where a bin with zero counts is not well-defined in log-space.
Of course, this is only relevant when your data is continuous and not when it's really categorical or ordinal.
A histogram with the y-axis on the log scale will be a rather odd histogram. Technically it will still fit the definition, but it could look rather misleading: the peaks will be flattened relative to the rest of the distribution.
Instead of using a log transformation, have you considered:
Dividing the counts by 1 million:
h <- hist(hplot$V1, plot=FALSE)
h$counts <- h$counts/1e6
plot(h)
Plotting the histogram as a density estimate:
hist(hplot$V1, freq=FALSE)
You can log your y-values for the plot and add a custom log y-axis afterwards.
Here is an example for a table object of random normal distribution numbers:
# data
count = table(round(rnorm(10000)*2))
# plot
plot(log(count) ,type="h", yaxt="n", xlab="position", ylab="log(count)")
# axis labels (0 is left out because log(0) is -Inf)
yAxis = c(1,10,100,1000)
# draw axis labels
axis(2, at=log(yAxis),labels=yAxis, las=2)

How to plot density of two datasets on same scale in one figure?

How to plot the density of a single column dataset as dots? For example
x <- c(1:40)
On the same plot, using the same x-axis and y-axis scales, how can I add the density of a second data set, given by
y = exp(-x)
as a line?
The equation is corrected to be y = exp(-x).
So, by doing plot(density(x)) or plot(density(y)), I get two separate figures. How can I draw them on the same axes, using dots for x and a smoothed line for y?
You can add a line to a plot with the lines() function. Your code, modified to do what you asked for, is the following:
x <- 1:40
y <- exp(-x)
plot(density(x), type = "p")
lines(density(y))
Note that we specified the plot to give us points with the type parameter and then added the density curve for y with lines. The help pages for ?plot, ?par, ?lines would be some insightful reading. Also, check out the R Graph Gallery to view some more sophisticated graphs that generally have the source code attached to them.
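Because the axis limits are fixed by the first plot call, the second curve can fall outside the plotting region. A sketch of one way around this, assuming it is acceptable to estimate both densities first and derive common limits from them (the names dx and dy and the title are illustrative):

```r
x <- 1:40
y <- exp(-x)

# estimate both densities before plotting
dx <- density(x)
dy <- density(y)

# limits that cover both curves
xlim <- range(dx$x, dy$x)
ylim <- range(dx$y, dy$y)

# points for x, a smooth line for y, both on the same axes
plot(dx, type = "p", xlim = xlim, ylim = ylim, main = "Densities of x and y")
lines(dy)
```

With these particular data the two densities differ in height by several orders of magnitude, so one of them will look almost flat on a shared linear scale.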

Problem with axis limits when plotting curve over histogram [duplicate]

This question already has an answer here:
How To Avoid Density Curve Getting Cut Off In Plot
(1 answer)
Closed 6 years ago.
newbie here. I have a script to create graphs that has a bit that goes something like this:
png(Test.png)
ht=hist(step[i],20)
curve(insert_function_here,add=TRUE)
I essentially want to plot the curve of a distribution over a histogram. My problem is that the axis limits are apparently set by the histogram instead of the curve, so the curve sometimes runs past the y-axis limits. I have played with par("usr"), to no avail. Is there any way to set the axis limits based on the maximum of either the histogram or the curve (or, alternatively, of the curve only)? In case it matters, this needs to be done within a for loop where multiple such graphs are plotted as a series of subplots (par("mfrow")).
Inspired by other answers, this is what I ended up doing:
curve(insert_function_here)
boundsc=par("usr")
ht=hist(A[,1],20,plot=FALSE)
par(usr=c(boundsc[1:2],0,max(boundsc[4],max(ht$counts))))
plot(ht,add=TRUE)
It fixes the bounds based on the highest of either the curve or the histogram.
You could compute mx <- max(curve_vector, ht$counts) and set ylim=c(0, mx), but I rather doubt the code looks like that, since [] is not a proper parameter-passing idiom and step is not an R plotting function but a model selection function. So I am guessing this is code in Matlab or some other language. In R, try this:
set.seed(123)
png("Test.png")
# offset the breaks by 0.1 so that each integer count falls inside a bin, not on a boundary
ht = hist(rpois(20,1), plot=FALSE, breaks=0:10-0.1)
k = round(ht$breaks) # integer values 0..10
lambda = weighted.mean(k[-length(k)], ht$counts) # sample mean recovered from the bin counts
plot(k, dpois(k, lambda), # plot the Poisson probabilities
ylim=c(0, max(ht$density)+.1) , type="l")
plot(ht, freq=FALSE, add=TRUE) # plot the histogram
dev.off()
You could plot the curve first, then compute the histogram with plot=FALSE, and use the plot function on the histogram object with add=TRUE to add it to the plot.
Even better would be to calculate the highest y-value of the curve (there may be shortcuts to do this depending on the nature of the curve) and the highest bar in the histogram, and give the larger of the two to the ylim argument when plotting the histogram.
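A sketch of that second suggestion, assuming the curve is a density so that the histogram should be drawn with freq=FALSE. Here dnorm stands in for the question's insert_function_here, and values is hypothetical data standing in for step[i]:

```r
set.seed(1)
values <- rnorm(200)   # hypothetical data, not from the question

# compute the histogram without drawing it
ht <- hist(values, breaks = 20, plot = FALSE)

# evaluate the curve on a grid spanning the histogram's x range
x_grid <- seq(min(ht$breaks), max(ht$breaks), length.out = 200)
curve_y <- dnorm(x_grid)

# upper y limit: the larger of the tallest bar and the curve's peak
y_max <- max(ht$density, curve_y)

# draw the histogram with room for the curve, then add the curve
plot(ht, freq = FALSE, ylim = c(0, y_max))
curve(dnorm(x), add = TRUE)
```

Since the limits are computed before anything is drawn, the same steps can be repeated inside a for loop with par(mfrow = ...) without touching par("usr").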
