Plot multiple histograms in R with prob=TRUE - r

I want to plot multiple histograms in R which do not show frequency, but the density instead:
A <- rnorm(100)
B <- rnorm(100)
hist1 <- hist(A,prob=TRUE,breaks=30)
hist2 <- hist(B,prob=TRUE,breaks=30)
plot(hist1, col="red",lty=0, xlim=c(-4,4))
plot(hist2, col="blue", lty=0, xlim=c(-4,4), add=TRUE, main="Example")
lines(density(A))
However, my 'prob=TRUE' option apparently doesn't go through when plotting the objects. Can someone explain to me what I am doing wrong?

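Leave prob=TRUE out of the hist() calls. The returned histogram object stores both counts and densities, and plot() on that object picks its own scale (counts when the bins are equally wide), regardless of what was passed to hist():
hist1 <- hist(A, breaks=30)
hist2 <- hist(B, breaks=30)
Then pass freq=FALSE to the plot() calls instead. Note that main has to go into the first call, because titles are not drawn when add=TRUE:
plot(hist1, col="red", lty=0, xlim=c(-4,4), freq=FALSE, main="Example")
plot(hist2, col="blue", lty=0, xlim=c(-4,4), add=TRUE, freq=FALSE)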
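To see why the option has to move, it can help to inspect the histogram object directly. The following is a minimal sketch, not the original poster's code: the seed, the shared break points (brks) and the semi-transparent colours are assumptions added for illustration.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
A <- rnorm(100)
B <- rnorm(100)

# Compute the histograms without drawing them; shared breaks keep the bars aligned
brks <- seq(-5, 5, by = 0.25)
hA <- hist(A, breaks = brks, plot = FALSE)
hB <- hist(B, breaks = brks, plot = FALSE)

# Both scales are stored in the object, whatever was passed to hist()
area_A  <- sum(hA$density * diff(hA$breaks))  # density bars integrate to 1
total_A <- sum(hA$counts)                     # counts add up to the sample size

# freq = FALSE at plotting time selects the density scale
ymax <- max(hA$density, hB$density)
plot(hA, freq = FALSE, col = rgb(1, 0, 0, 0.5), border = NA,
     xlim = c(-4, 4), ylim = c(0, ymax), main = "Example")
plot(hB, freq = FALSE, col = rgb(0, 0, 1, 0.5), border = NA, add = TRUE)
lines(density(A))
```

Setting ylim from both density vectors keeps the taller histogram from being clipped when the second one is added.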

Related

Synchronize two plots from different datasets based on time in R [duplicate]

I would like to overlay two plots:
plot1
t1 <- c(0,1,2,3,4,5,6,7,8,9,10)
d1 <- c(0,2,4,6,8,10,12,14,16,18,20)
plot2
t2 <- c(0,1,2,3,4,5)
d2 <- c(1,3,7,8,8,8)
I tried
plot(d1~t1, col="black", type="l")
par(new=T)
plot(d2~t2, col="black", type="l")
But the problem is that, this way, the two x axes are drawn on top of each other, although x in plot1 runs from 0 to 10 and in plot2 only from 0 to 5.
You can use lines() for the second plot (instead of plot()). In addition, scale the x values of the second plot (t2) by a factor of 2 (I(2 * t2)).
plot(d1 ~ t1, col="black", type="l", xlim=c(0,10))
lines(d2 ~ I(2 * t2), col="black")
In this way, the x-range of the second curve is stretched to match the x-range of the first one. lines() draws into the existing coordinate system, so it takes no xlim of its own.
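The factor of 2 is specific to these two ranges. A more general sketch is shown here, assuming a hypothetical helper rescale_to() (not a base R function) that maps one range linearly onto another, plus a second axis on top so the original t2 values stay readable:

```r
t1 <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
d1 <- c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
t2 <- c(0, 1, 2, 3, 4, 5)
d2 <- c(1, 3, 7, 8, 8, 8)

# Hypothetical helper: linear map from the interval `from` onto the interval `to`
rescale_to <- function(x, from, to) {
  to[1] + (x - from[1]) * diff(to) / diff(from)
}

# Stretch t2 so that it spans the same x-range as t1
t2_scaled <- rescale_to(t2, range(t2), range(t1))

plot(t1, d1, type = "l", xlab = "t1", ylab = "d")
lines(t2_scaled, d2, lty = 2)

# Top axis: tick positions in t1 units, labels in the original t2 units
ticks <- pretty(t2)
axis(3, at = rescale_to(ticks, range(t2), range(t1)), labels = ticks)
```

For the data above this reproduces the 2 * t2 scaling from the answer.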

Kernel density scatter plot in R

I saw a beautiful plot and I'd like to recreate it. Here's an example showing what I've got so far:
# kernel density scatterplot
library(RColorBrewer)
library(MASS)
greyscale <- rev(brewer.pal(4, "Greys"))
x <- rnorm(20000, mean=5, sd=4.5); x <- x[x>0]
y <- x + rnorm(length(x), mean=.2, sd=.4)
z <- kde2d(x, y, n=100)
plot(x, y, pch=".", col="hotpink")
contour(z, drawlabels=FALSE, nlevels=4, col=greyscale, add=T)
abline(c(0,1), lty=1, lwd=2)
abline(lm(y~x), lty=2, lwd=2)
I'm struggling to fill the contours with colour. Is this a job for smoothScatter or another package? I suspect it might be down to my use of kde2d and, if so, can someone please explain this function or link me to a good tutorial?
Many thanks!
P.S. the final image should be greyscale
It seems like you want a filled contour rather than just a contour. Perhaps:
library(RColorBrewer)
library(MASS)
greyscale <-brewer.pal(5, "Greys")
x <- rnorm(20000, mean=5, sd=4.5); x <- x[x>0]
y <- x + rnorm(length(x), mean=.2, sd=.4)
z <- kde2d(x, y, n=100)
filled.contour(z, nlevels=4, col=greyscale, plot.axes = {
axis(1); axis(2)
#points(x, y, pch=".", col="hotpink")
abline(c(0,1), lty=1, lwd=2)
abline(lm(y~x), lty=2, lwd=2)
})
This draws the density estimate as filled greyscale bands, with both reference lines on top.
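Since the question also asks what kde2d does: it evaluates a two-dimensional kernel density estimate on a regular n-by-n grid and returns a list with the grid coordinates (x, y) and the matrix of density values (z). The sketch below is an alternative to filled.contour, not the answerer's code. It uses image() so that the points, contours and lines share one ordinary coordinate system. The seed, the smaller sample size and the grey.colors palette are assumptions chosen to keep the example quick and free of extra packages.

```r
library(MASS)                    # kde2d lives in MASS, which ships with R

set.seed(42)                     # arbitrary seed, only for reproducibility
x <- rnorm(2000, mean = 5, sd = 4.5); x <- x[x > 0]
y <- x + rnorm(length(x), mean = 0.2, sd = 0.4)

# Density estimate on a 50 x 50 grid: z$x and z$y are the grid coordinates,
# z$z is the matrix of estimated densities at those grid points
z <- kde2d(x, y, n = 50)

# Greyscale fill: light for low density, dark for high density
image(z, col = rev(grey.colors(12)), xlab = "x", ylab = "y")
contour(z, drawlabels = FALSE, nlevels = 4, add = TRUE)
abline(0, 1, lty = 1, lwd = 2)          # identity line
abline(lm(y ~ x), lty = 2, lwd = 2)     # fitted regression line
```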

How to plot a nice Lorenz Curve for factors in R (ggplot ?)

I need a nice plot for my thesis showing the distributions of several factors. Only the standard approach with the ineq package seemed flexible enough.
However, it doesn't let me put dots (see the comment in the code below) at the classes. It is important to see them, and ideally to label them individually. Is this possible?
Distr1 <- c( A=137, B=499, C=311, D=173, E=219, F=81)
Distr2 <- c( G=123, H=400, I=250, J=16)
Distr3 <- c( K=145, L=600, M=120)
library(ineq)
Distr1 <- Lc(Distr1, n = rep(1,length(Distr1)), plot =F)
Distr2 <- Lc(Distr2, n = rep(1,length(Distr2)), plot =F)
Distr3 <- Lc(Distr3, n = rep(1,length(Distr3)), plot =F)
plot(Distr1,
col="black",
#type="b", # !is not working
lty=1,
lwd=3,
main="Lorenz Curve for My Distributions"
)
lines(Distr2, lty=2, lwd=3)
lines(Distr3, lty=3, lwd=3)
legend("topleft",
c("Distr1", "Distr2", "Distr3"),
lty=c(1,2,3),
lwd=3)
This is how it looks now
In case you really want to use ggplot, here is a simple solution:
# Compute the Lorenz curve with Lc() from ineq
library(ineq)
library(ggplot2)
Distr1 <- c( A=100, B=900, C=230, D=160, E=190, F=40, G=5,H=30,J=60, K=500)
Distr1 <- Lc(Distr1, n = rep(1,length(Distr1)), plot =F)
# create a data.frame from the p and L components of the Lc object
Distr1_df <- data.frame(p = Distr1$p, L = Distr1$L)
# plot
ggplot(data=Distr1_df) +
geom_point(aes(x=p, y=L)) +
geom_line(aes(x=p, y=L), color="#990000") +
scale_x_continuous(name="Cumulative share of X", limits=c(0,1)) +
scale_y_continuous(name="Cumulative share of Y", limits=c(0,1)) +
geom_abline()
To show the problem, only Distr1 is needed; it's good to strip the example down before posting. Plotting the p and L components directly makes type="b" work:
library(ineq)
Distr1 <- c( A=137, B=499, C=311, D=173, E=219, F=81)
Distr1 <- Lc(Distr1, n = rep(1,length(Distr1)), plot =F)
plot(Distr1$p,Distr1$L,
col="black",
type="b", # it should be "b"
lty=1,
lwd=3,
main="Lorenz Curve for My Distributions"
)
As there is a package (gglorenz) handling Lorenz Curves automatically for ggplot, I add this:
library(ggplot2)
library(gglorenz)
Distr1 <- c( A=137, B=499, C=311, D=173, E=219, F=81)
x <- data.frame(Distr1)
ggplot(x, aes(Distr1)) +
stat_lorenz() +
geom_abline(color = "grey")
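None of the answers above labels the individual classes, which the question asks for. A minimal base R sketch follows. It computes the curve by hand (sort, then cumulative shares), which should correspond to what Lc() returns when every class has weight 1, and then labels each point with text(). The label placement (pos = 2) and the axis titles are assumptions.

```r
Distr1 <- c(A = 137, B = 499, C = 311, D = 173, E = 219, F = 81)

# Lorenz curve by hand: sort ascending, then take cumulative shares
x <- sort(Distr1)
p <- seq(0, 1, length.out = length(x) + 1)   # cumulative share of classes
L <- c(0, cumsum(x) / sum(x))                # cumulative share of the total

plot(p, L, type = "b", pch = 19, lwd = 2,
     xlab = "Cumulative share of classes",
     ylab = "Cumulative share of total",
     main = "Lorenz Curve for My Distributions")
abline(0, 1, col = "grey")                   # line of perfect equality

# Name each point after its class; the origin (p = 0) has no class
text(p[-1], L[-1], labels = names(x), pos = 2)
```

Further distributions could be added to the same plot with lines() and text() in the same way.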

histogram for multiple variables in R

I want to make a histogram for multiple variables.
I used the following code :
set.seed(2)
dataOne <- runif(10)
dataTwo <- runif(10)
dataThree <- runif(10)
one <- hist(dataOne, plot=FALSE)
two <- hist(dataTwo, plot=FALSE)
three <- hist(dataThree, plot=FALSE)
plot(one, xlab="Beta Values", ylab="Frequency",
labels=TRUE, col="blue", xlim=c(0,1))
plot(two, col='green', add=TRUE)
plot(three, col='red', add=TRUE)
But the problem is that they cover each other, as shown below.
I just want them to be added to each other (showing the bars over each other) i.e. not overlapping/ not covering each other.
How can I do this ?
Try replacing your last three lines with:
plot(one, xlab = "Beta Values", ylab = "Frequency", col = "blue")
points(two$mids, two$counts, col = 'green')
points(three$mids, three$counts, col = 'red')
You need to call plot() the first time. Calling plot() again starts a new plot (unless add=TRUE is given), which means you lose the first data set. Instead, add the other data sets to the existing plot, for example as a scatter chart with points(). points() needs x and y coordinates, so the bin midpoints (mids) and counts of each histogram object are passed here.
It's not quite clear what you are looking for here.
One approach is to place the plots in separate plotting spaces:
par("mfcol"=c(3, 1))
hist(dataOne, col="blue")
hist(dataTwo, col="green")
hist(dataThree, col="red")
par("mfcol"=c(1, 1))
Is this what you're after?
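If the goal is bars stacked on top of each other rather than drawn over each other, one possible approach (a sketch, not taken from the answers above) is to bin all three vectors with the same breaks and hand the resulting count matrix to barplot(). The break width of 0.2 and the bin labels are assumptions.

```r
set.seed(2)
dataOne   <- runif(10)
dataTwo   <- runif(10)
dataThree <- runif(10)

# Common breaks so that the bins of all three data sets line up
brks <- seq(0, 1, by = 0.2)

# One row of counts per data set, one column per bin
counts <- rbind(
  one   = hist(dataOne,   breaks = brks, plot = FALSE)$counts,
  two   = hist(dataTwo,   breaks = brks, plot = FALSE)$counts,
  three = hist(dataThree, breaks = brks, plot = FALSE)$counts
)
colnames(counts) <- paste(head(brks, -1), tail(brks, -1), sep = "-")

# beside = FALSE stacks the bars; beside = TRUE would place them side by side
barplot(counts, beside = FALSE, col = c("blue", "green", "red"),
        xlab = "Beta Values", ylab = "Frequency",
        legend.text = rownames(counts))
```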
