Multiple Layers in ggplot2 - r

I want to overlay a plot of an empirical cdf with a cdf of a normal distribution. I can only get the code to work without using ggplot.
rnd_nv1 <- rnorm(1000, 1.5, 0.5)
plot(ecdf(rnd_nv1))
lines(seq(0, 3, by=.1), pnorm(seq(0, 3, by=.1), 1.5, 0.5), col=2)
For ggplot to work I would need a single data frame, for example one joining rnd_nv1 and pnorm(seq(0, 3, by=.1), 1.5, 0.5). This is a problem, because the function rnorm gives me just the function values without values on the domain. I don't even know how rnorm creates these; if I view the table I just see function values. But then again, magically, the plot of rnd_nv1 works.

The following plots the two lines but they overlap, since they are almost equal.
set.seed(1856)
x <- seq(0, 3, by = 0.1)
rnd_nv1 <- rnorm(1000, 1.5, 0.5)
dat <- data.frame(x = x, ecdf = ecdf(rnd_nv1)(x), norm = pnorm(x, 1.5, 0.5))
library(ggplot2)
long <- reshape2::melt(dat, id.vars = "x")
ggplot(long, aes(x = x, y = value, colour = variable)) +
geom_line()
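If the goal is only the overlay, here is a minimal sketch that skips the hand-built data frame and lets ggplot2 compute both curves: stat_ecdf() draws the empirical CDF from the raw sample and stat_function() evaluates pnorm over the plotted range. The column name value and the red colour are my own choices, not something from the question.

```r
library(ggplot2)

set.seed(1856)
rnd_nv1 <- rnorm(1000, 1.5, 0.5)

# ecdf() returns a step *function* built from the sample,
# so it can be evaluated at any point of the domain
Fn <- ecdf(rnd_nv1)

p <- ggplot(data.frame(value = rnd_nv1), aes(x = value)) +
  stat_ecdf(geom = "step") +                      # empirical CDF of the sample
  stat_function(fun = pnorm,                      # theoretical normal CDF
                args = list(mean = 1.5, sd = 0.5),
                colour = "red")
print(p)
```

This also explains the "magic" in the base version: rnorm() only returns random draws, and it is ecdf() that turns them into a function of x, which plot() then knows how to draw.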

Related

Adding observations as proportions on a horizontal barplot in R using text() function

I cannot figure out how to get the percentage of responses at the end of the bars. I know I'm missing something within the text() function, just not sure what exactly I'm missing. Thank you!
#Training/Specialty Barplot
trainbarplot <- barplot(table(PSR$training), horiz = TRUE,
main="Respondent Distribution of Training", cex.main = 1.1, font.main = 2,
cex.lab = 0.8, cex.names = 0.4, font.axis = 4, las = 2,
xlab="Response Frequency", xlim=c(0, 40), cex.axis = 0.8,
border="black",
col=rgb (0.1, 0.1, 0.4, 0.5, 0.6),
density=c(50,40,30) , angle=c(9,11,36)
)
text(trainbarplot, table(PSR$training) - 3,
labels=paste(round(proportions(table(PSR$training))*100, 0), "%"))
Generate data
I generated some sample data to replicate your problem. Please note that you should always try to provide an example dataset :)
set.seed(123)
df1 <- data.frame(x = rnorm(10, mean=10, sd=2), y = LETTERS[1:10])
Plot the data
Here's a plot that follows the same structure as your code:
bp <- barplot(df1$x, names.arg = df1$y, horiz = TRUE)
text(x= df1$x+0.5, y= bp, labels=paste0(round(df1$x),"%"), xpd=TRUE)
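The same idea applied to a frequency table, as in the question, might look like the sketch below. The training vector is made-up stand-in data (I don't have PSR), and the category names are hypothetical. The key point is that with horiz = TRUE the bar lengths go into x and the bar midpoints returned by barplot() go into y.

```r
set.seed(123)
# Hypothetical stand-in for PSR$training
training <- sample(c("Surgery", "Medicine", "Pediatrics"), 60, replace = TRUE)

counts <- table(training)
pct <- round(proportions(counts) * 100)   # proportions() needs R >= 4.0.0

# barplot() returns the midpoints of the bars
bp <- barplot(counts, horiz = TRUE, las = 1,
              xlim = c(0, max(counts) * 1.2),
              xlab = "Response Frequency")

# For horizontal bars: x = bar length, y = bar midpoint
text(x = as.numeric(counts), y = bp,
     labels = paste0(pct, "%"), pos = 4, xpd = TRUE)
```

pos = 4 places each label just to the right of the bar end; subtracting a small offset from x and dropping pos would put it inside the bar instead.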
Using ggplot2
You can also plot your data using ggplot2. For instance, you could first create a new column in your dataset with information on the labels...
df1$perc <- paste0(round(df1$x),"%")
Next, you can plot your data using ggplot and adding different relevant layers.
library(ggplot2)
ggplot(df1, aes(x = x, y = y)) +
geom_col() +
geom_text(aes(label = perc)) +
theme_minimal()
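By default geom_text() centres each label on the bar end, so it overlaps the bar. A possible tweak, assuming ggplot2 3.3.0 or later for expansion(), is to nudge the labels with hjust and leave some room on the right. The exact numbers here are just guesses that looked reasonable to me.

```r
library(ggplot2)

set.seed(123)
df1 <- data.frame(x = rnorm(10, mean = 10, sd = 2), y = LETTERS[1:10])
df1$perc <- paste0(round(df1$x), "%")

p <- ggplot(df1, aes(x = x, y = y)) +
  geom_col() +
  geom_text(aes(label = perc), hjust = -0.2) +             # label just past the bar end
  scale_x_continuous(expand = expansion(mult = c(0, 0.1))) + # extra space on the right
  theme_minimal()
print(p)
```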
Good luck!

How can I individually change the decimal place of a select few axis labels in ggplot?

I have a simple plot below. I log scaled the x-axis and I want the graph to show 0.1, 1, 10. I can't figure out how to override the default of 0.1, 1.0, 10.0.
Is there a way I could change only two of the x-axis labels?
library(ggplot2)
x <- c(0.1, 1, 10)
y <- c(1, 5, 10)
ggplot()+
geom_point(aes(x,y)) +
scale_x_log10()
You could specify both labels and breaks in scale_x_log10():
library(ggplot2)
x <- c(0.1, 1, 10)
y <- c(1, 5, 10)
ggplot() + geom_point(aes(x,y)) + scale_x_log10(labels = x, breaks = x)
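If the breaks are not the same as the data values, labels can also be a function that is applied to whatever breaks are used. A small sketch using as.character, which drops the trailing zeros:

```r
library(ggplot2)

x <- c(0.1, 1, 10)
y <- c(1, 5, 10)
brks <- c(0.1, 1, 10)

p <- ggplot(data.frame(x, y), aes(x, y)) +
  geom_point() +
  # labels is a function here: it receives the breaks and returns strings
  scale_x_log10(breaks = brks, labels = as.character)
print(p)

as.character(brks)
```

The last line shows the strings that end up on the axis.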

How to add to ggplot2 plot inside of for loop

I'm trying to plot multiple circles of different sizes on a plot using ggplot2's geom_point inside of a for loop. Every time I run it though, it plots all the circles, but all in the location of the last circle instead of in their respective locations as given by the data frame. Below is an example of the code I am running. I'm wondering how I would fix this or if there's a better way to get at what I'm trying to do here.
data <- data.frame("x" = c(0, 500, 1000, 1500, 2000),
"y" = c(1500, 500, 2000, 0, 1000),
"size" = c(3, 5, 1.5, 4.2, 2.6)
)
g <- ggplot(data = data, aes(x = x, y = y)) + xlim(0,2000) + ylim(0,2000)
for(i in 1:5) {
g <- g + geom_point(aes(x=data$x[i],y=data$y[i]), size = data$size[i], pch = 1)
}
print(g)
It's pretty rare to need a for-loop for a plot -- ggplot2 will take the whole dataframe and process it all without you needing to manage each row.
ggplot(data = data, aes(x = x, y = y, size = size)) +
geom_point(pch = 1)
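As for why the loop puts everything in one place: aes() captures its arguments unevaluated, so data$x[i] is only looked up when the plot is printed, by which time i is 5 for every layer. The size argument sits outside aes() and is evaluated immediately, which is why the sizes came out right. If a loop really is needed, one way around it (a sketch, not the only option) is to give each layer its own one-row data frame:

```r
library(ggplot2)

data <- data.frame(x = c(0, 500, 1000, 1500, 2000),
                   y = c(1500, 500, 2000, 0, 1000),
                   size = c(3, 5, 1.5, 4.2, 2.6))

g <- ggplot(data, aes(x = x, y = y)) + xlim(0, 2000) + ylim(0, 2000)
for (i in seq_len(nrow(data))) {
  # data[i, ] is evaluated now, so each layer keeps its own row
  g <- g + geom_point(data = data[i, ], size = data$size[i], pch = 1)
}
print(g)

# Loop-free version that uses the size column literally (no rescaling, no legend)
g2 <- ggplot(data, aes(x = x, y = y, size = size)) +
  geom_point(pch = 1) +
  scale_size_identity()
print(g2)
```

Without scale_size_identity(), mapping size inside aes() sends the values through the default size scale, so the drawn sizes are rescaled rather than taken literally.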

Plot two densities in the same figure in R

I have two densities, N(µ = 1, σ² = 1) and N(µ = −3.5, σ² = 3/4). I know I have to use plot() and lines(), but I am not sure how to convert the densities into functions. I am not even sure if that's what I have to do.
Any help would be appreciated. Thank you
You can use the dnorm() function, along with a sequence of numbers generated with seq() to get values to plot a pdf:
Get 5000 values between -10 and 10
x<-seq(-10,10,length=5000)
Calculate values - notice that dnorm() uses standard deviation and not the variance, so you need to take the square root of 0.75.
y<-dnorm(x,mean=1, sd=1)
z<-dnorm(x, mean = -3.5, sd = sqrt(0.75))
Plot first density in red with plot():
plot(x, y, type="l" , ylim = c(0,1), xlim = c(-8,8), col = "red")
Plot second one on top of first using the lines() function in blue:
lines(x,z, type = "l", col = "blue")
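If you literally want the densities as functions, a sketch with curve() does that without building x by hand. The names f1 and f2 and the axis limits are my own choices:

```r
# The two densities as plain R functions (sd is the square root of the variance)
f1 <- function(x) dnorm(x, mean = 1, sd = 1)
f2 <- function(x) dnorm(x, mean = -3.5, sd = sqrt(3 / 4))

# curve() evaluates the function on a grid between from and to
curve(f1, from = -8, to = 8, n = 500, col = "red",
      ylim = c(0, 0.5), ylab = "density")
curve(f2, add = TRUE, n = 500, col = "blue")   # add = TRUE draws on the same figure
legend("topright", legend = c("N(1, 1)", "N(-3.5, 3/4)"),
       col = c("red", "blue"), lty = 1)
```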
This code will plot the two densities in the same figure ;)
library(tidyverse)
seq(-10, 10, by = 0.1) %>%
tibble(x = .) %>%
mutate(D1 = dnorm(x, 1, 1),
D2 = dnorm(x, -3.5, sqrt(3/4))) %>%
gather(-x, key = Distribution, value = Value) %>%
ggplot(aes(x, Value)) +
geom_line(aes(color = Distribution))

How to use ggplot to plot probability densities?

I am looking for the ggplot way to plot a probability density function (or any function). I used to use the old plot() function in R to do this. For example, to plot a beta distribution with alpha=1 and beta=1 (uniform):
x <- seq(0,1,length=100)
db <- dbeta(x, 1, 1)
plot(x, db, type='l')
How can I do it in ggplot?
ggplot2 has a stat_function() function to superimpose a function on a plot in much the same way as curve() does. I struggled a little bit to get this to work without generating the data until I realised how to use the variables produced by the statistic, here ..y.. (the computed y value). The following is similar to what you would get with curve(dbeta(x, shape1 = 2, shape2 = 2), col = "red"):
require(ggplot2)
x <- seq(0, 1, len = 100)
p <- qplot(x, geom = "blank")
stat <- stat_function(aes(x = x, y = ..y..), fun = dbeta, colour="red", n = 100,
args = list(shape1 = 2, shape2 = 2))
p + stat
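As far as I know, recent ggplot2 releases (3.4.0 onwards) deprecate qplot() and the ..y.. notation, so here is a sketch of the same plot that avoids both. The two-row data frame is only there to set the x range; stat_function() fills in y itself.

```r
library(ggplot2)

# Only the range of x matters here; stat_function() evaluates fun on a grid over it
p <- ggplot(data.frame(x = c(0, 1)), aes(x = x)) +
  stat_function(fun = dbeta,
                args = list(shape1 = 2, shape2 = 2),
                colour = "red", n = 100)
print(p)
```

Changing args to list(shape1 = 1, shape2 = 1) gives the uniform case from the question.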
library(ggplot2)
x <- seq(0,1,length=100)
db <- dbeta(x, 1, 1)
You can use the qplot function within ggplot2 to make a quick plot
qplot(x, db, geom="line")
or you can add a geom_line layer to a ggplot
ggplot() + geom_line(aes(x,db))
