Create proportion bar plot with dodge position - r

I am trying to create the following plot but with proportion on the y axis.
library(ggplot2)
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")
But when I add y = ..prop.., the proportions are not grouped by clarity. I have tried the following:
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..prop.., fill = clarity), position = "dodge")

To calculate a proportion (or relative frequency) you can use ..count.. (the proportion is each bar's count divided by the sum of all counts):
library(ggplot2)
ggplot(diamonds, aes(cut, (..count..) / sum(..count..), fill = clarity)) +
geom_bar(position = "dodge")
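Note that sum(..count..) runs over every bar in the plot, so the code above gives each bar's share of all diamonds, not its share within a cut. The sketch below is one way to check that and, if within-cut proportions are what you are after, to compute them before plotting. It assumes a ggplot2 version that provides after_stat() (the newer spelling of the ..count.. notation); the names p_total, within_cut and p_within are made up for illustration.

```r
library(ggplot2)

# Share of ALL diamonds: sum(count) runs over every bar in the layer.
p_total <- ggplot(diamonds, aes(cut, after_stat(count / sum(count)), fill = clarity)) +
  geom_bar(position = "dodge") +
  labs(y = "proportion of all diamonds")

# The bar heights ggplot2 computed for that layer add up to 1.
total_height <- sum(layer_data(p_total)$y)

# Share WITHIN each cut: normalise the cut-by-clarity table row-wise
# (margin = 1) before plotting, then draw the values as given.
within_cut <- as.data.frame(
  prop.table(table(cut = diamonds$cut, clarity = diamonds$clarity), margin = 1)
)
p_within <- ggplot(within_cut, aes(cut, Freq, fill = clarity)) +
  geom_col(position = "dodge") +
  labs(y = "proportion within cut")
```

Printing p_total or p_within draws the respective plot. In p_within the bars of each cut sum to 1.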

Related

ggplot: adding a frequency plot over a percentage plot

I am interested in doing a plot showing percentages by group.
something like this:
data(iris)
ggplot(iris,
aes(x = Sepal.Length, group = factor(Species), fill = factor(Species))) +
geom_histogram(position = "fill")+theme_bw()
however, I would also like to plot a histogram showing the frequency distribution on top of this graph.
something like the plot below.
ggplot(iris,aes(x = Sepal.Length)) +
geom_histogram()+theme_bw()
Does anyone know how to do this?
Note I know how to do a frequency plot by group: ggplot(iris,aes(x = Sepal.Length, group = factor(Species), fill = factor(Species))) + geom_histogram()+theme_bw(). But this is not what I want. Rather I would like a small frequency distribution at the bottom of the percentage plot presented at the beginning.
Thank you very much
Something like this?
library(gridExtra)
p1 <- ggplot(iris,
aes(x = Sepal.Length,
group = factor(Species),
fill = factor(Species))) +
geom_histogram(position = "fill") +
theme_bw() +
theme(legend.position = "top")
p2 <- ggplot(iris,aes(x = Sepal.Length,
group = factor(Species),
fill = factor(Species))) +
geom_histogram() +
theme_bw() +
theme(legend.position = "none")
grid.arrange(p1, p2,
heights = c(4, 1.5))
Edit: So you are looking for this then? Note that in this case the absolute values of the smaller histogram become meaningless since they were scaled down to be ~25% of the vertical chart range.
ggplot() +
geom_histogram(data = iris,
aes(x = Sepal.Length,
group = factor(Species),
fill = factor(Species)),
position = "fill",
alpha = 1) +
geom_histogram(data = iris,
aes(x = Sepal.Length,
y = ..ncount.. / 4),
alpha = 0.5,
fill = 'black')
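To see what the division by 4 does: ..ncount.. is the bin count rescaled so that the tallest bin equals 1, so ncount / 4 caps the overlay at a quarter of the 0 to 1 axis used by position = "fill". A minimal sketch of the same plot, assuming a ggplot2 version with after_stat(); bins = 30 is set explicitly here only to silence the default-bins message, and overlay_max is an illustrative name:

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length)) +
  # Layer 1: share of each species per bin (bars fill the 0-1 range).
  geom_histogram(aes(fill = Species), position = "fill", bins = 30) +
  # Layer 2: overall frequency shape, squeezed into the bottom quarter.
  geom_histogram(aes(y = after_stat(ncount / 4)), bins = 30,
                 alpha = 0.5, fill = "black")

# Tallest bar of the overlay layer, in y-axis units.
overlay_max <- max(layer_data(p, 2)$y)
```

Changing the divisor changes how much of the vertical range the overlay occupies.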

geom_bar and geom_point in the same ggplot and within the same groups

I have the current code
ggplot(data = niveles[niveles$departamento=="CUNDINAMARCA" &
niveles$prueba=="MATEMÁTICAS" &
!is.na(niveles$nivel),]) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = año, y = desempeño, fill = nivel)) +
geom_point(data = niveles[niveles$prueba=="MATEMÁTICAS" &
niveles$departamento=="COLOMBIA" &
!is.na(niveles$nivel),], shape = 24,
aes(x = año, y = desempeño, group = nivel, fill = "blue"))
which gives me the following plot:
However, I was hoping to get each one of the "points" within its corresponding category of the "niveles" variable. Does anyone know how I can do that?
You can dodge points the same way you dodge bars, using position=position_dodge(). However, you need to add a width argument specifying how much to "dodge". A value of 1 should correspond with the dodged bars. You also have an unknown "blue" category in the legend. That's because fill = "blue" sits inside aes(), where it is treated as a data value to be mapped; to set a constant colour, the fill argument should appear outside the aesthetic (aes).
I also think you should subset the data first instead of doing all that within the ggplot command.
An alternative is to facet by department (see option 2 below).
But first to dodge the points.
Option 1: Subsetting
Create a subset for prueba and non-missing for nivel:
MATH <- niveles[niveles$prueba=="MATEMÁTICAS" & !is.na(niveles$nivel),]
Create subsets for each department:
CUNDINAMARCA <- MATH[MATH$departamento=="CUNDINAMARCA",]
COLOMBIA <- MATH[MATH$departamento=="COLOMBIA",]
Then make your graph:
ggplot(data = CUNDINAMARCA) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = año, y = desempeño, fill = nivel)) +
geom_point(data = COLOMBIA, shape = 24,
position = position_dodge(width=1), # You need this to align points with bars
aes(x = año, y = desempeño, group = nivel), fill = "blue")
I can't test it on your data but I used the mtcars dataset as an example.
library(dplyr) # provides %>% and mutate()
mtcars <- mtcars %>%
mutate(gear=factor(gear), cyl=factor(cyl))
VS0 <- mtcars[mtcars$vs==0,]
VS1 <- mtcars[mtcars$vs==1,]
ggplot() +
geom_bar(data = VS0, stat="identity", position = position_dodge(),
aes(x = cyl, y = mpg, fill = gear)) +
geom_point(data = VS1, shape = 24,
position = position_dodge(width=1),
aes(x = cyl, y = mpg, group = gear), fill = "blue")
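A side note on the width value: geom_bar() and geom_col() draw bars 0.9 units wide by default, so position_dodge(width = 0.9) puts the points exactly on the bar centres, while width = 1 spreads them slightly wider. A small sketch on made-up data (the data frame toy and its columns are hypothetical) that compares the x positions ggplot2 computes for the two layers:

```r
library(ggplot2)

# Hypothetical data: two categories on the x axis, two groups in each.
toy <- data.frame(
  year  = factor(c(2015, 2015, 2016, 2016)),
  level = factor(c("low", "high", "low", "high")),
  score = c(10, 20, 15, 25)
)

p <- ggplot(toy, aes(x = year, y = score, group = level)) +
  geom_col(aes(fill = level), position = position_dodge()) +
  # width = 0.9 matches the default bar width, so points sit on bar centres.
  geom_point(shape = 24, fill = "blue", position = position_dodge(width = 0.9))

# Horizontal positions of bar centres (layer 1) and points (layer 2).
bar_x   <- sort(as.numeric(layer_data(p, 1)$x))
point_x <- sort(as.numeric(layer_data(p, 2)$x))
```

If you change the bar width (for example geom_col(width = 0.6)), the dodge width for the points would need to change with it.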
Option 2: Facetting
ggplot(data = mtcars) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = cyl, y = mpg, fill = gear)) +
facet_grid(~vs, labeller=label_both)
For your data, maybe this would work:
DATA <- MATH[MATH$departamento %in% c("CUNDINAMARCA","COLOMBIA"),]
ggplot(data = DATA) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = año, y = desempeño, fill = nivel)) +
facet_grid(~departamento, labeller=label_both)

Add Legend and change Colour on grouped bar plot

I created a plot like this:
library("ggplot2")
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = color, y = ..prop.., group = 2)) +
scale_y_continuous(labels=scales::percent) +
facet_grid(~cut)
Now I want to add a legend for the variable "color", and I also want to change the colour of the bars. The graph is exactly how I want it to be, and if possible I don't want to change the structure of the dataset, just add a legend and change colours.
I could not find an example that fits this "percentage"-style graphic.
ggplot(data = diamonds, aes(x = color, y = ..prop.., group = cut)) +
geom_bar(aes(fill = factor(..x.., labels = LETTERS[seq(from = 4, to = 10 )]))) +
labs(fill = "color") +
scale_y_continuous(labels = scales::percent) +
facet_grid(~ cut)
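Why this works: group = cut makes ..prop.. be computed separately for each cut, which matches the facets, and mapping fill to the computed x position (turned into a factor) gives every bar its own legend entry. A sketch of the same idea in the after_stat() notation, assuming a recent ggplot2 version. It takes the legend labels from levels(diamonds$color) instead of typing out LETTERS, and scale_fill_brewer() with the "Set2" palette is just one possible way to change the colours:

```r
library(ggplot2)

p <- ggplot(diamonds, aes(x = color, y = after_stat(prop), group = cut)) +
  # x is the integer position of each bar; as a factor it yields one
  # legend key per colour grade (D to J).
  geom_bar(aes(fill = after_stat(factor(x, labels = levels(diamonds$color))))) +
  scale_fill_brewer(palette = "Set2", name = "color") +
  scale_y_continuous(labels = scales::percent) +
  facet_grid(~ cut)

# Within every facet the bar heights add up to 1 (100%).
bars <- layer_data(p)
panel_totals <- tapply(bars$y, bars$PANEL, sum)
```

For hand-picked colours, scale_fill_manual(values = ...) with seven colour values could be used in place of scale_fill_brewer().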

ggplot dodge bar not displaying when doing log coordinate transform

I’m having problems with ggplot using coordinate transformations with a log10 scale. I wish to plot my data on a log10 axis, but without scaling the data itself. This works with sqrt, but with a log10 coordinate transformation no bars appear. Can you tell me what I’m missing?
d <- data.frame(x=factor(c(1,1,2,2)), y=c(1,2,3,4), fill=factor(c(1,2,3,4)))
#sqrt axis transformation works
ggplot(d, aes(x = x, fill = fill)) +
geom_bar(aes(y = y), stat = "identity", position = "dodge") +
coord_trans(y = "sqrt")
#log10 axis transformation doesn't work
ggplot(d, aes(x = x, fill = fill)) +
geom_bar(aes(y = y), stat = "identity", position = "dodge") +
coord_trans(y = "log10")
#log10 axis transformation works with points rather than bars
ggplot(d, aes(x = x, fill = fill)) +
geom_point(aes(y = y), stat = "identity") +
coord_trans(y = "log10")
Try using the scale_y_log10 function:
ggplot(d, aes(x = x, fill = fill)) +
geom_bar(aes(y = y), stat = "identity", position = "dodge") +
scale_y_log10()
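A likely explanation: a bar is drawn from y = 0 up to its value, and log10(0) is -Inf, so coord_trans(y = "log10") has no finite place for the bottom edge, whereas points have no such baseline. scale_y_log10() transforms the values before the bars are built, so the bars start at log10(y) = 0, that is at 1 in the original units. One consequence is that the bar for y = 1 has zero height. A short sketch that inspects the computed bar edges (the variable names are illustrative):

```r
library(ggplot2)

d <- data.frame(x = factor(c(1, 1, 2, 2)),
                y = c(1, 2, 3, 4),
                fill = factor(c(1, 2, 3, 4)))

# The baseline of a bar (0) has no finite position on a log axis.
baseline_on_log_axis <- log10(0)

p <- ggplot(d, aes(x = x, y = y, fill = fill)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_y_log10()

# Bar edges in transformed (log10) units: every bar starts at 0,
# which corresponds to 1 on the original scale.
bars <- layer_data(p)
bar_bottom <- as.numeric(bars$ymin)
bar_top    <- sort(as.numeric(bars$ymax))
```

If values at or below 1 need a visible mark, points or line ranges with an explicit lower end may be a better fit than bars on a log axis.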

Plot multiple group histogram with overlaid line ggplot

I'm trying to plot a multiple group histogram with overlaid line, but I cannot get the right scaling for the histogram.
For example:
ggplot() + geom_histogram(data=df8,aes(x=log(Y),y=..density..),binwidth=0.15,colour='black') +
geom_line(data = as.data.frame(pdf8), aes(y=pdf8$f,x=pdf8$x), col = "black",size=1)+theme_bw()
produces the right scale. But when I try to perform fill according to groups, each group is scaled separately.
ggplot() + geom_histogram(data=df8,aes(x=log(Y),fill=vec8,y=..density..),binwidth=0.15,colour='black') +
geom_line(data = as.data.frame(pdf8), aes(y=pdf8$f,x=pdf8$x), col = "black",size=1)+theme_bw()
How would I scale it so that a black line is overlaid over the histogram and on the y axis is density?
It is going to be difficult for others to help you without a reproducible example, but perhaps something like this is what you're after:
library(ggplot2)
ggplot(data = mtcars, aes(x = mpg, fill = factor(cyl))) +
geom_histogram(aes(y = ..density..)) +
geom_line(stat = "density")
If you would rather the density line pertain to the entire dataset, you need to move the fill aesthetic into the geom_histogram function:
ggplot(data = mtcars, aes(x = mpg)) +
geom_histogram(aes(y = ..density.., fill = factor(cyl))) +
geom_line(data = mtcars, stat = "density")
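Note that ..density.. is computed per group, so in a filled histogram each group integrates to 1 on its own and the stacked bars no longer match a single density curve. One possible workaround, sketched here on mtcars and assuming a ggplot2 version with after_stat(), is to divide each bin count by the total count and the bin width, so that the whole stack has area 1. The binwidth = 2 is an arbitrary choice for this data:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg)) +
  # count / (total count * bin width): density relative to ALL rows,
  # so the stacked groups together integrate to 1.
  geom_histogram(aes(y = after_stat(count / (sum(count) * width)),
                     fill = factor(cyl)),
                 binwidth = 2, colour = "black") +
  # Kernel density estimate of the whole dataset.
  geom_line(stat = "density") +
  theme_bw()

# Total area of the stacked histogram bars.
bars <- layer_data(p, 1)
total_area <- sum((bars$ymax - bars$ymin) * (bars$xmax - bars$xmin))
```

With the histogram on this scale, a separately computed density (such as the pdf8 line in the question) can be overlaid on the same y axis.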
