ggplot - How to stack and group bar chart? - r

I have two bar charts with the same factor labels that I'd like to combine into one graph (lines 59-62) so that for each age group, there are two bars (one for perpetrator age and one for victim age) and each bar retains the stacked murdered/not murdered bar.
Any help appreciated. Thanks!

What you want is hard to do directly, because the bars would have to be stacked by one variable and grouped by another at the same time. There is surely a way to do it, but the easiest way in my opinion is to reshape your data.frame with reshape2::melt and then facet the two graphs side by side.
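library(ggplot2)
# reshape to long format: Perp_Age and Victim_Age end up in the columns 'variable' and 'value'
m <- reshape2::melt(shootings[,c(2,5,8,9)],measure.vars=c("Perp_Age","Victim_Age"))
# stack by Murdered and draw one panel per age variable
ggplot(m, aes(fill=Murdered, y=Cases, x=value))+
geom_bar(position='stack', stat='identity')+
facet_wrap(~variable)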
which gives you this plot:

Related

Linking legend to plot with a line or an arrow

Context: when you have "many" categories it can become hard to distinguish them in a bar plot. I found the plot below dealing with this situation quite nicely by linking the legend with categories in the plot.
Question: is it possible to do something similar with ggplot2?
With ggplot2 it is straightforward to get this:
But I really do not know where to start to achieve the result shown in the 1st plot.
Here is some code to sort it out:
library(ggplot2)
ggplot(data = mtcars, aes(x = vs, y = disp, fill = factor(carb))) +
geom_bar(stat = "identity")
Expected output (not as nice as the one presented above but it shows the idea)
There is no proper legend on the axes in any of the plots, but my guess is that the desired chart is based on relative frequencies, while your plot seems to show absolute frequencies, though I'm not sure about that.
Assuming that you want to produce a stacked bar chart giving the (relative) number of observations of a categorical variable in two groups, there are two ways to get the two stacked bars to be of the same height:
There need to be exactly the same number of observations in both of them. Then you can use absolute frequencies.
The absolute frequencies need to be transformed to relative frequencies (or percent) by dividing them by the total number of observations in each group.
You can calculate the relative frequencies yourself and use them as the y-values.
Or refer to this post, as it seems to describe exactly what you want using ggplot2.
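To illustrate the second option, here is a minimal sketch that computes the relative frequencies by hand, using the built-in mtcars data from the question. The names counts and rel are just illustrative choices, and counting cars per vs/carb combination is an assumption about what the chart should show.

```r
library(ggplot2)

# Count observations for every (vs, carb) combination
counts <- as.data.frame(table(vs = mtcars$vs, carb = mtcars$carb))

# Divide each count by the total of its vs group, so every bar sums to 1
counts$rel <- ave(counts$Freq, counts$vs, FUN = function(x) x / sum(x))

# Stacked bars of equal height, one per vs group
p <- ggplot(counts, aes(x = vs, y = rel, fill = carb)) +
  geom_col()
print(p)
```

If you would rather not compute the proportions yourself, geom_bar(position = "fill") on the raw data should give a similar picture.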

How to create a stacked and grouped bar chart from two data frames?

I have a stacked barchart that looks like this.
If I have a second dataframe that has the same layout as the one that created the plot, and I want to group both datasets by position while still keeping the stacked percentages, how would I go about this? I'm not sure how to do it in ggplot2.
Hard to say without seeing the data and without more information about what you actually want to achieve, but the general approach I would use is to combine your dataframes, especially if the variables are the same. You just want to make sure to keep track of where each row originated, and that will be your identifying column.
So, if your data is in myData1 and myData2:
# add identifying columns
myData1$id <- 'dataset1'
myData2$id <- 'dataset2'
# put them together
newData <- rbind(myData1, myData2)
You are not clear on what you're looking for in the combined plot, so you can go about that any number of ways (depending on what you want to do). Maybe the simplest example would be to use facet_grid() or facet_wrap() from ggplot2 to show them in side-by-side plots:
ggplot(newData, aes(x=name, y=value)) +
geom_col(aes(fill=gene)) +
facet_wrap(~id)
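Since the real data was not posted, here is a self-contained sketch with made-up data frames that have the columns used above (name, gene, value). All the values are hypothetical. It is one possible reading of "group by position while keeping the stacked percentages": one panel per name, one bar per dataset, and position = "fill" to rescale every stack to 100%.

```r
library(ggplot2)

# Hypothetical stand-ins for the two data frames (same columns in both)
myData1 <- data.frame(name  = rep(c("p1", "p2"), each = 2),
                      gene  = rep(c("g1", "g2"), times = 2),
                      value = c(10, 30, 20, 20))
myData2 <- data.frame(name  = rep(c("p1", "p2"), each = 2),
                      gene  = rep(c("g1", "g2"), times = 2),
                      value = c(5, 15, 40, 10))

# Tag each data frame with its origin, then combine
myData1$id <- "dataset1"
myData2$id <- "dataset2"
newData <- rbind(myData1, myData2)

# One panel per position, one bar per dataset, stacks rescaled to sum to 1
p <- ggplot(newData, aes(x = id, y = value, fill = gene)) +
  geom_col(position = "fill") +
  facet_wrap(~name)
print(p)
```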

Stacked bar plot using factor lists in base R

I am hoping to make a stacked bar plot to show two factors. The questions and answers I can find on this site that address this problem all work with data that appears to be in a matrix format and use ggplot2. My data is in lists of observations, like this:
mydata = data.frame(V1=c("A","B","B","C","C"), V2=c("X","X","Y","Z","Z"))
I would like to show categories of V1 on the x axis of my plot, but stacked to show the proportions of V2 in each bar.
I can use the "count" function in the plyr library to find the frequency of each observation,
library(plyr)
mydata.count = count(mydata)
but I don't know how to structure my barplot command to group data by the level of V1: barplot(mydata.count$freq) separates all combinations of V1 and V2 into separate bars.
If possible, I would like to create this plot using the base R barplot functions so that it is visually consistent with other plots in my study.
Here is another possibility with ggplot:
ggplot(as.data.frame(table(mydata)), aes(x=V1, y=Freq, fill=V2)) + geom_bar(stat="identity")
ggplot(as.data.frame(table(mydata)), aes(x=V2, y=Freq, fill=V1)) + geom_bar(stat="identity")
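Since the question asked for base R, here is a sketch of how the same plot might be built with barplot(). It relies on barplot() stacking the rows of a matrix within each column, so the table is set up with V2 in the rows and V1 in the columns. The names tab and props are just illustrative.

```r
mydata <- data.frame(V1 = c("A", "B", "B", "C", "C"),
                     V2 = c("X", "X", "Y", "Z", "Z"))

# Cross-tabulate: rows are V2 (the stacked segments), columns are V1 (the bars)
tab <- table(mydata$V2, mydata$V1)

# Convert counts to proportions within each V1 column
props <- prop.table(tab, margin = 2)

# Each column becomes one bar, each row one stacked segment
barplot(props, legend.text = TRUE, xlab = "V1", ylab = "Proportion of V2")
```

Passing tab instead of props to barplot() would show the raw counts rather than proportions.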

How can I plot bars that are stacked on top of one another in ggplot?

I want to make a bar chart from this dataframe:
library(ggplot2)
mydf=data.frame(c("A","B","C","D"),c(100,110,90,120),c(150,200,160,180))
names(mydf)=c("myfirstC","mysecondC","mythirdC")
In order to plot a bar chart with bars that are stacked on top of one another, I am trying to use this code:
ggplot(data=mydf, aes(x=myfirstC))+
geom_col(aes(y=mysecondC), colour="blue")+
geom_col(aes(y=mythirdC), colour="red")
head(mydf)
Unfortunately, this code only returns a plot with bars from the "mythirdC"-column only.
Question: How do I need to change the code in order to get a stacked plot, without reshaping the dataframe?
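If you really don't want to reshape, you can try this. The first layer draws bars up to the combined height, and the second layer is drawn over their lower part: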
ggplot(data=mydf, aes(x=myfirstC))+
geom_col(aes(y=mysecondC+mythirdC), fill="blue")+
geom_col(aes(y=mythirdC), fill="red")
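For comparison only, since the question asked to avoid it: a sketch of the reshaping route, which lets ggplot2 do the stacking and produce a legend. It uses base R's stack() to go to long format. The name long is illustrative, and values and ind are the column names that stack() generates.

```r
library(ggplot2)

mydf <- data.frame(myfirstC  = c("A", "B", "C", "D"),
                   mysecondC = c(100, 110, 90, 120),
                   mythirdC  = c(150, 200, 160, 180))

# Long format: one row per (category, column) pair
long <- data.frame(myfirstC = rep(mydf$myfirstC, times = 2),
                   stack(mydf[, c("mysecondC", "mythirdC")]))

# geom_col stacks the segments by default
p <- ggplot(long, aes(x = myfirstC, y = values, fill = ind)) +
  geom_col()
print(p)
```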

Adding more complexity to stacked barchart in ggplot

It's a bit cluttered, I know, but I'm trying to further divide the data that makes up my stacked bar chart.
Here's what it looks like so far:
A = ggplot(data=yield,aes(N,Mean.Yield,fill=Cutting))
B=A+facet_grid(Location~Mngmt)+geom_bar(stat="identity")
B+labs(x="Nitrogen Level")+labs(y="Yield (lb/acre)")
Yielding this graph:
(I would post the graph but apparently my reputation isn't up to snuff as a new member!)
How can I further divide the bars by the factor "species"? I'm assuming it involves adding another geom, but I'm new to all this.
Thanks!
Edited to add:
Attempting to use mtcars for dummy data, though it is not the best fit, as mpg is not additive the way yield over two cuttings is in my data.
mtcars$cyl=as.factor(mtcars$cyl)
mtcars$vs=as.factor(mtcars$vs)
mtcars$am=as.factor(mtcars$am)
mtcars$gear=as.factor(mtcars$gear)
mtcars$carb=as.factor(mtcars$carb)
A = ggplot(data=mtcars,aes(cyl,mpg,fill=gear))
B=A+facet_grid(am~vs)+geom_bar(stat="identity")
This yields this ugly graph: http://i.imgur.com/sK7A5am.png I'm hoping to split each of those bars (e.g., cylinders) into two side by side bars (in this example, 6 side by side bars denoting the mpg of engines with varying levels of carb for each cylinder factor). I hope this makes sense. Thanks again!
Okay, based upon your comments, I think you want to change the position argument within geom_bar(). Using the diamonds dataset from ggplot2, does this look like what you want?
library(ggplot2)
## note the diamonds dataset comes with ggplot2
ggplot(diamonds, aes(clarity, fill=cut)) +
geom_bar(position="dodge")
(source: ggplot2.org)
Then you would just add in your facet and other details. With the diamonds example, this would be
ggplot(diamonds, aes(clarity, fill=cut)) +
geom_bar(position="dodge") +
facet_grid(color ~ clarity)
I figured out how to do this by browsing the ggplot2 help files.
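Applied to the mtcars dummy data from the question, the same idea might look like the sketch below. One assumption to flag: with stat = "identity", dodging works most cleanly when there is a single row per bar, so the data is first averaged per cyl/carb combination. Using the mean is my choice for illustration, not something from the question, and avg is an illustrative name.

```r
library(ggplot2)

# One row per (cyl, carb) combination: mean mpg (assumed summary)
avg <- aggregate(mpg ~ cyl + carb, data = mtcars, FUN = mean)
avg$cyl  <- factor(avg$cyl)
avg$carb <- factor(avg$carb)

# Side-by-side bars for each carb level within each cyl group
p <- ggplot(avg, aes(x = cyl, y = mpg, fill = carb)) +
  geom_col(position = "dodge")
print(p)
```

A facet_grid() call can then be added as in the diamonds example, provided the faceting variables are included in the aggregation formula.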
