Stacking overlapping tiles using geom_tile() - r

I'm trying to figure out how to place tiles that represent points using geom_tile(). My issue is that overlapping tiles appear as a single tile. I'd like tiles with the same y-value to sit next to each other instead of on top of each other. My first thought was to use position = "dodge", but that spread the tiles out all over my bar graph.
My current code is:
ggplot(dataset, aes(x = Country, y = `Health Sciences`)) +
  geom_bar(stat = "identity", width = 0.25) +
  geom_tile(data = dataset_long, mapping = aes(x = Country, y = Percent, fill = Subject),
            position = position_dodge(width = 0, preserve = "total")) +
  coord_flip()
but it doesn't produce the intended effect. The graph below shows some tiles that are "stacked" atop one another if they have overlapping values, however, I'm trying to get them to be directly adjacent to each other instead. Any help would be appreciated, thanks!
Graph with stacked tiles

Related

customising my Facet grid with error bars

Hi everyone, I want to make a faceted bar plot of my data, which is the relative abundance of fungi, and make it as neat as possible. When I use facet_grid, however, the bar plots don't look right and are hard to compare. Here is the data:
When I use the code:
theme_set(theme_bw())
facet_plot <- ggplot(my_data, aes(x = depth, y = `relative abundance`, fill = Treatment)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = `relative abundance` - se, ymax = `relative abundance` + se),
                width = .2, position = position_dodge(.9)) +
  facet_grid(. ~ phylum)
I get a plot that looks like this:
As you can see, the plot looks strange, especially the last three bar plots. Does anyone know how I can modify my code so that each plot has its own y axis, or any other way of adjusting the scale?
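One possible approach, sketched here with made-up data because the original data was not included: switch from facet_grid() to facet_wrap() with scales = "free_y". facet_grid() keeps one shared y axis per row of panels, so freeing the y scale has no visible effect when all panels sit in a single row. The column name rel_abundance is a hypothetical syntactic stand-in for `relative abundance`, and the numbers are invented for illustration only.

```r
library(ggplot2)

# Hypothetical stand-in data: the question's data was not posted.
my_data <- data.frame(
  phylum        = rep(c("Ascomycota", "Basidiomycota", "Mucoromycota"), each = 4),
  depth         = rep(c("0-10", "10-20"), times = 6),
  Treatment     = rep(rep(c("Control", "Treated"), each = 2), times = 3),
  rel_abundance = c(60, 55, 58, 50, 30, 35, 28, 40, 2, 3, 1.5, 2.5),
  se            = c(4, 3, 5, 4, 3, 2, 3, 4, 0.3, 0.4, 0.2, 0.5)
)

facet_plot <- ggplot(my_data, aes(x = depth, y = rel_abundance, fill = Treatment)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = rel_abundance - se, ymax = rel_abundance + se),
                width = 0.2, position = position_dodge(0.9)) +
  # facet_wrap() draws a separate y axis for every panel when scales = "free_y"
  facet_wrap(~ phylum, scales = "free_y")

facet_plot
```

Keep in mind that free scales make bar heights harder to compare across panels, so this trades comparability for readability of the low-abundance phyla.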
Best wishes

ggplot fill property changes scale

I have a simple data frame and am using ggplot to create a bar graph with this code:
ggplot(data = data_cases, aes(x = k, y = val)) +
  stat_summary(fun = sum, geom = "bar") +  # the argument was called fun.y before ggplot2 3.3.0
  scale_x_discrete(name = "Type",
                   labels = c('A&R', 'A&E', 'C&E'))
This code generates the desired result. However, when I add a fill aesthetic to color the portions of the graph, the y scale changes. In the image below, the picture on the left has the correct scale; the one on the right is what is produced when fill is set (ggplot(data = data_cases, aes(x = k, y = val, fill = state))).
Data:
"k","state","val"
"A&C","SA ",3
"C&E","SA ",2
"A&C","NSW",29
"A&E","NSW",10
"C&E","NSW",11
"C&E","NT ",1
"A&C","WA ",3
"A&E","WA ",1
"C&E","WA ",4
"A&C","VIC",24
"A&E","VIC",1
"C&E","VIC",15
"A&C","QLD",7
"A&E","QLD",2
"C&E","QLD",17
This happens because the second chart shows the number of cases for each state separately, e.g. almost 30 for NSW with type A&R. Every bar starts from 0, so the bars for the different states are drawn over one another.
If you want it to look like the original, the bars should be stacked on top of each other: use position = "stack".
ggplot(data = data_cases, aes(x = k, y = val, fill = state)) +
  stat_summary(fun = sum, geom = "bar", position = "stack") + # <---
  scale_x_discrete(name = "Type",
                   labels = c('A&R', 'A&E', 'C&E'))
ggplot2 has a number of position adjustments like this; see ?position_dodge, ?position_fill, ?position_stack, ?position_identity, ...
You can also use geom_col(), which stacks by default:
ggplot(data_cases, aes(k, val, fill = state)) +
  geom_col()
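To check that stacking really restores the original scale, here is a small sketch that rebuilds the data from the question (with the trailing spaces in the state codes trimmed, which is my own simplification) and compares the per-type totals with the tops of the stacked bars as computed by ggplot2.

```r
library(ggplot2)

# Data copied from the question (state codes trimmed of trailing spaces).
data_cases <- data.frame(
  k     = c("A&C", "C&E", "A&C", "A&E", "C&E", "C&E", "A&C", "A&E",
            "C&E", "A&C", "A&E", "C&E", "A&C", "A&E", "C&E"),
  state = c("SA", "SA", "NSW", "NSW", "NSW", "NT", "WA", "WA",
            "WA", "VIC", "VIC", "VIC", "QLD", "QLD", "QLD"),
  val   = c(3, 2, 29, 10, 11, 1, 3, 1, 4, 24, 1, 15, 7, 2, 17)
)

# Totals per type: these are the bar heights of the unfilled chart
# (A&C = 66, A&E = 14, C&E = 50).
totals <- aggregate(val ~ k, data = data_cases, FUN = sum)

# Stacked and filled: the segments of each bar add up to the same totals.
stacked <- ggplot(data_cases, aes(x = k, y = val, fill = state)) +
  geom_col()

# Top edge of each stacked bar, taken from the data ggplot2 computed for the layer.
ld <- layer_data(stacked)
bar_tops <- tapply(ld$ymax, as.integer(ld$x), max)
```

Comparing totals$val with bar_tops shows the same three numbers, which is why the stacked chart has the same y scale as the unfilled one.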

Bounding position for geom_text()

I am making several instances of a tilted bar chart. As the sizes of count and the differences in percent vary, part of one of the labels (count) is pushed outside the bar in some instances. I need the labels to be entirely inside the bar in all instances. If not repositioned to fit inside the bar, I need the labels to be centered as is.
The code is:
library(tidyverse)
library(ggplot2)
data <- tibble(type = c('Cat', 'Dog'),
               group = c('Pets', 'Pets'),
               count = c(10000, 990000),
               percent = c(1, 99))
ggplot(data, aes(x = group, y = percent, fill = type)) +
  geom_bar(stat = 'identity',
           position = position_stack(reverse = TRUE)) +
  coord_flip() +
  geom_text(aes(label = count),
            position = position_stack(vjust = 0.5,
                                      reverse = TRUE))
Use hjust="inward":
ggplot(data, aes(x = group, y = percent, fill = type)) +
  geom_bar(stat = 'identity', position = position_stack(reverse = TRUE)) +
  coord_flip() +
  geom_text(aes(label = count), hjust = "inward",
            position = position_stack(vjust = 0.5, reverse = TRUE))
One key thing to note here is that plots in ggplot are drawn differently depending on the graphics device's resolution, width, and height settings. This is why plots look a bit different depending on the computer you use to plot them. If I take your default graph and save it with different aspect ratios, this becomes evident:
width=3, height=5
width=7, height=5
The aspect ratio and resolution change the plot. You can also see this for yourself within RStudio by just resizing the plot viewer window.
With that being said, there are some options to adjust your plot to be less likely to clip text out of bounds:
Rotate your text, or rotate your plot back to horizontal bars. Long text labels are going to work out better with horizontal bars anyway.
Use geom_text_repel() from the ggrepel package. It is a direct replacement for geom_text() that puts your labels in the plot area, and you can use min.segment.length= to specify the minimum line length as well as force= and direction= to play with positioning. Again, this works better if you flip your chart back.
Use the expand= argument of scale_y_continuous(). Try adding scale_y_continuous(expand=c(0.25,0.25)) to your plot, for example. Note that since your coordinate system is flipped, you have to specify "y" to expand what is displayed as "x". This expands the plot area around the geoms.
Change the output width=, height= and resolution when exporting your plots. As indicated above, this is the simple solution.
There are probably other suggestions, but that's mine.
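As a rough sketch of the last two options combined, the plot from the question can be given extra room with expand= and then saved at two sizes to see how the labels fit. The file names are hypothetical and the plots are written to a temporary directory; the dpi value is an arbitrary choice.

```r
library(ggplot2)

# Same data as in the question.
data <- data.frame(type = c("Cat", "Dog"),
                   group = c("Pets", "Pets"),
                   count = c(10000, 990000),
                   percent = c(1, 99))

p <- ggplot(data, aes(x = group, y = percent, fill = type)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  coord_flip() +
  geom_text(aes(label = count), hjust = "inward",
            position = position_stack(vjust = 0.5, reverse = TRUE)) +
  # extra room at both ends of the percent axis (horizontal after coord_flip())
  scale_y_continuous(expand = c(0.25, 0.25))

# Save the same plot at two sizes to compare how the labels fit.
narrow <- file.path(tempdir(), "pets_narrow.png")  # hypothetical file name
wide   <- file.path(tempdir(), "pets_wide.png")    # hypothetical file name
ggsave(narrow, p, width = 3, height = 5, dpi = 150)
ggsave(wide,   p, width = 7, height = 5, dpi = 150)
```

Opening the two files side by side shows how much the available width changes where the labels end up.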

What does the height of bars in two bar charts with different "position" settings represent?

I know that in this code the height of the bars represents the count of each group in the variable "color":
ggplot(diamonds, aes(color, fill = cut)) +
  geom_bar()
But I'd really like to know what it represents in this code:
ggplot(diamonds, aes(color, fill = cut)) +
  geom_bar(alpha = 0.5, position = "identity")
I know the former defaults to position "stack", and I also know the meaning of position "identity". But I really can't figure out what the height of the bars in the latter represents.
Many thanks in advance!
I think the best way to understand it is to imagine using position='dodge' (which places multiple bars for different cuts, separated by color) and instead layering all the cut bars on top of each other.
ggplot(diamonds, aes(color, fill = cut)) +
  geom_bar(alpha = 0.5, position = "dodge")
ggplot(diamonds, aes(color, fill = cut)) +
  geom_bar(alpha = 0.5, position = "identity")
(Note, the colors get distorted because the 'Fair' cut is in front.)
When you use position = "stack", the counts per group in the fill variable are stacked on top of each other at each x position. With position = "identity", on the other hand, if there are multiple groups in the fill variable at an x position, they all start at y = 0 and are essentially overlaid.
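The difference can also be checked numerically. The sketch below uses layer_data() to pull out the rectangles ggplot2 computes for each version of the plot; the variable names are my own.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(color, fill = cut))

stacked  <- layer_data(base + geom_bar())                       # default position = "stack"
overlaid <- layer_data(base + geom_bar(position = "identity"))

# With "identity" every bar starts at 0, so its height is the count of one cut.
starts_at_zero   <- all(overlaid$ymin == 0)
height_is_count  <- all(overlaid$ymax == overlaid$count)

# With "stack" the segments sit on top of each other, so the top of the
# highest segment at each x is the total count for that color.
stack_tops   <- tapply(stacked$ymax, as.integer(stacked$x), max)
color_totals <- as.vector(table(diamonds$color))
tops_are_totals <- all(stack_tops == color_totals)
```

So in the overlaid version, the visible top of each bar is just the count of the most common cut for that color, not the total.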

Grouped bar plot column width uneven due to no data

I am trying to display a grouped bar plot for my dataset. However, because some months have no data (no income) for some states, the columns come out with unequal widths, and I would like the same column width regardless of whether some states have no income. Notice how the bars are grouped for January: I'd like that grouping across all months, with the space for a state left empty if it has no income. Any help will be much appreciated, thanks.
library(ggplot2)
plot = ggplot(Checkouts, aes(fill=Checkouts$State, x=Checkouts$Month, y=Checkouts$Income)) +
  geom_bar(colour = "black", stat = "identity")
My Bar Plot
Checkouts table/data
There are two ways that this can be done.
If you are using a recent version of ggplot2 (3.0.0 and later, I believe), position_dodge(), which preserves the vertical position of the bars and adjusts only the horizontal position, has a parameter called preserve. Setting preserve = 'single' keeps every bar the same width. Here is the code for it.
Code:
library(ggplot2)
plot = ggplot(Checkouts, aes(fill = State, x = Month, y = Income)) +
  geom_bar(colour = "black", stat = "identity", position = position_dodge(preserve = 'single'))
Another way is to precompute and add dummy rows for each missing combination. Using table() is the best way to do that.
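A minimal sketch of the dummy-row approach, assuming a Checkouts table with Month, State and Income columns. The values below are invented, since the real table was only posted as an image. Here table() is used to spot the missing combinations, and expand.grid() plus merge() to fill them in, which is one way to do it rather than necessarily what was meant above.

```r
library(ggplot2)

# Hypothetical stand-in for the Checkouts table (the real one was an image).
Checkouts <- data.frame(
  Month  = c("Jan", "Jan", "Jan", "Feb", "Mar", "Mar"),
  State  = c("NSW", "VIC", "QLD", "NSW", "VIC", "QLD"),
  Income = c(120, 90, 60, 150, 80, 40)
)

# table() shows which Month/State combinations have no rows (count 0).
table(Checkouts$Month, Checkouts$State)

# Build every Month/State combination, then attach the known incomes.
all_combos <- expand.grid(Month = unique(Checkouts$Month),
                          State = unique(Checkouts$State),
                          stringsAsFactors = FALSE)
padded <- merge(all_combos, Checkouts, all.x = TRUE)
padded$Income[is.na(padded$Income)] <- 0   # dummy rows get zero income

# Keep the months in calendar order rather than alphabetical order.
padded$Month <- factor(padded$Month, levels = c("Jan", "Feb", "Mar"))

plot <- ggplot(padded, aes(x = Month, y = Income, fill = State)) +
  geom_col(colour = "black", position = "dodge")
```

With the black outline, the zero-income dummy rows show up as thin lines on the baseline, which may or may not be what you want.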
You are looking for position_dodge2(preserve = "single") (https://ggplot2.tidyverse.org/reference/position_dodge.html).
library(ggplot2)
plot = ggplot(Checkouts, aes(fill = State, x = Month, y = Income)) +
  geom_bar(colour = "black", stat = "identity",
           position = position_dodge2(preserve = "single"))
Also, you don't need to specify the columns to the data frame with $ in ggplot(). For example, Checkouts$State can be replaced with State.
