format internal lines of a stacked geom_bar ggplot - r

I want to remove the internal borders from my ggplot, leaving a coloured border around the outside of each bar only. Here is a test data frame, with a stacked bar plot. Ideally, I will end up with the groups in the stack still being a shade of grey, with a colourful outline per box.
library(ggplot2)
test <- data.frame(iso=rep(letters[1:5],3),
num= sample(1:99, 15, replace=TRUE),
fish=rep(c("pelagic", "reef", "benthic"), each=5),
colour=rep(rainbow(n=5),3))
ggplot(data=test, aes(x=iso, y=num, fill=fish, colour=colour)) +
geom_bar(stat="identity") +
theme_bw() +
scale_colour_identity() + scale_fill_grey(start = 0, end = .9)

You can accomplish this by moving the fill and colour aes() settings into two separate geom_bar() elements: one which takes the sum for each iso value (the outline), and another which splits things up by fish:
ggplot(data=test, aes(x=iso, y=num)) +
geom_bar(stat="summary", fun="sum", aes(color=colour)) + # fun.y was renamed to fun in ggplot2 3.3.0
geom_bar(stat="identity", aes(fill=fish)) +
theme_bw() +
scale_colour_identity() +
scale_fill_grey(start = 0, end = .9)
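A variant of the same idea, sketched under the assumption that you would rather compute the per-bar totals yourself and draw the outline last, so the grey fills cannot cover it. The `totals` data frame and its construction with `aggregate()` are illustrative additions, not part of the answer above.

```r
library(ggplot2)

set.seed(1)  # make the random test data reproducible
test <- data.frame(iso    = rep(letters[1:5], 3),
                   num    = sample(1:99, 15, replace = TRUE),
                   fish   = rep(c("pelagic", "reef", "benthic"), each = 5),
                   colour = rep(rainbow(n = 5), 3))

# One row per iso: the full bar height and the outline colour for that bar
totals <- aggregate(num ~ iso + colour, data = test, FUN = sum)

p <- ggplot(test, aes(x = iso, y = num)) +
  geom_col(aes(fill = fish)) +                    # grey stacks, no borders
  geom_col(data = totals, aes(colour = colour),   # outline only, drawn on top
           fill = NA) +
  theme_bw() +
  scale_colour_identity() +
  scale_fill_grey(start = 0, end = .9)
```

Printing `p` renders the plot. Because the outline layer has `fill = NA`, only its border is drawn around each full bar.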

Related

unstack stacked ggplot legend

I'm working with a chemistry dataset containing 11 different chemicals, labeled here as c1, c2, ..., c11.
I have made pie charts using library(ggplot2) and would like to do two things with my plot:
1. Display all variables in the legend horizontally (done) and not have them stacked (not done), as you see in my example. A single line would be great; two lines would also be acceptable.
2. Change the colors to be color-blind friendly.
Here is a pretend dataset we can work with so you can see what I have at this point. I have tried searching "legend margins" to increase the area the legend is plotted on, but to no avail.
data <- read.delim("https://pastebin.com/raw/MS5GLAxa", header = T)
ggplot(data, aes(x="", y=ratio, fill=chemical)) +
geom_bar(stat="identity", width=1,position = position_fill()) + facet_wrap(~treatment, nrow=1)+
coord_polar("y", start=0)+
theme_void(base_size = 20)+
theme(legend.position=c(0.5, 1.2),legend.direction = "horizontal")+
theme(plot.margin=unit(c(0,0,0,0), 'cm'))
Some side bonuses here would be to be able to:
increase the size of the pie chart (I believe I achieved this with making my margins as small as possible on the sides)
have the pie chart have solid colors, and no white lines in graph
Use guides() to set the number of legend rows to 1, and use scale_fill_brewer() with a color-blind-friendly palette.
ggplot(data, aes(x="", y=ratio, fill=chemical)) +
geom_bar(stat="identity", width=1,position = position_fill()) +
facet_wrap(~treatment, nrow=1)+
coord_polar("y", start=0) +
scale_fill_brewer(palette="Paired") +
theme_void(base_size = 20) +
theme(legend.position=c(0.5, 1.5),legend.direction = "horizontal",
plot.margin=unit(c(0,0,0,0), 'cm')) +
guides(fill = guide_legend(nrow = 1)) # if required nrow = 2
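To check which Brewer palettes are flagged as color-blind friendly and have enough colors for 11 chemicals, you can query the palette table that ships with the RColorBrewer package (assumed to be installed here). A minimal sketch; the variable names `n_needed`, `candidates` and `cols` are illustrative:

```r
library(RColorBrewer)

n_needed <- 11  # number of chemicals shown in the legend

# brewer.pal.info has one row per palette, with the columns
# maxcolors, category and colorblind
info <- brewer.pal.info
candidates <- info[info$colorblind & info$maxcolors >= n_needed, ]
print(candidates)

# Hex codes for one candidate, usable with scale_fill_manual(values = cols)
cols <- brewer.pal(n_needed, "Paired")
```

Any palette listed in `candidates` can be passed by name to scale_fill_brewer(palette = ...).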

ggplot fill dotplot but group as no filled

I want a dot plot whose dots are positioned as in the uncoloured figure but filled as in the coloured one. To generate the coloured version I used:
Sample dataset:
data <- data.frame(estado1 = c('APLV','APLV','APLV','APLV','APLV','NO APLV','APLV','NO APLV','NO APLV','APLV','NO APLV','APLV','APLV','APLV','APLV','APLV','APLV','APLV','NO APLV','APLV'), combined_ige = c(3.6,2.84,1.2,14.33,0,0,0,0,0.07,2,0,0.3,0.11,0,0,1.31,0,0,0,0.19), sxtypes = c('skin_resp','skin','skin','skin_dig','dig','dig_resp','skin_dig','dig','dig','skin_resp','skin_dig_resp','dig','dig','dig_resp','skin_dig_resp','skin','dig','skin_dig_resp','resp','skin_dig'))
Code:
ggplot(data, aes(x=estado1, y=combined_ige, fill= sxtypes)) +
geom_dotplot(binaxis='y', stackdir='center',
stackratio=1.5, dotsize=1.2, alpha=0.6) +
geom_hline(yintercept = (0.35), linetype="dashed") +
geom_hline(yintercept = (0.77), linetype="dashed", col="red") +
xlab("Status group") +
ggtitle("IgE específicas combinadas") +
scale_y_log10(labels = function(y) format(y, scientific = F))
When I use "fill = sxtypes" to colour the dots, they are grouped into layers that overlap each other. I want them to stay in the same positions as in the uncoloured figure while being coloured as in the second figure.
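One option that may help, offered as a sketch rather than a confirmed fix: geom_dotplot() has a `stackgroups` argument that stacks dots across the fill groups instead of stacking each group separately, and a `binpositions = "all"` setting that uses the same bins for every group. The `dots` data frame below is a made-up, cut-down stand-in with the same columns as the sample data (zeros are left out because they cannot be shown on a log scale).

```r
library(ggplot2)

# Made-up stand-in with the same columns as the sample dataset
dots <- data.frame(
  estado1      = rep(c("APLV", "NO APLV"), each = 6),
  combined_ige = c(3.6, 2.84, 1.2, 14.33, 2, 0.3,
                   0.07, 0.11, 1.31, 0.19, 0.5, 0.9),
  sxtypes      = rep(c("skin", "dig", "resp"), times = 4)
)

p <- ggplot(dots, aes(x = estado1, y = combined_ige, fill = sxtypes)) +
  geom_dotplot(binaxis = "y", stackdir = "center",
               stackgroups = TRUE,     # stack dots across the fill groups
               binpositions = "all",   # use the same bins for every group
               stackratio = 1.5, dotsize = 1.2, alpha = 0.6) +
  scale_y_log10()
```

Whether the dots end up exactly where they sit in the uncoloured plot can depend on the ggplot2 version and the binning method, so compare both plots before relying on this.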

Spacing between groups of bars in histogram

When I produce histograms in ggplot2 where the bar positions are dodge, I expect something like this where there is space between the groups of bars (i.e. notice the white space between each groups of red/green pairs):
I'm having a hard time producing the same effect when I build a histogram with continuous data. I can't seem to add space between the groups of bars, and instead, everything gets squashed together. As you can see, it makes it visually difficult to compare the red/green pairs:
To reproduce my problem, I created a sample data set here: https://www.dropbox.com/s/i9nxzo1cmbwwfsa/data.csv?dl=0
Code to reproduce:
data <- read.csv("https://www.dropbox.com/s/i9nxzo1cmbwwfsa/data.csv?dl=1")
ggplot(data, aes(x = soldPrice, fill = month)) +
geom_histogram(binwidth=1e5, position=position_dodge()) +
labs(x="Sold Price", y="Sales", fill="") +
scale_x_continuous(labels=scales::comma, breaks=seq(0, 2e6, by = 1e5)) +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
How can I add white space between the groups of red/green pairs?
Alternative 1: overlapping bars with geom_histogram()
From ?position_dodge():
Dodging preserves the vertical position of an geom while adjusting the horizontal position
This function accepts a width argument that determines the space to be created.
To get what I think you want, you need to supply a suitable width to position_dodge(). In your case, where binwidth=1e5, you might try a dodge width about 20% smaller than that value: position=position_dodge(1e5-20*(1e3)), i.e. a width of 80000.
(I left the rest of your code untouched.)
You could use the following code:
ggplot(data, aes(x = soldPrice, fill = month)) +
geom_histogram(binwidth=1e5, position=position_dodge(1e5-20*(1e3))) + ### <-----
labs(x="Sold Price", y="Sales", fill="") +
scale_x_continuous(labels=scales::comma, breaks=seq(0, 2e6, by = 1e5)) +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
yielding this plot:
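The dodge width used above is simply the bin width reduced by a chosen fraction. A small sketch of that arithmetic, using a hypothetical helper `dodge_width()` (not a ggplot2 function) so the fraction is easy to adjust:

```r
# Hypothetical helper: dodge width as the bin width reduced by a fraction
dodge_width <- function(binwidth, reduce_by = 0.2) {
  stopifnot(reduce_by >= 0, reduce_by < 1)
  binwidth * (1 - reduce_by)
}

binwidth <- 1e5
w_default <- dodge_width(binwidth)        # 80% of the bin width
w_narrow  <- dodge_width(binwidth, 0.4)   # 60% of the bin width
```

The result can be passed straight to the plot, e.g. position = position_dodge(dodge_width(1e5)).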
Alternative 2: use ggplot-object and render with geom_bar
geom_histogram() was not designed to produce what you want. geom_bar() on the other hand provides the flexibility you need.
You can generate the histogram with geom_histogram() and save it in a ggplot object. Then you extract the plotting information with ggplot_build(). Finally, you can use that information to draw a bar plot with geom_bar():
## save ggplot object to h
h <- ggplot(data, aes(x = soldPrice, fill = month)) +
geom_histogram(binwidth=1e5, position=position_dodge(1e5-20*(1e3)))
## get plotting information as data.frame
h_plotdata <- ggplot_build(h)$data[[1]]
h_plotdata$group <- as.factor(h_plotdata$group)
levels(h_plotdata$group) <- c("May 2018", "May 2019")
## plot with geom_bar
ggplot(h_plotdata, aes(x=x, y=y, fill = group)) +
geom_bar(stat = "identity") +
labs(x="Sold Price", y="Sales", fill="") +
scale_x_continuous(labels=scales::comma, breaks=seq(0, 2e6, by = 1e5)) +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
yielding this graph:
Please, let me know whether this is what you want.
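A different route to a similar plot, sketched here as an alternative to extracting data with ggplot_build(): bin and count the values yourself, then draw the counts with geom_col(). The `sales` data is made up because the original CSV is not reproduced here, and the 0.8 width factor is an arbitrary choice.

```r
library(ggplot2)

# Made-up stand-in for the sales data
set.seed(42)
sales <- data.frame(
  soldPrice = runif(200, min = 2e5, max = 1.2e6),
  month     = rep(c("May 2018", "May 2019"), each = 100)
)

binwidth <- 1e5
# Left edge of the bin each sale falls into
sales$bin <- floor(sales$soldPrice / binwidth) * binwidth

# One row per bin/month combination with the number of sales
counts <- as.data.frame(table(bin = sales$bin, month = sales$month))
# table() turns the bins into a factor; convert back and move to the bin centre
counts$bin <- as.numeric(as.character(counts$bin)) + binwidth / 2

p <- ggplot(counts, aes(x = bin, y = Freq, fill = month)) +
  # each pair of bars shares 80% of the bin, leaving a gap between pairs
  geom_col(position = position_dodge(), width = 0.8 * binwidth) +
  labs(x = "Sold Price", y = "Sales", fill = "") +
  scale_x_continuous(labels = scales::comma) +
  theme_bw()
```

Because the bars are ordinary columns at known x positions, the spacing between groups is controlled directly by `width`.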

How to set automatic label position based on box height

In a previous question, I asked about moving the label of a barplot outside the bar if the bar was too small. I was given the following example:
library(ggplot2)
options(scipen=2)
dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
Anno = 1:10)
ggplot(data = dataset,
aes(x = Anno,
y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity",
width=0.8,
position="dodge") +
geom_text(aes( y = Riserva_Riv_Fine_Periodo,
label = round(Riserva_Riv_Fine_Periodo, 0),
angle=90,
hjust= ifelse(Riserva_Riv_Fine_Periodo < 3000000, -0.1, 1.2)),
col="red",
size=4,
position = position_dodge(0.9))
And I obtain this graph:
The problem with the example is that the value at which the label is moved must be hard-coded into the plot, and an ifelse statement is used to reposition the label. Is there a way to automatically extract the value to cut?
A slightly better option might be to base the test and the positioning of the labels on the height of the bar relative to the height of the highest bar. That way, the cutoff value and label-shift are scaled to the actual vertical range of the plot. For example:
ydiff = max(dataset$Riserva_Riv_Fine_Periodo)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity", width=0.8) +
geom_text(aes(label = round(Riserva_Riv_Fine_Periodo, 0), angle=90,
y = ifelse(Riserva_Riv_Fine_Periodo < 0.3*ydiff,
Riserva_Riv_Fine_Periodo + 0.1*ydiff,
Riserva_Riv_Fine_Periodo - 0.1*ydiff)),
col="red", size=4)
You would still need to tweak the fractional cutoff in the test condition (I've used 0.3 in this case), depending on the physical size at which you render the plot. But you could package the code into a function to make any manual adjustments a bit easier.
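As a rough sketch of such a function: the helper `label_y()` and its `cutoff` and `shift` arguments are hypothetical names, with defaults taken from the 0.3 and 0.1 fractions used above.

```r
# Hypothetical helper: y position for each label, scaled to the tallest bar
label_y <- function(values, cutoff = 0.3, shift = 0.1) {
  top <- max(values)
  ifelse(values < cutoff * top,
         values + shift * top,   # short bar: put the label above the bar
         values - shift * top)   # tall bar: put the label inside the bar
}

dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
                      Anno = 1:10)
dataset$label_pos <- label_y(dataset$Riserva_Riv_Fine_Periodo)
```

The new column can then be mapped in the text layer, e.g. geom_text(aes(y = label_pos, ...)), and the two fractions adjusted in one place.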
It's probably possible to automate this by determining the actual sizes of the various grobs that make up the plot and setting the condition and the positioning based on those sizes, but I'm not sure how to do that.
Just as an editorial comment, a plot with labels inside some bars and above others risks confusing the visual mapping of magnitudes to bar heights. I think it would be better to find a way to shrink, abbreviate, recode, or otherwise tweak the labels so that they contain the information you want to convey while being able to have all the labels inside the bars. Maybe something like this:
library(scales)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo/1000)) +
geom_col(width=0.8, fill="grey30") +
geom_text(aes(label = format(Riserva_Riv_Fine_Periodo/1000, big.mark=",", digits=0),
y = 0.5*Riserva_Riv_Fine_Periodo/1000),
col="white", size=3) +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
theme_classic() +
labs(y="Riserva (thousands)")
Or maybe go with a line plot instead of bars:
ggplot(dataset, aes(Anno, Riserva_Riv_Fine_Periodo/1e3)) +
geom_line(linetype="11", size=0.3, colour="grey50") +
geom_text(aes(label=format(Riserva_Riv_Fine_Periodo/1e3, big.mark=",", digits=0)),
size=3) +
theme_classic() +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
expand_limits(y=0) +
labs(y="Riserva (thousands)")

How can I use different color or linetype aesthetics in same plot with ggplot?

I'm creating a plot with ggplot that uses colored points, vertical lines, and horizontal lines to display the data. Ideally, I'd like to use two different color or linetype scales for the geom_vline and geom_hline layers, but ggplot discourages/disallows multiple variables mapped to the same aesthetic.
# Create example data
library(tidyverse)
library(lubridate)
set.seed(1234)
# tibble() replaces the deprecated data_frame()
example.df <- tibble(dt = seq(ymd("2016-01-01"), ymd("2016-12-31"), by="1 day"),
value = rnorm(366),
grp = sample(LETTERS[1:3], 366, replace=TRUE))
date.lines <- tibble(dt = ymd(c("2016-04-01", "2016-10-31")),
dt.label = c("April Fools'", "Halloween"))
value.lines <- tibble(value = c(-1, 1),
value.label = c("Threshold 1", "Threshold 2"))
If I set linetype aesthetics for both geom_*line layers, they get put in the linetype legend together, which doesn't necessarily make logical sense:
ggplot(example.df, aes(x=dt, y=value, colour=grp)) +
geom_hline(data=value.lines, aes(yintercept=value, linetype=value.label)) +
geom_vline(data=date.lines, aes(xintercept=as.numeric(dt), linetype=dt.label)) +
geom_point(size=1) +
scale_x_date() +
theme_minimal()
Alternatively, I could set one of the lines to use a colour aesthetic, but then that again puts the legend lines in an illogical legend grouping:
ggplot(example.df, aes(x=dt, y=value, colour=grp)) +
geom_hline(data=value.lines, aes(yintercept=value, colour=value.label)) +
geom_vline(data=date.lines, aes(xintercept=as.numeric(dt), linetype=dt.label)) +
geom_point(size=1) +
scale_x_date() +
theme_minimal()
The only partial solution I've found is to use a fill aesthetic instead of colour in geom_point and to set shape=21 to use a fillable shape, but that forces a black border around the points. I can get rid of the border by manually setting colour="white", but then the white border covers up points. If I set colour=NA, no points are plotted.
ggplot(example.df, aes(x=dt, y=value, fill=grp)) +
geom_hline(data=value.lines, aes(yintercept=value, colour=value.label)) +
geom_vline(data=date.lines, aes(xintercept=as.numeric(dt), linetype=dt.label)) +
geom_point(shape=21, size=2, colour="white") +
scale_x_date() +
theme_minimal()
This might be a case where ggplot's "you can't have two variables mapped to the same aesthetic" rule can/should be broken, but I can't figure out a clean way around it. Using fill with geom_point shows the most promise, but there's no way to remove the point borders.
Any ideas for plotting two different color or linetype aesthetics here?
