Split axis plot in ggplot2 - r

I just found this plot in Factfulness (book by Hans Rosling and his children). I find the aesthetics of the split quite appealing.
While it's possible to make something similar using geom_rect(), the look is quite different. Another approach would be to use cowplot or patchwork, but that is quite tricky. Here's as far as I got trying to replicate the top part:
gapminder %>%
filter(year==1997, gdpPercap<16000) %>%
ggplot(aes(gdpPercap, y=lifeExp, size=pop)) +
geom_point(alpha=0.5)+
scale_x_log10()+
ggthemes::theme_base()+
theme(legend.position = "none",
plot.background = element_blank(),
plot.margin = unit(c(0.5, 0, 0, 0), "cm")) -> P1
gapminder %>%
filter(year==1997, gdpPercap>16000) %>%
ggplot(aes(gdpPercap, y=lifeExp, size=pop)) +
geom_point(alpha=0.5)+
scale_x_log10()+
ggthemes::theme_base()+
theme(legend.position = "none",
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
axis.text.y = element_blank(),
plot.background = element_blank(),
plot.margin = unit(c(0.5, 0.5, 0, 0), "cm"),
axis.title.x = element_blank()) -> P2
cowplot::plot_grid(P1, P2, rel_widths = c(2,1), labels = NULL,
align = "h")
I think all the rest of the text and highlights are possible with existing packages. I am wondering how to get a common x axis (the right side should display the ticks according to the ). Ideally, the x axis title would be centered, but that might be too much to ask. I can also move it inside the plot as text.
There are problems with the axes, as you can see from the y ticks in the plot. I wonder if facets would be a better approach. I am also not sure whether the point sizes are wrongly calculated because I filter the data first.

Here is a solution using facets. You can solve the x-axis breaks problem by precomputing the breaks with the scales package's log10 break calculator. You can use a mutate() in the pipeline to create a new variable that splits the data into facets.
library(tidyverse)
library(gapminder)
breaks <- scales::log10_trans()$breaks(range(gapminder$gdpPercap), n = 6)
gapminder %>%
filter(year==1997) %>%
mutate(facet = factor(ifelse(gdpPercap > 16000, "High", "Low"),
levels = c("Low", "High"))) %>%
ggplot(aes(gdpPercap, y=lifeExp, size=pop)) +
geom_point(alpha=0.5)+
scale_x_log10(breaks = breaks)+
ggthemes::theme_base()+
facet_grid(~ facet,
scales = "free_x", space = "free_x") +
ggtitle("My title") +
theme(legend.position = "none",
plot.title = element_text(hjust = 0.5),
plot.background = element_blank())
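On the point-size worry from the question: in the faceted version all panels share one size scale, so the sizes are comparable. If the two-plot approach is kept instead, each plot trains its own size scale on its filtered data, so the same population can be drawn at different sizes. A minimal sketch of one way around that, assuming synthetic stand-in data (the data frame `d` and the helper `make_panel()` are hypothetical, not part of the original code), is to compute the size limits and the x breaks once, before filtering:

```r
library(ggplot2)

# Hypothetical stand-in for the gapminder subset used above
set.seed(1)
d <- data.frame(
  gdpPercap = 10^runif(50, 2.5, 4.6),
  lifeExp   = runif(50, 40, 80),
  pop       = 10^runif(50, 5, 9)
)

# Computed on the full data, before any filtering
size_limits <- range(d$pop)
x_breaks    <- scales::breaks_log(n = 6)(range(d$gdpPercap))

# Hypothetical helper: every panel uses the same size limits and x breaks
make_panel <- function(data) {
  ggplot(data, aes(gdpPercap, lifeExp, size = pop)) +
    geom_point(alpha = 0.5) +
    scale_x_log10(breaks = x_breaks) +
    scale_size(limits = size_limits) +
    theme(legend.position = "none")
}

p_low  <- make_panel(subset(d, gdpPercap <  16000))
p_high <- make_panel(subset(d, gdpPercap >= 16000))
```

The two panels can then be combined with cowplot::plot_grid() as in the question.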

Related

Adjust grid lines in ggplot+geom_tile (heatmap) or geom_raster

This heatmap has a grid builtin, which I am failing to find the way to customize.
I want to preserve the horizontal lines in the grid, increase their thickness if possible, and disable the vertical lines. Each row should look like a continuous time series where data is present and be blank where it is not.
Adding vertical or horizontal lines on top would possibly cover some data; because of that, grid lines, or controlled gaps between the tiny rectangles, are preferable.
Alternatively, geom_raster doesn't show any grid at all, in which case I would need to add the horizontal grid lines myself.
I tried changing linetype, the geom_tile argument, which does change the line type and, with linetype=0, disables the grid entirely, but that doesn't preserve the horizontal grid lines. I didn't see any change when modifying the size argument.
This is the code generating the plot as above:
ggplot( DF, aes( x=rows, y=name, fill = value) ) +
#geom_raster( ) +
geom_tile( colour = 'white' ) +
scale_fill_gradient(low="steelblue", high="black",
na.value = "white")+
theme_minimal() +
theme(
legend.position = "none",
plot.margin=margin(grid::unit(0, "cm")),
#line = element_blank(),
#panel.grid = element_blank(),
panel.border = element_blank(),
panel.grid = element_blank(),
panel.spacing = element_blank(),
#panel.grid = element_line(color="black"),
#panel.grid.minor = element_blank(),
plot.caption = element_text(hjust=0, size=8, face = "italic"),
plot.subtitle = element_text(hjust=0, size=8),
plot.title = element_text(hjust=0, size=12, face="bold")) +
labs( x = "", y = "",
#caption= "FUENTE: propia",
fill = "Legend Title",
#subtitle = "Spaces without any data (missing, filtered, etc)",
title = "Time GAPs"
)
I tried to attach DF %>% dput but I get Body is limited to 30000 characters; you entered 203304. If anyone is familiar with a similar Dataset, please advise.
Additionally,
There are 2 gaps at the left and right of the plot area: one is visible next to the y-axis, and at the right the x-axis extends beyond the data. Neither is controlled by the plot.margin argument.
I would also like to draw a thicker grid line where the month changes.
The following data set has the same names and essential structure as your own, and will suffice for an example:
set.seed(1)
DF <- data.frame(
name = rep(replicate(35, paste0(sample(0:9, 10, T), collapse = "")), 100),
value = runif(3500),
rows = rep(1:100, each = 35)
)
Let us recreate your plot with your own code, using the geom_raster version:
library(ggplot2)
p <- ggplot( DF, aes( x=rows, y=name, fill = value) ) +
geom_raster( ) +
scale_fill_gradient(low="steelblue", high="black",
na.value = "white") +
theme_minimal() +
theme(
legend.position = "none",
plot.margin=margin(grid::unit(0, "cm")),
panel.border = element_blank(),
panel.grid = element_blank(),
panel.spacing = element_blank(),
plot.caption = element_text(hjust=0, size=8, face = "italic"),
plot.subtitle = element_text(hjust=0, size=8),
plot.title = element_text(hjust=0, size=12, face="bold")) +
labs( x = "", y = "", fill = "Legend Title", title = "Time GAPs")
p
The key here is to realize that discrete axes are "actually" numeric axes "under the hood", with the discrete ticks being placed at integer values, and factor level names being substituted for those integers on the axis. That means we can draw separating white lines using geom_hline, with values at 0.5, 1.5, 2.5, etc:
p + geom_hline(yintercept = 0.5 + 0:35, colour = "white", size = 1.5)
To change the thickness of the lines, simply change the size parameter.
Created on 2022-08-01 by the reprex package (v2.0.1)
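A variation on the same idea, as a sketch under a few assumptions: the separator positions are derived from the number of distinct names rather than hard-coded, the line thickness is set with linewidth (this assumes ggplot2 3.4.0 or later, where linewidth replaces size for lines), and the default padding of the continuous x scale is switched off with expand = c(0, 0), which is one likely source of the gaps at the left and right of the panel. The small data frame below is made up for illustration.

```r
library(ggplot2)

# Small made-up data set with the same column names as above
set.seed(1)
DF <- data.frame(
  name  = rep(sprintf("series_%02d", 1:5), times = 20),
  value = runif(100),
  rows  = rep(1:20, each = 5)
)

# Discrete levels sit at 1, 2, ..., n, so separators go at 0.5, 1.5, ..., n + 0.5
n_levels   <- length(unique(DF$name))
separators <- seq(0.5, n_levels + 0.5, by = 1)

p <- ggplot(DF, aes(x = rows, y = name, fill = value)) +
  geom_raster() +
  geom_hline(yintercept = separators, colour = "white", linewidth = 1.5) +
  scale_x_continuous(expand = c(0, 0))  # no padding left and right of the data
p
```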

Customised Bubble plot

I am trying to do a bubble plot. My data are:
Year<-rep(2001:2005, each = 5)
name<-c("John","Ellen","Mark","Randy","Luisa")
Name<-c(rep(name,5))
Value<-sample(seq(0,25,by=1),25)
mydata<-data.frame(Year,Name,Value)
So far I've got to this point:
ggplot(mydata, aes(x=Year, y=Name, size = Value)) +
geom_point() +
theme(axis.line = element_blank(),
axis.text.x=element_text(size=11,margin=margin(b=10),colour="black"),
axis.text.y=element_text(size=13,margin=margin(l=10),colour="black",
face="italic"),
axis.ticks = element_blank(),
axis.title=element_text(size=18,face="bold"),
panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.text = element_text(size=14),
legend.title = element_text(size=18))
I need many modifications but I couldn't understand how to do that (I am not very familiar with ggplot2).
First, I would like to use the viridis scale, but neither scale_color_viridis nor scale_fill_viridis are working (I have also tried setting the discrete=T argument).
Second, I would like the 0 values not to be plotted (i.e., to have a blank space where a 0 value would be drawn), but neither using na.omit (e.g. as ggplot(na.omit(mydata), aes(x=Year, y=Name, size = Value)) or as ggplot(mydata, aes(x=Year, y=Name, size = na.omit(Value)))) nor removing the 0 from the Value object works.
Third, I'd like the legend to be a continuous scale: the plotted values of Value range from 1 to 25 (as I would like to remove the zeros), but the default legend is discrete with 5 break points.
I would like the plot to look more or less like this (with the bubble sizes depending on the value of Value):
Any suggestions? Sorry for the many questions but I have some real difficulties in understanding how ggplot works. Thanks!
In order to map a variable in your data to some scale, you use the aes() function to couple what ggplot2 calls an 'aesthetic' to an expression (typically a symbol for a column in your data). Thus, to make a colour scale, you have to specify a colour aesthetic inside the aes() function. In the code below, I also specify an alpha aesthetic, which is 1 if Value > 0 and 0 otherwise, making the 0-value points completely transparent. I specify I() to let ggplot2 know that it should take this value literally instead of mapping it to a scale.
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.0.3
Year<-rep(2001:2005, each = 5)
name<-c("John","Ellen","Mark","Randy","Luisa")
Name<-c(rep(name,5))
Value<-sample(seq(0,25,by=1),25)
mydata<-data.frame(Year,Name,Value)
g <- ggplot(mydata, aes(x=Year, y=Name, size = Value)) +
geom_point(aes(colour = Value,
alpha = I(as.numeric(Value > 0))))
Once we have specified the aesthetics, we can begin customising the scales. The typical pattern is scale_{the aesthetic}_{type of scale}, so we need to add scale_colour_viridis_c() if we want to map the colour values to the viridis scale (the *_c is for continuous scales). In the scales, we can specify, for example, the limits, which you've indicated should be between 1 and 25. I also added a scale_size_area(), where we say that we do not want a legend for the size of the points by setting guide = "none".
g + scale_colour_viridis_c(option = "C", direction = -1,
limits = c(1, 25)) +
scale_size_area(guide = "none") +
theme(axis.line = element_blank(),
axis.text.x=element_text(size=11,margin=margin(b=10),colour="black"),
axis.text.y=element_text(size=13,margin=margin(l=10),colour="black",
face="italic"),
axis.ticks = element_blank(),
axis.title=element_text(size=18,face="bold"),
panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.text = element_text(size=14),
legend.title = element_text(size=18))
Created on 2021-02-24 by the reprex package (v1.0.0)
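A related sketch, in case the transparency trick feels indirect: the zero rows can also be dropped from the data before plotting. The y-axis limits are then set from the full data so that a name would still get its row even if all of its values were removed. The seed and the object name `nonzero` are assumptions made for this illustration.

```r
library(ggplot2)

set.seed(42)
mydata <- data.frame(
  Year  = rep(2001:2005, each = 5),
  Name  = rep(c("John", "Ellen", "Mark", "Randy", "Luisa"), 5),
  Value = sample(0:25, 25)
)

# Keep only the rows that should be drawn
nonzero <- subset(mydata, Value > 0)

g <- ggplot(nonzero, aes(x = Year, y = Name, size = Value, colour = Value)) +
  geom_point() +
  scale_colour_viridis_c(limits = c(1, 25)) +
  scale_size_area(guide = "none") +
  # keep every name on the axis, even one that lost all of its points
  scale_y_discrete(limits = unique(mydata$Name))
g
```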
Is that what you are looking for?
library(ggplot2)
Year<-rep(2001:2005, each = 5)
name<-c("John","Ellen","Mark","Randy","Luisa")
Name<-c(rep(name,5))
Value<-sample(seq(0,25,by=1),25)
Value <- ifelse(Value == 0, NA, Value)
mydata<-data.frame(Year,Name,Value)
ggplot(mydata, aes(x=Year, y=Name, size = Value, colour = Value)) +
geom_point() +
scale_colour_viridis_c() +
scale_size(guide = "none") +
theme(axis.line = element_blank(),
axis.text.x=element_text(size=11,margin=margin(b=10),colour="black"),
axis.text.y=element_text(size=13,margin=margin(l=10),colour="black",
face="italic"),
axis.ticks = element_blank(),
axis.title=element_text(size=18,face="bold"),
panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.text = element_text(size=14),
legend.title = element_text(size=18))
#> Warning: Removed 1 rows containing missing values (geom_point).
Concerning your points:
I only found the scale_colour_viridis_c and scale_colour_viridis_b functions, which differ in their colours as far as I could see. Maybe I am missing some package?
Secondly regarding the NAs: you just needed to replace the 0s by NAs.
And lastly regarding the scale: The color-scale is automatically continuous. Depicting sizes continuously is a bit tricky, therefore it will always be discrete. But I removed it from the legend for you so that you only have the color there as in your example.
Just as an alternative way to think about this... maybe it's helpful. :-)
library(tidyverse)
set.seed(123)
df <- tibble(
year = rep(2001:2005, each = 5),
name = rep(c("John","Ellen","Mark","Randy","Luisa"),5),
value = sample(seq(0,25,by=1),25)
)
df %>%
mutate(name_2 = ifelse(year>2001 & year<2005, NA, name)) %>%
ggplot(aes(year, value, group = name, label = name_2, color = name)) +
geom_line() +
geom_point() +
geom_text(vjust = -1) +
scale_color_brewer(palette = "Set1") +
theme_minimal(base_family = "serif") +
theme(legend.position = "none") +
xlab("")

ggplot2: How to crop out of the blank area on top and bottom of a plot?

This is a follow-up to the question "How to fit custom long annotations geom_text inside plot area for a Donuts plot?". As the accepted answer shows, the resulting plot understandably has extra blank area at the top and bottom. How can I get rid of those blank areas? I looked at the theme's aspect.ratio, but that is not what I want: it does the job but distorts the plot. I'm after cropping the plot from a square to a landscape format.
How can I do that?
UPDATE This is a self contained example of my use-case:
library(ggplot2); library(dplyr); library(stringr)
df <- data.frame(group = c("Cars", "Trucks", "Motorbikes"),n = c(25, 25, 50),
label2=c("Cars are blah blah blah", "Trucks some of the best in town", "Motorbikes are great if you ..."))
df$ymax = cumsum(df$n)
df$ymin = cumsum(df$n)-df$n
df$ypos = df$ymin+df$n/2
df$hjust = c(0,0,1)
ggplot(df %>%
mutate(label2 = str_wrap(label2, width = 10)), #change width to adjust width of annotations
aes(x="", y=n, fill=group)) +
geom_rect(aes_string(ymax="ymax", ymin="ymin", xmax="2.5", xmin="2.0")) +
expand_limits(x = c(2, 4)) + #change x-axis range limits here
# no change to theme
theme(axis.title=element_blank(),axis.text=element_blank(),
panel.background = element_rect(fill = "white", colour = "grey50"),
panel.grid=element_blank(),
axis.ticks.length=unit(0,"cm"),axis.ticks.margin=unit(0,"cm"),
legend.position="none",panel.spacing=unit(0,"lines"),
plot.margin=unit(c(0,0,0,0),"lines"),complete=TRUE) +
geom_text(aes_string(label="label2",x="3",y="ypos",hjust="hjust")) +
coord_polar("y", start=0) + scale_x_discrete()
And this is the result; I'd like to find a way to remove the annotated blank spaces:
This is a multi-part solution to answer this and the other related question you've posted.
First, for changing the margins in a single graph, @Keith_H was on the right track; using plot.margin inside theme() is a convenient way. However, as mentioned, this alone won't solve the issue if the goal is to combine multiple plots, as in the case of the other question linked above.
To do that you'll need a combination of plot.margin and a specific plotting order within arrangeGrob(). You'll need a specific order because plots get printed in the order you call them, and because of that, it will be easier to change the margins of plots that are layered behind other plots, instead of in front of plots. We can think of it like covering the plot margins we want to shrink by expanding the plot on top of the one we want to shrink. See the graphs below for illustration:
Before plot.margin setting:
#Main code for the 1st graph can be found in the original question.
After plot.margin setting:
#Main code for 2nd graph:
ggplot(df %>%
mutate(label2 = str_wrap(label2, width = 10)),
aes(x="", y=n, fill=group)) +
geom_rect(aes_string(ymax="ymax", ymin="ymin", xmax="2.5", xmin="2.0")) +
geom_text(aes_string(label="label2",x="3",y="ypos",hjust="hjust")) +
coord_polar(theta='y') +
expand_limits(x = c(2, 4)) +
guides(fill=guide_legend(override.aes=list(colour=NA))) +
theme(axis.line = element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text.y=element_blank(),
axis.text.x=element_blank(),
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "white"),
plot.margin = unit(c(-2, 0, -2, -.1), "cm"),
legend.position = "none") +
scale_x_discrete(limits=c(0, 1))
After combining plot.margin setting and arrangeGrob() reordering:
#Main code for 3rd graph:
p1 <- ggplot(mtcars,aes(x=1:nrow(mtcars),y=mpg)) + geom_point()
p2 <- ggplot(df %>%
mutate(label2 = str_wrap(label2, width = 10)), #change width to adjust width of annotations
aes(x="", y=n, fill=group)) +
geom_rect(aes_string(ymax="ymax", ymin="ymin", xmax="2.5", xmin="2.0")) +
geom_text(aes_string(label="label2",x="3",y="ypos",hjust="hjust")) +
coord_polar(theta='y') +
expand_limits(x = c(2, 4)) + #change x-axis range limits here
guides(fill=guide_legend(override.aes=list(colour=NA))) +
theme(axis.line = element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text.y=element_blank(),
axis.text.x=element_blank(),
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "white"),
plot.margin = unit(c(-2, 0, -2, -.1), "cm"),
legend.position = "none") +
scale_x_discrete(limits=c(0, 1))
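# arrangeGrob() comes from gridExtra; the result has to be drawn explicitly
final <- gridExtra::arrangeGrob(p2,p1,layout_matrix = rbind(c(1),c(2)),
widths=c(4),heights=c(2.5,4),respect=TRUE)
grid::grid.draw(final)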
Note that in the final code, I reversed the order you had in arrangeGrob() from p1,p2 to p2,p1. I then adjusted the height of the first object plotted, which is the one we want to shrink. This adjustment allows the earlier plot.margin adjustment to take effect, and as it does, the graph printed last, which is p1, starts to take up the space that used to be the margins of p2. If you make one of these adjustments without the others, the solution won't work. Each of these steps is needed to produce the end result above.
You can set the plot.margins to negative values for the top and bottom of the plot.
plot.margin=unit(c(-4,0,-4,0),"cm"),complete=TRUE)
edit: here is the output:

Minor grid lines in ggplot2 with discrete values and facet grid

I have a plot created using ggplot2 where I'm trying to modify some of the minor grid lines. Here is the current version:
library(tidyverse)
data(starwars)
starwars = starwars %>%
filter(!is.na(homeworld), !is.na(skin_color)) %>%
mutate(tatooine = factor(if_else(homeworld == "Tatooine", "Tatooine Native", "Other Native")),
skin_color = factor(skin_color))
ggplot(starwars, aes(birth_year, skin_color)) +
geom_point(aes(color = gender), size = 4, alpha = 0.7, show.legend = FALSE) +
facet_grid(tatooine ~ ., scales = "free_y", space = "free_y", switch = "y") +
theme_minimal() +
theme(
panel.grid.major.x = element_blank(),
panel.grid.major.y = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
strip.placement = "outside",
strip.background = element_rect(fill="gray90", color = "white"),
) +
geom_hline(yintercept = seq(0, length(unique(starwars$skin_color))) + .5, color="gray30")
Y axis is a factor and a facet grid is used, with an uneven number of categories in each grid. I added some minor grid lines using geom_hline (my understanding is that panel.grid.minor does not work with categorical data i.e., factors).
I would like to remove the lines highlighted in yellow below, and then ADD a single black line in between the two facet grids (i.e., where the current double lines are that are highlighted in yellow).
Any way to do this? I'd prefer avoiding hard coding the position of any lines, in case the data change. Thanks.
Removing the top and bottom grid lines dynamically is relatively easy. You code the line positions in the data set based on the faceting groups, exclude the highest and lowest value, and plot the geom_hline with a yintercept inside the aes() statement. That approach is robust to changes in the data (to see this, uncomment the # filter(!is.na(birth_year)) line below).
library(tidyverse)
library(grid)
data(starwars)
starwars = starwars %>%
filter(!is.na(homeworld), !is.na(skin_color)) %>%
mutate(tatooine = factor(if_else(homeworld == "Tatooine", "Tatooine Native", "Other Native")),
skin_color = factor(skin_color)) %>%
# filter(!is.na(birth_year)) %>%
group_by(tatooine) %>%
# here we assign the line_positions
mutate(line_positions = as.numeric(factor(skin_color, levels = unique(skin_color))),
line_positions = line_positions + .5,
line_positions = ifelse(line_positions == max(line_positions), NA, line_positions))
plot_out <- ggplot(starwars, aes(birth_year, skin_color)) +
geom_point(aes(color = gender), size = 4, alpha = 0.7, show.legend = FALSE) +
geom_hline(aes(yintercept = line_positions)) +
facet_grid(tatooine ~ ., scales = "free_y", space = "free_y", switch = "y") +
theme_minimal() +
theme(
panel.grid.major.x = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_line(colour = "black"),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
strip.placement = "outside",
strip.background = element_rect(fill="gray90", color = "white"),
)
print(plot_out)
gives
However, adding a solid line between the facets without any hardcoding is difficult. There are some possible ways to add borders between facets (see here), but if we don't know whether the facets change, it is not obvious to which value the border should be assigned. I guess there is a possible solution that draws a hard-coded line in the plot to divide the facets, but the tricky part is to determine dynamically where that border is going to be located, based on the data and on how the facets are ultimately drawn (e.g. in which order). I'd be interested in hearing other opinions on this.
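The interior-lines idea can also be written with a separate, small data frame of line positions per facet, which some may find easier to follow than adding a column to the plotting data. This is only a sketch on toy data; `d`, `n_per_panel` and `lines_df` are hypothetical names. With scales = "free_y", each panel shows only its own categories at positions 1 to n, so the interior separators sit at 1.5 up to n - 0.5.

```r
library(ggplot2)

# Toy data: three categories in panel P1, two in panel P2
d <- data.frame(
  x        = c(1, 2, 3, 4, 5, 6),
  category = c("a", "b", "c", "d", "e", "e"),
  panel    = c("P1", "P1", "P1", "P2", "P2", "P2")
)

# Number of distinct categories shown in each panel
n_per_panel <- tapply(d$category, d$panel, function(v) length(unique(v)))

# Interior separators only: 1.5, 2.5, ..., n - 0.5 for each panel
lines_df <- do.call(rbind, lapply(names(n_per_panel), function(p) {
  n <- n_per_panel[[p]]
  data.frame(panel = rep(p, n - 1), y = seq_len(n - 1) + 0.5)
}))

plot_out <- ggplot(d, aes(x, category)) +
  geom_point() +
  geom_hline(data = lines_df, aes(yintercept = y)) +
  facet_grid(panel ~ ., scales = "free_y", space = "free_y")
plot_out
```

Because the positions are derived from the data, nothing needs to change when categories are added or removed. It does not address the single line between the two facets.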

ggplot2 - multiple plots scaling

I tried to generate multiple grid plots with ggplot2. I would like to generate a distribution plot with an additional boxplot below the x-axis, for different groups and variables, like this:
CODE: I tried to do that with the following code:
library(ggplot2)
require(grid)
x=rbind(data.frame(D1=rnorm(1000),Name="titi",ID=c(1:1000)),
data.frame(D1=rnorm(1000)+1,Name="toto",ID=c(1:1000)))
space=1
suite=1
p1=ggplot(x, aes(x=D1, color=Name, fill=Name)) +
geom_histogram(aes(y=..density..),alpha=0.35,color=adjustcolor("white",0),position="identity",binwidth = 0.05)+
geom_density(alpha=.2,size=1)+
theme_minimal()+
labs(x=NULL,y="Density")+
theme(legend.position = "top",
legend.title = element_blank())+
scale_fill_manual(values=c("gray30","royalblue1"))+
scale_color_manual(values=c("gray30","royalblue1"))
p2=ggplot(x, aes(x=factor(Name), y=D1,fill=factor(Name),color=factor(Name)))+
geom_boxplot(alpha=0.2)+
theme_minimal()+
coord_flip()+
labs(x=NULL,y=NULL)+
theme(legend.position = "none",
axis.text.y = element_blank(),
axis.text.x = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.major.y = element_blank())+
scale_fill_manual(values=c("gray30","royalblue1"))+
scale_color_manual(values=c("gray30","royalblue1"))
grid.newpage()
pushViewport(viewport(layout=grid.layout(5,1)))
define_region <- function(row, col){
viewport(layout.pos.row = row, layout.pos.col = col)
}
print(p1, vp=define_region(1:4,1))
print(p2, vp=define_region(5,1))
RESULT:
QUESTION: During my search I observed that the scales of the density distribution plot and the boxplot are not the same (problem 1). I haven't found a way to arrange these two graphs in a grid (I'm lost).
With the cowplot package this becomes a bit easier. However, we should properly set the x-axis range to ensure they are the same for both plots. This is because the density plots are naturally a bit wider than pure data plots, and the axis for p1 will therefore be a bit wider. When the axes are fixed we can arrange and align them (axis text and margins will no longer matter).
library(cowplot)
comb <- plot_grid(
p1 + xlim(-5, 5),
p2 + ylim(-5, 5), # use ylim for p2 because of coord_flip()
align = 'v', rel_heights = c(4, 1), nrow = 2
)
Similarly we can arrange multiples of the combination plots:
plot_grid(comb, comb, comb, comb)
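If hard-coding -5 and 5 is undesirable, the shared range can be derived from the data instead. The sketch below is one possible variant, not the method above: it sets the limits through the coordinate system (coord_cartesian() and coord_flip()), which zooms rather than dropping observations outside the range, and it assumes a padding of one unit on each side so the density tails stay visible. The names `shared_limits`, `p_top` and `p_bottom` are made up for the example.

```r
library(ggplot2)

set.seed(1)
x <- rbind(data.frame(D1 = rnorm(200),     Name = "titi"),
           data.frame(D1 = rnorm(200) + 1, Name = "toto"))

# Shared range from the data, padded by an assumed 1 unit on each side
shared_limits <- range(x$D1) + c(-1, 1)

p_top <- ggplot(x, aes(x = D1, fill = Name)) +
  geom_density(alpha = 0.2) +
  coord_cartesian(xlim = shared_limits)

# coord_flip() draws the y aesthetic horizontally, so the limits go in ylim
p_bottom <- ggplot(x, aes(x = Name, y = D1, fill = Name)) +
  geom_boxplot(alpha = 0.2) +
  coord_flip(ylim = shared_limits) +
  theme(legend.position = "none")

# Combine only if cowplot is installed
if (requireNamespace("cowplot", quietly = TRUE)) {
  comb <- cowplot::plot_grid(p_top, p_bottom, align = "v",
                             rel_heights = c(4, 1), nrow = 2)
  print(comb)
}
```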

Resources