I am using the fgsea library for some analyses; in particular, I use the plotEnrichment function a lot. This function returns a ggplot object with all the layers, but I'd like to change the colour of the curve it shows from bright green to something else. This code
library(fgsea)
data(examplePathways)
data(exampleRanks)
p = plotEnrichment(examplePathways[["5991130_Programmed_Cell_Death"]], exampleRanks)
will return this plot
Is there any way to change the colour once it's created?
Note: I am pretty sure there are ways to do this fairly easily, but I did not create the plot so I don't know what each layer is called or how they were created.
As per BrianFisher's recommendations, I tried
p + scale_color_brewer(palette="GnBu")
p + scale_color_manual(values=c("blue","red"))
But they did not change anything on the plot, as far as I could tell.
Another way to achieve this is by changing the ggplot object directly by using the following code:
## change the aes parameter in the object
p$layers[[5]]$aes_params$colour <- 'blue'
## then plot p
p
This yields the following graph:
A short walk-through
This technique has proven useful to me on numerous occasions. Hence, some more detail:
p$layers gives us the info we need to dig further: we need to access the geom_line configuration. So, after consulting the info below, we choose to continue with p$layers[[5]]
> p$layers
[[1]]
geom_point: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
[[2]]
mapping: yintercept = ~yintercept
geom_hline: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
[[3]]
mapping: yintercept = ~yintercept
geom_hline: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
[[4]]
mapping: yintercept = ~yintercept
geom_hline: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
[[5]]
geom_line: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
[[6]]
mapping: x = ~x, y = ~-diff/2, xend = ~x, yend = ~diff/2
geom_segment: arrow = NULL, arrow.fill = NULL, lineend = butt, linejoin = round, na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
If we add an $ after p$layers[[5]], we get the possible choices to extend the code (in RStudio) like in the picture below:
We choose aes_params and add a new $. At that moment, the only choice is colour. We are at the endpoint: here we can set colour of the geom_line.
So, now you know where the hacky, mysterious code came from; and here it is for the very last time:
p$layers[[5]]$aes_params$colour <- 'blue'
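If you would rather not hardcode the layer index, the same idea can be sketched by searching the layers for the one drawn with GeomLine. This is only an illustration on a stand-in plot: the toy data frame and the names `toy`, `is_line` and `line_idx` are made up here, and the lookup assumes the layer objects expose `$geom` and `$aes_params` as they do in the walk-through above.

```r
library(ggplot2)

# Stand-in plot with the same kinds of layers as plotEnrichment()
# (toy data, not real fgsea output)
toy <- data.frame(x = 1:10, y = cumsum(c(1, -1, 2, -1, 1, 1, -2, 1, -1, 1)))
p <- ggplot(toy, aes(x, y)) +
  geom_point(colour = "green", size = 0.1) +
  geom_hline(yintercept = 0, colour = "black") +
  geom_line(colour = "green")

# Flag layers whose geom is GeomLine; geom_hline() uses GeomHline,
# which does not inherit from GeomLine, so it is not matched
is_line <- vapply(p$layers, function(l) inherits(l$geom, "GeomLine"), logical(1))
line_idx <- which(is_line)

# Overwrite the fixed colour of every matching layer
for (i in line_idx) p$layers[[i]]$aes_params$colour <- "blue"
p
```

On the real plotEnrichment() output this should pick out the same layer that was found by eye above, but it is worth printing `line_idx` to check before relying on it.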
If you look at plotEnrichment:
plotEnrichment
function (pathway, stats, gseaParam = 1, ticksSize = 0.2)
{
rnk <- rank(-stats)
ord <- order(rnk)
statsAdj <- stats[ord]
statsAdj <- sign(statsAdj) * (abs(statsAdj)^gseaParam)
statsAdj <- statsAdj/max(abs(statsAdj))
pathway <- unname(as.vector(na.omit(match(pathway, names(statsAdj)))))
pathway <- sort(pathway)
gseaRes <- calcGseaStat(statsAdj, selectedStats = pathway,
returnAllExtremes = TRUE)
bottoms <- gseaRes$bottoms
tops <- gseaRes$tops
n <- length(statsAdj)
xs <- as.vector(rbind(pathway - 1, pathway))
ys <- as.vector(rbind(bottoms, tops))
toPlot <- data.frame(x = c(0, xs, n + 1), y = c(0, ys, 0))
diff <- (max(tops) - min(bottoms))/8
x = y = NULL
g <- ggplot(toPlot, aes(x = x, y = y)) + geom_point(color = "green",
size = 0.1) + geom_hline(yintercept = max(tops), colour = "red",
linetype = "dashed") + geom_hline(yintercept = min(bottoms),
colour = "red", linetype = "dashed") + geom_hline(yintercept = 0,
colour = "black") + geom_line(color = "green") + theme_bw() +
geom_segment(data = data.frame(x = pathway), mapping = aes(x = x,
y = -diff/2, xend = x, yend = diff/2), size = ticksSize) +
theme(panel.border = element_blank(), panel.grid.minor = element_blank()) +
labs(x = "rank", y = "enrichment score")
g
}
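The color is hardcoded in geom_line(color = "green"), partly because there is no column in the data.frame that specifies the color. Because the colour is a fixed parameter rather than a mapped aesthetic, scale_color_*() has nothing to act on. So you have two options: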
a) Plot over it
p + geom_line(color="steelblue")
b) Change the function and save it as another function (e.g. plotEnr below)
plotEnr = function (pathway, stats, gseaParam = 1, ticksSize = 0.2)
{
LINECOL = "red"
rnk <- rank(-stats)
ord <- order(rnk)
statsAdj <- stats[ord]
statsAdj <- sign(statsAdj) * (abs(statsAdj)^gseaParam)
statsAdj <- statsAdj/max(abs(statsAdj))
pathway <- unname(as.vector(na.omit(match(pathway, names(statsAdj)))))
pathway <- sort(pathway)
gseaRes <- calcGseaStat(statsAdj, selectedStats = pathway,
returnAllExtremes = TRUE)
bottoms <- gseaRes$bottoms
tops <- gseaRes$tops
n <- length(statsAdj)
xs <- as.vector(rbind(pathway - 1, pathway))
ys <- as.vector(rbind(bottoms, tops))
toPlot <- data.frame(x = c(0, xs, n + 1), y = c(0, ys, 0))
diff <- (max(tops) - min(bottoms))/8
x = y = NULL
g <- ggplot(toPlot, aes(x = x, y = y)) + geom_point(size = 0.1) +
geom_hline(yintercept = max(tops), colour = "red",
linetype = "dashed") +
geom_hline(yintercept = min(bottoms),
colour = "red", linetype = "dashed") + geom_hline(yintercept = 0,
colour = "black") + geom_line(col=LINECOL) + theme_bw() +
geom_segment(data = data.frame(x = pathway), mapping = aes(x = x,
y = -diff/2, xend = x, yend = diff/2), size = ticksSize) +
theme(panel.border = element_blank(), panel.grid.minor = element_blank()) +
labs(x = "rank", y = "enrichment score")
g
}
plotEnr(examplePathways[["5991130_Programmed_Cell_Death"]], exampleRanks)
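As a small illustration of why the scale_color_*() attempts in the question had no visible effect, the sketch below compares a colour set as a fixed parameter with a colour mapped inside aes(). The data frame and the names `fixed` and `mapped` are invented for this example; layer_data() is used only to inspect the colours ggplot2 ends up drawing.

```r
library(ggplot2)

df <- data.frame(x = 1:10, y = (1:10)^2, grp = "a")

# Colour given as a fixed parameter: the scale has no mapping to act on
fixed <- ggplot(df, aes(x, y)) +
  geom_line(colour = "green") +
  scale_colour_manual(values = c(a = "blue"))

# Colour mapped to a column: the scale decides the drawn colour
mapped <- ggplot(df, aes(x, y, colour = grp)) +
  geom_line() +
  scale_colour_manual(values = c(a = "blue"))

# Inspect the colours each plot will actually draw
unique(layer_data(fixed)$colour)
unique(layer_data(mapped)$colour)
```

The first plot keeps its hardcoded colour, which is the situation plotEnrichment() puts you in; only the second one responds to the scale.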
I recently asked this question. However, I am asking a separate question now as the scope of my new question falls outside the range of the last question.
I am trying to create a heatmap in ggplot... however, outside of the axis I am trying to plot geom_tile. The issue is I cannot find a consistent way to get it to work. For example, the code I am using to plot is:
library(colorspace)
library(ggplot2)
library(ggnewscale)
library(tidyverse)
asd <- expand_grid(paste0("a", 1:9), paste0("b", 1:9))
df <- data.frame(
a = asd$`paste0("a", 1:9)`,
b = asd$`paste0("b", 1:9)`,
c = sample(20, 81, replace = T)
)
# From discrete to continuous
df$a <- match(df$a, sort(unique(df$a)))
df$b <- match(df$b, sort(unique(df$b)))
z <- sample(10, 18, T)
# set color palettes
pal <- rev(diverging_hcl(palette = "Blue-Red", n = 11))
palEdge <- rev(sequential_hcl(palette = "Plasma", n = 11))
# plot
ggplot(df, aes(a, b)) +
geom_tile(aes(fill = c)) +
scale_fill_gradientn(
colors = pal,
guide = guide_colorbar(
frame.colour = "black",
ticks.colour = "black"
),
name = "C"
) +
theme_classic() +
labs(x = "A axis", y = "B axis") +
new_scale_fill() +
geom_tile(data = tibble(a = 1:9,
z = z[1:9]),
aes(x = a, y = 0, fill = z, height = 0.3)) +
geom_tile(data = tibble(b = 1:9,
z = z[10:18]),
aes(x = 0, y = b, fill = z, width = 0.3)) +
scale_fill_gradientn(
colors = palEdge,
guide = guide_colorbar(
frame.colour = "black",
ticks.colour = "black"
),
name = "Z"
)+
coord_cartesian(clip = "off", xlim = c(0.5, NA), ylim = c(0.5, NA)) +
theme(aspect.ratio = 1,
plot.margin = margin(10, 15.5, 25, 25, "pt")
)
This produces something like this:
However, I am trying to find a consistent way to plot something more like this (which I quickly made in photoshop):
The main issue I'm having is being able to manipulate the coordinates of the new scale 'outside' of the plotting area. Is there a way to move the tiles that are outside so I can position them in an area that makes sense?
There are always the two classic options when plotting outside the plot area:
annotate/ plot with coord_...(clip = "off")
make different plots and combine them.
The latter option usually gives much more flexibility and far fewer headaches, in my humble opinion.
library(colorspace)
library(tidyverse)
library(patchwork)
asd <- expand_grid(paste0("a", 1:9), paste0("b", 1:9))
df <- data.frame(
a = asd$`paste0("a", 1:9)`,
b = asd$`paste0("b", 1:9)`,
c = sample(20, 81, replace = T)
)
# From discrete to continuous
df$a <- match(df$a, sort(unique(df$a)))
df$b <- match(df$b, sort(unique(df$b)))
z <- sample(10, 18, T)
# set color palettes
pal <- rev(diverging_hcl(palette = "Blue-Red", n = 11))
palEdge <- rev(sequential_hcl(palette = "Plasma", n = 11))
# plot
p_main <- ggplot(df, aes(a, b)) +
geom_tile(aes(fill = c)) +
scale_fill_gradientn("C",colors = pal,
guide = guide_colorbar(frame.colour = "black",
ticks.colour = "black")) +
theme_classic() +
labs(x = "A axis", y = "B axis")
p_bottom <- ggplot() +
geom_tile(data = tibble(a = 1:9, z = z[1:9]),
aes(x = a, y = 0, fill = z, height = 0.3)) +
theme_void() +
scale_fill_gradientn("Z",limits = c(0,10),
colors = palEdge,
guide = guide_colorbar(
frame.colour = "black", ticks.colour = "black"))
p_left <- ggplot() +
theme_void()+
geom_tile(data = tibble(b = 1:9, z = z[10:18]),
aes(x = 0, y = b, fill = z, width = 0.3)) +
scale_fill_gradientn("Z",limits = c(0,10),
colors = palEdge,
guide = guide_colorbar( frame.colour = "black", ticks.colour = "black"))
p_left + p_main +plot_spacer()+ p_bottom +
plot_layout(guides = "collect",
heights = c(1, .1),
widths = c(.1, 1))
Created on 2021-02-21 by the reprex package (v1.0.0)
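The layout logic of the combined plot can be sketched with placeholder panels. The helper `mk()` and the panel titles are made up for illustration, and `ncol = 2` is spelled out here even though the answer relies on patchwork choosing a 2 x 2 grid for four panels by itself. Panels fill the grid row by row, so the narrow strip and the main plot share the top row, while the spacer and the bottom strip share the second row.

```r
library(ggplot2)
library(patchwork)

# Hypothetical helper: an empty panel that only shows its name
mk <- function(label) ggplot() + theme_void() + ggtitle(label)

combined <- mk("left") + mk("main") + plot_spacer() + mk("bottom") +
  plot_layout(
    ncol = 2,              # two columns, filled row by row
    widths = c(0.1, 1),    # narrow left column, wide main column
    heights = c(1, 0.1)    # tall top row, thin bottom row
  )
combined
```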
A common layout in many sites is to draw the grid as shaded bars:
I'm doing this with this function:
grid_bars <- function(data, y, n = 5, fill = "gray90") {
breaks <- pretty(data[[y]], n)
len <- length(breaks)-1
all_bars <- data.frame(
b.id = rep(1:len, 4),
b.x = c(rep(-Inf, len), rep(Inf, len*2), rep(-Inf, len)),
b.y = c(rep(breaks[-length(breaks)], 2), rep(breaks[-1], 2))
)
bars <- all_bars[all_bars$b.id %in% (1:len)[c(FALSE, TRUE)], ]
grid <- list(
geom_polygon(data = bars, aes(b.x, b.y, group = b.id),
fill = fill, colour = fill),
scale_y_continuous(breaks = breaks),
theme(panel.grid = element_blank())
)
return(grid)
}
#-------------------------------------------------
dat <- data.frame(year = 1875:1972,
level = as.vector(LakeHuron))
ggplot(dat, aes(year, level)) +
grid_bars(dat, "level", 10) +
geom_line(colour = "steelblue", size = 1.2) +
theme_classic()
But this requires specifying data and y again. How can I take those directly from the ggplot call?
Going by the options for extending ggplot2 described in Hadley Wickham's book on ggplot2, you probably have to set up your own Geom or Stat layer to achieve the desired result. This way you can access the data and aesthetics specified in ggplot(), or even pass different data and aesthetics to your function. I am still a newbie at writing extensions for ggplot2, but a first approach may look like this:
library(ggplot2)
# Make bars dataframe
make_bars_df <- function(y, n) {
breaks <- pretty(y, n)
len <- length(breaks) - 1
all_bars <- data.frame(
group = rep(1:len, 4),
x = c(rep(-Inf, len), rep(Inf, len * 2), rep(-Inf, len)),
y = c(rep(breaks[-length(breaks)], 2), rep(breaks[-1], 2))
)
all_bars[all_bars$group %in% (1:len)[c(FALSE, TRUE)], ]
}
# Setup Geom
geom_grid_bars_y <- function(mapping = NULL, data = NULL, stat = "identity",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, n = 5, ...) {
layer(
geom = GeomGridBarsY, mapping = mapping, data = data, stat = stat,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(n = n, ...)
)
}
GeomGridBarsY <- ggproto("GeomGridBarsY", Geom,
required_aes = c("y"),
default_aes = aes(alpha = NA, colour = NA, fill = "gray90", group = NA,
linetype = "solid", size = 0.5, subgroup = NA),
non_missing_aes = aes("n"),
setup_data = function(data, params) {
transform(data)
},
draw_group = function(data, panel_scales, coord, n = n) {
bars <- make_bars_df(data[["y"]], n)
# setup data for GeomPolygon
## If you want this to work with facets you have to take care of the PANEL
bars$PANEL <- factor(1)
# Drop x, y, group from data
d <- data[ , setdiff(names(data), c("x", "y", "group"))]
d <- d[!duplicated(d), ]
# Merge information in data to bars
bars <- merge(bars, d, by = "PANEL")
# Set color = fill
bars[["colour"]] <- bars[["fill"]]
# Draw
grid::gList(
ggplot2::GeomPolygon$draw_panel(bars, panel_scales, coord)
)
},
draw_key = draw_key_rect
)
grid_bars <- function(n = 5, fill = "gray90") {
list(
geom_grid_bars_y(n = n, fill = fill),
scale_y_continuous(breaks = scales::pretty_breaks(n = n)),
theme(panel.grid = element_blank())
)
}
dat <- data.frame(year = 1875:1972,
level = as.vector(LakeHuron))
ggplot(dat, aes(year, level)) +
grid_bars(n = 10, fill = "gray95") +
geom_line(colour = "steelblue", size = 1.2) +
theme_classic()
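To make the alternating-band selection used in make_bars_df() more concrete, here is a stripped-down sketch that keeps every second interval between the pretty() breaks and draws it with geom_rect(). It is not the Geom extension above: it takes the breaks straight from the data again, and the names `keep`, `bands` and `p_bands` are invented for the illustration.

```r
library(ggplot2)

dat <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))

breaks <- pretty(dat$level, 10)
len <- length(breaks) - 1          # number of intervals between breaks

# The logical index c(FALSE, TRUE) is recycled, so this keeps intervals 2, 4, 6, ...
keep <- (1:len)[c(FALSE, TRUE)]
bands <- data.frame(ymin = breaks[keep], ymax = breaks[keep + 1])

p_bands <- ggplot(dat, aes(year, level)) +
  geom_rect(data = bands, aes(ymin = ymin, ymax = ymax),
            xmin = -Inf, xmax = Inf,        # span the full panel width
            fill = "gray90", inherit.aes = FALSE) +
  geom_line(colour = "steelblue") +
  scale_y_continuous(breaks = breaks) +
  theme_classic()
p_bands
```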
Just for reference:
As a first and simple approach to get grid bars, one could simply adjust the size of the grid lines via theme(), like so:
# Simple approach via theme
ggplot(dat, aes(year, level)) +
geom_line(colour = "steelblue", size = 1.2) +
scale_y_continuous(breaks = scales::pretty_breaks(n = 10)) +
theme_classic() +
theme(panel.grid.major.y = element_line(size = 8))
Created on 2020-06-14 by the reprex package (v0.3.0)
I have the code below, and it works fine. The problem is, I would like to add "k" and plot a straight line similar to "z", but "k" is a vector of different numbers. Each element in "k" should be plotted as a line on the 3 facets created. If k was a singular value, I would just repeat the geom_segment() command with different y limits. Is there an easy way to do this? The final output should look like attached, assuming I could draw straight lines.
x <- iris[-1:-3]
bw <- 1
nbin <- 100
y <- head(iris, 50)[2]
z <- 1
k <- c(2, 3, 4)
ggplot(x, aes(x = Petal.Width)) +
geom_density(aes(y = bw *..count.., fill = Species), size = 1, alpha = 0.4) +
geom_segment(aes(x = 5, y = 250, xend = z, yend = 250, color = "red")) +
facet_wrap(~Species)+
scale_x_continuous(labels = scales::math_format(10^.x), limits = c(0, 5), expand = c(0,0)) +
scale_y_continuous(expand = c(0,0), limits = c(0, NA)) +
annotation_logticks(sides = "b", short=unit(-1,"mm"), mid=unit(-2,"mm"), long=unit(-3,"mm")) +
coord_cartesian(clip='off') + theme(panel.background = element_blank(),
panel.border = element_rect(colour = "black", fill=NA))
You can try this, assuming that your plot is saved as p1.
k_data = data.frame(k, Species = levels(x$Species))
p1 + geom_segment(data = k_data, aes(x =5, y = 200, xend = k, yend = 200),
color = "blue", inherit.aes = F)
The idea is to create a data frame with the columns k and Species and use this data exclusively in a geom by setting inherit.aes = F.
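A self-contained sketch of that idea is shown below, using plain iris and a default density so it runs without the p1 object from the question. The y position of 1 and the names `k_data`, `p_seg` and `seg` are arbitrary choices for this illustration. Because k_data carries a Species column, facet_wrap() places each row in its matching panel.

```r
library(ggplot2)

k <- c(2, 3, 4)

# One row per facet: the Species column is what ties each k to a panel
k_data <- data.frame(k = k, Species = levels(iris$Species))

p_seg <- ggplot(iris, aes(x = Petal.Width)) +
  geom_density(aes(fill = Species), alpha = 0.4) +
  geom_segment(data = k_data,
               aes(x = 5, y = 1, xend = k, yend = 1),
               colour = "blue", inherit.aes = FALSE) +
  facet_wrap(~Species)

# Check which panel each segment ended up in
seg <- layer_data(p_seg, 2)
seg[, c("PANEL", "xend")]
```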
In this solution, the value of k is made part of the data set being plotted through a pipe. It is a temporary modification of the data set, since it is not assigned back to it nor to any other data set.
library(ggplot2)
library(dplyr)
x <- iris[-1:-3]
str(x)
bw <- 1
nbin <- 100
y <- head(iris, 50)[2]
z <- 1
k <- c(2, 3, 4)
x %>%
mutate(k = rep(k, each = 50)) %>%
ggplot(aes(x = Petal.Width)) +
geom_density(aes(y = bw *..count.., fill = Species), size = 1, alpha = 0.4) +
geom_segment(aes(x = 5, y = 250, xend = z, yend = 250), color = "red") +
geom_segment(aes(x = 5, y = 200, xend = k, yend = 200), color = "blue") +
facet_wrap(~Species)+
scale_x_continuous(labels = scales::math_format(10^.x), limits = c(0, 5), expand = c(0,0)) +
scale_y_continuous(expand = c(0,0), limits = c(0, NA)) +
annotation_logticks(sides = "b", short=unit(-1,"mm"), mid=unit(-2,"mm"), long=unit(-3,"mm")) +
coord_cartesian(clip='off') +
theme(panel.background = element_blank(),
panel.border = element_rect(colour = "black", fill=NA))
I'd like to break the legend into categories rather than having a continuous range of colours. Could someone kindly help me for the specific example I am using here? Below is my current trial with colour breaks at 40, 60 and 80. Thank you very much!
library(raster)
library(ggplot2)
library(maptools)
data("wrld_simpl")
#sample raster
r <- raster(ncol=10, nrow=20)
r[] <- 1:ncell(r)
extent(r) <- extent(c(-180, 180, -70, 70))
#plotting
var_df <- as.data.frame(rasterToPoints(r))
p <- ggplot() +
geom_polygon(data = wrld_simpl[wrld_simpl@data$UN!="10",],
aes(x = long, y = lat, group = group),
colour = "black", fill = "grey")
p <- p + geom_raster(data = var_df, aes(x = x, y = y, fill = layer))
p <- p + coord_equal() + theme_bw() +labs(x="", y="")
p <- p + theme(legend.key=element_blank(),
axis.text.y =element_text(size=16),
axis.text.x =element_text(size=16),
legend.text =element_text(size=12),
legend.title=element_text(size=12))
# p <- p + scale_fill_gradientn(colours = rev(terrain.colors(10)))
p <- p + scale_colour_manual(values = c("red", "blue", "green","yellow"),
breaks = c("40", "60", "80", max(var_df$layer)),
labels = c("1-40", "40-60", "60-80", "80+"))
p <- p + geom_polygon(data = wrld_simpl[wrld_simpl@data$UN!="10",],
aes(x = long, y = lat, group = group),
colour = "black", fill = NA)
p
Current continuous legend:
Example of legend with breaks:
Here you go. I took the plot_discrete_cbar() function written by @AF7 from here
library(raster)
library(ggplot2)
library(maptools)
# Plot discrete colorbar function
plot_discrete_cbar = function (
# Vector of breaks. If +-Inf are used, triangles will be added to the sides of the color bar
breaks,
palette = "Greys", # RColorBrewer palette to use
# Alternatively, manually set colors
colors = RColorBrewer::brewer.pal(length(breaks) - 1, palette),
direction = 1, # Flip colors? Can be 1 or -1
spacing = "natural", # Spacing between labels. Can be "natural" or "constant"
border_color = NA, # NA = no border color
legend_title = NULL,
legend_direction = "horizontal", # Can be "horizontal" or "vertical"
font_size = NULL,
expand_size = 1, # Controls spacing around legend plot
spacing_scaling = 1, # Multiplicative factor for label and legend title spacing
width = 0.1, # Thickness of color bar
triangle_size = 0.1 # Relative width of +-Inf triangles
) {
require(ggplot2)
if (!(spacing %in% c("natural", "constant"))) stop("Spacing must be either 'natural' or 'constant'")
if (!(direction %in% c(1, -1))) stop("Direction must be either 1 or -1")
if (!(legend_direction %in% c("horizontal", "vertical"))) {
stop("Legend_direction must be either 'horizontal' or 'vertical'")
}
breaks = as.numeric(breaks)
new_breaks = sort(unique(breaks))
if (any(new_breaks != breaks)) warning("Wrong order or duplicated breaks")
breaks = new_breaks
if (is.function(colors)) colors = colors(length(breaks) - 1)
if (length(colors) != length(breaks) - 1) {
stop("Number of colors (", length(colors), ") must be equal to number of breaks (",
length(breaks), ") minus 1")
}
if (!missing(colors)) {
warning("Ignoring RColorBrewer palette '", palette, "', since colors were passed manually")
}
if (direction == -1) colors = rev(colors)
inf_breaks = which(is.infinite(breaks))
if (length(inf_breaks) != 0) breaks = breaks[-inf_breaks]
plotcolors = colors
n_breaks = length(breaks)
labels = breaks
if (spacing == "constant") {
breaks = 1:n_breaks
}
r_breaks = range(breaks)
if(is.null(font_size)) {
print("Legend key font_size not set. Use default value = 5")
font_size <- 5
} else {
print(paste0("font_size = ", font_size))
font_size <- font_size
}
cbar_df = data.frame(stringsAsFactors = FALSE,
y = breaks,
yend = c(breaks[-1], NA),
color = as.character(1:n_breaks)
)[-n_breaks,]
xmin = 1 - width/2
xmax = 1 + width/2
cbar_plot = ggplot(cbar_df, aes(xmin = xmin, xmax = xmax,
ymin = y, ymax = yend, fill = color)) +
geom_rect(show.legend = FALSE,
color = border_color)
if (any(inf_breaks == 1)) { # Add < arrow for -Inf
firstv = breaks[1]
polystart = data.frame(
x = c(xmin, xmax, 1),
y = c(rep(firstv, 2), firstv - diff(r_breaks) * triangle_size)
)
plotcolors = plotcolors[-1]
cbar_plot = cbar_plot +
geom_polygon(data = polystart, aes(x = x, y = y),
show.legend = FALSE,
inherit.aes = FALSE,
fill = colors[1],
color = border_color)
}
if (any(inf_breaks > 1)) { # Add > arrow for +Inf
lastv = breaks[n_breaks]
polyend = data.frame(
x = c(xmin, xmax, 1),
y = c(rep(lastv, 2), lastv + diff(r_breaks) * triangle_size)
)
plotcolors = plotcolors[-length(plotcolors)]
cbar_plot = cbar_plot +
geom_polygon(data = polyend, aes(x = x, y = y),
show.legend = FALSE,
inherit.aes = FALSE,
fill = colors[length(colors)],
color = border_color)
}
if (legend_direction == "horizontal") { # horizontal legend
mul = 1
x = xmin
xend = xmax
cbar_plot = cbar_plot + coord_flip()
angle = 0
legend_position = xmax + 0.1 * spacing_scaling
} else { # vertical legend
mul = -1
x = xmax
xend = xmin
angle = -90
legend_position = xmax + 0.2 * spacing_scaling
}
cbar_plot = cbar_plot +
geom_segment(data = data.frame(y = breaks, yend = breaks),
aes(y = y, yend = yend),
x = x - 0.05 * mul * spacing_scaling, xend = xend,
inherit.aes = FALSE) +
annotate(geom = 'text', x = x - 0.1 * mul * spacing_scaling, y = breaks,
label = labels,
size = font_size) +
scale_x_continuous(expand = c(expand_size, expand_size)) +
scale_fill_manual(values = plotcolors) +
theme_void()
if (!is.null(legend_title)) { # Add legend title
cbar_plot = cbar_plot +
annotate(geom = 'text', x = legend_position, y = mean(r_breaks),
label = legend_title,
angle = angle,
size = font_size)
}
return(cbar_plot)
}
Cut data into bins for the discrete colorbar
myvalues <- c(seq(0, 200, 40), Inf)
var_df$cuts <- cut(var_df$layer, myvalues, include.lowest = TRUE)
levels(var_df$cuts)
#> [1] "[0,40]" "(40,80]" "(80,120]" "(120,160]" "(160,200]" "(200,Inf]"
Plot the raster
p <- ggplot() +
geom_polygon(data = wrld_simpl[wrld_simpl@data$UN != "10", ],
aes(x = long, y = lat, group = group),
colour = "black", fill = "grey")
p <- p + geom_raster(data = var_df, aes(x = x, y = y, fill = cuts)) # matching cuts & fill
p <- p + coord_equal() + theme_minimal() + labs(x="", y="")
p <- p + theme(legend.key =element_blank(),
axis.text.y =element_text(size=16),
axis.text.x =element_text(size=16),
legend.text =element_text(size=12),
legend.title=element_text(size=12))
p <- p + scale_fill_brewer("Layer", palette = "YlGnBu", drop = FALSE)
p <- p + geom_polygon(data = wrld_simpl[wrld_simpl@data$UN != "10", ],
aes(x = long, y = lat, group = group),
colour = "black", fill = NA)
p <- p + theme(legend.position = "none")
Plot the discrete colorbar
dbar <- plot_discrete_cbar(myvalues,
palette = "YlGnBu",
legend_title = NULL,
spacing = "natural")
# reduce top and bottom margins
p1 <- p + theme(plot.margin = unit(c(10, 10, -35, 10), "pt"))
dbar <- dbar + theme(plot.margin = unit(c(-35, 10, -30, 10), "pt"))
Combine two plots together
# devtools::install_github('baptiste/egg')
library(egg)
ggarrange(p1, dbar, nrow = 2, ncol = 1, heights = c(1, 0.4))
Created on 2018-10-18 by the reprex package (v0.2.1.9000)
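As a possible alternative to a hand-built colour bar, more recent ggplot2 releases (3.3.0 and later, as far as I know) ship binned fill scales such as scale_fill_stepsn(). The sketch below is not the approach used in the answer above; it uses a made-up grid in place of the raster and the break values 40, 60 and 80 from the question, and the colours are interpolated from the supplied vector rather than assigned one per bin.

```r
library(ggplot2)

# Toy grid standing in for the rasterToPoints() output (values 1..200)
df <- expand.grid(x = 1:10, y = 1:20)
df$layer <- seq_len(nrow(df))

p_steps <- ggplot(df, aes(x, y, fill = layer)) +
  geom_raster() +
  scale_fill_stepsn(
    colours = c("red", "blue", "green", "yellow"),
    breaks = c(40, 60, 80)    # bins: below 40, 40-60, 60-80, above 80
  )
p_steps
```

The legend is then drawn as discrete steps at the chosen breaks, without assembling a second plot.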
I am very intrigued by the following visualization (Decile term)
And I wonder how it would be possible to do it in R.
There are of course histograms and density plots, but they do not make such a nice visualization. In particular, I would like to know if it is possible to do it with ggplot/tidyverse.
edit in response to the comment
library(dplyr)
library(ggplot2)
someData <- tibble(x = rnorm(1000))
ggplot(someData, aes(x = x)) +
geom_histogram()
This produces a histogram (see http://www.r-fiddle.org/#/fiddle?id=LQXazwMY&version=1)
But how can I get the colorful bars? How do I implement the small rectangles? (The arrows are less relevant.)
You have to define a number of breaks, and use approximate deciles that match those histogram breaks. Otherwise, two deciles will end up in one bar.
d <- tibble::tibble(x = rnorm(1000))
breaks <- seq(min(d$x), max(d$x), length.out = 50)
quantiles <- quantile(d$x, seq(0, 1, 0.1))
quantiles2 <- sapply(quantiles, function(x) breaks[which.min(abs(x - breaks))])
d$bar <- as.numeric(as.character(cut(d$x, breaks, na.omit((breaks + dplyr::lag(breaks)) / 2))))
d$fill <- cut(d$x, quantiles2, na.omit((quantiles2 + dplyr::lag(quantiles2)) / 2))
ggplot(d, aes(bar, y = 1, fill = fill)) +
geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1])
Or with more distinct colors:
ggplot(d, aes(bar, y = 1, fill = fill)) +
geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1]) +
scale_fill_brewer(type = 'qual', palette = 3) # The only qual palette with enough colors
Add some styling and increase the breaks to 100:
ggplot(d, aes(bar, y = 1, fill = fill)) +
geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1], size = 0.3) +
scale_fill_brewer(type = 'qual', palette = 3) +
theme_classic() +
coord_fixed(diff(breaks)[1], expand = FALSE) + # makes square blocks
labs(x = 'x', y = 'count')
And here is a function to make that last one:
decile_histogram <- function(data, var, n_breaks = 100) {
breaks <- seq(min(data[[var]]), max(data[[var]]), length.out = n_breaks)
quantiles <- quantile(data[[var]], seq(0, 1, 0.1))
quantiles2 <- sapply(quantiles, function(x) breaks[which.min(abs(x - breaks))])
data$bar <- as.numeric(as.character(
cut(data[[var]], breaks, na.omit((breaks + dplyr::lag(breaks)) / 2)))
)
data$fill <- cut(data[[var]], quantiles2, na.omit((quantiles2 + dplyr::lag(quantiles2)) / 2))
ggplot2::ggplot(data, ggplot2::aes(bar, y = 1, fill = fill)) +
ggplot2::geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1], size = 0.3) +
ggplot2::scale_fill_brewer(type = 'qual', palette = 3) +
ggplot2::theme_classic() +
ggplot2::coord_fixed(diff(breaks)[1], expand = FALSE) +
ggplot2::labs(x = 'x', y = 'count')
}
Use as:
d <- data.frame(x = rnorm(1000))
decile_histogram(d, 'x')
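The step that snaps each decile to its nearest histogram break is the key to keeping two deciles out of the same bar, so here it is in isolation. This is a minimal sketch with a fixed seed; the names `deciles` and `snapped` are chosen for the illustration and correspond to quantiles and quantiles2 in the answer.

```r
set.seed(1)
x <- rnorm(1000)

# Histogram breaks and exact deciles, as in the answer above
breaks <- seq(min(x), max(x), length.out = 50)
deciles <- quantile(x, seq(0, 1, 0.1))

# Replace each decile by the closest break, so colour changes
# can only happen at bar edges
snapped <- vapply(deciles,
                  function(q) breaks[which.min(abs(q - breaks))],
                  numeric(1))

# Compare exact and snapped values side by side
data.frame(decile = deciles, snapped = snapped)
```

With too few breaks, two neighbouring deciles can snap to the same break, which would make the later cut() call fail on duplicated breaks, so a reasonably fine grid of breaks is needed.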