I am trying to draw a line through the density plots from ggridges:
library(ggplot2)
library(ggridges)
ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_density_ridges(rel_min_height = 0.01)
The line should indicate the highest point of each density, with a label showing the value of x at that point, something like the image below. Any suggestions on accomplishing this are much appreciated.
One neat approach is to interrogate the ggplot object itself and use it to construct additional features:
# This is the OP chart
library(ggplot2)
library(ggridges)
library(dplyr) # provides %>%, group_by(), filter() and ungroup() used below
gr <- ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_density_ridges(rel_min_height = 0.01)
Edit: This next part has been shortened, using purrr::pluck to extract the whole data part of the list, instead of manually specifying the columns we'd need later.
# Extract the data ggplot used to prepare the figure.
# purrr::pluck is grabbing the "data" list from the list that
# ggplot_build creates, and then extracting the first element of that list.
ingredients <- ggplot_build(gr) %>% purrr::pluck("data", 1)
# Pick the highest point. Could easily add quantiles or other features here.
density_lines <- ingredients %>%
  group_by(group) %>%
  filter(density == max(density)) %>%
  ungroup()
# Use the highest point to add more geoms
ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_density_ridges(rel_min_height = 0.01) +
  geom_segment(data = density_lines,
               aes(x = x, y = ymin, xend = x,
                   yend = ymin + density * scale * iscale)) +
  geom_text(data = density_lines,
            aes(x = x, y = ymin + 0.5 * (density * scale * iscale),
                label = round(x, 2)),
            hjust = -0.2)
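As a rough cross-check that does not depend on ggplot_build, the peak of each density can also be located with base R's density(). This is only a sketch: the name peaks is made up for illustration, and because ggridges chooses its own joint bandwidth, the x values may differ slightly from the ones labelled in the plot.

```r
# Base-R cross-check: find the x position of the density peak per species.
# Assumption: the default bandwidth of density() is used, which is not
# necessarily the bandwidth ggridges picks for the ridgeline plot.
peaks <- do.call(rbind, lapply(split(iris$Sepal.Length, iris$Species), function(v) {
  d <- density(v)      # kernel density estimate for one species
  i <- which.max(d$y)  # index of the highest point on the curve
  data.frame(x = d$x[i], density = d$y[i])
}))
peaks$Species <- rownames(peaks)
peaks
```

If the labelled values in the plot are far from these, that would suggest the filter on max(density) picked up something unexpected.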
I would like to make a PCA score plot using ggplot2, and then convert it into an interactive plot using plotly.
I want to add a frame around each group (not an ellipse via stat_ellipse, which I know works).
My problem is that when I use the sample name as the tooltip in ggplotly, the frame disappears. I don't know how to fix it.
Below is my code:
library(ggplot2)
library(plotly)
library(dplyr)
## Demo data
dat <- iris[1:4]
Group <- iris$Species
## Calculate PCA
df_pca <- prcomp(dat, center = TRUE, scale. = FALSE)
df_pcs <- data.frame(df_pca$x, Group = Group)
percentage <- round(df_pca$sdev^2 / sum(df_pca$sdev^2) * 100, 2)
# Use the PC column names only, so labels and percentages have the same length
percentage <- paste(colnames(df_pca$x), "(", paste(as.character(percentage), "%", ")", sep = ""))
## Visualization
Sample_Name <- rownames(df_pcs)
p <- ggplot(df_pcs, aes(x = PC1, y = PC2, color = Group, label = Sample_Name)) +
xlab(percentage[1]) +
ylab(percentage[2]) +
geom_point(size = 3)
ggplotly(p, tooltip = "label")
Up to here it works! You can see that the sample names are shown properly in the ggplotly plot.
Next I tried to add a frame:
## add frame
hull_group <- df_pcs %>%
dplyr::mutate(Sample_Name = Sample_Name) %>%
dplyr::group_by(Group) %>%
dplyr::slice(chull(PC1, PC2))
p2 <- p +
ggplot2::geom_polygon(data = hull_group, aes(fill = Group), alpha = 0.1)
You can see that the static plot still works! The frame is added properly.
However, when I tried to convert it to an interactive plotly plot, the frame disappeared.
ggplotly(p2, tooltip = "label")
Thanks a lot for your help.
It works if you move the data and mapping from the ggplot() call to the geom_point() call:
p2 <- ggplot() +
geom_point(data = df_pcs, mapping = aes(x = PC1, y = PC2, color = Group, label = Sample_Name), size = 3) +
ggplot2::geom_polygon(data = hull_group, aes(x = PC1, y = PC2, fill = Group, group = Group), alpha = 0.2)
ggplotly(p2, tooltip = "label")
You might want to change the order of the geom_point and geom_polygon to make sure that the points are on top of the polygon (this also affects the tooltip location).
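A minimal sketch of that reordering, rebuilt from the question's data so it runs on its own. The name p3 and the base-R hull computation are substitutions made for illustration, not part of the original answer.

```r
library(ggplot2)

# Rebuild the inputs from the question so this block is self-contained.
df_pca <- prcomp(iris[1:4], center = TRUE, scale. = FALSE)
df_pcs <- data.frame(df_pca$x, Group = iris$Species)
df_pcs$Sample_Name <- rownames(df_pcs)

# Convex hull per group (base-R stand-in for the dplyr pipeline above).
hull_group <- do.call(rbind, lapply(split(df_pcs, df_pcs$Group), function(g) {
  g[chull(g$PC1, g$PC2), ]
}))

# Polygon layer first, point layer second: later layers are drawn on top.
# geom_point() may warn about the unknown "label" aesthetic; it is only
# carried along so that it can serve as the tooltip.
p3 <- ggplot() +
  geom_polygon(data = hull_group,
               aes(x = PC1, y = PC2, fill = Group, group = Group), alpha = 0.2) +
  geom_point(data = df_pcs,
             aes(x = PC1, y = PC2, color = Group, label = Sample_Name), size = 3)
```

p3 can then be passed to ggplotly(p3, tooltip = "label") in the same way as p2.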
I have an empty panel in my faceted ggplot, and I would like to insert a standalone plot into it. Is this possible? See below for example code.
I found a possible solution here, but can't get it to 'look nice'. To 'look nice', I want the standalone plot to have the same dimensions as one of the faceted panels.
library(ggplot2)
library(plotly)
data("mpg")
first_plot = ggplot(data = mpg, aes(x = trans, y = cty)) +
geom_point(size= 1.3)
facet_plot = ggplot(data = mpg, aes(x = year, y = cty)) +
geom_point(size = 1.3) +
facet_wrap(~manufacturer)
facet_plot # there is room for one more panel, which is where I want first_plot to go
# try to merge, but this makes the first plot huge compared with the faceted panels
subplot(first_plot, facet_plot, which_layout = 2)
Besides manipulating the gtable or using patchwork, one way to achieve your desired result is some data wrangling that adds the standalone plot as an additional facet. I am not sure whether this will work for your real data, but at least for mpg you could do:
library(ggplot2)
library(dplyr)
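mpg_bind <- list(standalone = mpg, facet = mpg) %>%
  bind_rows(.id = "id") %>%
  # ifelse() coerces year to character here, so x is discrete in every panel
  mutate(x = ifelse(id == "standalone", trans, year),
         facet = ifelse(id == "standalone", "all", manufacturer),
         # after = Inf moves "all" to the end, i.e. into the last panel
         facet = forcats::fct_relevel(facet, "all", after = Inf))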
ggplot(data = mpg_bind, aes(x = x, y = cty)) +
geom_point(size = 1.3) +
facet_wrap(~facet, scales = "free_x")
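One reason this may not carry over to other data: ifelse() returns a single vector type, so mixing a character column (trans) with an integer column (year) yields character values, and the x axis becomes discrete in every panel. A tiny sketch with a made-up toy data frame shows the coercion:

```r
# Toy stand-in for two rows of mpg_bind (values chosen for illustration only).
toy <- data.frame(id    = c("standalone", "facet"),
                  trans = c("auto(l5)", "manual(m5)"),
                  year  = c(1999L, 2008L),
                  stringsAsFactors = FALSE)

# Character and integer inputs are combined into one character vector.
x <- ifelse(toy$id == "standalone", toy$trans, toy$year)
class(x)
```

For mpg this is harmless because year has only two values, but a truly continuous x in the faceted panels would lose its numeric scale.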
I am trying to assign a colour scale/gradient to some violin plots based on the y-axis value of Income. However, I only get a white violin plot. I can change the colour based on state.region but not Income.
Data
USA.states <- data.frame(state.region,state.x77)
Code
p <- ggplot(USA.states, aes(x = state.region, y = Income, fill = Income)) +
  geom_violin(trim = FALSE) +
  ggtitle("Violin plot of income and Population")
p + scale_fill_gradient(low="red",high="blue")
I assigned the fill to Income but it just ends up filled with white.
You can create a pseudo-fill from segments, and you can create those from the underlying data (in the ggplot_built object) directly.
If you want an additional polygon outline, you would still need to create the polygons manually though, using x and y coordinates as calculated for the segments. (There is certainly a cleverer way to put this into a data frame than below, so don't take this as gospel).
On another note, the violins in the original plot seem to be scaled, but I don't understand exactly how, so I scaled them with a constant that I found by trial and error.
library(tidyverse)
USA.states <- data.frame(state.region,state.x77)
p <- ggplot(USA.states, aes(x = state.region, y = Income, fill = Income)) +
  geom_violin(trim = FALSE)
mywidth <- .35 # bit of trial and error
# This is all you need for the fill:
vl_fill <- data.frame(ggplot_build(p)$data) %>%
mutate(xnew = x- mywidth*violinwidth, xend = x+ mywidth*violinwidth)
# Bit convoluted for the outline, need to be rearranged: the order matters
vl_poly <-
vl_fill %>%
select(xnew, xend, y, group) %>%
pivot_longer(-c(y, group), names_to = "oldx", values_to = "x") %>%
arrange(y) %>%
split(., .$oldx) %>%
map(., function(x) {
if(all(x$oldx == "xnew")) x <- arrange(x, desc(y))
x
}) %>%
bind_rows()
ggplot() +
geom_polygon(data = vl_poly, aes(x, y, group = group),
color= "black", size = 1, fill = NA) +
geom_segment(data = vl_fill, aes(x = xnew, xend = xend, y = y, yend = y,
color = y))
Created on 2021-04-14 by the reprex package (v1.0.0)
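A possible alternative to the trial-and-error constant, offered as a sketch rather than a verified fix: as far as I can tell, the data returned by ggplot_build() for a violin layer also carries xmin and xmax (x minus and plus half the violin width), and the outline is drawn at x plus or minus violinwidth times that half-width. If that assumption holds for your ggplot2 version, the segment end points can be computed directly:

```r
library(ggplot2)

USA.states <- data.frame(state.region, state.x77)
p <- ggplot(USA.states, aes(x = state.region, y = Income)) +
  geom_violin(trim = FALSE)

# Layer data as prepared by ggplot2 for the violin geom.
built <- ggplot_build(p)$data[[1]]

# Assumption: built has xmin/xmax columns and the violin outline sits at
# x -/+ violinwidth * half-width. Check names(built) if this errors.
vl_fill <- transform(built,
                     xnew = x - violinwidth * (x - xmin),
                     xend = x + violinwidth * (xmax - x))
```

With the default violin width this half-width should come out somewhat larger than the 0.35 used above, which would explain why the hand-tuned violins look slightly narrower than the original.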
I have a dataset with numeric values and a categorical variable. The distribution of the numeric variable differs for each category. I want to plot a "density plot" for each category so that the curves sit visually below the density of the entire dataset.
This is similar to the components of a mixture model, without calculating the mixture model (as I already know the categorical variable that splits the data).
If I let ggplot group by the categorical variable, each of the four densities is a real density and integrates to one.
library(ggplot2)
ggplot(iris, aes(x = Sepal.Width)) + geom_density() + geom_density(aes(x = Sepal.Width, group = Species, colour = 'Species'))
What I want is for the density of each category to be a sub-density (not integrating to 1), similar to the following code (which I only implemented for two of the three iris species):
library(data.table) # needed for as.data.table() and the [i, j] syntax below
myIris <- as.data.table(iris)
# calculate density for entire dataset
dens_entire <- density(myIris[, Sepal.Width], cut = 0)
dens_e <- data.table(x = dens_entire[[1]], y = dens_entire[[2]])
# calculate density for dataset with setosa
dens_setosa <- density(myIris[Species == 'setosa', Sepal.Width], cut = 0)
dens_sa <- data.table(x = dens_setosa[[1]], y = dens_setosa[[2]])
# calculate density for dataset with versicolor
dens_versicolor <- density(myIris[Species == 'versicolor', Sepal.Width], cut = 0)
dens_v <- data.table(x = dens_versicolor[[1]], y = dens_versicolor[[2]])
# plot densities as mixture model
ggplot(dens_e, aes(x=x, y=y)) + geom_line() + geom_line(data = dens_sa, aes(x = x, y = y/2.5, colour = 'setosa')) +
geom_line(data = dens_v, aes(x = x, y = y/1.65, colour = 'versicolor'))
resulting in
Above, I hard-coded the numbers that reduce the y values. Is there any way to do this with ggplot, or to calculate them?
Thanks for your ideas.
Do you mean something like this? You need to change the scale though.
ggplot(iris, aes(x = Sepal.Width)) +
geom_density(aes(y = ..count..)) +
geom_density(aes(x = Sepal.Width, y = ..count..,
group = Species, colour = Species))
Another option may be
ggplot(iris, aes(x = Sepal.Width)) +
geom_density(aes(y = ..density..)) +
geom_density(aes(x = Sepal.Width, y = ..density../3,
group = Species, colour = Species))
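Dividing by 3 works here because each species makes up exactly one third of iris. For unequal group sizes, a natural generalisation is to weight each group's density by its share of the observations. The sketch below does this in base R; the names w, dens and scaled are made up for illustration.

```r
# Share of observations per species (1/3 each for iris).
w <- table(iris$Species) / nrow(iris)

# One kernel density estimate per species, each integrating to about 1.
dens <- lapply(split(iris$Sepal.Width, iris$Species), density)

# Scale each curve by its group's share so it integrates to about that share.
scaled <- Map(function(d, wt) data.frame(x = d$x, y = d$y * wt),
              dens, as.numeric(w))

str(scaled$setosa)
```

Each element of scaled can be drawn with geom_line(). Note that the scaled curves will not add up exactly to the overall density, because each group gets its own bandwidth.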
Using R's standard plot or, better still, ggplot2, is there a way to create a plot like this?
Note especially the horizontal lines across selected bars, with an asterisk on top of each.
I don't know of an easy way to annotate graphs like this in ggplot2. Here's a relatively generic approach to make the data you'd need to plot. You can use a similar approach to annotate the relationships as necessary. I'll use the iris dataset as an example:
library(ggplot2)
library(plyr) #for summarizing data
#summarize average sepal length by species
dat <- ddply(iris, "Species", summarize, length = mean(Sepal.Length))
#Create the data you'll need to plot for the horizontal lines
horzlines <- data.frame(x = 1,
xend = seq_along(dat$Species)[-1],
y = seq(from = max(dat$length), by = 0.5, length.out = length(unique(dat$Species))-1),
yend = seq(from = max(dat$length), by = 0.5, length.out = length(unique(dat$Species))-1),
label = c("foo", "bar")
)
ggplot() +
  # geom_col() draws bars from precomputed values (same result as stat = "identity")
  geom_col(data = dat, aes(Species, length)) +
  geom_segment(data = horzlines, aes(x = x, xend = xend, y = y, yend = yend)) +
  geom_text(data = horzlines, aes(x = (x + xend)/2, y = y + .25, label = label))
Giving you something like this:
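For reference, the helper data frame can also be built without plyr and without hard-coding the number of labels. This is a sketch under the same layout as above (every line starts at the first bar, and lines are stacked 0.5 apart); using "*" as the label is an assumption based on the asterisks mentioned in the question.

```r
# Mean sepal length per species, base-R equivalent of the ddply() call.
dat <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
names(dat)[2] <- "length"

# One horizontal line from the first bar to each of the other bars.
n_lines <- nrow(dat) - 1
heights <- seq(from = max(dat$length), by = 0.5, length.out = n_lines)
horzlines <- data.frame(x     = 1,
                        xend  = seq_len(nrow(dat))[-1],
                        y     = heights,
                        yend  = heights,
                        label = rep("*", n_lines))
horzlines
```

Because the number of lines is derived from nrow(dat), the same code works when the data has more than three groups.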