First we prepare some toy data that sufficiently resembles the one I am working with.
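library(ggplot2)
library(ggridges)
rawdata <- data.frame(Score = rnorm(1000, seq(1, 0, length.out = 10), sd = 1),
Group = rep(LETTERS[1:3], 10000))
stdev <- c(10.78,10.51,9.42)
col <- c("#1b9e77", "#d95f02", "#7570b3") # assumption: col is not defined in the post; any three colours work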
Now we plot the estimated densities via geom_density_ridges. I also add a grey highlight around zero via geom_rect. I also flip the chart with coord_flip.
p <- ggplot(rawdata, aes(x = Score, y = Group)) +
scale_y_discrete() +
geom_rect(inherit.aes = FALSE, mapping = aes(ymin = 0, ymax = Inf, xmin = -0.1 * min(stdev), xmax = 0.1 * max(stdev)),
fill = "grey", alpha = 0.5) +
geom_density_ridges(aes(fill = Group), scale = 0.5, size = 1, alpha=0.5) +
scale_color_manual(values = col) +
scale_fill_manual(values = col) +
labs(title="Toy Graph", y="Group", x="Value") +
coord_flip(xlim = c(-8, 8), ylim = NULL, expand = TRUE, clip = "on")
p
And this is the result I get, which is close to what I was expecting, apart from the enormous gap between the y axis and the start of the first factor on the x axis, A. I tried using expand = c(0, 0) inside scale_y_discrete(), following suggestions from other posts, but it does not make the gap any smaller. If possible, I would still like to keep a small gap. I have also been trying to flip the densities so that the gap is filled by the first factor's density plot, but I have been unsuccessful, as it is not as trivial as one might expect.
Sorry, I know these are technically two different questions: "How to reduce the gap from the y axis to the first density plot?" and "How to flip the densities to reduce the gap?" I would be happy with an answer to the first one alone, as I understand the second seems less straightforward.
Thanks in advance! Any help is appreciated.
Flipping the densities also effectively reduces the space, so this might be all you need to do. You can achieve it with a negative scale parameter:
ggplot(rawdata, aes(x = Score, y = Group)) +
scale_y_discrete() +
geom_rect(inherit.aes = FALSE,
mapping = aes(ymin = 0, ymax = Inf,
xmin = -0.1 * min(stdev),
xmax = 0.1 * max(stdev)),
fill = "grey", alpha = 0.5) +
geom_density_ridges(aes(fill = Group), scale = -0.5, size = 1, alpha = 0.5) +
scale_color_manual(values = col) +
scale_fill_manual(values = col) +
labs(title = "Toy Graph", y = "Group", x = "Value") +
coord_flip(xlim = c(-8, 8), ylim = NULL, expand = TRUE, clip = "on")
If you want to keep the densities pointing the same way but just reduce space on the left side, simply set hard limits in your coord_flip, with no expansion:
ggplot(rawdata, aes(x = Score, y = Group)) +
geom_rect(inherit.aes = FALSE,
mapping = aes(ymin = 0, ymax = Inf,
xmin = -0.1 * min(stdev),
xmax = 0.1 * max(stdev)),
fill = "grey", alpha = 0.5) +
geom_density_ridges(aes(fill = Group), scale = 0.5, size = 1, alpha = 0.5) +
scale_color_manual(values = col) +
scale_fill_manual(values = col) +
scale_y_discrete() +
labs(title = "Toy Graph", y = "Group", x = "Value") +
coord_flip(xlim = c(-8, 8), ylim = c(0.8, 4), expand = FALSE)
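The hard limits above can also be derived instead of typed in by hand. This is a minimal sketch, assuming the three groups sit at the integer positions 1, 2 and 3 on the discrete axis; ridge_limits() is a hypothetical helper written for this illustration and is not part of ggplot2 or ggridges.

```r
# Hypothetical helper (not part of ggplot2 or ggridges): compute limits for
# the discrete axis so that a small gap remains before the first group.
ridge_limits <- function(n_groups, gap = 0.2, headroom = 1) {
  # Discrete levels are drawn at the integer positions 1, 2, ..., n_groups.
  c(1 - gap, n_groups + headroom)
}

lims <- ridge_limits(n_groups = 3)
print(lims)  # lower limit just below group 1, upper limit above group 3

# Usage with the layers from the answer above (everything before coord_flip):
# ... + coord_flip(xlim = c(-8, 8), ylim = lims, expand = FALSE)
```

A smaller gap value moves the first ridge closer to the axis, and headroom leaves space for the density of the last group.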
I am trying to plot a polygon hull using ggplot and plotly.
Without a label the polygons are shown in the plot, but when I add a label to the aesthetics the polygons disappear.
library(data.table)
library(ggplot2)
library(dplyr)
library(plotly)
df <- data.table(continent = c(rep("America",3), rep("Europe",4)),
state = c("USA", "Brasil", "Chile", "Italy", "Swiss", "Spain", "Greece"),
X = rnorm(7, 5, 1),
Y = rnorm(7, -13, 1)
)
df$X_sd = sd(df$X)
df$Y_sd = sd(df$Y)
hull2 <- df %>%
group_by(continent) %>%
slice(chull(X,Y))
p <- df %>%
ggplot( aes(x=X,
y=Y,
fill = continent,
color = continent,
label=state))+
geom_polygon(data = hull2,
lwd = 1,
alpha = 0.1,
linetype = "dashed")+
geom_errorbarh(aes(xmin = X - X_sd,
xmax = X + X_sd),
size = 0.5,
alpha = 0.3) +
geom_errorbar(aes(ymin = Y - Y_sd,
ymax = Y + Y_sd),
size = 0.5,
alpha = 0.3) +
geom_point(shape=21,
color="black",
size=3)+
theme_bw()+
theme(legend.position = "none")
ggplotly(p)
How odd! If you move label = state to the aes of the last geom_, you'll get the standard warning, but it works and the state shows up in the tooltip.
The designation of color = continent shows up, as well. I am going to guess that you're not interested in having that in your tooltip, so I've added how you could change that at the end. There is a tooltip with the continent listed two times, but with the information about how to remove the color, you'll see how you might make further adjustments depending on the trace.
p <- df %>%
ggplot(aes(x = X, y = Y,
fill = continent,
color = continent #,
# label = state)
)) +
geom_polygon(data = hull2, lwd = 1,
alpha = 0.1, linetype = "dashed") +
geom_errorbarh(aes(xmin = X - X_sd,
xmax = X + X_sd),
size = 0.5, alpha = 0.3) +
geom_errorbar(aes(ymin = Y - Y_sd,
ymax = Y + Y_sd),
size = 0.5, alpha = 0.3) +
geom_point(shape = 21,
color = "black",
size = 3, aes(label = state)) +
theme_bw() + theme(legend.position = "none")
p
ggplotly(p)
To remove the color from the tooltip, assign ggplotly to an object. Then you can remove the string from the 7th and 8th trace.
p1 = ggplotly(p)
lapply(7:8,
function(i){
p1$x$data[[i]]$text <<- stringr::str_replace(p1$x$data[[i]]$text,
"continent: black<br />",
"")
})
p1
FYI, there are 8 traces that make up your plot. The first trace has the double continent text.
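The same tooltip edit can be written as a plain for loop, which avoids the `<<-` assignment inside lapply. This is a minimal sketch on a mock list that only mimics the shape of p1$x$data; the tooltip strings are invented for illustration. With the real object you would loop over 7:8 and assign into p1$x$data[[i]]$text.

```r
# Mock of the list found in p1$x$data: each trace carries a `text` vector
# used for the tooltip. The values here are invented for illustration.
traces <- list(
  list(text = "continent: America<br />X: 5.1"),
  list(text = "continent: black<br />state: USA"),
  list(text = "continent: black<br />state: Italy")
)

# Strip the unwanted fragment from selected traces; fixed = TRUE treats the
# pattern as a literal string rather than a regular expression.
for (i in 2:3) {
  traces[[i]]$text <- sub("continent: black<br />", "", traces[[i]]$text,
                          fixed = TRUE)
}

# Show the remaining tooltip text of every trace.
print(vapply(traces, function(tr) tr$text, character(1)))
```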
I would like to create a raincloud plot. I have successfully done it. But I would like to know if instead of the density curve, I can put a histogram (it's better for my dataset).
This is my code, in case it is useful:
ATSC <- ggplot(data = data, aes(y = atsc, x = numlecteur, fill = numlecteur)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), alpha = .5) +
geom_point(aes(y = atsc, color = numlecteur), position = position_jitter(width = .15), size = .5, alpha = 0.8) +
geom_point(data = sumld, aes(x = numlecteur, y = mean), position = position_nudge(x = 0.25), size = 2.5) +
geom_errorbar(data = sumld, aes(ymin = lower, ymax = upper, y = mean), position = position_nudge(x = 0.25), width = 0) +
guides(fill = FALSE) +
guides(color = FALSE) +
scale_color_brewer(palette = "Spectral") +
scale_y_continuous(breaks=c(0,2,4,6,8,10), labels=c("0","2","4","6","8","10"))+
scale_fill_brewer(palette = "Spectral") +
coord_flip() +
theme_bw() +
expand_limits(y=c(0, 10))+
xlab("Lecteur") + ylab("Age total sans check")+
raincloud_theme
I think we could maybe use geom_histogram(), but it doesn't work.
Thank you in advance for your help!
(sources : https://peerj.com/preprints/27137v1.pdf
https://neuroconscience.wordpress.com/2018/03/15/introducing-raincloud-plots/)
This is actually not that easy. There are a few challenges.
geom_histogram is "horizontal by nature", while the custom geom_flat_violin is vertical, as are boxplots. Hence the final call to coord_flip in that tutorial. To combine both, I think it is best to switch x and y, forget about coord_flip, and use ggstance::geom_boxploth instead.
Creating separate histograms for each category is another challenge. My workaround is to create facets and "merge them together".
The histograms are scaled way bigger than the width of the points/boxplots. My workaround is to scale them via the after_stat function.
Nudging the histograms to the right position above the boxplot and points is the last challenge. I am converting the discrete scale to a continuous one by mapping a constant numeric to the global y aesthetic, and then using the facet labels as discrete labels.
library(tidyverse)
my_data<-read.csv("https://data.bris.ac.uk/datasets/112g2vkxomjoo1l26vjmvnlexj/2016.08.14_AnxietyPaper_Data%20Sheet.csv")
my_datal <-
my_data %>%
pivot_longer(cols = c("AngerUH", "DisgustUH", "FearUH", "HappyUH"), names_to = "EmotionCondition", values_to = "Sensitivity")
# use y = -... to position boxplot and jitterplot below the histogram
ggplot(data = my_datal, aes(x = Sensitivity, y = -.5, fill = EmotionCondition)) +
# after_stat for scaling
geom_histogram(aes(y = after_stat(count/100)), binwidth = .05, alpha = .8) +
# from ggstance
ggstance::geom_boxploth( width = .1, outlier.shape = NA, alpha = 0.5) +
geom_point(aes(color = EmotionCondition), position = position_jitter(width = .15), size = .5, alpha = 0.8) +
# merged those calls to one
guides(fill = FALSE, color = FALSE) +
# scale_y_continuous(breaks = 1, labels = unique(my_datal$EmotionCondition))
scale_color_brewer(palette = "Spectral") +
scale_fill_brewer(palette = "Spectral") +
# facetting, because each histogram needs its own y
# strip position = left to fake discrete labels in continuous scale
facet_wrap(~EmotionCondition, nrow = 4, scales = "free_y" , strip.position = "left") +
# remove all continuous labels from the y axis
theme(axis.title.y = element_blank(), axis.text.y = element_blank(),
axis.ticks.y = element_blank())
Created on 2021-04-15 by the reprex package (v1.0.0)
I'd like to insert median lines for factor levels into a violin plot in ggplot2. Here's some reproducible data:
set.seed(12)
FactorVar <- sample(LETTERS[1:5], 500, replace = T)
NumericVar <- abs(rnorm(500))
df <- data.frame(FactorVar, NumericVar)
To get the grouped medians I use tapply:
medians <- tapply(df$NumericVar, df$FactorVar, FUN = median)
And this is the code for the plot. As can be seen, I'm inserting each median line individually. That's cumbersome and uneconomical:
library(ggplot2)
g <-
ggplot(data = df,
aes(x = FactorVar, y = NumericVar, fill = FactorVar)) +
geom_violin(scale = "count", trim = F, adjust = 0.75) +
geom_point(aes(y = NumericVar),
position = position_jitter(width = .15), size = 0.9, alpha = 0.8) +
geom_hline(yintercept = mean(NumericVar), color = "blue", size = 0.8, linetype = 4) +
geom_segment(x = 0.5, xend = 1.5, y= medians[1], yend = medians[1], color = "red", linetype = 2) +
geom_segment(x = 1.5, xend = 2.5, y = medians[2], yend = medians[2], color = "red", linetype = 2) +
geom_segment(x = 2.5, xend = 3.5, y = medians[3], yend = medians[3], color = "red", linetype = 2) +
geom_segment(x = 3.5, xend = 4.5, y = medians[4], yend = medians[4], color = "red", linetype = 2) +
geom_segment(x = 4.5, xend = 5.5, y = medians[5], yend = medians[5], color = "red", linetype = 2) +
guides(fill = FALSE) +
guides(color = FALSE) +
coord_flip() +
theme_gray(); g
How can the median segments be inserted in a single command? Also, observe how the median line for factor A is thinner than the others? Why's that?
One method (that simplifies the +/- axis) would be to facet it. Before, though, we'll need to put the medians into a frame, preferably with the same grouping factors as the original.
mediansdf <- data.frame(FactorVar=names(medians), NumericVar=medians)
g <-
ggplot(data = df,
aes(x = FactorVar, y = NumericVar, fill = FactorVar)) +
geom_violin(scale = "count", trim = F, adjust = 0.75) +
geom_point(aes(y = NumericVar),
position = position_jitter(width = .15), size = 0.9, alpha = 0.8) +
geom_hline(yintercept = mean(NumericVar), color = "blue", size = 0.8, linetype = 4) +
guides(fill = FALSE) +
guides(color = FALSE) +
coord_flip() +
theme_gray() +
facet_grid(FactorVar~., scales="free") +
geom_segment(aes(x = 0.5, xend = 1.5, yend = NumericVar), color = "red", linetype = 2, data = mediansdf)
g
This example reuses the y aesthetic, but since we have a different frame, we could easily use different names (and specify them within aes(...)). One advantage to using the same variable names is (in my opinion) clearer declarative code.
Since the facet_grid adds the factor label on the right side, you likely could remove it from the axis. Note, if you do not use scales="free", then you'll see all factors in each facet, which is distracting and unnecessary.
The reason I am suggesting facets is that it makes the x and xend simple and relative to a single violin, so 0.5 to 1.5; otherwise, as you saw, there is some assumption on which is going with which integer placement.
Last, the thinner red line showed up for me only while looking at the raster plot window. If you save to a vector-based format (e.g., PDF), the lines appear to have the same thickness.
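For comparison, here is a sketch of a variant without facets. It is a different technique from the faceted answer above: the segment endpoints are computed per level in a data frame, assuming level i is drawn at integer position i on the discrete axis. The column names med, x0 and x1 are made up for this illustration, and the plotting part only runs if ggplot2 is installed.

```r
set.seed(12)
df <- data.frame(FactorVar = sample(LETTERS[1:5], 500, replace = TRUE),
                 NumericVar = abs(rnorm(500)))
medians <- tapply(df$NumericVar, df$FactorVar, FUN = median)

# One row per factor level; level i is drawn at position i on the discrete
# axis, so its segment spans i - 0.5 to i + 0.5.
segs <- data.frame(FactorVar = names(medians),
                   med = as.numeric(medians),
                   x0 = seq_along(medians) - 0.5,
                   x1 = seq_along(medians) + 0.5)
print(segs)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(df, aes(x = FactorVar, y = NumericVar, fill = FactorVar)) +
    geom_violin(scale = "count", trim = FALSE, adjust = 0.75) +
    # inherit.aes = FALSE so the layer only uses the columns of `segs`
    geom_segment(data = segs, inherit.aes = FALSE,
                 aes(x = x0, xend = x1, y = med, yend = med),
                 color = "red", linetype = 2) +
    coord_flip() +
    theme(legend.position = "none")
  print(g)
}
```

This draws all median lines with a single geom_segment call, at the cost of relying on the integer placement that the faceted version avoids.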
I have the following code, which produces the following plot:
cols <- brewer.pal(n = 3, name = 'Dark2')
p4 <- ggplot(all.m, aes(x=xval, y=yval, colour = Approach, ymax = 0.95)) + theme_bw() +
geom_errorbar(aes(ymin= yval - se, ymax = yval + se), width=5, position=pd) +
geom_line(position=pd) +
geom_point(aes(shape=Approach, colour = Approach), size = 4) +
geom_hline(aes(yintercept = cp.best$slope, colour = "C2P"), show_guide = FALSE) +
scale_color_manual(name="Approach", breaks=c("C2P", "P2P", "CP2P"), values = cols[c(1,3,2)]) +
scale_y_continuous(breaks = seq(0.4, 0.95, 0.05), "Test AUROC") +
scale_x_continuous(breaks = seq(10, 150, by = 20), "# Number of Patient Samples in Training")
p4 <- p4 + theme(legend.direction = 'horizontal',
legend.position = 'top',
plot.margin = unit(c(5.1, 7, 4.5, 3.5)/2, "lines"),
text = element_text(size=15), axis.title.x=element_text(vjust=-1.5), axis.title.y=element_text(vjust=2))
p4 <- p4 + guides(colour=guide_legend(override.aes=list(shape=c(NA,17,16))))
p4
When I try show_guide = FALSE in geom_point, the shapes of the points in the upper legend are all set to the default solid circles.
How can I make the lower legend disappear without affecting the upper legend?
This is a solution, complete with reproducible data:
library("ggplot2")
library("grid")
library("RColorBrewer")
cp2p <- data.frame(xval = 10 * 2:15, yval = cumsum(c(0.55, rnorm(13, 0.01, 0.005))), Approach = "CP2P", stringsAsFactors = FALSE)
p2p <- data.frame(xval = 10 * 1:15, yval = cumsum(c(0.7, rnorm(14, 0.01, 0.005))), Approach = "P2P", stringsAsFactors = FALSE)
pd <- position_dodge(0.1)
cp.best <- list(slope = 0.65)
all.m <- rbind(p2p, cp2p)
all.m$Approach <- factor(all.m$Approach, levels = c("C2P", "P2P", "CP2P"))
all.m$se <- rnorm(29, 0.1, 0.02)
all.m[nrow(all.m) + 1, ] <- all.m[nrow(all.m) + 1, ] # Creates a new row filled with NAs
all.m$Approach[nrow(all.m)] <- "C2P"
cols <- brewer.pal(n = 3, name = 'Dark2')
p4 <- ggplot(all.m, aes(x=xval, y=yval, colour = Approach, ymax = 0.95)) + theme_bw() +
geom_errorbar(aes(ymin= yval - se, ymax = yval + se), width=5, position=pd) +
geom_line(position=pd) +
geom_point(aes(shape=Approach, colour = Approach), size = 4, na.rm = TRUE) +
geom_hline(aes(yintercept = cp.best$slope, colour = "C2P")) +
scale_color_manual(values = c(C2P = cols[1], P2P = cols[2], CP2P = cols[3])) +
scale_shape_manual(values = c(C2P = NA, P2P = 16, CP2P = 17)) +
scale_y_continuous(breaks = seq(0.4, 0.95, 0.05), "Test AUROC") +
scale_x_continuous(breaks = seq(10, 150, by = 20), "# Number of Patient Samples in Training")
p4 <- p4 + theme(legend.direction = 'horizontal',
legend.position = 'top',
plot.margin = unit(c(5.1, 7, 4.5, 3.5)/2, "lines"),
text = element_text(size=15), axis.title.x=element_text(vjust=-1.5), axis.title.y=element_text(vjust=2))
p4
The trick is to make sure that all of the desired levels of all.m$Approach appear in all.m, even if one of them gets dropped out of the graph. The warning about the omitted point is suppressed by the na.rm = TRUE argument to geom_point.
Short answer:
Just add a dummy geom_point layer (transparent points) where shape is mapped to the same level as in geom_hline.
geom_point(aes(shape = "int"), alpha = 0)
Longer answer:
Whenever possible, ggplot merges / combines legends of different aesthetics. For example, if colour and shape are mapped to the same variable, then the two legends are combined into one.
I illustrate this using a simple data set with 'x', 'y' and a grouping variable 'grp' with two levels:
df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))
First we map both color and shape to 'grp'
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4)
Fine, the legends for the aesthetics, color and shape, are merged into one.
Then we add a geom_hline. We want it to have a separate color from the geom_lines and to appear in the legend. Thus, we map color to a variable, i.e. put color inside aes of geom_hline. In this case we do not map the color to a variable in the data set, but to a constant. We may give the constant a desired name, so we don't need to rename the legend entries afterwards.
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4) +
geom_hline(aes(yintercept = 2.5, color = "int"))
Now two legends appear, one for the color aesthetics of geom_line and geom_hline, and one for the shape of the geom_points. The reason for this is that the "variable" which color is mapped to now contains three levels: the two levels of 'grp' in the original data, plus the level 'int' which was introduced in the geom_hline aes. Thus, the levels in the color scale differ from those in the shape scale, and by default ggplot can't merge the two scales into one legend.
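The mismatch can be made visible in base R. This is only an illustration of the reasoning, not of how ggplot2 trains its scales internally; the names colour_levels and shape_levels are made up for this sketch.

```r
df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))

# Values that reach each scale in the plot above:
colour_levels <- union(unique(df$grp), "int")  # grp plus the constant from geom_hline
shape_levels <- unique(df$grp)                 # grp only

print(colour_levels)
print(shape_levels)

# The level present for colour but missing for shape:
print(setdiff(colour_levels, shape_levels))
```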
How to combine the two legends?
One possibility is to introduce the same, additional level for shape as for color by using a dummy geom_point layer with transparent points (alpha = 0) so that the two aesthetics contain the same levels:
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4) +
geom_hline(aes(yintercept = 2.5, color = "int")) +
geom_point(aes(shape = "int"), alpha = 0) # <~~~~ a blank geom_point
Another possibility is to convert the original grouping variable to a factor, and add the "geom_hline level" to the original levels. Then use drop = FALSE in scale_shape_discrete to include "unused factor levels from the scale":
df$grp <- factor(df$grp, levels = c(unique(df$grp), "int"))
ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
geom_line() +
geom_point(size = 4) +
geom_hline(aes(yintercept = 2.5, color = "int")) +
scale_shape_discrete(drop = FALSE)
Then, as you already know, you may use the guides function to "override" the shape aesthetics in the legend, and remove the shape from the geom_hline entry by setting it to NA:
guides(colour = guide_legend(override.aes = list(shape = c(16, 17, NA))))
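Put together, the factor-level approach and the guides override might look like the sketch below. It rebuilds the toy data so it runs on its own, and the plotting part is skipped if ggplot2 is not installed. The shape codes 16 and 17 are assumed to match the default shapes of the first two levels.

```r
df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))

# Add the geom_hline level to the grouping factor so both scales share it.
df$grp <- factor(df$grp, levels = c(unique(df$grp), "int"))
print(levels(df$grp))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y, color = grp, shape = grp)) +
    geom_line() +
    geom_point(size = 4) +
    geom_hline(aes(yintercept = 2.5, color = "int")) +
    # keep the unused level "int" in the shape scale so the legends can merge
    scale_shape_discrete(drop = FALSE) +
    # hide the point symbol for the "int" entry in the merged legend
    guides(colour = guide_legend(override.aes = list(shape = c(16, 17, NA))))
  print(p)
}
```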