ggplot outline jitter datapoints - r

I'm trying to create a scatterplot where the points are jittered (geom_jitter), but I also want to create a black outline around each point. Currently I'm doing it by adding 2 geom_jitters, one for the fill and one for the outline:
beta <- paste("beta == ", "0.15")
ggplot(aes(x=xVar, y = yVar), data = data) +
geom_jitter(size=3, alpha=0.6, colour=my.cols[2]) +
theme_bw() +
geom_abline(intercept = 0.0, slope = 0.145950, size=1) +
geom_vline(xintercept = 0, linetype = "dashed") +
annotate("text", x = 2.5, y = 0.2, label=beta, parse=TRUE, size=5)+
xlim(-1.5,4) +
ylim(-2,2)+
geom_jitter(shape = 1,size = 3,colour = "black")
However, that results in something like this:
Because jitter randomly offsets the data, the 2 geom_jitters are not in line with each other. How do I ensure the outlines are in the same place as the fill points?
I've seen threads about this (e.g. Is it possible to jitter two ggplot geoms in the same way?), but they're pretty old and I'm not sure whether anything new has been added to ggplot that would solve this issue.
The code above works if, instead of using geom_jitter, I use the regular geom_point, but I have too many overlapping points for that to be useful
EDIT:
The solution in the posted answer works. However, it doesn't quite cooperate for some of my other graphs where I'm binning by some other variable and using that to plot different colours:
ggplot(aes(x=xVar, y = yVar, color=group), data = data) +
geom_jitter(size=3, alpha=0.6, shape=21, fill="skyblue") +
theme_bw() +
geom_vline(xintercept = 0, linetype = "dashed") +
scale_colour_brewer(name = "Title", direction = -1, palette = "Set1") +
xlim(-1.5,4) +
ylim(-2,2)
My group variable has 3 levels, and I want to colour each group level by a different colour in the brewer Set1 palette. The current solution just colours everything skyblue. What should I fill by to ensure I'm using the correct colour palette?

You don't actually have to use two layers; you can just use the fill aesthetic of a plotting character with a hole in it:
# some random data
set.seed(47)
df <- data.frame(x = rnorm(100), y = runif(100))
ggplot(aes(x = x, y = y), data = df) + geom_jitter(shape = 21, fill = 'skyblue')
The colour, size, and stroke aesthetics let you customize the exact look.
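As a rough sketch of that customization (the specific values below are illustrative assumptions, not taken from the question): with shape 21 the interior comes from fill, the outline from colour, and the outline thickness from stroke, so a single jittered layer gives filled points with a black border.

```r
library(ggplot2)

# Illustrative data; any data frame with two numeric columns would do
set.seed(47)
df <- data.frame(x = rnorm(100), y = runif(100))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_jitter(
    shape  = 21,         # circle with separate fill and outline
    fill   = "skyblue",  # interior colour
    colour = "black",    # outline colour
    size   = 3,          # point size
    stroke = 1,          # outline thickness
    alpha  = 0.6         # transparency
  ) +
  theme_bw()

p
```

Because there is only one layer, the outline and the fill are jittered together and cannot drift apart.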
Edit:
For grouped data, set the fill aesthetic to the grouping variable, and use scale_fill_* functions to set color scales:
# more random data
set.seed(47)
df <- data.frame(x = runif(100), y = rnorm(100), group = sample(letters[1:3], 100, replace = TRUE))
ggplot(aes(x=x, y = y, fill=group), data = df) +
geom_jitter(size=3, alpha=0.6, shape=21) +
theme_bw() +
geom_vline(xintercept = 0, linetype = "dashed") +
scale_fill_brewer(name = "Title", direction = -1, palette = "Set1")

Related

ground geom_text to x axis (e.g. y =0)

I have a graph made in ggplot that looks like this:
I want the numeric labels on the bars to be grounded/glued to the x axis, at y <= 0.
This is the code to generate the graph as such:
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=numofpics, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels = as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
I've tried vjust and experimented with position_nudge for the geom_text element, but every solution I can find shifts each label relative to its current position. As a result, everything I try ends up in a situation like this one:
How can I make ggplot ground the text at the bottom of the plot, at y <= 0, ideally with the option to also set an angle = 45?
Link to dataframe = https://drive.google.com/file/d/1b-5AfBECap3TZjlpLhl1m3v74Lept2em/view?usp=sharing
As I said in the comments, just set the y-coordinate of the text to 0 or below, and specify the angle: geom_text(aes(x=row, y=-100, label=bbch), angle=45)
I'm behind a proxy server that blocks connections to google drive so I can't access your data. I'm not able to test this, but I would introduce a new label field in my dataset that sets y to be 0 if y<0:
library(dplyr)  # provides %>%, mutate() and if_else()
df <- df %>%
mutate(labelField = if_else(numofpics<0, 0, numofpics))
I would then use this label field in my geom_text call:
geom_text(aes(x=row, y=labelField, label=bbch), angle = 45)
Hope that helps.
You can simply define the y-value in geom_text (e.g. -50)
ggplot(data=df) +
geom_bar(aes(x=row, y=numofpics, fill = crop, group = 1), stat='identity') +
geom_point(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_line(data=df, aes(x = df$row, y=df$numofparcels*50, group = 2), alpha = 0.25) +
geom_text(aes(x=row, y=-50, label=bbch)) +
geom_hline(yintercept=300, linetype="dashed", color = "red", size=1) +
scale_y_continuous(sec.axis= sec_axis(~./50, name="Number of Parcels")) +
scale_x_discrete(name = c(),breaks = unique(df$crop), labels =
as.character(unique(df$crop)))+
labs(x=c(), y="Number of Pictures")
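Since the original data is only available behind a link, here is a minimal self-contained sketch of the same idea. The data frame below is a hypothetical stand-in that merely reuses the column names from the question, and the value -20 is an arbitrary choice. The point is that y is a constant inside geom_text, so every label lands at the same height regardless of the bar it belongs to.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data (same column names, made-up values)
df <- data.frame(
  row       = factor(1:5),
  numofpics = c(120, 340, 80, 410, 260),
  bbch      = c(11, 23, 35, 51, 65)
)

p <- ggplot(df, aes(x = row)) +
  geom_col(aes(y = numofpics)) +
  # Constant y: all labels sit on one line just below the bars
  geom_text(aes(y = -20, label = bbch), angle = 45) +
  labs(x = NULL, y = "Number of Pictures")

p
```

Moving the constant closer to 0 or further below it shifts the whole row of labels at once.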

R ggplot combine legends for colour and fill with different factor length

I am making a plot with data from an incomplete factorial design. Due to the design, the manual scale for colour and the manual scale for fill have different lengths. Thus, I get two legends. How could I delete one of them or, even better, combine them?
I have looked at those questions:
Merge separate size and fill legends in ggplot
How to merge color, line style and shape legends in ggplot
How to combine scales for colour and size into one legend?
However, the answers did not help me as they did not handle incomplete designs.
Here is some example data and the plot I produced so far:
#Example data
Man1 <- c(25,25,30,30,30,30,35,35,40,40,40,40,45,45)
Man2 <- c(25,25,30,30,40,40,35,35,40,40,30,30,45,45)
DV <- c(24.8,25.2,29.9,30.3,35.2,35.7,34,35.1,40.3,39.8,35.8,35.9,44,44.8)
Data <- data.frame(Man1,Man2,DV)
#Plot
ggplot(data = Data, aes(x = Man1, y = DV, group=as.factor(Man2), colour=as.factor(Man2))) +
theme_bw() +
geom_abline(intercept = 0, slope = 1, linetype = "longdash") +
geom_point(position = position_dodge(1)) +
geom_smooth(method = "lm", aes(x = Man1, y = DV, group=as.factor(Man2), fill=as.factor(Man2))) +
scale_colour_manual(name = "Man2", values=c('grey20', 'blue','grey20','tomato3', 'grey20')) +
scale_fill_manual(name = "Man2", values=c('blue','tomato3'))
This gives me the following picture:
ggplot of incomplete design with two legends
Could someone give me a hint how to delete one of the legends or even better combine them? I would appreciate it!
By default the scale drops unused factor levels, which is relevant here because you can only get lines for a couple of your groups.
You can use drop = FALSE to change this in the appropriate scale_*_manual() (which is for fill here).
Then use the same vector of colors for both the fill and color scales. I usually make a named vector for this.
# Make vector of colors
colors = c("25" = 'grey20', "30" = 'blue', "35" = 'grey20', "40" = 'tomato3', "45" = 'grey20')
#Plot
ggplot(data = Data, aes(x = Man1, y = DV, group=as.factor(Man2), colour= as.factor(Man2))) +
theme_bw() +
geom_abline(intercept = 0, slope = 1, linetype = "longdash") +
geom_point(position = position_dodge(1)) +
geom_smooth(method = "lm", aes(fill=as.factor(Man2))) +
scale_colour_manual(name = "Man2", values = colors) +
scale_fill_manual(name = "Man2", values = colors, drop = FALSE)
Alternatively, use guide = "none" to remove the fill legend altogether.
ggplot(data = Data, aes(x = Man1, y = DV, group=as.factor(Man2), colour= as.factor(Man2))) +
theme_bw() +
geom_abline(intercept = 0, slope = 1, linetype = "longdash") +
geom_point(position = position_dodge(1)) +
geom_smooth(method = "lm", aes(fill=as.factor(Man2))) +
scale_colour_manual(name = "Man2", values = colors) +
scale_fill_manual(name = "Man2", values=c('blue','tomato3'), guide = "none")

ggplot2 - using two different color scales for same fill in overlayed plots

This is very similar to the question asked here. However, in that situation the fill parameter is different for the two plots. In my situation the fill parameter is the same for both plots, but I want different color schemes.
I would like to manually change the color in the boxplots and the scatter plots (for example making the boxes white and the points colored).
Example:
require(dplyr)
require(ggplot2)
n<-4*3*10
myvalues<- rexp((n))
days <- ntile(rexp(n),4)
doses <- ntile(rexp(n), 3)
test <- data.frame(values =myvalues,
day = factor(days, levels = unique(days)),
dose = factor(doses, levels = unique(doses)))
p<- ggplot(data = test, aes(x = day, y = values)) +
geom_boxplot( aes(fill = dose))+
geom_point( aes(fill = dose), alpha = 0.4,
position = position_jitterdodge())
produces a plot like this:
Using 'scale_fill_manual()' overwrites the aesthetic on both the boxplot and the scatterplot.
I have found a hack by adding 'colour' to geom_point and then when I use scale_fill_manual() the scatter point colors are not changed:
p<- ggplot(data = test, aes(x = day, y = values)) +
geom_boxplot(aes(fill = dose), outlier.shape = NA)+
geom_point(aes(fill = dose, colour = factor(test$dose)),
position = position_jitterdodge(jitter.width = 0.1))+
scale_fill_manual(values = c('white', 'white', 'white'))
Are there more efficient ways of getting the same result?
You can use group to set the different boxplots. No need to set the fill and then overwrite it:
ggplot(data = test, aes(x = day, y = values)) +
geom_boxplot(aes(group = interaction(day, dose)), outlier.shape = NA)+
geom_point(aes(fill = dose, colour = dose),
position = position_jitterdodge(jitter.width = 0.1))
And you should never use data$column inside aes - just use the bare column. Using data$column will work in simple cases, but will break whenever there are stat layers or facets.
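One way to see the problem, as a sketch with made-up data rather than the asker's: when a layer is drawn from a different data frame than the one named in data$column, the bare column is looked up in the layer's own data, while test$dose always returns the full-length vector from test. The name `high` and the cutoff of 1 are arbitrary choices for illustration.

```r
library(ggplot2)

set.seed(1)
test <- data.frame(
  day    = factor(rep(1:4, each = 30)),
  dose   = factor(rep(1:3, times = 40)),
  values = rexp(120)
)

# A layer that only uses part of the data
high <- subset(test, values > 1)

base <- ggplot(test, aes(x = day, y = values))

# Bare column: evaluated against `high`, so lengths match
good <- base + geom_point(data = high, aes(colour = dose))
good_rows <- nrow(layer_data(good, 1))

# data$column: 120 values supplied for a layer with fewer rows,
# which is expected to fail when the plot is built
bad <- base + geom_point(data = high, aes(colour = test$dose))
bad_result <- tryCatch(ggplot_build(bad), error = function(e) e)

inherits(bad_result, "error")
```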

Change transparency, shape and size of a categorical variable

I am trying to plot using ggplot and trying to set the transparency, size and shape for geom_point using a binary variable in my dataset.
For example, if binary_variable == 1 then set the size to 1, shape = triangle, transparency = 0.2, if binary_variable == 0 set the size to 0.5 etc.
I have been able to make the colour change as follows:
library(ggplot2)
df <- data.frame(variable1 = 1:5,
variable2 = 1:5,
binary = c(0,0,0,1,1))
ggplot(df, aes(x = variable1, y = variable2, colour = as.factor(binary))) +
geom_point(size = 2, alpha = 0.3) +
scale_colour_manual(values = c("grey", "black"), labels = c("cat1", "cat2")) +
theme_bw()
You can control shape, size, alpha and other aesthetics in the same way as colour, using the corresponding scale_X_manual functions (scale_shape_manual, scale_size_manual, scale_alpha_manual). See the help page for all the different ways these can be controlled.
The key to making this work is to map the variable you want to control inside the aes() part of the ggplot call.
Here is an example:
df$binary <- as.factor(df$binary)
ggplot(df, aes(x = variable1, y = variable2, colour = binary, shape = binary, alpha = binary)) +
geom_point(size = 2) +
scale_colour_manual(values = c("blue", "red")) +
scale_shape_manual(values=c(16,17)) +
scale_alpha_manual(values=c(1, 0.5)) +
theme_bw()
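The example above leaves size fixed. A sketch that also maps size, using the values from the question where they were given (size 1, triangle and alpha 0.2 for binary == 1; size 0.5 for binary == 0), might look like this. The shape and alpha for binary == 0 were not specified, so the filled circle and alpha = 1 below are assumptions. Naming the vector entries by factor level makes the assignment explicit rather than dependent on level order.

```r
library(ggplot2)

df <- data.frame(
  variable1 = 1:5,
  variable2 = 1:5,
  binary    = factor(c(0, 0, 0, 1, 1))
)

p <- ggplot(df, aes(x = variable1, y = variable2,
                    shape = binary, size = binary, alpha = binary)) +
  geom_point() +
  scale_shape_manual(values = c("0" = 16, "1" = 17)) +  # circle, triangle
  scale_size_manual(values  = c("0" = 0.5, "1" = 1)) +
  scale_alpha_manual(values = c("0" = 1, "1" = 0.2)) +  # alpha for "0" is assumed
  theme_bw()

p
```

Since all three scales use the same variable and the same default title, ggplot2 should draw them as a single combined legend.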

How to jitter both geom_line and geom_point by the same magnitude?

I have a ggplot2 linegraph with two lines featuring significant overlap. I'm trying to use position_jitterdodge() so that they are more visible, but I can't get the lines and points to both jitter in the same way. I'm trying to jitter the points and line horizontally only (as I don't want to suggest any change on the y-axis). Here is an MWE:
## Create data frames
dimension <- factor(c("A", "B", "C", "D"))
df <- data.frame("dimension" = rep(dimension, 2),
"value" = c(20, 21, 34, 32,
20, 21, 36, 29),
"Time" = c(rep("First", 4), rep("Second", 4)))
## Plot it
ggplot(data = df, aes(x = dimension, y = value,
shape = Time, linetype = Time, group = Time)) +
geom_line(position = position_jitterdodge(dodge.width = 0.45)) +
geom_point(position = position_jitterdodge(dodge.width = 0.45)) +
xlab("Dimension") + ylab("Value")
Which produces the ugly:
I've obviously got something fundamentally wrong here: What should I do to make the geom_point jitter follow the geom_line jitter?
Another option for horizontal only would be to specify position_dodge and pass this to the position argument for each geom.
pd <- position_dodge(0.4)
ggplot(data = df, aes(x = dimension, y = value,
shape = Time, linetype = Time, group = Time)) +
geom_line(position = pd) +
geom_point(position = pd) +
xlab("Dimension") + ylab("Value")
One solution is to manually jitter the points:
df$value_j <- jitter(df$value)
ggplot(df, aes(dimension, value_j, shape=Time, linetype=Time, group=Time)) +
geom_line() +
geom_point() +
labs(x="Dimension", y="Value")
The horizontal solution for your discrete X axis isn't as clean (it's clean under the covers when ggplot2 does it since it handles the axis and point transformations for you quite nicely) but it's doable:
df$dim_j <- jitter(as.numeric(factor(df$dimension)))
ggplot(df, aes(dim_j, value, shape=Time, linetype=Time, group=Time)) +
geom_line() +
geom_point() +
scale_x_continuous(breaks = seq_along(levels(dimension)), labels = levels(dimension)) +
labs(x="Dimension", y="Value")
In July 2017, the developers of ggplot2 added a seed argument to the position_jitter function (https://github.com/tidyverse/ggplot2/pull/1996).
So, now (here: ggplot2 3.2.1) you can pass the argument seed to position_jitter in order to have the same jitter effect in geom_point and geom_line (see the official documentation: https://ggplot2.tidyverse.org/reference/position_jitter.html)
Note that this seed argument does not exist (yet) in geom_jitter.
ggplot(data = df, aes(x = dimension, y = value,
shape = Time, linetype = Time, group = Time)) +
geom_line(position = position_jitter(width = 0.25, seed = 123)) +
geom_point(position = position_jitter(width = 0.25, seed = 123)) +
xlab("Dimension") + ylab("Value")
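The question asked for horizontal jitter only, whereas position_jitter also moves points vertically by default. A possible variation, sketched here with the data from the question, sets height = 0 and reuses one position object for both layers. The name `pj` is just a placeholder. The last lines pull the computed coordinates back out with layer_data() to check that the line and the points really coincide.

```r
library(ggplot2)

dimension <- factor(c("A", "B", "C", "D"))
df <- data.frame(
  dimension = rep(dimension, 2),
  value     = c(20, 21, 34, 32, 20, 21, 36, 29),
  Time      = rep(c("First", "Second"), each = 4)
)

# Same seed for both layers; height = 0 leaves the y values untouched
pj <- position_jitter(width = 0.25, height = 0, seed = 123)

p <- ggplot(df, aes(x = dimension, y = value,
                    shape = Time, linetype = Time, group = Time)) +
  geom_line(position = pj) +
  geom_point(position = pj) +
  labs(x = "Dimension", y = "Value")

# Compare the jittered coordinates of the two layers
line_xy  <- layer_data(p, 1)[, c("x", "y")]
point_xy <- layer_data(p, 2)[, c("x", "y")]
isTRUE(all.equal(as.numeric(line_xy$x), as.numeric(point_xy$x)))

p
```

The same position object can be passed to two geom_point layers, which is one way to keep a fill layer and an outline layer aligned if two layers are needed.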
