How to keep coloured geom_sina points within geom_violin plots? - r

I managed to jitter points inside violins by combining geom_violin and geom_sina (left plot in the figure above), but when I try to colour the points, they are jittered into several columns that spill outside the violins (right plot in the figure above).
What I would like is the left plot with coloured points. I do not mind if the colours are mixed, that is, not grouped by colour.
Here is a demo script using the mtcars dataset (I do not know mtcars in detail, so apologies if I have misused the variables).
library(ggplot2)
library(ggforce)
data(mtcars)
p <- ggplot(mtcars, aes(x=factor(vs), y=mpg)) + geom_violin()
p + geom_sina(alpha = 0.5)
p + geom_sina(aes(colour = factor(cyl)), alpha = 0.5)

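Thanks to teunbrand: adding group = factor(vs) to the aesthetics solves it.
p + geom_sina(aes(colour = factor(cyl), group = factor(vs)), alpha = 0.5)
Mapping colour to a factor makes ggplot2 split the points at each x position into one group per colour, and geom_sina positions each group separately; setting group explicitly restores one group per violin.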

Related

ggplot horizontal gaps contour

I've been trying to draw contour plots using ggplot2 and a CSV file. I can't figure out why horizontal gaps show up across the image.
Here is the code:
library(ggplot2)
plot2 <- ggplot(data=thirtyfour,aes( x = X.m., y = Z.m., z = t_ca.2))
plot2
plot2 + geom_tile(aes(fill = t_ca.2) )+
scale_fill_continuous(limits=c(0,0.0219),
breaks=seq(0,0.0219, by=0.01),
low="blue",
high="yellow")
In the geom_tile aesthetic try adjusting the height parameter.
+geom_tile(aes(fill=t_ca.2, height=1)) +...
Otherwise, please provide reproducible example code.

overlay rotated density plot

I'm struggling to overlay a rotated density plot onto the original scatterplot. Here are the two plots I have:
require(ggplot2); set.seed(1);
df1 <- data.frame(ID=paste0('ID',1:1000), value=rnorm(1000,500,100))
p1 <- ggplot(data = df1, aes(x=reorder(ID, value), y=value)) +
geom_point(size=2, alpha = 0.7)+
coord_trans(y="log10")
p2 <- ggplot(data = df1, aes(x=value)) +
coord_trans(x="log10") +
geom_density() +
coord_flip()
p1
p2
First, there's a small problem with the density plot: its vertical axis is not log10-transformed. But the main issue is that I can't work out how to draw it on the previous plot while keeping the coordinates correct.
Because you are using coord_flip on your second plot, you are effectively trying to plot two different variables (density and ID) on the same x axis. There are plenty of posts discouraging this; here is one example: How do I plot points with two different y-axis ranges on the same panel in the same X axis?

Why doesn't geom_hline generate a legend in ggplot2?

I have some code that plots a histogram of some values, along with a few horizontal lines that represent reference points to compare against. However, ggplot is not generating a legend for the lines.
library(ggplot2)
library(dplyr)
## Simulate an equal mix of uniform and non-uniform observations on [0,1]
x <- data.frame(PValue=c(runif(500), rbeta(500, 0.25, 1)))
y <- c(Uniform=1, NullFraction=0.5) %>% data.frame(Line=names(.) %>% factor(levels=unique(.)), Intercept=.)
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, group=Line, color=Line, linetype=Line),
data=y, alpha=0.5)
I even tried reducing the problem to just plotting the lines:
ggplot(y) +
geom_hline(aes(yintercept=Intercept, color=Line)) + xlim(0,1)
and I still don't get a legend. Can anyone explain why my code isn't producing plots with legends?
By default, show_guide = FALSE for geom_hline. If you turn it on, the legend will appear (in ggplot2 2.0.0 and later, this argument is named show.legend). Also, alpha needs to be inside aes; otherwise the colours of the lines will not be drawn properly in the legend. The code looks like this:
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, colour=Line, linetype=Line, alpha=0.5),
data=y, show_guide=TRUE)
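For readers on a recent ggplot2, here is a minimal sketch of the same idea, assuming ggplot2 3.3.0 or later, where show.legend takes the place of show_guide and after_stat(density) takes the place of ..density... The name p_hline and the explicit data.frame construction of y are mine, not from the original post, and alpha is kept outside aes here as in the question.

```r
library(ggplot2)

set.seed(42)
# Simulated p-values: half uniform, half skewed toward zero
x <- data.frame(PValue = c(runif(500), rbeta(500, 0.25, 1)))

# One row per reference line; the factor levels fix the legend order
y <- data.frame(
  Line = factor(c("Uniform", "NullFraction"),
                levels = c("Uniform", "NullFraction")),
  Intercept = c(1, 0.5)
)

p_hline <- ggplot(x, aes(x = PValue)) +
  # after_stat(density) is the current spelling of ..density..
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.02) +
  # show.legend = TRUE explicitly requests a legend for this layer
  geom_hline(aes(yintercept = Intercept, colour = Line, linetype = Line),
             data = y, alpha = 0.5, show.legend = TRUE)

print(p_hline)
```

Printing p_hline should draw the histogram with both reference lines listed in a single legend, since colour and linetype are mapped to the same variable.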

Plot density with ggplot2 without line on x-axis

I use ggplot2::ggplot for all 2D plotting needs, including density plots, but I find that when plotting a number of overlapping densities with extreme outliers on a single space (in different colors) the line on the x-axis becomes a little distracting.
My question is then, can you remove the bottom section of the density plot from being plotted? If so, how?
You can use this example:
library(ggplot2)
library(ggplot2movies)  # the movies dataset moved to this package in ggplot2 2.0.0
ggplot(movies, aes(x = rating)) + geom_density()
How about using stat_density directly?
ggplot(movies, aes(x = rating)) + stat_density(geom="line")
You can just draw a white line over it:
ggplot(movies, aes(x = rating)) +
geom_density() +
geom_hline(color = "white", yintercept = 0)
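One caveat for the stat_density(geom = "line") suggestion when several densities share a panel, which is the case described in the question: stat_density stacks groups by default, so overlapping curves need position = "identity". A minimal sketch, using the built-in iris data as a stand-in because the original data is not shown (p_dens is a name chosen for illustration):

```r
library(ggplot2)

# One density curve per species, drawn as plain lines with no baseline.
# position = "identity" overrides stat_density's default of "stack",
# so the curves overlap instead of being piled on top of each other.
p_dens <- ggplot(iris, aes(x = Sepal.Length, colour = Species)) +
  stat_density(geom = "line", position = "identity")

print(p_dens)
```

Because the curves are drawn as lines rather than closed polygons, no segment is drawn along the x axis.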

Conditional graphing and fading colors

I am trying to create a graph in which, because there are so many points, the green region fades to black at its edges while its center stays green. The code I am currently using is:
plot(snb$px,snb$pz,col=snb$event_type,xlim=c(-2,2),ylim=c(1,6))
I looked into contour plotting but that did not work for this. The coloring variable is a factor variable.
Thanks!
This is a great problem for ggplot2.
First, read the data in:
snb <- read.csv('MLB.csv')
With your data frame you could try plotting points that are partly transparent, and setting them to be colored according to the factor event_type:
require(ggplot2)
p1 <- ggplot(data = snb, aes(x = px, y = py, color = event_type)) +
geom_point(alpha = 0.5)
print(p1)
Or, you might want to think about plotting this as a heatmap using geom_bin2d(), and plotting facets (subplots) for each different event_type, like this:
p2 <- ggplot(data = snb, aes(x = px, y = py)) +
geom_bin2d(binwidth = c(0.25, 0.25)) +
facet_wrap(~ event_type)
print(p2)
which makes a plot for each level of the factor, where the color shows the number of data points in each bin and the bins are 0.25 on each side. But if you have more than about 5 or 6 levels, this might look pretty bad. From the small data sample you supplied, I got this
If the levels of the factors don't matter, there are some nice examples here of plots with too many points. You could also try looking at some of the examples on the ggplot website or the R cookbook.
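Transparency could help, which is easily achieved, as @BenBolker points out, with adjustcolor:
# alpha.f scales the opacity of each colour (spelled out in full here)
colvec <- adjustcolor(c("black", "green"), alpha.f = 0.2)
plot(snb$px, snb$pz,
     col = colvec[snb$event_type],  # the factor's integer codes pick the colour
     xlim = c(-2, 2),
     ylim = c(1, 6))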
It's built in to ggplot:
require(ggplot2)
p <- ggplot(data = snb, aes(x = px, y = pz, color = event_type)) +
geom_point(alpha = 0.2)
print(p)
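For the base-graphics variant, it may help to see what adjustcolor actually returns and how indexing by a factor picks a colour per point. This is a small sketch with a made-up two-level factor standing in for snb$event_type (the level names "ball" and "strike" are assumptions, not taken from the original data):

```r
# Hypothetical stand-in for snb$event_type
event_type <- factor(c("ball", "strike", "strike", "ball"))

# alpha.f = 0.2 scales the alpha channel: 0.2 * 255 = 51, i.e. hex 33
colvec <- adjustcolor(c("black", "green"), alpha.f = 0.2)
print(colvec)      # "#00000033" "#00FF0033"

# Indexing with a factor uses its integer codes, so the first level
# ("ball", alphabetically) gets black and the second gets green
point_cols <- colvec[event_type]
print(point_cols)  # "#00000033" "#00FF0033" "#00FF0033" "#00000033"
```

The order of the colours follows the order of the factor levels, so it is worth checking levels(snb$event_type) before choosing which colour goes first.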
