Viridis and ggplot2/ggmarginal - r

I am encountering a problem using viridis with ggplot2 and ggMarginal.
I would like to color the dots on a Bland-Altman plot that I am drawing with ggplot2:
diff <- (a1$A1_phones - a1$A1_video)
diffp <- (a1$A1_phones - a1$A1_video)/a1$A1_video*100
sd.diff <- sd(diff)
sd.diffp <- sd(diffp)
my.data <- data.frame(a1$A1_video, a1$A1_phones, diff, diffp)
dev.off()
diffplot <- ggplot(my.data, aes(a1$A1_video, diff)) +
geom_point(size=2, colour = rgb(0,0,0, alpha = 0.5)) +
theme_bw() +
#when the +/- 2SD lines will fall outside the default plot limits
#Thanks to commenter for noticing this.
ylim(mean(my.data$diff) - 7*sd.diff, mean(my.data$diff) + 7*sd.diff) +
geom_hline(yintercept = 0, linetype = 3) +
geom_hline(yintercept = mean(my.data$diff)) +
geom_hline(yintercept = mean(my.data$diff) + 2*sd.diff, linetype = 2) +
geom_hline(yintercept = mean(my.data$diff) - 2*sd.diff, linetype = 2) +
ylab("Difference Video vs Algorithm [ms]") +
xlab("Average of Video vs Algorithm [ms]")
p<-ggMarginal(diffplot, type="histogram", bins = 40)+ scale_colour_viridis_d()
I would now like to color the dots from A1_video differently from those from A1_phones and have viridis draw a continuous density plot.

I am not sure if this is what you want; please try to be more specific and provide sample data. If you just want the color to change based on another column in the source dataset, the mapping must be specified inside the aes() function:
diffplot <- ggplot(my.data,aes(col=a1$A1_video))
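As a minimal sketch of that idea, the following maps a column to colour inside aes() and adds a continuous viridis scale. The data are simulated stand-ins for a1 (the real data were not posted), and the column names video, phones, and diff are assumptions chosen for readability.

```r
library(ggplot2)

# Simulated stand-in for the unposted a1 data (assumption)
set.seed(42)
video  <- rnorm(100, mean = 500, sd = 50)
phones <- video + rnorm(100, mean = 0, sd = 10)

my.data <- data.frame(video = video, phones = phones, diff = phones - video)

# colour is mapped inside aes(), so a colour scale has something to act on
diffplot <- ggplot(my.data, aes(x = video, y = diff, colour = diff)) +
  geom_point(size = 2, alpha = 0.5) +
  scale_colour_viridis_c() +
  theme_bw()
```

If ggExtra is installed, diffplot can then be passed to ggMarginal(diffplot, type = "histogram", bins = 40). The viridis scale most likely needs to be added to the ggplot object before that call rather than to the result of ggMarginal().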

Related

R ggplot2 scale alpha discrete to display in legend

I'm trying to make a plot across two factors (strain and sex) and use the alpha value to communicate sex. Here is my code and the resulting plot:
ggplot(subset(df.zfish.data.overall.long, day=='day_01' & measure=='distance.from.bottom'), aes(x=Fish.name, y=value*100)) +
geom_boxplot(aes(alpha=Sex, fill=Fish.name), outlier.shape=NA) +
scale_alpha_discrete(range=c(0.3,0.9)) +
scale_fill_brewer(palette='Set1') +
coord_cartesian(ylim=c(0,10)) +
ylab('Distance From Bottom (cm)') +
xlab('Strain') +
scale_x_discrete(breaks = c('WT(AB)', 'WT(TL)', 'WT(TU)', 'WT(WIK)'), labels=c('AB', 'TL', 'TU', 'WIK')) +
guides(color=guide_legend('Fish.name'), fill=FALSE) +
theme_classic(base_size=10)
I'd like for the legend to reflect the alpha value in the plot (i.e. alpha value F = 0.3, alpha value M=0.9) as greyscale/black as I think that will be intuitive.
I've tried altering the scale_alpha_discrete, but cannot figure out how to send it a single color for the legend. I've also tried playing with 'guides()' without much luck. I suspect there's a simple solution, but I cannot see it.
One option to achieve your desired result would be to set the fill color for the alpha legend via the override.aes argument of guide_legend.
Making use of mtcars as example data:
library(ggplot2)
ggplot(mtcars, aes(x = cyl, y = mpg)) +
geom_boxplot(aes(fill = factor(cyl), alpha = factor(am))) +
scale_alpha_discrete(range = c(0.3, 0.9), guide = guide_legend(override.aes = list(fill = "black"))) +
scale_fill_brewer(palette='Set1') +
theme_classic(base_size=10) +
guides(fill = "none")
#> Warning: Using alpha for a discrete variable is not advised.

Issue with log_2 scaling using ggplot2 and log2_trans()

I am trying to plot data using ggplot2 in R.
The data points occur at x-values that are powers of two (4, 8, 16, 32, ...). For that reason, I want to scale my x-axis by log_2 so that the data points are spread out evenly. Currently most of them are clustered on the left side, making my plot hard to read (see first image).
I used the following command to get this image:
ggplot(summary, aes(x=xData, y=yData, colour=groups)) +
geom_errorbar(aes(ymin=yData-se, ymax=yData+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd)
However trying to scale my x-axis with log2_trans yields the second image, which is not what I expected and does not follow my data.
Code used:
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=solver.name)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(trans = log2_trans(),
breaks = trans_breaks("log2", function(x) 2^x),
labels = trans_format("log2", math_format(2^.x)))
Using scale_x_continuous(trans = log2_trans()) only doesn't help either.
EDIT:
Attached the data for reproducing the results:
https://pastebin.com/N1W0z11x
EDIT 2:
I had used pd <- position_dodge(1000) to keep my error bars from overlapping, and that caused the problem.
Removing the position=pd arguments solved the issue.
Here is a way you could format your x-axis:
# Generate dummy data
x <- 2^seq(1, 10)
df <- data.frame(
x = c(x, x, x),
y = c(0.5*x, x, 1.5*x),
z = rep(letters[seq_len(3)], each = length(x))
)
The plot of this would look like this:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line()
Adjusting the x-axis would work like so:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line() +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
The labels argument only formats the labels as 2^x; you can change it to whatever you like.
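To see what such a labeller produces, it can be called directly on a few break values. This is only a small sketch; the input values are arbitrary powers of two and the name label_pow2 is made up for illustration.

```r
library(scales)

# math_format() returns a function; format = log2 is applied to each break
# before the result is substituted for .x in the expression 2^.x
label_pow2 <- math_format(2^.x, format = log2)

# One plotmath expression per break value
labs <- label_pow2(c(4, 8, 1024))
print(labs)
```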
I had used pd <- position_dodge(1000) to avoid overlapping error bars, which caused the problem. Position adjustments are applied after the scale transformation, so on a log2 axis the dodge and the error bar width are measured in log2 units.
Adjusting the amount of position dodge and the width of the error bars to the new scaling solved the problem.
pd <- position_dodge(0.2) # move them .2 to the left and right
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=algorithm)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=0.4, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
Adding scale_y_continuous(trans="log2") yields the results I was looking for.
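The reason a dodge of 1000 breaks the log2 plot can be checked with a short sketch: ggplot2 applies position adjustments to the already transformed data, so the whole axis is only a few units wide. The data frame toy below is hypothetical and only mimics the power-of-two spacing described in the question.

```r
library(ggplot2)

# Hypothetical data: two groups measured at x = 4, 8, ..., 1024
toy <- data.frame(
  x = rep(2^(2:10), times = 2),
  y = c(2:10, 3:11),
  g = rep(c("a", "b"), each = 9)
)

# Width of the x-axis in transformed (log2) units
x_span <- diff(range(log2(toy$x)))

p <- ggplot(toy, aes(x, y, colour = g)) +
  geom_point(position = position_dodge(width = 0.2)) +
  scale_x_continuous(trans = "log2")

# layer_data() returns the positions after transformation and dodging
dodged_x <- layer_data(p, 1)$x
print(x_span)
print(range(dodged_x))
```

With the axis spanning only x_span log2 units, a dodge width in the hundreds or thousands pushes the groups far outside the data range, whereas a value such as 0.2 keeps them next to their original positions.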

ggplot2 cuts off parts of my figure when zooming in

I am trying to make a figure in ggplot2 that looks something like this.
However, right now I seem to face a trade-off between keeping all the squares small and zooming in on the squares but having parts of them cut off.
(Both are displayed in one image, as I may not add more pictures yet.)
My code is as follows:
if (!require('ggplot2')) install.packages('ggplot2'); library('ggplot2')
Odds <- c(1.2,1,0.97,1,1.38,0.95,0.85,0.95)
x <- c(5,3.5,0,-3.5,-5,-3.5,0,3.5)
y <- c(0,3.5,5,3.5,0,-3.5,-5,-3.5)
summed <- data.frame(Odds,x,y)
d <- qplot(x, y, data=summed, colour =Odds)
d + theme_classic(base_size = 14) + geom_point(size = 30, shape=15) +
scale_colour_gradient(low="grey", high = "black") +
ylab("") +
xlab("") +
scale_y_continuous(breaks=NULL) + scale_x_continuous(breaks=NULL)
I hope some of you can help me.
ggplot2 leaves a small margin ("breathing room") around the data range in the plot area. Points drawn with a large geom_point size simply extend beyond that margin.
A solution is to set custom limits for the plotting area. Try this:
d <-qplot(x, y, data=summed, color =Odds)
d + theme_classic(base_size = 14) + geom_point(size = 30, shape=15) +
scale_colour_gradient(low="grey", high = "black") +
ylab("") +
xlab("") +
scale_y_continuous(breaks=NULL) +
scale_x_continuous(breaks=NULL) +
theme(aspect.ratio = 1) + ## Optional. Ensures you get a square shaped plot
expand_limits(x =c(min(x) - 1, max(x) + 1), ## Expands the limits, reads from your predefined "x" and "y" objects.
y =c(min(y)-1, max(y) +1))
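The effect of expand_limits() can be verified by inspecting the trained scales. The sketch below uses the data from the question with ggplot() instead of qplot(); the padding of one unit on each side (limits of -6 and 6) is the same arbitrary choice as in the answer and may need tuning for other point sizes.

```r
library(ggplot2)

summed <- data.frame(
  Odds = c(1.2, 1, 0.97, 1, 1.38, 0.95, 0.85, 0.95),
  x = c(5, 3.5, 0, -3.5, -5, -3.5, 0, 3.5),
  y = c(0, 3.5, 5, 3.5, 0, -3.5, -5, -3.5)
)

p <- ggplot(summed, aes(x, y, colour = Odds)) +
  geom_point(size = 30, shape = 15) +
  expand_limits(x = c(-6, 6), y = c(-6, 6))

# The trained scale limits now include the extra padding
x_limits <- layer_scales(p)$x$get_limits()
y_limits <- layer_scales(p)$y$get_limits()
print(x_limits)
print(y_limits)
```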

ggplot2 add offset to jitter positions

I have data that looks like this
df = data.frame(x=sample(1:5,100,replace=TRUE),y=rnorm(100),assay=sample(c('a','b'),100,replace=TRUE),project=rep(c('primary','secondary'),50))
and am producing a plot using this code
ggplot(df,aes(project,x)) + geom_violin(aes(fill=assay)) + geom_jitter(aes(shape=assay,colour=y),height=.5) + coord_flip()
which gives me this
This is 90% of the way to being what I want. But I would like each point to be plotted only on top of the violin plot for the matching assay type. That is, the jittered positions of the points should be set such that the triangles are only ever on the upper teal violin plot and the circles on the bottom red violin plot for each project type.
Any ideas how to do this?
In order to get the desired result, it is probably best to use position_jitterdodge as this gives you the best control over the way the points are 'jittered':
ggplot(df, aes(x = project, y = x, fill = assay, shape = assay, color = y)) +
geom_violin() +
geom_jitter(position = position_jitterdodge(dodge.width = 0.9,
jitter.width = 0.5,
jitter.height = 0.2),
size = 2) +
coord_flip()
Alternatively, you can use the interaction between assay and project:
p <- ggplot(df,aes(x = interaction(assay, project), y=x)) +
geom_violin(aes(fill=assay)) +
geom_jitter(aes(shape=assay, colour=y), height=.5, cex=4)
p + coord_flip()
The labels can then be adjusted by using a numeric x-axis:
# cbind the interaction as a numeric
df$group <- as.numeric(interaction(df$assay, df$project))
# plot
p <- ggplot(df,aes(x=group, y=x, group=cut_interval(group, n = 4))) +
geom_violin(aes(fill=assay)) +
geom_jitter(aes(shape=assay, colour=y), height=.5, cex=4)
p + coord_flip() + scale_x_continuous(breaks = c(1.5, 3.5), labels = levels(df$project))

overlaying plots in ggplot2

How to overlay one plot on top of the other in ggplot2 as explained in the following sentences? I want to draw the grey time series on top of the red one using ggplot2 in R (now the red one is above the grey one and I want my graph to be the other way around). Here is my code (I generate some data in order to show you my problem, the real dataset is much more complex):
install.packages("ggplot2")
library(ggplot2)
time <- rep(1:100,2)
timeseries <- c(rep(0.5,100),rep(c(0,1),50))
upper <- c(rep(0.7,100),rep(0,100))
lower <- c(rep(0.3,100),rep(0,100))
legend <- c(rep("red should be under",100),rep("grey should be above",100))
dataset <- data.frame(timeseries,upper,lower,time,legend)
ggplot(dataset, aes(x=time, y=timeseries)) +
geom_line(aes(colour=legend, size=legend)) +
geom_ribbon(aes(ymax=upper, ymin=lower, fill=legend), alpha = 0.2) +
scale_colour_manual(limits=c("grey should be above","red should be under"),values = c("grey50","red")) +
scale_fill_manual(values = c(NA, "red")) +
scale_size_manual(values=c(0.5, 1.5)) +
theme(legend.position="top", legend.direction="horizontal",legend.title = element_blank())
Convert the variable you are grouping on into a factor and explicitly set the order of its levels. Within a layer, ggplot2 draws the groups in level order, so the last level ends up on top. It also helps readability to place each scale_*_manual() call next to the geom it applies to.
legend <- factor(legend, levels = c("red should be under","grey should be above"))
dataset <- data.frame(timeseries,upper,lower,time,legend) # avoid naming the object c, which masks base::c
ggplot(dataset, aes(x=time, y=timeseries)) +
geom_ribbon(aes(ymax=upper, ymin=lower, fill=legend), alpha = 0.2) +
scale_fill_manual(values = c("red", NA)) +
geom_line(aes(colour=legend, size=legend)) +
scale_colour_manual(values = c("red","grey50")) +
scale_size_manual(values=c(1.5,0.5)) +
theme(legend.position="top", legend.direction="horizontal",legend.title = element_blank())
Note that the values in the scale_*_manual() calls are now given in the new level order: first "red should be under", then "grey should be above".
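The difference between the default and the explicit level order can be checked without plotting. This is a small sketch using a shortened version of the legend vector from the question; the names default_levels and explicit_levels are made up for illustration.

```r
legend <- c(rep("red should be under", 3), rep("grey should be above", 3))

# Default: levels are sorted alphabetically, so "grey ..." is the first level
# and is drawn first, which puts the red series on top
default_levels <- levels(factor(legend))

# Explicit: "red ..." becomes the first level and is drawn underneath
explicit_levels <- levels(factor(
  legend,
  levels = c("red should be under", "grey should be above")
))

print(default_levels)
print(explicit_levels)
```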
