I was just wondering if anybody had any experience with coloring something like a UMAP made in ggplot based on the expression of multiple genes at the same time? What I want to do is something like the blend function in Seurat featureplots, but with 3 genes / colors instead of 2.
I'm looking to make something like this:
Where the colors for the genes combine where there is overlap.
What I've gotten to so far is
ggplot(FD, vars = c("UMAP_1", "UMAP_2", "FOSL2", "JUNB", "HES1"), aes(x = UMAP_1, y = UMAP_2, colour = FOSL2)) +
geom_point(size=0.3, alpha=1) +
scale_colour_gradientn(colours = c("lightgrey", colour1), limits = c(0, 0.3), oob = scales::squish) +
new_scale_color() +
geom_point(aes(colour = JUNB), size=0.3, alpha=0.7) +
scale_colour_gradientn(colours = c("lightgrey", colour2), limits = c(0.1, 0.2), oob = scales::squish) +
new_scale_color() +
geom_point(aes(colour = HES1), size=0.3, alpha=0.1) +
scale_colour_gradientn(colours = c("lightgrey", colour3), limits = c(0, 0.3), oob = scales::squish)
Here FD is a data frame containing the UMAP coordinates and the expression levels of the three genes of interest, extracted from the Seurat object. All I get is a plot where the points from one layer obscure those below it. I've tried adjusting the colours, gradients, alpha and scales, but I suspect I'm going about it the wrong way.
If anyone knows of a way to make this work or has any suggestions on something else to try that would be very much appreciated.
There is no 'vanilla' way of doing this in ggplot2. One can precalculate the blended colours and append invisible layers and scales with the ggnewscale package.
Let's pretend, for reproducibility purposes, that we want to make a UMAP of the iris dataset, using the sepal and petal measurements as 'genes'.
library(ggplot2)
library(scales)
library(ggnewscale)
#> Warning: package 'ggnewscale' was built under R version 4.1.1
# Calculate a UMAP
umap <- uwot::umap(iris[, 1:4])
# Combine with original data and blended colours
df <- cbind.data.frame(
setNames(as.data.frame(umap), c("x", "y")),
iris,
colour = rgb(
rescale(iris$Sepal.Length),
rescale(iris$Sepal.Width),
rescale(iris$Petal.Length)
)
)
ggplot(df, aes(x, y, colour = colour)) +
geom_point() +
scale_colour_identity() +
new_scale_colour() +
# shape = NA --> invisible layers
geom_point(aes(colour = Sepal.Length), shape = NA) +
scale_colour_gradient(low = "black", high = "red") +
new_scale_colour() +
geom_point(aes(colour = Sepal.Width), shape = NA) +
scale_colour_gradient(low = "black", high = "green") +
new_scale_colour() +
geom_point(aes(colour = Petal.Length), shape = NA) +
scale_colour_gradient(low = "black", high = "blue")
#> Warning: Removed 150 rows containing missing values (geom_point).
#> Warning: Removed 150 rows containing missing values (geom_point).
#> Warning: Removed 150 rows containing missing values (geom_point).
On the more experimental side of things, I have a package on GitHub that has related functionality.
library(ggchromatic) # devtools::install_github("teunbrand/ggchromatic")
ggplot(df, aes(x, y, colour = rgb_spec(Sepal.Length, Sepal.Width, Petal.Length))) +
geom_point()
Created on 2021-10-18 by the reprex package (v2.0.1)
A small sidenote: a plot becomes very hard to interpret when some attributes of the data are mapped to different colour channels.
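To connect this back to the question's data: the precalculated-colour approach could be adapted to the per-gene limits used there by squishing each gene to its limits before blending. The following is only a sketch. The values in FD are simulated stand-ins (the real expression values are not shown in the question), and blend_rgb() is a hypothetical helper, not part of any package.

```r
library(ggplot2)
library(scales)

# Simulated stand-in for the question's FD data frame (values are made up).
set.seed(1)
n <- 500
FD <- data.frame(
  UMAP_1 = rnorm(n),
  UMAP_2 = rnorm(n),
  FOSL2  = runif(n, 0, 0.5),
  JUNB   = runif(n, 0, 0.5),
  HES1   = runif(n, 0, 0.5)
)

# Hypothetical helper: clamp each gene to its limits, rescale to [0, 1],
# and use the three results as the red, green and blue channels.
blend_rgb <- function(r, g, b, r_lim, g_lim, b_lim) {
  channel <- function(x, lim) rescale(squish(x, lim), to = c(0, 1), from = lim)
  rgb(channel(r, r_lim), channel(g, g_lim), channel(b, b_lim))
}

# Same limits as in the question's three scale_colour_gradientn() calls.
FD$colour <- blend_rgb(
  FD$FOSL2, FD$JUNB, FD$HES1,
  r_lim = c(0, 0.3), g_lim = c(0.1, 0.2), b_lim = c(0, 0.3)
)

p <- ggplot(FD, aes(UMAP_1, UMAP_2, colour = colour)) +
  geom_point(size = 0.3) +
  scale_colour_identity()
```

Note that with this blend, cells expressing none of the three genes come out black rather than light grey, since all three channels are zero.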
I have gotten the following graph in ggplot2, and it is almost exactly what I want. The one detail that I want to change is to put different shapes on the lines, just in case someone is colorblind and unable to distinguish whatever two colors I ultimately choose.
library(ggplot2)
color1 = "red"
color2 = "blue"
DF1.grp1 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(1,2,3,4,5,6),grp=rep("grp1",6))
DF1.grp2 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(2,3,4,4,5,9),grp=rep("grp2",6))
DF1 <- rbind(DF1.grp1,DF1.grp2)
ggplot(shape=DF1$grp) + #
geom_line(data=DF1,aes(x=X,y=Y,color=grp),size=1)+
geom_point(data=DF1,aes(x=X,y=Y,color=grp),size=3)+
xlab("x variable") +
ylab("y variable") +
scale_colour_manual(values=c(color1,color2)) +
labs(color="") +
geom_density(alpha = 0.5) + theme_bw()+theme(legend.position="bottom")
In the geom_point line, I have tried by including shape=grp, and that does give me different shapes on each line and in the right color. However, the legend gives the colors and shapes separately and puts the shapes in black.
library(ggplot2)
color1 = "red"
color2 = "blue"
DF1.grp1 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(1,2,3,4,5,6),grp=rep("grp1",6))
DF1.grp2 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(2,3,4,4,5,9),grp=rep("grp2",6))
DF1 <- rbind(DF1.grp1,DF1.grp2)
ggplot(shape=DF1$grp) + #
geom_line(data=DF1,aes(x=X,y=Y,color=grp),size=1)+
geom_point(data=DF1,aes(x=X,y=Y,color=grp,shape=grp),size=3)+
xlab("x variable") +
ylab("y variable") +
scale_colour_manual(values=c(color1,color2)) +
labs(color="") +
geom_density(alpha = 0.5) + theme_bw()+theme(legend.position="bottom")
The ideal plot for me would be to change the first image by just putting a blue triangle on the blue part of the legend and a red circle on the red part of the legend, replacing the circles that are there.
What would be the appropriate change to my first block of code to accomplish this?
For legends to be combined, they must have the same title. In your second example, set both titles to "" with labs(color="", shape="") and you will get a single legend with over-plotted symbol and line.
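As a rough sketch of that suggestion, here is the second example trimmed down to the line and point layers, with both legend titles set to the same empty string. The data are rebuilt in one data.frame() call for brevity, and the stray geom_density() layer is left out.

```r
library(ggplot2)

DF1 <- data.frame(
  X   = rep(c(5, 10, 15, 20, 25, 30), 2),
  Y   = c(1, 2, 3, 4, 5, 6, 2, 3, 4, 4, 5, 9),
  grp = rep(c("grp1", "grp2"), each = 6)
)

p <- ggplot(DF1, aes(x = X, y = Y, color = grp)) +
  geom_line() +
  geom_point(aes(shape = grp), size = 3) +
  scale_colour_manual(values = c("red", "blue")) +
  # Same (empty) title for both scales, so the two legends can merge.
  labs(color = "", shape = "") +
  theme_bw() +
  theme(legend.position = "bottom")
```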
Here is a simple solution
library(ggplot2)
DF1.grp1 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(1,2,3,4,5,6),grp=rep("grp1",6))
DF1.grp2 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(2,3,4,4,5,9),grp=rep("grp2",6))
DF1 <- rbind(DF1.grp1,DF1.grp2)
ggplot(DF1, aes(X, Y, color = grp, shape = grp)) +
geom_point() +
geom_line() +
scale_colour_manual(values=c('red','blue'))+
scale_shape_manual(values = c(24,25))+
theme_bw()+theme(legend.position="bottom")
Created on 2020-02-07 by the reprex package (v0.3.0)
To your extended question
The shapes I have used are already ones that can be filled. If you move the aesthetics into the individual geoms, you can change them individually. That is what I did in the following code, where I have used fill instead of color for the points.
The legend for a specific scale can be switched off with guide = 'none' in the corresponding scale_ function.
library(ggplot2)
DF1.grp1 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(1,2,3,4,5,6),grp=rep("grp1",6))
DF1.grp2 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(2,3,4,4,5,9),grp=rep("grp2",6))
DF1 <- rbind(DF1.grp1,DF1.grp2)
ggplot(DF1, aes(X, Y)) +
geom_point(aes(fill = grp, shape = grp)) +
geom_line(aes(color = grp)) +
scale_colour_manual(values=c('red','blue'), guide = 'none')+
scale_shape_manual(values = c(24,25))+
theme_bw()+theme(legend.position="bottom")
Created on 2020-02-08 by the reprex package (v0.3.0)
To your further extended question
library(ggplot2)
DF1.grp1 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(1,2,3,4,5,6),grp=rep("grp1",6))
DF1.grp2 <- data.frame(X=c(5,10,15,20,25,30),
Y=c(2,3,4,4,5,9),grp=rep("grp2",6))
DF1 <- rbind(DF1.grp1,DF1.grp2)
ggplot(DF1, aes(X, Y)) +
geom_point(aes(fill = grp, shape = grp), stroke =0, size =5) +
geom_line(aes(color = grp)) +
scale_colour_manual(values=c('red','blue'), guide = 'none')+
scale_fill_manual(values=c('red','blue'))+
scale_shape_manual(values = c(24,25))+
theme_bw()+
theme(legend.position="bottom")
Created on 2020-02-08 by the reprex package (v0.3.0)
I have added scale_fill_manual() and set the stroke of geom_point to zero. I have also made the points larger so that they are easier to see.
I have data where each point lies on a spectrum between two centroids. I have generated a color for each point by specifying a color for each centroid, then setting the color of each point as a function of its position between its two centroids. I used this to manually specify colors for each point and plotted the data in the following way:
lb.plot.dat <- data.frame('UMAP1' = lb.umap$layout[,1], 'UMAP2' = lb.umap$layout[,2],
'sample' = as.factor(substr(colnames(lb.vip), 1, 5)),
'fuzzy.class' = color.vect)
p3 <- ggplot(lb.plot.dat, aes(x = UMAP1, y = UMAP2)) + geom_point(aes(color = color.vect)) +
ggtitle('Fuzzy Classification') + scale_color_identity()
p3 + facet_grid(cols = vars(sample)) + theme(legend.) +
ggsave(filename = 'ref-samps_bcell-vip-model_fuzzy-class.png', height = 8, width = 16)
(color.vect is the aforementioned vector of colors for each point in the plot)
I would like to generate a legend of this plot that gives the color used for each centroid. I have a named vector class.cols that contains the colors used for each centroid and is named according to the corresponding class.
Is there a way to transform this vector into a legend for the plot even though it is not explicitly used in the plotting call?
You can turn on legend drawing in scale_color_identity() by setting guide = "legend". You'll have to specify the breaks and labels in the scale function so that the legend correctly states what each color represents, and not just the name of the color.
library(ggplot2)
df <- data.frame(x = 1:3, y = 1:3, color = c("red", "green", "blue"))
# no legend by default
ggplot(df, aes(x, y, color = color)) +
geom_point() +
scale_color_identity()
# legend turned on
ggplot(df, aes(x, y, color = color)) +
geom_point() +
scale_color_identity(guide = "legend")
Created on 2019-12-15 by the reprex package (v0.3.0)
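The breaks and labels part could look roughly like the sketch below. The class names in class.cols are made up for illustration; the question only says that such a named vector exists.

```r
library(ggplot2)

# Hypothetical named vector: names are class labels, values are centroid colours.
class.cols <- c(naive = "red", memory = "blue")

# Toy points: two at the centroid colours, one at a blend between them.
df <- data.frame(x = 1:3, y = 1:3, color = c("red", "#800080", "blue"))

p <- ggplot(df, aes(x, y, color = color)) +
  geom_point() +
  scale_color_identity(
    guide  = "legend",
    breaks = unname(class.cols),  # colours that get a legend key
    labels = names(class.cols)    # text shown next to each key
  )
```

The blended colour is still drawn but gets no legend key, because it is not listed in breaks. As far as I understand, a break only appears in the legend if that exact colour occurs in the plotted data, so this works best when at least one point sits exactly on each centroid colour.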
I have a bunch of data for people touching bacteria for up to 5 touches. I'm comparing how much they pick up with and without gloves. I'd like to plot the mean by the factor NumberContacts and colour it red. E.g. the red dots on the following graphs.
So far I have:
require(tidyverse)
require(reshape2)
Make some data
df<-data.frame(Yes=rnorm(n=100),
No=rnorm(n=100),
NumberContacts=factor(rep(1:5, each=20)))
Calculate the mean for each level of NumberContacts
centroids<-aggregate(data=melt(df,id.vars ="NumberContacts"),value~NumberContacts+variable,mean)
Get them into two columns
centYes<-subset(centroids, variable=="Yes",select=c("NumberContacts","value"))
centNo<-subset(centroids, variable=="No",select="value")
centroids<-cbind(centYes,centNo)
colnames(centroids)<-c("NumberContacts","Gloved","Ungloved")
Make an ugly plot.
ggplot(df,aes(x=Yes,y=No))+
geom_point()+
geom_abline(slope=1,linetype=2)+
stat_ellipse(type="norm",linetype=2,level=0.975)+
geom_point(data=centroids,aes(x=Gloved,y=Ungloved),size=5,color='red')+
#stat_summary(fun.y="mean",colour="red")+ doesn't work
facet_wrap(~NumberContacts,nrow=2)+
theme_classic()
Is there a more elegant way by using stat_summary? Also, how can I change the look of the boxes at the top of my graphs?
stat_summary is not an option because (see ?stat_summary):
stat_summary operates on unique x
That is, while we can take a mean of y, x remains fixed. But we may do something else that is very concise:
ggplot(df, aes(x = Yes, y = No, group = NumberContacts)) +
geom_point() + geom_abline(slope = 1, linetype = 2)+
stat_ellipse(type = "norm", linetype = 2, level = 0.975)+
geom_point(data = df %>% group_by(NumberContacts) %>% summarise_all(mean), size = 5, color = "red")+
facet_wrap(~ NumberContacts, nrow = 2) + theme_classic() +
theme(strip.background = element_rect(fill = "black"),
strip.text = element_text(color = "white"))
which also shows that to modify the boxes above you want to look at strip elements of theme.
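A side note on the centroid computation: summarise_all() is superseded in current dplyr. A sketch of the same per-group means using across(), assuming dplyr 1.0.0 or later and a seeded copy of the question's simulated data:

```r
library(dplyr)

# Same shape as the question's data, with a seed for reproducibility.
set.seed(42)
df <- data.frame(
  Yes = rnorm(100),
  No = rnorm(100),
  NumberContacts = factor(rep(1:5, each = 20))
)

# One row per NumberContacts level, holding the mean of Yes and of No.
centroids <- df %>%
  group_by(NumberContacts) %>%
  summarise(across(c(Yes, No), mean))
```

The resulting centroids data frame keeps the column names Yes and No, so it can be passed as data to the red geom_point() layer without repeating the aesthetics.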
I'm trying to add shapes on the lines plotted using geom_freqpoly to give more visibility to them if the plot is printed b/w on paper.
data <- data.frame(time=runif(1000,0,20000),
class=c("a","b","c","d"))
ggplot(data, aes(time, colour = class)) + geom_freqpoly(binwidth = 1000) + geom_point(aes(shape=class))
but this generates this error:
'Error: geom_point requires the following missing aesthetics: y'
How can I solve this error?
Another thing is that I want to use a single colour (e.g. blue) to draw the lines, but with scale_colour_brewer() I can't change the colour scale. I want to change it because the lightest colour is nearly white and you can barely see it.
How can I add a custom min and max for the colours?
How about this? The error you are getting is being produced by geom_point which needs x and y, so I removed it.
ggplot(data, aes(x = time, color = class)) +
geom_freqpoly(binwidth = 1000) +
scale_color_brewer(palette = "Blues") +
theme_dark()
If you don't want the dark background, pass manual values from RColorBrewer. The following example uses every second color to increase the contrast.
p1 <- ggplot(data, aes(x = time, color = class)) +
geom_freqpoly(binwidth = 1000) +
scale_color_manual(values = RColorBrewer::brewer.pal(9, name = "Blues")[c(3, 5, 7, 9)])
EDIT
You can extract the summarised data from a ggplot object using the layer_data() function.
xy <- layer_data(p1)
ggplot(xy, aes(x = x, y = count, color = colour)) +
theme_bw() +
geom_line() +
geom_point() +
scale_color_manual(values = RColorBrewer::brewer.pal(9, name = "Blues")[c(3, 5, 7, 9)])
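For the original goal of one shape per class, an alternative that might be worth trying is to let geom_point() use the binning stat directly, so that it receives the computed counts as y. This is a sketch on data simulated like the question's, with a seed added:

```r
library(ggplot2)

set.seed(1)
data <- data.frame(
  time  = runif(1000, 0, 20000),
  class = rep(c("a", "b", "c", "d"), times = 250)
)

p <- ggplot(data, aes(x = time, colour = class)) +
  geom_freqpoly(binwidth = 1000) +
  # stat = "bin" computes the counts, which supplies geom_point with y values.
  geom_point(aes(shape = class), stat = "bin", binwidth = 1000)
```

Both layers use the same binwidth, so the points should land on the vertices of the lines. The only difference is that geom_freqpoly() pads each line with an empty bin at either end, which gets no point.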