I'm trying to make a plot across two factors (strain and sex) and use the alpha value to communicate sex. Here is my code and the resulting plot:
ggplot(subset(df.zfish.data.overall.long, day=='day_01' & measure=='distance.from.bottom'), aes(x=Fish.name, y=value*100)) +
geom_boxplot(aes(alpha=Sex, fill=Fish.name), outlier.shape=NA) +
scale_alpha_discrete(range=c(0.3,0.9)) +
scale_fill_brewer(palette='Set1') +
coord_cartesian(ylim=c(0,10)) +
ylab('Distance From Bottom (cm)') +
xlab('Strain') +
scale_x_discrete(breaks = c('WT(AB)', 'WT(TL)', 'WT(TU)', 'WT(WIK)'), labels=c('AB', 'TL', 'TU', 'WIK')) +
guides(color=guide_legend('Fish.name'), fill=FALSE) +
theme_classic(base_size=10)
I'd like for the legend to reflect the alpha value in the plot (i.e. alpha value F = 0.3, alpha value M=0.9) as greyscale/black as I think that will be intuitive.
I've tried altering the scale_alpha_discrete, but cannot figure out how to send it a single color for the legend. I've also tried playing with 'guides()' without much luck. I suspect there's a simple solution, but I cannot see it.
One option to achieve your desired result would be to set the fill color for the alpha legend via the override.aes argument of guide_legend.
Making use of mtcars as example data:
library(ggplot2)
ggplot(mtcars, aes(x = cyl, y = mpg)) +
geom_boxplot(aes(fill = factor(cyl), alpha = factor(am))) +
scale_alpha_discrete(range = c(0.3, 0.9), guide = guide_legend(override.aes = list(fill = "black"))) +
scale_fill_brewer(palette='Set1') +
theme_classic(base_size=10) +
guides(fill = "none")
#> Warning: Using alpha for a discrete variable is not advised.
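The warning above comes from scale_alpha_discrete(). As a minimal sketch (still on mtcars, not the asker's data), the same legend can be built with scale_alpha_manual() and explicit alpha values per level, which should avoid the warning. The legend title "am" and the use of factor(cyl) on the x axis are choices made for this illustration only:

```r
library(ggplot2)

# Explicit alpha per level; the names must match the levels of factor(am)
alpha.values <- c("0" = 0.3, "1" = 0.9)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(aes(fill = factor(cyl), alpha = factor(am))) +
  scale_alpha_manual(
    name = "am",
    values = alpha.values,
    # draw the alpha legend keys in black so only the transparency differs
    guide = guide_legend(override.aes = list(fill = "black"))
  ) +
  scale_fill_brewer(palette = "Set1") +
  theme_classic(base_size = 10) +
  guides(fill = "none")
```

For the original plot, the names in alpha.values would presumably be the levels of Sex (e.g. F and M).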
I've made a histogram graph that shows the distribution of lidar returns per elevation for three lidar scans I have done.
I've converted my data to long format, with:
one column called 'value', describing the z position of each point
one column called 'variable', containing the name of each scan group
In the attached image you can see the histograms of my three scan groups. I am currently using viridis to color the histogram by scan group (i.e. the name of the scan in the variable column). However, I want to match the colours in the graph with colours I already have.
How might I do this?
The hex colors I'd like to use for my three histograms are:
lightgreen = "#62FE96"
lightred = "#FE206B"
darkpurple = "#62278E"
A link to my data - 'density2'
My current code:
library(tidyverse)
library(viridisLite)
library(viridis)
# histogram
p <- density2 %>%
ggplot( aes(x=value,color = variable, show.legend = FALSE)) +
geom_histogram(binwidth = 1, alpha = 0.5, position="identity") +
scale_color_viridis(discrete =TRUE) +
scale_fill_viridis(discrete=TRUE) +
theme_bw() +
labs(fill="") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
p + scale_y_sqrt() + theme(legend.position="none") + labs(y = "data pts", x = "elevation (m)")
Any help would be most appreciated!
Delete the scale_color_viridis and scale_fill_viridis lines - these are applying the Viridis color scale. Replace with scale_fill_manual(values = c(lightgreen, lightred, darkpurple)). And in your aesthetic mapping replace color = variable with fill = variable. For a histogram, color refers to the color of the lines outlining each bar, and fill refers to the color each bar is filled in.
This should leave you with:
p <- density2 %>%
ggplot(aes(x = value, fill = variable)) +
geom_histogram(binwidth = 1, alpha = 0.5, position = "identity") +
scale_fill_manual(values = c(lightgreen, lightred, darkpurple)) +
theme_bw() +
labs(fill = "") +
theme(panel.grid = element_blank())
p + scale_y_sqrt() +
theme(legend.position = "none") +
labs(y = "data pts", x = "elevation (m)")
I've also done some other clean-up. show.legend = FALSE does not belong inside aes() - and your theme(legend.position = "none") should take care of it.
I did not download your data, save it in my working directory, import it into R, and test this code on it. If you need more help, please post a small subset of your data in a copy/pasteable format (e.g., dput(density2[1:20, ]) for the first 20 rows---choose a suitable subset) and I'll be happy to test and adjust.
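Since the real data was not tested, here is a minimal sketch on simulated data. The data frame and the scan names scan_a, scan_b and scan_c are hypothetical stand-ins for density2 and its variable column. Naming the entries of values ties each color to a specific scan rather than relying on the order of the levels:

```r
library(ggplot2)

# Hypothetical stand-in for density2: long format with `value` and `variable`
set.seed(1)
density2 <- data.frame(
  value = c(rnorm(300, 10, 3), rnorm(300, 15, 4), rnorm(300, 20, 5)),
  variable = rep(c("scan_a", "scan_b", "scan_c"), each = 300)
)

# Named vector: each scan group gets a fixed color
scan.cols <- c(scan_a = "#62FE96", scan_b = "#FE206B", scan_c = "#62278E")

p <- ggplot(density2, aes(x = value, fill = variable)) +
  geom_histogram(binwidth = 1, alpha = 0.5, position = "identity") +
  scale_fill_manual(values = scan.cols) +
  scale_y_sqrt() +
  theme_bw() +
  theme(panel.grid = element_blank(), legend.position = "none") +
  labs(y = "data pts", x = "elevation (m)")
```

With unnamed values, the colors are assigned in the order of the factor levels (alphabetical for a character column), so the named form is safer if scan names change.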
Example code and figure:
data <- data.frame( ID = c(LETTERS[1:26], paste0("A",LETTERS[1:26])),
Group = rep(c("Control","Treatment"),26),
x = rnorm(52,50,20),
y = rnorm(52,50,10))
ggplot(data, aes(y=y,x=x, label=ID, color=Group)) +
geom_text(size=8) +
scale_color_manual(values=c("blue","red")) +
theme_classic() +
theme(legend.text = element_text(color=c("blue","red")))
What I'm trying to solve is removing the legend symbols (the "a") and coloring the Group labels (Control and Treatment) as they appear in the plot (Blue and Red respectively).
I've tried:
geom_text(show_guide = F)
But that just removes the legend entirely.
To keep it simple I could just use annotate...but wondering if there's a legend specific solution.
ggplot(data, aes(y=y,x=x, label=ID, color=Group)) +
geom_text(size=8, show_guide=F) +
scale_color_manual(values=c("blue","red")) +
theme_classic() +
annotate("text",label="Control", color="blue",x=20,y=80,size=8) +
annotate("text",label="Treatment", color="Red",x=23,y=77,size=8)
Another option is to use point markers (instead of the letter "a") as the legend symbols, which you can do with the following workaround:
Remove the geom_text legend.
Add a "dummy" point geom and set the point marker size to NA, so no points are actually plotted, but a legend will be generated.
Override the size of the point markers in the legend, so that point markers will appear in the legend key to distinguish each group.
ggplot(data, aes(y=y,x=x, label=ID, color=Group)) +
geom_text(size=8, show.legend=FALSE) +
geom_point(size=NA) +
scale_color_manual(values=c("blue","red")) +
theme_classic() +
labs(colour="") +
guides(colour=guide_legend(override.aes=list(size=4)))
Beginning with ggplot2 3.2.0, you can specify the glyph used in the legend using the argument key_glyph:
ggplot(data, aes(x=x, y=y, label=ID, color=Group)) +
geom_text(size=8, key_glyph="point") +
scale_color_manual(values=c("blue", "red")) +
labs(color=NULL) +
theme_classic()
For a full list of glyphs, refer to the ggplot2 documentation for draw_key. Credit to R Data Berlin for alerting me to this simple solution. Emil Hvitfeldt also has a nice blog post showcasing the options.
As a quick fix, you can tweak the legend key by hard-coding the information you want, though the other way around: keep the key and remove the label.
library(grid)
GeomText$draw_key <- function (data, params, size) {
txt <- ifelse(data$colour=="blue", "Control", "Treatment")
# change x=0 and left justify
textGrob(txt, 0, 0.5,
just="left",
gp = gpar(col = alpha(data$colour, data$alpha),
fontfamily = data$family,
fontface = data$fontface,
# also added 0.5 to reduce size
fontsize = data$size * .pt* 0.5))
}
And when you plot, you suppress the legend labels and make the legend key a bit wider to fit the text.
ggplot(data, aes(y=y,x=x, label=ID, color=Group)) +
geom_text(size=8) +
scale_color_manual(values=c("blue","red")) +
theme_classic() +
theme(legend.text = element_blank(),
legend.key.width = unit(1.5, "cm"))
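Assigning to GeomText$draw_key changes the key for every geom_text layer in the session. As a sketch of a more local variant, assuming ggplot2 3.2.0 or later, the same drawing function can be passed to a single layer through key_glyph, which accepts a function as well as a string. The function name draw_key_group_name is made up for this example, and the colour-to-name lookup is still hard-coded:

```r
library(ggplot2)
library(grid)

set.seed(1)
data <- data.frame(
  ID = c(LETTERS[1:26], paste0("A", LETTERS[1:26])),
  Group = rep(c("Control", "Treatment"), 26),
  x = rnorm(52, 50, 20),
  y = rnorm(52, 50, 10)
)

# Hypothetical key function: draws the group name in the key instead of "a"
draw_key_group_name <- function(data, params, size) {
  txt <- ifelse(data$colour == "blue", "Control", "Treatment")
  textGrob(txt, x = 0, y = 0.5, just = "left",
           gp = gpar(col = data$colour,
                     fontsize = data$size * .pt * 0.5))
}

p <- ggplot(data, aes(x = x, y = y, label = ID, color = Group)) +
  geom_text(size = 8, key_glyph = draw_key_group_name) +
  scale_color_manual(values = c("blue", "red")) +
  theme_classic() +
  theme(legend.text = element_blank(),
        legend.key.width = unit(1.5, "cm"))
```

Other geom_text layers keep their default key, since nothing global is modified.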
Is there a way to change the shape of the points for missing data in R? I am plotting .csv files like this one in a lollipop style.
Name,chr,Pos,Reads...ME_016,Reads...ME_017,Reads...ME_018,Reads...ME_019
cg01389728,chr10,6620395,33.82,41.38,41.38,38.46
cg01389728,chr10,6620410,0,-,-,-
cg01389728,chr10,6620430,0,0,-,-
cg01389728,chr10,6620447,0,-,0,-
cg01389728,chr10,6620478,0,-,-,-
cg01389728,chr10,6620510,28.33,29.85,25.64,28.13
cg01389728,chr10,6620520,0,0,-,0
cg01389728,chr10,6620531,0,-,50,-
Using ggplot2, my graphs are created with this:
dataset <-read.table("testset", sep=",",na.strings="-", header=TRUE)
dataset <- subset(dataset, select=c(-Name, -chr))
dataset <- melt(dataset, id.vars="Pos")
dataset$variable <- gsub("\\.\\.\\.","_",dataset$variable)
xaxes <- unique(dataset$Pos)
dataset$Pos <- as.factor(dataset$Pos)
ggplot(dataset, aes(x=Pos, y=variable,fill=cut(value, breaks=10))) + geom_point(size=4, shape=21) + geom_line() + scale_fill_discrete(labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%")) +
xlab("CpG Positions") +
ylab("Sample") +
labs(fill="Coverage in %") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),plot.title = element_text(vjust=2),axis.title.x = element_text(vjust=-0.5),axis.title.y = element_text(vjust=1.5))
However, I want to set the shape of the missing points ("-") in the plot to an "x", (shape=4) and show them also in the legend.
I've tried approaches like:
scale_fill_manual(values=c(value, NA))
or:
scale_shape_manual(values=c(21,4))
By default, the "-" are also shown with shape 21 and grey colour. There must be a way to manipulate this? Writing a method like this might be the trick, but how to call it for the whole column?
formas <- function(x){
if(is.na(x)) forma <- 4
if(!is.na(x)) forma <- 21
return(forma)
}
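On calling such a helper for a whole column: is.na() and ifelse() are both vectorized, so a sketch like the following (reusing the name formas from the question, with a made-up value vector) handles every element in one call:

```r
# Vectorized version: returns shape 4 for NA entries and 21 otherwise
formas <- function(x) ifelse(is.na(x), 4, 21)

# Made-up values standing in for dataset$value
value <- c(33.82, 0, NA, 28.33, NA)
shapes <- formas(value)
```

The result could be stored as a column and mapped with aes(shape = ...) plus scale_shape_identity(), although identity scales draw no legend by default.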
This comes pretty close, I think.
ggplot(dataset, aes(x=Pos, y=variable,
color=cut(value, breaks=10),
shape=ifelse(is.na(value),"Missing","Present"))) +
geom_point(size=4) +
geom_line() +
scale_shape_manual(name="",values=c(Missing=4,Present=19))+
scale_color_discrete(labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%")) +
xlab("CpG Positions") +
ylab("Sample") +
labs(color="Coverage in %") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),plot.title = element_text(vjust=2),axis.title.x = element_text(vjust=-0.5),axis.title.y = element_text(vjust=1.5))
Changes are:
used color instead of fill, with shape=19 for points with data
added shape aesthetic to ggplot(...) call.
removed shape=21 from geom_point(...) call.
added scale_shape_manual(...) to define the shapes for Missing and Present, and turn off the guide label.
I know you wanted filled points with a black outline (it does look better), but when I tried that with the added shape aesthetic, the fill legend does not display the colors correctly. Try it yourself.
Here is another approach that comes closer to producing the graph you specified (circular points with black outline and fill color determined by coverage).
fill.colors <- hcl(h=seq(15, 375, length=11), l=65, c=100)[1:10]
ggplot(dataset, aes(x=Pos, y=variable,
fill=cut(value, breaks=10),
shape=ifelse(is.na(value),"Missing","Present"))) +
geom_point(size=4) +
geom_line() +
scale_fill_manual(name="Coverage in %",
values=fill.colors,
labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%"),
drop=FALSE) +
scale_shape_manual(name="",values=c(Missing=4,Present=21),limits=c("Missing"))+
xlab("CpG Positions") +
ylab("Sample") +
labs(color="Coverage in %") +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),
plot.title = element_text(vjust=2),
axis.title.x = element_text(vjust=-0.5),
axis.title.y = element_text(vjust=1.5))+
guides(fill=guide_legend(override.aes=list(colour=fill.colors),order=1))
The problem in the other answer with using point shape 21 and the fill aesthetic is that, while the fill colors are displayed correctly in the plot, they are not displayed correctly in the legend. One way around that is to force ggplot to set the legend fill colors using
guides(fill=guide_legend(override.aes=list(colour=fill.colors),order=1))
Unfortunately, to do that you have to specify the fill colors manually (so that the actual fill and the override fill are the same). This code does that using
fill.colors <- hcl(h=seq(15, 375, length=11), l=65, c=100)[1:10]
which creates a color palette that mimics the ggplot default. You could of course use your own color palette here.
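As an alternative sketch, the scales package (installed alongside ggplot2) exposes the default hue palette directly, which avoids reproducing the hcl() parameters by hand. Whether the resulting strings are character-for-character identical to the hcl() call above is not checked here:

```r
library(scales)

# Ten colors from the default ggplot2 discrete hue palette
fill.colors <- hue_pal()(10)
```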
While this does come closer to your original intent, I actually think the other answer provides a better data visualization. The black outlines around the points, while "attractive", make it much more difficult to distinguish between fill colors, especially with 10 possible colors (which is at the edge of discernibility anyway).
I can't see why this is not working:
fill.colors <- hcl(h=seq(15, 375, length=11), l=65, c=100)[1:10]
ggplot(dataset, aes(x=Pos, y=variable
,color=cut(value, breaks=c(-0.01,10,20,30,40,50,60,70,80,90,100))
,shape=ifelse(is.na(value),"Missing","Present"))) +
geom_point(size=4) +
scale_shape_manual(name="",values=c("Missing"=4,"Present"=19),limits=c("Missing"))+
scale_color_manual(name="Coverage in %",
values=ifelse(is.na(dataset$value),"grey",fill.colors),
labels=c("0-10%","10-20%","20-30%","30-40%","40-50%","50-60%","60-70%","70-80%","80-90%","90-100%"),drop=FALSE) +
theme_bw() +
theme(axis.text.x = element_text(angle=90, hjust=1, vjust=0.5),
plot.title = element_text(vjust=2),
axis.title.x = element_text(vjust=-0.5),
axis.title.y = element_text(vjust=1.5)) +
xlab("CpG Positions") +
ylab("Sample") +
labs(color="Coverage in %") +
guides(fill=guide_legend(override.aes=list(colour=fill.colors),order=1))
NA values are not shown anymore with an X, and instead of displaying them in "grey", the class 90-100% will be shown in grey. No error message is shown - what is the problem?
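One likely explanation, offered as a sketch rather than a tested fix for the real data: values in scale_color_manual() expects one color per level, but ifelse(is.na(dataset$value), ...) returns one entry per row, so whichever of the first ten rows is NA turns its class grey. The color for missing data belongs in na.value instead. In current ggplot2 versions, limits=c("Missing") also appears to drop the "Present" points from the shape scale, so breaks is used below to restrict only the legend. The small data frame and the helper columns coverage and status are made up for this example:

```r
library(ggplot2)

# Hypothetical long-format data standing in for the melted `dataset`
dataset <- data.frame(
  Pos = factor(rep(c(6620395, 6620410, 6620430, 6620447), times = 2)),
  variable = rep(c("Reads_ME_016", "Reads_ME_017"), each = 4),
  value = c(33.82, 0, 0, 0, 41.38, NA, 0, NA)
)

cov.breaks <- c(-0.01, seq(10, 100, by = 10))
cov.labels <- paste0(seq(0, 90, by = 10), "-", seq(10, 100, by = 10), "%")
fill.colors <- hcl(h = seq(15, 375, length = 11), l = 65, c = 100)[1:10]

dataset$coverage <- cut(dataset$value, breaks = cov.breaks, labels = cov.labels)
dataset$status <- ifelse(is.na(dataset$value), "Missing", "Present")

p <- ggplot(dataset, aes(x = Pos, y = variable,
                         color = coverage, shape = status)) +
  geom_point(size = 4) +
  # breaks (not limits) hides "Present" from the legend but keeps the points
  scale_shape_manual(name = "", values = c(Missing = 4, Present = 19),
                     breaks = "Missing") +
  # one color per coverage class; missing values get na.value
  scale_color_manual(name = "Coverage in %",
                     values = setNames(fill.colors, cov.labels),
                     na.value = "grey", drop = FALSE) +
  theme_bw()
```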
I have an empirical PDF + CDF combo I'd like to plot on the same panel. distro.df has columns pdf, cdf, and day. I'd like the pdf values to be plotted as bars, and the cdf as lines. This does the trick for making the plot:
p <- ggplot(distro.df, aes(x=day))
p <- p + geom_bar(aes(y=pdf/max(pdf)), stat="identity", width=0.95, fill=fillCol)
p <- p + geom_line(aes(y=cdf))
p <- p + xlab("Day") + ylab("")
p <- p + theme_bw() + theme(panel.background = element_blank(), panel.border=element_blank())
However, I'm having trouble getting a legend to appear. I'd like a line for the cdf and a filled block for the pdf. I've tried various contortions with guides, but can't seem to get anything to appear.
Suggestions?
EDIT:
Per @Henrik's request: to make a suitable distro.df object:
df <- data.frame(day=0:10)
df$pdf <- runif(length(df$day))
df$pdf <- df$pdf / sum(df$pdf)
df$cdf <- cumsum(df$pdf)
Then the above to make the plot, then invoke p to see the plot.
This generally involves moving fill into aes and using it in both the geom_bar and geom_line layers. In this case, you also need to add show.legend = TRUE to geom_line (the argument was called show_guide before ggplot2 2.0.0).
Once you have that, you just need to set the fill colors in scale_fill_manual so CDF doesn't have a fill color and use override.aes to do the same thing for the lines. I didn't know what your fill color was, so I just used red.
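ggplot(df, aes(x=day)) +
geom_bar(aes(y=pdf/max(pdf), fill = "PDF"), stat="identity", width=0.95) +
# geom_line has no fill aesthetic (ggplot2 may warn); the mapping only adds "CDF" to the fill legend
geom_line(aes(y=cdf, fill = "CDF"), show.legend = TRUE) +
xlab("Day") + ylab("") +
theme_bw() +
# theme() changes this plot only; theme_update() would modify the global theme
theme(panel.background = element_blank(),
panel.border=element_blank()) +
# named values, so the result does not depend on level or break order
scale_fill_manual(values = c(PDF = "red", CDF = "transparent"),
breaks = c("PDF", "CDF"),
name = NULL,
guide = guide_legend(override.aes = list(linetype = c(0,1))))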
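A different route, sketched here as an alternative rather than as part of the answer above, is to give each layer its own aesthetic: fill for the bars and colour for the line. That produces a filled block for the PDF and a line for the CDF without override.aes. The red and black colors are arbitrary choices, and geom_col() stands in for geom_bar(stat="identity"):

```r
library(ggplot2)

set.seed(42)
df <- data.frame(day = 0:10)
df$pdf <- runif(nrow(df))
df$pdf <- df$pdf / sum(df$pdf)
df$cdf <- cumsum(df$pdf)

p <- ggplot(df, aes(x = day)) +
  # bars contribute a fill legend entry labelled "PDF"
  geom_col(aes(y = pdf / max(pdf), fill = "PDF"), width = 0.95) +
  # the line contributes a colour legend entry labelled "CDF"
  geom_line(aes(y = cdf, colour = "CDF")) +
  scale_fill_manual(name = NULL, values = c(PDF = "red")) +
  scale_colour_manual(name = NULL, values = c(CDF = "black")) +
  labs(x = "Day", y = NULL) +
  theme_bw() +
  theme(panel.background = element_blank(), panel.border = element_blank())
```

The two entries appear as separate legends stacked on top of each other, which usually reads as one legend once both titles are removed.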
I'd still like a solution to the above (and will check out @aosmith's answer), but I am currently going with a slightly different approach to eliminate the need to solve the problem:
p <- ggplot(distro.df, aes(x=day, color=pdf, fill=pdf))
p <- p + geom_bar(aes(y=pdf/max(pdf)), stat="identity", width=0.95)
p <- p + geom_line(aes(y=cdf), color="black")
p <- p + xlab("Day") + ylab("CDF")
p <- p + theme_bw() + theme(panel.background = element_blank(), panel.border=element_blank())
p
This also has the advantage of displaying some of the previously missing information, namely the PDF values.