To make my figure suitable for black-and-white printing, I mapped one variable to "shape", "lty", and "color" together.
ggplot(df, aes(x=time, y=mean,
shape=quality,
lty=quality,
color=quality))
I got the figure like,
I would like to make part of the legend labels subscripts, with this code:
labels=c(expression(Pol[(Art)]), expression(Pol['(Aca-)']), expression(Pol['(Aca-)']))
Unfortunately, when I put the "label" in color or shape, it makes the legend quite complex, like,
Is it possible to map "shape", "color", and "lty" to one variable and set the subscripts, but keep them in a single legend?
To change the labels of a categorical scale, you use scale_*_discrete(labels = ...). Here you just need to do that for color, shape, and linetype.
You should avoid using lty = generally; that synonym is permitted for compatibility with base R, but it's not universally supported throughout ggplot2.
I changed your labels to be closer to what I think you meant (the third entry is now "Aca+" instead of a repeat of "Aca-") and to make them left-align better (by adding an invisible "+" to the first one to create the appropriate spacing).
lab1 <- c(expression(Pol[(Art)*phantom("+")]),
expression(Pol['(Aca-)']),
expression(Pol['(Aca+)']))
library(ggplot2)
ggplot(mtcars,
aes(wt, mpg,
color = factor(cyl),
shape = factor(cyl),
linetype = factor(cyl))) +
geom_point() +
stat_smooth(se = FALSE) +
scale_color_discrete(labels = lab1) +
scale_shape_discrete(labels = lab1) +
scale_linetype_discrete(labels = lab1)
If you find yourself needing to repeat exact copies of a function like this, there are two workarounds:
Relabel the data itself - OR -
Use purrr::invoke_map to iterate over the functions
library(purrr)
ggplot(mtcars,
aes(wt, mpg,
color = factor(cyl),
shape = factor(cyl),
linetype = factor(cyl))) +
geom_point() +
stat_smooth(se = FALSE) +
invoke_map(list(scale_color_discrete,
scale_linetype_discrete,
scale_shape_discrete),
labels = lab1)
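As a side note, purrr::invoke_map() has been deprecated since purrr 1.0.0. A minimal sketch of the same idea with base R's lapply() is below; it relies on the fact that ggplot2 lets you add a list of components with +. The names scale_list and p are just illustrative, and method = "lm" is only there to keep the smoother quiet on this small dataset.

```r
library(ggplot2)

# Same labels as above: subscripts via plotmath expressions
lab1 <- c(expression(Pol[(Art)*phantom("+")]),
          expression(Pol['(Aca-)']),
          expression(Pol['(Aca+)']))

# Build one scale per aesthetic, each with identical labels
scale_list <- lapply(
  list(scale_color_discrete, scale_shape_discrete, scale_linetype_discrete),
  function(scale_fun) scale_fun(labels = lab1)
)

# A list of scales can be added to a plot in one step
p <- ggplot(mtcars,
            aes(wt, mpg,
                color = factor(cyl),
                shape = factor(cyl),
                linetype = factor(cyl))) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_list
```

Because all three scales share the same title and labels, ggplot2 should still merge them into a single legend.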
Update:
This approach is mostly fine, but now the expression(...) syntax has a superior alternative, the excellent markdown-based {ggtext} package: https://github.com/wilkelab/ggtext
To switch to this method, use an (optionally named) vector of labels that looks like this:
library(ggtext)
lab1 <- c(
`4` = "Pol<sub>(Art)</sub>",
`6` = "Pol<sub>(Aca-)</sub>",
`8` = "Pol<sub>(Aca+)</sub>"
)
And then add this line to your theme:
... +
theme(
legend.text = element_markdown()
)
The advantages over the other method are that:
markdown syntax is a lot easier to search for help online and
now those labels can be stored in the actual data as a column, rather than passing them separately to each geom
You can use that new column as your aesthetic mapping [ggplot(..., aes(color = my_new_column, linetype = my_new_column, ...)] instead of having to pass extra labels in each layer using the purrr::invoke method.
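Here is a rough sketch of what storing the labels in the data could look like, assuming the ggtext package is installed. The column name quality and the lookup vector md_labels are made up for illustration; the mapping from cyl values to labels follows the named vector above.

```r
library(ggplot2)
library(ggtext)

# Hypothetical lookup from cyl values to markdown labels
md_labels <- c(`4` = "Pol<sub>(Art)</sub>",
               `6` = "Pol<sub>(Aca-)</sub>",
               `8` = "Pol<sub>(Aca+)</sub>")

cars <- mtcars
# Store the labels as a factor column; explicit levels keep the legend order
cars$quality <- factor(md_labels[as.character(cars$cyl)],
                       levels = unname(md_labels))

# One column drives all three aesthetics, so only one legend is drawn
p <- ggplot(cars,
            aes(wt, mpg,
                color = quality,
                shape = quality,
                linetype = quality)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  theme(legend.text = element_markdown())
```

With this layout there is no need for any scale_*_discrete(labels = ...) calls, since the labels are the data values themselves.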
Related
Let's say I don't need a 'proper' variable mapping but still would like to have legend keys to help the chart understanding. My actual data are similar to the following df
df <- data.frame(id = 1:10, line = rnorm(10), points = rnorm(10))
library(ggplot2)
ggplot(df) +
geom_line(aes(id, line, colour = "line")) +
geom_point(aes(id, points, colour = "points"))
Basically, I would like the legend key relative to points to be.. just a point, without the line in the middle. I got close to that with this:
library(reshape2)
df <- melt(df, id.vars="id")
ggplot() +
geom_point(aes(id, value, shape = variable), df[df$variable=="points",]) +
geom_line(aes(id, value, colour = variable), df[df$variable=="line",])
but it defines two separate legends. Fixing the second code (and having to reshape my data) would be fine too, but I'd prefer a way (if any) to manually change any legend key (and keep using the first approach). Thanks!
EDIT :
Thanks @alexwhan, you refreshed my memory about variable mapping. However, the easiest way I've got so far is still the following (very bad hack!):
df <- data.frame(id = 1:10, line = rnorm(10), points = rnorm(10))
ggplot(df) +
geom_line(aes(id, line, colour = "line")) +
geom_point(aes(id, points, shape = "points")) +
theme(legend.title=element_blank())
which is just hiding the title of the two different legends.
Other ideas more than welcome!!!
You can use override.aes= inside the guides() function to change the default appearance of the legend. In this case your guide is color=, so you should set shape=c(NA,16) to remove the shape from the line key and linetype=c(1,0) to remove the line from the point key.
ggplot(df) +
geom_line(aes(id, line, colour = "line")) +
geom_point(aes(id, points, colour = "points"))+
guides(color=guide_legend(override.aes=list(shape=c(NA,16),linetype=c(1,0))))
I am not aware of any way to do this easily, but you can do a hack version like this (using your melted dataframe, called df.m here):
p <- ggplot(df.m, aes(id, value)) +
geom_line(aes(colour = variable, linetype = variable)) + scale_linetype_manual(values = c(1,0)) +
geom_point(aes(colour = variable, alpha = variable)) + scale_alpha_manual(values = c(0,1))
The key is that you need to get the mapping right to have it displayed correctly in the legend. In this case, getting it 'right', means fooling it to look the way you want it to. It's probably worth pointing out this only works because you can set linetype to blank (0) and then use the alpha scale for the points. You can't use alpha for both, because it will only take one scale.
I am trying to add corresponding labels to the color in the bar in a histogram. Here is a reproducible code.
ggplot(aes(displ),data =mpg) + geom_histogram(aes(fill=class),binwidth = 1,col="black")
This code gives a histogram and give different colors for the car "class" for the histogram bars. But is there any way I can add the labels of the "class" inside corresponding colors in the graph?
The inbuilt functions geom_histogram and stat_bin are perfect for quickly building plots in ggplot. However, if you are looking to do more advanced styling it is often required to create the data before you build the plot. In your case you have overlapping labels which are visually messy.
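The following code builds a binned frequency table for the dataframe:
# Load the packages used below (plyr for ddply, reshape2 for melt)
library(ggplot2)
library(plyr)
library(reshape2)
# Subset data
mpg_df <- data.frame(displ = mpg$displ, class = mpg$class)
# Optional: inspect the raw displ-by-class counts
melt(table(mpg_df[, c("displ", "class")]))
# Bin data into intervals of width 1
breaks <- 1
cuts <- seq(0.5, 8, breaks)
mpg_df$bin <- .bincode(mpg_df$displ, cuts)
# Count the observations per class and bin
mpg_df <- ddply(mpg_df, .(class, bin), nrow)
names(mpg_df) <- c("class", "bin", "Freq")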
You can use this new table to set a conditional label, so boxes are only labelled if there are more than a certain number of observations:
ggplot(mpg_df, aes(x = bin, y = Freq, fill = class)) +
geom_bar(stat = "identity", colour = "black", width = 1) +
geom_text(aes(label=ifelse(Freq >= 4, as.character(class), "")),
position=position_stack(vjust=0.5), colour="black")
I don't think it makes a lot of sense duplicating the labels, but it may be more useful showing the frequency of each group:
ggplot(mpg_df, aes(x = bin, y = Freq, fill = class)) +
geom_bar(stat = "identity", colour = "black", width = 1) +
geom_text(aes(label=ifelse(Freq >= 4, Freq, "")),
position=position_stack(vjust=0.5), colour="black")
Update
I realised you can actually filter labels selectively using ggplot's computed variable ..count.. (newer ggplot2 versions also accept after_stat(count)). No need to preformat the data!
ggplot(mpg, aes(x = displ, fill = class, label = class)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=ifelse(..count..>4, ..count.., "")))
This post is useful for explaining special variables within ggplot: Special variables in ggplot (..count.., ..density.., etc.)
This second approach will only work if you want to label the dataset with the counts. If you want to label the dataset by the class or another parameter, you will have to prebuild the data frame using the first method.
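For newer ggplot2 versions (3.3.0 and later, as far as I know), the ..count.. notation has been superseded by after_stat(). A sketch of the same plot written that way, with p as an illustrative name:

```r
library(ggplot2)

# Same histogram, with counts above 4 printed inside the stacked bars
p <- ggplot(mpg, aes(x = displ, fill = class)) +
  geom_histogram(binwidth = 1, colour = "black") +
  stat_bin(binwidth = 1, geom = "text",
           position = position_stack(vjust = 0.5),
           # after_stat(count) refers to the count computed by stat_bin
           aes(label = ifelse(after_stat(count) > 4, after_stat(count), "")))
```

The threshold of 4 is the same arbitrary cut-off as above; adjust it to whatever keeps the labels readable.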
Looking at the examples from the other stackoverflow links you shared, all you need to do is change the vjust parameter.
ggplot(mpg, aes(x = displ, fill = class, label = class)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", vjust=1.5)
That said, it looks like you have other issues. Namely, the labels stack on top of each other because there aren't many observations at each point. Instead I'd just let people use the legend to read the graph.
I'm tinkering with geom_point trying to plot the following code. I have converted cars$vs to a factor with discrete levels so that I can visualize both levels of that variable in different colors by assigning it to "fill" in the ggplot aes settings.
cars <- mtcars
cars$vs <- as.factor(cars$vs)
ggplot(cars,aes(x = mpg, y = disp, fill = vs)) +
geom_point(size = 4) +
scale_fill_discrete(name = "Test")
As you can see, the graph does not differentiate between both "fill" conditions via color. However, it preserves the legend label I have specified in scale_fill_discrete.
Alternatively, I can plot the following (same code, but instead of "fill", use "color")
cars <- mtcars
cars$vs <- as.factor(cars$vs)
ggplot(cars,aes(x = mpg, y = disp, color = vs)) +
geom_point(size = 4) +
scale_fill_discrete(name = "Test")
As you can see, using "color" instead of "fill" differentiates between the levels of the factor via color, but seems to override any changes I make to the legend title using scale_fill_discrete.
Am I using "fill" incorrectly? How can I plot different levels of a factor in different colors using this method and have control over the plot legend via scale_fill_discrete?
Since you are using color as mapping, you can use scale_color_* to change the corresponding attributes instead of scale_fill_*:
ggplot(cars,aes(x = mpg, y = disp, color = vs)) +
geom_point(size = 4) +
scale_color_discrete(name = "Test")
To use a fill with geom_point you should use a fill-able shape:
ggplot(cars,aes(x = mpg, y = disp, fill = vs)) +
geom_point(size = 4, shape = 21) +
scale_fill_discrete(name = "Test")
See ?pch, which shows that shapes 21 to 25 can be colored and filled with different colors. ggplot will not use the fill unless the shape is one that is fill-able. This behavior has changed a bit in different versions, as seen in the NEWS file.
There's no reason to use fill with geom_point unless you want the outline and fill colors of the points to be different, so the other answer recommending color is probably what you want.
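For completeness, a small sketch of the case where fill does make sense: a fixed black outline with the fill mapped to the factor. The stroke value is an arbitrary choice for illustration.

```r
library(ggplot2)

cars <- mtcars
cars$vs <- as.factor(cars$vs)

# Shape 21 is a fill-able circle: colour sets the outline, fill the interior
p <- ggplot(cars, aes(x = mpg, y = disp, fill = vs)) +
  geom_point(size = 4, shape = 21, colour = "black", stroke = 1) +
  scale_fill_discrete(name = "Test")
```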
This is a question about the ggplot2 package (author: Hadley Wickham). I have existing ggplot objects with distinct colors (resp shapes, linetype, fill...) that I would like to map to a single color, e.g. black. What is the recommended approach?
Clarification: I have to work with these ggplot objects: I cannot re-make them
A ggplot with variables grouped as factors: this is the plot object p I need to work with
p <- ggplot(mtcars, aes(x = mpg, y = wt, group = factor(cyl), colour = factor(cyl))) +
geom_point(size = 5)
Several approaches I know of:
1. scale_colour_grey hack
p + scale_colour_grey(start = 0, end = 0) + # gives correct, useless legend
guides(color = FALSE)
The shorter p + scale_colour_grey(0,0) does not work; you have to be explicit about start and end.
2. scale_colour_manual with rep() hack
p + scale_colour_manual(values = rep("black",3)) # gives correct, useless legend
The simpler scale_colour_manual(values = "black") does not work. This was probably the most intuitive approach. Having to specify the length of the vector makes it less attractive an approach.
3. geom_point() recalled
p + geom_point(colour = "black") + # gives incorrect legend
guides(color = FALSE)
It is well documented that the following is not allowed:
p + scale_colour_manual(colour = "black")
Error in discrete_scale(aesthetic, "manual", pal, ...) :
unused argument (colour = "black")
Removing the color mapping directly seems to work:
p_bw = p
p_bw$mapping$colour = NULL
gridExtra::grid.arrange(p, p_bw)
If you just want to set the points to black and get rid of the color legend, I think you can just do this:
p + scale_colour_manual(values=rep("black",length(unique(mtcars$cyl))),
guide="none")
Given the following dataset:
data = cbind(1:10,c('open','reopen','closed'),letters[1:3],1:10)
data = rbind(data,cbind(1:10,c('open','closed','reopen'),letters[1:3],5:10))
data = rbind(data,cbind(1:10,c('closed','open','reopen'),letters[1:3],3:10))
data = data.frame(data);
colnames(data) <- c("id","status","author","when")
I'd like to get a plot similar to the following:
ggplot(data, aes(when,id)) +
geom_line(aes(group = id,colour = status)) +
geom_point(aes(group = id,colour = author))
But as it stands, I get a single legend titled 'author' that contains both the status and author values. How can I get the same result but with one legend for author and another for status? My rationale is that I want to layer two plots of the same dataset on top of each other.
I don't think you can have different color scales / legends for one ggplot. You could hack something together (see this question for legend hacking), but in this case, where one of your geoms is point, you could just use fill and one of the point shapes that can be filled in.
ggplot(data, aes(when,id)) +
geom_line(aes(group = id,colour = status)) +
geom_point(aes(group = id, fill = author),
shape = 21, color = NA, size = 4)
Here the colors used are the same for each, but you can edit the color or fill scales individually, e.g., adding
scale_fill_brewer(type = "qual") +
scale_color_brewer(type = "qual", palette = 2)
I do agree with AndyClifton that using color in two ways will be hard to distinguish. You could also experiment with line types, point shapes, or even plotting with geom_text using a word, a letter, or a number as a label instead of points. You say you have more than 6 values for author, but it will be very difficult to distinguish more than 6 colors for author, especially when color is also being used for status.
Let's take your data. First, you should be aware that your when and id columns are strings, so you are plotting 1, 10, 2, 3, ... rather than 1, ..., 9, 10. We can fix that:
data$when.num <-as.numeric(as.character(data$when))
data$id.num <-as.numeric(as.character(data$id))
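To see why the conversion goes through as.character() first, here is a small standalone check. The vectors ids and f are made up for illustration and are not part of your data.

```r
# Numbers stored as strings sort alphabetically, not numerically
ids <- as.character(1:10)
sort(ids)

# If the column is a factor, as.numeric() alone returns the level codes
f <- factor(ids)
codes <- as.numeric(f)

# Going through as.character() recovers the actual values
values <- as.numeric(as.character(f))
```

If the columns are plain character vectors rather than factors, as.numeric() on its own would also work, but the as.character() step is harmless and covers both cases.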
Then we'll plot it but use different shapes to get two different legends:
require(ggplot2)
p <- ggplot(data, aes(x = when.num, y = id.num)) +
geom_line(aes(group = id,colour = status)) +
geom_point(aes(group = id,shape = author))
print(p)
And you get this:
I think this is much clearer than using coloured points for the author, but this is a question of taste.