I'm tinkering with geom_point trying to plot the following code. I have converted cars$vs to a factor with discrete levels so that I can visualize both levels of that variable in different colors by assigning it to "fill" in the ggplot aes settings.
cars <- mtcars
cars$vs <- as.factor(cars$vs)
ggplot(cars,aes(x = mpg, y = disp, fill = vs)) +
geom_point(size = 4) +
scale_fill_discrete(name = "Test")
As you can see, the graph does not differentiate between both "fill" conditions via color. However, it preserves the legend label I have specified in scale_fill_discrete.
Alternatively, I can plot the following (same code, but instead of "fill", use "color")
cars <- mtcars
cars$vs <- as.factor(cars$vs)
ggplot(cars,aes(x = mpg, y = disp, color = vs)) +
geom_point(size = 4) +
scale_fill_discrete(name = "Test")
As you can see, using "color" instead of "fill" differentiates between the levels of the factor via color, but seems to override any changes I make to the legend title using scale_fill_discrete.
Am I using "fill" incorrectly? How can I plot different levels of a factor in different colors using this method and have control over the plot legend via scale_fill_discrete?
Since you are using color as mapping, you can use scale_color_* to change the corresponding attributes instead of scale_fill_*:
ggplot(cars,aes(x = mpg, y = disp, color = vs)) +
geom_point(size = 4) +
scale_color_discrete(name = "Test")
To use a fill with geom_point you should use a fill-able shape:
ggplot(cars,aes(x = mpg, y = disp, fill = vs)) +
geom_point(size = 4, shape = 21) +
scale_fill_discrete(name = "Test")
See ?pch, which shows that shapes 21 to 25 can be colored and filled with different colors. ggplot will not use the fill unless the shape is one that is fill-able. This behavior has changed a bit in different versions, as seen in the NEWS file.
There's no reason to use fill with geom_point unless you want the outline and fill colors of the points to be different, so the other answer recommending color is probably what you want.
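As a minimal sketch of the case where fill does make sense, the example below gives each point a constant black outline and a fill mapped to vs. The object name p_fill and the stroke value are illustrative choices, not part of the original answers.

```r
library(ggplot2)

cars <- mtcars
cars$vs <- as.factor(cars$vs)

# shape = 21 is a fill-able circle: `fill` colours the interior and
# `colour` the outline. The outline is a constant, so it is set outside aes().
p_fill <- ggplot(cars, aes(x = mpg, y = disp, fill = vs)) +
  geom_point(size = 4, shape = 21, colour = "black", stroke = 1) +
  labs(fill = "Test")  # legend title for the fill scale

p_fill
```

labs(fill = "Test") is used here as a shorter way to set the legend title; scale_fill_discrete(name = "Test") should give the same title.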
A plot will be made from these data:
mtcars %>%
gather(-mpg, key = "var", value = "value") %>%
ggplot(aes(x = value, y = mpg)) +
geom_point() +
facet_wrap(~ var, scales = "free") +
theme_bw()
How can I change the gray color of the panel titles? For instance:
panels am and hp: green
panels gear, drat and disp: red
panels vs and wt: blue
panels cyl, qsec and carb: black
I would also like to add a legend:
green = area
red = bat
blue = vege
black = indus
Unfortunately, it seems the way to answer OP's question is still going to be quite hacky.
If you're not into gtable hacks like those referenced... here's another exceptionally hacky way to do this. Enjoy the strange ride.
TL;DR - The idea here is to use a rect geom outside of the plot area to draw each facet label box color
Here's the basic plot below. OP wanted to (1) change the gray color behind the facet labels (called the "strip" labels) to a specific color depending on the facet label, then (2) add a legend.
First of all, I just referenced the gathered dataframe as df, so the plot code looks like this now:
df <- mtcars %>% gather(-mpg, key = "var", value = "value")
ggplot(df, aes(x = value, y = mpg)) +
geom_point() +
facet_wrap(~ var, scales = "free") +
theme_bw()
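As an aside, gather() has been superseded in tidyr (1.0.0 and later) by pivot_longer(). A minimal sketch of the same reshaping step, assuming a recent tidyr; the name df_long is only illustrative:

```r
library(tidyr)

# Same long format as gather(-mpg, key = "var", value = "value"):
# every column except mpg becomes a (var, value) pair.
df_long <- pivot_longer(mtcars, cols = -mpg,
                        names_to = "var", values_to = "value")

head(df_long)
```

The rows come out in a different order than with gather(), which should not matter for the faceted plot.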
How to recolor each facet label?
As referenced in the other answers, it's pretty simple to change all the facet label colors at once (and facet label text) via the theme() elements strip.background and strip.text:
plot + theme(
strip.background = element_rect(fill="blue"),
strip.text=element_text(color="white"))
Of course, we can't do that for all facet labels, because strip.background and element_rect() cannot be sent a vector or have mapping applied to the aesthetics.
The idea here is that we use something that can have aesthetics mapped to data (and therefore change according to the data) - use a geom. In this case, I'm going to use geom_rect() to draw a rectangle in each facet, then color that rect based upon the criteria OP states in their question. Moreover, using geom_rect() in this way also creates a legend automatically for us, since we are going to use mapping and aes() to specify the color. All we need to do is allow ggplot2 to draw layers outside the plot area, use a bit of manual fine-tuning to get the placement correct, and it works!
The Hack
First, a separate dataset is created containing a column called var that contains all facet names. Then var_color specifies the names OP gave for each facet. We specify color using a scale_fill_manual() function. Finally, it's important to use coord_cartesian() carefully here. We need this function for two reasons:
Cut the panel area in the plot to only contain the points. If we did not specify the y limit, the panel would automatically resize to accommodate the rect geom.
Turn clipping off. This allows layers drawn outside the panel to be seen.
We then need to turn strip.background to transparent (so we can see the color of the box), and we're good to go. Hopefully you can follow along below.
The full code is repeated below for clarity:
library(ggplot2)
library(tidyr)
library(dplyr)
# dataset for plotting
df <- mtcars %>% gather(-mpg, key = "var", value = "value")
# dataset for facet label colors
hacky_df <- data.frame(
var = c("am", "carb", "cyl", "disp", "drat", "gear", "hp", "qsec", "vs", "wt"),
var_color = c("area", "indus", "indus", "bat", "bat", "bat", "area", "indus", "vege", "vege")
)
# plot code
plot_new <-
ggplot(df) + # don't specify x and y here. Otherwise geom_rect will complain.
geom_rect(
data=hacky_df,
aes(xmin=-Inf, xmax=Inf,
ymin=36, ymax=42, # totally defined by trial-and-error
fill=var_color),
alpha=0.4) + # constant alpha goes outside aes() so it does not get its own legend
geom_point(aes(x = value, y = mpg)) +
coord_cartesian(clip="off", ylim=c(10, 35)) +
facet_wrap(~ var, scales = "free") +
scale_fill_manual(values = c("area" = "green", "bat" = "red", "vege" = "blue", "indus" = "black")) +
theme_bw() +
theme(
strip.background = element_rect(fill=NA),
strip.text = element_text(face="bold")
)
plot_new
To make my figure suitable for black-white printing, I mapped one variable with "shape", "lty", "color" together.
ggplot(df, aes(x=time, y=mean,
shape=quality,
lty=quality,
color=quality))
I got the figure like,
I would like to set part of the legend labels as subscripts, using this code:
labels=c(expression(Pol[(Art)]), expression(Pol['(Aca-)']), expression(Pol['(Aca-)']))
Unfortunately, when I put the "label" in color or shape, it makes the legend quite complex, like,
Is it possible to map "shape", "color" and "lty" to one variable, and set the subscripts, but keep them in a single legend?
To change the labels of a categorical scale, you use scale_*_discrete(labels = ...). Here you just need to do that for color, shape, and linetype.
You should avoid using lty = generally; that synonym is permitted for compatibility with base R, but it's not universally supported throughout ggplot2.
I changed your labels to be closer to what I think you meant (the third entry is now "Aca+" instead of a repeat of "Aca-") and to make them left-align better (by adding an invisible "+" to the first one to create the appropriate spacing).
lab1 <- c(expression(Pol[(Art)*phantom("+")]),
expression(Pol['(Aca-)']),
expression(Pol['(Aca+)']))
library(ggplot2)
ggplot(mtcars,
aes(wt, mpg,
color = factor(cyl),
shape = factor(cyl),
linetype = factor(cyl))) +
geom_point() +
stat_smooth(se = FALSE) +
scale_color_discrete(labels = lab1) +
scale_shape_discrete(labels = lab1) +
scale_linetype_discrete(labels = lab1)
If you find yourself needing to repeat exact copies of a function call like this, there are two workarounds:
Relabel the data itself - OR -
Use purrr::invoke_map to iterate over the functions
library(purrr)
ggplot(mtcars,
aes(wt, mpg,
color = factor(cyl),
shape = factor(cyl),
linetype = factor(cyl))) +
geom_point() +
stat_smooth(se = FALSE) +
invoke_map(list(scale_color_discrete,
scale_linetype_discrete,
scale_shape_discrete),
labels = lab1)
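Note that invoke_map() is deprecated as of purrr 1.0.0. A minimal sketch of the same idea in base R, relying on the fact that a list of scales can be added to a ggplot with +; the names shared_scales and p_scales are hypothetical:

```r
library(ggplot2)

lab1 <- c(expression(Pol[(Art)*phantom("+")]),
          expression(Pol['(Aca-)']),
          expression(Pol['(Aca+)']))

# Call each scale constructor with the same labels and collect the results.
shared_scales <- lapply(
  list(scale_color_discrete, scale_shape_discrete, scale_linetype_discrete),
  function(scale_fun) scale_fun(labels = lab1)
)

p_scales <- ggplot(mtcars,
                   aes(wt, mpg,
                       color = factor(cyl),
                       shape = factor(cyl),
                       linetype = factor(cyl))) +
  geom_point() +
  stat_smooth(se = FALSE) +
  shared_scales  # adding a list adds each element in turn

p_scales
```

Because all three scales share the same labels and the same mapped variable, ggplot2 should still merge them into a single legend.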
Update:
This approach is mostly fine, but now the expression(...) syntax has a superior alternative, the excellent markdown-based {ggtext} package: https://github.com/wilkelab/ggtext
To change to this method, use a (optionally, named) vector of labels that look like this:
library(ggtext)
lab1 <- c(
`4` = "Pol<sub>(Art)</sub>",
`6` = "Pol<sub>(Aca-)</sub>",
`8` = "Pol<sub>(Aca+)</sub>"
)
And then add this line to your theme:
... +
theme(
legend.text = element_markdown()
)
The advantages over the other method are that:
markdown syntax is a lot easier to search for help online and
now those labels can be stored in the actual data as a column, rather than passing them separately to each geom
You can use that new column as your aesthetic mapping, e.g. ggplot(..., aes(color = my_new_column, linetype = my_new_column, ...)), instead of having to pass extra labels to each scale using the purrr::invoke_map method.
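A minimal sketch of that column-based variant, assuming the ggtext package is installed. The column name cyl_label, the data frame name cars and the use of method = "lm" for the smoother are illustrative choices:

```r
library(ggplot2)
library(ggtext)

lab1 <- c(
  `4` = "Pol<sub>(Art)</sub>",
  `6` = "Pol<sub>(Aca-)</sub>",
  `8` = "Pol<sub>(Aca+)</sub>"
)

# Store the markdown labels in the data, looked up by the cyl value.
cars <- mtcars
cars$cyl_label <- unname(lab1[as.character(cars$cyl)])

p_md <- ggplot(cars,
               aes(wt, mpg,
                   color = cyl_label,
                   shape = cyl_label,
                   linetype = cyl_label)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(color = "cyl", shape = "cyl", linetype = "cyl") +
  theme(legend.text = element_markdown())  # render <sub> in legend labels

p_md
```

Giving all three aesthetics the same legend title via labs() keeps them merged into one legend.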
I am trying to add corresponding labels to the color in the bar in a histogram. Here is a reproducible code.
ggplot(aes(displ),data =mpg) + geom_histogram(aes(fill=class),binwidth = 1,col="black")
This code gives a histogram and give different colors for the car "class" for the histogram bars. But is there any way I can add the labels of the "class" inside corresponding colors in the graph?
The inbuilt functions geom_histogram and stat_bin are perfect for quickly building plots in ggplot. However, if you are looking to do more advanced styling it is often required to create the data before you build the plot. In your case you have overlapping labels which are visually messy.
The following code builds a binned frequency table for the data frame:
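library(ggplot2) # provides the mpg dataset
library(plyr) # provides ddply() and .()
library(reshape2) # provides melt()
# Subset data
mpg_df <- data.frame(displ = mpg$displ, class = mpg$class)
# Optional: inspect the raw displ-by-class counts in long form (result is not stored)
melt(table(mpg_df[, c("displ", "class")]))
# Bin data into bins of width 1 with edges at 0.5, 1.5, ..., 7.5
breaks <- 1
cuts <- seq(0.5, 8, breaks)
mpg_df$bin <- .bincode(mpg_df$displ, cuts)
# Count the rows in each class/bin combination
mpg_df <- ddply(mpg_df, .(class, bin), nrow)
names(mpg_df) <- c("class", "bin", "Freq")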
You can use this new table to set a conditional label, so boxes are only labelled if there are more than a certain number of observations:
ggplot(mpg_df, aes(x = bin, y = Freq, fill = class)) +
geom_bar(stat = "identity", colour = "black", width = 1) +
geom_text(aes(label=ifelse(Freq >= 4, as.character(class), "")),
position=position_stack(vjust=0.5), colour="black")
I don't think it makes a lot of sense duplicating the labels, but it may be more useful showing the frequency of each group:
ggplot(mpg_df, aes(x = bin, y = Freq, fill = class)) +
geom_bar(stat = "identity", colour = "black", width = 1) +
geom_text(aes(label=ifelse(Freq >= 4, Freq, "")),
position=position_stack(vjust=0.5), colour="black")
Update
I realised you can actually selectively filter a label using the special ggplot variable ..count.. (no need to preformat the data!):
ggplot(mpg, aes(x = displ, fill = class, label = class)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=ifelse(..count..>4, ..count.., "")))
This post is useful for explaining special variables within ggplot: Special variables in ggplot (..count.., ..density.., etc.)
This second approach will only work if you want to label the dataset with the counts. If you want to label the dataset by the class or another parameter, you will have to prebuild the data frame using the first method.
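In ggplot2 3.4.0 and later the ..count.. notation is deprecated in favour of after_stat(count). A sketch of the same plot in the newer notation, assuming a ggplot2 version that provides after_stat() (3.3.0 or later); the name p_counts is illustrative:

```r
library(ggplot2)

p_counts <- ggplot(mpg, aes(x = displ, fill = class)) +
  geom_histogram(binwidth = 1, colour = "black") +
  stat_bin(
    binwidth = 1, geom = "text",
    position = position_stack(vjust = 0.5),
    # label a bar segment only when it holds more than 4 observations
    aes(label = ifelse(after_stat(count) > 4, after_stat(count), ""))
  )

p_counts
```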
Looking at the examples from the other stackoverflow links you shared, all you need to do is change the vjust parameter.
ggplot(mpg, aes(x = displ, fill = class, label = class)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", vjust=1.5)
That said, it looks like you have other issues. Namely, the labels stack on top of each other because there aren't many observations at each point. Instead I'd just let people use the legend to read the graph.
gb <- read.csv('results-gradient-boosting.csv')
p <- ggplot(gb) + geom_point(aes(x = pred, y = y),alpha = 0.4, fill = 'darkgrey', size = 2) +
geom_line(aes(x = pred, y = pred,color = 'darkgrey'),size = 0.6) +
geom_line(aes(x = pred, y = pred + 3,color = I("darkgrey")), linetype = 'dashed',size = 0.6) +
geom_line(aes(x = pred, y = pred -3,color = 'darkgrey'),linetype = 'dashed',size = 0.6)
My code is above. I have no idea why when I put color inside aes, the color turns out to be red. But if I put it outside of aes, it is correct. Thanks for your help!
When you put color="darkgrey" outside aes, ggplot takes it literally to mean that the line should be colored "darkgrey". But when you put color="darkgrey" inside aes, ggplot takes it to mean that you want to map color to a variable. In this case, the variable has only one value: "darkgrey". But that's not the color "darkgrey". It's just a string. You could call it anything. The color ggplot chooses will be based on the default palette. Map color to a variable when you want different colors for different levels of that variable.
For example, see what happens in the example below. The colors are chosen from ggplot's default palette and are completely independent of the names we've used for colour in each call to geom_line. You will get the same three colors when you have any color aesthetic that takes on three different unique values:
library(ggplot2)
theme_set(theme_classic())
ggplot(mtcars) +
geom_line(aes(mpg, wt, colour="green")) +
geom_line(aes(mpg, wt - 1, colour="blue")) +
geom_line(aes(mpg, wt + 1, colour="star trek"))
But now we put the colors outside aes so they are taken literally, and we comment out the third line, because it will cause an error if we don't use a valid colour.
ggplot(mtcars) +
geom_line(aes(mpg, wt), colour="green") +
geom_line(aes(mpg, wt - 1), colour="blue") #+
#geom_line(aes(mpg, wt + 1), colour="star trek")
Note that if we map colour to an actual column of mtcars (one that has three unique levels), we get the same three colors as in the first example, but now they are mapped to an actual feature of the data:
ggplot(mtcars) +
geom_line(aes(mpg, wt, colour=factor(cyl)))
And finally, what if we want to set those mapped colors to different values:
ggplot(mtcars) +
geom_line(aes(mpg, wt, colour=factor(cyl))) +
scale_colour_manual(values=c("purple", hcl(150,100,80), rgb(0.9,0.5,0.3)))
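If the strings inside aes() really are meant as literal colour names, one option is scale_colour_identity(), which tells ggplot2 to use the mapped values as they are. A minimal sketch; the second colour and the name p_identity are illustrative:

```r
library(ggplot2)

p_identity <- ggplot(mtcars) +
  geom_line(aes(mpg, wt, colour = "darkgrey")) +
  geom_line(aes(mpg, wt - 1, colour = "steelblue"), linetype = "dashed") +
  # use the mapped strings directly as colours instead of palette lookups
  scale_colour_identity()

p_identity
```

By default the identity scale draws no legend, which is usually what you want when the values are just colour names.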
This is my first question on stackoverflow so please correct me if the question is unclear.
I would like to assign geom attributes for ggplot2 to a variable for reuse in multiple plots. For example, let's say I want to assign the attributes of size and shape to a variable to reuse in plotting data other than mtcars.
This code works, but if I have a lot of plots I don't want to keep re-entering the size and shape attributes.
ggplot(mtcars) +
geom_point(aes(x = wt,
y = mpg),
size = 5,
shape = 21
)
How should I assign a variable (eg size.shape) these attributes so that I can use it in the below code to produce the same plot?
ggplot(mtcars) +
geom_point(aes(x = wt,
y = mpg),
size.shape
)
If you always want to use the same values for size and shape (or other aesthetics), you could use update_geom_defaults() to set the default values to other values:
update_geom_defaults("point", list(size = 5, shape = 21))
These will then be used whenever you do not specifically give values for the aesthetics.
Example
The plot you create with the usual default settings looks as follows:
ggplot(mtcars) + geom_point(aes(x = wt, y = mpg))
But when you reset the defaults for size and shape, it looks different:
update_geom_defaults("point", list(size = 5, shape = 21))
ggplot(mtcars) + geom_point(aes(x = wt, y = mpg))
As you can see, the actual plot is done with the same code as before, but the result is different because you changed the default values for size and shape. Of course, you can still produce plots with any value for these aesthetics, by simply providing values in geom_point():
ggplot(mtcars) + geom_point(aes(x = wt, y = mpg), size = 2, shape = 2)
Note that the defaults are given by geom, which means that only geom_point() is affected.
This solution is convenient if there is only one set of values for size and shape that you want to use. If you have several sets of values that you want to be able to pick from when creating a plot, then you might be better off with something along the lines of the comment by lukeA.
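For the case of several reusable sets of values, here is one possible sketch (not necessarily what that comment proposed): keep each set in a named list and splice it into the geom call with do.call(), or wrap the geom in a small helper. The names size.shape, geom_point_big and the plot objects are hypothetical.

```r
library(ggplot2)

# A reusable set of constant aesthetics.
size.shape <- list(size = 5, shape = 21)

# Option 1: splice the list into the geom_point() call.
p_reuse <- ggplot(mtcars) +
  do.call(geom_point, c(list(mapping = aes(x = wt, y = mpg)), size.shape))

# Option 2: a thin wrapper that fixes size and shape.
geom_point_big <- function(...) geom_point(..., size = 5, shape = 21)

p_wrapper <- ggplot(mtcars) +
  geom_point_big(aes(x = wt, y = mpg))

p_reuse
```

Unlike update_geom_defaults(), neither option changes global state, so other geom_point() calls keep the usual defaults.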