ggplot boxplot: same colour looks different depending on when it is set - r

I just encountered a surprising colour problem while making a boxplot with ggplot2.
The same colour (#FF4040) looks drastically different depending on whether I set it as the fill parameter or later in scale_fill_manual.
Here is an example you can copy/paste using the mtcars dataset.
library(ggplot2)
data('mtcars')
ggplot (data = mtcars, aes(x = as.factor(cyl), disp)) +
geom_boxplot(aes(fill = '#FF4040'))
ggplot (data = mtcars, aes(x = as.factor(cyl), disp)) +
geom_boxplot(aes(fill = as.factor(cyl)))+
scale_fill_manual(breaks=c('4', '6', '8'),
values=c('#FF4040', '#FF4040', '#FF4040'))
Here is the comparison:

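As I said in the comments, in the first example you are not setting the fill colour, only mapping fill. Inside aes(), '#FF4040' is treated as a data value, so ggplot2 assigns it a colour from the default fill scale instead of using it as a colour. So instead of geom_boxplot(aes(fill = '#FF4040')) use geom_boxplot(fill = '#FF4040') and you receive the same result as in the second version.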
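A minimal sketch of the difference between mapping and setting, using layer_data() to inspect the fill that ggplot2 actually computes. The object names (base, p_mapped, p_set, p_identity) are illustrative only, and scale_fill_identity() is shown as one further option for cases where the colour has to stay inside aes().

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = as.factor(cyl), y = disp))

# Mapped: '#FF4040' is treated as a data value with a single level,
# so the default discrete fill scale chooses the colour.
p_mapped <- base + geom_boxplot(aes(fill = "#FF4040"))

# Set: '#FF4040' is used literally as the fill colour.
p_set <- base + geom_boxplot(fill = "#FF4040")

# Mapped, but with an identity scale so the value is used as-is.
p_identity <- base +
  geom_boxplot(aes(fill = "#FF4040")) +
  scale_fill_identity()

# Inspect the fill computed for each plot (print(p_set) etc. draws them).
fill_mapped   <- unique(layer_data(p_mapped, 1)$fill)
fill_set      <- unique(layer_data(p_set, 1)$fill)
fill_identity <- unique(layer_data(p_identity, 1)$fill)
```

Comparing fill_mapped with fill_set should show that only the mapped version was recoloured by the scale.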

Related

Change fill/colour for geom_dotplot or geom_histogram with a continuous variable

Is it possible to fill ggplot's geom_dotplot with continuous variables?
library(ggplot2)
ggplot(mtcars, aes(x = mpg, fill = disp)) +
geom_dotplot()
This should be pretty straightforward, but I've tried changing the group aesthetic with no success.
The most I can do is discretize the disp variable, but that is not optimal.
ggplot(mtcars, aes(x = mpg, fill = factor(disp))) +
geom_dotplot()
Good question! You have to set group = variable within aes (where variable is equal to the same column that you're using for fill or color):
library(ggplot2)
ggplot(mtcars, aes(mpg, fill = disp, group = disp)) +
geom_dotplot()
geom_dotplot is, in a way, just like a histogram. You can't easily set fill/colour from a continuous variable there, because the data are binned by group. To make it work you have to set group.
Example using geom_histogram:
ggplot(mtcars, aes(mpg, fill = disp, group = disp)) +
geom_histogram()
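A slightly fuller sketch of the group approach, under a few assumptions of mine: the binwidth, the stacking options and the yellow-to-red gradient are arbitrary choices, not part of the answers above. layer_data() is used only to check that more than one fill colour reaches the dots.

```r
library(ggplot2)

# group = disp gives every distinct disp value its own group, so the
# continuous fill is constant within each group and survives the binning.
p_dots <- ggplot(mtcars, aes(x = mpg, fill = disp, group = disp)) +
  geom_dotplot(binwidth = 1, stackgroups = TRUE, binpositions = "all") +
  scale_fill_gradient(low = "yellow", high = "red")

# Number of distinct fill colours actually computed for the dots.
n_fills <- length(unique(layer_data(p_dots, 1)$fill))
```

stackgroups = TRUE is there so that dots from different groups stack instead of overlapping; print(p_dots) draws the plot.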

ggplot geom_bar() fill not coloring bars on plot

Using the fill argument on geom_bar is not coloring the bars on my plot. I'm using the train.csv from the titanic data set here.
passengers <- read.csv('../input/train.csv')
I have tried moving the fill outside of the aes(), tried moving the aes up to the ggplot() function.
This is the code I'm using on the Titanic Data set
ggplot(data = passengers) +
geom_bar(mapping = aes(x=Survived, fill = Pclass))
This is the code I'm using as a template which works fine on the ggplot built in diamonds data.
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut))
I just keep getting grey bars with the geom_bar for Survived using Pclass as the fill.
This is doing the trick for me:
ggplot(data = passengers) +
geom_bar(mapping = aes(x=Survived, fill = as.character(Pclass)))
You can try as.factor():
ggplot(data = passengers) +
geom_bar(mapping = aes(x=Survived, fill = as.factor(Pclass)))
Your Pclass variable is probably numeric rather than a factor.
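A small sketch of how one might check this. The data frame below is a hypothetical stand-in for train.csv (made-up rows, with only the two columns used here), so the values are not the real Titanic data.

```r
library(ggplot2)

# Hypothetical stand-in for train.csv: Pclass is read as an integer column.
passengers <- data.frame(
  Survived = c(0L, 1L, 1L, 0L, 1L, 0L),
  Pclass   = c(3L, 1L, 2L, 3L, 1L, 3L)
)

# A numeric Pclass is treated as continuous, which is why the bars stay grey.
pclass_is_numeric <- is.numeric(passengers$Pclass)

# Converting to a factor makes the fill discrete: one colour per class.
p_bar <- ggplot(data = passengers) +
  geom_bar(mapping = aes(x = Survived, fill = factor(Pclass)))

bar_fills <- unique(layer_data(p_bar, 1)$fill)
```

Converting the column once, for example with passengers$Pclass <- factor(passengers$Pclass), avoids repeating the conversion in every plot.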

I can't plot a geom_bar correctly

I'm trying to improve my skills in the R language and I ran into a problem.
#Load the library.
library(ggplot2)
#Execute a simple code
ggplot(mtcars, aes(x = cyl, fill = am)) + geom_bar()
My main question is: what am I doing wrong? Why is the fill aesthetic not plotted?
Adrian, in the way that you are using it with geom_bar(), fill should be a discrete variable (a factor or character) rather than a continuous one.
ggplot(mtcars, aes(x = cyl, fill = as.character(am))) + ## as.character() or factor() makes "am" discrete
geom_bar()
To illustrate the difference in ggplot's behavior between character and numeric variables, look at this plot:
ggplot(mtcars, aes(x = cyl, fill = as.character(am), color = as.character(am), alpha = am)) +
geom_bar()

ggplot conflict between fill and scale_fill_discrete/plot legend

I'm tinkering with geom_point trying to plot the following code. I have converted cars$vs to a factor with discrete levels so that I can visualize both levels of that variable in different colors by assigning it to "fill" in the ggplot aes settings.
cars <- mtcars
cars$vs <- as.factor(cars$vs)
ggplot(cars,aes(x = mpg, y = disp, fill = vs)) +
geom_point(size = 4) +
scale_fill_discrete(name = "Test")
As you can see, the graph does not differentiate between both "fill" conditions via color. However, it preserves the legend label I have specified in scale_fill_discrete.
Alternatively, I can plot the following (same code, but instead of "fill", use "color")
cars <- mtcars
cars$vs <- as.factor(cars$vs)
ggplot(cars,aes(x = mpg, y = disp, color = vs)) +
geom_point(size = 4) +
scale_fill_discrete(name = "Test")
As you can see, using "color" instead of "fill" differentiates between the levels of the factor via color, but seems to override any changes I make to the legend title using scale_fill_discrete.
Am I using "fill" incorrectly? How can I plot different levels of a factor in different colors using this method and have control over the plot legend via scale_fill_discrete?
Since you are using color as mapping, you can use scale_color_* to change the corresponding attributes instead of scale_fill_*:
ggplot(cars,aes(x = mpg, y = disp, color = vs)) +
geom_point(size = 4) +
scale_color_discrete(name = "Test")
To use a fill with geom_point you should use a fill-able shape:
ggplot(cars,aes(x = mpg, y = disp, fill = vs)) +
geom_point(size = 4, shape = 21) +
scale_fill_discrete(name = "Test")
See ?pch, which shows that shapes 21 to 25 can be colored and filled with different colors. ggplot will not use the fill unless the shape is one that is fill-able. This behavior has changed a bit in different versions, as seen in the NEWS file.
There's no reason to use fill with geom_point unless you want the outline and fill colors of the points to be different, so the other answer recommending color is probably what you want.
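For the case where a separate outline is wanted, a minimal sketch: fill is mapped to vs while the outline colour is set to a constant. The black outline and the stroke width are my own illustrative choices.

```r
library(ggplot2)

cars <- mtcars
cars$vs <- as.factor(cars$vs)

# Shape 21 has both an outline (colour) and an interior (fill):
# fill is mapped to vs, the outline is set to a fixed colour.
p_fill <- ggplot(cars, aes(x = mpg, y = disp, fill = vs)) +
  geom_point(size = 4, shape = 21, colour = "black", stroke = 1) +
  scale_fill_discrete(name = "Test")

# Inspect the computed aesthetics (print(p_fill) draws the plot).
point_data <- layer_data(p_fill, 1)
```

Here the legend title is controlled by scale_fill_discrete because fill is the mapped aesthetic.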

How would you plot a box plot and specific points on the same plot?

We can draw a box plot as below:
qplot(factor(cyl), mpg, data = mtcars, geom = "boxplot")
and points as:
qplot(factor(cyl), mpg, data = mtcars, geom = "point")
How would you combine both, but show just a few specific points (say, when wt is less than 2) on top of the box?
If you are trying to plot two geoms with two different datasets (boxplot for mtcars, points for a data.frame of literal values), this is a way to do it that makes your intent clear. This works with the current (Sep 2016) version of ggplot (ggplot2_2.1.0)
library(ggplot2)
ggplot() +
# box plot of mtcars (mpg vs cyl)
geom_boxplot(data = mtcars,
aes(x = factor(cyl), y= mpg)) +
# points of data.frame literal
geom_point(data = data.frame(x = factor(c(4,6,8)), y = c(15,20,25)),
aes(x=x, y=y),
color = 'red')
I threw in a color = 'red' for the set of points, so it's easy to distinguish them from the points generated as part of geom_boxplot
Use + geom_point(...) on your qplot (just add a + geom_point() to get all the points plotted).
To plot selectively, pass just the points that you want to plot as the data for geom_point:
n <- nrow(mtcars)
# plot every second point
idx <- seq(1,n,by=2)
qplot( factor(cyl), mpg, data=mtcars, geom="boxplot" ) +
geom_point( data=mtcars[idx, ] ) # <-- only the rows in idx are drawn
If you know the points beforehand, you can feed them in directly, e.g.:
qplot( factor(cyl), mpg, data=mtcars, geom="boxplot" ) +
geom_point( data=data.frame(cyl=c(4,6,8), mpg=c(15,20,25)) ) # plot (4,15),(6,20),...
You can show both by using ggplot() rather than qplot(). The syntax may be a little harder to understand, but you can usually get much more done. If you want to plot both the box plot and the points you can write:
boxpt <- ggplot(data = mtcars, aes(factor(cyl), mpg))
boxpt + geom_boxplot(aes(factor(cyl), mpg)) + geom_point(aes(factor(cyl), mpg))
I don't know what you mean by only plotting specific points on top of the box, but if you want a cheap (and probably not very smart) way of just showing points above the edge of the box, here it is:
library(plyr) # provides ddply() and .()
boxpt + geom_boxplot(aes(factor(cyl), mpg)) +
geom_point(data = ddply(mtcars, .(cyl), summarise, mpg = mpg[mpg > quantile(mpg, 0.75)]),
aes(factor(cyl), mpg))
Basically it's the same thing except for the data supplied to geom_point is adjusted to include only the mpg numbers in the top quarter of the distribution by cylinder. In general I'm not sure this is good practice because I think people expect to see points beyond the whiskers only, but there you go.
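For the specific case in the question (points where wt is less than 2), one possible sketch is to pass a subset of mtcars as the data of the point layer. The name light and the red colour are my own choices for illustration.

```r
library(ggplot2)

# Rows of mtcars to highlight: cars with wt below 2.
light <- subset(mtcars, wt < 2)

# The boxplot uses all of mtcars; the point layer inherits the x and y
# mappings but draws only the rows in 'light'.
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  geom_point(data = light, colour = "red", size = 3)

# Number of points in the second layer (print(p_box) draws the plot).
n_points <- nrow(layer_data(p_box, 2))
```

Because only the data argument of geom_point changes, the same pattern works for any other row filter.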

Resources