ggplot: adjusting alpha/fill two factors cdf - r

I'm having some issues getting my ggplot alpha to be sufficiently dark for my plot.
Example code:
ggplot(mtcars, aes(x=mpg, color=factor(gear), alpha=factor(carb))) + stat_ecdf()
As you can see, whenever carb == 1, the plot elements are very difficult to see. In my real-world data set, the color factor has four levels and the alpha factor has two. I was hoping to have the alpha be a slightly lighter shade of the color, but more visible than it is in that example.

You can adjust the alpha scale, as the user in the comment suggests, by passing either a range or a specific set of breaks to scale_alpha_discrete. That doesn't produce a very easy-to-read result, though:
ggplot(mtcars, aes(x=mpg, color=factor(gear), alpha=factor(carb))) +
stat_ecdf() +
scale_alpha_discrete(range=c(0.4, 1))
Another option would be to save color for the many-leveled factor and choose a different aesthetic for the few-leveled one, such as linetype:
ggplot(mtcars, aes(x=mpg, linetype=factor(gear), color=factor(carb))) +
stat_ecdf()
For readability, though, faceting might be a better bet.
ggplot(mtcars, aes(x=mpg, color=factor(carb))) +
stat_ecdf() + facet_wrap(~gear, nrow=3)
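For the two-level case described in the question, the alpha values can also be set by hand with scale_alpha_manual. This is only a minimal sketch: am (a two-level variable in mtcars) is assumed here as a stand-in for the real alpha factor, and 0.5 and 1 are arbitrary illustrative values.

```r
library(ggplot2)

# Assumption: `am` stands in for the two-level alpha factor from the question.
# The alpha values 0.5 and 1 are illustrative, not recommendations.
alpha_values <- c("0" = 0.5, "1" = 1)

p <- ggplot(mtcars, aes(x = mpg, color = factor(gear), alpha = factor(am))) +
  stat_ecdf() +
  scale_alpha_manual(values = alpha_values)
```

Printing p draws the plot. Raising the lower value makes the fainter level easier to see, at the cost of less contrast between the two levels.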

Related

How to set background color for each panel in grouped boxplot?

I plotted a grouped boxplot and am trying to change the background color of each panel. I can use the panel.background theme element to change the background of the whole plot, but how can this be done for an individual panel? I found a similar question here, but I failed to adapt the code to my plot.
Top few lines of my input data look like
Code
p<-ggplot(df, aes(x=Genotype, y=Length, fill=Treatment)) + scale_fill_manual(values=c("#69b3a2", "#CF7737"))+
geom_boxplot(width=2.5)+ theme(text = element_text(size=20),panel.spacing.x=unit(0.4, "lines"),
axis.title.x=element_blank(),axis.text.x=element_blank(),axis.ticks.x=element_blank(),axis.text.y = element_text(angle=90, hjust=1,colour="black")) +
labs(x = "Genotype", y = "Petal length (cm)")+
facet_grid(~divide,scales = "free", space = "free")
p+theme(panel.background = element_rect(fill = "#F6F8F9", colour = "#E7ECF1"))
Unfortunately, like the other theme elements, the fill argument of element_rect() cannot be mapped to data. You also cannot just pass a vector of colors to fill (to create your own mapping of sorts). In the end, the simplest solution is probably very similar to the answer you linked to in your question, with a bit of a twist.
I'll use mtcars as an example. Note that I'm converting some of the continuous variables in the dataset to factors so that we can create some more discrete values.
Note that the rect geom is added before the boxplot geom, so that the boxplot is drawn on top of the rect.
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
All done... but not quite. Something is wrong and you might notice this if you pay attention to the boxes on the legend and the gridlines in the plot panels. It looks like the alpha value is incorrect for some facets and okay for others. What's going on here?
Well, this has to do with how geom_rect works. It draws a box on each plot panel, but just like the other geoms, it is mapped to the data. Even though the x and y aesthetics are not actually used to draw the rectangle, they determine how many rectangles are drawn. This means that the number of rectangles drawn in each facet equals the number of rows in the dataset that belong to that facet. If 3 observations exist, 3 rectangles are drawn. If 20 observations exist for a facet, 20 rectangles are drawn, and so on. Because the rectangles are semi-transparent and stacked on top of each other, facets with more rows look more opaque.
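One way to check this, as a small sketch, is to count the rows per facet in the data passed to ggplot(). With the plot code above, each count is also the number of rectangles stacked in that panel:

```r
# Rows of mtcars per level of carb, i.e. per facet in the plot above.
# Each row yields one semi-transparent rectangle, so panels with more
# rows end up with a more opaque background.
rects_per_facet <- table(mtcars$carb)
print(rects_per_facet)
```

The panels whose backgrounds look too dark should be the ones with the largest counts.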
So, the fix is to supply a data frame that contains exactly one row for every facet. We then have to make sure that we supply any and all other aesthetics (x and y here) that are included in the ggplot call, or we will get an error indicating ggplot cannot "find" that particular column. Remember, even if geom_rect doesn't use these for drawing, they are used to determine how many observations exist (and therefore how many rectangles to draw).
rect_df <- data.frame(carb=unique(mtcars$carb)) # supply one of each type of carb
# have to give something to disp
rect_df$disp <- 0
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
data=rect_df,
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
That's better.

scale_color_brewer (ggplot2) does not colour all the lines

When running this ggplot2 code:
ggplot(canine_lower, aes(x=x, y=y, colour=Teeth)) +
geom_smooth(method="lm", formula= y~poly(x,4), se=FALSE) +
scale_color_grey(start=0.9, end=0.1)
I get this plot thanks to the scale_color_grey function:
There is a gradual grey transition among all groups of teeth (from 1 to 16).
However, I would like to colourize it. For this reason I tried scale_color_brewer, with partial success. The code I ran is:
ggplot(canine_lower, aes(x=x, y=y, colour=Teeth)) +
geom_smooth(method="lm", formula= y~poly(x,4), se=FALSE)+
scale_color_brewer(palette="Reds")
which offers this unfinished plot:
As seen above, from 10 to 16 there is no color.
How can I span the color range using this function? Is there any other alternative function?
I must say that I tried with scale_color_gradient with no success.
The maximum number of colors from brewer.pal (in the package RColorBrewer), the function scale_color_brewer uses to generate the colors, is 9 for sequential palettes. If you look at the help for brewer.pal you can check the maximum number of colors for each of the palette types.
You can generate larger palettes in many other ways, such as scale_color_viridis as shown by @NateDay, or with the two examples below, but it will be difficult to distinguish so many different colors in the graph.
mtcars$rowname=rownames(mtcars)
ggplot(mtcars[1:16, ], aes(mpg, hp, color=rowname)) +
geom_point() +
scale_colour_manual(values=hcl(seq(0,360,length=17)[1:16], 100,65))
ggplot(mtcars[1:16, ], aes(mpg, hp, color=rowname)) +
geom_point() +
scale_colour_manual(values=hcl(0,100,seq(40,100,length=16)))
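If the look of the Brewer "Reds" palette should be kept, another option is to interpolate between its colours. The sketch below is an assumption about what might suit the question, not part of the original answer: it reads the palette's maximum size from brewer.pal.info and stretches it to 16 colours with colorRampPalette from base R's grDevices.

```r
library(ggplot2)
library(RColorBrewer)

# Largest number of colours RColorBrewer offers for the "Reds" palette
max_reds <- brewer.pal.info["Reds", "maxcolors"]

# Interpolate between those colours to get 16 values
reds16 <- colorRampPalette(brewer.pal(max_reds, "Reds"))(16)

# Same toy data as the examples above: 16 rows, one colour per row name
df <- mtcars[1:16, ]
df$rowname <- rownames(df)

p <- ggplot(df, aes(mpg, hp, color = rowname)) +
  geom_point() +
  scale_colour_manual(values = reds16)
```

The lightest interpolated shades are hard to see on a white background, so neighbouring groups will still be difficult to tell apart.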
You could use library(viridis) as an alternative:
# a reproducible example
# tibble::rownames_to_column() replaces the deprecated dplyr::add_rownames()
mtcars <- tibble::rownames_to_column(mtcars)
ggplot(mtcars, aes(mpg, hp, color = rowname)) +
geom_point() +
viridis::scale_color_viridis(discrete = TRUE)
scale_color_gradient() is failing you because it is designed to be used to map to continuous values, not discrete ones.

controlling point colors with geom_jitter and geom_boxplot in ggplot2 in R

I have the following boxplot in ggplot2 to which I add the points plotted with geom_jitter:
p <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot(aes(colour=factor(cyl))) + geom_jitter(aes(color=factor(cyl)))
I colored the individual points according to factor(cyl), which works great. However, some points still appear black. What are these? Are they the outliers of the boxplots? If so, it's strange, since some of them are just as far from the median as the colored points (which are not outliers), but perhaps that is explained by the randomness of geom_jitter?
Can someone please explain whether that's the correct explanation, and also how I can make the outliers go away if I use geom_jitter? Thanks.
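The black points are the outliers of the boxplot.
You can see that by plotting just the box plot and checking that they disappear once the outliers are hidden:
ggplot(mtcars, aes(cyl, mpg)) +
geom_boxplot(aes(fill=as.factor(cyl)), outlier.shape = NA)
Setting outlier.shape = NA gets rid of the outlier dots (outlier.size = 0, which older ggplot2 versions accepted for this, can still leave a small dot). You can change their colour as well. Check out ?geom_boxplot for more details.
ggplot(mtcars, aes(cyl, mpg)) +
geom_boxplot(aes(fill=as.factor(cyl)), outlier.shape = NA) +
# colour has to be mapped inside aes(); outside it, cyl is not found
geom_jitter(aes(color=factor(cyl)))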
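To check which observations the boxplot treats as outliers, the usual 1.5 * IQR rule can be applied by hand. This is a sketch under the assumption that geom_boxplot is used with its default settings (coef = 1.5, quartiles from quantile()); is_outlier is a helper name made up for this example.

```r
# Hypothetical helper: flags values beyond 1.5 * IQR from the quartiles,
# which is assumed to match geom_boxplot()'s default outlier rule.
is_outlier <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75))
  iqr <- q[[2]] - q[[1]]
  x < q[[1]] - coef * iqr | x > q[[2]] + coef * iqr
}

# mpg values flagged as outliers within each cyl group
outliers_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, function(x) x[is_outlier(x)])
print(outliers_by_cyl)
```

The rule is applied within each group, relative to that group's own quartiles. The jittered points are also moved horizontally (and slightly vertically by default), so a black outlier dot and its coloured jittered copy do not sit in the same place.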

ggplot2 colour geom_point by factor but geom_smooth based on all data

In ggplot2, the following command, p <- qplot(wt, mpg, data=mtcars, colour=factor(cyl)), taken from here, plots a scatter plot with each point coloured according to the factor.
I would like to fit all data with a geom_smooth irrespective of factor but keeping the colour of individual points according to factor. p + geom_smooth(method="lm") does a linear fit on each factor. How do I do this?
You can do this fairly easily by stepping back from the 'qplot' wrapper function and using the 'ggplot' and geometry functions directly.
ggplot(mtcars, aes(x=wt, y=mpg)) +
geom_point(aes(colour=factor(cyl))) +
geom_smooth(method="lm")
Step 1: Set your initial 'ggplot' settings. These are the settings that you want to be defaults for the geometry functions.
ggplot(mtcars, aes(x=wt, y=mpg))
In this case, we are using the 'mtcars' data for all geometries with 'wt' assigned to the x-axis and 'mpg' assigned to the y-axis. By specifying these at the beginning, we lessen the risk of messing something up when copy-pasting into the geometry functions.
Step 2: Draw the point geometry, using the factors of 'cyl' to color the points. This is what the original 'qplot' function was doing, but we're specifying it a little more explicitly.
geom_point(aes(colour=factor(cyl)))
Step 3: Draw the smoothed linear model. This is exactly what the OP wrote before, but now that the aesthetic of coloring is no longer part of the defaults, the model draws as intended.
geom_smooth(method="lm")
Chain it all together with the + operator, et voilà!
For reference: You could just as easily do this by being explicit in each layer, like so:
ggplot() +
geom_point(data=mtcars, aes(x=wt, y=mpg, colour=factor(cyl))) +
geom_smooth(data=mtcars, method="lm", aes(x=wt, y=mpg))
In my opinion, you'll find ggplot a lot easier if you start to use the ggplot() function rather than qplot. The control of aesthetics makes a lot more sense. In this case, you just build your base:
p <- ggplot(mtcars, aes(wt, mpg))
Then build the two geoms on top:
p + geom_point(aes(colour = factor(cyl))) +
geom_smooth(method = "lm")
Let me know if that wasn't what you're after.
I agree with the previous answers from @alexwhan and @Dinre that ggplot() + geom_point(...) + ... is the best approach to this problem.
However, if you would just like to modify your solution, try:
p + geom_smooth(method = 'lm', aes(colour = NA), colour = 'magenta')

ggplot2: separate color scale per facet

Intuitively I'm looking for something like: facet_(scales="free_color")
I do something like
p <- ggplot(mpg, aes(year, displ, color=model)) + facet_wrap(~manufacturer)
p + geom_jitter()
That is: plot 2d measurements from individuals(model) belonging to different species(manufacturer) faceted by a species, indicating the individual by color.
The problem is that all individuals share the same color scale - so that the points in a facet have very similar colors.
Using the group aesthetic with geom_line would solve the problem, but lines tell different story than dots.
Another obvious solution would be to drop the faceting and draw a separate plot for each subset. (If this should be the only solution: are there any quick, smart or proven ways to do that?)
I'm not sure that this is an available option when you're colouring by a factor. However, a quick way to produce the individual plots would be something like this:
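library(plyr) # provides d_ply() and .()
library(ggplot2)
d_ply(mpg, .(manufacturer), function(df) {
# writes one JPEG per manufacturer into the working directory
jpeg(paste(df$manufacturer[[1]], ".jpeg", sep=""))
plots <- ggplot(df, aes(year, displ, color=factor(model))) + geom_jitter()
print(plots)
dev.off()
})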
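The same idea can be sketched without plyr by splitting the data with base R and keeping the plots in a list. This is an alternative sketch, not the approach above; plots_by_manufacturer is a name chosen for this example.

```r
library(ggplot2)

# One data frame per manufacturer, then one plot per data frame.
# Each plot gets its own colour scale because it only sees its own models.
plots_by_manufacturer <- lapply(
  split(mpg, mpg$manufacturer),
  function(df) {
    ggplot(df, aes(year, displ, color = model)) +
      geom_jitter() +
      ggtitle(df$manufacturer[[1]])
  }
)
```

Each element can then be printed or written to disk, for example with ggsave(), using names(plots_by_manufacturer) to build the file names.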
Related Answers:
Different legends and fill colours for facetted ggplot?
I think you simply want to color by class, where each manufacturer makes several models, each only one or two per class:
p <- ggplot(mpg, aes(year, displ, color=class)) + facet_wrap(~ manufacturer)
p + geom_jitter()
