Share label in ggplot facet_wrap to avoid repetition

This code
library(ggplot2)
ggplot(mtcars, aes(mpg, hp)) +
facet_wrap(am ~ cyl, ncol = 6) +
geom_point()
produces the following graph
The variable "am" has two values (0 and 1) that are repeated 3 times each. I wonder whether it is possible to include a single strip with the value 0 centered (and the same for the value 1).

Related

Making facet_wrap label cover all related panels in ggplot

I was wondering whether there is a way to make facet_wrap() labels cover all of the related columns in ggplot. Here is an example to show what I mean:
library(tidyverse)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p + facet_wrap(~ vs+cyl, labeller = label_value)
As you can see, in the previous plot, the labels 0 and 1 are present above all of the columns, separately. I would like to modify it in a way that there is only one 0 above the first row of columns and only one 1 over the second row of columns.

Coloring subset of lines in a plot using ggplot

I am trying to connect the dots on my plot using geom_path(). I also want to color certain lines (intervals) based on a group variable (t). This is what I have so far:
ggplot(data, aes(x=x, y=x)) +
geom_point() +
geom_path(color=t)
What this does is it "incorrectly" connects the points based on this group. I just want the correct connecting lines to have a separate color.
Could anyone help me with this?
Since you did not share your data, I can only guess: you could be experiencing an edge case that occurs if you color by a boolean expression, e.g., whether a variable equals a specific value.
In this case, ggplot groups your geom_path by var == x. You can prevent this by adding group = 1.
Basic (somewhat contrived) example
ggplot(mtcars) +
geom_point(aes(mpg, hp)) +
geom_path(aes(mpg, hp))
Above plot with color = cyl == 4
ggplot(mtcars) +
geom_point(aes(mpg, hp)) +
geom_path(aes(mpg, hp, color = cyl == 4))
Above plot with group = 1
ggplot(mtcars) +
geom_point(aes(mpg, hp)) +
geom_path(aes(mpg, hp, color = cyl == 4, group = 1))
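To check the grouping behaviour described above without looking at the plots, one option is to inspect the group column that ggplot2 computes for the layer. This is only a sketch: the helper name n_groups is made up for illustration, and it relies on ggplot2's layer_data() function.

```r
library(ggplot2)

# Logical colour mapping without an explicit group:
# ggplot2 derives the groups from the discrete (logical) variable
p_split <- ggplot(mtcars) +
  geom_path(aes(mpg, hp, colour = cyl == 4))

# Same mapping, but group = 1 forces a single path
p_single <- ggplot(mtcars) +
  geom_path(aes(mpg, hp, colour = cyl == 4, group = 1))

# Hypothetical helper: number of distinct groups in the first layer
n_groups <- function(p) length(unique(layer_data(p)$group))

n_groups(p_split)   # one group per value of cyl == 4
n_groups(p_single)  # a single group
```

If the first count is larger than the second, the colour mapping is what splits the path.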
If you pass either a single color (not what you want), or a vector of colors whose length equals the number of plotted points, you can get ggplot to color the lines for you. So, for instance,
data <- data.frame(x = 1:10, y = 1:10)
ggplot(data, aes(x=x, y=x)) +
geom_point() +
geom_path(color=rainbow(10))

geom_dotplot() loses dodge after applying colour aesthetics

I want to organize my data by one category on the X-axis, but color it by another category as in this example:
Graph 1, without coloring:
require(ggplot2)
nocolor <- ggplot(mtcars, aes(x=as.factor(cyl), y=disp)) +
geom_dotplot(binaxis="y", stackdir = "center")
print(nocolor)
Graph 2, with coloring:
nododge <- ggplot(mtcars, aes(x=as.factor(cyl), y=disp, fill=as.factor(gear))) +
geom_dotplot(binaxis="y", stackdir = "center")
print(nododge)
One problem that occurs after introducing coloring is that the dots belonging to different groups won't dodge one another anymore. This causes problems with my real data, as I get dots that happen to have the same value and completely obscure one another.
Then I tried this, but it garbled my data:
Graph 3:
garbled <- ggplot(mtcars, aes(x=as.factor(cyl), y=disp)) +
geom_dotplot(binaxis="y", stackdir = "center", fill=as.factor(mtcars$gear))
print(garbled)
The dots dodge one another, but the coloring is just random and is not true to the actual data.
I expected the answer to this question to solve my problem, but the coloring remained random:
Graph 4:
graphdata <- mtcars
graphdata$colorname <- as.factor(graphdata$gear)
levels(graphdata$colorname) <- c("red", "blue", "black")
jalapic <- ggplot(graphdata, aes(x=as.factor(cyl), y=disp)) +
geom_dotplot(binaxis="y", stackdir = "center", fill=as.character(graphdata$colorname))
print(jalapic)
Does anyone have an idea how to get the dots in Graph #2 to dodge one another, or how to fix the coloring in graphs 3 or 4? I would really appreciate any help, thanks.
Using binpositions = "all" and stackgroups = TRUE:
ggplot(mtcars, aes(x=as.factor(cyl), y=disp, fill=as.factor(gear))) +
geom_dotplot(binaxis="y", stackdir = "center", binpositions="all", stackgroups=TRUE)
gives:
A possible alternative is using stackdir = "up":
ggplot(mtcars, aes(x=as.factor(cyl), y=disp, fill=as.factor(gear))) +
geom_dotplot(binaxis="y", stackdir = "up", binpositions="all", stackgroups=TRUE)
which gives:
Here's another option that might work better than a dotplot, depending on your needs. We plot the individual points, but we separate them so that each point is visible.
In my original answer, I used position_jitterdodge, but the randomness of that method resulted in overlapping points and little control over point placement. Below is an updated approach that directly controls point placement to prevent overlap.
In the example below, we have cyl as the x variable, disp as the y variable, and gear as the colour aesthetic.
Within each cyl, we want points to be dodged by gear.
Within each gear we want points with similar values of disp to be separated horizontally so that they don't overlap.
We do this by adding appropriate increments to the value of cyl in order to shift the horizontal placement of the points. We control this with two parameters: dodge separates groups of points by gear, while sep controls the separation of points within each gear that have similar values of disp. We determine "similar values of disp" by creating a grouping variable called dispGrp, which is just disp rounded to the nearest ten (although this can, of course, be adjusted, depending on the scale of the data, size of the plotted points, and physical size of the graph).
To determine the x-value of each point, we start with the value of cyl, add dodging by gear, and finally spread the points within each gear and dispGrp combination by amounts that depend on the number of points within each grouping.
All of these data transformations are done within a dplyr chain, and the resulting data frame is then fed to ggplot. The sequence of data transformations and plotting could be generalized into a function, but the code below addresses only the specific case in the question.
library(dplyr)
library(ggplot2)
dodge = 0.3 # Controls the amount of dodging
sep = 0.05 # Within each dodge group, controls the amount of point separation
mtcars %>%
# Round disp to nearest 10 to identify groups of points that need to be separated
mutate(dispGrp = round(disp, -1)) %>%
group_by(gear, cyl, dispGrp) %>%
arrange(disp) %>%
# Within each cyl, dodge by gear, then, within each gear, separate points
# within each dispGrp
mutate(cylDodge = cyl + dodge*(gear - mean(unique(mtcars$gear))) +
sep*seq(-(n()-1), n()-1, length.out=n())) %>%
ggplot(aes(x=cylDodge, y=disp, fill=as.factor(gear))) +
geom_point(pch=21, size=2) +
theme_bw() +
scale_x_continuous(breaks=sort(unique(mtcars$cyl)))
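To make the offset arithmetic in the chain above concrete, here is a minimal sketch that isolates the two terms added to cyl. The helper names gear_offsets and point_offsets are hypothetical and not part of the original answer; the values of dodge and sep are the ones used above.

```r
# Hypothetical helpers isolating the two offset terms used in the chain above
dodge <- 0.3
sep <- 0.05

# Offset that separates the gear groups within one cyl value
gear_offsets <- function(gear, all_gears) {
  dodge * (gear - mean(unique(all_gears)))
}

# Offsets that spread n points sharing one (gear, cyl, dispGrp) cell
point_offsets <- function(n) {
  sep * seq(-(n - 1), n - 1, length.out = n)
}

gear_offsets(c(3, 4, 5), mtcars$gear)  # -0.3  0.0  0.3
point_offsets(1)                       # 0
point_offsets(3)                       # -0.1  0.0  0.1
```

A single point in a cell stays at its dodged position, while larger cells fan out symmetrically around it, so increasing sep widens the fan and increasing dodge pushes the gear groups further apart.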
Here's my original answer, using position_jitterdodge to dodge by color and then jitter within each color group to separate overlapping points:
set.seed(3521)
ggplot(mtcars, aes(x=factor(cyl), y=disp, fill=as.factor(gear))) +
geom_point(pch=21, size=1.5, position=position_jitterdodge(jitter.width=1.2, dodge.width=1)) +
theme_bw()

How to use free scales but keep a fixed reference point in ggplot?

I am trying to create a plot with facets. Each facet should have its own scale, but for ease of visualization I would like each facet to show a fixed y point. Is this possible with ggplot?
This is an example using the mtcars dataset. I plot the weight (wt) as a function of the number of miles per gallon (mpg). The facets represent the number of cylinders of each car. As you can see, I would like the y scales to vary across facets, but still have a reference point (3, in the example) at the same height across facets. Any suggestions?
library(ggplot2)
data(mtcars)
ggplot(mtcars, aes(mpg, wt)) + geom_point() +
geom_hline (yintercept=3, colour="red", lty=6, lwd=1) +
facet_wrap( ~ cyl, scales = "free_y")
[EDIT: in my actual data, the fixed reference point should be at y = 0. I used y = 3 in the example above because 0 didn't make sense for the range of the data points in the example]
It's unclear where the line should be; let's assume in the middle. You could compute the limits outside ggplot and add a dummy layer to set the scales:
library(ggplot2)
library(plyr)
# data frame where 3 is the middle
# 3 = (min + max) /2
dummy <- ddply(mtcars, "cyl", summarise,
min = 6 - max(wt),
max = 6 - min(wt))
ggplot(mtcars, aes(mpg, wt)) + geom_point() +
geom_blank(data=dummy, aes(y=min, x=Inf)) +
geom_blank(data=dummy, aes(y=max, x=Inf)) +
geom_hline (yintercept=3, colour="red", lty=6, lwd=1) +
facet_wrap( ~ cyl, scales = "free_y")
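The same reflection idea can be written for an arbitrary reference value, for example y = 0 as mentioned in the edit to the question. The sketch below uses base R instead of plyr; the names ref, dummy_limits and p_ref are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

ref <- 3  # reference value that should sit at mid-height in every facet

# Per-facet range of wt
wt_min <- tapply(mtcars$wt, mtcars$cyl, min)
wt_max <- tapply(mtcars$wt, mtcars$cyl, max)

# Reflect each range around ref (y' = 2 * ref - y), so that the union of the
# real and the reflected range is symmetric about ref
dummy_limits <- data.frame(
  cyl = as.numeric(names(wt_min)),
  lo  = 2 * ref - as.numeric(wt_max),
  hi  = 2 * ref - as.numeric(wt_min)
)

p_ref <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  # Invisible layers that only stretch each facet's y range
  geom_blank(data = dummy_limits, aes(x = Inf, y = lo)) +
  geom_blank(data = dummy_limits, aes(x = Inf, y = hi)) +
  geom_hline(yintercept = ref, colour = "red", linetype = "dashed") +
  facet_wrap(~ cyl, scales = "free_y")

print(p_ref)
```

Setting ref <- 0 should give the layout asked for in the edit, provided the real data spans values for which a line at 0 makes sense.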

converting boxplots to densities in ggplot2 in R

I have the following ggplot2 plot:
ggplot(iris) + geom_boxplot(aes(x=Species, y=Petal.Length, fill=Species)) + coord_flip()
I would like to instead plot this as horizontal density plots or histograms, meaning have density line plots for each species or histograms instead of boxplots. This does not do the trick:
> ggplot(iris) + geom_density(aes(x=Species, y=Petal.Length, fill=Species)) + coord_flip()
Error in eval(expr, envir, enclos) : object 'y' not found
for simplicity I used Species as the x variable and as the fill but in my actual data the X axis represents one set of conditions and the fill represents another. Though that should not matter for plotting purposes. I'm trying to make it so the X axis represents different conditions for which the value y is plotted as a density/histogram instead of boxplots.
Edit: this is better illustrated with a dataset that has two factor-like variables. In the mpg dataset, I want to make a density plot for each manufacturer, plotting the distribution of displ for each cyl value. The x-axis (which is vertical in flipped coordinates) represents each manufacturer, and the value being histogrammed is displ, but for each manufacturer, I want as many histograms as there are cyl values for that manufacturer. Hope this is clearer. I know that this doesn't work because y= expects counts.
ggplot(mpg, aes(x=manufacturer, fill=cyl, y=displ)) +
geom_density(position="identity") + coord_flip()
The closest I get is:
ggplot(mpg, aes(x=displ, fill=cyl)) +
geom_density(position="identity") + facet_grid(manufacturer ~ .)
But I don't want different grids, I'd like them to be different entries in the same plot like in the histogram case.
Something like this? For both histogram and density plots, the y value is computed from the data (a count or a density estimate), so you map only x = Petal.Length; its frequency (for the given binwidth) is then plotted on the y-axis. Just use fill=Species along with x=Petal.Length to colour by Species.
For histogram:
ggplot(iris, aes(x=Petal.Length, fill=Species)) +
geom_histogram(position="identity") + coord_flip()
For density:
ggplot(iris, aes(x=Petal.Length, fill=Species)) +
geom_density(position="identity") + coord_flip()
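To see that y really is computed by the stat rather than supplied by you, one can inspect the data ggplot2 builds for each layer. This is a small sketch using layer_data(); the object names p_hist and p_dens are made up here, and bins = 30 is set explicitly only to avoid the default-bins message.

```r
library(ggplot2)

p_hist <- ggplot(iris, aes(x = Petal.Length, fill = Species)) +
  geom_histogram(position = "identity", bins = 30)

p_dens <- ggplot(iris, aes(x = Petal.Length, fill = Species)) +
  geom_density(position = "identity")

# Both stats derive count and density columns from x alone
intersect(c("count", "density"), names(layer_data(p_hist)))
intersect(c("count", "density"), names(layer_data(p_dens)))
```

This is why mapping y = Petal.Length fails for these geoms: the stat expects to fill in y itself.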
Edit: Maybe you're looking for faceting?
ggplot(mpg, aes(x=displ, fill=factor(cyl))) +
geom_density(position="identity") +
facet_wrap( ~ manufacturer, ncol=3)
Gives:
Edit: Since you don't want faceting, the only other way I can think of is to create a separate group by pasting manufacturer and cyl together:
dd <- mpg
dd$grp <- factor(paste(dd$manufacturer, dd$cyl))
ggplot(dd, aes(x=displ)) +
geom_density(aes(fill=grp), position="identity")
gives:
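As a variation on the pasted grouping variable, base R's interaction() can build the combined factor directly. This is a sketch, not the original answer's code; the alpha value and the object name p_grp are assumptions added so that overlapping densities remain visible.

```r
library(ggplot2)

dd <- mpg
# One factor level per observed manufacturer/cyl combination;
# drop = TRUE discards combinations that never occur in the data
dd$grp <- interaction(dd$manufacturer, dd$cyl, sep = " ", drop = TRUE)

p_grp <- ggplot(dd, aes(x = displ, fill = grp)) +
  geom_density(position = "identity", alpha = 0.4)

nlevels(dd$grp)  # number of distinct manufacturer/cyl groups
```

Printing p_grp draws one density per group; combinations with a single observation cannot produce a density estimate, so ggplot2 may warn about them and drop them.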
