ggplot2: Flip axes and maintain aspect ratio of data

In ggplot2, the coord_fixed() coordinate system ensures that the aspect ratio of the data is maintained at a given value. So, the shape of the panel changes to maintain the shape of the data. Meanwhile coord_flip() swaps the axes of the plot. However, a plot in ggplot2 must have exactly one coordinate system, so these functions cannot be combined.
My question is:
Does there exist a way to combine the behaviours of coord_fixed() and coord_flip(), resulting in a coordinate system with the x and y axes exchanged and a fixed aspect ratio of the data?
This is a popular question; however, the common answer is incorrect:
How do I to fix aspect ratio and apply coord_flip in ggplot2?
Flipping and maintaining aspect ratio of a chart in ggplot2
The commonly suggested answer is to use coord_flip() together with theme(aspect.ratio = 1) instead of coord_fixed(). However, as per the ggplot2 documentation, this setting refers to the "aspect ratio of the panel." Thus, the data will change shape to maintain the shape of the panel.
I suspect that this is a feature that does not currently exist in ggplot2. But more importantly I think that a correct solution or at least response to this question should be documented.
Quick minimal example of the issue:
library(ggplot2)
x <- 1:100; data <- data.frame(x = x, y = x * 2)
p <- ggplot(data, aes(x, y)) + geom_point()
p # by default panel and data both fit to device window
p + coord_fixed() # panel changes shape to maintain shape of data
p + theme(aspect.ratio = 1) # data changes shape to maintain shape of panel
p + coord_fixed() + coord_flip() # coord_flip() overwrites coord_fixed()
# popular suggested answer does not maintain aspect ratio of data:
p + coord_flip() + theme(aspect.ratio = 1)

I agree that the theme solution isn't really a proper one. Here is a solution that does work programmatically, by calculating the aspect ratio from the actual axis ranges stored in the plot object, but it takes a few lines of code:
# Written for ggplot2 2.2.x; from ggplot2 3.0.0 this element is named panel_params
ranges <- ggplot_build(p)$layout$panel_ranges[[1]][c('x.range', 'y.range')]
sizes <- sapply(ranges, diff)
aspect <- sizes[1] / sizes[2]
p + coord_flip() + theme(aspect.ratio = aspect)
The solution I would probably use in practice is to use the horizontal geoms in the ggstance package (although this may not always be feasible).
Note: This will only give the exact correct answer for two continuous scales with an equal multiplicative expand argument (i.e. the default).
edit: In many cases I would recommend using coord_equal combined with the ggstance package instead of this solution.
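To see why the note about equal multiplicative expansion matters, here is a minimal base R sketch of the arithmetic. The helper names (expanded_span, flipped_panel_aspect) are hypothetical, and the default of 5% expansion on each side is assumed to match ggplot2's default for continuous scales; this is an illustration of the reasoning, not ggplot2's own code.

```r
# Hypothetical helper: width of an axis after multiplicative expansion on both sides.
expanded_span <- function(v, mult = 0.05) {
  diff(range(v, na.rm = TRUE)) * (1 + 2 * mult)
}

# Hypothetical helper: panel height / width needed after coord_flip() so that
# one data unit has the same length on both axes (x ends up on the vertical axis).
flipped_panel_aspect <- function(x, y, mult_x = 0.05, mult_y = 0.05) {
  expanded_span(x, mult_x) / expanded_span(y, mult_y)
}

x <- 1:100
equal_expand   <- flipped_panel_aspect(x, x * 2)                # expansion cancels out
unequal_expand <- flipped_panel_aspect(x, x * 2, mult_y = 0.2)  # expansion no longer cancels
```

With equal expansion the factor (1 + 2 * mult) cancels, so the result equals the ratio of the raw data ranges (99 / 198 here). With unequal expansion it does not, which is why the expanded ranges have to be read from the built plot.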

I ended up just flipping the x and y arguments in the aes specification. So for example instead of:
ggplot(mtcars,aes(x=wt,y=drat))+geom_point()+coord_fixed()
I did:
ggplot(mtcars,aes(x=drat,y=wt))+geom_point()+coord_fixed()

Related

Customize linetype in ggplot2 OR add automatic arrows/symbols below a line

I would like to use customized linetypes in ggplot. If that is impossible (which I believe to be true), then I am looking for a smart hack to plot arrowlike symbols above, or below, my line.
Some background:
I want to plot some water quality data and compare it to the standard (set by the European Water Framework Directive) in a red line. Here's some reproducible data and my plot:
df <- data.frame(datum = seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y=rnorm(53,mean=100,sd=40))
(plot1 <-
ggplot(df, aes(x=datum,y=y)) +
geom_line() +
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
However, in this plot it is completely unclear whether the standard is a maximum value (as it would be for chloride, for example) or a minimum value (as it would be for oxygen). So I would like to make this clear by adding small pointers/arrows pointing up or down. The best way would be to customize the linetype so that it consists of these arrows, but I couldn't find a way.
Q1: Is this at all possible, defining custom linetypes?
All I could think of was adding extra points below the line:
extrapoints <- data.frame(datum2 = seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y2=68)
plot1 + geom_point(data=extrapoints, aes(x=datum2,y=y2),
shape=">",size=5,colour="red",rotate=90)
However, I can't seem to rotate these symbols pointing downward. Furthermore, this requires calculating the right spacing of X and distance to the line (Y) every time, which is rather inconvenient.
Q2: Is there any way to achieve this, preferably as automated as possible?
I'm not sure what is requested, but it sounds as though you want arrows that point up or down based on whether the y-value is greater or less than some expected value. If that's the case, then this does it using geom_segment:
require(grid) # as noted by ?geom_segment
(plot1 <-
ggplot(df, aes(x=datum,y=y)) + geom_line()+
geom_segment(data = data.frame(datum = df$datum, y = 70, up = df$y > 70),
aes(xend = datum , yend =70 + c(-1,1)[1+up]*5), #select up/down based on 'up'
arrow = arrow(length = unit(0.1,"cm"))
) + # adjust units to modify size or arrow-heads
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
If I'm wrong about what was desired and you only wanted a bunch of down arrows, then just take out the stuff about creating and using "up" and use a minus-sign.
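For the "only down arrows" variant, the segment endpoints can be generated by a small helper so that the spacing and the distance to the line do not have to be worked out by hand each time. This is a sketch under assumptions: threshold_arrows is a hypothetical helper, the monthly spacing and the length of 5 units are arbitrary choices, and the plotting part only runs if ggplot2 is installed.

```r
# Hypothetical helper: one short vertical segment per x position, starting on the
# threshold line and ending `len` data units above ("up") or below ("down") it.
threshold_arrows <- function(x, threshold, direction = c("down", "up"), len = 5) {
  direction <- match.arg(direction)
  sgn <- if (direction == "up") 1 else -1
  data.frame(x = x, xend = x, y = threshold, yend = threshold + sgn * len)
}

dates <- seq.Date(as.Date("2014-01-01"), as.Date("2014-12-31"), by = "month")
arrows_df <- threshold_arrows(dates, threshold = 70, direction = "down")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  arrow_plot <- ggplot() +
    geom_hline(yintercept = 70, colour = "red") +
    geom_segment(data = arrows_df,
                 aes(x = x, xend = xend, y = y, yend = yend),
                 colour = "red",
                 arrow = grid::arrow(length = grid::unit(0.1, "cm")))
}
```

Passing direction = "up" gives the same layer for a minimum-type standard, so the direction becomes a single argument rather than a separate data frame.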

re-sizing ggplot geom_dotplot

I'm having trouble creating a figure with ggplot2. I am using geom_dotplot with center stacking to display my data which are discrete values for 4 categories.
For aesthetic reasons I want to customize the positions of the dots so that:
1. the empty space between dots along the y axis is reduced (i.e. the dots are 1 value large)
2. the distributions fit and don't overlap
I've adjusted the binwidth and dotsize to achieve aesthetic goal 1, but that requires me to fiddle with the ylim() parameter to make sure that the groups fit in the plot. This results in a plot with more white space and few numbers on the y axis.
Question: Can anyone explain a way to resize the empty space on this plot?
My code is below:
plot <- ggplot(figdata, aes(y=Counts, x=category, col=strain)) +
geom_dotplot(aes(fill=strain), dotsize=1, binwidth=.7,
binaxis= "y",stackdir ="centerwhole", stackratio=.7) +
ylim(18,59)
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black")
Which produces:
EDIT: Incorporating jitter will allow all the data to fit, but I don't want to add noise to this data and would prefer to show it as discrete data.
Adjusting the binwidth and dotsize to 0.3 as suggested below also fits all the data; however, it leaves too much white space.
I think that I might have to transform my data so that the values are in steps smaller than 1, in order to get everything to fit horizontally and the dot sizes to be large enough to reduce white space.
I think the easiest way is using coord_cartesian:
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black") +
coord_cartesian(ylim=c(17,40))
Which gives me this plot (with fake data that are not as neatly distributed as yours):

Using ggplot2: Create faceted scatterplot with scaled and moved density

I would like to plot some data as a scatter plot using facet_wrap, while superimposing some information such as a linear regression and the density.
I managed to do all that, but the density values are out of proportion with respect to my points, which is to be expected since these points are far away. Nevertheless, I'd like to scale and move my density curve so that it is clearly visible; I don't care about its real values but more about its shape.
Here is an exaggerated minimum working example of what I have:
library(ggplot2)
library(reshape2) # provides melt()
set.seed(48151623)
mydf <- data.frame(x1=rnorm(mean=5,n=100),x2=rnorm(n=100,mean=10),x3=rnorm(n=100,mean=20,sd=3))
mydf$var <- mydf$x1 + mydf$x2 * mydf$x3
mydf.wide <- melt(mydf,id.vars='var',measure.vars=c(1:3))
ggplot(data=mydf.wide,aes(x=value,y=var)) +
geom_point(colour='red') +
geom_smooth(method='lm') +
stat_density(aes(x=value,y=..scaled..),position='identity',geom='line') +
facet_wrap(~variable,scale='free_x')
Which results in:
What I would like resembles this ugly hack:
stat_density(aes(x=value,y=..scaled..*100+200),position='identity',geom='line')
Ideally, I would use y=..scaled..* diff(range(value)) + min(value) but when I do this I get an error saying that 'value' was not found. I suspect the problem is related to the faceting, but I would prefer to keep my facets.
How can I scale and move the density curve in this case?
I suggest making two plots and combining them with grid.arrange:
p1 <- ggplot(data=mydf.wide,aes(x=value,y=var)) +
geom_point(colour='red') +
geom_smooth(method='lm') +
facet_wrap(~variable,scale='free_x') +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
plot.margin = unit(c(1, 1, 0, 0.5), "lines"))
p2 <- ggplot(data=mydf.wide,aes(x=value,y=var)) +
stat_density(aes(x=value,y=..scaled..),position='identity',geom='line') +
facet_wrap(~variable,scale='free_x') +
theme(strip.background=element_blank(),
strip.text=element_blank(),
plot.margin = unit(c(-1, 1, 0.5, 0.35), "lines"))
library(gridExtra)
grid.arrange(p1, p2, heights = c(2,1))
I'm not sure if this completely answers your question, but it was too long to put in a comment, so... In response to your second chunk of code in your question, since you've already defined x=value, you can use x instead of value in your definition of y.
stat_density(aes(x=value,y=..scaled..*diff(range(x)) +
min(x)),position='identity',geom='line')
This seems to fix your error and produces the following plot:
The only problem is, of course, if you have data with low y-values, then you're still going to overlap your density curves with your scatterplot. But, if this isn't the case, I personally think this is a fairly informative figure, as long as you can communicate effectively that the y axis values aren't important in interpreting the density curves--only the shapes of the curves are important.
I appreciate everyone's answers, which helped me better understand ggplot's underlying mechanisms. I also realize how awkward my requirement is; ggplot is not going to solve my problem.
I managed to do what I wanted not by using ggplot's stat_density but by directly calculating my densities in another data frame:
set.seed(48151623)
mydf <- data.frame(x1=rnorm(mean=5,n=100),x2=rnorm(n=100,mean=10),x3=rnorm(n=100,mean=20,sd=3))
mydf$var <- mydf$x1 + mydf$x2 * mydf$x3
mydf.wide <- melt(mydf,id.vars='var',measure.vars=c(1:3))
mydf.densities <- do.call('rbind',lapply(unique(mydf.wide$variable), function(var) {
tmp <- mydf.wide[which(mydf.wide$variable==var),c('var','value')]
dfit <- density(tmp$value,cut=0)
scaledy <-dfit$y/max(dfit$y) * diff(range(tmp$var)) + min(tmp$var)
data.frame(x=dfit$x,y=scaledy,variable=rep(var,length(dfit$x)))
}))
ggplot(data=mydf.wide,aes(x=value,y=var)) +
geom_point(colour='red') +
geom_smooth(method='lm') +
geom_line(aes(x=x,y=y),data=mydf.densities) +
facet_wrap(~variable,scale='free_x')
(I know that the construction of mydf.densities is a bit obfuscated, but I will work on that later).
I'm giving out the bounty to the most voted solution at the end of the day, for your troubles.
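The construction of the per-facet densities can be written a little more plainly with split() and a named helper. The following is a base R sketch with made-up data: the data frame long, its columns, and the helper scaled_density are assumptions for illustration, standing in for mydf.wide above; the rescaling itself is the same one used in the answer (peak of the density mapped to the top of the y range of that facet).

```r
set.seed(1)
# Stand-in for mydf.wide: long format with a facet variable, x values and y values.
long <- data.frame(
  variable = rep(c("x1", "x2"), each = 50),
  value    = c(rnorm(50, mean = 5), rnorm(50, mean = 10)),
  var      = runif(100, min = 100, max = 300)
)

# Hypothetical helper: density of `value` for one facet, rescaled so that its
# peak sits at max(var) of that facet and a density of zero would sit at min(var).
scaled_density <- function(d) {
  fit <- density(d$value, cut = 0)
  data.frame(
    variable = d$variable[1],
    x = fit$x,
    y = fit$y / max(fit$y) * diff(range(d$var)) + min(d$var)
  )
}

# One density data frame per facet, stacked into a single data frame.
densities <- do.call(rbind, lapply(split(long, long$variable), scaled_density))
```

The resulting data frame has the same shape as mydf.densities, so it could be passed to geom_line(aes(x = x, y = y), data = densities) in the same way.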

ggplot2: plotting two size aesthetics

From what I can find on stackoverflow, (such as this answer to using two scale colour gradients on one ggplot) this may not (yet) be possible with ggplot2.
I want to create a bubbleplot with two size aesthetics, one always larger than the other. The idea is to show the proportion as well as the absolute values. Now I could colour the points by the proportion but I prefer multi-bubbles. In Excel this is relatively simple. (http://i.stack.imgur.com/v5LsF.png) Is there a way to replicate this in ggplot2 (or base)?
Here's an option. Mapping size in two geom_point layers should work. It's a bit of a pain getting the sizes right for bubblecharts in ggplot though.
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point(aes(size = disp), shape = 1) +
geom_point(aes(size = hp/(2*disp))) + scale_size_continuous(range = c(15,30))
To get it looking most like your example, add theme_bw():
p <- p + theme_bw()
The scale_size_continuous() is where you have to just fiddle around till you're happy - at least in my experience. If someone has a better idea there I'd love to hear it.

rdata & ggplot: specifying plot initial plot size?

I'm using ggplot2 and attempting to create an empty plot with some basic dimensions, like I might do w/ the stock plot function like so:
plot(x = c(0, 10), y=c(-7, 7))
Then I'd plot the points with geom_point() (or the stock points() function).
How can I set that basic plot up using ggplot? I'm only able to draw a plot using like:
ggplot() + layer(data=data, mapping = aes(x=side, y=height), geom = "point")
But this has max x/y values based on the data.
There are two ways to approach this:
1. Basically the same approach as with base graphics: the first layer put down has the limits you want, using geom_blank():
ggplot() +
geom_blank(data=data.frame(x=c(0,10),y=c(-7,7)), mapping=aes(x=x,y=y))
2. Using expand_limits():
ggplot() +
expand_limits(x=c(0,10), y=c(-7,7))
In both cases, if your data extends beyond this, the axes will be further expanded.
You can set the overall plotting region limits using xlim and ylim:
ggplot(data = data) +
geom_point(aes(x = side, y = height)) +
xlim(c(0,10)) +
ylim(c(-7,7))
Also see coord_cartesian which zooms in and out rather than hard coding the axis limits.
Edit: Since @Brian clarified the differences between his answer and mine well, I thought I should mention them in my answer as well, so no one misses them. Using xlim and ylim will set the limits of the plotting region no matter what data you add in subsequent layers. Brian's method using expand_limits is a way to set the minimum ranges.
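The difference between the two behaviours can be modelled in a few lines of base R. This is only a conceptual sketch: minimum_range and censor_to_limits are hypothetical helpers, not ggplot2 functions, and they assume the default handling of out-of-range values on a continuous scale (values outside fixed limits are turned into NA).

```r
# expand_limits()-style: the limits act as extra data, so the axis range can
# only grow beyond them, never shrink below them.
minimum_range <- function(values, limits) {
  range(c(values, limits), na.rm = TRUE)
}

# xlim()/ylim()-style: the limits are fixed, and values outside them become NA.
censor_to_limits <- function(values, limits) {
  values[values < limits[1] | values > limits[2]] <- NA
  values
}

height <- c(-3, 2, 12)
axis_range <- minimum_range(height, c(-7, 7))     # range grows to include 12
kept       <- censor_to_limits(height, c(-7, 7))  # 12 is dropped (NA)
```

In the first case the point at 12 stays on the plot and the axis stretches to fit it; in the second the axis stays at -7 to 7 and the point is removed.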
