In short:
I would like to have a separate legend for each "panel" of a two-panel plot made using facet_wrap. Using facet_wrap(scales="free") works fine when I want different axis scales, but it does not apply to the size of points.
Background:
I have data for several samples with three measurements each: x, y, and z. Each sample is from either class 1 or class 2. x and y have the same distributions in each class. However, all the z measurements for class 1 are less than 1.0; z measurements for class 2 range from 0 to 100.
Where I'm stuck:
Plot x and y on the x and y axes, respectively. Make the area of each point proportional to its z value.
d = matrix(c(runif(100),runif(20)*100),ncol=3)
e = data.frame( gl(2,20), d )
colnames(e) = c("class","x","y","z")
ggplot( data = e, aes(x=x, y=y, size=z) ) +
# scale_size_area() replaces scale_area(), which was removed from ggplot2
geom_point() + scale_size_area() +
facet_wrap( ~ class, ncol=1, scales="free" )
Problem:
Note that the dots on the first panel are difficult to see because they are on the very low end of the scale used for the single legend which ranges from 0 to 100. Is it even possible to have two separate legends (each with a different range) or should I make two plots and combine them with viewports?
A solution using grid.arrange. I've left in the call to facet_wrap so the strip.text remains. You could easily remove this.
# plot for class 1 (scale_size_area() is the current name for scale_area())
c1 <- ggplot(e[e$class==1,], aes(x=x,y=y,size=z)) + geom_point() + scale_size_area() + facet_wrap(~class)
# plot for class 2
c2 <- c1 %+% e[e$class==2,]
library(gridExtra)
grid.arrange(c1,c2, ncol=1)
I'm studying the example of coord_trans() of ggplot2:
library(ggplot2)
library(scales)
set.seed(4747)
df <- data.frame(a = abs(rnorm(26)),letters)
plot <- ggplot(df,aes(a,letters)) + geom_point()
plot + coord_trans(x = "log10")
plot + coord_trans(x = "sqrt")
I modified the code plot + coord_trans(x = "log10") as follows and got what I expected:
plot + scale_x_log10(breaks=trans_breaks("log10", function(x) 10^x),
labels=trans_format("log10", math_format(10^.x)))
I modified the code plot + coord_trans(x = "sqrt") as follows and got a strange x-axis:
plot + scale_x_sqrt(breaks=trans_breaks("sqrt", function(x) sqrt(x)),
labels=trans_format("sqrt", math_format(.x^0.5)))
How could I fix the problem?
I get why you said it was a strange / terrible axis. The documentation for trans_breaks even warns you about this in its first line:
These often do not produce very attractive breaks.
To make it less unattractive, I would use round(,2) so the axis labels have only 2 decimal places instead of the default 8 or 9, which clutter the axis. Then I would set a sensible range, say in your case 0 to 5 (c(0,5)).
Finally, you can specify the number of ticks for your axis using n in the trans_breaks call.
So, putting it together, here's how you can format the x-axis and its tick labels with scale_x_sqrt():
plot <- ggplot(df,aes(a,letters)) + geom_point()
plot + scale_x_sqrt(breaks=trans_breaks("sqrt", function(x) round(sqrt(x),2), n=5)(c(0, 5)))
Produces this:
trans_breaks() applies the transformation to the range c(0,5) and passes the result to pretty(), a lesser-known base R function. From the documentation, pretty does the following:
Compute a sequence of about n+1 equally spaced "round" values which cover the range of the values in x.
For example, pretty(c(0,5)) on its own produces [1] 0 1 2 3 4 5.
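As a rough illustration of the transform, pretty(), inverse pipeline, here is a base R sketch. sqrt_breaks is a hypothetical helper written for this example, not a function from scales, and it assumes that squaring (the inverse of sqrt) is the mapping you want back to data units.

```r
# Hypothetical helper (not part of scales): pick "round" values on the
# square-root scale, then map them back to data units by squaring.
sqrt_breaks <- function(rng, n = 5) {
  on_sqrt_scale <- pretty(sqrt(rng), n = n)  # equally spaced on the sqrt scale
  on_sqrt_scale^2                            # inverse of sqrt
}

b <- sqrt_breaks(c(0, 5))
print(b)        # break positions in data units
print(sqrt(b))  # the same breaks on the sqrt scale, evenly spaced
```

Breaks built this way are evenly spaced on the transformed axis, which is usually what makes a sqrt axis readable.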
You can fine-tune the axis further by changing the parameters. Here the code uses 3 decimal places (round(x,3)) and asks for 3 ticks (n=3):
plot <- ggplot(df,aes(a,letters)) + geom_point()
plot + scale_x_sqrt(breaks=trans_breaks("sqrt", function(x) round(sqrt(x),3), n=3)(c(0, 5)))
Produces this:
EDIT based on OP's additional comments:
To get integer tick values, floor() or round(x,0) works, so the following code:
plot <- ggplot(df,aes(a,letters)) + geom_point()
plot + scale_x_sqrt(breaks=trans_breaks("sqrt", function(x) round(sqrt(x),0), n=5)(c(0, 5)))
Produces this:
I have two plots that I combine. arrangeGrob() squeezes them so that the size of the new image is the same as one alone. How can I arrange them while preserving the ratio/size?
require(ggplot2)
require(gridExtra)
require(data.table)  # provides as.data.table()
dat <- read.csv("http://www.ats.ucla.edu/stat/data/fish.csv")
frqncy <- as.data.table(table(dat$child))
frqncy$V1 <- as.numeric(frqncy$V1)
plot1 <- ggplot(frqncy, aes(x=V1, y= N)) +
geom_histogram(stat="identity", binwidth = 2.5)
plot2 <- ggplot(frqncy, aes(x=V1, y= N)) +
geom_density(stat="identity")
plot <- arrangeGrob(plot1, plot2)
Plot looks like
I have not found any parameter in ggplot() or arrangeGrob() that fixes the ratio of the input.
Edit: Additional complications arise from the definition of axis labels in arrangeGrob(), i.e.
plot <- arrangeGrob(plot1, plot2, left="LHS label")
Then the new file will not automatically shrink to the minimum height/width combination of plot1 and plot2.
There are several other options, depending on what you want*
library(ggplot2)
library(gridExtra)  # provides grid.arrange()
p = qplot(1, 1)
grid.arrange(p, p, respect=TRUE) # both viewports are square
grid.arrange(p, p, respect=TRUE, heights=c(1,2)) # relative heights
p1 = p + theme(aspect.ratio=3)
grid.arrange(p,p1, respect=TRUE) # one is square, the other thinner
*: the aspect ratio is often not a well-defined property of plots (unless set manually), because the default is to extend the plot to the available space defined by the plot window/device/viewport.
You can control this when you output to a device. For example, a PDF file:
pdf("plot.pdf", width=5,height=8)
grid::grid.draw(plot)  # arrangeGrob() returns a grob, so draw it explicitly
dev.off()
Another option is to set a fixed ratio between the x and y coordinates in the plot itself using coord_fixed(ratio=n), where n is the y/x ratio. This will set the relative physical length of the x and y axes based on the nominal value range for each axis. If you use coord_fixed() the graph will always maintain the desired aspect ratio no matter what device size you use for your output.
For example, in your case both graphs have x-range 0 to 3 and y-range 0 to 132. If you set coord_fixed(ratio=1), your graphs will be tall and super skinny because the x-axis length will be 3/132 times the y-axis length (or to put it another way, 1 x-unit will take up the same physical length as 1 y-unit, but there are only 3 x-units and 132 y-units). Play around with the value of ratio to see how it works. A ratio of somewhere around 0.02 is probably about right for your graphs.
As an example, try the following code. Here I've set the ratio to 0.1, so now 1 x-unit takes up 10 times the physical length of each y-unit (that is, 0 to 3 on the x-axis has the same physical length as 0 to 30 on the y-axis).
plot1 <-ggplot(frqncy, aes(x=V1, y= N)) +
geom_histogram(stat="identity", binwidth = 2.5) +
coord_fixed(ratio=0.1)
plot2 <- ggplot(frqncy, aes(x=V1, y= N)) +
geom_density(stat="identity") +
coord_fixed(ratio=0.1)
plot <- arrangeGrob(plot1, plot2)
pdf("plot.pdf", 5,8)
grid::grid.draw(plot)  # arrangeGrob() returns a grob, so draw it explicitly
dev.off()
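The arithmetic behind choosing a ratio can be sketched in base R. panel_aspect is a hypothetical helper for this example; it assumes the axis ranges quoted above (0 to 3 and 0 to 132) and ignores the small padding ggplot2 adds around the data.

```r
# Hypothetical helper: physical height / width of the panel under
# coord_fixed(ratio), where ratio is the length of one y-unit relative
# to one x-unit. Scale expansion (padding) is ignored here.
panel_aspect <- function(x_range, y_range, ratio) {
  ratio * diff(y_range) / diff(x_range)
}

x_range <- c(0, 3)    # x-range quoted in the answer
y_range <- c(0, 132)  # y-range quoted in the answer

tall   <- panel_aspect(x_range, y_range, 1)    # 44 times taller than wide
medium <- panel_aspect(x_range, y_range, 0.1)  # 4.4 times taller than wide

# ratio that would give a roughly square panel
square_ratio <- diff(x_range) / diff(y_range)
print(c(tall = tall, medium = medium, square_ratio = square_ratio))
```

The square_ratio value comes out near 0.02, which matches the rule of thumb given above.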
I want to create a scatter plot, but the scale of the axes is messed up. I want it to have an increasing order, but in the plot y = 7 lies between y = 8.8 and y = 11.8.
It is a bit difficult to explain, so I uploaded a picture of the plot to
splot <- ggplot(df, aes(x_val, y_val)) + geom_point() + ggtitle(title) + xlab(label) + ylab(label)
df looks like this:
x_val y_val x_min x_max y_min y_max series
1 8.2640626 7.1605616 7.43370308695577 9.09442211304423 5.62731954407747 8.69380365592253 1IWG
2 10.0321728 8.8790822 8.43774194466477 11.6266036553352 6.97682936735609 10.7813350326439 1J4N
3 13.4994332665331 11.8238683366733 12.4200921869666 14.5787743460995 9.99549351881522 13.6522431545315 1KPL
Thanks for any help.
Use str(df) to examine your data frame df. If the variables you are trying to plot are factors, convert them with as.numeric(as.character(x)) so that they are interpreted as numbers; calling as.numeric() directly on a factor returns the internal level codes, not the original values. Alternatively, specify that the columns are numeric when you create your data set, depending on how the frame is defined.
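One caveat when converting: as.numeric() applied directly to a factor gives the level codes. The sketch below uses made-up values (rounded from the rows shown in the question) to show the difference.

```r
# Illustrative values only, stored as a factor the way a mis-read column would be
y_val <- factor(c("7.16", "8.88", "11.82"))

levels(y_val)                    # levels are sorted as text: "11.82" "7.16" "8.88"
codes  <- as.numeric(y_val)      # internal level codes, not the values
values <- as.numeric(as.character(y_val))  # the actual numbers

print(codes)
print(values)
```

The text-sorted levels also explain the symptom in the question: on a factor axis, 7 is placed after 11.8 and before 8.8.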
I have a somewhat "weird" two-dimensional distribution (not normal with some uniform values, but it kinda looks like this.. this is just a minimal reproducible example), and want to log-transform the values and plot them.
library("ggplot2")
library("scales")
df <- data.frame(x = c(rep(0,200),rnorm(800, 4.8)), y = c(rnorm(800, 3.2),rep(0,200)))
Without the log transformation, the scatterplot (incl. rug plot which I need) works (quite) well, apart from a marginally narrower rug plot on the x axis:
p <- ggplot(df, aes(x, y)) + geom_point() + geom_rug(alpha = I(0.5)) + theme_minimal()
p
When plotting the same with a log10-transform though, the points at the margin (at x = 0 and y = 0, respectively) are plotted outside the rug plot or just on the axis (with other data, and only one half side of a point is visible).
p + scale_x_log10() + scale_y_log10()
How can I "rescale" the axes so that all the points are contained fully within the grid and the rug plots are unaffected, as in the first example?
Maybe you want
p + scale_x_log10(oob=squish_infinite) + scale_y_log10(oob=squish_infinite)
I don't really know what you expect to happen for those values that can be negative or infinite, but one piece of general advice when transformations don't do what you want is to perform them outside of ggplot2. Something like this might be useful:
library(plyr)
df2 <- colwise(log10)(df) # log transform columns
df2 <- colwise(squish_infinite)(df2) # do something with infinites
p %+% df2 # plot the transformed data
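To see why the zeros cause trouble, note that log10(0) is -Inf. The base R sketch below shows this and one possible way to handle it; replacing -Inf with the smallest finite value is only an illustrative choice (an assumption for this example, not necessarily what squish_infinite does), and the vector x is made up.

```r
x  <- c(0, 1, 10, 100)  # made-up values, including a zero
lx <- log10(x)          # the zero becomes -Inf
print(lx)

# Illustrative choice: move -Inf to the smallest finite transformed value
finite_min <- min(lx[is.finite(lx)])
lx_fixed <- lx
lx_fixed[is.infinite(lx_fixed) & lx_fixed < 0] <- finite_min
print(lx_fixed)
```

Whatever replacement value you pick determines where the former zeros are drawn, so choose it with the rug plot in mind.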
In the following example, how do I set separate ylims for each of my facets?
qplot(x, value, data=df, geom=c("smooth")) + facet_grid(variable ~ ., scale="free_y")
In each of the facets, the y-axis takes a different range of values and I would like to set different ylims for each of the facets.
The default ylims are too wide for the trend that I want to see.
This was brought up on the ggplot2 mailing list a short while ago. What you are asking for is currently not possible but I think it is in progress.
As far as I know this has not been implemented in ggplot2 yet. However, a workaround that will give you ylims that exceed what ggplot provides automatically is to add "artificial data". To reduce the ylims, simply remove the data you don't want to plot (see the end for an example).
Here is an example:
Let's just set up some dummy data that you want to plot
df <- data.frame(x=rep(seq(1,2,.1),4),f1=factor(rep(c("a","b"),each=22)),f2=factor(rep(c("x","y"),22)))
df <- within(df,y <- x^2)
Which we could plot using line graphs
p <- ggplot(df,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")
print(p)
Assume we want to let y start at -10 in the first row and 0 in the second row, so we add a point at (0,-10) to the upper left plot and at (0,0) to the lower right plot:
ylim <- data.frame(x=rep(0,2),y=c(-10,0),f1=factor(c("a","b")),f2=factor(c("x","y")))
dfy <- rbind(df,ylim)
Now by limiting the x-scale between 1 and 2 those added points are not plotted (a warning is given):
p <- ggplot(dfy,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
Same would work for extending the margin above by adding points with higher y values at x values that lie outside the range of xlim.
This will not work if you want to reduce the ylim, in which case subsetting your data would be a solution, for example to limit the upper row between -10 and 1.5 you could use:
# current ggplot2 layers have no subset argument, so filter the layer's data instead
p <- ggplot(dfy,aes(x,y))+geom_line(data=subset(dfy, y < 1.5 | f1 != "a"))+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
There are actually two packages that solve that problem now:
https://github.com/zeehio/facetscales, and https://cran.r-project.org/package=ggh4x.
I would recommend using ggh4x because it has very useful tools, such as facet grid multiple layers (having 2 variables defining the rows or columns), scaling the x and y-axis as you wish in each facet, and also having multiple fill and colour scales.
For your problems the solution would be like this:
library(ggh4x)
scales <- list(
# Here you have to specify all the scales, one for each facet row in your case
scale_y_continuous(limits = c(2, 10)),
scale_y_continuous(breaks = c(3, 4))
)
qplot(x, value, data=df, geom=c("smooth")) +
facet_grid(variable ~ ., scale="free_y") +
facetted_pos_scales(y = scales)
I have one example of function facet_wrap
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(vars(class), scales = "free",
nrow=2,ncol=4)
Above code generates plot as:
My level is too low to upload an image; click here to see the plot.