I have two plots that I combine. arrangeGrob() squeezes them so that the size of the new image is the same as one alone. How can I arrange them while preserving the ratio/size?
require(ggplot2)
require(gridExtra)
require(data.table)  # provides as.data.table()
dat <- read.csv("http://www.ats.ucla.edu/stat/data/fish.csv")
frqncy <- as.data.table(table(dat$child))
frqncy$V1 <- as.numeric(frqncy$V1)
plot1 <- ggplot(frqncy, aes(x=V1, y= N)) +
  geom_histogram(stat="identity", binwidth = 2.5)
plot2 <- ggplot(frqncy, aes(x=V1, y= N)) +
  geom_density(stat="identity")
plot <- arrangeGrob(plot1, plot2)
Plot looks like
I have not found any parameter in ggplot() or arrangeGrob() that fixes the ratio of the input.
Edit: Additional complications arise from the definition of axis labels in arrangeGrob(), i.e.
plot <- arrangeGrob(plot1, plot2, left="LHS label")
Then the new file will not automatically shrink to the minimum height/width combination of plot1 and plot2.
There are several other options, depending on what you want*
library(ggplot2)
library(gridExtra)  # provides grid.arrange()
p = qplot(1, 1)
grid.arrange(p, p, respect=TRUE) # both viewports are square
grid.arrange(p, p, respect=TRUE, heights=c(1,2)) # relative heights
p1 = p + theme(aspect.ratio=3)
grid.arrange(p,p1, respect=TRUE) # one is square, the other thinner
*: the aspect ratio is often not a well-defined property of plots (unless set manually), because the default is to extend the plot to the available space defined by the plot window/device/viewport.
You can control this when you output to a device. For example, a PDF file:
pdf("plot.pdf", width=5, height=8)
grid::grid.draw(plot)  # with gridExtra >= 2.0 arrangeGrob() returns a gtable; draw it explicitly
dev.off()
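To keep each plot at the size it would have on its own, the device can be scaled with the layout. A minimal sketch, assuming every cell should get the same size and ignoring the extra room that outer labels such as left="LHS label" need; device_size() is a hypothetical helper, not a gridExtra function:

```r
# Hypothetical helper: device size needed so that each cell of an
# nrow x ncol layout keeps the size a single plot would have alone.
device_size <- function(single_width, single_height, nrow = 1, ncol = 1) {
  c(width = single_width * ncol, height = single_height * nrow)
}

# Two plots stacked vertically, each meant to be 5 x 4 inches.
size <- device_size(5, 4, nrow = 2)

# The result can then be handed to the device, e.g.:
# pdf("plot.pdf", width = size[["width"]], height = size[["height"]])
```

For two stacked 5 x 4 inch plots this gives the same 5 x 8 inch device used above.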
Another option is to set a fixed ratio between the x and y coordinates in the plot itself using coord_fixed(ratio=n), where n is the y/x ratio. This will set the relative physical length of the x and y axes based on the nominal value range for each axis. If you use coord_fixed() the graph will always maintain the desired aspect ratio no matter what device size you use for your output.
For example, in your case both graphs have x-range 0 to 3 and y-range 0 to 132. If you set coord_fixed(ratio=1), your graphs will be tall and super skinny because the x-axis length will be 3/132 times the y-axis length (or to put it another way, 1 x-unit will take up the same physical length as 1 y-unit, but there are only 3 x-units and 132 y-units). Play around with the value of ratio to see how it works. A ratio of somewhere around 0.02 is probably about right for your graphs.
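The arithmetic behind that estimate can be sketched as follows. This assumes the target is a given height/width ratio for the panel and ignores the small default scale expansion; fixed_ratio() is a hypothetical helper, not part of ggplot2:

```r
# Hypothetical helper: the coord_fixed() ratio that makes the panel's
# height / width equal to target_aspect, given the data ranges.
fixed_ratio <- function(x_range, y_range, target_aspect = 1) {
  target_aspect * diff(x_range) / diff(y_range)
}

# Ranges from the question: x from 0 to 3, y from 0 to 132.
square_ratio <- fixed_ratio(c(0, 3), c(0, 132))  # 3/132, about 0.0227
```

Passing square_ratio to coord_fixed(ratio = ...) should give a roughly square panel; a larger target_aspect gives a taller one.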
As an example, try the following code. Here I've set the ratio to 0.1, so now 1 x-unit takes up 10 times the physical length of each y-unit (that is, 0 to 3 on the x-axis has the same physical length as 0 to 30 on the y-axis).
plot1 <- ggplot(frqncy, aes(x=V1, y= N)) +
  geom_histogram(stat="identity", binwidth = 2.5) +
  coord_fixed(ratio=0.1)
plot2 <- ggplot(frqncy, aes(x=V1, y= N)) +
  geom_density(stat="identity") +
  coord_fixed(ratio=0.1)
plot <- arrangeGrob(plot1, plot2)
pdf("plot.pdf", width=5, height=8)
grid::grid.draw(plot)  # with gridExtra >= 2.0 arrangeGrob() returns a gtable; draw it explicitly
dev.off()
I'm doing quantitative image analysis, and visualizing the results with ggplot2. The output contains one datapoint for each pixel in the original image.
geom_raster() nicely visualizes my data in R. But it would be nice to output a raster image corresponding to the results. That way, I could flip through several derived images using a lightweight image viewer (e.g., feh), and the pixels would line up perfectly.
Is there an easy way to output the pixels, and only the pixels, to an image file? No legend, no axes, nothing but the pixels. Assume my data.frame has columns for row and col, and the desired output resolution is also known.
Here's one way:
library(ggplot2)
library(reshape2) # for melt(...)
n <- 100
set.seed(1) # for reproducible example
img <- matrix(rnorm(n^2,30,3),nc=n)
gg <- melt(data.frame(x=1:n,img),id="x")
ggplot(gg) + geom_raster(aes(x=x,y=variable,fill=value))+
  scale_x_continuous(expand=c(0,0))+ # get rid of extra space on x-axis
  guides(fill="none")+ # turn off color legend (fill=FALSE is deprecated in recent ggplot2)
  theme(axis.text=element_blank(), # turn off the axis annotations
        axis.ticks=element_blank(),
        axis.title=element_blank())
Thanks to jlhoward for pointing me in the right direction. There are a few more missing ingredients -- for instance, without labs(x=NULL, y=NULL), the output PNG will have white borders on the bottom and left.
I decided my solution should have two parts:
1. Craft a ggplot object to visualize my data. (This step is the same as usual.)
2. Call a general-purpose function to take care of all the annoying details which are necessary to output that plot as a pixel-perfect PNG.
Here is one such function.
BorderlessPlotPng <- function(plot, ...) {
  # Write a ggplot2 plot to an image file with no borders.
  #
  # Args:
  #   plot: A ggplot2 plot object.
  #   ...: Arguments passed to the png() function.
  require(grid)
  png(type='cairo', antialias=NULL, units='px', ...)
  print(plot
        + theme(plot.margin=unit(c(0, 0, -0.5, -0.5), 'line'),
                axis.text=element_blank(),
                axis.ticks=element_blank(),
                axis.title=element_blank(),
                legend.position='none')
        + scale_x_continuous(expand=c(0, 0))
        + scale_y_continuous(expand=c(0, 0))
        + labs(x=NULL, y=NULL)
  )
  dev.off()
}
To see it in action, here's a plot of some synthetic data. (I made each output pixel 10 pixels wide for demonstration purposes.)
# Synthetic data.
width <- 64
height <- 48
d <- data.frame(row=rep(1:height, each=width),
                col=rep(1:width, height),
                x=rnorm(n=width * height))
# Construct and print the plot.
library(ggplot2)
plot <- (ggplot(data=d, aes(x=col, y=height + 1 - row, fill=x))
         + geom_raster()
         + scale_fill_gradient2()
)
pixel_size <- 10
BorderlessPlotPng(plot,
                  filename='test.png',
                  width=width * pixel_size,
                  height=height * pixel_size)
Output:
Of course, running with pixel_size <- 1 would give you a 1:1 image, which you could compare to the original image by flipping back and forth.
I have a somewhat "weird" two-dimensional distribution (not normal, with some uniform values; the data below is just a minimal reproducible example that looks roughly like it), and I want to log-transform the values and plot them.
library("ggplot2")
library("scales")
df <- data.frame(x = c(rep(0,200),rnorm(800, 4.8)), y = c(rnorm(800, 3.2),rep(0,200)))
Without the log transformation, the scatterplot (incl. rug plot which I need) works (quite) well, apart from a marginally narrower rug plot on the x axis:
p <- ggplot(df, aes(x, y)) + geom_point() + geom_rug(alpha = I(0.5)) + theme_minimal()
p
When plotting the same data with a log10 transform, though, the points at the margin (at x = 0 and y = 0, respectively) are plotted outside the rug plot or right on the axis (with other data, only half of each such point is visible).
p + scale_x_log10() + scale_y_log10()
How can I "rescale" the axes so that all the points are contained fully within the grid and the rug plots are unaffected, as in the first example?
Maybe you want
p + scale_x_log10(oob=squish_infinite) + scale_y_log10(oob=squish_infinite)
I don't really know what you expect to happen for those values that can be negative or infinite, but one general piece of advice when transformations don't do what you want is to perform them outside of ggplot2. Something like this might be useful:
library(plyr)
df2 <- colwise(log10)(df) # log transform columns
df2 <- colwise(squish_infinite)(df2) # do something with infinites
p %+% df2 # plot the transformed data
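If you would rather avoid plyr, the same idea can be sketched in base R. This is only an illustration: clamp_infinite() is a hypothetical helper (not part of ggplot2 or scales), and replacing infinite values with the nearest finite value is just one assumption about what "do something with infinites" should mean for your data.

```r
# Hypothetical helper: replace -Inf / Inf with the smallest / largest
# finite value found in the same vector.
clamp_infinite <- function(v) {
  finite_range <- range(v[is.finite(v)])
  v[is.infinite(v) & v < 0] <- finite_range[1]
  v[is.infinite(v) & v > 0] <- finite_range[2]
  v
}

# Small deterministic stand-in for the data in the question:
# zeros become -Inf under log10() and are then clamped.
df_small <- data.frame(x = c(0, 1, 10, 100), y = c(1000, 100, 10, 0))
df_log <- as.data.frame(lapply(df_small, function(col) clamp_infinite(log10(col))))
```

The transformed data frame can then be plotted on ordinary linear scales, for example with p %+% df_log.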
What's the ggplot2 equivalent of "dotplot" histograms? With stacked points instead of bars? Similar to this solution in R:
Plot Histogram with Points Instead of Bars
Is it possible to do this in ggplot2? Ideally with the points shown as stacks and a faint line showing the smoothed line "fit" to these points (which would make a histogram shape.)
ggplot2 does dotplots with geom_dotplot(); see the manual.
Here is an example:
library(ggplot2)
set.seed(789); x <- data.frame(y = sample(1:20, 100, replace = TRUE))
ggplot(x, aes(y)) + geom_dotplot()
In order to make it behave like a simple dotplot, we should do this:
ggplot(x, aes(y)) + geom_dotplot(binwidth=1, method='histodot')
You should get this:
To address the density issue, you'll have to add another term, ylim(), so that your plot call will have the form ggplot() + geom_dotplot() + ylim()
More specifically, you'll write ylim(0, A), where A will be the number of stacked dots necessary to count 1.00 density. In the example above, the best you can do is see that 7.5 dots reach the 0.50 density mark. From there, you can infer that 15 dots will reach 1.00.
So your new call looks like this:
ggplot(x, aes(y)) + geom_dotplot(binwidth=1, method='histodot') + ylim(0, 15)
Which will give you this:
Usually, this kind of eyeball estimate will work for dotplots, but of course you can try other values to fine-tune your scale.
Notice that changing the ylim values doesn't affect how the data is displayed; it only changes the labels on the y-axis.
As @joran pointed out, we can use geom_dotplot
require(ggplot2)
ggplot(mtcars, aes(x = mpg)) + geom_dotplot()
Edit: (moved useful comments into the post):
The label "count" is misleading because this is actually a density estimate; maybe you could suggest that this label be changed to "density" by default. The ggplot2 implementation of the dotplot follows the original one by Leland Wilkinson, so if you want to understand clearly how it works, take a look at this paper.
As for an easy transformation to make the y axis show actual counts, i.e. "number of observations", the help page says:
When binning along the x axis and stacking along the y axis, the numbers on y axis are not meaningful, due to technical limitations of ggplot2. You can hide the y axis, as in one of the examples, or manually scale it to match the number of dots.
So you can use this code to hide the y axis:
ggplot(mtcars, aes(x = mpg)) +
geom_dotplot(binwidth = 1.5) +
scale_y_continuous(name = "", breaks = NULL)
Here is an exact approach that builds on @Waldir Leoncio's latter method (the one using ylim(0, A)).
library(ggplot2); library(grid)
set.seed(789)
x <- data.frame(y = sample(1:20, 100, replace = TRUE))
g <- ggplot(x, aes(y)) + geom_dotplot(binwidth=0.8)
g # output to read parameter
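### calculation of width and height of panel
grid.ls(view=TRUE, grob=FALSE)  # lists the viewport names of the drawn plot
# Note: the conversions below measure the current viewport. To measure the panel,
# first move into the panel viewport listed above with seekViewport().
real_width <- convertWidth(unit(1,'npc'), 'inch', TRUE)
real_height <- convertHeight(unit(1,'npc'), 'inch', TRUE)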
### calculation of other values
width_coordinate_range <- diff(ggplot_build(g)$panel$ranges[[1]]$x.range)
real_binwidth <- real_width / width_coordinate_range * 0.8 # 0.8 is the argument binwidth
num_balls <- real_height / 1.1 / real_binwidth # the number of stacked balls. 1.1 is expanding value.
# num_balls is the value of A
g + ylim(0, num_balls)
Apologies: I don't have enough reputation to comment.
I like cuttlefish44's "exact approach", but to make it work (with ggplot2 [2.2.1]) I had to change the following line from:
### calculation of other values
width_coordinate_range <- diff(ggplot_build(g)$panel$ranges[[1]]$x.range)
to
### calculation of other values
width_coordinate_range <- diff(ggplot_build(g)$layout$panel_ranges[[1]]$x.range)
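Since the location of the panel ranges inside the built plot has moved between ggplot2 releases, a version-tolerant lookup can be sketched as below. This is an assumption-laden illustration rather than an official API: panel_x_range() is a hypothetical helper, layout$panel_params is where ggplot2 3.x keeps these ranges, and the two fallbacks are the paths quoted in the answers above.

```r
library(ggplot2)

# Hypothetical helper: find the x range of the first panel, trying the
# field names used by different ggplot2 releases in turn.
panel_x_range <- function(g) {
  built <- ggplot_build(g)
  if (!is.null(built$layout$panel_params)) {
    built$layout$panel_params[[1]]$x.range   # ggplot2 3.x
  } else if (!is.null(built$layout$panel_ranges)) {
    built$layout$panel_ranges[[1]]$x.range   # ggplot2 2.2.x
  } else {
    built$panel$ranges[[1]]$x.range          # older releases
  }
}

set.seed(789)
x <- data.frame(y = sample(1:20, 100, replace = TRUE))
g <- ggplot(x, aes(y)) + geom_dotplot(binwidth = 0.8)
width_coordinate_range <- diff(panel_x_range(g))
```

The rest of the calculation can then use width_coordinate_range unchanged.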
In short:
I would like to have separate legends for each "panel" of a two-panel plot made using facet_wrap. Using facet_wrap(scales="free") works fine when I want different axis scales, but not for the size of points.
Background:
I have data for several samples with three measurements each: x, y, and z. Each sample is from either class 1 or class 2. x and y have the same distributions in each class. However, all the z measurements for class 1 are less than 1.0; z measurements for class 2 range from 0 to 100.
Where I'm stuck:
Plot x and y on the x and y axes, respectively. Make the area of each point proportional to its z value.
d = matrix(c(runif(100),runif(20)*100),ncol=3)
e = data.frame( gl(2,20), d )
colnames(e) = c("class","x","y","z")
ggplot( data = e, aes(x=x, y=y, size=z) ) +
  geom_point() + scale_size_area() +  # scale_area() has been replaced by scale_size_area() in current ggplot2
  facet_wrap( ~ class, ncol=1, scales="free" )
Problem:
Note that the dots on the first panel are difficult to see because they are on the very low end of the scale used for the single legend which ranges from 0 to 100. Is it even possible to have two separate legends (each with a different range) or should I make two plots and combine them with viewports?
A solution using grid.arrange. I've left in the call to facet_wrap so the strip.text remains. You could easily remove this.
# plot for class 1 (scale_size_area() replaces scale_area() in current ggplot2)
c1 <- ggplot(e[e$class==1,], aes(x=x,y=y,size=z)) + geom_point() + scale_size_area() + facet_wrap(~class)
# plot for class 2
c2 <- c1 %+% e[e$class==2,]
library(gridExtra)
grid.arrange(c1,c2, ncol=1)
In the following example, how do I set separate ylims for each of my facets?
qplot(x, value, data=df, geom=c("smooth")) + facet_grid(variable ~ ., scale="free_y")
In each of the facets, the y-axis takes a different range of values, and I would like to set different ylims for each of the facets.
The default ylims are too wide for the trend that I want to see.
This was brought up on the ggplot2 mailing list a short while ago. What you are asking for is currently not possible but I think it is in progress.
As far as I know this has not been implemented in ggplot2 yet. However, a workaround that will give you ylims exceeding what ggplot provides automatically is to add "artificial data". To reduce the ylims, simply remove the data you don't want to plot (see the end for an example).
Here is an example:
Let's just set up some dummy data that you want to plot
df <- data.frame(x=rep(seq(1,2,.1),4),f1=factor(rep(c("a","b"),each=22)),f2=factor(rep(c("x","y"),22)))
df <- within(df,y <- x^2)
Which we could plot using line graphs
p <- ggplot(df,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")
print(p)
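Assume we want to let y start at -10 in the first row and at 0 in the second row, so we add a point at (0,-10) to the upper left plot and at (0,0) to the lower right plot (with facet_grid and scales="free_y" the y scale is shared within a row, so one point per row is enough):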
ylim <- data.frame(x=rep(0,2),y=c(-10,0),f1=factor(c("a","b")),f2=factor(c("x","y")))
dfy <- rbind(df,ylim)
Now by limiting the x-scale between 1 and 2 those added points are not plotted (a warning is given):
p <- ggplot(dfy,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
The same would work for extending the margin above by adding points with higher y values at x values that lie outside the range of xlim.
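A variation on the same trick, sketched here as an assumption rather than as part of the original answer, is to put the extra limits into a geom_blank() layer. It trains the scales without drawing anything, so the artificial points can sit inside the x range and no xlim() call (or warning) is needed. The limits data frame is a hypothetical helper table.

```r
library(ggplot2)

# Same dummy data as above.
df <- data.frame(x = rep(seq(1, 2, .1), 4),
                 f1 = factor(rep(c("a", "b"), each = 22)),
                 f2 = factor(rep(c("x", "y"), 22)))
df$y <- df$x^2

# Hypothetical helper table: one invisible point per facet row that
# carries the y value the axis should be extended to.
limits <- data.frame(x = 1, y = c(-10, 0),
                     f1 = factor(c("a", "b")),
                     f2 = factor(c("x", "y")))

p_blank <- ggplot(df, aes(x, y)) +
  geom_line() +
  geom_blank(data = limits) +  # extends the scales, draws nothing
  facet_grid(f1 ~ f2, scales = "free_y")
```

Printing p_blank should show the first row starting at -10 and the second at 0, as in the xlim-based version.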
This will not work if you want to reduce the ylim, in which case subsetting your data is a solution. For example, to limit the upper row to between -10 and 1.5 you could use:
# current ggplot2 no longer accepts subset= in layers, so pass the subsetted data instead
p <- ggplot(dfy,aes(x,y))+geom_line(data=subset(dfy, y < 1.5 | f1 != "a"))+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
There are actually two packages that solve that problem now:
https://github.com/zeehio/facetscales, and https://cran.r-project.org/package=ggh4x.
I would recommend using ggh4x because it has very useful tools, such as nested facets (two variables defining the rows or columns), setting the x and y scales as you wish in each facet, and multiple fill and colour scales.
For your problem the solution would look like this:
library(ggplot2)
library(ggh4x)
scales <- list(
  # Here you have to specify all the scales, one for each facet row in your case
  scale_y_continuous(limits = c(2, 10)),
  scale_y_continuous(breaks = c(3, 4))
)
qplot(x, value, data=df, geom=c("smooth")) +
  facet_grid(variable ~ ., scales="free_y") +
  facetted_pos_scales(y = scales)
I have an example that uses facet_wrap:
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(vars(class), scales = "free",
nrow=2,ncol=4)
The code above generates this plot:
My reputation is too low to upload an image; click here to see the plot.