scatterplot with equal axes - r

I have a data set like this one below:
DataFrame <- data.frame(x=runif(25),y=runif(25),
z=sample(letters[1:4],25,rep=TRUE))
and using the lattice package, I can make a scatter plot with equal axes (with a 1:1 line going through the centre) with the following lines:
xyplot(y ~ x | z, data=DataFrame,
scales=list(relation="free"),
prepanel=function(x,y,...) {
rg <- range(na.omit(c(x,y)))
list(xlim=rg,ylim=rg)
},panel=function(x,y,...) {
panel.abline(0,1)
panel.xyplot(x,y,...)
})
In ggplot2, I have gotten this far:
ggplot(data=DataFrame) + geom_point(aes(x=x,y=y)) +
facet_grid(~z,scales="free") + coord_equal(ratio=1) +
geom_abline(intercept=0,slope=1)
But I'm not sure that coord_equal() is the function I'm looking for. What might be the equivalent function call in ggplot2?

Your problem lies in setting free facet scales. Once you set the facet scales to be free, you can't then add coord_equal(). If you eliminate the free scales, then coord_equal() works properly.

Maybe facet_wrap() is a better choice, and as far as I know the control of xlim and ylim for individual panels is not available in ggplot2.
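One workaround that may get close to the lattice result is to keep the free scales but add an invisible layer with x and y swapped, so that each panel's x and y scales are trained on the same values. This is only a sketch, not a built-in ggplot2 feature for per-panel limits: the geom_blank() trick and theme(aspect.ratio = 1) are my own additions here, and set.seed() is only there to make the example reproducible.

```r
library(ggplot2)

set.seed(1)
DataFrame <- data.frame(x = runif(25), y = runif(25),
                        z = sample(letters[1:4], 25, replace = TRUE))

p <- ggplot(DataFrame, aes(x = x, y = y)) +
  geom_point() +
  # Invisible layer with swapped aesthetics: each free x and y scale
  # now sees both x and y values, so both cover range(c(x, y)).
  geom_blank(aes(x = y, y = x)) +
  geom_abline(intercept = 0, slope = 1) +
  facet_wrap(~ z, scales = "free") +
  # Square panels, so the 1:1 line runs corner to corner.
  theme(aspect.ratio = 1)

print(p)
```

Note that coord_equal() is left out on purpose, since it does not work together with free scales.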

Related

How can I define a color palette (normalize) for multiple hexbin plots in R

I want to find a way to set a certain range of a color palette that is used for a hexbin plot to normalize multiple plots in R.
So far I have tried:
library(hexbin)
library(gplots)
my.colors <- function (n)
{
(rich.colors(n))
}
plot(hexbin(lastthousand$V4, lastthousand$V5, xbnds=c(0,35), ybnds=c(0,35)), xlab="Green Pucks", ylab="Red Pucks", colramp=my.colors, colorcut=seq(0, 1, length=25), lcex=0.66)
Which results in the following plot:
[image: hexbin plot #1]
I understand that "colorcut" controls the resolution of the color palette, but I found no way to control the min/max values.
Let's say I have a second plot - 'hexbin plot #2' - with counts from 1 (dark blue) to 100 (red). Is there a way to use only the colors 1 (dark blue) to 24 (light blue), i.e. only part of the 1 (dark blue) to 100 (red) scale, for hexbin plot #1?
The final goal is to have several hexbin plots next to each other which follow the same colour scheme (min and max based on the one with the highest counts).
-this is my first question here :) and I'm new to R, please be gentle
//edit: For everyone with the same problem: My supervisor suggested to use facets in ggplot2. Will see how that works and return with another edit if it solves the issue.
//edit2: facets did the trick:
library(gplots)
library(ggplot2)
# geom_hex() maps the bin counts to fill, so the palette goes on the fill scale
p <- ggplot(data=lastthousand, aes(V4, V5)) + geom_hex()
p + facet_grid(. ~ Market) + xlab("green pucks") + ylab("red pucks") + scale_fill_gradientn(colours=rainbow(7))
Maybe this can be useful: https://gist.github.com/wahalulu/1376861
and this for ranges:
https://stackoverflow.com/a/15505591/1600108
https://stackoverflow.com/a/14586941/1600108
Facets do the trick:
library(gplots)
library(ggplot2)
# geom_hex() maps the bin counts to fill, so the palette goes on the fill scale
p <- ggplot(data=lastthousand, aes(V4, V5)) + geom_hex()
p + facet_grid(. ~ Market) + xlab("green pucks") + ylab("red pucks") + scale_fill_gradientn(colours=rainbow(7))
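If the plots have to stay separate instead of being facets of one plot, something like the following might work: give every plot the same fill scale with explicit limits. This is a sketch with made-up data. The data frames d1 and d2, the helper shared_fill() and the upper limit of 100 (taken from the "counts from 1 to 100" example in the question) are all assumptions, and geom_hex() needs the hexbin package installed.

```r
library(ggplot2)

set.seed(42)
# Hypothetical stand-ins for two subsets of the real data
d1 <- data.frame(V4 = rnorm(500, 17, 5),  V5 = rnorm(500, 17, 5))
d2 <- data.frame(V4 = rnorm(5000, 17, 3), V5 = rnorm(5000, 17, 3))

# Hypothetical helper: one palette with fixed limits, so the same count
# gets the same colour in every plot. Counts outside the limits are
# drawn in the scale's NA colour (grey by default).
shared_fill <- function(max_count = 100) {
  scale_fill_gradientn(colours = rainbow(7), limits = c(0, max_count))
}

p1 <- ggplot(d1, aes(V4, V5)) + geom_hex() + shared_fill()
p2 <- ggplot(d2, aes(V4, V5)) + geom_hex() + shared_fill()
```

Printing p1 and p2 then gives two plots whose legends cover the same range, whichever plot has the highest counts.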

ggplot query or change plot limits

I have a ggplot object returned by a function in an R package. I want to add some elements to this plot before plotting it. But, I do not know the plot limits. Is there a way to query the ggplot object to find the plot limits? Actually, what I'd really like to do is simply set new limits for subsequent plotting, but I understand this is not possible, based on discussions of the impossibility of plotting data against two different y-axes.
For example, say I want to plot a small rectangle in lower-left corner of plot, but not knowing the plot limits, I don't know where to put it:
p = function() return(ggplot() + xlim(-2, 5) + ylim(-3, 5) +
geom_rect(mapping=aes(xmin=1, xmax=2, ymin=1, ymax=2)))
gp = p()
gp = gp + geom_rect(mapping=aes(xmin=0, ymin=0, xmax=0.5, ymax=0.5))
print(gp)
In ggplot2 3.0.0 and later:
ggplot_build(gp)$layout$panel_params[[1]][c("x.range","y.range")]
In ggplot2 2.2.x, the same information was stored under panel_ranges:
ggplot_build(gp)$layout$panel_ranges[[1]][c("x.range","y.range")]
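To make that concrete with the plot from the question, here is a small sketch assuming ggplot2 3.0.0 or later. The second part is a different technique that I am adding myself: if the goal is only to anchor something to a panel corner, -Inf can be used as a coordinate, so the limits do not need to be known at all. The rectangle size used there is arbitrary.

```r
library(ggplot2)

gp <- ggplot() + xlim(-2, 5) + ylim(-3, 5) +
  geom_rect(mapping = aes(xmin = 1, xmax = 2, ymin = 1, ymax = 2))

# Query the ranges of the first (and only) panel. These include the
# default axis expansion, so they are a bit wider than xlim()/ylim().
rng <- ggplot_build(gp)$layout$panel_params[[1]][c("x.range", "y.range")]
print(rng)

# Alternative: anchor a rectangle to the lower-left corner with -Inf,
# without looking up the limits first.
gp_corner <- gp +
  annotate("rect", xmin = -Inf, xmax = -1.5, ymin = -Inf, ymax = -2.5)
print(gp_corner)
```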

ggplot2 and geom_density: How to remove baseline?

I'm using ggplot as described here
Smoothed density estimates
and entered in the R console
m <- ggplot(movies, aes(x = rating))
m + geom_density()
This works, but is there some way to remove the connection between the x-axis and the density plot (the vertical lines which connect the density plot to the x-axis)?
The most consistent way to do so is (thanks to @baptiste):
m + stat_density(geom="line")
My original proposal was to use geom_line with an appropriate stat:
m + geom_line(stat="density")
but it is no longer recommended since I'm receiving reports it's not universally working for every case in newer versions of ggplot.
The suggested answers don't provide exactly the same results as geom_density. Why not draw a white line over the baseline?
m + geom_density() + geom_hline(yintercept=0, colour="white", size=1)
This worked for me.
Another way would be to calculate the density separately and then draw it. Something like this:
a <- density(movies$rating)
b <- data.frame(a$x, a$y)
ggplot(b, aes(x=a.x, y=a.y)) + geom_line()
It's not exactly the same, but pretty close.
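For anyone who wants to try these side by side, here is a sketch using the built-in faithful data set. That is my substitution, since the movies data may not be available in every ggplot2 installation. The names p_line, dens_df and p_manual are made up for the example.

```r
library(ggplot2)

m <- ggplot(faithful, aes(x = eruptions))

# Variant 1: let ggplot2 compute the density, but draw it as a plain
# line, so there is no polygon outline along the baseline.
p_line <- m + stat_density(geom = "line")

# Variant 2: compute the density with stats::density() and plot the
# resulting x/y pairs directly. Bandwidth defaults can differ from the
# ggplot2 ones, so the curve may not match exactly.
dens <- density(faithful$eruptions)
dens_df <- data.frame(x = dens$x, y = dens$y)
p_manual <- ggplot(dens_df, aes(x = x, y = y)) + geom_line()

print(p_line)
print(p_manual)
```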

rdata & ggplot: specifying plot initial plot size?

I'm using ggplot2 and attempting to create an empty plot with some basic dimensions, like I might do with the stock plot function like so:
plot(x = c(0, 10), y=c(-7, 7))
Then I'd plot the points with geom_point() (or the stock points() function).
How can I set that basic plot up using ggplot? I'm only able to draw a plot using like:
ggplot() + layer(data=data, mapping = aes(x=side, y=height), geom = "point")
But this has max x/y values based on the data.
There are two ways to approach this:
Basically the same approach as with base graphics; the first layer put down has the limits you want, using geom_blank()
ggplot() +
geom_blank(data=data.frame(x=c(0,10),y=c(-7,7)), mapping=aes(x=x,y=y))
Using expand_limits()
ggplot() +
expand_limits(x=c(0,10), y=c(-7,7))
In both cases, if your data extends beyond this, the axes will be further expanded.
You can set the overall plotting region limits using xlim and ylim:
ggplot(data = data) +
geom_point(aes(x = side, y = height)) +
xlim(c(0,10)) +
ylim(c(-7,7))
Also see coord_cartesian which zooms in and out rather than hard coding the axis limits.
Edit: Since @Brian clarified the differences between his answer and mine well, I thought I should mention it in my answer too, so no one misses it. Using xlim and ylim will set the limits of the plotting region no matter what data you add in subsequent layers. Brian's method using expand_limits is a way to set the minimum ranges.
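The difference between the three approaches can be checked with a small sketch like this one. The data frame pts and the helper x_range() are invented for the example, and one point deliberately lies outside the 0 to 10 / -7 to 7 window.

```r
library(ggplot2)

# Hypothetical data with one point outside the desired window
pts <- data.frame(side = c(2, 4, 12), height = c(-1, 3, 9))
base <- ggplot(pts, aes(x = side, y = height)) + geom_point()

# Minimum range: the axes still grow to include the point at (12, 9).
p_expand <- base + expand_limits(x = c(0, 10), y = c(-7, 7))

# Hard limits: the point at (12, 9) is dropped (with a warning when drawn).
p_fixed <- base + xlim(0, 10) + ylim(-7, 7)

# Zoom: same visible window, but no data is discarded before drawing.
p_zoom <- base + coord_cartesian(xlim = c(0, 10), ylim = c(-7, 7))

# Hypothetical helper to look at the resulting x range of a plot
x_range <- function(p) ggplot_build(p)$layout$panel_params[[1]]$x.range

print(x_range(p_expand))
print(x_range(p_fixed))
print(x_range(p_zoom))
```

With a smoother or a boxplot the choice between xlim() and coord_cartesian() matters more, because xlim() removes the out-of-range data before any statistics are computed.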

How to control ylim for a faceted plot with different scales in ggplot2?

In the following example, how do I set separate ylims for each of my facets?
qplot(x, value, data=df, geom=c("smooth")) + facet_grid(variable ~ ., scales="free_y")
In each of the facets, the y-axis takes a different range of values and I would like to set different ylims for each of the facets.
The default ylims are too wide for the trend that I want to see.
This was brought up on the ggplot2 mailing list a short while ago. What you are asking for is currently not possible but I think it is in progress.
As far as I know, this has not been implemented in ggplot2 yet. However, a workaround that will give you ylims wider than what ggplot provides automatically is to add "artificial data". To reduce the ylims, simply remove the data you don't want to plot (see the end for an example).
Here is an example:
Let's just set up some dummy data that you want to plot
df <- data.frame(x=rep(seq(1,2,.1),4),f1=factor(rep(c("a","b"),each=22)),f2=factor(rep(c("x","y"),22)))
df <- within(df,y <- x^2)
Which we could plot using line graphs
p <- ggplot(df,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")
print(p)
Assume we want to let y start at -10 in the first row and 0 in the second row, so we add a point at (0,-10) to the upper left plot and at (0,0) to the lower right plot:
ylim <- data.frame(x=rep(0,2),y=c(-10,0),f1=factor(c("a","b")),f2=factor(c("x","y")))
dfy <- rbind(df,ylim)
Now by limiting the x-scale between 1 and 2 those added points are not plotted (a warning is given):
p <- ggplot(dfy,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
Same would work for extending the margin above by adding points with higher y values at x values that lie outside the range of xlim.
This will not work if you want to reduce the ylim, in which case subsetting your data would be a solution. For example, to limit the upper row between -10 and 1.5 you could pass the filtered data to the layer (current ggplot2 layers no longer take a subset argument):
p <- ggplot(dfy,aes(x,y))+geom_line(data=subset(dfy, y < 1.5 | f1 != "a"))+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
There are actually two packages that solve that problem now:
https://github.com/zeehio/facetscales, and https://cran.r-project.org/package=ggh4x.
I would recommend using ggh4x because it has very useful tools, such as facet grid multiple layers (having 2 variables defining the rows or columns), scaling the x and y-axis as you wish in each facet, and also having multiple fill and colour scales.
For your problems the solution would be like this:
library(ggh4x)
scales <- list(
# Here you have to specify all the scales, one for each facet row in your case
scale_y_continuous(limits = c(2, 10)),
scale_y_continuous(breaks = c(3, 4))
)
qplot(x, value, data=df, geom=c("smooth")) +
facet_grid(variable ~ ., scales="free_y") +
facetted_pos_scales(y = scales)
Here is an example using facet_wrap with free scales:
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(vars(class), scales = "free",
nrow=2,ncol=4)
The code above generates the plot (image not included here).

Resources