I'm trying to create a picture with points (actually bars, but whatever) in two distinct colours with parallel saturated-to-unsaturated colour scales, with corresponding colourbar legends. I'm most of the way there, but there are a few minor points I can't handle yet.
tl;dr the color scales I get from a red-to-white gradient and a saturated-red-to-completely-unsaturated gradient are not identical.
Set up data: y will determine both y-axis position and degree of saturation, w will determine binary colour choice.
set.seed(101)
dd <- data.frame(x=1:100,y=rnorm(100))
dd$w <- as.logical(sample(0:1,size=nrow(dd),
replace=TRUE))
Get packages:
library(ggplot2)
library(cowplot)
library(gridExtra)
I can get the plot I want by allowing alpha (transparency) to vary with y, but the legend is ugly:
g0 <- ggplot(dd,aes(x,y))+
geom_point(size=8,aes(alpha=y,colour=w))+
scale_colour_manual(values=c("red","blue"))
## + scale_alpha(guide="colourbar") ## doesn't work
I can draw each half of the points by themselves to get a legend similar to what I want:
g1 <- ggplot(dd[!dd$w,],aes(x,y))+
geom_point(size=8,aes(colour=y))+
scale_colour_gradient(low="white",high="red",name="not w")+
expand_limits(x=range(dd$x),y=range(dd$y))
g2 <- ggplot(dd[dd$w,],aes(x,y))+
geom_point(size=8,aes(colour=y))+
scale_colour_gradient(low="white",high="blue",name="w")+
expand_limits(x=range(dd$x),y=range(dd$y))
Now I can use tools from cowplot to pick off the legends and combine them with the original plot:
g1_leg <- get_legend(g1)
g2_leg <- get_legend(g2)
g0_noleg <- g0 + theme(legend.position='none')
ggdraw(plot_grid(g0_noleg,g1_leg,g2_leg,nrow=1,rel_widths=c(1,0.2,0.2)))
This is most of the way there, but:
ideally I'd like to squash the two colourbars together (I know I can probably do that with sufficient grid-hacking ...)
the colours don't quite match; the legend colours are slightly warmer than the point colours ...
Ideas? Or other ways of achieving the same goal?
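One possible explanation for the colour mismatch, offered as a sketch rather than a confirmed diagnosis: alpha blending mixes the point colour with the panel background in plain RGB, whereas a white-to-red gradient scale may interpolate in a perceptual colour space such as Lab. The helpers blend_over() and lab_ramp() below are hypothetical names introduced only for this illustration, and the assumption that scale_colour_gradient() interpolates in Lab space should be checked against the ggplot2/scales documentation. Note also that the default grey panel ("grey92" in theme_grey()) shifts alpha-blended colours away from a white-based ramp.

```r
# Alpha-blend a colour over a background in plain sRGB
# (assumed here to approximate how a semi-transparent point is rendered).
blend_over <- function(col, alpha, bg = "white") {
  fg <- grDevices::col2rgb(col)[, 1] / 255
  bk <- grDevices::col2rgb(bg)[, 1] / 255
  grDevices::rgb(t(alpha * fg + (1 - alpha) * bk))
}

# Interpolate white -> colour in Lab space
# (assumption: this approximates the default gradient scale).
lab_ramp <- function(col, t) {
  m <- grDevices::colorRamp(c("white", col), space = "Lab")(t)
  grDevices::rgb(m, maxColorValue = 255)
}

blend_over("red", 0.5)            # half-transparent red on a white panel
blend_over("red", 0.5, "grey92")  # the same point on a grey92 panel
lab_ramp("red", 0.5)              # midpoint of a white-to-red Lab ramp
```

If the three results differ, the legend and the points cannot match exactly. One way round this might be to compute the colours yourself and pass them through scale_colour_identity(), so that points and legend share a single source of colours.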
I'm superimposing two images in R. One image is a boxplot (using boxplot()), the other a scatterplot (using scatterplot()). I noticed a discrepancy in the scale along the x-axis. (A) is the boxplot scale. (B) is for the scatterplot.
What I've been trying to do is re-scale (B) to suit (A). I note there is an argument called xlim in scatterplot. Tried it, didn't work. I've also noted this example came up as I was typing out the question: Change Axis Label - R scatterplot.
Tried it, didn't work.
How can I modify the x-axis to change the scale from 1.0, 1.5, 2.0, 2.5, 3.0 to simply 1, 2, 3?
In Stata, I'm aware you can specify the x-axis range, and then indicate the step-ups between. For example, the range may be 0-100, and each measurable point would be set to 10. So you'd end up with 10, 20,....,100.
My R code, as it stands, looks something like this:
library(car)
boxplot(a,b,c)
par(new=TRUE)
scatterplot(x, y, smooth=TRUE, boxplots=FALSE)
I've tried modifying scatterplot as such without any success:
scatterplot(x, y, smooth=TRUE, boxplots=FALSE, xlim=c(1,3))
As mentioned in the comments, use as.factor; the x-axes should then align. Here is a ggplot solution:
#dummy data
dat1 <- data.frame(group=as.factor(rep(1:3,4)),
var=c(runif(12)))
dat2 <- data.frame(x=as.factor(1:3),y=runif(3))
library(ggplot2)
library(grid)
library(gridExtra)
#plot points on top of boxplot
ggplot(dat1,aes(group,var)) +
geom_boxplot() +
geom_point(aes(x,y),dat2)
Plot as separate plots
gg_boxplot <-
ggplot(dat1,aes(group,var)) +
geom_boxplot()
gg_point <-
ggplot(dat2,aes(x,y)) +
geom_point()
grid.arrange(gg_boxplot,gg_point,
ncol=1,
top="Plotting is easier with ggplot") # 'main' was renamed 'top' in gridExtra 2.0
EDIT
Using xlim as suggested by @RuthgerRighart:
#dummy data - no factors
dat1 <- data.frame(group=rep(1:3,4),
var=c(runif(12)))
dat2 <- data.frame(x=1:3,y=runif(3))
par(mfrow=c(2,1))
boxplot(var~group,dat1,xlim=c(1,3))
plot(dat2$x,dat2$y,xlim=c(1,3))
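For the original request (ticks at 1, 2, 3 only), a minimal base-graphics sketch is to suppress the default axis and draw your own with axis(). The data here are made up for illustration, and the xlim of c(0.5, 3.5) is chosen to match the range boxplot() uses by default for three groups.

```r
# Made-up data: three groups on the x-axis
set.seed(1)
x <- rep(1:3, each = 4)
y <- runif(12)
ticks <- 1:3

# Scatterplot with ticks only at 1, 2, 3
plot(x, y, xlim = c(0.5, 3.5), xaxt = "n")  # xaxt = "n" suppresses the default axis
axis(side = 1, at = ticks)                  # draw ticks at the chosen positions

# Overlay without par(new = TRUE): boxplot() draws group i at x = i,
# so points() can be added on the same coordinate system
boxplot(y ~ factor(x))
points(x, y, pch = 16)
```

Adding points() to the existing boxplot avoids having two independent axis scales in the first place.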
I've created a map by overlaying polygons using spplot and with the alpha value of the fill set to 10/255 so that areas with more polygons overlapping have a more saturated color. The polygons are set to two different colors (blue and red) based on a binary variable in the attribute table. Thus, while the color saturation depends on the number of polygons overlapping, the color depends on the ratio of the blue and red classes of polygons.
There is, of course, no easy built-in legend for this so I need to create one from scratch. There is a nice solution to this in base graphics found here. I also came up with a not-so-good hack to do this in ggplot based on this post from kohske. A similar question was posted here and I did my best to give some solutions, but couldn't really come up with a solid answer. Now I need to do the same for myself, but I specifically would like to use R and also use grid graphics.
This is the ggplot hack I came up with
library(ggplot2)
library(reshape2) # for melt(...)
Variable_A <- 100 # max of variable
Variable_B <- 100
x <- melt(outer(1:Variable_A, 1:Variable_B)) # set up the data frame to plot from
p <- ggplot(x) + theme_classic() + scale_alpha(range=c(0,0.5), guide="none") +
geom_tile(aes(x=Var1, y=Var2, fill="Variable_A", col.regions="red", alpha=Var1)) +
geom_tile(aes(x=Var1, y=Var2, fill="Variable_B", col.regions="blue", alpha=Var2)) +
scale_x_continuous(limits = c(0, Variable_A), expand = c(0, 0)) +
scale_y_continuous(limits = c(0, Variable_B), expand = c(0, 0)) +
xlab("Variable_A") + ylab("Variable_B") +
guides(fill="none")
p
Which gives this:
This doesn't work for my purposes for two reasons. 1) Because the alpha value varies, the second color plotted (blue in this case) overwhelms the first one as the alpha values get higher. The correct legend should have blue and red mixed evenly along the 1:1 diagonal. In addition, the colors don't really properly correspond to the map colors. 2) I don't know how to overlay a ggplot object on the lattice map created with spplot. I tried to create a grob using ggplotGrob(p), but still couldn't figure out how to add the grob to the spplot map.
The ideal solution would be to create a similar figure using lattice graphics. I think that using tiles is probably the right solution, but what would be best is to have the alpha values stay constant and vary the number of tiles plotted going from left to right (for red) and bottom to top (for blue). Thus, the colors and saturation should properly match the map (I think...).
Any help is much appreciated!
How about mapping the angle to color, and alpha to the sum of the two variables -- does this do what you want?
d <- expand.grid(x=1:100, y=1:100)
ggplot(d, aes(x, y, fill=atan(y/x), alpha=x+y)) +
geom_tile() +
scale_fill_gradient(high="red", low="blue")+
theme(legend.position="none", panel.background=element_blank())
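Since the question asks for grid graphics, here is a rough sketch of a two-variable legend built directly as a raster grob. It is a simplification under stated assumptions: the hue is the ratio of the two variables (so red and blue mix evenly along the 1:1 diagonal) and the opacity grows with their sum. It does not reproduce the exact compositing of stacked semi-transparent polygons. The names n, g, img and legend are illustrative only.

```r
library(grid)

n <- 50                                        # legend resolution (assumption)
g <- expand.grid(a = seq_len(n), b = seq_len(n))

# Hue from the a:b ratio, opacity from the total a + b
cols <- rgb(red   = g$a / (g$a + g$b),
            green = rep(0, nrow(g)),
            blue  = g$b / (g$a + g$b),
            alpha = (g$a + g$b) / (2 * n))

m   <- matrix(cols, nrow = n, ncol = n)        # m[a, b]
img <- t(m)[n:1, ]                             # a runs left to right, b bottom to top

legend <- rasterGrob(img, interpolate = FALSE,
                     width = unit(0.5, "npc"), height = unit(0.5, "npc"))
grid.newpage()
grid.draw(legend)
```

Because legend is an ordinary grob, it can be placed in any grid viewport, which is one possible route to drawing it next to a lattice plot.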
If I generate a ggplot by:
x <- rnorm( 10^3, mean=0, sd=1)
y <- rnorm( 10^3, mean=0, sd=1)
z=x^2+y^2
df <- data.frame(x,y,z)
ggplot(df)+geom_point(aes(x,y,color=z))
By default, this is plotted on a blue scale. How can I combine different colors to make a new scale?
There is an almost limitless number of ways to set colors in ggplot, so many in fact that it can get confusing. Here are a couple of examples to get you started. See the documentation here, here, and here for many more options. IMO this site gives an excellent overview of color options in ggplot.
As @rawr points out in the comment, the options all involve some version of scale_color_*.
scale_color_gradient(...) associates colors with low and high values of the color scale variable and interpolates between them.
ggp <- ggplot(df)+geom_point(aes(x,y,color=z))
ggp + scale_color_gradient(low="red", high="blue")
scale_color_gradientn(...) takes a color palette as argument (e.g., a vector of colors) and interpolates between those. Color palettes can be defined manually or using one of the many tools in R. For example, the RColorBrewer package provides access to the color schemes on www.colorbrewer.org.
library(RColorBrewer) # for brewer.pal(...)
ggp + scale_color_gradientn(colours=rev(brewer.pal(9,"YlOrRd")))
library(colorRamps) # for matlab.like(...)
ggp + scale_color_gradientn(colours=matlab.like(10))
scale_color_gradient2(...) produces a divergent color scale, designed for data that has a natural midpoint (your example doesn't...).
ggp +
scale_color_gradient2(low="blue",mid="green",high="red",midpoint=5,limits=c(0,10))
This really just scratches the surface. For example, there is another set of tools in ggplot to deal with discrete color scales.
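As a small illustration of the discrete case, the sketch below bins z into three bands and colours them with scale_color_brewer(). The band cut-offs and labels are arbitrary choices made for this example, not something from the question.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(100), y = rnorm(100))
df$z <- df$x^2 + df$y^2

# Turn the continuous variable into a factor (cut-offs are arbitrary)
df$band <- cut(df$z, breaks = c(0, 1, 4, Inf),
               labels = c("inner", "middle", "outer"),
               include.lowest = TRUE)

p <- ggplot(df) +
  geom_point(aes(x, y, color = band)) +
  scale_color_brewer(palette = "Set1")   # one colour per factor level
p
```

scale_color_manual(values = c(...)) works the same way when you want to name each colour yourself.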
Let's say I have this data.frame:
df <- data.frame(x = rep(1, 20), y = runif(20, 10, 20))
and I want to plot df$y vs. df$x.
Since the x values are constant, points that have identical or close y values will be plotted on top of each other in a simple scatterplot, which kind of hides the density of points at such y-values. One solution for that situation is of course to use a violin plot.
I'm looking for another solution: plotting clusters of points instead of the individual points, which would therefore look similar to a bubble plot. In a bubble plot, however, a third dimension is required in order to make the bubbles meaningful, which I don't have in my data. Does anyone know of an R function/package that takes points as input (and probably a defined radius), clusters them and plots them?
You can jitter the x values:
plot(jitter(df$x),df$y)
You could try a hexplot, using either the hexbin package or stat_binhex in ggplot2.
http://cran.r-project.org/web/packages/hexbin/
http://docs.ggplot2.org/0.9.3/stat_binhex.html
The other standard approach (vs. jitter) is to use a partially transparent color, so that overlapping points will appear darker than "lone" points.
De gustibus, etc.
Using transparency is another solution. E.g.:
ggplot(df, aes(x=x, y=y)) +
geom_point(alpha=0.2, size=3)
When there is only one x value, a density plot:
ggplot(df, aes(x=y)) +
stat_density(geom="line")
or a violin plot:
ggplot(df, aes(x=x, y=y)) +
geom_violin()
might also be options for displaying your data.
Look at the sunflowerplot function (and the xyTable function that it uses to count overlapping points).
You could also use the my.symbols function from the TeachingDemos package with the results of xyTable to use other shapes (polygrams, for example).
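A minimal base-R sketch of the xyTable idea: round y so that nearby values coincide, count the coincident points, and scale the plotting symbol by the count. Rounding to whole numbers is an assumption made for this example; pick whatever bin width suits your data.

```r
set.seed(1)
df <- data.frame(x = rep(1, 20), y = runif(20, 10, 20))

# Snap nearby y values together (bin width of 1 is an arbitrary choice)
y_binned <- round(df$y)

# Count points sharing the same (x, y) position
tab <- xyTable(df$x, y_binned)

# Sunflower plot: one petal per overlapping point
sunflowerplot(df$x, y_binned)

# Bubble-style plot: symbol area roughly proportional to the count
plot(tab$x, tab$y, cex = sqrt(tab$number), pch = 16)
```

Here the count from xyTable plays the role of the "third dimension" that a bubble plot needs.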
I am working with ggplot. I want to design a plot of two continuous variables, and I would like it to look like this:
Where x and y are the continuous variables. My problem is that I can't get it to show circles on the line of the plot. I would like a circle for each pair of observations from the continuous variables. For example, the attached graphic has a circle for the pairs (1,1), (2,2) and (3,3). Is it possible to do this? (The colour of the line doesn't matter.)
# dummy data
dat <- data.frame(x = 1:5, y = 1:5)
ggplot(dat, aes(x,y,color=x)) +
geom_line(size=3) +
geom_point(size=10) +
scale_colour_continuous(low="blue",high="red")
Playing with low/high will change the colours.
In general, to remove the legend, use + theme(legend.position="none")