Common breaks and free axes for overlapping lattice histograms - r

What is the required incantation to achieve an overlapping, faceted lattice::histogram with common break points (across groups, but potentially varying across panels)?
For example, assume I want the total range of the data (groups combined) for each panel to be split into 30 bins.
Example data:
library(lattice)
set.seed(1)
d <- data.frame(v1=rep(c('A', 'B'), each=1000),
v2=rep(c(0.5, 1), each=2000),
mean=rep(c(0, 10, 2, 12), each=1000))
d$x <- rnorm(nrow(d), d$mean, d$v2)
Using nint=30?
p1 <- histogram(~x|v1, d, groups=v2, nint=30,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
p1
Above, the bins are consistent across groups, but (1) the x-axis limits are shared across panels (problematic when the x-axis range varies substantially across panels - I really want the 30 bins to be calculated individually for each panel), and (2) the y-axis is cramped when using type='percent' (it should extend further).
Using breaks=30?
p2 <- histogram(~x|v1, d, groups=v2, breaks=30,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
p2
Now the axis limits look good, but the bin width varies across groups.
So...
Using lattice, how can I achieve overlapping, faceted histograms that have constant bin width across groups within panels, but have axis limits that fit the data for each panel?
(I realise that ggplot is an option, but I want the figure style to be consistent with my other lattice plots.)

This works, but I'm afraid it's rather pedestrian. At least it only requires the trellis object itself; it will assume the number of bins you want in each panel is equal to the nint parameter.
It works like this: check whether the panels' ranges overlap. If they don't, split each (slightly extended) range into nint bins, then concatenate them with a few empty bins in between. We also need to work out the y range, which we do by scaling according to the largest bin count in each panel.
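The bin construction for a single panel can be tried on its own in base R. This is a minimal sketch rather than part of the answer: the helper name `padded_breaks` and its `pad` argument are made up for illustration, and it uses `length.out` instead of a step size so that the last break always lands on the padded maximum.

```r
# Hypothetical helper: pad a numeric range by a fraction on each side
# and split it into n_bins equal-width bins.
padded_breaks <- function(x, n_bins, pad = 0.1) {
  r <- range(x)
  r <- r + diff(r) * c(-pad, pad)
  seq(r[1], r[2], length.out = n_bins + 1)
}

set.seed(1)
x <- rnorm(1000, mean = 10, sd = 0.5)
breaks <- padded_breaks(x, n_bins = 30)

# Bin the data without plotting, then express the tallest bar as a percentage
h <- hist(x, breaks = breaks, plot = FALSE)
max_percent <- 100 * max(h$counts) / length(x)
```

`max_percent` is the kind of value that ends up as the upper y limit of a panel when `type='percent'`.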
fix_facets <- function(p1)
{
n_bins <- p1$panel.args.common$nint
xvals1 <- p1$panel.args[[1]]$x
xvals2 <- p1$panel.args[[2]]$x
if(min(xvals2) > max(xvals1) | min(xvals1) > max(xvals2)){
left_range <- range(xvals1)
left_range <- left_range + (diff(left_range) * c(-0.1, 0.1))
left_bins <- seq(left_range[1], left_range[2], diff(left_range)/n_bins)
right_range <- range(xvals2)
right_range <- right_range + (diff(right_range) * c(-0.1, 0.1))
right_bins <- seq(right_range[1], right_range[2], diff(right_range)/n_bins)
if(max(left_range) < min(right_range)){
mid_bins <- seq(max(left_bins), min(right_bins), diff(left_bins[1:2]))
all_bins <- c(left_bins, mid_bins, right_bins)
} else {
mid_bins <- seq(max(right_bins), min(left_bins), diff(right_bins[1:2]))
all_bins <- c(right_bins, mid_bins, left_bins)
}
p1$panel.args.common$breaks <- all_bins
p1$x.limits[[1]] <- left_range
p1$x.limits[[2]] <- right_range
histleft <- hist(xvals1, breaks = left_bins)
histright <- hist(xvals2, breaks = right_bins)
group_factor <- 100 * length(p1$condlevels[[1]])
p1$y.limits[[1]][2] <- group_factor * max(histleft$counts) / length(xvals1)
p1$y.limits[[2]][2] <- group_factor * max(histright$counts) / length(xvals2)
}
return(p1)
}
So with your example, we can do this:
p1 <- histogram(~x|v1, d, groups=v2, nint=30,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
fix_facets(p1)
and to show it works with other numbers of bins...
p1 <- histogram(~x|v1, d, groups=v2, nint=10,
scales=list(relation='free'), type='percent',
panel = function(...) {
panel.superpose(..., panel.groups=panel.histogram,
col=c('red', 'blue'), alpha=0.3)
})
fix_facets(p1)

Related

Extend axis limits without plotting (in order to align two plots by x-unit)

I am trying to combine two ggplot objects with patchwork - two plots with different subsets of data, but the same x variable (and therefore same unit). I would like to align the plots according to the x values - Each x unit should have the same physical width in the final plot.
This is very easy when actually plotting the entire width of the larger data set (see plot below) - but I struggle to plot only parts of the data and keeping the same alignment.
library(ggplot2)
library(patchwork)
library(dplyr)
p1 <-
ggplot(mtcars, aes(mpg)) +
geom_density(trim = TRUE) +
scale_x_continuous(limits = c(10,35))
p2 <-
ggplot(filter(mtcars, mpg < 20), aes(mpg)) +
geom_histogram(binwidth = 1, boundary = 1) +
scale_x_continuous(limits = c(10,35))
p1/p2
Created on 2019-08-07 by the reprex package (v0.3.0)
The desired output
That's photoshopped
adding coord_cartesian(xlim = c(10,(20 or 35)), clip = 'off'), and/or changing scale_x limits to c(0,(20 or 35)) doesn't work.
patchwork also won't let me set the widths of both plots when they are in two rows, which makes sense in a way. So I could create an empty plot for the second row and set the widths for those, but this seems a terrible hack and I feel there must be a much easier solution.
I am not restricted to patchwork, but any solution allowing to use it would be very welcome.
I modified the align_plots function from the cowplot package for this, so that its plot_grid function can now support adjustments to the dimensions of each plot.
(The main reason I went with cowplot rather than patchwork is that I haven't had much tinkering experience with the latter, and overloading common operators like + makes me slightly nervous.)
Demonstration of results
# x / y axis range of p1 / p2 have been changed for illustration purpose
p1 <- ggplot(mtcars, aes(mpg, 1 + stat(count))) +
geom_density(trim = TRUE) +
scale_x_continuous(limits = c(10,35)) +
coord_cartesian(ylim = c(1, 3.5))
p2 <- ggplot(filter(mtcars, mpg >= 15 & mpg < 30), aes(mpg)) +
geom_histogram(binwidth = 1, boundary = 1)
plot_grid(p1, p2, ncol = 1, align = "v") # plots in 1 column, x-axes aligned
plot_grid(p1, p2, nrow = 1, align = "h") # plots in 1 row, y-axes aligned
Plots in 1 column (x-axes aligned for 15-28 range):
Plots in 1 row (y-axes aligned for 1 - 3.5 range):
Caveats
This hack assumes the plots that the user intends to align (either horizontally or vertically) have reasonably similar axes of comparable magnitude. I haven't tested it on more extreme cases.
This hack expects simple non-faceted plots in Cartesian coordinates. I'm not sure what one could expect from aligning faceted plots. Similarly, I'm not considering polar coordinates (what's there to align?) or map projections (haven't looked into this, but they feel rather complicated).
This hack expects the gtable cell containing the plot panel to be in the 7th row / 5th column of the gtable object, which is based on my understanding of how ggplot objects are typically converted to gtables, and may not survive changes to the underlying code.
Code
Modified version of cowplot::align_plots:
align_plots_modified <- function (..., plotlist = NULL, align = c("none", "h", "v", "hv"),
axis = c("none", "l", "r", "t", "b", "lr", "tb", "tblr"),
greedy = TRUE) {
plots <- c(list(...), plotlist)
num_plots <- length(plots)
grobs <- lapply(plots, function(x) {
if (!is.null(x)) as_gtable(x)
else NULL
})
halign <- switch(align[1], h = TRUE, vh = TRUE, hv = TRUE, FALSE)
valign <- switch(align[1], v = TRUE, vh = TRUE, hv = TRUE, FALSE)
vcomplex_align <- hcomplex_align <- FALSE
if (valign) {
# modification: get x-axis value range associated with each plot, create union of
# value ranges across all plots, & calculate the proportional width of each plot
# (with white space on either side) required in order for the plots to align
plot.x.range <- lapply(plots, function(x) ggplot_build(x)$layout$panel_params[[1]]$x.range)
full.range <- range(plot.x.range)
plot.x.range <- lapply(plot.x.range,
function(x) c(diff(c(full.range[1], x[1]))/ diff(full.range),
diff(x)/ diff(full.range),
diff(c(x[2], full.range[2]))/ diff(full.range)))
num_widths <- unique(lapply(grobs, function(x) {
length(x$widths)
}))
num_widths[num_widths == 0] <- NULL
if (length(num_widths) > 1 || length(grep("l|r", axis[1])) > 0) {
vcomplex_align = TRUE
warning("Method not implemented for faceted plots. Placing unaligned.")
valign <- FALSE
}
else {
max_widths <- list(do.call(grid::unit.pmax,
lapply(grobs, function(x) {x$widths})))
}
}
if (halign) {
# modification: get y-axis value range associated with each plot, create union of
# value ranges across all plots, & calculate the proportional width of each plot
# (with white space on either side) required in order for the plots to align
plot.y.range <- lapply(plots, function(x) ggplot_build(x)$layout$panel_params[[1]]$y.range)
full.range <- range(plot.y.range)
plot.y.range <- lapply(plot.y.range,
function(x) c(diff(c(full.range[1], x[1]))/ diff(full.range),
diff(x)/ diff(full.range),
diff(c(x[2], full.range[2]))/ diff(full.range)))
num_heights <- unique(lapply(grobs, function(x) {
length(x$heights)
}))
num_heights[num_heights == 0] <- NULL
if (length(num_heights) > 1 || length(grep("t|b", axis[1])) > 0) {
hcomplex_align = TRUE
warning("Method not implemented for faceted plots. Placing unaligned.")
halign <- FALSE
}
else {
max_heights <- list(do.call(grid::unit.pmax,
lapply(grobs, function(x) {x$heights})))
}
}
for (i in 1:num_plots) {
if (!is.null(grobs[[i]])) {
if (valign) {
grobs[[i]]$widths <- max_widths[[1]]
# modification: change panel cell's width to a proportion of unit(1, "null"),
# then add whitespace to the left / right of the plot's existing gtable
grobs[[i]]$widths[[5]] <- unit(plot.x.range[[i]][2], "null")
grobs[[i]] <- gtable::gtable_add_cols(grobs[[i]],
widths = unit(plot.x.range[[i]][1], "null"),
pos = 0)
grobs[[i]] <- gtable::gtable_add_cols(grobs[[i]],
widths = unit(plot.x.range[[i]][3], "null"),
pos = -1)
}
if (halign) {
grobs[[i]]$heights <- max_heights[[1]]
# modification: change panel cell's height to a proportion of unit(1, "null"),
# then add whitespace to the bottom / top of the plot's existing gtable
grobs[[i]]$heights[[7]] <- unit(plot.y.range[[i]][2], "null")
grobs[[i]] <- gtable::gtable_add_rows(grobs[[i]],
heights = unit(plot.y.range[[i]][1], "null"),
pos = -1)
grobs[[i]] <- gtable::gtable_add_rows(grobs[[i]],
heights = unit(plot.y.range[[i]][3], "null"),
pos = 0)
}
}
}
grobs
}
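The proportional-width arithmetic inside the function can be checked in isolation. The sketch below is only an illustration: the helper name `range_fractions` and the two axis ranges are made up, and it reproduces the three fractions (space before the panel, the panel itself, space after it) that the modified code assigns as "null" units.

```r
# Hypothetical helper mirroring the arithmetic in align_plots_modified():
# fractions of the combined axis range taken up by the space before the
# plot, the plot itself, and the space after it.
range_fractions <- function(plot_range, full_range) {
  c(before = plot_range[1] - full_range[1],
    panel  = diff(plot_range),
    after  = full_range[2] - plot_range[2]) / diff(full_range)
}

# Made-up axis ranges for two plots
ranges <- list(p1 = c(10, 35), p2 = c(15, 28))
full_range <- range(unlist(ranges))
fractions <- lapply(ranges, range_fractions, full_range = full_range)

fractions$p2  # before = 0.20, panel = 0.52, after = 0.28
```

The three fractions of each plot sum to 1, which is why panels with a narrower axis range end up occupying a proportionally narrower cell.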
Utilising the above modified function with cowplot package's plot_grid:
# To start using (in current R session only; effect will not carry over to subsequent session)
trace(cowplot::plot_grid, edit = TRUE)
# In the pop-up window, change `grobs <- align_plots(...)` (at around line 27) to
# `grobs <- align_plots_modified(...)`
# To stop using
untrace(cowplot::plot_grid)
(Alternatively, we can define a modified version of plot_grid function that uses align_plots_modified instead of cowplot::align_plots. Results would be the same either way.)
Here is an option with grid.arrange that does not use a blank plot, but requires manual adjustment of:
plot margin
x axis expansion
number of decimal places in y axis labels
library(ggplot2)
library(dplyr)
library(gridExtra)
p1 <-
ggplot(mtcars, aes(mpg)) +
geom_density(trim = TRUE) +
scale_x_continuous(limits = c(10,35), breaks=seq(10,35,5), expand = expand_scale(add=c(0,0)))
p2 <-
ggplot(filter(mtcars, mpg < 20), aes(mpg)) +
geom_histogram(binwidth = 1, boundary = 1) +
scale_x_continuous(limits = c(10,20), breaks=seq(10,20,5), expand = expand_scale(add=c(0,0))) +
scale_y_continuous(labels = scales::number_format(accuracy = 0.01)) +
theme(plot.margin = unit(c(0,1,0,0), "cm"))
grid.arrange(p1, p2,
layout_matrix = rbind(c(1, 1), c(2, NA))
)
Should make this plot:

incorporate standalone legend in ggpairs (take 2)

tl;dr can't get a standalone legend (describing common colours across the whole plot) in ggpairs to my satisfaction.
Sorry for length.
I'm trying to draw a (lower-triangular) pairs plot using GGally::ggpairs (an extension package for drawing various kinds of plot matrices with ggplot2). This is essentially the same question as How to add an external legend to ggpairs()? , but I'm not satisfied with the answer to that question aesthetically, so I'm posting this as an extension (if suggested/recommended by commenters, I will delete this question and offer a bounty on that question instead). In particular, I would like the legend to appear outside the sub-plot frame, either putting it within one virtual subplot but allowing additional width to hold it, or (ideally) putting it in a separate (empty) subplot. As I show below, both of my partial solutions have problems.
Fake data:
set.seed(101)
dd <- data.frame(x=rnorm(100),
y=rnorm(100),
z=rnorm(100),
f=sample(c("a","b"),size=100,replace=TRUE))
library(GGally)
Base plot function:
ggfun <- function(...) {
ggpairs(dd,mapping = ggplot2::aes(color = f),
columns=1:3,
lower=list(continuous="points"),
diag=list(continuous="blankDiag"),
upper=list(continuous="blank"),
...)
}
Function to trim top/right column:
trim_gg <- function(gg) {
n <- gg$nrow
gg$nrow <- gg$ncol <- n-1
v <- 1:n^2
gg$plots <- gg$plots[v>n & v%%n!=0]
gg$xAxisLabels <- gg$xAxisLabels[-n]
gg$yAxisLabels <- gg$yAxisLabels[-1]
return(gg)
}
gg0 <- trim_gg(ggfun(legends=TRUE))
Get rid of legends in left column (as in the linked question above):
library(ggplot2) ## for theme()
for (i in 1:2) {
inner <- getPlot(gg0,i,1)
inner <- inner + theme(legend.position="none")
gg0 <- putPlot(gg0,inner,i,1)
}
inner <- getPlot(gg0,2,2)
inner <- inner + theme(legend.position="right")
gg0 <- putPlot(gg0,inner,2,2)
Problems:
the blank panel behind the legend is actually masking some points; I don't know why it's not outside the panel as usual, I assume that's something that ggpairs is doing
if it were outside the panel (on top or to the right), I would want to make sure to leave some extra space so the panels themselves were all the same size. However, ggmatrix/ggpairs looks very inflexible about this.
The only alternative I've been able to try so far is following ggplot separate legend and plot by extracting the legend and using gridExtra::grid.arrange():
g_legend <- function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend)
}
library(gridExtra)
grid.arrange(getPlot(gg0,1,1),
g_legend(getPlot(gg0,2,2)),
getPlot(gg0,2,1),
getPlot(gg0,2,2)+theme(legend.position="none"),
nrow=2)
Problems:
the axes and labels suppressed by ggpairs are back ...
I also considered creating a panel with a special plot that contained only the legend (i.e. trying to use theme(SOMETHING=element_blank()) to suppress the plot itself), but couldn't figure out how to do it.
As a last resort, I could trim the axes where appropriate myself, but this is practically reinventing what ggpairs is doing in the first place ...
With some slight modification to solution 1: First, draw the matrix of plots without their legends (but still with the colour mapping). Second, use your trim_gg function to remove the diagonal spaces. Third, for the plot in the top left position, draw its legend but position it into the empty space to the right.
data(state)
dd <- data.frame(state.x77,
State = state.name,
Abbrev = state.abb,
Region = state.region,
Division = state.division)
columns <- c(3, 5, 6, 7)
colour <- "Region"
library(GGally)
library(ggplot2) ## for theme()
# Base plot
ggfun <- function(data = NULL, columns = NULL, colour = NULL, legends = FALSE) {
ggpairs(data,
columns = columns,
mapping = ggplot2::aes_string(colour = colour),
lower = list(continuous = "points"),
diag = list(continuous = "blankDiag"),
upper = list(continuous = "blank"),
legends = legends)
}
# Remove the diagonal elements
trim_gg <- function(gg) {
n <- gg$nrow
gg$nrow <- gg$ncol <- n-1
v <- 1:n^2
gg$plots <- gg$plots[v > n & v%%n != 0]
gg$xAxisLabels <- gg$xAxisLabels[-n]
gg$yAxisLabels <- gg$yAxisLabels[-1]
return(gg)
}
# Get the plot
gg0 <- trim_gg(ggfun(dd, columns, colour))
# For plot in position (1,1), draw its legend in the empty panels to the right
inner <- getPlot(gg0, 1, 1)
inner <- inner +
theme(legend.position = c(1.01, 0.5),
legend.direction = "horizontal",
legend.justification = "left") +
guides(colour = guide_legend(title.position = "top"))
gg0 <- putPlot(gg0, inner, 1, 1)
gg0
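The index arithmetic in trim_gg can be inspected without GGally. This is a small base R sketch under the assumption that plot positions are counted row by row, which is what the `v > n & v %% n != 0` test implies; the variable names are illustrative.

```r
n <- 3                       # a 3 x 3 plot matrix
v <- seq_len(n^2)            # plot positions, counted row by row
keep <- v > n & v %% n != 0  # drop the first row and the last column

# Lay the positions out on the grid and blank the dropped ones
kept_layout <- matrix(ifelse(keep, v, NA), nrow = n, byrow = TRUE)
kept_layout

which(keep)  # 4 5 7 8
```

The surviving positions form the lower-left (n-1) x (n-1) block, which is why the x labels lose their last entry and the y labels their first.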

xyplot bottom axis when last row has fewer panels than columns

Consider a lattice xyplot that has relation='fixed', alternating=FALSE, and as.table=TRUE.
If the last row of panels is incomplete (i.e. there are fewer panels than columns of the layout), the x-axis is not plotted. For example, panel 4 in the plot below does not have x-axis ticks/labels.
library(lattice)
d <- data.frame(x=runif(100), y=runif(100), grp=gl(5, 20))
xyplot(y~x|grp, d, as.table=TRUE, scales=list(alternating=FALSE, tck=c(1, 0)))
How can I add that axis?
Ideally I want axes only at bottom and left sides, and the incomplete row of panels at bottom (unlike when using as.table=FALSE, which plots the incomplete row at the top). For the example above, I'd like the axis plotted on the bottom border of panel 4, rather than in line with the x-axis of panel 5.
I know that this is easily solved with, e.g., a base graphics approach. I'm specifically interested in a lattice solution.
I am not a lattice expert, but I believe this might work. The idea was originally posted here. First I will regenerate the example:
library(lattice)
set.seed(1)
d <- data.frame(x=runif(100), y=runif(100), grp=gl(5, 20))
Next, let's define a function that will control the panel settings:
trellis.par.set(clip = list(panel = "off"))
myPan <- function(...){
  panel.xyplot(...)
  # draw the bottom axis only for the panels at the foot of each column
  if(panel.number() %in% c(4, 5)) {
    at <- seq(0, 1, by = 0.2)
    panel.axis("bottom", at = at, outside = TRUE,
               labels = TRUE, half = FALSE)
  }
}
Now to the plot:
xyplot(y~x|grp, d, as.table=TRUE,
scales = list(
x = list(draw = F, relation="same"),
y = list(tck=c(1,0), alternating=F)),
layout = c(2,3),
panel = myPan)
As can be seen, the xyplot call suppresses the x-axis (draw = F), and the panel function myPan then draws it explicitly, but only for panels 4 and 5.
output
Hope it can give you some direction for improvements.
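The panel numbers 4 and 5 are specific to five panels in two columns. As a rough sketch (the helper `bottom_panels` is hypothetical, not a lattice function), the panels sitting at the foot of their column can be worked out from the panel count and the number of columns, assuming panels fill the layout row by row from the top as with as.table=TRUE:

```r
# Hypothetical helper: panel numbers that sit at the bottom of their column
# when panels fill the layout row by row from the top (as.table = TRUE).
bottom_panels <- function(n_panels, n_col) {
  idx <- seq_len(n_panels)
  # a panel is at the bottom if no panel exists one row further down
  idx[idx + n_col > n_panels]
}

bottom_panels(5, 2)  # 4 5
bottom_panels(5, 4)  # 2 3 4 5
```

Inside a panel function, a test such as `panel.number() %in% bottom_panels(5, 2)` could then stand in for the hard-coded panel numbers.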
Here's another approach based on code provided in a (now deleted) answer by @user20650. It uses grid directly, focussing on panels of the active trellis plot that are missing axes (or at least assumed to be missing axes), and adding them. We also assume that the x-scale is fixed.
The function (which also exists as a gist here):
add_axes <- function() {
library(grid)
library(lattice)
l <- trellis.currentLayout()
pan <- which(l[nrow(l), ]==0)
if(length(pan) > 0) {
g <- grid.ls(print=FALSE)
# use an existing panel as a template for ticks
ticks <- grid.get(g$name[grep("ticks.bottom.panel", g$name)][[1]])
# use an existing panel as a template for labels
labels <- grid.get(g$name[grep("ticklabels.bottom.panel", g$name)][[1]])
ax <- grobTree(ticks, labels)
invisible(sapply(pan, function(x) {
trellis.focus("panel", x, nrow(l)-1, clip.off=TRUE)
grid.draw(ax)
trellis.unfocus()
}))
}
}
An example:
library(lattice)
d <- data.frame(x=runif(100), y=runif(100), grp=gl(5, 20))
xyplot(y~x|grp, d, as.table=TRUE, scales=list(tck=c(1,0), alternating=FALSE),
layout=c(4, 2), xlim=c(-0.1, 1.1))
add_axes()

Stacked histograms like in flow cytometry

I'm trying to use ggplot or base R to produce something like the following:
I know how to do histograms with ggplot2, and can easily separate them using facet_grid or facet_wrap. But I'd like to "stagger" them vertically, such that they have some overlap, as shown below. Sorry, I'm not allowed to post my own image, and it's quite difficult to find a simpler picture of what I want. If I could, I would only post the top-left panel.
I understand that this is not a particularly good way to display data -- but that decision does not rest with me.
A sample dataset would be as follows:
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) , cbind( rnorm(1e3)+2, 2), cbind( rnorm(1e3)+3, 3), cbind( rnorm(1e3)+4, 4)))
And I can plot it with geom_histogram as follows:
ggplot(my.data) + geom_histogram(aes(x=V1,fill=as.factor(V2))) + facet_grid( V2~.)
But I'd like the y-axes to overlap.
require(ggplot2)
require(plyr)
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) , cbind( rnorm(1e3)+2, 2), cbind( rnorm(1e3)+3, 3), cbind( rnorm(1e3)+4, 4)))
my.data$V2=as.factor(my.data$V2)
calculate the density depending on V2
res <- dlply(my.data, .(V2), function(x) density(x$V1))
dd <- ldply(res, function(z){
data.frame(Values = z[["x"]],
V1_density = z[["y"]],
V1_count = z[["y"]]*z[["n"]])
})
add an offset depending on V2
dd$offset <- -as.numeric(dd$V2)*0.2 # adapt the 0.2 value as you need
dd$V1_density_offset <- dd$V1_density + dd$offset
and plot
ggplot(dd, aes(Values, V1_density_offset, color=V2)) +
geom_line()+
geom_ribbon(aes(Values, ymin=offset, ymax=V1_density_offset, fill=V2), alpha=0.3)+
scale_y_continuous(breaks=NULL)
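The same density-plus-offset table can be built without plyr. This is a base R sketch with made-up column names (`baseline`, `density_offset`) and a fixed seed; it is not the code from the answer, only the same idea of shifting each group's density curve down by a constant step.

```r
set.seed(1)
my.data <- data.frame(
  V1 = c(rnorm(1e3), rnorm(1e3) + 2, rnorm(1e3) + 3, rnorm(1e3) + 4),
  V2 = factor(rep(1:4, each = 1e3))
)

step <- 0.2  # vertical distance between baselines; adapt as needed
groups <- levels(my.data$V2)

dd <- do.call(rbind, lapply(seq_along(groups), function(i) {
  d <- density(my.data$V1[my.data$V2 == groups[i]])
  offset <- -step * i  # each group sits one step lower than the previous
  data.frame(V2 = groups[i], Values = d$x,
             baseline = offset, density_offset = d$y + offset)
}))
```

Each group contributes one baseline and one shifted curve, so a ribbon between `baseline` and `density_offset` gives the staggered look.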
densityplot() from bioconductor flowViz package is one option for stacked densities.
from: http://www.bioconductor.org/packages/release/bioc/manuals/flowViz/man/flowViz.pdf :
For flowSets the idea is to horizontally stack plots of density estimates for all frames in the
flowSet for one or several flow parameters. In the latter case, each parameter will be plotted
in a separate panel, i.e., we implicitely condition on parameters.
you can see example visuals here:
http://www.bioconductor.org/packages/release/bioc/vignettes/flowViz/inst/doc/filters.html
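if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("flowViz") # BiocManager replaces the retired biocLite() installer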
Using the ggridges package:
library(ggridges)
ggplot(my.data, aes(x = V1, y = factor(V2), fill = factor(V2), color = factor(V2))) +
geom_density_ridges(alpha = 0.5)
I think it's going to be difficult to get ggplot to offset the histograms like that. At least with faceting it makes new panels, and really, this transformation makes the y-axis meaningless. (The value is in the comparison from row to row). Here's one attempt at using base graphics to try to accomplish a similar thing.
#plotting function
plotoffsethists <- function(vals, groups, freq=F, overlap=.25, alpha=.75, colors=apply(floor(rbind(col2rgb(scales:::hue_pal(h = c(0, 360) + 15, c = 100, l = 65)(nlevels(groups))),alpha=alpha*255)),2,function(x) {paste0("#",paste(sprintf("%02X",x),collapse=""))}), ...) {
print(colors)
if (!is.factor(groups)) {
groups<-factor(groups)
}
offsethist <- function (x, col = NULL, offset=0, freq=F, ...) {
y <- if (freq) x$counts else x$density
nB <- length(x$breaks)
rect(x$breaks[-nB], 0+offset, x$breaks[-1L], y+offset, col = col, ...)
}
hh<-tapply(vals, groups, hist, plot=F)
ymax<-if(freq)
sapply(hh, function(x) max(x$counts))
else
sapply(hh, function(x) max(x$density))
offset<-(mean(ymax)*overlap) * (length(ymax)-1):0
ylim<-range(c(0,ymax+offset))
xlim<-range(sapply(hh, function(x) range(x$breaks)))
plot.new()
plot.window(xlim, ylim, "")
box()
axis(1)
Map(offsethist, hh, colors, offset, freq=freq, ...)
invisible(hh)
}
#sample call
par(mar=c(3,1,1,1)+.1)
plotoffsethists(my.data$V1, factor(my.data$V2), overlap=.25)
Complementing Axeman's answer, you can add the option stat="binline" to the geom_density_ridges geom. This results in a histogram-like plot instead of a density line.
library(ggplot2)
library(ggridges)
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) ,
cbind( rnorm(1e3)+2, 2),
cbind( rnorm(1e3)+3, 3),
cbind( rnorm(1e3)+4, 4)))
my.data$V2 <- as.factor(my.data$V2)
ggplot(my.data, aes(x=V1, y=factor(V2), fill=factor(V2))) +
geom_density_ridges(alpha=0.6, stat="binline", bins=30)
Resulting image:

scatter plot specifying color and labelling axis in r

I have following data and plot:
pos <- rep(1:2000, 20)
xv =c(rep(1:20, each = 2000))
# colrs <- unique(xv)
colrs <- xv # edits
yv =rnorm(2000*20, 0.5, 0.1)
xv = lapply(unique(xv), function(x) pos[xv==x])
to.add = cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
plot (bp,yv, pch = "*", col = colrs)
I have few issues in this plot I could not figure out.
(1) I want to use a different color for each group (i.e. each value of xv), or two colors alternating between groups, but when I tried the col argument the result turned out to be a mixture of colors. I also need to highlight some points (for example, bp 4000 to 4500 in blue).
(2) Instead of bp positions I want to put a tick mark and label with the group.
Thank you, appreciate your help.
Edits: with the help of the following answer (using a slightly different approach that also works when the groups have unequal numbers of points) I could get a similar plot. One question about colors remains: what if I want to use two alternating colors for alternate groups?
You can solve your colour issue by repeating the colour index however many times each group has a point plotted, like so:
plot (bp,yv, pch = "*", col = rep(colrs,each=2000))
The default colour palette (see ?palette or palette() ) will wrap around itself and you might want to specify your own to get 20 distinct colours.
To relabel the x axis, try plotting without the axis and then specifying the points and labels manually.
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20)
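The tick positions can also be derived from the data instead of being typed in, which keeps working when the groups differ in size. The following is a sketch that rebuilds the positions from the question in a simplified way; `grp`, `shift` and `tick_at` are illustrative names.

```r
# Rebuild the x positions from the question: 20 groups of 2000 points,
# each group shifted 3000 units to the right of the previous one
grp <- rep(1:20, each = 2000)
pos <- rep(1:2000, 20)
shift <- c(0, cumsum(rep(2000 + 1000, 19)))
bp <- pos + shift[grp]

# One tick per group, at the centre of that group's x range
tick_at <- sapply(split(bp, grp), function(v) mean(range(v)))
head(tick_at, 3)
```

After plotting with `xaxt="n"`, a call such as `axis(1, at = tick_at, labels = names(tick_at))` would place one label under each group.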
If you are trying to squeeze a lot of labels in there, you might have to shrink the text (cex.axis)or spin the labels 90 degrees (las=2).
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20,cex.axis=0.7,las=2)
Result:
One way is you could use a nested ifelse.
I'm still learning R, but one way it could be done would look something like:
plot(whatev$x, whatev$y, col=ifelse(xv<2000,"red",ifelse(2000<=xv & xv<4000,"yellow","blue")))
You could nest as many of these as you want to have specificity on the colors and the intervals. The ifelse command is of form ifelse(TEST, True, False).
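For the two specific requests in the question (alternating colours by group, plus a highlighted range), a single ifelse and a logical index may be enough. This is a sketch with illustrative variable names and colour choices, using a simplified reconstruction of the question's positions:

```r
grp <- rep(1:20, each = 2000)              # group index per point
bp  <- rep(1:2000, 20) + 3000 * (grp - 1)  # x positions, as in the question

# Two alternating colours: odd groups grey, even groups black
point_col <- ifelse(grp %% 2 == 1, "grey50", "black")

# Highlight a range of x positions on top of the alternating scheme
point_col[bp >= 4000 & bp <= 4500] <- "blue"

table(point_col)
```

The resulting vector has one colour per point, so it can be passed directly as `col = point_col` in the plot call.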
A simpler way would be to use the unique groups in xv to assign rainbow colors.
colrs=rainbow(length(unique(xv))) #Or colrs=rainbow(length(xv)) if xv is unique.
plot(whatev$x, whatev$y, col=colrs)
I hope I got all that right. I'm still learning R myself.
I'm going to go out on a limb and guess that your real data are something like 2000 values of things from 20 different groups. For instance, heights of 2000 plants of 20 different species. In such a case, you might want to look at the dotplot() function (or as illustrated below, dotplot.table()) in the lattice package.
Generate matrix of hypothetical values:
set.seed(1)
myY <- sapply( seq_len(20), function(x) rnorm(2000, x^(1/3)))
Transpose matrix to get groups as rows
myY <- t(myY)
Provide names of groups to matrix:
dimnames(myY)[[1]]<-paste("group", seq_len(nrow(myY)))
Load lattice package
library(lattice)
Generate dotplot
dotplot(myY, horizontal = FALSE, panel = function(x, y, horizontal, ...) {
panel.dotplot(x = x, y = y, horizontal = horizontal, jitter.x = TRUE,
col = seq_len(20)[x], pch = "*", cex = 1.5)
}, scales = list(x = list(rot = 90))
)
Which looks like (with unfortunate y-axis labeling):
Seeing that @JohnCLK is requesting a way of colouring by values on the x axis, I tried these demos in ggplot2. Each uses a dummy variable that is coded based on values or ranges to be highlighted in the other variables.
So, first set up the data, as in the question:
pos <- rep(1:2000, 20)
xv <- c(rep(1:20, each = 2000))
yv <- rnorm(2000*20, 0.5, 0.1)
xv <- lapply(unique(xv), function(x) pos[xv==x])
to.add <- cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
Then load ggplot2, prepare a couple of utility functions, and set the default theme:
library("ggplot2")
make.png <- function(p, fName) {
png(fName, width=640, height=480, units="px")
print(p)
dev.off()
}
make.plot <- function(df) {
p <- ggplot(df,
aes(x = bp,
y = yv,
colour = highlight))
p <- p + geom_point()
p <- p + theme(legend.position = "none") # opts() has been removed from ggplot2
return(p)
}
theme_set( theme_bw() )
Draw a plot which highlights values in a defined range on the vertical axis:
# highlight a horizontal band
df <- data.frame(cbind(bp, yv))
df$highlight <- 0
df$highlight[ df$yv >= 0.4 & df$yv < 0.45 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_horizontal.png")
Next draw a plot which highlights values in a defined range on the x axis, a vertical band:
# highlight a vertical band
df$highlight <- 0
df$highlight[ df$bp >= 38000 & df$bp < 42000 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_vertical.png")
And finally draw a plot which highlights alternating vertical bands, by x value:
# highlight alternating bands
library("gtools")
alt.band.width <- 2000
df$highlight <- as.integer(df$bp / alt.band.width)
df$highlight <- ifelse(odd(df$highlight), 1, 0)
p <- make.plot(df)
print(p)
make.png(p, "demo_alternating.png")
Hope this helps; it was good practice anyway.
