Stacked histograms like in flow cytometry

I'm trying to use ggplot or base R to produce something like the following:
I know how to do histograms with ggplot2, and can easily separate them using facet_grid or facet_wrap. But I'd like to "stagger" them vertically, such that they have some overlap, as shown below. Sorry, I'm not allowed to post my own image, and it's quite difficult to find a simpler picture of what I want. If I could, I would only post the top-left panel.
I understand that this is not a particularly good way to display data -- but that decision does not rest with me.
A sample dataset would be as follows:
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) , cbind( rnorm(1e3)+2, 2), cbind( rnorm(1e3)+3, 3), cbind( rnorm(1e3)+4, 4)))
And I can plot it with geom_histogram as follows:
ggplot(my.data) + geom_histogram(aes(x=V1,fill=as.factor(V2))) + facet_grid( V2~.)
But I'd like the y-axes to overlap.

require(ggplot2)
require(plyr)
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) , cbind( rnorm(1e3)+2, 2), cbind( rnorm(1e3)+3, 3), cbind( rnorm(1e3)+4, 4)))
my.data$V2=as.factor(my.data$V2)
Calculate the density for each level of V2:
res <- dlply(my.data, .(V2), function(x) density(x$V1))
dd <- ldply(res, function(z){
data.frame(Values = z[["x"]],
V1_density = z[["y"]],
V1_count = z[["y"]]*z[["n"]])
})
Add an offset that depends on V2:
dd$offset <- -as.numeric(dd$V2) * 0.2 # adapt the 0.2 value as you need
dd$V1_density_offset <- dd$V1_density + dd$offset
And plot:
ggplot(dd, aes(Values, V1_density_offset, color = V2)) +
geom_line() +
geom_ribbon(aes(Values, ymin = offset, ymax = V1_density_offset, fill = V2), alpha = 0.3) +
scale_y_continuous(breaks = NULL)
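For readers who would rather avoid the plyr dependency, the same offset-density idea can be sketched in base R. This is only an illustration under assumptions: the names vals, grp, step, offs and tops are made up here, and the 0.2 step is the same arbitrary value used above.

```r
# Base R sketch: one density per group, each curve shifted down by a fixed step.
set.seed(1)
vals <- c(rnorm(1e3), rnorm(1e3) + 2, rnorm(1e3) + 3, rnorm(1e3) + 4)
grp  <- factor(rep(1:4, each = 1e3))
step <- 0.2                                   # assumed offset per group; tune as needed

dens <- lapply(split(vals, grp), density)     # one density estimate per level
offs <- -seq_along(dens) * step               # baseline of each curve
tops <- mapply(function(d, o) max(d$y) + o, dens, offs)

# Empty plot sized to hold every shifted curve; the y axis is hidden
# because the offsets make its values meaningless.
plot(NULL, xlim = range(sapply(dens, function(d) range(d$x))),
     ylim = range(offs, tops), xlab = "V1", ylab = "", yaxt = "n")
for (i in seq_along(dens)) {
  d <- dens[[i]]
  polygon(c(d$x, rev(d$x)),
          c(d$y + offs[i], rep(offs[i], length(d$x))),
          col = adjustcolor(i + 1, alpha.f = 0.3), border = i + 1)
}
```

Drawing the groups in order means later (lower) curves are painted over earlier ones, which gives the staggered look.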

densityplot() from the Bioconductor flowViz package is one option for stacked densities.
From http://www.bioconductor.org/packages/release/bioc/manuals/flowViz/man/flowViz.pdf :
For flowSets the idea is to horizontally stack plots of density estimates for all frames in the
flowSet for one or several flow parameters. In the latter case, each parameter will be plotted
in a separate panel, i.e., we implicitly condition on parameters.
You can see example visuals here:
http://www.bioconductor.org/packages/release/bioc/vignettes/flowViz/inst/doc/filters.html
To install flowViz (biocLite() has been replaced by BiocManager):
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("flowViz")

Using the ggridges package:
library(ggplot2)
library(ggridges)
ggplot(my.data, aes(x = V1, y = factor(V2), fill = factor(V2), color = factor(V2))) +
geom_density_ridges(alpha = 0.5)

I think it's going to be difficult to get ggplot to offset the histograms like that. At least with faceting it makes new panels, and really, this transformation makes the y-axis meaningless. (The value is in the comparison from row to row). Here's one attempt at using base graphics to try to accomplish a similar thing.
#plotting function
plotoffsethists <- function(vals, groups, freq=F, overlap=.25, alpha=.75, colors=apply(floor(rbind(col2rgb(scales:::hue_pal(h = c(0, 360) + 15, c = 100, l = 65)(nlevels(groups))),alpha=alpha*255)),2,function(x) {paste0("#",paste(sprintf("%02X",x),collapse=""))}), ...) {
print(colors)
if (!is.factor(groups)) {
groups<-factor(groups)
}
offsethist <- function (x, col = NULL, offset=0, freq=FALSE, ...) {
# bar heights: raw counts or densities
y <- if (freq) x$counts else x$density
nB <- length(x$breaks)
rect(x$breaks[-nB], 0+offset, x$breaks[-1L], y+offset, col = col, ...)
}
hh<-tapply(vals, groups, hist, plot=F)
ymax<-if(freq)
sapply(hh, function(x) max(x$counts))
else
sapply(hh, function(x) max(x$density))
offset<-(mean(ymax)*overlap) * (length(ymax)-1):0
ylim<-range(c(0,ymax+offset))
xlim<-range(sapply(hh, function(x) range(x$breaks)))
plot.new()
plot.window(xlim, ylim, "")
box()
axis(1)
Map(offsethist, hh, colors, offset, freq=freq, ...)
invisible(hh)
}
#sample call
par(mar=c(3,1,1,1)+.1)
plotoffsethists(my.data$V1, factor(my.data$V2), overlap=.25)

Complementing Axeman's answer, you can add the option stat="binline" to the geom_density_ridges geom. This results in a histogram-like plot instead of a density line.
library(ggplot2)
library(ggridges)
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) ,
cbind( rnorm(1e3)+2, 2),
cbind( rnorm(1e3)+3, 3),
cbind( rnorm(1e3)+4, 4)))
my.data$V2 <- as.factor(my.data$V2)
ggplot(my.data, aes(x=V1, y=factor(V2), fill=factor(V2))) +
geom_density_ridges(alpha=0.6, stat="binline", bins=30)

Related

I want to create the empirical cumulative distribution function for two samples and put the plots in the same plot [R]

I am using this code to generate the empirical cumulative distribution function for two samples (you can put any numerical values in them). I would like to put them in the same plot, but if you run the following commands everything overlaps badly [see picture 1]. Is there any way to make it look like picture 2? I also want the symbols to disappear so that each curve is a plain line, as in picture 2.
plot(ecdf(sample[,1]),pch = 1)
par(new=TRUE)
plot(ecdf(sample[,2]),pch = 2)
picture 1:https://www.dropbox.com/s/sg1fr8jydsch4xp/vanboeren2.png?dl=0
picture 2:https://www.dropbox.com/s/erhgla34y5bxa58/vanboeren1.png?dl=0
Update: I am doing this
df1 <- data.frame(x = sample[,1])
df2 <- data.frame(x = sample[,2])
ggplot(df1, aes(x, colour = "g")) + stat_ecdf() +
geom_step(data = df2) +
scale_x_continuous(limits = c(0, 5000))
which is very close (in terms of shape), but I still cannot put them in the same plot.

Try this with basic plot:
df1 <- data.frame(x = runif(200,1,5))
df2 <- data.frame(x = runif(200,3,8))
plot(ecdf(df1[,1]),pch = 1, xlim=c(0,10), main=NULL)
par(new=TRUE)
plot(ecdf(df2[,1]),pch = 2, xlim=c(0,10), main=NULL)
Both graphs have now the same xlim (try removing it to see both superimposed incorrectly). The main=NULL removes the title
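To also get rid of the symbols, as asked in the question, one option is to pass do.points = FALSE and draw the second ECDF with add = TRUE instead of par(new=TRUE). A minimal sketch, assuming two made-up samples a and b in place of the asker's data:

```r
set.seed(1)
a <- runif(200, 1, 5)   # hypothetical sample 1
b <- runif(200, 3, 8)   # hypothetical sample 2
Fa <- ecdf(a)
Fb <- ecdf(b)

# First ECDF sets up the axes; xlim covers both samples.
plot(Fa, do.points = FALSE, verticals = TRUE, col = "red",
     xlim = range(a, b), main = "ECDF of two samples")
# Second ECDF is drawn into the same coordinate system.
plot(Fb, do.points = FALSE, verticals = TRUE, col = "blue", add = TRUE)
legend("bottomright", legend = c("sample 1", "sample 2"),
       col = c("red", "blue"), lty = 1)
```

Because the second curve is added rather than overlaid, the axes are drawn only once.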

How to plot multiple columns at the same time? [duplicate]

I would like to plot y1 and y2 in the same plot.
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
plot(x, y1, type = "l", col = "red")
plot(x, y2, type = "l", col = "green")
But when I do it like this, they are not plotted in the same plot together.
In Matlab one can do hold on, but does anyone know how to do this in R?

lines() or points() will add to the existing graph, but will not create a new window. So you'd need to do
plot(x,y1,type="l",col="red")
lines(x,y2,col="green")

You can also use par and plot on the same graph but different axis. Something as follows:
plot( x, y1, type="l", col="red" )
par(new=TRUE)
plot( x, y2, type="l", col="green" )
If you read in detail about par in R, you will be able to generate really interesting graphs. Another book to look at is Paul Murrell's R Graphics.
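If the second curve really is meant to live on a different axis, the par(new=TRUE) approach usually needs the second set of axes suppressed and a right-hand axis drawn by hand. A rough sketch, where the rescaled y2 and the margin values are assumptions chosen only to make the two scales differ:

```r
x  <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- 100 * dnorm(x)                      # deliberately on a different scale

op <- par(mar = c(5, 4, 4, 4) + 0.1)      # extra room on the right
plot(x, y1, type = "l", col = "red", ylab = "y1")
par(new = TRUE)                           # next plot() draws over the same figure
plot(x, y2, type = "l", col = "green",
     axes = FALSE, xlab = "", ylab = "")  # no second set of axes or labels
axis(4)                                   # right-hand axis for y2
mtext("y2", side = 4, line = 3)
par(op)
```

Without axes = FALSE the tick labels of both plots are printed on top of each other.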

When constructing multilayer plots one should consider the ggplot2 package. The idea is to create a graphical object with basic aesthetics and enhance it incrementally.
ggplot2 requires the data to be packed in a data.frame.
# Data generation
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x,1,1)
df <- data.frame(x,y1,y2)
Basic solution:
require(ggplot2)
ggplot(df, aes(x)) + # basic graphical object
geom_line(aes(y=y1), colour="red") + # first layer
geom_line(aes(y=y2), colour="green") # second layer
Here the + operator is used to add extra layers to the basic object.
With ggplot you have access to the graphical object at every stage of plotting. Say, a usual step-by-step setup can look like this:
g <- ggplot(df, aes(x))
g <- g + geom_line(aes(y=y1), colour="red")
g <- g + geom_line(aes(y=y2), colour="green")
g
g produces the plot, and you can see it at every stage (well, after creation of at least one layer). Further enhancements of the plot are also made with the created object. For example, we can add labels for the axes:
g <- g + ylab("Y") + xlab("X")
g
UPDATE (2013-11-08):
As pointed out in comments, ggplot's philosophy suggests using data in long format.
You can refer to this answer in order to see the corresponding code.

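I think that the answer you are looking for is:
plot(first thing to plot)
plot(second thing to plot,add=TRUE)
Note that add=TRUE is only understood by some plot methods (for example curve, or plot on a function or a histogram); for plain x, y data use lines() or points() instead.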

Use the matplot function:
matplot(x, cbind(y1,y2),type="l",col=c("red","green"),lty=c(1,1))
use this if y1 and y2 are evaluated at the same x points. It scales the Y-axis to fit whichever is bigger (y1 or y2), unlike some of the other answers here that will clip y2 if it gets bigger than y1 (ggplot solutions mostly are okay with this).
Alternatively, and if the two lines don't have the same x-coordinates, set the axis limits on the first plot and add:
x1 <- seq(-2, 2, 0.05)
x2 <- seq(-3, 3, 0.05)
y1 <- pnorm(x1)
y2 <- pnorm(x2,1,1)
plot(x1,y1,ylim=range(c(y1,y2)),xlim=range(c(x1,x2)), type="l",col="red")
lines(x2,y2,col="green")
Am astonished this Q is 4 years old and nobody has mentioned matplot or x/ylim...

tl;dr: You want to use curve (with add=TRUE) or lines.
I disagree with par(new=TRUE) because that will double-print tick-marks and axis labels. Eg
The output of plot(sin); par(new=T); plot( function(x) x**2 ).
Look how messed up the vertical axis labels are! Since the ranges are different you would need to set ylim=c(lowest point between the two functions, highest point between the two functions), which is less easy than what I'm about to show you---and way less easy if you want to add not just two curves, but many.
What always confused me about plotting is the difference between curve and lines. (If you can't remember that these are the names of the two important plotting commands, just sing it.)
Here's the big difference between curve and lines.
curve will plot a function, like curve(sin). lines plots points with x and y values, like: lines( x=0:10, y=sin(0:10) ).
And here's a minor difference: curve needs to be called with add=TRUE for what you're trying to do, while lines already assumes you're adding to an existing plot.
Here's the result of calling plot(0:2); curve(sin).
Behind the scenes, check out methods(plot). And check body( plot.function )[[5]]. When you call plot(sin) R figures out that sin is a function (not y values) and uses the plot.function method, which ends up calling curve. So curve is the tool meant to handle functions.
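Put together, the distinction might look like the sketch below. The particular functions and colours are arbitrary choices for illustration.

```r
# curve() takes a function (or an expression in x) and starts a new plot.
res <- curve(sin, from = -pi, to = pi, col = "red")

# With add = TRUE, curve() draws into the existing plot instead.
curve(x^2 / 10, add = TRUE, col = "green")

# lines() takes x and y coordinates and always adds to the existing plot.
xs <- seq(-pi, pi, length.out = 25)
lines(xs, cos(xs), col = "blue")
```

curve() also returns the evaluated points invisibly, which is handy if you want to reuse them with lines() later.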

if you want to split the plot into two columns (2 plots next to each other), you can do it like this:
par(mfrow=c(1,2))
plot(x)
plot(y)

As described by @redmode, you may plot the two lines in the same graphical device using ggplot. In that answer the data were in a 'wide' format. However, when using ggplot it is generally most convenient to keep the data in a data frame in a 'long' format. Then, by using different 'grouping variables' in the aesthetics arguments, properties of the line, such as linetype or colour, will vary according to the grouping variable, and corresponding legends will appear.
In this case, we can use the colour aesthetic, which matches the colour of the lines to different levels of a variable in the data set (here: y1 vs y2). But first we need to melt the data from wide to long format, using e.g. the function 'melt' from the reshape2 package. Other methods to reshape the data are described here: Reshaping data.frame from wide to long format.
library(ggplot2)
library(reshape2)
# original data in a 'wide' format
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
df <- data.frame(x, y1, y2)
# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")
# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable)) + geom_line()

If you are using base graphics (i.e. not lattice/ grid graphics), then you can mimic MATLAB's hold on feature by using the points/lines/polygons functions to add additional details to your plots without starting a new plot. In the case of a multiplot layout, you can use par(mfg=...) to pick which plot you add things to.
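A small sketch of the multi-panel case follows. One assumption worth flagging: as far as I know, par(mfg=...) only selects the panel and does not bring back that panel's coordinate system, so the sketch saves par("usr") after the first plot and restores it before adding to it.

```r
x  <- seq(-2, 2, 0.05)
op <- par(mfrow = c(1, 2))

plot(x, pnorm(x), type = "l", col = "red", main = "panel 1")
usr1 <- par("usr")                       # remember panel 1's user coordinates

plot(x, dnorm(x), type = "l", col = "red", main = "panel 2")

par(mfg = c(1, 1))                       # make panel 1 the current panel again
par(usr = usr1)                          # restore its coordinate system
lines(x, pnorm(x, 1, 1), col = "green")  # "hold on"-style addition to panel 1

par(op)
```

If both panels happen to share the same axis limits, the par(usr = ...) step makes no visible difference.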

You can use points for the overplot, that is.
plot(x1, y1,col='red')
points(x2,y2,col='blue')

Idiomatic Matlab plot(x1,y1,x2,y2) can be translated in R with ggplot2 for example in this way:
x1 <- seq(1,10,.2)
df1 <- data.frame(x=x1,y=log(x1),type="Log")
x2 <- seq(1,10)
df2 <- data.frame(x=x2,y=cumsum(1/x2),type="Harmonic")
df <- rbind(df1,df2)
library(ggplot2)
ggplot(df)+geom_line(aes(x,y,colour=type))
Inspired by Tingting Zhao's Dual line plots with different range of x-axis Using ggplot2.

Rather than keeping the values to be plotted in a vector, store them in a matrix. By default the entire matrix is treated as one data set. However, if you pass as many values to a graphical argument such as col as the matrix has rows, the values are recycled down the columns, so each row ends up with its own colour. For example:
x = matrix( c(21,50,80,41), nrow=2 )
y = matrix( c(1,2,1,2), nrow=2 )
plot(x, y, col=c("red","blue"))
This should work unless your data sets are of differing sizes.

You could use the ggplotly() function from the plotly package to turn any of the ggplot2 examples here into an interactive plot, but I think this sort of plot is better without ggplot2:
# load plotly (no account or API key is needed for local plots)
library(plotly)
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
plot_ly(x = x) %>%
add_lines(y = y1, color = I("red"), name = "Red") %>%
add_lines(y = y2, color = I("green"), name = "Green")

You can also create your plot using ggvis:
library(ggvis)
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x,1,1)
df <- data.frame(x, y1, y2)
df %>%
ggvis(~x, ~y1, stroke := 'red') %>%
layer_paths() %>%
layer_paths(data = df, x = ~x, y = ~y2, stroke := 'blue')

Using plotly with a primary and a secondary y axis (this solution seemed to be missing):
library(plotly)
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
df=cbind.data.frame(x,y1,y2)
plot_ly(df) %>%
add_trace(x=~x,y=~y1,name = 'Line 1',type = 'scatter',mode = 'lines+markers',connectgaps = TRUE) %>%
add_trace(x=~x,y=~y2,name = 'Line 2',type = 'scatter',mode = 'lines+markers',connectgaps = TRUE,yaxis = "y2") %>%
layout(title = 'Title',
xaxis = list(title = "X-axis title"),
yaxis2 = list(side = 'right', overlaying = "y", title = 'secondary y axis', showgrid = FALSE, zeroline = FALSE))

We can also use the lattice library:
library(lattice)
x <- seq(-2,2,0.05)
y1 <- pnorm(x)
y2 <- pnorm(x,1,1)
xyplot(y1 + y2 ~ x, ylab = "y1 and y2", type = "l", auto.key = list(points = FALSE,lines = TRUE))
For specific colors:
xyplot(y1 + y2 ~ x, ylab = "y1 and y2", type = "l", auto.key = list(points = FALSE, lines = TRUE), par.settings = list(superpose.line = list(col = c("red","green"))))

Use curve for mathematical functions.
And use add=TRUE to use the same plot and axis.
curve( log2 , to=5 , col="black", ylab="log's(.)")
curve( log , add=TRUE , col="red" )
curve( log10, add=TRUE , col="blue" )
abline( h=0 )

Create a multicolor single line plot by attributes in R - project [duplicate]

For a list of n pairs of coordinates x,y is there a way of plotting the line between different points on a specific color?
The solution I've implemented so far is not to use the plot function but lines selecting the range for which I want the color. Here an example:
x <- 1:100
y <- rnorm(100,1,100)
plot(x,y ,type='n')
lines(x[1:50],y[1:50], col='red')
lines(x[50:60],y[50:60], col='black')
lines(x[60:100],y[60:100], col='red')
Is there an easier way of doing this?

Yes, one way of doing this is to use ggplot.
ggplot requires your data to be in data.frame format. In this data.frame I add a column col that indicates your desired colour. The plot is then constructed with ggplot, geom_line, and scale_colour_identity since the col variable is already a colour:
library(ggplot2)
df <- data.frame(
x = 1:100,
y = rnorm(100,1,100),
col = c(rep("red", 50), rep("black", 10), rep("red", 40))
)
ggplot(df, aes(x=x, y=y)) +
geom_line(aes(colour=col, group=1)) +
scale_colour_identity()
More generally, each line segment can be a different colour. In the next example I map colour to the x value, giving a plot that smoothly changes colour from blue to red:
df <- data.frame(
x = 1:100,
y = rnorm(100,1,100)
)
ggplot(df, aes(x=x, y=y)) + geom_line(aes(colour=x))
And if you insist on using base graphics, then use segments as follows:
df <- data.frame(
x = 1:100,
y = rnorm(100,1,100),
col = c(rep("red", 50), rep("black", 10), rep("red", 40))
)
plot(df$x, df$y, type="n")
for(i in 1:(length(df$x)-1)){
segments(df$x[i], df$y[i], df$x[i+1], df$y[i+1], col=df$col[i])
}

For @joran and other lattice fans...
library(lattice)
xyplot(y~x, data=df, panel=function(x,y,subscripts, groups, ...) {
for(k in seq_len(length(subscripts)-1)) {
i <- subscripts[k]
j <- subscripts[k+1]
panel.segments(df$x[i], df$y[i], df$x[j], df$y[j], col=df$col[i])
}
})
Unfortunately I don't know of a slick way of doing it, so it's basically wrapping the base solution into a panel function. The above works correctly when using a | to split by groups, for example, y~x|a, with an a variable as here:
df <- data.frame(
x = 1:100,
y = rnorm(100,1,100),
col = c(rep("red", 50), rep("black", 10), rep("red", 40)),
a = 1:2
)
To use group= as well, you'd need the following:
xyplot(y~x, group=a, data=df, panel=function(x,y,subscripts, groups, ...) {
if(missing(groups)) { groups <- rep(1, length(subscripts)) }
grps <- split(subscripts, groups)
for(grp in grps) {
for(k in seq_len(length(grp)-1)) {
i <- grp[k]
j <- grp[k+1]
panel.segments(df$x[i], df$y[i], df$x[j], df$y[j], col=df$col[i])
}
}
})

One-liner using just the base libraries (after setting up the plot with plot(x, y, type="n")):
segments(head(x, -1), head(y, -1), x[-1], y[-1], col = rep(c("red", "black", "red"), c(49, 10, 40)))
(inspired by Andrie's usage of segments, see his post and the discussion there)
Interestingly, it could be shortened to this:
segments(head(x, -1), head(y, -1), x[-1], y[-1], rep(c("red", "black"), c(49, 10)))
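The shorter call works because segments() recycles the colour vector over the 99 segments. A quick check of that claim, using rep_len() to mimic the recycling (the names short_cols and full_cols are just for this illustration):

```r
# 59 colours: 49 red followed by 10 black
short_cols <- rep(c("red", "black"), c(49, 10))

# Recycling to 99 segments wraps around to the start, i.e. 40 more reds
full_cols <- rep_len(short_cols, 99)

# Same vector as the explicit red/black/red specification
identical(full_cols, rep(c("red", "black", "red"), c(49, 10, 40)))
```

This only works out because the final run is red and no longer than the first one; for other patterns the explicit form is safer.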

If you want to set the color based on the y-values rather than the x-values, use plotrix::clplot . It's a fantastic, wonderful, superduper function. Disclaimer: I wrote it :-) . clplot() thus highlights regions of your data where y takes on specified ranges of values.
As a side note: you can expand on Chase's comment as:
plot(x,y,t='p', col=colorlist[some_function_of_x])
where colorlist is a vector of colors or colornames or whatever, and you pick an algorithm that matches your needs. The first of Andrie's plots could be done with
colorlist=c('red','black')
and
plot(x,y,t='p', col=colorlist[1+(abs(x-55)<=5)])

In base library, I don't think so (however, I cannot speak for ggplot etc.). Looking at the lines function and trying to supply col as a vector...: it doesn't work. I would do it the same way as you.
EDIT after discussion with Andrie and inspired by his post: you can use segments() to do it in one call, see the discussion there.

scatter plot specifying color and labelling axis in r

I have following data and plot:
pos <- rep(1:2000, 20)
xv =c(rep(1:20, each = 2000))
# colrs <- unique(xv)
colrs <- xv # edits
yv =rnorm(2000*20, 0.5, 0.1)
xv = lapply(unique(xv), function(x) pos[xv==x])
to.add = cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
plot (bp,yv, pch = "*", col = colrs)
I have few issues in this plot I could not figure out.
(1) I want to use a different color for each group, or two alternating colors for the groups (i.e. xv), but when I tried the color argument the result turned out to be a mixture of colors. I also need to highlight some points (for example, bp 4000 to 4500 in blue).
(2) Instead of bp positions I want to put a tick mark and label with the group.
Thank you, appreciate your help.
Edits: with the help of the following answer (using a slightly different approach that also works when the groups have unequal sizes) I could get a similar plot. One question about colors remains: what if I want to use two alternating colors for alternate groups?

You can solve your colour issue by repeating the colour index once for every point plotted in each group, like so (this assumes colrs <- unique(xv), i.e. one entry per group):
plot (bp,yv, pch = "*", col = rep(colrs,each=2000))
The default colour palette (see ?palette or palette() ) will wrap around itself and you might want to specify your own to get 20 distinct colours.
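As a sketch of what specifying your own colours could look like, the snippet below builds 20 distinct colours with rainbow() and, for the follow-up question about alternating groups, a two-colour scheme. The small grp vector and the colour names are made up for illustration.

```r
set.seed(1)
grp <- rep(1:20, each = 5)              # hypothetical group id for every point

# One distinct colour per group, looked up by group id
cols20 <- rainbow(20)[grp]

# Two alternating colours: odd groups get one, even groups the other
two <- c("grey30", "steelblue")[grp %% 2 + 1]

plot(seq_along(grp), rnorm(length(grp)), pch = "*", col = cols20)
```

Swapping col = cols20 for col = two gives the alternating version.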
To relabel the x axis, try plotting without the axis and then specifying the points and labels manually.
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20)
If you are trying to squeeze a lot of labels in there, you might have to shrink the text (cex.axis)or spin the labels 90 degrees (las=2).
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20,cex.axis=0.7,las=2)

One way is to use a nested ifelse.
I'm still learning R, but one way it could be done would look something like:
plot(whatev$x, whatev$y, col=ifelse(xv<2000,"red",ifelse(xv<4000,"yellow","blue")))
You could nest as many of these as you want to get finer control over the colors and the intervals. The ifelse command is of the form ifelse(test, yes, no), and the color names must be quoted strings.
A simpler way would be to use the unique groups in xv to assign rainbow colors, then index the palette by group so that every point gets its group's color (this assumes xv holds integer group ids 1, 2, ...):
colrs=rainbow(length(unique(xv)))
plot(whatev$x, whatev$y, col=colrs[xv])
I hope I got all that right. I'm still learning R myself.
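When the number of intervals grows, cut() may be easier to read than deeply nested ifelse calls. A minimal sketch with made-up positions and the same three intervals as the ifelse example:

```r
pos <- c(500, 2000, 3500, 4000, 9000)   # hypothetical x positions

# right = FALSE makes the intervals [-Inf, 2000), [2000, 4000), [4000, Inf)
cols <- as.character(cut(pos,
                         breaks = c(-Inf, 2000, 4000, Inf),
                         labels = c("red", "yellow", "blue"),
                         right  = FALSE))
cols
```

The resulting character vector can be passed straight to the col argument of plot().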

I'm going to go out on a limb and guess that your real data are something like 2000 values of things from 20 different groups. For instance, heights of 2000 plants of 20 different species. In such a case, you might want to look at the dotplot() function (or as illustrated below, dotplot.table()) in the lattice package.
Generate matrix of hypothetical values:
set.seed(1)
myY <- sapply( seq_len(20), function(x) rnorm(2000, x^(1/3)))
Transpose matrix to get groups as rows
myY <- t(myY)
Provide names of groups to matrix:
dimnames(myY)[[1]]<-paste("group", seq_len(nrow(myY)))
Load lattice package
library(lattice)
Generate dotplot
dotplot(myY, horizontal = FALSE, panel = function(x, y, horizontal, ...) {
panel.dotplot(x = x, y = y, horizontal = horizontal, jitter.x = TRUE,
col = seq_len(20)[x], pch = "*", cex = 1.5)
}, scales = list(x = list(rot = 90))
)
Which looks like (with unfortunate y-axis labeling):

Seeing that @JohnCLK is requesting a way of colouring by values on the x axis, I tried these demos in ggplot2. Each uses a dummy variable that is coded based on values or ranges to be highlighted in the other variables.
So, first set up the data, as in the question:
pos <- rep(1:2000, 20)
xv <- c(rep(1:20, each = 2000))
yv <- rnorm(2000*20, 0.5, 0.1)
xv <- lapply(unique(xv), function(x) pos[xv==x])
to.add <- cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
Then load ggplot2, prepare a couple of utility functions, and set the default theme:
library("ggplot2")
make.png <- function(p, fName) {
png(fName, width=640, height=480, units="px")
print(p)
dev.off()
}
make.plot <- function(df) {
p <- ggplot(df,
aes(x = bp,
y = yv,
colour = highlight))
p <- p + geom_point()
p <- p + theme(legend.position = "none")
return(p)
}
theme_set( theme_bw() )
Draw a plot which highlights values in a defined range on the vertical axis:
# highlight a horizontal band
df <- data.frame(cbind(bp, yv))
df$highlight <- 0
df$highlight[ df$yv >= 0.4 & df$yv < 0.45 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_horizontal.png")
Next draw a plot which highlights values in a defined range on the x axis, a vertical band:
# highlight a vertical band
df$highlight <- 0
df$highlight[ df$bp >= 38000 & df$bp < 42000 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_vertical.png")
And finally draw a plot which highlights alternating vertical bands, by x value:
# highlight alternating bands
library("gtools")
alt.band.width <- 2000
df$highlight <- as.integer(df$bp / alt.band.width)
df$highlight <- ifelse(odd(df$highlight), 1, 0)
p <- make.plot(df)
print(p)
make.png(p, "demo_alternating.png")
Hope this helps; it was good practice anyway.
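If you would rather not load gtools just for odd(), the alternating bands can also be computed with integer division and the modulo operator. A small sketch with made-up positions; band_width plays the role of alt.band.width above:

```r
bp_demo    <- c(1, 1999, 2000, 3999, 4000, 6500)   # hypothetical positions
band_width <- 2000

band      <- bp_demo %/% band_width   # index of the band each point falls in
highlight <- band %% 2                # 1 for odd-numbered bands, 0 for even ones
highlight
```

This gives the same 0/1 coding as ifelse(odd(...), 1, 0).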
