I have this table called x:
I use it to make this plot, using the script:
lg <- x[1,2]
plot(1, type="n", xlab=contig, ylab="Best hits", xlim=c(-200, lg), ylim=c(0, 11))
segments(0,0,lg,0, col="red", lwd=3)
for (i in 1:10) {
segments(x[i,3],i,x[i,4],i); text(x=(x[i,4]-x[i,5]/2),y=i+0.2,labels=paste(x[i,6],"-",x[i,13], "%"))
}
But I'd like the text (starting with "NW_...") to be colored in the plot so that each unique text has its own color.
Here since there are 5 unique "NW_", there would be five colors.
I tried:
lg <- x[1,2]
plot(1, type="n", xlab=contig, ylab="Best hits", xlim=c(-200, lg), ylim=c(0, 11))
segments(0,0,lg,0, col="red", lwd=3)
for (i in 1:10) {
segments(x[i,3],i,x[i,4],i); text(x=(x[i,4]-x[i,5]/2),y=i+0.2,labels=paste(x[i,6],"-",x[i,13], "%"), col=rainbow(n=nlevels(x$TID))[x$TID])
}
But this fails.
Actually, I have no clue how to make this in a loop.
Could you help?
Thanks!
Muriel
Add a colour column to x (this assumes x$TID is a factor; if it is a character column, wrap it in factor() first).
x$colour <- rainbow(nlevels(x$TID))[x$TID]
Then use this colour in your for loop.
lg <- x[1,2]
plot(1, type="n", xlab=contig, ylab="Best hits", xlim=c(-200, lg), ylim=c(0, 11))
segments(0,0,lg,0, col="red", lwd=3)
for (i in 1:10) {
segments(x[i, 3], i, x[i, 4], i)
text(
x = (x[i, 4] - x[i, 5] / 2),
y = i + 0.2,
labels = paste(x[i, 6], "-", x[i, 13], "%"),
col=x[i, "colour"])
}
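As a minimal, self-contained sketch of the same idea: the data frame `hits` and its columns below are made up for illustration (the original table x is not shown), and TID is assumed to hold the "NW_..." identifiers. Indexing a palette by the factor's integer codes gives one colour per unique identifier. If TID were stored as character, nlevels() would return 0, hence the explicit factor() call.

```r
# Hypothetical stand-in for x; only the TID column matters here
hits <- data.frame(
  TID   = c("NW_001", "NW_002", "NW_001", "NW_003", "NW_002"),
  start = c(10, 40, 80, 120, 150),
  end   = c(60, 90, 130, 170, 200),
  stringsAsFactors = FALSE
)

# Make sure TID is a factor, then index a palette by its integer codes
hits$TID <- factor(hits$TID)
palette_by_level <- rainbow(nlevels(hits$TID))
hits$colour <- palette_by_level[as.integer(hits$TID)]

# segments() and text() are vectorised, so no loop is needed here
rows <- seq_len(nrow(hits))
plot(1, type = "n", xlab = "position", ylab = "Best hits",
     xlim = c(0, 220), ylim = c(0, nrow(hits) + 1))
segments(hits$start, rows, hits$end, rows)
text(x = (hits$start + hits$end) / 2, y = rows + 0.2,
     labels = as.character(hits$TID), col = hits$colour)
```

Inside a for loop, the per-row colour has to be picked with the loop index (hits$colour[i]); passing the whole colour vector to a single text() call that draws one label only uses its first element.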
Essentially, I want to make sure that all my labels on the x-axis are non-overlapping and that the figure margins are long enough to see vertical labels.
par(mar=c(180, 70, 2, 2.1))
oldfont <- par(font=3)
table(new$Tag)
barplot(table(new$Tag),x,las=2,cex.lab=100)
Please find bar plot image here!
Are you looking for something more than just changing the margins, text size etc.?
Readability can be improved a bit by dropping the single counts and truncating the names.
set.seed(1)
words <- sapply(
sample(3:25, 50, replace=TRUE),
function(x) {
paste(sample(c(letters), x, replace=TRUE), collapse="")
}
)
strtrunc <- function(x, l, r="…") {
trunc <- nchar(x) > l
x[trunc] <- paste0(strtrim(x[trunc], l), r)
x
}
samp <- sample(1:50, 500, replace=TRUE)
samp.t <- round(1.2^table(samp))
samp.t[sample(1:50, 20)] <- 1
names(samp.t) <- words
dev.new(width=10, height=5)
par(mar=c(10, 4, 3, 0.5), mgp=c(0, 0.8, -0.5), cex=0.9)
b <- barplot(samp.t, xaxt="n", space=0.5, col=1)
axis(1, at=b, labels=names(samp.t), las=2, tick=FALSE, cex.axis=0.8)
mtext("All counts", line=1, cex=1.5)
#barplot with logarithmic y-axis, truncated names and no single-counts
samp.ts <- samp.t[samp.t != 1]
names(samp.ts) <- strtrunc(names(samp.ts), 15)
dev.new(width=10, height=5)
par(mar=c(10, 4, 3, 0.5), mgp=c(0, 0.8, -0.5), cex=0.9)
b <- barplot(samp.ts, xaxt="n", space=0.5, col=1, log="y")
axis(1, at=b, labels=names(samp.ts), las=2, tick=FALSE, cex.axis=1.2)
mtext("Counts > 1", line=1, cex=1.5)
Bar plots with more than 20 or so named categories generally don't work well, so you might be better off finding a different way to visualize your data. A histogram or density plot might be an option, if it makes sense for your data. Otherwise, breaking the bar plot up into smaller sections, maybe by sensible groups, might be another.
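One way to break a long bar plot into smaller sections could look like the sketch below. The counts and category names are made up, and the chunk size of 20 is only an assumption taken from the rule of thumb above; grouping by something meaningful in your data would usually be preferable to fixed-size chunks.

```r
set.seed(1)
# Hypothetical counts for 50 named categories
counts <- setNames(sample(1:100, 50, replace = TRUE),
                   paste0("category_", 1:50))

# Sort by count, then cut into chunks of at most 20 bars each
counts <- sort(counts, decreasing = TRUE)
chunk_id <- ceiling(seq_along(counts) / 20)
chunks <- split(counts, chunk_id)

# One panel per chunk, all sharing the same y-axis range
old_par <- par(mfrow = c(length(chunks), 1), mar = c(7, 4, 1, 0.5))
for (chunk in chunks) {
  barplot(chunk, las = 2, cex.names = 0.8, ylim = c(0, max(counts)))
}
par(old_par)
```

Keeping ylim identical across panels makes the bars comparable from one section to the next.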
I have to create some graphs through a loop. The graphs are multipaneled and each panel has three different layers.
I tried this code
pdf('plot.pdf', width=14, height=7)
R <- dim(dataset)[1]
for (i in 1:R) {
par(mfrow=c(1,2))
par(mfg=c(1,1))
plot(...)
points(...)
polygon(...)
par(mfg=c(1,2))
plot(...)
points(...)
polygon(....)
}
dev.off()
but the result is a single graph (and not one graph per loop) fully overlaid.
Is there an issue in looping with the par function?
EDIT: here's a reproducible example. I tried with split.screen, but the result is the same single-paged pdf with overlaid plots.
The issue seems related to the pdf function itself, since the loop does the job correctly.
set.seed(123)
## create data
varA1 <- matrix(rnorm(60,5,1), nrow=3)
varA2 <- matrix(rnorm(60,5,1), nrow=3)
varB1 <- matrix(rnorm(80,20,10), nrow=4)
varB2 <- matrix(rnorm(80,30,20), nrow=4)
sitesA <- 1:nrow(varA1)
sitesB <- 1:nrow(varB1)
totsites <- 1:max(sitesA, sitesB)
## create pdf
pdf('prova.pdf', width=14, height=7)
for(i in totsites) { # the pdf should contain "totsites" number of pages (in this case, 4)
split.screen(c(1,2))
if(i %in% sitesA) {
screen(1)
plot(varA1[i,], ylim=c(0, max(c(varA1, varA2))), col='darkred', type='b', pch=16)
points(varA2[i,], col='red', type='b', pch=16)
polygon(c(1:20,rev(1:20)),c(varA1[i,]-1,rev(varA1[i,]+1)), col=rgb(100, 0, 0, maxColorValue=255, alpha=50), border=NA)
}
if(i %in% sitesB) {
screen(2)
plot(varB1[i,], ylim=c(0, max(c(varB1, varB2))), col='darkgreen', type='b', pch=16)
points(varB2[i,], col='green', type='b', pch=16)
polygon(c(1:20,rev(1:20)),c(varB1[i,]-10,rev(varB1[i,]+10)), col=rgb(0, 100, 0, maxColorValue=255, alpha=50), border=NA)
}
}
dev.off()
BTW, I got this warning
Warning message:
In par(new = TRUE) : calling par(new=TRUE) with no plot
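Using layout() rather than split.screen() seems to be a better option. Call it once, outside the loop: each plot() then moves on to the next panel, and a new page starts once both panels are filled. When a site has no data for a panel, plot.new() skips that panel so the two sides stay aligned.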
pdf('prova.pdf', width=14, height=7)
layout(matrix(1:2, nrow=1))
for(i in totsites) { # the pdf should contain "totsites" number of pages (in this case, 4)
if(i %in% sitesA) {
plot(varA1[i,], ylim=c(0, max(c(varA1, varA2))), col='darkred', type='b', pch=16)
points(varA2[i,], col='red', type='b', pch=16)
polygon(c(1:20,rev(1:20)),c(varA1[i,]-1,rev(varA1[i,]+1)), col=rgb(100, 0, 0, maxColorValue=255, alpha=50), border=NA)
} else {
plot.new()
}
if(i %in% sitesB) {
plot(varB1[i,], ylim=c(0, max(c(varB1, varB2))), col='darkgreen', type='b', pch=16)
points(varB2[i,], col='green', type='b', pch=16)
polygon(c(1:20,rev(1:20)),c(varB1[i,]-10,rev(varB1[i,]+10)), col=rgb(0, 100, 0, maxColorValue=255, alpha=50), border=NA)
} else {
plot.new()
}
}
dev.off()
Suppose I want to plot an R function:
weibull <- function(ALPHA, LAMBDA, T){
ALPHA*LAMBDA*(T^(ALPHA-1))
}
So the function takes the arguments alpha, lambda and T. I want to generate a plot where in one plot alpha =0.5, time ranges from 0 to 2 and lambda=1, 2, 4, 8, 16 and in another, alpha=1, time ranges from 0 to 2 and lambda=1, 2, 4, 8, 16.
In the past, for plotting functions with just one argument, I've used curve and then set add=TRUE if I wanted another curve on the same plot. For instance, I've used:
lambda <- 0.5
pdf <- function(x){
lambda*exp(-lambda*x)
}
survival <- function(x){
exp(-lambda*x)
}
plot(curve(pdf, 0, 6), type="l", ylim=c(0, 1), lwd=3, ylab="", xlab="", xaxs="i", yaxs="i", main=expression(paste("Exponential Distribution ", lambda, "=0.5")), cex.main=2, cex.axis=2, cex.lab=2)
curve(survival, 0, 6, add=TRUE, col="plum4", lwd=3)
But in this example the functions have just one argument, x, whereas now I want to vary LAMBDA, T and ALPHA. The curve function does not work and I am not sure how else to approach this.
If you use curve, you can specify an expression with a free variable x that will get replaced by the range of values specified in your from=/to= parameters. For example you can do
weibull <- function(ALPHA, LAMBDA, T){
ALPHA*LAMBDA*(T^(ALPHA-1))
}
lambda<-c(1, 2, 4, 8, 16)
col<-rainbow(length(lambda))
layout(matrix(1:2, nrow=1))
for(i in seq_along(lambda)) {
curve(weibull(.5, lambda[i], x), from=0, to=2, add=i!=1, col=col[i], ylim=c(0,50), main="alpha=.5")
}
legend(1,50,lambda, col=col, lty=1)
for(i in seq_along(lambda)) {
curve(weibull(1, lambda[i], x), from=0, to=2, add=i!=1, col=col[i], ylim=c(0,20), main="alpha=1")
}
which will produce two side-by-side panels, one per alpha, each with one coloured curve per lambda.
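If you would rather not rely on curve() and its free variable x, the values can also be computed explicitly and drawn with matplot(). This is a different technique from the loop above, shown only as a sketch: the names `time`, `values_half` and `values_one` are made up here, and the grid starts at 0.01 rather than 0 because the function is infinite at T = 0 when ALPHA < 1.

```r
weibull <- function(ALPHA, LAMBDA, T) {
  ALPHA * LAMBDA * (T^(ALPHA - 1))
}

lambda <- c(1, 2, 4, 8, 16)
# Start just above 0: for ALPHA < 1 the function is infinite at T = 0
time <- seq(0.01, 2, length.out = 200)

# One column per lambda value, one row per time point
values_half <- sapply(lambda, function(l) weibull(0.5, l, time))
values_one  <- sapply(lambda, function(l) weibull(1, l, time))

layout(matrix(1:2, nrow = 1))
cols <- rainbow(length(lambda))
matplot(time, values_half, type = "l", lty = 1, col = cols,
        ylim = c(0, 50), ylab = "value", main = "alpha=.5")
legend("topright", legend = lambda, col = cols, lty = 1, title = "lambda")
matplot(time, values_one, type = "l", lty = 1, col = cols,
        ylim = c(0, 20), ylab = "value", main = "alpha=1")
```

With alpha=1 the function no longer depends on T, so the second panel shows horizontal lines at the five lambda values.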
I'd do it with plyr and ggplot2,
weibull <- function(alpha, lambda, time){
data.frame(time = time, value = alpha*lambda*(time^(alpha-1)))
}
library(plyr)
library(ggplot2)
params <- expand.grid(lambda = c(1, 2, 4, 8, 16), alpha = c(0.5, 1))
all <- mdply(params, weibull, time = seq(0, 2, length=100))
ggplot(all, aes(time, value, colour=factor(lambda)))+
facet_wrap(~alpha,scales="free", ncol=2) + geom_line()
A tidyverse alternative,
weibull <- function(alpha, lambda, time){
data.frame(time = time, value = alpha*lambda*(time^(alpha-1)))
}
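library(tidyverse) # attaches ggplot2, dplyr, tidyr and purrr
params <- tidyr::crossing(lambda = c(1, 2, 4, 8, 16), alpha = c(0.5, 1))
params %>%
# one data frame of (time, value) per parameter combination, stored in a list column
dplyr::mutate(data = purrr::pmap(list(alpha, lambda), weibull, time = seq(0, 2, length.out=100))) %>%
tidyr::unnest(data) %>%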
ggplot(aes(time, value, colour=factor(lambda)))+
facet_wrap(~alpha,scales="free", ncol=2) + geom_line()
This is similar to MrFlick's answer but shorter:
par(mfrow=1:2)
lapply(0:4, function(l) curve(weibull(0.5, 2^l, x), col=l+1, add=l!=0, ylim=c(0,50), xlim=c(0,2)))
lapply(0:4, function(l) curve(weibull(1, 2^l, x), col=l+1, add=l!=0, ylim=c(0,50), xlim=c(0,2)))
OK, if you're a big fan of nested lapply calls, you can also do:
lapply(c(0.5,1), function(a) lapply(0:4, function(l) curve(weibull(a, 2^l, x), col=l+1, add=l!=0, ylim=c(0,50), xlim=c(0,2))))
Stephen Few has recently introduced Bandlines which are an extension to Edward Tufte’s Sparklines. Is there an easy way to produce these kinds of plots using ggplot2?
Since this was introduced this month, I doubt there is already an implementation. But the concept seems simple enough that you can make one fairly easily. Here is a very simple implementation using base graphics (I'm not an expert in ggplot2).
bandline<-function(x, low.col, high.col, axis=TRUE){
l <- max(unlist(lapply(x, length)), na.rm=TRUE)
r <- range(unlist(x), na.rm=TRUE)
par(mfcol=c(length(x), 1))
for(i in 1:length(x)){
y <- boxplot.stats(x[[i]])
if (i == 1) {
par(mar=c(0,3,3,3))
} else if (i == length(x)) {
par(mar=c(3,3,0,3))
} else {
par(mar=c(0,3,0,3))
}
plot(NA, axes=FALSE, bty="n", xlim=c(1,l), ylim=r, xaxs="i")
rect(1,y$stats[2], l, y$stats[1], col="grey80", border=NA)
rect(1,y$stats[4], l, y$stats[2], col="grey60", border=NA)
rect(1,y$stats[5], l, y$stats[4], col="grey40", border=NA)
abline(h=y$stats[3],col="white", lwd=2)
lines(seq_along(x[[i]]), x[[i]])
zhigh <- zlow <- x[[i]]
zhigh[zhigh<=y$stats[5]]<-NA
zlow[zlow>=y$stats[1]]<-NA
points(seq_along(x[[i]]), zlow, bg=low.col, pch=21,cex=2)
points(seq_along(x[[i]]), zhigh, bg=high.col, pch=21, cex=2)
if(axis==TRUE){
axis(2, at=pretty(x[[i]]), las=2)
if (i == 1) {
axis(3, at=seq_len(l))
} else if (i == length(x)) {
axis(1, at=seq_len(l))
}
}
mtext(names(x)[i], side=4, srt=270, line=1)
}
}
And here is an example:
set.seed(1)
dat<-list(a=rnorm(100), b=rnorm(100), c=rnorm(100), d=rnorm(100))
bandline(dat, "black", "white", axis=FALSE)
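The bands and highlighted points in the function above all come from boxplot.stats(). As a small illustration on a made-up vector (the name `toy` is hypothetical), its $stats element holds the lower whisker, lower hinge, median, upper hinge and upper whisker, in that order, and $out holds the points beyond the whiskers:

```r
# Made-up series with one obvious high outlier
toy <- c(1:9, 50)
s <- boxplot.stats(toy)

s$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out    # values lying beyond the whiskers

# The same comparisons bandline() uses to pick the points to highlight
high <- toy[toy > s$stats[5]]
low  <- toy[toy < s$stats[1]]
```

So the three grey rectangles span whisker to hinge, hinge to hinge, and hinge to whisker, while the white line marks the median.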