I have to plot several IR spectra. The x-axis of these plots has to be stretched between 2000 and 500. I've tried axis(side=1,at=c(4000,3500,2000,1500,1000,500)), but this does not produce equal distances between the labels. I've searched for nearly 2 hours but can't figure out how to achieve this.
Help would be appreciated.
Thanks in advance
I don't think that there's a particularly clean way to do this in base graphics. No doubt there's something in one of the many graphics packages that would do it, but here's my workaround for what I think you're trying to do.
#Some data to plot
x <- 0:4000
y <- sin(x/100)
#A function to do the stretching that you describe
stretcher <- function(x)
{
lower <- 500 ##lower end of expansion
upper <- 2000 ##upper end of expansion
stretchfactor <- 3 ##must be greater than 1, factor of expansion
x[x>upper] <- x[x>upper] + (stretchfactor-1) * (upper-lower)
x[x<=upper & x>lower] <- (x[x<=upper & x>lower] - lower) * stretchfactor + lower
x
}
#Create the plot
plot(stretcher(x),y,axes=FALSE)
labels <- c(4000,3500,3000,2500,2000,1500,1000,500)
box()
axis(2)
axis(1,labels=labels,at=stretcher(labels))
I'd also emphasise the breaks with something like:
abline(v=stretcher(2000),col='red',lty=2)
abline(v=stretcher(500),col='red',lty=2)
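A possible variation, as a rough sketch only: the same mapping with the bounds and the factor passed as arguments, drawn with a reversed x-axis as is common for IR spectra. The name stretch_axis and the axis titles are made up for illustration, and the data are the same dummy sine values as above.

```r
# Hypothetical generalisation of stretcher(): bounds and factor are arguments
stretch_axis <- function(x, lower = 500, upper = 2000, stretchfactor = 3) {
  out <- x
  above  <- x > upper
  inside <- x > lower & x <= upper
  # Shift everything above the stretched region by the extra width it gained
  out[above]  <- x[above] + (stretchfactor - 1) * (upper - lower)
  # Expand the region between lower and upper
  out[inside] <- (x[inside] - lower) * stretchfactor + lower
  out
}

x <- 0:4000
y <- sin(x / 100)
ticks <- c(4000, 3500, 3000, 2500, 2000, 1500, 1000, 500)

# A reversed xlim puts the high values on the left
plot(stretch_axis(x), y, type = "l", axes = FALSE,
     xlim = rev(range(stretch_axis(x))), xlab = "Wavenumber", ylab = "y")
box()
axis(2)
axis(1, at = stretch_axis(ticks), labels = ticks)
```

With the defaults, the region between 500 and 2000 takes three times the width it would get on a linear axis, while tick labels still show the original values.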
I found a lot of SO questions and answers addressing breaks and gaps in axes. But most of them are of low quality (in the SO sense) because they have no example code, no picture, or overly complex code. This is why I am asking.
I am trying to use library(plotrix). A solution without it, or with another library, would be fine for me too.
This is a normal R-barplot.
barplot(c(10,20,500))
To break the axis and add gap I tried this.
gap.barplot(c(10,20,500),gap=c(50,400), col=FALSE)
The result is not beautiful.
There is no space between the bars. The space parameter from barplot() is not accepted by gap.barplot().
The bars have different widths.
The ticks are not positioned in the middle of the bars.
Can I control these parameters with plotrix? I don't see anything about it in the documentation.
Is there another library or solution for my problem?
There are so many different answers because there are a lot of individual problems. For your problem you can try the following, though there may well be a better solution out there. And IMO it's always better to show your complete data instead of cropping it.
# Your data with names
library(plotrix)
d <- c(10,20,500)
names(d) <- letters[1:3]
# Specify a cutoff where the y-axis should be split.
co <- 200
# Now cut off this area in your data.
d[d > co] <- d[d > co] - co
# Create new axis label using the pretty() function
newy <- pretty(d)
newy[ newy > co] <- newy[ newy > co] + co
# remove values in your cutoff.
gr <- which(newy != co)
newy <- newy[ gr ]
# plot the data
barplot(d, axes=FALSE)
# add the axis
axis(2, at = pretty(d)[gr], labels = newy)
axis.break(2, co, style = "gap")
As an alternative you can try to log your axis using log="y".
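A minimal sketch of the log-axis alternative, assuming the same three values as in the question. All values must be positive for a log scale to work.

```r
# Same data as in the question, with names for the bars
d <- c(a = 10, b = 20, c = 500)

# log = "y" draws the value axis on a log scale, so no break is needed;
# barplot() returns the bar midpoints, which helps when placing extra labels
mids <- barplot(d, log = "y")
```

This keeps the complete data visible, at the cost of bar heights no longer being proportional to the values.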
I really need your R skills here. I've been working on this plot for several days now. I'm an R newbie, so that might explain it.
I have sequence coverage data for chromosomes (basically a value for each position along the length of every chromosome, making the length of the vectors many millions). I want to make a nice coverage plot of my reads. This is what I got so far:
It looks all right, but I'm missing y-labels to tell which chromosome it is, and I've also been having trouble modifying the x-axis so that it ends where the coverage ends. Additionally, my own data is much bigger, so this plot in particular takes an extremely long time. That is why I tried plotLongVector from HilbertVis. It works, but I can't figure out how to modify the x-axis or the labels, or how to make the y-axis logged, and the vectors all get the same length on the plot even though they are not equally long.
source("http://bioconductor.org/biocLite.R")
biocLite("HilbertVis")
library(HilbertVis)
chr1 <- abs(makeRandomTestData(len=1.3e+07))
chr2 <- abs(makeRandomTestData(len=1e+07))
par(mfcol=c(8, 1), mar=c(1, 1, 1, 1), ylog=T)
# 1st way of trying with some code I found on stackoverflow
# Chr1
plotCoverage <- function(chr1, start, end) { # Defines coverage plotting function.
plot.new()
plot.window(c(start, length(chr1)), c(0, 10))
axis(1, labels=F)
axis(4)
lines(start:end, log(chr1[start:end]), type="l")
}
plotCoverage(chr1, start=1, end=length(chr1)) # Plots coverage result.
# Chr2
plotCoverage <- function(chr2, start, end) { # Defines coverage plotting function.
plot.new()
plot.window(c(start, length(chr1)), c(0, 10))
axis(1, labels=F)
axis(4)
lines(start:end, log(chr2[start:end]), type="l")
}
plotCoverage(chr2, start=1, end=length(chr2)) # Plots coverage result.
# 2nd way of trying with plotLongVector
plotLongVector(chr1, bty="n", ylab="Chr1") # ylab doesn't work
plotLongVector(chr2, bty="n")
Then I have another vector called genes that are of special interest. They are about the same length as the chromosome-vectors but in my data they contain more zeroes than values.
genes_chr1 <- abs(makeRandomTestData(len=1.3e+07))
genes_chr2 <- abs(makeRandomTestData(len=1e+07))
I would like these gene vectors plotted as red dots under the chromosomes! Basically, if the vector has a value there (>0), it is presented as a dot (or line) under the long vector plot. I have no idea how to add this, though it seems fairly straightforward.
Please help me! Thank you so much.
DISCLAIMER: Please do not simply copy and paste this code and run it on every position of your chromosome. Please sample positions (for example, as @Gx1sptDTDa shows) and plot those. Otherwise you'd probably get a huge black filled rectangle after many, many hours, if your computer survives the drain.
Using ggplot2, this is easily achieved with geom_area. Here, I've generated some random data for three chromosomes with 100 positions each (300 rows in total), just to show an example. You can build on this, I hope.
# construct a test data with 3 chromosomes and 100 positions
# and random coverage between 0 and 500
set.seed(45)
chr <- rep(paste0("chr", 1:3), each=100)
pos <- rep(1:100, 3)
cov <- sample(0:500, 300)
df <- data.frame(chr, pos, cov)
require(ggplot2)
p <- ggplot(data = df, aes(x=pos, y=cov)) + geom_area(aes(fill=chr))
p + facet_wrap(~ chr, ncol=1)
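For the red gene markers asked about in the question, one possible sketch is an extra geom_point layer drawn just below zero. The gene column here is invented random data, and the offset of -25 is an arbitrary choice for this coverage range.

```r
library(ggplot2)

set.seed(45)
df <- data.frame(chr = rep(paste0("chr", 1:3), each = 100),
                 pos = rep(1:100, 3),
                 cov = sample(0:500, 300))

# Hypothetical gene signal: mostly zeros, a few positive values
df$gene <- ifelse(runif(nrow(df)) < 0.1, 1, 0)
genes <- df[df$gene > 0, ]

p <- ggplot(df, aes(x = pos, y = cov)) +
  geom_area(aes(fill = chr)) +
  # One red dot below the coverage area wherever the gene vector is > 0
  geom_point(data = genes, aes(x = pos, y = -25), colour = "red", size = 1) +
  facet_wrap(~ chr, ncol = 1)
print(p)
```

Because genes keeps the chr column, each dot lands in the facet of its own chromosome.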
You could use the ggplot2 package.
I'm not sure what exactly you want, but here's what I did:
This has 7000 random data points (about double the amount of genes on Chromosome 1 in reality). I used alpha to show dense areas (not many here, as it's random data).
library(ggplot2)
Chr1_cov <- sample(1.3e+07,7000)
Chr1 <- data.frame(Cov=Chr1_cov,fil=1)
pl <- qplot(Cov,fil,data=Chr1,geom="pointrange",ymin=0,ymax=1.1,xlab="Chromosome 1",ylab="-",alpha=I(1/50))
print(pl)
And that's it. This ran in less than a second. ggplot2 has a humongous amount of settings, so just try some out. Use facets to create multiple graphs.
The code below is a sort of moving average, followed by a plot of its output. It is not a real moving average, as a real moving average would have (almost) the same number of data points as the original and would only make the data smoother. This code, however, takes an average for every n points. It will of course run quite a bit faster, but you will lose a lot of detailed information.
VeryLongVector <- sample(500, 1e+07, replace = TRUE)
# Average every n consecutive points (block means, not a true moving average)
movAv <- function(vector, n) {
  chops <- length(vector) %/% n           # number of complete blocks
  pos <- numeric(chops)
  Cov <- numeric(chops)
  for (i in seq_len(chops)) {
    idx <- ((i - 1) * n + 1):(i * n)      # indices of the i-th block (R is 1-based)
    pos[i] <- median(idx)                 # block centre
    Cov[i] <- mean(vector[idx])           # block average
  }
  data.frame(pos = pos, cov = Cov)
}
Chr1 <- movAv(VeryLongVector, 10000)
qplot(pos, cov, data = Chr1, geom = "line")
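The same per-block averaging can probably be done without an explicit loop. This is a sketch only: block_means is a made-up name, and it averages non-overlapping blocks of exactly n points, dropping any incomplete block at the end.

```r
# Hypothetical helper: average every n consecutive points using a matrix
block_means <- function(v, n) {
  chops <- length(v) %/% n                        # number of complete blocks
  m <- matrix(v[seq_len(chops * n)], nrow = n)    # one block per column
  data.frame(pos = (seq_len(chops) - 0.5) * n + 0.5,  # centre index of each block
             cov = colMeans(m))
}

res <- block_means(1:10, 5)
res
```

For a vector with millions of entries, colMeans() on a reshaped matrix tends to be considerably faster than looping over blocks in R.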
I want to create intervals (discretize/bin) of continuous variables to plot a choropleth map using ggplot. After reading various threads, I decided to use cut and quantile to eliminate the problems of: a) manually creating bins, and b) taking care of dominant states (otherwise, I had to create bins manually, look at the map, and readjust the bins).
However, I am facing another problem now. Intervals coming out of cut are hardly pretty. So, I am trying to follow this example and this example to come up with my pretty labels.
Here is my list:
x <- seq(1,50)
Rounded quantiles:
qs_x <- round(quantile(x, probs=c(seq(0,0.8,by=0.2),0.9)))
which results in:
0% 20% 40% 60% 80% 90%
1 11 21 30 40 45
Using these cuts, I want to come up with these labels:
1-11, 12-21, 22-30, 31-40, 41-45, 45+
I am sure there is an easy solution to convert a list using some apply function, but I am not well-versed with those functions.
Help appreciated.
A 3-liner produces the output you want, without using apply.
labels <- paste(qs_x+1, qs_x[-1], sep="-")
labels[1] <- paste(qs_x[1], qs_x[2], sep="-")
labels[length(labels)] <- paste(tail(qs_x, 1), "+", sep = "")
The first line constructs labels of the form (x1 + 1) - x2, the second line fixes the first label, and the third line fixes the last label. Here is the output
> labels
[1] "1-11" "12-21" "22-30" "31-40" "41-45" "45+"
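To show how such labels could then be attached to the bins, here is a sketch that passes them to cut(). It assumes the last bin should be open-ended, so Inf is appended to the breaks. The labels are rebuilt in a slightly different way to keep the example self-contained.

```r
x <- seq(1, 50)
qs_x <- round(quantile(x, probs = c(seq(0, 0.8, by = 0.2), 0.9)))

q <- unname(qs_x)   # drop the "0%", "20%", ... names
k <- length(q)
# Lower bounds are the previous break + 1, except for the first bin
labels <- c(paste(c(q[1], q[2:(k - 1)] + 1), q[2:k], sep = "-"),
            paste0(q[k], "+"))

# Appending Inf gives k intervals for the k labels;
# include.lowest = TRUE keeps the minimum value in the first bin
bins <- cut(x, breaks = c(q, Inf), labels = labels, include.lowest = TRUE)
table(bins)
```

The resulting factor can be mapped to fill in a ggplot choropleth directly.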
I would like to jitter the text on a plot so as to avoid overplotting. To do so, I assume that I need a bounding box around the text component. Is there a way to get this?
For example, in base graphics:
plot.new()
text(.5,.5,"word")
text(.6,.5,"word") #does this overlap?
In grid there is a way to drop overlapping text, but I can't seem to find a way to access the code that figures out if overlapping has occurred.
grid.text(c("word","other word"),c(.5,.6),c(.5,.5),check=T)
Maybe the strwidth and strheight functions can help here
stroverlap <- function(x1,y1,s1, x2,y2,s2) {
sh1 <- strheight(s1)
sw1 <- strwidth(s1)
sh2 <- strheight(s2)
sw2 <- strwidth(s2)
overlap <- FALSE
if (x1<x2)
overlap <- x1 + sw1 > x2
else
overlap <- x2 + sw2 > x1
if (y1<y2)
overlap <- overlap && (y1 +sh1>y2)
else
overlap <- overlap && (y2+sh2>y1)
return(overlap)
}
stroverlap(.5,.5,"word", .6,.5, "word")
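One caveat: text() centres each string on its coordinates by default, while the check above treats x and y as the lower-left corner. A sketch of a centre-based test follows, assuming the default adj and no rotation. The name boxes_overlap is made up for illustration.

```r
plot.new()  # strwidth()/strheight() need an open plot to report user units

# Hypothetical helper: do the bounding boxes of two centred strings overlap?
boxes_overlap <- function(x1, y1, s1, x2, y2, s2) {
  half_w <- (strwidth(s1) + strwidth(s2)) / 2    # sum of the two half-widths
  half_h <- (strheight(s1) + strheight(s2)) / 2  # sum of the two half-heights
  # Centred boxes overlap when the centres are closer than the summed
  # half-extents in both directions
  abs(x1 - x2) < half_w && abs(y1 - y2) < half_h
}

boxes_overlap(.5, .5, "word", .6, .5, "word")
```

The result depends on the device size and font, so the same coordinates may or may not overlap on a different device.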
Package maptools has a function called pointLabel.
"Use optimization routines to find good locations for point labels without overlaps."
If you were using base graphics it would be thigmophobe {plotrix}
"Find the direction away from the closest point"
Using lattice, Harrell has offered:
labcurve {Hmisc}
"Label Curves, Make Keys, and Interactively Draw Points and Curves"