Related
I have a large dataset of gene expression from ~10,000 patient samples (TCGA), and I'm plotting a predicted expression value (x) and the actual observed value (y) of a certain gene signature. For my downstream analysis, I need to draw a precise line through the plot and calculate different parameters in samples above/below the line.
No matter how I draw a line through the data (geom_smooth(method = 'lm', 'glm', 'gam', or 'loess')), the line always seems imperfect - it doesn't cut through the data to my liking (red line is lm in figure).
After playing around for a while, I realized that the 2d kernel density lines (geom_density2d) actually do a good job of showing the slope/trends of my data, so I manually drew a line that kind of cuts through the density lines (black line in figure).
My question: how can I automatically draw a line that cuts through the kernel density lines, as for the black line in the figure? (Rather than manually playing with different intercepts and slopes till something looks good).
The best approach I can think of is to somehow calculate intercept and slope of the longest diameter for each of the kernel lines, take an average of all those intercepts and slopes and plot that line, but that's a bit out of my league. Maybe someone here has experience with this and can help?
A more hacky approach may be getting the x,y coords of each kernel density line from ggplot_build, and going from there, but it feels too hacky (and is also out of my league).
Thanks!
EDIT: Changed a few details to make the figure/analysis easier. (Density lines are smoother now).
Reprex:
library(MASS)
set.seed(123)
samples <- 10000
r <- 0.9
data <- mvrnorm(n=samples, mu=c(0, 0), Sigma=matrix(c(2, r, r, 2), nrow=2))
x <- data[, 1] # standard normal (mu=0, sd=1)
y <- data[, 2] # standard normal (mu=0, sd=1)
test.df <- data.frame(x = x, y = y)
lm(y ~ x, test.df)
ggplot(test.df, aes(x, y)) +
geom_point(color = 'grey') +
geom_density2d(color = 'red', lwd = 0.5, contour = T, h = c(2,2)) + ### EDIT: h = c(2,2)
geom_smooth(method = "glm", se = F, lwd = 1, color = 'red') +
geom_abline(intercept = 0, slope = 0.7, lwd = 1, col = 'black') ## EDIT: slope to 0.7
Figure:
I generally agree with #Hack-R.
However, it was kind of a fun problem and looking into ggplot_build is not such a big deal.
require(dplyr)
require(ggplot2)
p <- ggplot(test.df, aes(x, y)) +
geom_density2d(color = 'red', lwd = 0.5, contour = T, h = c(2,2))
#basic version of your plot
p_built <- ggplot_build(p)
p_data <- p_built$data[[1]]
p_maxring <- p_data[p_data[['level']] == min(p_data[['level']]),] %>%
select(x,y) # extracts the x/y coordinates of the points on the largest ellipse from your 2d-density contour
Now this answer helped me to find the points on this ellipse which are furthest apart.
coord_mean <- c(x = mean(p_maxring$x), y = mean(p_maxring$y))
p_maxring <- p_maxring %>%
mutate (mean_dev = sqrt((x - mean(x))^2 + (y - mean(y))^2)) #extra column specifying the distance of each point to the mean of those points
coord_farthest <- c('x' = p_maxring$x[which.max(p_maxring$mean_dev)], 'y' = p_maxring$y[which.max(p_maxring$mean_dev)])
# gives the coordinates of the point farthest away from the mean point
farthest_from_farthest <- sqrt((p_maxring$x - coord_farthest['x'])^2 + (p_maxring$y - coord_farthest['y'])^2)
#now this looks which of the points is the farthest from the point farthest from the mean point :D
coord_fff <- c('x' = p_maxring$x[which.max(farthest_from_farthest)], 'y' = p_maxring$y[which.max(farthest_from_farthest)])
ggplot(test.df, aes(x, y)) +
geom_density2d(color = 'red', lwd = 0.5, contour = T, h = c(2,2)) +
# geom_segment using the coordinates of the points farthest apart
geom_segment((aes(x = coord_farthest['x'], y = coord_farthest['y'],
xend = coord_fff['x'], yend = coord_fff['y']))) +
geom_smooth(method = "glm", se = F, lwd = 1, color = 'red') +
# as per your request with your geom_smooth line
coord_equal()
coord_equal is super important, because otherwise you will get super weird results - it messed up my brain too. Because if the coordinates are not set equal, the line will seemingly not pass through the point furthest apart from the mean...
I leave it to you to build this into a function in order to automate it. Also, I'll leave it to you to calculate the y-intercept and slope from the two points
Tjebo's approach was kind of good initially, but after a close look, I found that it found the longest distance between two points on an ellipse. While this is close to what I wanted, it failed with either an irregular shape of the ellipse, or the sparsity of points in the ellipse. This is because it measured the longest distance between two points; whereas what I really wanted is the longest diameter of an ellipse; i.e.: the semi-major axis. See image below for examples/details.
Briefly:
To find/draw density contours of specific density/percentage:
R - How to find points within specific Contour
To get the longest diameter ("semi-major axis") of an ellipse:
https://stackoverflow.com/a/18278767/3579613
For function that returns intercept and slope (as in OP), see last piece of code.
The two pieces of code and images below compare two Tjebo's approach vs. my new approach based on the above posts.
#### Reprex from OP
require(dplyr)
require(ggplot2)
require(MASS)
set.seed(123)
samples <- 10000
r <- 0.9
data <- mvrnorm(n=samples, mu=c(0, 0), Sigma=matrix(c(2, r, r, 2), nrow=2))
x <- data[, 1] # standard normal (mu=0, sd=1)
y <- data[, 2] # standard normal (mu=0, sd=1)
test.df <- data.frame(x = x, y = y)
#### From Tjebo
p <- ggplot(test.df, aes(x, y)) +
geom_density2d(color = 'red', lwd = 0.5, contour = T, h = 2)
p_built <- ggplot_build(p)
p_data <- p_built$data[[1]]
p_maxring <- p_data[p_data[['level']] == min(p_data[['level']]),][,2:3]
coord_mean <- c(x = mean(p_maxring$x), y = mean(p_maxring$y))
p_maxring <- p_maxring %>%
mutate (mean_dev = sqrt((x - mean(x))^2 + (y - mean(y))^2)) #extra column specifying the distance of each point to the mean of those points
p_maxring = p_maxring[round(seq(1, nrow(p_maxring), nrow(p_maxring)/23)),] #### Make a small ellipse to illustrate flaws of approach
coord_farthest <- c('x' = p_maxring$x[which.max(p_maxring$mean_dev)], 'y' = p_maxring$y[which.max(p_maxring$mean_dev)])
# gives the coordinates of the point farthest away from the mean point
farthest_from_farthest <- sqrt((p_maxring$x - coord_farthest['x'])^2 + (p_maxring$y - coord_farthest['y'])^2)
#now this looks which of the points is the farthest from the point farthest from the mean point :D
coord_fff <- c('x' = p_maxring$x[which.max(farthest_from_farthest)], 'y' = p_maxring$y[which.max(farthest_from_farthest)])
farthest_2_points = data.frame(t(cbind(coord_farthest, coord_fff)))
plot(p_maxring[,1:2], asp=1)
lines(farthest_2_points, col = 'blue', lwd = 2)
#### From answer in another post
d = cbind(p_maxring[,1], p_maxring[,2])
r = ellipsoidhull(d)
exy = predict(r) ## the ellipsoid boundary
lines(exy)
me = colMeans((exy))
dist2center = sqrt(rowSums((t(t(exy)-me))^2))
max(dist2center) ## major axis
lines(exy[dist2center == max(dist2center),], col = 'red', lwd = 2)
#### The plot here is made from the data in the reprex in OP, but with h = 0.5
library(MASS)
set.seed(123)
samples <- 10000
r <- 0.9
data <- mvrnorm(n=samples, mu=c(0, 0), Sigma=matrix(c(2, r, r, 2), nrow=2))
x <- data[, 1] # standard normal (mu=0, sd=1)
y <- data[, 2] # standard normal (mu=0, sd=1)
test.df <- data.frame(x = x, y = y)
## MAKE BLUE LINE
p <- ggplot(test.df, aes(x, y)) +
geom_density2d(color = 'red', lwd = 0.5, contour = T, h = 0.5) ## NOTE h = 0.5
p_built <- ggplot_build(p)
p_data <- p_built$data[[1]]
p_maxring <- p_data[p_data[['level']] == min(p_data[['level']]),][,2:3]
coord_mean <- c(x = mean(p_maxring$x), y = mean(p_maxring$y))
p_maxring <- p_maxring %>%
mutate (mean_dev = sqrt((x - mean(x))^2 + (y - mean(y))^2))
coord_farthest <- c('x' = p_maxring$x[which.max(p_maxring$mean_dev)], 'y' = p_maxring$y[which.max(p_maxring$mean_dev)])
farthest_from_farthest <- sqrt((p_maxring$x - coord_farthest['x'])^2 + (p_maxring$y - coord_farthest['y'])^2)
coord_fff <- c('x' = p_maxring$x[which.max(farthest_from_farthest)], 'y' = p_maxring$y[which.max(farthest_from_farthest)])
## MAKE RED LINE
## h = 0.5
## Given the highly irregular shape of the contours, I will use only the largest contour line (0.95) for draing the line.
## Thus, average = 1. See function below for details.
ln = long.diam("x", "y", test.df, h = 0.5, average = 1) ## NOTE h = 0.5
## PLOT
ggplot(test.df, aes(x, y)) +
geom_density2d(color = 'red', lwd = 0.5, contour = T, h = 0.5) + ## NOTE h = 0.5
geom_segment((aes(x = coord_farthest['x'], y = coord_farthest['y'],
xend = coord_fff['x'], yend = coord_fff['y'])), col = 'blue', lwd = 2) +
geom_abline(intercept = ln[1], slope = ln[2], color = 'red', lwd = 2) +
coord_equal()
Finally, I came up with the following function to deal with all this. Sorry for the lack of comments/clarity
#### This will return the intercept and slope of the longest diameter (semi-major axis).
####If Average = TRUE, it will average the int and slope across different density contours.
long.diam = function(x, y, df, probs = c(0.95, 0.5, 0.1), average = T, h = 2) {
fun.df = data.frame(cbind(df[,x], df[,y]))
colnames(fun.df) = c("x", "y")
dens = kde2d(fun.df$x, fun.df$y, n = 200, h = h)
dx <- diff(dens$x[1:2])
dy <- diff(dens$y[1:2])
sz <- sort(dens$z)
c1 <- cumsum(sz) * dx * dy
levels <- sapply(probs, function(x) {
approx(c1, sz, xout = 1 - x)$y
})
names(levels) = paste0("L", str_sub(formatC(probs, 2, format = 'f'), -2))
#plot(fun.df$x,fun.df$y, asp = 1)
#contour(dens, levels = levels, labels=probs, add=T, col = c('red', 'blue', 'green'), lwd = 2)
#contour(dens, add = T, col = 'red', lwd = 2)
#abline(lm(fun.df$y~fun.df$x))
ls <- contourLines(dens, levels = levels)
names(ls) = names(levels)
lines.info = list()
for (i in 1:length(ls)) {
d = cbind(ls[[i]]$x, ls[[i]]$y)
exy = predict(ellipsoidhull(d))## the ellipsoid boundary
colnames(exy) = c("x", "y")
me = colMeans((exy)) ## center of the ellipse
dist2center = sqrt(rowSums((t(t(exy)-me))^2))
#plot(exy,type='l',asp=1)
#points(d,col='blue')
#lines(exy[order(dist2center)[1:2],])
#lines(exy[rev(order(dist2center))[1:2],])
max.dist = data.frame(exy[rev(order(dist2center))[1:2],])
line.fit = lm(max.dist$y ~ max.dist$x)
lines.info[[i]] = c(as.numeric(line.fit$coefficients[1]), as.numeric(line.fit$coefficients[2]))
}
names(lines.info) = names(ls)
#plot(fun.df$x,fun.df$y, asp = 1)
#contour(dens, levels = levels, labels=probs, add=T, col = c('red', 'blue', 'green'), lwd = 2)
#abline(lines.info[[1]], col = 'red', lwd = 2)
#abline(lines.info[[2]], col = 'blue', lwd = 2)
#abline(lines.info[[3]], col = 'green', lwd = 2)
#abline(apply(simplify2array(lines.info), 1, mean), col = 'black', lwd = 4)
if (isTRUE(average)) {
apply(simplify2array(lines.info), 1, mean)
} else {
lines.info[[average]]
}
}
Finally, here's the final implementation of the different answers:
library(MASS)
set.seed(123)
samples = 10000
r = 0.9
data = mvrnorm(n=samples, mu=c(0, 0), Sigma=matrix(c(2, r, r, 2), nrow=2))
x = data[, 1] # standard normal (mu=0, sd=1)
y = data[, 2] # standard normal (mu=0, sd=1)
#plot(x, y)
test.df = data.frame(x = x, y = y)
#### Find furthest two points of contour
## BLUE
p <- ggplot(test.df, aes(x, y)) +
geom_density2d(color = 'red', lwd = 2, contour = T, h = 2)
p_built <- ggplot_build(p)
p_data <- p_built$data[[1]]
p_maxring <- p_data[p_data[['level']] == min(p_data[['level']]),][,2:3]
coord_mean <- c(x = mean(p_maxring$x), y = mean(p_maxring$y))
p_maxring <- p_maxring %>%
mutate (mean_dev = sqrt((x - mean(x))^2 + (y - mean(y))^2))
coord_farthest <- c('x' = p_maxring$x[which.max(p_maxring$mean_dev)], 'y' = p_maxring$y[which.max(p_maxring$mean_dev)])
farthest_from_farthest <- sqrt((p_maxring$x - coord_farthest['x'])^2 + (p_maxring$y - coord_farthest['y'])^2)
coord_fff <- c('x' = p_maxring$x[which.max(farthest_from_farthest)], 'y' = p_maxring$y[which.max(farthest_from_farthest)])
#### Find the average intercept and slope of 3 contour lines (0.95, 0.5, 0.1), as in my long.diam function above.
## RED
ln = long.diam("x", "y", test.df)
#### Plot everything. Black line is GLM
ggplot(test.df, aes(x, y)) +
geom_point(color = 'grey') +
geom_density2d(color = 'red', lwd = 1, contour = T, h = 2) +
geom_smooth(method = "glm", se = F, lwd = 1, color = 'black') +
geom_abline(intercept = ln[1], slope = ln[2], col = 'red', lwd = 1) +
geom_segment((aes(x = coord_farthest['x'], y = coord_farthest['y'],
xend = coord_fff['x'], yend = coord_fff['y'])), col = 'blue', lwd = 1) +
coord_equal()
I have spent sometimes doing this but I could not reach the solution. I have this code to plot concentric circles over 25 by 25 grids using ggplot2 in R. I do not know how to be able to manipulate the center of the concentric circles to be not at the origin(0,0), but at the center of the grid (5,5). I also would like to keep the scale of the grid from 25 to 25. Thank you very much in advance
require(ggplot2)
require(grid)
x <- rep(seq(25), 25)
y <- rep(seq(25), each=25)
circ_rads <- seq(1,5,2)
qplot(x, y) +
lapply(circ_rads, FUN = function(x)
annotation_custom(circleGrob(gp = gpar(fill = "transparent", color = "black")),
-x, x, -x, x)) +
geom_text(aes(x = 0, y = circ_rads + 0.1, label = circ_rads)) +
coord_fixed(ratio = 1)
We can use ggforce::geom_circle like this:
library(ggplot2)
library(ggforce)
x <- rep(seq(25), 25)
y <- rep(seq(25), each=25)
circ_rads <- seq(1,5,2)
xy <- data.frame(x=x, y=y)
circles <- data.frame(
x0 = 5, # You say circles should be a 'centre of the grid' and 5, 5
y0 = 5, # not sure what you really mean, so going with 5, 5 here
r = circ_rads
)
ggplot() +
geom_point(data = xy,
aes(x,
y)) +
geom_circle(data = circles,
aes(x0 = x0,
y0 = y0,
r = r)) +
coord_fixed()
I have a parameterized contour that I'm plotting in R. What I'm trying to do is add arrows along the curve to show the viewer which direction the curve is going in.
Here's the code I'm using to generate the curve:
library(ggplot2)
library(grid)
set.seed(9)
T<-sort(runif(2^12,min=2^-5, max=16))
U<-function(t) exp(4*log(t) - 4*t)*(cos(log(t) + 3*t))
#Re(t^(4+1i)*t)*exp(-(4-3i)*t))
V<-function(t) exp(4*log(t) - 4*t)*(sin(log(t) + 3*t))
#Im(t^(4+1i)*t)*exp(-(4-3i)*t))
X<-sapply(T,U)
Y<-sapply(T,V)
df<-data.frame(X=X,Y=Y)
p<-ggplot(data=df,aes(x = df$X, y = df$Y))
p+theme_bw()+
geom_path(size=1,color='blue',linetype=1) #+
#geom_segment(aes(xend=c(tail(X, n=-1), NA), yend=c(tail(Y, n=-1), NA)),
#arrow=arrow(length=unit(0.2,"cm")),color='blue')
dev.off()
The last part I commented out:
#+
#geom_segment(aes(xend=c(tail(X, n=-1), NA), yend=c(tail(Y, n=-1), NA)),
#arrow=arrow(length=unit(0.2,"cm")),color='blue')
does something similar to what I want, but the arrows are very close together and the curve ends up looking "fuzzy" rather than directed.
Here's the fuzzy and non-fuzzy version of the curve:
Thank you!
It might look better if the arrows were more equally spaced along the curved path, e.g.
library(ggplot2)
library(grid)
set.seed(9)
T <- sort(runif(2^12,min=2^-5, max=16))
U <- function(t) exp(4*log(t) - 4*t)*(cos(log(t) + 3*t))
V <- function(t) exp(4*log(t) - 4*t)*(sin(log(t) + 3*t))
drough <- data.frame(x=sapply(T,U), y=sapply(T,V))
p <- ggplot(data = drough, aes(x = x, y = y)) +
geom_path()
## because the parametric curve was generated with uneven spacing
## we can try to resample more evenly along the path
parametric_smoothie <- function(x, y, N=1e2, phase=1, offset=0) {
lengths <- c(0, sqrt(diff(x)^2 + diff(y)^2))
l <- cumsum(lengths)
lmax <- max(l)
newpos <- seq(phase*lmax/N, lmax-phase*lmax/N, length.out = N) + offset*lmax/N
xx <- approx(l, x, newpos)$y
yy <- approx(l, y, newpos)$y
data.frame(x = xx, y = yy)
}
## this is a finer set of points
dfine <- parametric_smoothie(X, Y, 20)
gridExtra::grid.arrange(p + geom_point(data = drough, col="grey"),
p + geom_point(data = dfine, col="grey"), ncol=2)
## now we use this function to create N start points for the arrows
## and another N end points slightly apart to give a sense of direction
relay_arrow <- function(x, y, N=10, phase = 0.8, offset = 1e-2, ...){
start <- parametric_smoothie(x, y, N, phase)
end <- parametric_smoothie(x, y, N, phase, offset)
data.frame(xstart = start$x, xend = end$x,
ystart = start$y, yend = end$y)
}
breaks <- relay_arrow(drough$x, drough$y, N=20)
p + geom_point(data = breaks, aes(xstart, ystart), col="grey98", size=2) +
geom_segment(data = breaks, aes(xstart, ystart, xend = xend, yend = yend),
arrow = arrow(length = unit(0.5, "line")),
col="red", lwd=1)
One way to do it is to draw them on after. You can probably get the direction better by using the angle aesthetic (if it's easy enough to work out):
p<-ggplot(data=df,aes(x = X, y = Y))
p +
geom_path(size=1,color='blue',linetype=1)+
geom_segment(data = df[seq(1, nrow(df), 20), ], aes(x = X, y = Y, xend=c(tail(X, n=-1), NA), yend=c(tail(Y, n=-1), NA)),
arrow=arrow(length=unit(0.2,"cm"), type = "closed"), color="blue", linetype = 0, inherit.aes = FALSE)
Note the closed arrow type. I had to do that so they weren't interpreted as lines and hence disappear when linetype = 0.
Try this with slight modification of your code (you don't want to compromise the quality of the curve by having smaller number of points and at the same time you want to have smaller number of segments to draw the arrows for better quality of the arrows):
library(ggplot2)
library(grid)
set.seed(9)
T<-sort(runif(2^12,min=2^-5, max=16))
U<-function(t) exp(4*log(t) - 4*t)*(cos(log(t) + 3*t))
#Re(t^(4+1i)*t)*exp(-(4-3i)*t))
V<-function(t) exp(4*log(t) - 4*t)*(sin(log(t) + 3*t))
#Im(t^(4+1i)*t)*exp(-(4-3i)*t))
X<-sapply(T,U)
Y<-sapply(T,V)
df<-data.frame(X=X,Y=Y)
df1 <- df[seq(1,length(X), 8),]
p<-ggplot(data=df,aes(x = df$X, y = df$Y))
p+theme_bw()+
geom_path(size=1,color='blue',linetype=1) +
geom_segment(data=df1,aes(x=X, y=Y, xend=c(tail(X, n=-1), NA), yend=c(tail(Y, n=-1), NA)),
arrow=arrow(length=unit(0.3,"cm"),type='closed'),color='blue')
#dev.off()
Suppose I want to plot the following data:
# First set of X coordinates
x <- seq(0, 10, by = 0.2)
# Angles from 0 to 90 degrees
angles <- seq(0, 90, length.out = 10)
# Convert to radian
angles <- deg2rad(angles)
# Create an empty data frame
my.df <- data.frame()
# For each angle, populate the data frame
for (theta in angles) {
y <- sin(x + theta)
tmp <- data.frame(x = x, y = y, theta = as.factor(theta))
my.df <- rbind(my.df, tmp)
}
x1 <- seq(0, 12, by = 0.3)
y1 <- sin(x1 - 0.5)
tmp <- data.frame(x = x1, y = y1, theta = as.factor(-0.5))
my.df <- rbind(my.df, tmp)
ggplot(my.df, aes(x, y, color = theta)) + geom_line()
That gives me a nice plot:
Now I want to draw a heat map out of this data set. There are tutorials here and there that do it using geom_tile to do it.
So, let's try:
# Convert the angle values from factors to numerics
my.df$theta <- as.numeric(levels(my.df$theta))[my.df$theta]
ggplot(my.df, aes(theta, x)) + geom_tile(aes(fill = y)) + scale_fill_gradient(low = "blue", high = "red")
That does not work, and the reason is that my x coordinates do not have the same step:
x <- seq(0, 10, by = 0.2) vs x1 <- seq(0, 12, by = 0.3)
But as soon as I use the same step x1 <- seq(0, 12, by = 0.2), it works:
I real life, my data sets are not regularly spaced (these are experimental data), but I still need to display them as a heat map. How can I do?
You can use akima to interpolate the function into a form suitable for heat map plots.
library(akima)
library(ggplot2)
my.df.interp <- interp(x = my.df$theta, y = my.df$x, z = my.df$y, nx = 30, ny = 30)
my.df.interp.xyz <- as.data.frame(interp2xyz(my.df.interp))
names(my.df.interp.xyz) <- c("theta", "x", "y")
ggplot(my.df.interp.xyz, aes(x = theta, y = x, fill = y)) + geom_tile() +
scale_fill_gradient(low = "blue", high = "red")
If you wish to use a different resolution you can change the nx and ny arguments to interp.
Another way to do it with just ggplot2 is to use stat_summary_2d.
library(ggplot2)
ggplot(my.df, aes(x = theta, y = x, z = y)) + stat_summary_2d(binwidth = 0.3) +
scale_fill_gradient(low = "blue", high = "red")
The polygon function in R seems rather simple...however I can't get it to work.
It easily works with this code:
x <- seq(-3,3,0.01)
y1 <- dnorm(x,0,1)
y2 <- 0.5*dnorm(x,0,1)
plot(x,y1,type="l",bty="L",xlab="X",ylab="dnorm(X)")
points(x,y2,type="l",col="red")
polygon(c(x,rev(x)),c(y2,rev(y1)),col="skyblue")
When adopting this to something else, it doesn't work. Here some stuff to reproduce the issue:
lowerbound = c(0.05522914,0.06567045,0.07429926,0.08108482,0.08624472,0.09008050,0.09288837,0.09492226)
upperbound = c(0.1743657,0.1494058,0.1333106,0.1227383,0.1156714,0.1108787,0.1075915,0.1053178)
lim = c(100,200,400,800,1600,3200,6400,12800)
plot(upperbound, ylim=c(0, 0.2), type="b", axes=FALSE)
lines(lowerbound, type="b", col="red")
atvalues <- seq(1:8)
axis(side=1, at=atvalues, labels=lim)
axis(side=2, at=c(0,0.05,0.1,0.15,0.2), labels=c(0,0.05,0.1,0.15,0.2))
polygon(lowerbound,upperbound, col="skyblue")
It also doesn't work when only segmenting a subset when directly calling the coordinates:
xpoly <- c(100,200,200,100)
ypoly <- c(lowerbound[1], lowerbound[2], upperbound[2], upperbound[1])
polygon(xpoly,ypoly, col="skyblue")
What am I missing?
Plotting the whole polygon
You need to supply both x and y to polygon. Normally, you'd also do that for plot, but if you don't it will just use the Index as x, that is integers 1 to n. We can use that to make an x range. seq_along will create a 1:n vector, where n is the length of another object.
x <- c(seq_along(upperbound), rev(seq_along(lowerbound)))
y <- c(lowerbound, rev(upperbound))
plot(upperbound, ylim=c(0, 0.2), type="b", axes=FALSE)
lines(lowerbound, type="b", col="red")
atvalues <- seq(1:8)
axis(side=1, at=atvalues, labels=lim)
axis(side=2, at=c(0,0.05,0.1,0.15,0.2), labels=c(0,0.05,0.1,0.15,0.2))
polygon(x = x, y = y, col="skyblue")
Plotting a subset
For a subset, I would create the y first, and then use the old x to easily get `x values:
y2 <- c(lowerbound[1:2], upperbound[2:1])
x2 <- x[which(y2 == y)]
polygon(x2, y2, col="skyblue")
How I would do it
Creating something like this is much easier in ggplot2, where geom_ribbon does a lot of the heavy lifting. We just have to make an actual data.frame, an stop relying on indices.
Full polygon:
library(ggplot2)
ggplot(d, aes(x = x, ymin = low, ymax = up)) +
geom_ribbon(fill = 'skyblue', alpha = 0.5) +
geom_line(aes(y = low), col = 'red') +
geom_line(aes(y = up), col = 'black') +
scale_x_continuous(trans = 'log2') +
theme_bw()
Subset:
ggplot(d, aes(x = x, ymin = low, ymax = up)) +
geom_ribbon(data = d[1:2, ], fill = 'skyblue', alpha = 0.5) +
geom_line(aes(y = low), col = 'red') +
geom_line(aes(y = up), col = 'black') +
scale_x_continuous(trans = 'log2') +
theme_bw()