Polygon function in R creates a line between first and last point

I have a time series that I'd like to plot with the polygon function, because I want to shade the area between different time series. However, when I call polygon(), the function adds a line between the first and last point (in essence, it connects them to close the shape). I would like to know how to tell R not to join the two. Slightly related questions have been posted (Line connecting the points in the plot function in R), but the solutions didn't help. Any help would be appreciated.
I have already tried several things, such as reordering the data like in the part below.
% ts_lb_vec is my time-series in vector format;
% x is a vector of time (2000 to 2015);
% I first call plot which plots x (time) with y (the time-series). This works fine;
plot(x, ts_lb_vec,type='n',ylim=c(-300,300), ylab="", xlab="")
But when I use the polygon function for its shading capabilities, it draws the line. I have tried reordering the data (as below) to eliminate the problem, but without success:
polygon(x[order(x)], ts_lb_vec[order(x)], xlim=range(x), ylim=range(ts_lb_vec))
I would just like R not to connect my first and last point when I call the polygon function (see image). The figure attached below was produced using the following code:
plot(x, ts_lb_vec,type='n', ylab="", xlab="")
polygon(x, ts_lb_vec)
Just to clarify, what I would like is for the space between two time series to be filled, hence why I need the function polygon. See image below

I put together a solution using ggplot2.
The key step is drawing a separate polygon where the order of one of the curves is inverted to avoid the crossing over back to the start.
library(ggplot2)
# simple example data
examp.df <- data.frame(time = seq_len(15), a = c(1,2,3,4,5,5,5,4,3,2,4,5,6,7,8), b = c(2,4,5,6,7,8,7,6,6,5,6,4,3,2,1))
# the polygon is generated by tracing curve a forwards and curve b in reverse
polygon <- data.frame(time = c(examp.df$time, rev(examp.df$time)), y.pos = c(examp.df$a, rev(examp.df$b)))
ggplot(examp.df) +
geom_polygon(data = polygon, aes(x = time, y = y.pos), fill = "blue", alpha = 0.25) +
geom_line(aes(x= time, y = a), size = 1, color = "red") +
geom_line(aes(x = time, y = b), size = 1, color = "green") +
theme_classic()
Which results in:
If you want to know more about ggplot2 this is a good introduction.
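The same reversal idea also works with polygon() in base graphics, which is what the question originally asked about. A minimal sketch, reusing the example values from the answer above; the names poly_x and poly_y and the colours are illustrative assumptions, not taken from the original post:

```r
# Example series (same values as examp.df in the answer above)
time <- seq_len(15)
a <- c(1, 2, 3, 4, 5, 5, 5, 4, 3, 2, 4, 5, 6, 7, 8)
b <- c(2, 4, 5, 6, 7, 8, 7, 6, 6, 5, 6, 4, 3, 2, 1)

# Trace series a from left to right, then series b from right to left.
# polygon() still closes the shape, but the closing segment is now the
# short vertical edge at the first time point instead of a long diagonal.
poly_x <- c(time, rev(time))
poly_y <- c(a, rev(b))

plot(time, a, type = "n", ylim = range(a, b), xlab = "", ylab = "")
polygon(poly_x, poly_y, col = adjustcolor("blue", alpha.f = 0.25), border = NA)
lines(time, a, col = "red", lwd = 2)
lines(time, b, col = "green", lwd = 2)
```

Setting border = NA suppresses the outline entirely, so only the shaded area and the two lines drawn afterwards are visible.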


Is it possible to over-ride the x axis range in R package ggbio when using autoplot and ensdb transcripts?

I am trying to use ggbio to plot gene transcripts. I want to plot a very specific range so it matches my ggplot2 plots. The problem is my example plot ends up having range of 133,567,500-133,570,000 regardless of the GRange and whether I specify xlim or not.
This example should only plot a small bit of intron (the thin arrowed line) but instead plots the full 2 exons and intron in between. I believe autoplot wants to plot the entire transcript or transcripts present in the range and widens the range to accommodate for that.
library(EnsDb.Hsapiens.v86)
library(ggbio)
ensdb <- EnsDb.Hsapiens.v86
mut<-GRanges("10", IRanges(133568909, 133569095))
gene <- autoplot(ensdb, which=mut, names.expr="gene_name",xlim=c(133568909,133569095))
gene.gg <- gene@ggplot
png("test_gene_plot_5.png")
gene.gg
dev.off()
Is there any way to over-ride this? I've looked at the manual page for autoplot and I couldn't narrow down an option that would fix it. Others have said to use xlim, but that does not seem to change anything
I like ggbio because it can make a ggplot2 object to be plotted along with other ggplot2 objects. I have not seen an example for that with other approaches like Gvis. But I would entertain other approaches if they could be combined with my existing plots.
Thanks!
Amy
It kind of depends on whether you want clipped or squished data. Usually autoplot outputs a ggplot object at some point that can be manipulated as such.
For squished data:
library(GenomicRanges) # just to be sure start and end work
gene@ggplot +
scale_x_continuous(limits = c(start(mut), end(mut)), oob = scales::squish)
For clipped data:
gene@ggplot +
coord_cartesian(xlim = c(start(mut), end(mut)))
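The two approaches treat out-of-range values differently: scale limits change the data before it is drawn, whereas coord_cartesian only zooms the view and leaves the data untouched. A small sketch of what the out-of-bounds handlers in the scales package do to a plain numeric vector; the vector x and the range 0 to 1 are made-up illustration values:

```r
library(scales)

x <- c(-1, 0.5, 2)

# squish(): out-of-range values are moved to the nearest limit
squished <- squish(x, range = c(0, 1))
squished
# [1] 0.0 0.5 1.0

# censor(): the default oob behaviour of continuous scales,
# out-of-range values are replaced with NA and dropped from the plot
censored <- censor(x, range = c(0, 1))
censored
# [1]  NA 0.5  NA
```

With squish, an exon that extends past the limits is drawn up to the plot edge; with the default censoring it would disappear altogether.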
But to be totally honest, I'm unsure whether this is the most informative way to communicate that you are plotting the internals of an intron.
Alternatively, I've written a gene model geom at some point that doesn't work through the autoplot methods (which can sometimes be a pain if you want to customise everything). Downside is that you'd have to do some manual gene searching and setting aesthetics. Upside is that it works like most other geoms and is therefore easy to combine with some other data.
library(ggnomics) # from: https://github.com/teunbrand/ggnomics
# Finding a gene's exons manually
my_gene <- transcriptsByOverlaps(EnsDb.Hsapiens.v86, mut)
my_gene <- exonsByOverlaps(EnsDb.Hsapiens.v86, my_gene)
my_gene <- as.data.frame(my_gene)
some_other_data <- data.frame(
x = seq(start(mut), end(mut), by = 10),
y = cumsum(rnorm(19))
)
ggplot(some_other_data) +
geom_line(aes(x, y)) +
geom_genemodel(data = my_gene,
aes(xmin = start, xmax = end,
y = max(some_other_data$y) + 1,
group = 1, strand = strand)) +
coord_cartesian(xlim = c(start(mut), end(mut)))
Hope that helped!

Time Series Analysis using ts.plot and abline()

Please explain which transformation I should use in the code below in order to apply a white noise (WN) model.
Below is the code, where differencing is used. I did not use log() because the series is decaying:
data <- c(60088,48398,54687,43337,47839,43480,53297,46882,45387,47186,42794,43274,31486,29036,25242,21792,23699,19161)
diff_data <- diff(data)
ts.plot(diff_data)
model_wn <- arima(diff_data, order = c(0, 0, 0))
coeff<-model_wn$coef
ts.plot(data)
abline(0, coeff)
Please explain two things:
With ts.plot and abline, the abline is not visible in the graph.
What can I do with the above data using time series analysis?
'abline' has some parameters that you can specify, for example-
If you want a horizontal line you need to specify h = y-value
If you want a vertical line, you need to specify v = x-value
Your plot is produced by-
ts.plot(data)
If you want a horizontal line in your plot, add this code after the above code-
abline(h = 40000, lty = "dashed", col = "black")
'lty' is for line type and 'col' is for line color.
Similarly, if you want a vertical line, replace 'h' with 'v' in the above code. But remember that the value of 'v' should be within the bounds of your x-variable values.
Hope this helps answer your question.
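As for why the line in the question is invisible: abline(0, coeff) draws a line with intercept 0 and slope coeff. The fitted intercept of the differenced series is a large negative number (close to the mean of the differences), so that line lies far below the plotted range of the original data. A hedged sketch of two ways to show the fitted value instead; the name mu and the choice to anchor the drift line at the first observation are assumptions for illustration:

```r
data <- c(60088, 48398, 54687, 43337, 47839, 43480, 53297, 46882, 45387,
          47186, 42794, 43274, 31486, 29036, 25242, 21792, 23699, 19161)
diff_data <- diff(data)

# White noise model on the differences: the only coefficient is the intercept
model_wn <- arima(diff_data, order = c(0, 0, 0))
mu <- unname(model_wn$coef["intercept"])

# On the differenced series the fitted mean is a horizontal line
ts.plot(diff_data)
abline(h = mu, lty = "dashed")

# On the original series the same value acts as a per-step drift.
# A line through the first observation with slope mu is
# y = data[1] + mu * (t - 1), i.e. intercept data[1] - mu.
ts.plot(data)
abline(a = data[1] - mu, b = mu, col = "red")
```

A white noise model on the first differences corresponds to a random walk with drift on the original series, which is one way to read the steady decline in these values.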

"zoom"/"scale" with coord_polar()

I've got a polar plot which uses geom_smooth(). The smoothed loess line though is very small and rings around the center of the plot. I'd like to "zoom in" so you can see it better.
Using something like scale_y_continuous(limits = c(-.05,.7)) will make the geom_smooth ring bigger, but it will also alter it because it will recompute with the datapoints limited by the limits = c(-.05,.7) argument.
For a Cartesian plot I could use something like coord_cartesian(ylim = c(-.05,.7)) which would clip the chart but not the underlying data. However I can see no way to do this with coord_polar()
Any ideas? I thought there might be a way to do this with grid.clip() in the grid package but I am not having much luck.
What my plot looks like now, note "higher" red line:
What I'd like to draw:
What I get when I use scale_y_continuous() note "higher" blue line, also it's still not that big.
I haven't figured out a way to do this directly in coord_polar, but this can be achieved by modifying the ggplot_build object under the hood.
First, here's an attempt to make a plot like yours, using the fake data provided at the bottom of this answer.
library(ggplot2)
plot <- ggplot(data, aes(theta, values, color = series, group = series)) +
geom_smooth() +
scale_x_continuous(breaks = 30*-6:6, limits = c(-180,180)) +
coord_polar(start = pi, clip = "on") # use "off" to extend plot beyond axes
plot
Here, my Y (or r for radius) axis ranges from about -2.4 to 4.3.
We can confirm this by looking at the associated ggplot_build object:
# Create ggplot_build object and look at radius range
plot_build <- ggplot_build(plot)
plot_build[["layout"]][["panel_params"]][[1]][["r.range"]]
# [1] -2.385000 4.337039
If we redefine the range of r and plot that, we get what you're looking for, a close-up of the plot.
# Here we change the 2nd element (max) of r.range from 4.337 to 1
plot_build[["layout"]][["panel_params"]][[1]][["r.range"]][2] <- 1
plot2 <- ggplot_gtable(plot_build)
plot(plot2)
Note, this may not be a perfect solution, since this seems to introduce some image cropping issues that I don't know how to address. I haven't tested to see if those can be overcome using ggsave or perhaps by further modifying the ggplot_build object.
Sample data used above:
set.seed(4.2)
data <- data.frame(
series = as.factor(rep(c(1:2), each = 10)),
theta = rep(seq(from = -170, to = 170, length.out = 10), times = 2),
values = rnorm(20, mean = 0, sd = 1)
)
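An alternative that avoids editing the ggplot_build object is to compute the loess fit yourself and plot only the fitted values, so that y limits can no longer change what the smoother sees. This is a sketch under stated assumptions, not part of the original answer: it uses loess() directly with default settings, the names dat, fits and r_lim are hypothetical, and the padding added to the limits is arbitrary.

```r
library(ggplot2)

# Fake data with the same shape as the sample data above
set.seed(42)
dat <- data.frame(
  series = as.factor(rep(1:2, each = 10)),
  theta  = rep(seq(from = -170, to = 170, length.out = 10), times = 2),
  values = rnorm(20, mean = 0, sd = 1)
)

# Fit loess per series on the full data and predict on a fine grid
fits <- do.call(rbind, lapply(split(dat, dat$series), function(d) {
  fit  <- loess(values ~ theta, data = d)
  grid <- data.frame(theta = seq(min(d$theta), max(d$theta), length.out = 80))
  grid$fitted <- predict(fit, newdata = grid)
  grid$series <- d$series[1]
  grid
}))

# Limits chosen around the fitted curves only (padding is arbitrary)
r_lim <- range(fits$fitted) + c(-0.5, 0.25)

p <- ggplot(fits, aes(theta, fitted, colour = series, group = series)) +
  geom_line() +
  scale_x_continuous(breaks = 30 * -6:6, limits = c(-180, 180)) +
  scale_y_continuous(limits = r_lim) +
  coord_polar(start = pi)
p
```

The trade-off is that the confidence ribbon from geom_smooth is lost unless you also compute and draw the standard errors yourself.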

Smoothing using kernel and loess in R

I am trying to smooth my data set using a kernel or loess smoothing method, but the results are unclear or not what I want. My questions are as follows.
My x data is "conc" and y data is "depth", which is ex. cm.
1) Kernel smooth
k <- kernel("daniell", 150)
plot(k)
K <- kernapply(conc, k)
plot(conc~depth)
lines(K, col = "red")
Here, my data is smoothed with frequency=150. Does this mean that every data point is averaged with its 150 neighboring data points on the right and on the left? What does "daniell" mean? I could not find an explanation online.
2) Loess smooth
p<-qplot(depth, conc, data=total)
p1 <- p + geom_smooth(method = "loess", size = 1, level=0.95)
Here, what is the default of the loess smooth function? If I want to smooth my data with frequency=150 as in the case above (a moving average over every 150 data points), how can I modify this code?
3) To show the y-axis on a log scale, I put "log10(conc)" instead of "conc", and it worked. But I cannot change the y-axis tick labels. I tried to use "scale_y_log10(limits = c(1,1e3))" in my code to show axis tick labels like 10^0, 10^1, 10^2..., but it did not work.
Please answer my questions. Thanks a lot for your help.
Sum
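On the first question, a Daniell kernel of dimension m is a plain moving average with equal weights over 2m + 1 points (m on each side plus the point itself), so kernel("daniell", 150) averages over 301 points. A small sketch on made-up data that checks this against stats::filter; the vector x and m = 2 are illustration values only:

```r
set.seed(1)
x <- cumsum(rnorm(50))   # made-up series
m <- 2

k <- kernel("daniell", m)
k$coef                   # m + 1 equal weights, each 1 / (2 * m + 1)

# kernapply() drops m points at each end, so the result is shorter than x
smoothed <- kernapply(x, k)
length(smoothed)         # length(x) - 2 * m

# Same result as a centred moving average with equal weights
ma <- stats::filter(x, rep(1 / (2 * m + 1), 2 * m + 1), sides = 2)
same <- all.equal(as.numeric(smoothed), as.numeric(ma[!is.na(ma)]))

# Plot the smoothed values at the positions they belong to
idx <- (m + 1):(length(x) - m)
plot(x, type = "l")
lines(idx, smoothed, col = "red")
```

Two consequences for the code in the question: lines(K) plots the smoothed values against their index rather than against depth, so the red line will not line up with plot(conc ~ depth) unless matching depth values are supplied. For loess, the neighbourhood is set by span, the fraction of points used in each local fit (default 0.75 in loess()), so a span of roughly 301 / n would give a comparable window, although loess fits a weighted local polynomial rather than a flat average.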

How can I recreate this 2d surface + contour + glyph plot in R?

I've run a 2d simulation in some modelling software from which I've got an export of x,y point locations with a set of 6 attributes. I wish to recreate a figure that combines the data, like this:
The ellipses and the background are shaded according to attribute 1 (and the borders of these are of course representing the model geometry, but I don't think I can replicate that), the isolines are contours of attribute 2, and the arrow glyphs are from attributes 3 (x magnitude) and 4 (y magnitude).
The x,y points are centres of the triangulated mesh I think, and look like this:
I want to know how I can recreate a plot like this with R. To start with I have irregularly-spaced data due to it being exported from an irregular mesh. That's immediately where I get stuck with R, having only ever used it for producing box-and-whisker plots and the like.
Here's the data:
https://dl.dropbox.com/u/22417033/Ellipses_noheader.txt
Edit: fields: x, y, heat flux (x), heat flux (y), thermal conductivity, Temperature, gradT (x), gradT (y).
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
It's quite easy to make the lower plot (assuming there is a data frame named 'edat' read in as follows):
edat <- read.table(file=file.choose())
with(edat, plot(V1,V2, cex=0.2))
Things get a bit more beautiful with:
with(edat, plot(V1,V2, cex=0.2, col=V5))
So I do not think your original is being faithfully represented by the data. The contour lines are NOT straight across the "conductors". I call them "conductors" because this looks somewhat like iso-potential lines in electrostatics. I'm adding some text here to serve as a search handle for others who might be searching for plotting problems in real world physics: vector-field (the arrows) , heat equations, gradient, potential lines.
You can then overlay the vector field with:
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
You could "zoom in" with xlim and ylim:
with(edat, plot(V1,V2, cex=0.3, col=V5, xlim=c(0, 10000), ylim=c(-8000, -2000) ))
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
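Guessing that the contour requested is for the Temps variable. Take your pick of contour plots.
require(akima)
# interp() below refers to columns by name, so apply the names from the question's edit first
names(edat) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
intflow<- with(edat, interp(x=x, y=y, z=Temps, xo=seq(min(x), max(x), length = 410),
yo=seq(min(y), max(y), length = 410), duplicate="mean", linear=FALSE) )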
require(lattice)
contourplot(intflow$z)
filled.contour(intflow)
with( intflow, contour(x=x, y=y, z=z) )
The last one will mix with the other plotting examples since those were using base plotting functions. You may need to switch to points instead of plot.
There are several parts to your plot so you will probably need several tools to make the different parts.
The background and ellipses can be created with polygon (once you figure where they should be).
The contourLines function can calculate the contour lines for you, which you can add with the lines function (or contour has an add argument and could probably be used to add the lines directly).
The akima package has a function interp which can estimate values on a grid given the values ungridded.
The my.symbols function along with ms.arrows, both from the TeachingDemos package, can be used to draw the vector field.
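The contourLines step described above can be sketched as follows. This is a minimal illustration on R's built-in volcano matrix, which is already on a regular grid, so the interp step is skipped; the choice of 8 levels is arbitrary:

```r
# Built-in gridded example data (not the data from the question)
z <- volcano
x <- seq_len(nrow(z))
y <- seq_len(ncol(z))

# contourLines() returns a list; each element has a level and x/y coordinates
cl <- contourLines(x, y, z, nlevels = 8)

# Empty plot, then add every contour segment with lines()
plot(range(x), range(y), type = "n", xlab = "x", ylab = "y")
for (segment in cl) {
  lines(segment$x, segment$y)
}
```

Because the segments are added with lines(), they can be drawn on top of a polygon background or any other base graphics layer, in whatever order the figure needs.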
@DWin is right to say that your graph doesn't faithfully represent your data, so I would advise following his answer. However, here is how to reproduce your graph (as closely as I could):
Ellipses <- read.table(file.choose())
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
require(splancs)
require(akima)
First preparing the data:
#First the background layer (the 'kxx' layer):
# Here the regular grid on which we're gonna do the interpolation
E.grid <- with(Ellipses,
expand.grid(seq(min(x),max(x),length=200),
seq(min(y),max(y),length=200)))
names(E.grid) <- c("x","y") # Without this step, function inout throws an error
E.grid$Value <- rep(0,nrow(E.grid))
#Split the dataset according to unique values of kxx
E.k <- split(Ellipses,Ellipses$kxx)
# Find the convex hull delimiting each of those values domain
E.k.ch <- lapply(E.k,function(X){X[chull(X$x,X$y),]})
for(i in unique(Ellipses$kxx)){ # Pick the value for each coordinate in our regular grid
E.grid$Value[inout(E.grid[,1:2],E.k.ch[names(E.k.ch)==i][[1]],bound=TRUE)]<-i
}
# Then the regular grid for the second layer (Temp)
T.grid <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=200),
yo=seq(min(y),max(y),length=200),
duplicate="mean", linear=FALSE))
# The regular grids for the arrow layer (gradT)
dx <- with(Ellipses,
interp(x,y,gradTx,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
dy <- with(Ellipses,
interp(x,y,gradTy,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
T.grid2 <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
gradTgrid<-expand.grid(dx$x,dx$y)
And then the plotting:
palette(grey(seq(0.5,0.9,length=5)))
par(mar=rep(0,4))
plot(E.grid$x, E.grid$y, col=E.grid$Value,
axes=F, xaxs="i", yaxs="i", pch=19)
contour(T.grid, add=TRUE, col=colorRampPalette(c("blue","red"))(15), drawlabels=FALSE)
arrows(gradTgrid[,1], gradTgrid[,2], # Here I multiply the values so you can see them
gradTgrid[,1]-dx$z*40*T.grid2$z, gradTgrid[,2]-dy$z*40*T.grid2$z,
col="yellow", length=0.05)
To understand in detail how this code works, I advise you to read the following help pages: ?inout, ?chull, ?interp, ?expand.grid and ?contour.
