Calculate a 2D spline curve in R

I'm trying to calculate a Bezier-like spline curve that passes through a sequence of x-y coordinates. An example would be like the following output from the cscvn function in Matlab (example link):
I believe the (no longer maintained) grid package used to do this (grid.xspline function?), but I haven't been able to install an archived version of the package, and don't find any examples exactly along the lines of what I would like.
The bezier package also looks promising, but it is very slow and I also can't get it quite right:
library(bezier)
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
p <- cbind(x,y)
xlim <- c(min(x) - 0.1*diff(range(x)), c(max(x) + 0.1*diff(range(x))))
ylim <- c(min(y) - 0.1*diff(range(y)), c(max(y) + 0.1*diff(range(y))))
plot(p, xlim=xlim, ylim=ylim)
text(p, labels=seq(n), pos=3)
bp <- pointsOnBezier(cbind(x,y), n=100)
lines(bp$points)
arrows(bp$points[nrow(bp$points)-1,1], bp$points[nrow(bp$points)-1,2],
bp$points[nrow(bp$points),1], bp$points[nrow(bp$points),2]
)
As you can see, it doesn't pass through any points except the end values.
I would greatly appreciate some guidance here!

There is no need to use grid really. You can access xspline from the graphics package.
Following from your code and the shape from @mrflick:
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
p <- cbind(x,y)
xlim <- c(min(x) - 0.1*diff(range(x)), c(max(x) + 0.1*diff(range(x))))
ylim <- c(min(y) - 0.1*diff(range(y)), c(max(y) + 0.1*diff(range(y))))
plot(p, xlim=xlim, ylim=ylim)
text(p, labels=seq(n), pos=3)
You just need one extra line:
xspline(x, y, shape = c(0,rep(-1, 10-2),0), border="red")

It may not be the best approach, but grid certainly isn't inactive. It's included as a default package with the R installation. It's the underlying graphics engine for plotting libraries like lattice and ggplot. You shouldn't need to install it; you should just be able to load it. Here's how I might translate your code to use grid.xspline:
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
xlim <- c(min(x) - 0.1*diff(range(x)), c(max(x) + 0.1*diff(range(x))))
ylim <- c(min(y) - 0.1*diff(range(y)), c(max(y) + 0.1*diff(range(y))))
library(grid)
grid.newpage()
pushViewport(viewport(xscale=xlim, yscale=ylim))
grid.points(x, y, pch=16, size=unit(2, "mm"),
default.units="native")
grid.text(seq(n), x,y, just=c("center","bottom"),
default.units="native")
grid.xspline(x, y, shape=c(0,rep(-1, 10-2),0), open=TRUE,
default.units="native")
popViewport()
Note that grid is pretty low-level, so it's not super easy to work with, but it does give you far more control over what and where you plot.
And if you want to extract the points along the curve rather than draw it, look at the ?xsplinePoints help page.
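As a rough sketch of that last point, the following traces the same curve with xsplineGrob and xsplinePoints instead of drawing it directly. It assumes the viewport is still pushed when the points are requested, since xsplinePoints reports locations in inches relative to the current viewport. The names xs, pts, px and py are made up for this illustration.

```r
library(grid)

set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
xlim <- extendrange(x, f = 0.1)
ylim <- extendrange(y, f = 0.1)

grid.newpage()
pushViewport(viewport(xscale = xlim, yscale = ylim))

# Build the X-spline as a grob instead of drawing it immediately
xs <- xsplineGrob(x, y, shape = c(0, rep(-1, n - 2), 0), open = TRUE,
                  default.units = "native")

# Trace the curve; locations come back in inches within the current viewport
pts <- xsplinePoints(xs)

# Convert back to the data ("native") scale while the viewport is still pushed
px <- convertX(pts$x, "native", valueOnly = TRUE)
py <- convertY(pts$y, "native", valueOnly = TRUE)

# Redraw from the extracted coordinates to check that they trace the curve
grid.lines(px, py, default.units = "native", gp = gpar(col = "red"))
popViewport()
```

The conversion has to happen before popViewport(), because "native" units only have a meaning relative to the viewport's xscale and yscale.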

Thanks to all that helped with this. I'm summarizing the lessons learned plus a few other aspects.
Catmull-Rom spline vs. cubic B-spline
Negative shape values in the xspline function return a Catmull-Rom type spline, with the spline passing through the x-y points. Positive values return a cubic B-spline type curve, which only approximates the points. A value of zero gives a sharp corner. If a single shape value is given, it is used for all points. For an open spline, the end points are always treated as sharp corners (shape=0), so other values given there do not influence the resulting spline:
# Catmull-Rom spline vs. cubic B-spline
plot(p, xlim=extendrange(x, f=0.2), ylim=extendrange(y, f=0.2))
text(p, labels=seq(n), pos=3)
# Catmull-Rom spline (-1)
xspline(p, shape = -1, border="red", lwd=2)
# Catmull-Rom spline (-0.5)
xspline(p, shape = -0.5, border="orange", lwd=2)
# cubic B-spline (0.5)
xspline(p, shape = 0.5, border="green", lwd=2)
# cubic B-spline (1)
xspline(p, shape = 1, border="blue", lwd=2)
legend("bottomright", ncol=2, legend=c(-1,-0.5), title="Catmull-Rom spline", col=c("red", "orange"), lty=1)
legend("topleft", ncol=2, legend=c(1, 0.5), title="cubic B-spline", col=c("blue", "green"), lty=1)
Extracting results from xspline for external plotting
This took some searching, but the trick is to apply the argument draw=FALSE to xspline.
# Extract xy values
plot(p, xlim=extendrange(x, f=0.1), ylim=extendrange(y, f=0.1))
text(p, labels=seq(n), pos=3)
spl <- xspline(x, y, shape = -0.5, draw=FALSE)
lines(spl)
arrows(x0=(spl$x[length(spl$x)-0.01*length(spl$x)]), y0=(spl$y[length(spl$y)-0.01*length(spl$y)]),
x1=(spl$x[length(spl$x)]), y1=(spl$y[length(spl$y)])
)
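The interpolating versus approximating behaviour described above can also be checked numerically. The sketch below uses draw=FALSE to get the curve coordinates and measures how far the control points lie from the sampled curve. The helper max_gap is hypothetical, written only for this comparison, and the distances are to the sampled points rather than to the exact curve.

```r
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)

# xspline() needs an open plot, even when draw = FALSE
plot(x, y, xlim = extendrange(x, f = 0.2), ylim = extendrange(y, f = 0.2))

# Hypothetical helper: largest distance from any control point
# to its nearest sampled point on the spline
max_gap <- function(shape) {
  s <- xspline(x, y, shape = shape, draw = FALSE)
  max(sapply(seq_len(n), function(i) {
    min(sqrt((s$x - x[i])^2 + (s$y - y[i])^2))
  }))
}

gap_interp <- max_gap(-1)  # Catmull-Rom type: should be close to zero
gap_approx <- max_gap(1)   # cubic B-spline type: noticeably larger
print(c(interpolating = gap_interp, approximating = gap_approx))
```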

Related

How to smooth a circular plot?

x <- sin(1:20)
y <- (dplyr::lag(x)-x)-x
plot(x, y, type="l")
Yields this plot:
How do I get a smooth, cyclical trajectory? The smoothing functions I've tried all want to make a single smooth function.
Example 1 (throws an error):
lines(smooth.spline(x, y))
Example 2 (draws a function):
lo <- loess(y~x)
lines(predict(lo), col='red', lwd=2)
Example 3 (draws a function):
xspline(c(x,y), shape=1, border='blue' )
I think you're asking for both (a) smoother curves and (b) cylindrical/symmetric circles, correct?
Increase the count of points to make it smoother. Fix the aspect ratio to make it a circle (i.e., asp=1).
x <- sin((1:20000) / 1000)
y <- cos((1:20000) / 1000)
plot(x, y, type="l", asp=1)
EDIT, responding to OP's comments below:
This uses dplyr::lag(), while increasing the number of points. Don't conceptually shift the lag though (i.e., y should be a full value of 1 from x).
(x=1, y=0), (x=1.001, y=0.001), ..., (x=20, y=19)
x <- sin((1:20000) / 1000)
y <- (dplyr::lag(x, 1000)-x)-x
plot(x, y, type="l", asp=1)
If that's not what you want, maybe sketch a picture and attach it (made in something like MS Paint or Inkscape).
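If the goal is instead a smooth curve through the original 20 points, one option is xspline with x and y passed as separate arguments. This is an assumption about what is being asked. Example 3 in the question passes c(x, y), which concatenates the two vectors into one. The sketch below avoids the dplyr dependency by building the lag by hand, and drops the NA that the lag introduces.

```r
x <- sin(1:20)
y <- c(NA, head(x, -1)) - 2 * x   # same values as (dplyr::lag(x) - x) - x
ok <- !is.na(y)                   # the first lagged value is NA

plot(x[ok], y[ok], asp = 1, xlab = "x", ylab = "y")

# Negative shape: the curve passes through the points, visited in order
spl <- xspline(x[ok], y[ok], shape = -0.5, open = TRUE, draw = FALSE)
lines(spl, col = "blue")
```

The curve follows the points in their original order, so it only looks like a clean loop if consecutive points are neighbours along the trajectory.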

Access lines plotted by R using basic plot()

I am trying to do the following:
plot a time series in R using a polygonal line
plot one or more horizontal lines superimposed
find the intersections of said line with the horizontal ones
I got this far:
set.seed(34398)
c1 <- as.ts(rbeta(25, 33, 12))
p <- plot(c1, type = 'l')
# set thresholds
thresholds <- c(0.7, 0.77)
I can find no way to access the segment line object plotted by R. I really really really would like to do this with base graphics, while realizing that probably there's a ggplot2 concoction out there that would work. Any idea?
abline(h=thresholds, lwd=1, lty=3, col="dark grey")
I will just do one threshold. You can loop through the list to get all of them.
First find the points, x, so that the curve crosses the threshold between x and x+1
shift = (c1 - 0.7)
Lower = which(shift[-1]*shift[-length(shift)] < 0)
Find the actual points of crossing, by finding the roots of Series - 0.7 and plot
shiftedF = approxfun(1:length(c1), c1-0.7)
Intersections = sapply(Lower, function(x) { uniroot(shiftedF, x:(x+1))$root })
points(Intersections, rep(0.7, length(Intersections)), pch=16, col="red")
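To handle every threshold, the same idea can be wrapped in a function and looped. The sketch below swaps uniroot for direct linear interpolation, which gives the same crossing point because the plotted line is piecewise linear. The helper name crossings is hypothetical, and points lying exactly on a threshold are not reported by the strict sign test.

```r
set.seed(34398)
c1 <- as.ts(rbeta(25, 33, 12))
thresholds <- c(0.7, 0.77)

# Hypothetical helper: x positions where the polyline through
# (1, y[1]), (2, y[2]), ... crosses the horizontal level h
crossings <- function(y, h) {
  y <- as.numeric(y)
  d <- y - h
  i <- which(d[-1] * d[-length(d)] < 0)  # segment i to i+1 straddles h
  i + d[i] / (d[i] - d[i + 1])           # linear interpolation in the segment
}

plot(c1, type = "l")
abline(h = thresholds, lwd = 1, lty = 3, col = "dark grey")
for (h in thresholds) {
  xs <- crossings(c1, h)
  points(xs, rep(h, length(xs)), pch = 16, col = "red")
}
```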

R overlap normal curve to probability histogram

In R I'm able to overlap a normal curve to a density histogram:
Eventually I can convert the density histogram to a probability one:
a <- rnorm(1:100)
test <-hist(a, plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100 # Probability
plot(test, ylab="Probability")
curve(dnorm(x, mean=mean(a), sd=sd(a)), add=TRUE)
But I cannot overlap the normal curve anymore since it goes off scale.
Any solution? Maybe a second Y-axis
Now the question is clear to me. Indeed a second y-axis seems to be the best choice for this as the two data sets have completely different scales.
In order to do this you could do:
set.seed(2)
a <- rnorm(1:100)
test <-hist(a, plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100 # Probability
plot(test, ylab="Probability")
#start new graph
par(new=TRUE)
#instead of using curve just use plot and create the data your-self
#this way below is how curve works internally anyway
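#evaluate the density over the same x-range as the histogram
curve_x <- seq(min(test$breaks), max(test$breaks), length.out=401)
curve_data <- dnorm(curve_x, mean=mean(a), sd=sd(a))
#plot the line with no axes or labels, reusing the histogram's x-limits so both x-axes line up
plot(curve_x, curve_data, xlim=range(test$breaks), axes=FALSE, xlab='', ylab='', type='l', col='red' )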
#add these now with axis
axis(4, at=pretty(range(curve_data)))
Output:
At first you should save your rnorm data otherwise you get different data each time.
seed = rnorm(100)
Next go ahead with
hist(seed,probability = T)
curve(dnorm(x, mean=mean(na.omit(seed)), sd=sd(na.omit(seed))), add=TRUE)
Now you have the expected result. Histogram with density curve.
The y-axis isn't a "probability" as you have labeled it. It is count data. If you convert your histogram to probabilities, you shouldn't have a problem:
# don't call the data x: inside curve(), x is the plotting variable
a <- rnorm(1000)
hist(a, freq= FALSE, ylab= "Probability")
curve(dnorm(x, mean=mean(a), sd=sd(a)), add=TRUE)
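Another way to keep the percentage scale from the question, without a second axis, is to rescale the normal curve instead of the histogram. For equal-width bins, the expected percentage in a bin is approximately density × bin width × 100. The sketch below assumes equal-width bins (what hist produces by default); the names h and binwidth are made up here.

```r
set.seed(2)
a <- rnorm(100)

h <- hist(a, plot = FALSE)
binwidth <- diff(h$breaks)[1]               # assumes equal-width bins
h$counts <- h$counts / sum(h$counts) * 100  # percent of observations per bin

plot(h, ylab = "Probability (%)", main = "")

# Scale the density to the same units as the bars: percent per bin
curve(dnorm(x, mean = mean(a), sd = sd(a)) * binwidth * 100,
      add = TRUE, col = "red")
```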

Setting Trend Line Length in R

I have managed to create a scatterplot with two datasets on a single plot. One set of data has an x axis that ranges from 0 to 40 (green), while the other only ranges from 0 to 15 (red).
I used this code to add trend lines to the red and green data separately (using par(new)).
plot( x1,y1, col="red", axes=FALSE, xlab="",ylab="",ylim= range(0:1), xlim= range(0:40))
f <- function(x1,a,b,d) {(a*x1^2) + (b*x1) + d}
fit <- nls(y1 ~ f(x1,a,b,d), start = c(a=1, b=1, d=1))
co <- coef(fit)
curve(f(x, a=co[1], b=co[2], d=co[3]), add = TRUE, col="red", lwd=1)
My issue is I can't seem to find a way to stop the red trend line at 15 on the x axis. I "googled" around and nothing seemed to come up for my issue. Lots on Excel trend lines! I tried adding an end= statement to fit <- and that did not work either.
Please help,
I hope I have posted enough information. Thanks in advance.
Try using ggplot. Following example uses mtcars data:
library(ggplot2)
ggplot(mtcars, aes(qsec, wt, color=factor(vs)))+geom_point()+ stat_smooth(se=F)
You can do this in base graphics with the from and to arguments of the curve function (see the help for curve for more details). For example:
# Your function
f <- function(x1,a,b,d) {(a*x1^2) + (b*x1) + d}
# Plot the function from x=-100 to x=100
curve(f(x, a=-2, b=3, d=0), from=-100, to=100, col="red", lwd=1, lty=1)
# Same curve going from x=-100 to x=0 (and shifted down by 1000 units so it's
# easy to see)
curve(f(x, a=-2, b=3, d=-1000), from=-100, to=0,
add=TRUE, col="blue", lwd=2, lty=1)
If you want to set the curve x-limits programmatically, you can do something like this (assuming your data frame is called df and your x-variable is called x):
curve(f(x, a=-2, b=3, d=0), from=range(df$x)[1], to=range(df$x)[2],
add=TRUE, col="red", lwd=1, lty=1)
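Applied to the code in the question, this might look like the sketch below. The data are made up for illustration (the asker's x1 and y1 are not shown), and the fitted curve is limited to the observed range of x1 with from and to.

```r
set.seed(1)
# Made-up stand-in for the red data set, which only spans 0 to 15
x1 <- seq(0, 15, length.out = 30)
y1 <- 0.2 + 0.08 * x1 - 0.004 * x1^2 + rnorm(30, sd = 0.02)

f <- function(x1, a, b, d) {(a * x1^2) + (b * x1) + d}
fit <- nls(y1 ~ f(x1, a, b, d), start = c(a = 1, b = 1, d = 1))
co <- coef(fit)

plot(x1, y1, col = "red", xlab = "", ylab = "", ylim = c(0, 1), xlim = c(0, 40))

# Draw the trend line only where there are data; curve() returns the
# plotted points invisibly, kept here in cv for inspection
cv <- curve(f(x, a = co[1], b = co[2], d = co[3]),
            from = min(x1), to = max(x1), add = TRUE, col = "red", lwd = 1)
```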

Get xlim from a plot in R

I want a histogram and a density on the same plot. I'm trying this:
myPlot <- plot(density(m[,1]), main="", xlab="", ylab="")
par(new=TRUE)
Oldxlim <- myPlot$xlim
Oldylim <- myPlot$ylim
hist(m[,3],xlim=Oldxlim,ylim=Oldylim,prob=TRUE)
but I can't access myPlot's xlim and ylim.
Is there a way to get them from myPlot? What else should I do instead?
Using par(new=TRUE) is rarely, if ever, the best solution. Many plotting functions have an option like add=TRUE that will add to the existing plot (including the plotting function for histograms as mentioned in the comments).
If you really need to do it this way then look at the usr argument to the par function, doing mylims <- par("usr") will give the x and y limits of the existing plot in user coordinates. However when you use that information on a new plot make sure to set xaxs='i' or the actual coordinates used in the new plot will be extended by 4% beyond what you specify.
The functions grconvertX and grconvertY are also useful to know. They could be used for this purpose but are probably overkill compared to par("usr"); they are more useful for finding the limits in other coordinate systems, or for finding values like the middle of the plotting region in user coordinates.
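A minimal sketch of the par("usr") route, assuming a made-up matrix m in place of the asker's data. Setting xaxs and yaxs to "i" for the second plot stops R from padding the limits again, so both plots share exactly the same coordinates.

```r
set.seed(1)
m <- matrix(rnorm(300), ncol = 3)  # made-up stand-in for the asker's m

plot(density(m[, 1]), main = "", xlab = "", ylab = "")
usr <- par("usr")                  # c(xmin, xmax, ymin, ymax) of the plot region

# Overlay a second plot on exactly the same coordinates
op <- par(new = TRUE, xaxs = "i", yaxs = "i")
hist(m[, 3], xlim = usr[1:2], ylim = usr[3:4], prob = TRUE,
     axes = FALSE, main = "", xlab = "", ylab = "")
usr_after <- par("usr")            # should match usr
par(op)                            # restore the previous settings
```

As noted above, hist(m[, 3], prob = TRUE, add = TRUE) is usually simpler when the plotting function supports add.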
Have you considered specifying your own xlim and ylim in the first plot (setting them to appropriate values) then just using those values again to set the limits on the histogram in the second plot?
Just by plotting density on its own you should be able to work out sensible values for the minimum and maximum values for both axes then replace xmin, xmax, ymin and ymax for those values in the code below.
Something like:
myPlot <- plot(density(m[,1]), main="", xlab="", ylab="", xlim=c(xmin, xmax), ylim=c(ymin, ymax))
par(new=TRUE)
hist(m[,3], xlim=c(xmin, xmax), ylim=c(ymin, ymax), prob=TRUE)
If for any reason you are not able to use range() to get the limits, I'd follow @Greg's suggestion. This only works if the par parameters "xaxs" and "yaxs" are set to "r" (which is the default), in which case the coordinate range is extended by 4% at each end:
plot(seq(0.8,9.8,1), 10:19)
usr <- par('usr')
xr <- (usr[2] - usr[1]) / 27 # 27 = (100 + 2*4) / 4
yr <- (usr[4] - usr[3]) / 27
xlim <- c(usr[1] + xr, usr[2] - xr)
ylim <- c(usr[3] + yr, usr[4] - yr)
I think the best solution is to fix them when you plot your density.
Otherwise, hack into the code of plot.default (plot.R):
xlab=""
ylab=""
log =""
xy <- xy.coords(x, y, xlab, ylab, log)
xlim1 <- range(xy$x[is.finite(xy$x)])
ylim1 <- range(xy$y[is.finite(xy$y)])
or to use the code above to generate xlim and ylim then call your plot for density
dd <- density(c(-20,rep(0,98),20))
plot(dd,xlim=xlim1,ylim=ylim1)
x <- rchisq(100, df = 4)
hist(x,xlim=xlim1,ylim=ylim1,prob=TRUE,add=TRUE)
Why not use ggplot2?
library(ggplot2)
set.seed(42)
df <- data.frame(x = rnorm(500,mean=10,sd=5),y = rlnorm(500,sdlog=1.1))
p1 <- ggplot(df) +
geom_histogram(aes(x=y,y = ..density..),binwidth=2) +
geom_density(aes(x=x),fill="green",alpha=0.3)
print(p1)

Resources