I know there are many ways to calculate the arc length of a curve, but I am looking for an efficient way to calculate the arc length of a piecewise spline through irregularly spaced points.
The actual curve I'm trying to find the length of is quite complex (a contour line), so here is a quick example using a circle, where the actual arc length is known to be 2*pi:
# Generate "random" data
set.seed(50)
theta = seq(0, 2*pi, length.out = 50) + runif(50, -0.05, 0.05)
theta = c(0, theta[theta >=0 & theta <= 2*pi], 2*pi)
data = data.frame(x = cos(theta), y = sin(theta))
# Bezier Curve fit
library("bezier")
bezierArcLength(data, t1=0, t2=1)$arc.length
# Calculate arc length using euclidean distance
library("dplyr")
data$eucdist = sqrt((data$x - lag(data$x))^2 + (data$y - lag(data$y))^2)
print(paste("Euclidean distance:", sum(data$eucdist[-1])))
print(paste("Actual distance:", 2*pi))
# Output
Bezier distance: 5.864282
Euclidean distance: 6.2779
Actual distance: 6.2831
The closest thing I have found is https://www.rdocumentation.org/packages/pracma/versions/1.9.9/topics/arclength but I would have to parameterise my data to be some function(t) ...spline(data, t)... to use arclength. I tried this, but the fitted spline ran along the middle of the circle rather than along the circumference.
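For reference, here is a minimal sketch (not the attempt described above) of the kind of parameterisation pracma::arclength expects: fit x and y separately against a single parameter, here the cumulative chord length, so the spline follows the circumference rather than cutting across the circle; the names d, fx, fy and f are purely illustrative.
library(pracma)
# Parameterise by cumulative chord length between consecutive points
d <- c(0, cumsum(sqrt(diff(data$x)^2 + diff(data$y)^2)))
fx <- splinefun(d, data$x)         # x as a function of the parameter
fy <- splinefun(d, data$y)         # y as a function of the parameter
f <- function(t) c(fx(t), fy(t))   # the curve as a function of one variable
arclength(f, a = d[1], b = d[length(d)])$length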
Another alternative I have been (unsuccessfully) trying is to fit piecewise splines and determine the length of each spline.
Any help would be much appreciated!
EDIT: Added alternate method using the Bezier package, but the arc length found is even worse than just using the Euclidean method.
In lieu of community answers, I've cobbled together a solution which seems to work for what I was after! I'll leave my code here in case anyone has the same question and comes across this.
# Libraries
library("bezier")
library("pracma")
library("dplyr")
# Very slow for loops, sorry! Didn't write it as an apply function
output = data.frame()
for (i in 1:100) {
# Generate "random" data
# set.seed(50)
theta = seq(0, 2*pi, length.out = 50) + runif(50, -0.1, 0.1)
theta = sort(theta)
theta = c(0, theta[theta >=0 & theta <= 2*pi], 2*pi)
data = data.frame(x = cos(theta), y = sin(theta))
# Bezier Curve fit
b = bezierArcLength(data, t1=0, t2=1)$arc.length
# Pracma Piecewise cubic
t = atan2(data$y, data$x)        # parameterise by polar angle
t = t + ifelse(t < 0, 2*pi, 0)   # shift negative angles so t runs from 0 to 2*pi
csx <- cubicspline(t, data$x)
csy <- cubicspline(t, data$y)
# Differentiate each piecewise polynomial to get dx/dt and dy/dt
dcsx = csx; dcsx$coefs = t(apply(csx$coefs, 1, polyder))
dcsy = csy; dcsy$coefs = t(apply(csy$coefs, 1, polyder))
# Arc length integrand: sqrt((dx/dt)^2 + (dy/dt)^2)
ds <- function(t) sqrt(ppval(dcsx, t)^2 + ppval(dcsy, t)^2)
s = integral(ds, t[1], t[length(t)])
# Calculate arc length using euclidean distance
data$eucdist = sqrt((data$x - lag(data$x))^2 + (data$y - lag(data$y))^2)
e = sum(data$eucdist[-1])
# Use path distance as parametric variable
data$d = c(0, cumsum(data$eucdist[-1]))
csx <- cubicspline(data$d, data$x)
csy <- cubicspline(data$d, data$y)
dcsx = csx; dcsx$coefs = t(apply(csx$coefs, 1, polyder))
dcsy = csy; dcsy$coefs = t(apply(csy$coefs, 1, polyder))
ds <- function(t) sqrt(ppval(dcsx, t)^2 + ppval(dcsy, t)^2)
d = integral(ds, data$d[1], data$d[nrow(data)])
# Actual value
a = 2*pi
# Append to result
output = rbind(
output,
data.frame(bezier=b, cubic.spline=s, cubic.spline.error=(s-a)/a*100,
euclidean.dist=e, euclidean.dist.error=(e-a)/a*100,
dist.spline=d, dist.spline.error=(d-a)/a*100))
}
# Summary
apply(output, 2, mean)
# Summary output
bezier cubic.spline cubic.spline.error euclidean.dist euclidean.dist.error dist.spline dist.spline.error
5.857931e+00 6.283180e+00 -7.742975e-05 6.274913e+00 -1.316564e-01 6.283085683 -0.001585570
I still don't quite understand what bezierArcLength does, but I'm very happy with my solution using cubicspline from the pracma package as it is a lot more accurate.
Other solutions are still more than welcome!
Related
Suppose I am seeking to integrate the following function from 0 to 10:
How would I accomplish this in R?
Functions
# Functional form
fn <- function(t) -100*(t)^2 + 20000
# First derivative w.r.t. t
fn_dt <- function(t) -200*t
# Density function phi
phi <- approxfun(density(rnorm(35, 15, 7)))
# Delta t
delta <- 5
How about the following:
First off, we choose a fixed seed for reproducibility.
# Density function phi
set.seed(2017);
phi <- approxfun(density(rnorm(35, 15, 7)))
We define the integrand.
integrand <- function(x) {
f1 <- -500 * x^2 + 100000;
f2 <- phi(x);
f2[is.na(f2)] <- 0;
return(f1 * f2)
}
By default, approxfun returns NA if x falls outside the interval [min(x), max(x)]; since phi is based on the density of a normal distribution, we can replace NAs with 0.
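As an aside (an alternative sketch, not part of the answer above): approxfun also accepts yleft and yright arguments, so phi can be made to return 0 outside [min(x), max(x)] directly, and the NA replacement inside integrand becomes unnecessary.
set.seed(2017);
phi <- approxfun(density(rnorm(35, 15, 7)), yleft = 0, yright = 0);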
Let's plot the integrand:
library(ggplot2);
ggplot(data.frame(x = 0), aes(x)) + stat_function(fun = integrand) + xlim(-50, 50);
We use integrate to calculate the integral; here I assume you are interested in the interval [-Inf, +Inf].
integrate(integrand, lower = -Inf, upper = Inf)
#-39323.06 with absolute error < 4.6
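If you are instead after the interval [0, 10] stated in the question, the same call works with finite bounds:
integrate(integrand, lower = 0, upper = 10)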
Given the following function:
f(x) = (1/2*pi) * (1/(1 + x^2/4))
How do I identify its distribution and write this distribution function in R?
So this is your function right now (hopefully you know how to write an R function; if not, check writing your own function):
f <- function (x) (pi / 2) * (1 / (1 + 0.25 * x ^ 2))
f is defined on (-Inf, Inf), so integrating over this range gives an improper integral. Fortunately, f decays at the rate of x^(-2) as x goes to ±Inf, so the integral is well defined and can be computed:
C <- integrate(f, -Inf, Inf)
# 9.869604 with absolute error < 1e-09
C <- C$value ## extract integral value
# [1] 9.869604
Then you want to normalize f, as we know that a probability density should integrate to 1:
f <- function (x) (pi / 2) * (1 / (1 + 0.25 * x ^ 2)) / C
You can draw its density by:
curve(f, from = -10, to = 10)
Now that I have the probability density function, I was wondering how to create a random sample of, say, n = 1000 using this new distribution function?
This is an off-topic question, but it is OK to answer without you making a new thread; it is useful, as it turns out to be subtle.
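(The definition of simf is not included in this excerpt. Below is a minimal, hypothetical sketch of such a sampler, assuming inverse-transform sampling from the normalised f with the support truncated to [-1/e, 1/e] by the second argument; because these details are assumptions, the exact ranges printed below will not be reproduced.)
simf <- function(n, e = 1e-2) {
  ## assumed behaviour: truncate the support to [-1/e, 1/e]
  xgrid <- seq(-1 / e, 1 / e, length.out = 1e4)
  ## numerical CDF of f on the grid (the grid spacing cancels when normalising)
  cdf <- cumsum(f(xgrid))
  cdf <- cdf / cdf[length(cdf)]
  ## numerical quantile function, then inverse-transform sampling
  Finv <- approxfun(cdf, xgrid, rule = 2)
  Finv(runif(n))
}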
Compare
set.seed(0); range(simf(1000, 1e-2))
#[1] -56.37246 63.21080
set.seed(0); range(simf(1000, 1e-3))
#[1] -275.3465 595.3771
set.seed(0); range(simf(1000, 1e-4))
#[1] -450.0979 3758.2528
set.seed(0); range(simf(1000, 1e-5))
#[1] -480.5991 8017.3802
So I think e = 1e-2 is reasonable. We could draw samples, make a (scaled) histogram and overlay the density curve:
set.seed(0); x <- simf(1000)
hist(x, prob = TRUE, breaks = 50, ylim = c(0, 0.16))
curve(f, add = TRUE, col = 2, lwd = 2, n = 201)
Suppose we have a PDE that describes the evolution of a variable y(t, x) over time t and space x, and I would like to plot its evolution on a three-dimensional diagram (t, x, y). With deSolve I can solve the PDE, but I have no idea how to obtain this kind of diagram.
The example in the deSolve package documentation is the following, where y is APHIDS, t = 0, ..., 200 and x = 1, ..., 60:
library(deSolve)
Aphid <- function(t, APHIDS, parameters) {
deltax <- c (0.5, rep(1, numboxes - 1), 0.5)
Flux <- -D * diff(c(0, APHIDS, 0)) / deltax
dAPHIDS <- -diff(Flux) / delx + APHIDS * r
list(dAPHIDS )
}
D <- 0.3 # m2/day diffusion rate
r <- 0.01 # /day net growth rate
delx <- 1 # m thickness of boxes
numboxes <- 60
Distance <- seq(from = 0.5, by = delx, length.out = numboxes)
APHIDS <- rep(0, times = numboxes)
APHIDS[30:31] <- 1
state <- c(APHIDS = APHIDS) # initialise state variables
times <- seq(0, 200, by = 1)
out <- ode.1D(state, times, Aphid, parms = 0, nspec = 1, names = "Aphid")
"out" produces a matrix containing all the data that we need, t, y(x1), y(x2), ... y(x60). How can I produce a surface plot to show the evolution and variability of y in (t,x)?
The approach changes a bit depending on which package you use, but you can do it with little effort because out[, -1] is already in an ideal matrix form for drawing a surface. Here are two examples, using the rgl and plot3D packages.
out2 <- out[,-1]
AphID <- 1:ncol(out2)
library(rgl)
persp3d(times, AphID, out2, col="gray50", zlab="y")
# If you want to change color with value of Z-axis
# persp3d(times, AphID, out2, zlab="y", col=topo.colors(256)[cut(c(out2), 256)])
library(plot3D)
mat <- mesh(times, AphID)
surf3D(mat$x, mat$y, out2, bty="f", ticktype="detailed", xlab="times", ylab="AphID", zlab="y")
I've been reading about a few methods to fit a circle to data (like this). I would like to see how the methods work on real data and thought of using R for this. I tried searching rseek for packages that can help with this but came up with nothing useful.
So, are there packages that help to easily compute the best fit circle for a given data set (similar to how lm() will fit a linear model to a data set)? Otherwise, how might one perform such a task in R?
Here's a fairly naive implementation of a function that minimises SS(a,b,r) from that paper:
fitSS <- function(xy,
a0=mean(xy[,1]),
b0=mean(xy[,2]),
r0 = mean(sqrt((xy[,1]-a0)^2 + (xy[,2]-b0)^2)),
...){
SS <- function(abr){
sum((abr[3] - sqrt((xy[,1]-abr[1])^2 + (xy[,2]-abr[2])^2))^2)
}
optim(c(a0,b0,r0), SS, ...)
}
I've written a couple of supporting functions to generate random data on circles and to plot circles. Hence:
> xy = sim_circles(10)
> f = fitSS(xy)
The f$par value is a vector of (xcenter, ycenter, radius).
> plot(xy,asp=1,xlim=c(-2,2),ylim=c(-2,2))
> lines(circlexy(f$par))
Note it doesn't use the gradients, nor does it check the error code for convergence. You can supply it with initial values, or it will guess them from the data.
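For completeness, here is a hedged sketch (not part of the answer above) of how one might supply the analytical gradient of SS(a, b, r) to optim and use a gradient-based method; fitSSgrad is a made-up name:
fitSSgrad <- function(xy,
                      a0 = mean(xy[,1]),
                      b0 = mean(xy[,2]),
                      r0 = mean(sqrt((xy[,1] - a0)^2 + (xy[,2] - b0)^2)),
                      ...){
  SS <- function(abr){
    sum((abr[3] - sqrt((xy[,1] - abr[1])^2 + (xy[,2] - abr[2])^2))^2)
  }
  grSS <- function(abr){
    d <- sqrt((xy[,1] - abr[1])^2 + (xy[,2] - abr[2])^2)
    res <- abr[3] - d                        # signed radial residuals
    c(sum(2 * res * (xy[,1] - abr[1]) / d),  # dSS/da
      sum(2 * res * (xy[,2] - abr[2]) / d),  # dSS/db
      sum(2 * res))                          # dSS/dr
  }
  optim(c(a0, b0, r0), SS, gr = grSS, method = "BFGS", ...)
}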
Code for plotting and generating circles follows:
circlexy <- function(xyr, n=180){
theta = seq(0,2*pi,len=n)
cbind(xyr[1] + xyr[3]*cos(theta),
xyr[2] + xyr[3]*sin(theta)
)
}
sim_circles <- function(n,x=0,y=0,r=1,sd=0.05){
theta = runif(n, 0, 2*pi)
r = r + rnorm(n, mean=0, sd=sd)
cbind(x + r*cos(theta),
y + r*sin(theta)
)
}
Well, looky here: an R-bloggers post provides some code to fit ellipses and circles. That code, which I won't repost here, is based on previous work done by Radim Halíř and Jan Flusser in Matlab, and includes (commented) the original Matlab lines for comparison.
I've peeked at a number of papers on this topic, and can only say that I'm not qualified to determine which algorithms are the most robust. For those interested, take a look at these papers:
http://www.emis.de/journals/BBMS/Bulletin/sup962/gander.pdf
http://ralph.cs.cf.ac.uk/papers/Geometry/fit.pdf
http://autotrace.sourceforge.net/WSCG98.pdf
Followup edit: I ran Spacedman's code against the linked R-code for fitting ellipses, using the same "noisy" set of 1e5 points on a circle as input. The results are:
testcircle<-create.test.ellipse(Rx=200,Ry=200,Rot=.56,Noise=5.5,leng=100000)
dim(testcircle)
[1] 100000 2
microbenchmark(fitSS(testcircle),fit.ellipse(testcircle))
Unit: milliseconds
                    expr       min        lq    median        uq       max neval
       fitSS(testcircle) 649.98245 704.05751 731.61282 787.84212 2053.7096   100
 fit.ellipse(testcircle)  25.74518  33.87718  38.87143  95.23499  256.2475   100
For reference, the outputs of the two fitting functions were as follows.
From fitSS, the list:
ssfit
$par
[1] 249.9530 149.9927 200.0512
$value
[1] 185.8195
$counts
function gradient
134 NA
$convergence
[1] 0
$message
NULL
From fit.ellipse, we get
ellfit
$coef
a b c d e
-7.121109e-01 -1.095501e-02 -7.019815e-01 3.563866e+02 2.136497e+02
f
-3.195427e+04
$center
x y
249.0769 150.2326
$major
[1] 201.7601
$minor
[1] 199.6424
$angle
[1] 0.412268
You can see that the fitted conic is very nearly a circle: the xy coefficient b is near zero and the x^2 and y^2 coefficients a and c are almost equal; plotting the two results yields almost indistinguishable curves.
To fit an ellipse, there is the fitEllipse function in the PlaneGeometry package. It uses the fitConic package.
library(PlaneGeometry)
# the "true" ellipse:
ell <- Ellipse$new(center = c(1, 1), rmajor = 3, rminor = 2, alpha = 25)
# We add some noise to 30 points on this ellipse:
set.seed(666L)
points <- ell$randomPoints(30, "on") + matrix(rnorm(30*2, sd = 0.2), ncol = 2)
# Now we fit an ellipse to these points:
ellFitted <- fitEllipse(points)
# let's draw all this stuff, true ellipse in blue, fitted ellipse in green:
box <- ell$boundingbox()
plot(NULL, asp = 1, xlim = box$x, ylim = box$y, xlab = NA, ylab = NA)
draw(ell, border = "blue", lwd = 2)
points(points, pch = 19)
draw(ellFitted, border = "green", lwd = 2)
Assume A follows an Exponential distribution and B follows a Gamma distribution.
How do I plot the PDF of 0.5*(A+B)?
This is fairly straightforward using the "distr" package:
library(distr)
A <- Exp(rate=3)
B <- Gammad(shape=2, scale=3)
conv <- 0.5*(A+B)
plot(conv)
plot(conv, to.draw.arg=1)
Edit by JD Long
The resulting plot looks like this:
If you're just looking for a fast graph, I usually do the quick and dirty simulation approach: do some draws, slam a Gaussian kernel density on the draws, and plot that bad boy:
numDraws <- 1e6
gammaDraws <- rgamma(numDraws, 2)
expDraws <- rexp(numDraws)
combined <- .5 * (gammaDraws + expDraws)
plot(density(combined))
The output should look a little like this:
Here is an attempt at doing the convolution (which @Jim Lewis refers to) in R. Note that there are probably much more efficient ways of doing this.
lower <- 0
upper <- 20
t <- seq(lower,upper,0.01)
fA <- dexp(t, rate = 0.4)
fB <- dgamma(t,shape = 8, rate = 2)
## C has the same distribution as (A + B)/2
dC <- function(x, lower, upper, exp.rate, gamma.rate, gamma.shape){
integrand <- function(Y, X, exp.rate, gamma.rate, gamma.shape){
dexp(Y, rate = exp.rate)*dgamma(2*X-Y, rate = gamma.rate, shape = gamma.shape)*2
}
out <- NULL
for(ix in seq_along(x)){
out[ix] <-
integrate(integrand, lower = lower, upper = upper,
X = x[ix], exp.rate = exp.rate,
gamma.rate = gamma.rate, gamma.shape = gamma.shape)$value
}
return(out)
}
fC <- dC(t, lower=lower, upper=upper, exp.rate=0.4, gamma.rate=2, gamma.shape=8)
## plot the resulting distribution
plot(t,fA,
ylim = range(fA,fB,na.rm=TRUE,finite = TRUE),
xlab = 'x',ylab = 'f(x)',type = 'l')
lines(t,fB,lty = 2)
lines(t,fC,lty = 3)
legend('topright', c('A ~ exp(0.4)','B ~ gamma(8,2)', 'C ~ (A+B)/2'),lty = 1:3)
I'm not an R programmer, but it might be helpful to know that for independent random variables with PDFs f1(x) and f2(x), the PDF of the sum of the two variables is given by the convolution (f1 * f2)(x) of the two input PDFs.
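To illustrate that statement numerically (a sketch, not from the original answers, reusing the Exp(0.4) and Gamma(shape 8, rate 2) example above): discretise both densities on a grid, convolve them with convolve(), and rescale by the grid step to approximate the PDF of S = A + B; the density of C = (A + B)/2 then follows from the change of variables f_C(x) = 2 * f_S(2x).
dx <- 0.01
x <- seq(0, 40, by = dx)
fA <- dexp(x, rate = 0.4)
fB <- dgamma(x, shape = 8, rate = 2)
# discrete convolution approximates the density of S = A + B
fS <- convolve(fA, rev(fB), type = "open") * dx
xS <- seq(0, by = dx, length.out = length(fS))
# density of C = S/2 by change of variables
plot(xS / 2, 2 * fS, type = "l", xlab = "x", ylab = "f(x)")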