Function:
(x^3 - 3*x^2 - 2*x + 7) + (y^2 + 2*y)
Output on Wolfram (which I understand):
x^3 - 3 x^2 - 2 x + y^2 + 2 y + 7
Output of the code in R (using polynomial() from the polynom package):
7 - 2*x^2 + x^3
R code:
library(polynom)
p <- polynomial(c(7,-2,-3, 1))
q <- polynomial(c(0, 2, 1))
p + q
I entered the code above in R.
The polynom package only handles univariate polynomials, so p and q are treated as polynomials in the same variable and their coefficient vectors are simply added. You can use mpoly to manipulate multivariate polynomials.
library(mpoly)
p <- as.mpoly(c(7, -2, -3, 1), 'x')
q <- as.mpoly(c(0, 2, 1), 'y')
reorder(p + q)
# x^3 - 3 x^2 - 2 x + y^2 + 2 y + 7
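A small sketch of a few other things mpoly can do, assuming the mp() string parser and the deriv() method behave as documented on the package site:
library(mpoly)
p <- mp("x^2 y + 2 x - 1")   # parse a multivariate polynomial from a string
q <- mp("x + y")
p * q                        # polynomial multiplication
deriv(p, "x")                # partial derivative with respect to x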
More functionality is documented at https://dkahle.github.io/mpoly/
I'm now learning calculus and want to plot the graph of x^2 + 6x + y^4 = 7, which I can do with the online graphing tool Desmos.
But I'm not sure how to achieve this in R. The first thing I thought of was to convert it to the form y = f(x), but return((x^2 + 6*x - 7)^(1/4)) gave me a different result.
At the same time, it seems impossible to return an equation from a function (return(x^2 + 6*x + y^4 = 7)). So how can I plot it in R?
Here is the sample code I usually use to plot a continuous function:
f <- function(x) {
return () # return an equation
}
ggplot(data.frame(x=seq(-10,10,length.out=10)), aes(x)) + stat_function(fun=f)
You can have separate functions for the positive and negative solutions for y:
f1 <- function(x) (7 - x^2 - 6*x)^(1/4)
f2 <- function(x) -f1(x)
Now create a vector of x values over the domain where 7 - x^2 - 6*x >= 0 (i.e. -7 <= x <= 1), and compute the positive and negative branches:
x <- seq(-7, 1, length = 1000)
y1 <- f1(x)
y2 <- f2(x)
And plot:
library(ggplot2)
ggplot() +
geom_line(aes(x, y1)) +
geom_line(aes(x, y2))
You can use contourLines:
f <- function(x,y) x^2 + 6*x + y^4
x <- seq(-10, 3, len = 200)
y <- seq(-3, 3, len = 200)
z <- outer(x, y, f)
cr <- contourLines(x, y, z, levels = 7)
plot(cr[[1]]$x, cr[[1]]$y, type = "l")
library(ggplot2)
dat <- data.frame(x = cr[[1]]$x, y = cr[[1]]$y)
ggplot(dat) + geom_path(aes(x, y))
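If you prefer to stay within ggplot2, here is a sketch of the same idea using geom_contour() on a long-format grid; the assumption is that breaks = 7 picks out the single contour where x^2 + 6*x + y^4 = 7.
library(ggplot2)
grid <- expand.grid(x = seq(-10, 3, length.out = 200),
                    y = seq(-3, 3, length.out = 200))
grid$z <- with(grid, x^2 + 6*x + y^4)   # evaluate the left-hand side on the grid
ggplot(grid, aes(x, y, z = z)) + geom_contour(breaks = 7, colour = "black")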
I have trouble understanding how to set the levels in the contour plot of a bivariate distribution in R. The documentation states that I can choose the levels by setting a
numeric vector of levels at which to draw contour lines
Now I would like the contour to show the limit containing 95% of the density or mass. But if, in the example below (adapted from here), I set the vector as a <- c(.95, .90), the code runs without error but no contour lines are displayed. If instead I set the vector as a <- c(.01, .05), the lines are displayed. But I am not sure I understand what the labels "0.01" and "0.05" mean with respect to the density.
library(mnormt)
x <- seq(-5, 5, 0.25)
y <- seq(-5, 5, 0.25)
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
f <- function(x, y) dmnorm(cbind(x, y), mu1, sigma1)
z <- outer(x, y, f)
a <- c(.01,.05)
contour(x, y, z, levels = a)
But I am not sure I understand what the labels "0.01" and "0.05" mean with respect to the density.
It means the points where the density equals 0.01 or 0.05. From help("contour"):
numeric vector of levels at which to draw contour lines.
So the levels are the function values at which the lines (contours) are drawn, i.e. where the function (here, the density) equals those values. A simple example that may help is x + y:
y <- x <- seq(0, 1, length.out = 50)
z <- outer(x, y, `+`)
par(mar = c(5, 5, 1, 1))
contour(x, y, z, levels = c(0.5, 1, 1.5))
Now I would like the contour to show the limit containing 95% of the density or mass.
In your example, you can follow my answer here and draw the exact points:
# input
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
# we start from points on the unit circle
n_points <- 100
xy <- cbind(sin(seq(0, 2 * pi, length.out = n_points)),
cos(seq(0, 2 * pi, length.out = n_points)))
# then we scale the dimensions
ev <- eigen(sigma1)
xy[, 1] <- xy[, 1] * 1
xy[, 2] <- xy[, 2] * sqrt(min(ev$values) / max(ev$values))
# then rotate
phi <- atan(ev$vectors[2, 1] / ev$vectors[1, 1])
R <- matrix(c(cos(phi), sin(phi), -sin(phi), cos(phi)), 2)
xy <- tcrossprod(R, xy)
# find the right length. You can change .95 to whichever
# quantile you want
chi_vals <- qchisq(.95, df = 2) * max(ev$values)
s <- sqrt(chi_vals)
par(mar = c(5, 5, 1, 1))
plot(s * xy[1, ] + mu1[1], s * xy[2, ] + mu1[2], lty = 1,
type = "l", xlab = "x", ylab = "y")
The levels argument indicates where the lines are drawn, with respect to the z values of the bivariate normal density. Since max(z) is 0.09188815, the levels a <- c(.95, .90) cannot be drawn.
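If you want contour() itself to show the 95% boundary, here is a sketch under the bivariate-normal assumption: the 95% region is bounded by Mahalanobis distance qchisq(.95, 2), and the density value on that boundary is the level to pass to contour().
# density value on the boundary of the 95% ellipse
lev95 <- exp(-qchisq(.95, df = 2) / 2) / (2 * pi * sqrt(det(sigma1)))
contour(x, y, z, levels = lev95, labels = "95%")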
To draw the line delimiting 95% of the mass I used the ellipse() function from mixtools, as suggested in this post (second answer from the top).
library(mixtools)
library(mnormt)
x <- seq(-5, 5, 0.25)
y <- seq(-5, 5, 0.25)
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
f <- function(x, y) dmnorm(cbind(x, y), mu1, sigma1)
z <- outer(x, y, f)
a <- c(.01,.05)
contour(x, y, z, levels = a)
ellipse(mu=mu1, sigma=sigma1, alpha = .05, npoints = 250, col="red")
I also found another solution in the book "Applied Multivariate Statistics with R" by Daniel Zelterman.
# Figure 6.5: Bivariate confidence ellipse
library(datasets)
library(MASS)
library(MVA)
#> Loading required package: HSAUR2
#> Loading required package: tools
biv <- swiss[, 2 : 3] # Extract bivariate data
bivCI <- function(s, xbar, n, alpha, m)
# returns m (x,y) coordinates of 1-alpha joint confidence ellipse of mean
{
x <- sin( 2* pi * (0 : (m - 1) )/ (m - 1)) # m points on a unit circle
y <- cos( 2* pi * (0 : (m - 1)) / (m - 1))
cv <- qchisq(1 - alpha, 2) # chisquared critical value
cv <- cv / n # value of quadratic form
for (i in 1 : m)
{
pair <- c(x[i], y[i]) # ith (x,y) pair
q <- pair %*% solve(s, pair) # quadratic form
x[i] <- x[i] * sqrt(cv / q) + xbar[1]
y[i] <- y[i] * sqrt(cv / q) + xbar[2]
}
return(cbind(x, y))
}
### pdf(file = "bivSwiss.pdf")
plot(biv, col = "red", pch = 16, cex.lab = 1.5)
lines(bivCI(var(biv), colMeans(biv), dim(biv)[1], .01, 1000), type = "l",
col = "blue")
lines(bivCI(var(biv), colMeans(biv), dim(biv)[1], .05, 1000),
type = "l", col = "green", lwd = 1)
lines(colMeans(biv)[1], colMeans(biv)[2], pch = 3, cex = .8, type = "p",
lwd = 1)
Consider three simple mathematical functions:
f1 <- function(x) 1/x
f2 <- function(x) tan(x)
f3 <- function(x) 1 / sin(x)
Each function has vertical asymptotes, i.e. f(x) tends to infinity as x approaches certain values. I plot these three functions with ggplot2::stat_function():
# x is between -5 to 5
ggplot(data.frame(x = c(-5, 5)), aes(x)) +
stat_function(fun = f1, n = 1000) +
coord_cartesian(ylim = c(-50, 50))
# x is between -2*pi to 2*pi
ggplot(data.frame(x = c(-2*pi, 2*pi)), aes(x)) +
stat_function(fun = f2, n = 1000) +
coord_cartesian(ylim = c(-50, 50))
# x is between -2*pi to 2*pi
ggplot(data.frame(x = c(-2*pi, 2*pi)), aes(x)) +
stat_function(fun = f3, n = 1000) +
coord_cartesian(ylim = c(-50, 50))
The asymptotes appear respectively at:
x1 <- 0
x2 <- c(-3/2*pi, -1/2*pi, 1/2*pi, 3/2*pi)
x3 <- c(-pi, 0, pi)
These near-vertical lines do not actually belong to the graphs; ggplot draws them because it connects points on either side of each asymptote. I attempted to cover them with geom_vline(), namely:
+ geom_vline(xintercept = x1, color = "white")
+ geom_vline(xintercept = x2, color = "white")
+ geom_vline(xintercept = x3, color = "white")
The output looks rough, and faint black marks can still be seen. Are there any more robust methods?
A solution related to @Mojoesque's comments that is not perfect but relatively simple, with two minor shortcomings: you need to know the asymptotes (x1, x2, x3), and you may need to reduce the range of y.
eps <- 0.01
f1 <- function(x) if(min(abs(x - x1)) < eps) NA else 1/x
f2 <- function(x) if(min(abs(x - x2)) < eps) NA else tan(x)
f3 <- function(x) if(min(abs(x - x3)) < eps) NA else 1 / sin(x)
ggplot(data.frame(x = c(-5, 5)), aes(x)) +
stat_function(fun = Vectorize(f1), n = 1000) +
coord_cartesian(ylim = c(-30, 30))
ggplot(data.frame(x = c(-2*pi, 2*pi)), aes(x)) +
stat_function(fun = Vectorize(f2), n = 1000) +
coord_cartesian(ylim = c(-30, 30))
ggplot(data.frame(x = c(-2*pi, 2*pi)), aes(x)) +
stat_function(fun = Vectorize(f3), n = 1000) +
coord_cartesian(ylim = c(-30, 30))
This solution is based on @Mojoesque's comment: partition the x-axis into subintervals around the asymptotes, build one stat_function() layer per subinterval, and combine the layers with purrr::reduce(). The constraint is that the asymptotes need to be given.
Take tan(x) for example :
f <- function(x) tan(x)
asymp <- c(-3/2*pi, -1/2*pi, 1/2*pi, 3/2*pi)
left <- -2 * pi # left border
right <- 2 * pi # right border
d <- 0.001
interval <- data.frame(x1 = c(left, asymp + d),
x2 = c(asymp - d, right))
interval # divide the entire x-axis into 5 sections
# x1 x2
# 1 -6.283185 -4.713389
# 2 -4.711389 -1.571796
# 3 -1.569796 1.569796
# 4 1.571796 4.711389
# 5 4.713389 6.283185
library(tidyverse)
pmap(interval, function(x1, x2) {
stat_function(fun = f, xlim = c(x1, x2), n = 1000)
}) %>% reduce(.f = `+`,
.init = ggplot(data.frame(x = c(left, right)), aes(x)) +
coord_cartesian(ylim = c(-50, 50)))
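Another common workaround, not from the answers above but sketched here under the assumption that geom_path breaks a line wherever y is NA: let the function return NA whenever its magnitude exceeds the plotted range, so no near-vertical segment is drawn across the asymptote.
library(ggplot2)
f_clip <- function(x, cutoff = 50) {
  y <- tan(x)
  y[abs(y) > cutoff] <- NA  # NA creates a gap in the path at each asymptote
  y
}
ggplot(data.frame(x = c(-2*pi, 2*pi)), aes(x)) +
  stat_function(fun = f_clip, n = 1000) +
  coord_cartesian(ylim = c(-50, 50))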
I have a question about fitting ellipses to data with the ellipse center at the origin. I have explored two methods that fit ellipses but generate an arbitrary center unless I manipulate the data with some imaginary mirror points.
Method#01
This portion of the script comes directly from this useful post; I'm copying the code here for convenience.
fit.ellipse <- function (x, y = NULL) {
# from:
# http://r.789695.n4.nabble.com/Fitting-a-half-ellipse-curve-tp2719037p2720560.html
#
# Least squares fitting of an ellipse to point data
# using the algorithm described in:
# Radim Halir & Jan Flusser. 1998.
# Numerically stable direct least squares fitting of ellipses.
# Proceedings of the 6th International Conference in Central Europe
# on Computer Graphics and Visualization. WSCG '98, p. 125-132
#
# Adapted from the original Matlab code by Michael Bedward (2010)
# michael.bedward#gmail.com
#
# Subsequently improved by John Minter (2012)
#
# Arguments:
# x, y - x and y coordinates of the data points.
# If a single arg is provided it is assumed to be a
# two column matrix.
#
# Returns a list with the following elements:
#
# coef - coefficients of the ellipse as described by the general
# quadratic: ax^2 + bxy + cy^2 + dx + ey + f = 0
#
# center - center x and y
#
# major - major semi-axis length
#
# minor - minor semi-axis length
#
EPS <- 1.0e-8
dat <- xy.coords(x, y)
D1 <- cbind(dat$x * dat$x, dat$x * dat$y, dat$y * dat$y)
D2 <- cbind(dat$x, dat$y, 1)
S1 <- t(D1) %*% D1
S2 <- t(D1) %*% D2
S3 <- t(D2) %*% D2
T <- -solve(S3) %*% t(S2)
M <- S1 + S2 %*% T
M <- rbind(M[3,] / 2, -M[2,], M[1,] / 2)
evec <- eigen(M)$vec
cond <- 4 * evec[1,] * evec[3,] - evec[2,]^2
a1 <- evec[, which(cond > 0)]
f <- c(a1, T %*% a1)
names(f) <- letters[1:6]
# calculate the center and lengths of the semi-axes
#
# see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2288654/
# J. R. Minter
# for the center, linear algebra to the rescue
# center is the solution to the pair of equations
# 2ax + by + d = 0
# bx + 2cy + e = 0
# or
# | 2a b | |x| |-d|
# | b 2c | * |y| = |-e|
# or
# A x = b
# or
# x = Ainv b
# or
# x = solve(A) %*% b
# use TRUE explicitly: the shorthand T was reassigned to a matrix above
A <- matrix(c(2*f[1], f[2], f[2], 2*f[3]), nrow=2, ncol=2, byrow=TRUE)
b <- matrix(c(-f[4], -f[5]), nrow=2, ncol=1, byrow=TRUE)
soln <- solve(A) %*% b
b2 <- f[2]^2 / 4
center <- c(soln[1], soln[2])
names(center) <- c("x", "y")
num <- 2 * (f[1] * f[5]^2 / 4 + f[3] * f[4]^2 / 4 + f[6] * b2 - f[2]*f[4]*f[5]/4 - f[1]*f[3]*f[6])
den1 <- (b2 - f[1]*f[3])
den2 <- sqrt((f[1] - f[3])^2 + 4*b2)
den3 <- f[1] + f[3]
semi.axes <- sqrt(c( num / (den1 * (den2 - den3)), num / (den1 * (-den2 - den3)) ))
# calculate the angle of rotation
term <- (f[1] - f[3]) / f[2]
angle <- atan(1 / term) / 2
list(coef=f, center = center, major = max(semi.axes), minor = min(semi.axes), angle = unname(angle))
}
Let's take an example distribution of polar points for illustration purposes:
X<-structure(list(x_polar = c(0, 229.777200000011, 246.746099999989,
-10.8621999999741, -60.8808999999892, 75.8904999999795, -83.938199999975,
-62.9770000000135, 49.1650999999838, 52.3093000000226, 49.6891000000178,
-66.4248999999836, 34.3671999999788, 242.386400000018, 343.60619999998
), y_polar = c(0, 214.868299999973, 161.063599999994, -68.8972000000067,
-77.0230000000447, 93.2863000000361, -16.2356000000145, 27.7828000000445,
-17.8077000000048, 2.10540000000037, 25.6866000000155, -84.6034999999683,
-31.1800000000512, 192.010800000047, 222.003700000001)), .Names = c("x_polar",
"y_polar"), row.names = c(NA, -15L), class = "data.frame")
efit <- fit.ellipse(X)
e <- get.ellipse(efit)
#plot
par(bg=NA)
plot(X, pch=3, col='gray', lwd=2, axes=F, xlab="", ylab="", type='n',
ylim=c(min(X$y_polar)-150, max(X$y_polar)), xlim=c(min(X$x_polar)-150, max(X$x_polar))) #blank plot
points(X$x_polar, X$y_polar, pch=3, col='gray', lwd=2, axes=F, xlab="", ylab="") #observations
lines(e, col="red", lwd=3, lty=2) #plotting the ellipse
points(0,0,col=2, lwd=2, cex=2) #center/origin
To bring the center of the ellipse to the origin, we could modify the data as follows (surely not the best way of doing it):
#generate mirror coordinates
X$x_polar_mirror<- -X$x_polar
X$y_polar_mirror<- -X$y_polar
mydata<-as.matrix(data.frame(c(X$x_polar, X$x_polar_mirror), c(X$y_polar, X$y_polar_mirror)))
#fit the data
efit <- fit.ellipse(mydata)
e <- get.ellipse(efit)
par(bg=NA)
plot(mydata, pch=3, col='gray', lwd=2, axes=F, xlab="", ylab="", type='n',
ylim=c(min(X$y_polar)-150, max(X$y_polar)), xlim=c(min(X$x_polar)-150, max(X$x_polar)))
points(X$x_polar, X$y_polar, pch=3, col='gray', lwd=2, axes=F, xlab="", ylab="")
lines(e, col="red", lwd=3, lty=2)
points(0,0,col=2, lwd=2, cex=2) #center
Well... it sort of does the job, but no one would be happy with all those imaginary points being included in the calculation.
Method#02
This is another indirect way of fitting the data but again the ellipse center is not at the origin. Any workaround?
require(car)
dataEllipse(X$x_polar, X$y_polar, levels=c(0.15, 0.7),
xlim=c(-150, 400), ylim=c(-200,300))
My questions: (a) is there a robust alternative way of fitting these points with the ellipse center at the origin (0,0)? (b) is there a measure of the goodness of ellipse fit? Thank you in advance.
I'm not really happy with the approach I've conceived; there should be a closed-form solution, but still:
# Ellipse equation centered at (0, 0) with semi-axes pars[1] and pars[2], rotated by pars[3].
# t and pars[3] are in radians.
ellipsePoints <- function(t, pars) {
data.frame(x = cos(pars[3]) * pars[1] * cos(t) - sin(pars[3]) * pars[2] * sin(t),
y = sin(pars[3]) * pars[1] * cos(t) + cos(pars[3]) * pars[2] * sin(t))
}
# Fit an ellipse by minimising the radial distance between the data points and
# the ellipse. If weighted, points farther from the center have a bigger impact.
ellipseBrute <- function(x, y, pars, weighted = FALSE) {
d <- sqrt(x**2 + y**2)
t <- asin(y/d)
w <- (d/sum(d))**weighted
t[x == 0 & y == 0] <- 0
ep <- ellipsePoints(t, pars)
sum(w*(sqrt(ep$x**2 + ep$y**2) - d)**2)
}
# Fit through optim.
opt_res <- optim(c(diff(range(X$x_polar)),
diff(range(X$y_polar)),
2*pi)/2,
ellipseBrute,
x = X$x_polar, y = X$y_polar,
weighted = TRUE
)
# Check the resulting ellipse with a plot
df <- ellipsePoints(seq(0, 2*pi, length.out = 1e3), opt_res$par)
plot(y ~ x, df, col = 'blue', t = 'l',
xlim = range(c(X$x_polar, df$x)),
ylim = range(c(X$y_polar, df$y)))
points(0, 0, pch = 3, col = 'blue')
points(y_polar ~ x_polar, X)
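Regarding question (b), a rough sketch rather than a formal statistic: the value optim() minimised is itself a measure of misfit. With weighted = TRUE the weights d/sum(d) sum to one, so its square root can be read as a weighted root-mean-square radial residual, in the same units as the data (smaller means a tighter fit).
sqrt(opt_res$value)  # weighted RMS of the radial residuals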