How to get R contour function to plot decimals

I am trying to plot the contours of a function made of two Gaussians, one centered at (2, 2) and the other centered at (-2, -2). Here is my code.
k1 <- 2
k2 <- 2
mu1 <- rbind(2, 2)
mu2 <- rbind(-2, -2)
sigma1 <- rbind(c(.6, 0), c(0, .6))
sigma2 <- rbind(c(.3, 0), c(0, .3))
det1 <- det(sigma1)
det2 <- det(sigma2)
inv1 <- solve(sigma1)
inv2 <- solve(sigma2)
x <- runif(1000, -5, 5)
y <- runif(1000, -5, 5)
w <- rbind(x, y)
ratio <- function(v){
quotient <- log((2*pi)^(-k1/2)*det1^(-1/2)*(exp((-1/2)*t(v-mu1)%*%inv1%*%(v-mu1))))/log((2*pi)^(-k2/2)*det2^(-1/2)*(exp((-1/2)*t(v-mu2)%*%inv2%*%(v-mu2))))
return(quotient)
}
z <- apply(w, 2, ratio)
round.z <- round(z, digits=0)
df <- cbind(x, y, z, round.z)
df <- as.data.frame(df)
library(akima) # provides interp() for gridding scattered points
grid <- with(df, interp(x, y, z))
contour(grid, levels=0:10, asp=1)
But when I plot these contours, I only get contours at whole-number values.
There should be more, similar-looking contours in the first quadrant with decimal values (because I am taking a ratio), but they do not appear. I can't find how to get contour() to plot decimal levels. Does anyone know how to fix this problem?

As user2554330 commented, I can use levels=c((0:10)/10, 2:10). Thank you to them!
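For illustration, here is a minimal base-R sketch of the same idea. It evaluates the ratio of log densities on a regular grid instead of interpolating random points, so it does not need interp(). The helper logdens() and the grid size are my own choices, and the helper assumes the covariance is a multiple of the identity, as it is in the question.

```r
# Log of a bivariate normal density with covariance s2 * I (hypothetical helper)
logdens <- function(x, y, mu, s2) {
  -log(2 * pi) - log(s2) - ((x - mu[1])^2 + (y - mu[2])^2) / (2 * s2)
}

# Regular grid over the same square as in the question
gx <- seq(-5, 5, length.out = 201)
gy <- gx

# Ratio of the two log densities, as in the question's ratio() function
z <- outer(gx, gy, function(x, y) {
  logdens(x, y, c(2, 2), 0.6) / logdens(x, y, c(-2, -2), 0.3)
})

# Decimal levels between 0 and 1, then whole numbers from 2 to 10
lev <- c((0:10) / 10, 2:10)
contour(gx, gy, z, levels = lev, asp = 1)
```

contour() only draws lines at the values passed in levels, so levels = 0:10 can never show a contour at 0.3 or 0.7, whatever the data look like.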

Related

Hist with lines in R

I generate 4 parts of big data: cluster1(10000 points), cluster2(15000 points), cluster3(15000 points) and throws(500 points). Here is the code:
library('MASS')
library('fpc')
#library("dbscan")
library("factoextra")
library("clustertend")
library("boot")
library("stream")
set.seed(123)
mu1<-c(-5,-7)
mu1
sigma1<-matrix(c(4,-2,-2,2), nrow=2, ncol=2, byrow = TRUE)
sigma1
n<-10000
cluster1<-mvrnorm(n,mu1,sigma1)
cluster1
#cluster1<-as.data.frame(cluster1)
#cluster1
#c<-runif(10000,1,1000)
#c
phi <- runif(15000, max = 2*pi)
rho <- sqrt(runif(15000))
x <- sqrt(5)*rho*cos(phi) + 6
y <- sqrt(10/3)*rho*sin(phi) + 4
range(2*(x - 6)^2 + 3*(y - 4)^2)
#[1] 0.001536582 9.999425234
plot(x, y)
cluster2<-cbind(x,y)
cluster2
u <- runif(15000, max = 3)
v <- runif(15000, max = 2)
x <- u + v - 10
y <- v - u + 8
range(x + y)
#[1] -1.999774 1.999826
range(x - y + 15)
#[1] -2.999646 2.999692
plot(x, y)
cluster3<-cbind(x,y)
cluster3
#cluster3<-as.data.frame(cluster1)
#cluster3
x <- runif(500, -20, 20)
y <- runif(500, -20, 20)
#u <- runif(500, max = 20)
#v <- runif(500, max = 20)
#x <- u + v - 20
#y <- v - u
range(x)
range(y)
plot(x,y)
throws<-cbind(x,y)
throws
data<-rbind(cluster1,cluster2,cluster3,throws)
data<-as.data.frame(data)
data
plot(data)
Then, using the bootstrap method, I try to construct the distribution of the Hopkins statistic H for a fixed m, chosen as 7% of the total number of generated points (m = 2835). Here is the code where I do this:
B<-10#number of iterations
H<-NULL#value of Hopkins statistic
for(i in 1:B){
N<-dim(data)[1]
s<-sample(N,0.8*N)
stat<-hopkins(data[s,], n=2835, byrow = TRUE)$H
H[i]<-stat
#print(c(i, stat))
}
It takes a very long time to run. Then I need to compare this result with the beta distribution B(m, m). Here is the code:
hist(H)
#(density(H), col="red")
#hist(distB)
X<-seq(min(H), max(H), 0.001)
X
lines(X, dbeta(X,2835,2835), type="l", col="red")
The problem is that the line isn't drawn properly on the histogram. Can anybody say what the problem is? I can see a red line in the plot, but it's not right.
Your y-axis values plotted by dbeta() are way too low to register on the supplied y-axis (<0.0000001). You need to overlay the second plot:
# sample data
H <- sample(seq(0.455,0.475,0.001), 1000, replace = TRUE)
#plot histogram
hist(H)
# prepare graphics to add second plot
par(new = TRUE)
# sample data for second plot
X <- seq(0.455,0.475, 0.001)
Y <- dbeta(X,2835,2835)
# plot second plot, remove axes
plot(X, Y, type = "l", col = "red", axes = FALSE, xlab = "", ylab = "")
axis(4, at = pretty(Y)) # add axis on right side
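A possible alternative, sketched on the assumption that the goal is to compare H with the Beta(m, m) density on one scale: draw the histogram as a density (freq = FALSE) so the bars and the curve share a y-axis. The rbeta() sample below is simulated stand-in data, not the bootstrap output from the question.

```r
set.seed(1)
m <- 2835

# Stand-in for the bootstrap values of H (assumption: simulated, not real output)
H <- rbeta(1000, m, m)

# Compute the histogram first so the y-limits can cover both bars and curve
h <- hist(H, breaks = 30, plot = FALSE)
peak <- dbeta(0.5, m, m) # Beta(m, m) is symmetric, so its mode is at 0.5

plot(h, freq = FALSE, ylim = c(0, max(h$density, peak)),
     main = "H vs. Beta(m, m)", xlab = "H")
curve(dbeta(x, m, m), add = TRUE, col = "red", lwd = 2)
```

If the real H values sit well away from 0.5 (the sample data in the answer above range from 0.455 to 0.475), the curve will stay close to zero over that range. That would reflect a mismatch between the data and Beta(m, m), not a plotting problem.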

Differences between plotting contour() function in base R and using geom_contour() or stat_contour() in ggplot2

I have plotted a density function in base R and I would like to replicate the plot in ggplot2.
This is the plot in base R:
library(tidyverse)
library(mvtnorm)
sd <- 1 / 2
# sigma
s1 <- sd^2
# first two vectors
x.points <- seq(-3, 3, length.out = 100)
y.points <- seq(-3, 3, length.out = 100)
# the third vector is a density
z <- matrix(0, nrow = 100, ncol = 100)
mu1 <- c(0, 0)
sigma1 <- matrix(c(s1^2, 0, 0, s1^2), nrow = 2)
for (i in 1:100) {
for (j in 1:100) {
z[i, j] <- dmvnorm(c(x.points[i], y.points[j]),
mean = mu1, sigma = sigma1
)
}
}
contour(x.points, y.points, z, xlim = range(-3, 3), ylim = c(-3, 3), nlevels = 5, drawlabels = TRUE)
To obtain the same result in ggplot2, I am following this example:
library(ggplot2)
library(reshape2) # for melt
volcano3d <- melt(volcano)
names(volcano3d) <- c("x", "y", "z")
# Basic plot
v <- ggplot(volcano3d, aes(x, y, z = z))
v + stat_contour()
But in my case vector z has a different length than x.points and y.points. From the errors I get below, it looks like the three vectors should have the same length. How can I transform the dataset presented above so that it can be run through ggplot2?
data1 <- as.data.frame(cbind(x.points, y.points))
p <- ggplot(data = data1, mapping = aes(x.points, y.points, z=z))
p + geom_contour()
#> Error: Aesthetics must be either length 1 or the same as the data (100): z
p + stat_contour()
#> Error: Aesthetics must be either length 1 or the same as the data (100): z
p + stat_function(fun = contour) + xlim(-3,3)
#> Error: Aesthetics must be either length 1 or the same as the data (100): z
Created on 2021-04-08 by the reprex package (v0.3.0)
The problem is likely that your data isn't in long format: for every value of the z matrix, you need the x and y position, which is different from the base R approach, wherein you just need these positions for every row/column.
We can transform the matrix z to a long format using reshape2::melt and then grab the correct positions from your vectors.
library(tidyverse)
library(mvtnorm)
sd <- 1 / 2
# sigma
s1 <- sd^2
# first two vectors
x.points <- seq(-3, 3, length.out = 100)
y.points <- seq(-3, 3, length.out = 100)
# the third vector is a density
z <- matrix(0, nrow = 100, ncol = 100)
mu1 <- c(0, 0)
sigma1 <- matrix(c(s1^2, 0, 0, s1^2), nrow = 2)
for (i in 1:100) {
for (j in 1:100) {
z[i, j] <- dmvnorm(c(x.points[i], y.points[j]),
mean = mu1, sigma = sigma1
)
}
}
# Here be the reshaping bit
df <- reshape2::melt(z)
df <- transform(
df,
x = x.points[Var1],
y = y.points[Var2]
)
ggplot(df, aes(x, y)) +
geom_contour(aes(z = value))
Created on 2021-04-08 by the reprex package (v1.0.0)
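As a variation on the same idea, the long-format data can also be built directly with expand.grid(), skipping the matrix and the melt step. This is only a sketch: it replaces dmvnorm() with a product of two dnorm() calls, which is valid here only because the covariance matrix in the question is diagonal, with variance s1^2 = 1/16 (standard deviation 0.25) on each axis.

```r
library(ggplot2)

x.points <- seq(-3, 3, length.out = 100)
y.points <- seq(-3, 3, length.out = 100)

# One row per (x, y) combination: this is the long format ggplot2 expects
grid <- expand.grid(x = x.points, y = y.points)

# Density of independent normals with sd = 0.25 (assumes a diagonal covariance)
grid$z <- dnorm(grid$x, mean = 0, sd = 0.25) * dnorm(grid$y, mean = 0, sd = 0.25)

p <- ggplot(grid, aes(x, y, z = z)) +
  geom_contour()
print(p)
```

For a covariance matrix with non-zero off-diagonal entries, the dnorm() product would be wrong; calling dmvnorm() on as.matrix(grid[, c("x", "y")]) would be the natural replacement.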

Understanding "levels" in r contour function of bivariate distribution

I have trouble understanding how to set the levels in the plot of a bivariate distribution in r. The documentation states that I can choose the levels by setting a
numeric vector of levels at which to draw contour lines
Now I would like the contour to show the limit containing 95% of the density or mass. But if, in the example below (adapted from here) I set the vector as a <- c(.95,.90) the code runs without error but the plot is not displayed. If instead, I set the vector as a <- c(.01,.05) the plot is displayed. But I am not sure I understand what the labels "0.01" and "0.05" mean with respect to the density.
library(mnormt)
x <- seq(-5, 5, 0.25)
y <- seq(-5, 5, 0.25)
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
f <- function(x, y) dmnorm(cbind(x, y), mu1, sigma1)
z <- outer(x, y, f)
a <- c(.01,.05)
contour(x, y, z, levels = a)
"But I am not sure I understand what the labels "0.01" and "0.05" mean with respect to the density."
They mark the points where the density is equal to 0.01 and 0.05. From help("contour"):
numeric vector of levels at which to draw contour lines.
So levels holds the function values at which the lines (contours) are drawn, that is, where the function (in this case the density) is equal to those levels. A simple example that may help is z = x + y:
y <- x <- seq(0, 1, length.out = 50)
z <- outer(x, y, `+`)
par(mar = c(5, 5, 1, 1))
contour(x, y, z, levels = c(0.5, 1, 1.5))
"Now I would like the contour to show the limit containing 95% of the density or mass."
In your example, you can follow my answer here and draw the exact points:
# input
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
# we start from points on the unit circle
n_points <- 100
xy <- cbind(sin(seq(0, 2 * pi, length.out = n_points)),
cos(seq(0, 2 * pi, length.out = n_points)))
# then we scale the dimensions
ev <- eigen(sigma1)
xy[, 1] <- xy[, 1] * 1
xy[, 2] <- xy[, 2] * sqrt(min(ev$values) / max(ev$values))
# then rotate
phi <- atan(ev$vectors[2, 1] / ev$vectors[1, 1])
R <- matrix(c(cos(phi), sin(phi), -sin(phi), cos(phi)), 2)
xy <- tcrossprod(R, xy)
# find the right length. You can change .95 to which ever
# quantile you want
chi_vals <- qchisq(.95, df = 2) * max(ev$values)
s <- sqrt(chi_vals)
par(mar = c(5, 5, 1, 1))
plot(s * xy[1, ] + mu1[1], s * xy[2, ] + mu1[2], lty = 1,
type = "l", xlab = "x", ylab = "y")
The levels indicate where the lines are drawn, with respect to the specific 'z' value of the bivariate normal density. Since max(z) is 0.09188815, levels of a <- c(.95,.90) can't be drawn.
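To make the link between density levels and probability mass concrete, here is a small sketch for the bivariate normal case only. The region containing probability p is the set where the quadratic form is at most qchisq(p, 2), so the matching density level is the peak density times exp(-qchisq(p, 2) / 2). The dens() helper is my own base-R stand-in for dmnorm().

```r
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
inv <- solve(sigma1)

# Bivariate normal density, vectorized over x and y (hypothetical helper)
dens <- function(x, y) {
  d <- cbind(x - mu1[1], y - mu1[2])
  q <- rowSums((d %*% inv) * d) # quadratic form for each point
  exp(-q / 2) / (2 * pi * sqrt(det(sigma1)))
}

x <- seq(-5, 5, 0.25)
y <- x
z <- outer(x, y, dens)

# Density value on the boundary of the region holding 95% of the mass
peak <- 1 / (2 * pi * sqrt(det(sigma1)))
level95 <- peak * exp(-qchisq(0.95, df = 2) / 2)

contour(x, y, z, levels = level95, labels = "95%")
```

With two dimensions, exp(-qchisq(p, 2) / 2) equals 1 - p, so the 95% contour sits at 5% of the peak density, far below the 0.95 that was tried as a level.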
To draw the line delimiting 95% of the mass I used the ellipse() function as suggested in this post (second answer from the top).
library(mixtools)
library(mnormt)
x <- seq(-5, 5, 0.25)
y <- seq(-5, 5, 0.25)
mu1 <- c(0, 0)
sigma1 <- matrix(c(2, -1, -1, 2), nrow = 2)
f <- function(x, y) dmnorm(cbind(x, y), mu1, sigma1)
z <- outer(x, y, f)
a <- c(.01,.05)
contour(x, y, z, levels = a)
ellipse(mu=mu1, sigma=sigma1, alpha = .05, npoints = 250, col="red")
I also found another solution in the book "Applied Multivariate Statistics with R" by Daniel Zelterman.
# Figure 6.5: Bivariate confidence ellipse
library(datasets)
library(MASS)
library(MVA)
#> Loading required package: HSAUR2
#> Loading required package: tools
biv <- swiss[, 2 : 3] # Extract bivariate data
bivCI <- function(s, xbar, n, alpha, m)
# returns m (x,y) coordinates of 1-alpha joint confidence ellipse of mean
{
x <- sin( 2* pi * (0 : (m - 1) )/ (m - 1)) # m points on a unit circle
y <- cos( 2* pi * (0 : (m - 1)) / (m - 1))
cv <- qchisq(1 - alpha, 2) # chisquared critical value
cv <- cv / n # value of quadratic form
for (i in 1 : m)
{
pair <- c(x[i], y[i]) # ith (x,y) pair
q <- pair %*% solve(s, pair) # quadratic form
x[i] <- x[i] * sqrt(cv / q) + xbar[1]
y[i] <- y[i] * sqrt(cv / q) + xbar[2]
}
return(cbind(x, y))
}
### pdf(file = "bivSwiss.pdf")
plot(biv, col = "red", pch = 16, cex.lab = 1.5)
lines(bivCI(var(biv), colMeans(biv), dim(biv)[1], .01, 1000), type = "l",
col = "blue")
lines(bivCI(var(biv), colMeans(biv), dim(biv)[1], .05, 1000),
type = "l", col = "green", lwd = 1)
lines(colMeans(biv)[1], colMeans(biv)[2], pch = 3, cex = .8, type = "p",
lwd = 1)
Created on 2021-03-15 by the reprex package (v0.3.0)

Plotting Contours with x, y, z values

I am trying to create a contour plot of 1000 data points. I have the matrix with all of the values in it. Here is my code.
mu1 <- rbind(2, 2)
mu2 <- rbind(-2, -2)
sigma1 <- rbind(c(.6, 0), c(0, .6))
sigma2 <- sigma1
det1 <- det(sigma1)
det2 <- det1
inv1 <- solve(sigma1)
inv2 <- inv1
x <- runif(1000, -5, 5)
y <- runif(1000, -5, 5)
w <- rbind(x, y)
ratio <- function(v){
quotient <- (exp((-1/2)*t(v-mu1)%*%inv1%*%(v-mu1)))/(exp((-1/2)*t(v-mu2)%*%inv2%*%(v-mu2)))
return(quotient)
}
z <- apply(w, 2, ratio)
round.z <- round(z, digits=0)
df <- cbind(x, y, z, round.z)
df <- as.data.frame(df)
I want to plot the contours of x and y by the round.z values including where round.z=1. I know that the contour where round.z=1 should be the line y=-x, but I don't know how to get it to show up. Thanks for the help.
The contour and related functions in R want to have the data on a grid, not a random sample like yours. The akima::interp function can convert your data to this format. For example, after running your code,
library(akima)
grid <- with(df, interp(x, y, round.z))
contour(grid, levels = 10^(0:10))
which draws the contours of round.z on the interpolated grid.
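Since the function here is known in closed form, another option is to evaluate it on a regular grid and skip the interpolation. The sketch below assumes, as in the question, that both Gaussians share the same covariance. The quad() helper and the chosen levels are my own; the ratio is computed from the difference of the two quadratic forms rather than by dividing two exp() values, which avoids 0/0 when both densities underflow far from the centers.

```r
mu1 <- c(2, 2)
mu2 <- c(-2, -2)
inv <- solve(diag(0.6, 2)) # shared inverse covariance

# Quadratic form (v - mu)' inv (v - mu), vectorized over points (hypothetical helper)
quad <- function(x, y, mu) {
  d <- cbind(x - mu[1], y - mu[2])
  rowSums((d %*% inv) * d)
}

# Density ratio N(mu1) / N(mu2); the normalizing constants cancel
ratio <- function(x, y) exp(-0.5 * (quad(x, y, mu1) - quad(x, y, mu2)))

gx <- seq(-5, 5, length.out = 101)
gy <- gx
z <- outer(gx, gy, ratio)

# The level 10^0 = 1 is the line y = -x
contour(gx, gy, z, levels = 10^(-2:2), asp = 1)
```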

Using dates on axis in persp

I'm trying to plot the dates on the x-axis of a persp plot, but cannot find a way of doing so. This is where I am at:
x <- seq(-10, 10, length= 30)
x0 <- as.Date("2000-01-01")
x.dates <- seq(x0,x0+length(x)-1,1)
y <- x
f <- function(x,y) { r <- sqrt(x^2+y^2); 10 * sin(r)/r }
z <- outer(x, y, f)
z[is.na(z)] <- 1
op <- par(bg = "white")
persp(x.dates, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue",ticktype="detailed")
Here's a way to plot perspective with dates (by Jeff Ryan):
http://www.quantmod.com/examples/chartSeries3d/
The alpha code for the above graph is at the following url. This is a DOWNLOAD of R code, so I purposely omitted the http stuff:
www.quantmod.com/examples/chartSeries3d/chartSeries3d.alpha.R
If you look at the code, you can see how he did it.
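One base-graphics route, offered as a sketch and not as the approach used in the linked code: persp() returns the viewing transformation matrix, and trans3d() maps 3D points onto the 2D plot, so date labels can be placed by hand with text(). Labelling every 7th date and the "%d %b" format are arbitrary choices of mine.

```r
x <- seq(-10, 10, length = 30)
x0 <- as.Date("2000-01-01")
x.dates <- seq(x0, by = "day", length.out = length(x))
y <- x
f <- function(x, y) { r <- sqrt(x^2 + y^2); 10 * sin(r) / r }
z <- outer(x, y, f)
z[is.na(z)] <- 1

# persp() needs numeric coordinates, so plot the dates as day counts
xn <- as.numeric(x.dates)
pmat <- persp(xn, y, z, theta = 30, phi = 30, expand = 0.5,
              col = "lightblue", axes = FALSE)

# Project chosen dates onto the front edge of the box and label them
idx <- seq(1, length(xn), by = 7)
lab_xy <- trans3d(xn[idx], min(y), min(z), pmat)
text(lab_xy$x, lab_xy$y, labels = format(x.dates[idx], "%d %b"),
     cex = 0.7, pos = 1, xpd = NA)
```

Tick marks and labels for the other two axes would have to be added the same way, since axes = FALSE turns off all of them.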

Resources