How to add confidence intervals to 3D surface? - r

I have a 40x40 matrix of values obtained by interpolation with the akima library, which I use to create a 3D surface.
I estimated 95% CIs from the predicted values using Monte Carlo simulations, and now I want to add them for year 0 to my 3D graph.
I'm doing something wrong, and I don't understand how to plot vertical lines to represent the CIs.
My lines look like this:
And I want to have CI like on this image:
Here are my data (a Dropbox link, because the file is longer than the space allowed here): https://www.dropbox.com/s/c6iyd2r00k5jbws/data.rtf?dl=0
and my code:
persp(xyz,theta = 45, phi = 25,border="grey40", ticktype = "detailed", zlim=c(0,.8))->res2
y.bin <- rep(1,25)
x.bin <- seq(-10,10,length.out = 25)
points (trans3d(x.bin, y.bin, z = y0, pmat = res2), col = 1, lwd=2)
lines (trans3d(x.bin, y.bin, z = LCI, pmat = res2), col = 1, lwd=2)
lines (trans3d(x.bin, y.bin, z = UCI, pmat = res2), col = 1, lwd=2)

The problem is that the upper and lower confidence limits are each being drawn as a single continuous line. If you instead loop over the points and, at each one, draw a line between its lower and upper value, the plot looks closer to what you want. Note that in this example many of the y0 point values do not lie on their intervals.
# data from link is imported
persp(xyz,theta = 45, phi = 25,border="grey40", ticktype = "detailed", zlim=c(0,.8))->res2
y.bin <- rep(1,25)
x.bin <- seq(-10,10,length.out = 25)
# y0 points
points (trans3d(x.bin, y.bin, z = y0, pmat = res2), col = 1, lwd=2)
# lines between upper and lower CIs for each location
for (i in seq_along(LCI)) {
  # both ends share the same x and y; only z runs from the lower to the upper limit
  lines(trans3d(rep(x.bin[i], 2), rep(y.bin[i], 2), z = c(LCI[i], UCI[i]), pmat = res2), col = 1, lwd = 2)
}
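Since the question's data are only available through the external link, here is a minimal self-contained sketch of the same idea on a synthetic surface. The names zmat, est, lci and uci are made up for illustration, and the interval half-width of 0.05 is arbitrary rather than the result of any simulation. Because trans3d() is vectorised, the loop can also be replaced by a single call to segments():

```r
# Synthetic 40 x 40 surface standing in for the interpolated matrix (assumption)
x <- seq(-10, 10, length.out = 40)
y <- seq(0, 10, length.out = 40)
zmat <- outer(x, y, function(x, y) 0.4 + 0.3 * exp(-(x / 6)^2) * cos(y / 5))

res <- persp(x, y, zmat, theta = 45, phi = 25, border = "grey40",
             ticktype = "detailed", zlim = c(0, 0.8))

# Hypothetical estimates and interval limits along the first y value
x.bin <- seq(-10, 10, length.out = 25)
y.bin <- rep(y[1], 25)
est <- 0.4 + 0.3 * exp(-(x.bin / 6)^2) * cos(y[1] / 5)
lci <- est - 0.05
uci <- est + 0.05

# Project both ends of every interval into 2D plot coordinates
lower <- trans3d(x.bin, y.bin, lci, pmat = res)
upper <- trans3d(x.bin, y.bin, uci, pmat = res)

# One segment per location, plus the point estimates
segments(lower$x, lower$y, upper$x, upper$y, lwd = 2)
points(trans3d(x.bin, y.bin, est, pmat = res), pch = 16)
```

With real data, est, lci and uci would be replaced by y0, LCI and UCI from the question.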

Related

Histogram with normal curve and error bars in R

#import data
data = diameters$V1
error = .005 #mm
#make histogram
h <- hist(data, breaks = "FD", density = 10,
col = "lightblue", xlab = "Diameter", main = "Overall")
# Make normal curve
xfit <- seq(min(data), max(data), length = 40)
yfit <- dnorm(xfit, mean = mean(data), sd = sd(data))
yfit <- yfit * diff(h$mids[1:2]) * length(data)
#Draw normal curve
lines(xfit, yfit, col = "black", lwd = 2)
Output:
Expectation:
Is it possible to add error bars to the histogram using the value of +/- error without any external libraries?
You should be able to draw them with the arrows() function:
## Create a histogram from random data
hist(sample(runif(100)))
arrows(x0 = 0.15, y0 = 11, x1 = 0.15, y1 = 13, code = 3, length = 0.05, angle = 90)
x0 and x1 specify the start and finish x coordinates (for a straight vertical line, keep them the same)
y0 and y1 specify the start and finish y coordinates, i.e. the length of the line to draw.
code = 3 tells R to draw a double sided 'arrow', angle = 90 makes the 'arrow' a flat line and length = 0.05 specifies how wide the error bars should be.
See ?arrows for more details.
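To put a bar on every bin rather than at one hand-picked position, the bin midpoints and counts returned by hist() can be passed to arrows() as vectors. The sketch below uses made-up data, and the error size (the square root of each count) is only a placeholder assumption; substitute whatever error applies to your measurements.

```r
set.seed(1)
d <- rnorm(200, mean = 5, sd = 0.2)      # made-up diameters

h <- hist(d, breaks = "FD", plot = FALSE)
err <- sqrt(h$counts)                    # placeholder error per bin (assumption)

# Leave room above the tallest bar for its error bar
plot(h, col = "lightblue", xlab = "Diameter", main = "Overall",
     ylim = c(0, max(h$counts + err)))

# Skip empty bins: a zero-length arrow cannot be drawn
keep <- h$counts > 0
arrows(x0 = h$mids[keep], y0 = (h$counts - err)[keep],
       x1 = h$mids[keep], y1 = (h$counts + err)[keep],
       code = 3, angle = 90, length = 0.05)
```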

Plotting the area under the curve of various distributions in R

Suppose I'm trying to find the area beyond a certain value for a Student's t distribution. I calculate my t test statistic to be t=1.78 with 23 degrees of freedom, for example. I know how to get the area under the curve above t=1.78 with the pt() function. How can I get a plot of the Student's t distribution with 23 degrees of freedom and the area under the curve above 1.78 shaded in? That is, I want the curve for pt(1.78,23,lower.tail=FALSE) plotted with the appropriate area shaded. Is there a way to do this?
ggplot version:
library(ggplot2)
ggplot(data.frame(x = c(-4, 4)), aes(x)) +
  stat_function(fun = dt, args = list(df = 23)) +  # full density curve
  stat_function(fun = dt, args = list(df = 23),    # shaded area above 1.78
                xlim = c(1.78, 4),
                geom = "area")
This should work:
x_coord <- seq(-5, 5, length.out = 200) # x-coordinates
plot(x_coord, dt(x_coord, 23), type = "l",
xlab = expression(italic(t)), ylab = "Density", bty = "l") # plot PDF
polygon(c(1.78, seq(1.78, 5, by = .3), 5, 5), # polygon for area under curve
c(0, dt(c(seq(1.78, 5, by = .3), 5), 23), 0),
col = "red", border = NA)
Regarding arguments to polygon():
your first and last points should be [1.78, 0] and [5, 0] (5 only because the plot goes to 5) - these basically define the bottom edge of the red polygon
2nd and penultimate points are [1.78, dt(1.78, 23)] and [5, dt(5, 23)] - these define the end points of the upper edge
the stuff in between is just X and Y coordinates of an arbitrary number of points along the curve [x, dt(x, 23)] - the more points, the smoother the polygon
Hope this helps
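A slightly more general sketch of the polygon approach: the cut-off, degrees of freedom and plot limit are stored in variables (t_obs, df and upper are names chosen here for illustration), and a dense grid is used along the curve so the shaded edge is smooth. The last two lines check that the shaded area matches pt().

```r
t_obs <- 1.78   # test statistic from the question
df <- 23        # degrees of freedom from the question
upper <- 5      # right-hand plot limit (arbitrary choice)

x_coord <- seq(-upper, upper, length.out = 400)
plot(x_coord, dt(x_coord, df), type = "l",
     xlab = expression(italic(t)), ylab = "Density", bty = "l")

# Bottom-left corner, points along the curve, bottom-right corner
x_tail <- seq(t_obs, upper, length.out = 200)
polygon(c(t_obs, x_tail, upper),
        c(0, dt(x_tail, df), 0),
        col = "red", border = NA)

# Tail probability: from pt(), and numerically from the density
p_exact <- pt(t_obs, df, lower.tail = FALSE)
p_num <- integrate(dt, lower = t_obs, upper = Inf, df = df)$value
```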

How does persp define ticks? In particular, how does persp decide how many ticks and which tick spacing to use?

Hi there,
I use persp for a 3D plot and I am trying to find out how persp defines the ticks when the parameter ticktype="detailed" is set.
I want to draw lines on the box around a surface at the tick positions. So far, I first draw the surface without any labels or axes and add all the lines and axes afterwards. To make clear what I have done, here is example code:
z <- matrix(rep(1:10, each=10), nrow=10, ncol=10)
x.axis <- 1:nrow(z)
y.axis <- 1:ncol(z)
max.y <- max(y.axis)
# Drawing the surface without the axes and no lines on the surface
pmat <- persp(z = z, x = x.axis, y = y.axis ,
lphi = 100, phi = 25, theta = -30,
axes=FALSE,
border = NA, # no lines on the surface
col="deepskyblue",
expand = 0.5,
shade = 0.65)
Now I add the lines on the surface in a different color, along with the axes, ticks and labels:
par(new=TRUE)
pmat <- persp(z = z, x = x.axis, y = y.axis ,
lphi = 100, phi = 25, theta = -30,
ticktype = "detailed",
expand = 0.5,
cex.lab=0.75,
col=NA,
border="grey80")
par(new=FALSE)
To get lines on the box around the surface I use the following:
for (z_high in c(2,4,6,8)) {
lines(trans3d(x.axis, max.y, z_high, pmat) , col="black", lty=3)
}
As you can see, I use a hand-defined vector c(2,4,6,8), which holds the z-values for the box lines at the back. If the input surface changes, I have to adapt this vector myself. Is there a way to get the ticks for all axes in the persp plot? Does anyone know how persp defines the ticks?
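One possible starting point, offered as an assumption rather than a description of persp's internals: base R's pretty() computes "round" breakpoints for a range, and the ticks drawn with ticktype="detailed" usually look like such values. The helper name box_ticks is made up here, and its output should be compared against the drawn axis before relying on it.

```r
# Hypothetical helper: "pretty" tick candidates that fall inside the data range
box_ticks <- function(v, n = 5) {
  r <- range(v, na.rm = TRUE)
  p <- pretty(r, n = n)
  p[p >= r[1] & p <= r[2]]   # drop candidates outside the range
}

z <- matrix(rep(1:10, each = 10), nrow = 10, ncol = 10)
x.axis <- 1:nrow(z)
y.axis <- 1:ncol(z)

pmat <- persp(z = z, x = x.axis, y = y.axis, phi = 25, theta = -30,
              ticktype = "detailed", expand = 0.5, col = "deepskyblue")

# Dotted lines on the back wall at each computed z value
tk <- box_ticks(z)
for (z_high in tk) {
  lines(trans3d(x.axis, max(y.axis), z_high, pmat), col = "black", lty = 3)
}
```

The same helper can be applied to x.axis and y.axis to get candidate positions for the other two axes.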

Plotting empirical cumulative probability function and its inverse

I have data cdecn:
set.seed(0)
cdecn <- sample(1:10,570,replace=TRUE)
a <- rnorm(cdecn,mean(cdecn),sd(cdecn))
I have created a plot which displays the cumulative probabilities.
aprob <- ecdf(a)
plot(aprob)
I am wondering how I can switch the x-axis and y-axis to get a new plot, i.e., the inverse of ECDF.
Also, for the new plot, is there a way to add a vertical line through the point where my curve intersects 0?
We can do the following. The comments in the code explain each step.
## reproducible example
set.seed(0)
cdecn <- sample(1:10,570,replace=TRUE)
a <- rnorm(cdecn,mean(cdecn),sd(cdecn)) ## random samples
a <- sort(a) ## sort samples in ascending order
e_cdf <- ecdf(a) ## ecdf function
e_cdf_val <- 1:length(a) / length(a) ## the same as: e_cdf_val <- e_cdf(a)
par(mfrow = c(1,2))
## ordinary ecdf plot
plot(a, e_cdf_val, type = "s", xlab = "ordered samples", ylab = "ECDF",
main = "ECDF")
## switch axes to get 'inverse' ECDF
plot(e_cdf_val, a, type = "s", xlab = "ECDF", ylab = "ordered sample",
main = "'inverse' ECDF")
## where the curve intersects 0
p <- e_cdf(0)
## [1] 0.01578947
## highlight the intersection point
points(p, 0, pch = 20, col = "red")
## add a dotted red vertical line through intersection
abline(v = p, lty = 3, col = "red")
## display value p to the right of the intersection point
## rounded to 4 digits
text(p, 0, pos = 4, labels = round(p, 4), col = "red")
## alternative: plot the sorted sample values against their cumulative probabilities
cdecn <- sample(1:10,570,replace=TRUE)
a <- rnorm(cdecn,mean(cdecn),sd(cdecn))
aprob <- ecdf(a)
plot(aprob)
# Switch the x and y axes
y <- knots(aprob) # sorted unique sample values
x <- seq_along(y) / length(y) # cumulative probabilities Fn(y)
plot(x = x, y = y, xlab = "Fn(y)", ylab = "y")
# Add a straight reference line from the first to the last point
segments(min(x), min(y), max(x), max(y), col = "red")
The "straight line at x==0" bit makes me suspect that you want a QQplot:
qqnorm(a)
qqline(a)
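The 'inverse' ECDF is the empirical quantile function, which base R exposes through quantile(). A minimal sketch, assuming the type = 1 definition (the inverse of the empirical distribution function) is the one wanted; the names probs, q and p0 are illustrative.

```r
set.seed(0)
cdecn <- sample(1:10, 570, replace = TRUE)
a <- rnorm(cdecn, mean(cdecn), sd(cdecn))

# Evaluate the empirical quantile function on a grid of probabilities
probs <- seq(0, 1, length.out = 501)
q <- quantile(a, probs = probs, type = 1, names = FALSE)

plot(probs, q, type = "s", xlab = "Cumulative probability",
     ylab = "Sample quantile", main = "Empirical quantile function")

# Probability at which the sample values cross 0
p0 <- ecdf(a)(0)
abline(v = p0, lty = 3, col = "red")
```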

How to plot a normal distribution by labeling specific parts of the x-axis?

I am using the following code to create a standard normal distribution in R:
x <- seq(-4, 4, length=200)
y <- dnorm(x, mean=0, sd=1)
plot(x, y, type="l", lwd=2)
I need the x-axis to be labeled at the mean and at points three standard deviations above and below the mean. How can I add these labels?
The easiest (but not general) way is to restrict the limits of the x axis. The +/- 1:3 sigma will be labeled as such, and the mean will be labeled as 0 - indicating 0 deviations from the mean.
plot(x,y, type = "l", lwd = 2, xlim = c(-3.5,3.5))
Another option is to use more specific labels:
plot(x,y, type = "l", lwd = 2, axes = FALSE, xlab = "", ylab = "")
axis(1, at = -3:3, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))
Using the code in this answer, you could skip creating x and just use curve() on the dnorm function:
curve(dnorm, -3.5, 3.5, lwd=2, axes = FALSE, xlab = "", ylab = "")
axis(1, at = -3:3, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))
But this doesn't use the given code anymore.
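For a normal distribution that is not standard, the tick positions can be computed from the mean and standard deviation. A short sketch with illustrative values (mu = 100 and s = 15 are not from the question):

```r
mu <- 100   # illustrative mean
s <- 15     # illustrative standard deviation

# Tick positions at the mean and 1 to 3 standard deviations either side
at <- mu + (-3:3) * s

x <- seq(mu - 4 * s, mu + 4 * s, length.out = 200)
plot(x, dnorm(x, mean = mu, sd = s), type = "l", lwd = 2,
     xaxt = "n", xlab = "", ylab = "Density")   # suppress the default x axis
axis(1, at = at, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))
```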
If you prefer to do it the hard way, without R's built-in density function, or you want to do this outside R, you can use the following formula.
x<-seq(-4,4,length=200)
s = 1
mu = 0
y <- (1/(s * sqrt(2*pi))) * exp(-((x-mu)^2)/(2*s^2))
plot(x,y, type="l", lwd=2, col = "blue", xlim = c(-3.5,3.5))
An extremely inefficient and unusual, but beautiful, solution based on Monte Carlo simulation is this:
simulate many draws (or samples) from a given distribution (say, the normal) using rnorm. The rnorm function takes arguments (A,B,C) and returns a vector of A samples from a normal distribution centered at B, with standard deviation C.
plot the estimated density of these draws using density.
Thus, to take a sample of size 50,000 from a standard normal (i.e., a normal with mean 0 and standard deviation 1) and plot its density, we do the following:
x = rnorm(50000,0,1)
plot(density(x))
As the number of draws goes to infinity, the estimated density converges to the normal density. To illustrate this, see the image below, which shows, from left to right and top to bottom, 5,000, 50,000, 500,000, and 5 million samples.
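To see how close the simulated density is, the exact curve can be overlaid on the estimate. A small sketch; the seed and the names draws, d and max_gap are arbitrary choices for illustration:

```r
set.seed(42)
draws <- rnorm(50000, 0, 1)

# Kernel density estimate of the simulated draws
d <- density(draws)
plot(d, main = "Simulated vs. exact normal density")

# Exact standard normal density for comparison
curve(dnorm(x), add = TRUE, col = "red", lty = 2)

# Largest vertical gap between the estimate and the exact density
max_gap <- max(abs(d$y - dnorm(d$x)))
```

Repeating this with a larger sample size should shrink max_gap.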
In the general case, for example Normal(2, 1):
f <- function(x) dnorm(x, 2, 1)
plot(f, -1, 5)
This is very general: f can be defined freely, with any given parameters, for example:
f <- function(x) dbeta(x, 0.1, 0.1)
plot(f, 0, 1)
I particularly love lattice for this. It makes it easy to add graphical information such as specific areas under a curve, which you usually need when dealing with probability problems such as finding P(a < X < b).
Please have a look:
library(lattice)
e4a <- seq(-4, 4, length = 10000) # Data to set up our normal
e4b <- dnorm(e4a, 0, 1)
xyplot(e4b ~ e4a, # Lattice xyplot
type = "l",
main = "Plot 2",
panel = function(x,y, ...){
panel.xyplot(x,y, ...)
panel.abline( v = c(0, 1, 1.5), lty = 2) #set z and lines
xx <- c(1, x[x>=1 & x<=1.5], 1.5) #Color area
yy <- c(0, y[x>=1 & x<=1.5], 0)
panel.polygon(xx,yy, ..., col='red')
})
In this example I make the area between z = 1 and z = 1.5 stand out. You can easily change these parameters to suit your problem.
Axis labels are automatic.
This is how to write it as a function:
normalCriticalTest <- function(mu, s) {
x <- seq(-4, 4, length=200) # x extends from -4 to 4
# y follows the formula of the normal density
y <- (1/(s * sqrt(2*pi))) * exp(-((x-mu)^2)/(2*s^2))
plot(x,y, type="l", lwd=2, xlim = c(-3.5,3.5)) # draw the curve
# vertical lines at -1.96 and 1.96; for mu = 0, s = 1 each tail holds 2.5% of the area
abline(v = c(-1.96, 1.96), col="red")
}
normalCriticalTest(0, 1) # draw a normal distribution with vertical lines.
Final result:
