I want to see how three variables x, y, and z respond to a function f using R.
I've searched for R solutions (e.g. rgl using 4d plots) but none seem to allow the input of a function as the fourth variable while allowing manipulation of x, y, and z across their full range of values.
# First I create three variables that each have a domain 0 to 4
x
y
z
# Then I create a function from those three variables
f <- sqrt(x^2 + y^2 + z^2)
EDIT: I originally stated that I wanted x, y, and z to be seq(0, 4, 0.01) but in fact I only want them to range from 0 to 4, and do so independently of other variables. In other words, I want to plot the function across a range of values letting x move independently of y and z and so forth, rather than plotting a 3-D line. The result should be a 3-D surface.
I want to:
a) see how the function f responds to all possible combinations of x, y, and z across a range of x, y, and z values 0 to 4, and
b) find what maxima/minima exist especially when holding one variable constant.
This is more of a mathematical question. Unfortunately, neither our computer screens nor our brains are really made for 4D, so what you ask is not possible as stated. You want to show a dense set of data (a cube between 0 and 4), and we cannot display what is "inside" the cube.
To come back to R, you can always display a slice of it, for example fixing z and plot sqrt(x^2 + y^2 + z^2) for x and y. Here you have two examples:
# Points where the function should be evaluated
x <- seq(0, 4, 0.01)
y <- seq(0, 4, 0.01)
z <- seq(0, 4, 0.01)
# Compute the distance from origin
distance <- function(x,y,z) {
sqrt(x^2 + y^2 + z^2)
}
# Matrix to store the results
slice=matrix(0, nrow=length(x),ncol=length(y))
# Fill the matrix with a slice at z=3
i=1
for (y_val in y)
{
slice[,i]=distance(x,y_val,3)
i=i+1
}
# Plot with the plot3D library
library(plot3D)
persp3D(z = slice, theta = 100,phi=50)
# Plot with the raster library
library(raster)
plot(raster(slice,xmn=min(x), xmx=max(x), ymn=min(y), ymx=max(y)))
Changing the fixed z value does not really change the shape of the surface; it only becomes "flatter" as z grows. Because the function is symmetric in x, y and z, the same plots are produced if you hold x or y constant instead.
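As a side note, the loop that fills the slice can be written without explicit indexing. This is a minimal sketch, assuming the same grid and the same distance function as above; the name slice_outer is only illustrative. outer() evaluates the function on every (x, y) pair, so the result should match the loop-filled matrix.

```r
# Distance from the origin, as defined in the answer
distance <- function(x, y, z) sqrt(x^2 + y^2 + z^2)

# Same evaluation points as in the answer
x <- seq(0, 4, 0.01)
y <- seq(0, 4, 0.01)

# Slice at z = 3: rows follow x, columns follow y
slice_outer <- outer(x, y, function(x, y) distance(x, y, 3))

dim(slice_outer)    # 401 x 401
slice_outer[1, 1]   # value at x = 0, y = 0, i.e. sqrt(3^2) = 3
```

Changing the constant 3 gives a slice at another z value.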
For your last question about the maximum, you can re-use the slice matrix and do:
max_ind=which(slice==max(slice),arr.ind = TRUE)
x[max_ind[,1]]
y[max_ind[,2]]
(see Get the row and column name of the minimum element of a matrix)
But again, the math shows directly that the maximum is always reached by maximizing x, y and z, since the function simply measures the distance from the origin.
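For functions where the extrema are less obvious, a brute-force check on a grid may help. The sketch below assumes a coarse step of 0.5 and a fixed z of 3 (both arbitrary choices), evaluates the function on every (x, y) combination and reads off where the largest and smallest values occur.

```r
distance <- function(x, y, z) sqrt(x^2 + y^2 + z^2)

# All (x, y) combinations on a coarse grid, holding z fixed (assumed value)
z_fixed <- 3
grid <- expand.grid(x = seq(0, 4, 0.5), y = seq(0, 4, 0.5))
grid$f <- distance(grid$x, grid$y, z_fixed)

grid[which.max(grid$f), ]   # maximum at x = 4, y = 4
grid[which.min(grid$f), ]   # minimum at x = 0, y = 0
```

A grid search only finds extrema at grid points, so a finer step (or optim()) would be needed for functions whose extrema fall between them.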
I would like to determine the probability that a randomly distributed object of Type A occupies or touches (overlaps) the same space as any randomly distributed object of Type B when populated inside an elliptical cylinder. I would then like to loop this simulation many times to generate a more reliable probability value.
I am able to draw the elliptical cylinder using the shape package:
library(shape)
emptyplot(c(-5, 5), c(-15, 15), main = "filled elliptic cylinder")
filledcylinder(rx = 9, ry = 5, len= 2, angle = 00, col = "white",
lcol = "black", lcolint = "grey")
I do not know how to add points (i.e. objects A and B) to this graph. However, I suspect graphical expression is not the way to go with this task (though I find visualising helpful). I suspect a better approach will be to create a function to describe the elliptical cylinder, similar to the cone in the following example, and run the simulation without graphical output:
# Create a function to describe a cone
cone <- function(x, y){
sqrt(x ^ 2 + y ^ 2)
}
# prepare variables.
x <- y <- seq(-1, 1, length = 30)
z <- outer(x, y, cone)
# plot as a 3D surface for visual reference (even though I actually want a volume)
persp(x, y, z,
main="Perspective Plot of a Cone",
zlab = "Height",
theta = 30, phi = 15,
col = "orange", shade = 0.4)
Sadly I do not know how to do this for my elliptical cylinder. I am aware of the parameters for describing an elliptical cylinder from the following source:
https://mathworld.wolfram.com/EllipticCylinder.html
Unfortunately, I do not understand much of it. I hope the dimensions given in my filledcylinder can act as a guide. Ultimately the dimension values do not matter, what matters is the code structure into which values can be entered.
As for the objects:
Let there be 50 Type A objects and 50 Type B objects of size x=0.4, y=0.4, z=0.4 (same units as in my graphical elliptical cylinder example).
All objects are to be distributed at random within the volume of the elliptical cylinder, with the exception that objects of Type A cannot overlap with another object of Type A, and objects of Type B cannot overlap with other objects of Type B. Type A objects may overlap with Type B objects.
I would like to output the number of Type A objects that overlap with any Type B object in the given volume, this number as a percentage of total Type A objects, and as a percentage of total all objects for each run of the simulation.
I do not know how to even start to do this.
If you can help, I'm afraid statistics, geometry and non-basic R expressions will need to be explained as if to a (not particularly bright) child.
Thank you very very much for your time!
An implementation with heavily commented code for explanations. This assumes the A- and B-type objects must be entirely within the elliptical cylinder.
library(data.table)
rObj <- function(rx, ry, h, n, dims, eps = 2) {
# Function to create a random sample (by rejection) of non-overlapping
# rectangular prism objects inside an elliptical cylinder whose ellipse is
# centered at x = 0, y = 0 and whose height ranges from -dims[3]/2 to h -
# dims[3]/2. The objects have dimensions (x, y, z) = dims, and all edges are
# parallel or orthogonal to each of the x, y, or z axes.
# INPUTS:
# rx: length of the ellipse
# ry: width of the ellipse
# h: height of the elliptical cylinder
# n: number of non-overlapping objects to return
# dims: dimensions of the rectangular prism objects (vector of length 3)
# eps: oversampling factor
# OUTPUT: a data.table with 3 columns and n rows. Each row gives the
# coordinates of the centroid of a sampled object
dt <- data.table()
while(nrow(dt) < n) {
# increase oversampling if it is not the first pass
if (nrow(dt)) eps <- eps*2
rho <- sqrt(runif(eps*n))
phi <- runif(eps*n, 0, 2*pi)
dt <- data.table(
# sample object centroids
# see https://stackoverflow.com/questions/5529148/algorithm-calculate-pseudo-random-point-inside-an-ellipse
# First, uniformly sample on an ellipse centered on x = 0, y = 0,
# with xlength = rx - dims[1] and ylength = ry - dims[2]
# (any object with a centroid outside of this ellipse will stick out of
# the elliptical cylinder, although some with a centroid within the
# smaller ellipse will still stick out of the elliptical cylinder).
x = (rx - dims[1])/2*rho*cos(phi),
y = (ry - dims[2])/2*rho*sin(phi),
# uniformly sample centroid heights
z = runif(eps*n, 0, h - dims[3])
)[
# remove objects that stick out of bounds
# The ellipse satisfies (x/(rx/2))^2 + (y/(ry/2))^2 = 1, which is the
# same as (x/rx)^2 + (y/ry)^2 = 0.25. Taking advantage of symmetry, add
# half of the x and y dimensions of the objects to the absolute value of
# x and y (the object corner furthest from the foci of the ellipse) and
# check if the result satisfies the standard equation.
((abs(x) + dims[1]/2)/rx)^2 + ((abs(y) + dims[2]/2)/ry)^2 < 0.25
][
# remove objects that overlap a previously placed object
# Since each rectangular prism object is oriented with the x, y, z axes,
# two objects overlap if they are closer than their lengths in each
# dimension.
tabulate(
sequence((.N - 1L):1, 2:.N)[ # row numbers (always keep the first row)
(dist(x) < dims[1]) & (dist(y) < dims[2]) & (dist(z) < dims[3])
],
.N
) == 0L
]
}
dt[1:n] # keep the first n objects
}
# function to get pairwise distances between two vectors
dist2 <- function(x, y) abs(outer(x, y, "-"))
fsim <- function(rx, ry, h, nA, nB, dimA, dimB, nreps, eps = 2) {
# function to simulate placement of A and B rectangular prism objects inside
# an elliptical cylinder and count the number of A-type objects that
# intersect at least one B-type object. All object edges are parallel or
# orthogonal to each of the x, y, or z axes.
# INPUTS:
# rx: length of the ellipse
# ry: width of the ellipse
# h: height of the elliptical cylinder
# nA: number of non-overlapping A-type objects to return
# nB: number of non-overlapping B-type objects to return
# dimA, dimB: dimensions of the A- and B-type rectangular prism objects
# (vectors of length 3)
# nreps: the number of replications to simulate
# eps: oversampling factor when randomly sampling non-overlapping objects
# by rejection
# OUTPUT: vector of length "nreps" giving the number of A-type objects that
# intersect at least one B-type object for each replication
dims <- rowMeans(cbind(dimA, dimB)) # average dimensions of the A and B objects
out <- integer(nreps) # initialize the output vector
# repeat the simulation "nreps" times
for (i in 1:nreps) {
# get the coordinates of the A- and B-type objects' centroids
A <- rObj(rx, ry, h, nA, dimA, eps)
B <- rObj(rx, ry, h, nB, dimB, eps)
# count the number of A-type objects that intersect at least one B-type
# object
out[i] <- sum(rowSums((dist2(A$x, B$x) < dims[1])*(dist2(A$y, B$y) < dims[2])*(dist2(A$z, B$z) < dims[3])) != 0L)
}
out
}
Time 10K simulation replications:
system.time(overlaps <- fsim(9, 5, 2, 50L, 50L, rep(0.4, 3), rep(0.4, 3), 1e4L))
#> user system elapsed
#> 27.19 0.25 27.67
mean(overlaps)
#> [1] 18.7408
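The overlap rule used in the code above can be illustrated on its own: two axis-aligned boxes overlap only when, along every axis, their centroids are closer than the mean of their edge lengths. The sketch below uses a hypothetical helper, boxes_overlap, and a made-up overlaps vector standing in for the output of fsim, to show how the percentages asked for in the question could be derived.

```r
# TRUE if two axis-aligned boxes with centroids c1, c2 and edge lengths
# dim1, dim2 (vectors of length 3) overlap (hypothetical helper)
boxes_overlap <- function(c1, c2, dim1, dim2) {
  all(abs(c1 - c2) < (dim1 + dim2) / 2)
}

size <- rep(0.4, 3)
boxes_overlap(c(0, 0, 0), c(0.3, 0.3, 0.3), size, size)  # TRUE: close on all axes
boxes_overlap(c(0, 0, 0), c(0.3, 0.5, 0.3), size, size)  # FALSE: too far apart in y

# Made-up per-replication counts standing in for the output of fsim
overlaps <- c(18L, 20L, 17L)
nA <- 50
nB <- 50
pct_of_A   <- 100 * overlaps / nA         # 36 40 34
pct_of_all <- 100 * overlaps / (nA + nB)  # 18 20 17
```

Applied to the mean reported above, 18.7408 overlapping A-type objects correspond to about 37.5% of the A-type objects and about 18.7% of all objects.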
One approach to get an approximate answer to this problem is to discretize things. Set up a volume as a 3 dimensional array of zeros, then randomly generate the parameters of your shapes one at a time.
For each generated shape, find all the elements of the array that would be inside the shape. If any locations would be outside the cylinder or overlap a shape of the same type, try again. Once you have a legal shape, mark those array entries (e.g. 1 for type A, 2 for type B). Do all type A first, then all type B, and keep count of the times when shape B occupies a space that was previously marked for shape A.
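A minimal sketch of this discretized approach follows. It assumes ellipse semi-axes of 4.5 and 2.5, a height of 2, a grid spacing of 0.1 and cubes of edge 0.4; the function name place_cubes and all variable names are illustrative. Instead of marking array entries as 1 or 2, it keeps one id vector per type, which makes counting the overlapping A objects straightforward.

```r
set.seed(42)  # reproducible illustration

# Assumed geometry: semi-axes a and b, cylinder height h, grid spacing step
a <- 4.5; b <- 2.5; h <- 2; step <- 0.1
grid <- expand.grid(x = seq(-a, a, by = step),
                    y = seq(-b, b, by = step),
                    z = seq(0, h, by = step))
inside <- (grid$x / a)^2 + (grid$y / b)^2 <= 1  # voxels inside the cylinder

# Place n axis-aligned cubes of edge `size`. Returns, for every voxel,
# the id of the cube occupying it (0 = empty).
place_cubes <- function(n, size) {
  id <- integer(nrow(grid))
  k <- 0L
  while (k < n) {
    # Random centre, kept far enough from the bounding box faces
    ctr <- c(runif(1, -a + size / 2, a - size / 2),
             runif(1, -b + size / 2, b - size / 2),
             runif(1, size / 2, h - size / 2))
    cells <- abs(grid$x - ctr[1]) <= size / 2 &
             abs(grid$y - ctr[2]) <= size / 2 &
             abs(grid$z - ctr[3]) <= size / 2
    # Try again if the cube leaves the cylinder or hits a same-type cube
    if (any(cells & !inside) || any(id[cells] > 0L)) next
    k <- k + 1L
    id[cells] <- k
  }
  id
}

idA <- place_cubes(50, 0.4)  # all type A first
idB <- place_cubes(50, 0.4)  # then all type B

# Type A cubes that share at least one voxel with any type B cube
hitA <- setdiff(unique(idA[idB > 0L]), 0L)
n_overlap <- length(hitA)
c(count = n_overlap,
  pct_of_A = 100 * n_overlap / 50,
  pct_of_all = 100 * n_overlap / 100)
```

The answer is only approximate: cubes that touch between grid points can be missed, and a smaller step trades memory and time for accuracy. Wrapping the last block in a loop would give the repeated simulation runs.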
I am trying to create a 3D mesh of a specific building from points that I extracted from a lidar point cloud. I then created a matrix from the x, y and z values to feed into the as.mesh3d function from the rgl package, and since it's from a lidar survey, I have 27,000+ points for this one building. I run into an error when I try to create the mesh. I've copied in a sample of 20 points from the point cloud:
X <- c(1566328,1566328,1566328,1566328,1566328,1566327,1566327,1566327,
1566327,1566327,1566327,1566327,1566327,1566327,1566327,1566327,
1566326,1566326,1566326,1566326)
Y <- c(5180937,5180937,5180936,5180935,5180936,5180937,5180937,5180936,
5180936,5180935,5180935,5180935,5180936,5180936,5180937,5180938,
5180938,5180937,5180936,5180936)
Z <- c(19.92300028,19.98300046,19.93700046,19.88099962,19.93500046,19.99500046,
20.00400046,20.00600046,19.97199962,19.92499962,19.95400046,
19.99099991,20.01199991,19.97600020,19.95800008,19.93200008,
19.95300008,19.94800008,19.94300020,19.98399991)
#created a matrix
xyz <- matrix(c(X, Y, Z), byrow = TRUE, ncol = 3)
The problem arises when I try to create the mesh using as.mesh3d():
mesh <- as.mesh3d(xyz, y = NULL, Z = NULL, type = "triangle", col = "red")
This is what I get: Error in as.mesh3d.default(xyz, y = NULL, Z = NULL, type = "triangle", : Wrong number of vertices
The same error happens for the original dataset of 27000+ points despite all being of the same length.
I'm really not advanced in R and was hoping I could get some advice or solutions on how to get past this.
Thank you
The as.mesh3d function assumes the points are already organized as triangles. Since you're giving it 20 points, that's not possible: it needs a multiple of 3 points.
There's a problem with your calculation of xyz: you say byrow = TRUE, but you're specifying values by column. Using
xyz <- cbind(X, Y, Z)
would work.
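A tiny example with made-up values shows the difference. With byrow = TRUE, the concatenated vector is cut into rows of three consecutive values, so coordinates from different points get mixed.

```r
# Two made-up points: (1, 10, 100) and (2, 20, 200)
X <- c(1, 2)
Y <- c(10, 20)
Z <- c(100, 200)

wrong <- matrix(c(X, Y, Z), byrow = TRUE, ncol = 3)
wrong[1, ]   # 1 2 10: two X values and one Y value, not a point

right <- cbind(X, Y, Z)
right[1, ]   # 1 10 100: the first point
```

Dropping byrow = TRUE from the matrix() call fills by column and gives the same values as cbind().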
If I plot all of your points using text3d(xyz, text=1:20), it looks as though there are a lot of repeats.
There are several ways to triangulate those points, but they depend on assumptions about the surface. For example, if you know there is only one Z value for each (X, Y) pair, you could use as.mesh3d.deldir (see the help page) to triangulate. Here's the code and output for your sample:
dxyz <- deldir::deldir(X - mean(X), Y - mean(Y), z = Z)
# Warning message:
# In deldir::deldir(X - mean(X), Y - mean(Y), z = Z) :
# There were different z "weights" corresponding to
# duplicated points.
persp3d(dxyz, col = "red")
I had to subtract the means from X and Y because rounding errors made the plot look very bad otherwise: rgl does a lot of things in single precision, which only gives about 7 or 8 significant digits of accuracy.
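The warning about duplicated points comes from several Z values sharing one (X, Y) location. One possible way to handle this before triangulating, which is an assumption and not part of the approach above, is to average Z over each unique (X, Y) pair. The sketch uses a small made-up data frame.

```r
# Made-up sample with one repeated (X, Y) location
pts <- data.frame(X = c(0, 0, 1, 1),
                  Y = c(0, 0, 0, 1),
                  Z = c(19.9, 20.1, 20.0, 19.8))

# Average Z over each unique (X, Y) pair
uniq <- aggregate(Z ~ X + Y, data = pts, FUN = mean)
nrow(uniq)   # 3 unique locations remain
```

Whether averaging is appropriate depends on the data: for a building with vertical walls, repeated (X, Y) locations may be genuine and a different triangulation method would be needed.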
I would like some help answering the following question:
Dr Barchan makes 600 independent recordings of Eric’s coordinates (X, Y, Z), selects the cases where X ∈ (0.45, 0.55), and draws a histogram of the Y values for these cases.
By construction, these values of Y follow the conditional distribution of Y given X ∈ (0.45,0.55). Use your function sample3d to mimic this process and draw the resulting histogram. How many samples of Y are displayed in this histogram?
We can argue that the conditional distribution of Y given X ∈ (0.45, 0.55) approximates the conditional distribution of Y given X = 0.5 — and this approximation is improved if we make the interval of X values smaller.
Repeat the above simulations selecting cases where X ∈ (0.5 − δ, 0.5 + δ), using a suitably chosen δ and a large enough sample size to give a reliable picture of the conditional distribution of Y given X = 0.5.
I know for the first paragraph we want to have the values generated for x,y,z we got in sample3d(600) and then restrict the x's to being in the range 0.45-0.55, is there a way to code (maybe an if function) that would allow me to keep values of x in this range but discard all the x's from the 600 generated not in the range? Also does anyone have any hints for the conditional probability bit in the third paragraph.
sample3d = function(n)
{
df = data.frame()
while(n>0)
{
X = runif(1,-1,1)
Y = runif(1,-1,1)
Z = runif(1,-1,1)
a = X^2 + Y^2 + Z^2
if( a < 1 )
{
b = (X^2+Y^2+Z^2)^(0.5)
vector = data.frame(X = X/b, Y = Y/b, Z = Z/b)
df = rbind(vector,df)
n = n- 1
}
}
df
}
sample3d(n)
Any help would be appreciated, thank you.
Your function produces a data frame. The part of the question that asks you to find those values in a data frame that are in a given range can be solved by filtering the data frame. Notice that you're looking for an open interval (the endpoints aren't included).
df <- sample3d(600)
df[df$X > 0.45 & df$X < 0.55,]
Pay attention to the comma.
You can use a dplyr solution as well, but don't use the helper between(), since it tests a closed interval (endpoints included), whereas you need an open one.
library(dplyr)
filter(df, X > 0.45 & X < 0.55)
For the remainder of your assignment, see what you can figure out, and if you run into a specific problem, Stack Overflow can help you.
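As a starting point for the narrower interval, the filtering step generalises to any half-width δ. The sketch below is only an outline under assumptions: sample3d_fast is a hypothetical vectorised stand-in for sample3d (same rejection-and-normalise idea), and the values of delta and the sample size are arbitrary choices to experiment with.

```r
set.seed(1)

# Hypothetical vectorised stand-in for sample3d: draw points in the cube,
# keep those inside the unit ball, and scale them onto the unit sphere
sample3d_fast <- function(n) {
  out <- matrix(numeric(0), ncol = 3)
  while (nrow(out) < n) {
    p <- matrix(runif(3 * 2 * n, -1, 1), ncol = 3)
    r <- sqrt(rowSums(p^2))
    keep <- r < 1 & r > 0
    out <- rbind(out, p[keep, , drop = FALSE] / r[keep])
  }
  out <- out[seq_len(n), , drop = FALSE]
  data.frame(X = out[, 1], Y = out[, 2], Z = out[, 3])
}

delta <- 0.01               # assumed half-width of the X interval
df <- sample3d_fast(1e5)    # assumed sample size

# Keep the rows with X in the open interval (0.5 - delta, 0.5 + delta)
sel <- df[abs(df$X - 0.5) < delta, ]
nrow(sel)                   # number of Y values entering the histogram

# Histogram of the selected Y values (set plot = TRUE to draw it)
h <- hist(sel$Y, breaks = 30, plot = FALSE)
```

Shrinking delta improves the approximation to X = 0.5 but leaves fewer rows, so the sample size has to grow to keep the histogram stable.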
I'm trying to make a 3D scatterplot with boundaries or zones based on combinations of 3 variables that return certain values. The variables each range from 0 to 1, and combine to make an index that ranges from -1 to 1 as follows:
f(x,y,z) = (x*y)-z
I'd like to create a visual representation that will highlight all combinations of variables that return a certain index value. As an example, I can easily show those variables where index > 0 using scatterplot3d (rgl would also work):
# Create imaginary dataset of 50 observations for each variable
x<-runif(50,0,1)
y<-runif(50,0,1)
z<-runif(50,0,1)
# Create subset where f(x,y,z) > 0
x1<-y1<-z1<-1
for (i in 1:length(x)){ if ((x[i]*y[i])-z[i] > 0) {
x1<-rbind(x1, x[i])
y1<-rbind(y1, y[i])
z1<-rbind(z1, z[i])}
}
library(scatterplot3d)
s3d<-scatterplot3d(x,y,z) # Plot entire dataset
s3d$points3d(x1,y1,z1,pch=19, col="red") # Highlight subset where f(x,y,z) > 0
This gives me the following graph:
It seems fairly intuitive that there should be an easy way to plot either the surface (extending from top/right/back to bottom/left/front) separating the subset from the rest of the data, or else a volume/3D area within which these plots lie. E.g. my first instinct was to use something like surface3d, persp3d or planes3d. However, all attempts so far have only yielded error messages. Most solutions seem to use some form of z<-lm(y~x) but I obviously need something like q<-func((x*y)-z) for all values of x, y and z that yield q > 0.
I know I could calculate extreme points and use them as vertices for a 3D polygon, but that seems too "manual". It feels like I'm overlooking something fairly simple and obvious. I've looked at many similar questions on Stack but can't seem to find one that fits my particular problem. If I've missed any and this question has been answered already, please do point me in the right direction!
Here is a suggestion for an interactive 3D plot that is based on an example from the "R Graphics Cookbook" by Winston Chang.
set.seed(4321)
library(rgl)
interleave <- function(v1,v2) as.vector(rbind(v1,v2))
x <- runif(50)
y <- runif(50)
z <- runif(50)
plot3d(x, y, z, type="s", size=0.6, col=(2+(x*y<z)))
x0 <- y0 <- seq(0, 1, 0.1)
surface3d(x0, y0, outer(x0, y0), alpha=0.4) #plot the surface f(x,y)=x*y
x1 <- x[x * y > z] #select subset that is below the separating surface
y1 <- y[x * y > z]
z1 <- z[x * y > z]
segments3d(interleave(x1, x1), #highlight the distance of the points below the surface
interleave(y1, y1),
interleave(x1 * y1, z1), col="red", alpha=0.4)
If you don't like the red lines and only want the surface and the colored points, this will be sufficient:
plot3d(x,y,z,type="s",size=0.6,col=(2+(x*y<z)))
x0 <- y0 <- seq(0,1,0.1)
surface3d(x0,y0,outer(x0,y0),alpha=0.4)
Does this representation provide the information that you wanted to highlight?
The first thought was to see if the existing functions within scatterplot3d could handle the problem but I think not:
my.lm <- lm(z ~ I(x) * I(y)+0)
s3d$plane3d(my.lm, lty.box = "solid", col="red")
The scatterplot3d package doesn't really have a surface3d function, so you will need to choose a package that provides that capability, say 'rgl', 'lattice', or 'plot3D'. Any of them should provide the needed facilities.
Let's say I have a unit vector a = Vector(0,1,0) and I want to add a random spread of something between x = Vector(-0.2,0,-0.2) and y = Vector(0.2,0,0.2), how would I go about doing that?
If I were to simply generate a random vector between x and y, I'd get a value somewhere in the bounds of a square:
What I'd like instead is a value within the circle made up by x and y:
This seems like a simple problem but I can't figure out the solution right now. Any help would be appreciated.
(I didn't ask this on mathoverflow since this isn't really a 'research level mathematics question')
If I read your question correctly, you want a vector in a random direction that's within a particular length (the radius of your circle).
The formula for a circle is: x^2 + y^2 = r^2
So, if you have a maximum radius, r, that constrains the vector length, perhaps proceed something like this:
Choose a random value for x, that lies between -r and +r
Calculate a limit for randomising y, based on your chosen x, so ylim = sqrt(r^2 - x^2)
Finally, choose a random value of y between -ylim and +ylim
That way, you get a random direction in x and a random direction in y, but the vector length will remain within 0 to r and so will be constrained within a circle of that radius.
In your example, it seems that r should be sqrt(0.2^2 + 0.2^2), which is approximately 0.28284.
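The steps above can be sketched in R as follows. The function name random_in_circle is illustrative, and the radius is the one worked out for the example.

```r
set.seed(1)
r <- sqrt(0.2^2 + 0.2^2)   # radius from the example, about 0.28284

# One random offset inside a circle of radius r (hypothetical helper)
random_in_circle <- function(r) {
  x <- runif(1, -r, r)         # random x between -r and +r
  ylim <- sqrt(r^2 - x^2)      # largest |y| allowed for this x
  y <- runif(1, -ylim, ylim)   # random y between -ylim and +ylim
  c(x = x, y = y)
}

pts <- t(replicate(1000, random_in_circle(r)))

# Every offset stays within the circle (small tolerance for rounding)
all(sqrt(rowSums(pts^2)) <= r + 1e-12)
```

This keeps every point inside the circle, but the points are not spread evenly over its area: they are denser near the left and right edges, where the allowed y range is short. If an even spread matters, one alternative is to sample from the square and discard points outside the circle.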
UPDATE
As a 3D vector has length (or magnitude) sqrt(x^2 + y^2 + z^2), you could extend the technique to 3D, although I would probably favour a different approach (which would also work for 2D).
Choose a random direction by choosing any x, y and z
Calculate the magnitude m = sqrt(x^2 + y^2 + z^2)
Normalise the direction vector (by dividing each element by its magnitude), so x = x/m, y = y/m, z=z/m
Now choose a random length, L between 0 and r
Scale the direction vector by the random length. So x = x * L, y = y * L, z = z * L
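A short R sketch of these steps, assuming the initial x, y and z are drawn uniformly between -1 and 1 (the steps above leave that choice open) and using an illustrative function name:

```r
set.seed(1)

# One random vector of length at most r (hypothetical helper)
random_in_sphere <- function(r) {
  repeat {
    d <- runif(3, -1, 1)    # any x, y and z (assumed range)
    m <- sqrt(sum(d^2))     # magnitude of the direction vector
    if (m > 0) break        # guard against a zero-length draw
  }
  d <- d / m                # normalise to unit length
  L <- runif(1, 0, r)       # random length between 0 and r
  d * L                     # scale the direction by the length
}

v <- random_in_sphere(0.28284)
sqrt(sum(v^2))              # length of the result, at most r
```

The result could then be added to the original unit vector to apply the spread.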