Find point width as percent of range - r

I have a function that estimates the number of visible points in a geom_point layer.
It divides the plot area into a grid, snaps each point to its nearest grid cell, and then counts the number of distinct occupied cells.
It currently looks like this:
.getGGVisiblePoints = function(cleanData, xbuild, layeri) {
#Get grid for points to be put into
roundedPoints = list()
for (axis in c('x', 'y')) {
data = cleanData[axis]
rangeName = paste0("panel_scales_", axis)
limits = xbuild[["layout"]][[rangeName]][[layeri]][["range"]][["range"]]
range = limits[2] - limits[1]
avgPointSize = mean(cleanData$size)
#This will get the width of each cell in the grid
multi = (1/100) * (avgPointSize/1.5) * range
roundedPoints[axis] = multi * round(data/multi)
}
#Return the fraction of points that are individually visible
numberOfVisiblePoints = data.frame(roundedPoints) |> distinct() |> nrow()
numberOfPoints = length(roundedPoints$x)
(numberOfVisiblePoints / numberOfPoints)
}
Where I am stuck is calculating the width of the grid cells.
I currently have width = (1/100) * (avgPointSize/1.5) * range
This formula does approximate the point width, but it is quite far off: it overestimates for small points (size < 1) and underestimates for larger points (size >= 3).
Is there a way in ggplot to find the width of a point from its size and the axis range, either as a percentage of the range or in axis units?

Related

Scaling a bezier graph to starting and ending points

I have a graph like this:
I want to be able to convert the position of P1 (the ball you can drag around) so that it scales with different starting and ending points on my screen.
Essentially, I want the curve's control point to stay in the same relative position no matter where the starting and ending positions of the curve are.
So if I had different points on my screen, the curve would look the same as in the graph.
This is what I tried, but it didn't work:
function bezier.scale(startingPosition : Vector2, endingPosition : Vector2)
local screenSize = workspace.CurrentCamera.ViewportSize
local lengthX = (endingPosition.X - startingPosition.X)
local lengthY = (endingPosition.Y - startingPosition.Y)
local screenRelativeX = (screenSize.X - startingPosition.X) + lengthX
local screenRelativeY = (screenSize.Y - startingPosition.Y) + lengthY
local scaleX = (screenRelativeX / graphBackground.Size.X.Offset)
local scaleY = (screenRelativeY / graphBackground.Size.Y.Offset)
local x = (bezierPoint.Position.X.Offset * scaleX)
local y = (bezierPoint.Position.Y.Offset * scaleY)
return Vector2.new(x, y)
end
So your input is four 2D points: the first two, p0 and p1, are constant and are your Bezier start and end points, and the next two, q0 and q1, are the start and end points for your animation. You want an affine transform that maps the first pair onto the second. For that you need a rotation, a scale and an offset.
Scale
This is easy: it is just the ratio between the line lengths, so:
scale = |q1-q0| / |p1-p0|
Rotation
You can exploit the dot product:
ang = acos( dot(p1-p0,q1-q0)/(|p1-p0|*|q1-q0|) )
The sign can be determined with the 3D cross product (using z=0), for example:
if (cross(p1-p0,q1-q0).z >=0 ) ang=-ang;
However, note that whether the test is >=0 or <=0 depends on your coordinate system and rotation formula, so it might be reversed in your case.
Offset
Apply steps #1 and #2 (scale and rotation) to p0 and call the result P0. Since p0 has to land on q0, the offset is:
offset = q0-P0
Putting it all together
Transforming a point p=(x,y) then looks like this:
// #1 apply scale
x' = x*scale
y' = y*scale
// #2 apply rotation
x = x'*cos(ang) + y'*sin(ang)
y =-x'*sin(ang) + y'*cos(ang)
// #3 apply offset
x = x + offset.x
y = y + offset.y
Do not forget to use the temp variables x',y' for the rotation! You could also construct a 3x3 transform matrix for this instead.
For more info about transform matrices and vector math (dot and cross product included) see:
Understanding 4x4 homogenous transform matrices
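The scale, rotation and offset steps can be sketched in Python as follows. This is only an illustration under some assumptions: the function name make_similarity is made up, points are plain (x, y) tuples, and the signed angle is computed with atan2 of the cross and dot products (a swap for the acos plus sign check above, which avoids the coordinate-system-dependent comparison). The rotation uses the counter-clockwise convention, and the offset is chosen so that p0 lands on q0.

```python
import math

def make_similarity(p0, p1, q0, q1):
    """Return a function that maps p0 -> q0 and p1 -> q1 using
    uniform scale, rotation and offset (hypothetical helper)."""
    px, py = p1[0] - p0[0], p1[1] - p0[1]
    qx, qy = q1[0] - q0[0], q1[1] - q0[1]

    # Scale: ratio of the two segment lengths
    scale = math.hypot(qx, qy) / math.hypot(px, py)

    # Signed angle from (p1 - p0) to (q1 - q0): atan2(cross.z, dot)
    ang = math.atan2(px * qy - py * qx, px * qx + py * qy)
    c, s = math.cos(ang), math.sin(ang)

    def scale_and_rotate(x, y):
        xs, ys = x * scale, y * scale          # temp variables for the rotation
        return xs * c - ys * s, xs * s + ys * c

    # Offset: whatever is still missing for p0 to land on q0
    P0 = scale_and_rotate(p0[0], p0[1])
    offset = (q0[0] - P0[0], q0[1] - P0[1])

    def transform(x, y):
        xr, yr = scale_and_rotate(x, y)
        return xr + offset[0], yr + offset[1]

    return transform


# Example: the curve from (0,0) to (1,0) is remapped to run from (2,3) to (2,5)
transform = make_similarity((0, 0), (1, 0), (2, 3), (2, 5))
control_point = transform(0.5, 0.5)   # where the dragged control point ends up
```

The same transform function can then be applied to every control point of the curve, not only P1.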

How to animate 3D scatter plot by adding each point at a time in R or MATLAB

I have a set of 3D coordinates here. The data has 52170 rows and 4 columns. Each row represents one point. The first column is the point index number, increasing from 1 to 52170. The second to fourth columns are the coordinates on the x, y, and z axes, respectively. The first 10 lines are as follows:
seq x y z
1 7.126616 -102.927567 19.692112
2 -10.546907 -143.824966 50.77417
3 7.189214 -107.792068 18.758278
4 7.148852 -101.784027 19.905006
5 -14.65788 -146.294952 49.899158
6 -37.315742 -116.941185 12.316169
7 8.023512 -103.477882 19.081482
8 -14.641933 -145.100098 50.182739
9 -14.571636 -141.386322 50.547684
10 -15.691803 -145.66481 49.946281
I want to create a 3D scatter plot in which each point is added sequentially to this plot using R or MATLAB. The point represented by the first line is added first, then the point represented by the second line, ..., all the way to the last point.
In addition, I wish to control the speed at which points are added.
For 2D scatter plot, I could use the following code:
library(gganimate)
x <- rnorm(50, 5, 1)
y <- 7*x +rnorm(50, 4, 4)
ind <- 1:50
data <- data.frame(x, y, ind)
ggplot(data, aes(x, y)) + geom_point(aes(group = seq_along(x))) + transition_reveal(ind)
But I cannot find information on how to do this for a 3D scatter plot. Can anyone show me how this could be done? Thank you.
This is an answer for MATLAB
In a general fashion, animating a plot (or 3d plot, or scatter plot, or surface, or other graphic objects) can be done following the same approach:
Do the first plot/plot3/scatter/surf, and retrieve its handle. The first plot can incorporate the first "initial" set of points or even be empty (use a NaN value to create a plot with an invisible data point).
Set the axis limits and all other visualisation options which are going to be fixed (view point, camera angle, lighting...). There is no need to set the options which are going to evolve during the animation.
In a loop, update the minimum set of plot object properties: XData, YData (ZData for a 3D plot, CData if the plot object has some and you want to animate the color).
The code below is an implementation of the approach above adapted to your case:
%% Read data and place coordinates in named variables
csvfile = '3D scatter plot.csv' ;
data = csvread(csvfile,2) ;
% [optional], just to simplify notations further down
x = data(:,2) ;
y = data(:,3) ;
z = data(:,4) ;
%% Generate empty [plot3] objects
figure
% create an "axes" object, and retrieve the handle "hax"
hax = axes ;
% create 2 empty 3D point plots:
% [hp_new] will contain only one point (the new point added to the graph)
% [hp_trail] will contain all the points displayed so far
hp_trail = plot3(NaN,NaN,NaN,'.b','Parent',hax,'MarkerSize',2) ;
hold on
hp_new = plot3(NaN,NaN,NaN,'or','Parent',hax,'MarkerSize',6,'MarkerEdgeColor','r','MarkerFaceColor','g','LineWidth',2) ;
hold off
%% Set axes limits (to limit "wobbling" during animation)
xl = [min(x) max(x)] ;
yl = [min(y) max(y)] ;
zl = [min(z) max(z)] ;
set(hax, 'XLim',xl,'YLim',yl,'ZLim',zl)
view(145,72) % set a view perspective (optional)
%% Animate
np = size(data,1) ;
for ip=1:np
% update the "new point" graphic object
set( hp_new , 'XData',x(ip), 'YData',y(ip), 'ZData',z(ip) )
% update the "point history" graphic object
% we will display points from index 1 up to the current index ip
% (minus one) because the current index point is already displayed in
% the other plot object
indices2display = 1:ip-1 ;
set(hp_trail ,...
'XData',x(indices2display), ...
'YData',y(indices2display), ...
'ZData',z(indices2display) )
% force graphic refresh
drawnow
% Set the "speed"
% actually the max speed is given by your hardware, so we'll just set a
% short pause in case you want to slow it down
pause(0.01) % <= comment this line if you want max speed
end
This will produce:

Filling a curve with points that fit under the curve in R plot

I was wondering how I can efficiently (using short R code) fill a curve with points that can fill up the area under my curve?
I have tried something without success, here is my R code:
data = rnorm(1000) ## random data points to fill the curve
curve(dnorm(x), -4, 4) ## curve to be filled by "data" above
points(data) ## plotting the points to fill the curve
Here's a method that uses interpolation to ensure that the plotted points won't exceed the height of the curve (although, if you want the actual point markers to not stick out above the curve, you'll need to set the threshold slightly below the height of the curve):
# Curve to be filled
c.pts = as.data.frame(curve(dnorm(x), -4, 4))
# Generate 1000 random points in the same x-interval and with y value between
# zero and the maximum y-value of the curve
set.seed(2)
pts = data.frame(x=runif(1000,-4,4), y=runif(1000,0,max(c.pts$y)))
# Using interpolation, keep only those points whose y-value is less than y(x)
pts = pts[pts$y < approx(c.pts$x,c.pts$y,xout=pts$x)$y, ]
# Plot the points
points(pts, pch=16, col="red", cex=0.7)
A method for plotting exactly a desired number of points under a curve
Responding to @d.b's comment, here's a way to get exactly a desired number of points plotted under a curve:
First, let's figure out how many random points we need to generate over the entire plot region in order to get (roughly) a target number of points under the curve. We do this as follows:
Calculate the area under the curve as a fraction of the area of the rectangle bounded by zero and the maximum height of the curve on the vertical axis, and by the width of the curve on the horizontal axis.
The number of random points we need to generate is the target number of points, divided by the area ratio calculated above.
# Area ratio
aa = sum(c.pts$y*median(diff(c.pts$x)))/(diff(c(-4,4))*max(c.pts$y))
# Target number of points under curve
n.target = 1000
# Number of random points to generate
n = ceiling(n.target/aa)
But we need more points than this to ensure we get at least n.target, because random variation will result in fewer than n.target points about half the time, once we limit the plotted points to those below the curve. So we'll add an excess.factor in order to generate more points under the curve than we need, then we'll just randomly select n.target of those points to plot. Here's a function that takes care of the entire process for a general curve.
# Plot a specified number of points under a curve
pts.under.curve = function(data, n.target=1000, excess.factor=1.5) {
# Area under curve as fraction of area of plot region
aa = sum(data$y*median(diff(data$x)))/(diff(range(data$x))*max(data$y))
# Number of random points to generate
n = excess.factor*ceiling(n.target/aa)
# Generate n random points in x-range of the data and with y value between
# zero and the maximum y-value of the curve
pts = data.frame(x=runif(n,min(data$x),max(data$x)), y=runif(n,0,max(data$y)))
# Using interpolation, keep only those points whose y-value is less than y(x)
pts = pts[pts$y < approx(data$x,data$y,xout=pts$x)$y, ]
# Randomly select only n.target points
pts = pts[sample(1:nrow(pts), n.target), ]
# Plot the points
points(pts, pch=16, col="red", cex=0.7)
}
Let's run the function for the original curve:
c.pts = as.data.frame(curve(dnorm(x), -4, 4))
pts.under.curve(c.pts)
Now let's test it with a different distribution:
# Curve to be filled
c.pts = as.data.frame(curve(df(x, df1=100, df2=20),0,5,n=1001))
pts.under.curve(c.pts, n.target=200)
n_points = 10000 #A large number
#Store curve in a variable and plot
cc = curve(dnorm(x), -4, 4, n = n_points)
#Generate n_points random points
p = data.frame(x = seq(-4,4,length.out = n_points), y = rnorm(n = n_points))
#OR p = data.frame(x = runif(n_points,-4,4), y = rnorm(n = n_points))
#Find out the index of values in cc$x closest to p$x
p$ind = findInterval(p$x, cc$x)
#Only retain those points within the curve whose p$y are smaller than cc$y
p2 = p[p$y >= 0 & p$y < cc$y[p$ind],] #may need p[p$y < 0.90 * cc$y[p$ind],] or something
#Plot points
points(p2$x, p2$y)

How to generate random shapes given a specified area.(R language).?

My question is this: I am working on some clustering algorithms, and I am first experimenting with 2D shapes.
Given a particular area, say 500 sq. units, I need to generate random shapes with that area,
say a rectangle, a square, a triangle, etc. of 500 sq. units. Any suggestions on how I should go about this problem? I am using the R language.
It's fairly straightforward to do this for a regular polygon.
The area of an n-sided regular polygon, with a circumscribed circle of radius R is
A = 1/2 nR^2 * sin((2pi)/n)
Therefore, knowing n and A you can easily find R
R = sqrt((2*A)/(n*sin((2pi)/n)))
So, you can pick the center, go at distance R and generate n points at 2pi/n angle increments.
In R:
regular.poly <- function(nSides, area)
{
# Find the radius of the circumscribed circle
radius <- sqrt((2*area)/(nSides*sin((2*pi)/nSides)))
# I assume the center is at (0; 0); the last generated point lies at (radius; 0)
points <- list(x=NULL, y=NULL)
angles <- (2*pi)/nSides * 1:nSides
points$x <- cos(angles) * radius
points$y <- sin(angles) * radius
return (points);
}
# Some examples
par(mfrow=c(3,3))
for (i in 3:11)
{
p <- regular.poly(i, 100)
plot(0, 0, "n", xlim=c(-10, 10), ylim=c(-10, 10), xlab="", ylab="", main=paste("n=", i))
polygon(p)
}
We can extrapolate to a generic convex polygon.
The area of a convex polygon can be found as:
A = 1/2 * [(x1*y2 + x2*y3 + ... + xn*y1) - (y1*x2 + y2*x3 + ... + yn*x1)]
We generate the polygon as above, but deviate angles and radii from those of the regular polygon.
We then scale the points to get the desired area.
convex.poly <- function(nSides, area)
{
# Find the radius of the circumscribed circle, and the angle of each point if this was a regular polygon
radius <- sqrt((2*area)/(nSides*sin((2*pi)/nSides)))
angle <- (2*pi)/nSides
# Randomize the radii/angles
radii <- rnorm(nSides, radius, radius/10)
angles <- rnorm(nSides, angle, angle/10) * 1:nSides
angles <- sort(angles)
points <- list(x=NULL, y=NULL)
points$x <- cos(angles) * radii
points$y <- sin(angles) * radii
# Find the area of the polygon
m <- matrix(unlist(points), ncol=2)
m <- rbind(m, m[1,])
current.area <- 0.5 * (sum(m[1:nSides,1]*m[2:(nSides+1),2]) - sum(m[1:nSides,2]*m[2:(nSides+1),1]))
points$x <- points$x * sqrt(area/current.area)
points$y <- points$y * sqrt(area/current.area)
return (points)
}
A random square of area 500m^2 is easy - it's a square of side sqrt(500)m. Do you care about rotations? Then rotate it by runif(1,0,2*pi). Do you care about its location? Add an (x,y) offset computed from runif or whatever.
Rectangle? Given the length of any one pair of sides you only have the freedom to choose the length of the other two. How do you choose the length of the first pair of sides? Well, you might want to use runif() between some 'sensible' limits for your application. You could use rnorm() but that might give you negative lengths, so maybe rnorm-squared. Then once you've got that side, the other side length is 500/L. Rotate, translate, and add salt and pepper to taste.
For triangles, the area formula is half-base-times-height. So generate a base length - again, runif, rnorm etc etc - then choose another point giving the required height. Rotate, etc.
Summarily, a shape has a number of "degrees of freedom", and constraining the area to be fixed will limit at least one of those freedoms[1], so if you start building a shape with random numbers you'll come to a point where you have to put in a computed value.
[1] exactly one? I'm not sure - these aren't degrees of freedom in the statistical sense...
I would suggest coding a random walk of adjacent tiny squares, so that the aggregation of the tiny squares could be of arbitrary shape with known area.
http://en.wikipedia.org/wiki/File:Random_walk_in2D.png
It would be very tough to make a generic method.
But you could code up examples for 3-, 4- and 5-sided objects.
Here is an example of a random triangle (in C#).
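using System;

class Triangle
{
    public double Angle1;   // first base angle, in degrees
    public double Angle2;   // second base angle, in degrees
    // third angle = 180 - Angle1 - Angle2
    public double Base;
}
static Triangle RandomTriangle(double area, Random rng)
{
    // A = (base * height) / 2.0
    double angle1 = rng.NextDouble() * 180.0;             // random number < 180
    double angle2 = rng.NextDouble() * (180.0 - angle1);  // random number < (180 - angle1)
    // Trig: base = height * (cot(angle1) + cot(angle2)), so
    // A = base^2 / (2 * (cot(angle1) + cot(angle2)))
    double a1 = angle1 * Math.PI / 180.0;
    double a2 = angle2 * Math.PI / 180.0;
    double cotSum = 1.0 / Math.Tan(a1) + 1.0 / Math.Tan(a2);
    // "base" is a reserved word in C#, hence the longer name
    double triangleBase = Math.Sqrt(2.0 * area * cotSum);
    return new Triangle { Angle1 = angle1, Angle2 = angle2, Base = triangleBase };
}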

Interpolating height for a point inside a grid based on a discrete height function

I have been wracking my brain to come up with a solution to this problem.
I have a lookup table that returns height values for various points (x,z) on the grid. For instance I can calculate the height at A, B, C and D in Figure 1. However, I am looking for a way to interpolate the height at P (which has a known (x,z)). The lookup table only has values at the grid intervals, and P lies between these intervals. I am trying to calculate values s and t such that:
A'(s) = A + s(C-A)
B'(t) = B + t(P-B)
I would then use these two equations to find the intersection point of B'(t) with A'(s), which gives a point X on the line A-C. With this I can calculate the height at X, and from that the height at point P.
My issue lies in calculating the values for s and t.
Any help would be greatly appreciated.
Try also bilinear interpolation or bicubic interpolation.
Depending on if you want to interpolate between ABC or ABCD the algorithm will change.
To interpolate between ABC (which I assume is what you want to do, since you drew the diagonal), you need to find the barycentric coordinates of P relative to the x and y positions of A, B and C, then apply those barycentric coordinates to the height component (z is assumed here) of the three corners.
What about going this way: find u and v so that
P = B + u(A-B) + v(C-B)
If you write this out, you'll see that this is a 2x2 linear system with unknowns u and v, so I guess you know how to go on from there.
Oh, and once you have u and v you use the same exact formula as above for the height, only this time A,B,C,P will be the heights at these points.
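As a rough Python sketch of this approach: the 2x2 system for u and v is solved with Cramer's rule, and the same u and v are then applied to the corner heights. The function name interpolate_in_triangle and the (x, z, height) tuple layout are assumptions made for the example.

```python
def interpolate_in_triangle(a, b, c, p):
    """Height at p = (x, z) inside triangle ABC, where each corner is
    given as (x, z, height). Hypothetical helper for illustration."""
    # Edge vectors measured from B, matching P = B + u(A-B) + v(C-B)
    ax, az = a[0] - b[0], a[1] - b[1]
    cx, cz = c[0] - b[0], c[1] - b[1]
    px, pz = p[0] - b[0], p[1] - b[1]

    # Determinant of the 2x2 system; zero means A, B, C are collinear
    det = ax * cz - az * cx
    if det == 0:
        raise ValueError("A, B and C are collinear")

    # Cramer's rule for u and v
    u = (px * cz - pz * cx) / det
    v = (ax * pz - az * px) / det

    # Same formula again, this time on the heights
    return b[2] + u * (a[2] - b[2]) + v * (c[2] - b[2])


# Example with a unit grid cell: A above B, C to the right of B
A = (0.0, 1.0, 10.0)
B = (0.0, 0.0, 0.0)
C = (1.0, 0.0, 20.0)
height_at_p = interpolate_in_triangle(A, B, C, (0.25, 0.5))  # 10.0
```

If u or v is negative, or u + v is greater than 1, then P lies outside triangle ABC and the other triangle of the grid cell should be used instead.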
Assuming the values are available at the four corners of a square of unit side length, the interpolated value at any point (x,y) inside the square is given by:
f(x,y) = [ (1-y)f(0,0) + yf(0,1) ](1-x) + [ (1-y)f(1,0)+y(f(1,1)) ]x
If square has length other than 1,say L then f(x,y) is given by:
f(x,y) = [ (L-y)f(0,0) + yf(0,L) ](L-x)/L^2 + [ (L-y)f(L,0)+y(f(L,L)) ]x/L^2
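A minimal Python sketch of the formula for a square of side L, assuming the four corner values are passed in directly (the function and parameter names are made up for the example):

```python
def bilinear(f00, f01, f10, f11, x, y, side=1.0):
    """Bilinear interpolation inside a square with corners at
    (0,0), (0,L), (L,0) and (L,L), where L = side.
    f00 = f(0,0), f01 = f(0,L), f10 = f(L,0), f11 = f(L,L)."""
    # Interpolate along y on the x = 0 edge and on the x = L edge
    left = (side - y) * f00 + y * f01
    right = (side - y) * f10 + y * f11
    # Blend the two edge values along x; divide by L^2 to normalise
    return (left * (side - x) + right * x) / side ** 2


# Centre of a unit square and centre of a square of side 2
centre_unit = bilinear(0.0, 10.0, 20.0, 30.0, 0.5, 0.5)           # 15.0
centre_big = bilinear(0.0, 10.0, 20.0, 30.0, 1.0, 1.0, side=2.0)  # 15.0
```

For a grid cell that does not start at the origin, subtract the cell's lower corner from the point's coordinates before calling the function.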
Here's an explicit example based on shape functions.
Consider the functions:
u1(x,z) = (x-x_b)/(x_c-x_b)
One has u1(x_b,z_b) = u1(x_a,z_a) = 0 (because x_a = x_b) and u1(x_c,z_c) = u1(x_d,z_d) = 1
u2(x,z) = 1 - u1(x,z)
Now we have u2(x_b,z_b) = u2(x_a,z_a) = 1 and u2(x_c,z_c) = u2(x_d,z_d) = 0
v1(x,z) = (z-z_b)/(z_a-z_b)
This function satisfies v1(x_a,z_a) = v1(x_d,z_d) = 1 and v1(x_b,z_b) = v1(x_c,z_c) = 0
v2(x,z) = 1 - v1(x,z)
We have v2(x_a,z_a) = v2(x_d,z_d) = 0 and v2(x_b,z_b) = v2(x_c,z_c) = 1
Now let's build new functions as follows:
S_D(x,z) = u1(x,z) * v1(x,z)
We get S_D(x_d, z_d) = 1 and S_D(x_a,z_a) = S_D(x_b,z_b) = S_D(x_c,z_c) = 0
S_C(x,z) = u1(x,z) * v2(x,z)
We get S_C(x_c, z_c) = 1 and S_C(x_a,z_a) = S_C(x_b,z_b) = S_C(x_d,z_d) = 0
S_A(x,z) = u2(x,z) * v1(x,z)
We get S_A(x_a, z_a) = 1 and S_A(x_b,z_b) = S_A(x_c,z_c) = S_A(x_d,z_d) = 0
S_B(x,z) = u2(x,z) * v2(x,z)
We get S_B(x_b, z_b) = 1 and S_B(x_a,z_a) = S_B(x_c,z_c) = S_B(x_d,z_d) = 0
Now define your interpolating function as
H(x,z) = h_a * S_A(x,z) + h_b * S_B(x,z) + h_c * S_C(x,z) + h_d * S_D(x,z),
where h_a is the height at point A, h_b is the height at point B, and so on.
You can easily verify that H is indeed an interpolating function:
H(x_a,z_a) = h_a, H(x_b,z_b) = h_b, H(x_c,z_c) = h_c and H(x_d,z_d) = h_d.
Now, in order to approximate the height at P, all you need to do is evaluate H at this point:
h_p = H(x_p, z_p)
The functions S are normally referred to as "shape functions". There's one such function for each node you want your interpolated value to depend on, and in this case they all satisfy Kronecker's delta property (they take the value one at one node and zero at all other nodes).
There are many ways to build shape functions for a given set of nodes. If I remember correctly, the construction of 2D shape functions by multiplication of 1D shape functions (as we've done in this case) is called "tensor product of functions" (easy in this case because the grid is rectangular). We have ended up with four functions (one per node), all of them linear combinations of {1, x, z, xz}.
If you want to use only three points for your interpolation, then you should be able to easily build three shape functions as linear combinations of {1, x, z} only, but you will lose 25% of the height information provided by the grid and your interpolant will not be smooth inside the rectangle when h_b != h_d.
