Computing 3-way Venn diagram circle radius & position

Given the source data for a Venn diagram, e.g. A=10, B=15, C=12, A+B=5, B+C=3, A+C=2, A+B+C=1, I need to draw a Venn diagram with the circle sizes proportional to A, B, and C, and their overlaps proportional to A+B, B+C, and A+C. The diagram does not need to match the data perfectly, but it should be as close as possible (I prefer a simpler computation method). It must correctly represent cases of no overlap and cases where one set is a proper subset of another. How would I compute the correct position and radius of each circle for a given canvas size (width/height)? I was able to find the mathematics for a two-circle Venn diagram. Has anyone done the three-circle calculation?
P.S. The example numbers above were random, and might be invalid.

The distance between circles A, B must be the solution to the two-circle Venn problem with intersection area equal to (A+B) + (A+B+C) (or simply (A+B) if by your definition it includes (A+B+C)). Similarly for B, C the intersection is (B+C) + (A+B+C), and likewise for C, A.
Solve these independently with the algorithm you have found, and you get three distances equal to the side lengths of the triangle joining the centers of the three circles. Constructing the triangle and hence drawing the circles is then an easy task with some high-school trigonometry.
The solution is unique, and only valid if the intersection values themselves are valid.

There are 8 regions defined by a 3-circle Venn diagram. If we define set A as including the binary numbers 0 through 7 that have the 1-bit set, B as those with the 2-bit set, and C as those with the 4-bit set, we get
A = {1, 3, 5, 7}; B = {2, 3, 6, 7}; C = {4, 5, 6, 7}
Each one of those numbers defines a region in the diagram, with 0 representing the region outside the circles and inside the universal set, i.e. A' ∩ B' ∩ C'.
You know how to do the 2-circle problem. So solve that for A and B (using the sizes of A, B, and A ∩ B), for B and C, and for A and C. That gives you the distances between the circle centers and the sizes of the circles. Use the three distances to draw a triangle for those circle centers, then draw the circles around those centers. If that makes the exterior region 0 have the wrong size, you can shrink or expand the entire 3-circle setup to get that right as well.
That makes all regions correct--except for region 7, the intersection of all three sets. That size will be set from all the others--you have no choice here. Therefore, it will probably not have the size you desire. You will need to experiment to see if the size of that region is usually close enough to what you want. My brief research implies that there is no way to use circles in your diagram and always get the sizes of all eight regions. If you use ellipses or some other more-general shape instead, this should be possible, but you seem to want circles.
Note that if you solve the 2-circle problem correctly, the situations of disjoint circles and of subset circles will automatically be handled. For example, if A and B are disjoint, then regions 3 and 7 are empty, and your solution will make the two circles not overlap. They will probably touch, if you use the obvious algorithm from your linked site, but without overlap. Similarly, if one set is a subset of another, one circle will be inside the other, though they will probably touch. If you do not want the touching, the algorithm to avoid that should be easy, unless of course you have the situation where two of your three sets are equal.
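
As a rough illustration of the approach described in the answers above, here is a minimal Python sketch. It assumes the inputs are total set sizes and total pairwise intersection sizes (so the A∩B value already includes A∩B∩C), treats them as areas, and solves each two-circle problem by bisection on the center distance. The function names (lens_area, center_distance, place_circles) are made up for this example, and the bisection is just one simple way to solve the two-circle problem, not necessarily the method from the linked site.

```python
import math

def lens_area(r1, r2, d):
    """Area of the intersection of two circles whose centers are d apart."""
    if d >= r1 + r2:                      # disjoint (or touching from outside)
        return 0.0
    if d <= abs(r1 - r2):                 # smaller circle fully inside the larger
        return math.pi * min(r1, r2) ** 2
    clamp = lambda v: max(-1.0, min(1.0, v))
    a1 = math.acos(clamp((d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    a2 = math.acos(clamp((d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return r1 * r1 * a1 + r2 * r2 * a2 - tri

def center_distance(area1, area2, overlap, iters=60):
    """Distance between centers so that the two circles overlap by `overlap`."""
    r1 = math.sqrt(area1 / math.pi)
    r2 = math.sqrt(area2 / math.pi)
    lo, hi = abs(r1 - r2), r1 + r2
    if overlap <= 0:                      # no overlap: circles just touch
        return hi
    if overlap >= min(area1, area2):      # subset: inner circle touches the outer
        return lo
    for _ in range(iters):                # lens area decreases as d grows
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > overlap:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def place_circles(a, b, c, ab, bc, ac):
    """Return ([center_A, center_B, center_C], [r_A, r_B, r_C]) in area units."""
    radii = [math.sqrt(s / math.pi) for s in (a, b, c)]
    d_ab = center_distance(a, b, ab)
    d_bc = center_distance(b, c, bc)
    d_ac = center_distance(a, c, ac)
    if d_ab == 0:                         # A and B coincide; put C on the x-axis
        x, y = d_ac, 0.0
    else:                                 # law of cosines; clamp if no valid triangle
        x = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
        y = math.sqrt(max(0.0, d_ac ** 2 - x ** 2))
    return [(0.0, 0.0), (d_ab, 0.0), (x, y)], radii
```

To fit a canvas, the centers and radii can then be translated and multiplied by one common scale factor. If the three distances do not satisfy the triangle inequality, the clamp above only gives an approximate layout.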

Related

How to determine orientation of points in 2D spatial data and arrange in clockwise manner [for corner cases]?

Essentially I have spatial data in which I go through each point and figure out which surrounding points lie within a circle of some radius. I then want to arrange those points in clockwise order, and I have managed to do that for "most" cases. The unique feature of this data is that, with how I defined my circle's radius, there are at most 6 possible locations that can surround any center point [top-left, top-right, right, bottom-right, bottom-left, left].
So as a sample data
Center Point: 161.3861 368.8119
col row
1 164.5365 363.4114
2 155.2205 368.7669
3 167.5968 368.8569
4 158.2358 374.1674
5 164.4465 374.2124
6 158.3258 363.3663
The function would then output [4, 5, 3, 1, 6, 2], which is the clockwise order. This sub-sample [highlighted in red, with the center kept black] of the data looks like this. [To be clear I have this case working]
But you can imagine that it is not exactly straightforward for the various corner cases. For instance, the following case has no point to the right of it, so in the final output array there should be zeros in the "right, top-right, top-left" indices of the array I described earlier.
What I am struggling with is a systematic way to go through the corner cases and assign labels to the missing points. I tried using a dot product approach to quantify how close the points are to each other (using a normal vector pointing straight up), but this led to issues with discriminating top-right from right. I imagine that by checking whether a line goes through the point we get a sense of what axis the point lies on, but I have not managed to make it work. To summarize, the two main corner cases are
Edge points
Island points

You could write a function to tell you which direction a point is in, given the point and the center-point:
Pseudo-code:
direction_vector = point - center_point
angle = atan2(direction_vector.y, direction_vector.x)
direction_index = floor((angle * 12 / TWO_TIMES_PI) + 12) % 12
This will give you an integer index from 0 to 11 (imagine hours on a bizarro clock face that goes anti-clockwise from 0, with 0 on the right where 3 o'clock is on a normal clock).
Now map this onto your directions, with 1 being top-left, 2 being top-right, 3 being right etc:
direction_index = (((16 - direction_index) // 2) % 6) + 1
Where // is integer division and % is modulo.
Now that you have the directions, iterate from 1 to 6 and output the array index of your point that has the corresponding direction index, or 0 if there isn't one (assuming 1-based array indexing).
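
The steps above can be sketched in Python roughly as follows. This is only an illustration: the function name clockwise_neighbors is made up, and it assumes the row coordinate increases upward, which is what the sample data in the question implies (point 4 has the larger row value and is listed as top-left).

```python
import math

def clockwise_neighbors(center, points):
    """Return 1-based indices of `points` in the order
    [top-left, top-right, right, bottom-right, bottom-left, left],
    with 0 where no neighbor exists in that direction."""
    result = [0] * 6
    for i, (x, y) in enumerate(points, start=1):
        angle = math.atan2(y - center[1], x - center[0])
        # 12 sectors of 30 degrees, counted anti-clockwise starting at "right".
        sector = math.floor(angle * 12 / (2 * math.pi)) % 12
        # Pairs of sectors map onto the 6 clockwise directions, 1 = top-left.
        direction = ((16 - sector) // 2) % 6 + 1
        result[direction - 1] = i
    return result

center = (161.3861, 368.8119)
sample = [
    (164.5365, 363.4114),
    (155.2205, 368.7669),
    (167.5968, 368.8569),
    (158.2358, 374.1674),
    (164.4465, 374.2124),
    (158.3258, 363.3663),
]
print(clockwise_neighbors(center, sample))  # [4, 5, 3, 1, 6, 2]
```

Using 12 sectors rather than 6 matters here: the neighbors sit almost exactly on multiples of 60 degrees, so each one may fall on either side of a 6-sector boundary, whereas both adjacent 30-degree sectors map to the same direction.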

Your points are arranged in a hexagonal grid. When you consider the immediate neighbors, you can easily classify them in an absolute way by comparing the coordinates of the centers to three straight lines.
Then sort by index.

What about adding dummy points so that every point has six neighbors? Then when you enumerate the neighbors in the desired order, you just skip the dummy ones.
Depending how your data structures are organized, you could truly add points to the point set, or add them "virtually" when you process a given point.

Tangent circle(s) for two other circles?

There are two circles: circle a centered at point A, and circle b centered at point B. What is the equation to calculate the 2D positions of all tangent circles, if any exist? The main constraint is that the radius is the same for all the circles. As far as I know, there should be either no solution (figure 2) or 2 solutions (figure 1). How do I find out whether there are solutions, and the positions of the centers of those solutions (C and D)?
Figure 1: 2 solutions should be possible here
Figure 2: No solutions!
Update (solution):
1) Calculate the distance from A to B -> |AB|:
2) Check whether a solution exists; it exists only if:
3) If it exists, calculate the halfway point H between points A and B:
4) Create a normalized vector perpendicular to line segment AB:
5) Calculate the distance from this point H to point C -> |HC|:
6) Finally, calculate point C along (HC), starting at H at distance |HC|:
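
The formulas for the six steps are not shown above, so here is a hedged Python sketch of what they could look like. It assumes the existence condition is |AB| <= 4r and that |HC| follows from Pythagoras in the right triangle A-H-C with |AC| = 2r. The function name tangent_circle_centers is invented for this example.

```python
import math

def tangent_circle_centers(a, b, r):
    """Centers of the circles of radius r tangent to two circles of radius r
    centered at a and b. Returns a list with 0, 1 or 2 points."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    dist = math.hypot(dx, dy)                 # step 1: |AB|
    if dist == 0 or dist > 4 * r:             # step 2: no (or infinitely many) solutions
        return []
    hx, hy = (ax + bx) / 2, (ay + by) / 2     # step 3: halfway point H
    nx, ny = -dy / dist, dx / dist            # step 4: unit vector perpendicular to AB
    # step 5: |AC| = 2r and |AH| = |AB|/2, so |HC| = sqrt((2r)^2 - (|AB|/2)^2)
    hc = math.sqrt(max(0.0, (2 * r) ** 2 - (dist / 2) ** 2))
    if hc == 0:                               # |AB| = 4r: both solutions coincide at H
        return [(hx, hy)]
    # step 6: move from H along the perpendicular, in both directions
    return [(hx + nx * hc, hy + ny * hc), (hx - nx * hc, hy - ny * hc)]
```

For example, with A = (0, 0), B = (2, 0) and r = 1, the two centers are (1, sqrt(3)) and (1, -sqrt(3)), each at distance 2r from both A and B.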

I suppose this question should migrate to a more math related site.
Try to imagine where these two tangent circles go when the circles a and b get further and further apart. They get closer to the line AB. Once the length of segment AB equals 4r, the two tangent circles coincide. From then on, as circles a and b move further apart, there are no tangent circles whatsoever.
If you want to calculate the position of these circles, just assume that the distance between the centers is always 2r:
You should get two, one, or no solutions for xC and yC, which will be the centers of your tangent circles. I hope I haven't messed something up.
Solutions
Provided you do know there are solutions ( just check if d(A,B) <= 4r ), these are the coordinates of your two circles:
http://pastebin.com/LeW7Ws98
A little scary, eh? But it's working. There are the following variables:
x_A, y_A - the coordinates of the circle A,
x_B, y_B - the coordinates of the circle B,
r - the radius.
I've checked the solutions with the values from one of my comments below. I think that you can copy these solutions and inject them into your code straight away (provided there's a sqrt function) and get the results after declaring some variables.
These solutions are loosely derived from the Save's proposition but I couldn't comment below his answer - I've got less than 50 reputation points, duh ... ( thanks SO! You're the man! ). However I'm pretty sure they should be valid for my system anyways. Cheers

A solution exists iff d(A,B) = sqrt(2)*2*r
To find the centers of the solution circles, which will let you draw the circumferences, you can intersect the circle centered at (x_m,y_m), the midpoint of segment AB, of radius sqrt(2)*r, with the line perpendicular to AB passing through (x_m,y_m).
This should give you all the information needed to check if a solution exists and, if it does, to draw it.

Find all points with integer coordinates inside tetrahedron

I am trying to find all the points with integer coordinates that lie inside of a tetrahedron (I want to somehow be able to loop through them). I know the coordinates of the four points (A, B, C, D) that define the tetrahedron.
What I'm currently doing is I find the bounding box of the tetrahedron (minimum and maximum x, y, z coordinates of A, B, C, D) and then do a loop through all of the points inside the bounding box. For every such point, I calculate the barycentric coordinates (using the equations from Wikipedia) and check if the point is inside the tetrahedron (if any of the barycentric coordinates is negative or bigger than 1, the point isn't inside).
Is there a better way to do this? Currently there is around 1/6 chance that the point I am testing (from the bounding box) really lies inside the tetrahedron, so I think I'm doing too many unnecessary computations.
I am working with a list of tetrahedra that I generated by triangulating a bigger volume (I am expanding the volume and want to interpolate the missing values using tetrahedral interpolation). I am not using any external libraries.

Another idea for improving:
check if a "rod" parallel to the z-axis (e.g. x=4, y=5) runs through the tetrahedron. If not, no points (x=4, y=5, z) can be inside.
Else, find where the rod intersects the boundary of the tetrahedron (by finding out where the planes that make up the faces of the tetrahedron intersect it).
Say these planes intersect it at z=1.3 and z=10.04. Then you know all points (4,5,2) to (4,5,10) are inside.
Repeat for all values of x and y.
This should be faster in practice, because it saves you one loop.
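
A minimal Python sketch of this "rod" idea follows, under the assumption that the vertex coordinates are integers so that exact rational arithmetic (fractions.Fraction) can be used for the z bounds. The helper names are invented for this example, and points on the boundary are counted as inside.

```python
import math
from fractions import Fraction

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def face_planes(verts):
    """One (normal, d) per face, oriented so that n.p + d >= 0 holds inside."""
    planes = []
    for i in range(4):
        p, q, s = (verts[j] for j in range(4) if j != i)
        n = _cross(_sub(q, p), _sub(s, p))
        d = -_dot(n, p)
        if _dot(n, verts[i]) + d < 0:     # flip so the opposite vertex is "inside"
            n, d = (-n[0], -n[1], -n[2]), -d
        planes.append((n, d))
    return planes

def lattice_points(verts):
    """All integer points inside a non-degenerate tetrahedron (boundary included)."""
    planes = face_planes(verts)
    xs, ys, zs = zip(*verts)
    points = []
    for x in range(min(xs), max(xs) + 1):
        for y in range(min(ys), max(ys) + 1):
            z_lo, z_hi = Fraction(min(zs)), Fraction(max(zs))
            hit = True
            for (a, b, c), d in planes:
                rest = a * x + b * y + d          # constraint: c*z + rest >= 0
                if c == 0:
                    if rest < 0:                  # rod misses the tetrahedron
                        hit = False
                        break
                elif c > 0:
                    z_lo = max(z_lo, Fraction(-rest, c))
                else:
                    z_hi = min(z_hi, Fraction(-rest, c))
            if hit:
                for z in range(math.ceil(z_lo), math.floor(z_hi) + 1):
                    points.append((x, y, z))
    return points
```

For the tetrahedron (0,0,0), (2,0,0), (0,2,0), (0,0,2) this yields the 10 integer points with x, y, z >= 0 and x + y + z <= 2.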

Your approach is the correct one. There are some possible optimisations, which might be worth it or not depending on the requirements. For example:
There is an easier way to check if a given point is inside or outside of the tetrahedron.
It amounts to checking which half-space the point belongs to with respect to each of the 4 sides of the tetrahedron:
Each side is defined by 3 points (say A, B, C). Then a plane normal is (C-A)x(B-A) (that's the cross product of two vectors in the plane). If its coordinates are (a,b,c), then the plane equation is F(x,y,z) = ax+by+cz+d = 0, where d = -(a*xA+b*yA+c*zA) makes the plane pass through A. For a given point (x0, y0, z0), the sign of F(x0,y0,z0) determines which half-space the point belongs to.
The point is that you can precompute the plane equations for each side of the tetrahedron, as well as the sign which corresponds to 'outside', and then the check for a given point amounts to doing at most 4 evaluations (one for each side), each taking 3 multiplications and 3 additions.
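
As a hedged illustration of the half-space test, the following Python sketch precomputes one plane per side and orients each so that F > 0 means 'outside'. The names outward_planes and is_inside are made up for this example, the tetrahedron is assumed to be non-degenerate, and points exactly on a side are treated as inside.

```python
def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def outward_planes(verts):
    """Precompute (normal, d) for each side so that F(p) = n.p + d > 0 means outside."""
    planes = []
    for i in range(4):
        a, b, c = (verts[j] for j in range(4) if j != i)
        n = _cross(_sub(c, a), _sub(b, a))    # (C-A)x(B-A)
        d = -_dot(n, a)                       # makes the plane pass through A
        if _dot(n, verts[i]) + d > 0:         # the fourth vertex must be on the inside
            n, d = (-n[0], -n[1], -n[2]), -d
        planes.append((n, d))
    return planes

def is_inside(planes, p):
    """At most 4 plane evaluations; all() stops at the first violated side."""
    return all(_dot(n, p) + d <= 0 for n, d in planes)

planes = outward_planes([(0, 0, 0), (4, 0, 0), (0, 4, 0), (0, 0, 4)])
print(is_inside(planes, (1, 1, 1)), is_inside(planes, (2, 2, 2)))  # True False
```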

Determine if a set of points lie on a regular grid

Problem: Suppose you have a collection of points in the 2D plane. I want to know if this set of points sits on a regular grid (if they are a subset of a 2D lattice). I would like some ideas on how to do this.
For now, let's say I'm only interested in whether these points form an axis-aligned rectangular grid (that the underlying lattice is rectangular, aligned with the x and y axes), and that it is a complete rectangle (the subset of the lattice has a rectangular boundary with no holes). Any solutions must be quite efficient (better than O(N^2)), since N can be hundreds of thousands or millions.
Context: I wrote a 2D vector field plot generator which works for an arbitrarily sampled vector field. In the case that the sampling is on a regular grid, there are simpler/more efficient interpolation schemes for generating the plot, and I would like to know when I can use this special case. The special case is sufficiently better that it merits doing. The program is written in C.

This might be dumb, but if your points were to lie on a regular grid, then wouldn't peaks in the Fourier transform of the coordinates all be exact multiples of the grid resolution? You could do a separate Fourier transform of the X and Y coordinates. If there are no holes in the grid then the FT would be a delta function, I think. FFT is O(n log(n)).
p.s. I would have left this as a comment but my rep is too low..

Not quite sure if this is what you are after, but for a collection of 2D points on a plane you can always fit them on a rectangular grid (down to the precision of your points anyway). The problem may be that the grid they fit may be too sparsely populated by the points to provide any benefit to your algorithm.
To find a rectangular grid that fits a set of points, you essentially need to find the GCD of all the x coordinates and the GCD of all the y coordinates, with the origin at xmin, ymin. This should be O(n (log n)^2), I think.
How you decide if this grid is then too sparse is not clear, however.

If the points all come only from intersections on the grid, then the Hough transform of your set of points might help you. If you find that two mutually perpendicular sets of lines occur most often (meaning you find peaks at four values of theta all 90 degrees apart) and you find repeating peaks in gamma space, then you have a grid. Otherwise not.

Here's a solution that works in O(ND log N), where N is the number of points and D is the number of dimensions (2 in your case).
Allocate D arrays with space for N numbers: X, Y, Z, etc. (Time: O(ND))
Iterate through your point list and add the x-coordinate to list X, the y-coordinate to list Y, etc. (Time: O(ND))
Sort each of the new lists. (Time: O(ND log N))
Count the number of unique values in each list and make sure the difference between successive unique values is the same across the whole list. (Time: O(ND))
If the unique values in each dimension are equally spaced, and the product of the number of unique values of each coordinate is equal to the number of original points (length(uniq(X))*length(uniq(Y))* ... == N), then the points are in a regular rectangular grid.
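
A short Python sketch of the sort-and-count procedure described above is given below (the question mentions C, so this only illustrates the logic). The function name is_regular_grid and the tolerance parameter tol are assumptions added for this example; it also assumes the input contains no duplicate points, since duplicates could make the count check pass by accident.

```python
def is_regular_grid(points, tol=1e-9):
    """True if the (distinct) points form a complete, axis-aligned, evenly spaced grid."""
    n = len(points)
    if n == 0:
        return False
    product = 1
    for k in range(len(points[0])):               # one pass per dimension
        values = sorted(p[k] for p in points)     # O(N log N)
        uniq = [values[0]]
        for v in values[1:]:                      # collapse (nearly) equal values
            if v - uniq[-1] > tol:
                uniq.append(v)
        steps = [b - a for a, b in zip(uniq, uniq[1:])]
        if steps and max(steps) - min(steps) > tol:
            return False                          # unique values not equally spaced
        product *= len(uniq)
    return product == n                           # every lattice site is occupied

grid = [(0.5 * i, 2.0 * j) for i in range(3) for j in range(4)]
print(is_regular_grid(grid))        # True
print(is_regular_grid(grid[:-1]))   # False: one point missing
```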

Let's say a grid is defined by an orientation Or (between 0 and 90 deg) and a resolution Res. You could compute a cost function that evaluates how well a grid (Or, Res) fits your points. For example, you could compute the average distance of each point to its closest point of the grid.
Your problem is then to find the (Or, Res) pair that minimizes the cost function. To narrow the search space, a heuristic that tests "good" candidate grids could be used.
This approach is the same as the one used in the Hough transform proposed by jilles. The (Or, Res) space is comparable to the Hough's gamma space.

Computational geometry, tetrahedron signed volume

I'm not sure if this is the right place to ask, but here goes...
Short version: I'm trying to compute the orientation of a triangle on a plane, formed by the intersection of 3 edges, without explicitly computing the intersection points.
Long version: I need to triangulate a PSLG on a triangle in 3D. The vertices of the PSLG are defined by the intersections of line segments with the plane through the triangle, and are guaranteed to lie within the triangle. Assuming I had the intersection points, I could project to 2D and use a point-line-side (or triangle signed area) test to determine the orientation of a triangle between any 3 intersection points.
The problem is I can't explicitly compute the intersection points because of the floating-point error that accumulates when I find the line-plane intersection. To figure out if the line segments strike the triangle in the first place, I'm using some freely available robust geometric predicates, which give the sign of the volume of a tetrahedron, or equivalently which side of a plane a point lies on. I can determine if the line segment endpoints are on opposite sides of the plane through the triangle, then form tetrahedra between the line segment and each edge of the triangle to determine whether the intersection point lies within the triangle.
Since I can't explicitly compute the intersection points, I'm wondering if there is a way to express the same 2D orient calculation in 3D using only the original points. If there are 3 edges striking the triangle, that gives me 9 points in total to play with. Assuming what I'm asking is even possible (using only the 3D orient tests), then I'm guessing that I'll need to form some subset of all the possible tetrahedra between those 9 points. I'm having difficulty even visualizing this, let alone distilling it into a formula or code. I can't even google this because I don't know what the industry standard terminology might be for this type of problem.
Any ideas how to proceed with this? Thanks. Perhaps I should ask MathOverflow as well...
EDIT: After reading some of the comments, one thing that occurs to me... Perhaps if I could fit non-overlapping tetrahedra between the 3 line segments, then the orientation of any one of those that crossed the plane would be the answer I'm looking for. Other than when the edges enclose a simple triangular prism, I'm not sure this sub-problem is solvable either.
EDIT: The requested image.

I am answering this on both MO & SO, expanding the comments I made on MO.
My sense is that no computational trick with signed tetrahedra volumes will avoid the precision issues that are your main concern. This is because, if you have tightly twisted segments, the orientation of the triangle depends on the precise positioning of the cutting plane.
[image removed; see below]
In the above example, the upper plane crosses the segments in the order (a,b,c) [ccw from above]: (red,blue,green), while the lower plane crosses in the reverse order (c,b,a): (green,blue,red). The height of the cutting plane could be determined by your last bit of precision.
Consequently, I think it makes sense to just go ahead and compute the points of intersection in the cutting plane, using enough precision to make the computation exact. If your segment endpoint coordinates and plane coefficients have L bits of precision, then there is just a small constant-factor increase needed. Although I am not certain of precisely what that factor is, it is small, perhaps 4. You will not need, e.g., L^2 bits, because the computation is solving linear equations. So there will not be an explosion in the precision required to compute this exactly.
Good luck!
(I was prevented from posting the clarifying image because I don't have the reputation. See the MO answer instead.)
Edit: Do see the MO answer, but here's the image:

I would write symbolic vector equations, you know, with dot and cross products, to find the normal of the intersection triangle. Then, the sign of the dot product of this normal with the initial triangle one gives the orientation. So finally you can express this in a form sign(F(p1,...,p9)), where p1 to p9 are your points and F() is an ugly formula including dot and cross products of differences (pi-pj). Don't know if this can be done simpler, but this general approach does the job.

As I understand it, you have three lines intersecting the plane, and you want to calculate the orientation of the triangle formed by the intersection points, without calculating the intersection points themselves?
If so: you have a plane
N·(x - x0) = 0
and six points...
l1a, l1b, l2a, l2b, l3a, l3b
...forming three lines
l1 = l1a + t(l1b - l1a)
l2 = l2a + u(l2b - l2a)
l3 = l3a + v(l3b - l3a)
The intersection points of these lines to the plane occur at specific values of t, u, v, which I'll call ti, ui, vi
N·(l1a + ti(l1b - l1a) - x0) = 0
ti = (N·(x0 - l1a)) / (N·(l1b - l1a))
(similarly for ui, vi)
Then the specific points of intersection are
intersect1 = l1a + ti(l1b - l1a)
intersect2 = l2a + ui(l2b - l2a)
intersect3 = l3a + vi(l3b - l3a)
Finally, the orientation of your triangle is
orientation = direction of (intersect2 - intersect1)x(intersect3 - intersect1)
(x is cross-product) Work backwards plugging the values, and you'll have an equation for orientation based only on N, x0, and your six points.
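
The derivation above can be sketched in Python as follows. To sidestep the floating-point concern raised in the question, this version uses exact rational arithmetic via fractions.Fraction and therefore assumes integer (or Fraction) input coordinates. The function name orientation is invented for this example, and the sign is taken relative to the plane normal N, as another answer suggests.

```python
from fractions import Fraction

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def orientation(N, x0, segments):
    """Sign (+1, 0, -1) of the triangle formed by the three line/plane
    intersections, measured against the plane normal N.
    segments is [(l1a, l1b), (l2a, l2b), (l3a, l3b)]."""
    hits = []
    for a, b in segments:
        direction = _sub(b, a)
        denom = _dot(N, direction)
        if denom == 0:
            raise ValueError("segment is parallel to the plane")
        t = Fraction(_dot(N, _sub(x0, a)), denom)     # ti = N.(x0 - la) / N.(lb - la)
        hits.append(tuple(a[k] + t * direction[k] for k in range(3)))
    normal = _cross(_sub(hits[1], hits[0]), _sub(hits[2], hits[0]))
    s = _dot(normal, N)
    return (s > 0) - (s < 0)

N, x0 = (0, 0, 1), (0, 0, 0)
segs = [((0, 0, -1), (0, 0, 1)), ((1, 0, -1), (1, 0, 1)), ((0, 1, -1), (0, 1, 1))]
print(orientation(N, x0, segs))                           # 1
print(orientation(N, x0, [segs[1], segs[0], segs[2]]))    # -1
```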

Let's call your triangle vertices T[0], T[1], T[2], and the first line segment's endpoints are L[0] and L[1], the second is L[2] and L[3], and the third is L[4] and L[5]. I imagine you want a function
int Orient(Pt3 T[3], Pt3 L[6]); // index L by L[2*i+j], i=0..2, j=0..1
which returns 1 if the intersections have the same orientation as the triangle, and -1 otherwise.
The result should be symmetric under interchange of j values, antisymmetric under interchange of i values and T indices. As long as you can compute a quantity with these symmetries, that's all you need.
Let's try
Sign(Product( Orient3D(T[i],T[i+1],L[2*i+0],L[2*i+1]) * -Orient3D(T[i],T[i+1],L[2*i+1],L[2*i+0]), i=0..2 ))
where the product should be taken over cyclic permutations of the indices (modulo 3). I believe this has all the symmetry properties required. Orient3D is Shewchuk's 4-point plane orientation test, which I assume you're using.
