What is a non-simple shapely Polygon? - shapely

from shapely.geometry import Polygon, MultiPolygon, mapping
from shapely.ops import cascaded_union
polygon = Polygon([(0,0), (0, 1), (1, 1), (1, 2)])
polygon.is_simple
gives True. But the description/documentation is:
True if the geometry is simple, meaning that any self-intersections are only at boundary points, else False
I thought this was one of the cases that are not simple. Could you please give me a minimal example of a non-simple polygon?

Your polygon is not valid (the definition of validity depends on the geometry type), but it is simple. To be honest, once a geometry is not valid, I am not sure it is even possible to say whether it is simple: if it is not valid, how would you define the boundary and the interior of the geometry?
To give you an example of a not simple geometry, try the same points but with a LineString:
from shapely.geometry import LineString
l = LineString([(0, 0), (0, 1), (1, 1), (1, 2), (0, 0)])
l.is_simple  # False
In this case, since the geometry is one-dimensional, the boundary is made up of the end points. The interior consists of the line segments, and two of them cross each other there. Thus, this geometry is valid, but it is not simple.
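To make the idea of "simple" concrete without depending on shapely, here is a minimal pure-Python sketch that tests whether a path crosses itself. The function names (`orient`, `segments_cross`, `path_is_simple`) are hypothetical, and the check only looks for proper crossings between non-adjacent segments; it is an illustration of the concept, not how shapely computes `is_simple`.

```python
def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, 0 or -1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def path_is_simple(points):
    """True if no two non-adjacent segments of the path properly cross."""
    segs = list(zip(points, points[1:]))
    for i in range(len(segs)):
        for j in range(i + 2, len(segs)):
            if segments_cross(*segs[i], *segs[j]):
                return False
    return True


closed_path = [(0, 0), (0, 1), (1, 1), (1, 2), (0, 0)]  # points from the answer
open_path = [(0, 0), (0, 1), (1, 1), (1, 2)]            # same points, not closed

print(path_is_simple(closed_path))  # False: the closing segment crosses y = 1
print(path_is_simple(open_path))    # True
```

The closing segment from (1, 2) back to (0, 0) crosses the segment from (0, 1) to (1, 1) at (0.5, 1), which is what makes the closed path non-simple.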


Finding all quadrilaterals in a set of intersections

I want to take all the intersections of a set of lines and find all the convex quadrilaterals they create. I am not sure if there is an algorithm that works perfect for this, or if I need to loop through and create my own.
I have an array of lines, and all their intersections.
Lines and intersections:
Example Quadrilaterals 1:
Example Quadrilaterals 2
In this case, I would come out with 8 quadrilaterals.
How can I achieve this? If there isn't an algorithm I can implement for this, how can I check each intersection with other intersections to determine if they make a convex quadrilateral?
There is a simple, non-speedy, brute-force over-all algorithm to find those quadrilaterals. However, first you would need to clarify some definitions, especially that of a "quadrilateral." Do you count it as a quadrilateral if it has zero area, such as when all the vertices are collinear? Do you count it as a quadrilateral if it self-intersects or crosses? Do you count it if it is not convex? Do you count it if two adjacent sides are straight (which includes consecutive vertices identical)? What about if the polygon "doubles back" on itself so the result looks like a triangle with one side extended?
Here is a top-level algorithm: Consider all combinations of the line segments taken four at a time. If there are n line segments then there are n*(n-1)*(n-2)*(n-3)/24 combinations. For each combination, look at the intersections of pairs of these segments: there will be at most 6 intersections. Now see if you can make a quadrilateral from those intersections and segments.
This is brute-force, but at least it is polynomial in execution time, O(n^4). For your example of 8 line segments that means considering 70 combinations of segments--not too bad. This could be sped up somewhat by pre-calculating the intersection points: there are at most n*(n-1)/2 of them, 28 in your example.
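As a quick check of the counts quoted above, the following sketch uses Python's standard `math.comb` (available from Python 3.8) to compute the number of 4-segment combinations and the maximum number of pairwise intersections. The value `n = 8` is taken from the text.

```python
from math import comb

n = 8  # number of line segments, as stated in the text

four_at_a_time = comb(n, 4)   # n*(n-1)*(n-2)*(n-3)/24
max_intersections = comb(n, 2)  # n*(n-1)/2

print(four_at_a_time)     # 70
print(max_intersections)  # 28
```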
Does this overall algorithm meet your needs? Is your last question "how can I check each intersection with other intersections to determine if they make a quadrilateral?" asking how to implement my statement "see if you can make a quadrilateral from those intersections and segments"? Do you still need an answer to that? You would need to clarify your definition of a quadrilateral before I could answer that question.
I'll explain the definitions of "quadrilateral" more. This diagram shows four line segments "in general position," where each segment intersects all the others and no three segments intersect in the same point.
Here are (some of) the "quadrilaterals" arising from those four line segments and six intersection points.
1 simple and convex (ABDE)
1 simple and not convex (ACDF)
1 crossing (BCEF)
4 triangles with an extra vertex on a triangle's side (ABCE, ACDE, ABDF, ABFE). Note that the first two define the same region with different vertices, and the same is true of the last two.
4 "double-backs" which look like a triangle with one side extended (ACEF, BCDF, BDEF, CDEF)
Depending on how you define "quadrilateral" and "equal" you could get anywhere from 1 to 11 of them in that diagram. Wikipedia's definition would include only the first, second, and fourth in my list--I am not sure how that counts the "duplicates" in my fourth group. And I am not even sure that I found all the possibilities in my diagram, so there could be even more.
I see we are now defining a quadrilateral as outlined by four distinct line segments that are sub-segments of the given line segments that form a polygon that is strictly convex--the vertex angles are all less than a straight angle. This still leaves an ambiguity in a few edge cases--what if two line segments overlap more than at one point--but let's leave that aside other than defining that two such line segments have no intersection point. Then this algorithm, pseudo-code based on Python, should work.
We need a function intersection_point(seg1, seg2) that returns the intersection point of the two given line segments or None if there is none or the segments overlap. We also need a function polygon_is_strictly_convex(tuple of points) that returns True or False depending on if the tuple of points defines a strictly-convex polygon, with the addition that if any of the points is None then False is returned. Both those functions are standard in computational geometry. Note that "combination" in the following means that for each returned combination the items are in sorted order, so of (seg1, seg2) and (seg2, seg1) we will get exactly one of them. Python's itertools.combinations() does this nicely.
intersections = {}  # empty dictionary/hash table
for each combination (seg1, seg2) of given line segments:
    intersections[(seg1, seg2)] = intersection_point(seg1, seg2)
quadrilaterals = emptyset
for each combination (seg1, seg2, seg3, seg4) of given line segments:
    for each tuple (sega, segb, segc, segd) in [
            (seg1, seg2, seg3, seg4),
            (seg1, seg2, seg4, seg3),
            (seg1, seg3, seg2, seg4)]:
        a_quadrilateral = (intersections[(sega, segb)],
                           intersections[(segb, segc)],
                           intersections[(segc, segd)],
                           intersections[(segd, sega)])
        if polygon_is_strictly_convex(a_quadrilateral):
            quadrilaterals.add(a_quadrilateral)
            break  # only one possible strictly convex quad per 4 segments
Here is my actual, tested, Python 3.6 code, which for your segments gives your eight polygons. First, here are the utility, geometry routines, collected into module rdgeometry.
def segments_intersection_point(segment1, segment2):
    """Return the intersection of two line segments. If none, return
    None.
    NOTES:  1.  This version returns None if the segments are parallel,
                even if they overlap or intersect only at endpoints.
            2.  This is optimized for assuming most segments intersect.
    """
    try:
        pt1seg1, pt2seg1 = segment1  # points defining the segment
        pt1seg2, pt2seg2 = segment2
        seg1_delta_x = pt2seg1[0] - pt1seg1[0]
        seg1_delta_y = pt2seg1[1] - pt1seg1[1]
        seg2_delta_x = pt2seg2[0] - pt1seg2[0]
        seg2_delta_y = pt2seg2[1] - pt1seg2[1]
        denom = seg2_delta_x * seg1_delta_y - seg1_delta_x * seg2_delta_y
        if denom == 0.0:  # lines containing segments are parallel or equal
            return None
        # solve for scalars t_seg1 and t_seg2 in the vector equation
        #   pt1seg1 + t_seg1 * (pt2seg1 - pt1seg1)
        #   = pt1seg2 + t_seg2 * (pt2seg2 - pt1seg2)  and note the segments
        # intersect iff 0 <= t_seg1 <= 1, 0 <= t_seg2 <= 1 .
        pt1seg1pt1seg2_delta_x = pt1seg2[0] - pt1seg1[0]
        pt1seg1pt1seg2_delta_y = pt1seg2[1] - pt1seg1[1]
        t_seg1 = (seg2_delta_x * pt1seg1pt1seg2_delta_y
                  - pt1seg1pt1seg2_delta_x * seg2_delta_y) / denom
        t_seg2 = (seg1_delta_x * pt1seg1pt1seg2_delta_y
                  - pt1seg1pt1seg2_delta_x * seg1_delta_y) / denom
        if 0 <= t_seg1 <= 1 and 0 <= t_seg2 <= 1:
            return (pt1seg1[0] + t_seg1 * seg1_delta_x,
                    pt1seg1[1] + t_seg1 * seg1_delta_y)
        return None
    except ArithmeticError:
        return None
def orientation3points(pt1, pt2, pt3):
    """Return the orientation of three 2D points in order.
    Moving from Pt1 to Pt2 to Pt3 in cartesian coordinates:
        1 means counterclockwise (as in standard trigonometry),
        0 means straight, back, or stationary (collinear points),
       -1 means clockwise.
    """
    signed = ((pt2[0] - pt1[0]) * (pt3[1] - pt1[1])
              - (pt2[1] - pt1[1]) * (pt3[0] - pt1[0]))
    return 1 if signed > 0.0 else (-1 if signed < 0.0 else 0)
def is_strictly_convex_quadrilateral(pt1, pt2, pt3, pt4):
    """Return True if the quadrilateral defined by the four 2D points is
    'strictly convex', not a triangle nor concave nor self-intersecting.
    This version allows a 'point' to be None: if so, False is returned.
    NOTES:  1.  Algorithm: check that no points are None and that all
                angles are clockwise or all counter-clockwise.
            2.  This does not generalize to polygons with > 4 sides
                since it misses star polygons.
    """
    if pt1 and pt2 and pt3 and pt4:
        orientation = orientation3points(pt4, pt1, pt2)
        if (orientation != 0 and orientation
                == orientation3points(pt1, pt2, pt3)
                == orientation3points(pt2, pt3, pt4)
                == orientation3points(pt3, pt4, pt1)):
            return True
    return False
def polygon_in_canonical_order(point_seq):
    """Return a polygon, reordered so that two different
    representations of the same geometric polygon get the same result.
    The result is a tuple of the polygon's points. `point_seq` must be
    a sequence of 'points' (which can be anything).
    NOTES:  1.  This is intended for the points to be distinct. If two
                points are equal and minimal or adjacent to the minimal
                point, which result is returned is undefined.
    """
    pts = tuple(point_seq)
    length = len(pts)
    ndx = min(range(length), key=pts.__getitem__)  # index of minimum
    if pts[(ndx + 1) % length] < pts[(ndx - 1) % length]:
        return (pts[ndx],) + pts[ndx+1:] + pts[:ndx]  # forward
    return (pts[ndx],) + pts[:ndx][::-1] + pts[ndx+1:][::-1]  # back
def sorted_pair(val1, val2):
    """Return a 2-tuple in sorted order from two given values."""
    if val1 <= val2:
        return (val1, val2)
    return (val2, val1)
And here is the code for my algorithm. I added a little complexity to use only a "canonical form" of a pair of line segments and for a polygon, to reduce the memory usage of the intersections and polygons containers.
from itertools import combinations

from rdgeometry import (segments_intersection_point,
                        is_strictly_convex_quadrilateral,
                        polygon_in_canonical_order,
                        sorted_pair)

segments = [(( 2, 16), (22, 10)),
            (( 4,  4), (14, 14)),
            (( 4,  6), (12.54, 0.44)),
            (( 4, 14), (20,  6)),
            (( 4, 18), (14,  2)),
            (( 8,  2), (22, 16))]

# intersection point (or None) for every unordered pair of segments
intersections = dict()
for seg1, seg2 in combinations(segments, 2):
    intersections[sorted_pair(seg1, seg2)] = (
        segments_intersection_point(seg1, seg2))

quadrilaterals = set()
for seg1, seg2, seg3, seg4 in combinations(segments, 4):
    # the three distinct cyclic orderings of four segments
    for sega, segb, segc, segd in [(seg1, seg2, seg3, seg4),
                                   (seg1, seg2, seg4, seg3),
                                   (seg1, seg3, seg2, seg4)]:
        a_quadrilateral = (intersections[sorted_pair(sega, segb)],
                           intersections[sorted_pair(segb, segc)],
                           intersections[sorted_pair(segc, segd)],
                           intersections[sorted_pair(segd, sega)])
        if is_strictly_convex_quadrilateral(*a_quadrilateral):
            quadrilaterals.add(polygon_in_canonical_order(a_quadrilateral))
            break  # only one possible strictly convex quadr per 4 segments

print('\nThere are {} strictly convex quadrilaterals, namely:'
      .format(len(quadrilaterals)))
for p in sorted(quadrilaterals):
    print(p)
And the printout from that is:
There are 8 strictly convex quadrilaterals, namely:
((5.211347517730497, 5.211347517730497), (8.845390070921987, 2.8453900709219857), (11.692307692307693, 5.692307692307692), (9.384615384615383, 9.384615384615383))
((5.211347517730497, 5.211347517730497), (8.845390070921987, 2.8453900709219857), (14.666666666666666, 8.666666666666668), (10.666666666666666, 10.666666666666666))
((5.211347517730497, 5.211347517730497), (8.845390070921987, 2.8453900709219857), (17.384615384615387, 11.384615384615383), (12.769230769230768, 12.76923076923077))
((6.0, 14.8), (7.636363636363637, 12.181818181818182), (10.666666666666666, 10.666666666666666), (12.769230769230768, 12.76923076923077))
((6.0, 14.8), (7.636363636363637, 12.181818181818182), (14.666666666666666, 8.666666666666668), (17.384615384615387, 11.384615384615383))
((9.384615384615383, 9.384615384615383), (10.666666666666666, 10.666666666666666), (14.666666666666666, 8.666666666666668), (11.692307692307693, 5.692307692307692))
((9.384615384615383, 9.384615384615383), (11.692307692307693, 5.692307692307692), (17.384615384615387, 11.384615384615383), (12.769230769230768, 12.76923076923077))
((10.666666666666666, 10.666666666666666), (12.769230769230768, 12.76923076923077), (17.384615384615387, 11.384615384615383), (14.666666666666666, 8.666666666666668))
An O(intersection_count^2) algorithm is as follows:
For each intersection:
    Add the intersection point to
    a hash table with the lines as the key.
Let int be a lookup function that returns
true iff the inputted lines intersect.
RectCount = 0
For each distinct pair of intersections a,b:
    Let A be the list of lines that pass
    through point a but not through b.
    Let B be the list of lines that pass
    through point b but not through a.
    For each pair of lines c,d in A:
        For each pair of lines e,f in B:
            If (int(c,e) and int(d,f) and
                !int(c,f) and !int(d,e)) or
               (int(c,f) and int(d,e) and
                !int(c,e) and !int(d,f)):
                RectCount += 1

Find scatterplot area where ~50% of points have one of 2 values

I have a data frame that has 3 values for each point in the form: (x, y, boolean). I'd like to find an area bounded by values of (x, y) where roughly half the points in the area are TRUE and half are FALSE.
I can scatterplot the data and color it according to the 3rd value of each point, which gives me a general idea, but I was wondering if there is a better way. I understand that if you take a small enough area containing only 2 points, one TRUE and the other FALSE, then you have 50/50, so I was thinking there has to be a better way of deciding what size of area to look for.
Visually, I see this as drawing a square on the scatter plot and moving it along the x and y axes, each time checking the number of TRUE and FALSE points in the area. Is there a way to determine a good size for the area based on the values?
EDIT: G5W's answer is a step in the right direction, but based on their scatterplot, I'm looking to create a square / rectangle in which ~ half the points are green and half are red. I understand that there is potentially an infinite number of such areas, but I'm thinking there might be a good way to determine an optimal size for the area (maybe it should contain at least a certain percentage of the points or something).
Note update below
You do not provide any sample data, so I have created some bogus data like this:
TestData = data.frame(x = c(rnorm(100, -1, 1), rnorm(100, 1,1)),
y = c(rnorm(100, -1, 1), rnorm(100, 1,1)),
z = rep(c(TRUE,FALSE), each=100))
I think that what you want is how much area is taken up by each of the TRUE and FALSE points. A way to interpret that task is to find the convex hull for each group and take its area. That is, find the minimum convex polygon that contains a group. The function chull will compute the convex hull of a set of points.
plot(TestData[,1:2], pch=20, col=as.numeric(TestData$z)+2)
CH1 = chull(TestData[TestData$z,1:2])
CH2 = chull(TestData[!TestData$z,1:2])
polygon(TestData[which(TestData$z)[CH1],1:2], lty=2, col="#00FF0011")
polygon(TestData[which(!TestData$z)[CH2],1:2], lty=2, col="#FF000011")
Once you have the polygons, the polyarea function from the pracma package will compute the area. Note that it computes a "signed" area so you either need to be careful about which direction you traverse the polygon or take the absolute value of the area.
[1] 16.48692
[1] 15.17897
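To illustrate what a "signed" polygon area means, here is a minimal Python sketch of the shoelace formula. It is an assumption-laden illustration, not the pracma implementation: the function name `signed_area` is hypothetical, and in this sketch counterclockwise vertex order gives a positive result.

```python
def signed_area(xs, ys):
    """Shoelace formula: positive for counterclockwise vertex order
    (in this sketch), negative for clockwise."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to the first
        total += xs[i] * ys[j] - xs[j] * ys[i]
    return total / 2.0


# Unit square traversed in both directions
ccw = signed_area([0, 1, 1, 0], [0, 0, 1, 1])
cw = signed_area([0, 0, 1, 1], [0, 1, 1, 0])

print(ccw)       # 1.0
print(cw)        # -1.0
print(abs(cw))   # 1.0, the direction-independent area
```

Taking the absolute value, as suggested above, removes the dependence on traversal direction.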
This is a completely different answer based on the updated question. I am leaving the old answer because the question now refers to it.
The question now gives a little more information about the data ("There are about twice as many FALSE than TRUE") so I have made an updated bogus data set to reflect that.
TestData = data.frame(x = c(rnorm(100, -1, 1), rnorm(200, 1, 1)),
y = c(rnorm(100, 1, 1), rnorm(200, -1,1)),
z = rep(c(TRUE,FALSE), c(100,200)))
The problem is now to find regions where the density of TRUE and FALSE are approximately equal. The question asked for a rectangular region, but at least for this data, that will be difficult. We can get a good visualization to see why.
We can use the function kde2d from the MASS package to get the 2-dimensional density of the TRUE points and the FALSE points. If we take the difference of these two densities, we need only find the regions where the difference is near zero. Once we have this difference in density, we can visualize it with a contour plot.
library(MASS)
Grid1 = kde2d(TestData$x[TestData$z], TestData$y[TestData$z],
    lims = c(c(-3,3), c(-3,3)))
Grid2 = kde2d(TestData$x[!TestData$z], TestData$y[!TestData$z],
    lims = c(c(-3,3), c(-3,3)))
GridDiff = Grid1
GridDiff$z = Grid1$z - Grid2$z
filled.contour(GridDiff, color = terrain.colors)
In the plot it is easy to see that there are far more TRUE than FALSE points near (-1,1) and more FALSE than TRUE points near (1,-1). We can also see that the places where the difference in density is near zero lie in a narrow band in the general area of the line y=x. You might be able to get a box where a region with more TRUEs is balanced by a region with more FALSEs, but the regions where the densities are the same are small.
Of course, this is for my bogus data set which probably bears little relation to your real data. You could perform the same sort of analysis on your data and maybe you will be luckier with a bigger region of near equal densities.

Interpolation in a distorted box

I want to interpolate in a distorted box. We have 8 points at the corners of the distorted box (p0, p1, p2, ..., p7). If we find the transformation matrix that maps this box to a box with corners ((0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 0)), the interpolation can be done easily. In other words, if we find a transformation from the distorted box to a normal box whose length, width and height are all equal to 1, the interpolation becomes very simple. Does anyone have an idea about interpolating in a distorted box, or about finding a transformation from a distorted box to a normal box?
Not answering the original question since in the comment you said that you simply wanted to interpolate a function inside the cube, using the values at the 8 vertices.
So in order to do that, you can reason as follows:
1) Split the cube in 6 tetrahedra
2) Find the tetrahedron that contains the point you want to interpolate
3) An irregular tetrahedron can be easily mapped to a regular one, that is you can easily obtain the generalized tetrahedral coordinates of a point. Check eq. 9-11 here.
4) Once you have the tetrahedral coordinates of your point, the interpolation is trivial (see previous link).
This is the easiest way I can think of. The big downside is that there are 13 ways to split a cube into tetrahedra, and this choice will produce (slightly) different results, especially if the cube is heavily deformed. You should aim for a Delaunay tetrahedralization of the cube to minimize this effect.
Also notice that the interpolated function defined in this way is continuous across the faces of the tetrahedra (but not differentiable).
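Steps 2 to 4 above can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the tetrahedral (barycentric) coordinates are computed as ratios of signed volumes, which is a standard formulation but may differ in notation from the equations linked above, and the splitting of the cube into tetrahedra (step 1) is not shown.

```python
def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def signed_volume(p0, p1, p2, p3):
    return det3(sub(p1, p0), sub(p2, p0), sub(p3, p0)) / 6.0


def barycentric(p, verts):
    """Tetrahedral coordinates of p: each weight is the volume obtained by
    replacing one vertex with p, divided by the full volume."""
    v0, v1, v2, v3 = verts
    total = signed_volume(v0, v1, v2, v3)
    return (signed_volume(p, v1, v2, v3) / total,
            signed_volume(v0, p, v2, v3) / total,
            signed_volume(v0, v1, p, v3) / total,
            signed_volume(v0, v1, v2, p) / total)


def contains(p, verts, eps=1e-12):
    """Step 2: p lies in the tetrahedron iff all weights are non-negative."""
    return all(w >= -eps for w in barycentric(p, verts))


def interpolate_in_tetrahedron(p, verts, values):
    """Step 4: weighted sum of the vertex values."""
    return sum(w * f for w, f in zip(barycentric(p, verts), values))


verts = ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
values = (1.0, 2.0, 3.0, 4.0)  # samples of f(x, y, z) = 1 + x + 2y + 3z
p = (0.25, 0.25, 0.25)

print(contains(p, verts))
print(interpolate_in_tetrahedron(p, verts, values))  # close to 2.5
```

Because the interpolation is linear inside each tetrahedron, it reproduces a linear function such as the one used for `values` up to rounding.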
You can apply the inverse of a scaling matrix to the cube, where vx, vy and vz are the cube's spatial extents.

How to calculate azimut & elevation relative to a camera direction of view in 3D ...?

I'm a bit rusty here.
I have a vector (camDirectionX, camDirectionY, camDirectionZ) that represents my camera direction of view.
I have a (camX, camY, camZ) that is my camera position.
Then, I have an object placed at (objectX, objectY, objectZ)
How can I calculate, from the camera's point of view, the azimuth & elevation of my object?
The first thing I would do, to simplify the problem, is transform the coordinate space so the camera is at (0, 0, 0) and pointing straight down one of the axes (so the direction is say (0, 0, 1)). Translating so the camera is at (0, 0, 0) is pretty trivial, so I won't go into that. Rotating so that the camera direction is (0, 0, 1) is a little trickier...
One way of doing it is to construct the full orthonormal basis of the camera, then stick that in a rotation matrix and apply it. The "orthonormal basis" of the camera is a fancy way of saying the three vectors that point forward, up, and right from the camera. They should all be at 90 degrees to each other (which is what the ortho bit means), and they should all be of length 1 (which is what the normal bit means).
You can get these vectors with a bit of cross-product trickery: the cross product of two vectors is perpendicular (at 90 degrees) to both.
To get the right-facing vector, we can just cross-product the camera direction vector with (0, 1, 0) (a vector pointing straight up). You'll need to normalise the vector you get out of the cross-product.
To get the up vector of the camera, we can cross product the right-facing vector we just calculated with the camera direction vector (in that order, since swapping the operands flips the sign). Assuming both input vectors are normalised, this shouldn't need normalising.
We now have the orthonormal basis of the camera. If we stick these vectors into the rows of a 3x3 matrix, we get a rotation matrix that will transform our coordinate space so the camera is pointing straight down one of the axes (which one depends on the order you stick the vectors in).
It's now fairly easy to calculate the azimuth and elevation of the object.
To get the azimuth, just do an atan2 on the x/z coordinates of the object.
To get the elevation, project the object coordinates onto the x/z plane (just set the y coordinate to 0), then do:
acos(dot(normalise(object coordinates), normalise(projected coordinates)))
This will always give a positive angle -- you probably want to negate it if the object's y coordinate is less than 0.
The code for all of this will look something like:
fwd = vec3(camDirectionX, camDirectionY, camDirectionZ)
cam = vec3(camX, camY, camZ)
obj = vec3(objectX, objectY, objectZ)
# if fwd is already normalised you can skip this
fwd = normalise(fwd)
# translate so the camera is at (0, 0, 0)
obj -= cam
# calculate the orthonormal basis of the camera
right = normalise(cross(fwd, (0, 1, 0)))
up = cross(right, fwd)
# rotate so the camera is pointing straight down the z axis
# (this is essentially a matrix multiplication)
obj = vec3(dot(obj, right), dot(obj, up), dot(obj, fwd))
azimuth = atan2(obj.x, obj.z)
proj = vec3(obj.x, 0, obj.z)
elevation = acos(dot(normalise(obj), normalise(proj)))
if obj.y < 0:
    elevation = -elevation
One thing to watch out for is that the cross-product of your original camera vector with (0, 1, 0) will return a zero-length vector when your camera is facing straight up or straight down. To fully define the orientation of the camera, I've assumed that it's always "straight", but that doesn't mean anything when it's facing straight up or down -- you need another rule.
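For reference, the approach above can be written as runnable Python using only the standard library. This is a sketch under assumptions: the helper names and the `world_up` parameter are hypothetical, and the elevation is computed with `atan2(y, hypot(x, z))` instead of the `acos` of a dot product. The two give the same signed angle, but the `atan2` form avoids normalising a zero-length projection when the object is directly above or below the camera. The sign of the azimuth depends on the handedness of the coordinate system.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalise(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        # happens when the camera looks straight up or down
        raise ValueError("cannot normalise a zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)


def azimuth_elevation(cam, fwd, obj, world_up=(0.0, 1.0, 0.0)):
    """Return (azimuth, elevation) in radians of obj as seen from cam."""
    fwd = normalise(fwd)
    rel = sub(obj, cam)                       # translate camera to origin
    right = normalise(cross(fwd, world_up))   # camera basis vectors
    up = cross(right, fwd)
    x, y, z = dot(rel, right), dot(rel, up), dot(rel, fwd)
    azimuth = math.atan2(x, z)
    elevation = math.atan2(y, math.hypot(x, z))  # signed, no acos needed
    return azimuth, elevation


az, el = azimuth_elevation((1.0, 2.0, 3.0), (0.0, 0.0, 1.0), (1.0, 3.0, 4.0))
print(math.degrees(az), math.degrees(el))  # about 0 and 45 degrees
```

As noted above, a camera pointing straight up or down needs a different rule; in this sketch that case raises `ValueError`.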

Scipy - data interpolation from one irregular grid to another irregular spaced grid

I am struggling with the interpolation between two grids, and I couldn't find an appropriate solution for my problem.
I have 2 different 2D grids, whose node points are defined by their X and Y coordinates. The grid itself is not rectangular, but forms more or less a parallelogram (so the X-coordinate for (i,j) is not the same as for (i,j+1), and the Y coordinate of (i,j) is different from the Y coordinate of (i+1,j)).
Both grids have a 37*5 shape and they overlap almost entirely.
For the first grid I have, for each point, the X-coordinate, the Y-coordinate and a pressure value. Now I would like to interpolate this pressure distribution from the first grid onto the second grid (for which X and Y are also known at each point).
I tried different interpolation methods, but my end result was never correct due to the irregular distribution of my grid points.
Functions such as interp2d or griddata require a 1D array as input, but if I do this, the interpolated solution is wrong (even if I interpolate the pressure values from the original grid back onto the original grid, the new pressure values are miles away from the original values).
For 1D interpolation on different irregular grids I use:
import numpy

def interpolate(X, Y, xNew):
    # X and Y are numpy arrays; X must be sorted in increasing order
    if xNew < X[0]:
        print('Interp Warning :', xNew, 'is under the interval [', X[0], ',', X[-1], ']')
        yNew = Y[0]
    elif xNew > X[-1]:
        print('Interp Warning :', xNew, 'is above the interval [', X[0], ',', X[-1], ']')
        yNew = Y[-1]
    elif xNew == X[-1]:
        yNew = Y[-1]
    else:
        # index of the interval [X[ind], X[ind+1]) that contains xNew
        ind = numpy.argmax(numpy.bitwise_and(X[:-1] <= xNew, X[1:] > xNew))
        yNew = Y[ind] + ((xNew - X[ind]) / (X[ind+1] - X[ind])) * (Y[ind+1] - Y[ind])
    return yNew
but for 2D I thought griddata would be easier to use. Does anyone have experience with an interpolation where my input is a 2D array for the mesh and for the data?
Have another look at interp2d. http://docs.scipy.org/scipy/docs/scipy.interpolate.interpolate.interp2d/#scipy-interpolate-interp2d
Note the second example in the 'x,y' section under 'Parameters'. 'x' and 'y' are 1-D in a loose sense but they can be flattened arrays.
Should be something like this:
f = scipy.interpolate.interp2d([0.25, 0.5, 0.27, 0.58], [0.4, 0.8, 0.42,0.83], [3, 4, 5, 6])
znew = f(.25,.4)
print znew
[ 3.]
znew = f(.26,.41) # midway between (0.25,0.4,3) and (0.27,0.42,5)
print znew
[ 4.01945345] # Should be 4 - close enough?
I would have thought you could pass flattened 'xnew' and 'ynew' arrays to 'f()' but I couldn't get that to work. The 'f()' function would accept the row, column syntax though, which isn't useful to you. Because of this limitation with 'f()' you will have to evaluate 'znew' inside a loop; you might look at nditer for that. Also make sure that it does what you want when '(xnew,ynew)' is outside of the '(x,y)' domain.
