I'm working on some gridded temperature data, which I have categorised into a matrix where each cell can be one of two classes - let's say 0 or 1 for simplicity. For each class I want to calculate patch statistics, taking inspiration from FRAGSTATS, which is used in landscape ecology to characterise the shape and size of habitat patches.
For my purposes, a patch is a cluster of adjacent cells of the same class. Here's an example matrix, mat:
mat <-
matrix(c(0,1,0,
1,1,1,
1,0,1), nrow = 3, ncol = 3,
byrow = TRUE)
0 1 0
1 1 1
1 0 1
All the 1s in mat form a single patch (we'll ignore the 0s), and in order to calculate various different shape metrics I need to be able to calculate the perimeter (i.e. number of outside edges).
EDIT
Sorry I apparently can't post an image because I don't have enough reputation, but you can see in the black lines of G5W's answer below that the outside borders of 1's represent the outside edges I'm referring to.
Manually I can count that the patch of 1s has 14 outside edges and I know the area (i.e. number of cells) is 6. Based on a paper by He et al. and this other question I've figured out how to calculate the number of inside edges (5 in this example), but I'm really struggling to do the same for the outside edges! I think it's something to do with how the patch shape compares to the largest integer square that has a smaller area (in this case, a 2 x 2 square), but so far my research and pondering have been to no avail.
N.B. I am aware of the package SDMTools, which can calculate various FRAGSTATS metrics. Unfortunately the metrics returned are too processed e.g. instead of just Aggregation Index, I need to know the actual numbers used to calculate it (number of observed shared edges / maximum number of shared edges).
This is my first post on here so I hope it's detailed enough! Thanks in advance :)
If you know the area and the number of inside edges, it is simple to calculate the number of outside edges. Every cell has four edges, so naively the total number of edges is 4 * area. But that is not quite right, because every inside edge is shared between two cells and gets counted twice. So the right number of distinct edges is
4*area - inside
The number of outside edges is the total edges minus the inside edges, so
outside = total - inside = (4*area - inside) - inside = 4*area - 2*inside.
You can see that the area is made up of 6 squares each of which has 4 sides. The inside edges (the red ones) are shared by two adjacent squares.
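In code the whole calculation is only a few lines; here is a plain-Python sketch (treating all 1s as a single patch, as in the example; with several patches you would label connected components first):

```python
# example matrix from the question
mat = [[0, 1, 0],
       [1, 1, 1],
       [1, 0, 1]]

cls = 1
rows, cols = len(mat), len(mat[0])

# area: number of cells of the class
area = sum(row.count(cls) for row in mat)

# inside edges: horizontally or vertically adjacent same-class pairs
inside = sum(mat[r][c] == cls and mat[r][c + 1] == cls
             for r in range(rows) for c in range(cols - 1))
inside += sum(mat[r][c] == cls and mat[r + 1][c] == cls
              for r in range(rows - 1) for c in range(cols))

# each cell contributes 4 edges; every inside edge is shared twice
outside = 4 * area - 2 * inside
# area = 6, inside = 5, outside = 14 for this matrix
```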
Related
Essentially I have spatial data in which I go through each point and figure out which surrounding points lie within a circle of some radius. I then want to arrange those points in a clockwise manner, and I have managed to do that for "most" cases. The unique feature of this data is that, with how I defined my circle's radius, there are only six possible locations that can surround any center point: [top-left, top-right, right, bottom-right, bottom-left, left]
So as a sample data
Center Point: 161.3861 368.8119
col row
1 164.5365 363.4114
2 155.2205 368.7669
3 167.5968 368.8569
4 158.2358 374.1674
5 164.4465 374.2124
6 158.3258 363.3663
The function would then output [4, 5, 3, 1, 6, 2], which is the clockwise order. This sub-sample [highlighted in red, with the center remaining black] of the data looks like this. [To be clear, I have this case working]
But you can imagine that it is not exactly straightforward for the various corner cases. For instance, the following case has no point to the right of it, so in the final output array there should be a zero at the "right, top-right, top-left" indices of the array I described earlier.
What I am struggling with is a systematic way to go through the corner cases and assign labels to the missing points. I tried using a dot-product approach to quantify how close the points are to each other (using a normal vector pointing straight up), but this led to issues discriminating between top-right and right. I imagine that by checking whether a line goes through the point we could get a sense of which axis the point lies on, but I have not managed to make it work. To summarize, the two main corner cases are
Edge points
Island points
You could write a function to tell you which direction a point is in, given the point and the center-point:
Pseudo-code:
direction_vector = point - center_point
angle = atan2(direction_vector.y, direction_vector.x)
direction_index = round((angle * 12 / TWO_TIMES_PI) + 12) % 12
This will give you an index from 0 to 11 (imagine hours on a bizarro clock face that goes anti-clockwise from 0, with 0 on the right where 3 o'clock is on a normal clock). The rounding snaps points that are not exactly on the hex grid to the nearest sector.
Now map this onto your directions, with 1 being top-left, 2 being top-right, 3 being right, etc.:
direction = (((16 - direction_index) // 2) % 6) + 1
where // is integer division and % is modulo.
Now that you have the directions, iterate from 1 to 6 and output the array index of your point that has the corresponding direction index, or 0 if there isn't one (assuming 1-based array indexing).
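Putting the two formulas together, a quick Python sketch (assuming y increases upward; if your row coordinate grows downward, negate dy):

```python
import math

def direction_label(point, center):
    """Label a neighbour 1..6: 1=top-left, 2=top-right, 3=right,
    4=bottom-right, 5=bottom-left, 6=left."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    angle = math.atan2(dy, dx)
    # 12 sectors of 30 degrees, anti-clockwise from east; the six hex
    # directions land on the even sectors 0, 2, 4, 6, 8, 10
    sector = round(angle * 12 / (2 * math.pi)) % 12
    return ((16 - sector) // 2) % 6 + 1

def order_neighbors(center, neighbors):
    """Place each neighbour's (1-based) index into its direction slot,
    with 0 meaning 'no point in that direction'."""
    slots = [0] * 6
    for idx, p in enumerate(neighbors, start=1):
        slots[direction_label(p, center) - 1] = idx
    return slots
```

For an edge point with neighbours only to the right and left, `order_neighbors((0, 0), [(1, 0), (-1, 0)])` returns `[0, 0, 1, 0, 0, 2]`: the zeros fill the missing directions automatically.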
Your points are arranged in a hexagonal grid. When you consider the immediate neighbors, you can easily classify them in an absolute way by comparing the coordinates of the centers to three straight lines.
Then sort by index.
What about adding dummy points so that every point has six neighbors? Then, when you enumerate the neighbors in the desired order, you just skip the dummy ones.
Depending how your data structures are organized, you could truly add points to the point set, or add them "virtually" when you process a given point.
Given a set of N points, what is the maximum number of directed graphs that can be created? I'm having trouble with the isomorphism problem.
Edit (1): Only simple directed graphs without self-loops; the graphs are not required to be connected.
Edit (2): Any point in this set is treated equally to the others, so the main problem here is to calculate and subtract the number of isomorphic graphs created from different sets of edges.
The number of unlabeled directed graphs with n vertices is given by OEIS A000273:
1, 1, 3, 16, 218, 9608, 1540944, 882033440, 1793359192848
There is no closed formula; an approximate value is the number of labeled graphs divided by the number of vertex permutations:
2^(n*(n-1)) / n!
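As a rough Python check (OEIS values hard-coded from the list above), the approximation improves as n grows, because almost all digraphs have no nontrivial symmetry:

```python
from math import factorial

# unlabeled simple digraphs without self-loops, OEIS A000273, n = 0..8
a000273 = [1, 1, 3, 16, 218, 9608, 1540944, 882033440, 1793359192848]

def approx(n):
    """Labeled digraphs on n vertices divided by the n! vertex permutations."""
    return 2 ** (n * (n - 1)) / factorial(n)

# ratio of approximation to true count climbs toward 1
ratios = [approx(n) / a000273[n] for n in range(4, 9)]
# e.g. about 0.78 at n=4 but above 0.99 by n=8
```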
There are n-1 possible edges for each node, so a total of n(n-1) edges.
Each possible graph will either contain a particular edge, or it won't.
So the number of possible graphs is 2^(n(n-1)).
EDIT: This only applies under the assumption there are no loops and each edge is unique.
Looping is basically coming back to the same node again, so I'm considering that double-headed arrows are not allowed. Now, if there are n nodes available, the graphs you make without loops can have n-1 edges. Let m be the number of homeomorphic graphs you can make out of n nodes, and let s_i be the number of symmetries present in the i-th of those m homeomorphic graphs. These symmetries are the kind we study in group theory for geometric figures. Now, we know every edge can have 2 states, i.e. left head and right head.
So the total number of distinct directed graphs can be given as:
Note: if these symmetries were not present, it would simply have been m*2^(n-1).
Edit (1): Also, this is valid for connected graphs with n nodes. If you want to include graphs that don't need to be connected, then you'll have to modify this equation, e.g. by counting the ways to partition the n nodes into smaller components and applying the formula to each of those combinations.
Permutations and combinations, group theory, symmetries, partitions: overall it's messy, so this was the simplest way I could put it.
I cannot wrap my mind around reading the plots generated by coplot().
For example from the help(coplot)
## Tonga Trench Earthquakes
coplot(lat ~ long | depth, data = quakes)
What do the gray bars above represent? Why are there two rows of lat/long boxes?
How do I read this graph?
I can shed some more light on the second chart's interpretation. The gray bars for both mag and depth represent intervals of their respective variables. Andy gave a nice description of how they are created above.
When you are reading them keep in mind that they are meant to show you the range of the observations for the respective conditioning variable (mag or depth) represented in each column or row. Therefore, in Andy's example the largest mag bar is just showing that the topmost row contains observations for earthquakes of approx. 4.6 to 7. It makes sense that this bar is the largest, since as Andy mentioned, they are created to have roughly similar numbers of observations and stronger earthquakes are not as common as weaker ones. The same logic holds true for depth where a larger range of depths was required to get a roughly proportional number of observations.
Regarding reading the chart, you would read the columns as representing the three depth groups (left to right) and the rows as representing the four mag groups (bottom to top). Thus, as you read up the chart you're progressively slicing the data into groups of observations with increasing magnitudes. So, for example, the bottom row represents earthquakes with magnitudes of 4 to 4.5 with each column representing a different range of depths. Similarly, you read the columns as holding depth constant while allowing you to see various ranges of magnitudes.
Putting it all together, as mentioned by Andy, we can see that as we read up the rows (progressing up in magnitude) the distribution of earthquakes remains relatively unchanged. However, when reading across the columns (progressing up in depth) we see that the distribution does slightly change. Specifically, the grouping of quakes on the right, between longitudes 180 and 185, grows tighter and more clustered towards the top of the cell.
This is a method for visualizing interactions in your dataset. More specifically, it lets you see how some set of variables are conditional on some other set of variables.
In the example given, you're asking to visualize how lat and long vary with depth. Because you didn't specify number, and the formula indicates you're interested in only one conditioning variable, the function assumes you want number=6 depth cuts (passed to co.intervals, which tries to make the number of data points approximately equal within each interval) and simply maximizes the data-to-ink ratio by stacking individual plot frames. The value of depth increases to the right, starting with the lowest row and moving up (hence the top-right frame represents the largest depth interval). You can set rows or columns to change this behavior, e.g.:
coplot(lat ~ long | depth, data = quakes, columns=6)
but I think the power of this tool becomes more apparent when you inspect two or more conditioning variables. For example:
coplot(lat ~ long | depth * mag, data = quakes, number=c(3,4))
gives a rich view of how earthquakes vary in space, and demonstrates that there is some interaction with depth (the pattern changes from left to right), and little-to-no interaction with magnitude (the pattern does not change from top to bottom).
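The equal-count idea behind co.intervals can be imitated in Python (synthetic data standing in for quakes$depth, and ignoring co.intervals' overlap option):

```python
import random

random.seed(0)
# skewed synthetic depths: a stand-in for quakes$depth
depth = sorted(random.expovariate(1 / 200) for _ in range(1200))

# split the sorted values into 6 equal-count groups; each group's
# (min, max) is one conditioning interval
k, n = 6, len(depth)
groups = [depth[i * n // k:(i + 1) * n // k] for i in range(k)]
intervals = [(g[0], g[-1]) for g in groups]
# every interval holds 200 observations, but the skew makes the deep-end
# intervals much wider, which is exactly why the gray bars differ in length
```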
Finally, I would highly recommend reading Cleveland's Visualizing Data -- a classic text.
I have the vertices of a non-self-intersecting polygon in 2-D where the x-coordinate is centred longitude and y-coordinate is centred latitude. I want to find the edges of the polygon.
I can plot the vertices, see which vertices are neighbouring, and so see the edges. But my question is how I can get these edges programmatically.
For example, I am considering the sample data:
> data1
vertices lon lat
5 1.133179 1.027886
4 1.094459 1.013952
2 1.055672 1.000000
1 1.000000 1.028578
3 1.038712 1.042541
6 1.116241 1.070438
A sample plot of the points is shown below.
I want to have an array like this
>edges
ind1 ind2
[1,] 5 6
[2,] 1 3
[3,] 3 6
[4,] 1 2
[5,] 2 4
[6,] 4 5
I am interested in this kind of polygon shape (with minimum area).
I got this array by using the function ashape from the R package alphahull. But that function uses Euclidean distance between points, which is not applicable in my case: since I am considering data in (lon, lat), we could instead use the distHaversine distance function from the geosphere package. The function also gives unsatisfactory results when the polygon has a large number of vertices and a complex shape. The polygon may or may not be convex.
Now all I want is an algorithm to find the edges of the non-intersecting polygon with minimum area.
Any help in this direction will be gratefully appreciated.
Algorithm for finding all possible polygons:
Generate the convex hull.
Note that any non-intersecting polygon must traverse its convex hull in order.
Start with any point on the convex hull.
Generate a list of paths from that point to each interior point, and to the next adjacent point on the convex hull.
Recursively extend each path to each remaining interior point, as well as to the first free point on the convex hull.
For each segment added to a path, reject the path if it self-intersects.
I'm not going to post the code, but here are all 67 possible polygons for a random set of 8 points.
As one can imagine, the set of results blows up quickly with the number of points (e.g. n=12 -> ~10000 polygons).
Here are the polygons with the minimum and maximum perimeters.
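For comparison, a brute-force Python sketch of the same count: it tests every cyclic order of the points for self-intersection instead of the hull-guided recursion above, so it is only workable for tiny point sets:

```python
from itertools import permutations

def _cross(o, a, b):
    # z-component of the cross product (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _proper_crossing(p, q, r, s):
    """True if segments pq and rs cross at an interior point of both."""
    return (_cross(p, q, r) * _cross(p, q, s) < 0 and
            _cross(r, s, p) * _cross(r, s, q) < 0)

def is_simple(poly):
    """No two non-adjacent edges of the closed polygon may cross."""
    n = len(poly)
    edges = [(poly[i], poly[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if (i + 1) % n == j or (j + 1) % n == i:
                continue  # adjacent edges share an endpoint
            if _proper_crossing(*edges[i], *edges[j]):
                return False
    return True

def count_simple_polygons(points):
    """Count distinct simple polygons visiting all points (brute force)."""
    first, rest = points[0], points[1:]
    seen, count = set(), 0
    for perm in permutations(rest):
        if (first,) + perm in seen:
            continue
        seen.add((first,) + perm[::-1])  # a reversed tour is the same polygon
        if is_simple((first,) + perm):
            count += 1
    return count
```

For four points with one inside the hull triangle, `count_simple_polygons([(0, 0), (4, 0), (0, 4), (1, 1)])` gives 3: one polygon per hull edge the interior point can be spliced into.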
convert points from lon,lat to Cartesian x,y or x,y,z
use spherical or ellipsoidal surface
if the size is small enough you can project (x,y,z) to local surface plane to avoid 3D computing
you can also use lon,lat as x,y, but make sure there is no zero crossing (if there is, offset that axis by some value until there isn't)
now there are many possible strategies
you did not provide any rule for the shape
so I assume a 'minimal' perimeter/size/area and a generic concave polygon
you cannot go directly to the edge lines before you know where the inside and the outside are
I would do this task like this: find polygon based on find holes in 2D point set
modification 1
as you already have all the edge points (at least that is my impression)
you can set a flag for each point from the above algorithm
that tells you which side is the inside and which the outside of the polygon
for example take 8 directions (N,NE,E,...) and encode which way is filled and which is empty
then at each edge point start in the middle of the empty directions
and find the 2 closest lines to it (in angular terms) that do not intersect any previous line
and if more are available use the smallest ones
gray means inside polygon
make list of all such possible lines (2 per point)
then search for connected loops
beware: this modification is not 100% error-proof (I do not think that is even possible for a concave polygon)
modification 2
use the complete polygon from bullet 2
and try to match its edge points to your input edge points
then use the edge lines as in the original polygon but with your new points
if some points are skipped, then find the closest edge line and divide it at this point
this should be safer and more accurate than bullet 3
simple approach
if the above is too much for you, then:
create a list of all possible lines
sort them by size, ascending
remove all 'long' lines that intersect any 'short' line
what counts as short or long depends on you
for example, the first third of lines can be the short ones and the last third the long ones
or take the average size and use what is < 0.6*avg_size or > 1.2*avg_size ...
or if you have N points, then the first 2N lines are short and the rest are long (2 lines per point)
test all and select the best option for you ...
try to find joined lines
find only lines that are connected once (no more than 2 lines per point)
remove them from the list into the final solution list
after this you will have a list of possible lines and a list of found lines
remove all lines from the possible lines that intersect any line in the found lines
this should remove any irrelevant lines
try to find connections again
take the first possible line; if a connection is found, move it to the solution list
and go to bullet 5
if none is found, continue with the next line ...
stop if no lines are left or no connection is found.
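The first three steps of the simple approach can be sketched in Python as a single greedy pass: keep a candidate line only if it crosses no shorter line kept before it:

```python
from itertools import combinations

def _cross(o, a, b):
    # z-component of the cross product (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _crosses(p, q, r, s):
    """True if segments pq and rs cross at an interior point of both."""
    return (_cross(p, q, r) * _cross(p, q, s) < 0 and
            _cross(r, s, p) * _cross(r, s, q) < 0)

def greedy_short_lines(points):
    """All candidate segments sorted by length; keep each one only if it
    does not cross an already-kept (i.e. shorter) segment."""
    segments = sorted(combinations(points, 2),
                      key=lambda s: (s[0][0] - s[1][0]) ** 2 +
                                    (s[0][1] - s[1][1]) ** 2)
    kept = []
    for p, q in segments:
        if not any(_crosses(p, q, r, s) for r, s in kept):
            kept.append((p, q))
    return kept
```

The result is a crossing-free set of short lines (roughly a triangulation); extracting the actual polygon loop from it is what the "find joined lines" steps afterwards are for.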
I am new to this forum and not a native English speaker, so please be nice! :)
Here is the challenge I face at the moment:
I want to calculate the (approximate) relative coordinates of yet-unknown points in 3D Euclidean space, based on a set of given distances between pairs of points.
In my first approach I want to ignore possible multiple solutions, just taking one at random.
e.g.:
Given set of distances (I think it creates a pyramid with a right-angled triangle as a base):
P1-P2-Distance
1-2-30
2-3-40
1-3-50
1-4-60
2-4-60
3-4-60
Step 1:
Now, how do I calculate the relative coordinates for those points?
I figured that the first point goes to (0,0,0), so the second one is (30,0,0).
After that, the third point can be calculated by finding the intersection of the two circles around points 1 and 2, with radii equal to their distances to point 3 (50 and 40 respectively). How do I do that mathematically? (I only took these simple numbers for an easy representation of the situation in my mind.) Besides, I do not know how to show in a mathematically correct way that the third point is at (30,40,0) (or (30,0,40), but I will ignore that).
But getting the fourth point is not as easy as that. I thought I would have to use 3 spheres and calculate their intersection to get the point, but how do I do that?
Step 2:
After I have figured out how to calculate this "simple" example, I want to use more unknown points... For each point there is a minimum of 1 given distance to another point to "link" it to the others. If the coordinates cannot be calculated uniquely because of remaining degrees of freedom, I want to ignore all possibilities except one chosen randomly, but consistent with the known distances.
Step 3:
Now the final stage should be this: each measured distance is a bit imprecise, as in real life. So if there is more than one distance for a given pair of points, the distances are averaged. But due to the imprecise distances there can be difficulty in determining the exact (relative) location of a point, so I want to average the different possible locations into the "optimal" one.
Can you help me going through my challenge step by step?
You need to use trigonometry - specifically, the 'cosine rule'. This will give you the angles of the triangle, which lets you solve for the 3rd and 4th points.
The rule states that
c^2 = a^2 + b^2 - 2abCosC
where a, b and c are the lengths of the sides, and C is the angle opposite side c.
In your case, c = 50 is the side 1-3, so C is the angle opposite it - the angle at point 2 between the lines 2-1 and 2-3. It's going to be 90 degrees because you have a 3-4-5 triangle, but let's prove it:
50^2 = 30^2 + 40^2 - 2*30*40*CosC
CosC = 0
C = 90 degrees
This is the angle between the lines (30,0,0)-(0,0,0) and (30,0,0)-point 3; extend along that second line the length of side 2-3 (which is 40) and you'll get your third point, (30,40,0), matching what you found by hand.
Finding your 4th point is slightly trickier. The most straightforward approach I can think of is to first find the (x,y) component of the point; from there the z component is straightforward using Pythagoras'.
Consider that there is a point on the (x,y,0) plane which sits directly 'below' your point 4 - call this point 5. You can now create 3 right-angled triangles: 1-5-4, 2-5-4, and 3-5-4.
You know the lengths of 1-4, 2-4 and 3-4. Because these triangles are all right-angled at point 5 and share the same height z, each in-plane distance satisfies
i-5^2 = i-4^2 - z^2
Subtracting these equations pairwise eliminates z, e.g. 1-5^2 - 2-5^2 = 1-4^2 - 2-4^2, which gives linear constraints on the (x,y) position of point 5.
I'll leave you to solve for point 5. Once you have this point, you can easily solve the z component of 4 using Pythagoras': the length 4-5 is the z component, with 4-5^2 = 1-4^2 - 1-5^2.
I'll initially assume you know the distances between all pairs of points.
As you say, you can choose one point (A) as the origin, orient a second point (B) along the x-axis, and place a third point (C) along the xy-plane. You can solve for the coordinates of C as follows:
given: distances ab, ac, bc
assume
A = (0,0)
B = (ab,0)
C = (x,y) <- solve for x and y, where:
ac^2 = (A-C)^2 = (0-x)^2 + (0-y)^2 = x^2 + y^2
bc^2 = (B-C)^2 = (ab-x)^2 + (0-y)^2 = ab^2 - 2*ab*x + x^2 + y^2
-> bc^2 - ac^2 = ab^2 - 2*ab*x
-> x = (ab^2 + ac^2 - bc^2)/(2*ab)
-> y = +/- sqrt(ac^2 - x^2)
For this to work accurately, you will want to avoid cases where the points {A,B,C} are in a straight line, or close to it.
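As a quick Python check of the formulas, using the 30/40/50 triangle from the question (so ab = 30, ac = 50, bc = 40):

```python
import math

def place_third_point(ab, ac, bc):
    """Place A at the origin and B on the x-axis, then solve for C = (x, y).
    Returns the y >= 0 solution; (x, -y) is the mirror image."""
    x = (ab ** 2 + ac ** 2 - bc ** 2) / (2 * ab)
    y = math.sqrt(max(ac ** 2 - x ** 2, 0.0))  # clamp tiny rounding negatives
    return (0.0, 0.0), (float(ab), 0.0), (x, y)

# distances from the question: 1-2 = 30, 1-3 = 50, 2-3 = 40
A, B, C = place_third_point(30, 50, 40)
# C comes out at (30.0, 40.0), matching the hand calculation
```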
Solving for additional points in 3-space is similar -- you can expand the Pythagorean formula for the distance, cancel the quadratic elements, and solve the resulting linear system. However, this does not directly help you with your steps 2 and 3...
Unfortunately, I don't know a well-behaved exact solution for steps 2 and 3, either. Your overall problem will generally be both over-constrained (due to conflicting noisy distances) and under-constrained (due to missing distances).
You could try an iterative solver: start with a random placement of all your points, compare the current distances with the given ones, and use that to adjust your points in such a way as to improve the match. This is an optimization technique, so I would look up books on numerical optimization.
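A minimal sketch of such an iterative solver in Python: a naive spring relaxation with random restarts, not a production optimizer. The distance table is the pyramid from the question:

```python
import math
import random

def relax(points, dists, iters=4000, step=0.1):
    """Repeatedly nudge each constrained pair toward its target distance.
    points: list of [x, y, z]; dists: {(i, j): measured distance}."""
    for _ in range(iters):
        for (i, j), target in dists.items():
            d = [points[j][k] - points[i][k] for k in range(3)]
            length = math.sqrt(sum(c * c for c in d)) or 1e-9
            err = (length - target) / length
            for k in range(3):
                points[i][k] += step * err * d[k]
                points[j][k] -= step * err * d[k]
    return points

# the pyramid from the question, with 0-based point indices
dists = {(0, 1): 30, (1, 2): 40, (0, 2): 50,
         (0, 3): 60, (1, 3): 60, (2, 3): 60}

# random restarts guard against the occasional bad local minimum
random.seed(1)
best, best_err = None, float("inf")
for attempt in range(5):
    pts = [[random.uniform(0, 50) for _ in range(3)] for _ in range(4)]
    pts = relax(pts, dists)
    err = max(abs(math.dist(pts[i], pts[j]) - t) for (i, j), t in dists.items())
    if err < best_err:
        best, best_err = pts, err
# best now holds coordinates consistent with the table, up to rotation,
# translation and mirroring
```

Averaging repeated measurements of the same pair (your step 3) would happen before building `dists`.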
If you know the distances between the nodes (the fixed part of the system) and the distances to the tag (mobile), you can use trilateration to find the x,y position.
I have done this using the Nanotron radio modules which have a ranging capability.