Obtain the differential spectrum of each marine floating target and its background/neighborhood water in Google Earth Engine

How can I obtain the differential spectrum of each floating target (here, an algal patch), that is, the band values of each algal patch minus the band values of the adjacent water around it (for example, the median water spectrum)?
I first extract the floating algae from the sea; I can use NDVI, NDWI, etc. to extract the algae and their edges (see Fig. 1, where the algae are shown in the viridis palette). My goal is to get the difference between the spectra of the algae and the surrounding water, so I performed a buffer operation on the edges of the algae patches (see Fig. 2, yellow buffer); the buffer represents the background water around the algae. I have considered an object-based approach, but it is very memory-intensive and has limitations on patch size. Now I want to do it with pixel-based and morphological operations. How can I achieve this?
An alternative idea might be: fill the nodata values (the masked algae) from the neighbouring water in the image, then subtract the filled image from the original one to obtain the difference between the spectra of the algae and the surrounding water.
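A minimal sketch of that fill-and-subtract idea, written with R's terra package rather than in Earth Engine itself (the file name, the band names B4/B8, and the NDVI threshold are all hypothetical):
library(terra)
img <- rast("scene.tif")                                   # multiband reflectance image
ndvi <- (img[["B8"]] - img[["B4"]]) / (img[["B8"]] + img[["B4"]])
algae <- ndvi > 0.1                                        # hypothetical algae mask
water <- mask(img, algae, maskvalues = TRUE)               # keep only water pixels
# fill the algae gaps with the median of the surrounding water
# (9 x 9 moving window; widen it until every gap is filled)
filled <- focal(water, w = 9, fun = "median", na.rm = TRUE, na.policy = "only")
# differential spectrum: algae band values minus the neighbouring water
diff_spec <- mask(img - filled, algae, maskvalues = FALSE)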

Related

Point pattern classification with spatstat: what am I doing wrong?

I'm trying to classify bivariate point patterns into groups using spatstat. The patterns are derived from whole-slide images of lymph nodes with cancer. I've trained a neural network to recognize cells of three types (cancer cells “LP”, immune cells “bcell”, and all other cells). I do not wish to analyse the other cells, but I use them to construct a polygonal window in the shape of the lymph node. Thus, the patterns to be analysed are immune cells and cancer cells in polygonal windows. Each pattern can have several tens of thousands of cancer cells and up to 2 million immune cells. The patterns are of the “Small World Model” type, as there is no possibility of points lying outside the window.
My classification should be based on the position of the cancer cells in relation to the immune cells. E.g. most cancer cells lie on the “islands” of immune cells, but in some cases cancer cells are (seemingly) uniformly dispersed and there are only a few immune cells. In addition, the patterns are not always uniform across the node. As I'm rather new to spatial statistics, I developed a simple and crude method to classify the patterns. In short:
I calculated a kernel density of the immune cells with sigma=80 because this looked “nice” to me (should I have used e.g. sigma=bw.scott instead?):
Den <- density(split(cells)$"bcell", sigma = 80, window = cells$window)
Then I created a tessellation image by dividing the density range into 3 parts (here again, I experimented with the breaks to get some “good-looking” results).
rangesDenMax <- 2 * range(Den)[2] / 3
rangesDenMin <- range(Den)[2] / 3
map.breaks <- c(-Inf, rangesDenMin, rangesDenMax, Inf)
map.cuts <- cut(Den, breaks = map.breaks,
                labels = c("Low B-cell density", "Medium B-cell density", "High B-cell density"))
map.quartile <- tess(image = map.cuts, window = cells$window)
tessImage <- map.quartile
Here are some examples of plots of the tessellations with the cancer cells overlaid (white dots). The lymph node on the left has the typical uniformly distributed “islands” of immune cells, while the node on the right has only a few dense spots of immune cells, with cancer cells not restricted to those spots:
[Figure: heat map of immune-cell kernel density; white dots: cancer cells]
Then I measured a silly number of variables which should give me a clue about how the cancer cells are distributed across the tessellation tiles (the calculation code is trivial, so I post only the descriptions of my variables):
LPlwB <- c()  # proportion of cancer cells in the low-b-cell area
LPmdB <- c()  # proportion of cancer cells in the medium-b-cell area
LPhiB <- c()  # proportion of cancer cells in the high-b-cell area
AlwB <- c()   # proportion of the low-b-cell area
AmdB <- c()   # proportion of the medium-b-cell area
AhiB <- c()   # proportion of the high-b-cell area
LPm1 <- c()   # mean distance to the 1st neighbour
LPm2 <- c()   # mean distance to the 2nd neighbour
LPm3 <- c()   # mean distance to the 3rd neighbour
LPsd1 <- c()  # standard deviation of the distance to the 1st neighbour
LPsd2 <- c()  # standard deviation of the distance to the 2nd neighbour
LPsd3 <- c()  # standard deviation of the distance to the 3rd neighbour
meanQ <- c()  # mean quadrat count (I visually chose the quadrat size to be not too large and not too small)
sdevQ <- c()  # standard deviation of the quadrat count
hiSAT <- c()  # realised cancer-cell saturation in the high-b-cell area (number of cells observed divided by the number of cells that could fit into the area given the observed minimum distance between cells)
mdSAT <- c()  # realised cancer-cell saturation in the medium-b-cell area
lwSAT <- c()  # realised cancer-cell saturation in the low-b-cell area
ll <- c()     # proportion of LP neighbours of LP (contingency table count divided by total points)
lb <- c()     # proportion of b-cell neighbours of LP
bl <- c()     # proportion of b-cell neighbours of b-cells
bb <- c()     # proportion of LP neighbours of b-cells
I z-scaled the variables, inspected them on a PCA plot (the vectors pointed in different directions like the needles of a sea urchin) and performed a hierarchical cluster analysis. I chose k by calculating fviz_nbclust(scaled_variables, hcut, method = "silhouette"). After dividing the dendrogram into k clusters and checking the cluster stability, I ended up with my groups, which seemed to make sense, as cases with “islands” were separated from the “more dispersed” ones.
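A minimal sketch of those last steps (vars, the matrix of measured variables, and k = 3 are hypothetical placeholders):
library(factoextra)                    # provides fviz_nbclust() and hcut()
scaled_variables <- scale(vars)        # z-scale the measured variables
fviz_nbclust(scaled_variables, hcut, method = "silhouette")   # suggests k
hc <- hclust(dist(scaled_variables), method = "ward.D2")
groups <- cutree(hc, k = 3)            # k taken from the silhouette plot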
However, given the possibilities of the spatstat package I strongly feel like hitting nails into the wall with a smartphone.
It seems you are trying to quantify the way in which the cancer cells are positioned relative to the immune cells. You could do this by something like
Cancer <- split(cells)[["LP"]]
Immune <- split(cells)[["bcell"]]
Dimmune <- density(Immune, sigma=80)
f <- rhohat(Cancer, Dimmune)
plot(f)
Then f is a function that indicates the intensity (number per unit area) of cancer cells as a function of the density of immune cells. The plot shows the density of cancer cells on the vertical axis, against the density of immune cells on the horizontal axis.
If the graph of this function is flat, it means that the cancer cells are not paying attention to the density of immune cells. If the graph is steeply declining it means that cancer cells tend to avoid immune cells.
I suggest you first look at the plot of f for some example datasets to decide whether f has any ability to discriminate between spatial arrangements that you think should be classified as different. If so then you can use as.data.frame to extract the values of f and then use classical discriminant analysis (etc) to classify the slide images into groups.
Instead of density(Immune) you could use any other summary of the immune cells.
For example D <- distfun(Immune) would give you the distance to the nearest immune cell, and then f would compute the density of cancer cells as a function of the distance to nearest immune cell. And so on.
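For instance, a minimal sketch of that distance-based variant:
D <- distfun(Immune)    # distance to the nearest immune cell
g <- rhohat(Cancer, D)  # intensity of cancer cells vs. that distance
plot(g)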

Count outside edges of adjacent cells in a matrix in R

I'm working on some gridded temperature data, which I have categorised into a matrix where each cell can be one of two classes - let's say 0 or 1 for simplicity. For each class I want to calculate patch statistics, taking inspiration from FRAGSTATS, which is used in landscape ecology to characterise the shape and size of habitat patches.
For my purposes, a patch is a cluster of adjacent cells of the same class. Here's an example matrix, mat:
mat <- matrix(c(0, 1, 0,
                1, 1, 1,
                1, 0, 1),
              nrow = 3, ncol = 3, byrow = TRUE)
0 1 0
1 1 1
1 0 1
All the 1s in mat form a single patch (we'll ignore the 0s), and in order to calculate various different shape metrics I need to be able to calculate the perimeter (i.e. number of outside edges).
EDIT: Apparently I can't post an image because I don't have enough reputation, but the black lines in G5W's answer below show the outside edges I'm referring to.
Manually, I can count that the patch of 1s has 14 outside edges, and I know the area (i.e. the number of cells) is 6. Based on a paper by He et al. and this other question, I've figured out how to calculate the number of inside edges (5 in this example), but I'm really struggling to do the same for the outside edges! I thought it might have something to do with how the patch shape compares to the largest integer square of smaller area (in this case, a 2 x 2 square), but so far my research and pondering have been to no avail.
N.B. I am aware of the SDMTools package, which can calculate various FRAGSTATS metrics. Unfortunately, the metrics returned are too processed; e.g. instead of just the Aggregation Index, I need the actual numbers used to calculate it (number of observed shared edges / maximum number of shared edges).
This is my first post on here so I hope it's detailed enough! Thanks in advance :)
If you know the area and the number of inside edges, it is simple to calculate the number of outside edges. Every cell has four edges, so naively the total number of edges is 4 * area. But that is not quite right, because every inside edge is shared between two cells. So the right number of total edges is
4*area - inside
The number of outside edges is the total edges minus the inside edges, so
outside = total - inside = (4*area - inside) - inside = 4*area - 2*inside
You can see that the area is made up of 6 squares, each of which has 4 sides. The inside edges (the red ones in the figure) are shared by two adjacent squares.
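A minimal sketch that computes all three quantities directly from the matrix, assuming 4-neighbour (rook) adjacency as in the question:
count_edges <- function(m, class = 1) {
  cells <- m == class
  area <- sum(cells)
  # inside edges: horizontally and vertically adjacent pairs of same-class cells
  inside <- sum(cells[, -ncol(m)] & cells[, -1]) +
            sum(cells[-nrow(m), ] & cells[-1, ])
  c(area = area, inside = inside, outside = 4 * area - 2 * inside)
}
count_edges(mat)   # area 6, inside 5, outside 14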

How does adaptive.density() (spatstat) handle duplicated points, and what is the default f value?

I cannot find this information in the reference literature [1]:
1) How does adaptive.density() (package spatstat) handle duplicated spatial points? I have points duplicated at exactly the same position because I am combining measurements from different years, and I expect the density curve to be higher in those areas, but I am not sure about it.
2) Is the default value of f in adaptive.density() f=0 or f=1?
My guess is that it is f=0, so that it does an adaptive estimate by setting the intensity estimate at every location equal to the average intensity (the number of points divided by the window area).
Thank you for your time and input!
The default value of f is 0.1 as you can see from the "Usage" section in the help file.
The function subsamples the point pattern with this selection probability and uses the resulting pattern to generate a Dirichlet tessellation (any duplicated points in this subsample are ignored). The remaining fraction (1-f) of points is used to estimate the intensity: the number of points falling in each tile of the tessellation divided by the tile's area (here duplicated points count equally toward the total count in the tile).
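A minimal sketch using the built-in redwood pattern, with a few points duplicated to mimic measurements pooled across years (nrep = 30 averages the estimate over 30 random subsamples):
library(spatstat)
X <- redwood
X2 <- superimpose(X, X[1:5])   # duplicate five points in place
plot(adaptive.density(X2, f = 0.1, nrep = 30))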

How to read a coplot() graph

I cannot wrap my mind around reading the plots generated by coplot().
For example, from help(coplot):
## Tonga Trench Earthquakes
coplot(lat ~ long | depth, data = quakes)
What do the gray bars above represent? Why are there 2 rows of lat/long boxes?
How do I read this graph?
I can shed some more light on the second chart's interpretation. The gray bars for both mag and depth represent intervals of their respective variables. Andy gave a nice description above of how they are created.
When reading them, keep in mind that they are meant to show you the range of observations for the respective conditioning variable (mag or depth) represented in each column or row. Therefore, in Andy's example, the largest mag bar is just showing that the topmost row contains observations for earthquakes of approximately magnitude 4.6 to 7. It makes sense that this bar is the largest since, as Andy mentioned, the intervals are created to hold roughly similar numbers of observations, and stronger earthquakes are not as common as weaker ones. The same logic holds for depth, where a larger range of depths was required to get a roughly proportional number of observations.
Regarding reading the chart, you would read the columns as representing the three depth groups (left to right) and the rows as representing the four mag groups (bottom to top). Thus, as you read up the chart you're progressively slicing the data into groups of observations with increasing magnitudes. So, for example, the bottom row represents earthquakes with magnitudes of 4 to 4.5 with each column representing a different range of depths. Similarly, you read the columns as holding depth constant while allowing you to see various ranges of magnitudes.
Putting it all together, as mentioned by Andy, we can see that as we read up the rows (progressing up in magnitude) the distribution of earthquakes remains relatively unchanged. However, when reading across the columns (progressing up in depth) we see that the distribution does slightly change. Specifically, the grouping of quakes on the right, between longitudes 180 and 185, grows tighter and more clustered towards the top of the cell.
This is a method for visualizing interactions in your dataset. More specifically, it lets you see how one set of variables behaves conditional on another set of variables.
In the example given, you're asking to visualize how lat and long vary with depth. Because you didn't specify number, and the formula indicates you're interested in only one conditioning variable, the function assumes you want number=6 depth cuts (passed to co.intervals, which tries to make the number of data points approximately equal within each interval) and simply maximizes the data-to-ink ratio by stacking individual plot frames; the value of depth increases to the right, starting with the lowest row and moving up (hence the top-right frame represents the largest depth interval). You can set rows or columns to change this behavior, e.g.:
coplot(lat ~ long | depth, data = quakes, columns=6)
but I think the power of this tool becomes more apparent when you inspect two or more conditioning variables. For example:
coplot(lat ~ long | depth * mag, data = quakes, number=c(3,4))
gives a rich view of how earthquakes vary in space, and demonstrates that there is some interaction with depth (the pattern changes from left to right), and little-to-no interaction with magnitude (the pattern does not change from top to bottom).
Finally, I would highly recommend reading Cleveland's Visualizing Data -- a classic text.

How to calculate the average stretch of a Quad?

I'm not sure if I've formulated the question correctly, but basically I need to know how to calculate the average stretch of a quad; something like: taking the two diagonals AC and BD, how can I calculate their average?
The blue square shows its original size and the pink lines show it when it is deformed. I need to calculate some sort of average so I can change its colour according to how it is deformed: if it expands, change to a lighter colour; if it contracts, change to a darker colour. I hope that makes sense.
Not quite sure what the question here is.
If you're asking how to find the average diagonal length, just find the length of each diagonal, add them together and divide by two.
If you're asking how to determine the area of the resultant shape, here are several formulas for finding the area of an arbitrary quadrilateral. They are all equivalent to the formula
Area = 1/2 * |AC x BD|
where x denotes the cross product.
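A minimal numeric sketch of both suggestions (the corner coordinates are hypothetical):
# deformed quad ABCD; the undeformed reference is a unit square
A <- c(0, 0); B <- c(1, 0); C <- c(1.2, 1.1); D <- c(-0.1, 1)
len <- function(v) sqrt(sum(v^2))
AC <- C - A; BD <- D - B
avg_diag <- (len(AC) + len(BD)) / 2                 # average diagonal length
area <- 0.5 * abs(AC[1] * BD[2] - AC[2] * BD[1])    # Area = 1/2 * |AC x BD|
stretch <- avg_diag / sqrt(2)   # ratio to the unit square's diagonal:
                                # > 1 expanded (lighter), < 1 contracted (darker)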
A quadrilateral will have a 2nd order symmetric strain tensor:
| Exx Exy |
| Exy Eyy |
Exx is the strain in the x-direction; Eyy is the strain in the y-direction; Exy is the shear strain, which appears when one of the angles that was originally 90 degrees changes its value.
If you have large strains, you'll need a Lagrangian point of view and a Green-Lagrange large-strain measure.
See Malvern's Continuum Mechanics for definitions of each term. I'd give you the formulas, but I don't have LaTeX available here.
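For reference, the standard definition (a textbook formula, not quoted from Malvern) in terms of the deformation gradient F is
E = \frac{1}{2}\left(F^{\mathsf{T}} F - I\right), \qquad F_{ij} = \frac{\partial x_i}{\partial X_j}
which reduces to the infinitesimal components Exx, Eyy, Exy above when displacements are small.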
