Using R for simple image/pattern-recognition task? - r

I have an image with many dots, and I would like to extract from it what is the x-y location of each dot.
I already know how to do this manually (there is a package for doing it).
However, is there some way of doing it automatically?
(My next question will be: is there a way, given an image of many lines, to detect where the lines intersect or "touch each other"?)
Due to requests in the comments, here is an example for an image to "solve" (i.e: extract the data point locations for it)
#riddle 1 (find dots):
plot(cars, pch = 19)
#riddle 2 (find empty center circles):
plot(cars, pch = 1)
#riddle 3 (find intersection points):
plot(cars, pch = 3)
#riddle 4 (find intersections between lines):
plot(cars, pch = 1, col = "white")
lines(stats::lowess(cars))
abline(v = c(5,10,15,20,25))
Thanks, Tal
(p.s: since I am unfamiliar with this field, I am sorry if I am using the wrong terminology or asking something too simple or complex. Is this OMR?)

The Medical Imaging Task View covers general image processing; this may be a start.

Following up after Dirk: yes, check the Medical Imaging Task View. Also look at R-Forge.
Romain Francois has an RJImage package, and another image processing package was recently registered. What you are looking for are segmentation algorithms. Your dots problem is much easier than the line problem. The first can be done with an RGB or greyscale filter, just doing some sort of radius search. Detecting linear features is harder. Once you have the features extracted, you can use a sweepline algorithm to detect intersections. EBImage may have an example for detecting cells in the vignette.
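To make the "greyscale filter plus radius search" idea a bit more concrete, here is a minimal base-R sketch on a synthetic image. Everything in it is an assumption for illustration: the 40 x 40 matrix `img`, the 0.5 threshold, and the trick of grouping touching pixels with single-linkage clustering cut at 1.5 pixels (which joins 8-connected neighbours). A real photograph would first need to be read in and converted to a greyscale matrix.

```r
# Synthetic greyscale image: white background (1) with three dark dots (0)
img <- matrix(1, nrow = 40, ncol = 40)
centres <- rbind(c(10, 10), c(25, 30), c(32, 12))   # (row, col) of each dot
for (k in seq_len(nrow(centres))) {
  rr <- (centres[k, 1] - 1):(centres[k, 1] + 1)
  cc <- (centres[k, 2] - 1):(centres[k, 2] + 1)
  img[rr, cc] <- 0                                   # each dot is a 3 x 3 dark blob
}

# 1. Threshold: keep the (row, col) positions of the dark pixels
dark <- which(img < 0.5, arr.ind = TRUE)

# 2. Group pixels that touch each other: single linkage cut at 1.5 joins
#    pixels whose distance is 1 or sqrt(2), i.e. 8-connected neighbours
grp <- cutree(hclust(dist(dark), method = "single"), h = 1.5)

# 3. The centroid of each group is the estimated dot location
dots <- aggregate(dark, by = list(dot = grp), FUN = mean)
dots
```

The pairwise distance matrix grows quadratically with the number of dark pixels, so this is only a sketch for small images; a dedicated labelling routine would be the better choice for real data.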
Nicholas

I think you could use package raster to extract xy coordinates from an image with specific values. Have a look at the package vignettes.
EDIT
Can you try this and tell me if it's in the ball park of what you're looking for?
I hope the code with comments is quite self-explanatory. Looking forward to your answer!
library(raster)
rst <- raster(nrows = 100, ncols = 100) #create a 100x100 raster
rst[] <- round(runif(ncell(rst))) #populate raster with values, for simplicity we round them to 0 and 1
par(mfrow=c(1,2))
plot(rst) #see what you've got so far
rst.vals <- getValues(rst) #extract values from rst object
rst.cell.vals <- which(rst.vals == 1) #see which cells are 1
coords <- xyFromCell(rst, rst.cell.vals) #get coordinates of ones
rst[rst.cell.vals] <- NA #set those raster cells that are 1 to NA (you can play with rst[-rst.cell.vals] <- NA to exclude all others)
plot(rst) #a diag plot, should have only one color

Related

Rasterize SpatVector (points) with buffer around SpatRaster

I have a SpatVector consisting of points and I want to rasterize them into a SpatRaster with a given resolution. Is there a way of specifying a function that takes in the points that are within a buffer of each raster cell?
Many thanks
Joao
-- Update --
Maybe a figure would help understand what I'm after with my question. The red square will have to be run over the center of each pixel to calculate some statistics using the overlaying points. Apologies for the clumsy question, but I hope the figure is clear enough...
terra version 1.6-28 supports rasterization of points with a rectangular moving window.
Example data
library(terra)
#terra 1.6.33
r <- rast(ncol=100, nrow=100, crs="local", xmin=0, xmax=50, ymin=0, ymax=50)
set.seed(100)
x <- runif(50, 5, 45)
y <- runif(50, 5, 45)
z <- sample(50)
v <- vect(data.frame(x,y,z), geom=c("x", "y"))
Solution
r1 <- rasterizeWin(v, r, field="z", fun="count", pars=10, win="rectangle")
plot(r1)
points(x, y)
You can change fun to another function that works for you, and you can change the size of the moving window with pars.
Instead of a rectangle, you can also use a circle or an ellipse. The border of a circular window is equidistant from the center of the cells. In contrast, the border of a rectangular window is at a constant distance from the border of the grid cells in most directions (not at the corners). Here is an example.
r2 <- rasterizeWin(v, r, field="z", fun="count", pars=5.25, win="circle")
plot(r2)
You can also use buffers around each cell to get a window that is truly equidistant from each cell border.
r3 <- rasterizeWin(v, r, field="z", fun=length, pars=5, win="buffer")
plot(r3)
In this case, because the buffer size is large relative to the cell size, the result is very similar to what you get when using a circular window. Using "circle" should be the fastest, and using "buffer" should be the slowest in most cases. The function should now be memory-safe in all cases, except perhaps when using very large buffers (more could be done if need be).
Version 1.6-28 is currently the development version. You can install it with
install.packages('terra', repos='https://rspatial.r-universe.dev')
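For readers who want to see what a rectangular moving-window count amounts to, here is a rough base-R sketch of the idea. It is not how terra implements rasterizeWin, and it does not reproduce the exact meaning of pars; the grid resolution `res`, the half-width `half`, and the helper `count_in_window` are hypothetical names chosen for illustration.

```r
set.seed(100)
px <- runif(50, 5, 45)   # point x coordinates
py <- runif(50, 5, 45)   # point y coordinates

# Cell centres of a 10 x 10 grid covering [0, 50] x [0, 50]
res <- 5
cx <- seq(res / 2, 50 - res / 2, by = res)
cy <- seq(res / 2, 50 - res / 2, by = res)

half <- 5   # hypothetical half-width of the square window, in map units

# Count the points that fall inside the window centred on (x0, y0)
count_in_window <- function(x0, y0) {
  sum(abs(px - x0) <= half & abs(py - y0) <= half)
}

# counts[i, j] is the count for the cell centred on (cx[i], cy[j])
counts <- outer(cx, cy, Vectorize(count_in_window))
counts
```

Swapping sum() for another summary of the points selected by the window gives the other statistics, which is what changing fun does in the terra call above.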
The approach you take depends on what result you want from the above and on how the two objects relate to each other. One possible outline, where SpatVectx and SpatRastery stand for your own points and raster:
library(terra)
# 1. terra::buffer() both SpatVectx and SpatRastery, using a distance (in m) that is meaningful
# 2. take SpatRastery to a SpatVector (SpatVecty) with terra::as.polygons()
# 3. intersect the two vector objects
z <- terra::intersect(SpatVectx, SpatVecty)
Then back to a SpatRaster (SpatRastz)? terra::mask or terra::crop might also be useful, again depending on where things are going next.

Simplifying the data in a raster - R

I have a raster file, which I created from data downloaded from DIVA-GIS: http://www.diva-gis.org/datadown
nz_map<-raster("NZL1_msk_cov.grd")
Using plot() on this object works great, so there are no issues importing it. The raster object contains a lot of data I don't need, data on land cover. I want a more simple raster object with lon & lat coordinates and a value of 1 for land and NA for ocean.
This raster will be used with the dismo function randomPoints() to sample background data for modelling species distribution, so the most important thing is to identify which areas are land(suitable for sampling) and which are ocean(unsuitable).
I can visualise the raster more simply with plot(!is.na(nz_map5)). This works well and serves for the randomPoints() function, but I'm not sure how to edit the colour of the map. Doing this: plot(!is.na(nz_map5), col="grey") results in a totally grey block, instead of just colouring the appropriate areas grey. This is why I thought I might be better off with a simpler raster object, to do away with the !is.na argument. Any ideas?
If anyone knows of a place you could download such files, saving me the hassle- that works, too.
Here are similar data for elevation
library(raster)
a <- getData("alt", country="NZL")
r <- a[[1]]
plot(r)
I think your confusion stems from what happens here
x <- !is.na(r)
That turns the values to TRUE (those that were not NA) or FALSE (those that were NA). So now you have two categories
plot(x, col=c("red", "blue"))
And now it is no longer a good dataset for dismo::randomPoints, because the ocean cells are FALSE rather than NA.
If you would rather have NA and 1 other value you can do
y <- r * 0
plot(y, col="blue")
Or
y <- reclassify(r, cbind(-Inf, Inf, 1))
But, as you say yourself, for randomPoints you can just use the original data.
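To see the difference between the two kinds of mask on something small, here is a toy sketch in which a plain numeric vector stands in for the raster cell values. The numbers and the names `elev`, `mask_tf` and `mask_one` are made up for illustration.

```r
# Made-up cell values: NA = ocean, a number = land
elev <- c(NA, 12, 350, NA, 87)

# !is.na() gives two categories and no NA at all
mask_tf <- !is.na(elev)
mask_tf    # FALSE  TRUE  TRUE FALSE  TRUE

# Arithmetic keeps NA as NA, so land becomes 1 and ocean stays NA
mask_one <- elev * 0 + 1
mask_one   # NA  1  1 NA  1
```

The second form is the one that keeps the ocean excluded when sampling, since only the non-NA cells remain candidates.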

Use Rcartogram on a SpatialPolygonsDataFrame object

I'm trying to do the same thing asked in this question, Cartogram + choropleth map in R, but starting from a SpatialPolygonsDataFrame and hoping to end up with the same type of object.
I could save the object as a shapefile, use scapetoad, reopen it and convert back, but I'd rather have it all within R so that the procedure is fully reproducible, and so that I can code dozens of variations automatically.
I've forked the Rcartogram code on github and added my efforts so far here.
Essentially what this demo does is create a SpatialGrid over the map, look up the population density at each point of the grid and convert this to a density matrix in the format required for cartogram() to work on. So far so good.
But, how to interpolate the original map points based on the output of cartogram()?
There are two problems here. The first is to get the map and grid into the same units to allow interpolation. The second is to access every point of every polygon, interpolate it, and keep them all in right order.
The grid is in grid units and the map is in projected units (in the case of the example longlat). Either the grid must be projected into longlat, or the map into grid units. My thought is to make a fake CRS and use this along with the spTransform() function in package(rgdal), since this handles every point in the object with minimal fuss.
Accessing every point is difficult because they are several layers down into the SpPDF object: object>polygons>Polygons>lines>coords I think. Any ideas how to access these while keeping the structure of the overall map intact?
This problem can be solved with the getcartr package, available on Chris Brunsdon's GitHub, as beautifully explicated in this blog post.
The quick.carto function does exactly what you want -- takes a SpatialPolygonsDataFrame as input and has a SpatialPolygonsDataFrame as output.
Reproducing the essence of the example in the blog post here in case the link goes dead, with my own style mixed in & typos fixed:
(Shapefile; World Bank population data)
library(getcartr)
library(maptools)
library(data.table)
world <- readShapePoly("TM_WORLD_BORDERS-0.3.shp")
#I use data.table, see blog post if you want a base approach;
# data.table wonks may be struck by the following step as seeming odd;
# see here: http://stackoverflow.com/questions/32380338
# and here: https://github.com/Rdatatable/data.table/issues/1310
# for some background on what's going on.
world@data <- setDT(world@data)
world.pop <- fread("sp.pop.totl_Indicator_en_csv_v2.csv",
                   select = c("Country Code", "2013"),
                   col.names = c("ISO3", "pop"))
world@data[world.pop, Population := as.numeric(i.pop), on = "ISO3"]
#calling quick.carto has internal calls to the
# necessary functions from Rcartogram
world.carto <- quick.carto(world, world$Population, blur = 0)
#plotting with a color scale
x <- world@data[!is.na(Population), log10(Population)]
ramp <- colorRampPalette(c("navy", "deepskyblue"))(21L)
xseq <- seq(from = min(x), to = max(x), length.out = 21L)
#annoying to deal with NAs...
cols <- ramp[sapply(x, function(y)
  if (length(z <- which.min(abs(xseq - y)))) z else NA)]
plot(world.carto, col = cols,
     main = paste0("Cartogram of the World's",
                   " Population by Country (2013)"))

R/ImageJ: Measuring shortest distance between points and curves

I have some experience with R as a statistics platform, but am inexperienced in image based maths. I have a series of photographs (tiff format, px/µm is known) with holes and irregular curves. I'd like to measure the shortest distance between a hole and the closest curve for that particular hole. I'd like to do this for each hole in a photograph. The holes are not regular either, so maybe I'd need to tell the program what are holes and what are curves (ImageJ has a point and segmented line functions).
Any ideas how to do this? Which package should I use in R? Would you recommend another program for this kind of task?
EDIT: Doing this is now possible using the sclero package. The package is currently available on GitHub and the procedure is described in detail in the tutorial. Just to illustrate, I use an example from the tutorial:
library(devtools)
install_github("MikkoVihtakari/sclero", dependencies = TRUE)
library(sclero)
path <- file.path(system.file("extdata", package = "sclero"), "shellspots.zip")
dat <- read.ijdata(path, scale = 0.7812, unit = "um")
shell <- convert.ijdata(dat)
aligned <- spot.dist(shell)
plot(aligned)
It is also possible to add sample spot sizes using the functions provided by the sclero package. Please see Section 2.5 in the tutorial.
There's a tool for edge detection written for ImageJ that might help you first find the holes and the lines, and clarify them. You can find it at
http://imagejdocu.tudor.lu/doku.php?id=plugin:filter:edge_detection:start
Playing around with the settings for the thresholding and the hysteresis can help to get the lines and holes found. It's difficult to tell whether this has much chance of working without seeing your actual photographs, but a colleague of mine had good results using this tool on FRAP images. I programmed an ImageJ tool that can calculate recoveries in FRAP analysis based on those images. You might get some ideas for yourself when looking at the code (see: http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:frap_normalization:start )
The only way I know to work with images in R is EBImage, which is part of Bioconductor. The package Rimage is orphaned, so it is no longer maintained.
To find the shortest distance: once you have the coordinates of the lines and holes, you can go for the shotgun approach: calculate the distances between all points of the hole and all points of the line, and then take the minimum. An illustration of that in R:
x <- -100:100
x2 <- seq(-70, -50, length.out = length(x)/4)
a.line <- list(x = x,
               y = 4*x + 5)
a.hole <- list(
  x = c(x2, rev(x2)),
  y = c(200 + sqrt(100 - (x2 + 60)^2),
        rev(200 - sqrt(100 - (x2 + 60)^2)))
)
plot(a.line, type = 'l')
lines(a.hole, col = 'red')
calc.distance <- function(line, hole){
  mline <- matrix(unlist(line), ncol = 2)
  mhole <- matrix(unlist(hole), ncol = 2)
  id1 <- rep(1:nrow(mline), nrow(mhole))
  id2 <- rep(1:nrow(mhole), each = nrow(mline))
  min(
    sqrt(
      (mline[id1,1] - mhole[id2,1])^2 +
      (mline[id1,2] - mhole[id2,2])^2
    )
  )
}
Then:
> calc.distance(a.line,a.hole)
[1] 95.51649
Which you can check mathematically by deriving the equations from the circle and the line. This goes fast enough if you don't have millions of points describing thousands of lines and holes.
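The mathematical check mentioned above can be sketched as follows, assuming the hole is the circle of radius 10 centred at (-60, 200) that the example builds, and treating the curve as the infinite line y = 4x + 5 (written as 4x - y + 5 = 0). The names `centre`, `radius` and `d_exact` are introduced here for illustration only.

```r
# Circle used for the hole in the example: centre (-60, 200), radius 10
centre <- c(-60, 200)
radius <- 10

# Distance from a point (x0, y0) to the line a*x + b*y + c = 0 is
# |a*x0 + b*y0 + c| / sqrt(a^2 + b^2); here a = 4, b = -1, c = 5
d_centre <- abs(4 * centre[1] - centre[2] + 5) / sqrt(4^2 + (-1)^2)

# Shortest distance between the circle and the line
d_exact <- d_centre - radius
d_exact
```

This comes out at roughly 95.50, slightly below the brute-force value, which is expected because the shotgun approach only compares a finite set of sampled points on the line and on the circle.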

How can I calculate the area within a contour in R?

I'm wondering if it is possible to calculate the area within a contour in R.
For example, the area of the contour that results from:
sw<-loess(m~l+d)
mypredict<-predict(sw, fitdata) # Where fitdata is a data.frame of an x and y matrix
contour(x=seq(from=-2, to=2, length=30), y=seq(from=0, to=5, length=30), z=mypredict)
Sorry, I know this code might be convoluted. If it's too tough to read, any example that shows how to calculate the area of a simply generated contour would be helpful.
Thanks for any help.
I'm going to assume you are working with an object returned by contourLines. (An unnamed list with x and y components at each level.) I was expecting to find this in an easy to access location but instead found a pdf file that provided an algorithm which I vaguely remember seeing http://finzi.psych.upenn.edu/R/library/PBSmapping/doc/PBSmapping-UG.pdf (See pdf page 19, labeled "-11-") (Added note: The Wikipedia article on "polygon" cites this discussion of the Surveyors' Formula: http://www.maa.org/pubs/Calc_articles/ma063.pdf , which justifies my use of abs().)
Building an example:
x <- 10*1:nrow(volcano)
y <- 10*1:ncol(volcano)
contour(x, y, volcano);
clines <- contourLines(x, y, volcano)
x <- clines[[9]][["x"]]
y <- clines[[9]][["y"]]
level <- clines[[9]][["level"]]
level
#[1] 130
The area at level == 130 (chosen because there are not two 130 levels and it doesn't meet any of the plot boundaries) is then:
A = 0.5* abs( sum( x[1:(length(x)-1)]*y[2:length(x)] - y[1:(length(x)-1)]*x[2:length(x)] ) )
A
#[1] 233542.1
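The same Surveyors' Formula can be wrapped in a small helper and checked on shapes whose area is known. This is a minimal sketch; the function name `poly_area` and the step that closes the ring when the first and last vertices differ are additions for illustration, not part of the answer above.

```r
# Area of a simple polygon from its vertex coordinates (Surveyors' Formula)
poly_area <- function(x, y) {
  n <- length(x)
  # Close the ring if the last vertex does not repeat the first one
  if (x[1] != x[n] || y[1] != y[n]) {
    x <- c(x, x[1])
    y <- c(y, y[1])
    n <- n + 1
  }
  # abs() makes the result independent of vertex order (clockwise or not)
  0.5 * abs(sum(x[-n] * y[-1] - x[-1] * y[-n]))
}

poly_area(c(0, 2, 2, 0), c(0, 0, 3, 3))   # 2 x 3 rectangle, area 6
poly_area(c(0, 1, 0), c(0, 0, 1))         # right triangle, area 0.5
```

For a contour, pass the x and y components of one element of the contourLines() result, for example poly_area(clines[[9]]$x, clines[[9]]$y).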
Thanks to @DWin for the reproducible example, and to the authors of sos (my favourite R package!) and splancs ...
library(sos)
findFn("area polygon compute")
library(splancs)
with(clines[[9]],areapl(cbind(x,y)))
Gets the same answer as @DWin, which is comforting. (Presumably it's the same algorithm, but implemented within a Fortran routine in the splancs package ...)

Resources