I drew a dotplot (using dotPlot() from seqinr package) of 2 fasta sequences and I need to extract some values (x,y) from the plot.
The dotPlot() output is an image.
A generic dotplot may look like this one:
I need, for example, the start and end values of the local alignments, which are represented by the purple lines.
So here is an example:
library(seqinr)
l <- 30
seq1 <- sample(c("A","G","T","C"), l, replace=TRUE)
seq2 <- sample(c("A","G","T","C"), l, replace=TRUE)
dotPlot(seq1,seq2, wsize = 2, wstep = 1, nmatch = 2, col = c("white", "green"), xlab = deparse(substitute(seq1)), ylab = deparse(substitute(seq2)))
locator(n=2, type="p")
$x
[1] 27.18720 31.23263
$y
[1] 20.45222 24.65726
So I want the exact positions of the 2 circled points, and as you can see, locator() gives decimal values.
I could use ceiling() or round(), but I might get an approximation error.
I need the integer value of the point I clicked on, basically the nearest plotted point to the click.
It would be perfect to use identify(), which works with "normal" plots and returns the index of the plotted point closest to your click, but it doesn't work on the dotPlot() output (the problem seems to be that, unlike locator(), it doesn't work on image output).
Any possible solution would be welcome, including using dotter in shell or python. Thanks
As you have mentioned, identify() doesn't work since it needs a plot, not an image. One solution might be to call image() after plot(type="n", ...), but this requires changing the dotPlot() source code. Another, more elegant solution is to use the lattice package and panel.identify(), the grid equivalent of identify().
Here is an example, where I select some points (6 -> 15):
library(lattice)
dotplot(y~x,data.frame(x=letters,y=letters))
trellis.focus("panel", 1, 1)
> panel.identify()
[1] 6 7 8 9 10 11 12 13 14 15
Have a look at evolvedmicrobe/dotplot on github
https://github.com/evolvedmicrobe/dotplot/blob/master/R/plotters.R
It provides mkDotPlotDataFrame. With this you can better get coordinates between matches, like with identify.
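The "nearest point to the click" idea from the question can also be sketched in base R without touching dotPlot(). This is only a rough illustration under stated assumptions: it compares single characters (a window size of 1, unlike the wsize = 2 used in the question), and nearest_dot() is a hypothetical helper, not part of seqinr.

```r
set.seed(1)
l <- 30
seq1 <- sample(c("A", "G", "T", "C"), l, replace = TRUE)
seq2 <- sample(c("A", "G", "T", "C"), l, replace = TRUE)

# Logical match matrix: m[i, j] is TRUE when seq1[i] == seq2[j]
# (assumption: single-character matches, i.e. window size 1)
m <- outer(seq1, seq2, "==")

# Integer coordinates of every matching cell
dots <- which(m, arr.ind = TRUE)
colnames(dots) <- c("x", "y")

# Hypothetical helper: return the match cell closest to a clicked position
nearest_dot <- function(px, py, dots) {
  d2 <- (dots[, "x"] - px)^2 + (dots[, "y"] - py)^2
  dots[which.min(d2), ]
}

# Decimal coordinates such as those returned by locator()
click <- list(x = 27.18720, y = 20.45222)
res <- nearest_dot(click$x, click$y, dots)
res
```

Because the result is taken from the table of actual matches, it is always an integer position of a plotted dot, so no rounding of the clicked coordinates is involved.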
I'm currently struggling with some image analysis. I have images of zebrafish embryo vasculature, and I want to measure the distance between certain features (the highest point to the lowest etc).
I have processed the images to be more visible (higher contrast) using EBImage.
I would appreciate any guidance.
Since you are using R and EBImage, I would presume that there is more analysis intended than just extracting measurements from an image. If that is all you intend, other software such as Fiji or its more streamlined precursor, ImageJ, may be more user-friendly.
To answer the question, don't use display() for the image as you show here. Rather, use the plot() method, which uses the option method = "raster" by default. With the image plotted in a graphics window, you can use all the tools of R to interact with the plot. The resolution you have is determined by the size of your image and display. All values are returned in pixels and need to be scaled appropriately.
This example uses locator() in a small helper function to measure diagonal distances between vascular junctions (?) in the image.
This simple helper function marks two points and measures the distance between them. End the call to locator() with a right-click, a control-click, or the Escape key. In RStudio, you may have to explicitly press another button in the window, and the points and lines may not be drawn until all calls to locator() are terminated.
p2p <- function(n = 512) # end with ctrl-click or Esc
{
  ans <- numeric()
  while (n > 0) {
    # this call to locator places 2 points as crosses
    # and connects them with a line
    p <- locator(2, type = "o", pch = 3, col = "magenta")
    if (is.null(p)) break
    ans <- c(ans, sqrt(sum(sapply(p, diff)^2)))
    n <- n - 1
  }
  return(ans) # return the vector of point-to-point distances
}
Now replot the image in the question (without the elements from the browser display) and then interact with the image.
plot(img) # not 'display(img)'
d <- p2p() # interact with the image, collecting distances
Here's the image after selecting six pairs of points with the distances measured between each pair of points.
round(d, 1)
> [1] 113.4 99.2 109.4 110.8 120.6 122.7
mean(d)
> 112.6736
Have fun!
Not dumb at all. Yes, it is in pixels; EBImage and R give you fractional pixels.
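Since the distances come back in pixels, a conversion step is usually needed. Here is a minimal sketch of that scaling, assuming a made-up calibration of 0.65 micrometres per pixel (um_per_px is hypothetical; the real factor depends on your microscope and image size). It reuses the same distance formula as p2p() on a fixed pair of points, so it runs without any clicking.

```r
# Same distance formula as in p2p(), applied to a fixed pair of points
p <- list(x = c(0, 3), y = c(0, 4))
d_px <- sqrt(sum(sapply(p, diff)^2))  # length of the segment in pixels

# Hypothetical calibration factor: micrometres per pixel
um_per_px <- 0.65

# Scale pixel distances to physical units
d_um <- d_px * um_per_px
d_um
```

The same multiplication applies to the whole vector returned by p2p(), for example d * um_per_px.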
So I have plotted a curve, and have looked in both my book and on Stack Overflow, but cannot seem to find any code to make R tell me the value of y along the curve at x = 70.
curve(
20*1.05^x,
from=0, to=140,
xlab='Time passed since 1890',
ylab='Population of Salmon',
main='Growth of Salmon since 1890'
)
So in short, I would like to know how to command R to give me the number of salmon at 70 years, and at other times.
Edit:
To clarify, I was curious how to command R to show multiple Y values for X at an increase of 5.
salmon <- data.frame(curve(
20*1.05^x,
from=0, to=140,
xlab='Time passed since 1890',
ylab='Population of Salmon',
main='Growth of Salmon since 1890'
))
salmon$y[salmon$x==70]
[1] 608.5285
This salmon data.frame gives you all of the data.
head(salmon)
x y
1 0.0 20.00000
2 1.4 21.41386
3 2.8 22.92768
4 4.2 24.54851
5 5.6 26.28392
6 7.0 28.14201
You can also use inequalities to check the number of salmon in given ranges using the syntax above.
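As a rough sketch of the inequality idea, the following builds the same kind of table without plotting. It assumes the default of 101 evaluation points that curve() uses, and it looks up x = 70 with a small tolerance (my own precaution, since exact == comparisons on computed decimals can fail).

```r
# Same grid that curve() evaluates by default (assumption: n = 101 points)
x <- seq(0, 140, length.out = 101)
salmon <- data.frame(x = x, y = 20 * 1.05^x)

# Rows with x between 60 and 80 years
in_range <- subset(salmon, x >= 60 & x <= 80)
in_range

# Look up x = 70 with a tolerance instead of an exact == comparison
y70 <- salmon$y[abs(salmon$x - 70) < 1e-8]
y70
```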
It's also simple to answer the 2nd part of your question using this object:
salmon$z <- salmon$y*5 # I am using * instead of + to make the plot more clear
plot(x=salmon$x,y=salmon$z, xlab='Time passed since 1890', ylab='Population of Salmon',type="l")
lines(salmon$x,salmon$y, col="blue")
curve is plotting the function 20*1.05^x
so just plug any value you want in that function instead of x, e.g.
> 20*1.05^70
[1] 608.5285
>
20*1.05^(seq(from=0, to=70, by=10))
That was all I had to do. I had forgotten, until Ed posted his reply, that I could type an expression directly into R.
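For the edit in the question (y values for x in steps of 5), the same approach can be wrapped in a small function. This is just a sketch; the name salmon_pop is made up for illustration.

```r
# Hypothetical helper wrapping the expression that curve() plots
salmon_pop <- function(x) 20 * 1.05^x

# y values for x = 0, 5, 10, ..., 140
x <- seq(from = 0, to = 140, by = 5)
tab <- data.frame(x = x, y = salmon_pop(x))
head(tab)

# Value at 70 years
salmon_pop(70)
```

Because 1.05^x is vectorised, no loop is needed: passing a vector of x values returns the matching vector of y values.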
I am trying to convey the concentration of lines in 2D space by showing the number of crossings through each pixel in a grid. I am picturing something similar to a density plot, but with more intuitive units. I was drawn to the spatstat package and its line segment class (psp) as it allows you to define line segments by their end points and incorporate the entire line in calculations. However, I'm struggling to find the right combination of functions to tally these counts and would appreciate any suggestions.
As shown in the example below with 50 lines, the density function produces values in (0, 140), the pixellate function tallies the total length through each pixel and takes values in (0, 0.04), and as.mask produces a binary indicator of whether a line went through each pixel. I'm hoping to see something where the scale takes integer values, say 0..10.
require(spatstat)
set.seed(1234)
numLines = 50
# define line segments
L = psp(runif(numLines),runif(numLines),runif(numLines),runif(numLines), window=owin())
# image with 2-dimensional kernel density estimate
D = density.psp(L, sigma=0.03)
# image with total length of lines through each pixel
P = pixellate.psp(L)
# binary mask giving whether a line went through a pixel
B = as.mask.psp(L)
par(mfrow=c(2,2), mar=c(2,2,2,2))
plot(L, main="L")
plot(D, main="density.psp(L)")
plot(P, main="pixellate.psp(L)")
plot(B, main="as.mask.psp(L)")
The pixellate.psp function allows you to optionally specify weights to use in the calculation. I considered trying to manipulate this to normalize the pixels to take a count of one for each crossing, but the weight is applied uniquely to each line (and not specific to the line/pixel pair). I also considered calculating a binary mask for each line and adding the results, but it seems like there should be an easier way. I know that you can sample points along a line, and then do a count of the points by pixel. However, I am concerned about getting the sampling right so that there is one and only one point per line crossing of a pixel.
Is there a straightforward way to do this in R? Otherwise, would this be an appropriate suggestion for a future package enhancement? Is this more easily accomplished in another language such as Python or MATLAB?
The example above and my testing have been with spatstat 1.40-0, R 3.1.2, on x86_64-w64-mingw32.
You are absolutely right that this is something to put in as a future enhancement. It will be done in one of the next versions of spatstat. It will probably be an option in pixellate.psp to count the number of crossing lines rather than measure the total length.
For now you have to do something a bit convoluted, e.g.:
require(spatstat)
set.seed(1234)
numLines = 50
# define line segments
L <- psp(runif(numLines),runif(numLines),runif(numLines),runif(numLines), window=owin())
# split into individual lines and use as.mask.psp on each
masklist <- lapply(1:nsegments(L), function(i) as.mask.psp(L[i]))
# convert to 0-1 image for easy addition
imlist <- lapply(masklist, as.im.owin, na.replace = 0)
rslt <- Reduce("+", imlist)
# plot
plot(rslt, main = "")
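For comparison, here is a base-R sketch of the sampling idea mentioned in the question. It is not how spatstat does it and is only approximate: each segment is sampled densely, and every pixel it touches is counted once per line by taking the unique cells. The grid size n_pix and the sampling density are arbitrary assumptions, and a pixel that is only clipped at a corner can be missed.

```r
set.seed(1234)
n_lines <- 50
n_pix <- 128  # hypothetical grid resolution on the unit square

# Random segment end points, as in the example above
x0 <- runif(n_lines); y0 <- runif(n_lines)
x1 <- runif(n_lines); y1 <- runif(n_lines)

counts <- matrix(0L, n_pix, n_pix)
tt <- seq(0, 1, length.out = 4 * n_pix)  # dense sampling along each segment

for (i in seq_len(n_lines)) {
  xs <- x0[i] + tt * (x1[i] - x0[i])
  ys <- y0[i] + tt * (y1[i] - y0[i])
  # Pixel column and row of every sampled point (clamped to the grid)
  ci <- pmin(floor(xs * n_pix) + 1, n_pix)
  ri <- pmin(floor(ys * n_pix) + 1, n_pix)
  # Linear cell index; unique() counts each pixel once per line
  cells <- unique((ci - 1) * n_pix + ri)
  counts[cells] <- counts[cells] + 1L
}

range(counts)
```

The result is an integer matrix of crossing counts that can be shown with image(counts) or converted to a spatstat image if needed.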
I'm wondering if it is possible to calculate the area within a contour in R.
For example, the area of the contour that results from:
sw<-loess(m~l+d)
mypredict<-predict(sw, fitdata) # Where fitdata is a data.frame of an x and y matrix
contour(x=seq(from=-2, to=2, length=30), y=seq(from=0, to=5, length=30), z=mypredict)
Sorry, I know this code might be convoluted. If it's too tough to read, any example showing how to calculate the area of a simply generated contour would be helpful.
Thanks for any help.
I'm going to assume you are working with an object returned by contourLines (an unnamed list with x and y components at each level). I was expecting to find this in an easy-to-access location, but instead found a PDF file that provides an algorithm which I vaguely remember seeing: http://finzi.psych.upenn.edu/R/library/PBSmapping/doc/PBSmapping-UG.pdf (see PDF page 19, labeled "-11-"). (Added note: The Wikipedia article on "polygon" cites this discussion of the Surveyors' Formula: http://www.maa.org/pubs/Calc_articles/ma063.pdf , which justifies my use of abs().)
Building an example:
x <- 10*1:nrow(volcano)
y <- 10*1:ncol(volcano)
contour(x, y, volcano);
clines <- contourLines(x, y, volcano)
x <- clines[[9]][["x"]]
y <- clines[[9]][["y"]]
level <- clines[[9]][["level"]]
level
#[1] 130
The area at level == 130 (chosen because there are not two 130 levels and it doesn't meet any of the plot boundaries) is then:
A = 0.5* abs( sum( x[1:(length(x)-1)]*y[2:length(x)] - y[1:(length(x)-1)]*x[2:length(x)] ) )
A
#[1] 233542.1
Thanks to @DWin for the reproducible example, and to the authors of sos (my favourite R package!) and splancs ...
library(sos)
findFn("area polygon compute")
library(splancs)
with(clines[[9]],areapl(cbind(x,y)))
This gets the same answer as @DWin, which is comforting. (Presumably it's the same algorithm, but implemented within a Fortran routine in the splancs package ...)
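The surveyor's (shoelace) formula used above can be wrapped in a small function and checked on a shape whose area is known. This is a minimal sketch; poly_area is a made-up name, and closing an open ring with a straight line is my own assumption, which will not be meaningful for contours that run into the plot boundary.

```r
# Hypothetical helper: polygon area by the surveyor's (shoelace) formula
poly_area <- function(x, y) {
  n <- length(x)
  # Close the ring if the last vertex does not repeat the first
  if (x[1] != x[n] || y[1] != y[n]) {
    x <- c(x, x[1])
    y <- c(y, y[1])
    n <- n + 1
  }
  0.5 * abs(sum(x[-n] * y[-1] - y[-n] * x[-1]))
}

# Sanity check on the unit square
sq <- poly_area(c(0, 1, 1, 0), c(0, 0, 1, 1))
sq

# Applied to every contour line of the volcano example
clines <- contourLines(10 * 1:nrow(volcano), 10 * 1:ncol(volcano), volcano)
areas <- sapply(clines, function(cl) poly_area(cl$x, cl$y))
```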
I have an image with many dots, and I would like to extract from it what is the x-y location of each dot.
I already know how to do this manually (there is a package for doing it).
However, is there some way of doing it automatically ?
(My next question will be: is there a way, given an image of many lines, to detect where the lines intersect or touch each other?)
Due to requests in the comments, here is an example for an image to "solve" (i.e: extract the data point locations for it)
#riddle 1 (find dots):
plot(cars, pch = 19)
#riddle 2 (find empty center circles):
plot(cars, pch = 1)
#riddle 2 (find intersection points):
plot(cars, pch = 3)
#riddle 3 (find intersections between lines):
plot(cars, pch = 1, col = "white")
lines(stats::lowess(cars))
abline(v = c(5,10,15,20,25))
Thanks, Tal
(p.s: since I am unfamiliar with this field, I am sorry if I am using the wrong terminology or asking something too simple or complex. Is this OMR?)
The Medical Imaging Task View covers general image processing; this may be a start.
Following up after Dirk: yes, check the Medical Imaging Task View. Also look at R-Forge.
Romain Francois has an RJImage package, and another image processing package was recently registered. What you are looking for are segmentation algorithms. Your dots problem is much easier than the line problem. The first can be done with an RGB or greyscale filter, followed by some sort of radius search. Detecting linear features is harder. Once you have the features extracted, you can use a sweep-line algorithm to detect intersections. EBImage may have an example for detecting cells in its vignette.
Nicholas
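To make the "threshold, then group neighbouring pixels" idea concrete, here is a rough base-R sketch on a synthetic image. It is not taken from any of the packages mentioned: label_blobs() is a hypothetical flood-fill helper, the 0.5 threshold is arbitrary, and only 4-connected neighbours are grouped. A real scan would first be read into a greyscale matrix.

```r
# Synthetic greyscale image: 0 = background, 1 = dot pixels
img <- matrix(0, nrow = 20, ncol = 20)
img[3:4, 3:4] <- 1
img[10:12, 15:16] <- 1
img[17:18, 6:8] <- 1

# Hypothetical helper: label 4-connected blobs with a simple flood fill
label_blobs <- function(bin) {
  nr <- nrow(bin)
  nc <- ncol(bin)
  lab <- matrix(0L, nr, nc)
  current <- 0L
  for (start in which(bin)) {
    if (lab[start] != 0L) next
    current <- current + 1L
    stack <- start
    while (length(stack) > 0) {
      p <- stack[length(stack)]
      stack <- stack[-length(stack)]
      if (lab[p] != 0L) next
      lab[p] <- current
      rr <- (p - 1) %% nr + 1   # row of pixel p
      cc <- (p - 1) %/% nr + 1  # column of pixel p
      nb <- rbind(c(rr - 1, cc), c(rr + 1, cc), c(rr, cc - 1), c(rr, cc + 1))
      ok <- nb[, 1] >= 1 & nb[, 1] <= nr & nb[, 2] >= 1 & nb[, 2] <= nc
      nb <- nb[ok, , drop = FALSE]
      idx <- (nb[, 2] - 1) * nr + nb[, 1]
      # Push unlabelled foreground neighbours
      stack <- c(stack, idx[bin[idx] & lab[idx] == 0L])
    }
  }
  lab
}

bin <- img > 0.5                 # threshold (arbitrary cut-off)
lab <- label_blobs(bin)

# Centroid (mean row and column) of each labelled dot
pix <- which(lab > 0, arr.ind = TRUE)
centroids <- aggregate(pix, by = list(dot = lab[lab > 0]), FUN = mean)
centroids
```

The centroids are in matrix row and column units, so they still need to be mapped back to the plot's axis scale if data coordinates are wanted.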
I think you could use package raster to extract xy coordinates from an image with specific values. Have a look at the package vignettes.
EDIT
Can you try this and tell me if it's in the ball park of what you're looking for?
I hope the code with comments is quite self-explanatory. Looking forward to your answer!
library(raster)
rst <- raster(nrows = 100, ncols = 100) #create a 100x100 raster
rst[] <- round(runif(ncell(rst))) #populate raster with values, for simplicity we round them to 0 and 1
par(mfrow=c(1,2))
plot(rst) #see what you've got so far
rst.vals <- getValues(rst) #extract values from rst object
rst.cell.vals <- which(rst.vals == 1) #see which cells are 1
coords <- xyFromCell(rst, rst.cell.vals) #get coordinates of ones
rst[rst.cell.vals] <- NA #set those raster cells that are 1 to NA (you can play with rst[-rst.cell.vals] <- NA to exclude all others)
plot(rst) #a diag plot, should have only one color