EBImage feature names - r

Can anyone explain what is being used to compute different features within computeFeatures?
I follow the naming convention that is spelled out in ?computeFeatures, but I don't understand the .0., .a. and .Ba. labels.
For example:
> library(EBImage)
> y = readImage(system.file("images", "nuclei.tif", package="EBImage"))[,,1]
> x = thresh(y, 10, 10, 0.05)
> x = opening(x, makeBrush(5, shape='disc'))
> x = bwlabel(x)
> ft = computeFeatures(x, y, xname="nucleus")
> colnames(ft)
[1] "nucleus.0.m.cx" "nucleus.0.m.cy"
[3] "nucleus.0.m.majoraxis" "nucleus.0.m.eccentricity"
<snip>
[11] "nucleus.0.s.radius.max" "nucleus.a.b.mean"
[13] "nucleus.a.b.sd" "nucleus.a.b.mad"
<snip>
[51] "nucleus.Ba.b.mean" "nucleus.Ba.b.sd"
[53] "nucleus.Ba.b.mad" "nucleus.Ba.b.q001"
[55] "nucleus.Ba.b.q005" "nucleus.Ba.b.q05"
<snip>
My guess is that the nucleus.0.* features use only the data from the binary masks contained in x. So nucleus.0.m.cy is the y-axis centroid computed from the binary data. There are also nucleus.a.m.cy and nucleus.Ba.m.cy, but it is unclear how these computations differ (they are extremely correlated but not identical).
I also suppose that .a. and .Ba. use the intensity values in y, but the details are vague. Features like nucleus.a.b.mean and nucleus.Ba.b.mean are similar (~0.80 correlation) but not the same. I assume they estimate the mean y intensity of the objects defined by the labels in x, but the difference is unclear.
Is there any documentation on this?
Thanks,
Max
> sessionInfo()
R Under development (unstable) (2014-08-23 r66461)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] EBImage_4.7.16
loaded via a namespace (and not attached):
[1] abind_1.4-0 BiocGenerics_0.11.4 grid_3.2.0
[4] jpeg_0.1-8 lattice_0.20-29 locfit_1.5-9.1
[7] parallel_3.2.0 png_0.1-7 tiff_0.1-5
[10] tools_3.2.0

Have you seen the documentation here: AnalysisWithEBImage
This seems to be the most in-depth document that discusses the package. Have you tried contacting the author, Grégoire Pau, directly? I'm sure if you Google him you can find him.

As a disclaimer, I know nothing of your field, but by looking at the function I can make a pretty good guess at what is going on. I recommend that you run debugonce(computeFeatures) and then ft = computeFeatures(x, y, xname="nucleus"). You can then step through the code line by line and see what is going on (type Q to exit).
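For example, a minimal sketch of that workflow:
debugonce(computeFeatures)                      # flag the next call for single-stepping
ft <- computeFeatures(x, y, xname = "nucleus")  # drops you into the browser
# in the browser: press n (or Enter) to run the next line, c to continue, Q to quit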
As you noted, the documentation states:
Features are named x.y.f, where x is the object layer, y the
reference image layer and f the feature name.
In your example computeFeatures has generated values for three reference layers (0, a, and Ba). The documentation mentions that if you don't name your reference layers, they are simply given letters of the alphabet; in your case there is one reference layer, so it is called a. I believe 0 means that no reference layer is used.
From looking at the source code, it appears that for every reference layer i, an additional B_i layer is created by passing a hardcoded filter over it, as you can see in this code from the expandRef function (the comments are mine):
# Hard code a filter
blob = gblob(x0 = 15, n = 49, alpha = 0.8, beta = 1.2)
# Filter using the fast 2D FFT convolution product.
bref = lapply(ref, function(r) filter2(r, blob)/2)
# Name it "B" and then the layer name
names(bref) = paste("B", names(ref), sep = "")
I don't know exactly what you are trying to do here, but you can see visually what this filter does by comparing four images: your mask x (just run display(x)), your reference y, the hardcoded filter kernel itself, and y after the filter has been applied.
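If you want to reproduce those views yourself, something like this should work (a sketch; gblob() is not exported by EBImage, so I reach it with :::, and normalize() is only there to bring the values into display range):
library(EBImage)
display(x)                                   # the thresholded, labelled mask
display(y)                                   # the original intensity image
blob <- EBImage:::gblob(x0 = 15, n = 49, alpha = 0.8, beta = 1.2)   # the hardcoded kernel
display(normalize(blob))                     # what the filter looks like
display(normalize(filter2(y, blob)/2))       # y after the filter has been applied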
So, to summarize: everything with 0 uses no reference, everything with a uses y directly as the reference, and everything with Ba uses a filtered version of y.

Related

Compact letter display after Kruskal-Wallis-Test

I'm trying to evaluate some data for my thesis. I can use R to build a boxplot and conduct the statistical test just fine, and I can do the compact letter display manually, but this time I simply have too much data to do it that way. I'm plotting the distance travelled by different species against each other. I found some manuals online telling me to use the cldList function, like so:
PT = Data$res
PT
library(rcompanion)
cldList(P.adj ~ Comparison,
Data = PT,
threshold = 0.05)
But it seems the resulting table isn't right:
[1] Group Letter MonoLetter
<0 rows> (or 0-length row.names)
Obviously, I need the data grouped by species, but I thought I had already clarified this when conducting the Kruskal-Wallis-Test.
I'm fairly inexperienced with R, or programming in general, so I have no idea where the error is here. I'd appreciate any help.
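For reference, here is a made-up minimal example of the input format cldList() is documented to accept (the group names and p-values below are invented), which may help when checking the structure of your comparison table:
library(rcompanion)
# a toy pairwise-comparison table: one row per comparison, with adjusted p-values
PT <- data.frame(Comparison = c("A - B", "A - C", "B - C"),
                 P.adj      = c(0.001, 0.600, 0.030))
cldList(P.adj ~ Comparison, data = PT, threshold = 0.05)
# should return one row per group with its compact letter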

How to generate a negative exponential distribution in R

I was manually creating a negative exponential distribution today and was trying to figure out a faster/easier solution. First, I manually crafted a geometric sequence like this one, repeatedly multiplying by 0.60 until I neared zero:
x <- 400
x*.60
Doing this about 20 times, I got this vector of solutions and plotted the distribution, as seen below:
y <- c(400,240,144,86.4, 51.84, 31.104, 18.6624, 11.19744, 6.718464, 4.031078,
2.418647, 1.451188, .8707129, .5224278, .3134567, .188074, .1128444,
.06770664, .04062398, .02437439)
plot(y)
However, I was trying to figure out what must be an easier way of doing this with seq, but I only know how to do this with arithmetic sequences. I tried reproducing what I did below:
plot(seq(from=400,
to=1,
by=-.60))
Which obviously doesn't produce the same effect, causing a very linear decline when plotted:
Is there an easier solution? I have to imagine that this is a rather basic function within R.
You may use dexp.
(x <- dexp(1:20, rate=.5)*1000)
# [1] 303.26532986 183.93972059 111.56508007 67.66764162 41.04249931 24.89353418 15.09869171 9.15781944 5.55449827
# [10] 3.36897350 2.04338572 1.23937609 0.75171960 0.45594098 0.27654219 0.16773131 0.10173418 0.06170490
# [19] 0.03742591 0.02269996
plot(x)
To make it start exactly at 400, we can minimize (400 - dexp(1, rate=.5)*x)^2 using optimize.
f <- function(x, a) (a - dexp(1, rate=.5)*x)^2
xmin <- optimize(f, c(0, 4000), a=400)
(x <- dexp(seq_len(20), rate=.5)*xmin$minimum)
# [1] 400.00000000 242.61226389 147.15177647 89.25206406 54.13411329 32.83399945 19.91482735 12.07895337 7.32625556
# [10] 4.44359862 2.69517880 1.63470858 0.99150087 0.60137568 0.36475279 0.22123375 0.13418505 0.08138735
# [19] 0.04936392 0.02994073
Note that if you want a different rate= you should use it both in optimize and when creating the values.
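As an aside (my own sketch, not part of the answer above): the hand-built vector is simply a geometric sequence, and the scaling factor for dexp can also be written in closed form, so optimize() is not strictly needed.
# the manually built vector is start * ratio^(0:(n-1)), i.e. a geometric sequence
y <- 400 * 0.6^(0:19)
plot(y)
# dexp can be rescaled in closed form so the first value is exactly 400
x <- dexp(1:20, rate = .5) * 400 / dexp(1, rate = .5)
x[1]   # 400
plot(x)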

How can I increase the map size in a plot?

I want to plot points on an OpenStreetMap map. To determine a suitable range for the map I want to use min() and max() and increase the size by 10%:
library(OpenStreetMap)
coords <- data.frame(cbind(c(-2.121821, -2.118570, -2.124278),
c(51.89437, 51.90330, 51.90469)))
topleft <- c(max(coords[,2]) + 0.1 * max(coords[,2]),
min(coords[,1]) - 0.1 * min(coords[,1]))
bottomright <- c(min(coords[,2]) - 0.1 * min(coords[,2]),
max(coords[,1]) + 0.1 * max(coords[,1]))
map <- openproj(openmap(topleft, bottomright, zoom = "16", type="osm"))
When I now try to create the map R eats up all my resources and I have to kill the process. Is there a better way to achieve this?
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
other attached packages:
[1] ggplot2_0.9.3.1 OpenStreetMap_0.3.1 rgdal_0.8-14 raster_2.2-12 sp_1.0-14
[6] rJava_0.9-6
You're extending the range incorrectly, as you'll see if you have a look at the computed values of topleft and bottomright: you add or subtract 10% of the coordinate values themselves rather than 10% of their range, so (with negative longitudes) the bounding box ends up huge and with its corners swapped, and openmap() then tries to fetch far too many tiles at zoom 16.
A less error-prone approach is extendrange(), which many R plotting functions use to add a little buffer around the most extreme points in the plot:
xx <- extendrange(coords[[1]], f=0.10)
yy <- extendrange(coords[[2]], f=0.10)
tl <- c(max(yy), min(xx))
br <- c(min(yy), max(xx))
map <- openproj(openmap(tl, br, zoom="16", type="osm"))
plot(map)
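To see concretely why the original corners misbehave, here is a quick check (my own sketch; the lon/lat column names are mine):
coords <- data.frame(lon = c(-2.121821, -2.118570, -2.124278),
                     lat = c(51.89437, 51.90330, 51.90469))
# corners as computed by the original code
topleft     <- c(max(coords$lat) + 0.1 * max(coords$lat),
                 min(coords$lon) - 0.1 * min(coords$lon))
bottomright <- c(min(coords$lat) - 0.1 * min(coords$lat),
                 max(coords$lon) + 0.1 * max(coords$lon))
topleft      # roughly  57.10 -1.91
bottomright  # roughly  46.70 -2.33
# the box spans over 10 degrees of latitude and its west/east edges are swapped,
# which is why openmap() at zoom 16 exhausts your resources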

Solving transcendental equations in R

Is there a function for solving transcendental equations in R?
For example, I want to solve the following equation
x = 1/tan(x)
Any suggestions? I know the equation has multiple roots, so I also want to be able to recover all the roots in a given interval.
I would first plot the function to see what it looks like:
R > y = function(x) { x - 1/tan(x) }
R > curve(y, xlim = c(-10, 10))
R > abline(h = 0, col = 'red')
Having seen that there is a root between 0 and 3, I would use uniroot to get the root I want:
R > uniroot(y, interval = c(0, 3))
$root
[1] 0.8603
$f.root
[1] 6.612e-06
$iter
[1] 7
$estim.prec
[1] 6.104e-05
You can use uniroot to find roots of any one-dimensional equation within a given range. However, finding multiple roots seems like a very hard problem in general (e.g. see the relevant chapter of Numerical Recipes for some background: chapter 9 at http://apps.nrbook.com/c/index.html ). Which root is found when there are multiple roots is hard to predict. If you know enough about the problem to subdivide the space into subregions with zero or one root, or if you're willing to divide it into lots of regions and hope that you found all the roots, you can do it. Otherwise I look forward to other people's solutions ...
In this particular case, as shown by @liuminzhao's solution, there's (at most? exactly?) one root between n*pi and (n+1)*pi:
y = function(x) x-1/tan(x)
curve(y,xlim=c(-10,10),n=501,ylim=c(-5,5))
abline(v=(-3:3)*pi,col="gray")
abline(h=0,col=2)
This is a bit of a hack, but it will find roots of your equation (provided they are not too close to a multiple of pi: you can reduce eps if you like ...). However, if you want to solve a different multi-root transcendental equation you might need another (specialized) strategy ...
f <- function(n,eps=1e-6) uniroot(y,c(n*pi+eps,(n+1)*pi-eps))$root
sapply(0:3,f)
## [1] 0.8603337 3.4256204 6.4372755 9.5293334
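A more generic variant (my own sketch, not from the answer above) is to scan a grid for sign changes, bracket each one with uniroot(), and then discard the spurious "roots" that the sign changes at the poles of 1/tan(x) would otherwise produce:
y <- function(x) x - 1/tan(x)
grid <- seq(0.01, 10, by = 0.01)
vals <- y(grid)
idx  <- which(diff(sign(vals)) != 0)     # grid cells where the sign flips
roots <- sapply(idx, function(i) uniroot(y, c(grid[i], grid[i + 1]), tol = 1e-9)$root)
roots[abs(y(roots)) < 1e-6]              # keep genuine zeros, drop the poles
## approximately 0.8603 3.4256 6.4373 9.5293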

Help in using rgl package

I installed the rgl package with the option --disable-libpng. I tried generating a 3D scatter plot and it crashes. Please help me resolve this.
This is the code I am running:
library(rgl)
open3d()
x <- sort(rnorm(1000))
y <- rnorm(1000)
z <- rnorm(1000) + atan2(x,y)
plot3d(x, y, z, col=rainbow(1000))
It crashes with the messages below:
*** caught segfault ***
address (nil), cause 'memory not mapped'
Traceback:
1: .External(rgl_par3d, args)
2: par3d(skip)
3: plot3d.default(x, y, z, col = rainbow(1000))
4: plot3d(x, y, z, col = rainbow(1000))
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
Here is the information from sessionInfo()
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rgl_0.92.798
This is from the sysname command
x86-64_linux_2.6.16_ImageSLES10SP3-3
Some more info:
I am able to generate a surface plot using some code from "R: Plotting a 3D surface from x, y, z".
Here is the code
x <- seq(-10, 10, length.out = 50)
y <- x
rotsinc <- function(x,y) {
sinc <- function(x) {
y <- sin(x)/x;
y[is.na(y)] <- 1;
y
}
10 * sinc( sqrt(x^2+y^2) )
}
z <- outer(x, y, rotsinc)
surface3d(x, y, z)
I tried demo(rgl) and that also crashes with a similar message. I want to generate 3D plots; which other package do you recommend? ggplot?
The rgl package makes use of hardware acceleration in your graphics card, where available, via its driver.
This is unfortunately entirely dependent on the driver. I have been using rgl for animated visualization for a number of years (see e.g. this visualization of option analytics surfaces from 2005), and I can assure you it crashed for no good reason on some machines and runs on others. You really should try on a different machine with a different driver before drawing any firm conclusions.
Computers use hardware, and sometimes the hardware bites. I can run your code fine on one of my machines. Another is dual-screen and hence without the GL extension, so it won't. Did I mention hardware bites?
I tested the exact same code on my system, and it worked perfectly.
Whatever the issues were, they have probably been fixed.
Test 1:
library(rgl)
demo(rgl)
Test 2:
library(rgl)
open3d()
x <- sort(rnorm(1000))
y <- rnorm(1000)
z <- rnorm(1000) + atan2(x,y)
plot3d(x, y, z, col=rainbow(1000))
My system is Windows 7 x64 running R v2.14.2. Tested under two IDEs, namely Revolution R and RStudio.
