cut function produces uneven first break - r

I'm exploring the use of the cut function and am trying to cut the following basic vector into 10 breaks. I'm able to do it, but I'm confused as to why my initial break occurs at -0.1 rather than 0:
test_vec <- 0:10
test_vec2 <- cut(test_vec, breaks = 10)
test_vec2
yields:
(-0.01,1] (-0.01,1] (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] (7,8] (8,9] (9,10]
Why does this produce two instances of (-0.01,1], and why does the lowest interval not start at 0?

tl;dr: to get what you probably want, you'll need to specify the breaks explicitly and use include.lowest=TRUE:
cut(x,breaks=0:10,include.lowest=TRUE)
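Applied to the question's vector (with x <- 0:10), that call should print something like the following; note that 0 and 1 still share the first bin, because 11 values are being placed into 10 intervals:
##  [1] [0,1]  [0,1]  (1,2]  (2,3]  (3,4]  (4,5]  (5,6]  (6,7]  (7,8]  (8,9]  (9,10]
## Levels: [0,1] (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] (7,8] (8,9] (9,10]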
The issue is probably this, from the "Details" of ?cut:
When ‘breaks’ is specified as a single number, the range of the
data is divided into ‘breaks’ pieces of equal length, and then the
outer limits are moved away by 0.1% of the range to ensure that
the extreme values both fall within the break intervals.
Since the range is c(0, 10), the outer limits are (-0.01, 10.01); as @Onyambu suggests, the results are asymmetric because the value at 0 lies on the left-hand boundary (not included) whereas the value at 10 lies on the right-hand boundary (included).
The (apparent) asymmetry is due to formatting; if you follow the code below (the core of base:::cut.default()), you'll see that the top break is actually at 10.01, but gets formatted as "10" because the default number of digits is 3 ...
x <- 0:10
breaks <- 10
dig <- 3
nb <- as.integer(breaks+1)
dx <- diff(rx <- range(x, na.rm = TRUE))
breaks <- seq.int(rx[1L], rx[2L], length.out = nb)
breaks[c(1L, nb)] <- c(rx[1L] - dx/1000, rx[2L] + dx/1000)
ch.br <- formatC(0 + breaks, digits = dig, width = 1L)
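Printing the break points and their formatted labels (exact spacing may differ on your machine) shows where the "10" comes from:
breaks
## [1] -0.01  1.00  2.00  3.00  4.00  5.00  6.00  7.00  8.00  9.00 10.01
ch.br
## [1] "-0.01" "1"     "2"     "3"     "4"     "5"     "6"     "7"     "8"     "9"     "10"
The last break really is 10.01, but formatC() with 3 significant digits prints it as "10".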


Find local minimum in a vector with r

Taking the ideas from the following links:
the local minimum between the two peaks
How to explain ...
I want to find the local minimum or minima, avoiding the use of functions already created for this purpose (local or global max/min).
My progress so far:
#DATA
simulate <- function(lambda = 0.3, mu = c(0, 4), sd = c(1, 1), n.obs = 10^5) {
  x1 <- rnorm(n.obs, mu[1], sd[1])
  x2 <- rnorm(n.obs, mu[2], sd[2])
  return(ifelse(runif(n.obs) < lambda, x1, x2))
}
data <- simulate()
hist(data)
d <- density(data)
#
#https://stackoverflow.com/a/25276661/8409550
##Since the x-values are equally spaced, we can estimate dy using diff(d$y)
d$x[which.min(abs(diff(d$y)))]
#With our data we did not obtain the expected value
#
d$x[which(diff(sign(diff(d$y)))>0)+1]#pit
d$x[which(diff(sign(diff(d$y)))<0)+1]#peak
#we check
#1
optimize(approxfun(d$x,d$y),interval=c(0,4))$minimum
optimize(approxfun(d$x,d$y),interval=c(0,4),maximum = TRUE)$maximum
#2
tp <- pastecs::turnpoints(d$y)
summary(tp)
ind <- (1:length(d$y))[pastecs::extract(tp, no.tp = FALSE, peak = TRUE, pit = TRUE)]
d$x[ind[2]]
d$x[ind[1]]
d$x[ind[3]]
My questions and request for help:
Why does the following line fail to return the expected value:
d$x[which.min(abs(diff(d$y)))]
Is it possible to eliminate the need to add one to the index in these lines:
d$x[which(diff(sign(diff(d$y)))>0)+1]#pit
d$x[which(diff(sign(diff(d$y)))<0)+1]#peak
How can I get the optimize function to return the two expected maximum values?
Question 1
The answer to the first question is straightforward. The line d$x[which.min(abs(diff(d$y)))] asks for the x value at which there was the smallest change in y between two consecutive points. The answer is that this happened at the extreme right of the plot, where the density curve is essentially flat:
which.min(abs(diff(d$y)))
#> [1] 511
length(abs(diff(d$y)))
#> [1] 511
This is not only smaller than the difference at your local maxima/minima; it is orders of magnitude smaller. Let's zoom in on the peak value of d$y, including only the peak and the point on each side:
which.max(d$y)
#> [1] 324
plot(d$x[323:325], d$y[323:325])
We can see that the smallest difference is around 0.00005, or 5e-5, between two consecutive points. Now look at the end of the plot, where it is flattest:
plot(d$x[510:512], d$y[510:512])
The difference there is about 1e-7, which is why this is the flattest point.
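You can check the step sizes directly (my own quick check; the exact values will vary because the data are simulated without a fixed seed):
abs(diff(d$y))[323:324]  # steps on either side of the peak, on the order of 5e-5
abs(diff(d$y))[510:511]  # steps in the flat right tail, on the order of 1e-7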
Question 2
The answer to your second question is "no, not really". You are taking a double diff, which is two elements shorter than x, and if x is n elements long, a double diff will correspond to elements 2 to (n - 1) in x. You can remove the +1 from the index, but you will have an off-by-one error if you do that. If you really wanted to, you could concatenate dummy zeros at each stage of the diff, like this:
d$x[which(c(0, diff(sign(diff(c(d$y, 0))))) > 0)]
which gives the same result, but this is longer, harder to read and harder to justify, so why would you?
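To see the off-by-one concretely, here is a tiny made-up vector (my own illustration, not taken from the post):
v <- c(1, 3, 2, 5, 4)
diff(v)                              # 2 -1  3 -1   (length n - 1)
diff(sign(diff(v)))                  # -2  2 -2     (length n - 2, aligned with v[2:(n-1)])
which(diff(sign(diff(v))) < 0) + 1   # 2 4  -> peaks at v[2] = 3 and v[4] = 5
which(diff(sign(diff(v))) > 0) + 1   # 3    -> pit at v[3] = 2
The element of the double diff that describes position i of v sits at index i - 1, which is exactly why the + 1 is needed.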
Question 3
The answer to the third question is that you could use the "pit" as the dividing point between the minimum and maximum value of d$x to find the two "peaks". If you really want a single call to get both at once, you could do it inside an sapply:
pit <- optimize(approxfun(d$x, d$y), interval = c(0, 4))$minimum
peaks <- sapply(1:2, function(i) {
  optimize(approxfun(d$x, d$y),
           interval = c(min(d$x), pit, max(d$x))[i:(i + 1)],
           maximum = TRUE)$maximum
})
pit
#> [1] 1.691798
peaks
#> [1] -0.02249845 3.99552521

cutting a variable into pieces in R

I'm trying to cut() my data D into 3 pieces: [0-4], [5-12], [13-40]. But I wonder how exactly to define my breaks in cut to achieve that?
Here is my data and R code:
D <- read.csv("https://raw.githubusercontent.com/rnorouzian/m/master/t.csv", h = T)
table(cut(D$time, breaks = c(0, 5, 9, 12))) ## what should breaks be?
#  (0,5]  (5,9] (9,12]   <- not the 3 pieces I want
#    228     37     10
The notation (a,b] means ">a and <=b".
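For example, a value that sits exactly on a break lands in the left-hand interval by default, and in the right-hand one with right=FALSE:
cut(5, breaks = c(0, 5, 9, 12))                 # (0,5]
cut(5, breaks = c(0, 5, 9, 12), right = FALSE)  # [5,9)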
So, to get your desired result, just define the cuts so you get the grouping that you want, including a lower and upper bound:
table(cut(D$time, breaks=c(-1, 4, 12, 40)))
##  (-1,4]  (4,12] (12,40]
##     319      47      20
You may also find it helpful to look at two other arguments: right=FALSE, which changes the intervals from (a,b] to [a,b), and include.lowest, which includes the lowest breaks value in the first interval (in the OP's example this would make the first interval [0,5], closed on the lower bound). You can also use -Inf and Inf as outer breaks. Here's an example with a couple of those options put to use:
table(cut(D$time, breaks = c(-Inf, 4, 12, Inf), include.lowest=TRUE))
## [-Inf,4]   (4,12] (12,Inf]
##      319       47       20
This produces the right buckets, but the interval notation needs tweaking. Assuming all times are integers, you might need to adjust the labels manually: each time a label uses right-open notation, replace it with the equivalent closed-interval notation. Use your best string 'magic'.
Personally, I like to make sure all possibilities are covered. Perhaps future data from this process might exceed 40? I like to put an upper bound of +Inf in all my cuts; this prevents NA from creeping into the data.
What cut really needs is a 'whole numbers only' option.
F=cut(D$time,c(0,5,13,40),include.lowest = TRUE,right=FALSE)
# the levels below are hard-coded, but you could write a loop to turn all
# labels of the form [m,n) into [m,n-1] (a rough sketch of such a loop follows below)
levels(F)[1:2]=c('[0,4]','[5,12]')
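A rough sketch of such a loop (my own illustration, assuming integer breaks): rewrite every right-open label of the form [m,n) as the closed integer label [m,n-1]:
relabel_closed <- function(f) {
  labs <- levels(f)
  open_right <- grepl("\\)$", labs)                     # labels that end in ")"
  nums <- regmatches(labs, gregexpr("-?[0-9]+", labs))  # the two numbers in each label
  labs[open_right] <- sapply(nums[open_right],
                             function(v) sprintf("[%s,%d]", v[1], as.integer(v[2]) - 1L))
  levels(f) <- labs
  f
}
F2 <- cut(D$time, c(0, 5, 13, 40), include.lowest = TRUE, right = FALSE)
levels(relabel_closed(F2))  # "[0,4]"  "[5,12]"  "[13,40]"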
Typically there would be more analysis before final results are obtained, so I wouldn't sweat the labels too much until the work is closer to complete.
Here are my results
> table(F)
F
  [0,4]  [5,12] [13,40]
    319      47      20
R can compare integers to floats, like in
> 6L >= 8.5
[1] FALSE
Thus you can use floats as breaks in cut such as in
table(cut(D$time, breaks = c(-.5, 4.5, 12.5, 40.5)))
For integer data this fulfills your bucket definition of [0-4], [5-12], [13-40] without you having to think too much about square versus round brackets.
A fancier alternative would be clustering around the means of your buckets, as in
D <- read.csv("https://raw.githubusercontent.com/rnorouzian/m/master/t.csv", h = T)
D$cluster <- kmeans(D$time, centers = c(4/2, (5+12)/2, (13+40)/2))$cluster
plot(D$time, rnorm(nrow(D)), col=D$cluster)
You should add two additional arguments, right and include.lowest, to your code:
table(cut(D$time, breaks = c(0, 5, 13, 40), right=FALSE, include.lowest = TRUE))
With right=FALSE the intervals are closed on the left and open on the right, which gives your desired result. include.lowest=TRUE causes the highest break value (here 40) to be included in the last interval.
Result:
  [0,5)  [5,13) [13,40]
    319      47      20
Vice versa you can write:
table(cut(D$time, breaks = c(0, 4, 12, 40), right=TRUE, include.lowest = TRUE))
with the result:
  [0,4]  (4,12] (12,40]
    319      47      20
Both correspond exactly to what you are looking for:
  [0,4]  [5,12] [13,40]
    319      47      20

Error in rollapply: subscript out of bounds

I'd first like to describe my problem:
What I want to do is calculate the number of price spikes in a 24-hour window, while I have half-hourly data.
I have looked at the relevant Stack Overflow posts, e.g. this one:
Rollapply for time series
(If there are more relevant ones, please let me know ;) )
As I cannot and probably also should not upload my data, here's a minimal example:
I simulate a random variable, convert it to an xts object, and use a user-defined function to detect "spikes" (of course pretty contrived in this case, but it illustrates the error).
library(xts)
########## Simulate y as a random variable
y <- rnorm(n=100)
########## Add a date variable so I can convert it to an xts object later on
yDate <- as.Date(1:100)
########## Bind both variables together and convert to an xts object
z <- cbind(yDate,y)
z <- xts(x=z, order.by=yDate)
########## Use the rollapply function on the xts object:
x <- rollapply(z, width=10, FUN=mean)
The function works as it is supposed to: it takes a rolling window of 10 values and calculates their mean.
Then I defined my own function to find peaks: a peak is a local maximum (higher than the m points around it) AND at least as big as the mean of the time series + h.
This leads to:
find_peaks <- function(x, m, h){
  shape <- diff(sign(diff(x, na.pad = FALSE)))
  pks <- sapply(which(shape < 0), FUN = function(i){
    z <- i - m + 1
    z <- ifelse(z > 0, z, 1)
    w <- i + m + 1
    w <- ifelse(w < length(x), w, length(x))
    if(all(x[c(z : i, (i + 2) : w)] <= x[i + 1]) & x[i+1] > mean(x) + h) return(i + 1) else return(numeric(0))
  })
  pks <- unlist(pks)
  pks
}
And it works fine. Back to the example:
plot(yDate,y)
#Is supposed to find the points which are higher than 3 points around them
#and higher than the average:
#Does so, so works.
points(yDate[find_peaks(y,3,0)],y[find_peaks(y,3,0)],col="red")
However, using the rollapply() function leads to:
x <- rollapply(z,width = 10,FUN=function(x) find_peaks(x,3,0))
#Error in `[.xts`(x, c(z:i, (i + 2):w)) : subscript out of bounds
I first thought that maybe the error occurs because it can run into a negative index for the first points, because of the m parameter. Sadly, setting m to zero does not change the error.
I have also tried to trace this error, but cannot find its source.
Can anyone help me out here?
Edit: a picture of the spikes: spikes on the Australian electricity market. find_peaks(20,50) determines the red points to be spikes; find_peaks(0,50) additionally finds the blue ones to be spikes (therefore the second parameter h is important, because the blue points are clearly not what we want to analyse when we talk about spikes).
I'm still not entirely sure what it is that you are after. On the assumption that, given a window of data, you want to identify whether its center is greater than the rest of the window and at the same time greater than the mean of the window + h, you could do the following:
peakfinder = function(x, h = 0){
  xdat = as.numeric(x)
  meandat = mean(xdat)
  center = xdat[ceiling(length(xdat)/2)]
  ifelse(all(center >= xdat) & center >= (meandat + h), center, NA)
}
y <- rnorm(n=100)
z = xts(y, order.by = as.Date(1:100))
plot(z)
points(rollapply(z,width = 7, FUN = peakfinder, align = "center"), col = "red", pch = 19)
Although it would appear to me that if the center point is greater than its neighbours it is necessarily greater than the local mean too, so that part of the function only adds something when h > 0. If you want to use the global mean of the time series, just replace the calculation of meandat with a pre-calculated global mean passed as an argument to peakfinder (see the sketch below).
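A minimal sketch of that variant (my own adaptation; the helper name peakfinder_global is made up, and passing the mean through rollapply's ... argument is an assumption, not part of the original answer):
peakfinder_global = function(x, h = 0, globalmean){
  xdat = as.numeric(x)
  center = xdat[ceiling(length(xdat)/2)]
  # compare the window's center against the global mean instead of the local one
  ifelse(all(center >= xdat) & center >= (globalmean + h), center, NA)
}
points(rollapply(z, width = 7, FUN = peakfinder_global, globalmean = mean(z),
                 align = "center"), col = "blue", pch = 19)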

Detecting dips in a 2D plot

I need to automatically detect dips in a 2D plot, like the regions marked with red circles in the figure below. I'm only interested in the "main" dips, meaning the dips have to span a minimum length in the x axis. The number of dips is unknown, i.e., different plots will contain different numbers of dips. Any ideas?
Update:
As requested, here's the sample data, together with an attempt to smooth it using median filtering, as suggested by vines.
It looks like I now need a robust way to approximate the derivative at each point that ignores the little blips remaining in the data. Is there any standard approach?
y <- c(0.9943,0.9917,0.9879,0.9831,0.9553,0.9316,0.9208,0.9119,0.8857,0.7951,0.7605,0.8074,0.7342,0.6374,0.6035,0.5331,0.4781,0.4825,0.4825,0.4879,0.5374,0.4600,0.3668,0.3456,0.4282,0.3578,0.3630,0.3399,0.3578,0.4116,0.3762,0.3668,0.4420,0.4749,0.4556,0.4458,0.5084,0.5043,0.5043,0.5331,0.4781,0.5623,0.6604,0.5900,0.5084,0.5802,0.5802,0.6174,0.6124,0.6374,0.6827,0.6906,0.7034,0.7418,0.7817,0.8311,0.8001,0.7912,0.7912,0.7540,0.7951,0.7817,0.7644,0.7912,0.8311,0.8311,0.7912,0.7688,0.7418,0.7232,0.7147,0.6906,0.6715,0.6681,0.6374,0.6516,0.6650,0.6604,0.6124,0.6334,0.6374,0.5514,0.5514,0.5412,0.5514,0.5374,0.5473,0.4825,0.5084,0.5126,0.5229,0.5126,0.5043,0.4379,0.4781,0.4600,0.4781,0.3806,0.4078,0.3096,0.3263,0.3399,0.3184,0.2820,0.2167,0.2122,0.2080,0.2558,0.2255,0.1921,0.1766,0.1732,0.1205,0.1732,0.0723,0.0701,0.0405,0.0643,0.0771,0.1018,0.0587,0.0884,0.0884,0.1240,0.1088,0.0554,0.0607,0.0441,0.0387,0.0490,0.0478,0.0231,0.0414,0.0297,0.0701,0.0502,0.0567,0.0405,0.0363,0.0464,0.0701,0.0832,0.0991,0.1322,0.1998,0.3146,0.3146,0.3184,0.3578,0.3311,0.3184,0.4203,0.3578,0.3578,0.3578,0.4282,0.5084,0.5802,0.5667,0.5473,0.5514,0.5331,0.4749,0.4037,0.4116,0.4203,0.3184,0.4037,0.4037,0.4282,0.4513,0.4749,0.4116,0.4825,0.4918,0.4879,0.4918,0.4825,0.4245,0.4333,0.4651,0.4879,0.5412,0.5802,0.5126,0.4458,0.5374,0.4600,0.4600,0.4600,0.4600,0.3992,0.4879,0.4282,0.4333,0.3668,0.3005,0.3096,0.3847,0.3939,0.3630,0.3359,0.2292,0.2292,0.2748,0.3399,0.2963,0.2963,0.2385,0.2531,0.1805,0.2531,0.2786,0.3456,0.3399,0.3491,0.4037,0.3885,0.3806,0.2748,0.2700,0.2657,0.2963,0.2865,0.2167,0.2080,0.1844,0.2041,0.1602,0.1416,0.2041,0.1958,0.1018,0.0744,0.0677,0.0909,0.0789,0.0723,0.0660,0.1322,0.1532,0.1060,0.1018,0.1060,0.1150,0.0789,0.1266,0.0965,0.1732,0.1766,0.1766,0.1805,0.2820,0.3096,0.2602,0.2080,0.2333,0.2385,0.2385,0.2432,0.1602,0.2122,0.2385,0.2333,0.2558,0.2432,0.2292,0.2209,0.2483,0.2531,0.2432,0.2432,0.2432,0.2432,0.3053,0.3630,0.3578,0.3630,0.3668,0.3263,0.3992,0.4037,0.4556,0.4703,0.5173,0.6219,0.6412,0.7275,0.6984,0.6756,0.7079,0.7192,0.7342,0.7458,0.7501,0.7540,0.7605,0.7605,0.7342,0.7912,0.7951,0.8036,0.8074,0.8074,0.8118,0.7951,0.8118,0.8242,0.8488,0.8650,0.8488,0.8311,0.8424,0.7912,0.7951,0.8001,0.8001,0.7458,0.7192,0.6984,0.6412,0.6516,0.5900,0.5802,0.5802,0.5762,0.5623,0.5374,0.4556,0.4556,0.4333,0.3762,0.3456,0.4037,0.3311,0.3263,0.3311,0.3717,0.3762,0.3717,0.3668,0.3491,0.4203,0.4037,0.4149,0.4037,0.3992,0.4078,0.4651,0.4967,0.5229,0.5802,0.5802,0.5846,0.6293,0.6412,0.6374,0.6604,0.7317,0.7034,0.7573,0.7573,0.7573,0.7772,0.7605,0.8036,0.7951,0.7817,0.7869,0.7724,0.7869,0.7869,0.7951,0.7644,0.7912,0.7275,0.7342,0.7275,0.6984,0.7342,0.7605,0.7418,0.7418,0.7275,0.7573,0.7724,0.8118,0.8521,0.8823,0.8984,0.9119,0.9316,0.9512)
yy <- runmed(y, 41)
plot(y, type="l", ylim=c(0,1), ylab="", xlab="", lwd=0.5)
points(yy, col="blue", type="l", lwd=2)
EDITED: the function now strips the regions down to nothing but their lowest part, if wanted.
Actually, using the mean is easier than using the median. It allows you to find regions where the real values are continuously below the mean. The median is not smooth enough for easy application.
One example function to do this would be:
FindLowRegion <- function(x, n = length(x)/4, tol = length(x)/20, p = 0.5){
  nx <- length(x)
  n <- 2*(n %/% 2) + 1
  # smooth out based on means
  sx <- rowMeans(embed(c(rep(NA, n/2), x, rep(NA, n/2)), n), na.rm = TRUE)
  # find which stretches lie below the running mean
  rlesx <- rle((sx - x) > 0)
  # construct start and end of regions
  int <- embed(cumsum(c(1, rlesx$lengths)), 2)
  # which regions fulfill the length requirement
  id <- rlesx$values & rlesx$lengths > tol
  # cut regions down to their part below the p-quantile (the median by default)
  regions <-
    apply(int[id, ], 1, function(i){
      i <- min(i):max(i)
      tmp <- x[i]
      id <- which(tmp < quantile(tmp, p))
      id <- min(id):max(id)
      i[id]
    })
  # return
  unlist(regions)
}
where
n determines how many values are used to calculate the running mean,
tol determines how many consecutive values should be lower than the running mean to talk about a low region, and
p determines the cutoff used (as a quantile) for stripping the regions to their lowest part. When p=1, the complete lower region is shown.
The function is tweaked to work on the data as you presented it, but the numbers might need to be adjusted a bit to work with other data.
This function returns a set of indices, which allows you to find the low regions. Illustrated with your y vector :
Lows <- FindLowRegion(y)
newx <- seq_along(y)
newy <- ifelse(newx %in% Lows,y,NA)
plot(y, col="blue", type="l", lwd=2)
lines(newx, newy, col = "red", lwd = 3)
Gives :
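If the defaults pick up too many or too few regions, the same call can be repeated with explicit parameters (illustrative values only, to be tuned for your data):
Lows2 <- FindLowRegion(y, n = length(y)/3, tol = length(y)/10, p = 1)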
You have to smooth the graph in some way. Median filtering is quite useful for that purpose (see http://en.wikipedia.org/wiki/Median_filter). After smoothing, you will simply have to search for the minima, just as usual (i.e. search for the points where the 1st derivative switches from negative to positive).
A simpler answer (which also does not require smoothing) could be provided by adapting the maxdrawdown() function from the tseries package. A drawdown is commonly defined as the retreat from the most recent maximum; here we want the opposite. Such a function could then be used in a sliding window over the data, or over segmented data.
maxdrawdown <- function(x) {
  if (NCOL(x) > 1)
    stop("x is not a vector or univariate time series")
  if (any(is.na(x)))
    stop("NAs in x")
  cmaxx <- cummax(x) - x
  mdd <- max(cmaxx)
  to <- which(mdd == cmaxx)
  from <- double(NROW(to))
  for (i in 1:NROW(to))
    from[i] <- max(which(cmaxx[1:to[i]] == 0))
  return(list(maxdrawdown = mdd, from = from, to = to))
}
So instead of using cummax(), one would have to switch to cummin() etc.
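For illustration, a minimal sketch of what that mirror image might look like (my own adaptation with a made-up name, not a function from tseries):
maxrunup <- function(x) {
  if (NCOL(x) > 1)
    stop("x is not a vector or univariate time series")
  if (any(is.na(x)))
    stop("NAs in x")
  cminx <- x - cummin(x)                 # distance above the running minimum
  mru <- max(cminx)
  to <- which(mru == cminx)              # where the largest recovery tops out
  from <- double(NROW(to))
  for (i in 1:NROW(to))
    from[i] <- max(which(cminx[1:to[i]] == 0))   # the bottom of the corresponding dip
  list(maxrunup = mru, from = from, to = to)
}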
My first thought was something much cruder than filtering. Why not look for the big drops followed by long enough stable periods?
span.b <- 20
threshold.b <- 0.2
dy.b <- c(rep(NA, span.b), diff(y, lag = span.b))  # backward difference: how far we dropped over the last span.b points
span.f <- 10
threshold.f <- 0.05
dy.f <- c(diff(y, lag = span.f), rep(NA, span.f))  # forward difference: how stable the next span.f points are
down <- which(dy.b < -1 * threshold.b & abs(dy.f) < threshold.f)
plot(y, type = "l")
abline(v = down)
The plot shows that it's not perfect, but it doesn't discard the outliers (I guess it depends on your take on the data).

Graph to compare two matrices in R

I have two matrices (of approximately 300 x 100) and I would like to plot a graph to see the parts of the first one that are higher than those of the second.
I can do, for instance:
# Calculate the matrices and put them into m1 and m2
# Note that the values are between -1 and 1
par(mfrow=c(1,3))
image(m1, zlim=c(-1,1))
image(m2, zlim=c(-1,1))
image(m1-m2, zlim=c(0,1))
This will plot only the desired regions in the 3rd plot but I would like to do something a bit different, like putting a line around those areas over the first plot in order to highlight them directly there.
Any idea how I can do that?
Thank you
nico
How about:
par(mfrow = c(1, 3))
image(m1, zlim = c(-1, 1))
contour(m1 - m2, add = TRUE)
image(m2, zlim = c(-1, 1))
contour(m1 - m2, add = TRUE)
image(m1 - m2, zlim = c(0, 1))
contour(m1 - m2, add = TRUE)
This adds a contour map around the regions, which sort of puts rings around the areas of the 3rd plot (you might want to fiddle with the levels/nlevels arguments of contour to get fewer 'circles'; see the sketch below).
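For example, to get a single outline around the regions where m1 exceeds m2 rather than a full contour map, you could restrict the contour to the zero level (a sketch, assuming m1 and m2 are the matrices from the question):
par(mfrow = c(1, 1))
image(m1, zlim = c(-1, 1))
contour(m1 - m2, levels = 0, add = TRUE, drawlabels = FALSE, lwd = 2)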
Another way of doing your third image might be:
image(m1>m2)
this produces a matrix of TRUE/FALSE values which gets imaged as 0/1, so you have a two-colour image. Still not sure about your 'putting a line around' thing though...
Here's some code I wrote to do something similar. I wanted to highlight contiguous regions above a 0.95 threshold by drawing a box round them, so I got all the grid squares above 0.95 and did a clustering on them. Then a bit of fiddling with the clustering output gets the rectangle coordinates of the regions:
computeHotspots = function(xyz, thresh, minsize = 1, margin = 1){
  ### given a list(x,y,z), return a matrix where each row
  ### is a (xmin,xmax,ymin,ymax) bounding box of a contiguous area
  ### over the given threshold.
  ### or approximately. let's use the clustering tools in R...
  overs <- which(xyz$z > thresh, arr.ind = TRUE)
  if(length(overs) == 0){
    ## found no hotspots
    return(NULL)
  }
  if(length(overs) == 2){
    ## found one hotspot
    xRange <- cbind(xyz$x[overs[,1]], xyz$x[overs[,1]])
    yRange <- cbind(xyz$y[overs[,2]], xyz$y[overs[,2]])
  }else{
    oTree <- hclust(dist(overs), method = "single")
    oCut <- cutree(oTree, h = 10)
    oXYc <- data.frame(x = xyz$x[overs[,1]], y = xyz$y[overs[,2]], oCut)
    xRange <- do.call("rbind", tapply(oXYc[,1], oCut, range))
    yRange <- do.call("rbind", tapply(oXYc[,2], oCut, range))
  }
  ### add user margins
  xRange[,1] <- xRange[,1] - margin
  xRange[,2] <- xRange[,2] + margin
  yRange[,1] <- yRange[,1] - margin
  yRange[,2] <- yRange[,2] + margin
  ## put it all together, enforcing the minimum box size
  xr <- apply(xRange, 1, diff)
  xm <- apply(xRange, 1, mean)
  xRange[xr < minsize, 1] <- xm[xr < minsize] - (minsize/2)
  xRange[xr < minsize, 2] <- xm[xr < minsize] + (minsize/2)
  yr <- apply(yRange, 1, diff)
  ym <- apply(yRange, 1, mean)
  yRange[yr < minsize, 1] <- ym[yr < minsize] - (minsize/2)
  yRange[yr < minsize, 2] <- ym[yr < minsize] + (minsize/2)
  cbind(xRange, yRange)
}
Test code:
x=1:23
y=7:34
m1=list(x=x,y=y,z=outer(x,y,function(x,y){sin(x/3)*cos(y/3)}))
image(m1)
hs = computeHotspots(m1,0.95)
That should give you a matrix of rectangle coordinates:
> hs
  [,1] [,2] [,3] [,4]
1   13   15    8   11
2    3    6   17   20
3   22   24   18   20
4   13   16   27   30
Now you can draw them over the image with rect:
image(m1)
rect(hs[,1],hs[,3],hs[,2],hs[,4])
and to show they are where they should be:
image(list(x=m1$x,y=m1$y,z=m1$z>0.95))
rect(hs[,1],hs[,3],hs[,2],hs[,4])
You could of course adapt this to draw circles, but more complex shapes would be tricky. It works best when the regions of interest are fairly compact.
Barry
