Related
I'm trying to find sites to collect snails by using a semi-random selection method. I have set a 10km2 grid around the region I want to collect snails from, which is broken into 10,000 10m2 cells. I want to randomly this grid in R to select 200 field sites.
Randomly sampling a matrix in R is easy enough;
dat <- matrix(1:10000, nrow = 100)
sample(dat, size = 200)
However, I want to bias the sampling to pick cells closer to a single position (representing sites closer to the research station). It's easier to explain this with an image;
The yellow cell with a cross represents the position I want to sample around. The grey shading is the probability of picking a cell in the sample function, with darker cells being more likely to be sampled.
I know I can specify sampling probabilities using the prob argument in sample, but I don't know how to create a 2D probability matrix. Any help would be appreciated, I don't want to do this by hand.
I'm going to do this for a 9 x 6 grid (54 cells), just so it's easier to see what's going on, and sample only 5 of these 54 cells. You can modify this to a 100 x 100 grid where you sample 200 from 10,000 cells.
# Number of rows and columns of the grid (modify these as required)
nx <- 9 # rows
ny <- 6 # columns
# Create coordinate matrix
x <- rep(1:nx, each=ny);x
y <- rep(1:ny, nx);y
xy <- cbind(x, y); xy
# Where is the station? (edit: not snails nest)
Station <- rbind(c(x=3, y=2)) # Change as required
# Determine distance from each grid location to the station
library(SpatialTools)
D <- dist2(xy, Station)
From the help page of dist2
dist2 takes the matrices of coordinates coords1 and coords2 and
returns the inter-Euclidean distances between coordinates.
We can visualize this using the image function.
XY <- (matrix(D, nr=nx, byrow=TRUE))
image(XY) # axes are scaled to 0-1
# Create a scaling function - scales x to lie in [0-1)
scale_prop <- function(x, m=0)
(x - min(x)) / (m + max(x) - min(x))
# Add the coordinates to the grid
text(x=scale_prop(xy[,1]), y=scale_prop(xy[,2]), labels=paste(xy[,1],xy[,2],sep=","))
Lighter tones indicate grids closer to the station at (3,2).
# Sampling probabilities will be proportional to the distance from the station, which are scaled to lie between [0 - 1). We don't want a 1 for the maximum distance (m=1).
prob <- 1 - scale_prop(D, m=1); range (prob)
# Sample from the grid using given probabilities
sam <- sample(1:nrow(xy), size = 5, prob=prob) # Change size as required.
xy[sam,] # Thse are your (**MY!**) 5 samples
x y
[1,] 4 4
[2,] 7 1
[3,] 3 2
[4,] 5 1
[5,] 5 3
To confirm the sample probabilities are correct, you can simulate many samples and see which coordinates were sampled the most.
snail.sam <- function(nsamples) {
sam <- sample(1:nrow(xy), size = nsamples, prob=prob)
apply(xy[sam,], 1, function(x) paste(x[1], x[2], sep=","))
}
SAMPLES <- replicate(10000, snail.sam(5))
tab <- table(SAMPLES)
cols <- colorRampPalette(c("lightblue", "darkblue"))(max(tab))
barplot(table(SAMPLES), horiz=TRUE, las=1, cex.names=0.5,
col=cols[tab])
If using a 100 x 100 grid and the station is located at coordinates (60,70), then the image would look like this, with the sampled grids shown as black dots:
There is a tendency for the points to be located close to the station, although the sampling variability may make this difficult to see. If you want to give even more weight to grids near the station, then you can rescale the probabilities, which I think is ok to do, to save costs on travelling, but these weights need to be incorporated into the analysis when estimating the number of snails in the whole region. Here I've cubed the probabilities just so you can see what happens.
sam <- sample(1:nrow(xy), size = 200, prob=prob^3)
The tendency for the points to be located near the station is now more obvious.
There may be a better way than this but a quick way to do it is to randomly sample on both x and y axis using a distribution (I used the normal - bell shaped distribution, but you can really use any). The trick is to make the mean of the distribution the position of the research station. You can change the bias towards the research station by changing the standard deviation of the distribution.
Then use the randomly selected positions as your x and y coordinates to select the positions.
dat <- matrix(1:10000, nrow = 100)
#randomly selected a position for the research station
rs <- c(80,30)
# you can change the sd to change the bias
x <- round(rnorm(400,mean = rs[1], sd = 10))
y <- round(rnorm(400, mean = rs[2], sd = 10))
position <- rep(NA, 200)
j = 1
i = 1
# as some of the numbers sampled can be outside of the area you want I oversampled # and then only selected the first 200 that were in the area of interest.
while (j <= 200) {
if(x[i] > 0 & x[i] < 100 & y[i] > 0 & y [i]< 100){
position[j] <- dat[x[i],y[i]]
j = j +1
}
i = i +1
}
plot the results:
plot(x,y, pch = 19)
points(x =80,y = 30, col = "red", pch = 19) # position of the station
I have data which is movement of an object in 3D space at regular time intervals. Data is as below:
Time X Y Z
1 1 1 1
2 2 1 2
3 2 0 1
4 3 2 1
.....
(x,y,z) is the position of object at time t. I want to plot a 3D graph where it shows the complete movement of object in 3d space, but to have a slider or something of that sort where I can select a time range (say 500 to 750) and see the movement of the object in 3D space. So, here we have 4 dimensions: x,y,z are positions and time as 4th dimension and use a slider to control the plotting of points with in that time. [Example in Mathematica below gives a good idea about this]
To make it more clear. We first draw the complete movement of the object in 3D space from time 1 to N. Then, by controlling the slider, we draw the movement of same object between t1 to t2 time stamps. It is also important to display at what time the slider is at (as I have to make a note of some interested time stamps based on the movement).
I have Googled the same, but no example was close enough to get me what I want. All of those bind the slider to one of the axis variables (say x or y which might be time) but we have to bind it to 4th dimension, time. dygraphs was promising but I had similar issues as discussed above (also, didn't find any 3d support).
This one in Mathematica is interesting. But I don't have license for it. It just moves a point on the 3D path traced. This can solve my problem as well, but I should be able to know the time-stamp values when I pause it.
Solution in R is good for me because it does not have any licensing issues. Or in Matlab if it does not use any advanced visualization toolboxes. Or Python.
Thanks in Advance.
This is a raw example that can be customized as desired. It uses manipulate and plot3D
library(manipulate)
library(plot3D)
min_time <- 1
max_time <- 100
time_interval <- min_time:max_time
# Create data frame
DF <- data.frame(t = time_interval)
# Time parametric functions
X <- function(t) {
return(2 * t)
}
Y <- function(t) {
return(t ** 2)
}
Z <- function(t) {
return(10 * cos(t / 100))
}
# Update data frame
DF$x <- sapply(DF$t, X)
DF$y <- sapply(DF$t, Y)
DF$z <- sapply(DF$t, Z)
# Use manipulate with RStudio
manipulate({
lines3D(x = DF$x, y = DF$y, z = DF$z)
scatter3D(
x = DF$x[t],
y = DF$y[t],
z = DF$z[t],
add = TRUE
)
}, t = slider(min_time, max_time))
I would like to create a 2D levelplot in R where the x and y coordinates are from an irregular grid without using interpolation. The grid is given below:
grid<-cbind(seq(from=0.05,to=0.5,by=0.05),seq(from=0.05,to=0.5,by=0.05))
grid<-rbind(grid,cbind(seq(from=0.0,to=0.95,by=0.05),seq (from=0.05,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.9,by=0.05),seq (from=0.1,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.85,by=0.05),seq(from=0.15,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.75,by=0.05),seq(from=0.25,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.80,by=0.05),seq(from=0.20,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.70,by=0.05),seq(from=0.30,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.65,by=0.05),seq(from=0.35,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.60,by=0.05),seq(from=0.40,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.55,by=0.05),seq(from=0.45,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.50,by=0.05),seq(from=0.50,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.40,by=0.05),seq(from=0.60,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.45,by=0.05),seq(from=0.55,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=.35,by=0.05),seq(from=0.65,to=1,by=0.05)))
grid<-rbind(grid,cbind(seq(from=0,to=0.30,by=0.05),seq(from=0.70,to=1,by=0.05)))
x=grid[,1]
y=grid[,2]
The Z-values are stored in another vector. I have tried to use the image-function, but without any luck. For instance, if I try
image(x,y,height.vals)
where
height.vals=matrix(runif(dim(grid)[1]),nrow=dim(grid)[1],ncol=1)
I get an error message saying that x and y should be increasing.
One could use the akima function interp, but then I get interpolated data.
Looks like you have points on a 20 x 20 grid. So, you can create a 20 x 20 matrix and fill it with the values from height.vals.
With a little bit of tweaking, you can turn the x and y values into indices of the matrix and use those indices to assign height.vals to the appropriate places in the matrix.
# Turn the x and y values into integers.
# R doesn't take 0 as an index, so add 1 to the x values to get rid of the 0s
inds <- cbind(x = as.integer(20*x + 1), y = as.integer(20*y))
# create the 20 x 20 matrix
m <- matrix(nrow = 20, ncol = 20)
# fill the matrix with height.vals based on the indices
m[inds] <- height.vals
Then, you can use m as an input to functions like image, filled.contour, and lattice::levelplot
image(m)
I am trying to calculate the area generated (in orange) by an arbitrary point in the space. here are some example pictures of different possible scenarios:
So basically in all three pictures I want to be able to calculate the orange area that is generated from point by drawing a horizontal and vertical line from the point to the blue area. The idea is simple but actually implementing is very challenging. I am writing this code in R so any help with R code would be great. Also, for the third example, we can just assume that the orange area is bounded at x and y equal to 8. And, we also know the coordinates of the green points. Any suggestion greatly appreciated!
Oh an here is my code for generating the plots below:
x = c(1,3,5)
y = c(5,3,1)
point1 = c(2,4)
point2 = c(2,2)
point3 = c(0,0)
plot(x,y,type="n",xlim=c(0,8),ylim=c(0,8))
rect(point1[1],point1[2],max(x)+10,max(y)+10,col="orange",border=NA)
rect(x,y,max(x)+10,max(y)+10,col="lightblue",border=NA)
points(x,y,pch=21,bg="green")
points(point1[1],point1[2],pch=21,bg="blue")
box()
plot(x,y,type="n",xlim=c(0,8),ylim=c(0,8))
rect(point2[1],point2[2],max(x)+10,max(y)+10,col="orange",border=NA)
rect(x,y,max(x)+10,max(y)+10,col="lightblue",border=NA)
points(x,y,pch=21,bg="green")
points(point2[1],point2[2],pch=21,bg="blue")
box()
plot(x,y,type="n",xlim=c(0,8),ylim=c(0,8))
rect(point3[1],point3[2],max(x)+10,max(y)+10,col="orange",border=NA)
rect(x,y,max(x)+10,max(y)+10,col="lightblue",border=NA)
points(x,y,pch=21,bg="green")
points(point3[1],point3[2],pch=21,bg="blue")
box()
You're working much harder than necessary. pracma::polyarea will calculate the area of any polygon given the coordinates of all vertices.
Think about the entire plotting region as an unequal grid of rectangles, with x- and y-grid points at the x- and y-coordinates of the rectangle vertices you're plotting.
x <- c(1, 3, 5)
y <- c(5, 3, 1)
max.x <- max(x) + 10
max.y <- max(y) + 10
point <- c(0, 0)
x.grid <- sort(unique(c(x, point[1], max.x)))
x.grid
# [1] 0 1 3 5 15
y.grid <- sort(unique(c(y, point[2], max.y)))
y.grid
# [1] 0 1 3 5 15
We'll keep track of the grid rectangles we painted orange with the matrix orange:
orange <- matrix(FALSE, nrow=length(y.grid)-1, ncol=length(x.grid)-1)
We'll make a plotting function that labels cells in orange based on the passed rectangle, with (x1, y1) as lower left and (x2, y2) as upper right:
plot.rect <- function(x1, y1, x2, y2, value) {
x1.idx <- which(x.grid == x1)
y1.idx <- which(y.grid == y1)
x2.idx <- which(x.grid == x2)
y2.idx <- which(y.grid == y2)
orange[y1.idx:(y2.idx-1),x1.idx:(x2.idx-1)] <<- value
}
Then, let's plot our orange rectangle (filling in TRUE) followed by all the blue ones (filling in FALSE):
plot.rect(point[1], point[2], max.x, max.y, TRUE)
for (idx in 1:length(x)) {
plot.rect(x[idx], y[idx], max.x, max.y, FALSE)
}
Finally, let's compute the size of each grid rectangle, enabling the final size computation (the point I selected at the top corresponds to your third plot; since the plot extends up 15 and to the right 15, it appears to be working as intended):
sizes <- t(outer(diff(x.grid), diff(y.grid)))
area <- sum(orange * sizes)
area
# [1] 41
I have two matrices (of approximately 300 x 100) and I would like to plot a graph to see the parts of the first one that are higher than those of the second.
I can do, for instance:
# Calculate the matrices and put them into m1 and m2
# Note that the values are between -1 and 1
par(mfrow=c(1,3))
image(m1, zlim=c(-1,1))
image(m2, zlim=c(-1,1))
image(m1-m2, zlim=c(0,1))
This will plot only the desired regions in the 3rd plot but I would like to do something a bit different, like putting a line around those areas over the first plot in order to highlight them directly there.
Any idea how I can do that?
Thank you
nico
How about:
par(mfrow = c(1, 3))
image(m1, zlim = c(-1, 1))
contour(m1 - m2, add = TRUE)
image(m2, zlim = c(-1, 1))
contour(m1 - m2, add = TRUE)
image(m1 - m2, zlim = c(0, 1))
contour(m1 - m2, add = TRUE)
This adds a contour map around the regions. Sort of puts rings around the areas of the 3rd plot (might want to fiddle with the (n)levels of the contours to get fewer 'circles').
Another way of doing your third image might be:
image(m1>m2)
this produces a matrix of TRUE/FALSE values which gets imaged as 0/1, so you have a two-colour image. Still not sure about your 'putting a line around' thing though...
Here's some code I wrote to do something similar. I wanted to highlight contiguous regions above a 0.95 threshold by drawing a box round them, so I got all the grid squares above 0.95 and did a clustering on them. Then do a bit of fiddling with the clustering output to get the rectangle coordinates of the regions:
computeHotspots = function(xyz, thresh, minsize=1, margin=1){
### given a list(x,y,z), return a data frame where each row
### is a (xmin,xmax,ymin,ymax) of bounding box of a contiguous area
### over the given threshhold.
### or approximately. lets use the clustering tools in R...
overs <- which(xyz$z>thresh,arr.ind=T)
if(length(overs)==0){
## found no hotspots
return(NULL)
}
if(length(overs)==2){
## found one hotspot
xRange <- cbind(xyz$x[overs[,1]],xyz$x[overs[,1]])
yRange <- cbind(xyz$y[overs[,2]],xyz$y[overs[,2]])
}else{
oTree <- hclust(dist(overs),method="single")
oCut <- cutree(oTree,h=10)
oXYc <- data.frame(x=xyz$x[overs[,1]],y=xyz$y[overs[,2]],oCut)
xRange <- do.call("rbind",tapply(oXYc[,1],oCut,range))
yRange <- do.call("rbind",tapply(oXYc[,2],oCut,range))
}
### add user-margins
xRange[,1] <- xRange[,1]-margin
xRange[,2] <- xRange[,2]+margin
yRange[,1] <- yRange[,1]-margin
yRange[,2] <- yRange[,2]+margin
## put it all together
xr <- apply(xRange,1,diff)
xm <- apply(xRange,1,mean)
xRange[xr<minsize,1] <- xm[xr<minsize]-(minsize/2)
xRange[xr<minsize,2] <- xm[xr<minsize]+(minsize/2)
yr <- apply(yRange,1,diff)
ym <- apply(yRange,1,mean)
yRange[yr<minsize,1] <- ym[yr<minsize]-(minsize/2)
yRange[yr<minsize,2] <- ym[yr<minsize]+(minsize/2)
cbind(xRange,yRange)
}
Test code:
x=1:23
y=7:34
m1=list(x=x,y=y,z=outer(x,y,function(x,y){sin(x/3)*cos(y/3)}))
image(m1)
hs = computeHotspots(m1,0.95)
That should give you a matrix of rectangle coordinates:
> hs
[,1] [,2] [,3] [,4]
1 13 15 8 11
2 3 6 17 20
3 22 24 18 20
4 13 16 27 30
Now you can draw them over the image with rect:
image(m1)
rect(hs[,1],hs[,3],hs[,2],hs[,4])
and to show they are where they should be:
image(list(x=m1$x,y=m1$y,z=m1$z>0.95))
rect(hs[,1],hs[,3],hs[,2],hs[,4])
You could of course adapt this to draw circles, but more complex shapes would be tricky. It works best when the regions of interest are fairly compact.
Barry