Regarding two raster layers that do not match exactly because of defective data: I would like to know how to determine the x/y shift between these two layers so I can align them properly using raster::shift().
I have already tried to investigate the x/y shift using QGIS, but I only found the georeferencing tool, which relocates raster layers but is not interactive. I am looking for a way to move my defective raster over a basemap and read off the resulting x/y shift.
I am NOT looking for a solution where I have to set specific georeferencing points to align the two raster layers, since I am working in a highly dynamic landscape where it is difficult to find matching points, but where it is possible to align the raster layers by the textural information provided by the datasets.
A code example should look like the solution provided by user dTanMan (https://gis.stackexchange.com/users/77712/dtanman) in this post: https://gis.stackexchange.com/a/201750
raster <- raster()
raster <- shift(raster, x=5, y=-15)
Thanks a lot in advance. Cheers, ExploreR
Perhaps you can use something like this:
Example data
library(raster)
a <- raster(ncol=20, nrow=20, xmn=0,xmx=20,ymn=0,ymx=20)
values(a) <- 1:400
set.seed(3)
b <- a + runif(400)
Function to compare similarity of cell values
rmse <- function(obs, prd) {
  sqrt(mean((obs-prd)^2, na.rm=TRUE))
}
Values from reference raster. May need to take a sample if raster is very large
nsamples <- 10000
s <- sampleRegular(a, nsamples, cells=TRUE)
sample_a <- s[,2]
Locations to be compared
xy <- xyFromCell(a, s[,1])
Test range for cell shifts
xrange <- -5:5 * xres(a)
yrange <- -5:5 * yres(a)
Matrix to store the results in
result <- cbind(rep(xrange, each=length(yrange)), rep(yrange, length(xrange)), NA)
colnames(result) <- c("dx", "dy", "rmse")
Loop over cell-shift combinations
i <- 1
for (dx in xrange) {
  for (dy in yrange) {
    x <- shift(b, dx, dy)
    sample_b <- extract(x, xy)
    result[i,3] <- rmse(sample_a, sample_b)
    i <- i + 1
  }
}
Results suggest that dx=0 and dy=0 is the best in this case.
r <- result[order(result[,3]), ]
head(r)
#     dx dy      rmse
#[1,]  0  0 0.5734866
#[2,]  1  0 0.5800670
#[3,] -1  0 1.5252878
#[4,]  2  0 1.5302921
#[5,] -2  0 2.5153573
#[6,]  3  0 2.5157728
Test
bb <- shift(b, dx=r[1,1], dy=r[1,2])
rmse(values(a), values(bb))
#[1] 0.5734866
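If sub-cell precision matters, the same idea can be repeated on a finer grid around the best coarse shift, using bilinear extraction so that fractional shifts actually change the sampled values. A minimal sketch, assuming a, b, r, sample_a, xy and rmse() from above are still in memory:
# search a 0.1-cell grid around the best coarse shift
fine <- expand.grid(dx = r[1,1] + seq(-0.5, 0.5, 0.1) * xres(a),
                    dy = r[1,2] + seq(-0.5, 0.5, 0.1) * yres(a))
# RMSE for each candidate sub-cell shift
fine$rmse <- apply(fine, 1, function(p) {
  rmse(sample_a, extract(shift(b, p[1], p[2]), xy, method="bilinear"))
})
# best refined shift
fine[which.min(fine$rmse), ]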
Related
I have two rasters that are the same size and contain data from the same location, but different types of data (one raster has slope data and the other has aspect data). I want to be able to look at slope data for one aspect at a time, so I was trying to create a setup (maybe an if/else statement?) where, if an aspect condition was satisfied in one raster, the slope data would get pulled from that same pixel in the other raster.
#I have a slope and an aspect raster that I pulled
library(raster)
library(rgdal)
library(sp)
aspect <- raster("geotiff name here")
slope <- raster("geotiff name here")
#Looking at the north aspect (between 0-22.5 degrees or 337.5-360 degrees)
#First I am setting the pixels in the aspect raster that correspond to north
#equal to 1, and the values that don't = 0
aspect[aspect >= 0 & aspect <= 22.5] <- 1
aspect[aspect >= 337.5 & aspect <= 360] <- 1
aspect[aspect > 22.5 & aspect < 337.5] <- 0
#Here I am saving the indices of the cells that face north to a new variable
#(note raster's Which(), with a capital W, supports cells=TRUE)
north <- Which(aspect == 1, cells = TRUE)
Then I want to read only the data from the pixels of the slope raster that got assigned a TRUE value in the aspect raster, but this is where I've gotten stumped! I've started using R very recently, so there is probably an easy way to do this that I'm missing; any help is appreciated. Thank you very much!
Always include example data (see the help files for inspiration; here, from ?raster::terrain):
library(raster)
x <- getData('alt', country='CHE')
aspect <- terrain(x, 'aspect', unit='degrees')
slope <- terrain(x, 'slope', unit='degrees')
This is a better way to reclassify:
m <- matrix(c(0,22.5,1,22.5,337.50,0,337.5,360,1), ncol=3, byrow=TRUE)
aspectcls <- reclassify(aspect, m)
Get the slope data where aspectcls != 0
nslope <- mask(slope, aspectcls, maskvalue=0)
Get the values
v <- values(nslope)
boxplot(v)
You could also do
crosstab(aspectcls, slope)
I would not recommend the path you took, but if you took it, you could do
cells <- Which(aspectcls, cells=T)
vv <- slope[cells]
boxplot(vv)
You don't need to convert 1 to TRUE, as R does this automatically. Try this code:
#create a data frame
data <- data.frame(aspect=values(aspect), slope=values(slope))
#create a 'north' column and populate with 1
data$north <- 1
#those that don't meet the north criteria are converted to 0
data$north[data$aspect > 22.5 & data$aspect < 337.5] <- 0
#report the 'slope' values where north=1
data$slope[data$north == 1]
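If you want a summary rather than the full vector of values, you could then do, for example:
# mean slope of the north-facing cells, ignoring NAs
mean(data$slope[data$north == 1], na.rm=TRUE)
boxplot(data$slope[data$north == 1])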
I have run Maxent for multiple species under present conditions and also under future climate change scenarios, and I was quantifying changes between present and future suitable habitat using the nicheOverlap function and Schoener's D statistic. Quite a few of the organisms in my study are simply moving farther up mountains, so there is a lot of overlap: the future distribution sits inside the present distribution (just occupying less area at higher elevations). By looking at the ASCII files in QGIS I can see that there is less suitable habitat, in terms of area, in the future, so I want to quantify this. I have scoured the internet for a good way to calculate area for rasters and never found anything that quite suited my needs. I therefore wrote up something that is an amalgamation of bits and pieces of various scripts. It is pasted below.
Two questions:
1) Do you all agree this is doing what I think it is doing (calculating area in square kilometers)?
2) Is there a way to simplify this? Specifically, you'll see I go from a raster to a data frame and back to a raster. Maybe I could stay in rasters?
Thanks for any input!
Rebecca
####
library(raster)
#load rasters
m <- raster("SpeciesA_avg.asc")
mf <- raster("SpeciesA_future_layers_avg.asc")
#change to dataframe
m.df <- as.data.frame(m, xy=TRUE)
#get rid of NAs
m.df1 <- na.omit(m.df)
#keep only cells that have a suitability score above 0.5 (scores range from 0 to 1)
m.df2 <- m.df1[m.df1$SpeciesA_avg > 0.5, ]
#re-rasterize just the suitable area
m.raster <- rasterFromXYZ(m.df2)
##same as above but for future projection
mf.df <- as.data.frame(mf, xy=TRUE)
mf.df1 <- na.omit(mf.df)
mf.df2 <- mf.df1[mf.df1$SpeciesA_future_layers_avg > 0.5, ]
mf.raster <- rasterFromXYZ(mf.df2)
#get sizes of all cells in current distribution raster
#note my original layers were 30 arc-seconds, i.e. roughly 1 km2
cell_size <- area(m.raster, na.rm=TRUE, weights=FALSE)
#delete NAs from all raster cells. It looks like these come back when switching from data frame to raster
cell_size1 <- cell_size[!is.na(cell_size)]
#compute area [km2] of all cells in raster
raster_area_present <- length(cell_size1) * median(cell_size1)
raster_area_present
#get sizes of all cells in future raster [km2]
cell_size <- area(mf.raster, na.rm=TRUE, weights=FALSE)
#delete NAs from vector of all raster cells
cell_size1 <- cell_size[!is.na(cell_size)]
#compute area [km2] of all cells in future raster
raster_area_future <- length(cell_size1) * median(cell_size1)
raster_area_future
##calculate change in area
dif_area <- raster_area_present - raster_area_future
dif_area
When you ask a question, you should provide a simple self-contained example, not just dump your script pointing to files we do not have. Writing a simple example teaches you R, and often helps you solve the problem by yourself. Anyway, here is some example data and a solution to your problem, I think:
library(raster)
#example data
m <- mf <- raster(ncol=10, nrow=10, vals=0)
m[,1] <- NA
m[,3:7] <- 1
mf[,6:9] <- 1
# get rid of NAs (the example has none); should not be needed
m <- reclassify(m, cbind(NA, NA, 0))
mf <- reclassify(mf, cbind(NA, NA, 0))
# keep cells > 0.5 (scores range from 0 to 1)
m <- round(m)
mf <- round(mf)
# now combine the two layers, for example:
x <- m + mf * 10
# area of each cell
a <- area(x)
# sum area by class
z <- zonal(a, x, sum)
#     zone     value
#[1,]    0 152327547
#[2,]    1 152327547
#[3,]   10 101551698
#[4,]   11 101551698
Zone 0 is "neither current nor future", 1 is "current only", 10 is "future only", and 11 is "current and future".
The areas are in km^2 (raster::area() returns square kilometers for rasters with longitude/latitude coordinates).
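To pull the present and future totals straight out of z, one could sum the matching zones. A small sketch, assuming the zone coding above (the columns of z are zone and value):
# zones 1 and 11 contain "current"; zones 10 and 11 contain "future"
area_present <- sum(z[z[,1] %in% c(1, 11), 2])
area_future  <- sum(z[z[,1] %in% c(10, 11), 2])
area_present - area_future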
You may want to check out this tutorial on Maxent and other species distribution models: http://rspatial.org/sdm/
I want to draw a heatmap.
I have a 100k x 100k square matrix (50 GB as CSV; the values sit in the upper triangle and everything else is filled with 0).
I want to ask: how can I draw a heatmap of this huge dataset with R?
I'm trying this code on a machine with a lot of RAM.
d = read.table("data.csv", sep=",")
d = as.matrix(d + t(d))
heatmap(d)
I tried some libraries like heatmap.2 (in gplots) and others.
But they take too much time and memory.
What I suggest is to heavily downsample your matrix before plotting the heatmap, e.g. by taking the mean of each submatrix (as suggested by @IaroslavDomin):
# example of big mx 10k x 10 k
bigMx <- matrix(rnorm(10000*10000,mean=0,sd=100),10000,10000)
# here we downsample the big matrix 10k x 10k to 100x100
# by averaging each submatrix
downSampledMx <- matrix(NA,100,100)
subMxSide <- nrow(bigMx)/nrow(downSampledMx)
for(i in 1:nrow(downSampledMx)){
  rowIdxs <- ((subMxSide*(i-1)):(subMxSide*i-1))+1
  for(j in 1:ncol(downSampledMx)){
    colIdxs <- ((subMxSide*(j-1)):(subMxSide*j-1))+1
    downSampledMx[i,j] <- mean(bigMx[rowIdxs,colIdxs])
  }
}
# NA to disable the dendrograms
heatmap(downSampledMx,Rowv=NA,Colv=NA)
For sure, with your huge matrix it will take a while to compute downSampledMx, but it should be feasible.
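As an aside, if the full matrix will not fit in memory at all, the downsampled matrix could be built while reading the CSV in row chunks. A rough sketch, assuming data.csv has no header, is comma-separated, and has dimensions that divide evenly by the block size:
n <- 100000                  # full matrix side
newSize <- 100               # downsampled side
blk <- n / newSize           # rows/cols per block
downSampledMx <- matrix(NA, newSize, newSize)
con <- file("data.csv", open="r")
for (i in 1:newSize) {
  # read the next blk rows from the open connection
  chunk <- as.matrix(read.table(con, sep=",", nrows=blk, colClasses="numeric"))
  for (j in 1:newSize) {
    colIdxs <- ((j-1)*blk + 1):(j*blk)
    downSampledMx[i, j] <- mean(chunk[, colIdxs])
  }
}
close(con)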
EDIT:
I think downsampling should preserve recognizable "macro-patterns"; e.g. see the following example:
# create a matrix with some recognizable pattern
set.seed(123)
bigMx <- matrix(rnorm(50*50,mean=0,sd=100),50,50)
diag(bigMx) <- max(bigMx) # set maximum value on the diagonal
# set maximum value on a circle centered on the middle
for(i in 1:nrow(bigMx)){
  for(j in 1:ncol(bigMx)){
    if(abs((i - 25)^2 + (j - 25)^2 - 10^2) <= 16)
      bigMx[i,j] <- max(bigMx)
  }
}
# plot the original heatmap
heatmap(bigMx,Rowv=NA,Colv=NA, main="original")
# function used to down sample
downSample <- function(m, newSize){
  downSampledMx <- matrix(NA, newSize, newSize)
  subMxSide <- nrow(m)/nrow(downSampledMx)
  for(i in 1:nrow(downSampledMx)){
    rowIdxs <- ((subMxSide*(i-1)):(subMxSide*i-1))+1
    for(j in 1:ncol(downSampledMx)){
      colIdxs <- ((subMxSide*(j-1)):(subMxSide*j-1))+1
      downSampledMx[i,j] <- mean(m[rowIdxs,colIdxs])
    }
  }
  return(downSampledMx)
}
# downsample x 2 and plot heatmap
downSampledMx <- downSample(bigMx,25)
heatmap(downSampledMx,Rowv=NA,Colv=NA, main="downsample x 2")
# downsample x 5 and plot heatmap
downSampledMx <- downSample(bigMx,10)
heatmap(downSampledMx,Rowv=NA,Colv=NA, main="downsample x 5")
Here are the 3 heatmaps:
I have created a 3D map using rgl.surface(), mainly following Shane's answer in this post. Using my own data, I get this map:
On top of this surface map, I would like to add a map of vegetation density such that I obtain something like this (obtained with the software Surfer):
Is it possible to do this with rgl or, for that matter, any other package in R, or is the only solution to have two maps, as in Shane's answer?
Thank you.
Edit:
Following @gsk3's request, here is the code for this map:
library(rgl)
# Read the z (i.e. elevation) dimension from file
z1 = matrix(scan("myfile.txt"), nrow=256, ncol=256, byrow=TRUE)
z = z1 # z is what gets plotted below
#create / open x y (i.e. easting and northing coordinates) dimensions
y = 8*(1:ncol(z)) # grid points are 8 m apart
x = 8*(1:nrow(z))
# See https://stackoverflow.com/questions/1896419/plotting-a-3d-surface-plot-with-contour-map-overlay-using-r for details of code below
zlim <- range(z)
zlen <- zlim[2] - zlim[1] + 1
colorlut <- terrain.colors(zlen,alpha=0) # height color lookup table
col <- colorlut[ z-zlim[1]+1 ] # assign colors to heights for each point
open3d()
rgl.surface(x,y,z)
I can't post the elevation data because there are 65536 (i.e. 256*256) points, but it is a matrix which looks like this
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 1513.708 1513.971 1514.067 1513.971 1513.875
[2,] 1513.622 1513.524 1513.578 1513.577 1513.481
and so on.
Same for the vegetation density map, which is exactly the same format and for which I have a single value for each x*y point. I hope this makes things a bit clearer...?
Edit 2, final version
This is the map I have produced with R. I haven't got the legend on it yet but this is something I'll do at a later stage.
The final code for this is
library(rgl)
z1 = matrix(scan("myfile.txt"),nrow=256, ncol=256, byrow=TRUE)
# Multiply z by 2 to accentuate the relief otherwise it looks a little bit flat.
z= z1*2
#create / open x y dimensions
y=8*(1:ncol(z))
x=8*(1:nrow(z))
# Read the vegetation density matrix (same 256 x 256 grid as the elevations)
trn = matrix(scan("myfile.txt"), nrow=256, ncol=256, byrow=TRUE)
fv = trn*100
trnlim = range(fv)
fv.colors = colorRampPalette(c("white","tan4","darkseagreen1","chartreuse4")) ## define the color ramp
colorlut = fv.colors(100)[c(1, seq(35,35,length.out=9), seq(35,75,length.out=30), seq(75,100,length.out=61))]
# Assign colors to fv for each point
col = colorlut[fv - trnlim[1] + 1]
open3d()
rgl.surface(x,y,z,color=col)
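For the legend I still need to add, rgl's legend3d() looks like a possible starting point; it draws an ordinary 2D legend over the scene. A sketch with hypothetical class labels for the colour ramp above:
# hypothetical labels; adjust to the actual vegetation density breaks
legend3d("topright", legend=c("bare", "sparse", "moderate", "dense"),
         fill=c("white", "tan4", "darkseagreen1", "chartreuse4"))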
Thank you very much to @gsk3 and @nullglob in this post for their help. I hope this post will help many others!
I modified the above code to give an answer. Note that terrain should be a matrix in the same format as the elevation matrix, and I added a color argument to the rgl.surface() call so it actually uses the color matrix you created.
library(rgl)
# Read the z (i.e. elevation) dimension from file
z1 = matrix(scan("myfile.txt"), nrow=256, ncol=256, byrow=TRUE)
z = z1 # z is what gets plotted below
#create / open x y (i.e. easting and northing coordinates) dimensions
y = 8*(1:ncol(z)) # grid points are 8 m apart
x = 8*(1:nrow(z))
# Read the terrain types from a file
trn = matrix(scan("terrain.txt"),nrow=256, ncol=256, byrow=TRUE)
# See http://stackoverflow.com/questions/1896419/plotting-a-3d-surface-plot-with-contour-map-overlay-using-r for details of code below
trnlim <- range(trn)
trnlen <- trnlim[2] - trnlim[1] + 1
colorlut <- terrain.colors(trnlen,alpha=0) # height color lookup table
col <- colorlut[ trn-trnlim[1]+1 ] # assign colors to heights for each point
open3d()
rgl.surface(x,y,z,color=col)
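If you want to save the rendered view to an image file once the scene is displayed, something like this should work:
rgl.snapshot("surface_overlay.png") # write the current rgl view to a PNG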
I have two matrices (approximately 300 x 100 each), and I would like to plot a graph to see the parts of the first one that are higher than those of the second.
I can do, for instance:
# Calculate the matrices and put them into m1 and m2
# Note that the values are between -1 and 1
par(mfrow=c(1,3))
image(m1, zlim=c(-1,1))
image(m2, zlim=c(-1,1))
image(m1-m2, zlim=c(0,1))
This plots only the desired regions in the 3rd plot, but I would like to do something a bit different, like putting a line around those areas on the first plot in order to highlight them directly there.
Any idea how I can do that?
Thank you
nico
How about:
par(mfrow = c(1, 3))
image(m1, zlim = c(-1, 1))
contour(m1 - m2, add = TRUE)
image(m2, zlim = c(-1, 1))
contour(m1 - m2, add = TRUE)
image(m1 - m2, zlim = c(0, 1))
contour(m1 - m2, add = TRUE)
This adds a contour map around the regions; it sort of puts rings around the areas of the 3rd plot (you might want to fiddle with the levels or nlevels arguments of contour() to get fewer 'circles').
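For instance, to draw a single ring at the zero crossing of the difference:
contour(m1 - m2, levels=0, drawlabels=FALSE, add=TRUE)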
Another way of doing your third image might be:
image(m1>m2)
This produces a matrix of TRUE/FALSE values that gets imaged as 0/1, so you have a two-colour image. Still not sure about your 'putting a line around' thing, though...
Here's some code I wrote to do something similar. I wanted to highlight contiguous regions above a 0.95 threshold by drawing a box around them, so I took all the grid squares above 0.95 and ran a clustering on them. Then a bit of fiddling with the clustering output gives the rectangle coordinates of the regions:
computeHotspots = function(xyz, thresh, minsize=1, margin=1){
  ### given a list(x,y,z), return a matrix where each row
  ### is the (xmin,xmax,ymin,ymax) bounding box of a contiguous area
  ### over the given threshold.
  ### or approximately. let's use the clustering tools in R...
  overs <- which(xyz$z>thresh, arr.ind=TRUE)
  if(length(overs)==0){
    ## found no hotspots
    return(NULL)
  }
  if(length(overs)==2){
    ## found one hotspot
    xRange <- cbind(xyz$x[overs[,1]], xyz$x[overs[,1]])
    yRange <- cbind(xyz$y[overs[,2]], xyz$y[overs[,2]])
  }else{
    oTree <- hclust(dist(overs), method="single")
    oCut <- cutree(oTree, h=10)
    oXYc <- data.frame(x=xyz$x[overs[,1]], y=xyz$y[overs[,2]], oCut)
    xRange <- do.call("rbind", tapply(oXYc[,1], oCut, range))
    yRange <- do.call("rbind", tapply(oXYc[,2], oCut, range))
  }
  ### add user margins
  xRange[,1] <- xRange[,1]-margin
  xRange[,2] <- xRange[,2]+margin
  yRange[,1] <- yRange[,1]-margin
  yRange[,2] <- yRange[,2]+margin
  ## enforce a minimum box size, then put it all together
  xr <- apply(xRange, 1, diff)
  xm <- apply(xRange, 1, mean)
  xRange[xr<minsize,1] <- xm[xr<minsize]-(minsize/2)
  xRange[xr<minsize,2] <- xm[xr<minsize]+(minsize/2)
  yr <- apply(yRange, 1, diff)
  ym <- apply(yRange, 1, mean)
  yRange[yr<minsize,1] <- ym[yr<minsize]-(minsize/2)
  yRange[yr<minsize,2] <- ym[yr<minsize]+(minsize/2)
  cbind(xRange, yRange)
}
Test code:
x=1:23
y=7:34
m1=list(x=x,y=y,z=outer(x,y,function(x,y){sin(x/3)*cos(y/3)}))
image(m1)
hs = computeHotspots(m1,0.95)
That should give you a matrix of rectangle coordinates:
> hs
  [,1] [,2] [,3] [,4]
1   13   15    8   11
2    3    6   17   20
3   22   24   18   20
4   13   16   27   30
Now you can draw them over the image with rect():
image(m1)
rect(hs[,1],hs[,3],hs[,2],hs[,4])
and to show they are where they should be:
image(list(x=m1$x,y=m1$y,z=m1$z>0.95))
rect(hs[,1],hs[,3],hs[,2],hs[,4])
You could of course adapt this to draw circles, but more complex shapes would be tricky. It works best when the regions of interest are fairly compact.
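For the circle variant, base symbols() could be pointed at the bounding boxes that computeHotspots() returns. A sketch, assuming hs from the test code above:
# circles centred on each bounding box, radius half the longer side
cx <- rowMeans(hs[, 1:2])
cy <- rowMeans(hs[, 3:4])
rad <- pmax(hs[,2] - hs[,1], hs[,4] - hs[,3]) / 2
image(m1)
symbols(cx, cy, circles=rad, inches=FALSE, add=TRUE)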
Barry