Simplifying the data in a raster - R

I have a raster file, which I created from data downloaded from DIVA-GIS: http://www.diva-gis.org/datadown
nz_map<-raster("NZL1_msk_cov.grd")
Using plot() on this object works fine, so there are no issues importing it. The raster object contains a lot of data I don't need, namely land-cover data. I want a simpler raster object with lon and lat coordinates and a value of 1 for land and NA for ocean.
This raster will be used with the dismo function randomPoints() to sample background data for species distribution modelling, so the most important thing is to identify which areas are land (suitable for sampling) and which are ocean (unsuitable).
I can visualise the raster more simply with plot(!is.na(nz_map5)). This works well and serves for the randomPoints() function, but I'm not sure how to change the colour of the map. Doing this: plot(!is.na(nz_map5), col="grey") results in a totally grey block instead of colouring only the appropriate areas grey. This is why I thought I might be better off with a simpler raster object, to do away with the !is.na() call. Any ideas?
If anyone knows of a place where such files can be downloaded, saving me the hassle, that works too.

Here are similar data for elevation
library(raster)
a <- getData("alt", country="NZL")
r <- a[[1]]
plot(r)
I think your confusion stems from what happens here:
x <- !is.na(r)
That turns the values into TRUE (those that were not NA) or FALSE (those that were NA). So now you have two categories and no NA cells:
plot(x, col=c("red", "blue"))
That is why a single colour fills the whole map, and why x is no longer a good dataset for dismo::randomPoints.
If you would rather have NA and one other value, you can do
y <- r * 0
plot(y, col="blue")
Or, to get 1 instead of 0:
y <- reclassify(r, cbind(-Inf, Inf, 1))
But, as you say yourself, for randomPoints you can just use the original data.
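The difference between the logical test and the arithmetic approach can be seen on a small example. The sketch below uses a plain matrix as a stand-in for the raster values (the numbers are made up); the same arithmetic, r * 0 + 1, should apply cell by cell to a RasterLayer.

```r
# Hypothetical elevation values; NA marks ocean cells
elev <- matrix(c(NA, 12, 340, NA, 5, NA), nrow = 2)

# Logical test: every cell becomes TRUE or FALSE, so no NA is left
# and a single colour in plot() would cover the whole extent
two_class <- !is.na(elev)

# Arithmetic keeps NA: land cells become 1, ocean cells stay NA
land <- elev * 0 + 1

print(two_class)
print(land)
```

Because land still carries NA for ocean, it keeps the property that makes the original data usable as a sampling mask.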

Related

raster::overlay with a function that calls a list of vectors

I wonder if it’s possible to use the overlay function of the raster package with a function that calls a list of vectors to perform some calculations based on two rasters. So far I have only seen examples of functions performing raster algebra without calling external data.
Below I provide some toy code to illustrate what I’m trying to do, but I can also provide some context on my real problem. Specifically, I need to classify each pixel as either zero (absence) or one (presence) of housing. The likelihood of housing presence is related to the percentage of built-up area covering the pixel (raster ‘r1’ below) and the land cover type (raster ‘r2’ below). This likelihood is known from reference data, which is stored in a list like ‘probs’ below.
library(raster)
# continuous and categorical maps
r1<-r2<-raster()
r1[]<-round(runif(ncell(r1))*100)
r2[]<-1
r2[1:30000]<-2
# probability of housing presence in each stratum
prob1<-1:100/100
prob2<-log(1:100)/max(log(1:100))
# list of probabilities to be used in overlay
probs<-list(prob1,prob2)
# overlay - not working
o<-overlay(r1,r2,fun=function(x,y,...){return(rbinom(n=1, size=1, prob=probs[[y]][x]))})
The error is:
cannot use this formula, probably because it is not vectorized
As an alternative to the toy code above, I thought of processing each categorical class separately and using the function calc rather than overlay (see below). However, this is extremely slow (if not impossible) for large rasters, so I thought overlay would be better.
# alternative: loop across categorical classes (extremely slow for large rasters)
r<-list()
for(i in 1:2){
  stratum<-r2
  stratum[Which(stratum !=i)]<-NA
  r[[i]]<-calc(r1, fun=function(x,...){return(rbinom(n=1, size=1, prob=probs[[i]][x]))})
  r[[i]]<-mask(r[[i]],stratum)
}
r<-stack(r)
r<-sum(r,na.rm=T)
par(mfrow=c(1,3))
plot(r1)
plot(r2)
plot(r)
I ran into this same error recently. My solution was to "Vectorize" the function that is passed to overlay. This creates a wrapper of your function with mapply so that overlay can use it. I was able to get your code to run by using Vectorize (see below).
library(raster)
# continuous and categorical maps
r1<-r2<-raster()
r1[]<-round(runif(ncell(r1))*100)
r2[]<-1
r2[1:30000]<-2
# probability of housing presence in each stratum
prob1<-1:100/100
prob2<-log(1:100)/max(log(1:100))
# list of probabilities to be used in overlay
probs<-list(prob1,prob2)
## edits below
# define function
f <- function(x,y,...){return(rbinom(n=1, size=1, prob=probs[[y]][x]))}
# run function using overlay with Vectorize
o <- overlay(r1,r2,fun=Vectorize(f))
This produced a raster layer of 0/1 draws from those probabilities. It also produced the following warning:
In rbinom(n = 1, size = 1, prob = probs[[y]][x]) : NAs produced
I am not sure if this warning would be problematic with your real data.
You can also refer to the second answer here for another worked example.
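The "NAs produced" message most likely comes from cells where r1 is 0: probs[[y]][0] is a zero-length vector, so rbinom() returns NA for that cell. Vectorize also calls the function once per cell, which can be slow. A minimal sketch of a function that is vectorised by itself is shown below; prob_mat and f_vec are hypothetical names, and giving 0% built-up cells a probability of 0 is an assumption, not something stated in the question. NA cells are not handled specially.

```r
# Probabilities per stratum, as in the question
prob1 <- 1:100 / 100
prob2 <- log(1:100) / max(log(1:100))

# Hypothetical helper: one column per land-cover class, one row per percentage
prob_mat <- cbind(prob1, prob2)

# Vectorised version of f: x = built-up percentage (0-100), y = class (1 or 2)
f_vec <- function(x, y) {
  # Assumption: a cell with 0% built-up area gets probability 0
  p <- ifelse(x >= 1, prob_mat[cbind(pmax(x, 1), y)], 0)
  rbinom(length(p), size = 1, prob = p)
}

set.seed(42)
x <- c(0, 1, 50, 100, 100)
y <- c(1, 2, 1, 1, 2)
draws <- f_vec(x, y)
print(draws)
```

Since f_vec accepts whole vectors, passing it as overlay(r1, r2, fun=f_vec) should work without the Vectorize wrapper.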

How to extract specific values with point coordinates from Kriging interpolations made in R?

Using R version 3.4.2 and the library "geoR", I made kriging interpolations for different variables (below I give an example of my process). I also made a matrix with the coordinates of 305 trees with distinct marks (species, DBH, height) that lie within the same space as the interpolations, as seen in the attached image (https://imgur.com/SLQBnZH). I've been looking for ways to extract the nearest value of each variable for each tree and save the corresponding values in a data.frame or matrix, but I haven't been successful, and I can't find specific answers to this.
One thing I've been looking at is converting the kriging result into a raster (.tif) and proceeding from there. But kriging interpolations are made from vector data, so is that even possible?
I'd be glad to receive any sort of help, thank you in advance!
P.S. I'm doing this so that I can later use the data for spatial point pattern analysis.
#Kriging####:
PG<-read.csv("PGF.csv", header=T, stringsAsFactors=FALSE)
library("geoR")
x<-(PG$x)
y<-(PG$y)
#Grid
loci<-expand.grid(x=seq(-5, 65, length=100), y=seq(-5, 85, length=100))
names(loci)<-c("x", "y")
mix<-cbind(rep(1,10000), loci$x, loci$y, loci$x*loci$y)
#Model
pH1.mod<-lm(pH1~y*x, data=PG, x=T)
pH1.kg<-cbind(pH1.mod$x[,3], pH1.mod$x[,2], pH1.mod$residuals)
#Transform to geographic data
pH1.geo<-as.geodata(pH1.kg)
#Variogram
pH1.vario<-variog(pH1.geo, max.dist=35)
pH1.vario.mod<-eyefit(pH1.vario)
#Cross validation
pH1.valcruz<-xvalid(pH1.geo, model=pH1.vario.mod)
#Kriging
pH1.krig<-krige.conv(pH1.geo, loc=loci, krige=krige.control(obj.model=pH1.vario.mod[[1]]))
#Predictive model
pH1.yhat<-mix %*% pH1.mod$coefficients + pH1.krig$predict
#Exchange Kriging prediction values
pH1.krig$predict<-pH1.yhat
#Image
image(pH1.krig2)
contour(pH1.krig2, add=TRUE)
#Tree matrix####:
CoA<-read.csv("CoAr.csv", header=T)
#Data
xa<-(CoA$X)
ya<-(CoA$Y)
points(xa,ya, col=4)
TreeDF<-(cbind.data.frame(xa, ya, CoA$Species, CoA$DBH, CoA$Height, stringsAsFactors = TRUE))
m<-(cbind(xa, ya, 1:305))
as.matrix(m)
I tried to find the value of a point in space (trees [1:305]) through the minimum distance to a predicted value using the following code, (I suggest not running this since it takes too long):
for(i in 1:2){print(c(2:10000)[as.matrix(dist(rbind(m[i,], as.matrix(pH1.krig2$predict))))[i,2:10000]==min(as.matrix(dist(rbind(m[i,],as.matrix(pH1.krig2$predict))))[i,2:10000])])}
aldo_tapia's answer at the following link was the approach needed for this problem. Thank you to everyone! https://gis.stackexchange.com/questions/284698/how-to-extract-specific-values-with-point-coordinates-from-kriging-interpolation
The process is as follows:
Use the extract() function from the raster package:
library(raster)
r <- SpatialPointsDataFrame(loci, data.frame(predict = pH1.krig$predict))
gridded(r) <- T
r <- as(r,'RasterLayer')
pts <- SpatialPointsDataFrame(CoA[,c('X','Y')],CoA)
pH1.arb <-extract(r, pts)
I then added the extracted values to the tree data frame with cbind, since they are in the same order:
COA2<-cbind(CoA, pH1val=pH1.arb)
I will repeat the process for each variable.
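For a regular grid such as loci, the nearest grid node can also be found by arithmetic rather than by a full distance matrix, which is why the attempt with dist() was so slow. The sketch below is base R only; pred is a placeholder for pH1.krig$predict, nearest_cell is a hypothetical helper, and the tree coordinates are made up.

```r
# Same grid layout as in the question (x varies fastest in expand.grid)
xs <- seq(-5, 65, length.out = 100)
ys <- seq(-5, 85, length.out = 100)
loci <- expand.grid(x = xs, y = ys)

# Placeholder for the kriging predictions, one value per grid node
pred <- loci$x + 1000 * loci$y

# Hypothetical helper: row index in loci of the node nearest to each point
nearest_cell <- function(px, py, xs, ys) {
  ix <- round((px - xs[1]) / (xs[2] - xs[1])) + 1
  iy <- round((py - ys[1]) / (ys[2] - ys[1])) + 1
  # Clamp points that fall outside the grid to the nearest edge
  ix <- pmin(pmax(ix, 1), length(xs))
  iy <- pmin(pmax(iy, 1), length(ys))
  (iy - 1) * length(xs) + ix
}

# Made-up tree coordinates
trees <- data.frame(X = c(0, 30.2, 64.9), Y = c(0, 42.7, 84.9))
idx <- nearest_cell(trees$X, trees$Y, xs, ys)
trees$pH1val <- pred[idx]
print(trees)
```

The result is in the same order as the input points, so it can be bound to the tree table in the same way as the extract() output.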

Creating a heat map using kriging

I'm trying to create a good heat map, using kriging for the missing values.
I have the following data, which contains all the values that have been measured for RLevel.
I followed this link, which explains how to use kriging: https://rpubs.com/nabilabd/118172
This is the code I wrote. Before these steps, I removed from my DieData all the points that still need values to be tested. Those points are referred to as die.data.NAValues in my code.
#**************************************************CODE*****************
#Step3: Convert to SpatialPointsDataFrame Object
coordinates(die.data) = ~X+Y
#Step 4: Get the prediction Grid
coordinates(die.data.NAValues)=~X+Y
#Using the autoKrige method
kr = autoKrige(RLevel~1, die.data, die.data.NAValues,nmax=20)
predicted_die_values <- kr$krige_output
predicted_die_model <- kr$var_model
#Get Predictions. Plot the predicted on heat map.
g <- gstat(NULL,"RLevel",RLevel~1,die.data, model=predicted_die_model,nmax=1)
predictedSet <- predict(g,newdata=die.data,BLUE=TRUE)
#Plot the krigging graph
predicted_die_values %>% as.data.frame %>% ggplot(aes(x=X,y=Y)) + geom_tile(aes(fill=v1.pred))+coord_equal() +scale_fill_gradient(low="yellow",high="red")+scale_x_continuous()+scale_y_continuous()+theme_bw()
When I plot the graph, I get the following image from the values that were predicted by the kriging method.
My question is: how can I show a good heat map that combines the points predicted by kriging with the points I already have? I want my graph to look like the one in the link I posted above.
Description of my dataset: my original dataset, including the NA values that have not been tested, contains around 55057 points. When I take out the NA values and use them as my prediction grid, I get 390 points. The majority of the RLevel values are in the 30s, except for around 100-200 points that are above 100.
Can anyone help me out or give me guidance on how to produce a good heat map?
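One possible way to get measured and predicted values into a single plot is to stack them into one table first. The sketch below is a base R illustration on made-up numbers; it assumes the prediction column is named var1.pred, and the source column is a hypothetical addition for telling the two kinds of points apart.

```r
# Hypothetical stand-ins: measured points and kriged predictions
measured  <- data.frame(X = c(1, 2, 3), Y = c(1, 1, 2), RLevel = c(31, 35, 120))
predicted <- data.frame(X = c(1, 3), Y = c(2, 1), var1.pred = c(33.2, 78.5))

# Give both tables the same columns, plus a flag for where the value came from
measured$source  <- "measured"
predicted$source <- "kriged"
names(predicted)[names(predicted) == "var1.pred"] <- "RLevel"

# One table that a single geom_tile() layer could be drawn from
combined <- rbind(measured, predicted)
print(combined)
```

With a table like combined, the existing ggplot call could map fill to RLevel for all points at once.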

R - original colours of georeferenced raster image using ggplot2- and raster-packages

I would like to use the original colortable of a georeferenced raster image (tif-file) as the colour scale in a map plotted by ggplot/ggplot2.
Due to not finding an easier solution, I accessed the colortable-slot from the legend-attribute of the loaded raster image (object) raster1 like so:
raster1 <- raster(paste(workingDir, "/HUEK200_Durchlaessigkeit001_proj001.tif", sep="", collapse=""))
raster1.pts <- rasterToPoints(raster1)
raster1.df <- data.frame(raster1.pts)
colTab <- attr(raster1, "legend")@colortable
Ok, so far so good. Now I simply need to apply colortable as a colored scale to my existing plot:
(ggplot(data=raster1.df)
+ geom_tile(aes(x, y, fill=raster1.df[[3]]))
+ scale_fill_gradientn(values=1:length(colTab), colours=colTab, guide=FALSE)
+ coord_fixed(ratio=1)
)
Unfortunately, this does not work as expected. The resulting image does not show any colours besides white and the typical ggplot grey which often appears when no custom values are defined. At the moment, I am a little clueless about what is actually wrong here. I assumed that the underlying band values stored in raster1.df[[3]] are indices into the colour table. This might be wrong. If it is wrong, then how are the band values connected with the colortable? And even if my assumption were right, the parameters which I have given to scale_fill_gradientn() should still result in a more colourful plot, shouldn't they? I checked what the unique values are:
sort(unique(raster1.df[[3]]))
This outputs:
[1] 0 1 2 3 4 5 6 7 8 9 10 11 12
Apparently, not all of the 256 members of colortable are used which reminds me that the color does not always need to reflect the underlying band-data distribution (especially when including multiple bands).
I hope, my last thoughts didn't confuse you about the fact that the objective is quite straight forward.
Thank you for your help!
Ok, I have found an answer which might not apply to every georeferenced raster image out there, but probably to most.
First, my assumption that the data values do not exactly represent the colour selection was wrong. There are 15 unique colours in the colortable of the spatial raster object. However, not all of them are used (14 and 15 are not). So now I know that I have to map my values to the corresponding colours in a way that scale_fill_gradientn understands. For this I reuse my initial code snippet and define a new variable valTab which stores all unique data values of the given band:
raster1 <- raster(paste(workingDir, "/HUEK200_Durchlaessigkeit001_proj001.tif", sep="", collapse=""))
raster1.pts <- rasterToPoints(raster1)
raster1.df <- data.frame(raster1.pts)
raster1.img <- melt(raster1)
colTab <- attr(raster1, "legend")@colortable
names(colTab) <- 0:(length(colTab) - 1)
valTab <- sort(unique(raster1.df[[3]]))
Notice, how index names are defined for colTab - this will be important soon. With this, I am able to automatically relate all active colors with their respective value while plotting:
(ggplot(data=raster1.df)
+ geom_tile(aes(x, y, fill=raster1.df[[3]]))
+ scale_fill_gradientn(colours=colTab[as.character(valTab)])
+ coord_fixed(ratio=1)
)
Using the members of valTab as references to the corresponding colour indices helps to always pick only the colours which are needed. I don't know if defining the values parameter of scale_fill_gradientn() is necessary in some cases.
I am not sure if the raster images read by raster() always define their values starting from 0. If not, names(colTab) <- 0:(length(colTab) - 1) needs to be adjusted.
I hope, this helps somebody in the future. At least, I finally have a solution!
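The name-based lookup is the key step here, and it can be tried without a raster file. The sketch below uses a made-up four-colour table and a handful of made-up band values; cell_col is a hypothetical extra that gives each cell its own hex colour, which is not part of the solution above.

```r
# Hypothetical colour table; names are the band values 0, 1, 2, ...
colTab <- c("#FFFFFF", "#FF0000", "#00FF00", "#0000FF")
names(colTab) <- 0:(length(colTab) - 1)

# Made-up band values, standing in for raster1.df[[3]]
vals <- c(0, 2, 2, 1, 0)
valTab <- sort(unique(vals))

# Only the colours whose band value actually occurs
used <- colTab[as.character(valTab)]

# One hex colour per cell, looked up by name rather than by position
cell_col <- unname(colTab[as.character(vals)])

print(used)
print(cell_col)
```

Looking colours up by name rather than by position is what makes the 0-based band values line up with the 1-based R vector.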

Make a 3D rendered plot of time-series

I have a set of 3D coordinates (below - just for a single point, in 3D space):
x <- c(-521.531433, -521.511658, -521.515259, -521.518127, -521.563416, -521.558044, -521.571228, -521.607178, -521.631165, -521.659973)
y <- c(154.499557, 154.479568, 154.438705, 154.398682, 154.580688, 154.365189, 154.3564, 154.559189, 154.341309, 154.344223)
z <- c(864.379272, 864.354675, 864.365479, 864.363831, 864.495667, 864.35498, 864.358582, 864.50415, 864.35553, 864.359863)
xyz <- data.frame(x,y,z)
I need to make a time-series plot of this point with 3D rendering (so I can rotate the plot, etc.). The plot should visualize the trajectory of the point over time (for example as a solid line). I used the 'rgl' package with the plot3d method, but I can't get it to plot the time-series (the code below just plots a single point from the first frame of the time-series):
require(rgl)
plot3d(xyz[1,1],xyz[1,2],xyz[1,3],axes=F,xlab="",ylab="",zlab="")
I found this post, but it doesn't really deal with real-time rendered 3D plots. I would appreciate any suggestions. Thank you.
If you read help(plot3d) you can see how to draw lines:
require(rgl)
plot3d(xyz$x,xyz$y,xyz$z,type="l")
Is that what you want?
How about this? It uses rgl.pop() to remove a point and a line and draw them as a trail - change the sleep argument to control the speed:
ts <- function(xyz,sleep=0.3){
  plot3d(xyz,type="n")
  n = nrow(xyz)
  p = points3d(xyz[1,])
  l = lines3d(xyz[1,])
  for(i in 2:n){
    Sys.sleep(sleep)
    rgl.pop("shapes",p)
    rgl.pop("shapes",l)
    p=points3d(xyz[i,])
    l=lines3d(xyz[1:i,])
  }
}
The solution was simpler than I thought: the problem was that I didn't use as.matrix on my data. I was getting the error (list) object cannot be coerced to type 'double' when I simply tried to plot my entire dataset using plot3d (found a solution for this here). So, if you need to plot a time-series of a set of coordinates (in my case motion capture data of two actors), here is my complete solution (it only works with the data set below!):
Download the example data set.
Read the above data into a table:
data <- read.table("Bob12.txt",sep="\t")
Extract the XYZ coordinates into separate matrices:
x <- as.matrix(subset(data,select=seq(1,88,3)))
y <- as.matrix(subset(data,select=seq(2,89,3)))
z <- as.matrix(subset(data,select=seq(3,90,3)))
Plot the coordinates in a 3D rendered plot using the 'rgl' package:
require(rgl)
plot3d(x,y,z,axes=FALSE,xlab="",ylab="",zlab="")
You should get something like the image below (but you can rotate it, etc.). I hope you can recognise the joint centers of the two people. I still need to tweak it to make it visually better: the first frame as points (to clearly see the actors' joints), then a visible break, and then the rest of the frames as lines.
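The column extraction step can be written without hard-coded column numbers. The sketch below is a base R illustration on a made-up table with two frames and two markers; it assumes the columns are interleaved as x1, y1, z1, x2, y2, z2, which is how the seq() calls above read the motion capture file.

```r
# Hypothetical capture table: 2 frames (rows), 2 markers stored as x1 y1 z1 x2 y2 z2
data <- data.frame(
  V1 = c(1, 2), V2 = c(10, 20), V3 = c(100, 200),
  V4 = c(3, 4), V5 = c(30, 40), V6 = c(300, 400)
)

# Every third column, starting at 1, 2 and 3, gives x, y and z
x <- as.matrix(data[, seq(1, ncol(data), by = 3)])
y <- as.matrix(data[, seq(2, ncol(data), by = 3)])
z <- as.matrix(data[, seq(3, ncol(data), by = 3)])

print(dim(x))
print(is.numeric(x))
```

Each matrix then has one row per frame and one column per marker, and being numeric it avoids the coercion error that the data frame caused.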
