I use a randomForest model to predict class membership. 'x' is the response, a factor with 10 classes, and 'training_predictors' holds the predictor values extracted from a large RasterStack/RasterBrick. The specific line of code is:
r_tree <- randomForest(x ~ ., data=training_predictors, ...)
Then I run 'predict' with the model 'r_tree', applied to the RasterStack 'predictor_data', as follows:
predictions <- predict(predictor_data, r_tree, filename=outraster, fun=predict, na.rm=TRUE, format="PCDISK", overwrite=TRUE, progress="text", type="response")
The output is a raster that I use as a thematic map.
I would like to use the conditional inference trees model 'cforest' (from the party package) instead of randomForest to achieve the same goal.
I understand that 'predict' can be used with cforest, yet I have not been able to generate raster files like those produced with randomForest, as illustrated above.
It should work fine, but you may need to add the argument OOB=TRUE, and identify factor variables if there are any.
Example data
library(raster)
# a RasterBrick with predictor layers, plus known presence/absence points
logo <- brick(system.file("external/rlogo.grd", package="raster"))
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
# extract values for points
xy <- rbind(cbind(1, p), cbind(0, a))
v <- data.frame(cbind(xy[,1], extract(logo, xy[,2:3])))
colnames(v)[1] <- 'pa'
Basic model
library(party)
m1 <- cforest(pa~., control=cforest_unbiased(mtry=3), data=v)
pc1 <- predict(logo, m1, OOB=TRUE)
plot(pc1)
Model with factors
v$red <- as.factor(round(v$red/100))
logo$red <- round(logo[[1]]/100)
m2 <- cforest(pa~., control=cforest_unbiased(mtry=3), data=v)
f <- list(levels(v$red))
names(f) <- 'red'
pc2 <- predict(logo, m2, OOB=TRUE, factors=f)
plot(pc2)
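If the goal is to write the prediction directly to a raster file, as in your randomForest workflow, the filename-related arguments of raster::predict can be added as well. A minimal sketch (the output filename is just an example):
# write the cforest prediction to disk (example filename)
pc1_file <- predict(logo, m1, OOB=TRUE, filename="cforest_pred.tif",
                    format="GTiff", overwrite=TRUE, progress="text")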
By the way, this comes almost straight out of the help file of raster::predict.
Related
As far as I can tell, I have specified this simple GLM the same way with the base glm function and with the caret train function. However, the caret version will not converge. Is there something missing in how I am specifying the train model?
library(caret)
logo <- rast(system.file("ex/logo.tif", package="terra"))
names(logo) <- c("red", "green", "blue")
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
xy <- rbind(cbind(1, p), cbind(0, a))
# extract predictor values for points
e <- terra::extract(logo, xy[,2:3])
# combine the extracted predictor values with the response
v <- data.frame(cbind(pa=xy[,1], e))
v$pa <- as.factor(v$pa)
#GLM model
model <- glm(formula = as.numeric(pa) ~ red + blue + green, data = v)
#Train model
model2 <- train(pa ~ red + green + blue,
data=v,
method = "glm")
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
Compliments of @Ben Bolker in the comments:
The train model was using family="binomial" as the default because the response variable is a two-level factor (0/1), whereas the glm call above, with as.numeric(pa), defaulted to family="gaussian". The train model works once family="gaussian" is specified.
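For example, something along these lines should make train match the glm call; this is a sketch that converts the response to numeric (the pa_num column is added here only for illustration):
# numeric response, matching as.numeric(pa) in the glm call above
v$pa_num <- as.numeric(v$pa)
model3 <- train(pa_num ~ red + green + blue, data = v,
                method = "glm", family = "gaussian")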
I am using an xgboost model to predict onto a raster stack. I have successfully used the same approach with CART and randomForest models:
library(raster)
# create a RasterStack or RasterBrick with a set of predictor layers
logo <- brick(system.file("external/rlogo.grd", package="raster"))
names(logo)
# known presence and absence points
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
# extract values for points
xy <- rbind(cbind(1, p), cbind(0, a))
v <- data.frame(cbind(pa=xy[,1], extract(logo, xy[,2:3])))
xgb <- xgboost(data = data.matrix(subset(v, select = -c(pa))), label = v$pa,
nrounds = 5)
raster::predict(model = xgb, logo)
But with xgboost I get the following error:
Error in xgb.DMatrix(newdata, missing = missing) :
xgb.DMatrix does not support construction from list
The problem is that predict.xgb.Booster does not accept a data.frame for the argument newdata (see ?predict.xgb.Booster). That is unexpected (most common predict.* methods take a data.frame), but we can work around it. I show how to do that below, using the "terra" package instead of the obsolete "raster" package (but the solution is exactly the same for either package).
The example data
library(terra)
library(xgboost)
logo <- rast(system.file("ex/logo.tif", package="terra"))
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
xy <- rbind(cbind(1, p), cbind(0, a))
v <- extract(logo, xy[,2:3])
xgb <- xgboost(data = data.matrix(v), label=xy[,1], nrounds = 5)
The work-around is to write a prediction function that first coerces the data.frame with "new data" to a matrix. We can then use that function with terra's predict method for SpatRaster objects:
xgbpred <- function(model, data, ...) {
predict(model, newdata=as.matrix(data), ...)
}
p <- predict(logo, model=xgb, fun=xgbpred)
plot(p)
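For completeness, the same wrapper should also work with the raster package used in the question; a sketch, assuming the layer names of the brick match the feature names used in training:
library(raster)
# RasterBrick version of the example data (layers red, green, blue)
rlogo <- brick(system.file("external/rlogo.grd", package="raster"))
pr <- raster::predict(rlogo, model=xgb, fun=xgbpred)
plot(pr)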
I'm using the randomForest package to classify a raster stack of different predictors. Classification works fine, but I also want to retrieve the class probabilities. With my code I only get a RasterLayer with the probability of the first class, but I'd like a RasterStack with one layer per class, containing that class's probability.
PRED_train$response <- as.factor(PRED_train$response)
rf <- randomForest(response~., data = PRED_train, na.action = na.omit, confusion = T)
pred_RF <- raster::predict(PRED, rf)
beginCluster()
pred_RF <- clusterR(PRED, predict, args = list(rf,type="prob"))
endCluster()
The first place to look should be ?raster::predict, which has an example that shows how to do that. Here it is:
library(raster)
logo <- brick(system.file("external/rlogo.grd", package="raster"))
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
xy <- rbind(cbind(1, p), cbind(0, a))
v <- data.frame(cbind(pa=xy[,1], extract(logo, xy[,2:3])))
v$pa <- as.factor(v$pa)
library(randomForest)
rfmod <- randomForest(pa ~., data=v)
rp <- predict(logo, rfmod, type='prob', index=1:2)
spplot(rp)
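If you want to keep your clusterR approach for parallel processing, the same arguments should carry over; a sketch based on your own objects (PRED, PRED_train and rf):
beginCluster()
# one probability layer per class of the response
pred_prob <- clusterR(PRED, raster::predict,
                      args = list(model = rf, type = "prob",
                                  index = 1:nlevels(PRED_train$response)))
endCluster()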
In R it is possible to create a list:
k <- list()
k[[1]] <- airquality
k[[2]] <- rock
k[[3]] <- AirPassengers
k[[4]] <- airmiles
k[[5]] <- trees
k[[6]] <- treering
and to select elements from it with
k[c(1:3,6)]
How is it possible to do the same with an S4 class?
For example, I create some data with the dismo package:
library(dismo)
example(voronoi)
which runs the following:
p <- matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48,
28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2)
v <- voronoi(p)
v
I want to select the coordinates of a polygon; that can be done with
v@polygons[[1]]@Polygons[[1]]@coords
My question is: how can I select, for example, the 1st to 3rd and the 6th components?
My idea was to use
v@polygons[c(1:3,6)]@Polygons[[1]]@coords
does not work. R says:
Error: trying to get slot "Polygons" from an object of a basic class ("list") with no slots
The problem isn't with v@polygons[c(1:3,6)] but rather with the attempt to apply @Polygons[[1]]@coords directly to the resulting list. Instead, you can use lapply() on v@polygons[c(1:3,6)] like this:
result <- lapply(v@polygons[c(1:3,6)], function(x) x@Polygons[[1]]@coords)
which works as expected.
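Each element of result is then a coordinate matrix; for example, you could stack them into a single matrix (a small sketch):
# bind the coordinate matrices of the selected polygons into one matrix
coords_all <- do.call(rbind, result)
head(coords_all)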
I have constructed models with glmer and would like to predict them onto a RasterStack representing the fixed effects in my model. My glmer model is of the form:
m1 <- glmer(Severity ~ x1 + x2 + x3 + (1 | Year) + (1 | Ecoregion), family=binomial(link="logit"))
As you can see, I have random effects for which I don't have spatial layers, for example 'Year'. So the problem is really predicting a glmer model onto a RasterStack when you don't have layers for the random effects. If I use predict out of the box, without accounting for my random effects, I get an error.
m1.predict=predict(object=all.var, model=m1, type='response', progress="text", format="GTiff")
Error in predict.averaging(model, blockvals, ...) :
Your question is very brief and does not indicate what, if any, trouble you have encountered. This seems to work 'out of the box', but perhaps not in your case. See ?raster::predict for options.
library(raster)
# example data. See ?raster::predict
logo <- brick(system.file("external/rlogo.grd", package="raster"))
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
xy <- rbind(cbind(1, p), cbind(0, a))
v <- data.frame(cbind(pa=xy[,1], extract(logo, xy[,2:3])))
v$Year <- sample(2000:2001, nrow(v), replace=TRUE)
library(lme4)
m <- lmer(pa ~ red + blue + (1 | Year), data=v)
# here adding Year as a constant, as it is not a variable (RasterLayer) in the RasterStack object
x <- predict(logo, m, const=data.frame(Year=2000))
If you don't have the random effects, just use re.form=~0 in your predict call to predict at the population level:
x <- predict(logo, m, re.form=~0)
works without complaint for me with @RobertH's example (although I have not checked whether the results are correct)