How is regression interpolation possible in R? - raster

I have climatic data from different weather stations. I want to interpolate this data using regression instead of kriging or IDW. Can anyone tell me how to do this in R?

Some example data (you should have included that with your question)
library(raster)
r <- raster(system.file("external/test.grd", package="raster"))
ra <- aggregate(r, 10)
d <- na.omit(data.frame(v=values(ra), xyFromCell(ra, 1:ncell(ra))))
Fit a model
m <- glm(v ~ ., data=d)
Predict that to a raster
p <- raster(r)
p <- interpolate(p, m)
Remove unwanted areas
p <- mask(p, r)
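For readers without the raster package at hand, the steps above can be sketched in base R: fit a regression on the station coordinates, then predict onto a regular grid. This is only an illustration under stated assumptions; the station data, the grid extent and the names stations and grid are made up and do not come from the question.

```r
# Hypothetical station data: coordinates plus an observed value
set.seed(1)
stations <- data.frame(x = runif(50, 0, 100), y = runif(50, 0, 100))
stations$v <- 10 + 0.05 * stations$x - 0.02 * stations$y + rnorm(50, sd = 0.5)

# Fit a regression of the value on the coordinates (a first-order trend surface)
m <- glm(v ~ x + y, data = stations)

# Predict onto a regular grid covering the (assumed) study area
grid <- expand.grid(x = seq(0, 100, by = 5), y = seq(0, 100, by = 5))
grid$v_hat <- predict(m, newdata = grid)

head(grid)
```

Other covariates, such as elevation, could be added to the formula as long as they are also available for every grid cell.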

Related

Plotting the predictions of a mixed model as a line in R

I'm trying to plot the predictions (predict()) of my mixed model below such that I can obtain my conceptually desired plot as a line below.
I have tried to plot my model's predictions, but I don't achieve my desired plot. Is there a better way to define predict() so I can achieve my desired plot?
library(lme4)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
newdata <- with(dat3, expand.grid(pc1=unique(pc1), pc2=unique(pc2), discon=unique(discon)))
y <- predict(m4, newdata=newdata, re.form=NA)
plot(newdata$pc1+newdata$pc2, y)
More sjPlot. See the parameter grid to wrap several predictors in one plot.
library(lme4)
library(sjPlot)
library(patchwork)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3) # Does not converge
m4 <- lmer(math~pc1+pc2+discon+(1|id), data=dat3) # Converges
# To remove discon
a <- plot_model(m4,type = 'pred')[[1]]
b <- plot_model(m4,type = 'pred',title = '')[[2]]
a + b
Edit 1: I had some trouble removing the discon term within the sjPlot framework. I gave up and fell back on patchwork. I'm sure Daniel knows the correct way.
As Magnus Nordmo suggests, this is very simple with sjPlot, which has some predefined functions for these types of plots.
library(lme4)
library(sjPlot)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
plot_model(m4, type = 'pred', terms = c('pc1', 'pc2'),
ci.lvl = 0)
which gives the following result.
This plot shows the predictions against pc1, with a separate line for each of several quantiles of the second term listed in terms (here pc2). You could split these plots up and combine them using patchwork. The range of a term can be changed by adding square brackets after its name in terms (e.g. 'pc1 [-10:1]' for the interval between -10 and 1).
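One likely reason the predict() attempt in the question does not draw a line is that expand.grid() crosses every unique value of all three predictors, so each position on the x axis carries many predictions. A minimal sketch of an alternative is to vary one predictor and hold the others at their means. The data below are simulated and lm() is used only as a stand-in to keep the example self-contained; with the lmer fit, the same newdata would presumably be passed to predict(m4, newdata = newdata, re.form = NA).

```r
# Simulated stand-in data (not the dat3 file from the question)
set.seed(1)
dat <- data.frame(pc1 = rnorm(200), pc2 = rnorm(200), discon = rbinom(200, 1, 0.5))
dat$math <- 50 + 2 * dat$pc1 + 1 * dat$pc2 - 3 * dat$discon + rnorm(200)
fit <- lm(math ~ pc1 + pc2 + discon, data = dat)

# Vary pc1 over its observed range; hold pc2 and discon at their means
newdata <- data.frame(
  pc1    = seq(min(dat$pc1), max(dat$pc1), length.out = 50),
  pc2    = mean(dat$pc2),
  discon = mean(dat$discon)
)
newdata$pred <- predict(fit, newdata = newdata)

# One prediction per pc1 value, so the points form a single line
if (interactive()) plot(pred ~ pc1, data = newdata, type = "l")
```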

Fit a copula model in R

I want to build an optimal portfolio of stocks whose returns are modeled jointly using copulas.
I have the returns of 4 stocks:
s1 <- read.csv('s1.csv',header=F)$V2
s2 <- read.csv('s2.csv',header=F)$V2
s3 <- read.csv('s3.csv',header=F)$V2
s4 <- read.csv('s4.csv',header=F)$V2
Then I tried to fit t-copula and plot the density
library(copula)
t.cop <- tCopula(dim=4)
set.seed(500)
m <- pobs(as.matrix(cbind(s1,s2,s3,s4)))
fit <- fitCopula(t.cop,m,method='ml')
coef(fit)
rho <- coef(fit)[1]
df <- coef(fit)[2]
persp(tCopula(dim=2,rho,df=df),dCopula)
But I can't understand how to build other types of copulas (vine copulas, for example). And how can I find an optimal portfolio?
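As a side note on the pobs() step in the code above: the copula is fitted to pseudo-observations rather than to the raw returns. To my understanding, pobs() by default ranks each column and divides by n + 1, which the base R sketch below reproduces. The returns matrix is made up for illustration and stands in for the s1 to s4 series.

```r
# Made-up returns for two assets (stand-ins for the series in the question)
set.seed(500)
returns <- cbind(a = rnorm(250, sd = 0.01), b = rnorm(250, sd = 0.02))

# Rank each column and rescale so all values fall strictly inside (0, 1)
n <- nrow(returns)
u <- apply(returns, 2, rank) / (n + 1)

range(u)
```

Dividing by n + 1 rather than n keeps the values away from 0 and 1, where many copula densities are unbounded.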

pixel level regression with large raster dataset

I am trying to fit a glm model where y, x1 + x2....xn are layers in a RasterStack object. I have tried converting the raster stack to a data frame object, but I get a vector size error as shown below. Instead, I'd like to try fitting the regression model with the raster layers as the input, without having to convert the layers to a data frame, given the file size and memory error. Would that be possible, and how would you configure that?
The model that I'd like to fit is of the form m1<-glm(y1~x1 + x2, family=binomial(), data=layers), but I don't get to this point because I can't convert the data to a data frame for model fitting.
dat<-as.data.frame(stack(layers[c(y1,x1,x2)]))
Error: cannot allocate vector of size 40GB
Here are some regression examples with Raster* data (from ?calc):
Create example data
r <- raster(nrow=10, ncol=10)
s1 <- lapply(1:12, function(i) setValues(r, rnorm(ncell(r), i, 3)))
s2 <- lapply(1:12, function(i) setValues(r, rnorm(ncell(r), i, 3)))
s1 <- stack(s1)
s2 <- stack(s2)
Regression of values in one brick (or stack) with another
s <- stack(s1, s2)
# s1 and s2 have 12 layers; coefficients[2] is the slope
fun <- function(x) { lm(x[1:12] ~ x[13:24])$coefficients[2] }
x1 <- calc(s, fun)
Regression of values in one brick (or stack) with 'time'
time <- 1:nlayers(s)
fun <- function(x) { lm(x ~ time)$coefficients[2] }
x2 <- calc(s, fun)
Get multiple layers, e.g. the slope and intercept
fun <- function(x) { lm(x ~ time)$coefficients }
x3 <- calc(s, fun)
In some cases, a much (> 100 times) faster approach is to directly use linear algebra and pre-compute some constants
# add 1 for a model with an intercept
X <- cbind(1, time)
# pre-computing constant part of least squares
invXtX <- solve(t(X) %*% X) %*% t(X)
## much reduced regression model; [2] is to get the slope
quickfun <- function(y) (invXtX %*% y)[2]
x4 <- calc(s, quickfun)
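To see why the linear algebra shortcut above gives the same answer as lm(), here is a small base R check on the values of a single cell. The vector y is simulated and stands in for one cell's values across 24 layers.

```r
set.seed(42)
time <- 1:24
y <- 0.5 * time + rnorm(24)  # hypothetical values of one cell across 24 layers

# Pre-compute the constant part of least squares: (X'X)^-1 X'
X <- cbind(1, time)
invXtX <- solve(t(X) %*% X) %*% t(X)

# Slope from the shortcut and from lm()
slope_quick <- (invXtX %*% y)[2]
slope_lm <- unname(coef(lm(y ~ time))[2])

all.equal(slope_quick, slope_lm)
```

One difference to keep in mind: lm() drops missing values by default, while the matrix product returns NA for any cell that has an NA in one of its layers.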

Visualizing GAM models in R

Does anyone know how to visualize the smooth component of gam models in R very well? I would really like to visualize something like the output of the function visreg. This code below illustrates my problem
library(gam)
f=function(v){exp(v)}
n=100
x=runif(n)
t=runif(n)
y=x+f(t)+rnorm(n, sd=0.1)
fit=gam(y~x+s(t))
plot(t,y)
lines(t,as.numeric(fit$smooth))
#want something more like
library(visreg)
visreg(fit)
You could use the plotting method for gam objects, but you'd have to use the data parameter of gam:
library(gam)
f <- function(v){exp(v)}
n <- 100
x <- runif(n)
t <- runif(n)
y <- x+f(t)+rnorm(n, sd=0.1)
DF <- data.frame(y, x, t)
fit <- gam(y~x+s(t), data = DF)
layout(t(1:2))
plot(fit, se=TRUE)
See help("plot.gam") for other options.
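If the goal is to draw the smooth component by hand, as the lines() call in the question attempts, two details matter: the term has to be extracted from the fit, and the points have to be sorted by t before drawing. The sketch below shows this with a swapped-in model, a linear model with a natural spline from the splines package that ships with R, not with the gam fit itself. The names fit_ns and smooth_t are made up for illustration.

```r
library(splines)  # ships with R

set.seed(1)
n <- 100
DF <- data.frame(x = runif(n), t = runif(n))
DF$y <- DF$x + exp(DF$t) + rnorm(n, sd = 0.1)

# Stand-in for gam(y ~ x + s(t)): a linear model with a natural spline in t
fit_ns <- lm(y ~ x + ns(t, df = 4), data = DF)

# Centered contribution of each term to the fitted values; column 2 is the spline in t
terms_hat <- predict(fit_ns, type = "terms")
smooth_t <- terms_hat[, 2]

# Sort by t before drawing, otherwise the line zigzags between points
ord <- order(DF$t)
if (interactive()) {
  plot(DF$t[ord], smooth_t[ord], type = "l", xlab = "t", ylab = "f(t), centered")
}
```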

Drawing a 3D decision boundary of logistic regression

I have fitted a logistic regression model that takes 3 variables into account. I would like to make a 3D plot of the datapoints and draw the decision boundary (which I suppose would be a plane here).
I found an online example that applies to the case (so that you can load the data directly)
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")
I was thinking of using the scatterplot3d package, but I am not sure what equation I should write to draw the boundary. Any ideas?
Many thanks,
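The decision boundary will be a plane in 3-d space, which you could plot with any 3-d plotting package in R. The 50% boundary is where the linear predictor equals zero, so it can be solved for the third variable. I'll use persp, defining an x-y grid and then calculating the corresponding z value with the outer function: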
# Use iris dataset for example logistic regression
data(iris)
iris$long <- as.numeric(iris$Sepal.Length > 6)
mod <- glm(long~Sepal.Width+Petal.Length+Petal.Width, data=iris, family="binomial")
# Plot 50% decision boundary; another cutoff can be achieved by changing the intercept term
x <- seq(2, 5, by=.1)
y <- seq(1, 7, by=.1)
z <- outer(x, y, function(x, y) (-coef(mod)[1] - coef(mod)[2]*x - coef(mod)[3]*y) /
coef(mod)[4])
persp(x, y, z, col="lightblue")
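As a sanity check on the formula used for z above, the sketch below refits the same iris model, solves b0 + b1*x + b2*y + b3*z = 0 for z on a small grid, and confirms that the model predicts a probability of 0.5 at those points. The grid values and the names b, pts and p are chosen only for illustration.

```r
data(iris)
iris$long <- as.numeric(iris$Sepal.Length > 6)
mod <- glm(long ~ Sepal.Width + Petal.Length + Petal.Width,
           data = iris, family = "binomial")
b <- coef(mod)

# Solve b0 + b1*x + b2*y + b3*z = 0 for z on a small grid
pts <- expand.grid(Sepal.Width = seq(2, 5, by = 1),
                   Petal.Length = seq(1, 7, by = 2))
pts$Petal.Width <- (-b[1] - b[2] * pts$Sepal.Width - b[3] * pts$Petal.Length) / b[4]

# Every point on the plane should get a predicted probability of 0.5
p <- predict(mod, newdata = pts, type = "response")
range(p)
```

For a cutoff other than 0.5, the zero on the right-hand side of the equation becomes qlogis(cutoff), which is equivalent to shifting the intercept term.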
