I have a spatial polygons data frame and I am interested in a matrix of correlation coefficients for my variables.
The command
>cor(df)
returns the following error:
>Error in cor(df) : supply both 'x' and 'y' or a matrix-like 'x'
I can get pairwise coefficients if I run the following command:
>cor.test(df$var1, df$var2)
However, since I have 15 variables, that would mean running 105 pairwise tests. Is there a way to do it faster, i.e. return a matrix of correlation coefficients all in one table?
Thanks in advance!
cor() only works on matrices or data frames. You need to pull the data slot out of the SpatialPolygonsDataFrame:
library(sp)
# Build an example SpatialPolygonsDataFrame: a 10 x 10 grid of square polygons
grd <- GridTopology(c(1, 1), c(1, 1), c(10, 10))
polys <- as(grd, "SpatialPolygons")
centroids <- coordinates(polys)
x <- centroids[, 1]
y <- centroids[, 2]
z <- 1.4 + 0.1 * x + 0.2 * y + 0.002 * x * x
ex_1.7 <- SpatialPolygonsDataFrame(polys,
  data = data.frame(x = x, y = y, z = z, row.names = row.names(polys)))
# The data slot is a plain data.frame, so cor() works on it
class(slot(ex_1.7, "data"))
cor(slot(ex_1.7, "data"))
Example SpatialPolygonsDataFrame from the docs at:
?sp::`SpatialPolygonsDataFrame-class`
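Applied to the object in the question, that is simply (a sketch, assuming `df` is your SpatialPolygonsDataFrame and all of its attribute columns are numeric):
cor(slot(df, "data"))
# the @ accessor is equivalent:
cor(df@data)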
In R, I need to create a raster of probabilities from 4 rasters (distance from road, slope, grass cover and tree cover). For each of these I have created a formula to calculate a weight. Unfortunately I cannot share the data. The function below is what I have tried so far, but it is not working yet. It gives the error: Non-numeric argument to mathematical function. Any recommendations?
probabilities_raster <- function(tc, gc, road, slp){
  # Create structure to hold the data
  propxy_raster <- raster(ncol=100, nrow=100)
  ncell(propxy_raster)
  treecover <- (dnorm(tc, mean=0.7, sd=0.1))/(dnorm(0.7, mean=0.7, sd=0.1)) # not working
  grasscover <- (dnorm(gc, mean=0.3, sd=0.1))/(dnorm(0.3, mean=0.3, sd=0.1)) # not working
  road <- pnorm(-2+4*road) # not working
  slope <- exp(-10*slp) # this one works
  # Calculate weight
  weight <- treecover * grasscover * road * slope
  propxy_raster <- weight
  return(propxy_raster)
}
raster_1 <- probabilities_raster(tc=raster_treecover, gc=raster_grasscover, road=raster_road, slp=raster_slope)
Here is a minimal, self-contained reproducible example. Minimal is also important, because your question really should be:
"How can I use dnorm with a RasterLayer?"
library(raster)
tc <- raster()
values(tc) <- runif(ncell(tc))
x <- dnorm(tc, mean=0.7, sd=0.1)
#Error in dnorm(tc, mean = 0.7, sd = 0.1) :
# Non-numeric argument to mathematical function
I think what you are looking for is
x <- calc(tc, function(i) dnorm(i, 0.7, 0.3))
And with "terra" that would be
library(terra)
tc <- rast()
values(tc) <- runif(ncell(tc))
x <- app(tc, \(i) dnorm(i, 0.7, 0.3))
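Folding that back into the original function, a minimal sketch (my rewrite, not the original poster's code, assuming the four inputs are RasterLayers on the same grid):
library(raster)
probabilities_raster <- function(tc, gc, road, slp) {
  # wrap the dnorm/pnorm calls in calc(), since they are not raster-aware
  treecover  <- calc(tc, function(i) dnorm(i, 0.7, 0.1) / dnorm(0.7, 0.7, 0.1))
  grasscover <- calc(gc, function(i) dnorm(i, 0.3, 0.1) / dnorm(0.3, 0.3, 0.1))
  road       <- calc(road, function(i) pnorm(-2 + 4 * i))
  slope      <- exp(-10 * slp)  # exp() works on rasters directly
  treecover * grasscover * road * slope  # the combined weight raster
}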
I have successfully run this code, which I took from:
Can't Calculate pixel-wise regression in R on raster stack with fun
library(raster)
# Example data
r <- raster(nrow=15, ncol=10)
set.seed(0)
# Make 6 rasters (one per month), assigning each pixel's value randomly
s <- stack(lapply(1:6, function(i) setValues(r, rnorm(ncell(r), i, 3))))
names(s) <- paste0('Month', 1:6)
# Extract the pixel values
x <- values(s)
# Linear regression of Month6 on the other months
m <- lm(Month6 ~ ., data=data.frame(x))
# Prediction raster
p <- predict(s, m)
If you run that code, p will be a raster. But I am still confused: how do I make a raster for a future month? For example, I want a 'Month8' raster based on the 6 previous rasters.
What I mean is, each pixel has its own linear-regression equation (where X = Month1, ..., Month6). If I input X = Month8, I will get 150 cells of Y for the 8th month, one in each pixel of the raster.
What I have done
# Let's build a data frame for a clearer view of the data
x <- values(s)
DF <- data.frame(x)
# Make the month the X variable; each pixel column (V1, V2, ...) is a target
library(data.table)
DF_T <- transpose(DF)
Month <- seq(1, nrow(DF_T))
DF_T <- cbind(Month, DF_T)
# Fit a regression for the first pixel
V1_lr <- lm(V1 ~ Month, data=DF_T)
# Prediction for the 8th month in that pixel
V1_p <- predict(V1_lr, data.frame(Month=8))
V1_p
This is just one pixel. I want the entire raster for 'Month8'
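A per-pixel version of that same idea, as a minimal sketch (not from the thread; it assumes the example stack s built above, with layers Month1..Month6):
library(raster)
month <- 1:6
# Fit one regression per cell (value ~ month) and evaluate it at month 8
month8 <- calc(s, fun=function(v) {
  if (all(is.na(v))) return(NA_real_)  # skip empty cells
  fit <- lm(v ~ month)
  unname(predict(fit, data.frame(month=8)))
})
plot(month8)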
I have a column that is left-skewed and I need to transform it, so I tried this:
library(car)
vect <- c(1516201202, 1526238001, 1512050372, 1362933719, 1516342174, 1526502557, 1523548827, 1512241202, 1526417785, 1517846464)
powerTransform(vect)
The values in the vector are 10-digit numeric Unix epoch timestamps like these. I have a few thousand values; I am pasting 10 of them here, and I do the same operation on the entire column. This gave me an error:
Error in qr.resid(xqr, w * fam(Y, lambda, j = TRUE, ...)) : NA/NaN/Inf in foreign function call (arg 5)
I was expecting the transformed column back. Any idea on how to do this in R?
Thanks
Raj
Generally, car::powerTransform returns a powerTransform object (a list containing, amongst other things, the estimated Box-Cox transformation parameter(s)). To get the transformed values, you need bcPower, which takes the car::powerTransform output object and transforms the original data.
Unfortunately you don't provide sample data, so here's an example based on the iris dataset.
library(car)
# Box-Cox transformation of `Sepal.Length`
df <- iris
trans <- powerTransform(df$Sepal.Length)
# Or the same using formula syntax:
# trans <- powerTransform(Sepal.Length ~ 1, data = df)
# Add the transformed `Sepal.Length` data to the original `data.frame`
df <- cbind(
  df,
  Sepal.Length_trans = bcPower(
    with(iris, cbind(Sepal.Length)), coef(trans))[, 1])
# Show a histogram of the Box-Cox-transformed data
library(ggplot2)
ggplot(df, aes(Sepal.Length_trans)) +
  geom_histogram(bins = 30)
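Applied to the timestamps in the question, the same two steps should work, but the raw values are so large and so tightly clustered that the optimizer can overflow, which is one plausible source of the NA/NaN/Inf error. A sketch, under the assumption that rescaling first is acceptable:
vect_scaled <- vect / 1e9  # assumption: shrink the epoch seconds before transforming
trans <- powerTransform(vect_scaled)
vect_trans <- bcPower(vect_scaled, coef(trans))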
I have geographical data at the town level for 35,000 towns.
I want to estimate the impact of my covariates X on a dependent variable Y, taking spatial autocorrelation into account.
I first computed a weight matrix and then used the spautolm command from the spdep package, but it returned an error because my dataset is too large.
Do you have any ideas how I can fix this? Is there an equivalent command that would work?
library(haven)
library(tibble)
library(sp)
library(spdep)
library(data.table)
myvars <- c("longitude", "latitude", "Y", "X")
newdata2 <- na.omit(X2000[myvars]) # drop rows with missing values
df <- data.frame(newdata2)
newdata3 <- unique(df) # drop duplicates in terms of longitude and latitude
coordinates(newdata3) <- c("longitude", "latitude") # set the coordinates
coords <- coordinates(newdata3)
Sy4_nb <- knn2nb(knearneigh(coords, k = 4)) # neighbour list of the 4 nearest neighbours
idw <- lapply(nbdists(Sy4_nb, coords), function(x) 1/x) # inverse-distance general weights
Sy4_lw_idwB <- nb2listw(Sy4_nb, glist = idw, style = "B") # list weighted by distance
When I try to fit the model:
spautolm(formula = Y ~ X, data = newdata3, listw = Sy4_lw_idwB)
it returns: Error: cannot allocate vector of size 8.3 Gb
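One avenue worth trying (an assumption on my part, untested at this scale): spautolm now lives in the spatialreg package and accepts sparse-matrix fitting methods that avoid the dense n x n eigendecomposition behind that allocation:
library(spatialreg)
# method = "Matrix" uses sparse Cholesky factorizations instead of dense "eigen"
fit <- spautolm(Y ~ X, data = newdata3, listw = Sy4_lw_idwB, method = "Matrix")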
I am new to R, and I already have an SVM model in R. Right now I have two raster images: one is the elevation and the other is the slope. The elevation and slope are used as the predictors for the SVM, and I also want to plot the result as a map.
My code is below, but predict on the two-raster input returns all 0s when it should be 0 or 1. Is anything wrong?
library("e1071")
tornado=read.csv(file="~/Desktop/new.csv",header=TRUE,sep=",")
err<- rep(0,5)
m<-0
for (i in c(1:5)) {
#split the data sets into testing and training
training.indices <- sample(nrow(tornado), 1800)
training <- rep(FALSE, nrow(tornado))
training[training.indices] <- TRUE
tornado.input<- tornado[training,]
tornado.input=data.frame(tornado.input)
tornado=data.frame(tornado)
tornado$Sig <- factor(tornado$Sig)
model <- svm(Sig~slope+elevation, data=tornado.input)
pred<- predict(model, tornado[!training,] )
ConfM1<- table(tornado$Sig[!training], pred=pred)
err[i]<-(sum(ConfM1)-sum(diag(ConfM1)))/sum(ConfM1)
}
library("raster")
library("rgdal")
elevation <- raster("~/Desktop/elevation.tif")
slope<- raster("~/Desktop/slope.tif")
#plot(elevation)
#plot(slope)
logo <- brick(elevation, slope)
r1 <- predict(logo,model)
plot(r1)
Maybe it is a bit late to answer this question, but I have had the same issue. The raster::predict function does not seem to provide the same output as stats::predict.
My alternative solution is simply to extract the values from your predictor rasters (slope and elevation), then use ggplot to project the results spatially.
#### Convert the raster into a data frame
logo_df <- as.data.frame(values(logo))
logo_df[c("x", "y")] <- coordinates(logo)
logo_df <- logo_df[complete.cases(logo_df), ] # in case you had holes in your raster
#### Predict on this new data
pred <- predict(model, logo_df, probability = TRUE)
logo_df$svm.fit <- attr(pred, "probabilities")[, 2]
### Map the predictions
library(ggplot2)
ggplot(logo_df, aes(x, y, fill = svm.fit)) +
  geom_tile() +
  scale_fill_gradientn(colours = rev(colorRamps::matlab.like(100))) +
  coord_fixed()
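Note that attr(pred, "probabilities") is only populated when the SVM was fitted with probability = TRUE, so the training call from the question would need a one-line change (not in the original):
model <- svm(Sig ~ slope + elevation, data=tornado.input, probability = TRUE)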
I was having this problem and found that it worked when I renamed the layers of my RasterStack to their variable names and added the type option!
e.g.
names(logo) <- c("elevation", "slope")
r1 <- predict(logo, model, type="response")
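This works because raster::predict matches the layers of the input object to the model's predictor variables by name, so the layer names must equal the variable names used in the svm() formula.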