point pattern analysis in Spatstat - r

I am having some trouble setting up my data for some point pattern analysis.
What I want to do: conduct a point pattern analysis on NYC arrest data and see if there exists a spatial dependence between arrests and Covid-19 cases.
What I've done so far: downloaded data in the form of shapefiles
https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm (the ZIP code boundaries)
https://www1.nyc.gov/site/nypd/stats/crime-statistics/citywide-crime-stats.page (year to date data for arrests in NYC by zip code)
Code:
library(readxl)
library(rgdal) #Brings Spatial Data in R
library(spatstat) # Spatial Statistics
library(lattice) #Graphing
library(maptools)
library(raster)
library(ggplot2)
library(RColorBrewer)
library(broom)
# Load nyc zip code boundary polygon shapefile
s <- readOGR("/Users/my_name/Documents/project/zip","zip")
nyc <- as(s,"owin")
### OGR data source with driver: ESRI Shapefile
Source: "/Users/my_name/Documents/project/zip", layer: "zip"
with 263 features
# Load nyc arrests point feature shapefile
> s <- readOGR("/Users/my_name/Documents/project/nycarrests/","geo1")
### OGR data source with driver: ESRI Shapefile
Source: "/Users/my_name/Documents/project/nycarrests", layer: "geo1"
with 103376 features
It has 19 fields
#Converting the dataset into a point pattern
arrests <- as(s,"ppp")
### Error in as.ppp.SpatialPointsDataFrame(from) :
Only projected coordinates may be converted to spatstat class objects
This gave me the error above.
I understand that the error arises because the coordinates are geographic (longitude/latitude) rather than projected Cartesian coordinates. So my question is:
How can I convert my sp object to projected (Cartesian) coordinates so that I can turn it into a spatstat point pattern (a ppp object)?

You are looking for spTransform.
Here is some example data
library(raster)
filename <- system.file("external/lux.shp", package="raster")
p <- shapefile(filename)
Solution
utm <- "+proj=utm +zone=32 +datum=WGS84"
x <- spTransform(p, utm)
x
#class : SpatialPolygonsDataFrame
#features : 12
#extent : 266045.9, 322163.8, 5481445, 5563062 (xmin, xmax, ymin, ymax)
#crs : +proj=utm +zone=32 +datum=WGS84 +units=m +no_defs
#variables : 5
#names : ID_1, NAME_1, ID_2, NAME_2, AREA
#min values : 1, Diekirch, 1, Capellen, 76
#max values : 3, Luxembourg, 12, Wiltz, 312
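Applied to point data such as the arrests layer, the same idea can be sketched as follows. This is only a minimal sketch under stated assumptions, not the asker's actual workflow: it swaps in sf's st_transform() for spTransform(), the three coordinates are made up, UTM zone 18N (EPSG:32618) is an assumed projection for New York City, and the window is a plain bounding box standing in for the real boundary polygon.

```r
library(sf)
library(spatstat)

# Hypothetical arrest locations in lon/lat (WGS84); not the real data
pts <- data.frame(lon = c(-73.99, -73.95, -73.90),
                  lat = c(40.73, 40.78, 40.85))
pts_ll <- st_as_sf(pts, coords = c("lon", "lat"), crs = 4326)

# Project to UTM zone 18N (metres); an assumed choice that covers NYC
pts_utm <- st_transform(pts_ll, 32618)
xy <- st_coordinates(pts_utm)

# Bounding-box window as a stand-in for the real city boundary
win <- owin(xrange = range(xy[, "X"]), yrange = range(xy[, "Y"]))

# Build the planar point pattern from the projected coordinates
arrests <- ppp(xy[, "X"], xy[, "Y"], window = win)
arrests
```

With real data, the window would normally come from the projected boundary polygons rather than from the bounding box of the points.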

Related

How do I get this raster and this shapefile on the same projection?

I have a shapefile of 10 CA counties projected in NAD83(HARN) / California Albers, and a raster (a netCDF file of temperature) covering the entire US in WGS84 longitude/latitude. I want to use the shapefile to clip the raster, and I know that both need to share the same datum and projection first.
I first tried re-projecting the raster with raster::projectRaster(). That failed: the data disappeared. I then tried re-projecting the shapefile with sp::spTransform(). That also failed: the data don't overlap. Neither call throws an error. I've searched Stack Overflow without finding anything that helps. I suspect something specific is going on, such as the transformation from WGS84 to NAD83 or the way raster() loads the file, but it could easily be something simple that I'm missing.
my shapefile and raster are here: https://www.dropbox.com/sh/l3b2syzcioeqmyy/AAA5CstBZty4ofOcVFkAumNYa?dl=0
here is my code:
library(raster) #for creating rasters from .bil files
library(rgdal) #for reading .bil files and .gdb files
library(ncdf4) #for working with ncdf files
library(sp) #for working with spatial data files
load("my_counties.RData") #file names must be quoted strings
myraster <- raster("myraster.nc")
my.crs <- CRS("+init=epsg:3311") #NAD83(HARN) / California Albers (HARN = High Accuracy Reference Network)
newraster <- projectRaster(myraster, res = 6000, crs = my.crs) #raster resolution is 1/16th of a degree
#There is data in the raster.
plot(myraster)
#but none in newraster
plot(newraster)
#Now try re-projecting the shapefile
my.crs2 <- crs(myraster)
newshapefile <- spTransform(my_counties, my.crs2)
#but the data don't overlap
plot(newshapefile); plot(myraster, add = T)
You can do
library(raster)
library(rgdal)
load("my_counties.RData")
b <- brick("myraster.nc")
Now look at b
b
#class : RasterBrick
#dimensions : 490, 960, 470400, 365 (nrow, ncol, ncell, nlayers)
#resolution : 0.0625, 0.0625 (x, y)
#extent : 234, 294, 23.375, 54 (xmin, xmax, ymin, ymax)
#crs : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
#source : myraster.nc
#names : X2005.01.01, X2005.01.02, X2005.01.03, X2005.01.04, X2005.01.05, X2005.01.06, X2005.01.07, X2005.01.08, X2005.01.09, X2005.01.10, X2005.01.11, X2005.01.12, X2005.01.13, X2005.01.14, X2005.01.15, ...
#Date : 2005-01-01, 2005-12-31 (min, max)
#varname : tasmax
The horizontal extent runs from 234 to 294 degrees. That points to a system in which longitude starts at 0 at Greenwich and increases eastward to 360 (back at Greenwich). Climatologists often do that. To convert to the more conventional -180 to 180 degrees system:
r <- shift(b, -360)
(if your data had a global extent, you would use raster::rotate instead)
Now, transform the counties to lonlat and show that they overlap.
counties <- spTransform(my_counties, crs(r))
plot(r, 1)
lines(counties)
It is generally best to transform vector data, not raster data if you can avoid it.
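The longitude shift used above is plain arithmetic, which the following base R sketch spells out. The two input values are the x-extent from the printout of b; the variable names are only illustrative.

```r
# Longitudes of the raster's x-extent, taken from the printout of b
lon_360 <- c(234, 294)

# Values above 180 lie in the western hemisphere; subtract 360 to wrap them
lon_180 <- ifelse(lon_360 > 180, lon_360 - 360, lon_360)
lon_180   # -126 and -66
```

Because every cell of this raster lies beyond 180 degrees, shifting the whole object by -360 gives the same result as wrapping each longitude individually.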

Join SpatialPointsDataFrame and SpatialLinesDataFrame using over(), R

I am stuck with something that it seems should be quite straightforward. Apologies, I am new to using spatial data in R.
I am trying to map city data, onto a map of the world's coastlines. I have taken the coastlines from the natural earth data set (https://www.naturalearthdata.com/downloads/) 1:110m data and generated the spatial lines dataframe:
coast_rough_sldf
class : SpatialLinesDataFrame
features : 134
extent : -180, 180, -85.60904, 83.64513 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
variables : 3
names : scalerank, featurecla, min_zoom
min values : 0, Coastline, 0.0
max values : 1, Country, 1.5
I further have a dataset of cities, a sample of which looks as follows:
city_coast <- data.frame(Latitude = c(-34.60842, -34.47083, -34.55848, -34.76200, -34.79658, -34.66850),
Longitude = c(-58.37316, -58.52861, -58.73540, -58.21130, -58.27601, -58.72825),
Name1 = c("Buenos Aires", "San Isidro", "San Miguel", "Berazategui", "Florencio Varela", "Merlo"),
distance = c(7970.091, 5313.518, 26156.700, 11670.274, 18409.738, 33880.259))
city_coast
Latitude Longitude Name1 distance
1 -34.60842 -58.37316 Buenos Aires 7970.091
2 -34.47083 -58.52861 San Isidro 5313.518
3 -34.55848 -58.73540 San Miguel 26156.700
4 -34.76200 -58.21130 Berazategui 11670.274
5 -34.79658 -58.27601 Florencio Varela 18409.738
6 -34.66850 -58.72825 Merlo 33880.259
I then successfully create the spatial points dataframe:
city_spdf <- SpatialPointsDataFrame(coords = select(city_coast, c("Longitude", "Latitude")),
proj4string = CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84"),
data = select(city_coast, c("Name1", "distance")))
city_spdf
class : SpatialPointsDataFrame
features : 6
extent : -58.7354, -58.2113, -34.79658, -34.47083 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
variables : 2
names : Name1, distance
min values : Berazategui, 5313.518
max values : San Miguel, 33880.259
Now I want to join city_spdf with coast_rough_sldf so that I can plot them using tmap. Judging by the tutorials I have found, I should use over():
city_coast_shp <- over(coast_rough_sldf, city_spdf)
city_coast_shp
Name1 distance
1 <NA> NA
Which is clearly wrong. Switching the order of the objects changes the result but still doesn't give me what I need.
Can anyone tell me what I am getting wrong with the over() function? Every example I have seen simply has people joining the two spatial objects. Apologies if I am missing something extremely simple.
As @elmuertefurioso points out in the comments, I think one reason this isn't working as you expect is confusion over geometry types.
Since the coastline data are lines, not polygons like data(World) from tmap, you are somewhat restricted in the calculations and comparisons you can make with the cities, which are points.
Reading in the data the sf way:
library(sf)
# downloaded from https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_coastline.zip
coastline <- read_sf("~/Downloads/ne_110m_coastline/ne_110m_coastline.shp")
cities <- data.frame(
Latitude = c(-34.60842, -34.47083, -34.55848, -34.76200, -34.79658, -34.66850),
Longitude = c(-58.37316, -58.52861, -58.73540, -58.21130, -58.27601, -58.72825),
Name1 = c("Buenos Aires", "San Isidro", "San Miguel", "Berazategui", "Florencio Varela", "Merlo"),
distance = c(7970.091, 5313.518, 26156.700, 11670.274, 18409.738, 33880.259)
)
In order to do any comparisons between sf objects they must have the same Coordinate Reference System. So as we read in cities we will set the CRS to be that of coastline.
cities <- st_as_sf(
cities,
coords = c("Longitude", "Latitude"), # must be x, y order
crs = st_crs(coastline) # must be equivalent between objects
)
Now you can make comparisons using the st_{comparison}() family of functions.
The function over() and its sf counterpart st_intersects() would work on a set of points and polygons, but we don't have that here. We can use distance functions like st_nearest_feature() with points and lines, to get the closest geometry from coastline for each city.
st_nearest_feature(cities, coastline)
It returns the row index of the nearest geometry in coastline, which happens to be the same for all the cities here because they are all in Argentina. The order of the arguments matters because it defines the question being asked. If we flipped it to st_nearest_feature(coastline, cities), it would return the closest city for each geometry in coastline, so the result would have 134 elements.
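The effect of the argument order can be checked on a toy example. This is a small sketch with made-up planar coordinates and no CRS, not the Natural Earth data: two horizontal line segments stand in for coastline pieces and three points stand in for cities.

```r
library(sf)

# Two toy "coastline" segments: one along y = 0, one along y = 10
toy_lines <- st_sfc(
  st_linestring(rbind(c(0, 0), c(10, 0))),
  st_linestring(rbind(c(0, 10), c(10, 10)))
)

# Three toy "cities" (made-up coordinates)
toy_pts <- st_sfc(st_point(c(1, 1)), st_point(c(5, 2)), st_point(c(9, 9)))

# One answer per point: index of the nearest line
pts_to_lines <- st_nearest_feature(toy_pts, toy_lines)

# One answer per line: index of the nearest point
lines_to_pts <- st_nearest_feature(toy_lines, toy_pts)

length(pts_to_lines)  # 3, one per point
length(lines_to_pts)  # 2, one per line
```

The result always has one element per feature of the first argument, which is why flipping the arguments changes both the length and the meaning of the output.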
All that to say you don't actually have to do any joining or comparisons to plot your points together on the same tmap.
library(tmap)
tmap_mode("view")
tm_shape(coastline) +
tm_lines() +
tm_shape(cities) +
tm_bubbles("distance")
I'm not a regular tmap user, but I zoomed in and took a screenshot to confirm that it works.

Converting shapefile to raster

I'm having an issue rasterizing a shapefile to produce points on a 0.5*0.5 grid. The shapefile represents classifications of risk level (Low-0, Medium-100, High-1000, Very High-1500) of global coral reefs to integrated threats.
I pulled the code from another example that works fine, but when I try it for my data I get nothing from the plot function. See below for the link to the shapefile and my code:
Reefs At Risk: Global Integrated Threats
# Read shapefile into R
library(rgdal)
library(raster)
int.threat.2030 <- readOGR(dsn = "Global_Threats/Integrated_Future",
layer = "rf_int_2030_poly")
## Set up a raster "template" for a 0.5 degree grid
ext <- extent(-110, -50, 0, 35)
gridsize <- 0.5
r <- raster(ext, res=gridsize)
## Rasterize the shapefile
rr <- rasterize(int.threat.2030, r)
## Plot raster
plot(rr)
Any ideas where I might be going wrong? Is it an issue with the shapefile itself?
Please and thanks!
You assumed that the polygons were in lon/lat coordinates, but they are not:
library(raster)
library(rgdal)
p <- shapefile('Global_Threats/Integrated_Future/rf_int_2030_poly.shp')
p
#class : SpatialPolygonsDataFrame
#features : 63628
#extent : -18663508, 14601492, -3365385, 3410115 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=cea +lon_0=-160 +lat_ts=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
#variables : 3
#names : ID, THREAT, THREAT_TXT
#min values : 1, 0, Critical
#max values : 63628, 2000, Very High
You can either change the projection
pgeo <- spTransform(p, CRS('+proj=longlat +datum=WGS84'))
and then do something like:
ext <- floor(extent(pgeo))
rr <- raster(ext, res=0.5)
rr <- rasterize(pgeo, rr, field=1)
Or keep the original CRS and do something like:
ext <- extent(p)
r <- raster(ext, res=50000)
r <- rasterize(p, r, field=1)
plot(r)
Note that you are rasterizing very small polygons to large raster cells. A cell gets a polygon's value only if the polygon covers the center of that cell (a rule that assumes polygons span multiple cells). So for these data you would need to use a much higher resolution (and then perhaps aggregate the results). Alternatively, you could rasterize the polygon centroids.
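The cell-center rule can be seen on a toy grid. The sketch below uses made-up squares rather than the reef polygons, and square() is a hypothetical helper written only for this illustration.

```r
library(sp)
library(raster)

# Hypothetical helper: an axis-aligned square as a SpatialPolygons object
# (ring given in clockwise order and explicitly flagged as not a hole)
square <- function(x0, y0, size, id) {
  xy <- rbind(c(x0, y0), c(x0, y0 + size), c(x0 + size, y0 + size),
              c(x0 + size, y0), c(x0, y0))
  SpatialPolygons(list(Polygons(list(Polygon(xy, hole = FALSE)), ID = id)))
}

# 2 x 2 template: cells are 5 units wide, so centers sit at 2.5 and 7.5
template <- raster(extent(0, 10, 0, 10), res = 5)

small <- square(0, 0, 1, "small")  # misses every cell center
large <- square(0, 0, 4, "large")  # covers the center at (2.5, 2.5)

# Count the cells that received a value in each case
n_small <- sum(!is.na(values(rasterize(small, template, field = 1))))
n_large <- sum(!is.na(values(rasterize(large, template, field = 1))))
c(n_small, n_large)
```

The small square lies entirely inside one cell yet leaves the raster empty, which is the situation described above for the reef polygons on a coarse grid.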
But none of the above is relevant really, as you are doing this all backwards. The polygons are clearly derived from a raster (look how blocky they are) and the raster is available in the dataset you point to!
So instead of rasterizing, do:
x <- raster('Global_Threats/Integrated_Future/rf_int_2030')
x
#class : RasterLayer
#dimensions : 25456, 80150, 2040298400 (nrow, ncol, ncell)
#resolution : 500, 500 (x, y)
#extent : -20037508, 20037492, -6363885, 6364115 (xmin, xmax, ymin, ymax)
#coord. ref. : NA
#data source : C:\temp\Global_Threats\Integrated_Future\rf_int_2030
#names : rf_int_2030
#values : 0, 2000 (min, max)
#attributes :
# ID COUNT THREAT_TXT
# 0 80971 Low
# 100 343535 Medium
# 1000 322231 High
# 1500 168518 Very High
# 2000 83598 Critical
Here plotting a part of Palawan:
e <- extent(c(-8990636, -8929268, 1182946, 1256938))
plot(x, ext=e)
plot(p, add=TRUE)
If you need a lower resolution see raster::aggregate. For a different coordinate reference system, see raster::projectRaster.
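As a rough illustration of raster::aggregate, here is a sketch on a made-up 4 x 4 raster. Using max as the summary function, so that each coarse cell keeps the highest value it contains, is an assumption chosen for this example and not something the dataset prescribes.

```r
library(raster)

# Made-up 4 x 4 raster with values 1..16, filled row by row from the top
r <- raster(nrows = 4, ncols = 4, xmn = 0, xmx = 4, ymn = 0, ymx = 4)
values(r) <- 1:16

# Merge each 2 x 2 block of cells into one, keeping the block maximum
coarse <- aggregate(r, fact = 2, fun = max)

dim(coarse)     # 2 rows, 2 columns, 1 layer
values(coarse)  # 6, 8, 14, 16
```

For coded classes such as the threat levels, a mean would produce values that match no class, so a function like max or a modal value is usually the safer choice.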

netCDF to raster and projection

I've recently started using R for spatial data. I'd be really grateful if you could help me with this. Thanks!
I've extracted data from a multidimensional netCDF file. This file had longitude, latitude and temperature data (for 12 months of a specific year).
From this netCDF I've got a data frame for January with these variables: longitude, latitude, temperature.
With this data frame I've created a raster.
# Packages
library("sp")
library("raster")
library("rgdal")
library("ncdf")
library("maptools")
library("rgeos")
library("sm")
library("chron")
# Dataframe to raster
# Create spatial points data frame
coordinates(tmp.df01) <- ~ lon + lat
# Coerce to SpatialPixelsDataFrame
gridded(tmp.df01) <- T
# Coerce to raster
rasterDF1 <- raster(tmp.df01)
> print(tmp.df01)
class : SpatialPixelsDataFrame
dimensions : 103, 241, 24823, 1 (nrow, ncol, npixels, nlayers)
resolution : 0.02083333, 0.02083333 (x, y)
extent : 5.739583, 10.76042, 45.73958, 47.88542 (xmin, xmax, ymin, ymax)
coord. ref. : NA
names : TabsM_1
min values : -18.1389980316162
max values : 2.26920962333679
There is no value for 'coord. ref.'
The projection of the original netCDF was WGS84. So I gave this projection to the raster.
proj4string(rasterDF1) <- "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
Then, I wanted to reproject my raster to another projection:
# Reprojecting into CH1903_LV03
# First, change the coordinate reference system (crs)
proj4string(rasterDF1) <- "+init=epsg:21781"
# Second, reproject the raster
rasterDF1.CH <- spTransform(rasterDF1, crs("+init=epsg:21781"))
At this point I get the following error:
Error in spTransform(rasterDF1, crs("+init=epsg:21781")) :
load package rgdal for spTransform methods
But the rgdal package is already loaded! There must be something wrong in the code!
Here is the code that solves the problem described.
Solution provided by Frede Aakmann Tøgersen.
tmp.df01 # tmp.df01 is a data.frame
coordinates(tmp.df01) <- ~ lon + lat # tmp.df01 is now a SpatialPointsDataFrame
# Assign original data projection
proj4string(tmp.df01) <- CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
gridded(tmp.df01) <- TRUE # tmp.df01 is now a SpatialPixelsDataFrame
# Coerce to raster
rasterDF1 <- raster(tmp.df01) # rasterDF1 is a RasterLayer
# To reproject the raster layer
rasterDF1.proj <- projectRaster(rasterDF1, crs=CRS("+init=epsg:21781"))
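The key difference from the question's code is that the original CRS is assigned once and the data are then transformed, instead of the CRS label being overwritten. The same distinction can be shown on a single vector point with sf; this is an analogous sketch with a made-up coordinate near Bern, not the raster workflow itself.

```r
library(sf)

# A single made-up point near Bern, in lon/lat (WGS84)
p <- st_sfc(st_point(c(7.44, 46.95)), crs = 4326)

# Transforming recomputes the coordinates in the target CRS (CH1903 / LV03)
p_ch <- st_transform(p, 21781)
xy_ch <- st_coordinates(p_ch)   # eastings/northings in metres

# Re-labelling only swaps the metadata; the numbers stay 7.44 and 46.95
p_wrong <- st_set_crs(p, 21781) # sf warns that this does not reproject
xy_wrong <- st_coordinates(p_wrong)

xy_ch
xy_wrong
```

Overwriting the CRS of lon/lat data with EPSG:21781, as the question does with proj4string<-, leaves degree values that are then misread as metres.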
Another option is GDAL. Using gdal_translate, you can convert a netCDF file with bands to GeoTIFFs.
gdal_translate -a_srs EPSG:4326 NETCDF:File_Name.nc:Band_Name -of GTiff Output_FileName.geotiff
For more options, see the gdal_translate documentation.

R : error in moving a Raster using Shift function: It does NOT move. Why?

I have an nc file (which I can read as a raster in R and ArcGIS) and a polygon shapefile for Canada.
When I open both files in ArcMap, it re-projects them on the fly and the polygon appears in the right location.
Then, I use R to open and plot them using the code:
## set the working directory:
setwd("C:/Users/Desktop/Data")
## required libraries
library(raster)
library(rgdal)
library(ncdf4)
## reading the polygon:
My_study_area <- readOGR(dsn = ".", layer = "Canada_AB")
## create a connection to the downloaded NC file:
fn <- "tasmin_day_CanESM5_ssp126_r1i1p1f1_gn_21010101-23001231_cliped_AB.nc"
ncin <- nc_open(fn)
ncin
## create a raster from the nc file:
fn_brick <- brick(fn)
fn_brick
extent(fn_brick)
Here is the output:
class : Extent
xmin : 229.2188
xmax : 271.4062
ymin : 39.06842
ymax : 64.18258
## since it has 73,000 layers, let us only work on one layer: fn_brick[[1]]
My Goal:
## I want to shift this raster in the x direction for -177.1876 units:
## why -177.1876 units ?
## because I know if I shift it for -177.1876 units, the raster and the polygon will be in the right place
fn_brick_shifted <- shift(fn_brick[[1]], dx= -177.1876, dy= 0, filename="new_nc_file", format="GTiff", datatype='FLT4S', overwrite=T)
extent(fn_brick_shifted)
Despite the shift, the raster's extent remains unchanged:
class : Extent
xmin : 229.2188
xmax : 271.4062
ymin : 39.06842
ymax : 64.18258
However, let us plot them:
plot(fn_brick_shifted)
lines(My_study_area)
In the resulting R plot, the raster is NOT shifted, so the polygon is not drawn in the right place.
Here is the coordinate reference information for both files:
For the raster:
crs(fn_brick_shifted)
CRS arguments:
+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
For the polygon:
crs(My_study_area)
CRS arguments:
+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
Did I make any mistake in using the shift function for the raster?
Do I need to employ another function for obtaining my goal?
Any comment or help would be highly appreciated.
If you think having the raster and polygon files may help, you could download them here in a zip file:
nc and polygon.zip
