Export R plot to shapefile

I am fairly new to R, but not to ArcView. I am plotting some two-mode data, and want to convert the plot to a shapefile. Specifically, I would like to convert the vertices and the edges, if possible, so that I can get the same plot to display in ArcView, along with the attributes.
I've installed the package "shapefiles", and I see the convert.to.shapefile command, but the help doesn't talk about how to assign XY coords to the vertices.
Thank you,
Tim

Ok, I'm making a couple of assumptions here, but I read the question as: you're looking to assign spatial coordinates to a bipartite graph and export the vertices as a point shapefile and the edges as polylines for use in ArcGIS.
This solution is a little kludgey, but it will make shapefiles with coordinate limits (xmin, ymin and xmax, ymax) of -0.5 and +0.5. It will be up to you to decide on the graph layout algorithm (e.g. Kamada-Kawai) and to project the shapefiles into the desired coordinate system once they are in ArcGIS, as per @gsk3's suggestion. Additional attributes for the vertices and edges can be added where the points.data and edge.data data frames are created.
library(igraph)
library(shapefiles)
# Create dummy incidence matrix
inc <- matrix(sample(0:1, 15, repl=TRUE), 3, 5)
colnames(inc) <- c(1:5) # Person ID
rownames(inc) <- letters[1:3] # Event
# Create bipartite graph
g.bipartite <- graph.incidence(inc, mode="in", add.names=TRUE)
# Plot figure to get xy coordinates for vertices
tk <- tkplot(g.bipartite, canvas.width=500, canvas.height=500)
tkcoords <- tkplot.getcoords(tk, norm=TRUE) # Get coordinates of nodes centered on 0 with +/-0.5 for max and min values
# Create point shapefile for nodes
n.points <- nrow(tkcoords)
points.attr <- data.frame(Id=1:n.points, X=tkcoords[,1], Y=tkcoords[,2])
points.data <- data.frame(Id=points.attr$Id, Name=paste("Vertex", 1:n.points, sep=""))
points.shp <- convert.to.shapefile(points.attr, points.data, "Id", 1)
write.shapefile(points.shp, "~/Desktop/points", arcgis=TRUE)
# Create polylines for edges in this example from incidence matrix
n.edges <- sum(inc) # number of edges based on incidence matrix
Id <- rep(1:n.edges,each=2) # Generate Id number for edges.
From.nodes <- g.bipartite[[4]]+1 # Get position of "From" vertices in incidence matrix
To.nodes <- g.bipartite[[3]]-max(From.nodes)+1 # Get position of "To" vertices in incidence matrix
# Generate index where position alternates between "From.node" to "To.node"
node.index <- matrix(t(matrix(c(From.nodes, To.nodes), ncol=2)))
edge.attr <- data.frame(Id, X=tkcoords[node.index, 1], Y=tkcoords[node.index, 2])
edge.data <- data.frame(Id=1:n.edges, Name=paste("Edge", 1:n.edges, sep=""))
edge.shp <- convert.to.shapefile(edge.attr, edge.data, "Id", 3)
write.shapefile(edge.shp, "~/Desktop/edges", arcgis=TRUE)
Hope this helps.

I'm going to take a stab at this based on a wild guess as to what your data looks like.
Basically you'll want to coerce the data into a data.frame with two columns containing the x and y coordinates (or lat/long, or whatever).
library(sp)
data(meuse.grid)
class(meuse.grid)
coordinates(meuse.grid) <- ~x+y
class(meuse.grid)
Once you have it as a SpatialPointsDataFrame, the sp family of packages provides some decent functionality, including exporting shapefiles (writePointsShape actually lives in maptools):
library(maptools)
writePointsShape(meuse.grid, "/home/myfiles/wherever/myshape.shp")
These examples are drawn from the relevant help files: ?coordinates, ?SpatialPointsDataFrame, and ?readShapePoints.
At least a few years ago when I last used sp, it handled projections well but was very bad about writing projection information to the shapefile. So it's best to leave the coordinates untransformed and manually tell Arc what projection they are in. Or use writeOGR (from rgdal) rather than writePointsShape.
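A sketch of the writeOGR route, which also writes a .prj file when the object has a CRS set; the dsn folder and layer name here are illustrative:
library(rgdal)
# Illustrative paths: dsn is the output directory, layer the shapefile name.
# writeOGR writes projection info to a .prj file if a proj4string is set.
writeOGR(meuse.grid, dsn = "/home/myfiles/wherever", layer = "myshape", driver = "ESRI Shapefile")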

Related

How to convert "im" pixel image to raster?

I am trying to convert an "im" pixel image I've produced into a raster image. The "im" was created with the following code:
library(auk)    # read_ebd() comes from the auk package
library(dplyr)  # for the %>% pipe
library(sf)
library(spatstat)
library(rgeos)
library(raster)
# read ebird data
ebd_species <- "ebd_hooded.txt" %>%
  read_ebd()
# extracting coordinates
latitude_species <- ebd_species$latitude
longitude_species <- ebd_species$longitude
#convert to spatial object
coordinates1 <- data.frame(x = longitude_species, y = latitude_species) %>% st_as_sf(coords = c("x", "y"))
# converting to point pattern data
coordinates <- as.ppp(coordinates1)
# density image
a <- density(coordinates,2)
plot(a)
This is the plot I get:
[plot of the resulting density image]
What I want to do is convert this into a raster. I then want to use the coordinates of the eBird data to extract the density values from the raster.
Here is a minimal, self-contained, reproducible example (based on the first example in ?im):
library(spatstat)
mat <- matrix(1:1200, nrow=30, ncol=40, byrow=TRUE)
m <- im(mat)
Solution
library(raster)
r <- raster(m)
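To then pull density values back out at point locations, one option is raster's extract(); this is a sketch, with coords_xy standing in for a two-column matrix of point coordinates (the at = "points" approach below avoids the raster round-trip entirely):
# Hypothetical: coords_xy holds x/y coordinates in the raster's coordinate system
coords_xy <- cbind(x = c(10, 20), y = c(15, 25))
dens_at_pts <- extract(r, coords_xy)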
Looks like you are using geographic coordinates (longitude, latitude) directly in spatstat. Are you sure this is OK in your context? For regions away from the equator this can be quite misleading. Consider projecting to planar coordinates using sf::st_transform() (see my other answers on this site for code to do this). Also, in newer versions of sf you can convert directly from sf to spatstat format with e.g. as.ppp().
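A minimal sketch of that projection step, assuming the points are WGS84 longitude/latitude and using UTM zone 17N (EPSG:32617) purely for illustration:
library(sf)
library(spatstat)
# Assumed: coordinates1 is the sf points object built above, with no CRS yet
st_crs(coordinates1) <- 4326                          # declare lon/lat WGS84
coordinates_utm <- st_transform(coordinates1, 32617)  # illustrative UTM zone
coordinates <- as.ppp(coordinates_utm)                # planar point pattern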
If you want a kernel density estimate of the intensity at the data points you can use the option at = "points" in density.ppp():
a <- density(coordinates, 2, at = "points")
Then a is simply a vector with length equal to the number of points containing the intensity estimate for each data point. This uses "leave-one-out" estimation by default to minimize bias (see the help file for density.ppp).

Create neighborhood list of a large dataset / speed it up

I want to create a weight matrix based on distance. My code currently looks as follows and works for a smaller sample of the data. However, with the full dataset (569,424 individuals in 24,077 locations) it doesn't go through; the problem arises at the nb2blocknb function. So my question is: how can I optimize my code for large datasets?
library(spdep)
# load all survey data
DHS <- read.csv("Daten/final.csv")
# define coordinates matrix
coormat <- cbind(DHS$location, DHS$lon_s, DHS$lat_s)
colnames(coormat) <- c("location", "lon_s", "lat_s")
coo <- as.data.frame(unique(coormat))
coor <- cbind(coo$lon_s, coo$lat_s)
# get a list of neighbouring locations that are within 50 km of each other
neighbor <- dnearneigh(coor, d1 = 0, d2 = 50, row.names=coo$location, longlat=TRUE, bound=c("GE", "LE"))
# get neighborhood list on individual level
nb <- nb2blocknb(neighbor, as.character(DHS$location))
# weight matrix in list format
nbweights.lw <- nb2listw(nb, style="B", zero.policy=TRUE)
Thanks a lot for your help!
You're trying to make about 1.3e10 distance calculations (569,424 × 24,077); the results would run to gigabytes.
I think you'd want to limit either the maximum distance or the number of nearest neighbours you're looking for. Try nn2 from the RANN package:
library('RANN')
nearest_neighbours_w_distance <- nn2(coordinatesA, coordinatesB, k = 10)
Note that this operation is not symmetric (switching coordinatesA and coordinatesB gives different results).
Also, you would first have to convert your GPS coordinates to a coordinate reference system in which you can calculate Euclidean distances, for example UTM (a sketch, untested in the original):
library("sp")
gps2utm<-function(gps_coordinates_matrix,utmzone){
spdf<-SpatialPointsDataFrame(gps_coordinates_matrix[,1],gps_coordinates_matrix[,2])
proj4string(spdf) <- CRS("+proj=longlat +datum=WGS84")
return(spTransform(spdf, CRS(paste0("+proj=utm +zone=",utmzone," ellps=WGS84"))))
}
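Hypothetical usage, where lonlat_A and lonlat_B are two-column longitude/latitude matrices and UTM zone 32 is purely illustrative:
# Assumed inputs: lonlat_A and lonlat_B (lon/lat matrices); zone 32 chosen arbitrarily
utm_A <- coordinates(gps2utm(lonlat_A, 32))
utm_B <- coordinates(gps2utm(lonlat_B, 32))
nn <- nn2(utm_A, utm_B, k = 10)  # nn$nn.dists are now Euclidean distances in metres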

Dealing with unordered XY points to create a polygon shapefile in R

I've inherited a geodatabase of polygons of lakes for which I am trying to create sampling grids on each lake. My current strategy is to export the spatial data to CSV, use R to run a loop to create the grids on each lake, and then write to a new shapefile. However, here is my problem: when exporting to a CSV, the WKT strings get messed up and split across different lines. Okay, no problem; I moved on to exporting just the geometry to CSV so that I get X-Y values. When I simply plot the points they look perfect (using plot(y~x)), but the points are not in order. So, when I transform the data to a SpatialPolygon with the sp package in R using the following sequence:
XY-points -> Polygon -> Polygons -> SpatialPolygon
and then plot the SpatialPolygon, I get this:
[plot of the scrambled, self-intersecting polygon]
I know this is an artifact of incorrectly ordered points, because when I order the points by X and then by Y and run the same procedure, here is what I get:
[plot of the polygon after sorting the points by X and then Y]
This is what the correct plotting is supposed to look like (X-Y data plotted with open circles):
[scatter plot of the correct lake outline]
Here is a short reproducible example of what I am trying to deal with:
library(sp)
# correct polygon
data <- data.frame(x=c(1:10, 10:1), y=c(5:1, 1:10, 10:6))
# plot(y~x, data=data)
correct.data.points <- rbind(data, data[1, ]) # repeat first point to close the ring
correct.data.coords <- as.matrix(correct.data.points)
correct.data.poly <- Polygon(correct.data.coords, hole=FALSE)
correct.data.poly <- Polygons(list(correct.data.poly), ID=0)
correct.data.poly.sp <- SpatialPolygons(list(correct.data.poly))
plot(correct.data.poly.sp)
# incorrect polygon
scr.data <- data[sample(1:20), ]
# plot(y~x, data=scr.data)
scr.data.points <- rbind(scr.data, scr.data[1, ]) # repeat first point to close the ring
scr.data.coords <- as.matrix(scr.data.points)
scr.data.poly <- Polygon(scr.data.coords, hole=FALSE)
scr.data.poly <- Polygons(list(scr.data.poly), ID=0)
scr.data.poly.sp <- SpatialPolygons(list(scr.data.poly))
plot(scr.data.poly.sp)
Any thoughts? Thanks for any help or insight anyone can provide. Also, for reference I am using QGIS 2.6.0 and the MMQGIS Python plugin to do the geometry exporting.

In R, how to average spatial points data over spatial grid squares

Managed to solve the problem now; solution below.
I have a set of around 50 thousand points that have coordinates and one value associated with them. I would like to be able to place points into a grid averaging the associated value of all points that fall into a grid square. So I want to end up with an object that identifies each grid square and gives the average inside the grid square.
I have the data in a spatial points data frame and a spatial grid object if that helps.
Improving the question: I have definitely done some searching; sorry about the initial state of the question. I had only managed to frame the question inside my own head and hadn't had to communicate it to anyone else before...
Here is example data that hopefully illustrates the problem more clearly
##make some data
library(sp)
longi <- runif(100, 0, 10)
lati <- runif(100, 0, 10)
value <- runif(100, 20, 30) # one value per point
##put in data frame then change to spatial data frame
df <- data.frame("lon"=longi, "lat"=lati, "val"=value)
coordinates(df) <- c("lon", "lat")
proj4string(df) <- CRS("+proj=longlat")
##create a grid that bounds the data
grd <- GridTopology(cellcentre.offset=bbox(df)[,1],
                    cellsize=c(1,1), cells.dim=c(11,11))
sg <- SpatialGrid(grd)
Then I hope to get an object (a vector, data frame, or list) that gives me the average of value in each grid cell/square and some way of identifying which cell it is.
Solution
##convert the grid into a polygon##
polys <- as.SpatialPolygons.GridTopology(grd)
proj4string(polys) <- CRS("+proj=longlat")
##can now use the function over to select the correct points and average them
results <- rep(0, length(polys))
for(i in 1:length(polys)) {
results[i] = mean(df$val[which(!is.na(over(x=df,y=polys[i])))])
}
My question now is: is this the best way to do it, or is there a more efficient way?
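As a sketch of one vectorized alternative: when y is a SpatialPolygons object, over() returns the index of the polygon containing each point, so the loop can collapse to a single tapply():
# Sketch: over() gives each point's polygon index (NA if outside any cell);
# unlike the loop above, empty cells are simply absent rather than NaN.
cell.index <- over(df, polys)
results.fast <- tapply(df$val, cell.index, mean)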
Your description is vague at best. Please try to ask more specific questions, preferably with code illustrating what you have already tried. Averaging a single value in your point data or a single raster cell makes absolutely no sense.
The best guess at an answer I can provide is to use raster's extract() to assign the raster values to an sp point object and then use tapply() to aggregate the values by your grouping variable in the points. You can use the coordinates of the points to identify the cell location or, alternatively, the cell numbers returned from extract (per the example below).
require(raster)
require(sp)
# Create example data
r <- raster(ncol=500, nrow=500)
r[] <- runif(ncell(r))
pts <- sampleRandom(r, 100, sp=TRUE)
# Add a grouping value to points
pts@data <- data.frame(ID=rownames(pts@data), group=c(rep(1,25), rep(2,25),
                       rep(3,25), rep(4,25)))
# Extract raster values and add to the @data slot dataframe. Note, the "cells"
# attribute indicates the cell index in the raster.
pts@data <- data.frame(pts@data, extract(r, pts, cellnumbers=TRUE))
head(pts@data)
# Use tapply to calculate group means
tapply(pts@data$layer, pts@data$group, FUN=mean)

How to find a point related to a set of coordinates?

I have a set of about 5000 geographical (WGS84) coordinates, all of them inside a 40 km square.
Is there an algorithm / R function to find the point inside the square, and not in the given set, that is farthest from any point in the set?
In other words, how do I find the point in the square whose distance to the nearest point of the set is greatest?
At the moment I do it by generating a grid of equally spaced coordinates and finding the distance from each grid point to the nearest set point. Is there a less numerical / non-brute-force method?
EDIT:
I made a mistake in a previous version of the question. Maybe this will help:
The set of points gives the coordinates of 5000 shops in a city. I want to find the place in the city where the distance to the nearest shop is greatest.
I think that if the point you seek isn't on the edge of the box, then it has to be at a vertex of the Voronoi tessellation of the points. If it is on the edge of the box, then it has to be on the intersection of the box with an edge of the tessellation.
So if you compute the Voronoi tessellation and then use rgeos to intersect it with the box, that gives you a set of candidate points. You can then use the FNN package to compute the nearest-neighbour distances from those candidate points to the data points, sort, and take the candidate with the largest nearest-neighbour distance.
That gives you an exact point without any of this gridding business. If it wasn't so close to bedtime I'd sort out some code to do it. You probably want the deldir package for Voronoi tessellations. It might even already do the box intersection...
Okay, not quite bedtime. Here's the solution:
findM <- function(pts, xmin, xmax, ymin, ymax){
  require(deldir)
  require(FNN)
  # Voronoi tessellation clipped to the bounding rectangle
  d <- deldir(pts[,1], pts[,2], rw=c(xmin, xmax, ymin, ymax))
  # candidate points: both endpoints of every tessellation segment...
  vpts <- rbind(as.matrix(d$dirsgs[,1:2]), as.matrix(d$dirsgs[,3:4]))
  # ...plus the four corners of the box
  vpts <- rbind(vpts, cbind(c(xmin, xmax, xmin, xmax), c(ymin, ymin, ymax, ymax)))
  vpts <- vpts[!duplicated(vpts),]
  # distance from each candidate to its nearest data point
  nn <- get.knnx(pts, vpts, k=1)
  ptmin <- which(nn$nn.dist == max(nn$nn.dist))
  list(point = vpts[ptmin, , drop=FALSE], dist = nn$nn.dist[ptmin])
}
Edited version now returns one point and adds the corner points as possibles.
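A quick illustrative run on random points in the unit square (output varies with the seed):
# Illustrative only: 50 random points in the unit square
set.seed(42)
pts <- cbind(runif(50), runif(50))
findM(pts, 0, 1, 0, 1)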
Here's an example that uses several functions (distanceFromPoints(), maxValue(), Which(), and xyFromCell()) from the raster package to perform the key calculations:
# Load required libraries
library(sp)
library(rgdal)
library(raster)
# Create a SpatialPoints object with 10 points randomly sampled from
# the area lying between longitudes 0 and 1 and latitudes 0 and 1
bbox <- matrix(c(0,0,1,1), ncol=2, dimnames = list(NULL, c("min", "max")))
PRJ4 <- CRS("+proj=longlat +datum=WGS84 +ellps=WGS84")
S <- Spatial(bbox = bbox, proj4string = PRJ4)
SP <- spsample(S, 10, type="random")
# Create a raster object covering the same area
R <- raster(extent(bbox), nrow=100, ncol=100, crs=PRJ4)
# Find the coordinates of the cell whose distance to the nearest point is greatest
D <- distanceFromPoints(object = R, xy = SP)
IDmaxD <- Which(D == maxValue(D), cells=TRUE)
(XY <- xyFromCell(D, IDmaxD))
# x y
# [1,] 0.005 0.795
# Plot the results
plot(D, main = "Distance map, with most distant cell in red")
points(SP)
points(XY, col="red", pch=16, cex=2)
