Geopandas to_file gives blank prj file - projection

I am trying to use GeoPandas for a (only slightly) more complex project, but at the moment I'm failing to write out a simple shapefile containing a single point with its projection defined.
The following code results in a shapefile that looks generally good - but the .prj is empty:
import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point
df=pd.read_csv("richmond.csv")
geometry = [Point(xy) for xy in zip(df.x, df.y)]
crs = {'init': 'epsg:4326'}
geo_df = GeoDataFrame(df, crs=crs, geometry=geometry)
geo_df.to_file("geopan.shp")
The csv has two rows and two columns: a header row, then lon and lat in the second row.
Am I missing something obvious? I've hunted through stackoverflow, the geopandas docs, etc. All seem to imply to_file() should work just fine.
In the long run, the goal is to create a few functions for my students to use in a lab - one that draws a line along a lat or lon the width / height of the US, another that clips the line to polygons (the states), so that the students can figure out the widest spot in each state as a gentle introduction to working with spatial data. I'm trying to avoid arcpy as it's Python 2, and I thought (and think) I was doing the right thing by teaching them the ways of Python 3. I'd like them to be able to debug their methodologies by being able to open the line in Arc though, hence this test.
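For what it's worth, here is a minimal sketch of that long-run idea using shapely and geopandas, assuming a hypothetical states.shp with a NAME column (illustrative only, not the actual lab code):
import geopandas as gpd
from shapely.geometry import LineString

# hypothetical state polygons; any polygon layer in EPSG:4326 would do
states = gpd.read_file("states.shp").to_crs(epsg=4326)
lat = 40.0
line = LineString([(-125.0, lat), (-66.0, lat)])  # a line of constant latitude across the conterminous US

colorado = states.loc[states["NAME"] == "Colorado", "geometry"].iloc[0]
clipped = line.intersection(colorado)   # the portion(s) of the line inside the state
print(clipped.length)                   # length in degrees; reproject for metres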

So, after playing with this, I've determined that under the current version of Anaconda the problem is with crs = {'init': 'epsg:4326'} on Windows machines. It works fine on Macs, but has not worked on any of my or my students' Windows systems. Changing this line to use the proj4 string crs = {'proj': 'latlong', 'ellps': 'WGS84', 'datum': 'WGS84', 'no_defs': True} instead works just fine. It's more of a workaround than an actual solution, but it seems to work consistently.
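For anyone copying it directly, the workaround amounts to swapping the crs line in the snippet above; a sketch, assuming the same richmond.csv:
import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point

df = pd.read_csv("richmond.csv")
geometry = [Point(xy) for xy in zip(df.x, df.y)]
# proj4-style dict instead of {'init': 'epsg:4326'}
crs = {'proj': 'latlong', 'ellps': 'WGS84', 'datum': 'WGS84', 'no_defs': True}
geo_df = GeoDataFrame(df, crs=crs, geometry=geometry)
geo_df.to_file("geopan.shp")  # the .prj should now be written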

I always use the from_epsg function from the fiona library.
>>> from fiona.crs import from_epsg
>>> from_epsg(4326)
{'init': 'epsg:4326', 'no_defs': True}
I've never had any problems using it. Keep in mind that some local projections are missing, but that shouldn't be a problem in your case.
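Applied to the snippet from the question, that would look something like this (a sketch, assuming the same CSV and columns):
from fiona.crs import from_epsg
import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point

df = pd.read_csv("richmond.csv")
geometry = [Point(xy) for xy in zip(df.x, df.y)]
geo_df = GeoDataFrame(df, crs=from_epsg(4326), geometry=geometry)
geo_df.to_file("geopan.shp")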

Another user and I had a similar issue using fiona, and the issue for me was the GDAL_DATA environment variable not being set correctly. To reiterate my answer there: for reference, I'm using Anaconda, the Spyder IDE, Fiona 1.8.4, Python 3.6.8, and GDAL 2.3.3.
While Anaconda usually sets the GDAL_DATA variable upon entering the virtual environment, using another IDE like Spyder will not preserve it, which causes issues where fiona (and, I assume, GeoPandas) can't export the CRS correctly.
You can test this fix by printing out an EPSG-to-WKT transformation before and after setting the GDAL_DATA variable explicitly.
Without setting GDAL_DATA:
import os
print('GDAL_DATA' in os.environ)
from osgeo import osr
srs = osr.SpatialReference() # Declare a new SpatialReference
srs.ImportFromEPSG(3413) # Import the EPSG code into the new object srs
print(srs.ExportToWkt()) # Print the WKT (prints nothing because GDAL_DATA is not set)
Results in:
False
With setting GDAL_DATA:
import os
os.environ['GDAL_DATA'] = 'D:\\ProgramData\\Anaconda3\\envs\\cfm\\Library\\share\\gdal'
print('GDAL_DATA' in os.environ)
from osgeo import osr
srs = osr.SpatialReference() # Declare a new SpatialReference
srs.ImportFromEPSG(3413) # Import the EPSG code into the new object srs
print(srs.ExportToWkt()) # Print the WKT (now prints the full projection definition)
Results in:
True
PROJCS["WGS 84 / NSIDC Sea Ice Polar Stereographic North",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Polar_Stereographic"],PARAMETER["latitude_of_origin",70],PARAMETER["central_meridian",-45],PARAMETER["scale_factor",1],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["X",EAST],AXIS["Y",NORTH],AUTHORITY["EPSG","3413"]]

how to handle every (polygon) item in a shapefile as single geometry?

Import of packages:
import rasterio
from rasterio.mask import mask
import geopandas as gpd
opened a shapefile:
gdf = gpd.read_file(shpfilepath+clipshape)
and opened a raster file:
img = rasterio.open(f'{rstfilepath}raw_immutable/SuperView/{SV_filename}{ext}')
then perform action:
for poly_gon in gdf.geometry:
    out_image, out_transform = mask(img, poly_gon, crop=True)
but this fails:
TypeError: 'Polygon' object is not iterable
I cannot find how to handle every polygon in the shapefile (5 in my case) to be the polygon to clip the raster image.
Update
How about nesting your results? First create an empty object, such as an empty dict, then fill it:
empt_dict1 = dict()
for i in range(len(gdf.geometry)):
    empt_dict1[i] = dict()
    # wrap the geometry in a list: mask() expects an iterable of geometries
    empt_dict1[i][0], empt_dict1[i][1] = mask(img, [gdf.geometry[i]], crop=True)
Your expected clips are in each sub-object of the empt_dict1 dictionary.
I don't have a working gdf right now, so I'm not sure if you can index it that way or if you should use something like .loc.
Old answer
If I understand correctly, you seek to use the whole area of all the polygons at the same time. How about merging them into a single one using a temporary layer, like below? PS: I tried to use your names, given that you don't provide any data.
gdf["dummy"]=[0 for i in range(5)]
gdf_tempo = gdf.dissolve(by=dummy)
out_image, out_transform = mask(img, gdf_tempo , crop=True)
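If the goal really is one clip per polygon rather than a single combined clip, a sketch along these lines should also work (the output filenames are hypothetical); wrapping each geometry in a list gives mask() the iterable it expects and avoids the TypeError from the question:
import rasterio
from rasterio.mask import mask

for i, poly in enumerate(gdf.geometry):
    out_image, out_transform = mask(img, [poly], crop=True)  # list, so mask() gets an iterable
    profile = img.profile.copy()
    profile.update(height=out_image.shape[1], width=out_image.shape[2],
                   transform=out_transform)
    with rasterio.open(f"clip_{i}.tif", "w", **profile) as dst:  # hypothetical output name
        dst.write(out_image)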

Read in coordinates of SVG path / polygon

I'm trying to import the coordinates of a path in an SVG file created with illustrator into R.
I thought I might read the SVG into R with grImport2, which in theory imports SVG files, but I think it might only handle SVG files generated by the Cairo device.
Let's say I want to import the following .svg file:
Here is my attempt at loading an SVG file. If I read the content correctly, it should just contain 1 (complicated) path. The warning is the same one I get when loading my SVG file created by Adobe Illustrator. Be warned that the code below will get stuck for some time!
file <- "https://upload.wikimedia.org/wikipedia/commons/d/db/Brain_Drawing.svg"
download.file(file, tmp <- tempfile(fileext = ".svg"))
# Don't run the following line, it will get your R session stuck!
x <- grImport2::readPicture(tmp)
#> Warning message:
#> In checkValidSVG(doc, warn = warn) :
#> This picture was not generated by Cairo graphics; errors may result
unlink(tmp)
My ideal output would be a data.frame with at least x and y coordinates of (anchor)points and perhaps some metadata that can tell different paths apart. I don't need curves and arcs interpolated or anything like that.
Are there any other packages I'm not aware of that might import this? Is there a way to convert the SVG to one that I can read into R?
Nice question Teunbrand - thanks for posting. The issue does seem to be getting grImport2 to read non-Cairo SVG. So actually, we just need another translation step: use the rsvg package to read in a non-Cairo SVG and write the same thing with Cairo. Oddly enough, it has a function to do exactly this, called rsvg_svg. So we can read the remote file as a Picture object in a single line without even creating a local tmp file:
file <- "https://upload.wikimedia.org/wikipedia/commons/d/db/Brain_Drawing.svg"
svg <- grImport2::readPicture(rawToChar(rsvg::rsvg_svg(file)))
Unfortunately, a Picture is a deeply nested S4 object which is harder to navigate than a grob tree. I'm sure a person of your caliber could coerce it into a data frame, but it's not trivial, and I won't attempt it here. At least the components themselves look easy enough to harvest:
svg@content[[1]]@content[[1]]@d@segments[[3]]
#> An object of class "PathCurveTo"
#> Slot "x":
#> [1] 791.8359 789.5286 787.1382 784.6293 782.0041 779.2962 776.5600 773.8555 771.2318
#> [10] 768.712 766.2875 764.8359
#>
#> Slot "y":
#> [1] 8.191406 8.209760 8.239691 8.282891 8.340521 8.412835 8.499026 8.597442 8.706126
#> [10] 8.82357 8.949580 9.031250
Anyway, there are a few nice utility functions that allow you to do cool stuff like this:
brainGrob <- grImport2::pictureGrob(svg)
ggplot2::ggplot() + ggplot2::geom_point(ggplot2::aes(1, 1))
grid::grid.draw(brainGrob)
I haven't seen an arbitrary SVG drawn in grid before, so I was pleased to find this, prompted by your question. Thanks again.

Can I import PostGIS raster data type into R by using the RPostgreSQL-package?

I have a PostgreSQL / PostGIS table with 30 rows (only 3 are shown) and 3 columns as follows (raster is a PostGIS data type) - it's the EFSA CAPRI data set, by the way, if somebody's familiar with it.
Can I import the raster data type from PostGIS into R with the help of the RPostgreSQL package (see the code below), or do I necessarily have to use the rgdal package as described by @Jot eN?
require(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = "")
dbGetQuery(con, "SELECT rid, rast, filename FROM schema.capri")
Importing it without any transformation doesn't work, and neither does ST_AsText(rast) (which works for the PostGIS geometry data type).
If this is still relevant: at the University of Florida, David Bucklin and I have released the rpostgis package, which provides bi-directional transfer between PostGIS and R for vector and raster data. The package does not rely on GDAL (or rgdal) and should be platform independent.
Assuming that you already have a functional connection con established through RPostgreSQL, you can import PostGIS raster data type into R using the function pgGetRast, for instance:
library(rpostgis)
my_raster <- pgGetRast(con, c("schema", "raster_table"))
The function assumes that the raster tiles are stored in the column "rast" by default (as is the case for you), but you can change that with the argument rast. Now, depending on the size and other considerations, this may be significantly slower (but a lot more flexible) than using rgdal. We are still working on it, but this is the cost of providing a "pure R" solution. You can also use the boundary argument if you are only interested in a subset of the entire raster (which will significantly increase the loading time).
Note also that there is pgGetGeom for points/lines/polygons, instead of using St_AsText.
There is an answer on the gis.stackexchange page - https://gis.stackexchange.com/a/118401/20955:
library('raster')
library('rgdal')
dsn="PG:dbname='plots' host=localhost user='test' password='test' port=5432 schema='gisdata' table='map' mode=2"
ras <- readGDAL(dsn) # Get your file as SpatialGridDataFrame
ras2 <- raster(ras,1) # Convert the first Band to Raster
plot(ras2)
Additional info could be found here https://rpubs.com/dgolicher/6373

Creating Shapefiles in R

I'm trying to create a shapefile in R that I will later import to either Fusion Table or some other GIS application.
To start, I imported a blank shapefile containing all the census tracts in Canada. I have attached other data (in tabular format) to the shapefile based on the unique ID of the CTs, and I have mapped my results. At the moment, I only need the ones in Vancouver, and I would like to export a shapefile that contains only the Vancouver CTs as well as my newly attached attribute data.
Here is my code (some parts omitted due to privacy reasons):
library(maptools) # readShapePoly() comes from maptools
shape <- readShapePoly('C:/TEST/blank_ct.shp') #Load blank shapefile
shape@data = data.frame(shape@data, data2[match(shape@data$CTUID, data2$CTUID),]) #data2 is my created attributes that I'm attaching to blank file
shape1 <-shape[shape$CMAUID == 933,] #selecting the Vancouver CTs
I've seen other examples using writePolyShape to create the shapefile. I tried it, and it worked to an extent: it created the .shp, .dbf, and .shx files. I'm missing the .prj file and I'm not sure how to go about creating it. Are there better methods out there for creating shapefiles?
Any help on this matter would be greatly appreciated.
Use rgdal and writeOGR. rgdal will preserve the projection information.
Something like:
library(rgdal)
shape <- readOGR(dsn = 'C:/TEST', layer = 'blank_ct')
# do your processing
shape@data = data.frame(shape@data, data2[match(shape@data$CTUID, data2$CTUID),]) #data2 is my created attributes that I'm attaching to blank file
shape1 <-shape[shape$CMAUID == 933,]
writeOGR(shape1, dsn = 'C:/TEST', layer ='newstuff', driver = 'ESRI Shapefile')
Note that the dsn is the folder containing the .shp file, and the layer is the name of the shapefile without the .shp extension. It will read (readOGR) and write (writeOGR) all the component files (.dbf, .shp, .prj etc)
Problem solved! Thank you again to those who helped!
Here is what I ended up doing:
As Mnel wrote, this line will create the shapefile.
writeOGR(shape1, dsn = 'C:/TEST', layer ='newstuff', driver = 'ESRI Shapefile')
However, when I ran this line, it came back with this error:
Can't convert columns of class: AsIs; column names: ct2,mprop,mlot,mliv
This is because my attribute data was not numeric but was stored as characters. Luckily, my attribute data is all numbers, so I ran transform() to fix this problem.
shape2 <-shape1
shape2@data <- transform(shape1@data, ct2 = as.numeric(ct2),
mprop = as.numeric(mprop),
mlot = as.numeric(mlot),
mliv = as.numeric(mliv))
I tried the writeOGR() command again, but I still didn't get the .prj file that I was looking for. The problem was that I hadn't specified the coordinate system for the shapefile when I was importing it. Since I already know what the coordinate system is, all I had to do was define it when importing:
shape <- readShapePoly('C:/TEST/blank_ct.shp', proj4string = CRS("+proj=longlat +datum=WGS84"))
After that, I re-ran all the things I wanted to do with the shapefile, and the writeOGR line for exporting. And that's it!

Animated graphs in ipython notebook

Is there a way of creating animated graphs? For example, showing the same graph with different parameters.
For example, in the SAGE notebook one can write:
a = animate([circle((i,i), 1-1/(i+1), hue=i/10) for i in srange(0,2,0.2)],
xmin=0,ymin=0,xmax=2,ymax=2,figsize=[2,2])
a.show()
This has horrible flickering, but at least it creates a plot that animates for me. It is based on Aron's answer, but Aron's does not work as-is.
import time
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, clear_output

f, ax = plt.subplots()
n = 30
x = np.array([i/10.0 for i in range(n)])
y = np.array([np.sin(i) for i in x])
for i in range(5, n):
    ax.plot(x[:i], y[:i])
    time.sleep(0.1)
    clear_output()
    display(f)
    ax.cla()  # turn this off if you'd like to "build up" plots
plt.close()
Update: January 2014
Jake Vanderplas has created a Javascript-based package for matplotlib animations available here. Using it is as simple as:
# https://github.com/jakevdp/JSAnimation
from JSAnimation import examples
examples.basic_animation()
See his blog post for a more complete description and examples.
Historical answer (see goger for a correction)
Yes, the Javascript update does not correctly hold the image frame yet, so there is flicker, but you can do something quite simple using this technique:
import time
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, clear_output

x = np.linspace(0, 2*np.pi, 100)  # x was not defined in the original snippet
f, ax = plt.subplots()
for i in range(10):
    y = i/10 * np.sin(x)
    ax.plot(x, y)
    time.sleep(0.5)
    clear_output()
    display(f)
    ax.cla()  # turn this off if you'd like to "build up" plots
plt.close()
IPython widgets let you manipulate Python objects in the kernel with GUI objects in the Notebook. You might also like Sage hosted IPython Notebooks. One problem you might have with sharing widgets or interactivity in Notebooks is that if someone else doesn't have IPython, they can't run your work. To solve that, you can use Domino to share Notebooks with widgets that others can run.
Below are three examples of widgets you can build in a Notebook using pandas to filter data, fractals, and a slider for a 3D plot. Learn more and see the code and Notebooks here.
If you want to live-stream data or set up a simulation to run as a loop, you can also stream data into plots in a Notebook. Disclaimer: I work for Plotly.
If you use IPython notebook, v2.0 and above support interactive widgets. You can find a good example notebook here (n.b. you need to download and run from your own machine to see the sliders).
It essentially boils down to importing interact and then passing it a function, along with ranges for the parameters. For example, based on the second link:
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact

x = np.linspace(0, 1, 200)

def pltsin(f, a):
    plt.plot(x, a*np.sin(2*np.pi*x*f))
    plt.ylim(-10, 10)

interact(pltsin, f=(1, 10, 0.1), a=(1, 10, 1));
If you want 3D scatter plot animations, the Ipyvolume Jupyter widget is very impressive.
http://ipyvolume.readthedocs.io/en/latest/animation.html#
bqplot is a really good option to do this now. It's built specifically for animation through Python in the notebook:
https://github.com/bloomberg/bqplot
On @goger's comment about 'horrible flickering': I found that calling clear_output(wait=True) solved my problem. The flag tells clear_output to wait to render until it has something new to render.
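In other words, the earlier loop becomes something like the following sketch (same structure, just with the wait flag):
import time
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, clear_output

x = np.linspace(0, 2*np.pi, 100)
f, ax = plt.subplots()
for i in range(10):
    ax.plot(x, i/10 * np.sin(x))
    clear_output(wait=True)  # wait for the next display() before clearing, which removes the flicker
    display(f)
    time.sleep(0.5)
    ax.cla()
plt.close()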
matplotlib has an animation module to do just that. However, examples provided on the site will not run as is in a notebook; you need to make a few tweaks to make it work.
Here is the example from that page, modified to work in a notebook (the modifications are the rc('animation', ...) line, displaying ani as the last expression, and commenting out plt.show()).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from matplotlib import rc
from IPython.display import HTML

fig, ax = plt.subplots()
xdata, ydata = [], []
ln, = plt.plot([], [], 'ro', animated=True)

def init():
    ax.set_xlim(0, 2*np.pi)
    ax.set_ylim(-1, 1)
    return ln,

def update(frame):
    xdata.append(frame)
    ydata.append(np.sin(frame))
    ln.set_data(xdata, ydata)
    return ln,

ani = FuncAnimation(fig, update, frames=np.linspace(0, 2*np.pi, 128),
                    init_func=init, blit=True)
rc('animation', html='html5')
ani
# plt.show()  # not needed anymore
Note that the animation in the notebook is made via a movie and that you need to have ffmpeg installed and matplotlib configured to use it.
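If installing ffmpeg is a problem, newer matplotlib versions (2.1+) can render the animation as JavaScript instead of HTML5 video; something like this should work in place of the rc line above:
from matplotlib import rc
rc('animation', html='jshtml')  # JavaScript-based animation, no ffmpeg required
ani                             # or: from IPython.display import HTML; HTML(ani.to_jshtml())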
