How can I download GADM data in R?

I am trying to download administrative boundary data from GADM using the raster package:
library(raster)
france <- getData('GADM', country = 'FRA', level = 1)
However, this command fails with the following error:
trying URL 'http://biogeo.ucdavis.edu/data/gadm2.8/rds/FRA_adm1.rds'
Error in utils::download.file(url = aurl, destfile = fn, method = "auto", :
cannot open URL 'http://biogeo.ucdavis.edu/data/gadm2.8/rds/FRA_adm1.rds'

First, download the country data you want from the GADM database and save it to your local directory. Be sure to choose the R (SpatialPolygonsDataFrame) format. There are six levels available for France (level 0 to level 5); choose the one you need.
Second, read the downloaded .rds file with the readRDS() function and transform it into a data.frame with ggplot2::fortify().
library(ggplot2)
library(sp)
# assuming you downloaded the file to '~/Downloads/FRA_adm1.rds':
path <- file.path(Sys.getenv("HOME"), "Downloads", "FRA_adm1.rds")
# FR map (Level 1) from GADM version 2.8
frRDS <- readRDS(path)
# convert to a data frame keyed by the level-1 region names (NAME_1)
frRDS_df <- ggplot2::fortify(frRDS, region = "NAME_1")
head(frRDS_df)
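A minimal sketch of plotting the fortified data frame (plain outlines only, no attribute data joined):
library(ggplot2)
# fortify() returns long/lat/group columns that geom_polygon() understands
ggplot(frRDS_df, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey90", colour = "grey40") +
  coord_quickmap()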

I am going to improve upon the previous answer to the OP's question.
To answer the OP's question directly: there is nothing wrong with the OP's code. The problem was most likely a temporary internet connection issue, because the code as posted works and retrieves the GADM data without trouble. Note that getData() fetches the gadm.org geodata, which is hosted at http://biogeo.ucdavis.edu/.
The raster package provides the getData() function, which is very useful for automatically retrieving geodata from the internet. The same function can also read geodata that is already stored locally on your PC.
In years past, the workflow was to download a file from the gadm.org website manually, move it out of the downloads folder into a project folder, and unzip it before the geodata could be used in R.
Using getData() makes life simpler because it retrieves the desired geodata directly and makes it immediately available in R.
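A hedged sketch of both uses, based on my understanding of getData()'s path and download arguments:
library(raster)
# download from GADM and cache the .rds file under 'path'
france <- getData('GADM', country = 'FRA', level = 1, path = '~/geodata')
# on later runs the same call should find the cached file and skip the download;
# download = FALSE (as I understand it) restricts getData() to the local copy only
france <- getData('GADM', country = 'FRA', level = 1, path = '~/geodata', download = FALSE)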
The gadm.org website clearly states:
"Downloading by country is the recommended approach"
Even though the large worldwide geodata file can be downloaded directly from the website, doing so is unnecessary and resource intensive. Unless you have a specific reason, there is no need to download and keep the entire worldwide GADM database on your PC.
One last thing about the getData() function: recent versions of raster emit a warning when it is used:
Warning message in getData("GADM", country = "USA", level = 1):
"getData will be removed in a future version of raster.
Please use the geodata package instead"
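If you want to get ahead of that removal, a rough equivalent using the geodata package might look like the sketch below. As far as I know, geodata::gadm() returns a terra SpatVector rather than a SpatialPolygonsDataFrame, so downstream code may need adjusting.
# install.packages("geodata")
library(geodata)
# GADM level-1 boundaries for France, cached under the given path
france <- gadm(country = "FRA", level = 1, path = tempdir())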

Related

Is there a way of reading shapefiles directly into R from an online source?

I am trying to find a way of loading shapefiles (.shp) from an online repository/folder/url directly into my global environment in R, for the purpose of making plots in ggplot2 using geom_sf. In the first instance I'm using my Google Drive to store these files but I'd ideally like to find a solution that works with any folder with a valid url and appropriate access rights.
So far I have tried a few options, the first two of which involve zipping the source folder on Google Drive where the shapefiles are stored and then downloading and unzipping it in some way. I have included reproducible examples using a small test shapefile:
Using utils::download.file() to retrieve the compressed folder, then unzipping it with either base::system('unzip ...') or zip::unzip() (loosely following this thread: Downloading County Shapefile from ONS):
# Create destination data folder (if there isn't one)
if(!dir.exists('data')) dir.create('data')
# Download the zipped file/folder
download.file("https://drive.google.com/file/d/1BYTCT_VL8EummlAsH1xWCd5rC4bZHDMh/view?usp=sharing", destfile = "data/test_shp.zip")
# Unzip folder using unzip (fails)
unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", junkpaths = TRUE)
# Unzip folder using system (also fails)
system("unzip data/test_shp.zip")
If you can't run the above code, FYI the two error messages are:
Warning message:
In unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", :
error 1 in extracting from zip file
AND
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of data/test_shp.zip or
data/test_shp.zip.zip, and cannot find data/test_shp.zip.ZIP, period.
Worth noting here that I can't even unzip this folder manually outside R, so I think something is going wrong at the download.file() step.
Using the googledrive package:
library(googledrive)
library(zip)
library(sf)
# Create destination data folder (if there isn't one)
if(!dir.exists('data')) dir.create('data')
# Specify the Google Drive url:
test_shp = drive_get(as_id("https://drive.google.com/file/d/1BYTCT_VL8EummlAsH1xWCd5rC4bZHDMh/view?usp=sharing"))
# Download the zipped folder
drive_download(test_shp, path = "data/test_shp.zip")
# Unzip folder
zip::unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", junkpaths = TRUE)
# Load test.shp
test_shp <- read_sf("data/test_shp/test.shp")
And that works!
...Except it's still a hacky workaround, which requires me to zip, download, unzip and then use a separate function (such as sf::read_sf or sf::st_read) to read the data into my global environment. And, since it uses the googledrive package, it will only work for files stored in that system (not OneDrive, Dropbox, or other urls).
I've also tried sf::read_sf, st_read and fastshp::read.shp directly on the folder url, but those approaches all fail, as one might expect.
So, my question: is there a workflow for reading shapefiles stored online directly into R, or should I stop looking? If there is not, but there is a way of extending my solution (2) above beyond Google Drive, I'd appreciate any tips on that too!
Note: I should also add that I have deliberately ignored any option requiring the rgdal package due to its imminent retirement, so I am looking for options that are at least somewhat future-proof (I understand all packages drop off the map at some point). Thanks in advance!
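As a general hedged sketch (not tied to any particular storage provider): if the URL serves the raw zip bytes rather than an HTML preview page (for Google Drive that would mean a direct-download link of the form https://drive.google.com/uc?export=download&id=<FILEID>, where <FILEID> is a placeholder for the file id), the archive can be pulled to a temporary file and read with sf:
library(sf)
# hypothetical direct-download URL; the .../view?usp=sharing form returns an HTML page instead
url <- "https://drive.google.com/uc?export=download&id=<FILEID>"
tmp_zip <- tempfile(fileext = ".zip")
download.file(url, tmp_zip, mode = "wb")  # mode = "wb" avoids corrupting the zip on Windows
tmp_dir <- tempfile()
unzip(tmp_zip, exdir = tmp_dir)
# read whichever .shp the archive contained
shp_path <- list.files(tmp_dir, pattern = "\\.shp$", full.names = TRUE)[1]
test_shp <- read_sf(shp_path)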
I ran into a similar problem recently, having to read shapefiles directly from Dropbox into R, so this solution only applies to the Dropbox case.
The first thing you will need to do is create a refreshable token for Dropbox using rdrop2, given recent changes from Dropbox that limit a single token to 4 hours of use. You can follow this SO post.
Once you have set up your refreshable token, identify all the files in your spatial data folder on Dropbox using:
library(rdrop2)
library(tidyverse)  # for %>%, filter() and str_detect()
shp_files_on_db <- drop_dir("Dropbox path/to your/spatial data/", dtoken = refreshable_token) %>%
  filter(str_detect(name, "adm2"))
My 'spatial data' folder contained two sets of shapefiles, adm1 and adm2. I used the above code to select only those associated with adm2.
Then create a vector of the names of the shapefile component files (.shp, .shx, .dbf, .prj, .cpg) in the 'spatial data' folder, as follows:
shp_filenames <- shp_files_on_db$name
I chose to read the shapefiles into a temporary directory, avoiding the need to store the files on my disk, which is also useful in a Shiny implementation. I create this temporary directory as follows:
# create a new directory under tempdir
dir.create(dir1 <- file.path(tempdir(), "testdir"))
#If needed later on, you can delete this temporary directory
unlink(dir1, recursive = T)
#And test that it no longer exists
dir.exists(dir1)
Now download the Dropbox files to this temporary directory:
for (i in seq_along(shp_filenames)) {
  drop_download(paste0("Dropbox path/to your/spatial data/", shp_filenames[i]),
                dtoken = refreshable_token,
                local_path = dir1)
}
And finally, read in your shapefile as follows:
library(sf)
# path to the shapefile in the temporary directory
path1_shp <- paste0(dir1, "/myfile_adm2.shp")
# read the shapefile using the sf package - a recommended replacement for rgdal
shp1a <- st_read(path1_shp)

Cannot access EIA API in R

I'm having trouble accessing the Energy Information Administration's API through R (https://www.eia.gov/opendata/).
On my office computer, if I try the link in a browser it works, and the data shows up (the full url: https://api.eia.gov/series/?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=json).
I am also successfully connected to Bloomberg's API through R, so R is able to access the network.
Since the API is working and not blocked by my company's firewall, and R is in fact able to connect to the Internet, I have no clue what's going wrong.
The script works fine on my home computer, but at my office computer it is unsuccessful. So I gather it is a network issue, but if somebody could point me in any direction as to what the problem might be I would be grateful (my IT department couldn't help).
library(XML)
api.key = "e122a1411ca0ac941eb192ede51feebe"
series.id = "PET.MCREXUS1.M"
my.url = paste("http://api.eia.gov/series?series_id=", series.id,"&api_key=", api.key, "&out=xml", sep="")
doc = xmlParse(file=my.url, isURL=TRUE) # yields error
Error msg:
No such file or directoryfailed to load external entity "http://api.eia.gov/series?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=json"
Error: 1: No such file or directory2: failed to load external entity "http://api.eia.gov/series?series_id=PET.MCREXUS1.M&api_key=e122a1411ca0ac941eb192ede51feebe&out=json"
I tried some other methods like read_xml() from the xml2 package, but this gives a "could not resolve host" error.
To get XML, request XML output in your url (&out=xml):
my.url = paste("http://api.eia.gov/series?series_id=", series.id, "&api_key=",
               api.key, "&out=xml", sep = "")
res <- httr::GET(my.url)
xml2::read_xml(res)
Or:
res <- httr::GET(my.url)
XML::xmlParse(res)
Otherwise, with the url as posted (i.e. &out=json):
res <- httr::GET(my.url)
jsonlite::fromJSON(httr::content(res,"text"))
or this:
xml2::read_xml(httr::content(res,"text"))
Please note that this answer simply provides a way to get the data, whether it is in the desired form is opinion based and up to whoever is processing the data.
If it does not have to be XML output, you can also use the new eia package. (Disclaimer: I'm the author.)
Using your example:
remotes::install_github("leonawicz/eia")
library(eia)
x <- eia_series("PET.MCREXUS1.M")
This assumes your key is set globally (e.g., in .Renviron or previously in your R session with eia_set_key). But you can also pass it directly to the function call above by adding key = "yourkeyhere".
The result returned is a tidyverse-style data frame with one row per series ID, including a data list column that contains the data frame for each time series (it can be unnested with tidyr::unnest if desired; see the sketch below).
Alternatively, if you set the argument tidy = FALSE, it will return the list result of jsonlite::fromJSON without the "tidy" processing.
Finally, if you set tidy = NA, no processing is done at all and you get the original JSON string output for those who intend to pass the raw output to other canned code or software. The package does not provide XML output, however.
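For example, a small sketch of the unnesting step mentioned above (assuming the list column is named data, as described):
library(eia)
library(tidyr)
x <- eia_series("PET.MCREXUS1.M")
unnest(x, cols = data)  # one row per observation in the time series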
There are more comprehensive examples and vignettes on the eia package website.

Using shapefile as an input to the user defined function using R

I have a script that creates randomly distributed square polygons in KML format. It takes a shapefile containing a single polygon as input and works absolutely fine on its own. The problem arises when I try to turn it into a user-defined function. I used the readShapePoly() function to read the shapefile, and it works well outside the function. But when the shapefile is passed to the function as an input, it simply won't take it. It shows this error message:
Error in getinfo.shape(filen) : Error opening SHP file
I am not writing out the file extensions, and I do have all the accompanying files needed for the shapefile.
Part of the script that reads the shapefile, using it as the input file:
library(maptools)
library(sp)
library(ggplot2)
Polytokml <- function(shapefile_name){
  ### Input Files
  file1 <- "shapefile_name"
  Shape_file <- readShapePoly(file1)  # requires maptools
  return(Shape_file)
}
The function is created, but it doesn't work when it is called:
>Polytokml(HKH.shp)
Error in getinfo.shape(filen) : Error opening SHP file
This works well outside the function:
###Input Files
file1 <- "shapefile.shp"
Shape_file <- readShapePoly(file1) #requires maptools
This is just an example from the whole script, which takes several different arguments as input. Just to keep things simple, I have included only the part that reads the shapefile, which is where the problem lies. Do let me know if anything is unclear.
Thank you so much in advance :)
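A hedged guess at the cause: file1 <- "shapefile_name" assigns the literal string rather than the value of the argument, and the call Polytokml(HKH.shp) passes an unquoted name. A minimal sketch of a corrected version (assuming HKH.shp and its companion files sit in the working directory):
library(maptools)
Polytokml <- function(shapefile_name) {
  # use the argument itself, not the string "shapefile_name"
  Shape_file <- readShapePoly(shapefile_name)  # requires maptools
  return(Shape_file)
}
# call with a quoted file path
hkh <- Polytokml("HKH.shp")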

Creating Shapefiles in R

I'm trying to create a shapefile in R that I will later import to either Fusion Table or some other GIS application.
To start, I imported a blank shapefile containing all the census tracts in Canada. I have attached other data (in tabular format) to the shapefile based on the unique ID of the CTs, and I have mapped my results. At the moment I only need the Vancouver CTs, and I would like to export a shapefile that contains only those CTs along with my newly attached attribute data.
Here is my code (some parts omitted due to privacy reasons):
shape <- readShapePoly('C:/TEST/blank_ct.shp')  # load blank shapefile
shape@data = data.frame(shape@data, data2[match(shape@data$CTUID, data2$CTUID),])  # data2 is my created attributes that I'm attaching to the blank file
shape1 <- shape[shape$CMAUID == 933,]  # selecting the Vancouver CTs
I've seen other examples using writePolyShape to create the shapefile. I tried it, and it worked to an extent: it created the .shp, .dbf, and .shx files, but I'm missing the .prj file and I'm not sure how to go about creating it. Are there better methods out there for creating shapefiles?
Any help on this matter would be greatly appreciated.
Use rgdal and writeOGR. rgdal will preserve the projection information.
Something like:
library(rgdal)
shape <- readOGR(dsn = 'C:/TEST', layer = 'blank_ct')
# do your processing
shape@data = data.frame(shape@data, data2[match(shape@data$CTUID, data2$CTUID),])  # data2 is my created attributes that I'm attaching to the blank file
shape1 <- shape[shape$CMAUID == 933,]
writeOGR(shape1, dsn = 'C:/TEST', layer = 'newstuff', driver = 'ESRI Shapefile')
Note that dsn is the folder containing the .shp file, and layer is the name of the shapefile without the .shp extension. readOGR and writeOGR will read and write all the component files (.dbf, .shp, .prj, etc.).
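If you would rather avoid rgdal entirely, a rough equivalent sketch with the sf package (same folder layout assumed; sf also writes the .prj alongside the other component files):
library(sf)
shape <- st_read('C:/TEST/blank_ct.shp')   # reads .shp/.shx/.dbf/.prj together
shape1 <- shape[shape$CMAUID == 933, ]     # attributes are ordinary columns of the sf object
st_write(shape1, 'C:/TEST/newstuff.shp', driver = 'ESRI Shapefile')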
Problem solved! Thank you again to those who helped!
Here is what I ended up doing:
As Mnel wrote, this line will create the shapefile.
writeOGR(shape1, dsn = 'C:/TEST', layer ='newstuff', driver = 'ESRI Shapefile')
However, when I ran this line, it came back with this error:
Can't convert columns of class: AsIs; column names: ct2,mprop,mlot,mliv
This is because my attribute data was not numeric but character. Luckily, my attribute data is all numbers, so I ran transform() to fix the problem.
shape2 <- shape1
shape2@data <- transform(shape1@data, ct2 = as.numeric(ct2),
                         mprop = as.numeric(mprop),
                         mlot = as.numeric(mlot),
                         mliv = as.numeric(mliv))
I tried the writeOGR() command again, but I still didn't get the .prj file I was looking for. The problem was that I hadn't specified the coordinate system for the shapefile when importing it. Since I already knew what the coordinate system was, all I had to do was define it on import:
readShapePoly('C:/TEST/blank_ct.shp', proj4string = CRS("+proj=longlat +datum=WGS84"))
After that, I re-ran all the things I wanted to do with the shapefile, and the writeOGR line for exporting. And that's it!

Dealing with large shapefiles on R

Good afternoon:
I have been studying R for a while; however, I am now working with large shapefiles, each bigger than 600 MB. I have a computer with 200 GB of free disk space and 12 GB of RAM, and I want to ask if somebody knows how to deal with files of this size. I really appreciate your kind help.
With the latest version of 64-bit R and the latest version of rgdal, just try reading it in:
library(rgdal)
shpdata <- readOGR("/path/to/shpfolder/", "shpfilename")
Where "shpfilename" is the filename without the extension.
If that fails, update your question with details of what you did and what you saw, the sizes of each of the "shpfilename.*" files, your R version, operating system, and rgdal version.
OK, so the question is more about a strategy for dealing with large files, not "how does one read a shapefile in R."
This post shows how one might use the divide-apply-recombine approach as a solution by subsetting shapefiles.
Working from the current answer, assume you have a SpatialPolygonsDataFrame called shpdata. shpdata will have a data attribute (accessed via @data) with some sort of identifier for each polygon (for Tiger shapefiles it's usually something like 'GEOID'). You can then loop over these identifiers in groups and subset/process/export shpdata for each small batch of polygons. I suggest either writing intermediate files as .csv or inserting them into a database like SQLite; see the filled-in sketch after the sample code below.
Some sample code
library(rgdal)
shpdata <- readOGR("/path/to/shpfolder/", "shpfilename")
# assuming the geo id var is 'geo_id'
lapply(unique(shpdata@data$geo_id), function(id_var){
  shp_sub = subset(shpdata, geo_id == id_var)
  ### do something to the shapefile subset here ###
  ### output results here ###
  ### clean up memory !!! ###
  rm(shp_sub)
  gc()
})
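As a hedged illustration of the "output results here" placeholder, one option (per the .csv suggestion above) is to write each batch's attribute table to an intermediate file; the output folder name here is made up:
library(rgdal)
shpdata <- readOGR("/path/to/shpfolder/", "shpfilename")
lapply(unique(shpdata@data$geo_id), function(id_var){
  shp_sub <- subset(shpdata, geo_id == id_var)
  # write this batch's attribute table to an intermediate .csv
  write.csv(shp_sub@data,
            file.path("/path/to/output", paste0(id_var, ".csv")),
            row.names = FALSE)
  rm(shp_sub)
  gc()
})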
