I downloaded a shapefile of the world map from GADM; however, some names get garbled when the shapefile is imported. For example, "Aland" shows up as "Ã…land" in the shapefile. There are a handful of countries whose names I want to change.
It is the world map shapefile, the one described as "You can also download this version as six separate layers (one for each level of subdivision/aggregation), as a geopackage database or as shapefiles": https://gadm.org/download_world.html
I imported the shapefile using:
worldmap <- readOGR("file/gadm36_0.shp")
I tried using the following code:
levels(worldmap$NAME_0)[5] <- "Aland"
However I got this message:
Error in `levels<-`(`*tmp*`, value = c(NA, NA, NA, NA, "Aland")) :
factor level [2] is duplicated
Could you suggest how this code can be improved, or an alternative? Thanks in advance.
Since you did not provide a shapefile, I just worked with a publicly available shapefile of Indian states. The long and short of it is to use the sf package. It loads shapefiles as (quasi) data frames, with the longitudes and latitudes stored in the geometry column. From there, you should be in familiar territory. Here is some code to change a state name:
# clear environment
rm(list = ls(all = TRUE))
# let's take admin 1 (states)
# note: already in WGS84 format
library(sf)
india_shape <- st_read("india_shape/gadm36_IND_1.shp", stringsAsFactors = FALSE)
# Let's pick something to change (state name)
india_shape$NAME_1[1]
#> [1] "Andaman and Nicobar"
# Now change it
india_shape$NAME_1[1] <- "New State Name"
india_shape$NAME_1[1]
#> [1] "New State Name"
I can tell you a few things about how I manage the shapefiles downloaded from the www.gadm.org site.
First, a shapefile has several related files that do not have the .shp extension. These files must stay together in the same folder. All of them are included in the zip file from the GADM website.
The rgdal package provides the readOGR() function. It is normally called in the form: readOGR(dsn = " ", layer = " ")
The dsn is the "data source name", i.e. the folder containing the shapefile. Use quotes.
The layer is the name of the shapefile without the .shp extension. Use quotes.
Proper file management is required to make this work. I already have a USA folder within my datasets folder.
I just downloaded the GADM USA shapefile, so first I add a new folder named USA_map inside the USA folder, and then create a folder named data inside this new USA_map folder.
C:/python/datasets/usa/usa_map/data # usa_map/data are new
Copy the downloaded gadm36_USA_shp.zip from the "download" folder and paste it into the new "USA_map" folder. Then open the GADM zip file and extract its entire contents into the new "data" folder. The zip file can now be deleted, because all of its files have been copied into the "data" folder. All done and ready.
Now use the readOGR() function to read the shapefile and assign it to a new variable called usmap:
usmap <- readOGR(dsn = "c:/python/datasets/USA/USA_map/data", layer = "gadm36_USA_1")
The trick is to follow the correct file management so the readOGR() function works as designed.
Next, you need to learn how to navigate through this type of data.
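For example, a few commands that help you get oriented (a minimal sketch, assuming usmap was loaded as above):
class(usmap)      # should print "SpatialPolygonsDataFrame"
names(usmap)      # attribute column names, e.g. NAME_1
head(usmap@data)  # first rows of the attribute table
length(usmap)     # number of features (polygons)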
If there is more than one polygon with the same wrong name, you can do it like this:
w <- length(worldmap)
for (i in 1:w) {
  if (worldmap$NAME_0[i] == "Ã…land") {
    worldmap$NAME_0[i] <- "Aland"
  }
}
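As a side note, if NAME_0 was read as a plain character column (stringsAsFactors = FALSE), the loop can be replaced by a single vectorized assignment; if it is a factor, rename the offending level instead. A minimal sketch:
# character column:
worldmap$NAME_0[worldmap$NAME_0 == "Ã…land"] <- "Aland"
# factor column:
levels(worldmap$NAME_0)[levels(worldmap$NAME_0) == "Ã…land"] <- "Aland"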
Related
I am trying to find a way of loading shapefiles (.shp) from an online repository/folder/url directly into my global environment in R, for the purpose of making plots in ggplot2 using geom_sf. In the first instance I'm using my Google Drive to store these files but I'd ideally like to find a solution that works with any folder with a valid url and appropriate access rights.
So far I have tried a few options, the first two involving zipping the source folder on Google Drive where the shapefiles are stored and then downloading and unzipping it in some way. I have included reproducible examples using a small test shapefile:
Using utils::download.file() to retrieve the compressed folder and unzipping using either base::system('unzip..') or zip::unzip() (loosely following this thread: Downloading County Shapefile from ONS):
# Create destination data folder (if there isn't one)
if(!dir.exists('data')) dir.create('data')
# Download the zipped file/folder
download.file("https://drive.google.com/file/d/1BYTCT_VL8EummlAsH1xWCd5rC4bZHDMh/view?usp=sharing", destfile = "data/test_shp.zip")
# Unzip folder using unzip (fails)
unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", junkpaths = TRUE)
# Unzip folder using system (also fails)
system("unzip data/test_shp.zip")
If you can't run the above code, then FYI the two error messages are:
Warning message:
In unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", :
error 1 in extracting from zip file
AND
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of data/test_shp.zip or
data/test_shp.zip.zip, and cannot find data/test_shp.zip.ZIP, period.
It's worth noting that I can't even unzip this file manually outside R, so I think something is going wrong at the download.file() step.
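I suspect this is because the sharing URL returns Google Drive's HTML preview page rather than the zip itself. A direct-download style URL (sketched below; the URL scheme is an assumption on my part, and the file id is taken from the sharing link above) might be what download.file() actually needs:
# build a direct-download URL from the file id in the sharing link (assumed URL scheme)
file_id <- "1BYTCT_VL8EummlAsH1xWCd5rC4bZHDMh"
direct_url <- paste0("https://drive.google.com/uc?export=download&id=", file_id)
download.file(direct_url, destfile = "data/test_shp.zip", mode = "wb")
unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", junkpaths = TRUE)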
Using the googledrive package:
# Load the packages used below (googledrive for the download, sf for reading the shapefile)
library(googledrive)
library(sf)
# Create destination data folder (if there isn't one)
if(!dir.exists('data')) dir.create('data')
# Specify googledrive url:
test_shp = drive_get(as_id("https://drive.google.com/file/d/1BYTCT_VL8EummlAsH1xWCd5rC4bZHDMh/view?usp=sharing"))
# Download zipped folder
drive_download(test_shp, path = "data/test_shp.zip")
# Unzip folder
zip::unzip(zipfile = "data/test_shp.zip", exdir = "data/test_shp", junkpaths = TRUE)
# Load test.shp
test_shp <- read_sf("data/test_shp/test.shp")
And that works!
...Except it's still a hacky workaround, which requires me to zip, download, unzip and then use a separate function (such as sf::read_sf or st_read) to read the data into my global environment. And, as it uses the googledrive package, it will only work for files stored in that system (not OneDrive, Dropbox and other URLs).
I've also tried sf::read_sf, st_read and fastshp::read.shp directly on the folder url but those approaches all fail as one might expect.
So, my question: is there a workflow for reading shapefiles stored online directly into R or should I stop looking? If there is not, but there is a way of expanding my above solution (2) beyond googledrive, I'd appreciate any tips on that too!
Note: I should also add that I have deliberately ignored any option requiring the package rgdal due to its imminent permanent retirement, and so am looking for options that are at least somewhat future-proof (I understand all packages drop off the map at some point). Thanks in advance!
I ran into a similar problem recently, having to read in shapefiles directly from Dropbox into R.
As a result, this solution only applies for the case of Dropbox.
The first thing you will need to do is create a refreshable token for Dropbox using rdrop2, given recent changes from Dropbox that limit single token use to 4 hours. You can follow this SO post.
Once you have set up your refreshable token, identify all the files in your spatial data folder on Dropbox using:
library(rdrop2)   # Dropbox API
library(dplyr)    # filter()
library(stringr)  # str_detect()
shp_files_on_db <- drop_dir("Dropbox path/to your/spatial data/", dtoken = refreshable_token) %>%
  filter(str_detect(name, "adm2"))
My 'spatial data' folder contained two sets of shapefiles – adm1 and adm2. I used the above code to choose only those associated with adm2.
Then create a vector of the names of the .shp, .csv, .shx, .dbf and .cpg files in the 'spatial data' folder, as follows:
shp_filenames <- shp_files_on_db$name
I chose to read the shapefiles into a temporary directory, avoiding the need to store the files on my disk – also useful in a Shiny implementation. I create this temporary directory as follows:
# create a new directory under tempdir
dir.create(dir1 <- file.path(tempdir(), "testdir"))
# If needed later on (once you are finished), you can delete this temporary directory:
# unlink(dir1, recursive = TRUE)
# and test that it no longer exists:
# dir.exists(dir1)
Now download the Dropbox files to this temporary directory:
for (i in 1:length(shp_filenames)) {
  drop_download(paste0("Dropbox path/to your/spatial data/", shp_filenames[i]),
                dtoken = refreshable_token,
                local_path = dir1)
}
And finally, read in your shapefile as follows:
# path to the shapefile in the temporary directory
path1_shp <- paste0(dir1, "/myfile_adm2.shp")
# read in the shapefile using the sf package - a recommended replacement for rgdal
library(sf)
shp1a <- st_read(path1_shp)
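As a quick sanity check (a minimal sketch, assuming the read above succeeded), you can plot just the geometries:
plot(st_geometry(shp1a))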
Thanks for reading my post. I am trying to create one map in Plotly using R by layering data from two data sources with the plot_mapbox() function. The map will show the locations of stores in zoned business districts.
test is a geoJSON file of zoning districts
test2 is a csv file of business locations using longitude and latitude coordinates
I've tried layering the data and combining two geoJSON files. The first file is a geoJSON file (business zones) and the second is a .csv (store locations) with longitude and latitude. I converted the csv file to a geoJSON file and then tried to merge them. I would really need to append them since they don't have a common key.
library(plotly)
library(geojsonR)
library(sf)
test <- st_read("D:/SPB/Zoning_Detailed.geojson", quiet=FALSE, geometry_column="SHAPE_Area")
test2 <- read.csv("D:/SPB/Pet_Bus.csv")
One layering example:
plot_mapbox(data=test, color=~ZONING) %>%
  add_markers(data=test2, x=~Longitude, y=~Latitude) %>%
  layout(mapbox=list(style = "streets"))
One merge example (only the first file ends up in the merge):
The files Zoning_Detailed.geojson and Pet_Bus.geojson are in the Merge folder. I converted Pet_Bus.csv to a geoJSON file.
This should really be an append, since test and test2 are independent of each other but are in the same city.
merge_files("D:/SPB/Merge/", "D:/SPB/Merge/test7.geojson")
library(raster)
france<-getData('GADM', country='FRA', level=1)
However, the command gives me this error:
trying URL 'http://biogeo.ucdavis.edu/data/gadm2.8/rds/FRA_adm1.rds'
Error in utils::download.file(url = aurl, destfile = fn, method = "auto", :
cannot open URL 'http://biogeo.ucdavis.edu/data/gadm2.8/rds/FRA_adm1.rds'
First, download the country data you want from the GADM database and save it to your local directory. Be sure to choose the R (SpatialPolygonsDataFrame) format. Several levels are available for France (from level 0 to level 5); choose the one you need.
Second, read the .rds file downloaded from GADM with the readRDS() function and transform it into a data.frame with ggplot2::fortify().
library(ggplot2)
library(sp)
# assuming you downloaded it to a path such as '~/Downloads/FRA_adm1.rds':
path <- file.path(Sys.getenv("HOME"), "Downloads", "FRA_adm1.rds")
# FR map (Level 1) from GADM version 2.8
frRDS <- readRDS(path)
# Region names 1 in data frame
frRDS_df <- ggplot2::fortify(frRDS, region = "NAME_1")
head(frRDS_df)
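From there, the regions can be drawn with ggplot2 (a minimal sketch using the fortified data frame from above):
library(ggplot2)
# each region is a group of long/lat vertices produced by fortify()
ggplot(frRDS_df, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey90", colour = "grey40") +
  coord_quickmap()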
I am going to improve upon the previous answer to the OP's question.
To answer the OP's question directly: there is nothing wrong with the OP's code. The problem was likely a temporary internet connection issue, because the OP's code works and retrieves the gadm.org data without error. Note that the getData() function retrieves the gadm.org geodata hosted on the http://biogeo.ucdavis.edu/ website.
The raster package provides the getData() function, which is very useful for automatically retrieving geodata from the internet. The function can also be used to retrieve geodata kept locally on the PC.
In years past, the way to use geodata was to first download a file from the gadm.org website, move it out of the download folder and save it in a folder on the PC. These files then needed to be unpacked/unzipped before the geodata could be used in R.
Using getData() makes life simpler because it retrieves the desired geodata directly and makes it immediately available in R.
The gadm.org website clearly states:
"Downloading by country is the recommended approach"
Downloading the large world geodata file directly from the website can be done, but it is unnecessary and resource intensive. Unless there is a specific reason for doing so, there is no need to download and keep the entire worldwide geodata database on the PC.
One last thing about the getData() function: it now generates a warning when used. The warning reads:
Warning message in getData("GADM", country = "USA", level = 1):
"getData will be removed in a future version of raster.
Please use the geodata package instead"
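If you want to follow that advice, the roughly equivalent call in the geodata package looks like this (a minimal sketch, assuming the geodata package is installed; it returns a terra SpatVector rather than a SpatialPolygonsDataFrame):
library(geodata)
# download the GADM level-1 boundaries for the USA into a temporary folder
usa <- gadm(country = "USA", level = 1, path = tempdir())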
wmap <- readOGR(dsn="~/R/funwithR/data/ne_110m_land", layer="ne_110m_land")
This code is not loading the shapefile, and the following error is generated:
Error in ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv, :
Cannot open file
I am sure that the directory is the correct one. There is no trailing / at the end, and the layer name is also correct.
Inside the ne_110m_land directory, the files I have are:
ne_110m_land.dbf
ne_110m_land.prj
ne_110m_land.shp
ne_110m_land.shx
ne_110m_land.VERSION.txt
ne_110m_land.README.html
You could have shown that you have the right path with:
list.files('~/R/funwithR/data/ne_110m_land', pattern='\\.shp$')
file.exists('~/R/funwithR/data/ne_110m_land/ne_110m_land.shp')
perhaps try:
readOGR(dsn=path.expand("~/R/funwithR/data/ne_110m_land"), layer="ne_110m_land")
or a simpler alternative that is wrapped around that:
library(raster)
s <- shapefile("~/R/funwithR/data/ne_110m_land/ne_110m_land.shp")
Update:
rgdal has changed a bit and you do not need to separate the path and layer anymore (at least for some formats). So you can do
x <- readOGR("~/R/funwithR/data/ne_110m_land/ne_110m_land.shp")
(perhaps still using path.expand)
Also, if you are still using readOGR you are a bit behind the times. It is better to use terra::vect or sf::st_read.
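For completeness, here is a short sketch of those two replacements (same file as above):
# sf
library(sf)
land_sf <- st_read("~/R/funwithR/data/ne_110m_land/ne_110m_land.shp")
# terra
library(terra)
land_vect <- vect("~/R/funwithR/data/ne_110m_land/ne_110m_land.shp")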
I had the same error. To read in a shapefile, you need to have three files in your folder: the .shp, .dbf and .shx files.
For me, the command returned the Cannot open layer error when I included the dsn and layer arguments.
So when I included it all just as
readOGR('~/R/funwithR/data/ne_110m_land/ne_110m_land.shp')
it worked.
Note that my file was a GeoJSON, so I've only seen this with:
readOGR('~/R/funwithR/data/ne_110m_land/ne_110m_land.gjson')
The mandatory files should all be in the same directory:
.shp — shape format
.shx — shape index format
.dbf — attribute format
Then we can just give the path to the .shp file as a parameter to the function, and it will work:
global_24h <- readOGR('/Users/m-store/Desktop/R_Programing/global_24h.shp')
Here's what worked for me (with a real example)
require(rgdal)
shape <- readOGR(dsn = "1259030001_ste11aaust_shape/STE11aAust.shp", layer = "STE11aAust")
The exact data is available here (download the .zip file called 'State and Territory ASGC Ed 2011 Digital Boundaries in MapInfo Interchange Format')
The syntax
library(raster)
s <- shapefile("~/R/funwithR/data/ne_110m_land/ne_110m_land.shp")
worked perfectly! Thank you very much!
As I commented in another post (Error when opening shapefile), using file.choose() and selecting the file manually will help when a single file needs to be selected. Apparently this is related to the Natural Earth shapefiles.
It seems to me that this is the solution, at least before uploading it to the cloud:
######################################
# Server
######################################
# Load rgdal for readOGR()
library(rgdal)
# Tell R where to pull the data from
dirmapas <- "E:/Tu-carpeta/Tu-sub-carpeta/ESRI"  # depends on where you keep your map files
setwd(dirmapas)
# The raw map (the blank polygon map)
departamentos <- readOGR(dsn = "BAS_LIM_DEPARTAMENTO.shp", layer = "BAS_LIM_DEPARTAMENTO")
I'm trying to create a shapefile in R that I will later import to either Fusion Table or some other GIS application.
To start, I imported a blank shapefile containing all the census tracts in Canada. I have attached other data (in tabular format) to the shapefile based on the unique ID of the CTs, and I have mapped my results. At the moment, I only need the ones in Vancouver, and I would like to export a shapefile that contains only the Vancouver CTs as well as my newly attached attribute data.
Here is my code (some parts omitted due to privacy reasons):
library(maptools)  # provides readShapePoly()
shape <- readShapePoly('C:/TEST/blank_ct.shp') # Load blank shapefile
shape@data = data.frame(shape@data, data2[match(shape@data$CTUID, data2$CTUID),]) # data2 is my created attributes that I'm attaching to the blank file
shape1 <- shape[shape$CMAUID == 933,] # selecting the Vancouver CTs
I've seen other examples using writePolyShape() to create the shapefile. I tried it, and it worked to an extent: it created the .shp, .dbf, and .shx files. But I'm missing the .prj file and I'm not sure how to go about creating it. Are there better methods out there for creating shapefiles?
Any help on this matter would be greatly appreciated.
Use rgdal and writeOGR. rgdal will preserve the projection information
something like
library(rgdal)
shape <- readOGR(dsn = 'C:/TEST', layer = 'blank_ct')
# do your processing
shape@data = data.frame(shape@data, data2[match(shape@data$CTUID, data2$CTUID),]) #data2 is my created attributes that I'm attaching to blank file
shape1 <- shape[shape$CMAUID == 933,]
writeOGR(shape1, dsn = 'C:/TEST', layer ='newstuff', driver = 'ESRI Shapefile')
Note that the dsn is the folder containing the .shp file, and the layer is the name of the shapefile without the .shp extension. It will read (readOGR) and write (writeOGR) all the component files (.dbf, .shp, .prj etc)
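If you prefer to avoid rgdal altogether, the sf package preserves the projection information as well. A rough sketch of the equivalent workflow, reusing the file names and the data2 object from the question:
library(sf)
shape <- st_read("C:/TEST/blank_ct.shp", stringsAsFactors = FALSE)
shape <- merge(shape, data2, by = "CTUID")   # attach the attribute table
shape1 <- shape[shape$CMAUID == 933, ]       # keep the Vancouver CTs
st_write(shape1, "C:/TEST/newstuff.shp")     # writes the .shp, .shx, .dbf and .prj files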
Problem solved! Thank you again to those who helped!
Here is what I ended up doing:
As Mnel wrote, this line will create the shapefile.
writeOGR(shape1, dsn = 'C:/TEST', layer ='newstuff', driver = 'ESRI Shapefile')
However, when I ran this line, it came back with this error:
Can't convert columns of class: AsIs; column names: ct2,mprop,mlot,mliv
This is because my attribute data was not numeric but character. Luckily, the values are all numbers, so I ran transform() to fix the problem.
shape2 <- shape1
shape2@data <- transform(shape1@data, ct2 = as.numeric(ct2),
                         mprop = as.numeric(mprop),
                         mlot = as.numeric(mlot),
                         mliv = as.numeric(mliv))
I tried the writeOGR() command again, but I still didn't get the .prj file I was looking for. The problem was that I hadn't specified the coordinate system for the shapefile when importing it. Since I already know what the coordinate system is, all I had to do was define it at import:
readShapePoly('C:/TEST/blank_ct.shp', proj4string=CRS("+proj=longlat +datum=WGS84"))
After that, I re-ran all the things I wanted to do with the shapefile, and the writeOGR line for exporting. And that's it!
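As an aside, if the shapefile has already been loaded without a projection, the CRS can also be attached to the existing sp object rather than at import time (a minimal sketch, assuming WGS84 really is the correct coordinate system for the data):
library(sp)
# assign the CRS to an object that was read without one
proj4string(shape) <- CRS("+proj=longlat +datum=WGS84")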