R - Fully remove cropped shapefile data

Is there a way to remove unused levels in a SpatialPolygonsDataFrame object in R?
I have a large shapefile of geological data, Geology, that I am clipping with the raster::crop tool. This seems to work fine.
But when I try to work using my new, cropped shapefile Geo, polygon types present in Geology but absent in area covered by Geo still appear as levels in Geo. This interferes with my later analysis.
I have tried to remove these "ghost" levels/attributes using droplevels, but this function is not valid for SpatialPolygons or SpatialPolygonsDataFrame objects.
For reference, I am using the wygeol_dd_polygon.shp shapefile (downloadable here - 41.4 MB) as a starting point. The salient parts of my code are below:
library(maptools)
Geology <- readShapePoly("~/wygeol_dd_polygon.shp")
library(raster)
Geo <- crop(Geology, extent(-111.05, -110.25, 44.2667, 44.7667))
After cropping, I have ten unique rock types, but still 46 levels:
unique(Geo$ROCKTYPE1)
[1] alluvium rhyolite mixed clastic/volcanic intermediate volcanic rock
[5] basalt water trachyandesite sandstone
[9] conglomerate shale
46 Levels: alkalic intrusive rock alkalic volcanic rock alluvium andesite anorthosite basalt carbonate clastic ... water
How do I get rid of these?

Try this:
Geo@data <- droplevels(Geo@data)
The above will handle all the factor columns in one call.
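As a minimal sketch of the effect, assuming the sp package is installed: the object `toy`, the helper `make_square`, and the three rock types below are made up for illustration and are not the Wyoming data. Row subsetting stands in for crop here, since it also leaves the factor levels of the attribute table untouched.

```r
library(sp)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0).
# The ring is listed clockwise so sp treats it as a regular polygon, not a hole.
make_square <- function(x0, id) {
  coords <- cbind(x = c(x0, x0, x0 + 1, x0 + 1, x0),
                  y = c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(coords)), ID = id)
}

ids <- c("a", "b", "c")
polys <- SpatialPolygons(list(make_square(0, "a"),
                              make_square(1, "b"),
                              make_square(2, "c")))

# Attribute table; row names must match the polygon IDs
attrs <- data.frame(ROCKTYPE1 = factor(c("basalt", "shale", "water")),
                    row.names = ids)
toy <- SpatialPolygonsDataFrame(polys, attrs)

# Drop one polygon; the unused level "water" is still attached
sub <- toy[toy$ROCKTYPE1 != "water", ]
levels_before <- nlevels(sub$ROCKTYPE1)  # 3

# Drop unused levels in every factor column of the attribute table
sub@data <- droplevels(sub@data)
levels_after <- nlevels(sub$ROCKTYPE1)   # 2
```

Only the attribute table is modified; the geometries are left as they were.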

The column you are having issues with is a factor variable. When you subset or crop data containing a factor in R, the factor keeps all of its original levels even if none of the remaining rows use them. Fortunately, it's an easy fix:
Geo$ROCKTYPE1 <- factor(Geo$ROCKTYPE1)
This redefines the factor so now you should only have 10 levels, as you want.

Related

R: Changing values from raster at certain coordinates

I run species distribution models in R and want to create variable rasters for the mainland of Africa, without the islands. I can only find shapefiles of Africa with its islands, not from the mainland only.
1) Where can I possibly download a shapefile of the mainland only?
2) If there is no shapefile, I would like to manually delete the islands from my raster. Is there a way to do this, f.e. setting parts of the rasters between certain coordinates to NA?
Yes. Here is a minimal, self-contained, reproducible example. The easiest approach might be to use Africa polygons africa and do
library(raster)
afr <- aggregate(africa)
v <- disaggregate(afr)
a <- area(v)
afnois <- v[which.max(a), ]
And then use that in mask to remove the islands from the rasters.
You can also create polygons with raster::drawPoly and use these for masking.
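For the second part of the question (blanking out cells between certain coordinates), a rough sketch with the raster package might look like the following. The toy raster `r` and the box `island_box` are invented for illustration; real island coordinates would have to be filled in by hand.

```r
library(raster)

# Toy raster: 10 x 10 cells covering x in [0, 10] and y in [0, 10]
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) <- 1:ncell(r)

# Hypothetical "island" bounding box, given as xmin, xmax, ymin, ymax
island_box <- extent(2, 4, 2, 4)

# Look up the cell numbers that fall inside the box and set them to NA
cells <- cellsFromExtent(r, island_box)
r[cells] <- NA

# Number of cells that are now NA
n_blank <- sum(is.na(values(r)))
```

A rectangular box is a blunt tool near the coast, so the polygon-based mask above is likely the better choice wherever an island sits close to the mainland.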

How do I remove a subset of polygons from a Large SpatialPolygonsDataFrame using a string search, in R?

I have a spatial file in R, that contains all the area units for New Zealand. I have downloaded it in NZGD2000 format. In this file I have irrelevant geographic details, such as the Oceanic regions. I have managed to remove those from my data by simply removing those polygons with higher than a certain value.
library("dplyr")
library("rgdal")
library("rgeos")
NZAreas <- readOGR("[FILEPATH]/area-unit-2013.shp")
#remove the areas that are offshore
NZAreas@data$AU2013_V1_ <- as.numeric(as.character(NZAreas@data$AU2013_V1_))
NZAreas <- NZAreas[NZAreas@data$AU2013_V1_ < 614000,]
My problem is that the area units include inlets and inland water. I can't remove those in the same way as I removed the coastal units, as the area unit values are not contiguous. The @data$AU2013_V_1 column contains the labels for the area units. All the area units I wish to remove have a label starting with "Inlet" or "Inland Water".
I can't work out how to remove these polygons from the data.
First I tried without the dataframe name in front of the @data:
NZAreas <- NZAreas[!grepl("Inlet", @data$AU2013_V_1),]
Error: unexpected '@' in "NZAreas <- NZAreas[!grepl("Inlet", @"
and then I tried:
NZAreas <- NZAreas[!grepl("Inlet", NZAreas@data$AU2013_V_1),]
That second code runs but does not remove the polygons; it does not seem to do anything to the Large SpatialPolygonDataFrame. I checked the dataframe I constructed off NZAreas and there are Inlet and Inland Water rows. How do I remove these polygons?
This should work. It removed 49 areas containing "Inlet" in label and 15 areas having "Inland Water" in label.
> dim(NZAreas)
[1] 2004 5
> NZAreas=NZAreas[!grepl("Inlet", NZAreas$AU2013_V_1),]
> dim(NZAreas)
[1] 1955 5
> NZAreas=NZAreas[!grepl("Inland Water", NZAreas$AU2013_V_1),]
> dim(NZAreas)
[1] 1940 5
>
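Since every unwanted label starts with "Inlet" or "Inland Water", the two steps can probably be folded into one anchored pattern. The sketch below uses a plain data frame, `areas`, as a stand-in for the attribute table, and its labels are invented for illustration.

```r
# Stand-in for the attribute table; labels are made up
areas <- data.frame(
  AU2013_V_1 = c("Inlet-Example Harbour",
                 "Inland Water-Example Lake",
                 "Example Park",
                 "Example East"),
  stringsAsFactors = FALSE
)

# TRUE for labels that start with either prefix ("^" anchors at the start)
drop <- grepl("^(Inlet|Inland Water)", areas$AU2013_V_1)

# Keep only the rows that did not match
kept <- areas[!drop, , drop = FALSE]
```

On the spatial object itself the same logical vector would be used as `NZAreas[!drop, ]`, with `drop` computed from `NZAreas$AU2013_V_1`.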

Merging (two and a half) countries from maps-package to one map object in R

I am looking for a map that combines Germany, Austria and parts of Switzerland together to one spatial object. This area should represent the German speaking areas in those three countries. I have some parts in place, but can not find a way to combine them. If there is a completely different solution to solve this problem, I am still interested.
I get the German and the Austrian map by:
require(maps)
germany <- map("world",regions="Germany",fill=TRUE,col="white") #get the map
austria <- map("world",regions="Austria",fill=TRUE,col="white") #get the map
Switzerland is more complicated, as I only need the 60-70% of it that mainly speaks German. The cantons that do so (taken from the census report) are
cantonesGerman = c("Uri", "Appenzell Innerrhoden", "Nidwalden", "Obwalden", "Appenzell Ausserrhoden", "Schwyz", "Lucerne", "Thurgau", "Solothurn", "Sankt Gallen", "Schaffhausen", "Basel-Landschaft", "Aargau", "Glarus", "Zug", "Zürich", "Basel-Stadt")
The canton names can be used together with data from gadm.org/country (selecting Switzerland & SpatialPolygonsDataFrame -> Level 1 or via the direct link) to get the German-speaking areas from the gadm-object:
gadmCH = readRDS("~/tmp/CHE_adm1.rds")
dataGermanSwiss <- gadmCH[gadmCH$NAME_1 %in% cantonesGerman,]
I am now missing the merging step to get this information together. The result should look like this:
It represents a combined map consisting of the contours of the merged area (Germany + Austria + ~70% of Switzerland), without borders between the countries. If adding and leaving out the inter-country borders would be parametrizable, that would be great but not a must have.
You can do that like this:
Get the polygons you need
library(raster)
deu <- getData('GADM', country='DEU', level=0)
aut <- getData('GADM', country='AUT', level=0)
swi <- getData('GADM', country='CHE', level=1)
Subset the Swiss cantons (here an example list, not the correct one); there is no need for a loop for such things in R.
cantone <- c('Aargau', 'Appenzell Ausserrhoden', 'Appenzell Innerrhoden', 'Basel-Landschaft', 'Basel-Stadt', 'Sankt Gallen', 'Schaffhausen', 'Solothurn', 'Thurgau', 'Zürich')
GermanSwiss <- swi[swi$NAME_1 %in% cantone,]
Aggregate (dissolve) Swiss internal boundaries
GermanSwiss <- aggregate(GermanSwiss)
Combine the three countries and aggregate
german <- bind(deu, aut, GermanSwiss)
german <- aggregate(german)

barplot: selecting data in R

I have a problem building a barplot.
I am working on air traffic in different countries. I would like to get a barplot for each country, with the different airport names on the X axis. The Y axis will show the number of airlines using the airport.
My plan is to write the script for one country and to replicate it manually for the others.
In my data, I have the following columns:
Country / airport / destination.
So each row is actually one airline that is using the airport.
Do you have an idea about how to do this?
For now I have this idea:
UK<-traffic[traffic$Country=="UK",]
UK$airport <- as.factor(UK$airport)
countUK<-table(UK$airport)
barplot(countUK)
This is not working, I have a bunch of airports that are not in UK in the X axis...
Thanks for your help
Answer found:
You could try to drop unused factor levels, i.e.
UK <- droplevels(UK) after the line UK$airport <- as.factor(UK$airport).
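A small sketch of why this helps, using an invented `traffic` table (the airport codes and counts are made up): calling as.factor on a column that is already a factor leaves its levels alone, so table() still reports the foreign airports with a count of zero until the levels are dropped.

```r
# Invented example data: one row per airline using an airport
traffic <- data.frame(
  Country = c("UK", "UK", "UK", "FR", "FR"),
  airport = factor(c("LHR", "LHR", "MAN", "CDG", "ORY"))
)

UK <- traffic[traffic$Country == "UK", ]

# The subset still carries all four airport levels
counts_before <- table(UK$airport)  # CDG and ORY appear with count 0

# Drop levels that no longer occur, then count again
UK <- droplevels(UK)
counts_after <- table(UK$airport)   # only LHR and MAN remain

barplot(counts_after)
```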

R "maps" package and choropleths

I would like to make a choropleth with the maps package in R. I have data which I have constructed to create bins and associate color names with those bins. Now, I need to use the col= argument to point the colors to the counties, in this example. How do I construct that argument? I would have thought that constructing a data frame would associate the county and color on the same line? Is that not true? So far I have the following
Example Data:
County | Value | Bin | Color
alamance | 100 | 1 | white
brunswick | 1000 | 2 | red
... through 100 counties
R code (which does not work):
library("maps")
DATA <- read.csv("~/Example_Data.csv")
DATA$County <- as.character(DATA$County)
DATA$Color <- as.character(DATA$Color)
NC <- map('county', 'north carolina', col= DATA$Color, fill=TRUE)
So, after many iterations here is the essence of the solution. Instead of giving the R code which made it work (pretty bland), here are the rules that helped solve the problem.
The county.fips data included in the package has a column with all states and county names. This revealed the formatting of county name matches which are all lowercase, "state,county" with no spaces.
For the NC subset there are 102 entries, not 100, because Currituck County is subdivided into three entities. This was the source of most/all of the issues and was difficult to diagnose but easy to solve.
Solution 1 - Match a vector of colors to the vector of counties. 102 color entries IN THE PROPER ALPHA ORDER will produce a correctly resulting choropleth. Fastest, but also the least convenient if you were trying to do this for, say, all counties in the U.S.
Solution 2 - Add fips codes to the original data and then match on fips. Since the county.fips file lists the Currituck entities as "north carolina,currituck:main", etc., this still takes some manipulation or an external fips reference. This is the method used in the maps() documentation, but it would have taken too long here, so I preferred the former. Taking the time, however, would let you handle a national dataset, for instance.
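One possible way to get the ordering of Solution 1 without sorting by hand is to match on the polygon names that map() itself returns. This is a sketch rather than the code used above: the two-row `DATA` frame and the fallback color "grey80" are assumptions, and it colors every sub-entity of a county the same way.

```r
library(maps)

# Polygon names in drawing order, e.g. "north carolina,alamance"
nc <- map("county", "north carolina", plot = FALSE, fill = TRUE)

# Reduce each name to the bare county: drop the state prefix
# and any ":main"-style suffix used for subdivided counties
county <- sub(":.*$", "", sub("^[^,]*,", "", nc$names))

# Hypothetical data in the layout shown in the question
DATA <- data.frame(County = c("alamance", "brunswick"),
                   Color = c("white", "red"),
                   stringsAsFactors = FALSE)

# One color per polygon, in the same order as nc$names
cols <- DATA$Color[match(county, DATA$County)]
cols[is.na(cols)] <- "grey80"  # counties missing from DATA

map("county", "north carolina", fill = TRUE, col = cols)
```

Because the color vector is built from `nc$names`, its length always equals the number of polygons drawn, whatever that count is in the installed version of the package.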
