How to create State & district level map using GADM and ggplot? - r

I am working with Covid data and want to plot Indian state- and district-level data on a map.
I have the state and district names for India along with case counts, but I do not have the lat, long needed for them.
I came across this SO post, How to map an Indian state with districts in r?,
and tried raster::getData("GADM", country = "India", level = 2) %>% as_tibble(), but this doesn't work because the result doesn't have lat, lon, shapefile etc.
library(raster)
library(rgdal)
library(rgeos)
state_level_map <- raster::getData("GADM", country = "India", level = 1) %>%
as_tibble() %>%
filter(NAME_1 == "Rajasthan") %>%
fortify()
ggplot() +
geom_map(data= state_level_map, map = state_level_map,
aes(x = long, y = lat, map_id = id, group = group))
I am new to spatial data and maps and am not sure how to proceed in this situation. Is it possible to get the lat, lon, shapefile etc. for state/district names from an R package, or is the only way to look up the lat, lon for each of them manually?
Appreciate any help.

You were almost there. Use sf for that.
library(raster)
library(sf)
library(rgeos)
library(dplyr)
library(ggplot2) # needed for ggplot() and geom_sf()
state_level_map <- raster::getData("GADM", country = "India", level = 1) %>%
st_as_sf() %>%
filter(NAME_1 == "Rajasthan")
ggplot() +
geom_sf(data = state_level_map)
You can then use aes() to change the aesthetics of the ggplot as you normally would, using variables.
sf uses a dataframe-like notation that incorporates both attribute data and geometries into a single, easy-to-use dataframe. Just have a look at print(state_level_map). That is, you could join your data using district names to augment your attributes and visualize them through aes(color = yourjoinedvar).
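The join described above can be sketched on toy data. This is only an illustration under stated assumptions: the two square "districts", the column name NAME_2 (GADM's usual district-name column at level 2), the cases data frame and its numbers are all made up here and are not from the original post.

```r
library(sf)
library(dplyr)
library(ggplot2)

# Helper (hypothetical): a unit square starting at x0, standing in for a district polygon
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}

# Toy sf object with a district-name column, shaped like the GADM result
districts <- st_sf(
  NAME_2   = c("Ajmer", "Alwar"),
  geometry = st_sfc(sq(0), sq(1))
)

# Made-up case counts keyed by district name
cases <- data.frame(District = c("Ajmer", "Alwar"), Cases = c(120, 85))

# Join attributes onto the geometries by name; the result is still an sf object
districts_cases <- left_join(districts, cases, by = c("NAME_2" = "District"))

# Fill each polygon by the joined variable
p <- ggplot(districts_cases) +
  geom_sf(aes(fill = Cases))
p
```

With real data, the same left_join() would be applied to the level 2 GADM object, after checking that the district spellings in both tables agree.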

Related

How to intersect maps using Tigris (and keeping all maps boundaries)?

Sorry for this very basic question, but I'm new to using Tigris. I would like to create a shapefile (and then plot it) of county boundaries + places boundaries for the state of Minnesota.
Here is my code to get the counties:
mn_counties = tigris::counties(cb = T) %>%
filter(STUSPS == 'MN')
And here is my code to get the intersection between places and counties:
mn_places = tigris::places(cb = T) %>%
filter(STUSPS == 'MN') %>%
sf::st_intersection(mn_counties)
However, when I plot the intersection of these maps (counties and places), I can only see the polygons for the places map, but not for the counties.
tm_shape(mn_places) + tm_polygons()
Can anyone please tell me how to get an intersection of counties and places: 1. using tigris and, 2. that I'm able to see both places and county boundaries?
Many thanks in advance!!!
If I am understanding you correctly, you want places and counties in the same dataset. This is accomplished with dplyr::bind_rows():
library(tigris)
library(dplyr)
library(tmap)
mn_counties_and_places <- counties(state = "MN", cb = TRUE) %>%
bind_rows(
places(state = "MN", cb = TRUE)
)
tm_shape(mn_counties_and_places) +
tm_polygons()

Split a map into two separate maps by latitude or longitude

Is there a way to slice a ggplot2 map into two separate maps? I have one large map with id labels that are illegible. I want to split the map vertically into two distinct maps, preferably with an overlapping area so that each polygon would show up whole in at least one map.
Here's a reproducible example. I would want to split the map into a northern one at 35 degrees north and then into a southern one at 35.5 degrees north (giving an overlap between 35 and 35.5 in both). (While I realize it might make more sense with this example to split the other way, my actual map is long vertically.)
library(sf)
library(ggplot2)
sf_nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
plot <- ggplot2::ggplot(sf_nc) +
geom_sf(aes(color = NAME)) +
geom_sf_text(aes(label = NAME))
Maybe this is what you are looking for.
Following this post, I first use st_crop to split the sf data frame by latitude and extract the FIPS codes for the south and north regions.
The FIPS codes are then used to split the sf data frame into two, which ensures that regions on the dividing line are shown whole in both maps.
Finally, I add an ID and bind both data frames back together for easy plotting with facet_wrap.
library(sf)
library(ggplot2)
library(dplyr)
sf_nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
# Get FIPS/region codes for the south and north regions
south <- st_crop(sf_nc, xmin=-180, xmax=180, ymin=-90, ymax=35.5) %>%
pull(FIPS)
north <- st_crop(sf_nc, xmin=-180, xmax=180, ymin=35.5, ymax=90) %>%
pull(FIPS)
# Make sf df for north and south
sf_nc_1 <- filter(sf_nc, FIPS %in% south) %>%
mutate(id = "South")
sf_nc_2 <- filter(sf_nc, FIPS %in% north) %>%
mutate(id = "North")
# Bind together for using facet_wrap
sf_nc_split <- rbind(sf_nc_1, sf_nc_2)
ggplot2::ggplot(sf_nc_split) +
geom_sf(aes(color = NAME)) +
geom_sf_text(aes(label = NAME), size = 2) +
guides(color = "none") +
facet_wrap(~id, ncol = 1) +
theme_void()
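The answer above splits at 35.5 for both halves. A variant that gives the overlap between 35 and 35.5 requested in the question can be sketched as follows. Note that this swaps in a different technique: instead of st_crop, it filters counties by the latitude range of their bounding boxes, which is an assumption of this sketch and not the answerer's method.

```r
library(sf)
library(ggplot2)

sf_nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Bounding box of every county, one row per county
bb <- t(vapply(st_geometry(sf_nc), function(g) as.numeric(st_bbox(g)), numeric(4)))
colnames(bb) <- c("xmin", "ymin", "xmax", "ymax")

# South map: counties reaching below 35.5; north map: counties reaching above 35.
# A county touching the 35 to 35.5 band is kept whole in both.
south <- sf_nc[bb[, "ymin"] < 35.5, ]
north <- sf_nc[bb[, "ymax"] > 35, ]
south$id <- "South"
north$id <- "North"

# Bind together for facet_wrap, as in the answer above
sf_nc_split <- rbind(south, north)

p <- ggplot(sf_nc_split) +
  geom_sf() +
  geom_sf_text(aes(label = NAME), size = 2) +
  facet_wrap(~id, ncol = 1) +
  theme_void()
p
```

Because whole counties are selected rather than clipped, no polygon is cut at the dividing line.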

Spatial Visualization using country names - R

I have a dataframe with several columns, and I'd like to visualize some of the information in it.
I want to display the information using a world map. Like this:
In my dataframe, I have a column with the country names, but I don't have the latitude/longitude information.
How can I make this plot using only the country names?
Many options. One is the rworldmap package.
library(rworldmap)
We need some data to map.
COVID <- read.csv("https://opendata.ecdc.europa.eu/covid19/casedistribution/csv", na.strings = "", fileEncoding = "UTF-8-BOM")
Aggregate to get the total cases.
library(dplyr)
CASES <- COVID %>% group_by(countriesAndTerritories) %>%
summarise(`Total cases` = sum(cases)) %>%
mutate(countriesAndTerritories=gsub("_", " ", countriesAndTerritories))
If you've already got your data, then you can start from here. Just two steps.
Step 1.
Join the map with your own data using "NAME" for joinCode and the name of the variable in your data that represents the country name for nameJoinColumn.
COVID.map <- joinCountryData2Map(CASES, joinCode = "NAME", nameJoinColumn = "countriesAndTerritories")
Step 2. Plot this object.
par(mar=c(0,0,1,0))
mapCountryData(COVID.map, nameColumnToPlot="Total cases")
It's not a particularly useful map because the data are highly skewed. But you can see how easy it is. The most difficult part is to ensure your country names match those from the package. You can see these from:
countryRegions$ADMIN
[1] "Afghanistan" "Akrotiri Sovereign Base Area" "Aland"
[4] "Albania" "Algeria" "American Samoa"
There's also a country synonym database:
countrySynonyms
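Since matching names is the hard part, it may help to list the names that will fail to join before mapping. A minimal sketch, assuming a data frame called my_data with a country column (both names and all values are made up for illustration); it only uses the countryRegions$ADMIN vector shown above.

```r
library(rworldmap)

# Hypothetical user data; "Atlantis" is deliberately not a real country
my_data <- data.frame(
  country = c("France", "Atlantis"),
  cases   = c(10, 20)
)

# Names in the data that have no exact match among the package's country names
unmatched <- setdiff(my_data$country, countryRegions$ADMIN)
unmatched
```

Any name listed in unmatched would need to be recoded to the package's spelling before calling joinCountryData2Map().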
A ggplot version:
library(ggplot2)
library(scales)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
world <- ne_countries(scale = "medium", returnclass = "sf")
COVID.world <- merge(world, CASES, by.x="admin", by.y="countriesAndTerritories")
ggplot(data = COVID.world) +
geom_sf(aes(fill = `Total cases`)) + # column created in summarise() above
scale_fill_gradient(labels = comma) +
theme_void()

Find centre of polygons using dplyr

I'm making a map with arc lines connecting counties in the US state of Missouri. I've calculated the 'good enough' centres of each county by taking the mean of each polygon's long/lat. This works well for the more or less square-shaped counties, but less so for the more intricately shaped ones. I think that this must be a common occurrence, but I can't find the answer online or with any function I've created. I'd like to use a tidyverse workflow (i.e. not transform to spatial objects if I can help it). Are there any tidyverse solutions to the problem at hand?
You can see the problem in the examples below.
library(tidyverse)
# import all state/county fortified
all_states <- as_tibble(map_data('county'))
# filter for missouri
mo_fortify <- all_states %>%
filter(region == 'missouri')
## Pull Iron county, which is relatively oddly shaped
mo_iron_fortify <- mo_fortify %>%
group_by(subregion) %>%
mutate(long_c = mean(long),
lat_c = mean(lat),
iron = ifelse(subregion == 'iron','Iron','Others')) %>%
ungroup()
# map a ggplot2 map
mo_iron_fortify %>%
ggplot(aes(long, lat, group = group))+
geom_polygon(aes(fill = iron),
color = 'white')+
geom_point(aes(long_c, lat_c))+
scale_fill_grey('Iron county is\na good example')+
coord_map()+
theme_bw()

Create choropleth map from coordinate points

I have a data frame consisting of multiple data points with specific geocoordinates (latitude and longitude). I'm looking to create a choropleth-style world map where geographical regions are shaded according to how many data points fall within the boundaries of the region.
Is there a simple way to accomplish what I'm trying to do in R, preferably using the "maps" package's world map and the "ggplot2" map-plotting functions?
Here is a minimally reproducible result of what I have:
library(ggplot2)
library(maps)
data <- data.frame(lat = 40.730610, lon = -73.935242)
ggplot() +
geom_polygon(data = map_data("world"), aes(x = long, y = lat, group = group, fill = group)) +
coord_fixed(1.3)
I've noticed that the fill parameter on plot item functions can be used to create a choropleth effect. Here, the fill parameter on the aes() function of the geom_polygon() function is used to create a choropleth where each group is color coded differently.
There are many ways to achieve this task. The general idea is to convert both the point data and polygon data to spatial objects. After that, count how many points fall within that polygon. I know we can do this using the sp package, which is widespread and well-known in the R community, but I decided to use the sf package because sf would be the next generation standard of spatial objects in R (https://cran.r-project.org/web/packages/sf/index.html). Knowing the usage and functionality of sf will probably be beneficial.
First, the OP provided an example point, but I decided to add more points so that we can see how to count the points and aggregate the data. To do so, I used the ggmap package to geocode some cities that I selected as examples.
# Load package
library(tidyverse)
library(ggmap)
library(maps)
library(maptools)
library(sf)
# Create point data as a data frame
point_data <- data.frame(lat = 40.730610, lon = -73.935242)
# Geocode a series of cities
city <- c("Detroit", "Seattle", "Toronto", "Denver", "Mexico City", "Paris", "New Orleans",
"Tokyo", "Osaka", "Beijing", "Canberra", "New York", "Istanbul", "New Delhi",
"London", "Taipei", "Seoul", "Manila", "Bangkok", "Lagos", "Chicago", "Shanghai")
point_data2 <- geocode(city)
# Combine OP's example and the geocoding result
point_data3 <- bind_rows(point_data, point_data2)
Next, I converted the point_data3 data frame to the sf object. I will also get the polygon data of the world using the maps package and convert it to an sf object.
# Convert to simple feature object
point_sf <- st_as_sf(point_data3, coords = c("lon", "lat"), crs = 4326)
# Get world map data
worldmap <- maps::map("world", fill = TRUE, plot = FALSE)
# Convert world to sp class
IDs <- sapply(strsplit(worldmap$names, ":"), "[", 1L)
world_sp <- map2SpatialPolygons(worldmap, IDs = IDs,
proj4string = CRS("+proj=longlat +datum=WGS84"))
# Convert world_sp to simple feature object
world_sf <- st_as_sf(world_sp)
# Add country ID
world_sf <- world_sf %>%
mutate(region = map_chr(1:length(world_sp@polygons), function(i){
world_sp@polygons[[i]]@ID
}))
Now both point_sf and world_sf are sf objects. We can use the st_within function to examine which points are within which polygons.
# Use st_within
result <- st_within(point_sf, world_sf, sparse = FALSE)
# Calculate the total count of each polygon
# Store the result as a new column "Count" in world_sf
world_sf <- world_sf %>%
mutate(Count = apply(result, 2, sum))
The total count information is in the Count column of world_sf. We can get the world data frame as the OP did using the map_data function. We can then merge world_data and world_df.
# Convert world_sf to a data frame world_df
world_df <- world_sf
st_geometry(world_df) <- NULL
# Get world data frame
world_data <- map_data("world")
# Merge world_data and world_df
world_data2 <- world_data %>%
left_join(world_df, by = c("region"))
Now we are ready to plot the data. The following code is the same as the OP's ggplot code except that the input data is now world_data2 and fill = Count.
ggplot() +
geom_polygon(data = world_data2, aes(x = long, y = lat, group = group, fill = Count)) +
coord_fixed(1.3)
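The counting step with st_within can be tried on toy data without any geocoding. This is a minimal sketch under stated assumptions: the two square regions "A" and "B" and the three points are invented for illustration and use plain planar coordinates with no CRS.

```r
library(sf)

# Helper (hypothetical): a unit square with lower-left corner (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Two adjacent toy regions
regions <- st_sf(region = c("A", "B"), geometry = st_sfc(sq(0, 0), sq(1, 0)))

# Three toy points: two inside A, one inside B
pts <- st_sf(geometry = st_sfc(
  st_point(c(0.2, 0.5)),
  st_point(c(0.7, 0.1)),
  st_point(c(1.5, 0.5))
))

# Dense logical matrix: rows are points, columns are regions
inside <- st_within(pts, regions, sparse = FALSE)

# Column sums give the number of points per region
regions$Count <- colSums(inside)
regions
```

colSums(inside) is equivalent to the apply(result, 2, sum) call used above; either way the Count column can then be mapped to fill.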
