How to merge a shapefile with a .csv and create a graph - r

I am trying to merge a shapefile and .csv file to make a map of election results. I can make a graph when I load the shapefile, but as soon as I merge it with the .csv, it says "Error in plot.window(...) : need finite 'xlim' values".
I have been reading online, and I think I may need to merge the csv file into the shapefile (so far I have been merging the shapefile into the csv file). However, the csv file (which contains the election results) has more rows than the shapefile (which gives the coordinates for the districts). How can I create more districts to match the election results? And will this even solve my problem, or is there something else that I am missing? Also, the data is in Spanish; the relevant terms are Distritos = districts, partidos = political party, cargo = type of election, votos = votes.
library(maptools)
library(rgdal)
library(rgeos)
library(sf)
municipios <-readOGR("/Users/Desktop/Limite_partidos/Shapefile/Partidos.shp")
elec <- read.csv("/Users/Desktop/elections excel.csv", header = TRUE, stringsAsFactors = FALSE)
library(dplyr)
#merge datasets together
elec <-merge(elec, municipios, by=c("nam"), na.rm=TRUE)
plot(elec)
Link to election results (elec)
Link for shapefile (municipios)

First some sample data. This should be similar to what you get when
you read in your shapefile. The R spatial object is called a
SpatialPolygonsDataFrame. It carries a data.frame with the covariate
information about your polygons.
library(sp)
Sr1 <- Polygon(cbind(c(2,4,4,1,2),c(2,3,5,4,2)))
Sr2 <- Polygon(cbind(c(5,4,2,5),c(2,3,2,2)))
Sr3 <- Polygon(cbind(c(4,4,5,10,4),c(5,3,2,5,5)))
Sr4 <- Polygon(cbind(c(5,6,6,5,5),c(4,4,3,3,4)), hole = TRUE)
Srs1 <- Polygons(list(Sr1), "s1")
Srs2 <- Polygons(list(Sr2), "s2")
Srs3 <- Polygons(list(Sr3, Sr4), "s3/4")
SpP <- SpatialPolygons(list(Srs1,Srs2,Srs3), 1:3)
Spdf <- SpatialPolygonsDataFrame(SpP, data.frame(name = c("a", "b", "c"), row.names = c("s1", "s2", "s3/4")))
Now you have a spatial object that you can plot:
plot(Spdf)
and have a look at the attached data.frame of your spatial object. Here you need to have some identifier that you will match to your election results:
Spdf@data
You also have another dataframe with your 'election results' (Also with this identifier)
election <- data.frame(name = c("a", "c", "b"), voted = c(0.1, 0.2, 0.3))
Now match the election results to your spatial object so you can plot them:
Spdf@data$voted <- election$voted[match(Spdf$name, election$name)]
To plot the polygons with the voted result as the colour of the polygon you need a palette:
Spdf@data$colour <- heat.colors(3)[as.numeric(cut(Spdf@data$voted, 3))]
Then just plot:
plot(Spdf, col = Spdf@data$colour)
You can imagine that you will want more than 3 breaks in your
scale and that you'll have more polygons, but this is just an example. Good luck!
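One thing to watch with the match() lookup: it returns only the first matching row, and NA where there is no match. If the results table has several rows per district (one per party, say), it helps to reduce it to one row per district first. A minimal sketch, using made-up data and hypothetical column names (name, party, votes) that are not taken from the question's files:

```r
# Hypothetical data: four polygon names and a results table with
# several rows per district (one row per party).
poly_names <- c("a", "b", "c", "d")
results <- data.frame(
  name  = c("a", "a", "c", "b"),
  party = c("X", "Y", "X", "X"),
  votes = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Keep a single row per district (here: only party "X") ...
one_party <- results[results$party == "X", ]

# ... then look up each polygon's row by name.
# match() gives NA for polygons with no result ("d" in this example).
idx <- match(poly_names, one_party$name)
votes_by_polygon <- one_party$votes[idx]

print(votes_by_polygon)
```

Polygons that end up with NA would be drawn without a fill colour, which is usually a helpful hint that the identifiers do not line up.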

You can either select a single party (and then have a single value per administrative region) or facet by the party (and have as many small multiples as there are parties).
This really depends on what is your aim with the visualization. Given that there seems to be 48 results in your file the "small" multiples would be rather large, and filtering makes more sense.
For joining the shapefile and the data frame with election results I suggest using one of the *_join functions from the dplyr package (loaded with the tidyverse).
Consider this approach, built on the assumption of filtering:
library(tidyverse)
library(sf)
tf_elec <- tempfile(fileext = ".csv") # create a temporary csv file
download.file("https://catalogo.datos.gba.gob.ar/dataset/1ae289f8-532c-4f69-a3c8-0268fe0ee390/resource/f8168491-4c38-4b03-82f1-b05fe43f8349/download/generales-2017.csv", tf_elec, quiet = T)
elec <- read_csv2(tf_elec) # read the data
tf_zip <- tempfile(fileext = ".zip") # a temporary zip file
download.file("https://catalogo.datos.gba.gob.ar/dataset/627f65de-2510-4bf4-976b-16035828b5ae/resource/de607a34-b782-420f-93ed-35073a016e01/download/limite_partidos.zip", tf_zip, quiet = T)
unzip(tf_zip, files = 'Limite_partidos/GeoJSON/Partidos.geojson', exdir = tempdir(), junkpaths = T)
municipios <- st_read(paste0(tempdir(), '/Partidos.geojson'), quiet = T) # read the district boundaries
src <- municipios %>%
left_join(elec, by = c('nam' = 'distrito')) %>%
filter(partido == 'VOTOS NULOS') #or what not... :)
ggplot() +
geom_sf(data = src, aes(fill = votos))
The most difficult part is downloading the data. The left_join() is in the second-to-last chunk, and the very last one is the visualization. For the sake of simplicity I took the easy ggplot2 route, but also consider the excellent tmap package if you want your maps to really shine.
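To see why filtering matters, here is a minimal sketch with plain data frames standing in for the real files. The values are made up, and only the column names nam, distrito, partido and votos follow the code above. Joining the unfiltered results repeats each district once per party, while joining a single party keeps one row per district. An anti_join() then lists results that have no matching district:

```r
library(dplyr)

# Stand-in for the district layer (geometry column omitted for brevity).
districts <- data.frame(nam = c("A", "B", "C"), stringsAsFactors = FALSE)

# Stand-in for the election results: several parties per district,
# plus one district ("Z") that does not exist in the district layer.
elec <- data.frame(
  distrito = c("A", "A", "B", "B", "Z"),
  partido  = c("P1", "P2", "P1", "P2", "P1"),
  votos    = c(5, 7, 3, 9, 1),
  stringsAsFactors = FALSE
)

# Unfiltered join: districts A and B appear twice, C once with NA values.
joined_all <- left_join(districts, elec, by = c("nam" = "distrito"))

# Filter to one party first: exactly one row per district.
one_party  <- filter(elec, partido == "P1")
joined_one <- left_join(districts, one_party, by = c("nam" = "distrito"))

# Results whose district name has no counterpart in the district layer.
unmatched <- anti_join(one_party, districts, by = c("distrito" = "nam"))

print(nrow(joined_all))
print(nrow(joined_one))
print(unmatched)
```

Rows reported by the anti_join() usually point to spelling or accent differences between the two sources rather than to districts that need to be created.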

Related

Extract CMIP6 information with R

I'm doing research on climate change and I want to extract CMIP6 data using R, but I don't get the results I want. This is the code:
library(raster)
library(ncdf4)
long_lat <- read.csv("C:/Users/USUARIO/Desktop/Tesis/Coordenadas - copia.csv", header = T)
raster_pp <- raster::brick("pr_day_ACCESS-CM2_historical_r1i1p1f1_gn_2000.nc")
sp::coordinates(long_lat) <- ~XX+YY
raster::projection(long_lat) <- raster::projection(raster_pp)
points_long_lat <- raster::extract(raster_pp[[1]], long_lat, cellnumbers = T)[,1]
data_long_lat <- t(raster_pp[points_long_lat])
The coordinates I've used are the following: -77.7754,-9.5352 (latitude, longitude), but I've also tried 102.2246,-9.5352 and it still doesn't work. Thanks for your response.

ggplot2 map is blank

I am using code similar to other scripts of mine that work with different shapefiles from Statistics Canada. However, I can't get a simple script to work with a provincial map. I think the problem is simple but I can't see it.
setwd("D:\\OneDrive\\lfs_stuff")
project_folder<-getwd()
data_folder<-project_folder
library(tidyverse)
#now start the map
library(rgeos)
library(rgdal)
library(maptools)
library(sp)
library(mapproj)
library(ggplot2)
#get test data
mydata<-read_csv("map_data.csv",col_types=list(col_character(),col_double()))
print(mydata)
# shape file came from this link for a digital shape file
# http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/lpr_000a16a_e.zip
target_url<-"http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/lpr_000a16a_e.zip"
url_file<-"lpr_000a16a_e.zip"
download_target<-paste0(project_folder,"/",url_file)
download.file(target_url,download_target,mode="wb",quiet=FALSE)
unzip(download_target,overwrite=TRUE,exdir=data_folder)
provincial_shape_file<-gsub(".zip",".shp",download_target)
provincial_shp<-readOGR(dsn=provincial_shape_file,layer="lpr_000a16a_e")
#convert it to the required data structure. the id variable will contain the provincial codes
prov_base_map<-fortify(provincial_shp,region="PRUID")
map_data_1<-merge(prov_base_map,as_data_frame(mydata),by="id")
map1<-ggplot()+
geom_map(data=map_data_1,map=map_data_1,stat="identity",
aes(map_id=id,x=long,y=lat,fill=(pch),group=group),
colour="black",size=0.3)+
coord_map()
print(map1)
The download for the shape file is in the script. The mydata file is shown below
"id","pch"
"10",0.667259786476859
"11",5.63186813186813
"12",2.12053571428572
"13",-0.563697857948142
"24",0.150669774230772
"35",1.15309092428315
"46",0.479282622139765
"47",1.70242950877815
"48",1.84482533036765
"59",1.96197656978394
Here's one way with sf (though I think the ultimate issue is not having the id being identified correctly):
library(sf)
library(httr)
library(tidyverse)
read.csv(text='"id","pch"
"10",0.667259786476859
"11",5.63186813186813
"12",2.12053571428572
"13",-0.563697857948142
"24",0.150669774230772
"35",1.15309092428315
"46",0.479282622139765
"47",1.70242950877815
"48",1.84482533036765
"59",1.96197656978394',
stringsAsFactors=FALSE,
colClasses = c("character", "double")) -> xdf
# cross-platform-friendly d/l with caching built-in
try(httr::GET(
url = "http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/lpr_000a16a_e.zip",
httr::write_disk("~/Data/lpr_00a16a_e.zip"),
httr::progress()
)) -> res
fils <- unzip("~/Data/lpr_00a16a_e.zip", exdir = "~/Data/lpr")
ca_map <- st_read(grep("shp$", fils, value=TRUE), stringsAsFactors = FALSE)
ca_map <- st_simplify(ca_map, TRUE, 10) # you don't need the coastlines to be that detailed
ca_map <- left_join(ca_map, xdf, by=c("PRUID"="id"))
ggplot(ca_map) +
geom_sf(aes(fill = pch)) +
viridis::scale_fill_viridis(direction=-1, option="magma") +
coord_sf()
As an aside, even though I simplified the shapefile (for faster plotting) I'd hunt around for a light[er]-weight GeoJSON version of the provinces since the one you grabbed has super fine-grained coastlines and you absolutely don't need that for a choropleth.
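If the identifier really is the culprit, a quick check before joining is to compare the key columns directly. The sketch below uses made-up vectors rather than the real shapefile. It assumes the map's province codes arrive as character strings and that the data file's id was read as a number, which is only one of several ways the keys can fail to line up:

```r
# Hypothetical key column from the map layer (character codes).
map_keys <- c("10", "11", "12", "13")

# Hypothetical data file whose id column was read as numeric.
dat <- data.frame(id = c(10, 11, 12, 59), pch = c(0.67, 5.63, 2.12, 1.96))

# Bring both keys to the same type before comparing or joining.
dat$id <- as.character(dat$id)

# Codes present on one side only; both results should be empty
# for a complete one-to-one join.
missing_in_data <- setdiff(map_keys, dat$id)
missing_in_map  <- setdiff(dat$id, map_keys)

print(missing_in_data)
print(missing_in_map)
```

A non-empty result on either side tells you which regions will come out blank (or be dropped) in the map.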

Spatial data overlay selection in R

I'm trying to overlay some spatial data from a bigger SpatialPolygonsDataFrame (world size) to a smaller (country size), by doing these:
x <- c("rgdal", "dplyr",'ggplot2')
lapply(x, library, character.only = TRUE)
est<-readOGR(dsn='/estados_2010',layer='estados_2010')
est_f<-fortify(est)
est$id<-row.names(est)
est_f<-left_join(est_f,est@data)
zon<-readOGR(dsn='/Zonas Homogeneas/gyga_ed_poly.shp',layer='gyga_ed_poly')
zon_f<-fortify(zon)
zon$id<-row.names(zon)
zon_f<-left_join(zon_f,zon@data)
t<-ggplot()+geom_polygon(data=zon_f,aes(x=long,y=lat,group=group,fill=GRID_CODE))
t+geom_polygon(data=est_f,aes(x=long,y=lat,group=group),fill=NA,color='red')+coord_fixed(xlim=est_f$long,ylim=est_f$lat,1)
Which is resulting in this:
I want to select only what is plotted inside the polygon outlined in red.
If someone could help me with this issue, I would appreciate it.
PS: For those who want to reproduce the example themselves, the files are available on my Google Drive at the link below:
https://drive.google.com/open?id=0B6XKeXRlyyTDakx2cmJORlZqNUE
Thanks in advance.
Since you are using polygons to display the raster values, you can use a spatial selection via [ like in this reproducible example:
library(raster)
library(rgdal)
bra <- getData("GADM", country = "BRA", level = 1)
r <- getData("worldclim", res = 10, var = "bio")
r <- r[[1]]
r <- crop(r, bra)
r <- rasterToPolygons(r)
# bra and raster (now as polygons) have to have the same projection, thusly reproject!
bra <- spTransform(bra, CRSobj = proj4string(r))
here comes the magic!!
r <- r[bra, ]
let's look at the results:
library(ggplot2)
t <- ggplot()+
geom_polygon(data=r,aes(x=long,y=lat,group=group, fill = rep(r$bio1, each = 5)))
t +
geom_polygon(data=bra,aes(x=long,y=lat,group=group),fill=NA,color='red') + coord_map()

Drawing line on ggmap plot between two countries using long/lat

I am a total newbie to R and I would like to draw a line (possibly weighted, e.g., by the number of trips made) between two countries. Currently, I use the longitude and latitude of each capital to draw a line, but I would like to do it using the package ggmap. I looked around but have not found a solution so far. I would appreciate any quick help.
require(ggmap)
require (rworldmap)
all_content = readLines("ext_lt_intratrd_1_Data.csv")
skip_second = all_content[-2]
dat = read.csv(textConnection(skip_second), header = TRUE, stringsAsFactors =F)
dat[5,2]<- c("Germany") # using a data where the first line is
# header, but second line must be skipped as it is EU 27
# and not a single country
europe <- read.csv("eulonglat.csv", header = TRUE) # using world capitals to
# generate points
myfulldata <- merge(dat, europe)
map <- get_map(location = 'Europe', zoom = 4)
mapPoints <- ggmap(map) + geom_point(aes(x = UNc_longitude, y = UNc_latitude, size = log(myfulldata$Value)),
  data = myfulldata, col = "red", alpha = 0.5) # this can be plotted
# i would continue with drawing line and i searched for references
# i found arrows(42.66,23.34,50.82,4.47) - which did not work
# i tried to look for a reference work more, but could not find
# instead i found it using with the package rworldmap the following
lines(c(4.47, 23.32), c(50.82, 42.66))
# this does not work on ggmap

How can I plot shapefile loaded through fastshp in ggplot2?

I stumbled upon fastshp library and according to description (and my quick cursory tests) it really does offer improvements in time of reading large shapefiles compared to three other methods.
I'm using read.shp function to load exemplary dataset from maptools package:
library("maptools")
setwd(system.file("shapes", package="maptools"))
shp <- read.shp("columbus.shp", format="polygon")
I chose 'polygon' format since according to the docs:
This is typically the preferred format for plotting.
My question is how can I plot these polygons using ggplot2 package?
Since read.shp in the fastshp package returns the polygon data in the form of a list of lists, it is then a matter of reducing it to a single dataframe required for plotting in ggplot2.
library(fastshp)
library(ggplot2)
setwd(system.file("shapes", package="maptools"))
shp <- read.shp("columbus.shp", format="polygon")
shp.list <- sapply(shp, FUN = function(x) do.call(cbind, x[c("id","x","y")]))
shp.df <- as.data.frame(do.call(rbind, shp.list))
shp.gg <- ggplot(shp.df, aes(x = x, y=y, group = id))+geom_polygon()
EDIT: Based on @otsaw's comment regarding polygon holes, the following solution requires a couple more steps but ensures that the holes are plotted last. It takes advantage of the fact that shp.df$hole is logical, so polygons with hole==TRUE will be plotted last.
library(sp) # provides Polygon() and Polygons()
shp.list <- sapply(shp, FUN = function(x) Polygon(cbind(lon = x$x, lat = x$y)))
shp.poly <- Polygons(shp.list, "area")
shp.df <- fortify(shp.poly, region = "area")
shp.gg <- ggplot(shp.df, aes(x = long, y=lat, group = piece, order = hole))+geom_polygon()
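The flattening step in the first snippet of this answer can be tried without fastshp installed. This is a minimal sketch on a hand-made list that only imitates the id, x and y elements used above. The real read.shp() output carries more fields, and multi-part polygons would need extra handling that is not shown here:

```r
# Hand-made stand-in for a list of polygons, each with an id and
# vectors of x/y coordinates (an assumption, not real fastshp output).
shp_like <- list(
  list(id = 1L, x = c(0, 1, 1, 0), y = c(0, 0, 1, 0)),
  list(id = 2L, x = c(2, 3, 3),    y = c(2, 2, 3))
)

# One small data frame per polygon (the id is recycled along the
# coordinates), then stack them into a single long data frame.
shp_df <- do.call(
  rbind,
  lapply(shp_like, function(p) data.frame(id = p$id, x = p$x, y = p$y))
)

print(shp_df)
```

The resulting data frame has one row per vertex and can be passed to ggplot() with group = id, as in the first snippet.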
