I am attempting to make a map with three layers using ggmap. The layers are as follows:
A map of the US (toner-lite)
A set of geometries that color the states by some value (simulated data below)
Labels for the state names, annotated at the center of each state.
To do this I have created a map of US states colored by a randomized value (rnorm), and that part is successful. From there I attempt to print each state's abbreviation at the longitude and latitude of its center, using geom_text. The geom_text overlay is the part that fails, with the following error:
Error: 'x' and 'units' must have length > 0
In addition: Warning messages:
1: In gpclibPermit() : support for gpclib will be withdrawn from maptools at the next major release
2: Removed 855070 rows containing missing values (geom_text).
Here is the script, which I have worked hard to make self-contained. It will download the shapefile and the state-center data, and simulate values to fill the states. I've tested it and it works up to the part I have commented out (the geom_text layer).
I have searched for answers to this already, so please let me know if you have any advice on how to do what I am attempting. If there is a better strategy for placing labels on top of the polygon fills, I am all ears (or eyes in this case).
###Combining Census data with a tract poly shapefile
library(maptools)
library(ggplot2)
library(gpclib)
library(ggmap)
library(rgdal)
library(dplyr)
#Set working directory to where you want your files to exist (or where they already exist)
setwd('~/Documents/GIS/USCensus/')
#Read and translate coord data for shape file of US States
if(!file.exists('tl_2014_us_state.shp')){
  download.file('ftp://ftp2.census.gov/geo/tiger/TIGER2014/STATE/tl_2014_us_state.zip',
                'tl_2014_us_state.zip')
  unzip('tl_2014_us_state.zip')
}
tract <- readOGR(".", "tl_2014_us_state") %>%
  spTransform(CRS("+proj=longlat +datum=WGS84"))
#two column dataset of state abbreviations and center of state
#Downloadable from: https://dev.maxmind.com/static/csv/codes/state_latlon.csv
if(!file.exists('state_latlon.csv')){
download.file('http://dev.maxmind.com/static/csv/codes/state_latlon.csv','state_latlon.csv')
}
centers <- read.csv('state_latlon.csv')
#Rename the longitude and latitude columns of the state center data so they do not collide with the shapefile columns at merge
names(centers)[2:3] <- c('long_c','lat_c')
#simulated data for plotting values
mydata <- data.frame(rnorm(55, 0, 1)) # 55 "states" in the coord dataset for state centers
names(mydata)[1] <- 'value'
#Hold the names of the tract dataset and the simulated data
ntract <- names(tract)
ndata <- names(mydata)
#Turn geo data into R dataframe
gpclibPermit()
tract_geom <- fortify(tract, region = "STUSPS")
#Combine state center data with the simulated values
state_data <- cbind(centers, mydata)
#Merge state center and value data with the shapefile data
tract_poly <- merge(state_data, tract_geom, by.x = "state", by.y = "id", all = F)
tract_poly <- tract_poly[order(tract_poly$order),]
#Create map of US
mymap <- get_stamenmap(bbox = c(left = -124.848974,
bottom = 24.396308,
right = -66.885444,
top = 49.384358),zoom=5,
maptype="toner-lite")
#This plots a map of the US with just the state names as labels (and a few other landmarks). Used for reference
USMap <- ggmap(mymap,extent='device') +
geom_polygon(aes(x = long, y = lat, group = group, fill = value),
data = tract_poly,
alpha = 1,
color = "black",
size = 0.2) #+
# geom_text(aes(x = long_c, y = lat_c, group = group, label = state),
# data= tract_poly,
# alpha = 1,
# color = "black")
USMap
That's a strange error message for what ended up being the problem: somewhere along the way you have flipped the latitude and longitude of centers. (I also took elpi's advice above into account and avoided plotting the initials repeatedly by using your centers dataset directly.) The code below works, but I'd recommend fixing your centers dataset.
centers$new_long <- centers$lat_c
centers$new_lat <- centers$long_c
USMap <- ggmap(mymap,extent='device') +
geom_polygon(aes(x = long, y = lat, group = group, fill = value),
data = tract_poly,
alpha = 1,
color = "black",
size = 0.2) +
geom_text(aes(x = new_long, y = new_lat, label = state),
data= centers,
alpha = 1,
color = "black")
Try this:
centroids <- setNames(
  do.call("rbind.data.frame",
          by(tract_poly, tract_poly$group, function(x) {
            Polygon(x[c('long', 'lat')])@labpt
          })),
  c('long', 'lat'))
centroids$label <- tract_poly$state[match(rownames(centroids), tract_poly$group)]
USMap + with(centroids, annotate(geom = "text", x = long, y = lat, label = label, size = 2.5))
I'm trying to create a map of Europe with grid cells coloured by the number of records within each cell. I attach an image illustrating the desired output (see Fig. 1 of https://doi.org/10.3897/phytokeys.74.9723).
In order to produce this image I have developed a minimal reproducible example with random points distributed across Europe. I have been able to produce a similar figure with levelplot, but I'm particularly interested in doing this with ggplot as it allows further customising. Is it possible to produce a similar figure with ggplot? And if so, any advice on what path I should follow?
Note: The size of the grid cells is irrelevant at the moment; I'll adjust it depending on point density. All cells have to be the same size, as in the first example, and they will only differ in their colour.
#Load libraries
library(rgdal) #v1.5-28
library(rgeos) #v.0.5-9
library(ggplot2) # 3.3.5
library(rworldmap) #plot worldmap v.1.3-6
library(dplyr) #v.1.0.7
#Create dataframe of coordinates that fall in Europe
coord <- data.frame(cbind(runif(1000,-15,45),runif(1000,30,75)))
colnames(coord) <- c("long","lat")
#Exclude ocean points following this post
URL <- "http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/physical/ne_110m_ocean.zip"
fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)
fils <- unzip(fil)
oceans <- readOGR(grep("shp$", fils, value=TRUE), "ne_110m_ocean",
stringsAsFactors=FALSE, verbose=FALSE)
europe_coord <- data.frame(long = coord$long,
lat = coord$lat)
coordinates(europe_coord) <- ~long+lat
proj4string(europe_coord) <- CRS(proj4string(oceans))
ocean_points <- over(europe_coord, oceans)
#Add ocean points to dataset
coord$ocean <- ocean_points$featurecla
#Exclude ocean points
europe_land <- coord %>% filter(is.na(ocean))
#Load worldmap
world <- map_data("world")
#Plot europe spatial data
ggplot() + geom_map(data = world, map = world,
aes(long, lat, map_id = region), color = "white",
fill = "lightgray", size = 0.1) +
geom_point(data = europe_land,aes(long, lat),
alpha = 0.7, size = 0.05) + ylim(0,70) +
coord_sf(xlim = c(-15, 45), ylim = c(30, 75), expand = FALSE)
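One possible approach (a sketch, not from the original thread) is to let ggplot2 bin the points itself with geom_bin2d, which counts the records falling in each rectangular cell and maps the count to fill; cell size is controlled with binwidth:
# Sketch: bin points into 1 x 1 degree cells and colour each cell by its count
ggplot() + geom_map(data = world, map = world,
aes(long, lat, map_id = region), color = "white",
fill = "lightgray", size = 0.1) +
geom_bin2d(data = europe_land, aes(long, lat), binwidth = c(1, 1)) +
scale_fill_gradient(name = "Records", low = "yellow", high = "red") +
coord_sf(xlim = c(-15, 45), ylim = c(30, 75), expand = FALSE)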
I need to create a heat map with 3-digit ZIP code boundaries.
I have 3-digit ZIP and count data like this:
zip <- c(790, 791, 792, 793)
count <- c(0, 100, 20, 30)
TX <- data.frame(zip, count)
I also drew a Texas map:
library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
states <- map_data("state")
texas<- subset(states, region =="texas")
ggplot(data = texas) +
geom_polygon(aes(x = long, y = lat), fill = "gray", color = "black")
What I want to achieve is to (1) draw the boundaries of the 3-digit ZIP code areas and (2) create the heat map using the count column. The outcome should look like this, with heat-map coloring.
This question does not contain reproducible sample data, so I needed a good amount of time to deliver the following. Please provide minimal reproducible data and the code you tried next time. (I doubt you really invested time in seriously writing your code.)
Anyway, I think getting good polygon data for US zip codes is difficult without paying some money. This question provides good information. I obtained data from this link since it was accessible. You will have to find whatever polygon data suits you.
I also obtained data for the zip codes in Texas from here and saved it as "zip_code_database.csv."
I added an explanation for each piece of code below, so I won't write a thorough explanation here. Basically, you need to merge polygons by extracting the first three digits of the zip codes. You also need to aggregate whatever value you have in your data by 3-digit zip code. The other task is to find the center point of each polygon so the zip codes can be added as labels.
library(tidyverse)
library(rgdal)
library(rgeos)
library(maptools)
library(ggalt)
library(ggthemes)
library(ggrepel)
library(RColorBrewer)
# Prepare the zip poly data for US
mydata <- readOGR(dsn = ".", layer = "cb_2016_us_zcta510_500k")
# Texas zip code data
zip <- read_csv("zip_code_database.csv")
tx <- filter(zip, state == "TX")
# Get polygon data for TX only
mypoly <- subset(mydata, ZCTA5CE10 %in% tx$zip)
# Create a new group with the first three digit.
# Drop unnecessary factor levels.
# Add a fake numeric variable, which is used for coloring polygons later.
mypoly$group <- substr(mypoly$ZCTA5CE10, 1,3)
mypoly$ZCTA5CE10 <- droplevels(mypoly$ZCTA5CE10)
set.seed(111)
mypoly$value <- sample.int(n = 10000, size = nrow(mypoly), replace = TRUE)
# Merge polygons using the group variable
# Create a data frame for ggplot.
mypoly.union <- unionSpatialPolygons(mypoly, mypoly$group)
mymap <- fortify(mypoly.union)
# Check how polygons are like
plot(mypoly)
plot(mypoly.union, add = T, border = "red", lwd = 1)
# Convert SpatialPolygons to data frame and aggregate the fake values
mypoly.df <- as(mypoly, "data.frame") %>%
group_by(group) %>%
summarise(value = sum(value))
# Find a center point for each zip code area
centers <- data.frame(gCentroid(spgeom = mypoly.union, byid = TRUE))
centers$zip <- rownames(centers)
# Finally, drawing a graphic
ggplot() +
geom_cartogram(data = mymap, aes(x = long, y = lat, map_id = id), map = mymap) +
geom_cartogram(data = mypoly.df, aes(fill = value, map_id = group), map = mymap) +
geom_text_repel(data = centers, aes(label = zip, x = x, y = y), size = 3) +
scale_fill_gradientn(colours = rev(brewer.pal(10, "Spectral"))) +
coord_map() +
theme_map()
I am trying to overlay a raster layer onto a map in ggplot. The raster layer contains likelihood surfaces for each time point from a satellite tag. I also want to set cumulative probabilities (95%, 75%, 50%) on the raster layer.
I have figured out how to show the raster layer on the ggplot map, but the coordinates are not aligned with one another. I tried giving each the same projection, but it does not seem to be working. I want them both to fit the boundaries of my model (xmin = 149, xmax = 154, ymin = -14, ymax = -8.75).
Attached are my R code and the resulting figure:
#load data (requires the ncdf4, raster, and rgdal packages)
library(ncdf4)
library(raster)
library(rgdal)
ncname <- "152724-13-GPE3"
ncfname <- paste(ncname, ".nc", sep = "")
ncin <- nc_open(ncfname)
StackedObject <- stack("152724-13-GPE3.nc", varname = "monthly_residency_distributions")
MergedObject <- overlay(StackedObject, fun = mean)
MergedObject[is.na(MergedObject)] <- 0
Boundaries <- extent(c(149, 154, -14, -8.75))
ExtendedObject <- extend(MergedObject, Boundaries)
Raster.big <- raster(ncol = 1200, nrow = 900, ext = Boundaries)
Raster.HR <- resample(x = ExtendedObject, y = Raster.big, method = "bilinear")
Raster.HR@data@values <- Raster.HR@data@values / sum(Raster.HR@data@values)
RasterVals <- sort(Raster.HR@data@values)
Raster.breaks <- c(RasterVals[max(which(cumsum(RasterVals) <= 0.05))],
                   RasterVals[max(which(cumsum(RasterVals) <= 0.25))],
                   RasterVals[max(which(cumsum(RasterVals) <= 0.50))],
                   1)
Raster.cols<-colorRampPalette(c("yellow","orange","red"))
RasterCols<- c(Raster.cols(3))
#Create Map
shape2 <- readOGR(dsn = "/Users/shannonmurphy/Desktop/PNG_adm/PNG_adm1.shp", layer = "PNG_adm1")
map<- crop(shape2, extent(149, 154, -14, -8.75))
projection(map)<- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
p <- ggplot() + geom_polygon(data = map, aes(x = long, y = lat, group = group), color = "black", size = 0.25) + coord_map()
projection(Raster.HR)<- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
#plot raster and ggplot
par(mfrow=c(1,1))
plot(p)
par(mfrow=c(1,1), new = TRUE)
plot(Raster.HR, col=RasterCols, breaks=Raster.breaks, legend = NULL, bbox(map))
Please let me know if there is another package or approach I should be using to do this! I appreciate any help.
OK, I understand. You want to plot multiple raster layers in ggplot, or you want the raster object drawn over your background polygon object. The problem with rasterVis::gplot is that it plots the raster directly and does not allow adding another one under or over it. You reminded me that I once had the same need and modified gplot to retrieve the data as a tibble, so that you can then reshape it as much as you want with dplyr before plotting with ggplot2. Thanks for the reminder; I have added it to my current GitHub library for later use!
Let's use a reproducible example to show this function:
Create datasets:
Create a map of the world as a raster, to be used as the background raster map
Create a raster of data, here the distance from a point (limited to a maximum distance)
The code:
library(raster)
# Get world map
library(maptools)
data(wrld_simpl)
# Transform World as raster
r <- raster(wrld_simpl, res = 1)
wrld_r <- rasterize(wrld_simpl, r)
# Lets create a raster of data
pt1 <- matrix(c(100,0), ncol = 2)
dist1 <- distanceFromPoints(r, pt1)
values(dist1)[values(dist1) > 5e6] <- NA
plot(dist1)
# Plot both
plot(wrld_r, col = "grey")
plot(dist1, add = TRUE)
Function to extract (part of) the raster values and transform them into a tibble
#' Transform raster as data.frame to be later used with ggplot
#' Modified from rasterVis::gplot
#'
#' @param x A Raster* object
#' @param maxpixels Maximum number of pixels to use
#'
#' @details rasterVis::gplot is nice to plot a raster in a ggplot but
#' if you want to plot different rasters on the same plot, you are stuck.
#' If you want to add other information or transform your raster as a
#' category raster, you cannot do it. With `SDMSelect::gplot_data`, you retrieve your
#' raster as a tibble that can be modified as wanted using `dplyr` and
#' then plotted in `ggplot` using `geom_tile`.
#' If the Raster has levels, they will be joined to the final tibble.
#'
#' @export
gplot_data <- function(x, maxpixels = 50000) {
x <- raster::sampleRegular(x, maxpixels, asRaster = TRUE)
coords <- raster::xyFromCell(x, seq_len(raster::ncell(x)))
## Extract values
dat <- utils::stack(as.data.frame(raster::getValues(x)))
names(dat) <- c('value', 'variable')
dat <- dplyr::as.tbl(data.frame(coords, dat))
if (!is.null(levels(x))) {
dat <- dplyr::left_join(dat, levels(x)[[1]],
by = c("value" = "ID"))
}
dat
}
Plot multiple rasters in ggplot
You can use gplot_data to transform any raster into a tibble. You are then able to apply any modification using dplyr and plot it in ggplot with geom_tile. The interesting point is that you can use geom_tile as many times as you want with different raster data, provided the fill option is comparable. Otherwise, you can use the trick below to remove NA values in the background raster map and use a single fill colour.
# With gplot_data
library(ggplot2)
# Transform rasters as data frame
gplot_wrld_r <- gplot_data(wrld_r)
gplot_dist1 <- gplot_data(dist1)
# To define a unique fill colour for the world map,
# you need to remove NA values in gplot_wrld_r which
# can be done with dplyr::filter
ggplot() +
geom_tile(data = dplyr::filter(gplot_wrld_r, !is.na(value)),
aes(x = x, y = y), fill = "grey20") +
geom_tile(data = gplot_dist1,
aes(x = x, y = y, fill = value)) +
scale_fill_gradient("Distance",
low = 'yellow', high = 'blue',
na.value = NA) +
coord_quickmap()
Plot raster over polygons
Of course, with a background map as a polygon object, this trick also lets you add your raster over it:
wrld_simpl_sf <- sf::st_as_sf(wrld_simpl)
ggplot() +
geom_sf(data = wrld_simpl_sf, fill = "grey20",
colour = "white", size = 0.2) +
geom_tile(data = gplot_dist1,
aes(x = x, y = y, fill = value)) +
scale_fill_gradient("Distance",
low = 'yellow', high = 'blue',
na.value = NA)
EDIT: gplot_data is now in this simple R package: https://github.com/statnmap/cartomisc
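For reference, a minimal sketch of installing it from GitHub (assuming the remotes package is available; the package name is taken from the URL above):
# Sketch: install cartomisc, which now provides gplot_data
remotes::install_github("statnmap/cartomisc")
library(cartomisc)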
I am trying to change the background colors in a US map displaying the presidential results for each state. I have read many posts about changing these colors, but I was not able to change any of them. Below are my code, a link to the dataset, and a snapshot of what I am getting:
#install.packages("ggplot2")
#install.packages("ggmap")
#install.packages("plyr")
#install.packages("raster")
#install.packages("stringr")
library(ggplot2) # for plotting and miscellaneous things
library(ggmap) # for plotting
library(plyr) # for merging datasets
library(raster) # to get map shape file
library(stringr) # for string operation
# Get geographic data for USA
usa.shape<-getData("GADM", country = "usa", level = 1)
# Creating a data frame of map data
usa.df <- map_data("state")
#rename 'region' as 'state' and make it a factor variable
colnames(usa.df)[5] <- "State"
usa.df$State <- as.factor(usa.df$State)
#set working directory
setwd("C:/Users/Ashish/Documents/Stats projects/2/")
#input data from file separated by commas
usa.dat <- read.csv("data1.csv", header = T)
# printing data structure
str(usa.df)
# removing % sign from the data, and converting percentage win to numeric
usa.dat$Clinton <- as.numeric(sub("%","",usa.dat$Clinton))/1
usa.dat$Trump <- as.numeric(sub("%","",usa.dat$Trump))/1
usa.dat$Others <- as.numeric(sub("%","",usa.dat$Others))/1
# Creating a winner column based on the percentage
usa.dat$Winner = "Trump"
usa.dat[usa.dat$Clinton > usa.dat$Trump,]$Winner = "Clinton"
usa.dat$State <- tolower(usa.dat$State)
# Creating a chance column which corresponds to winning percentage of the candidate
usa.dat$chance <- usa.dat$Trump
a <- usa.dat[usa.dat$Clinton > usa.dat$Trump,]
usa.dat[usa.dat$Clinton > usa.dat$Trump,]$chance <- a$Clinton
# display the internal structure of the object
usa.dat
#join the usa.df and usa.dat objects on state variable
usa.df <- join(usa.df, usa.dat, by = "State", type = "inner")
str(usa.df)
states <- data.frame(state.center, state.abb) # centers of states and abbreviations
#function for plotting different regions of USA map based on the input data showing different coloring scheme
#for each state.
p <- function(data, title) {
ggp <- ggplot() +
# Draw borders of states
geom_polygon(data = data, aes(x = long, y = lat, group = group,
fill = Winner, alpha=chance), color = "black", size = 0.15) +
#scale_alpha_continuous(range=c(0,1))+
scale_color_gradientn(colours = c("#F08080","white","#5DADE2"),breaks = c(0,50,100),
labels=c("Clinton","Equal","Trump"),
limits=c(0,100),name="Election Forecast") +
# Add state abbreviations
geom_text(data = states, aes(x = x, y = y, label = state.abb), size = 2)+
guides(fill = guide_legend(direction='vertical', title='Candidate', label=TRUE, colours=c("red", "blue")))
return(ggp)
}
figure.title <- "2016 presidential election result"
# Save the map to a file for viewing (you can plot on the screen also, but it takes
# much longer that way. The ratio of US height to width is 1:1.9.)
#print(p(usa.df, brks.to.use, figure.title))
ggsave(p(usa.df, figure.title), height = 4, width = 4*1.9,
file = "election_result.jpg")
I would like to get the same coloring scheme as displayed in the election-forecast gradient.
Thanks to Alistaire for providing his valuable feedback and the solution to the above problem. Using scale_fill_brewer(type = 'qual', palette = 6) with ggplot() resolves the issue in R.
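For context, a minimal sketch of how that scale could slot into the plotting function p() above (an assumption on my part: since fill is mapped to the qualitative Winner variable, the gradient scale is replaced by the brewer scale):
# Sketch: inside p(), swap the gradient scale for a qualitative brewer palette
ggp <- ggplot() +
geom_polygon(data = data, aes(x = long, y = lat, group = group,
fill = Winner, alpha = chance), color = "black", size = 0.15) +
scale_fill_brewer(type = 'qual', palette = 6) +
geom_text(data = states, aes(x = x, y = y, label = state.abb), size = 2)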
I have a problem with my heatmap: it displays the density LEVEL but doesn't say anything about the density count (for example, how many points are in the same area).
My data is divided in more columns, but the most important ones are: lat,lon.
I would like to have something like this, but with "count": https://stackoverflow.com/a/24615674/5316566. However, when I try to apply the code used in that answer, my maximum-"level" density doesn't reflect my density count. (Instead of 7500 I receive, for example, 6, even though I have thousands and thousands of concentrated data points.)
That's my code:
us_map_g_str <- get_map(location = c(-90.0,41.5,-81.0,42.7), zoom = 7)
ggmap(us_map_g_str, extent = "device") +
geom_tile(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat)), size = 0.3) +
stat_density2d(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat), fill = ..level.., alpha = ..level..), size = 0.3, bins = 10, geom = "polygon") +
scale_fill_gradient(name= "Ios",low = "green", high = "red", trans= "exp") +
scale_alpha(range = c(0, 0.3), guide = FALSE)
This is what I get:
This is part of the data:
lat lon tag device
1 43.33622 -83.67445 0 iPhone5
2 43.33582 -83.69964 0 iPhone5
3 43.33623 -83.68744 0 iPhone5
4 43.33584 -83.72186 0 iPhone5
5 43.33616 -83.67526 0 iPhone5
6 43.25040 -83.78234 0 iPhone5
(The "tag" column is not important)
REVISED
I realised that my previous answer needed to be revised, so here it is. If you want to find out how many data points exist within each level of a contour, you actually have a lot of work to do. If you are happy with the leaflet option below, your life will be much easier.
First, let's get a map of Detroit, and create a sample data frame.
library(dplyr)
library(ggplot2)
library(ggmap)
mymap <- get_map(location = "Detroit", zoom = 8)
### Create a sample data
set.seed(123)
mydata <- data.frame(long = runif(min = -84, max = -82.5, n = 100),
lat = runif(min = 42, max = 42.7, n = 100))
Now, we draw a map and save it as g.
g <- ggmap(mymap) +
stat_density2d(data = mydata,
aes(x = long, y = lat, fill = ..level..),
size = 0.5, bins = 10, geom = "polygon")
The real job begins here. In order to find the number of data points at each level, you want to employ the data frame that ggplot generates. This data frame holds the polygons used to draw the level lines. You can see them in the following image, in which I drew three levels on a map.
### Create a data frame so that we can find how many data points exist
### in each level.
mydf <- ggplot_build(g)$data[[4]]
### Check where the polygon lines are. This is just for a check.
check <- ggmap(mymap) +
geom_point(data = mydata, aes(x = long, y = lat)) +
geom_path(data = subset(mydf, group == "1-008"), aes(x = x, y = y)) +
geom_path(data = subset(mydf, group == "1-009"), aes(x = x, y = y)) +
geom_path(data = subset(mydf, group == "1-010"), aes(x = x, y = y))
The next step is to create a level vector for a legend. We group the data by group (e.g., 1-010) and take the first row of each group using slice(). Then we ungroup the data, choose the 2nd column, and create a vector with unlist(). We come back to lev at the end.
mydf %>%
group_by(group) %>%
slice(1) %>%
ungroup %>%
select(2) %>%
unlist -> lev
Now we split the polygon data (i.e., mydf) by group and create a polygon for each level. Since we have 11 levels (11 polygons), we use lapply(). In the lapply loop, we need to: 1) extract the columns for longitude and latitude, 2) create a polygon, 3) convert the polygons to spatial polygons, 4) assign a CRS, 5) create a dummy data frame, and 6) create SpatialPolygonsDataFrames.
library(sp) # provides Polygon(), SpatialPolygons(), CRS(), and SpatialPolygonsDataFrame()
mylist <- split(mydf, f = mydf$group)
test <- lapply(mylist, function(x){
xy <- x[, c(3,4)]
circle <- Polygon(xy, hole = as.logical(NA))
SP <- SpatialPolygons(list(Polygons(list(circle), ID = "1")))
proj4string(SP) <- CRS("+proj=longlat +ellps=WGS84")
df <- data.frame(value = 1, row.names = "1")
circleDF <- SpatialPolygonsDataFrame(SP, data = df)
})
Now we go back to the original data. What we need to do is convert the data frame to a SpatialPointsDataFrame. This is because we need to subset the data and find how many data points exist in each polygon (at each level). First, get long and lat from your data frame. Make sure the order is lon/lat.
xy <- mydata[,c(1,2)]
Then, we create the SpatialPointsDataFrame. You want an identical proj4string between the spatial polygons and the spatial points data.
spdf <- SpatialPointsDataFrame(coords = xy, data = mydata,
proj4string = CRS("+proj=longlat +ellps=WGS84"))
Then, we subset data (mydata) using each polygon.
ana <- lapply(test, function(y){
mydf <- as.data.frame(spdf[y, ])
})
Data points overlap across levels, so we have duplication. First we find the unique data points for each level. We bind the data frames in ana to create a single data frame, foo1. We also create foo2, the data frame whose unique data points we want to count, and make sure its column names are identical to those of foo1. Using setdiff() and nrow(), we can find the number of unique data points at each level.
total <- lapply(11:2, function(x){
foo1 <- bind_rows(ana[c(11:x)])
foo2 <- as.data.frame(ana[x-1])
names(foo2) <- names(foo1)
nrow(setdiff(foo2, foo1))
})
Finally, we need to find the number of data points in the innermost level, level 11. We take the data frame for level 11 from ana, convert it to a data frame, and count the rows.
bob <- nrow(as.data.frame(ana[11]))
out <- c(bob,unlist(total))
### check if total is 100
### sum(out)
### [1] 100
We assign the reversed out as the names of lev. This is because we want the legend to show how many data points exist at each level.
names(lev) <- rev(out)
Now we are ready to add a legend.
final <- g +
scale_fill_continuous(name = "Total",
guide = guide_legend(),
breaks = lev)
final
LEAFLET OPTION
If you use the leaflet package, you can group data points at different zoom levels. Leaflet counts the data points in an area and shows the number in a circle, as in the following figure. The more you zoom in, the more leaflet breaks the data points into smaller groups. In terms of workload this is much lighter, and in addition your map is interactive. This may be a better option.
library(leaflet)
leaflet(mydata) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions())