I'm trying to display labels on GIS polygon features in R using the st_centroid function in the sf library. Unfortunately, while the head() function seems to show that each polygon has different x and y coordinates associated with it, all labels get rendered overlapping at a single point on the map (which is apparently the centroid of one particular polygon). What am I doing wrong here?
Current code setup:
library("ggplot2")
library("sf")
sf::sf_use_s2(FALSE) #makes centroids not break
world <- st_read("C:/prgrm/gis/source/10m_land_and_islands.shp")
prov <- st_read("C:/prgrm/gis/edited ncm/ncm_provinces.shp")
prov <- cbind(prov, st_coordinates(st_centroid(prov))) #attaches centroids to 'prov' dataset
head(prov)
ggplot(data = world) +
geom_sf() +
geom_sf(data=prov, aes(fill="blue")) +
geom_text(data=prov, aes(X,Y, label=provname_r), size=5) +
coord_sf(xlim=c(-2000000,1000000),ylim=c(-1500000, 3000000), crs=st_crs(3310))
You may be better off specifying the label placement via the fun.geometry argument of the geom_sf_text() call. By the way, the default is sf::st_point_on_surface(), which is a good default because it makes sure the label is not placed inside a hole, should the polygon have one.
Consider this example, using the well known & much loved nc.shp shapefile that ships with {sf}.
library(sf)
library(ggplot2)
# in place of your world dataset
shape <- st_read(system.file("shape/nc.shp", package="sf")) # included with sf package
# in place of your prov dataset
ashe <- shape[1, ]
ggplot(data = shape) +
geom_sf() +
geom_sf(data = ashe, fill = "blue") +
geom_sf_text(data = ashe,
aes(label = NAME),
color = "red",
fun.geometry = st_centroid)
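If you do want to keep geom_text() with explicit X/Y columns, one possible cause of the overlapping labels (the question does not confirm it) is a CRS mismatch: coord_sf() reprojects sf layers, but by default it treats the x/y of non-sf layers such as geom_text() as already being in the plot CRS. Below is a minimal sketch of computing the centroids in the target CRS first. It uses nc.shp again, and EPSG:32119 is only an illustrative choice of projected CRS, not something taken from the question.

```r
library(sf)
library(ggplot2)

# nc.shp stands in for the 'prov' dataset from the question
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Assumption: an illustrative projected CRS (NAD83 / North Carolina, metres)
target_crs <- st_crs(32119)

# Project first, then compute centroids, so X/Y are in the plot's CRS
nc_proj <- st_transform(nc, target_crs)
centers <- st_coordinates(st_centroid(st_geometry(nc_proj)))
nc_proj$X <- centers[, "X"]
nc_proj$Y <- centers[, "Y"]

p <- ggplot(nc_proj) +
  geom_sf() +
  geom_text(aes(X, Y, label = NAME), size = 2) +
  coord_sf(crs = target_crs)
```

Printing p should show one label per county. Recent ggplot2 versions also have a default_crs argument in coord_sf() that tells it how to interpret non-sf coordinates, which may be an alternative to transforming the data.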
I am using sf and ggplot2 to read shapefiles as simple features and plot various maps. I have been working through the maps chapter in the ggplot2 book but could not really find an answer to the following issue:
Plotting a map using geom_sf and labelling its features with geom_sf_text is a pretty straightforward task.
library(ggplot2)
library(sf)
library(ozmaps)
oz_states <- ozmaps::ozmap_states
ggplot() +
geom_sf(data = oz_states) +
geom_sf_text(data = oz_states, aes(label = NAME))
Once we zoom in on a section of the previous map, not all labels of the features present in the plot are visible.
xlim <- c(120.0, 140.0)
ylim <- c(-40, -24)
ggplot() +
geom_sf(data = oz_states) +
geom_sf_text(data = oz_states, aes(label = NAME)) +
coord_sf(xlim = xlim, ylim = ylim)
I have found a workaround to zoom in on sections of the map and still be able to label the features present in the plot by calculating the centroids of the features, extracting the coordinates as separate columns, selecting the elements I would like to be displayed in the final map, and using ggrepel to label them.
library(dplyr)
library(ggrepel)
oz_states_labels <- oz_states %>% st_centroid()
oz_states_labels <- do.call(rbind, st_geometry(oz_states_labels)) %>%
as_tibble() %>%
rename(x = V1) %>%
rename(y = V2) %>%
cbind(oz_states_labels) %>%
slice(4,5,7,3)
ggplot() +
geom_sf(data = oz_states) +
geom_text_repel(data = oz_states_labels, aes(label = NAME, x = x, y = y)) +
coord_sf(xlim = xlim, ylim = ylim)
Naturally, if possible, I would like to avoid the workaround of first having to calculate the centroids, extract the coordinates from the resulting sf and select the labels to be shown in the final map.
Hence my question: Is there a faster way of labelling all elements visible in the plot for example by specifying this either in geom_sf_text or coord_sf?
Thanks in advance for your tips and answers!
I believe the issue you are facing is caused by applying the crop at the presentation level; the actual data underlying your ggplot object is not cropped, so each label is still placed relative to the full, uncropped polygon.
I suggest applying the crop at data level, for example via sf::st_crop(). In this example I am using the values of your xlim and ylim objects to create a bounding box (called crop_factor for no good reason) to limit the extent of the oz_states at the data level, by creating a new object called oz_cropped & continuing in your original workflow.
All the centroids and labels and what not will be much better behaved now.
library(ggplot2)
library(sf)
library(ozmaps)
oz_states <- ozmaps::ozmap_states
crop_factor <- st_bbox(c(xmin = 120,
xmax = 140,
ymax = -24,
ymin = -40),
crs = st_crs(oz_states))
oz_cropped <- st_crop(oz_states, crop_factor)
ggplot() +
geom_sf(data =oz_cropped) +
geom_sf_text(data = oz_cropped, aes(label = NAME))
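A variation on the same idea, in case you want the polygons drawn uncropped and only the labels positioned inside the visible window: use the cropped object for the text layer only and keep coord_sf() for the zoom. This is a sketch under assumptions, using nc.shp instead of ozmaps so that it runs with only {sf} and {ggplot2}; the window coordinates are made up for illustration.

```r
library(sf)
library(ggplot2)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical zoom window, in the same CRS as the data
xlim <- c(-81, -78)
ylim <- c(35, 36.6)
crop_box <- st_bbox(c(xmin = xlim[1], xmax = xlim[2],
                      ymin = ylim[1], ymax = ylim[2]),
                    crs = st_crs(nc))

# Cropped copy is used only to position the labels
nc_cropped <- suppressWarnings(st_crop(nc, crop_box))

p <- ggplot() +
  geom_sf(data = nc) +                               # full geometry
  geom_sf_text(data = nc_cropped, aes(label = NAME), # labels from visible parts
               size = 2) +
  coord_sf(xlim = xlim, ylim = ylim)
```

Only features that intersect the window end up in nc_cropped, so only those get a label, and each label is placed on the part of the polygon that is actually visible.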
I've imported a shapefile of the world's oceans from Natural Earth to R via readOGR. When I try to render it in ggplot, it fills in land over N. & S. America. The behaviour is inconsistent with QGIS & ArcMap, both of which render & fill the shapefile just fine. Any ideas?
download.file("https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/physical/ne_50m_ocean.zip" , destfile="./ne_50m_ocean.zip")
system("unzip ./ne_50m_ocean.zip")
wrld <- readOGR(dsn=getwd(),layer="ne_50m_ocean")
wrld <- tidy(wrld)
ggplot() + geom_polygon(data = wrld, aes(x = long, y = lat, group = group), colour = "black", fill = "blue")
screenshot of RStudio render
screenshot of QGIS render
I was able to resolve this using read_sf() instead of readOGR(), then tweaking the ggplot code to accommodate. Also worked for the next bit of my workflow which involved rasterizing the sf object using fasterize() for a mask ocean layer (simplified demo code included in case useful to others):
#load packages
library(sf)
library(ggplot2)
library(raster)
library(fasterize)
#get data
download.file("https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/physical/ne_50m_ocean.zip" , destfile="./ne_50m_ocean.zip")
system("unzip ./ne_50m_ocean.zip")
wrld <- read_sf(dsn=getwd(),layer="ne_50m_ocean")
#plot sf object
ggplot() + geom_sf(data=wrld, colour = "black", fill = "blue")
#rasterize sf object
r <- raster(ncol=720, nrow=360)
extent(r) <- extent(wrld)
rp <- fasterize(wrld, r)
ocean <- as.data.frame(rasterToPoints(rp))
#plot raster object
ggplot() + geom_tile(data=ocean,aes(x=x,y=y),fill="white")
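A plausible explanation for why this works (not confirmed in the thread): once the polygons are flattened to a long/lat/group table, geom_polygon() fills every ring, including rings that are really holes, whereas an sf polygon keeps its holes and geom_sf() respects them. A small sketch with a made-up square that has a square hole:

```r
library(sf)
library(ggplot2)

# Hypothetical geometry: a 4 x 4 square with a 2 x 2 hole in the middle
outer <- rbind(c(0, 0), c(4, 0), c(4, 4), c(0, 4), c(0, 0))
hole  <- rbind(c(1, 1), c(1, 3), c(3, 3), c(3, 1), c(1, 1))

ring_poly <- st_sf(
  name = "ocean",
  geometry = st_sfc(st_polygon(list(outer, hole)))
)

# The hole is part of the geometry, so it is excluded from the area
area <- as.numeric(st_area(ring_poly))  # 16 - 4 = 12

# geom_sf() leaves the hole unfilled
p <- ggplot(ring_poly) + geom_sf(fill = "blue", colour = "black")
```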
Has anyone been able to create maps of a selection of USDA hardiness zones in R, maybe with ggplot2 and sf packages? I'd specifically like to create a map with only zones 9b and higher in color.
I think some of the data to create the map is found here Prism Climate Group, but I am inexperienced and at a loss to know what to do with GIS data (file extensions SGML,XML,DBF, PRJ, SHP,SHX).
To elaborate a little bit on the answer by @niloc:
The USA looks more natural when shown in the Albers conical projection (Canada border slightly curved - like in the original image).
This can be achieved by using coord_sf(crs = 5070) in your {ggplot2} call.
The gist of the answer (downloading, unzipping & plotting via ggplot2::geom_sf()) remains unchanged.
library(sf)
library(tidyverse)
library(USAboundaries)
# Download and unzip file
temp_shapefile <- tempfile()
download.file('http://prism.oregonstate.edu/projects/public/phm/phm_us_shp.zip', temp_shapefile)
unzip(temp_shapefile)
# Read full shapefile
shp_hardness <- read_sf('phm_us_shp.shp')
# Subset to zones 9b and higher
shp_hardness_subset <- shp_hardness %>%
filter(str_detect(ZONE, '9b|10a|10b|11a|11b'))
# state boundaries for context
usa <- us_boundaries(type="state", resolution = "low") %>%
filter(!state_abbr %in% c("PR", "AK", "HI")) # lower 48 only
# Plot it
ggplot() +
geom_sf(data = shp_hardness_subset, aes(fill = ZONE)) +
geom_sf(data = usa, color = 'black', fill = NA) +
coord_sf(crs = 5070) +
theme_void() # remove lat/long grid lines
There is a lot going on in that map: all of the insets, the legend with F and C, and states displayed over the CONUS. It would be better to narrow down your question.
But here is a start. The shapefile is composed of many files (XML, DBF, etc) but you only need to point read_sf() at the .shp file. Subsetting with an sf object can be done just like with a data.frame.
library(sf)
library(tidyverse)
# Download and unzip file
temp_shapefile <- tempfile()
download.file('http://prism.oregonstate.edu/projects/public/phm/phm_us_shp.zip', temp_shapefile)
unzip(temp_shapefile)
# Read full shapefile
shp_hardness <- read_sf('phm_us_shp.shp')
# Subset to zones 9b and higher
shp_hardness_subset <- shp_hardness %>%
filter(str_detect(ZONE, '9b|10a|10b|11a|11b'))
# Plot it
ggplot() +
geom_sf(data = shp_hardness_subset, aes(fill = ZONE)) +
geom_polygon(data = map_data("state"), # add states for context
aes(x=long, y=lat,group=group),
color = 'black',
fill = NA) +
theme_void() # remove lat/long grid lines
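To illustrate the point about subsetting an sf object just like a data.frame, here is a small sketch using the nc.shp file that ships with {sf} (the hardiness shapefile needs a download, so this is only a stand-in; the county names are picked for illustration):

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Base-R row subsetting; the geometry column comes along automatically
by_name <- nc[nc$NAME %in% c("Ashe", "Alleghany"), ]

# Same selection with a regular expression, analogous to str_detect() on ZONE
by_pattern <- nc[grepl("^(Ashe|Alleghany)$", nc$NAME), ]
```

Both results are still sf objects, so they can go straight into geom_sf().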
This may be a wish list thing, not sure (i.e. maybe there would need to be the creation of geom_pie for this to occur). I saw a map today (LINK) with pie graphs on it as seen here.
I don't want to debate the merits of a pie graph, this was more of an exercise of can I do this in ggplot?
I have provided a data set below (loaded from my drop box) that has the mapping data to make a New York State map and some purely fabricated data on racial percentages by county. I have given this racial make up as a merge with the main data set and as a separate data set called key. I also think Bryan Goodrich's response to me in another post (HERE) on centering county names will be helpful to this concept.
How can we make the map above with ggplot2?
A data set and the map without the pie graphs:
load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
head(ny); head(key) #view the data set from my drop box
library(ggplot2)
ggplot(ny, aes(long, lat, group=group)) + geom_polygon(colour='black', fill=NA)
# Now how can we plot a pie chart of race on each county
# (sizing of the pie would also be controllable via a size
# parameter like other `geom_` functions).
Thanks in advance for your ideas.
EDIT: I just saw another case at junkcharts that screams for this type of capability:
Three years later this is solved. I've put a number of processes together, and thanks to @Guangchuang Yu's excellent ggtree package this can be done fairly easily. Note that as of 9/3/2015 you need to have version 1.0.18 of ggtree installed, but these changes will eventually trickle down to their respective repositories.
I've used the following resources to make this (the links will give greater detail):
ggtree blog
move ggplot legend
correct ggtree version
centering things in polygons
Here's the code:
load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
head(ny); head(key) #view the data set from my drop box
if (!require("pacman")) install.packages("pacman")
pacman::p_load(ggplot2, ggtree, dplyr, tidyr, sp, maps, pipeR, grid, XML, gtable) # pacman:: prefix works even right after a fresh install
getLabelPoint <- function(county) {Polygon(county[c('long', 'lat')])@labpt} # label point slot of an sp Polygon
df <- map_data('county', 'new york') # NY region county data
centroids <- by(df, df$subregion, getLabelPoint) # Returns list
centroids <- do.call("rbind.data.frame", centroids) # Convert to Data Frame
names(centroids) <- c('long', 'lat') # Appropriate Header
pops <- "http://data.newsday.com/long-island/data/census/county-population-estimates-2012/" %>%
readHTMLTable(which=1) %>%
tbl_df() %>%
select(1:2) %>%
setNames(c("region", "population")) %>%
mutate(
population = {as.numeric(gsub("\\D", "", population))},
region = tolower(gsub("\\s+[Cc]ounty|\\.", "", region)),
#weight = ((1 - (1/(1 + exp(population/sum(population)))))/11)
weight = exp(population/sum(population)),
weight = sqrt(weight/sum(weight))/3
)
race_data_long <- add_rownames(centroids, "region") %>>%
left_join({distinct(select(ny, region:other))}) %>>%
left_join(pops) %>>%
(~ race_data) %>>%
gather(race, prop, white:other) %>%
split(., .$region)
pies <- setNames(lapply(1:length(race_data_long), function(i){
ggplot(race_data_long[[i]], aes(x=1, prop, fill=race)) +
geom_bar(stat="identity", width=1) +
coord_polar(theta="y") +
theme_tree() +
xlab(NULL) +
ylab(NULL) +
theme_transparent() +
theme(plot.margin=unit(c(0,0,0,0),"mm"))
}), names(race_data_long))
e1 <- ggplot(race_data_long[[1]], aes(x=1, prop, fill=race)) +
geom_bar(stat="identity", width=1) +
coord_polar(theta="y")
leg1 <- gtable_filter(ggplot_gtable(ggplot_build(e1)), "guide-box")
p <- ggplot(ny, aes(long, lat, group=group)) +
geom_polygon(colour='black', fill=NA) +
theme_bw() +
annotation_custom(grob = leg1, xmin = -77.5, xmax = -78.5, ymin = 44, ymax = 45)
n <- length(pies)
for (i in 1:n) {
nms <- names(pies)[i]
dat <- race_data[which(race_data$region == nms)[1], ]
p <- subview(p, pies[[i]], x=unlist(dat[["long"]])[1], y=unlist(dat[["lat"]])[1], dat[["weight"]], dat[["weight"]])
}
print(p)
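If subview() is not available in your ggtree version, a similar effect can be sketched with ggplot2 alone by turning each pie into a grob and placing it with annotation_custom(). This is not the method used above, just a minimal illustration under assumptions: the sites data frame, its columns and the radius r are all made up, and it only works with Cartesian coordinates.

```r
library(ggplot2)

# Hypothetical data: two locations with fabricated category shares
sites <- data.frame(
  site  = rep(c("a", "b"), each = 3),
  long  = rep(c(-76, -74), each = 3),
  lat   = rep(c(43, 42), each = 3),
  group = rep(c("x", "y", "z"), times = 2),
  prop  = c(0.5, 0.3, 0.2, 0.2, 0.2, 0.6)
)

# One bare pie chart (no axes, no legend) per site
make_pie <- function(d) {
  ggplot(d, aes(x = 1, y = prop, fill = group)) +
    geom_col(width = 1) +
    coord_polar(theta = "y") +
    theme_void() +
    theme(legend.position = "none")
}

# Empty base layer that only sets up the long/lat axes
p <- ggplot() +
  geom_blank(data = sites, aes(long, lat)) +
  coord_cartesian(xlim = c(-80, -71.5), ylim = c(40.5, 45.1))

r <- 0.4  # half-width of each pie in data units (assumed)
for (d in split(sites, sites$site)) {
  p <- p + annotation_custom(
    grob = ggplotGrob(make_pie(d)),
    xmin = d$long[1] - r, xmax = d$long[1] + r,
    ymin = d$lat[1] - r,  ymax = d$lat[1] + r
  )
}
```

In a real map the geom_blank() layer would be replaced by the county polygons, and r could be scaled per site in the same way as the weight column above.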
This functionality should be in ggplot, I think it is coming to ggplot soonish, but it is currently available in base plots. I thought I would post this just for comparison's sake.
load(url("http://dl.dropbox.com/u/61803503/nycounty.RData"))
library(plotrix)
e=10^-5
myglyff=function(gi) {
floating.pie(mean(gi$long),
mean(gi$lat),
x=c(gi[1,"white"]+e,
gi[1,"black"]+e,
gi[1,"hispanic"]+e,
gi[1,"asian"]+e,
gi[1,"other"]+e),
radius=.1) #insert size variable here
}
g1=ny[which(ny$group==1),]
plot(g1$long,
g1$lat,
type='l',
xlim=c(-80,-71.5),
ylim=c(40.5,45.1))
myglyff(g1)
for(i in 2:62)
{gi=ny[which(ny$group==i),]
lines(gi$long,gi$lat)
myglyff(gi)
}
Also, there may be (probably are) more elegant ways of doing this in the base graphics.
As you can see, there are quite a few problems with this that need to be solved: there is no fill color for the counties, the pie charts tend to be too small or overlap, and the lat and long do not take a projection, so the sizes of counties are distorted.
In any event, I am interested in what others can come up with.
I've written some code to do this using grid graphics. There is an example here: https://qdrsite.wordpress.com/2016/06/26/pies-on-a-map/
The goal here was to associate the pie charts with specific points on the map, and not necessarily regions. For this particular solution, it is necessary to convert the map coordinates (latitude and longitude) to a (0,1) scale so they can be plotted in the proper locations on the map. The grid package is used to print to the viewport that contains the plot panel.
Code:
# Pies On A Map
# Demonstration script
# By QDR
# Uses NLCD land cover data for different sites in the National Ecological Observatory Network.
# Each site consists of a number of different plots, and each plot has its own land cover classification.
# On a US map, plot a pie chart at the location of each site with the proportion of plots at that site within each land cover class.
# For this demo script, I've hard coded in the color scale, and included the data as a CSV linked from dropbox.
# Custom color scale (taken from the official NLCD legend)
nlcdcolors <- structure(c("#7F7F7F", "#FFB3CC", "#00B200", "#00FFFF", "#006600", "#E5CC99", "#00B2B2", "#FFFF00", "#B2B200", "#80FFCC"), .Names = c("unknown", "cultivatedCrops", "deciduousForest", "emergentHerbaceousWetlands", "evergreenForest", "grasslandHerbaceous", "mixedForest", "pastureHay", "shrubScrub", "woodyWetlands"))
# NLCD data for the NEON plots
nlcdtable_long <- read.csv(file='https://www.dropbox.com/s/x95p4dvoegfspax/demo_nlcdneon.csv?raw=1', row.names=NULL, stringsAsFactors=FALSE)
library(ggplot2)
library(plyr)
library(grid)
# Create a blank state map. The geom_tile() is included because it allows a legend for all the pie charts to be printed, although it does not
statemap <- ggplot(nlcdtable_long, aes(decimalLongitude,decimalLatitude,fill=nlcdClass)) +
geom_tile() +
borders('state', fill='beige') + coord_map() +
scale_x_continuous(limits=c(-125,-65), expand=c(0,0), name = 'Longitude') +
scale_y_continuous(limits=c(25, 50), expand=c(0,0), name = 'Latitude') +
scale_fill_manual(values = nlcdcolors, name = 'NLCD Classification')
# Create a list of ggplot objects. Each one is the pie chart for each site with all labels removed.
pies <- dlply(nlcdtable_long, .(siteID), function(z)
ggplot(z, aes(x=factor(1), y=prop_plots, fill=nlcdClass)) +
geom_bar(stat='identity', width=1) +
coord_polar(theta='y') +
scale_fill_manual(values = nlcdcolors) +
theme(axis.line=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
legend.position="none",
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank()))
# Use the latitude and longitude maxima and minima from the map to calculate the coordinates of each site location on a scale of 0 to 1, within the map panel.
piecoords <- ddply(nlcdtable_long, .(siteID), function(x) with(x, data.frame(
siteID = siteID[1],
x = (decimalLongitude[1]+125)/60,
y = (decimalLatitude[1]-25)/25
)))
# Print the state map.
statemap
# Use a function from the grid package to move into the viewport that contains the plot panel, so that we can plot the individual pies in their correct locations on the map.
downViewport('panel.3-4-3-4')
# Here is the fun part: loop through the pies list. At each iteration, print the ggplot object at the correct location on the viewport. The y coordinate is shifted by half the height of the pie (set at 10% of the height of the map) so that the pie will be centered at the correct coordinate.
for (i in 1:length(pies))
print(pies[[i]], vp=dataViewport(xData=c(-125,-65), yData=c(25,50), clip='off',xscale = c(-125,-65), yscale=c(25,50), x=piecoords$x[i], y=piecoords$y[i]-.06, height=.12, width=.12))
The result looks like this:
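The piecoords step in the code above hard-codes the rescaling (longitude + 125 over 60, latitude - 25 over 25). As a small sketch, the same arithmetic can be written as a helper that takes the axis limits as arguments; rescale01 is a hypothetical name, not part of the original script.

```r
# Map a value from [limits[1], limits[2]] onto [0, 1]
rescale01 <- function(value, limits) {
  (value - limits[1]) / (limits[2] - limits[1])
}

lon_limits <- c(-125, -65)  # same limits as scale_x_continuous() above
lat_limits <- c(25, 50)     # same limits as scale_y_continuous() above

x_npc <- rescale01(-95, lon_limits)   # 0.5, the horizontal centre of the panel
y_npc <- rescale01(37.5, lat_limits)  # 0.5, the vertical centre of the panel
```

Keep in mind that this is a linear rescaling, so with a curved projection such as the default coord_map() the positions are only approximate.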
I stumbled upon what looks like a function to do this: "add.pie" in the "mapplots" package.
The example from the package is below.
library(mapplots)
plot(NA,NA, xlim=c(-1,1), ylim=c(-1,1) )
add.pie(z=rpois(6,10), x=-0.5, y=0.5, radius=0.5)
add.pie(z=rpois(4,10), x=0.5, y=-0.5, radius=0.3)
A slight variation on the OP's original requirements, but this seems like an appropriate answer/update.
If you want an interactive Google Map, as of googleway v2.6.0 you can add charts inside info_windows of map layers.
see ?googleway::google_charts for documentation and examples
library(googleway)
set_key("GOOGLE_MAP_KEY")
## create some dummy chart data
markerCharts <- data.frame(stop_id = rep(tram_stops$stop_id, each = 3))
markerCharts$variable <- c("yes", "no", "maybe")
markerCharts$value <- sample(1:10, size = nrow(markerCharts), replace = TRUE)
chartList <- list(
data = markerCharts
, type = 'pie'
, options = list(
title = "my pie"
, is3D = TRUE
, height = 240
, width = 240
, colors = c('#440154', '#21908C', '#FDE725')
)
)
google_map() %>%
add_markers(
data = tram_stops
, id = "stop_id"
, info_window = chartList
)