I have a problem with my heatmap, which displays the density LEVEL, but doesn't say anything about the density count. (how many points are in the same area for example).
My data is divided in more columns, but the most important ones are: lat,lon.
I would like to have something like this, but with "count" : https://stackoverflow.com/a/24615674/5316566,
however when I try to apply the code he uses in that answer, my maximum-"level" density doesn't reflect my density count.( Intead of 7500 I receive for example 6, even if I have thousands and thousands of data concentrated).
That's my code:
us_map_g_str <- get_map(location = c(-90.0,41.5,-81.0,42.7), zoom = 7)
ggmap(us_map_g_str, extent = "device") +
geom_tile(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat)), size = 0.3) +
stat_density2d(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat), fill = ..level.., alpha = ..level..), size = 0.3, bins = 10, geom = "polygon") +
scale_fill_gradient(name= "Ios",low = "green", high = "red", trans= "exp") +
scale_alpha(range = c(0, 0.3), guide = FALSE)
This is what I get:
This is part of the data:
lat lon tag device
1 43.33622 -83.67445 0 iPhone5
2 43.33582 -83.69964 0 iPhone5
3 43.33623 -83.68744 0 iPhone5
4 43.33584 -83.72186 0 iPhone5
5 43.33616 -83.67526 0 iPhone5
6 43.25040 -83.78234 0 iPhone5
(The "tag" column is not important)
REVISED
I realised that my previous answer needs to be revised. So, here it is. If you want to find out how many data points exist in each level of a contour, you actually have a lot of things to do. If you are happy to use the leaflet option below, your life would be much easier.
First, let's get a map of Detroit, and create a sample data frame.
library(dplyr)
library(ggplot2)
library(ggmap)
mymap <- get_map(location = "Detroit", zoom = 8)
### Create a sample data
set.seed(123)
mydata <- data.frame(long = runif(min = -84, max = -82.5, n = 100),
lat = runif(min = 42, max = 42.7, n = 100))
Now, we draw a map and save it as g.
g <- ggmap(mymap) +
stat_density2d(data = mydata,
aes(x = long, y = lat, fill = ..level..),
size = 0.5, bins = 10, geom = "polygon")
The real job begins here. In order to find out the numbers of data points in all levels, you want to employ the data frame, which ggplot generates. In this data frame you have data for polygons. These polygons are used to draw level lines. You can see that in the following image, which I draw three levels on a map.
### Create a data frame so that we can find how many data points exist
### in each level.
mydf <- ggplot_build(g)$data[[4]]
### Check where the polygon lines are. This is just for a check.
check <- ggmap(mymap) +
geom_point(data = mydata, aes(x = long, y = lat)) +
geom_path(data = subset(mydf, group == "1-008"), aes(x = x, y = y)) +
geom_path(data = subset(mydf, group == "1-009"), aes(x = x, y = y)) +
geom_path(data = subset(mydf, group == "1-010"), aes(x = x, y = y))
The next step is to reate a level vector for a legend. We group the data by group (e.g., 1-010) and take the first row for each group using slice(). Then, ungroup the data and choose the 2nd column. Finally, create a vector
with unlist(). We come back to lev in the end.
mydf %>%
group_by(group) %>%
slice(1) %>%
ungroup %>%
select(2) %>%
unlist -> lev
Now we split the polygon data (i.e., mydf) by group and create a polygon for each level. Since we have 11 levels (11 polygons), we use lapply(). In the lapply loop, we need to do; 1) extract column for longitude anf latitude, 2) create polygon, 3) convert polygons to spatial polygons, 4) assign
CRS, 5) create a dummy data frame, and 6) create SpatialPolygonsDataFrames.
mylist <- split(mydf, f = mydf$group)
test <- lapply(mylist, function(x){
xy <- x[, c(3,4)]
circle <- Polygon(xy, hole = as.logical(NA))
SP <- SpatialPolygons(list(Polygons(list(circle), ID = "1")))
proj4string(SP) <- CRS("+proj=longlat +ellps=WGS84")
df <- data.frame(value = 1, row.names = "1")
circleDF <- SpatialPolygonsDataFrame(SP, data = df)
})
Now we go back to the original data. What we need to to is to convert the data frame to SpatialPointsDataFrame. This is because we need to subset data and find how many data points exist in each polygon (in each level). First, get long and lat from your data.frame. Make sure that the order is in lon/lat.
xy <- mydata[,c(1,2)]
Then, we create SPDF (SpatialPolygonsDataFrame). You want to have an identical proj4string between spatial polygons and spatial points data.
spdf <- SpatialPointsDataFrame(coords = xy, data = mydata,
proj4string = CRS("+proj=longlat +ellps=WGS84"))
Then, we subset data (mydata) using each polygon.
ana <- lapply(test, function(y){
mydf <- as.data.frame(spdf[y, ])
})
Data points are overlapping across levels; we have duplication. First we try to find out unique data points for each level. We bind data frames in ana and create a data frame, which is foo1. We also create a data frame, which we want to find unique number of data points. We make sure that columns names are all identical between foo1 and foo2. Using setdiff() and nrow(), we can find the unique number of data points in each level.
total <- lapply(11:2, function(x){
foo1 <- bind_rows(ana[c(11:x)])
foo2 <- as.data.frame(ana[x-1])
names(foo2) <- names(foo1)
nrow(setdiff(foo2, foo1))
})
Finally, we need to find the number of data points for the most inner level, which is level 11. We choose a data frame for level 11 in ana and create a data frame and count the number of row.
bob <- nrow(as.data.frame(ana[11]))
out <- c(bob,unlist(total))
### check if total is 100
### sum(out)
### [1] 100
We assign reversed out as names for lev. This is because we want to show how many data points exist for each level in a legend.
names(lev) <- rev(out)
Now we are ready to add a legend.
final <- g +
scale_fill_continuous(name = "Total",
guide = guide_legend(),
breaks = lev)
final
LEAFLET OPTION
If you use leaflet package, you can group data points with different zooms. Leaflet counts data points in certain areas and indicate numbers in circles like the following figure. The more you zoom in, the more leaflet breaks up data points into small groups. In terms of workload, this is much lighter. In addition, your map is interactive. This may be a better option.
library(leaflet)
leaflet(mydf) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions())
Related
I'd like to create a graph like the one below. It's kind of a combination of using geom_area and geom_point.
Let's say my data looks like this:
library(gcookbook, janitor)
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) +
geom_area()
I obtain the following graph
Then, I'd like to add the exact number of points as the total for each category, which would be:
library(dplyr)
uspopage |>
group_by(AgeGroup) |>
summarize(total = sum(Thousands))
# A tibble: 8 × 2
AgeGroup total
<fct> <int>
1 <5 1534529
2 5-14 2993842
3 15-24 2836739
4 25-34 2635986
5 35-44 2331680
6 45-54 1883088
7 55-64 1417496
8 >64 1588163
Following some twitter comments my workaround is as follows:
1 - create the original plot with ggplot2
2 - grab the areas of the plot as a data.frame (ggplot_build)
3 - create polygons of the points given in 2, and make it a sensible sf object (downscale to a flatter earth)
4 - generate N random points inside each polygon (st_sample)
5 - grab these points and upscale back to the original scale
6 - ggplot2 once again, now with geom_point
7 - enjoy the wonders of ggplot2
library(gcookbook)
library(tidyverse)
library(sf)
set.seed(42)
# original data
d <- uspopage
# number of points for each group (I divide it by 1000)
d1 <- d |>
group_by(AgeGroup) |>
summarize(n_points = round(sum(Thousands) / 1e3)) |>
mutate(group = 1:n())
# original plot
g <- ggplot(data = d,
aes(x = Year,
y = Thousands,
fill = AgeGroup)) +
geom_area()
# get the geom data from ggplot
f <- ggplot_build(g)$data[[1]]
# polygons are created point by point in order. So let´s, by group, add the data.frame back to itself first part is the ymin line the secound the inverse of ymax line (to make a continous line from encompassing each area).
# list of groups
l_groups <- unique(f$group)
# function to invert and add back the data.frame
f_invert <- function(groups) {
k <- f[f$group == groups,]
k$y <- k$ymin
k1 <- k[nrow(k):1,]
k1$y <- k1$ymax
k2 <- rbind(k, k1)
return(k2)
}
# create a new data frame of the points in order
f1 <- do.call("rbind", lapply(l_groups, f_invert))
# for further use at the end of the script (to upscale back to the original ranges)
max_x <- max(f1$x)
max_y <- max(f1$y)
min_x <- min(f1$x)
min_y <- min(f1$y)
# normalizing: limiting sizes to a fairy small area on the globe (flat earth wannabe / 1 X 1 degrees)
f1$x <- scales::rescale(f1$x)
f1$y <- scales::rescale(f1$y)
# create polygons
polygons <- f1 |>
group_by(group) |>
sf::st_as_sf(coords = c("x", "y"), crs = 4326) |>
summarise(geometry = sf::st_combine(geometry)) |>
sf::st_cast("POLYGON")
# cast N number of points randomly inside each geometry (N is calculated beforehand in d1)
points <- polygons %>%
st_sample(size = d1$n_points,
type = 'random',
exact = TRUE) %>%
# Give the points an ID
sf::st_sf('ID' = seq(length(.)), 'geometry' = .) %>%
# Get underlying polygon attributes (group is the relevant attribute that we want to keep)
sf::st_intersection(., polygons)
# rescale back to the original ranges
points <- points |>
mutate(x = unlist(map(geometry,1)),
y = unlist(map(geometry,2))) |>
mutate(x = (x * (max_x - min_x) + min_x),
y = (y * (max_y - min_y) + min_y))
# bring back the legends
points <- left_join(points, d1, by = c("group"))
# final plot
g1 <- ggplot() +
geom_point(data = points,
aes(x = x,
y = y,
color = AgeGroup),
size = 0.5) +
labs(x = element_blank(),
y = element_blank()) +
theme_bw()
g1
Here's a version without any smoothing, just adding noise to where the dots would go naturally. One nice thing here is we can specify how many people are represented per dot.
dots_per_thou <- 1
uspopage %>%
uncount(round(dots_per_thou * Thousands / 1000)) %>%
group_by(Year) %>%
mutate(x_noise = runif(n(), 0, 1) - 0.5,
x_pos = Year + x_noise,
y_noise = runif(n(), 0, 1000*dots_per_thou),
y_pos = cumsum(row_number() + y_noise)) %>%
ungroup() %>%
ggplot(aes(x_pos, y_pos, color = AgeGroup)) +
geom_point(size = 0.1) +
ggthemes::scale_color_tableau()
You could come close-ish to that look with the ggbeeswarm package. It includes a few positions which "offset points within a category based on their density using quasirandom noise" (this is the description in the vipor package which underlies those positions).
The approach is just a hack and certainly not exactly satisfying. The number of dots might not be accurate and are more like "guessed", and they are too regular with position_beeswarm - I couldn't yet get it to run with the probably more appropriate position_quasirandom.
Also, it is computationally very intense and it made my reprex crash, thus simply copied from my script.
library(gcookbook)
library(ggplot2)
library(dplyr)
## ggbeeswarm needs to be in the development version
# devtools::install_github("eclarke/ggbeeswarm")
library(ggbeeswarm)
uncount_df <- uspopage %>%
group_by(Year) %>%
## inflate every group artificllay to add up to the previous group
## and make numbers much much smaller so to make computations not cray
mutate(cumul_sum = as.integer(cumsum(Thousands)/ 10^3)) %>%
## uncount
tidyr::uncount(cumul_sum)
## I am creating a list of layers
ls_layers <- lapply(split(uncount_df, uncount_df$AgeGroup), function(dat){
## I switched x and y aesthetic so to avoid coord_flip
## side is an argument in the dev version
## the size is a bit of a trial and error
geom_beeswarm(data = dat, aes( x = Year, y = "x", color = AgeGroup),
side = 1L,
size = .4)
})
## reversing the order, a trick to plot from small to large numbers
ls_layers <- ls_layers[length(ls_layers):1]
ggplot() +
## you can now simply add the list of layers to your ggplot object
ls_layers
I have a dataframe object, created by reading in a shape file with sf::read_sf and merged with some pre-existing data with a common geography column:
boundaries <- sf::read_sf('./shapefile')
map <- merge(boundaries, data, by.x = "InterZone",
by.y = "IntermediateZone2011Code", all.x = FALSE, duplicateGeoms = TRUE)
This is then overlaid using ggmap on top of a provider tile obtained with the sf get_map function:
myMap <- get_map(location = c(lon = -2.27, lat = 57.1), zoom = 6,
maptype="toner", crop=FALSE, source = 'stamen')
ggmap(myMap) +
geom_sf(data = map, aes(fill=as.factor(column1)), inherit.aes = FALSE) +
scale_fill_brewer(palette = "OrRd") +
coord_sf(crs = st_crs(4326)) +
labs(x = 'Longitude', y = 'Latitude', fill = 'column1') +
ggtitle('column1')
The issue is that this auto creates hundreds of bins.
I have been looking through the documentation but cannot find an additional argument to specify the number of bins. How can I make it clear to breakdown the column by a fixed number of bins and then map this?
Without a reproducible example it is hard to say exactly what is going on, but it looks like you might be converting a continuous variable into a factor with fill=as.factor(column1).
One option is you remove as.factor and use scale_fill_continuous or some other continuous color scale of your choice.
Another option is to look into cut, where you bin continuous data by specifying the number of bins, or the specific start and end points of your bins.
# Make n bins
map$data_bin <- cut(map$column, breaks = n )
# Or make specific start and end points for bins
map$data_bin <- cut(map$column, breaks = c(-Inf,50,100,Inf) )
I need to create heat map with 3 digit zip boundary.
I have 3 digit zip and count data like this
zip <- c(790, 791, 792, 793)
count <- c(0, 100, 20, 30)
TX <- data.frame(zip, count)
Also, I draw TX map.
library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
states <- map_data("state")
texas<- subset(states, region =="texas")
ggplot(data = texas) +
geom_polygon(aes(x = long, y = lat), fill = "gray", color = "black")
What I want to achieve is to (1) draw boundary with 3 digit zip code and (2) create the heat map using count column. The outcome will looks like this with heat map coloring.
This question does not contain reproducible sample data. Hence, I needed some good amount of time to deliver the following. Please provide minimum reproducible data and codes you tried from next time. (I doubt if you really invested time to seriously write your codes.)
Anyway, I think getting a good polygon data for US zip codes is difficult without paying some money. This question provides good information. I obtained data from this link since the data was accessible. You gotta find whatever suitable polygon data for yourself.
I also obtained data for the zip codes in Texas from here and saved it as "zip_code_database.csv."
I added explanation for each code below. So I do not write a thourough explanation here. Basically, you need to merge polygon data by subtracting the first three numbers in the zip codes. You also need to create an aggregated data for whatever the value you have in your data using the 3-digit zip code. The other thing is to find center points of the polygons to add the zip codes as labels.
library(tidyverse)
library(rgdal)
library(rgeos)
library(maptools)
library(ggalt)
library(ggthemes)
library(ggrepel)
library(RColorBrewer)
# Prepare the zip poly data for US
mydata <- readOGR(dsn = ".", layer = "cb_2016_us_zcta510_500k")
# Texas zip code data
zip <- read_csv("zip_code_database.csv")
tx <- filter(zip, state == "TX")
# Get polygon data for TX only
mypoly <- subset(mydata, ZCTA5CE10 %in% tx$zip)
# Create a new group with the first three digit.
# Drop unnecessary factor levels.
# Add a fake numeric variable, which is used for coloring polygons later.
mypoly$group <- substr(mypoly$ZCTA5CE10, 1,3)
mypoly$ZCTA5CE10 <- droplevels(mypoly$ZCTA5CE10)
set.seed(111)
mypoly$value <- sample.int(n = 10000, size = nrow(mypoly), replace = TRUE)
# Merge polygons using the group variable
# Create a data frame for ggplot.
mypoly.union <- unionSpatialPolygons(mypoly, mypoly$group)
mymap <- fortify(mypoly.union)
# Check how polygons are like
plot(mypoly)
plot(mypoly.union, add = T, border = "red", lwd = 1)
# Convert SpatialPolygons to data frame and aggregate the fake values
mypoly.df <- as(mypoly, "data.frame") %>%
group_by(group) %>%
summarise(value = sum(value))
# Find a center point for each zip code area
centers <- data.frame(gCentroid(spgeom = mypoly.union, byid = TRUE))
centers$zip <- rownames(centers)
# Finally, drawing a graphic
ggplot() +
geom_cartogram(data = mymap, aes(x = long, y = lat, map_id = id), map = mymap) +
geom_cartogram(data = mypoly.df, aes(fill = value, map_id = group), map = mymap) +
geom_text_repel(data = centers, aes(label = zip, x = x, y = y), size = 3) +
scale_fill_gradientn(colours = rev(brewer.pal(10, "Spectral"))) +
coord_map() +
theme_map()
I am trying to change the background colors in US map for displaying presidential results for different states. I read so many posts regarding this color change but I was not able to change any of those colors. Below is my code, link for dataset and snapshot which I am getting:
#install.packages("ggplot2")
#install.packages("ggmap")
#install.packages("plyr")
#install.packages("raster")
#install.packages("stringr")
library(ggplot2) # for plotting and miscellaneuous things
library(ggmap) # for plotting
library(plyr) # for merging datasets
library(raster) # to get map shape filegeom_polygon
library(stringr) # for string operation
# Get geographic data for USA
usa.shape<-getData("GADM", country = "usa", level = 1)
# Creating a data frame of map data
usa.df <- map_data("state")
#rename 'region' as 'state' and make it a factor variable
colnames(usa.df) [5] <- "State"
usa.df$State <- as.factor(usa.df$State)
#set working directory
setwd("C:/Users/Ashish/Documents/Stats projects/2/")
#input data from file separated by commas
usa.dat <- read.csv("data1.csv", header = T)
# printing data structure
str(usa.df)
# removing % sign from the data, and converting percentage win to numeric
usa.dat$Clinton <- as.numeric(sub("%","",usa.dat$Clinton))/1
usa.dat$Trump <- as.numeric(sub("%","",usa.dat$Trump))/1
usa.dat$Others <- as.numeric(sub("%","",usa.dat$Others))/1
# Creating a winner column based on the percentage
usa.dat$Winner = "Trump"
usa.dat[usa.dat$Clinton > usa.dat$Trump,]$Winner = "Clinton"
usa.dat$State <- tolower(usa.dat$State)
# Creating a chance column which corresponds to winning percentage of the candidate
usa.dat$chance <- usa.dat$Trump
a <- usa.dat[usa.dat$Clinton > usa.dat$Trump,]
usa.dat[usa.dat$Clinton > usa.dat$Trump,]$chance <- a$Clinton
# display the internal structure of the object
usa.dat
#join the usa.df and usa.dat objects on state variable
usa.df <- join(usa.df, usa.dat, by = "State", type = "inner")
str(usa.df)
states <- data.frame(state.center, state.abb) # centers of states and abbreviations
#function for plotting different regions of USA map based on the input data showing different coloring scheme
#for each state.
p <- function(data, title) {
ggp <- ggplot() +
# Draw borders of states
geom_polygon(data = data, aes(x = long, y = lat, group = group,
fill = Winner, alpha=chance), color = "black", size = 0.15) +
#scale_alpha_continuous(range=c(0,1))+
scale_color_gradientn(colours = c("#F08080","white","#5DADE2"),breaks = c(0,50,100),
labels=c("Clinton","Equal","Trump"),
limits=c(0,100),name="Election Forecast") +
# Add state abbreviations
geom_text(data = states, aes(x = x, y = y, label = state.abb), size = 2)+
guides(fill = guide_legend(direction='vertical', title='Candidate', label=TRUE, colours=c("red", "blue")))
return(ggp)
}
figure.title <- "2016 presidential election result"
# Save the map to a file to viewing (you can plot on the screen also, but it takes
# much longer that way. The ratio of US height to width is 1:9.)
#print(p(usa.df, brks.to.use, figure.title))
ggsave(p(usa.df, figure.title), height = 4, width = 4*1.9,
file = "election_result.jpg")
Image link:
Dataset: Dataset link
I would like to get same coloring scheme as displayed in Election forecast gradient.
Thanks to Alistaire for providing his valuable feedbacks and solution for the above problem. Using scale_fill_brewer(type = 'qual', palette = 6) along with ggplot() resolves the above issue in R.
I am attempting to make a map with three layers using ggmap. The layers are as follows:
A map of the US (toner-lite)
a set of geometries that color the states on some value (simulated data below)
labels for the state names, as annotations in the center of each state.
To do this I have created a map of US states with states colored by a randomized value (rnorm) and this part is successful. From here I am attempting to print the abbreviations of each state at the longitude and latitude coordinates of each state's center, using geom_text. The part that fails is the 'geom_text' overlay, with the following error:
Error: 'x' and 'units' must have length > 0 In addition: Warning
messages: 1: In gpclibPermit() : support for gpclib will be
withdrawn from maptools at the next major release 2: Removed 855070
rows containing missing values (geom_text).
Here is the script, which I have worked hard to run as on its own. It will download the shapefile and center of state data, as well as to simulate data to fill the states. I've tested it and it works up to what I have commented out (geom_text layer).
I have searched for answers to this already, so please let me know if you have any advice on how to do what I am attempting. If there is a better strategy for placing labels on top of the polygon fills, I am all ears (or eyes in this case).
###Combining Census data with a tract poly shapefile
library(maptools)
library(ggplot2)
library(gpclib)
library(ggmap)
library(rgdal)
library(dplyr)
#Set working directory to where you want your files to exist (or where they already exist)
setwd('~/Documents/GIS/USCensus/')
#Read and translate coord data for shape file of US States
if(!file.exists('tl_2014_us_state.shp')){
download.file('ftp://ftp2.census.gov/geo/tiger/TIGER2014/STATE/tl_2014_us_state.zip',
'tl_2014_us_state.zip')
files <- unzip('tl_2014_us_state.zip')
tract <- readOGR(".","tl_2014_us_state") %>% spTransform(CRS("+proj=longlat +datum=WGS84"))
} else {
tract <- readOGR(".","tl_2014_us_state") %>% spTransform(CRS("+proj=longlat +datum=WGS84"))
}
#two column dataset of state abbreviations and center of state
#Downloadable from: https://dev.maxmind.com/static/csv/codes/state_latlon.csv
if(!file.exists('state_latlon.csv')){
download.file('http://dev.maxmind.com/static/csv/codes/state_latlon.csv','state_latlon.csv')
}
centers <- read.csv('state_latlon.csv')
#Change values of longitude and latitude from state center data so as not to interfere with shapefile at merge
names(centers)[2:3] <- c('long_c','lat_c')
#simulated data for plotting values
mydata<- data.frame(rnorm(55, 0, 1)) #55 "states" in the coord dataset for state centers
names(mydata)[1] <- 'value'
#hold names in tract dataset and for simulated data
ntract<-names(tract)
ndata<-names(mydata)
#Turn geo data into R dataframe
gpclibPermit()
tract_geom<-fortify(tract,region="STUSPS")
#Merge state geo data with simulated data
state_data <- cbind(centers,mydata)
#merge state center and value data with shapefile data
tract_poly <- merge(state_data,tract_geom,by.x="state",by.y="id", all = F)
tract_poly<-tract_poly[order(tract_poly$order),]
#Create map of US
mymap <- get_stamenmap(bbox = c(left = -124.848974,
bottom = 24.396308,
right = -66.885444,
top = 49.384358),zoom=5,
maptype="toner-lite")
#This plots a map of the US with just the state names as labels (and a few other landmarks). Used for reference
USMap <- ggmap(mymap,extent='device') +
geom_polygon(aes(x = long, y = lat, group = group, fill = value),
data = tract_poly,
alpha = 1,
color = "black",
size = 0.2) #+
# geom_text(aes(x = long_c, y = lat_c, group = group, label = state),
# data= tract_poly,
# alpha = 1,
# color = "black")
USMap
That's a strange error message for what ended up being the problem. Somewhere along the way you have flipped the latitude and longitude for centers. (I also took into account elpi's advice above and didn't plot the Initials repeatedly by using your centers dataset directly). The code below works, but I'd recommend fixing your centers dataset.
centers$new_long <- centers$lat_c
centers$new_lat <- centers$long_c
USMap <- ggmap(mymap,extent='device') +
geom_polygon(aes(x = long, y = lat, group = group, fill = value),
data = tract_poly,
alpha = 1,
color = "black",
size = 0.2) +
geom_text(aes(x = new_long, y = new_lat, label = state),
data= centers,
alpha = 1,
color = "black")
Try this
centroids <- setNames(do.call("rbind.data.frame", by(tract_poly, tract_poly$group, function(x) {Polygon(x[c('long', 'lat')])#labpt})), c('long', 'lat'))
centroids$label <- tract_poly$state[match(rownames(centroids), tract_poly$group)]
USMap + with(centroids, annotate(geom="text", x = long, y=lat, label = label, size = 2.5))
(via)