I'm trying to plot a matrix (mostly random numbers with a few NAs) with longitude/latitude coordinates on a ggmap plot.
This is my code:
# define the longitude and latitude coordinates
lon = seq(x1,x2,by=0.1)
lat = seq(y1,y2,by=-0.1)
# define a matrix with random numbers with the longitude/latitude dimensions
numbers = rnorm(length(lon)*length(lat))
var = matrix(numbers,length(lon),length(lat))
# add some NAs to the matrix
var[1:5,1:5] = NA
lat_min <- min(lat)-0.3
lon_min <- min(lon)-0.3
lat_max <- max(lat)+0.3
lon_max <- max(lon)+0.3
# construct the ggmap
map_box <- c(left=lon_min,bottom=lat_min,
right=lon_max,top=lat_max)
total_stmap <- get_stamenmap(bbox=map_box,zoom=5,maptype="toner")
total_ggmap <- ggmap(total_stmap,extent="panel")
# make a data.frame to attribute each matrix index to the geographical coordinate
lat_df <- c()
lon_df <- c()
var_df <- c()
for (i in 1:length(lon)) {
for (j in 1:length(lat)) {
lon_df <- c(lon_df,lon[i])
lat_df <- c(lat_df,lat[j])
var_df = var[i,j]
}
}
df <- data.frame(Longitude=lon_df,Latitude=lat_df,Variable=var_df)
# make the plot using ggmap and geom_tile
plot = total_ggmap +
geom_tile(data=df,aes(x=Longitude,y=Latitude,fill=Variable),alpha=1/2,color="black",size=0) +
geom_sf(data = df, inherit.aes = FALSE, fill = NA)
With this code I get the message:
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
...and a blank plot.
There are two problems here. The first is that all of the values in the Variable column are the same, because every iteration of your loop overwrites var_df instead of appending to it. The line
var_df = var[i,j]
should be
var_df = c(var_df, var[i,j])
Secondly, you should not be using geom_sf if you have a data frame of longitude, latitude and value. geom_sf is used to plot sf objects, which is not what you have.
Instead, you need only do:
plot <- total_ggmap +
geom_tile(data = df, aes(Longitude, Latitude, fill = Variable), alpha = 1/2, size = 0)
and you get the tiles drawn over the base map.
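As a loop-free alternative for building df, here is a minimal sketch using expand.grid(). The coordinate values are made up for illustration (the question's x1, x2, y1 and y2 are not given); only the column names Longitude, Latitude and Variable come from the code above.

```r
# Hypothetical coordinates; the question's x1, x2, y1, y2 are unknown
lon <- c(10.0, 10.1, 10.2, 10.3)
lat <- c(50.0, 49.9, 49.8)

set.seed(1)
var <- matrix(rnorm(length(lon) * length(lat)), length(lon), length(lat))
var[1:2, 1] <- NA  # a few missing cells

# expand.grid() varies its first argument fastest, which matches the
# column-major order of as.vector(var) when rows index longitude
df <- expand.grid(Longitude = lon, Latitude = lat)
df$Variable <- as.vector(var)

head(df)
```

This only works because the matrix rows correspond to lon and the columns to lat; if the matrix were laid out the other way round, the two arguments of expand.grid() would have to be swapped.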
Using the code below I can plot the following:
This code is adapted from here
As you can see, there are a few issues with the plot. I am struggling to:
Remove weird lines in plot
Only plot cells (grids) where there are data
Plot ID (see gridSpatialPolygons$values) on top of the grid cell
I realise there are a few points to this question but I hope one solution solves all.
# Load libraries
library(sp)      # spatial classes, GridTopology, %over%
library(raster)  # extent()
library(ggplot2)
# Projection
wgs.84 <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
# Load data
x <- c(76.82973, 76.82972, 76.82969, 76.83076, 76.83075, 76.83071, 76.83129, 76.83126, 76.83125)
y <- c(28.26734, 28.26644, 28.26508, 28.26778, 28.26733, 28.26507, 28.26912, 28.26732, 28.26687)
z <- c(-56.7879, -58.22462, -58.4211, -55.75333, -58.55153, -56.38619, -56.11011, -58.17415, -59.77212)
# Create data frame
dataset <- data.frame("LONGITUDE" = x, "LATITUDE" = y, "VALUES" = z)
# Create SpatialPointsDataFrame object
datasetSP <- SpatialPointsDataFrame(coords = dataset[,c(1,2)], data = data.frame("id" = 1:nrow(dataset), "values" = dataset$VALUES), proj4string = wgs.84)
# Extent
extentDatasetSP <-extent(datasetSP)
# Make grid options
# Cell size (map units)
xCellSizeGrid <- 0.001
yCellSizeGrid <- 0.001
# Grid
grid <- GridTopology(cellcentre.offset = c(extentDatasetSP@xmin, extentDatasetSP@ymin),
cellsize = c(xCellSizeGrid, yCellSizeGrid),
cells.dim = c(3, 7))
# Create SpatialGrid object
gridSpatial <- SpatialGrid(grid = grid, proj4string = wgs.84)
# Convert to SpatialPixels object
gridSpatialPixels <- as(gridSpatial, "SpatialPixels")
# Convert to SpatialPolygons object
gridSpatialPolygons <- as(gridSpatialPixels, "SpatialPolygons")
# Add 'id' and 'values' to every polygon
gridSpatialPolygons$id <- 1:nrow(coordinates(gridSpatialPolygons))
gridSpatialPolygons$values <- paste("Gridvalue", 1:nrow(coordinates(gridSpatialPolygons)), sep = ":")
# Get attributes from polygons
samplePointsInPolygons2 <- datasetSP %over% gridSpatialPolygons
ggplot(gridSpatialPolygons, aes(x = long, y = lat)) +
geom_polygon(color = "red") +
geom_point(data = dataset,
aes(x = LONGITUDE,
y = LATITUDE))
When it comes to spatial objects, ggplot2 (and tidyverse in general) seems to play nicer with sf than sp. The advice below is taken from one of the help files in the associated broom package:
Note that the sf package now defines tidy spatial objects and is the
recommended approach to spatial data. sp tidiers are likely to be
deprecated in the near future in favor of sf::st_as_sf(). Development
of sp tidiers has halted in broom.
Things should be fairly straightforward after conversion to sf.
sf::st_as_sf(gridSpatialPolygons) %>%
filter(id %in% samplePointsInPolygons2$id) %>% # keep only grid cells with data
ggplot() +
geom_sf(colour = "red") +
geom_sf_text(aes(label = values), # label cells
nudge_y = 0.0003, colour = "grey40") +
geom_point(data = dataset,
aes(x = LONGITUDE,
y = LATITUDE))
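For readers who only need the "keep grid cells that contain data" step, here is a dependency-free sketch in base R. It is not the sf approach from the answer: it assigns each point to a cell by integer division, and the col/row numbering is my own (counted from the lower-left corner), so it does not reproduce sp's cell ids.

```r
# Points from the question
x <- c(76.82973, 76.82972, 76.82969, 76.83076, 76.83075, 76.83071, 76.83129, 76.83126, 76.83125)
y <- c(28.26734, 28.26644, 28.26508, 28.26778, 28.26733, 28.26507, 28.26912, 28.26732, 28.26687)

cell <- 0.001  # cell size in map units, as in the question

# GridTopology's cellcentre.offset is the centre of the first cell,
# so the lower-left edge of the grid sits half a cell below/left of it
x_edge <- min(x) - cell / 2
y_edge <- min(y) - cell / 2

pts <- data.frame(col = floor((x - x_edge) / cell) + 1,
                  row = floor((y - y_edge) / cell) + 1,
                  n = 1)

# One row per occupied cell, with the number of points it contains
cells <- aggregate(n ~ col + row, data = pts, FUN = sum)

# Cell centres, usable as x/y for geom_tile() and geom_text()
cells$x <- x_edge + (cells$col - 0.5) * cell
cells$y <- y_edge + (cells$row - 0.5) * cell
cells
```

Because empty cells never appear in cells, plotting it with geom_tile() draws only the occupied part of the grid.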
I need to create a heat map with 3-digit zip code boundaries.
I have 3-digit zip codes and count data like this:
zip <- c(790, 791, 792, 793)
count <- c(0, 100, 20, 30)
TX <- data.frame(zip, count)
I also drew a map of Texas.
states <- map_data("state")
texas<- subset(states, region =="texas")
ggplot(data = texas) +
geom_polygon(aes(x = long, y = lat), fill = "gray", color = "black")
What I want to achieve is to (1) draw the boundaries of the 3-digit zip codes and (2) create the heat map using the count column. The outcome should look like this, with heat map coloring.
This question does not contain reproducible sample data, so I needed a good amount of time to deliver the following. Please provide minimal reproducible data and the code you tried next time. (I doubt that you really invested time in writing your code.)
Anyway, I think getting good polygon data for US zip codes is difficult without paying some money. This question provides good information. I obtained data from this link since the data was accessible. You will have to find suitable polygon data for yourself.
I also obtained data for the zip codes in Texas from here and saved it as "zip_code_database.csv."
I added an explanation to each piece of code below, so I will not write a thorough explanation here. Basically, you need to merge the polygon data by extracting the first three digits of the zip codes. You also need to aggregate whatever value you have in your data by the 3-digit zip code. The other thing is to find the center points of the polygons so that the zip codes can be added as labels.
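The prefix-and-aggregate step described above can be sketched in base R as follows. The zip codes and counts are invented for illustration; the only real technique is substr() followed by aggregate().

```r
# Hypothetical 5-digit zip codes with made-up counts
zips <- data.frame(zip = c("79001", "79015", "79101", "79107", "79201"),
                   count = c(5, 7, 40, 60, 20),
                   stringsAsFactors = FALSE)

# Keep zip codes as character so that leading zeros are not lost
zips$zip3 <- substr(zips$zip, 1, 3)

# Sum the counts within each 3-digit prefix
by_zip3 <- aggregate(count ~ zip3, data = zips, FUN = sum)
by_zip3
```

The resulting zip3 column is what would be matched against the group variable of the merged polygons.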
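# Load the packages used below
library(rgdal)         # readOGR
library(readr)         # read_csv
library(dplyr)         # filter, group_by, summarise
library(maptools)      # unionSpatialPolygons
library(rgeos)         # gCentroid
library(ggplot2)
library(ggalt)         # geom_cartogram
library(ggrepel)       # geom_text_repel
library(RColorBrewer)  # brewer.pal
# Prepare the zip poly data for US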
mydata <- readOGR(dsn = ".", layer = "cb_2016_us_zcta510_500k")
# Texas zip code data
zip <- read_csv("zip_code_database.csv")
tx <- filter(zip, state == "TX")
# Get polygon data for TX only
mypoly <- subset(mydata, ZCTA5CE10 %in% tx$zip)
# Create a new group with the first three digit.
# Drop unnecessary factor levels.
# Add a fake numeric variable, which is used for coloring polygons later.
mypoly$group <- substr(mypoly$ZCTA5CE10, 1,3)
mypoly$ZCTA5CE10 <- droplevels(mypoly$ZCTA5CE10)
mypoly$value <- sample.int(n = 10000, size = nrow(mypoly), replace = TRUE)
# Merge polygons using the group variable
# Create a data frame for ggplot.
mypoly.union <- unionSpatialPolygons(mypoly, mypoly$group)
mymap <- fortify(mypoly.union)
# Check how polygons are like
plot(mypoly.union, add = T, border = "red", lwd = 1)
# Convert SpatialPolygons to data frame and aggregate the fake values
mypoly.df <- as(mypoly, "data.frame") %>%
group_by(group) %>%
summarise(value = sum(value))
# Find a center point for each zip code area
centers <- data.frame(gCentroid(spgeom = mypoly.union, byid = TRUE))
centers$zip <- rownames(centers)
# Finally, drawing a graphic
ggplot() +
geom_cartogram(data = mymap, aes(x = long, y = lat, map_id = id), map = mymap) +
geom_cartogram(data = mypoly.df, aes(fill = value, map_id = group), map = mymap) +
geom_text_repel(data = centers, aes(label = zip, x = x, y = y), size = 3) +
scale_fill_gradientn(colours = rev(brewer.pal(10, "Spectral"))) +
coord_map()
I am trying to overlay a raster layer onto a map in ggplot. The raster layer contains likelihood surfaces for each time point from a satellite tag. I also want to set cumulative probabilities(95%, 75%, 50%) on the raster layer.
I have figured out how to show the raster layer on the ggplot map, but the coordinates are not aligned with one another. I tried giving both the same projection, but it does not seem to be working. I want them both to fit the boundaries of my model (xmin = 149, xmax = 154, ymin = -14, ymax = -8.75).
Attached is my r code and the figure result:
#load data
ncname <- "152724-13-GPE3"
ncfname <- paste(ncname, ".nc", sep = "")
ncin <- nc_open(ncfname)
StackedObject<-stack("152724-13-GPE3.nc", varname = "monthly_residency_distributions")
MergedObject<-overlay(StackedObject,fun=mean )
Boundaries<-extent(c(149, 154, -14, -8.75))
ExtendedObject<-extend(MergedObject, Boundaries)
Raster.HR<-resample(x=ExtendedObject, y=Raster.big, method="bilinear")
Raster.HR@data@values<- Raster.HR@data@values/sum(Raster.HR@data@values)
Raster.breaks <- c(RasterVals[max(which(cumsum(RasterVals)<= 0.05 ))], RasterVals[max(which(cumsum(RasterVals)<= 0.25 ))], RasterVals[max(which(cumsum(RasterVals)<= 0.50 ))], 1)
RasterCols<- c(Raster.cols(3))
#Create Map
shape2 <- readOGR(dsn = "/Users/shannonmurphy/Desktop/PNG_adm/PNG_adm1.shp", layer = "PNG_adm1")
map<- crop(shape2, extent(149, 154, -14, -8.75))
projection(map)<- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
p <- ggplot() + geom_polygon(data = map, aes(x = long, y = lat, group = group), color = "black", size = 0.25) + coord_map()
projection(Raster.HR)<- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
#plot raster and ggplot
par(mfrow=c(1,1), new = TRUE)
plot(Raster.HR, col=RasterCols, breaks=Raster.breaks, legend = NULL, bbox(map))
Please let me know if there is another package/line of code I should be using to do this! Appreciate any help
OK, I understand. You want to plot multiple raster layers in one ggplot, or to draw the raster object over your background polygon object. The problem with rasterVis::gplot is that it plots the raster directly and does not let you add another layer under or over it. This reminded me that I once had the same need and modified gplot to return the data as a tibble, so that you can manipulate it as much as you want with dplyr and then plot it with ggplot2. Thanks for the reminder; I have added it to my current GitHub library for later use!
Let's use a reproducible example to show this function:
Create datasets
Create a map of the world as a Raster, to be used as the background Raster map
Create a raster of data, here a distance from a point (limited to a maximum distance)
The code:
# Get world map
library(raster)
data(wrld_simpl, package = "maptools")
# Transform World as raster
r <- raster(wrld_simpl, res = 1)
wrld_r <- rasterize(wrld_simpl, r)
# Lets create a raster of data
pt1 <- matrix(c(100,0), ncol = 2)
dist1 <- distanceFromPoints(r, pt1)
values(dist1)[values(dist1) > 5e6] <- NA
# Plot both
plot(wrld_r, col = "grey")
plot(dist1, add = TRUE)
Function to extract (part of) raster values and transform as a tibble
#' Transform raster as data.frame to be later used with ggplot
#' Modified from rasterVis::gplot
#' @param x A Raster* object
#' @param maxpixels Maximum number of pixels to use
#' @details rasterVis::gplot is nice to plot a raster in a ggplot but
#' if you want to plot different rasters on the same plot, you are stuck.
#' If you want to add other information or transform your raster as a
#' category raster, you can not do it. With `SDMSelect::gplot_data`, you retrieve your
#' raster as a tibble that can be modified as wanted using `dplyr` and
#' then plot in `ggplot` using `geom_tile`.
#' If Raster has levels, they will be joined to the final tibble.
#' @export
gplot_data <- function(x, maxpixels = 50000) {
x <- raster::sampleRegular(x, maxpixels, asRaster = TRUE)
coords <- raster::xyFromCell(x, seq_len(raster::ncell(x)))
## Extract values
dat <- utils::stack(as.data.frame(raster::getValues(x)))
names(dat) <- c('value', 'variable')
dat <- dplyr::as_tibble(data.frame(coords, dat))
if (!is.null(levels(x))) {
dat <- dplyr::left_join(dat, levels(x)[[1]],
by = c("value" = "ID"))
}
dat
}
Plot multiple rasters in ggplot
You can use gplot_data to transform any raster into a tibble. You can then apply any modification using dplyr and plot the result in ggplot with geom_tile. The interesting point is that you can use geom_tile as many times as you want with different raster data, provided that the fill values are comparable. Otherwise, you can use the trick below to remove NA values from the background raster map and use a single fill colour.
# With gplot_data
# Transform rasters as data frame
gplot_wrld_r <- gplot_data(wrld_r)
gplot_dist1 <- gplot_data(dist1)
# To define a unique fill colour for the world map,
# you need to remove NA values in gplot_wrld_r which
# can be done with dplyr::filter
ggplot() +
geom_tile(data = dplyr::filter(gplot_wrld_r, !is.na(value)),
aes(x = x, y = y), fill = "grey20") +
geom_tile(data = gplot_dist1,
aes(x = x, y = y, fill = value)) +
scale_fill_gradient(low = 'yellow', high = 'blue',
na.value = NA)
Plot raster over polygons
Of course, with a background map as a polygon object, this trick also let you add your raster over it:
wrld_simpl_sf <- sf::st_as_sf(wrld_simpl)
ggplot() +
geom_sf(data = wrld_simpl_sf, fill = "grey20",
colour = "white", size = 0.2) +
geom_tile(data = gplot_dist1,
aes(x = x, y = y, fill = value)) +
scale_fill_gradient(low = 'yellow', high = 'blue',
na.value = NA)
EDIT: gplot_data is now in this simple R package: https://github.com/statnmap/cartomisc
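The question also asks about cumulative probability levels (95%, 75%, 50%). The following is only a sketch of one way to derive break values, assuming the raster values have been pulled out into a plain numeric vector of cell probabilities; mass_threshold is a hypothetical helper name and the four probabilities are toy values.

```r
# Smallest cell value such that all cells at or above it together hold
# at least `mass` of the total probability
mass_threshold <- function(p, mass) {
  p <- p[!is.na(p)]
  p <- p / sum(p)                   # normalise so the values sum to 1
  s <- sort(p, decreasing = TRUE)   # most likely cells first
  s[which(cumsum(s) >= mass)[1]]
}

# Toy probability surface with four cells
p <- c(0.4, 0.3, 0.2, 0.1)

breaks <- sapply(c(0.50, 0.75, 0.95), function(m) mass_threshold(p, m))
breaks
```

Cells with a value at or above the 0.50 break form the 50% region, and so on; the breaks could then be used to cut the value column returned by gplot_data before plotting.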
I have a problem with my heatmap, which displays the density LEVEL, but doesn't say anything about the density count. (how many points are in the same area for example).
My data is divided in more columns, but the most important ones are: lat,lon.
I would like to have something like this, but with "count" : https://stackoverflow.com/a/24615674/5316566,
however, when I try to apply the code used in that answer, my maximum density "level" doesn't reflect my density count. (Instead of 7500 I get, for example, 6, even though I have thousands and thousands of data points concentrated in one area.)
That's my code:
us_map_g_str <- get_map(location = c(-90.0,41.5,-81.0,42.7), zoom = 7)
ggmap(us_map_g_str, extent = "device") +
geom_tile(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat)), size = 0.3) +
stat_density2d(data = data1, aes(x = as.numeric(lon), y = as.numeric(lat), fill = ..level.., alpha = ..level..), size = 0.3, bins = 10, geom = "polygon") +
scale_fill_gradient(name= "Ios",low = "green", high = "red", trans= "exp") +
scale_alpha(range = c(0, 0.3), guide = FALSE)
This is what I get:
This is part of the data:
lat lon tag device
1 43.33622 -83.67445 0 iPhone5
2 43.33582 -83.69964 0 iPhone5
3 43.33623 -83.68744 0 iPhone5
4 43.33584 -83.72186 0 iPhone5
5 43.33616 -83.67526 0 iPhone5
6 43.25040 -83.78234 0 iPhone5
(The "tag" column is not important)
I realised that my previous answer needs to be revised. So, here it is. If you want to find out how many data points exist in each level of a contour, you actually have a lot of things to do. If you are happy to use the leaflet option below, your life would be much easier.
First, let's get a map of Detroit, and create a sample data frame.
mymap <- get_map(location = "Detroit", zoom = 8)
### Create a sample data
mydata <- data.frame(long = runif(min = -84, max = -82.5, n = 100),
lat = runif(min = 42, max = 42.7, n = 100))
Now, we draw a map and save it as g.
g <- ggmap(mymap) +
stat_density2d(data = mydata,
aes(x = long, y = lat, fill = ..level..),
size = 0.5, bins = 10, geom = "polygon")
The real job begins here. To find the number of data points in each level, you want to use the data frame that ggplot generates. It contains the polygon data used to draw the level lines. You can see this in the following image, in which I draw three levels on a map.
### Create a data frame so that we can find how many data points exist
### in each level.
mydf <- ggplot_build(g)$data[[4]]
### Check where the polygon lines are. This is just for a check.
check <- ggmap(mymap) +
geom_point(data = mydata, aes(x = long, y = lat)) +
geom_path(data = subset(mydf, group == "1-008"), aes(x = x, y = y)) +
geom_path(data = subset(mydf, group == "1-009"), aes(x = x, y = y)) +
geom_path(data = subset(mydf, group == "1-010"), aes(x = x, y = y))
The next step is to create a level vector for a legend. We group the data by group (e.g., 1-010) and take the first row of each group using slice(). Then we ungroup the data and choose the 2nd column. Finally, we create a vector with unlist(). We come back to lev in the end.
mydf %>%
group_by(group) %>%
slice(1) %>%
ungroup %>%
select(2) %>%
unlist -> lev
Now we split the polygon data (i.e., mydf) by group and create a polygon for each level. Since we have 11 levels (11 polygons), we use lapply(). In the lapply loop, we need to: 1) extract the columns for longitude and latitude, 2) create a polygon, 3) convert the polygon to spatial polygons, 4) assign a CRS, 5) create a dummy data frame, and 6) create a SpatialPolygonsDataFrame.
mylist <- split(mydf, f = mydf$group)
test <- lapply(mylist, function(x){
xy <- x[, c(3,4)]
circle <- Polygon(xy, hole = as.logical(NA))
SP <- SpatialPolygons(list(Polygons(list(circle), ID = "1")))
proj4string(SP) <- CRS("+proj=longlat +ellps=WGS84")
df <- data.frame(value = 1, row.names = "1")
circleDF <- SpatialPolygonsDataFrame(SP, data = df)
})
Now we go back to the original data. What we need to do is convert the data frame to a SpatialPointsDataFrame. This is because we need to subset the data and find how many data points exist in each polygon (in each level). First, get long and lat from your data.frame. Make sure that the order is lon/lat.
xy <- mydata[,c(1,2)]
Then, we create the SPDF (SpatialPointsDataFrame). You want the spatial polygons and the spatial points data to have an identical proj4string.
spdf <- SpatialPointsDataFrame(coords = xy, data = mydata,
proj4string = CRS("+proj=longlat +ellps=WGS84"))
Then, we subset data (mydata) using each polygon.
ana <- lapply(test, function(y){
mydf <- as.data.frame(spdf[y, ])
})
Data points overlap across levels, so we have duplication. First we find the unique data points for each level. We bind the data frames in ana for the inner levels into one data frame, foo1. We also create foo2, the data frame for the level whose unique data points we want to count. We make sure that the column names are identical between foo1 and foo2. Using setdiff() and nrow(), we can find the number of data points unique to each level.
total <- lapply(11:2, function(x){
foo1 <- bind_rows(ana[c(11:x)])
foo2 <- as.data.frame(ana[x-1])
names(foo2) <- names(foo1)
nrow(setdiff(foo2, foo1))
})
Finally, we need to find the number of data points in the innermost level, which is level 11. We take the data frame for level 11 in ana and count its rows.
bob <- nrow(as.data.frame(ana[11]))
out <- c(bob,unlist(total))
### check if total is 100
### sum(out)
### [1] 100
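The counting logic above can be illustrated without any spatial objects. This is a simplified sketch, assuming each level's polygon is fully nested inside the previous one, so that comparing a level with the next inner level is enough; the point ids are invented.

```r
# Hypothetical point ids falling inside each contour polygon,
# ordered from the outermost level to the innermost
inside <- list(level1 = 1:10,
               level2 = c(2, 3, 4, 5, 6, 7),
               level3 = c(4, 5))

k <- length(inside)

# Points unique to a level are those inside it but not inside the
# next inner level; the innermost level keeps all of its points
unique_n <- vapply(seq_len(k), function(i) {
  if (i == k) {
    length(inside[[i]])
  } else {
    length(setdiff(inside[[i]], inside[[i + 1]]))
  }
}, integer(1))
names(unique_n) <- names(inside)

unique_n
sum(unique_n)  # equals the number of points inside the outermost level
```

With real contours a level can consist of several separate polygons, which is why the answer binds all inner levels together before calling setdiff().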
We assign reversed out as names for lev. This is because we want to show how many data points exist for each level in a legend.
names(lev) <- rev(out)
Now we are ready to add a legend.
final <- g +
scale_fill_continuous(name = "Total",
guide = guide_legend(),
breaks = lev)
If you use the leaflet package, you can group data points at different zoom levels. Leaflet counts the data points in certain areas and shows the numbers in circles, as in the following figure. The more you zoom in, the more leaflet breaks the data points up into small groups. In terms of workload, this is much lighter. In addition, your map is interactive. This may be a better option.
leaflet(mydata) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions())
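If the goal is simply a count per area rather than a density level, another option (a different technique from the contour approach above) is to bin the points on a regular grid and count them. The sketch below uses the six rows shown in the question; the bin width of 0.05 degrees is an arbitrary choice for illustration.

```r
# The six rows shown in the question
pts <- data.frame(
  lat = c(43.33622, 43.33582, 43.33623, 43.33584, 43.33616, 43.25040),
  lon = c(-83.67445, -83.69964, -83.68744, -83.72186, -83.67526, -83.78234)
)

bin <- 0.05  # bin width in degrees; an arbitrary choice for illustration

# Snap every point to the lower-left corner of its bin
pts$lat_bin <- floor(pts$lat / bin) * bin
pts$lon_bin <- floor(pts$lon / bin) * bin
pts$n <- 1

# Number of points per occupied bin
counts <- aggregate(n ~ lon_bin + lat_bin, data = pts, FUN = sum)
counts
```

In ggplot2, geom_bin2d() performs the same kind of binning and maps the count to fill, so the legend then shows counts instead of density levels.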
I am attempting to make a map with three layers using ggmap. The layers are as follows:
A map of the US (toner-lite)
a set of geometries that color the states on some value (simulated data below)
labels for the state names, as annotations in the center of each state.
To do this I have created a map of US states with states colored by a randomized value (rnorm) and this part is successful. From here I am attempting to print the abbreviations of each state at the longitude and latitude coordinates of each state's center, using geom_text. The part that fails is the 'geom_text' overlay, with the following error:
Error: 'x' and 'units' must have length > 0 In addition: Warning
messages: 1: In gpclibPermit() : support for gpclib will be
withdrawn from maptools at the next major release 2: Removed 855070
rows containing missing values (geom_text).
Here is the script, which I have worked hard to make run on its own. It downloads the shapefile and the center-of-state data, and simulates data to fill the states. I've tested it, and it works up to the part I have commented out (the geom_text layer).
I have searched for answers to this already, so please let me know if you have any advice on how to do what I am attempting. If there is a better strategy for placing labels on top of the polygon fills, I am all ears (or eyes in this case).
###Combining Census data with a tract poly shapefile
#Set working directory to where you want your files to exist (or where they already exist)
#Read and translate coord data for shape file of US States
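# The download step is not shown in the original; the condition below is an assumed reconstruction
if (!file.exists('tl_2014_us_state.shp')) {
files <- unzip('tl_2014_us_state.zip')
tract <- readOGR(".","tl_2014_us_state") %>% spTransform(CRS("+proj=longlat +datum=WGS84"))
} else {
tract <- readOGR(".","tl_2014_us_state") %>% spTransform(CRS("+proj=longlat +datum=WGS84"))
}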
#two column dataset of state abbreviations and center of state
#Downloadable from: https://dev.maxmind.com/static/csv/codes/state_latlon.csv
centers <- read.csv('state_latlon.csv')
#Change values of longitude and latitude from state center data so as not to interfere with shapefile at merge
names(centers)[2:3] <- c('long_c','lat_c')
#simulated data for plotting values
mydata<- data.frame(rnorm(55, 0, 1)) #55 "states" in the coord dataset for state centers
names(mydata)[1] <- 'value'
#hold names in tract dataset and for simulated data
#Turn geo data into R dataframe
#Merge state geo data with simulated data
state_data <- cbind(centers,mydata)
#merge state center and value data with shapefile data
tract_poly <- merge(state_data,tract_geom,by.x="state",by.y="id", all = F)
#Create map of US
mymap <- get_stamenmap(bbox = c(left = -124.848974,
bottom = 24.396308,
right = -66.885444,
top = 49.384358),zoom=5,
maptype="toner-lite")
#This plots a map of the US with just the state names as labels (and a few other landmarks). Used for reference
USMap <- ggmap(mymap,extent='device') +
geom_polygon(aes(x = long, y = lat, group = group, fill = value),
data = tract_poly,
alpha = 1,
color = "black",
size = 0.2) #+
# geom_text(aes(x = long_c, y = lat_c, group = group, label = state),
# data= tract_poly,
# alpha = 1,
# color = "black")
That's a strange error message for what ended up being the problem. Somewhere along the way you have flipped the latitude and longitude in centers. (I also took elpi's advice above into account and avoided plotting the initials repeatedly by using your centers dataset directly.) The code below works, but I'd recommend fixing your centers dataset.
centers$new_long <- centers$lat_c
centers$new_lat <- centers$long_c
USMap <- ggmap(mymap,extent='device') +
geom_polygon(aes(x = long, y = lat, group = group, fill = value),
data = tract_poly,
alpha = 1,
color = "black",
size = 0.2) +
geom_text(aes(x = new_long, y = new_lat, label = state),
data= centers,
alpha = 1,
color = "black")
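A quick sanity check can catch this kind of swap before plotting. The sketch below is a heuristic of my own for US data (negative longitudes, positive latitudes); us_coords_plausible is a hypothetical helper, and the two state centers are approximate, illustrative values.

```r
# Approximate, illustrative state centers with the columns swapped,
# mimicking the situation in the question
centers <- data.frame(state  = c("TX", "NY"),
                      long_c = c(31.05, 42.17),    # these are really latitudes
                      lat_c  = c(-97.56, -74.95))  # these are really longitudes

# Heuristic for US data only: longitudes are negative and
# latitudes lie between 0 and 90
us_coords_plausible <- function(long, lat) {
  all(long < 0 & lat > 0 & lat <= 90, na.rm = TRUE)
}

us_coords_plausible(centers$long_c, centers$lat_c)  # swapped columns fail the check
us_coords_plausible(centers$lat_c, centers$long_c)  # correct order passes
```

Running such a check right after renaming the columns would have pointed at centers rather than at geom_text.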
Try this
centroids <- setNames(do.call("rbind.data.frame", by(tract_poly, tract_poly$group, function(x) {Polygon(x[c('long', 'lat')])@labpt})), c('long', 'lat'))
centroids$label <- tract_poly$state[match(rownames(centroids), tract_poly$group)]
USMap + with(centroids, annotate(geom="text", x = long, y=lat, label = label, size = 2.5))
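For intuition about what a polygon label point is, here is a base R sketch of the area-weighted centroid of a single ring (the shoelace formula). It is an illustration under the assumption of a simple, non-self-intersecting ring, not a claim about how sp computes labpt internally; ring_centroid is a hypothetical helper name.

```r
# Area-weighted centroid of a simple polygon ring (shoelace formula)
ring_centroid <- function(x, y) {
  n <- length(x)
  # Close the ring if the first vertex is not repeated at the end
  if (x[1] != x[n] || y[1] != y[n]) {
    x <- c(x, x[1])
    y <- c(y, y[1])
    n <- n + 1
  }
  i <- seq_len(n - 1)
  cross <- x[i] * y[i + 1] - x[i + 1] * y[i]
  area <- sum(cross) / 2  # signed area; the sign cancels in the ratios below
  c(x = sum((x[i] + x[i + 1]) * cross) / (6 * area),
    y = sum((y[i] + y[i + 1]) * cross) / (6 * area))
}

# A 2 x 2 square with its lower-left corner at the origin
ctr <- ring_centroid(c(0, 2, 2, 0), c(0, 0, 2, 2))
ctr
```

For states made of several rings (islands, for example), grouping by group as in the code above yields one label point per ring, which is why the labels are then matched back to states.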