Thanks to help from some users on this site, I was able to get a nice map plot for some data using geom_point. (Get boundaries to come through on states) However, now I'm trying to clean it up as I have more years to plot and want to make sure the plot is working and providing good information. After some further research, it seems the geom_tile would actually be better for this as it would shy away from points and use a gradient.
The problem I'm running into is getting the code to work with geom_tile. It isn't plotting anything and I'm not sure why.
Here's the dataset :
https://www.dropbox.com/s/0evuvrlm49ab9up/PRISM_1895_db.csv?dl=0
Here's the original code with geom_points :
PRISM_1895_db <- read.csv("/.../PRISM_1895_db.csv")
regions<- c("north dakota","south dakota","nebraska","kansas","oklahoma","texas","minnesota","iowa","missouri","arkansas", "illinois", "indiana", "wisconsin")
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
geom_point(data = PRISM_1895_db, aes(x = longitude, y = latitude, color = APPT), alpha = .5, size = 3.5) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
And here is the code I've been trying, but none of the data is showing up.
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
geom_tile(data = PRISM_1895_db, aes(x = longitude, y = latitude, fill = APPT), alpha = 0.5, color = NA) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
geom_tile needs your x and y values to be sampled on a regular grid, because it has to tile the surface with rectangles. Your data is irregularly sampled, so it's not possible to divide the raw data into a set of evenly sized tiles.
One option is to use the stat_summary2d layer to divide your data into boxes and calculate the average APPT for all points in that box. This will allow you to create regular tiles. For example
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
stat_summary2d(data=PRISM_1895_db, aes(x = longitude, y = latitude, z = APPT)) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
You can look at the layer's other options to control the bin sizes if you like. Note that this "smooths" the data by taking averages inside the bins.
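As a rough sketch of controlling the bin size: the layer accepts binwidth (or bins) and a summary function fun. The example below uses stat_summary_2d, the current spelling of stat_summary2d, on a synthetic data frame pts that only stands in for PRISM_1895_db. The coordinates and APPT values are made up for illustration, and the 1-degree bin width is an arbitrary choice.

```r
library(ggplot2)

# Synthetic, irregularly spaced stand-in for PRISM_1895_db (hypothetical values)
set.seed(1)
pts <- data.frame(
  longitude = runif(500, -104, -87),
  latitude  = runif(500, 30, 49)
)
pts$APPT <- 20 + 0.5 * (pts$longitude + 104) + rnorm(500)

p <- ggplot(pts, aes(x = longitude, y = latitude, z = APPT)) +
  # 1-degree by 1-degree bins; each tile is filled with the mean APPT
  # of the points that fall inside it
  stat_summary_2d(fun = mean, binwidth = c(1, 1))

# One row per non-empty bin, with the summarised value for that bin
binned <- layer_data(p, 1)
```

Smaller bin widths keep more detail but leave more empty tiles where no points fall, so the right value depends on how densely the stations are spaced.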
I am practicing the ggplot2 grammar of graphics on World maps using the base R dataset and the ggplot2 and mapproj packages.
When building a map that colours countries by a random variable (called "colour" in the following example):
world_map <- map_data("world")
country_colours <- data.frame(region = c(names(table(world_map$region))),
colour= sample(c(1:20), length(names(table(world_map$region))), replace = TRUE))
world_map <- merge(world_map, country_colours)
world_map <- world_map[order(world_map$region, world_map$order),]
I happened to include the aes(fill) argument in the ggplot component:
ggplot(world_map[abs(world_map$long) < 180,], aes(x=long, y=lat, group=group, fill=colour)) +
geom_polygon(color="black") +
coord_map(projection = "mercator")
Now, if I move it into the geom_polygon() layer instead, I get EXACTLY the same output:
ggplot(world_map[abs(world_map$long) < 180,], aes(x=long, y=lat, group=group)) +
geom_polygon(aes(fill=colour), color="black") +
coord_map(projection = "mercator")
I would like to understand the conceptual difference between placing the fill aesthetic in one place or the other.
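One way to see the difference, sketched on a toy data frame rather than the world map: aesthetics set in ggplot() are inherited by every layer, while aesthetics set inside a geom apply to that layer only. With a single layer the two spellings give the same result, and the difference only shows once a second layer is added. The data frame sq and the extra geom_point layer are assumptions made purely for illustration.

```r
library(ggplot2)

# Two unit squares standing in for two countries (hypothetical data)
sq <- data.frame(
  long   = c(0, 1, 1, 0, 2, 3, 3, 2),
  lat    = c(0, 0, 1, 1, 0, 0, 1, 1),
  group  = rep(c("a", "b"), each = 4),
  colour = rep(c(3, 7), each = 4)
)

# fill mapped globally: every layer inherits it
p_global <- ggplot(sq, aes(x = long, y = lat, group = group, fill = colour)) +
  geom_polygon(color = "black")

# fill mapped in the polygon layer only
p_layer <- ggplot(sq, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = colour), color = "black")

# With one layer, the computed fills are the same
same_fill <- identical(layer_data(p_global, 1)$fill, layer_data(p_layer, 1)$fill)

# Add a point layer whose shape (21) has a fill: it picks up the mapping
# only when fill was set in ggplot()
pts_global <- layer_data(p_global + geom_point(shape = 21, size = 3), 2)
pts_layer  <- layer_data(p_layer  + geom_point(shape = 21, size = 3), 2)
```

Here pts_global carries one fill colour per square, whereas pts_layer falls back to the point geom's default fill.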
For my PhD project, I'd like to show my sampling sites (coordinates) first on a map of NZ and then on a zoomed-in view of a region (with coordinates that I pick myself), so the sampling sites in that specific region are visible. I am very new to R and I am finding it a bit frustrating.
I managed to build a map of NZ (code follows), but how can I add the data points to it, and how can I create a zoomed-in view of a certain region with the data points on it as well?
NZ <- map_data("nz",xlim = c(166, 179), ylim = c(-48, -34))
ggplot() +
geom_path(aes(long, lat, group=group), data=NZ, color="black") +
coord_equal() +
scalebar(NZ, dist = 100, dist_unit = "km", st.size=3, height=0.01, model = 'WGS84', transform = TRUE)
Thanks to whoever will help me!!
For example:
library(tidyverse)
dunedin <- tibble(X=170.5, Y=-45 - 52/60, Text="Dunedin")
NZ <- map_data("nz",xlim = c(166, 179), ylim = c(-48, -34))
ggplot() +
geom_path(aes(long, lat, group=group), data=NZ, color="black") +
geom_point(data=dunedin, aes(x=X, y=Y), colour="blue") +
geom_label(data=dunedin, aes(x=X, y=Y, label=Text), colour="blue", nudge_x=1) +
coord_equal()
Incidentally, scalebar isn't part of ggplot2, so your example isn't self-contained. That's not a major issue here, but could be in another situation.
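For the zoomed-in view, a minimal sketch is to put the limits on the coordinate system, which crops the view without dropping any of the map data. The Dunedin coordinates are taken from the example above; the second site, the data frame name sites, and the xlim/ylim values are made up for illustration.

```r
library(ggplot2)
library(maps)  # provides the "nz" database used by map_data()

NZ <- map_data("nz")

# Dunedin comes from the example above; the second site is hypothetical
sites <- data.frame(
  X    = c(170.5, 169.3),
  Y    = c(-45 - 52/60, -46.1),
  Text = c("Dunedin", "Site B (hypothetical)")
)

zoomed <- ggplot() +
  geom_path(aes(long, lat, group = group), data = NZ, color = "black") +
  geom_point(aes(x = X, y = Y), data = sites, colour = "blue") +
  geom_label(aes(x = X, y = Y, label = Text), data = sites,
             colour = "blue", nudge_y = 0.15) +
  # Limits on the coord zoom the view; the outline outside is clipped, not removed
  coord_fixed(xlim = c(168, 172), ylim = c(-47, -45))

site_layer <- layer_data(zoomed, 2)
```

Printing zoomed draws the regional map. The same sites data frame can be added to the full NZ map by leaving out the xlim and ylim arguments.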
Alright, so I'm struggling a bit in creating this map. The following code gives me this map, which is the map that I really want to use.
map(database= "world", ylim=c(15,90), xlim=c(-180,-24), fill = TRUE, projection = 'gilbert')
This is the code I used to save the map information.
map.dat <- map_data(map(database= "world", ylim=c(15,90), xlim=c(-180,-24), fill = TRUE, projection = 'gilbert'))
Now, when I run the following code, it gives me the error 'Error in eval(expr, envir, enclos) : object 'group' not found'. I'm not sure what that means.
ggplot(map.dat, aes(x=long, y=lat, group=group, fill=region)) +
geom_polygon() +
geom_point(data = basindf, aes(x = basindf$latitude, y = basindf$longitude)) +
theme(legend.position = "none")
I had set 'group = NULL' and 'fill = NULL' and that seems to allow me to plot, but it only displays this, which is not what I want. The map is gone!
What can I do to fix this? Also, I want to move away from the points and create lines. How would I be able to make lines based on a certain id?
EDIT: Seems that some of you needed basindf to troubleshoot. I've added the first 20 lines below.
"","id","year","month","date","basin","latitude","longitude","wind speed"
"1","1902276N14266",1902,"October",1902-10-03,"EP",-93.8,14,30
"2","1902276N14266",1902,"October",1902-10-03,"EP",-94,14.5,30
"3","1902276N14266",1902,"October",1902-10-03,"EP",-94.2,15,30
"4","1902276N14266",1902,"October",1902-10-03,"EP",-94.3,15.5,30
"5","1902276N14266",1902,"October",1902-10-04,"EP",-94.4,16,30
"6","1902276N14266",1902,"October",1902-10-04,"EP",-94.5,16.5,30
"7","1902276N14266",1902,"October",1902-10-04,"EP",-94.6,17,30
"8","1902276N14266",1902,"October",1902-10-04,"EP",-94.7,17.5,30
"9","1902276N14266",1902,"October",1902-10-05,"EP",-94.8,18,30
"10","1902276N14266",1902,"October",1902-10-05,"EP",-94.9,18.5,30
"11","1902276N14266",1902,"October",1902-10-05,"NA",-94.9,18.7,35
"12","1902276N14266",1902,"October",1902-10-05,"NA",-94.7,18.8,45
"13","1902276N14266",1902,"October",1902-10-06,"NA",-94.4,18.9,55
"14","1902276N14266",1902,"October",1902-10-06,"NA",-94,19.1,60
"15","1902276N14266",1902,"October",1902-10-06,"NA",-93.7,19.3,65
"16","1902276N14266",1902,"October",1902-10-06,"NA",-93.3,19.5,75
"17","1902276N14266",1902,"October",1902-10-07,"NA",-92.9,19.7,85
"18","1902276N14266",1902,"October",1902-10-07,"NA",-92.5,20,90
"19","1902276N14266",1902,"October",1902-10-07,"NA",-92,20.3,90
"20","1902276N14266",1902,"October",1902-10-07,"NA",-91.5,20.7,90
You have two main problems.
First, the error you are getting is because you are specifying aes() in the ggplot() call, which means that those values are inherited by all layers. That means it's trying to set group= in the geom_point layer as well, but the data for that layer has no group column. You can disable the inherited aesthetics with
ggplot(map.dat, aes(x=long, y=lat, group=group, fill=region)) +
geom_polygon() +
geom_point(data = basindf, aes(x = basindf$latitude, y = basindf$longitude), inherit.aes=FALSE) +
theme(legend.position = "none")
or you can specify the aes per layer
ggplot(map.dat) +
geom_polygon(aes(x=long, y=lat, group=group, fill=region)) +
geom_point(data = basindf, aes(x = basindf$latitude, y = basindf$longitude)) +
theme(legend.position = "none")
Your other problem is that you transformed your map data with a projection but not your point data.
You can transform your data with mapproj so they are both on the same scale
ggplot(map.dat) +
geom_polygon(aes(x=long, y=lat, group=group, fill=region)) +
geom_point(data = data.frame(mapproject(basindf$latitude, basindf$longitude, "gilbert")[c("x", "y")]), aes(x = x, y = y)) +
theme(legend.position = "none")
The reason it was not working is that you set global aes parameters in the ggplot() call, and ggplot2 was looking for group and region in the geom_point layer to group and fill the points.
This technically works:
library(maps)
library(ggplot2)
ggplot() +
geom_polygon(data = map.dat, aes(x =long, y = lat, group = group, fill = region)) +
geom_point(data = basindf, aes(x = latitude, y = longitude)) +
theme(legend.position = "none")
You can see your map in the bottom right, very tiny. You want to rescale your map to lat/long, or your data to whatever you have in your map.
EDIT: see the answer from @MrFlick for plot rescaling.
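On the second part of the question (lines instead of points, one per id), a minimal sketch is geom_path with group = id, which connects the rows of each id in the order they appear. The first track below uses the first four rows of the posted basindf sample; the second track and its id are made up so that there is more than one line to draw. Note that in the posted sample the column named latitude holds values such as -93.8, so it goes on the x axis here, as in the code above.

```r
library(ggplot2)

tracks <- data.frame(
  # first id is from the posted sample; the second is hypothetical
  id        = rep(c("1902276N14266", "hypothetical-track"), each = 4),
  latitude  = c(-93.8, -94.0, -94.2, -94.3, -90.0, -89.5, -89.0, -88.5),
  longitude = c(14.0, 14.5, 15.0, 15.5, 20.0, 20.5, 21.0, 21.5)
)

p <- ggplot(tracks, aes(x = latitude, y = longitude, group = id, colour = id)) +
  # geom_path keeps the row order within each id, which suits storm tracks;
  # geom_line would instead sort the points by x
  geom_path()

path_data <- layer_data(p, 1)
```

To draw these lines over the projected map, the coordinates would need the same mapproject() transformation as the points.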
Edit 7 :
After quite a bit of help, I've been able to get a map that is getting close to the results I need. But I still need to have the state boundaries come through on the map, but I can't figure it out. In order to make a reproducible example that would be appropriate I need to link to the data set since the dput is so large.
To make things easy, I subset only three states, but the boundary lines do not show up there. I would like the boundary lines to come through the plot as white lines, like they do on the rest of the map. Thanks for your help.
Dataset :
https://www.dropbox.com/s/0evuvrlm49ab9up/PRISM_1895_db.csv?dl=0
Rep Code :
PRISM_1895_db <- read.csv("PRISM_1895_db.csv")
regions<- c("north dakota","south dakota","nebraska","kansas","oklahoma","texas","minnesota","iowa","missouri","arkansas", "illinois", "indiana", "wisconsin")
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), col="white") +
geom_point(data = PRISM_1895_db, aes(x = longitude, y = latitude, color = APPT), alpha = .5, size = 3.5)
Graph :
The order in which you draw the layers matters. If you want the white lines on top, you'll need to add them last. And if you want the black shapes in the background, you need to draw them first. So basically you need to split the states into two draws: the background and the outline.
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
geom_point(data = PRISM_1895_db, aes(x = longitude, y = latitude, color = APPT), alpha = .5, size = 3.5) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
I'm trying to draw a choropleth map of Germany showing poverty rate by state (inspired by this question).
The problem is that some of the states (Berlin, for example) are completely surrounded by other states (Brandenburg), and I'm having trouble getting ggplot to recognize the "hole" in Brandenburg.
The data for this example is here.
library(rgdal)
library(ggplot2)
library(RColorBrewer)
map <- readOGR(dsn=".", layer="germany3")
pov <- read.csv("gerpoverty.csv")
mrg.df <- data.frame(id=rownames(map@data),ID_1=map@data$ID_1)
mrg.df <- merge(mrg.df,pov, by="ID_1")
map.df <- fortify(map)
map.df <- merge(map.df,mrg.df[,c("id","poverty")], by="id")
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(aes(fill=poverty))+
geom_path(colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
Notice how the colors for Berlin and Brandenburg (in the northeast) are identical. They shouldn't be: Berlin's poverty rate is much lower than Brandenburg's. It appears that ggplot is rendering the Berlin polygon and then rendering the Brandenburg polygon over it, without the hole.
If I change the call to geom_polygon(...) as suggested here, I can fix the Berlin/Brandenburg problem, but now the three northernmost states are rendered incorrectly.
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(aes(group=poverty, fill=poverty))+
geom_path(colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
What am I doing wrong??
This is just an expansion on @Ista's answer, which does not require that one knows which states (Berlin, Bremen) need to be rendered last.
This approach takes advantage of the fact that fortify(...) generates a column, hole, which identifies whether a group of coordinates is a hole. So this renders all regions (ids) that have any holes before (i.e. underneath) the regions without holes.
Many thanks to @Ista, without whose answer I could not have come up with this (believe me, I spent many hours trying...)
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(data=map.df[map.df$id %in% map.df[map.df$hole,]$id,],aes(fill=poverty))+
geom_polygon(data=map.df[!map.df$id %in% map.df[map.df$hole,]$id,],aes(fill=poverty))+
geom_path(colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
You can plot the island polygons in a separate layer, following the example on the ggplot2 wiki. I've modified your merging steps to make this easier:
mrg.df <- data.frame(id=rownames(map@data),ID_1=map@data$ID_1)
mrg.df <- merge(mrg.df,pov, by="ID_1")
map.df <- fortify(map)
map.df <- merge(map.df,mrg.df, by="id")
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(aes(fill=poverty), color = "grey50", data =subset(map.df, !Id1 %in% c("Berlin", "Bremen")))+
geom_polygon(aes(fill=poverty), color = "grey50", data =subset(map.df, Id1 %in% c("Berlin", "Bremen")))+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
As an unsolicited act of evangelism, I encourage you to consider something like
library(ggmap)
qmap("germany", zoom = 6) +
geom_polygon(aes(x=long, y=lat, group=group, fill=poverty),
color = "grey50", alpha = .7,
data =subset(map.df, !Id1 %in% c("Berlin", "Bremen")))+
geom_polygon(aes(x=long, y=lat, group=group, fill=poverty),
color = "grey50", alpha= .7,
data =subset(map.df, Id1 %in% c("Berlin", "Bremen")))+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))
to provide context and familiar reference points.
Just to add another small improvement to @Ista's and @jhoward's answers (thanks a lot for your help!).
@jhoward's modification could easily be wrapped in a small function like this
gghole <- function(fort){
poly <- fort[fort$id %in% fort[fort$hole,]$id,]
hole <- fort[!fort$id %in% fort[fort$hole,]$id,]
out <- list(poly,hole)
names(out) <- c('poly','hole')
return(out)
}
# input has to be a fortified data.frame
Then, one doesn't need to recall every time how to extract holes info. The code would look like
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(data=gghole(map.df)[[1]],aes(fill=poverty),colour="grey50")+
geom_polygon(data=gghole(map.df)[[2]],aes(fill=poverty),colour="grey50")+
# (optionally). Call by name
# geom_polygon(data=gghole(map.df)$poly,aes(fill=poverty),colour="grey50")+
# geom_polygon(data=gghole(map.df)$hole,aes(fill=poverty),colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
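To check what the helper returns without needing the shapefile, here is a small self-contained sketch on a toy fortified-style data frame. The data frame toy and its ids are made up, and the function body is a lightly restated version of gghole above, keeping its naming: poly holds the regions that contain at least one hole, and hole holds the regions without any.

```r
# Same split as gghole above, restated so this sketch runs on its own
gghole <- function(fort) {
  ids_with_holes <- unique(fort$id[fort$hole])
  list(
    poly = fort[fort$id %in% ids_with_holes, ],   # regions containing a hole
    hole = fort[!fort$id %in% ids_with_holes, ]   # regions without holes
  )
}

# Toy stand-in for fortify() output (hypothetical): region "A" has an outer
# ring plus a hole ring, region "B" is a plain polygon
toy <- data.frame(
  long  = c(0, 4, 4, 1, 3, 3, 1.5, 2.5, 2.5),
  lat   = c(0, 0, 4, 1, 1, 3, 1.5, 1.5, 2.5),
  id    = c("A", "A", "A", "A", "A", "A", "B", "B", "B"),
  group = c("A.1", "A.1", "A.1", "A.2", "A.2", "A.2", "B.1", "B.1", "B.1"),
  hole  = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

parts <- gghole(toy)
```

All six rows of region A, outer ring and hole ring alike, end up in parts$poly, so drawing that layer first puts the enclosing region underneath the enclosed one.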
Alternatively you could create that map using rworldmap.
library(rworldmap)
library(RColorBrewer)
library(rgdal)
map <- readOGR(dsn=".", layer="germany3")
pov <- read.csv("gerpoverty.csv")
#join data to the map
sPDF <- joinData2Map(pov,nameMap='map',nameJoinIDMap='VARNAME_1',nameJoinColumnData='Id1')
#default map
#mapPolys(sPDF,nameColumnToPlot='poverty')
colours=brewer.pal(5,"OrRd")
mapParams <- mapPolys( sPDF
,nameColumnToPlot='poverty'
,catMethod="pretty"
,numCats=5
,colourPalette=colours
,addLegend=FALSE )
do.call( addMapLegend, c( mapParams
, legendLabels="all"
, legendWidth=0.5
))
#to test state names
#text(pov$x,pov$y,labels=pov$Id1)