So I'm plotting a shape file (from the ONS) of Great Britain split into 11 regions with the hope of creating a choropleth map based on COVID-19 cases.
I join the covid data with the shape file so that I can work within 1 data frame, joining on the region name.
I've used the longitude and latitude fields of the shape file for the x and y values within the aesthetics.
covid <- data.frame(Name = c("Scotland","Eastern","West Midlands","Yorkshire and the Humber","East Midlands","London","South West","South East","North West","North East","Wales"),
Cases = c(20,50,45,30,25,75,100,5,60,35,80))
#'greatb' is the name of the shape file
join <- merge(greatb,covid,by=c("NAME","Name"),by.x=c("NAME"),by.y=c("Name"), all=TRUE)
ggplot()+
geom_polygon(data=join, aes(x=long, y=lat, group=group, fill=Cases))
However, once I do this I can't use a variable name to fill the regions of the map. I get the error message: object 'Cases' not found
I'm unsure why I get this message, as 'covid$data' is clearly an object and therefore so is 'join$data'. Can anyone help me with this?
I have a dataset containing different states, zip codes, and claim counts each in separate columns. I am trying to create a plot to show the total claim count according to zip codes for the state of MA.
I used this to filter by MA:
MA_medicare <- medicare %>%
filter(medicare$NPPES.Provider.State == "MA")
I then used this to set the fips code for plot_usmap:
MA_medicare$NPPES.Provider.State <- fips(MA_medicare$NPPES.Provider.State)
setnames(MA_medicare, old=c("NPPES.Provider.State"), new=c("fips"))
And last tried to graph (not sure why this doesn't work):
plot_usmap(data = MA_medicare, values= c("Total.Claim.Count", "NPPES.Provider.Zip.Code"), include = c("MA")) + scale_fill_continuous(low= "white", high= "red") + theme(legend.position = "right")
Error: Aesthetics must be either length 1 or the same as the data (4350838): fill
I'm the developer of usmap. plot_usmap only accepts one column of values for plotting so you're probably looking for the following:
plot_usmap(data = MA_medicare, values = "Total.Claim.Count", include = c("MA"))
However, your data is by zip code, and currently usmap doesn't support zip code maps (only state and county level maps). It uses the FIPS column to assign colors to states/counties on the map. Since you defined the FIPS codes by state, you'll just get the entire state of Massachusetts filled in with one solid color.
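As a rough sketch of what that means in practice, the zip-level rows can be collapsed to one total per FIPS code before plotting. The three rows below are made-up illustration data, not values from the Medicare dataset; only the column names come from the question. It uses base R's aggregate(), and the plot_usmap() call is left as a comment so the snippet runs without usmap installed.

```r
# Hypothetical stand-in for MA_medicare after the fips column has been set.
# "25" is the state FIPS code for Massachusetts.
MA_medicare <- data.frame(
  fips = c("25", "25", "25"),
  NPPES.Provider.Zip.Code = c("02115", "02139", "01002"),
  Total.Claim.Count = c(120, 80, 45),
  stringsAsFactors = FALSE
)

# Collapse to one row per FIPS code by summing the claim counts.
state_totals <- aggregate(Total.Claim.Count ~ fips, data = MA_medicare, FUN = sum)
print(state_totals)

# With usmap installed, the single state-level value could then be plotted:
# plot_usmap(data = state_totals, values = "Total.Claim.Count", include = c("MA"))
```

Because every row shares the same state-level FIPS code, the result is a single number for the whole state, which is why the map comes out as one solid color.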
I am trying to make a grid containing maps of megaregions in the US. I create a SpatialPolygonsDataFrame from a shapefile, then convert it into a data.frame to use with ggplot2. As soon as I add the data to the frame, the polygon plots break.
the file containing SpatialPolygon and the data frame are here:
https://drive.google.com/open?id=1kGPZ3CENJbHva0s558vWU24-erbqWUGo
the code is as follow:
load("./data.rda")
prop.test <- proptest.result[which(proptest.result$variable=="Upward N"),]
#transforming the data
# add to data a new column termed "id" composed of the rownames of data
shape@data$id <- rownames(shape@data)
#add data to our
shape@data <- data.frame(merge(x = shape@data, y = prop.test, by.x='Name', by.y="megaregion"))
# create a data.frame from our spatial object
mega.prop <- fortify(shape)
#merge the "fortified" data with the data from our spatial object
mega.prop.test <- merge(mega.prop, shape@data, by="id")
Plotting the first one (mega.prop) works fine:
ggplot(data = mega.prop, aes(x=long, y=lat, group=group), fill="blue")+
geom_polygon()
but plotting after adding the analytics data:
ggplot(data = mega.prop.test, aes(x=long, y=lat, group=group), fill="blue")+
geom_polygon()
In the new plot:
The filling of polygons is messed up. (Is it about the order of points? How?)
Two of the polygons are missing entirely.
What is the problem?
Thank you very much for your help.
Use geom_map() (which requires a slight tweak of your shapefile for some reason) so you don't have to do the merge/left join.
Also, you merged a great deal of different factors, not sure which ones you want to plot.
Finally, it's unlikely the coastal areas need that fine level of detail. rgeos::gSimplify() will definitely speed things up and you're already distorting areas, so a smaller bit of additional distortion won't impact the results.
library(ggplot2)
library(tidyverse)
shape_map <- tbl_df(fortify(shape, region="Name"))
colnames(shape_map) <- c("long", "lat", "order", "hole", "piece", "region", "group")
prop.test <- proptest.result[which(proptest.result$variable=="Upward N"),]
ggplot() +
geom_map(data=shape_map, map=shape_map, aes(long, lat, map_id=region)) +
geom_map(
data=filter(prop.test, season=="DJF"),
map=shape_map, aes(fill=prop.mega, map_id=megaregion)
)
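If you would rather keep the merge approach from the question, two properties of base R's merge() are worth checking, since they may explain both symptoms: by default it sorts the result by the key column, which can scramble the drawing sequence of the polygon vertices, and it performs an inner join, which drops any region without a match. The sketch below uses small made-up data frames (pts and attrs are hypothetical names) rather than the real shapefile.

```r
# Toy stand-in for fortify() output: 'order' records the drawing sequence.
pts <- data.frame(
  id    = c("c", "c", "b", "b", "a", "a"),
  long  = c(0, 1, 2, 3, 4, 5),
  lat   = c(0, 1, 0, 1, 0, 1),
  order = 1:6,
  stringsAsFactors = FALSE
)

# Toy attribute table: region "c" has no matching row.
attrs <- data.frame(id = c("a", "b"), value = c(10, 20), stringsAsFactors = FALSE)

# Default merge is an inner join, so all rows of region "c" disappear.
inner <- merge(pts, attrs, by = "id")

# all.x = TRUE keeps every vertex; unmatched regions get value = NA.
kept <- merge(pts, attrs, by = "id", all.x = TRUE)

# merge() sorted the rows by id, so restore the original drawing sequence.
kept <- kept[order(kept$order), ]
```

Applied to the question's data, this would be something like mega.prop.test[order(mega.prop.test$order), ], assuming the order column produced by fortify() survives the merge.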
I am trying to mark airports in India on a map of India. My code is as follows:
library(ggmap)
library(ggplot2)
airports <- read.csv("C://Users//MEJA03514//Downloads//in-airports.csv", header=T)
map <- get_googlemap("India", zoom = 4)
points <- ggmap(map) + geom_point(aes(x = longitude_deg, y = latitude_deg), data = airports, alpha = 0.5)
points
I downloaded the airports data file from: https://data.humdata.org/dataset/ourairports-ind
I am getting an error:
Error: Discrete value supplied to continuous scale
after combining ggmap() with geom_point(). Can you please help me figure out what the mistake in this code is?
Thanks in advance!
It works perfectly for me with your code. But I was also able to reproduce your error: have a look at airports using View(airports) or head(airports) and you will notice that the first row of your data holds values that describe the variables (HXL tags). Make sure you download the data without these tags or remove them manually (but then you have to get the data types correct). If you check str(airports), longitude and latitude should be numeric, not character.
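A minimal sketch of the manual clean-up, assuming the tag row is the first data row after the header. The CSV text below is invented for illustration (the airport names, coordinates and exact tag strings are placeholders); only the column names latitude_deg and longitude_deg come from the question.

```r
# Made-up miniature of the downloaded file, including an HXL tag row.
csv_text <- "ident,name,latitude_deg,longitude_deg
#meta+code,#loc+airport+name,#geo+lat,#geo+lon
AAA,Airport A,28.5,77.1
BBB,Airport B,19.1,72.9"

airports <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# The tag row forces the coordinate columns to be read as character.
str(airports)

# Drop the tag row, then convert the coordinates to numeric.
airports <- airports[-1, ]
airports$latitude_deg  <- as.numeric(airports$latitude_deg)
airports$longitude_deg <- as.numeric(airports$longitude_deg)

str(airports)
```

After this, both coordinate columns are numeric, so geom_point() can map them onto the continuous x and y scales that ggmap() sets up.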
I am trying to plot data on a map of New York state, using map_data(). But if you look at the polygon, it shows an extra piece which is not actually part of New York state. Any ideas how I can filter the map data to remove that?
ny <- map_data("state", region="new york")
s1 <- ggplot() + geom_polygon(data=ny, aes(x=long, y=lat))
s2 <- ggplot() + geom_point(data=ny, aes(x=long, y=lat))
grid.arrange(s1, s2, ncol=2)
Output:
geom_point shows correct boundary, but not polygon
The state is actually composed of multiple polygons which are not connected. You just need to tell ggplot which points go with which groups. This is done by mapping your data to the group argument of aes(). See the documentation here, although it would be nicer if they had a map example.
So how do you know which points go with which groups? The data frame returned by map_data() contains a group column. See:
head(ny)
ny$group
To plot the map correctly, use:
ggplot() + geom_polygon(data = ny, aes(x = long, y = lat, group = group))
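To see the effect in isolation, here is a small sketch with two made-up squares in place of the real map data (the squares data frame is hypothetical, not something map_data() returns). It assumes ggplot2 is installed.

```r
library(ggplot2)

# Two disjoint unit squares, stored one after the other like map pieces.
squares <- data.frame(
  long  = c(0, 1, 1, 0,  3, 4, 4, 3),
  lat   = c(0, 0, 1, 1,  3, 3, 4, 4),
  group = rep(c(1, 2), each = 4)
)

# Without group: all 8 points are treated as one polygon,
# so the two squares get joined by a stray connecting shape.
p_joined <- ggplot() +
  geom_polygon(data = squares, aes(x = long, y = lat))

# With group: each square is drawn as its own polygon.
p_split <- ggplot() +
  geom_polygon(data = squares, aes(x = long, y = lat, group = group))

# Print p_joined and p_split to compare the two plots.
```

The extra piece in the New York plot comes from the same thing: the points of separate pieces are being connected as if they were one outline.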
I'm trying to get the districts of Warsaw and draw them on a Google map. Using this code, where 2536107 is the OpenStreetMap relation code for a single Warsaw district, gives me almost what I want but with a few bugs. There is a general outline, but also lines between points which shouldn't be connected. What am I doing wrong?
map <- get_googlemap('warsaw', zoom =10)
warszawa <- get_osm(relation(2536107), full = T)
warszawa.sp <- as_sp(warszawa, what='lines')
warsawfort <- fortify(warszawa.sp)
mapa_polski <- ggmap(map, extent='device', legend="bottomleft")
warsawfort2 <- geom_polygon(aes(x = long, y = lat),
data = warsawfort, fill="blue", colour="black",
alpha=0.0, size = 0.3)
base <- mapa_polski + warsawfort2
base
Edit: I figured it must be somehow connected with order of plotting every point/line but I have no idea how to fix this.
There is a way to generate your map without using external packages: don't use osmar...
This link, to the excellent Mapzen website, provides a set of shapefiles of administrative areas in Poland. If you download and unzip it, you will see a shapefile set called warsaw.osm-admin.*. This is a polygon shapefile of all the districts in Poland, conveniently indexed by osm_id(!!). The code below assumes you have downloaded the file and unzipped it into the "directory with your shapefiles".
library(ggmap)
library(ggplot2)
library(rgdal)
setwd(" <directory with your shapefiles> ")
pol <- readOGR(dsn=".",layer="warsaw.osm-admin")
spp <- pol[pol$osm_id==-2536107,]
wgs.84 <- "+proj=longlat +datum=WGS84"
spp <- spTransform(spp,CRS(wgs.84))
map <- get_googlemap('warsaw', zoom =10)
spp.df <- fortify(spp)
ggmap(map, extent='device', legend="bottomleft") +
geom_polygon(data = spp.df, aes(x = long, y=lat, group=group),
fill="blue", alpha=0.2) +
geom_path(data=spp.df, aes(x=long, y=lat, group=group),
color="gray50", size=0.3)
Two nuances: (1) The osm IDs are stored as negative numbers, so you have to use, e.g.,
spp <- pol[pol$osm_id==-2536107,]
to extract the relevant district, and (2) the shapefile is not projected in WGS84 (long/lat). So we have to reproject it using:
spp <- spTransform(spp,CRS(wgs.84))
The reason osmar doesn't work is that the paths are in the wrong order. Your warszawa.sp is a SpatialLinesDataFrame, made up of a set of paths (12 in your case), each of which is made up of a set of line segments. When you use fortify(...) on this, ggplot tries to combine them into a single sequence of points. But since the paths are not in convex order, ggplot tries, for example, to connect a path that ends in the northeast to a path that begins in the southwest. This is why you're getting all the extra lines. You can see this by coloring the segments:
xx=coordinates(warszawa.sp)
colors=rainbow(11)
plot(t(bbox(warszawa.sp)))
lapply(1:11,function(i)lines(xx[[i]][[1]],col=colors[i],lwd=2))
The colors are in "rainbow" order (red, orange, yellow, green, etc.). Clearly, the lines are not in that order.
EDIT Response to @ako's comment.
There is a way to "fix" the SpatialLines object, but it's not trivial. The function gPolygonize(...) in the rgeos package will take a list of SpatialLines and convert it to a SpatialPolygons object, which can be used in ggplot with fortify(...). One huge problem (which I don't understand, frankly) is that OP's warszawa.sp object has 12 lines, two of which seem to be duplicates, and this causes gPolygonize(...) to fail. So if you create a SpatialLines list with just the first 11 paths, you can convert warszawa.sp to a polygon. This is not general, however, as I can't predict how or if it would work with other SpatialLines objects converted from osm. Here's the code, which leads to the same map as above.
library(rgeos)
coords <- coordinates(warszawa.sp)
sll <- lapply(coords[1:11],function(x) SpatialLines(list(Lines(list(Line(x[[1]])),ID=1))))
spp <- gPolygonize(sll)
spp.df <- fortify(spp)
ggmap(map, extent='device', legend="bottomleft") +
geom_polygon(data = spp.df, aes(x = long, y=lat, group=group),
fill="blue", alpha=0.2) +
geom_path(data=spp.df, aes(x=long, y=lat, group=group),
color="gray50", size=0.3)
I am not sure this is a general hangup--I can reproduce your example and see the issue. My first thought was that you didn't supply group=id which are typically used for polygons with many lines, but you have lines, so that should not be needed.
The only way I could get it to display properly was by changing your lines into a polygon outside the script. QGIS's line-to-polygon tool didn't get this "right", producing a large donut hole, so I used ArcMap, which produced a full polygon. If this is a one-off, that may work for your workflow. Odds are it is not. In that case, perhaps rgdal can transform lines to polygons, assuming that is indeed a general problem.
Upon reading the polygon shapefile and fortifying that, your code ran without problems.