How to plot data on map without coordinates? - r

I am trying to plot a data frame like this:
code  name             description  estimate
0     Australia        Vegetables   854658
0     Australia        Fruit        667541
1     New South Wales  Vegetables   45751
1     New South Wales  Fruit        77852
2     Victoria         Vegetables   66211
2     Victoria         Fruit        66211
.
.
.
For each region in Australia there are multiple rows with different descriptions. Which packages can I use to plot a map of the estimates without coordinates?
I tried ggplot2 and ozmaps with sf, as mentioned in the ggplot2 tutorial, and filtered the data frame to fruit only, but I get this error message:
stat_sf requires the following missing aesthetics: geometry
The code I tried:
ggplot() +
geom_sf(oz_states,mapping=aes())+
geom_sf(df,mapping=aes()) +
coord_sf()
All the methods I found require longitude and latitude to plot data on a map. I tried ggmap and geom_polygon but could not figure out the correct way to do it. Is there a way to plot a map with only region labels?
This is what I plotted in Tableau, and it is the plot I expect to reproduce in R:

So essentially, your first problem is that you are calling the wrong object from the ozmaps package: it is ozmap_states, but you called oz_states. The error itself comes from passing df to geom_sf(): a plain data frame has no geometry column to draw.
I came up with this solution, which I think does what you want and takes it a step further.
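# sample data; each = 3 pairs the first three counts with Vegetables and the last three with Fruit
df <- data.frame(code = rep(c(0,1,2), 2), name = rep(c("Australia", "New South Wales", "Victoria"), 2), description = rep(c("Vegetables", "Fruit"), each = 3), count =
c(854658, 45751, 66211, 667541, 77852, 66211))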
library(tidyverse)
library(sf)
library(ozmaps)
library(leaflet)
library(tmap)
states_full <- right_join(df, ozmap_states, by = c("name" = "NAME"))
data <- states_full %>%
filter(description == "Fruit") %>%
select(name, geometry, count)
ozmap1 = tm_shape(ozmap_states) +tm_polygons()
tmap_mode("view")
ozmap1 + tm_shape(st_as_sf(data)) + tm_fill(col = "count")
Basically, instead of using the sample dataframe that I created from your data, you would just use your data in the right join. You can also choose whether you want fruits or vegetables in your filter function.
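Because the join matches on the state name, any row whose name is misspelled or has no counterpart in ozmap_states is silently dropped by the right join and never reaches the map. A minimal sketch of a check to run before joining, using only base R; the two name vectors below are made up for illustration (in practice they would be ozmap_states$NAME and your own name column):

```r
# Hypothetical names: the first vector stands in for ozmap_states$NAME,
# the second for the name column of your own data frame.
map_names  <- c("New South Wales", "Victoria", "Queensland")
data_names <- c("Australia", "New South Wales", "Victorla", "Victoria")

# Names present in the data but absent from the map will not be joined
unmatched <- setdiff(data_names, map_names)
print(unmatched)
```

Anything listed in unmatched either needs its spelling fixed or, like a whole-country total, simply has no polygon in the state map.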
The tmap package is a mapping package that can make interactive, Leaflet-like maps.
You can look at some tutorials here: https://geocompr.robinlovelace.net/adv-map.html
End solution looks something like this.
Note: This solution uses lng/lat, but it pulls it directly from the shape file for oz state maps in the ozmaps package, therefore fulfilling the need of the question.
When you add in more data, more of Australia will be colored in depending on their count.
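If a static ggplot2 map is preferred, as in the question, the same join idea can be sketched as follows. This is an illustrative variant, not part of the answer above: it assumes the dplyr, sf, ggplot2 and ozmaps packages are installed, and the small fruit data frame is copied from the question's sample rows. The main point is to keep the sf object on the left of the join so the result still carries a geometry column for geom_sf().

```r
library(dplyr)
library(sf)
library(ggplot2)
library(ozmaps)

# Fruit rows from the question's sample data (states only)
fruit <- data.frame(
  name  = c("New South Wales", "Victoria"),
  count = c(77852, 66211)
)

# sf object on the left: the result keeps its sf class and geometry
states_fruit <- left_join(ozmap_states, fruit, by = c("NAME" = "name"))

p <- ggplot(states_fruit) +
  geom_sf(aes(fill = count)) +  # geometry is taken from the sf object
  coord_sf()
print(p)
```

States without a matching row get an NA count and are drawn in the default NA colour.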

Related

How to plot geom_line features in ggplot2 map?

I want to plot rivers (lines) in a map containing polygons (counties, etc) from South Dakota. The river data is here, https://www.weather.gov/gis/Rivers. Use the subset of rivers data set. The county download can be obtained from here, https://www2.census.gov/geo/tiger/TIGER2020/COUNTY/.
I only want the rivers that lie within the county boundaries of South Dakota, so I am using rgeos::gIntersection to perform that. It produces a Large SpatialLines object, which ggplot2 doesn't like when I try to plot it with geom_line (I get an error that says "Error: data must be a data frame, or other object coercible by fortify(), not an S4 object with class SpatialLines.")
Here is my code:
library(rgdal)
library(raster)
counties <- readOGR('D:\\Shapefiles\\Counties\\tl_2020_us_county.shp')
counties <- counties[which(counties$STATEFP == '46'),]
counties <- spTransform(counties, CRS("+init=epsg:3395"))
rivers <- readOGR('D:\\Shapefiles\\Main_Rivers\\rs16my07.shp')
proj4string(rivers) <- CRS("+proj=longlat")
rivers <- spTransform(rivers, CRS("+init=epsg:3395"))
rivers <- as.SpatialLines.SLDF(rgeos::gIntersection(counties, rivers))
The raster package's intersect function does not work for doing the intersection. I think I need to change the SpatialLines object to a SpatialLinesDataFrame object to get ggplot2 to plot the rivers. How do I do that? The as.SpatialLines.SLDF function is not doing it. Is there another way to get this to plot? My plotting code is here:
ggplot() +
geom_path(counties, mapping = aes(x = long, y = lat, group = group, col = 'darkgreen')) +
geom_path(rivers, mapping = aes(x = long, y = lat, color = 'blue'))
I would recommend handling your spatial data with the sf library. Firstly, it plays well with ggplot. Also, according to my still very limited understanding of GIS and spatial data in R, the idea is that sf will eventually take over from sp and the Spatial* data formats. Simple features, which sf implements, is I think a standard format across multiple platforms. See this link for more details on sf.
Onto your question - this is quite simple using sf. To find the rivers inside a specific county, we use st_intersection() (the sf version of gIntersection).
library(sf)
library(dplyr)    # for %>% and filter()
library(ggplot2)  # for the plot at the end
# read in the rivers data
st_read(dsn = 'so_data/rs16my07', layer = 'rs16my07') %>%
{. ->> my_rivers}
# set the CRS for the rivers data
st_crs(my_rivers) <- st_crs('+proj=longlat')
# transform crs
my_rivers %>%
st_transform('+init=epsg:3395') %>%
{. ->> my_rivers_trans}
# read in counties data
st_read(dsn = 'so_data/tl_2020_us_county') %>%
{. ->> my_counties}
# keep state 46
my_counties %>%
filter(
STATEFP == 46
) %>%
{. ->> state_46}
# transform crs
state_46 %>%
st_transform('+init=epsg:3395') %>%
{. ->> state_46_trans}
# keep only rivers inside state 46
my_rivers_trans %>%
st_intersection(state_46_trans) %>%
{. ->> my_rivers_46}
Then we can plot the sf objects using ggplot and geom_sf(), just like you would plot lines using geom_line() etc. geom_sf() seems to know if you are plotting point data, line data or polygon data, and plots accordingly. It is quite easy to use.
# plot it
state_46_trans %>%
ggplot()+
geom_sf()+
geom_sf(data = my_rivers_46, colour = 'red')
Hopefully this looks right - I don't know my US states so have no idea if this is South Dakota or not.
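To see what st_intersection() does in isolation, here is a small self-contained sketch with made-up geometries: a square stands in for the state and a straight line stands in for a river. The coordinates are purely illustrative, and the CRS is set to EPSG:3395 only to mirror the answer above.

```r
library(sf)

# Hypothetical "state": a 2 x 2 square
state <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0)))),
  crs = 3395
)

# Hypothetical "river": a horizontal line that starts and ends outside the square
river <- st_sfc(st_linestring(rbind(c(-1, 1), c(3, 1))), crs = 3395)

# Keep only the part of the river that lies inside the state
river_in_state <- st_intersection(river, state)

# Length of the clipped line, in the units of the CRS
clipped_length <- as.numeric(st_length(river_in_state))
print(clipped_length)
```

Only the stretch of the line between x = 0 and x = 2 survives the intersection, which is the same clipping applied to the real rivers and counties above.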

Pie Chart using variables with character names

I'm trying to create some pie charts showing the distribution of companies amongst regions and countries.
I'm getting an error saying 'x' values must be positive, which I think is because I'm trying to plot country names and it needs to be a number?
Any guidance on this would be really helpful
Summary: trying to make a pie chart of investor countries/regions to show their distribution (i.e. how many are in the UK, France, Germany etc)
Data: data
Main variables: investor, country/region
Any help with this code would be great!
Rory
Try something along these lines:
#demo data
investors <- paste0('investor', 1:100)
countries <- paste0('country', 1:5)
set.seed(1)
df <- data.frame(investors, countries = sample(countries, 100, replace = TRUE))
# pie chart code
library(tidyverse)
df %>% ggplot(aes(x = '', y = ..count.. , fill = countries)) +
geom_bar() +
coord_polar('y', start = 0)
Created on 2021-07-31 by the reprex package (v2.0.0)
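The error message 'x' values must be positive appears to come from base R's pie(), which expects a numeric vector of counts rather than the country names themselves. A minimal base R sketch, using made-up country values, is to tabulate the names first and pass the counts to pie():

```r
# Hypothetical investor countries, one entry per investor
countries <- c("UK", "France", "UK", "Germany", "UK", "France")

# Count how many investors fall in each country
counts <- table(countries)

# pie() accepts the counts and uses the country names as slice labels
pie(counts, main = "Investors by country")
```

This is the same counting step that geom_bar() performs internally in the ggplot2 version above.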

filled map + location map in R

Rephrasing the question: I am preparing a report, and one part of it is a spatial visualization.
I have 2 datasets. The first (Scores) contains countries with their scores. The second (Locations) contains the exact longitude and latitude of specific locations inside those countries. For example:
Scores = data.frame( Country = c("Lebanon","UK","Chile"), Score =c(1,3.5,5))
Locations = data.frame(Location_Name = c("London Bridge", "US Embassy in Lebanon" , "Embassy of Peru in Santiago"),
LONG = c(-0.087749, 35.596614, -70.618236),
LAT = c(51.507911, 33.933586, -33.423285))
What I want to achieve is a filled map of the world (my dataset has every country), with each country coloured inside its boundaries by its Score (Scores$Score) on a continuous scale.
On top of that I would like to add pins, bubbles, or some other marker for the locations in the Locations data frame.
So my desired outcome would be a combination of this view:
and this view:
Ideally I would also like to be able to draw a 2 km radius around each location in the Locations data frame.
I know how to do them separately but can't seem to achieve both on one nice, clean map.
I would really appreciate any help or tips on this; I have been stuck on it for a whole day.
As suggested by @agila you can use the tmap package.
First merge your Scores data with World so you can fill countries based on Scores data. Note that your Country column should match the name in World exactly when merging.
You will need to use st_as_sf from sf package to make your Locations an sf object to add to map.
tm_dots can show points. An alternative for bubbles is tm_bubbles.
library(tmap)
library(sf)
data(World)
Scores = data.frame(Country = factor(c("Mexico","Brazil","Chile"), levels = levels(World$name)),
Score =c(1,3.5,5))
Locations = data.frame(Location_Name = c("Rio de Janeiro", "US Embassy in Lebanon" , "Embassy of Peru in Santiago"),
LONG = c(-43.196388, 35.596614, -70.618236),
LAT = c(-22.908333, 33.933586, -33.423285))
map_data <- merge(World, Scores, by.x = "name", by.y = "Country", all = TRUE)
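# declare the coordinates as WGS84 longitude/latitude so tmap does not have to guess the CRS
locations_sf <- st_as_sf(Locations, coords = c('LONG', 'LAT'), crs = 4326)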
tm_shape(map_data) +
tm_polygons("Score", palette = "-Blues") +
tm_shape(locations_sf) +
tm_dots(size = .1)
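The question also asks for a 2 km radius around each location, which the code above does not cover. One possible sketch, assuming sf version 1.0 or later with its default s2 backend (where st_buffer() on longitude/latitude data interprets dist in metres), is to buffer the points into circles. The Locations data frame is repeated here so the block runs on its own.

```r
library(sf)

Locations <- data.frame(
  Location_Name = c("Rio de Janeiro", "US Embassy in Lebanon",
                    "Embassy of Peru in Santiago"),
  LONG = c(-43.196388, 35.596614, -70.618236),
  LAT  = c(-22.908333, 33.933586, -33.423285)
)

# Points in WGS84 longitude/latitude
locations_sf <- st_as_sf(Locations, coords = c("LONG", "LAT"), crs = 4326)

# Assumption: s2 is enabled, so dist is taken in metres (2000 m = 2 km)
circles <- st_buffer(locations_sf, dist = 2000)
print(circles)
```

The resulting polygons could then be drawn as an extra layer, for example with tm_shape(circles) + tm_borders(). At world scale a 2 km circle is far too small to see, so this is mainly useful after zooming in, for instance in tmap's interactive view mode.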

Interactive choropleth map with leaflet inaccurately mapping data

First post, I'll try to do my best.
I'm trying to make an interactive choropleth map with leaflet in R, based on this tutorial. It all goes great until I compare the data represented on the map with the data in my data frame.
I'm using the same data as the person who wrote the tutorial. It is a Large SpatialPolygonsDataFrame called world_spdf with 246 rows in world_spdf@data and world_spdf@polygons, as well as three other slots I have no idea about. The data is essentially countries with long and lat. I don't know anything about spatial data, but I'm assuming it's responsible for the shapes of the countries on a rendered map.
I also have a data frame called results that initially has 234 rows, also about countries and some additional data for each country. This is what head(results) looks like (it goes on like this, and there are no NAs):
ISO2 v1 v2 v3
<chr> <dbl> <dbl> <dbl>
1 AD 0.118 0.880 0.001
2 AE 0.226 0.772 0.0016
3 AF 0.197 0.803 0.0001
4 AG 0.0884 0.911 0.0009
5 AI 0.172 0.827 0.00120
6 AL 0.107 0.891 0.0022
I am merging the two dataframes by ISO2 column which contains country codes. Everything is fine so far, the data in the merged dataframe is correctly assigned to the country.
world_spdf@data = merge(world_spdf@data, results, by = "ISO2")
However, when I try to plot the data, the resulting interactive map presents the data in the "wrong order": for example, the data for Poland is shown for Nigeria, and so on.
What I tried was to find the missing countries in the smaller data frame like this:
differences = c(setdiff(world_spdf@data$ISO2, results$ISO2))
And then add rows with NAs to the dataframe so that all the countries in the spatial dataframe are represented with NAs at least. But this didn't help.
I am clueless as to why this occurs. Please help!
I cannot reproduce your setup, but it seems that your join did not go well; I have seen this problem here before. In that tutorial, the author uses the sp approach, but now we can use the sf approach. The sf package is better in that data manipulation is much easier: for example, you can use the tidyverse with it. If you use the sp package, you cannot use filter(), mutate(), left_join() and so on. In the end, it is your choice, but I recommend the sf package.
Here, I used the data you provided. So you do not see colors much in the following map. But I want to show you that country names are matching with correct locations. On my side, I see Poland has 38.2 for POP2005, which is the right value in mysf.
library(dplyr)
library(sf)
library(leaflet)
library(viridis)
# Read the shapefile as an sf object rather than as sp object.
mysf <- st_read(dsn = ".", layer = "TM_WORLD_BORDERS_SIMPL-0.3")
# Clean up the data as the tutorial shows.
mutate(mysf,
POP2005 = if_else(POP2005 == 0, NA_real_, POP2005),
POP2005 = round(POP2005 / 1000000, 2)) -> mysf
# Now join the results to mysf
left_join(mysf, results, by = "ISO2") -> mysf
# Create a color palette. I am not sure which variable you use.
# But I chose POP2005.
mypal <- colorNumeric(palette = "viridis",
domain = mysf$POP2005, na.color = "transparent")
# Draw a leaflet map
leaflet() %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addPolygons(data = mysf, group = "continuous",
stroke = FALSE, smoothFactor = 0.2, fillOpacity = 0.5,
fillColor = ~mypal(mysf$POP2005),
popup = paste("Country: ", mysf$NAME, "<br>",
"ISO2: ", mysf$ISO2, "<br>",
"Population:", mysf$POP2005, "<br>")) %>%
addLayersControl(overlayGroups = "continuous") %>%
addLegend(position = "bottomright", pal = mypal, values = mysf$POP2005,
title = "Population 2005",
opacity = 0.5)
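As an illustration of how a merge can scramble the map (a likely cause of the symptom in the question, though not confirmed against the asker's data): base R's merge() sorts its result by the key column and drops unmatched rows, so the rows of the attribute table no longer line up with the polygons, which stay in their original order. The sketch below uses made-up values and plain data frames; an order-preserving lookup with match() is shown as one alternative.

```r
# Hypothetical attribute table, in the same order as the polygons
shape_data <- data.frame(
  ISO2 = c("PL", "NG", "AD"),
  NAME = c("Poland", "Nigeria", "Andorra"),
  stringsAsFactors = FALSE
)

# Hypothetical results table: different order, and NG is missing
results <- data.frame(
  ISO2 = c("AD", "PL"),
  v1   = c(0.118, 0.5),
  stringsAsFactors = FALSE
)

# merge() sorts by ISO2 and drops NG: row 1 is now Andorra, not Poland
merged <- merge(shape_data, results, by = "ISO2")
print(merged)

# Order-preserving alternative: look up each polygon's value by key
shape_data$v1 <- results$v1[match(shape_data$ISO2, results$ISO2)]
print(shape_data)
```

With the match() lookup the table keeps one row per polygon in the original order, and countries without results get NA. left_join() on an sf object, as used in the answer, gives the same guarantee.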

Find centre of polygons using dplyr

I'm making a map with arc lines connecting counties in the US state of Missouri. I've calculated 'good enough' centres for each county by taking the mean of each polygon's long/lat. This works well for the more or less square-shaped counties, but less so for the more intricately shaped ones. I think this must be a common problem, but I can't find the answer online or with any function I've created. I'd like to use a tidyverse workflow (i.e. not transform to spatial objects if I can help it). Are there any tidyverse solutions to this problem?
You can see the problem in the examples below.
library(tidyverse)
# import all state/county fortified
all_states <- as_tibble(map_data('county'))
# filter for missouri
mo_fortify <- all_states %>%
filter(region == 'missouri')
## Pull Iron county, which is relatively oddly shaped
mo_iron_fortify <- mo_fortify %>%
group_by(subregion) %>%
mutate(long_c = mean(long),
lat_c = mean(lat),
iron = ifelse(subregion == 'iron','Iron','Others')) %>%
ungroup()
# map a ggplot2 map
mo_iron_fortify %>%
ggplot(aes(long, lat, group = group))+
geom_polygon(aes(fill = iron),
color = 'white')+
geom_point(aes(long_c, lat_c))+
scale_fill_grey('Iron county is\na good example')+
coord_map()+
theme_bw()
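One possible approach, offered here as a sketch rather than an answer from the thread, is to replace the mean of the vertices with the area-weighted centroid given by the shoelace formula. The helper polygon_centroid() below is a hypothetical name, uses only base R, and treats longitude/latitude as planar coordinates, which is an approximation that is usually acceptable at county scale.

```r
# Area-weighted centroid of a single polygon ring (shoelace formula).
# x, y: vertex coordinates in drawing order; the ring is closed if needed.
polygon_centroid <- function(x, y) {
  n <- length(x)
  if (x[1] != x[n] || y[1] != y[n]) {
    x <- c(x, x[1])
    y <- c(y, y[1])
    n <- n + 1
  }
  i <- seq_len(n - 1)
  j <- i + 1
  cross <- x[i] * y[j] - x[j] * y[i]
  area <- sum(cross) / 2  # signed area; the sign cancels in the ratios below
  c(long_c = sum((x[i] + x[j]) * cross) / (6 * area),
    lat_c  = sum((y[i] + y[j]) * cross) / (6 * area))
}

# Made-up L-shaped polygon: the mean of its vertices is (1, 1),
# while the area-weighted centroid sits closer to the bulk of the shape
lx <- c(0, 2, 2, 1, 1, 0)
ly <- c(0, 0, 1, 1, 2, 2)
vertex_mean <- c(mean(lx), mean(ly))
centroid <- polygon_centroid(lx, ly)
print(vertex_mean)
print(centroid)
```

In the workflow above this could replace mean(long) and mean(lat) inside a grouped summarise(), for example long_c = polygon_centroid(long, lat)[["long_c"]], assuming each group is a single ring. For strongly concave shapes even the true centroid can fall outside the polygon.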
