How to remove unwanted lines/connections in plotly::ggplotly() animation? - r

I've created an animation using the plotly library which calls a ggplot with geom_sf() layers. When I run the animation I get weird, random lines or connections between unconnected points in the map layer that I don't want.
I saw a post suggesting the order of the points matters (i.e. should be ordered by x ascending) but I tried ordering the points and still experience the issue.
I've created an MRE; when you load the resulting 'scotland.html' file in your browser it should display similar behaviour:
library("dplyr")
library("tidyr")
library("sf")
library("ggplot2")
library("plotly")
library("htmlwidgets")
tmp <- tempdir()
tmpfile <- paste0(tmp, "/scotland.zip")
url <- "https://borders.ukdataservice.ac.uk/ukborders/easy_download/prebuilt/shape/Scotland_ol_1991.zip"
download.file(url, tmpfile)
unzip(tmpfile, exdir = tmp)
scotland <- read_sf(tmp, "scotland_ol_1991")
points <- data.frame(
id = 20:1,
year = 20:1,
east = sample(7459:469817, size = 20),
north = sample(530297:1219574, size = 20)
)
points <-
points %>%
uncount(year) %>%
group_by(id) %>%
mutate(year = row_number()) %>%
ungroup()
# Try rearranging by x coordinate as suggested by this post:
# https://community.plot.ly/t/scatterplot-lines-unwanted-connecting/8729/3
scotland_order =
scotland %>%
group_by(name) %>%
st_coordinates() %>%
as_tibble() %>%
group_by(L2) %>%
summarise(min_x = min(X))
scotland =
scotland %>%
mutate(order = scotland_order$min_x) %>%
arrange(order) %>%
select(-order)
points <- st_as_sf(points, coords = c("east", "north"))
points <- st_set_crs(points, 27700)
p <- ggplotly(
ggplot() +
geom_sf(data = scotland) +
geom_sf(data = points, aes(frame = year, ids = id))
)
saveWidget(p, "scotland.html")
I don't think it's a problem with the data (i.e. the scotland layer) because:
Not all animation frames are affected
If you run the animation, go back, and play it again it doesn't show the weird behaviour
Any suggestions about what's going on, how to solve it, or any workarounds greatly received!
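To make the structure of the animated layer in the example above easier to see, here is a minimal sketch of what the uncount() / row_number() step does, using a toy table with 3 ids instead of 20. The names toy, frames and points_per_frame are illustrative only and are not part of the original code. It only shows that the frames contain different numbers of points; it does not establish whether that is related to the stray lines.

```r
library(dplyr)
library(tidyr)

# Toy version of the `points` table: 3 ids instead of 20
toy <- data.frame(id = 3:1, year = 3:1)

frames <- toy %>%
  uncount(year) %>%                 # repeat each row `year` times (drops `year`)
  group_by(id) %>%
  mutate(year = row_number()) %>%   # number the copies 1, 2, ... within each id
  ungroup()

# Number of points present in each animation frame
points_per_frame <- count(frames, year)
print(points_per_frame)
```

Under these assumptions, id 3 is present in frames 1 to 3, id 2 in frames 1 and 2, and id 1 only in frame 1.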

Related

Overlapping shp maps in sf

I have two shapefiles that I read into R with sf.
The first shp file covers regions.
The second shp file covers administrative districts.
The electoral districts are nested into regions.
I would like to overlay the two maps and then color each electoral district in a shade of the same color, with one color for each region.
I can plot the two maps and play around with colors, but I cannot overlay them and color the districts by region.
Files can be accessed here from the Italian National Institute of Statistics :
Reg1991_WGS84.shp:
http://www.istat.it/storage/cartografia/confini_amministrativi/non_generalizzati/Limiti1991.zip
CAMERA_PLURI_2017.shp: https://www.istat.it/storage/COLLEGI_ELETTORALI_2017.zip
library(sf)
italia_regions_1991<- read_sf("Limiti1991/Reg1991/Reg1991_WGS84.shp") %>% select(geometry)
italia_camera_pluri <- read_sf("COLLEGI_ELETTORALI_2017/CAMERA_PLURI_2017.shp") %>% select(geometry)
This will get you started....
I used the leafgl library, since you are plotting a lot of polylines/polygons... This performs (pretty) fast...
library(sf)
library(devtools)
library(leaflet)
#install leaflet with gl-support
devtools::install_github("r-spatial/leafgl")
library(leafgl)
library(colourvalues)
#read shapefile regions and cast to polygons
sf1 <- st_read( "e:/two_shapes/Limiti1991/Reg1991/Reg1991_WGS84.shp" ) %>% st_cast( "POLYGON", warn = FALSE )
#read shapefile and cast to POLYGON and then to LINESTRING
sf2 <- st_read( "e:/two_shapes/COLLEGI_ELETTORALI_2017/COLLEGI_ELETTORALI_2017.shp") %>%
st_cast( "POLYGON", warn = FALSE ) %>%
st_cast( "LINESTRING", warn = FALSE )
#create color matrix for the regions (depending on DEN_REG), and for the polylines (= black)
col_region <- colour_values_rgb(sf1$DEN_REG, include_alpha = FALSE) / 255
col_lines <- matrix(data = c(0,0,0), nrow = 1 )
#plot leaflet (takes some time)
leaflet() %>% addTiles() %>%
addGlPolygons(data = sf1, color = col_region) %>%
addGlPolylines( data = sf2, color = col_lines)
result
Consider intersecting the regions & districts via sf::st_intersection - note however that there seems to be some overlap, as the regions and districts do not align perfectly (they mostly do, but not quite...)
I have also transformed the CRS to WGS84; perhaps not necessary, but works better with leaflet and the like...
library(sf)
library(dplyr)
library(ggplot2)
italia_regions_1991<- read_sf("Reg1991_WGS84.shp") %>%
select(region = DEN_REG) %>% # this, and geometry by default
st_transform(4326)
italia_camera_pluri <- read_sf("CAMERA_PLURI_2017.shp") %>%
select(geometry) %>% # only geometry...
st_transform(4326)
result <- italia_camera_pluri %>%
st_intersection(italia_regions_1991)
ggplot(data = result, aes(fill = region)) +
geom_sf()
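As a rough illustration of what the intersection step does, here is a self-contained sketch with toy rectangles instead of the ISTAT shapefiles. The helper rect() and the objects regions, districts and pieces are hypothetical names introduced for this example, and no CRS is set so that everything stays planar. A district that straddles two regions is split into one piece per region, and each piece carries the region attribute that can then be used for the fill.

```r
library(sf)

# Hypothetical helper: axis-aligned rectangle as an sf polygon
rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Two adjacent toy "regions"
regions <- st_sf(
  region   = c("A", "B"),
  geometry = st_sfc(rect(0, 0, 2, 2), rect(2, 0, 4, 2)),
  agr      = "constant"   # attributes are valid for every part of a polygon
)

# One toy "district" that overlaps both regions
districts <- st_sf(
  district = "d1",
  geometry = st_sfc(rect(1, 0.5, 3, 1.5)),
  agr      = "constant"
)

# Each overlapping district/region pair becomes one output feature
pieces <- st_intersection(districts, regions)
pieces$area <- as.numeric(st_area(pieces))
print(st_drop_geometry(pieces))
```

With real data that does not align perfectly, the same mechanism produces small sliver pieces along region borders, which is the overlap mentioned above.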

Difficulty in customising cartogram output in R Studio

I am able to produce a cartogram using cartogram::cartogram_cont() But then have difficulty in customising the styling.
I have used broom::tidy() and dplyr::left_join() to fortify the cartogram, but I think perhaps the tidy stage has interfered with the plotOrder. If possible, I will include the output cartograms.
I'm attempting to replicate this type of output, but within my locality. Please note that the dataset used for the weighting in cartogram_cont() is not particularly significant, just a proof of concept at this stage:
[R Graph Gallery][1]
[1]: https://www.r-graph-gallery.com/331-basic-cartogram/
Shapefile from: [Lle Shapefile Location][2]
[2]: http://lle.gov.wales/catalogue/item/LocalAuthorities/?lang=en
library(dplyr)
library(leaflet)
library(maptools)
library(cartogram)
library(devtools)
install_github("HanOostdijk/odataR" , build_vignettes = T)
library(odataR)
library(tidyr)
library(rgdal)
library(htmltools)
#Read in shapefile and transform shape
#dsn = folder name, layer = filename but drop the .shp
shapefile <- readOGR(dsn = "Wales Shapefile",
layer = "localauthoritiesPolygon") %>%
#Transform coordinate referencing system
spTransform(CRS("+init=epsg:4326"))
#Next step is to join an interesting dataset to the shapefile using dplyr, then pass this to the cartogram package to render.
#Gone for the teacher sickness dataset from Stats Wales. Noticed it's only up to 2017, wonder if they've stopped collecting.
teacher_sickness_data <- odataR_query('http://open.statswales.gov.wales/dataset/schw0001')
#Check values for join.
categories <- unique(teacher_sickness_data$Area_ItemName_ENG)
categories_shp <- shapefile@data$name_en
categories
categories_shp
#Teacher data has "All Welsh local authorities". Not contained in shapefile so remove.
UA_sickness_data <- teacher_sickness_data[-c(2, 4:6, 8, 9, 11:13, 15:17)] %>%
filter(Area_ItemName_ENG != "All Welsh local authorities")
#Perform join to shapefile
shapefile_1 <- shapefile %>%
merge(UA_sickness_data, by.x = "name_en", by.y = "Area_ItemName_ENG",
duplicateGeoms = TRUE)
#Shiny App will allow choice of inputs to achieve one row per polygon. However, for testing
#functionality with cartograph functions, perform test filtering.
data_filtered <- UA_sickness_data %>%
filter(Year_ItemName_ENG == 2017) %>%
filter(Type_ItemName_ENG == "Full-time") %>%
filter(Variable_ItemName_ENG == "Total days of sick leave")
test_merge <- shapefile %>%
merge(data_filtered, by.x = "name_en", by.y = "Area_ItemName_ENG")
nc_pal <- colorNumeric(palette = "Reds",
domain = log(test_merge@data$Data))
m <-test_merge %>%
leaflet() %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(weight = 1,
color = ~nc_pal(log(Data)),
label = ~name_en,
highlight = highlightOptions(weight = 3,
color = "crimson",
bringToFront = TRUE),
popup = ~ paste0(Variable_ItemName_ENG, "<br/>",
"<b/>",
Data))
m
wales_cart <- cartogram_cont(test_merge, "Data", itermax=5)
plot(wales_cart)
[![Wales_Cartogram][3]][3]
[3]: https://i.stack.imgur.com/2tsMC.png
library(tidyverse)
library(ggmap)
library(broom)
library(rgeos) #used for gBuffer
#Buffer allows to tidy cartogram based on factor of choice.
wales_cart_buffered <- gBuffer(wales_cart, byid=TRUE, width=0)
#tidy cartogram in order to pass to ggplot
spdf_fortified_wales <- tidy(wales_cart_buffered, region = "name_en")
#Now perform a join based on english UA names
spdf_fortified_wales_joined <- spdf_fortified_wales %>%
left_join(. , wales_cart@data, by=c("id"="name_en"))
ggplot() +
geom_polygon(data = spdf_fortified_wales_joined, aes(fill = Data, x = long, y = lat, group = "name_en") , size=0, alpha=0.9) +
coord_map() +
theme_void()
[![incorrect_ggplot][4]][4]
[4]: https://i.stack.imgur.com/as0Z4.png
Success Criteria: Polygons are rendered correctly distorted and colour scale reflects weighting variable.
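One thing that may be worth checking in the ggplot call above is the group aesthetic. Inside aes(), a quoted name such as "name_en" is a constant string rather than a column reference, so every row ends up in the same group and the polygon outlines are joined together. The sketch below shows the difference on a toy data frame; squares, p_constant and p_column are illustrative names, and this is a possible cause rather than a confirmed diagnosis of the plot in the question.

```r
library(ggplot2)

# Toy stand-in for a fortified data frame: two squares identified by `id`
squares <- data.frame(
  id   = rep(c("left", "right"), each = 4),
  long = c(0, 1, 1, 0, 2, 3, 3, 2),
  lat  = c(0, 0, 1, 1, 0, 0, 1, 1)
)

# Quoted name: a constant, so all rows fall into a single group
p_constant <- ggplot(squares) +
  geom_polygon(aes(x = long, y = lat, group = "id"))

# Unquoted column: one group per polygon
p_column <- ggplot(squares) +
  geom_polygon(aes(x = long, y = lat, group = id))

# Compare how many groups ggplot2 actually computed for each plot
n_groups_constant <- length(unique(layer_data(p_constant)$group))
n_groups_column   <- length(unique(layer_data(p_column)$group))
```

Applied to the fortified cartogram, this would mean mapping group to an unquoted column that identifies each polygon piece, assuming the tidied data frame contains such a column.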

Create a map with colored polygons and coordinate points by using a .shp file in combination with another dataframe with coordinates

I have the following map boundaries in this .gdb folder
and here I have a csv which contains the variables that I want to plot and the coordinates of the points that need to be displayed on the map. My final goal is to create a map with polygons and inside every polygon there should be points according to the coordinates. Every polygon should be colored according to the count of studentid (students) for the year 2019. Any alternative is accepted
I believe that the 1st code chunk below is correct:
library(sf)
library(tidyverse)
library(data.table)
library(tigris)
library(leaflet)
library(mapview)
options(tigris_use_cache = TRUE)
# To keep enough digits on coords
options(digits = 11)
#coordinate reference system (long-lat system)
cr_sys = 4326
# Shp file for hs boundaries (constitutes overall district bounds)
hs_bounds <- st_read("C:/Users/makis/Documents/school/TPS_schools.shp")
# Read the feature class
#fc <- readOGR(dsn=fgdb )
#fc <- spTransform(fc, CRS("+proj=longlat +datum=WGS84 +no_defs"))
# Convert hs_bounds into longlat coord system
hs_bounds <- hs_bounds %>%
st_transform(4326)
tmp <- list.files(pattern = "school_report_data_fake.csv")
raw_master <- lapply(tmp,
function(x) read_csv(x,guess_max = 5000)) %>%
rbindlist(., fill = TRUE)
# r blocks in tps
tps_blocks <- blocks(state = "OK") %>%
st_as_sf() %>%
st_transform(crs = 4326) %>%
st_intersection(hs_bounds)
tps_bgs <- block_groups(state = "OK") %>%
st_as_sf() %>%
st_transform(crs = 4326) %>%
st_intersection(hs_bounds)
mapview(hs_bounds)
# Display all tps block groups on interactive map
tps_blocks_map <- mapview(tps_bgs) %>%
addFeatures(., hs_bounds)
# convert to df and remove geometry bc its a list col
tps_blocks_df <- tps_blocks %>%
as.data.frame() %>%
select(-geometry)
# Export blocks in tps. GEOID10 is the unique identifier for the block
write_csv(tps_blocks_df, path = "C:/Users/makis/Documents/school/tps_blocks.csv")
Here I'm trying to include the student data as well, but I end up with an empty data frame:
#r students by geography
student_geos <- raw_master %>%
#filter for students active in a given year
filter(year == 2019) %>%
# filter(row_number() %in% sample(length(year), 20000)) %>%
# Parse lat/long. I believe that I should do something here with the lat and long
#and some variable of the csv like the geocode variable that is used here
#a similar should be present in my csv file as well
#mutate(lat = as.numeric(str_extract(geocode, "[0-9]+.[0-9]+"))) %>%
#mutate(lon = as.numeric(str_extract(geocode, "-[0-9]+.[0-9]+"))) %>%
# Please don't ask me why this rowwise is necessary
rowwise() %>%
# Create sf point for each set of coords
mutate(pt = st_sfc(st_point(x = c(lon, lat)), crs = 4326)) %>%
# Turn df into sfc then take intersection of pts and blocks
st_as_sf() %>%
st_intersection(tps_blocks)
# convert to df and remove geometry bc its a list col
student_geos_df <- student_geos %>%
as.data.frame() %>%
select(-pt)
If everything above is correct, I should do something like:
# enrollment by tract
tract_enrol <- student_geos %>%
as.data.frame() %>%
group_by(year, TRACTCE10) %>%
summarize(enrollment = n())
# convert list of tracts into sfc
tracts <- tracts(state = "OK",
county = c("Tulsa", "Osage", "Wagoner", "Creek"),
year = 2010) %>%
st_as_sf() %>%
as.data.frame() %>%
#I guess student id instead of TRACTE10 here
inner_join(tract_enrol, by = "TRACTCE10") %>%
st_as_sf()
mapview(tracts, zcol = "enrollment", legend = TRUE)
Your file still doesn't download.
I can give you a generic guide to use ggplot2 to make a map. This will draw the polygons and the points.
You need to modify the Spatial_DataFrames with fortify() to get them into a format ggplot2 can use.
library(ggplot2)
hs_b2 <- fortify(hs_bounds) #or instead of "hs_bounds", "tracts" or whatever your polygon
#is called. If that doesn't work you need "<-as.data.frame()".
#Make sure the output has a separate column for x and y.
#repeat for the points (student) object.
student_2 <- fortify(<studentpointsobject>)
ggplot(data=<student_2>, aes(x=x,y=y)) +
geom_polygon(data=hs_b2, aes(x=long, y=lat, group=group) , #this will create the polygon
colour="grey90", alpha=1, #one of your color options for polygons
fill="grey40") + #one of your color options for polygons
theme(axis.ticks.y = element_blank(),axis.text.y = element_blank(), # get rid of y ticks/text
axis.ticks.x = element_blank(),axis.text.x = element_blank()) + # get rid of x ticks/text
geom_point(color="grey20") #to get the points drawn. mess with "fill" and "color"
You can customize the plot by setting 'color' or 'fill' to a fixed value outside aes(), or by mapping them to a column inside aes().
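Since hs_bounds in the question is read with st_read() and is therefore already an sf object, an alternative to fortify() is to stay in sf: count the points that fall inside each polygon and plot both layers with geom_sf(). This is only a sketch on toy data; zones, students, n_students and the helper rect() are hypothetical names, and no CRS is set here, whereas real data would need polygons and points in the same CRS.

```r
library(sf)
library(ggplot2)

# Hypothetical helper: axis-aligned rectangle as an sf polygon
rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Two toy polygons standing in for school boundaries
zones <- st_sf(
  zone     = c("north", "south"),
  geometry = st_sfc(rect(0, 1, 2, 2), rect(0, 0, 2, 1))
)

# Three toy student locations built from coordinate columns
students <- st_as_sf(
  data.frame(studentid = 1:3,
             lon = c(0.5, 1.5, 1.0),
             lat = c(1.5, 1.2, 0.5)),
  coords = c("lon", "lat")
)

# st_intersects() lists, per polygon, the points it contains;
# lengths() turns that list into a count per polygon
zones$n_students <- lengths(st_intersects(zones, students))

# Polygons filled by the count, with the points drawn on top
p <- ggplot() +
  geom_sf(data = zones, aes(fill = n_students)) +
  geom_sf(data = students)
```

Printing p draws the map; filtering the student table to year 2019 before the count would restrict it to that year.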

Plotting a Buffer Around a Point on a Map - R SF

I've been trying to plot a buffer around a point on a map but when I do the buffer doesn't appear in the right place like this.
Faulty R Map
The correct location is in California.
Here's my code:
library(tigris)
library(sf)
library(tidyverse)
projection <- 102003
options(tigris_use_cache = TRUE)
county_polys <- counties(class = 'sf') %>%
filter(STATEFP %in% c('06','41','53','04','16','32','49')) %>%
st_transform(projection)
centroids <- county_polys %>%
as_tibble %>% select(INTPTLON,INTPTLAT) %>%
mutate(
INTPTLON = as.double(INTPTLON),
INTPTLAT = as.double(INTPTLAT)) %>%
st_as_sf(coords = c('INTPTLON','INTPTLAT'), crs = projection)
pt <- centroids[2,]
pt_buffer <- st_buffer(pt,150000)
ggplot() + geom_sf(data = county_polys) + geom_sf(data = pt_buffer,color = 'red')
We can use the st_centroid function to get the centroid to avoid errors. There is no need to convert the sf object to other classes.
# This is the only thing I changed from your original code
# Get the centroid by st_centroid
centroids <- county_polys %>% st_centroid()
pt <- centroids[2,]
pt_buffer <- st_buffer(pt,150000)
ggplot() + geom_sf(data = county_polys) + geom_sf(data = pt_buffer,color = 'red')
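A likely reason the original buffer was misplaced is that INTPTLON and INTPTLAT hold longitude/latitude in degrees, while st_as_sf(..., crs = projection) only labels those numbers as projected coordinates without converting them. If the coordinate columns are to be used instead of st_centroid(), one option is to declare the CRS the numbers are actually in and then reproject. The sketch below uses a made-up point near Sacramento, EPSG:4326 for the input and EPSG:5070 as a metre-based target; all three are assumptions of this example, not values from the original code.

```r
library(sf)

# A point given as longitude/latitude in degrees (illustrative values)
pt_lonlat <- st_as_sf(
  data.frame(lon = -121.5, lat = 38.6),
  coords = c("lon", "lat"),
  crs = 4326            # declare what the numbers actually are
)

# Reproject to a metre-based CRS (EPSG:5070 is an assumption of this sketch)
pt_proj <- st_transform(pt_lonlat, 5070)

# The buffer distance is now interpreted in metres around the right place
pt_buffer <- st_buffer(pt_proj, 150000)
buffer_area_km2 <- as.numeric(st_area(pt_buffer)) / 1e6
```

The same pattern would apply to the county table: build the points with the geographic CRS first, then st_transform() them to the CRS of county_polys.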

How to label an individual state on the map while the others at sub-divisional level

I've managed to produce a map; however, I need to add one label for a state (level 2) that includes subdivisions (level 3), instead of labeling each subdivision (for this state only). In the data "newpak", rows 641-664 correspond to this state. Is there any way to place only one name above this state?
library(dplyr)
library(raster)
library(sf)
library(tidyverse)
library(ggrepel)
devtools::install_github("tidyverse/ggplot2", force = TRUE)
library(ggplot2)
pak <- getData("GADM",country="PAK",level=3)
pak <- st_as_sf(pak) %>%
mutate(
lon = map_dbl(geometry, ~st_centroid(.x)[[1]]),
lat = map_dbl(geometry, ~st_centroid(.x)[[2]]))
ggplot(pak) + geom_sf() + geom_text(aes(label = NAME_3, x = lon, y = lat), size = 2)
ind <- getData("GADM",country="IND",level=3)
ind <- st_as_sf(ind) %>%
mutate(
lon = map_dbl(geometry, ~st_centroid(.x)[[1]]),
lat = map_dbl(geometry, ~st_centroid(.x)[[2]]))
jnk <- subset(ind, OBJECTID >= 641 & OBJECTID <= 664 )
newpak <- rbind(pak, jnk)
regionalValues <- runif(165) # Simulate a value for each region between 0 and 1
ggplot(newpak) + geom_sf(aes(fill = regionalValues)) + geom_text(aes(label = NAME_3, x = lon, y = lat), size = 2)
Here's a complete solution using the sf package.
library(raster)
library(sf)
library(tidyverse)
# download PAK data and convert to sf
pak <- getData("GADM",country="PAK",level=3) %>%
st_as_sf()
# download IND data, convert to sf, filter out
# desired area, and add NAME_3 label
jnk <- getData("GADM",country="IND",level=3) %>%
st_as_sf() %>%
filter(OBJECTID %>% between(641, 664)) %>%
group_by(NAME_0) %>%
summarize() %>%
mutate(NAME_3 = "Put desired region name here")
regionalValues <- runif(142) # Simulate a value for each region between 0 and 1
# combine the two dataframes, find the center for each
# region, and then plot with ggplot
pak %>%
select(NAME_0, NAME_3, geometry) %>%
rbind(jnk) %>%
mutate(
lon = map_dbl(geometry, ~st_centroid(.x)[[1]]),
lat = map_dbl(geometry, ~st_centroid(.x)[[2]])
) %>%
ggplot() +
geom_sf(aes(fill = regionalValues)) +
geom_text(aes(label = NAME_3, x = lon, y = lat), size = 2) +
scale_fill_distiller(palette = "Spectral")
Some notes:
I used dplyr::filter (sf provides a method for it) instead of raster::subset to get the desired subset of the IND data, because I feel it's more idiomatic tidyverse code.
To combine areas with sf you can group the different regions by a common group with group_by and then simply call summarize. This is the method I used in my solution above. There are other functions in the sf package that accomplish similar results worth looking at. They are st_combine and st_union.
Using st_centroid for the purpose of plotting the region labels is not necessarily the best method for finding a good location for region labels. I used it because it's the most convenient. You might try other methods, including manual placement of labels.
I changed the fill palette to a diverging color palette because I think it more clearly shows the difference between one region and the next. You can see some of the color palettes available with RColorBrewer::display.brewer.all()
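The group_by() plus summarize() step described in the notes above can be tried on toy data. This is a minimal sketch, not the GADM data: districts, states and the helper rect() are hypothetical names, and no CRS is set. Two adjacent squares that share a state value are merged into one feature, which is what allows a single label per state.

```r
library(sf)
library(dplyr)

# Hypothetical helper: axis-aligned rectangle as an sf polygon
rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Three toy subdivisions: two belong to state X, one to state Y
districts <- st_sf(
  state    = c("X", "X", "Y"),
  district = c("x1", "x2", "y1"),
  geometry = st_sfc(rect(0, 0, 1, 1), rect(1, 0, 2, 1), rect(3, 0, 4, 1))
)

# Grouping and summarizing unions the geometries within each group
states <- districts %>%
  group_by(state) %>%
  summarize()

states$area <- as.numeric(st_area(states))
print(st_drop_geometry(states))
```

The merged feature for X covers both original squares, so a centroid computed from it falls in the middle of the combined area rather than in one subdivision.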

Resources