I have a dataset of 4 rows and a few columns which are country, location, hits, lat, lon.
A reproducible example:
structure(list(country = c("France", "France", "France", "France")
, location("Ile-de-France, Paris", "Ile-de-France, Villebon-sur-yvette", "Nord-Pas-de-Calais, Hérin", "Nord-Pas-de-Calais, Lille")
, Hits(1, 1, 3, 5)
, lat = c(46.227638, 46.227638, 46.227638, 46.227638)
, Ion = c(-2.213749, 2.213749, 2.213749, 2.213749)
)
, .Names = c("country", "location", "Hits", "lat", "Ion")
, class = "data.frame")
I want to use this in a popup and show all the locations and hits as 4 separate lines.
The code I am currently using is:
m <- leaflet() %>%
addTiles() %>% # Add default OpenStreetMap map tiles
addCircles(lng=area$longitude, lat=area$latitude, popup=paste("Country:", area$Country, "<br>"
, "Location:", area$Location, "-", area$Hits, "<br>"))
If you have questions, feel free to ask.
There were some mistakes in your example.
Try this:
library(leaflet)
area <- data.frame(country = c("France", "France", "France", "France")
, location= c("Ile-de-France, Paris", "Ile-de-France, Villebon-sur-yvette", "Nord-Pas-de-Calais, Hérin", "Nord-Pas-de-Calais, Lille")
, Hits= c(1, 1, 3, 5)
, lat = c(46.234638, 46.456638, 46.288638, 46.900638)
, lon = c(2.313749, 2.413749, 2.513749, 2.613749)
)
m <- leaflet() %>%
  addTiles() %>% # Add default OpenStreetMap map tiles
  addCircles(lng = area$lon, lat = area$lat,
             popup = paste("Country:", area$country, "<br>",
                           "Location:", area$location, "-", area$Hits, "<br>"))
The main issue with your example is in the provided coordinates. You assigned the same coordinates to the four points. This results in the display of the last point only.
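On the popup text itself: paste() is vectorised, so the call above yields one popup string per row. If the goal is a single popup that lists all four locations on separate lines, the rows can be collapsed with "<br>" first. The following is a minimal sketch using only base R; the names per_row and combined are illustrative and do not come from the original post.

```r
# Same columns as the corrected example above (coordinates omitted).
area <- data.frame(
  country  = rep("France", 4),
  location = c("Ile-de-France, Paris", "Ile-de-France, Villebon-sur-yvette",
               "Nord-Pas-de-Calais, Herin", "Nord-Pas-de-Calais, Lille"),
  Hits     = c(1, 1, 3, 5)
)

# One string per row: this is what addCircles() receives as `popup`.
per_row <- paste0("Country: ", area$country, "<br>",
                  "Location: ", area$location, " - ", area$Hits)

# One string for all rows: each location/hits pair on its own line.
combined <- paste0("Country: ", area$country[1], "<br>",
                   paste0(area$location, " - ", area$Hits, collapse = "<br>"))

length(per_row)   # 4
length(combined)  # 1
```

The combined string could then be attached to a single marker, assuming all rows really do belong to one point.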
If you want to plot multiple versions of the same point, you'll have to adjust the coordinates so that they are not all the same, or reshape your data so that each additional piece of information is a column in your data set. I call the first option 'jittering' and have a function to do it below. It provides a slight offset without moving the point too far.
I use the following code to add white noise to the latitude and longitude so that they are close, but do not overlap.
This would solve your issue of repeated values for longitude and latitude. Another use case is that if your data is at zip code resolution, and you have the coordinates for the zip code centroid, then all locations within the zip would be plotted on top of each other and you would only see the last one.
# Add Jittering to the Zips so that they don't stack
l <- nrow(last) # `last` is the data frame that holds the points
jitterFactor <- function(l) {
  jF <- runif(l, min = -1, max = 1)
  jF <- jF / 100
  return(jF)
}
# use randomness so that 0's don't overlap
last$JitterLat <- jitterFactor(l)
last$JitterLon <- jitterFactor(l)
# apply the jitter
last$Lat <- last$latitude + last$JitterLat
last$Lon <- last$longitude + last$JitterLon
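To make the effect concrete, here is a small self-contained sketch of the same idea applied to four rows that share one coordinate pair. The data frame pts and the helper jitter_factor are illustrative names, and the seed is set only so the illustration is repeatable. The offsets are at most 0.01 degrees, which is roughly 1 km of latitude.

```r
set.seed(42) # only to make this illustration repeatable

# Hypothetical data: four records that share one coordinate pair.
pts <- data.frame(
  id        = 1:4,
  latitude  = rep(46.227638, 4),
  longitude = rep(2.213749, 4)
)

# Uniform offsets in [-0.01, 0.01] degrees, as in the function above.
jitter_factor <- function(n) runif(n, min = -1, max = 1) / 100

pts$Lat <- pts$latitude + jitter_factor(nrow(pts))
pts$Lon <- pts$longitude + jitter_factor(nrow(pts))

# The jittered points no longer coincide.
anyDuplicated(pts[c("Lat", "Lon")]) # 0
```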
In the map below, the three green circles are all the same location, but represent different months' data. The jitter function spread them out a little.
fdb is the file read in below, and I need to join it with the number of households (HHs) by block group. The join is not working the way Geeks for Geeks says it will: the number of census block groups (the first 12 characters of BlockCode) after merging is smaller than the number at the beginning. I was expecting this join to give me 226773. It is not merging more rows than expected, it is giving me fewer, and I can't find the right lead. I also don't understand the cases where it gives me more than the 239780 census block groups that exist according to tidycensus. Could someone please help?
library(tidycensus)
fdb <- read.csv("fbd_us_with_satellite_dec2020_v1.csv")
abbr <- c("AL","AK","AZ","AR","CA","CO","CT","DE","DC","FL","GA","HI","ID","IL","IN","IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV",
"NH","NJ","NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA","WA","WV","WI","WY") # 51 states
HH_units <- get_acs(geography = "block group", variables = c(households = "B25001_001"), state = abbr) # same as B01003_001 for income level # HHs
HH_units$households <- HH_units$estimate
library(dplyr)
HH_units <- HH_units %>%
select(
GEOID,
NAME,
households
)
unique(fdb$BlockCode) # 11164855 rows
fdb$GEOID <- substring(fdb$BlockCode, 1, 12)
unique(fdb$GEOID) # it's got 226773 block lines
# Apparently you got to increase R's memory limit to join huge data sets
memory.limit()
memory.limit(400000)
all <- merge(x = fdb,
y = HH_units,
by = "GEOID")
unique(all$GEOID) # 138625 rows and not 226773
all2 <- full_join(x = fdb,
y = HH_units,
by = "GEOID")
unique(all2$GEOID) # 326928
all3 <- right_join(x = fdb,
y = HH_units,
by = "GEOID")
unique(all3$GEOID) # 239780
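One way to investigate counts like these is to look at which keys fail to match in each direction. The sketch below uses made-up toy tables (fdb_toy and hh_toy are hypothetical, not the real files). It also illustrates one possible cause, which is an assumption and not something confirmed for this data set: if read.csv() imports BlockCode as a number, the leading zero of state codes such as "01" is dropped, so the first 12 characters no longer form a valid block group GEOID.

```r
library(dplyr)

# Hypothetical block codes as read.csv() would import them by default:
# numeric, so the leading zero of state FIPS "01" is gone.
fdb_toy <- data.frame(
  BlockCode = c(10010201001000, 10010201001001, 360610001001000)
)

# Hypothetical block-group table with 12-character GEOIDs.
hh_toy <- data.frame(
  GEOID      = c("010010201001", "360610001001", "360610002001"),
  households = c(120, 300, 250)
)

# Key built from the unpadded code: shifted by one digit for state "01".
naive_key <- substr(sprintf("%.0f", fdb_toy$BlockCode), 1, 12)

# Key built after padding the code back to 15 digits.
fdb_toy$GEOID <- substr(sprintf("%015.0f", fdb_toy$BlockCode), 1, 12)

# Keys present in one table but not in the other.
only_in_fdb <- anti_join(fdb_toy, hh_toy, by = "GEOID")
only_in_hh  <- anti_join(hh_toy, fdb_toy, by = "GEOID")

# Distinct keys kept by each join type.
n_inner <- n_distinct(inner_join(fdb_toy, hh_toy, by = "GEOID")$GEOID)
n_full  <- n_distinct(full_join(fdb_toy, hh_toy, by = "GEOID")$GEOID)
```

In general an inner join (the default for merge()) keeps only keys found in both tables, a right join keeps every key of the right-hand table, and a full join keeps the union, so the three counts are expected to differ whenever some keys exist in only one table.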
I am looking to ultimately render a map in Leaflet that shows the roll call voting results for a specific vote from the Senate. This obviously involves coloring a state polygon based on the unique combination of the Senator's party affiliation and how they voted (2 senators per state). The problem I have is developing a workflow to color code a state (here I am using a simple sf dataframe of the US states) in this manner. The idea would be to "stripe" the state in two different colors based on each of the Senator's party affiliation and vote type.
Below is a workflow that has already been created for viewing roll call voting results by congressional districts (not what I want, I want to do this for Senate voting), but I figured this would be a starting point or baseline for hoping to create a similar map for a roll call vote from the Senate. This code can be found at https://www.r-bloggers.com/2020/09/mapping-congressional-roll-calls/. The only thing different is that I provided a function that I found on another website to directly read a congressional district shapefile from the website where they are housed courtesy of the UCLA Political Science Department:
# Workflow for mapping congressional district roll call voting results
library(Rvoteview)
library(tidyverse)
devtools::install_github("jaytimm/wnomadds")
library(wnomadds)
library(sf)
library(tigris)
# Function to download a shapefile for any congressional district of your choice.
get_congress_map <- function(cong = 113) {
  tmp_file <- tempfile()
  tmp_dir <- tempdir()
  zp <- sprintf("http://cdmaps.polisci.ucla.edu/shp/districts%03i.zip", cong)
  download.file(zp, tmp_file)
  unzip(zipfile = tmp_file, exdir = tmp_dir)
  fpath <- paste(tmp_dir, sprintf("districtShapes/districts%03i.shp", cong), sep = "/")
  st_read(fpath)
}
# Get the shapefile for the 89th congress
cd89 <- get_congress_map(cong = 89)
options(tigris_use_cache = TRUE, tigris_class = "sf")
# List the FIPS for US territories (and Alaska and Hawaii) that we won't include in maps.
nonx <- c('78', '69', '66', '72', '60', '15', '02')
# Create a simple states dataframe
states <- tigris::states(cb = TRUE) %>%
data.frame() %>%
select(STATEFP, STUSPS) %>%
rename(state_abbrev = STUSPS)
# Join the congressional districts shapefile with the simple states dataframe we
# created above.
cd_sf <- cd89 %>%
mutate(STATEFP = substr(ID, 2, 3),
district_code = as.numeric(substr(ID, 11, 12))) %>%
left_join(states, by = "STATEFP") %>%
filter(!STATEFP %in% nonx) %>%
select(STATEFP, state_abbrev, district_code)
# Download rollcall data from the Voteview database. Here for the Voting
# Rights Act of 1965
vra <- Rvoteview::voteview_search('("VOTING RIGHTS ACT OF 1965") AND (congress:89)
AND (chamber:house)') %>%
filter( date == '1965-07-09') %>%
janitor::clean_names()
votes <- Rvoteview::voteview_download(vra$id)
names(votes) <- gsub('\\.', '_', names(votes))
# Restructure the roll call voting data stored in votes
big_votes <- votes$legis_long_dynamic %>%
left_join(votes$votes_long, by = c("id", "icpsr")) %>%
filter(!grepl('POTUS', cqlabel)) %>%
group_by(state_abbrev) %>%
mutate(n = length(district_code)) %>%
ungroup() %>%
mutate(avote = case_when(vote %in% c(1:3) ~ 'Yea',
vote %in% c(4:6) ~ 'Nay',
vote %in% c(7:9) ~ 'Not Voting'),
party_code = case_when(party_code == 100 ~ 'Dem',
party_code == 200 ~ 'Rep' ),
Party_Member_Vote = paste0(party_code, ': ', avote),
## fix at-large --
district_code = ifelse(district_code %in% c(98, 99), 0, district_code),
district_code = ifelse(n == 1 & district_code == 1, 0, district_code),
district_code = as.integer(district_code)) %>%
select(-n)
#Members who represent historical “at-large” districts are
##assigned 99, 98, or 1 in various circumstances. Per VoteView.
# Make the Party_Member_Vote variable a factor and change the order of its levels.
big_votes$Party_Member_Vote <- factor(big_votes$Party_Member_Vote)
big_votes$Party_Member_Vote <-
factor(big_votes$Party_Member_Vote,
levels(big_votes$Party_Member_Vote)[c(3,6,1,4,2,5)])
# Join the roll call voting data with the shapefile and plot.
cd_sf_w_rolls <- cd_sf %>%
left_join(big_votes, by = c("state_abbrev", "district_code"))
main1 <- cd_sf_w_rolls %>%
ggplot() +
geom_sf(aes(fill = Party_Member_Vote),
color = 'white',
size = .25) +
wnomadds::scale_fill_rollcall() +
theme_minimal() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
legend.position = 'none') # +
main1 + ggtitle(vra$short_description)
This is fine for mapping congressional districts on roll call votes by the house. I am trying to figure out a way to reproduce a similar map for senate roll call votes. So, I started with the same workflow and am not sure how to proceed further or if it is even possible:
# Now I want to make a similar map for the senators of each state, not
# the representatives.
# I want to include Hawaii and Alaska in my Senate maps, so remove those FIPS
# from the vector.
non_states <- c('78', '69', '66', '72', '60', '11')
# No congressional district shapefile is therefore needed. So, here I just make
# a simple sf dataframe for the US States. Set the coordinate reference system to
# 4326 (World Geodetic System 1984) because I want to render the map in Leaflet
# and that's the reference system Leaflet uses.
states_Senate <- tigris::states(cb = TRUE) %>%
st_transform(crs = 4326) %>%
select(STATEFP, STUSPS, geometry) %>%
filter(!STATEFP %in% non_states) %>%
rename(state_abbrev = STUSPS)
# Query a roll call vote in the Voteview database. Any vote will work, here
# a vote related to marketing of non-prescription drugs in the 116th congress in the
# Senate now, not the House.
vra2 <- Rvoteview::voteview_search('("A bill to amend the Federal Food, Drug, and Cosmetic Act")
AND (congress:116) AND (chamber:senate)') %>%
janitor::clean_names()
votes2 <- Rvoteview::voteview_download(vra2$id)
names(votes2) <- gsub('\\.', '_', names(votes2))
# Restructure the roll call voting data stored in votes2
big_votes2 <- votes2$legis_long_dynamic %>%
left_join(votes2$votes_long, by = c("id", "icpsr")) %>%
filter(!grepl('POTUS', cqlabel)) %>%
mutate(avote = case_when(vote %in% c(1:3) ~ 'Yea',
vote %in% c(4:6) ~ 'Nay',
vote %in% c(7:9) ~ 'Not Voting'),
party_code = case_when(party_code == 100 ~ 'Dem',
party_code == 200 ~ 'Rep' ),
Party_Member_Vote = paste0(party_code, ': ', avote))
# Now I have a dataframe, big_votes2 that has 2 rows for each state. I need to figure
# out how to color the polygons for each state based on the unique combination of
# party affiliation and vote cast.
# Make Party_Member_Vote a factor like the congressional district workflow above,
# join big_votes2 with states_Senate sf dataframe, and plot........finishing this
# workflow and making a Senate map is essentially my question.
My hope is to make a final map that looks similar to the following (found at https://voteview.com/rollcall/RS1160389), which is the resulting roll call vote for the example query I provide in the script of my workflow for creating a Senate map directly above (the roll call vote about non-prescription drugs). This is probably done in JavaScript, maybe D3, but I am working on an R Shiny app looking at roll call voting, so I am strictly looking to do this in R.
Here the state polygons are "striped" by the senator's party and how they voted. If the senators in a state are both from one party and vote in unison, the state is a solid color reflecting this. The color palette is based on the voteview_pal provided in the wnomadds package. The colors in this palette don't include senators who consider themselves independent, but I can update the palette if there is a solution to creating the striping pattern within the state polygons. In my use of R I can't think of a way to accomplish this, since color fills are based on unique levels of a factor variable and here we have two rows per state, as the dataframe is being created in this workflow. Additionally, I've never seen a pattern or stripe fill in ggplot that could accomplish this, even if the dataframe were arranged so that there was only 1 row/observation per state. If this is even possible, I would want to render this in Leaflet, but if the basic concepts can be accomplished by plotting the sf object in ggplot I would gladly start there. Any help would be appreciated.
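As a possible first step toward the one-row-per-state arrangement mentioned above, the two senator rows can be spread into two columns. The sketch below is only that data-preparation step, on a made-up stand-in for big_votes2 (votes_toy and the senator_1/senator_2 column names are hypothetical). It does not draw stripes; whether a pattern-fill extension such as ggpattern suits the drawing step is an open assumption, not something verified here.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for big_votes2: two rows per state.
votes_toy <- data.frame(
  state_abbrev      = c("AL", "AL", "AK", "AK"),
  Party_Member_Vote = c("Rep: Yea", "Dem: Nay", "Rep: Yea", "Rep: Yea")
)

# Number the senators within each state, then spread them into columns
# so that every state occupies exactly one row.
state_votes <- votes_toy %>%
  group_by(state_abbrev) %>%
  mutate(seat = paste0("senator_", row_number())) %>%
  ungroup() %>%
  pivot_wider(names_from = seat, values_from = Party_Member_Vote) %>%
  mutate(unanimous = senator_1 == senator_2)
```

A table in this shape could be joined to states_Senate by state_abbrev, with unanimous states filled in a solid color and the remaining states handled separately.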
Is there any way that can be used to parse a shapefile of a country and download MODIS product data within that country using R?
I tried different approaches using the MODIStsp package (https://docs.ropensci.org/MODIStsp/) as well as the MODISTools package (https://docs.ropensci.org/MODISTools/articles/modistools-vignette.html) and they both only allow me to download MODIS product data for a defined site, but not a country.
Here's an example of how you might achieve this.
First, download the MODIS data that you require. In this example I'm using MCD12Q1.006.
begin_year and end_year are in the format Year.Month.Day.
shape_file is the shapefile you're using; presumably its extent is the country you're after, though I'm only going by the minimal information you have provided.
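library(sf)
library(raster)
library(MODIS)
library(tidyverse) # provides %>%, pluck() and str_extract() used below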
tifs <- runGdal(product = "MCD12Q1", collection = "006", SDSstring = "01",
extent = shape_file %>% st_buffer(dist = 10000),
begin = begin_year, end = end_year,
outDirPath = "data", job = "modis",
MODISserverOrder = "LPDAAC") %>%
pluck("MCD12Q1.006") %>%
unlist()
# rename tifs to have more descriptive names
new_names <- format(as.Date(names(tifs)), "%Y") %>%
sprintf("modis_mcd12q1_umd_%s.tif", .) %>%
file.path(dirname(tifs), .)
file.rename(tifs, new_names)
landcover <- list.files("data/modis", "^modis_mcd12q1_umd",
full.names = TRUE) %>%
stack()
# label layers with year
landcover <- names(landcover) %>%
str_extract("(?<=modis_mcd12q1_umd_)[0-9]{4}") %>%
paste0("y", .) %>%
setNames(landcover, .)
Also, if you require a particular cell size, then you could follow this procedure to get a 5x5 modis cell size.
neighborhood_radius <- 5 * ceiling(max(res(landcover))) / 2
agg_factor <- round(2 * neighborhood_radius / res(landcover))
r <- raster(landcover) %>%
aggregate(agg_factor)
r <- shape_file %>%
st_transform(crs = projection(r)) %>%
rasterize(r, field = 1) %>%
# remove any empty cells at edges
trim()
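The arithmetic behind agg_factor can be checked by hand. The sketch below assumes a cell size of about 463.3 m, which is the nominal size of a 500 m MODIS cell in the sinusoidal projection; the value res_m is an assumption standing in for res(landcover).

```r
# Assumed cell size in metres (x and y), standing in for res(landcover).
res_m <- c(463.3127, 463.3127)

# Half the width of a 5-cell neighbourhood: 5 * 464 / 2 = 1160 m.
neighborhood_radius <- 5 * ceiling(max(res_m)) / 2

# Cells per neighbourhood side: 2320 / 463.3127 is about 5.007, rounded to 5.
agg_factor <- round(2 * neighborhood_radius / res_m)
```

With these inputs aggregate() would merge blocks of 5 by 5 cells, which is where the "5x5" in the text comes from.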
Here's an example using MODISTools to automate downloading the correct tiles for the country.
First let's generate a polygon of a country to demonstrate (using Luxembourg as an example):
library(maptools)
library(sf)
data(wrld_simpl)
world = st_as_sf(wrld_simpl)
lux = world[world$NAME=='Luxembourg',]
Now we find the location (centroid) and size of the country:
#find centroid of polygon in long-lat decimal degrees
lux.cent = st_centroid(lux)
#find width and height of country in km
lux.proj = st_transform(lux,
"+proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=km +no_defs")
lux.km_lr = diff(st_bbox(lux.proj)[c(1,3)])
lux.km_ab = diff(st_bbox(lux.proj)[c(2,4)])
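The two diff() calls above rely on the order of a bounding box, which is xmin, ymin, xmax, ymax. A minimal base R sketch with made-up numbers (the vector bb is hypothetical, not Luxembourg's real extent):

```r
# Hypothetical bounding box in km, in st_bbox() order.
bb <- c(xmin = 480, ymin = 5900, xmax = 537, ymax = 5982)

km_lr <- unname(diff(bb[c(1, 3)])) # xmax - xmin: east-west extent
km_ab <- unname(diff(bb[c(2, 4)])) # ymax - ymin: north-south extent
```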
Using this info, we can download the correct Modis data (using leaf-area index, lai, as an example):
#download the MODIS tiles for the area we defined
library(MODISTools)
lux_lai <- mt_subset(product = "MOD15A2H",
lat = lux.cent$LAT, lon = lux.cent$LON,
band = "Lai_500m",
start = "2004-01-01", end = "2004-01-01",
km_lr = lux.km_lr, km_ab = lux.km_ab,
site_name = "Luxembourg",
internal = TRUE, progress = TRUE)
# convert to a spatial raster
lux.rast = mt_to_raster(df = lux_lai, reproject = TRUE)
lux.rast = raster::mask(lux.rast, lux)
plot(lux.rast)
plot(st_geometry(lux),add=T)
I am trying to do the following:
I have two datasets about my company. The first one has, say, the top 20 growing sellers. The second one has the bottom 20 losing sellers. So, it's something like this:
growing_seller <- c("a","b","c","d","e","f","g","h","i","h")
sales_yoy_growing <- c(100000,90000,75000,50000,37500,21000,15000,12000,10000,8000)
top_growing <- data.frame(growing_seller,sales_yoy_growing)
losing_seller <- c("i","j","k","l","m","n","o","p","q","r")
sales_yoy_losing <- c(-90000,-75000,-50000,-37500,-21000,-15000,-12000,-10000,-8000,-5000)
bottom_losing <- data.frame(losing_seller,sales_yoy_losing)
I am trying to plot both charts in the same plot using DIFFERENT categories, corresponding to the sellers' name. So what I have so far is this:
library(highcharter)
growing_seller <- c("a","b","c","d","e","f","g","h","i","h")
sales_yoy_growing <- c(100000,90000,75000,50000,37500,21000,15000,12000,10000,8000)
top_growing <- data.frame(growing_seller,sales_yoy_growing)
losing_seller <- c("i","j","k","l","m","n","o","p","q","r")
sales_yoy_losing <- c(-90000,-75000,-50000,-37500,-21000,-15000,-12000,-10000,-8000,-5000)
bottom_losing <- data.frame(losing_seller,sales_yoy_losing)
highchart() %>%
hc_add_series(
data = top_growing$sales_yoy_growing,
type = "column",
grouping = FALSE
) %>%
hc_add_series(
data = bottom_losing$sales_yoy_losing,
type = "column"
)
This is what I want to achieve graphically: Chart example
Now, I would like to have a different category array for each independent x-axis: something like the possibility of having "two hc_xAxis" controls, where I could specify for each plotted series its own categories.
My final aim is then to have the seller's name appear as I hover over each of the different columns.
Hope I was clear enough :)
Thanks
Highcharts displays the point's name in the tooltip by default. You just need to supply a name value for each point in your data.
You can do it this way:
top_growing <- data.frame(name = growing_seller, y = sales_yoy_growing)
This is the whole code:
library(highcharter)
growing_seller <- c("a","b","c","d","e","f","g","h","i","h")
sales_yoy_growing <- c(100000,90000,75000,50000,37500,21000,15000,12000,10000,8000)
top_growing <- data.frame(name = growing_seller, y = sales_yoy_growing)
losing_seller <- c("i","j","k","l","m","n","o","p","q","r")
sales_yoy_losing <- c(-90000,-75000,-50000,-37500,-21000,-15000,-12000,-10000,-8000,-5000)
bottom_losing <- data.frame(name = losing_seller, y = sales_yoy_losing)
highchart() %>%
hc_add_series(
data = top_growing,
type = "column",
grouping = FALSE
) %>%
hc_add_series(
data = bottom_losing,
type = "column"
)
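To see why the name and y columns matter, it can help to look at the data as a list of points, which is roughly the shape Highcharts works with. The base R sketch below builds that list by hand for three sellers; it is an illustration of the structure only, and the variable points is a hypothetical name, not part of highcharter.

```r
top_growing <- data.frame(name = c("a", "b", "c"),
                          y    = c(100000, 90000, 75000))

# One list element per point, each carrying its own name and y value.
points <- lapply(seq_len(nrow(top_growing)), function(i) {
  list(name = top_growing$name[i], y = top_growing$y[i])
})

str(points[[1]])
```

Because every point carries its own name, two series can show different seller names in the tooltip without sharing one category axis.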
Can you help figure out the best way to resolve the length mismatch error thrown by dotsInPolys? I think it is because there are NA's or NULLs or some funk in the polygon data that makes it too long. Here's code that reproduces the error. Ultimately, I want to plot multiple races using Leaflet, but I can't produce the lat/lon needed for the random dots at this point.
require(maptools)
require(tidycensus)
person.number.divider <- 1000
census_api_key("ENTER KEY HERE", install = TRUE)
racevars <- c(White = "B02001_002", #"P005003"
Black = "B02001_003", #Black or African American alone
Latinx = "B03001_003"
)
nj.county <- get_acs(geography = "county", #tract
year = 2015,
variables = racevars,
state = "NJ", #county = "Harris County",
geometry = TRUE,
summary_var = "B02001_001")
library(sf)
st_write(nj.county, "nj.county.shp", delete_layer = TRUE)
nj <- rgdal::readOGR(dsn = "nj.county.shp") %>%
spTransform(CRS("+proj=longlat +datum=WGS84"))
nj@data <- nj@data %>%
tidyr::separate(NAME,
sep =",",
into = c("county", "state")) %>%
dplyr::select(estimat,variabl, GEOID, county) %>%
spread(key = variabl, value = estimat) %>%
mutate(county = trimws(county))
black.dots <- dplyr::select(nj@data, Black) / person.number.divider #%>%
black.dots <- dotsInPolys(nj, as.integer(black.dots$Black), f="random")
# Error in dotsInPolys(nj, as.integer(black.dots$Black), f = "random") :
# different lengths
length(nj) # 63 This seems too many, because I believe NJ has 21 counties.
length(black.dots$Black) # 21
This post (Advice on troubleshooting dotsInPolys error (maptools)) came close to helping me, but I couldn't see how to apply it to my case.
I can change the length of the nj spatialpolygonsdataframe by removing NA's and counties with a black pop greater than 0, but then the map doesn't plot multiple counties (maybe there is something wrong with the census download?).
It looks like you might have gotten this figured out, but I wanted to share another approach that uses sf::st_sample() instead of maptools::dotsInPolys(). One advantage of this is that you don't need to convert the sf object you get from tidycensus to an sp object.
In the following example I split the census data by race into a list of three sf objects, then perform st_sample() on each element of the list (each race). Next, I recombine the sampled points into one sf object with a new race variable for each point. Finally, I use tmap to make a map, though you could use ggplot2 or leaflet to map as well.
library(tidyverse)
library(tidycensus)
library(sf)
library(tmap)
person.number.divider <- 1000
racevars <- c(White = "B02001_002", #"P005003"
Black = "B02001_003", #Black or African American alone
Latinx = "B03001_003"
)
# get acs data with geography in "tidy" form
nj.county <- get_acs(geography = "county", #tract
year = 2015,
variables = racevars,
state = "NJ", #county = "Harris County",
geometry = TRUE,
summary_var = "B02001_001"
)
# split by race
county.split <- nj.county %>%
split(.$variable)
# randomly sample points in polygons based on population
points.list <- map(county.split, ~ st_sample(., .$estimate / person.number.divider))
# combine points into sf collections and add race variable
points <- imap(points.list, ~ st_sf(tibble(race = rep(.y, length(.x))), geometry = .x)) %>%
reduce(rbind)
# map!
tm_shape(nj.county) +
tm_borders(col = "darkgray", lwd = 0.5) +
tm_shape(points) +
tm_dots(col = "race", size = 0.01, pal = "Set2")
I don't have enough rep to post the map image directly, but here it is.
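On the length mismatch in the question: get_acs() returns "tidy" output by default, with one row per county and variable, so three variables for 21 counties give 63 features, while the spread() call reduces the attribute table to 21 rows. The sketch below reproduces those two counts on a made-up table (long_toy and its estimate values are hypothetical). Requesting output = "wide" from get_acs() is one way to get one row per county from the start.

```r
library(tidyr)

# Hypothetical long table: 21 counties x 3 variables, shaped like the
# default tidy output of get_acs().
long_toy <- expand.grid(
  GEOID    = sprintf("34%03d", seq(1, 41, by = 2)),
  variable = c("White", "Black", "Latinx"),
  stringsAsFactors = FALSE
)
long_toy$estimate <- seq_len(nrow(long_toy)) * 1000

nrow(long_toy) # 63: one row per county-variable pair

# One row per county, one column per variable.
wide_toy <- pivot_wider(long_toy, names_from = variable, values_from = estimate)

nrow(wide_toy) # 21
```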