I am having trouble writing R code for a choropleth using the highcharter package. I am trying to replicate lines 84-112 of the code at the following link: https://www.kaggle.com/gloriousc/global-terrorism-in-1970-2016/code.
I have been encountering 2 errors:
When running line 95, R reports that there is no object called "countrycode_data". Searching online, I found that countrycode_data is a dataset of country codes that can be matched to the country names in other datasets. As far as I understand, it should be included in the "countrycode" package, which I have installed, but I could not find out how to access it. To work around this, I downloaded the dataset from the internet and carried on with the code.
When running the choropleth code starting on line 103, I get the following error: "Error: %in%(x = tail(joinBy, 1), table = names(df)) is not TRUE". I have no idea what this error means, so I am asking for help here.
I worked around the first error, although I am not sure my workaround is the correct approach.
I am going to leave the entire code right here:
knitr::opts_chunk$set(echo=TRUE, error=FALSE)
library(dplyr) #manipulate table
library(ggplot2) #visualization
library(highcharter) #making map
library("viridisLite") #Default Color Maps
library(countrycode) #list of country code
library(treemap) #make a treemap chart
library(reshape2) #melt function
library(plotly) #pie chart
library(tm) #text mining
library(SnowballC) #stemming text
library(wordcloud) #make a text chart
library(RColorBrewer) #make a color pallette
library(DT) #make datatable
#input the data
terror <- read.csv("../input/globalterrorismdb_0617dist.csv")
# Terrorist Incidents Map
#count terrorism incidents per country as a dataframe
countries <- terror %>%
group_by(country_txt) %>%
summarise(Total = round(n()))
#Making a terrorism map
#Credit to umeshnarayanappa
names(countries) <- c("country.name", "total") #change the column name
countries$iso3 <- countrycode_data[match(countries$country.name, countrycode_data$country.name.en), "iso3c"] #add iso3 column from country_code
data(worldgeojson, package = "highcharter")
dshmstops <- data.frame(q = c(0, exp(1:5)/exp(5)),
c = substring(viridis(5 + 1, option = "D"), 0, 7)) %>% #from viridisLite, make a color
list_parse2() #from highchart package, parse df to list
highchart() %>% #from highchart package
hc_add_series_map(worldgeojson, countries, value = "total", joinBy = "iso3") %>%
hc_colorAxis(stops = dshmstops) %>%
hc_legend(enabled = TRUE) %>%
hc_add_theme(hc_theme_db()) %>%
hc_mapNavigation(enabled = TRUE) %>%
hc_title(text = "Global Terrorism in 1970-2016", style = list(fontSize = "25px")) %>%
hc_add_theme(hc_theme_google()) %>%
hc_credits(enabled = TRUE, text = "Sources: National Consortium for the Study of Terrorism and Responses to Terrorism (START)", style = list(fontSize = "10px"))
I want to point out that these lines do not work for me even though I copied and pasted them directly.
Thank you for reading everything and also, I hope, for your help.
I tried to replicate the example, and I hope the following is enough for you to reproduce it yourself. It seems that countrycode_data is in the psData package. That package requires rJava, which is not on my machine right now. Since you were looking for a workaround, I found my own: I scraped country data that includes iso3. (You can probably use the ISOcodes package too.)
You need to check whether the country names in the two datasets are identical, which is a common challenge; you will usually see some mismatches. I do not have time to correct them all, but I show how to revise some country names in recode(). The bottom line is that you want to add iso3 to countries, so you need to make the country names match as far as possible. (Obviously, some countries no longer exist, and you cannot really do anything about them.) The author used match() in his code, but I used left_join() to do the same.
After this, I think you are ready to follow the rest of the code. Note that hc_add_series_map() also performs a join: worldgeojson has a property called iso3, so countries must have a column called iso3. Otherwise, you will get the same error message again.
library(tidyverse)
library(data.table)
library(rvest)
library(highcharter)
library(viridisLite)
# I used fread(). This is much faster.
terror <- fread("globalterrorismdb_0919dist.csv")
# I wrote my own code which does the same job.
count(terror, country_txt) %>%
setNames(nm = c("country.name", "total")) -> countries
# Get iso3 data
map_dfc(.x = c("official", "shortname", "iso3"),
.f = function(x) {read_html("http://www.fao.org/countryprofiles/iso3list/en/") %>%
html_nodes(paste("td.", x, sep = "")) %>%
html_text() %>%
gsub(pattern = "\\n(\\s+)?", replacement = "")}) %>%
setNames(nm = c("official", "shortname", "iso3")) -> iso3
# Revise some country names.
mutate(iso3, shortname = trimws(sub(x = shortname, pattern = "\\(.*\\)",
replacement = "")),
shortname = recode(.x = shortname,
`Bosnia and Herzegovina` = "Bosnia-Herzegovina",
`Brunei Darussalam` = "Brunei",
Czechia = "Czech Republic",
Congo = "Republic of the Congo",
`Côte d'Ivoire` = "Ivory Coast",
`Russian Federation` = "Russia",
`United Kingdom of Great Britain and Northern Ireland` = "United Kingdom",
`United States of America`= "United States"
)) -> iso3
# Join the two data sets
left_join(countries, iso3, by = c("country.name" = "shortname")) -> countries
data(worldgeojson, package = "highcharter")
dshmstops <- data.frame(q = c(0, exp(1:5)/exp(5)),
c = substring(viridis(5 + 1, option = "D"), 0, 7)) %>% #from viridisLite, make a color
list_parse2()
highchart() %>% #from highchart package
hc_add_series_map(worldgeojson, df = countries,
value = "total", joinBy = "iso3") %>%
hc_colorAxis(stops = dshmstops) %>%
hc_legend(enabled = TRUE) %>%
hc_add_theme(hc_theme_db()) %>%
hc_mapNavigation(enabled = TRUE) %>%
hc_title(text = "Global Terrorism in 1970-2016", style = list(fontSize = "25px")) %>%
hc_add_theme(hc_theme_google()) %>%
hc_credits(enabled = TRUE,
text = "Sources: National Consortium for the Study of Terrorism and Responses to Terrorism (START)",
style = list(fontSize = "10px"))
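The error message in the question reads like a failed check that the last element of joinBy is a column name of df. A minimal base R sketch of that check, together with a way to list the country names that did not receive an iso3 code after the join, could look like this. The data values and the helper join_key_ok() are invented for illustration and are not part of highcharter.

```r
# Hypothetical stand-ins for the real data (values invented for illustration)
countries <- data.frame(country.name = c("Russia", "Ivory Coast", "East Germany (GDR)"),
                        total = c(10, 20, 30),
                        stringsAsFactors = FALSE)
iso3 <- data.frame(shortname = c("Russia", "Ivory Coast"),
                   iso3 = c("RUS", "CIV"),
                   stringsAsFactors = FALSE)

# Hypothetical helper mirroring the condition quoted in the error message:
# the last element of joinBy must be a column name of df
join_key_ok <- function(df, joinBy) {
  tail(joinBy, 1) %in% names(df)
}

before <- join_key_ok(countries, "iso3")   # countries has no iso3 column yet

# Base R equivalent of a left join on the country name
joined <- merge(countries, iso3, by.x = "country.name", by.y = "shortname",
                all.x = TRUE)

after <- join_key_ok(joined, "iso3")

# Country names that still have no code and need recoding (or cannot be matched)
unmatched <- joined$country.name[is.na(joined$iso3)]
print(unmatched)
```

Names listed in unmatched are the ones to handle in recode() before plotting.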
library(tidyverse)
library(tidycensus)
library(sf)
library(sp)
#install.packages('geosphere')
library('geosphere')
library(rgeos)
library(sfheaders)
#install.packages('reshape')
library('reshape')
#> Linking to GEOS 3.6.1, GDAL 2.1.3, PROJ 4.9.3
census_tract <- get_acs(geography = "tract",
variables = "B19013_001",
state = "CA",
county = c("San Joaquin","Merced","stanislaus"),
geometry = TRUE,
year = 2020)
plot(st_geometry(census_tract), axes = T)
plot(st_centroid(st_geometry(census_tract)), pch = "+", col = "red", add = T)
library(ggplot2)
ggplot(census_tract) + geom_sf() +
geom_sf(aes(geometry = st_centroid(st_geometry(census_tract))), colour = "red")
census_tract$centroid <- st_centroid(st_geometry(census_tract))
schoolloc <- read.csv("C:/Users/rlnu/Desktop/EXAMPLE/pubschls.csv")
schoolloc <- schoolloc%>% filter(County == c("San Joaquin","Merced","Stanislaus"))
census_tract <- census_tract %>%
mutate(long = unlist(map(census_tract$centroid,1)),
lat = unlist(map(census_tract$centroid,2)))
shortest_distance$min_distance <- expand.grid.df(census_tract,schoolloc) %>%
mutate(distance = distHaversine(p1 = cbind(long,lat),
p2 = cbind(Longitude,Latitude))
I am trying to find the distance from each census tract's centroid to its three nearest schools. Please help me out with it. I have written some code, but the logic is wrong and the code does not work.
You can achieve this using the sf package.
I could not access your schools data, so I made a dummy set of 4 schools.
library(sf)
library(tidyverse) # for %>%, pivot_longer(), group_by() and slice_min() used below
schools <- data.frame(School_Name = c("School_1", "School_2", "School_3", "School_4"),
Lat = c(37.83405, 38.10867, 37.97743, 37.51615),
Long = c(-121.2810, -121.2312, -121.2575, -120.8772)) %>%
st_as_sf(coords = c("Long", "Lat"), crs = 4326)
Convert the tracts to centroids, give them the same CRS as the school set, then calculate the distance matrix
census_centroid <- st_centroid(census_tract) %>% st_transform(4326)
DISTS<- st_distance(census_centroid, schools)
Rename the columns to be the school IDs
colnames(DISTS) <- schools$School_Name
Link it back to the centroids
cent_dists <- cbind(census_centroid, DISTS) %>% # bind distances to centroids
pivot_longer(cols = -names(census_centroid), names_to = "School Name", values_to = "Distance") %>% # make long for ordering
group_by(NAME) %>% # group by centroid
slice_min(Distance, n = 3, with_ties = FALSE) %>% # take the three closest
mutate(Near_No = paste0("Near_School_", row_number())) # school distance ranking
Make wide if one row per census centroid desired, might want to play with column order though
cent_dists_wide <- cent_dists %>%
pivot_wider(names_from = c("Near_No"), values_from = c("Distance", "School Name"), names_sort = FALSE) # make wide if you want one row per centroid
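The same three-nearest logic can be checked without any packages on a plain distance matrix. This is only a sketch: the tract centroids are invented, and haversine_m() is a hypothetical helper that assumes a spherical Earth with a radius of 6371 km, so its numbers will differ slightly from what st_distance() returns.

```r
# Great-circle distance in metres on a sphere (hypothetical helper)
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Invented centroids; the schools reuse the dummy coordinates from the answer
centroids <- data.frame(id = c("tract_A", "tract_B"),
                        lon = c(-121.28, -120.90), lat = c(37.85, 37.50))
schools <- data.frame(name = c("School_1", "School_2", "School_3", "School_4"),
                      lon = c(-121.2810, -121.2312, -121.2575, -120.8772),
                      lat = c(37.83405, 38.10867, 37.97743, 37.51615))

# Distance matrix: rows are centroids, columns are schools
dists <- outer(seq_len(nrow(centroids)), seq_len(nrow(schools)),
               function(i, j) haversine_m(centroids$lon[i], centroids$lat[i],
                                          schools$lon[j], schools$lat[j]))
dimnames(dists) <- list(centroids$id, schools$name)

# For each centroid, the names of the three closest schools, nearest first
nearest3 <- t(apply(dists, 1, function(d) names(sort(d))[1:3]))
colnames(nearest3) <- paste0("Near_School_", 1:3)
print(nearest3)
```

Sorting each row of the matrix is the base R counterpart of the group_by() and slice_min() step above.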
This is an odd sort of question, but I'm using the exact data and code from this GitHub repository: https://github.com/edunford/tidysynth
However, when I get to the code that plots trends, I end up plotting a synthetic vs. real Alabama rather than a synthetic vs. real California, which is what is supposed to happen and what the GitHub results show!
I am running this code in Rstudio on a laptop. How can the results be different? Why would it be showing a synthetic AL rather than synthetic CA?
All my code is below. Can somebody sanity-check me here -- do you get a trend for CA or for AL? I feel like I'm going crazy.
#install.packages("devtools")
#devtools::install_github("edunford/tidysynth")
require(tidysynth)
library(dplyr)
data("smoking")
smoking %>% dplyr::glimpse()
unique(smoking$state)
smoking_out <-
smoking %>%
synthetic_control(outcome = cigsale,
unit = state,
time = year,
i_unit = "California",
i_time = 1988,
generate_placebos=TRUE
) %>%
generate_predictor(time_window = 1980:1988,
ln_income = mean(lnincome, na.rm = T),
ret_price = mean(retprice, na.rm = T),
youth = mean(age15to24, na.rm = T)) %>%
generate_predictor(time_window = 1984:1988,
beer_sales = mean(beer, na.rm = T)) %>%
generate_predictor(time_window = 1975,
cigsale_1975 = cigsale) %>%
generate_predictor(time_window = 1980,
cigsale_1980 = cigsale) %>%
generate_predictor(time_window = 1988,
cigsale_1988 = cigsale) %>%
generate_weights(optimization_window = 1970:1988,
margin_ipop = .02,sigf_ipop = 7,bound_ipop = 6
) %>%
generate_control()
smoking_out %>% plot_trends()
## This is the plot that is CLEARLY labeled as "Difference in synthetic control and observed Alabama"
smoking_out %>% plot_differences()
Oh -- And also, if I change generate_placebos=TRUE to FALSE in the synthetic_control() specifications, it doesn't run. (I was checking to see if it was stalling on another state via the placebo runs.)
A few days ago I found the table1 library for producing nice tables.
The only problem (for me) is that the output is an HTML table. I am using the rtf library to export R tables to Word, but I don't know how to export this HTML output table to Word.
I wonder if there is a way to get a different output format, or a different way to convert it to an R table. I am not using RStudio.
Thanks in advance.
library(table1)
table1(~mpg| carb*am,data = mtcars)
Thanks to @r2evans for the information, I was able to get an R table. I may have lost a little of the formatting, but it is fine when I export to Word with the rtf library:
library(rvest)
library(table1)
tbl_1=table1(~mpg| carb*am,data = mtcars)
as.data.frame(read_html(tbl_1) %>% html_table(fill=TRUE))
Note that you can get a lot more control over the output with some other packages. In the example below I'm using Tplyr and reporter. Tplyr generates the statistics and reporter will create the RTF. It takes a lot more work than table1. But you gain a lot more types of statistics and reports. You could basically produce any safety report.
library(Tplyr)
library(reporter)
dt <- tplyr_table(mtcars, am) %>%
add_layer(group_count(cyl)) %>%
add_layer(group_desc(mpg)) %>%
build()
tbl <- create_table(dt, show_cols = c("ord_layer_index", "row_label1",
"var1_0", "var1_1")) %>%
stub(c("ord_layer_index", "row_label1"), label = "Variables") %>%
define(ord_layer_index, label = "Variable", label_row = TRUE,
format = c("1" = "Cylinders",
"2" = "Miles Per Gallon"),
dedupe = TRUE, blank_after = TRUE) %>%
define(row_label1, label = "", indent = .25) %>%
define(var1_0, label = "Automatic", align = "center", n = 19) %>%
define(var1_1, label = "Manual", align = "center", n = 13)
pth <- file.path(tempdir(), "test1.rtf")
rpt <- create_report(pth,
output_type = "RTF",
orientation = "portrait") %>%
titles("Table 1.0",
"Characteristics of MTCars by Transmission Type",
"Population: All Cars") %>%
set_margins(top = 1, bottom = 1) %>%
add_content(tbl)
write_report(rpt)
file.show(pth)
Here is the RTF output:
My Set-up: I am currently trying to use an interactive map to show the different number of cases of SARS by country through a world map. Rather than trying on Shiny, I attempted to do it on a flexdashboard on R Markdown. Currently, I have a data set with two columns: "Country" and "total". Country shows which country it is while "total" shows the number of cases of SARS. "Country" is a factor object while "total" is numeric.
Now my question is: I have set up and used the sample geojson world map, but none of the values from my data set appear on it. How should I go about displaying my values on the map? Is this because the "worldgeojson" map cannot correctly match the countries in my data set?
My code currently is:
highchart() %>%
hc_title(text = "Number of Cases of SARS in the World") %>%
hc_subtitle(text = "Source: SARS.csv") %>%
hc_add_series_map(worldgeojson, countries,
name = "Country",
value = "total",
joinBy = c("woename", "Country")) %>%
hc_mapNavigation(enabled = T)
Without your data, it is unclear what the problem might be.
However, here is a working example you could use.
library(highcharter)
countries <- data.frame(
Country = c("Canada", "China", "France"),
Total = c(251, 5327, 7)
)
highchart() %>%
hc_title(text = "Number of Cases of SARS in the World") %>%
hc_subtitle(text = "Source: SARS.csv") %>%
hc_add_series_map(worldgeojson, countries,
name = "SARS Cases",
value = "Total",
joinBy = c("name", "Country")) %>%
hc_mapNavigation(enabled = T)
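One way to investigate why no values show up is to compare the country names in the data with the names the map uses before plotting. The sketch below uses a made-up vector map_names in place of the names stored in worldgeojson, and the data values are invented too, so treat the specific strings and numbers as assumptions.

```r
# Hypothetical names as they might appear in the map's "name" property
map_names <- c("Canada", "China", "France", "United States of America")

# Invented data with Country stored as a factor, as described in the question
countries <- data.frame(
  Country = factor(c("Canada", "China", "France", "United States")),
  total = c(251, 5327, 7, 27)
)

# Compare as character strings rather than as factor codes
countries$Country <- as.character(countries$Country)

# Country names in the data that have no counterpart in the map
not_in_map <- setdiff(countries$Country, map_names)
print(not_in_map)
```

Any name printed here would be left without a value on the map until it is renamed to match.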
Is there a way to implement a time slider for Leaflet or any other interactive map library in R? I have data arranged in a time series, and would like to integrate that into a "motion" map where the plot points change dynamically over time.
I was thinking of breaking my data into pieces, using subset to capture the corresponding data table for each month. But how would I move between the different data sets corresponding to different months?
As it stands now, I took the average and plotted those points, but I'd rather produce a map that integrates the time series.
Here is my code so far:
data<-read.csv("Stericycle Waste Data.csv")
library(reshape2)
library(ggplot2)
library(plyr)
library(ggmap)
names(data)<-c("ID1","ID2", "Site.Address", "Type", "City", "Province", "Category", "Density", "Nov-14", "Dec-14", "Jan-15", "Feb-15", "Mar-15", "Apr-15", "May-15", "Jun-15", "Jul-15", "Aug-15", "Sep-15", "Oct-15", "Nov-15", "Dec-15", "Jan-16")
data<-melt(data, c("ID1","ID2", "Site.Address","Type", "City", "Province", "Category", "Density"))
data<-na.omit(data)
data_grouped<-ddply(data, c("Site.Address", "Type","City", "Province", "Category", "Density", "variable"), summarise, value=sum(value))
names(data_grouped)<-c("Site.Address", "Type", "City", "Province", "Category", "Density", "Month", 'Waste.Mass')
dummy<-read.csv('locations-coordinates.csv')
geodata<-merge(data_grouped, dummy, by.x="Site.Address", by.y="Site.Address", all.y=TRUE)
library(leaflet)
d = geodata_avg$density_factor
d = factor(d)
cols <- rainbow(length(levels(d)), alpha=NULL)
geodata_avg$colors <- cols[unclass(d)]
newmap <- leaflet(data=geodata_avg) %>% addTiles() %>%
addCircleMarkers(lng = ~lon, lat = ~lat, weight = 1, radius = ~rank*1.1, color = ~colors, popup = paste("Site Address: ", geodata_avg$Site.Address, "<br>", "Category: ", geodata_avg$Category, "<br>", "Average Waste: ", geodata_avg$value))
newmap
Thanks in advance! Any guidance/insight would be greatly appreciated.
Recognizing this is a very old question, in case anyone's still wondering...
The package leaflet.extras2 has some functions that might help. Here's an example that uses some tidyverse functions, sf, and leaflet.extras2::addPlayback() to generate and animate some interesting GPS tracks near Ottawa.
library(magrittr)
library(tibble)
library(leaflet)
library(leaflet.extras2)
library(sf)
library(lubridate)
# how many test data points to create
num_points <- 100
# set up an sf object with a datetime column matching each point to a date/time
# make the GPS tracks interesting
df <- tibble::tibble(temp = (1:num_points),
lat = seq(from = 45, to = 46, length.out = num_points) + .1*sin(temp),
lon = seq(from = -75, to = -75.5, length.out = num_points) + .1*cos(temp),
datetime = seq(from = lubridate::ymd_hms("2021-09-01 8:00:00"),
to = lubridate::ymd_hms("2021-09-01 9:00:00"),
length.out = num_points)) %>%
sf::st_as_sf(coords = c("lon", "lat"), crs = "WGS84", remove = FALSE)
# create a leaflet map and add an animated marker
leaflet() %>%
addTiles() %>%
leaflet.extras2::addPlayback(data = df,
time = "datetime",
options = leaflet.extras2::playbackOptions(speed = 100))
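The idea from the question of keeping one data table per month can be sketched in base R with split(). The data frame here is invented; a slider or animation control would then pick one element of the resulting list at a time.

```r
# Invented long-format data: one row per site and month
geodata <- data.frame(
  Site.Address = c("A", "B", "A", "B", "A"),
  Month = factor(c("Nov-14", "Nov-14", "Dec-14", "Dec-14", "Jan-15"),
                 levels = c("Nov-14", "Dec-14", "Jan-15")),  # levels keep time order
  Waste.Mass = c(5, 7, 6, 4, 9)
)

# One data frame per month, in the order of the factor levels
by_month <- split(geodata, geodata$Month)

# Look up the data for a given month, e.g. the value selected by a slider
selected <- by_month[["Dec-14"]]
print(selected)
```

Setting the factor levels explicitly matters because labels such as "Nov-14" would otherwise sort alphabetically rather than chronologically.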
Here is an answer that may be of help.
Alternatively, you could show the time series of a point as a popup graph using mapview::popupGraph. It is also possible to pass interactive, htmlwidget-based graphs to popupGraph.