I am working on a research assignment on COVID and using the Data Lake API to fetch the different kinds of datasets available to us.
I am wondering if it's possible to fetch all outbreak countries.
ids = list("Australia") works for an individual country, but it doesn't seem to accept a wildcard or "all".
Can anyone give me any insights on this, please?
# Total number of confirmed cases in Australia and proportion of getting infected.
today <- Sys.Date()
casecounts <- evalmetrics(
  "outbreaklocation",
  list(
    spec = list(
      ids = list("Australia"),
      expressions = list("JHU_ConfirmedCasesInterpolated", "JHU_ConfirmedDeathsInterpolated"),
      start = "2019-12-20",
      end = today - 1,
      interval = "DAY"
    )
  )
)
casecounts
The easiest way to access a list of countries is in the Excel file linked at https://c3.ai/covid-19-api-documentation/#tag/OutbreakLocation. It has a list of countries in the first sheet, and shows which of those have data from JHU.
You could also fetch an approximate list of country-level locations with:
locations <- fetch(
  "outbreaklocation",
  list(
    spec = list(
      filter = "not(contains(id, '_'))"
    )
  )
)
That should contain all of the countries, but could have some non-countries like World Bank regions.
Then, you'd use this code to get the time series data for all of those locations:
library(dplyr)
library(tidyr)

location_ids <-
  locations %>%
  dplyr::select(-location) %>%
  unnest_wider(fips, names_sep = ".") %>%
  sample_n(15) %>% # include this to test on a smaller set
  pull(id)

today <- Sys.Date()
casecounts <- evalmetrics(
  "outbreaklocation",
  list(
    spec = list(
      ids = location_ids,
      expressions = list("JHU_ConfirmedCasesInterpolated", "JHU_ConfirmedDeathsInterpolated"),
      start = "2019-12-20",
      end = today - 1,
      interval = "DAY"
    )
  ),
  get_all = TRUE
)
casecounts
I want to create a time series in a netCDF file with 3 dimensions (lon, lat, time [unlimited]). The time series should be built from other netCDF files, each of which holds only one time point (for example, 17856).
I know how to create the new netCDF file, how to extract the data from a netCDF file as a 2D array, and how to get the time for the data.
My problem is:
How do I put the 2D array into the netCDF file with its correct time? How do the start and count arguments of the ncvar_put function work?
I use the ncdf4 package and read the tutorial at
http://geog.uoregon.edu/bartlein/courses/geog490/week04-netCDF.html#create-and-write-a-netcdf-file and searched for an answer, but I still don't understand it. I'm still inexperienced with netCDF files.
Example of my problem:
library(ncdf4)

# data from other netcdf file
values = array(data = c(1:9)/10, dim = c(3, 3))
values_2 = array(data = c(9:25)/10, dim = c(3, 3))
time = 25
time_2 = 23

# set parameters
lon = 1:3
lat = 1:3

# define dimensions
# Longitude
londim = ncdim_def(name = "longitude", units = "degrees", vals = as.double(lon),
                   longname = "longitude")
# Latitude
latdim = ncdim_def(name = "latitude", units = "degrees", vals = as.double(lat),
                   longname = "latitude")
# Time
timedim = ncdim_def(name = "time", units = "days since 1582-10-15 00:00", vals = as.double(1),
                    unlim = TRUE, calendar = "gregorian")

# define variables
B01 = ncvar_def(name = "B01",
                units = "percent",
                dim = list(londim, latdim, timedim),
                missval = NA,
                prec = "double")

# create netcdf
nc_test = nc_create("test.nc", list(B01), force_v4 = TRUE)

# Add values
### Here is something missing --> How do I add the timestamp?
ncvar_put(nc_test, "B01", values, start = c(1, 1, 1), count = c(-1, -1, 1))
ncvar_put(nc_test, "B01", values_2, start = c(1, 1, 2), count = c(-1, -1, 1))
When I extract the data, I get the 3-3-2 array, but the time steps are not correct, because I didn't add them. How do I do this?
I would like to have the 3-3-2 array, and when I read the time variable, I want the right times in the correct order.
I add time to the netCDF file using a different method, in Python with the netCDF4 module. This is sample code for your reference.
from datetime import datetime
from datetime import timedelta
from netCDF4 import date2num
from netCDF4 import Dataset
import os

# generate time for netCDF with 1 hour interval
utc_now = datetime.utcnow()
time_list = [utc_now + timedelta(hours=1*step) for step in range(6)]
trans_time = date2num(time_list, units="hours since 0001-01-01 00:00:00.0", calendar="gregorian")

with Dataset(os.getcwd() + "/sample.nc", "w") as sample:
    # Create dimensions for sample.nc
    time = sample.createDimension("time", None)
    lat = sample.createDimension("lat", 3)  # 3 is the latitude size in this sample
    lon = sample.createDimension("lon", 3)
    # Create variables for sample.nc
    time = sample.createVariable("time", "f8", ("time",))
    lat = sample.createVariable("lat", "f4", ("lat",))
    lon = sample.createVariable("lon", "f4", ("lon",))
    time[:] = trans_time
    variable_with_time = sample.createVariable("variable_with_time", "f4", ("time", "lat", "lon"))
    for key, value in sample.variables.items():
        print(key)
        print(value)
        print("*"*70)
Output:
time
<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
unlimited dimensions: time
current shape = (6,)
filling on, default _FillValue of 9.969209968386869e+36 used
**********************************************************************
lat
<class 'netCDF4._netCDF4.Variable'>
float32 lat(lat)
unlimited dimensions:
current shape = (3,)
filling on, default _FillValue of 9.969209968386869e+36 used
**********************************************************************
lon
<class 'netCDF4._netCDF4.Variable'>
float32 lon(lon)
unlimited dimensions:
current shape = (3,)
filling on, default _FillValue of 9.969209968386869e+36 used
**********************************************************************
variable_with_time
<class 'netCDF4._netCDF4.Variable'>
float32 variable_with_time(time, lat, lon)
unlimited dimensions: time
current shape = (6, 3, 3)
filling on, default _FillValue of 9.969209968386869e+36 used
**********************************************************************
You may notice that time is placed as the first dimension. For detailed information, this is the link to the document that I referenced.
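For the original ncdf4/R code from the question, a minimal sketch of the same idea is below. start and count give, per dimension, the index to begin writing at and the number of values to write (-1 meaning "all of them"), so each 2D slice goes to its own index along the unlimited time dimension, and its time stamp is written into the time coordinate variable with a matching start/count. This assumes ncdf4 accepts the dimension name "time" as a writable coordinate variable, and reuses nc_test, values, values_2, time and time_2 from the question:
# first time step: the data slice, then its time stamp
ncvar_put(nc_test, "B01", values, start = c(1, 1, 1), count = c(-1, -1, 1))
ncvar_put(nc_test, "time", time, start = 1, count = 1)
# second time step: writing at index 2 extends the unlimited dimension
ncvar_put(nc_test, "B01", values_2, start = c(1, 1, 2), count = c(-1, -1, 1))
ncvar_put(nc_test, "time", time_2, start = 2, count = 1)
# close so everything is flushed to disk
nc_close(nc_test)
Note that slices land in the order you write them, so if your input files are not already chronological (here time_2 = 23 comes before time = 25), sort the file/time pairs before writing.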
I am having trouble importing and joining a geojson map to some data using the highcharter library. I am trying to use a slimmed-down version of an sf dataset that I got using the tidycensus package, which I then uploaded to https://mapshaper.org/ to reduce the size of the file by thinning out the polygons. After thinning, I exported it as geojson and imported it into R.
Here is an example. First I download the data using tidycensus and create two datasets, one for geometry and one for the attribute of interest, here median family income. Then I export the geometry data so that I can feed it into mapshaper for reduction.
library(tidycensus)
library(dplyr)
library(sf)

# start with an example for one state
## pull geometry data for one state
md_data <- get_acs(geography = "tract",
                   state = "MD",
                   variables = "B19113_001",
                   geometry = TRUE,
                   key = Sys.getenv("CENSUS_API_KEY"))

# data set of just GEOID and median family income for use in mapping
md_mfi <- as.data.frame(md_data) %>%
  mutate(median_family_income = case_when(is.na(estimate) ~ 0,
                                          TRUE ~ estimate)) %>%
  select(GEOID, median_family_income)

# slim down to just the GEOID and the geometry data
md_tracts <- md_data %>%
  select(GEOID, geometry)
st_write(md_tracts, "U:/M1JPW00/GeoSpatial/census_tracts/acs_carto_2016/md_carto_tracts.shp")
After reformatting in mapshaper, I import it back into R:
md_map_json <- jsonlite::fromJSON(txt = "FILEPATH/md_carto_tracts.json", simplifyVector = FALSE)
md_map_json <- geojsonio::as.json(md_map_json)
I then try to build a map based on an example from the highcharter docs here:
> class(md_map_json)
[1] "json" "geo_json"
> head(md_mfi)
GEOID median_family_income
1 24001000100 54375
2 24001000200 57174
3 24001000300 48362
4 24001000400 52038
5 24001000500 46174
6 24001000600 49784
highchart(type = "map") %>%
  hc_add_series(mapData = md_map_json,
                data = list_parse(md_mfi),
                joinBy = "GEOID",
                value = "median_family_income",
                name = "Median Family Income")
The map actually renders, and the census tracts are colored solid blue, but the series data doesn't seem to join successfully, with or without list_parse.
I had the same problem, asked here: Make a choropleth from a non-highmap-collection map. Nobody responded (I know!), so I finally got to a solution that I think should work for you too:
# Work with the map you get until this step:
md_map_json <- jsonlite::fromJSON(txt = "FILEPATH/md_carto_tracts.json", simplifyVector = FALSE)

# This part is unnecessary:
# md_map_json <- geojsonio::as.json(md_map_json)

# Then, write your map like this:
highchart() %>%
  hc_add_series_map(md_map_json, md_mfi, value = "median_family_income", joinBy = "GEOID")
I am currently trying to configure the rnoaa library to connect city/state data with a weather station, and thereby output ANNUAL weather data, namely temperature. I have included a hardcoded input for reference, but I intend to feed in hundreds of geocoded cities eventually. That isn't the issue so much as retrieving the data.
require(rnoaa)
require(ggmap)

city <- geocode("birmingham, alabama", output = "all")
bounds <- city$results[[1]]$geometry$bounds
se <- bounds$southwest$lat  # south latitude
sw <- bounds$southwest$lng  # west longitude
ne <- bounds$northeast$lat  # north latitude
nw <- bounds$northeast$lng  # east longitude
stations <- ncdc_stations(extent = c(se, sw, ne, nw), token = noaakey)
I am calculating an MBR (minimum bounding rectangle) around the geographic area, in this case Birmingham, and then getting a list of stations. I then pull out the station id and attempt to retrieve results with any type of parameters, with no success. I'm looking to associate annual temperatures with each city.
test <- ncdc(datasetid = "ANNUAL", locationid = topStation[1],
             datatypeid = "DSNW", startdate = "2000-01-01", enddate = "2010-01-01",
             limit = 1000, token = noaakey)
Warning message:
Sorry, no data found
Looks like the location ID is causing the issue. Try without it (as it is an optional field):
ncdc_locs(datasetid = "ANNUAL", datatypeid = "DSNW", startdate = "2000-01-01",
          enddate = "2010-01-01", limit = 1000, token = <your token key>)
and then with a valid location ID:
ncdc_locs(datasetid = "ANNUAL", datatypeid = "DSNW", startdate = "2000-01-01",
          enddate = "2010-01-01", limit = 1000, locationid = 'CITY:US000001',
          token = <your token>)
returns
$meta
NULL
$data
mindate maxdate name datacoverage id
1 1872-01-01 2016-04-16 Washington D.C., US 1 CITY:US000001
attr(,"class")
[1] "ncdc_locs"
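With a valid id in hand, the original ncdc() call should then return data. A sketch reusing the question's parameters and noaakey token, with the id found above plugged in:
test <- ncdc(datasetid = "ANNUAL", datatypeid = "DSNW",
             startdate = "2000-01-01", enddate = "2010-01-01",
             locationid = "CITY:US000001", limit = 1000, token = noaakey)
# the results come back in the $data element
test$data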
I'm trying to add polylines from one specific location to many others in Shiny R using addPolylines from leaflet. But instead of linking from the one location out to each of the others, I am only able to link them all together in a sequence. The best example of what I'm trying to achieve is a cricket wagon-wheel diagram.
observe({
  long.path <- c(-73.993438700, locations$Long[1:9])
  lat.path <- c(40.750545000, locations$Lat[1:9])
  proxy <- leafletProxy("map", data = locations)
  if (input$paths) {
    proxy %>% addPolylines(lng = long.path, lat = lat.path, weight = 3, fillOpacity = 0.5,
                           layerId = ~locations, color = "red")
  }
})
It is inside an observer because I want the paths to be activated by a checkbox.
I'd really appreciate any help with this!
Note
I'm aware the OP asked for a leaflet answer, but this question piqued my interest in alternative solutions, so here are two.
Example - mapdeck
Mapdeck (my package) uses Deck.gl on a Mapbox map, so you need a Mapbox API key to use it. But it does let you plot 2.5D arcs.
It works on data.frames and data.tables (as well as sp and sf objects).
library(mapdeck)

# df_hits is given at the end of the post
center <- c(144.983546, -37.820077)
df_hits$center_lon <- center[1]
df_hits$center_lat <- center[2]
df_hits$score <- sample(c(1:4, 6), size = nrow(df_hits), replace = TRUE)

set_token("MAPBOX")  ## your Mapbox API key

mapdeck(
  style = mapdeck_style("satellite")
) %>%
  add_arc(
    data = df_hits
    , origin = c("center_lon", "center_lat")
    , destination = c("lon", "lat")
    , stroke_from = "score"
    , stroke_to = "score"
    , stroke_width = "score"
    , palette = "magma"
  )
Example - googleway
This example uses googleway (also my package), which interfaces with the Google Maps API, and also works on data.frames and data.tables (as well as sp and sf).
The trick is in the encodeCoordinates function, which encodes coordinates (lines) into a Google polyline.
library(data.table)
library(googleway)
library(googlePolylines)  ## gets installed when you install googleway

center <- c(144.983546, -37.820077)

setDT(df_hits)  ## data given at the end of the post

## generate a 'hit' id
df_hits[, hit := .I]

## generate a random score for each hit
df_hits[, score := sample(c(1:4, 6), size = .N, replace = TRUE)]

df_hits[
  , polyline := encodeCoordinates(c(lon, center[1]), c(lat, center[2]))
  , by = hit
]

set_key("GOOGLE_MAP_KEY")  ## you need an API key to load the map

google_map() %>%
  add_polylines(
    data = df_hits
    , polyline = "polyline"
    , stroke_colour = "score"
    , stroke_weight = "score"
    , palette = viridisLite::plasma
  )
The dplyr equivalent would be:
df_hits %>%
  mutate(hit = row_number(), score = sample(c(1:4, 6), size = n(), replace = TRUE)) %>%
  group_by(hit, score) %>%
  mutate(
    polyline = encodeCoordinates(c(lon, center[1]), c(lat, center[2]))
  )
Data
df_hits <- structure(list(lon = c(144.982933659011, 144.983487725258,
144.982804912978, 144.982869285995, 144.982686895782, 144.983239430839,
144.983293075019, 144.983529109412, 144.98375441497, 144.984103102141,
144.984376687461, 144.984183568412, 144.984344500953, 144.984097737723,
144.984065551215, 144.984339136535, 144.984001178199, 144.984124559814,
144.984280127936, 144.983990449363, 144.984253305846, 144.983030218536,
144.982896108085, 144.984022635871, 144.983786601478, 144.983668584281,
144.983673948699, 144.983577389175, 144.983416456634, 144.983577389175,
144.983282346183, 144.983244795257, 144.98315360015, 144.982896108085,
144.982686895782, 144.982617158347, 144.982761997634, 144.982740539962,
144.982837099486, 144.984033364707, 144.984494704658, 144.984146017486,
144.984205026084), lat = c(-37.8202049841516, -37.8201201023877,
-37.8199253045246, -37.8197812267274, -37.8197727515541, -37.8195269711051,
-37.8197600387923, -37.8193828925304, -37.8196964749506, -37.8196583366193,
-37.8195820598976, -37.8198956414717, -37.8200651444706, -37.8203575362288,
-37.820196509027, -37.8201032825917, -37.8200948074554, -37.8199253045246,
-37.8197897018997, -37.8196668118057, -37.8200566693299, -37.8203829615443,
-37.8204295746001, -37.8205355132537, -37.8194761198756, -37.8194040805737,
-37.819569347103, -37.8197007125418, -37.8196752869912, -37.8195015454947,
-37.8194930702893, -37.8196286734591, -37.8197558012046, -37.8198066522414,
-37.8198151274109, -37.8199549675656, -37.8199253045246, -37.8196964749506,
-37.8195862974953, -37.8205143255351, -37.8200270063298, -37.8197430884399,
-37.8195354463066)), row.names = c(NA, -43L), class = "data.frame")
I know this was asked a year ago, but I had the same question and figured out how to do it in leaflet.
You first have to adjust your data frame, because addPolylines just connects all the coordinates in a sequence. It seems that you know your starting location and want it to branch out to 9 separate locations. I am going to start with your ending locations. Since you have not provided them, I will make a data frame with 4 separate ending locations for the purpose of this demonstration.
dest_df <- data.frame(lat = c(41.82, 46.88, 41.48, 39.14),
                      lon = c(-88.32, -124.10, -88.33, -114.90))
Next, I am going to create a data frame that repeats the central location as many times as there are destination locations (4 in this example), using your original coordinates. I will explain why shortly.
# name the column lon (not long) so it matches dest_df and the lng = ~lon mapping below
orig_df <- data.frame(lat = rep.int(40.75, nrow(dest_df)),
                      lon = rep.int(-73.99, nrow(dest_df)))
The reason I am doing this is that addPolylines connects all the coordinates in a sequence. The way around this, in order to create the image you described, is to start at the starting point, go to a destination point, return to the starting point, and then go on to the next destination point. To build the data frame for this, we interlace the two data frames row by row, as such:
- starting point
- destination point 1
- starting point
- destination point 2
- and so forth...
The way I will do this is to create a key for both data frames. For the origin data frame, the key starts at 1 and increments by 2 (i.e., 1, 3, 5, 7). For the destination data frame, it starts at 2 and increments by 2 (i.e., 2, 4, 6, 8). I will then combine the two data frames using a UNION ALL and sort by the key so that every other row is the starting point. I am going to use sqldf for this because that is what I'm comfortable with; there may be a more efficient way.
orig_df$sequence <- seq(1, length.out = nrow(orig_df), by = 2)
dest_df$sequence <- seq(2, length.out = nrow(dest_df), by = 2)

library("sqldf")
q <- "
SELECT * FROM orig_df
UNION ALL
SELECT * FROM dest_df
ORDER BY sequence
"
poly_df <- sqldf(q)
In the new data frame, the origin location is interwoven between the destination points.
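As an aside, if you'd rather not pull in sqldf, a minimal base-R sketch of the same interleaving, reusing the sequence keys built above:
# stack the two data frames, then sort on the interleaving key
poly_df <- rbind(orig_df, dest_df)
poly_df <- poly_df[order(poly_df$sequence), ]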
And finally, you can make your map:
library("leaflet")

leaflet() %>%
  addTiles() %>%
  addPolylines(
    data = poly_df,
    lng = ~lon,
    lat = ~lat,
    weight = 3,
    opacity = 3
  )
The result is a set of lines that each run from the single starting point out to one destination.
I hope this helps anyone who is looking to do something like this in the future.
Here is a possible approach based on the mapview package. Simply create SpatialLines connecting your start point with each of the end points (stored in locations), bind them together, and display the data using mapview.
library(mapview)
library(raster)

## start point
root <- matrix(c(-73.993438700, 40.750545000), ncol = 2)
colnames(root) <- c("Long", "Lat")

## end points
locations <- data.frame(Long = (-78):(-70), Lat = c(40:44, 43:40))

## create and append spatial lines
lst <- lapply(1:nrow(locations), function(i) {
  SpatialLines(list(Lines(list(Line(rbind(root, locations[i, ]))), ID = i)),
               proj4string = CRS("+init=epsg:4326"))
})
sln <- do.call("bind", lst)

## display data
mapview(sln)
Just don't get confused by the Line-to-SpatialLines procedure (see ?Line, ?SpatialLines).
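For reference, here is how that nested call unpacks, step by step (a sketch using the same root and locations objects; Line, Lines, and SpatialLines come from the sp package, which raster loads):
ln  <- Line(rbind(root, locations[1, ]))  # a single two-point line
lns <- Lines(list(ln), ID = "1")          # one feature wrapping that line
sl  <- SpatialLines(list(lns), proj4string = CRS("+init=epsg:4326"))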