I'm working through eBird code from this webpage:
https://github.com/CornellLabofOrnithology/ebird-best-practices/blob/master/03_covariates.Rmd
with the exception of using my own data. I have a .gpkg of Australia from gadm.org and my own eBird data selected for Australia. I have followed the code exactly, except that I don't use "bcr" (my dataset has no BCR codes) and I removed st_buffer(dist = 10000) from the runGdal call because, for some reason, it prevented me from actually downloading the MODIS data.
EDIT: I have also used the data provided on the site and still received the same error.
I got stuck at this code:
lc_extract <- ebird_buff %>%
  mutate(pland = map2(year_lc, data, calculate_pland, lc = landcover)) %>%
  select(pland) %>%
  unnest(cols = pland)
It returns this error:
Error: Problem with `mutate()` input `pland`.
x error in evaluating the argument 'x' in selecting a method for function 'exact_extract': invalid layer names
i Input `pland` is `map2(year_lc, data, calculate_pland, lc = landcover)`.
I can't seem to figure out how to correct it; I'm rather new to dense geospatial code like this.
There is a free dataset at the link, but I hadn't tried it at first, so it may be that my data is incompatible with the code? However, I have had a look at the gis-data.gpkg provided, and my data from GADM seems fine.
The two code blocks preceding the one above were:
neighborhood_radius <- 5 * ceiling(max(res(landcover))) / 2

ebird_buff <- red_knot %>%
  distinct(year = format(observation_date, "%Y"),
           locality_id, latitude, longitude) %>%
  # for 2019 use 2018 landcover data
  mutate(year_lc = if_else(as.integer(year) > max_lc_year,
                           as.character(max_lc_year), year),
         year_lc = paste0("y", year_lc)) %>%
  # convert to spatial features
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326) %>%
  # transform to modis projection
  st_transform(crs = projection(landcover)) %>%
  # buffer to create neighborhood around each point
  st_buffer(dist = neighborhood_radius) %>%
  # nest by year
  nest(data = c(year, locality_id, geometry))

calculate_pland <- function(yr, regions, lc) {
  locs <- st_set_geometry(regions, NULL)
  exact_extract(lc[[yr]], regions, progress = FALSE) %>%
    map(~ count(., landcover = value)) %>%
    tibble(locs, data = .) %>%
    unnest(data)
}
This has been answered by the author of the webpage.
The solution was this code:
lc_extract <- NULL
for (yr in names(landcover)) {
  # get the buffered checklists for a given year
  regions <- ebird_buff$data[[which(yr == ebird_buff$year_lc)]]
  # get landcover values within each buffered checklist area
  ee <- exact_extract(landcover[[yr]], regions, progress = FALSE)
  # count the number of each landcover class for each checklist buffer
  ee_count <- map(ee, ~ count(., landcover = value))
  # attach the year and locality id back to the checklists
  ee_summ <- tibble(st_drop_geometry(regions), data = ee_count) %>%
    unnest(data)
  # bind to results
  lc_extract <- bind_rows(lc_extract, ee_summ)
}
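If the original map2() version keeps failing with "invalid layer names", one quick check (my own suggestion, not part of the author's solution) is whether every year_lc value actually matches a layer name in the landcover stack, since lc[[yr]] can only be resolved for names that exist:
# layer names in the landcover stack, e.g. "y2014" ... "y2018"
names(landcover)
# year_lc values with no matching layer; this should return character(0)
setdiff(unique(ebird_buff$year_lc), names(landcover))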
Credit goes to Matt Strimas-Mackey.
I am at the final stages of a project where I have been comparing the appraisal price vs the sold price of different properties. The complete code for data collection and tidying is below.
At this stage I am looking at different ways to visualize my data. However, I am quite new to this, so my question is whether anyone has any "new" or special ways of visualizing data that they find useful or intuitive. I have given a couple of examples of what I am able to visualize now using ggplot.
Additionally: right now my visualizations plot all 1275 observations every time. I would, however, also like to visualize the data with the mean and median of the Percentage, SoldAmount and Tax variables, which I am most interested in; for example, the mean value of the Percentage column for different years (see the sketch after the plotting code below).
Appreciate any help!
Complete code:
#Step 1: Load needed libraries
library(tidyverse)
library(rvest)
library(jsonlite)
library(stringi)
library(dplyr)
library(data.table)
library(ggplot2)
#Step 2: Access the URL of where the data is located
url <- "https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/10/"
#Step 3: Direct JSON as format of data in URL
data <- jsonlite::fromJSON(url, flatten = TRUE)
#Step 4: Access all items in API
totalItems <- data$TotalNumberOfItems
#Step 5: Summarize all data from API
allData <- paste0('https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/', totalItems, '/') %>%
  jsonlite::fromJSON(., flatten = TRUE) %>%
  .[1] %>%
  as.data.frame() %>%
  rename_with(~str_replace(., "ListItems.", ""), everything())
#Step 6: remove columns not needed
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
#Step 7: remove whitespace and change to numeric in columns SoldAmount and Tax
#https://stackoverflow.com/questions/71440696/r-warning-argument-is-not-an-atomic-vector-when-attempting-to-remove-whites/71440806#71440806
allData[c("Tax", "SoldAmount")] <- lapply(allData[c("Tax", "SoldAmount")], function(z) as.numeric(gsub(" ", "", z)))
#Step 8: Remove rows where value is NA
#https://stackoverflow.com/questions/4862178/remove-rows-with-all-or-some-nas-missing-values-in-data-frame
alldata <- allData %>%
  filter(across(where(is.numeric), ~ !is.na(.)))
#Step 9: Remove values below 10000 NOK in SoldAmount or Tax.
alldata <- alldata %>%
  filter_all(any_vars(is.numeric(.) & . > 10000))
#Step 10: Calculate percentage change between tax and sold amount and create new column with percent change
#df %>% mutate(Percentage = number/sum(number))
alldata_Percent <- alldata %>% mutate(Percentage = (SoldAmount-Tax)/Tax)
Visualization
# Plot Percentage difference based on County
ggplot(data = alldata_Percent, mapping = aes(x = Percentage, y = County)) +
  geom_point(size = 1.5)
#Plot County with both Date and Percentage difference
theme_set(new = ggthemes::theme_economist())
p <- ggplot(data = alldata_Percent,
            mapping = aes(x = Date, y = Percentage, colour = County)) +
  geom_line(na.rm = TRUE) +
  geom_point(na.rm = TRUE)
p
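For the aggregation part of the question, a minimal sketch of visualizing the mean (and median) Percentage per year. This is my own addition; it assumes the Date column parses with as.Date(), so adjust the format if it is stored differently:
library(lubridate) # for year(); not loaded above

# mean and median Percentage per year
per_year <- alldata_Percent %>%
  mutate(Year = year(as.Date(Date))) %>%
  group_by(Year) %>%
  summarise(mean_pct   = mean(Percentage, na.rm = TRUE),
            median_pct = median(Percentage, na.rm = TRUE))

# one bar per year showing the mean percentage difference
ggplot(per_year, aes(x = Year, y = mean_pct)) +
  geom_col()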
I have a shape file of towns in the north of Spain that I have to join into groups (municipalities, or comarcas in Spanish). I've used st_union from the sf package to join them successfully (each one is its own SpatialPolygonsDataFrame object with a single polygon). I plot each of the municipalities individually and they look fine.
However, once I want to combine the municipalities into a single SpatialPolygonsDataFrame object with multiple polygons, I can't for the life of me manage to do it. I've tried three approaches mostly based on this answer: https://gis.stackexchange.com/questions/155328/merging-multiple-spatialpolygondataframes-into-1-spdf-in-r and this one https://gis.stackexchange.com/questions/141469/how-to-convert-a-spatialpolygon-to-a-spatialpolygonsdataframe-and-add-a-column-t
– If I use raster::union it throws the error
Error in .rowNamesDF<-(x, value = value) : invalid 'row.names' length
– If I use a simple rbind it throws the error
Error in SpatialPolygonsDataFrame(pl, df, match.ID = FALSE) :
Object length mismatch:
pl has 7 Polygons objects, but df has 4 rows
Or something similar for 6/11 of the municipalities.
– If I try a lapply approach (more convoluted) it seems to work, but once I plot it using leaflet the municipalities that gave the error with raster::union or rbind don't look as they should / don't look as they do when I plot them individually.
**Municipalities 1 and 2 work fine. 3 and 4, for example, do not.**
Here's a link to the two files needed to reproduce my code below:
– Link to shape files: https://www.dropbox.com/sh/z9632hworbbchn5/AAAiyq3f_52azB4oFeU46D5Qa?dl=0
– Link to xls file that contains the mapping from towns to municipalities: https://www.dropbox.com/s/4w3fx6neo4t1l3d/listado-comarcas-gipuzkoa.xls?dl=0
And my code:
library(tidyverse)
library(magrittr)
library(sf)
library(ggplot2)
library(lwgeom)
library(readxl)
library(raster)
library(rgdal)   # needed for readOGR() below
library(leaflet) # needed for the map below
#Read shapefile
mapa_municip <- readOGR(dsn = "UDALERRIAK_MUNICIPIOS/UDALERRIAK_MUNICIPIOS.shp")
mapa_municip <- spTransform(mapa_municip, CRS('+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0'))
mapa_municip <- st_as_sf(mapa_municip)
#Read excel that contains the mapping from towns to municipalities
muni2com <- read_excel("listado-comarcas-gipuzkoa.xls",
                       sheet = 1,
                       range = "A1:C91",
                       col_names = TRUE)
comarcas <- list()
count <- 0
for (i in unique(muni2com$Comarca)[1:4]) {
  count <- count + 1
  for (k in unique(muni2com$Municipios[muni2com$Comarca == i])) {
    if (k == unique(muni2com$Municipios[muni2com$Comarca == i])[1]) { # if 1st case, keep this town
      temp <- mapa_municip[mapa_municip$MUNICIPIO == k, ]
    }
    if (k != unique(muni2com$Municipios[muni2com$Comarca == i])[1]) { # otherwise, join with previous ones
      temp <- sf::st_union(temp, mapa_municip[mapa_municip$MUNICIPIO == k, ])
    }
  }
  comarcas[[count]] <- spTransform(as(temp, "Spatial"), CRS('+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0'))
  comarcas[[count]]@data <- data.frame(comarca = i)
}
IDs <- sapply(comarcas, function(x) slot(slot(x, "polygons")[[1]], "ID"))
#Checking
length(unique(IDs)) == length(comarcas)
dfIDs <- data.frame(comarca = IDs)
#Making SpatialPolygons from list of polygons
comarcas2 <- SpatialPolygons(lapply(comarcas,
                                    function(x) slot(x, "polygons")[[1]]))
# Try to coerce to SpatialPolygonsDataFrame (will throw error)
p.df <- data.frame( comarca = unique(muni2com$Comarca)[1:4])
p <- SpatialPolygonsDataFrame(comarcas2, p.df)
# Extract polygon ID's
( pid <- sapply(slot(comarcas2, "polygons"), function(x) slot(x, "ID")) )
# Create dataframe with correct rownames
( p.df <- data.frame( comarca = unique(muni2com$Comarca)[1:4], row.names = pid) )
# Try coercion again and check class
comarcas3 <- SpatialPolygonsDataFrame(comarcas2, p.df)
class(comarcas3)
#Leaflet map
leaflet(options = leafletOptions(zoomControl = FALSE,
                                 zoomSnap = 0.1,
                                 zoomDelta = 1),
        data = comarcas3) %>%
  addProviderTiles(provider = "CartoDB.Positron") %>%
  htmlwidgets::onRender("function(el, x) {
    L.control.zoom({ position: 'topright' }).addTo(this)
  }") %>%
  clearShapes() %>%
  addPolygons(fillColor = "gray",
              opacity = 0.8,
              weight = 0.3,
              color = "white",
              fillOpacity = 0.95,
              smoothFactor = 0.5,
              label = ~comarca,
              highlight = highlightOptions(weight = 1.5,
                                           color = "#333333",
                                           bringToFront = TRUE),
              layerId = ~comarca)
**Note how, if you plot comarcas[[3]] or comarcas[[4]] above instead of comarcas3, the shape of those municipalities is completely different.**
I'd really appreciate any tips you can give me; I've been at it for days and I can't solve it. I assume the problem is related to the error given by rbind, which seems to be the most informative one, but I don't know what it means. Thank you very much in advance.
Are you absolutely, positively required to use the older {sp} package workflow? (The errors you quote most likely arise because sf::st_union(x, y) performs a pairwise union and can return more than one feature, so the polygon count no longer matches your one-row data frame.)
If not, it may be easier to dissolve the municipalities into comarcas using a pure {sf} based workflow: grouping by a comarca column and then summarising will do the trick.
Consider this code:
library(tidyverse)
library(sf)
library(readxl)
library(leaflet)
#Read shapefile
mapa_municip <- st_read("UDALERRIAK_MUNICIPIOS.shp") %>%
  st_transform(4326)
#Read excel that contains the mapping from towns to municipalities
muni2com <- read_excel("listado-comarcas-gipuzkoa.xls",
                       sheet = 1,
                       range = "A1:C91",
                       col_names = TRUE)
# dissolving comarcas using a sf / dplyr based workflow
comarcas <- mapa_municip %>%
  inner_join(muni2com, by = c("MUNICIPIO" = "Municipios")) %>%
  group_by(Comarca) %>%
  summarise() %>% # magic! :))) summarising a grouped sf unions each group's geometries
  ungroup()
leaflet(comarcas) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(color = "red",
              label = ~Comarca)
Is there a way to read a shapefile of a country and download MODIS product data within that country using R?
I tried different approaches using the MODIStsp package (https://docs.ropensci.org/MODIStsp/) as well as the MODISTools package (https://docs.ropensci.org/MODISTools/articles/modistools-vignette.html) and they both only allow me to download MODIS product data for a defined site, but not a country.
Here's an example of how you might achieve this.
Firstly, download the MODIS data that you require; in this example I'm using MCD12Q1.006.
begin_year and end_year are in the format Year.Month.Day (e.g. "2019.01.01").
shape_file is the shapefile you're using; presumably its extent is the country you're after, though I'm only going off the minimal information you have provided.
library(sf)         # st_buffer(), st_transform()
library(tidyverse)  # %>%, purrr::pluck(), stringr::str_extract()
library(raster)     # stack(), aggregate(), rasterize(), trim()
library(MODIS)

tifs <- runGdal(product = "MCD12Q1", collection = "006", SDSstring = "01",
                extent = shape_file %>% st_buffer(dist = 10000),
                begin = begin_year, end = end_year,
                outDirPath = "data", job = "modis",
                MODISserverOrder = "LPDAAC") %>%
  pluck("MCD12Q1.006") %>%
  unlist()
# rename tifs to have more descriptive names
new_names <- format(as.Date(names(tifs)), "%Y") %>%
  sprintf("modis_mcd12q1_umd_%s.tif", .) %>%
  file.path(dirname(tifs), .)
file.rename(tifs, new_names)

landcover <- list.files("data/modis", "^modis_mcd12q1_umd",
                        full.names = TRUE) %>%
  stack()
# label layers with year
landcover <- names(landcover) %>%
  str_extract("(?<=modis_mcd12q1_umd_)[0-9]{4}") %>%
  paste0("y", .) %>%
  setNames(landcover, .)
Also, if you require a particular cell size, you could follow this procedure to get a 5x5 MODIS cell neighborhood.
neighborhood_radius <- 5 * ceiling(max(res(landcover))) / 2
agg_factor <- round(2 * neighborhood_radius / res(landcover))
r <- raster(landcover) %>%
  aggregate(agg_factor)
r <- shape_file %>%
  st_transform(crs = projection(r)) %>%
  rasterize(r, field = 1) %>%
  # remove any empty cells at edges
  trim()
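If you also want to clip the landcover stack itself to the country outline, one option is to crop and mask with the polygon. This is not part of the snippet above; it is a sketch assuming the shape_file and landcover objects already defined:
# crop to the country's bounding box, then set cells outside the polygon to NA;
# converting to Spatial keeps this compatible with older raster versions
country_sp <- shape_file %>%
  st_transform(crs = projection(landcover)) %>%
  as("Spatial")
landcover_country <- landcover %>%
  crop(country_sp) %>%
  mask(country_sp)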
Here's an example using MODISTools to automate downloading the correct tiles for the country.
First let's generate a polygon of a country to demonstrate (using Luxembourg as an example):
library(maptools)
library(sf)
data(wrld_simpl)
world = st_as_sf(wrld_simpl)
lux = world[world$NAME=='Luxembourg',]
Now we find the location (centroid) and size of the country:
#find centroid of polygon in long-lat decimal degrees
lux.cent = st_centroid(lux)
#find width and height of country in km
lux.proj = st_transform(lux,
                        "+proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=km +no_defs")
lux.km_lr = diff(st_bbox(lux.proj)[c(1,3)])
lux.km_ab = diff(st_bbox(lux.proj)[c(2,4)])
Using this info, we can download the correct MODIS data (using leaf-area index, LAI, as an example):
#download the MODIS tiles for the area we defined
library(MODISTools)
lux_lai <- mt_subset(product = "MOD15A2H",
                     lat = lux.cent$LAT, lon = lux.cent$LON,
                     band = "Lai_500m",
                     start = "2004-01-01", end = "2004-01-01",
                     km_lr = lux.km_lr, km_ab = lux.km_ab,
                     site_name = "Luxembourg",
                     internal = TRUE, progress = TRUE)
# convert to a spatial raster
lux.rast = mt_to_raster(df = lux_lai, reproject = TRUE)
lux.rast = raster::mask(lux.rast, lux)
plot(lux.rast)
plot(st_geometry(lux), add = TRUE)
I have tried several sources but had no luck. Please see my code below; I state the problem at the end. I have created random hexagonal grids over large areas and want to summarize how many of them fall within the features of a second spatial polygon data frame.
library(sf)
library(dplyr)
library(raster)
# load 2nd spdf
# Read ibra polygons as an sf object. To download, search for 'Interim Biogeographic
# Regionalisation for Australia (IBRA)' and click on 'Interim Biogeographic
# Regionalisation for Australia (IBRA), Version 7 (Regions)'.
ibra <- st_read("ibra7_subregions.shp")
ibra <- st_transform(ibra, crs = 4326)
# ibra has >2000 features (i.e., rows) for 89 regions of same name, group them together
ibraGrid <- ibra %>%
  group_by(REG_NAME_7) %>%
  st_sf() %>%
  mutate(cellid = row_number()) %>%
  summarise()
colnames(ibraGrid)[1] <- "id"
# crop ibra to specific boundary
box <- extent(112,155,-45,-10)
ibraGrid <- st_crop(ibraGrid, box)
# make dataframe of spatial grid (1st spdf)
# Load the shp of au from here, then click on "nsaasr9nnd_02211a04es_geo___.zip".
au <- st_read("aust_cd66states.shp")
au <- st_transform(au, crs = 4326)
ran.p <- st_sample(au, size = 1040)
# create grid around multipoints
rand_sampl_Grid <- ran.p %>%
  st_make_grid(cellsize = 0.1, square = FALSE) %>%
  st_intersection(au) %>%
  st_cast("MULTIPOLYGON") %>%
  st_sf() %>%
  mutate(cellid = row_number())
# sampled grid per ibra region
density_per_ib_grid <- ibraGrid %>%
  st_join(rand_sampl_Grid) %>%
  mutate(overlap = ifelse(!is.na(id), 1, 0)) %>%
  group_by(cellid) %>%
  summarize(num_sGrid = sum(overlap))
Everything worked well, but I expected the length of density_per_ib_grid$num_sGrid to equal the number of features in ibraGrid (i.e., 89). Currently it has length equal to the number of features in rand_sampl_Grid (i.e., ~1040). In addition, I want to repeat the process 100 times so that num_sGrid is the mean over 100 iterations.
The above code worked as desired using a larger spdf (ibraGrid in this case) created from coordinates. Any suggestions/feedback will be highly appreciated.
I have figured out the solution. The last code section in the question above should be:
richness_per_ib_grid <- st_intersection(ibraGrid, rand_sampl_Grid) %>%
  group_by(id) %>%
  count()
out <- as.data.frame(richness_per_ib_grid)[, -3] # print output as a data frame
Therefore the complete answer for the question above should be:
library(sf)
library(dplyr)
library(raster)
# load 2nd spdf
# Read ibra polygons as an sf object. To download, search for 'Interim Biogeographic
# Regionalisation for Australia (IBRA)' and click on 'Interim Biogeographic
# Regionalisation for Australia (IBRA), Version 7 (Regions)'.
ibra <- st_read("ibra7_subregions.shp")
ibra <- st_transform(ibra, crs = 4326)
# ibra has >2000 features (i.e., rows) for 89 regions of same name, group them together
ibraGrid <- ibra %>%
  group_by(REG_NAME_7) %>%
  st_sf() %>%
  summarise()
colnames(ibraGrid)[1] <- "id"
# crop ibra to specific boundary
box <- extent(112,155,-45,-10)
ibraGrid <- st_crop(ibraGrid, box)
# make dataframe of spatial grid (1st spdf)
# Load the shp of au from here, then click on "nsaasr9nnd_02211a04es_geo___.zip".
au <- st_read("aust_cd66states.shp")
au <- st_transform(au, crs = 4326)
ran.p <- st_sample(au, size = 1040)
# create grid around multipoints
rand_sampl_Grid <- ran.p %>%
  st_make_grid(cellsize = 0.1, square = FALSE) %>%
  st_intersection(au) %>%
  st_cast("MULTIPOLYGON") %>%
  st_sf()
# sampled grid per ibra region
density_per_ib_grid <- st_intersection(ibraGrid, rand_sampl_Grid) %>%
  group_by(id) %>%
  count()
out <- as.data.frame(density_per_ib_grid)[, -3] # print output as a data frame
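The 100-iteration averaging asked about in the question is not covered above. Here is a minimal sketch of one way to do it (my own addition, assuming the au and ibraGrid objects from the code above, plus purrr for the loop):
library(purrr)

n_iter <- 100
counts <- map_dfr(seq_len(n_iter), function(i) {
  # re-draw the random points and rebuild the hexagonal grid each iteration
  ran.p <- st_sample(au, size = 1040)
  grid_i <- ran.p %>%
    st_make_grid(cellsize = 0.1, square = FALSE) %>%
    st_intersection(au) %>%
    st_cast("MULTIPOLYGON") %>%
    st_sf()
  # count grid cells per ibra region for this iteration
  st_intersection(ibraGrid, grid_i) %>%
    st_drop_geometry() %>%
    count(id) %>%
    mutate(iter = i)
})

# mean count per ibra region over all iterations
mean_counts <- counts %>%
  group_by(id) %>%
  summarise(num_sGrid = mean(n))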
I want to create a function that takes a simple feature layer and a variable name and creates random points based on the variable values. I can do this without a problem sequentially using pipes (%>%), but I'm getting stuck on setting up a function using pipes to do the same.
library(tidyverse)
library(sf)
library(tmap)
data("World") # load World sf dataset from tmap
# this works to create a point layer of population by country
World_pts <- World %>%
  select(pop_est) %>%
  filter(pop_est >= (10^6)) %>%
  st_sample(., size = round(.$pop_est/(10^6))) %>% # create 1 random point for every 1 million people
  st_sf(.)
# here's what it looks like
tm_shape(World) + tm_borders() + tm_shape(World_pts) + tm_dots()
# this function to do the same does not work
pop2points <- function(sf, x){
  x <- enquo(x)
  sf %>%
    select(!!x) %>%
    filter(!!x >= (10^6)) %>% # works up to here
    st_sample(., size = round(!!.$x/(10^6))) %>% # this is where it breaks
    st_sf(.)
}
World_pts <- pop2points(World,pop_est)
I suspect that I'm getting confused about how to handle non-standard evaluation in a function argument.
One option would be converting your x to a label and using the .[[ approach for referring to the column name:
pop2points <- function(sf, x){
  x <- enquo(x)
  sf %>%
    select(!!x) %>%
    filter(!!x >= (10^6)) %>%
    st_sample(., size = round(.[[as_label(x)]] / (10^6))) %>%
    st_sf(.)
}
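For what it's worth, the same thing can also be written with rlang's curly-curly operator on more recent dplyr/rlang versions. This is a sketch, functionally equivalent to the answer above; pop2points2 and the sf_obj argument name are my own:
library(tidyverse)
library(sf)
library(tmap)

pop2points2 <- function(sf_obj, x) {
  sf_obj %>%
    select({{ x }}) %>%
    filter({{ x }} >= (10^6)) %>%
    # pull() extracts the column, since .$x would look for a column literally named "x"
    st_sample(., size = round(pull(., {{ x }}) / (10^6))) %>%
    st_sf()
}

data("World")
World_pts <- pop2points2(World, pop_est)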