I want to plot rivers (lines) on a map containing polygons (counties, etc.) for South Dakota. The river data is here: https://www.weather.gov/gis/Rivers (I am using the "subset of rivers" data set). The county shapefile can be downloaded from https://www2.census.gov/geo/tiger/TIGER2020/COUNTY/.
I only want the rivers that lie within the county boundaries of South Dakota, so I am using rgeos::gIntersection to perform that. It produces a Large SpatialLines object, which ggplot2 doesn't like when I try to plot it with geom_line (I get an error that says "Error: data must be a data frame, or other object coercible by fortify(), not an S4 object with class SpatialLines.").
Here is my code:
library(rgdal)
library(raster)
# read counties, keep South Dakota (STATEFP 46), project to EPSG:3395
counties <- readOGR('D:\\Shapefiles\\Counties\\tl_2020_us_county.shp')
counties <- counties[which(counties$STATEFP == '46'),]
counties <- spTransform(counties, CRS("+init=epsg:3395"))
# read rivers, assign a lon/lat CRS, project to EPSG:3395
rivers <- readOGR('D:\\Shapefiles\\Main_Rivers\\rs16my07.shp')
proj4string(rivers) <- CRS("+proj=longlat")
rivers <- spTransform(rivers, CRS("+init=epsg:3395"))
# clip the rivers to the county boundaries
rivers <- as.SpatialLines.SLDF(rgeos::gIntersection(counties, rivers))
The raster package's intersect() function does not work for doing the intersection. I think I need to change the SpatialLines object to a SpatialLinesDataFrame object to get ggplot2 to plot the rivers. How do I do that? The as.SpatialLines.SLDF function is not doing it. Is there another way to get this to plot? My plotting code is here:
ggplot() +
  geom_path(counties, mapping = aes(x = long, y = lat, group = group, col = 'darkgreen')) +
  geom_path(rivers, mapping = aes(x = long, y = lat, color = 'blue'))
I would recommend handling your spatial data with the sf library. Firstly, it plays well with ggplot2. Also, according to my very much infant understanding of GIS and spatial data in R, the idea is that sf will eventually take over from sp and the Spatial* data formats; sf implements the simple features standard, which is used across multiple platforms. See this link for more details on sf.
Onto your question - this is quite simple using sf. To find the rivers inside a specific county, we use st_intersection() (the sf version of gIntersection).
library(sf)
library(dplyr) # for filter(); the %>% pipe is re-exported by dplyr

# read in the rivers data
st_read(dsn = 'so_data/rs16my07', layer = 'rs16my07') %>%
  {. ->> my_rivers}

# set the CRS for the rivers data (plain lon/lat)
st_crs(my_rivers) <- '+proj=longlat'

# transform crs
my_rivers %>%
  st_transform(3395) %>%
  {. ->> my_rivers_trans}
# read in counties data
st_read(dsn = 'so_data/tl_2020_us_county') %>%
  {. ->> my_counties}

# keep state 46 (South Dakota)
my_counties %>%
  filter(STATEFP == '46') %>%
  {. ->> state_46}

# transform crs
state_46 %>%
  st_transform(3395) %>%
  {. ->> state_46_trans}

# keep only rivers inside state 46
my_rivers_trans %>%
  st_intersection(state_46_trans) %>%
  {. ->> my_rivers_46}
Then we can plot the sf objects using ggplot and geom_sf(), just like you would plot lines using geom_line() etc. geom_sf() seems to work out whether you are plotting point, line or polygon data, and plots accordingly. It is quite easy to use.
# plot it
state_46_trans %>%
  ggplot() +
  geom_sf() +
  geom_sf(data = my_rivers_46, colour = 'red')
Hopefully this looks right - I don't know my US states so have no idea if this is South Dakota or not.
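As an aside, if you would rather stay with sp: the direct answer to "how do I turn a SpatialLines into a SpatialLinesDataFrame" is to wrap it with sp::SpatialLinesDataFrame(), attaching a data frame whose row names match the line IDs. A rough sketch (assuming rivers is the bare SpatialLines returned by gIntersection()):
library(sp)
# collect the ID of each Lines element so the attribute rows can be matched
ids <- sapply(slot(rivers, "lines"), function(l) slot(l, "ID"))
rivers_sldf <- SpatialLinesDataFrame(rivers, data = data.frame(id = ids, row.names = ids))
# fortify() now works, returning a data frame with long, lat and group columns
rivers_df <- ggplot2::fortify(rivers_sldf)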
I'm trying to crop daily NetCDF data with a polygon, using the stars package. I think I have managed to do it and could get this plot
with this script
library(tidyverse)
library(sf)
library(stars)
# Input nc file
nc.file <- "20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc"
# read nc data
nc.data <- read_ncdf(nc.file, var="analysed_sst")
# Read mask coordinates
coordenades.poligon <- read_csv("coordenades_poligon.csv")
colnames(coordenades.poligon) <- c("lon","lat")
# Build sf polygon to crop data
polygon <- coordenades.poligon %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  summarise(geometry = st_combine(geometry)) %>%
  st_cast("POLYGON")
# Crop data
nc.stars.crop <- st_crop(nc.data, polygon)
# plot
ggplot() + geom_stars(data = nc.stars.crop) +
  coord_equal() + theme_void() +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))
Now I would like to combine lon, lat and analysed_sst in a data frame. I managed to extract the coordinates with
nc.stars.coords <- as.data.frame(st_coordinates(nc.stars.crop))
but I can't find how to get the corresponding SST values to cbind with longitude and latitude. Maybe there are other solutions with the ncdf4 package.
Thank you very much for your help
EDIT 1
Link to SST original data (nc file): SST data
EDIT 2
Added the head of coordenades_poligon.csv. The first two columns are longitude and latitude points, the third column is the area ID and the fourth one denotes the season. These are just the coordinates of a single area, filtered by ID and season.
12.5,44.5,Z1,S
2,44.5,Z1,S
0,41.5,Z1,S
4,40,Z1,S
9,40,Z1,S
9,42,Z1,S
0,41.5,Z2,S
I am making assumptions here, because this is not my area of expertise, but you should be able to simply transform this into a data frame using the raster package. This seems to be the way to go, also according to this author.
raster::as.data.frame(nc.stars.crop, xy = TRUE)
At least for me this worked. And then you could transform it back into a simple features object, if you are so inclined, with
raster::as.data.frame(nc.stars.crop, xy = TRUE) %>%
  sf::st_as_sf(coords = c('lon', 'lat'))
However, the transformation to lon/lat is not really exact, because it produces point data, whereas the original information is raster data. So some information clearly gets lost.
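Alternatively, stars ships its own as.data.frame() method, so you may not need raster at all. A minimal sketch, assuming the nc.stars.crop object from the question:
# one row per cell: the dimension columns (lon, lat, time) plus analysed_sst
sst.df <- as.data.frame(nc.stars.crop)
head(sst.df)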
sf::st_as_sf() seems to work out of the box for this, but I am not sure, because I have no way to validate the transformation of the original data. For me the following worked:
read_ncdf('20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc', var = "analysed_sst") %>%
  sf::st_as_sf()
This creates polygons the size of your initial raster cells and seems to preserve all necessary information.
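Building on that, if the goal is a lon/lat/sst data frame, one route (a sketch, not validated against your data) is to take the centroid of each cell polygon and bind the coordinates to the attribute values:
sst_sf <- sf::st_as_sf(nc.stars.crop) # one polygon per raster cell
sst_pts <- sf::st_centroid(sst_sf) # warns for lon/lat data, but fine for cell centres
sst_df <- cbind(sf::st_coordinates(sst_pts), sf::st_drop_geometry(sst_pts))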
Finally, here is a workaround for extracting exactly the data you were plotting. You can access the data that ggplot used by assigning the ggplot to a variable and then accessing the data layer.
p <- ggplot() + geom_stars(data = nc.stars.crop) +
  coord_equal() + theme_void() +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))
p$layers[[1]]$data
Hi, I've been working on this a lot but haven't gotten any clear answers. Basically I have a dataframe with sites and chemical analyses from those sites, whose coordinates I was able to convert into a geometry using st_as_sf. I am also using a separate shapefile named Cave_initial. Now what I want to do is plot the points from the dataframe on top of the shapefile as a single map.
I have tried using geom_sf() but at the very best it plots the points on one graph and then the shapefile as a separate graph. But I need them together.
Master_cave_data <- read_xlsx("./JW_cave_master_version.xlsx", range = "C2:AK85") #dataset containing chemical data and lat/long as numerics
cave_system <- st_read("./IllinoisCaverns/Cave_System.shp") #shapefile created by colleague
Master_cave_data <- Master_cave_data %>%
  st_as_sf(coords = c('Long_DD', 'Lat_DD'), crs = 4326, sf_column_name = NULL)
# transforming my coordinate data into lat/long;
# due to the size of the dataset, dput is not advisable to display. A new geometry column is created from Long_DD/Lat_DD.
Jul_Data <- filter(Master_cave_data, Month == "Jul") # filtering data for one month
Jul_Coliform_Data_map <- Jul_Data %>%
  ggplot() +
  geom_sf(data = Jul_Data$Geometry) +
  Cave_initial
Jul_Coliform_Data_map
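For reference, the usual pattern for combining such layers is to give each sf object its own geom_sf() layer inside a single ggplot() call; coord_sf() will reproject the layers to a shared CRS. A rough sketch with the objects above:
Jul_Coliform_Data_map <- ggplot() +
  geom_sf(data = cave_system) + # the shapefile layer
  geom_sf(data = Jul_Data, colour = 'red') # the points on top
Jul_Coliform_Data_map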
I am not sure why I keep getting NA whenever I run the over() function with latitude and longitude points on the polygon from a shapefile. Please note that this is the first time I am doing spatial analysis, but I have done my research and replicated things, and still didn't succeed. I need the points which are outside of the polygon to be NA, so I can focus on the real data.
I have read these sources, since they pertain to my problem, but I can't work it out:
sp::over() for point in polygon analysis
https://gis.stackexchange.com/questions/133625/checking-if-points-fall-within-polygon-shapefile
https://gis.stackexchange.com/questions/278723/r-error-in-checking-point-inside-polygon
Here is my code chunk
library(sp)
library(rgdal)
library(readr)
gainsville_df <- read_csv("311_Service_Requests__myGNV_.csv")
gnv <- readOGR("~\\Downloads\\GIS_cgbound", layer = "cgbound")
gnv_latlon <- spTransform(gnv, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84"))
gnv_raw <- data.frame(Longitude= gainsville_df$Longitude, Latitude= gainsville_df$Latitude)
coordinates(gnv_raw) <- ~Longitude + Latitude
proj4string(gnv_raw) <- proj4string(gnv)
over(gnv_raw, as(gnv,"SpatialLinesDataFrame"))
#Yields:
# FID_cgboun Id Perimeter Area Acres Hectares Shape_Leng
#1 NA NA NA NA NA NA NA
# Desired output:
# I should have seen which Gainesville latitudes and longitudes are within the shapefile
# polygon, so I can drop the outliers that have NA. According to this, none of my lat/lon
# points are inside the polygon.
The datafiles are here:
Shapefile: https://github.com/THsTestingGround/SO_readOGR_quest/tree/master/GIS_cgbound
reading csv file: https://github.com/THsTestingGround/SO_readOGR_quest/blob/master/311_Service_Requests__myGNV_.csv
I would appreciate if someone can help me out.
I realized that your point data is stored as WKT text, since you have POINT (-82.34323174 29.67058748) as a character string. Hence, I reconstructed your data first. I assigned a projection here as well.
library(tidyverse)
library(sf)
library(RCurl)
url <- getURL("https://raw.githubusercontent.com/THsTestingGround/SO_readOGR_quest/master/311_Service_Requests__myGNV_.csv")
mydf <- read_csv(url) %>%
  mutate(Location = gsub(x = Location, pattern = "POINT \\(|\\)", replacement = "")) %>%
  separate(col = "Location", into = c("lon", "lat"), sep = " ", convert = TRUE) %>% # convert = TRUE makes lon/lat numeric
  st_as_sf(coords = c("lon", "lat")) %>%
  st_set_crs(4326)
I imported your shapefile using the sf package, since your data (mydf in this demonstration) is an sf object. When I imported the data, I realized that I had LINESTRING, not polygons. I believe this is the reason why over() did not work. Here I created polygons. Specifically, I joined all seven polygons together.
mypoly <- st_read("cgbound.shp") %>%
  st_transform(crs = 4326) %>%
  st_polygonize() %>%
  st_union()
Let's check what your data points and polygon look like. You surely have data points staying outside of the polygon.
ggplot() +
  geom_sf(data = mypoly) +
  geom_point(data = mydf, aes(x = Longitude, y = Latitude))
You said, "I need some points which are outside of the polygon to be NA." So I decided to create a new column in mydf, called check, using st_intersects(). If a data point lies in the polygon, you see TRUE in check; otherwise, you see FALSE.
mutate(mydf,
       check = as.vector(st_intersects(x = mydf, y = mypoly, sparse = FALSE))) -> result
Finally, check how the data points were classified.
ggplot() +
  geom_sf(data = mypoly) +
  geom_point(data = result, aes(x = Longitude, y = Latitude, color = check))
If you want to mix over() with this sf approach, you can do the following.
mutate(mydf,
       check = over(as(mydf, "Spatial"), as(mypoly, "Spatial")))
The last thing you want to do is subset the data:
filter(result, check == TRUE)
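If you literally want NA (rather than dropped rows) for the points outside the polygon, a hedged sketch is to blank out every other column where check is FALSE (column names depend on your data):
result_na <- mutate(result, across(-c(geometry, check), ~ ifelse(check, .x, NA)))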
THE SIMPLEST WAY
I demonstrated how things work with this sf approach, but the following is actually all you need. st_filter() extracts the data points staying in mypoly; the data points staying outside are removed. If you do not have to create NAs for those points, this is much easier.
st_filter(x = mydf, y = mypoly, predicate = st_intersects) -> result2
ggplot() +
  geom_sf(data = mypoly) +
  geom_sf(data = result2)
I have a dataframe with three columns: city_name, longitude, latitude. Using ggplot I am attempting to visualize the data using longitude and latitude as coordinates, which represent the given city. I also want to label each point with the city name. Unfortunately the scale isn't quite right, so the points aren't mapped in the right location.
Example data for dataframe:
city_name <- c("Berlin", "Brussels", "Paris")
longitude <- c("13.405", "4.3517", "2.3522")
latitude <- c("52.52", "50.8503", "48.8566")
df <- data.frame(city_name, longitude, latitude)
I am using ggplot2.
mapWorld <- borders("world", colour = "gray50", fill = "gray50") # create a layer of borders
ggplot(df, aes(x = longitude, y = latitude, label = Name)) +
  geom_point() +
  geom_text(aes(label = city_name), hjust = 0, vjust = 0) +
  mapWorld
Current result:
https://imgur.com/K3RvqTm
Expected result would be mapping the coordinates to their correct location.
Thank you all in advance!
The issue seems to stem from the format of your latitude and longitude data: quoting each coordinate makes them character strings. Instead, define them without quotes so they are numeric.
I also recommend leaflet for a wider array of mapping functionality. The code below worked for me:
city_name <- c("Berlin", "Brussels", "Paris")
longitude <- c(13.405, 4.3517, 2.3522)
latitude <- c(52.52, 50.8503, 48.8566)
df <- data.frame(city_name, longitude, latitude)
library(leaflet)
df$longitude <- as.numeric(df$longitude)
df$latitude <- as.numeric(df$latitude)
leaflet() %>%
  addTiles() %>%
  addMarkers(data = df, lng = ~longitude, lat = ~latitude) %>%
  setView(10, 50, zoom = 4)
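For reference, the original ggplot approach also works once the coordinates are numeric (and the stray label = Name aesthetic is dropped) - a minimal sketch using the df defined above:
library(ggplot2)
mapWorld <- borders("world", colour = "gray50", fill = "gray50")
ggplot(df, aes(x = longitude, y = latitude)) +
  mapWorld + # draw the borders first so the points sit on top
  geom_point() +
  geom_text(aes(label = city_name), hjust = 0, vjust = 0)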
On top of the solution already provided, you might find it helpful to look into the sf package which, in my opinion, makes spatial data much more pleasant to work with. For example you can do:
library(ggrepel)
library(sf)
library(ggplot2)
mapWorld <- borders("world", colour="gray50", fill="gray50") # create a layer of borders
# define the data frame, ensuring lat and lon are numeric vectors
df <- data.frame(city_name = c("Berlin", "Brussels", "Paris"),
                 longitude = c(13.405, 4.3517, 2.3522),
                 latitude = c(52.52, 50.8503, 48.8566))
# convert into an sf object, letting it know the columns we want to use for X and Y,
# setting crs = 4326 for lon/lat data and remove = F to stop those columns from being dropped
df_sf <- st_as_sf(df, coords = c('longitude', 'latitude'), crs = 4326, remove = F)
# it plays nicely with ggplot via the 'geom_sf' geom
ggplot(df_sf) +
  mapWorld +
  geom_sf() +
  geom_text_repel(aes(x = longitude, y = latitude, label = city_name))
You'll notice sf objects come with their own 'geometry' column, which is recognised and plays nicely with ggplot. One thing to note: be careful with your layer ordering - if you add mapWorld to your ggplot as the last layer, it will appear at the very top of the plot and may cover your points!
I'm making a map with arc lines connecting counties in the US state of Missouri. I've calculated 'good enough' centres for each county by taking the mean of each polygon's long/lat. This works well for the more or less square-shaped counties, but less so for the more intricately shaped ones. I think this must be a common problem, but I can't find an answer online or with any function I've written. I'd like to use a tidyverse workflow (i.e. not transform to spatial objects if I can help it). Are there any tidyverse solutions to the problem at hand?
You can see the problem in the examples below.
library(tidyverse)
# import all state/county fortified
all_states <- as_tibble(map_data('county'))
# filter for missouri
mo_fortify <- all_states %>%
  filter(region == 'missouri')
## Pull Iron county, which is relatively oddly shaped
mo_iron_fortify <- mo_fortify %>%
  group_by(subregion) %>%
  mutate(long_c = mean(long),
         lat_c = mean(lat),
         iron = ifelse(subregion == 'iron', 'Iron', 'Others')) %>%
  ungroup()
# map a ggplot2 map
mo_iron_fortify %>%
  ggplot(aes(long, lat, group = group)) +
  geom_polygon(aes(fill = iron),
               color = 'white') +
  geom_point(aes(long_c, lat_c)) +
  scale_fill_grey('Iron county is\na good example') +
  coord_map() +
  theme_bw()
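For what it's worth: if you can tolerate a brief round trip through sf, st_point_on_surface() returns a label point guaranteed to fall inside each polygon, which is exactly what oddly shaped counties like Iron need. A hedged sketch (it assumes the sf and maps packages, and steps outside the pure-tidyverse constraint):
library(sf)
# convert the maps database straight to sf polygons (one row per county)
mo_sf <- st_as_sf(maps::map('county', 'missouri', plot = FALSE, fill = TRUE))
# st_point_on_surface() warns for lon/lat data but returns usable interior points
mo_centres <- st_point_on_surface(st_geometry(mo_sf))
mo_centres_df <- as_tibble(st_coordinates(mo_centres)) %>%
  mutate(subregion = gsub('missouri,', '', mo_sf$ID))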