I have a dataframe with three columns: city_name, longitude, and latitude. Using ggplot I am attempting to visualize the data, using longitude and latitude as coordinates representing each city. I also want to label each point with the city name. Unfortunately the scale isn't quite right, so the points are not mapped in the right locations.
Example data for dataframe:
city_name <- c("Berlin", "Brussels", "Paris")
longitude <- c("13.405", "4.3517", "2.3522")
latitude <- c("52.52", "50.8503", "48.8566")
df <- data.frame(city_name, longitude, latitude)
I am using ggplot2.
mapWorld <- borders("world", colour="gray50", fill="gray50") # create a layer of borders
ggplot(df, aes(x = longitude, y = latitude)) +
  geom_point() +
  geom_text(aes(label = city_name), hjust = 0, vjust = 0) +
  mapWorld
Current result:
https://imgur.com/K3RvqTm
The expected result would be the coordinates mapped to their correct locations.
Thank you all in advance!
The issue stems from the format of your latitude and longitude data: the coordinates are quoted, so they are stored as character strings rather than numbers. Refer to them without quotes.
I also recommend leaflet for a wider array of mapping functionality. The code below worked for me:
library(leaflet)

city_name <- c("Berlin", "Brussels", "Paris")
longitude <- c(13.405, 4.3517, 2.3522)  # numeric, not quoted strings
latitude  <- c(52.52, 50.8503, 48.8566)
df <- data.frame(city_name, longitude, latitude)

leaflet() %>%
  addTiles() %>%
  addMarkers(data = df, lng = ~longitude, lat = ~latitude) %>%
  setView(lng = 10, lat = 50, zoom = 4)
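For reference, once the coordinates are numeric the original ggplot2 approach works too. A minimal sketch (borders() ships with ggplot2 but requires the maps package to be installed); the borders layer is added first so it does not cover the points:
library(ggplot2)
mapWorld <- borders("world", colour = "gray50", fill = "gray50")
ggplot(df, aes(x = longitude, y = latitude)) +
  mapWorld +                                           # base map first
  geom_point() +                                       # city points on top
  geom_text(aes(label = city_name), hjust = 0, vjust = 0)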
On top of the solution already provided, you might find it helpful to look into the sf package which, in my opinion, makes spatial data much more pleasant to work with. For example you can do:
library(ggrepel)
library(sf)
library(ggplot2)
mapWorld <- borders("world", colour="gray50", fill="gray50") # create a layer of borders
# define data frame ensuring lat and lon are numeric vectors
df <- data.frame(city_name = c("Berlin", "Brussels", "Paris"),
longitude = c(13.405, 4.3517, 2.3522),
latitude = c(52.52, 50.8503, 48.8566))
# convert into an sf object, letting it know the columns we want to use for X and Y
# setting crs = 4326 for lon/lat data and remove = F to stop those columns from being dropped
df_sf <- st_as_sf(df, coords=c('longitude', 'latitude'), crs = 4326, remove = F)
# it plays nicely with ggplot via the 'geom_sf' geom
ggplot(df_sf)+
mapWorld +
geom_sf() +
geom_text_repel(aes(x=longitude, y=latitude,label=city_name))
You'll notice sf objects come with their own 'geometry' column, which is recognised and plays nicely with ggplot. One thing to note: be careful with your layer ordering. By adding mapWorld to your ggplot as the last layer, it will appear at the very top of the plot and may cover your points!
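As an optional aside, if the three cities are hard to pick out on a full world map, coord_sf() can crop the view to a region of interest; the limits below are illustrative values roughly covering western Europe:
ggplot(df_sf) +
  mapWorld +
  geom_sf() +
  geom_text_repel(aes(x = longitude, y = latitude, label = city_name)) +
  coord_sf(xlim = c(-10, 25), ylim = c(40, 60))  # illustrative window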
I'm trying to crop a NetCDF file of daily data with a polygon, using the stars package. I think I have managed to do it and can get a reasonable plot with this script:
library(tidyverse)
library(sf)
library(stars)
# Input nc file
nc.file <- "20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc"
# read nc data
nc.data <- read_ncdf(nc.file, var="analysed_sst")
# Read mask coordinates
coordenades.poligon <- read_csv("coordenades_poligon.csv")
colnames(coordenades.poligon) <- c("lon","lat")
# Build sf polygon to crop data
polygon <- coordenades.poligon %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
summarise(geometry = st_combine(geometry)) %>%
st_cast("POLYGON")
# Crop data
nc.stars.crop <- st_crop(nc.data,polygon)
# plot
ggplot() + geom_stars(data=nc.stars.crop) +
coord_equal() + theme_void() +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))
Now I would like to combine lon, lat and analysed_sst in a data frame. I managed to extract coordinates with
nc.stars.coords <- as.data.frame(st_coordinates(nc.stars.crop))
But I can't find how to get the corresponding SST values to cbind with longitude and latitude. Maybe there are other solutions with the ncdf4 package.
Thank you very much for your help
EDIT 1
Link to SST original data (nc file): SST data
EDIT 2
Added the head of coordenades_poligon.csv. The first two columns are longitude and latitude, the third is the area ID, and the fourth denotes the season. These are just the coordinates of a single area, filtered by ID and season.
12.5,44.5,Z1,S
2,44.5,Z1,S
0,41.5,Z1,S
4,40,Z1,S
9,40,Z1,S
9,42,Z1,S
0,41.5,Z2,S
I am making assumptions here, because this is not my area of expertise, but you should be able to simply transform this into a data frame using the raster package. This seems to be the way to go, also according to this author.
raster::as.data.frame(nc.stars.crop, xy = TRUE)
At least for me this worked. And then you could transform it back into a simple features object, if you are so inclined with
raster::as.data.frame(nc.stars.crop, xy = TRUE) %>%
sf::st_as_sf(coords = c('lon','lat'))
However, the transformation to lon/lat is not really exact, because it produces point data, whereas the original information is raster data. So there is clearly information that gets lost.
sf::st_as_sf() seems to work out of the box for this, but I am not sure, because I have no way to validate the transformation of the original data. For me the following worked:
read_ncdf('20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc', var="analysed_sst") %>%
sf::st_as_sf()
This creates polygons, the size of your initial raster tiles and seems to conserve all necessary information.
Finally, here is a workaround to extract exactly the data you were plotting: assign the ggplot to a variable, then access the data of its first layer.
p <- ggplot() + geom_stars(data=nc.stars.crop) +
coord_equal() + theme_void() +
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))
p$layers[[1]]$data
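As a hedged side note, the lon/lat/sst data frame the question asks for can also come straight from the stars object, since stars provides an as.data.frame method; column names follow the NetCDF dimension and variable names (assumed here to be lon, lat, and analysed_sst):
nc.stars.df <- as.data.frame(nc.stars.crop)  # columns: lon, lat, (time,) analysed_sst
nc.stars.df <- na.omit(nc.stars.df)          # st_crop marks cells outside the polygon as NA
head(nc.stars.df)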
I've been working on this a lot but have not gotten any clear answers. Basically, I have a dataframe with sites and chemical analyses from those sites, whose coordinates I was able to convert into a geometry column using st_as_sf. I am also using a separate shapefile named Cave_initial. Now what I want to do is plot the points from the dataframe on top of the shapefile as a single map.
I have tried using geom_sf(), but at best it plots the points on one graph and the shapefile on a separate graph. I need them together.
Master_cave_data <- read_xlsx("./JW_cave_master_version.xlsx", range = "C2:AK85") #dataset containing chemical data and lat/long as numerics
cave_system <- st_read("./IllinoisCaverns/Cave_System.shp") #shapefile created by colleague
Master_cave_data <- Master_cave_data %>%
st_as_sf(coords = c('Long_DD', 'Lat_DD'), crs = 4326, sf_column_name = NULL)
# transforming my coordinate data into lat/long
# due to the size of the dataset, dput is not advisable for display; a new
# geometry column is created from Long_DD/Lat_DD
Jul_Data <- filter(Master_cave_data, Month == "Jul") # filtering data for one month
Jul_Coliform_Data_map <- Jul_Data %>%
ggplot() +
geom_sf(data = Jul_Data$Geometry) +
Cave_initial
Jul_Coliform_Data_map
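For what it's worth, here is a hedged sketch of what the combined plot should presumably look like: both layers go inside a single ggplot() call, each through its own geom_sf(), passing the sf objects themselves rather than a geometry column (cave_system is the shapefile read above, Jul_Data the filtered point data):
Jul_Coliform_Data_map <- ggplot() +
  geom_sf(data = cave_system) +  # cave shapefile as the base layer
  geom_sf(data = Jul_Data)       # July chemistry sites drawn on top
Jul_Coliform_Data_map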
I am not sure why I keep getting NA whenever I run the over() function with latitude and longitude points on the polygon from a shapefile. Please note that this is my first time doing spatial analysis, but I have done my research and replicated examples, without success. I need the points which are outside of the polygon to be NA, so I can focus on the real data.
I read these sources, which pertain to my problem, but I couldn't work it out:
sp::over() for point in polygon analysis
https://gis.stackexchange.com/questions/133625/checking-if-points-fall-within-polygon-shapefile
https://gis.stackexchange.com/questions/278723/r-error-in-checking-point-inside-polygon
Here is my code chunk
library(sp)
library(rgdal)
library(readr)
gainsville_df <- read_csv("311_Service_Requests__myGNV_.csv")
gnv <- readOGR("~\\Downloads\\GIS_cgbound", layer = "cgbound")
gnv_latlon <- spTransform(gnv, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84"))
gnv_raw <- data.frame(Longitude= gainsville_df$Longitude, Latitude= gainsville_df$Latitude)
coordinates(gnv_raw) <- ~Longitude + Latitude
proj4string(gnv_raw) <- proj4string(gnv)
over(gnv_raw, as(gnv,"SpatialLinesDataFrame"))
# Yields:
# FID_cgboun Id Perimeter Area Acres Hectares Shape_Leng
# 1       NA NA        NA   NA    NA       NA         NA

# Desired output:
# I should see which Gainesville latitude/longitude points fall within the
# shapefile polygon, so I can drop the outliers (the rows with NA). According
# to this output, none of my lat/lon points are inside the polygon.
The datafiles are here:
Shapefile: https://github.com/THsTestingGround/SO_readOGR_quest/tree/master/GIS_cgbound
reading csv file: https://github.com/THsTestingGround/SO_readOGR_quest/blob/master/311_Service_Requests__myGNV_.csv
I would appreciate if someone can help me out.
I realized that your point data is stored as WKT character strings, since the Location column holds values like POINT (-82.34323174 29.67058748). Hence, I reconstructed your data first. I assigned a projection here as well.
library(tidyverse)
library(sf)
library(RCurl)
url <- getURL("https://raw.githubusercontent.com/THsTestingGround/SO_readOGR_quest/master/311_Service_Requests__myGNV_.csv")
mydf <- read_csv(url) %>%
mutate(Location = gsub(x = Location, pattern = "POINT \\(|\\)", replacement = "")) %>%
separate(col = "Location", into = c("lon", "lat"), sep = " ") %>%
st_as_sf(coords = c(3,4)) %>%
st_set_crs(4326)
I imported your shapefile using the sf package, since your data (mydf in this demonstration) is an sf object. When I imported it, I realized that I had LINESTRING geometries, not polygons. I believe this is the reason why over() did not work. So here I created polygons; specifically, I joined all seven polygons together.
mypoly <- st_read("cgbound.shp") %>%
st_transform(crs = 4326) %>%
st_polygonize() %>%
st_union()
Let's check what your data points and the polygon look like. You clearly have data points sitting outside the polygon.
ggplot() +
geom_sf(data = mypoly) +
geom_point(data = mydf, aes(x = Longitude, y = Latitude))
You said, "I need some points which are outside of the polygon to be NA." So I decided to create a new column in mydf using st_intersects(). If a data point stays in the polygon, you see TRUE in the new column, check. Otherwise, you see FALSE.
mutate(mydf,
check = as.vector(st_intersects(x = mydf, y = mypoly, sparse = FALSE))) -> result
Finally, check how the data points were classified.
ggplot() +
geom_sf(data = mypoly) +
geom_point(data = result, aes(x = Longitude, y = Latitude, color = check))
If you want to mix over() with this sf approach, you can do the following.
mutate(mydf,
check = over(as(mydf, "Spatial"), as(mypoly, "Spatial")))
The last thing you want to do is subset the data:
filter(result, check == TRUE)
THE SIMPLEST WAY
I demonstrated above how things work with this sf approach, but the following is actually all you need. st_filter() extracts the data points lying within mypoly; in this case, points outside are removed. If you do not need to create NAs for those points, this is much easier.
st_filter(x = mydf, y = mypoly, predicate = st_intersects) -> result2
ggplot() +
geom_sf(data = mypoly) +
geom_sf(data = result2)
I have a data frame consisting of multiple data points with specific geocoordinates (latitude and longitude). I'm looking to create a choropleth-style world map where geographical regions are shaded according to how many data points fall within the boundaries of the region.
Is there a simple way to accomplish what I'm trying to do in R, preferably using the "maps" package's world map and the "ggplot2" map-plotting functions?
Here is a minimal reproducible example of what I have:
library(ggplot2)
library(maps)
data <- data.frame(lat = 40.730610, lon = -73.935242)
ggplot() +
geom_polygon(data = map_data("world"), aes(x = long, y = lat, group = group, fill = group)) +
coord_fixed(1.3)
I've noticed that the fill aesthetic can be used to create a choropleth effect. Here, fill is mapped inside aes() in geom_polygon(), producing a choropleth where each group is color-coded differently.
There are many ways to achieve this task. The general idea is to convert both the point data and the polygon data to spatial objects, then count how many points fall within each polygon. I know we can do this using the sp package, which is widespread and well known in the R community, but I decided to use the sf package because sf is the next-generation standard for spatial objects in R (https://cran.r-project.org/web/packages/sf/index.html). Knowing the usage and functionality of sf will probably be beneficial.
First, the OP provided an example point, but I decided to add more points so that we can see how to count the points and aggregate the data. To do so, I used the ggmap package to geocode some cities that I selected as an example.
# Load package
library(tidyverse)
library(ggmap)
library(maps)
library(maptools)
library(sf)
# Create point data as a data frame
point_data <- data.frame(lat = 40.730610, lon = -73.935242)
# Geocode a series of cities
city <- c("Detroit", "Seattle", "Toranto", "Denver", "Mexico City", "Paris", "New Orleans",
"Tokyo", "Osaka", "Beijing", "Canberra", "New York", "Istanbul", "New Delhi",
"London", "Taipei", "Seoul", "Manila", "Bangkok", "Lagos", "Chicago", "Shanghai")
point_data2 <- geocode(city)
# Combine OP's example and the geocoding result
point_data3 <- bind_rows(point_data, point_data2)
Next, I converted the point_data3 data frame to an sf object. I also got the polygon data of the world from the maps package and converted it to an sf object.
# Convert to simple feature object
point_sf <- st_as_sf(point_data3, coords = c("lon", "lat"), crs = 4326)
# Get world map data
worldmap <- maps::map("world", fill = TRUE, plot = FALSE)
# Convert world to sp class
IDs <- sapply(strsplit(worldmap$names, ":"), "[", 1L)
world_sp <- map2SpatialPolygons(worldmap, IDs = IDs,
proj4string = CRS("+proj=longlat +datum=WGS84"))
# Convert world_sp to simple feature object
world_sf <- st_as_sf(world_sp)
# Add country ID
world_sf <- world_sf %>%
  mutate(region = map_chr(1:length(world_sp@polygons), function(i){
    world_sp@polygons[[i]]@ID
  }))
Now both point_sf and world_sf are sf objects. We can use the st_within function to examine which points are within which polygons.
# Use st_within
result <- st_within(point_sf, world_sf, sparse = FALSE)
# Calculate the total count of each polygon
# Store the result as a new column "Count" in world_sf
world_sf <- world_sf %>%
mutate(Count = apply(result, 2, sum))
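A more compact equivalent, as a hedged aside: st_contains() with its default sparse output returns, for each polygon, the indices of the points it contains, and lengths() turns that straight into counts, avoiding the dense logical matrix:
world_sf <- world_sf %>%
  mutate(Count = lengths(st_contains(world_sf, point_sf)))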
The total count information is in the Count column of world_sf. We can get the world data frame as the OP did using the map_data function. We can then merge world_data and world_df.
# Convert world_sf to a data frame world_df
world_df <- world_sf
st_geometry(world_df) <- NULL
# Get world data frame
world_data <- map_data("world")
# Merge world_data and world_df
world_data2 <- world_data %>%
left_join(world_df, by = c("region"))
Now we are ready to plot the data. The following code is the same as the OP's ggplot code except that the input data is now world_data2 and fill = Count.
ggplot() +
geom_polygon(data = world_data2, aes(x = long, y = lat, group = group, fill = Count)) +
coord_fixed(1.3)
I've got a shapefile (SpatialLinesDataFrame) containing all streets in Cologne, which can be downloaded from here. I merged this @data with data from an external source. How can I plot these streets (if possible on a Google map using ggmap) so that every street has a different colour (or thickness), depending on its individual value?
So far, I have done this:
shapefile <- readOGR(shapfile, "Strasse", stringsAsFactors=FALSE,
encoding="latin-9")
shp <- spTransform(shapefile, CRS("+proj=longlat +datum=WGS84"))
At this point I add another column to the shp@data data frame, which contains a certain value for each street. Then I fortify the shapefile so it can be plotted using ggplot:
shp$id <- rownames(shp@data)
shp.df <- as.data.frame(shp)
data_fort <- fortify(shp, region = "id")
data_merged <- join(data_fort, shp.df, by="id")
When I use geom_line, the lines do not look good and are not easy to identify:
ggplot(data_merged, aes(x=long, y=lat,
group=group,
colour=values)) +
geom_line()
Here I saw that one could transform the shapefile so that geom_segment (or in this case a modified function "geom_segment2") can be used, but then I would lose the street-specific values.
So this code grabs the 100 longest roads from your shapefile, randomly assigns "values" from 1 to 10, and plots them with color based on value, on top of a Google raster image of Cologne.
library(ggplot2)
library(ggmap) # for ggmap(...) and get_map(...)
library(rgdal) # for readOGR(...)
library(plyr) # for join(...)
set.seed(1) # for reproducible example
setwd(" <directory with your shapefiles> ")
spl <- readOGR(dsn=".", "Strasse", encoding="latin-9")
spl <- spl[spl$SHAPE_LEN %in% tail(sort(spl$SHAPE_LEN),100),]
shp <- spTransform(spl, CRS("+proj=longlat +datum=WGS84"))
shp.df <- data.frame(id=rownames(shp@data),
                     values=sample(1:10,length(shp),replace=T),
                     shp@data, stringsAsFactors=F)
data_fort <- fortify(shp)
data_merged <- join(data_fort, shp.df, by="id")
ggmap(get_map(unlist(geocode("Cologne")),zoom=11))+
geom_path(data=data_merged,size=1,
aes(x=long,y=lat,group=group,color=factor(values)))+
labs(x="",y="")+
theme(axis.text=element_blank(),axis.ticks=element_blank())
It is possible to make the ggmap(...) call simpler using, e.g.,
ggmap(get_map("Cologne"))
but there's a problem: the zoom=... argument is interpreted differently and I wasn't able to zoom the map sufficiently.
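One hedged workaround for the zoom problem is to frame the map from the data itself: ggmap::make_bbox() builds a bounding box from the long/lat columns, and get_map() accepts that bounding box as its location (a sketch, untested against these exact data):
bb <- make_bbox(lon = long, lat = lat, data = data_merged)
ggmap(get_map(location = bb)) +
  geom_path(data = data_merged, size = 1,
            aes(x = long, y = lat, group = group, color = factor(values)))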