I have a shapefile of the road network of the UK. I also have the lat-longs of a GPS tracking device attached to a vehicle. I can convert the GPS lat-longs to a SpatialLine for each trip. There are, however, some erroneous data points from the GPS tracking device. Each consecutive lat-long pair is about 2 minutes apart. When plotting the lat-longs on a map, it is quite easy to spot the "erroneous" points visually. Is there a way to do this programmatically by assessing the intersection of the created spatial line with the road network shapefile? That is, is there a function in any of the R packages that can assess this, given that we know the points should roughly follow the shape of the road?
I know of gIntersects, but I am not just looking for a true/false result; I am trying to work out which points in a series of points are "erroneous".
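If it helps, here is a minimal sketch of one way to flag suspect fixes in R: measure the distance from every GPS point to the nearest road and flag anything beyond a tolerance. The file names, the column names, and the 50 m tolerance are all assumptions; it uses the same sp/rgeos/rgdal stack mentioned above.

library(rgdal)   # readOGR, spTransform
library(rgeos)   # gDistance
library(sp)

roads <- readOGR("uk_roads.shp")                       # road network (hypothetical file)
gps   <- read.csv("trip.csv")                          # assumed columns: lon, lat
pts   <- SpatialPoints(gps[, c("lon", "lat")],
                       proj4string = CRS("+proj=longlat +datum=WGS84"))
pts   <- spTransform(pts, CRS(proj4string(roads)))     # match the road network's CRS

# minimum distance from each GPS fix to the nearest road segment
d <- apply(gDistance(pts, roads, byid = TRUE), 2, min)

# flag fixes further from any road than a chosen tolerance
# (units follow the CRS; with a projected CRS such as OSGB this is metres)
tolerance <- 50
gps[d > tolerance, ]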
Hopefully the following makes sense and apologies if not!
I have a dataset of GPS locations (of various species' footprints), and I am measuring the distance from where each point was found to the boundary of a national park in R. I'm doing this with a series of environmental factors (roads/villages/lakes etc.), and for all the other environmental variables (and KML files) I've had no problems. However, when I run my park boundary data (using a KML file read in as a spatial polygon of the national park), I keep getting zero values for all the GPS points that fall within the park boundary (so anything inside the polygon, essentially). The output shows the correct measurements for GPS points that occur OUTSIDE the boundary (or polygon), but anything inside the park/polygon comes out as zero. I've tried to reproject the polygon as just an outline and tried removing the 'fill' etc., along with a few other tricks I've found, but no luck so far.
Am I correct in assuming the data is just not there and that I need to recreate the border of the park some other way? Or is it more an issue with how I'm asking R to calculate the distance measurements?
Below is an example of the code I am using
library(rgdal)   # readOGR
library(sp)      # SpatialPoints
library(rgeos)   # gDistance

KSNP_Poly  <- readOGR("KSNPboundaryexport.kml")
Points     <- read.csv("AllPoints.csv")
sptsPoints <- SpatialPoints(Points)

plot(KSNP_Poly)
plot(sptsPoints, add = TRUE)

# minimum distance from each point to the park polygon (one value per point)
KSNPResults <- apply(gDistance(sptsPoints, KSNP_Poly, byid = TRUE), 2, min)
Hope this made sense and any advice greatly appreciated!
Thanks!
Kass
Visual example of the data
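For what it is worth, one thing that often explains the zero values: gDistance() against a polygon measures distance to the whole area, which really is zero for interior points. Measuring against the boundary converted to lines gives a distance-to-edge for points inside the park as well. A minimal sketch, reusing the objects from the code above (the coercion to SpatialLines is the only new step):

# distance to the park edge rather than to the polygon area,
# so points inside the park also get a distance to the boundary
KSNP_Line   <- as(KSNP_Poly, "SpatialLines")
KSNPResults <- apply(gDistance(sptsPoints, KSNP_Line, byid = TRUE), 2, min)

If interior points need to come out negative instead of positive, you could flip the sign for the points where gContains() reports they are inside the polygon.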
I used a drone to create a DOF of a small area. During the flight, it takes a photo every 20 seconds or so (roughly every 40 meters of flight). I have created a CSV file, which I converted to a point shapefile. In total, I flew 10 so-called "missions" with the drone, each with 100-200 points that are "shaped" as squares on the map. What I want now is to create a polygon shapefile from the point shapefile.
Because those points sometimes overlap, I cannot use the "Aggregate Points" task, as it is only distance-based. I want to make the polygons automatically, using some kind of script. What could help is the fact that the maximum time between two points (i.e. photos taken) is 10-20 seconds, so if the time gap is over 3 minutes, it's another "mission". Can you help with such a script, one that would quickly and automatically create as many polygons as there are missions?
Okay, I think I understand what you are trying to accomplish. Since no one has replied, I am going to give it a quick shot so you have something to try.
I think the best strategy would be the following:
Clustering algorithm: Try running a clustering algorithm such as DBSCAN on the timestamp dimension to classify the points into time-based groups rather than distance-based ones (since, as you said, distance-based separation is not enough to properly identify and separate the points). Afterwards, all the points should be assigned to groups via a group-id column. The maximum-distance parameter of the algorithm should be around 20 seconds, or even a minute, since you said the missions are separated by at least about 3 minutes.
Feature-based points to polygon: Then run your generic Polygon_from_points(...) function that transforms the clustered points into polygon shapes based on a specific discriminating feature (which in your case is going to be the group id).
How does this work?: This would properly separate the groups first (time-based), and then you should be able to find a generic points-to-polygon tool driven by a feature (ArcGIS should have one).
I don't have an example dataset, nor any code written, but based on what you described I think it would work. A rough sketch of the two steps in R is below; hope it helps.
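If you happen to be working in R rather than ArcGIS, a minimal sketch of those two steps could look like this. The file name, the timestamp column, and the eps value of 60 seconds are all assumptions; it uses the dbscan and sf packages and builds one convex hull per time cluster.

library(dbscan)
library(sf)
library(dplyr)

pts <- st_read("drone_points.shp")                  # hypothetical point shapefile
pts$timestamp <- as.POSIXct(pts$timestamp)          # assumed timestamp column

# step 1: cluster on time only -- photos < 60 s apart join the same cluster,
# so gaps of 3+ minutes between missions start a new cluster
secs <- as.numeric(pts$timestamp)
cl   <- dbscan(matrix(secs, ncol = 1), eps = 60, minPts = 3)
pts$mission <- cl$cluster

# step 2: one polygon per mission (convex hull of that mission's points)
hulls <- pts %>%
  group_by(mission) %>%
  summarise() %>%
  st_convex_hull()

st_write(hulls, "missions.shp")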
In R, I am trying to create a choropleth map. I have built a database of businesses, some are part of chains (e.g. McDonalds) and others are independent. I want to calculate how many businesses are within 30km of each point on the map, but treat the different locations of chains as a single business.
For example, if a point is:
5km from a McDonalds,
10km from Taco Bell
15km from Chick-Fil-A
20km from KFC
25km from McDonalds
35km from Five Guys
The colour will show that there are 4 fast food outlets within 30km (the two McDonalds count once, and Five Guys is out of range).
I am happy to use any R package but I am mostly familiar with tmaps and ggplot2 maps.
At this stage the best approach I can think of is to create polygons for each chain and stack them as transparent layers of the same colour. I don't think this would be very efficient, and it wouldn't create a very nice-looking choropleth.
The other answers I could find were either counting points (e.g https://gis.stackexchange.com/questions/229066/counting-how-many-times-a-point-is-inside-a-set-of-intersecting-polygons-in-r) or for GIS software.
EDIT:
I have managed to create a 30km radius from every location of every chain (using rgeos gIntersection). I now have a series of polygons.
To solve my question the additional thing I need to do is create polygons for where:
Only one polygon covers the area,
Two polygons cover the area,
etc.
To try to visualise this, I used the answer from https://gis.stackexchange.com/questions/229066/counting-how-many-times-a-point-is-inside-a-set-of-intersecting-polygons-in-r
In the linked question they are trying to count how many polygons cover the numbered points (the image on the right). What I am trying to do is create the image on the left, where there are polygons of no overlap (1), two overlapping polygons (2), and so on.
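For the overlap-count polygons specifically, sf can produce exactly the pieces in that left-hand image: a self-intersection of the buffer layer returns every distinct piece together with an n.overlaps column. A rough sketch, assuming an sf point layer with a 'chain' column; the file name and EPSG code are placeholders.

library(sf)
library(dplyr)

chains <- st_read("outlets.shp")                 # hypothetical point layer
chains <- st_transform(chains, 2193)             # NZTM here; any metre-based CRS works

# one 30 km buffer per chain: all branches of a chain dissolve together first,
# so each chain can only ever count once
buffers <- chains %>%
  group_by(chain) %>%
  summarise() %>%
  st_buffer(30000)

# self-intersection: every distinct overlap piece, with n.overlaps telling you
# how many chain buffers cover that piece -- the value to map
pieces <- st_intersection(buffers)
plot(pieces["n.overlaps"])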
I think what you are trying to accomplish would be best approached using a raster rather than a choropleth. To make a choropleth, you define a set of (generally irregular) polygons, summarize something within each polygon, then color the polygons based on the attributes. This would be a good approach if you wanted to say how many fast food restaurants are within each state or county, or how many fast food joints per capita by state.
From your description, however, you are looking for how many fast food joints are within a set radius of every point. This is more of a raster question, since you can represent your data on a regular grid.
The raster package is a good start for working with raster data and works well with the sf package.
You need to decide what density you need to accomplish your goal, then use this to set the resolution of your raster. Once you've got that you can use raster::rasterize() to summarize your (I'm assuming) point data.
I'm assuming you have an object that has the locations of each restaurant, I'll call this object "points".
library(raster)
library(sf)
# create a raster template with 30km resolution (I'm assuming your projection is in meters)
raster_template = raster(extent(points),
                         resolution = 30000,
                         crs = st_crs(points)$proj4string)

# rasterize your point data, counting the points that fall in each cell
# (older versions of raster may need Spatial objects: as(points, "Spatial"))
r = rasterize(points, raster_template, fun = "count")
This should create a grid where each cell holds the number of points within that 30km cell. You should then be able to plot the raster, but you may want to either clip or mask it to show just the parts that are within New Zealand.
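A quick way to check and tidy the result; the boundary object here is a hypothetical polygon layer you would supply.

plot(r)                                    # counts per 30 km cell
# boundary <- st_read("nz_boundary.shp")   # hypothetical country/region outline
# r <- mask(r, as(boundary, "Spatial"))    # keep only cells inside the boundary
# plot(r)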
I have a list of GPS coordinates and I want to see if they fall within a certain range of GPS coordinates. My list has about 6000 points, but I have 44000 ranges. These GPS coordinates are based on street addresses, so I was thinking of narrowing the ranges by street name first and then checking whether the coordinates fall within that subset of ranges, which are street blocks. Otherwise I would have 6000 x 44000 searches, and that would take forever. Does anyone have any idea what would be the most efficient way to do this? I'm completely new to R and coding in general, so I have no idea where to start.
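One way to avoid the full 6000 x 44000 comparison in R is a keyed non-equi join, which matches on street name first and then on the coordinate range. A minimal sketch with the data.table package; all file and column names are assumptions.

library(data.table)

points <- fread("points.csv")   # assumed columns: street, lat, lon
ranges <- fread("ranges.csv")   # assumed columns: street, lat_min, lat_max, lon_min, lon_max

# join on street name, then keep only the blocks whose coordinate range
# contains the point; points with no matching block are dropped
hits <- ranges[points,
               on = .(street,
                      lat_min <= lat, lat_max >= lat,
                      lon_min <= lon, lon_max >= lon),
               nomatch = NULL]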
I have a point shapefile of station IDs and stage heights. I would like to create a raster where each cell has the stage height value (in meters) of the in situ station closest to that cell.
I want this raster to match up with another raster, so I would like to be able to input both a raster I have created (dataset 3 described below) and my point shapefile (dataset 1).
Datasets:
1) Point Shapefile with stage heights of a river delta
2) Shapefile of the river delta extent
3) Raster of the delta where NAs represent land (they could also be zeros if need be) and 1s are water. Two versions: 10 meter resolution and 30 meter resolution.
One conceptual issue I am having is with the amount of small streams I have.
For example (pictured in the image below), station 1 (circled in blue) is technically closer to the black X region than station 2 (circled in red), but the stage height value in red is more representative of point X. There are NAs in between the two streams; does that mean the value will not jump across streams?
How can I reassign the values in my raster (all the 1s) to the stage height of the nearest station, and make sure that these values do not jump from stream to stream? Do I need to use a least cost path? What is the best way to do this?
I would like to use R, but can use ArcMap if I must.
I'm not sure what tools you have available to you, but I think this answer may be useful:
Calculating attribute for network distance between multiple points in ArcGIS Desktop?
Here the asker was looking to calculate distances along roads to some points, but your problem seems similar. The main point I would make is that you should do your network-distance classification before worrying about the raster layer. You may have to convert from polygons to lines, or use some other workaround to get your data into a format that works, but this is the kind of job the tool is designed for.
After you have reclassified your river shapefile based on its network distance to a given station, convert the polygons to raster and use this to classify your original raster. You could do this in R or ArcMap; ArcMap will probably be faster. If you want to stay in R, a rough sketch is below.
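For the R route, one possible sketch uses raster::gridDistance(), which measures distance only through traversable cells, so values cannot jump across land between streams. The file names, the stage-height column name ('stage_m'), and the assumption that every station falls on a water cell are all hypothetical.

library(raster)
library(rgdal)
library(sp)

water    <- raster("delta_water.tif")     # 1 = water, NA = land (hypothetical file)
stations <- readOGR("stations.shp")       # station points with a 'stage_m' field

# turn land into a barrier value that gridDistance() can be told not to cross
water[is.na(water)] <- 0

# for each station, distance through water only (land cells, value 0, are omitted)
dist_stack <- stack(lapply(seq_len(length(stations)), function(i) {
  r <- water
  cell <- cellFromXY(r, coordinates(stations)[i, , drop = FALSE])
  r[cell] <- 2                            # mark this station's cell as the origin
  gridDistance(r, origin = 2, omit = 0)
}))

# nearest station (by in-water distance) for every cell, then its stage height
nearest <- which.min(dist_stack)
stage   <- subs(nearest, data.frame(id = seq_len(length(stations)),
                                    v  = stations$stage_m))
stage   <- mask(stage, water, maskvalue = 0)   # drop the land cells again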