Related
Problem statement:
With ggquiver::geom_quiver() we can plot vector fields, provided we know x, y, xend, and yend.
How can I calculate these parameters for an arbitrary RasterLayer of elevations?
How can I ensure that the size of these arrows indicates the slope for that particular vector such that the arrows appear different lengths proportional to the gradient at that location (e.g., the first plot below)?
Background:
# ggquiver example
library(tidyverse)
library(ggquiver)
expand.grid(x=seq(0,pi,pi/12), y=seq(0,pi,pi/12)) %>%
ggplot(aes(x=x,y=y,u=cos(x),v=sin(y))) +
geom_quiver()
A related appraoch uses rasterVis::vectorplot, which relies on raster::terrain (provided the field units == CRS units) to calculate and plot a vector field. Source code is here.
library(raster)
library(rasterVis)
r <- getData('alt', country='FRA', mask=TRUE)
r <- aggregate(r, 20)
vectorplot(r, par.settings=RdBuTheme())
Conclusion:
To review, I'd like to take an arbitrary rasterLayer of elevation, convert it to a data.frame, calculate the x, y, xmax, and ymax components of an elevation vector field that size the arrows such that they show the relative slope at the point (as in plots 1 and 2 above), and plot with ggquiver. Something like:
names(r) <- "z"
rd <- as.data.frame(r, xy=TRUE)
# calculate x, y, xend, yend for gradient vectors, add to rd, then plot
ggplot(rd) +
geom_raster(aes(x, y, fill = z)) +
geom_quiver(aes(x, y, xend, yend))
Effectively what you're asking is to convert a 2D scalar field into a vector field. There are a few different ways to do this.
The raster package contains the function terrain, which creates new raster layers that will give you both the angle of your desired vector at each point (i.e. the aspect), and its magnitude (the slope). We can use a little trigonometry to convert these into the North-South and East-West basis vectors used by ggquiver and add them to our original raster before turning the whole thing into a data frame.*
terrain_raster <- terrain(r, opt = c('slope', 'aspect'))
r$u <- terrain_raster$slope[] * sin(terrain_raster$aspect[])
r$v <- terr$slope[] * cos(terr$aspect[])
rd <- as.data.frame(r, xy = TRUE)
However, in most cases this will not make for a good plot. If you don't first aggregate the raster, you will have one gradient for each pixel on the image, which won't plot well. On the other hand, if you do aggregate, you will have a nice vector field, but your raster will look "blocky". Therefore, having a single data frame for your plot is probably not the best way to go.
The following function will take a raster and plot it with an overlaid vector field. You can adjust how much the vector field is aggregated without affecting the raster, and you can specify an arbitrary vector of colours for your raster.
raster2quiver <- function(rast, aggregate = 50, colours = terrain.colors(6))
{
names(rast) <- "z"
quiv <- aggregate(rast, aggregate)
terr <- terrain(quiv, opt = c('slope', 'aspect'))
quiv$u <- terr$slope[] * sin(terr$aspect[])
quiv$v <- terr$slope[] * cos(terr$aspect[])
quiv_df <- as.data.frame(quiv, xy = TRUE)
rast_df <- as.data.frame(rast, xy = TRUE)
print(ggplot(mapping = aes(x = x, y = y, fill = z)) +
geom_raster(data = rast_df, na.rm = TRUE) +
geom_quiver(data = quiv_df, aes(u = u, v = v), vecsize = 1.5) +
scale_fill_gradientn(colours = colours, na.value = "transparent") +
theme_bw())
return(quiv_df)
}
So, trying it out on your France example, after first defining a similar colour palette, we get
pal <- c("#B2182B", "#E68469", "#D9E9F1", "#ACD2E5", "#539DC8", "#3C8ABE", "#2E78B5")
raster2quiver(getData('alt', country = 'FRA', mask = TRUE), colours = pal)
Now to show that it works on an arbitrary raster (provided it has a projection assigned) let's test it out on this image, as converted to a raster. This time, we have a lower resolution so we choose a smaller aggregate value. We'll also choose a transparent colour for the lowest values to give a nicer plot:
rast <- raster::raster("https://i.stack.imgur.com/tXUXO.png")
# Add a fake arbitrary projection otherwise "terrain()" doesn't work:
projection(rast) <- "+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +ellps=WGS84"
raster2quiver(rast, aggregate = 20, colours = c("#FFFFFF00", "red"))
* I should point out that geom_quiver's mapping aesthetic takes arguments called u and v, which represent the basis vectors pointing North and East. The ggquiver package converts them to xend and yend values using stat_quiver. If you prefer to use xend and yend values you could just use geom_segment to plot your vector field, but this makes it more complex to control the appearance of the arrows. Hence, this solution will find the magnitude of the u and v values instead.
After searching around a lot, asking, and doing some code, I kinda got the bare minimum for doing kriging in R's gstat.
Using 4 points (I know, totally bad), I kriged the unsampled points located between them. But in actuality, I don't need all of those points. Inside that area, there is a smaller subarea... this area is the one I actually need.
Long story short.. I have measurements taken from 4 weather stations that report rainfall data. The lat and long coordinates for these points are:
lat long
7.16 124.21
8.6 123.35
8.43 124.28
8.15 125.08
My road to kriging can be seen through my previous questions on StackOverflow.
This: Create variogram in R's gstat package
And this: Create Grid in R for kriging in gstat
I know that the image in has the coordinates (at least according to my estimates):
Leftmost: 124 13ish 0 E(DMS)
Rightmost : 124 20ish 0 E
Topmost corrdinates: 124 17ish 0 E
Bottommost coordinates: 124 16ish 0 E
Conversion will take place for that but that doesn't matter I think, or easier to deal with later.
The image is also irregular (but aren't they all though).
Think of it like a doughnut, you krige the the whole circular shape of the doughnut but you only need the area covered by the hole so you remove or at least disregard the values you got from the doughnut itself.
I have an image (.jpg) of the area in question, I will have to convert the image into a shapefile or some other vector format using QGIS or similar software. After that, I will have to insert that vector image inside the 4 point kriged area, so I know which coordinates to actually consider and which ones to remove.
Finally, I take the values of the area covered by the image and store them into a csv or database.
Anybody know how I can start with this? Total noob at R and statistics. Thanks to anyone who responds.
I just want to know if its possible and if it is provide some tips. Thanks again.
Might as well also post my script:
suppressPackageStartupMessages({
library(sp)
library(gstat)
library(RPostgreSQL)
library(dplyr) # for "glimpse"
library(ggplot2)
library(scales) # for "comma"
library(magrittr)
library(gridExtra)
library(rgdal)
library(raster)
library(leaflet)
library(mapview)
})
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="Rainfall Data", host="localhost", port=5432,
user="postgres", password="postgres")
day_1 <- dbGetQuery(con, "SELECT lat, long, rainfall FROM cotabato.sample")
coordinates(day_1) <- ~ lat + long
plot(day_1)
x.range <- as.integer(c(7.0,9.0))
y.range <- as.integer(c(123.0,126.0))
grid <- expand.grid(x=seq(from=x.range[1], to=x.range[2], by=0.05),
y=seq(from=y.range[1], to=y.range[2], by=0.05))
coordinates(grid) <- ~x+y
plot(grid, cex=1.5)
points(day_1, col='red')
title("Interpolation Grid and Sample Points")
day_1.vgm <- variogram(rainfall~1, day_1, width = 0.02, cutoff = 1.8)
day_1.fit <- fit.variogram(day_1.vgm, model=vgm("Sph", psill = 8000, range = 1))
plot(day_1.vgm, day_1.fit)
plot1 <- day_1 %>% as.data.frame %>%
ggplot(aes(lat, long)) + geom_point(size=1) + coord_equal() +
ggtitle("Points with measurements")
plot(plot1)
############################
plot2 <- grid %>% as.data.frame %>%
ggplot(aes(x, y)) + geom_point(size=1) + coord_equal() +
ggtitle("Points at which to estimate")
plot(plot2)
grid.arrange(plot1, plot2, ncol = 2)
coordinates(grid) <- ~ x + y
############################
day_1.kriged <- krige(rainfall~1, day_1, grid, model=day_1.fit)
day_1.kriged %>% as.data.frame %>%
ggplot(aes(x=x, y=y)) + geom_tile(aes(fill=var1.pred)) + coord_equal() +
scale_fill_gradient(low = "yellow", high="red") +
scale_x_continuous(labels=comma) + scale_y_continuous(labels=comma) +
theme_bw()
write.csv(day_1.kriged, file = "Day_1.csv")
EDIT: The code has changed since the last time. But that doesn't matter I guess, I just want to know if its possible and can anybody provide the simplest example of it being possible. I can derive the solution to the example to my own problem from there.
Let me know if you find this useful:
"Think of it like a doughnut, you krige the the whole circular shape of the doughnut but you only need the area covered by the hole so you remove or at least disregard the values you got from the doughnut itself."
For this you load your vectorial data:
donut <- rgdal::readOGR('/variogram', 'donut')
day_1#proj4string#projargs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0" # Becouse donut shape have this CRS
plot(donut, axes = TRUE, col = 3)
plot(day_1, col = 2, pch = 20, add = TRUE)
Then you delete the 'external ring' and keep the insider. Also indicates that the second isn't a hole anymore:
hole <- donut # for keep original shape
hole#polygons[1][[1]]#Polygons[1] <- NULL
hole#polygons[1][[1]]#Polygons[1][[1]]#hole <- FALSE
plot(hole, axes = TRUE, col = 4, add = TRUE)
After that you chek whicch points are inside 'hole' new blue vector layer:
over.pts <- over(day_1, hole)
day_1_subset <- day_1[!is.na(over.pts$Id), ]
plot(donut, axes = TRUE, col = 3)
plot(hole, col = 4, add = TRUE)
plot(day_1, col = 2, pch = 20, add = TRUE)
plot(day_1_subset, col = 'white', pch = 1, cex = 2, add = TRUE)
write.csv(day_1_subset#data, 'myfile.csv') # write intersected points table
write.csv(as.data.frame(coordinates(day_1_subset)), 'myfile.csv') # write intersected points coords
writeOGR(day_1_subset, 'path', 'mysubsetlayer', driver = 'ESRI Shapefile') # write intersected points shape
With this code you can solve the 'ring' or doughnut 'hole' if you already have the shapefile.
If you have an image and want to clip it try the follow:
In the case you load a raster (get basemap image from web):
coordDf <- as.data.frame(coordinates(day_1)) # get basemap from points
# coordDf <- data.frame(hole#polygons[1][[1]]#Polygons[1][[1]]#coords) # get basemap from hole
colnames(coordDf) <- c('x', 'y')
imag <- dismo::gmap(coordDf, lonlat = TRUE)
myimag <- raster::crop(day_1.kriged, hole)
plot(myimag)
plot(day_1, add = TRUE, col = 2)
In case you use day_1.kriged:
myCropKrig<- raster::crop(day_1.kriged, hole)
myCropKrig %>% as.data.frame %>%
ggplot(aes(x=x, y=y)) + geom_tile(aes(fill=var1.pred)) + coord_equal() +
scale_fill_gradient(low = "yellow", high="red") +
scale_x_continuous(labels=comma) + scale_y_continuous(labels=comma) +
geom_point(data=coordDf[!is.na(over.pts$Id), ], aes(x=x, y=y), color="blue", size=3, shape=20) +
theme_bw()
And "Finally, I take the values of the area covered by the image and store them into a csv or database."
write.csv(as.data.frame(myCropKrig), 'myCropKrig.csv')
Hope you find this useful and I respond your meaning
To simplify your question:
You want to delineate an area based on an image that is not georeferenced.
You want to extract results of a interpolation only for this area
Few steps are required
You need to use QGIS to georeference your image (Raster > Georeferencer). You need to have a georeferenced map in background to help. This creates a raster object with spatial information.
Two possibilities.
2.a. The central part of your image has a color than can be directly used as a mask in R (Ex. All green pixels in middle of red pixels).
2.b. If not, you need to use QGIS to delineate manually a Polygon of the area (Layer > Create Layer > New Shapefile > Polygon)
Import your raster or polygon shapefile in R
Use function raster::mask to extract values of your interpolation using the raster image or the SpatialPolygon.
I have data in R with overlapping points.
x = c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y = c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
plot(x,y)
How can I plot these points so that the points that are overlapped are proportionally larger than the points that are not. For example, if 3 points lie at (4,5), then the dot at position (4,5) should be three times as large as a dot with only one point.
Here's one way using ggplot2:
x = c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y = c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
df <- data.frame(x = x,y = y)
ggplot(data = df,aes(x = x,y = y)) + stat_sum()
By default, stat_sum uses the proportion of instances. You can use raw counts instead by doing something like:
ggplot(data = df,aes(x = x,y = y)) + stat_sum(aes(size = ..n..))
Here's a simpler (I think) solution:
x <- c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y <- c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
size <- sapply(1:length(x), function(i) { sum(x==x[i] & y==y[i]) })
plot(x,y, cex=size)
## Tabulate the number of occurrences of each cooordinate
df <- data.frame(x, y)
df2 <- cbind(unique(df), value = with(df, tapply(x, paste(x,y), length)))
## Use cex to set point size to some function of coordinate count
## (By using sqrt(value), the _area_ of each point will be proportional
## to the number of observations it represents)
plot(y ~ x, cex = sqrt(value), data = df2, pch = 16)
You didn't really ask for this approach but alpha may be another way to address this:
library(ggplot2)
ggplot(data.frame(x=x, y=y), aes(x, y)) + geom_point(alpha=.3, size = 3)
You need to add the parameter cex to your plot function. First what I would do is use the function as.data.frame and table to reduce your data to unique (x,y) pairs and their frequencies:
new.data = as.data.frame(table(x,y))
new.data = new.data[new.data$Freq != 0,] # Remove points with zero frequency
The only downside to this is that it converts numeric data to factors. So convert back to numeric, and plot!
plot(as.numeric(new.data$x), as.numeric(new.data$y), cex = as.numeric(new.data$Freq))
You may also want to try sunflowerplot.
sunflowerplot(x,y)
Let me propose alternatives to adjusting the size of the points. One of the drawbacks of using size (radius? area?) is that the reader's evaluation of spot size vs. the underlying numeric value is subjective.
So, option 1: plot each point with transparency --- ninja'd by Tyler!
option 2: use jitter to push your data around slightly so the plotted points don't overlap.
A solution using lattice and table ( similar to #R_User but no need to remove 0 since lattice do the job)
dt <- as.data.frame(table(x,y))
xyplot(dt$y~dt$x, cex = dt$Freq^2, col =dt$Freq)
I am relatively new to ggplot, so please forgive me if some of my problems are really simple or not solvable at all.
What I am trying to do is generate a "Heat Map" of a country where the filling of the shape is continous. Furthermore I have the shape of the country as .RData. I used hadley wickham's script to transform my SpatialPolygon data into a data frame. The long and lat data of my data frame now looks like this
head(my_df)
long lat group
6.527187 51.87055 0.1
6.531768 51.87206 0.1
6.541202 51.87656 0.1
6.553331 51.88271 0.1
This long/lat data draws the outline of Germany. The rest of the data frame is omitted here since I think it is not needed. I also have a second data frame of values for certain long/lat points. This looks like this
my_fixed_points
long lat value
12.817 48.917 0.04
8.533 52.017 0.034
8.683 50.117 0.02
7.217 49.483 0.0542
What I would like to do now, is colour each point of the map according to an average value over all the fixed points that lie within a certain distance of that point. That way I would get a (almost)continous colouring of the whole map of the country.
What I have so far is the map of the country plotted with ggplot2
ggplot(my_df,aes(long,lat)) + geom_polygon(aes(group=group), fill="white") +
geom_path(color="white",aes(group=group)) + coord_equal()
My first Idea was to generate points that lie within the map that has been drawn and then calculate the value for every generated point my_generated_point like so
value_vector <- subset(my_fixed_points,
spDistsN1(cbind(my_fixed_points$long, my_fixed_points$lat),
c(my_generated_point$long, my_generated_point$lat), longlat=TRUE) < 50,
select = value)
point_value <- mean(value_vector)
I havent found a way to generate these points though. And as with the whole problem, I dont even know if it is possible to solve this way. My question now is if there exists a way to generate these points and/or if there is another way to come to a solution.
Solution
Thanks to Paul I almost got what I wanted. Here is an example with sample data for the Netherlands.
library(ggplot2)
library(sp)
library(automap)
library(rgdal)
library(scales)
#get the spatial data for the Netherlands
con <- url("http://gadm.org/data/rda/NLD_adm0.RData")
print(load(con))
close(con)
#transform them into the right format for autoKrige
gadm_t <- spTransform(gadm, CRS=CRS("+proj=merc +ellps=WGS84"))
#generate some random values that serve as fixed points
value_points <- spsample(gadm_t, type="stratified", n = 200)
values <- data.frame(value = rnorm(dim(coordinates(value_points))[1], 0 ,1))
value_df <- SpatialPointsDataFrame(value_points, values)
#generate a grid that can be estimated from the fixed points
grd = spsample(gadm_t, type = "regular", n = 4000)
kr <- autoKrige(value~1, value_df, grd)
dat = as.data.frame(kr$krige_output)
#draw the generated grid with the underlying map
ggplot(gadm_t,aes(long,lat)) + geom_polygon(aes(group=group), fill="white") + geom_path(color="white",aes(group=group)) + coord_equal() +
geom_tile(aes(x = x1, y = x2, fill = var1.pred), data = dat) + scale_fill_continuous(low = "white", high = muted("orange"), name = "value")
I think what you want is something along these lines. I predict that this homebrew is going to be terribly inefficient for large datasets, but it works on a small example dataset. I would look into kernel densities and maybe the raster package. But maybe this suits you well...
The following snippet of code calculates the mean value of cadmium concentration of a grid of points overlaying the original point dataset. Only points closer than 1000 m are considered.
library(sp)
library(ggplot2)
loadMeuse()
# Generate a grid to sample on
bb = bbox(meuse)
grd = spsample(meuse, type = "regular", n = 4000)
# Come up with mean cadmium value
# of all points < 1000m.
mn_value = sapply(1:length(grd), function(pt) {
d = spDistsN1(meuse, grd[pt,])
return(mean(meuse[d < 1000,]$cadmium))
})
# Make a new object
dat = data.frame(coordinates(grd), mn_value)
ggplot(aes(x = x1, y = x2, fill = mn_value), data = dat) +
geom_tile() +
scale_fill_continuous(low = "white", high = muted("blue")) +
coord_equal()
which leads to the following image:
An alternative approach is to use an interpolation algorithm. One example is kriging. This is quite easy using the automap package (spot the self promotion :), I wrote the package):
library(automap)
kr = autoKrige(cadmium~1, meuse, meuse.grid)
dat = as.data.frame(kr$krige_output)
ggplot(aes(x = x, y = y, fill = var1.pred), data = dat) +
geom_tile() +
scale_fill_continuous(low = "white", high = muted("blue")) +
coord_equal()
which leads to the following image:
However, without knowledge as to what your goal is with this map, it is hard for me to see what you want exactly.
This slideshow offers another approach--see page 18 for a description of the approach and page 21 for a view of what the results looked like for the slide-maker.
Note however that the slide-maker used the sp package and the spplot function rather than ggplot2 and its plotting functions.
I'm plotting some points on a map of the world using the R maps package, something like:
The command to draw the base map is:
map("world", fill=TRUE, col="white", bg="gray", ylim=c(-60, 90), mar=c(0,0,0,0))
But I need to display Pacific centred map. I use map("world2", etc to use the Pacific centred basemap from the maps package, and convert the coordinates of the data points in my dataframe (df) with:
df$longitude[df$longitude < 0] = df$longitude[df$longitude < 0] + 360
This works if I don't use the fill option, but with fill the polygons which cross 0° cause problems.
I guess I need to transform the polygon data from the maps library somehow to sort this out, but I have no idea how to get at this.
My ideal solution would be to draw a maps with a left boundary at -20° and a right boundary at -30° (i.e. 330°). The following gets the correct points and coastlines onto the map, but the crossing-zero problem is the same
df$longitude[df$longitude < -20] = df$longitude[d$longitude < -20] + 360
map("world", fill=TRUE, col="white", bg="gray", mar=c(0,0,0,0),
ylim=c(-60, 90), xlim=c(-20, 330))
map("world2", add=TRUE, col="white", bg="gray", fill=TRUE, xlim=c(180, 330))
Any help would be greatly appreciated.
You could use the fact that internally, a map object returned by the map() function can be recalculated and used again in the map() function. I'd create a list with individual polygons, check which ones have very different longitude values, and rearrange those ones. I gave an example of this approach in the function below*, which allows something like :
plot.map("world", center=180, col="white",bg="gray",
fill=TRUE,ylim=c(-60,90),mar=c(0,0,0,0))
to get
If I were you, I'd shift everything a bit more, like in :
plot.map("world", center=200, col="white",bg="gray",
fill=TRUE,ylim=c(-60,90),mar=c(0,0,0,0))
The function :
plot.map<- function(database,center,...){
Obj <- map(database,...,plot=F)
coord <- cbind(Obj[[1]],Obj[[2]])
# split up the coordinates
id <- rle(!is.na(coord[,1]))
id <- matrix(c(1,cumsum(id$lengths)),ncol=2,byrow=T)
polygons <- apply(id,1,function(i){coord[i[1]:i[2],]})
# split up polygons that differ too much
polygons <- lapply(polygons,function(x){
x[,1] <- x[,1] + center
x[,1] <- ifelse(x[,1]>180,x[,1]-360,x[,1])
if(sum(diff(x[,1])>300,na.rm=T) >0){
id <- x[,1] < 0
x <- rbind(x[id,],c(NA,NA),x[!id,])
}
x
})
# reconstruct the object
polygons <- do.call(rbind,polygons)
Obj[[1]] <- polygons[,1]
Obj[[2]] <- polygons[,2]
map(Obj,...)
}
*Note that this function only takes positive center values. It's easily adapted to allow for center values in both directions, but I didn't bother anymore as that's trivial.
install the latest version of maps (3.2.0).
do this:
d$lon2 <- ifelse(d$lon < -25, d$lon + 360, d$lon) # where d is your df
mapWorld <- map_data('world', wrap=c(-25,335), ylim=c(-55,75))
ggplot() +
geom_polygon(data = mapWorld, aes(x=long, y = lat, group = group)) +
geom_point(data = d, aes(x = lon2, y = lat))
A bit late, but you can also create a shifted map by using a projection (requires the mapproj package):
map("world", projection="rectangular", parameter=0,
orientation=c(90,0,180), wrap=TRUE, fill=T, resolution=0,col=0)
This will shift by 180 degrees. But the difference with 'world2' is that the longitude co-ordinate will be different ([-pi,pi]). All projections of this package put 0 at the centre. And in that case, the 'wrap' option detects the jump correctly.
'resolution=0' helps to get cleaner borders.
You can easily change the centre longitude by changing the '180' value in the projection description.
What about this solution?
xlims = c(0, 359) # these are the limits you want to change
ylims = c(-55,75)
mapWorld <- map_data('world', wrap=xlims, ylim=ylims)
head(mapWorld)
g1 <- ggplot() +
geom_polygon(data = mapWorld, aes(x=long, y = lat, group = group)) +
coord_map("rectangular", lat0=0, xlim=xlims, ylim=ylims)
g1