This might sound like a strange process, but it's the best I can think of to control rasterised colour gradients with respect to discrete objects (points, lines, polygons). I'm 95% there but can't quite get it to plot correctly.
This should illustrate proof of concept:
library(raster)
r = matrix(56:255, ncol=20) # reds
b = t(matrix(56:255, ncol=10)) # blues
col = matrix(rgb(r, 0, b, max=255), ncol=20) # matrix of colour strings
ras = raster(r) # data raster object
extent(ras) = extent(1,200,1,100) # set extent for aspect
plot(ras, col = col, axes=F, asp=T) # overwrite data with custom colours
Here I want to clip a raster to a triangle and create a colour gradient of the pixels inside based on their distances to one of the sides. Sorry for the length, but it's the most minimal example I can design.
require(raster); require(reshape2); require(rgeos)
# equilateral triangle
t_s = 100 # half side
t_h = floor(tan(pi*60/180) * t_s) # height
corners = cbind(c(0, -t_s, t_s, 0), c(t_h, 0, 0, t_h))
trig = SpatialPolygons(list(Polygons(list(Polygon(corners)),"triangle")))
# line to measure pixel distances to
redline = SpatialLines(list(Lines(Line(corners[1:2,]), ID='redline')))
plot(trig); plot(redline, add=T, col='red', lwd=3)
# create a blank raster and clip to triangle
r = raster(mat.or.vec(nc = t_s*2 + 1, nr = t_h))
extent(r) = extent(-t_s, t_s, 0, t_h)
r = mask(r, trig)
image(r, asp=T)
# extract cell coordinates into d.f.
cells = as.data.frame(coordinates(rasterToPoints(r, spatial=T)))
# calculate distance of each pixel to redline with apply
dist_to_line = function(xy, line){
  # build a WKT point from the cell centre and measure its distance to the line
  point = readWKT(paste('POINT(', xy[1], xy[2], ')'))
  gDistance(point, line) / t_h
}
cells$dists = apply(cells, 1, dist_to_line, line=redline)
cells$cols = rgb(1 - cells$dists, 0, 0)
length(unique(cells$cols)) # count unique colours
# use custom colours to colour triangle pixels
image(r, col = cells$cols, asp=T)
plot(r, col = cells$cols, asp=T)
As you can see the plotting fails to overwrite as in the first example, but the data seems fine. Trying to convert to matrix also fails:
# try converting colours to matrix
col_ras = acast(cells, y~x, value.var='cols')
col_ras = apply(col_ras, 1, rev) # rotate acw to match r
plot(r, col = col_ras, asp=T)
Very grateful for any assistance on what's going wrong.
To show Spacedman's plotRGB method:
b = brick(draster, 1-draster, 1-draster)
plotRGB(b, scale=1)
plot(trig, col=NA, border='white', lwd=5, add=T)
Easy way is to go from your points to a spatial pixels data frame to a raster, then do the colour mapping...
Start with:
> head(cells)
x y dists
1 0.0000000 172.5 0.0014463709
2 0.0000000 171.5 0.0043391128
3 -0.9950249 170.5 0.0022523089
4 0.0000000 170.5 0.0072318546
5 0.9950249 170.5 0.0122114004
> coordinates(cells)=~x+y
> draster = raster(as(cells,"SpatialPixelsDataFrame"))
> cols=draster
> cols[!is.na(draster)]= rgb(1-draster[!is.na(draster)],0,0)
> plot(cols, col=cols)
I'm not sure this is the right way to do things, though. You might be better off creating an RGB raster stack and using plotRGB if you want fine colour control.
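One likely reason the col argument does not act as a per-cell override is that base image() treats col as a palette that is mapped onto the cell values (my understanding is that the raster plotting methods follow the same convention). The following is only a base-graphics sketch under that assumption, using a small hypothetical 3 x 4 grid of distances: give every cell its own index value and supply exactly one colour and one break per index.

```r
# Assumption: a hypothetical 3 x 4 grid of normalised distances in [0, 1]
d <- matrix(seq(0, 1, length.out = 12), nrow = 3)

# One colour string per cell, in column-major (matrix storage) order
cols <- rgb(1 - d, 0, 0)

# Give each cell a unique value so that value i picks colour i
idx <- matrix(seq_along(d), nrow = nrow(d))

# One break interval per index: (0.5, 1.5], (1.5, 2.5], ...
brks <- seq(0.5, length(d) + 0.5, by = 1)

image(idx, col = cols, breaks = brks, asp = 1)
```

The data values here are only indices into the colour vector, so the real distances would need to be kept separately if they are required later.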
Imagine a regular 0.5° grid across the Earth's surface. A 3x3 subset of this grid is shown below. As a stylized example of what I'm working with, let's say I have three polygons—yellow, orange, and blue—that for the sake of simplicity all are 1 unit in area. These polygons have attributes Population and Value, which you can see in the legend:
I want to turn these polygons into a 0.5° raster (with global extent) whose values are based on the weighted-mean Value of the polygons. The tricky part is that I want to weight the polygons' values not by their Population, but by their included population.
I know—theoretically—what I want to do, and below have done it for the center gridcell.
1. Multiply Population by Frac. included (the fraction of the polygon's area that is included in the gridcell) to get Pop. included. (Assumes population is distributed evenly throughout the polygon, which is acceptable.)
2. Divide each polygon's Pop. included by the sum of all polygons' Pop. included (32) to get Weight.
3. Multiply each polygon's Value by Weight to get Result.
4. Sum all polygons' Result to get the value for the center gridcell (0.31).
[Table: Frac. included and Pop. included for each polygon in the center gridcell]
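The four steps above can be sketched in a few lines of base R. This is only an illustration: it assumes populations of 16, 18 and 24 with values 0.4, 0.1 and 0.8, and the included fractions (0.5, 1 and 0.25) are hypothetical numbers chosen so that the totals match the 32 and 0.31 quoted above.

```r
# Assumed polygon attributes (illustrative only)
population    <- c(16, 18, 24)
value         <- c(0.4, 0.1, 0.8)
# Hypothetical fraction of each polygon's area inside the center gridcell
frac_included <- c(0.5, 1, 0.25)

# Step 1: population that falls inside the gridcell
pop_included <- population * frac_included

# Step 2: weight of each polygon within the gridcell
weight <- pop_included / sum(pop_included)

# Steps 3 and 4: weighted values, summed to give the gridcell value
result    <- value * weight
cell_value <- sum(result)

print(sum(pop_included))
print(round(cell_value, 2))
```

Because the weights sum to 1 within a gridcell, the same number can be obtained as sum(value * pop_included) / sum(pop_included), which is the ratio-of-two-sums form that suits rasterizing.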
I have an idea of how to accomplish this in R, as described below. Where possible, I've filled in code that I think will do what I want. My questions: How do I do steps 2 and 3? Or is there a simpler way to do this? If you want to play around with this, I have uploaded old_polygons as a .rds file here.
1. Calculate the area of each polygon: old_polygons$area <- as.numeric(st_area(old_polygons))
2. Generate the global 0.5° grid as some kind of Spatial object.
3. Split the polygons by the grid, generating new_polygons.
4. Calculate area of the new polygons: new_polygons$new_area <- as.numeric(st_area(new_polygons))
5. Calculate fraction included for each new polygon: new_polygons$frac_included <- new_polygons$new_area / new_polygons$area
6. Calculate "included population" in the new polygons: new_polygons$pop_included <- new_polygons$pop * new_polygons$frac_included
7. Calculate a new attribute for each polygon that is just their Value times their included population: new_polygons$tmp <- new_polygons$Value * new_polygons$pop_included
8. Set up an empty raster for the next steps: empty_raster <- raster(nrows=360, ncols=720, xmn=-180, xmx=180, ymn=-90, ymx=90)
9. Rasterize the polygons by summing this new attribute together within each gridcell: tmp_raster <- rasterize(new_polygons, empty_raster, "tmp", fun = "sum")
10. Create another raster that is just the total population in each gridcell: pop_raster <- rasterize(new_polygons, empty_raster, "pop_included", fun = "sum")
11. Divide the first raster by the second to get what I want:
output_raster <- empty_raster
values(output_raster) <- getValues(tmp_raster) / getValues(pop_raster)
Any help would be much appreciated!
Example data:
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
values(v) <- data.frame(population=1:12, value=round(c(2:13)/14, 2))
r <- rast(ext(v)+.05, ncols=4, nrows=6, names="cell")
Illustrate the data
p <- as.polygons(r)
plot(p, lwd=2, col="gray", border="light gray")
lines(v, col=rainbow(12), lwd=2)
txt <- paste0(v$value, " (", v$population, ")")
text(v, txt, cex=.8, halo=TRUE)
# area of the polygons
v$area1 <- expanse(v)
# intersect with raster cell boundaries
values(r) <- 1:ncell(r)
p <- as.polygons(r)
pv <- intersect(p, v)
# area of the polygon parts
pv$area2 <- expanse(pv)
pv$frac <- pv$area2 / pv$area1
Now we just use the data.frame with the attributes of the polygons to compute the polygon-cover-weighted-population-weighted values.
z <- values(pv)
a <- aggregate(z[, "frac", drop=FALSE], z[,"cell",drop=FALSE], sum)
names(a)[2] <- 'fsum'
z <- merge(z, a)
z$weight <- z$population * z$frac / z$fsum
z$wvalue <- z$value * z$weight
b <- aggregate(z[, c("wvalue", "weight")], z[, "cell", drop=FALSE], sum)
b$bingo <- b$wvalue / b$weight
Assign values back to raster cells
x <- rast(r)
x[b$cell] <- b$bingo
Inspect results
text(x, digits=2, halo=TRUE, cex=.9)
text(v, "value", cex=.8, col="red", halo=TRUE)
This may not scale very well to large data sets, but you could perhaps do it in chunks.
This is fast and scalable:
library(terra)
library(data.table)
# make the 3 polygons with radius = 5km
center_points <- data.frame(lon = c(0.5, 0.65, 1),
lat = c(0.75, 0.65, 1),
Population = c(16, 18, 24),
Value = c(0.4, 0.1, 0.8))
polygon <- vect(center_points, crs = "EPSG:4326")
polygon <- buffer(polygon, 5000)
# make the raster
my_raster <- rast(nrow = 3, ncol = 3, xmin = 0, xmax = 1.5, ymin = 0, ymax = 1.5, crs = "EPSG:4326")
my_raster[] <- 0 # set the value to 0 for now
# find the fractions of cells in each polygon
# "cells" gives you the cell ID and "weights" (or "exact") gives you the cell fraction in the polygon
# using "exact" instead of "weights" is more accurate
my_Table <- extract(my_raster, polygon, cells = TRUE, weights = TRUE)
setDT(my_Table) # convert to datatable
# merge the polygon attributes to "my_Table"
poly_Table <- setDT(as.data.frame(polygon))
poly_Table[, ID := 1:nrow(poly_Table)] # add the IDs which are the row numbers
merged_Table <- merge(my_Table, poly_Table, by = "ID")
# find Frac_included
merged_Table[, Frac_included := weight / sum(weight), by = ID]
# find Pop_included
merged_Table[, Pop_included := Frac_included * Population]
# find Weight, to avoid confusion with "weight" produced above, I call this "my_Weight"
merged_Table[, my_Weight := Pop_included / sum(Pop_included), by = cell]
# final results
Result <- merged_Table[, .(Result = sum(Value * my_Weight)), by = cell]
# add the values to the raster
my_raster[Result$cell] <- Result$Result
I'm looking for an R implementation of the ESRI 'Slice' tool, specifically I want to use the 'EQUAL_AREA' option.
I want to use an input raster, and reclassify raster values into 9 'bins', based on (approximately) the number of cells within each bin.
My raster has values between 0 and 50,000 and covers a very large geographical area. So, for example, values between 0 and 5000 might become '1', values between 5000 and 6000 might become '2', and so on, depending on how many cells there are in each category.
There is no such package as far as I know, but you can use the classInt and raster packages to do what you are looking for. Although you need to come up with a reproducible example to get the best result, I think the script below does the job:
# required libraries
library(raster)
library(classInt)
# sample data
volcanoR <- raster(volcano)
n = 9 # this is number of classes
zClass <- classIntervals(values(volcanoR), n=n, style="jenks")
# chosen style: one of "fixed", "sd", "equal", "pretty", "quantile", "kmeans", "hclust",
# "bclust", "fisher", "jenks" or "dpih"
# classes for reclassification based on NBJ
df.rcl <- data.frame(zClass$brks[1:(length(zClass$brks)-1)],
zClass$brks[2:length(zClass$brks)],
seq(1,length(zClass$brks)-1,1))
rec.ras <- reclassify(volcanoR, df.rcl, include.lowest=TRUE)
plot(rec.ras, col=terrain.colors(n, alpha=1, rev=T), legend=F, main="NBJ")
legend("topleft", legend = c(seq(1,length(zClass$brks)-1,1)),
fill = terrain.colors(n, alpha = 1, rev = T), cex=0.85, bty = "n")
Same approach for equal interval classes:
zClass <- classIntervals(values(volcanoR), n=n, style="equal")
# chosen style: one of "fixed", "sd", "equal", "pretty", "quantile", "kmeans", "hclust",
# "bclust", "fisher", "jenks" or "dpih"
# classes for reclassification based on EQUAL INTERVAL
df.rcl <- data.frame(zClass$brks[1:(length(zClass$brks)-1)],
zClass$brks[2:length(zClass$brks)],
seq(1,length(zClass$brks)-1,1))
rec.ras <- reclassify(volcanoR, df.rcl, include.lowest=TRUE)
plot(rec.ras, col=terrain.colors(n, alpha=1, rev=T), legend=F, main="Equal Interval")
legend("topleft", legend = c(seq(1,length(zClass$brks)-1,1)),
fill = terrain.colors(n, alpha = 1, rev = T), cex=0.85, bty = "n")
v <- raster(volcano)
Use quantile with sampleRegular (for very large rasters)
s <- seq(0, 1, 1/9)
q <- quantile(sampleRegular(v,500000), s)
x <- cut(v, q) # like reclassify
Check the results
# 1 2 3 4 5 6 7 8 9
#597 582 547 562 613 575 592 556 632
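The same quantile idea can be tried without any raster objects. This is only a base-R sketch on the volcano matrix from the datasets package, treating its cells as a plain vector; include.lowest = TRUE is my own choice here so that cells equal to the minimum fall into class 1 rather than becoming NA.

```r
# volcano ships with base R (datasets); use its cells as a plain vector
v <- as.vector(volcano)

# 10 break points at equally spaced probabilities give 9 classes
probs <- seq(0, 1, length.out = 10)
q <- quantile(v, probs)

# Assign each cell to a class 1..9; the minimum goes into class 1
cls <- cut(v, breaks = q, include.lowest = TRUE, labels = FALSE)

# Number of cells per class (roughly equal, ties permitting)
counts <- table(cls)
print(counts)
```

Counts are only approximately equal because cells with tied values always land in the same class.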
I am trying to figure out the proportion of an area that has a slope within +/- 5 degrees of 0. Put another way, anything above 5 degrees or below -5 degrees is bad. I want both the actual number and a graphic.
To achieve this I turned to R and the raster package.
Let's use a generic country, in this case, the Philippines
{list.of.packages <- c("sp","raster","rasterVis","maptools","rgeos")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)}
library(sp) # classes for spatial data
library(raster) # grids, rasters
library(rasterVis) # raster visualisation
Now let's get the altitude information and plot the slopes.
elevation <- getData("alt", country = "PHL")
x <- terrain(elevation, opt = c("slope", "aspect"), unit = "degrees")
Not very helpful due to the scale, so let's simply look at the Island of Palawan
e <- drawExtent(show=TRUE) #to crop out Palawan (it's the long skinny island that is roughly midway on the left and is oriented between 2 and 8 O'clock)
gewataSub <- crop(x,e)
plot(gewataSub, 1)## Now visualize the new cropped object
A little bit better to visualize. I get a sense of the magnitude of the slopes and that with a 5 degree restriction, I am mostly confined to the coast. But I need a little bit more for analysis.
I would like the results in two parts:
1. " 35 % (made up) of the selected area has a slope exceeding +/- 5 degrees" or " 65 % of the selected area is within +/- 5 degrees". (with the code to get it)
2. A picture where everything within +/- 5 degrees is one color, call it good or green, and everything else is in another color, call it bad or red.
There are no negative slopes, so I assume you want those that are less than 5 degrees
elevation <- getData('alt', country='CHE')
x <- terrain(elevation, opt='slope', unit='degrees')
z <- x <= 5
Now you can count cells with freq
f <- freq(z)
If you have a planar coordinate reference system (that is, with units in meters or similar) you can do
f <- cbind(f, area=f[,2] * prod(res(z)))
to get areas. But for lon/lat data, you would need to correct for different sized cells and do
a <- area(z)
zonal(a, z, fun=sum)
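To see why lon/lat cells need this correction, here is a rough sketch using the standard spherical formula for the area of a cell bounded by two parallels and two meridians. It assumes a perfect sphere with a radius of 6371 km, and cell_area is a hypothetical helper; raster's area() may well use a different, more accurate calculation.

```r
# Hypothetical helper: area (km^2) of a lon/lat cell on a sphere of radius R
cell_area <- function(lat_min, lat_max, dlon, R = 6371) {
  rad <- pi / 180
  R^2 * (dlon * rad) * (sin(lat_max * rad) - sin(lat_min * rad))
}

# A 0.5 degree cell at the equator versus one starting at 60 degrees north
a_equator <- cell_area(0, 0.5, 0.5)
a_60      <- cell_area(60, 60.5, 0.5)
ratio     <- a_60 / a_equator   # roughly cos(60 degrees), i.e. about one half

# Sanity check: all cells of a global 0.5 degree grid add up to the sphere
lats  <- seq(-90, 89.5, by = 0.5)
total <- sum(cell_area(lats, lats + 0.5, 0.5)) * 720
```

So a simple cell count overstates the share of high-latitude cells, which is why the areas are summed per zone instead.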
And there are different ways to plot, but the most basic one is
plot(z)
You can use reclassify from the raster package to achieve that. The function assigns each cell value that lies within a defined interval a certain value. For example, you can assign cell values within interval (0,5] to value 0 and cell values within the interval (5, maxSlope] to value 1.
elevation <- getData("alt", country = "PHL")
x <- terrain(elevation, opt = c("slope", "aspect"), unit = "degrees")
e <- drawExtent(show = TRUE)
gewataSub <- crop(x, e)
plot(gewataSub$slope, 1)
m <- c(0, 5, 0, 5, maxValue(gewataSub$slope), 1)
rclmat <- matrix(m, ncol = 3, byrow = TRUE)
rc <- reclassify(gewataSub$slope, rclmat)
levelplot(rc,
margin = F,
col.regions = c("wheat", "gray"),
colorkey = list(at = c(0, 1, 2), labels = list(at = c(0.5, 1.5), labels = c("<= 5", "> 5"))))
After the reclassification you can calculate the percentages:
length(rc[rc == 0]) / (length(rc[rc == 0]) + length(rc[rc == 1])) # <= 5 degrees
[1] 0.6628788
length(rc[rc == 1]) / (length(rc[rc == 0]) + length(rc[rc == 1])) # > 5 degrees
[1] 0.3371212
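The same kind of proportion can also be written as the mean of a logical vector, which avoids counting each class separately. A small sketch on a hypothetical vector of slope values (not the Palawan data), with NA standing in for cells outside the area:

```r
# Hypothetical slope values in degrees; NA marks cells with no data
slope <- c(0, 2, 5, 7, 12, NA, 3, 30)

# TRUE where the slope is at most 5 degrees, NA where there is no data
ok <- slope <= 5

# The mean of a logical vector is the share of TRUE values
p_ok  <- mean(ok, na.rm = TRUE)
p_bad <- 1 - p_ok
```

With a raster, the cell values would first have to be pulled out as a vector (for example with values()), which I am assuming fits in memory.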
Can I specify endpoints to colorRamp so that a value maps consistently to a single color, regardless of the range of other data?
I'm trying to create an interactive correlation plot in plotly. Here's some sample data.
m <- 4
cm <- matrix(runif(m**2,-1,1),
nrow=m, ncol=m,
dimnames=list(letters[1:m], letters[1:m]))
diag(cm) <- 1
# a b c d
# a 1.0000000 -0.5966361 0.2582281 0.3740457
# b -0.2557522 1.0000000 -0.8764275 -0.2317926
# c 0.1457067 0.8893505 1.0000000 0.5396828
# d 0.8164156 0.3215956 -0.6468865 1.0000000
I'm basically trying to create an interactive version of this:
Here's the (kind of hacky) interactive correlation plot I created.
library(plotly)
div_colors <- c('dark red','white','navy blue')
grid_labels <- matrix(paste0('Cor(',
do.call(paste, c(expand.grid(rownames(cm),colnames(cm)), sep=', ') ),
'): ',
round(cm, 3)),
nrow=m, ncol=m)
plot_ly(x = colnames(cm),
y = rownames(cm),
z = cm,
colors = colorRamp(div_colors),
text = grid_labels
) %>% layout(yaxis=list(autorange='reversed'))
My problem is that without forcing the colorRamp endpoints to c(-1,1), the white color doesn't match correlation of 0, and the dark red maps to the minimum observed, rather than -1.
As @rawr mentioned in a comment, the solution is to set zmin and zmax, as in:
plot_ly(x = colnames(cm),
y = rownames(cm),
z = cm,
zmin=-1, # <============
zmax=1, # <============
colors = colorRamp(div_colors),
text = grid_labels
) %>% layout(yaxis=list(autorange='reversed'))
Which produces the desired result. (The legend bar is shorter, presumably due to a change in default sizes in a newer version of plotly.)
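Outside plotly, the same fixed mapping can be sketched in base R by rescaling the value from [-1, 1] to [0, 1] before handing it to the function returned by colorRamp. The helper name cor_to_col is hypothetical, and the colour names are written without spaces here.

```r
# Colour ramp with fixed endpoints: -1 -> dark red, 0 -> white, 1 -> navy blue
ramp <- colorRamp(c("darkred", "white", "navyblue"))

# Hypothetical helper: map correlations in [-1, 1] to hex colour strings
cor_to_col <- function(z) {
  scaled <- (z + 1) / 2                 # rescale [-1, 1] to [0, 1]
  rgb(ramp(scaled), maxColorValue = 255)
}

cols <- cor_to_col(c(-1, 0, 1))
print(cols)
```

Because the rescaling uses the fixed range rather than the observed minimum and maximum, a given correlation always gets the same colour.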
I have data that looks like this:
> print(dat)
cutoff tp fp
1 0.6 414 45701
2 0.7 172 16820
3 0.8 51 4326
4 0.9 49 3727
5 1.0 0 0
I want to plot them in reverse order, from the smallest dat$tp to the largest.
However, this code plots them in the order above (i.e. largest to smallest) instead.
> fp_max <- max(dat$fp);
> tp_max <- max(dat$tp);
> op <- par(xaxs = "i", yaxs = "i")
> plot(tp ~ fp, data = dat, xlim = c(0,fp_max),ylim = c(0,tp_max), type = "n")
> with(dat, lines(c(0, fp, fp_max), c(0, tp, tp_max), lty=1, type = "l", col = "black"))
> lines( par()$usr[1:2], par()$usr[3:4], col="red" )
How can I modify the code above to address the problem?
Of course, the x-axis & y-axis coordinates should be from smallest to largest value
The following shows the result of my current code.
Notice that the line starts at (0,0) and then 'goes back' to 0 again.
We want to avoid it going back to 0.
Ahh, I understand.
It's because lines draws lines between the points in the order they are given.
There are a few ways you could get around this:
do type='l' in your plot command and then with(dat,lines(...)) is not necessary:
# can also do the col='black',lty=1 in here.
plot(tp ~ fp, data = dat, xlim = c(0,fp_max),ylim = c(0,tp_max), type = "l")
Note that by definition of your fp_max and tp_max, you will include the point (fp_max,tp_max) already. And as long as you have a row with (0,0) for tp and fp in dat, you'll also get the (0,0) point.
Sort dat$fp and use the resulting index to reorder dat$tp too:
plot(tp ~ fp, ..., type='n')
# sort dat$fp
obj <- sort(dat$fp,index.return=T)
# use obj$x as fp and obj$ix to reorder dat$tp prior to plotting
lines(c(0, obj$x, fp_max), c(0, dat$tp[obj$ix], tp_max),
lty=1, type = "l", col = "black")
#Get order of rows
idx <- order(dat$tp)
#Select data in sorted order
sorted <- dat[idx,]
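For completeness, a self-contained sketch of this ordering approach, with the data frame rebuilt from the values printed in the question. The plotting call at the end is just one way to use the sorted rows.

```r
# Data as printed in the question
dat <- data.frame(cutoff = c(0.6, 0.7, 0.8, 0.9, 1.0),
                  tp     = c(414, 172, 51, 49, 0),
                  fp     = c(45701, 16820, 4326, 3727, 0))

# Get the row order from smallest to largest tp, then reorder the rows
idx    <- order(dat$tp)
sorted <- dat[idx, ]

# Lines are drawn in row order, so the curve now runs from (0,0) upwards
plot(tp ~ fp, data = sorted, type = "l",
     xlim = c(0, max(dat$fp)), ylim = c(0, max(dat$tp)))
```

For this data, ordering by tp also leaves fp in increasing order, so the line never doubles back.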