I am looking for a general solution to create bivariate choropleth maps in R using raster files.
I have found the following code here which nearly does what I need but it is limited: it can only handle data which are between 0 and 1 on both axes. In my specific use-case one axis spans 0 to 1 while the other spans -1 to 1. Regardless of my specific use-case, I think a more general function which can handle different data ranges would be useful to many people.
I have already tried updating the code within the function colmat to handle negative data but for the life of me cannot get it to work. In the interests of clarity I have avoided posting all of my failed attempts and have instead copied below the code I found at the link above in the hope that someone may be able to offer a solution.
The current code first creates a colour matrix using colmat. The colour matrix generated is then used in bivariate.map along with your two raster files containing the data. I think the ideal solution would be to create the colour matrix based on the two rasters first (so that it can correctly bin the data based on your actual data, not the current solution which is between 0 and 1).
````
library(classInt)
library(raster)
library(rgdal)
library(dismo)
library(XML)
library(maps)
library(sp)
# Creates dummy rasters
rasterx<- raster(matrix(rnorm(400),20,20))
rasterx[rasterx <=0]<-1
rastery<- raster(matrix(rnorm(400),20,20))
# This function creates a colour matrix
# At present it cannot handle negative values i.e. the matrix spans from 0 to 1 along both axes
colmat<-function(nquantiles=10, upperleft=rgb(0,150,235, maxColorValue=255), upperright=rgb(130,0,80, maxColorValue=255), bottomleft="grey", bottomright=rgb(255,230,15, maxColorValue=255), xlab="x label", ylab="y label"){
my.data<-seq(0,1,.01)
my.class<-classIntervals(my.data,n=nquantiles,style="quantile")
my.pal.1<-findColours(my.class,c(upperleft,bottomleft))
my.pal.2<-findColours(my.class,c(upperright, bottomright))
col.matrix<-matrix(nrow = 101, ncol = 101, NA)
for(i in 1:101){
my.col<-c(paste(my.pal.1[i]),paste(my.pal.2[i]))
col.matrix[102-i,]<-findColours(my.class,my.col)
}
plot(c(1,1),pch=19,col=my.pal.1, cex=0.5,xlim=c(0,1),ylim=c(0,1),frame.plot=F, xlab=xlab, ylab=ylab,cex.lab=1.3)
for(i in 1:101){
col.temp<-col.matrix[i-1,]
points(my.data,rep((i-1)/100,101),pch=15,col=col.temp, cex=1)
}
seqs<-seq(0,100,(100/nquantiles))
seqs[1]<-1
col.matrix<-col.matrix[c(seqs), c(seqs)]
}
# Creates colour matrix
col.matrix<-colmat(nquantiles=2, upperleft="blue", upperright="yellow", bottomleft="green", bottomright="red", xlab="Species Richness", ylab="Change in activity hours")
# Function to create bivariate map, given the colour ramp created previously
bivariate.map<-function(rasterx, rastery, colormatrix=col.matrix, nquantiles=10){
quanmean<-getValues(rasterx)
temp<-data.frame(quanmean, quantile=rep(NA, length(quanmean)))
brks<-with(temp, quantile(temp,na.rm=TRUE, probs = c(seq(0,1,1/nquantiles))))
r1<-within(temp, quantile <- cut(quanmean, breaks = brks, labels = 2:length(brks),include.lowest = TRUE))
quantr<-data.frame(r1[,2])
quanvar<-getValues(rastery)
temp<-data.frame(quanvar, quantile=rep(NA, length(quanvar)))
brks<-with(temp, quantile(temp,na.rm=TRUE, probs = c(seq(0,1,1/nquantiles))))
r2<-within(temp, quantile <- cut(quanvar, breaks = brks, labels = 2:length(brks),include.lowest = TRUE))
quantr2<-data.frame(r2[,2])
as.numeric.factor<-function(x) {as.numeric(levels(x))[x]}
col.matrix2<-colormatrix
cn<-unique(colormatrix)
for(i in 1:length(col.matrix2)){
ifelse(is.na(col.matrix2[i]),col.matrix2[i]<-1,col.matrix2[i]<-which(col.matrix2[i]==cn)[1])
}
cols<-numeric(length(quantr[,1]))
for(i in 1:length(quantr[,1])){
a<-as.numeric.factor(quantr[i,1])
b<-as.numeric.factor(quantr2[i,1])
cols[i]<-as.numeric(col.matrix2[b,a])}
r<-rasterx
r[1:length(r)]<-cols
return(r)
}
# Creates map
bivmap<-bivariate.map(rasterx,rastery, colormatrix=col.matrix, nquantiles=2)
# Plots a map
plot(bivmap,frame.plot=F,axes=F,box=F,add=F,legend=F,col=as.vector(col.matrix))
````
Ideally, a more general function would take two raster files, determine the data ranges of both and then create a bivariate choropleth map based on the number of bins/quantiles specified by the user.
Here are some ideas based on your code
Three functions
makeCM <- function(breaks=10, upperleft, upperright, lowerleft, lowerright) {
m <- matrix(ncol=breaks, nrow=breaks)
b <- breaks-1
b <- (0:b)/b
col1 <- rgb(colorRamp(c(upperleft, lowerleft))(b), max=255)
col2 <- rgb(colorRamp(c(upperright, lowerright))(b), max=255)
cm <- apply(cbind(col1, col2), 1, function(i) rgb(colorRamp(i)(b), max=255))
cm[, ncol(cm):1 ]
}
plotCM <- function(cm, xlab="", ylab="", main="") {
n <- cm
n <- matrix(1:length(cm), nrow=nrow(cm), byrow=TRUE)
r <- raster(n)
cm <- cm[, ncol(cm):1 ]
image(r, col=cm, axes=FALSE, xlab=xlab, ylab=ylab, main=main)
}
rasterCM <- function(x, y, n) {
q1 <- quantile(x, seq(0,1,1/(n)))
q2 <- quantile(y, seq(0,1,1/(n)))
r1 <- cut(x, q1, include.lowest=TRUE)
r2 <- cut(y, q2, include.lowest=TRUE)
overlay(r1, r2, fun=function(i, j) {
(j-1) * n + i
})
}
Example data
library(raster)
set.seed(42)
r <- raster(ncol=50, nrow=50, xmn=0, xmx=10, ymn=0,ymx=10, crs="+proj=utm +zone=1")
x <- init(r, "x") * runif(ncell(r), .5, 1)
y <- init(r, "y") * runif(ncell(r), .5, 1)
And now use the functions
breaks <- 5
cmat <- makeCM(breaks, "blue", "yellow", "green", "red")
xy <- rasterCM(x, y, breaks)
par(mfrow=c(2,2), mai=c(.5,.5,.5,.5), las=1)
plot(x)
plot(y)
par(mai=c(1,1,1,1))
plotCM(cmat, "var1", "var2", "legend")
par(mai=c(.5,.5,.5,.5))
image(xy, col=cmat, las=1)
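The class-combination step inside rasterCM can be checked on plain vectors. The following is a minimal base R sketch, assuming toy data (the vectors vx and vy and the other object names are made up for illustration): because the breaks come from the quantiles of the data themselves, a variable spanning -1 to 1 is binned just like one spanning 0 to 1.

```r
# Toy data on two different ranges (assumed for illustration only)
set.seed(1)
vx <- runif(100, 0, 1)    # spans 0 to 1
vy <- runif(100, -1, 1)   # spans -1 to 1, including negative values
n <- 3                    # number of classes per axis

# Breaks are taken from the data, so the absolute range does not matter
qx <- quantile(vx, seq(0, 1, length.out = n + 1))
qy <- quantile(vy, seq(0, 1, length.out = n + 1))

# Class (1..n) of every value on each axis
cx <- as.integer(cut(vx, qx, include.lowest = TRUE))
cy <- as.integer(cut(vy, qy, include.lowest = TRUE))

# Same formula as in rasterCM: one index (1..n^2) per pair of classes
idx <- (cy - 1) * n + cx
table(idx)
```

Each value of idx then selects one of the n^2 colours, which is what the overlay() call does cell by cell on the rasters.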
I want to select raster cells that are within a certain distance (e.g. 1 km or 5 km) of the boundary of a polygon. I ultimately want to take an average of only those raster cells that are within the specified distance from the boundary of the shapefile inwards.
The way I thought I would approach is to create a negative buffer inwards, and subtract the original polygon and the buffer. Then mask and crop the raster using the new polygon and take the average.
Here's sample data demonstrating what I want to do.
library(raster)
# raster
r <- raster(xmn=1035792, xmx= 1116792, ymn=825303.6, ymx=937803.6, resolution = 12.5,crs = "+init=epsg:3174")
r <- setValues(r, 0)
# polygon
x <- c(1199999, 1080000, 1093067, 1090190, 1087977, 1070419, 1180419)
y <- c(957803.6,937803.6, 894366.9, 872153.9, 853703.0, 825353.6, 805353.6)
poly.lake <- SpatialPolygons(list(Polygons(list(Polygon(data.frame(x,y))), ID = 1)))
r <- mask(r, poly.lake)
r <- crop(r, poly.lake)
plot(poly.lake)
plot(r, add = T)
Instead of taking the average of the resulting raster r, I only want to average raster cells which are within a certain specified distance from the boundary.
The example data but using "terra"
library(terra)
r <- rast(xmin=1035792, xmax= 1116792, ymin=825303.6, ymax=937803.6, resolution = 125, crs = "epsg:3174")
values(r) <- 1:ncell(r)
# polygon
x <- c(1199999, 1080000, 1093067, 1090190, 1087977, 1070419, 1180419)
y <- c(957803.6,937803.6, 894366.9, 872153.9, 853703.0, 825353.6, 805353.6)
p <- vect(cbind(x, y), "polygons", crs = "epsg:3174")
r <- mask(r, p)
r <- crop(r, p)
You can now take the internal buffer of p
b <- buffer(p, -10000)
x <- mask(r, b, inverse=TRUE)
global(x, mean,na.rm=T)
# mean
#lyr.1 296549.9
Or you can take both sides like this
bb <- buffer(as.lines(p), 10000)
y <- mask(r, bb)
global(y, mean,na.rm=T)
# mean
#lyr.1 296751.3
So there is a slight difference between these two approaches, I think because the first uses inverse=TRUE. I would go with the second approach.
Your drawing (and Chris' answer) suggests that you only want the distance to the western border. In that case, you can first find the start and end nodes you need (from 2 to 6)
plot(p)
points(p)
text(as.points(p), pos=2)
Select the segments in between these nodes and create a line type SpatVector.
g <- geom(p)
k <- vect(g[2:6,], "lines", crs=crs(p))
lines(k, col="red", lwd=2)
And now do as above.
bk <- buffer(k, 10000)
z <- mask(r, bk)
global(z, mean,na.rm=T)
# mean
#lyr.1 297747
If you wanted to get the part of buffer bk that is inside the original polygon p you can do
bki <- intersect(bk, p)
To complete the plot
polys(bk, lty=3, border=NA, col=adjustcolor("light blue", alpha.f = 0.4))
lines(bki, lty=3)
Finding which segments of a polygon to buffer was what puzzled me, and cast_poly_to_subsegments seems a decent approach. Taking your poly.lake as poly_sf:
geom <- lapply(
1:(length(st_coordinates(poly_sf)[, 1]) - 1),
function(i) {
rbind(
as.numeric(st_coordinates(poly_sf)[i, 1:2]),
as.numeric(st_coordinates(poly_sf)[i + 1, 1:2])
)
}
) |>
st_multilinestring() |>
st_sfc(crs=st_crs(rt)) |>
st_cast('LINESTRING')
gives us
which is a little surprising, the 'green and red', that I assumed would be 'green'. It is wound clockwise so the desired segments to buffer are 4 & 5.
lns_buf4 <- st_buffer(st_geometry(geom)[4], 1000, singleSide = TRUE)
lns_buf5 <- st_buffer(st_geometry(geom)[5], 1000, singleSide= TRUE)
lns_buf5_neg <- st_buffer(st_geometry(geom)[5], -1000, singleSide= TRUE)
plot(st_geometry(geom), col = c('red', 'yellow', 'blue', 'green'))
plot(lns_buf4, col = 'black', add = TRUE)
plot(lns_buf5, col = 'green', add = TRUE)
plot(lns_buf5_neg, col = 'blue', add = TRUE)
Whether +/-1000 is sufficient is a further intersection test between the buffer poly(s) and the other boundary. If the desired sampling area is not rectangular, steps can be taken to construct a sampling polygon from the buffer and intersection.
#library(lwgeom)
# on poly_sf
new_line <- draw(x = 'line', col ='blue', lwd = 2, n = 10)
lns_buf5_10k_neg <- st_buffer(st_geometry(geom)[5], -10000, singleSide= TRUE)
new_line_sf <- st_as_sf(new_line, crs = st_crs(lns_buf5_10k_neg))
buf5_nline_split <- lwgeom::st_split(lns_buf5_10k_neg, new_line_sf$geometry)
irreg_smp_area <- st_collection_extract(buf5_nline_split)[1]
Though I'm happy to see it all done in terra.
I have multiple data frames where the first column (padded with NAs at the end) is the wavenumber and the other columns hold my variables at each wavenumber for multiple observations.
Is it possible to plot the columns so that the first column supplies the x-axis values and the others are drawn in one big plot with their respective y-values?
I already tried "matplot" (resulting in "numbers" instead of points),
matplot(df[,1],df[,3:5],xlab = "Wavelength [nm]", ylab = "Absorbance")
and different variants of "xyplot" (which offer no way to give more than one y-value), but none seemed to work (at my level of R knowledge).
The final result should look like this:
Thanks for any help!
You could always make your own function to do this; I make such functions on a regular basis when nothing really fits my needs.
I put this together rather quickly but you can adapt it to your needs.
# generate data
set.seed(6)
n <- 50
dat <- data.frame(x1=seq(1,100, length.out = n),
x2=seq(1,20, length.out = n)+rnorm(n),
x3=seq(1,20, length.out = n)+rnorm(n, mean = 3),
x4=seq(1,20, length.out = n)+rnorm(n, mean = 5))
# make some NAs at the end
dat[45:n,2] <- NA
dat[30:n,3] <- NA
plot_multi <- function(df, x=1, y=2, cols=y,
xlim=range(df[,x], na.rm = T),
ylim=range(df[,y], na.rm = T),
main="", xlab="", ylab="", ...){
# setup plot frame
plot(NULL,
xlim=xlim,
ylim=ylim,
main=main, xlab=xlab, ylab=ylab)
# plot all your y's against your x
pb <- sapply(seq_along(y), function(i){
points(df[,c(x, y[i])], col=cols[i], ...)
})
}
plot_multi(dat, y=2:4, type='l', lwd=3, main = ":)",
xlab = "Wavelength", ylab = "Absorbance")
Results in :
EDIT
I actually found your dataset online by chance, so I'll include how to plot it as well using my code above.
file <- 'http://openmv.net/file/tablet-spectra.csv'
spectra <- read.csv(file, header = FALSE)
# remove box label
spectra <- spectra[,-1]
# add the 'wavelength' and rotate the df
# (i didn't find the actual wavelength values, but hey).
spectra <- cbind(1:ncol(spectra), t(spectra))
plot_multi(spectra, y=2:ncol(spectra), cols = rainbow(ncol(spectra)),
type='l', main=":))", ylab="Absorbance", xlab = "'Wavelength'")
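For comparison, base R's matplot can draw the same kind of plot directly. The digits mentioned in the question are matplot's default plotting characters; passing type = "l" draws lines instead. A minimal sketch, reusing simulated data in the style of the example above (not the asker's real spectra):

```r
# Simulated data: first column is x, the others are y-series
set.seed(6)
n <- 50
dat <- data.frame(x1 = seq(1, 100, length.out = n),
                  x2 = seq(1, 20, length.out = n) + rnorm(n),
                  x3 = seq(1, 20, length.out = n) + rnorm(n, mean = 3))
dat[45:n, 2] <- NA   # trailing NAs simply end that line early

# type = "l" draws one line per column instead of the default digits
matplot(dat[, 1], as.matrix(dat[, -1]), type = "l", lty = 1,
        col = c("black", "red"),
        xlab = "Wavelength", ylab = "Absorbance")
```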
You could use the pavo R package, which is made to deal with spectral data (full disclosure, I'm one of the maintainers):
library(pavo)
df <- t(read.csv("http://openmv.net/file/tablet-spectra.csv", header = FALSE))
df <- df[-1, ]
df <- apply(df, 2, as.numeric)
df <- cbind(wl = seq_len(nrow(df)),
df)
df <- as.rspec(df)
#> wavelengths found in column 1
plot(df, ylab = "Absorbance", col = rainbow(3))
Created on 2019-07-26 by the reprex package (v0.3.0)
I am trying to group data points of an NMDS in ggplot by adding ellipses using the ordiellipse function following the helpful advice from this post. However, although I don't get any error messages or warnings, calculating the ellipse data produces an empty data frame.
The dataset is available here, and my code is as follows:
library(vegan)
library(ggplot2)
setwd("C:")
veg_matrix <- read.csv("161019_vegetation_matrix.csv",header=T); veg_matrix[2:123] <- lapply(veg_matrix[2:123],as.character); veg_matrix[2:123] <- lapply(veg_matrix[2:123],as.numeric)
rownames(veg_matrix) <- veg_matrix[,1]; veg_matrix <- veg_matrix[,-1] # remove non-numeric rownames (=plot codes) from dataset
speciescomp_NMDS = metaMDS(veg_matrix, k=2, trymax=100, distance="raup", na.rm=T)
plot(speciescomp_NMDS,display="sites",type="n")
# add grouping data
plot_scores <- as.data.frame(scores(speciescomp_NMDS))
plot_scores$plot <- rownames(plot_scores);plot_scores <- cbind(plot_scores,data.frame(matrix(unlist(strsplit(plot_scores$plot,"_")),nrow=24,byrow=T)))[,-3]; colnames(plot_scores)[c(3,4)] <- c("summit","aspect")
plot_scores$group <- ""
plot_scores$group[plot_scores$summit=="BUF"|plot_scores$summit=="SES"] <- "low"
plot_scores$group[plot_scores$summit=="CHA"|plot_scores$summit=="MIN"] <- "intermediate"
plot_scores$group[plot_scores$summit=="CUO"|plot_scores$summit=="GAJ"] <- "high"
plot_scores$group <- transform(plot_scores, as.factor(plot_scores$group))
# compute ellipse data
ord <- ordiellipse(speciescomp_NMDS,plot_scores$group,display = "sites", kind = "sd", conf = .95, label=T)
f_ellipse <- function (cov, center = c(0, 0), scale = 1, npoints = 100)
{
theta <- (0:npoints) * 2 * pi/npoints
Circle <- cbind(cos(theta), sin(theta))
t(center + scale * t(Circle %*% chol(cov)))
}
df_ell_level <- data.frame()
for(g in levels(plot_scores$group)){
if(is.null(ord[[g]])) next
df_ell_level <- rbind(df_ell_level,
cbind(as.data.frame(with(plot_scores[plot_scores$group==g,],
f_ellipse(ord[[g]]$cov,ord[[g]]$center,ord[[g]]$scale)))
,level=group))
}
Any ideas on how to handle this one will be warmly appreciated!
First convert group to a factor
plot_scores <- transform(plot_scores, group = as.factor(group))
If we now fix your loop to be:
df_ell_level <- data.frame()
for(g in levels(plot_scores$group)){
if(is.null(ord[[g]])) next
df_ell_level <-
rbind(df_ell_level,
cbind(as.data.frame(f_ellipse(ord[[g]]$cov, ord[[g]]$center, ord[[g]]$scale)),
level=g)
)
}
Then I get a data frame with 301 rows. But I've cut out a lot of your unnecessary code from the call, and you really don't want to be rbind()ing data frames together in a loop like this.
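One way to avoid growing a data frame inside the loop is to build the pieces with lapply() and bind them once. The sketch below is self-contained, so it uses a hypothetical stand-in for the ordiellipse() result (a named list with cov, center and scale per group, mirroring the fields the loop above reads); with real data you would presumably use ord itself.

```r
# Ellipse outline from a covariance matrix, as in the question
f_ellipse <- function(cov, center = c(0, 0), scale = 1, npoints = 100) {
  theta <- (0:npoints) * 2 * pi / npoints
  circle <- cbind(cos(theta), sin(theta))
  t(center + scale * t(circle %*% chol(cov)))
}

# Hypothetical stand-in for the object returned by ordiellipse()
ord <- list(
  low  = list(cov = diag(2), center = c(0, 0), scale = 1),
  high = list(cov = matrix(c(2, 0.5, 0.5, 1), 2), center = c(1, 1), scale = 2)
)

# Build one data frame per group, then bind them in a single call
pieces <- lapply(names(ord), function(g) {
  e <- ord[[g]]
  xy <- as.data.frame(f_ellipse(e$cov, e$center, e$scale))
  names(xy) <- c("NMDS1", "NMDS2")
  cbind(xy, level = g)
})
df_ell <- do.call(rbind, pieces)
```

Each group contributes npoints + 1 rows, so the result here has 202 rows.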
I am trying to plot a figure, but all the columns (lines) are drawn on top of each other, so the plot is not readable. Below I simulate some data, show how I plot it, and then show what I want.
I don't know how to simulate data like the example shown below, but here is what I do:
set.seed(1)
M <- matrix(rnorm(20),20,5)
x <- as.matrix(sort(runif(20, 5.0, 7.5)))
df <- as.data.frame(cbind(x,M))
After making the data frame, I will plot all columns versus the first one by melting it and using ggplot
require(ggplot2)
require(reshape)
dff <- melt(df , id.vars = 'V1')
b <- ggplot(dff, aes(V1,value)) + geom_line(aes(colour = variable))
I want to have a specific distance between each line (in this case we have 6), something like below. One dimension is V1, the other is the column number. I don't care about the function, I just want the picture.
This solution uses rgl and produces this plot:
It uses this function that accepts 3 arguments:
df : a data.frame just like your 'M' above
x : a numeric vector (or a 1-column data.frame) for the x-axis
cols : (optional) a vector of colours to repeat. If missing, black lines are drawn
Here is the function:
nik_plot <- function(df, x, cols){
require(rgl)
# if x is a 1-column data.frame, convert it to a numeric vector
if (is.data.frame(x) && ncol(x)==1)
x <- as.numeric(x[, 1])
# prepare a vector of colors, one per column (line) of df
if (missing(cols))
cols <- rep_len("#000000", ncol(df))
else
cols <- rep_len(cols, ncol(df))
# initialize an empty 3D plot
plot3d(NA, xlim=range(x), ylim=c(1, ncol(df)-1), zlim=range(df), xlab="Mass/Charge (M/Z)", ylab="Time", zlab="Ion Spectra", box=FALSE)
# draw lines, silently
silence_please <- sapply(1:ncol(df), function(i) lines3d(x=x, y=i, z=df[, i], col=cols[i]))
}
Note that you can remove require(rgl) from the function and instead call library(rgl) somewhere in your script, e.g. at the beginning.
If you don't have rgl installed, then install.packages("rgl").
Black lines, the default, may produce some moiré effect, but a repeating color palette is worse. This may be brain-dependent. A single colour would also avoid introducing an artificial dimension (and a strong one).
An example below:
# black lines
nik_plot(M, x)
# as in the image above
nik_plot(M, x, "grey40")
# an unreadable rainbow
nik_plot(M, x, rainbow(12))
The 3D window can be navigated with the mouse.
Do you need something else?
EDIT
You can build your second plot with the function below. The range of your data is very large, and I think the whole idea of shifting every line upwards prevents having a y-axis with a reliable scale. Here I have normalized all signals (0 <= signal <= 1). The parameter gap can also be used to play with this. We could disconnect the two behaviours but I think it's nice as it is. Try different values of gap and see the examples below.
df : a data.frame just like your 'M' above
x : a numeric vector (or a 1-column data.frame) for the x-axis
cols : (optional) a vector of colours to repeat. If missing, black lines are drawn
gap : gap factor between individual lines
more_gap_each : every n lines, a bigger gap is produced...
more_gap_relative : ... and will be gap x more_gap_relative wide
Here is the function:
nik_plot2D <- function(df, x, cols, gap=10, more_gap_each=1, more_gap_relative=0){
if (is.data.frame(x) && ncol(x)==1)
x <- as.numeric(x[, 1])
# we normalize ( 0 <= signal <= 1)
df <- df-min(df)
df <- (df/max(df))
# we prepare a vector of colors
if (missing(cols))
cols <- rep_len("#00000055", ncol(df))
else
cols <- rep_len(cols, ncol(df))
# we prepare gap handling. there is probably a more elegant way
gaps <- 1
for (i in 2:ncol(df))
gaps[i] <- gaps[i - 1] + 1/gap + ifelse((i %% more_gap_each) == 0, (1/gap)*more_gap_relative, 0)
# we initialize the plot
plot(NA, xlim=range(x), ylim=c(min(df), 1+max(gaps)), xlab="Time", ylab="", axes=FALSE, mar=rep(0, 4))
axis(1)
# finally, the lines
silent <- lapply(1:ncol(df), function(i) lines(x, df[, i] + gaps[i], col=cols[i]))
}
We can use it with (default):
nik_plot2D(M, x) # gap=10
And you obtain this plot:
or:
nik_plot2D(M, x, gap=50)
or, with colors:
nik_plot2D(M, x, gap=20, cols=1:3)
nik_plot2D(M, x, gap=20, cols=rep(1:3, each=5))
or, still with colours but with larger gaps:
nik_plot2D(M, x, gap=20, cols=terrain.colors(10), more_gap_each = 1, more_gap_relative = 0) # no gap by default
nik_plot2D(M, x, gap=20, cols=terrain.colors(10), more_gap_each = 10, more_gap_relative = 4) # large gaps every 10 lines
nik_plot2D(M, x, gap=20, cols=terrain.colors(10), more_gap_each = 5, more_gap_relative = 2) # small gaps every 5 lines
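The normalisation and gap handling inside nik_plot2D can be followed on a tiny matrix. This is a minimal sketch with made-up values, assuming more_gap_relative = 0 so that every line is simply 1/gap above the previous one:

```r
# Toy matrix: one signal per column (values assumed for illustration)
M <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3)

# Min-max normalisation over the whole matrix, as in nik_plot2D
norm <- M - min(M)
norm <- norm / max(norm)

# Baseline of each line when more_gap_relative = 0:
# the first line sits at 1, each following line is 1/gap higher
gap <- 10
offsets <- 1 + (seq_len(ncol(M)) - 1) / gap

# Values that would actually be drawn, one column per line
shifted <- sweep(norm, 2, offsets, "+")
```

A larger gap therefore packs the lines closer together, since the spacing is 1/gap in normalised units.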
As others have pointed out, your data have very large peaks and it's not clear whether you want to allow some curves to overlap:
m <- read.table("~/Downloads/M.txt", head=T)
fudge <- 0.05
shifty <- function(m, fudge=1){
shifts <- fudge * max(abs(apply(m, 2, diff))) * seq(0, ncol(m)-1)
m + matrix(shifts, nrow=nrow(m), ncol=ncol(m), byrow=TRUE)
}
par(mfrow=c(1,2), mar=c(0,0,1,0))
cols <- colorRampPalette(blues9[4:9])(ncol(m))
matplot(shifty(m), t="l", lty=1, bty="n", yaxt="n", xaxt="n", ylab="", col=cols)
title("no overlap")
matplot(shifty(m, 0.05), t="l", lty=1, bty="n", yaxt="n", xaxt="n", ylab="", col=cols)
title("some overlap")
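The shift logic in shifty can be followed on a tiny matrix. This is a minimal sketch with made-up values (the 3 x 3 matrix is an assumption, not the asker's data): the largest jump between consecutive rows sets the step, and each column is lifted by a multiple of it.

```r
# Same function as above, repeated so the example is self-contained
shifty <- function(m, fudge = 1) {
  shifts <- fudge * max(abs(apply(m, 2, diff))) * seq(0, ncol(m) - 1)
  m + matrix(shifts, nrow = nrow(m), ncol = ncol(m), byrow = TRUE)
}

# Toy data: three short curves, one per column (values are illustrative)
m <- cbind(a = c(0, 1, 0), b = c(0, 2, 0), d = c(1, 0, 1))

# Largest absolute change between consecutive rows in any column
step <- max(abs(apply(m, 2, diff)))

shifted <- shifty(m)             # columns lifted by 0, step, 2 * step
shifted_half <- shifty(m, 0.5)   # smaller fudge: curves may overlap
```

With fudge = 1 consecutive curves are separated by the largest single jump in the data; a fudge below 1 trades that separation for a more compact plot.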
Alternatively, some outlier/peak detection scheme could be used to filter them out before calculating the shift between curves,
library(outliers)
shifty2 <- function(m, outliers = 10){
tmp <- m
for(ii in seq_len(outliers)) tmp <- rm.outlier(tmp, median = TRUE)
shifts <- max(abs(apply(tmp, 2, diff))) * seq(0, ncol(m)-1)
m + matrix(shifts, nrow=nrow(m), ncol=ncol(m), byrow=TRUE)
}
matplot(shifty2(m), t="l", lty=1, bty="n", yaxt="n", xaxt="n", ylab="", col=cols)
(there are probably good algorithms to decide which points to remove, but I don't know them)
For 3D plotting I prefer the rgl package. This should be close to your desired solution.
The color of each scan changes on every third one.
library(rgl)
M<-read.table("M.txt", sep="\t", header = TRUE, colClasses = "numeric")
x<-read.table("x.txt", sep="\t", header = TRUE)
n<-ncol(M)
M[M<1]<-1
plot3d(x='', xlim=range(x$Time), ylim=c(1, n), zlim=(range(M)), box=FALSE)
sapply(seq(1,n), function(t){lines3d(x$Time, y=t*10, z=(M[,t])/10000, col=t/3+1)})
title3d(xlab="scan", ylab="time", zlab="intensity")
title3d(main ="Extracted Spectra Subset")
axes3d()
#axis3d(edge="x")
#axis3d(edge="y")
#axis3d(edge="z")
Due to the huge differences in magnitude of the data points, I needed to scale some factors to make a readable graph. The intensity goes from 0 to nearly 1,000,000, which distorts the graph. I attempted to normalize by taking the ln, but the plot became unreadable.
I have got wind data for some stations. The data includes station latitude, longitude, wind speed and wind direction for each station in a csv file. This data is not regularly spaced data. I have a requirement to draw streamlines for this data in R language.
I tried a couple of packages that I found by searching the internet, rasterVis for streamplot() and TeachingDemos for my.symbols, but I was not successful.
Here is an example plot I was talking about.
http://wx.gmu.edu/dev/clim301/850stream.png
Also here is some sample data from csv file that I got for which I was trying to draw streamlines.
longitude,latitude,windspeed,winddirection
84.01,20,1.843478261,126.6521739
77.13,28.48,3.752380952,138.952381
77.2,28.68,2.413333333,140.2666667
78.16,31.32,1.994444444,185.0555556
77.112,31.531,2.492,149.96
77,28.11,7.6,103
77.09,31.5,1.752631579,214.8947368
76.57,31.43,1.28,193.6
77.02,32.34,3.881818182,264.4545455
77.15,28.7,2.444,146.12
77.35,30.55,3.663157895,131.3684211
75.5,29.52,4.175,169.75
72.43,24.17,2.095,279.3
76.19,25.1,1.816666667,170
76.517,30.975,1.284210526,125.6315789
76.13,28.8,4.995,126.7
75.04,29.54,4.09,151.85
72.3,24.32,0,359
72.13,23.86,1.961111111,284.7777778
74.95,30.19,3.032,137.32
73.16,22.36,1.37,251.8
75.84,30.78,3.604347826,125.8695652
73.52,21.86,1.816666667,228.9166667
70.44,21.5,2.076,274.08
69.75,21.36,3.81875,230
78.05,30.32,0.85625,138.5625
Can someone please help me out in drawing streamlines for the irregular wind data?
Like you, I wanted to visualize the same kind of data as streamlines and I failed to find a function that would do the trick... so I worked up my own crude function:
streamlines <- function(x, y, u, v, step.dist=NULL,
max.dist=NULL, col.ramp=c("white","black"),
fade.col=NULL, length=0.05, ...) {
## Function for adding smoothed vector lines to a plot.
## Interpolation powered by akima package
## step.dist - distance between interpolated locations (user coords)
## max.dist - maximum length of interpolated line (user coords)
## col.ramp - colours to be passed to colorRampPalette
## fade.col - NULL or colour to add fade effect to interpolated line
## ... - further arguments to pass to arrows
## build smoothed lines using interp function
maxiter <- max.dist/step.dist
l <- replicate(5, matrix(NA, length(x), maxiter), simplify=FALSE)
names(l) <- c("x","y","u","v","col")
l$x[,1] <- x
l$y[,1] <- y
l$u[,1] <- u
l$v[,1] <- v
for(i in seq(maxiter)[-1]) {
l$x[,i] <- l$x[,i-1]+(l$u[,i-1]*step.dist)
l$y[,i] <- l$y[,i-1]+(l$v[,i-1]*step.dist)
r <- which(l$x[,i]==l$x[,i-1] & l$y[,i]==l$y[,i-1])
l$x[r,i] <- NA
l$y[r,i] <- NA
for(j in seq(length(x))) {
if(!is.na(l$x[j,i])) {
l$u[j,i] <- c(interp(x, y, u, xo=l$x[j,i], yo=l$y[j,i])$z)
l$v[j,i] <- c(interp(x, y, v, xo=l$x[j,i], yo=l$y[j,i])$z)
}
}
}
## make colour a function of speed and fade line
spd <- sqrt(l$u^2 + l$v^2) # speed
spd <- apply(spd, 1, mean, na.rm=TRUE) # mean speed for each line
spd.int <- seq(min(spd, na.rm=TRUE), max(spd, na.rm=TRUE), length.out=maxiter)
cr <- colorRampPalette(col.ramp)
cols <- as.numeric(cut(spd, spd.int, include.lowest=TRUE)) # include.lowest keeps the slowest line from becoming NA
ncols <- max(cols, na.rm=TRUE)
cols <- cr(ncols)[cols]
if(is.null(fade.col)) {
l$col <- replicate(maxiter, cols)
} else {
nfade <- apply(!is.na(l$x), 1, sum)
for(j in seq(length(x))) {
l$col[j,seq(nfade[j])] <- colorRampPalette(c(fade.col, cols[j]))(nfade[j])
}
}
## draw arrows
for(j in seq(length(x))) {
arrows(l$x[j,], l$y[j,], c(l$x[j,-1], NA), c(l$y[j,-1], NA),
col=l$col[j,], length=0, ...)
i <- which.max(which(!is.na(l$x[j,]))) # draw arrow at end of line
if(i>1) {
arrows(l$x[j,i-1], l$y[j,i-1], l$x[j,i], l$y[j,i],
col=l$col[j,i-1], length=length, ...)
}
}
}
The function is powered by the interp function in the akima package and, with some fiddling, it can produce some half decent visuals:
dat <- "longitude,latitude,windspeed,winddirection
84.01,20,1.843478261,126.6521739
77.13,28.48,3.752380952,138.952381
77.2,28.68,2.413333333,140.2666667
78.16,31.32,1.994444444,185.0555556
77.112,31.531,2.492,149.96
77,28.11,7.6,103
77.09,31.5,1.752631579,214.8947368
76.57,31.43,1.28,193.6
77.02,32.34,3.881818182,264.4545455
77.15,28.7,2.444,146.12
77.35,30.55,3.663157895,131.3684211
75.5,29.52,4.175,169.75
72.43,24.17,2.095,279.3
76.19,25.1,1.816666667,170
76.517,30.975,1.284210526,125.6315789
76.13,28.8,4.995,126.7
75.04,29.54,4.09,151.85
72.3,24.32,0,359
72.13,23.86,1.961111111,284.7777778
74.95,30.19,3.032,137.32
73.16,22.36,1.37,251.8
75.84,30.78,3.604347826,125.8695652
73.52,21.86,1.816666667,228.9166667
70.44,21.5,2.076,274.08
69.75,21.36,3.81875,230
78.05,30.32,0.85625,138.5625"
tf <- tempfile()
writeLines(dat, tf)
dat <- read.csv(tf)
library(rgdal) # for projecting locations to utm coords
library(akima) # for interpolation
## add utm coords
xy <- as.data.frame(project(cbind(dat$longitude, dat$latitude), "+proj=utm +zone=43 +datum=NAD83"))
names(xy) <- c("easting","northing")
dat <- cbind(dat, xy)
## add u and v coords
dat$u <- -dat$windspeed*sin(dat$winddirection*pi/180)
dat$v <- -dat$windspeed*cos(dat$winddirection*pi/180)
#par(bg="black", fg="white", col.lab="white", col.axis="white")
plot(northing~easting, data=dat, type="n", xlab="Easting (m)", ylab="Northing (m)")
streamlines(dat$easting, dat$northing, dat$u, dat$v,
step.dist=1000, max.dist=50000, col.ramp=c("blue","green","yellow","red"),
fade.col="white", length=0, lwd=5)
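The u and v lines above rely on the meteorological convention that a wind direction is the bearing the wind blows from, measured clockwise from north. Assuming the CSV follows that convention (the question does not say), the conversion can be checked on a few obvious cases. wind_uv is a hypothetical helper name used only for this sketch:

```r
# Convert speed and "blowing from" direction (degrees) to u/v components.
# u is the eastward component, v the northward component.
wind_uv <- function(speed, direction_deg) {
  rad <- direction_deg * pi / 180
  data.frame(u = -speed * sin(rad), v = -speed * cos(rad))
}

# Three easy cases at 5 units of speed:
# from the west (270), from the south (180), from the north (0)
uv <- wind_uv(c(5, 5, 5), c(270, 180, 0))
round(uv, 10)
```

A wind from the west should move air eastwards (positive u), and a wind from the south should move it northwards (positive v); if the arrows in a plot point the other way, the direction column probably uses the opposite convention.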
I do not think this would be enough data to do what you request:
require(plotrix)
require(maps)
map("world",xlim=c(69,85),ylim= c(20,35))
with(dat,
vectorField(windspeed, winddirection, longitude, latitude , vecspec="deg") )
After staring at the output a bit, I think there may be problems with how I am using that function or with the function itself. The orientations of the arrows seem wrong. Likewise I think the TeachingDemos vector field is not well done, but here is what I get:
require(TeachingDemos)
map("world",xlim=c(69,85),ylim= c(20,35))
with(dat, my.symbols(x=longitude, y=latitude,
symb= ms.arrows, length=windspeed/10, angle=2*pi*winddirection/360))
This plot seems to have sufficient variation in direction, but the arrow heads seem to vary erratically in size. In any event, neither of these plots suggests that this data can be used to construct streamlines. The data is both too sparse and internally contradictory as far as wind direction at adjacent locations is concerned.