Manually create a dendrogram in R

I am trying to create a dendrogram from similarity scores that I did not obtain through hclust or any other clustering function. I have two branches and just want to draw them according to how similar they are, and then have them branch off.
A and B are 0.5 similar
A is 0.2 unique
B is 0.3 unique
So the total height of A is 0.7 and the total height of B is 0.8, where 0.5 of their branches are shared.
The following just makes two branches without a long branch connecting the two leaves. There is this similar question, but it doesn't quite help!
x <- list(1, 2)
## attach "leaf" and "label" attributes to leaf nodes
attr(x[[1]], "leaf") <- TRUE
attr(x[[2]], "leaf") <- TRUE
attr(x[[1]], "label") <- "A"
attr(x[[2]], "label") <- "B"
## set "height" attributes for all nodes
attr(x, "height") <- 1
attr(x[[1]], "height") <- (1-0.7)
attr(x[[2]], "height") <- (1-0.8)
## set "midpoints" attributes for all nodes
attr(x, "midpoint") <- 1
attr(x[[1]], "midpoint") <- 0.5
attr(x[[2]], "midpoint") <- 0.5
## set "members" attributes for all nodes
attr(x, "members") <- 2
attr(x[[1]], "members") <- 1
attr(x[[2]], "members") <- 1
## set class as "dendrogram"
class(x) <- "dendrogram"
x
plot(x)

You can write a small helper function that attaches attributes to a node and use it to build the leaves. Set the height attribute on each leaf and the total height on the root. n and n1 are the leaves for your A and B; n2 combines the two leaves and becomes a dendrogram once its class is changed.
Attr = function(o, plus_) {
if (!missing(plus_)) for (n in names(plus_)) { attr(o, n) = plus_[[n]]; }
o
}
n = Attr("A", list(label = "A", members = 1, height = 0.2, leaf = TRUE))
n1 = Attr("B", list(label = "B", members = 1, height = 0.3, leaf = TRUE))
n2 = Attr(list(n, n1), list(members = 2, height = 1, midpoint = 0.5))
class(n2) = 'dendrogram'
plot(n2)
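If the goal is for the drawn branch lengths to equal the unique parts (0.2 for A, 0.3 for B), one option is to derive the leaf heights from the root height instead of hard-coding them. The following is only a sketch under that assumption: make_leaf is a hypothetical helper, and the shared 0.5 stem is not represented, because a two-leaf dendrogram has a single internal node and, as far as I know, plot() draws no edge above the root.

```r
# Hypothetical helper (not part of the answer above): build one leaf node.
make_leaf <- function(label, height) {
  structure(label, label = label, members = 1L, height = height, leaf = TRUE)
}

unique_len  <- c(A = 0.2, B = 0.3)  # unshared branch lengths from the question
root_height <- max(unique_len)      # assumption: the longest branch ends at height 0

dend <- structure(
  list(make_leaf("A", root_height - unique_len[["A"]]),
       make_leaf("B", root_height - unique_len[["B"]])),
  members = 2L, height = root_height, midpoint = 0.5, class = "dendrogram"
)

# Drawn branch length of a leaf = root height - leaf height
branch_len <- vapply(unclass(dend),
                     function(node) attr(dend, "height") - attr(node, "height"),
                     numeric(1))
plot(dend)
```

Here branch_len comes out as 0.2 and 0.3 (up to floating-point error), so the two tips end at different heights rather than being aligned at zero.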

Related

How to rotate nodes of a time-calibrated phylogenetic tree to match a particular order in R?

I have a time-calibrated phylogenetic tree from BEAST and I would like to make a figure in which its nodes are rotated to match an arbitrary ordering. The following code works perfectly to plot the tree with the nodes in the order they are in the input file.
library("phytools")
library("phyloch")
library("strap")
library("coda")
t <- read.beast("mcctree.tre") # I couldn't upload the file here
t$root.time <- t$height[1]
num_taxa <- length(t$tip.label)
display_all_node_bars <- TRUE
names_list <-vector()
for (name in t$tip){
v <- strsplit(name, "_")[[1]]
if(display_all_node_bars){
names_list = c(names_list, name)
}
else if(v[length(v)]=="0"){
names_list = c(names_list, name)
}
}
nids <- vector()
pos <- 1
len_nl <- length(names_list)
for(n in names_list){
for(nn in names_list[pos:len_nl]){
if(n != nn){
m <- getMRCA(t,c(n, nn))
if(m %in% nids == FALSE){
nids <- c(nids, m)
}
}
}
pos <- pos+1
}
pdf("tree.pdf", width = 20, height = 20)
geoscalePhylo(tree = t,
x.lim = c(-2,21),
units = c("Epoch"),
tick.scale = "myr",
boxes = FALSE,
width = 1,
cex.tip = 2,
cex.age = 3,
cex.ts = 2,
erotate = 0,
label.offset = 0.1)
lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv)
for(nv in nids){
bar_xx_a <- c(lastPP$xx[nv]+t$height[nv-num_taxa]-t$"height_95%_HPD_MIN"[nv-num_taxa],
lastPP$xx[nv]-(t$"height_95%_HPD_MAX"[nv-num_taxa]-t$height[nv-num_taxa]))
lines(bar_xx_a, c(lastPP$yy[nv], lastPP$yy[nv]), col = rgb(0, 0, 1, alpha = 0.3), lwd = 12)
}
t$node.label <- t$posterior
p <- character(length(t$node.label))
p[t$node.label >= 0.95] <- "black"
p[t$node.label < 0.95 & t$node.label >= 0.75] <- "gray"
p[t$node.label < 0.75] <- "white"
nodelabels(pch = 21, cex = 1.5, bg = p)
dev.off()
The following code is my attempt to rotate the nodes in the way I want (following this tutorial: http://blog.phytools.org/2015/04/finding-closest-set-of-node-rotations.html). And it works for rotating the nodes. However, the blue bars indicating the confidence intervals of the divergence time estimates get out of their correct place - this is what I would like help to correct. This will be used in much larger files with hundreds of branches - the example here is simplified.
new.order <- c("Sp8","Sp9","Sp10","Sp7","Sp6","Sp5","Sp4","Sp2","Sp3","Ou1","Ou2","Sp1")
t2 <- setNames(1:Ntip(t), new.order)
new.order.tree <- minRotate(t, t2)
new.order.tree$root.time <- t$root.time
new.order.tree$height <- t$height
new.order.tree$"height_95%_HPD_MIN" <- t$"height_95%_HPD_MIN"
new.order.tree$"height_95%_HPD_MAX" <- t$"height_95%_HPD_MAX"
pdf("reordered_tree.pdf", width = 20, height = 20)
geoscalePhylo(tree = new.order.tree,
x.lim = c(-2,21),
units = c("Epoch"),
tick.scale = "myr",
boxes = FALSE,
width = 1,
cex.tip = 2,
cex.age = 3,
cex.ts = 2,
erotate = 0,
label.offset = 0.1)
lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv)
for(nv in nids){
bar_xx_a <- c(lastPP$xx[nv]+new.order.tree$height[nv-num_taxa]-new.order.tree$"height_95%_HPD_MIN"[nv-num_taxa],
lastPP$xx[nv]-(new.order.tree$"height_95%_HPD_MAX"[nv-num_taxa]-new.order.tree$height[nv-num_taxa]))
lines(bar_xx_a, c(lastPP$yy[nv], lastPP$yy[nv]), col = rgb(0, 0, 1, alpha = 0.3), lwd = 12)
}
new.order.tree$node.label <- t$posterior
p <- character(length(new.order.tree$node.label))
p[new.order.tree$node.label >= 0.95] <- "black"
p[new.order.tree$node.label < 0.95 & new.order.tree$node.label >= 0.75] <- "gray"
p[new.order.tree$node.label < 0.75] <- "white"
nodelabels(pch = 21, cex = 1.5, bg = p)
dev.off()
I've found several similar questions here and in other forums, but none dealing specifically with time-calibrated trees - which is the core of the problem described above.
The short answer is that phytools::minRotate() doesn't recognize the confidence intervals as associated with nodes. If you contact the phytools maintainers, they may well be able to add this functionality quite easily.
Meanwhile, you can correct this yourself.
I don't know how read.beast() saves confidence intervals – let's say they're saved in t$conf.int. (Type unclass(t) at the R command line to see the full structure; you should be able to identify the appropriate property.)
If the tree's node labels are unique, then you can infer the new sequence of nodes using match():
library("phytools")
new.order <- c("Sp8","Sp9","Sp10","Sp7","Sp6","Sp5","Sp4","Sp2","Sp3","Ou1","Ou2","Sp1")
# Set up a fake initial tree -- you would load the tree from a file
tree <- rtree(length(new.order))
tree$tip.label <- sort(new.order)
tree$node.label <- seq_len(tree$Nnode)
tree$conf.int <- seq_len(tree$Nnode) * 10
# Plot tree
par(mfrow = c(1, 2), mar = rep(0, 4), cex = 0.9) # Create space
plot(tree, show.node.label = TRUE)
nodelabels(tree$conf.int, adj = 1) # Annotate "correct" intervals
# Re-order nodes with minRotate
noTree <- minRotate(tree, setNames(seq_along(new.order), new.order))
plot(noTree, show.node.label = TRUE)
# Move confidence intervals to correct node
tree$conf.int <- tree$conf.int[match(noTree$node.label, tree$node.label)]
nodelabels(tree$conf.int, adj = 1)
If you can't guarantee that the node labels are unique, you can always overwrite them in a temporary object:
# Find node order
treeCopy <- tree
treeCopy$node.label <- seq_len(tree$Nnode)
nodeOrder <- match(minRotate(treeCopy, setNames(seq_along(new.order), new.order))$node.label, treeCopy$node.label)
# Apply node order
tree$conf.int <- tree$conf.int[nodeOrder]
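The match() re-indexing can be seen in isolation with plain vectors. The labels and interval bounds below are made up for illustration; in the question's tree, the per-node vectors (height, "height_95%_HPD_MIN", "height_95%_HPD_MAX" and posterior) would presumably each need the same permutation.

```r
# Made-up node labels and interval bounds, in the ORIGINAL node order
old_labels <- c("n13", "n14", "n15", "n16")
hpd_min    <- c(1.2, 0.8, 0.3, 0.1)
hpd_max    <- c(2.5, 1.9, 0.9, 0.4)

# Node labels as they might appear AFTER rotation (same nodes, new order)
new_labels <- c("n13", "n15", "n16", "n14")

# For each node in the new order, find its position in the old order
node_order <- match(new_labels, old_labels)

# Re-index every per-node vector with the same permutation
reordered <- data.frame(
  label = old_labels[node_order],
  min   = hpd_min[node_order],
  max   = hpd_max[node_order],
  stringsAsFactors = FALSE
)
reordered
```

Because a single permutation is computed once and applied to every vector, the bounds stay paired with the node they were estimated for.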

How to compute nearest distance between points?

This is a tmp set of points with (x, y) coordinates and 0 or 1 categories.
tmp <- structure(list(cx = c(146.60916, 140.31737, 145.92917, 167.57799,
166.77618, 137.64381, 172.12157, 175.32881, 175.06154, 135.50566,
177.46696, 148.06731), cy = c(186.29814, 180.55231, 210.6084,
210.34111, 185.48505, 218.89375, 219.69554, 180.67421, 188.15775,
209.27205, 209.27203, 178.00151), category = c(1, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0)), class = "data.frame", row.names = c(NA,
-12L))
I need to find the minimum spanning tree for category = 1 points, then to join (add edge) each point with category = 0 to its nearest category = 1 point.
The minimum spanning tree is built on points with the category = 1.
ones <- tmp[tmp$category == 1,]
n <- dim(ones)[1]
d <- matrix(0, n, n)
d <- as.matrix(dist(cbind(ones$cx, ones$cy)))
g1 <- graph.adjacency(d, weighted=TRUE, mode="undirected")
V(g1)$name <- tmp[tmp$category == 1,]$Name
mylayout = as.matrix(cbind(ones$cx, -ones$cy))
mst <- minimum.spanning.tree(g1) # Find a minimum spanning tree
plot(mst, layout=mylayout,
vertex.size = 10,
vertex.label = V(g1)$name,
vertex.label.cex =.75,
edge.label.cex = .7,
)
Expected result is in center of figure.
My current attempt is:
n <- dim(tmp)[1]
d <- matrix(0, n, n)
d <- as.matrix(dist(cbind(tmp$cx, tmp$cy)))
d[tmp$category %*% t(tmp$category) == 1] = Inf
d[!sweep(d, 2, apply(d, 2, min), `==`)] <- 0
g2 <- graph.adjacency(d, weighted=TRUE, mode="undirected")
mylayout = as.matrix(cbind(tmp$cx, -tmp$cy))
V(g2)$name <- tmp$Name
plot(g2, layout=mylayout,
vertex.size = 10,
vertex.label = V(g2)$name,
vertex.label.cex =.75,
edge.label = round(E(g2)$weight, 3),
edge.label.cex = .7,
)
As one can see, I have found the minimum distance, but only one edge is added.
Question: how do I define the condition so that it applies to all possible points?
You can try the code below
library(igraph)
# two categories of point data frames
pts1 <- subset(tmp, category == 1)
pts0 <- subset(tmp, category == 0)
# generate minimum spanning tree `gmst`
gmst <- mst(graph_from_adjacency_matrix(as.matrix(dist(pts1[1:2])), mode = "undirected", weighted = TRUE))
# distance matrix between `pts0` and `pts1`
pts0_pts1 <- as.matrix(dist(tmp[1:2]))[row.names(pts0), row.names(pts1)]
# minimum distances of `pts0` to `pts1`
idx <- max.col(-pts0_pts1)
df0 <- data.frame(
from = row.names(pts0),
to = row.names(pts1)[idx],
weight = pts0_pts1[cbind(1:nrow(pts0), idx)]
)
# aggregate edges lists and produce final result
g <- graph_from_data_frame(rbind(get.data.frame(gmst), df0), directed = FALSE) %>%
set_vertex_attr(name = "color", value = names(V(.)) %in% names(V(gmst)))
mylayout <- as.matrix(tmp[names(V(g)), 1:2]) %*% diag(c(1, -1))
plot(g, edge.label = round(E(g)$weight, 1), layout = mylayout)
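and you will get the combined graph: the minimum spanning tree over the category = 1 points, plus one edge from each category = 0 point to its nearest category = 1 point.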
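The nearest-point step can also be sketched in base R alone, which may make the logic easier to follow. The coordinates below are made up, and the column names simply mirror those of tmp. One difference from the code above: which.min always takes the first minimum, whereas max.col breaks ties at random by default.

```r
# Made-up points: two category-1 points and three category-0 points
pts <- data.frame(
  cx = c(0, 10, 1, 9, 4),
  cy = c(0, 0, 1, 1, 0),
  category = c(1, 1, 0, 0, 0)
)
ones  <- pts[pts$category == 1, ]
zeros <- pts[pts$category == 0, ]

# Cross-distance matrix: one row per category-0 point,
# one column per category-1 point
d <- sqrt(outer(zeros$cx, ones$cx, "-")^2 + outer(zeros$cy, ones$cy, "-")^2)

# Column index of the closest category-1 point for each row
nearest <- apply(d, 1, which.min)

# Edge list joining every category-0 point to its nearest category-1 point
edges <- data.frame(
  from   = rownames(zeros),
  to     = rownames(ones)[nearest],
  weight = d[cbind(seq_len(nrow(d)), nearest)],
  stringsAsFactors = FALSE
)
edges
```

The resulting edges data frame has the same from/to/weight layout as df0 above, so it could be bound to the spanning-tree edge list in the same way.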

How to calculate a maximum-bottleneck path with igraph?

Given a capacity network with a single source and a single sink, how can I calculate the maximum-bottleneck path (also known as the widest path or maximum capacity path problem) using igraph?
I've read (e.g. here or even with pseudocode there) that it is possible with some modifications to Dijkstra’s algorithm, but I would rather not dive into algorithm development and would like to use igraph instead.
Example
library(igraph)
set.seed(21)
nodes = cbind(
'id' = c('Fermenters', 'Methanogens', 'carbs', 'CO2', 'H2', 'other', 'CH4', 'H2O')
)
from <- c('carbs', rep('Fermenters', 3), rep('Methanogens', 2), 'CO2', 'H2')
to <- c('Fermenters', 'other', 'CO2', 'H2', 'CH4', 'H2O', rep('Methanogens', 2))
weight <- sample(1 : 20, 8)
links <- data.frame(from, to, weight, stringsAsFactors = FALSE)
net = graph_from_data_frame(links, vertices = nodes, directed = T)
## Calculate max-bottleneck here !
# # disabled because just vis
# plot(net, edge.width = E(net)$weight)
# require(networkD3)
# require(tidyverse)
#
# d3net <- igraph_to_networkD3(net, group = rep(1, 8))
# forceNetwork(
# Links = mutate(d3net$links, weight = E(net)$weight), Nodes = d3net$nodes,
# Source = 'source', Target = 'target',
# NodeID = 'name', Group = "group", Value = "weight",
# arrows = TRUE, opacity = 1, opacityNoHover = 1
# )
So with respect to the example, how would I calculate the maximum capacity path from carbs to H2O?
I don't know how efficient this would be, but you could use igraph to find all "simple" paths, then calculate the minimum edge weight of each, then choose the max...
require(tibble)
require(igraph)
nodes = tibble('id' = c('A', "B", "C", "D")) # tibble() replaces the deprecated data_frame()
links = tribble(
~from, ~to, ~weight,
"A" , "B", 10,
"B", "C", 10,
"C", "D", 6,
"A", "D", 4,
)
net = graph_from_data_frame(links, vertices = nodes, directed = T)
simple_paths <- all_simple_paths(net, "A", "D")
simple_paths[which.max(
sapply(simple_paths, function(path) {
min(E(net, path = path)$weight)
})
)]
# [[1]]
# + 4/4 vertices, named, from 061ab8d:
# [1] A B C D
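You could try the same idea as in IGRAPH IN R: Find the path between vertices that maximizes the product of edge attributes. Divide the weights by their total so that they are all < 1 (which keeps the negated log-weights positive), find the shortest path on those negated log-weights, and take the minimum weight along that path. Note that this maximizes the product of the normalized weights, which is not guaranteed to coincide with the maximum-bottleneck path on every graph: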
x<-shortest_paths(net,3,8, weights=-log(E(net)$weight/sum(E(net)$weight)), output="epath")[[2]]
E(net)[x[[1]]]
min(E(net)$weight[x[[1]]])
which gives
+ 4/8 edges from 57589bc (vertex names):
[1] carbs ->Fermenters Fermenters ->H2 H2 ->Methanogens Methanogens->H2O
[1] 10
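The modification to Dijkstra's algorithm mentioned in the question can be sketched in base R without igraph. This is a minimal, unoptimized illustration and not an igraph API: widest_path is a hypothetical function, and it assumes a directed edge list with from, to and weight columns. The example reuses the A/B/C/D graph from the first answer.

```r
# Hypothetical helper: widest (maximum-bottleneck) path via a Dijkstra variant.
widest_path <- function(edges, source, target) {
  nodes <- unique(c(edges$from, edges$to))
  # Best bottleneck found so far from `source` to each node
  width <- setNames(rep(-Inf, length(nodes)), nodes)
  prev  <- setNames(rep(NA_character_, length(nodes)), nodes)
  width[source] <- Inf
  todo <- nodes
  while (length(todo) > 0) {
    # Settle the unsettled node with the largest bottleneck
    u <- todo[which.max(width[todo])]
    todo <- setdiff(todo, u)
    if (u == target || width[u] == -Inf) break
    out <- edges[edges$from == u, , drop = FALSE]
    for (i in seq_len(nrow(out))) {
      v <- out$to[i]
      # A path is only as wide as its narrowest edge
      cand <- min(width[u], out$weight[i])
      if (cand > width[v]) {
        width[v] <- cand
        prev[v]  <- u
      }
    }
  }
  # Walk the predecessor chain back from the target
  path <- target
  while (!is.na(prev[path[1]])) path <- c(prev[path[1]], path)
  list(path = unname(path), bottleneck = unname(width[target]))
}

links <- data.frame(
  from   = c("A", "B", "C", "A"),
  to     = c("B", "C", "D", "D"),
  weight = c(10, 10, 6, 4),
  stringsAsFactors = FALSE
)
res <- widest_path(links, "A", "D")
res
```

On this graph the direct edge A -> D has a bottleneck of 4, while the path through B and C has a bottleneck of 6, so the sketch returns A, B, C, D.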

Determine transects perpendicular to a (coast)line in R

I'd like to automatically derive transects perpendicular to the coastline. I need to be able to control their length and spacing, and their orientation needs to be on the "correct" side of the line. I came up with a way to do that, but selecting the "correct" side in particular (it needs to point to the ocean) could be done better. General approach:
1. For each line segment in a SpatialLinesDataFrame, define the transect locations.
2. Define the transect: in both directions perpendicular to the coastline, create the points that determine the transect.
3. Create a polygon based on the coastline, add extra points to grow the polygon in a known direction, and use it to clip the points that fall inside (considered land, and therefore not of interest).
4. Create the transect based on the remaining point.
Part 3 is of particular interest. I'd like a more robust method to determine the correct orientation of the transect. This is what I'm using now:
library(rgdal)
library(raster)
library(sf)
library(ggplot2)
library(rgeos) # create lines and spatial objects
# create testing lines
l1 <- cbind(c(1, 2, 3), c(3, 2, 2))
l2 <- cbind(c(1, 2, 3), c(1, 1.5, 1))
Sl1 <- Line(l1)
Sl2 <- Line(l2)
S1 <- Lines(list(Sl1), ID = "a")
S2 <- Lines(list(Sl2), ID = "b")
line <- SpatialLines(list(S1, S2))
plot(line)
# for testing:
sep <- 0.1
start <- 0
AllTransects <- vector('list', 100000) # DB that should contain all transects
for (i in 1: length(line)){
# i <- 2
###### Define transect locations
# Define geometry subset
subset_geometry <- data.frame(geom(line[i,]))[, c('x', 'y')]
# plot(SpatialPoints(data.frame(x = subset_geometry[,'x'], y = subset_geometry[,'y'])), axes = T, add = T)
dx <- c(0, diff(subset_geometry[,'x'])) # Calculate difference at each cell compared to next cell
dy <- c(0, diff(subset_geometry[,'y']))
dseg <- sqrt(dx^2+dy^2) # get rid of negatives and transfer to uniform distance per segment (pythagoras)
dtotal <- cumsum(dseg) # cumulative sum total distance of segments
linelength = sum(dseg) # total linelength
pos = seq(start,linelength, by=sep) # Array with position numbers in meters
whichseg = unlist(lapply(pos, function(x){sum(dtotal<=x)})) # Segments corresponding to distance
pos=data.frame(pos=pos, # keep only
whichseg=whichseg, # Position in meters on line
x0=subset_geometry[whichseg,1], # x-coordinate on line
y0=subset_geometry[whichseg,2], # y-coordinate on line
dseg = dseg[whichseg+1], # segment length selected (sum of all dseg in that segment)
dtotal = dtotal[whichseg], # Accumulated length
x1=subset_geometry[whichseg+1,1], # Get X coordinate on line for next point
y1=subset_geometry[whichseg+1,2] # Get Y coordinate on line for next point
)
pos$further = pos$pos - pos$dtotal # which is the next position (in meters)
pos$f = pos$further/pos$dseg # fraction next segment of its distance
pos$x = pos$x0 + pos$f * (pos$x1-pos$x0) # X Position of point on line which is x meters away from x0
pos$y = pos$y0 + pos$f * (pos$y1-pos$y0) # Y Position of point on line which is x meters away from y0
pos$theta = atan2(pos$y0-pos$y1,pos$x0-pos$x1) # Angle between points on the line in radians
pos$object = i
###### Define transects
tlen <- 0.5
pos$thetaT = pos$theta+pi/2 # Get the angle
dx_poi <- tlen*cos(pos$thetaT) # coordinates of point of interest as defined by position length (sep)
dy_poi <- tlen*sin(pos$thetaT)
# table with only the POI information
# transect is defined by x0,y0 and x1,y1 with x,y the coordinate on the line
output <- data.frame(pos = pos$pos,
x0 = pos$x + dx_poi, # X coordinate away from line
y0 = pos$y + dy_poi, # Y coordinate away from line
x1 = pos$x - dx_poi, # X coordinate away from line
y1 = pos$y - dy_poi, # Y coordinate away from line
theta = pos$thetaT, # angle
x = pos$x, # Line coordinate X
y = pos$y, # Line coordinate Y
object = pos$object,
nextx = pos$x1,
nexty = pos$y1)
# create polygon from object to select correct segment of the transect (coastal side only)
points_for_polygon <- rbind(output[,c('x', 'y','nextx', 'nexty')])# select points
pol_for_intersect <- SpatialPolygons( list( Polygons(list(Polygon(points_for_polygon[,1:2])),1)))
# plot(pol_for_intersect, axes = T, add = T)
# Find a way to increase the polygon - should depend on the shape&direction of the polygon
# for the purpose of cropping the transects
firstForPlot <- data.frame(x = points_for_polygon$x[1], y = points_for_polygon$y[1])
lastForPlot <- data.frame(x = points_for_polygon$x[length(points_for_polygon$x)],
y = points_for_polygon$y[length(points_for_polygon$y)])
plot_first <- SpatialPoints(firstForPlot)
plot_last <- SpatialPoints(lastForPlot)
# plot(plot_first, add = T, col = 'red')
# plot(plot_last, add = T, col = 'blue')
## Corners of shape dependent bounding box
## absolute values should be depended on the shape beginning and end point relative to each other??
LX <- min(subset_geometry$x)
UX <- max(subset_geometry$x)
LY <- min(subset_geometry$y)
UY <- max(subset_geometry$y)
# polygon(x = c(LX, UX, UX, LX), y = c(LY, LY, UY, UY), lty = 2)
# polygon(x = c(LX, UX, LX), y = c(LY, LY, UY), lty = 2)
# if corners are changed to much the plot$near becomes a problem: the new points are to far away
# Different points are selected
LL_corner <- data.frame(x = LX-0.5, y = LY - 1)
LR_corner <- data.frame(x = UX + 0.5 , y = LY - 1)
UR_corner <- data.frame(x = LX, y = UY)
corners <- rbind(LL_corner, LR_corner)
bbox_add <- SpatialPoints(rbind(LL_corner, LR_corner))
# plot(bbox_add ,col = 'green', axes = T, add = T)
# Select nearest point for drawing order to avoid weird shapes
firstForPlot$near <-apply(gDistance(bbox_add,plot_last, byid = T), 1, which.min)
lastForPlot$near <- apply(gDistance(bbox_add,plot_first, byid = T), 1, which.min)
# increase polygon with corresponding points
points_for_polygon_incr <- rbind(points_for_polygon[1:2], corners[firstForPlot$near,], corners[lastForPlot$near,])
pol_for_intersect_incr <- SpatialPolygons( list( Polygons(list(Polygon(points_for_polygon_incr)),1)))
plot(pol_for_intersect_incr, col = 'blue', axes = T)
# Coordinates of points first side
coordsx1y1 <- data.frame(x = output$x1, y = output$y1)
plotx1y1 <- SpatialPoints(coordsx1y1)
plot(plotx1y1, add = T)
coordsx0y0 <- data.frame(x = output$x0, y = output$y0)
plotx0y0 <- SpatialPoints(coordsx0y0)
plot(plotx0y0, add = T, col = 'red')
# Intersect
output[, "x1y1"] <- over(plotx1y1, pol_for_intersect_incr)
output[, "x0y0"] <- over(plotx0y0, pol_for_intersect_incr)
x1y1NA <- sum(is.na(output$x1y1)) # Count Na
x0y0NA <- sum(is.na(output$x0y0)) # Count NA
# inefficient way of selecting the correct end point
# e.g. either left or right, depending on intersect
indexx0y0 <- with(output, !is.na(output$x0y0))
output[indexx0y0, 'endx'] <- output[indexx0y0, 'x1']
output[indexx0y0, 'endy'] <- output[indexx0y0, 'y1']
index <- with(output, is.na(output$x0y0))
output[index, 'endx'] <- output[index, 'x0']
output[index, 'endy'] <- output[index, 'y0']
AllTransects = rbind(AllTransects, output)
}
# Create the transects
lines <- vector('list', nrow(AllTransects))
for(n in 1: nrow(AllTransects)){
# n = 30
begin_coords <- data.frame(lon = AllTransects$x, lat = AllTransects$y) # Coordinates on the original line
end_coords <- data.frame(lon = AllTransects$endx, lat = AllTransects$endy) # coordinates as determined by the over: remove implement in row below by selecting correct column from output
col_names <- list('lon', 'lat')
row_names <- list('begin', 'end')
# dimnames < list(row_names, col_names)
x <- as.matrix(rbind(begin_coords[n,], end_coords[n,]))
dimnames(x) <- list(row_names, col_names)
lines[[n]] <- Lines(list(Line(x)), ID = as.character(n))
}
lines_sf <- SpatialLines(lines)
# plot(lines_sf)
df <- SpatialLinesDataFrame(lines_sf, data.frame(AllTransects))
plot(df, axes = T)
As long as I'm able to correctly define the bounding box and grow the polygon correctly, this works. But I'd like to try this on multiple coastlines and parts of coastlines, each with its own orientation. In the example below, the polygon is grown for the bottom coastline segment; as a result, the top one has transects pointing in the wrong direction.
Does anybody have an idea of which direction to look in? I was considering using external data, but I'd like to avoid that if possible.
I used your code for my question (measuring a line inside a polygon), but maybe this works for you:
Take a spatial polygon or line
Extract the coordinates of the element
Combine the coordinates into pairs that define straight lines, from which you can derive perpendicular lines (e.g. ((x1, x3), (y1, y3)) or ((x2, x4), (y2, y4)))
Iterate over all the pairs of coordinates
Apply the code you wrote, in particular the part that builds the 'output' table.
I did this for a polygon, so I could generate perpendicular lines from the straight line created by pairing an arbitrary (1, 3) set of coordinates.
#Define a polygon
pol <- rip[1, 1] # I took the first polygon from my Shapefile
polcoords <- pol@polygons[[1]]@Polygons[[1]]@coords
# define how to create your coords pairing. My case: 1st with 3rd, 2nd with 4th, ...
pairs <- data.frame(a = 1:( nrow(polcoords) - 1),
b = c(2:(nrow(polcoords)-1)+1, 1) )
# Empty list to store the lines
lnDfls <- list()
for (j in 1:nrow(pairs)){ # j = 1
# Select the pairs
pp <- polcoords[c(pairs$a[j], pairs$b[j]), ]
#Extract mean coord, from where the perp. line will start
midpt <- apply(pp, 2, mean)
# points(pp, col = 3, pch = 20 )
# points(midpt[1], midpt[2], col = 4, pch = 20)
x <- midpt[1]
y <- midpt[2]
theta = atan2(y = pp[2, 2] - pp[1, 2], pp[2, 1] - pp[1, 1]) # Angle between points on the line in radians
# pos$theta = atan2(y = pos$y0-pos$y1 , pos$x0-pos$x1) # Angle between points on the line in radians
###### Define transects
tlen <- 1000 # distance in m
thetaT = theta+pi/2 # Get the angle
dx_poi <- tlen*cos(thetaT) # coordinates of point of interest as defined by position length (sep)
dy_poi <- tlen*sin(thetaT)
# table with only the POI information
# transect is defined by x0,y0 and x1,y1 with x,y the coordinate on the line
output2 <- data.frame(#pos = pos,
x0 = x + dx_poi, # X coordinate away from line
y0 = y + dy_poi, # Y coordinate away from line
x1 = x - dx_poi, # X coordinate away from line
y1 = y - dy_poi # Y coordinate away from line
#theta = thetaT, # angle
#x = x, # Line coordinate X
#y = y # Line coordinate Y
)
# points(output2$x1, output2$y1, col = 2)
#segments(x, y, output2$x1[1], output2$y1[1], col = 2)
mat <- as.matrix(cbind( c( x, output2$x1[1] ) , c( y, output2$y1[1] ) ))
LL <- Lines(list(Line( mat )), ID = as.character(j))
# plot(SpatialLinesDataFrame(LL, data.frame (a = 1)), add = TRUE, col = 2)
# plot(SpatialLines(list(LL)), add = TRUE, col = 2)
#lnList[[j]] <- LL
lnDfls[[j]] <- SpatialLinesDataFrame( SpatialLines(LinesList = list(LL)) ,
match.ID = FALSE,
data.frame(id = as.character(j ) ) )
# line = st_sfc(st_linestring(mat))
# st_length(line)
# ln <- (SpatialLines(LinesList = list(LL)))
# lndf <- SpatialLinesDataFrame( lndf , data.frame(id = j ))
# sf::st_length(ln)
# # plot(lines_sf)
}
compDf <- do.call(what = sp::rbind.SpatialLines, args = lnDfls)
plot(pol)
plot(compDf, add = TRUE, col = 2)
plot(inDfLn, add = TRUE, col = 3)
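The perpendicular geometry used in both pieces of code can be isolated in a few lines of base R. This is a sketch only: perp_endpoints is a hypothetical helper, and the "left"/"right" labels are relative to the direction of travel from the first vertex to the second (the question's code takes the angle in the opposite direction, which swaps the two sides).

```r
# Hypothetical helper: both endpoints of a transect of half-length `tlen`
# through (x, y), perpendicular to the segment (x0, y0) -> (x1, y1).
perp_endpoints <- function(x0, y0, x1, y1, x, y, tlen) {
  theta  <- atan2(y1 - y0, x1 - x0)  # direction of the segment, in radians
  thetaT <- theta + pi / 2           # rotate 90 degrees counter-clockwise
  dx <- tlen * cos(thetaT)
  dy <- tlen * sin(thetaT)
  data.frame(
    side = c("left", "right"),
    x = c(x + dx, x - dx),
    y = c(y + dy, y - dy),
    stringsAsFactors = FALSE
  )
}

# Horizontal segment from (0, 0) to (2, 0), transect through its midpoint
ends <- perp_endpoints(0, 0, 2, 0, x = 1, y = 0, tlen = 0.5)
ends
```

If the coastline vertices happen to be digitized so that the sea always lies on the same side of the direction of travel, keeping only one of the two rows would be enough to orient every transect. Whether that holds is an assumption about the data, not something the code checks.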

How to colourise some cell borders in R corrplot?

I would like to draw attention to some cells by making their borders clearly distinct from everything else.
The parameter rect.col is used to colorise all borders, but I want to colorise only the borders of the cells (3,3) and (7,7), for instance, with any highlight color, e.g. from heat.colors(100) or rainbow(12).
Code:
library("corrplot")
library("psych")
ids <- seq(1,11)
M.cor <- cor(mtcars)
colnames(M.cor) <- ids
rownames(M.cor) <- ids
p.mat <- psych::corr.test(M.cor, adjust = "none", ci = F)
p.mat <- p.mat[["r"]]
corrplot(M.cor,
method = "color",
type = "upper",
tl.col = 'black',
diag = TRUE,
p.mat = p.mat,
sig.level = 0.0000005
)
Fig. 1 Output of the top code without cell bordering,
Fig. 2 Output after manually converting all coordinates to upper triangle but artifact at (10,1),
Fig. 3 Output with window size fix
Input: locations by ids (3,3) and (7,7)
Expected output: two cells where borders marked on upper triangle
Pseudocode
# ids must be id.pairs
# or just a list of two lists
createBorders <- function(id.pairs) {
lapply(id.pairs,function(z){
x <- z$V1
y <- z$V2
rect(x+0.5, y+0.5, x+1.5, y+1.5) # user20650
})
}
corrplot(...)
# TODO Which datastructure to use there in the function as the paired list of ids?
createBorders(ids.pairs)
Testing user20650's proposal
rect(2+0.5, 9+0.5, 3+0.5, 10+0.5, border="white", lwd=2)
Output in Fig. 2.
It would be great to have a function for this.
Assume you have a list of IDs.
I think there is something wrong with the placement because (2,3),(9,10) leads to the point in (2,3),(2,3).
Iterating user20650's Proposal in Chat
library("corrplot")
library("psych")
ids <- seq(1,11)
M.cor <- cor(mtcars)
colnames(M.cor) <- ids
rownames(M.cor) <- ids
p.mat <- psych::corr.test(M.cor, adjust = "none", ci = F)
p.mat <- p.mat[["r"]]
# Chat of http://stackoverflow.com/q/40538304/54964 user20650
cb <- function(corrPlot, ..., rectArgs = list() ){
lst <- list(...)
n <- ncol(corrPlot)
nms <- colnames(corrPlot)
colnames(corrPlot) <- if(is.null(nms)) 1:ncol(corrPlot) else nms
xleft <- match(lst$x, colnames(corrPlot)) - 0.5
ybottom <- n - match(lst$y, colnames(corrPlot)) + 0.5
lst <- list(xleft=xleft, ybottom=ybottom, xright=xleft+1, ytop=ybottom+1)
do.call(rect, c(lst, rectArgs))
}
plt <- corrplot(M.cor,
method = "color",
type = "upper",
tl.col = 'black',
diag = TRUE,
p.mat = p.mat,
sig.level = 0.0000005
)
cb(plt, x=c(1, 3, 5), y=c(10, 7, 4), rectArgs=list(border="white", lwd=3))
Output where only one cell border marked in Fig. 3.
Expected output: three cell borders marked
Restriction in Fig. 2 approach
You have to work all coordinates first to upper triangle.
So you can now call only the following where output has an artifact at (10,1) in Fig. 2
cb(plt, x=c(10, 7, 5), y=c(1, 3, 4), rectArgs=list(border="white", lwd=3))
Expected output: no artifact at (10,1)
The cause of the artifact can be white background, but it occurs also if the border color is red so most probably it is not the cause.
Solution - fix the window size and its output in Fig. 3
pdf("Rplots.pdf", height=10, width=10)
plt <- corrplot(M.cor,
method = "color",
type = "upper",
tl.col = 'black',
diag = TRUE,
p.mat = p.mat,
sig.level = 0.0000005
)
cb(plt, x=c(10, 7, 5), y=c(1, 3, 4), rectArgs=list(border="red", lwd=3))
dev.off()
R: 3.3.1
OS: Debian 8.5
Docs corrplot: here
My proposal, where mark.ids is still pseudocode. I found it best to pass plt and mark.ids as the arguments of corrplotCellBorders, which creates the corrplot with borders around the wanted cells
mark.ids <- {x <- c(1), y <- c(2)} # TODO pseudocode
corrplotCellBorders(plt, mark.ids)
cb(plt, x = mark.ids$x, y = mark.ids$y, rectArgs=list(border="red", lwd=3)) # cb() reads x and y by name from ...
# Chat of https://stackoverflow.com/q/40538304/54964 user20650
# createBorders.r, test.createBorders.
cb <- function(corrPlot, ..., rectArgs = list() ){
# ... pass named vector of x and y names
# for upper x > y, lower x < y
lst <- list(...)
n <- ncol(corrPlot)
nms <- colnames(corrPlot)
colnames(corrPlot) <- if(is.null(nms)) 1:ncol(corrPlot) else nms
xleft <- match(lst$x, colnames(corrPlot)) - 0.5
ybottom <- n - match(lst$y, colnames(corrPlot)) + 0.5
lst <- list(xleft=xleft, ybottom=ybottom, xright=xleft+1, ytop=ybottom+1)
do.call(rect, c(lst, rectArgs))
}
corrplotCellBorders <- function(plt, mark.ids) {
x <- mark.ids$x
y <- mark.ids$y
cb(plt, x = x, y = y, rectArgs=list(border="red", lwd=3)) # x and y must be passed by name
}
Open
How to create mark.ids such that you can call its items by mark.ids$x and mark.ids$y?
Integrate point order neutrality for the upper triangle here
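One possible way to approach both open items is sketched below. It assumes the cell ids are numeric positions, as in the examples above; to_upper is a hypothetical helper name, and the rule "x > y for the upper triangle" is taken from the comment inside cb().

```r
# A named list gives access to its items as mark.ids$x and mark.ids$y
mark.ids <- list(x = c(1, 7, 5), y = c(10, 3, 4))

# Hypothetical helper: make the order within each pair irrelevant by
# always putting the larger id in x (upper triangle, per the cb() comment)
to_upper <- function(ids) {
  list(x = pmax(ids$x, ids$y), y = pmin(ids$x, ids$y))
}

upper.ids <- to_upper(mark.ids)
upper.ids
# These could then be passed on by name, e.g.
# cb(plt, x = upper.ids$x, y = upper.ids$y, rectArgs = list(border = "red", lwd = 3))
```

With the ids above, the pair (1, 10) is swapped while (7, 3) and (5, 4) are left as they are, which gives the same coordinates as the call used for Fig. 3.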
