How to circle observed (not latent) variables in a dagitty plot - r

How would I put a circle around certain variables in the following plot?
library(dagitty)
g = dagitty('dag{
A [pos="-1,0.5"]
W [pos="0.893,-0.422"]
X [adjusted,pos="0,-0.5"]
Y [pos="1,0.5"]
A -> Y
X -> A
X -> W
X -> Y
}')
png("mp.png", width = 500, height = 500,res=300)
plot(g)
dev.off()
In the web-based tool you can mark a variable as, e.g., latent or adjusted, and that changes the color of its circle. That is not quite what I am looking for, although getting the same rendering in the plot from R would be sufficient. I also don't really like how the variable name sits next to the circle in the web-based version. What I really want is to circle the observed variables and leave the unobserved ones uncircled.

I wrote a function which takes the points you want to circle as input, extracts the position of said points and circles them.
library(dagitty)
g = dagitty('dag{
A [pos="-1,0.5"]
W [pos="0.893,-0.422"]
X [adjusted,pos="0,-0.5"]
Y [pos="1,0.5"]
A -> Y
X -> A
X -> W
X -> Y
}')
circle_points <- function(points_to_circle, g) {
#few regexs to extract the points and the positions from "g"
#can surely be optimized, made nicer and more robust but it works for now
fsplit <- strsplit(g[1], "\\]")[[1]]
fsplit <- fsplit[-length(fsplit)]
fsplit <- substr(fsplit, 1, nchar(fsplit)-1)
fsplit[1] <- substr(fsplit[1], 6, nchar(fsplit[1]))
vars <- sapply(regmatches(fsplit,
regexec("\\\n(.*?)\\s*\\[", fsplit)), "[", 2)
pos <- sub(".*pos=\\\"", "", fsplit)
#build dataframe with extracted information
res_df <- data.frame(vars = vars,
posx = sapply(strsplit(pos, ","), "[",1),
posy = sapply(strsplit(pos, ","), "[",2))
df_to_circle <- res_df[res_df$vars %in% points_to_circle,]
#y-position seems to be inverted and has to be multiplied by -1
points(c(as.numeric(df_to_circle$posx)),
c(as.numeric(df_to_circle$posy) * -1),
cex = 4)
}
plot(g)
circle_points(c("A", "Y"), g)
You can of course adjust the cex parameter, add colors, and so on. The circles appear slightly off-center, so you may want to shift the x and y positions in circle_points by a small margin.
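As a possible alternative to parsing the graph string, here is a minimal sketch that reads the positions with dagitty's coordinates() and the latent variables with latents(), then circles everything else. The helper name observed_positions is hypothetical, W is marked latent purely for illustration (the original graph does not do this), and the y negation follows the observation in the answer above rather than anything documented here.

```r
# Hypothetical helper: given a coordinate list (named numeric vectors x and y)
# and the names of the latent variables, return positions of the other nodes.
observed_positions <- function(coords, latent = character(0)) {
  keep <- setdiff(names(coords$x), latent)
  data.frame(var = keep,
             x = unname(coords$x[keep]),
             y = -unname(coords$y[keep]))  # y negated, as in the answer above
}

# Stand-in coordinates matching the DAG above; W is ASSUMED latent here
coords <- list(x = c(A = -1, W = 0.893, X = 0, Y = 1),
               y = c(A = 0.5, W = -0.422, X = -0.5, Y = 0.5))
pos <- observed_positions(coords, latent = "W")

# Only runs when dagitty is installed
if (requireNamespace("dagitty", quietly = TRUE)) {
  g <- dagitty::dagitty('dag{
    A [pos="-1,0.5"]
    W [latent,pos="0.893,-0.422"]
    X [adjusted,pos="0,-0.5"]
    Y [pos="1,0.5"]
    A -> Y
    X -> A
    X -> W
    X -> Y
  }')
  plot(g)
  pos_g <- observed_positions(dagitty::coordinates(g), dagitty::latents(g))
  points(pos_g$x, pos_g$y, cex = 4)  # circle the observed variables only
}
```

With this approach the set of circled nodes follows the [latent] annotations in the graph, so there is no separate list of names to maintain.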

I did not find a way to do this in dagitty, but the bnlearn package can easily draw circles or other shapes around nodes. I just noticed, though, that you only want to circle observed variables and not latent ones (it would help to mention this in your title), so my code might not be what you are looking for. I have still included it here for reference. Alternatively, you could distinguish observed and latent variables by color, which is also easy with bnlearn (https://www.bnlearn.com/examples/graphviz-plot/).
library(bnlearn)
tree = model2network("[X][W|X][A|X][Y|A:X]")
graphviz.plot(tree, main = "DAG structure", shape = "circle",
layout = "circo")

Related

How to create 3D mesh using extracted LiDAR points in as.mesh3d function from rgl package in R

I am trying to create a 3D mesh of a specific building from points that I extracted from a lidar point cloud. I created a matrix from the x, y and z values to feed into the as.mesh3d function from the rgl package. Since it's from a lidar survey, I have 27,000+ points for this one building. I run into an error when I try to create the mesh. I've copied in a sample of 20 points from the point cloud:
X <- c(1566328,1566328,1566328,1566328,1566328,1566327,1566327,1566327,
1566327,1566327,1566327,1566327,1566327,1566327,1566327,1566327,
1566326,1566326,1566326,1566326)
Y <- c(5180937,5180937,5180936,5180935,5180936,5180937,5180937,5180936,
5180936,5180935,5180935,5180935,5180936,5180936,5180937,5180938,
5180938,5180937,5180936,5180936)
Z <- c(19.92300028,19.98300046,19.93700046,19.88099962,19.93500046,19.99500046,
20.00400046,20.00600046,19.97199962,19.92499962,19.95400046,
19.99099991,20.01199991,19.97600020,19.95800008,19.93200008,
19.95300008,19.94800008,19.94300020,19.98399991)
#created a matrix
xyz <- matrix(c(X, Y, Z), byrow = TRUE, ncol = 3)
The problem arises when I try to create the mesh using as.mesh3d():
mesh <- as.mesh3d(xyz, y = NULL, Z = NULL, type = "triangle", col = "red")
This is what I get: Error in as.mesh3d.default(xyz, y = NULL, Z = NULL, type = "triangle", : Wrong number of vertices
The same error happens for the original dataset of 27000+ points despite all being of the same length.
I'm really not advanced in R and was hoping I could get some advice or solutions on how to get past this.
Thank you
The as.mesh3d function assumes the points are already organized as triangles. Since you're giving it 20 points, that's not possible: it needs a multiple of 3 points.
There's a problem with your calculation of xyz: you say byrow = TRUE, but you're specifying values by column. Using
xyz <- cbind(X, Y, Z)
would work.
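A small illustration of the byrow problem, using made-up three-point vectors rather than the lidar sample: with byrow = TRUE the first row is filled with three X values instead of one (x, y, z) point.

```r
# Toy coordinates (illustrative values, not from the question)
X <- c(1, 2, 3)
Y <- c(10, 20, 30)
Z <- c(100, 200, 300)

wrong <- matrix(c(X, Y, Z), byrow = TRUE, ncol = 3)
right <- cbind(X, Y, Z)

wrong[1, ]  # 1 2 3      -> three X values, not a point
right[1, ]  # 1 10 100   -> the first (x, y, z) point
```

Dropping byrow = TRUE from the matrix() call would give the same values as cbind(X, Y, Z), just without the column names.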
If I plot all of your points using text3d(xyz, text=1:20), it looks as though there are a lot of repeats.
There are several ways to triangulate those points, but they depend on assumptions about the surface. For example, if you know there is only one Z value for each (X, Y) pair, you could use as.mesh3d.deldir (see the help page) to triangulate. Here's the code and output for your sample:
dxyz <- deldir::deldir(X - mean(X), Y - mean(Y), z = Z)
# Warning message:
# In deldir::deldir(X - mean(X), Y - mean(Y), z = Z) :
# There were different z "weights" corresponding to
# duplicated points.
persp3d(dxyz, col = "red")
I had to subtract the means from X and Y because rounding errors made it look very bad otherwise: rgl does a lot of things in single precision, which only gives 7 or 8 significant decimal digits.
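To see why centering helps, here is a rough sketch of the gap between adjacent IEEE single-precision values at a given magnitude. The helper name float_spacing is hypothetical, and the calculation assumes the standard 24-bit significand of single precision.

```r
# Spacing between neighbouring single-precision numbers near v,
# assuming a 24-bit significand (23 stored fraction bits)
float_spacing <- function(v) 2^(floor(log2(abs(v))) - 23)

float_spacing(5180937)  # 0.5    -> Y coordinates resolve to half a unit
float_spacing(1566328)  # 0.125  -> X coordinates resolve to an eighth of a unit
float_spacing(1)        # about 1.2e-07 once the values are centered near zero
```

At the original magnitudes, points closer together than these gaps cannot be distinguished in single precision, which is consistent with the poor rendering described above.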

Mapping distances to colors

Assuming a matrix of distances between a number of samples, I would like to somehow reasonably map these distances to a color space. So for example if you have three apparent clusters, they should have different colors, and within a cluster you would have a number of shades of a color. However, I would like to avoid explicit clustering, if possible.
Clearly, the mapping cannot be perfect and universal: rather, it is a heuristic.
Is there a known algorithm for that? Or, perhaps, a ready solution for R?
Here is one possibility. No matter how many dimensions your original data had, you can use multi-dimensional scaling with the distance matrix to project the data to three dimensions in a way that coarsely preserves distances. If you treat the three dimensions as R, G and B, this gives a color scheme in which points that are close should have "close" colors.
Here is a simple example. I generate some 5-dimensional data with 4 clusters (although no cluster analysis is performed). From that, we get the distance matrix. Then, as above we use multi-dimensional scaling to turn this into a color map. The points are plotted to show the result.
## Generate some sample data
set.seed(1234)
v = c(rnorm(80,0,1), rnorm(80,0,1), rnorm(80,4,1), rnorm(80,4,1))
w = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,0,1), rnorm(80,4,1))
x = c(rnorm(80,0,1), rnorm(80,0,1), rnorm(80,4,1), rnorm(80,4,1))
y = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,0,1), rnorm(80,4,1))
z = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,-4,1), rnorm(80,8,1))
df = data.frame(v,w,x,y,z)
## Distance matrix
D = dist(df)
## Project to 3-dimensions
PROJ3 = cmdscale(D, 3)
## Scale the three dimensions to [0,1] interval
ScaledP3 = apply(PROJ3, 2, function(x) { (x - min(x))/(max(x)-min(x)) })
colnames(ScaledP3) = c("red", "green", "blue")
X = as.data.frame(ScaledP3)
## Convert to color map
ColorMap = do.call(rgb, X)
plot(x,y, pch=20, col=ColorMap)
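The steps above could be wrapped into a reusable function along these lines. This is only a sketch: the name dist_to_rgb is hypothetical, and the guard for a constant dimension (mapped to 0.5) is an added assumption, not part of the answer.

```r
# Map a distance object to one hex color per sample via classical MDS
dist_to_rgb <- function(D) {
  proj <- cmdscale(D, k = 3)  # project to three dimensions
  scaled <- apply(proj, 2, function(col) {
    rng <- range(col)
    if (rng[2] == rng[1]) {
      rep(0.5, length(col))   # assumed fallback for a constant dimension
    } else {
      (col - rng[1]) / (rng[2] - rng[1])  # rescale to [0, 1]
    }
  })
  rgb(scaled[, 1], scaled[, 2], scaled[, 3])
}

# Small random example: 10 samples in 4 dimensions
set.seed(1)
pts <- matrix(rnorm(40), ncol = 4)
cols <- dist_to_rgb(dist(pts))
```

The result can be passed directly as the col argument of plot(), as in the example above.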

Drawing a smooth implicit surface with misc3d

The misc3d package provides a great implementation of the marching cubes algorithm, which makes it possible to plot implicit surfaces.
For example, let's plot a Dupin cyclide:
a = 0.94; mu = 0.56; c = 0.34 # cyclide parameters
f <- function(x, y, z, a, c, mu){ # implicit equation f(x,y,z)=0
b <- sqrt(a^2-c^2)
(x^2+y^2+z^2-mu^2+b^2)^2 - 4*(a*x-c*mu)^2 - 4*b^2*y^2
}
# define the "voxel"
nx <- 50; ny <- 50; nz <- 25
x <- seq(-c-mu-a, abs(mu-c)+a, length=nx)
y <- seq(-mu-a, mu+a, length=ny)
z <- seq(-mu-c, mu+c, length=nz)
g <- expand.grid(x=x, y=y, z=z)
voxel <- array(with(g, f(x,y,z,a,c,mu)), c(nx,ny,nz))
# plot the surface
library(misc3d)
surf <- computeContour3d(voxel, level=0, x=x, y=y, z=z)
drawScene.rgl(makeTriangles(surf))
Nice, except that the surface is not smooth.
The documentation of drawScene.rgl says: "Object-specific rendering features such as smoothing and material are controlled by setting in the objects." I don't know what that means. How can I get a smooth surface?
I have a solution, but not a straightforward one: it consists of building a mesh3d object from the output of computeContour3d and including the surface normals in this mesh3d.
The surface normals of an implicit surface defined by f(x,y,z)=0 are simply given by the gradient of f. It is not hard to derive the gradient for this example.
gradient <- function(xyz,a,c,mu){
x <- xyz[1]; y <- xyz[2]; z <- xyz[3]
b <- sqrt(a^2-c^2)
c(
2*(2*x)*(x^2+y^2+z^2-mu^2+b^2) - 8*a*(a*x-c*mu),
2*(2*y)*(x^2+y^2+z^2-mu^2+b^2) - 8*b^2*y,
2*(2*z)*(x^2+y^2+z^2-mu^2+b^2)
)
}
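One way to sanity-check a hand-derived gradient like this is to compare it with central finite differences at an arbitrary point. The sketch below repeats f and gradient so it runs on its own; the helper numeric_gradient, the step size h and the test point p are illustrative choices, not part of the original post.

```r
a <- 0.94; mu <- 0.56; c <- 0.34  # cyclide parameters from the question

f <- function(x, y, z, a, c, mu) {
  b <- sqrt(a^2 - c^2)
  (x^2 + y^2 + z^2 - mu^2 + b^2)^2 - 4 * (a * x - c * mu)^2 - 4 * b^2 * y^2
}

gradient <- function(xyz, a, c, mu) {
  x <- xyz[1]; y <- xyz[2]; z <- xyz[3]
  b <- sqrt(a^2 - c^2)
  s <- x^2 + y^2 + z^2 - mu^2 + b^2
  # c(...) still calls the base function even though a variable c exists
  c(4 * x * s - 8 * a * (a * x - c * mu),
    4 * y * s - 8 * b^2 * y,
    4 * z * s)
}

# Central differences along each axis (hypothetical helper)
numeric_gradient <- function(p, h = 1e-6) {
  sapply(1:3, function(i) {
    step <- replace(numeric(3), i, h)
    hi <- p + step
    lo <- p - step
    (f(hi[1], hi[2], hi[3], a, c, mu) - f(lo[1], lo[2], lo[3], a, c, mu)) / (2 * h)
  })
}

p <- c(0.3, -0.2, 0.1)  # arbitrary test point
max_err <- max(abs(gradient(p, a, c, mu) - numeric_gradient(p)))
max_err  # should be tiny if the analytic gradient is right
```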
Then the normals are computed as follows:
normals <- apply(surf, 1, function(xyz){
gradient(xyz,a,c,mu)
})
Now we are ready to make the mesh3d object:
mesh <- list(vb = rbind(t(surf),1),
it = matrix(1:nrow(surf), nrow=3),
primitivetype = "triangle",
normals = rbind(-normals,1))
class(mesh) <- c("mesh3d", "shape3d")
And finally to plot it with rgl:
library(rgl)
shade3d(mesh, color="red")
Nice, the surface is smooth now.
But is there a more straightforward way to get a smooth surface, without building a mesh3d object? What do they mean in the documentation: "Object-specific rendering features such as smoothing and material are controlled by setting in the objects."?
I don't know what the documentation is suggesting. However, you can do it via a mesh object slightly more easily than you did (though the results aren't quite as nice), using the addNormals() function to calculate the normals automatically rather than by formula.
Here are the steps:
Compute the surface as you did.
Create the mesh without normals. This is basically what you did, but using tmesh3d():
mesh <- tmesh3d(t(surf), matrix(1:nrow(surf), nrow=3), homogeneous = FALSE)
Calculate which vertices are duplicates of which others:
verts <- apply(mesh$vb, 2, function(column) paste(column, collapse = " "))
firstcopy <- match(verts, verts)
Rewrite the indices to use the first copy. This is necessary, since the misc3d functions give a collection of disconnected triangles; we need to work out which are connected.
it <- as.numeric(mesh$it)
it <- firstcopy[it]
dim(it) <- dim(mesh$it)
mesh$it <- it
At this point, there are a lot of unused vertices in the mesh; if memory was a problem you might want to add a step to remove them. I'm going to skip that.
Add the normals
mesh <- addNormals(mesh)
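The duplicate-vertex step can be seen on a toy example with plain matrices (no rgl needed). Two triangles share an edge, but each triangle stores its own copy of the shared vertices; match() redirects every index to the first copy. The vertex values here are made up for illustration.

```r
# Six stored vertices (one per column); columns 4 and 5 repeat columns 2 and 3
vb <- cbind(c(0, 0, 0), c(1, 0, 0), c(0, 1, 0),
            c(1, 0, 0), c(0, 1, 0), c(1, 1, 0))
it <- matrix(1:6, nrow = 3)  # two disconnected triangles

keys <- apply(vb, 2, paste, collapse = " ")
firstcopy <- match(keys, keys)           # 1 2 3 2 3 6
it2 <- matrix(firstcopy[it], nrow = 3)   # triangles (1,2,3) and (2,3,6)
```

After the rewrite the two triangles reference the same vertex indices along their shared edge, which is what lets a normal-averaging step treat them as connected.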
Here are the before and after shots. Left is without normals, right is with them.
It's not quite as smooth as your solution using computed normals, but it's not always easy to find those.
There's an option smooth in the makeTriangles function:
drawScene.rgl(makeTriangles(surf, smooth=TRUE))
I think the result is equivalent to @user2554330's solution, but this is more straightforward.
EDIT
The result is much better with the rmarchingcubes package:
library(rmarchingcubes)
contour_shape <- contour3d(
griddata = voxel, level = 0,
x = x, y = y, z = z
)
library(rgl)
tmesh <- tmesh3d(
vertices = t(contour_shape[["vertices"]]),
indices = t(contour_shape[["triangles"]]),
normals = contour_shape[["normals"]],
homogeneous = FALSE
)
open3d(windowRect = c(50, 50, 562, 562))
view3d(zoom=0.8)
shade3d(tmesh, color = "darkred")

Maximum at any point of two lines in R

Suppose you have two lines, L1 and L2, each known at the same x values (x1 and x2, for example): L1={(x1,L1_y1), (x2,L1_y2)} and L2={(x1,L2_y1), (x2,L2_y2)}. When these points are joined, the lines may or may not intersect at some x3 where x1 < x3 < x2.
Now suppose you want to know the maximum of these two lines at any x value (not restricted to just x1, x2 etc., but anywhere along the axis). This is often trivial to calculate for just a few lines and a few different x values, but in my case I have several tens of thousands of x values and a few lines to check against, so it can't be done manually.
In R, is there some code which will calculate the maximum at any given point x3?
An example of this can be seen here with L1={(1,1), (2,4)}, and L2={(1,4),(2,1)}, illustrated by:
Here the intersection of these lines is at (1.5, 2.5). L2 is the maximum before this, and L1 after. This maximum line is shown in red below.
As you can see, it isn't enough to take the max at every known point and join these up, so the code will need to treat the lines as functions and then take the maximum of those.
Also, as mentioned before, there are several thousand x values, so it will need to generalise to larger data.
To test the code further if you wish you can randomly generate y values for some x values, and it will be clear to see from a plot if it works correctly or not.
Thanks in advance!
Define the points constituting your lines from the example:
L1 <- list(x = c(1, 2), y = c(1, 4))
L2 <- list(x = c(1, 2), y = c(4, 1))
Then define a function that takes the pointwise maximum of the two functions corresponding to the lines:
myMax <- function(x)
pmax(approxfun(L1$x, L1$y)(x), approxfun(L2$x, L2$y)(x))
This gives
plot(L1$x, L1$y, type = 'l')
lines(L2$x, L2$y, col = 'red')
curve(myMax(x), from = 1, to = 2, col = 'blue', add = TRUE)
Clearly this extends to more complex L1 and L2 as approxfun is just a piecewise-linear approximation. Also, you may add L3, L4, and so on.
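As a rough sketch of that extension, the helper below (hypothetical name upper_envelope) builds one approxfun per line and takes pmax across all of them. The third line L3 is made up for illustration, and crossing_x is an assumed helper that locates where two straight segments cross from their differences d1 and d2 at the segment endpoints.

```r
# Pointwise maximum of any number of piecewise-linear lines
upper_envelope <- function(lines) {
  funs <- lapply(lines, function(l) approxfun(l$x, l$y))
  function(x) do.call(pmax, lapply(funs, function(f) f(x)))
}

# x position where two segments cross, given d = (line A - line B)
# at the left (d1) and right (d2) endpoints; assumes d1 and d2 differ in sign
crossing_x <- function(x1, x2, d1, d2) x1 + (x2 - x1) * d1 / (d1 - d2)

L1 <- list(x = c(1, 2), y = c(1, 4))
L2 <- list(x = c(1, 2), y = c(4, 1))
L3 <- list(x = c(1, 2), y = c(3, 3))  # illustrative extra line

env <- upper_envelope(list(L1, L2, L3))
env(c(1, 1.5, 2))               # 4 3 4
crossing_x(1, 2, 1 - 4, 4 - 1)  # 1.5, where L1 and L2 cross
```

Evaluating the envelope at the original x values together with the crossing points gives its exact kinks, rather than relying on a dense grid.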

Using scatterplot3d to plot a sphere

I have a matrix of the x,y,z coordinates of all amino acids. I plot the protein in 3D space using the following function:
make.Plot <- function(position.matrix, center, radius){
scatterplot3d(x = position.matrix[,4], y = position.matrix[,5], z = position.matrix[,6], type = 'o', color = 'blue')
}
Each row in position.matrix corresponds to a different amino acid. I would like to modify the function so that, if I pass it a "center" (a number from column 2 of position.matrix, which lists the amino acid numbering) and a radius, it draws a sphere of that radius centered at that amino acid.
For instance, if I pass it (position.matrix, 9, 3), I want it to plot a sphere of radius 3 around amino acid 9. I have uploaded a copy of the position data here:
http://temp-share.com/show/YgFHv2J7y
Notice that the row count is not always the canonical count as some residues are skipped. I will always pass it the "canonical" count...
Thanks for your help!
Here is a tested modification of your code. It adds a length-2 size vector for cex.symbols which is chosen by adding 1 to a logical vector:
make.Plot <- function(position.matrix, center, radius){
scatterplot3d(x = position.matrix[,4], y = position.matrix[,5],
z = position.matrix[,6], type = 'o',
cex.symbols=c(1,radius)[1+(position.matrix[,2]==center)], color = 'blue')
}
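The indexing trick in cex.symbols can be seen in isolation: a logical comparison is FALSE/TRUE, adding 1 turns it into index 1 or 2, and that index picks between the default size and the radius. The residue numbers below are made up for illustration.

```r
ids <- c(7, 8, 9, 11)  # illustrative residue numbers (note the skipped 10)
center <- 9
radius <- 3

# FALSE -> index 1 (size 1), TRUE -> index 2 (size radius)
sizes <- c(1, radius)[1 + (ids == center)]
sizes  # 1 1 3 1
```

Because the comparison is on the residue number and not the row index, this still works when some residues are skipped.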
I wonder if what you really want is the rgl package. It has shapes and an interactive plotting environment. With scatterplot3d you could make the chosen point red with this code:
myplot <- make.Plot(position.matrix, 3, 9)
myplot$points3d(position.matrix[3 , 4:6], col="red", cex=10)
I also located some code to draw a "parametric sphere" which can be adapted to creating a highlighting indicator:
myplot <- make.Plot(position.matrix, 3, 9)
a=seq(-pi,pi, length=10);
myplot$points3d(x=2*c(rep(1, 10) %*% t(cos(a)))+position.matrix[3 , 4] ,
y=2*c(cos(a) %*% t(sin(a)))+position.matrix[3 , 5],
z=2*c(sin(a) %*% t(sin(a)))+position.matrix[3 , 6],
col="red", cex=.2)
