Using scatterplot3d to plot a sphere - r

I have a matrix of the x,y,z coordinates of all amino acids. I plot the protein in 3D space using the following function:
make.Plot <- function(position.matrix, center, radius){
scatterplot3d(x = position.matrix[,4], y = position.matrix[,5], z = position.matrix[,6], type = 'o', color = 'blue')
}
Each row in position.matrix is a different amino acid. I would like to modify the function so that if I pass it a "center", corresponding to a number in column 2 of position.matrix (which lists the amino acid numbering), and a radius, it draws a sphere centered on that amino acid.
For instance, if I pass it (position.matrix, 9, 3), I want it to plot a sphere of radius 3 around amino acid 9. I have uploaded a copy of the position data here:
http://temp-share.com/show/YgFHv2J7y
Notice that the row count is not always the canonical count as some residues are skipped. I will always pass it the "canonical" count...
Thanks for your help!

Here is a tested modification of your code. It gives cex.symbols one size per point, picked from the length-2 vector c(1, radius) by adding 1 to a logical vector (FALSE selects element 1, TRUE selects element 2):
library(scatterplot3d)
make.Plot <- function(position.matrix, center, radius){
  # the residue whose number (column 2) equals center is drawn at size radius
  scatterplot3d(x = position.matrix[,4], y = position.matrix[,5],
                z = position.matrix[,6], type = 'o',
                cex.symbols=c(1,radius)[1+(position.matrix[,2]==center)], color = 'blue')
}
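The indexing trick can be checked on its own. This is a minimal sketch with made-up residue numbers (ids, center and radius are illustrative values, not taken from the posted data); it also shows why the lookup should go through column 2 rather than the row index when residues are skipped:

```r
# Hypothetical residue numbering in which residue 10 is missing
ids <- c(7, 8, 9, 11)
center <- 9
radius <- 3

# TRUE only for the residue that should be enlarged
is_center <- ids == center

# 1 + FALSE = 1 picks size 1; 1 + TRUE = 2 picks size radius
sizes <- c(1, radius)[1 + is_center]

# Row that holds the chosen residue (row 3 here, not row 9)
center_row <- which(is_center)
```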
I wonder if what you really want is the rgl package. It has shapes and an interactive plotting environment. With scatterplot3d you could make the chosen point red with this code:
myplot <- make.Plot(position.matrix, 3, 9)
myplot$points3d(position.matrix[3 , 4:6], col="red", cex=10)
I also located some code to draw a "parametric sphere" which can be adapted to creating a highlighting indicator:
myplot <- make.Plot(position.matrix, 3, 9)
a=seq(-pi,pi, length=10);
myplot$points3d(x=2*c(rep(1, 10) %*% t(cos(a)))+position.matrix[3 , 4] ,
y=2*c(cos(a) %*% t(sin(a)))+position.matrix[3 , 5],
z=2*c(sin(a) %*% t(sin(a)))+position.matrix[3 , 6],
col="red", cex=.2)
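As a rough sketch of the same idea with the radius and center made explicit, the helper below (sphere_points is a hypothetical name, not part of scatterplot3d) generates points on a sphere from two angles using only base R. Note that scatterplot3d scales its axes independently as far as I know, so the result may look stretched unless the axis limits are chosen to match.

```r
# Points on a sphere of the given radius around center (a length-3 vector).
# theta runs along one axis of the sphere, phi goes around it.
sphere_points <- function(center, radius, n = 10) {
  theta <- seq(0, pi, length.out = n)
  phi <- seq(0, 2 * pi, length.out = n)
  grid <- expand.grid(theta = theta, phi = phi)
  cbind(
    x = center[1] + radius * cos(grid$theta),
    y = center[2] + radius * sin(grid$theta) * cos(grid$phi),
    z = center[3] + radius * sin(grid$theta) * sin(grid$phi)
  )
}

pts <- sphere_points(center = c(1, 2, 3), radius = 3)

# Every generated point should sit at distance radius from the center
dists <- sqrt(rowSums(sweep(pts, 2, c(1, 2, 3))^2))
```

With a plot object from make.Plot, the matrix could then be passed on as myplot$points3d(pts, col = "red", cex = .2).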

Related

How to draw a boundary line on a scatter plot for classifier in Julia?

I want to draw a boundary line separating the two classes produced by my classifier. How can I draw it?
In the sample picture, the black line is the boundary I want to draw.
The green points are the boundary points. I want to draw a curve that fits those points, but when I plot them the result is the purple line, which is not a curve.
Here is a reproducible example how to do it:
using Plots
x = rand(1000)
y = rand(1000)
color = [3 * (b-0.5)^2 < a - 0.1 ? "red" : "blue" for (a, b) in zip(x, y)]
y_bound = 0:0.01:1
x_bound = @. 3 * (y_bound - 0.5)^2 + 0.1
scatter(x, y, color=color, legend=false)
plot!(x_bound, y_bound, color="green")
and you should get a plot like:
The crucial thing here is to make your boundary points ordered (i.e. they must be ordered in the vectors properly so that when you plot a line you connect proper points). In my example I achieved it by varying the y-dimension and calculating the x dimension.
In more complex cases it will be better to use a contour plot, e.g.:
x = 1:0.1:8
y = 1:0.1:7
f(x, y) = begin
(3x + y ^ 2) * abs(sin(x) + cos(y)) - 40
end
X = repeat(reshape(x, 1, :), length(y), 1)
Y = repeat(y, 1, length(x))
Z = map(f, X, Y)
contour(x, y, Z, levels=[0], color="green", width=3)
x_s = 7 .* rand(1000) .+ 1
y_s = 6 .* rand(1000) .+ 1
color = [f(a, b) > 0 ? "red" : "blue" for (a, b) in zip(x_s, y_s)]
scatter!(x_s, y_s, color=color, legend=false)
and you should get something like:
As you can see, for the best results it is better to pass the classifier scores to contour and specify the classification threshold as the level.
I guess your TA asked you to conduct a grid search for this question.
A grid search does not search over the data points you have; it searches over the whole coordinate plane (i.e. from (0,0), (0,1), (0,2) up to (0,100), then (1,0), (1,1) and so on). You can change the spacing between points when you conduct the grid search.
In your case, you need to solve the equation d_1(X) = d_2(X). So generate a grid of points (as in the example above), evaluate |d_1(X) - d_2(X)| at each one, and keep the points where the value is smaller than epsilon (a small number you choose, such as 0.05 or 0.1). Then use plot() to connect them.
This is not the most efficient way to create the boundary, but it is what you learnt in your tutorial. You may also try contour().
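The grid search described above can be sketched as follows. It is written in R, like most of the code on this page, rather than Julia, and everything in it is an assumption for illustration: the two class centres c1 and c2, the use of Euclidean distance for d_1 and d_2, and the grid spacing.

```r
# Hypothetical class centres; the true boundary between them is the line x = 1
c1 <- c(0, 0)
c2 <- c(2, 0)
d1 <- function(x, y) sqrt((x - c1[1])^2 + (y - c1[2])^2)
d2 <- function(x, y) sqrt((x - c2[1])^2 + (y - c2[2])^2)

# Regular grid over the whole plotting region, not over the data points
grid <- expand.grid(x = seq(-1, 3, by = 0.05), y = seq(-2, 2, by = 0.05))

# Keep grid points where the two distances are (almost) equal
eps <- 0.05
gap <- abs(d1(grid$x, grid$y) - d2(grid$x, grid$y))
boundary <- grid[gap < eps, ]

# plot(boundary$x, boundary$y, pch = 20) would show the approximate boundary
```

A smaller eps gives a thinner band but needs a finer grid, otherwise no grid point may fall inside it.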

How to create 3D mesh using extracted LiDAR points in as.mesh3d function from rgl package in R

I am trying to create a 3D mesh of a specific building from points that I extracted from a lidar point cloud. I then created a matrix from the x, y and z values to feed into the as.mesh3d function from the rgl package. Since it's from a lidar survey, I have 27,000+ points for this one building. I run into an error when I try to create the mesh. I've copied in a sample of 20 points from the point cloud:
X <- c(1566328,1566328,1566328,1566328,1566328,1566327,1566327,1566327,
1566327,1566327,1566327,1566327,1566327,1566327,1566327,1566327,
1566326,1566326,1566326,1566326)
Y <- c(5180937,5180937,5180936,5180935,5180936,5180937,5180937,5180936,
5180936,5180935,5180935,5180935,5180936,5180936,5180937,5180938,
5180938,5180937,5180936,5180936)
Z <- c(19.92300028,19.98300046,19.93700046,19.88099962,19.93500046,19.99500046,
20.00400046,20.00600046,19.97199962,19.92499962,19.95400046,
19.99099991,20.01199991,19.97600020,19.95800008,19.93200008,
19.95300008,19.94800008,19.94300020,19.98399991)
#created a matrix
xyz <- matrix(c(X, Y, Z), byrow = TRUE, ncol = 3)
The problem arises when I try to create the mesh using as.mesh3d():
mesh <- as.mesh3d(xyz, y = NULL, Z = NULL, type = "triangle", col = "red")
This is what I get: Error in as.mesh3d.default(xyz, y = NULL, Z = NULL, type = "triangle", : Wrong number of vertices
The same error happens for the original dataset of 27,000+ points, even though all three vectors have the same length.
I'm really not advanced in R and was hoping I could get some advice or solutions on how to get past this.
Thank you
The as.mesh3d function assumes the points are already organized as triangles. Since you're giving it 20 points, that's not possible: it needs a multiple of 3 points.
There's a problem with your calculation of xyz: you say byrow = TRUE, but you're specifying values by column. Using
xyz <- cbind(X, Y, Z)
would work.
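To see what goes wrong with byrow = TRUE, here is a small sketch using the first five points of the sample (xyz_wrong and xyz_right are names chosen for illustration):

```r
# First five points of the posted sample
X <- c(1566328, 1566328, 1566328, 1566328, 1566328)
Y <- c(5180937, 5180937, 5180936, 5180935, 5180936)
Z <- c(19.92300028, 19.98300046, 19.93700046, 19.88099962, 19.93500046)

# c(X, Y, Z) lists all X values first, so filling by row puts
# three X values into the first row instead of one x, y, z triple
xyz_wrong <- matrix(c(X, Y, Z), byrow = TRUE, ncol = 3)

# cbind keeps each vector in its own column: one point per row
xyz_right <- cbind(X, Y, Z)
```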
If I plot all of your points using text3d(xyz, text=1:20), it looks as though there are a lot of repeats.
There are several ways to triangulate those points, but they depend on assumptions about the surface. For example, if you know there is only one Z value for each (X, Y) pair, you could use as.mesh3d.deldir (see the help page) to triangulate. Here's the code and output for your sample:
dxyz <- deldir::deldir(X - mean(X), Y - mean(Y), z = Z)
# Warning message:
# In deldir::deldir(X - mean(X), Y - mean(Y), z = Z) :
# There were different z "weights" corresponding to
# duplicated points.
persp3d(dxyz, col = "red")
I had to subtract the means from X and Y because rounding errors made it look very bad without that: rgl does a lot of things in single precision, which only gives about 7 significant digits.
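The deldir warning comes from repeated (X, Y) pairs that carry different Z values. One possible way to deal with that before triangulating, assuming it is acceptable to average Z over the duplicates (an assumption of this sketch, not something the answer prescribes), shown on the first five sample points:

```r
# First five points of the posted sample
pts <- data.frame(
  X = c(1566328, 1566328, 1566328, 1566328, 1566328),
  Y = c(5180937, 5180937, 5180936, 5180935, 5180936),
  Z = c(19.92300028, 19.98300046, 19.93700046, 19.88099962, 19.93500046)
)

# Number of rows whose (X, Y) pair already appeared earlier
n_dup <- sum(duplicated(pts[, c("X", "Y")]))

# Collapse duplicates: one row per (X, Y) pair, Z averaged
pts_unique <- aggregate(Z ~ X + Y, data = pts, FUN = mean)
```

The collapsed data frame could then be handed to deldir::deldir in place of the raw vectors.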

How to circle variable to observed (not latent) variables in dagitty plot

How would I put a circle around certain variables in the following plot?
library(dagitty)
g = dagitty('dag{
A [pos="-1,0.5"]
W [pos="0.893,-0.422"]
X [adjusted,pos="0,-0.5"]
Y [pos="1,0.5"]
A -> Y
X -> A
X -> W
X -> Y
}')
png("mp.png", width = 500, height = 500,res=300)
plot(g)
dev.off()
In the web-based tool you can mark a variable as, e.g., latent or adjusted, and it changes the color of the circle. That is not quite what I am looking for, although getting those colors in the plot from R would be sufficient. I also don't really like the way the variable name sits next to the circle in the web-based version. What I really want is to circle observed variables and leave unobserved ones uncircled.
I wrote a function which takes the points you want to circle as input, extracts the position of said points and circles them.
library(dagitty)
g = dagitty('dag{
A [pos="-1,0.5"]
W [pos="0.893,-0.422"]
X [adjusted,pos="0,-0.5"]
Y [pos="1,0.5"]
A -> Y
X -> A
X -> W
X -> Y
}')
circle_points <- function(points_to_circle, g) {
#few regexs to extract the points and the positions from "g"
#can surely be optimized, made nicer and more robust but it works for now
fsplit <- strsplit(g[1], "\\]")[[1]]
fsplit <- fsplit[-length(fsplit)]
fsplit <- substr(fsplit, 1, nchar(fsplit)-1)
fsplit[1] <- substr(fsplit[1], 6, nchar(fsplit))
vars <- sapply(regmatches(fsplit,
regexec("\\\n(.*?)\\s*\\[", fsplit)), "[", 2)
pos <- sub(".*pos=\\\"", "", fsplit)
#build dataframe with extracted information
res_df <- data.frame(vars = vars,
posx = sapply(strsplit(pos, ","), "[",1),
posy = sapply(strsplit(pos, ","), "[",2))
df_to_circle <- res_df[res_df$vars %in% points_to_circle,]
#y-position seems to be inverted and has to be multiplied by -1
points(c(as.numeric(df_to_circle$posx)),
c(as.numeric(df_to_circle$posy) * -1),
cex = 4)
}
plot(g)
circle_points(c("A", "Y"), g)
This results in:
You can of course adjust the cex parameter, add colors and so on. The circles seem to be slightly off-center, so you may want to shift the x and y positions in circle_points by a small margin.
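The position extraction in circle_points could probably be made a little more robust with a single pattern. The following is only a sketch in base R that works on the DAG text directly; dag_text, pattern and node_pos are names made up for illustration, and it assumes every node line has the form NAME [ ... pos="x,y"].

```r
# Same node definitions as in the question, as plain text
dag_text <- 'dag{
A [pos="-1,0.5"]
W [pos="0.893,-0.422"]
X [adjusted,pos="0,-0.5"]
Y [pos="1,0.5"]
A -> Y
X -> A
}'

# Node name, then a bracket that contains pos="x,y" somewhere inside
pattern <- '([A-Za-z0-9_]+) \\[[^]]*pos="(-?[0-9.]+),(-?[0-9.]+)"\\]'

# All node definitions, then the three captured groups of each one
matches <- regmatches(dag_text, gregexpr(pattern, dag_text))[[1]]
parts <- regmatches(matches, regexec(pattern, matches))

node_pos <- data.frame(
  vars = sapply(parts, `[`, 2),
  posx = as.numeric(sapply(parts, `[`, 3)),
  posy = as.numeric(sapply(parts, `[`, 4))
)
```

As in circle_points, the y values would still need to be multiplied by -1 before being passed to points().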
I did not find a way to do this in dagitty, but the bnlearn package can add circles or other shapes easily. I just noticed that you only want to circle observed variables rather than latent ones (it would be better to mention this in your title), so my code might not be what you are looking for; I have still attached it for reference. Alternatively, you can distinguish observed and latent variables by color, which is easy to do with bnlearn (https://www.bnlearn.com/examples/graphviz-plot/).
library(bnlearn)
tree = model2network("[X][W|X][A|X][Y|A:X]")
graphviz.plot(tree, main = "DAG structure", shape = "circle",
layout = "circo")

Trouble with rotating data in accordance with a "spline" polynomial?

Here is the code that the issue is in:
angles = -atan(deriv)
angles = angles*(180/pi)
#shift coordinates onto their polynomials
d[1:mtp,3] = d[1:mtp,3] + poly[,2]
#rotated storage matrix
rrr = as.data.frame(matrix(data = NA, ncol = 2, nrow = 9000))
#for each moment, take in old coordinates and export newly rotated
for(i in 1:mtp){
rotm = matrix(data = c(c(cos(angles[i]),sin(angles[i])),
c(-sin(angles[i]),cos(angles[i]))), ncol=2, nrow = 2)
rotate.1 = d[i,2:3] - poly[i,]
rotate.2 = rotm %*% t(rotate.1)
rotate.3 = rotate.2 + poly[i,]
rrr[i,] = rotate.3
}
#overwrite coordinates with rotations
d[1:mtp,2:3] = rrr
"deriv" is a numeric vector containing the derivative at each point along the polynomial spline "poly" with columns 1:2 the x and y. "angles" therefore contains the calculated angle to rotate by at each point. "d" is the initial data matrix, with columns 2:3 the x and y.
Figures (images not included): data to be translated and rotated; "spline" polynomial along which data will be translated and rotated; angles along polynomial used during rotation (derived from derivative); data + spline (translated); translated and (incorrectly) rotated data.
Solved: R's trig functions take radians, so converting the angles to degrees was unnecessary and made every rotation too large by a factor of 180/pi.
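A small check of that fix, as a sketch: rotate_point is a hypothetical helper that uses the same rotation matrix layout as the loop in the question, applied to a single point on a line with slope 1.

```r
# Rotate point p by angle (in radians) around origin
rotate_point <- function(p, angle, origin = c(0, 0)) {
  rotm <- matrix(c(cos(angle), sin(angle),
                   -sin(angle), cos(angle)), ncol = 2)
  as.vector(rotm %*% (p - origin)) + origin
}

slope <- 1               # derivative at the point
angle <- -atan(slope)    # already in radians, no conversion needed

# Rotating (1, 1) by -pi/4 lays it flat on the x axis
flat <- rotate_point(c(1, 1), angle)

# The bug: the same angle converted to degrees but still fed to cos/sin
wrong <- rotate_point(c(1, 1), angle * 180 / pi)
```

Here flat lands on the x axis at distance sqrt(2) from the origin, while wrong ends up at an unrelated angle.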

Mapping distances to colors

Assuming a matrix of distances between a number of samples, I would like to somehow reasonably map these distances to a color space. So for example if you have three apparent clusters, they should have different colors, and within a cluster you would have a number of shades of a color. However, I would like to avoid explicit clustering, if possible.
Clearly, the mapping cannot be perfect and universal: rather, it is a heuristic.
Is there a known algorithm for that? Or, perhaps, a ready solution for R?
Here is one possibility. No matter how many dimensions your original data had, you can use multi-dimensional scaling with the distance matrix to project the data to three dimensions, in a way that coarsely preserves distances. If you treat the three dimensions as R, G and B, this will give a color scheme in which points that are close should have "close" colors.
Here is a simple example. I generate some 5-dimensional data with 4 clusters (although no cluster analysis is performed). From that, we get the distance matrix. Then, as above we use multi-dimensional scaling to turn this into a color map. The points are plotted to show the result.
## Generate some sample data
set.seed(1234)
v = c(rnorm(80,0,1), rnorm(80,0,1), rnorm(80,4,1), rnorm(80,4,1))
w = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,0,1), rnorm(80,4,1))
x = c(rnorm(80,0,1), rnorm(80,0,1), rnorm(80,4,1), rnorm(80,4,1))
y = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,0,1), rnorm(80,4,1))
z = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,-4,1), rnorm(80,8,1))
df = data.frame(v,w,x,y,z)
## Distance matrix
D = dist(df)
## Project to 3-dimensions
PROJ3 = cmdscale(D, 3)
## Scale the three dimensions to [0,1] interval
ScaledP3 = apply(PROJ3, 2, function(x) { (x - min(x))/(max(x)-min(x)) })
colnames(ScaledP3) = c("red", "green", "blue")
X = as.data.frame(ScaledP3)
## Convert to color map
ColorMap = do.call(rgb, X)
plot(x,y, pch=20, col=ColorMap)
