Use conditional coloring on a plotly surface - r

I am using plotly via R for the first time and trying to create a surface from a grid and color it based on a calculation.
For example, I would like to use the surface from data(volcano), as in
library(plotly)
plot_ly(z = ~volcano) %>% add_surface()
But instead of coloring based on the z-value (altitude), let's just say I wanted to color based on distance from my house on the little mesa at (20,60).
house_loc <- c(20,60,150) # (x,y,z) of my house
dist_to_house <- Vectorize(function(x,y,z){sqrt(sum( (c(x,y,z)-house_loc)^2 ))})
So far I have tried:
color_me <- function(x){
  colorRampPalette(c('tan','blue'))(24L)[
    findInterval(x, seq(0, 1, length.out=25), all.inside=TRUE)
  ]
}
library(dplyr)
library(reshape2)
volcano %>%
  melt( varnames=c('y','x'), value.name='z' ) %>%
  mutate( d = dist_to_house(x, y, z),
          d_rel = d/max(d),
          d_color = color_me(d_rel)
  ) -> df
plot_ly(df,
        type='scatter3d',
        mode='none', # no markers, just surface
        x=~x,
        y=~y,
        z=~z,
        surfaceaxis=2,
        surfacecolor=~d_color) # last argument seems not to work
Which just returns:
The desired result would color the landscape tan in the region of the house and gradually fade to blue in the regions far from the house.
Somewhat related question uses mesh3d code found elsewhere and doesn't explain how to calculate (i, j, k)

Your code has virtually everything you need. Just use a surface plot and pass your distance array, reshaped to the dimensions of volcano, as the surface color.
library(plotly)
library(dplyr)
library(reshape2)
house_loc <- c(20,60,150)
dist_to_house <- Vectorize(function(x,y,z){sqrt(sum( (c(x,y,z)-house_loc)^2 ))})
volcano %>%
  melt( varnames=c('y','x'), value.name='z' ) %>%
  mutate( d = dist_to_house(x, y, z),
          d_rel = d/max(d)
  ) -> df
color <- df$d_rel
dim(color) <- dim(volcano)
plot_ly(df,
        type='surface',
        z=volcano,
        surfacecolor=color,
        colors=c('tan','blue'))
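As a minimal sketch of the same idea without the melt and reshape round trip, the distance matrix can be built directly from the row and column indices of volcano. This assumes, as in the code above, that x is the column index and y is the row index; the names d, d_rel and fig are only illustrative.

```r
library(plotly)

# Assumption: x = column index, y = row index, matching melt(varnames = c('y', 'x'))
house_loc <- c(20, 60, 150)  # (x, y, z) of the house

# col() and row() return index matrices with the same shape as volcano
d <- sqrt((col(volcano) - house_loc[1])^2 +
          (row(volcano) - house_loc[2])^2 +
          (volcano      - house_loc[3])^2)
d_rel <- d / max(d)  # relative distance, largest value is 1

# Height comes from volcano, color from the relative distance
fig <- plot_ly(z = ~volcano, colors = c("tan", "blue")) %>%
  add_surface(surfacecolor = d_rel)
```

Printing fig displays the surface. Because d_rel already has the dimensions of volcano, no dim<- step is needed.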

In addition to the surface plot (see accepted answer), we can also do a mesh3d plot and avoid the reshaping (back to grid) step that the surface plot requires.
However, the scale bar still isn't right (showing range of z, not d_rel)
plot_ly(df,
        type='mesh3d',
        x = ~x,
        y = ~y,
        z = ~z,
        intensity = ~d_rel,
        colors = colorRamp(c("tan", "blue"))
)
Counter-intuitively, it is intensity= and not color= which seems to control the conditional coloring.
I originally avoided mesh3d because I thought I had to create a triangular mesh (Delaunay something or another) and had no idea how to do that, but it seems to be handled automatically in this case.

Related

Change position of legend in plot of pec object

I am trying to plot the prediction error curve from pec package but I can't change the legend position and size. There's an example from pec package:
library(rms)
library(pec)
data(pbc)
pbc <- pbc[sample(1:NROW(pbc),size=100),]
f1 <- psm(Surv(time,status!=0)~edema+log(bili)+age+sex+albumin,data=pbc)
f2 <- coxph(Surv(time,status!=0)~edema+log(bili)+age+sex+albumin,data=pbc,x=TRUE,y=TRUE)
f3 <- cph(Surv(time,status!=0)~edema+log(bili)+age+sex+albumin,data=pbc,surv=TRUE)
brier <- pec(list("Weibull"=f1,"CoxPH"=f2,"CPH"=f3),data=pbc,formula=Surv(time,status!=0)~1)
print(brier)
plot(brier)
But it shows a big legend in the middle of the plot.
I also tried:
plot(brier, legend = "topright")
class(brier)
But that doesn't show the legend at all.
How can I change the position of the legend? Also, is it possible to plot this graph using ggplot?
I think I got what you want using ggplot2. The idea is to pick the elements of your brier object that contain the data for the plot, make a data frame from them, and plot it.
library(ggplot2)
# packages for the pipe and pivot_longer, you can do it with base functions, I just prefer these
library(tidyr)
library(dplyr)
df <- do.call(cbind, brier[["AppErr"]]) # contains y values for each model
df <- cbind(brier[["time"]], df) # values of the x axis
colnames(df)[1] <- "time"
# pivoting the table to long format makes it easier to use ggplot
df <- as.data.frame(df) %>%
  pivot_longer(cols = 2:last_col(), names_to = "models", values_to = "values")
ggplot(data = df, aes(x = time, y = values, color = models)) +
  geom_line() # I suppose you know how to customize axis names etc.
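Since the question was about moving and shrinking the legend, here is a small sketch of how that is usually done once the plot is a ggplot object. The toy data frame is made up for illustration (it is not pec output); it only mimics the time / values / models columns built above.

```r
library(ggplot2)

# Hypothetical stand-in for the long-format data frame built above
toy <- data.frame(
  time   = rep(0:4, times = 2),
  values = c(0, 0.05, 0.10, 0.14, 0.17, 0, 0.04, 0.08, 0.11, 0.13),
  models = rep(c("CoxPH", "Weibull"), each = 5)
)

p <- ggplot(toy, aes(x = time, y = values, color = models)) +
  geom_line() +
  labs(x = "Time", y = "Prediction error") +
  theme(legend.position = "bottom",              # or "top", "left", "right", "none"
        legend.text = element_text(size = 8))    # smaller legend labels
```

Printing p draws the plot with the legend below the panel.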

R: Plot Axis Display Values Larger than the Original Data

I am using the R programming language. I am following a tutorial on data visualization over here: https://plotly.com/r/3d-surface-plots/
I created my own data and made a 3D plot:
library(plotly)
set.seed(123)
#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)
#3d plot
fig <- plot_ly(z = ~as.matrix(d))
fig <- fig %>% add_surface()
#view plot
fig
As seen here, there is a point on this 3D plot where "y = 97". I am not sure how this is possible, seeing how none of the values within the original data frame "d" are anywhere close to 97. I made sure of this by looking at the individual distributions of each variable in the original data frame "d":
#plot individual densities
plot(density(d$a), main = "density plots", col = "red")
lines(density(d$b), col = "blue")
lines(density(d$c), col = "green")
legend( "topleft", c("a", "b", "c"),
text.col=c("red", "blue", "green") )
As seen here, none of the variables (a,b,c) from the original data frame "d" have any values that are close to 97.
Thus, my question: can someone please explain how is it possible that the point (x = 0 , y = 97, z =25.326) appears on this 3D plot?
Thanks
I am not sure if this will resolve the problem - but using the same logic from this previous stackoverflow post: 3D Surface with Plot_ly in r, with x,y,z coordinates
library(plotly)
set.seed(123)
#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)
data = d
plot_ly() %>%
add_trace(data = data, x=data$a, y=data$b, z=data$c, type="mesh3d" )
Now, it appears that all values seen in this visual plot are contained in the original data frame.
However, I am still not sure what is the fundamental (and mathematical) difference between both of these plots:
I am curious to see what others have to say.
Thanks
The problem is how you have your matrix built. Basically, the z-values (in your case the c variable) should be given in a matrix in which the rows and columns are like coordinates for a surface, similar to a grid or raster dataset. The values you see now along the x and y-axis are not the values from your a and b variables but the row and column numbers from your matrix (similar to coordinates). You can open the volcano dataset in R and have a look at how these data are organized, which will surely give you a better understanding of what I am trying to explain.
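A quick way to see this, sketched with the same seed as in the question: the matrix handed to plot_ly() has 100 rows and 3 columns, so the y axis simply runs over row positions and the x axis over column positions.

```r
set.seed(123)
d <- data.frame(a = rnorm(100, 10, 10),
                b = rnorm(100, 5, 5),
                c = rnorm(100, 5, 10))

m <- as.matrix(d)
dim(m)      # 100 rows (shown as y = 0..99) and 3 columns (shown as x = 0..2)
range(m)    # the z axis pools the values of a, b and c
m[98, "a"]  # the cell plotly labels x = 0, y = 97 (its indices are zero-based)
```

The last line should correspond to the z value reported for the point in question, which is a value of a and not a y coordinate.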
As Robbie mentioned, it's to do with how your data is organised. To change XYZ data to the same format as the volcano dataset, you can use the following from the raster package (note that rasterFromXYZ() expects the x and y values to lie on a regular grid):
library(raster)
r <- rasterFromXYZ(d)
# plot raster
plot_ly(z = as.matrix(r), type = "surface")
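For x, y, z values that already sit on a regular grid, the same reshaping can also be sketched in base R. The xyz data frame below is invented for illustration; the point is only that rows end up indexed by y and columns by x, like the volcano matrix.

```r
# Hypothetical long-format data on a regular 4 x 3 grid
xyz <- expand.grid(x = 1:4, y = 1:3)
xyz$z <- xyz$x + 10 * xyz$y

# One cell per (y, x) pair: rows follow y, columns follow x
z_mat <- tapply(xyz$z, list(y = xyz$y, x = xyz$x), mean)

dim(z_mat)                           # 3 rows (y) by 4 columns (x)
x_vals <- as.numeric(colnames(z_mat))
y_vals <- as.numeric(rownames(z_mat))
```

z_mat can then be passed as z, together with x_vals and y_vals, to a surface trace. Scattered points that are not on a grid would need interpolating first, which this sketch does not cover.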

3D plot from model in R plotly?

Is it possible to generate a 3D plot from models using plotly? I tried to search over the internet, but many examples are based on the infamous volcano dataset that generates a plot from a matrix of points.
My two models are:
y = 0.49867x - 4.78577
y = 76.13084x + 4.81945
If that is not possible, how can I transform my data into a matrix format like that of the volcano dataset? For more details, I have hosted the data file here. I have never used plotly before and I'm unfamiliar with the grammar, but I think I can manage if I can at least format my data like the volcano dataset.
Thank you.
To plot a surface with plotly, you need to construct a numeric matrix.
Taking Himmelblau's function as a test:
f <- function(x, y) { (x^2+y-11)^2 + (x+y^2-7)^2 }
Create x and y values:
x <- seq(-6, 6, length = 100)
y <- x
Then, create z with the outer() function. It returns a matrix whose element [i, j] is f(x[i], y[j]).
z <- outer(x, y, f)
We can now create a surface plot:
library(plotly)
# plotly reads the rows of z as y and the columns as x, so transpose the outer() result
plot_ly(x = x, y = y, z = ~t(z)) %>% add_surface()
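The same recipe can be sketched for a fitted model rather than a closed-form function: predict over a grid of the two predictors and reshape the predictions into a matrix. The mtcars data and the formula mpg ~ wt + hp are assumptions standing in for the asker's data, which is not shown here.

```r
library(plotly)

# Assumption: a model with two predictors; mtcars stands in for the real data
fit <- lm(mpg ~ wt + hp, data = mtcars)

wt_seq <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 30)
hp_seq <- seq(min(mtcars$hp), max(mtcars$hp), length.out = 40)

# expand.grid() varies its first argument (wt) fastest
grid <- expand.grid(wt = wt_seq, hp = hp_seq)
pred <- predict(fit, newdata = grid)

# Fill by row so that rows follow hp (y) and columns follow wt (x)
z <- matrix(pred, nrow = length(hp_seq), ncol = length(wt_seq), byrow = TRUE)

fig <- plot_ly(x = wt_seq, y = hp_seq, z = z) %>% add_surface()
```

Printing fig shows the fitted plane. A model with a single predictor, such as the two lines in the question, describes a line rather than a surface, so it would need a second predictor before this applies.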

Adding text annotation to a clustering scatter plot (tSNE)

I have XY data (a 2D tSNE embedding of high dimensional data) which I'd like to scatter plot. The data are assigned to several clusters, so I'd like to color code the points by cluster and then add a single label for each cluster, that has the same color coding as the clusters, and is located outside (as much as possible) from the cluster's points.
Any idea how to do this using R in either ggplot2 and ggrepel or plotly?
Here's the example data (the XY coordinates and cluster assignments are in df and the labels in label.df) and the ggplot2 part of it:
library(dplyr)
library(ggplot2)
set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster=levels(df$cluster),label=paste0("cluster: ",levels(df$cluster)))
ggplot(df,aes(x=x,y=y,color=cluster))+geom_point()+theme_minimal()+theme(legend.position="none")
The geom_label_repel() function in the ggrepel package lets you add labels to a plot while "repelling" them so that they do not overlap other elements. Below is a slight addition to your existing code: we summarize the data to get the coordinates for the labels (here I chose the upper-left-ish region of each cluster, which is the min of x and the max of y) and merge the result with your existing data frame containing the cluster labels. Pass this data frame to geom_label_repel() and map the label variable to the label aesthetic in aes().
library(dplyr)
library(ggplot2)
library(ggrepel)
set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster=levels(df$cluster),label=paste0("cluster: ",levels(df$cluster)))
label.df_2 <- df %>%
  group_by(cluster) %>%
  summarize(x = min(x), y = max(y)) %>%
  left_join(label.df, by = "cluster") # name the join key explicitly
ggplot(df,aes(x=x,y=y,color=cluster))+geom_point()+theme_minimal()+theme(legend.position="none") +
  ggrepel::geom_label_repel(data = label.df_2, aes(label = label))

Plotting z as a color with R on a rGoogleMap

I have a function of two variables and I want to plot only x and y, with z represented as a color. Is there a package that does this for me?
f = function(a,b){
  dnorm(a^2+b^2)
}
x = seq(-2, 2, 0.1)
y = seq(-2, 2, 0.1)
z = outer(x, y, f)
persp(x, y, z)
I want to plot this function on a map generated with rGoogleMaps. Maybe there is a more specific package for this use?
Something like this?
library(ggmap) # loads ggplot2 as well
library(RgoogleMaps) # for getGeoCode
london.center <- getGeoCode("London")
london <- get_map("London", zoom=12)
x <- seq(-2,2,0.1)
df <- expand.grid(x=x,y=x)
df$z <- with(df,f(x,y))
df$x <- london.center[2]+df$x/20
df$y <- london.center[1]+df$y/20
ggp <- ggmap(london)+
  geom_tile(data=df,aes(x=x,y=y,fill=z), alpha=0.2)+
  scale_fill_gradientn(guide="none",colours=rev(heat.colors(10)))+
  stat_contour(data=df, aes(x=x, y=y, z=z, color=..level..), geom="path", size=1)+
  scale_color_gradientn(colours=rev(heat.colors(10)))
plot(ggp)
This solution uses ggplot. Perhaps someone else will show you how to do this using RgoogleMaps.
Basically, we load the map, using get_map(...) (which is just a wrapper for GetMap(...) in the RgoogleMaps package).
Then we create the sample data frame df, which contains three columns, x, y, and z, and one row for every combination of x and y (this is the format required by ggplot).
Then we create the map layers. First the map itself, using ggmap(...); then a layer of tiles "filled" based on the value of z, using geom_tile(...); then a set of contour lines colored using the value of z, using stat_contour(geom="path",...). The rest of the code sets the fill and line colors and renders the map.
Purists will tell you that you can render the filled contours directly using stat_contour(geom="polygon",...), instead of using tiles, but this has the unfortunate effect of clipping any contours not completely enclosed in the plot area.
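To make the data-frame step concrete, here is a small sketch (without the map) showing that the long format built with expand.grid() holds the same values as the matrix that outer() produces in the question. The name z_mat is only illustrative.

```r
f <- function(a, b) dnorm(a^2 + b^2)
x <- seq(-2, 2, 0.1)

# Long format: one row per (x, y) combination, as ggplot expects
df <- expand.grid(x = x, y = x)
df$z <- with(df, f(x, y))

# Matrix format: element [i, j] is f(x[i], x[j]), as persp() expects
z_mat <- outer(x, x, f)

nrow(df)                           # one row per cell of the matrix
all.equal(df$z, as.vector(z_mat))  # same values in the same (column-major) order
```

Both layouts describe the same surface; ggplot layers such as geom_tile() take the long form, while persp() and plotly surfaces take the matrix.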
