I probably have a simple question, but I can't find a way to achieve what I need. I have a simple boxplot like the following:
library(plotly)
end_dt <- as.Date("2021-02-12")
start_dt <- end_dt - (nrow(iris) - 1)
dim(iris)
dates <- seq.Date(start_dt, end_dt, by="1 day")
df <- iris
df$LAST_VAL <- "N"
df[3, 'LAST_VAL'] <- "Y"
df1 <- df[,c("Sepal.Length","LAST_VAL")]
df1$DES <- 'Sepal.Length'
colnames(df1) <- c("VALUES","LAST_VAL","DES")
df2 <- df[,c("Sepal.Width","LAST_VAL")]
df2$DES <- 'Sepal.Width'
colnames(df2) <- c("VALUES","LAST_VAL","DES")
df <- rbind(df1, df2)
fig <- plot_ly(df, y = ~VALUES, color = ~DES, type = "box") %>% layout(showlegend = FALSE)
What I would like to do now is add a red marker to each box plot, only for the value where LAST_VAL = "Y". Given the distribution in each plot, this would let me see where the most recent value is located.
I tried to use the info on https://plotly.com/r/box-plots/ but I can't figure out how to do this.
Thanks
The following solution ended up a bit long code-wise, but it should give you what you asked for. I think the boxplots should be added afterwards, like this:
fig <- plot_ly(df[df$LAST_VAL=="Y",],
x=~DES, y = ~VALUES, color = ~DES, type = "scatter", colors='red') %>%
layout(showlegend = FALSE) %>%
add_boxplot(data = df[df$DES=="Sepal.Length",], x = ~DES, y = ~VALUES,
showlegend = F, color = ~DES,
boxpoints = F, fillcolor = 'white', line = list(color = c('blue'))) %>%
add_boxplot(data = df[df$DES=="Sepal.Width",], x = ~DES, y = ~VALUES,
showlegend = F, color = ~DES,
boxpoints = F, fillcolor = 'white', line = list(color = c('green')))
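A shorter variant may also work, sketched here under a few assumptions: the boxes are drawn first with DES on the x axis, and the flagged rows are overlaid with add_markers(). The data preparation is condensed from the question, and last_pts is a name introduced only for this illustration.

```r
library(plotly)

# Long-format data as in the question: one row per measurement.
df <- rbind(
  data.frame(VALUES = iris$Sepal.Length, DES = "Sepal.Length"),
  data.frame(VALUES = iris$Sepal.Width,  DES = "Sepal.Width")
)

# Flag the third observation of each variable, mirroring df[3, 'LAST_VAL'] <- "Y".
df$LAST_VAL <- "N"
df$LAST_VAL[c(3, nrow(iris) + 3)] <- "Y"

# Rows to highlight (hypothetical helper name).
last_pts <- df[df$LAST_VAL == "Y", ]

fig <- plot_ly(df, x = ~DES, y = ~VALUES, color = ~DES, type = "box") %>%
  # inherit = FALSE keeps the markers from picking up the color mapping of the boxes
  add_markers(data = last_pts, x = ~DES, y = ~VALUES,
              marker = list(color = "red", size = 10),
              inherit = FALSE, showlegend = FALSE) %>%
  layout(showlegend = FALSE)
fig
```

Because the markers share the categorical x axis with the boxes, each red point should land on top of its own box.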
How do I add multiple regression lines to the same plot in plotly?
I want to graph the scatter plot, as well as a regression line for each CATEGORY.
The scatter plot renders fine, but the regression lines are not drawn correctly (compared with the Excel output, see below).
df <- as.data.frame(1:19)
df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
df[,1] <- NULL
fv <- df %>%
filter(!is.na(x)) %>%
lm(x ~ y + y*CATEGORY,.) %>%
fitted.values()
p <- plot_ly(data = df,
x = ~x,
y = ~y,
color = ~CATEGORY,
type = "scatter",
mode = "markers"
) %>%
add_trace(x = ~y, y = ~fv, mode = "lines")
p
Apologies for not adding in all the information beforehand, and thanks for adding the suggestion of "y*CATEGORY" to fix the parallel line issue.
Excel Output
https://i.imgur.com/2QMacSC.png
R Output
https://i.imgur.com/LNypvDn.png
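Try this. The model should regress y on x (not x on y), and the fitted values should then be plotted against x: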
library(plotly)
df <- as.data.frame(1:19)
df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
df[,1] <- NULL
df$fv <- df %>%
filter(!is.na(x)) %>%
lm(y ~ x*CATEGORY,.) %>%
fitted.values()
p <- plot_ly(data = df,
x = ~x,
y = ~y,
color = ~CATEGORY,
type = "scatter",
mode = "markers"
) %>%
add_trace(x = ~x, y = ~fv, mode = "lines")
p
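As a quick sanity check on why y ~ x*CATEGORY gives one line per category: the interaction model lets every category have its own intercept and slope, so its fitted values should match those from fitting a separate lm() to each category. A minimal base-R sketch using the data from the question (fv_joint and fv_split are names introduced here for illustration):

```r
df <- data.frame(
  CATEGORY = c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B"),
  x = c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196),
  y = c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
)

# One model with a category-specific intercept and slope.
fv_joint <- unname(fitted(lm(y ~ x * CATEGORY, data = df)))

# Separate simple regressions, one per category.
fv_split <- numeric(nrow(df))
for (lvl in unique(df$CATEGORY)) {
  rows <- df$CATEGORY == lvl
  fv_split[rows] <- fitted(lm(y ~ x, data = df[rows, ]))
}

# Compare the two sets of fitted values up to numerical tolerance.
same_fit <- isTRUE(all.equal(fv_joint, fv_split))
same_fit
```

If the lines in plotly still differ from Excel's per-series trendlines after this change, the difference is more likely in the plotting than in the model.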
I am plotting the grouped boxplot with jittering with the following function:
plot_boxplot <- function(dat) {
# taking one row of each joined_group to be able to plot it
allx <- dat %>%
mutate(y = median(y, na.rm = TRUE)) %>%
group_by(joined_group) %>%
sample_n(1) %>%
ungroup()
p <- dat %>%
plotly::plot_ly() %>%
# plotting all the groups 1:20
plotly::add_trace(data = allx,
x = ~as.numeric(joined_group),
y = ~y,
type = "box",
hoverinfo = "none",
boxpoints = FALSE,
color = NULL,
opacity = 0,
showlegend = FALSE) %>%
# plotting the boxes
plotly::add_trace(data = dat,
x = ~as.numeric(joined_group),
y = ~y,
color = ~group1,
type = "box",
hoverinfo = "none",
boxpoints = FALSE,
showlegend = FALSE) %>%
# adding ticktext
layout(xaxis = list(tickvals = 1:20,
ticktext = rep(levels(dat$group1), each = 4)))
p <- p %>%
# adding jittering
add_markers(data = dat,
x = ~jitter(as.numeric(joined_group), amount = 0.2),
y = ~y,
color = ~group1,
showlegend = FALSE)
p
}
The problem is that when some of the levels have NA as the y value, the width of the jittered boxes changes. Here is an example:
library(plotly)
library(dplyr)
set.seed(123)
dat <- data.frame(group1 = factor(sample(letters[1:5], 100, replace = TRUE)),
group2 = factor(sample(LETTERS[21:24], 100, replace = TRUE)),
y = runif(100)) %>%
dplyr::mutate(joined_group = factor(
paste0(group1, "-", group2)
))
# do the plot with all the levels
p1 <- plot_boxplot(dat)
# now level "e" of group1 has NA for all its y values
dat$y[dat$group1 == "e"] <- NA
# create the plot with missing data
p2 <- plot_boxplot(dat)
# creating the subplot to see that the width has changed:
subplot(p1, p2, nrows = 2)
The problem is that the width of boxes in both plots is different:
I've realised that the boxes have the same size without jittering so I know that the jittering is "messing" with the width but I don't know how to fix that.
Does anyone know how to make the width in both jittered plots exactly the same?
I see two separate plot shifts: one due to jittering and one due to the NAs.
The first can be solved by declaring a new jitter function with a fixed seed
fixed_jitter <- function (x, factor = 1, amount = NULL) {
set.seed(42)
jitter(x, factor, amount)
}
and using it instead of jitter in the add_markers call.
The second problem can be solved by assigning -1 instead of NA and setting
yaxis = list(range = c(0, ~max(1.1 * y)))
as a second argument to layout.
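One caveat with the fixed-seed approach: set.seed() resets the global random number stream, which can affect any random code that runs afterwards. A minimal sketch of a variant that restores the previous RNG state on exit is shown here; jitter_reproducible and its seed argument are hypothetical names, not part of plotly or base R.

```r
# Jitter with a fixed seed, then put the global RNG state back as it was.
jitter_reproducible <- function(x, amount = 0.2, seed = 42) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_seed) {
    old_seed <- get(".Random.seed", envir = globalenv())
  }
  on.exit({
    if (has_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  jitter(x, amount = amount)
}

# Repeated calls give the same offsets, so two plots line up.
a <- jitter_reproducible(1:5)
b <- jitter_reproducible(1:5)
identical(a, b)
```

It can be dropped into the add_markers call in place of jitter, in the same way as fixed_jitter.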
This is similar to the question "Plot ellipse3d in R plotly?", but that one didn't give me exactly what I needed and I couldn't figure it out. I want to recreate rgl's ellipse3d and surface ellipsoid in plotly. I know there was an answer that plotted the ellipse, but as individual opaque markers. I need it as a surface ellipsoid that is slightly transparent, so I can still see the data points inside the ellipsoid.
I tried to work out how dww's suggestion to use "add_surface" instead would work, but couldn't figure it out. Can anyone help please?
if (!require("rgl")) install.packages("rgl")
dt <- cbind(x = rnorm(100), y = rnorm(100), z = rnorm(100))
ellipse <- ellipse3d(cov(dt))
plot3d(dt)
plot3d(ellipse, add = T, color = "red", alpha = 0.5)
dww's answer was:
if (!require("plotly")) install.packages("plotly")
if (!require("rgl")) install.packages("rgl")
dt <- cbind(x = rnorm(100), y = rnorm(100), z = rnorm(100))
ellipse <- ellipse3d(cov(dt))
p <- plot_ly(mode = 'markers') %>%
add_trace(type = 'scatter3d', size = 1,
x = ellipse$vb[1,], y = ellipse$vb[2,], z = ellipse$vb[3,],
opacity=0.01) %>%
add_trace(type = 'scatter3d', x = dt[,1], y = dt[,2], z = dt[,3])
p
# shows more obviously what dww's code does to create the visual ellipsoid
w <- plot_ly(mode = 'markers') %>%
add_trace(type = 'scatter3d',
x = ellipse$vb[1,], y = ellipse$vb[2,], z = ellipse$vb[3,],
opacity=0.5) %>%
add_trace(type = 'scatter3d', x = dt[,1], y = dt[,2], z = dt[,3])
w
Their comment on how to use add_surface was
Note that for simplicity, I plotted the ellipse as a cloud using markers. If you want to use add_surface instead, you will have to first convert the ellipse into a different format, with a vector of x locations, a vector of y locations, z as a matrix (dimensions equal to x by y). You'll also need to split the z values into two separate surface layers one for the top half of the ellipsoid and one for the bottom. I don't have time right now to do all this, but if you get stuck I can work this out later
This is my solution, if anyone is interested. It uses plotly buttons to toggle the ellipsoid on and off, so that you can still hover over and select data points inside the ellipsoid when desired:
if (!require("rgl")) install.packages("rgl", dependencies=TRUE, repos="http://cran.rstudio.com/")
if (!require("plotly")) install.packages("plotly", dependencies=TRUE, repos="http://cran.rstudio.com/")
dt <- cbind(x = rnorm(100), y = rnorm(100), z = rnorm(100))
ellipse <- ellipse3d(cov(dt))
updatemenus <- list(
list(
active = 0,
type= 'buttons',
buttons = list(
list(
label = "Ellipsoid",
method = "update",
args = list(list(visible = c(TRUE, TRUE)))),
list(
label = "No Ellipsoid",
method = "update",
args = list(list(visible = c(TRUE, FALSE)))))
)
)
plot<- plot_ly()%>%
# Plot raw scatter data points
add_trace(data = dt, x = dt[,1], y = dt[,2], z = dt[,3],
type = "scatter3d", mode = 'markers', marker = list(size = 3)) %>%
# Plot ellipsoid
add_trace(x=ellipse$vb [1,], y=ellipse$vb [2,], z=ellipse$vb [3,],
type='mesh3d', alphahull = 0, opacity = 0.4)%>%
# Buttons to toggle the ellipsoid
layout(updatemenus = updatemenus)
plot
Here is a possibility, using the mesh3d type, and with the help of the misc3d package.
pts <- cbind(x = rnorm(10), y = rnorm(10), z = rnorm(10))
C <- chol(cov(pts))
SVD <- svd(t(C))
A <- solve(t(SVD$u)) %*% diag(SVD$d)
cr <- colMeans(pts)
r <- sqrt(qchisq(0.95,3))
fx <- function(u,v){
cr[1] + r*(A[1,1]*cos(u)*cos(v) + A[1,2]*cos(u)*sin(v) + A[1,3]*sin(u))
}
fy <- function(u,v){
cr[2] + r*(A[2,1]*cos(u)*cos(v) + A[2,2]*cos(u)*sin(v) + A[2,3]*sin(u))
}
fz <- function(u,v){
cr[3] + r*(A[3,1]*cos(u)*cos(v) + A[3,2]*cos(u)*sin(v) + A[3,3]*sin(u))
}
library(misc3d)
tris <- parametric3d(fx, fy, fz,
umin=-pi/2, umax=pi/2, vmin=0, vmax=2*pi,
n=100, engine="none")
n <- nrow(tris$v1)
cont <- matrix(NA_real_, ncol=3, nrow=3*n)
cont[3*(1:n)-2,] <- tris$v1
cont[3*(1:n)-1,] <- tris$v2
cont[3*(1:n),] <- tris$v3
idx <- matrix(0:(3*n-1), ncol=3, byrow=TRUE)
library(plotly)
p <- plot_ly() %>%
add_trace(type = "mesh3d",
x = cont[,1], y = cont[,2], z = cont[,3],
i = idx[,1], j = idx[,2], k = idx[,3],
opacity = 0.3) %>%
add_trace(type = "scatter3d", mode = "markers",
data = as.data.frame(pts),
x = ~x, y = ~y, z = ~z,
marker = list(size = 5)) %>%
layout(scene = list(aspectmode = "data"))
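To see why this parametrization produces the 95% ellipsoid, one can check that every point cr + r * A %*% s, with s on the unit sphere, has squared Mahalanobis distance r^2 = qchisq(0.95, 3) from the centre. A small base-R sketch of that check follows; the seed, the 50 sample angles, and the names Sigma, S, P and d2 are introduced here only for illustration, and A is written as U %*% diag(d), which equals solve(t(U)) %*% diag(d) because U is orthogonal.

```r
set.seed(1)
pts <- cbind(x = rnorm(10), y = rnorm(10), z = rnorm(10))

Sigma <- cov(pts)
SVD <- svd(t(chol(Sigma)))
A <- SVD$u %*% diag(SVD$d)   # maps the unit sphere onto the covariance ellipsoid
cr <- colMeans(pts)
r <- sqrt(qchisq(0.95, 3))

# Random points on the unit sphere, same (u, v) convention as fx, fy, fz.
u <- runif(50, -pi / 2, pi / 2)
v <- runif(50, 0, 2 * pi)
S <- rbind(cos(u) * cos(v), cos(u) * sin(v), sin(u))

# Points on the ellipsoid, one per row.
P <- t(cr + r * A %*% S)

# Squared Mahalanobis distance of each point from the centre.
d2 <- unname(mahalanobis(P, center = cr, cov = Sigma))
on_ellipsoid <- isTRUE(all.equal(d2, rep(r^2, length(d2))))
on_ellipsoid
```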
To add some colors:
midpoints <- (tris$v1 + tris$v2 + tris$v3)/3
distances <- apply(midpoints, 1, function(x) crossprod(x-cr))
intervals <- cut(distances, 256)
colorsPalette <- viridisLite::viridis(256)
colors <- colorsPalette[as.integer(intervals)]
p <- plot_ly() %>%
add_trace(type = "mesh3d",
x = cont[,1], y = cont[,2], z = cont[,3],
i = idx[,1], j = idx[,2], k = idx[,3],
facecolor = colors,
opacity = 0.3) %>%
add_trace(type = "scatter3d", mode = "markers",
data = as.data.frame(pts),
x = ~x, y = ~y, z = ~z,
marker = list(size = 5)) %>%
layout(scene = list(aspectmode = "data"))
Another solution with the Rvcg package. We use the vcgSphere function which generates a triangulated sphere.
sphr <- Rvcg::vcgSphere() # triangulated sphere
library(rgl) # to use scale3d and transform3d
ell <- scale3d(transform3d(sphr, A), r, r, r)
vs <- ell$vb[1:3,] + cr
idx <- ell$it - 1
p <- plot_ly() %>%
add_trace(type="mesh3d",
x = vs[1,], y = vs[2,], z = vs[3,],
i = idx[1,], j = idx[2,], k = idx[3,],
opacity = 0.3) %>%
add_trace(type = "scatter3d", mode = "markers",
data = as.data.frame(pts),
x = ~x, y = ~y, z = ~z,
marker = list(size = 5)) %>%
layout(scene = list(aspectmode = "data"))
I'm trying to add legends with arbitrary text to a ggvis plot that uses data from different data frames. I have tried add_legend(), but I have no idea which parameters to use. With plot() this is very simple using the legend() function, but it has been very hard to find a way to do it with ggvis().
Here is a simple example of what I have using plot():
df1 = data.frame(x = sample(1:10), y = sample(1:10))
df2 = data.frame(x = 1:10, y = 1:10)
df3 = data.frame(x = 1:10, y = sqrt(1:10))
plot(df1)
lines(df2$x, df2$y, col = "red")
lines(df3$x, df3$y, col = "green")
legend("topleft", c("Data 2","Data 3"), lty = 1, col = c("red","green"))
Now, using ggvis(), I can plot the points and the lines from the different datasets, but I cannot find a way to add the legend using add_legend(). Here is the code using ggvis():
df1 %>% ggvis(x=~x,y=~y) %>% layer_points() %>%
layer_paths(x=~x,y=~y,data = df2, stroke := "red") %>%
layer_paths(x=~x,y=~y,data = df3, stroke := "green")
I will really appreciate any help.
Thank you.
Edited:
This is sample code using only one data frame and plot():
df = data.frame(x = sample(1:10), y = sample(1:10), x2 = 1:10, y2 = 1:10, y3 = sqrt(1:10) )
plot(df[,c("x","y")])
lines(df$x2, df$y2, col = "red")
lines(df$x2, df$y3, col = "green")
legend("topleft", c("Data 2","Data 3"), lty = 1, col = c("red","green"))
So, what I came up with, is the following, which works:
#add an id column for df2 and df3 and then rbind
df2$id <- 1
df3$id <- 2
df4 <- rbind(df2,df3)
#turn id into a factor
df4$id <- factor(df4$id)
#then plot df4 using the stroke=~id argument
#then plot the legend
#and finally add df1 with a separate data
df4 %>% ggvis(x=~x,y=~y,stroke=~id) %>% layer_lines() %>%
add_legend('stroke', orient="left") %>%
layer_points(x=~x,y=~y,data = df1,stroke:='black')
And it works:
If you would like to move the legend to a position inside the plot then you need to try this:
df4 %>% ggvis(x=~x,y=~y,stroke=~id) %>% layer_lines() %>%
#make sure you call add_relative_scales()
add_relative_scales() %>%
#values for x and y need to be between 0 and 1
#e.g. for the x-axis 0 is the far-left point and 1 is the far-right point
add_legend("stroke", title = "Cylinders",
properties = legend_props(
legend = list(
x = scaled_value("x_rel", 0.1),
y = scaled_value("y_rel", 1)
))) %>%
layer_points(x=~x,y=~y,data = df1,stroke:='black')
And the output: