using add_tooltip in ggvis to print name when mouse hovers - r

My data looks something like this:
df = data.frame(name = c("A1", "A2"),
                x = c(2, 4),
                y = c(2, 5),
                sector = c("blue", "red"))
I am trying to use ggvis to create a graph but I am not able to make the tooltip work.
library(ggvis)
df %>%
  ggvis(~x, ~y, size := 100, opacity := 0.4) %>%
  layer_points(fill = ~sector) %>%
  add_tooltip(function(df) df$name)
When I hover the mouse over a point, df$name does not appear. What am I doing wrong?
Thanks!

The helpfile for add_tooltip has a clue:
The data sent from client to the server contains only the data columns
that are used in the plot. If you want to get other columns of data,
you should to use a key to line up the item from the plot with a row
in the data.
My fix below adapts the example from that helpfile.
library(ggvis)
df = data.frame(name = c("A1", "A2"),
                x = c(2, 4),
                y = c(2, 5),
                sector = c("blue", "red"))
# Add a unique id column
df$id <- 1:nrow(df)
# Define a tooltip function, which grabs the data from the original df, not the plot
tt <- function(x) {
  if(is.null(x)) return(NULL)
  # match the id from the plot to that in the original df
  row <- df[df$id == x$id, ]
  return(row$name)
}
# in the definition of the plot we include a key, mapped to our id variable
df %>%
  ggvis(~x, ~y, key := ~id, size := 100, opacity := 0.4) %>%
  layer_points(fill = ~sector) %>%
  add_tooltip(tt, "hover")
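The same key lookup can return more than one column. The sketch below is only an illustration: `tt_html` is a hypothetical name, and the list passed in at the end is an assumed stand-in for what the client sends on hover (the plotted columns plus the key). It relies on the documented behaviour that the tooltip function returns a string of HTML or NULL.

```r
# Same data as above, with the unique id column used as the key
df <- data.frame(name = c("A1", "A2"),
                 x = c(2, 4),
                 y = c(2, 5),
                 sector = c("blue", "red"))
df$id <- seq_len(nrow(df))

# Hypothetical tooltip function: looks up the hovered row by id and
# formats two columns of the original data as one HTML string
tt_html <- function(x) {
  if (is.null(x) || is.null(x$id)) return(NULL)
  row <- df[df$id == x$id, ]
  paste0("<b>", row$name, "</b><br>sector: ", row$sector)
}

# Simulate hovering over the second point (assumed payload shape)
tt_html(list(x = 4, y = 5, id = 2))
```

Passing `tt_html` in place of `tt` in `add_tooltip()` should then show the name in bold with the sector underneath.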

Related

R: How to customize Sankey plot in ggplotly?

I have sales data by year and model, which I visualize with a Sankey chart. I am struggling with two issues:
First, I need model B to always sit at the bottom of the chart, regardless of its value over the years.
Second, when I convert the ggplot with ggplotly, the hover text does not show sales or years.
code:
df <- data.frame (model = c("A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J"),
Year = c(2015,2015,2015,2015,2015,2015,2015,2015,2015,2015,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2018,2018,2018,2018,2018,2018,2018,2018,2018,2018,2019,2019,2019,2019,2019,2019,2019,2019,2019,2019,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020),
sales = c(450,678,456,344,984,456,234,244,655,789,234,567,234,567,232,900,1005,1900,450,345,567,235,456,345,144,333,555,777,111,444,222,223,445,776,331,788,980,1003,456,434,345,2222,3456,456,678,8911,4560,4567,4566,5555,6666,7777,8888,1233,1255,5677,3411,2344,6122,4533))
# install.packages("remotes")
# remotes::install_github("davidsjoberg/ggsankey")
library(ggsankey)
library(tidyverse)
library(plotly) # provides ggplotly()
plot <- ggplot(df, aes(x = Year,
node = model,
fill = model,
value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16)
ggplotly(plot)
I'm absolutely certain that there is a better way, but it took me a while to get it working. I think this is what you were looking for.
I started with the ggplot and ggplotly objects that you have here. The primary purpose of this initial plot is to capture the colors. (I could have captured them a few different ways, but this was already done for me in your plot.)
Update: I've modified the two elements you requested.
library(ggsankey)
library(tidyverse)
library(plotly)
# df from the question is unchanged
# visualize the original
(plot <- ggplot(df,
aes(Year, node = model, fill = model, value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial",
color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16))
ggplotly(plot) -> plp
plp
#-------- colors --------
# collect the 10 colors
cols <- map_dfr(1:10, function(k){
nm <- plp$x$data[[k]]$name
filler <- plp$x$data[[k]]$fillcolor
c(nm = nm, filler = filler)
})
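To see what that collection step produces without building the plot, here is a small base R sketch on a mocked trace list. The list `traces`, the name `cols_demo`, and the colour strings are all made up for illustration; the real values come from `plp$x$data`.

```r
# Mocked stand-in for plp$x$data: each trace carries a name and a fillcolor
# (colour values are invented for this example)
traces <- list(
  list(name = "A", fillcolor = "rgba(0,0,4,0.8)"),
  list(name = "B", fillcolor = "rgba(28,16,68,0.8)"),
  list(name = "C", fillcolor = "rgba(79,18,123,0.8)")
)

# One row per trace: the model name and the fill colour assigned to it
cols_demo <- do.call(rbind, lapply(traces, function(tr) {
  data.frame(nm = tr$name, filler = tr$fillcolor, stringsAsFactors = FALSE)
}))

# Look up the colour for one model, as the later steps do
cols_demo$filler[cols_demo$nm == "B"]
```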
Then I divided the contents of the B model into 10 groups to ensure it was always the smallest bump. This allowed me to collect the stacked values for all of the other models, which is needed to push B to the bottom.
#-------------- splitting B -------------
df1 <- df %>% filter(model != "B") %>%
arrange(Year, sales)
df2 <- df %>% filter(model == "B") %>% # this gets used further down
arrange(Year)
# split B into 10 groups - keep on the bottom, then join the groups
# make the groups
ng <- vector(length = 10)
invisible(
map(1:10,
function(i) {
ng[i] <<- rep("B", i) %>% paste0(collapse = "")
})
)
# add values for these groups by year
df4 <- data.frame(Year = rep(unique(df$Year), each = 10),
model = rep(ng, length(unique(df$Year))),
sales = rep(df2$sales/10, each = 10))
df5 <- rbind(df1, df4)
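A quick way to convince yourself the split is harmless is to check it on a toy data set. This is a sketch in base R with made-up names (`toy`, `ng_demo`, `split_b`); it also builds the "B", "BB", ... labels with `strrep()`, which is an alternative to the loop above rather than what the answer uses.

```r
# Small stand-in for the question's data: three models over two years
toy <- data.frame(model = rep(c("A", "B", "C"), times = 2),
                  Year  = rep(c(2015, 2016), each = 3),
                  sales = c(450, 678, 456, 234, 567, 234))

b_rows <- toy[toy$model == "B", ]
b_rows <- b_rows[order(b_rows$Year), ]

# "B", "BB", ..., ten B's: strrep() is vectorised over `times`
ng_demo <- strrep("B", 1:10)

# Ten equal slices of B per year
split_b <- data.frame(Year  = rep(b_rows$Year, each = 10),
                      model = rep(ng_demo, times = nrow(b_rows)),
                      sales = rep(b_rows$sales / 10, each = 10))

# Per-year totals of the ten slices should equal the original B sales
totals <- tapply(split_b$sales, split_b$Year, sum)
all.equal(as.vector(totals), b_rows$sales)
```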
Recreate the Sankey bump with 10 subsections of model B. Everything that follows works with this plot.
#-------------- plotly after dividing B -------------
(nplt <- ggplot(df5, aes(x = Year, node = model, fill = model, value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial",
color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16))
ggplotly(nplt) -> plt
plt
Create a Sankey bump with JUST B, to capture data that represents model B at the bottom. Use this data to substitute all of the traces that represent B in the object plt. The colors get fixed here, as well. (The original 10 colors from the first plot.) Lastly, the hoverinfo gets removed. That will get fixed next.
#-------------- get values for B at the bottom -------------
df %>% filter(model == "B") %>%
ggplot(aes(x = Year,
node = model,
fill = model,
value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16) -> bplt
ggplotly(bplt) -> bplotly
bplotly
#------- take divided B and remove all but one trace for B --------
# xx <- plt$x$data
# plt$x$data <- xx[c(1:2, 12:length(xx))] # keep only one B trace
#---------------- adjustments to plt's build --------------------
# change out data for the B trace, add the right colors
wh <- vector(length = 0)
invisible(
  map(1:length(plt$x$data),
      function(j) {
        nm <- plt$x$data[[j]]$name
        plt$x$data[[j]]$hoverinfo <<- "none"
        plt$x$data[[j]]$fillcolor <<- unlist(cols[cols$nm == nm, "filler"],
                                             use.names = F)
        if(str_detect(nm, "^B$")){
          plt$x$data[[j]]$x <<- bplotly$x$data[[1]]$x
          plt$x$data[[j]]$y <<- bplotly$x$data[[1]]$y
        }
        if(str_detect(nm, "BB")) {
          wh[length(wh) + 1] <<- j # list of unnecessary traces (extra B groups)
        }
      })
)
#----- take divided B and remove all but one trace for B ------
plt$x$data <- plt$x$data[-c(wh)] # <------ this line was missing from the previous update
# visualize Sankey bump with B at the bottom
plt
The Plotly object is basically 10 globs of color; there is no separation between years in the background. So if you add a tooltip to this as it is, there can be only one per model.
To get the tooltips you're looking for, I created another trace (well, 10, actually—1 for each model). In order to get the right values (because the sales data isn't in the 50K range), I used the data in plt to create a new data frame.
#--------------- collect values for hovertext positions ----------
x <- plt$x$data[[1]]$x
inds <- which(x %in% 2015:2020, arr.ind = T)
yrs <- x[inds]
tellMe <- invisible(
  map(1:length(plt$x$data),
      function(m) {
        y <- plt$x$data[[m]]$y
        y[inds]
      }) %>% setNames(sort(unique(df$model))) %>% # changed from LETTERS[1:10]
    as.data.frame() %>%
    mutate(yr = yrs %>% as.integer()) %>%
    pivot_longer(names_to = "model", values_to = "sales",
                 cols = sort(unique(df$model))) %>%
    distinct() %>%
    group_by(yr, model) %>%
    summarise(val = mean(sales)) %>%
    left_join(df, by = c("yr" = "Year", "model" = "model")) %>%
    as.data.frame() # drop groups
)
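The core of that step is picking out the points of each smoothed trace that sit exactly on a year. A minimal sketch on mocked values (the vectors `x_demo` and `y_demo` are invented; real ones come from `plt$x$data[[k]]$x` and `$y`):

```r
# Mocked x/y of one smoothed trace: fractional x values fall between the years
x_demo <- c(2015, 2015.5, 2016, 2016.5, 2017)
y_demo <- c(100, 140, 180, 150, 120)

# Positions where x is exactly one of the years in the data
inds_demo <- which(x_demo %in% 2015:2017)

# Year and the trace's y position at that year, ready for a hover trace
data.frame(yr = as.integer(x_demo[inds_demo]), val = y_demo[inds_demo])
```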
#-------------- create data trace for hovertext --------------
plot_ly(tellMe, x = ~yr, y = ~val, split = ~model,
customdata = ~sales, text = ~model,
line = list(width = .01, shape = "spline", smoothing = 1.3),
hovertemplate = "Year: %{x}<br>Model: %{text}<br>Sales: %{customdata}<extra></extra>",
type = "scatter", mode = "lines", showlegend = F) -> pp2
pp2
If you look at the plot here, it looks blank. That's because of how small the lines are. This is intentional. You don't want lines on your graph.
Fix the colors, so that the hoverlabel background colors match the legend colors.
# change colors to match sankey
pp2 <- plotly_build(pp2)
invisible(
map(1:10,
function(z) {
nm <- pp2$x$data[[z]]$name
# collect and assign the color
cr <- unlist(cols[cols$nm == nm, "filler"], use.names = F)
pp2$x$data[[z]]$line$color <<- cr
})
)
Using subplot here didn't work. Plotly gave me an error when I tried adding a trace, whether all at once or even one for each model. So I forced the traces together.
#-------------- consolidate the traces (subplot won't work) -----------
# collect data one more time!
dx <- plt$x$data
yx <- pp2$x$data
yx <- append(yx, dx) # put plt on top
# replace data
plt$x$data <- yx
# lines are small, increase the distance searched for matches
plt %>% layout(hoverdistance = 40)

Adding drop down menu to plotly line graph with R

I am looking to add a dropdown to a plotly graph that highlights a chosen id by changing that id's line to a different color. An example of code that produces the type of chart I have in mind is
library(tidyverse)
library(plotly)
set.seed(2001)
## make 10 random data series
data <- data.frame(
t = rep(1:50, 10),
id = sort(rep(1:10, 50)),
y = rnorm(500)
)
## cumulate the random variates by group
data <- data %>%
group_by(id) %>%
mutate(y2 = cumsum(y))
## plot
plot <- data %>%
plot_ly(x = ~ t, y = ~y2, text = ~id) %>%
add_lines(split = ~id, color = I('gray'), alpha = 0.5) %>%
hide_legend()
plot
What I'm looking to do is create a dropdown so that I can select, say, id = 3 and change the line corresponding to id = 3 to blue, keeping all others gray. I found how to change overall styles with dropdowns, but couldn't find this specific case. Thanks in advance.
Andrew
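One possible approach, offered only as a sketch and not as an answer from the original thread, is a plotly `updatemenus` dropdown whose buttons use the `restyle` method to set `line.color` for every trace at once. It assumes one trace per id in the order 1 to 10, and the names `ids`, `buttons`, and `demo` are made up here.

```r
ids <- 1:10  # assumption: one trace per id, in this order

# One button per id: restyle every trace's line colour, blue for the chosen id
buttons <- lapply(ids, function(i) {
  line_cols <- ifelse(ids == i, "blue", "gray")
  list(method = "restyle",
       args   = list("line.color", as.list(line_cols)),
       label  = paste("id", i))
})

# Attach the dropdown to a plot like the one above (only if plotly is installed)
if (requireNamespace("plotly", quietly = TRUE)) {
  set.seed(2001)
  demo <- data.frame(t = rep(1:50, 10), id = rep(ids, each = 50))
  demo$y2 <- ave(rnorm(500), demo$id, FUN = cumsum)  # cumulative sum per id
  p <- plotly::plot_ly(demo, x = ~t, y = ~y2)
  p <- plotly::add_lines(p, split = ~id, color = I("gray"), alpha = 0.5)
  p <- plotly::layout(p, showlegend = FALSE,
                      updatemenus = list(list(type = "dropdown",
                                              buttons = buttons)))
  p
}
```

Each button carries a full vector of colours, so choosing a new id also resets the previously highlighted line to gray.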

Stacked bar graphs in plotly: how to control the order of bars in each stack

I'm trying to order a stacked bar chart in plotly, but it is not respecting the order I pass it in the data frame.
It is best shown using some mock data:
library(dplyr)
library(plotly)
cars <- sapply(strsplit(rownames(mtcars), split = " "), "[", i = 1)
dat <- mtcars
dat <- cbind(dat, cars, stringsAsFactors = FALSE)
dat <- dat %>%
mutate(carb = factor(carb)) %>%
distinct(cars, carb, .keep_all = TRUE) %>% # keep mpg for the select() below
select(cars, carb, mpg) %>%
arrange(carb, desc(mpg))
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = cars) %>%
layout(barmode = "stack")
The resulting plot doesn't respect the ordering. I want the cars with the largest mpg stacked at the bottom of each carb group. Any ideas?
As already pointed out here, the issue is caused by having duplicate values in the column used for color grouping (in this example, cars). As indicated already, the ordering of the bars can be remedied by grouping your colors by a column of unique names. However, doing so will have a couple of undesired side-effects:
different model cars from the same manufacturer would be shown with different colors (not what you are after - you want to color by manufacturer)
the legend will have more entries in it than you want i.e. one per model of car rather than one per manufacturer.
We can hack our way around this by a) creating the legend from a dummy trace that never gets displayed (add_trace(type = "bar", x = 0, y = 0... in the code below), and b) setting the colors for each category manually using the colors= argument. I use a rainbow palette below to show the principle. You may like to select some more attractive colours yourself.
dat$unique.car <- make.unique(as.character(dat$cars))
dat2 <- data.frame(cars=levels(as.factor(dat$cars)),color=rainbow(nlevels(as.factor(dat$cars))))
dat2[] <- lapply(dat2, as.character)
dat$color <- dat2$color[match(dat$cars,dat2$cars)]
plot_ly() %>%
add_trace(data=dat2, type = "bar", x = 0, y = 0, color = cars, colors=color, showlegend=T) %>%
add_trace(data=dat, type = "bar", x = carb, y = mpg, color = unique.car, colors=color, showlegend=F, marker=list(line=list(color="black", width=1))) %>%
layout(barmode = "stack", xaxis = list(range=c(0.4,8.5)))
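The bookkeeping behind that hack (a unique label per bar, but one colour per manufacturer) can be checked without plotting. A base R sketch with invented manufacturer names standing in for `dat$cars`; `palette_demo` and `bar_colour` are hypothetical names:

```r
# A few manufacturer names with repeats, standing in for dat$cars
cars_demo <- c("Merc", "Mazda", "Merc", "Fiat", "Mazda")

# Unique label per bar, so each bar can become its own trace
unique_car <- make.unique(cars_demo)

# One colour per manufacturer, then matched back to every row
makers <- sort(unique(cars_demo))
palette_demo <- setNames(rainbow(length(makers)), makers)
bar_colour <- unname(palette_demo[cars_demo])

data.frame(cars = cars_demo, unique.car = unique_car, color = bar_colour)
```

Rows that share a manufacturer get the same colour even though their `unique.car` labels differ.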
One way to address this is to give unique names to all models of car and use that in plotly, but it's going to make the legend messier and impact the color mapping. Here are a few options:
dat$carsID <- make.unique(as.character(dat$cars))
# dat$carsID <- apply(dat, 1, paste0, collapse = " ") # alternative
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID) %>%
layout(barmode = "stack")
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID,
colors = rainbow(length(unique(carsID)))) %>%
layout(barmode = "stack")
I'll look more tomorrow to see if I can improve the legend and color mapping.

Make dual X-axes based on different variables using ggvis

df <- data.frame(X1 = rep(1:5,1), X2 = rep(4:8,1), var1 = sample(1:10,5), row.names = c(1:5))
library("ggvis")
graph <- df %>%
ggvis(~X1) %>%
layer_lines(y = ~ var1) %>%
add_axis("y", orient = "left", title = "var1") %>%
add_axis("x", orient = "bottom", title = "X1") %>%
add_axis("x", orient = "top", title = "X2" )
graph
Obviously, the top x-axis (X2) is not correct here since it refers to the same variable as X1. I know how to create a scaled dual y-axis in ggvis. But how can I create a similar dual axis on different X variables? The two x-axes should refer to different variables (X1 and X2 in this example).
I know a dual x-axis could be a really BAD idea. But one of my working datasets may need it. Any comments and suggestions are appreciated!
The second axis needs to have a 'name' in order for the axis to know which variable to reflect. See below:
df <- data.frame(X1 = rep(1:5,1),
X2 = rep(4:8,1),
var1 = sample(1:10,5),
row.names = c(1:5))
library("ggvis")
df %>%
ggvis(~X1) %>%
#this is the line plotted
layer_lines(y = ~ var1) %>%
#and this is the bottom axis as plotted normally
add_axis("x", orient = "bottom", title = "X1") %>%
#now we add a second axis and we name it 'x2'. The name is given
#at the scale argument
add_axis("x", scale = 'x2', orient = "top", title = "X2" ) %>%
#and now we plot the second x-axis using the name created above
#i.e. scale='x2'
layer_lines(prop('x' , ~X2, scale='x2'))
And as you can see the top x-axis reflects your X2 variable and ranges between 4 and 8.
Also, as a side note: You don't need rep(4:8,1) to create a vector from 4 to 8. Just use 4:8 which returns the same vector.
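The side note is easy to verify in the console (a trivial check, included for completeness):

```r
# rep(x, 1) repeats the vector once, so it is the same integer vector
identical(rep(4:8, 1), 4:8)
```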

ggvis - add_legend with multiple data and position legend inside graph

I'm trying to add legends with arbitrary text to a ggvis plot using data from different data frames. I have tried using add_legend() but I have no idea which parameters to use. With plot() it is very simple using the legend() function, but it has been very hard to find a way to do it with ggvis().
Here is a simple example of what I have using plot():
df1 = data.frame(x = sample(1:10), y = sample(1:10))
df2 = data.frame(x = 1:10, y = 1:10)
df3 = data.frame(x = 1:10, y = sqrt(1:10))
plot(df1)
lines(df2$x, df2$y, col = "red")
lines(df3$x, df3$y, col = "green")
legend("topleft", c("Data 2","Data 3"), lty = 1, col = c("red","green"))
Now, using ggvis() I can plot the points and the lines from different datasets, but I cannot find a way to add the legends using add_legend(). Here is the code using ggvis():
df1 %>% ggvis(x=~x,y=~y) %>% layer_points() %>%
layer_paths(x=~x,y=~y,data = df2, stroke := "red") %>%
layer_paths(x=~x,y=~y,data = df3, stroke := "green")
I will really appreciate any help.
Thank you.
Edited:
This is sample code using only one data frame and plot():
df = data.frame(x = sample(1:10), y = sample(1:10), x2 = 1:10, y2 = 1:10, y3 = sqrt(1:10) )
plot(df[,c("x","y")])
lines(df$x2, df$y2, col = "red")
lines(df$x2, df$y3, col = "green")
legend("topleft", c("Data 2","Data 3"), lty = 1, col = c("red","green"))
So, what I came up with, is the following, which works:
#add an id column for df2 and df3 and then rbind
df2$id <- 1
df3$id <- 2
df4 <- rbind(df2,df3)
#turn id into a factor
df4$id <- factor(df4$id)
#then plot df4 using the stroke=~id argument
#then plot the legend
#and finally add df1 with a separate data
df4 %>% ggvis(x=~x,y=~y,stroke=~id) %>% layer_lines() %>%
add_legend('stroke', orient="left") %>%
layer_points(x=~x,y=~y,data = df1,stroke:='black')
If you would like to move the legend to a position inside the plot then you need to try this:
df4 %>% ggvis(x=~x,y=~y,stroke=~id) %>% layer_lines() %>%
  # make sure you add relative scales
  add_relative_scales() %>%
  # values for x and y need to be between 0 and 1
  # e.g. for the x-axis, 0 is the far-left point and 1 the far-right
  add_legend("stroke", title = "Cylinders",
             properties = legend_props(
               legend = list(
                 x = scaled_value("x_rel", 0.1),
                 y = scaled_value("y_rel", 1)
               ))) %>%
  layer_points(x=~x,y=~y,data = df1,stroke:='black')
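With the id coded as 1 and 2, the legend entries would read 1 and 2 rather than the "Data 2" / "Data 3" text from the plot() version. One way to get the arbitrary text is to label the factor levels before plotting. This is a sketch of the data step only, and it assumes ggvis takes the legend labels for `stroke = ~id` from the factor levels:

```r
df2 <- data.frame(x = 1:10, y = 1:10)
df3 <- data.frame(x = 1:10, y = sqrt(1:10))

# Tag each data frame, then stack them as in the answer
df2$id <- 1
df3$id <- 2
df4 <- rbind(df2, df3)

# Label the levels so a legend keyed on id can read "Data 2" / "Data 3"
df4$id <- factor(df4$id, levels = c(1, 2), labels = c("Data 2", "Data 3"))

table(df4$id)
```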
