I have sales data by year and model, which I visualize as a Sankey bump chart. I am struggling with two issues:
First, I need model B to always sit at the bottom of the chart, regardless of its value over the years.
Second, when I convert the ggplot with ggplotly, the hover does not show sales or years.
Code:
df <- data.frame (model = c("A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J"),
Year = c(2015,2015,2015,2015,2015,2015,2015,2015,2015,2015,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2018,2018,2018,2018,2018,2018,2018,2018,2018,2018,2019,2019,2019,2019,2019,2019,2019,2019,2019,2019,2020,2020,2020,2020,2020,2020,2020,2020,2020,2020),
sales = c(450,678,456,344,984,456,234,244,655,789,234,567,234,567,232,900,1005,1900,450,345,567,235,456,345,144,333,555,777,111,444,222,223,445,776,331,788,980,1003,456,434,345,2222,3456,456,678,8911,4560,4567,4566,5555,6666,7777,8888,1233,1255,5677,3411,2344,6122,4533))
install.packages("remotes")
#remotes::install_github("davidsjoberg/ggsankey")
library(ggsankey)
library(tidyverse)
library(plotly) # needed for ggplotly()
plot <- ggplot(df, aes(x = Year,
node = model,
fill = model,
value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16)
ggplotly(plot)
I'm sure there is a better way, and it took me a while to get this working, but I think it is what you were looking for.
I started with the ggplot and ggplotly objects that you have here. The primary purpose of this initial plot is to capture the colors. (I could have captured them a few different ways, but this was already done for me in your plot.)
Update: I've modified the two elements you requested.
library(ggsankey)
library(tidyverse)
library(plotly)
# df from the question is unchanged
# visualize the original
(plot <- ggplot(df,
aes(Year, node = model, fill = model, value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial",
color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16))
ggplotly(plot) -> plp
plp
#-------- colors --------
# collect the 10 colors
cols <- map_dfr(1:10, function(k){
nm <- plp$x$data[[k]]$name
filler <- plp$x$data[[k]]$fillcolor
c(nm = nm, filler = filler)
})
Then I divided the contents of the B model into 10 groups to ensure it was always the smallest bump. This allowed me to collect the stacked values for all of the other models, which is needed to push B to the bottom.
#-------------- splitting B -------------
df1 <- df %>% filter(model != "B") %>%
arrange(Year, sales)
df2 <- df %>% filter(model == "B") %>% # this gets used further down
arrange(Year)
# split B into 10 groups - keep on the bottom, then join the groups
# make the groups
# labels "B", "BB", ..., "BBBBBBBBBB": one per slice of B
ng <- strrep("B", 1:10)
# add values for these groups by year
df4 <- data.frame(Year = rep(unique(df$Year), each = 10),
model = rep(ng, length(unique(df$Year))),
sales = rep(df2$sales/10, each = 10))
df5 <- rbind(df1, df4)
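As a quick sanity check on the split above, here is a minimal base R sketch on a three-year subset of B's sales. It confirms that dividing each year's value into ten equal slices leaves the yearly totals unchanged. The names `b`, `n_slices`, `slices` and `totals` are illustrative only and do not appear in the code above.

```r
# B's sales for 2015-2017, taken from the question's data
b <- data.frame(Year = c(2015, 2016, 2017), sales = c(678, 567, 235))

n_slices <- 10
labels <- strrep("B", seq_len(n_slices))  # "B", "BB", ..., ten B's

# one row per (year, slice), each carrying a tenth of that year's sales
slices <- data.frame(
  Year  = rep(b$Year, each = n_slices),
  model = rep(labels, times = nrow(b)),
  sales = rep(b$sales / n_slices, each = n_slices)
)

# summing the slices per year should give back the original values
totals <- aggregate(sales ~ Year, data = slices, FUN = sum)
stopifnot(isTRUE(all.equal(totals$sales, b$sales)))
```

Because every slice is a tenth of B's value for that year, each slice is smaller than B as a whole, which is what the splitting step relies on.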
Recreate the Sankey bump with 10 subsections of model B. Everything that follows works with this plot.
#-------------- plotly after dividing B -------------
(nplt <- ggplot(df5, aes(x = Year, node = model, fill = model, value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial",
color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16))
ggplotly(nplt) -> plt
plt
Create a Sankey bump with JUST B, to capture data that represents model B at the bottom. Use this data to substitute all of the traces that represent B in the object plt. The colors get fixed here, as well. (The original 10 colors from the first plot.) Lastly, the hoverinfo gets removed. That will get fixed next.
#-------------- get values for B at the bottom -------------
df %>% filter(model == "B") %>%
ggplot(aes(x = Year,
node = model,
fill = model,
value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16) -> bplt
ggplotly(bplt) -> bplotly
bplotly
#------- take divided B and remove all but one trace for B --------
# xx <- plt$x$data
# plt$x$data <- xx[c(1:2, 12:length(xx))] # keep only one B trace
#---------------- adjustments to plt's build --------------------
# change out data for the B trace, add the right colors
wh <- vector(length = 0)
invisible(
map(1:length(plt$x$data),
function(j) {
nm <- plt$x$data[[j]]$name
plt$x$data[[j]]$hoverinfo <<- "none"
plt$x$data[[j]]$fillcolor <<- unlist(cols[cols$nm == nm, "filler"],
use.names = F)
if(str_detect(nm, "^B$")){
plt$x$data[[j]]$x <<- bplotly$x$data[[1]]$x
plt$x$data[[j]]$y <<- bplotly$x$data[[1]]$y
}
if(str_detect(nm, "BB")) {
wh[length(wh) + 1] <<- j # list of unnecessary traces (extra B groups)
}
})
)
#----- take divided B and remove all but one trace for B ------
plt$x$data <- plt$x$data[-c(wh)] # <------ I forgot this line in the previous update
# visualize Sankey bump with B at the bottom
plt
The Plotly object is basically 10 globs of color; nothing in the underlying data separates one year from the next. So if you add a tooltip to it as is, each glob can have only one.
To get the tooltips you're looking for, I created another trace (well, 10, actually: one for each model). To get the right y positions (the sales data isn't in the 50K range, but the stacked values in the plot are), I used the data in plt to create a new data frame.
#--------------- collect values for hovertext positions ----------
x <- plt$x$data[[1]]$x
inds <- which(x %in% 2015:2020, arr.ind = T)
yrs <- x[inds]
tellMe <- invisible(
map(1:length(plt$x$data),
function(m) {
y <- plt$x$data[[m]]$y
y[inds]
}) %>% setNames(sort(unique(df$model))) %>% # changed from LETTERS[1:10]
as.data.frame() %>%
mutate(yr = yrs %>% as.integer()) %>%
pivot_longer(names_to = "model", values_to = "sales",
cols = sort(unique(df$model))) %>%
distinct() %>%
group_by(yr, model) %>%
summarise(val = mean(sales)) %>%
left_join(df, by = c("yr" = "Year", "model" = "model")) %>%
as.data.frame() # drop groups
)
#-------------- create data trace for hovertext --------------
plot_ly(tellMe, x = ~yr, y = ~val, split = ~model,
customdata = ~sales, text = ~model,
line = list(width = .01, shape = "spline", smoothing = 1.3),
hovertemplate = "Year: %{x}<br>Model: %{text}<br>Sales: %{customdata}<extra></extra>",
type = "scatter", mode = "lines", showlegend = F) -> pp2
pp2
If you look at the plot here, it looks blank. That's because of how small the lines are. This is intentional. You don't want lines on your graph.
Fix the colors, so that the hoverlabel background colors match the legend colors.
# change colors to match sankey
pp2 <- plotly_build(pp2)
invisible(
map(1:10,
function(z) {
nm <- pp2$x$data[[z]]$name
# collect and assign the color
cr <- unlist(cols[cols$nm == nm, "filler"], use.names = F)
pp2$x$data[[z]]$line$color <<- cr
})
)
Using subplot here didn't work. Plotly gave me an error when I tried adding a trace, whether all at once or even one for each model. So I forced the traces together.
#-------------- consolidate the traces (subplot won't work) -----------
# collect data one more time!
dx <- plt$x$data
yx <- pp2$x$data
yx <- append(yx, dx) # put plt on top
# replace data
plt$x$data <- yx
# lines are small, increase the distance searched for matches
plt %>% layout(hoverdistance = 40)
The final product:
I'm making some visualizations in RStudio. I have a list of data frames:
tickers_df <- read.csv('tickers.csv')
v_df <- split(tickers_df, tickers_df$pair_code)
Now I want to make a plot for each dataframe within v_df in its own Viewer window. I'm doing:
for (pair_df in v_df) {
col_name <- names(pair_df)[4:9]
colors <- c('green', 'darkgreen', 'red', 'darkred', 'blue', 'darkblue')
df_layout <- data.frame(col_name, colors)
p <- plot_ly()
for (i in 1:nrow(df_layout)) {
col_name <- as.character(df_layout[i, 1])
p <-
add_trace(
p,
x = pair_df$step,
y = pair_df[, col_name],
name = col_name,
type = 'scatter',
mode = "lines",
line = list(color = df_layout[i, 2], width = 1)
)
p <- layout(p, title = pair_df$pair_code[[1]])
}
p
}
But this code doesn't work as expected - it shows no charts at all.
How can I draw many plotly charts within a loop?
And btw what is the meaning of last line with only p variable? Like in this example:
p <- plot_ly( x = df$time.1, y = df$total_profit, line = list(color = 'darkred'))
p #what is this standing for?
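A note on that last line, with a small base R sketch: at the top level of a script or the console, R auto-prints the value of a bare expression, so a lone `p` displays the plot. Inside a `for` loop that auto-printing does not happen. The sketch below uses plain numbers rather than plotly objects so it runs without any packages; `implicit` and `explicit` are illustrative names.

```r
# A bare expression inside a for loop is evaluated but not auto-printed.
implicit <- capture.output(for (i in 1:3) i)

# Wrapping the value in print() displays it on every iteration.
explicit <- capture.output(for (i in 1:3) print(i))

length(implicit)  # 0 lines of output
explicit          # "[1] 1" "[1] 2" "[1] 3"
```

By analogy, replacing the bare `p` at the end of the outer loop with `print(p)` is the usual way to get each chart displayed. Another option is to collect the plots in a list and print them afterwards.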
I have data I'd like to plot the distribution density of. The data are from three groups, where for each there are three states, each with a probability, and these probabilities sum to 1.
I'm trying to use R's plotly to plot, for each group, the density of the probabilities, color coded by state, and add some text annotation to each such group plot. Finally I'm trying to combine all of these group plots using plotly::subplot.
Here's the code to generate the data and a list of group plots:
library(dplyr)
library(reshape2)
library(plotly)
set.seed(1)
plot.list <- lapply(1:3,function(g){
if(g == 1){
show.legend <- T
} else{
show.legend <- F
}
df <- data.frame(id=LETTERS,t(sapply(1:length(LETTERS),function(x){
probs <- runif(3,0,1)
return(probs/sum(probs))
}))) %>% dplyr::rename(S1=X1,S2=X2,S3=X3) %>%
reshape2::melt() %>% dplyr::rename(state=variable,probability=value)
df$state <- factor(df$state,levels=c("S1","S2","S3"))
density.df <- do.call(rbind,lapply(levels(df$state),function(s){
dens <- density(dplyr::filter(df,state == s)$probability)
return(data.frame(x=dens$x,y=dens$y,state=s,stringsAsFactors=F))
}))
density.df$state <- factor(density.df$state,levels=levels(df$state))
dens.plot <- plot_ly(x=~density.df$x,y=~density.df$y,type='scatter',mode='lines',color=~density.df$state,showlegend=show.legend) %>%
layout(xaxis=list(title="Probability",zeroline=F),yaxis=list(title="Count",zeroline=F)) %>%
add_annotations(x=0.75,y="top",text=paste0("text: ",g))
if(show.legend) dens.plot <- dens.plot %>% add_annotations(text="State",xref="paper",yref="paper",x=1.02,xanchor="left",y=1.02,yanchor="top",legendtitle=T,showarrow=F)
return(dens.plot)
})
Note that I'm only adding the legend to the first group so it appears only once in the final grouped plot (there's probably a more elegant way of achieving that).
And here's the plotly::subplot command I'm using:
subplot(plot.list,nrows=3,shareX=T,shareY=T,titleX=T,titleY=T)
Which gives:
As you can see the text annotation is stuck at "top" of the first plot rather than at the top of each individual plot.
Any idea how do I get each annotation to be located at the top of its corresponding sub-plot?
Preamble. For reasons that are not entirely obvious to me (but relating to how values for annotations are scaled when running subplot), annotations seem to go awry with vertically stacked subplots. To see this, run the MWE at https://plot.ly/r/text-and-annotations/#subplot-annotations, but change
subplot(p1, p2, titleX = TRUE, titleY = TRUE)
to
subplot(p1, p2, titleX = TRUE, titleY = TRUE, nrows = 2)
In the vertically stacked version, the annotations are not where we would expect them to be. To achieve your desired outcome would require some post-processing of the subplot output. Now, on to your main question.
First, in add_annotations, add xref and yref arguments that correspond to each subplot. In each element of plot.list, I also add an additional element y_anno to keep track of where we would like the annotation to go (at the maximum value of the densities in each subplot).
plot.list <- lapply(1:3,function(g){
if(g == 1){
show.legend <- T
} else{
show.legend <- F
}
df <- data.frame(id=LETTERS,t(sapply(1:length(LETTERS),function(x){
probs <- runif(3,0,1)
return(probs/sum(probs))
}))) %>% dplyr::rename(S1=X1,S2=X2,S3=X3) %>%
reshape2::melt() %>% dplyr::rename(state=variable,probability=value)
df$state <- factor(df$state,levels=c("S1","S2","S3"))
density.df <- do.call(rbind,lapply(levels(df$state),function(s){
dens <- density(dplyr::filter(df,state == s)$probability)
return(data.frame(x=dens$x,y=dens$y,state=s,stringsAsFactors=F))
}))
density.df$state <- factor(density.df$state,levels=levels(df$state))
dens.plot <- plot_ly(x=~density.df$x,
y=~density.df$y,
type='scatter',
mode='lines',
color=~density.df$state,
showlegend=show.legend) %>%
layout(xaxis=list(title="Probability",zeroline=F),yaxis=list(title="Count",zeroline=F)) %>%
add_annotations(x = 0.75,
y = max(density.df$y),
text = paste0("text: ", g),
xref = paste0("x", g), # add this
yref = paste0("y", g), # add this
ax = 0,
ay = 0)
if(show.legend) dens.plot <- dens.plot %>% add_annotations(text="State",xref="paper",yref="paper",x=1.02,xanchor="left",y=1.02,yanchor="top",legendtitle=T,showarrow=F)
dens.plot$y_anno <- max(density.df$y) # add this
return(dens.plot)
})
Now if we run subplot(plot.list,nrows=3,shareX=T,shareY=T,titleX=T,titleY=T), the text will be in each subplot, but not at the top (due to the phenomenon I described in the preamble). To fix this, we can post-process the subplot output:
p <- subplot(plot.list, nrows = 3,shareX = T,shareY = T,titleX = T,titleY = T)
for (i in seq_along(plot.list)) {
for (j in seq_along(p$x$layout$annotations)) {
if (p$x$layout$annotations[[j]]$yref == paste0("y", i))
p$x$layout$annotations[[j]]$y <- plot.list[[i]]$y_anno
}
}
Now p gives us
which is close to what we want.
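The post-processing loop above can be tried without plotly on a hand-built list. In this sketch `annotations` is a mock stand-in for `p$x$layout$annotations` and `y_anno` holds made-up target heights. Neither is real subplot output, and the `yref` names simply follow the `paste0("y", g)` convention used above.

```r
# Mock stand-in for p$x$layout$annotations (hand-built, not real subplot output)
annotations <- list(
  list(text = "text: 1", yref = "y1",    y = 0.5),
  list(text = "State",   yref = "paper", y = 1.02),
  list(text = "text: 2", yref = "y2",    y = 0.5),
  list(text = "text: 3", yref = "y3",    y = 0.5)
)

# Hypothetical per-subplot targets, playing the role of plot.list[[i]]$y_anno
y_anno <- c(2.1, 1.8, 2.4)

# Same matching logic as above: only annotations tied to axis "y<i>" are moved
for (i in seq_along(y_anno)) {
  for (j in seq_along(annotations)) {
    if (annotations[[j]]$yref == paste0("y", i)) {
      annotations[[j]]$y <- y_anno[i]
    }
  }
}

new_y <- vapply(annotations, function(a) a$y, numeric(1))
new_y
```

The legend-title annotation, which uses `yref = "paper"`, matches none of the axis names and keeps its original position.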
Consider the following R snippet to render a heatmap with categorical axes:
library(plotly)
x <- c("Blue", "Red")
y <- c("One", "Two")
plot_ly(x = x, y = y, z = c(1:4)) %>%
add_heatmap(
opacity = 0.9
) %>%
add_annotations(
text = c(1:4),
showarrow = FALSE
)
This renders the following heatmap:
The annotations appear to be distributed diagonally and unevenly, starting from the bottom left cell. 1 and 3 are in the bottom left cell, and 2 and 4 in the upper right. Why is this? How should my annotation text be structured for it to be ordered more intuitively (horizontally or vertically)?
I can only speculate about the cause, but in the provided image you can see that Plotly used only two of the four z-values: the colorscale on the right goes from 1 to 2, not 1 to 4. This most likely happens because you provided only two x and two y values.
Use a data frame
df <- expand.grid(x, y)
df <- transform(df, text = paste(Var1, Var2, sep='_'))
print(df)
Var1 Var2 text
1 Blue One Blue_One
2 Red One Red_One
3 Blue Two Blue_Two
4 Red Two Red_Two
You can now easily use add_annotations
add_annotations(x = df$Var1,
y = df$Var2,
text = df$text)
To get the following plot
Complete code
library(plotly)
x <- c("Blue", "Red")
y <- c("One", "Two")
df <- expand.grid(x, y)
df <- transform(df, text = paste(Var1, Var2, sep='_'))
p <- plot_ly(x = df$Var1,
y = df$Var2,
z = c(1:4)
) %>%
add_heatmap(opacity = 0.9
) %>%
add_annotations(x = df$Var1,
y = df$Var2,
text = df$text)
p
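To see how the four z-values line up with the cells in the complete code above, here is a small base R sketch. It assumes that x, y and z vectors of equal length are matched element by element, which is how the data frame approach is meant to work. The column names `x`, `y`, `z` and `label` are chosen for illustration.

```r
x <- c("Blue", "Red")
y <- c("One", "Two")

# expand.grid varies its first argument fastest
cells <- expand.grid(x = x, y = y, stringsAsFactors = FALSE)
cells$z <- 1:4
cells$label <- paste(cells$x, cells$y, sep = "_")

print(cells)
```

Under that assumption, the bottom row of the heatmap ("One") holds 1 and 2, and the top row ("Two") holds 3 and 4.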
Alternatively you could loop over your values and add an annotation for each one.
library(plotly)
x <- c("Blue", "Red")
y <- c("One", "Two")
p <- plot_ly(x = x,
y = y,
z = c(1:4)
) %>%
add_heatmap(opacity = 0.9)
for (val_x in x)
{
for (val_y in y)
{
p <- add_annotations(p,
x = val_x,
y = val_y,
text = paste(val_x, val_y, sep = '_'))
}
}
p
I am trying to plot a number of dimensions in R using plotly. Is it possible to use both the color and group parameters on factor variables to get a line that changes color?
Example:
grp <- c(letters[c(1,1,1,1,2,2,2,2)])
a <- c(1,2,3,4,2,3,4,5)
b <- c(1,3,5,6,1,2,4,4)
lvl <- c(1,1,2,2,1,1,2,2)
df <- data.frame(grp, a, b, lvl)
When plotting this using ggplot() I am able to create the desired effect as below, with grp defining each line and lvl defining the color of sections of the line:
ggplot(data = df, aes(x = a, y = b, group = grp, color = lvl)) + geom_line() + geom_point()
However, when I then call ggplotly() the line gets grouped and colored by lvl.
I'm searching for the same function. It seems that group and color is plotly kryptonite.
So far my only solution is to make a column of color codes and use that to define the colors of the markers:
library(scales)
library(plotly)
grp <- c(letters[c(1,1,1,1,2,2,2,2)])
a <- c(1,2,3,4,2,3,4,5)
b <- c(1,3,5,6,1,2,4,4)
lvl <- c(1,1,2,2,1,1,2,2)
df <- data.frame(grp, a, b, lvl)
Palette <- data.frame(lvl = unique(df$lvl), color = brewer_pal("seq",palette = "Reds",direction = -1)(length(unique(df$lvl))), stringsAsFactors = FALSE)
df <- merge(x = df, y = Palette, by = "lvl")
p <- plot_ly(df, x = a, y = b, group = grp, mode = "markers+lines", marker = list(color = color, size = 8), line = list(color = "black", width = 2))
p
However, this trick is very cumbersome and does not work with "line", which only takes a single color input. If you do not pass anything to "line", it draws two different colors that you have no control over.
I was trying to do the same thing, and there is now an official way: you need to add a group_by statement before plot_ly (see https://github.com/ropensci/plotly/issues/418).
grp <- c(letters[c(1,1,1,1,2,2,2,2)])
a <- c(1,2,3,4,2,3,4,5)
b <- c(1,3,5,6,1,2,4,4)
lvl <- c(1,1,2,2,1,1,2,2)
df <- data.frame(grp, a, b, lvl)
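library(dplyr)
library(plotly)
# plotly >= 4.0 expects formulas (~a) to refer to data frame columns
df %>% group_by(grp) %>% plot_ly(x = ~a, y = ~b, type = "scatter", mode = "markers+lines", color = ~lvl)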