The Background
I am using the plotly API in R to create two linked plots. The first is a scatter plot and the second is a bar chart that should show the percentage of data belonging to each category, in the current selection. I can't make the percentages behave as expected.
The problem
The plots render correctly and the interactive selection works fine. When I select a set of data points in the scatter plot, I would like to see the percentage of that selection that belongs to each category. Instead, each bar shows the percentage of the selected points in a category that belong to that same category, in other words always 100%. I guess this is because I set color = ~c, which groups the data by category.
The Example
Here is a reproducible example to follow. First create some dummy data.
library(plotly)
n = 1000
make_axis = function(n) c(rnorm(n, -1, 1), rnorm(n, 2, 0.25))
data = data.frame(
x = make_axis(n),
y = make_axis(n),
c = rep(c("A", "B"), each = n)
)
Create a SharedData object and supply it to plot_ly() for the base plot.
shared_data = data %>%
highlight_key()
baseplot = plot_ly(shared_data)
Make the individual panels.
points = baseplot %>%
add_markers(x = ~x, y = ~y, color = ~c)
bars = baseplot %>%
add_histogram(x = ~c, color = ~c, histnorm = "percent", showlegend = FALSE) %>%
layout(barmode = "group")
And put them together in a linked subplot with selection and highlighting.
subplot(points, bars) %>%
layout(dragmode = "select") %>%
highlight("plotly_selected")
Here is a screenshot of this to illustrate the problem.
An Aside
Incidentally, when I set histnorm = "" in add_histogram() I get closer to the expected behaviour, but I want percentages and not counts. When I remove color = ~c I also get closer to the expected behaviour, but I want the consistent colour scheme.
What have I tried
I have tried manually supplying the colours, but then some of the linked selection breaks. I have also tried creating a separate summarised data set from the SharedData object first and then plotting that, but again this breaks the linkage between the plots.
If anyone has any clues as to how to solve this I would be very grateful.
To me it seems the behaviour you are looking for isn't implemented in plotly.
Please see schema(): object ► traces ► histogram ► attributes ► histnorm ► description
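To illustrate why the bars always read 100%, here is a minimal base R sketch of the arithmetic. It assumes that histnorm = "percent" divides each count by the total number of points in the same trace, and that color = ~c puts every category into its own trace. The selection of 30 and 10 points is hypothetical.

```r
# Hypothetical selection: 30 points from category A and 10 from category B.
selected <- c(rep("A", 30), rep("B", 10))
counts <- table(selected)

# Per-trace normalisation: each category is its own trace,
# so each count is divided by that trace's own total.
per_trace <- 100 * counts / counts

# Desired normalisation: each count is divided by the size of the whole selection.
across_selection <- 100 * counts / sum(counts)

print(per_trace)         # both categories at 100
print(across_selection)  # A at 75, B at 25
```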
However, here is the closest I was able to achieve via add_bars and preprocessing the data (sorry for adding data.table, you will be able to do the same in base R, it is just personal preference):
library(plotly)
library(data.table)
n = 1000
make_axis = function(n) c(rnorm(n, -1, 1), rnorm(n, 2, 0.25))
DT = data.table(
x = make_axis(n),
y = make_axis(n),
c = rep(c("A", "B"), each = n)
)
# Each point contributes 100 / (size of its group), so a full group sums to 100.
DT[, grp_percent := rep(100/.N, .N), by = "c"]
shared_data = DT %>%
highlight_key()
baseplot = plot_ly(shared_data)
# Make the individual panels.
points = baseplot %>%
add_markers(x = ~x, y = ~y, color = ~c)
bars = baseplot %>%
add_bars(x = ~c, y = ~grp_percent, color = ~c, showlegend = FALSE) %>%
layout(barmode = "group")
subplot(points, bars) %>%
layout(dragmode = "select") %>%
highlight("plotly_selected")
Unfortunately, the resulting hoverinfo isn't really desirable.
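For readers who prefer to avoid data.table, the same per-point percentage column can probably be built in base R. This is only a sketch: the seed is arbitrary and the use of ave() is my choice, not something the answer prescribes.

```r
set.seed(42)  # arbitrary seed, only to make this sketch reproducible
n <- 1000
make_axis <- function(n) c(rnorm(n, -1, 1), rnorm(n, 2, 0.25))
data <- data.frame(
  x = make_axis(n),
  y = make_axis(n),
  c = rep(c("A", "B"), each = n)
)

# ave() returns, for every row, the size of that row's group;
# each point then contributes 100 / (group size) percentage points.
group_size <- ave(seq_along(data$c), data$c, FUN = length)
data$grp_percent <- 100 / group_size

# Within each category the contributions add up to 100.
group_totals <- tapply(data$grp_percent, data$c, sum)
print(group_totals)
```

The resulting data frame can then be passed to highlight_key() in place of DT.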
I'm looking for a way to use a range selector with plotly in R.
I have two plots visualized via a single subplot using Plotly in R. Now, I need to add a Range Slider/Selector to the complete plot, so that changing it modifies both my plots.
Is it possible via Plotly? (using R only)
This functionality is similar to the dygraphs synchronization feature (https://rstudio.github.io/dygraphs/gallery-synchronization.html).
I'd recommend using subplot()'s shareX = TRUE option, so that both panels share a single x-axis and zoom together.
Please check the following example:
library(plotly)
DF1 <- data.frame(x=1:100, y=runif(100)+ seq(0, 1, length.out = 100))
DF2 <- data.frame(x=1:100, y=runif(100)+ seq(0, 2, length.out = 100))
p1 <- plot_ly(DF1, x = ~x, y = ~y, type = "scatter", mode = "lines+markers")
p2 <- plot_ly(DF2, x = ~x, y = ~y, type = "scatter", mode = "lines+markers")
p <- subplot(p1, p2, nrows = 2, shareX = TRUE)
p
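If an explicit slider is wanted on top of the shared axis, one possible sketch is to switch on the rangeslider attribute of the shared x-axis through layout(). I am assuming here that, because shareX = TRUE leaves the subplot with a single x-axis, the slider drives both panels; I have not checked how the slider is positioned relative to the two panels.

```r
library(plotly)

DF1 <- data.frame(x = 1:100, y = runif(100) + seq(0, 1, length.out = 100))
DF2 <- data.frame(x = 1:100, y = runif(100) + seq(0, 2, length.out = 100))

p1 <- plot_ly(DF1, x = ~x, y = ~y, type = "scatter", mode = "lines+markers")
p2 <- plot_ly(DF2, x = ~x, y = ~y, type = "scatter", mode = "lines+markers")

# shareX = TRUE gives both rows the same x-axis; the rangeslider is
# then attached to that one axis (assumption: it controls both rows).
p_slider <- subplot(p1, p2, nrows = 2, shareX = TRUE) %>%
  layout(xaxis = list(rangeslider = list(visible = TRUE)))

p_slider
```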
I have the following simple data.frame:
x <- data.frame(x = c(1, 3, 5, 2, 4, runif(10)),
y = c(1, 2, 3, 4, 5, runif(10)))
I want to make a plot showing both the scatter plot and connecting some of the points with a line, so I use:
plot_ly(data = x) %>%
add_markers(
x = ~x,
y = ~y
) %>%
add_lines(
x = ~x[1:5],
y = ~y[1:5]
)
However, the resulting line graph is sorted along the x-axis, while I want the line to follow the order found in the data.frame (shown in red below).
Is there any way of doing this? I've found similar questions on SO, but they all deal with categorical values.
I could obviously use paths, but to my understanding those only exist as shapes within layout(). I'm hoping for something which behaves like a trace: responds to hover actions, appears (and can be hidden) in the legend, etc.
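I have just found a solution by using add_paths instead of add_lines. add_lines sorts the points by their x-values before connecting them, whereas add_paths connects them in the order in which they appear in the data.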
plot_ly(data = x) %>%
add_markers(
x = ~x,
y = ~y
) %>%
add_paths(
x = ~x[1:5],
y = ~y[1:5]
)
Hope it solves your challenge.
I am having an issue with a plotly bar plot when I define the date range for the x-axis.
When all of the data points share the same x-value, the bars do not show in the plot. If there are at least two different x-values, or if I do not set an x-axis range, then the bars show as they should.
Below follows an example (I am currently using lubridate to deal with dates).
library(lubridate)
library(plotly)
# Same x-value: bar does not show
plot_ly(x = c(ymd("2019-08-25"), ymd("2019-08-25")), y = c(1, 2), type = "bar") %>%
layout(xaxis = list(range = ymd(c("2019-08-20", "2019-08-30"))))
# Different x-values: bars are shown
plot_ly(x = c(ymd("2019-08-25"), ymd("2019-08-26")), y = c(1, 2), type = "bar") %>%
layout(xaxis = list(range = ymd(c("2019-08-20", "2019-08-30"))))
# No x-axis range defined, same x-values: the bar is shown
plot_ly(x = c(ymd("2019-08-25"), ymd("2019-08-25")), y = c(1, 2), type = "bar")
Any solution?
Edit: For comparison, ggplot2 does not have the same issue:
# ggplot works like expected
library(lubridate)
library(ggplot2)
ggplot(NULL, aes(x = ymd(c("2019-08-25", "2019-08-25")), y = c(1, 2))) +
geom_col() +
xlim(ymd(c("2019-08-20", "2019-08-30")))
Your first version is actually being interpreted correctly, but you need to set the width of the bars explicitly so that they show up.
On a date axis the width appears to be measured in milliseconds (so one day would be 86400000), but you may need to experiment to find a good width for your actual scenario.
plot_ly() %>%
add_bars(x = c(ymd("2019-08-25"), ymd("2019-08-25")), y = c(1, 2), width = 100000000) %>%
layout(xaxis = list(range = ymd(c("2019-08-20", "2019-08-30"))))
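Rather than guessing a magic number, the width can be derived from the length of a day. The sketch below assumes that bar widths on a date axis are given in milliseconds; the 0.8 factor is an arbitrary choice to leave a small gap between neighbouring days, and base R's as.Date() is used in place of lubridate.

```r
library(plotly)

# One day expressed in milliseconds (assumed unit of a plotly date axis).
one_day_ms <- 24 * 60 * 60 * 1000

dates <- as.Date(c("2019-08-25", "2019-08-25"))

# Bars that are 80% of a day wide, shown on a fixed ten-day x-axis range.
p_bar <- plot_ly() %>%
  add_bars(x = dates, y = c(1, 2), width = 0.8 * one_day_ms) %>%
  layout(xaxis = list(range = as.Date(c("2019-08-20", "2019-08-30"))))

p_bar
```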
Probably an easy one.
I have an xy dataset I'd like to plot using R's plotly. Here are the data:
set.seed(1)
df <- data.frame(x=1:10,y=runif(10,1,10),group=c(rep("A",9),"B"),group.size=as.integer(runif(10,1,10)))
I'd like to color the data by df$group and have the size of the points follow df$group.size (i.e., a bubble plot). In addition, I'd like to have both legends added.
This is my naive attempt:
require(plotly)
require(dplyr)
main.plot <-
plot_ly(type='scatter',mode="markers",color=~df$group,x=~df$x,y=~df$y,size=~df$group.size,marker=list(sizeref=0.1,sizemode="area",opacity=0.5),data=df,showlegend=T) %>%
layout(title="Title",xaxis=list(title="X",zeroline=F),yaxis=list(title="Y",zeroline=F))
which comes out as:
and unfortunately messes up the legend, at least how I want it to be: a point for each group having the same size but different colors.
Then to add a legend for the group.size I followed this, also helped by aocall's answer:
legend.plot <- plot_ly() %>% add_markers(x = 1, y = unique(df$group.size),
size = unique(df$group.size),
showlegend = T,
marker = list(sizeref=0.1,sizemode="area")) %>%
layout(title="TITLE",xaxis = list(zeroline=F,showline=F,showticklabels=F,showgrid=F),
yaxis=list(showgrid=F))
which comes out as:
Here my problem is that the legend is including values that do not exist in my data.
Then I combine them using subplot:
subplot(legend.plot, main.plot, widths = c(0.1, 0.9))
I get this:
where the legend title is eliminated.
So I'd be grateful for some help.
Based on the updated request:
Note the changes in legend.plot (mapping values to a sequence of integers, then manually changing the axis tick text), and the use of annotations to get a legend title. As explained in this answer, only one title may be used, regardless of how many subplots are used.
The circle on the plot legend seems to correspond to the minimum point size of each trace. Thus, I've added a point at (12, 12), and restricted the range of the axes to ensure it isn't shown.
titleX and titleY control the display of axis labels, as explained here.
set.seed(1)
df <- data.frame(x=1:10,y=runif(10,1,10),group=c(rep("A",9),"B"),group.size=as.integer(runif(10,1,10)))
require(plotly)
require(dplyr)
## Take unique values before adding dummy value
unique_vals <- unique(df$group.size)
df <- rbind(c(12, 12, "B", 1), df)
df[c(1, 2, 4)] <- lapply(df[c(1, 2, 4)], as.numeric)
main.plot <-
plot_ly(type='scatter',
mode="markers",
color=~df$group,
x=~df$x,
y=~df$y,
size=~df$group.size,
marker=list(
sizeref=0.1,
sizemode="area",
opacity=0.5),
data=df,
showlegend=T) %>%
layout(title="Title",
xaxis=list(title="X",zeroline=F, range=c(0, 11)),
yaxis=list(title="Y",zeroline=F, range=c(0, 11)))
legend.plot <- plot_ly() %>%
add_markers(x = 1,
y = seq_len(length(unique_vals)),
size = sort(unique_vals),
showlegend = F,
marker = list(sizeref=0.1,sizemode="area")) %>%
layout(
annotations = list(
list(x = 0.2,
y = 1,
text = "LEGEND TITLE",
showarrow = F,
xref='paper',
yref='paper')),
xaxis = list(
zeroline=F,
showline=F,
showticklabels=F,
showgrid=F),
yaxis=list(
showgrid=F,
tickmode = "array",
tickvals = seq_len(length(unique_vals)),
ticktext = sort(unique_vals)))
subplot(legend.plot, main.plot, widths = c(0.1, 0.9),
titleX=TRUE, titleY=TRUE)
Firstly, you are only passing the unique values to the legend. If you pass in all possible values (i.e., seq(min(x), max(x), by=1), or in this case seq_len(max(x))), the legend will show the full range.
Secondly, sizeref and sizemode in the marker argument alter the way that point size is calculated. The following example should produce a more consistent plot:
set.seed(1)
df <- data.frame(x=1:10,y=runif(10,1,10),group=c(rep("A",9),"B"),group.size=as.integer(runif(10,1,10)))
require(plotly)
require(dplyr)
a <- plot_ly(type='scatter',mode="markers",
color=~df$group,
x=~df$x,
y=~df$y,
size=df$group.size,
marker = list(sizeref=0.1, sizemode="area"),
data=df,
showlegend=F) %>%
layout(title="Title",
xaxis=list(title="X",zeroline=F),
yaxis=list(title="Y",zeroline=F))
b <- plot_ly() %>% add_markers(x = 1, y = seq_len(max(df$group.size)),
size = seq_len(max(df$group.size)),
showlegend = F,
marker = list(sizeref=0.1, sizemode="area")) %>%
layout(
xaxis = list(zeroline=F,showline=F,showticklabels=F,showgrid=F),
yaxis=list(showgrid=F))
subplot(b, a, widths = c(0.1, 0.9))
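To see the difference between the two ways of choosing the legend sizes, here is a small base R sketch that compares the sizes present in the data with the full integer range. The variable names observed_sizes, full_range and missing_sizes are only illustrative.

```r
set.seed(1)
df <- data.frame(x = 1:10,
                 y = runif(10, 1, 10),
                 group = c(rep("A", 9), "B"),
                 group.size = as.integer(runif(10, 1, 10)))

# Sizes that actually occur in the data (what unique() would pass to the legend).
observed_sizes <- sort(unique(df$group.size))

# Every integer size from 1 up to the largest observed size.
full_range <- seq_len(max(df$group.size))

# Sizes that the full-range legend shows although no data point has them.
missing_sizes <- setdiff(full_range, observed_sizes)

print(observed_sizes)
print(missing_sizes)
```

Using full_range gives an evenly spaced legend, while observed_sizes keeps the legend limited to values that exist in the data.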