R plotly simple boxplot highlighting the most recent value

I probably have a simple question but I can't find a way to achieve what I need. I have a simple boxplot like the following:
library(plotly)
end_dt <- as.Date("2021-02-12")
start_dt <- end_dt - (nrow(iris) - 1)
dim(iris)
dates <- seq.Date(start_dt, end_dt, by="1 day")
df <- iris
df$LAST_VAL <- "N"
df[3, 'LAST_VAL'] <- "Y"
df1 <- df[,c("Sepal.Length","LAST_VAL")]
df1$DES <- 'Sepal.Length'
colnames(df1) <- c("VALUES","LAST_VAL","DES")
df2 <- df[,c("Sepal.Width","LAST_VAL")]
df2$DES <- 'Sepal.Width'
colnames(df2) <- c("VALUES","LAST_VAL","DES")
df <- rbind(df1, df2)
fig <- plot_ly(df, y = ~VALUES, color = ~DES, type = "box") %>% layout(showlegend = FALSE)
What I would like to do now is add a red marker to each boxplot just for the value corresponding to LAST_VAL = "Y". This would let me see where the most recent value sits within the distribution shown by each box.
I tried to use the info on https://plotly.com/r/box-plots/ but I can't figure out how to do this.
Thanks

The following solution ended up a bit long code-wise, but it should give you what you asked for. I think the boxplots should be added afterwards, like this:
fig <- plot_ly(df[df$LAST_VAL == "Y", ],
               x = ~DES, y = ~VALUES, color = ~DES,
               type = "scatter", mode = "markers", colors = 'red') %>%
  layout(showlegend = FALSE) %>%
  add_boxplot(data = df[df$DES == "Sepal.Length", ], x = ~DES, y = ~VALUES,
              showlegend = FALSE, color = ~DES,
              boxpoints = FALSE, fillcolor = 'white', line = list(color = 'blue')) %>%
  add_boxplot(data = df[df$DES == "Sepal.Width", ], x = ~DES, y = ~VALUES,
              showlegend = FALSE, color = ~DES,
              boxpoints = FALSE, fillcolor = 'white', line = list(color = 'green'))
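The question builds a dates vector but never uses it, and flags row 3 by hand. If the rows are meant to be in chronological order (an assumption, not something the question states), the flag could be derived from the dates instead. A minimal base R sketch, where the DATE column and the helper names long and last_rows are made up for illustration:

```r
# Rebuild the question's data, but attach the dates and derive the flag from them
end_dt <- as.Date("2021-02-12")
start_dt <- end_dt - (nrow(iris) - 1)
df <- iris
df$DATE <- seq.Date(start_dt, end_dt, by = "1 day")  # DATE is an assumed column name

# Flag the row with the latest date instead of hard-coding a row index
df$LAST_VAL <- ifelse(df$DATE == max(df$DATE), "Y", "N")

# Same long layout as in the question: one block of rows per measured variable
vars <- c("Sepal.Length", "Sepal.Width")
long <- do.call(rbind, lapply(vars, function(v) {
  data.frame(VALUES = df[[v]], LAST_VAL = df$LAST_VAL, DES = v,
             stringsAsFactors = FALSE)
}))

# Rows that would feed the red marker trace (one per box)
last_rows <- long[long$LAST_VAL == "Y", ]
```

last_rows could then be passed as the data of the scatter trace in place of df[df$LAST_VAL == "Y", ].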

Related

R Plotly Stacked Bar breakout date not in correct order

I have an R plotly stacked bar chart where it's broken out by date. I am trying to change the order from the oldest date 10/1/2020 to the newest date 01/01/2021 on top. I noticed in the current state that it's not even in the correct order of dates. The data frame shows in the correct order.
The current code I have.
ramp2 <- colorRamp(c("deepskyblue4", "white"))
ramp.list2 <- rgb( ramp2(seq(0, 1, length = 15)), max = 255)
plot_ly(pcd_2,
x = ~reorder(u_reason_code,-total_qty, sum), y = ~total_qty, type = 'bar', color = ~month_breakout ) %>%
layout(list(title = 'Cost'), barmode = 'stack') %>%
layout(colorway = ramp.list2) %>%
config(displayModeBar = FALSE)
Try formatting your date as a factor with the following code (not tested, as no data was shared):
#Process data
pcd_2$Date <- as.Date(pcd_2$month_breakout,'%m/%d/%Y')
pcd_2 <- pcd_2[order(pcd_2$Date),]
pcd_2$month_breakout <- factor(pcd_2$month_breakout,
                               levels = unique(pcd_2$month_breakout),
                               ordered = TRUE)
#Plot
ramp2 <- colorRamp(c("deepskyblue4", "white"))
ramp.list2 <- rgb( ramp2(seq(0, 1, length = 15)), max = 255)
plot_ly(pcd_2,
x = ~reorder(u_reason_code,-total_qty, sum), y = ~total_qty, type = 'bar', color = ~month_breakout ) %>%
layout(list(title = 'Cost'), barmode = 'stack') %>%
layout(colorway = ramp.list2) %>%
config(displayModeBar = FALSE)
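Since no data was shared, here is a small sketch of the same idea on made-up data. The data frame pcd_demo and its values are hypothetical; only the month_breakout column name comes from the question. It shows that sorting by the parsed date, rather than by the label text, yields the chronological factor levels that plotly would then follow when stacking:

```r
# Hypothetical stand-in for pcd_2; values are invented for illustration
pcd_demo <- data.frame(
  month_breakout = c("01/01/2021", "10/01/2020", "12/01/2020",
                     "11/01/2020", "10/01/2020"),
  total_qty = c(5, 3, 8, 2, 4),
  stringsAsFactors = FALSE
)

# Parse the labels as dates so that ordering is chronological, not alphabetical
parsed <- as.Date(pcd_demo$month_breakout, format = "%m/%d/%Y")

# Unique labels, taken in date order, become the factor levels
chrono_levels <- unique(pcd_demo$month_breakout[order(parsed)])
pcd_demo$month_breakout <- factor(pcd_demo$month_breakout, levels = chrono_levels)

levels(pcd_demo$month_breakout)
```

An alphabetical sort of the same labels would put "01/01/2021" first, which matches the misordering described in the question.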

How to apply subplot to a list of plots with secondary y axis

I want to prepare a subplot where each facet is a separate dual y-axis plot of one variable against the others. So I make a base plot p and add secondary y-axis variable in a loop:
library(rlang)
library(plotly)
library(tibble)
dual_axis_lines <- function(data, x, y_left, ..., facets = FALSE, axes = NULL){
x <- rlang::enquo(x)
y_left <- rlang::enquo(y_left)
y_right <- rlang::enquos(...)
y_left_axparms <- list(
title = FALSE,
tickfont = list(color = "#1f77b4"),
side = "left")
y_right_axparms <- list(
title = FALSE,
overlaying = "y",
side = "right",
zeroline = FALSE)
p <- plotly::plot_ly(data , x = x) %>%
plotly::add_trace(y = y_left, name = quo_name(y_left),
yaxis = "y1", type = 'scatter', mode = 'lines',
line = list(color = "#1f77b4"))
p_facets <- list()
for(v in y_right){
p_facets[[quo_name(v)]] <- p %>%
plotly::add_trace(y = v, name = quo_name(v),
yaxis = "y2", type = 'scatter', mode = 'lines') %>%
plotly::layout(yaxis = y_left_axparms,
yaxis2 = y_right_axparms)
}
p <- subplot(p_facets, nrows = length(y_right), shareX = TRUE)
return(p)
}
mtcars %>%
rowid_to_column() %>%
dual_axis_lines(rowid, mpg, cyl, disp, hp, facets = TRUE)
However, in the resulting plot all the secondary y-axis variables are cluttered into the first facet.
The issue seems to be absent when I return the p_facets list that goes into subplot, as each plot looks fine on its own.
How can I fix this issue?
Okay, I followed the ideas given in this github issue about your bug.
library(rlang)
library(plotly)
library(tibble)
dual_axis_lines <- function(data, x, y_left, ..., facets = FALSE, axes = NULL){
x <- rlang::enquo(x)
y_left <- rlang::enquo(y_left)
y_right <- rlang::enquos(...)
## I removed some things here for simplicity, and because we want overlaying to vary between subplots.
y_left_axparms <- list(
tickfont = list(color = "#1f77b4"),
side = "left")
y_right_axparms <- list(
side = "right")
p <- plotly::plot_ly(data , x = x) %>%
plotly::add_trace(y = y_left, name = quo_name(y_left),
yaxis = "y", type = 'scatter', mode = 'lines',
line = list(color = "#1f77b4"))
p_facets <- list()
## I needed to change the for loop so that we know which plot index we are working with
for(v in seq_along(y_right)){
p_facets[[quo_name(y_right[[v]])]] <- p %>%
plotly::add_trace(y = y_right[[v]], x = x, name = quo_name(y_right[[v]]),
yaxis = "y2", type = 'scatter', mode = 'lines') %>%
plotly::layout(yaxis = y_left_axparms,
## here is where you can assign each extra line to a particular subplot.
## you want overlaying to be: "y", "y3", "y5"... for each subplot
yaxis2 = append(y_right_axparms, c(overlaying = paste0(
"y", c("", as.character(seq(3,100,by = 2)))[v]))))
}
p <- subplot(p_facets, nrows = length(y_right), shareX = TRUE)
return(p)
}
mtcars %>%
rowid_to_column() %>%
dual_axis_lines(rowid, mpg, cyl, disp, hp, facets = TRUE)
Axis text the same color as the lines.
For this you would need two things. You would need to give a palette to your function outside of your for-loop:
color_palette <- colorRampPalette(RColorBrewer::brewer.pal(10,"Spectral"))(length(y_right))
If you don't like the color palette, you can change it!
I've cleaned up the for-loop so it's easier to look at. This is what it would look like now, so that lines and axis text share the same color:
for(v in seq_along(y_right)){
  ## here is where you can assign each extra line to a particular subplot.
  ## you want overlaying to be: "y", "y3", "y5"... for each subplot
  overlaying_location <- paste0("y", c("", as.character(seq(3,100,by = 2)))[v])
  trace_name <- quo_name(y_right[[v]])
  trace_value <- y_right[[v]]
  trace_color <- color_palette[v]
  p_facets[[trace_name]] <- p %>%
    plotly::add_trace(y = trace_value,
                      x = x,
                      name = trace_name,
                      yaxis = "y2",
                      type = 'scatter',
                      mode = 'lines',
                      line = list(color = trace_color)) %>%
    plotly::layout(yaxis = y_left_axparms,
                   ## We can build the yaxis2 right here.
                   yaxis2 = list(side = 'right',
                                 overlaying = overlaying_location,
                                 tickfont = list(color = trace_color)))
}
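The "y", "y3", "y5" pattern used for overlaying can also be written as a small function of the facet index. This is only a sketch: overlaying_axis is a made-up helper name, and it assumes, as the answer does, that every facet contributes exactly two y axes, so facet i ends up with its left axis at position 2i - 1.

```r
# Hypothetical helper: name of the left y axis that facet i's right axis overlays
overlaying_axis <- function(i) {
  stopifnot(i >= 1)
  # The first axis is plain "y"; later facets use odd-numbered axes y3, y5, ...
  if (i == 1) "y" else paste0("y", 2 * i - 1)
}

vapply(1:4, overlaying_axis, character(1))
```

Compared with indexing into seq(3, 100, by = 2), this avoids the hard-coded upper bound on the number of facets.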

How do I split grouped bar chart in R by variable

I am trying to split the attached grouped bar chart by the variable spec. Two thoughts on best way to do this are by adding facet_grid() or if a filter can be applied to the static output? Can either be done? Any advice appreciated.
A sample is below:
period <- c('201901', '201901', '201904', '201905')
spec <- c('alpha', 'bravo','bravo', 'charlie')
c <- c(5,6,3,8)
e <- c(1,2,4,5)
df <- data.frame(period, spec, c,e)
library(tidyverse)
library(plotly)
plot_ly(df, x = ~period, y = ~c, type = 'bar', name = "C 1", marker = list(color = 'lightsteelblue3')) %>%
  add_trace(y = ~e, name = "E 1", marker = list(color = 'Gray')) %>%
  layout(xaxis = list(title="", tickangle = -45),
         yaxis = list(title=""),
         margin = list(b=100),
         barmode = 'group'
  )
I am not sure if you are plotting what you actually want to achieve? My suggestion is to create your plot using standard ggplot and then use ggplotly.
For this, you also need to reshape your data and make it a bit longer.
library(tidyverse)
library(plotly)
period <- c('201901', '201901', '201904', '201905')
spec <- c('alpha', 'bravo','bravo', 'charlie')
c <- c(5,6,3,8)
e <- c(1,2,4,5)
df <- data.frame(period, spec, c,e) %>%
pivot_longer(cols = c(c,e), names_to = 'var', values_to = 'val')
p <- ggplot(df, aes(period, val, fill = var)) +
geom_col(position = position_dodge()) +
facet_grid(~spec)
ggplotly(p)
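To make the "longer" shape concrete, here is roughly what the reshaping step produces, sketched in base R without tidyr. The names wide, long and value_cols are made up for this illustration, and the row order differs from what pivot_longer returns (it interleaves c and e per input row), though the content is the same:

```r
period <- c('201901', '201901', '201904', '201905')
spec <- c('alpha', 'bravo', 'bravo', 'charlie')
wide <- data.frame(period, spec, c = c(5, 6, 3, 8), e = c(1, 2, 4, 5),
                   stringsAsFactors = FALSE)

# Columns holding measurements; each becomes a block of rows in the long table
value_cols <- c("c", "e")

long <- data.frame(
  period = rep(wide$period, times = length(value_cols)),
  spec   = rep(wide$spec, times = length(value_cols)),
  var    = rep(value_cols, each = nrow(wide)),          # which column the value came from
  val    = unlist(wide[value_cols], use.names = FALSE),  # the values, column by column
  stringsAsFactors = FALSE
)

long
```

Each original row now appears twice, once per measured variable, which is what lets fill = var and facet_grid(~spec) work in the ggplot call.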
It's probably easier to use facets here, but a more "interactive" option would be to use a filter transform, which gives you a drop-down menu in the top left corner of your plot.
spec.val <- unique(df$spec)
plot_ly(
df %>% pivot_longer(-c(period, spec)),
x = ~period, y = ~value, color = ~name,
type = "bar",
transforms = list(
list(
type = "filter",
target = ~spec,
operation = "=",
value = spec.val[1]))) %>%
layout(
updatemenus = list(
list(
type = "dropdown",
active = 0,
buttons = map(spec.val, ~list(
method = "restyle",
args = list("transforms[0].value", .x),
label = .x)))))

R Plotly jittered boxplot with NAs width

I am plotting the grouped boxplot with jittering with the following function:
plot_boxplot <- function(dat) {
# taking one row of each joined_group to be able to plot it
allx <- dat %>%
mutate(y = median(y, na.rm = TRUE)) %>%
group_by(joined_group) %>%
sample_n(1) %>%
ungroup()
p <- dat %>%
plotly::plot_ly() %>%
# plotting all the groups 1:20
plotly::add_trace(data = allx,
x = ~as.numeric(joined_group),
y = ~y,
type = "box",
hoverinfo = "none",
boxpoints = FALSE,
color = NULL,
opacity = 0,
showlegend = FALSE) %>%
# plotting the boxes
plotly::add_trace(data = dat,
x = ~as.numeric(joined_group),
y = ~y,
color = ~group1,
type = "box",
hoverinfo = "none",
boxpoints = FALSE,
showlegend = FALSE) %>%
# adding ticktext
layout(xaxis = list(tickvals = 1:20,
ticktext = rep(levels(dat$group1), each = 4)))
p <- p %>%
# adding jittering
add_markers(data = dat,
x = ~jitter(as.numeric(joined_group), amount = 0.2),
y = ~y,
color = ~group1,
showlegend = FALSE)
p
}
The problem is that when some of the levels have NA as y variable the width of the jittered boxes changes. Here is an example:
library(plotly)
library(dplyr)
set.seed(123)
dat <- data.frame(group1 = factor(sample(letters[1:5], 100, replace = TRUE)),
group2 = factor(sample(LETTERS[21:24], 100, replace = TRUE)),
y = runif(100)) %>%
dplyr::mutate(joined_group = factor(
paste0(group1, "-", group2)
))
# do the plot with all the levels
p1 <- plot_boxplot(dat)
# now the group1 e is having NAs as y values
dat$y[dat$group1 == "e"] <- NA
# create the plot with missing data
p2 <- plot_boxplot(dat)
# creating the subplot to see that the width has changed:
subplot(p1, p2, nrows = 2)
The problem is that the width of boxes in both plots is different:
I've realised that the boxes have the same size without jittering so I know that the jittering is "messing" with the width but I don't know how to fix that.
Does anyone know how to make the width in both jittered plots exactly the same?
I see two separate plot shifts:
1. due to jittering
2. due to NAs
The first can be solved by declaring a new jitter function with a fixed seed
fixed_jitter <- function (x, factor = 1, amount = NULL) {
set.seed(42)
jitter(x, factor, amount)
}
and using it instead of jitter in the add_markers call.
The second problem can be solved by assigning -1 instead of NA and setting
yaxis = list(range = c(0, ~max(1.1 * y)))
as a second parameter to layout.
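One side effect of the fixed-seed jitter helper above is that it resets the session's random number generator on every call. A possible variant, offered only as a sketch (the name fixed_jitter_local and its seed argument are made up here), saves the RNG state and restores it on exit, so the offsets stay reproducible without disturbing other random draws:

```r
# Hypothetical variant of the fixed-seed jitter that leaves the caller's RNG untouched
fixed_jitter_local <- function(x, amount = 0.2, seed = 42) {
  # Save the current RNG state, if one exists, and restore it when we return
  if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    old_seed <- get(".Random.seed", envir = globalenv())
    on.exit(assign(".Random.seed", old_seed, envir = globalenv()), add = TRUE)
  }
  set.seed(seed)
  # With an explicit amount, jitter() adds uniform noise in [-amount, amount]
  jitter(x, amount = amount)
}

# Repeated calls give identical offsets
identical(fixed_jitter_local(1:5), fixed_jitter_local(1:5))
```

It would be used the same way in the plotting function, for example x = ~fixed_jitter_local(as.numeric(joined_group), amount = 0.2).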

R plotly show only labels where percentage value is above 10

I am making a pie-chart in plotly in R.
I want my labels to be on the chart, so I use textposition = "inside", but for the very small slices those values are not visible.
I am trying to find a way to exclude those labels.
Ideally, I would like to not print any labels on my plot that are below 10%.
Setting textposition = "auto" doesn't work well, since there are a lot of small slices, and it makes the graph look very messy.
Is there a way to do it?
For example, these pie charts from the plotly website (https://plot.ly/r/pie-charts/)
library(plotly)
library(dplyr)
cut <- diamonds %>%
group_by(cut) %>%
summarize(count = n())
color <- diamonds %>%
group_by(color) %>%
summarize(count = n())
clarity <- diamonds %>%
group_by(clarity) %>%
summarize(count = n())
plot_ly(cut, labels = ~cut, values = ~count, type = "pie", domain = list(x = c(0, 0.4), y = c(0.4, 1)),
        name = "Cut", showlegend = FALSE) %>%
  add_trace(data = color, labels = ~color, values = ~count, type = "pie", domain = list(x = c(0.6, 1), y = c(0.4, 1)),
            name = "Color", showlegend = FALSE) %>%
  add_trace(data = clarity, labels = ~clarity, values = ~count, type = "pie", domain = list(x = c(0.25, 0.75), y = c(0, 0.6)),
            name = "Clarity", showlegend = FALSE) %>%
  layout(title = "Pie Charts with Subplots")
In the plot for Clarity, the 1.37% label is placed outside of the pie, while I would like it not to show at all.
You'll have to specify sector labels manually like so:
# Sample data
df <- data.frame(category = LETTERS[1:10],
value = sample(1:50, size = 10))
# Create sector labels
pct <- round(df$value / sum(df$value), 2)
# Anything less than 10% gets a blank label; match on the number, not on the
# text "0%", which would also blank "10%", "20%", ...
pct <- ifelse(pct < 0.1, "", paste0(pct * 100, "%"))
# Install devtools
install.packages("devtools")
# Install latest version of plotly from github
devtools::install_github("ropensci/plotly")
# Plot
library(plotly)
plot_ly(df,
labels = ~category, # Note formula since plotly 4.0
values = ~value, # Note formula since plotly 4.0
type = "pie",
text = pct, # Manually specify sector labels
textposition = "inside",
textinfo = "text" # Ensure plotly only shows our labels and nothing else
)
Check out https://plot.ly/r/reference/#pie for more information...
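The label-building step can be wrapped in a small function so the cut-off is a parameter. This is a sketch only: pie_labels and min_share are made-up names, and unlike the code above it compares the unrounded share against the threshold.

```r
# Hypothetical helper: percentage labels, blank for slices under min_share
pie_labels <- function(values, min_share = 0.1) {
  share <- values / sum(values)
  # Compare the numeric share, then format; slices below the cut-off get ""
  ifelse(share < min_share, "", paste0(round(100 * share), "%"))
}

pie_labels(c(50, 30, 10, 6, 4))
```

The result can be passed as text = pie_labels(df$value) together with textinfo = "text". A slice at exactly 10% keeps its label here, because the comparison is strict.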
