Plotly bars with x-axis offset for timestamps - r

I want to plot some timestamps with plotly bars, with 1 bar indicating a whole hour.
My problem is that the ticks are centered in the middle and I would like to shift them to the left end of the bars.
When the plot isn't zoomed in, it's not such a problem, but when zooming in, more tick-labels will appear and they would be wrong.
EDIT: I need the option barmode = 'overlay' as I also have other traces to plot, which are not included in this example.
The picture below illustrates my current and expected layout, and here is some data to reproduce the plot. (Some options I tried without success are also included in the xaxis configuration, but commented out.)
library(plotly)
library(lubridate)
df <- data.frame(
ts = seq(as.POSIXct("2019-03-20 00:00:00"), by = "hour", length.out = 24),
val = sample(1:100, 24)
)
plot_ly() %>%
add_bars(data = df, x = ~ts, y = ~val) %>%
layout(dragmode = "select", autosize = TRUE, selectdirection = "h",
barmode = 'overlay',
bargap = 0.05,
xaxis = list(ticks = "outside",
type = "date",
# tickson="boundaries",
# offset=1800,
tickmode = "auto",
title = ""
)) %>%
config(scrollZoom = TRUE)

Would the following meet your needs?
One thing I sometimes take advantage of with plotly is that the hover text can show values that are independent of the x and y values used to plot the data.
In this case, we can create a column with an offset time value, ts_x, and plot each bar half an hour past the time for its row. If you have a bar for every hour, this effectively left-aligns the bars with the hour ticks.
library(plotly)
df <- data.frame(
ts = seq(as.POSIXct("2019-03-20 00:00:00"), by = "hour", length.out = 24),
val = sample(1:100, 24)
)
## Create a dummy column with x offset values
df$ts_x <- df$ts + 1800
plot_ly() %>%
## Plot based on the dummy column
add_bars(data = df, x = ~ts_x, y = ~val,
## Cover up our tracks by not showing true x value on hoverinfo
hoverinfo = "text",
## Give text that includes the un-altered time values
text = ~paste0("Time: ",format(ts, format = "%B %d, %Y %H:%M"),
"<br>Value: ",val)) %>%
layout(dragmode = "select", autosize = TRUE, selectdirection = "h",
barmode = 'overlay',
bargap = 0.05,
xaxis = list(ticks = "outside",
type = "date",
tickmode = "auto",
title = ""
)) %>%
config(scrollZoom = TRUE)
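As a quick sanity check on the arithmetic behind the 1800-second offset, here is a minimal base R sketch. The names bin_width, left_edge and right_edge are made up for illustration, the time zone is fixed to UTC only to keep the output reproducible, and the small bargap is ignored.

```r
# Left edges of the hourly bins (same timestamps as in the answer)
ts <- seq(as.POSIXct("2019-03-20 00:00:00", tz = "UTC"), by = "hour", length.out = 24)

bin_width <- 3600            # bin width in seconds (one hour)
ts_x <- ts + bin_width / 2   # bar centers, half a bin to the right of each timestamp

# A centered bar of width bin_width drawn at ts_x spans [ts, ts + bin_width]
left_edge  <- ts_x - bin_width / 2
right_edge <- ts_x + bin_width / 2

# Show the first few centers as clock times
format(ts_x[1:3], "%H:%M")
```

If the bars covered a different interval (say 15 minutes), only bin_width would need to change.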

By default, bars are centered; I didn't find a way to change this.
One alternative is to add a second bar: when there are two bars per x-axis unit, one bar sits to the left of the axis tick and the other to the right (which is what you are trying to obtain with one bar).
Why not create a second, invisible bar? :)
df <- data.frame(
ts = seq(as.POSIXct("2019-03-20 00:00:00"), by = "hour", length.out = 24),
val = sample(1:100, 24),
val0 = 0
)
plot_ly(df, type = 'bar') %>%
add_trace(x = ~ts, y = ~val0) %>%
add_trace(x = ~ts, y = ~val) %>%
layout(
showlegend = FALSE
) %>%
config(scrollZoom = TRUE)
This will create a legend (as there are two kinds of bars, one for val and one for val0), so I removed it.

Are you sure you are not over-engineering? Subtracting 30 minutes gives me a nice graph when zooming in.
I'm not suggesting you actually edit the data, even though that is what I'm doing in the code. A small function in the call to add_bars could do it instead. If you overlay other data it could make a mess, but I just wanted to suggest it.
library(plotly)
library(lubridate)
df2 <- data.frame(
ts = seq(as.POSIXct("2019-03-20 00:00:00"), by = "hour", length.out = 24) - minutes(30),
val = sample(1:100, 24)
)
plot_ly() %>%
add_bars(data = df2, x = ~ts, y = ~val) %>%
layout(dragmode = "select", autosize = TRUE, selectdirection = "h",
barmode = 'overlay',
bargap = 0.05,
xaxis = list(ticks = "outside",
type = "date",
# tickson="boundaries",
# offset=1800,
tickmode = "auto",
title = ""
)) %>%
config(scrollZoom = TRUE)
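Another thing that might be worth trying, offered as a sketch rather than a verified fix: the plotly.js bar trace documents offset and width attributes, both given in position-axis units, which on a date axis means milliseconds. Setting offset = 0 should make each bar start at its timestamp without touching the data. The variable hour_ms and the 0.95 factor (to imitate bargap = 0.05) are my own assumptions, and I have not checked how this interacts with the other overlaid traces.

```r
library(plotly)

df <- data.frame(
  ts  = seq(as.POSIXct("2019-03-20 00:00:00", tz = "UTC"), by = "hour", length.out = 24),
  val = sample(1:100, 24)
)

# One hour in milliseconds, the unit plotly.js uses on date axes
hour_ms <- 60 * 60 * 1000

fig <- plot_ly() %>%
  add_bars(data = df, x = ~ts, y = ~val,
           offset = 0,                   # draw each bar starting at its timestamp
           width  = hour_ms * 0.95) %>%  # slightly narrower than one hour to leave a gap
  layout(barmode = "overlay",
         xaxis = list(type = "date", ticks = "outside", title = "")) %>%
  config(scrollZoom = TRUE)

fig
```

Because the x values stay untouched, the hover text shows the real timestamps without any extra text column.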

Related

Add average line/plot line to area highchart R

I know there are a few similar questions out there, but they all seem to use JavaScript or something other than plain R code, so I don't know how to apply them to my code. All I want to do is add a plot line to my area chart that shows the average of the values. I know that highcharter itself cannot calculate the average, so I can do that myself, but how do I create the plot line? Thank you so much. (I tried to make the code easily reproducible; I hope it is OK.) I attached a picture of the current chart in case that helps.
library(tidyverse)
library(highcharter)
library(ggplot2)
data("diamonds", package = "ggplot2")
df <- diamonds %>%
group_by(cut)%>%
count()
head(df, 4)
# Create chart
hc <- df %>%
hchart(
'area', hcaes(x = cut, y = n),
color = "lightblue"
) %>%
hc_yAxis(title = list(text = "cut"))
# Display chart
hc
Below is a mini example of using the highcharts widget. You can add each series using hc_add_series. In this case, we have two series and two y-axes; using two y-axes helps to differentiate between the series. I'm not sure which values you want to average, so I chose price.
Hope this helps add some clarity to highcharter!
library(tidyverse)
library(highcharter)
df <- diamonds %>%
group_by(cut)%>%
summarise(
n = n(),
avg_price = round(mean(price),2)
)
# create hc widget
highchart(type = "chart") %>%
# add both series
hc_add_series(df, hcaes(x = cut, y = n), color = "lightblue", yAxis = 0, type = "area", name = "N") %>%
hc_add_series(df, hcaes(x = cut, y = avg_price), yAxis = 1, type = "line", name = "Avg Price") %>%
# set type to categories since we're looking at categorical data
hc_xAxis(type = "category", categories = df$cut) %>%
hc_title(text = "Cut Freq vs Avg Price") %>%
# add each y-axis which is linked above in 'hc_add_series'
hc_yAxis_multiples(
list(title = list(text = "Cut")), # yAxis = 0
list(title = list(text = "Average Price"), opposite = TRUE) # yAxis = 1
) %>%
hc_tooltip(shared = TRUE, split = FALSE)
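Haha, I got it. Basically just this:
# cut is a factor, so mean(diamonds$cut) returns NA; average the counts in df$n instead
plotline <- list(
color = "red", value = mean(df$n), width = 2, zIndex = 5
)
# hc is the chart object created in the question
hc %>% hc_yAxis(plotLines = list(plotline))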
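For anyone who wants the whole thing in one piece, here is a minimal sketch of the plot-line approach. It assumes the line should mark the mean of the per-cut counts (n); the names cut_counts and avg_n, the axis title and the label text are my own choices, not part of the original code.

```r
library(dplyr)
library(highcharter)
data("diamonds", package = "ggplot2")

# One row per cut with its frequency in column n
cut_counts <- diamonds %>% count(cut)

# Average of the plotted values, computed in R before building the chart
avg_n <- mean(cut_counts$n)

hc <- cut_counts %>%
  hchart("area", hcaes(x = cut, y = n), color = "lightblue") %>%
  hc_yAxis(
    title = list(text = "n"),
    # A horizontal plot line on the y-axis at the average count
    plotLines = list(list(
      value = avg_n, color = "red", width = 2, zIndex = 5,
      label = list(text = paste("Average:", round(avg_n)))
    ))
  )

hc
```

The zIndex setting is there so that the line is drawn above the filled area rather than behind it.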

R Plotly Stacked Bar breakout date not in correct order

I have an R plotly stacked bar chart where it's broken out by date. I am trying to change the order from the oldest date 10/1/2020 to the newest date 01/01/2021 on top. I noticed in the current state that it's not even in the correct order of dates. The data frame shows in the correct order.
The current code I have.
ramp2 <- colorRamp(c("deepskyblue4", "white"))
ramp.list2 <- rgb( ramp2(seq(0, 1, length = 15)), max = 255)
plot_ly(pcd_2,
x = ~reorder(u_reason_code,-total_qty, sum), y = ~total_qty, type = 'bar', color = ~month_breakout ) %>%
layout(list(title = 'Cost'), barmode = 'stack') %>%
layout(colorway = ramp.list2) %>%
config(displayModeBar = FALSE)
Try formatting your date as a factor with the following code (not tested, as no data was shared):
#Process data
pcd_2$Date <- as.Date(pcd_2$month_breakout,'%m/%d/%Y')
pcd_2 <- pcd_2[order(pcd_2$Date),]
pcd_2$month_breakout <- factor(pcd_2$month_breakout,
levels = unique(pcd_2$month_breakout),
ordered = T)
#Plot
ramp2 <- colorRamp(c("deepskyblue4", "white"))
ramp.list2 <- rgb( ramp2(seq(0, 1, length = 15)), max = 255)
plot_ly(pcd_2,
x = ~reorder(u_reason_code,-total_qty, sum), y = ~total_qty, type = 'bar', color = ~month_breakout ) %>%
layout(list(title = 'Cost'), barmode = 'stack') %>%
layout(colorway = ramp.list2) %>%
config(displayModeBar = FALSE)
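To see why the factor step matters, here is a small base R sketch with made-up month labels (the real pcd_2 data was not shared, so these values are only an assumption about its format). Character labels are ordered alphabetically, while a factor keeps whatever level order you give it.

```r
# Hypothetical labels in the same m/d/Y style as the question
month_breakout <- c("01/01/2021", "10/1/2020", "12/1/2020", "11/1/2020")

# Alphabetical sorting puts the 2021 label first because it starts with "0"
sort(month_breakout)

# Parse to Date, then use the chronological order to define the factor levels
parsed  <- as.Date(month_breakout, format = "%m/%d/%Y")
month_f <- factor(month_breakout, levels = unique(month_breakout[order(parsed)]))

# Levels now run from the oldest to the newest date
levels(month_f)
```

Passing a factor like month_f to color = should then make plotly follow the level order instead of the alphabetical one.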

How to create an Orbit Chart in R? (Plotly/ggplot2)

I have spent time researching but found no direction on how to create an orbit chart.
Ideally I would like to create an interactive version (e.g. with Plotly), but a ggplot2 version would suffice as well.
Any suggestions are much appreciated!
For a weekly vis contest some time ago, I created some charts like this. I think the commonly accepted term now is "connected scatterplot".
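Here is the skeleton plotly code I used. (It was written for plotly versions before 4.0; newer versions expect formulas such as x = ~x_var.)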
plot_ly(
df,
x = x_var,
y = y_var,
group = group_var,
mode = "markers") %>%
add_trace(
x = x_var,
y = y_var,
xaxis = list(title = ""),
yaxis = list(title = ""),
group = group_var,
line = list(shape = "spline"),
showlegend = FALSE,
hoverinfo = "none")
You can look at the github repo for my submission which includes the code for both ggplot and plotly to produce connected scatterplots.
Using ggplot2:
geom_path() connects the observations in the order in which they appear in the data. geom_line() connects them in order of the variable on the x axis.
Taken from the ggplot manual page: http://docs.ggplot2.org/current/geom_path.html
You may also try out geom_curve and geom_segment if you want more control.
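A minimal ggplot2 sketch of the geom_path() idea, using invented data (the data frame orbit and its columns week, x_value and y_value are hypothetical stand-ins for your own variables):

```r
library(ggplot2)

# Hypothetical data: one row per week, already sorted by time
orbit <- data.frame(
  week    = 1:8,
  x_value = c(5.0, 4.6, 4.9, 4.2, 3.8, 4.1, 3.5, 3.2),
  y_value = c(0.90, 0.92, 0.95, 0.94, 0.96, 0.97, 0.96, 0.98)
)

p <- ggplot(orbit, aes(x = x_value, y = y_value)) +
  geom_path(colour = "grey40") +            # connects rows in data order, not x order
  geom_point(size = 3) +
  geom_text(aes(label = week), vjust = -1)  # label each point with its week number

p
```

Swapping geom_path() for geom_line() here would reorder the points by x_value and lose the path through time, which is the difference described above.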
Thanks to @Bishop, I was able to put together something really close to my ideal orbit chart. I included some chart annotations for the start and end dates, plus a label showing which direction is the optimal solution.
max_date <- final_data_grp[which.max(final_data_grp$week_num), ]
min_date <- final_data_grp[which.min(final_data_grp$week_num), ]
end <- list(
x = max_date$AreaWOH,
y = max_date$SLevel,
text = paste('End', max_date$MondayDate),
xref = "x",
yref = "y"
)
start <- list(
x = min_date$AreaWOH,
y = min_date$SLevel,
text = paste('Start', min_date$MondayDate),
xref = "x",
yref = "y"
)
best_label = list(
x = min(final_data_grp$AreaWOH),
y = max(final_data_grp$SLevel),
text = 'Best Scenario',
showarrow = FALSE,
bordercolor='#c7c7c7',
borderwidth=2,
borderpad=4,
bgcolor='#ff7f0e',
opacity=.7
)
plot_ly(
final_data_grp,
x = AreaWOH,
y = SLevel,
group = MondayDate,
showlegend = FALSE,
marker = list(size = 8,
color = 'black',
opacity = .6)) %>%
add_trace(
x = AreaWOH,
y = SLevel,
line = list(shape = "spline"),
hoverinfo = "none",
showlegend = FALSE) %>%
layout(annotations = list(start, end, best_label))
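The code above uses the pre-4.0 plotly syntax. A rough equivalent for current plotly versions might look like the sketch below; it uses invented data (orbit, x_value, y_value and the note texts are hypothetical) and a single trace with mode = "lines+markers" instead of a separate add_trace call.

```r
library(plotly)

# Hypothetical data: one row per week, already sorted by time
orbit <- data.frame(
  week    = 1:8,
  x_value = c(5.0, 4.6, 4.9, 4.2, 3.8, 4.1, 3.5, 3.2),
  y_value = c(0.90, 0.92, 0.95, 0.94, 0.96, 0.97, 0.96, 0.98)
)

# Annotations for the first and last observation
start_note <- list(x = orbit$x_value[1], y = orbit$y_value[1],
                   text = "Start", xref = "x", yref = "y")
end_note   <- list(x = orbit$x_value[nrow(orbit)], y = orbit$y_value[nrow(orbit)],
                   text = "End", xref = "x", yref = "y")

fig <- plot_ly(orbit, x = ~x_value, y = ~y_value,
               type = "scatter", mode = "lines+markers",
               line = list(shape = "spline"),   # smooth curve through the points
               marker = list(size = 8, color = "black", opacity = 0.6),
               showlegend = FALSE) %>%
  layout(annotations = list(start_note, end_note))

fig
```

The line follows the row order of the data frame, so the data should be sorted by time before plotting.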

Overlaying two histograms in R Plotly

I'm trying to overlay two histogram plots in R plotly. However only one of them shows up. Here's the code I'm using with some random data:
myDF <- cbind.data.frame(Income = sample(1:9, size = 1000, replace= TRUE),
AgeInTwoYearIncrements = sample(seq(from = 2, to = 70, by = 2), size = 1000, replace = TRUE))
plot_ly(data = myDF, alpha = 0.6) %>%
add_histogram(x = ~Income, yaxis = "y1") %>%
add_histogram(x = ~AgeInTwoYearIncrements, yaxis = "y2") %>%
layout(
title = "Salary vs Age",
yaxis = list(
tickfont = list(color = "blue"),
overlaying = "y",
side = "left",
title = "Income"
),
yaxis2 = list(
tickfont = list(color = "red"),
overlaying = "y",
side = "right",
title = "Age"
),
xaxis = list(title = "count")
)
Any help would be much appreciated!
The main cause is that the first y-axis is given an overlaying setting. Also, because the x-axis is meant to show the count, Income and Age have to go on y.
plot_ly(data = myDF, alpha = 0.6) %>%
add_histogram(y = ~Income, yaxis = "y1") %>% # not `x =`
add_histogram(y = ~AgeInTwoYearIncrements, yaxis = "y2") %>%
layout(
title = "Salary vs Age",
yaxis = list(
tickfont = list(color = "blue"),
# overlaying = "y", # the main cause is this line.
side = "left",
title = "Income"
),
yaxis2 = list(
tickfont = list(color = "red"),
overlaying = "y",
side = "right",
title = "Age"
),
xaxis = list(title = "count")
)
[Edited: just flip]
plot_ly(data = myDF, alpha = 0.6) %>%
add_histogram(x = ~ Income, xaxis = "x1") %>%
add_histogram(x = ~ AgeInTwoYearIncrements, xaxis = "x2") %>%
layout(
margin = list(t = 60),
title = "Salary vs Age",
xaxis = list(
tickfont = list(color = "blue"),
side = "left",
title = "Income"
),
xaxis2 = list(
tickfont = list(color = "red"),
overlaying = "x",
side = "top",
position = 0.95,
title = "<br>Age"
),
yaxis = list(title = "count")
)
You can mix histograms:
plot_ly(data = myDF, alpha = 0.6) %>%
add_histogram(x = ~Income) %>%
add_histogram(x = ~AgeInTwoYearIncrements) %>%
layout(
title = "Salary and Age",
yaxis = list(
tickfont = list(color = "blue"),
overlaying = "y",
side = "left",
title = "count"
),
xaxis = list(title = "Salary and Age value")
)
A histogram normally has the frequency / count on the y-axis, not on the x-axis. We can produce a diagram like the one you want, but I'm not sure it is still a histogram.
Also, as you can see in my picture, the frequency/count for salary (here blue) is much higher and its variability is lower than for age. That makes it difficult to get a good-looking diagram. Maybe this is just a problem of your sample data...
So if you want to go with the histogram function, you have to swap the meaning of the frequency and the value on the x-axis.
But anyway, I think a scatterplot would be a better way to show the relation between salary and age.
edit:
This is the result I get when I run your code:
Shown like this, I don't see the sense of the plot or what you want. The first orange column means that an age of 59 occurs between 0 and 5 times in your dataset. The third column means that an age of 88 occurs between 10 and 15 times in your dataset.
Presenting this information in a bar plot doesn't work, because you can have several age values in one category of counts... I hope this is clear.
Anyway, to answer your question I need more clarification.
Following the responses here, I wanted to answer this with an example that others can easily use when for instance plotting two overlapping histograms.
# Add required packages
library(plotly)
# Make some sample data
a = rnorm(1000,4)
b = rnorm(1000,6)
# Make your histogram plot with binsize set automatically
fig <- plot_ly(alpha = 0.6) # don't need "nbinsx = 30"
fig <- fig %>% add_histogram(a, name = "first")
fig <- fig %>% add_histogram(b, name = "second")
fig <- fig %>% layout(barmode = "overlay",
yaxis = list(title = "Frequency"),
xaxis = list(title = "Values"))
# Print your histogram
fig
And here is the result of the code:
Easy way to handle any number of dimensions without repetition
TL;DR: You can rearrange your data to long-form before passing it to plot_ly().
df |>
mutate(row_number = row_number()) |>
pivot_longer(!row_number) |>
plot_ly() |>
add_histogram(x = ~ value,
color = ~ name,
opacity = 0.5) |>
layout(barmode = 'overlay')
Explanation
Given a DF with multiple columns, like the one the OP posted:
df = cbind.data.frame(Income = sample(1:9, size = 1000, replace= TRUE),
AgeInTwoYearIncrements = sample(seq(from = 2, to = 70, by = 2), size = 1000, replace = TRUE))
Then, using tidyr::pivot_longer():
df |> mutate(row_number = row_number()) |> pivot_longer(!row_number)
This gives:
# A tibble: 2,000 × 3
row_number name value
<int> <chr> <dbl>
1 1 Income 1
2 1 AgeInTwoYearIncrements 20
3 2 Income 1
4 2 AgeInTwoYearIncrements 48
5 3 Income 3
6 3 AgeInTwoYearIncrements 26
7 4 Income 4
8 4 AgeInTwoYearIncrements 30
9 5 Income 4
10 5 AgeInTwoYearIncrements 60
# … with 1,990 more rows
Finally, just pipe this to plot_ly(), so the full command is:
df |>
# Add a column to keep track of the row numbers
mutate(row_number = row_number()) |>
# Squash and lengthen the df with one row per row per column (in this case, double its length)
pivot_longer(!row_number) |>
plot_ly() |>
# The magic is here. We set color to track the name variable, which will
# add a separate series per column.
# We set the opacity so we can see where our plots overlap.
add_histogram(x = ~ value,
color = ~ name,
opacity = 0.5) |>
# Without setting this, bars will be plotted side by side for the same x value
# rather than overlapping.
layout(barmode = 'overlay')
Output

R plotly show only labels where percentage value is above 10

I am making a pie-chart in plotly in R.
I want my labels to be on the chart, so I use textposition = "inside", and for the very small slices those values are not visible.
I am trying to find a way to exclude those labels.
Ideally, I would like not to print any labels on my plot that are below 10%.
Setting textposition = "auto" doesn't work well, since there are a lot of small slices, and it makes the graph look very messy.
Is there a way to do it?
For example these piecharts from plotly website (https://plot.ly/r/pie-charts/)
library(plotly)
library(dplyr)
cut <- diamonds %>%
group_by(cut) %>%
summarize(count = n())
color <- diamonds %>%
group_by(color) %>%
summarize(count = n())
clarity <- diamonds %>%
group_by(clarity) %>%
summarize(count = n())
plot_ly(cut, labels = cut, values = count, type = "pie", domain = list(x = c(0, 0.4), y = c(0.4, 1)),
name = "Cut", showlegend = F) %>%
add_trace(data = color, labels = color, values = count, type = "pie", domain = list(x = c(0.6, 1), y = c(0.4, 1)),
name = "Color", showlegend = F) %>%
add_trace(data = clarity, labels = clarity, values = count, type = "pie", domain = list(x = c(0.25, 0.75), y = c(0, 0.6)),
name = "Clarity", showlegend = F) %>%
layout(title = "Pie Charts with Subplots")
In the plot for Clarity 1.37% are outside of the plot, while I would like them not to show at all.
You'll have to specify sector labels manually like so:
# Sample data
df <- data.frame(category = LETTERS[1:10],
value = sample(1:50, size = 10))
# Create sector labels
pct <- round(df$value/sum(df$value),2)
# Blank out anything below 10%. Compare the numbers rather than using grep("0%"),
# which would also match "10%", "20%", ... and wipe those labels too.
pct <- ifelse(pct < 0.1, "", paste0(pct*100, "%"))
# Install devtools
install.packages("devtools")
# Install latest version of plotly from github
devtools::install_github("ropensci/plotly")
# Plot
library(plotly)
plot_ly(df,
labels = ~category, # Note formula since plotly 4.0
values = ~value, # Note formula since plotly 4.0
type = "pie",
text = pct, # Manually specify sector labels
textposition = "inside",
textinfo = "text" # Ensure plotly only shows our labels and nothing else
)
Check out https://plot.ly/r/reference/#pie for more information...
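If you need this for several pies (cut, color, clarity), the label logic can be wrapped in a small helper. This is only a sketch: pie_labels and its threshold_pct argument are hypothetical names, and it rounds to whole percentages.

```r
# Build sector labels, leaving slices below the threshold blank
pie_labels <- function(values, threshold_pct = 10) {
  pct <- round(100 * values / sum(values))  # whole-number percentages
  ifelse(pct < threshold_pct, "", paste0(pct, "%"))
}

# Example: the 5% slice gets an empty label, the others keep theirs
pie_labels(c(50, 45, 5))
```

The result can be passed as text = pie_labels(df$value) together with textinfo = "text", as in the plot above.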
