Plotly: handling missing values with animated chart - r

I found out that an animated plotly chart needs the same number of observations for each of your factors. In other words, one missing observation results in the whole trace being discarded for the entire duration of the animation. That is especially a problem with time-series data where some traces start later or end sooner than others. Is there any workaround besides imputing null values for the missing observations? Thanks!
Crossposting from rstudio community
Example:
library(gapminder)
library(plotly)
library(dplyr)
#working example with no missings
gapminder %>%
group_by(year, continent) %>%
summarise(pop = mean(pop), gdpPercap = mean(gdpPercap), lifeExp = mean(lifeExp)) %>%
plot_ly( x = ~gdpPercap,
y = ~lifeExp,
size = ~pop,
color = ~continent,
frame = ~year,
text = ~continent,
hoverinfo = "text",
type = 'scatter',
mode = 'markers')
#filtering one row results in missing Africa trace for entirety of the plot
gapminder %>%
group_by(year, continent) %>%
summarise(pop = mean(pop), gdpPercap = mean(gdpPercap), lifeExp = mean(lifeExp)) %>%
filter(gdpPercap > 1253) %>%
plot_ly( x = ~gdpPercap,
y = ~lifeExp,
size = ~pop,
color = ~continent,
frame = ~year,
text = ~continent,
hoverinfo = "text",
type = 'scatter',
mode = 'markers')

There seems to be no direct way to solve this problem. Indirectly, the problem with NAs in a data frame can be solved by using ggplot + ggplotly instead of plotly (see this answer). Moreover, when the dataset is incomplete as in my example (rows are missing entirely, rather than containing NAs), it can be solved by using the complete function from the tidyr package (part of the tidyverse).
See the solution:
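library(tidyr) # provides complete()
p <-
gapminder %>%
group_by(year, continent) %>%
summarise(pop = mean(pop), gdpPercap = mean(gdpPercap), lifeExp = mean(lifeExp)) %>%
ungroup() %>% # drop the grouping so complete() can expand over year
filter(gdpPercap > 1253) %>%
complete(continent, year) %>% # add the missing continent/year rows, filled with NA
ggplot(aes(gdpPercap, lifeExp, color = continent)) +
geom_point(aes(frame = year)) + theme_bw()
ggplotly(p)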
That being said, I am not a fan of workarounds in production, so feel free to let me know about any developments in plotly's animation support.
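To illustrate what complete() does in the solution above, here is a minimal sketch on toy data. The data frame df and its columns group, frame and value are made up for illustration and are not part of the gapminder example.

```r
library(tidyr)

# Toy data (hypothetical): group "b" has no observation for frame 2
df <- data.frame(
  group = c("a", "a", "b"),
  frame = c(1, 2, 1),
  value = c(10, 20, 30)
)

# complete() adds one row for every group/frame combination that is absent;
# columns not named in the call (here: value) are filled with NA in new rows
filled <- complete(df, group, frame)
filled
```

After this step every group has a row for every frame, which is the shape the ggplotly workaround relies on. The added rows only carry NA, so nothing is imputed.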

Related

how to create highchart with y axis with character vector

I am creating this chart in highchart using the R package highcharter but it’s not working because the y axis is a character:
This is my code:
library(highcharter)
library(dplyr)
library(tibble) # provides rownames_to_column()
hchart(
mtcars %>% rownames_to_column("rowname"),
"scatter",
hcaes(x = mpg, y = rowname),
colorByPoint = TRUE
)
How can I create a chart like this?
As far as I understand from the docs, we can't have a categorical variable on the y axis for a scatter chart. However, a workaround is to map an index to y and then set the category labels via hc_yAxis:
library(highcharter)
library(tibble)
library(dplyr)
y_axis_categories <- rownames(mtcars)
mtcars %>%
rowid_to_column("id") %>%
mutate(id = id - 1) |> # JS indexing starts at 0
hchart(
"scatter",
hcaes(x = mpg, y = id),
colorByPoint = TRUE
) %>%
hc_yAxis(categories = y_axis_categories)

Changing axis order during frame animation in plotly

I want the y-axis in the plot to order by pop every year when the frames move forward. So when Africa overtakes Americas the axis should change accordingly.
library(plotly)
library(tidyverse)
library(gapminder)
pop_df <- gapminder %>%
group_by(continent, year) %>%
summarise(pop = sum(pop, na.rm = TRUE), .groups = "drop")
pop_df %>%
mutate(continent = reorder(continent, pop)) %>%
plot_ly(y=~continent, x=~pop, frame = ~year) %>%
add_bars(color = ~ continent, text = ~formatC(pop/1000000, format = "f", digits = 1), textposition = "outside", showlegend =FALSE)

plotly R subplot / highlight, size and color problem

I am doing a subplot with a linked barchart and line chart (code below) and I am having a few issues:
library(readr)
library(dplyr)
library(plotly)
library(crosstalk)
library(forecast)
library(ggplot2)
library(tidyverse)
data <- read.csv(url("https://covid.ourworldindata.org/data/owid-covid-data.csv"))
data$date <- as.Date(data$date)
shared_data <- SharedData$new(data, key = ~location)
col <- shared_data %>%
plot_ly() %>%
mutate(location=as.character(location)) %>%
filter(date == "2020-12-07") %>%
filter(continent == "Europe" & location != "Russia" & population > 9000000) %>%
mutate(location = fct_reorder(location, total_cases_per_million, .desc = TRUE)) %>%
add_bars(x = ~location, y = ~total_cases_per_million, color = I("grey")) %>%
layout(yaxis = list(title = "Total Covid-19 cases per million people")) %>%
hide_legend()
lines <- shared_data %>%
plot_ly() %>%
filter(continent == "Europe" & location != "Russia" & population > 9000000) %>%
add_lines(x = ~date, y = ~new_cases_smoothed_per_million, color = ~location) %>%
layout(yaxis = list(title = "New Covid-19 cases (7 days avg) per million people"))
subplot(col, lines, titleY = TRUE) %>%
hide_legend() %>%
highlight(on = "plotly_hover") %>%
layout(title = "Covid-19 incidence in largest European countries")
1- highlight: when I highlight, the bars of the bar chart get sort of split in two:
Does anyone know how to fix this?
2- color: how can I get the same color for the same country in both graphs? If I just map the country (location) to color in the bar chart, the resulting colors do not correspond. I tried a couple of lines I found by googling around, but nothing seems to work.
3- size: how can I get the two graphs to have the same size? (As you can see from the image, they are shifted with respect to each other.)
Try adding layout(barmode = "overlay")

How to plot arrest rate (%) for the top 20 crime types (crimes of chicago dataset)?

I am working with R in RStudio and would like to use the highcharter package to plot a chart with the crime type on the x-axis and the arrest rate in % on the y-axis, so I can see which crime types have the highest arrest rate. I am using the following code in Shiny, which runs but does not plot exactly what I want:
output$top20arrestCrime <- renderHighchart({
arrestCrimeAnalysis <- cc %>%
group_by(Primary.Type, Arrest == TRUE) %>%
summarise(Total = n()) %>%
arrange(desc(Total))
hchart(arrestCrimeAnalysis, "column", hcaes(x = Primary.Type, y = Total, color = Total)) %>%
hc_exporting(enabled = TRUE, filename = "Top_20_Locations") %>%
hc_title(text = "Top 20 Crime Types") %>%
hc_subtitle(text = "(2001 - 2016)") %>%
hc_xAxis(title = list(text = "Crime Type"), labels = list(rotation = -90)) %>%
hc_yAxis(title = list(text = "Arrest Rate %")) %>%
hc_colorAxis(stops = color_stops(n = 10, colors = c("#d98880", "#85c1e9", "#82e0aa"))) %>%
hc_add_theme(hc_theme_smpl()) %>%
hc_legend(enabled = FALSE)
})
I am working with this dataset: https://www.kaggle.com/currie32/crimes-in-chicago.
When I run the code, the x-axis correctly shows the crime type (e.g. THEFT, ROBBERY), but the y-axis shows the total count, for example the sum of thefts from 2001-2016. What I want on the y-axis is the arrest rate as a percentage, i.e. how many of the cases led to an arrest, in a highcharter chart of the top 20 crime types by arrests.
Example screenshot of Shiny app
Your problem is you haven't told highcharter to put Arrest Rate on the y axis. You've told it to put Total on the y axis:
arrestCrimeAnalysis <- cc %>%
group_by(Primary.Type, Arrest == TRUE) %>%
summarise(Total = n()) %>%
arrange(desc(Total))
hchart(arrestCrimeAnalysis, "column", hcaes(x = Primary.Type, y = Total, color = Total))
Change y = Total to y = ArrestRate or whatever your rate column name is.
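If the rate column does not exist yet, it could be computed along these lines. This is only a sketch: cc_toy is a made-up stand-in for the real data, ArrestRate is a column name chosen here for illustration, and it assumes Arrest is a logical column.

```r
library(dplyr)

# Toy stand-in for the crimes data (hypothetical values);
# assumes Arrest is a logical column
cc_toy <- data.frame(
  Primary.Type = c("THEFT", "THEFT", "THEFT", "THEFT", "ROBBERY", "ROBBERY"),
  Arrest = c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)
)

arrest_rates <- cc_toy %>%
  group_by(Primary.Type) %>%
  summarise(
    Total = n(),                      # number of recorded cases per type
    ArrestRate = 100 * mean(Arrest)   # share of cases with an arrest, in percent
  ) %>%
  arrange(desc(ArrestRate)) %>%
  head(20)                            # keep at most 20 types, highest rate first

arrest_rates
```

With a table like this, hcaes(x = Primary.Type, y = ArrestRate) puts the rate on the y axis. If "top 20" is meant by number of cases rather than by rate, sort by Total before taking the first 20 rows.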

Stacked bar graphs in plotly: how to control the order of bars in each stack

I'm trying to order a stacked bar chart in plotly, but it is not respecting the order I pass it in the data frame.
It is best shown using some mock data:
library(dplyr)
library(plotly)
cars <- sapply(strsplit(rownames(mtcars), split = " "), "[", i = 1)
dat <- mtcars
dat <- cbind(dat, cars, stringsAsFactors = FALSE)
dat <- dat %>%
mutate(carb = factor(carb)) %>%
distinct(cars, carb) %>%
select(cars, carb, mpg) %>%
arrange(carb, desc(mpg))
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = cars) %>%
layout(barmode = "stack")
The resulting plot doesn't respect the ordering. I want the cars with the largest mpg stacked at the bottom of each carb group. Any ideas?
As already pointed out here, the issue is caused by having duplicate values in the column used for color grouping (in this example, cars). As indicated already, the ordering of the bars can be remedied by grouping your colors by a column of unique names. However, doing so will have a couple of undesired side-effects:
different model cars from the same manufacturer would be shown with different colors (not what you are after - you want to color by manufacturer)
the legend will have more entries in it than you want i.e. one per model of car rather than one per manufacturer.
We can hack our way around this by a) creating the legend from a dummy trace that never gets displayed (add_trace(type = "bar", x = 0, y = 0... in the code below), and b) setting the colors for each category manually using the colors= argument. I use a rainbow palette below to show the principle. You may like to select some more attractive colours yourself.
dat$unique.car <- make.unique(as.character(dat$cars))
dat2 <- data.frame(cars=levels(as.factor(dat$cars)),color=rainbow(nlevels(as.factor(dat$cars))))
dat2[] <- lapply(dat2, as.character)
dat$color <- dat2$color[match(dat$cars,dat2$cars)]
plot_ly() %>%
add_trace(data=dat2, type = "bar", x = 0, y = 0, color = cars, colors=color, showlegend=T) %>%
add_trace(data=dat, type = "bar", x = carb, y = mpg, color = unique.car, colors=color, showlegend=F, marker=list(line=list(color="black", width=1))) %>%
layout(barmode = "stack", xaxis = list(range=c(0.4,8.5)))
One way to address this is to give unique names to all models of car and use that in plotly, but it's going to make the legend messier and impact the color mapping. Here are a few options:
dat$carsID <- make.unique(as.character(dat$cars))
# dat$carsID <- apply(dat, 1, paste0, collapse = " ") # alternative
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID) %>%
layout(barmode = "stack")
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID,
colors = rainbow(length(unique(carsID)))) %>%
layout(barmode = "stack")
I'll look more tomorrow to see if I can improve the legend and color mapping.
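For reference, a small sketch of what make.unique does to repeated names, which is what both answers rely on. The vector cars_toy is made up for illustration.

```r
# Hypothetical manufacturer names with repeats
cars_toy <- c("Merc", "Mazda", "Merc", "Merc", "Mazda")

# make.unique() keeps the first occurrence and appends .1, .2, ... to repeats
ids <- make.unique(cars_toy)
ids
# "Merc" "Mazda" "Merc.1" "Merc.2" "Mazda.1"
```

Each bar then gets its own identifier, which is why the legend grows to one entry per car unless it is rebuilt separately.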
