How to create a highchart with a character vector on the y axis - R

I am creating this chart in highchart using the R package highcharter but it’s not working because the y axis is a character:
This is my code:
library(highcharter)
library(dplyr)
library(tibble) # provides rownames_to_column()
hchart(
mtcars %>% rownames_to_column("rowname"),
"scatter",
hcaes(x = mpg, y = rowname),
colorByPoint = TRUE
)
How can I create a chart like this?

As far as I understand from the docs, a scatter chart can't have a categorical variable on the y axis. A workaround is to map a numeric index to y and then set the category labels via hc_yAxis:
library(highcharter)
library(tibble)
library(dplyr)
y_axis_categories <- rownames(mtcars)
mtcars %>%
rowid_to_column("id") %>%
mutate(id = id - 1) %>% # JS indexing starts at 0
hchart(
"scatter",
hcaes(x = mpg, y = id),
colorByPoint = TRUE
) %>%
hc_yAxis(categories = y_axis_categories)
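The index trick can be checked in isolation. Below is a minimal base R sketch, assuming three made-up labels in place of rownames(mtcars); the names labels, index0 and lookup are only for illustration. It shows how the zero-based positions line up with the category labels.

```r
# Made-up labels standing in for rownames(mtcars)
labels <- c("Mazda RX4", "Datsun 710", "Valiant")

# Zero-based positions, as used for the y values in the chart
index0 <- seq_along(labels) - 1

# R vectors are 1-based, so add 1 to look a label up again
lookup <- labels[index0 + 1]
print(data.frame(index0, label = lookup))
```

Each y value is a position in the categories vector, so the order of that vector decides the order of the labels on the axis.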

Related

How to Arrange the x axis on a Plotly Bar Chart with R

Please opine on how the order of the x axis on a Plotly bar chart can be arranged.
I am using a toy example with the diamonds dataset and trying to arrange clarity by ascending mean depth. I am very familiar with ggplot but quite new to plotly. I have seen some posts online about this issue, but none seem definitive. After rendering the plot, I think the clarity categories are indeed ordered correctly: hovering the mouse over the bars to get the label values suggests this. But these values (61.3, 61.3, 61.4, 61.6, 61.7, 61.7, 61.8, 61.9 across the clarity groups) don't obviously map to the y axis, which is on a scale of 0 to 16k. This is confusing me. I am not looking to use the ggplotly wrapper; I am looking for a plotly solution. Thanks.
library(tidyverse)
library(plotly)
set.seed(321)
my_diamonds <- ggplot2::diamonds %>%
slice(sample(nrow(.), 1000))
my_diamonds %>%
group_by(clarity) %>%
mutate(mean_depth = mean(depth)) %>%
ungroup() %>%
plot_ly(
data = .
, x = ~ clarity
, y = ~ mean_depth
) %>%
layout(
title = "Mean Depth for each Clarity Category"
, xaxis = list(categoryorder = "array", categoryarray = ~ reorder(clarity, mean_depth))
)
The data has not been fully processed before plotting: mutate() keeps every row, so each clarity level still has many rows with the same mean_depth. Once you reduce the data to one row per clarity, your code plots fine. I have subtracted 61 from mean_depth to make the bar order easier to see; you can remove the subtraction. Try this:
set.seed(321)
my_diamonds <- ggplot2::diamonds %>%
slice(sample(nrow(.), 1000))
my_diamonds %>%
group_by(clarity) %>%
mutate(mean_depth = mean(depth)-61) %>%
distinct(clarity,mean_depth) %>% arrange(mean_depth) %>%
ungroup() %>%
plot_ly(
data = .
, x = ~ clarity
, y = ~ mean_depth
) %>%
layout(
title = "Mean Depth for each Clarity Category"
, xaxis = list(categoryorder = "array", categoryarray = ~ reorder(clarity, mean_depth))
)
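The category order can also be computed up front as a plain character vector. Here is a minimal base R sketch, assuming a small made-up data frame (toy) rather than the diamonds sample; the depth values are invented for illustration.

```r
# Made-up data: a category and a numeric measurement
toy <- data.frame(
  clarity = c("SI1", "VS2", "SI1", "IF", "VS2", "IF"),
  depth   = c(62.0, 61.5, 62.4, 60.9, 61.7, 61.1)
)

# One mean per category
mean_depth <- tapply(toy$depth, toy$clarity, mean)

# Category names sorted by ascending mean
category_order <- names(mean_depth)[order(mean_depth)]
print(category_order)
```

A vector like category_order could then be passed as categoryarray, together with categoryorder = "array", in layout().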

Plotly: handling missing values with animated chart

I found that an animated plotly chart needs the same number of observations for each factor level: one missing observation results in the whole trace being dropped for the entire animation. This is especially a problem with time-series data where some traces start later or end sooner than others. Is there any workaround besides imputing null values for the missing observations? Thanks!
Crossposting from rstudio community
Example:
library(gapminder)
library(plotly)
library(dplyr)
#working example with no missings
gapminder %>%
group_by(year, continent) %>%
summarise(pop = mean(pop), gdpPercap = mean(gdpPercap), lifeExp = mean(lifeExp)) %>%
plot_ly( x = ~gdpPercap,
y = ~lifeExp,
size = ~pop,
color = ~continent,
frame = ~year,
text = ~continent,
hoverinfo = "text",
type = 'scatter',
mode = 'markers')
#filtering one row results in missing Africa trace for entirety of the plot
gapminder %>%
group_by(year, continent) %>%
summarise(pop = mean(pop), gdpPercap = mean(gdpPercap), lifeExp = mean(lifeExp)) %>%
filter(gdpPercap > 1253) %>%
plot_ly( x = ~gdpPercap,
y = ~lifeExp,
size = ~pop,
color = ~continent,
frame = ~year,
text = ~continent,
hoverinfo = "text",
type = 'scatter',
mode = 'markers')
There seems to be no direct way to solve this problem. Indirectly, the problem with NAs in a data frame can be solved by using ggplot + ggplotly instead of plotly (see this answer). Moreover, when the dataset is incomplete, as in my example (rows missing entirely rather than containing NAs), it can be solved with the complete() function from tidyr, which is part of the tidyverse.
See the solution:
library(tidyr) # provides complete()
p <-
gapminder %>%
group_by(year, continent) %>%
summarise(pop = mean(pop), gdpPercap = mean(gdpPercap), lifeExp = mean(lifeExp)) %>%
ungroup() %>% # complete() cannot fill in a grouping column
filter(gdpPercap > 1253) %>%
complete(continent, year) %>% # add the missing continent/year rows as NA
ggplot(aes(gdpPercap, lifeExp, color = continent)) +
geom_point(aes(frame = year)) + theme_bw()
ggplotly(p)
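To see what completing the data means, here is a rough base R equivalent, assuming a tiny made-up table (observed) with one continent/year combination missing. The values are invented and expand.grid() plus merge() only stand in for complete(); this is not how tidyr implements it.

```r
# Toy summary with the (Africa, 1952) combination missing
observed <- data.frame(
  continent = c("Africa", "Asia", "Asia"),
  year      = c(1957, 1952, 1957),
  lifeExp   = c(41.3, 46.3, 49.3),
  stringsAsFactors = FALSE
)

# Every combination of the two key columns
full_grid <- expand.grid(
  continent = unique(observed$continent),
  year      = unique(observed$year),
  stringsAsFactors = FALSE
)

# Left join: combinations without data get NA in lifeExp
completed <- merge(full_grid, observed, by = c("continent", "year"), all.x = TRUE)
print(completed)
```

After this step every continent has a row for every year, which is the shape the animated chart appears to need.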
That being said, I am not a fan of workarounds when used in production, so feel free to inform me about the development in plotly animate function.

Change locale in Plotly for R (thousand separator and decimal character)

I'd like to adjust the tick labels in a plotly chart so that they would display a comma as a decimal separator and a point as a thousand separator.
library(plotly)
library(ggplot2)
library(dplyr)
diamonds %>%
mutate(cut = as.character(cut)) %>%
count(cut, clarity) %>%
plot_ly(x = ~cut, y = ~n, color = ~clarity) %>%
layout(yaxis = list(tickformat = ",.1f"))
My locale is already set to "LC_COLLATE=German_Austria.1252;LC_CTYPE=German_Austria.1252;LC_MONETARY=German_Austria.1252;LC_NUMERIC=C;LC_TIME=C"
This is an ugly answer but you can set up your object:
d2 <- diamonds %>%
mutate(cut = as.character(cut)) %>%
count(cut, clarity)
and then create the axis labels from there:
ticklabels <- seq(from=0, to=round(max(d2$n), digits = -3), by=1000)
To create a custom axis label:
plot_ly(d2, x = ~cut, y = ~n, color = ~clarity) %>%
layout(yaxis = list(tickvals = ticklabels, ticktext = paste(ticklabels/1000, ".000", ",00", sep="") ))
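The tick text can also be built with base R's format(), which accepts big.mark and decimal.mark, so the labels don't have to be pasted together by hand. This is a minimal sketch; the tick positions are made up for illustration rather than taken from the diamonds counts.

```r
# Hypothetical tick positions; in the answer above they come from seq() over d2$n
tick_values <- c(0, 1000, 2000, 5000)

# Point as thousands separator, comma as decimal mark, two decimals
tick_text <- format(tick_values, big.mark = ".", decimal.mark = ",",
                    nsmall = 2, trim = TRUE)
print(tick_text)
```

tick_values and tick_text would go into tickvals and ticktext in layout(yaxis = list(...)). plotly's layout also has a separators attribute (for example separators = ",.") that may make the manual labels unnecessary, but check it against the plotly reference for your version.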

Stacked bar graphs in plotly: how to control the order of bars in each stack

I'm trying to order a stacked bar chart in plotly, but it is not respecting the order I pass it in the data frame.
It is best shown using some mock data:
library(dplyr)
library(plotly)
cars <- sapply(strsplit(rownames(mtcars), split = " "), "[", i = 1)
dat <- mtcars
dat <- cbind(dat, cars, stringsAsFactors = FALSE)
dat <- dat %>%
mutate(carb = factor(carb)) %>%
distinct(cars, carb, .keep_all = TRUE) %>% # keep mpg for the select() below
select(cars, carb, mpg) %>%
arrange(carb, desc(mpg))
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = cars) %>%
layout(barmode = "stack")
The resulting plot doesn't respect the ordering. I want the cars with the largest mpg stacked at the bottom of each carb group. Any ideas?
As already pointed out here, the issue is caused by having duplicate values in the column used for color grouping (in this example, cars). As indicated already, the ordering of the bars can be remedied by grouping your colors by a column of unique names. However, doing so will have a couple of undesired side-effects:
different model cars from the same manufacturer would be shown with different colors (not what you are after - you want to color by manufacturer)
the legend will have more entries in it than you want i.e. one per model of car rather than one per manufacturer.
We can hack our way around this by a) creating the legend from a dummy trace that never gets displayed (add_trace(type = "bar", x = 0, y = 0... in the code below), and b) setting the colors for each category manually using the colors= argument. I use a rainbow palette below to show the principle. You may like to select some more attractive colours yourself.
dat$unique.car <- make.unique(as.character(dat$cars))
dat2 <- data.frame(cars=levels(as.factor(dat$cars)),color=rainbow(nlevels(as.factor(dat$cars))))
dat2[] <- lapply(dat2, as.character)
dat$color <- dat2$color[match(dat$cars,dat2$cars)]
plot_ly() %>%
add_trace(data=dat2, type = "bar", x = 0, y = 0, color = cars, colors=color, showlegend=T) %>%
add_trace(data=dat, type = "bar", x = carb, y = mpg, color = unique.car, colors=color, showlegend=F, marker=list(line=list(color="black", width=1))) %>%
layout(barmode = "stack", xaxis = list(range=c(0.4,8.5)))
One way to address this is to give unique names to all models of car and use that in plotly, but it's going to make the legend messier and impact the color mapping. Here are a few options:
dat$carsID <- make.unique(as.character(dat$cars))
# dat$carsID <- apply(dat, 1, paste0, collapse = " ") # alternative
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID) %>%
layout(barmode = "stack")
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID,
colors = rainbow(length(unique(carsID)))) %>%
layout(barmode = "stack")
I'll look more tomorrow to see if I can improve the legend and color mapping.

boxplot won't display with ggvis

I'm trying to make a boxplot with ggvis and I can't seem to view one even with a simple example
library(dplyr)
library(ggvis) # provides ggvis() and layer_boxplots()
library(ggplot2)
library(shiny) #I think this is required? not sure
data.frame(theVar = c(1,5:10,15)) %>% ggvis(x = ~theVar) #makes a histogram
data.frame(theVar = c(1,5:10,15)) %>% ggvis(x = ~theVar) %>% layer_boxplots()
Error: Can't find prop y.update
forcing a y variable:
data.frame(theVar = c(1,5:10,15)) %>% ggvis(x = ~theVar,y=~theVar) %>% layer_boxplots()
seems to turn it into intervals? Not sure what it's doing, but it's not a boxplot, nor should a boxplot need an x and y...
If you have a single variable, you have to use your variable for y and specify a dummy for x:
library(ggvis)
data.frame(theVar = c(1,5:10,15)) %>% ggvis(y = ~theVar, x = ~ 1) %>% layer_boxplots()
