Plotly R order legend entries

Is it possible to order the legend entries in plotly for R?
For example, if I specify a pie chart like this:
plot_ly(df, labels = Product, values = Patients, type = "pie",
marker = list(colors = Color), textfont=list(color = "white")) %>%
layout(legend = list(x = 1, y = 0.5))
The legend gets sorted by which Product has the highest number of Patients. I would like the legend to be sorted in alphabetical order by Product.
Is this possible?

Yes, it's possible. Chart options are here:
https://plot.ly/r/reference/#pie.
An example:
library(plotly)
library(dplyr)
# Dummy data
df <- data.frame(Product = c('Kramer', 'George', 'Jerry', 'Elaine', 'Newman'),
Patients = c(3, 6, 4, 2, 7))
# Make alphabetical
df <- df %>%
arrange(Product)
# Sorts legend largest to smallest
plot_ly(df,
labels = ~Product,
values = ~Patients,
type = "pie",
textfont = list(color = "white")) %>%
layout(legend = list(x = 1, y = 0.5))
# Set sort argument to FALSE and now orders like the data frame
plot_ly(df,
labels = ~Product,
values = ~Patients,
type = "pie",
sort = FALSE,
textfont = list(color = "white")) %>%
layout(legend = list(x = 1, y = 0.5))
# I prefer clockwise
plot_ly(df,
labels = ~Product,
values = ~Patients,
type = "pie",
sort = FALSE,
direction = "clockwise",
textfont = list(color = "white")) %>%
layout(legend = list(x = 1, y = 0.5))
Session info:
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252 LC_MONETARY=English_Australia.1252 LC_NUMERIC=C LC_TIME=English_Australia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bindrcpp_0.2.2 dplyr_0.7.5 plotly_4.7.1 ggplot2_2.2.1
EDIT:
Modified to work with plotly 4.x.x (i.e. added ~)
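If you would rather not load dplyr just for the sort, the same ordering can be sketched in base R. This is a minimal variant of the example above, assuming the same dummy data; the name df_sorted is only illustrative. The key point is unchanged: with sort = FALSE the legend follows the row order of the data frame.

```r
library(plotly)

# Same dummy data as in the answer above
df <- data.frame(
  Product  = c("Kramer", "George", "Jerry", "Elaine", "Newman"),
  Patients = c(3, 6, 4, 2, 7),
  stringsAsFactors = FALSE
)

# Base R equivalent of dplyr::arrange(Product)
df_sorted <- df[order(df$Product), ]

# sort = FALSE keeps the row order of df_sorted in the legend
p <- plot_ly(df_sorted,
             labels = ~Product,
             values = ~Patients,
             type = "pie",
             sort = FALSE,
             direction = "clockwise")

# Print p in an interactive session to render the chart
```

Any other row order (for example a custom, hand-picked order) should carry over to the legend in the same way, since plotly no longer re-sorts the slices.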

Related

Is there a way to subset data in ggrepel with data inherited from the pipe? [duplicate]

I am trying to subset a layer of a plot where I am passing the data to ggplot through a pipe.
Here is an example:
library(dplyr)
library(ggplot2)
library(scales)
set.seed(12345)
df_example = data_frame(Month = rep(seq.Date(as.Date("2015-01-01"),
as.Date("2015-12-31"), by = "month"), 2),
Value = sample(seq.int(30, 150), size = 24, replace = TRUE),
Indicator = as.factor(rep(c(1, 2), each = 12)))
df_example %>%
group_by(Month) %>%
mutate(`Relative Value` = Value/sum(Value)) %>%
ungroup() %>%
ggplot(aes(x = Month, y = Value, fill = Indicator, group = Indicator)) +
geom_bar(position = "fill", stat = "identity") +
theme_bw()+
scale_y_continuous(labels = percent_format()) +
geom_line(aes(x = Month, y = `Relative Value`))
This gives:
I would like only one of those lines to appear, which I would be able to do if something like this worked in the geom_line layer:
geom_line(subset = .(Indicator == 1), aes(x = Month, y = `Relative Value`))
Edit:
Session info:
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 x64 (build 9200)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] scales_0.3.0 lubridate_1.3.3 ggplot2_1.0.1 lazyeval_0.1.10 dplyr_0.4.3 RSQLite_1.0.0 readr_0.2.2
[8] RJDBC_0.2-5 DBI_0.3.1 rJava_0.9-7
loaded via a namespace (and not attached):
[1] Rcpp_0.12.2 knitr_1.11 magrittr_1.5 MASS_7.3-40 munsell_0.4.2 lattice_0.20-31
[7] colorspace_1.2-6 R6_2.1.1 stringr_1.0.0 plyr_1.8.3 tools_3.2.1 parallel_3.2.1
[13] grid_3.2.1 gtable_0.1.2 htmltools_0.2.6 yaml_2.1.13 assertthat_0.1 digest_0.6.8
[19] reshape2_1.4.1 memoise_0.2.1 rmarkdown_0.8.1 labeling_0.3 stringi_1.0-1 zoo_1.7-12
[25] proto_0.3-10
tl;dr: Pass the data to that layer as a function that subsets the plot's data according to your criteria.
According to ggplot2's documentation on layers, you have three options when passing data to a new layer:
If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data.
The first two options are the most common, but the third is perfect for our needs when the data has been modified through pipes.
In your example, adding data = function(x) subset(x,Indicator == 1) to the geom_line does the trick:
library(dplyr)
library(ggplot2)
library(scales)
set.seed(12345)
df_example = data_frame(Month = rep(seq.Date(as.Date("2015-01-01"),
as.Date("2015-12-31"), by = "month"), 2),
Value = sample(seq.int(30, 150), size = 24, replace = TRUE),
Indicator = as.factor(rep(c(1, 2), each = 12)))
df_example %>%
group_by(Month) %>%
mutate(`Relative Value` = Value/sum(Value)) %>%
ungroup() %>%
ggplot(aes(x = Month, y = Value, fill = Indicator, group = Indicator)) +
geom_bar(position = "fill", stat = "identity") +
theme_bw()+
scale_y_continuous(labels = percent_format()) +
geom_line(data = function(x) subset(x,Indicator == 1), aes(x = Month, y = `Relative Value`))
This is the resulting plot
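To convince yourself that the function really receives the plot data and that only the subset reaches the layer, here is a small sketch on the built-in mtcars data (an assumption for illustration, not the data from the question). It uses ggplot2's layer_data() to inspect what each layer ends up with after the plot is built.

```r
library(ggplot2)

# Layer 1 inherits the full plot data; layer 2 is given a function that
# is called with the plot data and returns only the rows to draw.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "grey60") +
  geom_point(data = function(d) subset(d, cyl == 4), colour = "red")

# layer_data() returns the data a layer actually uses after building
n_all    <- nrow(layer_data(p, 1))
n_subset <- nrow(layer_data(p, 2))
```

Because the function is evaluated when the plot is built, it works the same way when the plot data arrives through a pipe and never exists as a named object.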
library(dplyr)
library(ggplot2)
library(scales)
set.seed(12345)
df_example = data_frame(Month = rep(seq.Date(as.Date("2015-01-01"),
as.Date("2015-12-31"), by = "month"), 2),
Value = sample(seq.int(30, 150), size = 24, replace = TRUE),
Indicator = as.factor(rep(c(1, 2), each = 12)))
df_example %>%
group_by(Month) %>%
mutate(`Relative Value` = Value/sum(Value)) %>%
ungroup() %>%
ggplot(aes(x = Month, y = Value, fill = Indicator, group = Indicator)) +
geom_bar(position = "fill", stat = "identity") +
theme_bw()+
scale_y_continuous(labels = percent_format()) +
geom_line(aes(x = Month, y = `Relative Value`,linetype=Indicator)) +
scale_linetype_manual(values=c("1"="solid","2"="blank"))
yields:
You might benefit from stat_subset(), a stat I made for my personal use that is available in metR: https://eliocamp.github.io/metR/articles/Visualization-tools.html#stat_subset
It has an aesthetic called subset that takes a logical expression and subsets the data accordingly.
library(dplyr)
library(ggplot2)
library(scales)
set.seed(12345)
df_example = data_frame(Month = rep(seq.Date(as.Date("2015-01-01"),
as.Date("2015-12-31"), by = "month"), 2),
Value = sample(seq.int(30, 150), size = 24, replace = TRUE),
Indicator = as.factor(rep(c(1, 2), each = 12)))
df_example %>%
group_by(Month) %>%
mutate(`Relative Value` = Value/sum(Value)) %>%
ungroup() %>%
ggplot(aes(x = Month, y = Value, fill = Indicator, group = Indicator)) +
geom_bar(position = "fill", stat = "identity") +
theme_bw()+
scale_y_continuous(labels = percent_format()) +
metR::stat_subset(aes(x = Month, y = `Relative Value`, subset = Indicator == 1),
geom = "line")

geom_col is not using stat_identity when values are rounded to whole numbers

I'm trying to use geom_col to chart columns for values in time series (annual and quarterly).
When I use the zoo package's yearqtr data type for the x-axis values and round the y-axis values to whole numbers, geom_col appears not to use the default stat = 'identity' to determine the bar heights from the y-value of each observation. Instead it appears to switch to stat = 'count' and treat the rounded y-values as factors, counting the number of occurrences of each factor value (e.g., 3 occurrences have a rounded y-value = 11).
If I switch to geom_line, the graph is fine with quarterly x-axis values and rounded y-axis values.
library(zoo)
library(ggplot2)
Annual.Periods <- seq(to = 2020, by = 1, length.out = 8) # 8 years
Quarter.Periods <- as.yearqtr(seq(to = 2020, by = 0.25, length.out = 8)) # 8 Quarters
Values <- seq(to = 11, by = 0.25, length.out = 8)
Data.Annual.Real <- data.frame(X = Annual.Periods, Y = round(Values, 1))
Data.Annual.Whole <- data.frame(X = Annual.Periods, Y = round(Values, 0))
Data.Quarter.Real <- data.frame(X = Quarter.Periods, Y = round(Values, 1))
Data.Quarter.Whole <- data.frame(X = Quarter.Periods, Y = round(Values, 0))
ggplot(data = Data.Annual.Real, aes(X, Y)) + geom_col()
ggplot(data = Data.Annual.Whole, aes(X, Y)) + geom_col()
ggplot(data = Data.Quarter.Real, aes(X, Y)) + geom_col()
ggplot(data = Data.Quarter.Whole, aes(X, Y)) + geom_col() # appears to treat y-values as factors and use stat = 'count' to count occurrences (e.g., 3 occurrences have a rounded Value = 11)
ggplot(data = Data.Quarter.Whole, aes(X, Y)) + geom_line()
rstudioapi::versionInfo()
# $mode
# [1] "desktop"
#
# $version
# [1] ‘1.3.959’
#
# $release_name
# [1] "Middlemist Red"
sessionInfo()
# R version 4.0.0 (2020-04-24)
# Platform: x86_64-apple-darwin17.0 (64-bit)
# Running under: macOS Mojave 10.14.6
#
# Matrix products: default
# BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] ggplot2_3.3.1 zoo_1.8-8
ggplot2 tries to guess the orientation of geom_col(), that is, which variable serves as the base of the bars and which supplies the values to represent. Apparently, without any decimal values in your Y variable, it chooses Y as the base (it stays numeric, though; there is no conversion to factor) and sums up your quarters.
For cases like this you can tell geom_col() which variable to use as the base of the bars via the orientation argument:
ggplot(data = Data.Quarter.Whole, aes(X, Y)) + geom_col(orientation = "x")
EDIT: I have just seen that Roman answered it in the comments.
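As a quick check of the orientation fix without needing zoo, here is a rough sketch that uses plain numeric quarters as a stand-in for the yearqtr values (that substitution, and the names d, built and bar_tops, are assumptions for illustration). It needs ggplot2 3.3.0 or later, where the orientation argument is available, and uses layer_data() to confirm that every bar tops out at its own Y value.

```r
library(ggplot2)

# Hypothetical stand-in for the quarterly data: numeric quarters
# instead of zoo::yearqtr, with Y rounded to whole numbers.
d <- data.frame(
  X = seq(2018.25, 2020, by = 0.25),
  Y = round(seq(9.25, 11, by = 0.25), 0)
)

# orientation = "x" states explicitly that the bars stand on the x axis,
# so ggplot2 does not have to guess.
p <- ggplot(d, aes(X, Y)) + geom_col(orientation = "x")

# Inspect the built layer: one bar per row, each as tall as its Y value
built    <- layer_data(p, 1)
bar_tops <- built$ymax
```

If the bar tops match Y one-to-one, the values are being drawn as given rather than aggregated along the other axis.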

How to colour xlabs with the corresponding colour of its jitter in a geom_jitter?

I am trying to colour the xlabs with the same colour as the point they are labelling, but I am having some trouble.
Each jitter is coloured depending on a specified variable levels, and I want the same for the xlabs.
This is my code to plot the figure:
ggplot(coverage_data, aes(x=x_values, y=coverage_data$mean, fill=coverage_data$frecuency))+
geom_jitter(size=2.5, shape=21, stroke=1.5)+
scale_fill_manual(name = "frecuency", values =c("deepskyblue4", "gray67", "darkgoldenrod2", "springgreen4", "brown1", "white"))+
xlab("Id")+
ylab("max coverage")+
theme(axis.text.x=element_text(hjust=1, colour = 'black', size = 9))
If I declare colour (in theme(axis.text.x = element_text())) as a vector I get an error. Do you know how I can achieve that?
Passing a vector of colors generates a warning, but with ggplot2 3.3.0 (what I'm running) it does work.
Since you didn't share any data I've made some up:
frecuency <- rep(c("A", "B", "C", "D", "E", "F"), 10)
mean <- runif(60, 10, 20)
x_values <- runif(60, 1, 100)
coverage_data <- data.frame(frecuency, mean, x_values, stringsAsFactors = FALSE)
ggplot(coverage_data, aes(x= x_values, y= mean, fill= frecuency))+
geom_jitter(size=2.5, shape=21, stroke=1.5)+
scale_fill_manual(name = "frecuency", values =c("deepskyblue4", "gray67", "darkgoldenrod2", "springgreen4", "brown1", "white"))+
xlab("Id")+
ylab("max coverage")+
theme(axis.text.x=element_text(hjust=1, colour = c("black", "blue", "green", "yellow", "red"), size = 9))
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2.
sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
other attached packages:
[1] ggQC_0.0.31 readxl_1.3.1 forcats_0.5.0
[4] stringr_1.4.0 dplyr_0.8.3 purrr_0.3.3
[7] readr_1.3.1 tidyr_1.0.2 tibble_2.1.3
[10] ggplot2_3.3.0 tidyverse_1.3.0

Bar chart in plotly *flies* when deselecting variables

I'm facing some issues with ggplot2 and plotly. When I create a bar chart with ggplot2 and pass it to ggplotly(), the bars hang in mid-air when I deselect variables. The graph does not behave like the examples here.
Example:
library(ggplot2)
library(reshape2)
library(plotly)
df1 <- data.frame("Price" = rnorm(3, mean = 100, sd = 4),
"Type" = paste("Type", 1:3))
df2 <- data.frame("Price" = rnorm(3, mean = 500, sd = 4),
"Type" = paste("Type", 1:3))
df <- rbind(df1, df2)
df$Dates <- rep(c("2017-01-01", "2017-06-30"), 3)
df <- melt(df, measure.vars = 3)
p <- ggplot(df, aes(fill=Type, y=Price, x=value)) +
geom_bar(stat="identity", position = "stack")
ggplotly(p)
I'm running the following:
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252 LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
[5] LC_TIME=Swedish_Sweden.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zoo_1.8-0 dygraphs_1.1.1.4 plotly_4.7.0.9000 reshape2_1.4.2 ggplot2_2.2.1.9000 lubridate_1.6.0 readxl_1.0.0
Thanks!
I think the problem lies in the interaction between ggplot2 and plotly.
Use the plot_ly function directly:
p <- plot_ly(df, x = ~value, y = ~Price, type = 'bar',split=~Type) %>%
layout(yaxis = list(title = 'Price'), barmode = 'stack')
p
