Plotly: How to add a median line on a box plot - r

I would like to add a trace showing a median line to my box plot.
Here are my plots so far:
library(plotly)
p <- plot_ly(y = ~rnorm(50), type = "box") %>%
add_trace(y = ~rnorm(50, 1))
p

Start with a scatter plot using plot_ly(..., type='scatter', mode='lines', ...), then add one add_boxplot(..., inherit=FALSE, ...) per box plot. Here's how to do it for an entire data.frame:
Complete code with sample data:
library(dplyr)
library(plotly)
# data
df <- data.frame(iris) %>% select(-c('Species'))
medians <- apply(df,2,median)
# create common x-axis values for median line and boxplots
xVals <- seq(0, length(medians)-1, by=1)
# plotly median line setup
p <- plot_ly(x = xVals, y=medians, type='scatter', mode='lines', name='medians')
# add a trace per box plot
for (col in names(df)) {
  p <- p %>% add_boxplot(y = df[[col]], inherit = FALSE, name = col)
}
# manage layout
p <- p %>% layout(xaxis = list(range = c(min(xVals)-1, max(xVals)+1)))
p

Another option is to use ggplot2 and convert it into plotly
library(ggplot2)
library(dplyr)
library(tidyr)
library(plotly)
p <- iris %>% pivot_longer(-Species) %>%
  ggplot(aes(x = name, y = value, col = name)) +
  geom_boxplot() +
  # fun replaces the deprecated fun.y argument (ggplot2 >= 3.3.0)
  stat_summary(inherit.aes = FALSE,
               aes(x = name, y = value, group = 1),
               fun = median, geom = "line")
ggplotly(p)
A brief explanation of the code: pivot_longer from tidyr casts the data frame into long format, and the box plot uses the column names as both the x variable and the colour.
In the stat_summary part, I specify the same x and y variables again but omit the colour and add group=1. This tells stat_summary to treat the whole data frame as one group, compute the median of the y values for each x value, and draw a line through the results.
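To see which values the summary line connects, the per-column medians can be computed directly. This is a minimal base R sketch, not part of the original answer: it assumes stack() as a stand-in for pivot_longer, so the columns are called values and ind rather than value and name.

```r
# Reshape the four numeric iris columns to long format with base R.
# stack() returns `values` (the measurements) and `ind` (the source column).
long <- stack(iris[, 1:4])

# One median per x-group: these are the y values the summary line joins.
med <- aggregate(values ~ ind, data = long, FUN = median)
print(med)
```

Note that ggplot orders a character x variable alphabetically, so the line in the plot visits the groups in a different order than the rows printed here.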

Related

using dygraph library in r to plot dual axis

I am trying to plot two series with different scales on the same plot using the dygraphs library in R.
dygraph(data.frame(x = 1:10, y = runif(10),y2=runif(10)*100)) %>%
dyAxis("y", valueRange = c(0, 1.5)) %>%
dyAxis(runif(10)*100,name="y2", valueRange = c(0, 100)) %>%
dyEvent(2, label = "test") %>%
dyAnnotation(5, text = "A")
However, the plot does not fit the data with the larger scale, and I cannot figure out how to align the two axes. I suspect the independentTicks option in the dyAxis() function does the trick, but I cannot find how to use it in the documentation. Please help me out with this.
One way is to assign the series with the larger values to the secondary axis with the dySeries function:
See https://rstudio.github.io/dygraphs/gallery-axis-options.html
library(dygraphs)
df <- data.frame(x = 1:10, y = runif(10), y2 = runif(10) * 100)
# draw y2 against the secondary (right-hand) axis so each series gets its own scale
dygraph(df) %>%
  dySeries("y2", axis = "y2")
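Since the question asks about independentTicks, here is a hedged sketch of how it could be combined with the secondary axis. It is not from the original answer; the value ranges are taken from the question, and set.seed() is added only to make the sample data reproducible.

```r
library(dygraphs)

set.seed(1)  # assumption: fixed seed, only for reproducible sample data
df <- data.frame(x = 1:10, y = runif(10), y2 = runif(10) * 100)

g <- dygraph(df) %>%
  # put the large-valued series on the secondary axis
  dySeries("y2", axis = "y2") %>%
  # give each axis its own range
  dyAxis("y", valueRange = c(0, 1.5)) %>%
  # independentTicks = TRUE lets y2 place its own ticks instead of
  # aligning them with the ticks of the primary axis
  dyAxis("y2", valueRange = c(0, 100), independentTicks = TRUE)
g
```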

Plotting box-plots with for loop in Plotly

I made several plots with these lines of code:
dataset_numeric = dplyr::select_if(dataset, is.numeric)
par(mfrow=c(3,3))
for(i in 1:9) {
boxplot(dataset_numeric[,i], main=names(dataset_numeric)[i])
}
This produces a 3x3 grid of box plots.
I want to do the same with the plotly library. Can anybody help me do that?
The following uses the packages tidyr and ggplot2. First, the data are converted to a long table with pivot_longer and then piped to ggplot. One thing to note when each panel holds a single box: an explicit x aesthetic is needed, otherwise only the first box may be shown.
library("dplyr")
library("plotly")
library("ggplot2")
library("tidyr")
dataset <- as.data.frame(matrix(rnorm(99), ncol=9))
p <- pivot_longer(dataset, cols=everything()) %>%
ggplot(aes(x=0, y = value)) +
geom_boxplot() + facet_wrap( ~ name)
ggplotly(p)
Edit: a first version still had an issue, which was solved by adding x=0.
If you want to use plotly and put all variables in the same graph, you can use add_trace() in a for loop.
library(plotly)
dataset_numeric = dplyr::select_if(iris, is.numeric)
fig <- plot_ly(data = dataset_numeric, type = "box")
for (i in 1:ncol(dataset_numeric)) {
fig <- fig %>% add_trace(y = dataset_numeric[,i])
}
fig
If you want a separate plot for each variable, you can use subplot():
all_plot <- list()
for (i in 1:ncol(dataset_numeric)) {
fig <- plot_ly(data = dataset_numeric, type = "box") %>%
add_trace(y = dataset_numeric[,i])
all_plot <- append(all_plot, list(fig))
}
plt <- subplot(all_plot)
plt
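The loop above can also be written with lapply, which avoids growing a list by hand. This is a minimal sketch under a few assumptions: the numeric columns are selected with base R instead of dplyr, each box is named after its column, and nrows = 2 is an arbitrary layout choice.

```r
library(plotly)

# keep only the numeric columns (base R equivalent of select_if)
dataset_numeric <- iris[sapply(iris, is.numeric)]

# build one box plot per column, named after the column
plots <- lapply(names(dataset_numeric), function(nm) {
  plot_ly(y = dataset_numeric[[nm]], type = "box", name = nm)
})

# arrange the individual plots in a grid
fig <- subplot(plots, nrows = 2)
fig
```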

Plot multiple lists on the same graph in r (scatter plot)

I am trying to plot a scatter plot from the following data:
xAxisName <- c("ML", "MN")
car1 <- c(5,6)
names(car1) <- xAxisName
car2 <- c(5.5,6.2)
names(car2) <- xAxisName
car3 <- c(4.9, 5.4)
names(car3) <- xAxisName
The plot should show the 2 car properties on the x axis, with 3 car values for each property. But these values are in separate named vectors. How can this be plotted?
Get all the 'car' objects into a list, bind them with bind_rows, pivot to 'long' format, and then use ggplot:
library(ggplot2)
library(dplyr)
library(tidyr)
mget(ls(pattern = '^car\\d+$')) %>%
bind_rows(.id = 'car') %>%
pivot_longer(cols = -car) %>%
ggplot(aes(x = name, y = value, color = car)) +
geom_point()+
scale_y_continuous(expand = c(5, 6))
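To make the intermediate long table concrete, here is a base R sketch that builds the same shape of data by hand. It is an illustration only, not the answer's method: the list is written out explicitly instead of using mget(), and the column names car, name and value are chosen to match the aesthetics used in the ggplot call.

```r
xAxisName <- c("ML", "MN")
car1 <- setNames(c(5, 6), xAxisName)
car2 <- setNames(c(5.5, 6.2), xAxisName)
car3 <- setNames(c(4.9, 5.4), xAxisName)

# collect the named vectors in a named list
cars <- list(car1 = car1, car2 = car2, car3 = car3)

# one row per (car, property) pair: the long format that ggplot expects
long <- do.call(rbind, lapply(names(cars), function(id) {
  v <- cars[[id]]
  data.frame(car = id, name = names(v), value = unname(v))
}))
print(long)
```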

How can I create subplots in plotly using R where each subplot is two traces

Here is a toy example I have got stuck on
library(plotly)
library(dplyr)
# construct data.frame
df <- tibble(x=c(3,2,3,5,5,5,2),y=c("a","a","a","b","b","b","b"))
# construct data.frame of last y values
latest <- df %>%
group_by(y) %>%
slice(n())
# plot for one value of y (NB not sure why value for 3 appears?)
p <- plot_ly() %>%
add_histogram(data=subset(df,y=="b"),x= ~x) %>%
add_histogram(data=subset(latest,y=="b"),x= ~x,marker=list(color="red")) %>%
layout(barmode="overlay",showlegend=FALSE,title= ~y)
p
How can I set these up as subplots, one for each unique value of y? In the real-world example I would have 20 different y values, so ideally I would loop over or apply the code. In addition, it would be good to set a standard x scale of, say, c(1:10) and arrange the subplots in, for example, 2 rows.
TIA
Build a list containing each of the plots.
Set the bin sizes manually for the histograms, otherwise the automatic selection will choose different bins for each of the traces within a plot (making it look strange, as in your example, where the bars of each trace have different widths).
Use subplot to put it all together.
Add titles to the individual subplots using a list of annotations.
Like this:
N = nlevels(factor(df$y))
plot_list = vector("list", N)
lab_list = vector("list", N)
for (i in 1:N) {
this_y = levels(factor(df$y))[i]
p <- plot_ly() %>%
  # x = ~x (formula) refers to the x column of the data passed to each trace
  add_trace(type = "histogram", data = subset(df, y == this_y), x = ~x, marker = list(color = "blue"),
            autobinx = FALSE, xbins = list(start = 0.5, end = 6.5, size = 1)) %>%
  add_trace(type = "histogram", data = subset(latest, y == this_y), x = ~x, marker = list(color = "red"),
            autobinx = FALSE, xbins = list(start = 0.5, end = 6.5, size = 1)) %>%
layout(barmode="overlay", showlegend=FALSE)
plot_list[[i]] = p
titlex = 0.5
titley = c(1.05, 0.45)[i]
lab_list[[i]] = list(x=titlex, y=titley, text=this_y,
showarrow=F, xref='paper', yref='paper', font=list(size=18))
}
subplot(plot_list, nrows = 2) %>%
layout(annotations = lab_list)

How can I combine a line and scatter on same plotly chart?

The two separate charts created from a data.frame work correctly when created using the R plotly package.
However, I am not sure how to combine them into one (presumably with the add_trace function).
df <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2),value=c(1:4))
plot_ly(df, x = game, y = value, mode = "markers", color = season)
plot_ly(subset(df,season=="2001"), x = game, y = value, mode = "line")
Thanks in advance
The answer given by @LukeSingham does not work anymore with plotly 4.5.2.
You have to start with an "empty" plot_ly() and then add the traces:
df1 <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2), value=c(1:4))
df2 <- subset(df1, season=="2001")
plot_ly() %>%
add_trace(data=df1, x = ~game, y = ~value, type="scatter", mode="markers") %>%
add_trace(data=df2, x = ~game, y = ~value, type="scatter", mode = "lines")
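The same idea can be sketched with the add_markers() and add_lines() helpers, which set the trace type and mode for you. This is an optional variant rather than the answer's code; colouring the markers by season and the trace name "2001 (line)" are assumptions added for illustration.

```r
library(plotly)

df1 <- data.frame(season = c("2000", "2000", "2001", "2001"),
                  game = c(1, 2, 1, 2), value = 1:4)
df2 <- subset(df1, season == "2001")

p <- plot_ly() %>%
  # scatter points for every row, coloured by season
  add_markers(data = df1, x = ~game, y = ~value, color = ~season) %>%
  # line through the 2001 rows only
  add_lines(data = df2, x = ~game, y = ~value, name = "2001 (line)")
p
```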
Here is a way to do what you want, but with ggplot2 :-) You can change the background, line, and point colours as you like.
library(ggplot2)
library(plotly)
df_s <- df[c(3:4), ]
p <- ggplot(data=df, aes(x = game, y = value, color = season)) +
geom_point(size = 4) +
geom_line(data=df_s, aes(x = game, y = value, color = season))
(gg <- ggplotly(p))
There are two main ways you can do this with plotly: make a ggplot and convert it to a plotly object, as @MLavoie suggests, or, as you suspected, use add_trace on an existing plotly object (see below).
library(plotly)
#data
df <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2),value=c(1:4))
#Initial scatter plot
p <- plot_ly(df, x = game, y = value, mode = "markers", color = season)
#subset of data
df1 <- subset(df,season=="2001")
#add line
p %>% add_trace(x = df1$game, y = df1$value, mode = "line")
