Grouping not respected when using ggplotly to group boxplots - r

I was trying the following code in order to get a graph of boxplots with ggplot2 which are grouped according to different categories:
library(ggplot2)
library(plotly)
category_1 <- rep(LETTERS[1:4], each = 20)
value <- rnorm(length(category_1), mean = 200, sd = 20)
category_2 <- rep(as.factor(c("Good", "Medium", "Bad")), length.out = length(category_1))
category_3 <- rep(as.factor(c("Bright", "Dark")), length.out = length(category_1))
df <- data.frame( category_1, value, category_2, category_3)
p <- ggplot(df, aes(x = category_1, y = value, color = category_2, shape = category_3)) +
geom_boxplot(alpha = 0.5) +
geom_point(position=position_jitterdodge(), alpha=0.7)
p
I'm still too new on Stack Overflow to post images, but this is the result I want.
However, when I try to convert it to plotly using
pp <- ggplotly(p)
pp
the last 2 grouping layers (shape and color) are "ignored" and all the boxplots are plotted on top of each other, only respecting the x-axis grouping specified in aes(x = category_1, ...) as you can see here.
How can I avoid this problem? Thanks for your time.
EDIT
I've tried using plotly syntax directly and I get a similar result using the following code:
pp <- plot_ly(df, x = ~category_1, y = ~value, color = ~category_2,
mode = "markers", symbol = ~category_3, type = "box", boxpoints = "all") %>%
layout(boxmode = "group")
pp
Here is the result. I say "similar" because plotly forces the dots to sit next to the boxplot rather than on top of it, which is not exactly what I wanted.
I guess the question is "solved", although I'm still curious whether there is an explanation for the problem above. Thanks again!

I think this will solve your issue.
p <- ggplot(df, aes(x = category_1, y = value, color = category_2, shape = category_3)) +
geom_boxplot(alpha = 0.5) +
geom_point(position=position_jitterdodge(), alpha=0.7)
p %>%
ggplotly() %>%
layout(boxmode = "group")
Cheers.
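As for why the boxes overlap in the first place, a likely explanation is that ggplotly() emits one box trace per colour group at the same x position, and plotly.js defaults boxmode to "overlay", so the dodging that ggplot2 applies is not carried over unless the layout asks for grouping. The sketch below rebuilds a smaller version of the data (the fixed seed and the reduced set of aesthetics are assumptions made for illustration) and inspects the built widget to check that the traces are boxes and that layout() has set boxmode.

```r
library(ggplot2)
library(plotly)

set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
df <- data.frame(
  category_1 = rep(LETTERS[1:4], each = 20),
  value      = rnorm(80, mean = 200, sd = 20),
  category_2 = rep(c("Good", "Medium", "Bad"), length.out = 80)
)

p <- ggplot(df, aes(x = category_1, y = value, color = category_2)) +
  geom_boxplot(alpha = 0.5)

# Ask plotly to place boxes that share an x position side by side
pp <- layout(ggplotly(p), boxmode = "group")

# plotly_build() resolves the widget into its final traces and layout
built <- plotly_build(pp)
trace_types <- unlist(lapply(built$x$data, function(tr) tr$type))
box_mode <- built$x$layout$boxmode
```

Printing pp renders the grouped boxplots; trace_types and box_mode are only there to show what the widget contains after the layout call.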

Related

Show other data points when using ggiraph in R?

I am using ggiraph to make an interactive plot in R. My data is grouped and what I'm hoping to do is plot just the mean value of the group but when I hover over that point in the plot, the other points appear. Hopefully, my example below will explain what I mean.
To begin I create some data and make a basic plot:
library(ggplot2)
library(ggiraph)
# create some data
dat1 <- data.frame(X=rnorm(21),
Y=rnorm(21),
groupID=rep(1,21))
dat2 <- data.frame(X=rnorm(21,5),
Y=rnorm(21,5),
groupID=rep(2,21))
dat3 <- data.frame(X=rnorm(21,10),
Y=rnorm(21,10),
groupID=rep(3,21))
ggdat <- rbind(dat1,dat2,dat3)
ggdat$groupID <- as.factor(ggdat$groupID)
# create a plot
ggplot(ggdat, aes(X,Y)) +
geom_point(aes(color = groupID)) +
theme(legend.position = 'none')
We can see the 3 different groups in the above plot.
Then I find the mean value of each group and plot that. In the example plot below, I also plot all the points with a low alpha value and the mean points in black.
library(dplyr)
# create mean data frame
dfMean <- ggdat %>%
group_by(groupID) %>%
dplyr::summarize(mX = mean(X), mY = mean(Y))
gg_scatter <- ggplot(dfMean, aes(mX, mY, tooltip = groupID, data_id = groupID)) +
geom_point(data = ggdat, aes(X,Y), alpha = 0.1, color = ggdat$groupID) +
theme(legend.position = 'none') +
geom_point_interactive()
gg_scatter
What I'm hoping to do is when I hover over one of the black points, it changes the alpha value for that group to, say, alpha = 1 and shows all the points for that group.
Naively I just tried:
girafe(ggobj = gg_scatter,
options = list(
opts_hover_inv(css = "opacity:0.5;"),
opts_hover(css = "fill:red;")
) )
but this just highlights the mean point that I'm hovering over and changes the alpha of the other mean points.
Is there a way to hover over the mean value point, which changes the alpha for that particular group?
I am not sure whether this answers your question, but I hope it helps:
In your code, you did not use geom_point_interactive() when plotting the first set of points, so they cannot be interactive.
library(ggplot2)
library(ggiraph)
# create some data
dat1 <- data.frame(X=rnorm(21),
Y=rnorm(21),
groupID=rep(1,21))
dat2 <- data.frame(X=rnorm(21,5),
Y=rnorm(21,5),
groupID=rep(2,21))
dat3 <- data.frame(X=rnorm(21,10),
Y=rnorm(21,10),
groupID=rep(3,21))
ggdat <- rbind(dat1,dat2,dat3)
ggdat$groupID <- as.factor(ggdat$groupID)
library(dplyr)
# create mean data frame
dfMean <- ggdat %>%
group_by(groupID) %>%
dplyr::summarize(mX = mean(X), mY = mean(Y))
gg_scatter <- ggplot(dfMean, aes(mX, mY, tooltip = groupID, data_id = groupID)) +
geom_point_interactive(data = ggdat, aes(X,Y, color = groupID), alpha = 0.9) +
theme(legend.position = 'none') +
geom_point_interactive()
gg_scatter
girafe(ggobj = gg_scatter,
options = list(
opts_hover_inv(css = "opacity:0.1;"),
opts_hover(css = "fill:red;")
) )

Heatmap in plotly with defined colors per category in r

I am trying to plot a heatmap with specified colors (by category) in plotly. I asked a similar question here: "Split" up by category in plotly.
However, I ran into a new problem while trying a similar thing with a heatmap. My code looks like:
library(plotly)
library(dplyr)
# Test DataFrame
test_df <- data.frame(
"weekday" = c("Fr", "Sa", "Su"),
"time" = c("06:00:00", "12:00:00", "18:00:00"),
"channel" = c("NBC", "CBS Drama", "ABC"),
"colors" = c("#FCB711", "#162B48", "#AA8002"),
"views" = c(1200, 1000, 1250)
)
plot_ly(colors = unique(as.character(test_df$colors)), type = "heatmap") %>%
add_trace(test_df,
x = test_df$weekday,
y = test_df$time,
z = test_df$views,
type = "heatmap")
What I get is the following picture:
The problems I have here are:
1. The colors are not the colors which I told R to use
2. I do not want a continuous colorscale, but rather one color per category (channel).
I know there is a workaround in ggplot, and I am working on it, but I want to have it in plotly.
Here is what it looks like in ggplot and what I want to have in plotly (I am aware of ggplotly, but that still isn't pure plotly):
Here is the code for the above picture:
channel_colors <- test_df %>% distinct(colors) %>% pull(colors)
names(channel_colors) <- test_df %>% distinct(channel) %>% pull(channel)
p <- ggplot(data = test_df,
aes(
x = weekday,
y = time,
fill = channel)) +
geom_tile(aes(alpha = views)) +
scale_alpha(range = c(0.5, 1)) +
theme_minimal() +
scale_fill_manual(values = channel_colors)
ggplotly(p)
I would appreciate any help.

Create a filled area line plot with plotly

I want to create a basic filled area plot with plotly using this dataset:
week<-c(2,1,3)
pts<-c(10,20,30)
wex<-data.frame(week,pts)
The x-axis should contain the week which as you can see may not be in order in the dataset but MUST be in order in the x-axis of the plot. The y axis should contain the pts.
For some reason I get nothing as a result, although no error is raised.
library(plotly)
week<-c(2,1,3)
pts<-c(10,20,30)
wex<-data.frame(week,pts)
wex$week <- factor(wex$week, levels = wex[["week"]])
p <- plot_ly(x = ~wex$week, y = ~wex$pts,
type = 'scatter', mode = 'lines',fill = 'tozeroy')
p
Values on x axis need to be numeric (see example) and ordered.
wex <- wex[order(wex$week), ]
# wex$week <- factor(wex$week, levels = wex[["week"]])
plot_ly(x = ~wex$week, y = ~wex$pts, type = 'scatter', mode = 'lines',
fill = 'tozeroy')
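As a minimal sketch of the same idea, the data frame can be sorted once and then handed to plot_ly() directly, so that the formulas refer to column names instead of wex$... expressions. The name wex_sorted is only illustrative.

```r
library(plotly)

wex <- data.frame(week = c(2, 1, 3), pts = c(10, 20, 30))

# Sort rows by week so the line is drawn from left to right;
# week stays numeric instead of being converted to a factor
wex_sorted <- wex[order(wex$week), ]

# Filled area down to y = 0; print p to render the chart
p <- plot_ly(wex_sorted, x = ~week, y = ~pts,
             type = "scatter", mode = "lines", fill = "tozeroy")
```

Sorting a copy leaves the original wex untouched, which may matter if the row order is used elsewhere.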
This will work if you want to keep the x-axis, but it's not a pure plot_ly answer:
p <- ggplot(wex, aes(x = week, y = pts)) +
geom_point() +
geom_line() +
geom_ribbon(aes(ymin = 0, ymax = pts), fill = "blue", alpha = .6, group = 1)
g <- ggplotly(p)
g

How to plot bar plot and error bar with x,y data in R

I have data in the following format.
X ID Mean Mean+Error Mean-Error
61322107 cg09959428 0.39158198 0.39733463 0.38582934
61322255 cg17147820 0.30742542 0.31572314 0.29912770
61322742 cg08922201 0.47443355 0.47973039 0.46913671
61322922 cg08360511 0.06614797 0.06750279 0.06479315
61323029 cg00998427 0.05625839 0.05779519 0.05472160
61323113 cg15492820 0.10606674 0.10830587 0.10382761
61323284 cg02950427 0.36187007 0.36727818 0.35646196
61323413 cg01996653 0.35582920 0.36276991 0.34888849
61323667 cg14161454 0.77930230 0.78821970 0.77038491
61324205 cg25149253 0.93585347 0.93948514 0.93222180
How can I plot a bar chart with error bars, where the x-axis carries the X value? Each bar should be plotted at its X position with a fixed width.
I'll try answering. I am using a package called plotly. You can look here for more details.
df <- read.csv('test.csv')
colnames(df) <- c("x", "id", "mean", "mean+error", "mean-error")
df$`mean+error` = df$`mean+error` - df$mean
df$`mean-error` = df$mean - df$`mean-error`
library(plotly)
p <- ggplot(df, aes(factor(x), y = mean)) + geom_bar(stat = "identity")
p <- plotly_build(p)
length(p$data)
p$layout$xaxis
# plotly >= 4 expects a formula (~mean) to refer to a column of df
plot_ly(df, x = 1:10, y = ~mean, type = "bar",
error_y = list(symmetric = FALSE,
array = df$`mean+error`,
arrayminus = df$`mean-error`,
type = "data")) %>%
layout(xaxis = list(tickmode = "array", tickvals = 1:10, ticktext = df$x))
I get this:
The most popular approach would probably be using geom_errorbar() in ggplot2.
library("ggplot2")
ggplot(df, aes(x=ID, y = Mean)) +
geom_bar(stat="identity", fill="light blue") +
geom_errorbar(aes(ymin = Mean.Error.1, ymax = Mean.Error))
where Mean.Error and Mean.Error.1 are the column names R assigns to Mean+Error and Mean-Error, respectively, when you read in your example as text.
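To make the column renaming concrete, here is a small sketch that reads the first three rows of the example with read.table() and plots the bars at their numeric X positions, as the question asks. Mean.Error ends up holding the upper bound and Mean.Error.1 the lower bound. The bar and whisker widths are assumptions chosen by eye for this spacing of X values, not values from the question.

```r
library(ggplot2)

# First three rows of the example data, read from text
df <- read.table(header = TRUE, text = "
X ID Mean Mean+Error Mean-Error
61322107 cg09959428 0.39158198 0.39733463 0.38582934
61322255 cg17147820 0.30742542 0.31572314 0.29912770
61322742 cg08922201 0.47443355 0.47973039 0.46913671
")

# check.names = TRUE (the default) turns the two error headers into
# syntactic, unique names: "Mean.Error" and "Mean.Error.1"
names(df)

# Assumption: widths are expressed in the units of X
p <- ggplot(df, aes(x = X, y = Mean)) +
  geom_col(width = 50, fill = "lightblue") +
  geom_errorbar(aes(ymin = Mean.Error.1, ymax = Mean.Error), width = 30)
```

Using x = X keeps the axis continuous, so the gaps between bars reflect the real distances between positions; factor(X) would space the bars evenly instead.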

How can I combine a line and scatter on same plotly chart?

The two separate charts below, created from a data.frame, work correctly with the R plotly package. However, I am not sure how to combine them into one (presumably with the add_trace function).
df <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2),value=c(1:4))
plot_ly(df, x = game, y = value, mode = "markers", color = season)
plot_ly(subset(df,season=="2001"), x = game, y = value, mode = "line")
Thanks in advance
The answer given by @LukeSingham does not work anymore with plotly 4.5.2.
You have to start with an "empty" plot_ly() and then to add the traces:
df1 <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2), value=c(1:4))
df2 <- subset(df1, season=="2001")
plot_ly() %>%
add_trace(data=df1, x = ~game, y = ~value, type="scatter", mode="markers") %>%
add_trace(data=df2, x = ~game, y = ~value, type="scatter", mode = "lines")
Here is a way to do what you want, but with ggplot2 :-) You can change the background, line, and point colors as you like.
library(ggplot2)
library(plotly)
df_s <- df[c(3:4), ]
p <- ggplot(data=df, aes(x = game, y = value, color = season)) +
geom_point(size = 4) +
geom_line(data=df_s, aes(x = game, y = value, color = season))
(gg <- ggplotly(p))
There are two main ways you can do this with plotly: make a ggplot and convert it to a plotly object, as @MLavoie suggests, or, as you suspected, use add_trace on an existing plotly object (see below).
library(plotly)
#data
df <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2),value=c(1:4))
#Initial scatter plot
p <- plot_ly(df, x = game, y = value, mode = "markers", color = season)
#subset of data
df1 <- subset(df,season=="2001")
#add line
p %>% add_trace(x = df1$game, y = df1$value, mode = "line")
