In creating a trend line for a scatter plot, I am using add_trace to add a linear trend line.
When the data only has one "series" of data, i.e. there is only one group of coordinates, the code below works fine. However, when I introduce a number of series, the "trend line" looks like this:
Here is the relevant part of the code:
p <- plot_ly(filteredFull(), x=Relative.Time.Progress, y=cumul.ans.keystroke,
mode='markers', color=KeystrokeRate, size=KeystrokeRate,
marker=list(sizeref=100), type='scatter',
hoverinfo='text', text=paste("token: ",Token, "Keystrokes: ",
KeystrokeCount)) %>%
layout(
xaxis=list(range=c(0,1)),
yaxis=list(range=c(0,max(filteredFull()$cumul.ans.keystroke)))
)
lm.all <- lm(cumul.ans.keystroke ~ Relative.Time.Progress,
data=df)
observe(print(summary(lm.all)))
p <- add_trace(p, y=fitted(lm.all), x=Relative.Time.Progress,
mode='lines') %>%
layout(
xaxis= list(range = c(0,1))
)
p
I can add more code, or try to make a minimal working example, if necessary. However, I'm hoping that this is a familiar problem that is obvious from the code.
I think you'll need to specify the data = ... argument in add_trace(p, y=fitted(lm.all), x=Relative.Time.Progress, mode='lines').
The first trace seems to be a subset but the second trace uses the regression fitted values which are obtained by fitting a regression model to the entire dataset.
There might be a mismatch between Relative.Time.Progress in filteredFull() vs df.
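Here's an example; hopefully it helps. The first plot mixes the subset with fitted values from the full data, the second passes the full data to add_trace: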
library(plotly)
df <- diamonds[sample(1:nrow(diamonds), size = 500),]
fit <- lm(price ~ carat, data = df)
df1 <- df %>% filter(cut == "Ideal")
plot_ly(df1, x = carat, y = price, mode = "markers") %>%
add_trace(x = carat, y = fitted(fit), mode = "lines")
plot_ly(df1, x = carat, y = price, mode = "markers") %>%
add_trace(data = df, x = carat, y = fitted(fit), mode = "lines")
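The length mismatch described above can be checked without plotting anything. This is only a sketch on made-up data: `full`, `shown` and `line_y` are hypothetical names, and `shown` stands in for whatever filtered subset is actually plotted. fitted() returns one value per row used in the fit, while predict() with newdata returns one value per row of the subset.

```r
# Made-up data: `full` plays the role of the whole data set,
# `shown` the role of the filtered subset that is actually plotted.
set.seed(1)
full <- data.frame(x = runif(200))
full$y <- 3 * full$x + rnorm(200, sd = 0.2)
shown <- full[full$x > 0.5, ]

fit <- lm(y ~ x, data = full)

# fitted() has one value per row of `full`, so it cannot be paired
# with the x values of `shown`
n_fitted <- length(fitted(fit))

# predict() evaluates the same fitted line on the rows that are plotted
line_y <- predict(fit, newdata = shown)
n_line <- length(line_y)
```

With this, the x values of `shown` and `line_y` have the same length, so they can be passed together to the line trace.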
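The plotly API has changed a bit since the answer above was written: columns are now referenced with the formula syntax (e.g. ~carat). The following code should work fine:
library(plotly)
df <- diamonds[sample(1:nrow(diamonds), size = 500),]
fit <- lm(price ~ carat, data = df)
df1 <- df %>% filter(cut == "Ideal")
plot_ly() %>%
add_trace(data = df1, x = ~carat, y = ~price, type = "scatter", mode = "markers") %>%
add_trace(data = df, x = ~carat, y = fitted(fit), type = "scatter", mode = "lines")
You need to start with an empty plot_ly() and add the traces to it, each with its own data.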
How can I make the plot start with only some of the legend entries selected, instead of all of them, when it is first generated? Like the example below.
Here is my code:
p1 <-
iris%>%
group_by(Species)%>%
plot_ly(x=~Sepal.Length, color= ~Species, legendgroup=~Species)%>%
add_markers(y= ~Sepal.Width)
Thanks.
By adding each trace (Species) separately to the plot, you can manually set the visibility of each trace.
# divide dataset by group
iris_sub <- iris %>%
group_split(Species)
# create base for plot
p <- plotly::plot_ly(type="scatter",
mode="markers")
# add a trace for each group
for (f in iris_sub) {
  # define visible group (just show virginica); "legendonly" keeps the
  # trace in the legend but hides it until it is clicked
  vis <- if (as.character(f$Species[1]) == "virginica") TRUE else "legendonly"
  # add trace to plot
  p <- p %>%
    add_trace(data = f,
              x = ~Sepal.Length,
              y = ~Sepal.Width,
              color = ~Species,
              visible = vis)
}
p
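If adding the traces one at a time is not needed for anything else, a shorter route might be to let the color mapping create the traces and then change their visibility afterwards. This is a sketch, not tested against every plotly version: it assumes plotly 4.x, where style() accepts a traces argument, and it assumes the traces come out in factor-level order (setosa, versicolor, virginica). `p_hidden` is just an illustrative name.

```r
library(plotly)

# The color mapping creates one trace per species
p <- plot_ly(iris, x = ~Sepal.Length, y = ~Sepal.Width, color = ~Species,
             type = "scatter", mode = "markers")

# Assumption: traces 1 and 2 are setosa and versicolor (factor-level order).
# "legendonly" keeps them in the legend but hides them until clicked.
p_hidden <- style(p, visible = "legendonly", traces = c(1, 2))

# Inspect the built traces to confirm which ones were changed
built <- plotly_build(p_hidden)
trace_names <- sapply(built$x$data, function(tr) tr$name)
```

Printing `p_hidden` should then show only virginica, with the other two species greyed out in the legend.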
I am trying to specify the point symbol (shape) based on a factor, so that the point shape within the boxplot can be different (which can be very useful for highlighting a group of points). However, it looks like instead of showing different shapes, the third boxplot got split into two boxes.
Can you please advise how to achieve that?
data(iris)
iris=mutate(iris, Petal.Width_high=ifelse(Petal.Width>2,"High","Low"))
iris %>% plot_ly(x = ~ Species, y = ~ Petal.Width, color= ~ Species,
symbol = ~ Petal.Width_high,
type = "box", mode="markers",boxpoints="all",
jitter = 0.4, marker = list(size = 10),
pointpos = 0,hoverinfo='text',
text= ~paste('</br>Species: ', Species,
'</br>Petal.Width: ', Petal.Width))
Do one plot first and then add_markers afterwards. Something like:
p <- iris %>%
group_by(Species) %>%
plot_ly(x = ~ Species, y = ~ Petal.Width,
type = "box",
hoverinfo='text',
text= ~paste('</br>Species: ', Species,
'</br>Petal.Width: ', Petal.Width))
add_markers(p, symbol = ~ Petal.Width_high, marker = list(size = 10))
How do I add multiple regression lines to the same plot in plotly?
I want to plot the scatter plot as well as a regression line for each CATEGORY.
The scatter plot is fine, but the regression lines are not drawn correctly (compared with the Excel output, see below).
df <- as.data.frame(1:19)
df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
df[,1] <- NULL
fv <- df %>%
filter(!is.na(x)) %>%
lm(x ~ y + y*CATEGORY,.) %>%
fitted.values()
p <- plot_ly(data = df,
x = ~x,
y = ~y,
color = ~CATEGORY,
type = "scatter",
mode = "markers"
) %>%
add_trace(x = ~y, y = ~fv, mode = "lines")
p
Apologies for not adding in all the information beforehand, and thanks for adding the suggestion of "y*CATEGORY" to fix the parallel line issue.
Excel Output
https://i.imgur.com/2QMacSC.png
R Output
https://i.imgur.com/LNypvDn.png
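Try this. The model should regress y on x (with the CATEGORY interaction, so each category gets its own slope and intercept), the fitted values should be stored as a column of df, and the line trace should use x, not y, on the x axis: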
library(plotly)
df <- as.data.frame(1:19)
df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
df[,1] <- NULL
df$fv <- df %>%
filter(!is.na(x)) %>%
lm(y ~ x*CATEGORY,.) %>%
fitted.values()
p <- plot_ly(data = df,
x = ~x,
y = ~y,
color = ~CATEGORY,
type = "scatter",
mode = "markers"
) %>%
add_trace(x = ~x, y = ~fv, mode = "lines")
p
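As a sanity check on why the x*CATEGORY interaction gives one independent line per category, the fitted values of the joint model can be compared with those of a separate lm() per category. The sketch below uses the data from the question and base R only; `fv_joint` and `fv_split` are names made up for this illustration.

```r
df <- data.frame(
  CATEGORY = c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B"),
  x = c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196),
  y = c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
)

# One model with a separate intercept and slope for every category
fv_joint <- unname(fitted(lm(y ~ x * CATEGORY, data = df)))

# The same thing done by hand: one simple regression per category
fv_split <- numeric(nrow(df))
for (g in unique(df$CATEGORY)) {
  rows <- df$CATEGORY == g
  fv_split[rows] <- fitted(lm(y ~ x, data = df[rows, ]))
}

# TRUE if both approaches give the same fitted values (up to rounding)
same_lines <- isTRUE(all.equal(fv_joint, fv_split))
```

Since both approaches agree, the lines drawn from the interaction model should correspond to per-category trend lines of the kind a spreadsheet would draw.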
I'm trying to order a stacked bar chart in plotly, but it is not respecting the order I pass it in the data frame.
It is best shown using some mock data:
library(dplyr)
library(plotly)
cars <- sapply(strsplit(rownames(mtcars), split = " "), "[", i = 1)
dat <- mtcars
dat <- cbind(dat, cars, stringsAsFactors = FALSE)
dat <- dat %>%
mutate(carb = factor(carb)) %>%
distinct(cars, carb) %>%
select(cars, carb, mpg) %>%
arrange(carb, desc(mpg))
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = cars) %>%
layout(barmode = "stack")
The resulting plot doesn't respect the ordering. I want the cars with the largest mpg stacked at the bottom of each carb group. Any ideas?
As already pointed out here, the issue is caused by having duplicate values in the column used for color grouping (in this example, cars). As indicated already, the ordering of the bars can be remedied by grouping your colors by a column of unique names. However, doing so will have a couple of undesired side-effects:
- different model cars from the same manufacturer would be shown with different colors (not what you are after: you want to color by manufacturer);
- the legend would have more entries in it than you want, i.e. one per model of car rather than one per manufacturer.
We can hack our way around this by a) creating the legend from a dummy trace that never gets displayed (add_trace(type = "bar", x = 0, y = 0... in the code below), and b) setting the colors for each category manually using the colors= argument. I use a rainbow palette below to show the principle. You may like to select some more attractive colours yourself.
dat$unique.car <- make.unique(as.character(dat$cars))
dat2 <- data.frame(cars=levels(as.factor(dat$cars)),color=rainbow(nlevels(as.factor(dat$cars))))
dat2[] <- lapply(dat2, as.character)
dat$color <- dat2$color[match(dat$cars,dat2$cars)]
plot_ly() %>%
add_trace(data=dat2, type = "bar", x = 0, y = 0, color = cars, colors=color, showlegend=T) %>%
add_trace(data=dat, type = "bar", x = carb, y = mpg, color = unique.car, colors=color, showlegend=F, marker=list(line=list(color="black", width=1))) %>%
layout(barmode = "stack", xaxis = list(range=c(0.4,8.5)))
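The two helper steps in this workaround (unique bar names, one color per manufacturer) can be tried on a tiny vector before touching the plot. This is only an illustration in base R; the `cars` vector and the names `unique_car`, `palette_by_make` and `bar_color` are invented for the sketch.

```r
# Toy stand-in for the manufacturer column, with duplicates
cars <- c("Merc", "Merc", "Mazda", "Merc", "Mazda")

# make.unique() appends .1, .2, ... to repeated values, so every bar
# gets its own name (and therefore its own trace)
unique_car <- make.unique(cars)

# One color per manufacturer, looked up for every bar by name
makes <- sort(unique(cars))
palette_by_make <- setNames(rainbow(length(makes)), makes)
bar_color <- unname(palette_by_make[cars])
```

Every bar now has a distinct name, while bars from the same manufacturer still share a color, which is the combination the dummy-legend trick relies on.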
One way to address this is to give unique names to all models of car and use that in plotly, but it's going to make the legend messier and impact the color mapping. Here are a few options:
dat$carsID <- make.unique(as.character(dat$cars))
# dat$carsID <- apply(dat, 1, paste0, collapse = " ") # alternative
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID) %>%
layout(barmode = "stack")
plot_ly(dat) %>%
add_trace(data = dat, type = "bar", x = carb, y = mpg, color = carsID,
colors = rainbow(length(unique(carsID)))) %>%
layout(barmode = "stack")
I'll look more tomorrow to see if I can improve the legend and color mapping.
The two separate charts below work correctly when created from a data.frame using the R plotly package. However, I am not sure how to combine them into one (presumably with the add_trace function).
df <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2),value=c(1:4))
plot_ly(df, x = game, y = value, mode = "markers", color = season)
plot_ly(subset(df,season=="2001"), x = game, y = value, mode = "line")
Thanks in advance
The answer given by @LukeSingham does not work anymore with plotly 4.5.2.
You have to start with an "empty" plot_ly() and then to add the traces:
df1 <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2), value=c(1:4))
df2 <- subset(df1, season=="2001")
plot_ly() %>%
add_trace(data=df1, x = ~game, y = ~value, type="scatter", mode="markers") %>%
add_trace(data=df2, x = ~game, y = ~value, type="scatter", mode = "lines")
Here is a way to do what you want, but with ggplot2 :-) You can change the background, line, and point colors as you like.
library(ggplot2)
library(plotly)
df_s <- df[c(3:4), ]
p <- ggplot(data=df, aes(x = game, y = value, color = season)) +
geom_point(size = 4) +
geom_line(data=df_s, aes(x = game, y = value, color = season))
(gg <- ggplotly(p))
There are two main ways you can do this with plotly: make a ggplot and convert it to a plotly object, as @MLavoie suggests, or, as you suspected, use add_trace on an existing plotly object (see below).
library(plotly)
#data
df <- data.frame(season=c("2000","2000","2001","2001"), game=c(1,2,1,2),value=c(1:4))
#Initial scatter plot
p <- plot_ly(df, x = game, y = value, mode = "markers", color = season)
#subset of data
df1 <- subset(df,season=="2001")
#add line
p %>% add_trace(x = df1$game, y = df1$value, mode = "lines")