I really like the parallel coordinates plot available in Plotly, but I just ran into an issue I could use help with.
Is it possible to have a log10-based axis for some of the coordinates?
As you can see in the example below, performing a log10 transform makes it easier to distinguish the smaller values. However, by transforming the data we lose the ability to interpret the values directly. I would prefer to log-scale the axis instead of the data, but I couldn't find a way to do this.
I did find something related to "axis styling" in the GitHub issue https://github.com/plotly/plotly.js/issues/1071#issuecomment-264860379, but not a solution to this problem.
I would appreciate any ideas or pointers.
library(plotly)
# Setting up some data that span a wide range.
df <- read.csv("https://raw.githubusercontent.com/bcdunbar/datasets/master/iris.csv")
df$sepal_width[1] = 50
df$sepal_width_log10 = log10(df$sepal_width)
p <- df %>%
plot_ly(type = 'parcoords',
line = list(color = ~species_id,
colorscale = list(c(0,'red'),c(0.5,'green'),c(1,'blue'))),
dimensions = list(
list(range = c(~min(sepal_width),~max(sepal_width)),
label = 'Sepal Width', values = ~sepal_width),
list(range = c(~min(sepal_width_log10),~max(sepal_width_log10)),
tickformat='.2f',
label = 'log10(Sepal Width)', values = ~sepal_width_log10),
list(range = c(4,8),
constraintrange = c(5,6),
label = 'Sepal Length', values = ~sepal_length))
)
p
Since log projection is not supported (yet), creating the tick labels manually seems to be a valid solution.
# Let's create the axis text manually and map the log10 transform
# back to the original scale.
my_tickvals = seq(min(df$sepal_width_log10), max(df$sepal_width_log10), length.out=8)
my_ticktext = signif(10 ^ my_tickvals, digits = 2)
library(plotly)
# Setting up some data that span a wide range.
df <- read.csv("https://raw.githubusercontent.com/bcdunbar/datasets/master/iris.csv")
df$sepal_width[1] = 50
df$sepal_width_log10 = log10(df$sepal_width)
# Let's create the axis text manually and map the log10 transform back to the original scale.
my_tickvals = seq(min(df$sepal_width_log10), max(df$sepal_width_log10), length.out=8)
my_ticktext = signif(10 ^ my_tickvals, digits = 2)
p <- df %>%
plot_ly(type = 'parcoords',
line = list(color = ~species_id,
colorscale = list(c(0,'red'),c(0.5,'green'),c(1,'blue'))),
dimensions = list(
list(range = c(~min(sepal_width),~max(sepal_width)),
label = 'Sepal Width', values = ~sepal_width),
list(range = c(~min(sepal_width_log10),~max(sepal_width_log10)),
tickformat='.2f',
label = 'log10(Sepal Width)', values = ~sepal_width_log10),
list(range = c(~min(sepal_width_log10),~max(sepal_width_log10)),
tickvals = my_tickvals,
ticktext = my_ticktext,
label = 'Sepal Width (log10 axis)', values = ~sepal_width_log10),
list(range = c(4,8),
constraintrange = c(5,6),
label = 'Sepal Length', values = ~sepal_length))
)
p
The underlying plotly.js parcoords doesn't support log projection (scales, axes) at the moment, though as you mention it comes up sometimes and we plan to add this functionality. In the meantime, an option is to take the logarithm of the data ahead of time, with the big drawback that the axis ticks will show log values, which needs explanation and adds to the cognitive burden.
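A minimal sketch of the manual-tick workaround, assuming "nice" 1-2-5 tick values are preferred over ticks evenly spaced in log space. The helper `log_ticks()` is hypothetical (it is not part of plotly), and the built-in `iris` data stands in for the CSV used in the question.

```r
# Hypothetical helper (not a plotly function): pick 1-2-5 style tick values
# on the original scale and return their positions on the log10 scale.
log_ticks <- function(x) {
  stopifnot(all(x > 0))
  decades <- seq(floor(log10(min(x))), ceiling(log10(max(x))))
  candidates <- as.vector(outer(c(1, 2, 5), 10^decades))
  ticks <- sort(candidates[candidates >= min(x) & candidates <= max(x)])
  list(tickvals = log10(ticks), ticktext = as.character(ticks))
}

# Same idea as in the question: one outlier stretches the range.
w <- iris$Sepal.Width
w[1] <- 50
ticks <- log_ticks(w)

# Build the plot only if plotly is installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(
    type = "parcoords",
    line = list(color = as.integer(iris$Species)),
    dimensions = list(
      list(label = "Sepal Width (log10 axis)",
           values = log10(w),            # data are plotted on the log scale
           range = range(log10(w)),
           tickvals = ticks$tickvals,    # tick positions in log10 units
           ticktext = ticks$ticktext),   # labels in the original units
      list(label = "Sepal Length",
           values = iris$Sepal.Length,
           range = c(4, 8))
    )
  )
  # Evaluate `p` at the console to display the figure.
}
```

The axis still carries log10 values internally, so any `constraintrange` on that dimension would presumably also need to be given in log10 units.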
Context & problem:
I am trying to show the evolution of a value over time along with some events that occurred during the same period. One x-axis shows dates, and I would like another x-axis, on top of the plot, that shows alternative tick labels for the dates, i.e. the events.
Here is a ggplot version of this plot:
library(ggplot2)
# here is a dummy dataset
df <- data.frame(
timeaxis = seq.Date(from = as.Date.character("2020-01-01"),
to = as.Date.character("2020-02-01"),
by = "days"),
avalue = runif(32)
)
# now I want to add a secondary axis to show events that occurred during this time period
df_event <- data.frame(
eventdate = as.Date.character(c("2020-01-05", "2020-01-17", "2020-01-20", "2020-01-25")),
eventlabel = c("diner", "exam", "meeting", "payday")
)
# and the basic plot related to it
p <- ggplot(data = df, aes(x = timeaxis, y = avalue)) +
geom_line()
# I can add a new axis like so:
p <- p + scale_x_date(sec.axis = sec_axis(
~ .,
breaks = df_event$eventdate,
labels = df_event$eventlabel
))
print(p)
Created on 2022-02-11 by the reprex package (v2.0.1)
I need to use plotly for its awesome rangeslider, which offers a very convenient way to zoom in on specific time periods.
Demo:
library(dplyr)
library(plotly)
plotly::ggplotly(p) %>%
plotly::layout(
# as you can see the rangeslider is linked to xaxis, is it possible to link it to xaxis2?
xaxis = list(rangeslider = list(visible = TRUE))
)
# note that using rangeslider() does not show the plot right...
As you can see, the second x-axis is not showing. The problem is that plotly does not yet handle secondary axes well. First, if you use ggplot2::scale_x_date(sec.axis = ggplot2::sec_axis()) to get a secondary x-axis, it is not transferred to the plotly plot by plotly::ggplotly(). Second, if you manually define a secondary x-axis using plotly functions (as proposed here or there), this axis does not change according to the rangeslider (I guess because the rangeslider is actually bound to xaxis and should be defined for xaxis2 as well). Not to mention, I was not able to change the labels of the breaks to get the "events" displayed (even the example on
Demo:
plotly::ggplotly(p) %>%
# making an invisible trace
plotly::add_lines(
data = df_event,
x = ~ eventdate,
y = 0,
color = I("transparent"),
hoverinfo = "skip",
showlegend = FALSE,
xaxis = "x2"
) %>%
plotly::layout(
xaxis = list(rangeslider = list(visible = TRUE)),
xaxis2 = list(overlaying = "x", side = "top")
)
And with the actual event labels:
plotly::ggplotly(p) %>%
# making an invisible trace
plotly::add_lines(
data = df_event,
x = ~ eventdate,
y = 0,
color = I("transparent"),
hoverinfo = "skip",
showlegend = FALSE,
xaxis = "x2"
) %>%
plotly::layout(
xaxis = list(rangeslider = list(visible = TRUE)),
xaxis2 = list(overlaying = "x",
side = "top",
tickvals = df_event$eventdate,
ticktext = df_event$eventlabel)
)
I tried to add matches = "x", anchor = "x", scaleanchor = "x" to xaxis2 list but nothing changed.
Question:
How to make a second axis in a plotly plot that reacts to "zoom" using rangeslider?
If you think there is a better way of achieving this, please go ahead! Any idea is very welcome. I am quite new to plotly and have certainly overlooked some of its functionality.
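One possible alternative, offered as a sketch and not as an answer from the thread: instead of a second x-axis, attach the event labels as annotations anchored to the primary x-axis (`xref = "x"`) and to the top of the plotting area (`yref = "paper"`). Annotations positioned this way follow the x-axis when it is zoomed, including through the rangeslider. The sketch builds the figure with `plot_ly()` directly so that the x-axis is a native date axis; `event_annotations` is a name introduced here for illustration.

```r
set.seed(1)
df <- data.frame(
  timeaxis = seq(as.Date("2020-01-01"), as.Date("2020-02-01"), by = "day"),
  avalue = runif(32)
)
df_event <- data.frame(
  eventdate = as.Date(c("2020-01-05", "2020-01-17", "2020-01-20", "2020-01-25")),
  eventlabel = c("diner", "exam", "meeting", "payday"),
  stringsAsFactors = FALSE
)

# One annotation per event: x in data coordinates, y pinned to the top edge.
event_annotations <- lapply(seq_len(nrow(df_event)), function(i) {
  list(
    x = format(df_event$eventdate[i]),  # "YYYY-MM-DD" string for a date axis
    y = 1,
    xref = "x",
    yref = "paper",
    yanchor = "bottom",
    text = df_event$eventlabel[i],
    showarrow = FALSE
  )
})

if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(df, x = ~timeaxis, y = ~avalue,
                       type = "scatter", mode = "lines")
  p <- plotly::layout(
    p,
    annotations = event_annotations,
    xaxis = list(rangeslider = list(visible = TRUE))
  )
  # Evaluate `p` at the console to display the figure.
}
```

The trade-off is that the labels are no longer axis ticks, so tick-specific styling does not apply to them.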
I have some data like this:
data <- data.frame(x=runif(500), y=runif(500), z=runif(500))
I want a scatterplot with points colored independently/discretely in each dimension (X, Y, and Z) using RGB values.
This is what I have tried:
Code:
library(dplyr)
library(plotly)
xyz_colors <- rgb(data$x, data$y, data$z)
plot_ly(data = data,
x = ~x, y = ~y, z = ~z,
color= xyz_colors,
type = 'scatter3d',
mode='markers') %>%
layout(scene = list(xaxis = list(title = 'X'),
yaxis = list(title = 'Y'),
zaxis = list(title = 'Z')))
Plot:
RColorBrewer thinks I'm trying to create a continuous scale from 500 intermediate colors:
Warning messages:
1: In RColorBrewer::brewer.pal(N, "Set2") :
n too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
2: In RColorBrewer::brewer.pal(N, "Set2") :
n too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
What are some correct ways to color the points like this in R with Plotly?
Also, how can one generally assign colors to data points in R with Plotly, individually?
To clarify, I am trying to color each point with a color of the format "#XXYYZZ", where 'XX' is a value between 00 and FF linearly mapped to the value of data$x from 0 to 1 (and likewise 'YY' for data$y and 'ZZ' for data$z). That is, the X dimension determines the amount of red, the Y dimension determines the amount of green, and the Z dimension determines the amount of blue. At 0,0,0 the point should be black and at 1,1,1 the point should be white. The reason for this is to make the 3D position of the points as easy to visualize as possible.
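The mapping described above can be checked in base R with `rgb()`, which already produces "#RRGGBB" strings from values in [0, 1]. The sketch below also passes the resulting vector through `marker = list(color = ...)` in a single trace; that usage is an assumption added here for illustration and is not taken from the answers in this thread.

```r
set.seed(1)
data <- data.frame(x = runif(500), y = runif(500), z = runif(500))

# One hex color per point: x -> red, y -> green, z -> blue.
xyz_colors <- rgb(data$x, data$y, data$z)

# Corner cases from the description: black, white, and pure red.
corner_colors <- rgb(c(0, 1, 1), c(0, 1, 0), c(0, 1, 0))

if (requireNamespace("plotly", quietly = TRUE)) {
  # Colors are handed to the marker directly, so no palette is involved.
  p <- plotly::plot_ly(
    data, x = ~x, y = ~y, z = ~z,
    type = "scatter3d", mode = "markers",
    marker = list(color = xyz_colors, size = 4)
  )
  # Evaluate `p` at the console to display the figure.
}
```

Because the colors bypass the `color`/`colors` mapping, RColorBrewer is not consulted and the palette warnings shown in the question should not appear.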
Updated answer after comments:
So, is there no way to color every point separately?
Yes, there is, through the power and flexibility of add_trace(). And it's a lot less cumbersome than I first thought.
Just set up an empty plotly figure with some required 3D features:
p <-plot_ly(data = data, type = 'scatter3d', mode='markers')
And apply add_trace() in a loop over each defined color:
for (i in seq_along(xyz_colors)){
p <- p %>% add_trace(x=data$x[i], y=data$y[i], z=data$z[i],
marker = list(color = xyz_colors[i], opacity=0.6, size = 5),
name = xyz_colors[i])
}
And you can easily define single points with a color of your choice like this:
p <- p %>% add_trace(x=0, y=0, z=0,
marker = list(color = rgb(0, 0, 0), opacity=0.8, size = 20),
name = rgb(0, 0, 0)) # name the trace after its own color, not xyz_colors[i]
Plot:
Complete code:
library(dplyr)
library(plotly)
# data and colors
data <- data.frame(x=runif(500), y=runif(500), z=runif(500))
xyz_colors <- rgb(data$x, data$y, data$z)
# empty 3D plot
p <-plot_ly(data = data, type = 'scatter3d', mode='markers') %>%
layout(scene = list(xaxis = list(title = 'X'),
yaxis = list(title = 'Y'),
zaxis = list(title = 'Z')))
# one trace per color
for (i in seq_along(xyz_colors)){
p <- p %>% add_trace(x=data$x[i], y=data$y[i], z=data$z[i],
marker = list(color = xyz_colors[i], opacity=0.6, size = 5),
name = xyz_colors[i])
}
# Your favorite data point with your favorite color
p <- p %>% add_trace(x=0, y=0, z=0,
marker = list(color = rgb(0, 0, 0), opacity=0.8, size = 20),
name = rgb(0, 0, 0)) # name the trace after its own color, not xyz_colors[i]
p
Original answer:
In 3D plots you can use the same color for all of the points, distinguish different clusters or categories from each other using different colors, or use individual colors for each point to illustrate a fourth value (or fourth dimension if you like, as described here) in your dataset. All these approaches are, as you put it, examples of '[...] correct ways to color the points [...]'. Have a look below and see if this suits your needs. I've included fourthVal <- data$x+data$y+data$z as an example of an extra dimension. What you end up using will depend entirely on your dataset and what you'd like to illustrate.
Code:
library(dplyr)
library(plotly)
data <- data.frame(x=runif(500), y=runif(500), z=runif(500))
xyz_colors <- rgb(data$x, data$y, data$z)
fourthVal <- data$x+data$y+data$z
plot_ly(data = data,
x = ~x, y = ~y, z = ~z,
color= fourthVal,
type = 'scatter3d',
mode='markers') %>%
layout(scene = list(xaxis = list(title = 'X'),
yaxis = list(title = 'Y'),
zaxis = list(title = 'Z')))
Plot:
Plotly ignores my set color scheme when using type = "scatter", but retains it when using type = NULL, even though the color scheme isn't set until later in the code.
Attempts to use mode = "markers" or mode = "lines" or mode = "lines + markers" lead to the creation of multiple lines.
I realize mtcars is a weird dataset to add alerts and time to for this code example, but it shows the problem exactly. I have fixed the code and it is in production with type = NULL, but having plotly guess the plotting type seems unstable (and in certain instances it assumes a bar graph).
If I use type = NULL in Section 1 then I get the correct color scheme (red, yellow, green).
If I use type = "scatter" then it uses a different color scheme.
If I use type = "scatter", mode = "lines + markers" it also uses a different color scheme.
When the x-axis is POSIXct or dates, it at least stays a single line with differently colored points. But because I was relying on type = NULL to get the correct color scheme, depending on the class of the time component, plotly may assume a bar graph or create multiple lines (one for each alert).
library(plotly)
pal <- c("red","yellow","green4")
pal <- setNames(pal,c("Serious","Moderate","Low"))
x = NULL
for(i in 2007:2009){
for(j in 1:12)
x <- c(x, paste(i,j,"01",sep="-"))
}
x <- x[1:nrow(mtcars)]
x <-as.Date(x)
y <- as.POSIXct(x)
set.seed(4)
mtcars$alert <- sample(c("Serious","Moderate","Low"),size =
nrow(mtcars),replace = TRUE)
mtcars$date <- x
mtcars$posixct <- y
#' Section 1
p <- plot_ly(mtcars, width = 800,
height = 600, type = NULL # must be made null
#' for some reason using any other type (i.e. scatter)
#' results in multiple traces for each factor (i.e. when you color
#' by the alert, multiple lines are created.) OR results in the colors being ignored
#' The only current workaround is for plot_ly to assume scatter
#' in which case it does not separate the values into new traces and keeps the set colors.
#' in ggplot you can use geom_point() and geom_line() to layer the
#' graphics, but this is not possible in plot_ly.
) %>%
#' Section 2
add_markers(x=~posixct, y=~wt, mode = "markers",color =~alert, colors = pal,
marker =list(size = 10)) %>%
add_trace(x=~posixct,y=~wt,mode = 'line',
line = list(color = 'rgb(0, 0, 0)', width = 2), showlegend=FALSE)
p <- p %>% layout(
xaxis = list(title = "Time"),
yaxis = list(title = "weight"),
title = "weight over index as a time", titlefont = list(size = 20),
margin = list(l = 50, r = 50, t = 50, b = 50),
showlegend = TRUE,legend = list(traceorder = "reversed")
)
config(p,collaborate = FALSE,cloud = FALSE,displaylogo = FALSE,
modeBarButtonsToRemove = c("zoom2d","pan2d","select2d","lasso2d",
"zoomIn2d","zoomOut2d","autoScale2d",
"hoverCompareCartesian","toggleHover","toggleSpikelines"))
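One way to avoid relying on `type = NULL`, offered as an assumption and not as a confirmed diagnosis of the behavior above: leave `type` out of `plot_ly()`, pass the named palette through its `colors` argument, and let `add_lines()` and `add_markers()` set the trace type themselves (both create scatter traces). The data frame `d` and the monthly date sequence are stand-ins introduced for this sketch.

```r
pal <- setNames(c("red", "yellow", "green4"), c("Serious", "Moderate", "Low"))

set.seed(4)
d <- mtcars
d$alert <- sample(names(pal), size = nrow(d), replace = TRUE)
d$date <- seq(as.Date("2007-01-01"), by = "month", length.out = nrow(d))

if (requireNamespace("plotly", quietly = TRUE)) {
  # The named palette is set once, at the top level.
  p <- plotly::plot_ly(d, x = ~date, y = ~wt, colors = pal)
  # A single black line through all points, without a color mapping.
  p <- plotly::add_lines(p, line = list(color = "black", width = 2),
                         showlegend = FALSE)
  # Markers split by alert level; each level picks its color from `pal`.
  p <- plotly::add_markers(p, color = ~alert, marker = list(size = 10))
  # Evaluate `p` at the console to display the figure.
}
```

Only the marker layer is split by `alert`, so the line stays a single trace while the legend lists the three alert levels.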
This may look similar to this question: How to format two Axes in Plotly using R? but it is not. Plus this one doesn't have any responses.
I am trying to have three subplots (all time series), one under the other, using plot_ly in R. While the plots are correct, I want their x-axes, i.e. the date-time ranges, to be the same. At the moment the third plot has its time stamps starting at 8 AM, whereas the others start at 00:00. I would like my third plot to also start at 00:00, even if there are no values there. This would make visual comparison much easier. Here is my code snippet:
# Variable names now match the series they hold; the subplot order is unchanged.
pAnkle <- plot_ly(combinedCounts, x = ~DATE_TIME, y= ~Vector.Magnitude_ANKLE, name = "ANKLE_COUNTS", legendgroup = "ANKLE", type = "bar")
pWrist <- plot_ly(combinedCounts, x = ~DATE_TIME, y= ~Vector.Magnitude_WRIST, name = "WRIST_COUNTS", legendgroup = "WRIST", type = "bar")
pResponses <- plot_ly(uEMAResponsesOnly, x=~PROMPT_END_TIME, y=~RESPONSE_NUMERIC, name = "uEMA_RESPONSES", legendgroup = "uEMA", type = "bar")
subplot( style(pAnkle, showlegend = TRUE), style(pWrist, showlegend = TRUE), style(pResponses, showlegend = TRUE), nrows = 3, margin = 0.01)
The subplot function puts them all one under the other. Any help or information is appreciated.
**EDIT:** Here is what it looks like right now. I just want the third axis to also start from the same time as the first two. As you can see, the third one starts at 8 AM.
This is solved. Setting an explicit x-axis range on the third plot makes it start at the same time as the other two:
pResponses <- plot_ly(uEMAResponsesOnly, x=~PROMPT_END_TIME, y=~RESPONSE_NUMERIC, name = "uEMA_RESPONSES", legendgroup = "uEMA", type = "bar")%>% layout(xaxis = list(range = as.POSIXct(c('2018-01-13 00:00:00', '2018-01-14 23:00:00'))))
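A hedged sketch of two ways to avoid hard-coding the dates: compute the common range from the data, or let `subplot()` link the x-axes with `shareX = TRUE`. The data frames below are made-up stand-ins for `combinedCounts` and `uEMAResponsesOnly`, which are not shown in the post.

```r
# Stand-in data: counts cover two full days, responses start at 8 AM.
counts <- data.frame(
  DATE_TIME = seq(as.POSIXct("2018-01-13 00:00:00", tz = "UTC"),
                  by = "hour", length.out = 48),
  value = 1:48
)
responses <- data.frame(
  PROMPT_END_TIME = seq(as.POSIXct("2018-01-13 08:00:00", tz = "UTC"),
                        by = "hour", length.out = 10),
  RESPONSE_NUMERIC = 1:10
)

# Option 1: derive the shared range from the data instead of typing it in.
x_range <- range(c(counts$DATE_TIME, responses$PROMPT_END_TIME))

if (requireNamespace("plotly", quietly = TRUE)) {
  p1 <- plotly::plot_ly(counts, x = ~DATE_TIME, y = ~value, type = "bar")
  p2 <- plotly::plot_ly(responses, x = ~PROMPT_END_TIME,
                        y = ~RESPONSE_NUMERIC, type = "bar")
  p2 <- plotly::layout(p2, xaxis = list(range = x_range))

  # Option 2: stack the plots on one shared x-axis.
  stacked <- plotly::subplot(p1, p2, nrows = 2, shareX = TRUE)
  # Evaluate `stacked` at the console to display the figure.
}
```

With `shareX = TRUE` the stacked plots also zoom and pan together, which may or may not be wanted.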
I am trying to produce a dumbbell plot in R. In this case, there are four rows, and each needs its own specific color. I define the colors as part of the dataset using colorRampPalette(). Then, when I produce the plot, the colors get mixed up. See the image below, and in particular the legend.
As you can see, the orange is supposed to be #7570B3 according to the legend. But this is not correct: the color #7570B3 is purple! So the colors that I defined in the dataset are mixed up in the plot. "Alt 2" should be in orange and "Alt 3" should be in purple.
Does anyone know how to fix this? Any help would be much appreciated.
Here is a simple version of the code:
library(plotly)
library(RColorBrewer) # provides brewer.pal()
table_stats_scores <- data.frame(alt=c("alt1","alt2","alt3","alt4"),
average=c(15,20,10,5),
dumb_colors= colorRampPalette(brewer.pal(4,"Dark2"))(4),
min=c(10,15,5,0),max=c(20,25,15,10)
)
table_stats_scores # This is the dataset
table_stats_scores <- table_stats_scores[order(-table_stats_scores$average),] # ordering
table_stats_scores$alt <- factor(table_stats_scores$alt,
levels = table_stats_scores$alt[order(table_stats_scores$average)])
# giving factor status to alternatives so that plot_ly() picks up on this
p <- plot_ly(table_stats_scores, x=table_stats_scores$average, color = ~dumb_colors,
y=table_stats_scores$alt,text=table_stats_scores$alt) %>%
add_segments(x = ~min, xend = ~max, y = ~alt, yend = ~alt,name = "Min-Max range",
showlegend = FALSE, line = list(width = 4)) %>%
add_markers(x = ~average, y = ~alt, name = "Mean",
marker=list(size=8.5),showlegend = FALSE) %>%
add_text(textposition = "top right") %>%
layout(title = "Scores of alternatives",
xaxis = list(title = "scores"),
yaxis = list(title = "Alternatives")
)
p
Yes, color can be an issue in plotly, because there are several ways to specify it, and the assignment order of the various elements from the data frame can be hard to keep in sync.
The following changes were made:
Added a list of brighter colors to your data frame, because I couldn't easily tell the brewer.pal colors apart. It is better to debug with something obvious.
Changed the color parameter to the alt column, because it is used only indirectly to set the color; mostly it determines the text in the legend.
Added the colors to the text parameter (instead of alt) so I could see whether the colors were being assigned correctly.
Changed the sort order to the default "ascending" in the table_stats_scores sort, because otherwise the colors were assigned in the wrong order (I don't completely understand this; there seems to be some sorting or re-ordering going on internally).
Added a colors parameter to add_segments and add_markers so that they set the color in the same way using the same column.
I think this gets you what you want:
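As a hedged aside on keeping colors and categories in sync: a named character vector passed as `colors` ties each level of the `color` variable to a specific color by name, independent of row order. The sketch below is an alternative to the changes listed above, not a restatement of them; the Dark2 hex codes are written out so that RColorBrewer is not needed, and `tbl` and `pal` are names introduced here.

```r
tbl <- data.frame(
  alt = c("alt1", "alt2", "alt3", "alt4"),
  average = c(15, 20, 10, 5),
  min = c(10, 15, 5, 0),
  max = c(20, 25, 15, 10),
  # First four Dark2 colors: teal, orange, purple, pink.
  dumb_colors = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A"),
  stringsAsFactors = FALSE
)
# Order the factor levels by average so the y-axis is sorted.
tbl$alt <- factor(tbl$alt, levels = tbl$alt[order(tbl$average)])

# Named palette: each alternative is bound to its color by name.
pal <- setNames(tbl$dumb_colors, as.character(tbl$alt))

if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(tbl, color = ~alt, colors = pal)
  p <- plotly::add_segments(p, x = ~min, xend = ~max, y = ~alt, yend = ~alt,
                            line = list(width = 4), showlegend = FALSE)
  p <- plotly::add_markers(p, x = ~average, y = ~alt,
                           marker = list(size = 8.5), showlegend = FALSE)
  # Evaluate `p` at the console to display the figure.
}
```

With the names in place, re-sorting the data frame should not change which alternative gets which color.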
library(plotly)
library(RColorBrewer)
table_stats_scores <- data.frame(alt=c("alt1","alt2","alt3","alt4"),
average=c(15,20,10,5),
dumb_colors= colorRampPalette(brewer.pal(4,"Dark2"))(4),
min=c(10,15,5,0),max=c(20,25,15,10)
)
table_stats_scores # This is the dataset
table_stats_scores$bright_colors <- c("#FF0000","#00FF00","#0000FF","#FF00FF")
table_stats_scores <- table_stats_scores[order(table_stats_scores$average),] # ordering
table_stats_scores$alt <- factor(table_stats_scores$alt,
levels = table_stats_scores$alt[order(table_stats_scores$average)])
# giving factor status to alternatives so that plot_ly() picks up on this
p <- plot_ly(table_stats_scores, x=~average, color = ~alt, y=~alt,text=~bright_colors) %>%
add_segments(x = ~min, xend = ~max, y = ~alt, yend = ~alt,name = "Min-Max range",
colors=~bright_colors, showlegend = FALSE, line = list(width = 4)) %>%
add_markers(x = ~average, y = ~alt, name = "Mean",
marker=list(size=8.5,colors=~bright_colors),showlegend = FALSE) %>%
add_text(textposition = "top right") %>%
layout(title = "Scores of alternatives",
xaxis = list(title = "scores"),
yaxis = list(title = "Alternatives")
)
p
yielding this: