r plotly chart based on multiple columns [duplicate] - r

This question already has answers here:
Format axis tick labels to percentage in plotly
(2 answers)
Closed 2 years ago.
I have a data frame that can have two or more columns; the first one, month, is always present. I am trying to plot the columns using plotly in R. Right now it has three columns: month, apple, orange. Depending on the analysis, it can gain another column, such as banana. Below is the code I am using right now, but it also plots the month column on the y-axis. How do I fix this?
> sample_test
month apple orange
2 Aug-17 2 1
3 Dec-17 2 1
4 Feb-18 2 1
5 Jan-18 2 1
6 Jul-17 2 1
7 Jun-17 2 1
8 May-17 2 1
9 Nov-17 2 1
10 Oct-17 2 1
11 Sep-17 2 1
p<- plot_ly(sample_test, x = sample_test$month, name = 'alpha', type = 'scatter', mode = 'lines',
line = list(color = 'rgb(24, 205, 12)', width = 4)) %>%
layout(#title = "abbb",
xaxis = list(title = "Time"),
yaxis = list (title = "Percentage"))
for(trace in colnames(sample_test)){
p <- p %>% plotly::add_trace(y = as.formula(paste0("~`", trace, "`")), name = trace)
}
p
The output looks like this :

Does this help?
sample_test <- read.table(
text = ' month apple orange
2 Aug-17 2 1
3 Dec-17 2 1
4 Feb-18 2 1
5 Jan-18 2 1
6 Jul-17 2 1
7 Jun-17 2 1
8 May-17 2 1
9 Nov-17 2 1
10 Oct-17 2 1
11 Sep-17 2 1'
)
sample_test$month <- as.Date(paste('01', sample_test$month, sep = '-'), format = '%d-%b-%y')
library(plotly)
p <- plot_ly(sample_test, type = 'scatter', mode = 'lines',
line = list(color = 'rgb(24, 205, 12)', width = 4)) %>%
layout(#title = "abbb",
xaxis = list(title = "Time"),
yaxis = list (title = "Percentage", tickformat = '%'))
for(trace in colnames(sample_test)[2:ncol(sample_test)]){
p <- p %>% plotly::add_trace(x = sample_test[['month']], y = sample_test[[trace]], name = trace)
}
p
There are a couple of things to note here:
When dealing with dates, it's best to store them as Date objects. This can save a lot of headaches later on, and most functions that work with dates have methods built to handle them.
When adding traces in a for loop, always reference the vector to be plotted explicitly, like data$vector or data[['vector']], and not like y = ~vector; otherwise plotly can end up plotting the same trace over and over again.
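To illustrate the first note, here is a minimal base R sketch of the month conversion used in the code above. The four labels are just a subset of the question's data, and English month abbreviations are assumed, since %b depends on the locale.

```r
# A few month labels in the "Aug-17" style used in the question
month_labels <- c("Aug-17", "Dec-17", "Feb-18", "Jan-18")

# Prepend a day so that as.Date() has a complete date to parse.
# %b matches abbreviated month names (locale dependent, English assumed here).
month_dates <- as.Date(paste("01", month_labels, sep = "-"), format = "%d-%b-%y")

# Character labels sort alphabetically; Date objects sort chronologically
sort(month_labels)
sort(month_dates)
```

With real dates on the x-axis, the lines are drawn in calendar order rather than in the alphabetical order of the labels.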

You can specify a trace for the first y element, which will give you your raw counts. Next, you can format the y-axis with tickformat, which displays the values as percentages.
sample_test <- data.frame(month = c("Aug-17", "Dec-17", "Feb-18"), apple = c(2,2,2), orange = c(1,1,1))
p <- plot_ly(sample_test, x = sample_test$month, y = ~apple, name = 'alpha', type = 'scatter', mode = 'lines',
line = list(color = 'rgb(24, 205, 12)', width = 4)) %>%
layout(xaxis = list(title = "Time")) %>%
layout(yaxis = list(tickformat = "%", title = "Percentage"))
Note, however, that this appears to just multiply the values by 100 and add a % label rather than actually calculate a percentage. From this SO answer, it looks like that's all it does. I don't really use plotly, but in ggplot you can do this if you reshape your data to long format and map your categorical variable (in this case, fruit) as a percent.
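If actual percentages are wanted, one option is to compute each fruit's share of the row total before plotting, so that the values passed to the chart lie between 0 and 1. The following is a base R sketch under that assumption; the names fruit_cols and shares are made up for illustration.

```r
sample_test <- data.frame(month = c("Aug-17", "Dec-17", "Feb-18"),
                          apple = c(2, 2, 2), orange = c(1, 1, 1))

# Every column except month is treated as a fruit count
fruit_cols <- setdiff(names(sample_test), "month")

# Divide each count by its row total to get a share between 0 and 1
shares <- sample_test
shares[fruit_cols] <- sample_test[fruit_cols] / rowSums(sample_test[fruit_cols])
shares
```

Plotting the columns of shares instead of the raw counts should then make tickformat = "%" show real percentages, since the format only scales by 100 and appends the sign.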
Edit: per OP's comment, removed month from being traced.
p <- plot_ly(type = 'scatter', mode = 'lines') %>%
layout(yaxis = list(tickformat = "%", title = "Percentage"))
colNames <- names(sample_test)
colNames <- colNames[-which(colNames == 'month')]
for(trace in colNames){
p <- p %>% plotly::add_trace(data = sample_test, x = ~ month, y = as.formula(paste0("~`", trace, "`")), name = trace)
print(paste0("~`", trace, "`"))
}
p

Related

How to remove 0 y values plotly R?

I have the following code and dataframe :
datasku = data.frame(matrix(ncol = 3, nrow = 12))
names_col = c("product", "Bad Waitress", "Black Pumas")
colnames(datasku) = names_col
datasku$product <- c("x","y","s","u","i","o","l","m","n","k","b","c")
artists <- c("Bad Waitress", "Black Pumas")
datasku$`Bad Waitress`<- c(23,40,0,0,0,0,0,0,10,0,0,0)
datasku$`Black Pumas` <- c(0,40,0,0,0,65,0,0,10,0,0,0)
product Bad Waitress Black Pumas
1 x 23 0
2 y 40 40
3 s 0 0
4 u 0 0
5 i 0 0
6 o 0 65
7 l 0 0
8 m 0 0
9 n 10 10
10 k 0 0
11 b 0 0
12 c 0 0
show_vec = c()
for (i in 1:length(artists)){
show_vec = c(show_vec,FALSE)
}
get_menu_list <- function(artists){
n_names = length(artists)
buttons = vector("list",n_names)
for(i in seq_along(buttons)){
show_vec[i] = TRUE
buttons[i] = list(list(method = "restyle",
args = list("visible", show_vec),
label = artists[i]))
print(list(show_vec))
show_vec[i] = FALSE
}
return_list = list(
list(
type = 'dropdown',
active = 0,
buttons = buttons
)
)
return(return_list)
}
print(get_menu_list(artists))
fig <- plot_ly(data=datasku, x = ~product, y = ~`Bad Waitress`,type = 'bar',
transforms = list(
list(
type = 'filter',
target = 'y',
operation = '>',
value = 0
)),
hovertemplate = paste('<i>Popularity</i>: %{y:.2f}%',
'<br><i>Product</i>: %{x}<extra></extra><br>'))
fig <- fig %>% add_trace(y = ~`Black Pumas`)
fig <- fig %>% layout(showlegend = F,yaxis = list(title = 'Count'), barmode = 'group',
updatemenus = get_menu_list(artists))
fig
What this code does is create a plotly bar chart with a dropdown menu listing the artists' names. Selecting an artist should show only the products with a value > 0, but it only partially works and I can't understand why.
If I run the code and select Bad Waitress, this is what I obtain:
As you can see, some columns are removed (since they have a value of 0), but others that also have a value of 0 are still shown despite the filter. How can I solve this problem?
Thank you
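One possible workaround, not taken from the question, is to drop the zero rows in R before plotting instead of relying on the filter transform. The base R sketch below builds one small data frame per artist containing only the products with a value above 0; the name nonzero_by_artist is hypothetical, and how this interacts with the dropdown buttons is not verified here.

```r
datasku <- data.frame(
  product = c("x", "y", "s", "u", "i", "o", "l", "m", "n", "k", "b", "c"),
  `Bad Waitress` = c(23, 40, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0),
  `Black Pumas` = c(0, 40, 0, 0, 0, 65, 0, 0, 10, 0, 0, 0),
  check.names = FALSE  # keep the spaces in the artist column names
)
artists <- setdiff(names(datasku), "product")

# For each artist, keep only the products whose value is greater than 0
nonzero_by_artist <- lapply(setNames(artists, artists), function(artist) {
  keep <- datasku[[artist]] > 0
  data.frame(product = datasku$product[keep], value = datasku[[artist]][keep])
})
nonzero_by_artist[["Bad Waitress"]]
```

Each element could then be used as the data for that artist's bar trace, so that no zero-valued bars reach the chart in the first place.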

Multiple line 3D plot in R

I have 3 dataframes:
> head(ps_data)
mass value
1 1197.106 0.0003046761
2 1197.312 0.0002792939
3 1197.518 0.0002545125
4 1197.724 0.0002304614
5 1197.930 0.0002072700
6 1198.136 0.0001850678
> head(enf_data)
mass value
1 1252.358 0.0001400532
2 1252.560 0.0001380179
3 1252.761 0.0001360147
4 1252.963 0.0001336038
5 1253.165 0.0001310146
6 1253.367 0.0001278587
> head(uti_data)
mass value
1 1209.999 9.404051e-05
2 1210.204 9.176861e-05
3 1210.409 8.892953e-05
4 1210.614 8.613961e-05
5 1210.819 8.299913e-05
6 1211.024 8.038693e-05
I need to plot something close to this:
The z axis should be the "value" column, the y axis the "mass" column, and the x axis each data frame.
I tried to plot this using the plotly package, but I'm not getting it right.
How can I do it?
EDIT: dput as requested.
structure(list(mass = c(1197.10568602095, 1197.31161534199, 1197.51756246145,
1197.72352737934, 1197.92951009569, 1198.1355106105), value = c(0.000304676093184434,
0.000279293920415841, 0.000254512541389108, 0.000230461422005283,
0.000207270028165387, 0.000185067825770437), group = c("PS",
"PS", "PS", "PS", "PS", "PS")), row.names = c(NA, 6L), class = "data.frame")
structure(list(mass = c(1252.3578527531, 1252.55956147119, 1252.76128739414,
1252.96303052216, 1253.16479085545, 1253.3665683942), value = c(0.000140053215421452,
0.000138017894050617, 0.00013601474884925, 0.000133603848925069,
0.000131014621271734, 0.000127858739055662), group = c("ENF",
"ENF", "ENF", "ENF", "ENF", "ENF")), row.names = c(NA, 6L), class = "data.frame")
structure(list(mass = c(1209.99938731277, 1210.20436650703, 1210.40936335465,
1210.61437785568, 1210.81941001019, 1211.02445981824), value = c(9.40405108642129e-05,
9.17686135352109e-05, 8.89295335433793e-05, 8.61396097238083e-05,
8.29991287322805e-05, 8.03869281229029e-05), group = c("UTI",
"UTI", "UTI", "UTI", "UTI", "UTI")), row.names = c(NA, 6L), class = "data.frame")
EDIT 2:
Got some progress using plotly:
ps_data["group"] <- "PS"
enf_data["group"] <- "ENF"
uti_data["group"] <- "UTI"
all_data <- rbind(ps_data,enf_data,uti_data)
all_long <- melt(all_data, id.vars=c("mass","group","value"))
fig <- plot_ly(all_long, x = ~group, y = ~mass, z = ~value, type = 'scatter3d', mode = 'lines',
opacity = 1, line = list(width = 6, color = ~group, reverscale = FALSE))
fig
But some strange lines appeared in x axis and the colors are not right.
EDIT 3:
I managed to plot something quite good.
My data looks like this:
> head(all_data)
mass value group
1 1197.106 0.0003046761 PS
2 1197.312 0.0002792939 PS
3 1197.518 0.0002545125 PS
4 1197.724 0.0002304614 PS
5 1197.930 0.0002072700 PS
6 1198.136 0.0001850678 PS
The dataframe is huge, with three groups (PS, ENF, UTI).
I can't fit all of it here, but I decided to place the head just for you to see the structure.
With this data I used this:
p3 <- plot_ly(all_data, x = ~group, y = ~mass, z = ~value, split = ~group, type = 'scatter3d', mode = 'lines',
line = list(width = 4))
Now I'm just trying to find some reliable way to save it in TIFF and change the axis titles.
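For the axis titles, 3D plots in plotly take their axis settings from the scene element of the layout. Below is a small sketch, assuming the plotly package is installed; the title strings and the four-row data frame are placeholders, not the real data.

```r
# Tiny stand-in for all_data, using values from the head() output above
all_data <- data.frame(
  mass  = c(1197.106, 1197.312, 1252.358, 1252.560),
  value = c(0.0003046761, 0.0002792939, 0.0001400532, 0.0001380179),
  group = c("PS", "PS", "ENF", "ENF")
)

# Axis titles for a 3D plot live under layout(scene = ...); titles are placeholders
scene_axes <- list(
  xaxis = list(title = "Group"),
  yaxis = list(title = "Mass"),
  zaxis = list(title = "Value")
)

if (requireNamespace("plotly", quietly = TRUE)) {
  library(plotly)
  p3 <- plot_ly(all_data, x = ~group, y = ~mass, z = ~value, split = ~group,
                type = 'scatter3d', mode = 'lines', line = list(width = 4))
  p3 <- layout(p3, scene = scene_axes)  # print p3 to display the figure
}
```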

plotly: Map size (shape) to value column in scatterplot

I really appreciate the 'plotly' R package. I have currently run into an issue where I want to visualize a data frame as points and map the point size (and potentially the shape as well) to a dimension of the data frame.
The problem I run into with my own dataset is that the sizes are somehow "mixed up", in the sense that the bigger points don't correspond to the bigger values.
I haven't fully understood the options I have with plotly (sizeref and other marker options, the fundamental difference between mapping the dimension directly or in the marker arguments, etc.), so this is my best shot at a minimal example.
(The second plot is closer to what I currently do. If this one could be fixed, it would be preferable to me)
Your thoughts are greatly appreciated. :)
library(plotly)
set.seed(1)
df <- data.frame(x = 1:10,
y = rep(c("id1", "id2"), 5),
col = factor(sample(3, 10, replace = TRUE)))
df$size <- c(40, 40, 40, 30, 30, 30, 20, 20, 20, 10)
df
#> x y col size
#> 1 1 id1 1 40
#> 2 2 id2 2 40
#> 3 3 id1 2 40
#> 4 4 id2 3 30
#> 5 5 id1 1 30
#> 6 6 id2 3 30
#> 7 7 id1 3 20
#> 8 8 id2 2 20
#> 9 9 id1 2 20
#> 10 10 id2 1 10
# Mapping looks right, but the size may not be correct
plot_ly(df,
x = ~x,
y = ~y,
color = ~col,
size = ~size,
type = 'scatter',
mode = 'markers',
hoverinfo = "text",
text = ~paste('</br> x: ', x,
'</br> y: ', y,
'</br> col: ', col,
'</br> size: ', size)
# , marker = list(size = ~size)
)
# Size looks right, but mapping to points is wrong
plot_ly(df,
x = ~x,
y = ~y,
color = ~col,
# size = ~size,
type = 'scatter',
mode = 'markers',
hoverinfo = "text",
text = ~paste('</br> x: ', x,
'</br> y: ', y,
'</br> col: ', col,
'</br> size: ', size)
, marker = list(size = ~size)
)
devtools::session_info() # excerpt
#> plotly * 4.8.0

Multiple line chart using plotly r

I have a data frame which I am trying to plot using plotly as a multiple line chart. Below is what the data frame looks like:
Month_considered pct.x pct.y pct
<fct> <dbl> <dbl> <dbl>
1 Apr-17 79.0 18.4 2.61
2 May-17 78.9 18.1 2.99
3 Jun-17 77.9 18.7 3.42
4 Jul-17 77.6 18.5 3.84
5 Aug-17 78.0 18.3 3.70
6 Sep-17 78.0 18.9 3.16
7 Oct-17 77.6 18.9 3.49
8 Nov-17 77.6 18.4 4.01
9 Dec-17 78.5 18.0 3.46
10 Jan-18 79.3 18.4 2.31
11 2/1/18 78.9 19.6 1.48
Below is the code I use to iterate through the columns and plot multiple lines.
colNames <- colnames(delta)
p <-
plot_ly(
atc_seg_master,
x = ~ Month_considered,
type = 'scatter',
mode = 'line+markers',
line = list(color = 'rgb(205, 12, 24)', width = 4)
)
for (trace in colNames) {
p <-
p %>% plotly::add_trace(y = as.formula(paste0("~`", trace, "`")), name = trace)
}
p %>%
layout(
title = "Trend Over Time",
xaxis = list(title = ""),
yaxis = list (title = "Monthly Count of Products Sold")
)
p
This is what the output looks like:
My question is how to remove trace 0 and Month_considered from the chart, even though Month_considered is not in the column names I loop through to add the lines.
It looks like you were getting tripped up by two things:
When you initially defined p and included the data and x arguments, a trace was created: trace 0. You can instead start by defining a plot without any data or x values, using p <- plot_ly() along with any desired layout features.
When you loop through the column names, your x-axis column, Month_Considered, is part of the set. You can exclude it by using setdiff() (part of base R) to create a vector with all of your column names except Month_Considered.
Putting those two things together, one way (of many possible) to accomplish what you're going for is as follows:
library(plotly)
df <- data.frame(Month_Considered = seq.Date(from = as.Date("2017-01-01"), by = "months", length.out = 12),
pct.x = seq(from = 70, to = 80, length.out = 12),
pct.y = seq(from = 30, to = 40, length.out = 12),
pct = seq(from = 10, to = 20, length.out = 12))
## Define a blank plot with the desired layout (don't add any traces yet)
p <- plot_ly()%>%
layout(title = "Trend Over Time",
xaxis = list(title = ""),
yaxis = list (title = "Monthly Count of Products Sold") )
## Make sure our list of columns to add doesn't include Month_Considered
ToAdd <- setdiff(colnames(df),"Month_Considered")
## Add the traces one at a time
for(i in ToAdd){
p <- p %>% add_trace(x = df[["Month_Considered"]], y = df[[i]], name = i,
type = 'scatter',
mode = 'lines+markers',
line = list(color = 'rgb(205, 12, 24)', width = 4))
}
p

Get separate graphed lines for different variables

I have a variable data with the following structure:
week: int 1 1 2 2 3 3 4 4 5 5 ...
earn: int 2 3 2 7 8 9 2 6 4 2 ...
name: chr "C", "A", "C", "A" ...
Each name (person) has an entry per week with what they earned. From the above we can see that C earned 2 in week 1 while A earned 3 in week 1. C earned 2 in week 2 while A earned 7 in week 2.
I wish to plot this on a line graph. The below is what I have tried.
p <- plot_ly(data, x = data$week, name = "Week", type = "scatter", mode = "lines") %>%
add_trace(y = data$earn, name = "earn", mode = "lines+markers") %>%
add_trace(y = data$earn, name = "earn", mode = "markers")
p
However, this gives a graph with a single line, where the marker for week 1 shows both 2 and 3, since those are the two earnings for that week. I would like two lines instead, so that the difference in earnings between the two names can be clearly seen.
Defining color will give you what you want.
p <- plot_ly(data, x = ~week, y = ~ earn) %>%
add_lines(color = ~name) %>%
add_markers(color = ~name, showlegend = FALSE)
p
Alternatively, you can use:
p <- plot_ly(data=data, x = ~week, y = ~ earn) %>%
add_trace(color = ~name, type = "scatter", mode = "lines+markers")
p
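For a reproducible version, here is a sketch with a data frame rebuilt from the values shown in the question. The names are assumed to keep alternating C, A beyond the four shown, and the plot is only built if the plotly package is installed.

```r
# Data rebuilt from the structure in the question; the alternation of
# names past the first four entries is an assumption
data <- data.frame(
  week = rep(1:5, each = 2),
  earn = c(2, 3, 2, 7, 8, 9, 2, 6, 4, 2),
  name = rep(c("C", "A"), times = 5)
)

# Mapping color to name is what splits the rows into one group per person;
# split() shows the same grouping in base R
by_name <- split(data, data$name)
sapply(by_name, nrow)

if (requireNamespace("plotly", quietly = TRUE)) {
  library(plotly)
  p <- plot_ly(data, x = ~week, y = ~earn, color = ~name,
               type = "scatter", mode = "lines+markers")  # print p to display
}
```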
