Line plotting on small dataset - r

I have this small dataset that I need to plot with plotly, and I'm struggling with it.
I'm looking to have 3 colored lines, one for each of the rows (1, 2, 3). The x-axis needs to be the column names and the y-axis represents each of the numeric values.
My code so far looks wrong:
plot_ly (x = colnames(a), y = a[1], type = "scatter" ,mode = "lines" )

I'm not sure that this is your desired plot, but it sounded closest to your description. I adapted a few columns of your data to illustrate.
The plot will be easier if the data is in long form, using pivot_longer. It is also easier if you add row numbers to your data, so you can plot one line per row number.
Since plotly will plot your x-axis categories alphabetically, you will want to relevel your name factor (name holds your column names) to the order of your columns.
In your plot_ly statement, use split to plot by row number.
library(plotly)
library(tidyverse)
a %>%
  mutate(rn = row_number()) %>%
  pivot_longer(cols = -rn, names_to = "name", values_to = "value") %>%
  mutate(name = factor(name, levels = colnames(a))) %>%
  plot_ly(x = ~name, y = ~value, split = ~rn, type = "scatter", mode = "lines")
Output: (plot not shown)
Data
a <- data.frame(
  N_of_Brands = c(-.4, .8, -.4),
  Brand_Runs = c(-.26, .70, -.75),
  Total_Volume = c(-.69, .15, -.015),
  No_of_Trans = c(-.81, .45, -.35)
)
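For comparison, the attempt in the question fails because a[1] selects the first column of the data frame rather than the first row. Below is a minimal sketch of the same plot without reshaping, assuming the data frame a defined above; it adds one trace per row in a loop. The trace names ("Row 1", ...) and the helper variable cols are illustrative and not part of the original answer.

```r
library(plotly)

a <- data.frame(
  N_of_Brands = c(-.4, .8, -.4),
  Brand_Runs = c(-.26, .70, -.75),
  Total_Volume = c(-.69, .15, -.015),
  No_of_Trans = c(-.81, .45, -.35)
)

# Factor with levels in column order, so the x-axis is not sorted alphabetically
cols <- factor(colnames(a), levels = colnames(a))

p <- plot_ly()
for (i in seq_len(nrow(a))) {
  # a[i, ] is row i; unlist() + unname() turn it into a plain numeric vector
  p <- add_trace(p, x = cols, y = unname(unlist(a[i, ])),
                 type = "scatter", mode = "lines", name = paste("Row", i))
}
p
```

The pivot_longer version is usually shorter; the loop only makes explicit that each row becomes its own trace.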

Related

Ordering plotly bars in a grouped-split bar chart

I have a data.frame with counts for several groups that are assigned to several loci from two ages:
set.seed(1)
df <- data.frame(group = c(rep("G1",2),rep("G2",2),rep("G3",2)),
                 locus = c(rep("A",4),rep("B",2)),
                 age = rep(c("2m","24m"),3),
                 n = as.integer(runif(6,10,100)),
                 stringsAsFactors = F)
I want to plot these data in a bar chart using plotly in R, where the x-axis is group, the bars are split by age, and colored by the combination of locus and age.
I set group, locus, and age as factors to give them the order I want them to follow in the figure:
df$group <- factor(df$group, levels = c("G1","G2","G3"))
df$locus <- factor(df$locus, levels = c("A","B"))
df$age <- factor(df$age,levels = c("2m","24m"))
Now I'm creating a data.frame with the specific colors I'd like each of the locus-age combinations to have:
library(dplyr)
color.df <- data.frame(locus = c("A","A","B","B"),
                       age = rep(c("2m","24m"),2),
                       color = c("#66C2A5","#488A75","#FC8D62","#B46446"),
                       stringsAsFactors = F) %>%
  dplyr::mutate(locus_age=paste0(locus,"_",age))
color.df$locus <- factor(color.df$locus, levels=c("A","B"))
color.df$age <- factor(color.df$age, levels=c("2m","24m"))
color.df$locus_age <- factor(color.df$locus_age,levels=color.df$locus_age)
Then joining df with color.df:
df <- dplyr::left_join(df,color.df)
And finally plotting:
library(plotly)
plot_ly(x = df$group, y = df$n, split = df$age, text = df$n, type = 'bar',
        color = df$locus_age, colors = color.df$color, showlegend = T,
        textposition = "inside", textfont = list(size=12,color='black')) %>%
  layout(yaxis=list(title="N"))
Which gives: (plot not shown)
My questions are:
1. Although I defined the df$age order to be c("2m","24m"), the "24m" age appears before the "2m" age, as if the split argument in the plot_ly function is ignoring this order. Any idea how to fix this?
2. It looks like the legend is labelling both age and locus_age. Any idea how to make it label only by locus_age?
After some experimentation, it seems to me that the split argument is causing the issues.
I have also streamlined the code so that there is a continuous pipe:
library(dplyr)
library(plotly)
data.frame(locus = c("A", "A", "B", "B"),
           age = rep(c("2m", "24m"), 2),
           color = c("#66C2A5", "#488A75", "#FC8D62", "#B46446"),
           stringsAsFactors = FALSE) %>%
  mutate(locus_age = paste0(locus, "_", age)) %>%
  mutate(locus = factor(locus, levels = c("A", "B"))) %>%
  mutate(age = factor(age, levels = c("2m", "24m"))) %>%
  mutate(locus_age = forcats::fct_inorder(locus_age)) %>%
  left_join(df, .) %>%
  plot_ly(x = ~group, y = ~n,
          # split = ~age,
          text = ~n, type = 'bar',
          color = ~locus_age, colors = ~color, showlegend = TRUE,
          textposition = "inside", textfont = list(size = 12, color = 'black')) %>%
  layout(yaxis = list(title = "N"))
Now, the factor levels seem to be in the expected order and the legend shows only locus_age.
Caveat
Dealing with factors can become tricky sometimes as factor levels are not always created in the expected order.
If we create a character vector in the expected order
x <- outer(c("A", "B"), c("2m", "24m"), paste, sep = "_") %>%
  t() %>%
  as.vector()
x
[1] "A_2m" "A_24m" "B_2m" "B_24m"
base R's
factor(x)
(without specifying factor levels explicitly)
creates a factor whose levels are sorted in the current locale by default:
[1] A_2m A_24m B_2m B_24m
Levels: A_24m A_2m B_24m B_2m
Alternatively,
forcats::as_factor(x)
creates factor levels in order of appearance (if x is character)
[1] A_2m A_24m B_2m B_24m
Levels: A_2m A_24m B_2m B_24m
or even more explicitly
forcats::fct_inorder(x)
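If forcats is not available, the same order-of-appearance levels can be obtained in base R by passing unique(x) as the levels. A minimal sketch, with x written out literally and f as an illustrative variable name:

```r
x <- c("A_2m", "A_24m", "B_2m", "B_24m")

# unique(x) lists the values in order of first appearance,
# so the levels follow the data instead of the locale's sort order
f <- factor(x, levels = unique(x))
levels(f)
```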

Is there a way to set axis labels in r-plotly conditionally?

I would like to declutter a plot created with plotly and remove the repetitions of the year on the x-axis. The year should be shown only for the first month. For the rest of the labels the month (without the year) is enough.
I have tried to achieve this result with the ifelse function - without success (see reproducible example below). Is there any way to use ifelse or if_else to set the axis labels in plotly? I think it works that way in ggplot.
library(tidyverse)
library(lubridate) # provides ymd(); older tidyverse versions do not attach it
library(plotly)
set.seed(42)
dates <- seq(ymd('2021-01-01'), ymd('2021-12-12'), by = 'weeks')
df <-
  data.frame(date = dates,
             value = cumsum(sample(-10:20, length(dates), replace = TRUE)))
df %>%
  plot_ly(x = ~date, y = ~value) %>%
  add_lines() %>%
  layout(xaxis = list(dtick = "M1", tickformat = ~ifelse(date <= "2021-01-31", "%b\n%Y", "%b")))
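One possible approach, sketched here and not taken from an answer in this thread: tickformat takes a single format string for the whole axis, so the conditional logic can instead be applied in R when building the tick labels, which are then passed via tickvals and ticktext. The variable names ticks and labels are illustrative, and the month abbreviations depend on the session locale.

```r
library(plotly)

set.seed(42)
dates <- seq(as.Date("2021-01-01"), as.Date("2021-12-12"), by = "week")
df <- data.frame(date = dates,
                 value = cumsum(sample(-10:20, length(dates), replace = TRUE)))

# One tick per month; only January carries the year
ticks <- seq(as.Date("2021-01-01"), as.Date("2021-12-01"), by = "month")
labels <- ifelse(format(ticks, "%m") == "01",
                 format(ticks, "%b<br>%Y"),  # <br> is a line break in plotly text
                 format(ticks, "%b"))

p <- plot_ly(df, x = ~date, y = ~value) %>%
  add_lines() %>%
  layout(xaxis = list(tickmode = "array",
                      tickvals = format(ticks, "%Y-%m-%d"),  # dates as strings
                      ticktext = labels))
p
```

Because the ticks are fixed, they do not adapt when zooming; that is the trade-off of this sketch compared with dtick = "M1".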

Avoid converting numbers to dates in plotly

I have a matrix that I want to create a heatmap for in plotly. The row names are assays and the column names are CASRNs, in the format "131-55-5".
(Image of the data matrix for the heatmap not included.)
For some reason plotly thinks these are dates, converts them to something like March 2000, and gives me an empty plot.
Before I converted my data frame to a matrix I checked, and all columns are factors.
Is there any way I can make sure my numbers won't turn into dates when I plot my matrix?
This is the code I am using for my heatmap:
plot_ly(x=colnames(dm_new2), y=rownames(dm_new2), z = dm_new2, type = "heatmap") %>%
layout(margin = list(l=120))
Using some random data to mimic your dataset. Simply put your matrix in a dataframe. Try this:
set.seed(42)
library(plotly)
library(dplyr)
library(tidyr)
dm_new2 <- matrix(runif(12), nrow = 4, dimnames = list(LETTERS[1:4], c("131-55-5", "113-48-4", "1582-09-8")))
# Put matrix in a dataframe
dm_new2 <- as.data.frame(dm_new2) %>%
  # rownames to column
  mutate(x = row.names(.)) %>%
  # convert to long format
  pivot_longer(-x, names_to = "y", values_to = "value")
dm_new2 %>%
  plot_ly(x = ~x, y = ~y, z = ~value, type = "heatmap") %>%
  layout(margin = list(l=120))
Created on 2020-04-08 by the reprex package (v0.3.0)
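An alternative that is not part of the answer above, offered as a sketch: keep the matrix as it is and declare the axis that holds the CASRNs as categorical with type = "category" in layout, so plotly should not try to parse the labels as dates. This assumes the same kind of mock matrix as above, here named m.

```r
library(plotly)

set.seed(42)
m <- matrix(runif(12), nrow = 4,
            dimnames = list(LETTERS[1:4], c("131-55-5", "113-48-4", "1582-09-8")))

p <- plot_ly(x = colnames(m), y = rownames(m), z = m, type = "heatmap") %>%
  layout(xaxis = list(type = "category"),  # treat CASRNs as labels, not dates
         margin = list(l = 120))
p
```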

Stacked bar graphs in plotly: how to control the order of bars in each stack

I'm trying to order a stacked bar chart in plotly, but it is not respecting the order I pass it in the data frame.
It is best shown using some mock data:
library(dplyr)
library(plotly)
cars <- sapply(strsplit(rownames(mtcars), split = " "), "[", i = 1)
dat <- mtcars
dat <- cbind(dat, cars, stringsAsFactors = FALSE)
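dat <- dat %>%
  mutate(carb = factor(carb)) %>%
  # .keep_all = TRUE keeps mpg, which select() needs below
  distinct(cars, carb, .keep_all = TRUE) %>%
  select(cars, carb, mpg) %>%
  arrange(carb, desc(mpg))
# plotly >= 4.0 expects formulas (~) for data columns
plot_ly(dat) %>%
  add_trace(data = dat, type = "bar", x = ~carb, y = ~mpg, color = ~cars) %>%
  layout(barmode = "stack")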
The resulting plot doesn't respect the ordering. I want the cars with the largest mpg stacked at the bottom of each carb group. Any ideas?
As already pointed out here, the issue is caused by having duplicate values in the column used for color grouping (in this example, cars). As indicated already, the ordering of the bars can be remedied by grouping your colors by a column of unique names. However, doing so will have a couple of undesired side-effects:
different model cars from the same manufacturer would be shown with different colors (not what you are after - you want to color by manufacturer)
the legend will have more entries in it than you want i.e. one per model of car rather than one per manufacturer.
We can hack our way around this by a) creating the legend from a dummy trace that never gets displayed (add_trace(type = "bar", x = 0, y = 0... in the code below), and b) setting the colors for each category manually using the colors= argument. I use a rainbow palette below to show the principle. You may like to select some more attractive colours yourself.
dat$unique.car <- make.unique(as.character(dat$cars))
dat2 <- data.frame(cars=levels(as.factor(dat$cars)),color=rainbow(nlevels(as.factor(dat$cars))))
dat2[] <- lapply(dat2, as.character)
dat$color <- dat2$color[match(dat$cars,dat2$cars)]
# plotly >= 4.0 expects formulas (~) for data columns
plot_ly() %>%
  add_trace(data=dat2, type = "bar", x = 0, y = 0, color = ~cars, colors=dat2$color, showlegend=T) %>%
  add_trace(data=dat, type = "bar", x = ~carb, y = ~mpg, color = ~unique.car, colors=dat$color, showlegend=F, marker=list(line=list(color="black", width=1))) %>%
  layout(barmode = "stack", xaxis = list(range=c(0.4,8.5)))
One way to address this is to give unique names to all models of car and use that in plotly, but it's going to make the legend messier and impact the color mapping. Here are a few options:
dat$carsID <- make.unique(as.character(dat$cars))
# dat$carsID <- apply(dat, 1, paste0, collapse = " ") # alternative
# plotly >= 4.0 expects formulas (~) for data columns
plot_ly(dat) %>%
  add_trace(data = dat, type = "bar", x = ~carb, y = ~mpg, color = ~carsID) %>%
  layout(barmode = "stack")
plot_ly(dat) %>%
  add_trace(data = dat, type = "bar", x = ~carb, y = ~mpg, color = ~carsID,
            colors = rainbow(length(unique(dat$carsID)))) %>%
  layout(barmode = "stack")
I'll look more tomorrow to see if I can improve the legend and color mapping.

Make dual X-axes based on different variables using ggvis

df <- data.frame(X1 = rep(1:5,1), X2 = rep(4:8,1), var1 = sample(1:10,5), row.names = c(1:5))
library("ggvis")
graph <- df %>%
  ggvis(~X1) %>%
  layer_lines(y = ~ var1) %>%
  add_axis("y", orient = "left", title = "var1") %>%
  add_axis("x", orient = "bottom", title = "X1") %>%
  add_axis("x", orient = "top", title = "X2" )
graph
Obviously, the top x-axis (X2) is not correct here since it refers to the same variable as X1. I know how to create a scaled dual-y axis in ggvis. But how can I create a similar dual axis on different X? Those two X-axis should refer to different variables (X1 and X2 in this example).
I know this could be a really BAD idea to make dual X-axis. But one of my working dataset may need me to do so. Any comments and suggestions are appreciated!
The second axis needs to have a 'name' in order for the axis to know which variable to reflect. See below:
df <- data.frame(X1 = rep(1:5,1),
X2 = rep(4:8,1),
var1 = sample(1:10,5),
row.names = c(1:5))
library("ggvis")
df %>%
  ggvis(~X1) %>%
  #this is the line plotted
  layer_lines(y = ~ var1) %>%
  #and this is the bottom axis as plotted normally
  add_axis("x", orient = "bottom", title = "X1") %>%
  #now we add a second axis and we name it 'x2'. The name is given
  #at the scale argument
  add_axis("x", scale = 'x2', orient = "top", title = "X2" ) %>%
  #and now we plot the second x-axis using the name created above
  #i.e. scale='x2'
  layer_lines(prop('x' , ~X2, scale='x2'))
And as you can see the top x-axis reflects your X2 variable and ranges between 4 and 8.
Also, as a side note: You don't need rep(4:8,1) to create a vector from 4 to 8. Just use 4:8 which returns the same vector.
