I have multiple graphs that I am plotting with ggplot and then sending to plotly. I set the legend order based the most recent date, so that one can easily interpret the graphs. Everything works great in generating the ggplot, but once I send it through ggplotly() the legend order reverts to the original factor level. I tried resetting the factors but this creates a new problem - the colors are different in each graph.
Here's the code:
Data:
Country <- c("CHN","IND","INS","PAK","USA")
a <- data.frame("Country" = Country,"Pop" = c(1400,1300,267,233,330),Year=rep(2020,5))
b <- data.frame("Country" = Country,"Pop" = c(1270,1000,215,152,280),Year=rep(2000,5))
c <- data.frame("Country" = Country,"Pop" = c(1100,815,175,107,250),Year=rep(1990,5))
Data <- bind_rows(a,b,c)
Legend Ordering Vector - This uses 2020 as the year to determine order.
Legend_Order <- Data %>%
filter(Year==max(Year)) %>%
arrange(desc(Pop)) %>%
select(Country) %>%
unlist() %>%
as.vector()
Then I create my plot and use Legend Order as breaks
Graph <- Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, group = Country, color = Country), size = 1.2) +
scale_color_discrete(name = 'Country', breaks = Legend_Order)
Graph
But then when I pass this on to:
ggplotly(Graph)
For some reason plotly ignores the breaks argument and uses the original factor levels.
If I set the factor levels beforehand, the color schemes changes (since the factors are in a different order).
How can I keep the color scheme from graph to graph, but change the legend order when using plotly?
Simply recode your Conutry var as factor with the levels set according to Legend_Order. Try this:
library(plotly)
library(dplyr)
Country <- c("CHN","IND","INS","PAK","USA")
a <- data.frame("Country" = Country,"Pop" = c(1400,1300,267,233,330),Year=rep(2020,5))
b <- data.frame("Country" = Country,"Pop" = c(1270,1000,215,152,280),Year=rep(2000,5))
c <- data.frame("Country" = Country,"Pop" = c(1100,815,175,107,250),Year=rep(1990,5))
Data <- bind_rows(a,b,c)
Legend_Order <- Data %>%
filter(Year==max(Year)) %>%
arrange(desc(Pop)) %>%
select(Country) %>%
unlist() %>%
as.vector()
Data$Country <- factor(Data$Country, levels = Legend_Order)
Graph <- Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, group = Country, color = Country), size = 1.2)
ggplotly(Graph)
To "lock in" the color assignment you can make use of a named color vector like so (for short I only show the ggplots):
# Fix the color assignments using a named color vector which can be assigned via scale_color_manual
cols <- scales::hue_pal()(5) # Default ggplot2 colors
cols <- setNames(cols, Legend_Order) # Set names according to legend order
# Plot with unordered Countries but "ordered" color assignment
Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, color = Country), size = 1.2) +
scale_color_manual(values = cols)
# Plot with ordered factor
Data$Country <- factor(Data$Country, levels = Legend_Order)
Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, color = Country), size = 1.2) +
scale_color_manual(values = cols)
Related
I am currently working in a comparison between two inventory levels and I want to plot two step graphs in the same grid with a color code. This is my code.
Intento1<-data.frame(Fecha, NivelI)
Intento2<-data.frame(Fecha, Nivel2)
#Printing the step graphs in one grid
ggplot()+geom_step(Intento1, mapping=aes(x=Fecha, y=NivelI))+geom_step(Intento2, mapping=aes(x=Fecha, y=Nivel2))
And it works fine plotting both graphs in the same grid, I could also add a different color to each graph but I couldnĀ“t add the little colored labels that appear normally at the right. All support is appreciated.
For example data dummy,
dummy <- data.table(
Fecha = seq(as.Date("2020/1/1"), as.Date("2020/1/31"), "day")
)
dummy$NivelI = runif(31, 0, 10)
dummy$Nivel2 = runif(31, 0, 10)
plot using reshape2::melt like below will work.
dummy %>%
melt(id.vars = "Fecha") %>%
ggplot(aes(Fecha, value, group = variable, color = variable)) +
geom_step() + guides(color = guide_legend(title = "aaa"))
In your case, to make dummy formed data, if Fecha, NivelI and Nivel2 are vectors, just try
df <- data.frame(
Fecha,
NivelI,
Nivel2
)
then
df %>%
melt(id.vars = "Fecha") %>%
ggplot(aes(Fecha, value, group = variable, color = variable)) +
geom_step() + guides(color = guide_legend(title = "aaa"))
where "aaa" will be your legend name.
I have a dataframe like this:
library(tidyverse)
my_data <- tibble(name = c("Justin", "Janet", "Marisa"),
x = c(100, 50, 75),
y = c(2, 3, 6))
Each name is unique, and I want to make a bar graph for each person without having to do it line by line. I also want to save each plot as a unique object because I'll be inputting it into a power point using the officer package. Last, the names won't always be the same, but each name will always be unique.
For instance, I want one plot for Janet, one plot for Justin, and one plot for Marisa. I don't want them faceted but instead as their own objects.
Any thoughts?
We can get the data in long format first and for each individual name create the plot.
library(tidyverse)
long_data <- my_data %>% tidyr::pivot_longer(cols = -name, names_to = 'col')
plots_list <- map(unique(my_data$name), ~long_data %>%
filter(name == .x) %>%
ggplot() + aes(name, value, fill = col) +
geom_bar(stat = 'identity', position = 'dodge') +
scale_fill_manual(values = c('red', 'blue')) +
ggtitle(paste0('Plot for ', .x)))
This will return list of plots where individual plots can be accessed via plots_list[[1]], plots_list[[2]] etc.
plots_list[[1]]
I have the following 'code'
set.seed(100)
values<-c(rnorm(200,10,1),rnorm(200,2.1,1),rnorm(250,6,1),rnorm(75,2.1,1),rnorm(50,9,1),rnorm(210,2.05,1))
rep1<-rep(3,200)
rep2<-rep(0,200)
rep3<-rep(1,250)
rep4<-rep(0,75)
rep5<-rep(2,50)
rep6<- rep(0,210)
group<-c(rep1,rep2,rep3,rep4,rep5,rep6)
df<-data.frame(values,group)
I would like to plot these data as a scatter plot (like the attached plot) and add segments. These segments (y values) shall represent the mean value of the data for a given group. In addition, the segments should have a different color depending on the factor (group). Is there an efficient way to do it with ggplot ?
Many thanks
We can do this by augmenting your data a little. We'll use dplyr to get the mean by group, and we'll create variables that give the observation index and one that increments by one each time the group changes (which will be helpful to get the segments you want):1
library(dplyr)
df <- df %>%
mutate(idx = seq_along(values), group = as.integer(group)) %>%
group_by(group) %>%
mutate(m = mean(values)) %>%
ungroup() %>%
mutate(group2 = cumsum(group != lag(group, default = -1)))
Now we can make the plot; using geom_line() with grouping by group2, which changes every time the group changes, makes the segments you want. Then we just color by (a discretized version of) group:
ggplot(data = df, mapping = aes(x = idx, y = values)) +
geom_point(shape = 1, color = "blue") +
geom_line(aes(x = idx, y = m, group = group2, color = as.factor(group)),
size = 2) +
scale_color_manual(values = c("red", "black", "green", "blue"),
name = "group") +
theme_bw()
1 See https://stackoverflow.com/a/42705593/8386140
I want to create a function that makes a heatmap where the y axis will have unique breaks, but repeated and ordered labels. I know that this is might not be a great practice. I am also aware that similar questions have been asked before. For example: ggplot in R, reordering the bars. But I want to achieve these repeated and ordered labels through sorting within a function, not by typing them manually. I am aware of solutions for reordering axes based on the values of factor (e.g., Order Bars in ggplot2 bar graph), but I don't think they apply or can't see how to apply these to my case, where the breaks are unique but the labels repeat.
Here is some code to reproduce the problem and some of my attempts:
Libraries and data
library(ggplot2)
library(dplyr)
library(tidyr)
set.seed(4)
id <- LETTERS[1:10]
lab <- paste(c("AB", "CD"), 1:5, sep = "_") %>%
sample(., size = 10, replace = TRUE)
val <- sample.int(n = 6, size = 10, replace = TRUE)
tes <- ifelse(val >= 4, 1, 0)
dat <- data.frame(id, lab, val, tes)
A heatmap with unique breaks on the y axis
dat2 <- dat %>% gather(kind, value, val:tes)
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1)
A heatmap where the y axis is labeled with repeated labels instead of the unique breaks
This works, to the point that labels are used instead of unique ids, but the y axis is not ordered by the labels. Also, I am not sure about setting breaks and labels from the data frame in wide format (dat), rather than the data frame in long format used by ggplot (dat2).
dat2 <- dat %>% gather(kind, value, val:tes)
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1) +
scale_y_discrete(breaks=dat$id, labels=dat$lab)
Mapping the vector of with repeated values on the y axis obviously doesn't work
dat2 <- dat %>% gather(kind, value, val:tes)
ggplot(dat2) +
geom_tile(aes(x = kind, y = lab, fill = value), color="white", size=1)
Repeated and ordered labels, try 1
As expected, merely sorting the input data by the non-unique lab variable does not work.
dat2 <- dat %>% gather(kind, value, val:tes) %>%
arrange(lab)
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1) +
scale_y_discrete(breaks=id, label=lab)
Repeated and ordered labels, try 2
Try to create a named breaks vector ordered by the (repeating) labels. This gets me nowhere. Half the labels are missing and they are still not sorted.
dat2 <- dat %>% gather(kind, value, val:tes)
brks <- setNames(dat$id, dat$lab)[sort(dat$lab)]
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1) +
scale_y_discrete(breaks = brks, labels = names(brks))
Repeated and ordered labels, try 3
Starting with the data frame sorted by label, try to create an ordered factor for lab. Then sort the table by this ordered factor. No luck.
dat2 <- dat %>% gather(kind, value, val:tes) %>% arrange(lab)
dat2 <- mutate(dat2, lab_f=factor(lab, levels=sort(unique(lab)), ordered = TRUE))
dat2 <- arrange(dat2, lab_f)
# check
dat2$lab_f
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1) +
scale_y_discrete(breaks = dat2$id, labels = dat2$lab_f)
A workaround, which I can use if I have to, but I am trying to avoid
We can create a combination of id and lab which will be unique and use it for the y axis
dat2 <- dat %>% gather(kind, value, val:tes) %>%
mutate(id_lab=paste(lab, id, sep="_"))
ggplot(dat2) +
geom_tile(aes(x = kind, y = id_lab, fill = value), color="white", size=1)
I must be missing something. Any help is much appreciated.
The goal is to have a function that will take an arbitrarily long table and plot a y axis with unique breaks but (possibly) repeated and ordered labels.
heat <- function(dat) {
dat2 <- dat %>% gather(kind, value, val:tes)
# any other manipulation here
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1)
# scale_y_discrete() (if needed)
}
The plot I am looking for is something like this (created in inkscape)
Using limits instead of breaks sets the order:
ggplot(dat2) +
geom_tile(aes(x = kind, y = id, fill = value), color="white", size=1) +
geom_text(aes(x = 1, y = id, label = id), col = 'white') +
scale_y_discrete(limits = dat$id[order(dat$lab)], labels = sort(dat$lab))
Here is the reproducible data that I'm using as an example.
Name <- c("Blueberry", "Raspberry", "Celery", "Apples", "Peppers")
Class <- c("Berries", "Berries", "Vegetable", "Fruit", "Vegetable")
Yield <- c(30, 20, 15, 25, 40)
example <- data.frame(Class = Class, Name = Name, Yield = Yield)
When plotted with ggplot2 we get ...
ggplot(example, aes(x = Name, y = Yield, fill = Name))+
geom_bar(stat = "identity")
It would be helpful if we could give fills of similar colour to those that have the same class. For example, if Vegetables were shades of blue, Berries were shades of pink, and Fruits were shades of green you could see the yield by class of plants but still visually see the name (which is more important to us)
I feel that I could accomplish this with scale_fill_hue() but I can't seem to get it to work
ggplot(example, aes(x = Name, y = Yield))+
geom_bar(aes(fill = Class),stat = "identity")+
scale_fill_hue("Name")
The basic design in ggplot is one scale per aesthetic (see #hadley's opinion e.g. here). Thus, work-arounds are necessary in a case like yours. Here is one possibility where fill colors are generated outside ggplot. I use color palettes provided by package RColorBrewer. You can easily check the different palettes here. dplyr functions are used for the actual data massage. The generated colours are then used in scale_fill_manual:
library(dplyr)
library(RColorBrewer)
# create look-up table with a palette name for each Class
pal_df <- data.frame(Class = c("Berries", "Fruit", "Vegetable"),
pal = c("RdPu", "Greens", "Blues"))
# generate one colour palette for each Class
df <- example %>%
group_by(Class) %>%
summarise(n = n_distinct(Name)) %>%
left_join(y = pal_df, by = "Class") %>%
rowwise() %>%
do(data.frame(., cols = colorRampPalette(brewer.pal(n = 3, name = .$pal))(.$n)))
# add colours to original data
df2 <- example %>%
arrange(as.integer(as.factor(Class))) %>%
cbind(select(df, cols)) %>%
mutate(Name = factor(Name, levels = Name))
# use colours in scale_fill_manual
ggplot(data = df2, aes(x = Name, y = Yield, fill = Name))+
geom_bar(stat = "identity") +
scale_fill_manual(values = df2$cols)
A possible extension would be to create separate legends for each 'Class scale'. See e.g. my previous attempts here (second example) and here.
You can use an alpha scale as a quick (albeit not perfect) way to change intensities of colour within a class:
library("ggplot2"); theme_set(theme_bw())
library("plyr")
## reorder
example <- mutate(example,
Name=factor(Name,levels=Name))
example <- ddply(example,"Class",transform,n=seq_along(Name))
g0 <- ggplot(example, aes(x = Name, y = Yield))
g0 + geom_bar(aes(fill = Class,alpha=factor(n)),stat = "identity")+
scale_alpha_discrete(guide=FALSE,range=c(0.5,1))