Here is the reproducible data that I'm using as an example.
Name <- c("Blueberry", "Raspberry", "Celery", "Apples", "Peppers")
Class <- c("Berries", "Berries", "Vegetable", "Fruit", "Vegetable")
Yield <- c(30, 20, 15, 25, 40)
example <- data.frame(Class = Class, Name = Name, Yield = Yield)
When plotted with ggplot2 we get ...
ggplot(example, aes(x = Name, y = Yield, fill = Name))+
geom_bar(stat = "identity")
It would be helpful if we could give similar fill colours to bars that share a class. For example, if Vegetables were shades of blue, Berries shades of pink, and Fruits shades of green, you could see the yield by class of plant but still visually distinguish the name (which is more important to us).
I feel that I could accomplish this with scale_fill_hue(), but I can't seem to get it to work:
ggplot(example, aes(x = Name, y = Yield))+
geom_bar(aes(fill = Class),stat = "identity")+
scale_fill_hue("Name")
The basic design in ggplot is one scale per aesthetic (see @hadley's opinion e.g. here). Thus, work-arounds are necessary in a case like yours. Here is one possibility where the fill colours are generated outside ggplot. I use colour palettes provided by the RColorBrewer package. You can easily check the different palettes here. dplyr functions are used for the actual data wrangling. The generated colours are then used in scale_fill_manual:
library(dplyr)
library(RColorBrewer)
# create look-up table with a palette name for each Class
pal_df <- data.frame(Class = c("Berries", "Fruit", "Vegetable"),
pal = c("RdPu", "Greens", "Blues"))
# generate one colour palette for each Class
df <- example %>%
group_by(Class) %>%
summarise(n = n_distinct(Name)) %>%
left_join(y = pal_df, by = "Class") %>%
rowwise() %>%
do(data.frame(., cols = colorRampPalette(brewer.pal(n = 3, name = .$pal))(.$n)))
# add colours to original data
df2 <- example %>%
arrange(as.integer(as.factor(Class))) %>%
cbind(select(df, cols)) %>%
mutate(Name = factor(Name, levels = Name))
# use colours in scale_fill_manual
ggplot(data = df2, aes(x = Name, y = Yield, fill = Name))+
geom_bar(stat = "identity") +
scale_fill_manual(values = df2$cols)
A possible extension would be to create separate legends for each 'Class scale'. See e.g. my previous attempts here (second example) and here.
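The same idea can be sketched without the do()/cbind() steps by building a named colour vector, so that scale_fill_manual matches colours to Name by name rather than by row order. This is only a minimal sketch: class_pal, names_by_class and name_cols are illustrative names that do not appear in the answer above, and the palette choices are assumptions.

```r
library(ggplot2)
library(RColorBrewer)

example <- data.frame(
  Class = c("Berries", "Berries", "Vegetable", "Fruit", "Vegetable"),
  Name  = c("Blueberry", "Raspberry", "Celery", "Apples", "Peppers"),
  Yield = c(30, 20, 15, 25, 40),
  stringsAsFactors = FALSE
)

# Assumed lookup: one RColorBrewer palette name per Class
class_pal <- c(Berries = "RdPu", Fruit = "Greens", Vegetable = "Blues")

# Names grouped by Class, e.g. $Berries holds "Blueberry" and "Raspberry"
names_by_class <- split(example$Name, example$Class)

# For each Class, interpolate one shade per Name and label it with that Name
shade_list <- Map(function(nms, cls) {
  ramp <- colorRampPalette(brewer.pal(n = 3, name = class_pal[[cls]]))
  setNames(ramp(length(nms)), nms)
}, names_by_class, names(names_by_class))

# Flatten to one named vector: names are plant Names, values are hex colours
name_cols <- unlist(unname(shade_list))

p <- ggplot(example, aes(x = Name, y = Yield, fill = Name)) +
  geom_col() +
  scale_fill_manual(values = name_cols)
```

Because the vector is named, the bars keep their colours even if the rows or factor levels of Name are reordered; print(p) draws the plot.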
You can use an alpha scale as a quick (albeit not perfect) way to change intensities of colour within a class:
library("ggplot2"); theme_set(theme_bw())
library("plyr")
## reorder
example <- mutate(example,
Name=factor(Name,levels=Name))
example <- ddply(example,"Class",transform,n=seq_along(Name))
g0 <- ggplot(example, aes(x = Name, y = Yield))
g0 + geom_bar(aes(fill = Class,alpha=factor(n)),stat = "identity")+
scale_alpha_discrete(guide="none",range=c(0.5,1))
I need to create a proportional bar chart from a data set created with the melt function from reshape2. By proportional, I mean a chart in which the height of each bar differs according to its proportion of the total. I would like the proportion to be computed for the value X generated by the following sample code, with fill based on "group". I have tried many online solutions, and I constantly run into not an error, but solid bars with no difference in proportions.
See sample code:
library(ggplot2)
library(tidyr)
set.seed(1)
example_matrix <-matrix(rpois(90,7), nrow=6,ncol=15)
example_df <- data.frame(example_matrix)
rownames(example_df) <-c('group1','group2','group3','group4','group5','group6')
df <- reshape2::melt(as.matrix(example_df))
library(data.table)
library(ggplot2)
set.seed(1)
example_matrix <-matrix(rpois(90,7), nrow=6,ncol=15)
example_df <- data.frame(example_matrix)
rownames(example_df) <-c('group1','group2','group3','group4','group5','group6')
df <- reshape2::melt(as.matrix(example_df))
df$fraction <- df$value/sum(df$value)
setDT(df)
p <-
ggplot(data = df, aes(x = Var2, y = fraction, fill = Var1)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_manual(values = c("blue", "red", "black", "orange", "pink", "yellow"))
print(p)
I don't think this can be done with the plotting commands directly; you'll need to transform your data beforehand. For example:
library(dplyr)
df <- df %>% group_by(Var2) %>% mutate(fraction = value/sum(value))
and then plot either with the ggplot solution from the other answer or here's a plotly version:
library(plotly)
plot_ly(data = df, x = ~Var2, y = ~fraction, color = ~Var1, type = 'bar')
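The per-column normalisation can also be sketched in base R, which makes it easy to check that the fractions within each Var2 sum to 1. This is a minimal sketch, not the answerers' code: it rebuilds the long data with as.data.frame(as.table()) instead of reshape2::melt, and the object names m and long are illustrative.

```r
library(ggplot2)

set.seed(1)
m <- matrix(rpois(90, 7), nrow = 6, ncol = 15,
            dimnames = list(paste0("group", 1:6), paste0("X", 1:15)))

# Long format with columns Var1 (group), Var2 (column) and value
long <- as.data.frame(as.table(m), responseName = "value")

# Share of each group within its own Var2 column
long$fraction <- ave(as.numeric(long$value), long$Var2,
                     FUN = function(v) v / sum(v))

# Sanity check: the fractions within every column add up to 1
col_sums <- tapply(long$fraction, long$Var2, sum)

p <- ggplot(long, aes(x = Var2, y = fraction, fill = Var1)) +
  geom_col(position = "dodge")
```

Dividing by sum(long$value) instead, as in the data.table answer, gives each bar's share of the grand total rather than of its own column.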
I am currently working on a comparison between two inventory levels and I want to plot two step graphs in the same grid with a color code. This is my code.
Intento1<-data.frame(Fecha, NivelI)
Intento2<-data.frame(Fecha, Nivel2)
#Printing the step graphs in one grid
ggplot()+geom_step(Intento1, mapping=aes(x=Fecha, y=NivelI))+geom_step(Intento2, mapping=aes(x=Fecha, y=Nivel2))
And it works fine, plotting both graphs in the same grid. I could also add a different color to each graph, but I couldn't add the little colored labels that normally appear at the right. All support is appreciated.
Take this example data, dummy:
library(data.table)
library(ggplot2)
library(magrittr) # provides %>%
dummy <- data.table(
Fecha = seq(as.Date("2020/1/1"), as.Date("2020/1/31"), "day")
)
dummy$NivelI = runif(31, 0, 10)
dummy$Nivel2 = runif(31, 0, 10)
Reshaping to long format with melt (as in reshape2::melt) and then plotting, as below, will work.
dummy %>%
melt(id.vars = "Fecha") %>%
ggplot(aes(Fecha, value, group = variable, color = variable)) +
geom_step() + guides(color = guide_legend(title = "aaa"))
In your case, if Fecha, NivelI and Nivel2 are vectors, build a data frame shaped like dummy:
df <- data.frame(
Fecha,
NivelI,
Nivel2
)
then
df %>%
melt(id.vars = "Fecha") %>%
ggplot(aes(Fecha, value, group = variable, color = variable)) +
geom_step() + guides(color = guide_legend(title = "aaa"))
where "aaa" will be your legend name.
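If reshaping is not convenient, a legend can also be obtained while keeping the two separate data frames, by mapping colour to a constant label inside aes() in each layer. The following is a minimal sketch with made-up data; series_cols and the colour choices are illustrative assumptions, not part of the original code.

```r
library(ggplot2)

set.seed(42)
Fecha    <- seq(as.Date("2020-01-01"), as.Date("2020-01-31"), by = "day")
Intento1 <- data.frame(Fecha = Fecha, NivelI = runif(31, 0, 10))
Intento2 <- data.frame(Fecha = Fecha, Nivel2 = runif(31, 0, 10))

# Illustrative colours, keyed by the labels used inside aes()
series_cols <- c(NivelI = "steelblue", Nivel2 = "firebrick")

# A constant string inside aes() becomes a legend entry for that layer
p <- ggplot() +
  geom_step(data = Intento1, aes(x = Fecha, y = NivelI, colour = "NivelI")) +
  geom_step(data = Intento2, aes(x = Fecha, y = Nivel2, colour = "Nivel2")) +
  scale_colour_manual(name = "aaa", values = series_cols)
```

The long-format approach scales better to many series; this variant is mainly useful when there are only two or three layers.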
I have a dataframe like this:
library(tidyverse)
my_data <- tibble(name = c("Justin", "Janet", "Marisa"),
x = c(100, 50, 75),
y = c(2, 3, 6))
Each name is unique, and I want to make a bar graph for each person without having to do it line by line. I also want to save each plot as a unique object because I'll be inserting it into a PowerPoint using the officer package. Last, the names won't always be the same, but each name will always be unique.
For instance, I want one plot for Janet, one plot for Justin, and one plot for Marisa. I don't want them faceted but instead as their own objects.
Any thoughts?
We can get the data in long format first and for each individual name create the plot.
library(tidyverse)
long_data <- my_data %>% tidyr::pivot_longer(cols = -name, names_to = 'col')
plots_list <- map(unique(my_data$name), ~long_data %>%
filter(name == .x) %>%
ggplot() + aes(name, value, fill = col) +
geom_bar(stat = 'identity', position = 'dodge') +
scale_fill_manual(values = c('red', 'blue')) +
ggtitle(paste0('Plot for ', .x)))
This will return a list of plots, where individual plots can be accessed via plots_list[[1]], plots_list[[2]], etc.
plots_list[[1]]
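Since the names change from run to run, it may be handier to look plots up by name instead of by position. Here is a minimal sketch of that variation using split() and lapply(); it builds the long data by hand rather than with pivot_longer, and the names long and plots_by_name are illustrative.

```r
library(ggplot2)

my_data <- data.frame(name = c("Justin", "Janet", "Marisa"),
                      x = c(100, 50, 75),
                      y = c(2, 3, 6),
                      stringsAsFactors = FALSE)

# Long format: one row per (name, measure)
long <- data.frame(name  = rep(my_data$name, times = 2),
                   col   = rep(c("x", "y"), each = nrow(my_data)),
                   value = c(my_data$x, my_data$y),
                   stringsAsFactors = FALSE)

# split() returns a list named by `name`, so the plots inherit those names
plots_by_name <- lapply(split(long, long$name), function(d) {
  ggplot(d, aes(x = name, y = value, fill = col)) +
    geom_col(position = "dodge") +
    ggtitle(paste("Plot for", d$name[1]))
})
```

An individual plot is then available as plots_by_name[["Janet"]], whatever order the names arrived in.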
I have multiple graphs that I am plotting with ggplot and then sending to plotly. I set the legend order based on the most recent date, so that one can easily interpret the graphs. Everything works great in generating the ggplot, but once I send it through ggplotly() the legend order reverts to the original factor levels. I tried resetting the factors, but this creates a new problem: the colors are different in each graph.
Here's the code:
Data:
Country <- c("CHN","IND","INS","PAK","USA")
a <- data.frame("Country" = Country,"Pop" = c(1400,1300,267,233,330),Year=rep(2020,5))
b <- data.frame("Country" = Country,"Pop" = c(1270,1000,215,152,280),Year=rep(2000,5))
c <- data.frame("Country" = Country,"Pop" = c(1100,815,175,107,250),Year=rep(1990,5))
Data <- bind_rows(a,b,c)
Legend Ordering Vector - This uses 2020 as the year to determine order.
Legend_Order <- Data %>%
filter(Year==max(Year)) %>%
arrange(desc(Pop)) %>%
select(Country) %>%
unlist() %>%
as.vector()
Then I create my plot and use Legend Order as breaks
Graph <- Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, group = Country, color = Country), size = 1.2) +
scale_color_discrete(name = 'Country', breaks = Legend_Order)
Graph
But then when I pass this on to:
ggplotly(Graph)
For some reason plotly ignores the breaks argument and uses the original factor levels.
If I set the factor levels beforehand, the color scheme changes (since the factors are in a different order).
How can I keep the color scheme from graph to graph, but change the legend order when using plotly?
Simply recode your Country variable as a factor with the levels set according to Legend_Order. Try this:
library(plotly)
library(dplyr)
Country <- c("CHN","IND","INS","PAK","USA")
a <- data.frame("Country" = Country,"Pop" = c(1400,1300,267,233,330),Year=rep(2020,5))
b <- data.frame("Country" = Country,"Pop" = c(1270,1000,215,152,280),Year=rep(2000,5))
c <- data.frame("Country" = Country,"Pop" = c(1100,815,175,107,250),Year=rep(1990,5))
Data <- bind_rows(a,b,c)
Legend_Order <- Data %>%
filter(Year==max(Year)) %>%
arrange(desc(Pop)) %>%
select(Country) %>%
unlist() %>%
as.vector()
Data$Country <- factor(Data$Country, levels = Legend_Order)
Graph <- Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, group = Country, color = Country), size = 1.2)
ggplotly(Graph)
To "lock in" the color assignment you can make use of a named color vector like so (for short I only show the ggplots):
# Fix the color assignments using a named color vector which can be assigned via scale_color_manual
cols <- scales::hue_pal()(5) # Default ggplot2 colors
cols <- setNames(cols, Legend_Order) # Set names according to legend order
# Plot with unordered Countries but "ordered" color assignment
Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, color = Country), size = 1.2) +
scale_color_manual(values = cols)
# Plot with ordered factor
Data$Country <- factor(Data$Country, levels = Legend_Order)
Data %>%
ggplot() +
geom_line(aes(x = Year, y = Pop, color = Country), size = 1.2) +
scale_color_manual(values = cols)
I am trying to plot two different datasets on the same plot. I am using this code to add the lines and to actually plot everything:
ggplot()+
geom_point(data=Acc, aes(x=Year, y=Accumulo), color="lightskyblue")+
geom_line(data=Acc, aes(x=Year, y=RM3), color="gold1")+
geom_line(data=Acc, aes(x=Year, y=RM5), color="springgreen3")+
geom_line(data=Acc, aes(x=Year, y=RM50), color="blue")+
geom_line(data=Vulcani, aes(x=Year, y=Accumulo.V), color="red")+
theme_bw()+
scale_x_continuous(expand=expand_scale(0)) + scale_y_continuous(limits=c(50,350),expand=expand_scale(0))
but I can't find any way to add a legend and custom labels for the different series. I found a way to add a legend for a single dataset, but I can't find a way to add a legend on the side for this one.
You are better off first creating a single dataset tailored to your plot needs, in long format, so that you can give a single geom_line() instruction and add colors to the lines with aes(color = ...) within the call to geom_line(). Here's an example with the midwest dataset (consider the two subsets as distinct datasets for the sake of example).
library(ggplot2)
library(dplyr)
library(tidyr)
long_midwest <- midwest %>%
select(popwhite, popasian, PID, poptotal) %>%
gather(key = "variable", value = "value", -PID, -poptotal) # convert to long format
long_midwest2 <- midwest %>%
select(poptotal, perchsd, PID) %>%
gather(key = "variable", value = "value", -PID, -poptotal)
plot_data <- bind_rows(long_midwest, long_midwest2) %>% # bind datasets vertically
mutate(line_type = ifelse(variable == 'perchsd', 'A', 'B')) # creates a line_type variable
ggplot(data = plot_data, aes(x=poptotal, y = value))+
geom_line(aes(color = variable, linetype = line_type)) +
scale_color_manual(
values = c('lightskyblue', 'gold1', 'blue'),
name = "My color legend"
) +
scale_linetype_manual(
values = c(3, 1), # play with the numbers to get the correct styling
name = "My linetype legend"
)
I added a line_type variable to show the most generic case, where you want a specific mapping between the column values and the line type. If it is the same as, say, variable, just use aes(color = variable, linetype = variable). You can then decide which linetype you want (see here for more details).
For customising the labels, just change the content of variable within the dataset with the desired values.
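One way to change the content of variable, sketched here on made-up data, is to recode it as a factor whose labels are the text you want in the legend. The series names, display labels and colours below are placeholders, not values from the question.

```r
library(ggplot2)

# Made-up long-format data with two series
plot_data <- data.frame(
  x        = rep(1:5, times = 2),
  value    = c(1, 3, 2, 5, 4, 2, 2, 3, 3, 5),
  variable = rep(c("RM3", "RM5"), each = 5),
  stringsAsFactors = FALSE
)

# Placeholder display labels; level order also sets the legend order
plot_data$variable <- factor(plot_data$variable,
                             levels = c("RM3", "RM5"),
                             labels = c("Series A", "Series B"))

# Unnamed colour values are assigned to the factor levels in order
p <- ggplot(plot_data, aes(x = x, y = value, colour = variable)) +
  geom_line() +
  scale_colour_manual(values = c("gold1", "springgreen3"),
                      name = "My color legend")
```

Recoding in the data keeps the labels consistent across every scale that uses variable, such as colour and linetype together.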