Group alluvia in the R alluvial diagram


In the alluvial package, is it possible to combine alluvia that have the same source and target nodes? For example, the two dark alluvia in the image below both go through AB and 3.
Edit: Here is an example using the Titanic dataset, which shows the same behaviour:
# Titanic data
library(alluvial)
tit <- as.data.frame(Titanic)
tit3d <- aggregate(Freq ~ Class + Sex + Survived, data = tit, sum)
# keep the default order on the outer axes, sort the middle axis by Sex and Survived
ord <- list(NULL, with(tit3d, order(Sex, Survived)), NULL)
alluvial(tit3d[, 1:3], freq = tit3d$Freq, alpha = 1, xw = 0.2,
         col = ifelse(tit3d$Survived == "No", "red", "gray"),
         layer = tit3d$Sex != "Female",
         border = "white", ordering = ord)

It looks like the ggalluvial package has a geom_flow which resets at each category break. That might be closer to what you want. For example:
# reshape data: one row per alluvium (id) and axis (category)
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggalluvial)
dd <- tit3d %>% mutate(id = 1:n(), sc = Survived) %>%
  gather("category", "value", -c(id, Freq, sc))
# draw plot
ggplot(dd, aes(x = category, stratum = value, alluvium = id,
               label = value)) +
  geom_flow(aes(fill = sc)) +
  geom_stratum(alpha = .5) + geom_text(stat = "stratum", size = 3) +
  theme_minimal()
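To see what "resetting at each category break" amounts to in terms of the data, the frequencies can be summed over each adjacent pair of axes. This is only a base-R sketch of the idea, not what ggalluvial runs internally; the object names flows_class_sex and flows_sex_survived are made up for illustration.

```r
# Base R only: sum frequencies over each adjacent pair of axes.
tit <- as.data.frame(Titanic)
tit3d <- aggregate(Freq ~ Class + Sex + Survived, data = tit, sum)

# One row per (Class, Sex) pair: alluvia sharing both endpoints are merged
flows_class_sex <- aggregate(Freq ~ Class + Sex, data = tit3d, sum)

# One row per (Sex, Survived) pair for the second break
flows_sex_survived <- aggregate(Freq ~ Sex + Survived, data = tit3d, sum)

print(flows_class_sex)
print(flows_sex_survived)
```

Each row of these tables corresponds to one combined band between two neighbouring axes, which is the grouping the question asks for.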

Related

How to reorder plots in combined ggplot2 graph?

I have seen the solutions to reordering subplots when it's just one object being plotted (e.g. mydata), but I am not sure how to do this when there are multiple objects being plotted (in this instance, mydata1 and mydata2). I would like to switch the order of the violins such that Treatment2 is on the left, and Treatment1 is on the right, instead of vice-versa like I currently have it:
mycp <- ggplot() + geom_violin(data = mydata1, aes(x= treatment, y = Myc_List1, fill = Myc_List1, colour="Myc Pathway (Treatment1)")) +
geom_violin(data = mydata2, aes(x= treatment, y = Myc_List1, fill = Myc_List1, colour = "Myc Pathway (Treatment2)"))
When I try solutions such as in Ordering of bars in ggplot, or the following solution posed at https://www.r-graph-gallery.com/22-order-boxplot-labels-by-names.html, this graph remains unchanged.
Hopefully this makes sense, and thank you for reading!
UPDATE
Here is another solution as well from https://www.datanovia.com/en/blog/how-to-change-ggplot-legend-order/
mydata$treatment<- factor(mydata$treatment, levels = c("Treatment2", "Treatment1"))

I'm not sure how to reorder factors in this case, but you can change the x axis scale to get the desired result, e.g.
library(tidyverse)
data("Puromycin")
dat1 <- Puromycin %>%
filter(state == "treated")
dat2 <- Puromycin %>%
filter(state == "untreated")
mycp <- ggplot() +
geom_violin(data = dat1, aes(x= state, y = conc, colour = "Puromycin (Treatment1)")) +
geom_violin(data = dat2, aes(x= state, y = conc, colour = "Puromycin (Treatment2)"))
mycp
mycp2 <- ggplot() +
geom_violin(data = dat1, aes(x = state, y = conc, colour = "Puromycin (Treatment1)")) +
geom_violin(data = dat2, aes(x = state, y = conc, colour = "Puromycin (Treatment2)")) +
scale_x_discrete(limits = c("untreated", "treated"))
mycp2

Stack the data into a single data frame and set the order by converting treatment to a factor. In your example, the colors and legend are redundant, since you can label the x-axis values to describe each treatment, or change the x-axis title to "Myc Pathway", but the code below in any case shows how to get the ordering.
library(tidyverse)
bind_rows(mydata1, mydata2) %>%
  # levels are given in the desired left-to-right order
  mutate(treatment = factor(treatment, levels = paste0("Treatment", c(2, 1)))) %>%
  ggplot(aes(treatment, Myc_List1, colour = treatment)) +
  geom_violin()
Here's a reproducible example:
library(tidyverse)
theme_set(theme_bw(base_size=15))
# Create two separate data frames to start with
d1=iris %>% filter(Species=="setosa")
d2=iris %>% filter(Species=="versicolor")
bind_rows(d1, d2) %>%
mutate(Species = factor(Species, levels=c("versicolor", "setosa"))) %>%
ggplot(aes(Species, Petal.Width, colour=Species)) +
geom_violin()
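The reordering works because a discrete axis in ggplot2 follows the order of the factor levels, and by default factor() sorts the levels alphabetically. A small base-R sketch with made-up treatment labels shows the difference:

```r
# Hypothetical treatment labels, in the order they appear in the data
treatment <- c("Treatment1", "Treatment2", "Treatment1", "Treatment2")

# Default: levels are sorted alphabetically
f_default <- factor(treatment)

# Explicit levels: the order given here is the order used on the axis
f_custom <- factor(treatment, levels = paste0("Treatment", c(2, 1)))

print(levels(f_default))
print(levels(f_custom))
```

The underlying values are unchanged; only the level order, and therefore the plotting order, differs.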

Two ggplot with subset in pipe

I would like to plot two lines in one plot (both have the same axes), but one of the lines uses a subset of the values from the data frame.
I tried this
DF%>% ggplot(subset(., Cars == "A"), aes(Dates, sold_A)) +geom_line()+ ggplot(., (Dates, sold_ALL))
but this error occurred
object '.' not found

(1) You can't add a ggplot object to a ggplot object.
(2) Try taking the subset out of the call to ggplot:
DF %>%
subset(Cars == "A") %>%
ggplot(aes(Dates, sold_A)) +
geom_line() +
geom_line(data = DF, aes(Dates, sold_ALL))

I think you are misunderstanding how ggplot works. If we are attempting to do it your way, we could do:
DF %>% {ggplot(subset(., Cars == "A"), aes(Dates, sold_A)) +
geom_line(colour = "red") +
geom_line(data = subset(., Cars == "B"), colour = "blue") +
lims(y = c(0, 60))}
But it would be easier and better to map the variable Cars to the colour aesthetic, so your plot would be as simple as:
DF %>% ggplot(aes(Dates, sold_A, color = Cars)) + geom_line() + lims(y = c(0, 60))
Note that as well as being simpler code, we get the legend for free.
Data
Obviously, we didn't have your data for this question, but here is a constructed data set with the same name and same column variables:
set.seed(1)
Dates <- rep(seq(as.Date("2020-01-01"), by = "day", length = 20), 2)
Cars <- rep(c("A", "B"), each = 20)
sold_A <- rpois(40, rep(c(20, 40), each = 20))
DF <- data.frame(Dates, Cars, sold_A)

If you want only one plot, you would need to remove ggplot(., aes(Dates, sold_ALL)) and replace it with a layer such as geom_line(data=., aes(Dates, sold_ALL)), wrapping the whole expression in braces. Then, use the sage advice from @MrFlick. Here is an example using the iris data:
library(ggplot2)
library(dplyr)
#Example
iris %>%
{ggplot(subset(., Species == "setosa"), aes(Sepal.Length, Sepal.Width)) +
geom_point()+
geom_point(data=.,aes(Petal.Length, Petal.Width),color='blue')}
In the original code, the second call, ggplot(., aes(Dates, sold_ALL)), creates a new canvas and a new plot instead of adding a layer to the first one.
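The "object '.' not found" error also follows from operator precedence: %>% binds more tightly than +, so without braces only the first call is part of the pipeline and anything after + is evaluated outside it. A minimal sketch with a made-up numeric vector (no ggplot involved) shows both cases:

```r
library(magrittr)

v <- c(1, 4, 9)

# With braces, the whole expression is the right-hand side of the pipe,
# so `.` refers to v everywhere inside it.
inside <- v %>% { sqrt(.) + length(.) }

# Without braces this parses as (v %>% sqrt(.)) + length(.),
# and the second `.` is looked up outside the pipeline.
outside <- tryCatch(
  v %>% sqrt(.) + length(.),
  error = function(e) conditionMessage(e)
)

print(inside)
print(outside)
```

Here inside is a numeric vector, while outside holds the error message caught by tryCatch.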

Why do I not get two legends using ggplot2?

I am plotting different models' prediction lines over some data points. I would like one legend showing which individual each point colour belongs to, and another showing which model each line colour belongs to. Below is a fake example for reproducibility:
library(lme4)    # lmer()
library(ggplot2)
set.seed(123)
df <- data.frame(Height =rnorm(500, mean=175, sd=15),
Weight =rnorm(500, mean=70, sd=20),
ID = rep(c("A","B","C","D"), (500/4)))
mod1 <- lmer(Height ~ Weight + (1|ID), df)
mod2 <- lmer(Height ~ poly(Weight,2) + (1|ID), df)
y.mod1 <- predict(mod1, data.frame(Weight=df$Weight),re.form=NA) # Prediction of y according to model 1
y.mod2 <- predict(mod2, data.frame(Weight=df$Weight),re.form=NA) # Prediction of y according to model 2
df <- cbind(df, y.mod1,y.mod2)
df <- as.data.frame(df)
head(df)
    Height   Weight ID   y.mod1   y.mod2
1 166.5929 57.96214  A 175.9819 175.4918
2 171.5473 50.12603  B 176.2844 176.3003
3 198.3806 90.53570  C 174.7241 174.7082
4 176.0576 85.02123  D 174.9371 174.5487
5 176.9393 39.81667  A 176.6825 177.7303
6 200.7260 68.09705  B 175.5905 174.8027
First I plot my data points:
Plot_a <- ggplot(df,aes(x=Weight, y=Height,colour=ID)) +
geom_point() +
theme_bw() +
guides(color=guide_legend(override.aes=list(fill=NA)))
Plot_a
Then, I add lines relative to the prediction models:
Plot_b <- Plot_a +
geom_line(data = df, aes(x=Weight, y=y.mod1,color='mod1'),show.legend = T) +
geom_line(data = df, aes(x=Weight, y=y.mod2,color='mod2'),show.legend = T) +
guides(fill = guide_legend(override.aes = list(linetype = 0)),
color=guide_legend(title=c("Model")))
Plot_b
Does anyone know why I am not getting two different legends, one titled Model and the other ID?
I would like to get this

This type of problem generally comes down to reshaping the data. The data are in wide format and should be in long format. See this post on how to reshape the data from wide to long format.
The plot layers become simpler: one geom_line is enough, and there is no need for guides to override the aesthetics.
To customize the models' legend text, create a vector of labels, in this case with plotmath, in order to have math notation. The colors are set manually too.
library(dplyr)
library(tidyr)
library(ggplot2)
model_labels <- c(expression(X^1), expression(X^2))
df %>%
pivot_longer(
cols = c(y.mod1, y.mod2),
names_to = "Model",
values_to = "Value"
) %>%
ggplot(aes(Weight, Height)) +
geom_point(aes(fill = ID), shape = 21) +
geom_line(aes(y = Value, color = Model)) +
scale_color_manual(labels = model_labels,
values = c("coral", "coral4")) +
theme_bw()
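To make the wide-versus-long distinction concrete, here is a small sketch of the reshaping step on its own. The three-row data frame is made up for illustration; only the column names follow the question.

```r
library(tidyr)

# Toy wide-format data (hypothetical values): one column per model
wide <- data.frame(
  Weight = c(50, 60, 70),
  y.mod1 = c(176.3, 175.9, 175.5),
  y.mod2 = c(176.4, 175.4, 174.8)
)

# Long format: one row per (Weight, Model) pair, predictions in one column
long <- pivot_longer(
  wide,
  cols = c(y.mod1, y.mod2),
  names_to = "Model",
  values_to = "Value"
)

print(long)
```

With the model name stored in a single column, it can be mapped to the colour aesthetic once, which is why a single geom_line suffices.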

The issue is that in ggplot2 each aesthetic can have only one scale and therefore only one legend. As you are using only the color aesthetic, you get one legend. If you want multiple legends for the same aesthetic, have a look at the ggnewscale package. Otherwise you have to make use of a second aesthetic.
My preferred approach would be similar to the one proposed by @RuiBarradas. However, to stick close to your approach, this could be achieved like so:
Instead of color, map on linetype in your calls to geom_line.
Set the colors for the lines as arguments, i.e. not inside aes.
Make use of scale_linetype_manual to get solid lines for both models.
Make use of guide_legend to fix the colors appearing in the legend.
library(ggplot2)
library(lme4)
#> Loading required package: Matrix
set.seed(123)
df <- data.frame(Height =rnorm(500, mean=175, sd=15),
Weight =rnorm(500, mean=70, sd=20),
ID = rep(c("A","B","C","D"), (500/4)))
mod1 <- lmer(Height ~ Weight + (1|ID), df)
mod2 <- lmer(Height ~ poly(Weight,2) + (1|ID), df)
y.mod1 <- predict(mod1, data.frame(Weight=df$Weight),re.form=NA) # Prediction of y according to model 1
y.mod2 <- predict(mod2, data.frame(Weight=df$Weight),re.form=NA) # Prediction of y according to model 2
df <- cbind(df, y.mod1,y.mod2)
df <- as.data.frame(df)
Plot_a <- ggplot(df) +
geom_point(aes(x=Weight, y=Height, colour=ID)) +
theme_bw() +
guides(color=guide_legend(override.aes=list(fill=NA)))
line_colors <- scales::hue_pal()(2)
Plot_b <- Plot_a +
geom_line(aes(x=Weight, y=y.mod1, linetype = "mod1"), color = line_colors[1]) +
geom_line(aes(x=Weight, y=y.mod2, linetype = "mod2"), color = line_colors[2]) +
scale_linetype_manual(values = c(mod1 = "solid", mod2 = "solid")) +
labs(color = "ID", linetype = "Model") +
guides(linetype = guide_legend(override.aes = list(color = line_colors)))
Plot_b
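For completeness, the ggnewscale route mentioned at the start of this answer might look roughly like the sketch below. It assumes the ggnewscale package is installed and uses its new_scale_colour() function; the data are made up, and the straight lines in pred are hypothetical stand-ins for the lmer predictions.

```r
library(ggplot2)
library(ggnewscale)

set.seed(123)
# Hypothetical points, loosely modelled on the question's data
pts <- data.frame(
  Weight = rnorm(40, mean = 70, sd = 20),
  Height = rnorm(40, mean = 175, sd = 15),
  ID = rep(c("A", "B", "C", "D"), 10)
)

# Hypothetical straight "prediction" lines, one per model
pred <- data.frame(
  Weight = rep(range(pts$Weight), 2),
  Value = c(176, 174, 178, 173),
  Model = rep(c("mod1", "mod2"), each = 2)
)

p <- ggplot(pts, aes(Weight, Height)) +
  geom_point(aes(colour = ID)) +   # first colour scale: ID
  new_scale_colour() +             # layers added after this get a fresh colour scale
  geom_line(data = pred, aes(y = Value, colour = Model)) +
  scale_colour_manual(name = "Model",
                      values = c(mod1 = "coral", mod2 = "coral4"))
p
```

This keeps colour as the aesthetic for both points and lines while still producing two separate legends.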

Standard evaluation inside a function with dplyr

I have data with lots of factor variables that I am visualising to get a feel for each of the variables. I am reproducing a lot of the code with minor tweaks for variable names etc., so I decided to write a function to simplify things. I just can't get it to work...
Dummy Data
ID <- sample(1:32, 128, replace = TRUE)
AgeGrp <- sample(c("18-65", "65-75", "75-85", "85+"), 128, replace = TRUE)
ID <- factor(ID)
AgeGrp <- factor(AgeGrp)
data <- data_frame(ID, AgeGrp)
data
Basically what I am trying to do with each factor variable is produce a bar chart with labels of percentages inside the bars. For example with the dummy data.
plotstats <- #Create a table with pre-summarised percentages
data %>%
group_by(AgeGrp) %>%
summarise(count = n()) %>%
mutate(pct = count/sum(count)*100)
age_plot <- #Plot the data
ggplot(data,aes(x = AgeGrp)) +
geom_bar() + #Add the percentage labels using pre-summarised table
geom_text(data = plotstats, aes(label=paste0(round(pct,1),"%"),y=pct),
size=3.5, vjust = -1, colour = "sky blue") +
ggtitle("Count of Age Group")
age_plot
This works fine with the dummy data - but when I try to create a function...
basic_plot <-
function(df, x){
plotstats <-
df %>%
group_by_(x) %>%
summarise_(
count = ~n(),
pct = ~count/sum(count)*100)
plot <-
ggplot(df,aes(x = x)) +
geom_bar() +
geom_text(data = plotstats, aes(label=paste0(round(pct,1),"%"),
y=pct), size=3.5, vjust = -1, colour = "sky blue")
plot
}
basic_plot(data, AgeGrp)
I get the error:
Error in UseMethod("as.lazy") : no applicable method for 'as.lazy' applied to an object of class "factor"
I have looked at questions here, here, and here and also looked at the NSE Vignette but can't find my fault.
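A likely cause is that group_by_() expects a string or formula, whereas basic_plot(data, AgeGrp) passes the bare name, which is evaluated to the factor AgeGrp from the global environment. One possible rewrite, assuming a reasonably recent dplyr and ggplot2 that support the {{ }} (embrace) operator, is sketched below. The data frame d is made up, and the labels are drawn at the bar height (count) rather than at pct, which is a deliberate change from the original code.

```r
library(dplyr)
library(ggplot2)

# Sketch of a tidy-evaluation version; `x` is passed unquoted
basic_plot <- function(df, x) {
  # Pre-summarise counts and percentages for the chosen column
  plotstats <- df %>%
    count({{ x }}, name = "count") %>%
    mutate(pct = count / sum(count) * 100)

  ggplot(plotstats, aes(x = {{ x }}, y = count)) +
    geom_col() +
    geom_text(aes(label = paste0(round(pct, 1), "%")),
              size = 3.5, vjust = -0.5, colour = "sky blue")
}

# Hypothetical dummy data, similar in shape to the question's
set.seed(42)
d <- data.frame(
  AgeGrp = factor(sample(c("18-65", "65-75", "75-85", "85+"), 128, replace = TRUE))
)

p <- basic_plot(d, AgeGrp)
p
```

The same function can then be reused for any other factor column by changing the second argument.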

Group similar factors - fills ggplot2

Here is the reproducible data that I'm using as an example.
Name <- c("Blueberry", "Raspberry", "Celery", "Apples", "Peppers")
Class <- c("Berries", "Berries", "Vegetable", "Fruit", "Vegetable")
Yield <- c(30, 20, 15, 25, 40)
example <- data.frame(Class = Class, Name = Name, Yield = Yield)
When plotted with ggplot2 we get ...
ggplot(example, aes(x = Name, y = Yield, fill = Name))+
geom_bar(stat = "identity")
It would be helpful if we could give fills of similar colour to those that have the same class. For example, if Vegetables were shades of blue, Berries were shades of pink, and Fruits were shades of green you could see the yield by class of plants but still visually see the name (which is more important to us)
I feel that I could accomplish this with scale_fill_hue() but I can't seem to get it to work
ggplot(example, aes(x = Name, y = Yield))+
geom_bar(aes(fill = Class),stat = "identity")+
scale_fill_hue("Name")

The basic design in ggplot is one scale per aesthetic (see @hadley's opinion e.g. here). Thus, work-arounds are necessary in a case like yours. Here is one possibility where fill colors are generated outside ggplot. I use color palettes provided by the package RColorBrewer. You can easily check the different palettes here. dplyr functions are used for the actual data massage. The generated colours are then used in scale_fill_manual:
library(dplyr)
library(RColorBrewer)
# create look-up table with a palette name for each Class
pal_df <- data.frame(Class = c("Berries", "Fruit", "Vegetable"),
pal = c("RdPu", "Greens", "Blues"))
# generate one colour palette for each Class
df <- example %>%
group_by(Class) %>%
summarise(n = n_distinct(Name)) %>%
left_join(y = pal_df, by = "Class") %>%
rowwise() %>%
do(data.frame(., cols = colorRampPalette(brewer.pal(n = 3, name = .$pal))(.$n)))
# add colours to original data
df2 <- example %>%
arrange(as.integer(as.factor(Class))) %>%
cbind(select(df, cols)) %>%
mutate(Name = factor(Name, levels = Name))
# use colours in scale_fill_manual
ggplot(data = df2, aes(x = Name, y = Yield, fill = Name))+
geom_bar(stat = "identity") +
scale_fill_manual(values = df2$cols)
A possible extension would be to create separate legends for each 'Class scale'. See e.g. my previous attempts here (second example) and here.
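The shade-generation step can also be tried in isolation with base R's colorRampPalette. In this sketch the endpoint colours in class_ramps are arbitrary choices for illustration, not the RColorBrewer palettes used above.

```r
# Data from the question
example <- data.frame(
  Class = c("Berries", "Berries", "Vegetable", "Fruit", "Vegetable"),
  Name = c("Blueberry", "Raspberry", "Celery", "Apples", "Peppers"),
  Yield = c(30, 20, 15, 25, 40)
)

# Hypothetical light-to-dark colour ramp for each Class
class_ramps <- list(
  Berries = c("pink", "deeppink4"),
  Fruit = c("lightgreen", "darkgreen"),
  Vegetable = c("lightblue", "darkblue")
)

# Sort by Class so rows line up with the per-class shades generated below
example <- example[order(example$Class), ]
counts <- table(example$Class)

# For each Class, interpolate as many shades as it has Names
example$cols <- unlist(lapply(names(counts), function(cl) {
  grDevices::colorRampPalette(class_ramps[[cl]])(counts[[cl]])
}))

print(example)
```

The resulting cols column can be passed to scale_fill_manual(values = ...) in the same way as df2$cols, provided Name is converted to a factor whose levels follow the row order.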

You can use an alpha scale as a quick (albeit not perfect) way to change intensities of colour within a class:
library("ggplot2"); theme_set(theme_bw())
library("plyr")
## reorder
example <- mutate(example,
Name=factor(Name,levels=Name))
example <- ddply(example,"Class",transform,n=seq_along(Name))
g0 <- ggplot(example, aes(x = Name, y = Yield))
g0 + geom_bar(aes(fill = Class,alpha=factor(n)),stat = "identity")+
scale_alpha_discrete(guide = "none", range = c(0.5, 1))  # guide = FALSE is deprecated in recent ggplot2
