Removing Empty Facet Categories - r

I am having trouble making my faceted plot display only the facets that contain data, rather than also displaying facets with no data.
The following code:
p<- ggplot(spad.data, aes(x=Day, y=Mean.Spad, color=Inoc))+
geom_point()
p + facet_grid(N ~ X.CO2.)
Gives the following graphic:
I have played around with it for a while but can't seem to figure out a solution.
Dataframe viewable here: https://docs.google.com/spreadsheets/d/11ZiDVRAp6qDcOsCkHM9zdKCsiaztApttJIg1TOyIypo/edit?usp=sharing
Reproducible Example viewable here: https://docs.google.com/document/d/1eTp0HCgZ4KX0Qavgd2mTGETeQAForETFWdIzechTphY/edit?usp=sharing

Your issue lies in the missing observations for your x and y variables. Missing x or y values do not influence which facets are created; that is determined only by the levels of the faceting variables present in the data. Here is an illustration using sample data:
# generate some data
library(ggplot2)
nobs <- 100
set.seed(123)
dat <- data.frame(G1 = sample(LETTERS[1:3], nobs, replace = TRUE),
                  G2 = sample(LETTERS[1:3], nobs, replace = TRUE),
                  x = rnorm(nobs),
                  y = rnorm(nobs))
# introduce some missing values in one group
dat$x[dat$G1 == "C"] <- NA
# attempt to plot
p1 <- ggplot(dat, aes(x = x, y = y)) + facet_grid(G1 ~ G2) + geom_point()
p1 # facets are generated according to the levels of the grouping factors present in the data
# possible solution: remove the rows with missing data before plotting
p2 <- ggplot(dat[complete.cases(dat), ], aes(x = x, y = y)) + facet_grid(G1 ~ G2) + geom_point()
p2
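Before plotting, it can help to check which facet cells will survive the filtering. The following is a minimal base R sketch on the same sample data; the names cells_all, cells_ok and dat_ok are introduced here only for illustration.

```r
# Rebuild the sample data from the answer above
set.seed(123)
nobs <- 100
dat <- data.frame(G1 = sample(LETTERS[1:3], nobs, replace = TRUE),
                  G2 = sample(LETTERS[1:3], nobs, replace = TRUE),
                  x  = rnorm(nobs),
                  y  = rnorm(nobs))
dat$x[dat$G1 == "C"] <- NA

# Facet cells implied by the raw data (row "C" is included although x is all NA)
cells_all <- unique(dat[, c("G1", "G2")])

# Facet cells left once incomplete rows are dropped
dat_ok   <- dat[complete.cases(dat), ]
cells_ok <- unique(dat_ok[, c("G1", "G2")])

# Compare the faceting levels before and after filtering
sort(unique(cells_all$G1))
sort(unique(cells_ok$G1))
```

If a level disappears from the second listing, the corresponding facet row will not be drawn when dat_ok is passed to ggplot().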

Related

Filtering data into subplots using ggplot2

I have a dataset of three variables: year, age group and result. There are 9 different age groups. What I am trying to do is to create a 3x3 plot with 9 subplots, each showing a geom_line() plot of one age group. So far I have ended up with a 3x3 plot in which every subplot shows all of the results, which was my step 1. However, I can't seem to find a way to do the last little bit to create my plot. Help is appreciated.
Here is my code so far:
# n is a list of unique values aka age groups in the Data1
n <- unique(Data1$age)
# Preparation for plotting all subplots
plot_lst <- vector("list", length = length(n))
for (i in 1:length(n)) {
g <- ggplot(Data1, aes(x=year, y=data, color=age)) +
geom_line()
plot_lst[[i]] <- g
}
# Plotting all subplots into one
cowplot::plot_grid(plotlist = plot_lst, ncol = 3)
I know it is currently missing the filtering which I mainly tried to do within the geom_line() but I wasn't able to find a working solution.
Found a solution with:
ggplot(Data1, aes(year, data)) +
geom_line() +
geom_point() +
facet_wrap(~ age)
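For completeness, the loop-based attempt can also be made to work by filtering the data inside each iteration rather than inside geom_line(). This is only a sketch: the Data1 built below is made-up stand-in data with the column names from the question (year, age, data), not the original dataset.

```r
library(ggplot2)

# Hypothetical stand-in for Data1: 9 age groups, 10 years each
set.seed(1)
Data1 <- expand.grid(year = 2001:2010,
                     age  = paste("group", 1:9),
                     stringsAsFactors = FALSE)
Data1$data <- rnorm(nrow(Data1))

n <- unique(Data1$age)

# One plot per age group, each built from only that group's rows
plot_lst <- lapply(n, function(a) {
  ggplot(Data1[Data1$age == a, ], aes(x = year, y = data)) +
    geom_line() +
    ggtitle(a)
})

# Combine as in the question (requires the cowplot package):
# cowplot::plot_grid(plotlist = plot_lst, ncol = 3)
```

The facet_wrap() solution is shorter and shares axes automatically; the list-of-plots route is mainly useful when each panel needs its own customisation.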

How do I add a separate legend for each variable in geom_tile?

I would like to have a separate scale bar for each variable.
I have measurements taken throughout the water column for which the means have been calculated into 50cm bins. I would like to use geom_tile to show the variation of each variable in each bin throughout the water column, so the plot has the variable (categorical) on the x-axis, the depth on the y-axis and a different colour scale for each variable representing the value. I am able to do this for one variable using
ggplot(data, aes(x=var, y=depth, fill=value, color=value)) +
geom_tile(size=0.6)+ theme_classic()+scale_y_continuous(limits = c(0,11), expand = c(0, 0))
But if I put all variables onto one plot, the legend is scaled to the min and max of all values so the variation between bins is lost.
To provide a reproducible example, I have used mtcars and mapped alpha to the value, which, of course, doesn't help much because the scale of each variable is so different
data("mtcars")
# STACKS DATA
library(reshape2)
dat2b <- melt(mtcars, id.vars=1:2)
dat2b
ggplot(dat2b) +
geom_tile(aes(x=variable , y=cyl, fill=variable, alpha = value))
Which produces
Is there a way I can add a scale bar for each variable on the plot?
This question is similar to others (e.g. here and here), but they do not use a categorical variable on the x-axis, so I have not been able to modify them to produce the desired plot.
Here is a mock-up of the plot I have in mind using just four of the variables, except I would have all legends horizontal at the bottom of the plot using theme(legend.position="bottom")
Hope this helps:
The function myfun was originally posted by Duck here: R ggplot heatmap with multiple rows having separate legends on the same graph
library(purrr)
library(ggplot2)
library(patchwork)
data("mtcars")
# STACKS DATA
library(reshape2)
dat2b <- melt(mtcars, id.vars=1:2)
dat2b
#Split into list
List <- split(dat2b,dat2b$variable)
#Function for plots
myfun <- function(x)
{
  G <- ggplot(x, aes(x=variable, y=cyl, fill = value)) +
    geom_tile() +
    theme(legend.direction = "vertical", legend.position="bottom")
  return(G)
}
#Apply
List2 <- lapply(List,myfun)
#Plot
reduce(List2, `+`)+plot_annotation(title = 'My plot')
patchwork::wrap_plots(List2)
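The question asked for horizontal legends at the bottom of the plot. A possible variation on the approach above is sketched here, assuming four arbitrarily chosen mtcars columns and taking the mean per cyl so that each tile holds a single value; the names vars and plots are illustrative only.

```r
library(ggplot2)
library(patchwork)

# Four variables picked only for illustration
vars <- c("disp", "hp", "drat", "wt")

plots <- lapply(vars, function(v) {
  # Mean of variable v within each cyl group
  d <- aggregate(mtcars[v], by = list(cyl = mtcars$cyl), FUN = mean)
  names(d)[2] <- "value"
  d$variable <- v

  ggplot(d, aes(x = variable, y = cyl, fill = value)) +
    geom_tile() +
    labs(x = NULL, fill = v) +
    theme(legend.position = "bottom", legend.direction = "horizontal")
})

# One column of tiles per variable, each with its own legend underneath
wrap_plots(plots, nrow = 1)
```

Because every variable is drawn in its own plot, each fill scale spans only that variable's range, so variation between bins stays visible.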

How to wrap a wrapped plot plus another plot? [duplicate]

I want the six plots in one plot. And I would like to specify the titles of each plot. How can I do that?
p<-ggplot(df, aes(x=COD_NEIGHB))+
geom_bar(stat="count", width=0.3, fill="steelblue")+
theme_minimal()
# histogram of the strata in the whole dataset
s<-ggplot(data = df, mapping = aes(x = COD_NEIGHB)) +
geom_bar(stat="count", width=0.3, fill="steelblue")+
facet_wrap(~ fold)
plot_grid(p, s, ncol=2,label_size = 2)
After that, I did the suggestion
df$fold <- as.character(df$fold)
# Duplicate data. Set category in the duplicated dataset to "all"
df_all <- df
df_all$fold <- "all"
# Row bind the datasets
df_all <- rbind(df, df_all)
ggplot(df_all, aes(x=COD_NEIGHB)) +
geom_bar(stat="count", width=0.3, fill="steelblue")+
facet_wrap(~fold)
But now the problem is the scale: the y-axis has to be on the proper scale for each panel.
Any ideas?
Thanks in advance!
If I understand you correctly, you want a plot with facets by category plus an additional facet showing the total data. One option to achieve this is to duplicate your dataset and add an additional category "all".
As no example data was provided I make use of mtcars to show you the basic idea:
library(ggplot2)
mtcars$cyl <- as.character(mtcars$cyl)
# Duplicate data. Set category in the duplicated dataset to "all"
mtcars_all <- mtcars
mtcars_all$cyl <- "all"
# Row bind the datasets
mtcars_all <- rbind(mtcars, mtcars_all)
ggplot(mtcars_all, aes(hp, mpg)) +
geom_point() +
facet_wrap(~cyl)
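On the follow-up about the y-axis scale: if "proper scale" means that each panel should get its own y-axis range, so that the larger counts in the "all" panel do not flatten the others, the scales argument of facet_wrap() may be enough. A minimal sketch with mtcars, using gear as an arbitrary stand-in for COD_NEIGHB and working on a copy named cars:

```r
library(ggplot2)

# Work on a copy so mtcars itself is left untouched
cars <- mtcars
cars$cyl <- as.character(cars$cyl)

# Duplicate the data and label the copy "all"
cars_all <- cars
cars_all$cyl <- "all"
cars_all <- rbind(cars, cars_all)

# scales = "free_y" lets each facet choose its own y-axis range
ggplot(cars_all, aes(x = factor(gear))) +
  geom_bar(width = 0.3, fill = "steelblue") +
  facet_wrap(~cyl, scales = "free_y")
```

The trade-off is that bar heights are no longer directly comparable across panels, so the axis labels have to be read for each facet.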
Another useful tool is the ggarrange() function from the ggpubr package. It can arrange multiple plots on one page or across multiple pages, and it can create a single common legend for the combined plots.
Similar to previous answers, I used mtcars to demonstrate a simple use case:
# install.packages("ggpubr")
library(ggplot2)
library(ggpubr)
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  theme_minimal()
# cyl is numeric, so convert it to a factor to get one box per cylinder count
p2 <- ggplot(mtcars, aes(y = mpg, x = factor(cyl))) +
  geom_boxplot() +
  theme_minimal()
ggarrange(p1, p2, ncol = 2)

Restricting the x being counted in a histogram

library(alr4)
par(mfrow = c(2,2))
ggplot(walleye, aes(x= age)) + geom_histogram() + facet_grid(~age)
I would like to create 4 histograms from the data set walleye. The histograms should show the length of the walleye, with one histogram per age. I would like to restrict the ages to 1 through 4. How can I do that with ggplot?
If I understand what you are trying to do correctly, this should help:
library(alr4)
library(ggplot2)
ggplot(subset(walleye, age<5), aes(x=length)) + geom_histogram() + facet_grid(~age)
This way you are only plotting the subset of the data where age is 1-4, and you are actually plotting histograms of length.
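Note that par(mfrow = c(2,2)) only affects base graphics and has no effect on ggplot2 output. To get a 2x2 layout, facet_wrap() with ncol = 2 is one option. The sketch below uses a made-up data frame called fish with age and length columns as a stand-in for walleye, so that it runs without the alr4 package.

```r
library(ggplot2)

# Hypothetical stand-in for the walleye data
set.seed(42)
fish <- data.frame(age = sample(1:8, 300, replace = TRUE))
fish$length <- 150 + 40 * fish$age + rnorm(300, sd = 20)

# Keep ages 1 to 4 only
fish_sub <- subset(fish, age %in% 1:4)

# Four histograms of length, arranged in a 2x2 grid
ggplot(fish_sub, aes(x = length)) +
  geom_histogram(bins = 20) +
  facet_wrap(~age, ncol = 2)
```

With the real data, replacing fish with walleye should give the same layout.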
You could also build the plot once and then swap in the subset with %+% (x is mapped to length here, since you want histograms of length rather than of age):
library(alr4)
library(ggplot2)
p <- ggplot(walleye, aes(x = length)) + geom_histogram() + facet_grid(~age)
p %+% subset(walleye, age %in% 1:4)

Get data associated to ggplot + stat_ecdf()

I like the stat_ecdf() feature of the ggplot2 package, which I find quite useful for exploring a data series. However, it is only visual, and I wonder whether it is feasible, and if so how, to get the associated table?
Please have a look at the following reproducible example
p <- ggplot(iris, aes_string(x = "Sepal.Length")) + stat_ecdf() # building of the cumulated chart
p
attributes(p) # chart attributes
p$data # data is the iris dataset, not the series used for displaying the chart
As @krfurlong showed me in this question, the layer_data() function in ggplot2 can get you exactly what you're looking for without the need to recreate the data.
p <- ggplot(iris, aes_string(x = "Sepal.Length")) + stat_ecdf()
p.data <- layer_data(p)
The "y" column in p.data contains the ECDF values, and the "x" column contains the Sepal.Length values shown on the x-axis of your plot.
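As a small sketch of how the extracted table might be used, the following pulls out the two relevant columns and cross-checks them against base R's ecdf(). The names ecdf_tab and F_hat are introduced here for illustration, and the is.finite() filter assumes the default padding of stat_ecdf(), which adds rows at -Inf and Inf.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length)) + stat_ecdf()
p.data <- layer_data(p)

# Keep the finite x values and the columns of interest
ecdf_tab <- p.data[is.finite(p.data$x), c("x", "y")]
head(ecdf_tab)

# Cross-check against the base R empirical CDF
F_hat <- ecdf(iris$Sepal.Length)
all.equal(ecdf_tab$y, F_hat(ecdf_tab$x))
```

The resulting ecdf_tab is an ordinary data frame, so it can be written out with write.csv() or joined to other tables.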
We can also recreate the data with base R's ecdf():
# Recreate the ECDF data at each distinct Sepal.Length value
x_vals <- sort(unique(iris$Sepal.Length))
# ecdf() already returns proportions in [0, 1], so no rescaling is needed
dat_ecdf <-
  data.frame(x = x_vals,
             y = ecdf(iris$Sepal.Length)(x_vals))
The two plots below should look essentially the same:
#plot using new data
ggplot(dat_ecdf,aes(x,y)) +
geom_step() +
xlim(4,8)
#plot with built-in stat_ecdf
ggplot(iris, aes_string(x = "Sepal.Length")) +
stat_ecdf() +
xlim(4,8)
