ggplot(data = df1) with added ggMarginal (data = df2) - r

I aim to create a ggplot with Date along the x axis and jump height along the y axis. Put simply, for one athlete in a large group of athletes, this will allow the reader to see improvements in jump height over time.
Additionally, I would like to add a ggMarginal(type = "density") to this plot. Here, I aim to plot the distribution of all athletes' jump heights. As a result, the reader can interpret the performance of the primary athlete relative to the group distribution.
For the sake of a reproducible example, the iris data frame will work.
'''
library(dplyr)
library(ggplot2)
library(ggExtra)
df1 <- iris %>%
filter(Species == "setosa")
df2 <- iris
# I have tried the following, but a variety of errors have occurred:
ggplot(NULL, aes(x=Sepal.Length, y=Sepal.Width))+
geom_point(data=df1, size=2)+
ggMarginal(data = df2, aes(x=Sepal.Length, y=Sepal.Width), type="density", margins = "y", size = 6)
'''
Although my own data frame is quite different, in terms of the iris data set I aim to plot x = Sepal.Length, y = Sepal.Width for the setosa species (df1), and then use ggMarginal to show the distribution of Sepal.Width on the y axis for all species (df2).
I hope this makes sense!
Thank you for your time and expertise

As far as I can tell from the docs, you can't specify a separate data frame for ggMarginal. Either you pass it a plot to which you want to add a marginal plot, or you provide the data directly to ggMarginal.
But one option to achieve your desired result would be to create your density plot as a separate plot and glue it to your main plot via patchwork:
library(ggplot2)
library(patchwork)
df1 <- subset(iris, Species == "setosa")
df2 <- iris
p1 <- ggplot(df1, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point(size = 2)
p2 <- ggplot(df2, aes(y = Sepal.Width)) +
geom_density() +
theme_void()
p1 + p2 +
plot_layout(widths = c(6, 1))
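
One caveat when gluing two separate plots together is that each plot scales its y axis independently, so the density curve may not line up with the points. A minimal sketch of one way around that, assuming it is acceptable to fix both panels to the range of the full data (the names y_rng and combined are introduced here for illustration):

```r
library(ggplot2)
library(patchwork)

df1 <- subset(iris, Species == "setosa")  # the single group of interest
df2 <- iris                               # the full data set

# Shared y range taken from the full data (assumes df1 is a subset of df2)
y_rng <- range(df2$Sepal.Width)

p1 <- ggplot(df1, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(size = 2) +
  coord_cartesian(ylim = y_rng)

p2 <- ggplot(df2, aes(y = Sepal.Width)) +
  geom_density() +
  coord_cartesian(ylim = y_rng) +
  theme_void()

combined <- p1 + p2 + plot_layout(widths = c(6, 1))
combined
```

Because both panels use the same limits, a point at a given Sepal.Width sits at the same height as the corresponding part of the density curve.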


is there a way to show the same legend for each facet in ggplot2

reproducible example:
library(datasets)
library(tidyverse)
data(iris)
iris$facet <- "A"
A <- iris
iris$facet <- "B"
B <- iris
iris <- rbind(A,B)
iris %>%
ggplot(aes(x = Sepal.Width, y = Sepal.Length, color = Species))+
geom_line()+
facet_wrap(.~facet)+
theme(legend.position = c(0.4,0.6))
I have been asked to make a plot where this same legend is positioned on top of each facet, that is, the identical legend that I have put at c(0.4,0.6). I don't mind having to specify the exact position each time I repeat the legend, but I can't make it work. The only suggestion I have found is to use the directlabels package to label the lines, but this is not acceptable to those making the decision. I know that the default is to show the legend only once for the entire plot, but we think the plot would be easier to interpret if the legend were shown once for every facet.
I don't like using grid.arrange as I have seen suggested - this will make it difficult to align the facets and share y axis etc. (since in my actual figure the y-axis are different)
In other words, I want the legend shown in facet A to also be shown in facet B.
Are you looking for something like this?
Using patchwork, and placing all plots in a list, you can use wrap_plots which is like facet_wrap for a list of plots.
Then using the & operator you can apply a theme to all facets, so they are the same.
Edit: based on more info, I also added a scaling option that would scale x and y for all facets the same.
library(tidyverse)
library(patchwork)
data(iris)
plot_list <- list()
plot_list[[1]] <-
iris %>%
mutate(fac = "A") %>%
ggplot(aes(x = Sepal.Width, y = Sepal.Length, color = Species))+
geom_line()+
facet_wrap(~fac)
plot_list[[2]] <-
iris %>%
mutate(fac = "B") %>%
ggplot(aes(x = Sepal.Width, y = Sepal.Length, color = Species))+
geom_line()+
facet_wrap(~fac)
wrap_plots(plot_list) &
theme(legend.position = c(.85,.85)) &
scale_y_continuous(limits = c(4, 8)) &
scale_x_continuous(limits = c(2, 4.5))
Created on 2020-06-26 by the reprex package (v0.3.0)

Density over histogram using ggplot2

I have a "long"-format data frame with two columns: the first holds the values, the second the sex [Male - 1/Female - 2]. I wrote some code to make a histogram of the entire dataset (code below).
ggplot(kz6, aes(x = values)) +
geom_histogram()
However, I also want to add densities over the histogram to emphasize the difference between the sexes, i.e. I want to combine 3 plots: a histogram for the entire dataset and 2 density plots, one for each sex. I tried to use some examples (one, two, three, four), but it still does not work. The code for the density alone works, while the combination of histogram + density does not.
density <- ggplot(kz6, aes(x = values, fill = factor(sex))) +
geom_density()
both <- ggplot(kz6, aes(x = values)) +
geom_histogram() +
geom_density()
both_2 <- ggplot(kz6, aes(x = values)) +
geom_histogram() +
geom_density(aes(x = kz6[kz6$sex == 1,]))
P.S. Some examples contain y=..density.. What does it mean? How should I interpret it?
To plot a histogram and superimpose two densities, defined by a categorical variable, use appropriate aesthetics in the call to geom_density, like group or colour.
ggplot(kz6, aes(x = values)) +
geom_histogram(aes(y = ..density..), bins = 20) +
geom_density(aes(group = sex, colour = sex), adjust = 2)
Data creation code.
I will create a test data set from built-in data set iris.
kz6 <- iris[iris$Species != "virginica", 4:5]
kz6$sex <- "M"
kz6$sex[kz6$Species == "versicolor"] <- "F"
kz6$Species <- NULL
names(kz6)[1] <- "values"
head(kz6)
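
On the P.S. about y = ..density..: it tells the histogram to plot the computed density instead of raw counts, so the bar areas sum to 1 and the bars share a y scale with the density curves. A small sketch on the test data above; it assumes ggplot2 3.3.0 or later, where after_stat(density) is the newer spelling of ..density.., and the names hist_data and bar_area are introduced here for illustration:

```r
library(ggplot2)

# Same test data as above, built from iris
kz6 <- iris[iris$Species != "virginica", 4:5]
kz6$sex <- ifelse(kz6$Species == "versicolor", "F", "M")
kz6$Species <- NULL
names(kz6)[1] <- "values"

p <- ggplot(kz6, aes(x = values)) +
  geom_histogram(aes(y = after_stat(density)), bins = 20) +
  geom_density(aes(colour = sex), adjust = 2)

# Inspect the values computed for the histogram layer (layer 1)
hist_data <- layer_data(p, 1)

# Each bar's area is its density times its width; together they integrate to 1
bar_area <- sum(hist_data$density * (hist_data$xmax - hist_data$xmin))
bar_area
```

With counts on the y axis the bars would be far taller than the density curves, which is likely why the plain geom_histogram() + geom_density() combination looks wrong.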

Q: Display grouped and combined boxplot in a single plot in R

I am trying to display grouped boxplot and combined boxplot into one plot. Take the iris data for instance:
data(iris)
p1 <- ggplot(iris, aes(x=Species, y=Sepal.Length)) +
geom_boxplot()
p1
I am trying to compare the overall distribution with the distributions within each category. So is there a way to display a boxplot of all samples to the left of these three grouped boxplots?
Thanks in advance.
You can rbind a new version of iris, where Species equals "All" for all rows, to iris before piping to ggplot:
library(dplyr)
library(ggplot2)
p1 <- iris %>%
rbind(iris %>% mutate(Species = 'All')) %>%
ggplot(aes(x = Species, y = Sepal.Length)) +
geom_boxplot()
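
With the rbind approach, "All" is typically appended as the last level of Species, so its box ends up on the right. Since the question asks for it on the left, here is a minimal sketch that sets the factor levels explicitly (the name iris_all is introduced here for illustration):

```r
library(ggplot2)

# Stack a copy labelled "All" on top of the original rows
iris_all <- rbind(
  transform(iris, Species = "All"),
  transform(iris, Species = as.character(Species))
)

# Put "All" first so its box is drawn leftmost
iris_all$Species <- factor(iris_all$Species,
                           levels = c("All", levels(iris$Species)))

p1 <- ggplot(iris_all, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()
p1
```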
Yes, you can just create a column for all species as follows:
iris = iris %>% mutate(all = "All Species")
p1 <- ggplot(iris) +
geom_boxplot(aes(x=Species, y=Sepal.Length)) +
geom_boxplot(aes(x=all, y=Sepal.Length))
p1

boxplot ggplot2::qplot() ungrouped and grouped data in same plot R

My data set features a factor(TypeOfCat) and a numeric (AgeOfCat).
I've made the below box plot. In addition to a box representing each type of cat, I've also tried to add a box representing the ungrouped data (ie the entire cohort of cats and their ages). What I've got is not quite what I'm after though, as sum() of course won't provide all the information needed to create such a plot. Any help would be much appreciated.
Data set and current code:
Df1 <- data.frame(TypeOfCat=c("A","B","B","C","C","A","B","C","A","B","A","C"),
AgeOfCat=c(14,2,5,8,4,5,2,6,3,6,12,7))
Df2 <- data.frame(TypeOfCat=c("AllCats"),
AgeOfCat=sum(Df1$AgeOfCat))
Df1 <- rbind(Df1, Df2)
qplot(Df1$TypeOfCat,Df1$AgeOfCat, geom = "boxplot") + coord_flip()
No need for sum. Just take all the values individually for AllCats:
# Your original code:
library(ggplot2)
Df1 <- data.frame(TypeOfCat=c("A","B","B","C","C","A","B","C","A","B","A","C"),
AgeOfCat=c(14,2,5,8,4,5,2,6,3,6,12,7))
# this is the different part:
Df2 <- data.frame(TypeOfCat=c("AllCats"),
AgeOfCat=Df1$AgeOfCat)
Df1 <- rbind(Df1, Df2)
qplot(Df1$TypeOfCat,Df1$AgeOfCat, geom = "boxplot") + coord_flip()
You can see you have all the observations if you add geom_point to the boxplot:
ggplot(Df1, aes(TypeOfCat, AgeOfCat)) +
geom_boxplot() +
geom_point(color='red') +
coord_flip()
Like this?
library(ggplot2)
# first double your data frame, but change "TypeOfCat", since it contains all:
df <- rbind(Df1, transform(Df1, TypeOfCat = "AllCats"))
# then plot it:
ggplot(data = df, mapping = aes(x = TypeOfCat, y = AgeOfCat)) +
geom_boxplot() + coord_flip()

plot selected columns using ggplot2

I would like to plot multiple separate plots. However, I don't want the final column from my dataset; it makes ggplot2 plot x-variable vs x-variable.
So far I have the following code:
library(ggplot2)
require(reshape)
d <- read.table("C:/Users/trinh/Desktop/Book1.csv", header=F,sep=",",skip=24)
t<-c(0.25,1,2,3,4,6,8,10)
d2<-d[,3:13] #removing unwanted columns
d2<-cbind(d2,t) #adding x-variable
df <- melt(d2, id = 't')
ggplot(data=df, aes(y=value,x=t)) +geom_point(shape=1) +
geom_smooth(method='lm',se=F)+facet_grid(.~variable)
I tried adding
data=subset(df,df[,3:12])
but I don't think I am writing it correctly. Please advise. Thanks.
Here's how you could do it, using data(iris) as an example:
(i) plot with all variables
df <- reshape2::melt(iris, id="Species")
ggplot(df, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
(ii) plot without "Petal.Width"
library(dplyr)
df2 <- df %>% filter(variable != "Petal.Width")
ggplot(df2, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
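
If several columns need to go, it may be simpler to drop them before melting, so the unwanted variable never reaches the plot. A sketch on the same iris example; the names drop_cols, iris_kept and df3 are introduced here for illustration:

```r
library(ggplot2)

drop_cols <- c("Petal.Width")  # columns to leave out of the plot

# Keep every column except the ones listed in drop_cols
iris_kept <- iris[, !(names(iris) %in% drop_cols)]

# Reshape to long format: one row per (Species, variable, value)
df3 <- reshape2::melt(iris_kept, id.vars = "Species")

ggplot(df3, aes(y = value, x = Species)) +
  geom_point() +
  facet_wrap(~ variable)
```

Selecting by name rather than by position (as in df[,3:12]) keeps the code working if the column order changes.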
