Overlay plot and histogram in R with ggplot

I am trying to overlay a plot and a histogram in R, using the ggplot2 package.
The plot contains a set of curves (which appear as straight lines because of the logarithmic axis) and a horizontal line.
I would like to draw, on the same image, a histogram showing the density distribution of the crossing points between the curves and the horizontal line. I can plot the histogram alone but not on the graph, because the aesthetics do not have the same length (the last intersection is at x = 800, while the x axis is much longer).
The code I wrote is:
baseplot +
geom_histogram(data = timesdf, aes(v)) + xlim(0,2000)
where v contains the intersections between the curves and the dashed line.
Any ideas?
Edit: as suggested, I wrote a small reproducible example:
library(ggplot2)
xvalues <- c(0:100)
yvalues1 <- xvalues^2-1000
yvalues2 <- xvalues^3-100
yvalues3 <- xvalues^4-10
yvalues4 <- xvalues^5-50
plotdf <- as.data.frame(xvalues)
plotdf$horiz <- 5
plotdf$vert1 <- yvalues1
plotdf$vert2 <- yvalues2
plotdf$vert3 <- yvalues3
plotdf$vert4 <- yvalues4
baseplot <- ggplot(data = plotdf, mapping = aes(x= xvalues, y= horiz))+
geom_line(linetype = "dashed", size = 1)+
geom_line(data = plotdf, mapping = aes(x= xvalues, y = vert1))+
geom_line(data = plotdf, mapping = aes(x= xvalues, y = vert2))+
geom_line(data = plotdf, mapping = aes(x= xvalues, y = vert3))+
geom_line(data = plotdf, mapping = aes(x= xvalues, y = vert4))+
coord_cartesian(xlim=c(0, 100), ylim=c(0, 1000))
baseplot
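# for each curve column, take the last x value at which the curve is still below the horizontal line (y = 5)
curve_cols <- grep("^vert", names(plotdf))
v <- sapply(curve_cols, function(i) plotdf[max(which(plotdf[, i] < 5)), 1])
timesdf <- data.frame(v = as.integer(v))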
# my wish: visualize baseplot and histplot on the same image
histplot <- ggplot() + geom_histogram(data = timesdf, aes(v)) +
coord_cartesian(xlim=c(0, 100), ylim=c(0, 10))
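One possible way to get both on the same image, offered as a sketch rather than a definitive answer: a layer added to baseplot inherits the y = horiz mapping from ggplot(), which does not exist in timesdf. Setting inherit.aes = FALSE makes the histogram layer use only its own mapping. The data below is a cut-down stand-in for the example above (two curves only), and the binwidth, the alpha and the scaling factor of 100 are arbitrary assumptions chosen so that the bars are visible on this y range.

```r
library(ggplot2)

# Small stand-in for the question's data (two curves instead of four)
xvalues <- 0:100
plotdf <- data.frame(xvalues = xvalues, horiz = 5,
                     vert1 = xvalues^2 - 1000, vert2 = xvalues^3 - 100)
# Last x value at which each curve is still below the horizontal line
timesdf <- data.frame(v = c(31L, 4L))

baseplot <- ggplot(plotdf, aes(x = xvalues, y = horiz)) +
  geom_line(linetype = "dashed") +
  geom_line(aes(y = vert1)) +
  geom_line(aes(y = vert2)) +
  coord_cartesian(xlim = c(0, 100), ylim = c(0, 1000))

# inherit.aes = FALSE stops the histogram from picking up y = horiz;
# the factor 100 is an arbitrary scaling so the bars show up on this y range
overlay <- baseplot +
  geom_histogram(data = timesdf,
                 aes(x = v, y = after_stat(count) * 100),
                 binwidth = 5, inherit.aes = FALSE, alpha = 0.4)
overlay
```

Because the histogram shares the y axis with the curves, the bar heights are only meaningful relative to each other; a secondary axis or a separate panel may be a better fit if the actual counts matter.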

Related

Plotting a vertical normal distribution next to a box plot in R

I'm trying to plot box plots with normal distribution of the underlying data next to the plots in a vertical format like this:
This is what I currently have graphed from an excel sheet uploaded to R:
And the code associated with them:
set.seed(12345)
library(ggplot2)
library(ggthemes)
library(ggbeeswarm)
#graphing boxplot and quasirandom scatterplot together
ggplot(X8_17_20_R_20_60, aes(Type, Diameter)) +
geom_quasirandom(shape=20, fill="gray", color = "gray") +
geom_boxplot(fill="NA", color = c("red4", "orchid4", "dark green", "blue"),
outlier.color = "NA") +
theme_hc()
Is this possible in ggplot2 or in R in general? Or is it only feasible with something like OriginLab (where the first picture came from)?

You can do something similar to your example plot with the gghalves package:
library(gghalves)
n=0.02
ggplot(iris, aes(Species, Sepal.Length)) +
geom_half_boxplot(center=TRUE, errorbar.draw=FALSE,
width=0.5, nudge=n) +
geom_half_violin(side="r", nudge=n) +
geom_half_dotplot(dotsize=0.5, alpha=0.3, fill="red",
position=position_nudge(x=n, y=0)) +
theme_hc()

There are a few ways to do this. To gain full control over the look of the plot, I would just calculate the curves and plot them. Here's some sample data that's close to your own and shares the same names, so it should be directly applicable:
set.seed(12345)
X8_17_20_R_20_60 <- data.frame(
Diameter = rnorm(4000, rep(c(41, 40, 42, 40), each = 1000), sd = 6),
Type = rep(c("AvgFeret", "CalcDiameter", "Feret", "MinFeret"), each = 1000))
Now we create a little data frame of normal distributions based on the parameters taken from each group:
df <- do.call(rbind, mapply( function(d, n) {
y <- seq(min(d), max(d), length.out = 1000)
data.frame(x = n - 5 * dnorm(y, mean(d), sd(d)) - 0.15, y = y, z = n)
}, with(X8_17_20_R_20_60, split(Diameter, Type)), 1:4, SIMPLIFY = FALSE))
Finally, we draw your plot and add a geom_path with the new data.
library(ggplot2)
library(ggthemes)
library(ggbeeswarm)
ggplot(X8_17_20_R_20_60, aes(Type, Diameter)) +
geom_quasirandom(shape = 20, fill = "gray", color = "gray") +
geom_boxplot(fill="NA", aes(color = Type), outlier.color = "NA") +
scale_color_manual(values = c("red4", "orchid4", "dark green", "blue")) +
geom_path(data = df, aes(x = x, y = y, group = z), size = 1) +
theme_hc()
Created on 2020-08-21 by the reprex package (v0.3.0)

triangular plot using ggtern

I am trying to plot some points together with some regions in a triangular plot, using this code:
library(ggtern)
g <- data.frame(x=c(1,.6,.6), y=c(0,.4,0), z=c(0,0,.4), Series="Green")
r <- data.frame(x=c(0,0.4,0), y=c(0,0,0.4), z=c(1,0.6,0.6), Series="Red")
p <- data.frame(x=c(0,0.4,0), y=c(1,0.6,0.6), z=c(0,0,0.4), Series="Purple")
DATA = rbind(g,r,p)
plot <- ggtern(data=DATA,aes(x,y,z)) +
geom_polygon(aes(fill=Series),alpha=.5,color="black",size=0.25) +
scale_fill_manual(values=as.character(unique(DATA$Series))) +
theme(legend.position=c(0,1),legend.justification=c(0,1)) +
labs(fill="Region",title="Sample Filled Regions")
print(plot)
I want to add some points to this plot. They are taken from a text file, from which I read their x, y and z coordinates. How can I add these points to the plot?
If I try something like this, it deletes the previous plot:
plot <- ggtern(data = data.frame(x = cordnate_x, y = cordnate_y, z = cordnate_z),aes(x, y, z)) + geom_point() +theme_rgbg()
print(plot)
This is the plot that I need to add points to (image: triangular plot).

You can add points just as you would to a normal ggplot object.
g <- data.frame(x=c(1,.6,.6), y=c(0,.4,0), z=c(0,0,.4), Series="Green")
r <- data.frame(x=c(0,0.4,0), y=c(0,0,0.4), z=c(1,0.6,0.6), Series="Red")
p <- data.frame(x=c(0,0.4,0), y=c(1,0.6,0.6), z=c(0,0,0.4), Series="Purple")
DATA = rbind(g,r,p)
temp <- data.frame(x=c(0.4), y=c(0.6), z=c(0.4))
plot<- ggtern(data=DATA,aes(x,y,z)) +
geom_polygon(aes(fill=Series),alpha=.5,color="black",size=0.25) +
scale_fill_manual(values=as.character(unique(DATA$Series))) +
theme(legend.position=c(0,1),legend.justification=c(0,1)) +
labs(fill="Region",title="Sample Filled Regions") +
geom_point(data = temp, colour = "red") +
annotate("text", x = 0.3, y = 0.6, z = 0.4, label = "Some text")

Align x axes of box plot and line plot using ggplot

I'm trying to align the x-axes of a bar plot and a line plot in one window frame using ggplot. Here is the fake data I'm trying to do it with.
library(ggplot2)
library(gridExtra)
m <- as.data.frame(matrix(0, ncol = 2, nrow = 27))
colnames(m) <- c("x", "y")
for( i in 1:nrow(m))
{
m$x[i] <- i
m$y[i] <- ((i*2) + 3)
}
My_plot <- (ggplot(data = m, aes(x = x, y = y)) + theme_bw())
Line_plot <- My_plot + geom_line()
Bar_plot <- My_plot + geom_bar(stat = "identity")
grid.arrange(Line_plot, Bar_plot)
Thank you for your help.

@eipi10 answers this particular case, but in general you also need to equalize the plot widths. If, for example, the y labels on one of the plots take up more space than on the other, even if you use the same axis on each plot, they will not line up when passed to grid.arrange:
axis <- scale_x_continuous(limits=range(m$x))
Line_plot <- ggplot(data = m, aes(x = x, y = y)) + theme_bw() + axis + geom_line()
m2 <- within(m, y <- y * 1e7)
Bar_plot <- ggplot(data = m2, aes(x = x, y = y)) + theme_bw() + axis + geom_bar(stat = "identity")
grid.arrange(Line_plot, Bar_plot)
In this case, you have to equalize the plot widths:
Line_plot <- ggplot_gtable(ggplot_build(Line_plot))
Bar_plot <- ggplot_gtable(ggplot_build(Bar_plot))
Bar_plot$widths <-Line_plot$widths
grid.arrange(Line_plot, Bar_plot)
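A variation on the same idea, as a sketch only: instead of copying the widths of one plot onto the other, take the element-wise maximum of both, so that neither plot has its axis labels squeezed. This assumes both plots produce layouts with the same number of columns (same theme, no legend on only one of them). The data frame names and the xlim of c(0, 28) are illustrative choices, not part of the original code.

```r
library(ggplot2)
library(gridExtra)
library(grid)

m  <- data.frame(x = 1:27, y = 2 * (1:27) + 3)
m2 <- within(m, y <- y * 1e7)   # much wider y labels on the second plot

# Same visible x range on both plots, without dropping any bars
x_range <- coord_cartesian(xlim = c(0, 28))

line_grob <- ggplotGrob(ggplot(m,  aes(x, y)) + theme_bw() + geom_line() + x_range)
bar_grob  <- ggplotGrob(ggplot(m2, aes(x, y)) + theme_bw() + geom_col()  + x_range)

# Element-wise maximum of the column widths of the two layouts
max_widths <- unit.pmax(line_grob$widths, bar_grob$widths)
line_grob$widths <- max_widths
bar_grob$widths  <- max_widths

grid.arrange(line_grob, bar_grob)
```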

The gridlines on the x axes will be aligned if you use scale_x_continuous to force ggplot to use limits you specify.
My_plot <- ggplot(data = m, aes(x = x, y = y)) + theme_bw() +
scale_x_continuous(limits=range(m$x))
Now, when you add the layers, the axes will share the common scaling.

Saving ggplot to a list then applying to grid.arrange geom_line from last plot populates all previous plots

I am very new to R and ggplot2. I am trying to create a grid of plots of correlations, together with their trailing max and min values, using a for loop. The plots are then saved as PDFs to a directory. When they are saved, the blue lines (min and max) are plotted correctly. However, when I then use do.call(grid.arrange, t), or any other call to the plots in the list, I do not get the correct blue lines; instead, the last plot's blue lines appear on all of the plots.
I don't understand how this can plot and save the PDF correctly but not store the ggplot object correctly in the list t, or whether there is some confusion in the rendering when using do.call(grid.arrange, t). How can the original (black) line plot correctly while the geom_line additions do not? I am really confused.
If someone could kindly help me check this code and find out how to plot all lines correctly and then place them in a grid, that would be great.
Reproducible code below, using random data:
require(TTR)
require(ggplot2)
library(gridExtra)
set.seed(12345)
filelocation = "c:/"
values <- as.data.frame(matrix( rnorm(5*500,mean=0,sd=3), 500, 5))
t <- list()
rollLength = 25
for( i in 1:(ncol(values)))
{
p <- ggplot(data=values, aes(x = index(values)) )
p <- p + geom_line(data=values, aes_string(y = colnames(values)[i]))
p <- p + geom_line(data = values, aes(x = index(values), y = runMax(values[,i], n = rollLength) ), colour = "blue", linetype = "longdash" )
p <- p + geom_line(data = values, aes(x = index(values), y = runMin(values[,i], n = rollLength) ), colour = "blue", linetype = "longdash" )
p <- p + ggtitle(colnames(values)[i]) + xlab("Date") + ylab("Pearson Correlation")
print(p)
ggsave( file = paste(colnames(values)[i],".pdf",sep = "") , path = filelocation)
assign(paste("p", i, sep = ""), p)
t[[i]] <- p
}
do.call(grid.arrange,t)

Hmm, this isn't exactly what you want I think, but close, and less code
require(TTR)
require(ggplot2)
set.seed(12345)
values <- as.data.frame(matrix( rnorm(5*500,mean=0,sd=3), 500, 5))
rollLength = 25
library(reshape2)
dfmelt <- melt(values)
dfmelt$max <- runMax(dfmelt$value, n=rollLength)
dfmelt$min <- runMin(dfmelt$value, n=rollLength)
dfmelt$row <- index(dfmelt)
ggplot(dfmelt, aes(x = row, y = value)) +
geom_line() +
geom_line(aes(x = row, y = max), data=dfmelt, colour = "blue",
linetype = "longdash") +
geom_line(aes(x = row, y = min), data=dfmelt, colour = "blue",
linetype = "longdash") +
facet_wrap(~ variable, scales="free")
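As for why the loop misbehaves: expressions inside aes() are not evaluated when the plot object is created but when it is drawn, so values[,i] is looked up with whatever value i has at that moment. Inside the loop print(p) sees the current i, while do.call(grid.arrange, t) runs after the loop and sees only the final i. A minimal sketch of one way around this, assuming it is acceptable to build a small data frame per plot (the names d, roll_max, roll_min and plots are made up for illustration):

```r
library(ggplot2)
library(TTR)
library(gridExtra)

set.seed(12345)
values <- as.data.frame(matrix(rnorm(5 * 500, mean = 0, sd = 3), 500, 5))
roll_length <- 25

plots <- lapply(names(values), function(nm) {
  # Everything the plot needs is stored as columns of d,
  # so nothing depends on a loop index at drawing time
  d <- data.frame(
    row      = seq_len(nrow(values)),
    value    = values[[nm]],
    roll_max = runMax(values[[nm]], n = roll_length),
    roll_min = runMin(values[[nm]], n = roll_length)
  )
  ggplot(d, aes(x = row)) +
    geom_line(aes(y = value)) +
    geom_line(aes(y = roll_max), colour = "blue", linetype = "longdash") +
    geom_line(aes(y = roll_min), colour = "blue", linetype = "longdash") +
    ggtitle(nm) + xlab("Date") + ylab("Pearson Correlation")
})

do.call(grid.arrange, plots)
```

The first roll_length - 1 values of the rolling max and min are NA, so ggplot2 will warn about removed rows; that is expected.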

ggplot2 - draw logistic distribution with small part of the area colored

I have the following code to draw my logistic distribution:
x=seq(-2000,2000,length=1000)
dat <- data.frame(x=x)
dat$value <- dlogis(x,location=200,scale=400/log(10))
dat$type <- "Expected score"
p <- ggplot(data=dat, aes(x=x, y=value)) + geom_line(col="blue", size=1) +
coord_cartesian(xlim = c(-500, 900), ylim = c(0, 0.0016)) +
scale_x_continuous(breaks=c(seq(-500, 800, 100)))
pp <- p + geom_line(aes(x = c(0,0), y = c(0,0.0011)), size=0.9, colour="green", linetype=2, alpha=0.7)
Now what I would like to do is to highlight the area to the left of x = 0.
I tried to do it like this:
x = seq(-500, 0, length=10)
y = dlogis(x,location=200,scale=400/log(10))
pol <- data.frame(x = x, y = y)
pp + geom_polygon(aes(data=pol,x=x, y=y), fill="light blue", alpha=0.6)
But this does not work. Not sure what I am doing wrong. Any help?

I haven't diagnosed the problem with your polygon (although I think you would need to give the full path around the outside, i.e. attach rep(0,length(x)) to the end of y and rev(x) to the end of x), but geom_ribbon (as in "Shading a kernel density plot between two points") seems to do the trick:
pp + geom_ribbon(data=data.frame(x=x,y=y),aes(ymax=y,x=x,y=NULL),
ymin=0,fill="light blue",alpha=0.5)
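For completeness, the polygon route described in words above might look like the following sketch. It closes the outline by walking back along y = 0, and it passes data to geom_polygon as an argument instead of placing it inside aes(). The variable names x_left and y_left are made up for illustration, and the green vertical line from the question is left out to keep the example short.

```r
library(ggplot2)

x_curve <- seq(-2000, 2000, length.out = 1000)
dat <- data.frame(x = x_curve,
                  value = dlogis(x_curve, location = 200, scale = 400 / log(10)))

# Upper edge of the shaded area: the curve between x = -500 and x = 0
x_left <- seq(-500, 0, length.out = 10)
y_left <- dlogis(x_left, location = 200, scale = 400 / log(10))

# Closed outline: along the curve, then back along the x axis
pol <- data.frame(x = c(x_left, rev(x_left)),
                  y = c(y_left, rep(0, length(x_left))))

p <- ggplot(dat, aes(x = x, y = value)) +
  geom_line(colour = "blue") +
  geom_polygon(data = pol, aes(x = x, y = y), fill = "lightblue", alpha = 0.6) +
  coord_cartesian(xlim = c(-500, 900), ylim = c(0, 0.0016))
p
```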
