Making barplot with offset baselines using ggplot2 - r

If I'm using the default R graphics package, then I can make plots with staggered baselines.
Here is an example using the CPTtools package (which is available on GitHub):
library(devtools)
install_github("ralmond/CPTtools")
library(CPTtools)
margins <- data.frame (
Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semester3=.20,Semester4=.09),
NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semester3=.41,Semester4=.14),
Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semester3=.18,Semester4=.04)
)
margins <- as.matrix(margins)
baseline <- apply(margins[1:2,],2,sum)
stackedBarplot(margins,offset=-baseline,
main="Marginal Distributions for NetPASS skills",
sub="Baseline at 2nd Semester level.",
col=hsv(223/360,.2,0.10*(5:1)+.5))
This produces the output:
The key is the offset argument which is passed to the barplot function.
For various reasons, I'm trying to rewrite the graphics of CPTtools using ggplot. I can't figure out how to adjust the position of the stacked bar. The closest I've come is
library(tidyverse)
margins <- data.frame (
Trouble=c(Novice=.19,Semester1=.24,Semester2=.28,Semester3=.20,Semester4=.09),
NDK=c(Novice=.01,Semester1=.09,Semester2=.35,Semester3=.41,Semester4=.14),
Model=c(Novice=.19,Semester1=.28,Semester2=.31,Semester3=.18,Semester4=.04)
)
tibble::rownames_to_column(margins,var="Level") %>%
tidyr::pivot_longer(-Level,
names_to="Attribute", values_to="probability") ->
marg
marg %>% filter(Level=="Novice" | Level=="Semester1") %>%
group_by(Attribute) %>% summarize(baseline=sum(probability)) ->
bases
ggplot(marg,aes(Attribute,probability,fill=Level)) +
geom_col(position=position_stack()) +
scale_fill_brewer(palette="Blues")
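One possible way to get the offset, sketched here rather than offered as a definitive answer: geom_col() always stacks from zero, so instead compute the lower and upper edge of every segment yourself, subtract the per-attribute baseline, and draw the segments with geom_rect(). The names rects, levs, top, ymin, ymax and x are illustrative, and the bar half-width of 0.45 is an arbitrary choice. The level names use the corrected spelling "Semester3".

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

margins <- data.frame(
  Trouble = c(Novice=.19, Semester1=.24, Semester2=.28, Semester3=.20, Semester4=.09),
  NDK     = c(Novice=.01, Semester1=.09, Semester2=.35, Semester3=.41, Semester4=.14),
  Model   = c(Novice=.19, Semester1=.28, Semester2=.31, Semester3=.18, Semester4=.04)
)
levs <- rownames(margins)  # stacking order, bottom to top

rects <- margins %>%
  rownames_to_column("Level") %>%
  pivot_longer(-Level, names_to = "Attribute", values_to = "probability") %>%
  mutate(Level = factor(Level, levels = levs),
         Attribute = factor(Attribute, levels = names(margins))) %>%
  arrange(Attribute, Level) %>%
  group_by(Attribute) %>%
  mutate(top = cumsum(probability),                           # upper edge when stacked from 0
         baseline = sum(probability[Level %in% levs[1:2]]),   # mass below Semester2
         ymax = top - baseline,                               # shift the whole bar down
         ymin = ymax - probability,
         x = as.integer(Attribute)) %>%
  ungroup()

p <- ggplot(rects) +
  geom_rect(aes(xmin = x - 0.45, xmax = x + 0.45,
                ymin = ymin, ymax = ymax, fill = Level)) +
  scale_x_continuous(breaks = seq_along(names(margins)),
                     labels = names(margins)) +
  scale_fill_brewer(palette = "Blues") +
  labs(x = "Attribute", y = "probability")
```

Printing p should show each bar with the boundary between Semester1 and Semester2 sitting at y = 0, which is the same effect as offset=-baseline in the base graphics version.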

Related

Is there a way to create a kissing people curve using ggplot2 in R

Is it possible to create custom graphs using ggplot2? For example, I want to create a graph of kissing people.
Simple variant
I was able to reproduce it partially: everything except the "lines of the eyes", which I don't know how to draw.
But how can I make a more complex graph of kissing people? In general, is it possible to approximate such a curve so that it looks more voluminous?
Thank you for your help.
Perhaps not what you are looking for, but if you already have the image and want to reproduce it in ggplot, you can use the following method:
library(tidyverse)
library(magick)
library(terra)
# read image
im <- image_read("./data/kiss_1.png")
# convert to black/white image
im2 <- im %>%
image_quantize(
max = 2,
colorspace = "gray" )
# get a matrix of the pixel-colors
m <- as.raster(im2) %>% as.matrix()
# extract coordinates of the black pixels
df <- as.data.frame(which(m == "#000000ff", arr.ind=TRUE))
df$row <- df$row * -1
# plot the points
ggplot(df, aes(x = col, y = row)) + geom_point()

How to match widths of two HTML dygraph plots while using htmltools::tagList

I use the excellent dygraphs package in R all the time. The synchronized plots work great, but I am having difficulty keeping the widths of two synchronized plots identical when one dygraph has data only on the primary axis while the second has both primary and secondary y axes plotted. The "y2" axis labels reduce the width of chart 2, which throws the x axes of the two charts out of sync.
Take a look with the following toy example:
library(data.table)
library(magrittr)
library(lubridate)
library(dygraphs)
library(htmltools)
# create 2 small data.tables.
dt1 <- data.table(datetime = seq(ymd_hm(202205100800),by = "1 mins",length.out = 50))[order(datetime)]
dt2 <- data.table(datetime = seq(ymd_hm(202205100800),by = "1 mins",length.out = 50))[order(datetime)]
dt1[,temp1:=rnorm(50,10,0.5)]
dt2[,temp2:=rnorm(50,7,0.5)]
dt2[,power:=rnorm(50,100,0.5)] # scale is higher, hence it will be plotted on the secondary axis.
d1 <- dygraph(dt1,group = "X")
d2 <- dygraph(dt2,group = "X") %>% dySeries("power",axis = "y2")
Now I combine the two charts using the htmltools::browsable() function. I am open to using any other function to combine the two plots, as long as they end up in one HTML document.
browsable(tagList(d1,d2))
As you can see, the two charts are slightly displaced. I would like the two x axes aligned exactly.
Thanks for reading and replicating my example!

Animating Histograms with plotly

I'm trying to create an animated demonstration of the Law of Large Numbers, where I want to show the histogram converging to the density as the sample size increases.
I can do this with R shiny, putting a slider on the sample size, but when I try to set up a plotly animation using the sample size as the frame, I get an error deep in the bowels of ggplotly. Here is the sample code:
library(tidyverse)
library(plotly)
XXX <- rnorm(200)
plotdat <- bind_rows(lapply(25:200, function(i) data.frame(x=XXX[1:i],f=i)))
hplot <- ggplot(plotdat,aes(x,frame=f)) + geom_histogram(binwidth=.25)
ggplotly(hplot)
The last line returns the error: Error in -data$group : invalid argument to unary operator.
I'm not sure where it is supposed to be getting data$group (this value has been magically set for me in other invocations of ggplotly).
Skipping the initial ggplot and going straight to plotly, does this work for you?
plotdat %>%
plot_ly(x=~x,
type = 'histogram',
frame = ~f) %>%
layout(yaxis = list(range = c(0,50)))
Or, using your original syntax, we can add a position specification that seems to prevent the bug. This version looks better, with standard ggplot formatting and tweened animation.
hplot <- ggplot(plotdat, aes(x, frame = f)) +
geom_histogram(binwidth=.25, position = "identity")
ggplotly(hplot) %>%
animation_opts(frame = 100) # minimum ms per frame to control speed
(I don't know why this fixes it, but when I googled your error I saw a plotly issue on github that was solved by specifying the position, and it seems to fix the error here too. https://github.com/plotly/plotly.R/issues/1544)

Set common y axis limits from a list of ggplots

I am running a function that returns a custom ggplot from an input data (it is in fact a plot with several layers on it). I run the function over several different input data and obtain a list of ggplots.
I want to create a grid with these plots to compare them but they all have different y axes.
I guess what I have to do is extract the maximum and minimum y axes limits from the ggplot list and apply those to each plot in the list.
How can I do that? I guess it's through the use of ggplot_build. Something like this:
test = ggplot_build(plot_list[[1]])
> test$layout$panel_scales_x
[[1]]
<ScaleContinuousPosition>
Range:
Limits: 0 -- 1
I am not familiar with the structure of a ggplot_build object, and maybe this one in particular is not a standard one, as it comes from a "custom" ggplot.
For reference, these plots are created with the gseaplot2 function from the enrichplot package.
I don't know how to "upload" an R object, but if that would help, let me know how to do it.
Thanks!
edit after comments (thanks for your suggestions!)
Here is an example of a gseaplot2 plot. GSEA stands for Gene Set Enrichment Analysis, a technique used in genomic studies. The gseaplot2 function calculates a running average and then plots it, with another bar plot at the bottom.
and here is the grid I create to compare the plots generated from different data:
I would like to have a common scale for the "Running Enrichment Score" part.
I guess I could try to recreate the gseaplot2 function and input all of the datasets and then create the grid by facet_wrap, but I was wondering if there was an easy way of extracting parameters from a plot list.
As a reproducible example (from the enrichplot package):
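library(clusterProfiler)
library(enrichplot) # provides gseaplot2()
library(ggpubr) # provides ggarrange()
library(magrittr) # provides %>%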
data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]
wpgmtfile <- system.file("extdata/wikipathways-20180810-gmt-Homo_sapiens.gmt", package="clusterProfiler")
wp2gene <- read.gmt(wpgmtfile)
wp2gene <- wp2gene %>% tidyr::separate(term, c("name","version","wpid","org"), "%")
wpid2gene <- wp2gene %>% dplyr::select(wpid, gene) #TERM2GENE
wpid2name <- wp2gene %>% dplyr::select(wpid, name) #TERM2NAME
ewp2 <- GSEA(geneList, TERM2GENE = wpid2gene, TERM2NAME = wpid2name, verbose=FALSE)
gseaplot2(ewp2, geneSetID=1, subplots=1:2)
And this is how I generate the plot list (probably there is a much more elegant way):
plot_list = list()
for(i in 1:3) {
fig_i = gseaplot2(ewp2,
geneSetID=i,
subplots=1:2)
plot_list[[i]] = fig_i
}
ggarrange(plotlist=plot_list)
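For a list of plain ggplot objects, the "extract the limits, then apply them to every plot" idea can be sketched as below. This is only an illustration on toy line plots, not on gseaplot2 output: when several subplots are requested, gseaplot2 returns a composite object, so the same steps would have to be applied to whichever component holds the running enrichment score (an assumption that is not verified here). The names y_ranges, common and plot_list_common are made up for the example.

```r
library(ggplot2)

set.seed(1)
# Toy stand-ins for the real plots: three line plots with very different y ranges
plot_list <- lapply(c(1, 5, 10), function(s) {
  d <- data.frame(x = 1:20, y = cumsum(rnorm(20, sd = s)))
  ggplot(d, aes(x, y)) + geom_line()
})

# Trained y range of each plot (the data range, before axis expansion)
y_ranges <- lapply(plot_list, function(p) layer_scales(p)$y$range$range)

# Smallest interval that covers all of them
common <- range(unlist(y_ranges))

# Apply the same visible window to every plot without dropping any data
plot_list_common <- lapply(plot_list, function(p) {
  p + coord_cartesian(ylim = common)
})
```

coord_cartesian() is used instead of ylim() so that layers are only zoomed, not filtered, which matters if a plot contains statistics computed from the data.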

Combining plotly plots with highly variable widths

I want to combine several R plotly heatmaps using plotly's subplot, with an a priori defined width for each subplot. These widths are highly variable, since they reflect real proportions in my data.
Here's an example dataset:
library(dplyr)
library(plotly)
library(grDevices)
set.seed(1)
df <- data.frame(row = rep(paste0("rid",1:100),10),
col = paste0("cid",unlist(lapply(1:10,function(x) rep(x,100)))),
val = rnorm(1000,-2,1))
Here I generate the list of plots:
plot.list <- lapply(1:10,function(i) plot_ly(z=c(dplyr::filter(df,col == paste0("cid",i))$val),x=dplyr::filter(df,col == paste0("cid",i))$col,y=dplyr::filter(df,col == paste0("cid",i))$row,
colors=colorRamp(c("darkblue","lightgray","darkred")),type="heatmap") %>%
layout(yaxis=list(title=NULL),xaxis=list(tickvals=i,ticktext=as.character(i))))
Here are the plot widths:
plot.widths <- c(0.33277,0.0663,0.28308,0.09323,0.12969,0.0603,0.00651,0.01149,0.01503,0.0016)
Clearly, they sum up to 1.
If I just try this:
subplot(plot.list,shareX=T,shareY=T,nrows=1,margin=0.001,widths=plot.widths) %>% layout(showlegend=F)
I get:
I realized that the plot widths that are smaller than 0.015 cause this.
Right now my quick fix is to scale up the widths of the plots that are below 0.015 and to reciprocally scale down the widths of the plots that are above 0.015 so that the minimal plot width is 0.015 and they still sum up to 1.
Like this:
below.cutoff.widths <- which(plot.widths < 0.015)
if(length(below.cutoff.widths) > 0){
scale.up.factor <- 0.015/min(plot.widths[below.cutoff.widths])
scale.down.factor <- sum(plot.widths[-below.cutoff.widths])/(1-sum(scale.up.factor*(plot.widths[below.cutoff.widths])))
plot.widths[-below.cutoff.widths] <- plot.widths[-below.cutoff.widths]/scale.down.factor
plot.widths[below.cutoff.widths] <- plot.widths[below.cutoff.widths]*scale.up.factor
plot.widths <- plot.widths-.Machine$double.eps
}
Which works and gives:
The problem is that in the example above this significantly distorts the original widths and therefore distorts the message that I'm trying to convey with this plot.
Any idea how to handle this better?
I don't suppose there's a way to simply convert the plot.list's plotly objects to ggplot objects and use gridExtra's arrangeGrob to arrange them together on a grid?
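A possibly less distorting variant of the quick fix, given as a sketch: instead of multiplying all small widths by a common factor, pin every width that falls under the cutoff to exactly the cutoff, then shrink only the remaining widths proportionally so the total stays 1. The function name floor_widths and its argument min_w are hypothetical, and the 0.015 cutoff is simply the value observed above, not a documented plotly limit.

```r
# Pin widths below min_w to min_w and rescale the others to fill the rest.
# Repeats because rescaling can push a borderline width under the cutoff.
floor_widths <- function(w, min_w = 0.015) {
  stopifnot(length(w) * min_w <= 1)
  w <- w / sum(w)
  fixed <- rep(FALSE, length(w))
  repeat {
    free_total <- 1 - min_w * sum(fixed)        # space left for unpinned widths
    scaled <- w / sum(w[!fixed]) * free_total   # proportional share of that space
    newly <- !fixed & scaled < min_w
    if (!any(newly)) break
    fixed <- fixed | newly
  }
  ifelse(fixed, min_w, scaled)
}

plot.widths <- c(0.33277, 0.0663, 0.28308, 0.09323, 0.12969,
                 0.0603, 0.00651, 0.01149, 0.01503, 0.0016)
new.widths <- floor_widths(plot.widths)
```

With these inputs, four widths end up pinned at 0.015 (the three below the cutoff plus 0.01503, which drops under it after rescaling), and the six larger widths keep their ratios to one another. The result could then be passed as widths= to subplot(), subtracting .Machine$double.eps as in the quick fix if the sum must not exceed 1.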
