Weird box plot result when trying to create a box plot in ggplot2 using linearized data - r

I have an interactions matrix that looks like this:
chr10.117800000 chr10.117801000 chr10.117802000 chr10.117803000 chr10.117804000 chr10.117805000 chr10.117806000
chr10.117800000 0.006484824 0.006451925 0.006422584 0.006386328 0.006292793 0.006277799 0.006231732
chr10.117801000 0.006451925 0.006435975 0.006415112 0.006378994 0.006285668 0.006272825 0.006226796
chr10.117802000 0.006422584 0.006415112 0.006406884 0.006370748 0.006277475 0.006264644 0.006222890
chr10.117803000 0.006386328 0.006378994 0.006370748 0.006346183 0.006307680 0.006294757 0.006254941
chr10.117804000 0.006292793 0.006285668 0.006277475 0.006307680 0.006324919 0.006311969 0.006276300
chr10.117805000 0.006277799 0.006272825 0.006264644 0.006294757 0.006311969 0.006303327 0.006269839
chr10.117806000 0.006231732 0.006226796 0.006222890 0.006254941 0.006276300 0.006269839 0.006244967
chr10.117807000 0.006242481 0.006235449 0.006231538 0.006265652 0.006287100 0.006282769 0.006257918
chr10.117808000 0.006140677 0.006133760 0.006129913 0.006161364 0.006188786 0.006186627 0.006166320
chr10.117809000 0.006098614 0.006091771 0.006087950 0.006119074 0.006146385 0.006146359 0.006130442
I am trying to linearize and label it to prep it for ggplot2, which I have accomplished using this code:
data <- as.vector(data)
data <- cbind(Counts = data, Genotype = "KO")
However, whenever I take my data and plot it using ggplot2 with this command:
blah <- ggplot(data =test, aes(x = Genotype, y = Counts)) + geom_boxplot()
It gives me a weird looking box plot that looks like this:
I have tried to add scale_y_continuous(limits = c(0, 0.002)), but each time that I do that I get an error that I'm trying to add discrete values to a continuous scale. Does anyone know what's going on, or if there is a better way to do this?
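One likely cause, assuming test was built from the cbind() result above: cbind() returns a matrix, and a matrix can hold only one type, so adding the label "KO" turns every count into a character string. ggplot2 then treats Counts as discrete, which would explain both the odd plot and the scale error. Note also that the values shown are around 0.006, so limits of c(0, 0.002) would exclude all of them. A minimal sketch using data.frame() instead, with a made-up matrix m standing in for the real data:

```r
set.seed(1)
# Hypothetical stand-in for the interactions matrix (values near 0.006)
m <- matrix(runif(70, 0.0060, 0.0065), nrow = 10)

# cbind() with a character value coerces the whole result to character
bad <- cbind(Counts = as.vector(m), Genotype = "KO")
is.character(bad)        # TRUE

# data.frame() keeps each column's own type, so Counts stays numeric
good <- data.frame(Counts = as.vector(m), Genotype = "KO")
is.numeric(good$Counts)  # TRUE

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(good, aes(x = Genotype, y = Counts)) + geom_boxplot()
  print(p)
}
```

With a numeric Counts column, scale_y_continuous() or coord_cartesian() can be added without the discrete/continuous error, provided the limits cover the data.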

Related

how to mimic histogram plot from flowjo in R using flowCore?

I'm new to flowCore + R. I would like to mimic a histogram plot after gating, which can be done manually in the FlowJo software. I got something similar, but it doesn't look quite right because it is a "density" plot and is shifted. How can I get the x axis to shift over and look similar to how FlowJo outputs the plot? I tried reading this document but couldn't find a plot similar to the one in FlowJo: howtoflowcore. I'd appreciate any guidance. Thanks.
code snippet:
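library(flowCore)
library(ggcyto)  # provides ggcyto() and ggcyto_par_set()
parentpath <- "/parent/path"
subfolder <- "Sample 1"
# file.path() inserts the separator; full.names = TRUE returns paths that read.flowSet() can open
fcs_files <- list.files(file.path(parentpath, subfolder), pattern = "\\.fcs$", full.names = TRUE)
fs <- read.flowSet(fcs_files)
# Rectangular gate on forward and side scatter
rect.g <- rectangleGate(filterId = "main", list("FSC-A" = c(1e5, 2e5), "SSC-A" = c(3e4, 1e5)))
fs_sub <- Subset(fs, rect.g)
p <- ggcyto(fs_sub[[15]], aes(x = `UV-379-A`)) +
  geom_density(fill = "black", alpha = 0.4) +
  ggcyto_par_set(limits = list(x = c(-1e3, 5e4), y = c(0, 6e-5)))
p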
FlowJo output:
R FlowCore output:
The reason for the "shift" is that the x axis is logarithmic (base 10) in the FlowJo graph. To achieve the same result in R, add
+ scale_x_log10()
after the existing code. This might interact weirdly with the axis limits you've set (a log scale cannot show the negative lower limit), so bear that in mind.
To make the y-axis "count" rather than density, you can change the first line of your ggcyto() call to:
aes(x= `UV-379-A`, y = after_stat(count))
Let me know if that works - I don't have your data to hand so that's all from memory!
Any purely aesthetic changes are relatively easy to look up.
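The two suggestions above can be tried together on synthetic data. This is only a sketch in plain ggplot2: the intensity column is a made-up, strictly positive stand-in for a fluorescence channel, not real flow cytometry data, and after_stat() assumes ggplot2 3.3.0 or later.

```r
set.seed(42)
# Hypothetical stand-in for one fluorescence channel (strictly positive values)
df <- data.frame(intensity = rlnorm(5000, meanlog = 7, sdlog = 1))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # y = after_stat(count) rescales the density curve to counts;
  # scale_x_log10() puts the x axis on a base-10 log scale
  p <- ggplot(df, aes(x = intensity, y = after_stat(count))) +
    geom_density(fill = "black", alpha = 0.4) +
    scale_x_log10()
  print(p)
}
```

Because the log scale is only defined for positive values, any axis limits set alongside it should be positive as well.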

add interactivity to basic box plot r

I'm trying to create a box plot without outliers.
I tried ggplot's outlier.shape argument, but the result was not satisfactory: most of each whisker ends up hidden, because the range of the Resolved_Hour column differs so much across priorities. Tweaking the Cartesian coordinates elongates the whiskers further.
a <- ggplot(data, aes(x=priority,y=Resolved_Hour,color = priority))+
geom_boxplot(outlier.shape = NA)+
coord_cartesian(ylim = quantile(data$Resolved_Hour,c(0.25,0.75),na.rm = T))
ggplotly(a)
Eg:
The basic boxplot() function from the graphics package let me plot without outliers using outline = FALSE:
boxplot(Resolved_Hour ~ priority,data = data,horizontal=TRUE,axes=TRUE,outline=FALSE,col = "bisque")
But I couldn't add interactivity to this plot. When I run
a <- boxplot(Resolved_Hour ~ priority,data = data,horizontal=TRUE,axes=TRUE,outline=FALSE,col = "bisque")
ggplotly(a)
it throws this error:
Error in UseMethod("ggplotly", p) :
no applicable method for 'ggplotly' applied to an object of class "list"
The plot is saved as a list as below:
Any help in resolving this and plotting a better box plot is highly appreciated :-)
Thank you.
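The error arises because base boxplot() returns a list of summary statistics rather than a ggplot object, so ggplotly() has nothing it can convert. One possible approach, sketched here on made-up data, is to stay in ggplot2 and zoom to the whisker range instead of the quartile range. The whisker ends come from boxplot.stats(), which may differ slightly from the ones geom_boxplot() draws, so treat the limits as approximate.

```r
set.seed(1)
# Hypothetical stand-in for the asker's data, with very different ranges per priority
data <- data.frame(
  priority = rep(c("P1", "P2", "P3"), each = 50),
  Resolved_Hour = c(rexp(50, 1), rexp(50, 0.1), rexp(50, 0.02))
)

# boxplot.stats()$stats holds: lower whisker, lower hinge, median,
# upper hinge, upper whisker; keep the two whisker ends per group
whiskers <- sapply(split(data$Resolved_Hour, data$priority),
                   function(x) boxplot.stats(x)$stats[c(1, 5)])
y_range <- c(min(whiskers[1, ]), max(whiskers[2, ]))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  a <- ggplot(data, aes(x = priority, y = Resolved_Hour, color = priority)) +
    geom_boxplot(outlier.shape = NA) +
    coord_cartesian(ylim = y_range)  # zooms without dropping data
  print(a)
}
```

Since a is a ggplot object, it can be passed to ggplotly(); whether the interactive version also hides the outlier points is worth checking, as that may depend on the plotly version.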

Why aren't any points showing up in the qqcomp function when using plotstyle="ggplot"?

I want to compare the fit of different distributions to my data in a single plot. The qqcomp function from the fitdistrplus package pretty much does exactly what I want to do. The only problem I have however, is that it's mostly written using base R plot and all my other plots are written in ggplot2. I basically just want to customize the qqcomp plots to look like they have been made in ggplot2.
From the documentation (https://www.rdocumentation.org/packages/fitdistrplus/versions/1.0-14/topics/graphcomp) I get that this is totally possible by setting plotstyle="ggplot". If I do this however, no points are showing up on the plot, even though it worked perfectly without the plotstyle argument. Here is a little example to visualize my problem:
library(fitdistrplus)
library(ggplot2)
set.seed(42)
vec <- rgamma(100, shape=2)
fit.norm <- fitdist(vec, "norm")
fit.gamma <- fitdist(vec, "gamma")
fit.weibull <- fitdist(vec, "weibull")
model.list <- list(fit.norm, fit.gamma, fit.weibull)
qqcomp(model.list)
This gives the following output:
While this:
qqcomp(model.list, plotstyle="ggplot")
gives the following output:
Why are the points not showing up? Am I doing something wrong here or is this a bug?
EDIT:
So I haven't figured out why this doesn't work, but there is a pretty easy workaround. The function call qqcomp(model.list, plotstyle="ggplot") still returns a ggplot object, which includes the data used to make the plot. Using that data, one can easily write one's own plot function that does exactly what is needed. It's not very elegant, but until someone finds out why it's not working as expected I will just use this method.
I was able to reproduce your error and indeed, it's really intriguing. Maybe you should contact the developers of this package to report this bug.
Otherwise, if you want to reproduce this Q-Q plot using ggplot, you can use stat_qq, passing the corresponding distribution function and its associated parameters (stored in $estimate):
library(ggplot2)
df = data.frame(vec)
ggplot(df, aes(sample = vec))+
stat_qq(distribution = qgamma, dparams = as.list(fit.gamma$estimate), color = "green")+
stat_qq(distribution = qnorm, dparams = as.list(fit.norm$estimate), color = "red")+
stat_qq(distribution = qweibull, dparams = as.list(fit.weibull$estimate), color = "blue")+
geom_abline(slope = 1, color = "black")+
labs(title = "Q-Q Plots", x = "Theoretical quantiles", y = "Empirical quantiles")
Hope it will help you.

R grouped/centered barplot with different fill with ggplot2

I have the following dataset:
db1.1 <- data.frame(Status1.1 = rep(c("Completed", "Ongoing"), each=9),code1.1= rep(c(1:9), times=2), nProj1.1 = c(-24,-2,-17,-59,-1,-12,-6,0,0,0,2,3,5,0,2,0,1,1))
With this dataset, I build a graphic very similar to this one (code1.1 is the x axis, nProj1.1 is the y axis, and Status1.1 gives the two different grey tones):
I used this code to build the graphic:
ggplot(db1.1, aes(x=code1.1, y=nProj1.1, fill=Status1.1)) + geom_bar(stat="identity", position="identity")+coord_flip()+geom_hline(yintercept = 0, size=1)
However, I want to add a new variable/overlap a graphic, to obtain the following result:
Basically, it is the same as the one above but with values over the grey bars, with the dashed lines.
I have a new dataset that should correspond to the bars with dashed lines, with the same variables:
db1.2 <- data.frame(Status1.2 = rep(c("Completed", "Ongoing"), each=9),code1.2= rep(c(1:9), times=2), nProj1.2 = c(0,0,-14,-43,-1,-10,-5,0,0,0,2,3,5,0,1,0,0,1)) # keep as is; this already assigns a class to each variable, e.g. factor, num, int, etc.
I tried following this question: R-stacked-grouped barplot with different fill in R, but I haven't managed to make it work yet. I could also combine both datasets and create a new binary variable, but I am not sure whether that would help.
Does anyone know how I can make this kind of graph?
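One possible approach, offered as an untested-against-the-target sketch rather than a definitive answer, is to draw the second dataset as an extra bar layer with no fill and a dashed outline on top of the grey bars. The layer settings (fill = NA, colour, linetype) are assumptions about the desired look; linewidth assumes ggplot2 3.4.0 or later (older versions use size).

```r
db1.1 <- data.frame(Status1.1 = rep(c("Completed", "Ongoing"), each = 9),
                    code1.1 = rep(1:9, times = 2),
                    nProj1.1 = c(-24, -2, -17, -59, -1, -12, -6, 0, 0,
                                 0, 2, 3, 5, 0, 2, 0, 1, 1))
db1.2 <- data.frame(Status1.2 = rep(c("Completed", "Ongoing"), each = 9),
                    code1.2 = rep(1:9, times = 2),
                    nProj1.2 = c(0, 0, -14, -43, -1, -10, -5, 0, 0,
                                 0, 2, 3, 5, 0, 1, 0, 0, 1))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(db1.1, aes(x = code1.1, y = nProj1.1, fill = Status1.1)) +
    # grey bars from the first dataset
    geom_col(position = "identity") +
    # second dataset drawn as unfilled, dashed outlines over the bars
    geom_col(data = db1.2, aes(x = code1.2, y = nProj1.2),
             inherit.aes = FALSE, position = "identity",
             fill = NA, colour = "black", linetype = "dashed") +
    geom_hline(yintercept = 0, linewidth = 1) +
    coord_flip()
  print(p)
}
```

Keeping the two datasets separate avoids reshaping; combining them with a binary indicator column would also work if a legend entry for the dashed bars is needed.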

Lines in ggplot order

From the mgcv library, I get the points to plot with:
fsb <- fs.boundary(r0=0.1, r=1.1, l=2173)
If I plot fsb with the standard graphics package and then add lines, I get:
x11()
plot(fsb)
lines(fsb$x,fsb$y)
I am now trying the same with ggplot (this is a line from a larger script):
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
ts=fsb$x
ps=fsb$y
geom_line(data=tpdf, aes(ts,ps), inherit.aes = FALSE)
I get a messy plot:
I think that I'm getting the order wrong in geom_line.
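geom_line connects the observations in order of their x values, whereas geom_path connects them in the order in which they appear in the data. This can be solved by using geom_path: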
ggplot(tpdf)+
geom_point(aes(ts,ps)) +
geom_path(aes(ts,ps))
You have a very odd way of using ggplot; I recommend that you reexamine it.
data:
library(mgcv)
fsb <- fs.boundary(r0 = 0.1, r=2, l=13)
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
You'll have to specify the group parameter - for example, this
ggplot(tpdf) +
geom_point(aes(ts, ps)) +
geom_line(aes(ts, ps, group = gl(4, 40)))
gives me a plot similar to the one in base R.
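The difference between the two geoms used in the answers above can be seen on a simpler closed outline. This sketch uses a made-up circle rather than the fs.boundary() data: the rows follow the outline, so x is not sorted, and geom_line would reorder the points by x while geom_path keeps the row order.

```r
# Points listed in the order they trace a circle
theta <- seq(0, 2 * pi, length.out = 50)
circle <- data.frame(x = cos(theta), y = sin(theta))

# x starts at 1 and decreases, so the rows are not in increasing x order
is.unsorted(circle$x)  # TRUE

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p_path <- ggplot(circle, aes(x, y)) + geom_path()  # follows row order: a circle
  p_line <- ggplot(circle, aes(x, y)) + geom_line()  # sorts by x: a zigzag
  print(p_path)
  print(p_line)
}
```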
