Add interactivity to a basic box plot in R

I'm trying to create a box plot without outliers.
I have tried ggplot's outlier.shape, but the result was not satisfactory: it hides most of the whiskers because the range of the Resolved_Hour column differs widely across priority. Tweaking the Cartesian coordinates only elongates the whiskers further.
a <- ggplot(data, aes(x=priority,y=Resolved_Hour,color = priority))+
geom_boxplot(outlier.shape = NA)+
coord_cartesian(ylim = quantile(data$Resolved_Hour,c(0.25,0.75),na.rm = T))
ggplotly(a)
The basic box plot from the graphics library let me plot without outliers using outline = FALSE:
boxplot(Resolved_Hour ~ priority,data = data,horizontal=TRUE,axes=TRUE,outline=FALSE,col = "bisque")
But I couldn't add interactivity to this plot as
a <- boxplot(Resolved_Hour ~ priority,data = data,horizontal=TRUE,axes=TRUE,outline=FALSE,col = "bisque")
ggplotly(a)
It throws an error as
Error in UseMethod("ggplotly", p) :
no applicable method for 'ggplotly' applied to an object of class "list"
The plot is saved as a list as below:
Any help is highly appreciated :-) to resolve and plot a better box plot.
Thank you.
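The error arises because boxplot() returns a plain list of summary statistics rather than a ggplot object, so ggplotly() has no method for it. One possible workaround, sketched here on made-up data (the data frame below is a hypothetical stand-in, not the asker's), is to reuse those statistics to choose the y limits for the ggplot version. Note that ggplot2 computes its hinges slightly differently from boxplot(), so the limits are approximate.

```r
library(ggplot2)

# Hypothetical stand-in data: three priorities with very different ranges.
set.seed(1)
data <- data.frame(
  priority = rep(c("P1", "P2", "P3"), each = 50),
  Resolved_Hour = c(rexp(50, 1), rexp(50, 0.1), rexp(50, 0.01))
)

# plot = FALSE returns the summary list without drawing anything.
# $stats has one column per group and five rows:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
bp <- boxplot(Resolved_Hour ~ priority, data = data, plot = FALSE)
whisker_range <- range(bp$stats)

# Zoom to the overall whisker range instead of the 25%-75% quantiles.
p <- ggplot(data, aes(x = priority, y = Resolved_Hour, color = priority)) +
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = whisker_range)
```

The resulting p is a ggplot object, so it can be passed to ggplotly(p) as in the original attempt.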

Related

how to mimic histogram plot from flowjo in R using flowCore?

I'm new to flowCore + R. I would like to mimic a histogram plot after gating that can be manually done in FlowJo software. I got something similar but it doesn't look quite right because it is a "density" plot and is shifted. How can I get the x axis to shift over and look similar to how FlowJo outputs the plot? I tried reading this document but couldn't find a plot similar to the one in FlowJo: howtoflowcore Appreciate any guidance. Thanks.
code snippet:
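library(flowCore)
library(ggcyto)  # provides ggcyto() and ggcyto_par_set()
parentpath <- "/parent/path"
subfolder <- "Sample 1"
# file.path() inserts the path separator; full.names = TRUE returns paths that read.flowSet() can open
fcs_files <- list.files(file.path(parentpath, subfolder), pattern = "\\.fcs$", full.names = TRUE)
fs <- read.flowSet(fcs_files)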
rect.g <- rectangleGate(filterId = "main",list("FSC-A" = c(1e5, 2e5), "SSC-A" = c(3e4,1e5)))
fs_sub <- Subset(fs, rect.g)
p <- ggcyto(fs_sub[[15]], aes(x= `UV-379-A`)) +
geom_density(fill='black', alpha = 0.4) +
ggcyto_par_set(limits = list(x = c(-1e3, 5e4), y = c(0, 6e-5)))
p
FlowJo output:
R FlowCore output:
The reason for the "shift" is that the x axis is logarithmic (base 10) in the FlowJo graph. To achieve the same result in R, add
+ scale_x_log10()
after the existing code. This might interact weirdly with the axis limits you've set (a lower limit of -1e3 cannot be shown on a log scale), so bear that in mind.
To make the y-axis "count" rather than density, you can change the first line of your ggcyto() call to:
aes(x= `UV-379-A`, y = after_stat(count))
Let me know if that works - I don't have your data to hand so that's all from memory!
Purely aesthetic changes are relatively easy to look up.
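Since the FCS files are not available here, the two suggestions can be tried out on plain ggplot2 with simulated data. This is only a sketch: the events data frame and its intensity column are invented for illustration, and ggcyto may handle scales somewhat differently.

```r
library(ggplot2)

# Hypothetical stand-in for one fluorescence channel: log-normal intensities.
set.seed(42)
events <- data.frame(intensity = rlnorm(1000, meanlog = 8, sdlog = 1))

# after_stat(count) rescales the density so the y axis reads as counts;
# scale_x_log10() puts the x axis on a base-10 log scale.
p <- ggplot(events, aes(x = intensity, y = after_stat(count))) +
  geom_density(fill = "black", alpha = 0.4) +
  scale_x_log10()

# Inspect the values that ggplot2 computed for the layer.
built <- layer_data(p)
```

In the computed layer data, count is the density multiplied by the number of observations, which is why the curve keeps its shape and only the y axis changes.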

Weird box plot result when trying to create a box plot in ggplot2 using linearized data

I have an interactions matrix that looks like this:
chr10.117800000 chr10.117801000 chr10.117802000 chr10.117803000 chr10.117804000 chr10.117805000 chr10.117806000
chr10.117800000 0.006484824 0.006451925 0.006422584 0.006386328 0.006292793 0.006277799 0.006231732
chr10.117801000 0.006451925 0.006435975 0.006415112 0.006378994 0.006285668 0.006272825 0.006226796
chr10.117802000 0.006422584 0.006415112 0.006406884 0.006370748 0.006277475 0.006264644 0.006222890
chr10.117803000 0.006386328 0.006378994 0.006370748 0.006346183 0.006307680 0.006294757 0.006254941
chr10.117804000 0.006292793 0.006285668 0.006277475 0.006307680 0.006324919 0.006311969 0.006276300
chr10.117805000 0.006277799 0.006272825 0.006264644 0.006294757 0.006311969 0.006303327 0.006269839
chr10.117806000 0.006231732 0.006226796 0.006222890 0.006254941 0.006276300 0.006269839 0.006244967
chr10.117807000 0.006242481 0.006235449 0.006231538 0.006265652 0.006287100 0.006282769 0.006257918
chr10.117808000 0.006140677 0.006133760 0.006129913 0.006161364 0.006188786 0.006186627 0.006166320
chr10.117809000 0.006098614 0.006091771 0.006087950 0.006119074 0.006146385 0.006146359 0.006130442
I am trying to linearize and label it to prep it for ggplot2, which I have accomplished using this code:
data <- as.vector(data)
data <- cbind(Counts = data, Genotype = "KO")
However, whenever I take my data and plot it using ggplot2 with this command:
blah <- ggplot(data =test, aes(x = Genotype, y = Counts)) + geom_boxplot()
It gives me a weird looking box plot that looks like this:
I have tried to add scale_y_continuous(limits = c(0, 0.002)), but each time that I do that I get an error that I'm trying to add discrete values to a continuous scale. Does anyone know what's going on, or if there is a better way to do this?
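A likely cause, assuming test was built from the cbind() result above: cbind() of a numeric vector and the string "KO" yields a character matrix, so Counts arrives in ggplot2 as a discrete variable. A minimal sketch of one way around it, using a made-up matrix m in place of the real interaction matrix, is to build a data frame directly:

```r
library(ggplot2)

# Hypothetical stand-in for the interaction matrix (10 rows x 7 columns).
set.seed(1)
m <- matrix(runif(70, min = 0.0060, max = 0.0065), nrow = 10)

# cbind() must pick one type for the whole matrix, so the numbers become strings.
coerced <- cbind(Counts = as.vector(m), Genotype = "KO")

# data.frame() keeps each column's own type, so Counts stays numeric.
test <- data.frame(Counts = as.vector(m), Genotype = "KO")

blah <- ggplot(data = test, aes(x = Genotype, y = Counts)) +
  geom_boxplot()
```

Note also that the values shown in the question are all around 0.006, so limits = c(0, 0.002) would exclude every point even once Counts is numeric.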

Why aren't any points showing up in the qqcomp function when using plotstyle="ggplot"?

I want to compare the fit of different distributions to my data in a single plot. The qqcomp function from the fitdistrplus package pretty much does exactly what I want to do. The only problem I have however, is that it's mostly written using base R plot and all my other plots are written in ggplot2. I basically just want to customize the qqcomp plots to look like they have been made in ggplot2.
From the documentation (https://www.rdocumentation.org/packages/fitdistrplus/versions/1.0-14/topics/graphcomp) I get that this is totally possible by setting plotstyle="ggplot". If I do this however, no points are showing up on the plot, even though it worked perfectly without the plotstyle argument. Here is a little example to visualize my problem:
library(fitdistrplus)
library(ggplot2)
set.seed(42)
vec <- rgamma(100, shape=2)
fit.norm <- fitdist(vec, "norm")
fit.gamma <- fitdist(vec, "gamma")
fit.weibull <- fitdist(vec, "weibull")
model.list <- list(fit.norm, fit.gamma, fit.weibull)
qqcomp(model.list)
This gives the following output:
While this:
qqcomp(model.list, plotstyle="ggplot")
gives the following output:
Why are the points not showing up? Am I doing something wrong here or is this a bug?
EDIT:
So I haven't figured out why this doesn't work, but there is a pretty easy workaround. The function call qqcomp(model.list, plotstyle="ggplot") still returns a ggplot object, which includes the data used to make the plot. Using that data, one can easily write a custom plot function that does exactly what one wants. It's not very elegant, but until someone finds out why it's not working as expected I will just use this method.
I was able to reproduce your error, and indeed it's really intriguing. Maybe you should contact the developers of this package to report this bug.
Otherwise, you can reproduce this Q-Q plot using ggplot and stat_qq, passing the corresponding distribution function and its associated parameters (stored in $estimate):
library(ggplot2)
df = data.frame(vec)
ggplot(df, aes(sample = vec))+
stat_qq(distribution = qgamma, dparams = as.list(fit.gamma$estimate), color = "green")+
stat_qq(distribution = qnorm, dparams = as.list(fit.norm$estimate), color = "red")+
stat_qq(distribution = qweibull, dparams = as.list(fit.weibull$estimate), color = "blue")+
geom_abline(slope = 1, color = "black")+
labs(title = "Q-Q Plots", x = "Theoretical quantiles", y = "Empirical quantiles")
Hope it will help you.

Run points() after plot() on a dataframe

I'm new to R and want to plot specific points over an existing plot. I'm using the swiss data frame, which I visualize through the plot(swiss) function.
After this, I want to add the outliers given by the Mahalanobis distance:
mu_hat <- apply(swiss, 2, mean); sigma_hat <- cov(swiss)
mahalanobis_distance <- mahalanobis(swiss, mu_hat, sigma_hat)
outliers <- swiss[names(mahalanobis_distance[mahalanobis_distance > 10]),]
points(outliers, pch = 'x', col = 'red')
but this last line has no effect, as the outlier points aren't added to the previous plot. I see that if I repeat this procedure on a pair of variables, say
plot(swiss[2:3])
points(outliers[2:3], pch = 'x', col = 'red')
the red points are added to the plot.
Question: is there any restriction on how the points() function can be used for a multivariate data frame?
Here's a solution using GGally::ggpairs. It's a little ugly as we need to modify the ggally_points function to specify the desired color scheme.
I've assumed that mu_hat = colMeans(swiss) and sigma_hat = cov(swiss).
library(dplyr)
library(GGally)
swiss %>%
bind_cols(distance = mahalanobis(swiss, colMeans(swiss), cov(swiss))) %>%
mutate(is_outlier = ifelse(distance > 10, "yes", "no")) %>%
ggpairs(columns = 1:6,
mapping = aes(color = is_outlier),
upper = list(continuous = function(data, mapping, ...) {
ggally_points(data = data, mapping = mapping) +
scale_colour_manual(values = c("black", "red"))
}),
lower = list(continuous = function(data, mapping, ...) {
ggally_points(data = data, mapping = mapping) +
scale_colour_manual(values = c("black", "red"))
}),
axisLabels = "internal")
Unfortunately this isn't possible the way you're currently doing things. When plotting a data frame R produces many plots and aligns them. What you're actually seeing there is 6 by 6 = 36 individual plots which have all been aligned to look nice.
When you use the points() function, it places the points on the current plot, which doesn't really make sense when you have 36 plots, at least not in the way you want it to.
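One base R option, offered as a sketch rather than something taken from the answers here, is to skip points() and pass per-row colours and symbols to pairs(), which is what plot() calls for a data frame with more than two columns. The threshold of 10 is taken from the question; the variable names d2 and is_outlier are made up for illustration.

```r
# Squared Mahalanobis distance of every row from the column means.
d2 <- mahalanobis(swiss, colMeans(swiss), cov(swiss))

# Flag rows beyond the threshold used in the question.
is_outlier <- d2 > 10

# col and pch are recycled per observation, so every panel of the
# scatterplot matrix marks the same rows in red crosses.
pairs(swiss,
      col = ifelse(is_outlier, "red", "black"),
      pch = ifelse(is_outlier, 4, 1))
```

Because the styling is applied inside each panel, there is no need to add anything to the plot afterwards.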
ggplot2 is a really powerful tool in R and provides far greater customizability. For example, you could set up the data frame to include your outliers, label them as "outlier", and show them in each plot you have set up as facets. The more you explore it, the more likely you are to find plots that suit your needs even better.
Plotting a data frame in base R is a good exploratory tool. You could set up those outliers as a separate data frame and plot it, so you can see each of the 6 by 6 plots side by side and compare. It all depends on your goal. If your goal is to produce exactly what you've described, the ggplot2 package will help you create something more professional. As @Gregor suggested in the comments, looking up the function ggpairs from the GGally package would be a good place to start.
A quick google image search shows some funky plots akin to what you're after and then some!

Lines in ggplot order

From the mgcv library, I get the points to plot with:
fsb <- fs.boundary(r0=0.1, r=1.1, l=2173)
If I plot fsb with the standard graphics package and then add lines, I get:
x11()
plot(fsb)
lines(fsb$x,fsb$y)
Now I try with ggplot (this is a line within a bigger piece of code):
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
ts=fsb$x
ps=fsb$y
geom_line(data=tpdf, aes(ts,ps), inherit.aes = FALSE)
I get a messy plot:
I think I'm getting the order wrong in geom_line.
This can be solved by using geom_path:
ggplot(tpdf)+
geom_point(aes(ts,ps)) +
geom_path(aes(ts,ps))
You have a very odd way of using ggplot; I recommend that you reexamine it.
data:
library(mgcv)
fsb <- fs.boundary(r0 = 0.1, r=2, l=13)
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
You'll have to specify the group parameter - for example, this
ggplot(tpdf) +
geom_point(aes(ts, ps)) +
geom_line(aes(ts, ps, group = gl(4, 40)))
gives me a plot similar to the one in base R.
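The difference between the two geoms comes down to ordering: geom_line() connects observations in order of x, while geom_path() connects them in the order they appear in the data. A small sketch on a made-up circle (not the fs.boundary output) shows this:

```r
library(ggplot2)

# Hypothetical closed outline: points listed in drawing order around a circle.
theta <- seq(0, 2 * pi, length.out = 50)
circle <- data.frame(x = cos(theta), y = sin(theta))

p_path <- ggplot(circle, aes(x, y)) + geom_path()  # keeps the data order
p_line <- ggplot(circle, aes(x, y)) + geom_line()  # reorders by x first

# Compare the order in which each layer will connect the points.
path_x <- layer_data(p_path)$x
line_x <- layer_data(p_line)$x
```

Here path_x follows the circle as given, whereas line_x is sorted from left to right, which is what produces the zigzag in the messy plot.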
