Using pcaCoda from the package "robCompositions" with ggplot2 - r

I would like to plot the results of the robust PCA (pcaCoDa) from the robCompositions package using ggplot2.
Previously this worked with ggbiplot (https://github.com/vqv/ggbiplot), but I can no longer get it to work with my current R version (3.6.0).
Is there a way to make a biplot of the pcaCoDa results with ggplot2 using only CRAN packages?
Here is a working example without using ggplot:
library(robCompositions)
df <- arcticLake
a <- pcaCoDa(df)
biplot(a)
And another example without using the robust PCA, but using the autoplot function:
library(ggplot2)
autoplot(princomp(df))
However, I would like to use the robust PCA with ggplot/autoplot. When I try to plot it, I get the following error:
autoplot(a)
Error: Objects of type pcaCoDa not supported by autoplot.
I also tried the following and also get an error:
autoplot(a$princompOutputClr)
Error in scale.default(data, center = FALSE, scale = 1/scale) :
length of 'scale' must equal the number of columns of 'x'
Any advice? Thanks!

For reasons I don't know, pcaCoDa returns one value fewer for scale and center than other PCA methods such as prcomp or princomp. I think that is why autoplot refuses to plot this object.
Alternatively, if you want to apply a robust algorithm, you can use the package pcaMethods, available on Bioconductor. Here is an example using the iris dataset, adapted from the pcaMethods documentation (https://bioconductor.org/packages/release/bioc/html/pcaMethods.html):
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("pcaMethods")
library(pcaMethods)
library(ggplot2)
robust <- pca(iris[c(1, 2, 3, 4)], method = "robustPca", scale = "uv", center = TRUE)
iris <- merge(iris, scores(robust), by = 0)
ggplot(iris, aes(x = PC1, y = PC2, colour = Species)) +
    geom_point() +
    stat_ellipse()
Does this look like what you are trying to get?
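If you would rather stay with pcaCoDa, here is a minimal sketch of building the biplot by hand in ggplot2. It assumes that the princompOutputClr element mentioned in the question carries scores and loadings matrices like a princomp object does, with one loadings row per original variable. The names scores_df, loadings_df and arrow_scale are illustrative, and the arrow scaling is an arbitrary choice made only so that arrows and points share one panel.

```r
library(robCompositions)
library(ggplot2)

df <- arcticLake
a <- pcaCoDa(df)

# Assumption: princompOutputClr holds princomp-style scores and loadings.
clr_fit <- a$princompOutputClr

# Scores: one row per observation; keep the first two components.
scores_df <- as.data.frame(unclass(clr_fit$scores)[, 1:2])
names(scores_df) <- c("PC1", "PC2")

# Loadings: assumed to be one row per original variable.
loadings_df <- as.data.frame(unclass(clr_fit$loadings)[, 1:2])
names(loadings_df) <- c("PC1", "PC2")
loadings_df$variable <- colnames(df)

# Arbitrary factor that stretches the arrows to the range of the scores.
arrow_scale <- max(abs(as.matrix(scores_df)))

p <- ggplot() +
  geom_point(data = scores_df, aes(x = PC1, y = PC2)) +
  geom_segment(data = loadings_df,
               aes(x = 0, y = 0,
                   xend = PC1 * arrow_scale, yend = PC2 * arrow_scale),
               arrow = arrow(length = unit(0.2, "cm")), colour = "red") +
  geom_text(data = loadings_df,
            aes(x = PC1 * arrow_scale, y = PC2 * arrow_scale, label = variable),
            colour = "red", vjust = -0.5)
print(p)
```

Because the result is an ordinary ggplot object, themes, colours by group, and other layers can be added in the usual way.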

Related

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Following is my code.
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
    geom_point(mapping = aes(x = Species, y = cluster, colour = cluster),
               position = "jitter") +
    labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.

Lines in ggplot order

Using the mgcv library, I get the points to plot with:
fsb <- fs.boundary(r0=0.1, r=1.1, l=2173)
If I plot fsb with the standard graphics package and then add lines, I get the expected outline:
x11()
plot(fsb)
lines(fsb$x,fsb$y)
I then tried the same with ggplot (this line is part of a larger script):
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
ts=fsb$x
ps=fsb$y
geom_line(data=tpdf, aes(ts,ps), inherit.aes = FALSE)
I get a messy plot:
I think I am getting the order of the points wrong in geom_line.
This can be solved by using geom_path:
ggplot(tpdf) +
    geom_point(aes(ts, ps)) +
    geom_path(aes(ts, ps))
You have a very odd way of using ggplot; I recommend you re-examine it.
data:
library(mgcv)
fsb <- fs.boundary(r0 = 0.1, r=2, l=13)
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
You'll have to specify the group parameter - for example, this
ggplot(tpdf) +
    geom_point(aes(ts, ps)) +
    geom_line(aes(ts, ps, group = gl(4, 40)))
gives me a plot similar to the one in base R.
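The difference between the two geoms can be seen on a small made-up example: geom_line() connects observations in order of their x value, while geom_path() connects them in the order they appear in the data. The circle data and the names p_line and p_path below are illustrative only and are not taken from the question.

```r
library(ggplot2)

# Points on a circle, stored in the order they should be connected.
theta <- seq(0, 2 * pi, length.out = 50)
circle <- data.frame(x = cos(theta), y = sin(theta))

# geom_line() connects observations in order of x, so the outline zig-zags.
p_line <- ggplot(circle, aes(x, y)) + geom_line() + ggtitle("geom_line")

# geom_path() connects observations in the order they appear in the data.
p_path <- ggplot(circle, aes(x, y)) + geom_path() + ggtitle("geom_path")

print(p_line)
print(p_path)
```

For a boundary that is already stored in drawing order, as in the base R lines() call, geom_path() is usually the closer equivalent.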

adding correlation test results to ggplot

I'm trying to create a ggplot and add results of a correlation test I have done.
Something along the lines of:
p+annotate("text",x=12.5,y=15.25,label=c(cor.test$estimate,cor.test$p.value))
I keep getting error messages no matter what I try.
Any ideas?
I have actually managed to add the statistics to the plot by using stat_cor() from the package ggpubr:
library(ggpubr)
p+stat_cor(method="pearson")
The ggstatsplot package can do this for you (it is still under active development, but it is on CRAN).
Here is an example of how to create a correlation plot:
ggstatsplot::ggscatterstats(data = iris, x = Sepal.Length, y = Sepal.Width)
This will produce a plot that looks like the following (you can similarly get results from Spearman's rho (type = 'spearman') or robust correlation test (type = 'robust')):
Check out the documentation of the function for more.
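If you prefer to stay with annotate() as in the question, a minimal sketch is to store the test result in its own object and collapse the two numbers into a single string before passing it as the label. The object name ct, the label format, and the use of iris are assumptions for illustration. Note that writing cor.test$estimate refers to the function cor.test itself, which cannot be subset with $.

```r
library(ggplot2)

# Store the test result in an object; `ct` is a hypothetical name.
ct <- cor.test(iris$Sepal.Length, iris$Sepal.Width)

# Collapse the estimate and p-value into a single character string.
lbl <- sprintf("r = %.2f, p = %.3f", ct$estimate, ct$p.value)

# x = Inf / y = Inf pins the text to the top-right corner of the panel;
# hjust and vjust nudge it back inside.
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point() +
  annotate("text", x = Inf, y = Inf, label = lbl, hjust = 1.1, vjust = 1.5)
print(p)
```

Explicit coordinates such as x = 12.5, y = 15.25 from the question work the same way, provided they fall inside the plotted range.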

Plot histograms or pie charts in a scatter plot

I need to reproduce what was done in the post
"tiny pie charts to represent each point in an scatterplot using ggplot2", but I ran into the problem that the package ggsubplot is not available for R version 3.3.1.
Essentially I need a histogram or a pie chart in predefined points on the scatterplot. Here is the same code that is used in the cited post:
foo <- data.frame(X=runif(30),Y=runif(30),A=runif(30),B=runif(30),C=runif(30))
foo.m <- melt(foo, id.vars=c("X","Y"))
ggplot(foo.m, aes(X,Y))+geom_point()
ggplot(foo.m) +
    geom_subplot2d(aes(x = X, y = Y,
                       subplot = geom_bar(aes(variable, value, fill = variable),
                                          stat = "identity")),
                   width = rel(.5), ref = NULL)
The code used libraries reshape2, ggplot2 and ggsubplot.
The image that I want to see is in the post cited above
UPD: I downloaded the older versions of R (3.0.2 and 3.0.3) and checkpoint package, and used:
checkpoint("2014-09-18")
as described in the comment below. But I get an error:
Using binwidth 0.0946
Using binwidth 0.0554
Error in layout_base(data, vars, drop = drop) :
At least one layer must contain all variables used for facetting
Which I can't get around, because when I try to include facet, the following error comes up:
Error: ggsubplots do not support facetting
It doesn't look like ggsubplot is going to fix itself any time soon. One option would be to use the checkpoint package, and essentially "reset" your copy of R to a time when the package was compatible. This post suggests using a time point of 2014-09-18.

R: ggfortify: "Objects of type prcomp not supported by autoplot"

I am trying to use ggfortify to visualize the results of a PCA I did using prcomp.
sample code:
iris.pca <- iris[c(1, 2, 3, 4)]
autoplot(prcomp(iris.pca))
Error: Objects of type prcomp not supported by autoplot. Please use qplot() or ggplot() instead.
What is odd is that autoplot is specifically designed to handle the results of prcomp - ggplot and qplot can't handle objects like this. I'm running R version 3.2 and just downloaded ggfortify off of github this AM.
Can anyone explain this message?
I'm guessing that you didn't load the required libraries. The code below:
library(devtools)
install_github('sinhrks/ggfortify')
library(ggfortify); library(ggplot2)
data(iris)
iris.pca <- iris[c(1, 2, 3, 4)]
autoplot(prcomp(iris.pca))
should work.
Although the simplicity of ggfortify is charming, I discourage its use because it overlaps with some standard ggplot2 functions (e.g. the warning replacing previous import ‘dplyr::vars’ by ‘ggplot2::vars’ when loading ‘ggfortify’). A simple workaround is to use ggplot2 directly.
Here I propose the two versions and their results.
# creating the PCA obj using iris data set
iris.pca <- iris[c(1, 2, 3, 4)]
pca.obj <- prcomp(iris.pca)
# ggfortify way - w coloring
library(ggfortify)
autoplot(pca.obj) + theme_minimal()
# ggplot2 way - w coloring
library(ggplot2)
# the first two components are selected (NB: you can also select 3 for 3D plotting, or more)
dtp <- data.frame('Species' = iris$Species, pca.obj$x[, 1:2])
ggplot(data = dtp) +
    geom_point(aes(x = PC1, y = PC2, col = Species)) +
    theme_minimal()
NB: the colouring with straightforward ggplot2 data frame structure is much easier.

Resources