R: ggfortify: "Objects of type prcomp not supported by autoplot" - r

I am trying to use ggfortify to visualize the results of a PCA I did using prcomp.
sample code:
iris.pca <- iris[c(1, 2, 3, 4)]
autoplot(prcomp(iris.pca))
Error: Objects of type prcomp not supported by autoplot. Please use qplot() or ggplot() instead.
What is odd is that autoplot is specifically designed to handle the results of prcomp, whereas ggplot and qplot can't handle objects like this. I'm running R version 3.2 and downloaded ggfortify from GitHub this morning.
Can anyone explain this message?

I'm guessing that you didn't load the required libraries. The code below should work:
library(devtools)
install_github('sinhrks/ggfortify')
library(ggfortify); library(ggplot2)
data(iris)
iris.pca <- iris[c(1, 2, 3, 4)]
autoplot(prcomp(iris.pca))
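To see why a missing library() call produces this particular message, it may help to look at how S3 dispatch falls back to a default method. The sketch below uses a toy generic, describe_plot, which is a hypothetical stand-in for autoplot and not part of any package. It assumes the real generic behaves the same way, with the prcomp method only becoming visible once ggfortify is loaded.

```r
# Toy generic standing in for autoplot (hypothetical name, not a real API)
describe_plot <- function(object, ...) UseMethod("describe_plot")

# Fallback that runs when no class-specific method can be found
describe_plot.default <- function(object, ...) {
  stop("Objects of type ", class(object)[1],
       " not supported by describe_plot.", call. = FALSE)
}

pca <- prcomp(iris[1:4])

# No prcomp method exists yet, so dispatch lands on the default and errors
before <- tryCatch(describe_plot(pca), error = function(e) conditionMessage(e))

# Defining the method plays the role that library(ggfortify) plays for autoplot
describe_plot.prcomp <- function(object, ...) {
  paste("PCA with", ncol(object$x), "components")
}

after <- describe_plot(pca)
print(before)
print(after)
```

The same call fails before the method exists and succeeds afterwards, which matches the symptom in the question.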

Although ggfortify's simplicity is appealing, I discourage its use because it overlaps with standard ggplot2 functions (e.g. the warning replacing previous import ‘dplyr::vars’ by ‘ggplot2::vars’ when loading ‘ggfortify’). A simple workaround is to use ggplot2 directly.
Below are the two versions and their results.
# creating the PCA obj using iris data set
iris.pca <- iris[c(1, 2, 3, 4)]
pca.obj <- prcomp(iris.pca)
# ggfortify way - no coloring (this call does not map Species to colour)
library(ggfortify)
autoplot(pca.obj) + theme_minimal()
# ggplot2 way - w coloring
library(ggplot2)
dtp <- data.frame('Species' = iris$Species, pca.obj$x[, 1:2]) # the first two components are selected (NB: you can select more, e.g. three for a 3D plot)
ggplot(data = dtp) +
  geom_point(aes(x = PC1, y = PC2, col = Species)) +
  theme_minimal()
NB: colouring is much easier with a plain ggplot2 data frame structure.
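If you also want the axes annotated with the share of variance each component explains, that can be computed from the prcomp object itself. This is a minimal sketch; the names var_explained and axis_labels are made up for illustration.

```r
library(ggplot2)

pca.obj <- prcomp(iris[c(1, 2, 3, 4)])

# Proportion of total variance carried by each principal component
var_explained <- pca.obj$sdev^2 / sum(pca.obj$sdev^2)

# Axis labels such as "PC1 (xx.x%)" for the first two components
axis_labels <- sprintf("PC%d (%.1f%%)", 1:2, 100 * var_explained[1:2])

dtp <- data.frame(Species = iris$Species, pca.obj$x[, 1:2])

p <- ggplot(dtp, aes(x = PC1, y = PC2, col = Species)) +
  geom_point() +
  labs(x = axis_labels[1], y = axis_labels[2]) +
  theme_minimal()
print(p)
```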

Related

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Here is my code:
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How can I fix this error?
I have a suggestion below, but first I see two issues:
1. You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, a different function will be run).
2. It's not clear what you're trying to plot: neither what the x- and y-axes should be nor the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always; consult ?plot), but here you're passing it a dbscan object and a data.frame, and it doesn't know how to handle them.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
  geom_point(mapping = aes(x = Species, y = cluster, colour = cluster),
             position = "jitter") +
  labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.
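On the name clash mentioned above: one way to check which package a bare dbscan() call resolves to is to inspect the function's environment. The sketch below assumes only the dbscan package is attached. The pairs() call is just one base-graphics option for looking at the clusters, not the plot the question was aiming for.

```r
library(dbscan)

# Reports the namespace that the currently visible dbscan() comes from
which_pkg <- environmentName(environment(dbscan))
print(which_pkg)

# Calling through the namespace removes any dependence on load order
db <- dbscan::dbscan(iris[, 1:4], eps = 0.45, minPts = 5)

# Scatterplot matrix coloured by cluster; cluster 0 (noise) is drawn in colour 1
pairs(iris[, 1:4], col = db$cluster + 1L, main = "DBSCAN")
```

If fpc were attached after dbscan, the first check would report fpc instead, which is exactly the situation the explicit dbscan:: prefix guards against.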

Using pcaCoda from the package "robCompositions" with ggplot2

I would like to plot the results of the robust PCA (pcaCoDa) from the robCompositions package using ggplot2.
Previously, this worked with ggbiplot (https://github.com/vqv/ggbiplot); however, I can no longer get it to work with my current R version (3.6.0).
Is there a way to do a biplot with the pcaCoda results with ggplot2 using CRAN packages?
Here is a working example without using ggplot:
library(robCompositions)
df <- arcticLake
a <- pcaCoDa(df)
biplot(a)
And another example without using the robust PCA, but using the autoplot function:
library(ggplot2)
autoplot(princomp(df))
However, I would like to use the robust PCA with ggplot/autoplot. When I try to plot it, I get the following error:
autoplot(a)
Error: Objects of type pcaCoDa not supported by autoplot.
I also tried the following and also get an error:
autoplot(a$princompOutputClr)
Error in scale.default(data, center = FALSE, scale = 1/scale) :
length of 'scale' must equal the number of columns of 'x'
Any advice? Thanks!
For reasons I don't know, pcaCoDa returns one fewer value for scale and center than other PCA methods such as prcomp or princomp. I think that's why autoplot refuses to plot this object.
Alternatively, if you want to apply a robust algorithm, you can use the pcaMethods package available on Bioconductor. Here is an example using the iris dataset, which you can find in the pcaMethods documentation (https://bioconductor.org/packages/release/bioc/html/pcaMethods.html):
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("pcaMethods")
library(pcaMethods)
library(ggplot2)
robust <- pca(iris[c(1, 2, 3, 4)], method = "robustPca", scale = "uv", center = TRUE)
iris <- merge(iris, scores(robust), by = 0)
ggplot(iris, aes(x = PC1, y = PC2, colour = Species)) +
  geom_point() +
  stat_ellipse()
Does this look like what you are trying to get?
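If you would rather stay with pcaCoDa, another option is to skip autoplot and pass the scores to ggplot2 yourself. This sketch assumes the fitted object exposes the robust scores as a matrix in a$scores; please check str(a) on your installation. It plots only the scores, not the loading arrows of a full biplot, and the names scores_df, PC1 and PC2 are made up for illustration.

```r
library(robCompositions)
library(ggplot2)

a <- pcaCoDa(arcticLake)

# Assumption: a$scores holds one row per observation, one column per component
scores_df <- data.frame(PC1 = a$scores[, 1], PC2 = a$scores[, 2])

p <- ggplot(scores_df, aes(x = PC1, y = PC2)) +
  geom_point() +
  labs(title = "pcaCoDa scores (robust PCA)")
print(p)
```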

Lines in ggplot order

Using the mgcv library,
I get the points to plot with:
fsb <- fs.boundary(r0=0.1, r=1.1, l=2173)
If I plot fsb with the standard graphics package and then add lines, I get:
x11()
plot(fsb)
lines(fsb$x,fsb$y)
I am now trying this with ggplot (this is a line from a larger piece of code):
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
ts=fsb$x
ps=fsb$y
geom_line(data=tpdf, aes(ts,ps), inherit.aes = FALSE)
I get a messy plot:
I think I'm getting the point order wrong in geom_line.
This can be solved by using geom_path:
ggplot(tpdf) +
  geom_point(aes(ts, ps)) +
  geom_path(aes(ts, ps))
You are using ggplot in a rather unusual way; I recommend revisiting how you build the plot.
data:
library(mgcv)
fsb <- fs.boundary(r0 = 0.1, r=2, l=13)
tpdf <- data.frame(ts=fsb$x,ps=fsb$y)
You'll have to specify the group parameter - for example, this
ggplot(tpdf) +
  geom_point(aes(ts, ps)) +
  geom_line(aes(ts, ps, group = gl(4, 40)))
gives me a plot similar to the one in base R.
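The difference between the two geoms comes down to ordering: geom_line connects observations in order of the x variable, whereas geom_path connects them in the order they appear in the data. A small sketch with points on a circle (the data frame circle is made up for illustration) makes this visible without mgcv:

```r
library(ggplot2)

# Points on a closed curve, stored in drawing order
theta <- seq(0, 2 * pi, length.out = 50)
circle <- data.frame(x = cos(theta), y = sin(theta))

# geom_line sorts by x before connecting, which produces a zigzag
p_line <- ggplot(circle, aes(x, y)) + geom_line()

# geom_path keeps the row order, which traces the circle
p_path <- ggplot(circle, aes(x, y)) + geom_path()

# layer_data() shows the data each layer actually draws
line_x <- layer_data(p_line)$x
path_x <- layer_data(p_path)$x
print(p_path)
```

This is why a boundary curve that doubles back on itself looks messy with geom_line unless the points are split into groups.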

contour plot of a custom function in R

I'm working with some custom functions and I need to draw contours for them based on multiple values for the parameters.
Here is an example function:
I need to draw such a contour plot:
Any idea?
Thanks.
First you construct a function, fourvar, that takes those four parameters as arguments. In this case you could have done it with three variables, one of which was lambda_2 over lambda_1. Alpha_1 is fixed at 2, so alpha_1/alpha_2 will vary over 0-10.
fourvar <- function(a1, a2, l1, l2) {
  a1 * integrate(function(x) {(1 - x)^(a1 - 1) * (1 - x^(l2 / l1))^a2}, 0, 1)$value
}
The trick is to realize that the integrate function returns a list and you only want the 'value' part of that list so it can be Vectorize()-ed.
Second you construct a matrix using that function:
mat <- outer(seq(.01, 10, length = 100),
             seq(.01, 10, length = 100),
             Vectorize(function(x, y) fourvar(a1 = 2, x / 2, l1 = 2, l2 = y / 2)))
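The reason Vectorize() is needed here is that outer() calls its function once with two long vectors, while integrate() handles only one set of parameters per call. A smaller sketch of the same pattern, using a hypothetical helper area_power whose integral is known in closed form:

```r
# Integral of x^p over [0, 1]; the exact answer is 1 / (p + 1)
area_power <- function(p) {
  integrate(function(x) x^p, 0, 1)$value
}

single <- area_power(2)

# Vectorize() wraps the scalar function so it is applied element by element
area_power_vec <- Vectorize(area_power)
several <- area_power_vec(c(1, 2, 3))

# outer() passes whole vectors, so the two-argument function is vectorized too
tab <- outer(1:2, 1:3, Vectorize(function(a, b) area_power(a + b)))

print(single)
print(several)
print(tab)
```

Extracting $value first matters because Vectorize() can only simplify the results into a numeric vector when each call returns a single number rather than the full list from integrate().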
Then the task of creating the plot with labels in those positions can only be done easily with lattice::contourplot. After doing a reasonable amount of searching it does appear that the solution to geom_contour labeling is still a work in progress in ggplot2. The only labeling strategy I found is in an external package. However, the 'directlabels' package's function directlabel does not seem to have sufficient control to spread the labels out correctly in this case. In other examples that I have seen, it does spread the labels around the plot area. I suppose I could look at the code, but since it depends on the 'proto'-package, it will probably be weirdly encapsulated so I haven't looked.
require(reshape2)
mmat <- melt(mat)
str(mmat) # to see the names in the melted matrix
g <- ggplot(mmat, aes(x=Var1, y=Var2, z=value) )
g <- g+stat_contour(aes(col = ..level..), breaks=seq(.1, .9, .1) )
g <- g + scale_colour_continuous(low = "#000000", high = "#000000") # make black
install.packages("directlabels", repos="http://r-forge.r-project.org", type="source")
require(directlabels)
direct.label(g)
Note that these are the index positions from the matrix rather than the ratios of parameters, but that should be pretty easy to fix.
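One possible way to put the actual grid values on the axes, instead of matrix indices, is to build the long data frame with expand.grid() rather than melt(). This is a sketch only: it uses a coarser 25-point grid to keep it quick, and the names grid_vals and grid_df are made up for illustration. The x and y columns are the sequence values passed to outer(), so rescale them if you want the parameter ratios themselves.

```r
library(ggplot2)

fourvar <- function(a1, a2, l1, l2) {
  a1 * integrate(function(x) (1 - x)^(a1 - 1) * (1 - x^(l2 / l1))^a2, 0, 1)$value
}

# Coarser grid than the original 100 x 100, purely to keep the sketch fast
grid_vals <- seq(.01, 10, length.out = 25)
mat <- outer(grid_vals, grid_vals,
             Vectorize(function(x, y) fourvar(a1 = 2, a2 = x / 2, l1 = 2, l2 = y / 2)))

# expand.grid() varies its first column fastest, matching as.vector(mat)
grid_df <- expand.grid(x = grid_vals, y = grid_vals)
grid_df$value <- as.vector(mat)

p <- ggplot(grid_df, aes(x = x, y = y, z = value)) +
  geom_contour(breaks = seq(.1, .9, .1), colour = "black")
print(p)
```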
This, on the other hand, is how easily one can construct it in lattice (and I think it looks "cleaner"):
require(lattice)
contourplot(mat, at=seq(.1,.9,.1))
Since I think the question is still relevant: there have been some developments in contour plot labeling in the metR package. Adding the following to the previous example gives nice contour labels with ggplot2 as well:
require(metR)
g + geom_text_contour(rotate = TRUE, nudge_x = 3, nudge_y = 5)

Partially superposing data in lattice's xyplot()

Please run this code:
install.packages('lattice')
install.packages('zoo')
require(lattice)
require(zoo)
X <- matrix(runif(25 * 8), ncol = 8)
(Its purpose is just to load packages and to create a matrix with 8 columns).
Using zoo it is very easy to create such a plot:
plot.zoo(X, screen = c(1,1,2,2,3,3,4,4), col = c(1,2))
How can I do the same with lattice's xyplot() function?
This can be done via zoo:::xyplot.zoo: as reported in the zoo package documentation, zoo provides xyplot methods for time series objects.
For the question above, you can use:
xyplot(as.zoo(X), screen = c(1,1,2,2,3,3,4,4), col = c(1,2))
to produce a trellis object, as in lattice, selecting the desired layout with the screen argument.
