I am very new to R and am trying to plot a PCA figure of my data using ggbiplot, so please bear with me if my question does not make sense. I was following the tutorial I found here, except that I was using my own data set.
Everything was fine until I used the code below to plot a figure:
g <- ggbiplot(ir.pca, obs.scale = 1, var.scale = 1,
groups = ir.ppm, ellipse = TRUE,
circle = TRUE)
Then I encountered this error:
Error in names(ell)[1:2] <- c("xvar", "yvar") :
'names' attribute [2] must be the same length as the vector [0]
After that, I edited my code to use the default setting for groups, which should be NULL as I recall.
g <- ggbiplot(ir.pca, obs.scale = 1, var.scale = 1,
ellipse = TRUE,
circle = TRUE)
With the edited code I was able to plot the PCA figure, but it does not categorize the observations into groups as I wanted. I still do not know what the error means (Error in names(ell)[1:2] <- c("xvar", "yvar") : 'names' attribute [2] must be the same length as the vector [0]), but I suspect it has something to do with my factor ir.ppm.
Here is all the code I ran before I encountered the error.
ppm3 = read.csv("normalize_GasPhase_heatmap_no_ID_transpose.csv", header = TRUE, row.names = 1)
ppm3_1 <- ppm3[,1:30]
ir.ppm <- ppm3[,31]
ir.pca <- prcomp(ppm3_1, center = TRUE, scale. = TRUE)
library(ggbiplot)
g <- ggbiplot(ir.pca, obs.scale = 1, var.scale = 1, groups = ir.ppm, ellipse = TRUE, circle = TRUE)
In total, I have 6 observations and 31 variables in my raw data ppm3.
I have been browsing questions about plotting PCA figures with ggbiplot on Stack Overflow, but it seems that not many people have run into the same problem. I would really appreciate it if anyone could offer some help. Thank you.
You only have one observation for each level of your factor ir.ppm. You need more observations per level in order to display ellipses.
One workaround is to remove the ellipse option, like this:
g <- ggbiplot(ir.pca, obs.scale = 1, var.scale = 1,
groups = ir.ppm,
circle = TRUE)
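Before requesting ellipses, it can help to check how many observations each group has. The sketch below uses a made-up stand-in for ir.ppm (six observations, one per group, as in the question). The threshold of three points per group is my assumption about what is needed to fit an ellipse, so treat the flag as a rough guard rather than ggbiplot's documented rule.

```r
# Hypothetical stand-in for the grouping column ir.ppm from the question:
# six observations, each in its own group.
ir.ppm <- factor(c("s1", "s2", "s3", "s4", "s5", "s6"))

# Count the observations in each group.
group_sizes <- table(ir.ppm)
print(group_sizes)

# Assumption: a group needs at least three points before an ellipse
# can be fitted to it.
use_ellipse <- any(group_sizes >= 3)
print(use_ellipse)

# The flag could then replace the hard-coded TRUE, for example:
# ggbiplot(ir.pca, obs.scale = 1, var.scale = 1,
#          groups = ir.ppm, ellipse = use_ellipse, circle = TRUE)
```

With one observation per group the flag comes out FALSE, which matches the workaround of dropping the ellipse option.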
I am trying to create an R function that runs a GWR on variables that the user specifies from a Spatial Polygons Data Frame. The end result of running the function is two maps: one of the independent variable's values and one of the coefficient values from the GWR model. I'm having trouble with the second map.
I have managed to create the GWR model and a 'results' object for the coefficients that I would be visualizing.
gwr.model <- gwr(SpatialPolygonsDataFrame@data[, y] ~ SpatialPolygonsDataFrame@data[, x],
data = SpatialPolygonsDataFrame,
adapt = GWRbandwidth,
hatmatrix = TRUE,
se.fit = TRUE)
results <- as.data.frame(gwr.model$SDF)
gwr.map <- SpatialPolygonsDataFrame
gwr.map@data <- cbind(SpatialPolygonsDataFrame@data, as.matrix(results))
To create the visualization of the GWR coefficients, I have to point tm_fill() at a column from the 'results' object, but I do not know how to do it so that the function can be used with any Spatial Polygons Data Frame. So far, I have tried using the paste0() function, like so:
map2 <- tm_shape(gwr.map) + tm_fill(paste0("SpatialPolygonsDataFrame.", x), n = 5, style = "quantile", title = "Coefficient") +
tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
But I got an error saying that the fill argument is neither colors nor a valid variable name.
I'll be grateful for any tips that could help me resolve the issue.
Switching to the sf package, and leaving sp behind, will probably solve your problem here.
In the absence of a reproducible example, let me try to suggest the following here:
Convert your results with gwr.map.sf <- sf::st_as_sf(gwr.map). Then add the results of your GWR simply as a new column: gwr.map.sf$results <- results (my understanding is that the dimensions should fit).
Finally you should be able to plot like this:
map2 <- tm_shape(gwr.map.sf) + tm_fill("results", n = 5, style = "quantile", title = "Coefficient") +
tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
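Another way to make the coefficient column name predictable is to build the formula from the variable names themselves instead of indexing the data slot inside the formula. The sketch below only demonstrates the naming with lm() and the built-in mtcars data, both stand-ins chosen for illustration. That gwr() names the coefficient columns of its SDF after the model terms in the same way is my assumption, not something I have checked against your data.

```r
# Variable names as the function would receive them
# (mtcars columns are only a stand-in for your own data).
y <- "mpg"
x <- "wt"

# Build the formula mpg ~ wt from the two strings.
f <- reformulate(x, response = y)
print(f)

# lm() is used here purely to show how the coefficient gets its name.
fit <- lm(f, data = mtcars)
coef_names <- names(coef(fit))
print(coef_names)
```

If gwr() behaves the same way, the coefficient column would simply be called by the value of x. Note that binding it next to the original data would then give two columns with that name, so renaming the coefficient column first (for example with paste0(x, "_coef")) is probably safer.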
I'm using Chord Diagrams in R (via the Circlize/Circos packages) to visualize name associations in a dataset. I was able to generate the Chord Diagram (as shown below):
However, I don't know how to sort each sector (or each name) based on its respective width (e.g.: In the lower half of the Chord Diagram, I would like to arrange the sectors in descending order like this: N/A would be placed first, followed by Dean, Aaron, Malcolm, ... Jay). Is there a specific circos function that would allow me to do this?
Here's my code:
library(circlize)
setwd("C:/Users/Main/Desktop/")
data <- read.table('./r_test.txt',header = FALSE,sep = '\t')
chordDiagram(data,annotationTrack="grid",grid.col =
c("springgreen","coral","indianred","violet",
"greenyellow","cyan","purple","firebrick",
"gold","darkblue","red","magenta",
"orangered","brown","blueviolet","darkgoldenrod",
"aquamarine","khaki"),preAllocateTracks=list(track.height = link.sort =
TRUE,link.decreasing = TRUE)
circos.trackPlotRegion(track.index = 1, panel.fun = function(x, y) {
xlim = get.cell.meta.data("xlim")
xplot = get.cell.meta.data("xplot")
ylim = get.cell.meta.data("ylim")
sector.name = get.cell.meta.data("sector.index")
circos.text(mean(xlim), ylim[1], sector.name, facing = " niceFacing = TRUE,
adj = c(0, .75),cex=2)
},bg.border = NA)
The data file is a tab-delimited .txt file with names in the first 2 columns (there are 10 names in each column along with "Other" and "N/A" in the columns; the third column is a frequency count).
The sector order depends on the order of the data you pass in.
Reorder the rows with data[order,] and then plot as before, where order is a vector of the names in the order that you'd like.
Here is a very useful resource that I have used: https://jokergoo.github.io/circlize_book/book/the-chorddiagram-function.html
Good luck!
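As a rough sketch of how such an ordering could be computed, the code below sums the frequency counts per name and sorts the names from widest to narrowest sector. The data frame is a made-up stand-in for r_test.txt, and the commented call assumes that chordDiagram() accepts the sector order through its order argument.

```r
# Made-up stand-in for r_test.txt: two name columns and a frequency count.
data <- data.frame(
  V1 = c("Dean", "Aaron", "Dean", "Jay"),
  V2 = c("N/A", "N/A", "Other", "Other"),
  V3 = c(5, 3, 2, 4),
  stringsAsFactors = FALSE
)

# Width of a sector = sum of the counts of every link that touches it,
# whichever of the two name columns the name appears in.
totals <- tapply(c(data$V3, data$V3), c(data$V1, data$V2), sum)

# Sector names from widest to narrowest.
sector_order <- names(totals)[order(totals, decreasing = TRUE)]
print(sector_order)

# Assumption: circlize is installed and the order is passed like this:
# circlize::chordDiagram(data, order = sector_order, annotationTrack = "grid")
```

This sorts all sectors together. To sort only the lower half, the same idea can be applied to the names of one column and the result appended to the names of the other column.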
I am trying to draw PCA results with ggbiplot. How can I draw supplementary variables?
I found this discussion for MCA results, but I would like to have the arrows as well...
data(wine)
wine.pca <- PCA(wine, scale. = TRUE, quanti.sup = c(4,5))
plot(wine.pca)
ggbiplot(wine.pca)
Besides, this code gives me an error:
1: In sweep(pcobj$ind$coord, 2, 1/(d * nobs.factor), FUN = "*") :
STATS is longer than the extent of 'dim(x)[MARGIN]'
2: In sweep(v, 2, d^var.scale, FUN = "*") :
STATS is longer than the extent of 'dim(x)[MARGIN]'
I tried your code and could not reproduce your error, but I ran into other problems. I googled PCA() and found that the package used for the PCA is FactoMineR. After looking at its documentation, I changed scale. to scale.unit and quanti.sup to quali.sup, pointing quali.sup at the columns that hold the categorical variables.
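library(FactoMineR)
library(ggbiplot)
# Both packages ship a dataset called wine; load FactoMineR's explicitly.
data(wine, package = "FactoMineR")
# Columns 1 and 2 are categorical, so treat them as supplementary.
wine.pca <- PCA(wine, scale.unit = TRUE, quali.sup = c(1,2))
plot(wine.pca)
ggbiplot(wine.pca)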
That should give the correct output.
I have created a boxplot with the code
boxplot(X,horizontal = TRUE, axes = FALSE, staplewex = 1, boxwex = 0.05)
text(x = boxplot.stats(X[,1])$stats, labels = boxplot.stats(X[,1])$stats, y = 1.04)
However, the values I get from the text/boxplot functions are different from those of the quantile(X[,1],0.25) or summarize(X) functions. I think boxplot may use other definitions, but I was confused by the boxplot documentation, which is not very readable. Maybe someone can explain the differences to me!
Thank you for your help!
Simon
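A small sketch of where the difference can come from, using made-up values rather than the X from the question: boxplot.stats() takes its box edges from fivenum(), which computes Tukey's hinges, while quantile() defaults to type = 7 interpolation. The two agree for some sample sizes and differ for others.

```r
# Made-up sample with an even number of values.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

# What boxplot() draws: lower whisker, lower hinge, median,
# upper hinge, upper whisker.
box_stats <- boxplot.stats(x)$stats
print(box_stats)

# fivenum() uses the same hinge definition (no outliers in this sample,
# so the whiskers are simply the minimum and maximum).
print(fivenum(x))

# quantile() defaults to type = 7, which interpolates differently.
q1 <- unname(quantile(x, 0.25))
print(q1)
```

Here the lower hinge is 2.5 while the default first quartile is 2.75, so labels taken from boxplot.stats() will not always match quantile().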
Hi everyone, I have a simple question that I haven't been able to find an answer to in any tutorial. I've done a simple principal component analysis on a set of data and then plotted my data with biplot.
CP <- prcomp(dat, scale. = T)
summary(CP)
biplot(CP)
With this I get a scatter plot of my data in terms of the first and second components. I want to separate my data by color, telling R to paint my first 20 observations red and the next 20 blue. I don't know how to tell R to color those two sets of data.
Any help will be much appreciated. Thanks!
(I'm very new to R)
Disclaimer: This is not a direct answer, but it can be tweaked to obtain the desired output.
library(ggbiplot)
data(wine)
wine.pca <- prcomp(wine, scale. = TRUE)
print(ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups = wine.class, ellipse = TRUE, circle = TRUE))
Using plot() will give you more flexibility. You can use it alone or with text() for text labels, as below (thanks @flodel for the useful comments):
col = rep(c("red","blue"),each=20)
plot(CP$x[,1], CP$x[,2], pch="", main = "Your Plot Title", xlab = "PC 1", ylab = "PC 2")
text(CP$x[,1], CP$x[,2], labels=rownames(CP$x), col = col)
However if you want to use biplot() try this code:
biplot(CP$x[1:20,], CP$x[21:40,], col=c("red","blue"))
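If the two sets are not always in neat blocks of 20, a grouping factor can drive the colors instead. This is a minimal sketch on simulated data; dat, group and palette_by_group are made up for illustration and stand in for your own objects.

```r
set.seed(1)

# Made-up data: 40 observations of 4 variables.
dat <- as.data.frame(matrix(rnorm(160), nrow = 40))
group <- factor(rep(c("first 20", "next 20"), each = 20))

CP <- prcomp(dat, scale. = TRUE)

# Look up one color per observation from its group label.
palette_by_group <- c("first 20" = "red", "next 20" = "blue")
point_col <- unname(palette_by_group[as.character(group)])

# Scores on the first two components, colored by group.
plot(CP$x[, 1], CP$x[, 2], col = point_col, pch = 19,
     xlab = "PC 1", ylab = "PC 2")
legend("topright", legend = names(palette_by_group),
       col = palette_by_group, pch = 19)
```

Because the colors are looked up from the factor, the plot stays correct even if the rows are shuffled, as long as group is shuffled with them.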