R: Parallel Coordinates Plot without GGally

I am using the R programming language. I am using a computer that does not have a USB port or an internet connection - I only have R with a few preloaded libraries (e.g. ggplot2, reshape2, dplyr, base R).
Is it possible to make "parallel coordinate" plots (e.g. below) using only the "ggplot2" library and not "ggally"?
#load libraries (I do not have GGally)
library(GGally)
#load data (I have MASS)
data(crabs, package = "MASS")
#make 2 different parallel coordinate plots
ggparcoord(crabs)
ggparcoord(crabs, columns = 4:8, groupColumn = "sex")
Thanks
Source: https://homepage.divms.uiowa.edu/~luke/classes/STAT4580-2020/parcor.html

In fact, you do not even need ggplot! This is just a plot of standardised values (minus mean divided by SD), so you can implement this logic with any plotting function capable of doing so. The cleanest and easiest way to do it is in steps in base R:
# Standardising the variables of interest
data(crabs, package = "MASS")
crabs[, 4:8] <- apply(crabs[, 4:8], 2, scale)
# This colour solution works in great generality, although RColorBrewer has better distinct schemes
mycolours <- rainbow(length(unique(crabs$sex)), end = 0.6)
# png("gally.png", 500, 400, type = "cairo", pointsize = 14)
par(mar = c(4, 4, 0.5, 0.75))
plot(NULL, NULL, xlim = c(1, 5), ylim = range(crabs[, 4:8]) + c(-0.2, 0.2),
bty = "n", xaxt = "n", xlab = "Variable", ylab = "Standardised value")
axis(1, 1:5, labels = colnames(crabs)[4:8])
abline(v = 1:5, col = "#00000033", lwd = 2)
abline(h = seq(-2.5, 2.5, 0.5), col = "#00000022", lty = 2)
for (i in 1:nrow(crabs)) lines(as.numeric(crabs[i, 4:8]), col = mycolours[as.numeric(crabs$sex[i])])
legend("topright", c("Female", "Male"), lwd = 2, col = mycolours, bty = "n")
# dev.off()
You can apply this logic (x axis with integer values, y axis with standardised variable lines) in any package that can conveniently draw multiple lines (as in time series), but this solution has no extra dependencies and will not become unavailable due to an orphaned package with 3 functions getting purged from CRAN.
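Since the question asked specifically about ggplot2, the same logic can be sketched there as well. This is only a minimal sketch, not a replacement for ggparcoord: the helper columns id, variable and value are names made up for illustration, the reshape is done in base R so that no extra package is needed, and the plot is only drawn if ggplot2 happens to be installed.

```r
# Sketch: parallel coordinates without GGally (assumes MASS is available)
data(crabs, package = "MASS")
vars <- colnames(crabs)[4:8]                 # FL, RW, CL, CW, BD

# Standardise each variable (subtract mean, divide by SD)
z <- as.data.frame(scale(crabs[, vars]))
z$id  <- seq_len(nrow(z))                    # hypothetical id: one line per crab
z$sex <- crabs$sex

# Wide-to-long reshape in base R: one row per (crab, variable) pair
long <- data.frame(
  id       = rep(z$id, times = length(vars)),
  sex      = rep(z$sex, times = length(vars)),
  variable = factor(rep(vars, each = nrow(z)), levels = vars),
  value    = unlist(z[vars], use.names = FALSE)
)

# Draw one line per crab across the variables, coloured by sex
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = variable, y = value, group = id, colour = sex)) +
    geom_line(alpha = 0.5) +
    labs(x = "Variable", y = "Standardised value")
  print(p)
}
```

The group aesthetic is what makes geom_line connect the five values belonging to the same crab rather than all points of the same colour.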

The closest thing I found to this without the "GGally" was the built in function using the "MASS" library:
#source: https://stat.ethz.ch/R-manual/R-devel/library/MASS/html/parcoord.html
library(MASS)
parcoord(state.x77[, c(7, 4, 6, 2, 5, 3)])
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
parcoord(log(ir)[, c(3, 4, 2, 1)], col = 1 + (0:149)%/%50)
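Applied to the crabs data from the question, a sketch could look like the following. The colours and the sex_cols / line_cols names are arbitrary choices for illustration. Note that parcoord rescales each variable to its own minimum-maximum range rather than standardising by mean and SD, so the picture will not match the ggparcoord default exactly.

```r
library(MASS)
data(crabs, package = "MASS")

# Hypothetical colour choice: one colour per level of sex
sex_cols  <- c(F = "tomato", M = "steelblue")
line_cols <- unname(sex_cols[as.character(crabs$sex)])

# One line per crab over the five measurement columns
parcoord(crabs[, 4:8], col = line_cols, var.label = TRUE)
legend("topright", legend = names(sex_cols), col = sex_cols, lwd = 2, bty = "n")
```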

Related

How to replicate a figure describing standard error of the mean in R?

The first figure in link here shows a very nice example of how to visualise standard error and I would like to replicate that in R.
I'm getting there with the following
set.seed(1)
pop<-rnorm(1000,175,10)
mean(pop)
hist(pop)
#-------------------------------------------
# Plotting Standard Error for small Samples
#-------------------------------------------
smallSample <- replicate(10,sample(pop,3,replace=TRUE)) ; smallSample
smallMeans<-colMeans(smallSample)
par(mfrow=c(1,2))
x<-c(1:10)
plot(x,smallMeans,ylab="",xlab = "",pch=16,ylim = c(150,200))
abline(h=mean(pop))
#-------------------------------------------
# Plotting Standard Error for Large Samples
#-------------------------------------------
largeSample <- replicate(10,sample(pop,20,replace=TRUE))
largeMeans<-colMeans(largeSample)
x<-c(1:10)
plot(x,largeMeans,ylab="",xlab = "",pch=16,ylim = c(150,200))
abline(h=mean(pop))
But I'm not sure how to plot the raw data as they have with the X symbols. Thanks.
Using base plotting, you need to use the arrows function.
In base R there is no function (AFAIK) that computes the standard error of the mean, so try this
sem <- function(x){
sd(x) / sqrt(length(x))
}
Plot (using pch = 4 for the x symbols)
plot(x, largeMeans, ylab = "", xlab = "", pch = 4, ylim = c(150,200))
abline(h = mean(pop))
arrows(x0 = 1:10, x1 = 1:10, y0 = largeMeans - sem(largeSample) * 5, y1 = largeMeans + sem(largeSample) * 5, code = 0)
Note: sem(largeSample) pools all 200 values into a single standard error, which is quite small here, so I multiplied it by 5 to make the bars more obvious. For one SE per sample, use apply(largeSample, 2, sem) instead.
Edit
Ahh, to plot all the points, then perhaps ?matplot, and ?matpoints would be helpful? Something like:
matplot(t(largeSample), ylab = "", xlab = "", pch = 4, cex = 0.6, col = 1)
abline(h = mean(pop))
points(largeMeans, pch = 19, col = 2)
Is this more the effect you're after?
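Putting the pieces together, here is a self-contained sketch that shows the raw values as crosses, the sample means as filled dots and a bar of plus/minus one SE per sample. It assumes the same simulated population as in the question; the names samp, means and ses are made up for this example, and the SE is computed per column rather than pooled.

```r
set.seed(1)
pop <- rnorm(1000, 175, 10)

# 10 samples of size 20, one sample per column (20 x 10 matrix)
samp  <- replicate(10, sample(pop, 20, replace = TRUE))
means <- colMeans(samp)

# Standard error of the mean, computed separately for each sample
ses <- apply(samp, 2, function(v) sd(v) / sqrt(length(v)))

# Raw data as crosses: after transposing, row i holds sample i
matplot(t(samp), pch = 4, cex = 0.6, col = "grey40",
        xlab = "Sample", ylab = "Value")
abline(h = mean(pop), lty = 2)                     # population mean

# Mean +/- 1 SE for each sample
segments(x0 = 1:10, y0 = means - ses, x1 = 1:10, y1 = means + ses,
         col = 2, lwd = 2)
points(1:10, means, pch = 19, col = 2)
```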

Can I make new 'pch' figure in 'plot' function? (R) [duplicate]

Is there a way to make custom points in R? I am familiar with the pch argument, which offers many choices, but what if I need to plot, for example, tree silhouettes?
For example, if I draw a point and save it as an .eps (or similar) file, can I use it in R? A raster solution is not good for complicated objects (e.g. trees).
You can do this with the grImport package. I drew a spiral in Inkscape and saved it as drawing.ps. Following the steps outlined in the grImport vignette, we trace the file and read it as a sort of polygon.
setwd('~/R/')
library(grImport)
library(lattice)
PostScriptTrace("drawing.ps") # creates .xml in the working directory
spiral <- readPicture("drawing.ps.xml")
The vignette uses lattice to plot the symbols. You can also use base graphics, although the point positions then have to be converted from user (plot) coordinates to normalised device coordinates.
# generate random data
x = runif(n = 10, min = 1, max = 10)
y = runif(n = 10, min = 1, max = 10)
# lattice (as in the vignette)
x11()
xyplot(y~x,
xlab = "x", ylab = "y",
panel = function(x, y) {
grid.symbols(spiral, x, y, units = "native", size = unit(10, "mm"))
})
# base graphics
x11()
plot(x, y, pty = 's', type = 'n', xlim = c(0, 10), ylim = c(0, 10))
xx = grconvertX(x = x, from = 'user', to = 'ndc')
yy = grconvertY(y = y, from = 'user', to = 'ndc')
grid.symbols(spiral, x = xx, y = yy, size = 0.05)

Mixed plot with histogram and superimposed line plot in same figure

I know there are strong opinions about mixing plot types in the same figures, especially if there are two y axes involved. However, this is a situation in which I have no alternative - I need to create a figure using R that follows a standard format - a histogram on one axis (case counts), and a superimposed line graph showing an unrelated rate on an independent axis.
The best I have been able to do is stacked ggplot2 facets, but this is not as easy to interpret for the purposes of this analysis as the combined figure. The people reviewing this output will need it in the format they are used to.
I'm attaching an example below.
Any ideas?
For etiquette purposes, sample data below:
y1<-sample(0:1000,20,rep=TRUE)
y2<-sample(0:100,20,rep=TRUE)
x<-1981:2000
I feel your pain - I have had to recreate plots before, and even did it in SAS once.
If it's a one-off, I'd be tempted to go old-school. Something like this:
# Generate some data
someData <- data.frame(Year = 1987:2009,
mCases = rpois(23, 3),
pVac = sample(55:80, 23, TRUE))
par(mar = c(5, 5, 5, 5))
with(someData, {
# Generate the barplot
BP <- barplot(mCases, ylim = c(0, 18), names = Year,
yaxt = "n", xlab = "", ylab = "Measles cases in Thousands")
axis(side = 2, at = 2*1:9, las = 1)
box()
# Add the % Vaccinated
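par(new = TRUE)
# Reuse the barplot's x range so the line sits exactly over the bar midpoints
plot(BP, pVac, type = "l", xlim = par("usr")[1:2], xaxs = "i", ylim = c(0, 100), axes = FALSE, ylab = "", xlab = "")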
axis(side = 4, las = 1)
nudge <- ifelse(pVac > median(pVac), 2, -2)
text(BP, pVac + nudge, pVac)
mtext(side = 4, "% Vaccinated", line = 3)
par(new = FALSE)
})
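The same overlay can be sketched with the sample data from the question (x, y1, y2). One detail worth checking when overlaying with par(new = TRUE) is that the second plot uses the same x range as the barplot, otherwise the line drifts away from the bar midpoints. The axis labels, the seed and the names mid and usr below are assumptions made for this example.

```r
set.seed(42)                                  # arbitrary seed for reproducibility
x  <- 1981:2000
y1 <- sample(0:1000, 20, replace = TRUE)      # case counts (bars)
y2 <- sample(0:100, 20, replace = TRUE)       # unrelated rate (line)

op <- par(mar = c(5, 5, 2, 5))                # leave room for the right-hand axis

# barplot() returns the x positions of the bar midpoints
mid <- barplot(y1, names.arg = x, las = 2, ylim = c(0, 1000),
               ylab = "Case counts")
usr <- par("usr")                             # coordinate range used by the barplot

# Overlay the line on an independent y axis, reusing the barplot's x range
par(new = TRUE)
plot(mid, y2, type = "o", pch = 16, col = "red", axes = FALSE,
     xlab = "", ylab = "", xlim = usr[1:2], xaxs = "i", ylim = c(0, 100))
axis(side = 4, las = 1)
mtext("Rate", side = 4, line = 3)

par(op)                                       # restore the previous margins
```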
Try library(plotrix)
library(plotrix)
## Create sample data
y2<-sample(0:80,20,rep=TRUE)
x2<-sort(sample(1980:2010,20,rep=F))
y1<-sample(0:18,20,rep=TRUE)
x1<-sort(sample(1980:2010,20,rep=F))
x<-1980:2010
twoord.plot(x1,y1,x2,y2,
lylim=c(0,18),rylim=c(0,100),type=c("bar","l"),
ylab="Measles Cases in thousands",rylab="% Vaccinated",
lytickpos=seq(0,18,by=2),rytickpos=seq(0,100,by=10),ylab.at=9,rylab.at=50,
lcol=3,rcol=4)

VGAM percentile curve plot in R

I am running following code from help files of VGAM:
library(VGAM)
fit4 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), data = bmi.nz, trace = TRUE)
qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1, xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4)
How can I suppress the points so that the graph shows only the percentile curves? Is there an option in qtplot for this, so that I do not need to resort to the long ggplot route used on this page: Percentiles from VGAM? My earlier question raised other issues as well, so this point got ignored. Thanks for your help.
There is no qtplot help page so I went to the package help Index and saw qtplot.lmscreg listed. It had a 'pcol.arg' to control points color so I set it to "transparent":
qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1,
xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4,
pcol.arg="transparent")

Resources