Tweaking xlabel and ylabel in parallel plot parcoord of R

I made 13 parallel coordinate plots, where each plot has x lines, each of 5 points. There are four things that I would like to change:
I would like to remove the very long vertical x-axis ticks that protrude below the graph
I would like to change the x-axis labels of each plot to be "N", "1", "2", "3", "4"
I would like the y-axis to be labelled for each plot. It currently is not. The maximum y-value for each plot is max(input). So, I would like four y-axis labels: max(input), 3/4 max(input), 1/2 max(input), and 1/4 max(input) (all to the nearest integer to keep it neat).
I would like a main title over all the graphs (I'll just call it "Main Title" for now)
Here is my code currently:
par(mfrow = c(3,5))
par(mar=c(0.1,0.1,0.1,0.1))
# For each color (cluster) in the random network
for (i in 1:max(net$colors)){
color = mergedColors[which(net$colors == i)[1]]
input = countTable[which(net$colors==i),]
parcoord(input, lty = 1, var.label = FALSE, col = color)
}
where input is a data.frame of x observations of 5 variables (according to str(input)).
I tried to add things like x.label = c("N","1","2","3","4"), but that did not work.
Edit:
Here is some sample data, as per suggestions. Please let me know if I should include anything else:
net <- data.frame(colors=as.numeric(sample(1:15, 100, replace = T)))
mycols <- c("brown", "blue", "turquoise", "greenyellow", "red",
"pink", "green", "yellow", "magenta", "black","purple",
"tomato1","peachpuff","orchid","slategrey")
mergedColors = mycols[net$colors]
countTable <- data.frame(matrix(sample(1:100,100*5, replace=T),
ncol=5, dimnames=list(NULL, c("Norm","One","Two","Three","Four"))))

OK. I'm not sure I understand request 1, but here's what I came up with so far:
library(MASS)
opar<-par(no.readonly=T)
par(mfrow = c(3,5))
par(oma=c(1.2,2,2,0))
par(mar=c(2,2,0.1,0.1))
# For each color (cluster) in the random network
for (i in 1:max(net$colors)){
color = mergedColors[which(net$colors == i)]
input = countTable[which(net$colors==i),]
colnames(input)<-c("N",1:4)
parcoord(input, lty = 1, var.label = FALSE, col = color)
# round the labels to the nearest integer, as requested
axis(2,at=seq(0,1,length.out=5),labels=round(seq(min(input),max(input), length.out=5)))
}
mtext("Main Title",3, outer=T)
par(opar)
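One caveat about the y-axis: as far as I can tell, MASS::parcoord rescales each column to [0, 1] separately, so a single axis running from min(input) to max(input) is only approximate. Below is a minimal sketch of an alternative, assuming it is acceptable to swap parcoord for base matplot so that all five variables share one unscaled y-axis. The sample data and the names input and y_ticks are made up for illustration.

```r
# Hypothetical sample cluster: 8 observations of 5 variables
set.seed(1)
input <- data.frame(matrix(sample(1:100, 8 * 5, replace = TRUE), ncol = 5))

# Tick positions at 1/4, 1/2, 3/4 and all of the maximum, rounded to integers
y_ticks <- round(max(input) * c(0.25, 0.5, 0.75, 1))

# One line per observation, all variables on the same unscaled y-axis
matplot(seq_len(ncol(input)), t(input), type = "l", lty = 1, col = "blue",
        xlab = "", ylab = "", axes = FALSE, ylim = c(0, max(input)))

# x-axis labels without tick marks; y-axis at the four requested values
axis(1, at = seq_len(ncol(input)), labels = c("N", 1:4), tick = FALSE)
axis(2, at = y_ticks, las = 1)
box()
```

Inside the loop, the same three calls would replace parcoord() and the axis(2, ...) line, with col = color.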

Related

"col" argument in plot function not working when a factor value is used for x - axis

I am doing quarterly analysis, for which I want to plot a graph. To maintain continuity on x axis I have turned quarters into factors. But then when I am using plot function and trying to color it red, the col argument is not working.
An example:
quarterly_analysis <- data.frame(Quarter = as.factor(c(2020.1,2020.2,2020.3,2020.4,2021.1,2021.2,2021.3,2021.4)),
AvgDefault = as.numeric(c(0.24,0.27,0.17,0.35,0.32,0.42,0.38,0.40)))
plot(quarterly_analysis, col="red")
But the graph is still drawn in black.
Converting the x variable to a factor is not ideal for plotting unless you have multiple values for each factor level, because plot() then draws a box plot. For example, with 10 observations in each factor level, the col = "red" color shows up as the fill:
set.seed(123)
fact_example <- data.frame(factvar = as.factor(rep(LETTERS[1:3], 10)),
numvar = runif(30))
plot(fact_example$factvar, fact_example$numvar,
col = "red")
With only one observation for each factor, this is not ideal because it is just showing you the line that the box plot would make.
You could use border = "red":
plot(quarterly_analysis$Quarter,
quarterly_analysis$AvgDefault, border="red")
Or if you want more flexibility, you can plot it numerically and do a little tweaking for more control (i.e., can change the pch, or make it a line graph):
# make numeric x values to plot
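# year plus (quarter - 1)/4, so Q4 of one year does not coincide with Q1 of the next
qtr <- as.character(quarterly_analysis$Quarter)
x_vals <- as.numeric(substr(qtr, 1, 4)) + (as.numeric(substr(qtr, 6, 6)) - 1) / 4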
par(mfrow=c(1,3))
plot(x_vals,
quarterly_analysis$AvgDefault, col="red",
pch = 7, main = "Square Symbol", axes = FALSE)
axis(1, at = x_vals,
labels = quarterly_analysis$Quarter)
axis(2)
plot(x_vals,
quarterly_analysis$AvgDefault, col="red",
type = "l", main = "Line graph", axes = FALSE)
axis(1, at = x_vals,
labels = quarterly_analysis$Quarter)
axis(2)
plot(x_vals,
quarterly_analysis$AvgDefault, col="red",
type = "b", pch = 7, main = "Both", axes = FALSE)
axis(1, at = x_vals,
labels = quarterly_analysis$Quarter)
axis(2)
Data
set.seed(123)
quarterly_analysis <- data.frame(Quarter = as.factor(paste0(2019:2022,
rep(c(".1", ".2", ".3", ".4"),
each = 4))),
AvgDefault = runif(16))
quarterly_analysis <- quarterly_analysis[order(quarterly_analysis$Quarter),]
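If evenly spaced positions are enough, a simpler variant is to plot against the integer codes of the factor and label the axis with the factor levels. This is only a sketch under that assumption; x_pos is a name introduced here, and the data are the first four quarters from the question.

```r
# First four quarters from the question, with Quarter stored as a factor
quarterly_analysis <- data.frame(
  Quarter = factor(c("2020.1", "2020.2", "2020.3", "2020.4")),
  AvgDefault = c(0.24, 0.27, 0.17, 0.35)
)

# Integer codes of the factor give evenly spaced x positions
x_pos <- as.integer(quarterly_analysis$Quarter)

# Numeric x means plot() draws points/lines, so col = "red" is honored
plot(x_pos, quarterly_analysis$AvgDefault, col = "red", pch = 16, type = "b",
     xaxt = "n", xlab = "Quarter", ylab = "AvgDefault")

# Put the original quarter labels back on the x-axis
axis(1, at = x_pos, labels = as.character(quarterly_analysis$Quarter))
```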

How to add centroids to an RDA plot

I'd like to replace the arrows on this RDA plot with centroids, something like what's pictured here.
This is the code I currently have, which draws arrows (by default, I guess). I have shared our RDA code below, and I think this is the part we would need to change to get centroids instead of arrows:
# add arrows for effects of the explanatory variables
arrows(0,0, # start them from (0,0)
sc_bp[,1], sc_bp[,2], # end them at the score value
col = "red",
lwd = 1,
length = .1)
(I share the entire code chunk below, just in case.)
Please note that my data is on fish community (species) and substrate types at 36 sites. I'd like to replace the arrows for substrates with centroids in my RDA.
##Now, the RDA
Y.mat<-Belt_2021_fish_transformed_forPCA #fish community
str(Y.mat)
X.mat<-Reefcheck_2021_forPCA #substrate
str(X.mat)
###Community data has already been transformed with hellinger
##Now, try the RDA
fish_substrate_rda<-rda(Y.mat,X.mat)
```
##Plot
## extract % explained by the first 2 axes
perc_b <- round(100*(summary(fish_substrate_rda)$cont$importance[2, 1:2]), 2)
## extract scores - these are coordinates in the RDA space
sc_si <- scores(fish_substrate_rda, display="sites", choices=c(1,2), scaling=1)
sc_sp <- scores(fish_substrate_rda, display="species", choices=c(1,2), scaling=1)
sc_sp <- sc_sp[c(2,7,8),]
sc_bp <- scores(fish_substrate_rda, display="bp", choices=c(1,2), scaling=1)
sc_bp <- sc_bp[c(2,5,6),]
# Set up a blank plot with scaling, axes, and labels
plot(fish_substrate_rda,
scaling = 1, # set scaling type
type = "none", # this excludes the plotting of any points from the results
frame = TRUE,
# set axis limits
ylim = c(-1.5,0.7),
xlim = c(-1.5,1.2),
# label the plot (title, and axes)
main = "Triplot RDA - scaling 1",
xlab = paste0("RDA1 (", perc_b[1], "%)"),
ylab = paste0("RDA2 (", perc_b[2], "%)")
)
# add points for site scores
points(sc_si,
pch = 21, # set shape (here, circle with a fill colour)
col = "black", # outline colour
bg = "steelblue", # fill colour
cex = 0.7) # size
# add points for species scores
points(sc_sp,
pch = 22, # set shape (here, square with a fill colour)
col = "black",
bg = "#f2bd33",
cex = 0.7)
# add text labels for species abbreviations
text(sc_sp + c(-0.09, -0.09), # adjust text coordinates to avoid overlap with points
labels = rownames(sc_sp),
col = "grey40",
font = 2, # bold
cex = 0.6)
# add arrows for effects of the explanatory variables
arrows(0,0, # start them from (0,0)
sc_bp[,1], sc_bp[,2], # end them at the score value
col = "red",
lwd = 1,
length = .1)
# add text labels for arrows
text(x = sc_bp[,1] -0.01, # adjust text coordinate to avoid overlap with arrow tip
y = sc_bp[,2] - 0.09,
labels = rownames(sc_bp),
col = "red",
cex = .7,
font = 1)
```
I have not found anything online that might help me to accomplish this.

Delete Scale for Height in dendrogram visualisation

I can create a dendrogram using
library(dendextend) # provides labels_colors<- and assign_values_to_leaves_edgePar
x<-1:100
dim(x)<-c(10,10)
set.seed(1)
groups<-c("red","red", "red", "red", "blue", "blue", "blue","blue", "red", "blue")
x.clust<-as.dendrogram(hclust(dist(x)))
x.clust.dend <- x.clust
labels_colors(x.clust.dend) <- groups
x.clust.dend <- assign_values_to_leaves_edgePar(x.clust.dend, value = groups, edgePar = "col") # add the colors.
x.clust.dend <- assign_values_to_leaves_edgePar(x.clust.dend, value = 3, edgePar = "lwd") # make the lines thick
plot(x.clust.dend)
However, I want to remove the height scale on the left of the plot.
My guess is that it should be extremely trivial, but I am not able to find a way to do this. One solution which I don't want is using ggplot2 as below:
ggplot(as.ggdend(dend2))
This is because I will lose some of the formatting, like colored_bars().
The graphical parameter axes = FALSE can be used to remove the distance axis when plotting with plot.dendrogram:
plot(x.clust.dend, axes=F)
This will produce the dendrogram without the distance axis.
Alternatively, you can just set yaxt = "n":
plot(x.clust.dend, yaxt = "n")
You can add another axis with
axis(side = 2, labels = FALSE)

Legend graphing help in R

So I am trying to add some graphs to my notes. I have created a simple interest function that will plot several simple interest functions using different rates, and I would like to add a legend that would simply say...
"i =: 0%, x%, y%, z%" on one single line, where each 0,x,y,z is in the different color of the representative function using that interest rate.
I looked into the paste() function and attempted to make it one string but I am not sure exactly how to loop it into the int_seq and pull out each individual index and make it a different color then put it into a single string.
# indices to be used
t = 0:50
int_seq = seq(0.025,0.10,by=0.025) # interest rate sequence
colors = c("red","blue","green","orange") #colors of interest rate seq
index = 1:length(int_seq)
# AV Simple Interest (all good)
avSimple = function(i,t){
av = (1 + (i * t))
return(av)}
# Plot range for y-axis (all good)
yrange = c(avSimple(min(int_seq),min(t)) * 0.95,
avSimple(max(int_seq),max(t)) * 1.05)
# Plots Simple Interest with different interest rates (all good)
plot(t,avSimple(0,t), type="l", main = "AV Simple Interest", xlab = "Time",
ylab = "AV", ylim = yrange)
# loops through the int_seq and plots line based on interest rate
# and specified color (all good)
for (i in index)
lines(t,avSimple(int_seq[i],t), col = colors[i])
# Adds legend to plot for different interest rates
# !!This is where I need the help, not sure best way to approach!!
legend(0,avSimple(0.075,50), c("i =: 0%", for (i in index) int_seq[i]),
col = colors)
Not sure what kind of legend you want. Since you say in one line, you might want to add horiz = TRUE, but here are some other options:
You can pass full vectors to legend so there is no need for a loop in this case. Just create a vector of labels but also use a vector of colors corresponding to each label (which you have already done).
# indices to be used
t = 0:50
int_seq = seq(0.025,0.10,by=0.025) # interest rate sequence
colors = c("red","blue","green","orange") #colors of interest rate seq
index = 1:length(int_seq)
# AV Simple Interest (all good)
avSimple = function(i,t){
av = (1 + (i * t))
return(av)}
# Plot range for y-axis (all good)
yrange = c(avSimple(min(int_seq),min(t)) * 0.95,
avSimple(max(int_seq),max(t)) * 1.05)
plot(t, type="n", main = "AV Simple Interest", xlab = "Time",
ylab = "AV", ylim = yrange)
# for (i in index)
# lines(t,avSimple(int_seq[i],t), col = colors[i])
# Adds legend to plot for different interest rates
# !!This is where I need the help, not sure best way to approach!!
# Five rates are labelled (0% plus the four in int_seq), so prepend black for
# the 0% line to the four colors, and express the rates as percentages
rates <- 100 * c(0, int_seq)
leg_cols <- c("black", colors)
labs <- sprintf('i =: %s%%', rates)
labs2 <- paste0(rates, '%')
legend('topleft', legend = labs, col = leg_cols, lty = 1, title = 'normal')
l <- legend('top', legend = rep('i =:', length(labs)), lty = 1,
col = leg_cols, text.width = max(strwidth(labs)) + 1,
title = 'right-justified')
text(l$rect$left + l$rect$w, l$text$y, labs2, pos = 2)
legend('topright', legend = labs, text.col = leg_cols, title = 'colored')
legend('bottom', legend = labs, col = leg_cols, lty = 1, horiz = TRUE,
cex = .7, title = 'horizontal')
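For the specific layout asked about (a single line reading "i =:" followed by each rate in the color of its line), one possible sketch combines horiz = TRUE with text.col. It assumes the 0% line is drawn in black, as in the question's default plot() call. The names rates, line_cols and labs are introduced here for illustration.

```r
t <- 0:50
int_seq <- seq(0.025, 0.10, by = 0.025)
rates <- c(0, int_seq)

# One color per plotted line; black for the 0% line is an assumption
line_cols <- c("black", "red", "blue", "green", "orange")

# Empty plot sized for all lines, then one line per rate
plot(t, 1 + 0 * t, type = "n", ylim = c(0.95, 6.3),
     main = "AV Simple Interest", xlab = "Time", ylab = "AV")
for (k in seq_along(rates)) lines(t, 1 + rates[k] * t, col = line_cols[k])

# Percent labels, each drawn in the color of its line, on a single row
labs <- paste0(100 * rates, "%")
legend("topleft", legend = labs, text.col = line_cols, horiz = TRUE,
       bty = "n", title = "i =:", title.col = "black", cex = 0.8)
```

The title sits above the row of rates rather than inline with it, so this is an approximation of the requested "i =: 0%, x%, y%, z%" string.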

Any way to make plot points in scatterplot more transparent in R?

I have a 3-column matrix; points are plotted from the column 1 and column 2 values, but colored based on column 3 (6 different groups). I can successfully plot all points; however, the last group plotted (group 6), which was assigned the color purple, masks the points of the other groups. Is there a way to make the plot points more transparent?
s <- read.table("/.../parse-output.txt", sep="\t")
dim(s)
[1] 67124 3
x <- s[,1]
y <- s[,2]
z <- s[,3]
cols <- cut(z, 6, labels = c("pink", "red", "yellow", "blue", "green", "purple"))
plot(x, y, main= "Fragment recruitment plot - FR-HIT", ylab = "Percent identity", xlab = "Base pair position", col = as.character(cols), pch=16)
You can use the function alpha in the package scales, to which you can directly pass your vector of colors (even if they are factors as in your example):
library(scales)
cols <- cut(z, 6, labels = c("pink", "red", "yellow", "blue", "green", "purple"))
plot(x, y, main= "Fragment recruitment plot - FR-HIT",
ylab = "Percent identity", xlab = "Base pair position",
col = alpha(cols, 0.4), pch=16)
# For an alpha of 0.4, i. e. an opacity of 40%.
When creating the colors, you may use rgb and set its alpha argument:
plot(1:10, col = rgb(red = 1, green = 0, blue = 0, alpha = 0.5),
pch = 16, cex = 4)
points((1:10) + 0.4, col = rgb(red = 0, green = 0, blue = 1, alpha = 0.5),
pch = 16, cex = 4)
Please see ?rgb for details.
Transparency can be coded in the color argument as well. It is just two more hex digits coding a transparency between 0 (fully transparent) and 255 (fully visible). I once wrote this function to add transparency to a color vector; maybe it is useful here?
addTrans <- function(color, trans)
{
# This function adds transparency to a color.
# Define transparency with an integer between 0 and 255,
# 0 being fully transparent and 255 being fully visible.
# Works with color and trans as vectors of equal length,
# or with one of the two of length 1.
if (length(color) != length(trans) && !any(c(length(color), length(trans)) == 1)) stop("Vector lengths not correct")
if (length(color) == 1 && length(trans) > 1) color <- rep(color, length(trans))
if (length(trans) == 1 && length(color) > 1) trans <- rep(trans, length(color))
num2hex <- function(x)
{
hex <- unlist(strsplit("0123456789ABCDEF", split = ""))
return(paste(hex[(x - x %% 16) / 16 + 1], hex[x %% 16 + 1], sep = ""))
}
# named rgba so the base rgb() function is not masked
rgba <- rbind(col2rgb(color), trans)
res <- paste("#", apply(apply(rgba, 2, num2hex), 2, paste, collapse = ""), sep = "")
return(res)
}
Some examples:
cols <- sample(c("red","green","pink"),100,TRUE)
# Fully visible:
plot(rnorm(100),rnorm(100),col=cols,pch=16,cex=4)
# Somewhat transparent:
plot(rnorm(100),rnorm(100),col=addTrans(cols,200),pch=16,cex=4)
# Very transparent:
plot(rnorm(100),rnorm(100),col=addTrans(cols,100),pch=16,cex=4)
If you are using the hex codes, you can add two more digits at the end of the code to represent the alpha channel:
E.g. half-transparency red:
plot(1:100, main="Example of Plot With Transparency")
lines(1:100 + sin(1:100*2*pi/(20)), col='#FF000088', lwd=4)
mtext("use `col='#FF000088'` for the lines() function")
If you decide to use ggplot2, you can set transparency of overlapping points using the alpha argument.
e.g.
library(ggplot2)
ggplot(diamonds, aes(carat, price)) + geom_point(alpha = 1/40)
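Another base R option, not mentioned in the answers above, is adjustcolor() from grDevices, which takes a vector of color names and an alpha.f multiplier. The sketch below applies it to colors built with cut() as in the question. The simulated x, y and z stand in for the three columns of the real file, which is not available here.

```r
# Simulated stand-in for the three columns of the question's data
set.seed(42)
x <- runif(500, 0, 1e6)
y <- runif(500, 80, 100)
z <- runif(500)

# Six color groups from the third column, as in the question
cols <- cut(z, 6, labels = c("pink", "red", "yellow", "blue", "green", "purple"))

# adjustcolor() needs character color names, so convert the factor first;
# alpha.f = 0.4 scales opacity to 40%
cols_alpha <- adjustcolor(as.character(cols), alpha.f = 0.4)

plot(x, y, main = "Fragment recruitment plot - FR-HIT",
     ylab = "Percent identity", xlab = "Base pair position",
     col = cols_alpha, pch = 16)
```

The result is a character vector of "#RRGGBBAA" hex codes, the same form as the hand-written '#FF000088' example.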

Resources