QQ plot: More than two data - r

I wanted to graph a QQ plot similar to this picture:
I managed to get a QQ plot using two samples, but I do not know how to add a third one to the plot.
Here is my result:
Here is the code I used:
qqplot(table$Bedouin, table$Tunisia, xlim = c(-0.25, 0.25), ylim = c(-0.25, 0.25))
In my table data frame I have other populations I would like to add. But I can't.
Thank you in advance.

I suppose you're looking for a scatterplot of sorted values since all variables are stored in the same data frame.
An example dataset:
set.seed(10)
dat <- data.frame(A = rnorm(20), B = rnorm(20), C = rnorm(20))
This is a way to create the plot with basic R functions:
# create a QQ-plot of B as a function of A
qqplot(dat$A, dat$B, xlim = range(dat), ylim = range(dat),
xlab = "A", ylab = "B/C")
# create a diagonal line
abline(a = 0, b = 1)
# add the points of C
points(sort(dat$A), sort(dat$C), col = "red")
# create a legend
legend("bottomright", legend = c("B", "C"), pch = 1, col = c("black", "red"))

You can add the line
par(new=TRUE)
Then use the qqplot() again to over-plot the first plot as follows:
set.seed(10)
dat <- data.frame(A = rnorm(20), B = rnorm(20), C = rnorm(20))
# create a QQ-plot of B as a function of A
qqplot(dat$A, dat$B,
xlim = range(dat), ylim = range(dat),
xlab = "Distribution A", ylab = "Other distributions")
# set overplotting
par(new=TRUE)
# create a QQ-plot of C as a function of A
qqplot(dat$A, dat$C,
xlim = range(dat), ylim = range(dat),
xlab = "Distribution A",
ylab = "Other distributions",
col = "red")
# create a diagonal line
abline(a = 0, b = 1)
# create a legend
legend("bottomright", legend = c("B", "C"), pch = 1, col = c("black", "red"))
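If there are more than two other populations, the same idea can be wrapped in a loop. The following is only a sketch under assumptions: the extra column D and the choice of A as the reference column are made up for illustration. It uses qqplot() with plot.it = FALSE to get the sorted quantile pairs without drawing, then adds them with points().

```r
set.seed(10)
# example data; column D is a hypothetical extra population
dat <- data.frame(A = rnorm(20), B = rnorm(20), C = rnorm(20), D = rnorm(20))

ref <- "A"                          # reference column on the x axis
others <- setdiff(names(dat), ref)  # every other column goes on the y axis
cols <- seq_along(others)           # one base palette colour per column

# empty plot with common limits
plot(NA, xlim = range(dat), ylim = range(dat), type = "n",
     xlab = ref, ylab = "Other columns")

# compute each QQ pairing without plotting, then add it as points
for (i in seq_along(others)) {
  qq <- qqplot(dat[[ref]], dat[[others[i]]], plot.it = FALSE)
  points(qq$x, qq$y, col = cols[i])
}

abline(a = 0, b = 1)
legend("bottomright", legend = others, pch = 1, col = cols)
```

Because qqplot() interpolates when the two samples differ in length, this should also work when the populations have different numbers of observations.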

Related

How can I show non-inferiority with a plot using R

I am comparing two treatments, A and B. The objective is to show that A is not inferior to B. The non-inferiority margin is delta = -2.
After comparing Treatment A - Treatment B I have these results
Mean difference and 95% CI = -0.7 [-2.1, 0.8]
I would like to plot this either with a package or manually. I have no idea how to do it.
Welch Two Sample t-test
data: mydata$outcome[mydata$traitement == "Bras S"] and mydata$outcome[mydata$traitement == "B"]
t = 0.88938, df = 258.81, p-value = 0.3746
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.133224 0.805804
sample estimates:
mean of x mean of y
8.390977 9.054688
I want to create this kind of plot:
You could extract the relevant values from the t.test results and then plot them in base R, using segments and points to draw the interval and the estimate, and abline to draw the relevant vertical lines. Since there were no reproducible data, I made some up, but the process is generally the same.
#sample data
set.seed(123)
tres <- t.test(runif(10), runif(10))
# get values to plot from t test results
ci <- tres$conf.int
ests <- tres$estimate[1] - tres$estimate[2]
# plot
plot(x = ci, ylim = c(0,2), xlim = c(-4, 4), type = "n", # blank plot
bty = "n", xlab = "Treatment A - Treatment B", ylab = "",
axes = FALSE)
points(x = ests, y = 1, pch = 20) # dot for point estimate
segments(x0 = ci[1], x1 = ci[2], y0 = 1) #CI line
abline(v = 0, lty = 2) # vertical line, dashed
abline(v = 2, lty = 1, col = "darkblue") # vertical line, solid, blue
axis(1, col = "darkblue") # add in x axis, blue
EDIT:
If you wanted to more accurately recreate your figure with the x axis in descending order and using your statement "Mean difference and 95% CI = -0.7 [-2.1, 0.8]", you can do the following manipulations to the above approach:
diff <- -0.7
ci <- c(-2.1, 0.8)
# plot
plot(1, xlim = c(-4, 4), type = "n",
bty = "n", xlab = "Treatment A - Treatment B", ylab = "",
axes = FALSE)
points(x = -diff, y = 1, pch = 20)
segments(x0 = -ci[2], x1 = -ci[1], y0 = 1)
abline(v = 0, lty = 2)
abline(v = 2, lty = 1, col = "darkblue")
axis(1, at = seq(-4,4,1), labels = seq(4, -4, -1), col = "darkblue")
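As a complement, here is a sketch that keeps the x axis in its natural order and draws the margin at the value stated in the question (delta = -2). It rests on an assumption that is not confirmed in the question: that larger outcomes are better, so non-inferiority of A would be concluded when the lower confidence limit of A - B lies above the margin. The variable names are made up for illustration.

```r
# values quoted in the question
est <- -0.7
ci <- c(-2.1, 0.8)
margin <- -2   # non-inferiority margin (delta)

# Assumption: larger outcomes are better, so A is non-inferior
# when the lower CI limit of A - B is above the margin
non_inferior <- ci[1] > margin
print(non_inferior)

plot(NA, xlim = c(-4, 4), ylim = c(0, 2), type = "n",
     bty = "n", axes = FALSE,
     xlab = "Treatment A - Treatment B", ylab = "")
segments(x0 = ci[1], x1 = ci[2], y0 = 1)       # CI line
points(x = est, y = 1, pch = 20)               # point estimate
abline(v = 0, lty = 2)                         # no difference
abline(v = margin, lty = 1, col = "darkblue")  # non-inferiority margin
axis(1, col = "darkblue")
```

With the numbers from the question the lower limit (-2.1) falls just below the margin (-2), so under this assumption the interval crosses the margin line.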

How can I change the colour of my points on my db-RDA triplot in R?

QUESTION: I am building a triplot for the results of my distance-based RDA in R, library(vegan). I can get a triplot to build, but can't figure out how to make the colours of my sites different based on their location. Code below.
#running the db-RDA
spe.rda.signif=capscale(species~canopy+gmpatch+site+year+Condition(pair), data=env, dist="bray")
#extract % explained by first 2 axes
perc <- round(100*(summary(spe.rda.signif)$cont$importance[2, 1:2]), 2)
#extract scores (coordinates in RDA space)
sc_si <- scores(spe.rda.signif, display="sites", choices=c(1,2), scaling=1)
sc_sp <- scores(spe.rda.signif, display="species", choices=c(1,2), scaling=1)
sc_bp <- scores(spe.rda.signif, display="bp", choices=c(1, 2), scaling=1)
#These are my location or site names that I want to use to define the colours of my points
site_names <-env$site
site_names
#set up blank plot with scaling, axes, and labels
plot(spe.rda.signif,
scaling = 1,
type = "none",
frame = FALSE,
xlim = c(-1,1),
ylim = c(-1,1),
main = "Triplot db-RDA - scaling 1",
xlab = paste0("db-RDA1 (", perc[1], "%)"),
ylab = paste0("db-RDA2 (", perc[2], "%)")
)
#add points for site scores - these are the ones that I want to be two different colours based on the labels in the original data, i.e., env$site or site_names defined above. I have copied the current state of the graph
points(sc_si,
pch = 21, # set shape (here, circle with a fill colour)
col = "black", # outline colour
bg = "steelblue", # fill colour
cex = 1.2) # size
Current graph
I am able to add species names and arrows for environmental predictors, but am just stuck on how to change the colour of the site points to reflect their location (I have two locations defined in my original data). I can get them labelled with text, but that is messy.
Any help appreciated!
I have tried separating shape or colour of point by site_name, but no luck.
If you only have a few groups (in your case, two), you could make the group a factor (within the plot call). In R, factors are represented as an integer "behind the scenes" - you can represent up to 8 colors in base R using a simple integer:
set.seed(123)
df <- data.frame(xvals = runif(100),
yvals = runif(100),
group = sample(c("A", "B"), 100, replace = TRUE))
plot(df[1:2], pch = 21, bg = as.factor(df$group),
bty = "n", xlim = c(-1, 2), ylim = c(-1, 2))
legend("topright", unique(df$group), pch = 21,
pt.bg = unique(as.factor(df$group)), bty = "n")
If you have more than 8 groups, or if you would like to define your own colors, you can simply create a vector of colors the length of your groups and still use the same factor method, though with a few slight tweaks:
# data with 10 groups
set.seed(123)
df <- data.frame(xvals = runif(100),
yvals = runif(100),
group = sample(LETTERS[1:10], 100, replace = TRUE))
# 10 group colors
ccols <- c("red", "orange", "blue", "steelblue", "maroon",
"purple", "green", "lightgreen", "salmon", "yellow")
plot(df[1:2], pch = 21, bg = ccols[as.factor(df$group)],
bty = "n", xlim = c(-1, 2), ylim = c(-1, 2))
legend("topright", unique(df$group), pch = 21,
pt.bg = ccols[unique(as.factor(df$group))], bty = "n")
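For pch, the only tweak is to wrap the factor in as.numeric. Note that pchh and ccols below hold two values each, so this assumes the two-group df from the first example (with more groups than values, the extra groups would get NA and not be drawn):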
pchh <- c(21, 22)
ccols <- c("slateblue", "maroon")
plot(df[1:2], pch = pchh[as.numeric(as.factor(df$group))], bg = ccols[as.factor(df$group)],
bty = "n", xlim = c(-1, 2), ylim = c(-1, 2))
legend("topright", unique(df$group),
pch = pchh[unique(as.numeric(as.factor(df$group)))],
pt.bg = ccols[unique(as.factor(df$group))], bty = "n")
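Applied to the triplot in the question, the same factor indexing would go into the bg argument of points(). The sketch below does not use vegan: sc_si is a simulated stand-in for the matrix returned by scores(), and the site labels "north" and "south" are hypothetical, so treat it as an illustration of the indexing only.

```r
set.seed(1)
# stand-in for scores(spe.rda.signif, display = "sites", ...)
sc_si <- cbind(CAP1 = rnorm(12), CAP2 = rnorm(12))
# stand-in for env$site (hypothetical labels)
site_names <- rep(c("north", "south"), each = 6)

site_f <- factor(site_names)              # levels are sorted alphabetically
site_cols <- c("steelblue", "tomato")     # one fill colour per level
fill <- site_cols[site_f]                 # one fill colour per site point

plot(sc_si, type = "n", frame = FALSE)
points(sc_si,
       pch = 21,        # circle with a fill colour
       col = "black",   # outline colour
       bg = fill,       # fill colour chosen by location
       cex = 1.2)
legend("topright", legend = levels(site_f), pch = 21,
       pt.bg = site_cols, bty = "n")
```

Using levels(site_f) for the legend keeps the labels and colours in the same order, which avoids a mismatch when the first rows of the data are not in level order.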

labeling nmds plot by column values in r

I ran metaMDS and want to plot the result, color-coding the points by a grouping column in my data frame. In my original data frame, df$yr are years and df$2 are sites. I want to color by the years.
caltmds <- metaMDS(df[,3:12], k=3)
plot(caltmds, type = 'n')
cols <- c("red2", "mediumblue")
points(caltmds, col = cols[df$yr])
I also tried from this post:
scl <- 3
colvec <- c("red2", "mediumblue")
plot(caltmds, type = "n", scaling = scl)
with(df, points(caltmds, display = "sites", col = colvec[yr], pch = 21, bg = colvec[yr]))
text(caltmds, display = "species", cex = 0.8, col = "darkcyan")
with(df, legend("topright", legend = levels(yr), bty = "n", col = colvec, pch = 21, pt.bg = colvec))
Nothing plots
#DATA
df1 = mtcars
mycolors = df1$cyl #Identify the grouping vector
library(vegan)
m = metaMDS(df1)
x = scores(m) #Extract co-ordinates
plot(x, col = as.numeric(as.factor(mycolors)))
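One possible reason the original attempt draws nothing, assuming df$yr holds numeric years (the question does not say): indexing a two-element colour vector with a value like 2015 returns NA, and points with an NA colour are not drawn. Converting to a factor first gives the codes 1 and 2. The years below are made up for illustration.

```r
# hypothetical year column, stored as numbers
yr <- c(2015, 2016, 2015, 2016)
cols <- c("red2", "mediumblue")

# numeric years index far beyond the end of cols
bad <- cols[yr]
print(bad)

# a factor is indexed by its integer codes (1, 2, ...)
good <- cols[as.factor(yr)]
print(good)
```

If yr is not a factor, levels(yr) in the legend call also returns NULL, so levels(as.factor(yr)) would be the safer choice there.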

Plot a curve of data frame

df1 <- read.csv("C:\\Users\\Unique\\Desktop\\Data Science\\R Scripts\\LimeBison_ch1.csv")
df2 <- df1[df1[,3] == 73.608125,]
plot(df2[,1],df2[,2], xlab = "Milliseconds", ylab = "Amplitude",
main = "Amplitude vs Time Graph",type = "p", pch =16, col = "red",
xlim = c(-200,1200), ylim = c(-1.5,1.5))
x <- tapply(df2$Amplitude, df2$Time, mean)
df3 <- data.frame(Time = names(x), Average_Amplitude = x)
How can I plot a curve of the data frame df3 over the scatter plot of df2?
I'm not sure about your data, but if you are using base plotting, then you can plot a line on top of a scatter plot by using the lines function
df <- data.frame(x = 1:40, y = c(1:20, 20:1))
plot(df$x, df$y, cex = 2)
lines(df$x, df$y)
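For the averaged data frame in the question, a sketch along the same lines is below. The data are simulated because the CSV is not available, and only the column names Time and Amplitude are taken from the question. One detail that may matter: names(x) from tapply is a character vector, so it is converted back to numbers before plotting.

```r
set.seed(42)
# simulated stand-in for df2: 5 amplitude readings per time point
df2 <- data.frame(Time = rep(seq(0, 1000, by = 100), each = 5))
df2$Amplitude <- sin(df2$Time / 200) + rnorm(nrow(df2), sd = 0.2)

# mean amplitude per time point
x <- tapply(df2$Amplitude, df2$Time, mean)
df3 <- data.frame(Time = as.numeric(names(x)),   # names are character
                  Average_Amplitude = as.vector(x))

# scatter plot of the raw values, then the mean curve on top
plot(df2$Time, df2$Amplitude, xlab = "Milliseconds", ylab = "Amplitude",
     main = "Amplitude vs Time Graph", type = "p", pch = 16, col = "red")
lines(df3$Time, df3$Average_Amplitude, lwd = 2)
```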

Dates on x-axis; big dataset

I would like to plot a time series. I created an example to show what the graph should look like:
set.seed(1)
r <- rnorm(20,0,1)
z <- c(1,1,1,1,1,-1,-1,-1,1,-1,1,1,1,-1,1,1,-1,-1,1,-1)
data <- as.data.frame(na.omit(cbind(z, r)))
series1 <- ts(cumsum(c(1,data[,2]*data[,1])))
series2 <- ts(cumsum(c(1,data[,2])))
d1y <- seq(as.Date("1991-01-01"),as.Date("2015-01-01"),length.out=21)
plot_strategy <- function(series1, series2, currency) {
  x11()
  matplot(cbind(series1, series2), xaxt = "n", xlab = "Time",
          ylab = "Value", col = 1:3, ann = TRUE, type = 'l',
          lty = 1)
  axis(1, at = seq(2, 20, 2), labels = format(d1y[seq(2, 20, 2)], "%Y"))
  legend(x = "topleft", legend = c("TR", "BA"),
         lty = 1, col = 1:3)
  dev.copy2pdf(file = currency, width = 11.69, height = 8.27)
}
plot_strategy(series1, series2, currency = "all.pdf")
The actual dataset contains 6334 values. I therefore change the code to this:
axis(1, at=seq(2,6334,365), labels=format(d1y[seq(2,6334,365)],"%Y"))
But now, there are no values on the x-axis. Any suggestions?
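A likely cause, judging only from the code shown: d1y still has 21 elements, so d1y[seq(2,6334,365)] indexes mostly past its end and the labels come out as NA. A minimal sketch of one fix is to build a date vector with one entry per observation. The series here is simulated, and spreading the dates evenly between the two end dates from the example is an assumption about the real data.

```r
set.seed(1)
n <- 6334                        # number of observations in the real data
series <- cumsum(rnorm(n))       # simulated stand-in for the real series

# one date per observation (assumed evenly spaced between the end dates)
dates <- seq(as.Date("1991-01-01"), as.Date("2015-01-01"), length.out = n)

plot(seq_len(n), series, type = "l", xaxt = "n",
     xlab = "Time", ylab = "Value")

# tick positions are indices into the series; labels come from the
# date vector at the same indices, so none of them are NA
at <- seq(2, n, by = 365)
axis(1, at = at, labels = format(dates[at], "%Y"))
```

If the real data come with their own date column, using that column in place of dates would be more accurate than an evenly spaced sequence.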
