VGAM percentile curve plot in R

I am running the following code from the VGAM help files:
library(VGAM)
fit4 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), data = bmi.nz, trace = TRUE)
qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1, xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4)
How can I stop the points from being drawn so that the graph shows only the percentile curves? Is there an option in qtplot to suppress the points, so that I do not need to resort to the long ggplot route used on this page: Percentiles from VGAM? My earlier question raised other issues as well, so this point got ignored. Thanks for your help.

There is no help page for qtplot itself, so I went to the package help index and found qtplot.lmscreg listed. It has a pcol.arg argument that controls the point colour, so I set it to "transparent":
qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1,
xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4,
pcol.arg="transparent")
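The reason this works is that "transparent" is a colour name R recognises with an alpha value of zero, so the points are still drawn but cannot be seen. A quick base R check of that idea (a sketch that does not use VGAM at all; the data are made up):

```r
# "transparent" is a valid R colour whose alpha channel is 0,
# so anything drawn with it is invisible.
rgba <- col2rgb("transparent", alpha = TRUE)
rgba["alpha", 1]   # alpha component of the colour

# Same idea in a plain base R plot: the points are drawn but not visible,
# while the line added afterwards is.
xx <- 1:10
plot(xx, xx^2, col = "transparent", xlab = "x", ylab = "y")
lines(xx, xx^2, col = 4, lwd = 2)
```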

Related

Funnel plot with effective sample size in R

I want to create a funnel plot with the effective sample size on the y-axis. I am using the funnel.default() function from the metafor package, and I tried the following code:
Soil_mineral_nitrogen$inv_n_tilda <- with(Soil_mineral_nitrogen, (control_mean + treatment_mean) / (control_mean*treatment_mean))
par(mfrow = c(1, 2))
funnel(Soil_mineral_nitrogen$lnrr, Soil_mineral_nitrogen$inv_n_tilda, yaxis="ninv",
#xlim = c(-3, 3),
ylab = "Effective sample size (ñ)",
xlab = "Effect size (RR)", col =Soil_mineral_nitrogen$unique_id, atransf = exp)
But this code returns the following error:
Error in funnel.default(Soil_mineral_nitrogen$lnrr, Soil_mineral_nitrogen$inv_n_tilda, :
No sample size information available.
Does anyone know how to deal with this error?
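It worked for me with the following code, which passes the column v as the second argument (the sampling variances, vi) and supplies inv_n_tilda through ni, with yaxis = "ni":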
par(mfrow = c(1, 2))
funnel(Soil_mineral_nitrogen$lnrr, Soil_mineral_nitrogen$v, ni = Soil_mineral_nitrogen$inv_n_tilda,
yaxis="ni",
#xlim = c(-3, 3),
ylab = "Effective sample size (ñ)",
xlab = "Effect size (RR)", col =Soil_mineral_nitrogen$unique_id, atransf = exp)
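For reference, here is a small sketch of how the quantity passed to ni could be built. It assumes the effective sample size is meant to come from the per-group sample sizes; the question's code applies the same formula to the group means instead. The vectors n_control and n_treatment are hypothetical and are not columns of the original data.

```r
# Hypothetical per-group sample sizes (not from the question's data)
n_control   <- c(4, 6, 10, 3)
n_treatment <- c(4, 5, 12, 3)

# Inverse effective sample size, in the same form as the question's formula
# (a + b) / (a * b), here applied to sample sizes rather than means
inv_n_tilda <- (n_control + n_treatment) / (n_control * n_treatment)

# Effective sample size itself: n_c * n_t / (n_c + n_t)
n_tilda <- 1 / inv_n_tilda
n_tilda
```

Whichever of the two you pass through ni, the y-axis label should say which one it is.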

How could I conduct meta-analysis on percentage outcomes using R?

My example data is as follows:
df <- data.frame(study = c("Hodaie","Kerrigan","Lee","Andrade","Lim"), SR = c(0.5460, 0.2270, 0.7540, 0.6420, 0.5000), SE = c(12.30, 15.70, 12.80, 13.80, 9.00), Patients = c(5, 5, 3, 6, 4))
I want to conduct the meta-analysis with SR (single-group percentage), SE (standard error, which I can compute from the sample size and percentage), and Patients (sample size for each study). I hope to get the following forest plot (I found this example in an article that also has single-group percentage data, but I can't work out which R statement or argument they used):
Could anyone tell me which R function or argument I could use to conduct the meta-analysis and generate the forest plot above? Thank you!
I am sure there are plenty of ways to do this using packages but it can be accomplished in base R (and there are likely more elegant solutions using base R). The way I do it is to first build a blank plot much larger than the needed graphing portion, then overlay the relevant elements on it. I find one has more control over it this way. A basic example that could get you started is below. If you are new to R (based on your name NewRUser), I suggest running it line-by-line to see how it all works. Again, this is only one way and there are likely better approaches. Good luck!
Sample Data
#### Sample Data (modified from OP)
df <- data.frame(Study = c("Hodaie","Kerrigan","Lee","Andrade","Lim"),
SR = c(0.5460, 0.2270, 0.7540, 0.6420, 0.5000),
SE = c(12.30, 15.70, 12.80, 13.80, 9.00),
Patients = c(5, 5, 3, 6, 4),
ci_lo = c(30, -8.0, 50, 37, 32),
ci_hi = c(78, 53, 100, 91, 67))
### Set up plotting elements
n.studies <- nrow(df)
yy <- n.studies:1
seqx <- seq(-100, 100, 50)
## blank plot much larger than needed
plot(range(-550, 200), range(0, n.studies), type = 'n', axes = F, xlab = '', ylab = '') #blank plot, much bigger than plotting portion needed
# Set up axes
axis(side = 1, at = seqx, labels = seqx, cex.axis = 1, mgp = c(2, 1.5, 1)) # add axis and label (bottom)
mtext(side = 1, at = 0, 'Seizure Reduction', line = 2.5, cex = 0.85, padj = 1)
axis(side = 3, at = seqx, labels = seqx, cex.axis = 1, mgp = c(2, 1.5, 1)) # add axis and label (top)
mtext(side = 3, at = 0, 'Seizure Reduction', line = 2.5, cex = 0.85, padj = -1)
## add lines and dots
segments(df[, "ci_lo"], yy, df[,"ci_hi"], yy) # add lines
points(df[,"SR"]*100, yy, pch = 19) # add points
segments(x0 = 0, y0 = max(yy), y1 = 0, lty = 3, lwd = 0.75) #vertical line # 0
### Add text information
par(xpd = TRUE)
text(x = -550, y = yy, df[,"Study"], pos = 4)
text(x = -450, y = yy, df[,"SR"]*100, pos = 4)
text(x = -350, y = yy, df[,"SE"], pos = 4)
text(x = -250, y = yy, df[,"Patients"], pos = 4)
text(x = 150, y = yy, paste0(df[,"ci_lo"], "-", df[,"ci_hi"]), pos = 4)
text(x = c(seq(-550, -250, 100), 150), y = max(yy)+0.75,
c(colnames(df)[1:4], "CI"), pos = 4, font = 2)
# Add legend
legend(x = 50, y = 0.5, c("Point estimate", "95% Confidence interval"),
pch = c(19, NA), lty = c(NA, 1), bty = "n", cex = 0.65) # lty = 1 is a solid line
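The ci_lo and ci_hi columns in the sample data are typed in by hand. The numbers appear to match a normal-approximation 95% interval, SR*100 plus or minus about 1.96*SE, truncated to whole numbers. That is my reading of the values, not something stated in the answer, so treat the following as a sketch under that assumption:

```r
# Study-level data from the question (SR as a proportion, SE in percent)
df <- data.frame(Study = c("Hodaie", "Kerrigan", "Lee", "Andrade", "Lim"),
                 SR = c(0.5460, 0.2270, 0.7540, 0.6420, 0.5000),
                 SE = c(12.30, 15.70, 12.80, 13.80, 9.00))

z <- qnorm(0.975)                          # about 1.96 for a 95% interval

# Assumed construction: estimate +/- z * SE, truncated toward zero
df$ci_lo <- trunc(df$SR * 100 - z * df$SE)
df$ci_hi <- trunc(df$SR * 100 + z * df$SE)
df
```

Note that a plain normal interval can run below 0 or above 100 for a percentage, as the Kerrigan row shows.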

R: Parallel Coordinates Plot without GGally

I am using the R programming language. I am using a computer that does not have a USB port or an internet connection - I only have R with a few preloaded libraries (e.g. ggplot2, reshape2, dplyr, base R).
Is it possible to make "parallel coordinate" plots (e.g. below) using only the "ggplot2" library and not "ggally"?
#load libraries (I do not have GGally)
library(GGally)
#load data (I have MASS)
data(crabs, package = "MASS")
#make 2 different parallel coordinate plots
ggparcoord(crabs)
ggparcoord(crabs, columns = 4:8, groupColumn = "sex")
Thanks
Source: https://homepage.divms.uiowa.edu/~luke/classes/STAT4580-2020/parcor.html
In fact, you do not even need ggplot! This is just a plot of standardised values (minus mean divided by SD), so you can implement this logic with any plotting function capable of doing so. The cleanest and easiest way to do it is in steps in base R:
# Standardising the variables of interest
data(crabs, package = "MASS")
crabs[, 4:8] <- apply(crabs[, 4:8], 2, scale)
# This colour solution works in great generality, although RColorBrewer has better distinct schemes
mycolours <- rainbow(length(unique(crabs$sex)), end = 0.6)
# png("gally.png", 500, 400, type = "cairo", pointsize = 14)
par(mar = c(4, 4, 0.5, 0.75))
plot(NULL, NULL, xlim = c(1, 5), ylim = range(crabs[, 4:8]) + c(-0.2, 0.2),
bty = "n", xaxt = "n", xlab = "Variable", ylab = "Standardised value")
axis(1, 1:5, labels = colnames(crabs)[4:8])
abline(v = 1:5, col = "#00000033", lwd = 2)
abline(h = seq(-2.5, 2.5, 0.5), col = "#00000022", lty = 2)
for (i in 1:nrow(crabs)) lines(as.numeric(crabs[i, 4:8]), col = mycolours[as.numeric(crabs$sex[i])])
legend("topright", c("Female", "Male"), lwd = 2, col = mycolours, bty = "n")
# dev.off()
You can apply this logic (x axis with integer values, y axis with standardised variable lines) in any package that can conveniently draw multiple lines (as in time series), but this solution has no extra dependencies and will not become unavailable due to an orphaned package with 3 functions getting purged from CRAN.
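Since the question asked about ggplot2 specifically, the same logic can be sketched there as well: standardise the columns, reshape to long format by hand, and draw one line per row. This is only one possible layout; the names long, id, variable and value are my own, and it needs nothing beyond ggplot2 and the MASS data.

```r
library(ggplot2)
data(crabs, package = "MASS")

vars   <- colnames(crabs)[4:8]
scaled <- scale(crabs[, vars])       # matrix of standardised values
n      <- nrow(crabs)

# Long format built in base R: one row per (observation, variable) pair.
# as.vector() unrolls the matrix column by column, matching rep(vars, each = n).
long <- data.frame(
  id       = rep(seq_len(n), times = length(vars)),
  sex      = rep(crabs$sex, times = length(vars)),
  variable = factor(rep(vars, each = n), levels = vars),
  value    = as.vector(scaled)
)

# One line per observation, coloured by sex
p <- ggplot(long, aes(x = variable, y = value, group = id, colour = sex)) +
  geom_line(alpha = 0.5) +
  labs(x = "Variable", y = "Standardised value")
p
```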
The closest thing I found that does not need GGally is the built-in parcoord() function from the MASS package:
#source: https://stat.ethz.ch/R-manual/R-devel/library/MASS/html/parcoord.html
library(MASS)
parcoord(state.x77[, c(7, 4, 6, 2, 5, 3)])
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
parcoord(log(ir)[, c(3, 4, 2, 1)], col = 1 + (0:149)%/%50)
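Applied to the crabs data from the question, a minimal sketch could look like this. Note that parcoord() rescales each variable to its own minimum and maximum rather than standardising by mean and SD, so the picture will not be identical to the ggparcoord() default. The name sex_col is just for illustration.

```r
library(MASS)

# Same columns and grouping as ggparcoord(crabs, columns = 4:8, groupColumn = "sex")
sex_col <- as.integer(crabs$sex)     # 1 = F, 2 = M (factor level order)
parcoord(crabs[, 4:8], col = sex_col, var.label = TRUE)
```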

How to change the increment value for xlim and ylim when I wanna plot

I am quite new to R.
I am working on part of my MSc thesis and want to make some diurnal plots of, for instance, methane production over a period of time.
I want to see its variation in time and its correlation with another factor over the same period. I have two questions.
First:
How do I make the axis ticks increase in steps of 2 hours? The axis has its own default, and when I give it, for example:
xlim = c(0, 23)
it starts from 0 and goes up in steps of 5 hours. I want it to go up in steps of 2 hours.
Second:
How do I add another variable that might be correlated with my first variable over the same time period? Let's say methane production over 23 hours could be related to oxygen consumption, just as an example. How can I put oxygen and methane on the same plot (y) against time (x)?
I would really appreciate it if you could help me with this.
Kind regards,
Farhad
You can use the at and labels arguments of the axis() function to customize tick locations and labels.
You can call axis() with side = 4 to create a custom y-axis on the right of your graph.
Please see the code below, which illustrates both points:
set.seed(123)
x <- 0:23
df<- data.frame(
x,
ch4 = 1000 - x ^ 2,
o2 = 2000 - 2 * (x - 10) ^ 2
)
par(mar = c(5, 5, 2, 5))
with(df, plot(x, ch4,
type = "l", col = "red3",
ylab = "CH4 emission",
lwd = 3,
xlim = c(0, 23),
xlab = "",
xaxt = "n"))
axis(1, at = seq(0, 23, 2), labels = seq(0, 23, 2))
par(new = TRUE)
with(df, plot(x, o2,
pch = 16, axes = FALSE,
xlab = NA, ylab = NA, cex = 1.2))
axis(side = 4)
mtext(side = 4, line = 3, "O2 consumption")
legend("topright",
legend = c("CH4", "O2"), # red line is CH4, black points are O2
lty = c(1, 0),
lwd = c(3, NA),
pch = c(NA, 16),
col = c("red3", "black"))
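As for why the default went up in steps of 5: R picks "pretty" round tick positions for the axis range, roughly what pretty() returns, and xlim only sets the range, not the spacing. A small sketch comparing the two (the sine curve is made-up data, and ticks is my own name):

```r
# Default-style tick positions for a 0 to 23 range: steps of 5
pretty(c(0, 23))

# Tick positions every 2 hours, as passed to axis(1, at = ...)
ticks <- seq(0, 23, by = 2)
ticks

# Minimal plot using them: suppress the default x-axis, then draw a custom one
plot(0:23, sin(0:23 / 4), type = "l", xaxt = "n", xlab = "Hour", ylab = "Value")
axis(1, at = ticks, labels = ticks)
```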

Mixed plot with histogram and superimposed line plot in same figure

I know there are strong opinions about mixing plot types in the same figures, especially if there are two y axes involved. However, this is a situation in which I have no alternative - I need to create a figure using R that follows a standard format - a histogram on one axis (case counts), and a superimposed line graph showing an unrelated rate on an independent axis.
The best I have been able to do is stacked ggplot2 facets, but this is not as easy to interpret for the purposes of this analysis as the combined figure. The people reviewing this output will need it in the format they are used to.
I'm attaching an example below.
Any ideas?
For etiquette purposes, sample data below:
y1<-sample(0:1000,20,rep=TRUE)
y2<-sample(0:100,20,rep=TRUE)
x<-1981:2000
I feel your pain - I have had to recreate plots before. I even did it in SAS once.
If it's a one-off, I'd be tempted to go old-school. Something like this:
# Generate some data
someData <- data.frame(Year = 1987:2009,
mCases = rpois(23, 3),
pVac = sample(55:80, 23, T))
par(mar = c(5, 5, 5, 5))
with(someData, {
# Generate the barplot
BP <- barplot(mCases, ylim = c(0, 18), names = Year,
yaxt = "n", xlab = "", ylab = "Measles cases in Thousands")
axis(side = 2, at = 2*1:9, las = 1)
box()
# Add the % Vaccinated
par(new = T)
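# reuse the barplot's x range (par("usr")) so the line sits over the bar midpoints
plot(BP, pVac, type = "l", xlim = par("usr")[1:2], xaxs = "i", ylim = c(0, 100), axes = F, ylab = "", xlab = "")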
axis(side = 4, las = 1)
nudge <- ifelse(pVac > median(pVac), 2, -2)
text(BP, pVac + nudge, pVac)
mtext(side = 4, "% Vaccinated", line = 3)
par(new = F)
})
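One detail worth checking with this approach: barplot() returns the x positions of the bar midpoints, and a second plot() drawn with par(new = TRUE) picks its own x range unless told otherwise, so the line can drift away from the bars. Here is a sketch using data in the shape given in the question; the seed, the axis labels and the names mids and usr are my own additions.

```r
set.seed(42)                                  # assumption: added for reproducibility
x  <- 1981:2000
y1 <- sample(0:1000, 20, replace = TRUE)      # case counts (bars)
y2 <- sample(0:100, 20, replace = TRUE)       # unrelated rate (line)

par(mar = c(5, 5, 2, 5))
mids <- barplot(y1, names.arg = x, las = 2, ylab = "Case counts")
usr  <- par("usr")                            # user coordinates of the bar plot

# Overlay the line on the same x range, with an independent y scale
par(new = TRUE)
plot(mids, y2, type = "l", lwd = 2, col = "red3", axes = FALSE,
     xlab = "", ylab = "", xlim = usr[1:2], xaxs = "i", ylim = c(0, 100))
axis(side = 4, las = 1)
mtext("Rate", side = 4, line = 3)
```

Setting xaxs = "i" stops R from padding the x range a second time, so both layers share exactly the same horizontal scale.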
Try library(plotrix)
library(plotrix)
## Create sample data
y2<-sample(0:80,20,rep=TRUE)
x2<-sort(sample(1980:2010,20,rep=F))
y1<-sample(0:18,20,rep=TRUE)
x1<-sort(sample(1980:2010,20,rep=F))
x<-1980:2010
twoord.plot(x1,y1,x2,y2,
lylim=c(0,18),rylim=c(0,100),type=c("bar","l"),
ylab="Measles Cases in thousands",rylab="% Vaccinated",
lytickpos=seq(0,18,by=2),rytickpos=seq(0,100,by=10),ylab.at=9,rylab.at=50,
lcol=3,rcol=4)
