R: Reduce number of plots in quantile regression results

By using the following code I am able to plot the results of my quantile regression model:
quant_reg_all <- rq(y_quant ~ X_quant, tau = seq(0.05, 0.95, by = 0.05), data=df_lasso)
quant_plot <- summary(quant_reg_all, se = "boot")
plot(quant_plot)
However, as there are many variables the plots are unreadable as shown in the image below:
Including the label, I have 18 variables.
How could I plot a few of these panels at a time so they are readable?

Depending on the number of graphs you want, you can pass the indices of the coefficients to plot as the second argument (parm) of plot():
quant_reg_all <- rq(y_quant ~ X_quant, tau = seq(0.05, 0.95, by = 0.05), data=df_lasso)
quant_plot <- summary(quant_reg_all, se = "boot")
plot(quant_plot, 1:3)             # plot the first 3
plot(quant_plot, c(3, 6, 9, 10))  # plot the 3rd, 6th, 9th and 10th plots
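To page through every coefficient a few panels at a time, the same parm argument can be used in a loop. The following is a minimal sketch on simulated data; the names sim_data, batch_size and batches are made up for illustration, and a batch size of 4 is an arbitrary choice.

```r
library(quantreg)

# Simulated stand-in for the real data: 7 predictors plus an intercept
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 7), n, 7, dimnames = list(NULL, paste0("x", 1:7)))
sim_data <- data.frame(y = drop(X %*% rep(1, 7)) + rnorm(n), X)

fit <- rq(y ~ ., tau = seq(0.1, 0.9, by = 0.1), data = sim_data)
fit_summary <- summary(fit, se = "boot")

# Split the coefficient indices into groups of (at most) batch_size
n_coef <- nrow(coef(fit))   # rows = coefficients, columns = taus
batch_size <- 4
batches <- split(seq_len(n_coef), ceiling(seq_len(n_coef) / batch_size))

# One plotting page per batch
for (idx in batches) {
  plot(fit_summary, parm = idx)
}
```

Each iteration draws one page, so in an interactive session you may want to open a new device or save each page to a file inside the loop.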

Related

How to plot the significant Tukey results as red

I have used the following code to plot the results of the Tukey test after my Anova analysis in R.
TukeyHSD(myANOVA, conf.level=.90)
TUKEY <- TukeyHSD(myANOVA, conf.level=.90)
plot(TUKEY , las=1 , col="black")
However, since too many lines are plotted, I would like the significant ones to be highlighted in red. I have seen a similar question here with the comment "overwrite the black lines showing significant differences with red lines", but I don't know how to do it.
Imagine we have the following data:
data <- data.frame(group = rep(c("P1", "P2", "P3"), each = 40), values = c(rnorm(40, 0, 3),rnorm (40, 8, 10),rnorm (40, 0, 3)))
Then we conduct a Tukey test, convert the results to a matrix, and then to a dataframe (I don't know how to do it otherwise):
results_test <- TukeyHSD(aov(data$values~ data$group), conf.level=.95)
results_matrix <- as.matrix (results_test)
df_res <- as.data.frame(results_matrix[1])
Then we plot it using an ifelse function, as a function of the p-values:
plot(results_matrix, col= ifelse(df_res[,4]<0.05, 'red', 'black'))
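As a possible shortcut, the adjusted p-values can also be read straight from the TukeyHSD object, since each of its components is a matrix with a "p adj" column. This is only a sketch of that alternative: the names fit, tukey, p_adj and line_cols are made up here, and the grouping variable is converted to a factor explicitly.

```r
set.seed(42)
data <- data.frame(
  group  = factor(rep(c("P1", "P2", "P3"), each = 40)),
  values = c(rnorm(40, 0, 3), rnorm(40, 8, 10), rnorm(40, 0, 3))
)

fit   <- aov(values ~ group, data = data)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# tukey$group is a matrix with columns diff, lwr, upr and "p adj"
p_adj     <- tukey$group[, "p adj"]
line_cols <- ifelse(p_adj < 0.05, "red", "black")

# One colour per comparison, in the row order of the matrix
plot(tukey, las = 1, col = line_cols)
```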
I personally prefer the plot generated by the multcomp package, and with this package you can perform the Tukey test for unbalanced designs.
library(multcomp)
### set up a one-way ANOVA
data(warpbreaks)
amod <- aov(breaks ~ tension, data = warpbreaks)
### specify all pair-wise comparisons among levels of variable "tension"
tuk <- glht(amod, linfct = mcp(tension = "Tukey"))
### p-values
pvalues <- adjusted()(tuk)$pvalues
### get confidence intervals
ci.glht <- confint(tuk)
### plot them
plot(ci.glht, col = ifelse(pvalues < 0.05, "red", "black"),
xlab = "Difference in mean levels")

ggpairs: Plotting only the first two rows of a correlation matrix

I'd like to do some correlation analysis with plotting. As my actual data is too large, I used the mtcars data frame to set up an example.
Here is the code:
library(ggplot2)
library(ggcorrplot)
mtcars
library(ggcorrplot)
# Computing correlation matrix
corrmatr_mtcars <- round(cor(subset(mtcars[c(3:7,1)])),1)
head(corrmatr_mtcars[,1:6])
corrmatr_mtcars
# Computing correlation matrix with p-values
corrmatr_mtcars.mat <- cor_pmat(mtcars[c(3:7,1)])
head(corrmatr_mtcars.mat[, 1:6])
corrmatr_mtcars.mat
library(GGally)
ggpairs(mtcars[c(3:7,1)],
title = "Corr Analysis of...",
lower = list(continuous = wrap("cor",
size = 3)),
upper = list(
continuous = wrap("smooth",
alpha = 0.3,
size = 0.1))
)
With this plot result:
However, I am interested only in the correlation of the first two variables against all others. To avoid unnecessary information and save space, I would like my plot to show only the first two correlation rows. All other correlations can be dropped.
In the end, I imagine something as follows needing only 3 rows.
Subsequently the Corr-Value labels should be placed at the scatterplot panels.
I couldn't find any option to do so.
Would that even generally be possible with ggpairs (without complex functions)? If yes: how? If no: what could be an approach with a comparable result?
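It can be done this way: build the full ggpairs object, then keep only its first 12 panels (2 rows of 6 columns) and the first two y-axis labels.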
library(ggplot2)
library(ggcorrplot)
mtcars
library(ggcorrplot)
# Computing correlation matrix
corrmatr_mtcars <- round(cor(subset(mtcars[c(3:7,1)])),1)
head(corrmatr_mtcars[,1:6])
corrmatr_mtcars
# Computing correlation matrix with p-values
corrmatr_mtcars.mat <- cor_pmat(mtcars[c(3:7,1)])
head(corrmatr_mtcars.mat[, 1:6])
corrmatr_mtcars.mat
library(GGally)
gg1 = ggpairs(mtcars[c(3:7,1)],
title = "Corr Analysis of...",
lower = list(continuous = wrap("cor",
size = 3)),
upper = list(
continuous = wrap("smooth",
alpha = 0.3,
size = 0.1))
)
gg1$plots = gg1$plots[1:12]
gg1$yAxisLabels = gg1$yAxisLabels[1:2]
gg1

How to draw forest plot from Dataframe (HR and CI)

I have Cox regression output from SPSS containing the following columns.
I would like to load this data into R as a data frame and create a forest plot from it. How can I create a forest plot from a data frame containing HRs/ORs and CIs?
Here is some reproducible data. It would be a great help if you could show me how to make one. I tried but couldn't make one.
HR<-c(2,3,5)
ci_u<-c(1.2,1.1,1.3)
ci_l<-c(1.3,1.4,1.3)
names<-c("High","Low","medium")
datf<-data.frame(HR,ci_u,ci_l,sig,ns)
I suggest a simple ggplot approach, as it offers great control. The underlying idea is to plot the HRs as points and then add the CIs as error bars. I altered your dataset for two reasons:
1. you did not define the sig and ns variables in your data frame;
2. the point estimates do not fall between the upper and lower CI values. I understand that you made up these values, but I changed them because the plot won't look good if the CI lines fall on only one side of the point.
I used the following data frame:
dataset <- data.frame(
study_label = c(paste(rep("Study", 4), 1:4, sep = "_")),
HR = c(.72, 1.4, 1.7, 1.4),
lci = c(.52, 1.1, 1.3, 1.2),
uci = c(.83, 1.9, 2.1, 1.5)
)
library(ggplot2)
ggplot(dataset, aes(y = study_label, x = HR)) +
  geom_point() +                                   # map HRs as points on the x axis and study labels on the y axis
  geom_errorbar(aes(xmin = lci, xmax = uci)) +     # add CIs as error bars
  geom_vline(xintercept = 1, linetype = "dashed")  # draw a vertical line at x = 1 as the null for ratio estimates
Please see the output
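Ratio estimates such as HRs are often drawn on a log scale so that, for example, 0.5 and 2 sit at equal distances from the null line. The sketch below adds that to the same idea; the axis breaks, point size, bar width and label text are arbitrary choices for illustration, not part of the original answer.

```r
library(ggplot2)

dataset <- data.frame(
  study_label = paste("Study", 1:4, sep = "_"),
  HR  = c(0.72, 1.4, 1.7, 1.4),
  lci = c(0.52, 1.1, 1.3, 1.2),
  uci = c(0.83, 1.9, 2.1, 1.5)
)

forest <- ggplot(dataset, aes(x = HR, y = study_label)) +
  geom_point(size = 3) +                                    # point estimates
  geom_errorbar(aes(xmin = lci, xmax = uci), width = 0.2) + # confidence intervals
  geom_vline(xintercept = 1, linetype = "dashed") +         # null value for a ratio
  scale_x_log10(breaks = c(0.5, 1, 2)) +                    # log-scaled x axis
  labs(x = "Hazard ratio (log scale)", y = NULL)

print(forest)
```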

superimposing two probability plots with probplot

I can create a lognormal probability plot using the probplot() function from the e1071 package. A problem arises when I try to add another set of lognormal data to the first plot. Although I use the command par(new=T), the x-axes of the two plots are different and don't align.
Is there another way to go about this?
I tried using the points() function. However, it appears I need the x and y coordinates to plot it and I don't know how to extract the x, y coordinates from the probplot() function.
# Program to plot random logn failure times with probability plot
library(e1071)
logn_prob_plot <- function() {
set.seed(1)
x<-rlnorm(10,1,1)
par(bty="l")
par(col.lab="white")
p<-probplot(x,qdist=qlnorm)
par(col.lab="black")
mtext(text="failure time", col="black",side=1,line=3,outer=F)
mtext(text="lognormal probability", col="black",side=2,line=3,outer=F)
set.seed(2)
y=rlnorm(10,2,3)
par(new=T)
par(col.lab="white")
probplot(y,qdist=qlnorm,xlab="fail time",ylab="lognormal probability")
par(col.lab="black")
mtext(text="failure time", col="black",side=1,line=3,outer=F)
mtext(text="lognormal probability", col="black",side=2,line=3,outer=F)
}
logn_prob_plot()
My expected result is two groups of data on the same probability plot with the same x and y axes. Instead, I get two different x-axes that are not aligned.
First, let's simulate the variables:
set.seed(1)
x<-rlnorm(10,1,1)
set.seed(2)
y=rlnorm(10,2,3)
The first probplot is:
p<-probplot(x,qdist=qlnorm, meanlog = 1, sdlog = 1)
which produces the output:
The second probplot is:
q <- probplot(y,qdist=qlnorm,meanlog = 2, sdlog = 3)
which produces the output:
Your best shot at merging them is to use the scale of the smaller one and discard some points:
p<-probplot(x,qdist=qlnorm, meanlog = 1, sdlog = 1)
points(sort(x), p[[1]](ppoints(length(x))), col = "red", pch = 19)
lines(q, col = "blue")
points(sort(y), q[[1]](ppoints(length(y))), col = "blue", pch = 19)
which gives:
The red line and points are from the distribution with meanlog = 1, sdlog = 1, and the blue ones are from the one with meanlog = 2, sdlog = 3.
I should also warn you that, judging from the source code of the probplot() function:
xl <- quantile(x, c(0.25, 0.75))
yl <- qdist(c(0.25, 0.75), ...)
slope <- diff(yl)/diff(xl)
the slope of the line is determined only by the positions of the first and third quartiles, not by what happens elsewhere.
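If the goal is simply to get both samples on shared axes, one alternative is to skip probplot() and build the plot from base R. This is a different technique from the answer above, offered as a sketch: the sorted data go on a log-scaled x axis and standard normal quantiles of the plotting positions on the y axis, so lognormal samples fall roughly on straight lines. The names px and py are made up for illustration.

```r
set.seed(1)
x <- rlnorm(10, 1, 1)
set.seed(2)
y <- rlnorm(10, 2, 3)

# Standard normal quantiles at the plotting positions of each sample
px <- qnorm(ppoints(length(x)))
py <- qnorm(ppoints(length(y)))

# A single plot() call fixes the axes for both samples
plot(sort(x), px, log = "x",
     xlim = range(c(x, y)), ylim = range(c(px, py)),
     col = "red", pch = 19,
     xlab = "failure time", ylab = "standard normal quantile")
points(sort(y), py, col = "blue", pch = 19)
legend("topleft", legend = c("x", "y"), col = c("red", "blue"), pch = 19)
```

Because the axis limits are computed from both samples before anything is drawn, no points are discarded and par(new = TRUE) is not needed.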

How to change font size in plot of cross-validation result from cvTools package in R?

I am trying to change the font sizes (especially of the title and axis labels) in the plot of a cross-validation result from the cvTools package in R, but it does not seem to work:
library(cvTools)
data(coleman)
set.seed(1234)
# Split n observations into K groups to be used for (repeated) K-fold cross-validation
folds <- cvFolds(nrow(coleman), K = 5, R = 50)
# perform cross-validation for an LS regression model
fitLm <- lm(Y ~ ., data = coleman)
cvFitLm <- cvLm(fitLm, cost = rtmspe, folds = folds, trim = 0.1)
# perform cross-validation for an MM regression model
fitLmrob <- lmrob(Y ~ ., data = coleman, k.max = 500)
cvFitLmrob <- cvLmrob(fitLmrob, cost = rtmspe, folds = folds, trim = 0.1)
# combine results into one object
cvFits <- cvSelect(LS = cvFitLm, MM = cvFitLmrob)
The two lines below differ in plot points size, but there is no change in title font size / labels font size.
# plot combined results
plot(cvFits, main = "foo_title")
plot(cvFits, main = "foo_title", cex = 0.5, cex.main = 0.5, cex.lab = 0.5)
What am I missing here?
Base graphics par settings generally don't work for lattice and other grid graphics. The plot methods for cvTools use lattice graphics. Here are ways to change the various font sizes in your plot:
plot(cvFits, cex=0.5, # Point markers
main = list("foo_title", cex = 1), # Title
xlab=list(cex=0.75), ylab=list(cex=0.75), # Axis titles
scales=list(x=list(cex=0.75), y=list(cex=0.75))) # Axis tick labels
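The same list-based arguments can be tried on a plain lattice plot, which may help to see which argument controls which text element. This is a minimal sketch using xyplot() on mtcars rather than cvTools; the axis label text and cex values are arbitrary.

```r
library(lattice)

p <- xyplot(mpg ~ wt, data = mtcars,
            cex  = 0.5,                                   # point markers
            main = list("foo_title", cex = 0.8),          # title
            xlab = list("Weight", cex = 0.75),            # x-axis title
            ylab = list("Miles per gallon", cex = 0.75),  # y-axis title
            scales = list(x = list(cex = 0.75),           # x tick labels
                          y = list(cex = 0.75)))          # y tick labels
print(p)
```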
