I have Cox regression output from SPSS containing the following columns.
I would like to load this data into R as a data frame and create a forest plot from it. How can I create a forest plot from a data frame containing HRs/ORs and their CIs?
Here is some reproducible data. It would be a great help if you could show me how to make the plot; I tried but could not get one to work.
HR<-c(2,3,5)
ci_u<-c(1.2,1.1,1.3)
ci_l<-c(1.3,1.4,1.3)
names<-c("High","Low","medium")
datf<-data.frame(HR,ci_u,ci_l,sig,ns)
I suggest a simple ggplot2 approach, as it offers a lot of control. The underlying idea is to plot the HRs as points and then add the CIs as error bars. I altered your dataset for two reasons:
You did not define the sig and ns variables used in your data frame.
The point estimates do not fall between the lower and upper CI values. I understand that you made these values up, but I changed them because otherwise each CI line would fall entirely on one side of its point.
I used the following dataframe
dataset <- data.frame(
study_label = c(paste(rep("Study", 4), 1:4, sep = "_")),
HR = c(.72, 1.4, 1.7, 1.4),
lci = c(.52, 1.1, 1.3, 1.2),
uci = c(.83, 1.9, 2.1, 1.5)
)
library(ggplot2)
ggplot(dataset, aes(y = study_label, x = HR)) +
  geom_point() +  # map HRs to the x axis and study labels to the y axis
  geom_errorbar(aes(xmin = lci, xmax = uci)) +  # add CIs as horizontal error bars
  geom_vline(xintercept = 1, linetype = "dashed")  # vertical line at x = 1, the null value for ratio estimates
Please see the output
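Forest plots of ratio estimates are often drawn on a log scale, so that intervals look symmetric around the point estimate. The following is a minimal sketch of that variant, not part of the original answer. The HRs and labels come from the question; the CI limits are invented for illustration so that each interval brackets its estimate, and the names forest_df, label, lci and uci are hypothetical.

```r
library(ggplot2)

# Hypothetical values: HRs and labels from the question, CI limits invented
# so that each interval contains its point estimate.
forest_df <- data.frame(
  label = c("High", "Low", "Medium"),
  HR    = c(2, 3, 5),
  lci   = c(1.2, 1.1, 1.3),
  uci   = c(3.3, 8.2, 19.2)
)

# Keep the rows in table order, with the first row at the top of the plot.
forest_df$label <- factor(forest_df$label, levels = rev(forest_df$label))

p <- ggplot(forest_df, aes(x = HR, y = label)) +
  geom_point(size = 3) +
  geom_errorbar(aes(xmin = lci, xmax = uci), width = 0.2) +
  geom_vline(xintercept = 1, linetype = "dashed") +  # null value for a ratio
  scale_x_log10() +                                  # log scale for ratio estimates
  labs(x = "Hazard ratio with CI", y = NULL)
print(p)
```

Converting the labels to a factor with reversed levels is what controls the row order; without it ggplot2 sorts the labels alphabetically from the bottom up.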
I am trying to plot an NMDS plot of species community composition data with ellipses which represent 95% confidence intervals. I generated the data for my NMDS plot using metaMDS and successfully have ordinations generated using the basic plot functions in R (see code below). However, I am struggling to get my data to plot successfully using ggplot2 and this is the only way I have seen 95% CIs plotted on NMDS plots. I am hoping someone is able to help me correct my code so the ellipses show 95% CIs, or could point me in the right direction for achieving this using other methods?
My basic code for plotting my NMDS plot:
orditorp(dung.families.mds, display = "sites", labels = F, pch = c(16, 8, 17, 18) [as.numeric(group.variables$Heating)], col = c("green", "blue", "orange", "black") [as.numeric(group.variables$Dungfauna)], cex = 1.3)
ordiellipse(dung.families.mds, groups = group.variables$Dungfauna, draw = "polygon", lty = 1, col = "grey90")
legend("topleft", "stress = 0.1329627", bty = "n", cex = 1)
My ordination:
I realize this question is old, but I found the following post useful for plotting confidence ellipses in my own work, and maybe it will help you: "Plotting ordiellipse function from vegan package onto NMDS plot created in ggplot2".
Edit: Below I have copied the code from the second part of Didzis Elferts's answer on the link above.
Where "sol" is the metaMDS object:
First, make NMDS data frame with group column.
NMDS = data.frame(MDS1 = sol$points[,1], MDS2 = sol$points[,2], group = MyMeta$amt)
Next, save result of function ordiellipse() as some object.
ord <- ordiellipse(sol, MyMeta$amt, display = "sites", kind = "se", conf = 0.95, label = T)
Data frame df_ell contains values to show ellipses. It is calculated again with function veganCovEllipse which is hidden in vegan package. This function is applied to each level of NMDS (group) and now it uses arguments stored in ord object - cov, center and scale of each level.
df_ell <- data.frame()
for (g in levels(NMDS$group)) {
  # veganCovEllipse() is not exported by vegan, hence the ::: operator
  df_ell <- rbind(df_ell,
                  cbind(as.data.frame(with(NMDS[NMDS$group == g, ],
                                           vegan:::veganCovEllipse(ord[[g]]$cov, ord[[g]]$center, ord[[g]]$scale))),
                        group = g))
}
Plotting is done the same way as in the previous example. Because the ellipse coordinates are calculated from the object returned by ordiellipse(), this solution will work with whatever parameters you pass to that function.
ggplot(data = NMDS, aes(MDS1, MDS2)) + geom_point(aes(color = group)) +
geom_path(data=df_ell, aes(x=NMDS1, y=NMDS2,colour=group), size=1, linetype=2)
I'd like to do some correlation analysis with plotting. As my actual data is too large, I used the mtcars data frame to set up an example.
Here is the code:
library(ggplot2)
library(ggcorrplot)
mtcars
library(ggcorrplot)
# Computing correlation matrix
corrmatr_mtcars <- round(cor(subset(mtcars[c(3:7,1)])),1)
head(corrmatr_mtcars[,1:6])
corrmatr_mtcars
# Computing correlation matrix with p-values
corrmatr_mtcars.mat <- cor_pmat(mtcars[c(3:7,1)])
head(corrmatr_mtcars.mat[, 1:6])
corrmatr_mtcars.mat
library(GGally)
ggpairs(mtcars[c(3:7,1)],
title = "Corr Analysis of...",
lower = list(continuous = wrap("cor",
size = 3)),
upper = list(
continuous = wrap("smooth",
alpha = 0.3,
size = 0.1))
)
With this plot result:
However, I am only interested in the correlation of the first two variables against all the others. To avoid unnecessary information and save space, I would like my plot to show only the first two correlation rows. All other correlations could be dropped.
In the end, I imagine something as follows needing only 3 rows.
In addition, the Corr value labels should be placed on the scatterplot panels.
I couldn't find any option to do so.
Would that even generally be possible with ggpairs (without complex functions)? If yes: how? If no: what could be an approach with a comparable result?
It can be done this way, by keeping only the panels that belong to the first two rows of the plot matrix:
library(ggplot2)
library(ggcorrplot)
mtcars
library(ggcorrplot)
# Computing correlation matrix
corrmatr_mtcars <- round(cor(subset(mtcars[c(3:7,1)])),1)
head(corrmatr_mtcars[,1:6])
corrmatr_mtcars
# Computing correlation matrix with p-values
corrmatr_mtcars.mat <- cor_pmat(mtcars[c(3:7,1)])
head(corrmatr_mtcars.mat[, 1:6])
corrmatr_mtcars.mat
library(GGally)
gg1 = ggpairs(mtcars[c(3:7,1)],
              title = "Corr Analysis of...",
              lower = list(continuous = wrap("cor", size = 3)),
              upper = list(continuous = wrap("smooth", alpha = 0.3, size = 0.1))
)
# 6 variables give 6 panels per row, so the first 12 panels are rows 1 and 2
gg1$plots = gg1$plots[1:12]
# keep only the y-axis labels of those two rows
gg1$yAxisLabels = gg1$yAxisLabels[1:2]
gg1
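As a possible alternative with a comparable result, the correlations for just those two rows can be computed directly, and GGally's ggduo() can draw a rectangular panel matrix of chosen X columns against chosen Y columns. This is a sketch that was not part of the original answer. It assumes the "first two variables" are disp and hp (columns 3 and 4 of mtcars), and the names rows, cols and corr_2rows are made up for illustration.

```r
# Correlations of the first two variables (disp, hp) with the remaining four.
rows <- c("disp", "hp")
cols <- c("drat", "wt", "qsec", "mpg")
corr_2rows <- round(cor(mtcars[, rows], mtcars[, cols]), 2)
print(corr_2rows)

# Optional plot: ggduo() draws only the requested X-by-Y panels.
# Skipped silently if GGally is not installed.
if (requireNamespace("GGally", quietly = TRUE)) {
  p <- GGally::ggduo(mtcars, columnsX = cols, columnsY = rows,
                     title = "Corr Analysis of...")
  print(p)
}
```

This gives a 2 x 4 grid of scatterplots plus a small table of correlation values, rather than correlation labels inside the panels.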
By using the following code I am able to plot the results of my quantile regression model:
quant_reg_all <- rq(y_quant ~ X_quant, tau = seq(0.05, 0.95, by = 0.05), data=df_lasso)
quant_plot <- summary(quant_reg_all, se = "boot")
plot(quant_plot)
However, as there are many variables the plots are unreadable as shown in the image below:
Including the label, I have 18 variables.
How could I plot only a few of these panels at a time so that they are readable?
Depending on the number of graphs you want, you could do:
quant_reg_all <- rq(y_quant ~ X_quant, tau = seq(0.05, 0.95, by = 0.05), data=df_lasso)
quant_plot <- summary(quant_reg_all, se = "boot")
plot(quant_plot, parm = 1:3)  # plot the first 3
plot(quant_plot, parm = c(3, 6, 9, 10))  # plot the 3rd, 6th, 9th and 10th plots
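To page through all the panels rather than picking indices by hand, the indices can be split into consecutive groups. This is a minimal sketch that assumes 18 panels, as mentioned in the question; per_page is a hypothetical choice, and the final loop is left as a comment because it needs the quant_plot object from above.

```r
n_coef   <- 18  # number of coefficient panels, as stated in the question
per_page <- 4   # hypothetical choice: panels per figure

# Split the indices 1..18 into consecutive groups of at most 4.
chunks <- split(seq_len(n_coef), ceiling(seq_len(n_coef) / per_page))
print(chunks)

# With the summary object from above, each group becomes one figure:
# for (idx in chunks) plot(quant_plot, parm = idx)
```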
I have plots that are .25 ha and I need my data to be displayed as 1 ha. I'm trying to make the following graph but multiplying the counts by 4 (so I have a full hectare instead of a quarter). However, all posts seem to deal with changing axis titles, values, etc., but I need to change the actual histogram frequency counts.
Histogram x-variable in size classes plotted by factor variable
liveTrees = my data
diam1DBH = diameter (numeric, continuous)
site = plot location (factor)
Original code:
ggplot(liveTrees, aes(diam1DBH)) +
  geom_histogram(binwidth = 10) +
  facet_wrap(~site) +
  ggtitle("Stems/0.25ha by Size Class") +
  ylab("Stems/0.25ha") +
  xlab("Diameter Class")
What I've tried:
for (i in 1:length(unique(liveTrees$site))) {
test<-hist(liveTrees[liveTrees$site== unique(liveTrees$site)[i], "diam1DBH"], plot = F)
b <- barchart(test$counts*4, width = 10, xlim=c(0,350), cex.axis = 0.85)
axis(side = 1, at = "b", cex.axis = 0.85)
}
But I keep getting
Error in axis(side = 1, at = "b", cex.axis = 0.85) : no locations are finite
In addition: Warning message:
In axis(side = 1, at = "b", cex.axis = 0.85) : NAs introduced by coercion
So, with this I can get the counts, but the numbers aren't right and they're not in a useful format.
My data is a data.frame, example: data example
What I need is the sum of each diameter class, each bin frequency amount, multiplied by 4. I've been trying to do this but can't get it to work, any help is appreciated!
Multiplying the frequencies by 4 changes the values on the y axis, but the shape of the histogram stays the same. There are therefore two options: relabel the axis values, or, more simply, add the data four times. For example:
ggplot(rbind(data, data,data,data), aes(variable_X)) + geom_histogram(binwidth =10)
This way the data is multiplied, and no new data.frame is made that could confuse analysis later on.
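Another option, sketched here as an alternative that was not in the original answer, is to scale the computed bin counts inside the plot with after_stat(), which is available in ggplot2 3.3.0 and later. The trees data frame below is simulated and only stands in for liveTrees.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for liveTrees: stem diameters at two sites.
trees <- data.frame(
  diam1DBH = runif(200, min = 0, max = 100),
  site     = factor(rep(c("A", "B"), each = 100))
)

# after_stat(count) is the per-bin count computed by the histogram stat;
# multiplying it by 4 converts stems per 0.25 ha to stems per ha.
p <- ggplot(trees, aes(diam1DBH)) +
  geom_histogram(aes(y = after_stat(count * 4)), binwidth = 10) +
  facet_wrap(~site) +
  labs(title = "Stems/ha by Size Class", x = "Diameter Class", y = "Stems/ha")

# The scaled bar heights can be inspected without drawing the plot.
bars <- layer_data(p)
head(bars[, c("PANEL", "xmin", "xmax", "count", "y")])
```

Here the data itself is left untouched: count keeps the raw number of stems per bin, while y holds the value that is drawn.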
Say I have some data, d, and I fit nls models to two subsets of the data.
x<- seq(0,4,0.1)
y1<- (x*2 / (0.2 + x))
y1<- y1+rnorm(length(y1),0,0.2)
y2<- (x*3 / (0.2 + x))
y2<- y2+rnorm(length(y2),0,0.4)
d<-data.frame(x,y1,y2)
m.y1<-nls(y1~v*x/(k+x),start=list(v=1.9,k=0.19),data=d)
m.y2<-nls(y2~v*x/(k+x),start=list(v=2.9,k=0.19),data=d)
I then want to plot the fitted model regression line over data, and shade the prediction interval. I can do this with the package investr and get nice plots for each subset individually:
require(investr)
plotFit(m.y1,interval="prediction",ylim=c(0,3.5),pch=19,col.pred='light blue',shade=T)
plotFit(m.y2,interval="prediction",ylim=c(0,3.5),pch=19,col.pred='pink',shade=T)
However, if I plot them together I have a problem. The shading of the second plot covers the points and shading of the first plot:
1: How can I make sure the points on the first plot end up on top of the shading of the second plot?
2: How can I make the region where the shaded prediction intervals overlap a new color (like purple, or any fusion of the two colors that are overlapping)?
Use adjustcolor to add transparency like this:
plotFit(m.y1, interval = "prediction", ylim = c(0,3.5), pch = 19,
col.pred = adjustcolor("lightblue", 0.5), shade = TRUE)
par(new = TRUE)
plotFit(m.y2, interval = "prediction", ylim = c(0,3.5), pch = 19,
col.pred = adjustcolor("lightpink", 0.5), shade = TRUE)
Depending on what you want you can play around with the two transparency values (here both set to 0.5) and possibly make only one of them transparent.
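To see how the transparency blends where the bands overlap, here is a small base R sketch that does not need investr. The polygon and point coordinates are hypothetical and are not output from plotFit(). Drawing the points last is one simple way to keep them on top of both shaded regions.

```r
# Semi-transparent versions of the two fill colours (alpha = 0.5).
blue_half <- adjustcolor("lightblue", alpha.f = 0.5)
pink_half <- adjustcolor("lightpink", alpha.f = 0.5)

# Two overlapping bands standing in for the prediction intervals
# (hypothetical coordinates).
x <- c(0, 4, 4, 0)
plot(NA, xlim = c(0, 4), ylim = c(0, 3.5), xlab = "x", ylab = "y")
polygon(x, c(1.0, 1.0, 2.5, 2.5), col = blue_half, border = NA)
polygon(x, c(2.0, 2.0, 3.2, 3.2), col = pink_half, border = NA)

# Points drawn last sit on top of both bands.
points(c(1, 2, 3), c(1.5, 2.2, 2.8), pch = 19)
```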