I am having trouble with plotting a forest plot based on a multi-level model, in which I'd also like to display pooled effects of subgroups, as well as the results for subgroup differences.
So far, I have managed to produce a plot of the data where clusters are grouped together. I would like to extend this plot by adding pooled effects of subgroups at the right positions, without losing the grouping of the clusters. (As it is explained here, but also while keeping what is shown in the last example of this).
This is the code I have used so far to produce the "normal" forest plot for my model (sorry, it's pretty long):
# ma_data => my data
# main_3L => my multi-level model
# Prepare row argument for separation by study
dd <- c(0, diff(ma_data$ID))
dd[dd > 0] <- 1
rows <- (1:main_3L$k) + cumsum(dd)
par(tck=-.01, mgp = c(1.6,.2,0), cex=1)
# refactor ID var
ma_data$ID_plot <- substr(ma_data$short_cite, 1, nchar(ma_data$short_cite))
ma_data$ID_plot <- paste(sub(" ||) ","",substr(ma_data$ID_plot,0,2)), substr(ma_data$ID_plot,3,nchar(ma_data$ID_plot)), sep="")
tiff("./figures/forestFull_ext1.tiff", width=3200,height=4500, res=300)
# Plot the forest!
metafor::forest(main_3L,
addpred = TRUE, # adds prediction interval
cex=0.5,
header="Author(s) and Year",
rows=rows, # uses the vector created above
order=order(ma_data$ID, ma_data$es_adj),
ylim=c(0.5,max(rows)+3),
xlim=c(-5,3),
xlab="Hedges' G",
ilab=cbind(as.character(ma_data$setup),as.character(ma_data$target_1), as.character(ma_data$measure_type), ma_data$task, as.character(ma_data$cogdom_pooled), ma_data$sample_size_exp),
ilab.xpos=c(-3.9,-3.6,-3.3,-2.8,-2.2,-1.7),
slab=ma_data$ID_plot,
mlab = mlabfun("Overall RE Model", main_3L, main_3L.I2)) # Adds Q, Qp, I² and sigma² values.
abline(h = rows[c(1,diff(rows)) == 2] - 1, lty="dotted")
# adds a second polygon with robust estimates for standard error
addpoly(coeftest.main_3L$beta, sei = coeftest.main_3L$SE,
rows = -2.5,
cex = 0.5,
mlab = "Robust RE Model estimate",
col = "darkred")
par(cex=0.5, font=2)
# text(c(-4,-3.7,-3.2,-2.5, -2), 150.5, pos=3, c("Target", "Measure","Task","Cognitive Domain", "N"))
text(c(-3.9,-3.6,-3.3,-2.8,-2.2,-1.7), 150.5, pos=3, c("Setup", "Target", "Measure","Task","Cognitive Domain", "N"))
dev.off()
Specifically, I need to know how to "make space" for the additional rows and polygons.
Also, is there an option in the forest() function to display only the pooled effects of the subgroups and the main effect, but not the individual effect sizes? I know that this is possible in the meta package, but I have not found anything similar in metafor.
Any help is greatly appreciated!
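One way to think about "making space" is purely in terms of the rows vector: every position left empty in it is a slot where a subgroup polygon (or a subgroup header) could go. Below is a minimal sketch on toy data, not on the model from the question. The vectors id and subgroup, the assumption that the data are sorted by subgroup and then by study, and the choice of two extra rows per subgroup are all made up for illustration. The resulting positions would then go into the rows argument of forest() and addpoly(), with ylim enlarged to match.

```r
# Toy stand-ins (hypothetical): study ID and subgroup label per effect size,
# assumed to be sorted by subgroup and then by study
id       <- c(1, 1, 2, 3, 3, 3)
subgroup <- c("A", "A", "A", "B", "B", "B")

# Same logic as in the question: one empty row between studies
dd <- c(0, diff(id))
dd[dd > 0] <- 1
rows <- seq_along(id) + cumsum(dd)
rows
# 1 2 4 6 7 8

# Extra space: shift everything up by 2 rows, plus 2 more rows at each subgroup change
sg_change <- c(0, as.integer(subgroup[-1] != subgroup[-length(subgroup)]))
rows_sg <- rows + 2 + 2 * cumsum(sg_change)
rows_sg
# 3 4 6 10 11 12

# Free positions just below each subgroup, where a pooled-effect polygon could be drawn
poly_rows <- tapply(rows_sg, subgroup, min) - 1.5
poly_rows
#   A   B
# 1.5 8.5
```

The dotted separator lines from the question (the abline() call) would need the same shifted vector, since they are derived from rows.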
I am using R. I am trying to use the "lines" command in ggplot2 to show the predicted values vs. the actual values for a statistical model (ARIMA, time series). Yet, when I run the code, I can only see a line of one color.
I simulated some data in R and then tried to make plots that show actual vs predicted:
#set seed
set.seed(123)
#load libraries
library(xts)
library(stats)
#create data
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
final_data <- data.frame(date_decision_made, property_damages_in_dollars)
#aggregate
y.mon<-aggregate(property_damages_in_dollars~format(as.Date(date_decision_made),
format="%W-%y"),data=final_data, FUN=sum)
y.mon$week = y.mon$`format(as.Date(date_decision_made), format = "%W-%y")`
ts = ts(y.mon$property_damages_in_dollars, start = c(2014,1), frequency = 12)
#statistical model
fit = arima(ts, order = c(4, 1, 1))
Here were my attempts at plotting the graphs:
#first attempt at plotting (no second line?)
plot(fit$residuals, col="red")
lines(fitted(fit),col="blue")
#second attempt at plotting (no second line?)
par(mfrow = c(2,1),
oma = c(0,0,0,0),
mar = c(2,4,1,1))
plot(ts, main="as-is") # plot original sim
lines(fitted(fit), col = "red") # plot fitted values
legend("topleft", legend = c("original","fitted"), col = c("black","red"),lty = 1)
#third attempt (plot actual, predicted and 5 future values - here, the actual and future values show up, but not the predicted)
pred = predict(fit, n.ahead = 5)
ts.plot(ts, pred$pred, lty = c(1,3), col=c(5,2))
However, none of these seem to be working correctly. Could someone please tell me what I am doing wrong? (note: the computer I am using for my work does not have an internet connection or a usb port - it only has R with some preloaded packages. I do not have access to the forecast package.)
Thanks
Sources:
In R plot arima fitted model with the original series
R fitted ARIMA off by one timestep? pkg:Forecast
Plotting predicted values in ARIMA time series in R
You seem to be confusing a couple of things:
fitted usually does not work on an object returned by arima. Usually, you can load the forecast package first and then use fitted.
But since you do not have access to the forecast package, you cannot use fitted(fit): it always returns NULL. I have had problems with fitted before.
You want to compare the actual series (x) to the fitted series (y), yet in your first attempt you plot the residuals (e = x - y).
You say you are using ggplot2, but actually you are not.
So here is a small example of how to plot the actual series and the fitted series without ggplot2.
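set.seed(1)
x <- cumsum(rnorm(10))                     # actual series
y <- stats::arima(x, order = c(1, 0, 0))   # fitted AR(1) model (base R, no forecast package)
plot(x, col = "red", type = "l")           # actual values
lines(x - y$residuals, col = "blue")       # fitted values = actual - residuals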
I hope this answer helps you get back on track.
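The same idea extends to the third attempt in the question (actual, fitted and a few future values in one plot). The following is only a sketch on simulated data using base R's stats package; the series length, the AR(1) order and the colours are arbitrary choices for illustration, not the model from the question.

```r
set.seed(1)
x   <- ts(cumsum(rnorm(50)))                 # simulated "actual" series
fit <- stats::arima(x, order = c(1, 0, 0))   # AR(1), base R only

fitted_vals <- x - residuals(fit)            # fitted = actual - residuals
fc <- predict(fit, n.ahead = 5)              # 5-step-ahead forecasts

# Actual (black), fitted (blue) and forecast (red, dotted) in one plot
ts.plot(x, fitted_vals, fc$pred,
        gpars = list(col = c("black", "blue", "red"), lty = c(1, 1, 3)))
legend("topleft", legend = c("actual", "fitted", "forecast"),
       col = c("black", "blue", "red"), lty = c(1, 1, 3))
```

Because all three series are ts objects with the same frequency, ts.plot() puts them on a common time axis, so the forecast continues where the observed series ends.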
I am trying to visualize some data, and to do so I am using R's hist.
Below are my data:
jancoefabs <- as.numeric(as.vector(abs(Janmodelnorm$coef)))
jancoefabs
[1] 1.165610e+00 1.277929e-01 4.349831e-01 3.602961e-01 7.189458e+00
[6] 1.856908e-04 1.352052e-05 4.811291e-05 1.055744e-02 2.756525e-04
[11] 2.202706e-01 4.199914e-02 4.684091e-02 8.634340e-01 2.479175e-02
[16] 2.409628e-01 5.459076e-03 9.892580e-03 5.378456e-02
Now, as the more cunning of you might have guessed, these are the absolute values of some model's coefficients.
What I need is a histogram with the following axes:
x will be the number (count or length) of coefficients, which is 19 in total, along with their names.
y will show the value of each column (as breaks?), with ylim set according to the min and max of those values (or something similar).
Note that Janmodelnorm$coef simply produces the following
(Intercept) LON LAT ME RAT
1.165610e+00 -1.277929e-01 -4.349831e-01 -3.602961e-01 -7.189458e+00
DS DSA DSI DRNS DREW
-1.856908e-04 1.352052e-05 4.811291e-05 -1.055744e-02 -2.756525e-04
ASPNS ASPEW SI CUR W_180_270
-2.202706e-01 -4.199914e-02 4.684091e-02 -8.634340e-01 -2.479175e-02
W_0_360 W_90_180 W_0_180 NDVI
2.409628e-01 5.459076e-03 -9.892580e-03 -5.378456e-02
So far, after consulting ?hist, I have been playing with the code below without success, so I am starting again from scratch.
# hist(jancoefabs, col="lightblue", border="pink",
# breaks=8,
# xlim=c(0,10), ylim=c(20,-20), plot=TRUE)
When plot=FALSE is set, I get a bunch of somewhat useful info about the set. I also find it hard to use the breaks argument effectively.
Any suggestions will be appreciated. Thanks.
Rather than using hist, why not use a barplot or a standard plot? For example,
## Generate some data
set.seed(1)
y = rnorm(19, sd=5)
names(y) = c("Inter", LETTERS[1:18])
Then plot the coefficients
barplot(y)
Alternatively, you could use a scatter plot
plot(1:19, y, axes=FALSE, ylim=c(-10, 10))
axis(2)
axis(1, 1:19, names(y))
and add error bars to indicate the standard errors (see for example Add error bars to show standard deviation on a plot in R)
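As a rough sketch of the error-bar idea, here is one way to draw estimates with plus/minus one standard error in base R. It uses the built-in mtcars data and an arbitrary lm fit as a stand-in for the model in the question; the choice of predictors is purely illustrative.

```r
# Stand-in model (assumption): any fit with a coefficient table would do
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
tab <- coef(summary(fit))
est <- tab[, "Estimate"]
se  <- tab[, "Std. Error"]
pos <- seq_along(est)

# Point estimates with named x-axis ticks
plot(pos, est, pch = 19, axes = FALSE, xlab = "", ylab = "Estimate",
     ylim = range(est - se, est + se))
axis(2)
axis(1, at = pos, labels = names(est), las = 2)

# Vertical bars spanning estimate +/- one standard error
segments(pos, est - se, pos, est + se)
box()
```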
Are you sure you want a histogram for this? A lattice barchart might be pretty nice. An example with the mtcars built-in data set.
> coef <- lm(mpg ~ ., data = mtcars)$coef
> library(lattice)
> barchart(coef, col = 'lightblue', horizontal = FALSE,
ylim = range(coef), xlab = '',
scales = list(y = list(labels = coef),
x = list(labels = names(coef))))
A base R dotchart might be good too,
> dotchart(coef, pch = 19, xlab = 'value')
> text(coef, seq(coef), labels = round(coef, 3), pos = 2)
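Applied to the coefficients printed in the question, a named bar plot could look like the sketch below. The values are copied from the question's output; the log-scaled y-axis is my own suggestion (an assumption about what is useful here), since the absolute values span more than five orders of magnitude and most bars would be invisible on a linear scale.

```r
# Coefficients as printed in the question (Janmodelnorm$coef)
coefs <- c("(Intercept)" = 1.165610e+00, LON = -1.277929e-01, LAT = -4.349831e-01,
           ME = -3.602961e-01, RAT = -7.189458e+00, DS = -1.856908e-04,
           DSA = 1.352052e-05, DSI = 4.811291e-05, DRNS = -1.055744e-02,
           DREW = -2.756525e-04, ASPNS = -2.202706e-01, ASPEW = -4.199914e-02,
           SI = 4.684091e-02, CUR = -8.634340e-01, W_180_270 = -2.479175e-02,
           W_0_360 = 2.409628e-01, W_90_180 = 5.459076e-03, W_0_180 = -9.892580e-03,
           NDVI = -5.378456e-02)
jancoefabs <- abs(coefs)

op <- par(mar = c(7, 4, 2, 1))   # extra bottom margin for the rotated names
mids <- barplot(jancoefabs, log = "y", las = 2, cex.names = 0.7,
                col = "lightblue", ylab = "|coefficient| (log scale)")
par(op)
```

barplot() returns the bar midpoints (mids above), which can be reused to place value labels or error bars on top of the bars.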
I have created a model using the following data:
age hrs charges
530.6071 792.10 3474.60
408.6071 489.70 1247.06
108.0357 463.00 1697.07
106.6071 404.15 1676.33
669.4643 384.65 1701.13
556.4643 358.15 1630.30
665.4643 343.85 2468.83
508.4643 342.35 3366.44
106.0357 335.25 2876.82
interaction_model <- rlm( charges~age+hrs+age*hrs, age_vs_hrs_charges_cleaned);
Any idea how I can plot this in 3D?
I already plotted using
library(effects);
plot(effect(term="age:hrs", mod=interaction_model,default.levels=20),multiline=TRUE);
but this is not a very clear visualization.
Any help?
There are several ways to do this.
model <- lm( charges~age+hrs+age*hrs, df)
# set up grid of (x,y) values
age <- seq(0,1000, by=20)
hrs <- seq(0,1000, by=20)
gg <- expand.grid(age=age, hrs=hrs)
# prediction from the linear model
gg$charges <-predict(model,newdata=gg)
# contour plot
library(ggplot2)
library(colorRamps)
library(grDevices)
jet.colors <- colorRampPalette(matlab.like(9))
ggplot(gg, aes(x=age, y=hrs, z=charges))+
stat_contour(aes(color=..level..),binwidth=200, size=2)+
scale_color_gradientn(colours=jet.colors(8))
# 3D scatterplot
library(scatterplot3d)
scatterplot3d(gg$age, gg$hrs, gg$charges)
# interactive 3D scatterplot (just a screen shot here)
library(rgl)
plot3d(gg$age,gg$hrs,gg$charges)
# interactive 3D surface plot with shading (screen shot)
colorjet <- jet.colors(100)
open3d()
rgl.surface(x=age, z=hrs, y=0.05*gg$charges,
color=colorjet[ findInterval(gg$charges, seq(min(gg$charges), max(gg$charges), length=100))] )
axes3d()
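If none of the extra packages are available, a static surface can also be drawn with base graphics. The sketch below reshapes the grid predictions into a matrix and passes it to persp(). It uses the nine data rows from the question and lm (as in the code above, rather than rlm); the grid step and viewing angles are arbitrary choices for illustration.

```r
# Data as given in the question
df <- data.frame(
  age     = c(530.6071, 408.6071, 108.0357, 106.6071, 669.4643,
              556.4643, 665.4643, 508.4643, 106.0357),
  hrs     = c(792.10, 489.70, 463.00, 404.15, 384.65,
              358.15, 343.85, 342.35, 335.25),
  charges = c(3474.60, 1247.06, 1697.07, 1676.33, 1701.13,
              1630.30, 2468.83, 3366.44, 2876.82))
model <- lm(charges ~ age * hrs, data = df)

# Grid of (age, hrs) values; expand.grid varies age fastest
age <- seq(0, 1000, by = 50)
hrs <- seq(0, 1000, by = 50)
gg  <- expand.grid(age = age, hrs = hrs)

# Predictions as a matrix: rows index age, columns index hrs
z <- matrix(predict(model, newdata = gg), nrow = length(age), ncol = length(hrs))

# Static 3D surface with base graphics
persp(age, hrs, z, theta = 35, phi = 25, ticktype = "detailed",
      xlab = "age", ylab = "hrs", zlab = "charges", col = "lightblue")
```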
A little while ago I wrote a couple of functions to display the results of a (general) linear model, together with colour coded data points, in either 3D (interactive, using rgl) or 2D (using a contour plot) :
# plot predictions of a (general) linear model as a function of two explanatory variables as an image / contour plot
# together with the actual data points
# mean value is used for any other variables in the model
plotImage=function(model=NULL,plotx=NULL,ploty=NULL,plotPoints=T,plotContours=T,plotLegend=F,npp=1000,xlab=NULL,ylab=NULL,zlab=NULL,xlim=NULL,ylim=NULL,pch=16,cex=1.2,lwd=0.1,col.palette=NULL) {
n=npp
require(rockchalk)
require(aqfig)
require(colorRamps)
require(colorspace)
require(MASS)
mf=model.frame(model);emf=rockchalk::model.data(model)
if (is.null(xlab)) xlab=plotx
if (is.null(ylab)) ylab=ploty
if (is.null(zlab)) zlab=names(mf)[[1]]
if (is.null(col.palette)) col.palette=rev(rainbow_hcl(1000,c=100))
x=emf[,plotx];y=emf[,ploty];z=mf[,1]
if (is.null(xlim)) xlim=c(min(x)*0.95,max(x)*1.05)
if (is.null(ylim)) ylim=c(min(y)*0.95,max(y)*1.05)
preds=predictOMatic(model,predVals=c(plotx,ploty),n=npp,divider="seq")
zpred=matrix(preds[,"fit"],npp,npp)
zlim=c(min(c(preds$fit,z)),max(c(preds$fit,z)))
par(mai=c(1.2,1.2,0.5,1.2),fin=c(6.5,6))
graphics::image(x=seq(xlim[1],xlim[2],len=npp),y=seq(ylim[1],ylim[2],len=npp),z=zpred,xlab=xlab,ylab=ylab,col=col.palette,useRaster=T,xaxs="i",yaxs="i")
if (plotContours) graphics::contour(x=seq(xlim[1],xlim[2],len=npp),y=seq(ylim[1],ylim[2],len=npp),z=zpred,xlab=xlab,ylab=ylab,add=T,method="edge")
if (plotPoints) {cols1=col.palette[(z-zlim[1])*999/diff(zlim)+1]
pch1=rep(pch,length(n))
cols2=adjustcolor(cols1,offset=c(-0.3,-0.3,-0.3,1))
pch2=pch-15
points(c(rbind(x,x)),c(rbind(y,y)), cex=cex,col=c(rbind(cols1,cols2)),pch=c(rbind(pch1,pch2)),lwd=lwd) }
box()
if (plotLegend) vertical.image.legend(zlim=zlim,col=col.palette) # TO DO: add z axis label, maybe make legend a bit smaller?
}
# plot predictions of a (general) linear model as a function of two explanatory variables as an interactive 3D plot
# mean value is used for any other variables in the model
plotPlaneFancy=function(model=NULL,plotx1=NULL,plotx2=NULL,plotPoints=T,plotDroplines=T,npp=50,x1lab=NULL,x2lab=NULL,ylab=NULL,x1lim=NULL,x2lim=NULL,cex=1.5,col.palette=NULL,segcol="black",segalpha=0.5,interval="none",confcol="lightgrey",confalpha=0.4,pointsalpha=1,lit=T,outfile="graph.png",aspect=c(1,1,0.3),zoom=1,userMatrix=matrix(c(0.80,-0.60,0.022,0,0.23,0.34,0.91,0,-0.55,-0.72,0.41,0,0,0,0,1),ncol=4,byrow=T),windowRect=c(0,29,1920,1032)) { # or library(colorRamps);col.palette <- matlab.like(1000)
require(rockchalk)
require(rgl)
require(colorRamps)
require(colorspace)
require(MASS)
mf=model.frame(model);emf=rockchalk::model.data(model)
if (is.null(x1lab)) x1lab=plotx1
if (is.null(x2lab)) x2lab=plotx2
if (is.null(ylab)) ylab=names(mf)[[1]]
if (is.null(col.palette)) col.palette=rev(rainbow_hcl(1000,c=100))
x1=emf[,plotx1]
x2=emf[,plotx2]
y=mf[,1]
if (is.null(x1lim)) x1lim=c(min(x1),max(x1))
if (is.null(x2lim)) x2lim=c(min(x2),max(x2))
preds=predictOMatic(model,predVals=c(plotx1,plotx2),n=npp,divider="seq",interval=interval)
ylim=c(min(c(preds$fit,y)),max(c(preds$fit,y)))
open3d(zoom=zoom,userMatrix=userMatrix,windowRect=windowRect)
if (plotPoints) plot3d(x=x1,y=x2,z=y,type="s",col=col.palette[(y-min(y))*999/diff(range(y))+1],size=cex,aspect=aspect,xlab=x1lab,ylab=x2lab,zlab=ylab,lit=lit,alpha=pointsalpha)
if (!plotPoints) plot3d(x=x1,y=x2,z=y,type="n",col=col.palette[(y-min(y))*999/diff(range(y))+1],size=cex,aspect=aspect,xlab=x1lab,ylab=x2lab,zlab=ylab)
if ("lwr" %in% names(preds)) persp3d(x=unique(preds[,plotx1]),y=unique(preds[,plotx2]),z=matrix(preds[,"lwr"],npp,npp),color=confcol, alpha=confalpha, lit=lit, back="lines",add=TRUE)
ypred=matrix(preds[,"fit"],npp,npp)
cols=col.palette[(ypred-min(ypred))*999/diff(range(ypred))+1]
persp3d(x=unique(preds[,plotx1]),y=unique(preds[,plotx2]),z=ypred,color=cols, alpha=0.7, lit=lit, back="lines",add=TRUE)
if ("upr" %in% names(preds)) persp3d(x=unique(preds[,plotx1]),y=unique(preds[,plotx2]),z=matrix(preds[,"upr"],npp,npp),color=confcol, alpha=confalpha, lit=lit, back="lines",add=TRUE)
if (plotDroplines) segments3d(x=rep(x1,each=2),y=rep(x2,each=2),z=matrix(t(cbind(y,fitted(model))),nc=1),col=segcol,lty=2,alpha=segalpha)
if (!is.null(outfile)) rgl.snapshot(outfile, fmt="png", top=TRUE)
}
Here is what you get as output with your model:
data=data.frame(age=c(530.6071,408.6071,108.0357,106.6071,669.4643,556.4643,665.4643,508.4643,106.0357),
hrs=c(792.10,489.70,463.00,404.15,384.65,358.15,343.85,342.35,335.25),
charges=c(3474.60,1247.06,1697.07,1676.33,1701.13,1630.30,2468.83,3366.44,2876.82))
library(MASS)
fit1=rlm( charges~age+hrs+age*hrs, data)
plotPlaneFancy(fit1, plotx1 = "age", plotx2 = "hrs")
plotPlaneFancy(fit1, plotx1 = "age", plotx2 = "hrs",interval="confidence")
(or interval="prediction" to show 95% prediction intervals)
plotImage(fit1,plotx="age",ploty="hrs",plotContours=T,plotLegend=T)