I use following example code to plot an impulse response function:
# Load data and apply VAR
library("vars")
data(Canada)
data <- Canada
data <- data.frame(data[,1:2])
names(data)
var <- VAR(data, p=2, type = "both")
# Apply IRf
irf <- irf(var, impulse = "e", response = "prod", boot = T, cumulative = FALSE, n.ahead = 20)
str(irf)
plot(irf)
# Response
irf$irf
# Lower & Higher
irf$Lower
irf$Upper
#Create DataFrame and Plot
irf_df <- data.frame(irf$irf,irf$Lower,irf$Upper)
irf_df$T<-seq.int(nrow(irf_df)) #T
irf_df
plot(data.frame(irf_df$T, irf_df[1]), type="l", main="Impulse Response")
abline(h=0, col="blue", lty=2)
It looks like it works so far, though I sense that the code could be improved.
Would it be possible to add a confidence band for the Lower and Upper bounds of the confidence interval?
If you want to plot the Lower and Upper bands, you can use the lines() function, setting the y-limits of the plot if desired.
plot(irf_df$T, irf_df$prod, type="l", main="Impulse Response",
ylim = c(min(irf_df$prod.1), max(irf_df$prod.2)) * 1.1)
abline(h=0, col="blue", lty=2)
lines(irf_df$T, irf_df$prod.1, lty = 2)
lines(irf_df$T, irf_df$prod.2, lty = 2)
For a fancier plot with the confidence band filled in, use polygon. Here, we set up an empty plot, then plot the polygon, and finally overlay the line. Also note here that there's no need to set up a new data.frame: we can simply use values from the irf() output:
plot(irf$irf$e, type = "n", main = "Impulse Response",
ylim = c(min(irf$Lower$e), max(irf$Upper$e)))
polygon(x = c(seq_along(irf$irf$e), rev(seq_along(irf$irf$e))),
y = c(irf$Lower$e, rev(irf$Upper$e)),
lty = 0, col = "#fff7ec")
lines(irf$irf$e)
Output:
Related
I am planning to reproduce the attached figure, but I have no clue how to do so:
Let´s say I would be using the CO2 example dataset, and I would like to plot the relative change of the Uptake according to the Treatment. Instead of having the three variables in the example figure, I would like to show the different Plants grouped for each day/Type.
So far, I managed only to get this bit of code, but this is far away from what it should look like.
aov1 <- aov(CO2$uptake~CO2$Type+CO2$Treatment+CO2$Plant)
plot(TukeyHSD(aov1, conf.level=.95))
Axes should be switched, and I would like to add statistical significant changes indicated with letters or stars.
You can do this by building it in base R - this should get you started. See comments in code for each step, and I suggest running it line by line to see what's being done to customize for your specifications:
Set up data
# Run model
aov1 <- aov(CO2$uptake ~ CO2$Type + CO2$Treatment + CO2$Plant)
# Organize plot data
aov_plotdata <- data.frame(coef(aov1), confint(aov1))[-1,] # remove intercept
aov_plotdata$coef_label <- LETTERS[1:nrow(aov_plotdata)] # Example labels
Build plot
#set up plot elements
xvals <- 1:nrow(aov_plotdata)
yvals <- range(aov_plotdata[,2:3])
# Build plot
plot(x = range(xvals), y = yvals, type = 'n', axes = FALSE, xlab = '', ylab = '') # set up blank plot
points(x = xvals, y = aov_plotdata[,1], pch = 19, col = xvals) # add in point estimate
segments(x0 = xvals, y0 = aov_plotdata[,2], y1 = aov_plotdata[,3], lty = 1, col = xvals) # add in 95% CI lines
axis(1, at = xvals, label = aov_plotdata$coef_label) # add in x axis
axis(2, at = seq(floor(min(yvals)), ceiling(max(yvals)), 10)) # add in y axis
segments(x0=min(xvals), x1 = max(xvals), y0=0, lty = 2) #add in midline
legend(x = max(xvals)-2, y = max(yvals), aov_plotdata$coef_label, bty = "n", # add in legend
pch = 19,col = xvals, ncol = 2)
I need to overlay a normal distribution curve based on a dataset on a histogram of the same dataset.
I get the histogram and the normal curve right individually. But the curve just stays a flat line when combined to the histogram using the add = TRUE attribute in the curve function.
I did try adjusting the xlim and ylim to check if it works but am not getting the intended results, I am confused about how to set the (x and y) limits to suit both the histogram and the curve.
Any suggestions? My dataset is a set of values for 100 individuals daily walk distances ranging from min = 0.4km to max = 10km
bd.m <- read_excel('walking.xlsx')
hist(bd.m, ylim = c(0,10))
curve(dnorm(x, mean = mean(bd.m), sd = sd(bd.m)), add = TRUE, col = 'red')
You need to set freq = FALSEin the call to hist. For example:
dt <- rnorm(1000, 2)
hist(dt, freq = F)
curve(dnorm(x, mean = mean(dt), sd = sd(dt)), add = TRUE, col = 'red')
I have a plot in R that I am trying to replicate in ggplot2. I have the following code:
theta = seq(0,1,length=500)
post <- dgamma(theta,0.5, 1)
plot(theta, post, type = "l", xlab = expression(theta), ylab="density", lty=1, lwd=3)
I have tried to replicate this plot in ggplot2 and this is the closest I was able to get.
df=data_frame(post,theta)
ggplot(data=df,aes(x=theta))+
stat_function(fun=dgamma, args=list(shape=1, scale=.5))
You didn't match up your parameters correctly. The signature of dgamma is
dgamma(x, shape, rate = 1, scale = 1/rate, log = FALSE)
so when you call
dgamma(theta, 0.5, 1)
that's
dgamma(theta, shape=0.5, rate=1)
which means you would translate the ggplot as
ggplot(data=df,aes(x=theta))+
stat_function(fun=dgamma, args=list(shape=0.5, rate=1))
you could also adjust the y-limits if you like with scale_y_continuous(limits=c(0,12)) or something similar.
My GAM curves are being shifted downwards. Is there something wrong with the intercept? I'm using the same code as Introduction to statistical learning... Any help's appreciated..
Here's the code. I simulated some data (a straight line with noise), and fit GAM multiple times using bootstrap.
(It took me a while to figure out how to plot multiple GAM fits in one graph. Thanks to this post Sam's answer, and this post)
library(gam)
N = 1e2
set.seed(123)
dat = data.frame(x = 1:N,
y = seq(0, 5, length = N) + rnorm(N, mean = 0, sd = 2))
plot(dat$x, dat$y, xlim = c(1,100), ylim = c(-5,10))
gamFit = vector('list', 5)
for (ii in 1:5){
ind = sample(1:N, N, replace = T) #bootstrap
gamFit[[ii]] = gam(y ~ s(x, 10), data = dat, subset = ind)
par(new=T)
plot(gamFit[[ii]], col = 'blue',
xlim = c(1,100), ylim = c(-5,10),
axes = F, xlab='', ylab='')
}
The issue is with plot.gam. If you take a look at the help page (?plot.gam), there is a parameter called scale, which states:
a lower limit for the number of units covered by the limits on the ‘y’ for each plot. The default is scale=0, in which case each plot uses the range of the functions being plotted to create their ylim. By setting scale to be the maximum value of diff(ylim) for all the plots, then all subsequent plots will produced in the same vertical units. This is essential for comparing the importance of fitted terms in additive models.
This is an issue, since you are not using range of the function being plotted (i.e. the range of y is not -5 to 10). So what you need to do is change
plot(gamFit[[ii]], col = 'blue',
xlim = c(1,100), ylim = c(-5,10),
axes = F, xlab='', ylab='')
to
plot(gamFit[[ii]], col = 'blue',
scale = 15,
axes = F, xlab='', ylab='')
And you get:
Or you can just remove the xlim and ylim parameters from both calls to plot, and the automatic setting of plot to use the full range of the data will make everything work.
Given such data:
#Cutpoint SN (1-PPV)
5 0.56 0.01
7 0.78 0.19
9 0.91 0.58
How can I plot ROC curve with R that produce similar result like the
attached ?
I know ROCR package but it doesn't take such input.
If you just want to create the plot (without that silly interpolation spline between points) then just plot the data you give in the standard way, prepending a point at (0,0) and appending one at (1,1) to give the end points of the curve.
## your data with different labels
dat <- data.frame(cutpoint = c(5, 7, 9),
TPR = c(0.56, 0.78, 0.91),
FPR = c(0.01, 0.19, 0.58))
## plot version 1
op <- par(xaxs = "i", yaxs = "i")
plot(TPR ~ FPR, data = dat, xlim = c(0,1), ylim = c(0,1), type = "n")
with(dat, lines(c(0, FPR, 1), c(0, TPR, 1), type = "o", pch = 25, bg = "black"))
text(TPR ~ FPR, data = dat, pos = 3, labels = dat$cutpoint)
abline(0, 1)
par(op)
To explain the code: The first plot() call sets up the plotting region, without doing an plotting at all. Note that I force the plot to cover the range (0,1) in both axes. The par() call tells R to plot axes that cover the range of the data - the default extends them by 4 percent of the range on each axis.
The next line, with(dat, lines(....)) draws the ROC curve and here we prepend and append the points at (0,0) and (1,1) to give the full curve. Here I use type = "o" to give both points and lines overplotted, the points are represented by character 25 which allows it to be filled with a colour, here black.
Then I add labels to the points using text(....); the pos argument is used to position the label away from the actual plotting coordinates. I take the labels from the cutpoint object in the data frame.
The abline() call draws the 1:1 line (here the 0, and 1 mean an intercept of 0 and a slope of 1 respectively.
The final line resets the plotting parameters to the defaults we saved in op prior to plotting (in the first line).
The resulting plot looks like this:
It isn't an exact facsimile and I prefer the plot using the default for the axis ranges(adding 4 percent):
plot(TPR ~ FPR, data = dat, xlim = c(0,1), ylim = c(0,1), type = "n")
with(dat, lines(c(0, FPR, 1), c(0, TPR, 1), type = "o", pch = 25, bg = "black"))
text(TPR ~ FPR, data = dat, pos = 3, labels = dat$cutpoint)
abline(0, 1)
Again, not a true facsimile but close.