Consider the following script to plot an impulse response function:
library(vars)
Canada <- Canada * 999
var <- VAR(Canada, p = 2, type = "both")
plot(irf(var, impulse = "rw", response = "U", boot = T, cumulative = FALSE, n.ahead = 20))
plot(irf(var, impulse = "rw", response = "U", boot = T, cumulative = TRUE, n.ahead = 20))
How can I access the data behind the plot (including the 95% intervals)?
It would be great to produce a plot with a colour-filled confidence band, a green impulse response line and different axis labels. A solution with R's built-in plot features would be preferred over ggplot.
Thanks!
You can view the data returned by irf:
library("vars")
# generate some dummy data
df <- data.frame(n=rnorm(100), p=rpois(100, 2))
var <- VAR(df, p = 2, type = "both")
irf <- irf(var, impulse = "n", response = "p", boot = T,
cumulative = FALSE, n.ahead = 20)
# inspect coefficients object
str(irf)
All the data you need is accessible from here (e.g. check irf$Lower and irf$Upper).
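To make the structure concrete, here is a minimal sketch of pulling one impulse/response pair into a data frame. It assumes the layout that str(irf) shows: the elements irf, Lower and Upper each hold one matrix per impulse, with one row per horizon and one column per response. The object fake_irf is a hand-built stand-in with invented numbers so the snippet runs without fitting a model, and irf_to_df is a hypothetical helper, not part of vars.

```r
# Stand-in for the object returned by vars::irf(); numbers are invented
h <- 0:20
point <- matrix(0.3 * 0.8^h, ncol = 1, dimnames = list(NULL, "p"))
fake_irf <- list(
  irf   = list(n = point),
  Lower = list(n = point - 0.1),
  Upper = list(n = point + 0.1)
)

# Hypothetical helper: one impulse/response pair as a data frame
irf_to_df <- function(x, impulse, response) {
  data.frame(
    horizon = seq_len(nrow(x$irf[[impulse]])) - 1,  # row 1 is horizon 0
    irf     = x$irf[[impulse]][, response],
    lower   = x$Lower[[impulse]][, response],
    upper   = x$Upper[[impulse]][, response]
  )
}

irf_df <- irf_to_df(fake_irf, impulse = "n", response = "p")
head(irf_df)
```

With a real fit you would pass the object returned by irf() instead of fake_irf.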
One way to customise the default plot would be to look at the source of the function being called when you run plot(irf):
vars:::plot.varirf
In this case it's a bit involved but you can copy the body of this function and edit the code to change the colours, draw a filled polygon and edit the labels of the axes to get them exactly the way you want.
Updated:
Here's a starting point for the confidence bands:
# set up the base plot
plot(irf$irf$n, type="n", ylim = c(-.3, .5),
ylab = "Your label", xlab = "Another label")
abline(h=0)
# draw the filled polygon for confidence intervals
polygon(
c(1:length(irf$Upper$n), length(irf$Lower$n):1),
c(irf$Upper$n, rev(irf$Lower$n)),
col = "grey80", border = NA)
# add coefficient estimate line
lines(irf$irf$n, col = "darkgreen")
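The hard-coded ylim above may clip the band for other data. As a sketch, the same polygon idea can be wrapped in a small function that derives the limits from the band itself. plot_band and its argument names are made up for illustration, and the numbers below are invented stand-ins for irf$irf$n, irf$Lower$n and irf$Upper$n.

```r
# Hypothetical helper: shaded band plus point-estimate line in base graphics
plot_band <- function(est, lower, upper,
                      band_col = "grey80", line_col = "darkgreen",
                      xlab = "Horizon", ylab = "Response") {
  x <- seq_along(est) - 1                 # horizons 0, 1, 2, ...
  ylim <- range(lower, upper, 0)          # always include the zero line
  plot(x, est, type = "n", ylim = ylim, xlab = xlab, ylab = ylab)
  # walk along the upper bound, then back along the lower bound
  polygon(c(x, rev(x)), c(upper, rev(lower)), col = band_col, border = NA)
  abline(h = 0)
  lines(x, est, col = line_col, lwd = 2)
  invisible(ylim)
}

# Invented numbers in place of a real impulse response
est <- 0.3 * 0.8^(0:20)
used_ylim <- plot_band(est, lower = est - 0.1, upper = est + 0.1)
```

Drawing the polygon before the line keeps the estimate visible on top of the band.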
I had a similar problem, so I modeled it myself. I am not an advanced R user so maybe someone can put that into a function or so.
This method creates a plot of all IRFs, with a horizontal line at y = 0, the names of the impulses on the x-axis and the responses on the y-axis. The IRF plots are also size-adjusted.
"VAR_BS_9016_5VAR" is my "varest" object. I used 5 variables but this method can easily be shortened or expanded.
par(mfrow=c(5,5), oma = c(0,0,0,0) + 0.1, mar = c(5,5,0,0) + 0.1)
for (i in 1:5){
for (j in 1:5){
var_plot=irf(VAR_BS_9016_5VAR, impulse = paste(colnames(VAR_BS_9016_5VAR$y)[i]), response=paste(colnames(VAR_BS_9016_5VAR$y)[j]), n.ahead = 20, ortho=TRUE, boot=TRUE, runs=1000, ci=0.9)
plot(x=c(1:21), y=unlist(var_plot$Lower), type="l", lwd = 3, lty=2,col="red", ylab=paste(colnames(VAR_BS_9016_5VAR$y)[j]), xlab=paste(var_plot$impulse), ylim=range(c(unlist(var_plot$Lower),unlist(var_plot$Upper))) )
lines(x=c(1:21),y=unlist(var_plot$Upper),type="l",lwd = 3, lty=2,col="red")
lines(x=c(1:21),y=unlist(var_plot$irf),type="l", lwd = 3)
abline(h = 0)
}
}
Here is my solution for obtaining a data frame that can be used in ggplot when you have multiple impulses and multiple responses.
The pipe operator is available after library(dplyr), and the plot needs library(ggplot2). Be careful, since dplyr and MASS (a dependency of the vars package) have naming conflicts (e.g., for "select"):
getIRFPlotData <- function(impulse, response, list) {
  # list[[1]] = irf, list[[2]] = Lower, list[[3]] = Upper
  # select_() is deprecated; select(all_of()) is the current equivalent
  cbind.data.frame(
    Week = 0:(nrow(list[[1]][[1]]) - 1),
    Lower = list[[2]][names(list[[2]]) == impulse][[1]] %>% as.data.frame() %>% dplyr::select(dplyr::all_of(response)) %>% dplyr::pull(1),
    irf = list[[1]][names(list[[1]]) == impulse][[1]] %>% as.data.frame() %>% dplyr::select(dplyr::all_of(response)) %>% dplyr::pull(1),
    Upper = list[[3]][names(list[[3]]) == impulse][[1]] %>% as.data.frame() %>% dplyr::select(dplyr::all_of(response)) %>% dplyr::pull(1),
    Impulse = impulse,
    Response = response, stringsAsFactors = FALSE)
}
This returns a data.frame with the columns Week, Lower, irf, Upper, Impulse and Response. Stack the data frames for different impulse/response pairs with dplyr::bind_rows(), then use ggplot2::facet_wrap() or facet_grid() to produce charts similar to those drawn by vars:::plot.varirf(), while staying fully flexible to add layers and work with the data.
getIRFPlotData("Spendings", "Returns", irf4c) %>% ggplot(.) + geom_line(aes(Week, Lower), linetype="dashed") + geom_line(aes(Week, irf)) + geom_line(aes(Week, Upper),linetype="dashed") + geom_ribbon(aes(Week, ymin=Lower, ymax=Upper), alpha = 0.3) + theme_minimal()
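For the stacking step, here is a rough sketch in base R of building one long data frame over every impulse/response combination. It avoids the dplyr/MASS conflict mentioned above. fake_irf is an invented stand-in for a real irf() result (same irf/Lower/Upper layout), and the variable names are only borrowed from the example call.

```r
# Stand-in for a varirf object with two variables; values are invented
h <- 0:10
mk <- function(scale) cbind(Spendings = scale * 0.9^h, Returns = scale * 0.5^h)
fake_irf <- list(
  irf   = list(Spendings = mk(1.0), Returns = mk(0.5)),
  Lower = list(Spendings = mk(0.8), Returns = mk(0.3)),
  Upper = list(Spendings = mk(1.2), Returns = mk(0.7))
)

# Every impulse/response combination
pairs <- expand.grid(impulse  = names(fake_irf$irf),
                     response = colnames(fake_irf$irf[[1]]),
                     stringsAsFactors = FALSE)

# One data frame per pair, then stack them on top of each other
pieces <- Map(function(imp, resp) {
  data.frame(Week     = h,
             Lower    = fake_irf$Lower[[imp]][, resp],
             irf      = fake_irf$irf[[imp]][, resp],
             Upper    = fake_irf$Upper[[imp]][, resp],
             Impulse  = imp,
             Response = resp,
             stringsAsFactors = FALSE)
}, pairs$impulse, pairs$response)

irf_long <- do.call(rbind, unname(pieces))
rownames(irf_long) <- NULL
```

The result has the same columns as getIRFPlotData() returns, so the ggplot call above should work on it with facet_grid(Response ~ Impulse) added.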
I'm looking to replicate this correlation plot, or at least get as close as possible to it.
Specifically, I want:
the correlation values in the lower half, with values varying on a greyscale based on absolute value
the circles in the top half, with varying diameter and on the colour scale.
I want to be able to edit the axis scale labels so that full descriptions are on the y-axis, and numeric references on the x-axis
I have gotten relatively close, but have not managed precise enough replication. I describe my closest attempts below with reproducible code. The corrplot package has gotten me closest.
# general preparation
library(car)
correlations = cor(mtcars)
corrplot package
library(corrplot)
corrplot.mixed(correlations,
upper = "number", #upper.col = ???
lower = "circle", #lower.col = ???
tl.pos = "lt", tl.col = "black", tl.cex = 0.5)
Notes:
there is a way to make the coefficients in greyscale, but I don't understand it: https://rdrr.io/cran/corrplot/man/COL1.html
For some bizarre reason, when I use my own data (as opposed to mtcars), the coefficient colours don't match the actual correlation values. I cannot give a reproducible code example here, because it works fine with the mtcars data.
cormat package
source("http://www.sthda.com/upload/rquery_cormat.r")
rquery.cormat(mtcars)
ggcorrplot
library("ggcorrplot")
# circles separate
ggcorrplot(correlations,            # correlation matrix
           method = "circle",       # circles instead of squares
           type = "upper",          # show only upper triangle
           show.diag = FALSE,       # don't show diagonal values (1)
           lab = FALSE,             # don't show cor coeffs
           outline.col = "white",   # no outline of circles
           ggtheme = theme_bw,      # theme
           colors = c("#440154FF","#238A8DFF","#FDE725FF"))
# coefs separate
ggcorrplot(correlations,            # correlation matrix
           method = "circle",       # circles instead of squares
           type = "upper",          # show only upper triangle
           show.diag = FALSE,       # don't show diagonal values (1)
           lab = TRUE,              # show cor coeffs
           outline.col = NA,        # don't show circles
           ggtheme = theme_bw,      # theme
           colors = c("#440154FF","#238A8DFF","#FDE725FF"))
# can't combine both plots?
corrgram package
library(corrgram)
corrgram(correlations,
labels = indices_all,
lower.panel = "panel.fill",
upper.panel = "panel.cor")
Some other notes:
It seems the halves of the plots tend to run via the opposite diagonal than in the example plot, but I guess that's not a big concern.
Out-of-the-box options are quick and nice. However, when it comes to customizing, IMHO it may be worthwhile to build the plot from scratch using ggplot2. As a first step this involves some data wrangling to get your correlation matrix into the right shape. In this step I also convert the categories to factors and add a numeric id. Based on the ids I split the data into the values above and below the diagonal, which can then be plotted separately using geom_point and geom_text. Besides that, it's important to add drop = FALSE to the x and y scales to keep all factor levels in the right order. I also use some functions to get the desired axis labels:
EDIT: Following the suggestion by @AllanCameron I added coord_equal as the "final" touch to get a nice square, matrix-like look. And thanks to @RichtieSacramento the code now maps the absolute value to the size aesthetic.
library(dplyr)
library(tidyr)
library(ggplot2)
correlations = cor(mtcars)
levels <- colnames(mtcars)
corr_long <- correlations %>%
data.frame() %>%
mutate(row = factor(rownames(.), levels = levels),
rowid = as.numeric(row)) %>%
pivot_longer(-c(row, rowid), names_to = "col") %>%
mutate(col = factor(col, levels = levels),
colid = as.numeric(col))
ggplot(corr_long, aes(col, row)) +
geom_point(aes(size = abs(value), fill = value),
data = ~filter(.x, rowid > colid), shape = 21) +
geom_text(aes(label = scales::number(value, accuracy = .01), color = abs(value)),
data = ~filter(.x, rowid < colid), size = 8 / .pt) +
scale_x_discrete(labels = ~ attr(.x, "pos"), drop = FALSE) +
scale_y_discrete(labels = ~ paste0(.x, " (", attr(.x, "pos"), ")"), drop = FALSE) +
scale_fill_viridis_c(limits = c(-1, 1)) +
scale_color_gradient(low = grey(.8), high = grey(.2)) +
coord_equal() +
guides(size = "none", color = "none") +
theme(legend.position = "bottom",
panel.grid = element_blank(),
axis.ticks = element_blank()) +
labs(x = NULL, y = NULL, fill = NULL)
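The wrangling step is the part that carries the idea, so here is a sketch of it in base R only, in case the tidyverse pipeline is hard to follow. It reshapes the matrix to long format, splits it into the two triangles by comparing row and column positions, and derives a grey level from the absolute correlation. The names circles, numbers and text_col are made up here, and the grey mapping mirrors the low = grey(.8), high = grey(.2) scale used above.

```r
corr <- cor(mtcars)
var_names <- colnames(corr)

# Long format: one row per matrix cell, with numeric row/column positions
corr_long <- data.frame(
  row   = var_names[as.vector(row(corr))],
  col   = var_names[as.vector(col(corr))],
  rowid = as.vector(row(corr)),
  colid = as.vector(col(corr)),
  value = as.vector(corr)
)

# One triangle per layer: circles on one side, numbers on the other
circles <- corr_long[corr_long$rowid > corr_long$colid, ]
numbers <- corr_long[corr_long$rowid < corr_long$colid, ]

# Greyscale for the numbers: stronger correlation gives darker text
numbers$text_col <- grey(0.8 - 0.6 * abs(numbers$value))

head(numbers)
```

Because the diagonal has rowid equal to colid, it falls into neither subset and is left blank.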
I am trying to make an interaction plot in sjPlot showing percent probabilities of my outcome under two conditions of my predictor variable. Everything works perfectly, except the show.values = T and sort.est = T arguments, which don't seem to do anything. Is there a way to get this to work? Or, if not, how can I extract the data frame sjPlot is using to create this figure? I'm looking for some way to either label or tabulate the displayed probability values. Thank you!
Here is some example data and what I have so far:
set.seed(100)
dat <- data.frame(Species = rep(letters[1:10], each = 5),
threat_cat = rep(c("recreation", "climate", "pollution", "fire", "invasive_spp"), 10),
impact.pres = sample(0:1, size = 50, replace = T),
threat.pres = sample(0:1, size = 50, replace = T))
mod <- glm(impact.pres ~ 0 + threat_cat/threat.pres,
data = dat, family = "binomial")
library(sjPlot)
library(ggpubr)
plot_model(mod, type = "int",
title = "",
axis.title = c("Threat category", "Predicted probabilities of threat being observed"),
legend.title = "Threat predicted",
colors = c("#f2bf10",
"#4445ad"),
line.size = 2,
dot.size = 4,
sort.est = T,
show.values = T)+
coord_flip()+
theme_pubr(legend = "right", base_size = 30)
sjPlot produces a ggplot object, so you can examine the aesthetic mappings and underlying data. After a bit of digging around you will find the default mapping is already correct for the x, y placements of text labels, so all you need to do is add a geom_text to the plot, and only need to specify the labels as an aesthetic mapping. You can get the labels from a column called predicted stored in the ggplot object.
The upshot is that if you add the following layer to your plot:
geom_text(aes(label = scales::percent(predicted)),
position = position_dodge(width = 1), size = 8)
This labels each point with its predicted probability as a percentage.
Getting the labels in order is trickier. You have to fiddle with the internal components of the plot to do this. Suppose we store the above plot as p, then we can sort by the predicted percentages by doing:
p$data <- as.data.frame(p$data)
ord <- p$data$x[p$data$group == 1][order(p$data$predicted[p$data$group == 1])]
p$data$x <- match(p$data$x, ord)
p$scales$scales[[1]]$labels <- p$scales$scales[[1]]$labels[ord]
p
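If the goal is only to tabulate the probabilities, one alternative to digging through the plot object is to compute them from the model directly. This is a minimal sketch in base R. It assumes the values sjPlot draws correspond to predict(type = "response") evaluated at threat.pres = 0 and 1 for each category, which I have not checked against the package internals. grid and label are names made up here, and the Species column is dropped because the model does not use it.

```r
set.seed(100)
dat <- data.frame(
  threat_cat  = rep(c("recreation", "climate", "pollution", "fire", "invasive_spp"), 10),
  impact.pres = sample(0:1, size = 50, replace = TRUE),
  threat.pres = sample(0:1, size = 50, replace = TRUE)
)
mod <- glm(impact.pres ~ 0 + threat_cat/threat.pres,
           data = dat, family = "binomial")

# Every combination of category and predictor value
grid <- expand.grid(threat_cat  = sort(unique(dat$threat_cat)),
                    threat.pres = 0:1,
                    stringsAsFactors = FALSE)

# Predicted probabilities on the response scale, plus a printable label
grid$predicted <- predict(mod, newdata = grid, type = "response")
grid$label <- sprintf("%.1f%%", 100 * grid$predicted)

# Sorted table, lowest probability first within each predictor value
grid[order(grid$threat.pres, grid$predicted), ]
```

Comparing a few rows of this table with p$data$predicted is a quick way to check whether the assumption holds for your model.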
I use an R package, SetMethods, to get fsQCA results for panel data. The package uses the cluster.plot() function to generate a plot.
However, I am having a hard time getting the x-axis of the graph to show the number of units as tick marks. For example, I want it to show 10, 20, 30, ..., 140 on the x-axis so I can see how many units have a consistency score below a certain point.
Is there any way to add tick marks to a plot that is not generated by the plot() function? Thanks in advance.
Here I use the dataset in the package as an example.
install.packages("SetMethods")
library(SetMethods)
data("PAYF")
PS <- minimize(data = PAYF,
outcome = "HL",
conditions = c("HE","GG","AH","HI","HW"),
incl.cut = 0.9,
n.cut = 2,
include = "?",
details = TRUE,
show.cases = TRUE)
PS
# Perform cluster diagnostics:
CB <- cluster(data = PAYF,
results = PS,
outcome = "HL",
unit_id = "COUNTRY",
cluster_id = "REGION",
necessity=FALSE,
wicons = FALSE)
CB
# Plot pooled, between, and within consistencies:
cluster.plot(cluster.res = CB,
labs = TRUE,
size = 8,
angle = 6,
wicons = TRUE)
Finally, I get a graph as follows.
If you look inside the cluster.plot function definition (in RStudio, press F2 while the cursor is on it) you will see that it uses ggplot2 under the hood. However, it doesn't return the ggplot2 objects; it just prints them one after another. Because of this, it's not really possible to modify the output afterwards in any convenient manner.
But you can always copy the function code and rewrite it for your own need. The part that prints the final plot in your case is
CTw <- list()
ticklabw = unique(as.character(cluster.res$unit_ids))
xtickw <- seq(1, length(ticklabw), by = 1)
if (class(cluster.res) == "clusterminimize") {
for (i in 1:length(cluster.res$output)) {
CTw[[i]] <- cluster.res$output[[i]]$WICONS
dtw <- data.frame(x = xtickw, y = CTw[[i]])
dtw <- dtw[order(dtw$y), ]
dtw$xr <- reorder(dtw$x, 1 - dtw$y)
pw <- ggplot(dtw, aes(y = dtw[, 2], x = dtw[,
3])) + geom_point() + ylim(0, 1) + theme_classic(base_size = 16) +
geom_hline(yintercept = cluster.res$output[[i]]$POCOS) +
labs(title = names(cluster.res$output[i]),
x = "Units", y = "Consistency") + theme(axis.text.x = element_blank())
suppressWarnings(print(pw))
}
}
You can modify the ggplot2 construction part to something like this (packages ggplot2 and dplyr need to be loaded):
pw <-
dtw %>%
mutate(x_ind = as.numeric(xr)) %>%
ggplot(aes(x_ind, y)) +
geom_point() +
ylim(0, 1) +
theme_classic(base_size = 16) +
geom_hline(yintercept = cluster.res$output[[i]]$POCOS) +
scale_x_continuous(breaks = seq(from = 0, to = 140, by = 10)) +
labs(title = names(cluster.res$output[i]),
x = "Units", y = "Consistency")
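What makes the numeric ticks meaningful is that x_ind is each unit's position when units are sorted from highest to lowest consistency. Here is a sketch of that idea in base graphics, with simulated scores standing in for the WICONS values (the data, the 0.9 reference line and the variable names are all invented for illustration):

```r
set.seed(1)
# Invented within-unit consistency scores for 140 hypothetical units
scores <- runif(140, min = 0.4, max = 1)

# Position of each unit when sorted from highest to lowest consistency,
# i.e. the ordering that reorder(x, 1 - y) gives in the package code
x_ind <- rank(-scores, ties.method = "first")

plot(x_ind, scores, ylim = c(0, 1), xaxt = "n", pch = 16,
     xlab = "Units", ylab = "Consistency")
axis(1, at = seq(10, 140, by = 10))   # a tick every 10 units
abline(h = 0.9)                       # hypothetical reference line

# The count of units at or above the line matches where the points cross it
n_above <- sum(scores >= 0.9)
n_above
```

The number of units below the line is then the total minus the x position of the crossing.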
I'm trying to plot multiple simple Random Walks in R, but am having problems doing so.
Please be aware that by simple random walk I mean the sum of random variables that can take either the value -1 or 1, each with the same probability, not a random walk based on white noise (see the definition at https://en.wikipedia.org/wiki/Random_walk#One-dimensional_random_walk).
I use the following code to plot the Random Walks:
set.seed(1)
n <- 200
Random_Walk <- cumsum(sample(c(-1, 1), n, TRUE))
Random_Walk_2 <- cumsum(sample(c(-1, 1), n, TRUE))
ts.plot(Random_Walk, gpars = list(xlab = "Length of Random Walk", ylab = "Distance from origin", lty = 1))
This code works fine, but once I try to plot both random walks in the same graph it breaks.
Can someone explain how I could plot both of them, or even multiple random walks, in one graph?
Additionally, I was wondering whether there are tools that could give me the variance or the standard deviation of all those random walks.
Thank you all in advance!!
This is a possible solution in base R:
# draw the first walk, then add the second to the same axes
plot(Random_Walk, type = "l", xlim = c(0, 200),
     ylim = range(Random_Walk, Random_Walk_2),
     col = "blue", xlab = "n", ylab = "Rw")
lines(Random_Walk_2, col = "red")
This is a possible solution with ggplot2:
library(ggplot2)
df_rw <- data.frame(n = 1:200, r1 = Random_Walk, r2 = Random_Walk_2)
ggplot(df_rw) +
geom_line(aes(n, r1), col = "blue") +
geom_line(aes(n, r2), col = "red") +
labs(x = "n", y = "Rw")
This is another possible solution with ggplot2:
library(ggplot2)
df_rw2 <- data.frame(n = c(1:200, 1:200),
rw = c(Random_Walk, Random_Walk_2),
lab = rep(c("Random Walk 1", "Random Walk 2"), each = 200))
ggplot(df_rw2) +
geom_line(aes(x = n, y = rw, color = lab)) +
scale_color_manual(values = c("red", "blue"))
Here is a simple base R solution with the many times forgotten function matplot.
RW <- cbind(Random_Walk, Random_Walk_2)
matplot(RW, type = "l", lty = "solid")
A ggplot2 solution could be the following. But the data format should be the long format and the data is in wide format. See this post on how to reshape the data from wide to long format.
library(tidyverse)
as.data.frame(RW) %>%
mutate(x = row_number()) %>%
pivot_longer(-x) %>%
ggplot(aes(x, value, color = name)) +
geom_line()
As for the first two moments of your random walk, see this post on Mathematics Stack Exchange.
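For many walks plus an empirical look at the variance, a sketch along these lines may help. It simulates the walks as columns of a matrix, plots them with matplot, and computes the variance across walks at each step. For a symmetric +1/-1 walk the theoretical variance after t steps is t, so the standard deviation is sqrt(t), which is overlaid in red. The number of walks is an arbitrary choice.

```r
set.seed(1)
n_steps <- 200
n_walks <- 50

# Each column is one simple random walk built from +1/-1 steps
walks <- replicate(n_walks,
                   cumsum(sample(c(-1, 1), n_steps, replace = TRUE)))

matplot(walks, type = "l", lty = "solid", col = "grey40",
        xlab = "Length of random walk", ylab = "Distance from origin")

# Empirical variance and standard deviation across walks at each step
step_var <- apply(walks, 1, var)
step_sd  <- sqrt(step_var)

# Theory: Var(S_t) = t, so the standard deviation grows like sqrt(t)
steps <- seq_len(n_steps)
lines(steps,  sqrt(steps), col = "red", lwd = 2)
lines(steps, -sqrt(steps), col = "red", lwd = 2)
```

With only 50 walks the empirical step_var will scatter around the line y = t; increasing n_walks should tighten it.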
I am really confused. I would like to change the axis labels of a plot (classification or uncertainty) for an 'Mclust' model object in R, and I don't understand why it works for a simple object with just two variables but not for one with several.
Here is an example:
require(mclust)
mod1 = Mclust(iris[,1:2])
plot(mod1, what = "uncertainty", dimens = c(1,2), xlab = "test")
# changed x-axis-label
mod2 = Mclust(iris[,1:4])
plot(mod2, what = "uncertainty", dimens = c(1,2), xlab = "test")
# no changed x-axis-label
Another way I tried was with coordProj:
coordProj(data= iris[, -5], dimens = c(1,2), parameters = mod2$parameters,
z = mod2$z, what = "uncertainty", xlab = "test")
# Error in plot.default(data[, 1], data[, 2], pch = 19, main = "", xlab = xlab, :
# formal argument "xlab" matched by multiple actual arguments
So I thought, maybe it will work with ggplot2 (and that would be my favourite option). Now I can change the axis labels and so on but I don't know how to plot the ellipses?
require(ggplot2)
ggplot(data = iris) +
geom_point(aes(x = Sepal.Length, y = Sepal.Width, size = mod2$uncertainty)) +
scale_x_continuous(name = "test")
It would be nice, if someone might know a solution to change the axis labels in plot.Mclust or to add the ellipses to ggplot.
Thanks a lot!
I started to look at the code for plot.Mclust, but then I just used stat_ellipse and changed the level until the plots looked the same. It appears to be a joint t-distribution (the default) at 50% confidence (instead of the default 95%). There's probably a better way to do it using the actual covariance matrix (mod2$parameters$variance$sigma), but this gets you to where you want.
require(dplyr)
iris %>%
mutate(uncertainty = mod2$uncertainty,
classification = factor(mod2$classification)) %>%
ggplot(aes(Sepal.Length, Sepal.Width, size = uncertainty, colour = classification)) +
geom_point() +
guides(size = "none", colour = "none") + theme_classic() +
stat_ellipse(level = 0.5, type = "t") +
labs(x = "Label X", y = "Label Y")
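As a sketch of the covariance-based route mentioned above: given a 2-d mean and covariance matrix, the points on a Gaussian contour can be computed directly and then drawn with lines() or geom_path(). ellipse_points is a hypothetical helper, and the mean and covariance below are invented. With the fitted model they would presumably come from mod2$parameters$mean and mod2$parameters$variance$sigma, subset to the two plotted dimensions and one component. The level is also an assumption and is not necessarily the contour plot.Mclust draws.

```r
# Hypothetical helper: points x on the ellipse
# (x - mu)' Sigma^-1 (x - mu) = qchisq(level, df = 2)
ellipse_points <- function(mu, sigma, level = 0.5, n = 100) {
  theta  <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))       # unit circle, one point per row
  radius <- sqrt(qchisq(level, df = 2))
  # chol(sigma) returns upper-triangular R with t(R) %*% R == sigma
  sweep(radius * circle %*% chol(sigma), 2, mu, "+")
}

# Invented mean and covariance for one cluster
mu    <- c(5.8, 3.0)
sigma <- matrix(c(0.40, 0.10,
                  0.10, 0.20), nrow = 2)

ell <- ellipse_points(mu, sigma, level = 0.5)

plot(iris$Sepal.Length, iris$Sepal.Width,
     xlab = "Label X", ylab = "Label Y")
lines(ell, col = "red", lwd = 2)
```

In ggplot2 the same matrix could be turned into a data frame and added as a geom_path layer, one per mixture component.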