I'm using the svars package to generate some IRF plots. The plots are rendered using ggplot2, however I need some help with changing some of the aesthetics.
Is there any way I can change the fill and alpha of the shaded confidence bands, as well as the color of the solid line? I know in ggplot2 you can pass fill and alpha arguments to geom_ribbon (and col to geom_line), just unsure of how to do the same within the plot function of this package's source code.
# Load Dataset and packages
library(tidyverse)
library(svars)
data(USA)
# Create SVAR Model
var.model <- vars::VAR(USA, lag.max = 10, ic = "AIC" )
svar.model <- id.chol(var.model)
# Wild Bootstrap
cores <- parallel::detectCores() - 1
boot.svar <- wild.boot(svar.model, n.ahead = 30, nboot = 500, nc = cores)
# Plot the IRFs
plot(boot.svar)
I'm also looking at the command for a historical decomposition plot (see below). Is there any way I could omit the first two facets and plot only the bottom three lines on the same facet?
hist.decomp <- hd(svar.model, series = 1)
plot(hist.decomp)
Your first desired result is easily achieved by resetting the aes_params after calling plot. For your second goal, there is probably a way to manipulate the ggplot object, but my approach below instead constructs the plot from scratch: I copied the data-wrangling code from svars:::plot.hd and filtered the prepared dataset for the desired series:
# Plot the IRFs
p <- plot(boot.svar)
p$layers[[1]]$aes_params$fill <- "pink"
p$layers[[1]]$aes_params$alpha <- .5
p$layers[[2]]$aes_params$colour <- "green"
p
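To see why overwriting aes_params works, here is a minimal sketch on a toy ribbon-plus-line plot. The data frame `toy` and the object `p_toy` are made up for illustration; the assumption is that the svars IRF plot is layered the same way (ribbon first, line second), which is worth checking with the `layer_geoms` step before editing layers by index.

```r
library(ggplot2)

# Toy data standing in for one impulse response with a confidence band
toy <- data.frame(h = 0:10, irf = exp(-(0:10) / 3))
toy$lower <- toy$irf - 0.2
toy$upper <- toy$irf + 0.2

p_toy <- ggplot(toy, aes(x = h, y = irf)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.7) +
  geom_line()

# Check which geom sits in which layer before editing layers by index
layer_geoms <- vapply(p_toy$layers, function(l) class(l$geom)[1], character(1))
print(layer_geoms)

# Fixed (non-mapped) aesthetics are stored in aes_params and can be overwritten
p_toy$layers[[1]]$aes_params$fill <- "pink"
p_toy$layers[[1]]$aes_params$alpha <- 0.5
p_toy$layers[[2]]$aes_params$colour <- "green"
p_toy
```

If the layer order in your plot differs, use the printed geom names to pick the right indices.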
# Helper to convert to long dataframe. Source: svars:::plot.hd
hd2PlotData <- function(x) {
  PlotData <- as.data.frame(x$hidec)
  if (inherits(x$hidec, "ts")) {
    tsStructure <- attr(x$hidec, which = "tsp")
    PlotData$Index <- seq(from = tsStructure[1], to = tsStructure[2],
                          by = 1/tsStructure[3])
    # yearmon() comes from the zoo package
    PlotData$Index <- as.Date(zoo::yearmon(PlotData$Index))
  } else {
    PlotData$Index <- 1:nrow(PlotData)
    PlotData$V1 <- NULL
  }
  dat <- reshape2::melt(PlotData, id = "Index")
  dat
}
hist.decomp <- hd(svar.model, series = 1)
dat <- hd2PlotData(hist.decomp)
dat %>%
filter(grepl("^Cum", variable)) %>%
ggplot(aes(x = Index, y = value, color = variable)) +
geom_line() +
xlab("Time") +
theme_bw()
EDIT: One approach to change the facet labels is via a custom labeller function (a different approach changes the facet labels via the data):
my_labels <- LETTERS[1:9]
# Returns a labeller that replaces the facet labels, in panel order, with new_labels
my_labeller <- function(new_labels) {
  function(labels, multi_line = TRUE) {
    data.frame(variable = new_labels)
  }
}
p + facet_wrap(~variable, labeller = my_labeller(my_labels))
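As a hedged alternative to a hand-written labeller, ggplot2's as_labeller() builds one from a named character vector, so labels are matched by facet value rather than by position. The sketch below uses mtcars and made-up label text as stand-ins; for the IRF plot you would name the vector after the existing values of the facetting variable.

```r
library(ggplot2)

# Lookup table: names are the existing facet values, values are the new labels
new_labels <- c(`3` = "Three gears", `4` = "Four gears", `5` = "Five gears")

p_lab <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  facet_wrap(~gear, labeller = as_labeller(new_labels))
p_lab
```

Matching by name avoids mislabelled panels if the panel order ever changes.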
I'm having trouble displaying multiple graphs on the same page. I have a data frame with 18 numerical columns. For each column, I need to show its histogram and boxplot on the same page in a 4 x 9 grid. Below is what I tried, which only gives the histograms; I need to show the boxplots alongside them as well, through a for loop if possible. Can someone please help me do it?
library(gridExtra)
library(ggplot2)
p <- list()
for(i in 1:18){
x <- my_data[,i]
p[[i]] <- ggplot(gather(x), aes(value)) +
geom_histogram(bins = 10) +
facet_wrap(~key, scales = 'free_x')
}
do.call(grid.arrange,p)
I received the following graph.
When I try the following, I get the graphs on separate pages:
library(dplyr)
dat2 <- my_data %>% mutate_all(scale)
# Boxplot from the R trees dataset
boxplot(dat2, col = rainbow(ncol(dat2)))
par(mfrow = c(2, 2)) # Set up a 2 x 2 plotting space
# Create the loop.vector (all the columns)
loop.vector <- 1:4
p <- list()
for (i in loop.vector) { # Loop over loop.vector
# store data in column.i as x
x <- my_data[,i]
# Plot histogram of x
p[[i]] <-hist(x,
main = paste("Question", i),
xlab = "Scores",
xlim = c(0, 100))
plot_grid(p, label_size = 12)
}
You can assemble the base R boxplot and the ggplot object generated with facet_wrap together using the R package patchwork:
library(ggplot2)
library(patchwork)
p <- ggplot(mtcars, aes(x = mpg)) +
geom_histogram() +
facet_wrap(~gear)
wrap_elements(~boxplot(split(mtcars$mpg, mtcars$gear))) / p
ggsave('test.png', width = 6, height = 8, units = 'in')
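If you would rather stay entirely in ggplot2 and build the histogram/boxplot pairs in a loop, the following is a minimal sketch using gridExtra (which the question already loads). mtcars stands in for my_data, and only four columns are used to keep the example small; with 18 columns and ncol = 4 the same loop yields a 9-row page.

```r
library(ggplot2)
library(gridExtra)

# mtcars stands in for my_data; only the first four columns are used here
my_cols <- names(mtcars)[1:4]

plots <- list()
for (col in my_cols) {
  d <- data.frame(value = mtcars[[col]])
  hist_plot <- ggplot(d, aes(x = value)) +
    geom_histogram(bins = 10) +
    labs(title = col, x = NULL)
  box_plot <- ggplot(d, aes(x = "", y = value)) +
    geom_boxplot() +
    labs(title = col, x = NULL)
  # Append the pair so each histogram sits next to its boxplot
  plots <- c(plots, list(hist_plot, box_plot))
}

# Four panels per row, i.e. two variables (histogram + boxplot each) per row
page <- arrangeGrob(grobs = plots, ncol = 4)
grid::grid.newpage()
grid::grid.draw(page)
```

Building the page with arrangeGrob() first also makes it easy to save it, for example with ggsave("page.png", page).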
I am trying to add the trendline from an SMA (standardized major axis) fit to my ggplot. However, when I extract the coefficients from the SMA and give them to geom_abline(), the line extends over the entire plot instead of clipping to the data. The natural solution would be to use geom_segment() instead, manually calculating the endpoints of the line. However, when I do this the lines don't match each other and neither matches the SMA fit. What's going on here?
I am aware that you can use the plot function directly on an sma object but I would prefer to use ggplot
Note: this is my first time asking a question so my apologies if I'm missing something!
Edit: I am using a log-log axis, which I suspect may be part of the issue.
Reproducible version below:
library(tidyverse)
library(smatr) #for the SMA
# sample data set
x <- rlnorm(100, meanlog = 10)
var <- rlnorm(100, meanlog = 10)
df <- data.frame(x=x, y=x+var)
# fit using an SMA
sm <- sma(x~y, data = df, log = "xy")
# get sma coefficients into a data.frame
bb <- data.frame(coef(sm))
bb <- bb %>%
rownames_to_column(var = "Coef") %>%
pivot_wider(names_from = "Coef", values_from = "coef.sm.")
## calculate end coordinates for segment
bb$min_x <- min(df$x, na.rm = TRUE)
bb$max_x <- max(df$x, na.rm = TRUE)
bb <- bb %>%
mutate(min_y = (slope*min_x) + elevation) %>%
mutate(max_y = (slope*max_x) + elevation)
# plot into ggplot
p1 <- ggplot(df, aes(x=x, y=y)) +
geom_point(shape=21) +
scale_y_continuous(trans = 'log10')+
scale_x_continuous(trans = 'log10') +
geom_abline(data=bb,aes(intercept=elevation,slope=slope), color = "blue")
p1 + geom_segment(data=bb, aes(x=min_x, xend=max_x, y=min_y, yend=max_y), color = "orange")
#this is the plot from the smatr package for comparison
plot(sm)
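On the log-log suspicion: a minimal sketch, assuming the fitted coefficients live on the log10 scale (my understanding is that log = "xy" in sma() fits on log10-transformed data, but treat that as an assumption). In that case the segment endpoints have to be evaluated in log10 space and back-transformed, rather than computed as slope * min_x + elevation on the raw values. To keep the example self-contained, the SMA slope is computed by hand as sign(r) * sd(Y) / sd(X) instead of calling smatr.

```r
library(ggplot2)

set.seed(1)
x <- rlnorm(100, meanlog = 10)
y <- x + rlnorm(100, meanlog = 10)

# Hand-computed SMA on log10 data, a stand-in for coef(sm):
# slope = sign(r) * sd(Y) / sd(X), and the line passes through the means
lx <- log10(x)
ly <- log10(y)
slope <- sign(cor(lx, ly)) * sd(ly) / sd(lx)
elevation <- mean(ly) - slope * mean(lx)

# Evaluate the line in log10 space, then back-transform to data units
seg <- data.frame(x = range(x))
seg$y <- 10^(elevation + slope * log10(seg$x))

p_sma <- ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
  geom_point(shape = 21) +
  scale_x_log10() +
  scale_y_log10() +
  annotate("segment", x = seg$x[1], xend = seg$x[2],
           y = seg$y[1], yend = seg$y[2], colour = "orange")
p_sma
```

It may also be worth checking the formula direction: the plot puts y on the vertical axis, while sma(x ~ y, ...) treats x as the response.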
Facebook's Prophet in R (there's also a Python version) is used to generate time series forecasts.
A model m is created by:
m <- prophet(df)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)
plot(m, forecast)
Which returns a very nicely formatted graph, like:
I would like to change the line type, to get not dots but a usual thin line.
I had tried this
lines(m$history$y,lty=1)
but got an error
In doTryCatch(return(expr), name, parentenv, handler)
Are there any suggestions for how to convert those dots into a line?
The plot method for prophet objects uses ggplot2, so base R graphics functions like lines() won't work. You can use ggplot2::geom_line() to add lines, but at the moment I don't see an easy way to replace the points by lines ...
Example from ?prophet:
history <- data.frame(ds = seq(as.Date('2015-01-01'), as.Date('2016-01-01'), by = 'd'),
y = sin(1:366/200) + rnorm(366)/10)
m <- prophet(history)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)
pp <- plot(m,forecast)
Add lines:
library(ggplot2)
pp + geom_line()
This question provides a (hacky) way forward:
pp2 <- pp + geom_line()
qq2 <- ggplot_build(pp2)
qq2$data[[2]]$colour <- NA
plot(ggplot_gtable(qq2))
But obviously something went wrong with the hack. The better bet would be to look at the plot method (prophet:::plot.prophet) and modify it to behave as you want. Here is the bare-bones version:
df <- prophet:::df_for_plotting(m, forecast)
gg <- ggplot(df, aes(x = ds, y = y)) + labs(x = "ds", y = "y")
gg <- gg + geom_ribbon(ggplot2::aes(ymin = yhat_lower, ymax = yhat_upper),
                       alpha = 0.2, fill = "#0072B2", na.rm = TRUE)
## the first geom_point() of the original method is replaced with geom_line() in the next line ...
gg <- gg + geom_line(na.rm = TRUE) +
  geom_line(aes(y = yhat), color = "#0072B2", na.rm = TRUE) +
  theme(aspect.ratio = 3/5)
I may have stripped out some components that exist in your data/forecast, though ...
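Another option, shown here only as a sketch on a toy plot, is to drop the point layer from the finished ggplot object and add a line layer instead. The objects `toy` and `pp_toy` are made up; the assumption is that the plot returned by plot(m, forecast) contains a point layer for the history, as the geom_point() comment above suggests.

```r
library(ggplot2)

# Toy stand-in for the plot returned by plot(m, forecast)
toy <- data.frame(ds = 1:50, y = sin(1:50 / 5))
toy$yhat <- toy$y * 0.9
pp_toy <- ggplot(toy, aes(x = ds, y = y)) +
  geom_point(na.rm = TRUE) +
  geom_line(aes(y = yhat), colour = "#0072B2", na.rm = TRUE)

# Drop every point layer and keep the rest
is_point <- vapply(pp_toy$layers,
                   function(l) inherits(l$geom, "GeomPoint"),
                   logical(1))
pp_toy$layers <- pp_toy$layers[!is_point]

# Draw the history as a line instead of dots
pp_toy <- pp_toy + geom_line(na.rm = TRUE)
pp_toy
```

Note that the new line layer is added on top of the existing layers, so it is drawn last.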
It is possible to make such changes with dyplot.prophet(m, forecast) (the HTML version of the plot). First, we need to redefine the function like this:
dyplot.prophet <- function(x, fcst, uncertainty=TRUE,
...)
{
forecast.label='Predicted'
actual.label='Actual'
# create data.frame for plotting
df <- prophet:::df_for_plotting(x, fcst)
# build variables to include, or not, the uncertainty data
if(uncertainty && exists("yhat_lower", where = df))
{
colsToKeep <- c('y', 'yhat', 'yhat_lower', 'yhat_upper')
forecastCols <- c('yhat_lower', 'yhat', 'yhat_upper')
} else
{
colsToKeep <- c('y', 'yhat')
forecastCols <- c('yhat')
}
# convert to xts for easier date handling by dygraph
# select_() is deprecated in current dplyr; plain column subsetting does the same job
dfTS <- xts::xts(df[, colsToKeep, drop = FALSE], order.by = df$ds)
# base plot
dyBase <- dygraphs::dygraph(dfTS)
presAnnotation <- function(dygraph, x, text) {
dygraph %>%
dygraphs::dyAnnotation(x, text, text, attachAtBottom = TRUE)
}
dyBase <- dyBase %>%
# plot actual values
dygraphs::dySeries(
'y', label=actual.label, color='black',stepPlot = TRUE, strokeWidth=1
) %>%
# plot forecast and ribbon
dygraphs::dySeries(forecastCols, label=forecast.label, color='blue') %>%
# allow zooming
dygraphs::dyRangeSelector() %>%
# make unzoom button
dygraphs::dyUnzoom()
if (!is.null(x$holidays)) {
for (i in 1:nrow(x$holidays)) {
# make a gray line
dyBase <- dyBase %>% dygraphs::dyEvent(
x$holidays$ds[i],color = "rgb(200,200,200)", strokePattern = "solid")
dyBase <- dyBase %>% dygraphs::dyAnnotation(
x$holidays$ds[i], x$holidays$holiday[i], x$holidays$holiday[i],
attachAtBottom = TRUE)
}
}
return(dyBase)
}
Previously the 'y' series used strokeWidth=0; we changed it to strokeWidth=1 and added stepPlot = TRUE.
The full original code is here: https://rdrr.io/cran/prophet/src/R/plot.R
I am trying to create a plotly boxplot in R that doesn't show the outliers, and I found this example on the official plotly page:
https://plot.ly/ggplot2/box-plots/#outliers
library(plotly)
set.seed(123)
df <- diamonds[sample(1:nrow(diamonds), size = 1000),]
p <- ggplot(df, aes(cut, price, fill = cut)) +
geom_boxplot(outlier.shape = NA) +
ggtitle("Ignore outliers in ggplot2")
# Need to modify the plotly object and make outlier points have opacity equal to 0
p <- plotly_build(p)
p$data <- lapply(p$data, FUN = function(x){
x$marker = list(opacity = 0)
return(x)
})
# Create a shareable link to your chart
# Set up API credentials: https://plot.ly/r/getting-started
chart_link = plotly_POST(p, filename="geom_boxplot/outliers")
chart_link
The problem is that in their webpage and in my console, outliers are still being displayed.
Is this some kind of bug?
Seems like a typo. Maybe the example wasn't updated to account for some changes in the object structure. After calling p <- plotly_build(p), we observe that there is no p$data, but there is p$x$data. So, changing the lapply call to the following:
p$x$data <- lapply(p$x$data, FUN = function(x){
x$marker = list(opacity = 0)
return(x)
})
makes everything work as intended:
I'm running an R script that generates plots of a PCA using FactoMineR.
I'd like to output the coordinates for the generated PCA plots, but I'm having trouble finding the right ones. I found result1$ind$coord and result1$var$coord, but neither looks like the default plot.
I found
http://www.statistik.tuwien.ac.at/public/filz/students/seminar/ws1011/hoffmann_ausarbeitung.pdf
and
http://factominer.free.fr/classical-methods/principal-components-analysis.html
but neither describes the contents of the object created by PCA().
library(FactoMineR)
data1 <- read.table(file=args[1], sep='\t', header=T, row.names=1)
result1 <- PCA(data1,ncp = 4, graph=TRUE) # graphs generated automatically
plot(result1)
I found that $ind$coord[,1] and $ind$coord[,2] are the first two pca coords in the PCA object. Here's a worked example that includes a few other things you might want to do with the PCA output...
# Plotting the output of FactoMineR's PCA using ggplot2
#
# load libraries
library(FactoMineR)
library(ggplot2)
library(scales)
library(grid)
library(plyr)
library(gridExtra)
#
# start with a clean slate
rm(list=ls(all=TRUE))
#
# load example data
data(decathlon)
#
# compute PCA
res.pca <- PCA(decathlon, quanti.sup = 11:12, quali.sup=13, graph = FALSE)
#
# extract some parts for plotting
PC1 <- res.pca$ind$coord[,1]
PC2 <- res.pca$ind$coord[,2]
labs <- rownames(res.pca$ind$coord)
PCs <- data.frame(cbind(PC1,PC2))
rownames(PCs) <- labs
#
# Just showing the individual samples...
ggplot(PCs, aes(PC1,PC2, label=rownames(PCs))) +
geom_text()
# Now get supplementary categorical variables
cPC1 <- res.pca$quali.sup$coor[,1]
cPC2 <- res.pca$quali.sup$coor[,2]
clabs <- rownames(res.pca$quali.sup$coor)
cPCs <- data.frame(cbind(cPC1,cPC2))
rownames(cPCs) <- clabs
colnames(cPCs) <- colnames(PCs)
#
# Put samples and categorical variables (ie. grouping
# of samples) all together
p <- ggplot() + theme(aspect.ratio=1) + theme_bw(base_size = 20)
# no data so there's nothing to plot...
# add on data
p <- p + geom_text(data=PCs, aes(x=PC1,y=PC2,label=rownames(PCs)), size=4)
p <- p + geom_text(data=cPCs, aes(x=cPC1,y=cPC2,label=rownames(cPCs)),size=10)
p # show plot with both layers
# Now extract the variables
#
vPC1 <- res.pca$var$coord[,1]
vPC2 <- res.pca$var$coord[,2]
vlabs <- rownames(res.pca$var$coord)
vPCs <- data.frame(cbind(vPC1,vPC2))
rownames(vPCs) <- vlabs
colnames(vPCs) <- colnames(PCs)
#
# and plot them
#
pv <- ggplot() + theme(aspect.ratio=1) + theme_bw(base_size = 20)
# no data so there's nothing to plot
# put a faint circle there, as is customary
angle <- seq(-pi, pi, length = 50)
df <- data.frame(x = sin(angle), y = cos(angle))
pv <- pv + geom_path(aes(x, y), data = df, colour="grey70")
#
# add on arrows and variable labels
pv <- pv + geom_text(data=vPCs, aes(x=vPC1,y=vPC2,label=rownames(vPCs)), size=4) + xlab("PC1") + ylab("PC2")
pv <- pv + geom_segment(data=vPCs, aes(x = 0, y = 0, xend = vPC1*0.9, yend = vPC2*0.9), arrow = arrow(length = unit(1/2, 'picas')), color = "grey30")
pv # show plot
# Now put them side by side in a single image
#
grid.arrange(p,pv,nrow=1)
#
# Now they can be saved or exported...
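Since the original question was about outputting the coordinates, here is a short sketch of writing the first two individual coordinates to a CSV file. The file name and the use of tempdir() are arbitrary choices for illustration.

```r
library(FactoMineR)

data(decathlon)
res <- PCA(decathlon, quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

# Coordinates of the individuals on the first two dimensions
coords <- as.data.frame(res$ind$coord[, 1:2])
head(coords)

# Write them out, keeping the row names (athlete labels) as the first column
out_file <- file.path(tempdir(), "pca_ind_coords.csv")
write.csv(coords, out_file)
```

The same pattern applies to res$var$coord for the variables.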
Adding something extra to Ben's answer. You'll note in the first chart in Ben's response that the labels overlap somewhat. The pointLabel() function in the maptools package attempts to find locations for the labels without overlap. It's not perfect, but you can adjust the positions in the new dataframe (see below) to fine tune if you want. (Also, when you load maptools you get a note about gpclibPermit(). You can ignore it if you're concerned about the restricted licence). The first part of the script below is Ben's script.
# load libraries
library(FactoMineR)
library(ggplot2)
library(scales)
library(grid)
library(plyr)
library(gridExtra)
#
# start with a clean slate
# rm(list=ls(all=TRUE))
#
# load example data
data(decathlon)
#
# compute PCA
res.pca <- PCA(decathlon, quanti.sup = 11:12, quali.sup=13, graph = FALSE)
#
# extract some parts for plotting
PC1 <- res.pca$ind$coord[,1]
PC2 <- res.pca$ind$coord[,2]
labs <- rownames(res.pca$ind$coord)
PCs <- data.frame(cbind(PC1,PC2))
rownames(PCs) <- labs
#
# Now, the code to produce Ben's first chart but with less overlap of the labels.
library(maptools)
PCs$label=rownames(PCs)
# Base plot first for pointLabels() to get locations
plot(PCs$PC1, PCs$PC2, pch = 20, col = "red")
new = pointLabel(PCs$PC1, PCs$PC2, PCs$label, cex = .7)
new = as.data.frame(new)
new$label = PCs$label
# Then plot using ggplot2
(p = ggplot(data = PCs) +
geom_hline(yintercept = 0, linetype = 3, colour = "grey20") +
geom_vline(xintercept = 0, linetype = 3, colour = "grey20") +
geom_point(aes(PC1, PC2), shape = 20, col = "red") +
theme_bw())
(p = p + geom_text(data = new, aes(x, y, label = label), size = 3))
The result is:
An alternative is to use the biplot function from core R or biplot.psych from the psych package. This will put the components and the data onto the same figure.
For the decathlon data set, use principal and biplot from the psych package:
library(FactoMineR) #needed to get the example data
library(psych) #needed for principal
data(decathlon) #the data set
pc2 <- principal(decathlon[1:10],2) #just the first 10 columns
biplot(pc2,labels = rownames(decathlon),cex=.5, main="Biplot of Decathlon results")
#this is a call to biplot.psych which in turn calls biplot.
#adjust the cex parameter to change the type size of the labels.
This looks like:
A biplot: http://personality-project.org/r/images/olympic.biplot.pdf
Bill