I am trying to create a plotly boxplot in R that doesn't show the outliers, and I found this example on the official plotly page:
https://plot.ly/ggplot2/box-plots/#outliers
library(plotly)
set.seed(123)
df <- diamonds[sample(1:nrow(diamonds), size = 1000),]
p <- ggplot(df, aes(cut, price, fill = cut)) +
geom_boxplot(outlier.shape = NA) +
ggtitle("Ignore outliers in ggplot2")
# Need to modify the plotly object and make outlier points have opacity equal to 0
p <- plotly_build(p)
p$data <- lapply(p$data, FUN = function(x){
x$marker = list(opacity = 0)
return(x)
})
# Create a shareable link to your chart
# Set up API credentials: https://plot.ly/r/getting-started
chart_link = plotly_POST(p, filename="geom_boxplot/outliers")
chart_link
The problem is that both on their webpage and in my own console, the outliers are still displayed.
Is this some kind of bug?
Seems like a typo. Maybe the example wasn't updated to account for some changes in the object structure. After calling p <- plotly_build(p), we observe that there is no p$data, but there is p$x$data. So, changing the lapply call to the following:
p$x$data <- lapply(p$x$data, FUN = function(x){
x$marker = list(opacity = 0)
return(x)
})
makes everything work as intended.
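A slightly more defensive variant of the same fix, as a sketch: it touches only box traces and sets just the opacity field, so other marker settings survive. It assumes a plotly version in which plotly_build() returns an object whose traces live in $x$data, as observed above; the name `built` is illustrative.

```r
library(ggplot2)  # provides ggplot() and the diamonds data
library(plotly)

set.seed(123)
df <- diamonds[sample(nrow(diamonds), size = 1000), ]

p <- ggplot(df, aes(cut, price, fill = cut)) +
  geom_boxplot(outlier.shape = NA)

# Convert to a plotly object; traces are assumed to sit in built$x$data
built <- plotly_build(p)

built$x$data <- lapply(built$x$data, function(trace) {
  # Only modify box traces, and only the opacity of their markers
  if (identical(trace$type, "box")) {
    trace$marker$opacity <- 0
  }
  trace
})

built
```

Setting trace$marker$opacity rather than replacing trace$marker with a fresh list keeps whatever else ggplotly() stored in the marker definition.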
I'm using the svars package to generate some IRF plots. The plots are rendered using ggplot2, however I need some help with changing some of the aesthetics.
Is there any way I can change the fill and alpha of the shaded confidence bands, as well as the color of the solid line? I know in ggplot2 you can pass fill and alpha arguments to geom_ribbon (and col to geom_line), just unsure of how to do the same within the plot function of this package's source code.
# Load Dataset and packages
library(tidyverse)
library(svars)
data(USA)
# Create SVAR Model
var.model <- vars::VAR(USA, lag.max = 10, ic = "AIC" )
svar.model <- id.chol(var.model)
# Wild Bootstrap
cores <- parallel::detectCores() - 1
boot.svar <- wild.boot(svar.model, n.ahead = 30, nboot = 500, nc = cores)
# Plot the IRFs
plot(boot.svar)
I'm also looking at the command for a historical decomposition plot (see below). Is there any way I could omit the first two facets and plot only the bottom three lines on the same facet?
hist.decomp <- hd(svar.model, series = 1)
plot(hist.decomp)
Your first goal is easily achieved by resetting the aes_params of the plot layers after calling plot. For your second goal, there is probably a way to manipulate the ggplot object, but my approach below instead constructs the plot from scratch: I copied the data-wrangling code from svars:::plot.hd and filtered the prepared dataset for the desired series:
# Plot the IRFs
p <- plot(boot.svar)
p$layers[[1]]$aes_params$fill <- "pink"
p$layers[[1]]$aes_params$alpha <- .5
p$layers[[2]]$aes_params$colour <- "green"
p
# Helper to convert to long dataframe. Source: svars:::plot.hd
hd2PlotData <- function(x) {
PlotData <- as.data.frame(x$hidec)
if (inherits(x$hidec, "ts")) {
tsStructure = attr(x$hidec, which = "tsp")
PlotData$Index <- seq(from = tsStructure[1], to = tsStructure[2],
by = 1/tsStructure[3])
PlotData$Index <- as.Date(zoo::yearmon(PlotData$Index))
}
else {
PlotData$Index <- 1:nrow(PlotData)
PlotData$V1 <- NULL
}
dat <- reshape2::melt(PlotData, id = "Index")
dat
}
hist.decomp <- hd(svar.model, series = 1)
dat <- hd2PlotData(hist.decomp)
dat %>%
filter(grepl("^Cum", variable)) %>%
ggplot(aes(x = Index, y = value, color = variable)) +
geom_line() +
xlab("Time") +
theme_bw()
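The layer-editing trick from the first half of this answer does not depend on svars. A minimal sketch on a toy ribbon-plus-line plot follows; the data and the layer order are made up, so inspect p$layers on the real plot to see which index holds which geom.

```r
library(ggplot2)

# Toy data standing in for an impulse response with a confidence band
toy <- data.frame(x = 1:10, y = (1:10) / 10)

p <- ggplot(toy, aes(x, y)) +
  geom_ribbon(aes(ymin = y - 0.2, ymax = y + 0.2), fill = "grey70", alpha = 0.3) +
  geom_line(colour = "black")

# Layer 1 is the ribbon, layer 2 is the line (order in which they were added)
p$layers[[1]]$aes_params$fill <- "pink"
p$layers[[1]]$aes_params$alpha <- 0.5
p$layers[[2]]$aes_params$colour <- "green"

p
```

This only works for aesthetics that were set as constants in the layer; aesthetics mapped inside aes() are changed through scales instead.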
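EDIT: One approach to changing the facet labels is a custom labeller function (a different approach changes the facet labels via the data):
my_labels <- LETTERS[1:9]
# Labeller factory: returns a labeller that replaces the facet labels with my_labels.
# This assumes the plot has exactly as many panels as there are labels (nine here).
my_labeller <- function(my_labels) {
function(labels, multi_line = TRUE) {
data.frame(variable = my_labels)
}
}
p + facet_wrap(~variable, labeller = my_labeller(my_labels))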
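For comparison, ggplot2's own as_labeller() builds a labeller from a named character vector that maps existing facet values to new labels. A small sketch on made-up data (the facet values a, b, c and the replacement labels are assumptions, not the svars panel names):

```r
library(ggplot2)

toy <- data.frame(
  x = rep(1:5, times = 3),
  y = c(1:5, 5:1, rep(3, 5)),
  variable = rep(c("a", "b", "c"), each = 5)
)

# Names are the existing facet values, values are the labels to display
new_labels <- c(a = "Shock A", b = "Shock B", c = "Shock C")

p_toy <- ggplot(toy, aes(x, y)) +
  geom_line() +
  facet_wrap(~variable, labeller = as_labeller(new_labels))

p_toy
```

Mapping by name is less fragile than relabelling by position, since it does not depend on the order in which the panels are drawn.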
Facebook's Prophet in R (there's also a Python version) is used to generate time series forecasts.
A model m is created by:
m <- prophet(df)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)
plot(m, forecast)
Which returns a very nicely formatted graph, like:
I would like to change the line type, to get not dots but a usual thin line.
I had tried this
lines(m$history$y,lty=1)
but got an error
In doTryCatch(return(expr), name, parentenv, handler)
Are there any suggestions for how to convert those dots into a line?
The plot method for prophet objects uses ggplot2, so base R graphics functions like lines() won't work. You can use ggplot2::geom_line() to add lines, but at the moment I don't see an easy way to replace the points by lines ...
Example from ?prophet:
history <- data.frame(ds = seq(as.Date('2015-01-01'), as.Date('2016-01-01'), by = 'd'),
y = sin(1:366/200) + rnorm(366)/10)
m <- prophet(history)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)
pp <- plot(m,forecast)
Add lines:
library(ggplot2)
pp + geom_line()
This question provides a (hacky) way forward:
pp2 <- pp + geom_line()
qq2 <- ggplot_build(pp2)
qq2$data[[2]]$colour <- NA
plot(ggplot_gtable(qq2))
But obviously something went wrong with the hack. A better bet is to look at the plot method (prophet:::plot.prophet) and modify it to behave as you want. Here is the bare-bones version:
df <- prophet:::df_for_plotting(m, forecast)
gg <- ggplot(df, aes(x = ds, y = y)) + labs(x = "ds", y = "y")
gg <- gg + geom_ribbon(ggplot2::aes(ymin = yhat_lower,
ymax = yhat_upper), alpha = 0.2, fill = "#0072B2",
na.rm = TRUE)
## replace first geom_point() with geom_line() in next line ...
gg <- gg + geom_line(na.rm = TRUE) + geom_line(aes(y = yhat),
color = "#0072B2", na.rm = TRUE) + theme(aspect.ratio = 3/5)
I may have stripped out some components that exist in your data/forecast, though ...
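If rebuilding the plot is not an option, another route is to drop the point layer from the returned ggplot object and add a line instead. The sketch below does this on a stand-in plot with made-up data; that the prophet plot contains a point layer is taken from the comment in the bare-bones code above, and the layer is located by class rather than by position because the position is an unknown here.

```r
library(ggplot2)

set.seed(1)
toy <- data.frame(ds = seq(as.Date("2015-01-01"), by = "day", length.out = 100))
toy$yhat <- sin(seq_len(100) / 20)
toy$y <- toy$yhat + rnorm(100) / 10

# Stand-in for plot(m, forecast): points for the history, a line for the fit
pp <- ggplot(toy, aes(x = ds, y = y)) +
  geom_point(na.rm = TRUE) +
  geom_line(aes(y = yhat), colour = "#0072B2", na.rm = TRUE)

# Identify point layers by the class of their geom and drop them
is_point <- vapply(pp$layers, function(l) inherits(l$geom, "GeomPoint"), logical(1))
pp$layers <- pp$layers[!is_point]

# Draw the history as a line instead of dots
pp_line <- pp + geom_line(na.rm = TRUE)
pp_line
```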
Such changes are also possible with dyplot.prophet(m, forecast), the HTML version of the plot. To do so, first redefine the function as follows:
dyplot.prophet <- function(x, fcst, uncertainty=TRUE,
...)
{
forecast.label='Predicted'
actual.label='Actual'
# create data.frame for plotting
df <- prophet:::df_for_plotting(x, fcst)
# build variables to include, or not, the uncertainty data
if(uncertainty && exists("yhat_lower", where = df))
{
colsToKeep <- c('y', 'yhat', 'yhat_lower', 'yhat_upper')
forecastCols <- c('yhat_lower', 'yhat', 'yhat_upper')
} else
{
colsToKeep <- c('y', 'yhat')
forecastCols <- c('yhat')
}
# convert to xts for easier date handling by dygraph
dfTS <- xts::xts(df %>% dplyr::select_(.dots=colsToKeep), order.by = df$ds)
# base plot
dyBase <- dygraphs::dygraph(dfTS)
presAnnotation <- function(dygraph, x, text) {
dygraph %>%
dygraphs::dyAnnotation(x, text, text, attachAtBottom = TRUE)
}
dyBase <- dyBase %>%
# plot actual values
dygraphs::dySeries(
'y', label=actual.label, color='black',stepPlot = TRUE, strokeWidth=1
) %>%
# plot forecast and ribbon
dygraphs::dySeries(forecastCols, label=forecast.label, color='blue') %>%
# allow zooming
dygraphs::dyRangeSelector() %>%
# make unzoom button
dygraphs::dyUnzoom()
if (!is.null(x$holidays)) {
for (i in 1:nrow(x$holidays)) {
# make a gray line
dyBase <- dyBase %>% dygraphs::dyEvent(
x$holidays$ds[i],color = "rgb(200,200,200)", strokePattern = "solid")
dyBase <- dyBase %>% dygraphs::dyAnnotation(
x$holidays$ds[i], x$holidays$holiday[i], x$holidays$holiday[i],
attachAtBottom = TRUE)
}
}
return(dyBase)
}
The original used strokeWidth=0 for the actual values; we changed it to strokeWidth=1 and added stepPlot = TRUE.
The full original source code is here: https://rdrr.io/cran/prophet/src/R/plot.R
One of my favorite tools for exploratory analysis is pairs(), however in the case of a limited number of discrete values, it falls flat as the dots all align perfectly. Consider the following:
y <- t(rmultinom(n=1000,size=4,prob=rep(.25,4)))
pairs(y)
It doesn't really give a good sense of correlation. Is there an alternative plot style that would?
If you change y to a data.frame you can add some 'jitter' and with the col option you can set the transparency level (the 4th number in rgb):
y <- data.frame(y)
pairs(sapply(y,jitter), col = rgb(0,0,0,.2))
Or you could use ggplot2's plotmatrix:
library(ggplot2)
plotmatrix(y) + geom_jitter(alpha = .2)
Edit: Since plotmatrix in ggplot2 is deprecated, use ggpairs (from the GGally package mentioned in @hadley's comment above):
library(GGally)
ggpairs(y, lower = list(params = c(alpha = .2, position = "jitter")))
Here is an example using corrplot:
library(corrplot)
M <- cor(y)
corrplot.mixed(M)
You can find more examples in the intro
http://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html
Here are a couple of options using ggplot2:
library(ggplot2)
## re-arrange data (copied from plotmatrix function)
prep.plot <- function(data) {
grid <- expand.grid(x = 1:ncol(data), y = 1:ncol(data))
grid <- subset(grid, x != y)
all <- do.call("rbind", lapply(1:nrow(grid), function(i) {
xcol <- grid[i, "x"]
ycol <- grid[i, "y"]
data.frame(xvar = names(data)[ycol], yvar = names(data)[xcol],
x = data[, xcol], y = data[, ycol], data)
}))
all$xvar <- factor(all$xvar, levels = names(data))
all$yvar <- factor(all$yvar, levels = names(data))
return(all)
}
dat <- prep.plot(data.frame(y))
## plot with transparent jittered points
ggplot(dat, aes(x = x, y=y)) +
geom_jitter(alpha=.125) +
facet_grid(xvar ~ yvar) +
theme_bw()
## plot with color representing density
ggplot(dat, aes(x = factor(x), y=factor(y))) +
geom_bin2d() +
facet_grid(xvar ~ yvar) +
theme_bw()
I don't have enough credits yet to comment on @Vincent's post. When running
library(GGally)
ggpairs(y, lower = list(params = c(alpha = .2, position = "jitter")))
I get
Error in stop_if_params_exist(obj$params) :
'params' is a deprecated argument. Please 'wrap' the function to supply arguments. help("wrap", package = "GGally")
Based on the indicated help page, it seems the call needs to be written like this:
ydf <- as.data.frame(y)
regularPlot <- ggpairs(ydf, lower = list(continuous = wrap(ggally_points, alpha = .2, position = "jitter")))
regularPlot
I am trying to make a plot with horizontal lines where the data2 and data3 points should be within the data1 range. This should give overlapping lines in different colors, but I am getting an error which says:
Error in strsplit(filename, "\\.") : non-character argument
Here is the data and code. Any suggestions would be appreciated.
data1 <- data.frame(Start=c(10),End=c(19))
data2 <- data.frame(Start=c(5),End=c(15))
data3 <- data.frame(Start=c(6),End=c(18))
filter_data2 <- data2[data2$Start >= (data1$Start-(data1$Start/2)) & data2$End <= (data1$End+(data1$End/2)), ]
filter_data3 <- data3[data3$Start >= (data1$Start-(data1$Start/2)) & data3$End <= (data1$End+(data1$End/2)), ]
data1 <- data.frame(rep(1,nrow(data1)),data1)
colnames(data1) <- c("ID","start","end")
data2 <- data.frame(rep(2,nrow(filter_data2)),filter_data2)
colnames(data2) <- c("ID","start","end")
data3 <- data.frame(rep(3,nrow(filter_data3)),filter_data3)
colnames(data3) <- c("ID","start","end")
dat1 <- rbind(data1,data2,data3)
pdf("overlap.pdf")
p <- ggplot(dat1, aes(x=(max(start)-max(start)/2), y = ID, colour=ID))
p <- p + geom_segment(aes(xend =(max(end)+max(end)/2), ystart = ID, yend = ID))
p <- p + scale_colour_brewer(palette = "Set1")
ggsave(p)
There are two problems in your code. First, if you want to use scale_colour_brewer(), the ID values should be converted to a factor:
p <- ggplot(dat1, aes(x=(max(start)-max(start)/2), y = ID, colour=as.factor(ID)))
Next, to save the ggplot2 plot you have two possibilities.
With ggsave(), you must provide a file name; the format is inferred from the file extension. In this case the pdf() call is unnecessary.
ggsave(filename = "plot.pdf", plot = p)
With pdf(), you should add print(p) and then dev.off(). In this case you don't need ggsave().
pdf("overlap.pdf")
print(p)
dev.off()
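Both points can be checked end to end with a small self-contained sketch. It is not the poster's exact plot: the data frame below is what the filtering code in the question produces, the segments are drawn from start to end (an assumption about the intended picture), and the files go to a temporary directory.

```r
library(ggplot2)

# The combined data the question's code ends up with
dat1 <- data.frame(ID = c(1, 2, 3), start = c(10, 5, 6), end = c(19, 15, 18))

# ID is converted to a factor so the discrete Brewer palette can be applied
p <- ggplot(dat1, aes(x = start, xend = end, y = ID, yend = ID,
                      colour = factor(ID))) +
  geom_segment() +
  scale_colour_brewer(palette = "Set1")

# Option 1: ggsave() with an explicit file name
out1 <- file.path(tempdir(), "overlap_ggsave.pdf")
ggsave(filename = out1, plot = p, width = 6, height = 4)

# Option 2: open a pdf device, print the plot, close the device
out2 <- file.path(tempdir(), "overlap_device.pdf")
pdf(out2, width = 6, height = 4)
print(p)
dev.off()
```

The original error arose because ggsave(p) passes the plot as the first argument, which ggsave() treats as the file name.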
I am trying to plot celestial objects on the sky (basically with coordinates equivalent to latitude/longitude). I successfully plotted all my points using the "aitoff" projection of the coord_map function, but in this case the grid is displayed badly: residual horizontal lines are still drawn for latitudes not equal to zero, along with their correct projections.
How could I remove these lines?
Here is code that reproduces the behavior:
library(ggplot2)
library(mapproj)
sky2 = data.frame(RA=0, Dec=0)
skyplot2 <- qplot(RA,Dec,data=sky2,xlim=c(0,360),ylim=c(-89.999,89.999),
xlab="R.A.(°)", ylab="Decl. (°)",main="Source repartition on the sky")
skyplot2 + coord_map(projection="aitoff",orientation=c(89.999,180,0)) +
scale_y_continuous(breaks=(-2:2)*30,limits=c(-89.999,89.999)) +
scale_x_continuous(breaks=(0:8)*45,limits=c(0,360),
labels=c("","","","","","","","",""))
This is definitely a bug in ggplot2, so could you please file it?
https://github.com/hadley/ggplot2/issues?state=open
Filed as a bug.
Here is a quick and dirty hack.
library(grid)  # provides polylineGrob() and gpar()
f <- function(x, y, ...) {
if (any(is.na(x))) {
id <- rle(!is.na(x))$length
id <- rep(seq_along(id), id)
df <- data.frame(x, y, id)
df <- df[order(df$id, df$x), ]
} else if (any(is.na(y))) {
id <- rle(!is.na(y))$length
id <- rep(seq_along(id), id)
df <- data.frame(x, y, id)
}
polylineGrob(df$x, df$y, id = df$id, gp = gpar(col = "white"))
}
skyplot2 <- qplot(RA,Dec,data=sky2,xlim=c(0,360),ylim=c(-89.999,89.999),
xlab="R.A.(°)", ylab="Decl. (°)",main="Source repartition on the sky")
skyplot2 + coord_map(projection="aitoff",orientation=c(89.999,180,0)) +
scale_y_continuous(breaks=(-2:2)*30,limits=c(-89.999,89.999)) +
scale_x_continuous(breaks=(0:8)*45,limits=c(0,360),
labels=c("","","","","","","","","")) +
opts(panel.grid.major = f)
Note that this may work only with the aitoff projection.
You just need to add:
+ opts(axis.ticks = theme_blank())
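opts() and theme_blank() were removed from later ggplot2 releases; in current versions the equivalent call is theme() with element_blank(). A small sketch on a stand-in plot (the single point at RA = 0, Dec = 0 mirrors sky2 from the question; the map projection is left out to keep the example minimal):

```r
library(ggplot2)

sky2 <- data.frame(RA = 0, Dec = 0)

skyplot_noticks <- ggplot(sky2, aes(RA, Dec)) +
  geom_point() +
  xlim(0, 360) +
  ylim(-89.999, 89.999) +
  # Modern replacement for opts(axis.ticks = theme_blank())
  theme(axis.ticks = element_blank())

skyplot_noticks
```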