Lag 0 is not plotted in GGCcf - r

With the following code I plotted the Cross Correlation of my data. All works wonderful, however the visualization does not depict Lag 0, which is highly important for my studies.
p= ggCcf(
df_ccf$Asia_Co,
df_ccf$EU_USA,
lag.max = 10,
type = c("correlation", "covariance"),
plot = TRUE,
na.action = na.contiguous)
plot(p)
The plot is looking like that:
Head of data:

I encountered the same issue; it might be an issue/bug with 'ggCcf' from the forecast library. I couldn't get ggCcf to work, no matter what I tried. Anyone who wants to reproduce this behaviour, try:
ggCcf(c(1,2,3,4),c(2,3,4,6))
The workaround is using regular/base R ccf:
max_lag = 10
result = ccf(series1, series2, lag.max = max_lag)
y = results$acf
x = c(-max_lag:max_lag)
You can use these two series to plot the ccf using ggplot2 and choosing an appropriate ylim.
The downside of this all is less conveniance, but the upside is that you can add some flair/styling to your plot now that you are doing everything yourself anyway ;).

Related

Why aren't any points showing up in the qqcomp function when using plotstyle="ggplot"?

I want to compare the fit of different distributions to my data in a single plot. The qqcomp function from the fitdistrplus package pretty much does exactly what I want to do. The only problem I have however, is that it's mostly written using base R plot and all my other plots are written in ggplot2. I basically just want to customize the qqcomp plots to look like they have been made in ggplot2.
From the documentation (https://www.rdocumentation.org/packages/fitdistrplus/versions/1.0-14/topics/graphcomp) I get that this is totally possible by setting plotstyle="ggplot". If I do this however, no points are showing up on the plot, even though it worked perfectly without the plotstyle argument. Here is a little example to visualize my problem:
library(fitdistrplus)
library(ggplot2)
set.seed(42)
vec <- rgamma(100, shape=2)
fit.norm <- fitdist(vec, "norm")
fit.gamma <- fitdist(vec, "gamma")
fit.weibull <- fitdist(vec, "weibull")
model.list <- list(fit.norm, fit.gamma, fit.weibull)
qqcomp(model.list)
This gives the following output:
While this:
qqcomp(model.list, plotstyle="ggplot")
gives the following output:
Why are the points not showing up? Am I doing something wrong here or is this a bug?
EDIT:
So I haven't figured out why this doesn't work, but there is a pretty easy workaround. The function call qqcomp(model.list, plotstyle="ggplot") still returns an ggplot object, which includes the data used to make the plot. Using that data one can easily write an own plot function that does exactly what one wants. It's not very elegant, but until someone finds out why it's not working as expected I will just use this method.
I was able to reproduce your error and indeed, it's really intriguing. Maybe, you should contact developpers of this package to mention this bug.
Otherwise, if you want to reproduce this qqplot using ggplot and stat_qq, passing the corresponding distribution function and the parameters associated (stored in $estimate):
library(ggplot2)
df = data.frame(vec)
ggplot(df, aes(sample = vec))+
stat_qq(distribution = qgamma, dparams = as.list(fit.gamma$estimate), color = "green")+
stat_qq(distribution = qnorm, dparams = as.list(fit.norm$estimate), color = "red")+
stat_qq(distribution = qweibull, dparams = as.list(fit.weibull$estimate), color = "blue")+
geom_abline(slope = 1, color = "black")+
labs(title = "Q-Q Plots", x = "Theoritical quantiles", y = "Empirical quantiles")
Hope it will help you.

Run points() after plot() on a dataframe

I'm new to R and want to plot specific points over an existing plot. I'm using the swiss data frame, which I visualize through the plot(swiss) function.
After this, want to add outliers given by the Mahalanobis distance:
mu_hat <- apply(swiss, 2, mean); sigma_hat <- cov(swiss)
mahalanobis_distance <- mahalanobis(swiss, mu_hat, sigma_hat)
outliers <- swiss[names(mahalanobis_distance[mahalanobis_distance > 10]),]
points(outliers, pch = 'x', col = 'red')
but this last line has no effect, as the outlier points aren't added to the previous plot. I see that if repeat this procedure on a pair of variables, say
plot(swiss[2:3])
points(outliers[2:3], pch = 'x', col = 'red')
the red points are added to the plot.
Ask: is there any restriction to how the points() function can be used for a multivariate data frame?
Here's a solution using GGally::ggpairs. It's a little ugly as we need to modify the ggally_points function to specify the desired color scheme.
I've assumed that mu_hat = colMeans(swiss) and sigma_hat = cov(swiss).
library(dplyr)
library(GGally)
swiss %>%
bind_cols(distance = mahalanobis(swiss, colMeans(swiss), cov(swiss))) %>%
mutate(is_outlier = ifelse(distance > 10, "yes", "no")) %>%
ggpairs(columns = 1:6,
mapping = aes(color = is_outlier),
upper = list(continuous = function(data, mapping, ...) {
ggally_points(data = data, mapping = mapping) +
scale_colour_manual(values = c("black", "red"))
}),
lower = list(continuous = function(data, mapping, ...) {
ggally_points(data = data, mapping = mapping) +
scale_colour_manual(values = c("black", "red"))
}),
axisLabels = "internal")
Unfortunately this isn't possible the way you're currently doing things. When plotting a data frame R produces many plots and aligns them. What you're actually seeing there is 6 by 6 = 36 individual plots which have all been aligned to look nice.
When you use the dots command, it tells it to place the dots on the current plot. Which doesn't really make sense when you have 36 plots, at least not the way you want it to.
ggplot is a really powerful tool in R, it provides far greater combustibility. For example you could set up the dataframe to include your outliers, but have them labelled as "outlier" and place it in each plot that you have set up as facets. The more you explore it you might find there are better plots which suit your needs as well.
Plotting a dataframe in base R is a good exploratory tool. You could set up those outliers as a separate dataframe and plot it, so you can see each of the 6 by 6 plots side by side and compare. It all depends on your goal. If you're goal is to produce exactly as you've described, the ggplot2 package will help you create something more professional. As #Gregor suggested in the comments, looking up the function ggpairs from the GGally package would be a good place to start.
A quick google image search shows some funky plots akin to what you're after and then some!
Find it here

Knn Regression in R

I am investigating Knn regression methods and later Kernel Smoothing.
I wish to demonstrate these methods using plots in R. I have generated a data set using the following code:
x = runif(100,0,pi)
e = rnorm(100,0,0.1)
y = sin(x)+e
I have been trying to follow a description of how to use "knn.reg" in 9.2 here:
https://daviddalpiaz.github.io/r4sl/k-nearest-neighbors.html#regression
grid2=data.frame(x)
knn10 = FNN::knn.reg(train = x, test = grid2, y = y, k = 10)
My predicted values seem reasonable to me but when I try to plot a line with them on top of my x~y plot I don't get what I'm hoping for.
plot(x,y)
lines(grid2$x,knn10$pred)
I feel like I'm missing something obvious and would really appreciate any help or advice you can offer, thank you for your time.
You just need to sort the x values before plotting the lines.
plot(x,y)
ORD = order(grid2$x)
lines(grid2$x[ORD],knn10$pred[ORD])

Using panel.mathdensity and panel.densityplot in lattice graphics to plot Bayesian prior and posterior

I am trying to plot a Bayesian prior and posterior distribution using lattice graphics. I would like to have both distributions in one panel, for direct comparison.
I've tried different solutions all day, including qqmath but I didn't get them to work. Here's the attempt that has been most successful so far:
# my data
d <- dgamma(seq(from=0.00001,to=0.01,by=0.00001),shape = .1, scale = .01)
# my plot
densityplot(~d,
plot.points=FALSE,
panel = function(x,...) {
panel.densityplot(x,...)
panel.mathdensity(
dmath = dgamma,
args = list(shape = .1, scale=.01)
)
}
)
Even though the code runs through nicely, it doesn't do what I want it to. It plots the posterior (d) but not the prior.
I added stop("foo") to densityplot(...) to stop execution if an error occurs and I searched online for the error message:
Error in eval(substitute(groups), data, environment(formula)) : foo
But there are only a few results and they seem unrelated to me.
So, here's my question: Can anyone help me with this approach to achieve what I want?
I asked a similar question which leads to the same result. I got an answer and it was useful. You can find everything here

Combine two plots created with effects package in R

I have the following Problem. After running an ordered logit model, I want to R's effects package to visualize the results. This works fine and I did so for two independent variables, then I tried to combine the two plots. However, this does not seem to work. I provide a little replicable example here so you can see my problem for yourself:
library(car)
data(Chile)
mod <- polr(vote ~ age + log(income), data=Chile)
eff <- effect("log(income)", mod)
plot1 <- plot(eff, style="stacked",rug=F, key.args=list(space="right"))
eff2 <- effect("age", mod)
plot2 <- plot(eff2, style="stacked",rug=F, key.args=list(space="right"))
I can print these two plots now independently, but when I try to plot them together, the first plot is overwritten. I tried setting par(mfrow=c(2,1)), which didn't work. Next I tried the following:
print(plot1, position=c(0, .5, 1, 1), more=T)
print(plot2, position=c(0,0, 1, .5))
In this latter case, the positions of the two plots are just fine, but still the first plot vanishes once I add the second (or better, it is overwritten). Any suggestions how to prevent this behavior would be appreciated.
Reading down the long list of arguments to ?print.eff we see that there are some arguments for doing just this:
plot(eff, style="stacked",rug=F, key.args=list(space="right"),
row = 1,col = 1,nrow = 1,ncol = 2,more = TRUE)
plot(eff2, style="stacked",rug=F, key.args=list(space="right"),
row = 1,col = 2,nrow = 1,ncol = 2)
The reason par() didn't work is because this package is using lattice graphics, which are based on the grid system, which is incompatible with base graphics. Neither par() nor layout will have any effect on grid graphics.
This seems to work:
plot(eff,col=1,row=2,ncol=1,nrow=2,style="stacked",rug=F,
key.args=list(space="right"),more=T)
plot(eff2,col=1,row=1,ncol=1,nrow=2,style="stacked",rug=F,
key.args=list(space="right"))
edit: Too late...

Resources