Significance level of ACF and PACF in R

I want to obtain the limits that determine the significance of the autocorrelation coefficients and partial autocorrelation coefficients, but I don't know how to do it.
I obtained the partial autocorrelogram using pacf(data), and I would like R to print the values indicated in the figure.

The limits that determine the significance of the autocorrelation coefficients are: +/- (exp(2*1.96/√(N-3)) - 1)/(exp(2*1.96/√(N-3)) + 1).
Here N is the length of the time series, and I used the 95% confidence level.

The correlation values that correspond to the m % confidence intervals chosen for the test are given by 0 ± i/√N where:
N is the length of the time series
i is the number of standard deviations we expect m % of the correlations to lie within under the null hypothesis that there is zero autocorrelation.
Since the observed correlations are assumed to be normally distributed:
i=2 for a 95% confidence level (acf's default),
i=3 for a 99% confidence level,
and so on as dictated by the properties of a Gaussian distribution
Figure A1, Page 1011 here provides a nice example of how the above principle applies in practice.

After investigating the acf and pacf functions and the psychometric package with its CIz and CIr functions, I found this simple code to do the task.
Compute the confidence interval for Fisher's z:
ciz = c(-1,1)*(-qnorm((1-alpha)/2)/sqrt(N-3))
Here alpha is the confidence level (typically 0.95) and N is the number of observations.
Compute the confidence interval for r:
cir = (exp(2*ciz)-1)/(exp(2*ciz)+1)
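Putting the pieces together, a minimal self-contained sketch (the simulated series, N and alpha are placeholders, not part of the original answer):

```r
# Simulated AR(1) series as a stand-in for your data
set.seed(42)
data <- arima.sim(model = list(ar = 0.6), n = 200)
N <- length(data)
alpha <- 0.95  # confidence level, as above

# White-noise band that acf()/pacf() draw by default: qnorm((1+alpha)/2)/sqrt(N)
band_default <- -qnorm((1 - alpha) / 2) / sqrt(N)

# Fisher-z band, mapped back to the correlation scale
ciz <- c(-1, 1) * (-qnorm((1 - alpha) / 2) / sqrt(N - 3))
cir <- (exp(2 * ciz) - 1) / (exp(2 * ciz) + 1)

pacf(data)
abline(h = cir, lty = 3, col = "red")  # overlay the Fisher-z limits
```

Note that (exp(2z) - 1)/(exp(2z) + 1) is just tanh(z), the inverse Fisher transform.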

Related

Change significance level MannKendall trend test -- R

I want to perform the Mann-Kendall test at the 99% and 90% confidence levels (CI). When I run the lines below, the analysis is based on a 95% CI. How can I change the code to perform it at the 99% and 90% CI?
library(Kendall)
vec = c(1,2,3,4,5,6,7,8,9,10)
MannKendall(vec)
I cannot comment yet, but I have a question: what do you mean when you say that you need to perform the analysis at the 99% and 90% CI? Do you want to know whether your value is significant at the 99% and 90% significance levels?
If you just need to know whether your score is significant at the 99% and 90% levels, then r2evans was right: the alpha, or significance level, is just an arbitrary threshold that you use to define how small your probability should be for you to reject the assumption that there "is no effect" -- or, in this case, that there is independence between the observations. More importantly, the calculation of the p-value is independent of the confidence level you select, so if you want to know whether your result is significant at different confidence levels, just compare your p-value against those thresholds.
I checked how the function works and did not see any indication that the selected alpha level affects the results. If you check the source code of MannKendall(x) (by typing MannKendall without parentheses), you can see that it is just Kendall(1:length(x), x). The function Kendall calculates a statistic tau that "measures the strength of monotonic association between the vectors x and y", then returns a p-value by calculating how likely your observed tau is under the assumption that there is no relation between 1:length(x) and x -- in other words, how likely it is that you would obtain that tau just by chance. As you can see, this does not depend on the confidence level at all; the confidence level only matters at the end, when you decide how small the probability of your tau should be before you conclude it cannot have been obtained just by chance.
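To make the threshold comparison concrete, here is a sketch using base R's cor.test with Kendall's tau against the time index, as a stand-in for MannKendall(vec) (which, per the source code above, computes Kendall(1:length(x), x)):

```r
# Mann-Kendall is Kendall's tau of the series against its time index.
# cor.test from base R mirrors that logic; MannKendall() itself lives
# in the Kendall package.
vec <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
res <- cor.test(seq_along(vec), vec, method = "kendall")
p <- res$p.value

# One p-value, several thresholds: only the comparison changes
p < 0.01  # significant at the 99% level?
p < 0.10  # significant at the 90% level?
```

For this strictly increasing toy series, tau is 1 and the p-value clears both thresholds.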

Calculating marginal effects from predicted probabilities of zeroinfl() model object

This plot, which I previously created, shows predicted probabilities of claim onset based on two variables, PIB (scaled across the x-axis) and W, presented as its 75th and 25th percentiles. Confidence intervals for the predictions are presented alongside the two lines.
Probability of Claim Onset
As I theorize that W and PIB have an interactive effect on claim onset, I'd like to see if there is any significance in the marginal effect of W on PIB. Confidence intervals of the predicted probabilities alone cannot confirm that this effect is insignificant, per my reading here (https://www.sociologicalscience.com/download/vol-6/february/SocSci_v6_81to117.pdf).
I know that you can calculate marginal effect easily from predicted probabilities by subtracting one from the other. Yet, I don't understand how I can get the confidence intervals for the marginal effect -- obviously needed to determine when and where my two sets of probabilities are indeed significantly different from one another.
The function that I used for calculating predicted probabilities of the zeroinfl() model object and the confidence intervals of those predicted probabilities is derived from an online posting (https://stat.ethz.ch/pipermail/r-help/2008-December/182806.html). I'm happy to provide more code if needed, but as this is not a question about an error, I am not sure it is needed.
So, I'm not entirely sure this is the correct answer, but to anyone who might come across the same problem I did:
Assuming that the two prediction lines have the same variance, you can pool the standard errors before calculating the interval. See the Wikipedia article on pooled variance to confirm.
## Pooled SD from the two prediction SEs (equal simulation sizes assumed),
## then the SE of the difference: sp * sqrt(1/n + 1/n)
sp <- sqrt((((pred_1$SE * sqrt(simulation_n))^2) + ((pred_2$SE * sqrt(simulation_n))^2)) / 2)
SEpooled <- sp * sqrt((1/simulation_n) + (1/simulation_n))
low_conf <- (pred_1$PP - pred_2$PP) - (1.96*SEpooled)
high_conf <- (pred_1$PP - pred_2$PP) + (1.96*SEpooled)
##Add this to the plot
lines(pred_1$x_val, low_conf, lty=2)
lines(pred_1$x_val, high_conf, lty=2)
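As a cross-check on the pooled-SE approach above, a common simpler alternative treats the two predictions as independent and combines their standard errors in quadrature. All object names (pred_1, pred_2, PP, SE) are toy placeholders mirroring the ones used above, not real model output:

```r
# Toy stand-ins for the two prediction objects (placeholders)
pred_1 <- list(PP = c(0.30, 0.35, 0.40), SE = c(0.04, 0.05, 0.06),
               x_val = c(1, 2, 3))
pred_2 <- list(PP = c(0.20, 0.22, 0.25), SE = c(0.03, 0.04, 0.05))

# SE of a difference of two independent estimates: quadrature sum
se_diff <- sqrt(pred_1$SE^2 + pred_2$SE^2)

diff_pp   <- pred_1$PP - pred_2$PP
low_conf  <- diff_pp - 1.96 * se_diff
high_conf <- diff_pp + 1.96 * se_diff

# The marginal effect is "significant" wherever the interval excludes zero
sign_diff <- low_conf > 0 | high_conf < 0
```

With these toy numbers the first two points exclude zero and the third does not.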

confidence interval of estimates in a fitted hybrid model by spatstat

Hybrid Gibbs models are flexible for fitting spatial point pattern data; however, I am confused about how to get a confidence interval for the fitted model's estimates. For instance, I fitted a hybrid Geyer model including a hardcore and a Geyer saturation component, and got the estimates:
Mo.hybrid<-Hybrid(H=Hardcore(), G=Geyer(81,1))
my.hybrid<-ppm(my.X~1,Mo.hybrid, correction="bord")
#beta = 1.629279e-06
#Hard core distance: 31.85573
#Fitted G interaction parameter gamma: 10.241487
What I am interested in is gamma, which represents the aggregation of points. Obviously the data X is a sample, e.g. of cells in an anatomical image. In order to report a statistical result, a confidence interval for gamma is needed; however, I do not have replicates of the image data.
Can I simulate the fitted hybrid model 10 times and then refit the simulations to get a confidence interval for the estimate? Something like:
mo.Y<-rmhmodel(cif=c("hardcore","geyer"),
par=list(list(beta=1.629279e-06,hc=31.85573),
list(beta=1, gamma=10.241487,r=81,sat=1)), w=my.X)
Y1<-rmh(model=mo.Y, control = list(nrep=1e6,p=1, fixall=TRUE),
start=list(n.start=c(npoint(my.X))))
Y1.fit<-ppm(Y1~1, Mo.hybrid,rbord=0.1)
# simulate and fit Y2,Y3,...Y10 in same way
or:
Y10<-simulate(my.hybrid,nsim=10)
Y1.fit<-ppm(Y10[1]~1, Mo.hybrid,rbord=0.1)
# fit Y2,Y3,...Y10 in same way
Certainly the algorithms are different: rmh() can control the simulated intensity while simulate() does not.
Now the questions are:
Is it right to use simulation to get a confidence interval for an estimate?
Or can the fitted model provide an estimate interval that could be extracted?
If simulation is OK, which algorithm is better in my case?
The function confint calculates confidence intervals for the canonical parameters of a statistical model. It is defined in the standard stats package. You can apply it to fitted point process models in spatstat: in your example just type confint(my.hybrid).
You wanted a confidence interval for the non-canonical parameter gamma. The canonical parameter is theta = log(gamma), so if you do exp(confint(my.hybrid)) you can read off the confidence interval for gamma.
Confidence intervals and other forms of inference for fitted point process models are discussed in detail in the spatstat book chapters 9, 10 and 13.
The confidence intervals described above are the asymptotic ones (based on the asymptotic variance matrix using the central limit theorem).
If you really wanted to estimate the variance-covariance matrix by simulation, it would be safer and easier to fit the model using method='ho' (which performs the simulation) and then apply confint as before (which would then use the variance of the simulations rather than the asymptotic variance).
rmh.ppm and simulate.ppm are essentially the same algorithm, apart from some book-keeping. The differences observed in your example occur because you passed different arguments. You could have passed the same arguments to either of these functions.
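The exponentiate-the-endpoints step can be illustrated with a plain GLM, used here as a stand-in for the spatstat model (the principle is the same for any parameter that is canonical on the log scale):

```r
# Poisson GLM: coefficients are canonical on the log scale, just like
# theta = log(gamma) in the point process model above.
set.seed(7)
x <- rep(0:1, each = 50)
y <- rpois(100, lambda = exp(0.2 + 0.8 * x))
fit <- glm(y ~ x, family = poisson)

ci_log <- confint.default(fit)  # Wald CI on the log (canonical) scale
ci_exp <- exp(ci_log)           # CI for the multiplicative parameter

# Exponentiation is monotone, so the endpoints map across directly
ci_exp["x", ]
```

Because exp() is strictly increasing, the transformed interval keeps the same coverage as the interval for the canonical parameter.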

Obtaining correct confidence intervals for Poisson rate ratio test

In R I am having a problem obtaining exact confidence intervals for the rate ratio as calculated by the poisson.test function. This function uses the method of binom.test to calculate the confidence limits for the ratio of two rates, e.g.
poisson.test(x = c(10000,20000), T = c(15000,15000), r = 1, conf.level = 0.95)
This works fine when x (events) is lower than T (observations). However, at very high x (events) and relatively low T (observations), the variance of the Poisson distribution approaches infinity. This is not accounted for in the poisson.test function, as the binom.test method has no such limits. Consequently, with progressively higher event rates, the confidence intervals of both the individual rates and their ratio become progressively narrower, while they should asymptotically widen.
Would anybody know an alternative way to test the ratio of two very high rates using the Uniformly Most Powerful method and obtain their correct confidence limits?
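For reference, the conditioning that poisson.test performs can be reproduced directly (this sketch only restates the documented binomial relationship; it does not resolve the asymptotic concern raised above):

```r
# Two-sample poisson.test conditions on the total count: under r = 1,
# x1 | (x1 + x2) ~ Binomial(x1 + x2, T1/(T1 + T2)), and the proportion
# CI is mapped back to the rate-ratio scale.
x1 <- 10000; x2 <- 20000
T1 <- 15000; T2 <- 15000

pt <- poisson.test(c(x1, x2), T = c(T1, T2), r = 1, conf.level = 0.95)
bt <- binom.test(x1, x1 + x2, p = T1 / (T1 + T2), conf.level = 0.95)

# Map the proportion CI to the ratio scale: r = (p / (1 - p)) * (T2 / T1)
ci_ratio <- (bt$conf.int / (1 - bt$conf.int)) * (T2 / T1)
```

The mapped interval should match pt$conf.int, making explicit where the binomial limits enter.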

How to calculate confidence intervals for each bar plotted from CCF function to see if they are significantly different from other bars in R

I've researched how to identify the confidence limits that a given bar must exceed to be considered statistically significant when using the ccf() function in R. However, I'm now trying to compare the statistically significant lag bars with one another, to see whether they are significantly different from each other. Is there a way to calculate a confidence interval for the correlation at each bar on a CCF plot?
The code below simulates some data that illustrates the issue. You can see that there are 4 bars (lags -3 to 1) that are all significant at the .9999 confidence level.
What I am trying to determine is whether the ACF correlation at lag -1 is significantly different from the coefficients observed at lags -2 and 0. I believe calculating the CIs for each bar would allow me to do this, but I am unsure how to achieve it.
#Simulating 2 time series that are correlated across multiple lags
set.seed(1)
y <- stats::filter(rnorm(1000), filter = rep(1, 3), circular = TRUE)
#plot(y)
x <- stats::lag(y, 1) * 2
#plot(x)
ccfObject <- ccf(x, y, lag.max = 3)
plot(ccfObject, ci = .9999)
My Official Question
How can I calculate the confidence interval for each bar outputted by the CCF() function in R?
Secondarily:
Is making this type of comparison statistically valid? If not, how should I make such comparisons in R?
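One common approximate approach -- my own suggestion, not an official ccf feature -- is a per-bar Fisher-z interval. It treats each bar as an ordinary correlation coefficient and ignores autocorrelation within the series, so it is only a rough guide:

```r
# Approximate per-bar CIs for cross-correlations via the Fisher z-transform.
# Caveat: this treats each bar as an independent ordinary correlation and
# ignores serial dependence, so the intervals are likely too narrow.
set.seed(1)
y <- stats::filter(rnorm(1000), filter = rep(1, 3), circular = TRUE)
x <- stats::lag(y, 1) * 2
ccfObject <- ccf(x, y, lag.max = 3, plot = FALSE)

r <- as.numeric(ccfObject$acf)  # correlation at each lag bar
n <- ccfObject$n.used

z     <- atanh(r)               # Fisher z-transform
se_z  <- 1 / sqrt(n - 3)
lower <- tanh(z - 1.96 * se_z)
upper <- tanh(z + 1.96 * se_z)

data.frame(lag = as.numeric(ccfObject$lag), r, lower, upper)
```

Two bars could then be called "different" when their intervals do not overlap, though a formal test of the difference between two dependent correlations would be more rigorous.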
