Autocorrelation scatterplot for a special lag - r

I am using the acf function to calculate autocorrelation. I want a scatter plot for a specific lag, for example lag 2. Is it possible to do this with the acf function?
Is there any package that would help me produce a scatter plot for each lag?

lag.plot1=function(data1,max.lag=1,corr=TRUE,smooth=FALSE){
  name1=paste(deparse(substitute(data1)),"(t-",sep="")
  name2=paste(deparse(substitute(data1)),"(t)",sep="")
  data1=as.ts(data1)
  max.lag=as.integer(max.lag)
  prow=ceiling(sqrt(max.lag))
  pcol=ceiling(max.lag/prow)
  a=acf(data1,max.lag,plot=FALSE)$acf[-1]
  par(mfrow=c(prow,pcol), mar=c(2.5, 4, 2.5, 1), cex.main=1.1, font.main=1)
  for(h in 1:max.lag){
    plot(lag(data1,-h), data1, xy.labels=FALSE, main=paste(name1,h,")",sep=""), ylab=name2, xlab="")
    if (smooth==TRUE)
      lines(lowess(ts.intersect(lag(data1,-h),data1)[,1],
                   ts.intersect(lag(data1,-h),data1)[,2]), col="red")
    if (corr==TRUE)
      legend("topright", legend=round(a[h], digits=2), text.col ="blue", bg="white", x.intersp=0)
  }
}
lag.plot1(dat,12,smooth=TRUE)
The full call is lag.plot1(x,m,corr=TRUE,smooth=TRUE). It generates a grid of scatterplots of x(t-h) versus x(t) for h = 1,...,m, with the autocorrelation values in blue and a lowess fit in red. If you don't want any correlations or lines, simply use R's lag.plot, which is part of the base stats package, so no extra library is needed:
lag.plot(LakeHuron,lags=3,do.lines=FALSE)
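To get a scatterplot for one specific lag only (lag 2 in the question), the lagged pairs can also be built by hand. The following is a minimal sketch using the built-in LakeHuron series; the names h, x_lagged, x_current and r_h are introduced here for illustration. The last line shows the same idea through the set.lags argument of lag.plot.

```r
# Manual scatterplot of x(t-h) versus x(t) for a single lag h
x <- as.numeric(LakeHuron)   # built-in annual series
h <- 2                       # the lag of interest
n <- length(x)

x_lagged  <- x[1:(n - h)]    # x(t-h)
x_current <- x[(h + 1):n]    # x(t)

plot(x_lagged, x_current,
     xlab = paste0("x(t-", h, ")"), ylab = "x(t)",
     main = paste("Lag", h))

# Sample autocorrelation at lag h as reported by acf();
# element 1 of $acf is lag 0, so lag h sits at position h + 1
r_h <- acf(x, lag.max = h, plot = FALSE)$acf[h + 1]
legend("topleft", legend = round(r_h, 2), text.col = "blue", bty = "n")

# The same single-lag picture via stats::lag.plot
lag.plot(LakeHuron, set.lags = h, do.lines = FALSE)
```

Note that the value from acf() can differ slightly from cor(x_lagged, x_current), because acf() normalises by the overall mean and variance of the full series.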

Related

How to add a Poisson distribution curve that approaches 3?

I want to add a curve to an existing plot.
This curve should be a Poisson distribution curve that approaches the mean 3.
I've tried this code, where points is a vector with 1000 values:
plot(c(1:1000), points,type="l")
abline(h=3)
x = 0:1000
curve(dnorm(x, 3, sqrt(3)), lwd=2, col="red", add=TRUE)
I am getting a plot, but without any curve.
I would like to see a curve that approaches 3.
You can do something like this:
plot(0:20, 3+dpois( x=0:20, lambda=3 ), xlim=c(-2,20))
normden <- function(x){3+dnorm(x, mean=3, sd=sqrt(3))}
curve(normden, from=-4, to=20, add=TRUE, col="red")
Running this code plots the Poisson probabilities shifted up by 3, with the shifted normal density overlaid in red.
Is that what you intended?
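One plausible reason the curve was not visible in the original attempt, assuming the values in points are on the order of 3, is scale: the density itself never gets much above 0.23, so on a y-axis centred on 3 it sits flat near zero, and over an x-range of 1 to 1000 it is non-negligible only for the first few values. A small sketch of that check (the names lambda and peak are introduced here for illustration):

```r
lambda <- 3
x <- 0:20

# Peak height of the normal approximation N(lambda, sqrt(lambda))
peak <- max(dnorm(x, mean = lambda, sd = sqrt(lambda)))
peak   # roughly 0.23, far below a y-axis centred on 3

# On their own scale, the Poisson probabilities and the
# normal approximation line up as expected
plot(x, dpois(x, lambda), type = "h",
     ylab = "Probability", main = "Poisson(3) with normal approximation")
curve(dnorm(x, mean = lambda, sd = sqrt(lambda)),
      from = 0, to = 20, add = TRUE, col = "red")
```

This is why the answer above adds 3 to both the probabilities and the density before plotting.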

Multiple plots using curve() function (e.g. normal distribution)

I am trying to plot multiple functions using curve(). My example tries to plot multiple normal distributions with different means and the same standard deviation.
png("d:/R/standardnormal-different-means.png",width=600,height=300)
#First normal distribution
curve(dnorm,
from=-2,to=2,ylab="d(x)",
xlim=c(-5,5))
abline(v=0,lwd=4,col="black")
#Only second normal distribution is plotted
myMean <- -1
curve(dnorm(x,mean=myMean),
from=myMean-2,to=myMean+2,
ylab="d(x)",xlim=c(-5,5), col="blue")
abline(v=-1,lwd=4,col="blue")
dev.off()
As the curve() function creates a new plot each time, only the second normal distribution is plotted.
I reopened this question because the ostensible duplicates focus on plotting two different functions or two different y-vectors with separate calls to curve. But since we want the same function, dnorm, plotted for different means, we can automate the process (although the answers to the other questions could also be generalized and automated in a similar way).
For example:
my_curve = function(m, col) {
curve(dnorm(x, mean=m), from=m - 3, to=m + 3, col=col, add=TRUE)
abline(v=m, lwd=2, col=col)
}
plot(NA, xlim=c(-10,10), ylim=c(0,0.4), xlab="Mean", ylab="d(x)")
mapply(my_curve, seq(-6,6,2), rainbow(7))
Or, to generalize still further, let's allow multiple means and standard deviations and provide an option regarding whether to include a mean line:
my_curve = function(m, sd, col, meanline=TRUE) {
curve(dnorm(x, mean=m, sd=sd), from=m - 3*sd, to=m + 3*sd, col=col, add=TRUE)
if(meanline==TRUE) abline(v=m, lwd=2, col=col)
}
plot(NA, xlim=c(-10,10), ylim=c(0,0.4), xlab="Mean", ylab="d(x)")
mapply(my_curve, rep(0,4), 4:1, rainbow(4), MoreArgs=list(meanline=FALSE))
You can also use line segments that start at zero and stop at the top of the density distribution, rather than extending all the way from the bottom to the top of the plot. For a normal distribution the mean is also the point of highest density. However, I've used the which.max approach below as a more general way of identifying the x-value at which the maximum y-value occurs. I've also added arguments for line width (lwd) and line end cap style (lend=1 means flat rather than rounded):
my_curve = function(m, sd, col, meanline=TRUE, lwd=1, lend=1) {
x=curve(dnorm(x, mean=m, sd=sd), from=m - 3*sd, to=m + 3*sd, col=col, add=TRUE)
if(meanline==TRUE) segments(m, 0, m, x$y[which.max(x$y)], col=col, lwd=lwd, lend=lend)
}
plot(NA, xlim=c(-10,20), ylim=c(0,0.4), xlab="Mean", ylab="d(x)")
mapply(my_curve, seq(-5,5,5), c(1,3,5), rainbow(3))
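For just the two curves in the question, the smallest change is to pass add=TRUE to the second curve() call so that it draws on the existing plot instead of starting a new one. A minimal sketch (the png()/dev.off() wrapper from the question is left out, and the names first and second are introduced here only to hold the values curve() returns):

```r
# First normal distribution sets up the plot region
first <- curve(dnorm(x), from = -5, to = 5, ylab = "d(x)")

# Second distribution is added to the same plot
myMean <- -1
second <- curve(dnorm(x, mean = myMean), from = -5, to = 5,
                col = "blue", add = TRUE)

# Mark both means
abline(v = c(0, myMean), col = c("black", "blue"), lwd = 2)
```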

R overlap normal curve to probability histogram

In R I'm able to overlay a normal curve on a density histogram.
I can also convert the density histogram to a probability one:
a <- rnorm(1:100)
test <-hist(a, plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100 # Probability
plot(test, ylab="Probability")
curve(dnorm(x, mean=mean(a), sd=sd(a)), add=TRUE)
But I can no longer overlay the normal curve, since it goes off scale.
Is there a solution? Maybe a second y-axis?
Now the question is clear to me. Indeed a second y-axis seems to be the best choice for this as the two data sets have completely different scales.
In order to do this you could do:
set.seed(2)
a <- rnorm(1:100)
test <-hist(a, plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100 # Probability
plot(test, ylab="Probability")
# start a new plot on top of the existing one
par(new=TRUE)
# instead of using curve(), create the data yourself and use plot();
# this is roughly what curve() does internally anyway
curve_data <- dnorm(seq(-2, 2, 0.01), mean=mean(a), sd=sd(a))
# plot the line with no axes or labels; xlim matches the histogram
# breaks so that both layers share the same x scale
plot(seq(-2, 2, 0.01), curve_data, axes=FALSE, xlab='', ylab='', type='l', col='red', xlim=range(test$breaks))
# add the second y-axis on the right-hand side
axis(4, at=pretty(range(curve_data)))
First, save your rnorm data; otherwise you get different data each time.
seed = rnorm(100)
Next, go ahead with
hist(seed,probability = TRUE)
curve(dnorm(x, mean=mean(na.omit(seed)), sd=sd(na.omit(seed))), add=TRUE)
Now you have the expected result: a histogram with a density curve.
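The y-axis isn't a "probability" as you have labeled it. It is rescaled count data. If you plot your histogram on the density scale (freq = FALSE), you shouldn't have a problem:
# the data must not be called x: inside curve(), x is the plotting grid,
# so mean(x) and sd(x) would be computed on the grid, not on the data
obs <- rnorm(1000)
hist(obs, freq= FALSE, ylab= "Density")
curve(dnorm(x, mean=mean(obs), sd=sd(obs)), add=TRUE)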
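If the percentage scale from the question should be kept, another option is to rescale the normal density instead of adding a second axis: the expected percentage in a bin is approximately density × bin width × 100. A minimal sketch, assuming equal-width bins (which hist produces by default); the names h and binwidth are introduced here for illustration.

```r
set.seed(2)
a <- rnorm(100)

h <- hist(a, plot = FALSE)
binwidth <- diff(h$breaks)[1]               # all bins share this width

# Convert counts to percentages of all observations
h$counts <- h$counts / sum(h$counts) * 100
plot(h, ylab = "Percent of observations", main = "Histogram of a")

# Scale the density to the same units: percent per bin
curve(dnorm(x, mean = mean(a), sd = sd(a)) * binwidth * 100,
      add = TRUE, col = "red")
```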

Plot a log-curve to a scatter plot

I am facing a probably easy-to-solve issue: adding a log curve to a scatter plot.
I have already created the corresponding model and now only need to add the respective curve/line.
The current code is as follows:
### DATA
SpStats_urbanform <- c (0.3702534,0.457769,0.3069843,0.3468263,0.420108,0.2548158,0.347664,0.4318018,0.3745645,0.3724192,0.4685135,0.2505839,0.1830535,0.3409849,0.1883303,0.4789871,0.3979671)
co2 <- c (6.263937,7.729964,8.39634,8.12979,6.397212,64.755192,7.330138,7.729964,11.058834,7.463414,7.196863,93.377393,27.854284,9.081405,73.483949,12.850917,12.74407)
### Plot initial plot
plot (log10 (1) ~ log10 (1), col = "white", xlab = "PUSHc values",
ylab = "Corrected GHG emissions [t/cap]", xlim =c(0,xaxes),
ylim =c(0,yaxes), axes =F)
axis(1, at=seq(0.05, xaxes, by=0.05), cex.axis=1.1)
axis(2, at=seq(0, yaxes, by=1), cex.axis=1.1 )
### FIT
fit_co2_urbanform <- lm (log10(co2) ~ log10(SpStats_urbanform))
### Add data points (used points() instead of simple plot() bc. of other code parts)
points (co2_cap~SpStats_urbanform, axes = F, cex =1.3)
I now have all the fit parameters, but I am still unable to construct the corresponding fit curve for co2_cap (y-axis) ~ SpStats_urbanform (x-axis).
Can anyone help me finalize this little piece of code?
First, if you want to plot in a log-log space, you have to specify it with argument log="xy":
plot (co2~SpStats_urbanform, log="xy")
Then, if you want to add your regression line, use abline:
abline(fit_co2_urbanform)
Edit: If you don't want to plot in a log-log scale then you'll have to translate your equation log10(y)=a*log10(x)+b into y=10^(a*log10(x)+b) and plot it with curve:
f <- coefficients(fit_co2_urbanform)
curve(10^(f[1]+f[2]*log10(x)),ylim=c(0,100))
points(SpStats_urbanform,co2)
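Put together with the data from the question, the back-transformation can be sketched as follows. Restricting curve() to the observed x-range avoids evaluating log10(x) at x = 0. The names fit, pred_manual and pred_lm are introduced here for illustration, and the axis labels are taken from the question.

```r
SpStats_urbanform <- c(0.3702534,0.457769,0.3069843,0.3468263,0.420108,0.2548158,
                       0.347664,0.4318018,0.3745645,0.3724192,0.4685135,0.2505839,
                       0.1830535,0.3409849,0.1883303,0.4789871,0.3979671)
co2 <- c(6.263937,7.729964,8.39634,8.12979,6.397212,64.755192,7.330138,7.729964,
         11.058834,7.463414,7.196863,93.377393,27.854284,9.081405,73.483949,
         12.850917,12.74407)

# Linear fit in log10-log10 space: log10(y) = f[1] + f[2] * log10(x)
fit <- lm(log10(co2) ~ log10(SpStats_urbanform))
f <- coefficients(fit)

# Scatter plot on the original (untransformed) scale
plot(SpStats_urbanform, co2, xlab = "PUSHc values",
     ylab = "Corrected GHG emissions [t/cap]")

# Back-transformed fit, drawn only over the observed x-range
curve(10^(f[1] + f[2] * log10(x)),
      from = min(SpStats_urbanform), to = max(SpStats_urbanform),
      add = TRUE, col = "red")

# The hand-written formula agrees with the model's own predictions
pred_manual <- 10^(f[1] + f[2] * log10(SpStats_urbanform))
pred_lm     <- 10^predict(fit)
```

The same curve can be written as a power law, y = 10^f[1] * x^f[2].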

How do I plot the 'inverse' of a survival function?

I am trying to plot the inverse of a survival function, as my data is actually an increase in the proportion of an event over time. I can produce Kaplan-Meier survival plots, but I want to produce the 'opposite' of these. I can kind of get what I want using fun="cloglog":
plot(survfit(Surv(Days_until_workers,Workers)~Queen_Number+Treatment,data=xdata),
fun="cloglog", lty=c(1:4), lwd=2, ylab="Colonies with Workers",
xlab="Days", las=1, font.lab=2, bty="n")
But I don't quite understand what this has done to the time axis (i.e. it doesn't start at 0 and the distances decrease?), or why the survival lines extend above the y axis.
Would really appreciate some help with this!
Cheers
Use fun="event" to get the desired output:
library(survival)
fit <- survfit(Surv(time, status) ~ x, data = aml)
par(mfrow=1:2, las=1)
plot(fit, col=2:3)
plot(fit, col=2:3, fun="event")
The reason for fun="cloglog" screwing up the axes is that it does not plot a fraction at all. It is instead plotting this according to ?plot.survfit:
"cloglog" creates a complimentary log-log survival plot (f(y) = log(-log(y)) along with log scale for the x-axis)
Moreover, the fun argument is not limited to predefined functions like "event" or "cloglog", so you can easily give it your own custom function.
plot(fit, col=2:3, fun=function(y) 3*sqrt(1-y))
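As a small illustration of the custom-function route, the "inverse" of the survival curve is simply 1 - S(t), so passing function(y) 1 - y should give the same cumulative-event picture as fun="event". A minimal sketch on the aml data from the survival package; the name cum_event and the axis labels are introduced here for illustration.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ x, data = aml)

# Cumulative proportion with the event, drawn via a custom transform
plot(fit, col = 2:3, fun = function(y) 1 - y,
     xlab = "Time", ylab = "Cumulative proportion with event")

# The same quantity computed directly from the fitted survival estimates
cum_event <- 1 - fit$surv
```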
