I'm working my way through the worked example in the vignette for the “vars” package in R. I understand most of the example in the vignette except for Table 5.
Running the following code, I see where the cointegrating vector and loadings point estimates are derived, but I do not understand where the t-statistics have been derived.
I ran this code:
library("vars")
data("Canada")
Canada <- Canada[, c("prod", "e", "U", "rw")]
vecm <- ca.jo(Canada[, c("rw", "prod", "e", "U")], type = "trace", ecdet = "trend", K = 3, spec = "transitory")
vecm.r1 <- cajorls(vecm, r = 1)
And got these eigenvectors and weightings,
Eigenvectors, normalised to first column:
(These are the cointegration relations)
rw.l1 prod.l1 e.l1 U.l1 trend.l1
rw.l1 1.00000000 1.0000000 1.0000000 1.000000 1.0000000
prod.l1 0.54487553 -3.0021508 0.7153696 -7.173608 0.4087221
e.l1 -0.01299605 -3.8867890 -2.0625220 -30.429074 -3.3884676
U.l1 1.72657188 -10.2183404 -5.3124427 -49.077209 -5.1326687
trend.l1 -0.70918872 0.6913363 -0.3643533 11.424630 0.1157125
Weights W:
(This is the loading matrix)
rw.l1 prod.l1 e.l1 U.l1 trend.l1
rw.d -0.084814510 0.048563997 -0.02368720 -0.0016583069 5.722004e-12
prod.d -0.011994081 0.009204887 -0.09921487 0.0020567547 -7.478364e-12
e.d -0.015606039 -0.038019447 -0.01140202 -0.0005559337 -1.229460e-11
U.d -0.008659911 0.020499657 0.02896325 0.0009140795 1.103862e-11
These give me the correct alpha and beta values in Table 5.
But I have no idea how the t-statistics in Table 5 are derived in the code. Can anybody out there point me in the right direction please?
Regards,
James
My biggest gripe with urca: it does not calculate t-statistics for the beta vectors. It looks like the author may at some point have wanted to include the feature (hence Table 5 in the vignette you cite above) but did not get around to actually doing it.
The good news: for the case of one cointegrating vector (r = 1), there is sample code available to get the standard errors. See this discussion thread:
Standard Errors for the Long-Run Cointegrating Relationship with urca when there is a single cointegrating vector
Continuing from your example above, the code given in that thread is:
alpha <- coef(vecm.r1$rlm)[1, ] # the coefficients on the error-correction term (ect1)
beta <- vecm.r1$beta # the point estimates of beta
resids <- resid(vecm.r1$rlm)
N <- nrow(resids)
sigma <- crossprod(resids) / N
## t-stats for beta
beta.se <- sqrt(diag(kronecker(solve(crossprod(vecm@RK[, -1])),
                               solve(t(alpha) %*% solve(sigma) %*% alpha))))
beta.t <- c(NA, beta[-1] / beta.se)
names(beta.t) <- rownames(vecm.r1$beta)
beta.t
This reproduces exactly the standard errors given by the vignette.
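The loadings alpha and their t-statistics in Table 5, by contrast, come straight out of the restricted VECM regressions, so (if I am reading the vignette right) they can simply be read off the OLS summaries:
## t-statistics for alpha: the ordinary OLS t-values on the error-correction term
## (the first coefficient row, ect1) in each equation of the restricted VECM
sapply(summary(vecm.r1$rlm), function(eq) eq$coefficients[1, "t value"])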
Compare with Gretl
If we run the same model with gretl, we get exactly identical point estimates for beta, and standard errors that roughly agree.
Bonus: Create LaTeX table of the cointegrating vector with standard errors using urca and texreg
You need to write a slightly clumsy "extract" function to assemble the standard errors and point estimates together into the form that texreg likes:
extract.cajo_beta <- function(cajo, orls) {
alpha <- coef(orls$rlm)[1, ];
resids <- resid(orls$rlm);
N <- nrow(resids);
sigma <- crossprod(resids) / N;
# get standard errors and p-values
beta <- orls$beta
  beta.se <- sqrt(diag(kronecker(solve(crossprod(cajo@RK[, -1])),
                                 solve(t(alpha) %*% solve(sigma) %*% alpha))));
beta.se2 <- c(NA, beta.se);
beta.t <- c(NA, beta[-1] / beta.se);
  beta.pval <- 2 * pt(abs(beta.t), df = orls$rlm$df.residual, lower.tail = FALSE)  # two-sided p-values
tr <- createTexreg(coef.names = as.character(rownames(beta)), coef = as.numeric(beta), se = beta.se2,
pvalues = beta.pval,
# remove this goodness of fit measure afterwards from the table
gof.names = c('Dummy'), gof = c(1), gof.decimal = c(FALSE)
);
return(tr);
}
Then run:
> screenreg(extract.cajo_beta(vecm, vecm.r1))
=================
Model 1
-----------------
rw.l1 1.00
prod.l1 0.54
(0.61)
e.l1 -0.01
(0.68)
U.l1 1.73
(1.45)
trend.l1 -0.71 *
(0.28)
-----------------
Dummy 1
=================
*** p < 0.001, ** p < 0.01, * p < 0.05
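To get the actual LaTeX rather than the text preview, call texreg() on the same extract (the caption and label below are only examples); the Dummy goodness-of-fit row then has to be deleted by hand from the emitted table, as flagged in the comment inside the extract function:
texreg(extract.cajo_beta(vecm, vecm.r1),
       caption = "Cointegrating vector with standard errors",
       label = "tab:beta", booktabs = TRUE, use.packages = FALSE)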
I notice, searching through Stack Overflow for similar questions, that this has been asked several times but hasn't really been properly answered. Perhaps with help from other users this post can become a helpful guide to programming a numerical estimate of the parameters of a multivariate normal distribution.
I know, I know! The closed form solutions are available and trivial to implement. In my case I am interested in modifying the likelihood function for a specific purpose and I don't expect an exact analytic solution so this is a test case to check the procedure.
So here is my attempt. Please comment, especially if I am missing opportunities for optimization. Note: I'm not a statistician, so I'd appreciate any pointers.
ll_multN <- function(theta,X) {
# theta = c(mu, diag(Sigma), Sigma[upper.tri(Sigma)])
# X is an nxk dataset
  # log-likelihood: L = -(n*k/2)*log(2*pi) - (n/2)*log(det(Sigma)) - (1/2)*sum_i[ t(X_i - mu) %*% Sigma^-1 %*% (X_i - mu) ]
  # the summation over i is performed using an apply call
n <- nrow(X)
k <- ncol(X)
# def mu
mu.vec <- theta[1:k]
# def Sigma
Sigma.diag <- theta[(k+1):(2*k)]
Sigma.offd <- theta[(2*k+1):length(theta)]
Sigma <- matrix(NA, k, k)
Sigma[upper.tri(Sigma)] <- Sigma.offd
Sigma <- t(Sigma)
Sigma[upper.tri(Sigma)] <- Sigma.offd
diag(Sigma) <- Sigma.diag
# compute summation
sum_i <- sum(apply(X, 1, function(x) (matrix(x,1,k)-mu.vec)%*%solve(Sigma)%*%t(matrix(x,1,k)-mu.vec)))
# compute log likelihood
logl <- -.5*n*k*log(2*pi) - .5*n*log(det(Sigma))
logl <- logl - .5*sum_i
return(-logl)
}
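One possible speed-up (a sketch of mine, not part of the original post): invert Sigma once per call and let mahalanobis() do the row-wise quadratic forms, with determinant() for the log-determinant. For a positive definite Sigma this returns the same value as ll_multN, but with a single solve() per call.
ll_multN_fast <- function(theta, X) {
  n <- nrow(X); k <- ncol(X)
  mu.vec <- theta[1:k]
  Sigma <- matrix(NA, k, k)
  Sigma[upper.tri(Sigma)] <- theta[(2*k+1):length(theta)]
  Sigma <- t(Sigma)
  Sigma[upper.tri(Sigma)] <- theta[(2*k+1):length(theta)]
  diag(Sigma) <- theta[(k+1):(2*k)]
  Sigma.inv <- solve(Sigma)   # one inversion per call instead of one per row
  sum_i <- sum(mahalanobis(X, center = mu.vec, cov = Sigma.inv, inverted = TRUE))
  logl <- -0.5*n*k*log(2*pi) -
    0.5*n*as.numeric(determinant(Sigma, logarithm = TRUE)$modulus) -
    0.5*sum_i
  return(-logl)
}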
The simulated dataset is generated using the rmvnorm() function from the "mvtnorm" package. A random positive definite covariance matrix is generated using the helper function Posdef() (taken from here: https://stat.ethz.ch/pipermail/r-help/2008-February/153708)
library(mvtnorm)
Posdef <- function (n, ev = runif(n, 0, 5)) {
# generates a random positive definite covariance matrix
Z <- matrix(ncol=n, rnorm(n^2))
decomp <- qr(Z)
Q <- qr.Q(decomp)
R <- qr.R(decomp)
d <- diag(R)
ph <- d / abs(d)
O <- Q %*% diag(ph)
Z <- t(O) %*% diag(ev) %*% O
return(Z)
}
set.seed(2)
n <- 1000 # number of data points
k <- 3 # number of variables
mu.tru <- sample(0:3, k, replace=T) # random mean vector
Sigma.tru <- Posdef(k) # random covariance matrix
eigen(Sigma.tru)$val # check positive def (all lambda > 0)
# Generate simulated dataset
X <- rmvnorm(n, mean=mu.tru, sigma=Sigma.tru)
# initial parameter values
pars.init <- c(mu=rep(0,k), sig_ii=rep(1,k), sig_ij=rep(0, k*(k-1)/2))
# limits for optimization algorithm
eps <- .Machine$double.eps # a small value for bounding the parameter space, to avoid things such as log(0)
lower.bound <- c(rep(-Inf, k),          # bound on mu
                 rep(eps, k),           # bound on sigma_ii
                 rep(-Inf, k*(k-1)/2))  # bound on sigma_ij, i != j
upper.bound <- c(rep(Inf, k),           # bound on mu
                 rep(100, k),           # bound on sigma_ii
                 rep(100, k*(k-1)/2))   # bound on sigma_ij, i != j
system.time(
o <- optim(pars.init,
ll_multN, X=X, method="L-BFGS-B",
lower = lower.bound,
upper = upper.bound)
)
plot(x=c(mu.tru,diag(Sigma.tru),Sigma.tru[upper.tri(Sigma.tru)]),
y=o$par,
xlab="Parameter",
ylab="Estimate",
pch=20)
abline(c(0,1), col="red", lty=2)
This currently runs on my laptop in
user system elapsed
47.852 24.014 24.611
and gives this graphical output:
[Plot: estimated mean and (co)variance parameters plotted against their true values]
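For reference, since the closed-form solutions mentioned above are the obvious benchmark, the numerical optimum can be compared directly against the analytic MLEs (just colMeans and the n-denominator covariance):
mu.mle <- colMeans(X)
Sigma.mle <- crossprod(sweep(X, 2, mu.mle)) / n   # MLE divides by n, not n-1
cbind(optim    = o$par,
      analytic = c(mu.mle, diag(Sigma.mle), Sigma.mle[upper.tri(Sigma.mle)]))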
In particular any advice on limit setting or algorithm choice would be much appreciated.
Thanks
This began as a question, but after reading the reference provided in the answer below, as well as the source code, the solution became clear. In case anyone else finds themselves in this position:
The orcutt package, version 2.2, uses a special procedure to calculate the DW statistic for its Cochrane-Orcutt (CO) models. The MWE below uses the orcutt package's own example to show that the reported Durbin-Watson statistic is not based on the residuals of the CO estimation.
library(orcutt)
data(icecream, package="orcutt")
dw_calc = function(x){sum((x[2:length(x)] - x[1:(length(x)-1)]) ^ 2) / sum(x ^ 2)}
lm = lm(cons ~ price + income + temp, data=icecream)
e = lm$residuals
dw_calc(e)
# 1.02117 <- Durbin-Watson Statistic
coch = cochrane.orcutt(lm)
e = coch$residuals
dw_calc(e)
# 1.006431 <- Durbin-Watson Statistic
coch
# Durbin-Watson statistic
# (original): 1.02117 , p-value: 3.024e-04
# (transformed): 1.54884 , p-value: 5.061e-02
The orcutt package reports 1.54884, but the actual DW for the new residuals is 1.006431. The reported value, 1.54884, comes from the last round of the convergence procedure (see Hildreth-Lu). See below for a thorough explanation:
reg = lm(cons ~ price + income + temp, data=icecream)
convergence = 8
X <- model.matrix(reg)
Y <- model.response(model.frame(reg))
n<-length(Y)
e<-reg$residuals
e2<-e[-1]
e3<-e[-n]
regP<-lm(e2~e3-1)
rho<-summary(regP)$coeff[1]
rho2<-c(rho)
XB<-X[-1,]-rho*X[-n,]
YB<-Y[-1]-rho*Y[-n]
regCO<-lm(YB~XB-1)
ypCO<-regCO$coeff[1]+as.matrix(X[,-1])%*%regCO$coeff[-1]
e1<-ypCO-Y
e2<-e1[-1]
e3<-e1[-n]
regP<-lm(e2~e3-1)
rho<-summary(regP)$coeff[1]
rho2[2]<-rho
i<-2
while (round(rho2[i-1],convergence)!=round(rho2[i],convergence)){
XB<-X[-1,]-rho*X[-n,]
YB<-Y[-1]-rho*Y[-n]
regCO<-lm(YB~XB-1)
ypCO<-regCO$coeff[1]+as.matrix(X[,-1])%*%regCO$coeff[-1]
e1<-ypCO-Y
e2<-e1[-1]
e3<-e1[-n]
regP<-lm(e2~e3-1)
rho<-summary(regP)$coeff[1]
i<-i+1
rho2[i]<-rho
}
regCO$number.interaction<-i-1
regCO$rho <- rho2[i-1]
regCO$DW <- c(lmtest::dwtest(reg)$statistic, lmtest::dwtest(reg)$p.value,
lmtest::dwtest(regCO)$statistic, lmtest::dwtest(regCO)$p.value)
dw_calc(regCO$residuals)
regF<-lm(YB ~ 1)
tF <- anova(regCO,regF)
regCO$Fs <- c(tF$F[2],tF$`Pr(>F)`[2])
# fitted.value
regCO$fitted.values <- model.matrix(reg) %*% (as.matrix(regCO$coeff))
# coeff
names(regCO$coefficients) <- colnames(X)
# st.err
regCO$std.error <- summary(regCO)$coeff[,2]
# t value
regCO$t.value <- summary(regCO)$coeff[,3]
# p value
regCO$p.value <- summary(regCO)$coeff[,4]
class(regCO) <- "orcutt"
# formula
regCO$call <- reg$call
# F statistics and p value
df1 <- dim(model.frame(reg))[2] - 1
df2 <- length(regCO$residuals) - df1 - 1
RSS <- sum((regCO$residuals)^2)
TSS <- sum((regCO$model[1] - mean(regCO$model[,1]))^2)
regCO$rse <- sqrt(RSS/df2)
regCO$r.squared <- 1 - (RSS/TSS)
regCO$adj.r.squared <- 1 - ((RSS/df2)/(TSS/(df1 + df2)))
regCO$gdl <- c(df1, df2)
#
regCO$rank <- df1
regCO$df.residual <- df2
regCO$assign <- regCO$assign[-(df1+1)]
regCO$residuals <- Y - regCO$fitted.values
regCO
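To boil the walk-through down to a few lines: the "transformed" value the package reports is simply the DW statistic of the quasi-differenced regression at the converged rho. A minimal check (my own sketch; I am assuming coch$rho holds the converged estimate, as set in the walk-through above):
rho <- coch$rho
X <- model.matrix(lm)
Y <- icecream$cons
n <- length(Y)
YB <- Y[-1] - rho * Y[-n]
XB <- X[-1, ] - rho * X[-n, ]
dw_calc(residuals(lm(YB ~ XB - 1)))   # approximately 1.54884, the reported "transformed" DW
dw_calc(coch$residuals)               # 1.006431, the DW of the recomputed original-scale residuals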
I don't know this area well, but it seems much more likely that there are multiple definitions/estimation methods for this statistic. Going back to the book cited in ?orcutt (Verbeek, M. (2004) A Guide to Modern Econometrics, John Wiley & Sons Ltd, ISBN 978-88-08-17054-5) and searching on Google Books, the value shown as computed by orcutt in your example above agrees with the value given in the book. Earlier in the book (p. 108) it says:
In [the Cochrane-Orcutt procedure], $\rho$ and $\beta$ are recursively estimated until convergence, i.e. having estimated $\beta$ by EGLS (by $\beta^*$), the residuals are recomputed and $\rho$ is estimated again using the residuals from the EGLS step. With this new estimate of $\rho$, EGLS is applied again and one obtains a new estimate of $\beta$ ...
In other words, it seems as though the estimate of $\rho$ that you give above corresponds only to the first step of the Cochrane-Orcutt procedure.
In R, how does the function ar.yw estimate the variance? Specifically, where does the number "var.pred" come from? It does not seem to come from the usual YW estimate of the variance, nor the sum of squared residuals divided by df (even though there is disagreement about what the df should be, none of the choices give an answer equivalent to var.pred). And yes, I know that there are better methods than YW; just trying to figure out what R is doing.
set.seed(82346)
temp <- arima.sim(n=10, list(ar = 0.5), sd=1)
fit <- ar(temp, method = "yule-walker", demean = FALSE, aic=FALSE, order.max=1)
## R's estimate of the sigma squared
fit$var.pred
## YW estimate
sum(temp^2)/10 - fit$ar*sum(temp[2:10]*temp[1:9])/10
## YW if there was a mean
sum((temp-mean(temp))^2)/10 - fit$ar*sum((temp[2:10]-mean(temp))*(temp[1:9]-mean(temp)))/10
## estimate based on residuals, different possible df.
sum(na.omit(fit$resid^2))/10
sum(na.omit(fit$resid^2))/9
sum(na.omit(fit$resid^2))/8
sum(na.omit(fit$resid^2))/7
Need to read the code if it's not documented.
?ar.yw
Which says: "In ar.yw the variance matrix of the innovations is computed from the fitted coefficients and the autocovariance of x." If that is not enough explanation, then you need to look at the code:
methods(ar.yw)
#[1] ar.yw.default* ar.yw.mts*
#see '?methods' for accessing help and source code
getAnywhere(ar.yw.default)
# there are two cases that I see
x <- as.matrix(x)
nser <- ncol(x)
if (nser > 1L) # .... not your situation
#....
else{
r <- as.double(drop(xacf))
z <- .Fortran(C_eureka, as.integer(order.max), r, r,
coefs = double(order.max^2), vars = double(order.max),
double(order.max))
coefs <- matrix(z$coefs, order.max, order.max)
partialacf <- array(diag(coefs), dim = c(order.max, 1L,
1L))
var.pred <- c(r[1L], z$vars)
#.......
order <- if (aic)
(0L:order.max)[xaic == 0L]
else order.max
ar <- if (order)
coefs[order, seq_len(order)]
else numeric()
var.pred <- var.pred[order + 1L]
var.pred <- var.pred * n.used/(n.used - (order + 1L))
So you now need to find the Fortran code for C_eureka. I think I'm finding it here: https://svn.r-project.org/R/trunk/src/library/stats/src/eureka.f This is the code that I think is returning the var.pred estimate. I'm not a time series guy, and it's your responsibility to review this process for applicability to your problem.
subroutine eureka (lr,r,g,f,var,a)
c
c solves Toeplitz matrix equation toep(r)f=g(1+.)
c by Levinson's algorithm
c a is a workspace of size lr, the number
c of equations
c
snipped
c estimate the innovations variance
var(l) = var(l-1) * (1 - f(l,l)*f(l,l))
if (l .eq. lr) return
d = 0.0d0
q = 0.0d0
do 50 i = 1, l
k = l-i+2
d = d + a(i)*r(k)
q = q + f(l,i)*r(k)
50 continue
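Putting those two pieces together, fit$var.pred for the AR(1) example above can be reproduced by hand (a sketch; I am assuming demean = FALSE is passed through to the internal acf() call, which is what ar.yw.default appears to do):
## biased, non-demeaned autocovariances at lags 0 and 1
r <- acf(temp, type = "covariance", lag.max = 1, demean = FALSE, plot = FALSE)$acf
phi <- r[2] / r[1]                  # the Yule-Walker AR(1) coefficient, equal to fit$ar
r[1] * (1 - phi^2) * 10 / (10 - 2)  # Levinson-Durbin step, then the n/(n - (order + 1)) scaling;
                                    # this should match fit$var.pred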
I am using maximum-likelihood optimization in Stan, but unfortunately the optimizing() function doesn't report standard errors:
> MLb4c <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits)
STAN OPTIMIZATION COMMAND (LBFGS)
init = user
save_iterations = 1
init_alpha = 0.001
tol_obj = 1e-012
tol_grad = 1e-008
tol_param = 1e-008
tol_rel_obj = 10000
tol_rel_grad = 1e+007
history_size = 5
seed = 292156286
initial log joint probability = -4038.66
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
13 -2772.49 9.21091e-005 0.0135987 0.07606 0.9845 15
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
> t2 <- proc.time()
> print(t2 - t1)
user system elapsed
0.11 0.19 0.74
>
> MLb4c
$par
psi alpha beta
0.9495000 0.4350983 -0.2016895
$value
[1] -2772.489
> summary(MLb4c)
Length Class Mode
par 3 -none- numeric
value 1 -none- numeric
How do I get the standard errors of the estimates (or confidence interval - quantiles), and possibly p-values?
EDIT: I did as kindly advised by @Ben Goodrich:
> MLb4cH <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits, hessian = TRUE)
> sqrt(diag(solve(-MLb4cH$hessian)))
psi alpha beta
0.21138314 0.03251696 0.03270493
But these "unconstrained" standard errors seem to be very different from the real ones - here is the output from Bayesian fitting using stan():
> print(outb4c, dig = 5)
Inference for Stan model: tmp_stan_model.
3 chains, each with iter=500; warmup=250; thin=1;
post-warmup draws per chain=250, total post-warmup draws=750.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha 0.43594 0.00127 0.03103 0.37426 0.41578 0.43592 0.45633 0.49915 594 1.00176
beta -0.20262 0.00170 0.03167 -0.26640 -0.22290 -0.20242 -0.18290 -0.13501 345 1.00402
psi 0.94905 0.00047 0.01005 0.92821 0.94308 0.94991 0.95656 0.96632 448 1.00083
lp__ -2776.94451 0.06594 1.15674 -2780.07437 -2777.50643 -2776.67139 -2776.09064 -2775.61263 308 1.01220
You can specify the hessian = TRUE argument to the optimizing function, which will return the Hessian as part of the list of output. Thus, you can obtain estimated standard errors via sqrt(diag(solve(-MLb4c$hessian))); however, those standard errors pertain to the estimates in the unconstrained space. To obtain estimated standard errors for the parameters in the constrained space, you could either use the delta method or draw many times from a multivariate normal distribution whose mean vector is the (unconstrained) point estimate and whose variance-covariance matrix is solve(-MLb4c$hessian), convert those draws to the constrained space with the constrain_pars function, and estimate the standard deviation of each column.
Here is some R code you could adapt to your case
# 1: Compile and save a model (make sure to pass the data here)
model <- stan(file="model.stan", data=c("N","K","X","y"), chains = 0, iter = 0)
# 2: Fit that model
fit <- optimizing(object=get_stanmodel(model), as_vector = FALSE,
data=c("N","K","X","y"), hessian = TRUE)
# 3: Extract the vector theta_hat and the Hessian for the unconstrained parameters
theta_hat <- unlist(fit$par)
upars <- unconstrain_pars(model, relist(theta_hat, fit$par))
Hessian <- fit$hessian
# 4: Extract the Cholesky decomposition of the (negative) Hessian and invert
R <- chol(-Hessian)
V <- chol2inv(R)
rownames(V) <- colnames(V) <- colnames(Hessian)
# 5: Produce a matrix with some specified number of simulation draws from a multinormal
SIMS <- 1000
len <- length(theta_hat)
unconstrained <- upars + t(chol(V)) %*%
matrix(rnorm(SIMS * len), nrow = len, ncol = SIMS)
theta_sims <- t(apply(unconstrained, 2, FUN = function(upars) {
  unlist(constrain_pars(model, upars))
}))
# 6: Produce estimated standard errors for the constrained parameters
se <- apply(theta_sims, 2, sd)
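From the same draws you can also get approximate intervals and Wald-type p-values on the constrained scale (my own extension of the recipe above; whether a symmetric Wald p-value makes sense for a bounded parameter such as psi is a separate question):
# 7: Approximate 95% intervals and Wald-type p-values for the constrained parameters
theta_hat_con <- unlist(constrain_pars(model, upars))  # point estimates on the constrained scale
ci <- t(apply(theta_sims, 2, quantile, probs = c(0.025, 0.975)))
z  <- theta_hat_con / se
cbind(estimate = theta_hat_con, se = se, ci, z = z, p.value = 2 * pnorm(-abs(z)))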
Given:
set.seed(1001)
outcome<-rnorm(1000,sd = 1)
covariate<-rnorm(1000,sd = 1)
log-likelihood of normal pdf:
loglike <- function(par, outcome, covariate){
cov <- as.matrix(cbind(1, covariate))
xb <- cov * par
(- 1/2* sum((outcome - xb)^2))
}
optimize:
opt.normal <- optim(par = 0.1,fn = loglike,outcome=outcome,cov=covariate, method = "BFGS", control = list(fnscale = -1),hessian = TRUE)
However, I get different results when running a simple OLS regression, even though maximizing the log-likelihood and minimizing the sum of squares should give me similar estimates. I suppose there is something wrong with my optimization.
summary(lm(outcome~covariate))
Umm several things... Here's a proper working likelihood function (with names x and y):
loglike =
function(par,x,y){cov = cbind(1,x); xb = cov %*% par;(-1/2)*sum((y-xb)^2)}
Note use of matrix multiplication operator.
You were also running it with only a single value in par, so not only was your loglike broken by doing element-by-element multiplication, optim was also optimizing over just one parameter.
Now compare optimiser parameters with lm coefficients:
opt.normal <- optim(par = c(0.1,0.1),fn = loglike,y=outcome,x=covariate, method = "BFGS", control = list(fnscale = -1),hessian = TRUE)
opt.normal$par
[1] 0.02148234 -0.09124299
summary(lm(outcome~covariate))$coeff
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.02148235 0.03049535 0.7044466 0.481319029
covariate -0.09124299 0.03049819 -2.9917515 0.002842011
shazam.
Helpful hints: create data that you know the right answer for - eg x=1:10; y=rnorm(10)+(1:10) so you know the slope is 1 and the intercept 0. Then you can easily see which of your things are in the right ballpark. Also, run your loglike function on its own to see if it behaves as you expect.
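Since hessian = TRUE was already requested in the optim() call above, you can also back out approximate standard errors (my addition; because the log-likelihood above implicitly fixes the residual sd at 1, these only roughly match the lm() standard errors, which are scaled by the estimated residual sd):
se.optim <- sqrt(diag(solve(-opt.normal$hessian)))
cbind(optim.se = se.optim,
      lm.se    = summary(lm(outcome ~ covariate))$coeff[, 2])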
Maybe you will find it useful to see the difference between these two methods in my code. I programmed it the following way.
data.matrix <- as.matrix(hprice1[,c("assess","bdrms","lotsize","sqrft","colonial")])
loglik <- function(p,z){
beta <- p[1:5]
sigma <- p[6]
  y <- log(z[,1])
eps <- (y - beta[1] - z[,2:5] %*% beta[2:5])
-nrow(z)*log(sigma)-0.5*sum((eps/sigma)^2)
}
p0 <- c(5,0,0,0,0,2)
m <- optim(p0,loglik,method="BFGS",control=list(fnscale=-1,trace=10),hessian=TRUE,z=data.matrix)
rbind(m$par,sqrt(diag(solve(-m$hessian))))
And for the lm() method I find this
m.ols <- lm(log(assess)~bdrms+lotsize+sqrft+colonial,data=hprice1)
summary(m.ols)
Also, if you would like to estimate the elasticity of assessed value with respect to lotsize, or calculate a 95% confidence interval for this parameter, you could use the following
elasticity.at.mean <- mean(hprice1$lotsize) * m$par[3]
var.coefficient <- solve(-m$hessian)[3,3]
var.elasticity <- mean(hprice1$lotsize)^2 * var.coefficient
# upper bound
elasticity.at.mean + qnorm(0.975)* sqrt(var.elasticity)
# lower bound
elasticity.at.mean + qnorm(0.025)* sqrt(var.elasticity)
A more simple example of the optim method is given below for a binomial distribution.
loglik1 <- function(p,n,n.f){
n.f*log(p) + (n-n.f)*log(1-p)
}
m <- optim(c(pi=0.5),loglik1,control=list(fnscale=-1),
n=73,n.f=18)
m
m <- optim(c(pi=0.5),loglik1,method="BFGS",hessian=TRUE,
control=list(fnscale=-1),n=73,n.f=18)
m
pi.hat <- m$par
# numerical calculation of the s.d.
rbind(pi.hat=pi.hat,sd.pi.hat=sqrt(diag(solve(-m$hessian))))
# analytical
rbind(pi.hat=18/73,sd.pi.hat=sqrt((pi.hat*(1-pi.hat))/73))
Or this code for the normal distribution (aex below is just a numeric data vector of mine, not defined in this post).
loglik1 <- function(p,z){
mu <- p[1]
sigma <- p[2]
-(length(z)/2)*log(sigma^2) - sum(z^2)/(2*sigma^2) +
(mu*sum(z)/sigma^2) - (length(z)*mu^2)/(2*sigma^2)
}
m <- optim(c(mu=0,sigma2=0.1),loglik1,
control=list(fnscale=-1),z=aex)
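As a check (my addition; note that inside loglik1 the second element of p is used as the standard deviation, despite the sigma2 name in the starting values), the numerically obtained estimates can be compared with the analytic MLEs:
z <- aex   # or whatever numeric data vector was passed to optim above
rbind(analytic = c(mu.hat = mean(z), sigma.hat = sqrt(mean((z - mean(z))^2))),  # MLE divides by n
      optim    = m$par)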