I'm using the following code to perform PCA:
PCA <- prcomp(Ret1, center = TRUE, scale. = TRUE)
summary(PCA)
I get the following result:
#Importance of components:
# PC1 PC2 PC3 PC4
#Standard deviation 1.6338 0.9675 0.60446 0.17051
#Proportion of Variance 0.6673 0.2340 0.09134 0.00727
#Cumulative Proportion 0.6673 0.9014 0.99273 1.00000
What I would like to do is a rolling PCA over a specific window (e.g. 180 days). The result should be a matrix which shows the evolution of the "Proportion of Variance" of all principal components through time.
I tried it with
rollapply(Ret1, 180, prcomp)
but this doesn't work, and I have no idea how to save the "Proportion of Variance" for each time step in a matrix.
The output matrix should look like this:
# PC1 PC2 PC3 PC4
#Period 1 0.6673 0.2340 0.09134 0.00727
#Period 2 0.7673 0.1340 0.09134 0.00727
# ....
Here is a mini subset of my data Ret1:
Cats Dogs Human Frogs
2016-12-13 0.0084041063 6.518479e-03 6.096295e-04 5.781271e-03
2016-12-14 -0.0035340384 -8.150321e-03 4.418382e-04 -5.978296e-03
2016-12-15 0.0107522782 3.875708e-03 -1.784663e-02 3.012253e-03
2016-12-16 0.0033034130 -1.752174e-03 -1.753624e-03 -4.448850e-04
2016-12-17 0.0000000000 0.000000e+00 0.000000e+00 0.000000e+00
2016-12-18 0.0000000000 0.000000e+00 0.000000e+00 0.000000e+00
2016-12-19 0.0019876743 1.973190e-03 -8.577261e-03 1.996151e-03
2016-12-20 0.0033235161 3.630921e-03 -4.757395e-03 4.594355e-03
2016-12-21 0.0003401156 -2.460351e-03 3.708875e-03 -1.636413e-03
2016-12-22 -0.0010940147 -1.864724e-03 -7.991572e-03 -1.158029e-03
2016-12-23 -0.0005387228 1.250898e-03 -2.843725e-03 7.492594e-04
2016-12-24 0.0000000000 0.000000e+00 0.000000e+00 0.000000e+00
2016-12-25 0.0000000000 0.000000e+00 0.000000e+00 0.000000e+00
2016-12-26 0.0000000000 0.000000e+00 0.000000e+00 0.000000e+00
2016-12-27 0.0019465877 2.245918e-03 0.000000e+00 5.632058e-04
2016-12-28 0.0002396803 -8.391658e-03 8.307552e-03 -5.598988e-03
2016-12-29 -0.0020884556 -2.933868e-04 1.661246e-03 -7.010738e-04
2016-12-30 0.0026172923 -4.647865e-03 9.574997e-03 -2.889166e-03
I tried the following:
PCA <- function(x){
Output=cumsum((apply((prcomp(x,center = TRUE, scale. = TRUE))$x, 2, var))/sum(vars))
return(Output)}
window <- 10
data <- Ret1
result <- rollapply(data, window,PCA)
plot(result)
#Gives you the Proportion of Variance = cumsum((apply((prcomp(x,center = TRUE, scale. = TRUE))$x, 2, var))/sum(vars))
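First, the correct function for your purpose may be written as follows, using the $sdev result of prcomp. I have left out center = TRUE because it is the function default. Note that scale. defaults to FALSE, so the code below runs on unscaled data; pass scale. = TRUE to prcomp if you want the standardized PCA from your question.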
PCA <- function(x){
oo <- prcomp(x)$sdev
oo / sum(oo)
}
Now, we can easily use sapply to do rolling operation:
## for your mini dataset of 18 rows
window <- 10
n <- nrow(Ret1)
oo <- sapply(seq_len(n - window + 1), function (i) PCA(Ret1[i:(i + window - 1), ]))
oo <- t(oo) ## an extra transposition as `sapply` does `cbind`
# [,1] [,2] [,3] [,4]
# [1,] 0.5206345 0.3251099 0.12789683 0.02635877
# [2,] 0.5722264 0.2493518 0.14588631 0.03253553
# [3,] 0.6051199 0.1973694 0.16151859 0.03599217
# [4,] 0.5195527 0.2874197 0.16497219 0.02805543
# [5,] 0.5682829 0.3100708 0.09456654 0.02707977
# [6,] 0.5344804 0.3149862 0.08912882 0.06140464
# [7,] 0.5954948 0.2542775 0.10434155 0.04588616
# [8,] 0.5627977 0.2581071 0.13068875 0.04840648
# [9,] 0.6089650 0.2559285 0.11022974 0.02487672
Each column is a PC, while each row gives proportional variance for each component in that period.
To further plot the result, you can use matplot:
matplot(oo, type = "l", lty = 1, col = 1:4,
xlab = "period", ylab = "proportional variance")
PC1 to PC4 are drawn in colours 1:4, i.e., "black", "red", "green" and "blue".
Additional comments:
If you want to use zoo::rollapply, do
oo <- zoo::rollapply(Ret1, window, PCA, by.column = FALSE)
To be precise, I am reporting proportional standard deviation. If you really want proportional variance, change the PCA function to:
PCA <- function(x){
oo <- prcomp(x)$sdev ^ 2
oo / sum(oo)
}
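Putting the pieces together, here is a minimal self-contained sketch of the rolling computation in base R. It is not the code that produced the output above: the data are simulated (`ret` is a hypothetical stand-in for Ret1), `prop_var` is an illustrative helper name, and scale. = TRUE is passed explicitly to match the question.

```r
# Simulated stand-in for Ret1 (hypothetical data: 30 days x 4 series)
set.seed(1)
ret <- matrix(rnorm(30 * 4, sd = 0.01), ncol = 4,
              dimnames = list(NULL, c("Cats", "Dogs", "Human", "Frogs")))

# Proportion of variance per component for one window (illustrative helper)
prop_var <- function(x, scale. = TRUE) {
  v <- prcomp(x, center = TRUE, scale. = scale.)$sdev^2
  v / sum(v)
}

window <- 10
starts <- seq_len(nrow(ret) - window + 1)

# One row per window, one column per principal component
roll <- t(vapply(starts,
                 function(i) prop_var(ret[i:(i + window - 1), , drop = FALSE]),
                 numeric(ncol(ret))))
dimnames(roll) <- list(paste("Period", starts),
                       paste0("PC", seq_len(ncol(ret))))
head(roll, 3)
```

Each row sums to 1. With scale. = TRUE, prcomp stops with an error if a column is constant within a window (for example a run of zero returns over holidays), so such windows may need to be skipped or computed unscaled.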
I want to generate a vector using R.
Is there a way to generate a sequence of POSITIVE numbers that satisfy specific constraints
a mean of 13,
a standard deviation of 30.96 , and
a sample size of 6.
Thank you guys.
Another option (after echoing all the concerns about this being an XY problem):
We can transform n samples of practically any continuous distribution by finding a pair of scale and translation parameters that satisfies the desired constraints.
f <- function(n, mu, sigma) {
x <- rnorm(n) # substitute any continuous distribution here
fn <- function(par) {
y <- exp(x*par[2] + par[1])
log(abs(mean(y) - mu) + abs(sd(y) - sigma))
}
with(optim(c(0, 0), fn), exp(x*par[2] + par[1]))
}
# example usage
f(6L, 13, 30.96)
#> [1] 2.569263e-09 1.912637e-06 9.521086e-05 4.023787e-01 7.618698e+01 1.410541e+00
# take 20 sets of samples
m <- matrix(NA, 20, 8)
for (i in 1:nrow(m)) {
m[i, 1:6] <- sort(f(6L, 13, 30.96))
m[i, 7] <- mean(m[i, 1:6]) - 13 # difference from desired mean
m[i, 8] <- sd(m[i, 1:6]) - 30.96 # difference from desired SD
}
m
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 2.373175e-12 5.239577e-08 1.612212e-07 9.033024e-01 0.9063398 76.19036 -7.105427e-15 -1.776357e-14
#> [2,] 1.185131e-17 2.420683e-12 1.967461e-11 1.613739e-03 1.8189977 76.17939 -7.105427e-15 3.552714e-15
#> [3,] 4.178030e-03 6.449223e-03 3.329587e-02 7.889274e-02 1.6949839 76.18220 1.776357e-15 0.000000e+00
#> [4,] 1.134263e-19 8.582391e-13 3.675582e-12 7.909737e-06 1.8206428 76.17935 -1.776357e-15 -3.552714e-15
#> [5,] 3.875063e-11 5.453168e-06 1.005380e-05 7.971751e-02 1.7390532 76.18121 -3.552714e-15 0.000000e+00
#> [6,] 2.057442e-10 1.395120e-04 2.825930e-03 8.257653e-01 0.9809560 76.19031 1.065814e-14 0.000000e+00
#> [7,] 3.281160e-02 3.939408e-02 4.617334e-02 6.313399e-01 1.0591873 76.19109 -1.776357e-15 -1.776357e-14
#> [8,] 1.652719e-08 1.700647e-08 2.909621e-05 1.366993e-02 1.8066190 76.17968 -1.065814e-14 -1.065814e-14
#> [9,] 3.231989e-18 1.103402e-08 4.891128e-08 2.246015e-01 1.5912433 76.18416 1.776357e-15 0.000000e+00
#> [10,] 2.147959e-25 1.412579e-21 4.929303e-16 7.013902e-04 1.8199323 76.17937 0.000000e+00 -5.684342e-14
#> [11,] 8.239083e-08 1.516883e-05 4.268010e-01 6.580114e-01 0.7215226 76.19365 1.065814e-14 3.552714e-15
#> [12,] 8.837010e-05 7.983109e-04 3.712888e-03 9.311085e-03 1.8064017 76.17969 0.000000e+00 -7.105427e-15
#> [13,] 3.586152e-178 2.149918e-134 1.681243e-56 2.681863e-09 1.8206509 76.17935 -5.329071e-15 -9.592327e-14
#> [14,] 5.860182e-07 1.645025e-05 1.057840e-03 3.878328e-02 1.7798381 76.18030 -1.776357e-15 -3.552714e-15
#> [15,] 2.363474e-05 1.090204e-03 3.533081e-03 2.924378e-01 1.5174856 76.18543 0.000000e+00 1.421085e-14
#> [16,] 1.666746e-03 6.105670e-02 1.966066e-01 6.731973e-01 0.8746421 76.19283 1.776357e-15 0.000000e+00
#> [17,] 1.633101e-06 5.641356e-05 2.427083e-02 9.874914e-02 1.6947355 76.18219 -1.598721e-14 -7.105427e-15
#> [18,] 2.124617e-72 2.143486e-68 9.930707e-68 1.367184e-08 1.8206509 76.17935 2.842171e-14 -1.776357e-14
#> [19,] 5.846315e-19 1.528350e-06 2.030263e-06 7.959439e-04 1.8198318 76.17937 -1.776357e-15 -1.776357e-14
#> [20,] 5.630657e-03 1.462770e-01 2.914364e-01 6.119128e-01 0.7504916 76.19425 1.776357e-15 0.000000e+00
Inspired by @SamMason's comment, here is an empirical solution, though I am concerned that providing OP with an answer may actually do them a disservice, since I suspect this may well be an XY problem.
First, establish if it is possible to construct a set of six numbers that satisfy OP's constraints:
f <- function(x) {
y <- c(rep(x, 5), 6*13 - 5*x)
sd(y) - 30.96
}
> x0 <- uniroot(f, c(0, 1))$root
> x0
[1] 0.3606329
> y <- c(rep(x0, 5), 6*13 - 5*x0)
> y
[1] 0.3606329 0.3606329 0.3606329 0.3606329 0.3606329 76.1968355
> mean(y)
[1] 13
> sd(y)
[1] 30.96
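The root can also be checked in closed form. For n - 1 equal values x and one value n*mu - (n - 1)*x, the sample SD works out to sqrt(n) * (mu - x), so x = mu - s / sqrt(n). The sketch below only verifies that algebra numerically; the variable names are illustrative.

```r
mu <- 13      # target mean
s  <- 30.96   # target sample SD
n  <- 6       # sample size

# Closed-form common value for the first n - 1 elements
x0 <- mu - s / sqrt(n)

# Last element is fixed by the mean constraint
y0 <- c(rep(x0, n - 1), n * mu - (n - 1) * x0)

x0
mean(y0)
sd(y0)

# Within this construction, all values stay positive only if s < mu * sqrt(n)
mu * sqrt(n)
```

Here mu * sqrt(n) is about 31.84, so the requested SD of 30.96 is close to the largest value this construction allows with positive numbers.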
So, yes, it is possible to construct a set of six numbers that meets OP's constraints. Now introduce a (small) degree of variation into the first five elements of the sample, fix the sixth to satisfy the mean constraint and calculate the sample SD. If the SD isn't "close enough" to the target, throw the sample away and try again. Impose an upper limit to the number of tries to prevent infinite looping.
# For reproducibility
set.seed(1234)
f1 <- function(sd) {
y <- rnorm(5, mean=0.3606329, sd=sd)
y[6] <- 6*13 - sum(y)
y
}
findIt <- function(sd, epsilon=0.001, maxIter=1000) {
iteration <- 0
found <- FALSE
while(!found) {
z <- f1(sd)
if (abs(sd(z) - 30.96) < epsilon) {
found <- TRUE
return(z)
}
iteration <- iteration + 1
if (iteration == maxIter) {
warning(paste0("No solution found after ", maxIter, " iterations"))
return(NA)
}
}
}
z <- findIt(0.2)
> z
[1] 0.44505164 0.66907765 0.47566925 0.09247431 0.12141987 76.19630728
> mean(z)
[1] 13
> sd(z)
[1] 30.96053
Note that findIt() is sloppy in that it assumes that the SD is "small enough" that all generated values are positive.
This procedure generates a sample that is "random" in the sense that five of the values are iid N(0.3606329, sigma * sigma) and the sixth is (highly) correlated with the sum of the other five. The joint distribution of all six values in the sample is not obvious to me.
I can't imagine a scenario in which this is a sensible thing to do.
I am given an empirical distribution FXemp of a real-valued random variable X. Let X1, ..., Xn have the same distribution as X, with their dependence given by a copula C. I would like to produce random samples of the real-valued X1, ..., Xn.
E.g. I am given a vector of samples and the corresponding cdf
x <- rnorm(1000)
df <- ecdf(x)
Assume that I pick, for example, a Student t or Clayton copula C. How can I produce random samples of, say, 10 copies of x whose dependence is determined by C?
Is there an easy way?
Or are there any packages that can be used here?
You can sample from the copula (with uniform margins) by using the copula package, and then apply the inverse ecdf to each component:
library(copula)
x <- rnorm(100) # sample of X
d <- 5 # desired number of copies
copula <- claytonCopula(param = 2, dim = d)
nsims <- 25 # number of simulations
U <- rCopula(nsims, copula) # sample from the copula (with uniform margins)
# now sample the copies of X ####
Xs <- matrix(NA_real_, nrow = nsims, ncol = d)
for(i in 1:d){
Xs[,i] <- quantile(x, probs = U[,i], type = 1) # type=1 is the inverse ecdf
}
Xs
# [,1] [,2] [,3] [,4] [,5]
# [1,] -0.5692185 -0.9254869 -0.6821624 -1.2148041 -0.682162391
# [2,] -0.4680407 -0.4263257 -0.3456553 -0.6132320 -0.925486872
# [3,] -1.1322063 -1.2148041 -0.8115089 -1.0074435 -1.430405604
# [4,] 0.9760268 1.2600186 1.0731551 1.2369623 0.835024471
# [5,] -1.1280825 -0.8995429 -0.5761037 -0.8115089 -0.543125426
# [6,] -0.1848303 -1.2148041 -0.5692185 0.8974921 -0.613232036
# [7,] -0.5692185 -0.3070884 -0.8995429 -0.8115089 -0.007292346
# [8,] 0.1696306 0.4072428 0.7646646 0.4910863 1.236962330
# [9,] -0.7908557 -1.1280825 -1.2970952 0.3655081 -0.633521404
# [10,] -1.3226053 -1.0074435 -1.6857615 -1.3226053 -1.685761474
# [11,] -2.5410325 -2.3604936 -2.3604936 -2.3604936 -2.360493569
# [12,] -2.3604936 -2.2530003 -1.9311289 -2.2956444 -2.360493569
# [13,] 0.4072428 -0.2150035 -0.3564803 -0.1051930 -0.166434458
# [14,] -0.4680407 -1.0729763 -0.6335214 -0.8995429 -0.899542914
# [15,] -0.9143225 -0.1522242 0.4053462 -1.0729763 -0.158375658
# [16,] -0.4998761 -0.7908557 -0.9813504 -0.1763604 -0.283013334
# [17,] -1.2148041 -0.9143225 -0.5176347 -0.9143225 -1.007443492
# [18,] -0.2150035 0.5675260 0.5214050 0.8310799 0.464151265
# [19,] -1.2148041 -0.6132320 -1.2970952 -1.1685962 -1.132206305
# [20,] 1.4456635 1.0444720 0.7850181 1.0742214 0.785018119
# [21,] 0.3172811 1.2369623 -0.1664345 0.9440006 1.260018624
# [22,] 0.5017980 1.4068250 1.9950305 1.2600186 0.976026807
# [23,] 0.5675260 -1.0729763 -1.2970952 -0.3653535 -0.426325703
# [24,] -2.5410325 -2.2956444 -2.3604936 -2.2956444 -2.253000326
# [25,] 0.4053462 -0.5431254 -0.5431254 0.8350245 0.950891450
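For readers who want to see the mechanics without the copula package, here is a base-R sketch of the same two steps. It swaps rCopula for the Marshall-Olkin frailty construction of the Clayton copula (V ~ Gamma(1/theta, 1), E_i ~ Exp(1), U_i = (1 + E_i / V)^(-1/theta)); this is an alternative sampler, not what the package call above is claimed to do internally, and all variable names are illustrative.

```r
set.seed(42)
x     <- rnorm(1000)   # observed sample of X
d     <- 5             # number of copies
nsims <- 2000          # number of simulations
theta <- 2             # Clayton parameter

# Step 1: sample from the Clayton copula via a gamma frailty
V <- rgamma(nsims, shape = 1 / theta, rate = 1)
E <- matrix(rexp(nsims * d), nrow = nsims, ncol = d)
U <- (1 + E / V)^(-1 / theta)   # V is recycled down columns, so row i uses V[i]

# Step 2: push each uniform margin through the inverse ecdf of x
Xs <- apply(U, 2, function(u) quantile(x, probs = u, type = 1, names = FALSE))

# Sanity check: Kendall's tau for a Clayton copula is theta / (theta + 2)
cor(U[, 1], U[, 2], method = "kendall")
theta / (theta + 2)
```

Every entry of Xs is one of the observed values of x, because type = 1 inverts the empirical cdf without interpolating.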
I often use B-splines for regression. Up to now I've never needed to understand the output of bs in detail: I would just choose the model I was interested in, and fit it with lm. However, I now need to reproduce a b-spline model in an external (non-R) code. So, what's the meaning of the matrix generated by bs? Example:
library(splines)
x <- c(0.0, 11.0, 17.9, 49.3, 77.4)
bs(x, df = 3, degree = 1) # generate degree 1 (linear) B-splines with 2 internal knots
# 1 2 3
# [1,] 0.0000000 0.0000000 0.0000000
# [2,] 0.8270677 0.0000000 0.0000000
# [3,] 0.8198433 0.1801567 0.0000000
# [4,] 0.0000000 0.7286085 0.2713915
# [5,] 0.0000000 0.0000000 1.0000000
# attr(,"degree")
# [1] 1
# attr(,"knots")
# 33.33333% 66.66667%
# 13.30000 38.83333
# attr(,"Boundary.knots")
# [1] 0.0 77.4
# attr(,"intercept")
# [1] FALSE
# attr(,"class")
# [1] "bs" "basis" "matrix"
OK, so degree is 1, as I specified in the input. knots tells me that the two internal knots are at x = 13.3000 and x = 38.8333 respectively. I was a bit surprised to see that the knots are at fixed quantiles; I hoped R would find the best quantiles for my data, but of course that would make the model nonlinear, and it also wouldn't be possible without knowing the response data. intercept = FALSE means that no intercept was included in the basis (is that a good thing? I've always been taught not to fit linear models without an intercept... well, I guess lm is just adding one anyway).
However, what about the matrix? I don't really understand how to interpret it. With three columns, I would think it means that the basis functions are three. This makes sense: if I have two internal knots K1 and K2, I will have a spline between left boundary knot B1 and K1, another spline between K1 and K2, and a final one between K2 and B2, so...three basis functions, ok. But which are the basis functions exactly? For example, what does this column mean?
# 1
# [1,] 0.0000000
# [2,] 0.8270677
# [3,] 0.8198433
# [4,] 0.0000000
# [5,] 0.0000000
EDIT: this is similar to but not precisely the same as this question. That question asks about the interpretation of the regression coefficients, but I'm a step before that: I would like to understand the meaning of the model matrix coefficients. If I try to make the same plots as suggested in the first answer, I get a messed up plot:
b <- bs(x, df = 3, degree = 1)
b1 <- b[, 1] ## basis 1
b2 <- b[, 2] ## basis 2
b3 <- b[,3]
par(mfrow = c(1, 3))
plot(x, b1, type = "l", main = "basis 1: b1")
plot(x, b2, type = "l", main = "basis 2: b2")
plot(x, b3, type = "l", main = "basis 3: b3")
These can't be the B-spline basis functions, because they have too many knots (each function should only have one).
The second answer would actually allow me to reconstruct my model outside R, so I guess I could go with that. However, that answer also doesn't exactly explain what the elements of the b matrix are: it deals with the coefficients of a linear regression, which I haven't introduced here yet. It's true that this is my final goal, but I also wanted to understand this intermediate step.
The matrix b
# 1 2 3
# [1,] 0.0000000 0.0000000 0.0000000
# [2,] 0.8270677 0.0000000 0.0000000
# [3,] 0.8198433 0.1801567 0.0000000
# [4,] 0.0000000 0.7286085 0.2713915
# [5,] 0.0000000 0.0000000 1.0000000
is actually just the matrix of the values of the three basis functions in each point of x, which should have been obvious to me since it's exactly the same interpretation as for a polynomial linear model. As a matter of fact, since the boundary knots are
bknots <- attr(b,"Boundary.knots")
# [1] 0.0 77.4
and the internal knots are
iknots <- attr(b,"knots")
# 33.33333% 66.66667%
# 13.30000 38.83333
then the three basis functions, as shown here, are:
knots <- c(bknots[1],iknots,bknots[2])
y1 <- c(0,1,0,0)
y2 <- c(0,0,1,0)
y3 <- c(0,0,0,1)
par(mfrow = c(1, 3))
plot(knots, y1, type = "l", main = "basis 1: b1")
plot(knots, y2, type = "l", main = "basis 2: b2")
plot(knots, y3, type = "l", main = "basis 3: b3")
Now, consider b[,1]
# 1
# [1,] 0.0000000
# [2,] 0.8270677
# [3,] 0.8198433
# [4,] 0.0000000
# [5,] 0.0000000
These must be the values of b1 in x <- c(0.0, 11.0, 17.9, 49.3, 77.4). As a matter of fact, b1 is 0 in knots[1] = 0 and 1 in knots[2] = 13.3000, meaning that in x[2] (11.0) the value must be 11/13.3 = 0.8270677, as expected. Similarly, since b1 is 0 for knots[3] = 38.83333, the value in x[3] (17.9) must be (38.83333-13.3)/17.9 = 0.8198433. Since x[4], x[5] > knots[3] = 38.83333, b1 is 0 there. A similar interpretation can be given for the other two columns.
Just a small correction to the excellent answer by @DeltaIV above (it looks like I cannot comment).
So in b1, when he calculated b1(x[3]), it should be (38.83333-17.9)/(38.83333-13.3)=0.8198433 by linear interpolation. Everything else is perfect.
Note b1 should look like this
$$b_1(t) = \frac{t}{13.3}\, I(0 \le t < 13.3) + \frac{38.83333 - t}{38.83333 - 13.3}\, I(13.3 \le t < 38.83333)$$
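The piecewise formula can be checked against bs directly. The sketch below evaluates the hat function by hand, taking the knots from the attributes of the basis matrix rather than the rounded values; `b1_manual` is an illustrative name, not part of the splines package.

```r
library(splines)

x <- c(0.0, 11.0, 17.9, 49.3, 77.4)
b <- bs(x, df = 3, degree = 1)

k  <- unname(attr(b, "knots"))        # internal knots: 13.3 and 38.83333
lo <- attr(b, "Boundary.knots")[1]    # left boundary knot: 0

# First basis function: rises linearly from lo to k[1], falls from k[1] to k[2]
b1_manual <- function(t) {
  ifelse(t >= lo & t < k[1], (t - lo) / (k[1] - lo),
  ifelse(t >= k[1] & t < k[2], (k[2] - t) / (k[2] - k[1]), 0))
}

# The two columns should agree row by row
cbind(bs = as.numeric(b[, 1]), manual = b1_manual(x))
```

The same few lines are easy to port to a non-R code base, since they only need the knot positions.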
I have a small data set of locations and benzene concentrations in mg/kg
WELL.ID X Y BENZENE
1 MW-02 268.8155 282.83 0.00150
2 IW-06 271.6961 377.01 0.00050
3 IW-07 251.0236 300.41 0.01040
4 IW-08 278.9238 300.37 0.03190
5 MW-10 281.4008 414.15 2.04000
6 MW-12 391.3973 449.40 0.01350
7 MW-13 309.5307 335.55 0.01940
8 MW-15 372.8967 370.04 0.01620
9 MW-17 250.0000 428.04 0.01900
10 MW-24 424.4025 295.69 0.00780
11 MW-28 419.3205 250.00 0.00100
12 MW-29 352.9197 277.27 0.00031
13 MW-31 309.3174 370.92 0.17900
and I am trying to krig the values in a grid (the property these wells reside on) like so
setwd("C:/.....")
getwd()
require(geoR)
require(ggplot2)
a <- read.table("krigbenz_loc.csv", sep = ",", header = TRUE)
b <- data.matrix(a)
c <- as.geodata(b)
x.range <- as.integer(range(a[,2]))
y.range <- as.integer(range(a[,3]))
x = seq(from=x.range[1], to=x.range[2], by=1)
y = seq(from=y.range[1], to=y.range[2], by=1)
length(x)
length(y)
xv <- rep(x,length(y))
yv <- rep(y, each=length(x))
in_mat <- as.matrix(cbind(xv, yv))
This is where I start the kriging with
q <- ksline(c, cov.model="exp", cov.pars=c(10,3.33), nugget=0, locations=in_mat)
However, when looking at the output of this with
cbind(q$predict[1:10], q$krige.var[1:10])
I see
[,1] [,2]
[1,] 343.8958 10.91698
[2,] 343.8958 10.91698
[3,] 343.8958 10.91698
[4,] 343.8958 10.91698
[5,] 343.8958 10.91698
[6,] 343.8958 10.91698
[7,] 343.8958 10.91698
[8,] 343.8958 10.91698
[9,] 343.8958 10.91698
[10,] 343.8958 10.91698
These values do not change for the first 5000 rows (I can't view more because max.print = 5000; I am not sure how to change this either, but that is a tangent).
I realize that my
cov.pars = c(10,3.33)
which I take to be range and sill, are probably the issue.
The geoR manual (geoR.pdf, p. 19) describes what is expected from cov.pars; however, I am not sure how to decide what these covariance parameters should be.
Is there a method to find appropriate values from my existing data, or can I set these to generic values so that my output will be similar to a kriging performed in the Spatial Analyst package of ESRI's ArcGIS?
ZR
::::EDIT:::
My geodata object was improperly converted.
Here is the correct way to do this:
c <- as.geodata(b, coords.col = 2:3, data.col = 4)
Also, for the variogram:
v1 <- variog(c)
length(v1$n)
v1.summary <- cbind(c(1:11), v1$v, v1$n)
colnames(v1.summary) <- c("lag", "semi-variance", "# of pairs")
v1.summary
One way to do this is to use the variofit function (also in the geoR package) to estimate the covariance parameters. For example, using your data and initial values:
vario <- variog(c) # See other options here for binning, etc
# Note that the order of the cov.pars is variance, then range, (see your question)
fitted_model <- variofit(vario=vario, ini.cov.pars=c(10, 3.33), cov.model='exp')
q <- ksline(c, cov.model=fitted_model$cov.model, cov.pars=fitted_model$cov.pars,
nugget=fitted_model$nugget, locations=in_mat)
It is worth your time to look at the variogram, by the way.
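As a rough sanity check on initial values, it can help to compare the range parameter with the actual distances between wells. The sketch below uses only base R and the coordinates from the question, and assumes the exponential correlation exp(-h / phi) with cov.pars = c(sigma^2, phi); the variable names are illustrative.

```r
# Well coordinates copied from the question
wells <- data.frame(
  X = c(268.8155, 271.6961, 251.0236, 278.9238, 281.4008, 391.3973, 309.5307,
        372.8967, 250.0000, 424.4025, 419.3205, 352.9197, 309.3174),
  Y = c(282.83, 377.01, 300.41, 300.37, 414.15, 449.40, 335.55,
        370.04, 428.04, 295.69, 250.00, 277.27, 370.92)
)

h   <- dist(wells)   # all pairwise distances between wells
phi <- 3.33          # range parameter used in the question

summary(as.numeric(h))

# Correlation implied by the exponential model at the smallest separation
exp(-min(h) / phi)
```

With phi = 3.33 even the two closest wells are almost uncorrelated under this model, so predictions away from the wells would be expected to collapse toward a constant. A fitted range on the order of the inter-well distances is more plausible, which is what variofit estimates from the data.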
I am trying to get the Hessian matrix from my own data, and I have two results:
using the function hessian from library(numDeriv)
using the function numericHessian from library(maxLik)
The result from hessian is very small relative to the result from numericHessian.
In this case, which result should I trust?
Specifically, the data I used ranged from 350000 to 1100000 and formed a 9x2 matrix with a total of 18 values.
I used a sort of standard deviation formula, and the 2x2 result from numericHessian ranged from 230 to 466, whereas the result from hessian ranged from -3.42e-18 to 1.34e-17, which is much smaller.
Which one do you think is the correct calculation for this sort of standard deviation?
The code is as follows:
data <- read.table("C:/file.txt", header = TRUE)
data <- as.matrix(data)
library(plyr)
library(MASS)
w1 = tail(data/(rowSums(data)),1)
w2 = t(w1)
f <- function(x){
w1 = tail(x/(rowSums(x)),1)
w2 = t(w1)
r = ((w1%*%cov(cbind(x))%*%w2)^(1/2))
return(r)
}
library(maxLik);
numericHessian(f, t0=rbind(data[1,1], data[1,2]))
library(numDeriv);
hessian(f, rbind(data[1,1], data[1,2]), method="Richardson")
The file.txt is the following:
1 2
137 201
122 342
142 111
171 126
134 123
823 876
634 135
541 214
423 142
The result from numericHessian is:
[,1] [,2]
[1,] 0.007105427 0.007105427
[2,] 0.007105427 0.000000000
The result from hessian is:
[,1] [,2]
[1,] -3.217880e-15 -1.957243e-16
[2,] -1.957243e-16 1.334057e-16
Thank you very much in advance.
You have not given a reproducible example, but I'll try anyway.
library(bbmle)
x <- 0:10
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
d <- data.frame(x,y)
LL <- function(ymax=15, xhalf=6)
-sum(stats::dpois(y, lambda=ymax/(1+x/xhalf), log=TRUE))
fit <- mle2(LL)
cc <- coef(fit)
Here are the finite-difference estimates of the Hessians (matrices of second derivatives) of the negative log-likelihood function at the MLE: inverting these matrices gives an estimate of the variance-covariance matrices of the parameters.
library(numDeriv)
hessian(LL,cc)
## [,1] [,2]
## [1,] 1.296717e-01 -1.185789e-15
## [2,] -1.185789e-15 4.922087e+00
library(maxLik)
numericHessian(LL, t0=cc)
## [,1] [,2]
## [1,] 0.1278977 0.000000
## [2,] 0.0000000 4.916956
So for this relatively trivial example, numDeriv::hessian and maxLik::numericHessian give very similar results. So there must be something you haven't shown us, or something special about the numerics of your problem. In order to proceed further, we need a reproducible example please ...
dat <- matrix(c(137,201,122,342,142,111,
171,126,134,123,823,876,
634,135,541,214,423,142),
byrow=TRUE,ncol=2)
f <- function(x){
w1 <- tail(x/(rowSums(x)),1)
sqrt(w1%*%cov(cbind(x))%*%t(w1))
}
p <- t(dat[1,1:2,drop=FALSE])
f(p) ## 45.25483
numDeriv::hessian(f,p)
## [,1] [,2]
## [1,] -3.217880e-15 -1.957243e-16
## [2,] -1.957243e-16 1.334057e-16
maxLik::numericHessian(f,t0=p)
## [,1] [,2]
## [1,] 0.007105427 0.007105427
## [2,] 0.007105427 0.000000000
OK, these clearly disagree. I'm not sure why, but in this particular case we can analyze what you're doing and see which one is right:
since your input matrix is a single column, x/rowSums(x) is a vector of ones, so the last element (w1 <- tail(...,1)) is just 1.
so your expression reduces to sqrt(cov(cbind(x))). Again, since x is a one-column matrix, cov() is just the variance, and sqrt(cov(.)) is just the standard deviation, i.e., the norm of the centered vector divided by sqrt(n - 1).
the variance is a quadratic function of any element's deviation from the mean, and so the standard deviation is more or less linear in the deviation from the mean (except at zero), so we would expect the second derivatives to be zero. So it looks like numDeriv::hessian is giving the right answer
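For two values the argument above is exact: sd(c(x1, x2)) equals |x1 - x2| / sqrt(2), which is linear away from x1 = x2. The base-R sketch below checks this with a plain central second difference; `second_diff` is an illustrative helper, not a function from numDeriv or maxLik.

```r
p <- c(137, 201)

# Closed form for two points versus sd()
sd(p)
abs(p[1] - p[2]) / sqrt(2)

# Central second difference of f along coordinate i with step h
second_diff <- function(f, p, i, h = 1) {
  e <- replace(numeric(length(p)), i, h)
  (f(p + e) - 2 * f(p) + f(p - e)) / h^2
}

# Diagonal Hessian entries: zero up to rounding error
c(second_diff(sd, p, 1), second_diff(sd, p, 2))
```

A large step is harmless here precisely because the function is linear, so the only error left is floating-point rounding.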
We can also confirm this by increasing eps for numericHessian:
maxLik::numericHessian(f,t0=p,eps=1e-3)
## [,1] [,2]
## [1,] 0 0.000000e+00
## [2,] 0 -7.105427e-09
The bottom line is that numDeriv uses a more accurate (but slower) method, but you can get reasonable answers from numericHessian if you're careful.
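One plausible explanation for the nonzero numericHessian entries, offered as an assumption rather than a statement about maxLik's internals: if the default step is eps = 1e-6 and the second difference is divided by eps^2, then a rounding error of a single unit in the last place of f(p) = 45.25 is enough to produce the values seen above.

```r
f_p <- 45.25483

# Spacing between adjacent doubles near f(p): 2^-47 for values in [32, 64)
ulp <- 2^(floor(log2(f_p)) - 52)
ulp

# One ulp of rounding error divided by eps^2
ulp / (1e-6)^2   # compare with 0.007105427 in the first output
ulp / (1e-3)^2   # compare with 7.105427e-09 in magnitude after increasing eps
```

Both magnitudes match the reported entries, which is consistent with the true second derivatives being zero and the finite-difference noise shrinking as eps grows.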