VECM in R: Testing weak exogeneity and imposing restrictions

I estimated a VECM and would like to make 4 separate tests of weak exogeneity, one for each variable.
library(urca)
library(vars)
data(Canada)
e prod rw U
1980 Q1 929.6105 405.3665 386.1361 7.53
1980 Q2 929.8040 404.6398 388.1358 7.70
1980 Q3 930.3184 403.8149 390.5401 7.47
1980 Q4 931.4277 404.2158 393.9638 7.27
1981 Q1 932.6620 405.0467 396.7647 7.37
1981 Q2 933.5509 404.4167 400.0217 7.13
...
jt = ca.jo(Canada, type = "trace", ecdet = "const", K = 2, spec = "transitory")
t = cajorls(jt, r = 1)
t$rlm$coefficients
e.d prod.d rw.d U.d
ect1 -0.005972228 0.004658649 -0.10607044 -0.02190508
e.dl1 0.812608320 -0.063226620 -0.36178542 -0.60482042
prod.dl1 0.208945048 0.275454380 -0.08418285 -0.09031236
rw.dl1 -0.045040603 0.094392696 -0.05462048 -0.01443323
U.dl1 0.218358784 -0.538972799 0.24391761 -0.16978208
t$beta
ect1
e.l1 1.00000000
prod.l1 0.08536852
rw.l1 -0.14261822
U.l1 4.28476955
constant -967.81673980
I guess that my equations are:
and I would like to test whether alpha_e, alpha_prod, alpha_rw and alpha_U (marked in red in the picture above) are zero and impose the necessary restrictions on my model. So my question is: how can I do it?
I guess that my estimated alphas are:
e.d prod.d rw.d U.d
ect1 -0.005972228 0.004658649 -0.10607044 -0.02190508
I guess that I should use the alrtest function from the urca package:
alrtest(z = jt, A = A1, r = 1)
and probably my A matrix for alpha_e should be like this:
A1 = matrix(c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1),
nrow = 4, ncol = 3, byrow = TRUE)
The results of the test:
jt1 = alrtest(z = jt, A = A1, r = 1)
summary(jt1)
The value of the likelihood ratio test statistic:
0.48 distributed as chi square with 1 df.
The p-value of the test statistic is: 0.49
Eigenvectors, normalised to first column
of the restricted VAR:
[,1]
RK.e.l1 1.0000
RK.prod.l1 0.1352
RK.rw.l1 -0.1937
RK.U.l1 3.9760
RK.constant -960.2126
Weights W of the restricted VAR:
[,1]
[1,] 0.0000
[2,] 0.0084
[3,] -0.1342
[4,] -0.0315
Which I guess means that I can't reject my hypothesis of weak exogeneity for e (alpha_e = 0). And my new alphas here are: 0.0000, 0.0084, -0.1342, -0.0315.
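As a side check (using the statistic and degrees of freedom printed by summary(jt1)), the reported p-value can be reproduced directly from the chi-square distribution:

```r
# LR statistic 0.48 with 1 df, as printed by summary(jt1)
lr_stat <- 0.48
p_value <- 1 - pchisq(lr_stat, df = 1)
round(p_value, 2)
# [1] 0.49
```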
Now the question is how can I impose this restriction on my VECM model?
If I do:
t1 = cajorls(jt1, r = 1)
t1$rlm$coefficients
e.d prod.d rw.d U.d
ect1 -0.005754775 0.007717881 -0.13282970 -0.02848404
e.dl1 0.830418381 -0.049601229 -0.30644063 -0.60236338
prod.dl1 0.207857861 0.272499006 -0.06742147 -0.08561076
rw.dl1 -0.037677197 0.102991919 -0.05986655 -0.02019326
U.dl1 0.231855899 -0.530897862 0.30720652 -0.16277775
t1$beta
ect1
e.l1 1.0000000
prod.l1 0.1351633
rw.l1 -0.1936612
U.l1 3.9759842
constant -960.2126150
the new model doesn't have 0.0000, 0.0084, -0.1342, -0.0315 as alphas; it has -0.005754775, 0.007717881, -0.13282970, -0.02848404 instead.
How can I get a reestimated model with alpha_e = 0? I want a reestimated model with alpha_e = 0 because I would like to use it for predictions (vecm -> vec2var -> predict, but vec2var doesn't accept jt1 directly). And in general, are the calculations I made correct?
Just for illustration, in EViews imposing a restriction on alpha looks like this (not for this example):

If you have 1 cointegrating relationship (r = 1), as in t = cajorls(jt, r = 1),
your loading matrix cannot have 4 rows and 3 columns:
A1 = matrix(c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1),
nrow = 4, ncol = 3, byrow = TRUE)
Matrix A can only have 4 rows and 1 column, if you have 4 variables and 1 cointegrating relationship.

Related

Principal Component Analysis in R by hand

The question is about Principal Component Analysis, partly done by hand.
Disclaimer: My background is not in maths and I am using R for the first time.
Given are the following five data points in R^3, where xi1 to xi3 are variables and x1 to x5 are observations.
| x1 x2 x3 x4 x5
----------------------
xi1 | -2 -2 0 2 2
xi2 | -2 2 0 -2 2
xi3 | -4 0 0 0 4
Three principal component vectors after the principal component analysis has been performed are given, and look like this:
Phi1 = (0.41, 0.41, 0.82)T
Phi2 = (-0.71, 0.71, 0.00)T
Phi3 = (0.58, 0.58, -0.58)T
The questions are as follows
1) Calculate the principal component scores zi1, zi2 and zi3 for each of the 5 data points.
2) Calculate the proportion of the variance explained by each principal component.
So far I have answered question 1 with the following code, where Z represents the scores:
A = matrix(
c(-2, -2, 0, 2, 2, -2, 2, 0, -2, 2, -4, 0, 0, 0, 4),
nrow = 3,
ncol = 5,
byrow = TRUE
)
Phi = matrix(
c(0.41, -0.71, 0.58,0.41, 0.71, 0.58, 0.82, 0.00, -0.58),
nrow = 3,
ncol = 3,
byrow = FALSE
)
Z = Phi%*%A
Now I am stuck on question 2. I am given the formula:
But I am not sure how I can recreate the formula with an R command, can anyone help me?
#Here is the numerator (%>% comes from the magrittr package):
library(magrittr)
(Phi%*%A)^2 %>% rowSums()
[1] 48.4128 16.1312 0.0000
#Here is the denominator:
sum(A^2)
[1] 64
#So the answer is:
(Phi%*%A)^2%>%rowSums()/sum(A^2)
[1] 0.75645 0.25205 0.00000
we can verify with prcomp+summary:
summary(prcomp(t(A)))
Importance of components:
PC1 PC2 PC3
Standard deviation 3.464 2.00 0
Proportion of Variance 0.750 0.25 0
Cumulative Proportion 0.750 1.00 1
This is roughly the same since your $\Phi$ is rounded to two decimals.
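For readers without magrittr loaded, the same proportion-of-variance computation can be written in base R, using the data and rounded loadings from the question:

```r
# Data: rows are the three variables, columns the five observations
A <- matrix(c(-2, -2, 0,  2, 2,
              -2,  2, 0, -2, 2,
              -4,  0, 0,  0, 4), nrow = 3, byrow = TRUE)
# Rounded principal component loadings (rows are Phi1, Phi2, Phi3)
Phi <- matrix(c( 0.41, 0.41,  0.82,
                -0.71, 0.71,  0.00,
                 0.58, 0.58, -0.58), nrow = 3, byrow = TRUE)
# Proportion of variance explained by each component
pve <- rowSums((Phi %*% A)^2) / sum(A^2)
round(pve, 5)
# [1] 0.75645 0.25205 0.00000
```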

Quadratic optimization - portfolio maximization problems

In portfolio analysis, given a target expected return, we aim to find the weight of each asset that minimizes the variance.
Here is the code:
install.packages("quadprog")
library(quadprog)
#Denoting annualized risk as a vector sigma
sigma <- c(0.56, 7.77, 13.48, 16.64)
#Formulating the correlation matrix proposed by the question
m <- diag(0.5, nrow = 4, ncol = 4)
m[upper.tri(m)] <- c(-0.07, -0.095, 0.959, -0.095, 0.936, 0.997)
corr <- m + t(m)
sig <- corr * outer(sigma, sigma)
#Defining the mean
mu = matrix(c(1.73, 6.65, 9.11, 10.30), nrow = 4)
m0 = 8
Amat <- t(matrix(c(1, 1, 1, 1,
c(mu),
1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, 0,
0, 0, 0, 1), 6, 4, byrow = TRUE))
bvec <- c(1, m0, 0, 0, 0, 0)
qp <- solve.QP(sig, rep(0, nrow(sig)), Amat, bvec, meq = 2)
qp
x = matrix(qp$solution)
x
(t(x) %*% sig %*% x)^0.5
I understand the formulation of mu and the covariance matrix and know how to use the quadprog package.
However, I don't understand why Amat and bvec are defined in this way, and why Amat is built from a 6-by-4 matrix.
m0 is the expected return we aim to have for the portfolio, and it is fixed at 8%.
Attached is the question
As you are probably aware, the reason that Amat has four rows is that there are four assets that you are allocating over. It has six columns because there are six constraints in your problem:
The allocations add up to 1 (100%)
Expected return = 8%
'Money market' allocation >= 0
'Capital stable' allocation >= 0
'Balance' allocation >= 0
'Growth' allocation >= 0
Look at the numbers that define each constraint. They are why bvec is [1, 8, 0, 0, 0, 0]. Of these six, the first two are equality constraints, which is why meq is set to 2 (the other four are greater than or equal constraints).
Edited to add:
The way the constraints work is this: each column of Amat defines a constraint, which is then multiplied by the asset allocations, with the result equal to (or greater-than-or-equal-to) some target that is set in bvec. For example:
The first column of Amat is [1, 1, 1, 1], and the first entry of bvec is 1. So the first constraint is:
1 * money_market + 1 * capital_stable + 1 * balance + 1 * growth = 1
This is a way of saying that the asset allocations add up to 1.
The second constraint says that the portfolio's expected return equals 8:
1.73 * money_market + 6.65 * capital_stable + 9.11 * balance + 10.30 * growth = 8
Now consider the third constraint, which says that the 'Money market' allocation is greater than or equal to zero. That's because the 3rd column of Amat is [1, 0, 0, 0] and the third entry of bvec is 0. So this constraint looks like:
1 * money_market + 0 * capital_stable + 0 * balance + 0 * growth >= 0
Simplifying, that's the same as:
money_market >= 0
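To see the mechanics concretely, here is a small base-R check (with a hypothetical, non-optimal weight vector) showing how each column of Amat multiplied by the allocation produces the left-hand side of one constraint:

```r
mu <- c(1.73, 6.65, 9.11, 10.30)
# Same construction as in the question: 6 constraints stacked as rows, then transposed
Amat <- t(matrix(c(1, 1, 1, 1,
                   mu,
                   1, 0, 0, 0,
                   0, 1, 0, 0,
                   0, 0, 1, 0,
                   0, 0, 0, 1), 6, 4, byrow = TRUE))
bvec <- c(1, 8, 0, 0, 0, 0)
w <- c(0.25, 0.25, 0.25, 0.25)  # hypothetical equal weights, not the optimum
lhs <- drop(t(Amat) %*% w)      # one value per constraint
lhs
# [1] 1.0000 6.9475 0.2500 0.2500 0.2500 0.2500
```

At a feasible point, solve.QP with meq = 2 forces lhs[1:2] to equal bvec[1:2] exactly and requires the remaining entries to be at least bvec[3:6].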

R solve.QP tracking error minimization constraints inconsistent

I am struggling with solve.QP to get a solution that minimizes tracking error. I have a benchmark consisting of 6 assets (asset_a to asset_f). For my portfolio I have upper and lower bounds (I cannot have a position in asset_f). The cov matrix is also given. I want to get the portfolio weights for the 6 assets that minimize tracking error vs the benchmark (with the position in asset_f equal to zero).
benchmark:
asset_a: 0.3
asset_b: 0.3
asset_c: 0.1
asset_d: 0.1
asset_e: 0.1
asset_f: 0.1
lowerbounds:
asset_a: 0.166
asset_b: 0.133
asset_c: 0.037
asset_d: 0.035
asset_e: 0.039
asset_f: 0
upperbounds:
asset_a: 1
asset_b: 1
asset_c: 1
asset_d: 1
asset_e: 1
asset_f: 0
benchmark weights and bounds:
test.benchmark_weights = c(0.3, 0.3, 0.1, 0.1, 0.1, 0.1)
test.lowerbound = c(0.166, 0.133, 0.037, 0.035, 0.039,0)
test.upperbound = c(1, 1, 1, 1, 1, 0)
cov matrix (test.Dmat):
test.dmat = matrix(c(0.0119127162, 0.010862842, 0.010266683, 0.0009550136, 0.008242322, 0.00964462,
                     0.0108628421, 0.010603072, 0.009872992, 0.0011019412, 0.007422522, 0.0092528873,
                     0.0102666826, 0.009872992, 0.010487808, 0.0012107665, 0.006489204, 0.0096216627,
                     0.0009550136, 0.001101941, 0.001210766, 0.0115527788, 0.001181745, 0.0008387247,
                     0.0082423222, 0.007422522, 0.006489204, 0.0011817453, 0.012920482, 0.005973886,
                     0.00964462, 0.009252887, 0.009621663, 0.0008387247, 0.005973886, 0.0089904809),
                   nrow=6, ncol=6)
dvec (test.dvec):
test.dvec = matrix(c(0, 0, 0, 0, 0, 0), nrow=6, ncol=1)
Amat constraints matrix (test.Amat):
test.amat = matrix(c(1,1,1,1,1,1,
                     1,1,1,1,1,0,
                     -1,0,0,0,0,0,
                     0,-1,0,0,0,0,
                     0,0,-1,0,0,0,
                     0,0,0,-1,0,0,
                     0,0,0,0,-1,0,
                     0,0,0,0,0,-1,
                     1,0,0,0,0,0,
                     0,1,0,0,0,0,
                     0,0,1,0,0,0,
                     0,0,0,1,0,0,
                     0,0,0,0,1,0,
                     0,0,0,0,0,0,
                     -1,0,0,0,0,0,
                     0,-1,0,0,0,0,
                     0,0,-1,0,0,0,
                     0,0,0,-1,0,0,
                     0,0,0,0,-1,0,
                     0,0,0,0,0,0), nrow=6, ncol=20)
bvec (test.bvec)
test.bvec = cbind(0, 1, t(test.benchmark_weights), t(test.lowerbound), -t(test.upperbound)) %>% as.matrix()
then running the solver
solve.QP(as.matrix(test.Dmat), test.dvec, test.Amat, test.bvec)
gives me
constraints are inconsistent, no solution!
It seems like there is something wrong with your Amat and bvec: you do not need to pass both "sum of weights on the first 5 assets equals 1" and "sum of all 6 assets equals 1", and the benchmark weights are not constraints, but the bounds are:
library(quadprog)
N = 6L
test.dvec = rep(0, N)
test.amat = cbind(
rep(1, N),
diag(1, N),
diag(-1, N))
test.bvec = c(1, test.lowerbound, -test.upperbound)
res = solve.QP(test.dmat, test.dvec, test.amat, test.bvec, meq=1L)
round(res$solution, 2)
#[1] 0.17 0.13 0.10 0.44 0.17 0.00
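As a quick structural check (base R only, no solver needed), the Amat built this way has one column per constraint: 1 budget constraint plus N lower bounds plus N upper bounds, matched one-to-one by the entries of bvec:

```r
N <- 6L
test.lowerbound <- c(0.166, 0.133, 0.037, 0.035, 0.039, 0)
test.upperbound <- c(1, 1, 1, 1, 1, 0)
# Columns: full-investment constraint, then N lower bounds, then N upper bounds
test.amat <- cbind(rep(1, N), diag(1, N), diag(-1, N))
test.bvec <- c(1, test.lowerbound, -test.upperbound)
dim(test.amat)    # 6 rows (assets) x 13 columns (constraints)
length(test.bvec) # 13, one target per constraint
```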

Cointegration analysis in R: How do I get the relevant information from `urca::cajorls`?

Consider cajorls from the urca package in R. It estimates the VEC model given a ca.jo object. How can I find the loading matrix alpha from the output of cajorls? Beta and the other parameters are easy to find, but I can't find the loading matrix.
The code below is taken from a textbook. Can you help identify the loading matrix by adding to this piece of code?
library(urca)
set.seed(1234)
n = 250
e1 = rnorm(n, 0, 0.5)
e2 = rnorm(n, 0, 0.5)
e3 = rnorm(n, 0, 0.5)
u1.ar1 = arima.sim(model = list(ar = 0.75), innov = e1, n = n)
u2.ar1 = arima.sim(model = list(ar = 0.3), innov = e2, n = n)
y3 = cumsum(e3)
y1 = 0.8*y3 + u1.ar1
y2 = -0.3*y3 + u2.ar1
y.mat = data.frame(y1,y2,y3)
plot(ts(y.mat))
vecm = ca.jo(y.mat)
jo.results = summary(vecm)
print(jo.results )
# reestimated
vecm.r2 = cajorls(vecm, r = 2)
summary(vecm.r2)
Maybe I should perform the operations on my own?
I ran your script and found this:
print(jo.results)
######################
# Johansen-Procedure #
######################
Test type: maximal eigenvalue statistic (lambda max) , with linear trend
Eigenvalues (lambda):
[1] 0.285347239 0.127915199 0.006887218
Values of teststatistic and critical values of test:
test 10pct 5pct 1pct
r <= 2 | 1.71 6.50 8.18 11.65
r <= 1 | 33.94 12.91 14.90 19.19
r = 0 | 83.32 18.90 21.07 25.75
Eigenvectors, normalised to first column:
(These are the cointegration relations)
y1.l2 y2.l2 y3.l2
y1.l2 1.00000 1.00000000 1.0000000
y2.l2 -43.55337 -0.07138149 0.0528435
y3.l2 -13.58606 -0.73018096 -3.4121605
Weights W:
(This is the loading matrix)
y1.l2 y2.l2 y3.l2
y1.d -0.0007084809 -0.27450042 2.250788e-03
y2.d 0.0174625514 0.03598729 7.150656e-05
y3.d -0.0030589216 -0.02899838 3.086942e-03
Doesn't it say "Weights W: (This is the loading matrix)"?
Or are you looking for something else?
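If you want alpha from the cajorls result itself rather than from the ca.jo summary, it sits in the "ect" rows of the OLS coefficient matrix (vecm.r2$rlm$coefficients). A minimal base-R sketch with a stand-in matrix shaped like that output (the row names are assumptions based on the printed coefficients above):

```r
# Stand-in for coef(vecm.r2$rlm) with r = 2: rows ect1, ect2 hold t(alpha),
# later rows hold the short-run (lagged difference) coefficients
coefs <- rbind(ect1   = c(-0.0007, -0.2745, 0.0023),
               ect2   = c( 0.0175,  0.0360, 0.0001),
               y1.dl1 = c( 0.10,    0.20,   0.30))
# Transpose the ect rows to get the p x r loading matrix alpha
alpha <- t(coefs[grep("^ect", rownames(coefs)), , drop = FALSE])
dim(alpha)
# [1] 3 2
```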

How do I perform koyck lag transformations in PMML?

I'm using PMML to transfer my models (that I develop in R) between different platforms. One issue I often face is that, given input data, I need to do a lot of pre-processing. Most of the time this is rather straightforward in PMML, but I cannot figure out how to do it when I need a Koyck lag transformation. The first few lines of the input data set look like this:
Y Z S Xa Xb Xc
1 11.37738 1 0.8414710 0.0 0.0 581102.6
2 21.29848 2 0.9092974 700254.1 0.0 35695.1
3 14.30348 3 0.1411200 0.0 384556.3 0.0
4 18.07305 4 0.0000000 413643.2 0.0 0.0
5 29.02756 5 0.0000000 604453.3 0.0 350888.2
6 20.73336 6 0.0000000 0.0 0.0 168961.2
and is generated by:
df<-structure(list(Y = c(11.3773828021943, 21.2984762226498, 14.3034834956969,
18.0730530464578, 29.0275566937015, 20.7333617643781, 30.9707039948106,
30.2428379202751, 22.1677291047936, 19.7450403054104, 18.4642890388219,
28.4145184014117, 28.5224574661743, 40.5073319897728, 40.8853498146471,
20.7173713186907, 35.8080372291603, 37.6213598048788, 38.3123458040493,
25.143519382411),
Z = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
S = c(0.841470984807897, 0.909297426825682, 0.141120008059867,
0, 0, 0, 0.656986598718789, 0.989358246623382,
0.412118485241757, 0, 0, 0, 0.420167036826641, 0.99060735569487,
0.650287840157117, 0, 0, 0, 0.149877209662952, 0.912945250727628),
Xa = c(0, 700254.133201206, 0, 413643.212229974, 604453.339408554,
0, 623209.174415675, 1042574.05046884, 0, 0, 397257.053501325,
441408.09060313, 0, 0, 597980.888163467, 0, 121672.230528635,
199542.274825303, 447951.083632432, 84751.5842557032),
Xb = c(0, 0, 384556.309344495, 0, 0, 0, 0, 0, 0, 0, 0,
179488.805498654, 31956.7161910341, 785611.676606721,
65452.7295721654, 0, 231214.563631705, 0, 0,
176249.685091327),
Xc = c(581102.615208462, 35695.0974169599, 0, 0, 350888.245086195,
168961.239749307, 458076.400377529, 218707.589596171,
0, 506676.223324812, 0, 25613.8139087091, 429615.016105429,
410675.885159107, 0, 229898.803944166, 2727.64268459058,
711726.797796325, 354985.810664457, 0)),
.Names = c("Y", "Z", "S", "Xa", "Xb", "Xc"),
row.names = c(NA, -20L),
class = "data.frame")
I want to create a new variable M using koyck lags of the variables Xa, Xb and Xc like this:
lagIt <- function(x, d, ia = mean(x))
{
  # Koyck (geometric) lag: y[t] = x[t] + d * y[t-1],
  # with the first value adjusted by an initial condition ia * d
  y <- x
  y[1] <- y[1] + ia * d
  for (i in 2:length(x)) y[i] <- y[i] + y[i - 1] * d
  y
}
df2<-transform(df, M=(lagIt(tanh(Xa/300000), 0.5) +
lagIt(tanh(Xb/100000), 0.7) + lagIt(tanh(Xc/400000), 0.3)))
> head(df2)
# Y Z S Xa Xb Xc M
# 1 11.37738 1 0.8414710 0.0 0.0 581102.6 1.460318
# 2 21.29848 2 0.9092974 700254.1 0.0 35695.1 1.637388
# 3 14.30348 3 0.1411200 0.0 384556.3 0.0 1.767136
# 4 18.07305 4 0.0000000 413643.2 0.0 0.0 1.960151
# 5 29.02756 5 0.0000000 604453.3 0.0 350888.2 2.796750
# 6 20.73336 6 0.0000000 0.0 0.0 168961.2 1.761774
and finally build a model:
fit<-lm(Y~Z+S+M, data=df2)
Using the pmml library in R I can get the PMML XML output like this:
library(pmml)
pmml(fit)
However, I want to include a section of where the creation of the variable M takes place. How can I write that section conforming to PMML? Again the input data is the df data.frame and I want all pre-processing of data to be defined in PMML.
PMML operates on single-valued data records, but you're trying to use vector-valued data records. Most certainly, you cannot do (for-)loops in PMML.
Depending on your deployment platform, you might be able to use extension functions. Basically, this involves 1) programming Koyck lag transformation, 2) turning it into a standalone extension library and 3) making the PMML engine aware of this extension library. This extension function can be called by name just like all other built-in and user-defined functions.
The above should be doable using the JPMML library.
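If extension functions are not an option, one possible workaround (an approximation, not the author's method) is to truncate the Koyck recursion y[t] = x[t] + d * y[t-1] into a fixed number of explicit lags, y[t] ≈ sum over k of d^k * x[t-k], since a finite set of lag columns can be expressed as ordinary PMML DerivedField transformations. A sketch in R:

```r
# Truncated Koyck transform: geometric sum over K explicit lags
# (note: this ignores lagIt's initial-condition adjustment ia * d)
koyckApprox <- function(x, d, K = 10) {
  n <- length(x)
  y <- numeric(n)
  for (k in 0:K) {
    lagged <- c(rep(0, k), x)[seq_len(n)]  # x shifted down by k, zero-padded
    y <- y + d^k * lagged
  }
  y
}
koyckApprox(c(1, 0, 0, 0), d = 0.5)
# [1] 1.000 0.500 0.250 0.125
```

Because the truncation drops terms of size d^(K+1) and smaller, a modest K gives a close match for |d| < 1.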