Passing hypothesis in wald.test in R

Please forgive me as I am new to this forum. The study requires checking whether the sum of coefficients equals zero. In EViews the test can be written as c(2)+c(3)+c(4)=0, where c(2) is the coefficient of the 2nd term and so forth. The code for the same in R is
require(Hmisc)  # used to generate lags
require(aod)    # used to conduct the Wald test
output <- lm(formula = s_dep ~ m_dep + Lag(m_dep, -1) + Lag(m_dep, -2) + s_rtn, data = qs_eq_comm)
wald.test(b = coef(object = output), Sigma = vcov(object = output), Terms = 2:4, H0 = 2 + 3 + 4)
#H0=2+3+4 checks if the sum is zero
This gives the error: "Error in wald.test(b = coef(object = output), Sigma = vcov(object = output), : Vectors of tested coefficients and of null hypothesis have different lengths." Per the aod package, the documentation specifies the format as
wald.test(Sigma, b, Terms = NULL, L = NULL, H0 = NULL,df = NULL, verbose = FALSE)
Please help to conduct this test.

In the wald.test function, either Terms or L can be passed as the parameter. L is a matrix conformable to b, such that its product with b, i.e. L %*% b, gives the linear combinations of the coefficients to be tested. Create the L matrix first and then conduct the Wald test.
l <- cbind(0, 1, 1, 1, 0)
wald.test(b = coef(object = output), Sigma = vcov(object = output), L = l)
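If you want to check what the L matrix is doing, here is a minimal sketch (assuming the output object fitted above) that reproduces the same single-restriction Wald statistic by hand:
l <- cbind(0, 1, 1, 1, 0)                        # selects coefficients 2, 3 and 4
est <- as.numeric(l %*% coef(output))            # estimated sum of the three coefficients
se <- sqrt(as.numeric(l %*% vcov(output) %*% t(l)))
chi2 <- (est / se)^2                             # Wald chi-square statistic, 1 df
pchisq(chi2, df = 1, lower.tail = FALSE)         # p-value, should match wald.test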

Related

R is only returning non-zero coefficient estimates when using the "poly" function to generate predictors. How do I get the zero values into a vector?

I'm using regsubsets from the leaps library to perform best subset selection. I need to compare the coefficients it generates to the "true" coefficients I specified when simulating the data (by comparison I mean the square root of the sum of the squared differences between them), for each number of predictors.
Since regsubsets generated 16 different models, I use a loop to do this automatically. It would work, except that when I extract the coefficients from the best model with x predictors, it only gives me the non-zero coefficients of the polynomial fit. This messes up the size of the coefi vector, making it smaller than the truecoef vector of true coefficients.
If I could somehow force all coefficients to be spat out from the model, I wouldn't have an issue. But after looking extensively, I don't know how to do that.
Alternative ways of solving this problem would also be appreciated.
library(leaps)
regfit.train = regsubsets(y ~ poly(x, 25, raw = TRUE), data = mydata[train, ], nvmax = 25)
truecoef = c(3,0,-7,4,-2,8,0,-5,0,2,0,4,5,6,3,2,2,0,3,1,1)
coef.errors = rep(NA, 16)
for (i in 1:16) {
  coefi = coef(regfit.train, id = i)
  coef.errors[i] = mean((truecoef - coefi)^2)
}
The quantity I'm trying to compute, where j indexes the coefficients and r refers to the best model containing r coefficients, is sqrt( sum_j (beta_j - betahat_j^(r))^2 ).
Thanks!
This is how I ended up solving it (with some help):
The loop checks which coefficients are available and performs the subtraction; those that are unavailable are treated as zero.
truecoef = c(3,0,-7,4,-2,8,0,-5,0,2,0,4,5,6,3,2,2,0,3,1,1)
val.errors = rep(NA, 16)
x_cols = colnames(x, do.NULL = FALSE, prefix = "x.")
for (i in 1:16) {
  coefis = coef(regfit.train, id = i)
  # squared error over the terms the model kept, plus the squared true values
  # of the dropped terms (whose estimates are implicitly zero)
  val.errors[i] = sqrt(sum((truecoef[x_cols %in% names(coefis)] -
                              coefis[names(coefis) %in% x_cols])^2) +
                         sum(truecoef[!(x_cols %in% names(coefis))]^2))
}
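An alternative is to expand each fitted coefficient vector to full length first, filling the dropped terms with zero, so the two vectors always line up. A minimal sketch of that idea with a hypothetical helper, assuming truecoef_named is the true coefficient vector named to match the model's term names:
fill_and_error <- function(truecoef_named, coefi) {
  # start from an all-zero vector carrying the true coefficients' names
  est <- setNames(rep(0, length(truecoef_named)), names(truecoef_named))
  # overwrite the entries the fitted model actually estimated
  common <- intersect(names(coefi), names(est))
  est[common] <- coefi[common]
  # root of the sum of squared differences
  sqrt(sum((truecoef_named - est)^2))
}
val.errors <- sapply(1:16, function(i) fill_and_error(truecoef_named, coef(regfit.train, id = i)))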

How to solve "impacts()" neighbors length error after running spdep::lagsarlm (Spatial Autoregressive Regression model)?

I have 9,150 polygons in my dataset. I was trying to run a spatial autoregressive model (SAR) in spdep to test spatial dependence of my outcome variable. After running the model, I wanted to examine the direct/indirect impacts, but encountered an error that seems to have something to do with the length of neighbors in the weights matrix not being equal to n.
I tried running the very same equation as an SLX model (spatially lagged X), and impacts() worked fine, even though there were some polygons in my set that had no neighbors. I Googled and looked at the spdep documentation, but couldn't find a clue on how to solve this error.
# Defining queen contiguity neighbors for polyset and storing the matrix as list
q.nbrs <- poly2nb(polyset)
listweights <- nb2listw(q.nbrs, zero.policy = TRUE)
# Defining the model
model.equation <- TIME ~ A + B + C
# Run SAR model
reg <- lagsarlm(model.equation, data = polyset, listw = listweights, zero.policy = TRUE)
# Run impacts() to show direct/indirect impacts
impacts(reg, listw = listweights, zero.policy = TRUE)
Error in intImpacts(rho = rho, beta = beta, P = P, n = n, mu = mu, Sigma = Sigma, :
length(listweights$neighbours) == n is not TRUE
I know that this is a question from 2019, but maybe it can help people dealing with the same problem. I found out that in my case the problem was the type of dataset: your data = polyset should be of class "SpatialPolygonsDataFrame", which can be achieved by converting your data:
polyset_spatial_sf <- sf::as_Spatial(polyset, IDs = polyset$ID)
Then rerun your code.
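A minimal sketch of the full rerun under that assumption (polyset being an sf object with an ID column, as in the answer):
library(sf)
library(spdep)
# convert the sf object to a SpatialPolygonsDataFrame before fitting
polyset_sp <- sf::as_Spatial(polyset, IDs = polyset$ID)
q.nbrs <- poly2nb(polyset_sp)
listweights <- nb2listw(q.nbrs, zero.policy = TRUE)
reg <- lagsarlm(TIME ~ A + B + C, data = polyset_sp,
                listw = listweights, zero.policy = TRUE)
impacts(reg, listw = listweights, zero.policy = TRUE)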

Using R INLA hyperparameter to.theta and from.theta functions

R-INLA model hyperparameters have to.theta and from.theta functions that appear to be for converting between different parameterisations. It would be convenient to use those conversion functions, but how does one do so?
Example with ar1
From the ar1 documentation (http://www.math.ntnu.no/inla/r-inla.org/doc/latent/ar1.pdf):
The parameter rho is represented as theta_2 = log((1 + rho)/(1 - rho))
and further down under hyper, theta2 we have to.theta 'function(x) log((1+x)/(1-x))'. It would be nice if we could use that to convert between rho and theta_2.
Let's try using an example
library(INLA)
# Example from ar1 documentation (http://www.math.ntnu.no/inla/r-inla.org/doc/latent/ar1.pdf)
#simulate data
n = 100
rho = 0.8
prec = 10
## note that the marginal precision would be
marg.prec = prec * (1-rho^2)
E = sample(c(5, 4, 10, 12), size = n, replace = TRUE)
eta = as.vector(arima.sim(list(order = c(1, 0, 0), ar = rho), n = n, sd = sqrt(1/prec)))
y = rpois(n, E * exp(eta))
data = list(y = y, z = 1:n, E = E)
## fit the model
formula = y ~ f(z, model = "ar1")
result = inla(formula, family = "poisson", data = data, E = E)
That runs fine.
Can we use to.theta like this?
formula.to.theta = y ~ f(z, model = "ar1",
                         hyper = list(rho = list(initial = to.theta(0.25))))
result = inla(formula.to.theta, family = "poisson", data = data, E = E)
# Error in to.theta(0.25) : could not find function "to.theta"
So we can't use it like that. Is there another way to specify formula.to.theta that would work?
Pretty sure the answer to your question is "no". The documentation is saying, not that there are functions by those names exported by the package, but rather that the hyper element will contain functions by those names, with values as given in the documentation. There is no reason to think that calling those names inside a formula would resolve to a meaningful function. Here is how to examine the value of from.theta in the environment of a specific call to the f-function:
library(INLA)
eval( f(z, model = "ar1") )$hyper$theta3$from.theta
===== result ========
function (x)
x
<environment: 0x7fdda6214040>
attr(,"inla.read.only")
[1] TRUE
The result from f( , "ar1") actually has three thetas, each with a to.theta and from.theta function. You may be trying to change the hyper$thetax$param value, which does not have an attr(,"inla.read.only") value of TRUE.
It would probably be more informative for you to execute this:
eval( f(z, model = "ar1") )$hyper
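As a practical workaround, the ar1 documentation quoted above gives the transform explicitly (theta_2 = log((1 + rho)/(1 - rho))), so you can apply it yourself and pass the result as the initial value. A minimal sketch, assuming the same data and formula as in the question:
rho0 <- 0.25
theta2 <- log((1 + rho0) / (1 - rho0))   # rho mapped to INLA's internal scale
formula.theta <- y ~ f(z, model = "ar1",
                       hyper = list(rho = list(initial = theta2)))
result <- inla(formula.theta, family = "poisson", data = data, E = E)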

Get ARIMA white noise with known parameter in R

I have an MA(1) model, u_t = e_t + theta * e_{t-1}, with a known coefficient theta and a known white-noise variance.
I'd like to know: is there a function in R that can return the white-noise series e_t for me?
I also tried to recover e_t by iterating e_t = u_t - theta * e_{t-1}; however, in reality e_0 is unknown and cannot be specified in the first place.
I'm having this question because I used gnls to estimate a nonlinear model whose residuals follow an MA(1) process. The code is something like:
model = gnls(y ~ c + log(x1^g + x2^g), start = list(c = 0.04, g = 0.3),
             correlation = corARMA(c(0.5), form = ~ 1, p = 0, q = 1, fixed = FALSE))
It returns every parameter estimate, including theta. But residuals(model) returns the MA(1) residuals u_t instead of the underlying white noise e_t.
So any suggestions?
Thank you for the help in advance.
Yes, you can use the arima() function available in R.
fit <- arima(ts(data), order = c(0, 0, 1))
Since you do not want the AR and I parts, you can set those orders to zero.
summary(fit)
You can inspect the estimated parameters and residuals with summary().
For more information, refer to: https://www.otexts.org/fpp/8/7
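If the MA coefficient really is known in advance, as in the original question, a possible refinement (a sketch using a hypothetical value for the known coefficient) is to hold it fixed and read the implied white-noise series off residuals():
theta_known <- 0.5                        # hypothetical known MA(1) coefficient
fit <- arima(ts(data), order = c(0, 0, 1),
             fixed = c(theta_known, NA),  # fix ma1, still estimate the intercept
             transform.pars = FALSE)
e_hat <- residuals(fit)                   # filtered innovations = implied white noise
coef(fit)                                 # fixed ma1 plus the estimated intercept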

Kernlab kraziness: inconsistent results for identical problems

I've found some puzzling behavior in the kernlab package: estimating SVMs which are mathematically identical produces different results in software.
This code snippet just takes the iris data and makes it a binary classification problem for the sake of simplicity. As you can see, I'm using linear kernels in both SVMs.
library(kernlab)
library(e1071)
data(iris)
x <- as.matrix(iris[, 1:4])
y <- as.factor(ifelse(iris[, 5] == 'versicolor', 1, -1))
C <- 5.278031643091578
svm1 <- ksvm(x = x, y = y, scaled = FALSE, kernel = 'vanilladot', C = C)
K <- kernelMatrix(vanilladot(), x)
svm2 <- ksvm(x = K, y = y, C = C, kernel = 'matrix')
svm3 <- svm(x = x, y = y, scale = FALSE, kernel = 'linear', cost = C)
However, the summary information of svm1 and svm2 is dramatically different: kernlab reports completely different support vector counts, training error rates, and objective function values between the two models.
> svm1
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 5.27803164309158
Linear (vanilla) kernel function.
Number of Support Vectors : 89
Objective Function Value : -445.7911
Training error : 0.26
> svm2
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 5.27803164309158
[1] " Kernel matrix used as input."
Number of Support Vectors : 59
Objective Function Value : -292.692
Training error : 0.333333
For the sake of comparison, I also computed the same model using e1071, which provides an R interface for the libsvm package.
svm3
Call:
svm.default(x = x, y = y, scale = FALSE, kernel = "linear", cost = C)
Parameters:
SVM-Type: C-classification
SVM-Kernel: linear
cost: 5.278032
gamma: 0.25
Number of Support Vectors: 89
It reports 89 support vectors, the same as svm1.
My question is whether there are any known bugs in the kernlab package which can account for this unusual behavior.
(Kernlab for R is an SVM solver that allows one to use one of several pre-packaged kernel functions, or a user-supplied kernel matrix. The output is an estimate of a support vector machine for the user-supplied hyperparameters.)
Reviewing some of the code, it appears that this is the offending line:
https://github.com/cran/kernlab/blob/efd7d91521b439a993efb49cf8e71b57fae5fc5a/src/svm.cpp#L4205
That is, in the case of a user-supplied kernel matrix, ksvm is only looking at two dimensions, rather than whatever the dimensionality of the input is. This seems strange, and is probably a hold-over from some testing or other. Tests of the linear kernel with data of just two dimensions produce the same result: replace 1:4 with 1:2 in the above and the output and predictions all agree.
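For what it's worth, a minimal sketch of that two-dimensional check (same setup as the question, just restricted to the first two iris columns, so all three fits should agree):
library(kernlab)
library(e1071)
data(iris)
x2 <- as.matrix(iris[, 1:2])              # only two input dimensions this time
y <- as.factor(ifelse(iris[, 5] == 'versicolor', 1, -1))
C <- 5.278031643091578
svm1_2d <- ksvm(x = x2, y = y, scaled = FALSE, kernel = 'vanilladot', C = C)
K2 <- kernelMatrix(vanilladot(), x2)
svm2_2d <- ksvm(x = K2, y = y, kernel = 'matrix', C = C)
svm3_2d <- svm(x = x2, y = y, scale = FALSE, kernel = 'linear', cost = C)
# the support-vector counts now match across the three fits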
