Kernlab kraziness: inconsistent results for identical problems

I've found some puzzling behavior in the kernlab package: estimating SVMs which are mathematically identical produces different results in software.
This code snippet just takes the iris data and makes it a binary classification problem for the sake of simplicity. As you can see, I'm using linear kernels in both SVMs.
library(kernlab)
library(e1071)
data(iris)
x <- as.matrix(iris[, 1:4])
y <- as.factor(ifelse(iris[, 5] == 'versicolor', 1, -1))
C <- 5.278031643091578
# SVM fit with kernlab's built-in linear kernel
svm1 <- ksvm(x = x, y = y, scaled = FALSE, kernel = 'vanilladot', C = C)
# The mathematically identical SVM, fit from a precomputed linear kernel matrix
K <- kernelMatrix(vanilladot(), x)
svm2 <- ksvm(x = K, y = y, C = C, kernel = 'matrix')
# The same model again via e1071's interface to libsvm, for comparison
svm3 <- svm(x = x, y = y, scale = FALSE, kernel = 'linear', cost = C)
However, the summary information for svm1 and svm2 is dramatically different: kernlab reports completely different support vector counts, training error rates, and objective function values for the two models.
> svm1
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 5.27803164309158
Linear (vanilla) kernel function.
Number of Support Vectors : 89
Objective Function Value : -445.7911
Training error : 0.26
> svm2
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 5.27803164309158
[1] " Kernel matrix used as input."
Number of Support Vectors : 59
Objective Function Value : -292.692
Training error : 0.333333
For the sake of comparison, I also computed the same model using e1071, which provides an R interface to the libsvm library.
svm3
Call:
svm.default(x = x, y = y, scale = FALSE, kernel = "linear", cost = C)
Parameters:
SVM-Type: C-classification
SVM-Kernel: linear
cost: 5.278032
gamma: 0.25
Number of Support Vectors: 89
It reports 89 support vectors, the same as svm1.
My question is whether there are any known bugs in the kernlab package which can account for this unusual behavior.
(Kernlab for R is an SVM solver that allows one to use one of several pre-packaged kernel functions, or a user-supplied kernel matrix. The output is an estimate of a support vector machine for the user-supplied hyperparameters.)

Reviewing some of the code, it appears that this is the offending line:
https://github.com/cran/kernlab/blob/efd7d91521b439a993efb49cf8e71b57fae5fc5a/src/svm.cpp#L4205
That is, in the case of a user-supplied kernel matrix, ksvm looks at only two dimensions rather than the full dimensionality of the input. This seems strange, and is probably a hold-over from some testing or other. Tests of the linear kernel with data of just two dimensions produce the same result: replace 1:4 with 1:2 in the code above and the output and predictions all agree, as in the sketch below.
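For reference, a minimal sketch of that two-dimensional check, reusing y and C from above (the names x2, K2, svm1.2d, and svm2.2d are mine):
x2 <- as.matrix(iris[, 1:2])  # only two input dimensions this time
svm1.2d <- ksvm(x = x2, y = y, scaled = FALSE, kernel = 'vanilladot', C = C)
K2 <- kernelMatrix(vanilladot(), x2)
svm2.2d <- ksvm(x = K2, y = y, C = C, kernel = 'matrix')
# With two dimensions, the reported SV counts, objective values, and
# predict() results of both models agree.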

Related

How to convert a tensor to an R array (in a loss function, so without eager execution)?

I have TensorFlow version 2.4 and work with the R packages tensorflow (2.2.0) and keras (2.3.0.0.9000).
I would like to convert tensors to R arrays in a loss function (don't ask why).
Here is an example when such a conversion (outside a loss function) works:
library(tensorflow)
library(keras)
x.R <- matrix(1:12, ncol = 3) # dummy R object
x.tensor <- keras_array(x.R) # converting the R object to a tensor
as.array(x.tensor) # converting it back to an R array. This works because...
stopifnot(tf$executing_eagerly()) # ... eager execution is enabled
During training of a model, however, eager execution is FALSE, and thus the as.array() call fails. To see this, let's first define a dummy neural network model and training data.
d <- 2 # input and output dimension
in.lay <- layer_input(shape = d)
hid.lay <- layer_dense(in.lay, units = 300, activation = "relu")
out.lay <- layer_dense(hid.lay, units = d, activation = "sigmoid")
model <- keras_model(in.lay, out.lay)
n <- 1200 # number of training samples
data <- matrix(runif(n * d), ncol = d) # training data
Now let's define the loss function and compile the model with it.
myloss <- function(x, y) { # x and y are tensors here
  stopifnot(!tf$executing_eagerly()) # confirms that eager execution is disabled
  x. <- as.array(x) # ... fails with "RuntimeError: Evaluation error: invalid first argument, must be vector (list or atomic)." How can we convert x to an R array?
  loss_mean_squared_error(x, y) # just a dummy return value (the MSE)
}
compile(model, optimizer = "adam", loss = myloss)
Let's try and fit this model (to see that it fails to convert the tensor x to an R array via as.array()).
prior <- matrix(rexp(n * d), ncol = d) # input sample to train the NN on
n.epoch <- 5 # number of epochs to train
batch.size <- 400 # batch size
fit(model, x = prior, y = data, batch_size = batch.size, epochs = n.epoch) # fails with error message given above
The R package tensorflow provides tfe_enable_eager_execution() to enable eager execution in a session. But if I call it with TensorFlow 2.4, then I obtain:
tfe_enable_eager_execution() # "Error in py_get_attr_impl(x, name, silent) : AttributeError: module 'tensorflow' has no attribute 'contrib'"
Ideally, I wouldn't want to mess with eager execution much (I'm not sure about the side effects); I just want to convert a tensor to an array. My guess is that there is no way other than eager execution, as only then are the pointers resolved and the R package tensorflow can find the data in the tensor and convert it to an array. Other ideas for enabling/disabling eager execution are mentioned here, but that's all in Python and not available in R, it seems. And this post seems to ask the same question, but in a different context.
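For what it's worth, a minimal sketch of the TF2-style workaround, assuming the Python-side tf.config API is reachable through the tensorflow R package's tf object (untested in this exact setup):
# Ask TensorFlow to run compiled functions, including losses, eagerly,
# so that as.array() inside myloss can resolve the tensor's data
tf$config$run_functions_eagerly(TRUE)
compile(model, optimizer = "adam", loss = myloss)
# then fit() as before; expect a speed penalty, since graph compilation is bypassed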

How to solve "impacts()" neighbors length error after running spdep::lagsarlm (Spatial Autoregressive Regression model)?

I have 9,150 polygons in my dataset. I was trying to run a spatial autoregressive model (SAR) in spdep to test spatial dependence of my outcome variable. After running the model, I wanted to examine the direct/indirect impacts, but encountered an error that seems to have something to do with the length of neighbors in the weights matrix not being equal to n.
I tried running the very same equation as an SLX (spatial lag of X) model, and impacts() worked fine, even though some polygons in my set had no neighbors. I Googled and looked at the spdep documentation, but couldn't find a clue on how to solve this error.
# Defining queen contiguity neighbors for polyset and storing the matrix as list
q.nbrs <- poly2nb(polyset)
listweights <- nb2listw(q.nbrs, zero.policy = TRUE)
# Defining the model
model.equation <- TIME ~ A + B + C
# Run SAR model
reg <- lagsarlm(model.equation, data = polyset, listw = listweights, zero.policy = TRUE)
# Run impacts() to show direct/indirect impacts
impacts(reg, listw = listweights, zero.policy = TRUE)
Error in intImpacts(rho = rho, beta = beta, P = P, n = n, mu = mu, Sigma = Sigma, :
length(listweights$neighbours) == n is not TRUE
I know that this is a question from 2019, but maybe it can help people dealing with the same problem. I found out that in my case the problem was the type of the dataset: your data = polyset should be of class "SpatialPolygonsDataFrame", which can be achieved by converting your data:
polyset_spatial_sf <- sf::as_Spatial(polyset, IDs = polyset$ID)
Then rerun your code.
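Concretely, a sketch of the rerun with the converted data, reusing the formula and weights from above:
# Refit the SAR model on the SpatialPolygonsDataFrame version of the data
reg <- lagsarlm(model.equation, data = polyset_spatial_sf,
                listw = listweights, zero.policy = TRUE)
impacts(reg, listw = listweights, zero.policy = TRUE)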

Parameter estimates using FME ODE model fitting in R

I have a system of ODEs that I am trying to fit to generated data, synthetic or lab. The final product I am interested in is the parameters and their estimated errors. We use the R package FME with modCost and modFit. As an example, a system of ODEs may be defined as:
eqs <- function(time, y, parms, ...) {
  with(as.list(c(parms, y)), {
    dP  <- k2*PA - k1*A*P # concentration of nucleic acid
    dA  <- dP             # concentration of free protein
    dPA <- -dP
    list(c(dA, dP, dPA))
  })
}
with parameters k1 and k2 and variables A, P, and PA. I import the data (not shown) and define the cost function used in modFit:
cost <- function(p, data, ...) {
  yy <- p[c("A", "P", "PA")] # initial states
  pp <- p[c("k1", "k2")]     # rate constants
  out <- ode(yy, time, eqs, pp) # 'time' is taken from the calling environment
  modCost(out, data, ...)
}
I set some initial conditions with a parms vector and then do the fitting with
fit <- modFit(f = cost, p = parms, data = dat, weight = "std",
              lower = rep(0, 8), upper = c(600, 100, 600, 0.01, 0.01),
              method = "Marq")
I then do a final ode() run to get the fitted curves with the best parameters, Bob's your uncle, and boom, estimated parameters. The input numbers don't matter; I hope my process outline is legible for those who use this package.
My issue and question center on the following: I'm a scientist, a physicist, and the errors of the estimated parameters are important to report. Can I generate the estimated errors from FME somehow, or is there a separate package for that kind of output?
I don't get your point. You can just use:
summary(fit)
to see the Std. Error.
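To extract those values programmatically, a small sketch (assuming summary.modFit returns the coefficient table in its par component, as current FME versions do):
s <- summary(fit)
s$par                  # matrix with Estimate, Std. Error, t value, Pr(>|t|)
s$par[, "Std. Error"]  # just the standard errors of the fitted parameters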

How to weight observations in mxnet?

I am new to neural networks and the mxnet package in R. I want to do a logistic regression on my predictors, since my observations are probabilities varying between 0 and 1. I'd like to weight my observations by a vector obsWeights I have, but I'm not sure where to implement the weights. There seems to be a weight= option in mx.symbol.FullyConnected, but if I try weight=obsWeights I get the following error message:
Error in mx.varg.symbol.FullyConnected(list(...)) :
Cannot find argument 'weight', Possible Arguments:
----------------
num_hidden : int, required
Number of hidden nodes of the output.
no_bias : boolean, optional, default=False
Whether to disable bias parameter.
How should I proceed to weight my observations? Here is my code at the moment.
# Prepare data
train.mm = model.matrix(obs ~ . , data = train_data)
train_label = train_data$obs
# Normalize
train.mm = apply(train.mm, 2, function(x) (x-min(x))/(max(x)-min(x)))
# Create MXDataIter compatible iterator
batch_size = 128
train.iter = mx.io.arrayiter(data=t(train.mm), label=train_label,
batch.size=batch_size, shuffle=T)
# Symbolic model definition
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data=data, num.hidden=128, name='fc1')
act1 = mx.symbol.Activation(data=fc1, act.type='relu', name='act1')
final = mx.symbol.FullyConnected(data=act1, num.hidden=1, name='final')
logistic = mx.symbol.LogisticRegressionOutput(data=final, name='logistic')
# Run model
mxnet_train = mx.model.FeedForward.create(
symbol = logistic,
X = train.iter,
initializer = mx.init.Xavier(rnd_type = 'gaussian', factor_type = 'avg', magnitude = 2),
num.round = 25)
Assigning the fully connected layer's weight argument is not what you want to do at any rate. That weight is a reference to the parameters of the layer, i.e., the matrix the inputs are multiplied by to produce the output values. These are the parameter values you're trying to learn.
If you want to make some samples matter more than others, then you'll need to adjust the loss function. For example, multiply the usual loss function by your weights so that low-weight samples contribute less to the overall average loss.
I do not believe the standard Mxnet loss functions have a spot for assigning weights (that is, LogisticRegressionOutput won't cover this). However, you can make your own cost function that does. This would involve passing your final layer through a sigmoid activation function to first generate the usual logistic regression output value, and then passing that into the loss function you define. You could use squared error, but for logistic regression you'll probably want the cross-entropy loss:
-(l * log(y) + (1 - l) * log(1 - y)),
where l is the label and y is the predicted value.
Ideally, you'd write a symbol with an efficient definition of the gradient (Mxnet has a cross-entropy function, but it's for softmax input, not a binary output; you could translate your output to two outputs with softmax as an alternative, but that seems less easy to work with in this case), but the easiest path is to let Mxnet do its autodiff on it. Then you multiply that cross-entropy loss by the weights.
I haven't tested this code, but you'd ultimately have something like this (this is what you'd do in Python; it should be similar in R):
label = mx.sym.Variable('label')
out = mx.sym.Activation(data=final, act_type='sigmoid')
ce = -(label * mx.sym.log(out) + (1 - label) * mx.sym.log(1 - out))
weights = mx.sym.Variable('weights')
loss = mx.sym.MakeLoss(weights * ce, normalization='batch')
Then you want to input your weight vector into the weights Variable along with your normal input data and labels.
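In R, a sketch of the same idea might look like this (assuming the mxnet R package mirrors the Python symbol API, as it does for Variable, Activation, log, and MakeLoss, with dots instead of underscores in argument names):
# Weighted binary cross-entropy loss, built on the 'final' layer defined
# in the question's network
label   <- mx.symbol.Variable('label')
weights <- mx.symbol.Variable('weights')
out     <- mx.symbol.Activation(data = final, act.type = 'sigmoid')
ce      <- -(label * mx.symbol.log(out) + (1 - label) * mx.symbol.log(1 - out))
loss    <- mx.symbol.MakeLoss(weights * ce, normalization = 'batch')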
As an added tip, the output of an Mxnet network with a custom loss via MakeLoss is the loss itself, not the prediction. You'll probably want both in practice, in which case it's useful to group the loss with a gradient-blocked version of the prediction so that you can get both. You'd do that like this:
pred_loss = mx.sym.Group([mx.sym.BlockGrad(out), loss])

Passing hypothesis in wald.test in R

Please forgive me, as I am new to this forum. The study requires testing whether the sum of coefficients equals 0. In EViews, the test can be written as c(2)+c(3)+c(4)=0, where c(2) is the coefficient of the second term, and so forth. The code for the same in R is
require(Hmisc) # this package is used to generate lags
require(aod)   # this package is used to conduct the Wald test
output <- lm(formula = s_dep ~ m_dep + Lag(m_dep, -1) + Lag(m_dep, -2) + s_rtn,
             data = qs_eq_comm)
wald.test(b = coef(object = output), Sigma = vcov(object = output),
          Terms = 2:4, H0 = 2 + 3 + 4)
# H0 = 2 + 3 + 4 checks if the sum is zero
This gives the error: Error in wald.test(b = coef(object = output), Sigma = vcov(object = output), : Vectors of tested coefficients and of null hypothesis have different lengths. As per the aod documentation, the function signature is
wald.test(Sigma, b, Terms = NULL, L = NULL, H0 = NULL, df = NULL, verbose = FALSE)
Please help to conduct this test.
In the wald.test function, either Terms or L can be passed as the parameter. L is a matrix conformable to b, such that its product with b, i.e. L %*% b, gives the linear combinations of the coefficients to be tested; H0 defaults to zero, so the call below tests whether the sum of the 2nd through 4th coefficients is zero. Create the L matrix first and then conduct the Wald test.
l <- cbind(0, 1, 1, 1, 0)
wald.test(b = coef(object = output), Sigma = vcov(object = output), L = l)
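As a cross-check, a small sketch of the same statistic computed by hand (the Wald statistic for the hypothesis L b = 0 is (Lb)' (L V L')^(-1) (Lb), which is what wald.test computes internally):
b <- coef(output)
V <- vcov(output)
Lb <- l %*% b # the tested linear combination: sum of coefficients 2 to 4
stat <- t(Lb) %*% solve(l %*% V %*% t(l)) %*% Lb
pchisq(as.numeric(stat), df = nrow(l), lower.tail = FALSE) # chi-squared p-value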
