R stan 8schools example faile - r

I am trying to run classical r-stan example from this official tutorial. But I can't because I get the following error message:
The code I use is as follows:
schools_dat <- list(J = 8,
y = c(28, 8, -3, 7, -1, 1, 18, 12),
sigma = c(15, 10, 16, 11, 9, 11, 10, 18))
fit <- stan(file = '8schools.stan', data = schools_dat,
iter = 1000, chains = 4)
Where 8schools.stan is file I have additionally created and nested with the following code:
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
real mu;
real<lower=0> tau;
real eta[J];
}
transformed parameters {
real theta[J];
for (j in 1:J)
theta[j] = mu + tau * eta[j];
}
model {
target += normal_lpdf(eta | 0, 1);
target += normal_lpdf(y | theta, sigma);
}
I want to figure out the reasons of this error and how to avoid it.
Additionally I want mention that:
The following code works propertly and returns 10:
fx <- inline::cxxfunction( signature(x = "integer", y = "numeric" ) , '
return ScalarReal( INTEGER(x)[0] * REAL(y)[0] ) ;
' )
fx( 2L, 5 )
I have installed latest frozen version of Rtools.
Also, Rtools, R and Rstudio are located in folders without spaces.
I work on Windows (not laptop)
I have been strictly following all the steps from installiation guide.
Will be very greatfull for help!
Update
I also have set Sys.setenv(USE_CXX14=1) as Ben Goodrich has suggested but now I get the following error message

This is a temporary issue due to the fact that (this week) the StanHeaders R package (which requires C++14) has been accepted on CRAN but not yet the rstan R package (which uses C++14 to compile the models). So, for the time being, you first need to call
Sys.setenv(USE_CXX14=1)
But anyone reading this post much in the future is unlikely to need to do that explicitly (the rstan R package will do it internally).
On Windows you need to put
CXX14=g++ -std=c++1y
CXX14FLAGS=-O3
into the ~/.R/Makevars file but on a Mac, those lines should already be there.

Related

Setting initial (state) values for ODE system in compiled model (deSolve, Rcpp)

I am struggling with a probably minor problem while calling compiled ODEs to be solved
via the R package 'deSolve' and I seeking advice from more expert users.
Background
I have a couple of ODE systems to be solved with 'deSolve'. I have defined the ODEs in separate C++ functions (one for each model) I am calling through R in conjunction with 'Rcpp'. The initial values of the system change if the function takes input from another model (so basically to have a cascade).
This works quite nicely, however, for one model I have to set the initial parameters for t < 2. I've tried to do this in the C++ function, but it does not seem to work.
Running code example
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export("set_ODE")]]
SEXP set_ODE(double t, NumericVector state, NumericVector parameters) {
List dn(3);
double tau2 = parameters["tau2"];
double Ae2_4 = parameters["Ae2_4"];
double d2 = parameters["d2"];
double N2 = parameters["N2"];
double n2 = state["n2"];
double m4 = state["m4"];
double ne = state["ne"];
// change starting conditions for t < 2
if(t < 2) {
n2 = (n2 * m4) / N2;
m4 = n2;
ne = 0;
}
dn[0] = n2*d2 - ne*Ae2_4 - ne/tau2;
dn[1] = ne/tau2 - n2*d2;
dn[2] = -ne*Ae2_4;
return(Rcpp::List::create(dn));
}
/*** R
state <- c(ne = 10, n2 = 0, m4 = 0)
parameters <- c(N2 = 5e17, tau2 = 1e-8, Ae2_4 = 5e3, d2 = 0)
results <- deSolve::lsoda(
y = state,
times = 1:10,
func = set_ODE,
parms = parameters
)
print(results)
*/
The output reads (here only the first two rows):
time ne n2 m4
1 1 1.000000e+01 0.000000e+00 0.000000e+00
2 2 1.000000e+01 2.169236e-07 -1.084618e-11
Just in case: How to run this code example?
My example was tested using RStudio:
Copy the code into a file with the ending *.cpp
Click on the 'Source' button (or <shift> + <cmd> + <s>)
It should work also without RStudio present, but the packages 'Rcpp' and 'deSolve' must be installed and to compile the code it needs Rtools on Windows, GNU compilers on Linux and Xcode on macOS.
Problem
From my understanding, ne should be 0 for time = 1 (or t < 2). Unfortunately, the solver does not seem to consider what I have provided in the C++ function, except for the ODEs. If I change state in R to another value, however, it works. Somehow the if-condition I have defined in C++ is ignored, but I don't understand why and how I can calculate the initial values in C++ instead of R.
I was able to reproduce your code. It seems to me that this is indeed elegant, even if it does not leverage the full power of the solver. The reason is, that Rcpp creates an interface to the compiled model via an ordinary R function. So back-calls from the slovers (e.g. lsoda) to R are necessary in each time step. Such back-calls are not for the "plain" C/Fortran interface. Here communication between solver and model takes place at the machine code level.
With this informational, I can see that we don't need to expect initialization issues at the C/C++ level, but it looks like a typical case. As the model function is simply the derivative of the model (and only this). The integration is done by the solver "from outside". It calls the model always with the actual integration state, derived from the time step before (roughly speaking). Therefore, it is not possible to force the state variables to fixed values within the model function.
However, there are several options how to resolve this:
chaining of lsoda calls
use of events
The following shows a chained approach, but I am not yet sure about the initialization of the parameters in the first time segment, so may only be part of the solution.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export("set_ODE")]]
SEXP set_ODE(double t, NumericVector state, NumericVector parameters) {
List dn(3);
double tau2 = parameters["tau2"];
double Ae2_4 = parameters["Ae2_4"];
double d2 = parameters["d2"];
double N2 = parameters["N2"];
double n2 = state["n2"];
double m4 = state["m4"];
double ne = state["ne"];
dn[0] = n2*d2 - ne*Ae2_4 - ne/tau2;
dn[1] = ne/tau2 - n2*d2;
dn[2] = -ne*Ae2_4;
return(Rcpp::List::create(dn));
}
/*** R
state <- c(ne = 10, n2 = 0, m4 = 0)
parameters <- c(N2 = 5e17, tau2 = 1e-8, Ae2_4 = 5e3, d2 = 0)
## the following is not yet clear to me !!!
## especially as it is essentially zero
y1 <- c(ne = 0,
n2 = unname(state["n2"] * state["m4"]/parameters["N2"]),
m4 = unname(state["n2"]))
results1 <- deSolve::lsoda(
y = y,
times = 1:2,
func = set_ODE,
parms = parameters
)
## last time step, except "time" column
y2 <- results1[nrow(results1), -1]
results2 <- deSolve::lsoda(
y = y2,
times = 2:10,
func = set_ODE,
parms = parameters
)
## omit 1st time step in results2
results <- rbind(results1, results2[-1, ])
print(results)
*/
The code has also another potential problem as the parameters span several magnitudes from 1e-8 to 1e17. This can lead to numerical issues, as the relative precision of most software, including R covers only 16 orders of magnitude. Can this be the reason, why the results are all zero? Here it may help to re-scale the model.

RStan Error: “Exception: variable does not exist… failed to create the sampler; sampling not done”

Hey everyone––I'm having trouble fitting a Stan model using RStan. I feel like there's an issue with how I've formatted my data. My data was originally in a data.frame, as it was generated using a mechanistic model. I re-formatted my data as a list, but I'm new to this, so there's a fair chance I messed up. Here's the story so far (P.S. I'm sorry if this post isn't formatted properly!):
library(rstan)
library(gdata)
library(bayesplot)
write("// Stan Model; code the actual model in Stan
functions{
real[] dZ_dt(real t, // Time
real[] Z, // System state {Parasite, Host}
real[] alpha, // Parameters
real[] x_r, // Unused data (?)
int[] x_i){
real P = Z[1]; // System state coded as an array, such that Z = (P,H)
real H = Z[2];
real r = alpha[1]; // Parameters of the system, in order they appear
real O = alpha[2];
real h = alpha[3];
real b = alpha[4];
real c = alpha[5];
real u = alpha[6];
real dP_dt = P*r - H*(O*P/1 + O*P*h); // Deterministic mechanistic model
real dH_dt = b + H*(c*(O*P/1 + O*P*h)-u);
return{dP_dt,dH_dt}; // Return the system state
}
}
data{
int<lower=0>N; // Define N as non-negative integer
real ts[N]; // Assigns time points to N (I think?)
real y_init[2];
real<lower=0>y[N,2]; // Define y as real and non-negative
}
parameters{
real<lower=0>alpha[6]; // Make all items in alpha non-neg
real<lower=0>Z_init[2]; // Initial population size non-neg
real<lower=0>sigma[2]; // Error term non-negative
}
transformed parameters{
real Z[N,2]
= integrate_ode_rk45(dZ_dt,Z_init,0,ts,alpha,rep_array(0.0,0),rep_array(0,0),1e-6,1e-5,2000);
}
model{
alpha[{1}]~uniform(0,10);
alpha[{2}]~uniform(0,1);
alpha[{3}]~uniform(0,60);
alpha[{4}]~uniform(0,100);
alpha[{5}]~uniform(0,1);
alpha[{6}]~uniform(0,1);
sigma~lognormal(-1,1);
Z_init~lognormal(log(10),1);
for (k in 1:2){
y_init[k]~lognormal(log(Z_init[k]),sigma[k]);
y[ ,k]~lognormal(log(Z[ ,k]),sigma[k]);
}
}",
"Stan_Model_TypeII.stan")
stanc("Stan_Model_TypeII.stan") # To check that we wrote a file (I think we did?)
Stan_Model_TypeII <- stan_model("Stan_Model_TypeII.stan")
# Squeezing the data into a form that Stan gets
N <- length(Stoch_Data_TypeII$t)-1 # N is 1952 which makes sense bc length of DF is 1953
ts <- 1:N
y_init <- c(Stoch_Data_TypeII$P[1], Stoch_Data_TypeII$H[1]) # Initial states, P = 1; H = 18
y <- as.matrix(Stoch_Data_TypeII[2:(N+1),2:3])
y <- cbind(y[,2],y[,1]); # This worked, sick; where y[,1] is H, and y[,2] is P
Stan_StochData_TypeII <- list(N,ts,y_init,y)
# Fitting the data to the model
fit <- stan(file = "Stan_Model_TypeII.stan",
data = Stan_StochData_TypeII,
warmup = 500, iter = 1000, chains = 2, cores = 1, thin = 1,
algorithm = "HMC",
diagnostic_file = "TypeII_Fitting_Output.R",
seed = 1996, verbose = TRUE)
And here's the progress made on the model:
TRANSLATING MODEL 'Stan_Model_TypeII' FROM Stan CODE TO C++ CODE NOW.
successful in parsing the Stan model 'Stan_Model_TypeII'.
CHECKING DATA AND PREPROCESSING FOR MODEL 'Stan_Model_TypeII' NOW.
COMPILING MODEL 'Stan_Model_TypeII' NOW.
STARTING SAMPLER FOR MODEL 'Stan_Model_TypeII' NOW.
And here's what the error code reads:
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) :
Exception: variable does not exist; processing stage=data initialization; variable name=N; base type=int (in 'model1b35f2189f6_Stan_Model_TypeII' at line 25)
failed to create the sampler; sampling not done

Parallel computing on two R servers using batchtools/BatchJobs

I'm trying to use batchtools/BatchJobs for parallel computing on two unix-based R servers. I'm completely new to this and hence followed a few articles and package details to do this. I have added some links below:
batchtools,
BatchJobs
So far I have not really understood how to use batchtools for multi-machines. On the other hand, with BatchJobs I have better progress.
I made an ssh connection from the terminal first and execute the following lines:
reg = makeRegistry("TestExp")
reg$cluster.functions = makeClusterFunctionsSSH(worker = makeSSHWorker(nodename="sla19438")) #By BatchJobs
#Test Function
piApprox = function(n) {
nums = matrix(runif(2 * n), ncol = 2)
d = sqrt(nums[, 1]^2 + nums[, 2]^2)
4 * mean(d <= 1)
}
set.seed(42)
piApprox(1000)
BatchJobs::batchMap(reg = reg, fun = piApprox, n = rep(1e7, 10))
getJobTable()
BatchJobs::submitJobs(reg = reg, resources = list(walltime = 3600, memory = 1024))
getStatus(reg = reg)
loadResult(reg = reg, id = 5)
mean(sapply(1:10, loadResult, reg = reg))
It works and gives me the results but I can't see any indication of the jobs being run on the other machine (sla19438) when I run "top" in the terminal.
Please help me understand what I'm doing wrong. Maybe there is some configuration needed but I don't see any material online which dumbs down the steps for a newbie like me.
Thanks

R fit user defined distribution

I am trying to fit my own distribution to my data, find the optimum parameters of the distribution to match the data and ultimately find the FWHM of the peak in the distribution. From what I've read, the package fitdistrplus is the way to do this. I know the data takes the shape of a lorentzian peak on a quadratic background.
plot of the data:
plot of raw data
The raw data used:
data = c(0,2,5,4,5,4,3,3,2,2,0,4,4,2,5,5,3,3,4,4,4,3,3,5,5,6,6,8,4,0,6,5,7,5,6,3,2,1,7,0,7,9,5,7,5,3,5,5,4,1,4,8,10,2,5,8,7,14,7,5,8,4,2,2,6,5,4,6,5,7,5,4,8,5,4,8,11,9,4,8,11,7,8,6,9,5,8,9,10,8,4,5,8,10,9,12,10,10,5,5,9,9,11,19,17,9,17,10,17,18,11,14,15,12,11,14,12,10,10,8,7,13,14,17,18,16,13,16,14,17,20,15,12,15,16,18,24,23,20,17,21,20,20,23,20,15,20,28,27,26,20,17,19,27,21,28,32,29,20,19,24,19,19,22,27,28,23,37,41,42,34,37,29,28,28,27,38,32,37,33,23,29,55,51,41,50,44,46,53,63,49,50,47,54,54,43,45,58,54,55,67,52,57,67,69,62,62,65,56,72,75,88,87,77,70,71,84,85,81,84,75,78,80,82,107,102,98,82,93,98,90,94,118,107,113,103,99,103,96,108,114,136,126,126,124,130,126,113,120,107,107,106,107,136,143,135,151,132,117,118,108,120,145,140,122,135,153,157,133,130,128,109,106,122,133,132,150,156,158,150,137,147,150,146,144,144,149,171,185,200,194,204,211,229,225,235,228,246,249,238,214,228,250,275,311,323,327,341,368,381,395,449,474,505,529,585,638,720,794,896,919,1008,1053,1156,1134,1174,1191,1202,1178,1236,1200,1130,1094,1081,1009,949,890,810,760,690,631,592,561,515,501,489,467,439,388,377,348,345,310,298,279,253,257,259,247,237,223,227,217,210,213,197,197,192,195,198,201,202,211,193,203,198,202,174,164,162,173,170,184,170,168,175,170,170,168,162,149,139,145,151,144,152,155,170,156,149,147,158,171,163,146,151,150,147,137,123,127,136,149,147,124,137,133,129,130,128,139,137,147,141,123,112,136,147,126,117,116,100,110,120,105,91,100,100,105,92,88,78,95,75,75,82,82,80,83,83,66,73,80,76,69,81,93,79,71,80,90,72,72,63,57,53,62,65,49,51,57,73,54,56,78,65,52,58,49,47,56,46,43,50,43,40,39,36,45,28,35,36,43,48,37,36,35,39,31,24,29,37,26,22,36,33,24,31,31,20,30,28,23,21,27,26,29,21,20,22,18,19,19,20,21,20,25,18,12,18,20,20,13,14,21,20,16,18,12,17,20,24,21,20,18,11,17,12,5,11,13,16,13,13,12,12,9,15,13,15,11,12,11,8,13,16,16,16,14,8,8,10,11,11,17,15,15,9,9,13,12,3,11,14,11,14,13,8,7,7,15,12,8,12,14,9,5,2,10,8)
I have calculated the equations which define the distribution and cumulative distribution:
dFF <- function(x,a,b,c,A,gamma,pos) a + b*x + (c*x^2) + ((A/pi)*(gamma/(((x-pos)^2) + (gamma^2))))
pFF <- function(x,a,b,c,A,gamma,pos) a*x + (b/2)*(x^2) + (c/3)*(x^3) + A/2 + (A/pi)*(atan((x - pos)/gamma))
I believe these to be correct. From what I understand, a distribution fit should be possible using just these definitions using the fitdist (or mledist) method:
fitdist(data,'FF', start = list(0,0.3,-0.0004,70000,13,331))
mledist(data,'FF', start = list(0,0.3,-0.0004,70000,13,331))
This returns the statement 'function cannot be evaluated at initial parameters> Error in fitdist(data, "FF", start = list(0, 0.3, -4e-04, 70000, 13, 331)):the function mle failed to estimate the parameters, with the error code 100' in the first case and in the second I just get a list of 'NA' values for the estimates.
I then calculated a function to give the quantile distribution values to use the other fitting methods (qmefit):
qFF <- function(p,a,b,c,A,gamma,pos)
{
qList = c()
axis = seq(1,600,1)
aF = dFF(axis,a,b,c,A,gamma,pos)
arr = histogramCpp(aF) # change data to a histogram format
for(element in 1:length(p)){
q = quantile(arr,p[element], names=FALSE)
qList = c(qList,q)
}
return(qList)
}
Part of this code requires calling the c++ function (by using the library Rcpp):
#include <Rcpp.h>
#include <vector>
#include <math.h>
using namespace Rcpp;
// [[Rcpp::export]]
std::vector<int> histogramCpp(NumericVector x) {
std::vector<int> arr;
double number, fractpart, intpart;
for(int i = 0; i <= 600; i++){
number = (x[i]);
fractpart = modf(number , &intpart);
if(fractpart < 0.5){
number = (int) intpart;
}
if(fractpart >= 0.5){
number = (int) (intpart+1);
}
for(int j = 1; j <= number; j++){
arr.push_back(i);
}
}
return arr;
}
This c++ method just turns the data into a histogram format. If the first element of the vector describing the data is 4 then '1' is added 4 times to the returned vector etc. . This also seems to work as sensible values are returned. plot of the quantile function:
Plot of quantiles returned for probabilities from 0 to 1 in steps of 0.001
The 'qmefit' method can then be attempted through the fitdist function:
fitdist(data,'FF', start = list(0,0.3,-0.0004,70000,13,331), method = 'qme', probs = c(0,0.3,0.4,0.5,0.7,0.9))
I chose the 'probs' values randomly as I don't fully understand their meaning. This either straight-up crashes the R session or after a brief stuttering returns a list of 'NA' values as estimates and the line <std::bad_alloc : std::bad_alloc>
I am not sure if I am making a basic mistake here and any help or recommendations are appreciated.
In the end I managed to find a work-around for this using the rPython package and lmfit from python. It solved my issue and might be useful for others with the same issue. The R-code was as follows:
library(rPython)
python.load("pyFit.py")
python.assign("row",pos)
python.assign("vals",vals)
python.exec("FWHM,ERROR,FIT = fitDist(row,vals)")
FWHM = python.get("FWHM")
ERROR = python.get("ERROR")
cFIT = python.get("FIT")
and the called python code was:
from lmfit import Model, minimize, Parameters, fit_report
from sklearn import mixture
import numpy as np
import matplotlib.pyplot as plt
import math
def cauchyDist(x,a,b,c,d,e,f,g,A,gamma,pos):
return a + b*x + c*pow(x,2) + d*pow(x,3) + e*pow(x,4) + f*pow(x,5) + g*pow(x,6) + (A/np.pi)*(gamma/((pow((x-pos),2)) + (pow(gamma,2))))
def fitDist(row, vals):
gmod = Model(cauchyDist)
x = np.arange(0,600)
result = gmod.fit(vals, x=x, a = 0, b = 0.3, c = -0.0004, d = 0, e = 0, f= 0, g = 0, A = 70000, gamma = 13, pos = row)
newFile = open('fitData.txt', 'w')
newFile.write(result.fit_report())
newFile.close()
with open('fitData.txt', 'r') as inF:
for line in inF:
if 'gamma:' in line:
j = line.split()
inF.close()
FWHM = float(j[1])
error = float(j[3])
fit = result.best_fit
fit = fit.tolist()
return FWHM, error, fit
I increased the order of polynomial to obtain a better fit for the data and returned the FWHM, its error and the values for the fit. There are likely much better ways of achieving this but the final fit is as I needed.
Final fit. Red data points are raw data, the black line is the fitted distribution.

(in R) Why is result of ksvm using user-defined linear kernel different from that of ksvm using "vanilladot"?

I wanted to use user-defined kernel function for Ksvm in R.
so, I tried to make a vanilladot kernel and compare with "vanilladot" which is built in "kernlab" as practice.
I write my kernel as follow.
#
###vanilla kernel with class "kernel"
#
kfunction.k <- function(){
k <- function (x,y){crossprod(x,y)}
class(k) <- "kernel"
k}
l<-0.1 ; C<-1/(2*l)
###use kfunction.k
tmp<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel=kfunction.k(), C = C)
alpha(tmp)[[1]]
ind<-alphaindex(tmp)[[1]]
x.s<-x[ind,] ; y.s<-y[ind]
w.class.k<-t(alpha(tmp)[[1]]*y.s)%*%x.s
w.class.k
I thouhgt result of this operation is eqaul to that of following.
However It dosn't.
#
###use "vanilladot"
#
l<-0.1 ; C<-1/(2*l)
tmp1<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel="vanilladot", C = C)
alpha(tmp1)[[1]]
ind1<-alphaindex(tmp1)[[1]]
x.s<-x[ind1,] ; y.s<-y[ind1]
w.tmp1<-t(alpha(tmp1)[[1]]*y.s)%*%x.s
w.tmp1
I think maybe this problem is related to kernel class.
When class is set to "kernel", this problem is occured.
However When class is set to "vanillakernel", the result of ksvm using user-defined kernel is equal to that of ksvm using "vanilladot" which is built in Kernlab.
#
###vanilla kernel with class "vanillakernel"
#
kfunction.v.k <- function(){
k <- function (x,y){crossprod(x,y)}
class(k) <- "vanillakernel"
k}
# The only difference between kfunction.k and kfunction.v.k is "class(k)".
l<-0.1 ; C<-1/(2*l)
###use kfunction.v.k
tmp<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel=kfunction.v.k(), C = C)
alpha(tmp)[[1]]
ind<-alphaindex(tmp)[[1]]
x.s<-x[ind,] ; y.s<-y[ind]
w.class.v.k<-t(alpha(tmp)[[1]]*y.s)%*%x.s
w.class.v.k
I don't understand why the result is different from "vanilladot", when setting the class to "kernel".
Is there an error in my operation?
First, it seems like a really good question!
Now to the point. In the sources of ksvm we can find when is a line drawn between using user-defined kernel, and the built-ins:
if (type(ret) == "spoc-svc") {
if (!is.null(class.weights))
weightedC <- class.weights[weightlabels] * rep(C,
nclass(ret))
else weightedC <- rep(C, nclass(ret))
yd <- sort(y, method = "quick", index.return = TRUE)
xd <- matrix(x[yd$ix, ], nrow = dim(x)[1])
count <- 0
if (ktype == 4)
K <- kernelMatrix(kernel, x)
resv <- .Call("tron_optim", as.double(t(xd)), as.integer(nrow(xd)),
as.integer(ncol(xd)), as.double(rep(yd$x - 1,
2)), as.double(K), as.integer(if (sparse) xd#ia else 0),
as.integer(if (sparse) xd#ja else 0), as.integer(sparse),
as.integer(nclass(ret)), as.integer(count), as.integer(ktype),
as.integer(7), as.double(C), as.double(epsilon),
as.double(sigma), as.integer(degree), as.double(offset),
as.double(C), as.double(2), as.integer(0), as.double(0),
as.integer(0), as.double(weightedC), as.double(cache),
as.double(tol), as.integer(10), as.integer(shrinking),
PACKAGE = "kernlab")
reind <- sort(yd$ix, method = "quick", index.return = TRUE)$ix
alpha(ret) <- t(matrix(resv[-(nclass(ret) * nrow(xd) +
1)], nclass(ret)))[reind, , drop = FALSE]
coef(ret) <- lapply(1:nclass(ret), function(x) alpha(ret)[,
x][alpha(ret)[, x] != 0])
names(coef(ret)) <- lev(ret)
alphaindex(ret) <- lapply(sort(unique(y)), function(x)
which(alpha(ret)[,
x] != 0))
xmatrix(ret) <- x
obj(ret) <- resv[(nclass(ret) * nrow(xd) + 1)]
names(alphaindex(ret)) <- lev(ret)
svindex <- which(rowSums(alpha(ret) != 0) != 0)
b(ret) <- 0
param(ret)$C <- C
}
The important parts are two things, first, if we provide ksvm with our own kernel, then ktype=4 (while for vanillakernel, ktype=0) so it makes two changes:
in case of user-defined kernel, the kernel matrix is computed instead of actually using the kernel
tron_optim routine is ran with the information regarding the kernel
Now, in the svm.cpp we can find the tron routines, and in the tron_run (called from tron_optim), that LINEAR kernel has a separate optimization routine
if (param->kernel_type == LINEAR)
{
/* lots of code here */
while (Cpj < Cp)
{
totaliter += s.Solve(l, prob->x, minus_ones, y, alpha, w,
Cpj, Cnj, param->eps, sii, param->shrinking,
param->qpsize);
/* lots of code here */
}
totaliter += s.Solve(l, prob->x, minus_ones, y, alpha, w, Cp, Cn,
param->eps, sii, param->shrinking, param->qpsize);
delete[] w;
}
else
{
Solver_B s;
s.Solve(l, BSVC_Q(*prob,*param,y), minus_ones, y, alpha, Cp, Cn,
param->eps, sii, param->shrinking, param->qpsize);
}
As you can see, the linear case is treated in the more complex, more detailed way. There is an inner optimization loop calling the solver many times. It would require really deep analysis of actual optimization being performed here, but at this step one can answer your question in a following way:
There is no error in your operation
kernlab's svm has a separate routine for training SVM with linear kernel, which is based on the type of kernel passed to the code, changing "kernel" to "vanillakernel" made the ksvm think it is actually working with vanillakernel, and so performed this separate optimization routine
It does not seem as a bug in fact, as the linear SVM is in fact very different from the kernelized version in terms of efficient optimization techniques. Amount of heuristic as well as numerical issues that has to be taken care of is really big. As a result, some approximations are required and can lead to the different results. While for the rich feature space (like those induced by RBF kernel) it should not really matter, for simple kernels line linear ones - this simplifications can lead to significant output changes.

Resources