Parameters C, epsilon as vectors in kernlab's ksvm in R

I am trying to use the ksvm function of the kernlab package in R for epsilon-SVM regression. I want to supply the parameters C (regularization constant) and epsilon (insensitivity) as vectors whose length equals the length of the training data, but I am not able to figure out how to do this. Please suggest a way.

Why do you assume that you can do this? According to the documentation of ksvm, you can only weight classes, not individual samples. Such a modification is available in, for example, Python's scikit-learn library (as sample weights).
To implement per-sample C weights artificially, you could oversample your data. This will be very inefficient (especially if the C values differ widely), but it can be applied to almost any SVM library.
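A minimal sketch of that oversampling trick (the toy data and the integer C values are assumptions for illustration): replicate each training row in proportion to its desired C, then fit an ordinary eps-svr model with a single base C.

    library(kernlab)

    # Toy regression data (assumed for illustration)
    set.seed(42)
    x <- matrix(rnorm(200), ncol = 2)
    y <- x[, 1] + 0.1 * rnorm(100)

    # Hypothetical per-sample C weights, expressed as small integers so
    # that each row can simply be replicated that many times
    C_per_sample <- sample(1:5, 100, replace = TRUE)

    idx <- rep(seq_len(nrow(x)), times = C_per_sample)  # oversample by weight
    fit <- ksvm(x[idx, ], y[idx], type = "eps-svr", C = 1, epsilon = 0.1)

Note that a per-sample epsilon cannot be emulated this way: replication only scales each sample's contribution to the loss, which is exactly what a per-sample C does, while epsilon sets the tube width shared by all samples.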

Related

Preventing underforecasting of support vector regression in R

I'm currently using the e1071 package in R to forecast product demand via support vector regression with the package's svm function. While support vector regression yields much higher forecast accuracy on my data than other methods (e.g. ARIMA, simple exponential smoothing), my results show that the svm function tends to underforecast. In my particular case, underforecasting is worse and much more expensive than overforecasting. Therefore, I want to implement something in R that tells support vector regression to penalize underforecasting much more than overforecasting.
Unfortunately, I can't really find any way to do this. There seems to be nothing on this in the e1071 package. The kernlab package has a support vector function (ksvm) that implements an 'eps-bsvr bound-constraint svm regression', but I can't find any information on what is meant by bound-constraint or how to define that bound.
Has anyone seen any examples of how to do this in R? I'm only finding very mathematical papers on asymmetric loss functions for support vector regression, and I don't have the skills to translate these into R code, so I'm looking for an already existing solution in R.
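One hedged workaround (a sketch, not a built-in asymmetric loss): keep e1071's symmetric fit and add a constant offset to the forecasts, choosing the offset so that a pinball-style loss, in which underforecast errors cost more, is minimized on a validation set. The toy data and the 4:1 cost ratio below are assumptions.

    library(e1071)

    # Toy demand series (assumed for illustration)
    set.seed(1)
    x <- seq_len(100)
    y <- 50 + 0.5 * x + rnorm(100, sd = 5)
    train <- data.frame(x = x[1:80],  y = y[1:80])
    valid <- data.frame(x = x[81:100], y = y[81:100])

    fit <- svm(y ~ x, data = train)           # eps-regression by default
    res <- valid$y - predict(fit, valid)      # positive = underforecast

    under_cost <- 4                           # underforecasting assumed 4x worse
    asym_loss <- function(delta) {
      e <- res - delta
      sum(ifelse(e > 0, under_cost * e, -e))  # pinball-style asymmetric loss
    }
    delta <- optimize(asym_loss, range(res))$minimum
    adjusted <- predict(fit, valid) + delta   # bias-corrected forecasts

This only shifts the forecasts upward by a constant; it does not reshape the fitted function the way a true asymmetric loss would.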

R CRAN Neural Network Package compute vs prediction

I am using R with the neuralnet package (see the docs at https://cran.r-project.org/web/packages/neuralnet/neuralnet.pdf). I have used the neuralnet function to build and train my model.
Now that I have built my model, I want to test it on real data. Could someone explain whether I should use the compute or the prediction function? I have read the documentation, but it isn't clear; both functions seem to do something similar.
Thanks
The short answer is to use compute to make predictions.
You can see an example of using compute on the test set here. We can also see that compute is the right one from the documentation:
compute, a method for objects of class nn, typically produced by neuralnet. Computes the outputs of all neurons for specific arbitrary covariate vectors given a trained neural network.
The above says that you can pass in covariate vectors to compute the output of the neural network, i.e. make a prediction.
On the other hand, prediction does what the title of its documentation entry says:
Summarizes the output of the neural network, the data and the fitted values of glm objects (if available)
Moreover, it only takes two arguments, the nn object and a list of glm models, so there is no way to pass in a test set in order to make a prediction.
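A minimal sketch of compute in use (the toy data, formula, and hidden-layer size are assumptions):

    library(neuralnet)

    # Toy training data (assumed for illustration)
    set.seed(1)
    train <- data.frame(x1 = runif(50), x2 = runif(50))
    train$y <- train$x1 + train$x2

    nn <- neuralnet(y ~ x1 + x2, data = train, hidden = 3)

    # compute() takes the covariates of new data and returns the
    # network output in $net.result -- i.e. the predictions
    test <- data.frame(x1 = runif(10), x2 = runif(10))
    pred <- compute(nn, test)$net.result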

How matrix inversion is done in the "krige" function of R's gstat package

I am midway through understanding how the gstat package in R implements the kriging method. I have understood the calculation of the empirical semivariogram and the fitting of semivariogram models, but I do not understand how it implements the matrix inversion needed to calculate the weights of the kriging estimator. I have a large data set containing 50000 lat-long-precipitation triplets. Theoretically, a 50000x50000 matrix must be inverted in order to get the weights, and such a matrix would take several GB of main memory, which is impractical.
My question is: how does the krige function do all of this within a second?
Regards,
Chandan
You didn't say what your computing environment is, but I believe it is safe to say that it didn't solve a 50000-point kriging problem in a second. In order to understand what it did, please provide more information, e.g. the commands you used and the output gstat gave.
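For context, a hedged sketch of how krige can stay fast on large data sets: its nmax argument restricts each prediction to the nearest observations, so many small linear systems are solved instead of one 50000x50000 inversion. The variogram parameters below are assumptions; meuse is example data shipped with sp.

    library(gstat)
    library(sp)

    data(meuse)
    coordinates(meuse) <- ~x+y
    data(meuse.grid)
    coordinates(meuse.grid) <- ~x+y

    vgm_model <- vgm(psill = 0.6, model = "Sph", range = 900, nugget = 0.05)

    # Local kriging: each prediction uses only its 30 nearest neighbours,
    # so only 30x30 systems are solved rather than an n x n inversion
    pred <- krige(log(zinc) ~ 1, meuse, meuse.grid, model = vgm_model,
                  nmax = 30)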

Output posterior distribution from bayesian network in R (bnlearn)

I'm experimenting with Bayesian networks in R and have built some networks using the bnlearn package. I can use them to make predictions for new observations with predict(); however, I would also like to get the posterior distribution over the possible classes. Is there a way of retrieving this information?
It seems there is a prob parameter that does this for the naive Bayes implementation in the bnlearn package, but not for networks fitted with bn.fit.
Thankful for any help with this.
See the documentation of bnlearn.
The predict function implements prob only for naive.bayes and TAN models. In short, this is because the other methods do not necessarily compute posterior probabilities.
[bnlearn] predict returns the predicted values for node given the data specified by data. Depending on the value of method, the predicted values are computed as follows: a) parents, b) bayes-lw.
When using bayes-lw, likelihood weighting simulations are performed to make predictions.
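A minimal sketch of retrieving the posteriors via the naive Bayes route (learning.test ships with bnlearn; using node "A" as the class variable is an arbitrary choice for illustration):

    library(bnlearn)

    data(learning.test)
    train <- learning.test[1:4000, ]
    test  <- learning.test[4001:5000, ]

    nb <- naive.bayes(train, training = "A")   # class node "A"

    # prob = TRUE attaches the posterior distribution over the classes
    pred <- predict(nb, test, prob = TRUE)
    posterior <- attr(pred, "prob")            # class probabilities per observation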
Hope this helps. :)

Sample size (sampsize) argument in cforest()?

I've been working with the randomForest package in R, which is great for classification yet weak when it comes to calculating variable importance for datasets with highly correlated predictors (like mine).
I want to switch over to the party package (to use cforest), but I can't find the equivalent of the sampsize argument for cforest(). This argument is particularly important for me because I have a highly imbalanced dataset and have to use sampling methods to cope with that problem.
Alternatively, is there a way to pass a randomForest forest to cforest (can S3 objects be transformed into S4 objects?!), so that I can train the classifier with randomForest and use the party package to get the variable importance?
Many thanks...
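A hedged sketch of the closest options I know of in party (behaviour as documented in ?cforest and ?cforest_control; the derived outcome and the weight values are assumptions): cforest_control's replace and fraction govern how each tree's sample is drawn, and cforest's weights argument acts as per-observation sampling weights, which can up-weight a minority class.

    library(party)

    # Imbalanced toy outcome built from iris (assumed for illustration)
    data(iris)
    iris$rare <- factor(ifelse(iris$Species == "setosa", "yes", "no"))

    # Integer weights up-weight the rare class when trees draw their samples
    w <- ifelse(iris$rare == "yes", 2, 1)

    ct <- cforest(rare ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                  data = iris, weights = w,
                  controls = cforest_unbiased(ntree = 500, mtry = 2))

    # Conditional importance is designed for correlated predictors
    varimp(ct, conditional = TRUE)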
