Training a Hidden Markov Model in R

Is it possible to train a Hidden Markov Model in R?
I have a set of observations with their corresponding labels, and I need to train an HMM to obtain the Markov parameters (i.e. the transition probability matrix, the emission probability matrix and the initial distribution), so that I can make predictions for future observations.
In other words, I need the opposite of the Forward-Backward algorithm.

Yes, you can. R is a good tool for simulation and statistical analysis, and many good packages are available. You do not need to implement an HMM yourself (although of course you can); just learn to use one of them.
An example of using a package.
An example of implementing an HMM is here; in it, a DNA sequence is modeled using an HMM.
A similar question is asked here as well.
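Since the question says the labels are known, here is a minimal sketch of how the parameters could be estimated by simple counting, with no iterative algorithm needed. The toy vectors `states` and `symbols` are made up for illustration, and the sketch assumes a single labelled sequence with discrete observations.

```r
# Hypothetical toy data: known hidden-state labels and observed symbols.
states  <- c("A", "A", "B", "B", "A", "B", "B", "A", "A", "B")
symbols <- c("x", "y", "y", "y", "x", "y", "x", "x", "y", "y")

state_names  <- sort(unique(states))
symbol_names <- sort(unique(symbols))
n <- length(states)

# Initial distribution: with one sequence only its first label is informative.
# With many sequences, tabulate the first label of each sequence instead.
init <- as.numeric(state_names == states[1])
names(init) <- state_names

# Transition matrix: count state[t] -> state[t + 1], then normalise each row.
trans_counts <- table(factor(states[-n], levels = state_names),
                      factor(states[-1], levels = state_names))
trans <- prop.table(trans_counts, margin = 1)

# Emission matrix: count state[t] -> symbol[t], then normalise each row.
emis_counts <- table(factor(states, levels = state_names),
                     factor(symbols, levels = symbol_names))
emis <- prop.table(emis_counts, margin = 1)

print(trans)
print(emis)
```

A state that never occurs as a source would give a row of NaN here, so in practice a small pseudo-count is often added before normalising. The resulting matrices can then be handed to a package's decoding routine (e.g. Viterbi) to label future observations.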

Related

k-means clustering with survival Data

Just a basic question concerning k-means clustering analysis on survival data, like this one:
I am doing k-means clustering to identify which genes influence survival the most. However, do I include the survival time in my k-means function, or should I leave it out? That is, should I pass it to the kmeans() function, e.g. in R?
Kind regards,
Hashriama
I think that your approach is not the best one. Your goal is to select genes associated with censored/uncensored survival, so a supervised method seems the most suitable. k-means will only cluster genes by similarity, without regard to survival, and even if you added survival time to the clustering it would not make sense, because you would be ignoring censoring.
Cox regression with an added L1 penalty allows variable selection without ignoring censoring. This kind of approach seems more appropriate for your goal and fits your context better. To learn more, here is an article from Jiang Gui & Hongzhe Li that uses penalized Cox regression (look at the R package biospear too if needed):
https://academic.oup.com/bioinformatics/article/21/13/3001/196819

How to run a truncated and inflated Poisson model in R?

My data doesn't contain any zeros. The minimum value of my outcome, y, is 1, and that is the value that is inflated. My objective is to run a truncated and inflated Poisson regression model in R.
I already know how to run each regression separately (zero-truncated and zero-inflated). I want to know how to combine the two conditions into one model.
Thanks for your help.
For zero-inflated or zero-hurdle models, the standard approach is to use the pscl package. I also wrote a package that fits this kind of model here, but it is not yet mature or fully tested. Unless you have voluminous data, I still recommend pscl, which is more flexible, robust and better documented.
For zero-truncated models, you can have a look at the VGAM::vglm function. You might find useful information here.
Note that the two kinds of model do not make the same distributional assumption, so you won't need the same estimation data. Given the description of your dataset, I think you are looking for a zero-truncated model (since you do not observe zeros). With zero-inflated models, you decompose the observed pattern into zeros generated by a selection model and other values generated by a count data model. That does not look consistent with your dataset.
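To make the zero-truncated assumption concrete, here is a rough sketch that fits a zero-truncated Poisson regression by maximising the likelihood directly with base R's optim(). The simulated data, the coefficient values 0.5 and 0.3, and the function name `ztp_nll` are all made up for illustration; this hand-written likelihood only stands in for a packaged fitter, and it does not model the inflation at 1.

```r
set.seed(1)

# Hypothetical data: Poisson counts from which the zeros are discarded.
n_raw <- 5000
x_raw <- rnorm(n_raw)
y_raw <- rpois(n_raw, lambda = exp(0.5 + 0.3 * x_raw))
keep  <- y_raw > 0
x <- x_raw[keep]
y <- y_raw[keep]

# Negative mean log-likelihood of a zero-truncated Poisson regression:
#   P(Y = y | Y > 0) = dpois(y, mu) / (1 - exp(-mu)),  log(mu) = b0 + b1 * x
ztp_nll <- function(beta) {
  mu <- exp(beta[1] + beta[2] * x)
  -mean(dpois(y, mu, log = TRUE) - log1p(-exp(-mu)))
}

fit <- optim(c(0, 0), ztp_nll, method = "BFGS")
beta_hat <- fit$par
print(beta_hat)
```

The only difference from an ordinary Poisson likelihood is the `log1p(-exp(-mu))` term, which renormalises the distribution over the values 1, 2, 3, and so on.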

Decisional boundary SVM caret (R)

I have built an SVM-RBF model in R using caret. Is there a way of plotting the decision boundary?
I know it is possible to do so with other R packages, but unfortunately I'm forced to use caret because it is the only package I found that lets me calculate variable importance.
Alternatively, can you suggest a package that can plot the decision boundaries AND also gives the variable importance?
Thank you very much
First of all, unlike some other methods, SVM does not produce feature importance. In your case, the importance score caret reports is calculated independently of the method itself: https://topepo.github.io/caret/variable-importance.html#model-independent-metrics
Second, the decision boundary (or hyperplane) you see in most textbook examples is based on a toy problem with only two or three features. If you have more than three features, it is not trivial to visualize this hyperplane.
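For the two-feature case, the usual trick is to predict on a regular grid and draw the contour where the predicted class flips. The sketch below uses made-up data and, to stay within base R, a logistic regression as a stand-in classifier (it is NOT an SVM); with a fitted caret model the same grid would be passed to predict() in the same way.

```r
set.seed(7)

# Hypothetical two-feature data with overlapping classes.
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- factor(ifelse(d$x1 + d$x2 + rnorm(n, sd = 0.8) > 0, "pos", "neg"))

# Stand-in classifier: logistic regression, used only to illustrate the grid idea.
model <- glm(y ~ x1 + x2, data = d, family = binomial)

# Evaluate the classifier on a regular grid covering the data range.
gx1 <- seq(min(d$x1), max(d$x1), length.out = 100)
gx2 <- seq(min(d$x2), max(d$x2), length.out = 100)
grid <- expand.grid(x1 = gx1, x2 = gx2)
grid$p <- predict(model, newdata = grid, type = "response")

# expand.grid varies x1 fastest, so rows of z index x1 and columns index x2.
z <- matrix(grid$p, nrow = length(gx1))

# The 0.5 contour of the predicted probability is the decision boundary.
plot(d$x1, d$x2, col = d$y, pch = 19, xlab = "x1", ylab = "x2")
contour(gx1, gx2, z, levels = 0.5, add = TRUE, drawlabels = FALSE)
```

With more than two features, one common workaround is to fix the remaining features at some value (or project the data to two dimensions first) and draw the same kind of grid plot on that slice.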

Quantile Regression with Time-Series Models (ARIMA-ARCH) in R

I am working on quantile forecasting with time-series data. The model I am using is ARIMA(1,1,2)-ARCH(2) and I am trying to get quantile regression estimates of my data.
So far, I have found "quantreg" package to perform quantile regression, but I have no idea how to put ARIMA-ARCH models as the model formula in function rq.
The rq function seems to work for regressions with dependent and independent variables, but not for time series.
Is there some other package in R that accepts time-series models and performs quantile regression? Any advice is welcome. Thanks.
I just put an answer on the Data Science forum.
It basically says that most ready-made packages use so-called exact tests, based on an assumption about the distribution (independent, identically distributed normal (Gaussian), or wider).
You also have a family of resampling methods, in which you simulate a sample with a distribution similar to that of your observed sample, fit your ARIMA(1,1,2)-ARCH(2) and repeat the process a great number of times. Then you analyze this great number of forecasts and measure (as opposed to compute) your confidence intervals.
The resampling methods differ in how they generate the simulated samples. The most used are:
The Jackknife: in which you "forget" one point, that is, you simulate n samples of size n-1 (if n is the size of the observed sample).
The Bootstrap: in which you simulate a sample by taking n values from the original sample with replacement: some will be taken once, some twice or more, some never,...
It is a (not easy) theorem that the expectation of the confidence intervals, as for most of the usual statistical estimators, is the same on the simulated samples as on the original sample. The difference is that you can measure them with a great number of simulations.
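A small sketch of the two resampling schemes in base R follows. The sample `obs` is simulated, and the statistic is just the sample mean, used as a placeholder for "fit the model and forecast". Note that it resamples independent observations; for serially dependent data one would typically resample model residuals or blocks instead, which this sketch does not attempt.

```r
set.seed(42)

obs <- rnorm(200, mean = 1, sd = 2)   # hypothetical observed sample
n <- length(obs)
stat <- function(v) mean(v)           # placeholder for "fit model, forecast"

# Bootstrap: draw n values with replacement, recompute, repeat many times.
boot_stats <- replicate(2000, stat(sample(obs, size = n, replace = TRUE)))
boot_ci <- quantile(boot_stats, probs = c(0.025, 0.975))  # measured interval

# Jackknife: leave out one point at a time, giving n samples of size n - 1.
jack_stats <- vapply(seq_len(n), function(i) stat(obs[-i]), numeric(1))
jack_se <- sqrt((n - 1) / n * sum((jack_stats - mean(jack_stats))^2))

print(boot_ci)
print(jack_se)
```

For the sample mean, the jackknife standard error coincides with the textbook sd(obs) / sqrt(n), which makes it a convenient sanity check before swapping in a more expensive statistic.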
Hello and welcome to StackOverflow. Please take some time to read the help page, especially the sections named "What topics can I ask about here?" and "What types of questions should I avoid asking?". And more importantly, please read the Stack Overflow question checklist. You might also want to learn about Minimal, Complete, and Verifiable Examples.
I can try to address your question, although this is hard since you don't provide any code or data. Also, I guess that by "put ARIMA-ARCH models" you actually mean that you want to make an integrated series stationary using an ARIMA(1,1,2) filter plus an ARCH(2) filter.
For an overview of the R time-series capabilities you can refer to the CRAN task list.
You can easily apply these filters in R with an appropriate function.
For instance, you could use the Arima() function from the forecast package, then compute the residuals with residuals() from the stats package. Next, you can use this filtered series as input for the garch() function from the tseries package. Other approaches are of course possible. Finally, you can apply quantile regression to this filtered series. For instance, you can check out the dynrq() function from the quantreg package, which allows time-series objects in the data argument.

how to perform semi-supervised k-mean clustering

I am new to R. I am trying to perform semi-supervised k-means clustering. I plan to use 2/3 of my data as a training set and 1/3 as a test set. My objective is to train a model using the known clusters and then propagate the trained model to the test set. The propagation result will be compared with the prior clusters, in order to check the prediction accuracy of k-means clustering. Is there a way to do semi-supervised k-means clustering in R? Is any package needed? Thank you.
Regards,
Use kmeans(). It comes with the stats package, which you should have if you've installed R correctly. You can read how to use a function by putting a ? before its name, e.g. ?kmeans.
Search online if you're still lost about how to use the function; there are plenty of guides and toy examples.
M
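As a rough sketch of the train/test workflow described in the question, the code below runs plain kmeans() on a 2/3 training split, names each cluster after its most common known label, and assigns test points to the nearest centroid. It uses the built-in iris data as a stand-in for the asker's data, and it is ordinary k-means with a post-hoc label mapping, not a genuinely semi-supervised algorithm.

```r
set.seed(123)

X <- as.matrix(iris[, 1:4])   # stand-in features
labels <- iris$Species        # stand-in for the known clusters

# Split: 2/3 training, 1/3 test.
train_idx <- sample(nrow(X), size = round(2 / 3 * nrow(X)))
X_train <- X[train_idx, ]
X_test  <- X[-train_idx, ]

km <- kmeans(X_train, centers = 3, nstart = 25)

# Name each cluster after the most common known label among its training points.
cluster_label <- tapply(as.character(labels[train_idx]), km$cluster,
                        function(l) names(which.max(table(l))))

# Assign each test point to the nearest centroid (squared Euclidean distance).
nearest <- apply(X_test, 1, function(p) {
  which.min(colSums((t(km$centers) - p)^2))
})

pred <- cluster_label[as.character(nearest)]
accuracy <- mean(pred == as.character(labels[-train_idx]))
print(accuracy)
```

kmeans() has no predict method of its own, which is why the nearest-centroid step is written out by hand here.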
