Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I want to use the R recurrent neural network package rnn to classify the polarity of aspect-sentiment pairs. For example, the inputs are pre-trained embeddings of the words "speed" and "fast", and I expect the RNN classifier to output a class label for this pair.
Could you give me some guidance on using the rnn package for this task?
What should the input X and output Y of the trainr() function be?
A first place to look would be at the documentation of the package using:
library(rnn)
help(rnn)
This page refers to the two main functions of the package, trainr() and predictr().
The trainr() help page describes the form of the inputs, which should be numeric arrays. For class labels you should thus transform your character data to a factor using as.factor() and then to numeric codes.
The help page also gives an example. For longer worked examples, have a look at:
vignette('rnn')
and
vignette('sinus')
You can also find the vignettes on the CRAN page.
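The shapes trainr() expects can be sketched with made-up data; the point here is the array layout, not the values. Everything below (50 samples, 4-dimensional "embeddings", the hyperparameters) is invented for illustration:

```r
set.seed(1)
n_samples <- 50; n_time <- 2; n_feat <- 4   # 2 time steps: one word pair

# X: numeric array of dim (samples, time steps, input units)
X <- array(runif(n_samples * n_time * n_feat),
           dim = c(n_samples, n_time, n_feat))

# Y: one binary polarity label per sample, repeated at each time step
label <- sample(0:1, n_samples, replace = TRUE)
Y <- matrix(label, nrow = n_samples, ncol = n_time)

if (requireNamespace("rnn", quietly = TRUE)) {
  model <- rnn::trainr(Y = Y, X = X,
                       learningrate = 0.05,
                       hidden_dim = 8,
                       numepochs = 20)
  pred <- rnn::predictr(model, X)  # same (samples, time) layout as Y
}
```

In other words, the two embedding vectors of a pair become the two time steps of one sample, and the class label is read off the final time step of the prediction.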
Closed 5 years ago.
I have a known configuration of nodes, weights, bias values, and activation function (tanh) for a neural network. I'd like to build that neural network as a 'neural network' object in R by prescribing the parts, not by fitting a network. How can I do this? I see many options for fitting a neural network, but cannot find out how to build one when I already know the components.
R does provide a startweights argument (in the neuralnet package) to initialize custom weights; see the linked StackOverflow thread. I also haven't seen any documentation for fixing the bias values and transfer function directly.
Either use MATLAB (which is not a good idea for an R expert) or, better, design a custom network based on the following fact:
An ANN is just a set of math operations on input and output vectors, where the operations adjust weights based on an error term in a loop using simple back-propagation. Use vectors and math operations only in R to design a simple ANN with back-propagation training.
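As a minimal sketch of that idea for the forward pass with prescribed (not fitted) parameters, where every weight and bias below is a made-up number, a two-input, three-hidden-unit, one-output network with tanh activation is just matrix products in base R:

```r
# Prescribed parameters: 2 inputs -> 3 hidden units -> 1 output
W1 <- matrix(c( 0.5, -0.2,
                0.1,  0.8,
               -0.3,  0.4), nrow = 3, byrow = TRUE)  # hidden weights (3 x 2)
b1 <- c(0.1, -0.1, 0.0)                              # hidden biases
W2 <- matrix(c(0.7, -0.5, 0.2), nrow = 1)            # output weights (1 x 3)
b2 <- 0.05                                           # output bias

# Forward pass: tanh in the hidden layer, linear output
forward <- function(x) {
  h <- tanh(W1 %*% x + b1)
  as.numeric(W2 %*% h + b2)
}

forward(c(1, 2))
```

Training would add the back-propagation loop on top, but for a network whose components are already known, this forward function is the whole "object".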
Closed 6 years ago.
I have 22 companies' responses to 22 questions/parameters in a 22x22 matrix. I applied a clustering technique, which gives me different groups with similarities.
Now I would like to find correlations between the parameters and the companies' preferences. Which technique is most suitable in R?
Normally we would build a Bayesian network to find a graphical relationship between the different parameters from data. As this data set is very limited, how can I build a Bayesian network for it?
Any suggestions for analyzing this data are welcome.
Try looking at feature selection and feature importance in R; it's simple.
This could guide you: http://machinelearningmastery.com/feature-selection-with-the-caret-r-package/
Some good packages: https://cran.r-project.org/web/packages/FSelector/FSelector.pdf and https://cran.r-project.org/web/packages/varSelRF/varSelRF.pdf
This is a good Stack Exchange question with good answers: https://stats.stackexchange.com/questions/56092/feature-selection-packages-in-r-which-do-both-regression-and-classification
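Before reaching for a package, one concrete base-R starting point is cor() for the parameter-parameter relationships and dist()/hclust() for the grouping step already done. The response matrix below is random stand-in data, not the real survey:

```r
set.seed(42)
# Stand-in for the real 22 x 22 matrix: companies in rows, parameters in columns
responses <- matrix(sample(1:5, 22 * 22, replace = TRUE), nrow = 22,
                    dimnames = list(paste0("company", 1:22),
                                    paste0("param",   1:22)))

# Pairwise correlations between parameters (columns)
param_cor <- cor(responses)

# The parameters most strongly associated with param1 (excluding itself)
head(sort(param_cor[, "param1"], decreasing = TRUE)[-1], 3)

# Hierarchical clustering of companies, mirroring the clustering already done
company_groups <- cutree(hclust(dist(responses)), k = 3)
table(company_groups)
```

With only 22 observations, treat any correlation structure found this way as exploratory rather than as evidence for a full Bayesian network.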
Closed 6 years ago.
From the introduction to the SAS MODEL statement:
INCLUDE=n
forces the first n independent variables listed in the MODEL statement to be included in all models. The selection methods are performed on the other variables in the MODEL statement. The INCLUDE= option is not available with SELECTION=NONE.
How can I do the equivalent in R?
I think you will find that R users are mostly averse (on solid theoretical grounds) to mimicking SAS's stepwise regression functions. However, you will find that the scope argument of step() has 'upper' and 'lower' components; you should probably first read the ?step help page and then create a value for 'lower'.
scope
defines the range of models examined in the stepwise search. This should be either a single formula, or a list containing components upper and lower, both formulae. See the details for how to specify the formulae and how they are used.
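A minimal sketch of that 'lower' usage, with an invented data frame and coefficients: pinning x1 into every candidate model is the closest analogue of SAS's INCLUDE=1.

```r
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- 2 * d$x1 + 0.5 * d$x3 + rnorm(100)

full <- lm(y ~ x1 + x2 + x3, data = d)

# 'lower' keeps x1 in all models searched; 'upper' bounds the search space
fit <- step(full,
            scope = list(lower = ~ x1, upper = ~ x1 + x2 + x3),
            direction = "both", trace = 0)
formula(fit)   # x1 is guaranteed to appear; x2/x3 survive only if they help
```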
Closed 7 years ago.
I am trying to build predictive models from text data. I built a document-term matrix from the text data (unigrams and bigrams) and trained different types of models on it (SVM, random forest, nearest neighbor, etc.). All the techniques gave decent results, but I want to improve them. I tried tuning the models by changing parameters, but that doesn't seem to improve performance much. What are the possible next steps for me?
This isn't really a programming question, but anyway:
If your goal is prediction, as opposed to text classification, the usual methods are backoff models (e.g. Katz backoff) and interpolation/smoothing, e.g. Kneser-Ney smoothing.
More complicated models like random forests are, AFAIK, not strictly necessary and may pose problems if you need to make predictions quickly. If you are using an interpolation model, you can still tune the model parameters (the lambdas) on a held-out portion of the data.
Finally, I agree with NEO on the reading part and would recommend "Speech and Language Processing" by Jurafsky and Martin.
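As a toy illustration of the interpolation idea, here is a bigram model mixed with a unigram model in base R. The corpus and the lambda value are invented, and real smoothing such as Kneser-Ney discounts the counts far more carefully than this:

```r
corpus <- c("the", "cat", "sat", "on", "the", "mat", "the", "cat", "ran")

unigram     <- table(corpus) / length(corpus)           # unigram probabilities
bigram_keys <- paste(head(corpus, -1), tail(corpus, -1))
bigram      <- table(bigram_keys)                       # bigram counts

# P(w2 | w1) = lambda * ML bigram estimate + (1 - lambda) * unigram estimate
p_interp <- function(w1, w2, lambda = 0.7) {
  p_uni <- if (w2 %in% names(unigram)) unigram[[w2]] else 0
  n_w1  <- sum(head(corpus, -1) == w1)   # times w1 has a successor
  key   <- paste(w1, w2)
  p_bi  <- if (n_w1 > 0 && key %in% names(bigram)) bigram[[key]] / n_w1 else 0
  lambda * p_bi + (1 - lambda) * p_uni
}

p_interp("the", "cat")   # 0.7 * (2/3) + 0.3 * (2/9) = 0.5333...
```

The lambda here is exactly the parameter you would tune on held-out data.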
Closed 7 years ago.
I am working on a data set with a lot of raw text that I am vectorizing and using in my matrix for a random forest regression. My question is: should I be treating each word as a factor or as numeric if it is a sparse matrix? Which one speeds up computation?
My understanding is that R matrices can't hold factors (they get coerced to character), so you're better off using numeric.
I'm not terribly familiar with randomForest -- I have a general idea of what it does, but I'm not sure about the guts of its R implementation. If you need to give it a design matrix (as when implementing ANOVAs or GLMs by hand), you can try the model.matrix() function.
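A small illustration of model.matrix() with made-up data: the factor column is expanded into numeric dummy columns, so the resulting design matrix is fully numeric and safe to pass to matrix-based learners.

```r
# Hypothetical mixed data frame: one numeric and one categorical column
d <- data.frame(len = c(4.2, 5.1, 3.9, 6.0),
                pos = factor(c("noun", "verb", "noun", "adj")))

# model.matrix() expands the factor into 0/1 dummy columns
# (the first factor level, "adj", becomes the reference level)
X <- model.matrix(~ len + pos, data = d)
colnames(X)   # "(Intercept)" "len" "posnoun" "posverb"
```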