Is there a function to calculate the scatter matrix in R?

Recently I have been trying to use an optimizer to perform feature selection for clustering. I need a fitness function to tell the optimizer which feature set is better, so I am using the criteria described in "Introduction to Statistical Pattern Recognition", 2nd Ed., chapter 10 (section 10.2), by Keinosuke Fukunaga.
I have found a Matlab function, ScatterMatrices(), that calculates the criterion value J.
However, I couldn't find any function similar to ScatterMatrices() in R. I would appreciate it if you could help me🙏.

withinSS: Within-class Sum of Squares Matrix
"Calculates within-class sum of squares and cross product matrix (a.k.a. within-class scatter matrix)"
The package is available in the CRAN archive under src/contrib/Archive/DiscriMiner; see "How do I install a package that has been archived from CRAN" for installation instructions.
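If you would rather avoid an archived package, the within-class and between-class scatter matrices and the trace criterion J = tr(S_W⁻¹ S_B) can be computed in a few lines of base R. This is a minimal sketch (the function name scatter_matrices and the use of the iris data are my own illustrative choices, not part of any package):

```r
# Within-class (Sw) and between-class (Sb) scatter matrices, plus the
# separability criterion J = trace(solve(Sw) %*% Sb), in base R.
scatter_matrices <- function(X, classes) {
  X <- as.matrix(X)
  overall_mean <- colMeans(X)
  Sw <- matrix(0, ncol(X), ncol(X))
  Sb <- matrix(0, ncol(X), ncol(X))
  for (cl in unique(classes)) {
    Xc <- X[classes == cl, , drop = FALSE]
    mc <- colMeans(Xc)
    centered <- sweep(Xc, 2, mc)          # subtract the class mean
    Sw <- Sw + t(centered) %*% centered   # within-class scatter
    d <- mc - overall_mean
    Sb <- Sb + nrow(Xc) * (d %*% t(d))    # between-class scatter
  }
  list(Sw = Sw, Sb = Sb, J = sum(diag(solve(Sw) %*% Sb)))
}

res <- scatter_matrices(iris[, 1:4], iris$Species)
res$J  # larger J indicates better class separability
```

For feature selection you would call this on each candidate feature subset and let the optimizer maximize J.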

Related

R package for survey raking that does automatic cell collapsing

I know there are various R packages for performing raking (i.e. calibration to external estimates, iterative proportional fitting, etc.) to construct survey weights. I wanted to find a package that would automatically collapse cells if a cell count fell below a certain value. Is there a package out there with such a feature? Or, if not raking exactly, a weighting package for a similar algorithm (e.g. GREG, entropy balancing) that has such a feature for matching to targets. Thank you.
From my initial research, packages such as "Ipfp: Multidimensional Iterative Proportional Fitting" didn't seem to have the feature I wanted.
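For readers unfamiliar with the underlying algorithm, the raking step itself (without cell collapsing) is just alternating margin adjustments. This is a minimal sketch on a two-way table in base R; the function name rake2d and the target values are illustrative assumptions:

```r
# Iterative proportional fitting (raking) of a 2-way table so its
# margins match the given row and column targets.
rake2d <- function(tab, row_targets, col_targets, tol = 1e-10, max_iter = 1000) {
  for (i in seq_len(max_iter)) {
    tab <- tab * (row_targets / rowSums(tab))              # match row margins
    tab <- sweep(tab, 2, col_targets / colSums(tab), `*`)  # match column margins
    if (max(abs(rowSums(tab) - row_targets)) < tol) break
  }
  tab
}

tab <- matrix(c(10, 20, 30, 40), 2, 2)
raked <- rake2d(tab, row_targets = c(50, 50), col_targets = c(40, 60))
rowSums(raked)  # approximately c(50, 50)
colSums(raked)  # c(40, 60), matched exactly by the final column step
```

Automatic cell collapsing would sit on top of this loop: merge a cell into a neighbor whenever its count drops below the threshold, then re-rake.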

How to compute Greeks for option pricing in a Monte Carlo simulation

I have to complete an assignment in RStudio but I'm pretty new to it:
Compute the first-order Greeks (delta, vega, theta, rho, lambda, epsilon) for a plain vanilla European option using R.
I have written code to price the option both with Monte Carlo and with a binomial tree, but I have no idea how to proceed. On the internet I found code that computes the Greeks from the Black-Scholes formulas, but that's not exactly what I'm looking for.
My teacher suggested doing it with the definition of the derivative.
Could someone help me?
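"The definition of the derivative" means bump-and-revalue: price the option at a slightly perturbed input and take a finite difference. This is a sketch for delta on a European call, reusing the same random draws for both bumps (common random numbers) so the Monte Carlo noise largely cancels; all parameter values are illustrative assumptions:

```r
# Monte Carlo price of a European call under geometric Brownian motion,
# taking the standard normal draws Z as an argument so they can be reused.
mc_call_price <- function(S0, K, r, sigma, tau, Z) {
  ST <- S0 * exp((r - 0.5 * sigma^2) * tau + sigma * sqrt(tau) * Z)
  exp(-r * tau) * mean(pmax(ST - K, 0))
}

set.seed(1)
Z <- rnorm(1e5)                 # common random numbers for both bumps
S0 <- 100; K <- 100; r <- 0.05; sigma <- 0.2; tau <- 1
h <- 0.01 * S0                  # bump size

# Central finite difference: delta ~ (P(S0+h) - P(S0-h)) / (2h)
delta <- (mc_call_price(S0 + h, K, r, sigma, tau, Z) -
          mc_call_price(S0 - h, K, r, sigma, tau, Z)) / (2 * h)
delta  # close to the Black-Scholes delta for these parameters
```

Vega, theta, and rho follow the same pattern: bump sigma, tau, or r instead of S0 and revalue with the same Z.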

Calculate Cosine Similarity for a word2vec model in R

I'm working with the "word2vec" package in R and have run into a big problem. I want to figure out which words are the closest synonyms to "uncertainty" and "economy", as in the paper by Azqueta-Gavaldon (2020), "Economic policy uncertainty in the euro area: An unsupervised machine learning approach". So I used the word2vec() function of the word2vec package to train my own word2vec model. With predict(object, ...) I can create a table that shows me the words closest to the words I'm interested in. The problem is that the similarity reported by this function is defined as sqrt(sum(x . y) / ncol(x)), which is not the cosine similarity.
I know that I can use the function cosine(x, y), but it only calculates the cosine similarity between two vectors and can't produce output like the predict function described above.
Does anyone know how to compute the cosine similarity between each word in my word2vec model and all the others, and output the most similar words to a given word based on those values?
This would really help me a lot and I am already grateful for your answers.
Kind regards,
Tom
The following GitHub gist explains how you can compute cosine similarity for word2vec models in R:
https://gist.github.com/adamlauretig/d15381b562881563e97e1e922ee37920
You can use this function on any matrix in R, and therefore on any word2vec model built in R.
Kind Regards,
Tom
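In case the gist goes away, the core idea fits in a few lines of base R: take the embedding matrix (words in rows, e.g. from as.matrix(model) in the word2vec package), compute cosine similarity between one row and all rows, and sort. The function name nearest_by_cosine and the toy random matrix are illustrative assumptions:

```r
# Cosine similarity of one word's vector against every row of an
# embedding matrix, returning the top_n most similar words.
nearest_by_cosine <- function(emb, word, top_n = 10) {
  v <- emb[word, ]
  sims <- (emb %*% v) / (sqrt(rowSums(emb^2)) * sqrt(sum(v^2)))
  sims <- sims[, 1]                       # drop to a named vector
  head(sort(sims[names(sims) != word], decreasing = TRUE), top_n)
}

# Toy example with random 5-dimensional vectors for 10 "words".
set.seed(42)
emb <- matrix(rnorm(50), nrow = 10,
              dimnames = list(paste0("word", 1:10), NULL))
nearest_by_cosine(emb, "word1", top_n = 3)
```

With a real model you would replace emb with the trained embedding matrix and pass "uncertainty" or "economy" as the word.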

Is there an R package that runs "Spatial Vector Autoregression"?

I am looking for an R package which can run "Spatial Vector Autoregression".
tandfonline.com/doi/full/10.1080/17421770701346689
According to Chen and Conley (2001), this is a "vector autoregression (VAR) whose coefficient matrix and shock covariance matrix are functions of economic distances between agents. The impact of other agents’ variables on the conditional mean of a given agent’s variable is a function of their economic distances from this agent. Similarly, covariances of VAR shocks are functions of distances between agents in the previous period, a property we refer to as being isotropic."
(Chen, X. & Conley, T.G. (2001) A new semiparametric spatial model for panel time series, Journal of Econometrics, 105, 59–83)
Surprisingly, however, I could only find packages for "Spatial Autoregression", which is still not what I need for my purpose. May I get help finding a package for this, please? Otherwise, is there an established way to run this spatial vector autoregression model in R?
I think I've found what you're looking for: devtools::install_github("James-Thorson/VAST"). VAST stands for "Vector-Autoregressive Spatio-Temporal". The package is a wrapper that builds spatio-temporal modeling on top of an underlying estimation package.
You can see coding examples here. For help, use ?VAST::VAST and follow one of the three hyperlinks at the bottom of the short description and details (make_settings, fit_model, and plot_results).
Please note:
When I installed this package to check out what it included, installation reported a conflict: the TMB package required an earlier version of the Matrix package. I had not had TMB installed before installing VAST. Installing TMB independently worked without any Matrix version conflict, but when I then loaded VAST it still gave me that error. When I loaded TMB first and then VAST, I didn't receive the warning and both libraries loaded.

in R, does a "goodness of fit" value exist for vegan's CCA, similar to NMDS's "stress" value?

I would like to know if there is a way to extract something similar to the metaMDS "stress" value from a vegan cca object? I've tried the goodness.cca function and its relatives
(http://cc.oulu.fi/~jarioksa/softhelp/vegan/html/goodness.cca.html)
They tell me about the stats per sample, but I'm interested in the overall goodness of fit for reducing a multidimensional system to two dimensions (if something like that exists, since CCA uses different calculations than NMDS).
I would like to continue with vegan, if possible, though I found this link here:
(Goodness of fit in CCA in R)
Thanks a lot
RJ
It is called the eigenvalue. People have previously complained that NMDS does not have anything like an eigenvalue, only stress. The total variation in the data is the sum of all eigenvalues, so the proportion explained is an eigenvalue (or a cumulative sum of eigenvalues) divided by that total. All of these can be extracted with eigenvals() and its summary() (see their help with ?eigenvals). Significance tests for the axes are available with the anova.cca() function (look at its documentation for references).
The web page you referred to is about another method, canonical correlations. In vegan we have that in CCorA with permutation test. Just pick your method, and then find your tools.
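To make the extraction concrete, this is a sketch using the dune example data that ships with vegan; the particular model formula is an illustrative assumption:

```r
# Extract eigenvalues and "proportion explained" from a vegan CCA.
library(vegan)
data(dune)
data(dune.env)

mod <- cca(dune ~ A1 + Management, data = dune.env)
ev <- eigenvals(mod)
summary(ev)              # proportion and cumulative proportion per axis
sum(ev[1:2]) / sum(ev)   # share of total variation on the first two axes
anova(mod, by = "axis")  # permutation test for each constrained axis
```

The two-axis share above is the closest analogue to an "overall goodness of fit for reducing to two dimensions" that CCA offers.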