Discrete Markov chain sequence generator: is there a specific example using a Markov chain library?

I would like to use a Markov chain to generate a sequence of states according to the probabilities in a given transition matrix.
I have already calculated the transition matrix, so I just need a random state generator.
As far as I know, there is a library for discrete Markov chains.
Are there any specific examples I could follow?
Thanks a lot! Here is an example of a transition matrix with 3 states. I would like to create a sequence of 5 timesteps.
A = np.array([[0.4, 0.3, 0.3], [0.5, 0.1, 0.4], [0, 1, 0]])
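A dedicated library is not strictly needed for this. As a minimal sketch, assuming the rows of A are the "from" states and that the chain starts in state 0 (the question does not say where it starts), the sequence can be drawn row by row with NumPy. The function name simulate_chain and its parameters are made up for this illustration.

```python
import numpy as np

def simulate_chain(A, n_steps, start_state=0, seed=None):
    """Draw n_steps states from a discrete Markov chain.

    A[i, j] is taken to be the probability of moving from state i to state j.
    start_state and seed are illustrative parameters, not part of the question.
    """
    A = np.asarray(A, dtype=float)
    if not np.allclose(A.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    rng = np.random.default_rng(seed)
    states = [start_state]
    for _ in range(n_steps - 1):
        # sample the next state from the row of the current state
        states.append(int(rng.choice(A.shape[0], p=A[states[-1]])))
    return states

A = np.array([[0.4, 0.3, 0.3], [0.5, 0.1, 0.4], [0, 1, 0]])
sequence = simulate_chain(A, n_steps=5, start_state=0, seed=0)
print(sequence)
```

The returned list includes the start state, so n_steps=5 gives five states in total. Because the last row of A is [0, 1, 0], state 2 is always followed by state 1.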

Related

Can we test if distance matrices are significantly farther apart?

I work in the field of linguistics, and my current project involves many distance matrices generated from language data, which measure the distance and similarity of dialects. Concretely, the values in my distance matrices range between 0 and 1, where 0 represents no distance and 1 represents maximal distance between dialects. I am wondering whether there exist statistical significance tests with which we can test if dialect A and dialect B are significantly farther apart than B and C. Alternatively, is there a customary threshold, say 0.5, whereby distances > 0.5 indicate that dialects are more different than similar? For instance, consider the distances between CMM_Press and CTM_Press on one hand, and CMM_Other and CTM_Other on the other hand in the distance matrices below.
In the four distance matrices above, I am especially interested in the distances of the following pairs:
CMM_Press and CTM_Press: 0.2, 0.19, 0.5, 0.4;
CMM_Other and CTM_Other: 0.6, 0.41, 0.4, 0.69.
Are there significance tests with which I can test whether, say, CMM_Other and CTM_Other are significantly farther apart than CMM_Press and CTM_Press?
To make the question easier to answer, you can find the dataset, an R Markdown file containing the distance matrices, and the scripts for the analysis in this OSF link.
In addition, I would like to know whether there exists a good reference on how to interpret distance matrices (e.g., from an ecology point of view, where the Mantel test was invented).
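One option that can be tried directly on the two rows of numbers quoted above is an exact paired sign-flip permutation test. The following is only a sketch: it assumes that the four matrices can be treated as four independent paired measurements, which may not hold for this dataset, and it is not the Mantel test mentioned in the question. The function name paired_sign_flip_test is made up for this illustration.

```python
from itertools import product

import numpy as np

def paired_sign_flip_test(a, b):
    """One-sided exact sign-flip test of whether b tends to exceed a.

    Assumes the pairs (a[i], b[i]) are independent of each other.
    Returns the observed mean difference and the permutation p-value.
    """
    d = np.asarray(b, dtype=float) - np.asarray(a, dtype=float)
    observed = d.mean()
    count = 0
    total = 0
    # under the null hypothesis the sign of each paired difference is arbitrary
    for signs in product((1.0, -1.0), repeat=d.size):
        total += 1
        if (np.abs(d) * np.array(signs)).mean() >= observed - 1e-12:
            count += 1
    return observed, count / total

press = [0.2, 0.19, 0.5, 0.4]    # CMM_Press vs CTM_Press
other = [0.6, 0.41, 0.4, 0.69]   # CMM_Other vs CTM_Other
observed_diff, p_value = paired_sign_flip_test(press, other)
print(observed_diff, p_value)
```

With only four pairs there are just 16 sign patterns, so the smallest attainable p-value is 1/16 and the test has very little power.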

How to train a hidden markov model with constrained probabilities (or missing links between hidden states)?

I have a hidden Markov model (HMM) with 3 hidden states and 2 discrete emission symbols. I know that the probability of transitioning from state 2 to state 3 is 0 (i.e. there is no direct link from S2 to S3). What is the best way to fit the parameters of this model, while enforcing the constraint, given an observed sequence of symbols?
Can this be done in Python's hmmlearn?
This turned out to be quite easy in hmmlearn. Below is a code example that illustrates the approach.
import hmmlearn.hmm

class ConstrainedGaussianHMM(hmmlearn.hmm.GaussianHMM):
    def _do_mstep(self, stats):
        # do the standard HMM learning step
        super()._do_mstep(stats)
        # NOTE: the mapping of state indices to the data is nondeterministic,
        # so you should find a heuristic to identify the correct ones
        s2, s3 = 1, 2
        # manipulate the transition matrix as you see fit
        self.transmat_[s2, s3] = 0.0
        # renormalize the row so that it still sums to 1
        self.transmat_[s2] /= self.transmat_[s2].sum()
A complete code example can be found at https://github.com/jonnor/machinehearing/blob/d557001e697f01ac5d7498e5cad00363bd8205a2/handson/constrained-hmm/ConstrainedHMM.ipynb

Extract Fourier coefficients from fft() in R

I need to derive the Fourier series coefficients associated with the (i-1)-th harmonic from the fft() function in R. Any ideas?
For instance
Adding these concepts we get the general form of the Fourier Series:
f(t)=a_0+∑_k a_k×sin(kwt+ρ_k)
where a_0 is the DC component, w=2πf_0, where f_0 is the fundamental frequency of the original wave.
Each wave component a_k×sin(kwt+ρ_k) is also called a harmonic.
If I fix the number of harmonics at 2, I would like to derive a_0, a_1, a_2 from fft().
It's a very general question. You can look here: Fourier Transform: A R Tutorial, for a start.
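For the sine-with-phase form quoted in the question, the mapping from FFT bins to a_0, a_k and ρ_k can be sketched as follows. This is written with NumPy rather than R; np.fft.fft uses the same unnormalized convention as R's fft(), so the arithmetic carries over. It assumes that the samples are evenly spaced, cover exactly one period of the fundamental, and that the number of harmonics is below half the number of samples. The function name harmonic_coefficients is made up for this illustration.

```python
import numpy as np

def harmonic_coefficients(samples, n_harmonics):
    """Return a_0, the amplitudes a_k and the phases rho_k (k = 1..n_harmonics)
    of f(t) = a_0 + sum_k a_k * sin(k*w*t + rho_k).

    Assumes the samples span exactly one fundamental period.
    """
    x = np.asarray(samples, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)                  # unnormalized DFT
    a0 = spectrum[0].real / n                 # DC component is the mean
    k = np.arange(1, n_harmonics + 1)
    amplitudes = 2.0 * np.abs(spectrum[k]) / n
    # bin k describes a_k * cos(k*w*t + angle); adding pi/2 converts cos to sin
    phases = np.angle(spectrum[k]) + np.pi / 2
    return a0, amplitudes, phases

# synthetic check signal with known coefficients
n = 64
t = np.arange(n) / n                          # one period, f_0 = 1
w = 2 * np.pi
signal = 1.5 + 2.0 * np.sin(w * t + 0.3) + 0.5 * np.sin(2 * w * t - 1.0)
a0, amps, phases = harmonic_coefficients(signal, 2)
print(a0, amps, phases)
```

If the record covers several periods of the fundamental, harmonic k sits in a different bin (k times the number of periods), so the indexing would need to be adjusted.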

Calculate Normals from Heightmap

I am trying to convert a heightmap into a matrix of normals using central differencing, which will later be used to measure the steepness at a given point.
I found several links with correct results, but none of them explain the math behind them.
  T
L O R
  B
From this link I realised I can just do:
Vec3 normal = Vec3(2*(R-L), 2*(B-T), -4).Normalize();
The thing is that I don't know where the 2* and -4 come from.
In this explanation of central differencing I see that we should divide that value by 2, but I still don't know how to connect all of this.
What I really want to know is the linear algebra definition behind this.
I have a heightmap, I want to compute the central differences, and I want to obtain the normal vector to use later to measure the steepness.
PS: the Z-axis is the height.
From vector calculus, the normal of a surface f(x, y, z) = 0 is given by the gradient operator:
n = ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z)
A height map h(x, y) is a special form of the function f:
f(x, y, z) = h(x, y) - z, so that n = (∂h/∂x, ∂h/∂y, -1)
For a discretized height map, assuming that the grid size is 1, the first-order approximations to the two derivative terms above are given by:
∂h/∂x ≈ (R - L) / 2, ∂h/∂y ≈ (B - T) / 2
since the x step from L to R is 2, and the same holds for y. The resulting vector ((R - L)/2, (B - T)/2, -1) is exactly the formula you had, divided through by 4. When this vector is normalized, the factor of 4 cancels.
(No linear algebra was harmed in the writing of this answer)
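The central-difference recipe can be sketched for a whole heightmap at once with NumPy. This assumes a grid spacing of 1, that columns run along x (L to R) and rows along y (T to B), and that border cells are simply skipped because they lack a neighbour on one side. The function name heightmap_normals is made up for this illustration.

```python
import numpy as np

def heightmap_normals(h):
    """Unit normals at the interior points of a 2D heightmap.

    Uses central differences with grid spacing 1 and the sign convention
    (dh/dx, dh/dy, -1) from the formula discussed above.
    Returns an array of shape (rows - 2, cols - 2, 3).
    """
    h = np.asarray(h, dtype=float)
    dh_dx = (h[1:-1, 2:] - h[1:-1, :-2]) / 2.0   # (R - L) / 2
    dh_dy = (h[2:, 1:-1] - h[:-2, 1:-1]) / 2.0   # (B - T) / 2
    normals = np.stack([dh_dx, dh_dy, -np.ones_like(dh_dx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals

# check on a plane h = 2x, whose normal is proportional to (2, 0, -1)
x = np.arange(5.0)
h = np.tile(2.0 * x, (4, 1))
normals = heightmap_normals(h)
print(normals[0, 0])
```

For steepness, one common choice is the angle between the normal and the vertical axis, np.arccos(np.abs(normals[..., 2])), which is 0 on flat ground.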

How to calculate "compound" Markov transition matrix in Stata or R?

By "compound" I mean that the transition matrix satisfies the Markov property. Namely, I have two columns, s_t and s_t+k, that represent the state of each individual in two periods, t and t+k respectively.
What I want is to find the matrix M such that
s_t+k = M^k * s_t
so that matrix M satisfies the Markov property.
My default working language is Stata, in which commands like tab, svy:tab or xttran can generate one-period transition matrices, but these matrices do not necessarily satisfy the Markov property. So I wonder how to achieve my goal in Stata or another common language like R or Python.
PS: This problem arises from a paper that studies the transition dynamics of GDP per capita in many countries from 1960 to 2010. Say that, at the beginning of each decade, we group all countries into 5 groups (from 1: extremely poor country to 5: high-income country), so we have a distribution of countries over 5 states. It is easy to simply estimate the decade-to-decade transition matrix using the markovchain class. However, the authors claim that (page 11, footnote 4)
“The decade average transition matrix is estimated based on
the 5-decade transition matrices from 1960 to 2010 by employing
a numerical optimization program. Instead of taking the simple average
for the five transition matrices (which suffers from Jensen’s
Inequality), we estimate a transition matrix that can give us an exact
5 decade duration transition matrix (entry in 1960 and exit in 2010)
by taking its power 5.”
In R you can use the markovchain package to estimate a transition matrix that satisfies the Markov property. You can use the following example code:
library(markovchain)
data(rain)
mysequence <- rain$rain
createSequenceMatrix(mysequence)
myFit <- markovchainFit(data=mysequence, method="bootstrap", nboot=5, name="Bootstrap Mc")
myFit
The estimated transition matrix is in myFit$estimate. This example uses the Alofi rainfall dataset.
Matrix multiplication in R is not * but %*%.
I wrote a simple function in R to solve the problem.
trans_mat = function(k, s_t, M){
  # accumulate M^k by repeated multiplication, starting from the identity
  Mk = diag(nrow(M))
  for(i in seq_len(k)){
    Mk = Mk %*% M
  }
  return(Mk %*% s_t)
}
Now all you need to do is supply k (the number of periods), s_t (the initial state vector), and M (the one-period transition matrix).
s_tk = trans_mat(k, s_t, M)
The markovchain package directly implements the power for any markovchain object:
require(markovchain)
#creating the MC
myMatr<-matrix(data=c(0.2,0.8,.6,.4),ncol=2,byrow=TRUE)
myMc<-as(myMatr,"markovchain")
#5th power of the MC
myMc5<-myMc^5
myMc5
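The answers above go from M to M^k. The question asks for the opposite direction: given a k-period matrix P, find M with M^k = P. A minimal sketch of that step in Python, assuming P is diagonalizable, takes the principal k-th root through an eigendecomposition. This is not the numerical optimization described in the quoted footnote: the principal root can have negative or complex entries, in which case a constrained optimization would be needed instead. The function name kth_root_transition_matrix is made up for this illustration.

```python
import numpy as np

def kth_root_transition_matrix(P, k):
    """Principal k-th root M of P (so that M^k = P) via eigendecomposition.

    Assumes P is diagonalizable. Raises ValueError if the root is not a
    valid transition matrix (complex or negative entries).
    """
    P = np.asarray(P, dtype=float)
    eigvals, eigvecs = np.linalg.eig(P)
    roots = eigvals.astype(complex) ** (1.0 / k)
    M = eigvecs @ np.diag(roots) @ np.linalg.inv(eigvecs)
    if np.abs(M.imag).max() > 1e-9 or M.real.min() < -1e-9:
        raise ValueError("principal root is not a valid transition matrix")
    return M.real

# illustrative check: build a 5-period matrix from a known one-period matrix
M_true = np.array([[0.9, 0.1], [0.2, 0.8]])
P = np.linalg.matrix_power(M_true, 5)
M = kth_root_transition_matrix(P, 5)
print(M)
```

The root does not depend on whether states are stored as row or column vectors, so it applies equally to the s_t+k = M^k * s_t convention used in the question.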
