How to calculate "compound" Markov transition matrix in Stata or R?

By "compound" I mean a transition matrix that satisfies the Markov property. Namely, I have two columns, s_t and s_t+k, that record the state of each individual in periods t and t+k respectively.
What I want is to find the matrix M that
s_t+k = M^k * s_t
so that matrix M satisfies the Markov property.
My default working language is Stata, in which commands like tab, svy:tab or xttran can generate one-period transition matrices, but these matrices do not necessarily satisfy the Markov property. So I wonder how to achieve my goal in Stata or another common language like R or Python.
PS: This problem arises from a paper that studies the transition dynamics of many countries' GDP per capita from 1960 to 2010. Say, at the beginning of each decade, we sort all countries into 5 groups (from 1: extremely poor country to 5: high-income country), so we have a distribution of countries over 5 states. It's easy to simply estimate the decade-to-decade transition matrix using the markovchain class. However, the author claims that (page 11, footnote 4)
“The decade average transition matrix is estimated based on
the 5-decade transition matrices from 1960 to 2010 by employing
a numerical optimization program. Instead of taking the simple average
for the five transition matrices (which suffers from Jensen’s
Inequality), we estimate a transition matrix that can give us an exact
5 decade duration transition matrix (entry in 1960 and exit in 2010)
by taking its power 5.”

In R you can use the markovchain package to estimate a transition matrix that satisfies the Markov property. You can use the following example code...
library(markovchain)
data(rain)
mysequence <- rain$rain
# raw transition counts between consecutive states
createSequenceMatrix(mysequence)
# bootstrap estimate of the transition matrix
myFit <- markovchainFit(data = mysequence, method = "bootstrap", nboot = 5, name = "Bootstrap Mc")
myFit
myFit is a list; its estimate element (myFit$estimate) holds the estimated transition matrix. This example uses the Alofi rainfall dataset.

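Matrix multiplication in R is %*%, not *.
I wrote a simple function in R to solve the problem.
trans_mat = function(k, s_t, M){
  Mk = diag(nrow(M))      # start from the identity matrix
  for(i in seq_len(k)){
    Mk = Mk %*% M         # after i passes, Mk equals M^i
  }
  return(Mk %*% s_t)}
Now all you need to do is supply k (the number of periods), s_t (the initial state vector), and M (the one-period transition matrix). Because s_t is multiplied on the right, the columns of M should each sum to 1.
s_tk = trans_mat(k, s_t, M)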

The markovchain package directly implements the power for any markovchain object:
require(markovchain)
#creating the MC
myMatr<-matrix(data=c(0.2,0.8,.6,.4),ncol=2,byrow=TRUE)
myMc<-as(myMatr,"markovchain")
#5th power of the MC
myMc5<-myMc^5
myMc5
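Going the other way, from a k-period matrix back to a one-period matrix M with M^k equal to it, amounts to taking a matrix k-th root. The sketch below is only an illustration in base R using an eigendecomposition; it is not the numerical optimization the paper's footnote describes. The helper names kth_root and mat_pow and the example matrix are made up for this sketch, and rows are assumed to sum to 1. The principal root is not guaranteed to be a valid transition matrix (entries can come out negative or complex), in which case a constrained optimization would be needed instead.

```r
# Sketch: k-th root of a row-stochastic matrix via eigendecomposition.
# Assumes P is diagonalizable; kth_root and mat_pow are hypothetical helpers.
kth_root <- function(P, k) {
  e <- eigen(P)
  V <- e$vectors
  # take the k-th root of each eigenvalue (complex arithmetic, in case any are negative)
  D_root <- diag(as.complex(e$values)^(1 / k), nrow = length(e$values))
  root <- V %*% D_root %*% solve(V)
  if (max(abs(Im(root))) > 1e-8) warning("root has non-negligible imaginary parts")
  Re(root)
}

# k-th power by repeated multiplication
mat_pow <- function(M, k) Reduce(`%*%`, replicate(k, M, simplify = FALSE))

# Made-up one-decade matrix, used only to check the round trip
M_true <- matrix(c(0.9, 0.1,
                   0.2, 0.8), ncol = 2, byrow = TRUE)
P5 <- mat_pow(M_true, 5)          # five-decade transition matrix
M_hat <- kth_root(P5, 5)          # recovered one-decade matrix
print(round(M_hat, 4))
print(max(abs(mat_pow(M_hat, 5) - P5)))   # reconstruction error
```

Before using the result, check that every entry of M_hat lies in [0, 1] and that each row sums to 1.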

Related

emission probabilities for HMM in R

How can we calculate Emission probabilities for a Hidden Markov Model (HMM) in R?
As for calculating Transition Probabilities we use function
tr <- seqtrate(exampledata)
and this function returns a Transition Matrix. Example data is a sequential data.
Is there a function that returns us an Emission Matrix?
Please have a look at R's HMM package: https://cran.r-project.org/web/packages/HMM/HMM.pdf
You can find an example like this there:
library(HMM)
# two hidden states (A, B) and two observable symbols (L, R)
hmm = initHMM(c("A","B"), c("L","R"), transProbs=matrix(c(.8,.2,.2,.8),2),
emissionProbs=matrix(c(.6,.4,.4,.6),2))
print(hmm)
# Sequence of observations
observation = c("L","L","R","R")
bw = baumWelch(hmm, observation, maxIterations=100, delta=1E-9, pseudoCount=0)
bw$hmm$emissionProbs
The Baum-Welch algorithm returns the updated emission probabilities (here in bw$hmm$emissionProbs).
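If the hidden states happen to be known for a labelled sequence, which is an assumption that does not hold for a truly hidden chain, the emission matrix can be estimated by simple counting, in the same spirit as a transition-rate table. A minimal base R sketch with made-up data:

```r
# Sketch: emission probabilities by counting, assuming the state of every
# observation is known. Both vectors are made-up illustration data.
states <- c("A", "A", "B", "B", "A", "B")
obs    <- c("L", "L", "R", "R", "L", "L")

# rows = hidden state, columns = emitted symbol
counts <- table(state = states, symbol = obs)

# normalise each row so it sums to 1
emission <- prop.table(counts, margin = 1)
print(emission)
```

When the states are not observed, an iterative fit such as Baum-Welch is still needed.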

Extract Fourier coefficients from fft() in R

I need to derive the Fourier series coefficients associated with the (i-1)-th harmonic from the fft() function in R. Any ideas?
For instance
Adding these concepts we get the general form of the Fourier Series:
f(t)=a_0+∑_k a_k×sin(kwt+ρ_k)
where a_0 is the DC component, w=2πf_0, where f_0 is the fundamental frequency of the original wave.
Each wave component a_k×sin(kwt+ρ_k) is also called a harmonic.
If I fixed the number of harmonics to 2, I would like to derive a_0,a_1,a_2 from fft()
It's a very general question. You can look here: Fourier Transform: A R Tutorial, for a start.
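As a rough sketch of how a_0, a_k and ρ_k in the sine form above relate to the output of fft(), assuming the signal is real and sampled at N equally spaced points covering exactly one fundamental period (the test signal and variable names below are made up):

```r
# Sketch: amplitudes and phases of the first harmonics from fft().
# Assumes x holds exactly one fundamental period, sampled at N points.
N <- 64
time_pts <- (0:(N - 1)) / N
w <- 2 * pi                                   # fundamental frequency f_0 = 1
x <- 1.5 + 2 * sin(w * time_pts + 0.3) + 0.5 * sin(2 * w * time_pts - 1.0)

X <- fft(x)
n_harm <- 2
k <- 1:n_harm

a0    <- Re(X[1]) / N                         # DC component
a_k   <- 2 * Mod(X[k + 1]) / N                # amplitude of harmonic k
rho_k <- Arg(X[k + 1]) + pi / 2               # Arg() is the cosine phase; shift to sine phase

print(a0)
print(a_k)
print(rho_k)
```

Note that X[k + 1] holds harmonic k because R vectors are 1-indexed, and that rho_k may need wrapping back into (-π, π] for other signals. If the record does not span a whole number of periods, the coefficients leak across bins and this direct reading no longer holds.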

Apply a transformation matrix over time

I have an initial frame and a bounding box around some information. I have a transformation matrix T that I want to use to transform this bounding box.
I could easily apply the transformation and draw it in the output frame, but I would like to apply the transformation gradually over a sequence of x frames. Can anyone suggest a way to do this?
Building on @egor-n's comment, you could compute R = T^{1/x} and compute your bounding box on frame i+1 from the one at frame i by
B_{i+1} = R * B_{i}
with B_{0} your initial bounding box. Depending on the precise form of T, we could discuss how to compute R.
There are methods for affine transforms: decompose the affine transform matrix into a product of translation, rotation, scaling and shear matrices, and linearly interpolate the parameters of each matrix (for example, the rotation angle for R and so on).
But for homography matrix there is no single solution, as described here, so one can find some "good" approximation (look at complex math in that article). Probably, some limitations for possible transforms could simplify the problem.
Here's something a little different you could try. Let M be the matrix representing the final transformation. You could try interpolating between I (the identity matrix, with 1's on the diagonal and 0's elsewhere) and M using the formula
M(t) = exp(t * ln(M))
where t is time from 0 to 1, M(0) = I, M(1) = M, exp is the exponential function for matrices given by the usual infinite series, and ln is the similar natural logarithm function for matrices given by the usual infinite series.
The correctness of the formula depends on the type of transformation represented by M and the type of transformations allowed in intermediate steps. The formula should work for rigid motions. For other types of transformations, various bad things might happen, including divergence of the logarithm series. Other formulas can be used in other cases; let me know if you're using transformations other than rigid motions and I can give some other formulas.
The exponential and logarithm functions may be available in a matrix library. If not, they can be easily implemented as partial sums of infinite series.
The above method should give the same result as some quaternion methods in the case of rotations. The quaternion methods are probably faster when they're available.
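A minimal base R sketch of the partial-sum approach to M(t) = exp(t * ln(M)) follows. The function names are invented for illustration, the number of terms is an arbitrary choice, and the logarithm series is only expected to converge when M is close enough to the identity (roughly, when the norm of M - I is below 1), which is why the example uses a small rotation.

```r
# Sketch: matrix exponential and logarithm as truncated power series.
# mat_exp, mat_log and interpolate are hypothetical helper names.
mat_exp <- function(A, n_terms = 30) {
  result <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in 1:n_terms) {
    term <- term %*% A / k            # A^k / k!
    result <- result + term
  }
  result
}

mat_log <- function(M, n_terms = 200) {
  X <- M - diag(nrow(M))
  result <- matrix(0, nrow(M), ncol(M))
  term <- diag(nrow(M))
  for (k in 1:n_terms) {
    term <- term %*% X                # (M - I)^k
    result <- result + (-1)^(k + 1) * term / k
  }
  result
}

# M(t) = exp(t * ln(M)), with M(0) = I and M(1) = M
interpolate <- function(M, t) mat_exp(t * mat_log(M))

theta <- 0.5                          # rotation by 0.5 radians
M <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), 2, byrow = TRUE)
half <- interpolate(M, 0.5)           # halfway transformation
print(half)
```

For a rotation, the halfway matrix should itself be a rotation by half the angle, which gives a quick sanity check.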
UPDATE
I see you mention elsewhere that your transformation is a homography (perspectivity), so the method I suggested above for rigid motions won't work. Instead you could use a different, but related method outlined in ftp://ftp.cs.huji.ac.il/users/aristo/papers/SYGRAPH2005/sig05.pdf. It goes as follows: represent your transformation by a matrix in one higher dimension. Scale the matrix so that its determinant is equal to 1. Call the resulting matrix G. You want to interpolate from the identity matrix I to G, going through perspectivities.
In what follows, let M^T be the transpose of M. Let the function expp be defined by
expp(M) = exp(-M^T) * exp(M+M^T)
You need to find the inverse of that function at G; in other words you need to solve the equation
expp(M) = G
where G is your transformation matrix with determinant 1. Call the result M = logp(G). That equation can be solved by standard numerical techniques, or you can use Matlab or other math software. It's somewhat time-consuming and complicated to do, but you only have to do it once.
Then you calculate the series of transformations by
G(t) = expp(t * logp(G))
where t varies from 0 to 1 in steps of 1/k, where k is the number of frames you want.
You could parameterize the transform over some number of frames by adding a variable that runs from 0 to 1.
Let t be the frame number
Let T be the total number of frames
Let P be the original location and orientation of the object
Let theta be the total rotation angle
and translation be the vector [x,y]'
The transform in 2D becomes:
T(P|t) = R(t)*P + (t*[x,y]')/T
where R(t) = {{Cos((theta*t)/T),-Sin((theta*t)/T)},{Sin((theta*t)/T),Cos((theta*t)/T)}} and, on the right-hand side, T is the total number of frames.
So at frame t you apply the transform T(P|t) to the position of the object at time t_0 = 0; at t = 0 this is equivalent to no transform.
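The rotation-plus-translation parameterization above could be sketched in base R as follows. The function name, argument names and example box are invented for illustration; P is assumed to be a 2 x n matrix whose columns are the corner points at frame 0.

```r
# Sketch: transform of the frame-0 points P at a given frame, for a total
# rotation theta and a total translation shift spread over n_frames.
frame_transform <- function(P, frame, n_frames, theta, shift) {
  s <- frame / n_frames                       # fraction of the motion completed
  angle <- theta * s
  R <- matrix(c(cos(angle), -sin(angle),
                sin(angle),  cos(angle)), 2, byrow = TRUE)
  # shift has length 2, so it is recycled down each column of the 2 x n result
  R %*% P + s * shift
}

# Made-up bounding box: corners of the unit square, one corner per column
box <- matrix(c(0, 1, 1, 0,
                0, 0, 1, 1), nrow = 2, byrow = TRUE)

for (frame in 0:4) {
  print(frame_transform(box, frame, n_frames = 4, theta = pi / 2, shift = c(1, 0)))
}
```

Each frame is computed from the original box rather than from the previous frame, so rounding errors do not accumulate.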

Fit and evaluate a second order transition matrix (Markov Process) in R?

I am trying to build a second-order Markov chain model, and I am trying to find the transition matrix from the following data.
dat<-data.frame(replicate(20,sample(c("A", "B", "C","D"), size = 100, replace=TRUE)))
I know how to fit the first-order Markov transition matrix using the function markovchainFit(dat) in the markovchain package.
Is there any way to fit the second-order transition matrix?
How do I evaluate the Markov chain models? That is, should I choose the first-order model or the second-order model?
This function should produce a Markov chain transition matrix for any lag that you wish.
dat<-data.frame(replicate(20,sample(c("A", "B", "C","D"), size = 100, replace=TRUE)))
Markovmatrix <- function(X,l=1){
tt <- table(X[,-c((ncol(X)-l+1):ncol(X))] , c(X[,-c(1:l)]))
tt <- tt / rowSums(tt)
return(tt)
}
Markovmatrix(as.matrix(dat),1)
Markovmatrix(as.matrix(dat),2)
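where l is the lag. Note that this tabulates the state at step t against the state at step t+l, so it gives a lag-l transition matrix rather than a matrix conditioned jointly on the l previous states.
e.g. for lag 2, the output is (the data are random, so your numbers will differ):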
          A         B         C         D
A 0.2422803 0.2185273 0.2446556 0.2945368
B 0.2426304 0.2108844 0.2766440 0.2698413
C 0.2146119 0.2716895 0.2123288 0.3013699
D 0.2480000 0.2560000 0.2320000 0.2640000
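A second-order chain in the stricter sense conditions on the pair of the two previous states, giving one row per pair (16 rows for 4 codes) rather than a 4 x 4 table. A minimal base R sketch follows; the function name is made up, and, as with the data above, each row of the input is assumed to be one sequence with columns as consecutive time steps.

```r
# Sketch: second-order transition matrix, rows = (state at t-2, state at t-1),
# columns = state at t. second_order_matrix is a hypothetical helper name.
second_order_matrix <- function(X) {
  n <- ncol(X)
  prev2 <- c(X[, 1:(n - 2)])          # state two steps back
  prev1 <- c(X[, 2:(n - 1)])          # state one step back
  nxt   <- c(X[, 3:n])                # state being predicted
  counts <- table(paste(prev2, prev1), nxt)
  counts / rowSums(counts)            # normalise each row to sum to 1
}

set.seed(1)
dat <- data.frame(replicate(20, sample(c("A", "B", "C", "D"), size = 100, replace = TRUE)))
P2 <- second_order_matrix(as.matrix(dat))
print(dim(P2))
print(round(P2, 3))
```

Only pairs that actually occur in the data get a row, so short or sparse sequences can produce fewer rows than the full set of pairs.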
As for how to test which order of model to use, there are several suggestions. One, put forward by Gottman and Roy (1990) in their introductory book on sequential analysis, is to use information value. There is a chapter on that; most of the chapter is available online.
You can also perform a likelihood-ratio chi-square test. This is very similar to a chi-square test in that you are comparing observed to expected frequencies of transitions. However, the formula is as follows:
LRX^2 = 2 * sum over all cells of observed * ln(observed / expected)
The degrees of freedom are the square of the number of codes minus one. In your case you have 4 codes, so (4-1)^2 = 9. You can then look up the associated p-value.
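As a sketch of the likelihood-ratio chi-square computation in base R, assuming the expected counts come from independence of the row and column margins (that is, testing a first-order table against no dependence on the previous state), with a made-up function name and a simulated sequence:

```r
# Sketch: likelihood-ratio chi-square (G^2) for a table of transition counts.
# lr_chisq_test is a hypothetical helper; expected counts assume independence.
lr_chisq_test <- function(counts) {
  expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
  nz <- counts > 0                    # cells with zero counts contribute nothing
  G2 <- 2 * sum(counts[nz] * log(counts[nz] / expected[nz]))
  df <- (nrow(counts) - 1) * (ncol(counts) - 1)
  list(G2 = G2, df = df, p.value = pchisq(G2, df, lower.tail = FALSE))
}

set.seed(1)
x <- sample(c("A", "B", "C", "D"), size = 500, replace = TRUE)
counts <- table(from = head(x, -1), to = tail(x, -1))   # first-order transition counts
result <- lr_chisq_test(counts)
print(result)
```

With 4 codes the table is 4 x 4, so df is 9, matching the degrees of freedom stated above.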
I hope this helps.

Getting the next observation from a HMM gaussian mixture distribution

I have a continuous univariate xts object of length 1000, which I have converted into a data.frame called x to be used by the package RHmm.
I have already chosen that there are going to be 5 states and 4 gaussian distributions in the mixed distribution.
What I'm after is the expected mean value for the next observation. How do I go about getting that?
So what I have so far is:
a transition matrix from running the HMMFit() function
a set of means and variances for each of the Gaussian distributions in the mixture, along with their respective proportions, all of which were also generated from the HMMFit() function
a list of past hidden states relating to the input data when using the output of the HMMFit function and putting it into the viterbi function
How would I go about getting the next hidden state (i.e. the 1001st value) from what I've got, and then using it to get the weighted mean from the Gaussian distributions?
I think I'm pretty close, just not too sure what the next part is. The last state is state 5; do I use the 5th row of the transition matrix somehow to get the next state?
All I'm after is the weighted mean of what is to be expected in the next observation, so the next hidden state isn't even necessary. Do I multiply the probabilities in row 5 by each of the means, weighted by their proportion for each state, and then sum it all together?
here is the code I used.
library(RHmm)
# have used 2000 iterations to ensure convergence
a <- HMMFit(x, nStates=5, nMixt=4, dis="MIXTURE", control=list(iter=2000))
v <- viterbi(a,x)
a
v
As always any help would be greatly appreciated!
The next predicted value uses the last hidden state, last(v$states), to pick the row of probability weights from the transition matrix, a$HMM$transMat[last(v$states),]. For each state, the distribution means a$HMM$distribution$mean are weighted by the proportions a$HMM$distribution$proportion; the weighted state means are then multiplied by the transition probabilities and summed. So in the above case it would be as follows:
sum(a$HMM$transMat[last(v$states),] * .colSums((matrix(unlist(a$HMM$distribution$mean), nrow=4,ncol=5)) * (matrix(unlist(a$HMM$distribution$proportion), nrow=4,ncol=5)), m=4,n=5))
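The same calculation can be written out step by step on a small made-up model, which may make the weighting easier to follow. Everything below (2 states, 2 mixture components per state, all numbers and variable names) is hypothetical and independent of RHmm; it also treats the last Viterbi state as known, which is a simplification.

```r
# Sketch: expected next observation from a mixture HMM, on made-up numbers.
trans_mat <- matrix(c(0.7, 0.3,
                      0.4, 0.6), 2, byrow = TRUE)   # rows = current state
comp_means <- list(c(0, 2), c(5, 9))                # component means per state
comp_props <- list(c(0.5, 0.5), c(0.25, 0.75))      # mixture proportions per state
last_state <- 2                                     # e.g. last state on the Viterbi path

# mean of each state's mixture = sum of proportion * component mean
state_means <- mapply(function(m, p) sum(m * p), comp_means, comp_props)

# probabilities of the next state, given the last state
next_state_probs <- trans_mat[last_state, ]

# expected next observation = sum over states of P(next state) * state mean
expected_next <- sum(next_state_probs * state_means)
print(state_means)
print(expected_next)
```

Here the state means are 1 and 8, and the row for state 2 weights them by 0.4 and 0.6.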
