Simple Markov Chain in R (visualization) - r

I'd like to build a simple first-order Markov chain in R. I know there are packages like MCMC, but I couldn't find one that displays it graphically. Is this even possible? It would be nice if, given a transition matrix and an initial state, one could visually see the path through the Markov chain (maybe I have to do this by hand...).
Thanks.

This shows how to apply a random transition matrix to a particular starting vector, c(1,0,0,0):
set.seed(123)
tmat <- matrix(rnorm(16)^2,ncol=4)
# need entries to be positive, could have used abs()
tmat <- tmat/rowSums(tmat) # need the rows to sum to 1
tmat
[,1] [,2] [,3] [,4]
[1,] 0.326123580 0.01735335 0.48977444 0.166748625
[2,] 0.016529424 0.91768404 0.06196453 0.003822008
[3,] 0.546050789 0.04774713 0.33676288 0.069439199
[4,] 0.001008839 0.32476060 0.02627217 0.647958394
require(expm) # for the %^% function
matplot( t( # need to transpose to get arguments to matplot correctly
sapply(1:20, function(x) matrix(c(1,0,0,0), ncol=4) %*% (tmat %^% x) ) ) )
You can see it approaching equilibrium:
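The limit can also be checked numerically. The following is a minimal sketch, not part of the original answer: it rebuilds the same tmat, takes the left eigenvector of tmat for the eigenvalue of largest modulus (which is 1 for a row-stochastic matrix) as the equilibrium distribution, and compares it with the result of many repeated transitions. The names stationary and p are illustrative only.

```r
set.seed(123)
tmat <- matrix(rnorm(16)^2, ncol = 4)
tmat <- tmat / rowSums(tmat)          # rows sum to 1

# Left eigenvectors of tmat are the (right) eigenvectors of t(tmat)
e <- eigen(t(tmat))
v <- Re(e$vectors[, which.max(Mod(e$values))])
stationary <- v / sum(v)              # normalise so the entries sum to 1
stationary

# Compare with the distribution after many steps from c(1,0,0,0)
p <- c(1, 0, 0, 0)
for (i in 1:2000) p <- drop(p %*% tmat)
max(abs(p - stationary))              # should be numerically negligible
```

Because every entry of this tmat is strictly positive, the chain converges to the same distribution regardless of the starting vector.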

The package coda (http://cran.r-project.org/web/packages/coda/index.html) has tools for analyzing MCMC results, including some plotting functionality.

Perhaps this query on Biostar can help you: Visualizing HMM files of HMMER3. It points to two external applications, LogoMat-M and HMMeditor, for visualizing Profile Hidden Markov Models (pHMMs).

You can use the markovchain R package, which models discrete-time Markov chains and contains a plotting facility based on the igraph package.
library(markovchain) #loading the package
myMatr<-matrix(c(0,.2,.8,.1,.8,.1,.3,0,.7),byrow=TRUE,nrow = 3) #defining a transition matrix
rownames(myMatr)<-colnames(myMatr)<-c("a","b","c")
myMc<-as(myMatr, "markovchain")
plot(myMc)
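The plot above shows the transition graph rather than one particular path. To see a single path through the chain from a given initial state, one option is to sample it by hand in base R. This is only a sketch: simulate_path is a hypothetical helper written for this illustration (it is not part of the markovchain package), and the matrix P simply repeats the values of myMatr.

```r
# Hypothetical helper: sample one path of length n from a first-order chain
# with transition matrix P, starting in state index `start`.
simulate_path <- function(P, start, n) {
  states <- seq_len(nrow(P))
  path <- integer(n)
  path[1] <- start
  for (t in seq_len(n - 1) + 1) {
    # The next state is drawn from the row of P for the current state
    path[t] <- sample(states, size = 1, prob = P[path[t - 1], ])
  }
  path
}

set.seed(1)
P <- matrix(c(0, .2, .8, .1, .8, .1, .3, 0, .7), byrow = TRUE, nrow = 3)
path <- simulate_path(P, start = 1, n = 50)

# Step plot of the visited states over time
plot(path, type = "s", yaxt = "n", xlab = "step", ylab = "state")
axis(2, at = 1:3, labels = c("a", "b", "c"))
```

A step plot keeps the time ordering visible, which a graph layout of the transition matrix does not.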

Related

PCA scores for only the first principal components are of "wrong" sign

I am currently trying to get into principal component analysis and regression. I therefore tried calculating the principal components of a given matrix by hand and comparing them with the results from R's princomp function.
The following is the code for doing pca by hand
### compute principal component loadings and scores by hand
df <- matrix(nrow = 5, ncol = 3, c(90,90,60,60,30,
60,90,60,60,30,
90,30,60,90,60))
# calculate covariance matrix to see variance and covariance of
cov.mat <- cov.wt(df)
cen <- cov.mat$center
n.obs <- cov.mat$n.obs
cv <- cov.mat$cov * (1-1/n.obs)
## calculate the eigenvectors and eigenvalues
edc <- eigen(cv, symmetric = TRUE)
ev <- edc$values
evec <- edc$vectors
cn <- paste0("Comp.", 1L:ncol(cv))
cen <- cov.mat$center
### get loadings (or principal component weights) out of the eigenvectors and compute scores
loadings <- structure(edc$vectors, class = "loadings")
df.scaled <- scale(df, center = cen, scale = FALSE)
scr <- df.scaled %*% evec
I compared my results to the ones obtained by using the princomp function
pca.mod <- princomp(df)
loadings.mod <- pca.mod$loadings
scr.mod <- pca.mod$scores
scr
scr.mod
> scr
[,1] [,2] [,3]
[1,] -6.935190 32.310906 7.7400588
[2,] -48.968014 -19.339313 -0.3529382
[3,] 1.733797 -8.077726 -1.9350147
[4,] 13.339605 18.519500 -9.5437444
[5,] 40.829802 -23.413367 4.0916385
> scr.mod
Comp.1 Comp.2 Comp.3
[1,] 6.935190 32.310906 7.7400588
[2,] 48.968014 -19.339313 -0.3529382
[3,] -1.733797 -8.077726 -1.9350147
[4,] -13.339605 18.519500 -9.5437444
[5,] -40.829802 -23.413367 4.0916385
So apparently I did quite well: the computed scores match, at least in magnitude. However, the scores for the first principal component differ in sign. This is not the case for the other two.
This leads to two questions:
1. I have read that it is no problem to multiply the loadings and the scores of principal components by minus one. Does this still hold when only one of the principal components has a different sign?
2. What am I doing "wrong" from a computational standpoint? The procedure seems straightforward to me, and I don't see what I could change in my own calculations to get the same signs as princomp.
When checking this with the mtcars data set, the signs for my first PC were right, but now the second and fourth PC scores have different signs compared to the package. I cannot make any sense of this. Any help is appreciated!
The signs of eigenvectors and loadings are arbitrary, so there is nothing "wrong" here. The only thing that you should expect to be preserved is the overall pattern of signs within each loadings vector, i.e. in the example above the princomp answer for PC1 gives +,+,-,-,- while yours gives -,-,+,+,+. That's fine. If yours gave e.g. -,+,-,-,+ that would be trouble (because the two would no longer be equivalent up to multiplication by -1).
However, while it's generally true that the signs are arbitrary and hence could vary across algorithms, compilers, operating systems, etc., there's an easy solution in this particular case. princomp has a fix_sign argument:
fix_sign: Should the signs of the loadings and scores be chosen so that
the first element of each loading is non-negative?
Try princomp(df,fix_sign=FALSE)$scores and you'll see that the signs (probably!) line up with your results. (In general the fix_sign=TRUE option is useful because it breaks the symmetry in a specific way and thus will always result in the same answers across all platforms.)
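If you want your hand-computed results to follow the same convention, you can flip the eigenvectors yourself. The sketch below is an illustration under the convention quoted above (first element of each loading non-negative); the names flip, evec_fixed and scr_fixed are made up for this example. It then checks that the scores agree with princomp at least up to sign.

```r
df <- matrix(c(90, 90, 60, 60, 30,
               60, 90, 60, 60, 30,
               90, 30, 60, 90, 60), nrow = 5, ncol = 3)

cw   <- cov.wt(df)
cv   <- cw$cov * (1 - 1 / cw$n.obs)      # divisor n, as princomp uses
evec <- eigen(cv, symmetric = TRUE)$vectors

# Flip each eigenvector whose first element is negative
flip       <- ifelse(evec[1, ] < 0, -1, 1)
evec_fixed <- sweep(evec, 2, flip, "*")
scr_fixed  <- scale(df, center = cw$center, scale = FALSE) %*% evec_fixed

# Agreement with princomp up to the sign of each column
scr_mod <- unclass(princomp(df)$scores)
max(abs(abs(scr_fixed) - abs(scr_mod)))
```

Flipping a whole column changes neither the variance explained nor the reconstruction of the data, which is why the choice is arbitrary.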

Simulating data from multivariate distribution in R based on Winbugs/JAGS script

I am trying to simulate data, based on part of a JAGS/Winbugs script. The script comes from Eaves & Erkanli (2003, see, http://psych.colorado.edu/~carey/pdffiles/mcmc_eaves.pdf, page 295-296).
The part of the script I want to base my simulations on is as follows (with different variable names than in the original paper):
for(fam in 1 : nmz ){
a2mz[fam, 1:N] ~ dmnorm(mu[1:N], tau.a[1:N, 1:N])
a1mz[fam, 1:N] ~ dmnorm(a2mz[fam, 1:N], tau.a[1:N, 1:N])
}
#Prior
tau.a[1:N, 1:N] ~ dwish(omega.g[,], N)
I want to simulate data in R for the parameters a2mz and a1mz as given in the script above.
So basically, I want to simulate data from -N- (e.g. 3) multivariate distributions with -fam- (e.g. 10) persons with sigma tau.a.
To make this more illustrative: The purpose is to simulate genetic effects for -fam- (e.g. 10) families. The genetic effect is the same for each family (e.g. monozygotic twins), with a variance of tau.a (e.g. 0.5). Of these genetic effects, 3 'versions' (3 multivariate distributions) have to be simulated.
What I tried in R to simulate the data as given in the JAGS/Winbugs script is as follows:
library(MASS)
nmz = 10 #number of families, here e.g. 10
var_a = 0.5 #tau.a in the script
a2_mz <- mvrnorm(3, mu = rep(0, nmz), Sigma = diag(nmz)*var_a)
This simulates data for the a2mz parameter as referred to in the JAGS/Winbugs script above:
> print(t(a2_mz))
[,1] [,2] [,3]
[1,] -1.1563683 -0.4478091 -0.15037563
[2,] 0.5673873 -0.7052487 0.44377336
[3,] 0.2560446 0.9901964 -0.65463341
[4,] -0.8366952 0.4924839 -0.56891991
[5,] 0.7343780 0.5429955 0.87529201
[6,] 0.5592868 -0.3899988 -0.33709105
[7,] -1.8233663 -0.7149141 -0.18153049
[8,] -0.8213804 -1.4397075 -0.09159725
[9,] -0.7002797 -0.3996970 -0.29142215
[10,] 1.1084067 0.3884869 -0.46207940
However, when I then try to use these data to simulate data for a1mz (third line of the JAGS/Winbugs script), something goes wrong and I am not sure what:
a1_mz <- mvrnorm(3, mu = t(a2_mz), Sigma = c(diag(nmz)*var_a, diag(nmz)*var_a, diag(nmz)*var_a))
This results in the error:
Error in eigen(Sigma, symmetric = TRUE, EISPACK = EISPACK) :
non-square matrix in 'eigen'
Can anyone give me any hints or tips on what I am doing wrong?
Many thanks,
Best regards,
inga
mvrnorm() takes a mean vector and a covariance matrix as input, and that's not what you're feeding it. I'm not sure I understand your question, but if you want to simulate 3 samples from each of 3 different multivariate normal distributions with the same variance and different means, then just use:
a1_mz<-array(dim=c(dim(a2_mz),3))
for(i in 1:3) a1_mz[,,i]<-mvrnorm(3,t(a2_mz)[,i],diag(nmz)*var_a)
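An alternative reading of the JAGS/Winbugs script is that each family gets one N-dimensional draw, so a2mz and a1mz are both nmz x N matrices. The sketch below follows that reading and is only an assumption about what the question intends. Note that dmnorm in JAGS/Winbugs is parameterised by a precision matrix, so the covariance passed to mvrnorm would be solve(tau.a); here Sigma is simply assumed to be diag(N) * var_a.

```r
library(MASS)

set.seed(42)
nmz   <- 10                 # number of families
N     <- 3                  # dimension of each multivariate draw
var_a <- 0.5
Sigma <- diag(N) * var_a    # assumed covariance, i.e. solve(tau.a)
mu    <- rep(0, N)

# a2mz[fam, 1:N] ~ MVN(mu, Sigma): one row per family
a2mz <- mvrnorm(nmz, mu = mu, Sigma = Sigma)

# a1mz[fam, 1:N] ~ MVN(a2mz[fam, 1:N], Sigma): row-specific mean
a1mz <- t(apply(a2mz, 1, function(m) mvrnorm(1, mu = m, Sigma = Sigma)))

dim(a2mz)
dim(a1mz)
```

The key point is that mvrnorm accepts only one mean vector per call, so a different mean per family requires one call per row.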

R: backwards principal component calculation

I would like to perform a backwards principal component calculation in R, meaning: obtaining the original matrix from the PCA object itself.
This is an example case:
# Load an expression matrix
load(url("http://www.giorgilab.org/allexp_rsn.rda"))
# Calculate PCA
pca <- prcomp(t(allexp_rsn))
In order to obtain the original matrix, one should multiply the rotations by the PCA themselves, as such:
test<-pca$rotation%*%pca$x
However, as you may check, the calculated "test" matrix is completely different from the original "allexp_rsn" matrix. What am I doing wrong? Is the function prcomp adding something else to the SVD procedure?
Thanks :-)
Using USArrests:
pca <- prcomp(t(USArrests))
out <- t(pca$x%*%t(pca$rotation))
out <- sweep(out, 1, pca$center, '+')
apply(USArrests - out, 2, sum)
Murder Assault UrbanPop Rape
1.070921e-12 -2.778222e-12 3.801404e-13 1.428191e-12
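Remember that prcomp centers the data by default and additionally scales it to unit variance only if you pass scale. = TRUE, so pca$x holds the rotated, centered data rather than the original values. That is why the center has to be added back (as in the sweep call above) to recover the original matrix.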
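For completeness, here is a sketch of the same reconstruction when scaling is switched on, using USArrests without transposing. It undoes the rotation first, then the scaling, then the centering; recon and recon_k are illustrative names, and the last part shows a rank-k approximation from the first k components.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Undo rotation, then scaling, then centering
recon <- pca$x %*% t(pca$rotation)
recon <- sweep(recon, 2, pca$scale,  "*")
recon <- sweep(recon, 2, pca$center, "+")
max(abs(recon - as.matrix(USArrests)))   # numerically zero

# Approximate reconstruction from the first k components only
k <- 2
recon_k <- pca$x[, 1:k] %*% t(pca$rotation[, 1:k])
recon_k <- sweep(recon_k, 2, pca$scale,  "*")
recon_k <- sweep(recon_k, 2, pca$center, "+")
```

The order matters: the data were centered and then scaled before rotation, so the inverse applies the steps in reverse.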
Here is a solution using the eigen function, applied to a B/W image matrix to illustrate the point. The function uses increasing numbers of PCs, but you can use all of them, or only some of them
library(gplots)
library(png)
# Download an image:
download.file("http://www.giorgilab.org/pictures/monalisa.tar.gz",destfile="monalisa.tar.gz",cacheOK = FALSE)
untar("monalisa.tar.gz")
# Read image:
img <- readPNG("monalisa.png")
# Dimension
d<-1
# Rotate it:
rotate <- function(x) t(apply(x, 2, rev))
centermat<-rotate(img[,,d])
# Plot it
image(centermat,col=gray(c(0:100)/100))
# Increasing PCA
png("increasingPCA.png",width=2000,height=2000,pointsize=20)
par(mfrow=c(5,5),mar=c(0,0,0,0))
for(end in (1:25)*12){
for(d in 1){
centermat<-rotate(img[,,d])
eig <- eigen(cov(centermat))
n <- 1:end
eigmat<-t(eig$vectors[,n] %*% (t(eig$vectors[,n]) %*% t(centermat)))
image(eigmat,col=gray(c(0:100)/100))
}
}
dev.off()
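The same idea can be checked on a small random matrix without downloading anything. This sketch differs from the image code above in one respect, stated plainly: it centers the columns before projecting and adds the means back afterwards, so that using all components recovers the matrix exactly. The names M, reconstruct and err are illustrative.

```r
set.seed(3)
M   <- matrix(rnorm(40 * 6), nrow = 40) %*% matrix(rnorm(36), nrow = 6)
ctr <- colMeans(M)
Mc  <- sweep(M, 2, ctr)                       # centered copy
V   <- eigen(cov(M), symmetric = TRUE)$vectors

# Project onto the first k eigenvectors and map back to the original space
reconstruct <- function(k) {
  Vk <- V[, seq_len(k), drop = FALSE]
  sweep(Mc %*% Vk %*% t(Vk), 2, ctr, "+")
}

# Frobenius-norm error for k = 1..6 components
err <- sapply(1:6, function(k) sqrt(sum((M - reconstruct(k))^2)))
err
```

The error shrinks as components are added and vanishes (up to rounding) once all of them are used.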

Problems with Newton's Method for finding coefficient and Hessian

I am trying to write a function that uses Newton's method (coefficients+(inverse hessian)*gradient) to iteratively find the coefficients for a loglinear model.
I am using the following code:
##reading in the data
dat<-read.csv('hw8.csv')
summary(dat)
# data file containing yi and xi
attach(dat)
##Creating column of x's
x<-cbind(1,xi)
mle<-function(c){
gi<- 1-yi*exp(c[1]+c[2]*xi)
hi<- gi-1
H<- -1*(t(x)%*%hi%*%x)
g<-t(x)%*%gi
c<-c+solve(H)%*%g
return(c)
}
optim(c(0,1),mle,hessian=TRUE)
When I run the code, I get the following error:
Error in t(x) %*% hi %*% x : non-conformable arguments
RMate stopped at line 29
Given that the formula is drawn from Bill Greene's problem set, I don't think it is a formula problem. I think I am doing something wrong in passing my function.
How can I fix this?
Any help with this function would be much appreciated.
As Jonathan said in the comments, you need proper dimensions:
R> X <- matrix(1:4, ncol=2)
R> t(X) %*% X
[,1] [,2]
[1,] 5 11
[2,] 11 25
R>
But you also should use the proper tools so maybe look at the loglin function in the stats package, and/or the loglm function in the MASS package. Both will be installed by default with your R installation.
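To make the dimension fix concrete: hi is a vector of length n, so t(x) %*% hi %*% x cannot be conformable; the intended product is t(x) %*% diag(hi) %*% x. Also, optim() minimizes a scalar-valued objective, so it is not the right driver for a function that returns an updated coefficient vector. The sketch below iterates the Newton step directly. It uses simulated data as a stand-in for hw8.csv, and it assumes the log-likelihood sum(x'b - y*exp(x'b)), which is what the gradient expression in the question implies; both are assumptions, as are the names newton_step and b.

```r
set.seed(1)
n  <- 200
xi <- rnorm(n)
yi <- rexp(n, rate = exp(0.5 + 1.0 * xi))   # simulated stand-in for hw8.csv
x  <- cbind(1, xi)

newton_step <- function(b) {
  eta <- drop(x %*% b)
  gi  <- 1 - yi * exp(eta)       # per-observation gradient factor
  hi  <- gi - 1                  # equals -yi * exp(eta)
  g   <- t(x) %*% gi             # gradient, 2 x 1
  # t(x) %*% diag(hi) %*% x, computed without building the n x n diagonal
  H   <- -crossprod(x * hi, x)
  b + solve(H, g)                # same update as in the question
}

b <- c(0, 0)
for (iter in 1:50) {
  b_new <- drop(newton_step(b))
  if (max(abs(b_new - b)) < 1e-10) break
  b <- b_new
}
b
```

Multiplying x by hi row-wise gives the same result as inserting diag(hi), but avoids allocating an n x n matrix.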

Help using predict() for kernlab's SVM in R?

I am trying to use the kernlab R package to do Support Vector Machines (SVM). For my very simple example, I have two pieces of training data. A and B.
(A and B are of type matrix - they are adjacency matrices for graphs.)
So I wrote a function which takes A+B and generates a kernel matrix.
> km
[,1] [,2]
[1,] 14.33333 18.47368
[2,] 18.47368 38.96053
Now I use kernlab's ksvm function to generate my predictive model. Right now, I'm just trying to get the darn thing to work - I'm not worried about training error, etc.
So, Question 1: Am I generating my model correctly? Reasonably?
# y are my classes. In this case, A is in class "1" and B is in class "-1"
> y
[1] 1 -1
> model2 = ksvm(km, y, type="C-svc", kernel = "matrix");
> model2
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 1
[1] " Kernel matrix used as input."
Number of Support Vectors : 2
Objective Function Value : -0.1224
Training error : 0
So far so good. We created our custom kernel matrix, and then we created a ksvm model using that matrix. We have our training data labeled as "1" and "-1".
Now to predict:
> A
[,1] [,2] [,3]
[1,] 0 1 1
[2,] 1 0 1
[3,] 0 0 0
> predict(model2, A)
Error in as.matrix(Z) : object 'Z' not found
Uh-oh. This is okay. Kind of expected, really. "Predict" wants some sort of vector, not a matrix.
So lets try some things:
> predict(model2, c(1))
Error in as.matrix(Z) : object 'Z' not found
> predict(model2, c(1,1))
Error in as.matrix(Z) : object 'Z' not found
> predict(model2, c(1,1,1))
Error in as.matrix(Z) : object 'Z' not found
> predict(model2, c(1,1,1,1))
Error in as.matrix(Z) : object 'Z' not found
> predict(model2, km)
Error in as.matrix(Z) : object 'Z' not found
Some of the above tests are nonsensical, but that is my point: no matter what I do, I just can't get predict() to look at my data and do a prediction. Scalars don't work, vectors don't work. A 2x2 matrix doesn't work, nor does a 3x3 matrix.
What am I doing wrong here?
(Once I figure out what ksvm wants, then I can make sure that my test data can conform to that format in a sane/reasonable/mathematically sound way.)
If you think about how the support vector machine might "use" the kernel matrix, you'll see that you can't really do this in the way you're trying (as you've seen :-)
I actually struggled a bit with this when I first was using kernlab + a kernel matrix ... coincidentally, it was also for graph kernels!
Anyway, let's first realize that since the SVM doesn't know how to calculate your kernel function, it needs to have these values already calculated between your new (testing) examples, and the examples it picks out as the support vectors during the training step.
So, you'll need to calculate the kernel matrix for all of your examples together. You'll later train on some and test on the others by removing rows + columns from the kernel matrix when appropriate. Let me show you with code.
We can use the example code in the ksvm documentation to load our workspace with some data:
library(kernlab)
example(ksvm)
You'll need to hit return a few (2) times in order to let the plots draw, and let the example finish, but you should now have a kernel matrix in your workspace called K. We'll need to recover the y vector that it should use for its labels (as it has been trampled over by other code in the example):
y <- matrix(c(rep(1,60),rep(-1,60)))
Now, pick a subset of examples to use for testing
holdout <- sample(1:ncol(K), 10)
From this point on, I'm going to:
1. Create a training kernel matrix named trainK from the original K kernel matrix.
2. Create an SVM model from my training set trainK.
3. Use the support vectors found from the model to create a testing kernel matrix testK ... this is the weird part. If you look at the code in kernlab to see how it uses the support vector indices, you'll see why it's being done this way. It might be possible to do this another way, but I didn't see any documentation/examples on predicting with a kernel matrix, so I'm doing it "the hard way" here.
4. Use the SVM to predict on these features and report accuracy.
Here's the code:
trainK <- as.kernelMatrix(K[-holdout,-holdout]) # 1
m <- ksvm(trainK, y[-holdout], kernel='matrix') # 2
testK <- as.kernelMatrix(K[holdout, -holdout][,SVindex(m), drop=F]) # 3
preds <- predict(m, testK) # 4
sum(sign(preds) == sign(y[holdout])) / length(holdout) # == 1 (perfect!)
That should just about do it. Good luck!
Responses to comment below
what does K[-holdout,-holdout] mean? (what does the "-" mean?)
Imagine you have a vector x, and you want to retrieve elements 1, 3, and 5 from it, you'd do:
x.sub <- x[c(1,3,5)]
If you want to retrieve everything from x except elements 1, 3, and 5, you'd do:
x.sub <- x[-c(1,3,5)]
So K[-holdout,-holdout] returns all of the rows and columns of K except for the rows we want to holdout.
What are the arguments of your as.kernelMatrix - especially the [,SVindex(m),drop=F] argument (which is particulary strange because it looks like that entire bracket is a matrix index of K?)
Yeah, I inlined two commands into one:
testK <- as.kernelMatrix(K[holdout, -holdout][,SVindex(m), drop=F])
Now that you've trained the model, you want to give it a new kernel matrix with your testing examples. K[holdout,] would give you only the rows which correspond to the testing examples in K, and all of the columns of K.
SVindex(m) gives you the indexes of your support vectors from your original training matrix -- remember, those rows/cols have holdout removed. So for those column indices to be correct (ie. reference the correct sv column), I must first remove the holdout columns.
Anyway, perhaps this is more clear:
testK <- K[holdout, -holdout]
testK <- testK[,SVindex(m), drop=FALSE]
Now testK only has the rows of our testing examples and the columns that correspond to the support vectors. testK[1,1] will have the value of the kernel function computed between your first testing example, and the first support vector. testK[1,2] will have the kernel function value between your 1st testing example and the second support vector, etc.
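The index bookkeeping can be illustrated in base R without fitting a model. In this sketch the kernel matrix is a small linear kernel built from random data, and sv_idx is a hypothetical stand-in for what SVindex(m) would return (positions within the training set, not within the full K). It shows which original examples the columns of testK end up referring to, and that indexing the full K directly with those positions selects different columns.

```r
set.seed(7)
X <- matrix(rnorm(12 * 2), nrow = 12)
K <- tcrossprod(X)                      # 12 x 12 linear kernel matrix

holdout   <- c(3, 8, 11)                # examples kept for testing
train_ids <- setdiff(seq_len(nrow(K)), holdout)

# Hypothetical support vector positions, relative to the training set
sv_idx <- c(2, 5, 9)

# Two-step indexing as in the answer
testK <- K[holdout, -holdout][, sv_idx, drop = FALSE]

# The same columns expressed as original example ids
sv_original <- train_ids[sv_idx]
all.equal(testK, K[holdout, sv_original, drop = FALSE])

# Skipping the [, -holdout] step picks different columns
isTRUE(all.equal(testK, K[holdout, sv_idx, drop = FALSE]))
```

Removing the holdout columns first is what keeps the support vector positions aligned with the matrix the model was trained on.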
Update (2014-01-30) to answer comment from @wrahool
It's been a while since I've played with this, so the particulars of kernlab::ksvm are a bit rusty, but in principle this should be correct :-) ... here goes:
what is the point of testK <- K[holdout, -holdout] - aren't you removing the columns that correspond to the test set?
Yes. The short answer is that if you want to predict using a kernel matrix, you have to supply a matrix whose dimensions are (rows to predict) by (support vectors). For each row of the matrix (a new example you want to predict on), the values in the columns are simply the kernel function evaluated between that example and each support vector.
The call to SVindex(m) returns the index of the support vectors given in the dimension of the original training data.
So, first doing testK <- K[holdout, -holdout] gives me a testK matrix with the rows of the examples I want to predict on, and the columns are from the same examples (dimension) the model was trained on.
I further subset the columns of testK by SVindex(m) to only give me the columns which (now) correspond to my support vectors. Had I not done the first [, -holdout] selection, the indices returned by SVindex(m) may not correspond to the right examples (unless all N of your testing examples are the last N columns of your matrix).
Also, what exactly does the drop = FALSE condition do?
It's a bit of defensive coding to ensure that after the indexing operation is performed, the object that is returned is of the same type as the object that was indexed.
In R, if you index only one dimension of a 2D (or higher(?)) object, you are returned an object of the lower dimension. I don't want to pass a numeric vector into predict because it wants to have a matrix
For instance
x <- matrix(rnorm(50), nrow=10)
class(x)
[1] "matrix"
dim(x)
[1] 10 5
y <- x[, 1]
class(y)
[1] "numeric"
dim(y)
NULL
The same will happen with data.frames, etc.
First off, I have not used kernlab much. But simply looking at the docs, I do see working examples for the predict.ksvm() method. Copying and pasting, and omitting the prints to screen:
## example using the promotergene data set
data(promotergene)
## create test and training set
ind <- sample(1:dim(promotergene)[1],20)
genetrain <- promotergene[-ind, ]
genetest <- promotergene[ind, ]
## train a support vector machine
gene <- ksvm(Class~.,data=genetrain,kernel="rbfdot",
kpar=list(sigma=0.015),C=70,cross=4,prob.model=TRUE)
## predict gene type probabilities on the test set
genetype <- predict(gene,genetest,type="probabilities")
That seems pretty straight-laced: use random sampling to generate a training set genetrain and its complement genetest, then fitting via ksvm and a call to a predict() method using the fit, and new data in a matching format. This is very standard.
You may find the caret package by Max Kuhn useful. It provides a general evaluation and testing framework for a variety of regression, classification and machine learning methods and packages, including kernlab, and contains several vignettes plus a JSS paper.
Steve Lianoglou is right.
In kernlab it is a bit weird: when predicting, it requires the kernel matrix between each test example and the support vectors. You need to compute this matrix yourself.
For example, a test matrix [n x m], where n is the number of test samples and m is the number of support vectors in the learned model (ordered in the sequence of SVindex(model)).
Example code
trmat <- as.kernelMatrix(kernels[trainidx,trainidx])
tsmat <- as.kernelMatrix(kernels[testidx,trainidx])
#training
model = ksvm(x=trmat, y=trlabels, type = "C-svc", C = 1)
#testing
thistsmat = as.kernelMatrix(tsmat[,SVindex(model)])
tsprediction = predict(model, thistsmat, type = "decision")
kernels is the input kernel matrix. trainidx and testidx are ids for training and test.
Build the labels yourself from the elements of the solution. Use this alternate predictor method which takes ksvm model (m) and data in original training format (d)
predict.alt <- function(m, d){
sign(d[, m@SVindex] %*% m@coef[[1]] - m@b)
}
K is a kernelMatrix for training. As a sanity check, if you run predict.alt on the training data, you will notice that the alternate predictor's output changes in step with the fitted values returned by ksvm, whereas the native predictor behaves in an unexpected way:
aux <- data.frame(fit=kout@fitted, native=predict(kout, K), alt=predict.alt(m=kout, d=as.matrix(K)))
sample_n(aux, 10) # sample_n() comes from the dplyr package
fit native alt
1 0 0 -1
100 1 0 1
218 1 0 1
200 1 0 1
182 1 0 1
87 0 0 -1
183 1 0 1
174 1 0 1
94 1 0 1
165 1 0 1
