Understanding Keras prediction output of an RNN model in R

I'm trying out the Keras package in R by working through this tutorial on forecasting temperature. However, the tutorial has no explanation of how to predict with the trained RNN model, and I wonder how to do this. To train a model I used the following code, copied from the tutorial:
dir.create("~/Downloads/jena_climate", recursive = TRUE)
download.file(
  "https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip",
  "~/Downloads/jena_climate/jena_climate_2009_2016.csv.zip"
)
unzip(
  "~/Downloads/jena_climate/jena_climate_2009_2016.csv.zip",
  exdir = "~/Downloads/jena_climate"
)
library(readr)
data_dir <- "~/Downloads/jena_climate"
fname <- file.path(data_dir, "jena_climate_2009_2016.csv")
data <- read_csv(fname)
data <- data.matrix(data[,-1])
train_data <- data[1:200000,]
mean <- apply(train_data, 2, mean)
std <- apply(train_data, 2, sd)
data <- scale(data, center = mean, scale = std)
generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size = 128, step = 6) {
  if (is.null(max_index))
    max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows),
                                lookback / step,
                                dim(data)[[-1]]))
    targets <- array(0, dim = c(length(rows)))
    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]],
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay,2]
    }
    list(samples, targets)
  }
}
lookback <- 1440
step <- 6
delay <- 144
batch_size <- 128
train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = 200000,
  shuffle = TRUE,
  step = step,
  batch_size = batch_size
)
val_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 200001,
  max_index = 300000,
  step = step,
  batch_size = batch_size
)
test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 300001,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)
# How many steps to draw from val_gen in order to see the entire validation set
val_steps <- (300000 - 200001 - lookback) / batch_size
# How many steps to draw from test_gen in order to see the entire test set
test_steps <- (nrow(data) - 300001 - lookback) / batch_size
library(keras)
model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(lookback / step, dim(data)[-1])) %>%
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = 1)
model %>% compile(
  optimizer = optimizer_rmsprop(),
  loss = "mae"
)
history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = 500,
  epochs = 20,
  validation_data = val_gen,
  validation_steps = val_steps
)
I tried to predict the temperature with the code below. If I am correct, this should give me the normalized predicted temperature for every batch. So when I denormalize the values and average them, I get the predicted temperature. Is this correct, and if so, for which time is the prediction made (latest observation time + delay)?
prediction.set <- test_gen()[[1]]
prediction <- predict(model, prediction.set)
Also, what is the correct way to use keras::predict_generator() and the test_gen() function? If I use the following code:
model %>% predict_generator(generator = test_gen,
                            steps = test_steps)
it gives this error:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: Error when checking model input: the list of Numpy
arrays that you are passing to your model is not the size the model expected.
Expected to see 1 array(s), but instead got the following list of 2 arrays:
[array([[[ 0.50394005, 0.6441838 , 0.5990761 , ..., 0.22060473,
0.2018686 , -1.7336458 ],
[ 0.5475698 , 0.63853574, 0.5890239 , ..., -0.45618412,
-0.45030192, -1.724062...

Note: my familiarity with R syntax is very limited, so unfortunately I can't give you an answer in R. Instead, I am using Python in my answer. I hope you can easily translate at least my explanations back to R.
... If I am correct, this should give me the normalized predicted temperature for every batch.
Yes, that's right. The predictions would be normalized since you have trained it with normalized labels:
data <- scale(data, center = mean, scale = std)
Therefore, you would need to denormalize the values using the computed mean and std to find the real predictions:
pred = model.predict(test_data)
denorm_pred = pred * std + mean
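In R, a minimal sketch of the same denormalization, assuming (as in the tutorial) that the temperature is column 2 of data, so the matching scale factors are mean[[2]] and std[[2]]:
denorm_prediction <- prediction * std[[2]] + mean[[2]]  # prediction as computed above with predict()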
... for which time is then predicted (latest observation time + delay?)
That's right. Concretely, since in this particular dataset a new observation is recorded every ten minutes and you have set delay = 144, the predicted value is the temperature 24 hours ahead of the last given observation (i.e. 144 * 10 = 1440 minutes = 24 hours).
Also, what is the correct way to use keras::predict_generator() and the test_gen() function?
predict_generator takes a generator that yields only test samples, not labels (we don't need labels when performing prediction; labels are needed when training, i.e. fit_generator(), and when evaluating the model, i.e. evaluate_generator()). That's why the error says the model expected one array but got a list of two. So you need to define a generator that yields only test samples. One alternative, in Python, is to wrap your existing generator inside another function that yields only the input samples (I don't know whether you can do this in R or not):
def pred_generator(gen):
    for data, labels in gen:
        yield data  # discards labels

preds = model.predict_generator(pred_generator(test_generator), number_of_steps)
You also need to provide the steps argument, i.e. the number of times the generator must be called to cover all the samples in the test data: num_steps = total_number_of_samples / batch_size. For example, if you have 1000 samples and the generator yields 10 samples per call, you need to run the generator for 1000 / 10 = 100 steps.
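In R terms, that is essentially what the question already computes as test_steps; a small sketch (variable names here are only illustrative), rounding up so the last partial batch is still drawn:
n_test_samples <- nrow(data) - 300001 - lookback   # samples available to test_gen
n_test_steps   <- ceiling(n_test_samples / batch_size)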
Bonus: To see how well your model performs, you can use evaluate_generator with the existing test generator (i.e. test_gen):
loss = model.evaluate_generator(test_gen, number_of_steps)
The resulting loss is also in normalized units. To denormalize it (and get a better sense of the prediction error) you just need to multiply it by std (you don't need to add mean, since you are using mae, i.e. mean absolute error, as the loss function):
denorm_loss = loss * std
This would tell you how much your predictions are off on average. For example, if you are predicting the temperature, a denorm_loss of 5 means that the predictions are on average 5 degrees off (i.e. are either less or more than the actual value).
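In R, and again assuming the temperature target is column 2 of data, a sketch of the same idea:
loss <- model %>% evaluate_generator(test_gen, steps = test_steps)
denorm_loss <- loss * std[[2]]  # average absolute error in degrees Celsius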
Update: For prediction, you can define a new generator using an existing generator in R like this:
pred_generator <- function(gen) {
  function() { # wrap it in a function to make it callable
    gen()[1]   # call the given generator and get the first element (i.e. samples)
  }
}
preds <- model %>%
  predict_generator(
    generator = pred_generator(test_gen), # pass test_gen directly to pred_generator without calling it
    steps = test_steps
  )
evaluate_generator(model, test_gen, test_steps)

Related

Output of keras in R can not be used to predict

I use keras and tensorflow to run an LSTM in R to predict some stock market prices.
Here I am providing the code where, instead of stock market prices, I just use one randomly generated vector VECTOR of length 100. Then I consider a training period of the first 80 values and try to predict the 20 test values...
What am I doing wrong?
I am getting an error:
Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class "keras_training_history"
Thank you
library(tensorflow)
library(keras)
set.seed(12345)
VECTOR=rnorm(100,2,5)
VECTOR_training=VECTOR[1:80]
VECTOR_test=VECTOR[81:100]
training_rescaled=scale(VECTOR_training)
#I also calculate the scale factors because I will need them when I will be coming
#back to the original data
scale_factors=matrix(NA,nrow=1,ncol=2)
scale_factors=c(mean(VECTOR_training), sd(VECTOR_training))
#We want to predict 20 days, so we need to base each prediction on 20 data points.
prediction_stocks=20
lag_stocks=prediction_stocks
test_rescaled =training_rescaled[(length(VECTOR_training)- prediction_stocks + 1):length(VECTOR_training)]
#We lag the data 20 times, so that each prediction is based on 20 values, and arrange lagged values into columns. Then we transform it into the desired 3D form.
x_train_data_stocks = t(sapply(1:(length(VECTOR_training)-lag_stocks-prediction_stocks+1),
                               function(x) training_rescaled[x:(x+lag_stocks-1),1]
))
# now we transform it into 3D form
x_train_arr_stocks = array(
  data = as.numeric(unlist(x_train_data_stocks)),
  dim = c(
    nrow(x_train_data_stocks),
    lag_stocks,
    1
  )
)
#Now we apply similar transformation for the Y values.
y_train_data_stocks = t(sapply(
  (1 + lag_stocks):(length(training_rescaled) - prediction_stocks + 1),
  function(x) training_rescaled[x:(x + prediction_stocks - 1)]
))
y_train_arr_stocks = array(
  data = as.numeric(unlist(y_train_data_stocks)),
  dim = c(
    nrow(y_train_data_stocks),
    prediction_stocks,
    1
  )
)
#In the same manner we need to prepare input data for the prediction
#list_test_rescaled
# this time our array just has one sample, as we intend to perform one 20-days prediction
x_pred_arr_stocks = array(
  data = test_rescaled,
  dim = c(
    1,
    lag_stocks,
    1
  )
)
###lstm forecast prova
set.seed(12345)
lstm_model <- keras_model_sequential()
lstm_model_prova =
  layer_lstm(lstm_model, units = 70, # size of the layer
             batch_input_shape = c(1, 20, 1), # batch size, timesteps, features
             return_sequences = TRUE,
             stateful = TRUE) %>%
  # fraction of the units to drop for the linear transformation of the inputs
  layer_dropout(rate = 0.5) %>%
  layer_lstm(units = 50,
             return_sequences = TRUE,
             stateful = TRUE) %>%
  layer_dropout(rate = 0.5) %>%
  time_distributed(keras::layer_dense(units = 1))
lstm_model_compile = compile(lstm_model_prova, loss = 'mae', optimizer = 'adam', metrics = 'accuracy')
lstm_fit_prova = fit(lstm_model_compile,
                     x = x_train_arr_stocks[[1]],
                     y = y_train_arr_stocks[[1]],
                     batch_size = 1,
                     epochs = 20,
                     verbose = 0,
                     shuffle = FALSE
)
lstm_forecast_prova=predict(lstm_fit_prova,x_pred_arr_stocks, batch_size = 1)
It works if I use
lstm_forecast_prova=predict(lstm_model_compile,x_pred_arr_stocks, batch_size = 1)
But shouldn't I use the fitted model in order to make the predictions?
Also, if I plot the fitted model, the accuracy is 0. And actually on my real data the predictions do not make any sense. So what does it mean that the accuracy is 0? Maybe something is wrong with the lstm parameters?
Thank you in advance!!

Simulating ODE model for different initial conditions in R

I have a model, and I want to generate random initial conditions, run the model, and save the output so that each simulation is a replicate. But I have a hard time interpreting and implementing loops (and I also know they are not always the best to use in R), so I'm struggling.
My ultimate goal is to iterate the simulation across 10 different random initial conditions, and save the output of the ODE including a column for simulation number.
First I have my random initial conditions:
library(deSolve)
states <- c(r=runif(1, min=0.1, max=25), # resource state variable
            c=runif(1, min=0.1, max=10)) # consumer state variable
Then I have my parameters and model:
parameters <- c(g=5, # resource growth rate
                K=25, # resource carrying capacity
                a=1, # consumer attack rate
                h=1, # consumer handling time
                e=0.9, # consumer conversion efficiency
                m=0.5, # consumer mortality rate
                avgrain = 1500, # average rainfall
                A = 1000,
                w = 0.6,
                phi = 8.5,
                ropt1 = 1500, # optimal rainfall for resource growth
                s1 = 1000, # standard deviation for plant growth rate as a function of rainfall
                ropt2 = 1000, # optimal rainfall for herbivore attack (feeding) rate
                s2 = 500, # standard deviation for herbivore attack rate as a function of rainfall
                avgtemp = 20, # average temperature
                A_temp = 7,
                w_temp = 0.5,
                phi_temp = 0.5,
                topt1 = 13, # optimal temperature for resource growth
                ts1 = 10 # standard deviation for plant growth rate as a function of temperature
)
model <- function(t, states, parameters) {
  with(as.list(c(states, parameters)), {
    # rainfall time series function
    rain <- avgrain + (A*sin((w*t)+phi)) # rainfall function
    # temperature time series function
    temp = avgtemp + (A_temp*sin((w_temp*t)+phi_temp))
    # dynamic g and a equations
    dg_both <- (exp(-(rain - ropt1)^2/(s1^2))) + (exp(-(temp - topt1)^2/(ts1^2)))
    da = exp(-(rain - ropt2)^2/(s2^2))
    # rate of change of state variables
    dr <- dg_both*r*(1-(r/K)) - ((c*da*r)/(1+(da*h*r)))
    dc <- ((c*e*da*r)/(1+(da*h*r))) - c*m
    # return rate of change
    list(c(dr, dc), rain=rain, temp=temp, dg_both=dg_both, da=da)
  })
}
times <- seq(0, 200, by = 1)
out <- ode(y = states, times = times, func = model, parms = parameters, method="lsoda")
Would I do this with a for loop? Thank you in advance!
Here is one of the other approaches mentioned by @Ben Bolker: we use replicate instead of a loop. This has the advantage that we don't need to create a list() for the results beforehand.
N <- 10
res <- replicate(N, ode(y = c(r = runif(1, min = 0.1, max = 25),
                              c = runif(1, min = 0.1, max = 10)),
                        times = times, func = model,
                        parms = parameters, method = "lsoda"),
                 simplify = FALSE)
plot(out, res)
As an additional goody, we can also plot the results using deSolve's built-in plotting function; this of course also works with res in Ben's approach. The resulting data structure can then be simplified to something like a matrix or array, either with do.call(rbind, res) as in Ben's example, or with the simplify option directly in replicate.
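For instance, a minimal sketch of both options (assuming res and N from the code above; the object names are only illustrative):
# flatten the list of deSolve matrices into one long matrix with a run index
res_flat <- do.call(rbind, Map(function(i, m) cbind(run = i, m), seq_len(N), res))
# or let replicate() simplify directly into a 3-D array (time x variable x run)
res_arr <- replicate(N, ode(y = c(r = runif(1, min = 0.1, max = 25),
                                  c = runif(1, min = 0.1, max = 10)),
                            times = times, func = model,
                            parms = parameters, method = "lsoda"),
                     simplify = "array")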
Yes, a for loop will be fine. There are lots of other slightly fancier ways to do this (replicate or lapply from base R, purrr::map_dfr from tidyverse ...), but they won't save you any time or memory — they're just a slightly more elegant way to do the same thing.
set.seed(101)
N <- 10
res <- list()
for (i in 1:N) {
  ## pick new initial conditions
  cur_states <- c(r=runif(1, min=0.1, max=25),
                  c=runif(1, min=0.1, max=10))
  ## run model and attach index column to the matrix
  res[[i]] <-
    cbind(run = i,
          ode(y = cur_states, times = times, func = model,
              parms = parameters, method="lsoda")
    )
}
## combine individual runs into one long matrix
res_final <- do.call(rbind, res)
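For completeness, a sketch of the lapply variant mentioned above (same assumptions as the loop; purrr::map_dfr would look very similar):
res_list <- lapply(seq_len(N), function(i) {
  cur_states <- c(r = runif(1, min = 0.1, max = 25),
                  c = runif(1, min = 0.1, max = 10))
  cbind(run = i, ode(y = cur_states, times = times, func = model,
                     parms = parameters, method = "lsoda"))
})
res_final <- do.call(rbind, res_list)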

Using a custom R generator function with fit_generator (Keras, R)

I'd like to train a convolutional network to solve a multi-class, multi-label problem on image data. Due to the nature of the data, and for reasons I'll spare you, it would be best if I could use a custom R generator function to feed to the fit_generator command, instead of its built-in image_data_generator and flow_images_from_directory commands (which I was successfully able to get working, just not for this particular problem).
Here (https://www.rdocumentation.org/packages/keras/versions/2.2.0/topics/fit_generator) it says that I can do just that, without giving any examples. So I tried the following. Here is an extremely stripped-down example of what I'm trying to do (this code is entirely self-contained):
library(keras)
library(reticulate) #for py_iterator function
play.network = keras_model_sequential() %>%
  layer_dense(units = 10, activation = "relu", input_shape = c(10)) %>%
  layer_dense(units = 1, activation = "relu")
play.network %>% compile(
  optimizer = "rmsprop",
  loss = "mse"
)
mikes.custom.generator.function = function() # generates a 2-list of a random 1 x 10 array, and a scalar
{
  new.func = function()
  {
    arr = array(dim = c(1,10))
    arr[,] = sample(1:10, 10, replace = TRUE)/10
    return(list(arr,runif(1)))
  }
}
mikes.custom.iterator = py_iterator(mikes.custom.generator.function()) #creates a python iterator object
generator_next(mikes.custom.iterator) #correctly returns a 2-member list consisting of a 1 x 10 array, and a scalar
generator_next(mikes.custom.iterator)[[1]] #a 1 x 10 array
generator_next(mikes.custom.iterator)[[2]] #a scalar
#try to fit with "fit_generator":
play.network %>% fit_generator( # FREEZES.
  mikes.custom.iterator,
  steps_per_epoch = 1,
  epochs = 1
)
The thing freezes at training time, without giving me an error message or anything. I also tried it with a custom image data generator for my original problem, same result.
Note that this network trains just fine if I just use fit and input the training data manually:
play.network %>% fit(generator_next(mikes.custom.iterator)[[1]],generator_next(mikes.custom.iterator)[[2]], epochs = 1, batch_size = 1)
#trains just fine
I think I know the problem, but I don't know the solution. If you ask it for the class of my custom iterator, it gives
class(mikes.custom.iterator)
[1] "python.builtin.iterator" "rpytools.generator.RGenerator" "python.builtin.object"
whereas if I build an iterator using the builtin image_data_generator and flow_images_from_directory commands, it gives
train_datagen <- image_data_generator(rescale = 1/255)
class(train_datagen)
[1] "keras.preprocessing.image.ImageDataGenerator" "keras_preprocessing.image.ImageDataGenerator" "python.builtin.object"
train_generator <- flow_images_from_directory(
train_dir,
train_datagen,
....
)
class(train_generator)
[1] "python.builtin.iterator" "keras_preprocessing.image.DirectoryIterator" "keras_preprocessing.image.Iterator" "tensorflow.python.keras.utils.data_utils.Sequence" "python.builtin.object"
So my guess is that train_datagen and/or train_generator have attributes that mikes.custom.iterator does not, and fit_generator is trying to call upon mikes.custom.iterator using functions other than the basic generator_next (which is in theory all it should really need). But I don't know what they may be, or how to build mikes.custom.iterator correctly, even after searching for two hours online.
Help anyone?
In R, you can build an iterator using the <<- operator. This is very helpful for building a custom generator function, and it is compatible with Keras' fit_generator() function.
Some minimal example:
# example data
data <- data.frame(
  x = runif(80),
  y = runif(80),
  z = runif(80)
)
# example generator
data_generator <- function(data, x, y, batch_size) {
  # start iterator
  i <- 1
  # return an iterator function
  function() {
    # reset iterator if already seen all data
    if ((i + batch_size - 1) > nrow(data)) i <<- 1
    # iterate current batch's rows
    rows <- c(i:min(i + batch_size - 1, nrow(data)))
    # update to next iteration
    i <<- i + batch_size
    # create container arrays
    x_array <- array(0, dim = c(length(rows), length(x)))
    y_array <- array(0, dim = c(length(rows), length(y)))
    # fill the container
    x_array[1:length(rows), ] <- data[rows, x]
    y_array[1:length(rows), ] <- data[rows, y]
    # return the batch
    list(x_array, y_array)
  }
}
# set-up a generator
gen <- data_generator(
  data = data.matrix(data),
  x = 1:2, # it is flexible, you can use the column numbers,
  y = c("y", "z"), # or the column name
  batch_size = 32
)
From the above function, you can simply check the resulting arrays by calling the generator:
gen()
Or you could also test the generator using a simple Keras model:
# import keras
library(keras)
# set up a simple keras model
model <- keras_model_sequential() %>%
  layer_dense(32, input_shape = c(2)) %>%
  layer_dense(2)
model %>% compile(
  optimizer = "rmsprop",
  loss = "mse"
)
# fit using the generator
model %>% fit_generator(
  generator = gen,
  steps_per_epoch = 100, # will auto-reset after seeing all samples
  epochs = 10
)
I have to admit that the process is a little bit complex and requires extensive programming. You should check this featured blog post by François Chollet himself, or the kerasgenerator package that I developed personally.
sampling_generator <- function(X_data, Y_data, batch_size) {
  function() {
    rows <- sample(1:nrow(X_data), batch_size, replace = TRUE)
    list(X_data[rows,], Y_data[rows,])
  }
}
model %>%
  fit_generator(sampling_generator(X_train, Y_train, batch_size = 128),
                steps_per_epoch = nrow(X_train) / 128, epochs = 10)
I found this answer in the R Keras FAQs, which seems to work:
https://keras.rstudio.com/articles/faq.html#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory

Stan MCMC chains switching back and forth between warmup and sampling

I am currently using R together with Stan to conduct MCMC sampling for obtaining the posterior distribution of a certain demand variable d, given historical demand dH and a currently observed variable x (so the formulation is figuring out P(d|dH, x), which is proportional to P(x|d)P(d|dH)).
My question
I found it really weird that the sampling process shows MCMC jumping back and forth between warmup and sampling (isn't it the case that the first n iterations are always in the warmup stage, followed by the actual sampling stage?). At the same time, it skipped Chain 1 completely (?!). Below is a picture of the progress it shows:
My code
for(i in 1:365){
  nrow = nrow(rte_m[[i]]);
  ncol = ncol(rte_m[[i]]);
  A <- as.matrix(rte_m[[i]]);
  sigma_x <- as.vector(sample.int(10, nrow(kf_vect[[i]]), replace=TRUE))
  sigma_y <- as.vector(eps_vect[[i]])
  yH <- as.vector(dh_vect[[i]]);
  yT <- yH + as.vector(eps_vect[[i]]);
  epsilon <- sample.int(10, nrow(kf_vect[[i]]), replace=TRUE)
  x <- as.vector(as.matrix(rte_m[[i]])%*%yT) + epsilon
  iterations = 500;
  # input data into a list called stan_data
  stan_data = list(nrow = nrow, ncol = ncol,
                   yH = yH,
                   x = x, epsilon = epsilon,
                   A = A, sigma_x = sigma_x, sigma_y = sigma_y);
  # input it into our Stan model file "stamodeling.stan"
  stanmodel1 <- stan_model(file = "stamodeling.stan",
                           model_name = "stanmodel1");
  # MCMC sampling
  stanfit <- sampling(stanmodel1, data = list(ncol = ncol, nrow = nrow,
                                              yH = yH,
                                              x = x, epsilon = epsilon,
                                              A = A, sigma_x = sigma_x, sigma_y = sigma_y),
                      iter = iterations, warmup = 200, chains = 4, cores = 2);
}
Stan Modeling File
Data Files
What's happening isn't that a given chain is switching between warmup and sampling. Instead, what's happening is that the progress messages from the various chains are being interspersed with one another.
So, for example, when you see the following:
[Iteration:] 50/500 [0%] (Warmup)
[Iteration:] 50/500 [0%] (Warmup)
You're actually seeing two messages, one from Chain A and the second from Chain B.

Learning hidden markov model in R

A hidden Markov model (HMM) is one in which you observe a sequence of observations, but do not know the sequence of states the model went through to generate the observations. Analyses of hidden Markov models seek to recover the sequence of hidden states from the observed data.
I have data with both observations and hidden states (observations are of continuous values) where the hidden states were tagged by an expert. I would like to train a HMM that would be able - based on a (previously unseen) sequence of observations - to recover the corresponding hidden states.
Is there any R package to do that? Studying the existing packages (depmixS4, HMM, seqHMM - the last for categorical data only), it seems they only allow you to specify the number of hidden states.
EDIT:
Example:
data.tagged.by.expert = data.frame(
  hidden.state = c("Wake", "REM", "REM", "NonREM1", "NonREM2", "REM", "REM", "Wake"),
  sensor1 = c(1,1.2,1.2,1.3,4,2,1.78,0.65),
  sensor2 = c(7.2,5.3,5.1,1.2,2.3,7.5,7.8,2.1),
  sensor3 = c(0.01,0.02,0.08,0.8,0.03,0.01,0.15,0.45)
)
data.newly.measured = data.frame(
  sensor1 = c(2,3,4,5,2,1,2,4,5,8,4,6,1,2,5,3,2,1,4),
  sensor2 = c(2.1,2.3,2.2,4.2,4.2,2.2,2.2,5.3,2.4,1.0,2.5,2.4,1.2,8.4,5.2,5.5,5.2,4.3,7.8),
  sensor3 = c(0.23,0.25,0.23,0.54,0.36,0.85,0.01,0.52,0.09,0.12,0.85,0.45,0.26,0.08,0.01,0.55,0.67,0.82,0.35)
)
I would like to create an HMM with discrete time t, where the random variable x(t) represents the hidden state at time t, x(t) ∈ {"Wake", "REM", "NonREM1", "NonREM2"}, and 3 continuous random variables sensor1(t), sensor2(t), sensor3(t) represent the observations at time t.
model.hmm = learn.model(data.tagged.by.user)
Then I would like to use the created model to estimate hidden states responsible for newly measured observations
hidden.states = estimate.hidden.states(model.hmm, data.newly.measured)
Data (training/testing)
To be able to run learning methods for the Naive Bayes classifier, we need a longer data set:
states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
artificial.hypnogram = rep(c(5,4,1,2,3,4,5), times = c(40,150,200,300,50,90,30))
data.tagged.by.expert = data.frame(
  hidden.state = states[artificial.hypnogram],
  sensor1 = log(artificial.hypnogram) + runif(n = length(artificial.hypnogram), min = 0.2, max = 0.5),
  sensor2 = 10*artificial.hypnogram + sample(c(-8:8), size = length(artificial.hypnogram), replace = T),
  sensor3 = sample(1:100, size = length(artificial.hypnogram), replace = T)
)
hidden.hypnogram = rep(c(5,4,1,2,4,5), times = c(10,10,15,10,10,3))
data.newly.measured = data.frame(
  sensor1 = log(hidden.hypnogram) + runif(n = length(hidden.hypnogram), min = 0.2, max = 0.5),
  sensor2 = 10*hidden.hypnogram + sample(c(-8:8), size = length(hidden.hypnogram), replace = T),
  sensor3 = sample(1:100, size = length(hidden.hypnogram), replace = T)
)
Solution
In the solution, we use the Viterbi algorithm combined with a Naive Bayes classifier.
At each clock time t, a Hidden Markov Model consists of:
an unobserved state (denoted as hidden.state in this case) taking one of a finite number of states
states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
a set of observed variables (sensor1, sensor2, sensor3 in this case)
Transition matrix
A new state is entered based upon a transition probability distribution
(transition matrix). This can be easily computed from data.tagged.by.expert e.g. using
library(markovchain)
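# note: despite its name, emit_p below holds the fitted transition matrix (and is used as such in the Viterbi step)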
emit_p <- markovchainFit(data.tagged.by.expert$hidden.state)$estimate
Emission matrix
After each transition is made, an observation (sensor_i) is produced according to a conditional probability distribution (emission matrix) which depends only on the current state H of hidden.state. We will replace the emission matrices by a Naive Bayes classifier.
library(caret)
library(klaR)
library(e1071)
model = train(hidden.state ~ .,
              data = data.tagged.by.expert,
              method = 'nb',
              trControl = trainControl(method = 'cv', number = 10)
)
Viterbi algorithm
To solve the problem, we use the Viterbi algorithm with an initial probability of 1 for the "Wake" state and 0 otherwise. (We expect the patient to be awake at the beginning of the experiment.)
# we expect the patient to be awake in the beginning
start_p = c(NonREM1 = 0,NonREM2 = 0,NonREM3 = 0, REM = 0, Wake = 1)
# Naive Bayes model
model_nb = model$finalModel
# the observations
observations = data.newly.measured
nObs <- nrow(observations) # number of observations
nStates <- length(states) # number of states
# T1, T2 initialization
T1 <- matrix(0, nrow = nStates, ncol = nObs) #define two 2-dimensional tables
row.names(T1) <- states
T2 <- T1
Byj <- predict(model_nb, newdata = observations[1,])$posterior
# init first column of T1
for(s in states)
  T1[s,1] = start_p[s] * Byj[1,s]
# fill T1 and T2 tables
for(j in 2:nObs) {
  Byj <- predict(model_nb, newdata = observations[j,])$posterior
  for(s in states) {
    res <- (T1[,j-1] * emit_p[,s]) * Byj[1,s]
    T2[s,j] <- states[which.max(res)]
    T1[s,j] <- max(res)
  }
}
# backtrack the best path
result <- rep("", times = nObs)
result[nObs] <- names(which.max(T1[,nObs]))
for (j in nObs:2) {
  result[j-1] <- T2[result[j], j]
}
# show the result
result
# show the original artificial data
states[hidden.hypnogram]
References
To read more about the problem, see: Vomlel Jiří, Kratochvíl Václav: Dynamic Bayesian Networks for the Classification of Sleep Stages, Proceedings of the 11th Workshop on Uncertainty Processing (WUPES'18), pp. 205-215, Eds: Kratochvíl Václav, Vejnarová Jiřina, Třeboň, CZ, 2018.
