Keras LSTM and multiple input features: how to define parameters - R

I am discovering Keras in R and the LSTM. Following this blog post, I want to predict a time series, and I would like to use several past time points (t-1, t-2) to predict the point at time t.
Here is what I tried so far:
library(data.table)
library(tensorflow)
library(keras)
Serie <- c(5.66333333333333, 5.51916666666667, 5.43416666666667, 5.33833333333333,
5.44916666666667, 6.2025, 6.57916666666667, 6.70666666666667,
6.95083333333333, 8.1775, 8.55083333333333, 8.42166666666667,
8.01333333333333, 8.99833333333333, 11.0025, 10.3116666666667,
10.51, 10.9916666666667, 10.6116666666667, 10.8475, 13.7841666666667,
16.2916666666667, 15.9975, 14.3683333333333, 13.4041666666667,
11.8666666666667, 9.11916666666667, 9.47862416666667, 9.08404666666667,
8.79606166666667, 9.93211091666667, 9.03834041666667, 8.58787275,
6.77499383333333, 7.21377583333333, 7.53497175, 6.31212966666667,
5.5825105, 4.64021041666667, 4.608787, 5.39446983333333, 4.93945983333333,
4.8612215, 4.13088808333333, 4.09916575, 3.40943183333333, 3.79573258333333,
4.30319966666667, 4.23431266666667, 3.64880758333333, 3.11700716666667,
3.321058, 2.53599408333333, 2.20433991666667, 1.66643905833333,
0.84187275, 0.467880658333333, 0.810507858333333, 0.795)
Npoints <- 2 # number of previous points to take into account
I then create a data frame with the lagged time series, and split it into a train and a test set:
supervised <- data.table(x = diff(Serie, differences = 1))
supervised[,c(paste0("x-",1:Npoints)) := lapply(1:Npoints,function(i){c(rep(NA,i),x[1:(.N-i)])})] # create shifted versions
# take the non NA
supervised <- supervised[!is.na(get(paste0("x-",Npoints)))]
head(supervised)
# Split dataset into training and testing sets
N = nrow(supervised)
n = round(N *0.7, digits = 0)
train = supervised[1:n, ]
test = supervised[(n+1):N, ]
I rescale the data:
scale_data = function(train, test, feature_range = c(0, 1)) {
x = train
fr_min = feature_range[1]
fr_max = feature_range[2]
std_train = ((x - min(x,na.rm = T) ) / (max(x,na.rm = T) - min(x,na.rm = T) ))
std_test = ((test - min(x,na.rm = T) ) / (max(x,na.rm = T) - min(x,na.rm = T) ))
scaled_train = std_train *(fr_max -fr_min) + fr_min
scaled_test = std_test *(fr_max -fr_min) + fr_min
return( list(scaled_train = as.vector(scaled_train), scaled_test = as.vector(scaled_test) ,scaler= c(min =min(x,na.rm = T), max = max(x,na.rm = T))) )
}
Scaled = scale_data(train, test, c(-1, 1))
# define x and y train
y_train = as.vector(Scaled$scaled_train[, 1])
x_train = Scaled$scaled_train[, -1]
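Since scale_data also returns the scaler, predictions can later be mapped back to the original scale. A minimal sketch of that inverse transform (the name invert_scaling and the feature_range default are mine, not from the blog post):
invert_scaling = function(scaled, scaler, feature_range = c(-1, 1)) {
  fr_min = feature_range[1]
  fr_max = feature_range[2]
  std = (scaled - fr_min) / (fr_max - fr_min)                        # back to [0, 1]
  as.vector(std * (scaler["max"] - scaler["min"]) + scaler["min"])   # back to the original units
}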
And following this post I reshape the data into 3D:
x_train_reshaped <- array(NA,dim= c(1,dim(x_train)))
x_train_reshaped[1,,] <- as.matrix(x_train)
I define the following model and try to start training:
model <- keras_model_sequential()
model%>%
layer_lstm(units = 1, batch_size = 1, input_shape = dim(x_train), stateful= TRUE)%>%
layer_dense(units = 1)
# compile model ####
model %>% compile(
loss = 'mean_squared_error',
optimizer = optimizer_adam( lr= 0.02, decay = 1e-6 ),
metrics = c('accuracy')
)
# make a test
model %>% fit(x_train_reshaped, y_train, epochs=1, batch_size=1, verbose=1, shuffle=FALSE)
but I get the following error:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: No data provided for "dense_11". Need data for each key in: ['dense_11']
Trying to reshape the data differently didn't help.
What am I doing wrong?

Keras and tensorflow in R cannot recognise the size of your input/target data when they are data frames.
y_train is both a data.table and a data.frame:
class(y_train)
[1] "data.table" "data.frame"
The keras fit documentation states: "y: Vector, matrix, or array of target (label) data (or list if the model has multiple outputs)." Similarly, for x.
Unfortunately, there still appears to be an input and/or target dimensionality mismatch when y_train is cast to a matrix:
model %>%
fit(x_train_reshaped, as.matrix(y_train), epochs=1, batch_size=1, verbose=1, shuffle=FALSE)
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: Input arrays should have the same number of samples as target arrays.
Found 1 input samples and 39 target samples.
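For what it's worth, this second error suggests the samples should sit on the first axis of the 3D input, with the lags as timesteps. A rough sketch of that reshape, using one possible convention (Npoints lags as timesteps, a single feature) and not tested against the original data:
x_train_mat <- as.matrix(x_train)                                             # samples x lags (39 x 2 here)
x_train_reshaped <- array(x_train_mat, dim = c(nrow(x_train_mat), Npoints, 1)) # samples, timesteps, features
model <- keras_model_sequential() %>%
  layer_lstm(units = 1, batch_size = 1, input_shape = c(Npoints, 1), stateful = TRUE) %>%
  layer_dense(units = 1)
# ... compile as above, then:
model %>% fit(x_train_reshaped, as.matrix(y_train), epochs = 1, batch_size = 1, verbose = 1, shuffle = FALSE)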
Hope this answer helps you, or someone else, make further progress.

Related

Output of keras in R cannot be used to predict

I use keras and tensorflow to run an LSTM in R to predict some stock market prices.
Here I am providing the code where, instead of stock market prices, I just use one randomly generated vector VECTOR of length 100. Then I consider a training period of the first 80 values and try to predict the 20 test values...
What am I doing wrong?
I am getting an error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "keras_training_history"
Thank you
library(tensorflow)
library(keras)
set.seed(12345)
VECTOR=rnorm(100,2,5)
VECTOR_training=VECTOR[1:80]
VECTOR_test=VECTOR[81:100]
training_rescaled=scale(VECTOR_training)
# I also store the scale factors because I will need them when coming
# back to the original data
scale_factors=matrix(NA,nrow=1,ncol=2)
scale_factors=c(mean(VECTOR_training), sd(VECTOR_training))
#We want to predict 20 days, so we need to base each prediction on 20 data points.
prediction_stocks=20
lag_stocks=prediction_stocks
test_rescaled =training_rescaled[(length(VECTOR_training)- prediction_stocks + 1):length(VECTOR_training)]
#We lag the data 20 times, so that each prediction is based on 20 values, and arrange the lagged values into columns. Then we transform it into the desired 3D form.
x_train_data_stocks=t(sapply(1:(length(VECTOR_training)-lag_stocks-prediction_stocks+1),
function(x) training_rescaled[x:(x+lag_stocks-1),1]
))
# now we transform it into 3D form
x_train_arr_stocks=array(
data=as.numeric(unlist(x_train_data_stocks)),
dim=c(
nrow(x_train_data_stocks),
lag_stocks,
1
)
)
#Now we apply similar transformation for the Y values.
y_train_data_stocks=t(sapply(
(1 + lag_stocks):(length(training_rescaled) - prediction_stocks + 1),
function(x) training_rescaled[x:(x + prediction_stocks - 1)]
))
y_train_arr_stocks= array(
data = as.numeric(unlist(y_train_data_stocks)),
dim = c(
nrow(y_train_data_stocks),
prediction_stocks,
1
)
)
#In the same manner we need to prepare input data for the prediction
#list_test_rescaled
# this time our array just has one sample, as we intend to perform one 20-day prediction
x_pred_arr_stocks=array(
data = test_rescaled,
dim = c(
1,
lag_stocks,
1
)
)
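As a sanity check at this point (assuming the arithmetic above: 80 training values, a lag and a horizon of 20 each), the shapes should come out as 41 training samples of 20 timesteps and one single-sample prediction input:
dim(x_train_arr_stocks)  # expected: 41 20 1
dim(y_train_arr_stocks)  # expected: 41 20 1
dim(x_pred_arr_stocks)   # expected:  1 20 1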
###lstm forecast prova
set.seed(12345)
lstm_model <- keras_model_sequential()
lstm_model_prova=
layer_lstm(lstm_model,units = 70, # size of the layer
batch_input_shape = c(1, 20, 1), # batch size, timesteps, features
return_sequences = TRUE,
stateful = TRUE) %>%
# fraction of the units to drop for the linear transformation of the inputs
layer_dropout(rate = 0.5) %>%
layer_lstm(units = 50,
return_sequences = TRUE,
stateful = TRUE) %>%
layer_dropout(rate = 0.5) %>%
time_distributed(keras::layer_dense(units = 1))
lstm_model_compile=compile(lstm_model_prova,loss = 'mae', optimizer = 'adam', metrics = 'accuracy')
lstm_fit_prova=fit(lstm_model_compile,
x = x_train_arr_stocks[[1]],
y = y_train_arr_stocks[[1]],
batch_size = 1,
epochs = 20,
verbose = 0,
shuffle = FALSE
)
lstm_forecast_prova=predict(lstm_fit_prova,x_pred_arr_stocks, batch_size = 1)
It works if I use
lstm_forecast_prova=predict(lstm_model_compile,x_pred_arr_stocks, batch_size = 1)
But shouldn't I use the fitted model in order to make the predictions?
Also, if I plot the fitted model, the accuracy is 0, and on my real data the predictions do not make any sense. So what does it mean that the accuracy is 0? Maybe something is wrong with the LSTM parameters?
Thank you in advance!!
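For what it's worth: in the R keras interface, fit() returns a training-history object while the model itself is updated in place, so predict() has to be called on the model; compile() also returns that same model object, which is why predict(lstm_model_compile, ...) happens to work. A minimal sketch of that flow (assuming the full 3D arrays are what fit() should receive, rather than [[1]]):
fit(lstm_model_prova,
    x = x_train_arr_stocks, y = y_train_arr_stocks,   # full 3D arrays
    batch_size = 1, epochs = 20, verbose = 0, shuffle = FALSE)
lstm_forecast_prova <- predict(lstm_model_prova, x_pred_arr_stocks, batch_size = 1)
As for the accuracy of 0: accuracy counts exact matches between predictions and targets, so it is not a meaningful metric for a continuous forecast; something like 'mae' would be more informative here.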

Kriging interpolation using GeoStats package in Julia

I am trying to build a model for kriging interpolation using the GeoStats package in Julia.
I have tried an example of 2D interpolation, but the results are not accurate, as shown below.
Code for 2D interpolation:
using KrigingEstimators, DataFrames, Variography, Plots
OK = OrdinaryKriging(GaussianVariogram()) # interpolator object
f(x) = sin(x)
# fit it to the data:
x_train = range(0, 10.0, length = 9) |> collect
y_train = f.(x_train)
scatter(x_train, y_train, label="train points")
x_train = reshape(x_train, 1, length(x_train))
krig = KrigingEstimators.fit(OK, x_train, y_train) # fit function
result = []
variance =[]
test = range(0, 10, length = 101) |> collect
y_test = f.(test)
test = reshape(test, 1, length(test))
for i in test
μ, σ² = KrigingEstimators.predict(krig, [i])
push!(result, μ)
push!(variance, σ²)
end
df_krig_vario = DataFrame(:predict=>result, :real=>y_test, :variance=>variance)
println(first(df_krig_vario, 5))
mean_var = sum(variance)/length(variance)
println("")
println("mean variance is $mean_var")
test = reshape(test, length(test), 1)
plot!(test, y_test, label="actual")
plot!(test, result, label="predict", legend=:bottomright, title="Gaussian Variogram")
With reference to the figure above, it can be seen that the interpolation prediction is not accurate. May I know how to improve this accuracy?

SHAP with Keras model : operands could not be broadcast together with shapes (2,6) (10,)

I am running SHAP from the library shapper in R for a classification-model interpretation on a Keras CNN model:
library(keras)
library("shapper")
library("DALEX")
I made a simple reproducible example
mdat.train <- cbind(rep(1:2, each = 5), matrix(c(1:30), ncol = 3, byrow = TRUE))
train.conv <- array_reshape(mdat.train[,-1], c(nrow(mdat.train[,-1]), ncol(mdat.train[,-1]), 1))
mdat.test <- cbind(rep(1:2, each = 3), matrix(c(1:18), ncol = 3, byrow = TRUE))
test.conv <- array_reshape(mdat.test[,-1], c(nrow(mdat.test[,-1]), ncol(mdat.test[,-1]), 1))
My CNN model
model.CNN <- keras_model_sequential()
model.CNN %>%
layer_conv_1d(filters=16L, kernel_initializer=initializer_he_normal(seed=NULL), kernel_size=2L, input_shape = c(dim(train.conv)[[2]],1)) %>%
layer_batch_normalization() %>%
layer_activation_leaky_relu() %>%
layer_flatten() %>%
layer_dense(50, activation ="relu") %>%
layer_dropout(rate=0.5) %>%
layer_dense(units=2, activation ='sigmoid')
model.CNN %>% compile(
loss = loss_binary_crossentropy,
optimizer = optimizer_adam(lr = 0.001, beta_1 = 0.9, beta_2 = 0.999, epsilon = 1e-08),
metrics = c("accuracy"))
model.CNN %>% fit(
train.conv, mdat.train[,1], epochs = 5, verbose = 1)
My SHAP command
p_function <- function(model, data) predict(model.CNN, test.conv, type = "prob")
exp_cnn <- explain(model.CNN, data = train.conv)
ive_cnn <- shap(exp_cnn, data = train.conv, new_observation = test.conv, predict_function = p_function)
I am getting this error:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: operands could not be broadcast together with shapes (2,6) (10,)
Detailed traceback:
File "/.local/lib/python3.6/site-packages/shap/explainers/kernel.py", line 120, in __init__
self.fnull = np.sum((model_null.T * self.data.weights).T, 0)
The problem you've presented has two parts. First of all, the error shown comes from a typo in the code: the p_function you wrote calls global objects instead of the ones passed to it. That's why you saw that error.
But to my surprise, I found the package still did not work even after fixing that mistake. Let me explain the cause and the solution.
I have to say that 3D arrays are not common in R, so the shapper package does not support that type of training data. It assumes a data.frame at the beginning of the task (because it iterates over variables). To be honest, it took me about two hours to find the reason it was not working, as well as a solution.
First of all, we need new variables in a form that shapper understands:
shapper_data <- as.data.frame(train.conv)
shapper_new_obs <- as.data.frame(test.conv)[1,]
as well as a new predict_function:
p_function <- function(model, data) {
mat <- as.matrix(data)
mat <- array_reshape(mat, c(nrow(data), ncol(data), 1))
predict(model, mat, type = "prob")
}
The two new lines convert the data.frame into a properly shaped array.
Then the line
ive_cnn <- individual_variable_effect(x = model.CNN, data = shapper_data, new_observation = shapper_new_obs, predict_function = p_function)
works perfectly fine for me.
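If you prefer to keep the explain()/shap() workflow from your question, the same idea should carry over by rebuilding the explainer on the data.frame version of the data; a sketch mirroring the call signature you used (not a guarantee of the shapper API beyond that):
exp_cnn <- explain(model.CNN, data = shapper_data, predict_function = p_function)
ive_cnn <- shap(exp_cnn, data = shapper_data, new_observation = shapper_new_obs,
                predict_function = p_function)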
Best
Szymon

What can be the cause of the difference in MAE outcome from deep learning with R between these datasets?

I’m trying to replicate the deep learning example below with the same Boston housing dataset from another source.
https://jjallaire.github.io/deep--with-r-notebooks/notebooks/3.6-predicting-house-prices.nb.html
Originally the data source is:
library(keras)
dataset <- dataset_boston_housing()
Alternatively I try to use:
library(mlbench)
data(BostonHousing)
The difference between the datasets are:
the dataset from mlbench contains column names.
the dataset from keras is already split between test and train.
the dataset from keras is organised as lists containing matrices, while the dataset from mlbench is a data frame.
the fourth column contains a categorical variable "chas" which could not be preprocessed from the mlbench dataset while it can be preprocessed from the keras dataset. To compare apples with apples I have deleted this column from both datasets.
In order to compare both datasets I have merged the train and testset from keras into 1 dataset. After this I have compared the merged dataset from keras with mlbench with summary() and these are identical for every feature (min, max, median, mean).
Since the dataset from keras is already split between test and train (80-20), I can only use one training set for the deep learning process. This training set gives a validation_mae of around 2.5. See this graph:
If I partition the data from mlbench at 0.8 to construct a training set of similar size, run the deep learning code and do this several times, I never reach a validation_mae of around 2.5. The range is between 4 and 6. An example of the output is this graph:
Does someone know what can be the cause for this difference?
Code with dataset from keras:
library(keras)
dataset <- dataset_boston_housing()
c(c(train_data, train_targets), c(test_data, test_targets)) %<-% dataset
train_data <- train_data[,-4]
test_data <- test_data[,-4]
mean <- apply(train_data, 2, mean)
std <- apply(train_data, 2, sd)
train_data <- scale(train_data, center = mean, scale = std)
test_data <- scale(test_data, center = mean, scale = std)
# After this line the code is the same for both code examples.
# =========================================
# Because we will need to instantiate the same model multiple times,
# we use a function to construct it.
build_model <- function() {
model <- keras_model_sequential() %>%
layer_dense(units = 64, activation = "relu",
input_shape = dim(train_data)[[2]]) %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 1)
model %>% compile(
optimizer = "rmsprop",
loss = "mse",
metrics = c("mae")
)
}
k <- 4
indices <- sample(1:nrow(train_data))
folds <- cut(1:length(indices), breaks = k, labels = FALSE)
num_epochs <- 100
all_scores <- c()
for (i in 1:k) {
cat("processing fold #", i, "\n")
# Prepare the validation data: data from partition # k
val_indices <- which(folds == i, arr.ind = TRUE)
val_data <- train_data[val_indices,]
val_targets <- train_targets[val_indices]
# Prepare the training data: data from all other partitions
partial_train_data <- train_data[-val_indices,]
partial_train_targets <- train_targets[-val_indices]
# Build the Keras model (already compiled)
model <- build_model()
# Train the model (in silent mode, verbose=0)
model %>% fit(partial_train_data, partial_train_targets,
epochs = num_epochs, batch_size = 1, verbose = 0)
# Evaluate the model on the validation data
results <- model %>% evaluate(val_data, val_targets, verbose = 0)
all_scores <- c(all_scores, results$mean_absolute_error)
}
all_scores
mean(all_scores)
# Some memory clean-up
k_clear_session()
num_epochs <- 500
all_mae_histories <- NULL
for (i in 1:k) {
cat("processing fold #", i, "\n")
# Prepare the validation data: data from partition # k
val_indices <- which(folds == i, arr.ind = TRUE)
val_data <- train_data[val_indices,]
val_targets <- train_targets[val_indices]
# Prepare the training data: data from all other partitions
partial_train_data <- train_data[-val_indices,]
partial_train_targets <- train_targets[-val_indices]
# Build the Keras model (already compiled)
model <- build_model()
# Train the model (in silent mode, verbose=0)
history <- model %>% fit(
partial_train_data, partial_train_targets,
validation_data = list(val_data, val_targets),
epochs = num_epochs, batch_size = 1, verbose = 1
)
mae_history <- history$metrics$val_mean_absolute_error
all_mae_histories <- rbind(all_mae_histories, mae_history)
}
average_mae_history <- data.frame(
epoch = seq(1:ncol(all_mae_histories)),
validation_mae = apply(all_mae_histories, 2, mean)
)
library(ggplot2)
ggplot(average_mae_history, aes(x = epoch, y = validation_mae)) + geom_line()
Code with dataset from mlbench (after the line with "=====", the code is the same as in the code above):
library(dplyr)
library(mlbench)
library(groupdata2)
data(BostonHousing)
parts <- partition(BostonHousing, p = 0.2)
test_data <- parts[[1]]
train_data <- parts[[2]]
train_targets <- train_data$medv
test_targets <- test_data$medv
train_data$medv <- NULL
test_data$medv <- NULL
train_data$chas <- NULL
test_data$chas <- NULL
mean <- apply(train_data, 2, mean)
std <- apply(train_data, 2, sd)
train_data <- scale(train_data, center = mean, scale = std)
test_data <- scale(test_data, center = mean, scale = std)
library(keras)
# After this line the code is the same for both code examples.
# =========================================
build_model <- function() {
model <- keras_model_sequential() %>%
layer_dense(units = 64, activation = "relu",
input_shape = dim(train_data)[[2]]) %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 1)
model %>% compile(
optimizer = "rmsprop",
loss = "mse",
metrics = c("mae")
)
}
k <- 4
indices <- sample(1:nrow(train_data))
folds <- cut(1:length(indices), breaks = k, labels = FALSE)
num_epochs <- 100
all_scores <- c()
for (i in 1:k) {
cat("processing fold #", i, "\n")
# Prepare the validation data: data from partition # k
val_indices <- which(folds == i, arr.ind = TRUE)
val_data <- train_data[val_indices,]
val_targets <- train_targets[val_indices]
# Prepare the training data: data from all other partitions
partial_train_data <- train_data[-val_indices,]
partial_train_targets <- train_targets[-val_indices]
# Build the Keras model (already compiled)
model <- build_model()
# Train the model (in silent mode, verbose=0)
model %>% fit(partial_train_data, partial_train_targets,
epochs = num_epochs, batch_size = 1, verbose = 0)
# Evaluate the model on the validation data
results <- model %>% evaluate(val_data, val_targets, verbose = 0)
all_scores <- c(all_scores, results$mean_absolute_error)
}
all_scores
mean(all_scores)
# Some memory clean-up
k_clear_session()
num_epochs <- 500
all_mae_histories <- NULL
for (i in 1:k) {
cat("processing fold #", i, "\n")
# Prepare the validation data: data from partition # k
val_indices <- which(folds == i, arr.ind = TRUE)
val_data <- train_data[val_indices,]
val_targets <- train_targets[val_indices]
# Prepare the training data: data from all other partitions
partial_train_data <- train_data[-val_indices,]
partial_train_targets <- train_targets[-val_indices]
# Build the Keras model (already compiled)
model <- build_model()
# Train the model (in silent mode, verbose=0)
history <- model %>% fit(
partial_train_data, partial_train_targets,
validation_data = list(val_data, val_targets),
epochs = num_epochs, batch_size = 1, verbose = 1
)
mae_history <- history$metrics$val_mean_absolute_error
all_mae_histories <- rbind(all_mae_histories, mae_history)
}
average_mae_history <- data.frame(
epoch = seq(1:ncol(all_mae_histories)),
validation_mae = apply(all_mae_histories, 2, mean)
)
library(ggplot2)
ggplot(average_mae_history, aes(x = epoch, y = validation_mae)) + geom_line()
Thank you!
Writing here because I can't comment...
I checked the mlbench dataset here and it says that it contains the 14 columns of the original Boston dataset and 5 additional columns. Not sure if you might have a faulty dataset, since you state that there are no differences in the column counts of the datasets.
Another guess might be that the second example graph is from a model which is stuck in a local minimum. To get more comparable models, you might want to work with the same seeds to make sure that the initialisations of the weights etc. are the same, so you get the same results.
Hope this helps, and feel free to ask.
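On the seeding point, a minimal sketch for the R keras interface (assuming a keras version that still provides use_session_with_seed(); note it disables GPU and CPU parallelism by default, which slows training):
library(keras)
set.seed(123)               # R-side RNG: sampling, fold assignment
use_session_with_seed(123)  # Python/TensorFlow-side RNG; call before building the model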

Same accuracy for SVM classification

I am using the car evaluation dataset from UCI. I am trying to use SVM classification for it. After model creation, when I calculate accuracy using the confusion matrix, I get the same accuracy every time, even if I change the parameters of the SVM. Posting my code below.
require("e1071");
#Code to read data from csv and convert to numeric
car_data <- read.csv("car.data.csv",header = TRUE,sep = ",",quote = "\"");
#backup original data to other data frame
car_data_bkp <- car_data;
car_data$buying<-as.numeric(car_data$buying);
car_data$maint<-as.numeric(car_data$maint);
car_data$doors<-as.numeric(car_data$doors);
car_data$persons<-as.numeric(car_data$persons);
car_data$lug_boot<-as.numeric(car_data$lug_boot);
car_data$safety<-as.numeric(car_data$safety);
car_data$class<-as.numeric(car_data$class);
#scaling of data
maxs = apply(car_data, MARGIN = 2, max);
mins = apply(car_data, MARGIN = 2, min);
scaled = as.data.frame(scale(car_data, center = mins, scale = maxs - mins));
#sampling of data for train and testing
trainIndex <- sample(1:nrow(scaled), 0.8 * nrow(scaled));
train <- scaled[trainIndex, ];
test <- scaled[-trainIndex, ];
n <- names(train);
f <- as.formula(paste("class ~", paste(n[!n %in% "class"], collapse = " + ")));
svm_model <- svm(formula=f,train,cross = 2,tolerance= 0.00001, cost = 1000,gamma=1);
summary(svm_model);
svm.pred <- predict(svm_model, test[,-7],type = "class");
table(pred = svm.pred, true = test[,7]);
#calculate accuracy
sum(diag(svm.pred))/sum(svm.pred);
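A hedged observation on that last line: diag() applied to the prediction vector just builds a diagonal matrix from it, so sum(diag(svm.pred))/sum(svm.pred) reduces to sum(svm.pred)/sum(svm.pred) and is always 1, whatever the model does. Also, because class was converted to numeric, svm() defaults to eps-regression rather than classification. A sketch of the accuracy computed from the confusion table instead, assuming the target is kept as a factor so the predictions are class labels:
tab <- table(pred = svm.pred, true = test[,7])
sum(diag(tab)) / sum(tab)   # proportion of predictions on the diagonal, i.e. correct ones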
