How to make keras utilize all available CPU capacity when training? - r

I am trying to implement several different neural networks for a regression problem. When I train a single model, my computer doesn't use all of the available CPU capacity; I assume that using all of it would make training faster.
Ultimately I want to specify multiple models (around 4) that can be trained simultaneously, but to begin with I just want to use all of the CPU when training a single model. The screenshot below shows how my CPU is used when I train a model:
[Screenshot: CPU utilization]
In the code below I tried setting use_multiprocessing = TRUE, which I thought could help, but I get the error that the argument is not used.
library(keras)
epoch <- 50
lr <- 0.1
decay <- lr / epoch
# initialize model
fit_NN4 <- keras_model_sequential() %>%
layer_flatten(input_shape = training %>% select(-date, -mktcap, -permno, -ret.adj) %>% ncol()) %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 32, activation = "relu") %>%
layer_dense(units = 16, activation = "relu") %>%
layer_dense(units = 8, activation = "relu") %>%
layer_dense(units = 1)
# compile
fit_NN4 %>% compile(
loss = "mse", # loss objective function
#optimizer = optimizer_rmsprop(),
optimizer = optimizer_sgd(lr = lr, decay = decay),
metrics = c("mean_absolute_error")
)
# train the model
fit_NN4 %>%
fit(
# Training data
x = training %>% select(-date, -mktcap, -permno, -ret.adj) %>% as.matrix(),
y = training %>% pull(ret.adj) %>% as.matrix(),
epoch = epoch,
# Validation data
validation_data =
list(validation %>% select(-date, -mktcap, -permno, -ret.adj) %>% as.matrix(),
validation %>% pull(ret.adj) %>% as.matrix()),
# Callbacks
callbacks = list(
callback_early_stopping(monitor = "val_loss", # early stop objective
mode = "min", # minimize objective
verbose = 1, # Return epoch at stop
patience = 4, # wait for 4 epochs to stop relative to min
use_multiprocessing = TRUE # intended to use all CPU capacity
))
)
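One thing worth noting is that, in the code above, use_multiprocessing sits inside the call to callback_early_stopping(), which is a plausible reason for the unused-argument error. As a separate and hedged sketch, TensorFlow 2.x exposes thread settings under tf.config.threading; the snippet below assumes the tensorflow R package and a working TensorFlow 2.x install, and it only records whether the settings could be applied. TensorFlow normally chooses thread counts on its own, so a small dense network may still not saturate every core.

```r
# Number of logical cores reported by the operating system (base R 'parallel' package).
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L  # detectCores() can return NA on some platforms

# Assumption: the 'tensorflow' R package and a TensorFlow 2.x Python install exist.
# These settings have to be applied before TensorFlow runs any operation.
threads_set <- FALSE
if (requireNamespace("tensorflow", quietly = TRUE)) {
  threads_set <- tryCatch({
    tf <- tensorflow::tf
    # Threads used inside a single operation (e.g. one matrix multiplication).
    tf$config$threading$set_intra_op_parallelism_threads(as.integer(n_cores))
    # Threads used to run independent operations at the same time.
    tf$config$threading$set_inter_op_parallelism_threads(as.integer(n_cores))
    TRUE
  }, error = function(e) FALSE)
}
```

If threads_set is FALSE, TensorFlow was not reachable from this R session or had already been initialized, and its default thread settings remain in effect.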

Related

'keras_model_sequential()' runs forever

I am working on a project in which I build an LSTM model for GDP growth forecasting. When I try to build the model using 'keras_model_sequential()', it gets stuck and runs forever. I installed both the Keras and TensorFlow packages, but it still never finishes: R keeps running on the first line of this code sample.
lstm_model <- keras_model_sequential()
lstm_model %>%
# 1st LSTM layer
layer_lstm(units = 20, # size of the layer
batch_input_shape = c(1, 5, 1), # batch size, timesteps, features
return_sequences = TRUE, # return the full sequence
stateful = TRUE) %>%
# Dropout layer
layer_dropout(rate = 0.3) %>%
# 2nd LSTM layer
layer_lstm(units = 20,
return_sequences = TRUE,
stateful = TRUE) %>%
layer_dropout(rate = 0.3) %>%
# Final dense/output layer
time_distributed(keras::layer_dense(units = 1))
# Use Adam optimizer, Mean absolute error as loss function, and want to see accuracy
lstm_model %>%
compile(loss = 'mae', optimizer = 'adam', metrics = 'accuracy')
#Summary of the model
summary(lstm_model)
# fit the model
lstm_model %>% fit(
x = x_train_arr,
y = y_train_arr,
batch_size = 1,
epochs = 20,
verbose = 0,
shuffle = FALSE
)

Fixing initial weights when training Keras model in R

I want to fix the initial weights of the model I'm training so that the results in my report are reproducible. The problem is that different runs give similar but slightly different results, and the number of epochs varies widely: with callback_early_stopping monitoring validation MSE and a min_delta of 0.00003, training sometimes stops after 600 epochs and sometimes after 3500.
Overall, I'm happy enough with the results of all runs, but I need to find out whether there is a way to get reproducible results by fixing the weights.
I tried setting the seed at various points in the process (before creating the model, before compiling it, and before training), but nothing seems to work. Is there any way to do it?
BATCH <- nrow(x_train)
SHAPE <- ncol(x_train)
# Create a neural network model
set.seed(42)
model <- keras_model_sequential()
model %>%
layer_dense(units = 12, activation = "relu", input_shape = c(SHAPE)) %>%
layer_dense(units = 24, activation = "relu") %>%
layer_dense(units = 1, activation = "linear")
#print model summary
print(summary(model))
# initialise early stopping callback and optimiser
early_stoping <- callback_early_stopping(
monitor = "val_loss",
min_delta = 0.00003,
patience = 50,
restore_best_weights = TRUE
)
optim <- optimizer_adam(learning_rate = 0.00005)
set.seed(42)
model %>% compile(
optimizer = optim,
loss = "mse",
metrics = c("mse", "mae")
)
# fit model
set.seed(42)
val_data <- list(x_val = x_val, y_val = y_val)
hist <- model %>% fit(
x = x_train,
y = y_train,
batch_size = BATCH,
epochs = 6000,
validation_data = val_data,
shuffle = FALSE,
callbacks = early_stoping
)
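A possible explanation, offered as a hedged sketch rather than a confirmed fix: set.seed() only controls R's own random number generator, while Keras initializes weights through TensorFlow's generator on the Python side. The tensorflow R package provides set_random_seed() for seeding both sides; the snippet assumes that package and a working TensorFlow install, and it would need to run before the model is created. Some operations may remain non-deterministic even with seeds set.

```r
# Base R: set.seed() makes R's own generator reproducible.
set.seed(42); a <- runif(3)
set.seed(42); b <- runif(3)
r_reproducible <- identical(a, b)  # TRUE: same seed, same R random numbers

# Assumption: the 'tensorflow' R package and a TensorFlow install are available.
# set_random_seed() seeds R, Python, NumPy and TensorFlow together.
tf_seeded <- FALSE
if (requireNamespace("tensorflow", quietly = TRUE)) {
  tf_seeded <- tryCatch({
    tensorflow::set_random_seed(42)
    TRUE
  }, error = function(e) FALSE)
}
```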

Saliency maps for individual channels?

I trained a CNN using Keras and R with the TensorFlow backend for classifying multispectral images. I want to calculate a saliency map per input band for a single input image. My idea is to take the mean of the saliency map for each channel to find out which input band contributed most to the classification. Is this possible and, if so, how? Everywhere I looked, I only found Python implementations of saliency maps.
Let's assume this is my network and that I want to calculate saliency maps for all three channels of one image, so that I know which channel is most important:
# download & load data
cifar <- dataset_cifar10()
# set up model
model <- keras_model_sequential() %>%
layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = "relu", input_shape = c(32,32,3)) %>%
layer_max_pooling_2d() %>%
layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = "relu") %>%
layer_max_pooling_2d() %>%
layer_flatten() %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 10, activation = "softmax")
# compile model
model %>% compile(
optimizer = "adam",
loss = "sparse_categorical_crossentropy",
metrics = "accuracy"
)
# run model
history <- model %>%
fit(
x = cifar$train$x,
y = cifar$train$y,
epochs = 10
)
# pick out one image
test_img <- cifar$test$x[1,,,]
# what now?
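The per-channel averaging described in the question can be sketched in base R once a saliency array is available. The snippet below does not compute gradients; it uses a hypothetical placeholder array named saliency, with the same 32 x 32 x 3 shape as one CIFAR-10 image, standing in for the gradient of the class score with respect to the input. Taking the absolute value before averaging is an assumption made here so that positive and negative gradients do not cancel.

```r
# Placeholder saliency array with shape (height, width, channels).
# In practice this would hold the gradient of the class score with respect
# to the input image; the constant values here are for illustration only.
saliency <- array(0, dim = c(32, 32, 3))
saliency[, , 1] <- 1
saliency[, , 2] <- -2
saliency[, , 3] <- 0.5

# Mean absolute saliency per channel (margin 3 is the channel dimension).
channel_importance <- apply(abs(saliency), 3, mean)

# Index of the channel with the largest mean absolute saliency.
most_important <- which.max(channel_importance)
```

With these placeholder values, channel_importance is 1, 2 and 0.5, so the second channel is reported as the most important.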

How to define lstm, units, lags and batch size in keras package

I have a dataset of almost 4700 entries and need to predict power output. I don't understand some parts of the LSTM setup: what are units, how do I select the number of units for my data, and what are data lags? The code I am using is available here https://www.r-bloggers.com/2018/11/lstm-with-keras-tensorflow/ and, since I am only interested in the LSTM, I am using only that part of the code.
library(keras)
model <- keras_model_sequential()
model %>%
layer_lstm(units = 100,
input_shape = c(datalags, 2),
batch_size = batch.size,
return_sequences = TRUE,
stateful = TRUE) %>%
layer_dropout(rate = 0.5) %>%
layer_lstm(units = 50,
return_sequences = FALSE,
stateful = TRUE) %>%
layer_dropout(rate = 0.5) %>%
layer_dense(units = 1)
model %>%
compile(loss = 'mae', optimizer = 'adam')
In this code I am unable to understand:
What is meant by units here?
What are datalags?
The code uses a datalags value of 10. How do I define it for my data, and how do I select it manually?
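In the model above, datalags is the first entry of input_shape, so it is the number of past time steps fed to the LSTM for each prediction. The base R sketch below illustrates how such windows could be built from a single series. The function make_lagged and the toy series are hypothetical, and it uses one feature per time step where the code above uses two.

```r
# Build (samples, datalags, features) input windows and next-step targets.
# 'make_lagged' is an illustrative helper, not part of keras.
make_lagged <- function(series, datalags) {
  n <- length(series) - datalags          # number of complete windows
  x <- array(0, dim = c(n, datalags, 1))  # one feature per time step
  y <- numeric(n)
  for (i in seq_len(n)) {
    x[i, , 1] <- series[i:(i + datalags - 1)]  # the previous 'datalags' values
    y[i] <- series[i + datalags]               # the value to predict
  }
  list(x = x, y = y)
}

series <- 1:15                       # toy series standing in for power output
lagged <- make_lagged(series, datalags = 10)
dim(lagged$x)                        # 5 10 1
lagged$y                             # 11 12 13 14 15
```

A larger datalags gives the network a longer history per sample but leaves fewer samples; how far back the series carries useful information is something to check on the data itself, for example by comparing validation error across a few values.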

Validation accuracy much lower than training accuracy in Keras for text classification

I am new to Keras and trying to create a model. The issue is that my training accuracy is around 80 percent but the validation accuracy is drastically low at 15 percent. I have 545 rows in my dataset. I have normalized all the input features. Any help on what can be tweaked would be really helpful.
Sharing the complete data and code here
https://drive.google.com/open?id=1g8Cmw2bmAI9DnOU-rB4sjsOeBuFp6NUy
#Normalize data
data[,1:(ncol(data)-1)] = normalize(data[,1:(ncol(data)-1)])
data[,ncol(data)] = as.numeric(data[,ncol(data)]) - 1
set.seed(128)
ind = sample(2,nrow(data),replace = T,prob = c(0.7,0.3))
training = data[ind==1,1:(ncol(data)-1)]
test = data[ind==2,1:(ncol(data)-1)]
traintarget = data[ind==1,ncol(data)]
testtarget = data[ind==2,ncol(data)]
# One hot encoding
trainLabels = to_categorical(traintarget)
testLabels = to_categorical(testtarget)
print(testLabels)
model = keras_model_sequential()
model %>%
layer_dense(units = 150, activation = 'relu', input_shape = c(520)) %>%
layer_dense(units = 50, activation = 'relu') %>%
layer_dense(units = 9, activation = 'softmax')
model %>%
compile(loss = 'categorical_crossentropy', optimizer = 'adam',metrics = 'accuracy')
history = model %>%
fit(training,
trainLabels,
epoch = 300,
batch_size = 32,
validation_split = 0.2)
prob = model %>%
predict_proba(test)
pred = model %>%
predict_classes(test)
table2 = table(Predicted = pred, Actual = testtarget)
cbind(prob,pred,testtarget)
Simply put, when your model succeeds on the training data but not on the validation data, it is overfitting. The best ways to combat this are a) making sure that your inputs actually predict the outputs, because otherwise a large enough model will just memorize the historical data, and b) adding a dropout layer to your network. Finally, 500-odd training samples seems a little low for training a neural network.
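To put rough numbers on the sample-size point, the base R arithmetic below counts the trainable parameters of the dense network in the question (520 inputs, then 150, 50 and 9 units) and compares that with the approximate number of rows left for fitting. The variable names are illustrative, and the row count is an expectation based on the 0.7 sampling probability and validation_split = 0.2, not an exact figure.

```r
# Layer widths taken from the question: input, two hidden layers, output.
layer_sizes <- c(520, 150, 50, 9)
n_in  <- head(layer_sizes, -1)
n_out <- tail(layer_sizes, -1)

# Each dense layer has (inputs * units) weights plus one bias per unit.
params_per_layer <- n_in * n_out + n_out
total_params <- sum(params_per_layer)   # 78150 + 7550 + 459 = 86159

# Expected rows used for fitting: 545 rows, ~70% training, 80% of that after
# validation_split = 0.2.
approx_fit_rows <- 545 * 0.7 * 0.8      # 305.2
```

With tens of thousands of parameters and only a few hundred rows, the network has ample capacity to memorize the training set, which is consistent with the gap between training and validation accuracy.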
