Why is this simple regression (keras) ANN failing so badly? - r

I am trying to do a non-linear regression on very simple data. When running the following code I get really bad results. Almost every time the result is a simple linear regression. When I check the weights of my model, most (if not all) neurons are 'dead': they have negative weights and negative biases, so the ReLU function returns 0 for all inputs (since all inputs are in the range [0,1]).
As far as I can tell this is a problem with the optimizer. I also tried a very low and a very high learning rate, with no luck. The optimizer seems to be getting stuck in a very suboptimal local minimum.
I also tried setting the initial weights to be all positive, in [0,0.1], but the optimizer 'cheats' its way into a linear regression by setting all biases to roughly the same value.
Can anyone help me? What am I doing wrong? Is this really the best a state-of-the-art ANN can achieve on a simple regression problem?
library(keras)
fun <- function(x) 0.2+0.4*x^2+0.3*x*sin(15*x)+0.05*cos(50*x)
x_test <- seq(0,1,0.01)
y_test <- fun(x_test)
plot(x_test, y_test, type = 'l')
x_train <- runif(50)
y_train <- fun(x_train)
points(x_train, y_train)
model <- keras_model_sequential() %>%
layer_dense(10, 'relu', input_shape = 1) %>%
layer_dense(1)
model %>% compile(
optimizer = 'sgd',
loss = "mse"
)
history <- model %>%
fit(x = x_train, y = y_train,
epochs = 100,
batch_size = 10,
validation_data = list(x_test, y_test)
)
y_pred <- model %>% predict(x_test)
plot(x_test, y_test, type = 'l')
points(x_train, y_train)
lines(x_test, y_pred, col = 'red')
[Figure: predicted outputs versus actual ones.]

Replace the sigmoid activation with relu and fix the misplaced ) typo after 'sgd'.
EDIT
Also add a second dense layer and train for many more epochs, like this:
model <- keras_model_sequential() %>%
layer_dense(10, 'relu', input_shape = 1) %>%
layer_dense(10, 'relu') %>%
layer_dense(1)
model %>% compile(
optimizer = 'sgd',
loss = "mse"
)
history <- model %>%
fit(x = x_train, y = y_train,
epochs = 2000,
batch_size = 10,
validation_data = list(x_test, y_test)
)
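
For reference, the 'dead' ReLU state described in the question can be reproduced in base R without Keras. This is only a minimal sketch: the weights and biases below are made-up negative values, not taken from any trained model.

```r
# ReLU activation
relu <- function(z) pmax(z, 0)

x <- seq(0, 1, 0.01)             # inputs in [0, 1], as in the question
w <- c(-0.5, -1.2, -0.1)         # hypothetical negative weights
b <- c(-0.3, -0.05, -0.8)        # hypothetical negative biases

# Pre-activations: one column per hidden unit, bias added column-wise
z <- sweep(outer(x, w), 2, b, "+")
hidden <- relu(z)

# TRUE when every unit outputs 0 for every input
all(hidden == 0)
```

A unit in this state also passes back a zero gradient, which is why it tends to stay dead once it gets there.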

Related

Fixing initial weights when training Keras model in R

I want to fix the weights of the model I'm training to get reproducible results for the report. The problem is that on different runs I get similar but slightly different results and training sometimes takes 600 epochs, sometimes takes 3500 using callback_early_stopping monitoring validation MSE and with min_delta of 0.00003.
Overall, I'm happy enough with results of all runs, but just need to find if there's a way to get reproducible results by fixing weights.
I tried setting seed at various parts of the process - before creating model, before compiling it and before training but nothing seems to work. Any way to do it?
BATCH <- nrow(x_train)
SHAPE <- ncol(x_train)
# Create a neural network model
set.seed(42)
model <- keras_model_sequential()
model %>%
layer_dense(units = 12, activation = "relu", input_shape = c(SHAPE)) %>%
layer_dense(units = 24, activation = "relu") %>%
layer_dense(units = 1, activation = "linear")
#print model summary
print(summary(model))
# initialise early stopping callback and optimiser
early_stoping <- callback_early_stopping(
monitor = "val_loss",
min_delta = 0.00003,
patience = 50,
restore_best_weights = TRUE
)
optim <- optimizer_adam(learning_rate = 0.00005)
set.seed(42)
model %>% compile(
optimizer = optim,
loss = "mse",
metrics = c("mse", "mae")
)
# fit model
set.seed(42)
val_data <- list(x_val = x_val, y_val = y_val)
hist <- model %>% fit(
x = x_train,
y = y_train,
batch_size = BATCH,
epochs = 6000,
validation_data = val_data,
shuffle = FALSE,
callbacks = early_stoping
)

Deep learning, neural network

I have a question about applying a neural network to categorical data.
1- I have one output which is numeric (Connection.Duration)
2- I have 5 inputs, 4 of them (EVSE.ID, User.ID, Fee, Day) are categorical and 1 (Time) is numeric.
I want to apply a neural network to predict Connection.Duration. I do not know the correct command to use for categorical data. I used model.matrix, but I did not know how to continue with the new data frame (m) which contains the categorical data.
I would appreciate any help.
data$Fee <- as.factor(data$Fee)
data$EVSE.ID <- as.factor(data$EVSE.ID)
data$User.ID <- as.factor(data$User.ID)
data$Day <- as.factor(data$Day)
data$Time <- as.factor(data$Time)
data$Connection.Duration <- as.factor(data$Connection.Duration)
m <- model.matrix(Connection.Duration ~ EVSE.ID+Time+Day+Fee+User.ID,
data= data)
# Neural Networks
n <- neuralnet(Connection.Duration ~ EVSE.ID+Time+Day+Fee+User.ID,
data = m,
hidden=c(100,60))
# Data partition
set.seed(1234)
ind <- sample(2, nrow(m), replace = TRUE, prob = c(0.7, 0.3))
training <- m[ind==1,1:5]
testing <- m[ind==2,1:5]
trainingtarget <- m[ind==1, 6]
testingtarget <- m[ind==2, 6]
# Normalize
m <- colMeans(training)
s <- apply(training, 2, sd)
training <- scale(training, center = m, scale = s)
testing <- scale(testing, center = m, scale = s)
# Create Model
model <- keras_model_sequential()
model %>%
layer_dense(units = 5, activation = 'relu', input_shape = c(5)) %>%
layer_dense(units = 1)
# Compile
model %>% compile(loss= 'mse',
optimizer= 'rmsprop',
metrics='mae')
# Fit model
mymodel <- model %>%
fit(training,
trainingtarget,
epochs= 100,
batch_size = 32,
validation_split = 0.2)
# Evaluate
model %>% evaluate(testing, testingtarget)
pred <- model %>% predict(testing)
mean((testingtarget - pred)^2)
plot(testingtarget, pred)
# Fine-tune Model
model <- keras_model_sequential()
model %>%
layer_dense(units = 100, activation = 'relu', input_shape = c(5)) %>%
layer_dropout(rate = 0.4) %>%
layer_dense(units = 60, activation = 'relu', input_shape = c(5)) %>%
layer_dropout(rate = 0.2) %>%
layer_dense(units = 1)
# Compile
model %>% compile(loss= 'mse',
optimizer= optimizer_rmsprop(lr=0.0001),
metrics='mae')
# Fit model
mymodel <- model %>%
fit(training,
trainingtarget,
epochs= 100,
batch_size = 32,
validation_split = 0.2)
# Evaluate
model %>% evaluate(testing, testingtarget)
pred <- model %>% predict(testing)
mean((testingtarget - pred)^2)
plot(testingtarget, pred)
What you're looking for is called "one hot encoding". There are functions in tensorflow/keras to help out with the encoding.
But otherwise, I would try to do it up front. I would not rely on model.matrix as it doesn't give you quite what you want.
You can easily write your own function, but here's an example using the mltools package:
library(data.table)
library(mltools)
one_hot(data.table(x = factor(letters), n = 1:26))
Note: it requires data.table rather than data.frame but you can convert your data back and forth.
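
As a rough illustration of the "write your own function" option, here is a base R sketch. The helper name one_hot_base and the toy data frame are made up for this example; the column names only mirror those in the question.

```r
# Hypothetical helper: expand every factor column into 0/1 indicator columns
# and pass non-factor columns through unchanged.
one_hot_base <- function(df) {
  cols <- lapply(names(df), function(nm) {
    col <- df[[nm]]
    if (!is.factor(col)) {
      return(matrix(col, ncol = 1, dimnames = list(NULL, nm)))
    }
    # One indicator column per factor level
    out <- sapply(levels(col), function(lv) as.numeric(col == lv))
    out <- matrix(out, nrow = length(col))
    colnames(out) <- paste(nm, levels(col), sep = "_")
    out
  })
  do.call(cbind, cols)
}

# Toy data, invented for the example
toy <- data.frame(
  Day  = factor(c("Mon", "Tue", "Mon")),
  Fee  = factor(c("yes", "no", "no")),
  Time = c(8.5, 17.0, 12.25)
)
x <- one_hot_base(toy)
colnames(x)
```

The result is a plain numeric matrix, so the input_shape of the first layer would presumably become ncol(x) rather than the number of original columns.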

Set class weights in Keras of R when there are multiple outputs

I'm using the keras package in R to fit a neural network model. The model I'm working on has two outputs: output1 is continuous (for regression), output2 is binary (for classification).
Since we have a very imbalanced dataset for the classification problem (output2), I want to assign different class weights to deal with the imbalance, but apparently we don't need to do that for output1 (the regression).
Here is the sample code for the NN model that I'm working on:
input <- layer_input(shape = c(32,24))
output <- input %>%
layer_lstm(units = 64, dropout = 0.2, recurrent_dropout = 0.2)
pred1 <- output %>%
layer_dense(units = 1, name = "output1")
pred2 <- output %>%
layer_dense(units = 1, activation = "sigmoid", name = "output2")
model <- keras_model(
input,
list(pred1, pred2)
)
summary(model)
model %>% compile(
optimizer = "rmsprop",
loss = list(
output1 = "mse",
output2 = "binary_crossentropy"
),
loss_weights = list(
output1 = 0.25,
output2 = 10
)
)
history <- model %>% fit(
train_x, list(output1 = train_y1,output2 = train_y2),
epochs = 10,
batch_size = 5000,
class_weight = ???,
validation_data = list(valid_x, list(output1 = valid_y1,output2 = valid_y2))
)
If we just have one binary output, I know that the class weights can be assigned by:
class_weight = list("0"=1,"1"=100),
but it doesn't work anymore when we have two outputs and just want to assign the weights to one of them. I guess I may need to somehow specify the name of the binary output in "class_weight" so that it knows the weights only apply to output2, but I don't know how to do it in R.
Does anyone know how to assign class weights to the binary output only when we have two outputs (one is regression, one is classification)? Thank you very much for the help!

Model training stage: validation_data with the same data set

Could someone explain why, when I train with a validation_data set identical to the training data set, I get two curves that are different and not superimposed?
x <- matrix(rnorm(50 * 10), nrow = 50)
y <- matrix(rnorm(50), nrow = 50)
model <- keras_model_sequential()
model %>%
layer_dense(units = 1, input_shape = dim(x)[2]) %>%
layer_dropout(rate = 1) %>%
layer_activation("linear")
model %>% compile(
loss = "mse",
optimizer = "adam",
metrics = "mse"
)
history <- model %>% fit(x, y, batch_size = 1, epochs = 10, verbose = 1, validation_data = list(x, y))
plot(history)
Here are some reasons why that might happen:
The loss is calculated per minibatch and averaged during training. Between loss calculations there are gradient updates, so each minibatch loss is computed on a slightly different model. On the other hand, val_loss is calculated once the epoch's updates are done, with a single model over the whole dataset. That's why their values differ.
To put it visually, it is like this:
Epoch 1:
batch_1 -> nnet_1 -> loss_1 -> optimize nnet_1 to nnet_2
batch_2 -> nnet_2 -> loss_2 -> optimize nnet_2 to nnet_3
...
batch_n -> nnet_n -> loss_n -> optimize nnet_n to nnet_n+1
loss = (loss_1 + loss_2 + ... + loss_n) / n
val_loss = loss of nnet_n+1 over the whole dataset
Do you see how their calculations differ?
During training (when loss is calculated), dropout is enabled. After training (the validation phase, when val_loss is calculated), dropout is disabled.
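
The first point can be checked in base R without Keras. The sketch below is an assumption-laden stand-in, not Keras internals: it uses a linear model without bias, plain SGD instead of adam, and a made-up learning rate, but it follows the same bookkeeping as the diagram above.

```r
set.seed(1)
x <- matrix(rnorm(50 * 10), nrow = 50)
y <- rnorm(50)

w <- rep(0, 10)       # weights of a linear model (no bias, for brevity)
lr <- 0.01            # assumed learning rate
batch_losses <- numeric(nrow(x))

# One epoch with batch_size = 1, as in the question
for (i in seq_len(nrow(x))) {
  err <- sum(x[i, ] * w) - y[i]
  batch_losses[i] <- err^2            # loss measured BEFORE this update
  w <- w - lr * 2 * err * x[i, ]      # SGD step on the squared error
}

train_loss <- mean(batch_losses)      # average over changing models
val_loss <- mean((x %*% w - y)^2)     # final weights, whole dataset

c(loss = train_loss, val_loss = val_loss)
```

Both numbers are computed from exactly the same rows, yet they differ because train_loss mixes the losses of 50 successive models while val_loss uses only the last one.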

Validation accuracy much lower than training accuracy in Keras for text classification

I am new to Keras and trying to create a model. The issue is that my training accuracy is around 80 percent but the validation accuracy is drastically low at 15 percent. I have 545 rows in my dataset. I have normalized all the input features. Any help on what can be tweaked would be really helpful.
Sharing the complete data and code here
https://drive.google.com/open?id=1g8Cmw2bmAI9DnOU-rB4sjsOeBuFp6NUy
#Normalize data
data[,1:(ncol(data)-1)] = normalize(data[,1:(ncol(data)-1)])
data[,ncol(data)] = as.numeric(data[,ncol(data)]) - 1
set.seed(128)
ind = sample(2,nrow(data),replace = T,prob = c(0.7,0.3))
training = data[ind==1,1:(ncol(data)-1)]
test = data[ind==2,1:(ncol(data)-1)]
traintarget = data[ind==1,ncol(data)]
testtarget = data[ind==2,ncol(data)]
# One hot encoding
trainLabels = to_categorical(traintarget)
testLabels = to_categorical(testtarget)
print(testLabels)
model = keras_model_sequential()
model %>%
layer_dense(units = 150, activation = 'relu', input_shape = c(520)) %>%
layer_dense(units = 50, activation = 'relu') %>%
layer_dense(units = 9, activation = 'softmax')
model %>%
compile(loss = 'categorical_crossentropy', optimizer = 'adam',metrics = 'accuracy')
history = model %>%
fit(training,
trainLabels,
epochs = 300,
batch_size = 32,
validation_split = 0.2)
prob = model %>%
predict_proba(test)
pred = model %>%
predict_classes(test)
table2 = table(Predicted = pred, Actual = testtarget)
cbind(prob,pred,testtarget)
Simply put, when your model succeeds at training but not at validation, it is overfitting. The best ways to combat this are a) making sure that your inputs actually predict the outputs, because otherwise a large enough model will just memorize the training data, and b) adding a dropout layer to your network. Finally, 500-something training samples seems a little low for training a neural network.
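
Point a) can be illustrated in base R with a deliberately extreme toy case. This is not the asker's data or model: the features and target are random noise generated for the example, and a linear model with almost as many parameters as rows stands in for "a large enough model".

```r
set.seed(128)
n <- 60
p <- 35
x <- matrix(rnorm(n * p), nrow = n)
y <- rnorm(n)                      # target is pure noise: inputs predict nothing
df <- data.frame(y = y, x)

train <- df[1:40, ]
test  <- df[41:60, ]

# 36 coefficients fitted to 40 rows: enough capacity to memorize the noise
fit <- lm(y ~ ., data = train)

train_mse <- mean((train$y - predict(fit, train))^2)
test_mse  <- mean((test$y - predict(fit, test))^2)

c(train = train_mse, test = test_mse)
```

The training error looks good only because the model has fitted the noise; on held-out rows the error is much worse, which is the same pattern as a high training accuracy next to a low validation accuracy.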
