Deep Neural Network Classification Problem (MNIST) in R

Hi, I'm trying to develop a neural net that is capable of reading handwritten numbers.
I've copied this model from the internet (Ref: https://www.youtube.com/watch?v=5bso_5X7Zu4&t=643s).
My problem comes with new data: it has a high rate of misclassification (see the end of the post).
# Youtube link https://www.youtube.com/watch?v=5bso_5X7Zu4&t=785s
# MNIST data
library(keras)
mnist <- dataset_mnist()
trainx <- mnist$train$x
trainy <- mnist$train$y
testx <- mnist$test$x
testy <- mnist$test$y
# Plot images
par(mfrow = c(3,3))
for (i in 1:9) plot(as.raster(trainx[i,,], max = 255))
par(mfrow= c(1,1))
# Five
a <- c(1, 12, 36, 48, 66, 101, 133, 139, 146)
par(mfrow = c(3,3))
for (i in a) plot(as.raster(trainx[i,,], max = 255))
par(mfrow= c(1,1))
# Reshape & rescale
trainx <- array_reshape(trainx, c(nrow(trainx), 784))
testx <- array_reshape(testx, c(nrow(testx), 784))
trainx <- trainx / 255
testx <- testx / 255
# One hot encoding
trainy <- to_categorical(trainy, 10)
testy <- to_categorical(testy, 10)
# Model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 512, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 256, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')
# Compile
model %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_rmsprop(),
          metrics = 'accuracy')
# Fit model
history <- model %>%
  fit(trainx,
      trainy,
      epochs = 30,
      batch_size = 32,
      validation_split = 0.2)
# Evaluation and Prediction - Test data
model %>% evaluate(testx, testy)
pred <- model %>% predict_classes(testx)
table(Predicted = pred, Actual = mnist$test$y)
prob <- model %>% predict_proba(testx)
cbind(prob, Predicted_class = pred, Actual = mnist$test$y)[1:5,]
# New data
library(EBImage)
temp = list.files(pattern = "*.jpg")
mypic <- list()
for (i in 1:length(temp)) {mypic[[i]] <- readImage(temp[[i]])}
par(mfrow = c(4,4))
for (i in 1:length(temp)) plot(mypic[[i]])
for (i in 1:length(temp)) {colorMode(mypic[[i]]) <- Grayscale}
for (i in 1:length(temp)) {mypic[[i]] <- 1-mypic[[i]]}
for (i in 1:length(temp)) {mypic[[i]] <- resize(mypic[[i]], 28, 28)}
str(mypic)
par(mfrow = c(4,5))
for (i in 1:length(temp)) plot(mypic[[i]])
for (i in 1:length(temp)) {mypic[[i]] <- array_reshape(mypic[[i]], c(28,28,3))}
new <- NULL
for (i in 1:length(temp)) {new <- rbind(new, mypic[[i]])}
newx <- new[,1:784]
newy <- c(7,5,2,0,5,3,4,3,2,7,5,6,8,5,6)
# Prediction
pred <- model %>% predict_classes(newx)
pred
table(Predicted = pred, Actual = newy)
prob <- model %>% predict_proba(newx)
cbind(prob, Predicted = pred, Actual = newy)
The problem is the classification of new data.
I have obtained the following prediction (you can download the images from here):
7 5 2 0 5 3 8 2 2 6 6 6 3 5 5
The first six numbers (the ones from the video) are correctly classified (7 5 2 0 5 3), but the next 9 (numbers written by me) show a very poor result.
I also tried to test my net with the training data numbers, but it keeps failing and I don't understand why :(
Any idea why this is happening?
Thanks
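A likely culprit is the preprocessing of the new images: each JPG is still RGB when it reaches array_reshape(), so c(28, 28, 3) yields 2352 values per picture and new[, 1:784] keeps only a slice of them; colorMode()<- merely relabels the three channels rather than combining them. MNIST digits are also size-normalized and centered in the 28x28 frame, so off-center drawings can still misclassify. A hedged sketch of the channel fix (untested, using EBImage's channel() to do the real grayscale conversion):

library(keras)
library(EBImage)
temp <- list.files(pattern = "*.jpg")
mypic <- list()
for (i in 1:length(temp)) {
  img <- readImage(temp[[i]])
  img <- channel(img, "gray")   # actually combine RGB into one channel
  img <- 1 - img                # invert: MNIST digits are white on black
  img <- resize(img, 28, 28)
  mypic[[i]] <- array_reshape(img, c(1, 784))
}
newx <- do.call(rbind, mypic)   # one row of 784 values per image
pred <- model %>% predict_classes(newx)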

Related

How to find the predicted values with Keras

I'm learning keras, and would like to see the predicted numbers that are returned. The fitted model returns a number of objects, but none of them seem to contain the predicted values.
library(keras)
df <- MASS::Boston
index <- sample(c(TRUE, FALSE), nrow(df), replace = TRUE, prob = c(0.7, 0.3))
train_features <- df[index, ]
test_features <- df[!index, ]
train_labels <- df$medv[index]
test_labels <- df$medv[!index]
# standardize both sets using the training set's statistics
mean <- apply(train_features, 2, mean)
sd <- apply(train_features, 2, sd)
train_data <- scale(train_features, center = mean, scale = sd)
test_data <- scale(test_features, center = mean, scale = sd)
train_targets <- df$medv[index]
test_targets <- df$medv[!index]
Here is where the model is built:
build_model <- function() {
  model <- keras_model_sequential() %>%
    layer_dense(64, activation = "relu") %>%
    layer_dense(64, activation = "relu") %>%
    layer_dense(1)
  model %>% compile(optimizer = "rmsprop",
                    loss = "mse",
                    metrics = "mse")
  model
}
Next we set up five folds, and track all_scores:
k <- 5
fold_id <- sample(rep(1:k, length.out = nrow(train_data)))
num_epochs <- 100
all_scores <- numeric()
for (i in 1:k) {
  cat("Processing fold #", i, "\n")
  val_indices <- which(fold_id == i)
  val_data <- train_data[val_indices, ]
  val_targets <- train_targets[val_indices]
  partial_train_data <- train_data[-val_indices, ]
  partial_train_targets <- train_targets[-val_indices]
  model <- build_model()
  model %>% fit(
    partial_train_data,
    partial_train_targets,
    epochs = num_epochs,
    batch_size = 16,
    verbose = 0
  )
  results <- model %>%
    evaluate(val_data, val_targets, verbose = 0)
  all_scores[[i]] <- results[["mse"]]
}
keras.RMSE <- sqrt(mean(all_scores))
However, none of the variables seem to hold the predicted values. A few examples:
all_scores is a set of MSE scores (which I also want, for the RMSE)
val_targets appears to have the wrong dimensions
model$fit does not return a value or set of values
model$predict generates predicted values, but those have already been generated, and I can't locate them.
How are the predicted values returned in a keras model?
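In R keras, evaluate() only returns the compiled metrics; the predicted values themselves come from predict(), called on the fitted model with a feature matrix. A minimal sketch against the fold variables above (regression, so the result is a one-column matrix with one row per input row):

# Predicted values for the current fold's validation split
val_preds <- model %>% predict(val_data)
head(cbind(Predicted = val_preds[, 1], Actual = val_targets))

# RMSE computed directly from the predictions, as a cross-check on all_scores
sqrt(mean((val_preds[, 1] - val_targets)^2))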

Reading handwritten numbers using Deep Networks with MNIST Data in R, Part 3

I am trying to write a program based on deep networks to read handwritten numbers. I found code on Youtube (https://www.youtube.com/watch?v=5bso_5X7Zu4) which works there, but it does not work for me. The problem is that I get an error when I try to predict my handwritten number (namely the number 5), which I drew in Windows Paint.
My number in Paint, whose file name is number5.jpg, is:
My complete code is:
library("pkgdown")
library(keras)
# devtools::install_github("rstudio/reticulate")
mnist <- dataset_mnist()
# str(mnist)
trainx <- mnist$train$x
trainy <- mnist$train$y
testx <- mnist$test$x
testy <- mnist$test$y
table(mnist$train$y, mnist$train$y)
table(mnist$test$y, mnist$test$y)
# plot images
windows()
par(mfrow = c(3,3))
for (i in 1:9) plot(as.raster(trainx[i,,], max=255))
trainx[2,,]
windows()
hist(trainx[1,,])
# Reshape & rescale
trainx <- array_reshape(trainx, c(nrow(trainx), 784))
testx <- array_reshape(testx, c(nrow(testx), 784))
trainx <- trainx / 255
testx <- testx / 255
#windows()
#hist(trainx[1,])
# One hot encoding
trainy <- to_categorical(trainy, 10)
testy <- to_categorical(testy, 10)
# trainx <- as.matrix(trainx)
# trainy <- as.matrix(trainy)
# Model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 128, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 10, activation = 'softmax')
summary(model)
# Compile
model %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_rmsprop(),
          metrics = 'accuracy')
# Fit model
history <- model %>%
  fit(trainx,
      trainy,
      epochs = 30,
      batch_size = 32,
      validation_split = 0.2)
plot(history)
# Evaluation and Prediction - Test data
model %>% evaluate(testx, testy)
pred <- model %>% predict(testx) %>% k_argmax() %>% as.integer()
prob <- model %>% predict(testx)
cbind(Predicted_class = pred, Actual = mnist$test$y)[1:150,]
# New data
#install.packages("BiocManager")
#BiocManager::install("EBImage")
library(EBImage)
setwd("C:/Users/hofo/Arbeidsmapper/Documents/NLP/NLP_BOSTOTTE/JPG_filer")
mypic <- readImage("number5.jpg")
mypic <- resize(mypic, 28, 28)
mypic <- array_reshape(mypic, c(28, 28, 3))
new <- NULL
new <- rbind(new, mypic)
str(new)
newx <- new[1:1,1:784]
newy <- c(5)
pred <- model %>% predict(newx) %>% k_argmax() %>% as.integer() %>% .[1:9]
The error when I run the last row is:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: in user code:
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\keras\engine\training.py:1586 predict_function *
return step_function(self, iterator)
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\keras\engine\training.py:1576 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1286 run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2849 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3632 _call_for_each_replica
return fn(*args, **kwargs)
Can you also please help me with this?
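Two things in the snippet above plausibly explain this shape error: a 3-channel JPG reshaped to c(28, 28, 3) holds 2352 values rather than 784, and new[1:1, 1:784] drops the matrix down to a plain vector, so predict() no longer receives a (1, 784) batch. A hedged sketch of a fix (untested; channel() is EBImage's grayscale conversion):

library(EBImage)
mypic <- readImage("number5.jpg")
mypic <- channel(mypic, "gray")          # collapse RGB to one channel
mypic <- 1 - mypic                       # invert: MNIST digits are white on black
mypic <- resize(mypic, 28, 28)
newx <- array_reshape(mypic, c(1, 784))  # one sample of 784 features
pred <- model %>% predict(newx) %>% k_argmax() %>% as.integer()
pred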

Text classification with own word embeddings using Neural Networks in R

This is a rather lengthy one, so please bear with me; unfortunately, the error occurs right at the very end... I cannot predict on the unseen test set!
I would like to perform text classification with word embeddings (that I have trained on my data set) embedded into neural networks.
I simply have a column with textual descriptions = input and four different price classes = target.
For a reproducible example, here are the necessary data set and the word embedding:
DF: https://www.dropbox.com/s/it0jsbv8e7nkryt/DF.csv?dl=0
WordEmb: https://www.dropbox.com/s/ia5fmio2e0plwkr/WordEmb.txt?dl=0
And here my code:
library(keras)
library(stringi)
set.seed(2077)
DF <- read.delim("DF.csv", header = TRUE, sep = ",",
                 dec = ".", stringsAsFactors = FALSE)
DF <- DF[, -1]
# parameters
max_num_words = 9000 # simply see number of observations
validation_split = 0.3
embedding_dim = 300
##### Data Preparation #####
# split into training and test set
set.seed(2077)
n <- nrow(DF)
shuffled <- DF[sample(n),]
# Split the data in train and test
train <- shuffled[1:round(0.7 * n),]
test <- shuffled[(round(0.7 * n) + 1):n,]
rm(n, shuffled)
# predictor/target variable
x_train <- train$Description
x_test <- test$Description
y_train <- train$Price_class
y_test <- test$Price_class
### encode target variable ###
# One hot encode training target values
trainLabels <- to_categorical(y_train)
trainLabels <- trainLabels[, 2:5]
# One hot encode test target values
testLabels <- keras::to_categorical(y_test)
testLabels <- testLabels[, 2:5]
### encode predictor variable ###
# pad sequences
tokenizer <- text_tokenizer(num_words = max_num_words)
# finally, vectorize the text samples into a 2D integer tensor
set.seed(2077)
tokenizer %>% fit_text_tokenizer(x_train)
train_data <- texts_to_sequences(tokenizer, x_train)
# note: refitting the tokenizer on x_test extends the word index learned from
# x_train; typically the tokenizer is fit on the training text only
tokenizer %>% fit_text_tokenizer(x_test)
test_data <- texts_to_sequences(tokenizer, x_test)
# determine average length of document -> set as maximal sequence length
seq_mean <- stri_count(train_data, regex = "\\S+")
mean(seq_mean)
max_sequence_length = 70
# This turns our lists of integers into a 2D integer tensor of shape (samples, maxlen)
x_train <- keras::pad_sequences(train_data, maxlen = max_sequence_length)
x_test <- keras::pad_sequences(test_data, maxlen = max_sequence_length)
word_index <- tokenizer$word_index
Encoding(names(word_index)) <- "UTF-8"
#### PREPARE EMBEDDING MATRIX ####
embeddings_index <- new.env(parent = emptyenv())
lines <- readLines("WordEmb.txt")
for (line in lines) {
  values <- strsplit(line, ' ', fixed = TRUE)[[1]]
  word <- values[[1]]
  coefs <- as.numeric(values[-1])
  embeddings_index[[word]] <- coefs
}
embedding_dim <- 300
embedding_matrix <- array(0, c(max_num_words, embedding_dim))
for (word in names(word_index)) {
  index <- word_index[[word]]
  if (index < max_num_words) {
    embedding_vector <- embeddings_index[[word]]
    if (!is.null(embedding_vector)) {
      embedding_matrix[index + 1, ] <- embedding_vector
    }
  }
}
##### Convolutional Neural Network #####
# load pre-trained word embeddings into an Embedding layer
# note that we set trainable = False so as to keep the embeddings fixed
num_words <- min(max_num_words, length(word_index) + 1)
embedding_layer <- keras::layer_embedding(
  input_dim = num_words,
  output_dim = embedding_dim,
  weights = list(embedding_matrix),
  input_length = max_sequence_length,
  trainable = FALSE
)
# train a 1D convnet with global maxpooling
sequence_input <- layer_input(shape = list(max_sequence_length), dtype='int32')
preds <- sequence_input %>%
  embedding_layer %>%
  layer_conv_1d(filters = 128, kernel_size = 1, activation = 'relu') %>%
  layer_max_pooling_1d(pool_size = 5) %>%
  layer_conv_1d(filters = 128, kernel_size = 1, activation = 'relu') %>%
  layer_max_pooling_1d(pool_size = 5) %>%
  layer_conv_1d(filters = 128, kernel_size = 1, activation = 'relu') %>%
  layer_max_pooling_1d(pool_size = 2) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 4, activation = 'softmax')
model <- keras_model(sequence_input, preds)
model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = 'adam',
  metrics = c('acc')
)
model %>% keras::fit(
  x_train,
  trainLabels,
  batch_size = 1024,
  epochs = 20,
  validation_split = 0.3
)
Now here is where I get stuck:
I cannot use the results of the NN to predict on the unseen test data set:
# Predict the classes for the test data
classes <- model %>% predict_classes(x_test, batch_size = 128)
I get this error:
Error in py_get_attr_impl(x, name, silent) :
AttributeError: 'Model' object has no attribute 'predict_classes'
Afterwards, I'd proceed like this:
# Confusion matrix
table(y_test, classes)
# Evaluate on test data and labels
score <- model %>% evaluate(x_test, testLabels, batch_size = 128)
# Print the score
print(score)
For now the actual accuracy does not really matter, since this is only a small example of my data set.
I know this is a long one, but any help would be very much appreciated.
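As the error itself says, predict_classes() does not exist on a functional keras_model() (it was only ever defined for sequential models and has been removed from recent Keras versions). The usual route is predict() plus an argmax. A minimal sketch under the set-up above (note: the labels were one-hot encoded from classes 1-4 with the zero column dropped, so column j of the probability matrix corresponds to price class j):

# Class probabilities, then the most probable class per row
probs <- model %>% predict(x_test, batch_size = 128)
classes <- max.col(probs)  # columns 1..4 map back to Price_class 1..4
table(y_test, classes)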

Shape error in image classification model in Keras R

I am having trouble with one area of code, and it is preventing me from finishing my research paper. I am new to machine learning and R, but I have learned a lot so far. Here is my code:
# Install packages and libraries
install.packages("keras")
source("http://bioconductor.org/biocLite.R")
library(keras)
library(EBImage)
# Read images
setwd('C:/Users/ebarn/Desktop/DataSet')
pics <- c('p1.jpg', 'p2.jpg', 'p3.jpg', 'p4.jpg', 'p5.jpg', 'p6.jpg',
          'c1.jpg', 'c2.jpg', 'c3.jpg', 'c4.jpg', 'c5.jpg', 'c6.jpg')
mypic <- list()
for (i in 1:12) {mypic[[i]] <- readImage(pics[i])}
# Explore
print(mypic[[1]])
display(mypic[[1]])
display(mypic[[8]])
summary(mypic[[1]])
hist(mypic[[12]])
str(mypic)
# Resize
for (i in 1:12) {mypic[[i]] <- resize(mypic[[i]], 28, 28)}
str(mypic)
# Reshape
28 * 28 * 3  # = 2352
for (i in 1:12) {mypic[[i]] <- array_reshape(mypic[[i]], c(28, 28, 3))}
str(mypic)
# Row Bind
trainx <- NULL
for(i in 1:5) {trainx <- rbind(trainx, mypic[[i]])}
str(trainx)
for(i in 7:11) {trainx <- rbind(trainx, mypic[[i]])}
str(trainx)
testx <- rbind(mypic[[6]], mypic[[12]])
trainy <- c(0,0,0,0,0,1,1,1,1,1)
testy <- c(0, 1)
# One Hot Encoding
trainLabels <- to_categorical(trainy)
testLabels <- to_categorical(testy)
trainLabels
# Model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(2352)) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 2, activation = 'softmax')
summary(model)
# Compile
model %>%
  compile(loss = 'sparse_categorical_crossentropy',
          optimizer = optimizer_rmsprop(),
          metrics = c('accuracy'))
# model.add(Dense(10, activation = 'softmax'))
# Fit Model
history <- model %>%
  fit(trainx, trainLabels, epochs = 30, batch_size = 32,
      validation_split = 0.2)
plot(history)
# Evaluation & Prediction - train data
model %>% evaluate(trainx, trainLabels)
The fit step fails, so my graph is never printed. Here is the error it gives me:
ValueError: Error when checking target: expected dense_1 to have shape (1,) but got array with shape (2,)
You are one-hot encoding the labels:
# One Hot Encoding
trainLabels <- to_categorical(trainy)
testLabels <- to_categorical(testy)
Therefore, they are no longer sparse labels, and you need to use categorical_crossentropy as the loss function instead of sparse_categorical_crossentropy. Alternatively, you can comment out the one-hot encoding lines and fit on the integer labels directly.
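For concreteness, a short sketch of the two options named above (standard Keras usage, not specific to this data set):

# Option 1: keep the one-hot labels and switch the loss
model %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_rmsprop(),
          metrics = c('accuracy'))
history <- model %>%
  fit(trainx, trainLabels, epochs = 30, batch_size = 32, validation_split = 0.2)

# Option 2: keep sparse_categorical_crossentropy and fit on the integer labels
history <- model %>%
  fit(trainx, trainy, epochs = 30, batch_size = 32, validation_split = 0.2)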

Feature/variable importance for Keras model using Lime

I have a two-class classification Keras model with multi-type input data, where I predict class A or B based on 1 continuous and 3 categorical inputs. In the dummy example below, continuous1, categorical1 and categorical2 are 1D tensors, and categorical3 is a 2D tensor of shape (samples, indices) with length num_index = 20, one-hot encoded. I then want to use e.g. Lime to analyze which data inputs contributed more than others to the prediction. I'm following this tutorial, but as I have multi-type input data, I encounter some difficulties.
set.seed(666)
#Dummy data
dat <- data.frame(samples=c(rep(paste0("A_",1:9800)),rep(paste0("B_",9801:10000))))
dat$label <- c(rep("A",9800),rep("B",200))
dat$continuous1 <- c(rnorm(9800, 100, 45),rnorm(200, 0, 25))
dat$continuous1[dat$continuous1<0] <- 0
dat$categorical1 <- c(rep(1:100,98),rep(1:100,2))
dat$categorical2 <- c(rep(1:98,100),rep(99:100,100))
pool_A <- factor(1:15,levels=1:20)
pool_B <- factor(16:20,levels=1:20)
dat_categorical3 <- vector("list",10000)
names(dat_categorical3) <- c(as.character(dat[dat$label=="A",]$samples),as.character(dat[dat$label=="B",]$samples))
for (i in 1:9800) {
  dat_categorical3[[i]] <- (as.numeric(table(sample(pool_A, 20, replace = TRUE))) > 0) + 0L
}
for (i in 9801:10000) {
  dat_categorical3[[i]] <- (as.numeric(table(sample(pool_B, 20, replace = TRUE))) > 0) + 0L
}
dat_categorical3_tensor <- do.call(rbind,dat_categorical3)
## Split data for training-validating-testing
# As we have far fewer Bs than As, each partition has to have a more or less equal number of Bs(?)
training_Bs <- sample(x=as.character(dat[dat$label=="B",]$samples), size=67, replace=F)
validating_Bs <- sample(x=setdiff(as.character(dat[dat$label=="B",]$samples),training_Bs), size=67, replace=F)
testing_Bs <- setdiff(as.character(dat[dat$label=="B",]$samples),c(training_Bs,validating_Bs))
# We also partition the data containing As equally, though this could probably be done in e.g. 80-10-10
training_As <- sample(x=as.character(dat[dat$label=="A",]$samples), size=3267, replace=F)
validating_As <- sample(x=setdiff(as.character(dat[dat$label=="A",]$samples),training_As), size=3267, replace=F)
testing_As <- setdiff(as.character(dat[dat$label=="A",]$samples),c(training_As,validating_As))
# Put together
training_dat <- dat[dat$samples %in% c(training_As,training_Bs),]
training_categorical3_tensor <- dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(training_As,training_Bs),]
#training_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(training_As,training_Bs),]))
training_labels <- ifelse(training_dat$label=="A",0,1)
training_dat2 <- as.matrix(training_dat[,c(3:5)]) #use this
validating_dat <- dat[dat$samples %in% c(validating_As,validating_Bs),]
validating_categorical3_tensor <- dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(validating_As,validating_Bs),]
#validating_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(validating_As,validating_Bs),]))
validating_labels <- ifelse(validating_dat$label=="A",0,1)
validating_dat2 <- as.matrix(validating_dat[,c(3:5)]) #use this
testing_dat <- dat[dat$samples %in% c(testing_As,testing_Bs),]
testing_categorical3_tensor <- dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(testing_As,testing_Bs),]
#testing_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(testing_As,testing_Bs),]))
testing_labels <- ifelse(testing_dat$label=="A",0,1)
testing_dat2 <- as.matrix(testing_dat[,c(3:5)]) #use this
## Keras model
library(keras)
# Input layers
all_dat_input <- layer_input(shape = 3, dtype = 'float32', name = 'all_dat_input')
categorical3_indices_input <- layer_input(shape = 20, dtype = 'float32', name = 'categorical3_input')
input_tensor <- c(all_dat_input, categorical3_indices_input)
# Output layers
all_dat_out <- all_dat_input %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5)
categorical3_indices_out <- categorical3_indices_input %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5)
output_tensor <- layer_concatenate(c(all_dat_out, categorical3_indices_out)) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 1, activation = "sigmoid")
model <- keras_model(inputs=input_tensor, outputs=output_tensor)
# Compile
model %>% compile(
  optimizer = "rmsprop",
  loss = "binary_crossentropy",
  metrics = "accuracy"
)
# Fit
history <- model %>% fit(
  x = list(training_dat2, training_categorical3_tensor),
  y = training_labels,
  batch_size = 256,
  epochs = 20,
  validation_data = list(list(validating_dat2, validating_categorical3_tensor), validating_labels)
)
# Compare with put aside testing data
results <- model %>% evaluate(list(testing_dat2,testing_categorical3_tensor),testing_labels)
results
$loss
[1] 1.984114e-07
$acc
[1] 1
# Using the trained network to generate predictions on new data
predictions <- model %>% predict(list(testing_dat2,testing_categorical3_tensor))
head(predictions[3267:3332])
[1] 0.9999994 0.9999999 1.0000000 0.9999999 1.0000000 0.9999832
# the network correctly identifies Bs as Bs with a confidence above 99%
As the error message I get from lime::explain() seems to indicate a mismatch or missing variable names/labels, I created and tried different inputs (all generated the same error):
input_test1 <- data.frame(training_dat2,training_categorical3_tensor)
#input_test2 <- data.frame(all_dat_input=training_dat2,categorical3_indices_input=training_categorical3_tensor)
#input_test3 <- data.frame(all_dat_input=training_dat2, categorical3_input=training_categorical3_tensor)
## Feature/variable importance using Lime
library(lime)
explainer <- lime(input_test1, model, bin_continuous = F) #try different input_test1/input_test2/input_test3
explanation <- explain(input_test1, explainer=explainer, n_labels=2, n_features = 4) ##try different input_test1/input_test2/input_test3
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: No data provided for "all_dat_input". Need data for each key in: ['all_dat_input', 'categorical3_input']
All three input_test objects lead to the above error. I also tried making sure that the actual data, specifically the categorical3_tensor column names, matched those of the input_test (i.e., X1, X2, X3 etc.), but that did not help either.
training_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(training_As,training_Bs),]))
validating_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(validating_As,validating_Bs),]))
testing_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(testing_As,testing_Bs),]))
Any advice/help is highly appreciated!
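One hedged workaround, as an untested sketch: lime passes a single data.frame to the model's predict method, but a multi-input Keras model needs a named list of matrices, hence the 'No data provided for "all_dat_input"' error. lime's documented extension point is to define model_type() and predict_model() S3 methods for a wrapper class that splits the flat data.frame back into the two inputs; the class name multi_input_keras and the 3/20 column split below are my assumptions, not lime API names:

library(lime)

# Wrap the fitted model so lime dispatches to our own predict method
wrapped <- structure(list(model = model), class = "multi_input_keras")

model_type.multi_input_keras <- function(x, ...) "classification"

predict_model.multi_input_keras <- function(x, newdata, type, ...) {
  m <- as.matrix(newdata)
  # Split the flat data.frame back into the two input tensors (assumed order:
  # 3 columns for all_dat_input, then 20 for categorical3_input)
  p <- predict(x$model, list(m[, 1:3, drop = FALSE], m[, 4:23, drop = FALSE]))
  data.frame(A = 1 - p[, 1], B = p[, 1])  # sigmoid output = P(class B)
}

explainer <- lime(input_test1, wrapped, bin_continuous = FALSE)
explanation <- explain(input_test1[1:5, ], explainer, n_labels = 1, n_features = 4)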
