Hi, I'm trying to develop a neural net capable of reading handwritten numbers.
I've copied this model from the internet (ref: https://www.youtube.com/watch?v=5bso_5X7Zu4&t=643s).
My problem comes with new data: it has a high rate of misclassification (see the end of the post).
# YouTube link: https://www.youtube.com/watch?v=5bso_5X7Zu4&t=785s
# MNIST data
library(keras)
mnist <- dataset_mnist()
trainx <- mnist$train$x
trainy <- mnist$train$y
testx <- mnist$test$x
testy <- mnist$test$y
# Plot images
par(mfrow = c(3,3))
for (i in 1:9) plot(as.raster(trainx[i,,], max = 255))
par(mfrow= c(1,1))
# Five
a <- c(1, 12, 36, 48, 66, 101, 133, 139, 146)
par(mfrow = c(3,3))
for (i in a) plot(as.raster(trainx[i,,], max = 255))
par(mfrow= c(1,1))
# Reshape & rescale
trainx <- array_reshape(trainx, c(nrow(trainx), 784))
testx <- array_reshape(testx, c(nrow(testx), 784))
trainx <- trainx / 255
testx <- testx / 255
# One hot encoding
trainy <- to_categorical(trainy, 10)
testy <- to_categorical(testy, 10)
# Model
model <- keras_model_sequential()
model %>%
layer_dense(units = 512, activation = 'relu', input_shape = c(784)) %>%
layer_dropout(rate = 0.4) %>%
layer_dense(units = 256, activation = 'relu') %>%
layer_dropout(rate = 0.3) %>%
layer_dense(units = 10, activation = 'softmax')
# Compile
model %>%
compile(loss = 'categorical_crossentropy',
optimizer = optimizer_rmsprop(),
metrics = 'accuracy')
# Fit model
history <- model %>%
fit(trainx,
trainy,
epochs = 30,
batch_size = 32,
validation_split = 0.2)
# Evaluation and Prediction - Test data
model %>% evaluate(testx, testy)
pred <- model %>% predict(testx) %>% k_argmax() %>% as.integer() # predict_classes() is deprecated
table(Predicted = pred, Actual = mnist$test$y)
prob <- model %>% predict(testx) # class probabilities (predict_proba() is deprecated)
cbind(prob, Predicted_class = pred, Actual = mnist$test$y)[1:5,]
# New data
library(EBImage)
temp <- list.files(pattern = "\\.jpg$")
mypic <- list()
for (i in 1:length(temp)) {mypic[[i]] <- readImage(temp[[i]])}
par(mfrow = c(4,4))
for (i in 1:length(temp)) plot(mypic[[i]])
for (i in 1:length(temp)) {colorMode(mypic[[i]]) <- Grayscale}
for (i in 1:length(temp)) {mypic[[i]] <- 1-mypic[[i]]}
for (i in 1:length(temp)) {mypic[[i]] <- resize(mypic[[i]], 28, 28)}
str(mypic)
par(mfrow = c(4,5))
for (i in 1:length(temp)) plot(mypic[[i]])
for (i in 1:length(temp)) {mypic[[i]] <- array_reshape(mypic[[i]], c(28,28,3))}
new <- NULL
for (i in 1:length(temp)) {new <- rbind(new, mypic[[i]])}
newx <- new[,1:784]
newy <- c(7,5,2,0,5,3,4,3,2,7,5,6,8,5,6)
# Prediction
pred <- model %>% predict(newx) %>% k_argmax() %>% as.integer()
pred
table(Predicted = pred, Actual = newy)
prob <- model %>% predict(newx)
cbind(prob, Predicted = pred, Actual = newy)
The problem is the classification of the new data.
I obtained the following prediction (you can download the images from here):
7 5 2 0 5 3 8 2 2 6 6 6 3 5 5
The first six numbers (the ones from the video: 7 5 2 0 5 3) are correctly classified, but the next nine (numbers written by me) show a very poor result.
I also tried to test my nnet on the training-data numbers, but it keeps failing and I don't understand why :(
Any idea why this is happening?
Thanks
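A note on a likely culprit (a guess, not confirmed): readImage() returns three-channel RGB arrays for these JPGs, and colorMode<- only changes how EBImage interprets them, so array_reshape(mypic[[i]], c(28,28,3)) still holds 2352 values, and new[,1:784] keeps a mix of channel values rather than one 28x28 grayscale image. A minimal preprocessing sketch, assuming EBImage is loaded and temp is the same file list as above:
# Collapse RGB to one gray channel so each row really is 784 grayscale values
for (i in 1:length(temp)) {
  img <- channel(readImage(temp[[i]]), "gray")  # 28x28x3 -> 28x28
  img <- 1 - img                                # MNIST digits are light on dark
  mypic[[i]] <- resize(img, 28, 28)
}
newx <- do.call(rbind, lapply(mypic, array_reshape, dim = c(1, 784)))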
I'm learning keras and would like to see the predicted numbers that are returned. The model returns a number of items, but none of them seem to be the predicted values.
df <- MASS::Boston
index <- sample(c(TRUE, FALSE), nrow(df), replace=TRUE, prob=c(0.7,0.3))
# medv is the target, so keep it out of the feature matrices
train_features <- as.matrix(df[index, names(df) != "medv"])
test_features <- as.matrix(df[!index, names(df) != "medv"])
# standardize both sets with the training-set statistics
mean <- apply(train_features, 2, mean)
sd <- apply(train_features, 2, sd)
train_data <- scale(train_features, center = mean, scale = sd)
test_data <- scale(test_features, center = mean, scale = sd)
train_targets <- df$medv[index]
test_targets <- df$medv[!index]
Here is where the model is built:
build_model <- function() {
  model <- keras_model_sequential() %>%
    layer_dense(64, activation = "relu") %>%
    layer_dense(64, activation = "relu") %>%
    layer_dense(1)
  model %>% compile(optimizer = "rmsprop",
                    loss = "mse",
                    metrics = "mse")
  model
}
Next we set up five folds, and track all_scores:
k <- 5
fold_id <- sample(rep(1:k, length.out = nrow(train_data)))
num_epochs <- 100
all_scores <- numeric()
for (i in 1:k) {
  cat("Processing fold #", i, "\n")
  val_indices <- which(fold_id == i)
  val_data <- train_data[val_indices, ]
  val_targets <- train_targets[val_indices]
  partial_train_data <- train_data[-val_indices, ]
  partial_train_targets <- train_targets[-val_indices]
  model <- build_model()
  model %>% fit(
    partial_train_data,
    partial_train_targets,
    epochs = num_epochs,
    batch_size = 16,
    verbose = 0
  )
  results <- model %>%
    evaluate(val_data, val_targets, verbose = 0)
  all_scores[[i]] <- results[['mse']]
}
keras.RMSE <- sqrt(mean(all_scores))
However, none of the variables seem to hold the predicted values. A few examples:
all_scores is a set of MSE scores (which I also want)
val_targets appears to have the wrong dimensions
model$fit does not return a value or set of values
model$predict generates predicted values, but those have already been generated, and I can't locate them.
How are the predicted values returned in a keras model?
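One way to see them (a minimal sketch, assuming the model and validation variables from the last fold are still in scope): fit() only returns a training history, so the predicted values have to be requested explicitly with predict().
# predict() returns a one-column matrix of predicted medv values
val_preds <- model %>% predict(val_data)
head(cbind(Predicted = as.numeric(val_preds), Actual = val_targets))
# an RMSE computed from the predictions themselves
sqrt(mean((as.numeric(val_preds) - val_targets)^2))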
I am trying to write a program based on deep networks to read handwritten numbers. I found code on YouTube (https://www.youtube.com/watch?v=5bso_5X7Zu4) which works there, but it does not work for me. The problem is that I get an error when I try to predict my handwritten number (namely the number 5) that I drew in Windows Paint.
My number in Paint, saved as number5.jpg, is:
My complete code is:
library("pkgdown")
library(keras)
# devtools::install_github("rstudio/reticulate")
mnist <- dataset_mnist()
# str(mnist)
trainx <- mnist$train$x
trainy <- mnist$train$y
testx <- mnist$test$x
testy <- mnist$test$y
table(mnist$train$y)
table(mnist$test$y)
# plot images
windows()
par(mfrow = c(3,3))
for (i in 1:9) plot(as.raster(trainx[i,,], max=255))
trainx[2,,]
windows()
hist(trainx[1,,])
# Reshape & rescale
trainx <- array_reshape(trainx, c(nrow(trainx), 784))
testx <- array_reshape(testx, c(nrow(testx), 784))
trainx <- trainx / 255
testx <- testx / 255
#windows()
#hist(trainx[1,])
# One hot encoding
trainy <- to_categorical(trainy, 10)
testy <- to_categorical(testy, 10)
# trainx <- as.matrix(trainx)
# trainy <- as.matrix(trainy)
# Model
model <- keras_model_sequential()
model %>%
layer_dense(units = 128, activation = 'relu', input_shape = c(784)) %>%
layer_dropout(rate = 0.3) %>%
layer_dense(units = 64, activation = 'relu') %>%
layer_dropout(rate = 0.2) %>%
layer_dense(units = 10, activation = 'softmax')
summary(model)
# Compile
model %>%
compile(loss = 'categorical_crossentropy',
optimizer = optimizer_rmsprop(),
metrics = 'accuracy')
# Fit model
history <- model %>%
fit(trainx,
trainy,
epochs = 30,
batch_size = 32,
validation_split = 0.2)
plot(history)
# Evaluation and Prediction - Test data
model %>% evaluate(testx, testy)
pred <- model %>% predict(testx) %>% k_argmax() %>% as.integer()
prob <- model %>% predict(testx)
cbind(Predicted_class = pred, Actual = mnist$test$y)[1:150,]
# New data
#install.packages("BiocManager")
#BiocManager::install("EBImage")
library(EBImage)
setwd("C:/Users/hofo/Arbeidsmapper/Documents/NLP/NLP_BOSTOTTE/JPG_filer")
mypic <- readImage("number5.jpg")
mypic <- resize(mypic, 28, 28)
mypic <- array_reshape(mypic, c(28, 28, 3))
new <- NULL
new <- rbind(new, mypic)
str(new)
newx <- new[1:1,1:784]
newy <- c(5)
pred <- model %>% predict(newx) %>% k_argmax() %>% as.integer() %>% .[1:9]
The error when I run the last row is:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: in user code:
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\keras\engine\training.py:1586 predict_function *
return step_function(self, iterator)
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\keras\engine\training.py:1576 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1286 run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2849 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
C:\Users\hofo\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3632 _call_for_each_replica
return fn(*args, **kwargs)
Can you also please help me with this?
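A hedged guess at a fix (not verified against the original file): readImage() returns a three-channel array for a colour JPG, so the array_reshape(mypic, c(28, 28, 3)) above produces 2352 values, and new[1:1, 1:784] then hands predict() a bare vector instead of the 1 x 784 matrix the model expects. A sketch that collapses the channels first:
mypic <- readImage("number5.jpg")
mypic <- channel(mypic, "gray")   # drop the RGB channels: HxWx3 -> HxW
mypic <- 1 - mypic                # MNIST digits are light on a dark background
mypic <- resize(mypic, 28, 28)
newx <- array_reshape(mypic, c(1, 784))
pred <- model %>% predict(newx) %>% k_argmax() %>% as.integer()
pred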
This is a rather lengthy one, so please bear with me; unfortunately, the error occurs right at the very end... I cannot predict on the unseen test set!
I would like to perform text classification with word embeddings (that I have trained on my data set) embedded into neural networks.
I simply have a column with textual descriptions as the input and four different price classes as the target.
For a reproducible example, here are the necessary data set and the word embedding:
DF: https://www.dropbox.com/s/it0jsbv8e7nkryt/DF.csv?dl=0
WordEmb: https://www.dropbox.com/s/ia5fmio2e0plwkr/WordEmb.txt?dl=0
And here my code:
library(keras)
library(stringi)  # for stri_count() below
set.seed(2077)
DF = read.delim("DF.csv", header = TRUE, sep = ",",
                dec = ".", stringsAsFactors = FALSE)
DF <- DF[,-1]
# parameters
max_num_words = 9000 # simply see number of observations
validation_split = 0.3
embedding_dim = 300
##### Data Preparation #####
# split into training and test set
set.seed(2077)
n <- nrow(DF)
shuffled <- DF[sample(n),]
# Split the data in train and test
train <- shuffled[1:round(0.7 * n),]
test <- shuffled[(round(0.7 * n) + 1):n,]
rm(n, shuffled)
# predictor/target variable
x_train <- train$Description
x_test <- test$Description
y_train <- train$Price_class
y_test <- test$Price_class
### encode target variable ###
# One hot encode training target values
trainLabels <- to_categorical(y_train)
trainLabels <- trainLabels[, 2:5]
# One hot encode test target values
testLabels <- keras::to_categorical(y_test)
testLabels <- testLabels[, 2:5]
### encode predictor variable ###
# pad sequences
tokenizer <- text_tokenizer(num_words = max_num_words)
# finally, vectorize the text samples into a 2D integer tensor
set.seed(2077)
tokenizer %>% fit_text_tokenizer(x_train)
train_data <- texts_to_sequences(tokenizer, x_train)
# use the tokenizer fitted on the training texts for the test set as well
test_data <- texts_to_sequences(tokenizer, x_test)
# determine average length of document -> set as maximal sequence length
seq_mean <- stri_count(train_data, regex="\\S+")
mean((seq_mean))
max_sequence_length = 70
# This turns our lists of integers into a 2D integer tensor of shape `(samples, maxlen)`
x_train <- keras::pad_sequences(train_data, maxlen = max_sequence_length)
x_test <- keras::pad_sequences(test_data, maxlen = max_sequence_length)
word_index <- tokenizer$word_index
Encoding(names(word_index)) <- "UTF-8"
#### PREPARE EMBEDDING MATRIX ####
embeddings_index <- new.env(parent = emptyenv())
lines <- readLines("WordEmb.txt")
for (line in lines) {
  values <- strsplit(line, ' ', fixed = TRUE)[[1]]
  word <- values[[1]]
  coefs <- as.numeric(values[-1])
  embeddings_index[[word]] <- coefs
}
embedding_dim <- 300
embedding_matrix <- array(0,c(max_num_words, embedding_dim))
for (word in names(word_index)) {
  index <- word_index[[word]]
  if (index < max_num_words) {
    embedding_vector <- embeddings_index[[word]]
    if (!is.null(embedding_vector)) {
      embedding_matrix[index + 1, ] <- embedding_vector
    }
  }
}
##### Convolutional Neural Network #####
# load pre-trained word embeddings into an Embedding layer
# note that we set trainable = FALSE so as to keep the embeddings fixed
num_words <- min(max_num_words, length(word_index) + 1)
embedding_layer <- keras::layer_embedding(
input_dim = num_words,
output_dim = embedding_dim,
weights = list(embedding_matrix),
input_length = max_sequence_length,
trainable = FALSE
)
# train a 1D convnet with global maxpooling
sequence_input <- layer_input(shape = list(max_sequence_length), dtype='int32')
preds <- sequence_input %>%
embedding_layer %>%
layer_conv_1d(filters = 128, kernel_size = 1, activation = 'relu') %>%
layer_max_pooling_1d(pool_size = 5) %>%
layer_conv_1d(filters = 128, kernel_size = 1, activation = 'relu') %>%
layer_max_pooling_1d(pool_size = 5) %>%
layer_conv_1d(filters = 128, kernel_size = 1, activation = 'relu') %>%
layer_max_pooling_1d(pool_size = 2) %>%
layer_flatten() %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dense(units = 4, activation = 'softmax')
model <- keras_model(sequence_input, preds)
model %>% compile(
loss = 'categorical_crossentropy',
optimizer = 'adam',
metrics = c('acc')
)
model %>% keras::fit(
x_train,
trainLabels,
batch_size = 1024,
epochs = 20,
validation_split = 0.3
)
Now here is where I get stuck:
I cannot use the results of the NN to predict on the unseen test data set:
# Predict the classes for the test data
classes <- model %>% predict_classes(x_test, batch_size = 128)
I get this error:
Error in py_get_attr_impl(x, name, silent) :
AttributeError: 'Model' object has no attribute 'predict_classes'
Afterwards, I'd proceed like this:
# Confusion matrix
table(y_test, classes)
# Evaluate on test data and labels
score <- model %>% evaluate(x_test, testLabels, batch_size = 128)
# Print the score
print(score)
For now the actual accuracy does not really matter, since this is only a small example of my data set.
I know this is a long one, but any help would be very much appreciated.
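For what it's worth, a minimal workaround sketch (assuming the fitted model above): predict_classes() was only ever defined for sequential models and has been removed from recent keras versions, so take the most probable column of predict() instead. Because column 1 of to_categorical() was dropped earlier, columns 1-4 of the output correspond directly to price classes 1-4:
probs <- model %>% predict(x_test, batch_size = 128)
classes <- max.col(probs)  # index of the most probable column = price class 1-4
table(y_test, classes)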
I am having trouble with one area of code and it prevents me from finishing my research paper. I am new to Machine Learning and R, but I have learned a lot so far. Here is my code:
# Install packages and libraries
install.packages("keras")
source("http://bioconductor.org/biocLite.R")
library(keras)
library(EBImage)
# Read images
setwd('C:/Users/ebarn/Desktop/DataSet')
pics <- c('p1.jpg', 'p2.jpg', 'p3.jpg', 'p4.jpg', 'p5.jpg',
'p6.jpg','c1.jpg', 'c2.jpg', 'c3.jpg', 'c4.jpg', 'c5.jpg',
'c6.jpg')
mypic <- list()
for (i in 1:12) {mypic[[i]] <- readImage(pics[i])}
# Explore
print(mypic[[1]])
display(mypic[[1]])
display(mypic[[8]])
summary(mypic[[1]])
hist(mypic[[12]])
str(mypic)
# Resize
for (i in 1:12) {mypic[[i]] <- resize(mypic[[i]], 28, 28)}
str(mypic)
# Reshape
28*28*3 # = 2352, the length of one flattened 28x28 RGB image
for (i in 1:12) {mypic[[i]] <- array_reshape(mypic[[i]], c(28, 28, 3))}
str(mypic)
# Row Bind
trainx <- NULL
for(i in 1:5) {trainx <- rbind(trainx, mypic[[i]])}
str(trainx)
for(i in 7:11) {trainx <- rbind(trainx, mypic[[i]])}
str(trainx)
testx <- rbind(mypic[[6]], mypic[[12]])
trainy <- c(0,0,0,0,0,1,1,1,1,1)
testy <- c(0, 1)
# One Hot Encoding
trainLabels <- to_categorical(trainy)
testLabels <- to_categorical(testy)
trainLabels
# Model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(2352)) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 2, activation = 'softmax')
summary(model)
# Compile
model %>%
compile(loss = 'sparse_categorical_crossentropy',
optimizer = optimizer_rmsprop(),
metrics = c('accuracy'))
# model.add(Dense(10, activation = 'softmax'))
# Fit Model
history <- model %>%
fit(trainx, trainLabels, epochs = 30, batch_size = 32,
validation_split = 0.2)
plot(history)
# Evaluation & Prediction - train data
model %>% evaluate(trainx, trainLabels)
The Fit Model step will not print out my graph. Here is the error it gives me:
ValueError: Error when checking target: expected dense_1 to have shape (1,) but got array with shape (2,)
You are one-hot encoding the labels:
# One Hot Encoding
trainLabels <- to_categorical(trainy)
testLabels <- to_categorical(testy)
Therefore, they are no longer sparse labels, and you need to use categorical_crossentropy as the loss function instead of sparse_categorical_crossentropy. Alternatively, you can comment out the one-hot encoding lines.
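A minimal sketch of the first option, reusing the model defined in the question:
# categorical_crossentropy matches the one-hot encoded trainLabels
model %>%
  compile(loss = 'categorical_crossentropy',
          optimizer = optimizer_rmsprop(),
          metrics = c('accuracy'))
history <- model %>%
  fit(trainx, trainLabels, epochs = 30, batch_size = 32,
      validation_split = 0.2)
plot(history)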
I have a two-class classification Keras model with multi-type input data, where I predict class A or B based on 1 continuous and 3 categorical inputs. In the dummy example below, continuous1, categorical1 and categorical2 are 1D tensors, while categorical3 is a one-hot-encoded 2D tensor of shape (samples, indices) with num_index = 20. I then want to use e.g. Lime to analyze which data inputs contributed more than others to the prediction. I'm following this tutorial, but as I have multi-type input data, I encounter some difficulties.
set.seed(666)
#Dummy data
dat <- data.frame(samples=c(rep(paste0("A_",1:9800)),rep(paste0("B_",9801:10000))))
dat$label <- c(rep("A",9800),rep("B",200))
dat$continuous1 <- c(rnorm(9800, 100, 45),rnorm(200, 0, 25))
dat$continuous1[dat$continuous1<0] <- 0
dat$categorical1 <- c(rep(1:100,98),rep(1:100,2))
dat$categorical2 <- c(rep(1:98,100),rep(99:100,100))
pool_A <- factor(1:15,levels=1:20)
pool_B <- factor(16:20,levels=1:20)
dat_categorical3 <- vector("list",10000)
names(dat_categorical3) <- c(as.character(dat[dat$label=="A",]$samples),as.character(dat[dat$label=="B",]$samples))
for(i in 1:9800){
dat_categorical3[[i]] <- (as.numeric(table(sample(pool_A, 20, replace=T))) > 0) + 0L
}
for(i in 9801:10000){
dat_categorical3[[i]] <- (as.numeric(table(sample(pool_B, 20, replace=T))) > 0) + 0L
}
dat_categorical3_tensor <- do.call(rbind,dat_categorical3)
## Split data for training-validating-testing
# As we have far fewer Bs than As, each partition has to have a more or less equal number of Bs(?)
training_Bs <- sample(x=as.character(dat[dat$label=="B",]$samples), size=67, replace=F)
validating_Bs <- sample(x=setdiff(as.character(dat[dat$label=="B",]$samples),training_Bs), size=67, replace=F)
testing_Bs <- setdiff(as.character(dat[dat$label=="B",]$samples),c(training_Bs,validating_Bs))
# We also partition the data containing As equally, though this could probably be done as e.g. 80-10-10
training_As <- sample(x=as.character(dat[dat$label=="A",]$samples), size=3267, replace=F)
validating_As <- sample(x=setdiff(as.character(dat[dat$label=="A",]$samples),training_As), size=3267, replace=F)
testing_As <- setdiff(as.character(dat[dat$label=="A",]$samples),c(training_As,validating_As))
# Put together
training_dat <- dat[dat$samples %in% c(training_As,training_Bs),]
training_categorical3_tensor <- dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(training_As,training_Bs),]
#training_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(training_As,training_Bs),]))
training_labels <- ifelse(training_dat$label=="A",0,1)
training_dat2 <- as.matrix(training_dat[,c(3:5)]) #use this
validating_dat <- dat[dat$samples %in% c(validating_As,validating_Bs),]
validating_categorical3_tensor <- dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(validating_As,validating_Bs),]
#validating_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(validating_As,validating_Bs),]))
validating_labels <- ifelse(validating_dat$label=="A",0,1)
validating_dat2 <- as.matrix(validating_dat[,c(3:5)]) #use this
testing_dat <- dat[dat$samples %in% c(testing_As,testing_Bs),]
testing_categorical3_tensor <- dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(testing_As,testing_Bs),]
#testing_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(testing_As,testing_Bs),]))
testing_labels <- ifelse(testing_dat$label=="A",0,1)
testing_dat2 <- as.matrix(testing_dat[,c(3:5)]) #use this
## Keras model
library(keras)
# Input layers
all_dat_input <- layer_input(shape = 3, dtype = 'float32', name = 'all_dat_input')
categorical3_indices_input <- layer_input(shape = 20, dtype = 'float32', name = 'categorical3_input')
input_tensor <- c(all_dat_input, categorical3_indices_input)
# Output layers
all_dat_out <- all_dat_input %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.5) %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.5)
categorical3_indices_out <- categorical3_indices_input %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.5) %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.5)
output_tensor <- layer_concatenate(c(all_dat_out, categorical3_indices_out)) %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.5) %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.5) %>%
layer_dense(units=1, activation="sigmoid")
model <- keras_model(inputs=input_tensor, outputs=output_tensor)
# Compile
model %>% compile(
optimizer = "rmsprop",
loss = "binary_crossentropy",
metrics = "accuracy"
)
# Fit
history <- model %>% fit(
x=list(training_dat2,training_categorical3_tensor),
y=training_labels,
batch_size=256,
epochs=20,
validation_data=list(list(validating_dat2,validating_categorical3_tensor),validating_labels)
)
# Compare with put aside testing data
results <- model %>% evaluate(list(testing_dat2,testing_categorical3_tensor),testing_labels)
results
$loss
[1] 1.984114e-07
$acc
[1] 1
# Using the trained network to generate predictions on new data
predictions <- model %>% predict(list(testing_dat2,testing_categorical3_tensor))
head(predictions[3267:3332])
[1] 0.9999994 0.9999999 1.0000000 0.9999999 1.0000000 0.9999832
# the network correctly identifies Bs as Bs with a confidence above 99%
As the error message I get from lime::explain() seems to indicate some mismatch or missing variable names/labels, I created and tried different inputs (all generated the same error):
input_test1 <- data.frame(training_dat2,training_categorical3_tensor)
#input_test2 <- data.frame(all_dat_input=training_dat2,categorical3_indices_input=training_categorical3_tensor)
#input_test3 <- data.frame(all_dat_input=training_dat2, categorical3_input=training_categorical3_tensor)
## Feature/variable importance using Lime
library(lime)
explainer <- lime(input_test1, model, bin_continuous = F) #try different input_test1/input_test2/input_test3
explanation <- explain(input_test1, explainer=explainer, n_labels=2, n_features = 4) ##try different input_test1/input_test2/input_test3
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: No data provided for "all_dat_input". Need data for each key in: ['all_dat_input', 'categorical3_input']
All three input_tests lead to the above error. I also tried making sure that the actual data, specifically the categorical3_tensor column names, matched those of the input_test (i.e., X1, X2, X3, etc.), but that did not help either.
training_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(training_As,training_Bs),]))
validating_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(validating_As,validating_Bs),]))
testing_categorical3_tensor <- as.matrix(data.frame(dat_categorical3_tensor[rownames(dat_categorical3_tensor) %in% c(testing_As,testing_Bs),]))
Any advice/help is highly appreciated!
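One direction that may help (a hedged sketch, not a verified fix): lime hands the permuted samples back to the model as a single data.frame, which keras cannot map onto the two named inputs, hence the "No data provided for 'all_dat_input'" error. The lime package supports custom models through model_type() and predict_model() S3 methods, so a wrapper can split that data.frame back into the two matrices. The column positions below (1:3 for the continuous1/categorical1/categorical2 columns, 4:23 for the 20 categorical3 indices, matching input_test1) and the class name multi_input_keras are assumptions:
library(lime)
# hypothetical wrapper class around the fitted keras model
wrapped <- structure(list(model = model), class = "multi_input_keras")
model_type.multi_input_keras <- function(x, ...) "classification"
predict_model.multi_input_keras <- function(x, newdata, type, ...) {
  m <- as.matrix(newdata)
  p <- predict(x$model, list(m[, 1:3, drop = FALSE], m[, 4:23, drop = FALSE]))
  data.frame(A = 1 - p[, 1], B = p[, 1])  # sigmoid output = P(label "B")
}
explainer <- lime(input_test1, wrapped, bin_continuous = FALSE)
explanation <- explain(input_test1[1:5, ], explainer, n_labels = 1, n_features = 4)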