I am trying to make a prediction using a classifier in a Jupyter notebook

the error:
TypeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_720\1161286573.py in
----> 1 dtree.fit(X_train, y_train)
TypeError: fit() missing 1 required positional argument: 'y'
my code:
dtree.fit(X_train, y_train)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state=19)
print("X_train shape:", X_train.shape)
print("y_train shape:", y_train.shape)
print("X_test shape:", X_test.shape)
print("y_test shape:", y_test.shape)
from sklearn import tree
dtree = tree.DecisionTreeClassifier
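The message suggests that fit was called on the class itself rather than on an instance: the line that creates dtree has no parentheses, so X_train is bound to self and y_train to X, leaving y without a value. The sketch below reproduces this with a hypothetical stand-in class (TinyClassifier is made up for illustration and is not part of scikit-learn), so it runs without any third-party package.

```python
class TinyClassifier:
    """Hypothetical stand-in for an estimator class; not part of scikit-learn."""

    def fit(self, X, y):
        # Remember the training data and return self, as estimators usually do.
        self.X_, self.y_ = X, y
        return self


X_train = [[0.0], [1.0]]
y_train = [0, 1]

# Mistake: no parentheses, so model_class is the class itself, not an instance.
model_class = TinyClassifier
try:
    # X_train is bound to self, y_train to X, and nothing is left for y.
    model_class.fit(X_train, y_train)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Fix: call the class to create an instance, then fit the instance.
model = TinyClassifier()
fitted = model.fit(X_train, y_train)
```

In the question's code the corresponding change would be dtree = tree.DecisionTreeClassifier(), with parentheses, before calling dtree.fit(X_train, y_train).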


R Error in py_get_attr_impl(x, name, silent) : AttributeError: module 'tensorflow' has no attribute 'placeholder'

I am trying to implement autoencoder dimensionality reduction with TensorFlow in R, as in this example:
library(dimRed)
library(tensorflow)
fraud_data <- read.csv("fraud_data")
data_label <- fraud_data["class"]
my_formula <- as.formula("class ~ .")
dat <- as.dimRedData(my_formula, fraud_data)
dimen <- NULL
dimension_params <- NULL
dimen <- dimRed::AutoEncoder()
dimension_params <- dimen@stdpars
dimension_params$ndim <- 2
emb <- dimen@fun(fraud_data, dimension_params)
dimensional_data <- data.frame(emb@data@data)
x11()
plot(x=dimensional_data[,1], y=dimensional_data[,2], col=data_label, main="Laplacian Eigenmaps Projection")
legend(x=legend_pos, legend = unique(data_label), col=unique(data_label), pch=1)
I keep getting "AttributeError: module 'tensorflow' has no attribute 'placeholder'", as shown in this traceback:
14. stop(structure(list(message = "AttributeError: module 'tensorflow' has no attribute 'placeholder'",
call = py_get_attr_impl(x, name, silent), cppstack = NULL), class = c("Rcpp::exception",
"C++Error", "error", "condition")))
13. py_get_attr_impl(x, name, silent)
12. py_get_attr(x, name)
11. py_get_attr_or_item(x, name, TRUE)
10. `$.python.builtin.object`(x, name)
9. `$.python.builtin.module`(tf, "placeholder")
8. tf$placeholder
7. graph_params(d_in = ncol(indata), n_hidden = n_hidden, activation = activation,
weight_decay = weight_decay, learning_rate = learning_rate,
n_steps = n_steps, ndim = ndim)
6. eval(substitute(expr), data, enclos = parent.frame())
5. eval(substitute(expr), data, enclos = parent.frame())
4. with.default(pars, {
graph_params(d_in = ncol(indata), n_hidden = n_hidden, activation = activation,
weight_decay = weight_decay, learning_rate = learning_rate,
n_steps = n_steps, ndim = ndim) ...
3. with(pars, {
graph_params(d_in = ncol(indata), n_hidden = n_hidden, activation = activation,
weight_decay = weight_decay, learning_rate = learning_rate,
n_steps = n_steps, ndim = ndim) ...
2. dimen@fun(dat, dimension_params)
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'tensorflow' has no attribute 'placeholder'
The common solution is to disable TensorFlow 2 behaviour, as described in "Tensorflow 2.0 - AttributeError: module 'tensorflow' has no attribute 'Session'". I tried to do this through reticulate, as follows:
library(reticulate)
x <- import("tensorflow.compat.v1", as="tf")
x$disable_v2_behavior()
but this doesn't change anything, and I still get the AttributeError. How should I properly make this change to TensorFlow from R?
Here is Sample Data used for the example: https://drive.google.com/file/d/1Yt4V1Ir00fm1vQ9futziWbwjUE9VvYK7/view?usp=sharing
Digging deeper, I found that tf is the R tensorflow module object (?tf is a valid command after library(tensorflow)). Since TensorFlow 2, tf$placeholder is no longer available and tf$compat$v1$placeholder has to be used instead, so I had the idea of copying the features available in tf$compat$v1 onto tf:
tf_synchronize <- function(){
  library(tensorflow)
  rm(list=c("tf")) #Delete first if there any tf variable in Global Environment
  tf_compat_names <- names(tf$compat$v1)
  for(x in 2:length(tf_compat_names)){
    tf[[tf_compat_names[x]]] <- tf$compat$v1[[tf_compat_names[x]]]
  }
}
After running this, the AttributeError no longer occurs and the autoencoder from dimRed executes successfully.

Custom Evaluation Function based on F1 for use in xgboost - Python API

I have written the following custom evaluation function to use with xgboost, in order to optimize F1. Unfortunately, it raises an exception when run with xgboost.
The evaluation function is the following:
def F1_eval(preds, labels):
    t = np.arange(0, 1, 0.005)
    f = np.repeat(0, 200)
    Results = np.vstack([t, f]).T
    P = sum(labels == 1)
    for i in range(200):
        m = (preds >= Results[i, 0])
        TP = sum(labels[m] == 1)
        FP = sum(labels[m] == 0)
        if (FP + TP) > 0:
            Precision = TP/(FP + TP)
        Recall = TP/P
        if (Precision + Recall >0) :
            F1 = 2 * Precision * Recall / (Precision + Recall)
        else:
            F1 = 0
        Results[i, 1] = F1
    return(max(Results[:, 1]))
Below I provide a reproducible example along with the error message:
from sklearn import datasets
Wine = datasets.load_wine()
X_wine = Wine.data
y_wine = Wine.target
y_wine[y_wine == 2] = 1
X_wine_train, X_wine_test, y_wine_train, y_wine_test = train_test_split(X_wine, y_wine, test_size = 0.2)
clf_wine = xgb.XGBClassifier(max_depth=6, learning_rate=0.1,silent=False, objective='binary:logistic', \
booster='gbtree', n_jobs=8, nthread=None, gamma=0, min_child_weight=1, max_delta_step=0, \
subsample=0.8, colsample_bytree=0.8, colsample_bylevel=1, reg_alpha=0, reg_lambda=1)
clf_wine.fit(X_wine_train, y_wine_train,\
eval_set=[(X_wine_train, y_wine_train), (X_wine_test, y_wine_test)], eval_metric=F1_eval, early_stopping_rounds=10, verbose=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-453-452852658dd8> in <module>()
12 clf_wine = xgb.XGBClassifier(max_depth=6, learning_rate=0.1,silent=False, objective='binary:logistic', booster='gbtree', n_jobs=8, nthread=None, gamma=0, min_child_weight=1, max_delta_step=0, subsample=0.8, colsample_bytree=0.8, colsample_bylevel=1, reg_alpha=0, reg_lambda=1)
13
---> 14 clf_wine.fit(X_wine_train, y_wine_train,eval_set=[(X_wine_train, y_wine_train), (X_wine_test, y_wine_test)], eval_metric=F1_eval, early_stopping_rounds=10, verbose=True)
15
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\sklearn.py in fit(self, X, y, sample_weight, eval_set, eval_metric, early_stopping_rounds, verbose, xgb_model, sample_weight_eval_set)
519 early_stopping_rounds=early_stopping_rounds,
520 evals_result=evals_result, obj=obj, feval=feval,
--> 521 verbose_eval=verbose, xgb_model=None)
522
523 self.objective = xgb_options["objective"]
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks, learning_rates)
202 evals=evals,
203 obj=obj, feval=feval,
--> 204 xgb_model=xgb_model, callbacks=callbacks)
205
206
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
82 # check evaluation result.
83 if len(evals) != 0:
---> 84 bst_eval_set = bst.eval_set(evals, i, feval)
85 if isinstance(bst_eval_set, STRING_TYPES):
86 msg = bst_eval_set
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\core.py in eval_set(self, evals, iteration, feval)
957 if feval is not None:
958 for dmat, evname in evals:
--> 959 feval_ret = feval(self.predict(dmat), dmat)
960 if isinstance(feval_ret, list):
961 for name, val in feval_ret:
<ipython-input-383-dfb8d5181b18> in F1_eval(preds, labels)
11
12
---> 13 P = sum(labels == 1)
14
15
TypeError: 'bool' object is not iterable
I do not understand why the function is not working. I have followed the examples here: https://github.com/dmlc/xgboost/blob/master/demo/guide-python/custom_objective.py
I would like to understand where I err.
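xgboost calls the evaluation function as feval(self.predict(dmat), dmat) (see the traceback above), so the second argument is a DMatrix, not an array of labels. Python therefore evaluates labels == 1 to a single Boolean, and sum(labels == 1) raises TypeError: 'bool' object is not iterable.
The function sum expects an iterable object, such as a list. Here's an example of your error: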
In[32]: sum(True)
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-32-6eb8f80b7f2e>", line 1, in <module>
sum(True)
TypeError: 'bool' object is not iterable
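To make the missing step concrete, here is a small sketch of why the comparison collapses to a single Boolean. OpaqueContainer is a hypothetical stand-in for an object that, like the DMatrix in the traceback appears to, does not define an element-wise ==; it is not an xgboost class.

```python
import numpy as np


class OpaqueContainer:
    """Hypothetical stand-in for an object that does not define element-wise ==."""

    def __init__(self, labels):
        self._labels = np.asarray(labels)

    def get_label(self):
        # Mimics the accessor used in the answers: returns the label array.
        return self._labels


container = OpaqueContainer([0, 1, 1, 0])

# Default object comparison yields one bool, not an element-wise mask.
comparison = (container == 1)

try:
    sum(comparison)
    message = None
except TypeError as exc:
    message = str(exc)

# Extracting the array first gives an element-wise mask that can be summed.
n_positive = int(sum(container.get_label() == 1))
```

The last line is the pattern the answers rely on: fetch the labels with get_label() before comparing or indexing.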
If you want to use f1_score from scikit-learn, you can implement the following wrapper:
from sklearn.metrics import f1_score
import numpy as np
def f1_eval(y_pred, dtrain):
    y_true = dtrain.get_label()
    err = 1-f1_score(y_true, np.round(y_pred))
    return 'f1_err', err
The wrapper takes the predictions and a DMatrix as parameters, and returns a (string, float) pair.
# Setting your classifier
clf_wine = xgb.XGBClassifier(max_depth=6, learning_rate=0.1,silent=False, objective='binary:logistic', \
booster='gbtree', n_jobs=8, nthread=None, gamma=0, min_child_weight=1, max_delta_step=0, \
subsample=0.8, colsample_bytree=0.8, colsample_bylevel=1, reg_alpha=0, reg_lambda=1)
# When you fit, add eval_metric=f1_eval
# Please don't forget to insert all the .fit arguments required
clf_wine.fit(eval_metric=f1_eval)
The xgboost demo linked in the question shows how to implement a custom objective function and a custom evaluation metric.
It contains the following code:
# user defined evaluation function, return a pair metric_name, result
# NOTE: when you do customized loss function, the default prediction value is margin
# this may make builtin evaluation metric not function properly
# for example, we are doing logistic loss, the prediction is score before logistic transformation
# the builtin evaluation error assumes input is after logistic transformation
# Take this in mind when you use the customization, and maybe you need write customized evaluation function
def evalerror(preds, dtrain):
    labels = dtrain.get_label()
    # return a pair metric_name, result
    # since preds are margin(before logistic transformation, cutoff at 0)
    return 'error', float(sum(labels != (preds > 0.0))) / len(labels)
This shows that an evaluation function receives (preds, dtrain) as arguments, where dtrain is a DMatrix, and returns a (string, float) pair: the name of the metric and its value.
Adding a working Python code example:
import numpy as np
def _F1_eval(preds, labels):
    t = np.arange(0, 1, 0.005)
    f = np.repeat(0, 200)
    results = np.vstack([t, f]).T
    # assuming labels only containing 0's and 1's
    n_pos_examples = sum(labels)
    if n_pos_examples == 0:
        raise ValueError("labels not containing positive examples")
    for i in range(200):
        pred_indexes = (preds >= results[i, 0])
        TP = sum(labels[pred_indexes])
        FP = len(labels[pred_indexes]) - TP
        precision = 0
        recall = TP / n_pos_examples
        if (FP + TP) > 0:
            precision = TP / (FP + TP)
        if (precision + recall > 0):
            F1 = 2 * precision * recall / (precision + recall)
        else:
            F1 = 0
        results[i, 1] = F1
    return (max(results[:, 1]))
if __name__ == '__main__':
    labels = np.random.binomial(1, 0.75, 100)
    preds = np.random.random_sample(100)
    print(_F1_eval(preds, labels))
And if you want _F1_eval to work with xgboost's evaluation interface, add this:
def F1_eval(preds, dtrain):
    res = _F1_eval(preds, dtrain.get_label())
    return 'f1_err', 1-res
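The same threshold sweep can also be written without the explicit loop. The following is a sketch under stated assumptions, not the answer's own method: best_f1 and f1_err are illustrative names, LabelHolder is a hypothetical stand-in that only mimics the get_label() accessor of a DMatrix so the example runs with NumPy alone, and labels are assumed to contain only 0s and 1s.

```python
import numpy as np


class LabelHolder:
    """Hypothetical stand-in for a DMatrix: only get_label() is mimicked."""

    def __init__(self, labels):
        self._labels = np.asarray(labels, dtype=float)

    def get_label(self):
        return self._labels


def best_f1(preds, labels, thresholds=np.arange(0, 1, 0.005)):
    """Return the largest F1 score over the given decision thresholds."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels).astype(bool)
    # predicted[i, j] is True when sample j is positive at threshold i.
    predicted = preds[None, :] >= thresholds[:, None]
    tp = (predicted & labels).sum(axis=1)
    fp = (predicted & ~labels).sum(axis=1)
    fn = (~predicted & labels).sum(axis=1)
    # F1 = 2TP / (2TP + FP + FN); defined as 0 where the denominator is 0.
    denom = 2 * tp + fp + fn
    f1 = np.divide(2 * tp, denom, out=np.zeros(len(thresholds)), where=denom > 0)
    return float(f1.max())


def f1_err(preds, dtrain):
    # Same (name, value) return shape as the evaluation functions above.
    return "f1_err", 1.0 - best_f1(preds, dtrain.get_label())


name, err = f1_err([0.1, 0.2, 0.8, 0.9], LabelHolder([0, 0, 1, 1]))
```

Returning 1 - F1 keeps the convention used above, where a smaller value of the metric is better.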

openmdao 2.2.0: TypeError at setup

When running the following example code:
from openmdao.api import Problem, Group, IndepVarComp, ImplicitComponent, ScipyOptimizeDriver
class Test1Comp(ImplicitComponent):
    def setup(self):
        self.add_input('x', 0.5)
        self.add_input('design_x', 1.0)
        self.add_output('z', val=0.0)
        self.add_output('obj')
        self.declare_partials(of='*', wrt='*', method='fd', form='central', step=1.0e-4)
    def apply_nonlinear(self, inputs, outputs, resids):
        x = inputs['x']
        design_x = inputs['design_x']
        z = outputs['z']
        resids['z'] = x*z + z - 4
        resids['obj'] = (z/5.833333 - design_x)**2
if __name__ == "__main__":
    prob = Problem()
    model = prob.model = Group()
    model.add_subsystem('p1', IndepVarComp('x', 0.5))
    model.add_subsystem('d1', IndepVarComp('design_x', 1.0))
    model.add_subsystem('comp', Test1Comp())
    model.connect('p1.x', 'comp.x')
    model.connect('d1.design_x', 'comp.design_x')
    prob.driver = ScipyOptimizeDriver()
    prob.driver.options["optimizer"] = 'SLSQP'
    model.add_design_var("d1.design_x", lower=0.5, upper=1.5)
    model.add_objective('comp.obj')
    prob.setup()
    prob.run_model()
    print(prob['comp.z'])
I get:
Traceback (most recent call last):
File "C:/Users/jonat/Desktop/mockup_component3.py", line 40, in <module>
prob.setup()
File "C:\Python\openmdao\core\problem.py", line 409, in setup
model._setup(comm, 'full', mode)
File "C:\Python\openmdao\core\system.py", line 710, in _setup
self._setup_relevance(mode, self._relevant)
File "C:\Python\openmdao\core\system.py", line 1067, in _setup_relevance
self._relevant = relevant = self._init_relevance(mode)
File "C:\Python\openmdao\core\group.py", line 693, in _init_relevance
return get_relevant_vars(self._conn_global_abs_in2out, desvars, responses, mode)
File "C:\Python\openmdao\core\group.py", line 1823, in get_relevant_vars
if 'type_' in nodes[node]:
TypeError: 'instancemethod' object has no attribute '__getitem__'
Can someone explain why? I've successfully run a similar component, but without optimization, so I suspect the error comes from the optimization constructs. For example, do I have to define the objective in an ExplicitComponent?
I get a more descriptive message when I run:
KeyError: 'Variable name "comp.y" not found.'
Which just means that component "comp" doesn't have a variable named "y" (or "z").
The issue seems to have been caused by incorrect installation of OpenMDAO. I had previously tried to install by downloading a zip-file containing OpenMDAO. Now I instead installed using pip and the error disappeared.

How To resolve "NameError: name 'export_graphviz' is not defined"

I get an "export_graphviz is not defined" error.
I used the following code in a Jupyter notebook (Python 3.x):
def show_tree(tree, features, path):
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    img = misc.imread(path)
    plt.rcParams['figure.figsize'] = [20, 20]
    plt.imshow(img)
show_tree(dt, features, 'dc_tree.png')
Got the following error when calling the function show_tree
NameError Traceback (most recent call last)
<ipython-input-21-96002279767f> in <module>()
----> 1 show_tree(dt,features,'dc_tree.png')
<ipython-input-20-f73dae020a9a> in show_tree(tree, features, path)
4 def show_tree (tree, features, path):
5 f = io.StringIO()
----> 6 export_graphviz(tree, out_file = f, feature_names=features )
7 pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
8 img = misc.imread(path)
NameError: name 'export_graphviz' is not defined
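To use the export_graphviz exporter you need to import it from sklearn.tree.
Try
from sklearn import tree
and call it as tree.export_graphviz(...), or import the name directly:
from sklearn.tree import DecisionTreeClassifier, export_graphviz
Note that show_tree uses tree as a parameter name, which shadows the module inside the function, so the second form is the simpler fix here.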
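As a quick check that the import resolves the NameError, here is a minimal sketch that only exercises export_graphviz itself. It assumes scikit-learn is installed and uses the bundled iris data in place of the question's dt and features, which are not shown; rendering to PNG with pydotplus is left out.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Assumed example data; the question's own tree and feature list are not shown.
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# With out_file=None the DOT description is returned as a string
# instead of being written to a file object.
dot_source = export_graphviz(clf, out_file=None, feature_names=iris.feature_names)
print(dot_source.splitlines()[0])
```

The returned string can then be passed to a renderer, as the question does with pydotplus.graph_from_dot_data.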

Specify Input Argument with KerasRegressor

I use a Keras neural network and I would like the input dimension to be automatically set, not hardcoded like in every tutorial I have seen so far. How could I accomplish this?
My code:
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
seed = 1
X = df_input
Y = df_res
def baseline_model(x):
    # create model
    model = Sequential()
    model.add(Dense(20, input_dim=x, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_absolute_error', optimizer='adam')
    return model
inpt = len(X.columns)
estimator = KerasRegressor(build_fn=baseline_model(inpt), epochs=2, batch_size=1000, verbose=2)
estimator.fit(X,Y)
And the error I get:
Traceback (most recent call last):
File ipython-input-2-49d765e85d15, line 20, in estimator.fit(X,Y)
TypeError: call() missing 1 required positional argument: 'inputs'
build_fn expects a callable that returns a model, but baseline_model(inpt) in your code passes an already built model. I would wrap the construction in an inner function as follows:
def baseline_model(x):
    def bm():
        # create model
        model = Sequential()
        model.add(Dense(20, input_dim=x, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal'))
        # Compile model
        model.compile(loss='mean_absolute_error', optimizer='adam')
        return model
    return bm
And then define and fit the KerasRegressor as:
estimator = KerasRegressor(build_fn=baseline_model(inpt), epochs=2, batch_size=1000, verbose=2)
estimator.fit(X, Y)
This avoids having to hardcode the input dimension in baseline_model.
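The pattern at work is a closure: the outer call fixes the input dimension and returns a zero-argument callable that builds the model later. The sketch below shows the idea with plain Python only. build_model is a hypothetical stand-in that returns a dict instead of a Keras model, and n_features = 13 is an assumed value. functools.partial is shown as an equivalent way to bind the argument; whether a given wrapper version accepts a partial object is not verified here.

```python
from functools import partial


def build_model(input_dim, hidden_units=20):
    """Hypothetical stand-in for a Keras model builder; returns a plain dict."""
    return {"input_dim": input_dim, "hidden_units": hidden_units}


def make_builder(input_dim):
    # Closure: the inner function remembers input_dim and takes no arguments.
    def builder():
        return build_model(input_dim)
    return builder


n_features = 13  # assumed value; in the question this would be len(X.columns)

closure_builder = make_builder(n_features)
partial_builder = partial(build_model, n_features)

# Both are callables; the "model" is only built when they are invoked.
model_a = closure_builder()
model_b = partial_builder()
```

Passing closure_builder where a builder function is expected corresponds to build_fn=baseline_model(inpt) in the wrapped version above.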
I tried that and it works:
def create_model(max_features, num_class):
    def bm():
        model = Sequential()
        model.add(Dense(512, input_shape=(max_features,)))
        model.add(Activation('relu'))
        model.add(Dropout(0.3))
        model.add(Dense(512))
        model.add(Activation('relu'))
        model.add(Dropout(0.3))
        model.add(Dense(num_class, activation='softmax'))
        model.summary()
        model.compile(
            loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    return bm
and then
model_clf = KerasClassifier(
    build_fn=create_model(max_features, num_class), epochs=10,
    batch_size=32, verbose=2)
history = model_clf.fit(
    X_train, y_train,
    batch_size=32,
    epochs=10,
    verbose=2,
    validation_data=(X_test, y_test))
