Error when running hyperparameter tuning job with SageMaker locally

I was trying to run a hyperparameter tuning job locally on my machine using the sample code below.
tuner = HyperparameterTuner(estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=4,
                            max_parallel_jobs=2,
                            objective_type=objective_type,
                            base_tuning_job_name="hpo-tuning-demo")
tuner.fit(inputs=channels)
This gives an error: AttributeError: 'LocalSagemakerClient' object has no attribute 'create_hyper_parameter_tuning_job'. Updating SageMaker and boto3 as suggested in some other posts didn't help.
Does this mean hyperparameter tuning is not supported locally, or am I missing something?

That is correct: hyperparameter tuning is not supported locally, as the HPO functionality is not available in the SageMaker SDK's local mode.
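If a rough local search is enough, one workaround is to grid-search by hand with local-mode estimators. A minimal sketch, assuming build_estimator() is a hypothetical factory that returns a fresh local-mode Estimator and channels is the same input dict as in the question:
import itertools

learning_rates = [0.01, 0.1]   # hypothetical ranges; substitute your own
max_depths = [5, 10]

results = {}
for lr, depth in itertools.product(learning_rates, max_depths):
    est = build_estimator()                # assumed helper, not a SageMaker API
    est.set_hyperparameters(learning_rate=lr, max_depth=depth)
    est.fit(inputs=channels)               # each training job runs locally, sequentially
    results[(lr, depth)] = est.model_data  # path to the trained model artifacts
This loses the tuner's Bayesian search and parallelism, but it exercises the same training code before you submit a real tuning job.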

Related

rstan in a jupyter notebook - kernel problems

I've been trying to run some Stan models in a Jupyter notebook using rstan with the IRkernel. I set up an environment for this using conda. I believe I have installed all the necessary packages. I can run ordinary R functions without problems, but when I try to create a model using something like
model <- stan(model_code = code, data = dat)
the kernel just dies without any further explanation. The command line output is
memset [0x0x7ffa665b6e3b+11835]
RtlFreeHeap [0x0x7ffa665347b1+81]
free_base [0x0x7ffa640cf05b+27]
(No symbol) [0x0x7ffa2f723b44]
[I 15:25:11.757 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel 658481b8-0c64-4612-9cad-1f199dabce3a restarted
which I do not know how to interpret. This happens 100% of the time, even with toy models. I can run the models just fine in RStudio. Could this be a memory issue? I don't experience this problem training deep learning models in tensorflow, for reference.
Thanks in advance for any help.

unable to specify master_type in MLEngineTrainingOperator

I am using Airflow to schedule a pipeline that trains a scikit-learn model on AI Platform. I use this DAG to train it:
with models.DAG(JOB_NAME,
                schedule_interval=None,
                default_args=default_args) as dag:
    # Tasks definition
    training_op = MLEngineTrainingOperator(
        task_id='submit_job_for_training',
        project_id=PROJECT,
        job_id=job_id,
        package_uris=[os.path.join(TRAINER_BIN)],
        training_python_module=TRAINER_MODULE,
        runtime_version=RUNTIME_VERSION,
        region='europe-west1',
        training_args=[
            '--base-dir={}'.format(BASE_DIR),
            '--event-date=20200212',
        ],
        python_version='3.5')
    training_op
The training package loads the desired CSV files and trains a RandomForestClassifier on them.
This works fine until the number and size of the files increase. Then I get this error:
ERROR - The replica master 0 ran out-of-memory and exited with a non-zero status of 9(SIGKILL). To find out more about why your job exited please check the logs:
The total size of the files is around 4 GB. I don't know which machine type is used by default, but it seems insufficient. Hoping it would solve the memory consumption issue, I tried changing the classifier's n_jobs parameter from -1 to 1, with no more luck.
Looking at the code of MLEngineTrainingOperator and the documentation, I added a custom scale_tier and a master_type of n1-highmem-8 (8 vCPUs and 52 GB of RAM), like this:
with models.DAG(JOB_NAME,
                schedule_interval=None,
                default_args=default_args) as dag:
    # Tasks definition
    training_op = MLEngineTrainingOperator(
        task_id='submit_job_for_training',
        project_id=PROJECT,
        job_id=job_id,
        package_uris=[os.path.join(TRAINER_BIN)],
        training_python_module=TRAINER_MODULE,
        runtime_version=RUNTIME_VERSION,
        region='europe-west1',
        master_type="n1-highmem-8",
        scale_tier="custom",
        training_args=[
            '--base-dir={}'.format(BASE_DIR),
            '--event-date=20200116',
        ],
        python_version='3.5')
    training_op
This resulted in another error:
ERROR - <HttpError 400 when requesting https://ml.googleapis.com/v1/projects/MY_PROJECT/jobs?alt=json returned "Field: master_type Error: Master type must be specified for the CUSTOM scale tier.">
I don't know what is wrong, but it appears this is not the way to do it.
EDIT: Using the command line I managed to launch the job:
gcloud ai-platform jobs submit training training_job_name --packages=gs://path/to/package/package.tar.gz --python-version=3.5 --region=europe-west1 --runtime-version=1.14 --module-name=trainer.train --scale-tier=CUSTOM --master-machine-type=n1-highmem-16
However, I would like to do this in Airflow.
Any help would be much appreciated.
EDIT: My environment used an old version of Apache Airflow, 1.10.3, in which the master_type argument was not present. Updating to 1.10.6 solved the issue.
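A quick way to check whether an installed version supports the argument (a sketch, assuming the Airflow 1.10.x contrib import path):
import inspect
from airflow.contrib.operators.mlengine_operator import MLEngineTrainingOperator

# On Airflow >= 1.10.6, 'master_type' should appear among the listed parameters.
print(inspect.signature(MLEngineTrainingOperator.__init__))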

how can I maximize the GPU usage of Tensorflow 2.0 from R (with Keras library)?

I use R with Keras and TensorFlow 2.0 on the GPU.
After connecting a second monitor to my GPU, I started receiving an error while running a deep learning script. I concluded that the GPU is short of memory, and a solution seems to be this code:
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True # dynamically grow the memory used on the GPU
config.log_device_placement = True # to log device placement (on which device the operation ran)
# (nothing gets printed in Jupyter, only if you run it standalone)
sess = tf.Session(config=config)
set_session(sess) # set this TensorFlow session as the default session for Keras
According to this post:
https://github.com/tensorflow/tensorflow/issues/7072#issuecomment-422488354
However, this code is not accepted by R. It reports an unexpected token from TensorFlow:
Error in tf.ConfigProto() : could not find function "tf.ConfigProto"
It seems that TensorFlow 2.0 does not accept this code, if I understand correctly from this post:
https://github.com/tensorflow/tensorflow/issues/33504
Does anyone know how I can maximize the GPU usage from my R script with the Keras library and TensorFlow 2.0?
Thank you!
To enable GPU memory growth from R with TensorFlow 2.0 (using either keras or tensorflow), you need to find the correct functions in the tf object.
First, find your GPU device:
library(tensorflow)
gpu <- tf$config$experimental$get_visible_devices('GPU')[[1]]
Then enable memory growth for that device:
tf$config$experimental$set_memory_growth(device = gpu, enable = TRUE)
You can find more relevant functions by typing tf$config$experimental$ and then using tab autocomplete in RStudio.
Since these functions are labeled as experimental, they will likely change or move location in the future.
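For comparison, the equivalent setting in Python with TensorFlow 2.x (the API that the R tf object wraps) is:
import tensorflow as tf

# Must run before the GPU is first used, or the setting has no effect.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)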

Unable to deploy R model using Rstudio on google cloud platform

I am using RStudio on Google Cloud Compute Engine and following the examples at
https://tensorflow.rstudio.com/keras/
My final objective is to deploy an R model to AI Platform and get predictions from it. I have tried many examples using keras, tfestimators, and tensorflow, but none of them run to completion. They all get through training, but fail when it is time to call export_savedmodel(). Local prediction and evaluation work fine.
model %>% evaluate(x_test, y_test) # works fine in RStudio
model %>% predict_classes(x_test) # works fine in RStudio
I want my model version to appear in the AI Platform console.
Issues:
After completing the training, I am unable to export the model to a GCS bucket, as the command for this fails.
export_savedmodel(model, "savedmodel")
Error message:
Error in export_savedmodel.keras.engine.training.Model(model,"savedmodel") :
'export_savedmodel()' is currently unsupported under the TensorFlow
Keras implementation, consider using 'tfestimators::keras_model_to_estimator()'.
Then I changed my code to the following, but I still get an error:
library(tfestimators)
tfe_model <- tfestimators::keras_model_to_estimator(model)
export_savedmodel(tfe_model, "savedmodel")
Error:
Error in export_savedmodel.tf_estimator(tfe_model, "savedmodel") :
Currently only classifier and regressor are supported. Please specify a
custom serving_input_receiver_fn.
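For orientation, a serving_input_receiver_fn in Python with the TF 1.x estimator API looks roughly like this (a sketch; the feature name 'x' and its shape are placeholders, not the model's actual inputs):
import tensorflow as tf

def serving_input_receiver_fn():
    # Placeholder feature spec; replace 'x' and the shape with your model's inputs.
    features = {'x': tf.placeholder(tf.float32, shape=[None, 784], name='x')}
    return tf.estimator.export.ServingInputReceiver(features, features)

# estimator.export_savedmodel('savedmodel', serving_input_receiver_fn)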
What I need:
How can I fix the issue? Any guidance on how to deploy R models on Google Cloud Platform would be appreciated.

Kernel Density Estimation (KDE) in GME (aka Hawth's Tools) Not Working

I've been trying to produce Kernel Density Estimates using the "kde" tool from Geospatial Modeling Environment (GME; see its documentation on kde), but I keep getting the following error no matter what valid input I provide:
Code:
kde(in="C:\Users\Richard\Desktop\KDE_Scripting_Local\kde.gdb!BB_90sJAN",
    out="C:\Users\Richard\Desktop\KDE_Scripting_Local\kde.gdb!KDE_BB90sJAN",
    bandwidth="100000", cellsize=6000, kernel="QUARTIC",
    ext="C:\Users\Richard\Desktop\KDE_Scripting_Local\kde.gdb!rect_extent");
Error message:
Error: The command text could not be interpreted. Please check the syntax of the command. Error: An important error has occurred. Please include the information below if you submit a query about this error.
Exception from HRESULT: 0x8004025A
The most frustrating part is that I had this exact code working last week. I tried restarting, reinstalling GME, copying the input to a new GDB as suggested elsewhere, and running the command via subprocess from Python 2.7. Everything still produces this error with the same HRESULT.
I'm running GME 0.7.3.0, ArcGIS for Desktop 10.2.2, R 3.1.1, and Python 2.7 on Windows 7. There's not much community support for GME, so any help here would be much appreciated.
