How to include an environment when submitting an AutoML experiment in Azure Machine Learning - azure-machine-learning-studio

I use the code below to create an AutoMLConfig object and submit an experiment for classification training:
import logging

from azureml.core import Experiment, Workspace
from azureml.train.automl import AutoMLConfig

automl_settings = {
    "n_cross_validations": 2,
    "primary_metric": 'accuracy',
    "enable_early_stopping": True,
    "experiment_timeout_hours": 1.0,
    "max_concurrent_iterations": 4,
    "verbosity": logging.INFO,
}
# compute_target, train_data and label are defined earlier in my script
automl_config = AutoMLConfig(task='classification',
                             compute_target=compute_target,
                             training_data=train_data,
                             label_column_name=label,
                             **automl_settings)
ws = Workspace.from_config()
experiment = Experiment(ws, "your-experiment-name")
run = experiment.submit(automl_config, show_output=True)
I want to include my conda yml file (like below) in my experiment submission.
env = Environment.from_conda_specification(name='myenv', file_path='conda_dependencies.yml')
However, I don't see any environment parameter in the AutoMLConfig class documentation (similar to the environment parameter of ScriptRunConfig), and I can't find any example of how to do this.
I notice that after the experiment is submitted, I get a message like this:
Running on remote.
No run_configuration provided, running on aml-compute with default configuration
Is run_configuration used for specifying environment? If so, how do I provide run_configuration in my AutoML experiment run?
Thank you.

I figured out how to fix the issues associated with the SDK 1.19.0 upgrade in the AML environment I use, so there is no need for the workaround I was thinking about (i.e., passing an SDK 1.18.0 conda environment file to the AutoML experiment run). My original question no longer needs an answer; I just want to add this note in case someone else has the same question later on.
I still don't know why AutoML experiment run has no option to pass in a conda environment file. It would be nice if a reason is given in the AML documentation.
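For anyone who still wants to try the workaround mentioned above, here is a minimal sketch of generating a conda file that pins the SDK version. The helper name render_conda_spec is hypothetical, and the pip package name azureml-sdk is my assumption rather than something stated in the question. The resulting text is what you would save as conda_dependencies.yml and point Environment.from_conda_specification at; it does not answer how (or whether) AutoML accepts such an environment.

```python
def render_conda_spec(env_name, pip_packages):
    """Return the text of a minimal conda environment file.

    Hypothetical helper: it only covers an environment name and a list of
    pip-installed packages, which is enough to pin one SDK version.
    """
    lines = [
        f"name: {env_name}",
        "dependencies:",
        "  - pip",
        "  - pip:",
    ]
    lines += [f"    - {pkg}" for pkg in pip_packages]
    return "\n".join(lines) + "\n"


# Assumption: the SDK is installed from pip under the name "azureml-sdk".
spec_text = render_conda_spec("myenv", ["azureml-sdk==1.18.0"])
print(spec_text)
```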

Related

Difference between environment etc. in testthat::test_check vs testthat::test_dir

This is somewhat deep R testing question, and as such, I'm not sure if general Stack Overflow is the right place for it, or if there's an R specific forum that would be better.
Any pointers on that are welcome.
The scenario is: I have package that is using testthat and has some tests in tests/testthat and (for reasons that are important but, to be honest, I don't totally understand) there are some other tests in inst/validation that need to be run as well, as part of a validation script (i.e. the script that this post is about).
I was running test_check(pkg) in my tests folder and it was working fine, but I wasn't getting the extra tests (which makes sense). So then I switched to the following:
test_dirs <- c("tests/testthat", "inst/validation")
for (.t in test_dirs) {
  test_dir(.t)
}
Now a bunch of my tests are failing because they can't find some of the constants, etc. that are part of my package! (see note at the bottom for more details...)
So I dig in to the source code and find that test_check() actually calls testthat:::test_package_dir under the hood. Note the ::: this is an unexported function, so I don't really just want to call it in my own code.
testthat:::test_package_dir in turn calls the following, before calling test_dir() itself:
env <- test_pkg_env(package)
withr::local_options(list(topLevelEnvironment = env))
withr::local_envvar(list(
  TESTTHAT_PKG = package,
  TESTTHAT_DIR = maybe_root_dir(test_path)
))
test_dir(...
Sooooo... it seems like test_check() essentially just does some things to load the package environment (note test_pkg_env is also unexported) and then calls test_dir().
So I guess my question is: why? I've actually noticed this before with test_file() not working because it doesn't have everything in the package environment. Why do these functions not load the package environment like the other testing functions do?
Or really, my question is: is there a way to make them load it? And specifically in my case, is there a way to do what I'm trying to do (run tests in a few different directories) and have it load the package environment?
I notice this in the test_dir() docs:
env -- Environment in which to execute the tests. Expert use only.
which is set to test_env() by default. I have a feeling this is my answer, but I can't figure out how to get the package environment without basically copy/pasting a bunch of code out of functions that are hidden in :::. Perhaps I don't qualify as an "expert"...
Thanks for any insight and/or solutions!
note at the bottom:
Specifically my issue is that I have some "constants" in my aaa.R that are mostly just hard-coded strings or lists like:
SUMMARY_NAME <- "summary"
SUMMARY_COUNT <- "sum_count"
SUMMARY_PATH <- "sum_path"
SUM_REQ_COLS <- list(
  list(name = SUMMARY_NAME, type = "character"),
  list(name = SUMMARY_COUNT, type = "numeric"),
  list(name = SUMMARY_PATH, type = "character")
)
These are things that I use for checking S3 classes and other purposes so that I don't have hard-coded strings all over my code. The point is: I use some of these in my tests, which works fine for test_check() and devtools::check() and devtools::test() but dies when I try to use test_dir() or test_file() because they can't be found, presumably because the package environment isn't loaded.
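To make the suspected cause concrete, here is a small analogy in Python rather than R, so it is only a sketch of the idea and not of testthat's internals. The dictionary stands in for the package namespace, and run_test is a hypothetical stand-in for a test runner: the same test code succeeds when its scope is seeded with the package's internals and fails with a lookup error when it is not, which matches the behaviour described above.

```python
# Stand-in for the package namespace (the constants defined in aaa.R).
package_namespace = {
    "SUMMARY_NAME": "summary",
    "SUMMARY_COUNT": "sum_count",
}

# Stand-in for a test file that refers to package-internal constants.
test_source = "result = SUMMARY_NAME + '/' + SUMMARY_COUNT"


def run_test(source, env):
    """Execute test code in a fresh scope seeded from `env`."""
    scope = dict(env)  # copy so the test cannot modify the caller's env
    exec(source, scope)
    return scope["result"]


# Scope seeded with the package internals: the constants resolve.
print(run_test(test_source, package_namespace))

# Scope without the package internals: the lookup fails.
try:
    run_test(test_source, {})
except NameError as err:
    print("lookup failed:", err)
```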

Objective reg:squaredlogerror does not exist in the R implementation of xgboost?

I am using the xgboost library in R. My model seems to run fine with the default objective, reg:squarederror.
For example, this runs fine within my code:
model_regression = map2(.x = dtrain_regression, .y = nrounds, ~xgboost(.x, nrounds = .y, objective = "reg:squarederror"))
Reading the docs, there is another potential objective listed, reg:squaredlogerror. I wanted to experiment with this objective:
model_regression = map2(.x = dtrain_regression, .y = nrounds, ~xgboost(.x, nrounds = .y, objective = "reg:squaredlogerror"))
However, when I run this variation I get an error message saying that this objective is unknown.
Is it possible to use the objective reg:squaredlogerror with xgboost in R?
You want the latest xgboost. Install it with install_github; see the instructions here.
(Don't expect CRAN to have the latest version of a package, especially one under very active development like xgboost; CRAN will lag by a release cycle. Generally, the latest development build will be on GitHub.)
Try using reg:linear as objective, it will work :)
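For reference, the xgboost parameter docs describe reg:squaredlogerror as the loss 1/2 * [log(pred + 1) - log(label + 1)]^2, with labels required to be greater than -1. The sketch below writes out that loss and its first and second derivatives in plain Python so the difference from plain squared error is easy to see. The function names are mine, and this is not xgboost's own implementation.

```python
import math


def squared_log_error(pred, label):
    """Loss for one observation: 1/2 * (log(pred + 1) - log(label + 1))^2."""
    return 0.5 * (math.log1p(pred) - math.log1p(label)) ** 2


def gradient(pred, label):
    """First derivative of the loss with respect to pred."""
    return (math.log1p(pred) - math.log1p(label)) / (pred + 1.0)


def hessian(pred, label):
    """Second derivative of the loss with respect to pred."""
    return (1.0 - math.log1p(pred) + math.log1p(label)) / (pred + 1.0) ** 2


# The same absolute miss of 10 costs far less when the target is large,
# because the loss works on the log scale.
print(squared_log_error(20.0, 10.0))
print(squared_log_error(1010.0, 1000.0))
```

Because the loss compares log(pred + 1) with log(label + 1), it penalises relative rather than absolute differences, which is the usual reason to try it on targets that span several orders of magnitude.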

MXNet Time-series Example - Dropout Error when running locally

I am looking into using MXNet LSTM modelling for time-series analysis for a problem I am currently working on.
As a way of understanding how to implement this, I am following the example code given by MXNet at this link: https://mxnet.incubator.apache.org/tutorials/r/MultidimLstm.html
When running this script after downloading the necessary data to my local machine, I am able to execute the code fine until I get to the following section, which trains the model:
## train the network
system.time(model <- mx.model.buckets(symbol = symbol,
                                      train.data = train.data,
                                      eval.data = eval.data,
                                      num.round = 100,
                                      ctx = ctx,
                                      verbose = TRUE,
                                      metric = mx.metric.mse.seq,
                                      initializer = initializer,
                                      optimizer = optimizer,
                                      batch.end.callback = NULL,
                                      epoch.end.callback = epoch.end.callback))
When running this section, the following error occurs once gaining connection to the API.
Error in mx.nd.internal.as.array(nd) :
[14:22:53] c:\jenkins\workspace\mxnet\mxnet\src\operator\./rnn-inl.h:359:
Check failed: param_.p == 0 (0.2 vs. 0) Dropout is not supported at the moment.
Is there currently a problem within the MXNet R package that prevents it from running this code? I can't imagine they would provide a tutorial example for the package that is not executable.
My other thought is that it is something to do with my local device execution and connection to the API. I haven't been able to find any information about this being a problem for other users though.
Any inputs or suggestions would be greatly appreciated thanks.
It looks like you're running an old version of the R package. I think following the instructions on this page to build a recent R package should resolve this issue.

Use Azure custom-vision trained model with tensorflow.js

I've trained a model with Azure Custom Vision and downloaded the TensorFlow files for Android
(see: https://learn.microsoft.com/en-au/azure/cognitive-services/custom-vision-service/export-your-model). How can I use this with tensorflow.js?
I need a model (pb file) and weights (json file). However Azure gives me a .pb and a textfile with tags.
From my research I also understand that there are also different pb files, but I can't find which type Azure Custom Vision exports.
I found the tfjs converter. This is to convert a TensorFlow SavedModel (is the *.pb file from Azure a SavedModel?) or Keras model to a web-friendly format. However I need to fill in "output_node_names" (how do I get these?). I'm also not 100% sure if my pb file for Android is equal to a "tf_saved_model".
I hope someone has a tip or a starting point.
Just parroting what I said here to save you a click. I do hope that the option to export directly to tfjs is available soon.
These are the steps I did to get an exported TensorFlow model working for me:
Replace PadV2 operations with Pad. This Python function should do it. input_filepath is the path to the .pb model file and output_filepath is the full path of the updated .pb file that will be created.
import tensorflow as tf

def ReplacePadV2(input_filepath, output_filepath):
    # Load the frozen graph from the exported .pb file
    graph_def = tf.GraphDef()
    with open(input_filepath, 'rb') as f:
        graph_def.ParseFromString(f.read())

    # Rewrite every PadV2 node as Pad and drop its last input
    for node in graph_def.node:
        if node.op == 'PadV2':
            node.op = 'Pad'
            del node.input[-1]
            print("Replaced PadV2 node: {}".format(node.name))

    # Save the modified graph to the new .pb file
    with open(output_filepath, 'wb') as f:
        f.write(graph_def.SerializeToString())
Install tensorflowjs 0.8.6 or earlier. Converting frozen models is deprecated in later versions.
When calling the converter, set --input_format to tf_frozen_model and set output_node_names to model_outputs. This is the command I used.
tensorflowjs_converter --input_format=tf_frozen_model --output_json=true --output_node_names='model_outputs' --saved_model_tags=serve path\to\modified\model.pb folder\to\save\converted\output
Ideally, tf.loadGraphModel('path/to/converted/model.json') should now work (tested for tfjs 1.0.0 and above).
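The question also asks how to find output_node_names in general. One common heuristic, sketched here on a toy data structure rather than a real GraphDef, is to pick the nodes that no other node consumes. The function name and the toy node names are made up for illustration (only model_outputs comes from the steps above); with a real frozen graph you would build the same pairs from each node's name and input fields.

```python
def find_output_nodes(nodes):
    """Return names of nodes that no other node consumes.

    `nodes` is a list of (name, inputs) pairs, a simplified stand-in for
    the `name` and `input` fields of each node in a GraphDef.
    """
    consumed = set()
    for _, inputs in nodes:
        for ref in inputs:
            # Input references can look like "^name" (control dependency)
            # or "name:1" (a specific output slot); reduce both to "name".
            consumed.add(ref.lstrip("^").split(":")[0])
    return [name for name, _ in nodes if name not in consumed]


# Toy graph: node names are invented, except "model_outputs".
toy_graph = [
    ("Placeholder", []),
    ("conv/weights", []),
    ("conv/Conv2D", ["Placeholder", "conv/weights"]),
    ("model_outputs", ["conv/Conv2D:0"]),
]
print(find_output_nodes(toy_graph))  # ['model_outputs']
```

This is only a heuristic: a graph can contain unused leftover nodes that would also show up as unconsumed, so the result is a list of candidates to check.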
Partial answer:
Trying to achieve the same thing - here is the start of an answer - to make use of the output_node_names:
tensorflowjs_converter --input_format=tf_frozen_model --output_node_names='model_outputs' model.pb web_model
I am not yet sure how to incorporate this into the same code. Do you have anything, @Kasper Kamperman?
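One way to call the converter from Python code, as a sketch only: assemble the same command line as an argument list and hand it to subprocess. The helper names are hypothetical, the flags are the ones from the command above, and this assumes tensorflowjs_converter is on the PATH in a version that still accepts tf_frozen_model.

```python
import subprocess


def build_converter_command(model_pb, output_dir, output_node="model_outputs"):
    """Assemble the tensorflowjs_converter call as an argument list."""
    return [
        "tensorflowjs_converter",
        "--input_format=tf_frozen_model",
        f"--output_node_names={output_node}",
        model_pb,
        output_dir,
    ]


def convert(model_pb, output_dir):
    """Run the converter; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_converter_command(model_pb, output_dir), check=True)


# Show the command without running it.
print(" ".join(build_converter_command("model.pb", "web_model")))
```

Passing a list instead of a single shell string avoids quoting problems with paths that contain spaces or backslashes.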

unused argument (key = "iris.hex")

Whenever I try to run this line, or any other line that uses key (following the tutorial at http://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/Ruser/rtutorial.html):
iris.hex = h2o.uploadFile(localH2O, path = irisPath, key = "iris.hex")
I get an error saying that key is an unused argument.
This is the first time I am using H2O, and I am new to R as well. What is the purpose of key, and why do I get an error only when I use it? I was able to create a data frame with the following statements, but I would still like to understand this key error.
h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)
irisPath = system.file("extdata", "iris.csv", package = "h2o")
iris.hex = h2o.uploadFile(path = irisPath, destination_frame = "iris.hex")
iris.data.frame<- as.data.frame(iris.hex)
summary(iris.data.frame)
H2O may be very good in various areas, but unfortunately the lack of documentation and tutorials makes it really difficult to learn.
I hope they watch these types of comments and improve their documentation.
Even a single tutorial on processing the 12 GB airlines data would help a lot of enthusiastic people who really want to explore H2O.
This is a very outdated version of the H2O docs and there have been some major API changes since H2O 3.0. The latest R docs can always be found at: http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Rdoc.html
Our main docs landing page has a link to the latest R docs, Python docs, and a bunch of other links you may find useful. We also have a Google Group called h2ostream for posting new questions and searching through old questions. Welcome to H2O!
