MXNet Time-series Example - Dropout Error when running locally (R)

I am looking into using MXNet LSTM modelling for time-series analysis for a problem I am currently working on.
As a way of understanding how to implement this, I am following the example code provided by MXNet at this link: https://mxnet.incubator.apache.org/tutorials/r/MultidimLstm.html
After downloading the necessary data to my local machine, I am able to execute the code fine until I get to the following section to train the model:
## train the network
system.time(model <- mx.model.buckets(symbol = symbol,
                                      train.data = train.data,
                                      eval.data = eval.data,
                                      num.round = 100,
                                      ctx = ctx,
                                      verbose = TRUE,
                                      metric = mx.metric.mse.seq,
                                      initializer = initializer,
                                      optimizer = optimizer,
                                      batch.end.callback = NULL,
                                      epoch.end.callback = epoch.end.callback))
When running this section, the following error occurs once a connection to the API has been established.
Error in mx.nd.internal.as.array(nd) :
[14:22:53] c:\jenkins\workspace\mxnet\mxnet\src\operator\./rnn-inl.h:359:
Check failed: param_.p == 0 (0.2 vs. 0) Dropout is not supported at the moment.
Is there currently an internal problem within the MXNet R package that prevents this code from running? I can't imagine they would provide a tutorial example for the package that is not executable.
My other thought is that it has something to do with my local device execution and connection to the API. I haven't been able to find any information about this being a problem for other users, though.
Any input or suggestions would be greatly appreciated, thanks.

It looks like you're running an old version of the R package. I think following the instructions on this page to build a recent R package should resolve the issue.
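If it helps to confirm which build you are on before rebuilding, a minimal check is sketched below; packageVersion() is base R, and the dropout = 0 note is only an assumption drawn from the failing check in the error message, not the documented fix.
# Confirm which mxnet build is installed; the dropout limitation in the
# RNN operator was lifted in later builds of the R package.
library(mxnet)
packageVersion("mxnet")
# Stopgap assumption (not the documented fix): the check that fails is
# "param_.p == 0", so building the symbol with dropout = 0 in the
# tutorial's rnn.graph() call should avoid the check, at the cost of
# dropping dropout regularisation during training.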

Related

Errors running Oolong validation in R on both STM and seededLDA

I'm trying to run the oolong package to validate a couple of topic models I've created, using both an STM model and a seededLDA model (this code won't be reproducible):
oolong_test1a <- witi(input_model = model_stm_byt, input_corpus = YS$body)
OR
oolong_test1a <- witi(input_model = slda_howard_docs, input_corpus = howard_df$content)
In both cases it successfully creates an oolong test in my global environment. However, when I run either the word intrusion or topic intrusion test, I get this error in both my console and my viewer:
Listening on http://127.0.0.1:7122
Warning: Error in value[[3L]]: Couldn't normalize path in `addResourcePath`, with arguments: `prefix` = 'miniUI-0.1.1.1'; `directoryPath` = 'D:/temp/RtmpAh8J5r/RLIBS_35b54642a1c09/miniUI/www'
[No stack trace available]
I couldn't find any reference to this error anywhere else. I've checked I'm running the most recent version of oolong.
I've also tried to run it on the models/corpus that comes supplied with oolong. So this code is reproducible:
oolong_test <- witi(input_model = abstracts_keyatm, input_corpus = abstracts$text, userid = "Julia")
oolong_test$do_word_intrusion_test()
oolong_test$do_topic_intrusion_test()
This generates the same errors.
There is a new version on GitHub that fixes this issue:
devtools::install_github("chainsawriot/oolong")
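After installing the GitHub build, restarting R and re-creating the test object should be enough to confirm the fix; the sketch below simply re-runs the reproducible example from the question.
# Re-create the test object with the patched package and re-run the test
# (same reproducible example as above).
library(oolong)
oolong_test <- witi(input_model = abstracts_keyatm, input_corpus = abstracts$text, userid = "Julia")
oolong_test$do_word_intrusion_test()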

Error in setDefaultClusterOptions(type = .sfOption$type) : could not find function "setDefaultClusterOptions"

I'm new here. I've been struggling with analysing some data with the BaSTA package. The data works OK after running the "DataCheck" code, but right after running the following code this happens:
multiout <- multibasta(object = datosJ, studyStart = 1999, studyEnd = 2018, model = "LO",
                       shape = "simple", niter = 20001, burnin = 2001, thinning = 100,
                       parallel = TRUE)
No problems were detected with the data.
Starting simulation to find jump sd's... done.
Multiple simulations started...
Error in setDefaultClusterOptions(type = .sfOption$type) :
could not find function "setDefaultClusterOptions"
I believe this error has something to do with the use of "parallel = TRUE", which relies on the snow package that comes bundled with BaSTA and makes the analysis run faster. If I don't use parallel, the analysis takes weeks to run, and I've been told that's not normal for the package I'm using.
Any help would be greatly appreciated, thank you.
I came across this same behavior when using another R package that depends on snowfall. setDefaultClusterOptions lives in a dependency of BaSTA, so this error message means that package is not being loaded. Try calling library(snowfall) before running the BaSTA command to see if that fixes it for you.
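A minimal sketch of that suggestion, reusing the call from the question (the argument values are the asker's own; loading snowfall explicitly is the only change):
# Load snowfall (which provides setDefaultClusterOptions) before the
# parallel BaSTA run, then call multibasta exactly as before.
library(snowfall)
library(BaSTA)
multiout <- multibasta(object = datosJ, studyStart = 1999, studyEnd = 2018, model = "LO",
                       shape = "simple", niter = 20001, burnin = 2001, thinning = 100,
                       parallel = TRUE)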

How to include an environment when submitting an AutoML experiment in Azure Machine Learning

I use code like the below to create an AutoMLConfig object and submit an experiment for classification training:
import logging

from azureml.core import Workspace, Experiment
from azureml.train.automl import AutoMLConfig

automl_settings = {
    "n_cross_validations": 2,
    "primary_metric": 'accuracy',
    "enable_early_stopping": True,
    "experiment_timeout_hours": 1.0,
    "max_concurrent_iterations": 4,
    "verbosity": logging.INFO,
}

automl_config = AutoMLConfig(task = 'classification',
                             compute_target = compute_target,
                             training_data = train_data,
                             label_column_name = label,
                             **automl_settings)

ws = Workspace.from_config()
experiment = Experiment(ws, "your-experiment-name")
run = experiment.submit(automl_config, show_output=True)
I want to include my conda yml file (like below) in my experiment submission.
env = Environment.from_conda_specification(name='myenv', file_path='conda_dependencies.yml')
However, I don't see any environment parameter in the AutoMLConfig class documentation (similar to what the environment parameter does in ScriptRunConfig), nor have I found any example of how to do so.
I notice that after the experiment is submitted, I get a message like this:
Running on remote.
No run_configuration provided, running on aml-compute with default configuration
Is run_configuration used for specifying the environment? If so, how do I provide a run_configuration in my AutoML experiment run?
Thank you.
I figured out how to fix the issues associated with the SDK 1.19.0 upgrade in the AML environment I use, so there is no need for the workaround I was considering (i.e., passing an SDK 1.18.0 conda environment file to the AutoML experiment run). My original question no longer needs an answer; I just want to add this note in case someone else has the same question later on.
I still don't know why an AutoML experiment run has no option to pass in a conda environment file. It would be nice if a reason were given in the AML documentation.

unused argument (key = "iris.hex")

Whenever I try to run this line, or any other line which uses key (following the document at http://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/Ruser/rtutorial.html):
iris.hex = h2o.uploadFile(localH2O, path = irisPath, key = "iris.hex")
I get an error saying key is an unused argument.
This is the first time I am using H2O, and I am new to R as well. Please let me know what the key argument does and why I get the error only when I run this. I could create a data frame with the following statements, but I would still like to understand the key error:
h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)
irisPath = system.file("extdata", "iris.csv", package = "h2o")
iris.hex = h2o.uploadFile(path = irisPath, destination_frame = "iris.hex")
iris.data.frame <- as.data.frame(iris.hex)
summary(iris.data.frame)
H2O may be very good in various areas, but unfortunately the lack of documentation and tutorials makes it really difficult to learn.
I hope they watch these types of comments and improve their documentation.
At the very least, publishing one tutorial on processing the 12 GB airlines data would help a lot of enthusiastic people who really want to explore H2O.
This is a very outdated version of the H2O docs and there have been some major API changes since H2O 3.0. The latest R docs can always be found at: http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Rdoc.html
Our main docs landing page has a link to the latest R docs, Python docs, and a bunch of other links you may find useful. We also have a Google Group called h2ostream for posting new questions and searching through old questions. Welcome to H2O!
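For comparison, here is a sketch of the old tutorial line against a recent H2O 3.x R client, based on the working snippet in the question: the key argument is gone, the connection object is no longer passed, and destination_frame names the frame instead.
# Recent H2O R client: no connection object argument, and the frame name
# is supplied via destination_frame rather than key.
library(h2o)
h2o.init()
irisPath <- system.file("extdata", "iris.csv", package = "h2o")
iris.hex <- h2o.uploadFile(path = irisPath, destination_frame = "iris.hex")
summary(as.data.frame(iris.hex))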

R '.doSnowGlobals' not found

I am working with the randomUniformForest package. I am trying to run the examples provided in the documentation:
library(randomUniformForest)
data(iris)
XY = iris
p = ncol(XY)
X = XY[,-p]
Y = XY[,p]
iris.ruf = randomUniformForest(Species ~., XY, threads = 1)
But I get this error:
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
7 nodes produced errors; first error: object '.doSnowGlobals' not found
I googled and found that this happens because the package tries to use parallel computing and does not find something it needs. I have never used parallel computing, so I did not understand the explanations I found, and I do not know how to fix this problem. I also read "error: object '.doSnowGlobals' not found?".
According to the manual, using "threads = 1" deactivates parallel computing, but I get the error anyway.
I have also checked that the packages parallel and doParallel are loaded.
I do not really need parallel computing, and I do not know if I am "connected" to other computers, so I am not sure if that would even work. Is there an easy way to deactivate parallel computing? Or another alternative for making this work?
The cause of the problem was that I was working on my university computer, for which I do not have administrator rights. The randomUniformForest package makes use of parallel processing, which uses the IP protocol (even when only one thread is used).
I tried the package on my private computer and it worked fine.
