I am making a simulation with NetLogo and the R extension. I have built a supply chain model with distributors and consumers: consumers place orders with distributors, and distributors forecast future demand and place orders with suppliers in advance to fulfill market demand. The forecast is implemented with the R extension (https://ccl.northwestern.edu/netlogo/docs/r.html) by calling the elmNN package. The model works fine when simply using "go".
However, when I conduct experiments using BehaviorSpace, I keep getting errors. If I run only a few ticks in BehaviorSpace, the model works fine, but when I launch a few hundred ticks, BehaviorSpace keeps crashing with errors such as "Extension exception: Error in R-extension: error in eval, operator is invalid for atomic vector" and "Extension exception: Error in R-extension: error in eval: cannot have attributes on CHARSXP". Sometimes BehaviorSpace simply crashes without any error message.
I assume the errors are related to compatibility issues between NetLogo, R, the R extension, and Java. I am using NetLogo 5.3.1 (64-bit), R 3.3.3 (64-bit), and rJava 0.9-8.
Model example: https://www.youtube.com/watch?v=zjQpPBgj0A8
A similar question was posted previously, but it has no answer: NetLogo BehaviorSpace crashing when using R extension
The problem turned out to be my programming style, which was not suitable for BehaviorSpace. BehaviorSpace runs experiments in parallel, so some variables were being overwritten with new information by concurrent runs. When I set "Simultaneous runs in parallel" to 1 in the BehaviorSpace dialog, everything worked fine.
I want to use R2Jags's parallel interface (jags.parallel) to speed up my computations.
The documentation states that I need to recompile the model before updating it. However, I get an error that I don't know how to handle.
Error message:
Code that raises the error:
In line 33, M[s] is used, and M is defined in the data section.
What are possible reasons that M is no longer defined when recompiling?
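For reference, the general shape of what I'm doing is roughly this (a simplified, hypothetical sketch, not my actual code):

    library(R2jags)

    # Hypothetical sketch: jags.parallel() can take the data as a character
    # vector of object names and looks those names up again later.
    y <- rnorm(20)
    M <- 4
    fit <- jags.parallel(data = c("y", "M"),
                         parameters.to.save = "mu",
                         model.file = "model.txt",  # the model references M
                         n.chains = 2, n.iter = 2000)

    # Recompiling re-resolves every name listed in `data` in the current
    # session, so if M was created inside a function (or only existed in the
    # worker processes), it is gone by now and JAGS reports it as undefined.
    recompile(fit)
    fit <- update(fit, n.iter = 2000)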
Cheers and thanks
I was running a spatstat envelope to generate simulation samples; however, it got stuck and did not run. So I attempted to close the application, but failed.
RStudio diagnostic log
Additional error message:
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
There are several typing errors in the command shown in the question. The argument rank should be nrank and the argument glocal should be global. I will assume that these were typed correctly when you ran the command.
Since global=TRUE this command will generate 2 * nsim = 198 realisations of a completely random pattern and calculate the L function for each one of them. In my experience it should take only a minute or two to compute this, unless the geometry of the window is very complicated. One hour is really extraordinary.
So I'm guessing either you have a very complicated window (so that the edge correction calculation is taking a long time) or that RStudio is hanging somehow.
Try setting correction="border" or correction="none" and see if that makes it run faster. (These are the fastest choices.) If that works, then read the help for Lest or Kest about edge corrections, and choose an edge correction that you like. If not, then try running the same command in R instead of RStudio.
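For example, a minimal version of the suggested check might look like this (X here is a stand-in pattern; substitute your own point pattern):

    library(spatstat)

    # correction = "border" is passed through to Lest and is the cheapest
    # edge correction, so this shows whether edge correction is the bottleneck.
    X <- rpoispp(100)  # stand-in: a completely random pattern on the unit square
    E <- envelope(X, Lest, nsim = 99, nrank = 1, global = TRUE,
                  correction = "border")
    plot(E)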
Sometimes R CMD check fails even though all your tests run fine when you run them manually (or using devtools::test()).
I ran into one such issue when I wanted to compare results from bootstrapping with the boot package.
I went down a rabbit hole looking for issues caused by parallel computation (done by boot) and random number generators (RNGs).
Neither of these was the answer.
In the end, the issue was trivial.
I used base::sort() to create the levels of a factor (to ensure that they would always align, even if the data arrived in a different order).
The problem is that the default sort method depends on the locale of your system, and R CMD check uses a different locale than my interactive session.
The issue resided in this:
R interactively used: LC_COLLATE=en_US.UTF-8
R CMD check used: LC_COLLATE=C
In the details of base::sort this is mentioned:
Except for method ‘"radix"’, the sort order for character vectors
will depend on the collating sequence of the locale in use:
see ‘Comparison’. The sort order for factors is the order of their
levels (which is particularly appropriate for ordered factors).
I now resolved the issue by specifying the radix sort method.
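A small illustration of the difference and of the fix (a minimal sketch):

    # Collation differs between locales; radix sorting does not.
    x <- c("a", "B")
    sort(x)                    # "a" "B" under en_US.UTF-8, but "B" "a" under LC_COLLATE=C
    sort(x, method = "radix")  # "B" "a" everywhere: radix always uses C-locale byte order

    # Locale-independent factor levels:
    f <- factor(x, levels = sort(unique(x), method = "radix"))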
Now, all works fine.
I have a rather complicated issue with my small package. Basically, I'm building a GARCH(1,1) model with the rugarch package, which is designed exactly for this purpose. It uses a chain of solvers (provided by Rsolnp and nloptr, general-purpose nonlinear optimization) and works fine. I'm testing my method with testthat against a benchmark solution, which was obtained previously by manually running the code under Windows (the main platform the package will be used on).
Now, I initially had some issues where the solution was not consistent across several consecutive runs. The difference was within the tolerance I specified for the solver (default solver = 'hybrid', as recommended by the documentation), so my guess was that it uses some sort of randomization. So I took away both the random seed and parallelization ("legitimate" sources of discrepancy), and the issue was solved: I get identical results every time under Windows. So I ran R CMD check, and testthat succeeds.
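For reference, the kind of fit involved looks roughly like this (a minimal sketch with stand-in data; my actual model, data, and tests differ):

    library(rugarch)

    # A plain GARCH(1,1) fitted with the 'hybrid' solver chain, with no
    # explicit seed and no parallel workers, so repeated runs on one
    # platform should agree exactly.
    returns <- as.numeric(diff(log(EuStockMarkets[, "DAX"])))  # stand-in data
    spec <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                       mean.model = list(armaOrder = c(0, 0)))
    fit <- ugarchfit(spec, data = returns, solver = "hybrid")
    coef(fit)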
After that I decided to automate a little, and now the build process is controlled by Travis. To my surprise, the result under Linux differs from my benchmark; the log states that
read_sequence(file_out) not equal to read_sequence(file_benchmark)
Mean relative difference: 0.00000014688
Rebuilding several times yields the same result, and the difference is always the same, which means that under Linux the solution is also consistent. As a temporary fix, I'm setting a tolerance limit depending on the platform, and the test passes (see latest builds).
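The workaround looks roughly like this (a sketch using the helper and file names from the failing test above; the tolerance values are illustrative):

    library(testthat)

    # Widen the tolerance on non-Windows platforms to absorb the ~1.5e-7
    # cross-platform difference seen in the log.
    tol <- if (.Platform$OS.type == "windows") 1e-12 else 1e-6
    expect_equal(read_sequence(file_out), read_sequence(file_benchmark),
                 tolerance = tol)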
So, to sum up:
A numeric procedure produces identical output on Windows and identical output on Linux, taken separately;
however, the two platforms' outputs differ from each other, and the difference is not caused by random seeds and/or parallelization.
I generally only care about supporting Windows and do not plan to make a public release, so this is not a big deal for my package per se. But I'm bringing it to attention, as there may be an issue with one of these quite widely used solvers.
And no, I'm not asking you to fix my code: a platform-dependent tolerance is quite ugly, but it does the job so far. The questions are:
Is there anything else that can "legitimately" (or "naturally") lead to the described difference?
Are low-level numeric routines required to produce identical results on all platforms? Can it happen I'm expecting too much?
Should I care a lot about this? Is this a common situation?
I am running some large regression models in R in a grid computing environment. As far as I know, the grid just gives me more memory and faster processors, so I think this question would also apply to those who are using R on a powerful computer.
The regression models I am running have lots of observations and several factor variables with many (tens or hundreds of) levels each. As a result, the regression can get computationally intensive. I have noticed that when I line up three regressions in one script and submit it to the grid, it exits (crashes) due to memory constraints. However, if I run them as three different scripts, it runs fine.
I'm doing some cleanup: after each model runs, I save the model object to a separate file, call rm(list=ls()) to clear all memory, then run gc() before the next model runs. Still, running all three in one script seems to crash, while breaking up the job seems to be fine.
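The pattern looks roughly like this (formulas and file names are illustrative, not my actual models):

    # Per-model cleanup as described above:
    dat  <- readRDS("data.rds")                    # load the data
    fit1 <- lm(y ~ x1 + big_factor_a + big_factor_b, data = dat)
    save(fit1, file = "fit1.RData")                # persist the model object
    rm(list = ls())                                # clear the workspace
    gc()                                           # ask R to release memory
    # ...reload the data and repeat for models 2 and 3...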
The sysadmin says that breaking it up is important, but I don't see why, if I'm cleaning up after each run. Three models in one script run in sequence anyway. Does anyone have an idea why running three individual scripts works, but running all the models in one script causes R to have memory issues?
thanks! EXL
Similar questions that are worth reading through:
Forcing garbage collection to run in R with the gc() command
Memory Usage in R
My experience has been that R isn't superb at memory management. You can try putting each regression in a function, in the hope that letting variables go out of scope works better than gc(), but I wouldn't hold your breath. Is there a particular reason you can't run each in its own batch job? More information, as Joris requested, would help as well.
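To illustrate the function idea (a sketch; the file names and formula are hypothetical):

    # Wrap each model in a function so its temporaries become unreachable
    # as soon as the call returns.
    run_model <- function(data_file, out_file) {
      dat <- readRDS(data_file)
      fit <- lm(y ~ x1 + big_factor, data = dat)
      saveRDS(fit, out_file)
      invisible(NULL)  # return nothing large; dat and fit go out of scope here
    }

    run_model("data.rds", "fit1.rds")
    gc()  # the call's temporaries are now collectable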