Essentially the same error as here, but those solutions do not provide enough information to replicate a working example: Rpy2 in a Flask App: Fatal error: unable to initialize the JIT
Within my Flask app, using the rpy2.rinterface module, whenever I initialize R I receive the same stack usage error:
import rpy2.rinterface as rinterface
from rpy2.rinterface_lib import openrlib
with openrlib.rlock:
    rinterface.initr()
Error: C stack usage 664510795892 is too close to the limit
Fatal error: unable to initialize the JIT
rinterface is the low-level R hook in rpy2, but the higher-level robjects module gives the same error. I've tried wrapping the context lock and R initialization in a Process from the multiprocessing module, but I have the same issue. The docs say that a multithreaded environment will cause problems for R: https://rpy2.github.io/doc/v3.3.x/html/rinterface.html#multithreading
But the context manager doesn't seem to prevent the issue when interfacing with R.
rlock is an instance of Python's threading.RLock. It should take care of multithreading issues.
However, multiprocessing can cause a similar issue if the embedded R is shared across child processes. The code for this demo script showing parallel processing with R and Python processes illustrates this: https://github.com/rpy2/rpy2/blob/master/doc/_static/demos/multiproc_lab.py
I think that the way around this is to configure Flask, or most likely your WSGI layer, to create isolated child processes, or to have all of your Flask processes delegate R calculations to a secondary process (created on the fly, or in a pool of processes waiting for tasks to perform).
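One possible shape for that delegation, sketched with Python's multiprocessing (not taken from rpy2's demo script; the worker and queue names are purely illustrative): the embedded R lives in a single dedicated child process, and the web processes only exchange messages with it.
import multiprocessing as mp

def r_worker(task_queue, result_queue):
    # R is initialized once, inside this process only.
    import rpy2.rinterface as rinterface
    rinterface.initr()
    import rpy2.robjects as robjects
    while True:
        expr = task_queue.get()
        if expr is None:  # sentinel: shut the worker down
            break
        result_queue.put(list(robjects.r(expr)))

if __name__ == '__main__':
    tasks, results = mp.Queue(), mp.Queue()
    worker = mp.Process(target=r_worker, args=(tasks, results))
    worker.start()
    tasks.put('sum(1:10)')
    print(results.get())   # [55.0]
    tasks.put(None)
    worker.join()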
As other answers for similar questions have implied, Flask users will need to initialize and run rpy2 outside of the WSGI context to prevent the embedded R process from crashing. I accomplished this with Celery, where workers provide an environment separate from Flask to handle requests made in R.
I used the low-level rinterface library as mentioned in the question, and wrote Celery tasks using classes:
import rpy2.rinterface as rinterface
from celery import Celery, Task

celery = Celery('tasks', backend='redis://', broker='redis://')

class Rpy2Task(Task):

    def __init__(self):
        self.name = "rpy2"

    def run(self, args):
        rinterface.initr()
        r_func = rinterface.baseenv['source']('your_R_script.R')
        r_func[0](args)

Rpy2Task = celery.register_task(Rpy2Task())
async_result = Rpy2Task.delay(args)
Calling rinterface.initr() anywhere but in the body of the task run by the worker results in the aforementioned crash. Celery is commonly paired with Redis, and I found this a useful way to support exchanging information between R and Python, but of course rpy2 also provides flexible ways of doing this.
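For completeness, the Flask side can then stay R-free and only enqueue work. A rough sketch, assuming the task module above is importable as tasks and the R result has been converted to something JSON-serializable; the route and payload names are made up:
from flask import Flask, jsonify, request
from tasks import Rpy2Task  # the registered task from the snippet above

app = Flask(__name__)

@app.route('/analyze', methods=['POST'])
def analyze():
    # The web process never initializes R; it only enqueues a Celery task.
    async_result = Rpy2Task.delay(request.json['args'])
    return jsonify({'task_id': async_result.id}), 202

@app.route('/result/<task_id>')
def result(task_id):
    res = Rpy2Task.AsyncResult(task_id)
    return jsonify({'ready': res.ready(),
                    'value': res.result if res.successful() else None})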
Related
I have a task decorator that returns an instance of multiprocessing.Queue. Upon running the DAG I get an error message stating that type multiprocessing.Queue cannot be JSON serialized; I think it cannot be loaded into an XCom. Is the only way around this to dump the queue's contents into a list and have the task return that?
I am using billiard, a fork of the Python multiprocessing package (i.e. import billiard as multiprocessing). This was suggested in this thread: https://github.com/apache/airflow/issues/14896
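A rough sketch of that list-based workaround (the task, worker, and queue names are illustrative, not from the actual DAG): drain the queue into a plain list before the task returns, so the XCom value is JSON-serializable.
import billiard as multiprocessing
from airflow.decorators import task

def worker(queue, i):
    queue.put(i * i)

@task
def produce_results():
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(queue, i)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    results = []                  # dump the queue's contents into a list...
    while not queue.empty():
        results.append(queue.get())
    return results                # ...and return the list, not the Queue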
So I'm using Hydra 1.1 and hydra-ax-sweeper==1.1.5 to manage my configuration and run some hyper-parameter optimization on the minerl environment. For this purpose, I load a lot of data into memory with multiprocessing (via PyTorch); usage peaks around 50 GB while loading and drops to 30 GB once fully loaded.
On a normal run this is not a problem (my machine has 90+ GB of RAM), and one training run finishes without any issue.
However, when I run the same code with the -m option (and hydra/sweeper: ax in the config), the code stops after about 2-3 sweeper runs, getting stuck at the data loading phase, because all of the system's memory (plus swap) is occupied.
At first I thought this was some issue with the minerl environment code, which starts Java code in a sub-process. So I tried to run my code without the environment (only the 30 GB of data), and I still have the same issue. So I suspect there is some memory leak in between the Hydra sweeper runs.
So my question is: how does the Hydra sweeper (or ax-sweeper) work in between sweeps? I always had the impression that it runs the main(cfg: DictConfig) function decorated with @hydra.main(...), takes the scalar return value (score), and runs the Bayesian optimizer with this score, with main() called like a regular function (everything inside being properly deallocated/garbage collected between each sweep run).
Is this not the case? Should I then load the data somewhere outside the main() and keep it between sweeps?
Thank you very much in advance!
The hydra-ax-sweeper may run trials in parallel, depending on the result of calling the get_max_parallelism function defined in ax.service.ax_client.
I suspect that your machine is running out of memory because of this parallelism.
Hydra's Ax plugin does not currently have a config group for configuring this max_parallelism setting, so it is automatically set by ax.
Loading the data outside of main (as you suggested) may be a good workaround for this issue.
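A minimal sketch of that idea, assuming the trials run in the same process (Hydra's basic launcher); load_training_data, train_and_evaluate, and cfg.data_path are illustrative names, not part of your code:
import functools

import hydra
from omegaconf import DictConfig

@functools.lru_cache(maxsize=1)
def load_training_data(path: str):
    # The expensive ~30 GB load happens only on the first call per process.
    ...

def train_and_evaluate(data, cfg: DictConfig) -> float:
    # Illustrative stand-in for the real training/evaluation.
    return 0.0

@hydra.main(config_path="conf", config_name="config")
def main(cfg: DictConfig) -> float:
    data = load_training_data(cfg.data_path)  # cached, reused across trials
    return train_and_evaluate(data, cfg)      # scalar score handed back to the Ax sweeper

if __name__ == "__main__":
    main()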
Hydra sweepers in general do not have a facility to control concurrency. This is the responsibility of the launcher you are using.
The built-in basic launcher runs the jobs serially so it should not trigger memory issues.
If you are using other launchers, you may need to control their parallelism via launcher-specific parameters.
In one of my projects we are using Robot Framework with custom keyword libraries in a complex test environment using assorted ECUs and PCs. One keyword library must be controlled by a Python remote server via XML-RPC, as it has to be on a different PC.
Now, all important messages from robot.api.logger calls, like logger.debug() or logger.console(), are swallowed because of the XML-RPC remote server. This is a known issue, which is also clearly stated in the docs.
For most parts these APIs work exactly like when using with Robot Framework normally. The main limitation is that logging using robot.api.logger or Python's logging module is currently not supported.
It should be possible to write a thin wrapper or decorator for robot.api.logger, so that all debug messages are also redirected to a simple text file, like:
DEBUG HH:MM:SS > Message
WARN HH:MM:SS > Message
This would be really helpful in case of problems.
Of course, it would be easy to use the built-in Python logging module, but
I'm looking for a solution that changes the least amount of already existing code, and I also want the results to be written to the text file in addition to the normal robot.api.logger output in the Robot reports, as we are using the same library both locally and remotely.
So basically I need to find a way to extend/redirect robot.api.logger calls, first using the normal Python logging module and then the normal robot.api.logger.
You can patch the write function of the robot.api.logger so it will write to a log file as well. This patching could be triggered by a library argument.
This would require you to only modify the constructor of your library.
RemoteLib.py
import sys

from robot.api import logger
from robot.output import librarylogger
from robotremoteserver import RobotRemoteServer


def write(msg, level='INFO', html=False):
    librarylogger.write(msg, level, html)
    with open('log.txt', 'a') as f:
        print(f'{level}\tHH:MM:SS > {msg}', file=f)


class RemoteLib:

    ROBOT_LIBRARY_SCOPE = 'GLOBAL'
    ROBOT_LIBRARY_VERSION = 0.1

    def __init__(self, to_file=True):
        if to_file:
            logger.write = write

    def log_something(self, msg):
        logger.info(msg)
        logger.debug(msg)
        logger.warn(msg)
        logger.trace(msg)
        logger.error(msg)
        logger.console(msg)


if __name__ == '__main__':
    RobotRemoteServer(RemoteLib(), *sys.argv[1:])
local_run.robot
*** Settings ***
Library    RemoteLib    to_file=False

*** Test Cases ***
Test
    Log Something    something
remote_run.robot
*** Settings ***
Library    Remote    http://127.0.0.1:8270

*** Test Cases ***
Test
    Log Something    something
You could use the Python logging module in the write patch as well, just as it is used in Robot Framework itself (see "Redirects the robot.api.logger to python logging if robot is not running").
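For example, a variant of the write patch above that goes through the logging module, so real timestamps and formatting replace the HH:MM:SS placeholder; the logger name, file name, and level mapping here are illustrative.
import logging

from robot.output import librarylogger

file_logger = logging.getLogger('remote_keywords')
file_logger.setLevel(logging.DEBUG)
handler = logging.FileHandler('log.txt')
handler.setFormatter(logging.Formatter('%(levelname)s\t%(asctime)s > %(message)s'))
file_logger.addHandler(handler)

# Map Robot log levels to Python logging levels.
LEVELS = {'TRACE': logging.DEBUG, 'DEBUG': logging.DEBUG, 'INFO': logging.INFO,
          'HTML': logging.INFO, 'WARN': logging.WARNING, 'ERROR': logging.ERROR}

def write(msg, level='INFO', html=False):
    librarylogger.write(msg, level, html)               # keep normal Robot logging
    file_logger.log(LEVELS.get(level, logging.INFO), msg)  # mirror to the text file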
So here's a stupid idea...
I created (many) DAG(s) in Airflow... and it works... However, I would like to package it up somehow so that I could run a single DAG Run without having Airflow installed; i.e. have it self-contained so I don't need all the web servers, databases, etc.
I mostly instantiate new DAG Runs with trigger dag anyway, and I noticed that the overhead of running Airflow appears quite high (workers have high loads while doing essentially nothing, it can sometimes take tens of seconds before dependent tasks are queued, etc.).
I'm not too bothered about all the logging etc.
You can create a script which executes Airflow operators, although this loses all the metadata that Airflow provides. You still need to have Airflow installed as a Python package, but you don't need to run any webservers, etc. A simple example could look like this:
from dags.my_dag import operator1, operator2, operator3


def main():
    # execute pipeline
    # operator1 -> operator2 -> operator3
    operator1.execute(context={})
    operator2.execute(context={})
    operator3.execute(context={})


if __name__ == "__main__":
    main()
It sounds like your main concern is the waste of resources by the idling workers more so than the waste of Airflow itself.
I would suggest running Airflow with the LocalExecutor on a single box. This will give you the benefits of concurrent execution without the hassle of managing workers.
As for the database - there is no way to remove the database component without modifying airflow source itself. One alternative would be to leverage the SequentialExecutor with SQLite, but this removes the ability to run concurrent tasks and is not recommended for production.
First, I'd say you need to tweak your Airflow setup.
But if that's not an option, then another way is to write your main logic in code outside the DAG (this is also best practice). For me this makes the code easier to test locally as well.
Writing a shell script to tie a few processes together is pretty easy.
You won't get the benefit of operators or dependencies, but you probably can script your way around it. And if you can't, just use Airflow.
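As a rough illustration of keeping the logic outside the DAG (the module and function names are made up), the same functions can be wired into Airflow operators or run directly as a plain script:
# pipeline.py -- hypothetical module with the actual work, no Airflow imports.
def extract():
    return [1, 2, 3]

def transform(rows):
    return [r * 10 for r in rows]

def load(rows):
    print(f"loaded {len(rows)} rows")

def run_pipeline():
    load(transform(extract()))

if __name__ == "__main__":
    # Runs the whole pipeline with plain Python, no scheduler required;
    # the Airflow DAG would just wrap these same functions in tasks.
    run_pipeline()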
You can overload the imported Airflow modules if they fail to import. So, for example, if you are using from airflow.decorators import dag, task, you can overload the @dag and @task decorators:
from datetime import datetime

try:
    from airflow.decorators import dag, task
except ImportError:
    mock_decorator = lambda f=None, **d: f if f else lambda x: x
    dag = mock_decorator
    task = mock_decorator


@dag(schedule=None, start_date=datetime(2022, 1, 1), catchup=False)
def mydag():

    @task
    def task_1():
        print("task 1")

    @task
    def task_2(input):
        print("task 2")

    task_2(task_1())


_dag = mydag()
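When Airflow is not installed, the ImportError branch makes @dag and @task no-ops, so the final _dag = mydag() call simply runs task_1 and then task_2 as plain functions in order; with Airflow present, the same file still defines a normal DAG. You lose scheduling, retries, and logging in the standalone case, but the task logic itself stays runnable.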
I know that I can determine programmatically whether Zope is running in debug mode (bin/instance fg) using the following:
>>> import Globals
>>> Globals.DevelopmentMode
True
Is it possible to determine if I started the daemon in interactive console mode (bin/instance debug)?
You may think that this does not make sense, but I'm having an issue with a package when I run an instance this way:
https://github.com/collective/collective.fingerpointing/issues/30
I ended up catching the resulting exception instead:
https://github.com/collective/collective.fingerpointing/pull/34