I have a JavaFX app containing two ListViews that display incoming customer orders received from my server (rendered with a custom cell factory). I also have a few TableViews displaying information from a Postgres database (these are spread across a few tabs inside a TabPane).
The user has to take an order (by clicking on it) and enter some information in text boxes.
The application was initially written and deployed using Java 7. I had no problems whatsoever.
Recently I decided to switch to Java 8. I modified my code to use lambdas and added a few extras to the app:
a Timeline that checks and displays order status every minute, inside a TextFlow;
the custom cell factory class now uses an external CSS file, with setId instead of setStyle;
...
Now the application runs fine at first, but after 2-3 hours of uptime it becomes sluggish. Since it is hard for me to reproduce the behavior inside a profiler, I used jstack and top -H, matching the thread PID reported by top with the nid in the thread dump, to find out what is happening.
This way I found out that the culprit was QuantumRenderer, with over 95% CPU usage:
  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
30300 utilizat+ 20   0 5801608 527412 39696 S 95,1  6,5 60:57.34 java

"QuantumRenderer-0" #9 daemon prio=5 os_prio=0 tid=0x00007f4f182bb800 nid=0x765c runnable [0x00007f4eeb2a1000]
   java.lang.Thread.State: RUNNABLE
        at com.sun.prism.es2.X11GLDrawable.nSwapBuffers(Native Method)
        at com.sun.prism.es2.X11GLDrawable.swapBuffers(X11GLDrawable.java:50)
        at com.sun.prism.es2.ES2SwapChain.present(ES2SwapChain.java:186)
        at com.sun.javafx.tk.quantum.PresentingPainter.run(PresentingPainter.java:107)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at com.sun.javafx.tk.RenderJob.run(RenderJob.java:58)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at com.sun.javafx.tk.quantum.QuantumRenderer$PipelineRunnable.run(QuantumRenderer.java:125)
        at java.lang.Thread.run(Thread.java:745)
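For anyone repeating the PID/nid matching: top -H prints thread IDs in decimal, while jstack prints the same ID as a hexadecimal nid. The following is a minimal Python sketch of that lookup; the function names are made up for illustration, and it assumes the jstack header line format shown in the dump above.

```python
import re


def nid_for_thread_id(tid):
    """Format a decimal thread ID from `top -H` the way jstack prints nid."""
    return hex(tid)


def find_thread_name(jstack_output, tid):
    """Return the name of the thread whose nid matches tid, or None."""
    wanted = nid_for_thread_id(tid)
    for line in jstack_output.splitlines():
        # Thread header lines look like: "name" #9 daemon ... nid=0x765c runnable
        match = re.match(r'"([^"]+)".*\bnid=(0x[0-9a-f]+)', line)
        if match and match.group(2) == wanted:
            return match.group(1)
    return None
```

With the values above, thread ID 30300 corresponds to nid=0x765c, which is the QuantumRenderer-0 thread.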
The machine running the application uses a 64-bit version of Lubuntu.
I can't figure out where I should look to find out what the problem is...
It seems your renderer is using the X11 pipeline (Java2D?), which could be the cause of the high CPU usage (software rendering). Does your graphics card support hardware acceleration?
Try getting more information with -Dprism.verbose=true. If your graphics card does support hardware acceleration, you might want to try forcing it with -Dprism.forceGPU=true. Also try enabling the OpenGL pipeline to improve rendering performance with
-Dprism.order=es2,es1,sw,j2d
(you could also try the old Java2D flag -Dsun.java2d.opengl=true, but I think that won't affect Prism).
I would also recommend taking a look at the OpenJFX performance tips and tricks checklist. I've seen high CPU usage in nodes that was somewhat reduced by using Node.setCache(true) together with a CacheHint when any kind of animation is involved (with the downside that this uses more memory).
Also, take a look at how you are updating your UI from your worker threads. It's important to do as little work as possible on the JavaFX Application Thread, and to update the UI from your workers correctly and only when necessary. Take a look at this other question to learn more about the javafx.concurrent.Task class and how to use it correctly to update the UI from worker threads.
This looks much more like a software rendering issue, and -Dprism.verbose=true should tell you more, but following the other suggestions never hurts! Hope this helps!
The Hazelcast cluster runs inside an application deployed on Kubernetes. I can't see any traces of partitioning or other problems in the logs. At some point, this exception starts to appear in the logs:
hz.dazzling_morse.partition-operation.thread-1 com.hazelcast.logging.StandardLoggerFactory$StandardLogger: app-name, , , , , - [172.30.67.142]:5701 [app-name] [4.1.5] Executor is shut down.
java.util.concurrent.RejectedExecutionException: Executor is shut down.
        at com.hazelcast.scheduledexecutor.impl.operations.AbstractSchedulerOperation.checkNotShutdown(AbstractSchedulerOperation.java:73)
        at com.hazelcast.scheduledexecutor.impl.operations.AbstractSchedulerOperation.getContainer(AbstractSchedulerOperation.java:65)
        at com.hazelcast.scheduledexecutor.impl.operations.SyncBackupStateOperation.run(SyncBackupStateOperation.java:39)
        at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:184)
        at com.hazelcast.spi.impl.operationexecutor.OperationRunner.runDirect(OperationRunner.java:150)
        at com.hazelcast.spi.impl.operationservice.impl.operations.Backup.run(Backup.java:174)
        at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:184)
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.call(OperationRunnerImpl.java:256)
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:237)
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:452)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:166)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:136)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.executeRun(OperationThread.java:123)
        at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102)
I can't see any particular operation failing prior to that. I do run some scheduled operations myself, but they execute inside try-catch blocks and never throw.
The consequence is that whenever a node in the cluster restarts, no data is replicated to the new node, which eventually renders the entire cluster useless: all data that's supposed to be cached and replicated among nodes disappears.
What could be the cause? How can I get more details about what causes whatever executor Hazelcast uses to shut down?
Based on other conversations...
Your Runnable / Callable should implement HazelcastInstanceAware.
Don't pass the HazelcastInstance or IExecutorService as a non-transient argument... as the instance where the runnable is submitted will be different from the one where it runs.
See this.
I'm using Hydra 1.1 and hydra-ax-sweeper==1.1.5 to manage my configuration and run hyper-parameter optimization on the MineRL environment. For this purpose, I load a lot of data into memory using multiprocessing (via PyTorch); usage peaks at around 50 GB while loading and drops to 30 GB once everything is loaded.
On a normal run this is not a problem (my machine has 90+ GB of RAM); a single training run finishes without any issue.
However, when I run the same code with the -m option (and hydra/sweeper: ax in the config), it stops after about 2-3 sweeper runs, stuck in the data-loading phase, because all of the system's memory (plus swap) is occupied.
At first I thought this was an issue with the MineRL environment code, which starts Java code in a subprocess. So I tried running my code without the environment (only the 30 GB of data), and I still have the same issue. So I suspect there is a memory leak somewhere between sweeps in the Hydra sweeper.
So my question is: how does the Hydra sweeper (or ax-sweeper) work between sweeps? I always had the impression that it runs the main(cfg: DictConfig) function decorated with @hydra.main(...), takes the scalar return value (the score), and feeds it to the Bayesian optimizer, with main() being called like an ordinary function (so everything inside is properly deallocated/garbage collected between sweep runs).
Is this not the case? Should I then load the data somewhere outside the main() and keep it between sweeps?
Thank you very much in advance!
The hydra-ax-sweeper may run trials in parallel, depending on the result of calling the get_max_parallelism function defined in ax.service.ax_client.
I suspect that your machine is running out of memory because of this parallelism.
Hydra's Ax plugin does not currently have a config group for configuring this max_parallelism setting, so it is set automatically by Ax.
Loading the data outside of main (as you suggested) may be a good workaround for this issue.
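A minimal sketch of that workaround, assuming the trials run in the same Python process (a launcher that spawns separate processes would give each one its own copy). load_dataset, the data_path and lr keys, and the placeholder score are hypothetical stand-ins, not part of Hydra or Ax; in the real script main would carry the @hydra.main(...) decorator.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def load_dataset(path):
    """Hypothetical stand-in for the expensive data-loading step."""
    return [float(i) for i in range(1000)]


def main(cfg):
    """Would carry the @hydra.main(...) decorator in the real script."""
    # The first trial pays for the load; later trials in the same process
    # get the cached object back instead of loading a second copy.
    data = load_dataset(cfg["data_path"])
    return sum(data) * cfg["lr"]  # placeholder for the score returned to the sweeper
```

The cache is keyed on the path string, so a sweep that changes the data path would still trigger a fresh load.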
Hydra sweepers in general do not have a facility to control concurrency. This is the responsibility of the launcher you are using.
The built-in basic launcher runs the jobs serially, so it should not trigger memory issues.
If you are using other launchers, you may need to control their parallelism via launcher-specific parameters.
I have a Meteor app that is based on the Meteor simple todos tutorial. I noticed that the second node process on the server (the smaller one, started later during app startup) steadily increases its memory usage, even when there is no activity at all, as long as at least one client is logged on to the server. I trace the RSS memory value with
pmap -x $PID_node | tail -1 | awk '{ print $4 }'
but it is also visible with 'top'. The RSS number increases steadily while nothing at all is being done on the client side. I added the app here for reproducing the behavior.
If I remember correctly, it is in the original state from when I finished the tutorial, without any custom changes.
When the client disconnects, the memory stays allocated and is not reduced at all. This goes on until the app is shut down on the server upon reaching the memory limit. I see similar behavior with the official todos example. I start the apps with
meteor --port 61100
I found in other posts that this could be related to a public folder, which I do not even have in the simple todos app, so there must be something else going on. I also updated to the latest version of Meteor, 1.3.4.1, which does not change the behavior.
Is this normal Meteor behavior, or should it be considered a Meteor bug? Or is there badly written code in the examples?
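For logging the RSS value over time instead of running the pmap pipeline by hand, a small Python sketch along these lines might do. It assumes a Linux /proc filesystem and reads the VmRSS field of /proc/<pid>/status, which should give roughly the same figure; the function names are made up for illustration.

```python
import re


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process on Linux."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())
```

Calling read_rss_kb with the node process ID once a minute and appending the result to a file would show whether the growth is linear or levels off.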
I have a Symfony2 application with RabbitMQBundle installed. I've set up consumers and producers as described in the bundle documentation, and everything works correctly. However, my consumers, started with ./app/console rabbitmq:consumer, take all available CPU time. The consumer basically does nothing but wait for a message and output it. If I start the demo consumer from php-amqplib, CPU consumption is almost zero. I tried different versions of Symfony (2.6 and 2.3), but this does not affect the CPU load. My server configuration:
Debian 7
PHP 5.6.4 (also tried 5.4)
no database used
RabbitMQ 3.4.2
Is there any way to reduce CPU consumption? Thanks
I just ran into a very similar issue and, after some debugging, realized that I was using an old way of instantiating the connection to RabbitMQ.
The new signature of the method is described here: https://github.com/videlalvaro/php-amqplib/blob/master/PhpAmqpLib/Connection/AbstractConnection.php#L136
I was sending in something that looked more like
$this->connection = new Connection\AMQPConnection(
    $server->host,
    $server->port,
    $server->user,
    $server->password,
    $server->vhost,
    $server->insist,
    $server->login_method,
    $server->locale,
    $server->connection_timeout,
    $server->read_write_timeout,
    $server->context,
    $server->keepalive,
    $server->heartbeat
);
That matches a very old definition from somewhere around version 2: https://github.com/videlalvaro/php-amqplib/blob/v2.0.0/PhpAmqpLib/Connection/AMQPConnection.php#L31
So your plugin seems to use a new version of the library but not the new way of initiating a connection.
All,
I'm looking for a good way to do some job backgrounding through either of these two services.
I see PHPFog supports IronWorks, but I need something more realtime. With these cloud-based PaaS services, I'm not able to use popen(background.php --token=1234). So I'm thinking the best solution might be to kick off a Gearman worker to handle the job. (Actually, my preferred method would be to use websockets to keep a connection open and receive feedback from the job, rather than long polling a DB table through AJAX, but none of these providers support websockets.)
Question 1: is there a better solution than using Gearman to offload the job?
Question 2: I see PagodaBox supports 'worker listeners' (http://help.pagodabox.com/customer/portal/articles/430779). Has anybody set this up with Gearman? Would it work?
Thanks
I am using PagodaBox with a background worker in an application I am building right now. Basically, PagodaBox daemonizes a PHP process for you (meaning it will continually run in the background), so all you really have to do is create a script that checks a database table for tasks to run, runs them, and then sleeps a bit so it's not running too many queries against your database.
This is a simplified version of what I have running:
// Remove time limit
set_time_limit(0);

// Show ALL errors
error_reporting(-1);

// Run daemon
echo "--- Starting Daemon ---\n";
while (true) {
    // Query 'work_queue' table for new tasks
    // Loop over items and do whatever tasks are associated with them
    // Update row to mark task as completed

    // Wait a bit
    sleep(30);
}
A benefit to this approach is that it's easy to test via CLI:
php tasks.php
You will see all the echo statements come through in the console as it's running, and of course it's much easier to set up than a more complicated arrangement with other dependencies like Gearman.
So whenever you add a new task to the table, the longest you'll wait for that task to be picked up in a batch is 30 seconds (or whatever your sleep time is). This is preferable to cron jobs: if you set up a cron job to run every minute (the lowest possible interval) and the work takes longer than a minute, another cron process will start working on the same queue, and you can end up with a lot of duplicated task work that is hard to debug and troubleshoot. If you instead have either a single background worker that runs all tasks, or multiple background workers that each handle a different task type, you will never run into this issue.
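To make the three commented steps in the loop concrete, here is the same polling pattern sketched in Python, with an in-memory SQLite database standing in for the real one. The work_queue columns (id, payload, done), the handler callback, and the max_cycles parameter are assumptions made for illustration; they are not taken from my actual script.

```python
import sqlite3
import time


def run_pending_tasks(conn, handler):
    """Run every unfinished task once, mark it completed, return how many ran."""
    rows = conn.execute(
        "SELECT id, payload FROM work_queue WHERE done = 0 ORDER BY id"
    ).fetchall()
    for task_id, payload in rows:
        handler(payload)
        conn.execute("UPDATE work_queue SET done = 1 WHERE id = ?", (task_id,))
    conn.commit()
    return len(rows)


def run_worker(conn, handler, sleep_seconds=30, max_cycles=None):
    """Poll the queue forever, or for max_cycles passes (handy for testing)."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_pending_tasks(conn, handler)
        cycles += 1
        time.sleep(sleep_seconds)  # wait a bit so the database is not hammered


def make_demo_queue():
    """Build a throwaway in-memory queue holding two pending tasks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_queue ("
        "id INTEGER PRIMARY KEY, payload TEXT, done INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO work_queue (payload) VALUES (?)", [("email",), ("resize",)]
    )
    conn.commit()
    return conn
```

Because one process both reads and marks the rows, a task is handled once; running several copies of this worker against the same table would bring back the duplication problem described for cron.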