Does OpenJ9 write the GC log asynchronously?
When using Eclipse OpenJ9 in a Docker container, can I put gc.log on NFS or Ceph?
I've read that OpenJDK writes the GC log synchronously: Is gc.log writing asynchronous? safe to put gc.log on NFS mount?.
Verbose GC logs can be directed to a file with the -Xverbosegclog option (mentioned at https://www.eclipse.org/openj9/docs/gc/, although at the moment most of the verbose GC documentation is still only on the IBM website).
If you suspect that the storage medium may block I/O operations, you can try -Xgc:bufferedLogging. This option isn't really documented (there has been no strong interest in it), but you are welcome to try it and let us know if you find it valuable.
Note, however, that buffered logging introduces a delay: if the JVM process terminates suddenly, the log may be missing a few lines that were still in the internal buffer and not yet flushed to the file.
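For example, a container entry point could combine the two options like this (the NFS mount path and the jar name are just placeholders):

java -Xverbosegclog:/mnt/nfs/logs/gc.log -Xgc:bufferedLogging -jar yourapp.jar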
I'm running a simple CRUD app built with ASP.NET Core and EF Core 3.1 in a Docker Swarm cluster on Ubuntu. I'm only using managed code.
The container has a 10 GB memory limit specified. I can inspect a running container and verify that this limit is actually set, and I also see that DOTNET_RUNNING_IN_CONTAINER is set to true. When the app is started, memory consumption is about 700 MB and it slowly builds up. Once it reaches 7 GB (I can see it in the container's stats) I start getting OutOfMemoryExceptions, and it stays at this level for days. So the first question is:
Why doesn't it go up to 10 GB?
Anyway, I suspect memory leaks, so I have the dotnet-gcdump tool installed in this same container, and I go ahead and collect a dump for future analysis with dotnet-gcdump collect. Once I execute this command, I see the memory consumption of the running container drop from 7 GB to 3 GB and stay at this level. The resulting .gcdump file itself, though, is only ~200 MB in size, with nothing suspicious in it. So the next questions are:
How does the collection of a dump reduce memory consumption? I'd assume it's doing GC with LOH compaction but it doesn't mention it in the docs.
Why isn't this memory freed automatically if the tool is able to do it?
Why is a resulting dump only 200 MB in size?
As the gcdump documentation explains: "GC dumps are created by triggering a GC in the target process, turning on special events, and regenerating the graph of object roots from the event stream".
Thus, it directly answers your question 2: it triggers a full GC, which may or may not be compacting, but it collects gen2 for sure. It also answers question 4: it is not a "memory dump" but a special kind of diagnostic data about the object graph (dependencies and type names), without the data itself.
As regards questions 1 and 3, this is an example of the GC not being "aggressive" enough. It is the kind of "living on the edge" problem where the process is close to the container's limit and the GC sometimes misjudges the situation; in other words, it thinks it has enough space, but it doesn't. Please be warned that this is a gross oversimplification. In such a case, full GCs may not happen, or may happen too late. I would confirm that by observing the process with dotnet-trace using the gc-collect profile.
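For example, something along these lines (the process id is a placeholder):

dotnet-trace collect --process-id 1 --profile gc-collect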
As a solution, consider setting the limit manually with GCHeapHardLimit to a clearly smaller value, such as 8 GB.
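A rough sketch of what that could look like for an 8 GB limit (8589934592 bytes; the property sits under configProperties in the app's runtimeconfig.json):

{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.HeapHardLimit": 8589934592
    }
  }
}

Alternatively, for .NET Core 3.1 it can be passed to the container as an environment variable, which, as far as I recall, takes a hex byte count: COMPlus_GCHeapHardLimit=0x200000000.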
I'm running a Google Cloud Compute VM as my application server for an app that's available on iOS and Android. The server runs Django within uWSGI, fronted with nginx. The communication between uWSGI and nginx happens through a Unix domain socket.
Recently I started noticing timeouts at the client end. After a bit of experimentation, I found that uWSGI sometimes errors out while writing data to the socket. For example, a sample request that returns about 200 KB of JSON data takes about 1 second for Django to compute, but the Unix socket seems to take another 1-2 seconds, which seems too high for a 200 KB response. If the client is expecting a response within 2 seconds, this often leads to a write error at uWSGI (as shown in the screenshot below). When I increase the timeout ('max-time') at the client end, the request goes through smoothly.
I want to know if there are configuration changes that can make reading and writing on a Unix socket faster. 200 KB is a very small payload for a JSON response from my server, so I won't be able to bring it down. And I can't have a timeout of more than 2 seconds at my client (iOS or Android), for business reasons.
Several Unix entities are represented by files but are not files at all; pipes and sockets are two examples.
So writing to and reading from a Unix socket is not bound to filesystem I/O and does not share filesystem response times. In fact, a Unix domain socket is one of the fastest forms of IPC, more efficient than a TCP socket, since it does not involve the network stack at all.
That said, here are some hints on how to solve your particular problem:
Evaluate your app for performance issues. Profile it and check where it might be spending too much time. Usually, I/O is the main villain in performance problems; bad algorithms, such as linear searches over long lists, are also common culprits.
Check your configuration on both the web server and the application gateway (see the sketch after this list).
Check process scheduling. If everything is running on the same box, process concurrency may be an issue under heavy load. Make sure all processes are running at appropriate priorities.
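For instance, on the configuration side, these are the kinds of knobs worth checking in a uWSGI + nginx setup (paths and values below are purely illustrative):

# uwsgi.ini
[uwsgi]
socket = /tmp/app.sock
# accept backlog for the socket (uWSGI's default is 100)
listen = 1024
# internal buffer for request headers
buffer-size = 65535

# nginx location block
location / {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/app.sock;
    uwsgi_buffers 16 64k;
    uwsgi_send_timeout 60s;
    uwsgi_read_timeout 60s;
}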
Good luck!
Throughout the CoreFoundation framework source, POSIX filesystem API calls (e.g. open(), stat(), et al…) are wrapped in an idiom wherein a descriptor on /dev/autofs_nowait is acquired – with open(…, 0) – before the POSIX calls are made; afterwards the descriptor is close()’d before the scope exits.
What is the benefit of doing this? What are the risks?
Does acquiring the /dev/autofs_nowait descriptor have any effect on, or is it affected by, flags to any thusly-wrapped open() calls (e.g. O_NONBLOCK)?
/dev on my machine, running OS X 10.10.5, has other “autofs” entries:
… none of which have man pages available. If these file-like devices might offer benefits in this vein I would be interested to hear about their use as well, as it may pertain.
Addendum: I could not find much on this subject; a Google Plus post from 2011 claims that:
[t]his file is a special device that's monitored by the autofs filesystem implementation in the kernel. When opened, the autofs filesystem will not block that process on any I/O operations on an autofs file system.
I am not quite sure what that means (they were specifically talking about how launchd works, FWIW), but I was curious about this myself, so I wrote a quick context-manager-y RAII struct to try it out. Untargeted profiling shows tests with POSIX calls completing faster, but within the same general range; I'll investigate this tactic with a finer-toothed comb after I get more background on how it all works.
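Roughly, the idiom boils down to something like this (a minimal sketch in plain C rather than my actual struct; the wrapper name is made up):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hold a descriptor on /dev/autofs_nowait while making a POSIX
   filesystem call, then release it before leaving the scope. */
int stat_nowait(const char *path, struct stat *out) {
    int guard = open("/dev/autofs_nowait", 0); /* the wrapped call still works if this fails */
    int rc = stat(path, out);                  /* should not block waiting on an automount */
    if (guard >= 0)
        close(guard);
    return rc;
}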
These devices allowed the implementor(s) to avoid defining a new syscall or ioctl for the functionality; it was probably done that way because it was simpler, required updating less code, and did not change the VFS API, which may have been the concerns at the time.
When you open /dev/autofs_nowait and traverse a path, you trigger auto-mounts but do not wait for them to finish (otherwise your process blocks until the filesystem is mounted or the operation times out), so you may receive an ENOENT when opening a file even if everything goes fine.
OTOH, /dev/autofs_notrigger makes the process not even trigger the auto-mounting.
That is all those devices do. The thing is that, in Darwin's implementation, open may block when traversing the filesystem even with O_NONBLOCK or O_NDELAY.
You can follow the flow from the VFS, starting at the open operation on a vnode:
vn_open -> vn_open_auth -> namei -> lookup -> ...
Down that path there's no handling of the (non)blocking behavior.
We have an application which is deployed to a WebSphere server running on UNIX, and we are experiencing two issues:
a system hang which recovers after a few minutes - to investigate, we will need the thread dump (javacore).
a system hang which does not recover and requires WebSphere to be restarted - to investigate, we will need the thread dump and heap dump.
The problem is: when a system hang occurs, we do not know whether it is issue 1 or 2.
Ideally we would like to manually generate the thread dump first, and wait to see if the system recovers. If it does not, then we generate the thread dump and the heap dump, before restarting WebSphere.
I know about the kill -3 (or kill -QUIT) command. The command generates a thread dump only (if IBM_HEAPDUMP=false), or a thread dump and a heap dump (if IBM_HEAPDUMP=true). However, IBM_HEAPDUMP has to be set before WebSphere is started and cannot be changed while WebSphere is running.
Is my understanding correct, regarding the IBM_HEAPDUMP parameter and the kill -3 command?
Also, is it possible to get the dumps in the way I described? (i.e. when generating JVM diagnostics, choose on the fly whether to generate a heap dump or not)
Your understanding is consistent with everything I've read.
However, I believe you can accomplish what you want by using wsadmin scripting. This article describes how to force javacores and heapdumps on a Windows platform where kill -3 is not available, but the same commands can be run on any WebSphere system.
From within wsadmin or a wsadmin script, execute:
set jvm [$AdminControl completeObjectName type=JVM,process=server1,*]
$AdminControl invoke $jvm generateHeapDump
$AdminControl invoke $jvm dumpThreads
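To match the two-step workflow described in the question, the same commands can be staged, reusing the $jvm handle from above; for example:

# hang detected: thread dump only
$AdminControl invoke $jvm dumpThreads
# still hung after a few minutes: collect a heap dump as well before restarting
$AdminControl invoke $jvm generateHeapDump
$AdminControl invoke $jvm dumpThreads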
Environment:
Windows 2003 Server (32 bit); IIS6, ASP.NET 2.0 (3.5); 4Gb Ram; 1 Worker Process
We have a situation where a very large System.Xml.XmlDocument is being loaded into memory and then fed into a compiled XSL transform.
What is happening is that when a web request comes in, the server is sitting in an idle state with 2500 MB of available system memory.
As the XML DOM is populated, the available memory drops by approximately 500 MB, at which point we get a System.OutOfMemoryException. At this point the system should theoretically still have 2000 MB of memory available to service the request (according to Perfmon).
The related questions I have are:
1) At what level in the stack is this out of memory limitation being met? OS? IIS? ASP.NET? worker process? Is this a per individual web request limit?
2) Is this limit configurable somewhere?
3) Why can’t this web request access the full available system memory?
1) I would guess the worker process, but the amount of memory a worker process can use should be configurable within IIS. Another factor is the bitness of your software: a 32-bit process has a hard limit of 4 GB, since that is its total address space.
2) Probably, but don't forget that memory fragmentation may make you hit out-of-memory sooner than you think: a request for a contiguous 1000 MB block of memory may not be satisfiable even if that much memory is free in total.
3) Have you examined dump data to see what is in memory when the exception gets thrown? If not, there are ways to get a snapshot of the memory to see what it looks like, as this may give you more clues about what is going on.
You are running in a process, and a 32-bit process on Windows can only access 2 GB of user address space by default. This task is sharing that space with everything else running in the process, so this bit of code does not get the full 2 GB -- even if it is nominally available.
There is a /3GB switch for the OS as well; on Windows Server 2003 it is set in boot.ini rather than the registry, and the executable must be large-address-aware to benefit from it. You will have to search MSDN for the details.
But realistically, you need to do this another way, possibly by switching to a streaming, SAX-style XML parser (XmlReader in .NET) rather than building the whole DOM.
I'm sure there are some bright heads here who can answer your specific questions, but have you asked yourself whether there is another way to do what you want? You probably do not want to process a very large XML document as such; more specifically, you want to return something to the client. Could you rewrite the code to avoid this XML document altogether, or perhaps avoid loading it all into memory at the same time, and still produce the same end result?
1) Dunno. Check your logs.
2) IIS limits memory divvied out to websites/application pools. Check your settings.
3) Servers are all about uptime; if a single app hogs all the resources, everybody else suffers. That's why enterprise servers like IIS limit memory, to prevent runaway apps from taking down the entire machine.