More interactive ZODB packing - plone

The current ZMI "Pack database" functionality is a little rough.
1) Would it be possible to have some kind of progress indicator in the web UI? E.g. one telling how many minutes/hours are left, giving at least a rough estimate.
2) How does ZODB packing affect the responsiveness of the site? Are all transactions blocked?
3) Are any command-line scripts with a progress indicator available, so you could do this from a ZEO command-line client?
4) At least some kind of log markers in the log output... [INFO] 30% done... 3:15 to go

Packing Anatomy
ZODB FileStorage packing is the process of selectively copying data from one file to another (keeping only transactions that are "younger" than the specified age). Before the copying starts, an index is built in memory to aid the process. The whole ZODB packing run therefore consists of the following steps:
Building pack index
Copying transactions to temporary file
Appending transactions that were performed after packing started
Replacing original FileStorage with packed one and reopening it in read/write mode
I usually monitor the process with a combination of top, vmstat/dstat, and watch ls -la var/filestorage.
As Geir mentioned, you can have a separate ZEO client dedicated to packing. That used to be reasonable because the thread you invoked packing from was blocked until packing finished. If you use ZEO, there is no longer any need for it: the ZEO server provides the zeopack utility, which connects directly to ZEO (no dedicated ZEO client needed) and initiates FileStorage packing. One of the benefits is that no password is needed, just proper permissions on the ZEO control socket.
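For reference, here is a minimal sketch of triggering a pack over ZEO from Python; the address and retention period are assumptions for illustration, and a buildout's bin/zeopack script does essentially the same thing:

    # Hedged sketch: pack a ZEO-served storage from Python.
    from ZEO.ClientStorage import ClientStorage
    from ZODB import DB

    db = DB(ClientStorage(('localhost', 8100)))   # assumed ZEO address
    db.pack(days=7)   # drop object revisions older than 7 days
    db.close()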
Packing progress
As packing is performed by the ZEO server (in fact not even by the server but by the FileStorage itself), the possibility of properly communicating progress to the ZEO client is limited. The ZEO protocol was not designed to carry that kind of information.
IMHO FileStorage itself could be more verbose, via the log file, about what it is doing right now, and some kind of progress reporting could be built in. If you really need a progress indicator, you could design a feedback channel through the logging module back to the ZEO client/Zope instance, so progress can be reported to the browser.
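Until something like that exists, a crude external estimate is possible: FileStorage writes the pack output to a temporary file named <Data.fs>.pack next to the original, so watching its size gives a rough idea of how far along the copy is. A small sketch, assuming the usual var/filestorage/Data.fs layout:

    import os, time

    DATA = 'var/filestorage/Data.fs'   # assumed path
    PACK = DATA + '.pack'              # temporary file the pack copies into

    while os.path.exists(PACK):
        done, total = os.path.getsize(PACK), os.path.getsize(DATA)
        # Only a rough figure: the packed file ends up smaller than the
        # original, so real progress is somewhat further along than this.
        print('copied %d of %d MB (~%.0f%%)'
              % (done // 2 ** 20, total // 2 ** 20, 100.0 * done / total))
        time.sleep(30)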
Performance while packing
As FileStorage packing is a disk-intensive operation, it reduces the throughput of the disk subsystem. With a larger FileStorage it also evicts the OS disk cache, which hurts disk performance even after packing has finished, since the caches have to warm up again. Possible improvements to FileStorage that would lead to longer packing times but a smaller impact on the system are:
reverting to O_DIRECT operations (so packing does not touch the file cache)
reducing the disk scheduling priority (ionice on Linux) of the thread performing the packing (see the sketch below)
throttling the packing speed
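None of these are built in today, but an operator can approximate the ionice idea when packing a FileStorage directly, i.e. a maintenance run against a Data.fs that no ZEO server currently has open. A hedged sketch, with path and retention period assumed:

    import os, subprocess
    from ZODB import DB
    from ZODB.FileStorage import FileStorage

    # Put this process into the "idle" I/O scheduling class (Linux only)
    # and lower its CPU priority before the disk-heavy pack starts.
    subprocess.check_call(['ionice', '-c', '3', '-p', str(os.getpid())])
    os.nice(19)

    db = DB(FileStorage('var/filestorage/Data.fs'))   # assumed path
    db.pack(days=7)                                   # keep the last 7 days
    db.close()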

The recommended way to pack large sites is to run the pack from a separate ZEO client instance dedicated to such tasks, one that is not listening for HTTP requests at all.
That also removes the need for any of the features requested.

1) There is no such indicator, and it would probably be hard to implement one (I would love to see at least some progress indicator through the Zope logging system)
2) Transactions are not blocked, but depending on the packing phase you may see high I/O and CPU usage
3) no
4) no

Related

Are Google Cloud Disks OK to use with SQLite?

Google Cloud disks are network disks that behave like local disks. SQLite expects a local disk so that locking and transactions work correctly.
A. Is it safe to use Google Cloud disks for SQLite?
B. Do they support the right locking mechanisms? How is this done over the network?
C. How do disk IOPS and throughput relate to SQLite performance? If I have a 1 GB SQLite file with queries that take 40 ms to complete locally, how many IOPS would this use? Which disk performance tier should I choose (standard, balanced, SSD)?
Thanks.
Related
https://cloud.google.com/compute/docs/disks#pdspecs
Persistent disks are durable network storage devices that your instances can access like physical disks
https://www.sqlite.org/draft/useovernet.html
the SQLite library is not tested in across-a-network scenarios, nor is that reasonably possible. Hence, use of a remote database is done at the user's risk.
Yeah, the article you referenced essentially stipulates that since reads and writes are "simplified" at the OS level, they can be unpredictable, resulting in "loss in translation" issues when going local-network-remote.
They also point out that it may very well work totally fine in testing, and perhaps in production for a time, but there are known side effects which are hard to detect and mitigate against -- so it's a slight gamble.
Again, the implementation they are describing is not Google Cloud Disk, but simply a remote networked arrangement.
My point is more that Google Cloud Disk may be more "virtual" than purely network-attached storage... to my mind that would be where to look, and evaluate it from there.
Check out this thread for some additional insight into the issues: https://serverfault.com/questions/823532/sqlite-on-google-cloud-persistent-disk
Additionally, I was looking around and found this thread, where one poster suggests using SQLite as a read-only asset, then deploying updates in a far more controlled process.
https://news.ycombinator.com/item?id=26441125
The persistent disk acts like a normal disk in your VM, and is only accessible to one VM at a time.
So it's safe to use; you won't lose any data.
For the performance part, you just have to test it for your specific workload. If you have plenty of spare RAM, and your database is read-heavy with few writes, the whole database will be cached by the OS (Linux) disk cache, so it will be crazy fast, even on HDD storage.
But if you are low on spare RAM, the database won't be in the OS cache, and writes are always synced to disk, which causes lots of I/O operations.
In that case, use the highest-performing disk you can / are willing to afford.
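If it helps, "just test it" can be as small as timing a representative query against a copy of the database on each disk type you are considering; a minimal sketch (the path, table and query are placeholders, not taken from the question):

    import sqlite3, time

    conn = sqlite3.connect('/mnt/pd-balanced/copy-of-prod.db')  # assumed mount
    start = time.perf_counter()
    # Substitute a query that is representative of your real workload.
    conn.execute('SELECT COUNT(*) FROM events WHERE created_at > ?',
                 ('2024-01-01',)).fetchone()
    print('query took %.1f ms' % ((time.perf_counter() - start) * 1000))
    conn.close()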

uWSGI and Flask: keep objects in memory between requests

My stack is currently uWSGI, Flask and nginx. I need to store data between requests (basically I receive push notifications from another service about events, and I want to store those events in the server's memory, so a client can just query the server every n milliseconds to receive the latest update).
Normally this would not work, for many reasons. One is that a good deployment requires several uWSGI processes in production (and maybe even several machines to scale out). But my case is very specific: I'm building a web app for a piece of hardware (think of your home router's configuration page as a good example). This means there is no need to scale. I also do not have a database (at least not a traditional one), and normally only 1-2 clients connect simultaneously.
If I specify --processes 1 --threads 4 in uWSGI, is this enough to ensure the data is kept in memory as a single instance? Or do I also need to use --threads 1?
I'm also aware that some web servers clear memory from time to time and restart the hosted app. Does nginx/uWSGI do that, and where can I read about the rules?
I'd also welcome advice on how to design all of this, if there are better ways to handle it. Please note that I am not considering any persistent storage for this - it is not worth the effort and may even be impossible due to hardware limitations.
Just to clarify: when I'm talking about one instance of data, I'm thinking of my app.py executing exactly once and keeping the instances defined there for as long as the server lives.
If you don't need data to persist past a server restart, why not just build a cache object into your application that can do push and pop operations?
A simple array of objects should suffice: one Flask route pushes new data onto the array and another pops the data off, as sketched below.
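A minimal sketch of that idea, assuming --processes 1 so there is only one copy of the module; with --threads 4 a lock is enough to keep the list consistent (the route names and payload shape are made up for illustration):

    import threading
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    _events = []                 # lives as long as the worker process lives
    _lock = threading.Lock()     # needed once --threads > 1

    @app.route('/events', methods=['POST'])
    def push_event():
        with _lock:
            _events.append(request.get_json())
        return '', 204

    @app.route('/events', methods=['GET'])
    def pop_events():
        with _lock:
            batch = list(_events)
            _events.clear()
        return jsonify(batch)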

what is the thread model of mysql and innodb?

I am learning the architecture of MySQL and InnoDB, and the thread model combined with the pluggable engine system confuses me. It is claimed that a MySQL instance is one process with many threads, and that InnoDB has many background threads, such as the master thread and I/O threads that deal with callbacks from kernel AIO. What's more, I find that the connection pool is managed by the MySQL layer, not by the pluggable InnoDB layer.
So how do MySQL threads manage connections and their requests, and how does a MySQL request reach kernel AIO and cooperate with the InnoDB I/O threads?
A connection is given its own process (or thread, depending on the OS). That process runs along as if it were a single process, except for access to a large number of caches, tables, etc. To deal with them, there are many "mutexes". These are used to give one process exclusive access to, say, insert/delete something in the buffer_pool or table cache, etc.
In the old days, there were fewer mutexes. For example, to do anything with the buffer pool, a single mutex (or maybe a few) granted access. This was fine when there was one CPU or maybe two. But when servers had 8 cores, they realized that this was not fine-grained enough. So there was an overhaul of the mutexes. (IIRC, it was spearheaded by Percona.)
Because of these mutexes, 4 processes could work "simultaneously" before they stumbled over each other. More than 8 cores --> throughput actually went down. (And latency went through the roof.)
As new versions came out, that number increased. With MySQL 8, it is possibly over 64.
Another thing that happened was "instances" of buffer_pools and table_open_caches. This was a straightforward way to split a mutex into several.
I am less familiar with the I/O thread model. I suspect the thread (for one connection) somehow signals another thread to do the work for it.
Also, several things are left to background tasks: flushing to logs, updating index blocks from the "change buffer", read-ahead, flushing of "dirty" pages from the buffer_pool, cleaning up BTrees, etc. Note that all of these would use the appropriate Mutex(es) so as not to step on each other and the various connection processes.
The "pluggable engine" system needs a different kind of chatter -- between the generic handler and the engine-specific code. I suspect this is orthogonal to the mutexes, background threads, etc.
The "connection pool" would be yet-another array (or list) of something, and multiple processes would want to grab/release an item in that array. Hence its own mutex(es).

Reduce the BizTalk receive location file input speed

We have a BizTalk 2010 receive location which gets a 70 MB file and then uses an inbound map (in the receive location) and an outbound map (in the send port) to produce a 1 GB file.
While this process runs, a lot of disk I/O is consumed in SQL Server, and the performance of other receive locations is badly affected.
We have tried reducing the maximum disk I/O threads in the host instance of that receive location, but it still consumes a lot of disk I/O in SQL Server.
In fact this process's priority is very, very low. Is there any method to reduce the disk I/O usage of this process so that the performance of other processes can return to normal?
This issue isn't related to the speed of the file input but, as you mentioned in a comment, to the load placed on the MessageBox when trying to persist the 1 GB map output. You have a few options here to try to minimize the impact this will have on other processes:
Adjust the throttling settings on the newly created host to something very low. This may or may not work the way you want it to, though.
Set a service window on the receive location for these files so that they only run during off hours. This would be ideal if you don't have 24/7 demand on the MessageBox and can afford slow response times in the middle of the night (say 2-3am).
If your requirements can handle it, don't map the file in the receive port, but instead route it to an Orchestration and/or custom pipeline component that will split it into smaller pieces and then map the smaller pieces. This should at least give you more fine-grained control over the speed at which they are processed (have a delay shape in the loop that processes the pieces). There could still be issues when you join them back together, but it shouldn't be as bad as your current process.
It also may be worth looking at your map. If there are lots of slow/processor heavy calls you might be able to refactor it.
Ideally you should debatch the file. Apply the business logic, including the map, to each individual segment, and then load them into SQL one at a time. Later you can use a pipeline or some other .NET component to pull the data from SQL and rebatch it. Handling big XML (10 times the size of the equivalent flat file) in the BizTalk MessageBox is not good practice.
If, however, it were a pure messaging scenario, you could convert the file into a stream and route it to the destination.

Slow BizTalk File Receive

I have an application with a file receive location. After the host instance has been running for a few hours the receive location fails to identify new files dropped into the folder that it is monitoring. It doesn't forget about them altogether, it's just that performance grinds to a crawl. The receive location is configured to poll the target folder every 60 seconds but after host instance has been running for an hour or so, then it seems that the target folder is being polled only every thirty minutes. If I restart the host instance then the files waiting in the target folder are collected right away and performance is fine for the next hour or so.
The same application runs fine in a different environment.
There are no obvious entries in the event log related to the problem.
All the BizTalk SQL jobs are running fine except for Backup BizTalk Server (BizTalkMgmtDb).
Any suggestions gratefully received.
Thanks
Rob
Here are some additional tools which may help you identify and diagnose BizTalk database issues.
BizTalk MsgBox Viewer
Here is a tool to repair identified errors:
Terminator
Use at your own risk... read the blogs and docs. Start with the MsgBox Viewer and let us know your results.
Without more details, the biggest tell is that your Backup job is failing. If the backup job is failing, it may not be properly configured. If it is properly configured and still failing, then you've got other issues. Can you give us some more information about your BizTalk install?
What version are you running?
What are your database sizes?
What are your purge and archive settings like?
Are there any long-running blocks in your SQL Server DB coming from BizTalk?
Another thing to consider is the user accounts the send, receive and orchestration hosts are running under. Please check the BizTalk Administration Console. If they are all running the same account, sometimes the orchestrations can starve the send and receive processes of CPU time. I believe priority is given to orchestrations then receive, then send. Even if you are just developing, it is useful to use separate accounts for this. This also improves security.
The Wrox BizTalk Server 2006 book also supplies tuning advice.
What other things are going on with the server? Is BizTalk pegged otherwise or is it idle?
You mention that the solution does not have any problems in another environment, so it's likely that there is a configuration problem.
Check the following:
** On SQL Server, set some upper memory limit for SQL Server. By default, SQL Server uses whatever it can get and then hangs onto it, so set a reasonable limit so that your system can operate without spending a lot of time paging memory onto and from your hard drive(s).
** Ensure that you have available disk space - maybe you are running low - this can lead to all kinds of strange problems.
** Try to split up the system's paging file among its physical drives (if you have more than one drive on the system). Also consider using a faster drive, or if you have lots of cash laying around, get a SAN.
** In BizTalk, is tracking enabled? If so, are you also tracking message bodies? Disable tracking or message body tracking and see if there is a difference.
** Start performance monitor and monitor the following counters when running your solution
Object: BizTalk Messaging
Instance: (select the receiving host) %%
Counter: Documents Received/Sec
Object: BizTalk Messaging
Instance: (select the transmitting host) %%
Counter: Documents Sent/Sec
Object: XLANG/s Orchestrations
Instance: (select the processing host) %%
Counter: Orchestrations Completed/Sec.
%% You may have only one host, so just use it. Since BizTalk configurations vary, I am using generic names for hosts.
The preceding counters monitor the most basic aspects of your server, but may help to narrow down places to look further. You can, of course, add CPU and Memory too. If you have time (days...maybe weeks) you could monitor for processes that allocate memory and never release it. Use the following counter...
Object: Memory
Counter: Pool Nonpaged Bytes
Slow decline of this counter indicates that a process is not releasing memory, which affects everything on the system.
Let us know how things turn out!
I had the same problem: when my orchestration had been idle for some time, it took a long time to process the first message. An article by EvYoung helped me solve this problem.
"This is caused by application domain unloading within the BizTalk host process. If an AppDomain is shutdown after idle, the next message that comes needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in low latency requirement scenario, you can modify the BTSNTSVC.EXE.config file and set SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idle."
You can find the article in here:
http://blogs.msdn.com/b/biztalkcpr/archive/2008/05/08/thoughts-on-orchestration-performance.aspx
It took me too long to respond, but I thought it might help someone. Cheers :)
Some good suggestions from others. I will add:
Do you have any custom receive pipeline components on the receive location? If so, perhaps one is leaking memory, or calling some external component (e.g. a database) which is taking a long time?
How big are the files you are receiving?
On the File transport properties of your receive location, turn "file renaming" on; do the files get renamed within 60s?

Resources