Membase Blocking on Key Eviction? - sqlite

We've been using Memcached for a while and recently started testing Membase in AWS. We're testing a single instance of Membase 1.6.0 on a large EC2 instance with 5GB RAM, 750GB disk (Linux FC8).
We've noticed that SQLite seems to block on eviction purges on an hourly basis when expiryPagerSleeptime wakes up. Although this was expected (because SQLite uses database level locking), we didn't expect that Membase would block as well.
In this case, it seems that while SQLite is deleting old keys, Membase "operations per second" fall to zero or near zero for several minutes. After the eviction process has finished, the Membase server quickly recovers. I would have anticipated that reads from Membase RAM would still proceed while SQLite was locked but this doesn't seem to be the case. Everything stops; the spy clients throw streams of exceptions as they time-out waiting for data that never arrives.
My impression from the docs was that Membase was asynchronous and would continue to serve reads from RAM. I would appreciate any help or suggestions to prevent Membase from blocking on key evictions. This is a serious issue for us because it seems to take about 4 minutes for this eviction process to finish and for the backlog in the disk queue to clear. That means every hour, Membase is effectively offline for 4 minutes.
I should also mention that this happens once the data is larger than RAM (and it's increasing size on the disk). We didn't notice any issues with key eviction when the data was just in RAM (presumably because key eviction in RAM happens too quickly to be noticable.)

With the desire of not duplicating information, this question is being answered and explained here: http://www.couchbase.com/forums/thread/membase-blocking-key-eviction
Perry

Related

The first DB call is much slower

First call at morning takes 15 seconds,
FOR EACH ... NO-LOCK:
END.
the second call takes only 1,5 seconds.
What causes this delay?
What can I log to identify it?
Even when I restart the DB I can't reproduce the behaviour of the first call.
(In complex queries I measure difference of 15 minutes to 2 seconds)
The most likely cause for this will be caching. There are two caches in place:
The -B buffer pool of the database which caches database blocks in memory. It is a typical observation, that once this cache is warmed up after a restart of the DB server queries are executing much faster. Of course this all depends on the size of your DB and the size of the -B buffer pool. Relatively small databases may fit into a relatively large -B buffer pool in large parts
The OS disk cache will also play it's part in your observation

dotnet-gcdump unexpected dump size and impact

I'm running a simple CRUD app built with ASP.NET Core and EF Core 3.1 in a docker swarm cluster on ubuntu. I'm only using managed code.
The container has a 10GB memory limit specified. I can inspect a running container and verify that this limit is actually set, I also see that DOTNET_RUNNING_IN_CONTAINER is set to true. When the app is started the memory consumption is about 700MB and it slowly builds up. Once it reaches 7GB (I see it in container generic stats) I start getting OutOfMemoryExceptions and it stays at this level for days. So the first question is
Why doesn't it go up to 10 GB?
Anyway I expect memory leaks so I have a dotnet-gcdump tool installed in this same container so I go ahead and collect the dump for future analysis with dotnet-gcdump collect. Once I execute this command I see the memory consumption of the running container drops from 7GB to 3GB and stays at this level. The resulting .gcdump file itself size though is only ~200MB with nothing suspicious in it. So next questions are
How does the collection of a dump reduce memory consumption? I'd assume it's doing GC with LOH compaction but it doesn't mention it in the docs.
Why isn't this memory freed automatically if the tool is able to do it?
Why is a resulting dump only 200 MB in size?
As the gcdump documentations explains: "GC dumps are created by triggering a GC in the target process, turning on special events, and regenerating the graph of object roots from the event stream".
Thus, it directly answers your question 2 - it triggers full GC, which may or may not be compacting, but it collects gen2 for sure. It also answers question 4 - it is not a "memory dump" but a special kind of diagnostics data about the objects graph (depndencies and typenames), without the data itself.
And regards to the questions 1 and 3 - it is an example of the GC being "not aggressive" enough. It is kind of the "living on the edge" problem when the process almost meets the containers limits and GC sometimes is not able to interpret it. In other words, it thinks it has enough space, but it doesn't. Please, be warned that this is a super-oversimplification. In such a case full GCs may not happen or happen too late. I would confirm that by observing the process by the dotnet-trace with gc-collect profile.
As a solution, consider setting the limit manually, by using GCHeapHardLimit, to some clearly smaller value like 8GB.

NiFi memory management

I Just want to understand how we should plan for the capacity of a NiFi instance.
We have a NiFi instance which is having around 500 flows. So, the total number of processors enabled on NiFi canvas is around 4000. We do run 2-5 flows simultaneously which does not take more than half an hour i.e. we do process data in MBs.
It was working fine till now but we are seeing outofMemory error very often. So we increased xms and xmx parameters from 4g to 8g which has resolved the problem for now. But going forward we will have more flows and we may face outofmemory issue again.
So, can anyone help with matrix of capacity planning or any suggestion to avoid such issues before happening? eg:- If we have 3000 processors enabled with/without any processing then Xg amount memory required.
Any input on NiFi capacity planning would be appreciated.
Thanks in Advance.
OOM errors can occur due to specific memory consuming processors. For example: SplitXML is loading your whole record to memory, so it could load a 1GiB file for instance.
Each processors can document what resource considerations should be taken. All of the Apache processors(as far as I can tell) are documented in that matter so you can rely on them.
In our example, by the way, SplitXML can be replaced with SplitRecord which doesn't load all of the record to memory.
So even if you use 1000 processors simultaneously, they might not consume as much memory as one processor that loads your whole FlowFile's content to memory.
Check which processors you are using and make sure you don't use one like that(there are more like this one that load the whole document to memory).

Maria DB recommended RAM,disk,core capacity?

I am not able to find maria DB recommended RAM,disk,number of Core capacity. We are setting up initial level and very minimum data volume. So just i need maria DB recommended capacity.
Appreciate your help!!!
Seeing that over the last few years Micro-Service architecture is rapidly increasing, and each Micro-Service usually needs its own database, I think this type of question is actually becoming more appropriate.
I was looking for this answer seeing that we were exploring the possibility to create small databases on many servers, and was wondering for interest sake what the minimum requirements for a Maria/MySQL DB would be...
Anyway I got this helpful answer from here that I thought I could also share here if someone else was looking into it...
When starting up, it (the database) allocates all the RAM it needs. By default, it
will use around 400MB of RAM, which isn’t noticible with a database
server with 64GB of RAM, but it is quite significant for a small
virtual machine. If you add in the default InnoDB buffer pool setting
of 128MB, you’re well over your 512MB RAM allotment and that doesn’t
include anything from the operating system.
1 CPU core is more than enough for most MySQL/MariaDB installations.
512MB of RAM is tight, but probably adequate if only MariaDB is running. But you would need to aggressively shrink various settings in my.cnf. Even 1GB is tiny.
1GB of disk is more than enough for the code and minimal data (I think).
Please experiment and report back.
There are minor differences in requirements between Operating system, and between versions of MariaDB.
Turn off most of the Performance_schema. If all the flags are turned on, lots of RAM is consumed.
20 years ago I had MySQL running on my personal 256MB (RAM) Windows box. I suspect today's MariaDB might be too big to work on such tiny machine. Today, the OS is the biggest occupant of any basic machine's disk. If you have only a few MB of data, then disk is not an issue.
Look at it this way -- What is the smallest smartphone you can get? A few GB of RAM and a few GB of "storage". If you cut either of those numbers in half, the phone probably cannot work, even before you add apps.
MariaDB or MySQL both actually use very less memory. About 50 MB to 150 MB is the range I found in some of my servers. These servers are running a few databases, having a handful of tables each and limited user load. MySQL documentation claims in needs 2 GB. That is very confusing to me. I understand why MariaDB does not specify any minimum requirements. If they say 50 MB there are going to be a lot of folks who will want to disagree. If they say 1 GB then they are unnecessarily inflating the minimum requirements. Come to think of it, more memory means better cache and performance. However, a well designed database can do disk reads every time without any performance issues. My apache installs (on the same server) consistently use up more memory (about double) than the database.

The Case of the Missing '14 second SQLite database' performance

I have developed a program which uses SQLite 3.7 ... database, in it there is a rather extensive write/read module that imports , checks and updates data. This process takes 14 seconds on my PC and Im pleased as punch with the performance.
I use transactions for everything with paratetrs my PC is a Intel i7 with 18gig of ram. I have not set anything in the database. I used SQLite Expert to create the database and create the data structures including table and columns and checked that all indexes are created. In other words its all OK.
I have since deployed the program/database to 2 other machines. That 14 second process takes over 5 minutes on the other machines. Same program, identical data, identical database. The machines are upto date, one is a 3rd gen Intel i7 bought last week, the other is quite fast as well so hardware should not be an issue.
Im just not understanding what the problem could be? Is it the database itself ? I have not set anything other then encription on it. Remembering that I run the same and it takes the 14 seconds. Could it be that the database is 'optimised' to my PC ? so when I give it to others its not optimised?
I know I could turn off jurnaling to get better performance, but that would only speed up the process and still would leave the problem.
Any ideas would be welcome.
EDIT:
I have tested the program on my 7yo Dual Athelon with 3gig of ram running XP on HDD, and the procedure took 35 seconds. Well in tolerable limits considering. I just dont get what could be making 2 modern machines take 5 min ?
I have an idea that its a write issue, as using a reader they are slower but quite ecceptable.
SQLite speed is affected most by how well the disk does random reads and writes; any SSD is much more better at this than any rotating disk.
Whenever changes overflow the internal cache, they must be written to disk. You should use PRAGMA cache_size to increase the cache to more than the default 2 MB.
Changed data must be written to disk at the end of every transaction. Make sure that there are as many changes as possible in one transaction.
If much of your processing involves temporary tables or indexes, the speed is affected by the speed of the main disk. If your machines have enough RAM, you can force temporary data to RAM with PRAGMA temp_store.
You should enable Write-Ahead Logging.
Note: the default SQLite distribution does not have encryption.

Resources