SSD performance benefit in OLTP vs OLAP

In which system does an SSD offer a bigger performance benefit over an HDD: an OLTP or an OLAP system?
My guess is that an SSD is more valuable in OLTP, because transactions are constantly occurring and we need fast non-sequential reads and writes. Is this correct?

I'm not experienced with the two systems, but from what I know I would say that an SSD would be most beneficial for an OLTP system. Since OLAP is focused more on complex queries involving aggregation, I'd be inclined to think that read-write times would not be the limiting factor for an OLAP system. That's my two cents.
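If you want to see the random-vs-sequential gap for yourself, a rough micro-benchmark is enough. Below is a minimal Python sketch; the file path and sizes are placeholder assumptions, and the OS page cache will skew results unless the test file is much larger than RAM:

    import os, random, time

    PATH = "testfile.bin"   # hypothetical test file, ideally much larger than RAM
    BLOCK = 4096            # read 4 KB chunks, roughly a database page
    COUNT = 5000

    def bench(random_access):
        fd = os.open(PATH, os.O_RDONLY)          # os.pread below is POSIX-only
        size = os.fstat(fd).st_size
        start = time.perf_counter()
        for i in range(COUNT):
            if random_access:
                off = random.randrange(0, size - BLOCK)   # OLTP-style scattered reads
            else:
                off = i * BLOCK                           # OLAP-style sequential scan
            os.pread(fd, BLOCK, off)
        os.close(fd)
        return COUNT / (time.perf_counter() - start)

    print("sequential reads/sec:", bench(False))
    print("random reads/sec:    ", bench(True))

On an HDD the random case is typically orders of magnitude slower; on an SSD the gap shrinks dramatically, which is why OLTP-style workloads tend to see the bigger win.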
You can read more here.
Hope this helps.

Related

Are Google Cloud Disks OK to use with SQLite?

Google Cloud disks are network disks that behave like local disks. SQLite expects a local disk so that locking and transactions work correctly.
A. Is it safe to use Google Cloud disks for SQLite?
B. Do they support the right locking mechanisms? How is this done over the network?
C. How do disk IOPS and throughput relate to SQLite performance? If I have a 1GB SQLite file with queries that take 40ms to complete locally, how many IOPS would this use? Which disk type should I choose (standard, balanced, or SSD)?
Thanks.
Related
https://cloud.google.com/compute/docs/disks#pdspecs
Persistent disks are durable network storage devices that your instances can access like physical disks
https://www.sqlite.org/draft/useovernet.html
the SQLite library is not tested in across-a-network scenarios, nor is that reasonably possible. Hence, use of a remote database is done at the user's risk.
Yeah, the article you referenced essentially stipulates that since the reads and writes are "simplified" at the OS level, they can be unpredictable, resulting in "lost in translation" issues when going local-network-remote.
They also point out that it may very well work totally fine in testing, and perhaps in production for a time, but there are known side effects which are hard to detect and mitigate against -- so it's a slight gamble.
Again, the implementation they are describing is not Google Cloud Disk, but rather, simply stated, a remote networked arrangement.
My point is more that Google Cloud Disk may be more "virtual" than purely network-attached storage... to my mind that would be where to look, and evaluate it from there.
Check out this thread for some additional insight into the issues: https://serverfault.com/questions/823532/sqlite-on-google-cloud-persistent-disk
Additionally, I was looking around and found this thread, where one poster suggests using SQLite as a read-only asset, then deploying updates in a far more controlled process.
https://news.ycombinator.com/item?id=26441125
The persistent disk acts like a normal disk in your VM, and is only accessible to one VM at a time.
So it's safe to use; you won't lose any data.
For the performance part, you just have to test it for your specific workload. If you have plenty of spare RAM, and your database is read-heavy and seldom writes, the whole database will be cached by the OS (Linux) disk cache, so it will be crazy fast, even on HDD storage.
But if you are low on spare RAM, then the database won't be in the OS cache, and writes are always synced to disk, which causes lots of I/O operations.
In that case, use the highest-performing disk you can / are willing to afford.
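In that spirit of "just test it", here is a minimal Python sketch using the standard sqlite3 module (the DB path on the persistent disk is a hypothetical placeholder). It separates committed writes, which are bounded by the disk's sync latency, from reads, which are mostly served from the OS cache once warm:

    import sqlite3, time

    DB_PATH = "/mnt/disks/pd/test.db"   # hypothetical mount point for the persistent disk
    N = 1000

    conn = sqlite3.connect(DB_PATH)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k INTEGER PRIMARY KEY, v BLOB)")

    # Writes: one transaction (and therefore one sync to disk) per iteration,
    # so this is dominated by the disk's write latency / IOPS.
    start = time.perf_counter()
    for i in range(N):
        with conn:   # commits (and syncs) at the end of each with-block
            conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (i, b"x" * 1024))
    print("committed writes/sec:", N / (time.perf_counter() - start))

    # Reads: after the table has been touched once, most pages sit in the OS
    # cache, so re-run against a cold cache to see the disk's true read latency.
    start = time.perf_counter()
    for i in range(N):
        conn.execute("SELECT v FROM kv WHERE k = ?", (i,)).fetchone()
    print("reads/sec:", N / (time.perf_counter() - start))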

MariaDB recommended RAM, disk, core capacity?

I am not able to find MariaDB's recommended RAM, disk, and core-count capacity. We are setting up an initial deployment with a very minimal data volume, so I just need MariaDB's recommended capacity.
Appreciate your help!
Seeing that microservice architectures have become increasingly common over the last few years, and each microservice usually needs its own database, I think this type of question is actually becoming more relevant.
I was looking for this answer because we were exploring the possibility of creating small databases on many servers, and was wondering, for interest's sake, what the minimum requirements for a MariaDB/MySQL DB would be...
Anyway, I got this helpful answer from here, which I thought I could share in case someone else is looking into it...
When starting up, it (the database) allocates all the RAM it needs. By default, it will use around 400MB of RAM, which isn't noticeable with a database server with 64GB of RAM, but it is quite significant for a small virtual machine. If you add in the default InnoDB buffer pool setting of 128MB, you're well over your 512MB RAM allotment, and that doesn't include anything from the operating system.
1 CPU core is more than enough for most MySQL/MariaDB installations.
512MB of RAM is tight, but probably adequate if only MariaDB is running. But you would need to aggressively shrink various settings in my.cnf. Even 1GB is tiny.
1GB of disk is more than enough for the code and minimal data (I think).
Please experiment and report back.
There are minor differences in requirements between Operating system, and between versions of MariaDB.
Turn off most of the Performance Schema; if all the flags are turned on, lots of RAM is consumed (see the my.cnf sketch after this answer).
20 years ago I had MySQL running on my personal 256MB (RAM) Windows box. I suspect today's MariaDB might be too big to work on such a tiny machine. Today, the OS is the biggest occupant of any basic machine's disk. If you have only a few MB of data, then disk is not an issue.
Look at it this way -- What is the smallest smartphone you can get? A few GB of RAM and a few GB of "storage". If you cut either of those numbers in half, the phone probably cannot work, even before you add apps.
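To make the "aggressively shrink various settings in my.cnf" advice concrete, here is a minimal my.cnf sketch for a small VM. The values are illustrative assumptions for a tiny, low-traffic instance, not recommendations; size them against your actual workload:

    [mysqld]
    # Biggest single knob: InnoDB buffer pool (default 128M)
    innodb_buffer_pool_size = 64M
    # MyISAM key cache; tiny if you only use InnoDB tables
    key_buffer_size = 8M
    # Performance Schema instrumentation costs a lot of RAM when fully enabled
    performance_schema = OFF
    # Each connection carries per-thread buffers, so cap the connection count
    max_connections = 20
    # Per-connection in-memory temp tables
    tmp_table_size = 16M
    max_heap_table_size = 16M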
MariaDB and MySQL both actually use very little memory. About 50 MB to 150 MB is the range I found on some of my servers. These servers are running a few databases, each with a handful of tables and limited user load. The MySQL documentation claims it needs 2 GB, which is very confusing to me. I understand why MariaDB does not specify any minimum requirements: if they say 50 MB, a lot of folks will want to disagree; if they say 1 GB, they are unnecessarily inflating the minimum requirements. Come to think of it, more memory means better caching and performance. However, a well-designed database can do disk reads every time without any performance issues. My Apache installs (on the same server) consistently use more memory (about double) than the database.

ZODB in-memory backend?

I'm currently working on a fairly large project (on the order of hundreds of thousands of active members) and was strongly leaning towards a Plone solution.
I've asked some questions related to it like here and here.
I got some replies from very experienced Plonistas (and active Stack Overflow users as well), which I really appreciate. People keep saying Plone does not scale well to that size, and most of the reasons come down to the ZODB.
Then I thought of an in-memory backend for ZODB. RAM is really cheap now! You can get 128GB for just ~$3k, about ten times the price of a normal $300 128GB SSD, and get ~30GB/s of I/O bandwidth compared to the SSD's ~300MB/s.
An in-memory backend + blobs for binaries + journalling to disk every 10s for backup + discarding all undo history except the last 10s would be an instant kill! It should smoke the RDBMSs and offer full ACID + transactions + object mapping, compared to the likes of Couch*/Redis etc.
Is it technically possible? Is there any implementation? Is it worth implementing (in your opinion)?
There is a memcache option for RelStorage which helps when you need to use a slow database, but really you should probably just leave that sort of caching to your operating system and make sure your database server has plenty of RAM. (If your RAM is large enough then your filesystem cache should already store most of the data.)
An SSD will significantly reduce the worst-case read latencies for random access to data not already in the filesystem cache. It seems silly not to use one now, especially as the Intel 330 SSD is so cheap and has a capacitor equivalent to a battery-backed RAID controller (making writes super fast too).
An all-in-RAM solution can never be considered ACID, as it won't be Durable.
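For what it's worth, ZODB already ships a RAM-only backend, MappingStorage, which is mostly used for tests. The minimal sketch below illustrates exactly the durability trade-off mentioned above: commits succeed, but everything lives only in process memory.

    import transaction
    from ZODB import DB
    from ZODB.MappingStorage import MappingStorage

    storage = MappingStorage("ram")   # in-memory storage, nothing touches disk
    db = DB(storage)
    conn = db.open()
    root = conn.root()

    root["counter"] = 42
    transaction.commit()   # fully transactional, but not durable: a crash loses it all

    conn.close()
    db.close()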
As mentioned in my comment on your other post, it is not the ZODB that is the problem here but Plone's synchronous use of a single contended portal_catalog.
Instead of keeping the entire ZODB in memory, you could mount the portal_catalog in a separate mount point and keep that in memory. I've already seen this kind of configuration, and it works smoothly for about 8k users on standard hardware (2 servers + 1 ZEO server). It may be sufficient for your needs, perhaps with more performant hardware.

SQLite Abnormal Memory Usage

We are trying to integrate SQLite into our application and populate it as a cache. We are planning to use it as an in-memory database, and are using it for the first time. Our application is C++ based.
Our application interacts with the master database to fetch data and performs numerous operations. These operations are generally concerned with one table, which is quite large.
We replicated this table in SQLite and the following are the observations:
Number of Fields: 60
Number of Records: 100,000
As the data population starts, the memory of the application shoots up drastically, from 120MB to ~1.4GB. At this point our application is idle and not doing any major operations, but normally, once the operations start, memory utilization shoots up further. With SQLite as an in-memory DB and this high memory usage, we don't think we will be able to support this many records.
Q. Is there a way to find the size of the database when it is in memory?
When I create the DB on disk, the DB size comes to ~40MB, but the memory usage of the application still remains very high.
Q. Is there a reason for this high usage? All buffers have been cleared and, as said before, the DB is on disk rather than in memory.
Any help would be deeply appreciated.
Thanks and Regards
Sachin
A few questions come to mind...
What is the size of each record?
Do you have memory leak detection tools for your platform?
I used SQLite in a few resource-constrained environments in a way similar to how you're using it, and after fixing bugs it was small, stable and fast.
IIRC it was unclear when to clean up certain things used by the SQLite API, and when we used tools to find the memory leaks it was fairly easy to see where the problem was.
See this:
PRAGMA shrink_memory
This pragma causes the database connection on which it is invoked to free up as much memory as it can, by calling sqlite3_db_release_memory().
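As for the first question (measuring the size of an in-memory database), SQLite can report it itself via page_count * page_size, and the connection's page cache can be capped before asking it to shrink. A minimal sketch using Python's sqlite3 module; the same pragmas apply from the C/C++ API:

    import sqlite3

    conn = sqlite3.connect(":memory:")              # purely in-memory database
    conn.execute("CREATE TABLE t (a, b, c)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                     ((i, "x" * 100, i * 1.5) for i in range(100000)))

    # Size of the database itself: allocated pages times page size.
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    print("database size: %.1f MB" % (pages * page_size / 1e6))

    # Cap the connection's page cache (negative value = KiB) and then ask
    # SQLite to give back whatever memory it can.
    conn.execute("PRAGMA cache_size = -20000")      # ~20 MB cache limit
    conn.execute("PRAGMA shrink_memory")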

Throttle application on the basis of per disk usage or CPU usage

Can anyone recommend a way in which I can throttle an application based on the current disk usage or even CPU usage?
The application I am writing scans files on the hard disk and will be pretty hard-disk intensive in itself.
Can anyone recommend a way in which I can either throttle down my application (or even pause it, for that matter) when disk usage is high (i.e. the user is running a very HDD- or CPU-intensive app)? Basically, my application shouldn't hamper the user's productivity. I know this is a pretty big research topic in itself, but I at least need some cues on how I would approach this.
Help in any form is highly appreciated. :)
Thanks.
Samrat.
Vista added I/O prioritization to Windows, so if you're targeting that platform you can just let the OS take care of it.
For other operating systems, maybe measuring the I/O latency and, if it is over some predefined threshold, sleeping your disk scanner for a bit would work?
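A crude version of that idea, sketched in Python (the threshold and back-off values are arbitrary placeholders): time your own reads, and if they suddenly get slow, assume the disk is contended and back off.

    import time

    CHUNK = 64 * 1024     # read size per step
    THRESHOLD = 0.05      # if one read takes > 50 ms, the disk is probably busy
    BACKOFF = 1.0         # seconds to pause the scanner when contention is detected

    def scan_file(path):
        with open(path, "rb", buffering=0) as f:
            while True:
                start = time.perf_counter()
                data = f.read(CHUNK)
                latency = time.perf_counter() - start
                if not data:
                    break
                # ... hash/inspect the chunk here ...
                if latency > THRESHOLD:   # someone else is hammering the disk
                    time.sleep(BACKOFF)

On Vista and later, the cleaner option is still the built-in background I/O priority mentioned above.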
Take a look at this ("How can I programmatically limit my program’s CPU usage to below 70%?") and this ("Win32 Thread scheduling#The Larry Osterman answer")
