SQLite vacuuming / fragmentation and performance degradation

Let's say I periodically insert data into a SQLite database, then purge the first 50% of the data, but I don't vacuum.
Do I have something like zeroed-out pages for the first 50% of the file now?
If I add another batch of data, am I filling in those zeroed-out pages?
The manual mentions fragmentation of data:
Frequent inserts, updates, and deletes can cause the database file to become fragmented - where data for a single table or index is scattered around the database file.
VACUUM ensures that each table and index is largely stored contiguously within the database file. In some cases, VACUUM may also reduce the number of partially filled pages in the database, reducing the size of the database file further.
But it doesn't indicate that there's necessarily a performance degradation from this.
It mostly hints at the wasted space that could be saved from vacuuming.
Is there a noticeable performance gain for data in strictly contiguous pages?
Could I expect "terrible" performance from a database with a lot of fragmented data?

SQLite automatically reuses free pages.
Fragmented pages can result in performance degradation only if
the amount of data is so large that it cannot be cached, and
your storage device is relatively slow at seeking (e.g. hard disks or cheap flash devices), and
you access the data often enough that the difference matters.
There is only one way to find out whether this is the case for your application: measure it.
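If you want to see how much reusable space the file currently holds, SQLite exposes it through pragmas. A minimal sketch in Go, assuming the mattn/go-sqlite3 driver and a placeholder file name:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3" // assumed driver; any SQLite client can run the same pragmas
)

func main() {
	db, err := sql.Open("sqlite3", "app.db") // placeholder database file
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var pageCount, freePages, pageSize int64
	// Total pages in the file, pages on the free list, and the page size in bytes.
	if err := db.QueryRow("PRAGMA page_count").Scan(&pageCount); err != nil {
		log.Fatal(err)
	}
	if err := db.QueryRow("PRAGMA freelist_count").Scan(&freePages); err != nil {
		log.Fatal(err)
	}
	if err := db.QueryRow("PRAGMA page_size").Scan(&pageSize); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("file: %d pages (%d KiB), reusable free pages: %d (%d KiB)\n",
		pageCount, pageCount*pageSize/1024, freePages, freePages*pageSize/1024)
}
```

After purging the first 50% of the rows you should see freelist_count jump, and after the next batch of inserts it should shrink again while page_count stays roughly flat, which is the page reuse described above.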

Related

High memory consumption in HANA table partitioning

I have a big table with around 4 billion records. The table is already partitioned, but I need to perform the partitioning again. While re-partitioning, the memory consumption of the HANA system reached its limit of 4 TB and started impacting other systems.
How can we optimize the partitioning so that it completes without consuming that much memory?
To re-partition tables, both the original table structure and the new table structure need to be kept in memory at the same time.
For the target table structures, data will be inserted into delta stores and later on merged, which again consumes memory.
To increase performance, re-partitioning happens in parallel threads, which, you may guess, again uses additional memory.
The administration guide provides a hint to lower the number of parallel threads:
Parallelism and Memory Consumption
Partitioning operations consume a high amount of memory. To reduce the memory consumption, it is possible to configure the number of threads used. You can change the default value of the parameter split_threads in the partitioning section of the indexserver.ini configuration file.
By default, 16 threads are used. In the case of a parallel partition/merge, the individual operations use a total of the configured number of threads for each host. Each operation takes at least one thread.
So, lowering split_threads is the online option for re-partitioning when your system does not have enough memory for many parallel threads.
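For reference, parameters like split_threads are normally changed with ALTER SYSTEM ALTER CONFIGURATION. A hedged sketch, issued from Go via the SAP go-hdb driver (the driver choice, DSN and example value 4 are my assumptions; check the statement against the administration guide for your HANA version):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/SAP/go-hdb/driver" // assumed driver; any HANA SQL client can run the same statement
)

func main() {
	db, err := sql.Open("hdb", "hdb://USER:PASSWORD@hanahost:30015") // placeholder DSN
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Lower the number of parallel re-partitioning threads (default 16).
	// Section and parameter name are quoted from the admin guide above; the value 4 is only an example.
	_, err = db.Exec(`ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
		SET ('partitioning', 'split_threads') = '4' WITH RECONFIGURE`)
	if err != nil {
		log.Fatal(err)
	}
}
```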
Alternatively, you may consider an offline re-partitioning that would involve exporting the table (as CSV!), truncating(!) the table, altering the partitioning on the now empty table and re-importing the data.
Note that I wrote "truncate", as this preserves all privileges and references to the table (views, synonyms, roles, etc.) which would be lost if you dropped and recreated the table.
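The offline route, sketched as the sequence of statements described above (schema, table, partition specification, export path and import options are placeholders; verify the exact EXPORT/IMPORT syntax against the SQL reference for your HANA version):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/SAP/go-hdb/driver" // assumed driver
)

func main() {
	db, err := sql.Open("hdb", "hdb://USER:PASSWORD@hanahost:30015") // placeholder DSN
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	steps := []string{
		// 1. Export the data as CSV to a path the server can write to (placeholder path).
		`EXPORT "MYSCHEMA"."BIGTABLE" AS CSV INTO '/hana/export/bigtable' WITH REPLACE`,
		// 2. Empty the table but keep the object, so privileges, views and synonyms survive.
		`TRUNCATE TABLE "MYSCHEMA"."BIGTABLE"`,
		// 3. Re-partition the now empty table; cheap, since there is no data to move.
		`ALTER TABLE "MYSCHEMA"."BIGTABLE" PARTITION BY HASH ("ID") PARTITIONS 16`,
		// 4. Re-import the data into the new layout (file name and options depend on the export).
		`IMPORT FROM CSV FILE '/hana/export/bigtable/BIGTABLE.csv' INTO "MYSCHEMA"."BIGTABLE" WITH THREADS 4 BATCH 10000`,
	}
	for _, stmt := range steps {
		if _, err := db.Exec(stmt); err != nil {
			log.Fatalf("step failed: %v\n%s", err, stmt)
		}
	}
}
```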

Embedded key-value db vs. just storing one file per key?

I'm confused about the advantage of embedded key-value databases over the naive solution of just storing one file on disk per key. For example, databases like RocksDB, Badger, SQLite use fancy data structures like B+ trees and LSMs but seem to get roughly the same performance as this simple solution.
For example, Badger (which is the fastest Go embedded db) takes about 800 microseconds to write an entry. In comparison, creating a new file from scratch and writing some data to it takes 150 mics with no optimization.
EDIT: to clarify, here's the simple implementation of a key-value store I'm comparing with the state of the art embedded dbs. Just hash each key to a string filename, and store the associated value as a byte array at that filename. Reads and writes are ~150 mics each, which is faster than Badger for single operations and comparable for batched operations. Furthermore, the disk space is minimal, since we don't store any extra structure besides the actual values.
I must be missing something here, because the solutions people actually use are super fancy and optimized using things like bloom filters and B+ trees.
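A minimal Go sketch of the naive store described above (illustrative only, not the asker's actual code): hash the key to a filename and store the value as the file's contents.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"os"
	"path/filepath"
)

// FileStore is a naive key-value store: one file per key, no index, no log.
type FileStore struct{ dir string }

func (s FileStore) path(key string) string {
	sum := sha256.Sum256([]byte(key)) // hash the key to get a safe, fixed-length filename
	return filepath.Join(s.dir, hex.EncodeToString(sum[:]))
}

// Put writes the value as the entire contents of the key's file.
func (s FileStore) Put(key string, value []byte) error {
	return os.WriteFile(s.path(key), value, 0o644)
}

// Get reads the value back; the error wraps os.ErrNotExist if the key is absent.
func (s FileStore) Get(key string) ([]byte, error) {
	return os.ReadFile(s.path(key))
}

func main() {
	store := FileStore{dir: os.TempDir()}
	if err := store.Put("vehicle:42", []byte(`{"lat":1.23,"lon":4.56}`)); err != nil {
		panic(err)
	}
	if _, err := store.Get("vehicle:42"); err != nil {
		panic(err)
	}
}
```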
But Badger is not about writing "an" entry:
My writes are really slow. Why?
Are you creating a new transaction for every single key update? This will lead to very low throughput.
To get best write performance, batch up multiple writes inside a transaction using single DB.Update() call.
You could also have multiple such DB.Update() calls being made concurrently from multiple goroutines.
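A sketch of that batched pattern against the Badger API (the import path, directory, keys and counts here are my placeholders):

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v4" // assumed import path; adjust to the Badger version you use
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger-demo")) // placeholder directory
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// One db.Update per key would commit (and sync) 1000 times; batching commits once.
	err = db.Update(func(txn *badger.Txn) error {
		for i := 0; i < 1000; i++ {
			key := []byte(fmt.Sprintf("key-%04d", i))
			if err := txn.Set(key, []byte("value")); err != nil {
				return err // for very large batches, handle badger.ErrTxnTooBig by committing in chunks
			}
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```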
That batching advice leads to issue 396:
I was looking for fast storage in Go and so my first try was BoltDB. I need a lot of single-write transactions. Bolt was able to do about 240 rq/s.
I just tested Badger and I got a crazy 10k rq/s. I am just baffled
That is because:
LSM tree has an advantage compared to B+ tree when it comes to writes.
Also, values are stored separately in value log files so writes are much faster.
You can read more about the design here.
One of the main points (hard to replicate with simple reads and writes of files) is:
Key-Value separation
The major performance cost of LSM-trees is the compaction process. During compactions, multiple files are read into memory, sorted, and written back. Sorting is essential for efficient retrieval, for both key lookups and range iterations. With sorting, the key lookups would only require accessing at most one file per level (excluding level zero, where we’d need to check all the files). Iterations would result in sequential access to multiple files.
Each file is of fixed size, to enhance caching. Values tend to be larger than keys. When you store values along with the keys, the amount of data that needs to be compacted grows significantly.
In Badger, only a pointer to the value in the value log is stored alongside the key. Badger employs delta encoding for keys to reduce the effective size even further. Assuming 16 bytes per key and 16 bytes per value pointer, a single 64MB file can store two million key-value pairs.
Your question assumes that the only operations needed are single random reads and writes. Those are the worst-case scenarios for log-structured merge (LSM) approaches like Badger or RocksDB. A range query, where all keys or key-value pairs in a range get returned, leverages sequential reads (thanks to the adjacency of sorted key-value pairs within files) to read data at very high speeds. For Badger, you mostly get that benefit for key-only or small-value range queries, since keys live in the LSM tree while large values are appended to a not-necessarily-sorted log file. For RocksDB, you'll get fast key-value pair range queries.
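The key-only range scans mentioned above correspond to iterating with value prefetching turned off, so the scan never touches the value log. A sketch, under the same assumed import path and a placeholder prefix:

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v4" // assumed import path
)

// scanKeys walks all keys with the given prefix without reading values,
// so the scan stays inside the (cache-friendly) LSM tree.
func scanKeys(db *badger.DB, prefix []byte) ([][]byte, error) {
	var keys [][]byte
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchValues = false // key-only iteration
		it := txn.NewIterator(opts)
		defer it.Close()
		for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
			keys = append(keys, it.Item().KeyCopy(nil)) // copy: Key() is only valid during iteration
		}
		return nil
	})
	return keys, err
}

func main() {
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger-demo")) // placeholder directory
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	keys, err := scanKeys(db, []byte("key-"))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("found %d keys", len(keys))
}
```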
The previous answer somewhat addresses the advantage for writes: the use of buffering. If you write many key-value pairs, rather than storing each in a separate file, LSM approaches hold them in memory and eventually flush them in a single file write. There's no free lunch, so asynchronous compaction must be done to remove overwritten data and to prevent queries from having to check too many files.
Previously answered here. Mostly similar to the other answers provided here, but it makes one important additional point: files in a filesystem can't occupy the same block on disk. If your records are, on average, significantly smaller than the typical disk block size (4-16 KiB), storing them as separate files will incur substantial storage overhead.

Fragmentation in SQLite used in a round-robin fashion without VACUUM

There's an SQLite database being used to store static-sized data in a round-robin fashion.
For example, 100 days of data are stored. On day 101, day 1 is deleted and then day 101 is inserted.
The number of rows is the same between days. The individual fields in the rows are all integers (32-bit or less) and timestamps.
The database is stored on an SD card with poor I/O speed,
something like a read speed of 30 MB/s.
VACUUM is not allowed because it can introduce a wait of several seconds
and the writers to that database can't be allowed to wait for write access.
So the concern is fragmentation, because I'm inserting and deleting records constantly
without VACUUMing.
But since I'm deleting/inserting the same set of rows each day,
will the data get fragmented?
Is SQLite fitting day 101's data in day 1's freed pages?
And although the set of rows is the same,
the integers may be 1 byte one day and 4 bytes another.
The database also has several indexes, and I'm unsure where they're stored
and if they interfere with the perfect pattern of freeing pages and then re-using them.
(SQLite is the only technology that can be used. Can't switch to a TSDB/RRDtool, etc.)
SQLite will reuse free pages, so you will get fragmentation (if you delete so much data that entire pages become free).
However, SD cards are likely to have a flash translation layer, which introduces fragmentation whenever you write to some random sector.
Whether the first kind of fragmentation is noticeable depends on the hardware, and on the software's access pattern.
It is not possible to make useful predictions about that; you have to measure it.
In theory, WAL mode is append-only, and thus easier on the flash device.
However, checkpoints would be nearly as bad as VACUUMs.
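If WAL mode is still worth trying, one option is to disable automatic checkpoints so that writers never pay the checkpoint cost, and run an explicit checkpoint from a housekeeping task at a quiet moment. A sketch in Go, assuming the mattn/go-sqlite3 driver and a placeholder file name:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3" // assumed driver
)

func main() {
	db, err := sql.Open("sqlite3", "roundrobin.db") // placeholder database file
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	db.SetMaxOpenConns(1) // pragmas apply per connection, so keep this sketch on a single connection

	// Append-only WAL journal: easier on the SD card than rollback-journal rewrites.
	var mode string
	if err := db.QueryRow("PRAGMA journal_mode=WAL").Scan(&mode); err != nil {
		log.Fatal(err)
	}
	// Disable automatic checkpoints so that writers never block on one.
	if _, err := db.Exec("PRAGMA wal_autocheckpoint=0"); err != nil {
		log.Fatal(err)
	}

	// ... the usual daily delete/insert cycle runs elsewhere ...

	// Housekeeping step, run when no writer needs the database for a while:
	// fold the WAL back into the main file and truncate the WAL.
	if _, err := db.Exec("PRAGMA wal_checkpoint(TRUNCATE)"); err != nil {
		log.Fatal(err)
	}
}
```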

Best way to store 100,000+ CSV text files on server?

We have an application which will need to store thousands of fairly small CSV files. 100,000+ and growing annually by the same amount. Each file contains around 20-80KB of vehicle tracking data. Each data set (or file) represents a single vehicle journey.
We are currently storing this information in SQL Server, but the size of the database is getting a little unwieldy and we only ever need to access the journey data one file at a time (so there is no need to query it in bulk or otherwise keep it in a relational database). The performance of the database is degrading as we add more tracks, due to the time taken to rebuild or update indexes when inserting or deleting data.
There are 3 options we are considering:
We could use the FILESTREAM feature of SQL to externalise the data into files, but I've not used this feature before. Would Filestream still result in one physical file per database object (blob)?
Alternatively, we could store the files individually on disk. There could end up being half a million of them after 3+ years. Will the NTFS file system cope OK with that amount?
If lots of files is a problem, should we consider grouping the datasets/files into a small database (one per user)? Is there a very lightweight database like SQLite that can store files?
One further point: the data is highly compressible. Zipping the files reduces them to only 10% of their original size. I would like to utilise compression if possible to minimise disk space used and backup size.
I have a few thoughts, and this is very subjective, so your mileage and other readers' mileage may vary, but hopefully it will still get the ball rolling for you even if other folks want to put differing points of view...
Firstly, I have seen performance issues with folders containing too many files. One project got around this by creating 256 directories, called 00, 01, 02... fd, fe, ff and inside each one of those a further 256 directories with the same naming convention. That potentially divides your 500,000 files across 65,536 directories giving you only a few in each - if you use a good hash/random generator to spread them out. Also, the filenames are pretty short to store in your database - e.g. 32/af/file-xyz.csv. Doubtless someone will bite my head off, but I feel 10,000 files in one directory is plenty to be going on with.
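A sketch of that two-level fan-out (the hash, naming scheme and paths are illustrative, not from the project described):

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// shardedPath maps a journey/file identifier to a two-level directory tree:
// root/ab/cd/<id>.csv, where ab and cd come from a hash of the id.
// With 256*256 = 65,536 buckets, half a million files averages about 8 per directory.
func shardedPath(root, id string) string {
	sum := sha1.Sum([]byte(id))
	h := hex.EncodeToString(sum[:])
	return filepath.Join(root, h[0:2], h[2:4], id+".csv")
}

func main() {
	p := shardedPath("/data/tracks", "journey-123456") // placeholder root and id
	if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		panic(err)
	}
	fmt.Println(p) // store the short relative path (e.g. 3f/a2/journey-123456.csv) in the database
}
```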
Secondly, 100,000 files of 80kB amounts to 8GB of data which is really not very big these days - a small USB flash drive in fact - so I think any arguments about compression are not that valid - storage is cheap. What could be important though, is backup. If you have 500,000 files you have lots of 'inodes' to traverse and I think the statistic used to be that many backup products can only traverse 50-100 'inodes' per second - so you are going to be waiting a very long time. Depending on the downtime you can tolerate, it may be better to take the system offline and back up from the raw, block device - at say 100MB/s you can back up 8GB in 80 seconds and I can't imagine a traditional, file-based backup can get close to that. Alternatives may be a filesystem that permits snapshots, and then you can back up from a snapshot. Or a mirrored filesystem which permits you to split the mirror, back up from one copy and then rejoin the mirror.
As I said, pretty subjective and I am sure others will have other ideas.
I work on an application that uses a hybrid approach, primarily because we wanted our application to be able to work (in small installations) in freebie versions of SQL Server...and the file load would have thrown us over the top quickly. We have gobs of files - tens of millions in large installations.
We considered the same scenarios you've enumerated, but what we eventually decided to do was to have a series of moderately large (2gb) memory mapped files that contain the would-be files as opaque blobs. Then, in the database, the blobs are keyed by blob-id (a sha1 hash of the uncompressed blob), and have fields for the container-file-id, offset, length, and uncompressed-length. There's also a "published" flag in the blob-referencing table. Because the hash faithfully represents the content, a blob is only ever written once. Modified files produce new hashes, and they're written to new locations in the blob store.
In our case, the blobs weren't consistently text files - in fact, they're chunks of files of all types. Big files are broken up with a rolling-hash function into roughly 64k chunks. We attempt to compress each blob with lz4 compression (which is way fast compression - and aborts quickly on effectively-incompressible data).
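Illustratively, the index row described above boils down to something like the following (my sketch, not the poster's actual schema): the content hash is the key, and the row records where the stored bytes live inside a container file.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// BlobRef is the kind of index row kept per blob: the content hash identifies the
// blob, and the remaining fields say where its bytes live inside a container file.
type BlobRef struct {
	BlobID          string // sha1 of the *uncompressed* content, so identical content is stored only once
	ContainerFileID int    // which ~2 GB memory-mapped container file holds the bytes
	Offset          int64  // byte offset of the (possibly lz4-compressed) blob inside that container
	Length          int64  // stored length
	UncompressedLen int64  // original length, needed to size the decompression buffer
	Published       bool
}

// blobID computes the content-addressed key. A write then becomes: hash the content,
// and only if no row with that hash exists, append the bytes to a container and insert a row.
func blobID(uncompressed []byte) string {
	sum := sha1.Sum(uncompressed)
	return hex.EncodeToString(sum[:])
}

func main() {
	data := []byte("20-80 KB of CSV track data ...") // placeholder content
	ref := BlobRef{
		BlobID:          blobID(data),
		ContainerFileID: 3,
		Offset:          1 << 20,
		Length:          int64(len(data)),
		UncompressedLen: int64(len(data)),
	}
	fmt.Printf("%+v\n", ref)
}
```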
This approach works really well, but isn't lightly recommended. It can get complicated. For example, grooming the container files in the face of deleted content. For this, we chose to use sparse files and just tell NTFS the extents of deleted blobs. Transactional demands are more complicated.
All of the goop for db-to-blob-store is C# with a little interop for the memory-mapped files. Your scenario sounds similar, but somewhat less demanding. I suspect you could get away without the memory-mapped I/O complications.

Most bandwidth efficient unidirectional synchronise (server to multiple clients)

What is the most bandwidth efficient way to unidirectionally synchronise a list of data from one server to many clients?
I have a sizeable chunk of data (perhaps 20,000 50-byte records) which I need to periodically synchronise to a series of clients over the Internet (perhaps 10,000 clients). Records may be added, removed or updated only at the server end.
Something similar to bittorrent? Or even using bittorrent. Or maybe invent a wrapper around bittorrent.
(Assuming you pay for bandwidth on your server and not the others ...)
Ok, so we've got some detail now - perhaps 10 GB of total (uncompressed) data, every 3 days, so that's 100 GB per month.
That's actually not really a sizeable chunk of data these days. Whose bandwidth are you trying to save - yours, or your clients'?
Does the data perhaps compress very readily? For raw binary data it's not uncommon to achieve 50% compression, and if the data happens to have a lot of repeated patterns within it then 80%+ is possible.
That said, if you really do need a system that can just transfer the changes, my thoughts are as follows (a rough sketch follows the list):
make sure you've got a well defined primary key field - use that as your key to identify each record
record a timestamp for each record to say when it last changed
have each client tell you the timestamp of the last change it knows of, so you can calculate the deltas
ensure that full downloads are possible too, in case clients get out of sync
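Put together, the delta exchange can be as small as "client sends the last change timestamp it has seen, server returns everything newer". A rough Go sketch (the table, column names, endpoint and SQLite backend are all invented for illustration; deletions would additionally need tombstones or a deleted_at column):

```go
package main

import (
	"database/sql"
	"encoding/json"
	"log"
	"net/http"
	"strconv"

	_ "github.com/mattn/go-sqlite3" // assumed driver; any SQL backend works the same way
)

// Record is one synchronised row; UpdatedAt is the server-side change timestamp.
type Record struct {
	ID        string `json:"id"`
	Payload   string `json:"payload"`
	UpdatedAt int64  `json:"updated_at"`
}

func main() {
	db, err := sql.Open("sqlite3", "server.db") // placeholder database with a hypothetical "records" table
	if err != nil {
		log.Fatal(err)
	}

	// GET /sync?since=<unix-ts>: return every record changed after the client's last sync.
	// since=0 is a full download, which covers clients that have fallen out of sync.
	http.HandleFunc("/sync", func(w http.ResponseWriter, r *http.Request) {
		since, _ := strconv.ParseInt(r.URL.Query().Get("since"), 10, 64)
		rows, err := db.Query(
			"SELECT id, payload, updated_at FROM records WHERE updated_at > ? ORDER BY updated_at",
			since)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer rows.Close()

		var out []Record
		for rows.Next() {
			var rec Record
			if err := rows.Scan(&rec.ID, &rec.Payload, &rec.UpdatedAt); err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			out = append(out, rec)
		}
		json.NewEncoder(w).Encode(out) // the client stores the largest updated_at it received as its new "since"
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Compressing the response (e.g. gzip at the HTTP layer) would also address the earlier point about the data compressing readily.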
