Compression in SQLite database .NET C#

Can BLOB (or any other data) be compressed (on the fly) into an SQLite3 database?
I'm using System.Data.SQLite in C# targeting .net 3.5.
In my search I keep seeing connection strings like so...
"datasource=base.db;version=3;compress=true;"
But I can find no information about compress=true. I've added it to my connection string for testing and stored a file as BLOB data, but there appears to be no compression. I tested with, for example, a large text file, not an already-compressed image or other file.

To my limited knowledge, the "compress=True" parameter in the System.Data.SQLite connection string is not taken into account.
Basically, it is useless: System.Data.SQLite provides no built-in compression.
To bring compression to the System.Data.SQLite world, you basically have two options:
Use third-party tools, i.e. build System.Data.SQLite against one of the C libraries that provide this: the proprietary SQLite extensions (SEE, CEROD, ZIPVFS), SQLite Crypt, etc. Devart's commercial ADO.NET provider supports the encryption options I just mentioned, and it seems they can be used seamlessly (although I personally have never used Devart much).
Create a wrapper around System.Data.SQLite: convert all your data, prepend headers describing the encapsulated payload, and insert everything as collections of bytes with BLOB affinity (see the sketch below). You can also add an encryption step at the same time. Of course, this will be (much) slower than the third-party C libraries and may not be sufficient depending on your requirements (insertion, update, and deletion speed), but at least it is a free (if time-consuming) approach. It could be done through a dedicated SQLite ORM; it should not be that hard to implement, but it is quite time-consuming.
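To make the wrapper idea concrete, here is a minimal sketch of option two: compressing the payload with GZipStream before storing it as a BLOB. The "files" table and its columns are invented for this example.

```csharp
using System.Data.SQLite;
using System.IO;
using System.IO.Compression;

public static class CompressedBlobStore
{
    // Gzip a byte array; GZipStream has been available since .NET 2.0,
    // so this works on the 3.5 target mentioned in the question.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Insert the compressed bytes as a BLOB. The "files" table and its
    // columns are hypothetical.
    public static void Insert(SQLiteConnection conn, string name, byte[] raw)
    {
        using (var cmd = new SQLiteCommand(
            "INSERT INTO files (name, content) VALUES (@name, @content)", conn))
        {
            cmd.Parameters.AddWithValue("@name", name);
            cmd.Parameters.AddWithValue("@content", Compress(raw));
            cmd.ExecuteNonQuery();
        }
    }
}
```

Reading the data back is the mirror image: SELECT the BLOB and decompress it through a GZipStream with CompressionMode.Decompress.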

Related

ASP.NET Core - Indexing and searching JSON files

I have close to 10K JSON files (all very small). I would like to provide search functionality over them. Since these JSON files are fixed for a specific release, I am thinking of pre-indexing the files and loading the index during startup of the website. I don't want to use an external search engine.
I am searching for libraries to support this. Lucene.NET is one popular library, but I am not sure whether it supports loading pre-built index data.
Index the JSON documents and store the index results (probably in a single file), then save them to a file storage service like S3 - console app.
Load the index file and respond to queries - ASP.NET Core app.
I am not sure whether this is possible. What options are available?
Since S3 is not a .NET-specific technology and Lucene.NET is a line-by-line port of Lucene, you can expand your search to include Lucene-related questions. There is an answer here that points to an S3 Directory implementation meant for Lucene that could be ported to .NET. But, by the author's own admission, its performance is not great.
NOTE: I don't consider this a duplicate question, because the answer most appropriate to you is not the accepted one: you explicitly stated you don't want to use an external solution.
There are a couple of implementations for Lucene.NET that use Azure instead of AWS here and here. You may be able to get some ideas that help you to create a more optimal solution for S3, but creating your own Directory implementation is a non-trivial task.
Can IndexReader read index file from in-memory string?
It is possible to use a RAMDirectory, which has a copy constructor that loads an entire index from disk into memory. The copy constructor is only useful if your files are on disk, though; you could potentially read the files from S3 and put them into a RAMDirectory. This option is fast for small indexes but will not scale if your index grows over time. It is also not optimized for high-traffic websites with multiple concurrent threads performing searches.
From the documentation:
Warning: This class is not intended to work with huge indexes. Everything beyond several hundred megabytes will waste resources (GC cycles), because it uses an internal buffer size of 1024 bytes, producing millions of byte[1024] arrays. This class is optimized for small memory-resident indexes. It also has bad concurrency on multithreaded environments. It is recommended to materialize large indexes on disk and use MMapDirectory, which is a high-performance directory implementation working directly on the file system cache of the operating system, so copying data to heap space is not useful.
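As a rough sketch, loading a small on-disk index into memory via the copy constructor looks something like this (assuming Lucene.NET 4.8.0-beta; the index path is a placeholder):

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Copy a small on-disk index entirely into memory.
// The path is a placeholder; error handling is omitted.
using var diskDir = FSDirectory.Open("/indexes/release-1.0");
using var ramDir = new RAMDirectory(diskDir, IOContext.DEFAULT);
using var reader = DirectoryReader.Open(ramDir);
var searcher = new IndexSearcher(reader);
```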
When you call the FSDirectory.Open() method, it chooses a Directory implementation that is optimized for the current operating system. In most cases it returns MMapDirectory, an implementation that uses the System.IO.MemoryMappedFiles.MemoryMappedFile class under the hood with multiple views. This option will scale much better if the index is large or if there are many concurrent users.
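By contrast, letting FSDirectory.Open choose the implementation is roughly this (again a sketch against Lucene.NET 4.8.0-beta with an invented path):

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// FSDirectory.Open picks the best Directory for the current OS,
// typically MMapDirectory on 64-bit systems. The path is a placeholder.
using var dir = FSDirectory.Open("/indexes/release-1.0");
using var reader = DirectoryReader.Open(dir);
var searcher = new IndexSearcher(reader);
```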
To use Lucene.NET's built-in index file optimizations, you must put the index files in a medium that can be read like a normal file system. Rather than trying to roll a Lucene.NET solution that uses S3's APIs, you might want to look into using S3 as a file system instead. I am not sure how that would perform compared to a local file system, though.

What are the best practices/setup for applying caching techniques in web application?

I want to apply caching techniques to improve my ASP.NET web application's performance. I am going to use the default .NET cache. I also want to store the data in XML files, so that if the system fails to find the data in the cache, I can use the XML files as a secondary option. Is this workflow reasonable or standard? Will the file I/O degrade performance instead of improving it, or break system integrity? The data volume will be medium and the number of files will be around 1k~2k.
Using XML files as a data source seems like a rather unorthodox approach. A more common way would be to use a database as the data source and something like the distributed Redis cache for caching.
See the docs for further information.
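For illustration, wiring up the distributed Redis cache in ASP.NET Core looks roughly like this (a sketch assuming the Microsoft.Extensions.Caching.StackExchangeRedis package; the connection string and instance name are placeholders):

```csharp
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Registers IDistributedCache backed by Redis.
        services.AddStackExchangeRedisCache(options =>
        {
            options.Configuration = "localhost:6379"; // placeholder Redis endpoint
            options.InstanceName = "MyApp:";           // placeholder key prefix
        });
    }
}
```

Consumers then take an IDistributedCache dependency and call GetStringAsync/SetStringAsync rather than touching files directly.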

Will using dblinq to SQLite in a Windows Store app pass store validation?

I've been trying to figure out how to get a decent LINQ to something working for ORM database access in a Windows Store app.
All I've found is SQLite and the sqlite-net NuGet package. The latter sucks a bit, as I don't get any .dbml-like structure that resolves relationships and provides navigation properties for easy querying (no manual joins needed then).
I was wondering:
Does dblinq in combination with SQLite offer this?
Will using this pass Windows Store validation?
Thank you !
Update: Some links I used in my research:
The famous Tim Heuer post on SQLite and Windows 8: http://timheuer.com/blog/archive/2012/08/07/updated-how-to-using-sqlite-from-windows-store-apps.aspx
DBlinq: http://code.google.com/p/dblinq2007/
sqlite-net: http://code.google.com/p/sqlite-net/
Interesting discussion stating ADO.NET is not possible: http://social.msdn.microsoft.com/Forums/en-US/winappswithcsharp/thread/e9cdd75d-03e4-4577-988e-4c02a52e3f50
I'm not familiar with dblinq, but looking at the SQLite tests in the project, it seems the library offers what you're looking for, i.e. navigation properties for relationships between different tables.
Since dblinq is a .NET library, using it shouldn't make store validation fail. There is another problem, though: you can't use a plain .NET library in a Windows Store application; only Windows Store class libraries and portable class libraries are allowed. Since the source for the library is available, you can try compiling it as a Windows Store class library, but I'm afraid some classes that dblinq depends on will be missing, which might make it difficult to port.
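To illustrate what the question means by sqlite-net lacking navigation properties: related rows have to be fetched with hand-written SQL joins, roughly like this (a sketch; the classes, file name, and query are invented):

```csharp
using SQLite;

public class Author
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Book
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public int AuthorId { get; set; } // plain foreign-key column, no navigation property
    public string Title { get; set; }
}

// sqlite-net creates tables from attributed POCOs...
var db = new SQLiteConnection("books.db");
db.CreateTable<Author>();
db.CreateTable<Book>();

// ...but related rows must be fetched with a hand-written join.
var books = db.Query<Book>(
    "SELECT b.* FROM Book b JOIN Author a ON a.Id = b.AuthorId WHERE a.Name = ?",
    "Tolkien");
```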

Cross-platform ORM for C# that's free and not NHibernate?

I've got a client who wants an ASP.NET MVC application. I'll develop it with VS.NET 2010 Express, demo it to him on my Linux server during its development (Mono supports ASP.NET MVC), and he'll eventually host it on a commercial provider running IIS.
Getting this done quickly is the name of the game. The only piece I'm missing here is the database layer. Ideally I'd use SQL CE and EF4. But SQL CE only works on Windows, and Mono doesn't support the Entity Framework anyway.
The only free LINQ to SQL-like option I see is DbLinq. A quick test with it on a MySQL database had it erroring out on a table that had two foreign keys to a single primary key. A Google search shows that this bug was identified, and a patch created, about two years ago. That the patch still hasn't been applied to the main source, and that the bug affects such a common scenario, does not fill me with confidence in the production-readiness of DbLinq.
Even if it did work, it would have to be with MySQL, as that's the only database I can expect to be available on both Linux and an eventual Windows server. (SQLite, Berkeley DB, etc. would all require native drivers to be installed on the server, which I can't count on.)
I don't know NHibernate. But from what I read, it requires manually creating XML mapping files... so I don't have to write SQL statements, but I do have to create mapping files? (Plus I'd need to learn how to use it.) Like I said above: Getting this done quickly is a goal here.
If I must, I will just pony up the $5 a month or so for a cheap ASP.NET hosting provider and use that to demo progress to the client, using SQL CE and EF4. But before I do that I'd just like to see if there are any other viable options. (It's kind of mostly an intellectual exercise by this point.)
So... any tips?
Does it really have to be a full-blown ORM?
I recommend having a look at some of the so-called "micro-ORMs", especially my favourite: PetaPoco (http://www.toptensoftware.com/petapoco/).
PetaPoco runs perfectly under Mono and has incredible performance. Even better, because of the small code size (~1k lines of C#) it is very easy to understand what's going on under the hood, and you can easily change/extend the code to your needs. To get started you just copy the single .cs file into your project and you are ready to go.
PetaPoco has a very good POCO-mapping heuristic, so in most cases you will get your C# objects out of the database with zero configuration.
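For a feel of that zero-configuration style, a PetaPoco query is roughly this (a sketch; the connection string, table, and POCO are made up):

```csharp
// A POCO mapped by convention; no attributes or XML needed.
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Author { get; set; }
}

// Classic PetaPoco: one Database object, raw SQL in, POCOs out.
// The connection string is a placeholder.
var db = new PetaPoco.Database(
    "Server=localhost;Database=demo;Uid=app;Pwd=secret;",
    "MySql.Data.MySqlClient");

// @0, @1, ... is PetaPoco's numbered-parameter syntax.
var articles = db.Fetch<Article>(
    "SELECT id, title, author FROM articles WHERE author = @0", "jon");
```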
You COULD try LINQ to SQL. It has been partially supported under Mono since 2.6, and under Mono it supports many databases (see the Mono 2.6 release notes; the Mono team worked with the DbLINQ people to make it happen).
Ah... forget about learning NHibernate quickly. It's very good, but it's quite a beast. And creating the XML files is the least of it (besides, NHibernate 3.2 added its own version of fluent interfaces, so I think the XML files aren't necessary anymore: you can "code" your XML).

SQLite & Versioning Systems

Foreword: I am not trying to write an alternative either to Subversion or to any other versioning system.
I wonder if SQLite has what it takes to replace the usual repositories of versioning systems with a single database file where different versions are stored as BLOBs?
Fossil is a version control system built on SQLite. It uses a single database file, storing the versions as BLOBs.
Not all version control systems use the filesystem.
In fact, one such distributed version control system, Monotone, already uses SQLite for storage. The FAQ entry "Why an embedded SQL database, instead of Berkeley DB?" gives some rationale for this choice. The FAQ doesn't address "why not filesystem storage", though.
Even SVN, at least historically, supports an alternative BDB repository data-store. While this is not SQLite, it is easy to imagine SQLite functioning as a "super" BDB that adds SQL as an interface. (Actually, BDB can even be used as an SQLite back-end, for a fee :-)
Keep in mind that, no matter where the data (diffs/deltas) is stored, it all ends up as some form of "BLOB": a BDB value, data in a file, or a BLOB column in a[n SQLite] database.
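To make that concrete, a toy "versions as BLOBs" table via System.Data.SQLite could look like this (the schema and file names are invented for illustration):

```csharp
using System.Data.SQLite;
using System.IO;

// A toy versioned store; the schema is hypothetical.
using (var conn = new SQLiteConnection("Data Source=repo.db;Version=3;"))
{
    conn.Open();

    using (var create = new SQLiteCommand(
        @"CREATE TABLE IF NOT EXISTS versions (
              id       INTEGER PRIMARY KEY,
              path     TEXT    NOT NULL,
              revision INTEGER NOT NULL,
              content  BLOB    NOT NULL)", conn))
    {
        create.ExecuteNonQuery();
    }

    // Store one revision of one file as a BLOB.
    using (var insert = new SQLiteCommand(
        "INSERT INTO versions (path, revision, content) VALUES (@path, @rev, @blob)",
        conn))
    {
        insert.Parameters.AddWithValue("@path", "src/main.c");
        insert.Parameters.AddWithValue("@rev", 1);
        insert.Parameters.AddWithValue("@blob", File.ReadAllBytes("src/main.c"));
        insert.ExecuteNonQuery();
    }
}
```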
Happy coding
