SQLite3 Data rescue on Error: Database disk image is malformed - sqlite

Background
I have a database thats been corrpted, and want to save so much of the data possible.
I have tried sql dump the data with numerous of tools, without success.
Always same error message:
Error: database disk image is malformed
I'm pretty sure this did happen due to a power failure.
Approach?
Now the the database is in fact a file. And I'm thinking if its possible to treat it so and try to save so much data as possible.
I guessing when the db is opened by a tool or program it first check its headers.
In my case I get the error message right away. I'm assuming that the headers are corrupt, or miss matching. And due to that no tool will try to read the payload.
In the documents http://www.sqlite.org/fileformat2.html there are explanations for the header offsets.
Questions: Is this is an reasonable approach? And if it possible to repair, modify or exchange headers on the corrupted db. And how do I do it?

Despite several replies in multiple threads on SO to the contrary, SQLite databases can be recovered from corruption!
I have requested an update from the SQLite team in their FAQ (http://www.sqlite.org/faq.html#q20), but in the meantime, here are a couple of options.
The FAQs state:
"...If SQLITE_SECURE_DELETE is not used and VACUUM has not been run, then some of the deleted content might still be in the database file, in areas marked for reuse. But, again, there exist no procedures or tools that we know of to help you recover that data."
and:
"...Depending how badly your database is corrupted, you may be able to recover some of the data by using the CLI to dump the schema and contents to a file and then recreate. Unfortunately, once humpty-dumpty falls off the wall, it is generally not possible to put him back together again."
There are in fact at least two excellent tools to do data recovery for whole SQLite databases and individual records, and they can help in cases of hardware failure, software errors or human problems. It will not be 100% pristine, but the situation is not hopeless
PhotoRec is open source and multi-platform. While historically, it was used for images and pdfs, it now supports SQLite recovery (http://www.cgsecurity.org/wiki/File_Formats_Recovered_By_PhotoRec), along with 220+ binary file types. If a database (or entire directory) is deleted, PhotoRec can often restore the db file to a sane-enough state to be opened and exported. There are pre-compiled versions of the app freely available for Windows, Mac and Linux.
In addition, the commercial product Epilog by CCL Forensics can do very advanced record recovery, including retrieving data from the write-ahead log (WAL) transaction files. It is a few hundred dollars, but it can do fairly amazing forensic reconstruction on SQLite data (both native binary db files as well as raw disk images).
Both the above have saved my hide several times, so passing this along for others who may have lost hope in deleted/corrupted SQLite databases (as well as genuine forensics for popular use cases, like mobile phones, browsers, address books, etc.).
Once you've regenerated/exported data, it's always a good idea to verify your backup schemes and definitely run a pragma integrity_check periodically, along with vacuuming.
I have requested that the official FAQ be updated to at least mention that one can google "sqlite recovery" or something if it's verboten to mention other projects/products by name.
Cheers.

Related

Is SQLite in immutable mode safe on non read-only media?

In an application that ships with a read-only SQLite database, I've found that opening the database as immutable radically improves query performance. However the SQLite documentation says this (emphasis mine):
The immutable parameter is a boolean query parameter that indicates that the database file is stored on read-only media. When immutable is set, SQLite assumes that the database file cannot be changed, even by a process with higher privilege, and so the database is opened read-only and all locking and change detection is disabled.
This is tripping me up a bit because the media (Windows Program Files) is not read-only and it can be changed, but the expectation is that it won't change. The application itself does not alter the database. A user could alter the databases using external tools (or just open it in Notepad and corrupt it) but we would call that user error and tell them not to do that.
My concern is that this part of the documentation might be hinting at some other process I'm not aware of (like maybe Windows periodically doing something that might result in the database file changing in some way).
If the application itself does not alter the database, and the user doesn't either, and there isn't some other malicious or poorly-coded program on the computer that might be touching files that don't belong to it, is it reasonably safe to open a SQLite database as immutable?
Experimentally, the answer appears to be yes, it is safe. I made this change and have not observed any problems with it.

Sqlite query reports SQLITE_CORRUPT

I have an embedded system running with a RTOS and is using C language.
I am using Sqlite to maintain a file(let's call it sqlLiteFile.db) on a File System residing on a NAND. The Sqlite version is 3.8.5
Earlier, I was creating a new database for this file every time the system comes up. So, it was a volatile file. I had no issues at that time.
However, now I made the sqlLiteFile.db to be persistent. So, every time system reboots, it opens the same file and starts writing. This works fine for some time, and survives few reboots. But, after a while, the Sqlite query starts reporting SQLITE_CORRUPT error. However, the write operation to sqlite still works fine, it is the query which starts reporting error. I can see the write operation successful, using a debugger. Also, the size of the file in the file system keeps increasing.
When I download the file and use Sqlite browser, I can not open the file anymore. When I use some other tool to convert the sqlLiteFile.db to sqlLiteFile.txt, I can see an error at the bottom: /**** ERROR: (11) database disk image is malformed *****/
Any suggestions on how to prevent this corruption would be helpful.
Edit:
Further I did try doing clean shutdowns which closes the database using sqlite3_close() prior to rebooting. This time the database survived a little longer through reboots, but it got corrupted again eventually. So, it seems it is more then just about closing the database before exiting the application. Probably the size?
Update:
The system reboots(and re-opening/closing the sqlite database) doesn't cause corruption, but it happens after the database size reaches a certain amount(~55 Kb)
It did seem that fsync() was doing something to the sqlite database. Taking out fsync() functionality didn't cause sqlite corruption. Also, I was opening and reading the database while downloading and at the same time sqlite database was written. Both these factor or fsync() alone was causing file system corruption. I still need to figure out a better way to perform fysnc(), but now I know exactly what was causing the corruption.

Write-Ahead Logging and Read-Only mode compatible in SQLite3?

Open read-only
I have a sqlite3 file on a filesystem that belongs to a different user than is running the reading process. I want the reading process to be able to read the file in read-only mode, so I'm passing SQLITE_OPEN_READONLY. I would expect that to work. Surely the idea is that read-only mode works on files that we don't want to write to?
When I prepare my first statement I get
unable to open database file
Similarly if I run the sqlite3 command line tool I get the same result unless I sudo. Which seems to confirm to me that the issue is writeability rather than anything else.
Journal files
The answer to this question seems to suggest that if there are journal files around then read-only access isn't possible.
Why are there journal files? Because another process is writing the file, my user process is trying to open it in read-only. To do this I am using Write-Ahead Logging, which produces two journal files, -shm and -wal. True enough, if I stop the writing process and remove the journal files, my user process can open it in read-only mode.
Incompatibility?
So I have two situations:
If the file belongs to the writing process and also the read-only process, write-ahead logging enables process A to write and process B to read-only
If the file belongs to the writing process but does not belong to the read-only process, the read-only process is blocked from opening read-only.
How do I achieve both of these? To spell it out, I want:
Writing process owns database
Read-only process does not own database
Read-only process cannot write to database
Write-ahead logging is enabled on database
Seems like a simple set of requirements, but I can't see an obvious solution.
**EDIT: ** Going by this documentation, it looks like this isn't possible. Can you suggest any alternative ways to achieve the above?
Yes WAL-journaled databases cannot be opened read-only, explicitly or otherwise (i.e. in the case where the database file is read-only to the process).
If you require that the read-only process absolutely not be allowed to modify the database file, then the only thing that comes to mind is that the write process maintains a not WAL-journal additional copy of the database.
Bottom line: to the best of my knowledge, WAL and read-only can't be done.
I think what the documentation is saying is that the WAL database itself may not be present on a readonly media, which does not necessarily mean you cannot use SQLITE_OPEN_READONLY. In fact, I have successfully opened two connections, a read-write as well as one with SQLITE_OPEN_READONLY, both on a WAL sqlite database. These work just fine. I tested an INSERT query using the read-only connection and the statement correctly returned an error that the database is read-only.
Just make sure that the database is stored on some media with write-access as a -shm file needs to be created and maintained, and so even a 'ready-only' connection may actually physically write something to disk - which doesn't necessarily mean that it can modify data using SQL.

Is there any value in including SQLite in VCS's

Having an argument with my team. We are developing an application using SQLite and some want to add it to the repo (GIT) and some don't. Previously with RDBMS system there has been no perceived benefit of using VCS on the DB. However SQLite is a self contained file with no external dependencies so i assume, even though it is binary, that a commit of the project code + the SQLite file will give an accurate snapshot of the state of play at that point.
I also assume that a branch and merge would work as well.
Has anyone actually done this and if so does it work?
You'd get more benefit from GIT's versioning facilities if you stored a dump of the SQLite database (i.e. commands required to create it) rather than the database file itself. That way you could look at the history of the dump file and see tables or data being added etc.
Generally speaking, it's preferable to include full set of dependencies in a VCS repository. This makes your life a whole lot simpler.
If you're after versioning DB schema, check out Wizardby.

Writing to and reading from the same file, at the same time (disk being asynchronous?)

We're creating a web service where we're writing files to disk. Sometimes these files will be read at the same time as they are written.
If we do this - writing and reading from the same file - we sometimes end up with files that are of the same length, but where some of the data inside are not the same. So with a 350mb file we get maybe 20-40 bytes that differs.
This problem mostly occur if we have 3-4 files that are being written and read at the same time. Could this problem be because there is no guarantee that after a "write" to a disk, that the data is actually written, i.e., the disk is being asynchronous.
Also, the computer we're testing on is just a standard macbook pro, so no fancy disks of any kind.
The bug might be somewhere else, but we just wanted to ask the question and see if anybody knew something about this writing+reading thing.
All modern OSs support concurrent reading and writing to files (obviously, given a single writer). So this is not an OS level bug. But do make sure you do not have multiple threads/processes trying to append data to the file.
Check your application code. Check the buffers you are using. Make sure your application is synchronized and there are no race conditions between readers and writers.

Resources