Finding End Of SQLite File In Disk Dump? - sqlite

This is really stumping me. I'm trying to recover some lost information (for reasons I cannot disclose) from a dump of an Android phone's free space. I have no lookup table for the disk, so all I have is the raw dump of the flash.
Basically, I'm trying to pick out SQLite files from this huge 350 megabyte mess. I can find the SQLite file header easily enough; it's 100 bytes and described here. Everything seems to be in place. I can find entries, but I'm stumped as to how to determine where the entries stop, where the file ends, and where unrelated sectors of the disk begin.
Any suggestions? I'm at a dead end currently, other than kind of manually going through and trying to eyeball it, but I'm a programmer here, trying to find some sort of methodical way through this.
I appreciate you guys in advance!

I've always had luck recovering data using PhotoRec which, despite its name, supports many file formats including sqlite.
http://www.cgsecurity.org/wiki/File_Formats_Recovered_By_PhotoRec
I've never tried it on a dump of flash memory, so I don't know how successful it would be. It depends on how the file is laid out in memory: PhotoRec bets on the fact that most files are stored in contiguous blocks (i.e. not fragmented).
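For what it's worth, the 100-byte header mentioned in the question also records how big the database is supposed to be, which gives a methodical end-of-file guess when the file is contiguous. Below is a minimal Python sketch, not a tested recovery tool: it reads the page size (offset 16) and the in-header page count (offset 28) and multiplies them. The function names are made up for illustration. It assumes the database was written by SQLite 3.7.0 or later (older versions leave the page count unset) and that the pages sit back to back in the dump, which may well not hold on wear-levelled flash.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every SQLite 3 file


def sqlite_extent(dump: bytes, start: int):
    """Return the expected byte length of the database whose header begins
    at `start`, or None if the header does not give a trustworthy size.
    (Hypothetical helper, assumes the file is stored contiguously.)"""
    header = dump[start:start + 100]
    if len(header) < 100 or not header.startswith(SQLITE_MAGIC):
        return None
    # Offset 16: page size, big-endian 16-bit; the value 1 stands for 65536.
    (page_size,) = struct.unpack_from(">H", header, 16)
    if page_size == 1:
        page_size = 65536
    # Valid page sizes are powers of two from 512 up to 65536.
    if page_size < 512 or page_size & (page_size - 1):
        return None
    # Offset 24: file change counter. Offset 28: database size in pages.
    change_counter, page_count = struct.unpack_from(">II", header, 24)
    # Offset 92: version-valid-for number. The page count is only reliable
    # when it is non-zero and this matches the change counter.
    (version_valid_for,) = struct.unpack_from(">I", header, 92)
    if page_count == 0 or change_counter != version_valid_for:
        return None
    return page_size * page_count


def find_sqlite_candidates(dump: bytes):
    """Yield (offset, length_or_None) for every header magic in the dump."""
    pos = dump.find(SQLITE_MAGIC)
    while pos != -1:
        yield pos, sqlite_extent(dump, pos)
        pos = dump.find(SQLITE_MAGIC, pos + 1)
```

Each candidate can then be carved out as dump[offset:offset + length] and opened with the normal SQLite tools to see whether it is intact. When the length comes back as None, the page size alone still tells you the granularity at which to step through the following data by hand.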

Related

What is the size limit for .r file extension size?

What is the max file size limit for .r extension file now?
I read that it has a 5MB limit; is that still the case? How does the limit change: does it differ from OS to OS, or from R version to R version?
Reference: RStudio maximum file size reached
I'm very new to R, can someone please help me?
Thanks
There is no documented limit for the maximum file size of R code files. In fact, R will be able to deal with anything that’s even remotely reasonable. But for what it’s worth, a 5 MiB source code file is not reasonable. If you actually have such files, I strongly suggest removing the large data that’s declared inside them and moving it to a proper data file instead: separate your code and data. Actual code will never be this big.
As for editing such a file, different code editors have different limits for the size of files they deal well with. Again, having such a big code file is plain unreasonable, so not many code editors bother catering to this use-case, and even though few editors have a hard-coded limit, interactively editing such large files may not work.

QFile: cannot retrieve size from PHYSICALDRIVE

I wrote a tool which was originally intended for analyzing hard disc images. Now I'm trying to use this tool for live analysis of computer systems, meaning my tool tries to access the physical drive.
I implemented my tool in Qt, accessing the images using the QFile class. Instead of an image I hand over the physical drive; under Windows it is \\.\PHYSICALDRIVE0.
I was surprised at first that I didn't get any errors: I can open the device, I can seek, get the position, almost everything. The only thing I have problems with is retrieving the drive size with size().
Some code example:
QFile file( "\\\\.\\PHYSICALDRIVE0" ); // backslashes escaped: the path is \\.\PHYSICALDRIVE0
file.open( QIODevice::ReadOnly );
file.size(); //returns 0
I'm not too deep into Qt, so this is probably something easy. I would like to thank everybody who has an idea what the reason is.
thanks in advance!
QFileInfo may be able to help you out. It sounds like Windows may allow opening a read-only file at that path even if it doesn't exist. Calling GetLastError() might give more information about why a file size of zero was returned.
With QFileInfo, you can check that the file exists before you open it.
You may end up needing some platform specific calls to be able to work with Physical Drives:
Volume to physical drive
It looks like there may be some example code of looking at partitions with PartMod on SourceForge.
As a side note on querying the sizes of folders: I thought the size had to be either cached somewhere by the operating system or, in many cases, calculated at the time of the query. It seems like that is what happens when looking at folder properties in Windows or Get Info on OS X.
Also, looking at the Volume to physical drive answers, there is a field there for the extent length. I think this is what you are looking for.
Hope that helps.

Store map key/values in a persistent file

I will be creating a structure more or less of the form:
type FileState struct {
    LastModified int64
    Hash         string
    Path         string
}
I want to write these values to a file and read them in on subsequent calls. My initial plan is to read them into a map and lookup values (Hash and LastModified) using the key (Path). Is there a slick way of doing this in Go?
If not, what file format can you recommend? I have read about and experimented with some key/value file stores in previous projects, but not using Go. Right now, my requirements are probably fairly simple, so a big database server system would be overkill. I just want something I can write to and read from quickly, easily, and portably (Windows, Mac, Linux). Because I have to deploy on multiple platforms, I am trying to keep my non-Go dependencies to a minimum.
I've considered XML, CSV, JSON. I've briefly looked at the gob package in Go and noticed a BSON package on the Go package dashboard, but I'm not sure if those apply.
My primary goal here is to get up and running quickly, which means the least amount of code I need to write along with ease of deployment.
As long as your entire data fits in memory, you shouldn't have a problem. Using an in-memory map and writing snapshots to disk regularly (e.g. by using the gob package) is a good idea. The Practical Go Programming talk by Andrew Gerrand uses this technique.
If you need to access those files with different programs, using a popular encoding like JSON or CSV is probably a good idea. If you just have to access those files from within Go, I would use the excellent gob package, which has a lot of nice features.
As soon as your data becomes bigger, it's not a good idea to always write the whole database to disk on every change. Also, your data might not fit into the RAM anymore. In that case, you might want to take a look at the leveldb key-value database package by Nigel Tao, another Go developer. It's currently under active development (but not yet usable), but it will also offer some advanced features like transactions and automatic compression. Also, the read/write throughput should be quite good because of the leveldb design.
There's an ordered, key-value persistence library for Go that I wrote called gkvlite -
https://github.com/steveyen/gkvlite
JSON is very simple but makes bigger files because of the repeated variable names. XML has no advantage. You should go with CSV, which is really simple too. Your program will be less than one page long.
But it depends, in fact, on your modifications. If you make a lot of modifications and must have them stored synchronously on disk, you may need something a little more complex than a single file. If your map is mainly read-only, or if you can afford to dump it to a file rarely (not every second), a single CSV file alongside an in-memory map will keep things simple and efficient.
BTW, use Go's csv package (encoding/csv) to do this.

Drawbacks to having (potentially) thousands of directories in a server instead of a database?

I'm trying to start using plain text files to store data on a server, rather than storing them all in a big MySQL database. The problem is that I would likely be generating thousands of folders and hundreds of thousands of files (if I ever have to scale).
What are the problems with doing this? Does it get really slow? Is it about the same performance as using a Database?
What I mean:
Instead of having a database that stores a blog table, then has a row that contains "author", "message" and "date" I would instead have:
A folder for the specific post, then *.txt files inside that folder that have "author", "message" and "date" stored in them.
This would be immensely slower reading than a database (file writes all happen at about the same speed--you can't store a write in memory).
Databases are optimized and meant to handle such large amounts of structured data. File systems are not. It would be a mistake to try to replicate a database with a file system. After all, you can index your database columns, but it's tough to index the file system without another tool.
Databases are built for rapid data access and retrieval. File systems are built for data storage. Use the right tool for the job. In this case, it's absolutely a database.
That being said, if you want to create HTML files for the posts and then store those locations in a DB so that you can easily get to them, then that's definitely a good solution (a la Movable Type).
But if you store these things on a file system, how can you find out your latest post? Most prolific author? Most controversial author? All of those things are trivial with a database, and very hard with a file system. Stick with the database, you'll be glad you did.
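To make the "trivial with a database" point concrete, here is a small Python sketch using the standard sqlite3 module with an in-memory database. The table layout follows the blog example from the question ("author", "message", "date"); the rows and the index name are invented for illustration, and SQLite merely stands in for MySQL here.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog ("
    "id INTEGER PRIMARY KEY, author TEXT, message TEXT, date TEXT)"
)
# An index on the date column lets 'latest post' avoid scanning every row.
conn.execute("CREATE INDEX blog_date ON blog (date)")

# Made-up sample rows; ISO dates sort correctly as plain text.
conn.executemany(
    "INSERT INTO blog (author, message, date) VALUES (?, ?, ?)",
    [
        ("alice", "First post", "2010-01-01"),
        ("bob", "Hello", "2010-02-15"),
        ("alice", "Second post", "2010-03-09"),
    ],
)

# Latest post: one ordered lookup.
latest = conn.execute(
    "SELECT author, message FROM blog ORDER BY date DESC LIMIT 1"
).fetchone()

# Most prolific author: one aggregate query.
top_author = conn.execute(
    "SELECT author, COUNT(*) AS n FROM blog "
    "GROUP BY author ORDER BY n DESC LIMIT 1"
).fetchone()
```

With one folder per post, both answers would mean opening every file (or maintaining a hand-written index file alongside them).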
It really depends:
What is the file size?
What durability requirements do you have?
How many updates do you perform?
What is the file system?
It is not obvious that MySQL would be faster:
I once did such a comparison for small objects, in order to use it as session storage for CppCMS, with one index (key only) and with two indexes (primary key and secondary timeout).
File System:    XFS     ext3
-----------------------------
Writes/s:       322     20,000

Data Base \ Indexes:      Key Only    Key+Timeout
-----------------------------------------------
Berkeley DB               34,400      1,450
Sqlite No Sync            4,600       3,400
Sqlite Delayed Commit     20,800      11,700
As you can see, a simple ext3 file system was faster than or as fast as Sqlite3 for storing data, because it does not give you the durability (D) of ACID.
On the other hand, a DB gives you many, many important features you probably need, so I would not recommend using files as storage unless you really need to.
Remember, the DB is not always the bottleneck of the system.
Forget about long-winded answers, here are the simplest reasons why storing data in plaintext files is a bad idea:
It's near-impossible to query. How would you sort blog posts by date? You'd have to read all the files and compare their dates, or maintain your own index file (basically, write your own database system).
It's a nightmare to back up. tar cjf won't cut it, and if you try you may end up with an inconsistent snapshot.
There are probably a dozen other good reasons not to use files: it's hard to monitor performance, very hard to debug, near impossible to recover in case of error, there are no tools to handle them, etc.
I think the key here is that there will be NO indexing on your data. So retrieving anything in, say, a search would be ridiculously slow compared to an indexed database. Also, IO operations are expensive; a database could be (partially) in memory, which makes the data available much faster.
You don't really say why you won't use a database yourself... But in the scenario you are describing I would definitely use a DB over folder any day, for a couple of reasons. First of all, the blog scenario seems very simple but it is very easy to imagine that you, someday, would like to expand it with more functionality such as search, more post details, categories etc.
I think that growing the model would be harder to do in a folder structure than in a DB.
Also, databases are usually MUCH faster than file access due to indexing and memory caching.
IIRC Fudforum used the file-storage for speed reasons, it can be a lot faster to grab a file than to search a DB index, retrieve the data from the DB and send it to the user. You're trading the filesystem interface with the DB and DB-library interfaces.
However, that doesn't mean it will be faster or slower. I think you'll find writing is quicker on the filesystem, but reading is faster on the DB in the general case. If, like Fudforum, you have relatively immutable data and want to show several posts in one, then a file-based approach may be a lot faster: e.g. they don't have to search for every related post, they stick it all in 1 text file and display it once. If you can employ that kind of optimisation, then your file-based approach will work.
Also, mail servers work in the file-based approach too, the Maildir format stores each email message as a file in a directory, not in a database.
One thing I would say, though: you'll be better off storing everything in 1 file, not 3. The filesystem is better at reading (and caching) a single file than it is with multiple ones. So if you want to store each message as 3 parts, save them all in a single file, read it to get any of the parts and just display the one you want to show.
...and then you want to search all posts by an author and you get to read a million files instead of a simple SQL query...
Databases are NOT faster. Think about it: In the end they store the data in the filesystem as well. So the question if a database is faster depends strongly on the access path.
If you have only one access path, which correlates with your file structure, the file system might be way faster than a database. Just make sure you have some caching available for the filesystem.
Of course you do lose all the nice things of a database:
- transactions
- flexible ways to index data, and therefore access data in a flexible way reasonably fast.
- flexible (though ugly) query language
- high recoverability.
The scaling really depends on the filesystem used. AFAIK most file systems have some kind of upper limit for the number of files (in total or per directory), though on the newer ones this is often very high. For hundreds and thousands of files, with some directory structure to keep directories to a reasonable size, it should be possible to find a well-performing file system.
@Eric's comment:
It depends on what you need. If you only need the content of exactly one file per query, and you can determine the location and name of the file in a deterministic way, then direct access is faster than what a database does, which is roughly:
access a bunch of index entries, in order to
access a bunch of table rows (rdbms typically read blocks that contain multiple rows), in order to
pick a single row from the block.
If you look at it: you have indexes and additional rows in memory, which make your caching inefficient, so where is the speedup of a DB supposed to come from?
Databases are great for the general case. But if you have a special case, there is almost always a special solution that is better in some sense.
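The "directory structure to keep directories to a reasonable size" combined with a file location you can work out deterministically could look roughly like the Python sketch below. This is only one possible layout, not something from the answer itself: the function name, the use of SHA-1, and the two-level fan-out are all assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, post_id: str, levels: int = 2) -> PurePosixPath:
    """Map a post id to a fixed location such as root/ab/cd/<post_id>.txt.

    Hypothetical layout: each directory level is named after two hex
    characters of the id's SHA-1 digest, so a level has at most 256
    subdirectories and files spread roughly evenly across them.
    """
    digest = hashlib.sha1(post_id.encode("utf-8")).hexdigest()
    shards = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return PurePosixPath(root, *shards, post_id + ".txt")
```

Because the path is computed from the id alone, reading a post is a single open() with no lookup step. The flip side is the one already mentioned: any access path other than "by id" means walking the whole tree.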
If you would prefer to do away with an RDBMS, why not try one of the open-source key-value or document databases (non-relational DBs)?
From your post I understand that you are not going to rely on any of the ACID properties of a relational DB. It would be better to adopt a key-value DB (MongoDB, CouchDB or Hypertable) than to write your own file system implementation; it will give better performance than the existing approaches.
Note: I am not an expert in this either. I have just started working with MongoDB and find it useful in similar scenarios, and wanted to share in case you are not aware of these approaches.

Reading a COBOL DAT file

I have been given a set of COBOL DAT, IDX and KEY files and I need to read the data in them and export it into Access, XLS, CSV, etc. I do not know the version or vendor of the COBOL code, as I only have the Windows executable that created the files.
I have tried Easysoft and Parkway ODBC drivers but I have not been successful in reading the data from the files.
I do not have access to the source code as the company that was distributing this product shut down.
I have successfully read some of the dat files using http://www.cobolproducts.com/datafile just now which I came to know through another forum. Most probably I will work with them to help me read the rest of the files that I am having an issue with.
A few possibilities.
1/ See if you can find the names of the people that worked for the company. They may be helpful.
2/ Open the DAT file in a text editor. The data may be decodable from that. If the basic format can be discerned, quick'n'dirty code can be written to extract it.
3/ Open up the executable in an editor; there may be strings in there that indicate which compiler was used, and then you can search for info on its file formats. If it's a DOS application, there's a good chance it was either Microsoft or Fujitsu COBOL.
4/ Consider placing job requests on work sites like elance or rentacoder; I don't think there's a cost if the work can't be done successfully.
5/ Hire someone to examine it and advise on the likelihood of recovery.
6/ Get a screen dump of the record contents for every active record and re-construct it from that.
Some of these are pretty hard so your mileage may vary.
Good luck.
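For possibilities 2/ and 3/, scrolling through a binary in a text editor gets tedious, so here is a rough Python sketch that pulls runs of printable ASCII out of a file, much like the Unix strings tool does. The function name and the minimum length of 6 are arbitrary choices for illustration, and it will miss any text stored in EBCDIC or a wide-character encoding.

```python
import re


def printable_strings(data: bytes, min_len: int = 6):
    """Return every run of at least `min_len` printable ASCII characters
    (space through tilde) found in `data`, in order of appearance."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Reading the executable with open(path, "rb").read() and passing the bytes in gives a list you can skim or grep for compiler and runtime names. Run on a DAT file, the same function gives a quick impression of which fields are stored as plain text.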
I have only read COBOL DAT files with the FD. When I do not have the FD, I open the file in a text editor, try to guess the columns, and try again until I have it working. The big problem with this approach is when the DAT file has COMP columns, which can be any kind of COMP type, but with a little patience I could get this done.
I had tried Parkway ODBC, but without success.
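As an example of what guessing a COMP column involves, here is a Python sketch that decodes one common variant, COMP-3 (packed decimal): two decimal digits per byte, with the low nibble of the last byte holding the sign. That the field in question is COMP-3 at all is an assumption; other COMP types are binary and vary by vendor, so treat this as one hypothesis to test against the data. The function name and the scale parameter (number of implied decimal places) are invented for illustration.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field.

    Raises ValueError when the bytes cannot be packed decimal, which is
    itself a useful signal while guessing at column boundaries.
    """
    if not raw:
        raise ValueError("empty field")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)    # high nibble: one decimal digit
        digits.append(byte & 0x0F)  # low nibble: the next digit
    digits.append(raw[-1] >> 4)     # last byte: one digit ...
    sign_nibble = raw[-1] & 0x0F    # ... followed by the sign nibble
    if any(d > 9 for d in digits) or sign_nibble < 0x0A:
        raise ValueError("not a packed-decimal field")
    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):  # 0xB and 0xD mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)
```

For instance, unpack_comp3(bytes([0x12, 0x34, 0x5D]), scale=2) gives Decimal('-123.45'). Trying this on each candidate byte range and discarding the ones that raise ValueError narrows down where the packed fields are.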
For anyone going through this journey, I found this on SourceForge: Cobol and RPG data reader and converter
http://sourceforge.net/projects/cobol2j/
I'm about to try it; it sounds kind of promising.
