Azure Data Explorer: ingest text log files with custom delimiter

I'm trying to use Azure Data Explorer to ingest some logs (IIS Logs, POP3 logs, IMAP logs) that contain values delimited by space.
I would have expected Azure Data Explorer to infer the correct schema from the files as separate columns, however it only identifies a single column with the entire data.
The reason for this seems to be the header and metadata rows, which I can't find a way to skip (I would have expected an option for that).
However, even if I manually remove the metadata rows from the log file, it still doesn't recognize the schema for the table.
I have also tried creating the table up front with KQL queries and pointing the ingestion at the existing table instead of creating a new one, but then it doesn't identify any rows to import from the logs.
I'm not sure what else can be done; I thought Azure Data Explorer (and Log Explorer, which I tried too, with the same result) would be a perfect solution for log files created by Windows apps.

The documentation might have been a good starting point.
It is very clear about which formats are supported for ingestion.
IIS logs, POP3 logs and IMAP logs are not listed.
Data formats supported by Azure Data Explorer for ingestion
As to the TXT format, an entire line is ingested as a single value. No additional parsing there.
Format   Extension   Description
TXT      .txt        A text file with lines delimited by \n. Empty lines are skipped.
You could use the TXT format to load the data and then parse it and split it into columns within ADX, probably by using REGEX.
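For illustration, this is the kind of regex split you would then express in KQL (for example with the parse operator or the extract() and split() functions). A minimal Python sketch of the parsing logic, using a made-up IIS-style line:

import re

# Hypothetical space-delimited IIS log line (W3C extended format):
line = "2023-01-02 03:04:05 GET /default.htm 200 1234"

# Collapse runs of whitespace and split into columns; in ADX the same
# split would run in KQL over the single ingested text column.
fields = re.split(r"\s+", line.strip())
print(fields)
# ['2023-01-02', '03:04:05', 'GET', '/default.htm', '200', '1234']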

Related

Parsing FB-Purity's Firefox idb (Indexed Database API) object_data blob from Linux bash

From a Linux bash script, I want to read the structured data stored by a particular Firefox add-on called FB-Purity.
I have found a folder called .mozilla/firefox/b8eab5j0.default/storage/default/moz-extension+++37a9788c-671d-4cae-ba5c-fbdb8788499a^userContextId=4294967295/ that contains a .metadata file containing the string moz-extension://37a9788c-671d-4cae-ba5c-fbdb8788499a, a URL which, when opened in Firefox, shows the add-on's details, so I am pretty sure this folder belongs to the add-on.
That folder contains an idb directory, which sounds like the Indexed Database API, a W3C standard apparently used by Firefox since last year to store add-on data.
The idb folder only contains an empty folder and an SQLite file.
The SQLite file, unfortunately, does not contain much structured application data, but the object_data table contains a 95KB blob which probably holds the real structured data:
INSERT INTO `object_data` VALUES (1,'0pmegsjfoetupsf.742612367',NULL,NULL,
X'e08b0d0403000101c0f1ffe5a201000400ffff7b00220032003100380035003000320022003a002
2005300610074006f0072007500200055007205105861006e00690022002c00220036003100350036
[... 95KB ...]
00780022007d00000000000000');
Question: Any clue what this blob's format is? How to extract it (using command line or any library or Linux tool) to JSON or any other readable format?
Well, I had a fun day today figuring this out and ended up creating a Python tool that can read the data from these IndexedDB database files and print it (and maybe more at some point): moz-idb-edit
To answer the technical parts of the question first:
Both the keys (name) and the data (value) use a Mozilla-proprietary format whose only documentation appears to be its source code at this time.
The keys use a special just-for-this use-case encoding whose rough description is available in mozilla-central/dom/indexedDB/Key.cpp – the file also contains the only known implementation. Its unique selling point appears to be the fact that it is relatively compact while being compatible with all the possible index types websites may throw at you as well as being in the correct binary sorting order by default.
The values are stored using SpiderMonkey's internal StructuredClone representation that is also used when moving values between processes in the browser. Again there are no docs to speak of but one can read the source code which fortunately is quite easy to understand. Before being added to the database however the generated binary is compressed on-the-fly using Google's Snappy compression which “does not aim for maximum compression [but instead …] aims for very high speeds and reasonable compression” – probably not a bad idea considering that we're dealing with wasteful web content here.
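If you only want to undo the compression layer, that much is easy to do from Python. A minimal sketch, assuming the python-snappy package and that the blob is raw (unframed) Snappy data:

import snappy  # pip install python-snappy

def decompress_value(blob: bytes) -> bytes:
    # Returns the raw StructuredClone bytes; interpreting those still
    # means reading SpiderMonkey's undocumented format.
    return snappy.decompress(blob)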
To locate the correct IndexedDB file for an extension's local storage data, one needs to resolve the extension's static ID to a so-called “internal UUID” whose value is different in every browser profile instance (to make tracking based on installed add-ons a lot harder). The mapping table for this is stored as a pref (“extensions.webextensions.uuids”) in the prefs.js file. The IDB path then is ${MOZ_PROFILE}/storage/default/moz-extension+++${EXT_UUID}^userContextId=4294967295/idb/3647222921wleabcEoxlt-eengsairo.sqlite
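As a rough illustration of that lookup, here is a minimal Python sketch; it assumes the usual one-line user_pref() layout in prefs.js:

import json, re
from pathlib import Path

def idb_path(profile: str, ext_id: str) -> Path:
    prefs = Path(profile, "prefs.js").read_text(encoding="utf-8")
    # The pref value is a JSON object stored as an escaped string literal.
    m = re.search(r'user_pref\("extensions\.webextensions\.uuids",\s*"((?:[^"\\]|\\.)*)"\)', prefs)
    uuids = json.loads(json.loads('"%s"' % m.group(1)))
    return Path(profile, "storage", "default",
                "moz-extension+++%s^userContextId=4294967295" % uuids[ext_id],
                "idb", "3647222921wleabcEoxlt-eengsairo.sqlite")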
For all practical purposes, you can read the value of a single storage key of any extension by downloading the project mentioned above. Basic usage is:
$ ./moz-idb-edit --extension "${EXT_ID}" --profile "${MOZ_PROFILE}" "${STORAGE_KEY}"
Where ${EXT_ID} is the extension's static ID (check its manifest.json file or look in about:support#extensions-tbody if you're unsure), ${MOZ_PROFILE} is the Firefox profile directory (also in about:support) and ${STORAGE_KEY} is the name of the key you'd like to query (unfortunately, querying all keys is not supported yet).
Writing data is not currently supported either.
I'll update this answer as I implement more features (or drop me an issue on the project page!).

convert a binary file to SQLite database

I have a binary dump of a mobile phone in which messages and phone-book contacts are stored. I have already extracted the messages, and now I need to extract the contacts saved in the phone book. The data appears to be stored in SQLite format, since I found the string 53514C69746520666F726D617420330000 in the binary file. How can I extract the list of contacts saved in the phone book?
You need to first work out the format of the file from which you are extracting information, then write code to extract it. A good starting point would be The SQLite Database File Format.
The first part of that string you give (53514C69746520666F726D6174203300) is ASCII hex for SQLite format 3<nul>, which matches the header shown in that link above, so that may go some way toward helping you figure out how best to process it.
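You can verify that decoding in one line of Python:

print(bytes.fromhex("53514C69746520666F726D6174203300"))
# b'SQLite format 3\x00'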
Although, given the fact it appears to be just a normal SQLite database file, you may get lucky and be able to use it as-is with a normal SQLite instance. That would be the first thing I'd try since you can then use regular SQL queries to output the data in a more usable form.
For example, if the file is called pax.db, simply run:
sqlite3 pax.db
to open it, then you may find you can use all the regular investigative commands like .databases, .schema, .tables and so on.
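If the database does not start at offset 0 of your dump, carve it out first. A minimal Python sketch (file names are placeholders):

MAGIC = b"SQLite format 3\x00"

data = open("phone.bin", "rb").read()
offset = data.find(MAGIC)
if offset < 0:
    raise SystemExit("no SQLite header found in the dump")

# Write everything from the header onward; SQLite reads only the pages
# described in its own header, so trailing junk is normally harmless.
open("pax.db", "wb").write(data[offset:])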

Export Excel connection to Access - .ODC info to .ODBC

I have lots of data to wrangle and I need some help.
I have been using an Excel file that has two worksheets of interest to me. Each produces an OLAP pivot table with the data I need to work with. What I would like to do is move those (.odc) connections to Access queries so I don't have to hand-paste all of this info and manipulate it, and then go through the whole process several more times.
One table is Throughput (number of parts through an operation(s)) by Part Number and by Date. The other is Hours Logged at the operation(s) by Part Number and by Date. I also have a master list of all part numbers with some more data that I have to mix in.
Biggest problem: Each chart is producing its own subset of dates and part numbers so I have to take care to match up the data to run the calculations. I've tried:
By hand. Got tired with that real quick.
Using LOOKUP, VLOOKUP, MATCH with INDIRECT and all sorts of tricks.
It's a mess. But I'm confident that if I can put the original pivot tables into Access I can add a few joins and write up a couple queries and it will turn out beautifully.
Worst comes to worst, I can copy/paste the pivot table data into Access by hand, but what if I want to change or expand the data set? I'd rather work with the raw data.
EDIT:
The data is held on SQL Server and I cannot change that.
The Excel pivot tables use an .odc file for the connection. It gives the following connection strings:
Provider=MSOLAP.3;Integrated Security=SSPI;Persist Security Info=True;Initial Catalog=[MyCatalog];Data Source=[MySource];MDX Compatibility=1;Safety Options=2;MDX Missing Member Mode=Error
Provider=MSOLAP.4;Integrated Security=SSPI;Persist Security Info=True;Initial Catalog=[MyCatalog];Data Source=[MySource];MDX Compatibility=1;Safety Options=2;MDX Missing Member Mode=Error
(I replaced the actual catalog and source)
Can I use the .odc file information to create a pass-through query in Access?
Have you considered using a proper OLAP server?
Comparison of OLAP Servers
Once it is set up, you'll be able to connect your Excel pivot tables to the server (as well as other reporting tools).
Talked to our IT dept. The guy who built the Cubes is working on querying the same info into MS Access for me.
Thanks everyone.

what is the best way to export data from Filemaker Pro 6 to Sql Server?

I'm migrating/consolidating multiple FMP6 databases to a single C# application backed by SQL Server 2008. The problem I have is how to export the data to a real database (SQL Server) so I can work on data quality and normalisation, which will be significant: there are a number of repeating fields that need to be normalised into child tables.
As I see it there are a few different options, most of which involve either connecting to FMP over ODBC and using an intermediary to copy the data across (either custom code or MS Access linked tables), or exporting to a flat file format (CSV with no header, or XML) and either using Excel to generate insert statements or writing some custom code to load the file.
I'm leaning towards writing some custom code to do the migration (like this article does, but in C# instead of Perl) over ODBC, but I'm concerned about the overhead of writing a migrator that will only be used once (as soon as the new system is up, the existing DBs will be archived)...
A few little joyful caveats: in this version of FMP there's only one table per file, and a single column may have multi-value attributes separated by hex 1D, which is the ASCII group separator, of course!
Does anyone have experience with similar migrations?
I have done this in the past, but using MySQL as the backend. The method I use is to export as CSV or merge format and then use the LOAD DATA INFILE statement.
SQL Server has something similar; maybe this link on BULK INSERT would help.
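For the hex-1D repeating fields specifically, a short script can explode them into child-table rows before the bulk load. A minimal Python sketch; the file names and the column index are assumptions:

import csv

GS = "\x1d"  # ASCII group separator between FMP repeating values

with open("fmp_export.csv", newline="", encoding="latin-1") as src, \
     open("child_rows.csv", "w", newline="", encoding="utf-8") as dst:
    out = csv.writer(dst)
    for row_id, row in enumerate(csv.reader(src), start=1):
        # Suppose column 3 holds the repeating field: one child row per value.
        for value in row[2].split(GS):
            if value:
                out.writerow([row_id, value])

The resulting child_rows.csv can then be loaded with BULK INSERT.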

Does DB2 OS/390 BLOB support .docx file

An ASP.net app inserts a Microsoft Word 2007 .docx file into a row of a DB2 OS/390 BLOB table. A different VB.net app retrieves the DB2 OS/390 BLOB data and launches Microsoft Word to open the .docx file, but Microsoft Word pops up a message that the data is corrupted. Word will let you repair the file so it can be viewed, but that means extra steps, and users complain.
I've seen some examples where .docx can be converted to .doc but they only talk about stripping out the text. Some of our .docx have pictures in them.
Any ideas?
I see that this question is 10 months old. I hope it's not too late to be helpful.
Neither DB2 nor any other database that allows a "Blob" data type would know that the data came from a .docx file, or do anything that would cause Word to complain. The data is supposed to be an exact copy of whatever data you pass to it.
Similarly, the Word document does not "know" that it has been copied to a BLOB object and then back.
Therefore, the problem is almost certainly with your handling of the BLOB data, in one or both of your programs.
Please run your first program to copy the .docx file into the database, then run the second one to read it back out. Then use a byte-by-byte tool to compare the two files. One way to do this would be to open a command window and type:
fc /b Doc1.docx Doc2.docx
If you have access to some better compare tools, by all means use them... but make sure that it looks at EVERY BYTE, not just the printable characters.
Obviously, you ARE going to find differences, or else Microsoft Word wouldn't give you errors on the second one when the first one is just fine. Once you see what the differences are, hopefully you will understand what is going wrong and how to fix them.
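If fc isn't handy, the same byte-by-byte check is a few lines of Python (file names are placeholders):

a = open("Doc1.docx", "rb").read()
b = open("Doc2.docx", "rb").read()

if len(a) != len(b):
    print("size differs: %d vs %d bytes" % (len(a), len(b)))
for i, (x, y) in enumerate(zip(a, b)):
    if x != y:
        print("first difference at offset %d: %02X vs %02X" % (i, x, y))
        break
else:
    print("the shorter file is a prefix of the longer one")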
I had a similar problem several years ago (I was storing graphics, but it's the same basic problem). It turns out that the document size was being affected - I would store 8005 bytes into the BLOB object, and when I read it back out I was getting 8192 bytes. NUL (0) bytes were being appended to the end of the data.
My solution at the time was to append an "X" to the end of the BLOB data when I wrote it to the database. Then, when I read it back, I would search for the very last "X" in the data and remove it, along with any data after it. That way, I could recover the original data. What I should have done was store the data length in the database along with the BLOB data. Then you could truncate the file to that size, eliminating the corruption.
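On the read side, the store-the-length fix amounts to a simple truncation. A minimal sketch, reusing the 8005/8192 figures from above:

def restore_document(blob: bytes, stored_length: int) -> bytes:
    # Drop whatever padding the round trip appended past the real size.
    return blob[:stored_length]

# E.g. 8005 real bytes that came back padded to 8192 with NULs:
padded = b"PK\x03\x04" + b"\x00" * 8188  # .docx files start with the ZIP magic
assert len(restore_document(padded, 8005)) == 8005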
If appended NUL bytes aren't your problem, then you'll need to do something else to fix the problem. But you don't have a clue until you know what changed. Something did.
