Invalid INDX entries for $I30 on NTFS hard disk - hex

While parsing my NTFS-formatted hard disk, I found some invalid INDX entries, yet Windows is still able to list all the root directory contents!
The structure of the Index Record in NTFS 3.1 is clear (NTFS doc):
Offset Description
-------------------------------------
0x00 MFT Reference of the file
0x08 Size of the index entry
0x0A Offset to the filename
...
0x52 Filename
...
However, I found some entries whose size is faulty, as is their MFT reference (which is all zeros)!
I enclose a screenshot showing part of the INDX record alongside its text representation, where each line is 0x20 bytes wide. I highlighted the faulty part.
The figure shows that entries parse cleanly until the last correct entry at 0x0628:
MFT Reference (8 bytes): 66 30 00 00 00 00 01 00
Size of entry (2 bytes): 70 00
So the entry ends at 0x0697.
Thereafter, things get weird! The entry at 0x0698:
MFT Reference (8 bytes): 00 00 00 00 00 00 00 00 (seems invalid)
Size of entry (2 bytes): 10 00 (surely invalid, because the size is less than the minimum entry size; the structure includes the filename at offset 0x52, for instance)
It seems to me that "Buziol Games" was a deleted folder in the root directory of the hard disk, but I am not sure. In any case, Windows Explorer has no trouble listing the contents.
Does anybody understand how this works? How does Windows continue parsing?
EDIT: In addition, please find the hex dump as plain text on pastebin.

As files get renamed, added, and deleted, INDX records end up containing unzeroized slack space at their end. Each INDX "page" is always 4096 bytes long, and as files get deleted the B+ tree nodes get shifted, leaving old, abandoned nodes at the end of INDX pages. This is very useful for forensics.
The "Buziol Games" entry appears to be a perfectly valid INDX record. Why do you think it was deleted?
Note that the INDX header (right where the "INDX" string is) can tell you how many entries there are in the page - check out offset 0x1c (size of index entries) vs offset 0x20 (allocated size of index entries). And note that these are relative to offset 0x18.
So looking at your pastebin output, at offset 0x1c we find the value 0x690, which means that the last entry ends at 0x18 + 0x690 = 0x6A8. The entry you see at offset 0x698 seems to be a kind of "null" terminator entry, as per https://0cch.com/ntfsdoc/concepts/index_record.html:
last entry has a size of 0x10 (just large enough
for the flags (and a mft ref of zero))
Note that its size is 0x10, which means it ends at 0x6A8, as expected.
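For illustration, here is a minimal Python sketch (my own, not from any NTFS tool) of how a parser can walk the page the same way: trust the sizes in the node header at 0x18 and stop at the terminator entry, whose flags field (offset 0x0C within an entry) has the "last entry" bit (0x02) set. Update-sequence fixups are assumed to have been applied already.

import struct

def walk_indx(page: bytes):
    """Walk the index entries of one raw INDX page (fixups already applied)."""
    assert page[:4] == b"INDX"
    # Node header at 0x18: offset-to-first-entry and used size of entries,
    # both relative to 0x18 itself (hence "size of index entries" at 0x1C).
    first = 0x18 + struct.unpack_from("<I", page, 0x18)[0]
    end   = 0x18 + struct.unpack_from("<I", page, 0x1C)[0]
    pos = first
    while pos < end:
        mft_ref, size = struct.unpack_from("<QH", page, pos)
        flags = struct.unpack_from("<H", page, pos + 0x0C)[0]
        if flags & 0x02:          # terminator: size 0x10, MFT reference of zero
            break
        name_len = page[pos + 0x50]            # filename length, in characters
        name = page[pos + 0x52 : pos + 0x52 + 2 * name_len].decode("utf-16-le")
        print(f"{pos:#06x}: MFT {mft_ref & 0xFFFFFFFFFFFF} {name!r}")
        pos += size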
See also https://www.fireeye.com/blog/threat-research/2012/10/incident-response-ntfs-indx-buffers-part-4-br-internal.html.
A good description of NTFS can be found at http://dubeyko.com/development/FileSystems/NTFS/ntfsdoc.pdf.

Related

Always two extended linear address records in a hex file

I exported a .hex file for a PIC32 microcontroller; the content starts like this:
:020000040000fa
:020000041fc01b
:042ff000ffff3fffa1
:020000040000fa
:020000041fc01b
:042ff400b9fff8bf6a
:020000040000fa
:020000041fc01b
:042ff800d9df74ffaa
:020000040000fa
:020000041fc01b
:042ffc00f3ffff7e62
:020000040000fa
:020000041d00dd
After reading some articles about the Intel HEX format, the output confuses me a bit.
Let's have a look at the first three lines only:
:02 0000 04 0000 fa
:02 0000 04 1fc0 1b
:04 2ff0 00 ffff3fff a1
If the third section is 04, as it is for the first two lines, the record is an extended linear address record: its data field supplies the upper 16 bits of the address, which are prepended to the address declared in subsequent data records (type 00, like the third line).
What confuses me is that there are ALWAYS two 04 lines directly under each other, even though a 04 line stays valid until the next one appears.
So why are there always two?
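For what it's worth, here is a minimal Python sketch (my own illustration) that decodes such records per the format described above; it shows that when two 04 records appear back to back, only the second is still in effect for the data record that follows:

def parse_ihex(lines):
    # Minimal Intel HEX decoder; no error handling (sketch only).
    upper = 0                              # upper 16 address bits from the last 04 record
    for line in lines:
        raw = bytes.fromhex(line[1:])      # strip ':' and unhexlify
        assert sum(raw) % 256 == 0         # checksum: all bytes sum to 0 mod 256
        count, rtype = raw[0], raw[3]
        addr = int.from_bytes(raw[1:3], "big")
        data = raw[4:4 + count]
        if rtype == 0x04:                  # extended linear address record
            upper = int.from_bytes(data, "big") << 16
        elif rtype == 0x00:                # data record
            print(f"{upper + addr:08x}: {data.hex()}")

parse_ihex([":020000040000fa", ":020000041fc01b", ":042ff000ffff3fffa1"])
# prints "1fc02ff0: ffff3fff" -- only the second 04 record (1fc0) is in effect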

Issue encrypting a gzip-compressed raw text: value changes to base64

I have the following raw text:
.
It represents a gzip-compressed value. Here is the raw text, uncompressed:
.
Now I would like to encrypt it using AES in CBC mode with a 128-bit key. Here are the key and the initialization vector I'm using:
key: asdfgh0123456789
iv: 0123456789asdfgh
Anyway, when I try to decrypt the ciphertext obtained from the encryption, I get the base64-encoded form of my input back.
[Here][1] is the website service I'm using to encrypt and decrypt.
Why does my input change automatically to base64? What's wrong?
Screenshot: [image]
The problem with a sequence of bytes is that it cannot simply be shown on a screen: a single byte can take 256 different values, but only a fraction of them map to printable characters.
A sequence of bytes can, however, be represented with letters and digits by converting it to base64.
1) The text TEST is represented in bytes as 54 45 53 54, and in base64 as VEVTVA==
2) The gzip of TEST is represented in bytes as 1f 8b 08 00 00 00 00 00 00 ff 0b 71 0d 0e e1 e2 02 00 cf 1b f9 1e 06 00 00 00, and in base64 as H4sIAAAAAAAA/wtxDQ7h4gIAzxv5HgYAAAA=
Now, you can encrypt either the raw bytes of the gzip or its base64 form: they are different inputs.
When using a web page that takes text as input, you are better off using the base64 of the gzip. Then, when you decrypt, you get back exactly what you put in.
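To make this concrete, here is a minimal Python sketch of the full round trip (my own illustration, assuming the PyCryptodome package and using the key/IV from the question): gzip, base64, AES-CBC encrypt, decrypt, base64-decode, gunzip.

import base64, gzip
from Crypto.Cipher import AES            # pip install pycryptodome
from Crypto.Util.Padding import pad, unpad

key = b"asdfgh0123456789"                # 16 bytes = AES-128
iv  = b"0123456789asdfgh"                # 16-byte CBC initialization vector

gz  = gzip.compress(b"TEST")             # raw (binary) gzip bytes
b64 = base64.b64encode(gz)               # what a text-only web form actually sees

ct = AES.new(key, AES.MODE_CBC, iv).encrypt(pad(b64, AES.block_size))
pt = unpad(AES.new(key, AES.MODE_CBC, iv).decrypt(ct), AES.block_size)

assert pt == b64                                          # base64 text comes back
assert gzip.decompress(base64.b64decode(pt)) == b"TEST"   # decode + gunzip recovers it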
First and foremost, you have to understand that you are not encrypting the raw gzip data: you are actually encrypting the base64 form of your gzipped text.
Symmetric encryption algorithms like AES take a plaintext and a key; encryption transforms the plaintext into ciphertext, and decryption with the same key transforms it back into the plaintext.
As your screenshot shows, you achieved exactly that, so nothing is wrong. To get back the original raw gzip bytes, you just have to base64-decode the output of the decryption, and you'll get what you are looking for.
Moreover, as you already know, not all byte values can be represented by visible symbols; this is the main reason encrypted text or binary data is often represented in base64. Compared with other encoding schemes it has low overhead: hex encoding doubles the input size, while base64 makes it only about 33% bigger.
As a bonus, here is a useful tool for this kind of transformation: CyberChef.

Selecting character code table in ESC/POS command

I need to print non-English characters on receipts using a thermal POS receipt printer. The Xprinter XP-58III thermal POS receipt printer supports generic ESC/POS commands.
As far as I know, this should be done by setting the character code table. In my case, the target code page is 21.
The ESC/POS command for setting the code page is 'ESC t n' (ASCII) or '1B 74 n' (hex), where 'n' selects page n of the character code table.
In case I use the hex form of the command: should I convert '21' to its hex value, or should I use the number without converting, i.e. '1B 74 21'?
Also, where should this command be added? Right after the initialization code?
0x1B 0x40 0x1B 0x74 0x21
I use a hex editor to add/edit the ESC/POS codes inside the binary file.
EDIT: I solved the issue myself. In order to print any non-English characters on the POS receipt printer, we have to fulfill two conditions: 1) set the correct code page, and 2) set the corresponding encoding in the receipt file or POS software (the same encoding as the code page). The correct code page for this POS printer model is 25 [WPC1257].
Page 21 would be "Thai Character Code 11"; that 21 is decimal, so you need to send 0x15 in hex. The command will then look like "0x1B 0x74 0x15".
Regarding the command position: ESC/POS commands are executed in place and, in general, affect everything that follows. There should be no problem if you put it just after the initialization command. Just try it.
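For instance, a short Python sketch (my own illustration; the output file name is arbitrary, and cp874 is only a guess at an encoding matching a Thai code page) that writes the initialization plus code-page selection as raw bytes:

# ESC @ (initialize), then ESC t n (select code page); n is one binary byte,
# so decimal 21 becomes 0x15 -- not the two ASCII characters "21".
commands = bytes([0x1B, 0x40,          # ESC @  - initialize printer
                  0x1B, 0x74, 0x15])   # ESC t 21 - select code page 21

with open("receipt.bin", "wb") as f:   # send this file raw to the printer
    f.write(commands)
    # The text encoding must match the selected code page; cp874 is an
    # assumption here, not the printer's documented mapping.
    f.write("non-English text here\n".encode("cp874"))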

Get position of character in collating sequence

I'm looking for a way to convert ASCII text into hex data; I mean the kind of hex value that you can obtain with
MOVE X'nn' TO MYVAR.
Then I need to use it as a number (COMP-3, I guess).
I tried to move a PIC X to a PIC S9(2)V COMP-3, but it does not work as I thought...
Further explanation, as my question was marked as unclear:
First of all, sorry, I wrote this question late at night and, now that I'm reading it again, yes, it's unclear.
Now, the real issue is that I want to use a char (let's say "A") as its numeric (hexadecimal) representation, to use it as an index into an internal table.
For example, in C it would be easy:
int mynum;
char mytext;
mynum = (int) mytext;  /* the character's numeric code */
and then using mynum to access an array. So, in COBOL I have:
01  MY-TABLE.
    05  MY-TABLE-ITEM PIC X OCCURS 1000.
01  MY-TEXT PIC X(100).
01  FILLER REDEFINES MY-TEXT.
    05  MY-TEXT-X PIC X OCCURS 100.
Then I want to iterate over MY-TEXT-X, transform each character into its hex code, and store it in a numeric variable (PIC 9(n)) to use as a subscript for MY-TABLE-ITEM, something like:
PERFORM VARYING I FROM 1 BY 1 UNTIL I > 100
    PERFORM TRANSFORM-DATA
    DISPLAY MY-TABLE-ITEM(MY-NUMBER)
END-PERFORM
As I said, I thought I could move a PIC X to a PIC S9(2)V COMP-3 so the numeric variable would get the value, but it's not working as I expected...
EDIT:
So I just found out my compiler doesn't support intrinsic functions, so that doesn't help me...
EDIT - Added source code
So, here's the source I'm using, along with output from the compiler and from execution.
SOURCE:
IDENTIFICATION DIVISION.
PROGRAM-ID. likeatoi.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  the-char          PIC X VALUE "K".
01  the-result        PIC 999.
01  the-other-result  PACKED-DECIMAL PIC 9(8) VALUE ZERO.
01  FILLER REDEFINES the-other-result.
    05  FILLER           PIC X.
    05  char-to-convert  PIC X.
01  num               PIC 9(8).
PROCEDURE DIVISION.
MAINLINE.
*   with intrinsic function
*   MOVE FUNCTION ORD ( the-char ) TO the-result.
    DISPLAY ">" the-char "<" ">" the-result "<".
*   Old School
    MOVE the-char TO char-to-convert.
    DISPLAY ">" the-char "<" ">" the-other-result "<".
    MOVE the-other-result TO num.
    DISPLAY num.
    STOP RUN.
Now, here is a detailed account of everything I tried.
First, I tried to compile it with the intrinsic function ORD:
***** 1) 0384: E User-defined word expected instead of reserved word. (scan su
With this compilation, I ran the program anyway (ignoring the error):
COBOL procedure error 211 at line 17 in ./ESCRITORIO/HEX/LIKEATOI.COB
(/home/dohitb/Escritorio/HEX/likeatoi.COB) compiled 17/03/05 20:37:29.
Then I commented out the FUNCTION part and compiled again:
Errors: 0, Warnings: 1, Lines: 37 for program LIKEATOI.
(The warning is for displaying a COMP variable; it's OK.)
Executed again (without the "num" part, and still with the COMP variable):
>A<> <
>A<>A<
Add "num" variable, change char to "K" and change COMP to PACKED-DECIMAL (in HEX: 4B)
>K<> <
>K<>K<
04900000
So, as I was saying, neither option is working. The closest so far is using PACKED-DECIMAL with a REDEFINES to PIC 9, but with hex positions higher than "A" it gives a "9", so it's still not valid.
I think it could be a matter of the local COLLATION.
FINAL EDIT
Now I made a variant of the original source code:
IDENTIFICATION DIVISION.
PROGRAM-ID. likeatoi.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  the-char            PIC X VALUE "K".
01  the-result          PIC 999.
01  the-other-result    BINARY PIC 9(4) VALUE ZERO.
01  FILLER-1 REDEFINES the-other-result.
    05  FILLER           PIC X.
    05  char-to-convert  PIC X.
01  the-comp-result     COMP PIC 9(4) VALUE ZERO.
01  FILLER-2 REDEFINES the-comp-result.
    05  FILLER           PIC X.
    05  char-to-convert  PIC X.
01  the-packed-result   PACKED-DECIMAL PIC 9(4) VALUE ZERO.
01  FILLER-3 REDEFINES the-packed-result.
    05  FILLER           PIC X.
    05  char-to-convert  PIC X.
01  num                 PIC 9(8).
01  alfa                PIC X(20) VALUE 'ABCDEFGHIJabcdefghij'.
01  FILLER REDEFINES alfa.
    05  char             PIC X OCCURS 20.
01  w-index             PIC 99 VALUE ZEROES.
PROCEDURE DIVISION.
MAINLINE.
    PERFORM VARYING w-index FROM 1 BY 1 UNTIL w-index > 20
        MOVE char(w-index) TO the-char
*       Variations of "Old School" code
        MOVE the-char TO char-to-convert OF FILLER-1
        MOVE the-char TO char-to-convert OF FILLER-2
        MOVE the-char TO char-to-convert OF FILLER-3
        DISPLAY ">" the-char "<" " with BINARY >" the-other-result "<"
        MOVE the-other-result TO num
        DISPLAY "Numeric value: " num
        DISPLAY ">" the-char "<" " with COMP >" the-comp-result "<"
        MOVE the-comp-result TO num
        DISPLAY "Numeric value: " num
        DISPLAY ">" the-char "<" " with PACKED >" the-packed-result "<"
        MOVE the-packed-result TO num
        DISPLAY "Numeric value: " num
    END-PERFORM.
    STOP RUN.
And, to my surprise, it gives me this output:
>A< with BINARY >A<
Numeric value: 00000065
>A< with COMP >A<
Numeric value: 00000100
(and so on...) So now it looks like it's working... Could it be because in my first try I was working with 05-level variables?
Looks like it's done now!
Thanks for everything, Bill; you will appear in the greetings section of my project :)
One last detail. If I do a MOVE,
MOVE 'A' TO CHAR
and then do all the binary stuff, the results are different... Here is an example:
with VALUE, for "D" I get 68, but with MOVE I get 60...
You have been suffering from using an old compiler. It is to the COBOL 85 Standard, but does not have the intrinsic functions which were a 1989 Extension to the Standard.
Also, it has a non-Standard behaviour which I have not encountered before, which is difficult to explain fully (not having access to that compiler).
The point of using the > and < in the DISPLAY is so that you always know exactly how long each output field is. You know whether there is a blank, or some non-printable character. Your DISPLAY of fields defined as COMP and BINARY only shows one character, rather than the four numeric digits that would typically be held in two bytes of storage (like an int, except with a limit of 9999).
Therefore I suggested the MOVE, where you then get the expected result when defined as BINARY and an... unexplained result when defined as COMP.
One explanation for the COMP result may be that COMPUTATIONAL fields are entirely down to the compiler implementor to define. So what is COMP on one system may not be the same type of field as COMP on another system (the same goes for COMP-1, COMP-2, COMP-3, etc.). This is why the 1985 Standard introduced new names (for example BINARY and PACKED-DECIMAL) so that they would be portable across COBOL compilers.
If you are stuck with using that compiler, you are unfortunate. If you have the possibility of using another compiler, you can find, amongst other choices, the open-source GnuCOBOL (I am a moderator on the discussion area of the GnuCOBOL project at SourceForge.Net). Use a different compiler if you can.
Here's an example program which will work on modern COBOL compilers using both the intrinsic function ORD and a way it used to be done (and probably is still done). Note, if your COMP field is "little endian", swap the order of the FILLER and field under the REDEFINES.
IDENTIFICATION DIVISION.
PROGRAM-ID. likeatoi.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  the-char          PIC X VALUE "A".
01  the-result        PIC 999.
01  the-other-result  BINARY PIC 9(4) VALUE ZERO.
01  FILLER REDEFINES the-other-result.
    05  FILLER           PIC X.
    05  char-to-convert  PIC X.
PROCEDURE DIVISION.
*   with intrinsic function
    MOVE FUNCTION ORD ( the-char ) TO the-result
    DISPLAY ">" the-char "<" ">" the-result "<"
*   Old School
    MOVE the-char TO char-to-convert
    DISPLAY ">" the-char "<" ">" the-other-result "<"
    STOP RUN
    .
The ORD is easy; it is effectively the same as your C conversion (assuming that gives you the position in the collating sequence).
The second, since COBOL traditionally can't have a one-byte binary, is a way of using REDEFINES to get a character into the low-order part of a two-byte binary, so that the whole binary field represents the numeric value of that character's representation.
The output from the above is:
>A<>066<
>A<>0065<
Note that ORD gives the position in the collating sequence (binary zero with ORD would return one) and the second is just giving the direct representation (binary zero would give zero).
To use either value you may want to "re-base" afterwards if you are only interested in printable characters.
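As a quick analogy in Python (my own illustration, not part of the original answer), the two results differ only by base, and re-basing is one subtraction:

# FUNCTION ORD is 1-based: ORD of "A" gives 66; the BINARY redefine gives 65.
assert ord("A") == 65            # direct representation, like the redefine trick
assert ord("A") + 1 == 66        # what COBOL's FUNCTION ORD returns
# "Re-basing" to index a table whose first slot corresponds to "A":
index = ord("K") - ord("A") + 1  # K -> 11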
Note, I'm confused that you have a compiler which supports in-line PERFORM but not intrinsic functions. If a USAGE of BINARY is rejected, use COMP instead.

How to read a non-standard DBF memo (BLOB) file from ACT?

I am trying to convert data from Act 2000 to a MySQL database. I have successfully imported the DBF files into individual MySQL tables. However, I am having issues with the *.BLB file, which seems to be a non-standard memo file.
The DBF files identify themselves as dBase III Plus, no memo format. There is a single *.BLB file, a memo file shared by multiple DBFs for their BLOB data.
If you read this document: http://cicorp.com/act/sdk/ACT6-SDK-ChapterA.htm#_Toc483994053
you can see that the REGARDING column is a 6-character one. The description is: "This 6-byte field is supplied by the system and contains a reference to a field in the Binary Large Object (BLOB) Database."
Upon opening the *.BLB I can see that the block size is 64 bytes; all blocks of text are NULL-padded out to that size.
Where I am stumbling is converting the values stored in the REGARDING column into block locations in the BLB file. My assumption is that the 6-character field is an offset.
For example, one value for REGARDING is, (ignoring the square brackets): [ ",J$]
In my Googling, I found this: http://ulisse.elettra.trieste.it/services/doc/dbase/DBFstruct.htm#C1.5
It explains that in memo fields (in normal DBF files, at least) the space value is ignored (i.e., it pads out the column).
Therefore, if I'm correct, [",J$] (again, square brackets excluded) should be the offset into my BLB file. Luckily I still have access to the original ACT 2000 software, so I can compare the full text in the program / MySQL against the BLB file.
Using my example value, I know that the DB row with a REGARDING value of [ ",J$] corresponds to a 1024-byte offset (or 16 blocks, assuming my guess of a 64-byte block size is right).
I've tried reading Python code from open-source projects that read DBF files, but I'm in over my head.
I think what I need to do is unpack the characters to binary, but I am not sure.
How can I find the right 64-byte block to read from, based on what's found in the DBF files?
EDIT by Jerry Dodge
I've attempted to convert the strings in this field to hexadecimal values, and then to an integer using StrToInt64, but the result still does not match up with the blob file. I've also tried both multiplying this integer value by 64 and not multiplying, but the result keeps winding up outside the size of the blob file, not actually finding any data.
For example, a value of ___/BD (_ = space) translates to $2F4244 hexadecimal, which in turn is the integer value 3097156, but it does not correspond to any relevant portion of data in the blob file, even when multiplied or divided by 64.
According to the SDK you linked, as I understand it, the following happens:
There is a TYPE field (right behind REGARDING) that encodes what REGARDING is used for (see the second table of the linked chapter). So I'd assume that if TYPE=6 (meeting not held), REGARDING is either irrelevant or only contains a meeting ID referencing some other table. On that line of thought, I would only expect REGARDING to be a BLB offset if TYPE=101 (or possibly 100). I'd also not abandon the thought that, in these relevant cases, REGARDING might be a concatenation of a BLB file index and an offset (because there is a mention that each file must not be longer than 30K chars, and I'd really expect to be able to store much more data, even in one table).
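Since the actual encoding of REGARDING remains unknown, the following Python sketch is purely hypothetical; it just packages the decodings the question and the edit already tried (strip the space padding, read the rest as a big-endian integer, and test it as a raw byte offset, a 64-byte block number, and a pre-multiplied offset):

def regarding_candidates(field: bytes, block_size: int = 64) -> dict:
    """Candidate decodings of the 6-char REGARDING field (all assumptions)."""
    value = int.from_bytes(field.strip(b" "), "big")  # spaces treated as padding
    return {
        "as byte offset":            value,
        "as block number (x 64)":    value * block_size,
        "as offset / 64":            value // block_size,
    }

# The question's example: a REGARDING of ' ",J$' should land at byte offset
# 1024, which none of these candidates produce -- consistent with the failed
# attempts described above, so the real encoding is likely something else.
print(regarding_candidates(b' ",J$'))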
