I have the following raw text:
.
It represents a gz-compressed value. Here is the raw text uncompressed:
.
Now I would like to encrypt it using AES in CBC mode with a 128-bit key. Here are the key and the initialization vector I'm using:
key: asdfgh0123456789
iv: 0123456789asdfgh
Anyway, when I try to decrypt the ciphertext obtained from encryption, I get the base64-encoded raw text as my input.
[Here][1] is the website service I'm using to encrypt and decrypt.
Why does my input change automatically to base64? What's wrong?
Screenshot: [![enter image description here][2]][2]
The problem with a sequence of bytes is that it cannot simply be shown on screen: a single byte can take 256 different values, but the alphabet only has 26 different letters.
Converted to base64, any sequence of bytes can be represented using only letters, digits and a few symbols.
1) The text TEST is represented in bytes as 54 45 53 54 and as base64 as VEVTVA==
2) GZ of TEST is represented in bytes as 1f 8b 08 00 00 00 00 00 00 ff 0b 71 0d 0e e1 e2 02 00 cf 1b f9 1e 06 00 00 00 and as base64 as H4sIAAAAAAAA/wtxDQ7h4gIAzxv5HgYAAAA=
Now you can encrypt either the bytes of the gz or the base64 of the gz; they are different inputs.
When using a web page that takes text as input, you are better off using the base64 of the gz. Then, when you decrypt, you get back exactly what you put in.
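To make the two representations above concrete, here is a small Python sketch (the exact gzip bytes can differ slightly from the ones shown in the question, because the gzip header also stores a timestamp, a compression-level hint and an OS byte):

```python
import base64
import gzip

text = b"TEST"
print(text.hex(" "))                       # 54 45 53 54
print(base64.b64encode(text).decode())     # VEVTVA==

gz = gzip.compress(text, mtime=0)          # mtime=0 keeps the header reproducible
print(gz.hex(" "))                         # starts with 1f 8b 08 ...
print(base64.b64encode(gz).decode())       # close to H4sIAAAAAAAA/wtxDQ7h4gIAzxv5HgYAAAA= from the question
```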
First and foremost, you have to understand that you are not encrypting the gzipped raw text itself. You are actually encrypting the base64 form of your gzipped raw text.
Symmetric encryption algorithms like AES are useful because, given a cleartext and a key, they let you encrypt and decrypt: your input is first transformed into ciphertext and then back into the cleartext.
As shown in your screenshot, you achieved exactly that, so nothing is really wrong. To get back the original gzipped raw text, you just have to base64-decode the output of the decryption and you'll get what you are looking for.
Moreover, as you already know, not all byte values can be represented by visible symbols; this is the main reason encrypted text or binary data is often represented using base64. Compared to other encoding schemes it has little overhead: hex encoding doubles the input size, while base64 makes it (on average) about 33% bigger.
As a bonus, here is a useful tool for this kind of transformation: CyberChef.
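For completeness, here is a sketch of the whole round trip in Python with PyCryptodome (this is not the website's code; its padding may differ, so the ciphertext won't necessarily match byte for byte, but the flow is the same): gzip, base64-encode, AES-CBC encrypt, then decrypt, base64-decode and gunzip.

```python
import base64
import gzip

from Crypto.Cipher import AES              # PyCryptodome
from Crypto.Util.Padding import pad, unpad

key = b"asdfgh0123456789"                  # 16 bytes = AES-128
iv  = b"0123456789asdfgh"

# What the web tool actually encrypts: the base64 text of the gzipped data.
payload = base64.b64encode(gzip.compress(b"TEST"))

ciphertext = AES.new(key, AES.MODE_CBC, iv).encrypt(pad(payload, AES.block_size))

# Decrypting gives back the base64 text; decode it and gunzip to recover the original.
decrypted = unpad(AES.new(key, AES.MODE_CBC, iv).decrypt(ciphertext), AES.block_size)
print(gzip.decompress(base64.b64decode(decrypted)))   # b'TEST'
```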
I exported a .hex file for a PIC32 microcontroller; the content starts like this:
:020000040000fa
:020000041fc01b
:042ff000ffff3fffa1
:020000040000fa
:020000041fc01b
:042ff400b9fff8bf6a
:020000040000fa
:020000041fc01b
:042ff800d9df74ffaa
:020000040000fa
:020000041fc01b
:042ffc00f3ffff7e62
:020000040000fa
:020000041d00dd
After reading some articles about the Intel HEX format, I am still a bit confused by this output.
Let's have a look at the first three lines only:
:02 0000 04 0000 fa
:02 0000 04 1fc0 1b
:04 2ff0 00 ffff3fff a1
If the third field is 04, as it is for the first two lines, the record is an Extended Linear Address record: its data field holds the upper 16 bits of the address, which are put in front of the 16-bit address declared in subsequent type-00 data records (like the third line).
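That is roughly how I am decoding the records, e.g. with this small parser sketch (the helper name is mine, just to make the question concrete):

```python
def parse_ihex_line(line, upper=0):
    raw = bytes.fromhex(line[1:])                     # drop the leading ':'
    count = raw[0]
    addr = int.from_bytes(raw[1:3], "big")
    rectype = raw[3]
    payload = raw[4:4 + count]
    assert sum(raw) % 256 == 0, "bad checksum"        # all bytes, incl. checksum, sum to 0 mod 256
    if rectype == 0x04:                               # Extended Linear Address record
        upper = int.from_bytes(payload, "big") << 16  # new upper 16 bits of the address
    elif rectype == 0x00:                             # plain data record
        print(f"data at 0x{upper + addr:08x}: {payload.hex()}")
    return upper

upper = 0
for line in (":020000040000fa", ":020000041fc01b", ":042ff000ffff3fffa1"):
    upper = parse_ihex_line(line, upper)
# -> data at 0x1fc02ff0: ffff3fff
```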
What confuses me is that there are ALWAYS two 04 lines directly under each other, even though a 04 line is only valid until the next one appears.
So why are there always two?
While parsing my NTFS-formatted hard disk, I found some invalid INDX entries, even though Windows is still able to list all the root directory contents!
The structure of the Index Record in NTFS 3.1 is clear (NTFS doc):
Offset Description
-------------------------------------
0x00 MFT Reference of the file
0x08 Size of the index entry
0x0A Offset to the filename
...
0x52 Filename
...
However, I found some entries whose size is faulty, as is their MFT reference (which is a bunch of zeros)!
I enclose a screenshot that shows a part of the INDX record alongside its text representation, where each line is 0x20 bytes wide. I highlighted the faulty part.
The figure shows that the entries parse fine up to the last correct entry at 0x0628:
MFT Reference (8 bytes): 66 30 00 00 00 00 01 00
Size of entry (2 bytes): 70 00
So the entry ends at 0x0697.
Thereafter, things get weird. The entry at 0x0698:
MFT Reference (8 bytes): 00 00 00 00 00 00 00 00 Seems invalid
Size of entry (2 bytes): 10 00 Of course invalid, because this is less than the minimum entry size, which must at least reach the filename at offset 0x52.
To me, it seems that "Buziol Games" was a deleted folder in the root directory of the hard disk, but I am not sure. Either way, Windows Explorer has no trouble listing the contents.
Does anybody understand how this works? How does Windows continue parsing?
EDIT: In addition, please find the hex dump as plain text on pastebin.
As files get renamed, added, and deleted, INDX records end up containing unzeroized slack space at their end. Each INDX "page" is always 4096 bytes long, and as files get deleted the B+ tree nodes get shifted, leaving old, abandoned nodes at the end of INDX pages. This is very useful for forensics.
The "Buziol Games" entry appears to be a perfectly valid INDX record. Why do you think it was deleted?
Note that the INDX header (right where the "INDX" string is) can tell you how many entries there are in the page - check out offset 0x1c (size of index entries) vs offset 0x20 (allocated size of index entries). And note that these are relative to offset 0x18.
So, looking at your pastebin output, at offset 0x1c we find the value 0x690, which means the entries end at 0x18 + 0x690 = 0x6A8. The entry you see at offset 0x698 seems to be a kind of "null" entry, as per https://0cch.com/ntfsdoc/concepts/index_record.html:
last entry has a size of 0x10 (just large enough
for the flags (and a mft ref of zero))
Note that its size is 0x10, which means it ends at 0x6A8, as expected.
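A minimal sketch of that walk in Python, assuming the ntfsdoc layout referenced above (the field names are mine, and flag 0x02 is taken to mark the last entry):

```python
import struct

def walk_indx_entries(page: bytes) -> None:
    """Walk the entries of one 4096-byte INDX page (layout per ntfsdoc)."""
    assert page[0:4] == b"INDX"
    node = 0x18                                 # the index node header starts here
    first_off, entries_size, alloc_size = struct.unpack_from("<III", page, node)
    end = node + entries_size                   # 0x18 + 0x690 = 0x6A8 in the question
    pos = node + first_off
    while pos < end:
        mft_ref, entry_size = struct.unpack_from("<QH", page, pos)
        flags = page[pos + 0x0C]
        print(f"entry at 0x{pos:04x}: mft_ref=0x{mft_ref:016x} size=0x{entry_size:x} flags=0x{flags:x}")
        if flags & 0x02:                        # last ("null") entry: size 0x10, zero MFT ref
            break
        pos += entry_size
```

Anything past that last entry is slack space, which is why a parser should stop at the size reported in the header rather than at the end of the 4096-byte page.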
See also https://www.fireeye.com/blog/threat-research/2012/10/incident-response-ntfs-indx-buffers-part-4-br-internal.html.
A good description of NTFS can be found at http://dubeyko.com/development/FileSystems/NTFS/ntfsdoc.pdf.
I'm trying to find out the key of a XOR encryption, but something seems to be wrong.
The encrypted file begins with this hex string:
78DF2B983C9428942894892CD8D6CFF4F8942895289428952894319428949BFD8F0289089D068DC556C458C25D94289428942894289428942894289428942894289428942894289C289428942894289428942894289428942894289428942894289428942894289428942894289428942894289428942894289428942894289428942894289428942894289428942894..
I know this is a zip-compressed file. The zip header structure is something like:
504B0304140000000800CC898C459C47ECC9DA8350002DC06400120000003937373048442D56312E30302E352E676E78 (https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html)
The first 4 bytes of a zip header are always the signature:
504B0304. After XORing this with the encrypted hex, I get: 2894289c
But 2894289c seems to be wrong. Why? Look at the encrypted hex string:
there are lots of '2894' values in the content. These should decrypt to '0000', so is '2894' the right key?
Now I XOR the encrypted zip header signature with the key '2894':
78DF2B98 xor 28942894 => 504b030c
Pretty strange, because the last byte is '0c' but it should be '04'!
What am I doing wrong?
(Sorry for my English.)
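For reference, here is the arithmetic above as a short Python snippet, so it is easy to retry with other candidate keys (it only reproduces my experiment; it does not recover the real key):

```python
from itertools import cycle

ct_start = bytes.fromhex("78DF2B98")              # first 4 encrypted bytes
zip_sig  = bytes.fromhex("504B0304")              # expected ZIP local file header signature

# XOR ciphertext with known plaintext -> candidate key stream
print(bytes(a ^ b for a, b in zip(ct_start, zip_sig)).hex())   # 2894289c

def xor_with_key(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# Trying the repeating 2-byte key guessed from the '2894' runs:
print(xor_with_key(ct_start, bytes.fromhex("2894")).hex())     # 504b030c, not 504b0304
```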
I'm having an encoding problem related to cookies on one of my websites.
A user is inputting Usuário, which has an acute accent, and that's being put in a cookie. The raw hex for the cookie response is (for the Usuário string):
55 73 75 C3 A1 72 69 6F
When I see it in the browser, it looks like this:
...which is really messy. I need to fix this up.
Then I went to this website: http://www.rapidtables.com/convert/number/hex-to-ascii.htm and converted the hex value to see what it would look like. And I got the same output:
Right. This means the hex code is wrong. Then I tried to convert Usuário to ASCII to see how it should look. I used this website: http://www.asciitohex.com/ and this is the result:
To my surprise, the hex is exactly the one that is showing up messy. Why???
And how do I represent Usuário in ASCII so I can put it in a cookie? Should I manually encode it?
PS: I'm using ASP.NET, just in case it matters.
As of 2015, the standard encoding on the web for character data is UTF-8, not ASCII. ASCII only contains the first 128 characters of a codepage and does not include any accented characters. To add accented characters to these 128 there were many legacy solutions: codepages. Each of them added 128 different characters on top of the default ASCII list, allowing 256 different characters to be represented.
The problem was that this didn't properly solve the issue: ASCII-based codepages were more or less incompatible with each other (except for the first 128 characters), and there was usually no way of programmatically knowing which codepage was in use.
One of the solutions was UTF-8, which is a way to encode the Unicode character set (containing most of the characters used around the world, and more) while trying to remain compatible with ASCII. The first 128 characters are the same in both, but beyond those, UTF-8 characters become multi-byte: one character is encoded as a series of bytes (usually 2-3, depending on which character needs to be encoded).
The problem arises if you are using an ASCII-based single-byte codepage (like ISO-8859-1), which encodes each supported character as a single byte, while your input is actually UTF-8, which encodes accented characters as multiple bytes (you can see this in your hex example: á is encoded as C3 A1, two bytes). If you read those two bytes under an ASCII-based codepage that uses a single byte per character (in Western Europe this is usually ISO-8859-1), each of the two bytes will be rendered as a separate, different character.
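Here is a minimal reproduction of exactly that mismatch in Python (only to illustrate the byte-level behaviour; your stack is ASP.NET, but the bytes are the same):

```python
s = "Usuário"

utf8_bytes = s.encode("utf-8")
print(utf8_bytes.hex(" "))              # 55 73 75 c3 a1 72 69 6f  -> the bytes in your cookie

# Reading those UTF-8 bytes as ISO-8859-1 splits the two-byte 'á' into two characters:
print(utf8_bytes.decode("iso-8859-1"))  # UsuÃ¡rio
```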
On the web the default encoding is UTF-8, so your clients will usually send their requests using UTF-8. ASP.NET is Unicode-aware, so it can handle these requests. However, somewhere in your code this UTF-8 is accidentally converted into ISO-8859-1 and then back into UTF-8. This can happen on various layers; given your symptoms, it probably happens at the cookie layer, which is sometimes problematic (here is how it worked in 2009). You should also double-check that your application uses UTF-8 everywhere else (views, database, etc.) if you want to properly support accented characters.
I am using the as3crypto library to get the AES algorithm working in a small project I am doing. This is how I get the cipher:
var cipher:ICipher = Crypto.getCipher("simple-aes-cbc", key, Crypto.getPad("pkcs5"));
As you can see, I am trying to use AES-128 with CBC and the pkcs5 padding.
If my source data is 128 bytes long, the encrypted data comes out as 160 bytes. Can someone tell me why this happens?
Following is a small table that I compiled from a sample program.
Source string length | Encrypted string length
15 | 32
16 | 48
31 | 48
32 | 64
Is it supposed to be like this, or have I made some mistake?
It is supposed to be like that. You asked for PKCS5 padding which always adds at least one byte of padding. And, of course, the input must be rounded up to some number of whole blocks because AES produces 16-byte chunks of output. With half a block, you cannot decrypt any of the input at all.
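Here is a quick sanity check of the sizes in your table, as plain Python arithmetic. The extra 16 bytes per row are, I believe, the IV that as3crypto's "simple-" cipher names prepend to the ciphertext so it can be recovered at decryption time; treat that as an assumption about the library rather than something visible in your snippet.

```python
BLOCK = 16  # AES block size in bytes

def expected_len(plaintext_len: int, iv_prepended: bool = True) -> int:
    # PKCS#5/#7 always adds 1..16 bytes so the result is a whole number of blocks.
    padded = (plaintext_len // BLOCK + 1) * BLOCK
    return padded + (BLOCK if iv_prepended else 0)   # assumed prepended IV (see note above)

for n in (15, 16, 31, 32, 128):
    print(n, "->", expected_len(n))
# 15 -> 32, 16 -> 48, 31 -> 48, 32 -> 64, 128 -> 160   (matches the table)
```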