I intercepted an HTTP request sent from the Android app to Instagram while creating an account. Here is the POST data that is sent to Instagram:
e95cef1c47aa4c85ee7555403af92acb80aca9266e8edf77a7fb75b37795c735 {"allow_contacts_sync":"true","sn_result":"API_ERROR:+null","phone_id":"7520e5f4-b4a6-4bd9-a445-972641476fde","_csrftoken":"JlMrKwuiXF6pPB5q98Srx2TZR1MrKCfe","username":"michaelabramobics2","first_name":"Michael","adid":"dac68c0e-4307-4753-8c07-3ea2c26187dd","guid":"fa13e631-1663-49cf-a507-e62dbb03012b","device_id":"android-4d0577bf20b57285","email":"michaelabramobics2@gmail.com","sn_nonce":"bWljaGFlbGFicmFtb2JpY3MyQGdtYWlsLmNvbXwxNTI2NzM1Nzk4fBgiGpUFAo8qZWzGlVPG02r9zOXztwLQnQ==","force_sign_up_code":"","waterfall_id":"52d43d05-7cac-468a-8b10-2f2499eb7cf2","qs_stamp":"","password":"123456789"}
How can I decode this parameter?
"sn_nonce":"bWljaGFlbGFicmFtb2JpY3MyQGdtYWlsLmNvbXwxNTI2NzM1Nzk4fBgiGpUFAo8qZWzGlVPG02r9zOXztwLQnQ=="
Base64 decoding returns:
michaelabramobics2@gmail.com|1526735798|"*elƕSjН
Email | Unix time | ... and what?
What is the last value? How can I find out what encoding it is?
I will be very grateful for help.
A nonce is a number used once. Generally, however, the nonce consists of bytes, and they are often random bytes. Whether it is used as a number or just as binary data depends on the protocol. If it is used as a number, it is likely a statically sized, unsigned, big-endian (or sometimes little-endian) number. But most often the nonce simply consists of random bytes.
Random bytes, of course, will not display as nicely as the mail address or the Unix time. Because the bytes are not encoded text, decoding them as text will generally produce garbage. If the decoder assumes Unicode, or if there are unprintable characters, the result is often shorter than you would expect, as bytes are combined into multi-byte sequences or dropped entirely.
In hexadecimal, the last part reads (converted using the tomeko.net online decoder):
18221A9505028F2A656CC69553C6D36AFDCCE5F3B702D09D
which looks fairly random to me; it's certainly not text in any common encoding. The 24 bytes are also a common length for cryptographically secure nonces, keys and such, so that strengthens the assumption that this is a random nonce.
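If you want to reproduce this analysis yourself, here is a minimal Python sketch; the split on '|' is just an assumption based on the decoded output:

```python
import base64

sn_nonce = ("bWljaGFlbGFicmFtb2JpY3MyQGdtYWlsLmNvbXwxNTI2NzM1Nzk4"
            "fBgiGpUFAo8qZWzGlVPG02r9zOXztwLQnQ==")

raw = base64.b64decode(sn_nonce)

# The first two fields are ASCII text separated by '|';
# maxsplit=2 keeps any '|' bytes inside the binary tail intact.
email, unixtime, tail = raw.split(b"|", 2)
print(email.decode("ascii"))     # michaelabramobics2@gmail.com
print(unixtime.decode("ascii"))  # 1526735798

# The remainder is binary, so display it as hex rather than garbled text.
print(tail.hex().upper())        # 18221A9505028F2A...
print(len(tail), "bytes")        # 24 bytes
```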
Related
My question is: Is there a reliable way to detect if a hex / base64 string is actually encrypted, or just encoded?
(I did a quick search, but I only seem to find explanations of the difference between encryption and encoding; none seems to say how to detect encryption in general...)
I don't need to know what kind of encryption it is, just to detect whether it is encrypted or not and return an error if it is not, thus enforcing encryption.
String size may vary from a couple of bytes to kilobytes...
Is there a C/C++ library available for that?
If you think you're working with encoded/encrypted plaintext, the most obvious thing to do would be to try decoding it with various standard encodings and see if what you get back looks like plain English, or at least like what you're looking for.
Beyond that, there are a few things you could try:
If you had a perfectly encrypted string, it would be indistinguishable from random noise, so if you can see significant correlations in your string, you probably have imperfectly encrypted data, or straight up encoded plaintext.
To find this, you can compute the "Index of Coincidence" for the string, or look for repeated blocks of data. If you find repeats, it's either unencrypted or, if the repeats are multiples of 16 bytes (or another suitable block length) long and block-aligned, it might be encrypted in ECB mode, where the block cipher is applied to each 16-byte block independently, so identical plaintext blocks produce identical ciphertext blocks.
I would say your best bet is to see how random your string is: if it's really hard to find correlations, then it's probably well encrypted. If the same bits of encrypted/encoded text keep popping up, it's probably just encoded.
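As a rough illustration (not a reliable detector), here is a small Python sketch computing a byte-level index of coincidence and counting repeated 16-byte blocks; the input file name is hypothetical:

```python
from collections import Counter

def index_of_coincidence(data: bytes) -> float:
    """Probability that two bytes drawn at random from the data are equal.
    Uniformly random bytes give about 1/256 ~ 0.0039; text scores much higher."""
    n = len(data)
    if n < 2:
        return 0.0
    counts = Counter(data)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def repeated_blocks(data: bytes, block: int = 16) -> int:
    """Count block-sized chunks that occur more than once (a classic ECB tell)."""
    chunks = [data[i:i + block] for i in range(0, len(data) - block + 1, block)]
    return sum(c - 1 for c in Counter(chunks).values() if c > 1)

sample = open("blob.bin", "rb").read()  # hypothetical input file
print("index of coincidence:", index_of_coincidence(sample))
print("repeated 16-byte blocks:", repeated_blocks(sample))
```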
I have a requirement in one of my projects where I am expecting a few of the incoming fields to be AES-256 encrypted when sent to us by an upstream system. The incoming file is comma-delimited. Is there a possibility that the AES-encrypted fields may contain ",", throwing the values off into different fields? What about if it is pipe-delimited or uses some other delimiter?
Also, what should be the datatype of these encrypted fields in order to read them with an ETL tool?
Thanks in Advance
AES, as a block cipher, is a family of permutations selected by the key. The output is expected to look random (more precisely, we believe that AES is a pseudo-random permutation).
AES (like any block cipher) outputs binary data, usually as a byte array, and each byte can take any value between 0 and 255 with equal probability. So yes, a raw ciphertext byte can collide with your delimiter, whatever delimiter you choose.
You are not alone;
Transmitting binary data can create problems, especially in protocols that are designed to deal with textual data. To avoid this altogether, we don't transmit raw binary data. Many of the encryption-related programming errors on Stack Overflow are due to sending binary data over text-based protocols. Most of the time this works, but occasionally it fails, and the coder wonders about the problem: the binary data has corrupted the network protocol.
Therefore hex, Base64, or similar encodings are necessary to mitigate this. Base64 is not totally URL-safe, but one can make it URL-safe with a little work.
And note that this has nothing to do with security; it is about visibility and interoperability.
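To make this concrete, here is a minimal Python sketch; os.urandom stands in for real AES-256 ciphertext (an assumption for brevity), and the record layout is hypothetical:

```python
import base64
import os

ciphertext = os.urandom(32)  # stand-in for an AES-256-encrypted field
print(b"," in ciphertext)    # can easily be True: every byte value occurs

# Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=', so it can
# never collide with a comma or pipe delimiter.
encoded = base64.b64encode(ciphertext).decode("ascii")
row = ",".join(["12345", encoded, "2018-05-19"])
print(row)

# The ETL side treats the field as plain text and decodes before decrypting.
recovered = base64.b64decode(row.split(",")[1])
assert recovered == ciphertext
```

So the answer to the datatype question is: store the encoded field as an ordinary string (VARCHAR/text).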
I'm a bit confused about how people represent binary data and how it is sent over networks. I will explain through Wikipedia's example (shown here: https://imgur.com/a/POELH). So I have my binary data encoded as Base64, and I am sending the text TWFu: T, then W, then F, and finally u. But to send T, a char, I will need one byte, like I've always been told: one character sent over a network is one byte.
Because now I've come to think that if I encode 24 bits (3 bytes), I will be sending 4 characters, but to send 4 characters I need the same number of bytes as characters??
So when sending "Man" (unencoded, normally requiring 3 bytes) vs. "TWFu" (encoded, normally requiring 4 bytes) from the example above, is the same sequence of bits sent over the network? Because the last time I used a socket to send data, it just asked for a string input, never a text-plus-encoding input.
Synopsis: "How" is an agreement. "Raw" is common.
Data is sent in whichever way the sender and receiver agree. There are many protocols that are standard agreements. Protocols operate at many levels. A very common pair that covers two levels is TCP/IP. Many higher-level protocols are layered on top of them. (A higher-level protocol may or may not depend on specific underlying protocols.) HTTP and SMTP are very common higher-level protocols, often with SSL sandwiched in between.
Sometimes the layers or the software that implements them is called a stack. There is also the reference (or conceptual) OSI Model. The key point about it is that it provides a language to talk about different layers. The layers it defines may or may not map to any specific stack.
Your question is too vague to answer directly. With HTTP, "raw" binary data is transferred all the time. The HTTP headers can give the length of the body in octets, and the body follows the header. As part of the agreement between the sender and receiver, the header might give metadata about the binary data using MIME headers. For example, your Gravatar is sent with headers including:
content-length:871
content-type:image/png
That's enough for the receiver to know that the sender claims it is a PNG graphic of 871 bytes. The receiver will read the header, then read 871 bytes for the body, and then assume that what follows is another HTTP header.
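As a quick illustration, this Python sketch fetches an image and prints the two headers doing that work (the host and path are only an example):

```python
import http.client

# Example request; any static image URL behaves the same way.
conn = http.client.HTTPSConnection("www.gravatar.com")
conn.request("GET", "/avatar/00000000000000000000000000000000?d=identicon")
resp = conn.getresponse()

print(resp.getheader("Content-Type"))    # e.g. image/png
print(resp.getheader("Content-Length"))  # e.g. 871
body = resp.read()                       # the raw binary body, no Base64
print(len(body), "bytes of raw image data")
```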
Some protocols use synchronization methods other than bodies with pre-declared sizes. They might be entirely text-based and use a syntax that allows only certain characters. They can be extended, by a nesting agreement, to use something like Base64 to represent binary data as text.
Some layers might provide data compression of sufficient density that expansion by higher layers, such as Base64, is not a great concern. See HTTP Compression, for example.
If you want to see HTTP in action, hit F12 and go to the Network tab. If you want to see other protocols active on your computer, try Wireshark, Microsoft Message Analyzer, Fiddler or similar.
Base64 is a method for encoding arbitrary 8-bit data in a purely 7-bit channel. Although the internet is built on the principle of 8-bit bytes, text-mode transfers are presumed to be 7-bit ASCII unless otherwise specified.
If you're sending that data Base64-encoded then you'll literally send TWFu. Many text-based protocols use Base64 out of convenience: it's an established standard and it's efficient enough for most applications.
The foundation of the internet, IP, is a protocol based on 8-bit bytes. When sending binary data you can make full use of all 8 bits, but if you're working with a text-mode protocol, of which there are many, you're generally stuck using 7-bit ASCII unless the protocol has a way of specifying which character set or encoding you're using.
If you have the option to switch to a "binary" transfer then you can side-step the need for Base64. If you're working with a 7-bit ASCII protocol then you're probably going to need Base64.
Note this isn't the only method for encoding arbitrary binary data. There's also quoted-printable, as used in email, and URI encoding for URLs. These are more efficient in cases where escaping is exceptional, but far less efficient if it's required for nearly every character.
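To see those trade-offs in numbers, here is a small Python comparison on random binary data (the 1000-byte size is arbitrary, and exact results vary per run):

```python
import base64
import os
import quopri
import urllib.parse

data = os.urandom(1000)  # arbitrary binary payload

print("raw:             ", len(data))
print("base64:          ", len(base64.b64encode(data)))               # fixed ~4/3x
print("quoted-printable:", len(quopri.encodestring(data)))            # roughly 2-2.5x on random bytes
print("URI escaping:    ", len(urllib.parse.quote_from_bytes(data)))  # roughly 2.5x on random bytes
```

On mostly-ASCII text the last two beat Base64 easily, which is exactly the point made above.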
If you know you're dealing with 7-bit text only, there's no need for Base64 encoding.
However, if you needed to send
Man
Boy
over a purely 7-bit text channel, you couldn't send it literally with the line breaks, since the channel may treat or rewrite them as protocol-significant. Instead, you'd send it encoded in Base64:
TWFuDQpCb3kNCg==
which has the line breaks encoded but doesn't use incompatible characters. Of course, the receiver needs to know that you're sending encoded text, either implied by the protocol or explicitly marked in some way.
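You can check that example directly in Python; the DQp and NCg groups show the line breaks were encoded as CRLF:

```python
import base64

message = b"Man\r\nBoy\r\n"
encoded = base64.b64encode(message)
print(encoded)                               # b'TWFuDQpCb3kNCg=='
assert base64.b64decode(encoded) == message  # round-trips exactly
```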
I am sending some video files (the size could even be in gigabytes) as application/x-www-form-urlencoded over HTTP POST.
The following link suggests that it would be better to transmit it as multipart form data when we have non-alphanumeric content.
Which encoding would be better to transmit data of this kind?
Also, how can I find the length of the encoded data (data encoded as application/x-www-form-urlencoded)?
Will encoding the binary data consume much time?
In general, encoding replaces the non-alphanumeric characters with others. So, can we skip encoding for binary data (like video)? How can we skip it?
x-www-form-urlencoded treats the value of an entry in the form data set as a sequence of bytes (octets).
Of the possible 256 values, only 66 are left as is or still encoded as a single byte; the others are replaced by the hexadecimal representation of their code point.
This usually takes three to five bytes depending on the encoding.
So, on average, (256-66)/256, or about 74%, of the file will be encoded, taking three to five times as much space as it did originally.
This encoding, however, has no header and no other significant overhead.
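You can check that estimate empirically in Python; urllib.parse.quote_from_bytes happens to leave exactly 66 byte values unescaped when given an empty safe set:

```python
import os
import urllib.parse

data = os.urandom(100_000)
encoded = urllib.parse.quote_from_bytes(data, safe=b"")

# 66 byte values pass through unchanged; the other 190 become "%XX".
expected = (66 / 256) * 1 + (190 / 256) * 3
print("expected ratio:", round(expected, 3))                  # ~2.48
print("measured ratio:", round(len(encoded) / len(data), 3))  # close to it
```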
multipart/form-data instead works by dividing the data into parts and then finding a string of any length that doesn't occur in said part.
Such a string is called the boundary, and it is used to delimit the end of the part, which is transmitted as a stream of octets.
So the file is mostly sent as is, with negligible size overhead for big enough data.
The drawback is that the user agent needs to find a suitable boundary; however, given a string of length k, there is only a probability of 2^(-8k) of finding that string at any given position in a uniformly random binary file.
So the user agent can simply generate a random string, do a quick search, and exploit the network transmission time to hide the latency of the search.
You should use multipart/form-data.
This depends on the platform you are using; in general, if you cannot access the request body, you have to re-perform the encoding yourself.
For multipart/form-data encoding there is a little overhead, usually negligible compared to the transmission time.
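For completeness, here is a minimal multipart upload sketch in Python, assuming the third-party requests library and a hypothetical endpoint:

```python
import requests

url = "https://example.com/upload"  # hypothetical endpoint

with open("video.mp4", "rb") as f:
    # files= makes requests build a multipart/form-data body with a random
    # boundary; the file bytes are sent as-is, not percent-encoded.
    response = requests.post(url, files={"video": ("video.mp4", f, "video/mp4")})

print(response.status_code)
```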
I want to encrypt small serialized data structures (~256 bytes) so I can pass them around (especially in URLs) safely. My current approach is to use a symmetric block cipher, then Base64-encode, then URL-encode the ciphertext. This yields an encoded ciphertext that is (unsurprisingly) quite a bit longer than the original data structure. The length of these encoded ciphertexts is a bit of a usability problem; ideally I'd like the ciphertext to be around the same length as the input text.
Is there a block cipher that can be configured to constrain the values of the output bytes to be in the URL-safe range? I assume there would be a security trade-off involved if there is.
For a given key K, a cipher has to produce a different ciphertext for each plaintext. If your message space is all 256-byte strings, the cipher has to be able to produce at least 256^256 different messages. That requires at least 256 bytes of output, and any reduction in the size of the output alphabet requires longer messages.
As you've seen, you can do some encoding afterward to avoid certain output symbols, at the cost of increased length. Furthermore, you would pay the same cost if the encoding were part of the encryption algorithm proper. That's why this isn't a feature of any encryption algorithm.
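To put numbers on that, as a back-of-the-envelope check: with the 64-symbol URL-safe alphabet, a message of n symbols can represent 64^n values, so you need 64^n >= 256^256, i.e. n >= 256 * 8/6 ≈ 342 symbols. That 4/3 blow-up is exactly what Base64 already gives you, so a hypothetical "URL-safe cipher" could not do fundamentally better.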
As others have mentioned, the only real answer is to reduce the size of the data you are encrypting so that you need to encode less data (either that, or don't put the data in URLs in the first place, e.g. store the data in a database and put a unique id in the URL). So: compress > encrypt > encode.
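A minimal sketch of that pipeline in Python, assuming the third-party pyca/cryptography package for the AES-GCM step (any authenticated cipher would do):

```python
import base64
import os
import zlib

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)

def to_url_token(data: bytes, key: bytes) -> str:
    compressed = zlib.compress(data)                       # 1. compress
    nonce = os.urandom(12)
    sealed = AESGCM(key).encrypt(nonce, compressed, None)  # 2. encrypt
    # 3. encode: URL-safe Base64, with the nonce prepended for decryption
    return base64.urlsafe_b64encode(nonce + sealed).decode("ascii")

def from_url_token(token: str, key: bytes) -> bytes:
    raw = base64.urlsafe_b64decode(token)
    nonce, sealed = raw[:12], raw[12:]
    return zlib.decompress(AESGCM(key).decrypt(nonce, sealed, None))

token = to_url_token(b"example serialized structure", key)
assert from_url_token(token, key) == b"example serialized structure"
```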
If your data structure is 256 bytes long, encrypting it with a block cipher with an 8-byte block size increases it by up to 8 bytes of padding (depending on the concrete input length).
Therefore, before applying Base64, you have up to 264 bytes, which the Base64 encoding increases to up to 352 bytes.
So, as you can see, most of the overhead is created by the Base64 encoding. There are some slightly more efficient encodings available, like base91, but they are very uncommon.
If size matters, I would recommend compressing the data before encrypting it.
URL encoding will not significantly expand a Base64-encoded string, since 62 of the 64 characters do not need to be modified. However, you can use the modified (URL-safe) Base64 encoding to do a little better: it uses the '-' and '_' characters in place of '+' and '/' to yield a slight efficiency improvement.
The cipher itself does not cause any significant data expansion. It will pad the data to a multiple of the block length, but that is insignificant in your case. You might try compressing the input prior to encryption; 256 bytes is not much, but you might see some improvement.
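A quick demonstration of the two Base64 alphabets in Python; the input bytes are chosen so the standard alphabet is forced to emit '+' and '/':

```python
import base64

data = bytes([0xFB, 0xEF, 0xFF])

print(base64.b64encode(data))          # b'++//'
print(base64.urlsafe_b64encode(data))  # b'--__'
```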