I checked the storage size but I'm confused when it comes to storing numbers.
In case of Bytes, what does "Byte length" mean? If I store -128 what's the length? And in case of 12?
In case of Floating-point number and Integer, it doesn't matter if I store 325 or 9.9999999999999 it will always be 8 bytes?
In the case of an Array, let's say we have ["ab", "bcd"]: is the size 2+3=5 or (2+1)+(3+1)=7?
If you store an array of bytes, the size will simply be the length of that array. An array with a single byte value of -128 is still just one byte.
Yes, all numbers occupy the same 8-byte size, even if you don't see a fractional part.
The documentation says it's the sum of the array element sizes, so I would expect 7: the sum of the two individual string sizes, each being the number of UTF-8 encoded bytes + 1.
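The arithmetic in the last answer can be sketched in a few lines. This is only an illustration of the rule as stated above (string size = UTF-8 bytes + 1, array size = sum of element sizes); the function names are hypothetical and not part of any SDK.

```python
def string_size(s):
    # Rule quoted in the answer: number of UTF-8 encoded bytes + 1.
    return len(s.encode("utf-8")) + 1


def array_size(items):
    # Rule quoted in the answer: sum of the sizes of the elements.
    return sum(string_size(item) for item in items)


print(array_size(["ab", "bcd"]))  # (2+1) + (3+1)
```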
Is it possible to encrypt a 30-digit number into a 10-digit number? I have a number like
23456-32431-23233-76543-98756-54543 and I need it to look like a 10-digit encrypted format.
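Is it possible to encrypt a 30-digit number into a 10-digit number?
Purely mathematically, you cannot, as long as we assume you want to represent 30 decimal digits of any value using 10 decimal digits. There are 10^30 possible inputs but only 10^10 possible outputs, so many different inputs would have to share the same output. You are simply trying to put a pint into a shot glass.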
Anything, I need to compress the digits.
Compression would be possible if some of the stated assumptions did not hold.
If you could represent the output as text (any characters or binary), you could encode the decimal value in binary/Base64 form, which would allow a shorter representation (still not a 1:3 ratio).
Compression would work well if the input values (or part of the input) were not random. If the digits, or a significant part of the input, do not have a uniform distribution, or if part of the input represents a limited counter, then those parts or digits could be represented with a limited number of bits.
You may know more about your data, so only you could tell anything about the data distribution.
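To make the size argument concrete, here is a minimal sketch, assuming the number is treated as one big integer and the dashes are dropped. The helper names are made up for illustration, and unpadded URL-safe Base64 is just one possible text encoding.

```python
import base64
import math

# The 30 digits from the question, with the dashes removed.
DIGITS = "234563243123233765439875654543"


def bits_needed(num_digits):
    # Bits required to represent any decimal number with this many digits.
    return math.ceil(num_digits * math.log2(10))


def to_base64(decimal_digits):
    # Note: int() drops leading zeros, so a real scheme would have to
    # record the original digit count separately.
    value = int(decimal_digits)
    n_bytes = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(n_bytes, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def from_base64(text):
    padded = text + "=" * (-len(text) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")


encoded = to_base64(DIGITS)
print(bits_needed(30), bits_needed(10), len(encoded))
```

Thirty arbitrary digits need about 100 bits while ten digits carry only about 34, and even the Base64 text form is longer than 10 characters.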
Out of curiosity, how do goo.gl and bit.ly work?
The "shortening" sites are key-value stores that map a generated short value to the stored full URL. So it's a mapping, not compression.
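A toy version of such a mapping might look like the sketch below. It is not how goo.gl or bit.ly are actually implemented; the class name, the in-memory dict, and the base-62 counter are all assumptions chosen to show that the short key carries no information about the long value by itself.

```python
import string

# 62 symbols: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_letters


class Shortener:
    """Hypothetical in-memory key-value store: short key -> long value."""

    def __init__(self):
        self._by_key = {}
        self._next_id = 0

    @staticmethod
    def _encode(n):
        # Write the counter in base 62 to get a compact key.
        if n == 0:
            return ALPHABET[0]
        out = []
        while n:
            n, remainder = divmod(n, len(ALPHABET))
            out.append(ALPHABET[remainder])
        return "".join(reversed(out))

    def shorten(self, long_value):
        key = self._encode(self._next_id)
        self._next_id += 1
        self._by_key[key] = long_value
        return key

    def expand(self, key):
        # Works only because the full value was stored earlier.
        return self._by_key[key]


store = Shortener()
key = store.shorten("23456-32431-23233-76543-98756-54543")
print(key, store.expand(key))
```

The long value can only be recovered by whoever holds the table, which is why this does not count as compression.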
I have data that needs to be stored in a database in encrypted form. The maximum length of the data before encryption is 50 chars (English or Arabic). I need to encrypt the data using 128-bit AES and store the output in the database as a Base64 string.
How to know the length of the data after encryption?
Try it with your specified algorithm, block size, IV size, and see what size output you get :-)
First it depends on the encoding of the input text. Is it UTF8? UTF16?
Let's assume UTF-8, so 1 byte per character for English text means 50 bytes of input data to your encryption algorithm (100 bytes if UTF-16). Note that Arabic letters take 2 bytes each in UTF-8, so 50 Arabic characters can also reach 100 bytes.
Then you will pad to the block size for the algorithm. AES, regardless of key size, has a block size of 16 bytes. So we will be padded out to 64 bytes (or 112 for 100 bytes of input).
Then we need to store the IV and header information. That is (usually, with default settings/IV sizes) another 16 bytes, so we are at 80 bytes (or 128 for 100 bytes of input).
Finally we are encoding to Base64. I assume you want the string length, since otherwise it is wasteful to make it into a string. Base64 bloats the data according to the formula Ceil(bytes/3) * 4. So for us that is Ceil(80/3) * 4 = 27 * 4 = 108 characters (or Ceil(128/3) * 4 = 43 * 4 = 172 for 100 bytes of input).
Again this is all highly dependent on your choices of how you encrypt, what the text is encoded as, etc.
I would try it with your scenario before relying on these numbers for anything useful.
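The steps above can be written down as a small estimator. This is a sketch under stated assumptions, not a measurement of any real library: it assumes PKCS#7-style padding (1 to 16 bytes are always added), a 16-byte IV stored alongside the ciphertext, and standard padded Base64. The function name is hypothetical.

```python
import math

AES_BLOCK = 16  # AES block size in bytes, independent of key size
IV_BYTES = 16   # assumption: a 16-byte IV is stored with the ciphertext


def estimated_base64_length(text, encoding="utf-8"):
    plain = len(text.encode(encoding))
    # Assumed PKCS#7 padding: always adds between 1 and 16 bytes.
    padded = (plain // AES_BLOCK + 1) * AES_BLOCK
    total = padded + IV_BYTES
    # Base64 turns every 3 bytes (rounded up) into 4 characters.
    return math.ceil(total / 3) * 4


print(estimated_base64_length("a" * 50))               # English, UTF-8
print(estimated_base64_length("a" * 50, "utf-16-le"))  # UTF-16, no BOM
print(estimated_base64_length("\u0628" * 50))          # Arabic letter, UTF-8
```

An actual encryption run with the real settings remains the reliable check, as the answer says.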
I want to build a Huffman tree and assign a code to all 256 byte values based on their frequencies. For my application I need a hash table to get the code for a byte in constant time. But in the worst case the tree may be so unbalanced that certain bytes have a very long code (even 255 bits long), so maintaining a hash table is very difficult. The code requires high performance, so storing the codes as strings won't work. How can I resolve the issue?
Why would you need a hash table for 256 values? Simply have a 256-entry table where you directly index the code for each byte.
Each code is at most 32 bytes long, so just have a table of 256 entries with a fixed size of 33 bytes per entry, for 8448 bytes in total. The first byte of the 33 is the length of the code in bits, and the remaining bytes hold the code, of which you only use the requisite number of bits.
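A minimal sketch of the directly indexed table follows. It swaps the fixed 33-byte entries for a Python tuple of (bit length, code as integer), since Python integers have no width limit; the function name and the tie-breaking scheme are assumptions made for the example.

```python
import heapq
from collections import Counter


def build_code_table(data):
    """Return a 256-entry list indexed by byte value.

    Entry b is (bit_length, code_as_int), or (0, 0) if b never occurs.
    """
    freq = Counter(data)
    table = [(0, 0)] * 256
    if not freq:
        return table
    if len(freq) == 1:
        (only,) = freq
        table[only] = (1, 0)  # a lone symbol still needs one bit
        return table

    # Heap items are (weight, tiebreak, tree). The unique tiebreak keeps
    # Python from ever comparing the tree objects themselves.
    heap = [(weight, byte, byte) for byte, weight in freq.items()]
    heapq.heapify(heap)
    tiebreak = 256
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tiebreak, (t1, t2)))
        tiebreak += 1

    # Walk the tree iteratively; a very unbalanced tree is deep.
    stack = [(heap[0][2], 0, 0)]
    while stack:
        node, length, code = stack.pop()
        if isinstance(node, int):
            table[node] = (length, code)
        else:
            left, right = node
            stack.append((left, length + 1, code << 1))
            stack.append((right, length + 1, (code << 1) | 1))
    return table


table = build_code_table(b"aaaabbc")
length, code = table[ord("a")]  # constant-time lookup by byte value
print(length, format(code, "0{}b".format(length)))
```

Lookup is a plain list index, so no hashing is involved no matter how long the longest code gets.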
When using the openssl_encrypt() function in PHP to encrypt a string with AES-256-CBC as the encryption method:
$encrypted = openssl_encrypt($data, "AES-256-CBC", $key, 0, $iv);
I tried different string lengths for $data, and the resulting length of $encrypted will increase when $data reaches a multiple of 16 bytes. But it seems the growth is not steady.
Is there a general formula that relates the length of $data and $encrypted?
Let me quote from https://stackoverflow.com/a/3717552/2393787
With CBC mode, the input data must have a length multiple of the block length, so it is customary to add PKCS#5 padding: if the block length is n, then at least 1 byte is added, at most n, such that the total size is a multiple of n, and the last added bytes (possibly all of them) have numerical value k where k is the number of added bytes. Upon decryption, it suffices to look at the last decrypted byte to recover k and thus know how many padding bytes must be ultimately removed.
Hence, with CBC mode and AES, assuming PKCS#5 padding, if the input data has length d then the encrypted length is (d + 16) & ~15. I am using C-like notation here; in plain words, the length is between d+1 and d+16, and multiple of 16.
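This means the length of the raw ciphertext can be computed from the input length, but it grows in steps of 16 bytes rather than byte by byte. Note also that with the options argument set to 0, openssl_encrypt() returns the ciphertext Base64-encoded, so the string you get has length 4 * ceil(c / 3) for a raw ciphertext of c bytes, which is why the growth does not look steady. If you need output that is exactly as long as the input, you should consider moving to another mode that does not require padding.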
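The quoted formula can be tabulated with a short sketch. It is written in Python rather than PHP purely for illustration, the function names are hypothetical, and the Base64 column assumes the call from the question, where the options argument is 0 and openssl_encrypt() therefore returns Base64 text instead of raw bytes.

```python
import math


def cbc_ciphertext_length(d, block=16):
    # The quoted formula: (d + 16) & ~15 for a 16-byte block.
    return (d + block) & ~(block - 1)


def base64_length(n):
    # Standard padded Base64: 4 characters per 3 bytes, rounded up.
    return math.ceil(n / 3) * 4


for d in (0, 15, 16, 31, 32, 47, 48):
    raw = cbc_ciphertext_length(d)
    print(d, raw, base64_length(raw))
```

The raw length rises by 16 at every multiple of 16, while the Base64 length rises by 20 or 24 characters, depending on how the raw length divides by 3.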
I have a question about the data types in sqlite3.
A value of type SQLITE_INTEGER can be stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value. If I only know that a column in an SQLite database stores SQLITE_INTEGER, how can I know whether a value in this column is a 4-byte or a 6-8 byte integer, or which function should be used to get the value: sqlite3_column_int() or sqlite3_column_int64()?
Can I use sqlite3_column_bytes() in this case? but according to the documentation, sqlite3_column_bytes() is primarily used for TEXT or BLOB.
Thanks!
When SQLite steps into a record, all integer values are expanded to 64 bits.
sqlite3_column_int() returns the lower 32 bits of that value without checking for overflows.
When you call sqlite3_column_bytes(), SQLite will convert the value to text and return the number of bytes in that text, not the storage size of the integer.
You cannot know how large an integer value is before reading it.
Afterwards, you can check the list in the record format documentation for the smallest possible format for that value. But if you want to be sure that integer values are never truncated to 32 bits, you always have to use sqlite3_column_int64(), or ensure that large values are never written to the DB in the first place (if that is possible).
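The two points above can be illustrated without touching the C API. The sketch below is in Python and only models the behaviour described in the answer: the first function picks the smallest of the 1, 2, 3, 4, 6, or 8 byte two's-complement widths that holds a value (ignoring the zero-byte encodings that newer file formats use for 0 and 1), and the second mimics keeping only the lower 32 bits. Both function names are hypothetical.

```python
def smallest_int_storage_bytes(value):
    # Candidate widths from the question: 1, 2, 3, 4, 6, or 8 bytes.
    for size in (1, 2, 3, 4, 6, 8):
        limit = 1 << (size * 8 - 1)
        if -limit <= value < limit:
            return size
    raise OverflowError("value does not fit in a signed 64-bit integer")


def lower_32_bits_signed(value):
    # Model of reading a 64-bit value through a 32-bit accessor:
    # keep the low 32 bits and reinterpret them as a signed integer.
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low


for v in (127, 128, 2**31 - 1, 2**31, 2**32 + 5):
    print(v, smallest_int_storage_bytes(v), lower_32_bits_signed(v))
```

Any value that needs more than 4 bytes comes back changed from the 32-bit model, which is the truncation the answer warns about.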