How do I save an image to the database using caveman and sxql? - common-lisp

I am trying to build a website that takes an uploaded image and saves it in the PostgreSQL database.
From caveman I can do:
(caveman2:request-raw-body caveman2:*request*)
This gives me a circular stream of type CIRCULAR-STREAMS:CIRCULAR-INPUT-STREAM.
I suppose I can then use read-sequence to put the contents into a byte array:
(let ((buffer (make-array 5 :adjustable t :fill-pointer 5)))
  (read-sequence buffer (caveman2:request-raw-body caveman2:*request*))
  (add-picture-to-db buffer))
The problem occurs when I try to save this byte array into the database using sxql.
(defun add-picture-to-db (picture)
  (with-connection (db)
    (datafly:execute
     (sxql:update :testpictures
       (sxql:set= :picture picture)
       (sxql:where (:= :id 1))))))
I guess it is failing because sxql ultimately generates an SQL query string, which won't work well with binary data. Is there something I'm missing? How can I make this work?
Ideally, the way to verify the solution would be to retrieve the saved image from the db, serve it as the response to an HTTP request, and see whether the client gets the image.

It would be much better to use Postmodern for this, as it supports handling byte data with PostgreSQL.
However, it is possible to work around the limitations of sxql. The first thing to understand is that sxql will ultimately generate an SQL query string, which will cause problems if you insert byte data directly into it.
It is necessary to convert the bytes of the file you want to store into hex so that they can be used in sxql:
(format nil "~{~2,'0X~}" list-of-bytes-from-file)
Running this over all the bytes of the file gives you a string with two hex digits per byte. This matters because other ways of converting bytes to hex may not keep the two-digit padding, leaving you with an odd number of hex digits.
For example:
(write-to-string 0 :base 16)
This returns a single-digit hex value ("0" rather than "00").
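(The answer above is in Common Lisp; purely as an illustration of the padding point, here is the same idea sketched in Python.)
# Sketch of why two-digit padding matters (Python used only for illustration).
data = bytes([0, 162, 255])
padded = "".join(format(b, "02X") for b in data)    # "00A2FF" -- always an even number of digits
unpadded = "".join(format(b, "X") for b in data)    # "0A2FF"  -- odd length, pairs become ambiguous
print(padded, unpadded)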
Next, you store the resulting string as you normally would into a bytea type column in the db using sxql.
When you retrieve the file from the database, you get back a byte array representing that hex string. You can convert it back to a string with:
(flexi-streams:octets-to-string byte-array :external-format :utf-8)
The next step is to split the resulting string into pairs of hex digits, e.g. ("FF" "00" "A2").
Then convert each pair back into a byte by calling this function on each pair:
(parse-integer pair :radix 16)
Store those bytes into an array of element type (unsigned-byte 8), and finally return that array as the body of the response in caveman2 (not forgetting to also set the corresponding Content-Type header).
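Sketched once more in Python rather than the answer's Common Lisp, the decode path looks roughly like this (hex_string is a hypothetical value read back from the bytea column):
# Illustrative sketch: reversing the hex encoding described above.
hex_string = "FF00A2"                                                # hypothetical bytea contents
pairs = [hex_string[i:i + 2] for i in range(0, len(hex_string), 2)]  # ["FF", "00", "A2"]
image_bytes = bytes(int(pair, 16) for pair in pairs)                 # b'\xff\x00\xa2'
# image_bytes is what goes back out as the HTTP response body,
# alongside the appropriate Content-Type header.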

Related

Proper method of converting base64 image strings to binary?

I have a table in a SQLite database that contains about 15,000 single-page document scans stored as base64 strings. If I understand correctly, converting these to binary would reduce the size of the table by 25%.
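(The roughly 25% figure follows from base64 spending four output characters on every three input bytes; a quick Python sketch of that arithmetic, with made-up data:)
# Rough check of the "about 25%" estimate: base64 emits 4 characters per 3 bytes.
import base64, os
raw = os.urandom(15000)                 # stand-in for one scanned page
encoded = base64.b64encode(raw)
print(1 - len(raw) / len(encoded))      # ~0.25, i.e. binary is about 25% smaller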
Is it correct that it is not possible to convert the images to binary in SQLite directly but the base64 strings need to be converted to images first and then to binary? If so, will creating an image in Tcl from each base64 string and converting to binary suffice? And are there any tricky items that a novice is likely to overlook in attempting to do so?
When the test code below is executed, it appears that img_binary is binary data, but is this the correct approach?
Thank you.
set db "database_name"
sqlite3 dbws $db
#Base64 strings in database are prefixed with "data:image/gif;charset=utf-8;base64,"
set l [expr {[string length {data:image/gif;charset=utf-8;base64,}] -1}]
dbws eval { select img_base64 from lexi_raw where img_no = $nbr } {
    image create photo ::img::lexi -data [string replace $img_base64 0 $l]
    set img_binary [::img::lexi data -format png]  ;# Does this return binary to be written to SQLite?
    puts $img_binary
}
SQLite doesn't have a built-in base64 decoder, but you can add one.
Try this:
package require sqlite3
sqlite3 db :memory:
db function base64decode -argcount 1 -deterministic -returntype blob {binary decode base64}
db eval {SELECT base64decode('SGVsbG8sIFdvcmxk') AS message}
The trick is the function method, which creates a new function (called base64decode) that is implemented by the given Tcl script fragment (binary decode base64; the argument is appended as a word to that). I'm passing -argcount 1 because we only ever want to pass a single argument here, -deterministic because the result is always the same for the same input, and -returntype blob because we know the result is binary.
If you want to do more complex processing (such as stripping a prefix as well) then it's best to implement by calling a procedure:
db function base64decode -argcount 1 -deterministic -returntype blob myDecoder
proc myDecoder value {
    # Strip a leading prefix
    regsub {^data:image/gif;charset=utf-8;base64,} $value "" value
    # Decode the rest
    return [binary decode base64 $value]
}
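For comparison only (not part of the Tcl answer above): the same user-defined-function trick exists in other SQLite bindings too; for instance, Python's sqlite3 module exposes it as create_function.
# Sketch: registering a base64 decoder as an SQL function from Python (3.8+ for deterministic=).
import base64
import sqlite3

con = sqlite3.connect(":memory:")
con.create_function("base64decode", 1,
                    lambda value: base64.b64decode(value),
                    deterministic=True)   # same input always yields the same output
print(con.execute("SELECT base64decode('SGVsbG8sIFdvcmxk')").fetchone())  # (b'Hello, World',)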

How do I store binary 1s and 0s as bits rather than bytes in a file in Python 3?

I'm trying to do file compression using Huffman encoding in Python, and I have successfully constructed the codes for each of the unique characters in the file. Now, when I encode the original file with these codes, it generates a sequence of 1s and 0s. However, each character takes a byte, and I want to know how I can store the codes such that each 1 and 0 gets stored as a bit rather than a byte in Python 3, which will actually reduce the file size.
import sys
import heapq as _heapq

# This is the file I redirect my output to
sys.stdout = open("./input2.txt", "w")

# Tree for storing the code
class Tree:
    __slots__ = ["left", "right", "val", "c"]

    def __init__(self, val, c):
        self.val = val
        self.c = c
        self.left = self.right = None

    def value(self):
        return (self.val, self.c)

# 'tree' is a list of tree nodes. Initially it is a list where each
# character is a separate tree.
def construct(tree):
    while len(tree) > 1:
        left = _heapq.heappop(tree)
        right = _heapq.heappop(tree)
        root = (left[0] + right[0], left[1] + right[1],
                Tree(left[0] + right[0], left[1] + right[1]))
        root[2].left = left[2]
        root[2].right = right[2]
        _heapq.heappush(tree, root)
    return tree

# This function generates the code for the characters in the tree, which
# is the 'root' argument, and 'code' is an empty string.
# 'codes' is the map from each character to its code.
def Print(root, code, codes):
    if root.left is None and root.right is None:
        codes[root.c] = code
        return
    Print(root.left, code + '0', codes)
    Print(root.right, code + '1', codes)

# This function encodes the 'compressed' string with the 'codes' map
def encode(compressed, codes):
    document = ''.join(map(lambda x: codes[x], compressed))
    return document
My output is like this:
110111001110111001110111001110111001110101000011011011011110101001111011001101110100111101101111011100011110110111101011111101010111010000011011101011101101111011101111011110111011001101001101110100011101111011101101010110
The problem is that each of the 1s and 0s is stored as a character of 4 bytes, and I want them to be stored as bits.
You do not include the code where you save it to a file, so I cannot say for sure. However, I can take a guess here.
You are likely forgetting to pack your 1 and 0 codes together. You will likely need to use the bytes or bytearray type and bitwise operators (see the Python documentation for both) to shift and pack 8 of your codes into each byte before writing it out to the file.
Be conscious of the bit ordering in which you pack the codes into the bytes.
I have not used it for this myself, but you might also find the struct module's pack routine useful.
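As a concrete sketch of that packing step (the helper name and file name below are made up for illustration, not taken from the question's code): pad the '0'/'1' string to a multiple of 8, then shift each group of 8 bits into one byte before writing.
# Sketch: pack a string of '0'/'1' characters into real bits, 8 per byte (MSB first).
def pack_bits(bitstring):
    padding = (8 - len(bitstring) % 8) % 8      # remember this so the file can be decoded exactly
    bitstring += "0" * padding
    packed = bytearray()
    for i in range(0, len(bitstring), 8):
        packed.append(int(bitstring[i:i + 8], 2))   # e.g. "11011100" -> 0xDC
    return padding, bytes(packed)

document = "1101110011101110"                    # output of encode(...)
padding, payload = pack_bits(document)
with open("input2.bin", "wb") as f:              # write in binary mode, not via stdout redirection
    f.write(bytes([padding]) + payload)          # store the padding count in a one-byte header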

How to know whether the initial data type is UTF-8 or UTF-16 in SQLite?

In the linked documentation, the function sqlite3_column_type can tell me whether the initial data type of the result is text or not, but it will not tell me whether it is UTF-8 or UTF-16. Is there a way to know that?
Thanks
If you have a brand new empty database, before any tables are created, you can set the internal encoding used for Unicode text with the encoding pragma, and later use it to see the encoding being used (It defaults to UTF-8).
When storing or retrieving TEXT values, sqlite will automatically convert if needed between UTF-8 and UTF-16, so it doesn't matter too much which one is being used internally unless you're trying to get every last tiny bit of performance out of it.
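A minimal sketch of checking (and, on a brand-new database, setting) that pragma, shown here through Python's sqlite3 module since the question is not tied to any particular binding:
# Sketch: the encoding pragma only takes effect on a fresh database with no tables yet.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute('PRAGMA encoding = "UTF-16"')          # ignored once any table exists
print(con.execute("PRAGMA encoding").fetchone())   # e.g. ('UTF-16le',); the default is ('UTF-8',)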
In the link you provided it says explicitly:
const unsigned char *sqlite3_column_text(sqlite3_stmt*, int iCol);
const void *sqlite3_column_text16(sqlite3_stmt*, int iCol);
sqlite3_column_text → UTF-8 TEXT result
sqlite3_column_text16 → UTF-16 TEXT result
These routines return information about a single column of the current
result row of a query. In every case the first argument is a pointer
to the prepared statement that is being evaluated (the sqlite3_stmt*
that was returned from sqlite3_prepare_v2() or one of its variants)
and the second argument is the index of the column for which
information should be returned. The leftmost column of the result set
has the index 0. The number of columns in the result can be determined
using sqlite3_column_count().

Using Coldfusion's Encrypt function to encrypt a hex block and return a block-length result

My company is working on a project that will put card readers in the field. The readers use DUKPT TripleDES encryption, so we will need to develop software that will decrypt the card data on our servers.
I have just started to scratch the surface on this one, but I find myself stuck on a seemingly simple problem: generating the IPEK (the first step in recreating the symmetric key).
The IPEK is a 16-byte hex value created by concatenating two TripleDES-encrypted 8-byte hex strings.
I have tried ECB and CBC (zeros for the IV) modes, with and without padding, but the result of each individual encryption is always 16 bytes or more (2 or more blocks), when I need a result that is the same size as the input. In fact, throughout this process the ciphertexts should be the same size as the plaintexts being encrypted.
<cfset x = encrypt("FFFF9876543210E0",binaryEncode(binaryDecode("0123456789ABCDEFFEDCBA98765432100123456789ABCDEF", "hex"), "base64") ,"DESEDE/CBC/PKCS5Padding","hex",BinaryDecode("0000000000000000","hex"))>
Result: 3C65DEC44CC216A686B2481BECE788D197F730A72D4A8CDD
If you use the NoPadding flag, the result is:
3C65DEC44CC216A686B2481BECE788D1
I have also tried encoding the plaintext hex message as base64 (as the key is). In the example above that returns a result of:
DE5BCC68EB1B2E14CEC35EB22AF04EFC.
If you do the same, except using the NoPadding flag, it errors with "Input length not multiple of 8 bytes."
I am new to cryptography, so hopefully I'm making some kind of very basic error here. Why are the ciphertexts generated by these block cipher algorithms not the same lengths as the plaintext messages?
For a little more background, as a "work through it" exercise, I have been trying to replicate the work laid out here:
https://www.parthenonsoftware.com/blog/how-to-decrypt-magnetic-stripe-scanner-data-with-dukpt/
I'm not sure if it is related, and it may not be the answer you are looking for, but I spent some time testing bug ID 3842326. When using different attributes, CF handles seed and salt differently under the hood. For example, if you pass in a variable as the string to encrypt rather than a constant (a hard-coded string in the function call), the resulting string changes every time. That probably indicates different method signatures; in your example with one flag vs. another flag you are seeing something similar.
Adobe's response is that, given that the resulting string can be unencrypted in either case, this is not really a bug - more of a behavior to note. Can your resultant string be unencrypted?
The problem is encrypt() expects the input to be a UTF-8 string. So you are actually encrypting the literal characters F-F-F-F-9.... rather than the value of that string when decoded as hexadecimal.
Instead, you need to decode the hex string into binary, then use the encryptBinary() function. (Note, I did not see an iv mentioned in the link, so my guess is they are using ECB mode, not CBC.) Since the function also returns binary, use binaryEncode to convert the result to a more friendly hex string.
Edit: Switching to ECB + "NoPadding" yields the desired result:
ksnInHex = "FFFF9876543210E0";
bdkInHex = "0123456789ABCDEFFEDCBA98765432100123456789ABCDEF";
ksnBytes = binaryDecode(ksnInHex, "hex");
bdkBase64 = binaryEncode(binaryDecode(bdkInHex, "hex"), "base64");
bytes = encryptBinary(ksnBytes, bdkBase64, "DESEDE/ECB/NoPadding");
leftRegister = binaryEncode(bytes, "hex");
... which produces:
6AC292FAA1315B4D
In order to do this we want to start with our original 16 byte BDK
... and XOR it with the following mask ....
Unfortunately, most of the CF math functions are limited to 32 bit integers. So you probably cannot do that next step using native CF functions alone. One option is to use java's BigInteger class. Create a large integer from the hex strings and use the xor() method to apply the mask. Finally, use the toString(radix) method to return the result as a hex string:
bdkText ="0123456789ABCDEFFEDCBA9876543210";
maskText = "C0C0C0C000000000C0C0C0C000000000";
// use radix=16 to create integers from the hex strings
bdk = createObject("java", "java.math.BigInteger").init(bdkText, 16);
mask = createObject("java", "java.math.BigInteger").init(maskText, 16);
// apply the mask and convert the result to hex (upper case)
newKeyHex = ucase( bdk.xor(mask).toString(16) );
WriteOutput("<br>newKey="& newKeyHex);
writeOutput("<br>expected=C1E385A789ABCDEF3E1C7A5876543210");
That should be enough to get you back on track. Given some of CF's limitations here, java would be a better fit IMO. If you are comfortable with it, you could write a small java class and invoke that from CF instead.

ColdFusion: Integer "0" doesn't convert to ASCII character

I have a string (comprised of a userID and a date/time stamp), which I then encrypt using ColdFusion's Encrypt(inputString, myKey, "Blowfish/ECB/PKCS5Padding", "Hex").
In order to interface with a 3rd party I then have to perform the following:
Convert each character pair within the resultant string into a HEX value.
HEX values are then represented as integers.
Resultant integers are then output as ASCII characters.
All the ASCII characters combine to form a Bytestring.
Bytestring is then converted to Base64.
Base64 is URL encoded and finally sent off (phew!)
It all works seamlessly, APART FROM when the original cfEncrypted string contains a "00".
The HEX value 00 translates (via the function InputBaseN) to the integer 0, which then refuses to translate correctly into an ASCII character!
The resultant Bytestring (and therefore URL string) is messed up and the 3rd party is unable to decipher it.
It's worth mentioning that I do declare: <cfcontent type="text/html; charset=iso-8859-1"> at the top of the page.
Is there any way to correctly output 00 as ASCII? Could I avoid having "00" within the original encrypted string? Any help would be greatly appreciated :)
I'm pretty sure ColdFusion (and the Java underneath) use a null-terminated string type. This means that every string contains one and only one asc(0) char, which is the string terminator. If you try to insert an asc(0) into a string, CF is erroring because you are trying to create a malformed string element.
I'm not sure what the end solution is. I would play around with toBinary() and toString(), and talk to your 3rd party vendor about workarounds like sending the raw hex values or similar.
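To illustrate the "raw hex → binary → Base64" workaround in a language-neutral way, here is a Python sketch of the idea (this is not ColdFusion's toBinary()/toBase64() API, and cipher_hex is a made-up Encrypt() result that happens to contain "00"):
# Sketch: go straight from hex to bytes to Base64 so a 0x00 byte never needs an ASCII detour.
import base64
from urllib.parse import quote

cipher_hex = "4A00F3"                    # hypothetical hex ciphertext containing "00"
raw = bytes.fromhex(cipher_hex)          # 0x00 is just another byte value here
payload = quote(base64.b64encode(raw))   # Base64, then URL-encode, as in the question
print(payload)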
Actually there is a very easy solution. The credit card company processing your request needs you to convert it to lowercase hex letters. The only characters processed are ':', '-', and 0-9; do an if/else and convert them manually into a string.
