azure documentdb culture while sorting - azure-cosmosdb

I have documents (JSON) in English and in French stored in my DocumentDB collection.
When I query with a descending sort on title (a property of my document), the results seem to be wrong.
Instead of running from Z to A, the results start with special characters such as 'Ö', 'é', 'Á' and only then continue from Z to A.

DocumentDB uses UTF-8 strings per JSON standard. So sort by strings also follows UTF-8 order, i.e. this is the expected behavior.
For a different sort order, you have to store a canonicalized version of the string, then use it for sorting. For example, for case-insensitive sort, you'd store the lower case representation of the string as a separate property. If you want accents to be ignored (é = "e"), then you'd store a mapping of the string without accents.
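As a rough illustration of the canonicalized-property idea above, here is a minimal Python sketch. The property name titleSortKey and the helper sort_key are hypothetical, and the sort is done in memory only to show the effect; in practice you would store the extra property in each document and order by it in the query.

```python
import unicodedata


def sort_key(title: str) -> str:
    """Return a case-folded, accent-free form of title to store as a sort property."""
    # NFKD splits 'é' into 'e' plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Hypothetical documents; "titleSortKey" is an assumed property name.
docs = [{"title": t} for t in ["Zebra", "Öl", "éclair", "Apple", "Árbol"]]
for doc in docs:
    doc["titleSortKey"] = sort_key(doc["title"])

# Sorting on the canonical property gives a plain Z-A order.
descending = sorted(docs, key=lambda d: d["titleSortKey"], reverse=True)
titles_desc = [d["title"] for d in descending]
```

Note that stripping accents is a simplification: it is not a full linguistic collation, and some languages sort accented letters separately.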

Related

Searching for blob field in SQLite

I have a column in my database that stores a BLOB.
I want to run a query to check if specific byte array value is present in the table.
The value is b'\xf4\x8f\xc6{\xc2mH(\x97\x9c\x83hkE\x8b\x95' (python bytes).
I tried to run this query:
SELECT * from received_message
WHERE "EphemeralID"
LIKE HEX('\xf4\x8f\xc6{\xc2mH(\x97\x9c\x83hkE\x8b\x95');
But I get 0 results, even though I am 100% sure that this value is stored in the database.
Is there something wrong with my query?
Your search string is a bit unusual: it contains characters such as { and (. Maybe you should search the blob the way it is stored instead?
From the SQLite documentation:
BLOB literals are string literals containing hexadecimal data and
preceded by a single "x" or "X" character. Example: X'53514C697465'
So maybe do a LIKE against the ASCII representation of the hex value you want? Maybe start by looking for just f48f, or F48F if your SQLite stores it in upper case.
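A minimal sketch of these ideas using Python's built-in sqlite3 module, assuming an in-memory table with the same name and column as in the question (the id column and the second row are made up for the demo). It shows three ways to find the value: binding the bytes as a parameter, writing an X'...' blob literal, and matching a prefix of hex() output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE received_message (id INTEGER PRIMARY KEY, "EphemeralID" BLOB)'
)

needle = b'\xf4\x8f\xc6{\xc2mH(\x97\x9c\x83hkE\x8b\x95'
conn.execute('INSERT INTO received_message ("EphemeralID") VALUES (?)', (needle,))
conn.execute('INSERT INTO received_message ("EphemeralID") VALUES (?)', (b"\x00\x01",))

# 1) Bind the bytes directly; sqlite3 passes them to SQLite as a BLOB.
rows_bound = conn.execute(
    'SELECT id FROM received_message WHERE "EphemeralID" = ?', (needle,)
).fetchall()

# 2) Build an X'...' blob literal from the hex digits of the value.
literal = "X'" + needle.hex().upper() + "'"
rows_literal = conn.execute(
    f'SELECT id FROM received_message WHERE "EphemeralID" = {literal}'
).fetchall()

# 3) Compare the hex text of the column against a hex prefix.
rows_prefix = conn.execute(
    'SELECT id FROM received_message WHERE hex("EphemeralID") LIKE ?', ("F48F%",)
).fetchall()
```

Parameter binding is usually the simplest option from application code, since no escaping or hex conversion is needed.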

Unable to match strings in sqlite-net query in Greek language

I want to return from sqlite database some strings based on what the user typed. The comparison must be case-insensitive. While my query works for English data, it only works for Greek when all letters are Caps. So I guess that the method ToUpper() performs differently in the query and in the code.
I've narrowed down the problem to the ToUpper() method because when I run it outside of the query to the filter string it performs great for capital letters.
var filterString = filter.Designation?.ToUpper();
var sites = from c in MemoryService.DbContext.db.Table<Site>()
where filterString == null || c.Designation.ToUpper().Contains(filterString)
I think you need to use a culture-specific comparison; see the post "Compare strings with non-English characters?".
This assumes you know what language the compared strings will be typed in.
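One likely cause, stated here as an assumption about how the query is translated: if ToUpper() inside the query becomes SQL upper(), SQLite's built-in upper() only converts ASCII letters unless the ICU extension is loaded, so Greek lowercase letters are left unchanged. The sketch below is in Python rather than C#, with a hypothetical SQL function name unicode_fold, and registers a Unicode-aware folding function so both sides of the comparison are normalized the same way.

```python
import sqlite3


def fold(text):
    """Unicode-aware case folding; None is passed through for SQL NULL."""
    return None if text is None else text.casefold()


conn = sqlite3.connect(":memory:")
# "unicode_fold" is a hypothetical name chosen for this sketch.
conn.create_function("unicode_fold", 1, fold)

stored = ["Αθήνα", "Πάτρα", "London"]
conn.execute("CREATE TABLE Site (Designation TEXT)")
conn.executemany("INSERT INTO Site VALUES (?)", [(s,) for s in stored])


def find_sites(filter_text):
    """Return designations containing filter_text, ignoring case."""
    if filter_text is None:
        rows = conn.execute("SELECT Designation FROM Site")
    else:
        rows = conn.execute(
            "SELECT Designation FROM Site "
            "WHERE instr(unicode_fold(Designation), ?) > 0",
            (filter_text.casefold(),),
        )
    return [r[0] for r in rows]
```

Another option in the same spirit is to store a pre-folded copy of the column and search that, which avoids calling a custom function on every row.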

Update query to append zeroes into blob field with SQLiteStudio

I'm trying to append zeroes to a blob field in a sqlite3 db.
What I tried is this:
UPDATE Logs
SET msg_buffer = msg_buffer || zeroblob(1)
WHERE msg_id = 'F0'
As you can see, the field name is "msg_buffer", and what I want is to append byte zero. It seems that the concat operator || doesn't work.
How could I achieve this?
Reading the doc link posted by Googie (3.2. Affinity Of Expressions), I managed to find the way:
UPDATE Logs
SET msg_buffer = CAST(msg_buffer || zeroblob(1) AS blob)
WHERE msg_id = 'F0'
The CAST operator can take the expression with the concatenate operator and force the blob affinity.
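A small check of this approach with Python's sqlite3 module, assuming an in-memory Logs table with made-up contents. It runs the UPDATE from above and then shows an alternative that does the append in application code and writes the result back as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logs (msg_id TEXT, msg_buffer BLOB)")
conn.execute("INSERT INTO Logs VALUES ('F0', X'0102')")

# Approach 1: concatenate in SQL and force the result back to a BLOB.
conn.execute(
    "UPDATE Logs SET msg_buffer = CAST(msg_buffer || zeroblob(1) AS blob) "
    "WHERE msg_id = 'F0'"
)
value, kind, size = conn.execute(
    "SELECT msg_buffer, typeof(msg_buffer), length(msg_buffer) "
    "FROM Logs WHERE msg_id = 'F0'"
).fetchone()

# Approach 2: read the bytes, append in Python, write back as a parameter.
(current,) = conn.execute(
    "SELECT msg_buffer FROM Logs WHERE msg_id = 'F0'"
).fetchone()
conn.execute(
    "UPDATE Logs SET msg_buffer = ? WHERE msg_id = 'F0'", (current + b"\x00",)
)
(final,) = conn.execute(
    "SELECT msg_buffer FROM Logs WHERE msg_id = 'F0'"
).fetchone()
```

The second approach needs a round trip through the application, but it does not depend on how the || operator treats BLOB operands.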
SQLite3 does support datatypes. See https://www.sqlite.org/datatype3.html
They are not strictly linked with declared type of a column, but rather individual per each cell value. The type is determined by how it was created/modified. For example if you insert 5, it will be INTEGER. If you insert 5.5, it will be REAL. If you insert 'test' it will be TEXT, if you insert zeroblob(1), it will be BLOB and if you insert null, it will be NULL.
Now, what you are doing is that you're trying to concatenate current value with a BLOB type. If current value is TEXT (or basically if you use || operator, as you do, you are converting any type into a TEXT), it will be concatenated with byte \x00, which actually determines the end of a string. In other words, you are adding yet another string terminator, to an already existing one, that the TEXT type has.
There will be no change on output of this operation. TEXT always ends with byte zero and it is always excluded from the result, as it's a meta character, not the value itself.
Additional information from http://sqlite.1065341.n5.nabble.com/Append-data-to-a-BLOB-field-td46003.html - appending binary data to a BLOB field is not possible. You can modify a preallocated blob:
Append is not possible. But if you preallocate space using
zeroblob() or similar, you can write to it using the incremental
blob API:
http://www.sqlite.org/c3ref/blob_open.html
Finally, please see accepted answer, as author of the question found an interesting solution.

How is an MVCCKey formed in CockroachDB?

I want to create an MVCCKey with a timestamp and pretty value I know. But I realize a roachpb.Key is not very straightforward; is there some prefix/suffix involved? Is the database name also encoded in roachpb.Key?
Can anyone please tell me how a MVCCKey is formed? What information does it have? In the documentation, it just says that it looks like /table/primary/key/column.
An engine.MVCCKey combines a regular key with a timestamp. MVCCKeys are encoded into byte strings for use as RocksDB keys (RocksDB is configured with a custom comparator so MVCCKeys are sorted correctly even though the timestamp uses a variable-width encoding).
Regular keys are byte strings of type roachpb.Key. For ordinary data records, the keys are constructed from table, column, and index IDs, along with the values of indexed columns. (The database ID is not included here; the database to which a table belongs can be found in the system.descriptors table)
The function keys.PrettyPrint can convert a roachpb.Key to a human-readable form.

Why does SQLite full-text search (FTS4) treat angle brackets differently in a compound search?

I have an SQLite database using FTS4. It is used to store emails with message IDs of the form <8200#comms.io>.
Searching for messages using the FTS MATCH syntax, I get a result from:
SELECT rowid FROM emails WHERE emails MATCH '<8200#comms.io>'
This returns the correct row. But when I try to find multiple emails, I get an empty response:
SELECT rowid FROM emails WHERE emails MATCH '<8200#comms.io> OR <8188#comms.io>'
Strangely though, I can search without the angle bracket characters. This returns both rows:
SELECT rowid FROM emails WHERE emails MATCH '8200#comms.io OR 8188#comms.io'
This works even though the angle brackets are present in the stored columns. I can find no mention that these are special characters in SQLite, and without the 'OR', the single-term search works fine.
Why are these characters treated differently in my compound search?
The default (simple) tokenizer reads alphanumerical characters and treats all others as word separators to be ignored.
So when searching for a message ID, you have to actually search for a phrase with multiple words (8200, comms, and io).
If you want to treat the entire message ID as a word, you have to write a custom tokenizer.
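A minimal sketch of the phrase-search idea using Python's sqlite3 module. It assumes the SQLite build has FTS4 enabled, and the column names message_id and body plus the helpers phrase and find_rowids are hypothetical. Each message ID is wrapped in double quotes so FTS treats it as a phrase of the tokens 8200, comms and io, and the phrases are joined with OR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumptions; the default "simple" tokenizer is used.
conn.execute("CREATE VIRTUAL TABLE emails USING fts4(message_id, body)")
conn.executemany(
    "INSERT INTO emails (message_id, body) VALUES (?, ?)",
    [
        ("<8200#comms.io>", "first message"),
        ("<8188#comms.io>", "second message"),
        ("<9999#comms.io>", "third message"),
    ],
)


def phrase(message_id):
    """Quote a message ID so FTS matches its tokens as one phrase."""
    return '"' + message_id.replace('"', " ") + '"'


def find_rowids(message_ids):
    """Return rowids of emails matching any of the given message IDs."""
    query = " OR ".join(phrase(m) for m in message_ids)
    rows = conn.execute(
        "SELECT rowid FROM emails WHERE emails MATCH ? ORDER BY rowid", (query,)
    )
    return [r[0] for r in rows]


both = find_rowids(["<8200#comms.io>", "<8188#comms.io>"])
```

A phrase match is still looser than an exact comparison: any text containing the same tokens in the same order would match, so an exact check on the stored ID may be worth adding afterwards.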
