Cosmos DB, C# SQL Api - case-insensitive WHERE clause - azure-cosmosdb

I am working on a project with Azure Cosmos DB using the C# SQL Api (DocumentDB) and need to know if it's possible to have a case-insensitive WHERE clause. From what I can find online it doesn't appear to be possible yet.
I want to write a query like:
SELECT l.CustomerName, l.LogDetail
FROM Logs l
WHERE l.CustomerName = 'Acme'
and have documents returned with CustomerName equal to "ACME", "Acme", or even "aCmE". I don't want to take a performance hit of a scan. I'd prefer to have the query use an index.
I know I could create a second CustomerName field with all lowercase values to filter on, but I'm looking to see if I can avoid that. Is this possible?

Unfortunately, unless it was added in the past two months, this is not possible.
If you use ToLower() or ToUpper() on an indexed field it will result in a scan, so that is not an option.
Valid solutions are, as you said, to add another field that holds a case-normalized copy of the string, or to only insert data in a single case. It sounds like your data is meant to be case-insensitive anyway, so why not normalize the case when you write it?

At the time of this writing, there is now a LOWER function that can be used in Cosmos SQL API queries. This would enable you to write your query like this:
SELECT l.CustomerName, l.LogDetail
FROM Logs l
WHERE LOWER(l.CustomerName) = 'acme'
Here are the docs for the LOWER function.

There is a StringEquals function now which can be used to do case insensitive compares.
SELECT STRINGEQUALS("abc", "abc", false) AS c1, STRINGEQUALS("abc", "ABC", false) AS c2, STRINGEQUALS("abc", "ABC", true) AS c3
returns
[{
"c1": true,
"c2": false,
"c3": true
}]
Here is the documentation - https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-stringequals
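Applied to the CustomerName query from the question, this could look like the sketch below. It only builds the query text and a parameter list; the helper name is made up for illustration, and the assumption is that your SDK accepts a query string plus name/value parameters (the exact call differs per SDK).

```python
def case_insensitive_equals_query(customer_name):
    """Build a parameterized query that compares CustomerName while ignoring case.

    Hypothetical helper: it returns plain data and does not talk to Cosmos DB.
    """
    return {
        "query": (
            "SELECT l.CustomerName, l.LogDetail "
            "FROM Logs l "
            # Third argument true = ignore case, as in the STRINGEQUALS example above.
            "WHERE STRINGEQUALS(l.CustomerName, @name, true)"
        ),
        "parameters": [{"name": "@name", "value": customer_name}],
    }


spec = case_insensitive_equals_query("Acme")
print(spec["query"])
print(spec["parameters"])
```

Passing the value as a parameter instead of concatenating it into the query text avoids quoting problems with names that contain apostrophes.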

Related

Cosmos DB: applying a sort by field removes all documents that do not have that field

We are migrating from MongoDB to Cosmos DB using the Mongo API.
We have encountered the following difference in query behavior around sorting.
Using the Cosmos DB Mongo API, sorting by a field removes all documents that don't have that field. Is it possible to modify the query to include the nulls and replicate the Mongo behavior?
For example if we have the following 2 documents
[{
"id":"p1",
"priority":1
},{
"id":"p2"
}]
performing:
sort({"priority":1})
Cosmos DB will return a single result, 'p1'.
MongoDB will return both results in the order 'p2', 'p1'; the documents with a null (missing) field come first.
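To make the expected ordering concrete, here is a small plain-Python sketch (not Cosmos DB or MongoDB code) that reproduces the MongoDB order for the two documents above. The function name is made up for illustration, and it assumes the values of the sort field are mutually comparable.

```python
docs = [{"id": "p1", "priority": 1}, {"id": "p2"}]


def mongo_like_sort(documents, field):
    """Sort ascending by field, placing documents that lack the field first."""
    # The key is (has_field, value): False sorts before True, so documents
    # without the field come first; the 0 is only a filler for missing values.
    return sorted(documents, key=lambda d: (field in d, d.get(field, 0)))


print([d["id"] for d in mongo_like_sort(docs, "priority")])  # ['p2', 'p1']
```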
As far as I know, documents in which the field is missing are not included in the sorted query result.
As a workaround, you could add a non-existent field to the sort so that the engine is forced to scan all the data.
Like this:
db.getCollection('brandotestcollections').find().sort({"test": 1, "aaaa":1})
I had the same problem and solved it after some reading.
Refer to the document...
You have to update the indexing policy of the container to change the default way Cosmos DB sorts!

How to do a Case Insensitive search on Azure DocumentDb?

Is it possible to perform a case-insensitive search on DocumentDB?
Let's say I have a record with 'name' key and value as "Timbaktu"
This will work:
select * from json j where j.name = "Timbaktu"
This won't:
select * from json j where j.name = "timbaktu"
So how do you do a case-insensitive search?
Thanks in advance.
Regards.
There are two ways to do this. The first is to use the built-in LOWER/UPPER functions, for example:
select * from json j where LOWER(j.name) = 'timbaktu'
This will require a scan, though. Another, more efficient way is to store a "canonicalized" form (e.g. lowercase) and use that for querying. For example, the JSON would be:
{ "name": "Timbaktu", "nameLowerCase": "timbaktu" }
Then use it for querying like:
select * from json j WHERE j.nameLowerCase = "timbaktu"
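A minimal sketch of how the canonicalized field could be filled in before a document is written. The helper name and the "LowerCase" suffix are assumptions chosen to match the example above, and the code only manipulates dictionaries; it does not call any Cosmos DB API.

```python
import copy


def with_lowercase_field(doc, field, suffix="LowerCase"):
    """Return a copy of doc with an extra lowercase copy of doc[field].

    Hypothetical helper: call it on every document before inserting or
    updating, so that the two fields never drift apart.
    """
    out = copy.deepcopy(doc)
    value = out.get(field)
    if isinstance(value, str):
        out[field + suffix] = value.lower()
    return out


doc = with_lowercase_field({"name": "Timbaktu"}, "name")
print(doc)  # {'name': 'Timbaktu', 'nameLowerCase': 'timbaktu'}

# The search term has to be normalized the same way before it is used in the query.
search_term = "TIMBAKTU".lower()
print(search_term == doc["nameLowerCase"])  # True
```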
Hope this helps.
Cosmos recently added a case-insensitive option for string functions:
You now have an option to make these string comparisons
case-insensitive: Contains, EndsWith, StringEquals, and StartsWith.
and Significant performance improvements have been realized for these
string system functions. Each of these four string system functions
now benefit from an index and will therefore have much lower latency
and request unit (RU) consumption.
Announcement
Perhaps this is an old question, but I just want to provide a workaround.
You could use a UDF in Azure Cosmos DB.
UDF (registered with the id lowerConvert, to match the query below):
function userDefinedFunction(str){
    return str.toLowerCase();
}
Then use the SQL below to query:
SELECT c.firstName FROM c where udf.lowerConvert(c.firstName) = udf.lowerConvert('John')

How can I Scan an index in reverse in DynamoDB?

I am currently using DynamoDB and having a problem scanning. I am able to get paged results in forward order by using the ExclusiveStartKey. However, regardless of whether I set ScanIndexForward to true or false, I get results in forward order from my scan operation. How can I get results in reverse order from a Scan in DynamoDB?
ScanIndexForward is the correct way to get items in descending order by the range key of the table or index you are querying. From the AWS API Reference:
A value that specifies ascending (true) or descending (false)
traversal of the index. DynamoDB returns results reflecting the
requested order determined by the range key. If the data type is
Number, the results are returned in numeric order. For type String,
the results are returned in order of ASCII character code values. For
type Binary, DynamoDB treats each byte of the binary data as unsigned
when it compares binary values.
Based on the docs for Scan, I conclude that there is no way to Scan in reverse. However, I would say that you are not using DynamoDB correctly if you need to do that. When designing a schema for a database like DynamoDB, you should plan the schema based on your expected queries to ensure that almost all application queries have a good index. Scans are meant more for sys admin operations or for feeding into MapReduce or analytics. "A Scan operation always scans the entire table, then filters out values to provide the desired result, essentially adding the extra step of removing data from the result set." (Query and Scan Performance) That can lead to performance problems and other issues.
Using DynamoDB is fundamentally different from working with a traditional relational database and requires a big change in the way you think about using it. You need to decide whether DynamoDB's advantages in storage, performance, reliability and availability are worth accepting its limitations.
As of now, a DynamoDB Scan cannot return sorted results.
You need to use a query with a new global secondary index (GSI) with a hashkey and range field. The trick is to use a hashkey which is assigned the same value for all data in your table.
I recommend making a new field for all data and calling it "Status" and set the value to "OK", or something similar.
Then your query to get all the results sorted would look like this:
{
TableName: "YourTable",
IndexName: "Status-YourRange-index",
KeyConditions: {
Status: {
ComparisonOperator: "EQ",
AttributeValueList: [
"OK"
]
}
},
ScanIndexForward: false
}
The docs for how to write GSI queries are found here: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.Querying
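The same request can also be written with a key condition expression. The sketch below only builds the request parameters as a dictionary; the helper name is made up, the table, index and attribute names are the placeholders from the answer above, and it assumes the hash key is a string attribute.

```python
def build_reverse_query(table, index, hash_attr, hash_value):
    """Build parameters for a DynamoDB Query that returns items in descending range-key order.

    Hypothetical helper: it returns plain data and makes no AWS call.
    """
    return {
        "TableName": table,
        "IndexName": index,
        # "#h" is a placeholder for the attribute name, so the name cannot
        # clash with a DynamoDB reserved word.
        "KeyConditionExpression": "#h = :v",
        "ExpressionAttributeNames": {"#h": hash_attr},
        # Low-level attribute value format: "S" marks a string.
        "ExpressionAttributeValues": {":v": {"S": hash_value}},
        # False = traverse the index in descending order.
        "ScanIndexForward": False,
    }


params = build_reverse_query("YourTable", "Status-YourRange-index", "Status", "OK")
print(params)
# Assumed usage with the low-level boto3 client:
#   boto3.client("dynamodb").query(**params)
```

Keep in mind that giving every item the same hash key value concentrates all index traffic on a single partition key, so this pattern is best suited to modest data volumes.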

Riak search queries via the java client

I am trying to perform queries using the OR operator as following:
MapReduceResult result = riakClient.
mapReduce("some_bucket", "Name:c1 OR c2").
addMapPhase(new NamedJSFunction("Riak.mapValuesJson"), true).
execute();
I only get the 1st object in the query (where name='c1').
If I change the order of the query (i.e. Name:c2 OR c1) again I get only the first object in query (where name='c2').
Is the OR operator (and other query operators) supported in the Java client?
I got this answer from a Basho engineer, Sean C.:
You either need to group the terms or qualify both of them. Without a field identifier, the search query assumes that the default field is being searched. You can determine how the query will be interpreted by using the 'search-cmd explain' command. Here's two alternate ways to express your query:
Name:c1 OR Name:c2
Name:(c1 OR c2)
Both options worked for me!
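If the terms come from user input, the two forms can be generated rather than typed by hand. A minimal sketch with made-up helper names; it only builds the query string and does not escape special characters inside the terms.

```python
def qualified_or_query(field, terms):
    """Qualify every term with the field name: Name:c1 OR Name:c2."""
    return " OR ".join(f"{field}:{term}" for term in terms)


def grouped_or_query(field, terms):
    """Group the terms under one field identifier: Name:(c1 OR c2)."""
    return f"{field}:({' OR '.join(terms)})"


print(qualified_or_query("Name", ["c1", "c2"]))  # Name:c1 OR Name:c2
print(grouped_or_query("Name", ["c1", "c2"]))    # Name:(c1 OR c2)
```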

Asp.net fulltext multiple search terms methodology

I've got a search box that users can type terms into. I have a table set up with full-text searching on a string column. Let's say a user types this: "word, office, microsoft" and clicks "search".
Is this the best way to deal with multiple search terms?
(pseudocode)
foreach (string searchWord in searchTerms){
select col1 from myTable where contains(fts_column, 'searchWord')
}
Is there a way of including the search terms in the sql and not iterating? I'm trying to reduce the amount of calls to the sql server.
FREETEXT might work for you. It will separate the string into individual words based on word boundaries (word-breaking). Then you'd only have a single SQL call.
MSDN -- FREETEXT
Well you could just build your SQL Query Dynamically...
string[] searchWords = searchTerm.Split(',');
string SQL = "SELECT col1 FROM myTable WHERE 1=2";
foreach (string word in searchWords)
{
    // Trim the whitespace left over from splitting "word, office, microsoft".
    SQL = string.Format("{0} OR contains(fts_column, '{1}')", SQL, word.Trim());
}
//EXEC SQL...
Obviously this comes with the usual warnings/disclaimers about SQL injection etc., but the principle is that you would dynamically build up all your clauses and apply them in one query.
Depending on how you're interacting with your DB, it might be feasible for you to pass the entire un-split search term into a SPROC and then split and build dynamic SQL inside the stored procedure.
You could do it similar to what you have there: just parse the search terms based on delimiter, and then make a call on each, joining the results together. Alternatively, you can do multiple CONTAINS:
SELECT Name FROM Products WHERE CONTAINS(Name, @Param1) OR CONTAINS(Name, @Param2) etc.
Maybe try both and see which is faster in your environment.
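A further variant keeps a single CONTAINS call and puts the OR inside the search condition, which can then be passed as one parameter. The sketch below only builds that condition string; the helper name and the @searchCondition parameter name are made up for illustration, and how the parameter is bound depends on your data access library.

```python
def build_contains_condition(raw_terms):
    """Turn 'word, office, microsoft' into '"word" OR "office" OR "microsoft"'."""
    terms = []
    for part in raw_terms.split(","):
        # Drop double quotes so a term cannot break out of its quoted phrase.
        term = part.strip().replace('"', "")
        if term:
            terms.append(f'"{term}"')
    # An empty result means there is nothing to search for; skip the query then.
    return " OR ".join(terms)


condition = build_contains_condition("word, office, microsoft")
print(condition)  # "word" OR "office" OR "microsoft"

# One round trip: bind `condition` to the parameter instead of concatenating it.
sql = "SELECT col1 FROM myTable WHERE CONTAINS(fts_column, @searchCondition)"
```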
I use this class for Normalizing SQL Server Full-text Search Conditions
