SQLite MATCH query - sqlite

I have created a virtual table and am executing the following query that returns nothing:
SELECT * FROM table_search WHERE name MATCH 'Test'
If I change the MATCH to '=' or 'LIKE' then the query returns a row of data.

As the documentation explains:
The MATCH operator is a special syntax for the match()
application-defined function. The default match() function
implementation raises an exception and is not really useful for
anything. But extensions can override the match() function with more
helpful logic.
If you haven't defined such a function, then it will select no rows. If you have defined such a function, then explain that in your question. Otherwise, stick with like, =, or regexp.

I am wondering what changed since Gordon Linoff's answer, because he is obviously quite the SQL expert and I'm not. Just talking about my own experience:
Don't specify the column but the table name for FTS queries:
SELECT * FROM table_search WHERE table_search MATCH 'Test'
I made the same mistake and even tried the implementation of a custom MATCH function from this repository after reading the answer from Gordon Linoff and the doc.
The MATCH operator is a special syntax for the match() application-defined function. The default match() function implementation raises an exception and is not really useful for anything. But extensions can override the match() function with more helpful logic.
Contrary to that, I found that MATCH works out of the box (using SQLite 3.39) when I use the table name. I first thought an asterisk ('Test*') was also required, but that was only because of my specific dataset and a wrong assumption.
The custom function is not called by MATCH, but it is called if I apply it to a column, like
SELECT match(name) FROM table_search WHERE table_search MATCH 'Test'
which is a very different intention.
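A minimal sketch of the table-name form, using Python's built-in sqlite3 module rather than whatever client the question used. It assumes the bundled SQLite was compiled with FTS5; the fts5 table definition and the sample rows are made up for illustration. It also shows why a prefix query such as 'Test*' can look necessary on some datasets: a plain 'Test' only matches the whole token.

```python
import sqlite3

# Assumption: the SQLite library behind sqlite3 includes the FTS5 extension.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE table_search USING fts5(name)")
conn.executemany(
    "INSERT INTO table_search(name) VALUES (?)",
    [("Test",), ("Testing things",), ("Other",)],  # illustrative rows
)


def search(query):
    """Run a full-text query with the table name on the left of MATCH."""
    sql = (
        "SELECT name FROM table_search "
        "WHERE table_search MATCH ? ORDER BY rowid"
    )
    return [row[0] for row in conn.execute(sql, (query,))]


exact_hits = search("Test")    # matches the token "test" only
prefix_hits = search("Test*")  # matches any token starting with "test"
print(exact_hits)
print(prefix_hits)
```

With these rows, the plain query finds only the row whose token is exactly "Test", while the prefix query also finds "Testing things".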

Related

Unary NOT in SQLite FTS5 MATCH query

The SQLite FTS5 docs say that search queries such as SELECT ... WHERE MATCH '<query1> NOT <query2>' are supported, but it looks like there's no support for the unary NOT operator.
For example, if I want to search for everything that doesn't match <query>, I cannot use MATCH 'NOT <query>'. I would have to use NOT MATCH '<query>', which is a completely different thing (the FTS5 module never gets to see the NOT operator, as it is outside the quotation marks). Only the text inside the quotation marks is the search query.
I need to find a way to use a unary NOT operator inside the search query. I can't use it outside, because I only get to control the search query text, and not the rest of the SQL statement.
A possible approach I've thought of would be to find a search query that matches anything, and do MATCH '<match_anything> NOT <query>'. However, I've found no way to match everything in a search query.
Can you think of a way to have the behaviour of the unary NOT operator inside the search query?
Try this:
SELECT * FROM docs
WHERE ROWID NOT IN (
SELECT ROWID FROM docs WHERE content MATCH '<query>'
)
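A small sketch of the NOT IN approach, run through Python's sqlite3 module. It assumes FTS5 is available in the bundled SQLite; the docs table contents are invented, and the table name is used on the left of MATCH. Note that this still changes the SQL statement, not just the search query text.

```python
import sqlite3

# Assumption: the SQLite library behind sqlite3 includes the FTS5 extension.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")
conn.executemany(
    "INSERT INTO docs(content) VALUES (?)",
    [("apple pie",), ("banana bread",), ("apple tart",)],  # illustrative rows
)


def not_matching(query):
    """Return rows whose rowid is absent from the full-text match set."""
    sql = """
        SELECT content FROM docs
        WHERE rowid NOT IN (SELECT rowid FROM docs WHERE docs MATCH ?)
        ORDER BY rowid
    """
    return [row[0] for row in conn.execute(sql, (query,))]


non_apple = not_matching("apple")
print(non_apple)
```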

SQLite Custom ORDER BY over Strings causing Parser Stack Overflow

I wrote a custom SQLite 'sort' over strings that basically replaces each relevant substring by an alphabetic letter in the appropriate place in the alphabet.
The problem is pretty simple - there are too many REPLACE statements running around causing a parser stack overflow.
For example, the statement looks something like: SELECT ... FROM ... WHERE ... ORDER BY REPLACE(REPLACE(...REPLACE('alpha','A'), 'beta','B'), 'gamma','C')...);
The total count of REPLACE calls is 66.
My current workaround is to use a custom function to apply the replacements (since I am using the SQLite C API), but it would be nice to be able to do this in SQLite itself, rather than having to use a C callback.
Is it possible? Is there a better solution?
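For reference, the custom-function workaround described above can be sketched as follows. This uses Python's sqlite3.create_function instead of the C API, and the table, the sort_key name, and the three-entry replacement list are hypothetical stand-ins for the real 66 replacements.

```python
import sqlite3

# Hypothetical replacement table: each substring maps to a letter whose
# alphabetical position encodes the desired sort order.
REPLACEMENTS = [("small", "A"), ("medium", "B"), ("large", "C")]


def sort_key(text):
    """Apply every replacement in turn, as the nested REPLACE calls would."""
    if text is None:
        return None
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text


conn = sqlite3.connect(":memory:")
# Register the Python function as a one-argument SQL function.
conn.create_function("sort_key", 1, sort_key)
conn.execute("CREATE TABLE items (label TEXT)")
conn.executemany(
    "INSERT INTO items(label) VALUES (?)",
    [("large box",), ("small box",), ("medium box",)],
)

ordered = [
    row[0]
    for row in conn.execute("SELECT label FROM items ORDER BY sort_key(label)")
]
print(ordered)
```

Because the replacements live in one function call, the SQL text stays shallow no matter how many substrings are mapped, which sidesteps the parser's nesting limit.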

R sqlexecute wildcard

Using RODBCext (and Teradata), my SQL query often needs to be restricted, which is done with a WHERE clause. However, this is not always required, and it would sometimes be beneficial not to restrict, but I would like to use a single SQL query. (The actual query is more complex and has several instances of what I'm attempting to apply here.)
In order to return all rows, using a wildcard seems like the next best option, but nothing appears to work correctly. For example, the sql query is:
SELECT *
FROM MY_DB.MY_TABLE
WHERE PROC_TYPE = ?
The following does work when passing in a string for proc_type:
sqlExecute(connHandle, getSQL(SQL_script_path), proc_type, fetch = TRUE)
In order to essentially bypass this filter, I would like to pass a wildcard so all records are returned.
I've tried proc_type set to '%', '*'. Also escaped both with backslashes and enclosed with double-quotes, but no rows are ever returned, nor are any errors produced.
You could use COALESCE to do this:
SELECT *
FROM MY_DB.MY_TABLE
WHERE PROC_TYPE = COALESCE(?, PROC_TYPE);
In the event that your parameter is NULL it will choose PROC_TYPE to compare to PROC_TYPE which will return everything.
As for your wildcard attempt, you would have to switch to an operator that can use a wildcard, for instance LIKE instead of =. I think you would end up with some oddball edge cases though, depending on your search term and the data in that column, so the COALESCE() option is the better way to go.
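The COALESCE pattern can be tried out quickly with Python's sqlite3 module. This is only an illustration on SQLite, not Teradata or RODBCext, and the table and rows are made up. One caveat worth seeing in practice: rows where the column itself is NULL are still filtered out, because NULL = NULL does not evaluate to true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, proc_type TEXT)")
conn.executemany(
    "INSERT INTO my_table(id, proc_type) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, None)],  # illustrative rows, one with NULL
)


def fetch_ids(proc_type):
    """Filter on proc_type, or skip the filter when the parameter is None."""
    sql = (
        "SELECT id FROM my_table "
        "WHERE proc_type = COALESCE(?, proc_type) ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, (proc_type,))]


only_a = fetch_ids("A")       # parameter supplied: normal equality filter
unfiltered = fetch_ids(None)  # parameter NULL: column compared to itself
print(only_a)
print(unfiltered)
```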

Substring and Length function in U SQL

I want all records where ParaName matches tagName. I have tried the Length, LEFT and SUBSTRING functions, but I think they are not available in U-SQL. If they are, what is the syntax?
@var =
SELECT * FROM Table
WHERE ParaName LIKE tagName+"%";
U-SQL emphasizes the use of C# Expressions and methods on .NET types to handle many common cases that SQL achieves with functions.
In this case your type is string (System.String) so methods like StartsWith() and Contains() can be used among many others.
Example: ParameterName that begins with tagName
WHERE ParameterName.StartsWith( tagName )
Example: ParameterName that contains tagName
WHERE ParameterName.Contains( tagName )
There are many examples of using various C# expressions and methods at the official reference site - U-SQL Language Reference. Look under the sub-topic Built-in C# Functions and Operators (U-SQL).

IsNULL and Coalesce proper usage

As we have two options to intercept the null values coming from database...
ISNull
Coalesce
Following are the ways to write the query for the above two functions...
Select IsNull(Columnname, '') As validColumnValue From TableName
Select Coleasce(Columnname, '') As validColumnValue From TableName
Query - Which should be preferred in which situation and why?
This has been hashed and re-hashed. In addition to the tip I pointed out in the comment and the links and explanation @xQbert posted above, by request here is an explanation of COALESCE vs. ISNULL using a subquery. Let's consider these two queries, which in terms of results are identical:
SELECT COALESCE((SELECT TOP (1) name FROM sys.objects), N'foo');
SELECT ISNULL((SELECT TOP (1) name FROM sys.objects), N'foo');
(Comments about using TOP without ORDER BY to /dev/null/ thanks.)
In the COALESCE case, the logic actually gets expanded to something like this:
SELECT CASE WHEN (SELECT TOP (1) ...) IS NOT NULL
THEN (SELECT TOP (1) ...)
ELSE N'foo'
END
With ISNULL, this does not happen. There is an internal optimization that seems to ensure that the subquery is only evaluated once. I don't know if anyone outside of Microsoft is privy to exactly how this optimization works, but you can see this if you compare the plans. Here is the plan for the COALESCE version:
And here is the plan for the ISNULL version - notice how much simpler it is (and that the scan only happens once):
In the COALESCE case the scan happens twice, meaning the subquery is evaluated twice, even if it doesn't yield any results. If you add a WHERE clause such that the subquery yields 0 rows, you'll see a similar disparity - the plan shapes might change, but you'll still see a double seek+lookup or scan for the COALESCE case. Here is a slightly different example:
SELECT COALESCE((SELECT TOP (1) name FROM sys.objects
WHERE name = N'no way this exists'), N'foo');
SELECT ISNULL((SELECT TOP (1) name FROM sys.objects
WHERE name = N'no way this exists'), N'foo');
The plan for the COALESCE version this time - again you can see the whole branch that represents the subquery repeated verbatim:
And again a much simpler plan, doing roughly half the work, using ISNULL:
You can also see this question over on dba.se for some more discussion:
Performance difference for COALESCE versus ISNULL?
My suggestion is this (and you can see my reasons why in the tip and the above question): trust but verify. I always use COALESCE (because it is ANSI standard, supports more than two arguments, and doesn't do quite as wonky things with data type precedence) unless I know I am using a subquery as one of the expressions (which I don't recall ever doing outside of theoretical work like this) or I am experiencing a real performance issue and just want to compare to see if COALESCE vs. ISNULL has any substantial performance difference (which outside of the subquery case, I have yet to find). Since I am almost always using COALESCE with arguments of like data types, I rarely have to do any testing other than looking back at what I've said about it in the past (I was also the author of the aspfaq article that xQbert pointed out, 7 years ago).
begin humor: The 1st, the 2nd will never work it's spelled wrong :D END humor
---Cleaned up response---
Features of isNull(value1,value2)
Only supports 1 evaluation: if the first value is null, the 2nd is used, so if it is null too you get null back!
Is non-ANSI standard, meaning that if database portability is an issue, don't use this one.
isnull(value1,value2) will return the datatype of value1.
Will return the datatype for the selected value and fail when implicit conversions can't occur.
Features of Coalesce(Value1, Value2, value3, value...)
Supports multiple valuations of Null; basically will pull in the first non-null value from the list. if all values are null in list, null is returned.
is ANSI-standard meaning database portability shouldn't be an issue.
will return the datatype for the selected value and fail if all fields in the select do not return the same datatype.
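The "first non-null value" behaviour is easy to check with Python's sqlite3 module. This is only a sketch on SQLite, which has no ISNULL(value1, value2) replacement function; its two-argument IFNULL is used here as a stand-in, and that substitution is an assumption of this example rather than something from the answers. Data type precedence rules in SQL Server are not reproduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a single-value SELECT and return that value."""
    return conn.execute(sql).fetchone()[0]


# COALESCE walks its argument list and returns the first non-NULL value.
first_non_null = scalar("SELECT COALESCE(NULL, NULL, 'third')")

# When every argument is NULL, the result is NULL (None in Python).
all_null = scalar("SELECT COALESCE(NULL, NULL)")

# IFNULL is SQLite's two-argument form, standing in for T-SQL's ISNULL.
two_arg = scalar("SELECT IFNULL(NULL, 'fallback')")

print(first_non_null, all_null, two_arg)
```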
So to answer the question directly:
It depends on the situation. If you need to develop SQL that
is DB independent, coalesce is more correct to use.
allows for multiple evaluations, coalesce is more correct (of course you could just nest isnull over and over and over...), but put that under a performance microscope and coalesce may just win (I've not tested it).
Are you using a DB engine that supports isNull? (If not, use coalesce.)
How do you want type casting handled? Implicitly or not?
---ORIGINAL------
isnull only supports 2 evaluations
coalesce supports many more... coalesce (columnName1, ColumnName2, ColumnName3, '')
coalesce returns a datatype similar to that of a case evaluation, whereas isnull returns the datatype of the first in the list (which I found interesting!).
As to when to use which, you'd have to investigate by looking at the execution plan of both on SQL 2008 and 2005: different versions, different engines, different ways to execute.
Furthermore, coalesce is ANSI standard, while isnull is engine specific. Thus if you want greater portability between DB engines, use coalesce.
More info here aspfaq
or here msdn blog
You may take this into consideration:
1. The ISNULL function requires two parameters: the value to check and the replacement for null values.
2. The COALESCE function works a bit differently: it takes any number of parameters and returns the first non-NULL value. I prefer COALESCE over ISNULL because
it meets ANSI standards, while ISNULL does not.
I hope you found the answer to your question.
