cts:near query not working with special character


I have a document with the element <word>42 case § 100</word>.
I am required to search for it with a near query. The search returns nothing when I use the query below:
cts:search(
  //word,
  cts:near-query(
    (
      cts:word-query("42", ("case-insensitive","diacritic-insensitive","punctuation-insensitive","lang=en"), 1),
      cts:word-query("case", ("case-insensitive","diacritic-insensitive","punctuation-insensitive","lang=en"), 1),
      cts:word-query("§", ("case-insensitive","diacritic-insensitive","punctuation-insensitive","lang=en"), 1),
      cts:word-query("1*", ("case-insensitive","diacritic-insensitive","punctuation-insensitive","lang=en"), 1)
    ),
    2,
    ("ordered"),
    1
  )
)
The same search works when I pass "§ 1*" as a single word query instead of splitting it into two.

§ is punctuation which is not included in the word index. You can change this by using custom tokenization overrides. See Search Developer's Guide for more details.

I think your distance (2nd param of cts:near-query) is too small in this case. The distance is between all subqueries, so when looking for 4 ordered terms, you need a distance of 3.
HTH!

Related

is it the same to use MATCHES (* + "" + *) and no parameters in a FOR EACH in Progress 4GL?

So I made the following FOR EACH
FOR EACH insp_cd
    WHERE insp_cd.status_ = 1
      AND insp_cd.item MATCHES('*' + pc-itemPost + '*')
      AND insp_cd.update_at < NOW:
So, when the pc-itemPost is "", should I avoid using the MATCHES? Like:
IF pc-itemPost = "" THEN DO:
    FOR EACH insp_cd
        WHERE insp_cd.status_ = 1
          AND insp_cd.update_at < NOW:
        ...
END.
ELSE DO:
    FOR EACH insp_cd
        WHERE insp_cd.status_ = 1
          AND insp_cd.item MATCHES('*' + pc-itemPost + '*')
          AND insp_cd.update_at < NOW:
I know it's very slow because of the table scan, but I'd like to know if there is any difference. Thanks.

Any time that you can avoid MATCHES you should do so.
Using an IF statement to choose branches that execute different static FOR EACH statements is one way to do it. Building dynamic queries based on similar logic would be another approach.
Whether or not your two queries are "different"? Sure, they are different. They have different WHERE clauses so their specific behavior (and performance) will depend on the index structure (which we don't know).
insp_cd.item MATCHES "*" + pc-itemPost + "*"
can be very different from:
insp_cd.item = ""
And logically it is not the same as omitting a check of insp_cd.item altogether. Logically maybe you're attempting to exclude empty values? I'm not sure what the requirement is here.
If insp_cd.item is the first component of an index, or the second component after insp_cd.status_, then a variation of this query using = "" will be much more efficient than one using MATCHES.
Back to avoiding MATCHES, at a high level:
If there is no need for wild cards use "=". Equality matches are always preferred.
If the wild card is at the end of the string use BEGINS.
If the wild card is being used to signify a known list use a series of OR clauses or a LOOKUP() or build a temp-table to join in the query.
There are probably more ways to avoid MATCHES but these are the ones that spring to mind.
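The first two rules above can be sketched as a small decision function. This is only an illustration written in Python, not ABL: choose_operator is a hypothetical helper, and it assumes that '*' and '.' are the only wildcard characters in a MATCHES pattern and that '~' escapes are not in use.

```python
from typing import Tuple


def choose_operator(pattern: str) -> Tuple[str, str]:
    """Suggest the cheapest ABL comparison for a MATCHES-style pattern.

    Assumption: '*' and '.' are the only wildcards; '~' escapes are ignored.
    Returns (operator, operand) where operand is the value to compare against.
    """
    wildcards = ("*", ".")

    # No wildcard at all: a plain equality match is enough.
    if not any(w in pattern for w in wildcards):
        return ("=", pattern)

    # A single trailing '*' and nothing else: BEGINS on the prefix.
    prefix = pattern[:-1]
    if pattern.endswith("*") and not any(w in prefix for w in wildcards):
        return ("BEGINS", prefix)

    # Anything else still needs MATCHES (and its table scan).
    return ("MATCHES", pattern)


print(choose_operator("ABC123"))    # equality
print(choose_operator("ABC*"))      # BEGINS
print(choose_operator("*ABC*"))     # MATCHES
```

A pattern such as '*' + pc-itemPost + '*' always falls into the last branch, which is why the question's query cannot use an index on insp_cd.item.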

Search for value in element with specific attribute

<elemA>
  <elemZ mytype="1">
    <myval>100</myval>
  </elemZ>
  <elemZ mytype="2">
    <myval>200</myval>
  </elemZ>
</elemA>
Using cts:queries, I want to find a myval of 100 inside an elemZ with mytype = "1". I do not see any cts query that combines cts:element-query with a filter on an attribute. Even a cts:and-query does not appear helpful.
Without the attribute constraint, an element-value-query and two element-queries would work easily.
cts:search(doc(), (some cts query?))

First try this simple XPath. Confirm that it works, and that it really is not fast enough for you.
//elemZ[@mytype=1]/myval[. = "100"]
That should return the myval element children of elemZ with mytype=1 whose text content is "100".
To do better (with cts:query) you will need those 'dreaded' other cts:queries and possibly some range indexes.
Roughly : (untested)
cts:search(doc(),
  cts:element-query(xs:QName("elemZ"),
    cts:and-query((
      cts:element-attribute-value-query(xs:QName("elemZ"), xs:QName("mytype"), "1"),
      cts:element-value-query(xs:QName("myval"), "100")
    ))
  )
)
Recommend you start with the simplest expression that does anything then one by one add constraints.
In your case, it's conceivable that the query optimizer will optimize the simple xpath into the appropriate cts query. Worth trying and measuring performance. I personally like to start with a basic xpath and then only work my way up to a cts:query as needed.
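To check what the simple XPath is expected to return before tuning anything, the same selection can be reproduced outside MarkLogic. The sketch below uses Python's standard xml.etree.ElementTree on the sample document from the question; find_myval is a hypothetical helper, and ElementTree only supports a subset of XPath, so the text comparison is done in Python rather than in the path expression.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (attribute name assumed to be "mytype" on both elements).
SAMPLE = """
<elemA>
  <elemZ mytype="1"><myval>100</myval></elemZ>
  <elemZ mytype="2"><myval>200</myval></elemZ>
</elemA>
"""


def find_myval(root, mytype, value):
    """Return myval elements under elemZ[@mytype=mytype] whose text equals value."""
    # The attribute predicate is handled by ElementTree's XPath subset.
    path = f".//elemZ[@mytype='{mytype}']/myval"
    # The text predicate is applied in Python.
    return [el for el in root.findall(path) if (el.text or "").strip() == value]


root = ET.fromstring(SAMPLE)
matches = find_myval(root, "1", "100")
print([el.text for el in matches])
```

Only the first elemZ satisfies both constraints, so a cts:query version of the search should return the same single match.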

Strings Comparing between result set and correct set

I'm working on an algorithm to extract keywords from a text. I have a test set of scientific abstracts with their tags (keywords). My question is: what is the best way to compare the correct tags with the tags my algorithm produces?
Should I compare them strictly, e.g.
if (correct_tag == result_tag)
...or do a similarity check? Sometimes I get something like the following.
For the same document:
**correct_tag** = ["eigenvalues and eigenfunctions in quantum mechanics"]
**result_tag** = ["eigenvalues and eigenfunctions"]
For another document:
**correct_tag** = ["cardiovascular system"]
**result_tag** = ["cardiovascular physiology", "cardiovascular system"]
NOTE: These are in-text tags, meaning they are extracted from the text.
Any help is appreciated, thanks.
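To make the two options concrete, here is a minimal sketch that contrasts strict equality with one possible similarity check, token-level Jaccard overlap. The choice of Jaccard and the helper names (tokens, jaccard, best_match) are assumptions made for illustration; the question does not prescribe a metric, and any acceptance threshold would have to be chosen against the test set.

```python
def tokens(tag):
    """Lower-case a tag and split it into a set of whitespace-separated words."""
    return set(tag.lower().split())


def jaccard(a, b):
    """Token overlap between two tags: |intersection| / |union|."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def best_match(correct_tag, result_tags):
    """Highest similarity between one correct tag and any produced tag."""
    return max((jaccard(correct_tag, r) for r in result_tags), default=0.0)


correct = "eigenvalues and eigenfunctions in quantum mechanics"
result = "eigenvalues and eigenfunctions"

print(correct == result)          # strict comparison
print(jaccard(correct, result))   # partial credit from shared words
print(best_match("cardiovascular system",
                 ["cardiovascular physiology", "cardiovascular system"]))
```

Strict comparison rejects the first pair outright, while the overlap score gives it partial credit; the second example scores a full match because one of the produced tags is identical to the correct one.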

Sort Numbers with a Letter Suffix from an Access Database

I am attempting to retrieve some information from an Access database using an OleDbConnection. I am trying to Order the results By a column that contains a set of numbers in string format.
I wanted the results in a natural order (e.g. 1, 2, 10, 20 versus 1, 10, 2, 20), so I converted the data in the column of interest to integers and sorted the results.
"SELECT Drawing, Sheet FROM TableName ORDER BY CINT(Sheet) ASC"
This works fine, except when the table data has values with a letter suffix (e.g. 1A, 2B, etc.). The command above obviously fails for these cases.
I would like the results sorted like so: 1, 2, 2A, 2B, 3, 3A, and so on...
So, how to go about this? I've seen some examples that use REGEXP and some conditional statements, but apparently MS SQL doesn't support REGEXP. So I'm stuck. Ideas would be appreciated.

There is no way to use a regular expression in a query run from outside an Access session. If the query is run inside an Access session, it could use a custom VBA function with RegExp. But a regular expression approach seems like overkill for this situation anyway. You can get what you need simply with Val().
The Val Function will return a number from the digits in your string. It will stop reading the string when it hits a letter.
Here's an example from the Immediate window.
? Val("2A")
2
Use it in your query like this ...
SELECT Drawing, Sheet
FROM TableName
ORDER BY Val(Sheet), Sheet;
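The effect of ORDER BY Val(Sheet), Sheet can be previewed with a two-part sort key. The Python sketch below is only a rough stand-in: val here is a hypothetical helper that handles plain leading integers, whereas the real Access Val() also understands signs, decimals, and other numeric forms.

```python
import re


def val(s):
    """Rough stand-in for Access Val(): leading digits as an integer, else 0."""
    m = re.match(r"\s*(\d+)", s)
    return int(m.group(1)) if m else 0


def sheet_sort_key(sheet):
    """Sort by the numeric prefix first, then by the full string to order suffixes."""
    return (val(sheet), sheet)


sheets = ["10", "2A", "1", "3A", "2", "2B", "3", "20"]
ordered = sorted(sheets, key=sheet_sort_key)
print(ordered)  # ['1', '2', '2A', '2B', '3', '3A', '10', '20']
```

The second key component is what places 2 before 2A and 2B; sorting on the numeric prefix alone would leave their relative order undefined.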

SQLite Updating first letter to be upper case

I have a field, customer.country
I am trying to update it so that the first letter of each value in country is upper case. I can't seem to find a way of doing that.

UPDATE customer
SET country = UPPER(SUBSTR(country, 1, 1)) || SUBSTR(country, 2)

Why don't you use substr() to get the first letter and upper() it?
upper(substr(customer.country, 1, 1))||substr(customer.country, 2)
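The statement can be tried against an in-memory database before running it on real data. This sketch uses Python's standard sqlite3 module; the sample rows are made up for illustration. Note that SQLite's built-in UPPER() only converts ASCII letters by default.

```python
import sqlite3

# In-memory database with a few made-up rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (country TEXT)")
conn.executemany(
    "INSERT INTO customer (country) VALUES (?)",
    [("france",), ("united kingdom",), ("Spain",)],
)

# Upper-case the first character and append the rest of the string unchanged.
conn.execute(
    "UPDATE customer "
    "SET country = UPPER(SUBSTR(country, 1, 1)) || SUBSTR(country, 2)"
)

countries = [row[0] for row in conn.execute(
    "SELECT country FROM customer ORDER BY rowid")]
print(countries)  # ['France', 'United kingdom', 'Spain']
```

Only the first character of the whole value changes, so multi-word values such as "united kingdom" become "United kingdom" rather than "United Kingdom".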
