I'm creating a Xamarin.Forms application, and we use sqlite-net-pcl by Frank A. Krueger. It is supposed to support full-text searching, which I am trying to implement.
Now, full-text search seems to work. I created a query like:
SELECT * FROM Document d JOIN(
SELECT document_id
FROM SearchDocument
WHERE SearchDocument MATCH 'test*'
) AS ranktable USING(document_id)
which seems to work fine. However, I'd like to return the results in order of their rank, otherwise the result is useless. According to the documentation (https://www.sqlite.org/fts3.html), the syntax should be:
SELECT * FROM Document d JOIN(
SELECT document_id, rank(matchinfo(SearchDocument)) AS rank
FROM SearchDocument
WHERE SearchDocument MATCH 'test*'
) AS ranktable USING(document_id)
ORDER BY ranktable.rank
However, the engine doesn't seem to know the "rank" function:
[ERROR] FATAL UNHANDLED EXCEPTION: SQLite.SQLiteException: no such function: rank
It does know the "matchinfo" function though.
Can anyone tell me what I'm doing wrong?
Edit: After some more searching, it seems that the rank function is simply not implemented in the library. I'm confused. How can people use full-text search without caring about the order of the results? Is there some other way of ordering the results so that the most relevant ones are at the top?
sqlite-net-pcl depends on SQLitePCLRaw.bundle_green, which supplies the native SQLite build and therefore determines which functions are available. It's worth looking into that.
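For what it's worth, the rank() shown on that FTS3/FTS4 documentation page is sample application code that each application is expected to register as a custom SQL function; it is not built into SQLite, which is why matchinfo() works but rank() does not. If the native SQLite shipped by bundle_green was compiled with FTS5 (an assumption worth verifying on each platform), you get ranking for free, because FTS5 exposes a built-in rank column based on bm25(). A sketch, reusing the table and column names from the question (the content column name is assumed, since the FTS table definition isn't shown):
-- Assumes FTS5 is available and the search table is (re)created as an FTS5 table.
CREATE VIRTUAL TABLE SearchDocument USING fts5(document_id UNINDEXED, content);

SELECT d.*
FROM Document d
JOIN (
    SELECT document_id, rank           -- built-in FTS5 rank (bm25); smaller = more relevant
    FROM SearchDocument
    WHERE SearchDocument MATCH 'test*'
) AS ranktable USING (document_id)
ORDER BY ranktable.rank;
If FTS5 is not available, the fallback is to select matchinfo(SearchDocument) and compute a relevance score from the returned blob in C# after the query comes back, then sort in memory.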
This is a little strange, but I have a situation where it'd be beneficial for me to know which traversal an element came from.
For a simple example, something like this:
.union(
select('parent').out('contains'), //traversal 1
select('parent2').out('contains') //traversal 2
)
.dedup()
.project('id','traversal')
.by(id())
.by( //any way to determine which traversal it came from? or if it was in both? )
Edit: One thing I found is that I can use map() with group()/by() to get partway there:
.union(
select('parent').out('contains')
.map(group().by(identity()).by(constant('t1'))),
select('parent2').out('contains')
.map(group().by(identity()).by(constant('t2')))
)
.dedup() //Dedup isn't gonna work here because each hashmap will be different.
.project('id','traversal')
.by( //here I can't figure out how to read a value from the hashmap inline )
The above query without the project/by piece returns this:
[{v[199272505353083909]: 't1'}, {v[199272515180338177]: 't2'}]
Or is there a better way to do this?
Thanks!
One simple approach might be to just fold the results. If you get back an empty list you will know you did not find any on that "branch":
g.V('44').
union(out('route').fold().as('a').project('res','branch').by().by(constant('b1')),
out('none').fold().as('b').project('res','branch').by().by(constant('b2')))
which yields
{'res': [v[8], v[13], v[20], v[31]], 'branch': 'b1'}
{'res': [], 'branch': 'b2'}
UPDATED after discussion in the comments to include an alternative approach that uses nested union steps to avoid the project step inside the union. I still think I prefer the project approach unless its measured performance turns out to be poor.
g.V('44').
union(local(union(out('route').fold(),constant('b1')).fold()),
local(union(out('none').fold(),constant('b2')).fold()))
which yields
[[v[8], v[13], v[20], v[31]], 'b1']
[[], 'b2']
I am trying to find the count of all workspaces grouped by Customer, and then sort the results by that count.
t3 = g.withSideEffect("Neptune#repeatMode","BFS")
.V().has("Project","sid","A68FA527BB214F0E9D2287B455BEFE0A8AACFED0724B407F9EBF727ED439E8ED")
.both().hasLabel("Customer").group().by().by(
both().hasLabel("Workspace").count().order().by(Column.values,Order.desc)
)
.unfold()
.project("rowName","data")
.by(select(Column.keys).properties(MandatoryCustomerAttributes.firstName.name()).value())
.by(select(Column.values))
.by(fold().unfold());
I am facing the following error, which I am not able to understand. If someone can help, it would be great.
java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: {"requestId":"c3246ac6-d306-41e6-a6d1-3adbacb928fb","code":"InvalidParameterException","detailedMessage":"The provided object does not have accessible keys: class java.lang.Long"}
at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:375)
at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1934)
at org.apache.tinkerpop.gremlin.driver.ResultSet.one(ResultSet.java:123)
at org.apache.tinkerpop.gremlin.driver.ResultSet$1.hasNext(ResultSet.java:175)
at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:182)
at org.apache.tinkerpop.gremlin.driver.ResultSet$1.next(ResultSet.java:169)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:146)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal$TraverserIterator.next(DriverRemoteTraversal.java:131)
at org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal.nextTraverser(DriverRemoteTraversal.java:112)
at org.apache.tinkerpop.gremlin.process.remote.traversal.step.map.RemoteStep.processNextStart(RemoteStep.java:80)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:129)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:39)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:204)
at org.apache.tinkerpop.gremlin.process.traversal.Traversal.forEachRemaining(Traversal.java:278)
at com.callcomm.eserve.enitty.trial.NeptuneTrialCode.main(NeptuneTrialCode.java:373)
Ok I think after staring at this for a while I spotted the issue.
This part of the query
hasLabel("Workspace").count().order().by(Column.values,Order.desc
Is going to try to apply Column.values against the count result (an integer, not a map) - hence the error message.
You will need to order the group after it is created. Here is a simple example, note the use of local.
g.V('44','3','8').
group().
by().
by(out().count()).
order(local).
by(values,desc)
==>[v[8]:251,v[3]:93,v[44]:4]
As a side note, I did notice that the project() declares only 2 keys but has 3 by() modulators. That should not be a problem, but you may not be getting the results you wanted.
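Putting the two points together, a sketch of the corrected query (Gremlin console style) might look like the following. The firstName property name is assumed from MandatoryCustomerAttributes.firstName in the question, and the third by() has been dropped:
// Group workspace counts by Customer, order the map entries by value, then project.
g.V().has("Project", "sid", "A68FA527BB214F0E9D2287B455BEFE0A8AACFED0724B407F9EBF727ED439E8ED").
  both().hasLabel("Customer").
  group().
    by().
    by(both().hasLabel("Workspace").count()).
  order(local).
    by(values, desc).
  unfold().
  project("rowName", "data").
    by(select(keys).values("firstName")).
    by(select(values))
In the Java GLV you would keep the Column.values, Order.desc and __.both(...) forms from the original query.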
I am using the following query to obtain the current component serial number (tr_sim_sn) installed on the host device (tr_host_sn) from the most recent record in a transaction history table (PUB.tr_hist):
SELECT tr_sim_sn FROM PUB.tr_hist
WHERE tr_trnsactn_nbr = (SELECT max(tr_trnsactn_nbr)
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
AND tr_lot = '99524136'
AND tr_part = '6684112-001')
The actual table has ~190 million records. The excerpt below contains only a few sample records, and only fields relevant to the search to illustrate the query above:
tr_sim_sn |tr_host_sn* |tr_host_pn |tr_domain |tr_trnsactn_nbr |tr_qty_loc
_______________|____________|_______________|___________|________________|___________
... |
356136072015140|99524135 |6684112-000 |vattal_us |178415271 |-1.0000000000
356136072015458|99524136 |6684112-001 |vattal_us |178424418 |-1.0000000000
356136072015458|99524136 |6684112-001 |vattal_us |178628048 |1.0000000000
356136072015050|99524136 |6684112-001 |vattal_us |178628051 |-1.0000000000
356136072015836|99524137 |6684112-005 |vattal_us |178645337 |-1.0000000000
...
* = key field
The excerpt illustrates multiple occurrences of tr_trnsactn_nbr for a single value of tr_host_sn. The largest value for tr_trnsactn_nbr corresponds to the current tr_sim_sn installed within tr_host_sn.
This query works, but it is very slow: ~8 minutes.
I would appreciate suggestions to improve or refactor this query to improve its speed.
Check with your admins to determine when they last updated the SQL statistics. If the answer is "we don't know" or "never", then you might want to ask them to run the following 4GL program, which will create a SQL script to accomplish that:
/* genUpdateSQL.p
*
* mpro dbName -p util/genUpdateSQL.p -param "tmp/updSQLstats.sql"
*
 * sqlexp -user userName -password passWord -db dbName -S servicePort -infile tmp/updSQLstats.sql -outfile tmp/updSQLstats.log
*
*/
output to value( ( if session:parameter <> "" then session:parameter else "updSQLstats.sql" )).
for each _file no-lock where _hidden = no:
put unformatted
"UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR PUB."
'"' _file._file-name '"' ";"
skip
.
put unformatted "commit work;" skip.
end.
output close.
return.
This will generate a script that updates statistics for all tables and all indexes. You could edit the output to only update the tables and indexes that are part of this query if you want.
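For reference, the generated script contains one statement like the following per visible table (followed by a commit at the end), so for the table in this question it would include:
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR PUB."tr_hist";
commit work;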
Also, if the admins are nervous they could, of course, try this on a test db or a restored backup before implementing in a production environment.
I am posting this as a response to my own request for an improved query.
As it turns out, the following query includes two distinct changes that greatly improved its speed. One is to include the tr_domain criterion in both the main and the nested portions of the query. The second is to narrow the search by increasing the number of search criteria, which in the following are all included in the nested portion of the query:
SELECT tr_sim_sn
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
AND tr_trnsactn_nbr IN (
SELECT MAX(tr_trnsactn_nbr)
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
AND tr_part = '6684112-001'
AND tr_lot = '99524136'
AND tr_type = 'ISS-WO'
AND tr_qty_loc < 0)
This syntax results in ~0.5s response time. (credit to my colleague, Daniel V.)
To be fair, this query uses criteria beyond those stated in the original post, which made it difficult for others to offer a complete answer. The omission was not intentional; it came from being fairly new to the fundamentals of good query design. This query is, in part, the result of learning that when too few (or non-indexed) fields are used as search criteria against a large table, it can help to narrow the search by adding more criteria. The original had 3 criteria; this one has 5.
I've been googling for a while now and I guess I just have trouble stating my question the correct way.
I have a Product and my Product has "optional" ProductImages associated with it.
When I lazy-load the products everything works as expected, but I'd like to join the images up front to reduce the total number of queries.
Here's the code:
$qb->select('product')
->from('FocumaTCBundle:Product', 'product')
->join('product.ProductType', 'type')
->join('product.ProductImages', 'productImage')
->where('type.id = :productTypeId')
->setParameter('productTypeId', $PRODUCT_HOTEL_TYPE);
This however returns fewer results than without the join. I'm not sure how to create an "optional" join :(
Thanks for some help on this!
Use leftJoin:
$qb->select('product')
->from('FocumaTCBundle:Product', 'product')
->leftJoin('product.ProductType', 'type')
->leftJoin('product.ProductImages', 'productImage')
->where('type.id = :productTypeId')
->setParameter('productTypeId', $PRODUCT_HOTEL_TYPE);
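As a follow-up: if the goal is to reduce the number of queries, the left join alone will not hydrate the images; the joined alias also needs to be selected so Doctrine performs a fetch join. A sketch, reusing the aliases from the question:
$qb->select('product', 'productImage')      // selecting the joined alias makes Doctrine hydrate the collection
   ->from('FocumaTCBundle:Product', 'product')
   ->join('product.ProductType', 'type')    // inner join is fine here since it is filtered on below
   ->leftJoin('product.ProductImages', 'productImage') // left join keeps products without images
   ->where('type.id = :productTypeId')
   ->setParameter('productTypeId', $PRODUCT_HOTEL_TYPE);
With this, the ProductImages collection is populated in the same query and no lazy loads are triggered when it is accessed.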
The transaction AL11 displays a mapping of "directory parameters" to file paths on the application server, AFAIK.
The trouble with transaction AL11 is that its program only calls C modules; there's almost no trace of SELECT statements or function calls to analyze there.
I want the ability to do this dynamically, in my code, for instance via a function module that takes "DATA_DIR" as input and gives "E:\usr\sap\IDS\DVEBMGS00\data" as output.
This thread is about a similar topic, but it doesn't help.
Some other guy has the same problem, and he explains it quite well here.
I strongly suspect that the only way to get these values is through the kernel directly. Some of them can vary depending on the application server, so you probably won't be able to find them in the database. You could try this:
TYPE-POOLS abap.
TYPES: BEGIN OF t_directory,
log_name TYPE dirprofilenames,
phys_path TYPE dirname_al11,
END OF t_directory.
DATA: lt_int_list TYPE TABLE OF abaplist,
lt_string_list TYPE list_string_table,
lt_directories TYPE TABLE OF t_directory,
ls_directory TYPE t_directory.
FIELD-SYMBOLS: <l_line> TYPE string.
START-OF-SELECTION. " or place this inside whatever FORM or METHOD you need it in
* get the output of the program as string table
SUBMIT rswatch0 EXPORTING LIST TO MEMORY AND RETURN.
CALL FUNCTION 'LIST_FROM_MEMORY'
TABLES
listobject = lt_int_list.
CALL FUNCTION 'LIST_TO_ASCI'
EXPORTING
with_line_break = abap_true
IMPORTING
list_string_ascii = lt_string_list
TABLES
listobject = lt_int_list.
* remove the separators and the two header lines
DELETE lt_string_list WHERE table_line CO '-'.
DELETE lt_string_list INDEX 1.
DELETE lt_string_list INDEX 1.
* parse the individual lines
LOOP AT lt_string_list ASSIGNING <l_line>.
* If you're on a newer system, you can do this in a more elegant way using regular expressions
CONDENSE <l_line>.
SHIFT <l_line> LEFT DELETING LEADING '|'.
SHIFT <l_line> RIGHT DELETING TRAILING '|'.
SPLIT <l_line>+1 AT '|' INTO ls_directory-log_name ls_directory-phys_path.
APPEND ls_directory TO lt_directories.
ENDLOOP.
Try the following
data : dirname type DIRNAME_AL11.
CALL 'C_SAPGPARAM' ID 'NAME' FIELD 'DIR_DATA'
ID 'VALUE' FIELD dirname.
Alternatively, if you want to use your own parameters (AL11 -> Configure), you can read these out of table USER_DIR.
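A minimal sketch of that lookup, assuming USER_DIR has the fields ALIASS (the directory alias) and DIRNAME (the physical path); check the table definition in SE11 before relying on the exact field names, and 'Z_MY_DIR' is just a hypothetical alias:
DATA lv_dirname TYPE dirname_al11.

* Read the physical path for a directory alias maintained via AL11 -> Configure.
SELECT SINGLE dirname
  FROM user_dir
  INTO lv_dirname
  WHERE aliass = 'Z_MY_DIR'.

IF sy-subrc <> 0.
  WRITE: / 'Alias not found in USER_DIR.'.
ENDIF.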