How to SELECT a single record in table X with the largest value for X.a WHERE values for fields X.b & X.c are specified (OpenEdge)

I am using the following query to obtain the current component serial number (tr_sim_sn) installed on the host device (tr_host_sn) from the most recent record in a transaction history table (PUB.tr_hist):
SELECT tr_sim_sn
FROM PUB.tr_hist
WHERE tr_trnsactn_nbr = (SELECT MAX(tr_trnsactn_nbr)
                         FROM PUB.tr_hist
                         WHERE tr_domain = 'vattal_us'
                           AND tr_lot = '99524136'
                           AND tr_part = '6684112-001')
The actual table has ~190 million records. The excerpt below contains only a few sample records, and only the fields relevant to the search, to illustrate the query above:
tr_sim_sn      |tr_host_sn*|tr_host_pn |tr_domain|tr_trnsactn_nbr|tr_qty_loc
_______________|___________|___________|_________|_______________|_____________
...
356136072015140|99524135   |6684112-000|vattal_us|178415271      |-1.0000000000
356136072015458|99524136   |6684112-001|vattal_us|178424418      |-1.0000000000
356136072015458|99524136   |6684112-001|vattal_us|178628048      | 1.0000000000
356136072015050|99524136   |6684112-001|vattal_us|178628051      |-1.0000000000
356136072015836|99524137   |6684112-005|vattal_us|178645337      |-1.0000000000
...
* = key field
The excerpt illustrates multiple occurrences of tr_trnsactn_nbr for a single value of tr_host_sn. The largest value for tr_trnsactn_nbr corresponds to the current tr_sim_sn installed within tr_host_sn.
This query works, but it is very slow, taking ~8 minutes.
I would appreciate suggestions to improve or refactor this query to improve its speed.

Check with your admins to determine when they last updated the SQL statistics. If the answer is "we don't know" or "never" then you might want to ask them to run the following 4GL program, which will create a SQL script to accomplish that:
/* genUpdateSQL.p
 *
 * mpro dbName -p util/genUpdateSQL.p -param "tmp/updSQLstats.sql"
 *
 * sqlexp -user userName -password passWord -db dbName -S servicePort -infile tmp/updSQLstats.sql -outfile tmp/updSQLstats.log
 *
 */

output to value( ( if session:parameter <> "" then session:parameter else "updSQLstats.sql" )).

for each _file no-lock where _hidden = no:

  put unformatted
    "UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR PUB."
    '"' _file._file-name '"' ";"
    skip.

  put unformatted "commit work;" skip.

end.

output close.

return.
This will generate a script that updates statistics for all tables and all indexes. You could edit the output to only update the tables and indexes that are part of this query if you want.
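For example, for the table in this question, the generated script will contain a pair of lines like:
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR PUB."tr_hist";
commit work;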
Also, if the admins are nervous they could, of course, try this on a test db or a restored backup before implementing in a production environment.

I am posting this as a response to my own request for an improved query.
As it turns out, the following query incorporates two distinct changes that greatly improved its speed. One is to include the tr_domain search criterion in both the main and nested portions of the query. The second is to narrow the search by increasing the number of search criteria, which in the following are all included in the nested section:
SELECT tr_sim_sn
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
  AND tr_trnsactn_nbr IN (
      SELECT MAX(tr_trnsactn_nbr)
      FROM PUB.tr_hist
      WHERE tr_domain = 'vattal_us'
        AND tr_part = '6684112-001'
        AND tr_lot = '99524136'
        AND tr_type = 'ISS-WO'
        AND tr_qty_loc < 0)
This syntax results in ~0.5s response time. (credit to my colleague, Daniel V.)
To be fair, this query uses criteria beyond the parameters stated in the original post, making it difficult, if not impossible, for others to attempt a reasonable answer. This omission was not deliberate, of course; rather, it was due to my being fairly new to the fundamentals of good query design. This query is partly the result of learning that when too few fields, or non-indexed fields, are used as search criteria against a large table, it is sometimes helpful to narrow the search by increasing the number of search criteria. The original had three; this one has five.

Sqoop trying to --split-by ROWID (Oracle) fails

(Be kind, this is my first question, and I did extensive research here and on the net beforehand. The question Oracle ROWID for Sqoop Split-By Column did not really solve this issue, as the original asker resorted to using another column.)
I am using sqoop to copy data from an Oracle 11 DB.
Unfortunately, some tables have no index and no primary key, only partitions (by date). These tables are very large, hundreds of millions if not billions of rows.
So far, I have decided to access data in the source by explicitly addressing the partitions. That works well and speeds up the process nicely.
I need to do the splits by data that resides in each and every table, in order to avoid too many if-branches in my bash script (we're talking some 200+ tables here).
I noticed that a split by 8 tasks results in a very uneven spread of workload among the tasks. I considered using Oracle ROWID to define the split.
To do this, I must define a boundary query. In a standard query 'select * from xyz' the ROWID is not part of the result set. Therefore, it is not an option to let Sqoop define the boundary query from --query.
Now, when I run this, I get the error:
ERROR tool.ImportTool: Encountered IOException running import job:
java.io.IOException: Sqoop does not have the splitter for the given SQL
data type. Please use either different split column (argument --split-by)
or lower the number of mappers to 1. Unknown SQL data type: -8
Samples of ROWID:
AAJXFWAKPAAOqqKAAA
AAJXFWAKPAAOqqKAA+
AAJXFWAKPAAOqqKAA/
It is static and unique once it is created for any row.
So I cast this funny datatype into something else in my boundary query:
sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
  --connect jdbc:oracle:thin:@127.0.0.1:port:mydb --username $USER --P --m 8 \
  --split-by ROWID \
  --boundary-query "select cast(min(ROWID) as varchar(18)), cast(max(ROWID) as varchar(18)) from table where laufbzdt > TO_DATE('2019-02-27', 'YYYY-MM-DD')" \
  --query "select * from table where laufbzdt > TO_DATE('2019-02-27', 'YYYY-MM-DD') and \$CONDITIONS" \
  --null-string '\\N' \
  --null-non-string '\\N'
But then I get ugly ROWIDs that are rejected by Oracle:
select * from table where laufbzdt > TO_DATE('2019-02-27', 'YYYY-MM-DD')
and ( ROWID >= 'AAJX6oAG聕聁AE聉N:' ) AND ( ROWID < 'AAJX6oAH⁖⁁AD䁔䀷' ) ,
Error Msg = ORA-01410: invalid ROWID
How can I resolve this properly?
I am a Linux embryo and have painfully chewed myself through the topics of bash shell scripting and Sqooping so far, but I would like to make better use of an evenly spread mapper-task workload - it would cut the sqoop time in half, I guess, saving some 5 to 8 hours.
TIA!
wahlium
You can try ROWNUM, but I think sqoop import does not work with pseudocolumns.
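For what it's worth, a ROWNUM-based boundary query would look something like the sketch below (the table name and filter are the placeholders from the question). This is untested, and as noted above sqoop will likely still reject a pseudocolumn as the --split-by column, since the numbering would also have to be reproduced in the mappers' --query:
-- hypothetical ROWNUM boundary query: numbers the filtered rows and
-- returns the smallest and largest number (effectively 1 and COUNT(*))
select min(rn), max(rn)
from (select ROWNUM as rn
      from table
      where laufbzdt > TO_DATE('2019-02-27', 'YYYY-MM-DD'));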

sqlite-net-pcl full text rank function

I'm creating a Xamarin.Forms application, and we use sqlite-net-pcl by Frank A. Krueger. It is supposed to support full text searching, which I am trying to implement.
Now, full text search seems to work. I created a query like:
SELECT * FROM Document d JOIN (
    SELECT document_id
    FROM SearchDocument
    WHERE SearchDocument MATCH 'test*'
) AS ranktable USING (document_id)
which seems to work fine. However, I'd like to return the results in order of their rank, otherwise the result is useless. According to the documentation (https://www.sqlite.org/fts3.html), the syntax should be:
SELECT * FROM Document d JOIN (
    SELECT document_id, rank(matchinfo(SearchDocument)) AS rank
    FROM SearchDocument
    WHERE SearchDocument MATCH 'test*'
) AS ranktable USING (document_id)
ORDER BY ranktable.rank
However, the engine doesn't seem to know the "rank" function:
[ERROR] FATAL UNHANDLED EXCEPTION: SQLite.SQLiteException: no such function: rank
It does know the "matchinfo" function though.
Can anyone tell me what I'm doing wrong?
Edit: After some more searching, it seems that the rank function is simply not implemented in the library. I'm confused. How can people use full-text search without caring about the order of the results? Is there some other way of ordering the results so that the most relevant results are at the top?
sqlite-net-pcl depends on SQLitePCLRaw.bundle_green for the native SQLite build, so what is available is determined by that bundle. It's worth looking into that.
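Worth knowing: for FTS3/FTS4, the rank function shown in the SQLite documentation is only example code that the application is expected to implement and register itself; it is not built into SQLite, which is why the error above occurs. If the native SQLite in the bundle is new enough to include FTS5, one alternative is to recreate SearchDocument as an FTS5 table and use FTS5's built-in rank column. A sketch, assuming an FTS5 table:
-- hypothetical FTS5 variant: FTS5 exposes a built-in "rank" column
-- (BM25 by default), so no user-defined rank function is required
SELECT * FROM Document d JOIN (
    SELECT document_id, rank
    FROM SearchDocument
    WHERE SearchDocument MATCH 'test*'
) AS ranktable USING (document_id)
ORDER BY ranktable.rank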

Documentum Document Content Share Location from SQL

I am trying to sort out how to find the physical location of a file on a mapped documentum share. There are several ways to do it using the API or DQL, but neither of those will scale to what we need to migrate data out of the system. Ultimately the plan is to migrate all data out and into a new system, but we need the file locations to plan this out.
The following resources have been helpful:
https://robineast.wordpress.com/2007/01/24/where-is-my-content-stored/
https://community.emc.com/thread/51958?start=0&tstart=0
Running this DQL will give us the location, but the SQL provided does not return any data relevant to what we're trying to accomplish (or anything at all).
execute GET_PATH for '<parent_id_goes_here>'
Result:
t:\documentum\data\schema\storage_volume_number\00000000\80\01\ef\63.xlsx
Additionally, using the API with getpath returns valid data, but when choosing to show the SQL it gives the same query (a little further down), which doesn't actually give the location of the file.
API>getpath,c,<r_object_id>
...
t:\documentum\data\schema\storage_volume_number\00000000\80\01\ef\63.xlsx
This is the query provided by both when you choose 'Show the SQL':
select a.r_object_id, b.audit_attr_names, a.is_audittrail,
a.event, a.controlling_app, a.policy_id,
a.policy_state, a.user_name, a.message,
a.audit_subtypes, a.priority, a.oneshot,
a.sendmail, a.sign_audit
from dmi_registry_s a, dmi_registry_r b
where a.r_object_id = b.r_object_id and a.registered_id = :p0 and (a.event = 'all' or a.event = 'dm_all' or a.event = :p1)
order by a.is_audittrail desc, a.event desc,
a.r_object_id, b.i_position desc;
:p0 = < parent_id >;
:p1 = dm_getfile
The above query returns nothing in PL/SQL, and removing the :p0/:p1 variables just returns audit data.
Any guidance on how to get this using SQL, or a DQL script that could be written to give the path and r_object_id in a CSV to join? I'm also open to other ideas of pulling data out of this system.
After a lot of digging I found that the best way to go about this is to convert the data ticket into your path. To quote the articles linked in the question:
The trick to determining the path to the content is in decoding the data_ticket's 2's complement decimal value. Convert the data_ticket to a 2's complement hexadecimal number by first adding 2^32 to the number and then converting it to hex. You can use a scientific calculator to do this or grab some Java code off the net.
-2147474649 + 2^32 = (-2147474649 + 4294967296) = 2147492647
converting 2147492647 to hex = 80002327
Now, split the hex value of the data_ticket at every two characters, append it to file_system_path and docbase_id (padded to 8 digits), and add the dos_extension. Voilà! You have the complete path to the content file.
C:/Documentum/data/docbase/content_storage_01/0000001/80/00/23/27.txt
This PowerShell code will do the conversion for you; just feed it the data ticket.
$Ticket = -2147474649                                            # the data_ticket value
$FSTicketInt = $Ticket + [math]::Pow(2, 32)                      # add 2^32 (two's complement)
$FSTicketHex = [Convert]::ToString([long]$FSTicketInt, 16)       # convert to hex: 80002327
$FSTicketPath = ($FSTicketHex -split '(..)' | ? {$_}) -join '\'  # split into pairs: 80\00\23\27
Then all you need to do is join the path with the content storage location using [System.IO.Path]::Combine().
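If you need this in bulk for migration planning, the same conversion can also be pushed down into SQL. The sketch below runs against the underlying Oracle schema; the table and column names (dmr_content_s, dmr_content_r, data_ticket, parent_id) are from memory of the Documentum object model rather than from the articles above, so treat them as assumptions and verify them against your repository:
-- hypothetical bulk extract: one row per content/parent pair, with the
-- data_ticket converted to its two's complement hex form (e.g. 80002327);
-- verify table and column names before relying on this
SELECT r.parent_id,
       c.data_ticket,
       TO_CHAR(c.data_ticket + 4294967296, 'FM0XXXXXXX') AS ticket_hex
FROM   dmr_content_s c
JOIN   dmr_content_r r ON r.r_object_id = c.r_object_id;
The hex string then splits into path segments exactly as in the PowerShell above.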

Appropriate query required to find the result quickly

I need the desired result with the least execution time.
I have a table which contains many rows (over 100k); in this table there is a field notes varchar2(1800).
It contains values like the following:
notes
CASE Transfer
Surnames AAA : BBBB
Case Status ACCOUNT TXFERRED TO BORROWERS
Completed Date 25/09/2022
Task Group 16
Message sent at 12/10/2012 11:11:21
Sender : lynxfailures123@google.com
Recipient : LFRB568767@yahoo.com
Received : 21:31 12/12/2002
Rows should be returned that contain the value (ACCOUNT TXFERRED TO BORROWERS).
I have used the following queries, but they take a long time (72150436 sec) to execute:
Select * from cps_case_history where (dbms_lob.instr(notes, 'ACCOUNT TFR TO UFSS') > 1)

Select * from cps_case_history where notes like '%ACCOUNT TFR TO UFSS%'
Could you please suggest a query which will take less time to execute?
You can try parallel hints (see Optimizer hints):
Select /*+ PARALLEL(a,8) */ a.* from cps_case_history a
where INSTR(NOTES,'Text you want to search') > 0; -- your condition
Replace 8 with 16 and see if the performance improves further.
Avoid % at the beginning of the LIKE pattern, i.e., where notes like '%Account...', since a leading wildcard prevents an index on the column from being used.
Updated answer: Try creating partitioned tables. You can go with range partitioning on the completed_date column (see Partitioning).
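A minimal sketch of that suggestion, assuming the table really has a completed_date DATE column (the column only appears inside the notes text in the question, so this is an assumption), using monthly interval partitioning:
-- hypothetical monthly interval partitioning on completed_date;
-- column, table, and partition names are assumptions, not from the question
CREATE TABLE cps_case_history_part (
    case_id        NUMBER,
    notes          VARCHAR2(1800),
    completed_date DATE
)
PARTITION BY RANGE (completed_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION p_initial VALUES LESS THAN (DATE '2012-01-01'));
Note that partitioning only helps here if the query also filters on completed_date; a LIKE predicate on notes alone will still scan every partition.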

How to programmatically get the file path from a directory parameter in ABAP?

The transaction AL11 returns a mapping of "directory parameters" to file paths on the application server, AFAIK.
The trouble with transaction AL11 is that its program only calls C modules; there's almost no trace of SELECT statements or function calls to analyze there.
I want the ability to do this dynamically, in my code, like for instance a function module that took "DATA_DIR" as input and "E:\usr\sap\IDS\DVEBMGS00\data" as output.
This thread is about a similar topic, but it doesn't help.
Some other guy has the same problem, and he explains it quite well here.
I strongly suspect that the only way to get these values is through the kernel directly. Some of them can vary depending on the application server, so you probably won't be able to find them in the database. You could try this:
TYPE-POOLS abap.

TYPES: BEGIN OF t_directory,
         log_name  TYPE dirprofilenames,
         phys_path TYPE dirname_al11,
       END OF t_directory.

DATA: lt_int_list    TYPE TABLE OF abaplist,
      lt_string_list TYPE list_string_table,
      lt_directories TYPE TABLE OF t_directory,
      ls_directory   TYPE t_directory.

FIELD-SYMBOLS: <l_line> TYPE string.

START-OF-SELECTION. " or wherever you need it - FORM, METHOD, ...

* get the output of the program as string table
SUBMIT rswatch0 EXPORTING LIST TO MEMORY AND RETURN.
CALL FUNCTION 'LIST_FROM_MEMORY'
  TABLES
    listobject = lt_int_list.
CALL FUNCTION 'LIST_TO_ASCI'
  EXPORTING
    with_line_break   = abap_true
  IMPORTING
    list_string_ascii = lt_string_list
  TABLES
    listobject        = lt_int_list.

* remove the separators and the two header lines
DELETE lt_string_list WHERE table_line CO '-'.
DELETE lt_string_list INDEX 1.
DELETE lt_string_list INDEX 1.

* parse the individual lines
LOOP AT lt_string_list ASSIGNING <l_line>.
* if you're on a newer system, you can do this in a more elegant way using regular expressions
  CONDENSE <l_line>.
  SHIFT <l_line> LEFT DELETING LEADING '|'.
  SHIFT <l_line> RIGHT DELETING TRAILING '|'.
  SPLIT <l_line>+1 AT '|' INTO ls_directory-log_name ls_directory-phys_path.
  APPEND ls_directory TO lt_directories.
ENDLOOP.
Try the following:
DATA dirname TYPE dirname_al11.

CALL 'C_SAPGPARAM' ID 'NAME'  FIELD 'DIR_DATA'
                   ID 'VALUE' FIELD dirname.
Alternatively, if you wanted to use your own parameters (AL11 -> Configure), then read these out of table USER_DIR.
