Order by clause ignored in Crystal user-defined command - plsql

I am using the below query as a user-defined command in Crystal Reports 2013. It returns and orders properly in SQL Developer, but when I add it to my report the fields are either returned twice and/or in the wrong order. It should return rows ordered by ID ascending, e.g. 39-40-41-42, but instead returns 39-41-40-42... or 3939-4141-4040-4242, so there seems to be a pattern.
select ad.arinvt_id,
       ud.parent_id,
       listagg(ud.cuser, '') within group (order by ud.parent_id) as sfdt,
       listagg(ud.ud_cols_id, '') within group (order by ud.ud_cols_id) as uci
from arinvoice_detail ad
left join ud_data ud on ad.arinvt_id = ud.parent_id
where ad.arinvt_id = ud.parent_id
  and ud.ud_cols_id in (39, 40, 41, 42)
group by ad.arinvt_id, ud.parent_id
I haven't been able to find much; what I have found is for different platforms. Any help is much appreciated!
I saw this:
How to define a custom order in ORDER BY clause?
and tried changing it to
(order by field(xyz))
but Crystal wouldn't take that.
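Note that field() is a MySQL function, so an Oracle database (which listagg implies here) will reject it; a case expression inside within group (order by ...) gives the same kind of custom ordering. A rough, untested sketch only, e.g. to force an arbitrary order such as 40, 39, 42, 41:
listagg(ud.cuser, '') within group (
    order by case ud.ud_cols_id
                 when 40 then 1
                 when 39 then 2
                 when 42 then 3
                 when 41 then 4
             end
) as sfdt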

I was linking the table twice: once in the command, and again in Crystal. Not sure why it duplicated sometimes and not others, or why the order was skewed, but the command below works so far.
select
ud.parent_id,
listagg(ud.cuser, '') within group (order by ud.ud_cols_id) as sfdt
from ud_data ud
where ud.ud_cols_id in (39, 40, 41, 42)
group by ud.parent_id

Related

How to SELECT a single record in table X with the largest value for X.a WHERE values for fields X.b & X.c are specified

I am using the following query to obtain the current component serial number (tr_sim_sn) installed on the host device (tr_host_sn) from the most recent record in a transaction history table (PUB.tr_hist):
SELECT tr_sim_sn FROM PUB.tr_hist
WHERE tr_trnsactn_nbr = (SELECT max(tr_trnsactn_nbr)
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
AND tr_lot = '99524136'
AND tr_part = '6684112-001')
The actual table has ~190 million records. The excerpt below contains only a few sample records, and only the fields relevant to the search, to illustrate the query above:
tr_sim_sn |tr_host_sn* |tr_host_pn |tr_domain |tr_trnsactn_nbr |tr_qty_loc
_______________|____________|_______________|___________|________________|___________
... |
356136072015140|99524135 |6684112-000 |vattal_us |178415271 |-1.0000000000
356136072015458|99524136 |6684112-001 |vattal_us |178424418 |-1.0000000000
356136072015458|99524136 |6684112-001 |vattal_us |178628048 |1.0000000000
356136072015050|99524136 |6684112-001 |vattal_us |178628051 |-1.0000000000
356136072015836|99524137 |6684112-005 |vattal_us |178645337 |-1.0000000000
...
* = key field
The excerpt illustrates multiple occurrences of tr_trnsactn_nbr for a single value of tr_host_sn. The largest value for tr_trnsactn_nbr corresponds to the current tr_sim_sn installed within tr_host_sn.
This query works, but it is very slow: ~8 minutes.
I would appreciate suggestions to improve or refactor this query to speed it up.
Check with your admins to determine when they last updated the SQL statistics. If the answer is "we don't know" or "never", then you might want to ask them to run the following 4GL program, which will create a SQL script to accomplish that:
/* genUpdateSQL.p
*
* mpro dbName -p util/genUpdateSQL.p -param "tmp/updSQLstats.sql"
*
* sqlexp -user userName -password passWord -db dbName -S servicePort -infile tmp/updSQLstats.sql -outfile tmp/updSQLstats.log
*
*/
output to value( ( if session:parameter <> "" then session:parameter else "updSQLstats.sql" )).

for each _file no-lock where _hidden = no:
  put unformatted
    "UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR PUB."
    '"' _file._file-name '"' ";"
    skip.
  put unformatted "commit work;" skip.
end.

output close.
return.
This will generate a script that updates statistics for all tables and all indexes. You could edit the output to update only the tables and indexes that are part of this query if you want.
Also, if the admins are nervous they could, of course, try this on a test db or a restored backup before implementing in a production environment.
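For reference, every statement the program writes follows the same fixed template, so the generated updSQLstats.sql should contain pairs of lines roughly like the following (the table name here is just illustrative):
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS FOR PUB."tr_hist";
commit work;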
I am posting this as an answer to my own request for an improved query.
As it turns out, the following syntax includes two distinct changes that greatly improved the speed of the query. One is to include the tr_domain search criterion in both the main and nested portions of the query. The second is to narrow the search by increasing the number of search criteria, which in the following are all included in the nested section of the syntax:
SELECT tr_sim_sn
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
AND tr_trnsactn_nbr IN (
SELECT MAX(tr_trnsactn_nbr)
FROM PUB.tr_hist
WHERE tr_domain = 'vattal_us'
AND tr_part = '6684112-001'
AND tr_lot = '99524136'
AND tr_type = 'ISS-WO'
AND tr_qty_loc < 0)
This syntax results in ~0.5s response time. (credit to my colleague, Daniel V.)
To be fair, this query uses criteria outside the parameters stated in the original post, making it difficult or impossible for others to attempt a reasonable answer. This omission was not on purpose, of course, but rather due to my being fairly new to the fundamentals of good query design. This query is in part a result of learning that when too few or non-indexed fields are used as search criteria against a large table, it is sometimes helpful to narrow the search by increasing the number of criteria. The original had 3; this one has 5.

Join works in Azure SQL but fails with R DBI connection: The multi-part identifier could not be found

I have a query that works perfectly in SSMS, but when I run it in R using the DBI package, I receive several multi-part identifier errors: the multi-part identifier "rt.secondary_id" could not be bound, "rt.third_id" could not be bound, and "t2.important" could not be bound.
select t1.[main_id]
,rt.secondary_id
,rt.third_id
,t1.[date_col]
,t2.important
from t1
inner join rt on t1.main_id = rt.main_id
inner join t2 on rt.main_id = t2.main_id
inner join (select t1.main_id, max(t1.date_col) as upload_time from t1 group by t1.main_id) AS ag ON t1.main_id = ag.main_id AND t1.date_col = ag.upload_time
The unique identifier in t1 is the combination of main_id and date_col, and this query finds the most recent entry in t1 for a given main_id.
Not exactly sure if my query is structured poorly or if this is an R issue. I've tried adding SET NOCOUNT ON to the query based on what I thought might be related issues elsewhere on Stack Overflow, but no dice.
I found out what my issue was: a silly (but time-consuming) mistake on my part. Essentially, I was bringing my SQL query into R via paste(scan(...), collapse = " "). I had a -- comment in my SQL query, and once the query was collapsed onto a single line, everything after the -- was treated as part of the comment. Deleting the comment OR switching to /* ... */ comment syntax fixes the problem.
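To illustrate (the comment's exact position is an assumption on my part), after paste(scan(...), collapse = " ") flattens the file, the server effectively receives a single line like the one below, so the -- swallows the joins that follow it and the rt/t2 aliases are never defined:
select t1.[main_id], rt.secondary_id, rt.third_id, t1.[date_col], t2.important from t1 -- most recent row per main_id inner join rt on t1.main_id = rt.main_id inner join t2 on rt.main_id = t2.main_id ...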

SQLITE selecting distinct entries that are less than 1 minute old

I'm making a flight tracking map that will need to pull live data from an SQLite db. I'm currently just using the sqlite executable to navigate the db and understand how to interact with it. Each aircraft is identified by a unique hex_ident. I want to get a list of all aircraft that have sent out a signal in the last minute, as a way of identifying which aircraft are actually active right now. I tried
select distinct hex_ident, parsed_time
from squitters
where parsed_time >= Datetime('now','-1 minute')
I expected a list of only 4 or 5 hex_idents, but I'm just getting a list of every entry (today's entries only), and some are outside the 1-minute bound. I'm new to SQL, so I don't really know how to do this yet. Here's what each entry looks like. The table is called squitters.
{
"message_type":"MSG",
"transmission_type":8,
"session_id":"111",
"aircraft_id":"11111",
"hex_ident":"A1B4FE",
"flight_id":"111111",
"generated_date":"2021/02/12",
"generated_time":"14:50:42.403",
"logged_date":"2021/02/12",
"logged_time":"14:50:42.385",
"callsign":"",
"altitude":"",
"ground_speed":"",
"track":"",
"lat":"",
"lon":"",
"vertical_rate":"",
"squawk":"",
"alert":"",
"emergency":"",
"spi":"",
"is_on_ground":0,
"parsed_time":"2021-02-12T19:50:42.413746"
}
Any ideas?
You must either remove the 'T' from the value of parsed_time or apply datetime() to it as well so that the comparison works:
where datetime(parsed_time) >= datetime('now', '-1 minute')
Note that the datetime() function does not take fractional seconds into account, so if you need 100% accuracy you must append them yourself with concatenation:
where replace(parsed_time, 'T', ' ') >=
datetime('now', '-1 minute') || substr(parsed_time, instr(parsed_time, '.'))
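If the end goal is one row per active aircraft rather than one row per message, a grouped variant of the same comparison should work; this is only a sketch, assuming the column names shown in the sample entry above:
select hex_ident, max(parsed_time) as last_seen
from squitters
where datetime(parsed_time) >= datetime('now', '-1 minute')
group by hex_ident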

How to select TOP 1 results _without_ using LIMIT in SQLite?

I'm writing an SQLite select statement and want to pick out only the first hit that satisfies my criterion.
My problem is that I'm writing code inside a simulation framework that wraps my SQLite code before sending it to the database, and this wrapping already adds 'LIMIT 100' to the end of the code.
What I want to do:
SELECT x, y, z FROM myTable WHERE a = 0 ORDER BY y LIMIT 1
What happens when this simulation development framework has done its job:
SELECT x, y, z FROM myTable WHERE a = 0 ORDER BY y LIMIT 1 LIMIT 100
exec error near "LIMIT": syntax error
So my question is: how do I work around this limitation? Is there any way to still limit my results to a single hit even though the statement will end in 'LIMIT 100'? I'm thinking of something like creating a temporary table, adding an index, and filtering on that, but my knowledge is limited to simple database queries.
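One possible workaround, sketched here under the assumption that the framework simply appends its LIMIT 100 to whatever text you hand it, is to nest the real query in a subquery so your own LIMIT 1 applies inside it and the appended LIMIT 100 lands harmlessly on the outer SELECT:
SELECT x, y, z
FROM (SELECT x, y, z FROM myTable WHERE a = 0 ORDER BY y LIMIT 1)
After the framework appends its clause, the statement ends in "... LIMIT 1) LIMIT 100", which is still valid SQLite and still returns at most one row.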

Unexpected backwards incompatibility in SQLite

I have a dev environment running SQLite 3.7.16.2 and a production environment running SQLite 3.7.9, and I am running into some unexpected backwards incompatibility.
I have a table that looks like this:
sqlite> select * from calls;
ID|calldate|calltype
1|2013-10-01|monthly
1|2013-11-01|3 month
1|2013-12-01|monthly
2|2013-07-11|monthly
2|2013-08-11|monthly
2|2013-09-11|3 month
2|2013-10-11|monthly
2|2013-11-11|monthly
3|2013-04-22|monthly
3|2013-05-22|monthly
3|2013-06-22|3 month
3|2013-07-22|monthly
4|2013-10-04|monthly
4|2013-11-04|3 month
4|2013-12-04|monthly
5|2013-10-28|monthly
5|2013-11-28|monthly
With the newer version of sqlite (3.7.16.2) I can use this:
SELECT ID, MIN(calldate), calltype FROM calls WHERE calldate > date('NOW') GROUP BY ID;
which gives me:
ID|MIN(calldate)|calltype
1|2013-11-01|3 month
2|2013-11-11|monthly
4|2013-11-04|3 month
5|2013-10-28|monthly
However when I run this same code on the older version of sqlite (3.7.9) I get this:
ID|MIN(calldate)|calltype
1|2013-11-01|monthly
2|2013-11-11|monthly
4|2013-11-04|monthly
5|2013-10-28|monthly
I looked through the changes here, but could not figure out why this is still happening. Any suggestions on how to work around this or how to rewrite my query?
You are using an extension that was added in SQLite 3.7.11.
In standard SQL, it is not allowed to use columns that neither appear in the GROUP BY clause nor are wrapped in an aggregate function.
(SQLite accepts this silently for compatibility with MySQL, but returns the data from some random record in the group.)
To get the other columns from the record with the minimum value, you have to find the minimum value for each group first, and then join these results back to the original table:
SELECT calls.ID,
calls.calldate,
calls.calltype
FROM calls
JOIN (SELECT ID,
MIN(calldate) AS calldate
FROM calls
WHERE calldate > date('now')
GROUP BY ID
) AS earliest
ON calls.ID = earliest.ID AND
calls.calldate = earliest.calldate
