FOR EACH, FOR FIRST - OpenEdge

What is the meaning of FOR EACH and FOR FIRST? Example below:
FOR EACH <db> NO-LOCK,
FIRST <db> OF <db> NO-LOCK:
DISPLAY ..
Also, why do we need to use NO-LOCK for every table, every time?

Let's answer by giving an example based on the Progress demo DB:
FOR EACH Customer WHERE Customer.Country = "USA" NO-LOCK,
FIRST Salesrep WHERE Salesrep.SalesRep = Customer.SalesRep NO-LOCK:
/* your code block */
END.
The FOR EACH block is an iterating block (a loop) that integrates data access (and a few more features, like error handling and frame scoping, if you want to go that far back).
So the code in "your code block" is executed for every Customer record matching the criteria, and it also fetches the matching Salesrep records. The join between Customer and Salesrep is an inner join, so you'll only be processing Customers for which the Salesrep exists as well.

FOR statement documentation (includes EACH and FIRST keywords)
NO-LOCK documentation
Google is your friend, and the product documentation is usually quite user-friendly.
Try not to ask questions that can be solved by a simple search on StackOverflow.

FOR EACH table
Selects a set of records and starts a block to process those records.
NO-LOCK means what it says, the records are retrieved from the database without any record locking. So you might get a "dirty read" (uncommitted data) and someone else might change the data while you are looking at that record.
That sounds awful but, in reality, NO-LOCK reads are almost always what you want to use. If you do need to update a NO-LOCK record you can just FIND CURRENT with a lock.
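For example, a minimal sketch of that pattern (sports2000 table and field names; the threshold and the amount are made up):
FOR EACH Customer NO-LOCK WHERE Customer.Balance > 10000:
    /* upgrade to a real lock only for the update */
    FIND CURRENT Customer EXCLUSIVE-LOCK.
    Customer.CreditLimit = Customer.CreditLimit + 500.
    /* downgrade again so no lock is held while iterating */
    FIND CURRENT Customer NO-LOCK.
END.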
FOR EACH NO-LOCK can return large numbers of records in a single network message whereas the other lock types are one record at a time - this makes NO-LOCK quite a bit faster for many purposes. And even without the performance argument you probably don't want to be taking out large numbers of locks and preventing other users running inquiries all the time.
Your example lacks a WHERE clause so, by default, every record in the table is returned using the primary index. If you specify a WHERE clause you will potentially only have a subset of the data to loop through and the index selection may be impacted. You can also add a lot of other options like BY to specify sort order.
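For instance, a hedged sports2000 sketch where the WHERE narrows the result set and BY controls the sort:
FOR EACH Customer NO-LOCK WHERE Customer.Country = "USA" BY Customer.Name:
    DISPLAY Customer.CustNum Customer.Name.
END.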
FOR FIRST is somewhat similar to FOR EACH except that you only return, at most, a single record. Even if the WHERE clause is empty or would otherwise specify a larger result set. BUT BE CAREFUL - the "FIRST" is deceptive. Even if you specify a sort order using BY, the rule is "selection, then sorting". At most one record gets selected, so the BY doesn't matter. The index dictated by the WHERE (or lack of a WHERE) determines the sort order. So if you request something like:
FOR FIRST customer NO-LOCK BY discount:
DISPLAY custNum name discount.
END.
You will fetch customer #1, not customer #41 as you might have expected. (Try the code above with the sports2000 database. Replace FIRST with EACH in a second run.)
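If you really do want the first record in BY order, one workaround (a sketch, again using sports2000) is to sort with FOR EACH and leave after one iteration:
FOR EACH Customer NO-LOCK BY Customer.Discount:
    DISPLAY Customer.CustNum Customer.Name Customer.Discount.
    LEAVE. /* stop after the first record in sorted order */
END.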
FOR EACH table1 NO-LOCK,
FIRST table2 NO-LOCK OF table1:
or
FOR EACH customer NO-LOCK,
FIRST salesRep NO-LOCK OF customer:
DISPLAY custnum name customer.salesRep.
END.
This is a join. The OF is a shortcut telling the compiler to find fields that the two tables have in common and build an implied WHERE clause from them. This is one of those "makes a nice demo" features that you don't want to use in real code. It obfuscates the relationship between the tables and makes your code much harder to follow. Don't do that. Instead, write out the complete WHERE clause. Perhaps like this:
for each customer no-lock,
first salesRep no-lock where salesRep.salesRep = customer.salesRep:
display custnum name customer.salesRep.
end.

Related

What if I don't use index fields in my program?

I am a beginner with Progress 4GL. I am confused by the following logic, especially how the index actually works.
I have added 2 fields to one index. As you can see below, I have written three queries.
Query 1 uses the index and finds data using both fields.
Query 2 uses the same index but finds data using only one field.
Query 3 uses the same index fields plus one non-indexed field.
define temp-table tt_creldata no-undo
field tt_cscx_order as character
field tt_cscx_part as character
field tt_cscx_shipfrom as character
index tt_cscx
tt_cscx_order
tt_cscx_part
.
Query 1:
find first tt_creldata use-index tt_cscx
where tt_cscx_order = "153"
and tt_cscx_part = "113" no-lock no-error.
Query 2:
find first tt_creldata use-index tt_cscx
where tt_cscx_order = "153" no-lock no-error.
Query 3:
find first tt_creldata use-index tt_cscx
where tt_cscx_order = "153"
and tt_cscx_part = "113"
and tt_cscx_shipfrom = "US" no-lock no-error.
Question 1: Which query improves performance?
Question 2: What happens if I don't use one of the indexed fields when I specify USE-INDEX?
Question 3: What happens if I add a non-indexed field when I specify USE-INDEX?
As a general rule of thumb, you should never use use-index.
The AVM will select one or more indexes to use for a query at compile time, and by forcing it to use one of your choosing, you are removing the possibility of this.
Having extra, possibly non-indexed, fields in your WHERE clause will only affect the indexes chosen if you let the AVM choose (i.e. don't use use-index). This is also true if you don't use indexed fields in your query.
You can see which indexes are used if you compile the program with the xref or xml-xref options and look for the SEARCH items.
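For example (a minimal sketch; the program and output file names are hypothetical):
COMPILE myquery.p XREF myquery.xref.
/* then look for SEARCH entries in myquery.xref to see which indexes were chosen */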
As nwahmaet says, you should never use USE-INDEX. In this case it is especially pointless because there is only one index. In cases where there are multiple indexes a FIND statement will only use one of them no matter how complex the WHERE clause but the compiler will almost always do a better job picking an efficient index than you will. (The FOR EACH statement and its associated dynamic queries are capable of using multiple indexes. FIND is always limited to just one index.) In those rare cases where you think you are doing a better job you should thoroughly document why your choice is better and include detailed test cases and results.
All of your queries are using FIRST. This is necessary because your index is not defined as unique. That may be your intent but it seems unusual. And it means that in the event of duplicate records with the same key values you are magically making the "first" record more special than the others. Which is a data normalization faux pas (you are making "firstness" an attribute of the data) and a bug waiting to happen.
FIND FIRST and USE-INDEX are often used together to (try to) cover up for each other's deficiencies. By specifying a particular index the FIRST becomes more consistent. Likewise, FIRST is often used to "cure" performance issues that arise from insufficient index definitions, inadequate WHERE clauses or choosing FIND when FOR EACH would have been more appropriate.
None of these queries are going to perform notably faster than the others.
Query 2 may, or may not, return the same record as query 1. For instance, if there is a part = "112" then query 2 will have a different "first" record. But it will be just as fast to return as query 1.
Likewise, query 3 may have a different result depending on which records contain shipfrom = "US". In the best case, where the very first record with order = "153" and part = "113" also satisfies shipfrom = "US", it will be the same speed as the others.
However, query 3 might be a lot slower depending on how many records have to be scanned before one is found that has shipfrom = "US" since that field is not a part of any index and matching it will, therefore, require scanning records until one is found which matches. That might be the first record or it might be the 10 zillionth.

How to put a part of code as a string in a table to use it in a procedure?

I'm trying to resolve the issue below:
I need to prepare a table that consists of 3 columns:
user_id,
month,
value.
Each of over 200 users has different values of the parameters that determine the expected value: LOB, CHANNEL, SUBSIDIARY. So I decided to store them in the table ASYSTENT_GOALS_SET. But I wanted to avoid multiplying rows, and thought it would be nice to put all the conditions in as a part of the code that I would use in a "where" clause further on in the procedure.
So, as an example, instead of multiple rows:
I created an entry like this:
So far I have created a testing table ASYSTENT_TEST (where I collect month and value for a certain user). I wrote a piece of the procedure where I used BULK COLLECT.
declare
type test_row is record
(
month NUMBER,
value NUMBER
);
type test_tab is table of test_row;
BULK_COLLECTOR test_tab;
p_lob varchar2(10) :='GOSP';
p_sub varchar2(14);
p_ch varchar2(10) :='BR';
begin
select subsidiary into p_sub from ASYSTENT_GOALS_SET where user_id='40001001';
execute immediate 'select mc, sum(ppln_wartosc) plan from prod_nonlife.mis_report_plans
where report_id = (select to_number(value) from prod_nonlife.view_parameters where view_name=''MIS'' and parameter_name=''MAX_REPORT_ID'')
and year=2017
and month between 7 and 9
and ppln_jsta_symbol in (:subsidiary)
and dcs_group in (:lob)
and kanal in (:channel)
group by month order by month' bulk collect into BULK_COLLECTOR
using p_sub,p_lob,p_ch;
forall x in BULK_COLLECTOR.first..BULK_COLLECTOR.last insert into ASYSTENT_TEST values BULK_COLLECTOR(x);
end;
So now, when the SUBSIDIARY column (varchar) in table ASYSTENT_GOALS_SET contains the string 12_00_00 (which is the code of one of the subsidiaries), everything works fine. But the problem is when a user works in two subsidiaries, let's say 12_00_00 and 13_00_00. I have no clue how to write it down. Should the SUBSIDIARY column contain:
'12_00_00','13_00_00'
or
"12_00_00","13_00_00"
or maybe
12_00_00','13_00_00
I have tried a lot of options after digging into topics like "Dealing with single/escaping/double quotes".
Maybe I should change something in execute immediate as well?
Or maybe my approach to that issue is completely wrong from the very beginning (hopefully not :) ).
I would be grateful for support.
I didn't create the table function described here, but that article inspired me to go back and try the regexp_substr function again.
I changed: ppln_jsta_symbol in (:subsidiary) to
ppln_jsta_symbol in (select regexp_substr((select subsidiary from ASYSTENT_GOALS_SET where user_id=''fake_num''),''[^,]+'', 1, level) from dual
connect by regexp_substr((select subsidiary from ASYSTENT_GOALS_SET where user_id=''fake_num''), ''[^,]+'', 1, level) is not null)
Now it works like a charm! Thank you @Dessma very much for your time and suggestion!
"I wanted to avoid multiplying rows and thought it would be nice to put all conditions as a part of the code that I would use in 'where' clause further in procedure"
This seems a misguided requirement. You shouldn't worry about the number of rows: databases are optimized for storing and retrieving rows.
What they are not good at is dealing with "multi-value" columns. As your own solution proves, it is not nice, it is very far from nice, in fact it is a total pain in the neck. From now on, every time anybody needs to work with subsidiary they will have to invoke a function. Adding, changing or removing a user's subsidiary is much harder than it ought to be. Also there is no chance of enforcing data integrity i.e. validating that a subsidiary is valid against a reference table.
Maybe none of this matters to you. But there are very good reasons why Codd mandated "no repeating groups" as a criterion of First Normal Form, the foundation step of building a sound data model.
The correct solution, industry best practice for almost forty years, would be to recognise that SUBSIDIARY exists at a different granularity to CHANNEL and so should be stored in a separate table.
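A minimal sketch of that design (the table and column names are assumptions, not from the original schema):
CREATE TABLE asystent_user_subsidiary (
    user_id    VARCHAR2(10) NOT NULL,
    subsidiary VARCHAR2(14) NOT NULL,
    CONSTRAINT pk_asystent_user_sub PRIMARY KEY (user_id, subsidiary)
);
The dynamic SQL can then use a plain subquery instead of parsing a delimited string, e.g. ppln_jsta_symbol in (select subsidiary from asystent_user_subsidiary where user_id = :uid), with no quoting gymnastics at all.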

CustTableListPage filtering is too slow

When I try to filter the CustAccount field on CustTableListPage it takes too long to filter. On the other fields there is no latency. I'm trying to filter on just part of an account number, like "*123".
I have done reindexing for CustTable and also updated statistics, but no appreciable difference at all.
When I added the list page's query to a view, it filtered the CustAccount field normally, like the other fields.
Any suggestions?
Edit:
Our version is AX 2012 R2 CU8. It is not a user-specific problem; it occurs for every user. The interaction class has some customizations, but just for setting some buttons' enable/disable props, etc. I tried to look at the query execution; what I found is not clear, something like FETCH_API_CURSOR_000000..x
Record a trace of this execution and locate the bottleneck.
Keep in mind that wildcards (such as *) have to be used with care. Using a filter string that starts with a wildcard kills all performance because the SQL indexes cannot be used.
Using a wildcard at the end
Imagine that you have a dictionary and have to list all the words starting with 'Foo'. You can skip all entries before 'F', then all those before 'Fo', then all those before 'Foo', and start your result list from there.
Similarly, asking the underlying SQL engine to list all CustAccount entries starting with '123' (= filter string '123*') allows using an index on CustAccount to quickly skip to the relevant data.
Using a wildcard at the start
Imagine that you still have that dictionary and have to list all the words ending with 'ing'. You would have no other choice than going through the entire dictionary and checking the ending of every word (due to the alphabetical sorting).
This explains why asking the SQL engine to list all CustAccount entries ending with '123' (= filter string '*123') means that all CustAccount values must be investigated. So the AOS loops through all the entries and uses an SQL cursor to do this. That is the FETCH_API_CURSOR statement you see on the SQL level.
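In SQL terms, the difference looks roughly like this (an illustrative sketch; CUSTTABLE and ACCOUNTNUM are the standard AX names, but the queries are simplified):
SELECT ACCOUNTNUM FROM CUSTTABLE WHERE ACCOUNTNUM LIKE '123%' -- can seek the index on ACCOUNTNUM
SELECT ACCOUNTNUM FROM CUSTTABLE WHERE ACCOUNTNUM LIKE '%123' -- must examine every row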
Possible solutions
Educate your end user that using a wildcard at the beginning of a filter string will always be slow on a large table.
Step up the SQL server hardware / allocated resources (faster CPU, more RAM, faster disk, ...).
Create a full text index on CustAccount (not a fan of this one and performance impact should be thoroughly investigated).
I've solved the problem. The CustTableListPage query had a sort over the DirPartyTable.Name field. When I removed this sort, filtering with a wildcard worked like a charm.

ASP.NET / SQL drop-down list sort order

I am trying to correct the sort order of my ASP.NET drop down list.
The problem I have is that I need to select a distinct Serial number and have these numbers organised by DateTime Desc.
However I cannot ORDER BY DateTime if using DISTINCT without selecting the DateTime field in my query.
However if I select DateTime this selects every data value associated with a single Serial number and results in duplications.
The purpose of my page is to display data for ALL Serials, or data associated to one serial. When a new cycle begins (because it is a new production run) the Serial reverts to 1. So I cannot simply organise by serial number either.
When I use the following SQL statement the list box is in the order I require but after a period of time (usually a few hours) the order changes and appears to have no organised structure.
[screenshot of the SQL statement, originally hosted at http://img7.imageshack.us/i/captureky.jpg/]
I'm fairly new to ASP.NET / SQL. Does anyone know of a solution to my problem?
If you have multiple date times for each serial number, then which do you want to use for ordering? If the most recent, try this:
SELECT SerialNumber,
MAX(DateTimeField)
FROM Table
GROUP BY SerialNumber
ORDER BY 2 DESC
I don't know if everybody agrees with this, but when I see a DISTINCT in a query, the first thought that goes through my mind is "this is wrong". Generally, DISTINCT is not necessary, and it's used when the person writing the query doesn't know very well what he is doing; this might be the case since you said you are new to SQL.
Without complete knowledge of your model it is difficult to assist you one hundred percent, but I would say that you should use a GROUP BY clause instead of DISTINCT; then you can order it correctly.

Efficiently finding unique values in a database table

I've got a database table with a very large number of rows. This table represents messages that are logged by a system. Each message has a message type, and this is stored in its own field in the table. I'm writing a website for querying this message log. If I want to search by message type, then ideally I would want a drop-down box listing the message types that have come up in the database. Message types may change over time, so I can't hard-code the types into the drop-down. I'll have to do some sort of lookup. Iterating over the entire table contents to find unique message values is obviously very stupid; however, being stupid in the database field, I'm here asking for a better way. Perhaps a separate lookup table, which the database occasionally updates, listing just the unique message types that I can populate my drop-down from would be a better idea.
Any suggestions would be much appreciated.
The platform I'm using is ASP.NET MVC and SQL Server 2005
A separate lookup table, with the id of the message type stored in your log. This will reduce the size and increase the efficiency of the log. Also, it would normalize your data.
Yep, I would definitely go with the separate lookup table. You can then populate it using something like:
INSERT TypeLookup (Type)
SELECT DISTINCT Type
FROM BigMassiveTable
You could then run a top-up job periodically to pull in new types from your main table that don't already exist in the lookup table.
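A hedged sketch of such a top-up job, reusing the names from the snippet above:
INSERT TypeLookup (Type)
SELECT DISTINCT bmt.Type
FROM BigMassiveTable bmt
WHERE NOT EXISTS (SELECT 1 FROM TypeLookup tl WHERE tl.Type = bmt.Type)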
SELECT DISTINCT message_type
FROM message_log
is the most straightforward but not very efficient way.
If you have a list of types that can possibly appear in the log, use this:
SELECT message_type
FROM message_types mt
WHERE message_type IN
(
SELECT message_type
FROM message_log
)
This will be more efficient if message_log.message_type is indexed.
If you don't have this table but want to create one, and message_log.message_type is indexed, use a recursive CTE to emulate a loose index scan:
WITH rows (message_type) AS
(
SELECT MIN(message_type) AS mm
FROM message_log
UNION ALL
SELECT message_type
FROM (
SELECT mn.message_type, ROW_NUMBER() OVER (ORDER BY mn.message_type) AS rn
FROM rows r
JOIN message_log mn
ON mn.message_type > r.message_type
WHERE r.message_type IS NOT NULL
) q
WHERE rn = 1
)
SELECT message_type
FROM rows r
OPTION (MAXRECURSION 0)
I just wanted to state the obvious: normalize the data.
message_types
message_type | message_type_name
messages
message_id | message_type
Then you can just query directly, without any cached DISTINCT:
For your dropdown
SELECT * FROM message_types
For your retrieval
SELECT * FROM messages WHERE message_type = ?
SELECT m.*, mt.message_type_name FROM messages AS m
JOIN message_types AS mt
ON ( m.message_type = mt.message_type)
I'm not sure why you would want a cached DISTINCT which you'll have to update, when you can slightly tweak the schema and have one with RI (referential integrity).
Create an index on the message type:
CREATE INDEX IX_Messages_MessageType ON Messages (MessageType)
Then to get a list of unique Message Types, you run:
SELECT DISTINCT MessageType
FROM Messages
ORDER BY MessageType
Because the index is physically sorted in order of MessageType, SQL Server can very quickly and efficiently scan through the index, picking up a list of unique message types.
It does not perform badly; it's what SQL Server is good at.
Admittedly, you can save some space by having a "message types" table. And if you only display a few messages at a time, the bookmark lookup, as it joins back to the MessageTypes table, won't be a problem. But if you start displaying hundreds or thousands of messages at a time, then the join back to MessageTypes can get pretty expensive, and needless, and it will be faster to have the MessageType stored with the message.
But I would have no problem with creating an index on the MessageType column and selecting distinct. SQL Server loves that sort of thing. But if you're finding it to be a real load on your server, once you're getting dozens of hits a second, then follow the other suggestion and cache the values in memory.
My personal solution would be:
create the index
select distinct
and if I still had problems
cache in memory that expires after 30 seconds
As for the normalized/denormalized issue: normalizing saves space, at the cost of CPU when joins are constantly performed. But the logical point of normalization is to avoid duplicate data, which can lead to inconsistent data.
Are you planning on changing the text of a message type, which, if you stored it with the messages, would require updating all rows?
Or is there something to be said for the fact that at the time of the message the message type was "Client response requested"?
Have you considered an indexed view? Its result set is materialized and persists in storage so that the overhead of the lookup is separated from the rest of whatever you're trying to do.
SQL Server takes care of automagically updating the view when there is a data change which, in its opinion, would change the contents of the view, so in this respect it's less flexible than Oracle materialized views.
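A minimal sketch of such a view (assuming a dbo.Messages table with a MessageType column; an indexed view requires SCHEMABINDING, two-part names, and COUNT_BIG(*) when grouping):
CREATE VIEW dbo.vMessageTypes WITH SCHEMABINDING AS
SELECT MessageType, COUNT_BIG(*) AS MessageCount
FROM dbo.Messages
GROUP BY MessageType
GO
CREATE UNIQUE CLUSTERED INDEX IX_vMessageTypes ON dbo.vMessageTypes (MessageType)
GO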
The MessageType should be a Foreign Key in the main table to a definition table containing the message type codes and descriptions. This will greatly increase your lookup performance.
Something like
DECLARE @MessageTypes TABLE(
MessageTypeCode VARCHAR(10),
MessageTypeDescription VARCHAR(100)
)
DECLARE @Messages TABLE(
MessageTypeCode VARCHAR(10),
MessageValue VARCHAR(MAX),
MessageLogDate DATETIME,
AdditionalNotes VARCHAR(MAX)
)
From this design, your lookup should only need to query @MessageTypes.
As others have said, create a separate table of message types. When you add a record to the message table, check whether the message type already exists in that table; if not, add it. In either case, post the identifier from the message type table into the message table. This should give you normalized data. Yes, it's a little extra time when you add a record, but it should be more efficient on retrieval.
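A hedged sketch of that add-time check (all table, column, and variable names are hypothetical):
-- ensure the type exists, then log the message with its identifier
IF NOT EXISTS (SELECT 1 FROM MessageTypes WHERE TypeName = @typeName)
    INSERT INTO MessageTypes (TypeName) VALUES (@typeName)

INSERT INTO Messages (MessageTypeId, MessageText)
SELECT MessageTypeId, @messageText
FROM MessageTypes
WHERE TypeName = @typeName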
If there are a lot more adds than reads, and if the "message type" is short, an entirely different approach would be to still create the separate message type table, but not reference it when doing adds, and only update it lazily, on demand.
Namely, (a) Include a time-stamp in each message record. (b) Keep a list of the message types found as of the last time you checked. (c) Each time you check, search for any new message types added since the last time, as in:
select distinct message_type
into #new_types
from message
where timestamp > @last_type_check;

insert into message_type_list (message_type)
select message_type
from #new_types
where message_type not in (select message_type from message_type_list);

drop table #new_types;
Then store the timestamp of this check somewhere so you can use it the next time around.
The answer is to use DISTINCT, but the best solution is different for different table sizes. Thousands of rows? Millions? Billions? More? Those call for very different solutions.
