Partition eliminations with multilevel partitions on teradata database - teradata

I have a performance issue with multilevel partitioned tables on teradata database. It seems the partition elimination is not occurring if a table has a certain kind of partitioning structure.
As an example, consider the following table and query. With a single partitioning expression, the plan successfully eliminated all but one partition.
/* Table with single partitioning expression */
CREATE MULTISET TABLE table_single
(
id INTEGER,
time_stamp INTEGER)
PRIMARY INDEX ( id ,time_stamp )
PARTITION BY RANGE_N(time_stamp BETWEEN *,1651330800 AND 1656601199 EACH 3600 ,NO RANGE, UNKNOWN);
EXPLAIN SELECT count(*) FROM table_single WHERE time_stamp BETWEEN 1654041600 AND 1654041600 + 100;
// result
...
3) We do an all-AMPs SUM step in TD_MAP1 to aggregate from
a single partition of table_single with a condition of
...
In contrast, when we add another partition, the plan became a full scan, which indicates partition elimination did not occur.
/* Example with multi level partition */
CREATE MULTISET TABLE table_mult
(
id INTEGER,
time_stamp INTEGER)
PRIMARY INDEX ( id ,time_stamp )
PARTITION BY (
RANGE_N(time_stamp BETWEEN *,1651330800 AND 1656601199 EACH 3600 ,NO RANGE, UNKNOWN),
RANGE_N(id BETWEEN 1 AND 200 EACH 1 ,NO RANGE, UNKNOWN)
);
EXPLAIN SELECT count(*) FROM table_mult WHERE time_stamp BETWEEN 1654041600 AND 1654041600 + 100;
// result
...
3) We do an all-AMPs SUM step in TD_MAP1 to aggregate from
table_mult by way of an all-rows scan with a condition of
...
Further, when I change the partitioning expression slightly as below (removed the "*" from the first RANGE_N call), then the partition elimination is back.
/* Table with multilevel parition, v2 */
CREATE MULTISET TABLE table_mult2
(
id INTEGER,
time_stamp INTEGER)
PRIMARY INDEX ( id ,time_stamp )
PARTITION BY (
RANGE_N(time_stamp BETWEEN 1651330800 AND 1656601199 EACH 3600 ,NO RANGE, UNKNOWN),
RANGE_N(id BETWEEN 1 AND 200 EACH 1 ,NO RANGE, UNKNOWN)
);
EXPLAIN SELECT count(*) FROM table_mult2 WHERE time_stamp BETWEEN 1654041600 AND 1654041600 + 100;
// result
...
3) We do an all-AMPs SUM step in TD_MAP1 to aggregate from 404
partitions of table_mult2 with a condition of (
...
Not only the query plan changes, but also in practice we also observe that the query on the second table becomes slower as we have more data outside the target partitions, from which we believe the partition elimination is not functioning with that table.
We would like to understand the reasons for this behavior, in particular why the partition elimination does not occur for the second definition of the table.

Related

Automatic random primary key within a specific range in sqlite3?

I would like to automatically insert a primary key every time I add a new record to an SQLite3 table, much like a PRIMARY KEY AUTOINCREMENT except that the value should be randomly chosen from some range (say 0000 through 9999) rather than being assigned sequentially.
For demonstration purposes, let's restrict the range to 1 through 6 instead and try to populate the following table:
CREATE TABLE dice (rolled INTEGER PRIMARY KEY NOT NULL);
Now every time I insert a new record into that table, I want a new random primary key to be created.
The following works and does exactly what I want
INSERT INTO dice VALUES(
(
WITH RECURSIVE roll(x) AS (
VALUES(ABS(RANDOM()) % 6 + 1)
UNION ALL
SELECT x % 6 + 1 FROM roll
)
SELECT x FROM roll WHERE (
SELECT COUNT(*) FROM dice where rolled = x
) = 0 LIMIT 1
)
);
except that I have to invoke it manually/explicitly.
Is there any way to embed the above (or an equivalent) calculation for the random primary key into a DEFAULT clause for the "rolled" column or into some sort of trigger, so that a new key will be calculated automatically every time I insert a record?

Insert into Table with the first column being a Sequence

I am trying to use an Insert, Sequence and Select * to work together.
INSERT INTO BRK_INDV
Select * from (Select brk_seq.NEXTVAL as INDV_SEQ, a.*
FROM (select to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') BUSINESS_DAY, to_char(REQUEST_DATETIME,'hh24') src_hour,
CASE tran_type
WHEN 'V' THEN 'Visa'
WHEN 'M' THEN 'MasterCard'
ELSE tran_type
end text,
tran_type, count(*) as count
from DLY_STATS
where 1=1
AND to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') = '09-FEB-2015'
group by to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY'),to_char(REQUEST_DATETIME,'hh24'),tran_type order by src_hour)a);
This gives me the following error:
ERROR at line 2:
ORA-02287: sequence number not allowed here
I tried to remove the order by and still the same error.
However, if I only run
Select brk_seq.NEXTVAL as INDV_SEQ, a.*
FROM (select to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') BUSINESS_DAY, to_char(REQUEST_DATETIME,'hh24') src_hour,
CASE tran_type
WHEN 'V' THEN 'Visa'
WHEN 'M' THEN 'MasterCard'
ELSE tran_type
end text,
tran_type, count(*) as count
from DLY_STATS
where 1=1
AND to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') = '09-FEB-2015'
group by to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY'),to_char(REQUEST_DATETIME,'hh24'),tran_type order by src_hour)a;
It shows me proper entries. Then, why is select * not working for that?
Kindly help.
I see what you're trying to do. You want to insert rows into the BRK_INDV table in a particular order. The sequence number, which I assume will be the primary key of BRK_INDV, will be generated sequentially in the sorted order of the input rows.
You are working with a relational database. One of the first characteristics we all learn about a relational database is that the order of the rows in a table is insignificant. That's just a fancy word for fugitaboutit.
You cannot assume that a select * from table will return the rows in the same order they were written. It might. It might for quite a long time. Then something -- the number of rows, the grouping of some column values, the phase of the moon -- something will change and you will get them out in a seemingly totally random order.
If you want order, it must be imposed in the query, not the insert.
Here's the statement you should be executing:
INSERT INTO BRK_INDV
With
Grouped( Business_Day, Src_Hour, Text, Tran_Type, Count )As(
Select Trunc( Request_Datetime ) Business_Day,
To_Char( Request_Datetime, 'hh24') Src_Hour,
Case Tran_Type
When 'V' Then 'Visa'
When 'M' Then 'MasterCard'
Else Tran_Type
end Text,
Tran_Type, count(*) as count
from DLY_STATS
Where 1=1 --> Generated as dynamic SQL?
And Request_Datetime >= Date '2015-02-09'
And Request_Datetime < Date '2015-02-10'
Group By Trunc( Request_Datetime ), To_Char( Request_Datetime, 'hh24'), Tran_Type
)
Select brk_seq.Nextval Indv_Seq, G.*
from Grouped G;
Notice there is no order by. If you want to see the generated rows in a particular order:
select * from Brk_Indv order by src_hour;
Since there could be hundreds or thousands of transactions in any particular hour, you probably order by something other than hour anyway.
In Oracle, the trunc function is the best way to get a date with the time portion stripped away. However, you don't want to use it in the where clause (or, aamof, any other function such as to_date or to_char)as that would make the clause non-sargable and result in a complete table scan.
The problem is that you can't use a sequence in a subquery. For example, this gives the same ORA-02287 error you are getting:
create table T (x number);
create sequence s;
insert into T (select * from (select s.nextval from dual));
What you can do, though, is create a function that returns nextval from the sequence, and use that in a subquery:
create function f return number as
begin
return s.nextval;
end;
/
insert into T (select * from (select f() from dual));

SQL update attribute

I have a table of milestones in which the primary key is Id_milestone and a table with tasks where foreign key is id_milestone. For each task I attribute completion in percentage. The milestone also have attribute of completion in percentage. I need to update the completion of the milestone at 100 percent until they have completed all the tasks set to 100 percent. I have a DropDownList with interval 10 percent and users update the progress. Sorry for my English.
CREATE TABLE Milestone
(
ID_milestone INTEGER NOT NULL ,
Nazev_milniku VARCHAR2 (30) ,
) ;
CREATE TABLE Milestone_complete
(
ID_milestone INTEGER NOT NULL ,
Completed INTEGER
) ;
CREATE TABLE Tasks
(
ID_tasks INTEGER NOT NULL ,
ID_milestone INTEGER NOT NULL ,
Name_task VARCHAR2 (30) ,
) ;
CREATE TABLE Tasks_complete
(
ID_task INTEGER NOT NULL ,
Completed INTEGER
) ;
This will calculate the percentage of your milestones using all percentages of your tasks.
UPDATE [milestones]
SET [percentage] = (SELECT AVG(pecentage) FROM [tasks] WHERE [tasks].[id_milestone] = [milestones].[id])
If you want to set e.g. a status flag like [completed] you can do something like this:
UPDATE [milestones]
SET [completed] = (CASE WHEN (SELECT AVG(pecentage) FROM [tasks] WHERE [tasks].[id_milestone] = [milestones].[id]) = 100 THEN 1 ELSE 0 END)
This is just a guess what you want as you didn't comppletely show us your tables, structure, etc.
Hope it helps.
EDIT
Your table Milestone need a column percentage or do you want to write the result in the other table Milestone_complete? I don't understand why you have those additional tables.
So we guess you have a column percentage in your Milestone table. In this case the SQL is like this:
UPDATE [Milestone]
SET [percentage] = (SELECT AVG(pecentage)
FROM [Tasks]
WHERE [Tasks].[ID_milestone] = [Milestone].[ID_milestone])
If you want to write the result in the Completed' column in theMilestone_complete` table, then do it like this:
UPDATE [Milestone_complete]
SET [percentage] = (SELECT AVG(pecentage)
FROM [Tasks]
WHERE [Tasks].[ID_milestone] = [Milestone].[ID_milestone])
FROM [Milestone_complete]
JOIN [Milestone] ON [Milestone].[ID_milestone] = [Milestone_Complete].[ID_milestone]
Or do you only want to insert result into Milestone_Complete if all tasks of the milestone is 100%?
I do not know if you already have a record in the compelted table or need to add on in case.
I assume you have one and want to set completed to e.g. 1 (0 = not completed).
UPDATE [Milestone_complete]
SET [completed] = 1
FROM [Milestone_complete]
JOIN [Milestone] ON [Milestone].[ID_milestone] = [Milestone_Complete].[ID_milestone]
WHERE (SELECT AVG(percentage)
FROM [Tasks]
WHERE [Tasks].[ID_milestone] = [Milestone].[ID_milestone])=100
I didn't test anything of it, so not sure if it works, but as I do not 100% your approach you need to modify to your own needs. Hope it helps. Next time you ask a question concentrate on a clear, complete and understandable question, make it easier.

Is it possible to use WHERE clause in same query as PARTITION BY?

I need to write SQL that keeps only the minimum 5 records per each identifiable record in a table. For this, I use partition by and delete all records where the value returned is greater than 5. When I attempt to use the WHERE clause in the same query as the partition by statement, I get the error "Ordered Analytical Functions not allowed in WHERE Clause". So, in order to get it to work, I have to use three subqueries. My SQL looks ilke this:
delete mydb.mytable where (field1,field2) in
(
select field1,field2 from
(
select field1,field2,
Rank() over
(
partition BY field1
order by field1,field2
) n
from mydb.mytable
) x
where n > 5
)
The innermost subquery just returns the raw data. Since I can't use WHERE there, I wrapped it with a subquery, the purpose of which is to 1) use WHERE to get records greater than 5 in rank and 2) select only field1 and field2. The reason why I select only those two fields is so that I can use the IN statement for deleting those records in the outermost query.
It works, but it appears a bit cumbersome. I'd like to consolidate the inner two subqueries into a single subquery. Is this possible?
Sounds like you need to use the QUALIFY clause which is the HAVING clause for Window Aggregate functions. Below is my take on what you are trying to accomplish.
Please do not run this SQL directly against your production data without first testing it.
/* Physical Delete */
DELETE TGT
FROM MyDB.MyTable TGT
INNER JOIN
(SELECT Field1
, Field2
FROM MyDB.MyTable
QUALIFY ROW_NUMBER() (PARTITION BY Field1, ORDER BY Field1,2)
> 5
) SRC
ON TGT.Field1 = SRC.Field1
AND TGT.Field2 = SRC.Fileld2
/* Logical Delete */
UPDATE TGT
FROM MyDB.MyTable TGT
,
(SELECT Field1
, Field2
FROM MyDB.MyTable
QUALIFY ROW_NUMBER() (PARTITION BY Field1, ORDER BY Field1,2)
> 5
) SRC
SET Deleted = 'Y'
/* RecordExpireDate = Date - 1 */
WHERE TGT.Field1 = SRC.Field1
AND TGT.Field2 = SRC.Fileld2

SQLite query runs 10 times slower than MSAccess query

I have a 800MB MS Access database that I migrated to SQLite. The structure of the database is as follows (the SQLite database, after migration, is around 330MB):
The table Occurrence has 1,600,000 records. The table looks like:
CREATE TABLE Occurrence
(
SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
OccurrenceTypeID INTEGER, Period INTEGER, HasSucceeded BOOL,
PRIMARY KEY (SimulationID, SimRunID, OccurrenceID)
)
It has the following indexes:
CREATE INDEX "Occurrence_HasSucceeded_idx" ON "Occurrence" ("HasSucceeded" ASC)
CREATE INDEX "Occurrence_OccurrenceID_idx" ON "Occurrence" ("OccurrenceID" ASC)
CREATE INDEX "Occurrence_SimRunID_idx" ON "Occurrence" ("SimRunID" ASC)
CREATE INDEX "Occurrence_SimulationID_idx" ON "Occurrence" ("SimulationID" ASC)
The table OccurrenceParticipant has 3,400,000 records. The table looks like:
CREATE TABLE OccurrenceParticipant
(
SimulationID INTEGER, SimRunID INTEGER, OccurrenceID INTEGER,
RoleTypeID INTEGER, ParticipantID INTEGER
)
It has the following indexes:
CREATE INDEX "OccurrenceParticipant_OccurrenceID_idx" ON "OccurrenceParticipant" ("OccurrenceID" ASC)
CREATE INDEX "OccurrenceParticipant_ParticipantID_idx" ON "OccurrenceParticipant" ("ParticipantID" ASC)
CREATE INDEX "OccurrenceParticipant_RoleType_idx" ON "OccurrenceParticipant" ("RoleTypeID" ASC)
CREATE INDEX "OccurrenceParticipant_SimRunID_idx" ON "OccurrenceParticipant" ("SimRunID" ASC)
CREATE INDEX "OccurrenceParticipant_SimulationID_idx" ON "OccurrenceParticipant" ("SimulationID" ASC)
The table InitialParticipant has 130 records. The structure of the table is
CREATE TABLE InitialParticipant
(
ParticipantID INTEGER PRIMARY KEY, ParticipantTypeID INTEGER,
ParticipantGroupID INTEGER
)
The table has the following indexes:
CREATE INDEX "initialpart_participantTypeID_idx" ON "InitialParticipant" ("ParticipantGroupID" ASC)
CREATE INDEX "initialpart_ParticipantID_idx" ON "InitialParticipant" ("ParticipantID" ASC)
The table ParticipantGroup has 22 records. It looks like
CREATE TABLE ParticipantGroup (
ParticipantGroupID INTEGER, ParticipantGroupTypeID INTEGER,
Description varchar (50), PRIMARY KEY( ParticipantGroupID )
)
The table has the following index:
CREATE INDEX "ParticipantGroup_ParticipantGroupID_idx" ON "ParticipantGroup" ("ParticipantGroupID" ASC)
The table tmpSimArgs has 18 records. It has the following structure:
CREATE TABLE tmpSimArgs (SimulationID varchar, SimRunID int(10))
And the following indexes:
CREATE INDEX tmpSimArgs_SimRunID_idx ON tmpSimArgs(SimRunID ASC)
CREATE INDEX tmpSimArgs_SimulationID_idx ON tmpSimArgs(SimulationID ASC)
The table ‘tmpPartArgs’ has 80 records. It has the below structure:
CREATE TABLE tmpPartArgs(participantID INT)
And the below index:
CREATE INDEX tmpPartArgs_participantID_idx ON tmpPartArgs(participantID ASC)
I have a query that involves multiple INNER JOINs and the problem I am facing is the Access version of the query takes about a second whereas the SQLite version of the same query takes 10 seconds (about 10 times slow!) It is impossible for me to migrate back to Access and SQLite is my only option.
I am new to writing database queries hence these queries might look stupid, so please advise on anything you see faulty or kid-dish.
The query in Access is (the entire query takes 1 second to execute):
SELECT ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period, Count(OccurrenceParticipant.ParticipantID) AS CountOfParticipantID FROM
(
ParticipantGroup INNER JOIN InitialParticipant ON ParticipantGroup.ParticipantGroupID = InitialParticipant.ParticipantGroupID
) INNER JOIN
(
tmpPartArgs INNER JOIN
(
(
tmpSimArgs INNER JOIN Occurrence ON (tmpSimArgs.SimRunID = Occurrence.SimRunID) AND (tmpSimArgs.SimulationID = Occurrence.SimulationID)
) INNER JOIN OccurrenceParticipant ON (Occurrence.OccurrenceID = OccurrenceParticipant.OccurrenceID) AND (Occurrence.SimRunID = OccurrenceParticipant.SimRunID) AND (Occurrence.SimulationID = OccurrenceParticipant.SimulationID)
) ON tmpPartArgs.participantID = OccurrenceParticipant.ParticipantID
) ON InitialParticipant.ParticipantID = OccurrenceParticipant.ParticipantID WHERE (((OccurrenceParticipant.RoleTypeID)=52 Or (OccurrenceParticipant.RoleTypeID)=49)) AND Occurrence.HasSucceeded = True GROUP BY ParticipantGroup.Description, Occurrence.SimulationID, Occurrence.SimRunID, Occurrence.Period;
The SQLite query is as follows (this query takes around 10 seconds):
SELECT ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period, Count(ij2.occpParticipantID) AS CountOfParticipantID FROM
(
SELECT ip.ParticipantGroupID AS ipParticipantGroupID, ip.ParticipantID AS ipParticipantID, ip.ParticipantTypeID, pg.ParticipantGroupID AS pgParticipantGroupID, pg.ParticipantGroupTypeID, pg.Description FROM ParticipantGroup as pg INNER JOIN InitialParticipant AS ip ON pg.ParticipantGroupID = ip.ParticipantGroupID
) AS ij1 INNER JOIN
(
SELECT tpa.participantID AS tpaParticipantID, ij3.* FROM tmpPartArgs AS tpa INNER JOIN
(
SELECT ij4.*, occp.SimulationID as occpSimulationID, occp.SimRunID AS occpSimRunID, occp.OccurrenceID AS occpOccurrenceID, occp.ParticipantID AS occpParticipantID, occp.RoleTypeID FROM
(
SELECT tsa.SimulationID AS tsaSimulationID, tsa.SimRunID AS tsaSimRunID, occ.SimulationID AS occSimulationID, occ.SimRunID AS occSimRunID, occ.OccurrenceID AS occOccurrenceID, occ.OccurrenceTypeID, occ.Period, occ.HasSucceeded FROM tmpSimArgs AS tsa INNER JOIN Occurrence AS occ ON (tsa.SimRunID = occ.SimRunID) AND (tsa.SimulationID = occ.SimulationID)
) AS ij4 INNER JOIN OccurrenceParticipant AS occp ON (occOccurrenceID = occpOccurrenceID) AND (occSimRunID = occpSimRunID) AND (occSimulationID = occpSimulationID)
) AS ij3 ON tpa.participantID = ij3.occpParticipantID
) AS ij2 ON ij1.ipParticipantID = ij2.occpParticipantID WHERE (((ij2.RoleTypeID)=52 Or (ij2.RoleTypeID)=49)) AND ij2.HasSucceeded = 1 GROUP BY ij1.Description, ij2.occSimulationID, ij2.occSimRunID, ij2.Period;
I don’t know what I am doing wrong here. I have all the indexes but I thinking I am missing declaring some key index that will do the trick for me. The interesting thing is before migration my ‘research’ on SQLite showed that SQLite is faster, smaller and better in all aspects than Access. But I cant seem to get SQLite work faster than Access in terms of querying. I reiterate that I am new to SQLite and obviously do not have much idea as well as experience so if any learned soul could help me out with this, it will be much appreciated.
I have reformatting your code (using my home-brew sql formatter) to hopefully make it easier for others to read..
Reformatted Query:
SELECT
ij1.Description,
ij2.occSimulationID,
ij2.occSimRunID,
ij2.Period,
Count(ij2.occpParticipantID) AS CountOfParticipantID
FROM (
SELECT
ip.ParticipantGroupID AS ipParticipantGroupID,
ip.ParticipantID AS ipParticipantID,
ip.ParticipantTypeID,
pg.ParticipantGroupID AS pgParticipantGroupID,
pg.ParticipantGroupTypeID,
pg.Description
FROM ParticipantGroup AS pg
INNER JOIN InitialParticipant AS ip
ON pg.ParticipantGroupID = ip.ParticipantGroupID
) AS ij1
INNER JOIN (
SELECT
tpa.participantID AS tpaParticipantID,
ij3.*
FROM tmpPartArgs AS tpa
INNER JOIN (
SELECT
ij4.*,
occp.SimulationID AS occpSimulationID,
occp.SimRunID AS occpSimRunID,
occp.OccurrenceID AS occpOccurrenceID,
occp.ParticipantID AS occpParticipantID,
occp.RoleTypeID
FROM (
SELECT
tsa.SimulationID AS tsaSimulationID,
tsa.SimRunID AS tsaSimRunID,
occ.SimulationID AS occSimulationID,
occ.SimRunID AS occSimRunID,
occ.OccurrenceID AS occOccurrenceID,
occ.OccurrenceTypeID,
occ.Period,
occ.HasSucceeded
FROM tmpSimArgs AS tsa
INNER JOIN Occurrence AS occ
ON (tsa.SimRunID = occ.SimRunID)
AND (tsa.SimulationID = occ.SimulationID)
) AS ij4
INNER JOIN OccurrenceParticipant AS occp
ON (occOccurrenceID = occpOccurrenceID)
AND (occSimRunID = occpSimRunID)
AND (occSimulationID = occpSimulationID)
) AS ij3
ON tpa.participantID = ij3.occpParticipantID
) AS ij2
ON ij1.ipParticipantID = ij2.occpParticipantID
WHERE (
(
(ij2.RoleTypeID) = 52
OR
(ij2.RoleTypeID) = 49
)
)
AND ij2.HasSucceeded = 1
GROUP BY
ij1.Description,
ij2.occSimulationID,
ij2.occSimRunID,
ij2.Period;
As per JohnFx (above), I was confused by the derived views. I think there is actually no need for it, especially since they are all inner joins. So, below I have attempted to reduce the complexity. Please review and test for performance. I have had to do a cross join with tmpSimArgs since it is only joined to Occurence - I assume this is desired behaviour.
SELECT
pg.Description,
occ.SimulationID,
occ.SimRunID,
occ.Period,
COUNT(occp.ParticipantID) AS CountOfParticipantID
FROM ParticipantGroup AS pg
INNER JOIN InitialParticipant AS ip
ON pg.ParticipantGroupID = ip.ParticipantGroupID
CROSS JOIN tmpSimArgs AS tsa
INNER JOIN Occurrence AS occ
ON tsa.SimRunID = occ.SimRunID
AND tsa.SimulationID = occ.SimulationID
INNER JOIN OccurrenceParticipant AS occp
ON occ.OccurrenceID = occp.OccurrenceID
AND occ.SimRunID = occp.SimRunID
AND occ.SimulationID = occp.SimulationID
INNER JOIN tmpPartArgs AS tpa
ON tpa.participantID = occp.ParticipantID
WHERE occ.HasSucceeded = 1
AND (occp.RoleTypeID = 52 OR occp.RoleTypeID = 49 )
GROUP BY
pg.Description,
occ.SimulationID,
occ.SimRunID,
occ.Period;
I have presented a smaller scaled down version of my query. Hope this is more clear and legible than my earlier one.
SELECT5 * FROM
(
SELECT4 FROM ParticipantGroup as pg INNER JOIN InitialParticipant AS ip ON pg.ParticipantGroupID = ip.ParticipantGroupID
) AS ij1 INNER JOIN
(
SELECT3 * FROM tmpPartArgs AS tpa INNER JOIN
(
SELECT2 * FROM
(
SELECT1 * FROM tmpSimArgs AS tsa INNER JOIN Occurrence AS occ ON (tsa.SimRunID = occ.SimRunID) AND (tsa.SimulationID = occ.SimulationID)
) AS ij4 INNER JOIN OccurrenceParticipant AS occp ON (occOccurrenceID = occpOccurrenceID) AND (occSimRunID = occpSimRunID) AND (occSimulationID = occpSimulationID)
) AS ij3 ON tpa.participantID = ij3.occpParticipantID
) AS ij2 ON ij1.ipParticipantID = ij2.occpParticipantID WHERE (((ij2.RoleTypeID)=52 Or (ij2.RoleTypeID)=49)) AND ij2.HasSucceeded = 1
The application that I am working on is a Simulation application and in order to understand the context of the above query I thought it necessary to give a brief explanation of the application. Let us assume there is a planet with some initial resources and living agents. The planet is allowed to exist for 1000 years and the actions performed by the agents are monitored and stored in the database. After 1000 years the planet is destroyed and again re-created with the same set of initial resources and living agents as the first time. This (the creation and destruction) is repeated 18 times and all the actions of the agents performed during those 1000 years are stored in the database. Thus our entire experiment consists of 18 re-creations which is termed as the ‘Simulation’. Each of the 18 times the planet is recreated is termed as a run and each of the 1000 years of a run is called a period. So a ‘Simulation’ consists of 18 runs and each run consists of 1000 periods. At the start of each run, we assign the ‘Simulation’ an initial set of knowledge items and dynamic agents that interact with each other and the items. A knowledge item is stored by an agent inside a knowledge store. The knowledge store is also considered to be a participating entity in our Simulation. But this concept (regarding knowledge stores) is not important. I have tried to be detailed about every SELECT statement and the tables involved.
SELECT1: I think this query could be replaced by just the table ‘Occurrence’, since it does nothing much. The table Occurrence stores the different actions taken by the agents, in each period of every simulation run of a particular ‘Simulation’. Normally each ‘Simulation’ consists of 18 runs. And each run consists of a 1000 periods. An agent is allowed to take an action in every period of every run in the ‘Simulation’. But the Occurrence table does not store any details about the agents that perform the actions. The Occurrence table might store data related to multiple ‘Simulations’.
SELECT2: This query simply returns the details of actions performed in every period of every run of a ‘Simulation’ along with the details of all participants of the ‘Simulation’ like their respective ParticipantIDs. The OccurrenceParticipant table stores records for every participating entity of the Simulation and that includes agents, knowledge stores, knowledge items, etc.
SELECT3: This query returns only those records from the pseudo table ij3 that are due to agents and knowledge items. All records in ij3 concerning knowledge items will be filtered out.
SELECT4: This query attaches the ‘Description’ field to every record of ‘InitialParticipant’. Please note that the column ‘Description’ is an Output column of the entire query. The table InitialParticipant contains a record for every agent and every knowledge item that is initially assigned to the ‘Simulation’
SELECT5: This final query returns all records from the pseudo table ij2 for which the RoleType of the participating entity (which may either be an agent or a knowledge item) is 49 or 52.
I would suggest moving the ij2.RoleTypeID filtering from the outermost query to ij3, use IN instead of OR, and move the HasSucceeded query to ij4.

Resources