long running query in documentum application - dql

Retrieving objects in our Documentum application takes a long time. We activated the long-running query option on the data source and found that the query below takes too much time:
select all
b.r_object_id, dm_repeating1_0.state_name, a.object_name
from
dm_policy_sp a,
dm_sysobject_sp b,
dm_policy_rp dm_repeating1_0
where
(
(a.r_object_id=b.r_policy_id)
and (dm_repeating1_0.i_state_no=b.r_current_state)
and b.r_object_id in (N'a long, long list of IDs')
or a.r_object_id in (N'a long, long list of IDs')
)
and /* ... */
As you can see, table "a" is a policy table, and it has only 7 records. After both "or" operators, the SQL statement looks up a list of about 100 object IDs in table "a". When we searched for those IDs ourselves, we found that they belong to table "b" (dm_sysobject), not to table "a".
The query above takes about 17 minutes. When we changed the table alias after the "or" operator from a to b, it took only 10 seconds!
We suspect the generated query is wrong, but we don't know whether this is a bug in Documentum or a mistake in our configuration. Where can we find the DQL, or the component, that produces this SQL? Any ideas?

It looks like Documentum does this inside LifecycleNameDataHandler and LifecycleDataHandlerHelper. I decompiled these classes and found this DQL query:
SELECT b.r_object_id, a.state_name, a.object_name FROM dm_policy(all) a, dm_sysobject(all) b WHERE b.r_object_id IN (...) AND a.r_object_id = b.r_policy_id AND a.i_state_no = b.r_current_state ENABLE(row_based)
Documentum Webtop executes this DQL query whenever a user opens a datagrid that includes the lifecycle state name column.
There are a few options:
Optimize the query at the database level and test it from DQL (for example, with the DQL tester in DA).
Decompile the LifecycleDataHandlerHelper class and rewrite the DQL query in a different form. Try adding hints such as FORCE_ORDER.
If you do not use lifecycles at all, you can simply disable this class: in the file webcomponent\app.xml, comment out or disable the line containing com.documentum.webcomponent.library.applylifecycle.LifecycleNameDataHandler.
Remove the lifecycle state name (or State Name) column from the grids. Users may have selected this column in their customized grids, in which case you can advise them to remove it.

I don't know exactly what you want to retrieve with this query, but I think it might be reworked as follows, with an explicit join between dm_repeating1_0 and a, and with parentheses around the two IN conditions:
select all
b.r_object_id, dm_repeating1_0.state_name, a.object_name
from
dm_policy_sp a,
dm_sysobject_sp b,
dm_policy_rp dm_repeating1_0
where
(
(a.r_object_id=b.r_policy_id)
AND dm_repeating1_0.r_object_id=a.r_object_id
and (dm_repeating1_0.i_state_no=b.r_current_state)
and (b.r_object_id in (...)
or a.r_object_id in (...))
)
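The effect of the missing parentheses can be reproduced on a toy schema. The sketch below is only an illustration, not Documentum's schema: it uses Python's built-in sqlite3 module, and the table names policy and sysobject and their columns are made up. Because AND binds tighter than OR, the unparenthesized predicate lets rows through the OR branch without any join condition.

```python
import sqlite3

# Hypothetical miniature schema, standing in for the policy and sysobject tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policy (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sysobject (id INTEGER PRIMARY KEY, policy_id INTEGER);
    INSERT INTO policy VALUES (1, 'p1'), (2, 'p2');
    INSERT INTO sysobject VALUES (10, 1), (11, 1), (12, 2);
""")

# AND binds tighter than OR, so this predicate is evaluated as
#   (join AND b.id IN (...)) OR (a.id IN (...))
# and the OR branch pairs policy 1 with every sysobject row.
unparenthesized = conn.execute("""
    SELECT count(*) FROM policy a, sysobject b
    WHERE a.id = b.policy_id AND b.id IN (10) OR a.id IN (1)
""").fetchone()[0]

# With explicit parentheses the join condition applies to both branches.
parenthesized = conn.execute("""
    SELECT count(*) FROM policy a, sysobject b
    WHERE a.id = b.policy_id AND (b.id IN (10) OR a.id IN (1))
""").fetchone()[0]

print(unparenthesized, parenthesized)
```

On large tables the unjoined branch turns into a partial cross product, which is one plausible explanation for a runtime measured in minutes rather than seconds.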

Related

Need to get data from a table using database link where database name is dynamic

I am working on a system where I need to create a view. I have two databases:
1. CDR_DB
2. EMS_DB
I want to create the view in EMS_DB using a table from CDR_DB, and I am trying to do this via a database link.
The database link is created at runtime: the database name is decided when the user installs the database, and the link name is derived from it.
My problem is that I am trying to build the view from a table whose name is only known at runtime. Please see the query below:
select count(*)
from (SELECT CONCAT('cdr_log@', alias) db_name
FROM ems_dbs a,
cdr_manager b
WHERE a.db_type = 'CDR'
and a.ems_db_id = b.cdr_db_id
and b.op_state = 4 ) db_name;
In this query, cdr_log@"db_name" is the runtime table name (db_name is created at runtime).
When I run the query above, I don't get the desired result: it returns '1'.
When I run only the subquery:
SELECT CONCAT('cdr_log@', alias) db_name
FROM ems_dbs a,
cdr_manager b
WHERE a.db_type = 'CDR'
and a.ems_db_id = b.cdr_db_id
and b.op_state = 4;
I get the desired result, i.e. cdr_log@cdrdb01.
But the full query still returns '1'.
Also, when I run
select count(*) from cdr_log@cdrdb01;
I get '24', which is correct.
I expect the full query to return the same result as:
select count(*) from cdr_log@cdrdb01;
---24
Instead, the full query mentioned initially returns '1'.
Please let me know how to solve this problem. I found a way to do it with a procedure, but I'm not sure how to invoke that procedure.
Can this be done with a subquery, as I tried above?
You're not going to be able to create a view that will dynamically reference an object over a database link unless you do something like create a pipelined table function that builds the SQL dynamically.
If the database link is created and named dynamically at installation time, it would probably make the most sense to create any objects that depend on the database link (such as the view) at installation time too. Dynamic SQL tends to be much harder to write, maintain, and debug than static SQL so it would make sense to minimize the amount of dynamic SQL you need. If you can dynamically create the view at installation time, that's likely the easiest option. Even better than directly referencing the remote object in the view, particularly if there are multiple objects that need to reference the remote object, would probably be to have the view reference a synonym and create the synonym at install time. Something like
create synonym cdr_log_remote
for cdr_log@<<dblink name>>;
create or replace view view_name
as
select *
from cdr_log_remote;
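If the installer is itself a script, the install-time DDL can be generated once the link name is known. The following is a minimal sketch assuming a Python installer; the function name build_install_ddl is hypothetical, and the object names cdr_log_remote and view_name are simply the ones used above. Identifiers cannot be passed as bind parameters, so the sketch validates the link name before interpolating it.

```python
import re


def build_install_ddl(dblink_name: str) -> list:
    """Return the synonym and view DDL for a link name chosen at install time.

    Hypothetical helper: it only builds the statements, it does not run them.
    """
    # Identifiers cannot be bound as parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_$#.]*", dblink_name):
        raise ValueError(f"unsafe database link name: {dblink_name!r}")
    return [
        f"CREATE SYNONYM cdr_log_remote FOR cdr_log@{dblink_name}",
        "CREATE OR REPLACE VIEW view_name AS SELECT * FROM cdr_log_remote",
    ]


for statement in build_install_ddl("cdrdb01"):
    print(statement)
```

The view never mentions the link directly, so only the synonym has to be recreated if the link name changes.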
If you don't want to create the synonym/ view at installation time, you'd need to use dynamic SQL to reference the remote object. You can't use dynamic SQL as the SELECT statement in a view so you'd need to do something like have a view reference a pipelined table function that invokes dynamic SQL to call the remote object. That's a fair amount of work but it would look something like this
-- Define an object type that has the same set of columns as the remote object
create type typ_cdr_log as object (
col1 number,
col2 varchar2(100)
);
/
create type tbl_cdr_log as table of typ_cdr_log;
/
create or replace function getAllCDRLog
return tbl_cdr_log
pipelined
is
-- bulk collect needs the collection type, not the single-object type
l_rows tbl_cdr_log;
l_sql varchar2(1000);
l_dblink_name varchar2(100);
begin
-- Look up the database link name (assumes exactly one matching row)
SELECT alias db_name
INTO l_dblink_name
FROM ems_dbs a,
cdr_manager b
WHERE a.db_type = 'CDR'
and a.ems_db_id = b.cdr_db_id
and b.op_state = 4;
-- Wrap the columns in the object constructor so each row matches typ_cdr_log
l_sql := 'SELECT typ_cdr_log(col1, col2) FROM cdr_log@' || l_dblink_name;
execute immediate l_sql
bulk collect into l_rows;
for i in 1 .. l_rows.count
loop
pipe row( l_rows(i) );
end loop;
return;
end;
/
create or replace view view_name
as
select *
from table( getAllCDRLog );
Note that this will not be a particularly efficient way to structure things if there are a large number of rows in the remote table since it reads all the rows into memory before starting to return them back to the caller. There are plenty of ways to make the pipelined table function more efficient but they'll tend to make the code more complicated.

Efficient way to load lists of objects from database to instantiate a single object

My situation
I have a C# object which contains some lists. One of these lists is, for example, a list of tags, which is a list of C# "SystemTag" objects. I want to instantiate this object in the most efficient way.
In my database structure, I have the following tables:
dbObject - the table which contains some basic information about my c# object
dbTags - a list of all available tags
dbTagConnections - a list which has 2 fields: TagID and ObjectID (to make sure an object can have several tags)
(I have several other similar types of data)
This is how I do it now...
Retrieve my object from the DB using an ID
Send the DB object to an "Object factory" pattern, which then realizes we have to get the tags (and other lists). It then sends a call to the DAL layer using the ID of our C# object
The DAL layer retrieves the data from the DB
This data is sent to a "TagFactory" pattern, which converts it to tags
We are back to the Object Factory
This is really inefficient, and we make many calls to the database. It is especially a problem because I have 4+ types of lists.
What have I tried?
I am not really good at SQL, but I've tried the following query:
SELECT * FROM dbObject p
LEFT JOIN dbTagConnection c on p.Id= c.PointId
LEFT JOIN dbTags t on c.TagId = t.dbTagId
WHERE ....
However, this retrieves as many rows as there are tag connections, so I don't see joins as a good way to do this.
Other info...
Using .NET Framework 4.0
Using LINQ to SQL (BLL and DAL layer with Factory patterns in the BLL to convert from DAL objects)
...
So, how do I solve this as efficiently as possible? :-) Thanks!
At first sight, I don't see your current way of working as "inefficient" (given the information provided). I would replace the code:
SELECT * FROM dbObject p
LEFT JOIN dbTagConnection c on p.Id= c.PointId
LEFT JOIN dbTags t on c.TagId = t.dbTagId
WHERE ...
with two calls to DAL methods: the first retrieves the object's main data (1), and the second retrieves only the related tags (2), so that your factory can fill the object's tag list:
(1)
SELECT * FROM dbObject WHERE Id=@objectId
(2)
SELECT t.* FROM dbTags t
INNER JOIN dbTagConnection c ON c.TagId = t.dbTagId
INNER JOIN dbObject p ON p.Id = c.PointId
WHERE p.Id=@objectId
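The two-query approach can be sketched end to end. This is a Python and sqlite3 stand-in rather than C# with LINQ to SQL; the Name columns, the SystemObject class and the load_object function are assumptions made for the example, and only the table and key names come from the question.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class SystemObject:
    id: int
    name: str
    tags: list = field(default_factory=list)


def load_object(conn, object_id):
    # Query (1): the object's main data.
    row = conn.execute(
        "SELECT Id, Name FROM dbObject WHERE Id = ?", (object_id,)
    ).fetchone()
    if row is None:
        return None
    obj = SystemObject(id=row[0], name=row[1])
    # Query (2): only the tags connected to this object.
    tag_rows = conn.execute(
        "SELECT t.Name FROM dbTags t "
        "INNER JOIN dbTagConnection c ON c.TagId = t.dbTagId "
        "WHERE c.PointId = ? ORDER BY t.dbTagId",
        (object_id,),
    )
    obj.tags = [name for (name,) in tag_rows]
    return obj


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbObject (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE dbTags (dbTagId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE dbTagConnection (TagId INTEGER, PointId INTEGER);
    INSERT INTO dbObject VALUES (1, 'first'), (2, 'second');
    INSERT INTO dbTags VALUES (1, 'red'), (2, 'blue'), (3, 'green');
    INSERT INTO dbTagConnection VALUES (1, 1), (2, 1), (3, 2);
""")
loaded = load_object(conn, 1)
print(loaded)
```

The cost is one round trip per list type, so an object with four lists needs five queries; that is usually acceptable when a single object is loaded at a time.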
If you have many objects but only a small amount of data (meaning that you are not going to manage big volumes), then I would look for an ORM-based solution such as the Entity Framework.
I (still) feel comfortable writing SQL queries in the DAOs so that I control every query sent to the DB server, but in the end that is because our situation requires it. I don't see any problem with querying the database first to recover the object data (SELECT * FROM dbObject WHERE ID=@myId) and fill the object instance, and then querying it again to recover whatever satellite data you need (the tags, in your case).
You need to be more precise about your scenario so that we can provide useful recommendations for it. I hope this is useful to you anyway.
We used stored procedures that returned multiple resultsets, in a similar situation in a previous project using Java/MSSQL server/Plain JDBC.
The stored procedure takes the ID of the object to be retrieved and returns the row needed to build the primary object, followed by the records for each one-to-many relationship of that object. This allowed us to build the object in its entirety in a single database interaction.
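SQLite has neither stored procedures nor multiple result sets, so the sketch below swaps in a related single-round-trip technique rather than the one described above: one LEFT JOIN whose duplicated parent rows are folded back into objects on the client. It is written in Python with sqlite3; the Name columns and the function name load_objects_with_tags are assumptions for the example.

```python
import sqlite3


def load_objects_with_tags(conn):
    """Load every object and its tags in one query, grouping rows by object ID."""
    objects = {}
    rows = conn.execute("""
        SELECT p.Id, p.Name, t.Name
        FROM dbObject p
        LEFT JOIN dbTagConnection c ON p.Id = c.PointId
        LEFT JOIN dbTags t ON c.TagId = t.dbTagId
        ORDER BY p.Id, t.dbTagId
    """)
    for obj_id, obj_name, tag_name in rows:
        # The join repeats the object once per tag; keep a single entry per ID.
        obj = objects.setdefault(obj_id, {"name": obj_name, "tags": []})
        if tag_name is not None:  # objects without tags yield one NULL tag row
            obj["tags"].append(tag_name)
    return objects


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbObject (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE dbTags (dbTagId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE dbTagConnection (TagId INTEGER, PointId INTEGER);
    INSERT INTO dbObject VALUES (1, 'first'), (2, 'second');
    INSERT INTO dbTags VALUES (1, 'red'), (2, 'blue');
    INSERT INTO dbTagConnection VALUES (1, 1), (2, 1);
""")
grouped = load_objects_with_tags(conn)
print(grouped)
```

This works well for one list type; joining several one-to-many lists in the same query multiplies the rows, which is where separate result sets per list become the better fit.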
Have you thought about using the entity framework? You would then interact with your database in the same way as you would interact with any other type of class in your application.
It's really simple to set up and you would create the relationships between your database tables in the entity designer - this will give you all the foreign keys you need to call related objects. If you have all your keys set up in the database then the entity designer will use these instead - creating all the objects is as simple as selecting 'Create model from database' and when you make changes to your database you simply right-click in your designer and choose 'update model from database'
The framework takes care of all the SQL for you - so you don't need to worry about that; in most cases..
A great starting place to get up and running with this would be here, and here
Once you have it all set up you can use LINQ to easily query the database.
You will find this a lot more efficient than going down the table adapter route (assuming that's what you're doing at the moment?)
Sorry if i missed something and you're already using this.. :)
As far as I can guess, your database already exists and you are familiar enough with SQL.
You might want to use a micro ORM, like PetaPoco.
To use it, you write classes that match the tables in your database (there are T4 generators that do this automatically in Visual Studio 2010). You can then write wrappers to create richer business objects (ValueInjecter can do this; it is the simplest tool I have used), or use the classes as they are.
PetaPoco handles insert / update operations, and it retrieves generated IDs automatically.
Because PetaPoco handles multiple relationships too, it seems to fit your requirements.

The question about the basics of LINQ to SQL

I just started learning LINQ to SQL, and so far I'm impressed with the ease of use and good performance.
I used to think that when doing LINQ queries like
from Customer in DB.Customers where Customer.Age > 30 select Customer
LINQ gets all customers from the database ("SELECT * FROM Customers"), moves them to the Customers array and then makes a search in that Array using .NET methods. This is very inefficient, what if there are hundreds of thousands of customers in the database? Making such big SELECT queries would kill the web application.
Now after experiencing how actually fast LINQ to SQL is, I start to suspect that when doing that query I just wrote, LINQ somehow converts it to a SQL Query string
SELECT * FROM Customers WHERE Age > 30
And only when necessary it will run the query.
So my question is: am I right? And when is the query actually run?
The reason why I'm asking is not only because I want to understand how it works in order to build good optimized applications, but because I came across the following problem.
I have 2 tables, one of them is Books, the other has information on how many books were sold on certain days. My goal is to select books that had at least 50 sales/day in past 10 days. It's done with this simple query:
from Book in DB.Books where (from Sale in DB.Sales where Sale.SalesAmount >= 50 && Sale.DateOfSale >= DateTime.Now.AddDays(-10) select Sale.BookID).Contains(Book.ID) select Book
The point is, I have to use the checking part in several queries and I decided to create an array with IDs of all popular books:
var popularBooksIDs = from Sale in DB.Sales where Sale.SalesAmount >= 50 && Sale.DateOfSale >= DateTime.Now.AddDays(-10) select Sale.BookID;
BUT when I try to do the query now:
from Book in DB.Books where popularBooksIDs.Contains(Book.ID) select Book
It doesn't work! That's why I think that we can't use these kinds of shortcuts in LINQ to SQL queries, just as we can't use them in real SQL. We have to create straightforward queries, am I right?
You are correct. LINQ to SQL does create the actual SQL to retrieve your results.
As for your shortcuts, there are ways to work around the limitations:
var popularBooksIds = DB.Sales
.Where(s => s.SalesAmount >= 50
&& s.DateOfSale >= DateTime.Now.AddDays(-10))
.Select(s => s.BookID) // select the book's ID, not the sale's
.ToList();
// Should work.
// Forces the Books table into memory and then uses LINQ to Objects for the query
var popularBooksSelect = DB.Books
.ToList()
.Where(b => popularBooksIds.Contains(b.ID));
Yes, the query gets translated to a SQL string, and the underlying SQL can differ depending on what you are trying to do, so you have to be careful in that regard. Check out a tool called LINQPad: you can try your query in it and see the SQL that is executed.
Also, the query runs when you iterate through the collection or call a method on it like ToList().
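Deferred execution is easier to see in a stripped-down model. The sketch below is a Python analogy, not LINQ to SQL: LazyQuery is a made-up class that only accumulates WHERE clauses and sends SQL when it is iterated, and sqlite3's trace callback records what actually reaches the database. The Customers columns are assumptions.

```python
import sqlite3


class LazyQuery:
    """Collects WHERE clauses; touches the database only when iterated."""

    def __init__(self, conn, table, clauses=(), params=()):
        self.conn, self.table = conn, table
        self.clauses, self.params = tuple(clauses), tuple(params)

    def where(self, clause, *params):
        # Returns a new query object; nothing is executed here.
        return LazyQuery(self.conn, self.table,
                         self.clauses + (clause,), self.params + params)

    def sql(self):
        text = f"SELECT * FROM {self.table}"
        if self.clauses:
            text += " WHERE " + " AND ".join(self.clauses)
        return text

    def __iter__(self):
        # Iteration is the point at which the SQL is finally sent.
        return iter(self.conn.execute(self.sql(), self.params).fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann', 25), (2, 'Bob', 41);
""")
executed = []
conn.set_trace_callback(executed.append)  # log every statement SQLite runs

query = LazyQuery(conn, "Customers").where("Age > ?", 30)
count_before = len(executed)  # building the query ran nothing
rows = list(query)            # like ToList(): this triggers execution
count_after = len(executed)
print(query.sql(), rows)
```

The filter is applied by the database, so only matching rows cross the wire, which is the behaviour the question suspected.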
Entity framework or linq queries can be tricky sometimes. Sometimes you are surprised at the efficiency of the sql query generated and sometimes the query is so complicated and inefficient that you would smack your forehead.
Best idea is that if you have any suspicions about a query, run an sql profiler at the backend that would monitor all the queries coming in. That way you know exactly what is being passed on to the sql server and correct any inefficiencies if need be.
http://damieng.com/blog/2008/07/30/linq-to-sql-log-to-debug-window-file-memory-or-multiple-writers
This will help you see which queries are run and when. Also, Damien's blog is full of other LINQ to SQL goodness.
You can generate an EXISTS clause by using the .Any method. I have had more success that way than with IN clauses, because with IN clauses LINQ to SQL tends to retrieve all the data and pass it back in as query parameters.
In LINQ to SQL, IQueryable expression fragments can be combined to create a single query. It keeps everything as an IQueryable for as long as it can, until you do something that cannot be expressed in SQL. When you call ToList, you are directly asking it to resolve that query into an IEnumerable stored in memory.
In most cases you are better off not selecting the book ids in advance. Keep the fragment for popular books in a single place in the code and use it when necessary, to build on another query. An IQueryable is just an expression tree, which is resolved into SQL at some other point.
If you think your application will perform better by storing the popular books elsewhere (memcache or whatever), then you may consider pulling them out before hand, and checking against that later. This will mean each book id will be passed in as a sproc parameter and used in an IN clause.
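The idea of keeping the popular-books fragment in one place and composing it into larger queries can be illustrated at the SQL level. This is a rough Python and sqlite3 analogy, not what LINQ to SQL generates; the Title column, the ISO-string dates and the function names are assumptions. The fragment stays SQL text and ends up as a subquery, in either IN or EXISTS form, so the IDs never travel to the client.

```python
import sqlite3

# One reusable fragment; it is never executed on its own.
POPULAR_BOOK_IDS = (
    "SELECT s.BookID FROM Sales s "
    "WHERE s.SalesAmount >= 50 AND s.DateOfSale >= ?"
)


def popular_books_in(conn, since):
    sql = (
        "SELECT b.ID, b.Title FROM Books b "
        f"WHERE b.ID IN ({POPULAR_BOOK_IDS}) ORDER BY b.ID"
    )
    return conn.execute(sql, (since,)).fetchall()


def popular_books_exists(conn, since):
    # Same fragment, correlated to the outer row, as an EXISTS subquery.
    sql = (
        "SELECT b.ID, b.Title FROM Books b WHERE EXISTS ("
        f"{POPULAR_BOOK_IDS} AND s.BookID = b.ID) ORDER BY b.ID"
    )
    return conn.execute(sql, (since,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Sales (BookID INTEGER, SalesAmount INTEGER, DateOfSale TEXT);
    INSERT INTO Books VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO Sales VALUES
        (1, 60, '2024-01-09'), (2, 10, '2024-01-09'), (3, 80, '2023-12-01');
""")
via_in = popular_books_in(conn, "2024-01-01")
via_exists = popular_books_exists(conn, "2024-01-01")
print(via_in, via_exists)
```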

Hierarchical Database Select / Insert Statement (SQL Server)

I recently stumbled upon a problem with selecting relationship details from one table and inserting them into another table. I hope someone can help.
I have a table structure as follows, e.g.:
ID (PK)  Name       ParentID
1        Myname     0
2        nametwo    1
3        namethree  2
This is the table I need to select from to get all the relationship data. There could be an unlimited number of sub-links (is there a function I can create to do the looping?).
Once I have all the data, I need to insert it into another table. The IDs will have to change because they must go in order (for example, ID "2" cannot be a child of ID 3). I am hoping I can use the same function for both the selecting and the inserting.
If you are using SQL Server 2005 or above, you may use recursive queries to get your information. Here is an example:
With tree (id, Name, ParentID, [level])
As (
Select id, Name, ParentID, 1
From [myTable]
Where ParentID = 0
Union All
Select child.id
,child.Name
,child.ParentID
,parent.[level] + 1 As [level]
From [myTable] As [child]
Inner Join [tree] As [parent]
On [child].ParentID = [parent].id)
Select * From [tree];
This query will return the row requested by the first portion (Where ParentID = 0) and all sub-rows recursively. Does this help you?
I'm not sure I understand what you want to have happen with your insert. Can you provide more information in terms of the expected result when you are done?
Good luck!
For the retrieval part, you can take a look at Common Table Expression. This feature can provide recursive operation using SQL.
For the insertion part, you can use the CTE above to regenerate the ID, and insert accordingly.
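Both steps can be tried out without a SQL Server instance. The sketch below uses Python with sqlite3, whose WITH RECURSIVE syntax differs slightly from T-SQL; the target table newTable and the function copy_renumbered are hypothetical. Rows are read parents-first, so each row's new parent ID is already known when the row is inserted.

```python
import sqlite3

# Recursive CTE: the anchor selects the roots, the recursive part adds children.
TREE_SQL = """
WITH RECURSIVE tree(id, Name, ParentID, depth) AS (
    SELECT id, Name, ParentID, 1 FROM myTable WHERE ParentID = 0
    UNION ALL
    SELECT child.id, child.Name, child.ParentID, parent.depth + 1
    FROM myTable AS child
    JOIN tree AS parent ON child.ParentID = parent.id
)
SELECT id, Name, ParentID FROM tree ORDER BY depth, id
"""


def copy_renumbered(conn):
    """Copy the tree into newTable with sequential IDs, parents before children."""
    new_ids = {0: 0}  # old ID -> new ID; 0 keeps meaning "no parent"
    for old_id, name, old_parent in conn.execute(TREE_SQL).fetchall():
        new_id = len(new_ids)  # 1, 2, 3, ...
        new_ids[old_id] = new_id
        conn.execute(
            "INSERT INTO newTable (ID, Name, ParentID) VALUES (?, ?, ?)",
            (new_id, name, new_ids[old_parent]),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER PRIMARY KEY, Name TEXT, ParentID INTEGER);
    CREATE TABLE newTable (ID INTEGER PRIMARY KEY, Name TEXT, ParentID INTEGER);
    -- IDs deliberately out of order: the child (2) has a larger parent ID (5)
    INSERT INTO myTable VALUES (5, 'Myname', 0), (2, 'nametwo', 5), (9, 'namethree', 2);
""")
copy_renumbered(conn)
copied = conn.execute("SELECT ID, Name, ParentID FROM newTable ORDER BY ID").fetchall()
print(copied)
```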
I hope this URL helps: Self-Joins in SQL
This is the problem of finding the transitive closure of a graph in sql. SQL does not support this directly, which leaves you with three common strategies:
use a vendor specific SQL extension
store the Materialized Path from the root to the given node in each row
store the Nested Sets, that is the interval covered by the subtree rooted at a given node when nodes are labeled depth first
The first option is straightforward, and if you don't need database portability is probably the best. The second and third options have the advantage of being plain SQL, but require maintaining some de-normalized state. Updating a table that uses materialized paths is simple, but for fast queries your database must support indexes for prefix queries on string values. Nested sets avoid needing any string indexing features, but can require updating a lot of rows as you insert or remove nodes.
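As a small illustration of the materialized-path option, the sketch below stores each node's path from the root and fetches a subtree with a prefix match. It uses Python with sqlite3; the nodes table, the "/1/2/" path format and the subtree_ids function are assumptions for the example.

```python
import sqlite3


def subtree_ids(conn, root_path):
    """Return the IDs of the node at root_path and all of its descendants."""
    rows = conn.execute(
        "SELECT id FROM nodes WHERE path LIKE ? ORDER BY path",
        (root_path + "%",),  # prefix match on the stored path
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT, path TEXT);
    -- Each path lists the IDs from the root down to the node itself.
    INSERT INTO nodes VALUES
        (1, 'Myname',    '/1/'),
        (2, 'nametwo',   '/1/2/'),
        (3, 'namethree', '/1/2/3/'),
        (4, 'other',     '/4/');
""")
below_two = subtree_ids(conn, "/1/2/")
below_one = subtree_ids(conn, "/1/")
print(below_two, below_one)
```

The trailing slash in each path keeps a prefix such as "/1/" from matching an unrelated node like "/10/".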
If you're fine with always using MSSQL, I'd use the vendor specific option Adrian mentioned.

How to Get Last Created Entry's ID From Sql Database With Asp.Net

I will explain the problem with an example:
There are two tables in my database, named entry and tags.
Both tables have a column named ID_ENTRY. When I add a record to the entry table, I have to take the ID_ENTRY of the record just added and insert it into the tags table. How can I do this?
The only way to do this is with multiple statements. Using dynamic SQL, you can do this by separating each statement in your query string with a semicolon:
"DECLARE @ID int;INSERT INTO [Entry] (...) VALUES ...; SELECT @ID = scope_identity();INSERT INTO [TAGS] (ID_ENTRY) VALUES (@ID);"
Make sure you put this in a transaction to protect against concurrency problems and keep it all atomic. You could also break that up into two separate queries to return the new ID value in the middle if you want; just make sure both queries are in the same transaction.
Also: you are using parameterized queries with your dynamic sql, right? If you're not, I'll personally come over there and smack you 10,000 times with a wet noodle until you repent of your insecure ways.
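The same pattern (insert, read back the generated key, insert the dependent row, all in one transaction and with parameters) can be sketched outside ASP.NET. The example below uses Python with sqlite3, where cursor.lastrowid plays the role that SCOPE_IDENTITY() plays in SQL Server; the title and tag columns and the function name add_entry_with_tag are assumptions.

```python
import sqlite3


def add_entry_with_tag(conn, title, tag):
    """Insert an entry and a tag that references it, atomically."""
    with conn:  # commits on success, rolls back if either insert raises
        cur = conn.execute("INSERT INTO entry (title) VALUES (?)", (title,))
        entry_id = cur.lastrowid  # key generated by this connection's insert
        conn.execute(
            "INSERT INTO tags (ID_ENTRY, tag) VALUES (?, ?)", (entry_id, tag)
        )
    return entry_id


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (ID_ENTRY INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT);
    CREATE TABLE tags (ID_ENTRY INTEGER REFERENCES entry(ID_ENTRY), tag TEXT);
""")
first_id = add_entry_with_tag(conn, "hello", "greeting")
second_id = add_entry_with_tag(conn, "bye", "farewell")
tag_rows = conn.execute("SELECT ID_ENTRY, tag FROM tags ORDER BY ID_ENTRY").fetchall()
print(first_id, second_id, tag_rows)
```

Values are passed as parameters rather than concatenated into the SQL string, which is the safeguard the answer above insists on.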
Immediately after executing the insert statement on the first table, query @@IDENTITY with "SELECT @@IDENTITY". That retrieves the last auto-generated ID, which you can then insert into the second table.
If you are using triggers or anything else that inserts rows, this may not work. Use SCOPE_IDENTITY() instead of @@IDENTITY.
I would probably do this with an INSERT trigger on the named entry table, if you have all of the data you need to push to the tags table available. If not, then you might want to consider using a stored procedure that creates both inside a transaction.
If you want to do it in code, you'll need to be more specific about how you are managing your data. Are you using DataAdapter, DataTables, LINQ, NHibernate, ...? Essentially, you need to wrap both inserts inside a transaction of some sort so that either both inserts are executed or neither is, but the means of doing that depend on the technology you are using to interact with the database.
If you are using dynamic SQL, why not use LINQ with Entity Framework instead? EF is now the data access technology recommended by Microsoft (see the post "Clarifying the message on L2S Futures" on the ADO.NET team blog), and when you insert with EF the new identity value is available to you automatically. I use it all the time; it's easy.
Hope this helps!
Ray.

Resources