ASP.NET and SQL server with huge data sets - asp.net

I am developing a web application in ASP.NET and on one page I am using a ListView with paging. As a test I populated the table it draws from with 6 million rows.
The table and a schema-bound view based off it have all the necessary indexes and executing the query in SQL Server Management Studio with SELECT TOP 5 returned in < 1 second as expected.
But on the ASP.NET page, with the same query, it seems to be selecting all 6 million rows without any limit. Shouldn't the paging control limit the query to return only N rows rather than the entire data set? How can I use these ASP.NET controls to handle huge data sets with millions of records? Does SELECT [columns] FROM [tablename] quite literally mean that for the ListView, and it doesn't actually inject a TOP <n> and does all the pagination at the application level rather than the database level?

When you enable paging, the paging control, datasource, grid, etc. will limit the number of rows displayed by the control. However, they definitely will not limit the number of rows returned by the SELECT statement.
You should be using DataObjectSource as the control's data source and have it call a class method that executes a SELECT statement that only returns the necessary rows. Otherwise, your performance will be horrible.
Unfortunately, SQL Server doesn't support any type of RANGE clause and the required SQL isn't very pretty. But it is absolutely necessary.
http://www.asp.net/data-access/tutorials/efficiently-paging-through-large-amounts-of-data-cs

Related

Using a statically assigned application item as SQL in Oracle APEX

In my Oracle Apex 19.3 application I have a SQL statement that needs to be used on several pages and changes slightly based on the user that is logged in. So that I do not need to duplicate this code over and over on each page I generate this statement as an application item called: QUERY_BASED_ON_USER.
An application computation then statically sets it to SELECT j.* FROM table(pkg_jobstatus.report()) j WHERE j.id IN (:USERIDS)
(USERIDS is a separate application item)
I wish to use the application item QUERY_BASED_ON_USER as the sql statement for a table. When setting the data source to PL/SQL and using the following code,
BEGIN
return :QUERY_BASED_ON_USER;
END;
I get this error: PL/SQL function body did not return a value.
I tried debugging this by settings a static page region to: &QUERY_BASED_ON_USER. and it outputs the query correctly.
My assumption is that the code editor does not evaluate the application computation and thus it returns an empty string, which it then refuses to validate or save. But I do not know how to validate this or how to work around this.
How can I use the application item as the sql statement?
You need to set "Use Generic Column Names" to true, and specify the number of columns your query will return:
Then the query is not parsed until runtime, when the item value is available.

Teradata: Is there a way to generate DDL from a view or select statement?

I am using a global application user account to access database A. This user account does not have permissions to modify database A's schema (ie, create tables, modify tables, etc). This user also has access to database B, but only views. I need to run SQL to feed data from a view in database B into a table in database A.
In a perfect world, I would be able to use this SQL:
create database_a.mytable as (select * from database_b) with no data
However, the user can't create tables in database A. If I could get the DDL of the select statement then I could log in under my personal account (which doesn't have any access to database B) and run the DDL in database A to create the table.
The only other option is to manually write the SQL, but I don't want to do that, especially since this view I am wanting to copy has many columns of varying data types and sizes.
Edit: I may be getting closer. I just experimented with this:
show (select * from database_b.myview)
However, it generated the DLL of every single table that is used in the view itself, as well as the definition for the view. This doesn't really help me since I just want the schema of the select statement itself. In other words, I need what would be generated if I were to use the create table as statement mentioned above.
Edit for Rob: Perhaps "DDL" was the wrong term to use. Using show view db.myview just shows the definition of the view, not the schema it represents. In my above example of create table as, I show how you can create a table that mimics the schema of a result set returned in a select. It generates a DDL on the back end for creating a table and then executes that DDL to actually create the table. You can then say show table db.newtable and see the new table's DDL. I want to get that DDL directly from a select statement so that I can copy it, log out of the app account, into my personal account, and then execute the DDL to create the table.
This is only to save me the headache of having to type out the DDL manually by hand to save time and reduce typing errors, especially since the source view has so many columns. That said, I think hitting up the DBA or writing some snazzy stored procedure to do dynamic stuff would be a bit over the top for my needs. I think there has to be a way to get the DDL for creating a table schema directly from a select statement.
Generate DDL Statements for objects:
SHOW TABLE {DatabaseB}.{Table1};
SHOW VIEW {DatabaseB}.{View1};
Breakdown of columns in a view:
HELP VIEW {DatabaseB}.{View1};
However, without the ability to create the object in the target database DatabaseA your don't have much leverage. Obviously, if the object already existed INSERT INTO SELECT ... FROM DatabaseB.Table1 or MERGE INTO would be options that you already explored.
Alternative Solution
Would it be possible to have a stored procedure created that dynamically created the table based on the view name that is provided? The global application account would simply need privilege to execute the procedure. Generally the user creating the stored procedure would need the permissions to perform the actions contained within the stored procedure. (You have some additional flexibility with this in Teradata 13.10.)
There are some caveats with this approach. You are attempting to materialize views that could reference anywhere from hundreds to billions of records. These aren't simple 1:1 views that are put on top of the target tables. Trying to determine the required space in the target database to materialize the view will be difficult. Performance can and will vary depending on the complexity of the view and the data volumes. This will not be a fast-path or data block optimized operation.
As a DBA, I would be concerned with this approach being taken on by a global application account without fully understanding the intent. I trust you have an open line of communication with the DBA(s) involved for supporting this system. I'm sure there are reasons for your madness that can't be disclosed here.
Possible Solution - VOLATILE TABLE
Unless the implicit privilege for CREATE TABLE has been revoked from the global application account this solution should work.
Volatile tables do not require perm space. There table definitions persist for the duration of the session and any data inserted into them relies on the spool space of the user who instantiated it.
CREATE VOLATILE TABLE {Global Application UserID}.{TableA_Copy} AS
(
SELECT *
FROM {DatabaseB}.{TableA}
)
WITH NO DATA
NO PRIMARY INDEX
ON COMMIT PRESERVE ROWS;
SHOW TABLE {Global Application UserID}.{TableA_Copy};
I opted to use a Teradata 13.10 feature called NO PRIMARY INDEX. By default, CREATE TABLE AS will take the first column of the SELECT statement and make it the PRIMARY INDEX of the table. This could lead to skewing and perm space issues in your testing depending on the data demographics. You can specify an explicit PRIMARY INDEX on your own as you understand the underlying data. (See the DDL manuals for details on the syntax if you're uncertain.)
The use of ON COMMIT PRESERVE ROWS for the intent of this example is probably extraneous. But in reality if you popped any data into that table for testing this clause would be beneficial in Teradata mode as the data would otherwise be lost immediately after the CREATE TABLE or any other data manipulation was performed against the volatile table.

Microsoft Indexing Service and OLEDB With Paging

I have an ASP.net 2.0 intranet site that uses the indexing service on a folder and its contents.
OLEDB is used to query the files in this folder by using the same technique as discussed here.
This was written by another developer but i am starting to understand his way of working.
But now the clients are complaining about the long loadtime of the page because all files in the folder are queried at once. They are right about the fact that it's slow so i considered using paging (Like in linq Skip().Take()). I know that in SQL this translates as:
SELECT col1, col2
FROM
(
SELECT col1, col2, ROW_NUMBER() OVER (ORDER BY ID) AS RowNum
FROM MyTable
)
AS MyDerivedTable
WHERE MyDerivedTable.RowNum BETWEEN #startRow AND #endRow
But for some reason this does not work when used with OLEDB.
Which version of SQL does this use or do any of you got a suggestiong on how to implement the paging?
EDIT:
Because the above method is only available when using sql Server 2005 or higher, i am going to try a method prior to 2005. I think OLEDB doesn't support Row_Number() or Over.
Going to try:
SELECT ... FROM Table WHERE PK IN
(SELECT TOP #PageSize PK FROM Table WHERE PK NOT IN
(SELECT TOP #StartRow PK FROM Table ORDER BY SortColumn)
ORDER BY SortColumn)
ORDER BY SortColumn
Seems like MSIDXS doesn't support much SQL functions.
Only the basics like "Select", "Where", "Order by" works. The other functions like "Top", "Rowcount", "Over" don't work. It even fails on "Count(*)".
I implemented paging by using the DataAdapter.Fill() method with 2 integers; startrecord and maxrecord. This is not ideal but the best in this case solution.
Now all records will be collected but only those i need will be stored in the dataset which then is converted to a collection of my own class.
This works fast for the first pages because only the first rows will be looped and returned. But when you have 20 pages the last page will take longer because all the records before it will be looped.
I tested this with a page size of 20 and 400 results.
The first page took 200ms while the last page took around 1,6 seconds.
A noticeable lag but now it only takes place on the last pages and not on the first 10.
There is a search and sorting mechanism so the last pages won't be visited that much.

Paging design recommendation for asp.net and sqlserver 2005

I am relatively new to programming. My work basically revolves around data and analysis. I want to create a simple asp.net page which shows huge chunk of data from the database. There could be a millions of rows of data which is used for different kinds of analysis/searchin/filtering etc..
Should I write paging logic at the front end or at the back-end (in this case SQL Server 2005)?
What would be the best practice around this? Your suggestions/links to resources in this direction is greatly appreciated.
please use this example Building Custom Paging with LINQ, ListView, DataPager and ObjectDataSource
Paging of Large Resultsets in ASP.NET
ListView and DataPager
Custom paging in ASP.NET with ListView & DataPager
Implementing Custom Paging in ASP.NET with SQL Server 2005
You may be interested in this...
Paging of Large resultset in asp.net
I would suggest you create a stored procedure to query and page your data. Linq To SQL is a fast an easy way to execute the stp.
Simple example of stored procedure to take care of paging:
CREATE PROCEDURE [dbo].[stp_PagingSample]
(
#page int,
#pagesize int
)
AS
WITH Numbered AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY ID) AS 'RowNumber'
FROM tbl_YourTable
)
SELECT *
FROM Numbered
WHERE RowNumber BETWEEN ((#page - 1) * #pagesize) + 1 AND (#page * #pagesize);
The stored procedure is the tricky part. But drop a comment if you would like me to add more sample code executing the stp and rendering the data... :)

Without stored procedures, how do you page result sets in ASP.NET?

Without stored procedures, how do you page result sets retrieved from SQL Server in ASP.NET?
You could use LINQ, for instance:
var customerPage = dataContext.Customers.Skip(50).Take(25);
and then display those 25 customers.
See Scott Guthrie's excellent Using LINQ-to-SQL - section 6 - retrieve products with server side paging.
Another option (on SQL Server 2005 and up) is to use ordered CTE's (Common Table Expression) - something like this:
WITH CustomerCTE AS
(
SELECT CustomerID,
ROW_NUMBER() OVER (ORDER BY CustomerID DESC) AS 'RowNum'
FROM Customers
)
SELECT * FROM CustomerCTE
WHERE rownum BETWEEN 150 AND 200
You basically define a CTE over your sort critiera using the ROW_NUMBER function, and then you can pick any number of those at will (here: those between 150 and 200). This is very efficient and very useful server-side paging. Join this CTE with your actual data tables and you can retrieve anything you need!
Marc
PS: okay, so the OP only has SQL Server 2000 at hand, so the CTE won't work :-(
If you cannot update to either SQL Server 2005, or .NET 3.5, I'm afraid your only viable option really is stored procedures. You could do something like this - see this blog post Efficient and DYNAMIC Server-Side paging with SQL Server 2000, or Paging with SQL Server Stored Procedures
The best is to use an ORM which will generate dynamic paging code for you - LINQ To SQL, NHibernate, Entity Framework, SubSonic, etc.
If you have a small result set, you can page on the server using either DataPager, PagedDataSource, or manually using LINQ Skip and Take commands.
(new answer since you're using SQL Server 2000, .NET 2.0, and don't want to use an ORM)
There are two ways to handle paging in SQL Server 2000:
If you have a ID column that's sequential with no holes, you can execute a SQL string that says something like SELECT Name, Title FROM Customers WHERE CustomerID BETWEEN #low and #high - #low and #high being parameters which are calculated based on the page size and page that you're on. More info on that here.
If you don't have a sequential ID, you end up using a minimum ID and ##rowcount to select a range. For instance, SET ##rowcount 20; SELECT Name, Title FROM Customers WHERE CustomerID > #low' - either calculating #low from the page size and page or from the last displayed CustomerID. There's some info on that approach here.
If you have a small dataset, you can page through it in .NET code, but it's less efficient. I'd recommend the PagedDataSource, but if you want to write it yourself you can just read your records from a SqlDataReader into an Array and then use the Array.Range function to page through it.
This is how I handled all of my paging and sorting with AJAX in my ASP.NET 2.0 application.
http://programming.top54u.com/post/AJAX-GridView-Paging-and-Sorting-using-C-sharp-in-ASP-Net.aspx
Well my general approach is usually to create two tables for the results to be paged. The first is an info table that has a search id identity column and has the min and max row numbers. The second table contains the actual results and has an identity column for the row number. I insert into the second table and get the min and max rows and store them in the first table. Then I can page through by selecting just the rows I want. I usually expire the results after 24 hours by using code right before the insert. I usually use a stored procedure to do the inserts for me but you could do it without a stored procedure.
This has the advantage of only performing the more complex sql search once. And the dataset won't change between page displays. It is a snapshot of the data. It also can ease server side sorting. I just have to select those rows in order and reinsert into the second table.

Resources