I am using data from multiple tables (6 tables) across multiple schemas to show a grid on the launch of an application. Due to many outer joins it is taking very long to load the initial page. How can I increase the efficiency of my program?
There are many ways to optimize:
You could start paginating the query results, i.e. show only a few results per page and let the user click a page number to see the rest (see the sketch below).
Optimize your query. There are tools like the query execution plan that can give you an idea about the performance of the query. Also, preferably use plain SQL queries and avoid stored procedures.
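For the pagination suggestion, a minimal sketch in T-SQL - assuming SQL Server 2012+ and a hypothetical dbo.Products table; on older versions you would use ROW_NUMBER() instead of OFFSET/FETCH:

DECLARE @PageIndex int = 3, @PageSize int = 20;

-- Only the requested page crosses the wire, instead of the whole join result.
SELECT p.ProductId, p.Name, p.Price
FROM dbo.Products AS p
ORDER BY p.Name
OFFSET (@PageIndex - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;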
I am exploring using ADX as a time-series data store for sensor metrics. Our current solution stores data in MSSQL, and I'm testing ADX as an alternative. I was able to set up data ingestion and I can perform basic queries, and with the added aggregation functions, computing insights and statistics seems to be much faster.
As part of the solution, we have an API data access layer used by clients and our web portal to query data for display and analysis. I am currently transforming the MSSQL queries to their KQL versions and I'm hitting a stumbling block on data pagination.
We have a function to query historical data using a combination of:
a start/end date,
a device identifier,
and some paging options
records per page,
current page,
column sorting / additional filtering
Currently this is handled in a SQL stored procedure on the back-end: first get the total number of records and pages (set as an output on the API so the front-end can use it in the table view), then get the records based on the input parameters and pagination details and return the record set - quite straightforward.
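For reference, a minimal T-SQL sketch of that SP shape (table and parameter names are made up) - this is the pattern I'm trying to translate:

-- 1) Total record count, set as an output parameter for the pager.
SELECT @TotalRecords = COUNT(*)
FROM dbo.SensorReadings
WHERE DeviceId = @DeviceId AND Timestamp BETWEEN @StartDate AND @EndDate;

-- 2) The requested page.
SELECT Timestamp, DeviceId, Value
FROM dbo.SensorReadings
WHERE DeviceId = @DeviceId AND Timestamp BETWEEN @StartDate AND @EndDate
ORDER BY Timestamp
OFFSET (@CurrentPage - 1) * @RecordsPerPage ROWS
FETCH NEXT @RecordsPerPage ROWS ONLY;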
Any suggestions on how to achieve effective pagination using ADX/KQL?
I found a section in the docs on pagination of stored query results, but as the queries are dynamic based on user input, this does not sound like a viable option.
When you paginate (for example, viewing results 21-30) you need to consider whether you are taking a snapshot of the result and paging through it, or viewing live data. If you expect new rows coming in not to affect your pagination, then a stored query result is that snapshot. Once you generate it you can select specific rows from it based on your page calculation.
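A rough sketch of that snapshot approach in KQL - the Telemetry table and column names here are made up, and note that stored query results expire automatically (24 hours by default, configurable with the expiresAfter option):

// Materialize the snapshot once, with a stable row number for paging.
.set stored_query_result DevicePage with (previewCount = 10) <|
Telemetry
| where Timestamp between (datetime(2024-01-01) .. datetime(2024-01-31))
| where DeviceId == "dev-42"
| sort by Timestamp asc
| extend Num = row_number()

// Then serve any page from the snapshot.
stored_query_result("DevicePage")
| where Num between (21 .. 30)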
I am using WebSQL to store data in a PhoneGap application. One of the tables has a lot of data, say from 2,000 to 10,000 rows. So when I read from this table, with just a simple select statement, it is very slow. I debugged and found that as the size of the table increases, the performance decreases exponentially. I read somewhere that to get performance you have to divide the table into smaller chunks - is that possible, and how?
One idea is to look for something to group the rows by and consider breaking them into separate tables based on some common category - instead of one shared table for everything.
I would also consider fine-tuning the queries to make sure they are optimal for the given table.
Make sure you're not just running a simple SELECT without a WHERE clause to limit the result set.
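WebSQL is SQLite underneath, so an index on the filtered/sorted columns plus a LIMIT usually matters far more than the raw row count. A minimal sketch with made-up names:

-- One-time: index the columns you filter and sort on.
CREATE INDEX IF NOT EXISTS idx_readings_device ON readings (device_id, created_at);

-- Per query: filter and page instead of reading the whole table.
SELECT id, device_id, value, created_at
FROM readings
WHERE device_id = ?
ORDER BY created_at DESC
LIMIT 50 OFFSET 0;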
I am currently working with EF4, and in one of my scenarios I am using a join to retrieve data, but the result set is so large that EF4 even fails to generate the query plan. As a workaround I tried loading the data into simple generic lists (selecting all data from both tables) and then joining the two lists, but I still get an OutOfMemoryException, as one table contains around 100k records and the second contains 50k. I want to join them in a query, but still no luck with EF. Please suggest a workaround for this.
I can't think of any scenario where you would need a result set containing 100k+ records. It may not be the answer you want, but the best way to improve performance is to reduce the number of records you're dealing with.
What we did was write custom SQL and execute it with Context.Database.SqlQuery(sql, params).
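The point of the custom SQL was to do the join and filtering on the server so only the rows we actually needed came back. Roughly (table names are made up), with the statement passed to Context.Database.SqlQuery:

-- Join server-side and cap the result instead of loading both tables into memory.
SELECT TOP (100) o.OrderId, o.OrderDate, c.CustomerName
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId
WHERE o.OrderDate >= @startDate
ORDER BY o.OrderDate DESC;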
I am building UI for a large product catalog (millions of products).
I am using Sql Server, FreeText search and ASP.NET MVC.
Tables are normalized and indexed. Most queries take less than a second to return.
The issue is this. Let's say a user searches by keyword. On the search results page I need to display/query for:
Display 20 matching products on the first page (paged, sorted)
Total count of matching products for paging
List of just the distinct stores across all matching products
List of just the distinct brands across all matching products
List of just the distinct colors across all matching products
Each query takes about 0.5 to 1 second, so altogether it is about 5 seconds.
I would like to get the whole page to load under 1 second.
There are several approaches:
Optimize queries even more. I already spent a lot of time on this one, so I'm not sure it can be pushed further.
Load products first, then load the rest of the information using AJAX. More of a workaround; it will need a UI revision.
Re-organize data to be more Report friendly. Already aggregated a lot of fields.
I checked out several similar sites, for example zappos.com. Not only do they display the same information I would like in under 1 second, they also include statistics (the number of results in each category).
The following is the search for keyword "white"
http://www.zappos.com/white
How do sites like zappos, amazon make their results, filters and stats appear almost instantly?
So you asked specifically "how does Zappos.com do this". Here is the answer from our Search team.
An alternative idea for your issue would be a search index such as Solr. Basically, the way these work is you load your data set into the system and it does a huge amount of indexing up front. My projects include product catalogs with 200+ data points for each of the 140k products, and the average return time is less than 20ms.
The search indexing system I would recommend is Solr, which is based on Lucene. Both of these projects are open source and free to use.
Solr fits your described use case perfectly in that it can actually do all of those things in one query. You can use facets (essentially GROUP BY in SQL) to return the list of distinct values across all applicable results. In the case of keywords, it also allows you to search across multiple fields in one query without performance degradation.
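As a sketch, a single request like the following - assuming a core named products with brand, store, and color fields - returns the first 20 hits, the total match count, and all three facet lists at once:

http://localhost:8983/solr/products/select?q=white&start=0&rows=20&facet=true&facet.field=brand&facet.field=store&facet.field=color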
You could try replacing your aggregate queries with materialized indexed views of those aggregates. This will pre-compute all the aggregates and will be as fast as selecting any regular row data.
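A minimal sketch of one such indexed view in T-SQL, assuming a hypothetical dbo.Products table; the unique clustered index is what actually materializes the aggregate:

CREATE VIEW dbo.vw_BrandCounts
WITH SCHEMABINDING
AS
SELECT Brand, COUNT_BIG(*) AS ProductCount  -- COUNT_BIG is required in indexed views
FROM dbo.Products
GROUP BY Brand;
GO

CREATE UNIQUE CLUSTERED INDEX IX_vw_BrandCounts
ON dbo.vw_BrandCounts (Brand);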
0.5 sec is too long on appropriate hardware. I agree with Aaronaught: the first thing to do is convert it into a single SQL statement, or possibly a stored procedure, to ensure it's compiled only once.
Analyze your queries to see if you can create even better indexes (consider covering indexes), fine-tune existing indexes, and employ partitioning.
Make sure you have an appropriate hardware configuration - data, log, temp, and even index files should be located on independent spindles. Make sure you have enough RAM and CPUs, and I hope you are running a 64-bit platform.
After all this, if you still need more, analyze the most-used keywords and create aggregate result tables for the top 10 keywords.
About Amazon - they most likely use superior hardware and also take advantage of CDNs. They also have thousands of servers serving up the content with no performance bottlenecks - data is duplicated multiple times across several data centers.
As a completely separate approach, you may want to look into "in-memory" databases such as Caché - this is the fastest you can get on the DB side.
I am encountering a performance problem in the development of our current UI. The problem, I suppose, is a general one, though.
I have a page with a simple ASP.NET grid. The grid displays data from a table based on certain search criteria, and it has a fixed page size (say 10). There is a pager at the bottom which can be used to navigate between pages. On the back end, whenever the search button is pressed, a stored procedure is called which returns the desired data.
The stored procedure has parameters like currentPageIndex, pageSize, other search criteria, etc. Here is a sketch of the SP (table and column names are placeholders):
CREATE PROCEDURE dbo.GetPagedRecords @CurrentPageIndex int, @PageSize int
AS
-- Query the table in a CTE, do all the filtering, and calculate row numbers.
WITH Filtered AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY CreatedDate DESC) AS RowNum
    FROM dbo.Records -- plus WHERE <search criteria>
)
-- Return the record range computed from the current page index and page size.
SELECT * FROM Filtered
WHERE RowNum BETWEEN (@CurrentPageIndex - 1) * @PageSize + 1
                 AND @CurrentPageIndex * @PageSize;
I have the following problems/queries with this approach:
When Db table size is large (say 10 million records), performance degrades and this approach becomes impractical.
Is using a table variable or a temporary table more useful?
Is there any other efficient way to get paged data from database?
Hi Dan, the article provided new insight into calculating the total rows. Really helpful, thanks.
But still, is there a better way than using a CTE when the data is large?
Update: I have found a few other performant approaches for efficiently getting paged records.
There's a good article on SqlServerCentral called SQL Server 2005 Paging – The Holy Grail that shows a few techniques for server-side paging. You will need to register to view it, though.
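If I remember the article correctly, the core trick is deriving the total from two opposing ROW_NUMBER() sequences, so you avoid a separate COUNT(*) pass; roughly, with made-up table and column names:

WITH Paged AS (
    SELECT *,
           ROW_NUMBER() OVER (ORDER BY CreatedDate ASC)  AS SeqAsc,
           ROW_NUMBER() OVER (ORDER BY CreatedDate DESC) AS SeqDesc
    FROM dbo.Records
)
-- For any row, ascending position + descending position - 1 = total row count.
SELECT *, SeqAsc + SeqDesc - 1 AS TotalRows
FROM Paged
WHERE SeqAsc BETWEEN 21 AND 30;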
I know that for really large result sets, software like Google will simply estimate how many rows will be returned, bypassing the need to count all the returned rows.
Sorry, if I can't give more help.