I have a stored procedure that produces a number (let's say 50), which is rendered as an anchor with the number as its text. When the user clicks the number, a popup opens, calls a different stored procedure, and shows 50 rows in an HTML table. Those 50 rows are the disaggregation of the number the user clicked. In summary: two different aspx pages and two different stored procedures that need to show the same amount, where one is the aggregate and the other is the disaggregation of that aggregate.
Question: how do I test this so that I know there is an error somewhere whenever the two numbers do not match?
Note: this is a simplified example; in reality there are hundreds of anchor tags on the page.
This kind of testing falls outside the standard, code-level testing paradigm. Here you are explicitly validating the data, and it sounds like you need a utility to achieve this.
There are plenty of environments and approaches you can take, but here are two possible candidates:
SQL Management Studio: here you can write a simple script that runs through the various combinations from the two stored procedures, ensuring that the number and the rows match up (a sketch follows these two options). This will involve some inventive T-SQL, but nothing particularly taxing. The main advantage of this approach is that you'll have bare-metal access to the data.
Unit Testing: as mentioned, your problem is somewhat outside the typical testing scenario, where you would ordinarily mock the data and test into your business logic. However, that doesn't mean you cannot write the tests (especially if you are doing any DataSet manipulation prior to this processing). Check out this link and this one for various approaches (note: if you're using VS2008 or above, you get the Testing Projects built in from the Professional edition up).
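For the SQL Management Studio option, a minimal reconciliation sketch might look like the following. Every name in it is an assumption: dbo.GetSummaryCount and dbo.GetDetailRows stand in for your two stored procedures, @AccountId for whatever key drives each anchor, and the temp-table columns have to mirror what your procedures actually return.

    -- Assumed procedure names, parameter and columns; adjust to your schema.
    CREATE TABLE #summary (Total INT);
    CREATE TABLE #detail  (DetailId INT, Amount MONEY);  -- mirror the detail proc's result set

    INSERT INTO #summary EXEC dbo.GetSummaryCount @AccountId = 42;
    INSERT INTO #detail  EXEC dbo.GetDetailRows   @AccountId = 42;

    DECLARE @expected INT, @actual INT;
    SELECT @expected = Total    FROM #summary;
    SELECT @actual   = COUNT(*) FROM #detail;

    IF @expected <> @actual
        PRINT 'Mismatch for AccountId 42: summary = '
            + CAST(@expected AS VARCHAR(20))
            + ', detail rows = ' + CAST(@actual AS VARCHAR(20));

    DROP TABLE #summary, #detail;

To cover the hundreds of anchors on the real page, wrap the same comparison in a cursor or WHILE loop over the distinct key values and write each mismatch to a results table instead of PRINTing it.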
In order to test what happens when the numbers do not match, I would simply change (temporarily) one of the stored procedures to return the correct amount +1, or to always return zero, etc.
Related
I am starting a customer lifetime project at work and want to share how the data looks with the business, as I want to be able to identify the important variables with them. I plan to do this using the excellent rpivottable package, launching a Shiny app to see where there are basic differences between groups so I can select my features.
This would mean I take my customer base of 4 million customers and slice and dice them in a number of ways.
However, following GDPR we need to ensure no group is shown that has fewer than 7 customers in it. Therefore I need some kind of background calculation to ensure that groups of fewer than 7 customers are never shown.
If I think logically about this, the only way I can see it working would be to make a change to the pivot table via some form of submit button, so that the group sizes can be calculated, and then apply a filter (which needs to be hidden from the user so it cannot be switched off).
I know I should provide code, but I do not know where to start here. Has anyone had similar issues and found a potential solution to all or part of the problem?
Has anyone built a hidden filter into their rpivottable?
Has anyone been able to restrict their output to only show 90% of their data?
Thanks,
J
To be absolutely sure, you would need to load in a data frame that looks like "dim, dim, dim, count", where count is always at least 7. Basically, it is just a bit of preprocessing on your input data. Unfortunately, this means you will be restricted to a small number of coarse dimensions, or else you will end up filtering out everything.
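Your stack here is R/Shiny, but the preprocessing itself is just a grouped count with a minimum-size filter, whatever tool runs it. If the customer base already lives in a database, the shape of that step might look like the sketch below (dbo.Customers and the dimension columns Region, Segment and Tenure are purely illustrative); the result set is then what gets handed to rpivottable.

    -- Only groups of 7 or more customers ever leave the source system;
    -- the table and column names here are assumptions.
    SELECT  Region,
            Segment,
            Tenure,
            COUNT(*) AS CustomerCount
    FROM    dbo.Customers
    GROUP BY Region, Segment, Tenure
    HAVING  COUNT(*) >= 7;

This is also why the dimensions have to stay coarse: every column you want to pivot on must appear in the grouping, and the finer the combinations, the more of them fall under the threshold and disappear.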
At work we have just started using CodedUI. Our product has a lot of data grids, and while the CodedUI UIMap recorder is capable of picking out individual elements, it does not seem to be able to pick out collections of elements, such as returning a list of each cell in a column or row, or (even more usefully) a list of lists, so that you can navigate the data in a way that is sensitive to context. I may be interested, for instance, in checking that the fourth column is always equal to the sum of the second and third.
Is there any way to do this sort of search in CodedUI? So far the only search methods I have come across are those used by the UIMap recorder itself, which should only ever return a single object. Without this I am finding it difficult to make any tests that are particularly useful...
There is a method, UITestControl.FindMatchingControls, which returns a collection of elements satisfying the conditions you set.
As far as I understand it, this method gives you what you need.
I have a typical scenario that I'm struggling with from a performance standpoint. The user selects a value from a dropdown and clicks a button. A stored procedure takes that value as an input parameter, executes, and returns the results to a grid. For just one of the values ('All'), the query runs for roughly 2.5 minutes; for the rest of the values it runs in less than 1 ms.
Obviously, having the user wait for 2.5 minutes just isn't going to fly. So, what are some typical strategies to handle this?
Some of my own thoughts:
New table that stores the information for the 'All' value and is generated nightly
Cache the data on the caching server
Any help is appreciated.
Thanks!
Update
A little bit more info:
The sp returns two result sets. The first is a GROUP BY ROLLUP summary, and the second is that first result set disaggregated (roughly 80,000 rows).
I would first look at whether you have the proper indexes in place. Using the Query Analyzer and the Database Engine Tuning Advisor is a simple and often effective way of seeing which indexes might help.
If you still have performance problems after creating the appropriate indexes, you might then look at adding tables/views to speed things up. If your query does a lot of joins, you might consider creating an indexed view that allows you to do a select with no joins against the denormalized data. Since indexed views are persisted, you can see big gains from their use (a sketch follows the links below).
You can read up on indexed views here:
http://msdn.microsoft.com/en-us/library/dd171921%28v=sql.100%29.aspx
and read about the Database Engine Tuning Advisor here:
http://msdn.microsoft.com/en-us/library/ms166575.aspx
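To make the indexed-view suggestion concrete, here is a minimal sketch. The dbo.Orders table, its columns, and the grouping key are assumptions standing in for whatever your joins actually produce; note that SUM in an indexed view requires a non-nullable column.

    -- SCHEMABINDING, two-part names and COUNT_BIG(*) are all required
    -- before the view can be indexed; object names are assumptions.
    CREATE VIEW dbo.vw_OrderSummary
    WITH SCHEMABINDING
    AS
    SELECT  CustomerId,
            COUNT_BIG(*) AS OrderCount,
            SUM(Amount)  AS TotalAmount   -- Amount must be non-nullable
    FROM    dbo.Orders
    GROUP BY CustomerId;
    GO

    -- The unique clustered index is what actually persists the aggregate.
    CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderSummary
        ON dbo.vw_OrderSummary (CustomerId);
    GO

One caveat: outside Enterprise Edition the optimizer will not match the view automatically, so queries need to reference it with WITH (NOEXPAND) to benefit from the index.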
Also, how many records does "All" return? I have seen people get hung up on the "All" scenario before, but if it returns 1 million records or something, then the data is not usable to a person anyway...
Caching data is a good thing, but... if the SP is inherently flawed, then you might want to actually fix it instead of trying to bandage it with caching.
You might also want to (since you didn't mention it here) look at the number of rows "All" returns compared to the other selections and think about your indexes.
Also, in your SP does "All" cause it to run a different set of T-SQL (maybe via a CASE or an IF), or is it running the same code just with a different WHERE?
It might simply be that "All" just returns A LOT of records. You may want to implement paging and partial dataset return using AJAX, kind of like returning the first 1000 records early so that something can be displayed, while showing a throbber on the screen until the rest of the dataset is returned (a paging sketch follows).
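If volume really is the culprit, one hedged sketch of ROW_NUMBER-based server-side paging (the dbo.Orders table, the OrderDate sort column, and the page size are all placeholders) would be:

    -- Return one page at a time; table, columns and page size are assumptions.
    DECLARE @PageNumber INT, @PageSize INT;
    SET @PageNumber = 1;
    SET @PageSize   = 1000;

    ;WITH numbered AS
    (
        SELECT  o.*,
                ROW_NUMBER() OVER (ORDER BY o.OrderDate DESC) AS rn
        FROM    dbo.Orders AS o
    )
    SELECT  *
    FROM    numbered
    WHERE   rn BETWEEN (@PageNumber - 1) * @PageSize + 1
                  AND   @PageNumber * @PageSize
    ORDER BY rn;

The AJAX side then just keeps requesting the next @PageNumber while the throbber is up.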
These are all options... if the number of records really isn't that different between "All" and the others, then it probably has something to do with the query/index/program flow.
I am working on a large project where I have to present an efficient way for a user to enter data into a form.
Three of the fields of that form require a value from a subset of a common data source (a SQL table). I used jQuery and jQuery UI to build an autocomplete, which posts to a generic HttpHandler.
Internally the handler uses LINQ to SQL to grab the data required from that specific table. The table has about 10 different columns, and the LINQ expression uses SqlMethods.Like() to match the single search term against each of those 10 fields.
The problem is that the table contains some 20K rows. The autocomplete works flawlessly, except that the sheer volume of data introduces delays, in the vicinity of 6 seconds or so (when debugging on my local machine), before it shows up.
The jQuery UI autocomplete has 0 delay, fires the query at the third character, and the result of the post is rendered as Facebook-style, multi-row selectable options. (I almost had to rewrite the autocomplete plugin...)
So the problem is data vs. speed. Any thoughts on how to speed this up? The only two thoughts I had were to cache the data (how/where?), or to use a straight-up SQL data reader for data access.
Any ideas would be greatly appreciated!
Thanks,
<bleepzter/>
I would look at only returning the first X rows using the .Take(10) LINQ method. That should translate into a sensible SQL call, which will put much less load on your database. As the user types they will find fewer and fewer matches, so they will only see the data they require.
I normally reckon 10 items are enough for the user to understand what is going on and still get to the data they need quickly (see the amazon.com search bar for an example).
Obviously, if you can sort the data in a meaningful fashion, then the 10 results will be much more likely to give the user what they are after quickly.
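For reference, the statement LINQ to SQL emits for a query ending in .Take(10) is roughly shaped like the sketch below; the table alias, column names and parameter are illustrative rather than the exact generated text. The TOP clause is what keeps the work on the 20K-row table small.

    -- Approximate shape of the generated SQL; all names are illustrative.
    DECLARE @p0 NVARCHAR(100);
    SET @p0 = N'%search term%';   -- stands in for the parameter LINQ to SQL sends

    SELECT TOP (10)
            t0.Id, t0.Name, t0.Description   -- ...and the other columns
    FROM    dbo.Lookup AS t0
    WHERE   t0.Name        LIKE @p0          -- one LIKE predicate per searched
       OR   t0.Description LIKE @p0;         -- column, as SqlMethods.Like() produces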
Returning the top N results is a good idea for sure. We found (querying a potential list of 270K) that returning the top 30 is a better bet for the user finding what they're looking for, but that COMPLETELY depends on the data you are querying.
Also, you REALLY should raise the delay to something sensible like 100-300 ms. When you set delay to ZERO, once you hit the 3-character trigger, effectively EVERY. SINGLE. KEY. STROKE. is sent as a new query to your server. This could easily have the unintended and unwelcome effect of slowing down the response even MORE.
I am building UI for a large product catalog (millions of products).
I am using Sql Server, FreeText search and ASP.NET MVC.
Tables are normalized and indexed. Most queries take less than a second to return.
The issue is this: let's say the user does a search by keyword. On the search results page I need to display/query for:
Display 20 matching products on the first page (paged, sorted)
Total count of matching products, for paging
The distinct list of stores across all matching products
The distinct list of brands across all matching products
The distinct list of colors across all matching products
Each query takes about 0.5 to 1 second. Altogether that is about 5 seconds.
I would like to get the whole page to load under 1 second.
There are several approaches:
Optimize queries even more. I already spent a lot of time on this one, so not sure it can be pushed further.
Load products first, then load the rest of the information using AJAX. More like a workaround. Will need to revise UI.
Re-organize data to be more Report friendly. Already aggregated a lot of fields.
I checked out several similar sites, for example zappos.com. Not only do they display the same information I would like in under 1 second, they also include statistics (the number of results in each category).
The following is the search for keyword "white"
http://www.zappos.com/white
How do sites like zappos, amazon make their results, filters and stats appear almost instantly?
So you asked specifically "how does Zappos.com do this". Here is the answer from our Search team.
An alternative idea for your issue would be to use a search index such as Solr. Basically, the way these work is that you load your data set into the system and it does a huge amount of indexing up front. My projects include product catalogs with 200+ data points for each of the 140K products, and the average return time is less than 20 ms.
The search indexing system I would recommend is Solr, which is based on Lucene. Both of these projects are open source and free to use.
Solr fits your described use case perfectly in that it can actually do all of those things in one query. You can use facets (essentially GROUP BY in SQL) to return the list of distinct data values across all applicable results. In the case of keywords, it also allows you to search across multiple fields in one query without performance degradation.
You could try replacing your aggregate queries with materialized (indexed) views of those aggregates. This will pre-compute all the aggregates and will be as fast as selecting any regular row data.
0.5 sec is too long on appropriate hardware. I agree with Aaronaught: the first thing to do is to convert it into a single SQL statement, or possibly a stored procedure, to ensure it's compiled only once.
Analyze your queries to see if you can create even better indexes (consider covering indexes), fine-tune existing indexes, and employ partitioning.
Make sure you have an appropriate hardware configuration: data, log, tempdb, and even index files should be located on independent spindles. Make sure you have enough RAM and CPUs. I hope you are running a 64-bit platform.
After all this, if you still need more, analyze the most-used keywords and create aggregate result tables for the top 10 keywords (sketched below).
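A hedged sketch of such a pre-aggregated table for a hot keyword follows; every object and column name is an assumption, and the refresh would be driven by a scheduled job.

    -- Facet counts pre-computed per hot keyword; names are assumptions.
    CREATE TABLE dbo.KeywordFacetCounts
    (
        Keyword      NVARCHAR(100) NOT NULL,
        FacetType    VARCHAR(20)   NOT NULL,   -- 'Brand', 'Store', 'Color'
        FacetValue   NVARCHAR(200) NOT NULL,
        ProductCount INT           NOT NULL,
        RefreshedAt  DATETIME      NOT NULL DEFAULT GETDATE(),
        CONSTRAINT PK_KeywordFacetCounts PRIMARY KEY (Keyword, FacetType, FacetValue)
    );

    -- Nightly refresh for one keyword ('white' here), one facet type at a time.
    INSERT INTO dbo.KeywordFacetCounts (Keyword, FacetType, FacetValue, ProductCount)
    SELECT  N'white', 'Brand', b.BrandName, COUNT(*)
    FROM    dbo.Products p
            JOIN dbo.Brands b ON b.BrandId = p.BrandId
    WHERE   FREETEXT(p.Description, N'white')   -- same full-text predicate the live search uses
    GROUP BY b.BrandName;

Pages for the hot keywords then read these counts directly; everything else falls back to the live query.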
About Amazon: they most likely use superior hardware and also take advantage of CDNs. Also, they have thousands of servers serving up the content and there are no performance bottlenecks; data is duplicated multiple times across several data centers.
As a completely separate approach, you may want to look into "in-memory" databases such as CACHE; this is the fastest you can get on the DB side.