We have a site hosted on one side of the planet and a customer on the other.
It's ASP.NET and there are loads of complex business rules on the forms, so there are many instances where the user takes some action and the site posts back to alter the form based on those rules.
So now the customer is complaining about site lag.
For us, the lag is barely noticeable, so we're pretty much talking pure geographical distance here, I think.
What are the options for improving performance...
a) Put a mirrored data center nearer the customer's country
b) Rewrite the whole app, trying to implement the business rules entirely in client-side script (may not be feasible)
Outside of this, has anyone any tips or tricks that might boost performance?
We already have heavy caching between db and web server but in this case, this isn't the issue since they are side by side anyway...
The problem is a 30,000-mile round trip between client and server...
(I notice the reverse is slow also - when I use websites in the customer's country, they always seem slow...)
I have this problem too. Some of my clients are in New Zealand and I am in the UK. That is as big a round-trip as you can get, unless you count deep space probes.
Start here:
http://www.aspnet101.com/2010/03/50-tips-to-boost-asp-net-performance-part-i/
http://www.aspnet101.com/2010/04/50-tips-to-boost-asp-net-performance-part-ii/
Many of these are server-side hints, but particular hints from these pages that might help you include:
disable ViewState where appropriate
use a CDN so that users are getting as much of their content as possible from nearby servers e.g. jQuery script files, Azure services.
change your images to sprites
minify your js
validate all input received from users - doing it on the client side saves unnecessary round trips. jQuery validation is excellent for this.
Use IIS Compression - reduces download size
Use AJAX wherever possible - if you don't already, this has the greatest potential to improve your round trip sizes. My preference is (you guessed it...) jQuery AJAX
Also, in Firefox, install the YSlow Add-on. This will give you hints on how to improve your particular page
If all of this is not enough and you can afford the time and investment, converting your application to ASP.NET MVC will make your pages a lot lighter on the bandwidth. You can do this gradually, changing the most problematic pages first and over time replacing your site without impacting your users. But only do this after exhausting the many ideas posted in all of the answers to your question.
Another option, if you are going to do a rewrite, is to consider a Rich Internet Application using Silverlight. This way, you can have the appropriate C# business rules executing in the client browser and only return to the server for small packets of data.
The most obvious short term solution would be to buy some hosting space in the same country as your client, but you would have to consider database synchronising if you have other clients in your home country.
The first step is that you probably want to get some performance information from your client accessing your website. Something like Firebug (in Firefox) shows how long every request for each item on your page took. You may be surprised what the bottleneck actually is. Maybe just adding a CDN (Content Delivery Network) for your images, etc. would be enough.
If your site has any external references or tracking that runs on the client (WebTrends, etc.), that may even be the bottleneck; it could be nothing to do with your site as such.
This might sound obvious, but here it goes: I'd try to limit the information interchange between the client and the server to the absolute minimum, probably by caching as much information as possible on the first call, and using javascript.
For example: if your app is hitting the server when the user presses "get me a new blank form", you can instead send a "hidden" (i.e. in a JavaScript string) blank form on the first call, and have it replace the visible one with JavaScript when the user presses the button. No server round trip = big gain in perceived responsiveness.
Another example would be an AJAX service that re-renders a complex form every time the user changes one field. The inefficient (but normally easier to implement) way to do it is having the server send the complete form as HTML. A more efficient way is having the server return a short message (maybe encoded in JSON) and have the client build the form from that message, again with JavaScript. The trick here is that in some cases you can start rendering the form before the message is received, so the perceived responsiveness will also be better.
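For the ASP.NET side of that, here's a rough sketch (the page, field names and business rule are all made up for illustration): a static page method applies the rules on the server but returns only a compact, JSON-serializable description of what changed, and the client script builds or toggles the fields from it.

```
// Hypothetical example: a page method that returns just the field changes as JSON
// instead of re-rendering the whole form as HTML. Call it from client script
// (PageMethods, or a jQuery POST to OrderForm.aspx/GetFieldsFor with a JSON content type).
using System.Collections.Generic;
using System.Web.Services;

public class FieldSpec
{
    public string Name { get; set; }     // which control to show/hide or add
    public string Type { get; set; }     // e.g. "text", "date"
    public bool Visible { get; set; }
}

public partial class OrderForm : System.Web.UI.Page
{
    [WebMethod]
    public static List<FieldSpec> GetFieldsFor(string selectedValue)
    {
        // The business rules still run server-side; only a small payload crosses the wire.
        return new List<FieldSpec>
        {
            new FieldSpec { Name = "DeliveryDate", Type = "date", Visible = true },
            new FieldSpec { Name = "CourierNotes", Type = "text", Visible = selectedValue == "courier" }
        };
    }
}
```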
Finally, see if you can cache things; if the user is asking for information that you already have, don't ask the server for it again. For example, save the "current state" of the page in a JavaScript array, so if the user presses "back" or "forward" you can just restore from there, instead of hitting the server.
You are dealing with a "long fat pipe" here, meaning the bandwidth is sufficient (it can still do x KB/s) but the lag is increased. I would be looking at decreasing the number and frequency of requests first, before decreasing the size of the request.
You don't have to reimplement 100% of the business rules in JavaScript, but do start chipping away at the simple validation. IMHO this has the potential to give you the best bang for your buck.
But of course, don't take my word for it, investigate where the bottleneck happens - i.e. response time or transfer time. Most modern browsers' developer plugins can do that these days.
One thing you should look at is how big the ViewState of the page is. It is sent with every postback, and if it's large and the internet lines are "slow", you will get lag.
Ways to fix this are to scrutinize your code and turn off ViewState for controls that don't need it, compress the ViewState before sending it to the client to make postbacks smaller, or cache the ViewState on the server and replace it in the page with a GUID or similar, making the postback even smaller.
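As a sketch of the "keep the ViewState on the server" option: ASP.NET already ships a SessionPageStatePersister, so switching to it from a common base page means the browser only round-trips a small key instead of the whole state blob (at the cost of extra session memory on the server).

```
// Minimal sketch: store page state in Session instead of the __VIEWSTATE field.
// Pages that inherit from this base class send only a small identifier to the client.
using System.Web.UI;

public class ServerViewStatePage : Page
{
    protected override PageStatePersister PageStatePersister
    {
        get { return new SessionPageStatePersister(this); }
    }
}
```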
And of course make sure you have compression (gzip) turned on for your entire site, so that what you send in the first place is compressed.
Also make sure you add cache headers to all static content so that the client caches those files (js, css, images).
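IIS compression and static-content cache headers are normally configured on the server (IIS or web.config), but if you can't touch that configuration, a code-level fallback for the dynamic pages is a GZip response filter in Global.asax - a sketch, not production-hardened:

```
// Sketch: gzip ASP.NET responses when the browser advertises support.
// Prefer IIS compression if you can configure it (and make sure you don't double-compress).
using System;
using System.IO.Compression;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        string acceptEncoding = Request.Headers["Accept-Encoding"] ?? string.Empty;
        if (acceptEncoding.IndexOf("gzip", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            Response.Filter = new GZipStream(Response.Filter, CompressionMode.Compress);
            Response.AppendHeader("Content-Encoding", "gzip");
            Response.AppendHeader("Vary", "Accept-Encoding");
        }
    }
}
```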
Use Fiddler or something similar to monitor how much data is being sent back and forth for your application.
First of all, thanks guys for all the information.
Just a bit of extra background. The site is built using a dynamic form generating engine I wrote.
Basically I created a system whereby the form layout is described in the db and rendered on the fly.
This has been a very useful system in that it allows rapid changes and also means our outputs - on-screen, PDF and XML - are all synced to these descriptions, which I call form maps. For example, adding a new control or reordering a form is automatically reflected in all renders - form, PDF, XML and standard display.
It does, however, introduce the overhead of having to build the page on every request, including postbacks.
I've been speaking to an architect in my company, and we're probably going to need a few copies around the globe - it's not possible to run a global system for people in different countries from one European data center.
Some of the forms are huge, and the customer is unwilling to see sense and break them into smaller portions, so this also adds to the overhead.
The only thing I have not tried yet is running something like gzip to reduce the payload being sent back and forth...
I have a page that is hidden behind auth, so SEO doesn't matter.
It includes fetching quite a lot of data. Will the page finish loading faster if I use SSR and fetch the data in getServerSideProps, or will it be a marginal difference compared to client side fetching?
The page will only be accessed by mobile.
TLDR;
It depends on many aspects of the page, but SSR may feel faster because it gives a better FMP (First Meaningful Paint) score. For pages like sign-up/sign-in, though, it won't make any noticeable difference, because you should keep the gap between FMP and time-to-interactive very low anyway.
What is the main difference?
First off, I should mention that the results may vary based on geographic location, servers, and latency, but according to studies, the SSR version of an application may be faster than the CSR one because it renders the viewable content of the page sooner for the user. The page may not be interactive, however, until it has fully loaded.
SSR (Server Side Rendering e.g. Next.js) vs. CSR (Client Side Rendering e.g. Create React App)
Conclusion
It depends heavily on your use case. For apps/pages like personal panels (dashboards) and the like, the FMP and performance scores don't matter much, because users will be interacting with the page and doing things, and it needs to be fully interactive once the content is loaded; IMHO it's better to stick with CSR and spend less effort making the pages SSR-like. For landing pages, on the other hand, you should excite users and show them some of your content before they decide your app isn't worth sticking with because it's slow.
Using ASP.NET MVC: I am caught in the middle between rendering my views on the server vs in the client (say jquery templates). I do not like the idea of mixing the two as I hear some people say. For example, some have said they will render the initial page (say a list of a bunch of comments) server side, and then when a new comment is added they use client side templating. The requirement to have the same rendering logic in two different areas of your code makes me wonder how people convince themselves it is worth it.
What are the reasons you use to decide which to use where?
How does your argument change when using ASP.NET Web Forms?
One reason that people do that is because they want their sites to get indexed by search engines but also want to have the best user experience, so are writing client code for that. You have to decide what makes sense given the constraints and goals you have. Unfortunately, what makes the most business sense won't always seem to make the most sense from a technical perspective.
One advantage to server-side rendering is that your clients don't have to use javascript in order for your pages to be functional. If you're relying on JQuery templates, you pretty much have to assume that your page won't have any content when rendered without javascript. For some people this is important.
As you say, I would prefer not to use the same rendering logic twice, since you run the risk of letting it get out of sync.
I generally prefer to just leverage partial views to generate most content server-side. Pages with straight HTML tend to render a bit faster than pages that have to be "built" after they've loaded, making the initial load a little speedier.
We've developed an event-based AJAX architecture for our application which allows us to generate a piece of content in response to the result of an action, and essentially send back any number of commands to the client-side code to say "Use the results of this rendered partial view to replace the element with ID 'X'", or "Open a new modal popup dialog with this as the content." This is beneficial because the server-side code can have a lot more control over the results of an AJAX request, without having to write client-side code to handle every contingency for every action.
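To make that concrete, here's a rough sketch of the idea with invented names (AjaxCommand is not a framework type, and RenderPartialToString is a helper you'd write yourself): the server renders the fragments and returns a list of instructions, and a small client-side dispatcher applies them.

```
// Illustrative only: the server decides what the client should do with each rendered fragment.
using System.Web.Mvc;

public class AjaxCommand
{
    public string Action { get; set; }    // e.g. "replace" or "openDialog"
    public string TargetId { get; set; }  // element to update, when applicable
    public string Html { get; set; }      // server-rendered partial view markup
}

public class CommentsController : Controller
{
    [HttpPost]
    public ActionResult Add(string commentText)
    {
        // ... save the comment, then tell the client what to refresh.
        return Json(new[]
        {
            new AjaxCommand { Action = "replace", TargetId = "commentList",
                              Html = RenderPartialToString("_CommentList", commentText) },
            new AjaxCommand { Action = "openDialog",
                              Html = RenderPartialToString("_ThankYou", null) }
        });
    }

    // Not built into MVC: renders a partial view to a string (implementations are easy to find).
    private string RenderPartialToString(string viewName, object model)
    {
        /* omitted for brevity */
        return string.Empty;
    }
}
```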
On the other hand, putting the server in control like this means that the request has to return before the client-side knows what to do. If your view rendering was largely client-based, you could make something "happen" in the UI (like inserting the new comment where it goes) immediately, thereby improving the "perceived speed" of your site. Also, the internet connection is generally the real speed bottleneck of most websites, so just having less data (JSON) to send over the wire can often make things more speedy. So for elements that I want to respond very smoothly to user interaction, I often use client-side rendering.
In the past, search-engine optimization was a big issue here as well, as Jarrett Widman says. But my understanding is that most modern search engines are smart enough to evaluate the initial javascript for pages they visit, and figure out what the page would actually look like after it loads. Google even recommends the use of the "shebang" in your URLs to help them know how to index pages that are dynamically loaded by AJAX.
I have several pages of my web application done. They all use the same master page, so they all look very similar, except of course for the content. It's technically possible to use one larger update panel and have all the pages in one big update panel, so that instead of jumping from page to page, the user always stays on the same page and the links trigger __doPostBack callbacks to update the appropriate panel.
What could be the problem(s) with building my site like this?
Well, "pages" provide what is known as the "Service Interface layer" between your business layer and the http aspect of the web application. That is all of the http, session and related aspects are "converted" into regular C# types (string, int, custom types etc.) and the page then calls methods in the business layer using regular C# calling conventions.
So if you have only one update panel in your whole application, what you're effectively saying is that one page (the code behind portion) will have to handle all of the translations between the http "ness" and the business layer. That'll just be a mess from a maintainable perspective and a debugging perspective.
If you're in a team, each of you will potentially be modifying the same code-behind. That could be a problem for some source control systems, and one or more of you could define the same method name with the same signature but different implementations. That won't be easy to merge.
From a design perspective, there is no separation of concerns. If you have a menu or hyperlink in a business application, it most likely represents a different concern. Not a good design at all.
From a performance perspective, you'll be loading all of your system's functionality no matter what function your user is actually using.
You could still give users the one-page experience and redirect the callbacks to handlers for the specific areas of concern. But I'd think really hard about the UI and the actual user experience you'll be providing: it's possible that you'll have a clutter of menus and other functionality when you combine everything into one page.
Unless the system you are building is really simple, has no potential to grow beyond what it currently is, and providing your users with a one-page experience would truly add value and improve the user experience, I wouldn't go down this route.
When you have a hammer, everything looks like a nail.
It really depends on what you are trying to do. Certainly, if each page is very resource-intensive, you may have faster load times if you split them up. I'm all for simplicity, though, and if you have a clean and fast way of keeping users on one page and using AJAX to process data, you should definitely consider it.
It would be hard to list the downsides of an AJAX solution, though, without more details about the size and scope of the web application you are building.
The data on our website can easily be scraped. How can we detect whether a human is viewing the site or a tool?
One way might be to measure how long a user stays on a page, but I do not know how to implement that. Can anyone help me detect and prevent automated tools from scraping data from my website?
I use a security image (captcha) in the login section, but even then a human may log in and then use an automated tool. When the recaptcha image appears after a period of time, the user may type the security code and then, again, use an automated tool to continue scraping data.
I developed a tool to scrape another site. So I only want to prevent this from happening to my site!
DON'T do it.
It's the web, you will not be able to stop someone from scraping data if they really want it. I've done it many, many times before and got around every restriction they put in place. In fact having a restriction in place motivates me further to try and get the data.
The more you restrict your system, the worse you'll make user experience for legitimate users. Just a bad idea.
It's the web. You need to assume that anything you put out there can be read by human or machine. Even if you can prevent it today, someone will figure out how to bypass it tomorrow. Captchas have been broken for some time now, and sooner or later, so will the alternatives.
However, here are some ideas for the time being.
And here are a few more.
And my favorite: one clever site I've run across has a good approach. It asks a question like "On our 'About Us' page, what is the street name of our support office?" or something like that. It takes a human to find the "About Us" page (the link doesn't say "about us"; it says something similar that a person would figure out), and then to find the support office address (different from the main corporate office and several others listed on the page) you have to look through several matches. Current computer technology wouldn't be able to figure it out any more than it can figure out true speech recognition or cognition.
A Google search for "captcha alternatives" turns up quite a bit.
This can't be done without risking false positives (and annoying users).
How can we detect whether a human is viewing the site or a tool?
You can't. How would you handle tools parsing the page for a human, like screen readers and accessibility tools?
For example one way is by calculating the time up to which a user stays in page from which we can detect whether human intervention is involved. I do not know how to implement that but just thinking about this method. Can anyone help how to detect and prevent automated tools from scraping data from my website?
You won't detect automated tools, only unusual behavior. And before you can define unusual behavior, you need to find what's usual. People view pages in different orders, browser tabs allow them to do parallel tasks, etc.
I should make a note that if there's a will, then there is a way.
That being said, I thought about what you've asked previously and here are some simple things I came up with:
simple naive checks might be user-agent filtering and checking. You can find a list of common crawler user agents here: http://www.useragentstring.com/pages/Crawlerlist/
you can always display your data in flash, though I do not recommend it.
use a captcha
Other than that, I'm not really sure if there's anything else you can do but I would be interested in seeing the answers as well.
EDIT:
Google does something interesting: if you're searching for SSNs, after the 50th page or so they show a captcha. It raises the question of whether you can intelligently time how long a user spends on your pages or, if you introduce pagination into the equation, how long a user spends on a single page.
Using that information, you can put a time limit before another HTTP request is accepted, and at that point it might be beneficial to "randomly" generate a captcha. What I mean by this is that maybe one HTTP request will go through fine, but the next one will require a captcha. You can switch those up as you please.
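As a rough sketch of that throttling idea (the threshold, window and captcha page name are all placeholders), an ASP.NET HttpModule can count requests per IP and divert heavy users to a captcha page once the limit is hit:

```
// Simplified sketch: count requests per IP in the cache and redirect heavy users to a captcha page.
using System;
using System.Web;
using System.Web.Caching;

public class ThrottleModule : IHttpModule
{
    private const int Limit = 100;                              // requests allowed per window
    private static readonly TimeSpan Window = TimeSpan.FromMinutes(1);

    public void Init(HttpApplication app)
    {
        app.BeginRequest += (sender, e) =>
        {
            HttpContext ctx = ((HttpApplication)sender).Context;
            string key = "hits_" + ctx.Request.UserHostAddress;

            int hits = (ctx.Cache[key] as int?) ?? 0;
            ctx.Cache.Insert(key, hits + 1, null,
                DateTime.UtcNow.Add(Window), Cache.NoSlidingExpiration);

            if (hits > Limit &&
                !ctx.Request.Path.EndsWith("Captcha.aspx", StringComparison.OrdinalIgnoreCase))
            {
                ctx.Response.Redirect("~/Captcha.aspx");        // hypothetical captcha page
            }
        };
    }

    public void Dispose() { }
}
```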
Scrapers steal the data from your website by parsing URLs and reading the source code of your pages. The following steps can be taken to at least make scraping a bit more difficult, if not impossible.
Loading data via AJAX requests makes it more difficult to parse and requires extra effort to discover the URLs to be parsed.
Use a cookie even for normal pages that do not require any authentication: create the cookie once the user visits the home page and then require it for all the inner pages (a sketch follows this list). This makes scraping a bit more difficult.
Display encrypted content on the website and then decrypt it at load time using JavaScript. I have seen this on a couple of websites.
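Here is a sketch of the cookie suggestion in the list above (module, cookie and page names are mine): issue a cookie on the home page and bounce inner-page requests that arrive without it.

```
// Sketch: scrapers that hit inner URLs directly won't have the cookie the home page sets.
using System;
using System.Web;

public class EntryCookieModule : IHttpModule
{
    private const string CookieName = "entry";

    public void Init(HttpApplication app)
    {
        app.BeginRequest += (sender, e) =>
        {
            HttpContext ctx = ((HttpApplication)sender).Context;
            bool isHomePage = ctx.Request.Path == "/" ||
                ctx.Request.Path.EndsWith("Default.aspx", StringComparison.OrdinalIgnoreCase);

            if (isHomePage)
            {
                // First visit: hand out the cookie.
                ctx.Response.Cookies.Add(new HttpCookie(CookieName, Guid.NewGuid().ToString()));
            }
            else if (ctx.Request.Cookies[CookieName] == null)
            {
                // Inner page requested cold, with no cookie: send them to the home page.
                ctx.Response.Redirect("~/Default.aspx");
            }
        };
    }

    public void Dispose() { }
}
```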
I guess the only good solution is to limit the rate that data can be accessed. It may not completely prevent scraping but at least you can limit the speed at which automated scraping tools will work, hopefully below a level that will discourage scraping the data.
I built an online news portal a while back which is working fine for me, but some say the home page is a little bit slow. When I think about it, I can see a reason why.
The home page of the site displays
Headlines
Spot news (sub-headlines)
Spots with pictures
Most read news (as titles)
Most commented news (as titles)
5 news titles from each news category (11 in total, e.g. sports, economy, local, health, etc.)
Now, each of these is a separate query to the db. I have TableAdapters, DataSets and DataTables (standard data access scenarios), so for headlines I call the business logic in my News class, which returns the DataTable produced by the TableAdapter. From there on, I either bind the DataTable directly to the controls or (most of the time) convert it to a List(Of News), for example, and consume it from there.
Doing this for each of the above seems to work fine, though. At least it does not put a huge load on anything. But it makes me wonder if there is a better way.
For example, the project I describe above is a highly dynamic web site; news items are inserted as they arrive from agencies, 24 hours non-stop, so caching might not sound like a good fit. On the other hand, I now have another, similar project for a local newspaper. That site will only be updated once a day. In this case:
Can I run just one query that returns a DataTable containing all the news items inserted today, then query that DataTable and place headlines, spots and other items in their respective places on the site? Or is there a better alternative? I just wonder how other people carry out similar tasks in the most efficient way.
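For the once-a-day site, one way to do it (table and column names are assumptions about your schema) is to pull today's items in a single query - or from cache - and then slice the in-memory DataTable for each section of the page:

```
// Sketch: one DataTable of today's news, filtered in memory for each panel.
using System.Data;

public static class HomePageBinder
{
    public static void BindPanels(DataTable todaysNews)
    {
        // Headlines: flagged rows, newest first.
        DataRow[] headlines = todaysNews.Select("IsHeadline = 1", "PublishedAt DESC");

        // Most recent titles for one category, without another database round trip.
        DataView sports = new DataView(todaysNews, "Category = 'sports'", "PublishedAt DESC",
                                       DataViewRowState.CurrentRows);

        // ... bind headlines, sports (and the other categories) to their repeaters here.
    }
}
```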
I think you should use Firebug to find out which elements are taking time to load. Sometimes large images can ruin the show (and the size of an image on screen isn't always related to its download size).
Secondly you could download the Yahoo Firefox plugin YSlow and investigate if you have any slowing scripts.
But Firebug should give you the best review. After loading Firebug click on the 'Net' tab to view the load time of each element in the page.
If you've got poor performance, your first step isn't to start mucking around. Profile your code. Find out exactly why it is slow. Is the slowdown in transmitting the page, rendering it, or actually dynamically generating the page? Is a single query taking too long?
Find out exactly where the bottleneck is and attack the problem at its heart.
Caching is also a very good idea, even in cases where content is updated fairly quickly. As long as your caching mechanism is intelligent, you'll still save a lot of generation time. In the case of a news portal or a blog, as opposed to a forum, you're likely to improve performance greatly with a caching system.
If you find that your delays come from the DB, check your tables, make sure they're properly indexed, clustered, or whatever else you need depending on the amount of data in the table. Also, if you're using dynamic queries, try stored procedures instead.
If you want to get several queries done in one database request, you can. Since you won't be showing any data until all the queries are done anyhow, and barring any other issues, you'll at least be saving time by not accessing the DB again for every single query.
DataSets hold a collection of tables; they can be generated by several queries in the same request.
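For example, one batched command can fill several tables in a DataSet in a single round trip (the SQL and table names here are only illustrative):

```
// Sketch: three result sets, one database round trip.
using System.Data;
using System.Data.SqlClient;

public static class HomePageData
{
    public static DataSet Load(string connectionString)
    {
        const string sql = @"
            SELECT TOP 10 * FROM News WHERE IsHeadline = 1 ORDER BY PublishedAt DESC;
            SELECT TOP 10 * FROM News ORDER BY ReadCount DESC;
            SELECT TOP 10 * FROM News ORDER BY CommentCount DESC;";

        var ds = new DataSet();
        using (var adapter = new SqlDataAdapter(sql, connectionString))
        {
            adapter.Fill(ds);                       // fills Table, Table1, Table2
            ds.Tables[0].TableName = "Headlines";
            ds.Tables[1].TableName = "MostRead";
            ds.Tables[2].TableName = "MostCommented";
        }
        return ds;
    }
}
```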
ASP.NET provides you with a pretty nice mechanism already for caching (HttpContext.Cache) that you can wrap around and make it easier for you to use. Since you can set a life span on your cached objects, you don't really have to worry about articles and title not being up to date.
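A small wrapper along those lines (the helper name is mine) keeps the call sites tidy and makes the lifespan explicit:

```
// Sketch: get an item from HttpContext.Cache, or load and cache it for a short lifespan.
using System;
using System.Web;
using System.Web.Caching;

public static class CacheHelper
{
    public static T GetOrAdd<T>(string key, TimeSpan lifespan, Func<T> loader) where T : class
    {
        Cache cache = HttpContext.Current.Cache;
        T cached = cache[key] as T;
        if (cached != null)
            return cached;

        T fresh = loader();
        cache.Insert(key, fresh, null, DateTime.UtcNow.Add(lifespan), Cache.NoSlidingExpiration);
        return fresh;
    }
}

// Usage: var headlines = CacheHelper.GetOrAdd("headlines", TimeSpan.FromMinutes(2), LoadHeadlines);
```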
If you're using WebForms for this website, disable ViewState for the controls that don't really need it, just to make the page that little bit faster to load. Not to mention plenty of other tweaks and changes to make a page load faster (gzipping, minifying scripts, etc.).
Still, before doing any of that, do as Anthony suggested and profile your code. Find out what the true problem is.