I built an online news portal a while ago. It works fine for me, but some users say the home page is a little slow, and when I think about it I can see why.
The home page of the site displays
Headlines
Spot news (sub-headlines)
Spots with pictures
Most read news (as titles)
Most commented news (as titles)
5 news titles from each news category (11 categories in total, e.g. sports, economy, local, health, etc.)
Right now, each of these is a separate query to the database. I have TableAdapters, DataSets and DataTables (standard data access scenarios), so for headlines, for example, I call the business logic in my news class, which returns a DataTable via the TableAdapter. From there I either bind the DataTable directly to the controls or (most of the time) convert it to a List(Of News) and consume it from there.
Doing this for each of the items above seems to work fine; at least it doesn't put a huge load on the server. But it makes me wonder whether there is a better way.
For example, the project I describe above is a highly dynamic web site: news items are inserted as they arrive from the agencies, 24 hours non-stop, so caching might not sound like a good fit. On the other hand, I now have another, similar project for a local newspaper whose site will only be updated once a day. In this case:
Could I run just one query that returns a DataTable containing all the news items inserted today, then query that DataTable and place headlines, spots and the other items in their respective places on the page? Or is there a better alternative? I just wonder how other people carry out similar tasks in the most efficient way.
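For the once-a-day site, a minimal sketch of that "one query, then filter in memory" idea might look like the following (the News table, its columns and the category values are assumptions, not the actual schema):

    using System;
    using System.Data;
    using System.Data.SqlClient;

    public static class HomePageData
    {
        // One round trip: everything published today comes back in a single DataTable.
        public static DataTable LoadTodaysNews(string connectionString)
        {
            const string sql =
                @"SELECT NewsId, Title, Category, IsHeadline, IsSpot, PublishDate
                  FROM News
                  WHERE PublishDate >= CAST(GETDATE() AS date)";

            using (var adapter = new SqlDataAdapter(sql, connectionString))
            {
                var table = new DataTable("TodaysNews");
                adapter.Fill(table);
                return table;
            }
        }
    }

    // Usage: carve the one table up in memory for each region of the page.
    // DataTable todays = HomePageData.LoadTodaysNews(connStr);
    // DataRow[] headlines    = todays.Select("IsHeadline = 1", "PublishDate DESC");
    // DataRow[] sportsTitles = todays.Select("Category = 'sports'", "PublishDate DESC");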
I think you should use Firebug to find out which elements are taking time to load. Sometimes large images can ruin the show (and the size of an image on screen isn't always proportional to its download size).
Secondly, you could install Yahoo's Firefox plugin YSlow and check whether any scripts are slowing things down.
But Firebug should give you the best overview. After loading Firebug, click on the 'Net' tab to view the load time of each element on the page.
If you've got poor performance, your first step isn't to start mucking around. Profile your code. Find out exactly why it is slow. Is the slowdown in transmitting the page, rendering it, or actually dynamically generating the page? Is a single query taking too long?
Find out exactly where the bottleneck is and attack the problem at its heart.
Caching is also a very good idea, even in cases where content is updated fairly quickly. As long as your caching mechanism is intelligent, you'll still save a lot of generation time. In the case of a news portal or a blog, as opposed to a forum, you're likely to improve performance greatly with a caching system.
If you find that your delays come from the DB, check your tables, make sure they're properly indexed, clustered, or whatever else you need depending on the amount of data in the table. Also, if you're using dynamic queries, try stored procedures instead.
If you want to get several queries done in one database request, you can. Since you won't be showing any data until all the queries are done anyhow, and barring any other issues, you'll at least save the cost of hitting the DB again for every single query.
DataSets hold a collection of tables, and those tables can be filled by several queries in the same request.
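For instance, here is a minimal sketch of that idea: one batched command, one round trip, several result tables (the table and column names are made up for illustration).

    using System.Data;
    using System.Data.SqlClient;

    public static class HomePageQueries
    {
        public static DataSet LoadHomePage(string connectionString)
        {
            // Several SELECTs batched into one command = one round trip to the database.
            const string sql = @"
                SELECT TOP 10 NewsId, Title FROM News WHERE IsHeadline = 1 ORDER BY PublishDate DESC;
                SELECT TOP 20 NewsId, Title FROM News ORDER BY ReadCount DESC;
                SELECT TOP 20 NewsId, Title FROM News ORDER BY CommentCount DESC;";

            using (var adapter = new SqlDataAdapter(sql, connectionString))
            {
                var ds = new DataSet();
                adapter.Fill(ds);   // produces tables named Table, Table1, Table2 ...
                ds.Tables[0].TableName = "Headlines";
                ds.Tables[1].TableName = "MostRead";
                ds.Tables[2].TableName = "MostCommented";
                return ds;
            }
        }
    }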
ASP.NET already provides you with a pretty nice caching mechanism (HttpContext.Cache) that you can wrap to make it easier to use. Since you can set a lifespan on your cached objects, you don't really have to worry about articles and titles not being up to date.
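A small wrapper along those lines might look like this; the key name and the 60-second lifetime in the usage comment are only illustrative.

    using System;
    using System.Web;
    using System.Web.Caching;

    public static class SiteCache
    {
        // Returns the cached value if present, otherwise loads it and caches it for lifeSpan.
        public static T GetOrAdd<T>(string key, Func<T> loader, TimeSpan lifeSpan) where T : class
        {
            var cache = HttpRuntime.Cache;
            var cached = cache[key] as T;
            if (cached != null)
                return cached;

            T fresh = loader();
            cache.Insert(key, fresh, null, DateTime.UtcNow.Add(lifeSpan), Cache.NoSlidingExpiration);
            return fresh;
        }
    }

    // Usage: headlines regenerate at most once a minute, so fresh articles still show up
    // quickly without hitting the database on every request.
    // var headlines = SiteCache.GetOrAdd("headlines", () => News.GetHeadlines(), TimeSpan.FromSeconds(60));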
If you're using WebForms for this website, disable ViewState for the controls that don't really need it, just to make the page that little bit faster to load. There are plenty of other tweaks and changes to make a page load faster as well (gzipping, minifying scripts, etc.).
Still, before doing any of that, do as Anthony suggested and profile your code. Find out what the true problem is.
Given a website/blog's RSS feed link, is there any way to get that site's entire RSS history (all its blog posts EVER) in a single XML file?
Is this something that is only possible from the other end (i.e. a site publishes its entire blogroll history as RSS)? In which case, how is this achieved?
RSS is just another way of expressing the data. It depends entirely on the site. If the site provides a way for you to specify how many items you want (which is unlikely), then you should know that the same trick won't work on other sites.
Technically speaking, formatting the data in RSS is no different than formatting it in HTML. For example, many sites (including this one), need to represent some sequential data (questions in SO's case) on a page in HTML. To do this, the site will iterate through some data source (like a database), and output HTML so your web browser can render it, until it hits some limit. Knowing that limit is impossible, as it depends on the site. This is exactly what RSS does: it iterates through a data source, spitting out XML as it goes along. Again, knowing the limit is not possible.
Is this something that is only possible from the other end ...? In which case, how is this achieved?
If you can change how your site generates the RSS, simply remove the limit. I know this is vague, but it really depends on the implementation. There are dozens of RSS implementations, all different, and all behaving differently.
So my point is: nothing will work universally; you have to change the site itself to modify that behavior.
You are right there. The site has to publish its entire history; otherwise you can't get it. Doing it on the server side, if you have access to the database, is quite easy: just dump all the rows as XML. It actually takes extra effort to filter and limit the XML. How can you do it on blogging platforms? You could use plugins that allow you to do this.
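As a rough sketch of the "dump all the rows as XML" idea, the loop below writes every post to a minimal RSS 2.0 document with no limit clause at all (the Posts table, its columns and the channel metadata are assumptions):

    using System.Data.SqlClient;
    using System.Xml;

    public static class FullHistoryFeed
    {
        public static void Write(string connectionString, XmlWriter writer)
        {
            writer.WriteStartElement("rss");
            writer.WriteAttributeString("version", "2.0");
            writer.WriteStartElement("channel");
            writer.WriteElementString("title", "Full archive");            // placeholder metadata
            writer.WriteElementString("link", "http://example.com/");
            writer.WriteElementString("description", "Every post ever published");

            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                "SELECT Title, Url, PublishedOn FROM Posts ORDER BY PublishedOn DESC", conn))
            {
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())   // no TOP/LIMIT: the entire history goes out
                    {
                        writer.WriteStartElement("item");
                        writer.WriteElementString("title", reader.GetString(0));
                        writer.WriteElementString("link", reader.GetString(1));
                        writer.WriteElementString("pubDate", reader.GetDateTime(2).ToString("R"));
                        writer.WriteEndElement();
                    }
                }
            }

            writer.WriteEndElement();   // channel
            writer.WriteEndElement();   // rss
        }
    }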
We have a site hosted on one side of the planet and a customer on the other.
It's ASP.NET and there are loads of complex business rules on the forms, so there are many instances where the user takes some action and the site posts back to alter the form based on those rules.
So now the customer is complaining about site lag.
For us, the lag is barely noticeable, so we're pretty much talking pure geographical distance here, I think.
What are the options for improving performance...
a) Put a mirrored data center nearer the customer's country
b) Rewrite the whole app, trying to implement the business rules entirely in client side script (may not be feasible)
Outside of this, has anyone any tips or tricks that might boost performance?
We already have heavy caching between the DB and the web server, but in this case that isn't the issue, since they are side by side anyway...
The problem is a 30,000-mile round trip between client and server...
(I notice the reverse is slow also - when I use websites in the customer's country, they always seem slow...)
I have this problem too. Some of my clients are in New Zealand and I am in the UK. That is as big a round-trip as you can get, unless you count deep space probes.
Start here:
http://www.aspnet101.com/2010/03/50-tips-to-boost-asp-net-performance-part-i/
http://www.aspnet101.com/2010/04/50-tips-to-boost-asp-net-performance-part-ii/
Many of these are server-side hints, but particular hints from those pages that might help you include:
disable ViewState where appropriate
use a CDN so that users are getting as much of their content as possible from nearby servers e.g. jQuery script files, Azure services.
change your images to sprites
minify your js
validate all input received from the users on the client side - it saves unnecessary round trips. jQuery validation is excellent for this.
Use IIS Compression - reduces download size
Use AJAX wherever possible - if you don't already, this has the greatest potential to improve your round trip sizes. My preference is (you guessed it...) jQuery AJAX
Also, in Firefox, install the YSlow Add-on. This will give you hints on how to improve your particular page
If all of this is not enough and you can afford the time and investment, converting your application to ASP.NET MVC will make your pages a lot lighter on the bandwidth. You can do this gradually, changing the most problematic pages first and over time replacing your site without impacting your users. But only do this after exhausting the many ideas posted in all of the answers to your question.
Another option, if you are going to do a rewrite is to consider a Rich Internet Application using Silverlight. This way, you can have appropriate C# business rules executing in the client browser and only return to the server for small packets of data.
The most obvious short term solution would be to buy some hosting space in the same country as your client, but you would have to consider database synchronising if you have other clients in your home country.
The first step is to get some performance information from your client accessing your website - something like Firebug (in Firefox) that shows how long every request for each item on your page took. You may be surprised what the bottleneck actually is. Maybe just adding a CDN (Content Delivery Network) for your images, etc. would be enough.
If your site has any external references or tracking that runs on the client (WebTrends, etc.), that may even be the bottleneck; it could have nothing to do with your site as such.
This might sound obvious, but here it goes: I'd try to limit the information interchange between the client and the server to the absolute minimum, probably by caching as much information as possible on the first call, and using javascript.
For example: if your app polls the server when the user presses "get me a new blank form", you can instead send a "hidden" blank form (i.e. in a JavaScript string) on the first call, and have it replace the visible one with JavaScript when the user presses the button. No server polling = big gain in perceived responsiveness.
Another example would be an ajax service that renders a complex form every time that the user changes one field. The inefficient (but normally easier to implement) way to do it is having the server "send" the complete form, in html. A more efficient way would be having the server return a short message (maybe encoded in json), and have the client build the form from the message, again with javascript. The trick here is that in some cases you can start rendering the form before the message is received, so the "perceived responsiveness" will be also better.
Finally, see if you can cache things up; if the user is asking for information that you already have, don't poll the server for it again. For example, save the "current state" of the page in a JavaScript array, so if the user presses "back" or "forward" you can just restore from there instead of polling the server.
You are dealing with a "long fat pipe" here, meaning the bandwidth is sufficient (it can still do x KB/s) but the lag is increased. I would be looking at decreasing the number and frequency of requests first, before decreasing the size of the request.
You don't have to reimplement 100% of the business rules in Javascript, but do start chipping away at the simple validation. IMHO this has the potential to give you the best bang for buck.
But of course, don't take my word for it, investigate where the bottleneck happens - i.e. response time or transfer time. Most modern browsers' developer plugins can do that these days.
One thing you should look at is how big is the ViewState of the page. For every postback it will send the viewstate. If it's large and the internet lines are "slow", then you will get lag.
Ways to fix this are to scrutinize your code and turn off ViewState for controls that don't need it, to compress the ViewState before sending it to the client (making the postbacks smaller), or to cache the ViewState on the server and replace it in the page with a GUID or similar, making the postback even smaller.
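The compression idea can be sketched as a base page that gzips the serialized ViewState into its own hidden field. This is only a sketch (assuming .NET 4 for Stream.CopyTo), not production code, and the hidden field name is made up:

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Web.UI;

    // Pages that inherit from this store a gzipped ViewState in a custom hidden field.
    public class CompressedViewStatePage : Page
    {
        private const string FieldName = "__COMPRESSED_VIEWSTATE";   // assumed name

        protected override void SavePageStateToPersistenceMedium(object state)
        {
            var formatter = new LosFormatter();
            using (var writer = new StringWriter())
            {
                formatter.Serialize(writer, state);
                byte[] raw = Convert.FromBase64String(writer.ToString());
                ClientScript.RegisterHiddenField(FieldName, Convert.ToBase64String(Compress(raw)));
            }
        }

        protected override object LoadPageStateFromPersistenceMedium()
        {
            byte[] raw = Decompress(Convert.FromBase64String(Request.Form[FieldName]));
            return new LosFormatter().Deserialize(Convert.ToBase64String(raw));
        }

        private static byte[] Compress(byte[] data)
        {
            using (var output = new MemoryStream())
            {
                using (var gzip = new GZipStream(output, CompressionMode.Compress))
                    gzip.Write(data, 0, data.Length);
                return output.ToArray();
            }
        }

        private static byte[] Decompress(byte[] data)
        {
            using (var input = new MemoryStream(data))
            using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                gzip.CopyTo(output);   // Stream.CopyTo requires .NET 4
                return output.ToArray();
            }
        }
    }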
And of course make sure you have compression (gzip) turned on for your entire site, so that what you send in the first place is compressed.
Also make sure you add cache headers to all static content so that the client caches those files (js, css, images).
Use Fiddler or something similar to monitor how much data is being sent back and forth for your application.
First of all, thanks guys for all the information.
Just a bit of extra background. The site is built using a dynamic form generating engine I wrote.
Basically I created a system whereby the form layout is described in the db and rendered on the fly.
This has been a very useful system in that it allows rapid changes and also means our outputs - on screen, PDF and XML - are synced to these descriptions, which I call form maps. For example, adding a new control or reordering a form is automatically reflected in all renders: form, PDF, XML and standard display.
It does, however, introduce the overhead of having to build the page on every request, including postbacks.
I've been speaking to an architect in my company, and we're probably going to need a few copies around the globe - it's not possible to run a global system for people in different countries from one European data center.
Some of the forms are huge, the customer is unwilling to see sense and break them into smaller portions - so this is also adding to the overhead.
The only thing I have not tried yet is running something like gzip to reduce the payload being sent back and forth...
The data on our website can easily be scraped. How can we detect whether a human is viewing the site or a tool?
One way might be to measure how long a user stays on a page, but I do not know how to implement that. Can anyone help me detect and prevent automated tools from scraping data from my website?
I use a security image in the login section, but even then a human may log in and then use an automated tool. When the reCAPTCHA image appears after a period of time, the user can type it in and then carry on scraping with an automated tool.
I developed a tool to scrape another site. So I only want to prevent this from happening to my site!
DON'T do it.
It's the web, you will not be able to stop someone from scraping data if they really want it. I've done it many, many times before and got around every restriction they put in place. In fact having a restriction in place motivates me further to try and get the data.
The more you restrict your system, the worse you'll make user experience for legitimate users. Just a bad idea.
It's the web. You need to assume that anything you put out there can be read by human or machine. Even if you can prevent it today, someone will figure out how to bypass it tomorrow. Captchas have been broken for some time now, and sooner or later, so will the alternatives.
However, here are some ideas for the time being.
And here are a few more.
And my favorite: one clever site I've run across asks a question like "On our 'About Us' page, what is the street name of our support office?" It takes a human to find the "About Us" page (the link doesn't say "about us"; it says something similar that a person would figure out), and then to find the support office address (different from the main corporate office and several others listed on the page) you have to look through several matches. Current computer technology can't figure that out any more than it can handle true speech recognition or cognition.
A Google search for "captcha alternatives" turns up quite a bit.
This can't be done without risking false positives (and annoying users).
How can we detect whether a human is viewing the site or a tool?
You can't. How would you handle tools that parse the page for a human, like screen readers and accessibility tools?
For example, one way is by measuring how long a user stays on a page, from which we can detect whether a human is involved. I do not know how to implement that; I'm just thinking about this method. Can anyone help with how to detect and prevent automated tools from scraping data from my website?
You won't detect automated tools, only unusual behavior - and before you can define unusual behavior, you need to know what's usual. People view pages in different orders, browser tabs allow them to do things in parallel, etc.
I should make a note that if there's a will, then there is a way.
That being said, I thought about what you've asked previously and here are some simple things I came up with:
simple, naive checks include user-agent filtering; you can find a list of common crawler user agents here: http://www.useragentstring.com/pages/Crawlerlist/ (a minimal check along these lines is sketched after this list)
you can always display your data in flash, though I do not recommend it.
use a captcha
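Picking up the first bullet, a naive user-agent filter might look like the sketch below; the substrings are examples only, and a determined scraper will simply spoof a browser user agent.

    using System;
    using System.Web;

    public static class UserAgentFilter
    {
        // Example markers only; real crawler lists are much longer and change constantly.
        private static readonly string[] KnownBotMarkers = { "bot", "crawler", "spider", "curl", "wget" };

        public static bool LooksLikeBot(HttpRequest request)
        {
            string ua = (request.UserAgent ?? string.Empty).ToLowerInvariant();
            foreach (string marker in KnownBotMarkers)
            {
                if (ua.Contains(marker))
                    return true;
            }
            return false;
        }
    }

    // e.g. in Global.asax's Application_BeginRequest:
    // if (UserAgentFilter.LooksLikeBot(Request)) { Response.StatusCode = 403; Response.End(); }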
Other than that, I'm not really sure if there's anything else you can do but I would be interested in seeing the answers as well.
EDIT:
Google does something interesting: if you're searching for SSNs, after the 50th page or so they will show a captcha. That raises the question of whether you can intelligently time how long a user spends on your page - or, if you introduce pagination into the equation, how long a user spends on a single page.
Using the information we previously assumed, it is possible to put a time limit before another HTTP request is allowed. At that point, it might be beneficial to "randomly" generate a captcha. What I mean is that maybe one HTTP request will go through fine, but the next one will require a captcha. You can switch those up as you please.
Scrapers steal the data from your website by parsing URLs and reading the source code of your pages. The following steps can be taken to at least make scraping a bit more difficult, if not impossible:
Use AJAX requests; they make it harder to parse the data and require extra effort to find the URLs to be parsed.
Use cookies even for normal pages that do not require any authentication: create the cookie when the user visits the home page and then require it for all the inner pages. This makes scraping a bit more difficult.
Display encrypted content on the website and decrypt it at load time using JavaScript. I have seen this on a couple of websites.
I guess the only good solution is to limit the rate at which data can be accessed. It may not completely prevent scraping, but at least you can limit the speed at which automated scraping tools will work, hopefully below a level that discourages scraping the data.
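One way to implement that rate limit is an IHttpModule that counts requests per IP address in the ASP.NET cache; the threshold and response below are illustrative, and a shared store would be needed on a web farm.

    using System;
    using System.Threading;
    using System.Web;
    using System.Web.Caching;

    public class ThrottleModule : IHttpModule
    {
        private const int MaxRequestsPerMinute = 120;   // assumption: tune for real traffic

        public void Init(HttpApplication app)
        {
            app.BeginRequest += OnBeginRequest;
        }

        private static void OnBeginRequest(object sender, EventArgs e)
        {
            var app = (HttpApplication)sender;
            string key = "throttle:" + app.Context.Request.UserHostAddress;

            var counter = app.Context.Cache[key] as int[];
            if (counter == null)
            {
                counter = new[] { 0 };
                app.Context.Cache.Insert(key, counter, null,
                    DateTime.UtcNow.AddMinutes(1), Cache.NoSlidingExpiration);
            }

            if (Interlocked.Increment(ref counter[0]) > MaxRequestsPerMinute)
            {
                app.Context.Response.StatusCode = 429;   // Too Many Requests
                app.Context.Response.SuppressContent = true;
                app.CompleteRequest();                   // short-circuit the pipeline
            }
        }

        public void Dispose() { }
    }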
Concerning pages that build a web application:
Lately, I have found myself creating web pages that are simpler than the ones I used to build. Before, I would try to jam as much functionality into a single page as I could to avoid having lots of pages.
I am starting to realize that this was just making things way more complex, convoluted, and confusing than it had to be. Why not have more pages? I think the reason that I was doing this was because I didn't want the user to have to browse to other pages; just to have all the functionality they needed on a single page.
Well, these good intentions turned into an overly confusing interface for the user and very unmanageable source code. I am a new developer and I am trying to be very reflective of what I am doing so that I can improve. If it makes a difference, I am developing in ASP.net (though these are probably considerations for any platform).
My questions are:
Am I overthinking these things?
Has anyone else found themselves doing this?
Where is the happy medium?
There is no expert who can give you a rule that works in all places at all times. I have been known in my industry for years for "easy" interfaces and we've won significant amounts of business for it (as well as 5 "Best in Class" awards). I have also had people within my company and outside of it tell me - for years - that they like my work but wish that I would "jazz it up" with more graphics and such. What always amazes me is how little connection people see between the two.
So...a few rules of thumb:
A page should do one main thing.
A page may well have multiple links related to the main thing
Menuing and link layout should be consistent across pages
Simpler is better than more complex
Pages should be visually appealing and inviting
Rule 4 is more important than rule 5.
For example, my product provides an interface that lets people define classes and events to be displayed in a calendar. I could have one page that lets you Review, Add, Update, Delete and Edit the classes. Indeed, in some simpler areas, I've used the gridview to let people manage everything in a grid. However, classes have too much information to do this and still follow the rules above.
So,
The main idea is: "Here is a list of classes for this location"
The links are "Add New", shown above and to the right of the grid; Change and Delete are links within each row. This is consistent across the app.
Menuing for the system as a whole is always across the right/top. Nothing else appears on the class/event page except for standard elements common to all pages (a logo, a header, a footer).
The grid is nicely styled but there are no spurious graphics (4,5,6)
A few last things about UIs and graphic design.
First, develop your own vision and be consistent across pages and apps.
Second, do not be afraid of simplicity.
Next, when soliciting advice from others keep in mind that you do not want their advice - you want their impressions: you want to understand the way they perceive the interface. Advice is sometimes good but, more often than not, actually harmful. In my experience, everyone thinks that they are a UI expert.
When you do your hallway (or formal) usability testing, you should discount almost all advice to the effect that "you should make that stand out more." As you'll see, it will quickly become "and that," "and that," "and the other." If you follow this advice, you'll end up with a mess due to Brittingham's first rule of design: if everything is important, then nothing is. (There you go: when explaining why you can't make something stand out more, just tell them that "it violates Brittingham's first rule of design!")
Hope this helps!
You hit the nail on the head. Use the KISS principle. (Keep It Simple Stupid)
I've done this in the past as well, and not only does it make for a hideous UI, it also makes it confusing which operations you can perform on the page because there is too much functionality. I've often found in testing that I did not have enough checks to see whether the user could perform a certain operation based on the state of the data.
It's easy enough in ASP.Net to write several pages that do simple tasks and then link them together with Response.Redirect or Server.Transfer. Now all I try to achieve on any given page is what the design specs say. So if my page is just a search page, that's all I give. If the user wants to see the details of an item that was returned in the search, then I send them to an itemDetails.aspx page.
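As a rough sketch of that hand-off (the grid, its DataKeyNames setting and the id parameter are assumptions):

    protected void ResultsGrid_SelectedIndexChanged(object sender, EventArgs e)
    {
        // The search page stays a search page; details live on a separate itemDetails.aspx.
        int itemId = Convert.ToInt32(ResultsGrid.SelectedDataKey.Value);

        // Response.Redirect gives the user a bookmarkable URL; Server.Transfer would keep
        // the browser on the search page's URL while rendering itemDetails.aspx server-side.
        Response.Redirect("itemDetails.aspx?id=" + itemId);
    }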
You've broken through a wall that most software developers have - the one that was blocking your view of usability. A lot of developers don't really think about it and try to make things easier for themselves by stuffing functionality into one window, web page or whatever.
The thing is, once you start designing software from the user's point of view, i.e. making it easier to use, several things become clear. One is code maintenance: code is much more manageable to work on if you don't stuff everything into one giant class or whatever travesty you've been committing. The second is usability itself: you start to think about how the user actually uses your application through the graphical interface. The third is avoiding requirements or scope creep, where you stop developing functionality that the user doesn't need.
We as users want simplicity, partly because we don't want to spend most of our time muddling through a bad UI when we could get our work done faster with a simple and slick one. That makes it the right thing for us software developers to do: think through your design at all levels... that, and specs always lie.
Definitely agree: most attempts at writing pages/forms that do too much have resulted in bugs and rewrites. Problems occur with:
Keeping all parts valid/synchronized.
Excess managing of users' expectations ("I've entered a bill number here and clicked 'find person' there but it gives an error message. Why?") when the two are logically separate. These questions cannot arise if only the valid options are visible.
Formatting/layout issues: in ASP.NET pages, trying to lay out independent user controls turns out to be a nightmare ("But we really want all the buttons vertically aligned!" across separate user controls - good luck with that).
I'd consider webpages with more than one functionality only if the target audience consists of domain experts, i.e. people that need lots of functionality on one page for better productivity (think data-entry or financial software with lots of variables).
Even then, most of the time it's possible to separate pages into single units.
No
Yes - me
I found the happy medium was to use master pages, and to use them in a way similar to IFrames. That way I could have lots of functionality combined well together. There is a more interesting way of doing this in WPF/Silverlight, called Prism.
The amount of functionality on a page is usually not determined by you but by your customer. If the customer demands a single page to update some VeryComplexObject, you're likely to end up with an aspx page that has a significant number of lines. The main reason is that you simply have a lot of event handlers for all the actions on the page.
Whether that page is complex is entirely up to you. You should always attempt to make your code-behind file as simple and clean as possible. Some suggestions in that direction:
Move all business code to another application layer.
Use ObjectDataSource for providing data to data-bound controls such as ListView, GridView, Repeater, ... Delegating loading of data to a dedicated object prevents a lot of overhead in your aspx.cs file.
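For instance, a minimal data object that an ObjectDataSource could point at might look like this (NewsRepository, NewsItem and NewsDataLayer are assumed names; the markup is shown in comments for reference):

    using System.Collections.Generic;
    using System.ComponentModel;

    // The attributes let the designer and the ObjectDataSource wizard discover this class.
    [DataObject(true)]
    public class NewsRepository
    {
        [DataObjectMethod(DataObjectMethodType.Select, true)]
        public List<NewsItem> GetLatest(int count)
        {
            // Delegate to the business/data layer; the page never sees a connection string.
            return NewsDataLayer.GetLatest(count);
        }
    }

    // Corresponding markup, for reference:
    // <asp:ObjectDataSource ID="NewsSource" runat="server"
    //     TypeName="NewsRepository" SelectMethod="GetLatest">
    //   <SelectParameters>
    //     <asp:Parameter Name="count" Type="Int32" DefaultValue="10" />
    //   </SelectParameters>
    // </asp:ObjectDataSource>
    // <asp:GridView ID="NewsGrid" runat="server" DataSourceID="NewsSource" />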
Another suggestion is to use user controls to implement portions of your page. You would usually only do this when you can reuse the user control, but it can also be of great help reducing page complexity (both of your code-behind file as well as your aspx).
Sometimes I think we are all guilty of forgetting just who it is that we develop our applications for. It isn't always easy as a developer to take a step back and look at your application as a user might. This is why big companies employ hundreds of people to do this for them, and they don't always get it right.
Usability is a massive subject, but it is definitely something that all developers need to keep in mind. It has taken me a long time to learn this, but when tackling any development task I always try to think about how my users are going to interact with what I am writing. This makes a difference at all levels of your development.
I would suggest reading Don't Make Me Think by Steve Krug. This book won't take you an age to read and it puts across some fantastic ideas that can help you to develop applications that are much easier to use and understand.
I always find that once I have thought about the user experience the decisions about what my web pages are going to do and how they are going to interact are much easier to make.
Maybe you should ask the people who are using your site. Or better yet, just watch people use your site. I think that would tell you if your site is designed well, or if you need to change it.
Although ASP.NET MVC seems to have all the hype these days, WebForms are still quite pervasive. How do you keep your project sane? Let's collect some tips here.
I generally try to steer clear of it... but when I do use WebForms, I follow these precepts:
Keep the resulting HTML clean: Just because you're not hand-coding every <div> doesn't mean the generated code has to become an unreadable nightmare. Avoiding controls that produce ugly code can pay off in reduced debugging time later on, by making problems easier to see.
Minimize external dependencies: You're not being paid to debug other people's code. If you do choose to rely on 3rd-party components then get the source so you don't have to waste unusually large amounts of time fixing their bugs.
Avoid doing too much on one page: If you find yourself implementing complex "modes" for a given page, consider breaking it into multiple, single-mode pages, perhaps using master pages to factor out common aspects.
Avoid postback: This was always a terrible idea, and hasn't gotten any less terrible. The headaches you'll save by not using controls that depend on postback are a nice bonus.
Avoid VIEWSTATE: See comments for #4.
With large projects the best suggestion that I can give you is to follow a common design pattern that all your developers are well trained in and well aware of. If you're dealing with ASP.NET then the best two options for me are:
Model View Presenter (though this has now split into Supervising Controller and Passive View).
This is a solid model, pushing separation between your user interface and business model, that all of your developers can follow without too much trouble. The resulting code is far more testable and maintainable. The problem is that it isn't enforced and you are required to write lots of supporting code to implement the model.
ASP.NET MVC
The problem with this one is that it's in preview. I spoke with Tatham Oddie and he mentioned that it is very stable and usable. I like it; it enforces the separation of concerns and does so with minimal extra code for the developer.
I think that whatever model you choose, the most important thing is to have a model and to ensure that all of your developers are able to stick to that model.
Create web user controls for anything that will be shown on more than one page that isn't a part of masterpage type content. Example: If your application displays product information on 10 pages, it's best to have a user control that is used on 10 pages rather than cut'n'pasting the display code 10 times.
Put as little business logic in the code-behind as possible. The code-behind should defer to your business layer to perform any work that isn't directly related to putting things on the page and shuttling data back and forth from the business layer (a short sketch of this follows the list).
Do not reinvent the wheel. A lot of sloppy codebehinds that I've seen are made up of code that is doing things that the framework already provides.
In general, avoid script blocks in the html.
Do not have one page do too many things. Something I have seen time and time again is a page that, say, has add and edit modes. That's fine. However, if you have many sub-modes within add and edit, you are better off having multiple pages for each sub-mode, with reuse through user controls. You really need to avoid having a bunch of nested ifs to determine what your user is trying to do and then showing the correct things depending on that. Things get out of control quickly if your page has many possible states.
Learn/Grok the page lifecycle and use it to your advantage. Many ugly codebehind pages that I've seen could be cleaner if the coder understood the page lifecycle better.
Start with master pages on day #1 - it's a pain to come back and retrofit them.
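To illustrate the thin code-behind tip above, here is a sketch of a page that only wires data to controls and delegates everything else to a business layer (ProductService, ProductsGrid and the field names are all assumptions):

    using System;

    public partial class ProductList : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            if (!IsPostBack)
            {
                // The page only binds data; fetching and filtering live in the business layer.
                ProductsGrid.DataSource = ProductService.GetActiveProducts();
                ProductsGrid.DataBind();
            }
        }

        protected void Save_Click(object sender, EventArgs e)
        {
            // Validation, persistence and business rules belong to the service, not the page.
            ProductService.UpdatePrice(Convert.ToInt32(ProductIdField.Value),
                                       decimal.Parse(PriceBox.Text));
        }
    }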
Following what Odd said, I am trying out a variant of MVP called Presentation Model, which is working well for me so far. I am still getting an understanding of it and adapting it to my own use, but it is refreshing compared to the code I used to write.
Check it out here: Presentation Model
Use version control and a folder structure to prevent too many files from all being in the same folder. There is nothing more painful than waiting for Windows Explorer to load something because there are 1,000+ files in a folder and it has to load all of them when the folder is opened. A convention on naming variables and methods is also good to have upfront if possible so that there isn't this mish-mash of code where different developers all put their unique touches and it painfully shows.
Using design patterns can be helpful in organizing code and having it scale nicely, e.g. a strategy pattern can lead to an easier time when one has to add a new type of product or device that has to be supported. Similar for using some adapter or facade patterns.
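As a tiny illustration of the strategy-pattern point above (the shipping example and its numbers are invented), adding support for a new product or shipping type means adding one class rather than editing a switch statement in several places:

    public interface IShippingStrategy
    {
        decimal GetShippingCost(decimal weightKg);
    }

    public class StandardShipping : IShippingStrategy
    {
        public decimal GetShippingCost(decimal weightKg) { return 4.99m + 0.5m * weightKg; }
    }

    public class OversizedShipping : IShippingStrategy
    {
        public decimal GetShippingCost(decimal weightKg) { return 19.99m + 1.2m * weightKg; }
    }

    public class Order
    {
        private readonly IShippingStrategy _shipping;

        public Order(IShippingStrategy shipping) { _shipping = shipping; }

        // The order never needs to know which concrete strategy it was given.
        public decimal TotalShipping(decimal weightKg)
        {
            return _shipping.GetShippingCost(weightKg);
        }
    }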
Lastly, know what standards your forms are going to uphold: is the site just for IE users, or should IE, Firefox and Safari all load the forms easily and have them look good?