How do I estimate how much bandwidth my web scraper uses? - web-scraping

I'm looking into web scraping proxy tools and am interested in Smart Proxy. The thing that concerns me is that I'll pay by bandwidth and I don't have any idea how much bandwidth my web scraper uses. Is there a way for me to estimate this before I sign up for the service?
Maybe something simple like manually loading the web pages and inspecting the transferred sizes in Chrome DevTools?
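One rough way to turn a few measured page sizes into a monthly figure is sketched below. The sample sizes, page counts, and the `overhead` factor (a guess to cover headers and retries) are all hypothetical placeholders, not measurements. Note that if the scraper fetches only the HTML document, the size of that single request is the relevant number, not the page total that a browser shows after loading images, CSS, and scripts.

```python
def estimate_monthly_bandwidth_gb(avg_page_bytes, pages_per_run, runs_per_month, overhead=1.1):
    """Rough monthly transfer in GB (10^9 bytes).

    overhead is an assumed fudge factor for headers, redirects, and retries.
    """
    total_bytes = avg_page_bytes * pages_per_run * runs_per_month * overhead
    return total_bytes / 1e9


# Hypothetical "transferred" sizes (bytes) noted by hand for three sample pages.
sample_sizes = [182_000, 240_000, 205_000]
avg = sum(sample_sizes) / len(sample_sizes)

# Assumed workload: 5,000 pages per run, one run per day for 30 days.
estimate = estimate_monthly_bandwidth_gb(avg, pages_per_run=5_000, runs_per_month=30)
print(f"{estimate:.1f} GB per month")
```

Sampling more pages, and pages of different types, should tighten the average before multiplying it out.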

Related

How do you set up a caching proxy on OSX?

When doing web development there are times external resources are referenced in a web page (e.g. Google Fonts). I would like to cache some of these calls on my MacBook but not cache the code I'm working on.
The goal being speed of development and a workaround when working on slow networks (e.g. 3G using a hotspot).
I came across Squid proxy but have not been able to configure it at all. I'm open to other suggestions to achieve this goal. Any ideas?
I'm using SquidMan : http://squidman.net
It comes with a nice user interface.

How to make asp.net web forms application faster?

My last project is a medium size asp.net web forms application. It is built using:
asp.net 3.5
linq to sql dbml --> sql server database (9 tables)
ext.net 1.6 (www.ext.net)
structuremap 2.5.3.0
This time I believed I did my best in terms of architectural design, code and data transfer optimizations. I followed all the advice I could find to work with the database efficiently through linq to sql, and I built layers (model, repository, service, presentation) to separate concerns and lighten the code in the aspx code-behind files.
The problem is: I've installed the application in various web hosting servers with the same pitiful result: the application is struggling to work... pages are loading like in slow motion...
In the past I would say 'OK, I didn't do all I could to speed things up' but in this case I really tried to apply the best practices...
Is there anything else I can do about it? Or is asp.net only suitable for really small projects?
thank you.
ASP.NET is fine for building large scale websites. As Brad mentioned, StackExchange sites are built using it, and StackOverflow is a very busy site indeed.
What you need to do first is measure performance; until you do that, you're just guessing at where the problem areas are.
So start with the browser - use a tool such as Firebug, YSLOW, or Google Chrome dev tools, whatever takes your fancy, and run your site with the tool enabled. The tools can tell you how long things are taking, e.g. how long requests take to process and how long content takes to download.
YSLOW will also give you some tips on anything it finds to be a bit slow, e.g. you're making too many HTTP requests, or you should consider minifying your CSS/JS files. You will get a general overview of how the site is performing and where problems could be.
To dig a bit deeper, use a tool like RedGate's ANTS Profiler: get the trial version and measure your website and server-side code with it. There are other tools, though I'm not aware of any free ones.
My first question is: when is it slow? Did you try your project on a local area network? Please check there first. If it is slow there too, then the application itself needs some improvement.
Slow performance can depend on many things, such as large data loads, too much logic on one page, etc.
Please let me know.
Thanks
Basit.

Load Balancing in Asp.Net

I am building a project for the university. When admissions start, a lot of traffic suddenly hits my site, around 50,000 to 100,000 users, and the site goes down. How can I manage this? Please provide details.
thank you very much
There are several different things you should look at:
ASP.NET Caching
Do you have a caching strategy? There are a lot of features in ASP.NET that will allow you to optimize how the server responds to requests. Take a look at:
http://msdn.microsoft.com/en-us/library/xsbfdd8c(v=VS.100).aspx
Load Balancing is really applied by the web server, not the server framework (ASP.NET / etc). If general optimizations and caching still give you problems, look into load balancing and web farms for IIS. Take a look at:
http://learn.iis.net/page.aspx/213/network-load-balancing/
If you have a lot of money, there are hardware solutions for load balancing that work with IIS. That's outside the province of programming and StackOverflow, but it's helpful to know they are out there, especially if you need to have a discussion with management about the pros and cons (expense!!) of which route to take.
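To make the caching idea concrete, here is a minimal, language-neutral sketch in Python of a time-based cache. It is not the ASP.NET caching API; `TTLCache`, `get_or_compute`, and `load_admission_page` are hypothetical names used only for illustration. The point is that an expensive page or query result is computed once and then served from memory until it expires.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive work
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


calls = []


def load_admission_page():
    # Hypothetical stand-in for an expensive database query or page render.
    calls.append(1)
    return "<html>admission form</html>"


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):
    page = cache.get_or_compute("admission", load_admission_page)

print(len(calls))  # number of times the expensive function actually ran
```

With a spike of many users requesting the same page, most requests would hit the cached copy rather than the database, which is the effect the caching features linked above aim for.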

Performance testing strategy web app

We recently had a web app that went out to site acceptance testing, where they found severe performance problems related to request size (massive ASP.NET viewstate).
We need to amend our testing strategy to include performance testing. Can anyone give us guidance on best practices, please?
This is a very broad case to cover, but here are a few of the highlights of things that we do on a regular basis.
DO NOT just test on your network, get remote testing in. LAN connections are very fast, so large pages and long load times can go unnoticed. Ideally, test from a place that mimics the production location with regard to hardware and proximity/connection to the end user.
Use ANTS Profiler or similar tool to profile for expensive methods, and high memory usage.
Test with multiple users, to simulate load. Depending on the nature of the application also load test, either with multiple physical testers or with testing tools that allow you to simulate and script a load scenario.
Review the code to see if objects are retaining viewstate when they shouldn't need to.
I don't know a hard and fast set of "rules" but I find these are good starting points.
In addition to Mitchel's comments above I would recommend conducting load testing as part of your Continuous Integration (CI) process. Visual Studio Team Suite (Test Edition) contains a good load/stress test tool.
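As a rough illustration of what a scripted load scenario measures, here is a small Python sketch that fires simulated users concurrently and reports error count and an approximate 95th-percentile latency. `handle_request` is a hypothetical stand-in that just sleeps; in a real test it would issue an HTTP request to the app under test, and a dedicated load-testing tool would normally replace this entirely.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(user_id):
    """Hypothetical stand-in for one HTTP request to the app under test."""
    time.sleep(0.01)  # pretend the server took about 10 ms
    return 200


def run_load_test(request_fn, users, concurrency):
    """Run `users` requests with at most `concurrency` in flight at once."""

    def timed(user_id):
        start = time.perf_counter()
        status = request_fn(user_id)
        return status, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, range(users)))

    latencies = sorted(elapsed for _, elapsed in results)
    errors = sum(1 for status, _ in results if status >= 400)
    # Approximate 95th percentile by index into the sorted latencies.
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return {"requests": len(results), "errors": errors, "p95_seconds": p95}


report = run_load_test(handle_request, users=50, concurrency=10)
print(report)
```

Running something like this in CI and failing the build when the error count or latency crosses a threshold is one way to catch regressions such as a ballooning viewstate early.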

What are the application design aspects to be considered when developing a Multi-Tier, High Availability web application?

The application is planned to be built using ASP.NET, .NET Remoting & MS SQL Server.
High availability is required at presentation layer, application layer and database.
Does IIS 7.0 provide any advantages over IIS 6.0 in regard to the High availability aspect?
Among the many aspects you want to consider, make sure that you have numbers.
By numbers, I mean: how many requests per second do you want to deliver? How many users per day are you planning for? Are they all going to come in 1 hour or throughout the day? Are they simply buying stuff on an e-commerce website, or is it a social network website with lots of pictures and videos?
All those questions matter in how you architect your website. If you go with a simple e-commerce website that should not crash, make sure to have 2 servers with load balancing and some health monitoring on the IIS process. For the database, 1 machine will do the trick, especially if you have some RAID hard drives.
However, if you go toward a social network site... things get freaky fast. If users upload pictures, you will need lots of space, and much more if they upload videos. You might want to use a cloud service to host those pictures without too many fees. For videos, you might want to use embedded links, like YouTube or Google Video.
As for IIS 7.0 versus IIS 6.0, I don't think there will be any significant changes. Both are really reliable.
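To show why the "numbers" questions above matter, here is a back-of-the-envelope sketch in Python. The figures (100,000 users per day, 20 requests per user) are made-up placeholders, and `peak_requests_per_second` is a hypothetical helper; the only point is how much the required throughput changes when the same traffic arrives in one hour instead of being spread over the day.

```python
def peak_requests_per_second(users_per_day, requests_per_user, peak_share, peak_window_seconds):
    """Average request rate during the busiest window.

    peak_share is the fraction of the day's traffic that lands in that window.
    """
    peak_requests = users_per_day * requests_per_user * peak_share
    return peak_requests / peak_window_seconds


# Assumed workload: 100,000 users per day, 20 requests each.
spread_out = peak_requests_per_second(100_000, 20, peak_share=1.0, peak_window_seconds=24 * 3600)
one_hour = peak_requests_per_second(100_000, 20, peak_share=1.0, peak_window_seconds=3600)

print(f"spread over the day: {spread_out:.0f} req/s")
print(f"all within one hour: {one_hour:.0f} req/s")
```

The same daily total needs 24 times the capacity when it is compressed into a single hour, which is why the arrival pattern belongs in the design inputs alongside the raw user count.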
Take a look at the High Scalability Blog
Make sure your design scales in a horizontal manner.
That is, put your system behind a load-balancing layer, with the servers that actually provide the service sitting behind that layer.
When you need to increase capacity, you build a new server or servers and plug them in alongside the existing servers. Then you configure the load-balancing layer to also consider the new server(s) when passing out the work.
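The idea can be sketched with a toy round-robin balancer in Python. This is only an illustration of the concept, not how IIS or any particular load balancer is configured; `RoundRobinBalancer` and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation; new servers can be added at any time."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # running counter of requests handed out

    def add_server(self, server):
        # Horizontal scaling: plug a new server in alongside the others.
        self._servers.append(server)

    def pick(self):
        if not self._servers:
            raise RuntimeError("no servers available")
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["web1", "web2"])
first_round = [balancer.pick() for _ in range(4)]

balancer.add_server("web3")  # capacity increase, no change to clients
second_round = [balancer.pick() for _ in range(6)]

print(first_round)
print(second_round)
```

Clients only ever talk to the balancer, so adding `web3` changes how the work is shared without changing anything the clients see. A real balancer would also run health checks and stop routing to servers that fail them.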
