We are very new to Google Cloud and still learning.
I have a few questions.
First:
Can I assign country-specific IP addresses to virtual instances? For example, I would like to serve one website from a German IP range and another from an Italian IP range.
Where is the best place to start, or is this even possible in the cloud?
Second:
We had a DDoS attack against our cloud deployment, and resource usage peaked while we were under attack. Will Google charge extra for that peak, or will it be billed normally?
The second question brings me to a third:
We use Cloudflare for our domains. Is there a reliable way to prevent DDoS attacks on Google Cloud?
I appreciate your time and answers.
To your first point, are you after finding the shortest path between your users and wherever you serve your content? If that's the case, you can simply put a load balancer in front of your backend services within Google Cloud, with a global public forwarding IP address, and the service itself will take care of directing the traffic to the nearest available group of machines. Here is an example of an HTTP(S) Load Balancer setup.
Or is localization what you are trying to achieve? In that case I'd rely on more standard forms of handling the language of choice like using browser settings (or user account settings if existing) or the Accept-Language header. This is a valuable resource from LocalizeJS.
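To make the Accept-Language idea a bit more concrete, here is a minimal sketch of how a server could pick a language from that header. The function names and the set of supported languages are my own assumptions for illustration, and the parsing is deliberately simplified (it only looks at the q parameter and the primary subtag).

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without q= counts as q=1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, supported, default="en"):
    """Pick the first supported language, matching on the primary subtag."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default
```

For example, pick_language("de-DE,de;q=0.9,en;q=0.8", {"en", "de", "it"}) would select German, with no IP lookup involved.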
Lastly, if you are determined to deploy multiple versions of your application for the different languages you support, you could still have an intermediate service that determines the source of the request using IP-based lookups and redirects the user to the version of your choice. That said, my feeling is that this is a more traditional approach; in a world of client applications that are responsive and localized on the spot, the extra hop/redirect could annoy some users.
To your second point, there are a number of protections already built into some services within Google Cloud that help you protect your applications and machines in different ways. On the DDoS front, you can benefit from policies and protections on the CDN side, where you get cache- and scaling-based preventive measures.
In addition, if you have a load balancer in front of your content, you can benefit from protections on layers 3, 4 and 7 of the OSI model. That includes typical HTTP floods, SYN floods, port exhaustion and NTP amplification attacks.
What this means is that in many situations your infrastructure will not even notice these potential attacks, as they'll be mitigated before they reach it (and therefore you will not be billed for that traffic). That said, I have heard of and experienced situations in which these protections did not act in a timely fashion, or were not triggered at all. In these scenarios, your system may have to handle that extra load. However, especially when the attack was obviously malicious and of a kind that is documented as being handled by Google Cloud, there is a chance to make your case with Google in order to get some support on the topic.
A bit more on that here.
Hope this is helpful.
Should information such as application metrics that are devoid of any business information still be encrypted over HTTPS when transmitted within an organization's ecosystem, behind firewalls?
The reason I am asking is that the metrics data does not give away any business information and is already behind a firewall. Above all, the data is tremendous in size (time-series data, millions of records per second). Does it make sense to avoid the computational cost of HTTPS, which forces encryption/decryption at every hop of the metrics' journey from source to destination, by applying an ingress policy that routes the metrics packets via another port such as 8080 to skip encryption/decryption, thus saving us a lot on resource utilization and, of course, time?
Or is this a known compromise that can turn into a vulnerability and lead to breaches in the system?
Context:
The applications being monitored are communicating over HTTPS.
The metrics scraping agents are asked to communicate over HTTP.
An ingress policy applied on the application node recognizes calls from the known metrics scraping agent and routes the packets via a non-HTTPS port such as 8080, in order to skip certificate validation and, mainly, decryption of the metrics payload in the incoming request.
I am looking for suggestions and inputs, especially from someone who has had this problem to solve in their experience. Anybody else with relevant information is more than welcome to add to it.
Any leads appreciated.
Thank you, in advance.
the metrics data does not give away any business information
I think this is not true. Metrics can record traffic patterns also in a business context (e.g.: what users searched for/bought the most, etc.).
Also, it can accidentally contain sensitive information (it should not but accidents can happen). Additionally, it can help attackers to get more data about:
Your infrastructure (what platforms you use)
Your environment (OS, Java version, etc.)
Your app topology (who calls who)
Please check the Fallacies of distributed computing:
#4 The network is secure
Being behind a firewall does not mean attackers can't get in, that's one of the reasons why you use HTTPS on the internal network.
UPDATE: See the answer I've provided below for the solution I eventually got set up on AWS.
I'm currently experimenting with methods to implement a global load-balancing layer for my app servers on Digital Ocean and there are a few pieces I've yet to put together.
The Goal
Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.
Additionally, I would eventually like to automate the maintenance of this by writing a daemon that can monitor, scale, and heal any of the servers on the system. Or I'll combine various services to achieve the same automation goals. First I need to figure out how to do it manually.
The Stack
Ubuntu 14.04
Nginx 1.4.6
node.js
MongoDB from Compose.io (formerly MongoHQ)
Global Domain Breakdown
Once I rig everything up, my domain would look something like this:
**GLOBAL**
global-balancing-1.myapp.com
global-balancing-2.myapp.com
global-balancing-3.myapp.com
**NYC**
nyc-load-balancing-1.myapp.com
nyc-load-balancing-2.myapp.com
nyc-load-balancing-3.myapp.com
nyc-app-1.myapp.com
nyc-app-2.myapp.com
nyc-app-3.myapp.com
nyc-api-1.myapp.com
nyc-api-2.myapp.com
nyc-api-3.myapp.com
**SFO**
sfo-load-balancing-1.myapp.com
sfo-load-balancing-2.myapp.com
sfo-load-balancing-3.myapp.com
sfo-app-1.myapp.com
sfo-app-2.myapp.com
sfo-app-3.myapp.com
sfo-api-1.myapp.com
sfo-api-2.myapp.com
sfo-api-3.myapp.com
**LON**
lon-load-balancing-1.myapp.com
lon-load-balancing-2.myapp.com
lon-load-balancing-3.myapp.com
lon-app-1.myapp.com
lon-app-2.myapp.com
lon-app-3.myapp.com
lon-api-1.myapp.com
lon-api-2.myapp.com
lon-api-3.myapp.com
And then if there's any strain on any given layer, in any given region, I can just spin up a new droplet to help out: nyc-app-4.myapp.com, lon-load-balancing-5.myapp.com, etc…
Current Working Methodology
A (minimum) trio of global-balancing servers receive all traffic.
These servers are "DNS Round-Robin" balanced as illustrated in this (frankly confusing) article: How To Configure DNS Round-Robin Load Balancing.
Using the Nginx GeoIP Module and MaxMind GeoIP Data the origin of any given request is determined down to the $geoip_city_continent_code.
The global-balancing layer then routes the request to the least connected server on the load-balancing layer of the appropriate cluster: nyc-load-balancing-1, sfo-load-balancing-3, lon-load-balancing-2, etc. This layer is also a (minimum) trio of droplets.
The regional load-balancing layer then routes the request to the least connected server in the app or api layer: nyc-app-2, sfo-api-1, lon-api-3, etc…
The details of the Nginx kung fu can be found in this tutorial: Villiage Idiot: Setting up Nginx with GSLB/Reverse Proxy on AWS. More general info about Nginx load-balancing is available here and here.
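To spell out the routing decision described in the steps above, here is a rough Python sketch of the logic (not the actual Nginx config). The continent-to-cluster table, the default cluster and the connection counts are assumptions made up for illustration; only the server naming scheme comes from the breakdown above.

```python
# Hypothetical mapping from GeoIP continent codes to regional clusters.
CLUSTER_BY_CONTINENT = {
    "NA": "nyc", "SA": "nyc",
    "EU": "lon", "AF": "lon",
    "AS": "sfo", "OC": "sfo",
}


def pick_upstream(continent_code, connections, default_cluster="nyc"):
    """Pick the least connected load balancer in the cluster for a continent.

    connections maps a server name to its current number of open connections.
    """
    cluster = CLUSTER_BY_CONTINENT.get(continent_code, default_cluster)
    prefix = cluster + "-load-balancing-"
    candidates = {
        name: count for name, count in connections.items()
        if name.startswith(prefix)
    }
    if not candidates:
        raise LookupError("no load balancers known for cluster " + cluster)
    # Iterate in name order so ties resolve to the lowest-numbered server.
    return min(sorted(candidates), key=candidates.get)
```

Note that this only models the choice of upstream. The request itself still has to reach a global-balancing server first.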
Questions
Where do I put the global-balancing servers?
It strikes me as odd to put them all in one place, and equally odd to spread that layer out around the globe. Say, for instance, I put them all in NYC. Then someone from France hits my domain. The request would go from France, to NYC, and then be routed back to LON. Or if I put one of each in SFO, NYC, and LON then isn't it still possible that a user from Toronto (Parkdale, represent) could send a request that ends up going to LON only to be routed back to NYC?
Do subsequent requests get routed to the same IP?
As in, if a user from Toronto sends a request that the global-balancing layer determines should be going to NYC, does the next request from that origin go directly to NYC, or is it still luck of the draw whether it will hit the nearest global-balancing server (NYC in this case)?
What about sessions?
I've configured Nginx to use the ip_hash; directive so it will direct the user to the same app or api endpoint (a node process, in my case) but how will global balancing affect this, if at all?
Any DNS Examples?
I'm not exactly a DNS expert (I'm currently trying to figure out why my CNAME records aren't resolving) but I'm a quick study when provided with a solid example. Has anyone gone through this process before and can provide a sample of what the DNS records look like for a successful setup?
What about SSL/TLS?
Would I need a certificate for every server, or just for the three global-balancing servers since that's the only public-facing gateway?
If you read this whole thing then reward yourself with a cupcake. Thanks in advance for any help.
The Goal: Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.
The global-balancing layer then routes the request to the least connected server...
If I'm reading your configuration correctly, you're actually proxying from your global balancers to the balancers at each region. This does not meet your goal of routing users to the nearest region.
There are three ways that I know of to get what you're looking for:
30x Redirect: Your global balancers receive the HTTP request and then redirect it to a server group in or near the region they think the request is coming from, based on IP address. This sounds like what you were trying to set up. This method has side effects for some applications, and also increases the time it takes for a user to get data, since you're adding a ton of overhead (an extra round trip before the real request). It only makes sense if the resources you're redirecting to are very large, and the local regional cluster will be able to serve them much more efficiently.
Anycast (taking advantage of BGP routing): This is what the big players like Akamai use for their CDN. Basically, there are multiple servers out on the internet with the exact same routable IP address. Suppose I have servers in several regions, and they have the IP address of 192.0.2.1. If I'm in the US and try to connect to 192.0.2.1, and someone in Europe tries to connect to 192.0.2.1, it's likely that we'll each be routed to the nearest server. This uses the internet's own routing to find the best path (based on network conditions) for the traffic. Unfortunately, you can't just use this method. You need your own AS number, and physical hardware. If you find a VPS provider that lets you have a chunk of their Anycast block, let me know!
Geo-DNS: There are some DNS providers that provide a service often marketed as "Geo-DNS". They have a bunch of DNS servers hosted on anycast addresses which can route traffic to your nearest servers. If a client queries a European DNS server, it should return the address for your European region servers, rather than servers in other regions. There are many variations on the Geo-DNS services. Others simply maintain a geo-IP database and return the server for the region they think is closer, just like the redirect method but at the DNS level, before the HTTP request is ever made. This is usually the best option, for price and ease of use.
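As a rough illustration of the geo-IP database variant of Geo-DNS, the decision boils down to something like the sketch below. The networks, regions and addresses are placeholders from the documentation ranges (not real geo-IP data), and resolve is a hypothetical name; a real provider does this inside its DNS servers.

```python
import ipaddress

# Placeholder "geo-IP database": documentation networks mapped to made-up regions.
REGION_NETWORKS = [
    (ipaddress.ip_network("198.51.100.0/24"), "eu"),
    (ipaddress.ip_network("203.0.113.0/24"), "us"),
]

# Placeholder A records for each regional cluster.
REGION_RECORDS = {"eu": "192.0.2.10", "us": "192.0.2.20"}


def resolve(client_ip, default_region="us"):
    """Return the address a Geo-DNS server might answer with for this client."""
    address = ipaddress.ip_address(client_ip)
    for network, region in REGION_NETWORKS:
        if address in network:
            return REGION_RECORDS[region]
    # Unknown origin: fall back to a default region.
    return REGION_RECORDS[default_region]
```

The client never sees a redirect here. It simply gets a different answer to the same DNS question depending on where it appears to be.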
Do subsequent requests get routed to the same IP?
Many load balancers have a "stickiness" option that says requests from the same network address should be routed to the same end server (provided that end server is still up and running).
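A minimal sketch of what address-based stickiness means, assuming a simple hash over the client IP (this is a generic illustration, not how any particular load balancer implements it, and sticky_backend is a name I made up):

```python
import hashlib


def sticky_backend(client_ip, backends):
    """Map a client address to one backend, the same one on every call.

    backends should contain only the servers that are currently up.
    """
    if not backends:
        raise ValueError("no backends available")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

As long as the list of healthy backends does not change, the same address keeps landing on the same server. When a server drops out of the list, some clients get remapped, which is why session data still needs a plan of its own.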
What about sessions?
This is exactly why you would want that stickiness. When it comes to session data, you are going to have to find a way to keep all your servers up-to-date. Realistically, this isn't always guaranteed. How you handle it depends on your application. Can you keep a Redis instance or whatever out there for all your servers to reliably hit from around the world? Do you really need that session data in every region? Or can you have your main application servers dealing with session data in one location?
Any DNS Examples?
Post separate questions for these. Everyone's "successful setup" looks different.
What about SSL/TLS?
If you're proxying data, only your global balancers need to handle HTTPS. If you're redirecting, then all the servers need to handle it.
A Working Solution
I've had a wild ride over the past few months figuring out the whole Global-HA setup. Tonnes of fun and I've finally settled with a rig that works very well, and is nothing like the one outlined in the above question.
I still plan on writing this up in tutorial form, but time is scarce as I head into the final sprint to get my app launched early next year, so here's a quick outline of the working rig I ended up with.
Overview
I ended up moving my entire deployment to AWS. I love Digital Ocean, but the frank reality is that AWS is light years ahead of them (and everyone, really) when it comes to the services offered under one roof. My monthly expenses went up slightly, but once I was done tweaking and streamlining I ended up with a solution that costs about $75/month per region for the most basic deployment (2 instances behind an ELB). And a new region can be spun up and deployed within about 30 minutes.
Global Balancing
I quickly found out (thanks to @Brad's answer above) that trying to spin up my own global balancing DNS layer is insane. It was a hell of a lot of fun figuring out how a layer like this works, but short of getting on a plane and scraping my knuckles installing millions of dollars worth of equipment around the world, it was not going to be possible to roll my own.
When I finally figured out what I was looking for, I found my new best friend: AWS Route 53. It offers a robust DNS network with about 50-odd nodes globally and the ability to do some really cool routing tricks like location-based routing, latency-based routing (which is kinda awesome), and AWS Alias records that 'automagically' route traffic to other AWS Services you'll be using (Like ELB for load balancing).
I ended up using latency-based routing that directs the global traffic to the closest regional Elastic Load Balancer, which has an Auto-Scaling Group attached to it in any given region.
I'll leave it up to you to do your homework on the other providers: www.f5.com, www.dyn.com, www.akamai.com, www.dnsmadeeasy.com. Depending on your needs, there may be a better solution for you, but this works very well for me.
Content Delivery Network
Route 53 integrates with AWS Cloudfront very nicely. I setup an S3 bucket that I'm using to store all the static media files that my users will upload, and I've configured a Cloudfront distribution to source from my media.myapp.com S3 bucket. There are other CDN providers, so do your shopping. But Cloudfront gets pretty good reviews and it's a snap to setup.
Load Balancing & SSL Termination
I'm currently using AWS Elastic Load Balancer to balance the load across my application instances, which live in an Auto-Scaling Group. The request is first received by ELB, at which point SSL is terminated and the request is passed through to an instance in the Auto-Scaling Group.
NOTE: One giant caveat for ELB is that, somewhat ironically, it doesn't handle massive spikes very well. It can take up to 15 minutes for an ELB to trigger a scale-up event for itself, creating 500s/timeouts in the meantime. A steady, constant increase in traffic is supposedly handled quite well, but if you get hit with a spike it can fail you. If you know you're going to get hit, you can 'call ahead' and AWS will warm up your ELB for you, which is pretty ridiculous and an anti-pattern given the essence of AWS, but I imagine they're either working on it, or ignoring it because it's not really that big of a problem. You can always spin up your own HAProxy or Nginx load-balancing layer if ELB doesn't work for you.
Auto-Scaling Group
Each region has an ASG which is programmed to scale when the load passes a certain metric:
IF CPU > 90% FOR 5 MINUTES: SCALEUP
IF CPU < 70% FOR 5 MINUTES: SCALEDN
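Read literally, the two rules above amount to the logic in this small sketch. It assumes one averaged CPU sample per minute, and the "HOLD" result and function name are my own additions; the real evaluation is done by the AWS alarm machinery, not by code like this.

```python
def scaling_decision(cpu_samples, high=90.0, low=70.0, window=5):
    """Decide what to do based on per-minute CPU percentages, oldest first."""
    if len(cpu_samples) < window:
        return "HOLD"  # not enough history yet
    recent = cpu_samples[-window:]
    if all(sample > high for sample in recent):
        return "SCALEUP"
    if all(sample < low for sample in recent):
        return "SCALEDN"
    return "HOLD"
```

The gap between the two thresholds is what keeps the group from flapping between scaling up and scaling down.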
I haven't yet put the ELB/ASG combo through its paces. That's a little way down my To-Do list, but I do know that there are many others using this setup and it doesn't seem to have any major performance issues.
The config for an Auto-Scaling Group is a little convoluted in my opinion. It's actually a three-step process:
Create an AMI configured to your liking.
Create a Launch Configuration that uses the AMI you've created.
Create an Auto-Scaling Group that uses the Launch Configuration you've created to determine what AMI and instance type to launch for any given SCALEUP event.
To handle config and app deployment when any instance launches, you use the "User Data" field to input a script that will run once any given instance launches. This is possibly the worst nomenclature in the history of time. How "User Data" describes a startup script only the author knows. Anyhow, that's where you stick the script that handles all your apt-gets, mkdirs, git clones, etc.
Instances & Internal Balancing
I've also added an additional 'internal balancing layer' using Nginx that allows me to 'flat-pack' all my Node.js apps (app.myapp.com, api.myapp.com, mobile.myapp.com, www.myapp.com, etc.myapp.com) on every instance. When an instance receives a request passed to it from ELB, Nginx handles routing the request to the correct Node.js port for any given application. Sort of like a poor man's containerization. This has the added benefit that any time one of my apps needs to talk to another (like when app. needs to send a request to api.) it's done via localhost:XXXX rather than having to go out across the AWS network, or the internet itself.
This setup also maximizes usage of my resources by eliminating any idle infrastructure if the app layer it hosts happens to be receiving light traffic. It also obviates the need to have an ELB/ASG combo for every app, saving more cash.
There are no gotchas or caveats that I've run into using this sort of setup, but there is one work-around that needs to be in place with regard to health-checking (see below).
There's also a nice benefit in that all instances have an IAM role which means that your AWS creds are 'baked in' to each instance upon birth and accessible via your ENV vars. And AWS 'automagically' rotates your creds for you. Very secure, very cool.
Health Checks
If you go the route of the above setup, flat-packing all your apps on one box and running an internal load-balancer, then you need to create a little utility to handle the ELB Health Checks. What I did was create an additional app called ping.myapp.com. And then I configured my ELB Health Checks to send any health checks to the port that my ping app is running on, like so:
Ping Protocol: HTTP
Ping Port: XXXX
Ping Path: /ping
This sends all health checks to my little ping helper, which in turn hits localhost:XXXX/ping on all the apps residing on the instance. If they all return a 200 response, my ping app then returns a 200 response to the ELB health check and the instance gets to live for another 30 seconds.
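The ping helper itself is a Node.js app in my setup, but the idea is language-agnostic. Here is a hedged Python sketch of the aggregation logic; the function names are hypothetical, and returning 503 on failure is an assumption (any non-200 status would do for the ELB).

```python
import http.client
import urllib.error
import urllib.request


def app_is_healthy(port, timeout=2.0):
    """Return True if the app on localhost:<port> answers /ping with a 200."""
    url = "http://localhost:%d/ping" % port
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def aggregate_status(ports, check=app_is_healthy):
    """Status code the ping helper returns to the ELB health check."""
    return 200 if all(check(port) for port in ports) else 503
```

The check parameter is only there so the aggregation can be exercised without real apps listening on the ports.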
NOTE: Do not use Auto-Scaling Health Checks if you're using an ELB. Use the ELB health checks. It's kinda confusing, I thought they were the same thing, they're not. You have the option to enable one or the other. Go with ELB.
The Data Layer
One thing that is glaringly absent from my setup is the data layer. I use Compose.io as my managed data-layer provider and I deploy on AWS so I get very low latency between my app layers and my data layer. I've done some prelim investigation on how I would roll my data layer out globally and found that it's very complex — and very expensive — so I've kicked it down my list as a problem that doesn't yet need to be solved. Worst case is that I'll be running my data layer in US-East only and beefing up the hardware. This isn't the worst thing in the world since my API is strictly JSON data on the wire so the average response is relatively tiny. But I can see this becoming a bottleneck at very large, global scale — if I ever get there. If anyone has any input on this layer I'd love to hear what you have to say.
Ta-Da!
Global High Availability On A Beer Budget. Only took me 6 months to figure it out.
Love to hear any input or ideas from anyone that happens to read this.
You can use Anycast for your webservice for free if using Cloudflare free plan.
Digital Ocean now supports Load Balancing of servers itself. It is extremely easy to set up and works great! Saves you having to add in unnecessary components such as nginx (if you only want to use for load balancing).
We were having issues using SSL file uploads with nginx on a digital ocean server, however since the Digital Ocean update, we have removed nginx and now use Digital Ocean's load balancing feature and it works just as we need it to!
I am building a service which requires me to dynamically launch and close servers at many locations around the world, (for example using AWS). When a user visits my domain they need to be assigned to a local server with the lowest latency.
By assignment, I mean that when, for example, the client makes an ajax call to example.com/getData, it should go directly to the one particular server that it has been assigned to. Different servers will be doing different computation, so it is not sufficient to have some kind of general load balancing.
What general mechanisms/technology would allow me to 1) Assess the latency between a particular client and any server under my control? 2) Assign a particular client to a particular server? I cannot use just the IP addresses for example, since javascript has domain name based restrictions.
Thanks
Note: I do not have enough reputation to link all the technologies in the response, therefore sometimes you will see the links copied in plain text.
1) Assigning users to the local server with the lowest latency is not always possible.
Sometimes the server geographically closest to a user is unexpectedly the one with the highest latency.
Finding the lowest latency between your (running) servers and the users is not an easy task.
There might be many different hops (routers) between the client and the server, and any of them can at any time have problems, route updates, packet congestion and so on.
The quickest way to assess the latency is a ping, but firewalls may block it.
So the best way to achieve this is to use anycast.
All the major CDN providers implement this method. Some use TCP anycast, which seems to be not recommended, and others UDP anycast. It is an open debate.
Anyway, in order to implement anycast you need to be able to peer with the ISP routers, and normally this is not possible. Additionally, there are good peers and bad peers.
Finally, all this requires a deep knowledge of the routing protocols and the TCP/IP stack.
A quick and dirty solution could be to use BIND with the GEO-IP patch.
With it you can define specific DNS query responses per country.
What I mean is that, for instance, if you have a server in the UK and one in the US, you can configure BIND so that users coming from Europe are sent to the UK server and users coming from the US are sent to the US server.
2) To assign a particular client to a particular server you can use the technique I described in point 1, or you can use a proxy and sticky sessions.
HA-Proxy is a good product to achieve this. (the URL: xy.1wt.eu )
3) If you use point 1, you will not have problems with cross-domain ajax calls. In fact it is completely transparent for the client. For instance, for the same domain example.com a user coming from the US will resolve it to 1.1.1.1 whereas a user coming from Germany will resolve example.com to 2.2.2.2 (the IP addresses are fake and used just as an example).
On a side note, one solution for cross-domain ajax calls is JSON-P, though it has some drawbacks, like the lack of support for POST.
If I were you I would go with BIND and GEO-IP, because it would solve all three problems at once (apart from the latency, because it is not always true that the geographically closest server is the one with the lowest latency).
I'm about to start writing a web app (Asp.Net/IIS7) which will be accessible over the internet. It will be placed behind a firewall which accepts http and https.
The previous system which we are going to replace doesn't let the web server talk directly to a database, but rather has it make highly specialized web service calls (through a second firewall which only allows this kind of call) to a separate app server, which then goes to the DB to operate on the data.
I have worked on many systems in my day, but this is the first one which has taken security this seriously. Is this a common setup? My first thought was to use Windows Authentication in the connectionstring on the web server and have the user be a crippled DB-user (can only view and update its own data) and then allow DB access through the inner firewall as well.
Am I naïve? It seems like I will have to do a lot of mapping of data if we use the current setup for the new system.
Edit: The domain of this app is online ordering of goods (business to business). Users (businesses) log in, input what they can deliver in any given time period, view previous transaction history, view projected demand for goods, etc. No actual money is exchanged through this system, but this system provides the information on which goods are available for sale, which is data input to the ordering system.
This type of arrangement (DMZ with web server, communicating through firewall with app server, communicating through firewall with db) is very common in certain types of environment, especially in large transactional systems (online corporate banking, for example)
There are very good security reasons for doing this, the main one being that it will slow down an attack on your systems. The traditional term for it is Defence in Depth (or Defense if you are over that side of the water)
Reasonable security assumption: your webserver will be continually under attack
So you stick it in a DMZ and limit the types of connection it can make by using a firewall. You also limit the webserver to just being a web server - this reduces the number of possible attacks (the attack surface)
2nd reasonable security assumption: at some point a zero-day exploit will be found that will get to your web server and allow it to be compromised, which could lead to an attack on your user/customer database
So you have a firewall limiting the number of connections to the application server.
3rd reasonable security assumption: zero-days will be found for the app server, but the odds of finding zero-days for the web and app servers at the same time are reduced dramatically if you patch regularly.
So if the value of your data/transactions is high enough, adding that extra layer could be essential to protect yourself.
We have an app that is configured similarly. The interface layer lives on a web server in the DMZ, the DAL is on a server inside the firewall with a web service bridging the gap between them. In conjunction with this we have an authorization manager inside the firewall which exposes another web service that is used to control what users are allowed to see and do within the app. This app is one of our main client data tracking systems, and is accessible to our internal employees and outside contractors. It also deals with medical information so it falls under the HIPAA rules. So while I don't think this setup is particularly common, it is not unheard of, particularly with highly sensitive data or in situations where you have to deal with audits by a regulatory body.
Any reasonably scalable, reasonably secure, conventional web application is going to abstract the database away from the web machine using one or more service and caching tiers. SQL injection is one of the leading vectors for penetration/hacking/cracking, and databases often tend to be one of the more complex, expensive pieces of the overall architecture/TCO. Using service tiers allows you to move logic out of the DB, to employ out-of-process caching, to shield the DB from injection attempts, etc. You get better, cheaper, more secure performance this way. It also allows for greater flexibility when it comes to upgrades, redundancy or maintenance.
Configuring the user's access rights seems like a more robust solution to me. Also your DataAccess layer should have some security built in, too. Adding this additional layer could end up being a performance hit but it really depends on what mechanism you're using to move data from "WebServer1" to "WebServer2." Without more specific information in that regard, it's not possible to give a more solid answer.
I'm a Java coder and not very familiar with how networks work (other than basic UDP/TCP connections)
Say I have servers running on machines in the US, Asia, Latin America and Europe. When a user requests a service, I want their request to go to the server closest to them.
Is it possible for me to have one address: mycompany.com, and somehow get requests routed to the appropriate server? Apparently when someone goes to cnn.com, they receive the pictures, videos, etc. from a server close to them. Frankly, I don't see how that works.
By the way, my servers don't serve web pages, they serve other services such as stock market data....just in case that is relevant.
Since I'm a programmer, I'm interested to know how one would do it in software. Since this is little more than an idle curiosity, pointers to commercial products or services won't be very helpful in understanding this problem :)
One simple approach would be to look at the first byte (Class A) of the IP address coming into the UDP DNS request and then based off that you could deliver the right geo-located IP.
Another approach would be a little more complicated. Instead of using the server that is geographically closest to the user, you could use the server that has the lowest latency for that user.
The lower latency will provide faster transfer speeds while being easier to calculate than geographic location.
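Since you asked how one would do it in software, here is a minimal sketch of the latency approach, assuming the measurement runs on the client side and uses TCP connect time as a stand-in for latency. The function names and the host names in the usage note are hypothetical.

```python
import socket
import time


def connect_latency(host, port, timeout=1.0):
    """Seconds needed to open a TCP connection, or infinity if it fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return float("inf")


def fastest_server(servers, measure=connect_latency):
    """Return the (host, port) pair with the lowest measured latency."""
    if not servers:
        raise ValueError("no servers to choose from")
    return min(servers, key=lambda server: measure(*server))
```

A client could call fastest_server([("us.example.com", 443), ("eu.example.com", 443)]) once at startup and then stick with the winner. A single sample is noisy, so in practice you would probably want to take several measurements per server.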
For a much more detailed look, check out this article on CDNs (pay attention to the Technology Section):
Content Delivery Network - Wikipedia
These are the kinds of networks that the large sites use to distribute their content over the net (Akamai is a popular example). As you can see, things can get pretty complicated pretty quickly with CDNs having their own proprietary protocols, etc...
Update: I didn't see the disclaimer about commercial solutions at the end of the original post. I'll leave this up for those who may find it of interest.
--
Take a look at http://ultradns.com/. A managed DNS service like that may be just what you need to accomplish what you are looking for.
Amazon.com, Forbes.com, Oracle, all use them...
Quote From http://ultradns.com/solutions/traffic.html:
UltraDNS Traffic Management solution provides a set of tools allowing IT administrators to define load balancing configurations for content servers residing in one or more geographic locations. The Traffic Management Solution manages traffic directed to the servers by dynamically changing the responses to DNS requests. Load balancing is performed based on dynamic metrics obtained from the host servers on a continual monitoring basis. The UltraDNS Traffic Management solution is not a single application, but combines the capabilities of several existing UltraDNS systems to control traffic, manage site failures, and optimize web content systems.
One approach is, as Jeff mentioned, using the IP address: http://en.wikipedia.org/wiki/Geolocation_software
In my experience, this is precise to the nearest relatively large city (in the US at least). There are several open databases to aid in this (see the wiki link). Then you can generate image tags and download links and such based on this information.
As for locating the nearest server, I'm sure you can think of a few ways to do it. For instance, if the best result you can get is a major city, you can look up that city in a list of latitudes/longitudes and calculate the nearest server based on that.
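One way to do that last calculation is the haversine (great-circle) distance. The sketch below assumes you already have the user's approximate latitude/longitude from a geolocation database; the server list and its rounded coordinates are just an example.

```python
import math

# Example server locations as (latitude, longitude), rounded.
SERVERS = {
    "new-york": (40.71, -74.01),
    "london": (51.51, -0.13),
    "singapore": (1.35, 103.82),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, a)))


def nearest_server(lat, lon, servers=SERVERS):
    """Name of the server geographically closest to the given point."""
    return min(servers, key=lambda name: haversine_km(lat, lon, *servers[name]))
```

Keep in mind that geographically nearest is not always the same as lowest latency, so this is a first approximation only.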