Software Routing - networking

"Commercial software routers from companies such as Vyatta can typically only attain transfer data at speeds of up to three gigabits per second. That isn’t fast enough to take advantage of the full speed of a typical network card, which operates at 10 gigabits per second." [1]
How is the speed of the network interface card relevant in this scenario? Aren't software routers connecting multiple Virtual Machines running on the same physical host? [2] Unless a PC has multiple network interface cards, it is unlikely that it functions as a packet switch between different physical hosts.
My interpretation is that there are two different kinds of software routing: (1) embedding a real-time operating system on an actual router, and (2) writing application-layer code on a PC that can handle packets transmitted between different virtual machines running on that very PC. Is this correct?

It depends on what your router is doing. If it's literally just looking at a static route table and forwarding packets out another interface, there isn't much hit in performance.
It's when you get into things like NAT, crypto, QoS, SPI... that you will see performance degradation. Hardware vendors usually use custom silicon to process the more advanced features, which allows for higher-throughput packet forwarding.
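To make the cheap case concrete, here is a rough sketch (Go, with a made-up three-entry table; real routers use much faster lookup structures such as tries or TCAMs) of what a static longest-prefix-match lookup amounts to:
package main

import (
	"fmt"
	"net"
)

// routeEntry is a hypothetical static route: destination prefix -> egress interface.
type routeEntry struct {
	prefix *net.IPNet
	iface  string
}

// lookup returns the egress interface with the longest matching prefix.
func lookup(table []routeEntry, dst net.IP) string {
	best := "drop"
	bestLen := -1
	for _, r := range table {
		if r.prefix.Contains(dst) {
			if l, _ := r.prefix.Mask.Size(); l > bestLen {
				best, bestLen = r.iface, l
			}
		}
	}
	return best
}

func main() {
	table := []routeEntry{}
	for _, e := range []struct{ cidr, iface string }{
		{"10.0.0.0/8", "eth1"},
		{"10.1.0.0/16", "eth2"},
		{"0.0.0.0/0", "eth0"}, // default route
	} {
		_, n, _ := net.ParseCIDR(e.cidr)
		table = append(table, routeEntry{n, e.iface})
	}
	fmt.Println(lookup(table, net.ParseIP("10.1.2.3"))) // eth2 (longest match wins)
}
Everything beyond this - rewriting headers for NAT, encrypting payloads, classifying for QoS - adds per-packet work, which is exactly what the custom silicon offloads.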
Now that merchant silicon is fast enough and the open source applications are getting better, the performance gap is closing.
It really depends on your use case as far as what you want to use. I've gone with both and not seen performance hits, but the software versions weren't handling high throughput workloads.

Performance of the link from the virtual network to the physical eventually becomes important at any reasonable scale. You're right that, within the same physical host, things can be pretty quick, but that requires that one can get everything needed in one box.
While merchant silicon has come a long way in improving the performance of networking equipment, greater gains are coming from getting CPUs to handle networking tasks better. Both AMD and Intel have improved their architectures to the point where 10 Gbps forwarding is a reality. Intel has developed a specialized library (see the DPDK wiki page) that takes care of a lot of low-level networking functions at high performance.

Related

Building a small 4 node cluster - few quick questions about networking

I'm putting together a small 4 node cluster on which I'm going to be running Storm. I have a few questions about the networking side of things. First of all, the computers are equipped with gigabit Ethernet; however, the hub that I currently have only goes up to 100 megabits. Should I upgrade my hub? Or will the performance gain be negligible? Second, I read on a few sites that a hub is not the best piece of hardware to use and that a switch would be better for my purposes. I'm trying to use Storm to have one machine pull data down from the internet and then pass it off to the others for processing. Would a switch or a hub be more useful? Thanks for all your help folks.
A router can allow for serious networking capabilities; it's also oftentimes overkill. With only 4 machines you're probably much more likely to want a gigabit switch instead, sold in stores oftentimes under the name "Gigabit Router" -- which is technically a lie, as it's usually a bridge (hub or switch; networking has a lot of overloaded names). Routers are many times more expensive than switches, which helps if you have difficulty telling the two apart from marketing names alone. A hub, on the other hand, is oftentimes a dumb switch with fewer capabilities (and sometimes speed penalties in high-data-flow situations).
The question as to whether you need to upgrade depends on where your bottleneck is. Is the data you're sending large? Do your cluster computers spend a lot of time computing instead of receiving data? First determine whether your network speed will be your bottleneck, then decide if you should upgrade that bottleneck. If you're worried about network speed but aren't 100% sure it will be a bottleneck, a cheap gigabit switch won't cost you much and will almost certainly meet your needs.
Also note that if your data first needs to come over the internet (i.e., it isn't generated on your side of the network), your bottleneck will almost certainly be your internet connection before your local network.
So essentially, profile your problem before making a choice.

Flow based routing and openflow

This may not be the typical stackoverflow question.
A colleague of mine has been speculating that flow-based routing is going to be the next big thing in networking. OpenFlow provides the technology to use low-cost switches in large applications, IT data centers, etc., replacing Cisco, HP, etc. switches and routers. The theory is that you can create a hierarchy of these OpenFlow switches with simple configuration, e.g. no spanning tree. OpenFlow will route each flow to the appropriate switch/switch-port, using only the knowledge of the hierarchy of switches (no routers). The solution is supposed to save enterprises money and simplify networking.
Q. He is speculating that this may dramatically change enterprise networking. For many reasons, I am skeptical. I would like to hear your thoughts.
OpenFlow is a research project from Stanford University led by professor Nick McKeown. In the original OpenFlow research paper, the goal of OpenFlow was to give researchers a way "to run experimental protocols in the networks they use every day." For years networking researchers have had an almost impossible task deploying and evaluating their ideas on real networks with real Ethernet switches and IP routers. The difficulty is that real switches and routers from companies like Cisco, HP, and others are all closed, proprietary boxes that implement standard "protocols", like Ethernet spanning tree and OSPF. There are business reasons why Cisco and HP won't let you run software on their switches and routers; there is no technical reason. OpenFlow was invented to solve a people problem: if Cisco is not willing to let you run code on their switch, maybe they can at least provide a very narrow interface to let you remotely configure their switch, and that narrow interface is called OpenFlow.
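For a feel of the abstraction involved, here is a loose sketch of the match-plus-action idea behind a flow table; this is not the actual OpenFlow wire protocol or any controller API, and the field names and actions are made up:
package main

import "fmt"

// flowMatch is a simplified match on a few header fields ("" or 0 means wildcard).
type flowMatch struct {
	srcIP, dstIP string
	dstPort      uint16
}

// flowEntry pairs a match with an action, e.g. "output:3" or "drop".
type flowEntry struct {
	match  flowMatch
	action string
}

type packet struct {
	srcIP, dstIP string
	dstPort      uint16
}

// apply returns the action of the first matching entry, or a default.
func apply(table []flowEntry, p packet) string {
	for _, e := range table {
		if (e.match.srcIP == "" || e.match.srcIP == p.srcIP) &&
			(e.match.dstIP == "" || e.match.dstIP == p.dstIP) &&
			(e.match.dstPort == 0 || e.match.dstPort == p.dstPort) {
			return e.action
		}
	}
	return "send-to-controller" // unmatched packets are punted to the controller
}

func main() {
	// Entries like these would be installed remotely by a central controller.
	table := []flowEntry{
		{flowMatch{dstIP: "10.0.0.5", dstPort: 80}, "output:3"},
		{flowMatch{dstIP: "10.0.0.6"}, "drop"},
	}
	fmt.Println(apply(table, packet{"10.0.0.9", "10.0.0.5", 80})) // output:3
}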
To my knowledge more than a dozen companies are currently implementing OpenFlow support for their switches. Some like HP are only providing the OpenFlow software for research purposes. Others like NEC are actually offering commercial support.
For academic researchers that want to evaluate new routing protocols in real networks, OpenFlow is a huge win. For switch vendors, it is less clear if OpenFlow support will help, hurt, or have no effect in the long run. After all, the academic research market is very small.
The reason why OpenFlow is most often discussed in the context of enterprise networks is that OpenFlow grew out of a previous research project called Ethane that used OpenFlow's mechanism of remotely programming switches in an enterprise network in order to centralize a security policy. Ethane, and by extension OpenFlow, has led directly to two startup companies: Nicira, founded by Martin Casado, and Big Switch Networks, founded by Guido Appenzeller. It would be easier to implement an Ethane-like system if all of the switches in the network supported OpenFlow.
Closely related to enterprise networks are data center networks, the networks that interconnect thousands to tens of thousands of servers in companies such as Google, Facebook, Microsoft, Amazon.com, and Yahoo!. One problem with Ethernet is that it does not scale to this many servers on the same Layer 2 network. We attempted to solve this problem in a research project called PortLand. We used OpenFlow to facilitate programming the switches from a central controller, which we called a Fabric Manager. We released the PortLand source code as open source.
However, we also found a limitation to OpenFlow's functionality. In another data center networking research project called Helios, we were not able to use OpenFlow because it did not provide a mechanism for bonding multiple switch ports into a Link Aggregation Group (LAG). Presumably one could extend the OpenFlow specification indefinitely until all possible switch features become exposed.
There are other networks as well such as the Internet access networks, Internet backbones, home networks, wireless networks, cellular networks, etc. Researchers are trying to see where OpenFlow fits into all of these markets. What it really comes down to is the question, "what problem does OpenFlow solve?" Ethane makes a case for enterprise networks but I have not yet seen a compelling case for any other type of network. OpenFlow might be the next big thing, or it might end up being a case of "don't solve a people problem with a technical solution."
In order to assess the future of flow-based networking and OpenFlow, here’s the way to think about it.
It starts with the silicon trends: Moore’s Law (2X transistors per 18-24 months), and a correlated but slower improvement in the I/O bandwidth available on a single chip (roughly 2X every 30-36 months). You can now buy full-featured 10GbE single chip switches with 64 ports, and chips which have a mix of 40GbE and 10GbE ports with comparable total I/O bandwidth.
There are a variety of ways to physically connect these in a mesh (ignoring the loop-free constraints of spanning tree and the way Ethernet learns MAC addresses). In the high performance computing (HPC) world, a lot of work has been done building clusters with InfiniBand and other protocols using meshes of small switches to network the compute servers. This is now being applied to Ethernet meshes. The geometry of a CLOS or fat-tree topology enables a two-stage mesh with a large number of ports. The math is thus: where n is the number of ports per chip, the number of devices you can connect in a two-stage mesh is n²/2, and the number you can connect in a three-stage mesh is n³/4. While with standard spanning tree and learning, the spanning tree protocol will disable the multi-path links to the second stage, most of the Ethernet switch vendors have some sort of multi-chassis link aggregation protocol which gets around the multi-pathing limitation. There is also standards work in this area. Although it might not be obvious, the vast majority of link aggregation schemes allocate traffic so that all the frames of any given flow take the same path. This is done in order to minimize out-of-order frames so they don't get dropped by some higher level protocol. They could have chosen to call this "flow based multiplexing" but instead they call it "link aggregation".
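As a rough worked illustration of the arithmetic above and of flow-pinned link aggregation (the hash and the 5-tuple string here are made up; real switches hash header fields in silicon):
package main

import (
	"fmt"
	"hash/fnv"
)

func main() {
	// Host capacity of two- and three-stage meshes built from n-port switches.
	n := 64
	fmt.Println("two-stage mesh:", n*n/2)     // 64²/2 = 2048 hosts
	fmt.Println("three-stage mesh:", n*n*n/4) // 64³/4 = 65536 hosts

	// Link aggregation keeps a flow on one path by hashing the flow's
	// identifying fields and using the result to pick a member link.
	links := 4
	flow := "10.0.0.1:12345->10.0.0.2:80/tcp" // hypothetical 5-tuple as a string
	h := fnv.New32a()
	h.Write([]byte(flow))
	fmt.Println("flow pinned to link:", h.Sum32()%uint32(links))
}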
Although the devil is in the details, there are a variety of data center operators and vendors that have concluded they don’t need to have large multi-slot chassis switches in the aggregation/core layer for server connect, instead using meshes of inexpensive 1U or 2U switches.
People have also concluded that eventually you need some kind of management station to set up the configuration of all the switches. Again, drawing from the experience with HPC and InfiniBand, they use what is called an InfiniBand Controller. In the telecom world, most telecom networks have evolved to separate the management and part of the control plane from the boxes that carry the data traffic.
Summarizing the points above, meshes of Ethernet switches with an external management plane with multipath traffic where flows are kept in order is evolutionary, not revolutionary, and is likely to become mainstream. At least one major company, Juniper, has made a big public statement about their endorsement of this approach. I'd call all of these "flow-based routing".
Juniper and other vendors’ proprietary approaches notwithstanding, this is an area that cries out for standards. The Open Networking Foundation (ONF) was founded to promote standards in this area, starting with OpenFlow. Within a couple of months, the sixty-plus members of the ONF will be celebrating their first anniversary. Each member has, I am led to believe, paid tens of thousands of dollars to join. While the OpenFlow protocol has a ways to go before it is widely adopted, it has real momentum.
#Nathan: OpenFlow 1.1 actually adds some primitives that enable the use of multiple links via the Multipath Proposal.
An excellent view of OpenFlow by Simon Crosby
http://community.citrix.com/display/ocb/2011/03/21/The+Rise+of+the+Software+Defined+Network
More context on SDN, which discusses the IETF's SDN initiative and the ONF's OpenFlow. Working in conjunction, they are a powerful combination: http://bit.ly/A8xYso
Nathan, Excellent historical account and overview of openflow. Thanks!
You've hit on the points that I've been wrapping my head around as to why OpenFlow might not be widely adopted. Since it was designed to be open, to allow researchers the ability to run experimental protocols, and not necessarily to be "compatible with" the big players (Cisco/HP/etc.), it puts itself into a niche (although potentially big) market; more on this later. And as you've stated, it has received some adoption in the "cloud data centers" (CDCs), e.g. Google, Facebook, etc., because they need to exploit experimental protocols to gain a competitive advantage or optimize for their application.
As you've stated, some switch vendors have added OpenFlow capability to capitalize on the niche need in academia and potentially to sell into the CDCs (Google, Facebook). This is potentially a big market (or a bubble, if you're pessimistic).
The problem that I see is that the majority of the market (80% or more) is enterprise IT data centers. The requirement here is for stable, compatible networking. Open and less expensive would be nice, but not at the cost of the former.
One could imagine a day when corporate IT is partially or completely cloud-sourced and QoS is maintained by the cloud provider. In that case, experimental protocols could be leveraged to provide a competitive advantage in speed or QoS, and OpenFlow could play a more dominant role. I personally think this scenario is many years off.
So, the conclusion I come to is that other than in research and perhaps the CDCs (Google, Facebook), the market is pretty small. I suppose that if researchers use OpenFlow to come up with a better protocol for, say, link aggregation or congestion management, then eventually Cisco and HP will provide those in their standard offerings because their customers will demand it. So OpenFlow could be a market influencer (via the research community), but it would not be a market disruptor.
Do you agree with my conclusions? Thanks for your input.

How many socket connections can a web server handle?

Say I was to get shared, virtual, or dedicated hosting. I read somewhere that a server/machine can only handle 64,000 TCP connections at one time; is this true? How many could any type of hosting handle regardless of bandwidth? I'm assuming HTTP works over TCP.
Would this mean only 64,000 users could connect to the website, and if I wanted to serve more I'd have to move to a web farm?
In short:
You should be able to achieve on the order of millions of simultaneous active TCP connections, and by extension HTTP requests. This tells you the maximum performance you can expect with the right platform and the right configuration.
Today, I was wondering whether IIS with ASP.NET would support on the order of 100 concurrent connections (look at my update; expect ~10k responses per second on older ASP.NET Mono versions). When I saw this question and its answers, I couldn't resist answering myself, as many of the answers here are completely incorrect.
Best Case
The answer to this question must only concern itself with the simplest server configuration to decouple from the countless variables and configurations possible downstream.
So consider the following scenario for my answer:
No traffic on the TCP sessions, except for keep-alive packets (otherwise you would obviously need a corresponding amount of network bandwidth and other computer resources)
Software designed to use asynchronous sockets and programming, rather than a hardware thread per request from a pool (i.e. IIS, Node.js, Nginx... a web server [but not Apache] with async-designed application software)
Good performance/dollar CPU / Ram. Today, arbitrarily, let's say i7 (4 core) with 8GB of RAM.
A good firewall/router to match.
No virtual limit/governor - ie. Linux somaxconn, IIS web.config...
No dependency on other slower hardware - no reading from harddisk, because it would be the lowest common denominator and bottleneck, not network IO.
Detailed Answer
Synchronous thread-bound designs tend to be the worst performing relative to Asynchronous IO implementations.
WhatsApp can handle a million connections WITH traffic on a single Unix-flavoured OS machine - https://blog.whatsapp.com/index.php/2012/01/1-million-is-so-2011/.
And finally, this one, http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html, goes into a lot of detail, exploring how even 10 million could be achieved. Servers often have hardware TCP offload engines, ASICs designed for this specific role more efficiently than a general purpose CPU.
Good software design choices
Asynchronous IO design will differ across operating systems and programming platforms. Node.js was designed with asynchrony in mind. You should use Promises at least, and when ECMAScript 7 comes along, async/await. C#/.NET already has full asynchronous support, like Node.js. Whatever the OS and platform, asynchronous should be expected to perform very well. And whatever language you choose, look for the keyword "asynchronous"; most modern languages will have some support, even if it's an add-on of some sort.
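As one hedged illustration, in Go rather than any of the platforms named above (Go's runtime multiplexes goroutines over epoll/kqueue, so each idle connection is cheap), a minimal server that can hold a large number of mostly idle keep-alive connections looks roughly like this:
package main

import (
	"bufio"
	"log"
	"net"
)

func handle(conn net.Conn) {
	defer conn.Close()
	r := bufio.NewReader(conn)
	for {
		// Echo lines back; an idle connection costs little more than
		// a goroutine stack (a few KB) plus kernel socket buffers.
		line, err := r.ReadString('\n')
		if err != nil {
			return
		}
		if _, err := conn.Write([]byte(line)); err != nil {
			return
		}
	}
}

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go handle(conn) // one lightweight goroutine per connection
	}
}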
To WebFarm?
Whatever the limit is for your particular situation, yes a web-farm is one good solution to scaling. There are many architectures for achieving this. One is using a load balancer (hosting providers can offer these, but even these have a limit, along with bandwidth ceiling), but I don't favour this option. For Single Page Applications with long-running connections, I prefer to instead have an open list of servers which the client application will choose from randomly at startup and reuse over the lifetime of the application. This removes the single point of failure (load balancer) and enables scaling through multiple data centres and therefore much more bandwidth.
Busting a myth - 64K ports
To address the question component regarding "64,000", this is a misconception. A server can connect to many more than 65535 clients. See https://networkengineering.stackexchange.com/questions/48283/is-a-tcp-server-limited-to-65535-clients/48284
By the way, Http.sys on Windows permits multiple applications to share the same server port under the HTTP URL schema. They each register a separate domain binding, but there is ultimately a single server application proxying the requests to the correct applications.
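The same port-sharing idea is easy to reproduce in other stacks; for instance, here is a rough Go sketch that routes by Host header so two hypothetical domain bindings share one listening port (this uses net/http's ServeMux, not Http.sys):
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	// net/http patterns may begin with a host name, so two domain bindings
	// can share the single listener on port 8080.
	mux.HandleFunc("api.example.com/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "api application")
	})
	mux.HandleFunc("www.example.com/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "web application")
	})
	log.Fatal(http.ListenAndServe(":8080", mux))
}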
Update 2019-05-30
Here is an up to date comparison of the fastest HTTP libraries - https://www.techempower.com/benchmarks/#section=data-r16&hw=ph&test=plaintext
Test date: 2018-06-06
Hardware used: Dell R440 Xeon Gold + 10 GbE
The leader has ~7M plaintext responses per second (responses, not connections)
The second one, fasthttp for Go, advertises 1.5M concurrent connections - see https://github.com/valyala/fasthttp
The leading languages are Rust, Go, C++, Java, C, and even C# ranks at 11 (6.9M per second). Scala and Clojure rank further down. Python ranks at 29th at 2.7M per second.
At the bottom of the list, I note laravel and cakephp, rails, aspnet-mono-ngx, symfony, and zend, all below 10k per second. Note that most of these frameworks are built for dynamic pages and are quite old; there may be newer variants that feature higher up in the list.
Remember this is HTTP plaintext, not the WebSocket specialty: many people coming here will likely be interested in concurrent connections for WebSocket.
This question is a fairly difficult one. There is no real software limitation on the number of active connections a machine can have, though some OS's are more limited than others. The problem becomes one of resources. For example, let's say a single machine wants to support 64,000 simultaneous connections. If the server uses 1MB of RAM per connection, it would need 64GB of RAM. If each client needs to read a file, the disk or storage array access load becomes much larger than those devices can handle. If a server needs to fork one process per connection then the OS will spend the majority of its time context switching or starving processes for CPU time.
The C10K problem page has a very good discussion of this issue.
To add my two cents to the conversation: on Linux-type systems, /proc/sys/net/core/somaxconn caps the listen() backlog (the queue of completed connections waiting to be accepted), not the total number of sockets a process can have open; that total is bounded by the per-process file descriptor limit (ulimit -n) and the system-wide fs.file-max. You can inspect the backlog cap with
cat /proc/sys/net/core/somaxconn
and it can be modified on the fly (only by the root user, of course):
echo 1024 > /proc/sys/net/core/somaxconn
But the real number of sockets that can be connected before the system falls over depends entirely on the server process, the hardware of the machine, and the network.
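As a hedged, Linux-specific illustration of the limit that actually bounds open sockets per process, here is a small Go sketch that reads and raises the file-descriptor limit (raising the soft limit beyond the hard limit still requires privileges):
package main

import (
	"fmt"
	"syscall"
)

func main() {
	var rl syscall.Rlimit
	// Read the current per-process file-descriptor limit (each socket uses one descriptor).
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		panic(err)
	}
	fmt.Printf("soft limit: %d, hard limit: %d\n", rl.Cur, rl.Max)

	// Raise the soft limit up to the hard limit; going beyond the hard
	// limit requires root (or a change to /etc/security/limits.conf).
	rl.Cur = rl.Max
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		panic(err)
	}
	fmt.Printf("soft limit raised to %d\n", rl.Cur)
}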
It looks like the answer is at least 12 million if you have a beefy server, your server software is optimized for it, and you have enough clients. If you test from one client to one server, the number of port numbers on the client will be one of the obvious resource limits (each TCP connection is defined by the unique combination of IP address and port number at the source and destination).
(You need to run multiple clients as otherwise you hit the 64K limit on port numbers first)
When it comes down to it, this is a classic example of the witticism that "the difference between theory and practise is much larger in practise than in theory". In practise, achieving the higher numbers seems to be a cycle of: a. propose specific configuration/architecture/code changes; b. test until you hit a limit; c. am I finished? If not, d. work out what the limiting factor was; e. go back to step a (rinse and repeat).
Here is an example with 2 million TCP connections onto a beefy box (128GB RAM and 40 cores) running Phoenix http://www.phoenixframework.org/blog/the-road-to-2-million-websocket-connections - they ended up needing 50 or so reasonably significant servers just to provide the client load (their initial smaller clients maxed out too early, e.g. "maxed our 4core/15gb box @ 450k clients").
Here is another reference, for Go this time, at 10 million: http://goroutines.com/10m.
This appears to be java based and 12 million connections: https://mrotaru.wordpress.com/2013/06/20/12-million-concurrent-connections-with-migratorydata-websocket-server/
Note that HTTP doesn't typically keep TCP connections open for any longer than it takes to transmit the page to the client; and it usually takes much more time for the user to read a web page than it takes to download the page... while the user is viewing the page, he adds no load to the server at all.
So the number of people that can be simultaneously viewing your web site is much larger than the number of TCP connections that it can simultaneously serve.
In the case of the IPv4 protocol, a server with one IP address that listens on one port only can handle 2^32 IP addresses x 2^16 ports, so 2^48 unique sockets. If you speak about a server as a physical machine and you are able to utilize all 2^16 ports, then there could be a maximum of 2^48 x 2^16 = 2^64 unique TCP/IP sockets for one IP address. Please note that some ports are reserved for the OS, so this number will be lower. To sum up:
1 IP and 1 port --> 2^48 sockets
1 IP and all ports --> 2^64 sockets
all unique IPv4 sockets in the universe --> 2^96 sockets
There are two different discussions here: One is how many people can connect to your server. This one has been answered adequately by others, so I won't go into that.
The other is how many ports your server can listen on. I believe this is where the 64K number came from. Actually, the TCP protocol uses a 16-bit identifier for a port, which translates to 65536 (a bit more than 64K). This means that you can have that many different "listeners" on the server per IP address.
I think the number of concurrent socket connections one web server can handle largely depends on the amount of resources each connection consumes and the amount of total resources available on the server, barring any other web-server resource-limiting configuration.
To illustrate, if every socket connection consumed 1MB of server resources and the server has 16GB of RAM available, then (theoretically) it would only be able to handle 16GB / 1MB = roughly 16,000 concurrent connections. I think it's as simple as that... REALLY!
So regardless of how the web server handles connections, every connection will ultimately consume some resource.

Lots of ports with little data, or one port with lots of data?

I've been checking out using a system called ROS (http://www.ros.org) for some work.
There are lots of different types of data that get sent between network nodes in ROS.
You define a struct of data that you want to send in a message, and ROS will handle opening a specific port between the two nodes that will only send that struct of data.
So if there are 5 different messages, there will be 5 different ports.
As opposed to this scenario, I have seen other platforms that just push all the different messages across one port. This means that there needs to be a sort of multiplexing/demultiplexing (done by some sort of message parsing on the receiver's end).
What I wonder is... which is better from a performance perspective?
Do operating systems switch based on ports quickly, so that a system like ROS doesn't have to do too much work to work out what is in the message and interpret it?
OR
Is opening lots of ports going to mean lots of slower kernel calls, so that the cost of working out and translating message types on one port ends up being less than the time spent switching between ports?
When this scales to a large amount of data at high rates and lots of different message types, there will be lots of ports. So I imagine that when scaling each of these topologies, performance will be a big factor in selecting the way to work.
I should also point out that these nodes usually exist on one small network, or most of the time on the one machine, in which case networking is used as a form of inter-process communication. So the transmission time is only a very small factor in the overall system timing.
ROS, being an architecture for robots, may have one node for every sensor and actuator, so depending on the complexity of your system we may be talking about 20-30 nodes pushing small-ish (100 bytes or so) data at 10-100 Hz.
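For concreteness about the single-port option described above: it usually comes down to framing, where each message carries a small type tag (and often a length) and the receiver switches on the tag. A rough sketch of that idea follows (made-up tags and layout, not ROS's actual wire format):
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

const (
	msgPose  = 1 // hypothetical message type tags
	msgImage = 2
)

// encode prefixes the payload with a 1-byte type tag and a 4-byte length.
func encode(tag byte, payload []byte) []byte {
	buf := &bytes.Buffer{}
	buf.WriteByte(tag)
	binary.Write(buf, binary.BigEndian, uint32(len(payload)))
	buf.Write(payload)
	return buf.Bytes()
}

// decode reads one framed message and dispatches on its tag.
func decode(frame []byte) {
	tag := frame[0]
	length := binary.BigEndian.Uint32(frame[1:5])
	payload := frame[5 : 5+length]
	switch tag {
	case msgPose:
		fmt.Println("pose message:", string(payload))
	case msgImage:
		fmt.Println("image message:", len(payload), "bytes")
	}
}

func main() {
	decode(encode(msgPose, []byte("x=1.0 y=2.0")))
}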
It depends. I do not know the specifics of ROS, but in networking it comes down to the following constraints:
Distance: the speed of light is fast, but over a distance it starts making a difference
Protocol overhead: connection-oriented vs. connection-less
On the OS side, maintaining a list of free ports isn't much of an overhead - of course there is a cost to it, but everything is relative: if you are talking about a distributed system with long-distance links, then it is easy to argue that cycling through OS network ports ranks as a lower concern compared to managing communication quality.
Without a more specific question, I'll stop here.
I don't have any data on this, but it seems plausible that multiple ports might be handled more efficiently by multi-core systems, as opposed to demultiplexing within the program.

How Many Network Connections Can a Computer Support?

When writing a custom server, what are the best practices or techniques to determine maximum number of users that can connect to the server at any given time?
I would assume that the capabilities of the computer hardware, network capacity, and server protocol would all be important factors.
Also, do you think it is a good practice to limit the number of network connections to a certain maximum number of users? Or should the server not limit the number of network connections and let performance degrade until the response time is extremely high?
Dan Kegel put together a summary of techniques for handling large amounts of network connections from a single server, here: http://www.kegel.com/c10k.html
In general modern servers can handle very large numbers of concurrent connections. I've worked on systems having over 8,000 concurrently open TCP/IP sockets.
You will need a high quality servicing interface to handle that kind of load, check out libevent or libev.
That is a good question and it definitely is situational. What is your computer? Do you have a 4-socket machine filled with quad-core Xeons, 128 GB of RAM, and Fibre Channel connectivity (like the pair of Dell R900s we just bought)? Or are you running on a P3 550 with 256 MB of RAM and a 56K modem? How much load does each connection place on your server? What kind of response is acceptable?
These are the questions you need to answer. I guess the best way to find the answer is through load testing. Create a unit test of the expected (and maybe some unexpected) paths that your code will perform against your server. Find a load testing framework that will allow you to simulate 10, 100, 1000, 10000 users performing those tasks at the same time.
That will tell you how many connections your computer can support.
The great thing about the load/unit test scenario is that you can put in response time expectations in your unit tests and increase the load until you fall outside of your response time. If you have a requirement of supporting X number of Users with Y second response, you will be able to demonstrate it with your load tests.
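If you don't want to reach for a full framework straight away, the idea can be sketched in a few lines; here is a rough Go version (hypothetical URL, a fixed number of simulated users, crude mean-latency measurement):
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

func main() {
	const users = 100                    // simulated concurrent users
	const url = "http://localhost:8080/" // hypothetical endpoint under test

	var wg sync.WaitGroup
	latencies := make(chan time.Duration, users)

	for i := 0; i < users; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			start := time.Now()
			resp, err := http.Get(url)
			if err != nil {
				return // count only successful requests
			}
			resp.Body.Close()
			latencies <- time.Since(start)
		}()
	}
	wg.Wait()
	close(latencies)

	var total time.Duration
	var count int
	for l := range latencies {
		total += l
		count++
	}
	if count > 0 {
		fmt.Printf("%d/%d requests succeeded, mean latency %v\n", count, users, total/time.Duration(count))
	}
}
Increase the user count until the mean latency exceeds your response-time requirement; that crossing point is your practical connection limit for this workload.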
One of the biggest setbacks in high-concurrency connections is actually the routers involved. Home-user-oriented routers usually have a small NAT table, preventing the router from actually delivering the connections to the server.
Be sure to research your router/network infrastructure setup just as well.
I think you shouldn't limit the number of connections your server will allow - just catch and handle properly any exceptions that might occur when accepting and closing connections and you should be fine. You should leave that kind of lower level programming to the underlying OS layers - that way you can port your server easier etc.
This really depends on your operating system.
Different Unix flavors will support an "unlimited" number of file handles/sockets; others have high values like 32768.
A typical user limit is 8192, but it can usually be set higher.
I think Windows is more limiting, but the server version may have higher limits.
