Are there any distributed systems that completely use UDP instead of TCP? - networking

I am trying to find some use cases of distributed systems (especially computing systems such as MLSys or graph computing systems) that rely completely on UDP as the communication primitive. But it seems everyone is using TCP. So I am curious whether there are any purely UDP-based distributed systems (of course, the application level then has to do more work to track packet loss, handle retransmission, etc.).
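For illustration, the kind of application-level reliability such a system has to add on top of UDP can be as simple as a stop-and-wait sender. Here is a minimal sketch in Python; the peer address, timeout, and framing are made up for the example:

```python
# Minimal sketch of application-level reliability over UDP: a stop-and-wait
# sender that retransmits until the receiver acknowledges each sequence number.
# Host/port values and the 4-byte sequence-number framing are placeholders.
import socket

PEER = ("127.0.0.1", 9000)   # hypothetical receiver address
TIMEOUT_S = 0.2              # retransmission timeout
MAX_RETRIES = 10

def send_reliable(sock: socket.socket, seq: int, payload: bytes) -> None:
    """Send one datagram and wait for an ACK carrying the same sequence number."""
    packet = seq.to_bytes(4, "big") + payload
    for _ in range(MAX_RETRIES):
        sock.sendto(packet, PEER)
        try:
            ack, _ = sock.recvfrom(64)
            if int.from_bytes(ack[:4], "big") == seq:
                return                      # acknowledged, done
        except socket.timeout:
            continue                        # lost packet or lost ACK: retransmit
    raise TimeoutError(f"no ACK for seq {seq}")

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(TIMEOUT_S)
    send_reliable(s, 1, b"hello over UDP")
```

Real systems replace stop-and-wait with windowing, but the bookkeeping (sequence numbers, timers, retransmission) is the part TCP would otherwise give you for free.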

Related

What is a higher reliability protocol than TCP?

I know that TCP has a 16-bit checksum, to catch errors in transmission. So what TCP outputs on the other end is theoretically reliable... to a point.
This article suggests that TCP is not as reliable as one might hope if they are after "high reliability":
http://iang.org/ssl/reliable_connections_are_not.html#ref_6
Are there readily available protocols, or even transport libraries (C/C++ preferred), that are more reliable than TCP? Speed is of moderate concern too.
I would imagine such a transport library would effectively be a reimplementation of most parts of TCP.
It is a shame that TCP isn't flexible enough to allow trading some throughput/latency/speed for more reliability. You could get a lot more reliability just by making the checksum 32-bit instead of 16-bit, and more again by making it 64-bit.
There seems to be a very big cost to adding your own reliability layer on top of TCP: for starters, the hardware acceleration for TCP processing won't cover it, so you'll need to spend some of your CPU time on this layer. It is also a lot of extra complexity and code to implement, all of which could have been avoided if the TCP checksum were larger or selectable.
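For concreteness, here is roughly what such an extra layer looks like: a minimal sketch in Python (the framing format is made up, not any standard) that prefixes each message on a TCP stream with its length and a CRC-32 and verifies it on receipt:

```python
# Sketch of the "extra layer" described above: frame each message sent over a
# TCP stream with a length and a 32-bit CRC, and verify on receipt.
# The header layout is illustrative, not a standard protocol.
import socket
import struct
import zlib

HEADER = struct.Struct("!II")           # (payload length, CRC-32)

def send_framed(sock: socket.socket, payload: bytes) -> None:
    sock.sendall(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("stream closed mid-frame")
        buf += chunk
    return buf

def recv_framed(sock: socket.socket) -> bytes:
    length, crc = HEADER.unpack(recv_exact(sock, HEADER.size))
    payload = recv_exact(sock, length)
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC mismatch: corruption slipped past TCP's checksum")
    return payload
```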
TLS
Widely deployed, well understood, plenty of libraries available in all commonly used languages.
If by 'reliability' you mean a better chance of detecting stream alteration (bad hardware or malicious interference), then a cryptographic HMAC is the way to go, and TLS is pretty much the industry standard.
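For illustration, a minimal sketch of wrapping a TCP connection in TLS with Python's standard ssl module; the TLS record-layer MAC/AEAD catches alterations that TCP's 16-bit checksum would miss (the host name is a placeholder):

```python
# Minimal sketch of a TLS client using the standard library. Any tampering with
# the byte stream causes the TLS record check to fail, closing the connection,
# which is a far stronger integrity guarantee than TCP's checksum.
import socket
import ssl

HOST = "example.com"   # placeholder server
context = ssl.create_default_context()   # certificate and hostname verification on

with socket.create_connection((HOST, 443)) as raw:
    with context.wrap_socket(raw, server_hostname=HOST) as tls:
        tls.sendall(b"GET / HTTP/1.1\r\nHost: " + HOST.encode() +
                    b"\r\nConnection: close\r\n\r\n")
        data = tls.recv(4096)            # delivered only if the record MAC verifies
        print(data.splitlines()[0] if data else b"no reply")
```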

Can I ignore UDP's lack of reliability features in a controlled environment?

I'm in a situation where, logically, UDP would be the perfect choice (I need to be able to broadcast to hundreds of clients). This is in a very small and controlled environment (the whole network spans a few square meters, all devices are local, and the network is heavily over-provisioned with Gigabit Ethernet and switches everywhere).
Can I simply "ignore" all of the added reliability that normally needs to be layered on top of UDP (checking that messages arrived, resending them, etc.), since that mostly applies where packet loss is expected (the Internet)? Or is it really advisable to treat UDP as "may not arrive" even in such conditions?
I'm not asking for theorycrafting; I'm really wondering whether anyone can tell me from experience if I'm actually likely to have UDP packets go missing in such an environment, or whether it's going to be a really rare event, since obviously sending things and assuming they worked is much simpler than handling all possible errors.
This is a matter of stochastics. Even in small local networks, packet losses will occur. Maybe they have an absolute probability of 1e-10 in a normal usage scenario. Maybe more, maybe less.
So, now comes real-world experience: network controllers and operating systems have a tough life when used in high-throughput scenarios, and the same goes even more for switches. So, if you're near the capacity of your network infrastructure or your computational power, losses become far more likely.
So, in the end it's just a question of how high up in the networking stack you want to deal with errors: if you don't want to risk your application failing in 1 in 1e6 cases, you will need to add some flow/data-integrity control, which really isn't that hard. If you can live with the fact that the average program has to be restarted every once in a while, well, that's error correction at the user level...
Generally, I'd encourage you not to take risks; CPU power is just too cheap, and in most cases bandwidth is too. Try ZeroMQ: it has broadcast-style communication models, will ensure data integrity (and resend data if necessary), is available for practically all relevant languages, runs on all relevant OSes, and is (at least from my perspective) easier to use than raw UDP sockets.
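For reference, a minimal sketch of the ZeroMQ PUB/SUB pattern with pyzmq; the endpoint address and topic are placeholders, and publisher and subscriber would run as separate processes:

```python
# Sketch of a ZeroMQ publisher/subscriber pair (pyzmq). The PUB socket fans out
# each message to all connected subscribers; the SUB socket filters by topic prefix.
import zmq

ENDPOINT = "tcp://192.168.1.10:5556"   # hypothetical address on the local network

def publisher() -> None:
    ctx = zmq.Context.instance()
    pub = ctx.socket(zmq.PUB)
    pub.bind(ENDPOINT)
    while True:
        pub.send_multipart([b"status", b"sensor reading ..."])  # topic, payload

def subscriber() -> None:
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.connect(ENDPOINT)
    sub.setsockopt(zmq.SUBSCRIBE, b"status")     # subscribe to the "status" topic
    while True:
        topic, payload = sub.recv_multipart()
        print(topic, payload)
```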

NBAD, NetFlow on layer 7

I'm developing a Network Behavior Anomaly Detection (NBAD) system and I'm using Cisco's NetFlow protocol to collect traffic information. I want to collect information about layer 7 of the ISO OSI reference model, especially HTTPS traffic.
What is the best way to achieve this?
Maybe someone will find this helpful:
In my opinion you should try sFlow or Flexible NetFlow.
sFlow uses sampling to achieve scalability. The system architecture consists of receiving devices that collect two types of samples:
- randomly sampled packets
- counter samples taken at certain time intervals
Sampled packets are sent as sFlow datagrams to a central server, the sFlow collector, which runs software for analysis and reporting of network traffic.
sFlow may be implemented in hardware or software. Although the name "sFlow" suggests a flow technology, it is not really a flow technology at all; it transmits a picture of the traffic based on samples.
NetFlow is a true flow technology. Flow entries are generated in the network devices and combined into export packets.
Flexible NetFlow lets you export almost everything that passes through the router, including entire packets, and do it in real time, like sFlow.
In my opinion Flexible NetFlow is much better, and if you're worried about DDoS attacks, choose it.
If FNF is better, why use sFlow at all? Because many switches today only support sFlow, and if FNF isn't available and you want real-time data, sFlow is the best option.
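Either way, the collector side looks similar: sFlow and NetFlow/FNF exporters both send their records as UDP datagrams to a collector listening on a well-known port. A minimal sketch in Python; the ports are the conventional defaults, and real decoding needs a full sFlow v5 or NetFlow v9/IPFIX parser:

```python
# Sketch of a bare-bones flow collector: listen for export datagrams and read
# the version field from the header. Everything beyond that is left out.
import socket
import struct

SFLOW_PORT = 6343      # conventional sFlow collector port
NETFLOW_PORT = 2055    # commonly used NetFlow/FNF export port

def run_collector(port: int) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        datagram, exporter = sock.recvfrom(65535)
        if port == SFLOW_PORT:
            version = struct.unpack("!I", datagram[:4])[0]   # sFlow: 32-bit version field
        else:
            version = struct.unpack("!H", datagram[:2])[0]   # NetFlow: 16-bit version field
        print(f"{exporter[0]}: {len(datagram)} bytes, export format version {version}")

if __name__ == "__main__":
    run_collector(SFLOW_PORT)
```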

Where does the Transport Layer operate?

I'd like to know where the Transport Layer of the OSI model is running in a computer system. Is it part of the Operating System? Does it run in its own process or thread? How does it pass information up to other applications or down to other layers?
I'd like to know where the Transport Layer of the OSI model is running in a computer system.
It isn't. The OSI model applies to the OSI protocol suite, which is defunct, and not running anywhere AFAICS. However TCP/IP has its own model, which also includes a transport layer. I will assume that's what you mean hereafter.
Is it part of the Operating System?
Yes.
Does it run in its own process or thread?
No, it runs as part of the operating system.
How does it pass information up to other applications
Via system calls, e.g. the Berkeley Sockets API, WinSock, etc.
or down to other layers?
Via internal kernel APIs.
What the OSI model calls the transport layer corresponds fairly closely to the TCP layer in TCP/IP. That is, it gives guaranteed delivery/error recovery, and transparent transfers between hosts -- you don't need to pay attention to how the data is routed from one host to another -- you just specify a destination, and the network figures out how to get it there.
As far as where that's implemented: well, mostly in the TCP/IP stack, which is typically part of the OS. Modern network hardware can take over at least a few bits and pieces though (e.g., TCP checksumming and flow control). The network stack will offload those parts of the TCP operation to the hardware via the device driver.
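To make that split concrete, here is a minimal Python sketch of the user-space view (the host name is a placeholder): each call below is a thin wrapper over a socket system call, and the transport-layer work itself happens in the kernel's TCP/IP stack, with some pieces possibly offloaded to the NIC.

```python
# User-space view of the in-kernel transport layer: the application only issues
# socket system calls; the handshake, segmentation, checksums, retransmission,
# and congestion control all happen inside the OS (and partly in the NIC).
import socket

HOST = "example.com"   # placeholder

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket() syscall
sock.connect((HOST, 80))                  # connect(): kernel runs the 3-way handshake
sock.sendall(b"HEAD / HTTP/1.0\r\n\r\n")  # send(): kernel segments, checksums, retransmits
reply = sock.recv(1024)                   # recv(): kernel reassembles the byte stream
sock.close()
print(reply.splitlines()[0] if reply else b"no reply")
```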
The transport layer is available as a library that usually ships with the operating system.
The logical part is implemented in that library; interaction with the transport medium goes through drivers.
The transport layer exists between two or more devices, in this example a client and a host machine (virtual or real). Transport is invoked by the operating system on both ends. Both the client and the host machine have instances of an operating system and underlying hardware managing transport.
Transport control coordinates delivery assurance for both the client and the host machine OS. Some machines, where necessary, shift part of the workload from the CPU or kernel down to underlying chipsets to lighten the load. Transport duty is essentially commodity work that is not always appropriate for the kernel or main CPU, but the OS is where transport handling evolved as networks modernized.
In the classroom, that duty is done by the OS; in the industrial control systems I design and implement, we always consider hardware acceleration and efficiency.

Is there an algorithm for fingerprinting the TCP congestion control algorithm used in a captured session?

I would like a program for determining the TCP congestion control algorithm used in a captured TCP session.
The referenced Wikipedia article states:
TCP New Reno is the most commonly implemented algorithm, SACK support is very common and is an extension to Reno/New Reno. Most others are competing proposals which still need evaluation. Starting with 2.6.8 the Linux kernel switched the default implementation from reno to BIC. The default implementation was again changed to CUBIC in the 2.6.19 version.
Also:
Compound TCP is a Microsoft implementation of TCP which maintains two different congestion windows simultaneously, with the goal of achieving good performance on LFNs while not impairing fairness. It has been widely deployed with Microsoft Windows Vista and Windows Server 2008 and has been ported to older Microsoft Windows versions as well as Linux.
What would be some strategies for determining which CC algorithm is in use (from a third party capturing the session)?
Update
This project has built a tool to do this:
The Internet has recently been evolving from homogeneous congestion control to heterogeneous congestion control. Several years ago, Internet traffic was mainly controlled by the standard TCP AIMD algorithm, whereas Internet traffic is now controlled by many different TCP congestion control algorithms, such as AIMD, BIC, CUBIC, CTCP, HSTCP, HTCP, HYBLA, ILLINOIS, LP, STCP, VEGAS, VENO, WESTWOOD+, and YEAH. However, there is very little work on the performance and stability study of the Internet with heterogeneous congestion control. One fundamental reason is the lack of the deployment information of different TCP algorithms. The goals of this project are to:
1) develop tools for identifying the TCP algorithms in the Internet,
2) conduct large-scale TCP-algorithm measurements in the Internet.
There are many more congestion control algorithms than you mention here; off the top of my head, the list includes FAST, Scalable, HSTCP, HTCP, BIC, CUBIC, Veno, and Vegas.
There are also small variations of them due to bug fixes in actual implementations, and I'd guess that implementations in different OSes also behave slightly differently from one another.
But if I had to come up with an idea, it would be to estimate the RTT of the connection. You could look at the time between the third and fourth packets, since the first and second packets may be tainted by ARP and other discovery traffic along the route.
After you have an estimate for the RTT you could try to refine it along the way; I'm not exactly sure how you would do that, though. But you don't require a full spec for the program, just ideas :-)
With the RTT figured out you can try to put the packets into RTT bins and count the number of in-flight data packets in each bin. This way you'll be able to "plot" the estimated cwnd (number of packets per bin) over time and try some pattern matching there.
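A rough sketch of that binning idea in Python; the per-packet input format and the RTT value are assumptions about how you would preprocess the capture:

```python
# Given the sender-side data packets of one captured TCP flow as
# (timestamp, payload bytes) pairs and an RTT estimate, count how much data was
# put on the wire in each RTT-sized bin as a crude proxy for cwnd over time.
from collections import Counter
from typing import List, Tuple

def estimate_cwnd_trajectory(
    data_packets: List[Tuple[float, int]],   # (capture timestamp in seconds, payload bytes)
    rtt: float,                              # RTT estimate, e.g. gap between 3rd and 4th packets
) -> List[Tuple[float, int]]:
    """Return (bin start time, bytes sent in that RTT bin) for each bin."""
    if not data_packets:
        return []
    t0 = data_packets[0][0]
    bins: Counter = Counter()
    for ts, payload_len in data_packets:
        bins[int((ts - t0) // rtt)] += payload_len
    return [(t0 + i * rtt, bins[i]) for i in sorted(bins)]

# Example with made-up numbers: bins roughly doubling would look like slow start.
trajectory = estimate_cwnd_trajectory(
    [(0.00, 1448), (0.10, 1448), (0.11, 1448), (0.20, 1448), (0.21, 1448),
     (0.22, 1448), (0.23, 1448)],
    rtt=0.1,
)
print(trajectory)
```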
An alternative would be to go along the trace and try to "run" the different congestion control algorithms in your head and see whether the decision at any point matches the decision you would have made. It will require some leniency and accuracy margins.
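For that "replay the algorithms" idea, the window-growth models to match against can be written down directly. Below is a sketch with Reno's additive increase and the CUBIC curve as described in RFC 8312 (C = 0.4 and beta = 0.7 are the usual constants; the matching tolerances are up to you):

```python
# Two candidate window-growth models to compare against the estimated cwnd
# trajectory: Reno-style congestion avoidance and the CUBIC function.
def reno_cwnd(cwnd_after_loss: float, rtts_since_loss: int) -> float:
    """Congestion avoidance: grow by one segment per RTT."""
    return cwnd_after_loss + rtts_since_loss

def cubic_cwnd(w_max: float, seconds_since_loss: float,
               c: float = 0.4, beta: float = 0.7) -> float:
    """W(t) = C*(t - K)^3 + W_max, with K = cbrt(W_max*(1 - beta)/C)."""
    k = (w_max * (1 - beta) / c) ** (1 / 3)
    return c * (seconds_since_loss - k) ** 3 + w_max

# Compare both models against the measured trajectory and keep the closer fit.
print([round(reno_cwnd(20, i)) for i in range(5)])
print([round(cubic_cwnd(40, t * 0.1)) for t in range(5)])
```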
This definitely sounds like an interesting and challenging task!

Resources