What are the benefits of not supporting UDP in cloud network? (thinking of Windows Azure case) - networking

Azure, Rackspace and Amazon do handle UDP, but GAE (the most similar to Azure) does not.
I am wondering what the expected benefits of this restriction are. Does it help with fine-tuning the network? Does it ease load balancing? Does it help to secure the network?

I suspect the reason is that UDP traffic has neither a defined lifetime nor a defined packet-to-packet relationship. This makes it hard to load balance and hard to manage: when you don't know how long to hold the path open, you end up using timers. This is a problem for some NAT implementations too.
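As a minimal sketch of the timer problem described above (not how Azure or any particular load balancer works), a UDP flow table can only guess when a flow has ended by watching for silence. The class name, the 30-second idle timeout, and the round-robin backend choice are all assumptions made for illustration.

```python
import time


class UdpFlowTable:
    """Maps a UDP 4-tuple to a backend and forgets it after an idle timeout."""

    def __init__(self, backends, idle_timeout=30.0, clock=time.monotonic):
        self.backends = list(backends)
        self.idle_timeout = idle_timeout   # assumed value; real devices vary
        self.clock = clock
        self.flows = {}                    # flow key -> (backend, last_seen)
        self._next = 0                     # round-robin cursor for new flows

    def route(self, src_ip, src_port, dst_ip, dst_port):
        """Return the backend for this datagram, creating or refreshing a flow."""
        now = self.clock()
        key = (src_ip, src_port, dst_ip, dst_port)
        entry = self.flows.get(key)
        if entry is not None and now - entry[1] <= self.idle_timeout:
            backend = entry[0]
        else:
            # UDP has no FIN/RST, so a flow that went quiet for too long
            # is indistinguishable from a new one and gets a fresh backend.
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
        self.flows[key] = (backend, now)
        return backend

    def expire(self):
        """Drop flows that have been silent longer than the idle timeout."""
        now = self.clock()
        stale = [key for key, (_, seen) in self.flows.items()
                 if now - seen > self.idle_timeout]
        for key in stale:
            del self.flows[key]
        return len(stale)
```

With TCP, the table entry could be removed when the connection closes. Here a client that pauses longer than the timeout may silently land on a different backend, which is the management difficulty the answer points at.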

There's another angle not really explored here so far. UDP traffic is also a huge source of security problems, specifically DDoS attacks.
By blocking all UDP traffic, Azure can more effectively mitigate these attacks. Nearly all large bandwidth attacks, which are by far the hardest to deal with, are amplification attacks of some sort and are most often UDP based. Allowing that traffic past the border of the network greatly increases the likelihood of service disruption, regardless of QoS guarantees.
A second facet to that same story is that by blocking UDP they prevent people from hosting insecure DNS servers and thus prevent Azure from being the source of these large-scale amplification attacks. This is actually a very good thing for the internet overall, as I'd think the connectivity of Azure's data centers is significant. By contrast, I've had servers in AWS send nonstop UDP attacks to our datacenter for months on end, and could not get the abuse team to respond to it.

The only thing that comes to my mind is that maybe they wanted to avoid their cloud being accessed through an unreliable transport protocol.
Along with scalability, reliability is one of the key aspects of Azure. For example, SQL Azure and Azure Storage data is always replicated in at least three places, and roles with at least two instances have a 99.95% uptime in their SLA.
Of course, despite its partial unreliability, UDP has its use cases, some of them enumerated in the comments from the feature voting site, but maybe those use cases are not a target for the Azure platform.

Related

Should Instrumentation data such as metrics, be transmitted over HTTPS?

Should information such as metrics generated by an application, devoid of any business information, still be subject to encryption/decryption over HTTPS when it is transmitted within the ecosystem of an organization that sits behind firewalls?
The reason I am asking is that the metrics data does not give away any business information, is already behind a firewall, and above all is tremendous in size (time-series data on the order of millions of records per second). Does it make sense to avoid the computational cost of HTTPS, which forces encryption/decryption at every hop of the metrics' journey from source to destination, by applying an ingress policy that routes the metrics packets via another port such as 8080 to skip encryption/decryption, saving us a lot of resource utilization and latency?
Or is this a known compromise that can turn into a vulnerability and lead to breaches in the system?
Context:
The applications being monitored are communicating over HTTPS.
The metrics scraping agents are asked to communicate over HTTP.
An ingress policy applied on the application node recognizes the calls from the known metrics scraping agent and routes the packets via a non-HTTPS port such as 8080, in order to skip the certificate validation and, mainly, the decryption of the metrics payload in the incoming request.
I am looking for suggestions and inputs, especially from someone who has had this problem to solve in their experience. Anybody else with relevant information is more than welcome to add to it.
Any leads appreciated.
Thank you, in advance.
"the metrics data does not give away any business information"
I think this is not true. Metrics can record traffic patterns also in a business context (e.g.: what users searched for/bought the most, etc.).
Also, it can accidentally contain sensitive information (it should not but accidents can happen). Additionally, it can help attackers to get more data about:
Your infrastructure (what platforms you use)
Your environment (OS, Java version, etc.)
Your app topology (who calls whom)
Please check the Fallacies of distributed computing:
#4 The network is secure
Being behind a firewall does not mean attackers can't get in, that's one of the reasons why you use HTTPS on the internal network.
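If the scraper stays on HTTPS as suggested, a client-side context along these lines is one way to set it up with Python's standard `ssl` module. This is only a sketch: the function name and the `ca_file` parameter (a path to an internal CA bundle) are hypothetical, and nothing here is specific to any particular metrics agent.

```python
import ssl


def internal_scrape_context(ca_file=None):
    """Build a client-side TLS context for a metrics scraper.

    ca_file is a hypothetical path to the organization's internal CA bundle;
    when it is None, the system trust store is used instead.
    """
    # create_default_context enables certificate and hostname verification.
    context = ssl.create_default_context(cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The context can be passed to `http.client.HTTPSConnection(host, context=...)`. Keeping that connection open across scrapes means the handshake is paid once rather than per request, which tends to be where much of the TLS overhead sits.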

Need to set up an RTMP stream from our server with multicast

I have a client with an audience of 1-2 thousand viewers, with daily streams and about the same number of concurrent viewers.
I've got a server set up for their website etc., but am in the process of figuring out the best way to stream with OBS to that server and then redistribute that stream to clients (as an embed on the website).
Now, from the calculations I did, serving that many concurrent viewers is very problematic, as it forces you onto a 10 Gbit link, which is very expensive, and I would ideally like to fit within 1-2 Gbps, if possible.
A friend of mine recommended looking into "Multicast", which supposedly uses MUCH less bandwidth than regular live streaming options. Is multicast doable? I've had an NGINX live stream set up on my server by a friend before, but never looked into the config or whether multicast is supported within it. Are there any other options? What would you recommend?
Also, this live stream service isn't a high-profit / organisation type of deal, so pre-made services just don't make sense, as they would easily cost 40+ dollars per stream, which is just too much for my client.
Thank you for any help!
Tom
Rather than multicast, P2P is the more practical solution on the Internet, to save money rather than bandwidth.
Especially for HTML5 browsers, it's possible to use a WebRTC DataChannel to transport P2P data.
Multicast, however, does not work across Internet routers.
Multicast works by sending a single stream across the network to edge points where clients can 'join' the multicast to get an individual stream for them.
It requires that the network supports multicast protocols and the edges align with your users.
It is typically used when an operator has their own IP network for service like IPTV, rather than for services over the internet.
For your scenario, you would usually use an origin server and a CDN. This will usually reduce the load on your own server, as the video will be cached on the network and multiple users can access the same 'chunks' of the video.
You can see an AWS example for on-demand video here (other vendors and cloud providers have solutions too, so this is just an example):
https://docs.aws.amazon.com/AmazonS3/latest/userguide/tutorial-s3-cloudfront-route53-video-streaming.html
You can also find more complex on-demand and live tutorials, but they are likely more than you need: https://aws.amazon.com/cloudfront/streaming/
Exploring P2P may be an option also as Winton suggests - some CDN may also leverage P2P technology internally.
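To put rough numbers on the origin-plus-CDN suggestion, the back-of-the-envelope arithmetic can be sketched as below. The function names, the 5 Mbit/s bitrate, and the count of 20 edge nodes are assumptions for illustration only, and the CDN model (each edge pulls the stream once and fans it out) is a deliberate simplification.

```python
def unicast_egress_gbps(viewers, bitrate_mbps):
    """Egress needed when the server sends one copy of the stream per viewer."""
    return viewers * bitrate_mbps / 1000.0


def origin_egress_with_cdn_mbps(edge_nodes, bitrate_mbps):
    """Rough origin load if each CDN edge pulls the stream once and fans it out."""
    return edge_nodes * bitrate_mbps


if __name__ == "__main__":
    # Assumed figures: 2000 concurrent viewers at 5 Mbit/s each.
    print(unicast_egress_gbps(2000, 5), "Gbit/s served directly")
    # Same stream, but only 20 (hypothetical) edge nodes pull from the origin.
    print(origin_egress_with_cdn_mbps(20, 5), "Mbit/s from the origin via a CDN")
```

Under these assumed figures, direct delivery matches the 10 Gbit estimate in the question, while the origin behind a CDN only serves the edges. Lowering the bitrate is the other lever: the same audience at 1 Mbit/s would fit in about 2 Gbit/s even without a CDN.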

Opensource lightweight HIDS for use on production servers

Requirement
I want to secure my production VMs on AWS. These VMs host critical web applications and can see around 500 Mbps of traffic during peak hours. I am already using the mod_security WAF, but I am not very happy with it.
Here is what I am thinking:
What if I use snort in a lightweight configuration to monitor only HTTP traffic (this would be behind SSL termination) and use open source XSS and SQLi rules to add an additional layer of protection? The number of rules will be > 100.
By the time traffic hits my VMs it will be unencrypted. Moreover, as I am running snort on the same host, there won't be much of a semantic gap (a WAF has an edge over an IPS since it builds richer app-layer context and can detect layer 7 attacks more accurately). Is this understanding correct?
I can spare around 200 MB of memory and can take a 10% overhead on CPU performance.
Is snort the best bet here? I looked at Suricata, which seems to be easier on CPU but harder on memory. Please let me know if this makes sense at all. I want to stick to open source solutions.

How to avoid crashing my user's router?

It appears that cheap consumer routers are fairly easy to crash: hanging around in various backup/sync software forums, I see this mentioned from time to time. Developers seem to be putting a fair amount of effort into making sure they don't crash the routers.
What are the "do"s and "don't"s for my network-heavy application to ensure that it doesn't cause issues with badly designed routers? Especially one that intends to connect to a number of peers?
IMO trying to work around bad hardware is the road to nowhere, because every router fails in its own remarkable way :).
What you can do in a network-heavy application is assume that the network is not a stable medium (routers can crash, etc.) and design the application's network operations accordingly.
For instance, provide reconnect logic, connection timeouts, and some sort of state caching to allow users to work with the app even if network connectivity is gone.
Concerning faulty routers: they usually crash because of a large number of simultaneous connections (e.g. downloading via BitTorrent or another P2P protocol). So, keeping the number of connections to a minimum can help.
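The reconnect, timeout, and connection-limit advice could look roughly like the following. It is a sketch under assumptions rather than a recipe: the cap of 8 connections, the backoff parameters, and all function names are illustrative, and `attempts` is assumed to be at least 1.

```python
import random
import socket
import threading
import time

MAX_CONNECTIONS = 8          # assumed cap; tune for your application
_slots = threading.BoundedSemaphore(MAX_CONNECTIONS)


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Upper bound of the wait before retrying after a zero-based attempt."""
    return min(cap, base * (2 ** attempt))


def connect_with_retry(host, port, attempts=5, timeout=5.0, sleep=time.sleep):
    """Open a TCP connection while holding one of the limited connection slots.

    Returns a connected socket; call release_connection() when done with it.
    """
    _slots.acquire()             # blocks while MAX_CONNECTIONS are already open
    try:
        for attempt in range(attempts):
            try:
                return socket.create_connection((host, port), timeout=timeout)
            except OSError:
                if attempt == attempts - 1:
                    raise
                # Random jitter keeps many peers from reconnecting in lockstep.
                sleep(random.uniform(0, backoff_delay(attempt)))
    except BaseException:
        _slots.release()         # give the slot back if we never connected
        raise


def release_connection(sock):
    """Close the socket and free its connection slot."""
    sock.close()
    _slots.release()
```

The semaphore bounds how many NAT table entries the application can create at once, and the growing, jittered delay avoids hammering a router that has just rebooted.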

What percentage of users are behind symmetric NATs, such that "p2p" traffic needs to be relayed?

We're implementing a SIP-based solution and have configured the setup to work with RTPProxy. Right now, we're routing everything through RTPProxy, as we were having some issues with media transport relying on ICE. If we're not mistaken, a central relay server is necessary for relaying streaming data between two clients if they're behind symmetric NATs. In practice, is this a large percentage of all consumer users? How much bandwidth would we save if we implemented proper routing to skip the relay server when it is not necessary? Are there better solutions we're missing?
In decreasing order of usefulness:
There is a direct connection between the two endpoints in both directions. You just connect and you are essentially done.
There is a direct connection between the two endpoints in one direction. In that case you try both directions and connect via the one that works.
Both parties are behind NATs of some kind. In that case, again in decreasing order of usefulness:
If you are lucky, UPnP works on one end; you can then upgrade the connection to one of the schemes above.
UPnP doesn't work, but STUN does. Use it to punch a hole in the NAT. There are a couple of different protocols, but the general trick is to negotiate via a middleman that coordinates the NAT piercing.
You fall back to letting another node on the network act as a relaying proxy.
If you implement the full list above, then you have to give up on very few connections and don't have to spend much bandwidth at proxies. The BitTorrent protocol, with which I am somewhat familiar, usually stops at UPnP, but provides a built-in test for connectivity through the NAT.
One really wonders why IPv6 did not get implemented earlier; this is a waste of programmers' time.
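The ordered fallback described in the list above can be sketched as a simple ladder that tries the cheapest method first and only reaches the relay last. The `establish` function and the strategy names are hypothetical; the real direct, UPnP, STUN, and relay logic would live inside the callables.

```python
def establish(peer, strategies):
    """Try each connectivity strategy in order and return the first that works.

    strategies is a list of (name, callable) pairs. Each callable takes the
    peer and returns a connection object, or None if that method is unavailable.
    """
    for name, attempt in strategies:
        try:
            conn = attempt(peer)
        except OSError:
            conn = None          # treat a network error as "this method failed"
        if conn is not None:
            return name, conn
    raise ConnectionError("no connectivity strategy succeeded for %r" % (peer,))


if __name__ == "__main__":
    # Placeholder strategies standing in for real implementations.
    ladder = [
        ("direct", lambda peer: None),
        ("upnp", lambda peer: None),
        ("stun-hole-punch", lambda peer: "p2p-link-to-" + peer),
        ("relay", lambda peer: "relayed-link-to-" + peer),
    ]
    print(establish("alice", ladder))
```

Because the relay is the last entry, it only carries traffic for peers where every cheaper method failed.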
Real world NAT types survey (not a huge dataset, though):
http://nattest.net.in.tum.de/results.php
According to Google, about 8% of the traffic has to be relayed: http://code.google.com/apis/talk/libjingle/important_concepts.html
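Taking the roughly 8% figure above at face value, the potential saving from skipping the relay can be estimated with simple arithmetic. The call count and per-call bitrate below are made-up inputs, not measurements, and the function name is illustrative.

```python
def relay_bandwidth_mbps(concurrent_calls, per_call_kbps, relayed_fraction):
    """Bandwidth the relay must carry, given the share of calls that need it."""
    return concurrent_calls * per_call_kbps * relayed_fraction / 1000.0


if __name__ == "__main__":
    calls, kbps = 1000, 128      # assumed load: 1000 calls at 128 kbit/s each
    everything = relay_bandwidth_mbps(calls, kbps, 1.0)
    only_needed = relay_bandwidth_mbps(calls, kbps, 0.08)
    print("relay everything:", everything, "Mbit/s")
    print("relay only when required:", round(only_needed, 2), "Mbit/s")
```

Under these assumptions the relay load drops by about 92%, which is simply the complement of the relayed share.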
A large percentage (if not the majority) of home users use NAT, as that is what xDSL/cable routers use to provide network access to the local network.
You can theoretically use UPnP to open ports and set up forwarding rules on the router to go through the NAT transparently. Unfortunately (or fortunately, depending on who you are) many users disable UPnP on their router as a matter of course and may not appreciate having to add forwarding rules manually.
What you might be able to do (and what Skype does AFAIK) is to have some of the users that have clear network paths and enough bandwidth act as relay nodes. Apart from the routing and QoS issues, you would at least have to find some way to ensure the privacy of any relayed data from anyone, including the owner of the relay node. In addition, there might be legal issues to settle with this approach, apart from the technical ones.
