How to get high availability when downloading data to 2 servers?

I have data that needs to be downloaded to a local server every 24 hours. For high availability we have provisioned 2 servers, to avoid failures and losing data.
My question is: What is the best approach to use the 2 servers?
My ideas are:
- Download on one server, and only if the download fails for any reason, continue the download on the other server.
- Download on both servers at the same time every day.
Any advice?

In terms of your high-level approach, break it down into manageable chunks, i.e. reliable data acquisition and highly available data dissemination. I would start with the second part, because that's the state you want to get to.
Highly available data dissemination
Working backwards (i.e. this is the second part of your problem), when offering highly-available data to consumers you have two options:
Active-Passive
Active-Active
Active-Active means you have at least two nodes servicing requests for the data, with some kind of Load Balancer (LB) in front, which allocates the requests. Depending on the technology you are using there may be existing components/solutions in that tech stack, or reference models describing potential solutions.
Active-Passive means you have one node taking all the traffic, and when that becomes unavailable requests are directed at the stand-by / passive node.
The passive node can be "hot" ready to go, or "cold" - meaning it's not fully operational but is relatively fast and easy to stand-up and start taking traffic.
In both cases, if you have only 2 nodes, you ideally want each node to be capable of handling the entire load on its own. That's obvious for Active-Passive, but it also applies to Active-Active, so that if one node goes down the other can successfully handle all requests.
In both cases you need some kind of network component that routes the traffic. Ideally it will be able to operate autonomously (it will have to if you want active-active load sharing), but you could also have a manual / alert-based process for switching from active to passive. Much of this will depend on your non-functional requirements.
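For the active-passive case, a minimal sketch of an autonomous switch-over might look like the Python loop below. The health-check URL, the addresses and the update_lb_backend() function are placeholders for whatever your load balancer, VIP or DNS actually exposes; treat it as an illustration of the idea rather than a finished implementation.

# Hypothetical health-check loop that promotes the passive node when the
# active one stops responding. Replace update_lb_backend() with your real
# LB / VIP / DNS update mechanism.
import time
import urllib.request

ACTIVE, PASSIVE = "10.0.0.1", "10.0.0.2"   # example addresses

def is_healthy(host):
    try:
        with urllib.request.urlopen(f"http://{host}/health", timeout=3) as resp:
            return resp.status == 200
    except OSError:
        return False

def update_lb_backend(host):
    """Placeholder: point the load balancer (or VIP / DNS record) at this host."""
    print(f"traffic now routed to {host}")

while True:
    if not is_healthy(ACTIVE):
        ACTIVE, PASSIVE = PASSIVE, ACTIVE   # promote the stand-by node
        update_lb_backend(ACTIVE)
    time.sleep(10)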
Reliable data acquisition
Having figured out how you will disseminate the data, you know where you need to get it to.
E.g. if active-active, you need to get it to both nodes at (roughly) the same time (I don't know what tolerances you have), since you want them to serve the same consistent data. One option to get around that issue is this:
Have the LB route all traffic to node A.
Node B performs the download.
The LB is informed that Node B successfully got the new data and is ready to serve it. LB then switches the traffic flow to just Node B.
Node A gets the updated data (perhaps from Node B, so the data is guaranteed to be the same).
The LB is informed that Node A successfully got the new data and is ready to serve it. LB then allows the traffic flow to Nodes A & B.
This pattern would also work for active-passive:
Node A is the active node, B is the passive node.
Node B downloads the new data, and is ready to serve it.
Node A gets updated with the new data (probably from node B), to ensure consistency.
Node A serves the new data.
You get the data on the passive node first so that if node A went down, node B would already have the new data. Admittedly the time-window for that to happen should be quite small.
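Putting the steps above into code, a rough orchestration sketch in Python might look like the following. The node addresses, the download and rsync commands, and the set_active_nodes() call are all placeholders, since they depend entirely on your load balancer and tooling; the point is only to show the ordering of the steps.

# Hypothetical daily refresh implementing the "download on the idle node,
# then switch traffic" pattern described above.
import subprocess

NODES = {"A": "10.0.0.1", "B": "10.0.0.2"}   # example addresses

def set_active_nodes(nodes):
    """Placeholder: tell the load balancer which nodes may receive traffic."""
    print(f"LB now routing to: {', '.join(nodes)}")

def download_to(node):
    """Placeholder: pull the day's data set onto the given node."""
    subprocess.run(["ssh", NODES[node], "/opt/feed/download.sh"], check=True)

def copy_between(src, dst):
    """Placeholder: replicate the freshly downloaded data from src to dst."""
    subprocess.run(["ssh", NODES[src],
                    f"rsync -a /data/feed/ {NODES[dst]}:/data/feed/"], check=True)

def daily_refresh():
    set_active_nodes(["A"])        # 1. all traffic on node A
    download_to("B")               # 2. idle node B fetches the new data
    set_active_nodes(["B"])        # 3. B has the new data, switch traffic to B
    copy_between("B", "A")         # 4. A copies from B, so the data is guaranteed identical
    set_active_nodes(["A", "B"])   # 5. both nodes serve the same data again

if __name__ == "__main__":
    daily_refresh()

If step 2 fails, nothing has been switched yet and node A keeps serving yesterday's data, which is essentially the fallback behaviour described in the question.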

Related

Is a netflow record equal to a session?

Because I don't fully understand the definition of a session in networking, I'm puzzled as to whether a NetFlow record is equal to a session.
If I upload some files to a server through FTP in one go, and this produces 50
NetFlow records (same source and destination IP, but the ports are different), does that count as 50 sessions, or does the whole process, once the server has closed the connections, count as only one session?
thanks a lot :)
Short answer: it all depends.
There are many factors and variables when working with network flows, whether it's Cisco's NetFlow format (in its various versions), IETF's IPFIX or other similar formats. If we take a very common format, NetFlow v5, a flow is defined by a 5- or 7-tuple (depending on how detailed the definition is). The tuple fields are: source and destination IP address, source and destination port, and protocol (plus, for the 7-tuple, Type of Service and ingress interface index). NetFlow v5 is also a uni-directional flow format, meaning it treats connections coming from the server separately from those going to the server. So any IP packet matching that 5/7-tuple definition in one direction will constitute a network flow and result in a NetFlow record. All of this has to be taken into consideration when examining NetFlow data and comparing it to network communication sessions (a term which itself may have different definitions depending on context).
And as if that wasn't enough, there are also implementation-specific variables and limitations that may split a session into several records. Flow exporters usually implement various timeouts so that they can collect and store data efficiently. TCP sessions may stay open over a long time period, which makes it challenging for the flow generator to keep and maintain the flow in memory. Some network flow formats have the ability to locate such split records and merge them into one single flow record.
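To make the tuple and merging ideas concrete, here is a rough Python sketch that groups uni-directional records into "sessions" by keying them on a direction-agnostic 5-tuple and merging records that were split by exporter timeouts. The field names (src_ip, dst_ip, src_port, dst_port, proto, first, last as timestamps in seconds) and the idle gap are assumptions about your collector's export format, not part of any standard.

# Rough sketch: turn uni-directional flow records into "sessions".
from collections import defaultdict

IDLE_GAP = 300  # seconds; records further apart than this count as separate sessions

def session_key(rec):
    # Sort the two endpoints so the A->B and B->A flows map to the same key.
    a = (rec["src_ip"], rec["src_port"])
    b = (rec["dst_ip"], rec["dst_port"])
    return (min(a, b), max(a, b), rec["proto"])

def group_into_sessions(records):
    by_key = defaultdict(list)
    for rec in records:
        by_key[session_key(rec)].append(rec)

    sessions = []
    for key, recs in by_key.items():
        recs.sort(key=lambda r: r["first"])
        current = [recs[0]]
        for rec in recs[1:]:
            # Merge records that were split by the exporter's active/idle timeouts.
            if rec["first"] - current[-1]["last"] <= IDLE_GAP:
                current.append(rec)
            else:
                sessions.append((key, current))
                current = [rec]
        sessions.append((key, current))
    return sessions

Under this particular definition the FTP example above would still come out as 50 sessions, because the 50 data connections use 50 different port pairs; only records belonging to the same port pair get merged.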
So, to sum up, network analysts starting to work with network flows easily fall into the trap of thinking one record equals one session. That assumption may be true sometimes, but not always.

Syncing tables one-way but different tables in different directions

I have never worked with multiple servers syncing before. I am working on a project that will require multiple MariaDB servers to sync specific tables. Each table will only ever be changed on one server, but each of those tables will be changed on a different server.
Given tables A, B, C, and D:
A - settings table - only updated when users change settings
B, C, D - work tables - these will be updated every few seconds with work done by the individual servers.
Main Server - Changes A; needs B, C, D kept up-to-date in real-time.
Server 2 - Changes B; needs A kept up-to-date in real-time.
Server 3 - Changes C; needs A kept up-to-date in real-time.
Server 4 - Changes D; needs A kept up-to-date in real-time.
Looking at replication tutorials, I see lots of information about one-way and two-way, but I haven't been able to find anything that matches up with what I'm trying to do.
It is imperative that the data is synced in real-time as the information is time-sensitive. If the servers lose connection to each other, I still need the data to be synced as soon as the connection is restored.
The tables will all be updated by PHP code on their respective servers. The servers are all running Linux. Is this something that can be done with MariaDB by itself? Or would it be better to handle this in another way? I'd really like to avoid two-way replication of the entire database to all the servers as most of the data is unnecessary for any server but the main server and the server that created it.
Forget having multiple machines trying to sync with each other. Instead think about...
Plan A: Using a single server for all the work. Then there is no syncing.
Plan B: Make use of the network to allow everybody to write to a Master, which continually replicates uni-directionally to any number of Slaves. The slaves can be read from.
Plan C: Some form of clustering, such as Galera, so that multiple nodes are continually replicating to each other, and you can read/write to each node.
In any sync or replication environment, beware of the "critical read" scenario. This is when a user writes a comment on a blog post, then fails to find it on the next web page because the write has not been replicated to the node they read from yet.
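As an illustration of one common mitigation (not the only one), here is a minimal Python sketch that pins a user's reads to the master for a few seconds after they write, so they always see their own change. The connection objects and the pin window are placeholders; the same logic could live in your PHP data-access layer.

# Hypothetical read-routing helper to dodge the "critical read" problem.
import time

PIN_SECONDS = 5
_last_write = {}  # user_id -> timestamp of that user's most recent write

def record_write(user_id):
    _last_write[user_id] = time.monotonic()

def pick_connection(user_id, master_conn, replica_conn):
    # Read from the master if this user wrote recently; otherwise use a replica.
    last = _last_write.get(user_id)
    wrote_recently = last is not None and time.monotonic() - last < PIN_SECONDS
    return master_conn if wrote_recently else replica_conn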
To answer your question: There is no "sync" mechanism other than the 3 things I described.
More
Connection issues are not a serious problem -- there are many cloud services out there that come very close to being available 100.0% of the time.
Latency is not really a serious problem either -- you would be surprised at how far away web sites are; a thousand miles is "nothing" on the Internet today.
Do minimize the number of round trips from the user to the web server if they are far apart, and minimize the number of database calls between the front end and the database. Note that it can be beneficial to put the web server near the client (to cut the user-facing round trips) or near the database (to cut the query round trips), depending on which of the two dominates.
In a Master-Slave setup, a few gigabytes per day is a reasonable cap on traffic. How much traffic are you talking about?

Which node should I push data to in a cluster?

I've set up a Kafka cluster with 3 nodes.
kafka01.example.com
kafka02.example.com
kafka03.example.com
Kafka does replication, so any node in the cluster can be removed without losing data.
Normally I would send all data to kafka01, however that will break the entire cluster if that one node goes down.
What is industry best practice when dealing with clusters? I'm evaluating setting up an NGINX reverse proxy with round robin load balancing. Then I can point all data producers at the proxy and it will divvy up between the nodes.
I need to ensure that no data is lost if one of the nodes becomes unavailable.
Is an nginx reverse proxy an appropriate tool for this use case?
Is my assumption correct that a round robin reverse proxy will distribute the data and increase reliability without data loss?
Is there a different approach that I haven't considered?
Normally your producer takes care of distributing the data to all (or a selected set of) nodes that are up and running, using a partitioning function either in round-robin mode or with some semantics of your choice. The producer publishes to a partition of a topic, and different nodes are leaders for different partitions of the same topic. If a broker node becomes unavailable, it drops out of the in-sync replica set (ISR) and new leaders are elected for the partitions it was leading. Through metadata requests/responses, your producer becomes aware of this and pushes messages to the nodes that are currently up.
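As a concrete illustration, here is a minimal producer sketch using the kafka-python client (an assumption about your stack; the Java and other clients expose the same settings). Listing all three brokers as bootstrap servers and requiring acknowledgement from all in-sync replicas is what makes an external proxy unnecessary; the topic name is made up.

# Sketch: let the Kafka client itself handle broker failover, no proxy needed.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=[
        "kafka01.example.com:9092",
        "kafka02.example.com:9092",
        "kafka03.example.com:9092",
    ],
    acks="all",   # wait until all in-sync replicas have the message
    retries=5,    # retry automatically if a partition leader moves
)

producer.send("example-topic", value=b"example payload")  # hypothetical topic
producer.flush()

For the "no data loss if one node dies" requirement, the topic also needs a replication factor of at least 2 (ideally 3) and an appropriate min.insync.replicas setting; acks="all" only helps if there are replicas to acknowledge.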

How do I configure OpenSplice DDS for 100,000 nodes?

What is the right approach to use to configure OpenSplice DDS to support 100,000 or more nodes?
Can I use a hierarchical naming scheme for partition names, so "headquarters.city.location_guid_xxx" would prevent packets from leaving a location, and "company.city*" would allow samples to align across a city, and so on? Or would all the nodes know about all these partitions just in case they wanted to publish to them?
The durability services will choose a master when they come up. If one durability service is running on a Raspberry Pi in a remote location over a 3G link, what is to prevent it from trying to become the master for "headquarters" and crashing?
I am experimenting with durability settings such that a remote node uses location_guid_xxx, but the "headquarters" cloud server uses a "Headquarters" scope.
On the remote client I might do this:
<Merge scope="Headquarters" type="Ignore"/>
<Merge scope="location_guid_xxx" type="Merge"/>
so a location won't be master for the universe, but can a durability service within a location still be master for that location?
If I have 100,000 locations does this mean I have to have all of them listed in the "Merge scope" in the ospl.xml file located at headquarters? I would think this alone might limit the size of the network I can handle.
I am assuming that this product will handle this sort of Internet of Things scenario. Has anyone else tried it?
Considering the scale of your system, I think you should seriously consider the use of Vortex-Cloud (see these slides http://slidesha.re/1qMVPrq). Vortex-Cloud will allow you to scale your system better as well as deal with NATs/firewalls. Besides that, you'll be able to use TCP/IP to communicate from your Raspberry Pi to the cloud instance, thus avoiding any problems related to NATs/firewalls.
Before getting to your durability question, there is something else I'd like to point out. If you try to build a flat system with 100K nodes you'll generate quite a bit of discovery information. Besides generating some traffic, this will consume memory in your end applications. If you use Vortex-Cloud instead, it plays tricks to limit the discovery information. To give you an example, if you have a data-writer matching 100K data-readers, when using Vortex-Cloud the data-writer would only match one endpoint, reducing the discovery information by a factor of 100K!
Finally, concerning your durability question, you could configure some durability services as alignee only. In that case they will never become master.
HTH.
A+

Is it realistic for a dedicated server to send out many requests per second?

TL;DR
Is it appropriate for a (dedicated) web server to be sending many requests out to other servers every second (naturally with permission from said server)?
I'm asking this purely to save myself spending a long time implementing an idea that won't work, as I hope that people will have some more insight into this than me.
I'm developing a solution which will allow clients to monitor the status of their servers. I need to constantly (24/7) obtain the most recent logs from these servers. Unfortunately, I am limited to getting the last 150 entries of their logs. This means that for busy clients I will need to poll their servers more often.
I'm trying to make my solution scalable, so that if it attracts a number of customers I won't need to rewrite it. My benchmark is 1000 clients, which I think is a realistic upper limit.
If I have 1000 clients and I need to poll their servers every, let's give it a number, two minutes, I'm going to be sending off more than 8 requests every second (1000 clients / 120 seconds ≈ 8.3 requests per second). The returned result will be about 15,000 characters on average, though it could be more or less.
Bear in mind that this server will also need to cope with clients visiting it to see their server information, and thus will need to stay responsive.
Some optimisations I've been considering, which I would probably need to implement relatively early on:
Only asking for 50 log items. If we find one we have already stored (they are returned in chronological order), we can stop. If not, we send another request for the remaining 100 (a rough sketch of this follows below). This should cut down traffic by around 3/5ths.
Detecting which servers get less traffic and polling their logs less often (i.e. if a server only gets 10 logged events every hour, we don't want to keep asking for 150 entries every few minutes).
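For what it's worth, a rough sketch of the first optimisation could look like this; fetch_log_entries() and the entry IDs are hypothetical stand-ins for whatever the monitored servers' log API actually returns.

# Hypothetical early-termination poll: fetch 50 entries, only fetch the
# remaining 100 if none of them overlap with what we already have stored.
def poll_server(fetch_log_entries, newest_stored_id):
    first_batch = fetch_log_entries(limit=50)            # most recent 50 entries
    if any(entry["id"] == newest_stored_id for entry in first_batch):
        return first_batch                               # overlap found, stop here
    # No overlap: we may have missed entries, so pull the remaining 100 as well.
    return first_batch + fetch_log_entries(limit=100, offset=50)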
I'm basically asking if sending out this many requests per second is considered a bad thing and whether my future host might start asking questions or trying to throttle my server. I'm aiming to go shared for the first few customers, then if it gets popular enough, move to a dedicated server.
I know this question has a slight degree of opinion to it, so I fear that it might be a candidate for closure, but I do feel that there is a definite degree of factuality required in the answer that should make it an okay question.
I'm not sure if there's a networking SE or if this might be more appropriate on SuperUser or something, but it feels right on SO. Drop me a comment ASAP if it's not appropriate here and I'll delete it and post to a suggested new location instead.
You might want to read about the C10K problem. The article compares several I/O strategies. A fixed number of threads, each handling several connections using nonblocking I/O, is the best approach imho.
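To make that concrete, here is a small sketch of the non-blocking approach using Python's asyncio with aiohttp (both assumptions on my part; the same pattern is available as async I/O in ASP.NET). The client URLs and the /logs endpoint are made up.

# Sketch: poll many servers concurrently from one process with non-blocking I/O.
import asyncio
import aiohttp

SERVERS = [f"https://client{i}.example.com/logs?limit=150" for i in range(1000)]

async def fetch(session, url):
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
            return url, await resp.text()
    except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
        return url, f"error: {exc}"

async def poll_all():
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, url) for url in SERVERS))

if __name__ == "__main__":
    asyncio.run(poll_all())

In practice you would cap concurrency with a semaphore and stagger the polling schedule rather than firing all 1000 requests in the same instant.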
Regarding your specific project I think it is a bad idea to poll for a limited number of log items. When there is a peak in log activity you will miss potentially critical data, especially when you apply your optimizations.
It would be far better if the clients you are monitoring pushed their new log items to your server. That way you won't miss anything important.
I am not familiar with the performance of ASP.NET, so I can't say whether a single dedicated server is enough, especially since I don't know your server's specs. With a reasonably strong server it should be possible. If it turns out not to be enough, you should distribute the work across multiple servers.
