Custom TCP proxy for high availability cluster

I'm working on a high availability project that includes deploying a 2-node high availability cluster for hot replacement of the services (applications) running on the cluster nodes. The applications have inbound and outbound TCP connections and also process UDP traffic (mainly for communicating with an NTP server).
The problem is pretty standard until one needs to provide hot migration of services to the backup node with all the data stored in RAM. The applications are agnostic of any backup mechanism, and it is highly undesirable to modify them.
The only approach I've come up with is duplication: both cluster nodes run the same applications, repeating each other's calculations. If the primary server fails, the backup server becomes the primary.
However, I have not found any ready-made proxy solution that provides synchronous port mirroring. No existing proxy server (HAProxy, Dante, 3proxy, etc.) supports such a feature as far as I know. Have I missed something, or should I write a new one from scratch?
A rough sketch of the functionality can be found here:
P.S. I assume that it is possible to compare traffic from the two clones of the same application...
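For what it's worth, here is a minimal sketch of what such a mirroring proxy could look like, assuming hypothetical backend addresses and ignoring the failover and traffic-comparison logic a real implementation would need. Client bytes are written to both nodes synchronously, but only the primary's responses are relayed back:

```python
import asyncio

PRIMARY = ("10.0.0.1", 9000)  # hypothetical primary node
BACKUP = ("10.0.0.2", 9000)   # hypothetical backup node

async def handle_client(client_reader, client_writer):
    # One upstream connection per node; errors and failover are not handled here.
    p_reader, p_writer = await asyncio.open_connection(*PRIMARY)
    b_reader, b_writer = await asyncio.open_connection(*BACKUP)

    async def client_to_backends():
        # Mirror every inbound chunk to both nodes before reading more.
        while data := await client_reader.read(4096):
            p_writer.write(data)
            b_writer.write(data)
            await asyncio.gather(p_writer.drain(), b_writer.drain())
        p_writer.close()
        b_writer.close()

    async def primary_to_client():
        # Only the primary's responses reach the client.
        while data := await p_reader.read(4096):
            client_writer.write(data)
            await client_writer.drain()
        client_writer.close()

    async def drain_backup():
        # The backup's responses are discarded; this is where you could
        # compare them against the primary's output instead.
        while await b_reader.read(4096):
            pass

    await asyncio.gather(client_to_backends(), primary_to_client(), drain_backup())

async def main():
    server = await asyncio.start_server(handle_client, "0.0.0.0", 8000)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```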

Related

How to establish a direct peer-to-peer connection between two computers in two distant locations through the internet?

While building a Python application implementing post-processing for Quantum Key Distribution, I need a server (say Alice) and a client (say Bob) at distant locations to interact and exchange information while doing the required calculations simultaneously (for which threading is used).
At first, the open-source NGROK service was used as a hosting server, but it makes the time required for post-processing very long, mainly due to network congestion.
So, is there a way to establish a direct peer-to-peer connection between Alice and Bob via the internet, wherein Alice's system itself acts as the server, thereby bypassing a third-party host server? If there is a way, please suggest one; otherwise, any leads or suggestions for open-source services better than NGROK would be very helpful.
PS: If you need any additional information, I would be eager to respond.
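Not an authoritative answer, but the direct approach itself is simple if Alice's machine is reachable from the internet (a public IP, or a port forwarded on her router). A minimal sketch, with a hypothetical address and port:

```python
import socket

PORT = 5000  # hypothetical; must be reachable from outside (port-forwarded if behind NAT)

def alice():
    # Alice listens directly instead of relaying through a third-party host.
    with socket.create_server(("0.0.0.0", PORT)) as srv:
        conn, addr = srv.accept()
        with conn:
            conn.sendall(b"hello from Alice")
            print(conn.recv(1024))

def bob():
    # Bob connects straight to Alice's public address (203.0.113.10 is a placeholder).
    with socket.create_connection(("203.0.113.10", PORT)) as s:
        print(s.recv(1024))
        s.sendall(b"hello from Bob")
```

If neither side can accept inbound connections (both behind NAT), you would need NAT hole punching or a relay, which is essentially the service NGROK provides.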

Should each application have its own host instance / host instances?

Should each application in BizTalk have its own host instance / host instances?
I've read in various blog posts and books that it's good practice to have different host instances for different tasks, e.g. one for receive, one for send, one for orchestrations, and one for tracking.
But should each application get its own receive, send, orchestration, and tracking host, or is it just one of each for all applications?
Short answer: probably not, unless a particular aspect of that application is causing a performance issue for the other applications. In that case, create a host only for the particular part that is causing the issue. E.g. if it is the receive adapter, create a host for the receive locations that run under that adapter; if it is the orchestrations for that application, create a dedicated processing host for that application's orchestrations to run under.
Long answer from Microsoft's
High Availability for BizTalk Hosts
Disadvantages of Additional Hosts
While there are benefits of creating additional host instances, there are also potential drawbacks if too many host instances are created. Each host instance is a Windows service (BTSNTSvc.exe or BTSNTSvc64.exe), which generates additional load against the MessageBox database and consumes computer resources, such as CPU, memory, and threads. Other than these, you have the following reasons for not configuring too many additional host instances:
Several performance counters are reported per host with too much granularity. This affects the usability for the administrator who would need to traverse through a lot of data. This has a negative impact on the overall view the administrator has.
Each host consumes a considerable amount of memory, which might lead to a situation of throttling and reduced performance.
If the hosts have receive adapters that continuously perform polling, each host will poll the database at short intervals, thereby resulting in degraded performance.
As you've read, there are no definite rules.
Should each Application have a dedicated Host/Instance? Sure, if you've got a relatively small number of Applications for the hardware.
It's something to also consider if one Application is problematic.
Should each Application have separate Receive, Send, or Orch hosts? No, not to start. If you observe a particular operation consuming a disproportionate amount of resources, then split it out.

Neo4j HA over a VPN?

I am currently in the process of creating 3 Neo4j High Availability servers. My business logic leaves one server as a dedicated master, while the other two machines are dedicated slaves. My slaves exist in an entirely different datacenter than my master.
What is the best method to establish a link between the two datacenters? I've been able to establish connections using OpenVPN, but am curious whether that would be better than, say, SSH port forwarding. I'm not entirely sure how Zookeeper needs to communicate with each other node. A VPN connection only creates a one-way connection: my master, for example, can create a connection to a slave, but the slave could not create one back to its master. (I think?)
How should I do this? Thanks!
PS: My master is using an embedded instance of Neo4j, while the slaves are stand-alone instances (if this matters).
So your setup is not about availability as the slaves cannot become masters anyway?
Just about replication to the other datacenter?
You also need to take the Neo4j coordinator (Zookeeper) into account, which is usually needed by all cluster participants.
My colleague suggested that you might get away with just putting the Zookeeper (perhaps even just a single one, as you don't need master election) directly beside your master server.
Then the ability to connect into the master's VPN should be enough for the slaves to pull updates.

ASP.NET Hosting on Virtual Servers running on VMWare

My company runs several international websites for selling insurance products.
Our current setup is a web farm with multiple load-balanced web servers hosting our ASP.NET applications. The backend is a single, yet powerful, SQL Server (all in one data center).
Our network admins want to move to virtual servers running on VMWare.
Scenarios could be:
1. Web farm: multiple standard web servers, load-balanced (current setup), session state on SQL Server
2. Virtual web farm: multiple virtual servers, load-balanced on one physical VMware host, session state on SQL Server
2a. Same as above, but with multiple physical hosts
3. Single virtual web server: one big, powerful virtual web server, no load balancing required, session state can be kept in process
There is a lot of hype around virtualization, and I can see the benefits, but I have no experience with it. I cannot tell what issues we will face or what we should pay special attention to.
Does anyone have experience with such a virtual setup?
What are general recommendations?
I tend towards 2a. I am afraid of having all web servers on one single physical machine.
Many thanks in advance for sharing your thoughts.
There are three reasons to use more than one webserver for an application:
Scaling - More grunt is required than one machine can provide
Reliability - Website should keep running in case of failure (a. hardware b. software)
Prioritization - One of the webservers takes on heavy work (perhaps scheduled tasks) leaving the other to respond to client requests quickly.
Marrying that up to your scenarios:
Scenario 1 provides 1, 2, 3
Scenario 2 provides 2b (perhaps 2a if it is fully hardware-redundant, which I doubt)
Scenario 2a provides 1, 2
Scenario 3 provides none of the above
Advantages of Virtual Hosting:
Lower Total Cost of Ownership (TCO): one big cluster serving multiple purposes is cost-effective
New servers can be created quickly if needed
Redundant hardware is easier to justify if the cost is shared among many applications
Disadvantages:
Other virtual machines may suck away your CPU/Disk IO capacity
IMHO there is little point to load balancing multiple virtual machines on the same virtual server.
Robert's pretty much covered it all, I'm mostly just adding a note to say that at least one of our clients is currently running with option 2a.
So we have multiple loadbalanced web servers running on a couple of VM hosts, talking to a non-virtualised SQL cluster - this works quite well for them.
One other advantage of virtualisation is that it allows you to utilise your hardware more fully. However, you need to be aware that if you're running your virtual host at around 90% capacity with multiple VMs, you haven't got a lot of spare capacity for any traffic spikes. If you're not expecting any, then great; but if you are, you'll need to have something in place to cope.
I agree with all of the above answers, and I actually work at a web host. :-) If you're using multiple load-balanced web servers now, then I can only assume the reason for it is either:
Hardware Redundancy: If a single app server fails then those sessions are lost, but the app keeps running on the other servers and users can immediately re-connect.
or
Application Load Distribution (it's late, so I can't think of a better name): your traffic dictates that you have multiple app servers, since all of your users would crash a single app server.
If #1 is the reason, then going to VMware defeats the purpose, since you only have one server supporting everything, and in case of a hard drive crash, etc., you are down while it is repaired. If #2 is the reason, then a VMware-based solution MAY work; however, keep in mind that the hardware you'd use would almost necessarily be of a higher caliber than what you're currently using. So you may get more bang for your buck, but you STILL lose the redundancy that multiple physical machines gave you.
Now, you could always combine the two by having multiple physical machines all running VMWare, but that adds a level of complexity to things that you may not necessarily want either.
It doesn't sound like there would be any tangible benefit from running multiple virtual servers on the same physical host; you're just adding overhead. Unless I'm missing something with the way you've described the setup, there wouldn't be any benefit at all from moving to VMware, unless you're looking at taking advantage of features such as VMotion.
VMware is most useful for consolidating underutilized hardware. If your hardware is running at near-capacity during peak periods then you don't want to run multiple VMs on the one machine.
There are benefits to virtualization, but your network admins need to prove that there is a benefit for your company before you even consider switching. I would say if you have multiple apps running on dedicated servers with low traffic (i.e. each app has its own physical server), then sure, virtualize. If you have one app spread over many servers, then don't.
You should be able to use virtual machine hosts with multiple VMs per host and load balance across all of them.
Microsoft is doing this with MSDN and TechNet: http://virtualization.info/en/news/2008/05/microsoft-migrates-msdn-and-technet-on.html

Developing Serverless Lan Chat Program Help!

I want to develop a simple serverless LAN chat program, just for fun. How can I do this? What type of architecture should I use?
Last year I worked on a TCP/UDP client/server application project. It was simple (the server listens on a certain port/socket and the client connects to the server's port, etc.). But I have no idea how to develop a "serverless" LAN chat program. How can I do this? UDP, TCP, multicast, broadcast? Or should the program behave like both server and client?
The simplest way would be to use UDP and simply broadcast your messages all over the network.
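A minimal sketch of that broadcast approach, with an arbitrary port; every node runs the same script and sees every message on the LAN (including its own):

```python
import socket
import threading

PORT = 50000  # arbitrary chat port, the same on every node

def listen():
    # Every node binds the same UDP port and hears every broadcast on the LAN.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    while True:
        data, addr = sock.recvfrom(1024)
        print(f"{addr[0]}: {data.decode()}")

def send(message):
    # Broadcast the message to the whole subnet.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.sendto(message.encode(), ("255.255.255.255", PORT))

threading.Thread(target=listen, daemon=True).start()
while True:
    send(input())
```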
A slightly more advanced version would be to use the broadcast only to discover other nodes in the network (a sketch of the discovery step follows this list):
Every node maintains a list of known peers.
Messages are sent with TCP to all known peers.
When a node starts up, it sends out a UDP broadcast to discover other nodes.
When a node receives a discovery broadcast, it sends "itself" to the source of the broadcast in order to make itself known. The receiving node adds the broadcaster to its own list of known peers.
When a node drops out of the network, it sends another broadcast in order to inform the remaining nodes that they should remove the dropped client from their list.
You would also have to consider handling nodes that drop out without informing the rest of the network.
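Here is a sketch of the discovery half of that scheme, assuming a hypothetical discovery port and a simple made-up text protocol (the TCP messaging and the leave/timeout handling are left out):

```python
import socket

DISCOVERY_PORT = 50001  # hypothetical discovery port

def discover_peers():
    # Newcomer: broadcast a probe and collect replies for a couple of seconds.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    s.settimeout(2.0)
    s.sendto(b"DISCOVER", ("255.255.255.255", DISCOVERY_PORT))
    peers = []
    try:
        while True:
            data, addr = s.recvfrom(1024)
            if data.startswith(b"PEER "):
                # Each reply carries the peer's TCP chat port, e.g. b"PEER 50002".
                peers.append((addr[0], int(data.split()[1])))
    except socket.timeout:
        return peers

def answer_probes(tcp_port):
    # Existing node: answer probes so newcomers can add us to their peer list.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", DISCOVERY_PORT))
    while True:
        data, addr = s.recvfrom(1024)
        if data == b"DISCOVER":
            s.sendto(b"PEER %d" % tcp_port, addr)
```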
The Spread toolkit may be a bit of overkill for what you want, but it is an interesting starting point.
From the blurb:
Spread is an open source toolkit that provides a high performance messaging service that is resilient to faults across local and wide area networks. Spread functions as a unified message bus for distributed applications, and provides highly tuned application-level multicast, group communication, and point to point support. Spread services range from reliable messaging to fully ordered messages with delivery guarantees.
Spread can be used in many distributed applications that require high reliability, high performance, and robust communication among various subsets of members. The toolkit is designed to encapsulate the challenging aspects of asynchronous networks and enable the construction of reliable and scalable distributed applications.
Spread consists of a library that user applications are linked with, a binary daemon which runs on each computer that is part of the processor group, and various utility and demonstration programs.
Some of the services and benefits provided by Spread:
Reliable and scalable messaging and group communication.
A very powerful but simple API simplifies the construction of distributed architectures.
Easy to use, deploy and maintain.
Highly scalable from one local area network to complex wide area networks.
Supports thousands of groups with different sets of members.
Enables message reliability in the presence of machine failures, process crashes and recoveries, and network partitions and merges.
Provides a range of reliability, ordering and stability guarantees for messages.
Emphasis on robustness and high performance.
Completely distributed algorithms with no central point of failure.
Apple's iChat is an example of the very product you are envisioning. It uses Bonjour (Apple's zero-conf networking protocol) to identify peers on a LAN. You can then chat or audio/video chat with them.
I'm not entirely sure how Bonjour works inside, but I know it uses multicast. Clients "register" services on the LAN, and the Bonjour protocol allows each host to pull up a directory of hosts for a given service (all without central management).
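For a Bonjour-style version of the discovery above, the third-party python-zeroconf package (an mDNS implementation) exposes the same register-and-browse pattern. A sketch; the service type, name, address, and port below are all hypothetical:

```python
import socket
from zeroconf import ServiceBrowser, ServiceInfo, Zeroconf

zc = Zeroconf()

# Advertise this node's chat service on the LAN (mDNS/Bonjour style).
info = ServiceInfo(
    "_lanchat._tcp.local.",
    "mynode._lanchat._tcp.local.",
    addresses=[socket.inet_aton("192.168.1.5")],  # this node's LAN address
    port=50002,                                   # this node's TCP chat port
)
zc.register_service(info)

class PeerListener:
    # Called by the browser as peers appear and disappear; no central server.
    def add_service(self, zeroconf, type_, name):
        info = zeroconf.get_service_info(type_, name)
        print("peer appeared:", name, info.parsed_addresses(), info.port)

    def remove_service(self, zeroconf, type_, name):
        print("peer left:", name)

    def update_service(self, zeroconf, type_, name):
        pass

browser = ServiceBrowser(zc, "_lanchat._tcp.local.", PeerListener())
```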
