SignalR over websocket timeout issue - signalr

Our system now defaults to web-sockets for our signalr communications.
When I send a whole bunch of data the web socket seems to time out after about 2-3 seconds. Setting the transport to 'longPolling' works just fine. I do not see anywhere where a timeout needs to be set for a web-socket explicitly.
How do I prevent the web-socket transport from timing out when sending large amounts of data in one go?
Edit:
We are using the javascript client on version 1.1.3 and sending apparently more than 2 seconds worth of data :) --- I am sending 500 message structures with each being around 90 bytes.
But it seems something else may be going on as the messages are sent one-by-one.

Related

Load balancing TCP traffic using Apache Camel with Netty leads to transaction failures

I am new to Apache Camel and Netty and this is my first project. I am trying to use Camel with the Netty component to load balance heavy traffic in a back end load test scenario.This is the setup I have right now:
from("netty:tcp:\\this-ip:9445?defaultCodec=false&sync=true").loadBalance().roundRobin().to("netty:tcp:\\backend1:9445?defaultCodec=false&sync=true,netty:tcp:\\backend2:9445?defaultCodec=false&sync=true)
The issue is unexpected buffer sizes that I am receiving in the response that I see in the client system sending tcp traffic to Camel. When I send multiple requests one after the other I see no issues and the buffer size is as expected. But, when I try running multiple users sending similar requests to Camel on the same port, I intermittently see unexpected buffer sizes, sometimes 0 bytes to sometimes even greater than the expected number of bytes. I tried playing around with multiple options mentioned in the Camel-Netty page like:
Increasing backlog
keepAlive
buffersizes
timeouts
poolSizes
workerCount
synchronous
stream caching (did not work)
disabled useOriginalMessage for performance
System level TCP parameters, etc. among others.
I am yet to resolve the issue. I am not sure if I'm fundamentally missing something. I did take a look at the encoder/decoders and guess if that could be an issue. But, I don't understand why a load balancer needs to encode/decode messages. I have worked with other load balancers which just require endpoint configurations and hence, I am assuming that Camel does not require this. Am I right? Please know that the issue is not with my client/backend as I ran a 2000 user load test from my client to the backend with less than 1% failures but see a large number of failure ( not that there are no successes) with Camel. I have the following questions:
1.Is this a valid use-case for Apache Camel- Netty? Should I be looking at Mina or others?
2.Can I try to route tcp traffic to JMS or other components and then finally to the tcp endpoint?
3.Do I need encoders/decoders or should this configuration work?
4.Should I continue with this approach or try some other load balancer?
Please let me know if you have any other suggestions. TIA.
Edit1:
I also tried the same approach with netty4 and mina components. The route looks similar to the one in netty. The route with netty4 is as follows:
from("netty4:tcp:\\this-ip:9445?defaultCodec=false&sync=true").to("netty4:tcp:\\backend1:9445?defaultCodec=false&sync=true")
I read a few posts which had the same issue but did not find any solution relevant to my issue.
Edit2:
I increased the receive timeout at my client and immediately noticed the mismatch in expected buffer length issue fall to less than 1%. However, I see that the response times for each transaction when using Camel and not using it is huge; almost 10 times higher. Can you help me with reducing the response times for each transaction? The message received back at my client varies from 5000 to 20000 bytes. Here is my latest route:
from("netty:tcp://this-ip:9445?sync=true&allowDefaultCodec=false&workerCount=20&requestTimeout=30000")
.threads(20)
.loadBalance()
.roundRobin()
.to("netty:tcp://backend-1:9445?sync=true&allowDefaultCodec=false","netty:tcp://backend-2:9445?sync=true&allowDefaultCodec=false")
I also used certain performance enhancements like:
context.setAllowUseOriginalMessage(false);
context.disableJMX();
context.setMessageHistory(false);
context.setLazyLoadTypeConverters(true);
Can you point me in the right direction about how I can reduce the individual transaction times?
For netty4 component there is no parameter called defaultCodec. It is called allowDefaultCodec. http://camel.apache.org/netty4.html
Also, try something like this first.
from("netty4:tcp:\\this-ip:9445?textline=true&sync=true").to("netty4:tcp:\\backend1:9445?textline=true&sync=true")
The above means the data being sent is normal text. If you are sending byte or something else you will need to provide decoding/encoding for netty to handle the data.
And a side note. Before running the Camel route, test manually to send test messages via a standard tcp tool like sockettest to verify that everything works. Then implement the same via Camel. You can find sockettest here http://sockettest.sourceforge.net/ .
I finally solved the issue with the same route settings as above. The issue was with the Request and Response Delimiter not configured properly due to which it was either closing the connection too early leading to unexpected buffer sizes or it was waiting too long even after the entire buffer was received leading to high response times.

Signalr LongPollDelay and the buffer

We have Safari mobile clients that are affected by one of their 5 connections being blocked by signalr. We have used the solution propped here: https://github.com/SignalR/SignalR/issues/1406#issuecomment-14284093
Where we have these settings changed to the following for signalR 2.x
GlobalHost.Configuration.ConnectionTimeout =
TimeSpan.FromMilliseconds(1000);
GlobalHost.Configuration.LongPollDelay = TimeSpan.FromMilliseconds(5000);
We are sending notifications from the server to the client with no message queue or acknowledgement framework. We don’t need to guarantee message delivery but we do want there to be a high probability of success. We think this should be possible due to our low message rate and a buffer size of 1000. However we have some questions:
Are messages held in a queue while the LongPollDelay occurs? Should
they be sent during the next long poll using the settings above?
Our tests with a single message being sent during a 2 minute
LongPollDelay suggest that they are not retrieved during the 1
second long poll request that follows. Are there any reasons for
this i.e. buffer flushing after 1 minute?
Does ConnectionTimeout affect all transports?
If ConnectionTimeout applies to all transports is there a way of
setting this for only Safari mobile users i.e. have two connections
available and use agent detection to point to a specific connection?
Is there a way of setting the LongPollDelay so that this also only
applied to only Safari mobile users?
All advice welcome and appreciated, Matt
[FOLLOW-UP QUESTIONS]
Thanks that helps a lot. We have retried with 30secs LongPollDelay and it works as expected. I have a couple of follow-up questions that you/someone might care to comment on:
1) During testing we also see the client sending a ping request to the server roughly every 5 minutes. Why is the ping period set to 5 minutes when the disconnect period is so much shorter, and what is the purpose of the client pinging the server if it assumes it is disconnected via an alternative mechanism.
2) w.r.t. Different configurations for different clients. Could we not set up another SignalR endpoint and point only Safari mobile to this? Something like the response to this post:
Can I reduce the Circular Buffer to "1"? Is that a good idea?
You are correct that the SignalR will queue/buffer messages. Even if there wasn't a LongPollDelay configured, SignalR needs to do this because there is always a chance that messages are sent while clients are repolling/reconnecting.
SignalR assumes that the client has disconnected if the client hasn't been connected to the server within the last DisconnectTimeout. Once the DisconnectTimeout triggers, SignalR will call OnDisconnected and clear any message buffers belonging to the supposedly disconnected client so it doesn't leak memory. The DisconnectTimeout defaults to 30 seconds which is far less than the 2 minute LongPollDelay you configured, so that explains this behavior.
The ConnectionTimeout only affects long polling unless you've disabled keep alives. If keep alives are disabled, it applies to all transports.
There is no way to selectively configure the ConnectionTimeout for specific types of clients. But as I stated, it only affects long polling by default.
There is no way to selective configure the LongPollDelay for specific types of clients.

SignalR connection is endless

I have a signalR connection working but something very weird happens, sometimes it works perfectly in very few seconds and other times when I track the request it took more than 10 minutes trying to connect and it gives me something like that
can anyone give me an explanation for this? any hints, how to search for the problem
The request your looking at: /connect?transport=serverSentEvents&... is supposed to be endless.
SingalR is using comet technology called server-sent events or SSE. The basic idea is that SignalR responds to SSE requests in chunks, but never actually closes the response unless the client asks it to.
Browsers with SSE support can read the chunks sent from the server as they are sent even though the response doesn't end. This allows an unlimited number of messages to be sent in response to a single request.

implementing a background process responding to the client in an atmosphere+netty/jetty application

We have a requirement to to support 10k+ users, where every user initiate a request and waits for a response from the server (the response can take as long as 20-30 seconds to arrive). it is only one request from the client, and after a long processing by the server, a response will be transmitted and then the connection will disconnect.
in the background, the server will do some DB search and wait for other background processes to notify on completion before responding to the client.
after doing some research i figured out we will need to use something like the atmosphere framework to support websockets/sse event/long polling along with an asynchronous server like netty (=> nettosphere) or jetty.
As for my experience - mostly Java EE world and Tomcat server.
my questions are:
what will be easier to implement in regard to my experience and our requirement: atmosphere + netty or atmoshphere+jetty? which one can scale better, has an easier learning curve and easier to implement other java technologies?
how do u implement in atmosphere a response that is sent only to the originating client and not broadcast to the rest of the clients? (all the examples i found are broadcast).
how can i implement in netty (or jetty) when using the atmosphere framework our response? i.e., the client send a request, after it is received in the server some background processes are run, and when they finish i need to locate the connection and transmit the response. is that achievable?
Some thoughts:
At 10k+ users, with 20-30 second response latency, you likely hit file descriptor limits if using just 1 network interface. Consider a solution that uses multiple network interfaces.
Your description of your request/response can be handled entirely with standard Servlet 3.0, standard HTTP/1.1, Async request handling, and large timeouts.
If your clients are web browsers, and you don't start sending a response from the server until the 20-30 second window, you might hit browser idle timeouts.
Atmosphere and Cometd do the same things, supporting long duration connections, with connection technique fallbacks, and with logical channel APIs.
I believe the AKKA framework will handle this sort of need. I am looking at using it to handle scaling issues possibly with a RabbitMQ to help off load work to potentially other servers that may be added later to scale as needed.

Detecting missing responses to long running HTTP (SOAP) requests

I need a way to detect a missing response to a long running HTTP POST request. This problem arises when the network infrastructure (firewalls, proxies, unplugged cables, etc.) drops the response packets. The server may detect this failure, but the client cannot send additional bytes after the POST to probe the state of the TCP connection. The failure may be limited to a single TCP connection. For example I may be able to subsequently open a new TCP connection to the server.
I'm looking for a solution that still uses HTTP POST and does not change the duration of the server side processing.
Some solutions that I can think of are:
Provide a side channel interface to retrieve request & response history. If the history lists the response as having been send (presumably resulting in a TCP error) but I have not yet received it within a reasonable time I can generate a local error.
Use an X header to request that the server deliver "spurious" 100 Continue provisional responses on a regular interval. If I fail to see an expected 100 Continue or a non-provisional response I can generate a local error.
Is there a state of the art solution for this problem?
It sounds to me like you are using Soap for something that would be much better done using a stateful connection, or a server side push technology.

Resources