I have created a Vert.x TCP server and a corresponding TCP client. The job of the TCP client is to read logs (in gzip format) from a specific directory and continuously upload them to the server. The server then does some processing and persists the data. Because the logs are gzipped, the average size of a log file is 144 KB. My TCP client simulator spawns 4 threads; each thread creates 100 such log files and uploads them to the Vert.x server. So at the end, I expect to see 400 (= 100 * 4) files uploaded to the server.
But to my surprise, once the simulator finishes, I see the server has processed only 76 files, and there is no sign of the remaining 324 files. I have run it multiple times, and every time the server processes around 75 files.
From the TCP client log, I see that all 400 files have been uploaded, so it seems the server is dropping some data. But from the server log, I see only ~76 files have been processed.
Later I reduced the compressed log file size generated by the simulator to 8 KB (each log file now has fewer log entries), and this time around 398 files are processed on the server side.
It seems to be a configuration issue. Maybe I have to configure the Vert.x server with more resources (instances, a larger receive buffer size, a larger event queue size, etc.). But how would anybody know the limit beyond which packets will be dropped?
More precisely, what is the specific configuration required in such a case?
Secondly, is it possible to configure the Vert.x TCP client/server with some sort of acknowledgement mechanism, so that the client uploads the next file only when the previously uploaded file has been processed successfully?
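TCP itself delivers a byte stream with no message boundaries, so an acknowledgement like this has to live in the application protocol. Below is a minimal sketch of one way it could look with Vert.x (the port number and the persistFile step are hypothetical): each gzipped file is prefixed with a 4-byte length header, and the server replies with a 1-byte ACK once it has processed the complete file.

import io.vertx.core.Vertx;
import io.vertx.core.buffer.Buffer;
import io.vertx.core.parsetools.RecordParser;

public class AckingServer {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();

    vertx.createNetServer().connectHandler(socket -> {
      // Length-prefixed framing: 4-byte big-endian size, then the gzipped file.
      RecordParser parser = RecordParser.newFixed(4);
      final boolean[] readingHeader = { true };
      parser.handler(frame -> {
        if (readingHeader[0]) {
          parser.fixedSizeMode(frame.getInt(0)); // next record is the payload
          readingHeader[0] = false;
        } else {
          persistFile(frame);                    // hypothetical processing step
          parser.fixedSizeMode(4);               // back to reading headers
          readingHeader[0] = true;
          socket.write(Buffer.buffer(new byte[] { 1 })); // 1-byte ACK
        }
      });
      socket.handler(parser);
    }).listen(5555); // hypothetical port

  }

  static void persistFile(Buffer gzippedLog) { /* ... */ }
}

On the client side, you would write the next file only after reading the ACK byte for the previous one, which also gives you natural backpressure.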
Related
I wrote a simple server application. In that application, I created a server socket and put it into the listening state with the listen call.
After that, I did not write any code to accept the incoming connection request; I simply waited for termination with the pause call.
I want to figure out, practically, how many bytes are buffered on the server side if the connection is not accepted, and then validate that number against TCP theory.
To do that,
First, I started my server application.
Then I used "dd" and "netcat" to send data from the client to the server. Here is the command:
$> dd if=/dev/zero count=1 bs=100000 | nc 127.0.0.1 45001
Then I opened Wireshark and waited for the zero-window message.
From the last properly acknowledged TCP frame, I can see that the client side successfully sent 64559 bytes of data to the server.
Then I executed the above dd/netcat command to create another client and send data again.
In this case, from the last successfully acknowledged TCP frame in the Wireshark output, I understand that the client application successfully sent 72677 bytes to the server.
So it seems that the size of the relevant buffer can change at runtime. Or am I misinterpreting the Wireshark output?
How can I determine the size of the relevant receive buffer? What is the correct name for this receive buffer in TCP terminology? How can I show its default size?
Note that the port number of the TCP server is "45001".
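For reference, here is a minimal Java sketch of how I would query this buffer, assuming SO_RCVBUF is the relevant socket option:

import java.net.ServerSocket;

public class ShowRcvBuf {
  public static void main(String[] args) throws Exception {
    // SO_RCVBUF is the socket option behind the TCP receive buffer;
    // on Linux its default comes from net.ipv4.tcp_rmem.
    try (ServerSocket server = new ServerSocket(45001)) {
      System.out.println("Default SO_RCVBUF: " + server.getReceiveBufferSize() + " bytes");
    }
  }
}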
Thank you!
Does sending large files over a websocket "block" websocket for other messages while the large files are being sent?
Does sending the files via independent HTTP requests, while the other messages continue to be sent over WS, have any distinct advantage "in keeping the WS unblocked"?
Assume 1 network card.
In case of WebSocket over HTTP/1.1, yes, the upload of a large file (in the form of a large WebSocket message) blocks the WebSocket connection.
In case of WebSocket over HTTP/2 (if supported by both the client and the server), one HTTP/2 stream will upload the large file, and another HTTP/2 stream will carry the WebSocket messages. In this case, the problem becomes the HTTP/2 flow control window, which may be exhausted by the large upload stream, leaving the WebSocket message stream stalled (so that messages are queued and delayed). Unfortunately, the details of this queueing/delay depend on the client and server implementations, so you have to try.
Typically implementations do a good job at interleaving streams, so the possible stalls are rarely a problem.
For WebSocket over HTTP/1.1, if you open multiple WebSocket connections, you may be able to send files and messages in parallel, using N WebSocket connections for the files, and 1 WebSocket connection for the messages, for example.
Some non-browser clients allow you to open multiple HTTP/2 connections to the same domain, so again you will have the chance to send files and messages in parallel. However, to my knowledge, browsers do not allow more than 1 HTTP/2 connection to the same domain, so the parallelism is there, but constrained by the HTTP/2 flow control window.
Not sure what you mean by "keeping the WS unblocked", but HTTP/1.1 works in the same way as WebSocket as far as its usage of connections is concerned.
If you are in a browser environment, browsers allow 6-8 HTTP connections to the same domain, and typically unlimited (or at least many more) WebSocket connections.
So if you want to send, say, 10 large files, 6-8 of them will be uploaded via HTTP, but the remaining ones will be queued, waiting for one of the HTTP connections to finish its previous upload.
Meanwhile, you can use the WebSocket connection to send messages.
In case of HTTP/2, browsers only open 1 connection, so you may use HTTP/2 streams for the uploads and a WebSocket over HTTP/2 stream for the messages, but they will all share the same HTTP/2 flow control window, potentially stalling each other.
All in all, WebSocket has not been designed for large uploads.
I would not be surprised if you hit WebSocket message size limits, as servers cannot allow clients to upload messages of arbitrary size (that would blow up the server memory). The same is true for clients: browsers typically have small limits on the size of WebSocket messages they receive, independently of whether HTTP/1.1 or HTTP/2 is used.
If you really need to upload large files, I think a solution where you upload via HTTP (which allows larger sizes, for example when using multipart/form-data) and keep the small messaging via WebSocket is optimal.
The use of HTTP/2 may hit the HTTP/2 flow control window limit, but you have the 6-8 connection limit in HTTP/1.1 too, so again you have to try, see whether you hit any limit, and if so, which limit in what case.
Using HTTP for the uploads makes it less likely that you hit WebSocket message size limits, which are not known in advance and possibly differ from client to client (browser to browser); and you don't want to implement your own splitting and merging of large uploads over WebSocket to respect those limits.
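To make the suggested split concrete, here is a minimal Java 11+ sketch (the URLs are placeholders) that keeps small messages on a WebSocket while a large file goes over a separate plain HTTP request, so the upload cannot block the messaging connection on HTTP/1.1:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.WebSocket;
import java.nio.file.Path;

public class UploadAndChat {
  public static void main(String[] args) throws Exception {
    HttpClient client = HttpClient.newHttpClient();

    // Small control messages go over the WebSocket...
    WebSocket ws = client.newWebSocketBuilder()
        .buildAsync(URI.create("wss://example.com/ws"), new WebSocket.Listener() {})
        .join();
    ws.sendText("upload starting", true).join();

    // ...while the large file goes over a plain HTTP request,
    // so it cannot stall the messaging connection.
    HttpRequest upload = HttpRequest.newBuilder(URI.create("https://example.com/upload"))
        .POST(HttpRequest.BodyPublishers.ofFile(Path.of("big-file.bin")))
        .build();
    HttpResponse<Void> response = client.send(upload, HttpResponse.BodyHandlers.discarding());
    System.out.println("upload status: " + response.statusCode());

    ws.sendText("upload finished", true).join();
  }
}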
I have a special requirement:
Download a big file from the server over HTTP.
If the network turns out to be poor, give up the current download and download another, smaller file instead.
But per my tests, it seems that if I send an HTTP GET to the server, it will keep sending the file continuously, even if I only read 1024 bytes. I tried to close the connection after detecting that the bandwidth is low, but the number of bytes actually downloaded is larger than what I requested.
To save bandwidth, I wish the server would not send more data than I request, but that sounds impossible. So what is the actual mechanism to stop sending data to the client if we only request part of the file, e.g. 1024 bytes? If I do not disconnect, will the server send data to the client until the whole file is finished?
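For what it's worth, the standard HTTP mechanism for asking a server to send only part of a resource is a Range request: if the server supports it, it answers 206 Partial Content and sends only the requested bytes. A minimal Java sketch (the URL is a placeholder):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RangeProbe {
  public static void main(String[] args) throws Exception {
    HttpClient client = HttpClient.newHttpClient();

    // Ask for only the first 1024 bytes; a server that supports
    // range requests answers 206 Partial Content and sends no more.
    HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/big-file"))
        .header("Range", "bytes=0-1023")
        .build();

    HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
    System.out.println("status: " + response.statusCode()); // 206 if ranges are supported
    System.out.println("bytes received: " + response.body().length);
  }
}

If the server ignores the Range header (it answers 200), closing the connection is indeed the only way to stop it. Note that TCP flow control also stalls a sender if you simply stop reading, but only after the receive and send buffers fill, and those buffers alone hold far more than 1024 bytes, which is consistent with more data being downloaded than you read.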
I'm dumping data from my server app to my client app in chunks (TCP/IP). At some point the client may wish to abort the transfer and make a new request to the server. The quick approach to getting this done is to kill the TCP connection, so that any data already sent by the server and live on the network is dumped. The new connection will handle the new request/transfer, so there is no delay in receiving old, redundant data.
Is this an acceptable solution?
NB: I did consider breaking the chunks into smaller sizes separated by client ack messages, but then you have the problem of fixing a chunk size: too small and there are too many acks (slowing the transfer); too big and there is still a residual delay in dumping redundant data.
Any ideas or standard design approaches that I should be aware of?
TIA
You can use two TCP connections, similar to FTP: one to send control requests to the server, and the other to transfer the actual data. If you wish to abort a transfer, just send a request to abort it over the control channel.
Send the data in chunks, but don't acknowledge them. When the client wants to abort the transfer, make it send a cancellation request to the server. The client then just throws away the chunks that are still arriving. Eventually the server gets the cancellation request and stops sending data. If you want, you can make the server send an acknowledgement of the cancellation.
This way you can have small chunks with minimal overhead. You could have 1 KB chunks with a 4- or 8-byte chunk header containing the size of the chunk; that is an extremely small bandwidth and latency overhead.
Note that a small chunk does not generally result in a small IP packet: TCP streams data and does not care about your chunk size.
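A minimal Java sketch of this chunked framing (the names are hypothetical): the server writes a 4-byte size header before each chunk, and after cancelling, the client keeps draining the stream but discards the chunks still in flight. The cancellation request itself would be a separate message written back to the server, on this or a dedicated control connection.

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.BooleanSupplier;

public class Chunks {
  // Server side: stream the payload as length-prefixed chunks.
  static void sendChunks(OutputStream rawOut, InputStream payload) throws IOException {
    DataOutputStream out = new DataOutputStream(rawOut);
    byte[] chunk = new byte[1024];
    int n;
    while ((n = payload.read(chunk)) != -1) {
      out.writeInt(n);           // 4-byte chunk header: payload size
      out.write(chunk, 0, n);
    }
    out.writeInt(-1);            // end-of-transfer marker
    out.flush();
  }

  // Client side: read chunks; once cancelled, keep draining the stream
  // but discard the chunks that are still on the wire.
  static void receiveChunks(InputStream rawIn, BooleanSupplier cancelled) throws IOException {
    DataInputStream in = new DataInputStream(rawIn);
    int size;
    while ((size = in.readInt()) != -1) {
      byte[] chunk = new byte[size];
      in.readFully(chunk);
      if (cancelled.getAsBoolean()) {
        continue;                // throw away redundant data
      }
      // ... process chunk ...
    }
  }
}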
I have syslog successfully forwarding logs to an upstream server like so:
$MainMsgQueueType LinkedList
$MainMsgQueueSize 10000
$MainMsgQueueDiscardMark 8000
$MainMsgQueueDiscardSeverity 1
$MainMsgQueueSaveOnShutdown off
$MainMsgQueueTimeoutEnqueue 0
$ActionQueueType LinkedList # in memory queue
$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueSize 10000 # Only allow 10000 elements in the queue
$ActionQueueDiscardMark 8000 # Only allow 8000 elements in the queue before dropping msgs
$ActionQueueDiscardSeverity 1 # Discard Alert,Critical,Error,Warning,Notice,Info,Debug, NOT Emergency
$ActionQueueSaveOnShutdown off # do not save messages to disk on shutdown
$ActionQueueTimeoutEnqueue 0
$ActionResumeRetryCount -1 # infinite retries if host is down
$RepeatedMsgReduction off
*.* @@remoteserver.mynetwork.com:5544
On the remoteserver I have something that talks syslog and listens on that port. To test, I have a simple log client that logs 100 messages a second to syslog.
This all works fine, and I have configured the queues above so that in the event that the remoteserver is unavailable, the queues start filling up, and then eventually messages get discarded, thus safeguarding syslog from blocking its logging clients.
When I stop the remote log sink on remoteserver:5544, rsyslog remains stable (queues filling up / full). But when I restart the remote log sink a while later, rsyslog detects the server again and re-establishes a TCP connection.
HOWEVER, rsyslog then forwards only 1 message to it, despite the queue holding many thousands of messages, and despite the logging client continuing to log 100 messages a second.
How can I make rsyslog start forwarding messages again once it has detected that the remote server is back up (without restarting rsyslog)?
I am using rsyslog 4.6.2-2.
I am using, and want to keep using, TCP.
The problem, in case anybody comes across this, was that the work directory was set to:
$WorkDirectory /var/spool/rsyslog
And the above config does this:
$ActionQueueFileName fwdRule1
Even though it's supposed to be an in-memory queue, setting $ActionQueueFileName means that when the queue reached 800 entries (bizarrely, not 8000), disk-assisted mode was activated and rsyslog attempted to write messages to /var/spool/rsyslog. This directory didn't exist. Randomly (hence a race condition, and a bug in rsyslog, must exist), after continually trying to open a queue file on disk in that directory, rsyslog got into a twisted state: it gave up and continued queueing messages in memory until it hit the 10,000 high-water mark. Restarting the downstream log server failed to make it recover.
Taking out all references to $ActionQueueFileName and making sure the $WorkDirectory exists fixed this issue.
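For reference, a sketch of the corrected action queue config (purely in-memory: with no $ActionQueueFileName, disk-assisted mode can never be triggered):

$WorkDirectory /var/spool/rsyslog # must exist if any queue ever spills to disk
$ActionQueueType LinkedList # in-memory queue
# no $ActionQueueFileName, so the queue cannot switch to disk-assisted mode
$ActionQueueSize 10000
$ActionQueueDiscardMark 8000
$ActionQueueDiscardSeverity 1
$ActionQueueSaveOnShutdown off
$ActionQueueTimeoutEnqueue 0
$ActionResumeRetryCount -1 # infinite retries if host is down
# @@ forwards over TCP
*.* @@remoteserver.mynetwork.com:5544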