maxInboundMessageSize on channel and stub - grpc

I could not see the correlation between maxInboundMessageSize on the channel and on the stub.
My flow is simple:
1. I'm creating the channel (using ManagedChannelBuilder)
maxInboundMetadataSize(int bytes)
Sets the maximum size of metadata allowed to be received.
2. The stub is created based on the channel
withMaxInboundMessageSize(int maxSize)
Returns a new stub that limits the maximum acceptable message size from a remote peer.
The relation between those values is still unclear to me.
I don't know how to use them properly or what good practice is.
Does anyone understand it?

maxInboundMetadataSize(int bytes) enforces the HTTP/2 SETTINGS_MAX_HEADER_LIST_SIZE setting (see RFC 7540, Section 6.5.2). Notice that the HTTP/2 default is unlimited, which can be dangerous; gRPC uses a default of 8K. You can configure it if you know a reasonable upper bound for your specific use case.
withMaxInboundMessageSize protects the receiver from running out of memory, for example if a malicious peer sends a very large message as an attack. Again, this is purely for protection; gRPC doesn't allocate any buffer based on maxInboundMessageSize (default 4MB).
So they don't have any relationship per se. Using the defaults is perfectly fine unless your use case doesn't work with them (the defaults are 8KB for maxInboundMetadataSize and 4MB for maxInboundMessageSize as of gRPC Java 1.29.0). I can see that certain applications may need more than a 4MB message size.
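For illustration, here is a minimal sketch against the gRPC Java API (the endpoint and MyServiceGrpc are placeholders for your own service). Note that the channel builder also has a maxInboundMessageSize option; as far as I understand, the stub-level value overrides that channel-wide default for calls made through the stub:

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

class GrpcLimitsExample {
    static void example() {
        // "example.com:443" and MyServiceGrpc are placeholders for your own
        // endpoint and generated service stub.
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("example.com", 443)
                .maxInboundMetadataSize(16 * 1024)        // cap on received headers/trailers (gRPC default: 8KB)
                .maxInboundMessageSize(16 * 1024 * 1024)  // channel-wide default cap on received messages (default: 4MB)
                .build();

        // Per-stub override of the message-size cap for calls made through this stub.
        MyServiceGrpc.MyServiceBlockingStub stub = MyServiceGrpc.newBlockingStub(channel)
                .withMaxInboundMessageSize(32 * 1024 * 1024);
    }
}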

Related

When does AkkaHttp backpressure kick in?

Is it when the HTTP response entity is not consumed, when the client's TCP buffer becomes full, or when the rate at which the client drains its TCP buffer is lower than the rate at which the server pushes data into it?
I am looking for a way to achieve the following:
Let's assume that there is a backpressure-able source of data on the server, such as an Apache Kafka topic.
If I consume this source from a remote location, the rate at which that remote location can consume may be lower; this is solved if a Kafka client/consumer is used.
However let's assume that the client is a browser and that exposing direct Kafka protocol / connectivity is not a possibility.
Further, let's assume that it is possible to get all the value even while skipping over some messages.
For instance in case of compacted topics, getting only the latest values for each key is enough for a client, no need to go through intermediate values.
This would be equivalent to Flowable.onBackpressureLatest() or AkkaStreams.aggregateOnBackpressure or onBackpressureAggregate.
Is there a way to expose the topic over HTTP REST (e.g. Server-Sent Events / chunked transfer encoding) or over WebSockets that would achieve this effect of skipping over intermediate values for each key?
Please advise, thanks
Akka HTTP supports backpressure, based on the TCP protocol, very well, and you can read about using it in combination with streaming here.
Kafka consumption and exposure via HTTP with backpressure can be achieved by combining akka-http, akka-stream and alpakka-kafka.
Kafka consumers have to poll for records, and Alpakka propagates backpressure by reducing its polling requests.
I don't see the necessity of skipping over messages when backpressure is fully supported. Kafka keeps track of the offset consumed by a consumer group (the one you pick for your service or HTTP connection), and this guarantees eventual consumption of all messages. Of course, if you produce messages into a topic much faster than they are consumed, the consumer will never catch up. Let me know if this is your case.
As a final note, you may check out the Confluent REST Proxy API, which allows you to read Kafka messages in a RESTful manner.
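That said, if you do want the "latest value per key" behaviour from the question, Akka Streams can approximate Flowable.onBackpressureLatest() with conflateWithSeed, which only aggregates while the downstream (e.g. the HTTP response stream) is backpressuring. A rough sketch in the Java DSL, where Record and its key field are placeholders for your own message type:

import java.util.HashMap;
import java.util.Map;
import akka.NotUsed;
import akka.stream.javadsl.Flow;

// Placeholder message type: replace with whatever your Kafka deserializer produces.
class Record {
    final String key;
    final String value;
    Record(String key, String value) { this.key = key; this.value = value; }
}

class LatestPerKey {
    // While the downstream is slow, collapse pending records into a map that
    // keeps only the newest record per key; when demand resumes, emit the map.
    static Flow<Record, Map<String, Record>, NotUsed> flow() {
        return Flow.<Record>create()
                .conflateWithSeed(
                        r -> { Map<String, Record> m = new HashMap<>(); m.put(r.key, r); return m; },
                        (m, r) -> { m.put(r.key, r); return m; });
    }
}

Placed between the Kafka source and the HTTP/SSE sink, the aggregated map is emitted (and reset) each time the client signals demand, so a slow client only sees the newest value per key while a fast client sees every record.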

Does Golang's (*http.ResponseWriter) Write() method block until data is received by the client?

I ask this question because I had a very weird puzzling experience that I am about to tell.
I am instrumenting an HTTP API server to observe its behavior in the presence of latency between the server and the clients. I had a setup consisting of a single server and a dozen clients connected over a 10Gbps Ethernet fabric. I measured the time it took to serve certain API requests in 5 scenarios. In each scenario, I set the latency between the server and the clients to one of the values: no latency (I call this the baseline), 25ms, 50ms, 250ms or 400ms, using the tc-netem(8) utility.
Using histogram buckets to quantify the service time, I observed that all the requests were processed in less than 50ms whatever the scenario, which clearly doesn't make any sense: for example, in the 400ms case it should be at least around 400ms (as I am only measuring the duration from the moment the request hits the server to the moment the HTTP Write() function returns). Note that the response objects are between 1Kb and 10Kb in size.
Initially, I suspected that the *http.ResponseWriter's Write() function was asynchronous and returned immediately, before data is received by the client. So I decided to test this hypothesis by writing a toy HTTP server that serves the content of a file generated with dd(1) and /dev/urandom, so that I could reconfigure the response size. Here is the server:
package main

import (
    "io/ioutil"
    "log"
    "net/http"
    "time"
)

var response []byte

func httpHandler(w http.ResponseWriter, r *http.Request) {
    switch r.Method {
    case "GET":
        now := time.Now()
        w.Write(response)
        elapsed := time.Since(now)
        mcs := float64(elapsed / time.Microsecond)
        s := elapsed.Seconds()
        log.Printf("Elapsed time in mcs: %v, sec: %v", mcs, s)
    }
}

func main() {
    response, _ = ioutil.ReadFile("BigFile")
    http.HandleFunc("/hd", httpHandler)
    http.ListenAndServe(":8089", nil)
}
Then I start the server like this:
dd if=/dev/urandom of=BigFile bs=$VARIABLE_SIZE count=1 && ./server
From the client side, I issue: time curl -X GET $SERVER_IP:8089/hd --output /dev/null
I tried many values of $VARIABLE_SIZE in the range [1Kb, 500Mb], using an emulated latency of 400ms between the server and each one of the clients. To make a long story short, I noticed that the Write() method blocks until the data is sent when the response size is big enough to be visually noticed (on the order of tens of megabytes). However, when the response size is small, the server doesn't report a sane servicing time compared to the value reported by the client. For a 10Kb file, the client reports 1.6 seconds while the server reports 67 microseconds (which doesn't make sense at all; even I, as a human, noticed a delay on the order of a second, as reported by the client).
To go a little further, I tried to find out starting from which response size the server reports an acceptable time. After many trials using a binary search, I discovered that the server always reports a few microseconds [20us, 600us] for responses up to 86501 bytes in size, and reports expected (acceptable) times for responses of 86502 bytes or more (usually half the time reported by the client). As an example, for an 86501-byte response, the client reported 4 seconds while the server reported 365 microseconds. For 86502 bytes, the client reported 4s and the server reported 1.6s. I repeated this experiment many times using different servers; the behavior is always the same. The number 86502 looks like magic!
This experiment explains the weird observations I initially had, because all the API responses were less than 10Kb in size. However, it opens the door to a serious question: what on earth is happening, and how can this behavior be explained?
I've tried to search for answers but didn't find anything. The only thing I can think of is that it may be related to Linux socket buffer sizes and whether Go makes the system call in a non-blocking fashion. However, AFAIK, the TCP packets transporting the HTTP response should all be acknowledged by the receiver (the client) before the sender (the server) can return! Breaking this assumption (as appears to be the case here) could lead to disasters! Can someone please provide an explanation for this weird behavior?
Technical details:
Go version: 1.12
OS: Debian Buster
Arch: x86_64
I'd speculate the question is in fact stated in a wrong way: you seem to be guessing about how HTTP works instead of looking at the whole stack.
The first thing to consider is that HTTP (1.0 and 1.1, which has been the standard version for a long time now) does not specify any means for either party to acknowledge data reception.
There is an implicit acknowledgement of the fact that the server received the client's request: the server is expected to respond to the request, and when it responds, the client can be reasonably sure the server actually received the request.
There is no such thing working in the other direction, though: the server does not expect the client to somehow "report back", on the HTTP level, that it managed to read the whole of the server's response.
The second thing to consider is that HTTP is carried over TCP connections (or TLS, which is not really different, as it uses TCP as well).
An oft-forgotten fact about TCP is that it has no message framing — that is, TCP performs bi-directional transfer of opaque byte streams.
TCP only guarantees total ordering of bytes in these streams; it does not in any way preserve any occasional "batching" which may naturally result from the way you work with TCP via a typical programming interface — by calling some sort of "write this set of bytes" function.
Another thing which is often forgotten about TCP is that while it indeed uses acknowledgements to track which part of the outgoing stream was actually received by the receiver, this is a protocol detail which is not exposed to the programming interface level (at least not in any common implementation of TCP I'm aware of).
These features mean that if one wants to use TCP for message-oriented data exchange, one needs to implement support for both message boundaries (so-called "framing") and acknowledgement of the reception of individual messages in the protocol above TCP.
HTTP is a protocol which is above TCP but while it implements framing, it does not implement explicit acknowledgement besides the server responding to the client, described above.
Now consider that most if not all TCP implementations employ buffering in various parts of the stack. At least, the data which is submitted by the program gets buffered, and the data which is read from the incoming TCP stream gets buffered, too.
Finally consider that most commonly used TCP implementations provide for sending data into an active TCP connection through the use of a call allowing to submit a chunk of bytes of arbitrary length.
Considering the buffering described above, such a call typically blocks until all the submitted data gets copied to the sending buffer.
If there's no room in the buffer, the call blocks until the TCP stack manages to stream some amount of data from that buffer into the connection — freeing some room to accept more data from the client.
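None of this is specific to Go: any blocking socket write behaves the same way. As a rough illustration, here is a sketch in plain Java sockets (class name, payload sizes and the 5-second idle receiver are arbitrary choices for the demo): the small write returns immediately even though the peer has read nothing, while the large write blocks until the peer starts draining its side.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class SendBufferDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0); // pick a free port
        Thread receiver = new Thread(() -> {
            try (Socket s = server.accept()) {
                // Deliberately do not read for a while, so the sender's
                // buffers (and the receiver's window) fill up.
                Thread.sleep(5000);
                InputStream in = s.getInputStream();
                byte[] buf = new byte[64 * 1024];
                while (in.read(buf) != -1) { /* drain */ }
            } catch (Exception ignored) { }
        });
        receiver.start();

        try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
            client.setSendBufferSize(64 * 1024); // SO_SNDBUF is only a hint to the OS
            OutputStream out = client.getOutputStream();

            byte[] small = new byte[8 * 1024];
            long t0 = System.nanoTime();
            out.write(small);                    // fits in the send buffer: returns at once
            System.out.printf("small write: %d ms%n", (System.nanoTime() - t0) / 1_000_000);

            byte[] big = new byte[10 * 1024 * 1024];
            t0 = System.nanoTime();
            out.write(big);                      // blocks until the receiver starts draining
            System.out.printf("big write:   %d ms%n", (System.nanoTime() - t0) / 1_000_000);
        }
        receiver.join();
        server.close();
    }
}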
What does all of the above mean for net/http.ResponseWriter.Write interacting with a typical contemporary TCP/IP stack?
A call to Write would eventually try to submit the specified data into the TCP/IP stack.
The stack would try to copy that data over into the sending buffer of the corresponding TCP connection — blocking until all the data manages to be copied.
After that you have essentially lost any control about what happens with that data: it may eventually be successfully delivered to the receiver, or it may fail completely, or some part of it might succeed and the rest will not.
What this means for you, is that when net/http.ResponseWriter.Write blocks, it blocks on the sending buffer of the TCP socket underlying the HTTP connection you're operating on.
Note, though, that if the TCP/IP stack detects an irreparable problem with the connection underlying your HTTP request/response exchange, such as a frame with the RST flag coming from the remote party meaning the connection has been unexpectedly torn down, this problem will bubble up through Go's HTTP stack as well, and Write will return a non-nil error.
In this case, you will know that the client was likely not able to receive the complete response.

Some questions about MPI send modes

I'm trying to understand the specifics of the MPI send modes (send, bsend, ssend, rsend) and I have the following questions:
MPI_Send uses some buffer if no matching MPI_Recv/MPI_Irecv has been posted and the message is not too big (does not exceed the buffer size); otherwise, MPI_Send waits for the matching receive. I know this is true (this situation is described here: Deadlock with MPI).
MPI_Bsend uses the buffer (provided via the MPI_Buffer_attach function) only when no matching receive has been posted. Is that true?
Is the buffer for MPI_Bsend the same buffer that MPI_Send uses?
MPI_Ssend never uses a buffer. Is that true? Or does MPI_Ssend behave like MPI_Send (a buffer is used if the message size does not exceed the buffer size)?
If the answer to any of my questions is "it's not true", could you give me a detailed answer with explanations?
The precise behavior of MPI_Send is subject to change depending on the implementation. In addition, some implementations allow the threshold size to be tuned by the user.
Check MPI's Send Modes for some detailed information. If you want to make sure your program is portable to other MPI implementations, refer to the MPI standard (Section 3.4: Communication Modes). For the standard mode (MPI_Send), here's what the standard says (as of MPI 3.1):
The send call described in Section 3.2.1 uses the standard communication mode. In this mode, it is up to MPI to decide whether outgoing messages will be buffered. MPI may buffer outgoing messages. In such a case, the send call may complete before a matching receive is invoked. On the other hand, buffer space may be unavailable, or MPI may choose not to buffer outgoing messages, for performance reasons. In this case, the send call will not complete until a matching receive has been posted, and the data has been moved to the
receiver.
Thus, a send in standard mode can be started whether or not a matching receive has
been posted. It may complete before a matching receive is posted. The standard mode send
is non-local: successful completion of the send operation may depend on the occurrence of
a matching receive.
The main misconception you have is that you think MPI_Send uses buffering if MPI_Recv has not been called by the receiver process. Actually, it usually depends on the message size, regardless of whether the matching receive has been posted.
If buffering is used, the user's send buffer is released after the data is copied to a temporary buffer. The program can then continue its execution regardless of whether the corresponding receive has been issued or not.

Is using packets in a networking library a good idea at all?

I've been programming a library for both TCP and UDP networking and thought about using packets. Currently I've implemented a packet class which can be used like the C++ standard library's stream classes (it has << and >> for writing and reading data). I plan on sending the packets like so:
bytes 1-8: a uint64_t giving the size of the packet.
bytes 9 onward: the contents of the packet.
But there's a problem. What if a malicious client sends a size measured in terabytes and random garble as the filler? The server's memory is filled with the random garble and it will freeze/crash.
Is it a good idea to let the server decide the maximum allowed size of the received packet?
Or should I discard packets and implement transferring data as streams (where reading/writing would be entirely decided by the user of the library)?
(PS: I'm not a native English speaker, so forgive my possibly hideous usage of the language.)
Yes, set a maximum allowed size on the server side. Set it so that the server won't freeze/crash, but not smaller. Predictable behaviour should be the highest goal.
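As a rough illustration of that idea (sketched in Java rather than C++, with an arbitrary placeholder limit): read the 8-byte length prefix, reject it before allocating anything, and only then read the payload.

import java.io.DataInputStream;
import java.io.IOException;

class PacketReader {
    // Placeholder cap: pick whatever your server can comfortably buffer.
    static final long MAX_PACKET_SIZE = 1 << 20; // 1 MiB

    // Reads one length-prefixed packet: an 8-byte big-endian size followed by the payload.
    static byte[] readPacket(DataInputStream in) throws IOException {
        long size = in.readLong();
        if (size < 0 || size > MAX_PACKET_SIZE) {
            // Reject before allocating: a malicious size can no longer
            // force the server to allocate terabytes.
            throw new IOException("packet size " + size + " exceeds limit");
        }
        byte[] payload = new byte[(int) size];
        in.readFully(payload);
        return payload;
    }
}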

Maximum on HTTP header values?

Is there an accepted maximum allowed size for HTTP headers? If so, what is it? If not, is this something that's server specific or is the accepted standard to allow headers of any size?
No, HTTP does not define any limit. However, most web servers do limit the size of the headers they accept. For example, the default limit in Apache is 8KB and in IIS it's 16K. The server will return a 413 Entity Too Large error if the header size exceeds that limit.
Related question: How big can a user agent string get?
As vartec says above, the HTTP spec does not define a limit; however, many servers do by default. This means that, practically speaking, the lower limit is 8K. For most servers, this limit applies to the sum of the request line and ALL header fields (so keep your cookies short).
Apache 2.0, 2.2: 8K
nginx: 4K - 8K
IIS: varies by version, 8K - 16K
Tomcat: varies by version, 8K - 48K (?!)
It's worth noting that nginx uses the system page size by default, which is 4K on most systems. You can check with this tiny program:
pagesize.c:
#include <unistd.h>
#include <stdio.h>
int main() {
    int pageSize = getpagesize();
    printf("Page size on your system = %i bytes\n", pageSize);
    return 0;
}
Compile with gcc -o pagesize pagesize.c then run ./pagesize. My ubuntu server from Linode dutifully informs me the answer is 4k.
Here are the limits of the most popular web servers:
Apache - 8K
Nginx - 4K-8K
IIS - 8K-16K
Tomcat - 8K – 48K
Node (<13) - 8K; (>13) - 16K
HTTP does not place a predefined limit on the length of each header
field or on the length of the header section as a whole, as described
in Section 2.5. Various ad hoc limitations on individual header
field length are found in practice, often depending on the specific
field semantics.
HTTP header values are restricted by server implementations; the HTTP specification itself doesn't restrict header size.
A server that receives a request header field, or set of fields,
larger than it wishes to process MUST respond with an appropriate 4xx
(Client Error) status code. Ignoring such header fields would
increase the server's vulnerability to request smuggling attacks
(Section 9.5).
Most servers will return 413 Entity Too Large or appropriate 4xx error when this happens.
A client MAY discard or truncate received header fields that are
larger than the client wishes to process if the field semantics are
such that the dropped value(s) can be safely ignored without changing
the message framing or response semantics.
Uncapped HTTP header size keeps the server exposed to attacks and can bring down its capacity to serve organic traffic.
Source
RFC 6265 dated 2011 prescribes specific limits on cookies.
https://www.rfc-editor.org/rfc/rfc6265
6.1. Limits
Practical user agent implementations have limits on the number and
size of cookies that they can store. General-use user agents SHOULD
provide each of the following minimum capabilities:
o At least 4096 bytes per cookie (as measured by the sum of the
length of the cookie's name, value, and attributes).
o At least 50 cookies per domain.
o At least 3000 cookies total.
Servers SHOULD use as few and as small cookies as possible to avoid
reaching these implementation limits and to minimize network
bandwidth due to the Cookie header being included in every request.
Servers SHOULD gracefully degrade if the user agent fails to return
one or more cookies in the Cookie header because the user agent might
evict any cookie at any time on orders from the user.
--
The RFC is addressed at what must be supported by a user agent or a server. It appears that to tune your server to support what browsers allow, you would need to configure 4096*50 as the limit. As the text that follows suggests, this appears to be far in excess of what is needed for the typical web application. It would be useful to take the current limit and the RFC's outlined upper limit and compare the memory and IO consequences of the higher configuration.
I also found that in some cases the reason for 502/400 errors with many headers could be the sheer number of headers, regardless of their size.
From the HAProxy docs:
tune.http.maxhdr
Sets the maximum number of headers in a request. When a request comes with a
number of headers greater than this value (including the first line), it is
rejected with a "400 Bad Request" status code. Similarly, too large responses
are blocked with "502 Bad Gateway". The default value is 101, which is enough
for all usages, considering that the widely deployed Apache server uses the
same limit. It can be useful to push this limit further to temporarily allow
a buggy application to work by the time it gets fixed. Keep in mind that each
new header consumes 32bits of memory for each session, so don't push this
limit too high.
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.2-tune.http.maxhdr
If you are going to use a DDoS-protection provider such as Akamai, they have a maximum limit of 8K on response header size, so essentially try to keep your response headers below 8K.
