How do you determine HTTP request parameter order when calculating HMACs? - http

I'm writing a Web service that is going to use HMAC for authentication. Quick overview: an HMAC is a message digest calculated over the body of a message along with a secret key. The sender calculates the HMAC and attaches it to the request; the receiver recalculates the digest on receipt using the secret key it has on file. If the digests match, the receiver can be sure the message was sent by whoever it claims to be from, i.e. someone who holds the secret key.
My question is about the parameter order. Let's say the Web service request has three parameters, foo, bar and baz. The body of the HTTP POST will look something like:
foo=1&bar=2&baz=3&hmac=de7c9b85b8b78aa6bc8a7a36f70a90701c9db4d9
(The HMAC in this case is a fake example.)
Normally HTTP parameter order is not significant, but when it comes to calculating the hash, it is. Should the server take the raw incoming request, drop the "hmac" parameter which is, of course, not part of the hash calculation, and hash that? Or should there be an agreed upon order of parameters which must be followed in order for the hash to be calculated correctly?
The former approach puts a bit more of a burden on the implementor on the server side, but it's more robust. What I'm really asking about is the expectation of developers who are building things on the client side: do they expect that things will just work regardless of what order the parameters are sent in?
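For concreteness, the "agreed order" option might look something like the following sketch (Go, purely illustrative; sorting keys alphabetically and the "hmac" parameter name are assumptions). Both client and server would have to implement exactly the same canonicalization, which is where the burden on implementors comes from.

package hmaccanon

import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "net/url"
    "sort"
    "strings"
)

// canonicalize rebuilds the form body in a fixed (alphabetical) key order and
// drops the hmac parameter itself, so both sides hash the same string no
// matter what order the client happened to send the parameters in.
func canonicalize(form url.Values) string {
    keys := make([]string, 0, len(form))
    for k := range form {
        if k == "hmac" {
            continue
        }
        keys = append(keys, k)
    }
    sort.Strings(keys)
    pairs := make([]string, 0, len(keys))
    for _, k := range keys {
        for _, v := range form[k] {
            pairs = append(pairs, url.QueryEscape(k)+"="+url.QueryEscape(v))
        }
    }
    return strings.Join(pairs, "&")
}

// digest computes the hex-encoded HMAC-SHA256 over the canonicalized body.
func digest(form url.Values, secret []byte) string {
    mac := hmac.New(sha256.New, secret)
    mac.Write([]byte(canonicalize(form)))
    return hex.EncodeToString(mac.Sum(nil))
}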

I would say that manipulating the body of the request after you have calculated a hash based on that body, which is significant to whether the request is accepted, is generally bad practice (for reasons that, I feel, are obvious). That HMAC should not be appended to the request body, but set in either a GET parameter, a cookie, or a custom header.
This also reduces the burden on the implementor on the server side for your first suggestion, and this is the path I would recommend.
But that's me, others may have differing opinions on all of this...
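A minimal sketch of that header-based approach (Go; the shared secret and the X-Request-HMAC header name are assumptions): the digest is computed over the raw body bytes exactly as sent and received, so parameter order never enters into it.

package hmacauth

import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/hex"
    "io"
    "net/http"
    "strings"
)

var secret = []byte("shared-secret") // assumption: distributed out of band

// sign returns the hex-encoded HMAC-SHA256 of the raw body bytes.
func sign(body []byte) string {
    mac := hmac.New(sha256.New, secret)
    mac.Write(body)
    return hex.EncodeToString(mac.Sum(nil))
}

// Client side: sign the body as sent and carry the digest in a header,
// so the body itself is never modified after signing.
func send() (*http.Response, error) {
    body := "foo=1&bar=2&baz=3"
    req, err := http.NewRequest("POST", "https://example.com/api", strings.NewReader(body))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
    req.Header.Set("X-Request-HMAC", sign([]byte(body))) // hypothetical header name
    return http.DefaultClient.Do(req)
}

// Server side: verify against the raw bytes exactly as received; the body is
// never reparsed or reordered, so ordering is irrelevant.
func verify(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }
    if !hmac.Equal([]byte(sign(body)), []byte(r.Header.Get("X-Request-HMAC"))) {
        http.Error(w, "invalid signature", http.StatusForbidden)
        return
    }
    w.Write([]byte("ok"))
}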

Related

What is the appropriate HTTP status code for a null or missing object attribute?

Let's say we have an API with a route /foo/<id> that represents an instance of an object like this:
from __future__ import annotations
from typing import Optional

class Foo:
    bar: Optional[Bar]
    name: str
    ...

class Bar:
    ...
(Example in Python just because it's convenient, this is about the HTTP layer rather than the application logic.)
We want to expose full serialized Foo instances (which may have many other attributes) under /foo/<id>, but, for the sake of efficiency, we also want to expose /foo/<id>/bar to give us just the .bar attribute of the given Foo.
It feels strange to me to use 404 as the status response when bar is None here, since that's the same status code you'd get if you requested some arbitrarily incorrect route like /random/gibberish, too; if we were to have automatic handling of 404 status in our client-side layer, it would be misinterpreting this with likely explanations such as "we forgot to log in" or "the client-side URL routing was wrong".
However, 200 with a response body of null (if we're serializing using JSON) feels odd as well, because the presence or absence of the entity at the given endpoint is usually communicated via a status rather than in-line in the body. Would 204 with an empty response body be the right thing to say here? Is a 404 the right way to go, and if so, what's the right way for the server to communicate nuances like "but that was a totally expected and correct route" or "actually the foo-ID you specified was incorrect; this isn't missing because the attribute was unset"?
What are the advantages and disadvantages of representing the missing-ness of this attribute in different ways?
I wonder if you could more clearly articulate why a 200 with a null response body is odd. I think it communicates exactly what you want, as long as you're not trying to differentiate between a given Foo not having a bar (e.g. Foo.has_key?(bar)) and Foo having a bar explicitly set to null.
Of 404, https://developer.mozilla.org says,
In an API, this can also mean that the endpoint is valid but the resource itself does not exist.
so I think it's acceptable. 204 doesn't strike me as particularly outlandish in this situation, but is more commonly associated (IME, at least) with DELETEs (and occasionally PUTs/POSTs that don't return results.)
I also struggle a lot with this because:

1. A 404 can point to a non-existent URL, or to a path that is acceptable but where the particular referenced resource does not exist. I have also used it to error out on request bodies that carry identifiers that don't exist.

2. A lot of people shoehorn these errors into the bad request (400) error code, which is somewhat acceptable but also a cop-out. (Literally anything the server did not process successfully can be classified as a bad request, if you think about it.)

3. With 2 (above) in mind, a 400 with some helpful message body is sometimes used to wash out the guilt of not committing outright to a 404, but this imposes parsing expectations on the client's side, which is not always nice. Also, returning a 400 is, according to this, kind of gaslighting the client, because 400 errors are supposed to be entirely the client's fault with regard to the structure of the request, not because the client asked for something that isn't in your db:

The 400 Bad Request response status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).

4. The general feeling is that 200 means all is good, and therefore there's a tacit expectation that the response will always contain some form of body, not null. (Right??) I wouldn't encourage using a 200 for these situations. And while a 204 doesn't carry the obligation of a response body, it also sort of conveys the message that "something worked", which is not the message you want to send here, right?

What am I trying to say? Thoughtful API design is hard.
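To make the trade-offs concrete, here is one possible mapping of the options discussed above, sketched in Go with a hypothetical store lookup: 404 only when the Foo itself is unknown, and a 200 whose JSON body is the literal null when the Foo exists but bar is unset.

package fooapi

import (
    "encoding/json"
    "net/http"
)

type Bar struct {
    // ... other fields ...
}

type Foo struct {
    Name string `json:"name"`
    Bar  *Bar   `json:"bar"` // nil means the attribute is unset
}

// lookupFoo is a hypothetical store lookup; ok is false when the id is unknown.
func lookupFoo(id string) (foo *Foo, ok bool) {
    // ... fetch from storage ...
    return nil, false
}

// handleFooBar serves GET /foo/<id>/bar.
func handleFooBar(w http.ResponseWriter, r *http.Request, id string) {
    foo, ok := lookupFoo(id)
    if !ok {
        // The route itself was fine but the Foo is unknown: a 404 whose body
        // names the missing resource lets clients tell this apart from a bad URL.
        http.Error(w, "no foo with id "+id, http.StatusNotFound)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    // Foo exists; if bar is unset this writes the literal JSON value null.
    json.NewEncoder(w).Encode(foo.Bar)
}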

Count sent and received bytes in Go in an http.Handler ServeHTTP function?

How can sent and received bytes be counted from within a ServeHTTP function in Go?
The count needs to be relatively accurate. Skipping connection establishment is not ideal, but acceptable. But headers must be included.
It also needs to be fast. Iterating over headers to estimate their size is generally too slow.
The counting itself doesn't need to occur within ServeHTTP, as long as the count for a given connection can be made available to ServeHTTP.
This must also not break HTTPS or HTTP/2.
Things I've Tried
It's possible to get a rough, slow estimate of received bytes by iterating over the Request headers. This is far too slow, and the Go standard library removes and combines headers, so it's not accurate either.
I tried writing an intercepting Listener, which created an internal tls.Listen or net.Listen Listener, and whose Accept() function got a net.Conn from the internal Listener's Accept(), and then wrapped that in an intercepting net.Conn whose Read and Write functions call the real net.Conn and count their reads and writes. It's then possible to make those counts available to the ServeHTTP function via mutexed shared variables.
The problem is, the intercepting Conn breaks HTTP/2, because Go's internal libraries cast the net.Conn as a *tls.Conn (e.g. https://golang.org/src/net/http/server.go#L1730), and it doesn't appear possible in Go to wrap the object while still making that cast succeed (if it is, it would solve this problem).
Counting sent bytes can be done relatively accurately by counting what is written to the ResponseWriter. Counting received bytes in the HTTP body is also achievable, via Request.Body. The critical issue here appears to be quickly and accurately counting request header bytes. Though again, also counting connection establishment bytes would be ideal.
Is this possible? How?
I think it is possible, but I can't say I've done it. However, based on browsing the stdlib implementation of the HTTP server and TLS listener, I don't see why it shouldn't be possible; the key is wrapping the connection before TLS instead of after. This also gets you a more accurate count of bytes on the wire, rather than a count of decrypted bytes.
You've already got an intercepting Listener, you just need to insert it in the right spot. Rather than passing your Listener to http.Serve (or wherever you're inserting it), you want to pass it to tls.NewListener first, which wraps it in the TLS handler, and then pass the result, which will be a TLS listener (making Go's HTTP/2 support happy) into the HTTP server.
Of course, if you want a count of decrypted bytes rather than wire bytes, you may be SOL - wrapping the net.Conn just won't get you there. You'll likely have to do the best you can with counting headers & body.
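A minimal sketch of that layering, assuming cert.pem and key.pem exist: the counting listener sits underneath TLS (here http.Server.ServeTLS performs the tls.NewListener wrapping itself), so the HTTP server still sees *tls.Conn values and HTTP/2 keeps working. The counters below are per listener; exposing per-connection counts to ServeHTTP would additionally need something like a map keyed by the connection's remote address.

package main

import (
    "fmt"
    "log"
    "net"
    "net/http"
    "sync/atomic"
)

// countingConn counts raw bytes read from and written to the wire,
// including TLS handshake and record overhead.
type countingConn struct {
    net.Conn
    read, written *int64
}

func (c countingConn) Read(p []byte) (int, error) {
    n, err := c.Conn.Read(p)
    atomic.AddInt64(c.read, int64(n))
    return n, err
}

func (c countingConn) Write(p []byte) (int, error) {
    n, err := c.Conn.Write(p)
    atomic.AddInt64(c.written, int64(n))
    return n, err
}

// countingListener wraps the plain TCP listener, before TLS is layered on top.
type countingListener struct {
    net.Listener
    read, written int64
}

func (l *countingListener) Accept() (net.Conn, error) {
    conn, err := l.Listener.Accept()
    if err != nil {
        return nil, err
    }
    return countingConn{Conn: conn, read: &l.read, written: &l.written}, nil
}

func main() {
    inner, err := net.Listen("tcp", ":8443")
    if err != nil {
        log.Fatal(err)
    }
    counting := &countingListener{Listener: inner}

    srv := &http.Server{
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintf(w, "read %d, written %d bytes on this listener\n",
                atomic.LoadInt64(&counting.read),
                atomic.LoadInt64(&counting.written))
        }),
    }
    // ServeTLS wraps the counting listener with tls.NewListener itself,
    // so HTTP/2 negotiation and the *tls.Conn type assertion both work.
    log.Fatal(srv.ServeTLS(counting, "cert.pem", "key.pem"))
}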

Generating a multipart/byterange response without scanning the parts ahead of sending

I would like to generate a multipart byte range response. Is there a way for me to do it without scanning each segment I am about to send out, since I need to generate multipart boundary strings?
For example, a user can request a byterange that would have me fetch and scan 2GB of data, which in my case involves loading that data into my (slow) VM as strings and so forth. Ideally I would like to simply state in the response that a part has a length of a certain number of bytes, and be done with it. Is there any tooling that could provide me with this option? I see that many developers just grab a UUID as the boundary and are probably willing to risk the tiny probability that it will appear somewhere within the part; is that risk really small enough that so many people are taking it?
To explain in more detail: scanning the parts ahead of time (before generating the response) is not really feasible in my case since I need to fetch them via HTTP from an upstream service. This means that I effectively have to prefetch the entire part first to compute a non-matching multipart boundary, and only then can I splice that part into the response.
Assuming the data can be arbitrary, I don’t see how you could guarantee absence of collisions without scanning the data.
If the format of the data is very limited (like... base 64 encoded?), you may be able to pick a boundary that is known to be an illegal sequence of bytes in that format.
Even if your boundary does collide with the data, a real part boundary must be followed by headers such as Content-Range, and the data also happening to contain those is even more improbable, so the client is likely to treat the collision as an error rather than consume the wrong data.
Major Web servers use very simple strategies. Apache grabs 8 random bytes at startup and renders them in hexadecimal. nginx uses a sequential counter left-padded with zeroes.
UUIDs are designed to avoid collisions with other UUIDs, not with arbitrary data. A UUID is no more likely to be a good boundary than a completely random string of the same length. Moreover, some UUID variants include information that you may not want to disclose, such as your machine’s MAC address.
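For what it's worth, Go's mime/multipart package takes the same random-boundary approach, which makes it easy to stream parts without ever scanning them. A sketch, with fetchRange standing in as a hypothetical helper for the upstream HTTP fetch:

package byteranges

import (
    "fmt"
    "io"
    "mime/multipart"
    "net/http"
    "net/textproto"
)

// fetchRange is a hypothetical helper that requests one byte range from the
// upstream service and returns the body stream plus the total object size.
func fetchRange(start, end int64) (io.ReadCloser, int64, error) {
    // ... issue an upstream GET with a Range header ...
    return nil, 0, fmt.Errorf("not implemented in this sketch")
}

// serveRanges streams a multipart/byteranges response. The boundary is chosen
// randomly up front by mime/multipart; the part bodies are copied straight
// through without being scanned for boundary collisions.
func serveRanges(w http.ResponseWriter, ranges [][2]int64) error {
    mw := multipart.NewWriter(w) // picks a random boundary
    w.Header().Set("Content-Type", "multipart/byteranges; boundary="+mw.Boundary())
    w.WriteHeader(http.StatusPartialContent)

    for _, rng := range ranges {
        body, total, err := fetchRange(rng[0], rng[1])
        if err != nil {
            return err
        }
        part, err := mw.CreatePart(textproto.MIMEHeader{
            "Content-Type":  {"application/octet-stream"},
            "Content-Range": {fmt.Sprintf("bytes %d-%d/%d", rng[0], rng[1], total)},
        })
        if err != nil {
            body.Close()
            return err
        }
        _, err = io.Copy(part, body) // stream, don't scan
        body.Close()
        if err != nil {
            return err
        }
    }
    return mw.Close() // writes the closing boundary
}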
Ideally I would like to simply state in the response that a part has a length of a certain number of bytes, and be done with it. Is there any tooling that could provide me with this option?
Maybe you can avoid supporting multiple ranges and simply tell the clients to request each range separately. In that case, you don’t use the multipart format, so there is no problem.
If you do want to send multiple ranges in one response, then RFC 7233 requires the multipart format, which requires the boundary string.
You can, of course, invent your own mechanism instead of that of RFC 7233. In that case:
You cannot use 206 (Partial Content). You must use 200 (OK) or some other applicable status code.
You cannot use the multipart/byteranges media type. You must come up with your own media type.
You cannot use the Range request header.
Because a 200 (OK) response to a GET request is supposed to carry a (full) representation of the resource, you must do one of the following:
encode the requested ranges in the URL; or
use something like POST instead of GET; or
use a custom, non-standard status code instead of 200 (OK); or
(not sure if this is a correct approach) use media type parameters, send them in Accept, and add Accept to Vary.
The chunked transfer coding may be useful, but you cannot rely on it alone, because it is a property of the connection, not of the payload.

How to handle padding errors in pkcs11?

I'm wondering how C_DecryptFinal & C_Decrypt are supposed to deal with padding errors.
According to pkcs11 2.20, C_DecryptFinal can return CKR_ENCRYPTED_DATA_INVALID or CKR_ENCRYPTED_DATA_LEN_RANGE,
so I suppose that if padding is invalid, C_DecryptFinal/C_Decrypt return CKR_ENCRYPTED_DATA_INVALID.
Is it correct?
If so, is C_DecryptFinal/C_Decrypt vulnerable to padding-oracle attacks?
Citing the standard (section 11.1.6):
CKR_ENCRYPTED_DATA_LEN_RANGE: The ciphertext input to a decryption operation has been determined to be invalid ciphertext solely on the basis of its length. Depending on the operation's mechanism, this could mean that the ciphertext is too short, too long, or is not a multiple of some particular blocksize. This return value has higher priority than CKR_ENCRYPTED_DATA_INVALID.

CKR_ENCRYPTED_DATA_INVALID: The encrypted input to a decryption operation has been determined to be invalid ciphertext. This return value has lower priority than CKR_ENCRYPTED_DATA_LEN_RANGE.
So for block ciphers the CKR_ENCRYPTED_DATA_LEN_RANGE should be returned when the input is not block-aligned.
If the input is block-aligned, the CKR_ENCRYPTED_DATA_INVALID is probably returned in case of wrong padding for the CKM_*_PAD mechanisms.
Thus the padding oracle attack is probably possible.
As PKCS#7 padding is the only padding scheme defined for block ciphers, it is quite often the responsibility of the application to handle the padding itself, which leads to what I think should be the answer to your question:
It is up to the application (i.e. "the cryptoki client") not to provide an external attacker (i.e. the "the application client") with any oracle to determine the padding was wrong, regardless of the source of this information (i.e. the cryptoki or the application itself).
It is probably meaningless to protect against the padding oracle attack on the cryptoki interface level (i.e. an attacker inside the application), as such an attacker can decrypt anything at will directly using the appropriate functions.
(Of course it is better to use some form of authenticated encryption and do not need to worry about the padding oracle attack at all)
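To illustrate the point about not handing the application client an oracle, a rough sketch (Go, with decrypt as a hypothetical wrapper around the cryptoki call chain): whatever the underlying return code, the external caller only ever sees one generic failure.

package decryptapi

import (
    "errors"
    "net/http"
)

// decrypt is a hypothetical wrapper around C_DecryptInit/C_Decrypt; it may fail
// with CKR_ENCRYPTED_DATA_INVALID, CKR_ENCRYPTED_DATA_LEN_RANGE, or anything else.
func decrypt(ciphertext []byte) ([]byte, error) {
    // ... call into the cryptoki library ...
    return nil, errors.New("not implemented in this sketch")
}

// handleDecrypt never tells the external caller why decryption failed: every
// failure collapses into the same generic response, so the application does not
// become a padding oracle even though the cryptoki return codes differ.
func handleDecrypt(w http.ResponseWriter, r *http.Request, ciphertext []byte) {
    plaintext, err := decrypt(ciphertext)
    if err != nil {
        // Log the detailed cause server-side if needed, but respond uniformly.
        http.Error(w, "decryption failed", http.StatusBadRequest)
        return
    }
    _ = plaintext // continue processing
}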
Disclaimer: I am no crypto expert, so please do validate my thoughts.

Correlation on MessageBox direct bound ports

I have an orchestration called MyUsefulOrch, hosted in an application MySharedApp.
MyUsefulOrch has an inbound messagebox-direct-bound port to receive requests, and after doing some useful work, an outbound messagebox-direct-bound port to send a message to the caller.
Now, I have another orchestration called MyCallerOrch which wants to benefit from the useful processing provided by MyUsefulOrch. However, MyCallerOrch is hosted in a different application, MyCallingApp.
I do not want to have any references to the assembly which contains MyUsefulOrch from MyCallerOrch.
My problem now is making sure I can send a message to MyUsefulOrch from MyCallerOrch and receive a response from it.
Ahah! Correlation should do the trick! But how do I go about getting correlation to work in this scenario?
For example:
Would I put a correlation id in a property schema and stuff a guid into the message context under this property from MyCallerOrch just before sending it to the messagebox?
How do I ensure that MyCallerOrch receives only the responses it needs to receive from MyUsefulOrch?
Do I need to put the correlation id value into the message body of the messages which are sent between the two orchestrations?
I would greatly appreciate any help, ideally as descriptive as possible, about how to achieve this.
Many thanks in advance.
If you use a two-way, request/response send port in the caller orchestration to send messages to the useful orchestration, then you can use correlation to route the responses from the useful orch back to the caller.
The trick is that you will need to modify the useful orch (to make it more useful, of course).
If you do not or cannot control whether callers of the useful orch are expecting a response back, then you would need to make the inbound (request) port a one-way port. The orchestration would then complete by sending to a one-way outbound (response) port.
To ensure that messages received from two-way/request-response callers are routed back properly, the construct shape of the outbound message inside your useful orch will need to set the following message properties to true using a message assignment shape:
BTS.RouteDirectToTP
BTS.IsRequestResponse
Before setting those two properties, though, also make sure to do something like msgOut(*) = msgIn(*); in the same message assignment shape to ensure that the other properties get copied over. If the inbound and outbound messages are not the same, then you have to manually set each of the required properties, one at a time.
Those properties, of course, in addition to the two above, are what help ensure that the result of the useful orch is properly routed to the caller. They should be inside your correlation set and are:
BTS.CorrelationToken
BTS.EpmRRCorrelationToken
BTS.IsRequestResponse
BTS.ReqRespTransmitPipelineID
BTS.RouteDirectToTP
I'm getting a bit ahead of myself, however, as you assign the correlation set to the outbound send shape only if BTS.EpmRRCorrelationToken exists msgIn. This is critical. I have used a decision shape in an orchestration, with the decision based on that exact expression. If the result is true, then send the previously constructed message out and assign the correlation set from above as the Initializing correlation set. This will cause BizTalk to route the message back to the caller as its expected response.
If the result of the decision was false then the caller of the useful orchestration was one-way. You will still likely want to send out a result (and just have someone else subscribe to it). You can even use the same send port as for two-way responses, just do not assign the correlation set.
You will want to thoroughly test this, of course. It does work for me in the one scenario in which I have used it, but that doesn't absolve others from doing their due diligence.
I think you are pretty much on the right track.
Since the two applications are going to send messages to each other, if you use strongly typed schemas, both apps will need to know about the schemas.
In this case I recommend that you separate the common schemas out into a separate assembly, and reference it from both of your orchestration apps.
(Schemas registered on the Server must have unique XMLNS#ROOTs, even across multiple applications)
However, if you really can't stand even a shared schema assembly reference, you might need to resort to untyped messages.
Richard Seroter has an example here
His article also explains a technique for auto stamping a correlation GUID on the context properties.
Edit: Good point. It is possible to promote custom context properties on the message without a pipeline - see the tricks here and here - so this would suffice to send the context property to MyUsefulOrch, and similarly the custom context property could be promoted on the return message from within MyUsefulOrch (since MyUsefulOrch doesn't need any correlation). However, I can't see how, on the return to MyCallingOrch, the custom context property could be used to continue the "following correlation", unless you add a new correlating property into the return message.
