GCM - Max length for Registration ID - push-notification

Update: GCM is deprecated, use FCM
What is the maximum length for a Registration ID issued by GCM servers? The GCM documentation does not provide this information. Searching suggests that a Registration ID is not fixed-length and can be up to 4K (4096 bytes) long, but these are not official answers from Google. I am currently receiving Registration IDs that are 162 characters long. Can anybody help?

On the android-gcm forum, a Google developer confirmed that it's 4K.

I am interested in knowing about this too. My reg ID is 183 characters long. I suspect it won't be longer than 512 characters though, let alone 4K. Imagine sending a bulk notification: a 4K reg ID x 1000 = 4MB message size!
In the end, I just used the 'text' type in my MySQL table to store the registration ID. So even if Google sends me a 1K, 2K, or 4K (very unlikely) reg ID, I will be able to handle it.
Update: I have come across a new reg id size: 205.

This is what the GCM documentation says:
A JSON object whose fields represents the key-value pairs of the message's payload data. If present, the payload data it will be included in the Intent as application data, with the key being the extra's name. For instance, "data":{"score":"3x1"} would result in an intent extra named score whose value is the string 3x1.
There is no limit on the number of key/value pairs, though there is a limit on the total size of the message (4kb). The values could be any JSON object, but we recommend using strings, since the values will be converted to strings in the GCM server anyway.
If you want to include objects or other non-string data types (such as integers or booleans), you have to do the conversion to string yourself. Also note that the key cannot be a reserved word (from or any word starting with google.).
To complicate things slightly, there are some reserved words (such as collapse_key) that are technically allowed in payload data. However, if the request also contains the word, the value in the request will overwrite the value in the payload data. Hence using words that are defined as field names in this table is not recommended, even in cases where they are technically allowed. Optional.
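As a rough sketch of the "do the conversion to string yourself" advice quoted above, the helper below turns non-string values into strings and rejects the reserved keys the documentation names. The function name stringify_data and the way the 4kb limit is measured (compact JSON, UTF-8 bytes) are assumptions made for illustration; the quoted text does not say exactly how the server counts the size.

```python
import json

MAX_PAYLOAD_BYTES = 4096  # the 4kb message limit from the quoted documentation


def stringify_data(data: dict) -> dict:
    """Return a copy of `data` whose values are all strings.

    Hypothetical helper: rejects the reserved keys named in the quoted
    documentation ("from" and anything starting with "google.").
    """
    result = {}
    for key, value in data.items():
        if key == "from" or key.startswith("google."):
            raise ValueError(f"reserved key: {key!r}")
        # Strings pass through; everything else is serialized as JSON text.
        result[key] = value if isinstance(value, str) else json.dumps(value)

    # Assumption: approximate the message size by the compact JSON encoding.
    size = len(json.dumps(result, separators=(",", ":")).encode("utf-8"))
    if size > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload data is {size} bytes, over the 4kb limit")
    return result


payload = stringify_data({"score": "3x1", "level": 7, "vip": True})
```

Here the integer and the boolean end up as the strings "7" and "true", so the receiving app has to parse them back itself.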

Related

storing as integer vs string size

I have checked the docs but got a bit confused. When storing a long integer such as 265628143862153200, would it be more efficient to store it as a string or as an integer?
What I need help with: is the calculation below correct?
Integer:
length of 265628143862153200 *8 ?
String:
length of 265628143862153200 +1 ?
The Firebase console is not meant to be part of administrative workflows. It's just for discovery and development. If you have production-grade procedures to follow, you should only write code for that using the provided SDKs. Typically, developers make their own admin sites to deal with data in Firestore.
Also, you should know that JavaScript integers are not "big" enough to store data to the full size provided by Firestore. If you want to use the full size of a number field, you will have to use a system that supports Firestore's full 64 bit signed integer type.
If you must store numbers bigger than either limit, and be able to modify them by hand, consider instead storing multiple numbers, similar to the way Firestore's timestamp stores seconds and nanoseconds as separate numbers, so that the full value can overflow greater than signed 64 bits.
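On the size question itself: Firestore's storage size rules count a number field as 8 bytes regardless of its magnitude, and a string as its UTF-8 byte count plus 1. A minimal sketch of that arithmetic for the value in the question follows; the function name is made up for illustration, and the sketch ignores the per-document and field-name overhead that Firestore also counts.

```python
FIRESTORE_NUMBER_BYTES = 8          # any integer or floating-point field
JS_MAX_SAFE_INTEGER = 2**53 - 1     # largest integer JavaScript can represent exactly
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def firestore_string_bytes(text: str) -> int:
    """Size of a string value: UTF-8 encoded bytes + 1."""
    return len(text.encode("utf-8")) + 1


value = 265628143862153200

size_as_number = FIRESTORE_NUMBER_BYTES
size_as_string = firestore_string_bytes(str(value))   # 18 digits + 1

fits_int64 = INT64_MIN <= value <= INT64_MAX
fits_js_safely = abs(value) <= JS_MAX_SAFE_INTEGER
```

So the integer form is smaller (8 bytes against 19), and the value fits Firestore's 64-bit integer type, but it is above the range that JavaScript numbers represent exactly, which is the caveat raised above.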

Decoding the UK's NHS Test and Trace QR codes

As persons in the UK may know, the government is rolling out a bold new strategy in a bid to slow down infection rates from Covid-19.
The latest rollout is for QR codes to display at the point of sale. To obtain one, you create an account detailing the operator name, name of establishment, phone number and email address. You are then issued a QR code.
Being concerned about data security, I am attempting to discover why the QR code I must display for persons using my cafe to "log in" with is 343 bytes long.
The QR code reads as follows:
UKC19TRACING:1:eyJhbGciOiJFUzI1NiIsImtpZCI6IllycWVMVHE4ei1vZkg1bnpsYVNHbllSZkI5YnU5eVBsV1lVXzJiNnFYT1EifQ.eyJpZCI6IlY1VldYMzlSIiwib3BuIjoiUGlwbGV5IEJhcm4gQ2Fmw6kiLCJhZHIiOiJQaXBsZXkgQmFyblxuQnJvY2toYW0gRW5kXG5MYW5zZG93biIsInBjIjoiQkExOUJaIiwidnQiOiIwMDgifQ.xG3rlgLIpQjHuZa7kQ4I4TC2u3xhmHpyhLjqGTS1aaFzueUt8TqqsW4-1eKL-RSOP9o0av9XPivtK-BfPuUV-g
There are a number of repeating sequences in the code such as "eyJ" and "pZCI6Il" which (I think) rules out the possibility that this is proper encryption.
My concern is that I am publicly displaying a lot of information, whereas it seems to me that a simple signature (like UKC19TRACING) plus a key into a database would be sufficient for any rational way of implementing contact tracing.
So I have fired off a Freedom of Information request to the relevant government department (the UK Department of Health and Social Care), but in the meantime I thought that greater experts than I might like to have a go at decrypting this.
You're completely correct that the repeated eyJ strongly suggests this isn't just raw encrypted data, and has some structure. eyJ comes from the Base64 encoding of {" and almost always indicates that you're looking at Base64-encoded JSON. In this case, it's actually base64url-encoded, but the eyJ pattern is the same. base64url is a slightly modified version of Base64 that uses - and _ in place of + and / so the result is safe in URLs.
As Topaco notes, the part after the UKC19TRACING:1: is a JWT. You can decode it at jwt.io as the following:
Header
{
  "alg": "ES256",
  "kid": "YrqeLTq8z-ofH5nzlaSGnYRfB9bu9yPlWYU_2b6qXOQ"
}
This tells you how it's signed, and the identifier of the key that signed it.
Payload
{
  "id": "V5VWX39R",
  "opn": "Pipley Barn Café",
  "adr": "Pipley Barn\nBrockham End\nLansdown",
  "pc": "BA19BZ",
  "vt": "008"
}
This is the signed payload. This isn't encrypted (and isn't meant to be). It's just base64url-encoded and signed, along with the header.
You're correct that this could be implemented by a large database and a sparse identifier. The point of a JWT is that you don't need the database (and more importantly, don't need a mechanism for performing a database lookup). Because the data is signed, you can trust this payload was generated by an entity with access to the signing key. To validate this signature, you need the public key (generally as a JWK) for the given kid.
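If you'd rather not paste the code into a website, the same decoding can be done locally with nothing but the standard library. This is a minimal sketch that only splits and decodes the three segments; it does not verify the signature, which would need the public key for the kid above. The function names are made up for illustration.

```python
import base64
import json

QR_TEXT = (
    "UKC19TRACING:1:"
    "eyJhbGciOiJFUzI1NiIsImtpZCI6IllycWVMVHE4ei1vZkg1bnpsYVNHbllSZkI5YnU5eVBsV1lVXzJiNnFYT1EifQ"
    "."
    "eyJpZCI6IlY1VldYMzlSIiwib3BuIjoiUGlwbGV5IEJhcm4gQ2Fmw6kiLCJhZHIiOiJQaXBsZXkgQmFyblxuQnJvY2toYW0gRW5kXG5MYW5zZG93biIsInBjIjoiQkExOUJaIiwidnQiOiIwMDgifQ"
    "."
    "xG3rlgLIpQjHuZa7kQ4I4TC2u3xhmHpyhLjqGTS1aaFzueUt8TqqsW4-1eKL-RSOP9o0av9XPivtK-BfPuUV-g"
)


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text, restoring the '=' padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_unverified(qr_text: str):
    """Split the QR text into its parts WITHOUT checking the signature."""
    _prefix, _version, token = qr_text.split(":", 2)
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64).decode("utf-8"))
    payload = json.loads(b64url_decode(payload_b64).decode("utf-8"))
    signature = b64url_decode(signature_b64)
    return header, payload, signature


header, payload, signature = decode_unverified(QR_TEXT)
print(header)
print(payload)
print(len(signature), "signature bytes")
```

Anyone who scans the code can do the same, so everything in the payload should be treated as public.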

Bytes to receive on gen_tcp:recv by parsing json

Working on a chat server, I need to receive JSON via gen_tcp in Erlang.
One way is to send a 4-byte int header, which is a good idea because I can also reject messages from clients if they exceed the max length, but it adds complexity on the client side.
Another way is to read a line, which should work for JSON too, if I am not wrong.
A third idea is to read JSON using depth tracking (counting '{', maybe?).
That way I can also set a max message length and make the client code less complex.
How can I do this specifically in Erlang, i.e. check the number of square brackets opened and keep receiving until the last one closes? And is it even a good idea?
How do XMPP and other messaging protocols handle this problem?
Another way is to read a line, which should work for JSON too, if I am not wrong.
JSON text can contain newlines: pretty-printed JSON has raw newlines between tokens, although a newline inside a key or value has to be escaped as \n. If your read protocol is: "Stop reading when a newline character is read from the socket.", you will not read the whole JSON whenever a sender emits a raw newline inside a message.
A third idea is to read JSON using depth tracking (counting '{', maybe?).
Ugh. Too complex. And JSON can start with a [ as well. Also, a key or value could contain a ] or a } too.
The bottom line is: you need to decide on what should mark the end of a sent message. You could choose some relatively unique string like: --*456?END OF MESSAGE!123**--, but once again a key or value in the json could possibly contain that string--and that is why byte headers are used. You should be able to make an informed choice on how you want to proceed after reading this.
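To make the byte-header option concrete, here is a minimal sketch of 4-byte length-prefixed framing with a size cap. It is written in Python rather than Erlang purely for illustration, and the names encode_frame, decode_frames and MAX_FRAME are made up. On the Erlang side, gen_tcp's {packet, 4} socket option implements the same 4-byte big-endian header for you, so the server should not need to parse the header by hand.

```python
import json
import struct

MAX_FRAME = 64 * 1024  # hypothetical cap; oversized messages are rejected


def encode_frame(message: dict) -> bytes:
    """Serialize a message as: 4-byte big-endian length, then the JSON bytes."""
    body = json.dumps(message).encode("utf-8")
    if len(body) > MAX_FRAME:
        raise ValueError("message too large")
    return struct.pack(">I", len(body)) + body


def decode_frames(buffer: bytes):
    """Extract every complete frame from `buffer`.

    Returns (messages, remaining_bytes); the remainder is kept until more
    data arrives from the socket.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if length > MAX_FRAME:
            raise ValueError("frame exceeds max length")
        start = offset + 4
        if len(buffer) - start < length:
            break  # body not fully received yet
        messages.append(json.loads(buffer[start:start + length].decode("utf-8")))
        offset = start + length
    return messages, buffer[offset:]
```

Because the reader knows the length before it reads the body, the content of the JSON (newlines, brackets, any delimiter string) no longer matters, and the max-length check happens before any of the body is buffered.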

What is the maximum length of a FCM getToken? [duplicate]

Working with the "new" Firebase Cloud Messaging, I would like to reliably save client device registration_id tokens to the local server database so that the server software can send them push notifications.
What is the smallest size of database field that I should use to save 100% of client registration tokens generated?
I have found two different libraries that use TextField and VarChar(255) but nothing categorically defining the max length. In addition, I would like the server code to do a quick length check when receiving tokens to ensure they "look" right - what would be a good min length and set of characters to check for?
I think this part of FCM is still the same as GCM. Therefore, you should refer to this answer by @TrevorJohns:
The documentation doesn't specify any pattern, therefore any valid string is allowed. The format may change in the future; please do not validate this input against any pattern, as this may cause your app to break if this happens.
As with the "registration_id" field, the upper bound on size is the max size for a cookie, which is 4K (4096 bytes).
Emphasizing the "The format may change in the future" part, I would suggest staying safe and allowing a length beyond the usual maximum (mentioned above), since the format and length of a registration token may also vary.
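For the "quick length check" asked about in the question, a deliberately lenient server-side check in the spirit of the quoted advice might look like the sketch below. It only rejects values that are empty or over the 4096-byte upper bound mentioned above, and applies no character pattern. The function name is made up, and using 4096 as the cut-off is an assumption taken from that quote, not an official FCM limit.

```python
MAX_TOKEN_BYTES = 4096  # upper bound from the quoted answer, not an official FCM figure


def is_storable_token(token) -> bool:
    """Accept any non-empty string that fits within the assumed upper bound.

    No character-set or format validation, since the format may change.
    """
    if not isinstance(token, str) or not token:
        return False
    return len(token.encode("utf-8")) <= MAX_TOKEN_BYTES
```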
For the usual length and characters, you can refer to these two answers, the latter being much more definitive:
I haven't seen any official information about the format of a GCM registrationId, but I've analyzed our database of such IDs and can draw the following conclusions:
in most cases the length of a registrationID is 162 characters, but it can vary (e.g. 119 characters), and there may be other lengths too;
it consists only of these characters: [0-9a-zA-Z\-\_]*
every regID contains one or both of the "delimiters": - (minus) or _ (underscore)
I am now using Firebase Cloud Messaging instead of GCM.
The length of the registration_id I've got is 152.
I've also got a ":" near the very beginning each time, like what jamesc mentioned (e.g. bk3RNwTe3H0:CI2k_HHwgIpoDKCIZvvDMExUdFQ3P1).
I store the token as varchar(255), which is working for me.
However, the length of the registration_id has no relationship to the 4K size. You are allowed to send data of whatever size over the network. Usually, cookies are limited to 4096 bytes, which consist of name, value, expiry date etc.
This is a real FCM token:
c2aK9KHmw8E:APA91bF7MY9bNnvGAXgbHN58lyDxc9KnuXNXwsqUs4uV4GyeF06HM1hMm-etu63S_4C-GnEtHAxJPJJC4H__VcIk90A69qQz65toFejxyncceg0_j5xwoFWvPQ5pzKo69rUnuCl1GSSv
As you can see, the length of the token is 152.
I don't think the upper limit for a registration ID is 4K. It should be safe to assume that it is much lower than that.
The upper limit for a notification payload is 4KB (link), and the notification payload includes the token (link). Since the payload also needs to include the title, body, and other data too, the registration ID should be small.
That's what I understand from the docs ¯\_(ツ)_/¯
The last tokens I got were 163-chars long. I think it's safe to assume that they will never exceed 255 chars. Some comments in the other answer reported much higher lengths!
Update
So far, in the 4 months that I've been running my app, there are over 100k registration IDs, and every single one of them is 163 characters long. It's very possible that Google keeps the ID length stable in order not to crash apps. Hence, I'd suggest
getting a few registration IDs in your local machine
measuring their length and verifying it's constant (or at least it doesn't change significantly)
picking a safe initial value, slightly higher than the ID length
I think it's unlikely for the length to change now, but I'll keep an eye. Please let me know if you noticed IDs of different lengths in your apps!
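The measuring step suggested above can be sketched in a few lines. The function name and the headroom factor are made up for illustration; how much headroom counts as "slightly higher" is a judgment call, not something the FCM documentation specifies.

```python
from collections import Counter


def length_report(tokens, headroom=1.5):
    """Summarize observed token lengths and suggest a column size.

    `headroom` is a hypothetical safety factor applied to the longest
    token seen so far.
    """
    lengths = Counter(len(token) for token in tokens)
    longest = max(lengths)
    return {
        "lengths": dict(lengths),            # length -> number of tokens
        "longest": longest,
        "suggested_column_size": int(longest * headroom),
    }


# Usage with the sample token quoted earlier in this thread:
sample = (
    "c2aK9KHmw8E:APA91bF7MY9bNnvGAXgbHN58lyDxc9KnuXNXwsqUs4uV4GyeF06HM1hMm-"
    "etu63S_4C-GnEtHAxJPJJC4H__VcIk90A69qQz65toFejxyncceg0_j5xwoFWvPQ5pzKo69"
    "rUnuCl1GSSv"
)
print(length_report([sample]))
```

Running it periodically against your own database is a cheap way to notice if the lengths start to drift.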

Generating a multipart/byterange response without scanning the parts ahead of sending

I would like to generate a multipart byte range response. Is there a way for me to do it without scanning each segment I am about to send out, since I need to generate multipart boundary strings?
For example, I can have a user request a byterange that would have me fetch and scan 2GB of data, which in my case involves me loading that data into my (slow) VM as strings and so forth. Ideally I would like to simply state in the response that a part has a length of a certain number of bytes, and be done with it. Is there any tooling that could provide me with this option? I see that many developers just grab a UUID as the boundary and are probably willing to risk a tiny probability that it will appear somewhere within the part. Is that risk small enough that multiple people are simply taking it?
To explain in more detail: scanning the parts ahead of time (before generating the response) is not really feasible in my case since I need to fetch them via HTTP from an upstream service. This means that I effectively have to prefetch the entire part first to compute a non-matching multipart boundary, and only then can I splice that part into the response.
Assuming the data can be arbitrary, I don’t see how you could guarantee absence of collisions without scanning the data.
If the format of the data is very limited (like... base 64 encoded?), you may be able to pick a boundary that is known to be an illegal sequence of bytes in that format.
Even if your boundary does collide with the data, it must be followed by headers such as Content-Range, which is even more improbable, so the client is likely to treat it as an error rather than consume the wrong data.
Major Web servers use very simple strategies. Apache grabs 8 random bytes at startup and renders them in hexadecimal. nginx uses a sequential counter left-padded with zeroes.
UUIDs are designed to avoid collisions with other UUIDs, not with arbitrary data. A UUID is no more likely to be a good boundary than a completely random string of the same length. Moreover, some UUID variants include information that you may not want to disclose, such as your machine’s MAC address.
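A minimal sketch of the "random boundary, no scanning" approach, in the spirit of the Apache strategy described above: the boundary is 8 random bytes in hexadecimal, and each part is streamed straight from whatever produces it. The names new_boundary, multipart_byteranges, multipart_length and the fetch_range callback are all made up for illustration; fetch_range stands in for your upstream HTTP fetch.

```python
import secrets


def new_boundary() -> str:
    """8 random bytes rendered as 16 hexadecimal characters."""
    return secrets.token_hex(8)


def _part_header(boundary, content_type, first, last, total_size) -> bytes:
    return (
        f"--{boundary}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Range: bytes {first}-{last}/{total_size}\r\n"
        "\r\n"
    ).encode("ascii")


def multipart_byteranges(ranges, total_size, content_type, fetch_range, boundary):
    """Yield the multipart/byteranges body chunk by chunk.

    `ranges` is a list of inclusive (first, last) byte positions and
    `fetch_range(first, last)` yields the bytes of one range. The part
    data is passed through without being inspected.
    """
    for first, last in ranges:
        yield _part_header(boundary, content_type, first, last, total_size)
        yield from fetch_range(first, last)
        yield b"\r\n"
    yield f"--{boundary}--\r\n".encode("ascii")


def multipart_length(ranges, total_size, content_type, boundary) -> int:
    """Content-Length of the whole body, computed without fetching any data."""
    length = len(f"--{boundary}--\r\n")
    for first, last in ranges:
        header = _part_header(boundary, content_type, first, last, total_size)
        length += len(header) + (last - first + 1) + 2
    return length
```

The response would carry status 206 and Content-Type: multipart/byteranges; boundary=<the boundary>. Because every part's size is known from its range, the overall Content-Length can be sent up front even though the format itself still relies on the boundary rather than on per-part lengths.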
Ideally I would like to simply state in the response that a part has a length of a certain number of bytes, and be done with it. Is there any tooling that could provide me with this option?
Maybe you can avoid supporting multiple ranges and simply tell the clients to request each range separately. In that case, you don’t use the multipart format, so there is no problem.
If you do want to send multiple ranges in one response, then RFC 7233 requires the multipart format, which requires the boundary string.
You can, of course, invent your own mechanism instead of that of RFC 7233. In that case:
You cannot use 206 (Partial Content). You must use 200 (OK) or some other applicable status code.
You cannot use the multipart/byteranges media type. You must come up with your own media type.
You cannot use the Range request header.
Because a 200 (OK) response to a GET request is supposed to carry a (full) representation of the resource, you must do one of the following:
encode the requested ranges in the URL; or
use something like POST instead of GET; or
use a custom, non-standard status code instead of 200 (OK); or
(not sure if this is a correct approach) use media type parameters, send them in Accept, and add Accept to Vary.
The chunked transfer coding may be useful, but you cannot rely on it alone, because it is a property of the connection, not of the payload.
