I do not understand the explanation of Fanout throttling - firebase

According to the Firebase documentation:
Message fanout is the process of sending a message to multiple devices, such as when you target topics and groups, or use the Notifications composer in the Firebase console.
We limit the number of in-progress message fanouts per project to 1,000. After that, we may reject additional fanout requests until some of the fanouts complete.
This explanation is not clear. For example, if one message is sent to a single topic with one million registered devices (simultaneous transmission to one million devices), does that count as one fanout?

Related

Firebase Cloud Messaging: "Topic Quota Exceeded"

I have a webapp and a Windows Service which communicate using Firebase Cloud Messaging. The webapp subscribes to a couple of Topics to receive messages, and the Windows Service App sends messages to one of these Topics. In some cases it can be several messages per second, and it gives me this error:
FirebaseAdmin.Messaging.FirebaseMessagingException: Topic quota exceeded
I don't quite get it. Is there a limit to the messages that can be sent to a specific topic, or what does this mean?
So far I have only found info about topic names and subscription limits, but I couldn't find anything about a "topic quota", except maybe this page of the docs (https://firebase.google.com/docs/cloud-messaging/concept-options#fanout_throttling), although I am not sure it refers to the same thing, and if it does, whether and how it can be changed. In the Firebase Console I can't find anything either. Has anybody got an idea?
Well.. from this document it seems pretty clear that this can happen:
The frequency of new subscriptions is rate-limited per project. If you send too many subscription requests in a short period of time, FCM servers will respond with a 429 RESOURCE_EXHAUSTED ("quota exceeded") response. Retry with exponential backoff.
I do agree that the document should state what volume triggers the blocking mechanism instead of just telling the developer to "Retry with exponential backoff". But, at the end of the day, Google also produced this document to help developers understand how to properly implement that mechanism. In a nutshell:
If the request fails, wait 1 + random_number_milliseconds seconds and retry the request.
If the request fails, wait 2 + random_number_milliseconds seconds and retry the request.
If the request fails, wait 4 + random_number_milliseconds seconds and retry the request.
And so on, up to a maximum_backoff time.
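A minimal sketch of that retry loop, in Java for illustration (the send callable, the attempt cap, and the backoff cap are my own assumptions, not values from the FCM docs; the same pattern applies to the .NET FirebaseAdmin SDK used in the question):

import java.util.concurrent.ThreadLocalRandom;

public class FcmBackoffRetry {

    // Retries a send with exponential backoff plus jitter, as described above.
    // `send` stands in for whatever call returns the 429 / "quota exceeded" error.
    static void sendWithBackoff(Runnable send) throws InterruptedException {
        long backoffMillis = 1_000;      // start at 1 second
        long maxBackoffMillis = 64_000;  // maximum_backoff (assumed value)
        int maxAttempts = 7;             // assumed cap

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                send.run();
                return;                  // success, stop retrying
            } catch (RuntimeException quotaExceeded) {
                if (attempt == maxAttempts) {
                    throw quotaExceeded; // give up after the last attempt
                }
                long jitter = ThreadLocalRandom.current().nextLong(1_000); // random_number_milliseconds
                Thread.sleep(Math.min(backoffMillis, maxBackoffMillis) + jitter);
                backoffMillis *= 2;      // 1s, 2s, 4s, 8s, ...
            }
        }
    }
}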
My conclusion: reduce the number of messages sent to the topic OR implement a retry mechanism to recover from unsuccessful attempts.
It could be one of these issues:
1. Too high subscription rates
As noted here:
The frequency of new subscriptions is rate-limited per project. If you send too many subscription requests in a short period of time, FCM servers will respond with a 429 RESOURCE_EXHAUSTED ("quota exceeded") response. Retry with exponential backoff.
But this doesn't seem to be your problem, as you don't open new subscriptions but instead send messages at a high rate.
2. Too many messages sent to one device
As noted here:
Maximum message rate to a single device
For Android, you can send up to 240 messages/minute and 5,000 messages/hour to a single device. This high threshold is meant to allow for short term bursts of traffic, such as when users are interacting rapidly over chat. This limit prevents errors in sending logic from inadvertently draining the battery on a device.
For iOS, we return an error when the rate exceeds APNs limits.
Caution: Do not routinely send messages near this maximum rate. This could waste end users’ resources, and your app may be marked as abusive.
Final notes
Fanout throttling doesn't seem to be the issue here, as that rate limit is really high.
The best ways to fix your issue would be:
Lower your rates: control the number of "devices" notified and overall limit your usage over short periods of time
Keep your rates as they are, but implement a back-off retry policy in your Windows Service App
Maybe look into a service more suited to your usage (as FCM is strongly focused on end-client notification), like Pub/Sub

Replies are not always delivered to the desired reply listener container

We are applying the request/reply semantics using spring-kafka with the ReplyingKafkaTemplate.
However, we noticed that sometimes the reply does not end up where it should.
Here is a rough description of our setup:
Service A
2 instances
consuming messages from topic-a, which has 2 partitions (each instance gets 1 partition assigned).
Service A is the initiator.
Service B:
2 instances, consumes messages from topic-b, which also has 2 partitions.
Reacts to incoming messages from A and returns a reply message using the @SendTo annotation.
Observed behaviour:
When an instance of service A, e.g. A1, sends a message to service B, the send fails with a reply timeout. The request is consumed by B successfully and a reply is returned; however, it is consumed by the other instance, e.g. A2. From the logging I can see that A1 gets topic-a-0 assigned, whereas A2 gets topic-a-1 assigned.
Suggestions from the docs:
Our scenario is described in this section of the docs: https://docs.spring.io/spring-kafka/reference/html/#replying-template
It gives a couple suggestions:
Give each instance a dedicated reply topic
Use reply partition header and use dedicated partitions for each instance
Our setup is based on a single topic for the whole service. All incoming events and reply events are sent to and consumed from this topic. So option #1 is not desirable in our situation.
The downside of option #2 is that you cannot use the group management feature, which is a pity because our services run on Kubernetes, so we'd like to use group management for maximum flexibility.
A third option?
So I was wondering if there was a third option:
Why not use group management, determine the assigned topic partitions of the reply container on the fly when sending a message, and set the reply partition header accordingly?
It looks like the ReplyingKafkaTemplate#getAssignedReplyTopicPartitions method provides exactly this information.
This way, the partitions are not fixed and we can still use the group management feature.
The only downside I can foresee is that when the partitions are rebalanced after the request was sent but before the reply was received, the request could fail.
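To make the idea concrete, a minimal sketch of what I have in mind (class names, topic names, and the "pick the first assigned partition" choice are placeholders; the REPLY_PARTITION header value is a 4-byte big-endian integer per the Spring Kafka docs):

import java.nio.ByteBuffer;
import java.util.Collection;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.requestreply.ReplyingKafkaTemplate;
import org.springframework.kafka.requestreply.RequestReplyFuture;
import org.springframework.kafka.support.KafkaHeaders;

public class DynamicReplyPartitionSender {

    private final ReplyingKafkaTemplate<String, String, String> template;

    public DynamicReplyPartitionSender(ReplyingKafkaTemplate<String, String, String> template) {
        this.template = template;
    }

    public RequestReplyFuture<String, String, String> send(String requestTopic, String payload) {
        ProducerRecord<String, String> record = new ProducerRecord<>(requestTopic, payload);

        // Ask the reply container which reply partitions this instance currently owns
        // (group management assigns these at runtime; they may change on a rebalance).
        Collection<TopicPartition> assigned = template.getAssignedReplyTopicPartitions();
        TopicPartition replyPartition = assigned.iterator().next();

        // Tell the responder which partition to send the reply to; the header value is
        // a 4-byte big-endian representation of the partition number.
        record.headers().add(KafkaHeaders.REPLY_PARTITION,
                ByteBuffer.allocate(4).putInt(replyPartition.partition()).array());

        return template.sendAndReceive(record);
    }
}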
I have already tested something to see if it works, and it looks like it does. The main reason for posting this question is to check whether my idea makes sense and whether there are any caveats to take into account. I'm also wondering why this is not supported by spring-kafka out of the box.
If my solution makes sense, I am willing to raise an enhancement issue and provide a PR on the spring-kafka project.
The issue, as you describe, is that there is no guarantee we'll get the same partition(s) after a rebalance.
The "third option" is to use a different group.id for each instance and set sharedReplyTopic=true. In this case all instances will get the reply and it will be discarded by the instance(s) that did not send the request.
The best solution, however, is to use a unique reply topic for each instance.
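A rough sketch of the sharedReplyTopic suggestion, assuming Spring Kafka's container-factory wiring (the bean shape, topic name, and group.id scheme are placeholders):

import java.util.UUID;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.requestreply.ReplyingKafkaTemplate;

@Configuration
public class SharedReplyTopicConfig {

    @Bean
    public ReplyingKafkaTemplate<String, String, String> replyingTemplate(
            ProducerFactory<String, String> pf,
            ConcurrentKafkaListenerContainerFactory<String, String> factory) {

        // The reply container listens on the shared topic with a group.id unique to
        // this instance, so every instance receives every reply.
        ConcurrentMessageListenerContainer<String, String> replyContainer =
                factory.createContainer("topic-a");
        replyContainer.getContainerProperties()
                .setGroupId("service-a-replies-" + UUID.randomUUID());

        ReplyingKafkaTemplate<String, String, String> template =
                new ReplyingKafkaTemplate<>(pf, replyContainer);
        // Replies whose correlation id doesn't match a request sent by this instance
        // are simply discarded instead of being logged as errors.
        template.setSharedReplyTopic(true);
        return template;
    }
}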

FCM Priority vs Delivery Time Clarification Needed

I am currently using the legacy HTTP API to send data messages from my server to Android devices subscribed to certain topics. I have both time-critical and not-so-critical messaging. However, I never have messaging that needs to wake up attached phones.
So I started with priority=normal, as I expected the behaviour explained here:
https://firebase.google.com/docs/cloud-messaging/concept-options
But what I actually observe is more like what is being described here:
https://firebase.google.com/docs/cloud-messaging/http-server-ref
When I switch to priority=high, messages indeed arrive almost instantly.
So it seems like the first piece of Firebase documentation is wrong?
Or how do I need to proceed to achieve the behavior described there, because that is what I actually need in my use case?
BTW I use time_to_live=0 messages in such cases. Messages are small, less than 50 bytes of data.
You need to use priority: "high", like this for example:
const options = {
  priority: "high",
  timeToLive: 60 * 60 * 24
};
If you want instant notifications when the device is in the background or shut down, you need to use priority: "high", since the notification that is sent will then have a higher priority, appearing instantly and waking the device.
FCM attempts to deliver high priority messages immediately, allowing the FCM service to wake a sleeping device when necessary and to run some limited processing (including very limited network access). High priority messages generally should result in user interaction with your app. If FCM detects a pattern in which they don't, your messages may be de-prioritized.
timeToLive:
This parameter specifies how long (in seconds) the message should be kept in FCM storage if the device is offline. The maximum time to live supported is 4 weeks, and the default value is 4 weeks. For more information, see the FCM documentation.
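For comparison, a minimal sketch of sending a high-priority, short-lived data message with the Firebase Admin SDK for Java rather than the legacy HTTP API (the topic name is made up, and FirebaseApp is assumed to be initialized with credentials already):

import com.google.firebase.messaging.AndroidConfig;
import com.google.firebase.messaging.FirebaseMessaging;
import com.google.firebase.messaging.Message;

public class HighPrioritySend {

    public static void main(String[] args) throws Exception {
        Message message = Message.builder()
                .setTopic("time-critical-updates")                // hypothetical topic
                .putData("payload", "small, under 50 bytes")
                .setAndroidConfig(AndroidConfig.builder()
                        .setPriority(AndroidConfig.Priority.HIGH) // deliver immediately, may wake the device
                        .setTtl(0)                                // "now or never", like time_to_live=0
                        .build())
                .build();

        // Assumes FirebaseApp.initializeApp(...) has already been called.
        String messageId = FirebaseMessaging.getInstance().send(message);
        System.out.println("Sent: " + messageId);
    }
}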

Cursor payload is too big compared to the useful payload

Steps to reproduce
The logic of the application assumes that there are a number of data sources on the server, which are handled by groups.
If a client wants to subscribe to a specific data source, it calls:
myhub.Subscribe(dataSourceId);
On the server side, we just add the client to the specific group:
await Groups.Add(Context.ConnectionId, dataSourceId.ToString());
Then all the messages are sent with a huge cursor payload. And most importantly, its size grows with every subscription.
Am I doing something wrong?
Update
Similar: SignalR and large number of groups
Unfortunately, this is how cursors work. The cursor contains references to all topics the connection is subscribed to, and each group is a separate topic. Besides the cursor getting bigger, there is one more limitation to using many groups: the more groups the client is a member of, the bigger the groups token. The groups token is sent back to the server when the client is reconnecting, and if it gets too big it may exceed the URL size limit, causing reconnect failures.

How to Improve Performance of Kafka Producer when used in Synchronous Mode

I have developed an application against Kafka version 0.9.0.1 that cannot afford to lose any messages.
I have a constraint that the messages must be consumed in the correct sequence.
To ensure I do not lose any messages I have implemented retries within my application code and configured my producer with acks=all.
To enforce exception handling and to fail fast, I immediately call get() on the Future returned from Producer.send(), e.g.:
final Future<RecordMetadata> futureRecordMetadata = KAFKA_PRODUCER.send(producerRecord);
futureRecordMetadata.get();
This approach works fine for guaranteeing the delivery of all messages; however, the performance is completely unacceptable.
For example, it takes 34 minutes to send 152,125 messages with acks=all.
When I comment out the futureRecordMetadata.get(), I can send 1,089,125 messages in 7 minutes.
When I change acks=all to acks=1, I can send 815,038 in 30 minutes. Why is there such a big difference between acks=all and acks=1?
However, by not blocking on the get() I have no way of knowing whether the message arrived safely.
I know I can pass a Callback into send() and have Kafka retry for me; however, this approach has the drawback that messages may arrive out of sequence.
I thought the request.required.acks config could save the day for me, however when I set any value for it I receive this warning:
130 [NamedConnector-Monitor] WARN org.apache.kafka.clients.producer.ProducerConfig - The configuration request.required.acks = -1 was supplied but isn't a known config.
Is it possible to asynchronously send Kafka messages, with a guarantee they will ALWAYS arrive safely and in the correct sequence?
UPDATE 001
Is there any way I can consume messages in Kafka message KEY order directly from the TOPIC?
Or would I have to consume messages in offset order and then sort them programmatically into Kafka message key order?
If you expect a total order, the send performance is bad (actually, the total-order scenario is very rare).
If per-partition order is acceptable, you can use a multi-threaded producer: one producer/thread for each partition.
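To make that concrete, a hedged sketch of asynchronous sends that still preserve per-partition order under retries (broker address, topic, and message counts are placeholders; enable.idempotence does not exist on 0.9.x, so ordering here relies on limiting in-flight requests to 1):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderedAsyncProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");                       // wait for all in-sync replicas
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);        // let the client retry
        // With retries enabled, allowing only one in-flight request per connection
        // keeps messages in per-partition order even when a retry happens.
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1000; i++) {
                ProducerRecord<String, String> record =
                        new ProducerRecord<>("my-topic", "key-" + (i % 4), "value-" + i);
                // Asynchronous send; the callback reports failures without blocking per message.
                producer.send(record, (metadata, exception) -> {
                    if (exception != null) {
                        // Retries are exhausted at this point; decide whether to alert or halt.
                        exception.printStackTrace();
                    }
                });
            }
            producer.flush(); // block once at the end instead of once per message
        }
    }
}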
