Understanding How to Store Web Push Endpoints - push-notification

I'm trying to get started implementing Web Push in one of my apps. In the examples I have found, the client's endpoint URL is generally stored in memory with a comment saying something like:
In production you would store this in your database...
Since only registered users of my app can/will get push notifications, my plan was to store the endpoint URL in the user's metadata in my database. So far, so good.
The problem comes when I want to allow the same user to receive notifications on multiple devices. In theory, I will just add a new endpoint to the database for each device the user subscribes with. However, in testing I have noticed that endpoints change with each subscription/unsubscription on the same device. So, if a user subscribes/unsubscribes several times in a row on the same device, I wind up with several endpoints saved for that user (all but one of which are bad).
From what I have read, there is no reliable way to be notified when a user unsubscribes or an endpoint is otherwise invalidated. So, how can I tell if I should remove an old endpoint before adding a new one?
What's to stop a user from effectively mounting a denial of service attack by filling my db with endpoints through repeated subscription/unsubscription?
That's meant more as a joke (I can obviously limit the total endpoints for a given user), but the problem I see is that when it comes time to send a notification, I will blast notification services with hundreds of notifications for invalid endpoints.
I want the subscribe logic on my server to be:
Check if we already have an endpoint saved for this user/device combo
If not, add it; if yes, update it
The problem is that I can't figure out how to reliably do #1.
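For reference, my client-side flow is roughly the sketch below (the /api/save-subscription route and the VAPID key are placeholders for my own server). Reusing the existing subscription via getSubscription() at least avoids creating a new endpoint on every page load, although an unsubscribe/resubscribe cycle will still rotate it:

```typescript
// Sketch: reuse an existing subscription instead of resubscribing, then always
// send the current endpoint to the server so it can upsert rather than append.
const VAPID_PUBLIC_KEY = new Uint8Array([/* ...your VAPID public key bytes... */]);

async function ensureSubscription(): Promise<void> {
  const registration = await navigator.serviceWorker.ready;

  // getSubscription() returns the existing subscription (or null),
  // so repeated calls don't create new endpoints on the same device.
  let subscription = await registration.pushManager.getSubscription();
  if (!subscription) {
    subscription = await registration.pushManager.subscribe({
      userVisibleOnly: true,
      applicationServerKey: VAPID_PUBLIC_KEY,
    });
  }

  // Placeholder route: the server stores/updates this endpoint for the logged-in user.
  await fetch('/api/save-subscription', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(subscription),
  });
}
```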

I will just add a new endpoint to the database for each device the user subscribes with
The best approach is to have a table like this:
endpoint | user_id
add a unique constraint (or a primary key) on the endpoint: you don't want to associate the same browser with multiple users, because it's a mess (if an endpoint is already present but it has a different user_id, just update the user_id associated with it)
user_id is a foreign key that points to your users table
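A minimal sketch of that upsert, assuming Postgres and the node-postgres (pg) client; the table name push_subscriptions is just an illustration of the layout above:

```typescript
import { Pool } from 'pg';

const pool = new Pool(); // connection settings taken from the environment

// Insert the endpoint, or reassign it to the current user if it already exists,
// relying on the unique constraint on "endpoint".
async function saveSubscription(endpoint: string, userId: number): Promise<void> {
  await pool.query(
    `INSERT INTO push_subscriptions (endpoint, user_id)
     VALUES ($1, $2)
     ON CONFLICT (endpoint) DO UPDATE SET user_id = EXCLUDED.user_id`,
    [endpoint, userId]
  );
}
```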
if a user subscribes/unsubscribes several times in a row on the same device, I wind up with several endpoints saved for that user (all but one of which are bad).
Yes, unfortunately the push API has a wild unsubscription mechanism and you have to deal with it.
The endpoints can expire or can be invalid (or even malicious, like android.chromlum.info). You need to detect failures (using the HTTP status code, timeouts, etc.) when you try to send the push message from your application server. Then, for some kind of failures (permanent failures, like expiration) you need to delete the endpoint.
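The answer doesn't name a library, but as an example with the web-push package on Node, a permanent failure typically surfaces as a 404 or 410 response, which is the cue to delete the endpoint (deleteSubscription is a placeholder for your own database code):

```typescript
import webpush from 'web-push';

// webpush.setVapidDetails('mailto:...', publicKey, privateKey) is assumed to have been called.

// Shape of what the browser's PushSubscription.toJSON() gives us.
interface StoredSubscription {
  endpoint: string;
  keys: { p256dh: string; auth: string };
}

// Placeholder for your own persistence code.
async function deleteSubscription(endpoint: string): Promise<void> { /* ... */ }

async function sendToSubscription(sub: StoredSubscription, payload: string): Promise<void> {
  try {
    await webpush.sendNotification(sub, payload);
  } catch (err: any) {
    // 404 Not Found / 410 Gone: the subscription has expired or was revoked.
    if (err.statusCode === 404 || err.statusCode === 410) {
      await deleteSubscription(sub.endpoint);
    } else {
      // Transient failures (timeouts, 5xx) can simply be retried later.
      console.error('Push delivery failed:', err.statusCode ?? err);
    }
  }
}
```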
What's to stop a user from effectively mounting a denial of service attack by filling my db with endpoints through repeated subscription/unsubscription?
As I described above, you need to properly delete the invalid endpoints, once you realize that they are expired or invalid. Basically they will produce at most one invalid request. Moreover, if you have high throughput, it takes only a few seconds for your server to make requests for thousands of endpoints.
My suggestions are based on a lot of experiments and thinking done when I was developing Pushpad.

Another way is to have a keep-alive field on your server and have your service worker update it whenever it receives a push notification. Then regularly purge endpoints that haven't checked in recently.
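A rough sketch of that idea in the service worker; the /api/push-keepalive route is hypothetical and would just stamp a last-seen timestamp on the stored endpoint:

```typescript
// Service worker sketch: each time a push actually arrives, ping a (hypothetical)
// /api/push-keepalive route so the server can purge endpoints that have gone silent.
declare const self: ServiceWorkerGlobalScope;

self.addEventListener('push', (event) => {
  const data = event.data ? event.data.json() : {};

  event.waitUntil(
    Promise.all([
      // Show the notification as usual.
      self.registration.showNotification(data.title ?? 'Notification', { body: data.body ?? '' }),
      // Report "this endpoint is still alive" back to the server.
      self.registration.pushManager.getSubscription().then((sub) =>
        sub
          ? fetch('/api/push-keepalive', {
              method: 'POST',
              headers: { 'Content-Type': 'application/json' },
              body: JSON.stringify({ endpoint: sub.endpoint }),
            })
          : undefined
      ),
    ])
  );
});

export {};
```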

Related

What's the recommended way to handle microservice processing bugs and new insights?

Before I get to my question, let me sketch out a sample set of microservices to illustrate my dilemma.
Scenario outline
Suppose I have 4 microservices:
An activation service where features supplied to our customers are (de)activated. A registration service where members can be added and changed. A secured key service that is able to generate secure keys (in a multi-step process) for members, to be used when communicating about them with the outside world. And a communication service that is used to communicate about our members with external vendors.
The secured key service may however only request secured keys if this is a feature that is activated. Additionally, the communication service may only communicate about members that have a secured key AND if the communication feature itself is activated.
Because they are microservices, each of the services has its own datastore and is completely self-sufficient. That is, any data that is required from the other microservices is duplicated locally and kept in sync by means of asynchronous messages from the other microservices.
The dilemma
I'm actually facing two main dilemmas. The first is (pretty obviously) data synchronization. When there are multiple data stores that need to be kept in sync, you have to account for messages getting lost or processed out of order. But there are plenty of out-of-the-box solutions for this, and when all else fails you could even fall back to some kind of ETL process to keep things in sync.
The main issue I'm facing however is the actions that need to be performed. In the above example the secured key service must perform an action when it either
Receives a message from the registration service for a new member when it already knows that the secured keys feature is active in the activation service
Receives a message from the activation service that the secured keys feature is now active when it already knows about members from the registration service
In both cases this means that a message from the external system must lead to both an update in the local copy of the data as well as some logic that needs to be processed.
The question
Now to the actual question :)
What is the recommended way to cope with either bugs or new insights when it comes to handling those messages? Suppose there is a bug in the message handler from the activation service. The handler does update the internal data structure, but it fails to detect that there are already registered members and thus never starts the secure key generation process. Alternatively it could be that there's no bug, but we decide that there is something else we want the handler to do.
The system will have no reason to resubmit or reprocess messages (as the message didn't fail), but there's no real way for us to re-trigger the behavior that's behind the message.
I hope it's clear what I'm asking (and I do apologize if it should be posted on any of the other 170 Stack... sites, I only really know of StackOverflow)
I don't know what the recommended way is, but I know how this is done in DDD, and maybe this can help you, as DDD and microservices are friends.
What you have is a long-running/multi-step process that involves information from multiple microservices. In DDD this can be implemented using a Saga/Process manager. The Saga maintains a local state by subscribing to events from both the registration service and the activation service. As the events come in, the Saga checks whether it has all the information it needs to generate secure keys by submitting a CreateSecureKey command. The events may arrive in any order and can even be duplicated, but this is not a problem, as the Saga can compensate for this.
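A very small sketch of such a Saga in TypeScript; the event and command names (MemberRegistered, SecureKeysFeatureActivated, CreateSecureKey) are made up for illustration:

```typescript
// Saga/process manager that waits until it has seen both facts it needs,
// then issues the command. Event/command names are illustrative only.
interface MemberRegistered { type: 'MemberRegistered'; memberId: string }
interface SecureKeysFeatureActivated { type: 'SecureKeysFeatureActivated' }
type SagaEvent = MemberRegistered | SecureKeysFeatureActivated;

class SecureKeySaga {
  private featureActive = false;
  private pendingMembers = new Set<string>();
  private processedMembers = new Set<string>();

  constructor(
    private sendCommand: (cmd: { type: 'CreateSecureKey'; memberId: string }) => void
  ) {}

  handle(event: SagaEvent): void {
    if (event.type === 'SecureKeysFeatureActivated') {
      this.featureActive = true;
    } else {
      this.pendingMembers.add(event.memberId);
    }
    this.flush();
  }

  // Idempotent: duplicated or out-of-order events never trigger a second key.
  private flush(): void {
    if (!this.featureActive) return;
    for (const memberId of this.pendingMembers) {
      if (!this.processedMembers.has(memberId)) {
        this.sendCommand({ type: 'CreateSecureKey', memberId });
        this.processedMembers.add(memberId);
      }
    }
    this.pendingMembers.clear();
  }
}
```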
In case of bugs or new features, you could create special scripts or other processes that search for a particular situation and handle it by submitting specific compensating commands, without reprocessing all the past events.
In case of new features you may even have to process old events that are now interesting for your business process. You do this in the same way, by querying the event source for the newly interesting old events and sending them to the newly updated Saga. After that import process, you subscribe the Saga to these newly interesting events and the Saga continues to function as usual.

Sharing particular data between realms

I am about to start working on the back-end for a mobile app (initially iOS/Android, later also website) and I am thinking whether Realm could fulfill all my needs.
The basic idea is that there are two types of users - customers and service-providers. The customers send requests to the server once in a while and are subscribed (real-time) for any event that might occur in relation to this request in the future. Each service-provider is listening for specific requests from all customers and is the one who is going to trigger various events (send data) for each of those requests.
From the Realm docs, it is obvious that the real-time data sync is not going to be a problem. The thing I am concerned about is how to model the scenario (customer/service-provider) in the Realm 'world'. Based on what I read, it is preferred to have one realm per user. Therefore, I suppose the user will register and will be given a realm. Then whenever he makes a request, it is going to be stored in his realm. Now the question is how to model the service-provider. There are going to be various service-providers each responding (triggering various kinds of events up to one hour after request) to different kinds of requests. (Each user can send any request and therefore be served by any service-provider.)
I read a bit about how Realm supports data sharing among different realms, which could be a partial solution for this problem; however, I was not able to find out whether this 'sharing' can share only particular requests. (Meaning each service-provider will get only requests intended for him.)
My question is whether this scenario is doable using Realm.
This sounds like a perfect fit for Realm's server-side event-handling. Put simply, Realm offers the ability through our Node SDK to listen for changes across Realms on the server.
So in your example, where each mobile user would have their own Realm, the URL for this would be /~/myRealm, in which the tilde represents the Realm user ID. The Node SDK event handling API allows you to register a JS function that will run in response to changes represented by a regex pattern for Realm URLs. In this case you could use ^/([0-9a-f]+)/myRealm so that any time any user's myRealm is updated, the server can perform some logic.
In this manner, the server via the Node SDK is really a "super-user" or service-provider as you describe. When an event fires, the JS function that runs is provided the Realm that was updated and a list of indexes pertaining to the objects in the Realm that were inserted, deleted, or modified. You can then perform any logic in JS, such as using the changed data to call out to another API or opening the Realm in question or any other and writing changes which will get pushed back out to the respective clients.
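Roughly, the registration looks like the sketch below. It is based on the legacy Realm Object Server Node SDK; the exact API names (Realm.Sync.addListener, admin user login, the shape of the change event) come from the old documentation and may differ between versions, and the "Request" object type is purely illustrative:

```typescript
// Sketch based on the legacy Realm Object Server Node SDK; API details may vary by version.
import * as Realm from 'realm';

// Assume `adminUser` was obtained through the SDK's admin login (details omitted).
function registerListener(adminUser: Realm.Sync.User): void {
  Realm.Sync.addListener(
    'realm://127.0.0.1:9080',        // placeholder server address
    adminUser,
    '^/([0-9a-f]+)/myRealm',         // regex over Realm paths, as in the answer
    'change',
    (changeEvent: any) => {
      const realm = changeEvent.realm;
      // The event lists inserted/deleted/modified object indexes per object type.
      const requestChanges = changeEvent.changes.Request; // "Request" is an illustrative type name
      if (requestChanges) {
        const requests = realm.objects('Request');
        for (const index of requestChanges.insertions) {
          console.log('New request:', requests[index]);
        }
      }
    }
  );
}
```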
The full server-side event handling is part of Realm Professional Edition, but we recently released another way to interact with this called Realm Functions. This provides the ability, through the server's dashboard, to create the same JS functions that will run in response to changes across Realms. The Developer Edition supports 3 functions, so you can try it out immediately!

Sync external planning with Google agenda private API

We're developing an agenda on our platform. We implemented a feature to sync with Google Agenda, which works correctly except that it only works with public calendars and not when the calendar is private.
We implemented everything as Google describes and use the OAuth2 protocol.
We are migrating to https and we hope that it will solve our issue.
Do you have any idea on the reason it's blocked when agenda is private?
You can implement synchronization by sending HTTP request:
GET https://www.googleapis.com/calendar/v3/calendars/calendarId/events
and adding path parameters and optional query parameters as shown in Events: list.
In addition to that, referring to Synchronize Resources Efficiently, you can keep data for all calendar collections in sync while saving bandwidth by using the "incremental synchronization".
As highlighted in the documentation:
A sync token is a piece of data exchanged between the server and the client, and has a critical role in the synchronization process.
As you may have noticed, the sync token plays a major part in both stages of incremental synchronization. Make sure to store this syncToken for the next sync request. As discussed:
Initial full sync is performed once at the very beginning in order to fully synchronize the client’s state with the server’s state. The client will obtain a sync token that it needs to persist.
Incremental sync is performed repeatedly and updates the client with all the changes that happened ever since the previous sync. Each time, the client provides the previous sync token it obtained from the server and stores the new sync token from the response.
More information and examples on how to synchronize efficiently can be found in the linked documentation.
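In practice the incremental step looks roughly like the sketch below (plain REST via fetch; ACCESS_TOKEN and calendarId are placeholders). Per the documentation, a 410 response means the stored token has expired and a full sync is needed again:

```typescript
// Sketch of incremental sync against the Calendar API; ACCESS_TOKEN and calendarId are placeholders.
const ACCESS_TOKEN = '...';     // OAuth2 access token for the user's private calendar
const calendarId = 'primary';

async function incrementalSync(savedSyncToken: string): Promise<string> {
  const url = new URL(
    `https://www.googleapis.com/calendar/v3/calendars/${encodeURIComponent(calendarId)}/events`
  );
  url.searchParams.set('syncToken', savedSyncToken);

  const res = await fetch(url.toString(), {
    headers: { Authorization: `Bearer ${ACCESS_TOKEN}` },
  });

  if (res.status === 410) {
    // The sync token expired: clear it and perform a full sync from scratch.
    throw new Error('Sync token expired, full re-sync required');
  }

  const body = await res.json();
  for (const event of body.items ?? []) {
    // Apply each changed (or cancelled) event to the local store here.
    console.log(event.status, event.id);
  }

  // (Pagination via nextPageToken omitted; nextSyncToken arrives on the last page.)
  // Persist this for the next incremental sync.
  return body.nextSyncToken;
}
```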

Validate approach for clearing notifications

Could you validate my approach for using Firebase to manage a user notification system?
Basically I want to have user specific channels as well as more general channels which hold notifications. These notifications would appear on an intranet if the user has not viewed them before.
The idea is that a server-side action will update Firebase endpoints using the REST API, either for a specific user or as a broadcast to everyone. The user-specific messages I can easily mark as read and therefore not show again; it's the general broadcasts I am struggling with slightly.
I could add a flag (user ID) to the general broadcast to indicate it has been read, but I am concerned about performance, as the client would have to check historic broadcast messages for the existence of this flag. Alternatively, I could use the user ID to create a new endpoint, which should be quicker.
e.g. /notification/general/ contains the message; this triggers the client, which then checks whether /users/USERID/MessageID exists. If it doesn't, display the message and create this endpoint.
Is there something I am missing or is that the best approach?
Are the messages always consumed in-order? If so you could have each client remember the ID of the last message it read in each public channel. You could then use "startAt" on the queue to limit it to only new messages.
If they're not consumed in order, then you'll need some way of storing data about which ones were read and which ones weren't. Perhaps you can have each message get sent out to everyone's personal queue, and then have each user remove read messages.
Since there are already individual user messages, why not just deliver the broadcasts to everyone individually (think email) rather than trying to store a single copy and figure out who read it?
To reduce bulk, you could store the message content separately, and simply store the ids in a user's queue. Then when they are viewed, you flag them per-user without any additional complexity.
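That fan-out could look something like this sketch using the Firebase REST API; the database URL and the path layout are placeholders:

```typescript
// Sketch: store the message body once, then fan out only its ID to each user's queue.
// The database URL and path layout are placeholders.
const DB = 'https://your-project.firebaseio.com';

async function broadcast(
  messageId: string,
  message: { title: string; body: string },
  userIds: string[]
): Promise<void> {
  // 1. Store the message content a single time.
  await fetch(`${DB}/messages/${messageId}.json`, {
    method: 'PUT',
    body: JSON.stringify(message),
  });

  // 2. Push just the ID (with a read flag) into every user's personal queue.
  await Promise.all(
    userIds.map((uid) =>
      fetch(`${DB}/users/${uid}/notifications/${messageId}.json`, {
        method: 'PUT',
        body: JSON.stringify({ read: false }),
      })
    )
  );
}

// When the user views the message, flip the per-user flag.
async function markRead(uid: string, messageId: string): Promise<void> {
  await fetch(`${DB}/users/${uid}/notifications/${messageId}.json`, {
    method: 'PATCH',
    body: JSON.stringify({ read: true }),
  });
}
```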
At 100k users receiving 100 messages a day (including broadcasts), with a standard Firebase push ID (around 20 characters), that comes out to roughly 200,000,000 characters a day of ID storage, which is little for a database and probably still far less than the actual bulk of storing the message bodies, even assuming the IDs never expire and get deleted.

Check database for changes via long polling

I'm creating a chat app in ASP.NET MVC3.
I'm using long polling and an AsyncController to do so.
When a user posts a chat message it's saved in the database. To retrieve new messages, should I constantly check the database for changed records, or check at a definite interval?
Or is there a better/more efficient way of doing it?
I came across this question but could not get a usable answer.
You may take a look at SignalR for an efficient way. Contrary to the standard polling mechanism (in which you are sending requests at regular intervals to check for changes), SignalR uses a push mechanism in which the server sends notifications to connected clients to notify them about changes.
Since you're already using long polling and an AsyncController, why not create a message pool? Take a look at this solution.
In a nutshell, instead of just writing the updated chat to the database, you should also stick it in some sort of queue. Then each user's async thread listens to that pool, waiting for a message to appear. When one appears, return the data to the user through your normal operation. When all listening threads have picked up the message, it can be removed from the queue. This will prevent you from having several threads hammering your database looking for a new message.
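The idea is language-agnostic; as a rough sketch (shown here in TypeScript rather than ASP.NET), each long-poll request parks itself on an in-memory pool instead of repeatedly querying the database, and posting a message wakes all the waiters:

```typescript
// Rough sketch of the message-pool idea. Each long-poll request registers a waiter;
// publishing a chat message wakes them all, so nobody re-queries the database.
type ChatMessage = { user: string; text: string; sentAt: number };

const waiters: Array<(msg: ChatMessage) => void> = [];

// Called by the long-poll handler: resolves as soon as a new message arrives,
// or with null after a timeout so the client can simply reconnect.
function waitForMessage(timeoutMs = 30000): Promise<ChatMessage | null> {
  return new Promise((resolve) => {
    let timer: ReturnType<typeof setTimeout>;

    const waiter = (msg: ChatMessage) => {
      clearTimeout(timer);
      resolve(msg);
    };

    timer = setTimeout(() => {
      const i = waiters.indexOf(waiter);
      if (i >= 0) waiters.splice(i, 1);
      resolve(null); // let the client reconnect and poll again
    }, timeoutMs);

    waiters.push(waiter);
  });
}

// Called when a chat message is posted: persist it as before AND notify the waiters.
function publishMessage(msg: ChatMessage): void {
  // saveToDatabase(msg);  // persistence still happens exactly as it does today
  const pending = waiters.splice(0, waiters.length);
  for (const notify of pending) notify(msg);
}
```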
You can give PServiceBus (http://pservicebus.codeplex.com/) a try; here is a sample web chat app (http://74.208.226.12/ChatApp/chat.html) running that does not need a database in between to pass messages between two web clients. If you want to persist data in the database for logging's sake, you can always subscribe to the chat messages and log them to the database.
