I am about to start working on the back-end for a mobile app (initially iOS/Android, later also a website) and I am wondering whether Realm could fulfill all my needs.
The basic idea is that there are two types of users - customers and service-providers. The customers send requests to the server once in a while and are subscribed (real-time) for any event that might occur in relation to this request in the future. Each service-provider is listening for specific requests from all customers and is the one who is going to trigger various events (send data) for each of those requests.
From the Realm docs, it is obvious that the real-time data sync is not going to be a problem. The thing I am concerned about is how to model the scenario (customer/service-provider) in the Realm 'world'. Based on what I read, it is preferred to have one realm per user. Therefore, I suppose the user will register and will be given a realm. Then whenever he makes a request, it is going to be stored in his realm. Now the question is how to model the service-provider. There are going to be various service-providers each responding (triggering various kinds of events up to one hour after request) to different kinds of requests. (Each user can send any request and therefore be served by any service-provider.)
I read that Realm supports data sharing among different realms, which could be a partial solution to this problem. However, I was not able to find out whether this 'sharing' can be limited to particular requests (meaning each service-provider would get only the requests intended for him).
My question is whether this scenario is doable using Realm?
This sounds like a perfect fit for Realm's server-side event-handling. Put simply, Realm offers the ability through our Node SDK to listen for changes across Realms on the server.
So in your example, where each mobile user would have their own Realm, the URL for this would be /~/myRealm, in which the tilde represents the Realm user ID. The Node SDK event handling API allows you to register a JS function that will run in response to changes in any Realm whose URL matches a Regex pattern. In this case you could use ^/([0-9a-f]+)/myRealm so that any time any user's myRealm is updated, the server could perform some logic.
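To see which paths that pattern would select, here is a minimal sketch using Python's standard re module rather than the Realm Node SDK. The sample paths and the helper name user_id_for are made up for illustration; only the pattern itself comes from the answer.

```python
import re

# Pattern quoted in the answer: a hex user ID followed by the Realm name.
REALM_PATH = re.compile(r"^/([0-9a-f]+)/myRealm")


def user_id_for(path):
    """Return the user ID if the path is some user's myRealm, else None."""
    match = REALM_PATH.match(path)
    return match.group(1) if match else None


# Hypothetical paths, for illustration only.
print(user_id_for("/3f9a0c/myRealm"))  # 3f9a0c
print(user_id_for("/3f9a0c/other"))    # None
print(user_id_for("/~/myRealm"))       # None (the tilde is only a client-side shorthand)
```

The capture group is what lets a handler tell which user's Realm changed.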
In this manner, the server via the Node SDK is really a "super-user" or service-provider as you describe. When an event fires, the JS function that runs is provided the Realm that was updated and a list of indexes pertaining to the objects in the Realm that were inserted, deleted, or modified. You can then perform any logic in JS, such as using the changed data to call out to another API or opening the Realm in question or any other and writing changes which will get pushed back out to the respective clients.
The full server-side event handling is part of Realm Professional Edition, but we recently released another way to interact with this called Realm Functions. This provides the ability, through the server's dashboard, to create the same JS functions that will run in response to changes across Realms. The developer edition supports 3 functions, so you can try it out immediately!
Imagine the following situation. I have an API, and a developer builds an application that retrieves new content from it on a daily basis. She stores this content and provides it to all the instances of an app she developed. In this way these apps do not have to call the API directly.
Is there a way to prevent this and force the apps (and therefore the end users) to use the API directly, rather than only the application on the server?
I found many questions about how to cache API data but not how to prevent that. I am fairly new to this, so maybe I am overlooking something or maybe it is not possible to prevent this.
Thank you in advance!
Assuming you are using Apigee for API-management, you have some options. First, consider the options available to you contractually, if this is that sort of business relationship and you can impose certain API behavior with a business partner through a contract.
Separate from the legal side of things, remember that you control your API and the credentials you issue to your API clients. In practice, though, you cannot control what a client developer does with the credentials you issue: she could promise to embed the credentials in the mobile apps' API client, then change her mind, use them centrally, and design her mobile client to call into her central cache. If you really insist that only mobile app clients should call your API, and not a hub/cache server, you could consider applying constraint policies on your API (within the Apigee proxy, such as Access Control). For instance, you could blacklist your partner's hub/cache server IP address, although that is weak security at best. Or you could apply a constraint that only clients with certain identifying User-Agent strings (mobile OS, client) are allowed to connect to your API. Or use GeoIP filtering to allow only clients from certain regions, if that applies to your use case.
Finally, depending on the data model, you might be able to rate-limit such that a bulk cache becomes impractical: if your edge-client use case is to fetch a single record, but a cache would have to hold thousands of records, then you could impose a per-client rate limit (Quota policy) that is no bother to individual mobile clients but makes the work of a hub/cache server untenable.
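To make the rate-limit idea concrete, here is a rough sketch of a per-client fixed-window quota in plain Python. It is not Apigee's Quota policy or its configuration; the class name, the limit, and the window length are all assumptions chosen for illustration.

```python
import time


class FixedWindowQuota:
    """Allow at most `limit` calls per client in each window (illustrative only)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._counters = {}  # client_id -> (window_start, calls_in_window)

    def allow(self, client_id):
        now = self.clock()
        start, count = self._counters.get(client_id, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._counters[client_id] = (start, count)
            return False
        self._counters[client_id] = (start, count + 1)
        return True


# Hypothetical numbers: 5 calls per minute is plenty for one phone,
# but a cache server trying to pull 1000 records gets only 5 through.
quota = FixedWindowQuota(limit=5, window_seconds=60)
print(sum(quota.allow("phone-1") for _ in range(3)))          # 3
print(sum(quota.allow("cache-server") for _ in range(1000)))  # 5
```

The limit only bites if the cache server cannot spread its calls across many credentials, so it works best alongside the other constraints mentioned above.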
Before I get to my question, let me sketch out a sample set of microservices to illustrate my dilemma.
Scenario outline
Suppose I have 4 microservices:
An activation service where features supplied to our customers are (de)activated.
A registration service where members can be added and changed.
A secured key service that is able to generate secure keys (in a multi-step process) for members, to be used when communicating with the outside world.
A communication service that is used to communicate about our members with external vendors.
The secured key service may however only request secured keys if this is a feature that is activated. Additionally, the communication service may only communicate about members that have a secured key AND if the communication feature itself is activated.
Because they are microservices, each of the services has its own datastore and is completely self-sufficient. That is, any data that is required from the other microservices is duplicated locally and kept in sync by means of asynchronous messages from the other microservices.
The dilemma
I'm actually facing two main dilemmas. The first is (pretty obviously) data synchronization. When there are multiple data stores that need to be kept in sync, you have to account for messages getting lost or processed out of order. But there are plenty of out-of-the-box solutions for this, and when all else fails you could even fall back to some kind of ETL process to keep things in sync.
The main issue I'm facing however is the actions that need to be performed. In the above example the secured key service must perform an action when it either
Receives a message from the registration service for a new member when it already knows that the secured keys feature is active in the activation service
Receives a message from the activation service that the secured keys feature is now active when it already knows about members from the registration service
In both cases this means that a message from the external system must lead to both an update in the local copy of the data as well as some logic that needs to be processed.
The question
Now to the actual question :)
What is the recommended way to cope with either bugs or new insights when it comes to handling those messages? Suppose there is a bug in the message handler from the activation service. The handler does update the internal data structure, but it fails to detect that there are already registered members and thus never starts the secure key generation process. Alternatively it could be that there's no bug, but we decide that there is something else we want the handler to do.
The system will have no reason to resubmit or reprocess messages (as the message didn't fail), but there's no real way for us to re-trigger the behavior that's behind the message.
I hope it's clear what I'm asking (and I do apologize if it should be posted on any of the other 170 Stack... sites, I only really know of StackOverflow)
I don't know what the recommended way is, but I know how this is done in DDD, and maybe that can help you, as DDD and microservices are friends.
What you have is a long-running/multi-step process that involves information from multiple microservices. In DDD this can be implemented using a Saga/Process manager. The Saga maintains a local state by subscribing to events from both the registration service and the activation service. As the events come in, the Saga checks whether it has all the information it needs to generate secure keys and, if so, submits a CreateSecureKey command. The events may arrive in any order and may even be duplicated, but this is not a problem as the Saga can compensate for this.
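A minimal sketch of such a Saga in Python might look like the following. The event handler names, the command tuple, and the send_command callback are assumptions made for illustration; a real process manager would also persist its state.

```python
class SecureKeySaga:
    """Tracks activation and registration events and requests keys once both are known."""

    def __init__(self, send_command):
        self.send_command = send_command  # hypothetical command bus callback
        self.feature_active = False
        self.members = set()
        self.requested = set()  # members for which a command was already sent

    def on_feature_activated(self):
        self.feature_active = True
        self._evaluate()

    def on_feature_deactivated(self):
        self.feature_active = False

    def on_member_registered(self, member_id):
        self.members.add(member_id)
        self._evaluate()

    def _evaluate(self):
        # Act only when the feature is active, and only once per member,
        # so event order and duplicate deliveries do not matter.
        if not self.feature_active:
            return
        for member_id in sorted(self.members - self.requested):
            self.requested.add(member_id)
            self.send_command(("CreateSecureKey", member_id))


commands = []
saga = SecureKeySaga(commands.append)
saga.on_member_registered("m1")   # member arrives before the feature is active
saga.on_feature_activated()
saga.on_feature_activated()       # duplicate event, ignored
saga.on_member_registered("m2")   # member arrives after activation
print(commands)  # [('CreateSecureKey', 'm1'), ('CreateSecureKey', 'm2')]
```

Both triggering cases from the question end up in the same _evaluate step, which is the single place to fix if the rule turns out to be wrong.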
In case of bugs or new features, you could create special scripts or other processes that search for a particular situation and handle it by submitting specific compensating commands, without reprocessing all the past events.
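As a rough illustration of such a script, assuming the service can list its registered members and the members that already have keys, a one-off reconciliation pass could look like this in Python. All function names and the command shape are hypothetical.

```python
def find_missed_members(feature_active, registered_members, members_with_keys):
    """Return members that should have a key by now but do not."""
    if not feature_active:
        return []
    return sorted(set(registered_members) - set(members_with_keys))


def compensate(feature_active, registered_members, members_with_keys, send_command):
    """Submit one compensating command per missed member; return who was handled."""
    missed = find_missed_members(feature_active, registered_members, members_with_keys)
    for member_id in missed:
        send_command(("CreateSecureKey", member_id))
    return missed


sent = []
print(compensate(True, ["m1", "m2", "m3"], ["m2"], sent.append))  # ['m1', 'm3']
```

Because it looks at current state rather than replaying messages, the script can be run again safely once the handler bug is fixed.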
In case of new features, you may even have to process old events that are now interesting for your business process. You do this in the same way, by querying the event source for the newly interesting old events and sending them to the updated Saga. After that import process, you subscribe the Saga to these newly interesting events and the Saga continues to function as usual.
We're developing an agenda on our platform. We implemented a feature to sync with Google Calendar, which works correctly except that it only works with public calendars and not with private ones.
We implemented everything as Google describes and use the OAuth 2.0 protocol.
We are migrating to https and we hope that it will solve our issue.
Do you have any idea why it is blocked when the calendar is private?
You can implement synchronization by sending an HTTP request:
GET https://www.googleapis.com/calendar/v3/calendars/calendarId/events
and adding path parameters and optional query parameters as shown in Events: list.
In addition to that, referring to Synchronize Resources Efficiently, you can keep data for all calendar collections in sync while saving bandwidth by using the "incremental synchronization".
As highlighted in the documentation:
A sync token is a piece of data exchanged between the server and the client, and has a critical role in the synchronization process.
As you may have noticed, the sync token plays a major part in both stages of incremental synchronization. Make sure to store this syncToken for the next sync request. As discussed:
Initial full sync is performed once at the very beginning in order to fully synchronize the client’s state with the server’s state. The client will obtain a sync token that it needs to persist.
Incremental sync is performed repeatedly and updates the client with all the changes that happened ever since the previous sync. Each time, the client provides the previous sync token it obtained from the server and stores the new sync token from the response.
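The two stages can be sketched with a small in-memory simulation in Python. This does not call the Google Calendar API: FakeCalendarServer and SyncClient are stand-ins invented for illustration, and only the nextSyncToken field name mirrors the real response.

```python
class FakeCalendarServer:
    """In-memory stand-in for a server that hands out sync tokens (hypothetical)."""

    def __init__(self):
        self._log = []  # ordered list of (event_id, payload) changes

    def upsert(self, event_id, payload):
        self._log.append((event_id, payload))

    def list_events(self, sync_token=None):
        # No token means a full sync; a token means "changes since then".
        start = 0 if sync_token is None else int(sync_token)
        return {"items": self._log[start:], "nextSyncToken": str(len(self._log))}


class SyncClient:
    def __init__(self, server):
        self.server = server
        self.events = {}
        self.sync_token = None  # a real client must persist this between runs

    def sync(self):
        """Run a full sync the first time, an incremental sync afterwards."""
        response = self.server.list_events(self.sync_token)
        for event_id, payload in response["items"]:
            self.events[event_id] = payload
        self.sync_token = response["nextSyncToken"]  # store the new token
        return len(response["items"])


server = FakeCalendarServer()
server.upsert("e1", "Standup")
server.upsert("e2", "Review")
client = SyncClient(server)
print(client.sync())  # 2: initial full sync
server.upsert("e1", "Standup (moved)")
print(client.sync())  # 1: only the change since the last token
```

A real client would also need a fallback to a full sync when the server rejects a stored token.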
More information and examples on how to synchronize efficiently can be found in the documentation referenced above.
From Firebase's FAQ:
What happens to my app if I lose my network connection?
Firebase transparently reconnects to the Firebase servers as soon as
you regain connectivity. In the meantime, all Firebase operations done
locally by your app will immediately fire events (...). Once
connectivity is reestablished, you’ll receive the appropriate set of
events so that your client “catches up” with the current server state
Then what happens if I go offline and keep modifying my local data, then come back online and other clients have performed different changes? Which one will ultimately be saved?
If the data on the server gets overwritten, does it mean older data can replace newer data?
If the newer data added online is kept, do I know that the data submitted while offline has been discarded?
When your client comes back online, after an offline period and writing data, the behavior of those changes will be determined by which method you used to write them:
The set(), setWithPriority(), remove(), and push() methods are last-write-wins. This means that if offline client A makes a change at t=0, and online client B makes a change at t=10, then offline client A's changes will overwrite client B's changes upon reconnection. Note that this specifically applies to the changes that were made (i.e. set /a/b/c to 1), not the entire Firebase.
The transaction() method, however, is built specifically for handling conflicts. When offline client A reconnects, your transaction update function will re-run and apply the new change to your Firebase data.
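The difference between the two behaviors can be illustrated with a small simulation in Python. This is not the Firebase SDK: Store, compare_and_set, and the transaction helper below are hypothetical stand-ins that only mimic the semantics described above.

```python
class Store:
    """Tiny in-memory stand-in for a shared data store (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.version = {}  # path -> number of writes seen so far

    def set(self, path, value):
        # Last-write-wins: whoever writes last replaces the value.
        self.data[path] = value
        self.version[path] = self.version.get(path, 0) + 1

    def compare_and_set(self, path, expected_version, value):
        # Succeeds only if nobody wrote to the path since it was read.
        if self.version.get(path, 0) != expected_version:
            return False
        self.set(path, value)
        return True


def transaction(store, path, update):
    """Re-run `update` against the current value until the write applies."""
    while True:
        seen = store.version.get(path, 0)
        new_value = update(store.data.get(path))
        if store.compare_and_set(path, seen, new_value):
            return new_value


# Last-write-wins: B writes while A is offline, then A's queued write replays.
store = Store()
store.set("/a/b/c", 10)  # online client B
store.set("/a/b/c", 1)   # offline client A, replayed on reconnect
print(store.data["/a/b/c"])  # 1: B's change is gone

# Transaction: A's update function runs against whatever value is current.
store.set("/counter", 5)  # written by B while A was offline
print(transaction(store, "/counter", lambda current: (current or 0) + 1))  # 6
```

The update function receives the current value rather than A's stale local copy, which is why B's write survives in the second case.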
In most applications, users are appending data to lists or modifying individual user state, but not modifying the same piece of data. In the event that multiple users are modifying the same piece of data, you'll want to use transaction() whether you're offline or not.
Generally speaking, Firebase has been built to handle going offline and online automatically and so you shouldn't have to write application code to detect and handle that case.
I want to create a queue where clients can put in requests, then server worker threads can pull them out as they have resources available.
I'm exploring how I could do this with a Firebase repository, rather than an external queue service that would then have to inject data back into Firebase.
With security and validation tools in mind, here is a simple example of what I have in mind:
user pushes a request into a "queue" bucket
a server pulls out the request and deletes it (how do I ensure only one server gets the request?)
server validates data and retrieves from a private bucket (or injects new data)
server pushes data and/or errors back to the user's bucket
A simplified example of where this might be useful would be authentication:
user puts authentication request into the public queue
his login/password goes into his private bucket (a place only he can read/write into)
a server picks up the authentication request, retrieves login/password, and validates against the private bucket only the server can access
the server pushes a token into user's private bucket
(certainly there are still some security loopholes in a public queue; I'm just exploring at this point)
Some other examples for usage:
read-only status queue (user status is communicated via private bucket, server writes it to a public bucket which is read-only for the public)
message queue (messages are sent via user, server decides which discussion buckets they get dropped into)
So the questions are:
Is this a good design that will integrate well into the upcoming security plans? What are some alternative approaches being explored?
How do I get all the servers to listen to the queue, but only one to pick up each request?
Wow, great question. This is a usage pattern that we've discussed internally so we'd love to hear about your experience implementing it (support@firebase.com). Here are some thoughts on your questions:
Authentication
If your primary goal is actually authentication, just wait for our security features. :-) In particular, we're intending to have the ability to do auth backed by your own backend server, backed by a Firebase user store, or backed by 3rd-party providers (Facebook, Twitter, etc.).
Load-balanced Work Queue
Regardless of auth, there's still an interesting use case for using Firebase as the backbone for some sort of workload balancing system like you describe. For that, there are a couple approaches you could take:
As you describe, have a single work queue that all of your servers watch and remove items from. You can accomplish this using transaction() to remove the items. transaction() deals with conflicts so that only one server's transaction will succeed. If one server beats a second server to a work item, the second server can abort its transaction and try again on the next item in the queue. This approach is nice because it scales automatically as you add and remove servers, but there's an overhead for each transaction attempt since it has to make a round-trip to the Firebase servers to make sure nobody else has grabbed the item from the queue already. If the time it takes to process a work item is much greater than the time to do a round-trip to the Firebase servers, this overhead probably isn't a big deal. If you have lots of servers (i.e. more contention) and/or lots of small work items, the overhead may be a killer.
Push the load-balancing to the client by having them choose randomly among a number of work queues. (e.g. have /queue/0, /queue/1, /queue/2, /queue/3, and have the client randomly choose one). Then each server can monitor one work queue and own all of the processing. In general, this will have the least overhead, but it doesn't scale as seamlessly when you add/remove servers (you'll probably need to keep a separate list of work queues that servers update when they come online, and then have clients monitor the list so they know how many queues there are to choose from, etc.).
Personally, I'd lean toward option #2 if you want optimal performance. But #1 might be easier for prototyping and be fine at least initially.
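Both options can be sketched with a local simulation in Python. This is not the Firebase API: the lock below merely stands in for the atomicity that transaction() provides, and WorkQueue, try_claim, run_workers, and pick_queue are hypothetical names.

```python
import random
import threading


class WorkQueue:
    def __init__(self, items):
        self._items = dict(items)      # item_id -> payload
        self._lock = threading.Lock()  # stands in for the store's atomic transaction

    def try_claim(self, item_id):
        """Remove and return the item, or None if another worker already took it."""
        with self._lock:
            return self._items.pop(item_id, None)


def run_workers(queue, item_ids, worker_count):
    """Option 1: every worker races for every item; returns (worker, item_id) pairs."""
    claimed = []

    def worker(name):
        for item_id in item_ids:
            if queue.try_claim(item_id) is not None:
                claimed.append((name, item_id))
            # Otherwise another worker won; move on to the next item.

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed


def pick_queue(queue_count, rng=random):
    """Option 2: the client picks one of /queue/0 .. /queue/N-1 at random."""
    return "/queue/%d" % rng.randrange(queue_count)


item_ids = ["job-%d" % i for i in range(20)]
claimed = run_workers(WorkQueue((i, {}) for i in item_ids), item_ids, worker_count=4)
print(len(claimed))   # 20: each item was claimed exactly once
print(pick_queue(4))  # e.g. /queue/2
```

In option 1 every losing worker still pays for the failed claim attempt, which is the per-item overhead described above; option 2 avoids the race entirely but needs the list of queues to stay in sync with the set of live servers.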
In general, your design is definitely on the right track. If you experiment with implementation and run into problems or have suggestions for our API, let us know (support@firebase.com :-)!
This question is pretty old but in case someone makes it here anyway...
Since mid 2015 Firebase offers something called the Firebase Queue, a fault-tolerant multi-worker job pipeline built on Firebase.
Q: Is this a good design that will integrate well into the upcoming security plans?
A: Your design suggestion fits perfectly with Firebase Queue.
Q: How do I get all the servers to listen to the queue, but only one to pick up each request?
A: Well, that is pretty much what Firebase Queue does for you!
References:
Introducing Firebase Queue (blog entry)
Firebase Queue (official GitHub-repo)