Delegating Tasks for Mission Critical Application - backgroundworker

I'm working on a mission critical application.
The application fetches stock market data from different exchanges such as NYSE and NASDAQ using a third-party service.
Customers can come to the application and add their portfolio (the companies whose shares they hold).
They can then set alerts, e.g. "Notify me when the AAPL price goes above $xxx on NASDAQ" or "when the MSFT price goes below $zzz on NYSE".
I have a cron job that fetches market data from the third-party service every minute for all the tickers users have added (AAPL, GOOG, MSFT, etc.).
After I get the data, I fetch all the alerts that users have created and send notifications via email, SMS, Pushover, Twitter, Facebook message, etc. I also add each notification to the app's database so the user can see it in the app when they log in.
Since this is a time-sensitive application, a failure to fetch data may result in a big loss, because customers are paying for time-critical data.
Currently, I'm pushing all the notification sending to a queue. A worker (on my server) sends the notifications.
Are there better ways to delegate as much work as possible to third-party servers?
Would you recommend using an Iron.io worker to send the notifications as well,
and maybe also to fetch the data from the market?
Thanks!

Architecturally there are a number of approaches, but it sounds as if you're making the right choices. Using a queue to decouple the producer from the notification process makes sense. It gives you a cleaner service-oriented architecture in which you can change, update, and evolve the various parts of the app independently without worrying too much about tightly coupled code.
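To make the decoupling concrete, here is a minimal in-process sketch of the pattern, assuming Python's standard `queue` module stands in for the real message queue. The alert fields (`ticker`, `direction`, `threshold`) and the function names are made up for illustration; they are not taken from the asker's code or from any vendor's API.

```python
import queue
import threading


def triggered_alerts(alerts, prices):
    """Return the alerts whose price condition is met by the latest quotes."""
    hits = []
    for alert in alerts:
        price = prices.get(alert["ticker"])
        if price is None:
            continue  # no quote was fetched for this ticker in this cycle
        above = alert["direction"] == "above" and price > alert["threshold"]
        below = alert["direction"] == "below" and price < alert["threshold"]
        if above or below:
            hits.append(alert)
    return hits


def notification_worker(jobs, send):
    """Consume queued alerts and deliver them until a None sentinel arrives."""
    while True:
        alert = jobs.get()
        try:
            if alert is None:
                return  # sentinel: shut the worker down
            send(alert)  # hypothetical delivery callback (email, SMS, ...)
        finally:
            jobs.task_done()
```

The producer (the cron job) only calls `triggered_alerts` and puts the results on the queue, so the worker can be moved to another machine or a hosted service without touching the fetching code.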
That said, your question is specifically around offloading to third parties. There are third parties that can abstract the notification part out of your code. I'm not super familiar with them but there are many options: PubNub, Pusher, Twilio, SendGrid, Mailgun, AWS SNS, etc.
I work for Iron.io. We have many customers doing exactly what you're trying to accomplish: creating workers that become little mini-services and calling them from either push events, scheduled tasks, or on-demand. This frees you up from having to deal with the queuing, routing, scheduling, and worker/background server capacity.
We're happy to help you architect things right from the beginning, just reach out to support#iron.io.

Related

Best strategy to develop back end of an app with large userbase, taking into account limitations of bandwidth, concurrent connections etc.?

I am developing an Android app which basically does this: on the landing (home) page it shows a couple of words. These words need to be updated on a daily basis. Secondly, there is an 'experiences' tab in which a list of user experiences (around 500) shows up with their profile pic, description, etc.
This basic app is expected to get around 1 million users daily, who will open the app at least once a day to see those couple of words. Many may occasionally open up the experiences section.
Thirdly, the app needs to have a push notification feature.
I am planning to purchase managed WordPress hosting, set up a website, add a post each day with those couple of words, and use the JSON API to extract those words and display them on the app's home page. Similarly for the experiences, I will add each as a WordPress post and extract them from the WordPress database. The reason I am choosing WordPress is that it has ready-made interfaces for data entry, which will save me time and effort.
But I am stuck on this: will the WordPress DB be able to handle such a large number of queries? With such a large userbase and spiky traffic, I suspect I might exceed the maximum concurrent connections limit.
What's the best strategy in my case? Should I use WordPress, Firebase, or some other service? I also need to make sure the scheme is cost-effective.
My app is basically very similar to this one:
https://play.google.com/store/apps/details?id=com.ekaum.ekaum
For push notifications, I am planning to use third party services.
Kindly suggest the best strategy I should go with for designing the back end of this app.
Thanks to everyone out there in advance who are willing to help me in this.
I have never used Wordpress, so I don't know if or how it could handle that load.
You can still use WP for data entry, and write a scheduled function that would use WP's JSON API to copy that data into Firebase.
RTDB-vs-Firestore scalability states that RTDB can handle 200 thousand concurrent connections and Firestore 1 million concurrent connections.
However, if I get it right, your app doesn't need connections to be active (i.e. receive real-time updates). You can get your data once, then close the connection.
For RTDB, Enabling Offline Capabilities on Android states that
On Android, Firebase automatically manages connection state to reduce bandwidth and battery usage. When a client has no active listeners, no pending write or onDisconnect operations, and is not explicitly disconnected by the goOffline method, Firebase closes the connection after 60 seconds of inactivity.
So the connection should close by itself after 1 minute, if you remove your listeners, or you can force close it earlier using goOffline.
For Firestore, I don't know if it happens automatically, but you can do it manually.
In Firebase Pricing you can see that 100K Firestore document reads cost $0.06, so 1M reads (for the two words) should cost $0.60 plus some network traffic. In RTDB, the cost depends on the volume of data transferred, so it requires some calculation, but it shouldn't be much. I am not familiar with the finer details of the pricing, so you should do some more research.
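The arithmetic behind that estimate can be sketched as follows. The per-100K price is the figure quoted above, not a value looked up from a current price list, and the sketch ignores any free quota and network charges.

```python
from decimal import Decimal


def firestore_read_cost(reads, price_per_100k=Decimal("0.06")):
    """Estimate the cost in dollars of a number of Firestore document reads."""
    return Decimal(reads) / Decimal(100_000) * price_per_100k


daily = firestore_read_cost(1_000_000)         # one read per user per day
monthly = firestore_read_cost(30 * 1_000_000)  # assuming a 30-day month
```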
In the app you mentioned, the experiences don't seem to change very often. You might want to try to build your own caching manually, and add the required versioning info in the daily data.
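One way to do that manual caching, as a rough sketch: the daily payload carries a version number for the experiences list, and the client refetches the list only when that number changes. The field name `experiences_version` and the `fetch_experiences` callback are assumptions made for illustration.

```python
class ExperienceCache:
    """Keep the experiences list locally and refetch only on a version change."""

    def __init__(self, fetch_experiences):
        self._fetch = fetch_experiences  # hypothetical callback hitting the backend
        self._version = None
        self._items = []

    def get(self, daily_payload):
        """Return the experiences matching the version in today's payload."""
        version = daily_payload["experiences_version"]
        if version != self._version:
            self._items = self._fetch()
            self._version = version
        return self._items
```

With this scheme most daily opens cost only the small daily-words read; the roughly 500 experiences are downloaded again only when the version is bumped on the server.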
Edit:
It would possibly be more efficient and less costly if you used Firebase Hosting, instead of RTDB/Firestore directly. See Serve dynamic content and host microservices with Cloud Functions and Manage cache behavior.
In short, you create an HTTP function that reads your database and returns the data you need. You configure hosting to call that function, and configure the cache so that subsequent requests are served the cached result via hosting (without extra function invocations).
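The caching part comes down to the `Cache-Control` header that the function returns. Below is a framework-neutral sketch of such a response; the function name and the TTL values are assumptions, and a real Cloud Function would set the same header through its own response object.

```python
import json


def daily_words_response(words, browser_ttl=300, cdn_ttl=3600):
    """Build (status, headers, body) for a response that shared caches may store."""
    body = json.dumps({"words": words})
    headers = {
        "Content-Type": "application/json",
        # max-age applies to the client's cache, s-maxage to shared caches (the CDN)
        "Cache-Control": f"public, max-age={browser_ttl}, s-maxage={cdn_ttl}",
    }
    return 200, headers, body
```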

Firebase connection count with angular bindings?

I've read quite a few posts (including the firebase.com website) on Firebase connections. The website says that one connection is equivalent to approximately 1400 visiting users per month. And this makes sense to me given a scenario where the client makes a quick connection to the Firebase server, pulls down some data, and then closes the connection. However, if I'm using angular bindings (via angularfire), wouldn't each client visit (in the event the user stays on the site for a period of time) be a connection? In this example having 100 users (each of which is making use of firebase angular bindings) connecting to the site at the same time would be 100 connections. If I opted not to use angular bindings, that number could be (in a theoretical sense) 0 if all the clients already made their requests for data and were just idling.
Do I understand this properly?
AngularFire is built on top of Firebase's regular JavaScript/Web SDK. The connection count is fundamentally the same between them: if 100 users are using your application at the same time and you are synchronizing data for each of them, you will have 100 concurrent connections at that time.
The statement that one concurrent connection is the equivalent of about 1400 visits per month is based on the extensive experience that the Firebase people have with how long the average connection lasts. As Andrew Lee stated in this answer: most developers vastly over-estimate the number of concurrent connections they will have.
As said: AngularFire fundamentally behaves the same as Firebase's JavaScript API (because it is built on top of it). Both libraries keep an open connection for a user, so that they can synchronize any changes that occur between the connected users. You can manually drop such a connection by calling goOffline and then reinstate it with goOnline. Whether that is a good approach depends largely on the type of application you're building.
Two examples:
There recently was someone who was building a word game. He used Firebase to store the final score for each game. In his case explicitly managing the connections makes sense, because the connection is only needed for a relatively short time when compared to the time the application is active.
The "hello world" for Firebase programming is a chat application. In such an application it doesn't make a lot of sense to manage the connections yourself. Suppose you briefly connect every 15 seconds and then disconnect again: you're essentially reverting to polling for updates, and you lose one of the bigger benefits of using Firebase, which is that it automatically synchronizes data to connected clients.
So only you can decide whether explicit connection management is best for your application. I'd recommend starting without it (it's simpler) and first testing your application on a smaller scale to see how actual usage holds up to your expectations.

Can I export my Urban Airship push device tokens?

I'm evaluating Urban Airship as a push solution and I was wondering if it's possible to export my device tokens should I decide to stop using their service?
I've noticed they have an API endpoint to download device data (http://docs.urbanairship.com/reference/api/v3/device_information.html#device-token-list-api) but I was wondering if anyone actually went through the process of switching their push solution from UA to an internal solution (i.e. run my own push server and ping old users).
Thank you!
I'm not sure if there is an API call for it, but you could go to Audience->device tokens, and make a script to fetch all of them.
At the company where I work, we decided on a different approach.
All communication with Urban Airship goes through our own backend, where we also store the device tokens sent from the device. That way we can shift to another way of sending push notifications without modifying our apps. It is of course a bit more time-consuming to do the initial development. On the other hand, if you go for the solution you are currently considering, the switch to your own implementation (or another push provider) will probably require several migrations, or at least maintaining two different ways of sending push notifications for a considerable time.
BTW: we have been using UA for almost 3 years, and have been very happy with their service.
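The backend-in-the-middle approach can be sketched roughly like this. The class and method names are hypothetical, and the provider is modelled as a plain callable rather than a real Urban Airship client, since the point is only that the tokens live in your own store.

```python
class PushGateway:
    """Own the device tokens and forward pushes to whichever provider is configured."""

    def __init__(self, provider):
        self._provider = provider  # hypothetical callable: (token, message) -> None
        self._tokens = {}          # user_id -> set of device tokens

    def register(self, user_id, token):
        """Called by the app on launch; the token is stored before any provider sees it."""
        self._tokens.setdefault(user_id, set()).add(token)

    def switch_provider(self, provider):
        """Swap the delivery backend; stored tokens and the apps are unaffected."""
        self._provider = provider

    def notify(self, user_id, message):
        """Send a message to all of a user's devices; return how many were addressed."""
        tokens = sorted(self._tokens.get(user_id, ()))
        for token in tokens:
            self._provider(token, message)
        return len(tokens)
```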

Architecture For A Real-Time Data Feed And Website

I have been given access to a real time data feed which provides location information, and I would like to build a website around this, but I am a little unsure on what architecture to use to achieve my needs.
Unfortunately the feed I have access to will only allow a single connection per IP address, therefore building a website that talks directly to the feed is out - as each user would generate a new request, which would be rejected. It would also be desirable to perform some pre-processing on the data, so I guess I will need some kind of back end which retrieves the data, processes it, then makes it available to a website.
From a front end connection perspective, web services sounds like it may work, but would this also create multiple connections to the feed for each user? I would also like the back end connection to be persistent, so that data is retrieved and processed even when the site is not being visited, I believe IIS will recycle web services and websites when they are idle?
I would like to keep the design fairly flexible - in future I will be adding some mobile clients, so the API needs to support remote connections.
The simple solution would have been to log all the processed data to a database, which could then be picked up by the website, but this loses the real-time aspect of the data. Ideally I would be looking to push the data to the website every time the data changes or new data is received.
What is the best way of achieving this, and what technologies are there out there that may assist here? Comet architecture sounds close to what I need, but that would require building a back end that can handle multiple web based queries at once, which seems like quite a task.
Ideally I would be looking for a C# / ASP.NET based solution with Javascript client side, although I guess this question is more based on architecture and concepts than technological implementations of these.
Thanks in advance for all advice!
Realtime Data Consumer
The simplest solution would seem to be having one component that is dedicated to reading the realtime feed. It could then publish the received data on to a queue (or multiple queues) for consumption by other components within your architecture.
This component (A) would be a standalone process, maybe a service.
Queue consumers
The queue(s) can be read by:
a component (B) dedicated to persisting data for future retrieval or querying. If the amount of data is large you could add more components that read from the persistence queue.
a component (C) that publishes the data directly to any connected subscribers. It could also do some processing, but if you are looking at doing large amounts of processing you may need multiple components that perform this task.
Realtime web technology components (D)
If you are using a .NET stack then it seems like SignalR is getting the most traction. You could also look at XSockets (there are more options in my realtime web tech guide; just search for '.NET').
You'll want to use SignalR to manage subscriptions and then to publish messages to registered clients (PubSub - this SO post seems relevant, maybe you can ask for a bit more info).
You could also look at offloading the PubSub component to a hosted service such as Pusher, who I work for. This will handle managing subscriptions and component C would just need to publish data to an appropriate channel. There are other options all listed in the realtime web tech guide.
All these components come with a JavaScript library.
Summary
Components:
A - .NET service - that publishes info to queue(s)
Queues - MSMQ, NServiceBus etc.
B - Could also be a simple .NET service that reads a queue.
C - this really depends on D since some realtime web technologies will be able to directly integrate. But it could also just be a simple .NET service that reads a queue.
D - Realtime web technology that offers a simple way of routing information to subscribers (PubSub).
If you provide any more info I'll update my answer.
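As a rough illustration of how components A, B and C fit together, here is an in-process sketch that uses Python's standard `queue` module in place of MSMQ or another broker. The class and function names are invented for the example; in a real deployment each component would be its own service, and C would hand messages to the realtime layer (D) rather than to plain callbacks.

```python
import queue


class FeedReader:
    """Component A: the single consumer of the feed; copies each update to every queue."""

    def __init__(self, *out_queues):
        self._out_queues = out_queues

    def on_update(self, update):
        for q in self._out_queues:
            q.put(update)


def drain(q, handle):
    """Hand every item currently in the queue to `handle`; return how many were processed."""
    count = 0
    while True:
        try:
            item = q.get_nowait()
        except queue.Empty:
            return count
        handle(item)
        count += 1


def make_broadcaster(subscribers):
    """Component C: forward one update to every connected subscriber callback."""
    def broadcast(update):
        for deliver in subscribers:
            deliver(update)
    return broadcast
```

Component B is then just `drain(persist_queue, store.append)` or its database equivalent. Because only `FeedReader` touches the feed, the one-connection-per-IP limit is respected no matter how many web clients subscribe.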
A good solution to this would be something like http://rubyeventmachine.com/ or http://nodejs.org/ . It's not ASP.NET, but it can easily solve the issue of distributing real-time data to other users. Since user connections, subscriptions and broadcasting to channels are built in to each, that will make coding the rest super simple. Your clients would just connect over standard TCP.
If you needed clients to poll for updates then you would need a queue system to store info for the next request. That could be a simple array, or a more complicated queue system depending on your requirements and number of users.
There may be solutions for .net that I am not aware of that do the same thing, but those are the 2 I know of.

How can I create a queue with multiple workers?

I want to create a queue where clients can put in requests, then server worker threads can pull them out as they have resources available.
I'm exploring how I could do this with a Firebase repository, rather than an external queue service that would then have to inject data back into Firebase.
With security and validation tools in mind, here is a simple example of what I have in mind:
user pushes a request into a "queue" bucket
a server pulls out the request and deletes it (how do I ensure only one server gets the request?)
server validates data and retrieves from a private bucket (or injects new data)
server pushes data and/or errors back to the user's bucket
A simplified example of where this might be useful would be authentication:
user puts authentication request into the public queue
his login/password goes into his private bucket (a place only he can read/write into)
a server picks up the authentication request, retrieves login/password, and validates against the private bucket only the server can access
the server pushes a token into user's private bucket
(certainly there are still some security loopholes in a public queue; I'm just exploring at this point)
Some other examples for usage:
read-only status queue (user status is communicated via a private bucket, the server writes it to a public bucket which is read-only for the public)
message queue (messages are sent via user, server decides which discussion buckets they get dropped into)
So the questions are:
Is this a good design that will integrate well into the upcoming security plans? What are some alternative approaches being explored?
How do I get all the servers to listen to the queue, but only one to pick up each request?
Wow, great question. This is a usage pattern that we've discussed internally so we'd love to hear about your experience implementing it (support#firebase.com). Here are some thoughts on your questions:
Authentication
If your primary goal is actually authentication, just wait for our security features. :-) In particular, we're intending to have the ability to do auth backed by your own backend server, backed by a Firebase user store, or backed by 3rd-party providers (Facebook, Twitter, etc.).
Load-balanced Work Queue
Regardless of auth, there's still an interesting use case for using Firebase as the backbone for some sort of workload balancing system like you describe. For that, there are a couple approaches you could take:
As you describe, have a single work queue that all of your servers watch and remove items from. You can accomplish this using transaction() to remove the items. transaction() deals with conflicts so that only one server's transaction will succeed. If one server beats a second server to a work item, the second server can abort its transaction and try again on the next item in the queue. This approach is nice because it scales automatically as you add and remove servers, but there's an overhead for each transaction attempt, since it has to make a round-trip to the Firebase servers to make sure nobody else has already grabbed the item from the queue. If the time it takes to process a work item is much greater than the time to do a round-trip to the Firebase servers, this overhead probably isn't a big deal. If you have lots of servers (i.e. more contention) and/or lots of small work items, the overhead may be a killer.
Push the load-balancing to the client by having them choose randomly among a number of work queues. (e.g. have /queue/0, /queue/1, /queue/2, /queue/3, and have the client randomly choose one). Then each server can monitor one work queue and own all of the processing. In general, this will have the least overhead, but it doesn't scale as seamlessly when you add/remove servers (you'll probably need to keep a separate list of work queues that servers update when they come online, and then have clients monitor the list so they know how many queues there are to choose from, etc.).
Personally, I'd lean toward option #2 if you want optimal performance. But #1 might be easier for prototyping and be fine at least initially.
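Both options can be sketched in a few lines. This is not Firebase code: `InMemoryQueueStore` is a hypothetical stand-in whose lock plays the role that transaction() plays against the real service, and all names are invented for the example.

```python
import random
import threading


class InMemoryQueueStore:
    """Stand-in for a shared store with atomic removal (hypothetical, not a Firebase API)."""

    def __init__(self, items):
        self._items = dict(items)  # item_id -> payload (payloads must not be None)
        self._lock = threading.Lock()

    def pending_ids(self):
        with self._lock:
            return sorted(self._items)

    def claim(self, item_id):
        """Atomically remove and return an item, or None if another worker got it first."""
        with self._lock:
            return self._items.pop(item_id, None)


def claim_next(store):
    """Option 1: walk the queue until one claim succeeds; losers move on to the next item."""
    for item_id in store.pending_ids():
        payload = store.claim(item_id)
        if payload is not None:
            return item_id, payload
    return None  # queue is empty, or every item was taken by other workers


def choose_queue_path(queue_count, rng=random):
    """Option 2: the client picks one of /queue/0 .. /queue/<count-1> at random."""
    return f"/queue/{rng.randrange(queue_count)}"
```

In option 1 every failed `claim` corresponds to a wasted round-trip, which is the contention overhead described above; in option 2 there is no contention, but `queue_count` has to be kept in sync with the number of live servers.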
In general, your design is definitely on the right track. If you experiment with implementation and run into problems or have suggestions for our API, let us know (support#firebase.com :-)!
This question is pretty old but in case someone makes it here anyway...
Since mid-2015 Firebase has offered something called Firebase Queue, a fault-tolerant multi-worker job pipeline built on Firebase.
Q: Is this a good design that will integrate well into the upcoming security plans?
A: Your design suggestion fits perfectly with Firebase Queue.
Q: How do I get all the servers to listen to the queue, but only one to pick up each request?
A: Well, that is pretty much what Firebase Queue does for you!
References:
Introducing Firebase Queue (blog entry)
Firebase Queue (official GitHub-repo)

Resources