Recommendation engine for push information delivery - push-notification

I want to develop a system that includes a recommendation engine for push information delivery. I have seen plenty of explanations about using engines like Mahout Taste and Duine. Yet with them, recommended items are obtained only after an input containing a user ID arrives. So such engines seem to be suitable only for web applications/services that serve pull requests from users.
With push messaging, however, I want my server to actively send a recommendation message directly to the particular users/customers for whom, according to the recommendation algorithm, the item is relevant. Delivery would be performed as soon as a new item (product/content) becomes available in the database.
My question: is it possible/recommended to use the existing engines, like Mahout or Duine? What algorithms are good for this?

What's the distinction you're making -- whether you push to or pull from a user, you have their user ID, presumably. This doesn't affect recommendations. You can recommend in either case, whether you want to push or pull.
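To make the push case concrete, here is a minimal sketch of inverting the usual query: when a new item arrives, score it against every known user and keep those above a threshold. This is a simple content-based scoring swapped in for illustration, not the Mahout or Duine API; the `User` class, the tag weights, and the threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    # Hypothetical preference weight per tag, e.g. learned from past behaviour.
    preferences: dict[str, float]


def score(user: User, item_tags: set[str]) -> float:
    """Sum the user's preference weights over the tags of the new item."""
    return sum(user.preferences.get(tag, 0.0) for tag in item_tags)


def users_to_notify(users: list[User], item_tags: set[str], threshold: float) -> list[str]:
    """Return IDs of users whose score reaches the threshold, best score first."""
    scored = [(score(u, item_tags), u.user_id) for u in users]
    relevant = [(s, uid) for s, uid in scored if s >= threshold]
    relevant.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uid for _, uid in relevant]


users = [
    User("alice", {"jazz": 0.9, "rock": 0.1}),
    User("bob", {"rock": 0.8}),
    User("carol", {"jazz": 0.3, "blues": 0.3}),
]
# Run this whenever a new item lands in the database, then push to `recipients`.
new_item_tags = {"jazz", "blues"}
recipients = users_to_notify(users, new_item_tags, threshold=0.5)
```

With a real engine, the per-user score would come from the engine's own preference estimate for the (user, item) pair; the loop over users stays the same.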


Validate data before insertion in Firebase

I'm building an app which uses user contributed content.
The contribution by each user should be available to all others in real time.
I was looking into the Firebase Realtime Database for this.
However, when a user contributes content, there are quite heavy validations and calculations (read: server side) to be done on the data before making it available to others.
Is it possible to have server-side validation in Firebase? Or should I look for alternatives?
Initially, Firebase did not have a feature for server-side processing/calculations. All your processing had to be done on the client side.
Now they've introduced a feature called Cloud Functions for Firebase. It's a really useful addition that lets you write server-side code without the hassle of managing servers or instances. Read more about it in the Cloud Functions for Firebase documentation.
Also, the YouTube playlist by Jen Person is a great start, and the Cloud Functions samples include examples similar to your use case.
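One common shape for this is validate-then-publish: clients write to a pending area, and server-side code moves only valid entries to the area everyone reads. The sketch below shows just that logic with plain dictionaries standing in for database paths; it does not use the Firebase or Cloud Functions API, and the field names and limits are made up for illustration.

```python
def validate_contribution(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the contribution is acceptable."""
    problems = []
    title = data.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("title must be a non-empty string")
    body = data.get("body")
    if not isinstance(body, str) or len(body) > 1000:
        problems.append("body must be a string of at most 1000 characters")
    return problems


def process_pending(pending: dict, published: dict) -> dict:
    """Move valid contributions from `pending` to `published`; return rejected ones with reasons."""
    rejected = {}
    for key in list(pending):
        data = pending.pop(key)
        problems = validate_contribution(data)
        if problems:
            rejected[key] = problems
        else:
            published[key] = data
    return rejected


# `pending` and `published` stand in for two database locations.
pending = {
    "c1": {"title": "Hello", "body": "First post"},
    "c2": {"title": "", "body": "x"},
}
published = {}
rejected = process_pending(pending, published)
```

In a real deployment the function body would run in a trigger fired by a write to the pending location, and other clients would only be allowed to read the published one.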

Is Graph Database a good use case for a messaging system?

I am diving into the universe of graph databases and I'm simply amazed by how powerful they are. I chose OrientDB for my first use case, but I'm not certain whether a graph model suits this specific section of my app.
A User follows another User.
A User can be part of a Conversation.
A Message can be sent (with a timestamp) to a Conversation.
A Message can be read (with a timestamp) by a User.
I'm worried about ending up with millions (even billions) of Message nodes and sent or read edges, thus affecting the overall performance of the system. The messaging section is not the main concept of the app; it is just a small portion of it.
Would it be a problem for OrientDB to handle? Is it a good application for a Graph Database?
Thank you all for your patience,
I don't think a graph database is the best candidate for a messaging system. Messaging systems are relational in nature and suit the likes of MySQL.
You wouldn't be surprised to hear, though, that Facebook uses document-oriented databases for their messaging system.
Facebook is currently the largest installation of Cassandra, which is excellent for scalability. We already know that from Facebook. Plus, it's great for storing messages due to its distributed nature.
Take a look at the suggested way to use OrientDB with a similar use case:
The choice of a graph database ultimately depends on what you are going to do with the data.
In your case, do you plan to use any graph-processing algorithms, or graph traversals?
An edge in graph theory represents a relationship between nodes (objects). In the case of a timestamp for read and sent, it does not really fit and you will end up with billions of edges, killing the performance of the system.
The follower concept fits a graph database perfectly. As for the Conversation, it could be an attribute of the node. Do you need to create an edge to represent ownership just to query the Conversation ID?
If messaging is just a small part of your application, I suggest using the best tool for each need: either combine a column-oriented database (Cassandra) for messages with OrientDB for relationships, or use OrientDB as in the Chat use case (Thanks #Lvca)
What we suggest is to avoid using Edges or Vertices connected with
edges for messages. The best way is using the document API by creating
one class per chat room, with no index, to have super fast access to
last X messages.
Also wondering about this topic but I think any RDBMS will be better for this task.
Also, chat is kind of a log. So Elasticsearch (and similar) can be a perfect match for storing terabytes of chat data.
A lot of dissonant answers here. Speaking from experience, I've built a few messaging systems on plain MongoDB instances with no issues whatsoever handling hundreds or thousands of concurrent users (with chat groups).
I'd say go with Cassandra, a battle-tested database, if you're very worried about scalability (it's practically built in), or with one of the newcomers like MongoDB, which is constantly being upgraded and on top of which you can fairly easily add search via Elasticsearch. MongoDB supports scaling via sharding, so it can scale horizontally to your needs.
Just be sure not to bottleneck on your backend service; implement as many operations asynchronously as possible.
You can even go as far as adding a streaming platform like Kafka, which is excellent for CDC (change data capture) and will persist your message log until it is read by a service that actually writes messages to your database of choice, adding to your resiliency.
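For the RDBMS suggestion, a minimal sketch of what the schema could look like, using Python's built-in `sqlite3` so it runs anywhere. The table and column names are assumptions; the point is that "read" becomes one row per (message, reader) rather than a graph edge, and that an index on (conversation, timestamp) makes "last X messages" cheap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (
    id              INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL,
    sender_id       INTEGER NOT NULL,
    body            TEXT    NOT NULL,
    sent_at         INTEGER NOT NULL  -- Unix timestamp
);
-- One row per (message, reader) instead of a graph edge per read.
CREATE TABLE message_read (
    message_id INTEGER NOT NULL REFERENCES message(id),
    user_id    INTEGER NOT NULL,
    read_at    INTEGER NOT NULL,
    PRIMARY KEY (message_id, user_id)
);
CREATE INDEX idx_message_conversation ON message (conversation_id, sent_at);
""")


def last_messages(conversation_id: int, limit: int) -> list[str]:
    """Return the bodies of the newest `limit` messages, oldest first."""
    rows = conn.execute(
        "SELECT body FROM message WHERE conversation_id = ? "
        "ORDER BY sent_at DESC, id DESC LIMIT ?",
        (conversation_id, limit),
    ).fetchall()
    return [body for (body,) in reversed(rows)]


conn.executemany(
    "INSERT INTO message (conversation_id, sender_id, body, sent_at) VALUES (?, ?, ?, ?)",
    [(1, 10, "hi", 100), (1, 11, "hello", 101), (2, 10, "other room", 102), (1, 10, "bye", 103)],
)
recent = last_messages(1, 2)
```

The follower relationships can still live in the graph database; only the high-volume message log moves here.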

Is there a good way to link registered users' emails with data in google analytics?

If I build a website for my new awesome mobile app (or web service or whatever) I might want to do a slow launch, sending email invites to the first x people to register on the site.
Is there a good way to link each registered email to the corresponding data in google analytics (or any similar service), and query them based on location, language, etc.?
Maybe the Spanish version isn't quite done yet, so I don't want to invite people who signed up using a Spanish-language browser. Or maybe my app is location-dependent (like timetables for buses) and just doesn't work at all outside of my home town.
I really want to have a simple email-only "registration".
It is completely possible, although it may breach some of GA's terms of use if done wrong.
You should not store email addresses in any way as part of your GA data because it would be considered personally identifiable data. However, there is nothing saying that you couldn't store a kind of GUID for each user, and then compare that with email addresses offline - although the user should be made aware that any actions they take while using your service/application/whatever are being tracked with the capability of being personally identified.
As far as getting the actual data that you are discussing, language and location are stored by GA by default, so no headache there!
The best way to store the user's GUID would probably be in a custom dimension. How you do this is going to depend on how you build your product. I had to write a tracking library using the Measurement Protocol for an AS3 project a while back because there isn't a supported AS3 library anymore. If you are using JavaScript, it will be much easier, as Google offers native JS libraries to handle web analytics.
Finally, try taking a look at the documentation. It's pretty easy to understand.
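As a rough sketch of the GUID approach: keep the email-to-GUID mapping on your own side, and send only the GUID to GA in a custom dimension. The parameter names below (`v`, `tid`, `cid`, `t`, `dp`, `cd1`) follow the Universal Analytics Measurement Protocol and may not match newer GA versions; the tracking ID is a placeholder, and using custom dimension index 1 is an assumption about how the property is configured. The code only builds the request body; it does not send anything.

```python
import uuid
from urllib.parse import urlencode, parse_qs

# Kept offline on your own server; never sent to Google Analytics.
guid_by_email: dict[str, str] = {}


def guid_for(email: str) -> str:
    """Return a stable random GUID for an email address, creating one on first use."""
    if email not in guid_by_email:
        guid_by_email[email] = str(uuid.uuid4())
    return guid_by_email[email]


def build_hit(tracking_id: str, client_id: str, page: str, user_guid: str) -> str:
    """Build a URL-encoded pageview body with the GUID in custom dimension 1."""
    return urlencode({
        "v": "1",            # protocol version
        "tid": tracking_id,  # placeholder property ID
        "cid": client_id,    # anonymous client ID
        "t": "pageview",
        "dp": page,
        "cd1": user_guid,    # assumes custom dimension index 1 is set up for this
    })


guid = guid_for("someone@example.com")
payload = build_hit("UA-XXXXX-Y", str(uuid.uuid4()), "/signup", guid)
fields = parse_qs(payload)  # decode again to inspect what would be sent
```

Later, you would export GA data by that dimension (language, location and so on come along by default) and join it to `guid_by_email` offline to decide whom to invite.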

Will Google block my access if I use their features without token?

I'm using this link
to fetch feeds using Google's algorithm. As you can see I'm not adding any other parameters, just fetching the returned data in JSON format. My app will be heavily used hopefully and if I send a lot of requests to this link, will Google block my access or something?
Is there anything I can include, like userip, url for my app (so if they have problem to just contact me) or something else?
The most basic answer to your question is that Google will change its Terms of Service whenever it likes, and you've got no say in the matter. So if it's allowed today, it might not be allowed tomorrow, at Google's whim.
On this issue, though, you seem fairly safe. From the Terms of Service (this is the general document, since Reader doesn't seem to have a specific one):
Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.
Google provides RSS and Atom. They provide these feeds, so I assume they expect that they'll be used. They don't say that it's a misuse to point someone else at those feeds, so it looks OK for now, but they could add such a clause at any time.
All online services are subject to the terms and conditions of their providers. So, as others have said, they may be OK with your use today, but they can change their mind any time down the line. I doubt including a URL, email, or other contact info will help, because when these services change, the providers don't notify every user; they just announce the change publicly. Usually they give several months' notice so that users can adapt their applications, but this is not standardized or enforced, so there is no guarantee. One example would be the fairly recent discontinuance of the Google Finance API (for which no replacement has been announced).
The safest approach would be to design your app so that the feature that uses Google's functionality is decoupled as much as possible from the rest of your app. Then, if the availability of the service changes (i.e. it's no longer available at all), you can adapt your app to use some other source for the feeds with minimal impact on the rest of the app. Design for change and plan for the worst.
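A minimal sketch of that decoupling, assuming hypothetical class names: the rest of the app depends only on a small `FeedSource` interface, and the provider behind it can be swapped or given a fallback without touching callers. The two sources here are stand-ins and do no real network access.

```python
from typing import Protocol


class FeedSource(Protocol):
    def fetch(self, feed_url: str) -> list[str]:
        """Return the entry titles of the feed at `feed_url`."""
        ...


class FailingSource:
    """Stands in for a third-party service that has been withdrawn."""

    def fetch(self, feed_url: str) -> list[str]:
        raise ConnectionError("service no longer available")


class StaticSource:
    """Stands in for a replacement source, e.g. your own feed parser."""

    def fetch(self, feed_url: str) -> list[str]:
        return [f"entry from {feed_url}"]


class FeedReader:
    """The rest of the app talks only to this class, never to a provider directly."""

    def __init__(self, sources: list[FeedSource]) -> None:
        self.sources = sources

    def titles(self, feed_url: str) -> list[str]:
        # Try each configured source in order and use the first that works.
        for source in self.sources:
            try:
                return source.fetch(feed_url)
            except ConnectionError:
                continue
        return []


reader = FeedReader([FailingSource(), StaticSource()])
titles = reader.titles("http://example.com/rss")
```

If the provider disappears, only the list passed to `FeedReader` changes.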

Better way to notify the users about workflow items

I want to notify the users about their assignment list/status of their work items via notification.
Instead of sending email notification, is it possible to show the notifications in Tridion itself? Say for example: having new item like "Notifications" under shortcuts-->mytasks?
Or is there any better way to notify the users apart from email, because the users don't want to receive so many mails in their mailbox?
Yes. The list of items shown under Shortcuts is extensible, just like every other list I've ever looked at in the Tridion GUI.
Have a look at this blog post from Jaime to get started. The topic of how to extend it was also covered in this question, but Jaime's tutorial is probably a better starting point.
Once you get some experience with writing this extension, you'll probably run into questions similar to the one Nuno asked here (and that was answered expertly by Jaime and Boris).
In general I like to think of workflow notification in two broad groups: active and passive. Under active notifications I really only include email, but you could expand the concept to push notifications to an iPhone app, text messages, etc. With active notifications, it is very common for users to get fed up with too many alerts from the system, so it is important to design it flexibly enough that your users do not feel bombarded. The most successful implementation I have built allowed users to say how often they receive notifications. Typically they set this to every 24 hours, and they receive a summary email of their pending assignments each morning. This means users who are very active and use their task list regularly never receive emails, as they normally get to the items before the following day.
Moving to passive forms of notification: if you keep in mind that you can expose a user's task list using the APIs that SDL has provided, you could think about implementing the following:
Create an RSS feed for the user's assignment list
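The digest rule described above can be sketched in a few lines. This is only the decision logic, with hypothetical names and times expressed in hours for simplicity; fetching the pending assignments from Tridion and sending the email are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UserSettings:
    user: str
    digest_interval_hours: int = 24   # how often the user wants a summary
    last_digest_at: float = 0.0       # time of the previous digest, in hours
    pending_tasks: list[str] = field(default_factory=list)


def build_digest(settings: UserSettings, now_hours: float) -> Optional[str]:
    """Return the summary text if a digest is due and there is something to report."""
    due = now_hours - settings.last_digest_at >= settings.digest_interval_hours
    if not due or not settings.pending_tasks:
        return None  # users who already cleared their list get no email
    settings.last_digest_at = now_hours
    lines = [f"You have {len(settings.pending_tasks)} pending assignment(s):"]
    lines += [f"- {task}" for task in settings.pending_tasks]
    return "\n".join(lines)


busy = UserSettings("editor", pending_tasks=["Review homepage", "Approve banner"])
idle = UserSettings("author")
first = build_digest(busy, now_hours=24)
second = build_digest(busy, now_hours=30)   # too soon after the first digest
nothing = build_digest(idle, now_hours=24)  # no pending work
```

Running this check on a schedule gives each user at most one email per chosen interval, and none at all when their list is empty.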
Create a Widget/Portlet for a company intranet to show a user their assignments
Create some kind of desktop or mobile app which can pull the data
Set the default start screen of the CMS to be the Task List rather than the Dashboard
The last of these options has been an "out of the box" offering from Tridion for a long time (but I think it was dropped by mistake at some point with Tridion 2009 or 2011). However, #Alvin has recently answered one of my other questions, which may help solve this issue (although it may not be supported). Essentially you can set the <defaultpage> node in the CME.config to /Views/Dashboard/Dashboard.aspx#locationId=cme:workitems. This will make the UI automatically open on the work list (BUT I REPEAT... THIS MAY NOT BE SUPPORTED).
