Web feed RSS and Atom: both inefficient? - rss

As far as I understand, with both RSS and Atom web feeds the client requests content from the server at periodic intervals. The client checks for updates whether or not there is new content.
Wouldn't it be more efficient the other way round? Let the server announce new updates. In this scenario, the server would have to keep track of the clients and of which updates each one has received. It would also have to send a message to each one. Still, it looks more efficient if client and server did not communicate when there is nothing new.
Is there a reason why web-feeds are the way they are?

This model is not inherent to feeds (RSS or Atom) but to HTTP itself, where a client queries a server to get data. At this point, that is the only way in a pure client -> server model to determine whether any new or updated data is available.
Now, in the context of servers querying other servers, PubSubHubbub solves that with webhooks. Basically, in addition to polling a given resource, a server can "subscribe" to it by providing a webhook, which will be called when the feed changes or is updated. This way the subscriber does not have to poll the feed over and over again.
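To make the "subscribe by providing a webhook" step more concrete, here is a minimal sketch of the form-encoded body a subscriber would POST to a hub. The `hub.mode`, `hub.topic`, `hub.callback` and `hub.lease_seconds` parameter names come from the PubSubHubbub protocol; the function name and the example URLs are made up for illustration, and the actual HTTP call is left out.

```python
from urllib.parse import urlencode, parse_qs

def build_subscription_request(topic_url, callback_url, lease_seconds=None):
    """Return the form-encoded body of a subscription request.

    topic_url is the feed to watch, callback_url is the subscriber's webhook.
    Both are supplied by the caller; nothing is sent over the network here.
    """
    params = {
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    }
    if lease_seconds is not None:
        # Optional hint for how long the subscription should stay active.
        params["hub.lease_seconds"] = str(lease_seconds)
    return urlencode(params)

# Hypothetical URLs, for illustration only.
body = build_subscription_request(
    "https://example.com/feed.atom",
    "https://subscriber.example.org/webhook",
    lease_seconds=86400,
)
print(parse_qs(body)["hub.mode"])  # ['subscribe']
```

In a real exchange the hub then verifies the subscription by calling the webhook with a challenge that the subscriber has to echo back, so the callback URL must be reachable from the hub.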

Related

Database rollback on API response failure

A customer of ours is very persistent that they expect this "from any API" (meaning they don't want to pay for the changes). I seem to have trouble finding clear information on this though.
Say we have an API that creates an appointment for a calendar. Server-side everything was successful, data is committed to the database. API tries to send the HTTP 201 (Created) response, but something goes wrong there. Client ignores the response, or connection dropped, ...
They want our API to undo the database changes in that particular situation.
The question is not how to do this, but rather if this is something most APIs do? Is this standard behavior? Or something similar like refusing duplicate create requests?
The difficult part, of course, is actually knowing that the API has failed to deliver the response. As for the crux of the question: no, this is not commonly implemented behavior. If the user willingly submitted the data, you can go ahead and store it. If the response doesn't arrive because of a timeout (you are not responsible for the client "ignoring" the response), the client-side code can refresh on failure and load fresh data. The user can also delete the submitted data themselves (provided you offer an endpoint for that).
Depending on the database, it is possible to make all database changes made by an API call reversible. For example, with SQL you can use SQL transactions with commit, rollback and savepoints. There is most likely a similar mechanism available for NoSQL databases.
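As a rough illustration of commit and rollback, here is a sketch using Python's built-in `sqlite3` module and an in-memory database. The `appointments` table, the `create_appointment` function and the `fail_before_commit` switch are all invented for this example. Note what it does and does not show: a rollback can only undo work that has not been committed yet, so it does not by itself cover the case where the commit succeeded and only the HTTP response got lost.

```python
import sqlite3

def create_appointment(conn, title, fail_before_commit=False):
    """Insert one appointment; undo the insert if anything fails before commit."""
    try:
        conn.execute("INSERT INTO appointments (title) VALUES (?)", (title,))
        if fail_before_commit:
            # Stand-in for any error that happens before the commit.
            raise RuntimeError("simulated failure before commit")
        conn.commit()
        return True
    except Exception:
        conn.rollback()  # discards the uncommitted INSERT
        return False

def count_appointments(conn):
    return conn.execute("SELECT COUNT(*) FROM appointments").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE appointments (id INTEGER PRIMARY KEY, title TEXT)")
conn.commit()

create_appointment(conn, "Dentist")
create_appointment(conn, "Meeting", fail_before_commit=True)
print(count_appointments(conn))  # 1
```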

how to show updated data to the users as fast as possible (not real-time)?

In our database, some entity is updated by a backend process. We want to show the updated value to the user on the website, not in real time but as fast as possible.
These are the problems we are facing with the possible approaches.
Polling: As far as we know, there are better techniques than polling, such as SSE and WebSockets.
SSE: With SSE the connection stays open for a long time (I searched the internet and found that it uses long polling), which might cause problems as the number of users grows.
WebSockets: As we only need one-way communication (from server to client), SSE is better than this.
Our solution
We check the database on every user request and update the value. (This is not very good, as it depends on the user's next request.)
Is this a good approach, is there a better way to do this, or am I missing or misunderstanding something about SSE?
Would it be fine to use SignalR instead of all this? (Does it have a long-connection issue or not?)
Thanks.
What you should use depends entirely on your requirements.
Options:
Your clients only need the updated information when they make a request -> go your way.
If you need a solution for different client types (web client, WinForms client, Android client, ...) and have to support different browsers, keep in mind that not all browsers support all mechanisms. SignalR was designed to automatically choose the right transport mechanism based on what the client supports, so SignalR is an option (read more details here: https://www.asp.net/signalr). It also has options for keeping the connection alive.
There are also alternatives like https://pusher.com/ (in short, this is just a queue to which you can send messages and from which you can subscribe to messages). But these services are only free up to, for example, a certain data volume.
You can use event-based communication. Whenever there is a change (event) in the backend/database, the server should send a message to the clients.
Your app should register for the respective events and refresh the UI whenever there is an update.
We used Socket.IO for this use case in our apps and it worked well.
Here is the website: https://socket.io/
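The register-then-refresh idea described above can be sketched without any particular library. The following is an in-process illustration only, not Socket.IO's or SignalR's API: the `EventBus` class and the `"value_updated"` event name are made up, and a real setup would deliver the event over a network connection instead of a direct function call.

```python
class EventBus:
    """Tiny publish/subscribe registry: handlers are called when an event fires."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, event_name, handler):
        self._handlers.setdefault(event_name, []).append(handler)

    def publish(self, event_name, payload):
        # Copy the list so handlers may subscribe others while we iterate.
        for handler in list(self._handlers.get(event_name, [])):
            handler(payload)

bus = EventBus()
shown_values = []  # stands in for the UI that would be refreshed

# The "client" registers for the event once...
bus.subscribe("value_updated", lambda payload: shown_values.append(payload["value"]))

# ...and the "backend" publishes whenever the entity changes.
bus.publish("value_updated", {"id": 7, "value": 42})
print(shown_values)  # [42]
```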

Consuming Atom feeds: how does it work?

I'm sorry if the title is too generic, but I've been browsing the Internet for an hour and couldn't find any architectural explanation. I'm totally new to both the RSS and Atom protocols. What I have understood so far is:
A server publishes documents
Clients subscribe to this server
Clients are notified when the server publishes new documents
Clients consume the documents
It seems like a queueing mechanism (like JMS). What is not clear to me is:
"Clients are notified" is just another way of saying "clients must poll the server to check if there are new messages"?
How does a client know that a message has already been read and is no longer 'new'? Is this check the responsibility of the client or of the server?
Can anyone point me to some documentation about that? I've been googling for a while, but every search sends me to sites that explain how to use libraries for parsing etc.
Thanx
I think these answer your questions:
How large RSS reader works (netvibes, Google reader...)
How RSS and ATOM inform client about updates? long polling or polling or something else?
RSS 2.0 Specification
https://en.wikipedia.org/wiki/PubSubHubbub
"How does a client know that a message has already been read and that is no longer 'new'?"
I think that is specific to the implementation, but for example you could save guids of each fetched <item> and then flag them read as the user reads the items.
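A client-side sketch of the "save the guids" idea could look like the following. It uses Python's standard `xml.etree.ElementTree`; the `new_items` function and the sample feed are invented for illustration, and a real reader would persist the set of seen guids between runs and handle items that have no `<guid>` at all.

```python
import xml.etree.ElementTree as ET

def new_items(feed_xml, seen_guids):
    """Return titles of items whose guid has not been seen; record their guids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid is None or guid in seen_guids:
            continue  # skip items without a guid or already fetched
        seen_guids.add(guid)
        fresh.append(item.findtext("title", default=""))
    return fresh

# Made-up RSS 2.0 document for the example.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>First post</title><guid>urn:example:1</guid></item>
<item><title>Second post</title><guid>urn:example:2</guid></item>
</channel></rss>"""

seen = set()
print(new_items(SAMPLE_FEED, seen))  # ['First post', 'Second post']
print(new_items(SAMPLE_FEED, seen))  # []
```

The "read" flag mentioned above would be a second piece of per-guid state kept by the client; the feed itself carries no such information.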
I think Janih's answer below is good and you should check all these links.
For more specific details on your questions:
"'Clients are notified' is just another way of saying 'clients must poll the server to check if there are new messages'?"
Yes... and no. Yes, polling is the default, and yes, it's cumbersome. Protocols like PubSubHubbub will help. RSS feed API services like Superfeedr (which I built!) will do the polling on your behalf and send you notifications using webhooks (so you don't have to poll at all!).

Is there a way using HTTP to allow the server to update the content in a client browser without client requesting for it?

It is quite easy to update the interface by sending jQuery ajax request and updating with new content. But I need something more specific.
I want to send a response to the clients without their having requested it, and update the content when there is something new on the server. There should be no need to send an Ajax request every time. When the server has new data, it sends a response to every client.
Is there any way to do this using HTTP or some specific functionality inside the browser?
Websockets, Comet, HTTP long polling.
It is called server push (you can also find it under the name Comet). Search using these keywords and you will find a bunch of examples, tools and so on. No special protocol is required for that.
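Of the techniques named above, HTTP long polling is probably the easiest to picture: the client sends a request and the server simply holds it until there is something to return or a timeout expires. The sketch below imitates only the server-side waiting logic inside one process, using `threading.Condition` from the standard library; the `MessageBoard` class is hypothetical and no actual HTTP is involved.

```python
import threading

class MessageBoard:
    """Holds messages and lets callers block until newer ones arrive."""

    def __init__(self):
        self._messages = []
        self._cond = threading.Condition()

    def post(self, text):
        with self._cond:
            self._messages.append(text)
            self._cond.notify_all()  # wake up every waiting "request"

    def wait_for_messages(self, after_index, timeout):
        """Return messages past after_index, waiting up to timeout seconds.

        An empty list means the wait timed out, and the client would
        normally just issue the next long-poll request.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: len(self._messages) > after_index, timeout=timeout
            )
            return self._messages[after_index:]

board = MessageBoard()
# Simulate the server getting new data shortly after the client started waiting.
timer = threading.Timer(0.1, board.post, args=("hello",))
timer.start()
received = board.wait_for_messages(0, timeout=2.0)
timer.join()
print(received)
```

The client only talks to the server again once a response has come back, so an idle period costs one held connection instead of many empty polls.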
Aaah! You are trying to break the principles of the web :) You see, if the web were pure MVC (model-view-controller), the 'server' could actually send messages to the client(s) and ask them to update. The issue is that the server could be load balanced, and requests from the same client could be sent to different servers. Now, if you were to send a message back to the clients, you'd have to know everyone who is connected to the server. Let's say the site is quite popular and you have about 100,000 people connecting to it every day. You'd actually have to store the IP of each of them to know where on the internet they are located and to be able to "push" them a message.
Caveats:
What if they are no longer browsing your website? You see, currently there is no way to log out automatically when you close your browser. The server needs to check after a fixed timeout whether you have logged out (or you send a new nonce with every response to prevent the server from doing that check).
What about a system restart, crash, etc.? You'd lose all the IPs that you were keeping track of and you'd be back to square one: you have people connected to you, but until you receive new requests you can't really "send" them data when they may be expecting it as per your model.
Let's take the example of Facebook's news feed or the "Most recent" link close to the top right. Sometimes while you are browsing your wall you see that the number next to "Most recent" has gone up, or a new 'feed' item has come to the top of your wall. It's the client sending periodic requests to the server to find out what was updated, rather than the other way round.
You see, it keeps things simple and RESTful. You may feel it's inefficient for the client to "poll" the server to pull the data, and you'd prefer push, but the design of the server gets simplified :)
I suggest Ajax pulling is the best way to go: you are distributing computation to the client and keeping it simple (KISS principle :)
Of course you can get around it, the question is, is it worth it?
Hope this helps :)
RFC 6202 might be a good read.

Asp.net chat application using database for message queue

I have developed a chat web application which uses a SqlServer database for exchanging messages.
All clients poll every x seconds to check for new messages.
It is obvious that this approach consumes many resources, and I was wondering if there is a "cheaper" way of doing that.
I use the same approach for "presence": checking who is on.
Without a browser plugin/extension like Flash or a Java applet, the browser is essentially a one-way communication tool. The request has to be initiated by the browser to fetch data. You cannot 'push' data to the browser.
Many web apps use Ajax polling to simulate a server 'push'. The trick is to balance the frequency and data size against the bandwidth and server resources.
I just did a simple observation of Gmail. It does an HTTP POST poll every 5 seconds. If there's no 'state' change, the response is only a few bytes in size (not including the HTTP headers). Of course Google has huge server resources and bandwidth, which is why I mention finding a good balance.
That is "improving user experience vs. server resources". You might need to come up with a creative polling strategy instead of straightforward polling every x seconds.
E.g. if there is no activity from party A, poll every 3 seconds. While party A is typing, poll every 5 seconds. This is just an illustration; you can play around with the numbers or come up with a more efficient scheme.
Lastly, the data exchange. The challenge is to find a way to pass as little data as possible while conveying the same information.
my 2 cents :)
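One possible "creative" polling strategy, different from the activity-based numbers in the answer above, is to back off while nothing changes and to snap back to a short interval as soon as data arrives. This is only a sketch of that swapped-in idea; the function name and the default minimum, maximum and factor values are arbitrary assumptions to be tuned.

```python
def next_poll_interval(current, got_new_data, minimum=1.0, maximum=30.0, factor=2.0):
    """Return the number of seconds to wait before the next poll.

    Poll quickly right after activity, and progressively less often
    while the server keeps answering "nothing new".
    """
    if got_new_data:
        return minimum
    return min(current * factor, maximum)

interval = 1.0
for got_new_data in [False, False, False, True, False]:
    interval = next_poll_interval(interval, got_new_data)
    print(interval)
# prints 2.0, 4.0, 8.0, 1.0, 2.0 (one value per line)
```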
For something like a real-time chat app, I'd recommend a distributed cache with a SQL backing. I happen to like memcached with the Enyim .NET provider, so I'd do something like the following:
User posts message
System writes message to database
System writes message to cache
All users poll cache periodically for new messages
The database backing allows you to preload the cache in the event the cache is cleared or the application restarts, but the functional bits rely on in-memory cache, rather than polling the database.
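The four steps above, plus the reload-after-restart behaviour, can be sketched as follows. To keep the example self-contained, an in-memory `sqlite3` database stands in for SQL Server and a plain Python list stands in for memcached; the `MessageStore` class and its method names are made up and do not reflect the Enyim client's API.

```python
import sqlite3

class MessageStore:
    """Writes go to the database and the cache; polls read only the cache."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = []  # stand-in for a distributed cache such as memcached
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS messages (id INTEGER PRIMARY KEY, body TEXT)"
        )
        self.reload_cache()

    def reload_cache(self):
        """Rebuild the cache from the database, e.g. after a restart or flush."""
        rows = self._conn.execute("SELECT id, body FROM messages ORDER BY id")
        self._cache = [(row[0], row[1]) for row in rows]

    def post(self, body):
        cur = self._conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
        self._conn.commit()                       # system writes message to database
        self._cache.append((cur.lastrowid, body)) # system writes message to cache
        return cur.lastrowid

    def poll(self, after_id):
        """Return cached messages newer than after_id without touching the database."""
        return [m for m in self._cache if m[0] > after_id]

store = MessageStore(sqlite3.connect(":memory:"))
first_id = store.post("hi there")
print(store.poll(0))         # [(1, 'hi there')]
print(store.poll(first_id))  # []
```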
If you are using SQL Server 2005, you can look at Notification Services. Granted, this would lock you into SQL Server 2005, as Notification Services was removed in SQL Server 2008, but it was designed to allow SQL Server to notify client applications of changes to the database.
If you want something a little more scalable, you can put a couple of bit flags on the Users record. When a message for the user comes in, change the bit for new messages to 1. When you read the messages, change it back to 0. Do the same when people sign on and off. That way you are reading a very small field that has a damn good chance of already being in cache.
So the workflow would be: read the bit. If it's 1, go get the messages from the message table. If it's 0, do nothing.
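A sketch of that flag workflow, again with an in-memory `sqlite3` database standing in for SQL Server. The table and column names (`users.has_new_messages`, `messages.delivered`) and the function names are assumptions made up for this example.

```python
import sqlite3

def make_schema(conn):
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, "
        "has_new_messages INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "body TEXT, delivered INTEGER NOT NULL DEFAULT 0)"
    )

def deliver(conn, user_id, body):
    """Store a message and raise the recipient's 'new messages' flag."""
    conn.execute("INSERT INTO messages (user_id, body) VALUES (?, ?)", (user_id, body))
    conn.execute("UPDATE users SET has_new_messages = 1 WHERE id = ?", (user_id,))
    conn.commit()

def poll(conn, user_id):
    """Cheap check of the flag first; read the message table only if it is set."""
    row = conn.execute(
        "SELECT has_new_messages FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None or row[0] == 0:
        return []  # flag is 0: do nothing
    bodies = [r[0] for r in conn.execute(
        "SELECT body FROM messages WHERE user_id = ? AND delivered = 0 ORDER BY id",
        (user_id,),
    )]
    conn.execute("UPDATE messages SET delivered = 1 WHERE user_id = ?", (user_id,))
    conn.execute("UPDATE users SET has_new_messages = 0 WHERE id = ?", (user_id,))
    conn.commit()
    return bodies

conn = sqlite3.connect(":memory:")
make_schema(conn)
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.commit()
deliver(conn, 1, "hello")
print(poll(conn, 1))  # ['hello']
print(poll(conn, 1))  # []
```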
In ASP.NET 4.0 you can use the Observer pattern with JavaScript objects and arrays, i.e. AJAX JSON calls with jQuery and/or PageMethods.
You will always have to hit the database to determine whether there is any data to return. The trick is to make those calls small and to return data only when needed.
There are two related solutions built into SQL Server 2005 and still available in SQL Server 2008:
1) Service Broker, which allows subscribers to post reads on queues (the RECEIVE command with WAIT..). In your case you would want to send your messages through the database using Service Broker services fronting these queues; the messages could then be picked up by the waiting clients. There's no polling; the waiting clients just get activated when a message is received.
2) Query Notifications, which allow a subscriber to define a query and then receive notifications when the dataset that would result from executing that query changes. Built on Service Broker, Query Notifications are somewhat easier to use, but may also be somewhat less efficient. (Note that Query Notifications and their sibling, Event Notifications, are frequently mistaken for Notification Services (NS), which causes concern because NS is discontinued in 2008. However, Query and Event Notifications are still fully available and even enhanced in SQL Server 2008.)

Resources