Best way for consuming RSS feeds - rss

What is the best way to consume RSS feeds that aren't PubSubHubbub-enabled?
I thought of a PHP script that parses the feeds and is called every 10 minutes. But when I have 100 feeds it will be a slow process.
What do you recommend?

You can have a pool of threads that work through the set of feeds you fetch each time, to distribute the work.
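A minimal sketch of that idea in Python (the question mentions PHP, but the pattern is the same): a thread pool downloads many feeds concurrently instead of one at a time. The feed URLs are placeholders, and the injectable `fetch` parameter is just for illustration and testing.

```python
# Sketch: fetch many feeds concurrently with a thread pool.
from concurrent.futures import ThreadPoolExecutor
import urllib.request

FEED_URLS = [
    "https://example.com/feed1.xml",
    "https://example.org/feed2.xml",
    # ... up to 100 feeds
]

def fetch_feed(url):
    # Download one feed; real code would parse and store the items.
    with urllib.request.urlopen(url, timeout=10) as resp:
        return url, resp.read()

def fetch_all(urls, fetch=fetch_feed, workers=10):
    # With 10 workers, 100 downloads take roughly 10 sequential rounds
    # instead of 100; results come back in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))
```

Since most of the time is spent waiting on the network, threads help here even in languages with a global interpreter lock.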

Related

How to: Background Processing of WooCommerce Product Batch API calls

I'm running a webshop with 7000 products. Every night I execute a cronjob on my ERP system, pushing all product information into my WooCommerce store through their RESTful API.
This works perfectly... However, I'm looking for ways to improve the performance of this task.
What I've done so far is change my individual update API requests to a batch call. Thereby I limit my requests from 7000 to around 1400; I can batch around 50 products without running into gateway timeouts and other server-side limitations.
However, while improving this, I'm wondering if there is a smart way of scheduling these update tasks into a background queue/process. I'm quite familiar with this in Laravel, but I'm wondering if anything exists in WordPress/WooCommerce that actually supports this out of the box.
What I actually mean: instead of executing the update-batch task upon the API call, the API should just schedule the task and send a response back to the client, saying that the task has been added successfully. This way the ERP system doesn't have to wait for WooCommerce to finalize the whole batch. If I make the ERP system make async calls, it would probably lead to overloading WooCommerce, and wouldn't be beneficial.
If not, what would actually be the best approach to accomplish this? I'm thinking of making a jobs/queue database table that contains my payload, and after pushing all the update data to it, creating another endpoint that tells my WooCommerce store to start working through the list.
Please let me know if there is a good way of achieving this.
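The jobs-table idea described above can be sketched quite simply. This is a hypothetical illustration in Python with an in-memory SQLite table standing in for the WordPress database: the API endpoint only inserts the payload and returns immediately, while a separate cron-driven worker drains the queue at its own pace.

```python
# Hypothetical sketch of the proposed jobs/queue table. The `process`
# callable stands in for the actual WooCommerce batch update.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    payload TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending')""")

def enqueue(products):
    # Called by the API endpoint: store the batch and respond immediately.
    db.execute("INSERT INTO jobs (payload) VALUES (?)",
               (json.dumps(products),))
    db.commit()

def work_one(process):
    # Called by the background worker: claim and run one pending job.
    row = db.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending' "
        "ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    job_id, payload = row
    process(json.loads(payload))   # the expensive batch update
    db.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    db.commit()
    return True
```

The key property is that `enqueue` is cheap, so the ERP system gets its response right away, and the worker controls the pace so WooCommerce never sees more than one batch at a time.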

Caching web page data, Database or File

I am creating an RSS reader application that requests the RSS from my server. That is, the RSS is first downloaded to my server, then the application downloads it from my server.
I want to create an RSS cache for this. For example, each RSS feed would be refreshed every 1 minute. So if 10 users request the RSS of example.com within 1 minute, my server will download it only the first time, and for the other 9 requests the RSS will be served from the cache.
My question is: should I use a database (MSSQL) for this purpose, or should I use files?
I have no limit on database size or file size...
EDIT: I'm using ASP.NET for the server.
You can use Memcached for this and set the cache expiry time to be 10 minutes or whatever you want. I found this for you.
P.S. Google is our friend
Use memcached and avoid file access, since files will be slower.
ASP.NET has a built-in cache; here is the MSDN article on best practices for it: http://msdn.microsoft.com/en-us/library/aa478965.aspx
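The TTL pattern all of these answers describe is the same regardless of the backing store. A minimal sketch (in Python, with an in-memory dict standing in for memcached or the ASP.NET cache; the `download` callable and `now` clock are injection points for illustration):

```python
# Sketch of the time-based cache being discussed: the first request in
# each TTL window downloads the feed, later requests hit the cache.
import time

CACHE_TTL = 60   # seconds; the question proposes a 1-minute refresh
_cache = {}      # url -> (expires_at, body)

def get_feed(url, download, now=time.time):
    # `download` is the expensive fetch; it runs at most once per window.
    entry = _cache.get(url)
    if entry is not None and entry[0] > now():
        return entry[1]
    body = download(url)
    _cache[url] = (now() + CACHE_TTL, body)
    return body
```

With 10 users in one minute, `download` fires once and the other 9 requests are served from memory, which is exactly the behavior the question asks for.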

Creating an aggregate RSS feed from RSS-less search results

So, say I'm a journalist, who wants some way of easily posting links to stories I've written that are published to my newspaper's website. Alas, my newspaper's website doesn't offer user-level RSS feeds (user-level anything for journalists, really).
Running a search (e.g., http://www.calgaryherald.com/search/search.html?q=Rininsland) brings up everything I've done in reverse chronological order (albeit with some duplicates; ignore those for now, I'll deal with them later). Is there any way I can parse this into an RSS feed?
It seems like Yahoo! Pipes might be an easy way to do this, but I'm open to whatever.
Thanks!
Normally this would be a great use of Yahoo Pipes, but it appears that the search page you cited has a robots.txt file, which Pipes respects. This means that Pipes will not pull data from the page.
For more info: "How do I keep Pipes from accessing my web pages?"
http://pipes.yahoo.com/pipes/docs?doc=troubleshooting#q14
You would have to write a scraper yourself that makes an HTTP request to that URL, parses the response, and writes RSS as output. This could be done in many server-side environments such as PHP, Python, etc.
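The "write RSS as output" half of that scraper is the generic part. As a sketch (in Python; the site-specific HTML parsing that produces the `(title, link)` pairs is omitted):

```python
# Sketch of the scrape-to-RSS idea: turn a list of (title, link) pairs
# extracted from the search page into a minimal RSS 2.0 document.
import xml.etree.ElementTree as ET

def items_to_rss(items, feed_title, feed_link):
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link, and description on the channel.
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for title, link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")
```

Serve the returned string with a `Content-Type` of `application/rss+xml` and any feed reader can subscribe to it.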
EDIT: Feedity provides a service to scrape web pages into feeds. Here is a Feedity feed of your search url:
http://feedity.com/rss.aspx/calgaryherald-com/UFJWUVZQ
However, unless you sign up for a subscription ($3.25/mo), this feed will be subject to the following constraints:
Free feeds created without an account are limited to 5 items and a 10-hour update interval. Free feeds created without an account are automatically purged from our system after 30 days of inactivity.
Provided it's just links and a timestamp you want for each article, the Yahoo Pipes Search module will return the latest 10 from its search index of the Herald site.

RSS feed basics - just repeatedly overwriting the same file?

Really simple question here:
For a PHP-driven RSS feed, am I just overwriting the same XML file every time I "publish" a new feed item? And the syndicates it's registered with will pop in from time to time to check whether there's anything new?
Yes. An RSS reader has the URL of the feed and regularly requests the same URL to check for new content.
That's how it works: a simple single XML RSS file that gets polled for changes by RSS readers.
For scalability there is FeedTree: collaborative RSS and Atom delivery, but unlike another well-known network program (BitTorrent) it hasn't had as much support in readers by default.
Essentially, yes. It isn't necessarily a "file" actually stored on disk, but your RSS (or Atom) is just changed to contain the latest items/entries and resides at a particular fixed URL. Clients will fetch it periodically. There are also technologies like PubSubHubbub and pinging for causing updates to get syndicated closer to real-time.
Yes... BUT! There are ways to make the subscribers' lives better and also improve your bandwidth :) Implement the PubSubHubbub protocol. It will help any application that wants the content of the feed to be notified as soon as it's available. It's relatively simple to implement on the publisher side, as it only involves a ping.
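Both halves of that answer fit in a few lines. This is an illustrative sketch in Python: an atomic overwrite of the fixed-URL feed file (so a reader polling mid-write never sees a half-written feed), plus the publisher-side hub ping mentioned above. The hub URL and file path are placeholders.

```python
# Sketch: overwrite the same feed file atomically, then ping the hub.
import os
import tempfile
import urllib.parse
import urllib.request

HUB_URL = "https://pubsubhubbub.example.com/"   # hypothetical hub

def write_feed(feed_xml, path):
    # Write to a temp file in the same directory, then rename over the
    # old feed; os.replace is atomic, so pollers see old XML or new XML,
    # never a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    with os.fdopen(fd, "w") as f:
        f.write(feed_xml)
    os.replace(tmp, path)

def ping_hub(feed_url, hub_url=HUB_URL):
    # PubSubHubbub publisher ping: one form-encoded POST telling the hub
    # the feed at feed_url has new content.
    data = urllib.parse.urlencode(
        {"hub.mode": "publish", "hub.url": feed_url}).encode()
    urllib.request.urlopen(hub_url, data=data, timeout=10)
```

Without the ping, readers still work; they just fall back to discovering updates on their normal polling schedule.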

Creating an updatable RSS feed

Hi, I have a page in my Java/JSP-based web application which shows a list of new products.
I want to provide an RSS feed for this. So, what is the way to create an RSS feed which others can use to subscribe?
I could find some Java-based feed creators. But the question is: how will the feed be self-updating based on new products added to the system?
I'm not familiar with Java, so here's a general thought.
Your feed should be accessible via some URL, like http://mydomain.com/products/feeds/rss. When a feed aggregator fetches this URL, the servlet (I believe this is how they are called in the Java world) fetches a list of recent products from the DB or wherever, builds the RSS feed, and then sends it back to the requester, which happens to be the feed aggregator.
For performance reasons this particular servlet need not access the database each time it executes. Rather, it can cache either the resulting feed (recommended; HTTP allows for very flexible caching) or the result of the database query somewhere in memory or on disk.
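The HTTP-caching option mentioned above can be sketched independently of the web framework. This illustration (in Python rather than Java; the function names are hypothetical, not a real servlet API) builds the feed from the current product list and serves it with an ETag, answering `304 Not Modified` when the aggregator already has the current version:

```python
# Sketch of conditional GET for a dynamically generated feed: the body is
# rebuilt from the product list, but unchanged content costs only a 304.
import hashlib

def build_feed(products):
    items = "".join(
        f"<item><title>{p}</title></item>" for p in products)
    return f'<rss version="2.0"><channel>{items}</channel></rss>'

def serve_feed(products, if_none_match=None):
    body = build_feed(products)
    etag = '"%s"' % hashlib.sha256(body.encode()).hexdigest()[:16]
    if if_none_match == etag:
        # Aggregator's copy is current: no body, tiny response.
        return 304, etag, ""
    return 200, etag, body
```

Because the feed is rebuilt from the product table on every fetch, it is "self-updating" by construction: the moment a new product is in the DB, the next aggregator poll sees it.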
