elasticsearch rss river not storing content field - rss

I am using the RSS river plugin for Elasticsearch to index an RSS feed, and it is working great with one exception: it does not store or index the "content" field produced by WordPress. From the source it doesn't look like there is a way to do this, but I might be missing something.
Are there any suggestions on how to accomplish this? Or plans to add this to the plugin in the future?
This is the feed I am using: https://blog.mariadb.org/feed/

Could you create an issue at https://github.com/dadoonet/rssriver so I can look at it?
If you could also add some sample documents, that would be awesome.
Thanks
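For anyone preparing sample documents: WordPress feeds normally carry the full post body in a `content:encoded` element, which belongs to the RSS content module namespace (`http://purl.org/rss/1.0/modules/content/`). Here is a minimal Python sketch of reading that field outside the river. The sample item is made up for illustration and is not copied from the feed above, and `items_with_content` is just a name I picked.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the RSS content module that WordPress feeds declare.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

# Made-up sample item for illustration only (an assumption, not real feed data).
SAMPLE_FEED = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Hello</title>
      <link>https://example.org/hello</link>
      <description>Short summary</description>
      <content:encoded><![CDATA[<p>Full post body</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""


def items_with_content(feed_xml):
    """Return one dict per <item>, including the content:encoded body."""
    root = ET.fromstring(feed_xml)
    docs = []
    for item in root.iter("item"):
        docs.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "description": item.findtext("description"),
            # findtext returns None when the element is absent.
            "content": item.findtext(f"{{{CONTENT_NS}}}encoded"),
        })
    return docs


if __name__ == "__main__":
    for doc in items_with_content(SAMPLE_FEED):
        print(doc["title"], "->", doc["content"])
```

The dicts it returns are only meant to show which field is missing from the indexed documents; how the plugin itself would map that element is up to the plugin.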

Related

MLS / IDX Data in the form of XML

I have been struggling with an issue for weeks, almost a month. I am working on a website for a real estate agent in Toronto, Ontario, and the last thing I have to do is get the listings onto her website. We are using the WordPress wp-residence theme. This theme is compatible with iHomefinder, but the data I get from iHomefinder does not work with the theme's features (maps, searches and even styles). I read somewhere that the only way to use these theme features is to add the listings manually. I found this plugin that will import all the listings:
https://en-ca.wordpress.org/plugins/wp-residence-add-on-for-wp-all-import/
But now I need to get my listings in the form of XML, CSV or XLS. I have all my login info and the URL to get my listings; however, the instructions say I need to connect via a RETS client, which I do not have. Is there anyone out there who can point me in the right direction?
You'll need to use an IDX plugin.
I think you should also look at https://mlsimport.com. This plugin should help you import data from MLS, but it is a paid plugin.

Wordpress form to add hotels

I am trying to create a webpage to add hotels (a database of hotels) using WordPress (name, address, pictures, reviews, ...). Ideas?
Also, any ideas for categorizing hotels by city?
Thanks.
You will probably need to further explain your question so that we can be more help to you. If your intention is to just add a list of hotel names to your website then I don't know of any WordPress plugins for that, but you might want to check out Expedia's EAN http://developer.ean.com/
You need to sign up for their affiliate program, which is very easy. You get immediate access to their hotel databases, plus you can make availability/booking requests with several response options, including JSON, which is more convenient and lightweight than the (unfortunately) more widespread XML. This is something you will need to hire a developer to do.
If you want to allow clients/visitors to be able to book and search for hotels by just installing a plugin try https://wordpress.org/plugins/wp-auto-hotel-finder/.

RSS feed - Load More

I'm trying to load the RSS content provided by Yahoo Finance, for example from http://finance.yahoo.com/rss/headline?s=yhoo (which actually seems to redirect to http://feeds.finance.yahoo.com/rss/2.0/headline?s=yhoo&region=US&lang=en-US).
Anyway, the point is that the URL only provides 20 <item> elements and I'd like to load more than that. Feedly doesn't seem to have any problem doing so, so maybe there's a query parameter I'm not aware of?
Services like Feedly store feed history, but feed publishers usually don't provide older feed entries. So your best bet is to use Feedly's API: https://developer.feedly.com/v3/streams/
See also: How to get more feeds from RSS url?
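If you would rather keep your own history than depend on a third-party service, the idea is the same one those services use: poll the feed regularly and merge each response into a store keyed by item identity. Here is a rough Python sketch of that merge step. The two poll payloads are made up, `merge_items` is a hypothetical helper, and a real setup would fetch the URL on a schedule and persist `history` somewhere instead of keeping it in memory.

```python
import xml.etree.ElementTree as ET

# Made-up payloads standing in for two successive polls of the same feed.
POLL_1 = """<rss version="2.0"><channel>
  <item><guid>a</guid><title>A</title><link>https://example.org/a</link></item>
  <item><guid>b</guid><title>B</title><link>https://example.org/b</link></item>
</channel></rss>"""

POLL_2 = """<rss version="2.0"><channel>
  <item><guid>b</guid><title>B</title><link>https://example.org/b</link></item>
  <item><guid>c</guid><title>C</title><link>https://example.org/c</link></item>
</channel></rss>"""


def merge_items(history, feed_xml):
    """Add unseen items from feed_xml to history; return how many were new."""
    root = ET.fromstring(feed_xml)
    added = 0
    for item in root.iter("item"):
        # Fall back to the link when an item carries no <guid>.
        key = item.findtext("guid") or item.findtext("link")
        if key is None or key in history:
            continue
        history[key] = {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
        }
        added += 1
    return added


if __name__ == "__main__":
    history = {}
    for payload in (POLL_1, POLL_2):
        print("new items:", merge_items(history, payload))
    print("stored:", sorted(history))
```

This only accumulates entries from the moment you start polling; it cannot recover items the publisher has already dropped from the feed.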

Nutch possibilities

I am new to Nutch and am using Nutch 1.9. Right now I am doing a POC on a sample site (shaadi.com). I have a few questions; can somebody help me out?
I can't access URLs that require login authentication (form-based), even though I set up the configuration in httpclient-auth.xml, nutch-site.xml and so on.
I know Nutch fetches only the whole content of a page. But is it possible to get only specific pieces of information, like first name, address, etc., from a page using Nutch? (I think this is more like scraping, which is what Python's Scrapy does.)
Thanks in advance.
You will need to use a plugin to extract the specific data and add that data to the Nutch document while indexing.
This plugin can be used to extract data:
www.atlantbh.com/precise-data-extraction-with-apache-nutch/

Importing and filtering RSS feeds into nodes

I'm looking for the best way to import a feed, say BBC News, and filter through all the content so only articles containing a keyword are stored in the database. I'd like each item to be shown as a node of a specific content type and to be updated at a given time interval. Is there a straightforward way to do this? (I don't have much experience in PHP at all, so be specific please.)
Thanks!
Use the Feeds module to import the nodes.
Use the Rules module + Cron to run automated actions on your nodes.
I'm not sure how to filter out the content, but these two modules should get you started. TIP: It would almost be better to have the content filtered before you import the nodes into Drupal.
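To illustrate the "filter before import" idea, here is a small Python sketch that keeps only the feed items whose title or description mentions a keyword. It is not tied to Drupal or the Feeds module; the sample feed is made up, and `filter_items` is a hypothetical helper whose output you would still need to hand to whatever does the import.

```python
import xml.etree.ElementTree as ET

# Made-up feed for illustration only (an assumption, not real BBC data).
SAMPLE_NEWS = """<rss version="2.0"><channel>
  <item>
    <title>Election results announced</title>
    <link>https://example.org/election</link>
    <description>Votes have been counted.</description>
  </item>
  <item>
    <title>Football scores</title>
    <link>https://example.org/football</link>
    <description>Weekend round-up.</description>
  </item>
</channel></rss>"""


def filter_items(feed_xml, keyword):
    """Return items whose title or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    kept = []
    for item in ET.fromstring(feed_xml).iter("item"):
        title = item.findtext("title") or ""
        description = item.findtext("description") or ""
        if needle in f"{title} {description}".lower():
            kept.append({
                "title": title,
                "link": item.findtext("link"),
                "description": description,
            })
    return kept


if __name__ == "__main__":
    for match in filter_items(SAMPLE_NEWS, "election"):
        print(match["title"], match["link"])
```

A plain substring match like this is crude (searching for "art" would also match "party"), so treat it as a starting point only.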
The Feeds module has the ability to create nodes from an RSS feed. Once setup, it can run at regular intervals.
In addition the Feeds Tamper module can help you filter based on keywords.
Before going with the Feeds module, be sure to review the project bugs list first (https://drupal.org/project/issues/feeds?categories=bug). This is not a very solid module. For example, feeds cannot update via cron; they have to be updated manually. This alone should be a showstopper for most folks.
I'm only talking about the D6 version.

Resources