How can I autodetect the RSS URL (if available) from any given website URL for BlackBerry Cascades? - rss

So I have a WebView displaying user-defined websites. I want to autodetect whether that URL contains an RSS feed and post it in a Label/TextArea.

The most straightforward way is to parse the HTML into a DOM document, then traverse the document looking for nodes that define RSS links. You may try using QXmlSimpleReader, but this can be frustrating because most HTML is not well-formed XML, so you will have to handle parse errors.
In an answer to this question, the following SourceForge project was recommended. This might be worth a look.
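As a rough illustration of the link-scanning approach, here is a minimal Qt/C++ sketch (QRegExp and QString are standard Qt classes available on Cascades, but treat the function itself as a hypothetical helper, not platform API). It sidesteps strict XML parsing entirely by regex-matching the RSS/Atom autodiscovery <link> tag in the fetched HTML:

// Sketch: detect an RSS/Atom autodiscovery link in raw HTML with a regex,
// avoiding a strict XML parse that malformed HTML would break.
// Assumes the page source was already fetched (e.g. via QNetworkAccessManager).
#include <QRegExp>
#include <QString>

QString findFeedUrl(const QString &html)
{
    // Match a <link> tag whose type is application/rss+xml or atom+xml.
    QRegExp linkTag("<link[^>]*type=[\"']application/(?:rss|atom)\\+xml[\"'][^>]*>",
                    Qt::CaseInsensitive);
    if (linkTag.indexIn(html) == -1)
        return QString(); // no feed advertised in the page head

    // Pull the href attribute out of the matched tag.
    QRegExp hrefAttr("href=[\"']([^\"']+)[\"']", Qt::CaseInsensitive);
    if (hrefAttr.indexIn(linkTag.cap(0)) != -1)
        return hrefAttr.cap(1);
    return QString();
}

Note that the returned href may be relative, so in practice you would resolve it against the page URL, e.g. with QUrl(pageUrl).resolved(QUrl(href)), before showing it in the Label/TextArea.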

Related

RSS feed link doesn't open up reader or just dumps out raw XML

I developed an RSS feed following a tutorial and I think the .xml file itself is in order. However, I have two problems:
1. When people click on the RSS link, it doesn't automatically load into their RSS readers.
2. For those that don't have an RSS reader, clicking the link results in a page full of code which is not very understandable.
I was hoping that there might be some tips on how to easily realize this.
Try removing the <![CDATA[ and ]]> wrappers in the <description> tag.
I downloaded your XML, changed those lines, tested it on my server, and it worked in Google's RSS reader.
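For instance, with a hypothetical item, the change would look like this (note that any markup left inside the description must then be entity-escaped to keep the XML well-formed):

<description><![CDATA[Latest <b>news</b> item]]></description>

becomes:

<description>Latest &lt;b&gt;news&lt;/b&gt; item</description>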
How the RSS link reacts when clicked depends on the browser and the user's profile.
If the user has an action set up to automatically load it into their feed reader of choice, it will do that.
If they don't, then it won't.
For those that just see a raw dump, it could be that they're using a browser that does not support RSS feeds and so dumps out the XML as raw text. Google Chrome (at least as of version 18), without extensions or add-ons, is usually the dump-truck culprit here.

Flex 3: Project Architecture & SEO

I've got a Flex 3 project. One of the problems I have is that not very much of its content is indexed by Google. Currently, I pull data from a MySQL database, so the Googlebot doesn't see most of the site.
My goal is to increase the amount of content indexed by Google, improve the SEO, and improve SERPs.
I thought that instead of pulling the data from the database that I would change the project's architecture and create separate "pages". So, in my case, I would compile each puzzle separately and upload it to the server in its own directory. This way the info in each puzzle would get indexed.
The downside is that if I add a puzzle, I'd have to add a link to it in all of the puzzles already on the server, then re-compile each puzzle and upload it again. Is there a way to get around this problem? Also, if I wanted to communicate data from one puzzle to another in the future, I wouldn't be able to do so.
Any suggestions?
Thank you.
-Laxmidi
The usual way to achieve this goal is to develop a hidden parallel site in HTML.
On the first page you will have your Flash and, hidden by JavaScript, a list of links to the other pages. These links will be parsed by the robots. Ideally, the href pages are virtual (look up "URL rewriting"). On each "fake" page, your server-side language will print content or links from your database AND the Flash. The Flash will be provided with a string explaining where it is and what it's supposed to show.
Ex: http://www.mysite.com/category1/content7 - the URL rewriting sends this request to http://www.mysite.com/index.php?uri=category1/content7. The page should display the Flash with the FlashVar "uri=category1/content7". The Flash then knows which content it has to display, so when a user comes from Google following this link, they will find the content they were looking for.
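As a concrete sketch of that rewrite (a hypothetical Apache .htaccess; adapt the rule to your server):

# Send every virtual URL to index.php, preserving the path as ?uri=...
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+)$ index.php?uri=$1 [L,QSA]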
Every link and all content for SEO should be in HTML; don't trust robots' ability to read Flash.
Have a look at Adobe's reference on deep linking.
You can generate a website's sitemap.xml with a daily cron process, such that the URLs encode the state of the application you need. Each URL encodes whatever content you need to retrieve from the DB, with just one index.html page.
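The generated sitemap.xml is then just a list of those virtual URLs; a minimal sketch using the example URL from the previous answer:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.mysite.com/category1/content7</loc>
  </url>
</urlset>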
good luck!

Preventing RSS feed scraping?

On a Wordpress site, I have both a normal blog that I want Google to detect and an RSS feed for outgoing links to other sites. I don't need/want bots to get at this other RSS feed nor do I want people to be able to get the link for their own use.
I've disabled RSS for the main blog successfully but am not sure how to encrypt/protect/hide the RSS link for this additional feed.
I'm not sure how Facebook runs a news feed without RSS, but however they do it is probably beyond my means/experience to replicate.
Since these are just outgoing links, I don't think copyright notices in the feed will do much. Maybe there is a way to output the links automatically through a means other than RSS?
Use robots.txt (see www.robotstxt.org) to prevent Google from following the link. All self-respecting robots should follow the directives in the robots.txt file. This file needs to go in the root of your site.
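For example, assuming the feed lives at a hypothetical path like /links-feed/, the robots.txt in the site root would contain:

User-agent: *
Disallow: /links-feed/

Keep in mind this only deters well-behaved crawlers; it does not hide the URL from anyone who already knows it.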
The basic answer to this is to get the feed entries out in a manner other than actual RSS, like outputting JSON, going through an API, etc.
That will help deter scraping, though not prevent it completely.
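For illustration, the same entries could be exposed as JSON along these lines (the field names are made up):

[
  {"title": "Example link", "url": "http://example.com/one"},
  {"title": "Another link", "url": "http://example.com/two"}
]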

Read RSS and show as HTML

I am using Google Reader for my RSS, and I want to export all my shared or starred RSS items to HTML so I can put that HTML on my website.
Does anyone have an idea about this?
One important thing as well: can I page through this HTML? I mean, export it as pages rather than all in one HTML page, so that users on my site can page through my starred feeds.
Thanks,
With XSLT you can transform XML to any format you want, including HTML. You can do the transformation on the server, or, with modern browsers like IE6+ and Firefox 2+, on the client side. XSLT isn't very pretty as a programming language, but the concept is pretty neat.
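To make that concrete, and sticking with the Qt/C++ stack from the question at the top of this page, here is a minimal sketch using QXmlQuery from Qt's QtXmlPatterns module in XSLT mode (both file names are hypothetical):

// Sketch: apply an XSLT stylesheet to an RSS document, producing HTML.
#include <QXmlQuery>
#include <QUrl>
#include <QString>

QString rssToHtml()
{
    QXmlQuery query(QXmlQuery::XSLT20);
    query.setFocus(QUrl::fromLocalFile("feed.xml"));        // the RSS input
    query.setQuery(QUrl::fromLocalFile("rss-to-html.xsl")); // the stylesheet
    QString html;
    if (!query.evaluateTo(&html))
        return QString(); // transformation failed
    return html;
}

The stylesheet itself would then map each <item> element to an HTML list entry or similar.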
I don't know if you can link directly to the RSS feed XML so that it's always up to date. I think Google requires that you authenticate and have permission to access the feed.
You can read an RSS feed with jQuery rather easily by selecting and iterating through the tags. Additionally, you can perform conditional checks on attributes and so on.

Add RSS to any website?

Is there any website/service which will enable me to add RSS subscription to any website?
This is for the company I work for. We have a website which displays company-related news. The news items are supplied by an external agency and get updated in our database automatically. Our website picks up random/new items and displays them. We are looking at adding a "Subscribe via RSS" button to our website.
If you have the data in your database, creating one yourself is fairly straightforward - there's a simple tutorial here.
Once you've set up a feed, in the <head> of your page, you put text like:
<link rel="alternate" title="RSS Feed"
href="http://www.example.com/rss-feed/latest/" type="application/rss+xml" />
This allows the feed to be "auto-discovered" by your user's browser (e.g. the RSS icon appears in the address bar in FF).
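If you do roll your own, the feed XML is straightforward to emit with a streaming writer. A minimal sketch in Qt/C++ (matching the stack of the question at the top of this page; all titles, links, and dates are placeholders, and in practice each <item> would come from a database row):

// Sketch: build a minimal RSS 2.0 document with QXmlStreamWriter.
#include <QXmlStreamWriter>
#include <QString>

QString buildRssFeed()
{
    QString out;
    QXmlStreamWriter xml(&out);
    xml.setAutoFormatting(true);
    xml.writeStartDocument();
    xml.writeStartElement("rss");
    xml.writeAttribute("version", "2.0");
    xml.writeStartElement("channel");
    xml.writeTextElement("title", "Company News");           // placeholder
    xml.writeTextElement("link", "http://www.example.com/"); // placeholder
    xml.writeTextElement("description", "Latest company news");

    // One <item> per news row; hard-coded here for brevity.
    xml.writeStartElement("item");
    xml.writeTextElement("title", "Example headline");
    xml.writeTextElement("link", "http://www.example.com/news/1");
    xml.writeTextElement("pubDate", "Mon, 01 Jan 2024 00:00:00 GMT");
    xml.writeEndElement(); // item

    xml.writeEndElement(); // channel
    xml.writeEndElement(); // rss
    xml.writeEndDocument();
    return out;
}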
Here's an article that discusses various webscrapers that will generate feeds: http://www.masternewmedia.org/news/2006/03/09/how_to_create_a_rss.htm
If you don't care to click through, here are the services the author discusses:
http://www.feedyes.com/
http://www.feed43.com/
http://www.feedfire.com/site/index.html
Other webscrapers suggested in the other answers:
http://page2rss.com/
http://www.dapper.net/
However, you're probably better off generating the feeds yourself from the info in the DB.
Your question is a little difficult to understand. Are you trying to generate the RSS for others to consume, or are you trying to consume someone else's RSS?
If you are trying to generate your RSS feed for others to consume you will need to read the spec:
http://cyber.law.harvard.edu/rss/rss.html
If you are trying to consume it, that link will also help. Then you'll need to look into an XML / RSS parser.
If you can provide more details I can update my answer.
If you are not in a position to add an RSS feed to the existing site, see Page2Rss as an intermediate solution.
Might Dapper be of some use? You just need to set up which bits of your news feed to scour and voila, instant RSS without having to touch any code...
Actually this is very doable with Yahoo! Pipes. Assuming that 1) your page is under 200k, 2) your robots.txt file does not disallow Pipes, and 3) your news feed has a unique ID, like so:
<ul id="newsfeed">
... you could use the Fetch Page module, trim it to just the items inside the news feed, loop through each list item, and use an Item Builder module to mangle the relevant bits into a proper RSS feed. Then, in the head of your document, you'd put in an RSS link, like so:
<link rel="alternate" type="application/atom+xml" title="News Feed" href="http://pipes.yahoo.com/your_pipe_id" />
This is of course completely ass-backwards, but it would work as a quick fix, or in situations where you have no control over the body of the page.
Write a webhandler that exposes the content of the database as an RSS feed.
You either need to roll your own, or get a service that is a screen scraper.
After you have created your feed, you can use something like Feedburner to disseminate it.
If you happen to be using ASP.NET, you might want to check out the ASP.NET RSS Toolkit. It's useful for both generating and consuming feeds.
