I am working on a Drupal module providing RSS feed generation (though I am not posting this to Drupal Answers, as my question is not really Drupal-related).
What I am wondering is whether an RSS feed could have multiple <atom:link> elements? (Or any other element for that matter?)
For example, PubSubHubbub requires an <atom:link> with the rel attribute set to hub and href pointing to the feed's update hub.
On the other hand, the same <atom:link> element could be used with the rel attribute set to self and href pointing to the feed's own URL.
Which means, if I want to use both, I would need to include <atom:link> element twice in my feed. Or am I missing something here?
Yes, you can have multiple <atom:link> elements, each with a different rel attribute. That is valid Atom XML and customary practice.
You should review the Atom spec and the RSS spec, too.
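As a rough illustration of the answer above, here is a minimal Python sketch that builds an RSS 2.0 channel carrying two <atom:link> elements, one with rel="self" and one with rel="hub". The URLs, the channel title, and the build_feed helper are placeholders made up for this example; it is not how any particular Drupal module generates its feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Ask ElementTree to serialize the Atom namespace with the usual "atom" prefix.
ET.register_namespace("atom", ATOM_NS)


def build_feed(feed_url, hub_url):
    """Return an RSS 2.0 document (as a string) with two atom:link elements."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # Placeholder channel metadata, assumed for the example.
    ET.SubElement(channel, "title").text = "Example feed"
    ET.SubElement(channel, "link").text = "http://example.com/"
    ET.SubElement(channel, "description").text = "Placeholder description"
    # rel="self" points at the feed's own URL.
    ET.SubElement(channel, f"{{{ATOM_NS}}}link",
                  {"rel": "self", "href": feed_url, "type": "application/rss+xml"})
    # rel="hub" points at the PubSubHubbub hub.
    ET.SubElement(channel, f"{{{ATOM_NS}}}link", {"rel": "hub", "href": hub_url})
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed("http://example.com/rss.xml", "http://hub.example.com/")
print(feed_xml)
```

The two elements differ only in their rel and href attributes, which is what lets a consumer tell them apart.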
I am currently working on a project where I will have some pages translated entirely (meta-information on the project) and some pages (articles) only in one language, but still with the interface in many languages.
How should I handle this with regard to Google etc.?
I want the information to be available in all languages, so that people can find it with search terms in their local language. But could I run into duplicate-content problems if the articles are available at both /fr/my-nice-article and /de/my-nice-article with the same (untranslated) article text and only the interface translated ("Ecris un commentaire", "Schreibe einen Kommentar")?
AFAIK Google automatically determines what language your content is written in. It uses the language of the majority of the text. If you want to specify it explicitly, you can use the following snippet:
<meta name="language" content="de" />
Google, of course, shows only the pages in the user's language (so German Google users only see the de-content).
Found something directly from Google:
Websites that provide content for different regions and in different
languages sometimes create content that is the same or similar but
available on different URLs. This is generally not a problem as long
as the content is for different users in different countries. While we
strongly recommend that you provide unique content for each different
group of users, we understand that this may not always be possible.
There is generally no need to "hide" the duplicates by disallowing
crawling in a robots.txt file or by using a "noindex" robots meta tag.
However, if you're providing the same content to the same users on
different URLs (for instance, if both example.de/ and example.com/de/
show German language content for users in Germany), you should pick a
preferred version and redirect (or use the rel=canonical link element)
appropriately.
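If you decide that the untranslated article copies do count as "the same content for the same users", the rel=canonical element mentioned in the quote could be generated along these lines. This is only a sketch: the LANGUAGES tuple, the base URL, the canonical_link helper, and the idea of storing each article's content language are all assumptions made for the example.

```python
# Hypothetical set of interface-language prefixes used by the site.
LANGUAGES = ("en", "fr", "de")


def canonical_link(path, content_lang, base="http://example.com"):
    """Build a rel=canonical tag pointing at the copy of the page that
    lives under the language the article text is actually written in."""
    trimmed = path.strip("/")
    prefix, _, rest = trimmed.partition("/")
    # Drop the interface-language prefix if the path has one.
    slug = rest if prefix in LANGUAGES and rest else trimmed
    return f'<link rel="canonical" href="{base}/{content_lang}/{slug}" />'


# An English article viewed through the French interface.
tag = canonical_link("/fr/my-nice-article", "en")
print(tag)
```

Fully translated pages would not need this, since each language version is then unique content for its own audience.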
I'm trying to incorporate a Google News feed into my website (using the built-in SimplePie functionality of WordPress).
However, the default feed gets rendered in a strange table structure. Sure enough, when I inspect the feed XML, I see that Google News puts a whole bunch of table HTML in its 'description' element, complete with embedded styles, etc. (see this example), essentially dictating how the feed must be displayed and not allowing for any effective CSS-based customization.
This seems really dumb. Can anyone help explain what is going on, or at least agree with me that this is just a terrible feed architecture?
Feeds often include HTML tags, as many (most?) readers will handle and use them, and that way the RSS provider can have some nice-looking output in the reader, as you've guessed. (I prefer flagging it as CDATA unless it's proper XHTML, as it's not valid XML/RSS otherwise.) It's not in the original spirit of RSS perhaps, but the Google feed is just an extreme example of common practice. As for your problem, does strip_htmltags help (simplepie.org/wiki/reference/simplepie/strip_htmltags)?
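To show what stripping the markup from a description amounts to, here is a small language-neutral sketch in Python using only the standard library. It is not SimplePie's implementation, and the TextOnly class, the strip_markup helper, and the sample description are made up for illustration; it simply keeps the text nodes and drops every tag, along with the contents of style and script elements.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes, ignoring tags and style/script contents."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip = 0  # depth inside <style>/<script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def strip_markup(description):
    """Return the plain text of an HTML fragment with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(description)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Made-up description resembling a table-wrapped feed item.
sample = ('<table><tr><td style="padding:5px">'
          '<a href="http://example.com/">Headline</a><br>'
          '<font size="-1">Snippet text</font></td></tr></table>')
plain = strip_markup(sample)
print(plain)
```

Once only the text is left, the item can be wrapped in your own markup and styled with CSS.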
My site is to have a section for normal users, a section for managers, and a section for use only by anonymous visitors. Each section of the site requires changes to Drupal settings for using a different theme, changing the Primary & Secondary links, changes which blocks are used, etc. In other words, the user experience changes significantly from section to section.
I could probably accomplish what I need by using Drupal's multi-sites, a shared database, and settings.php to override the variables I need to (e.g. menu_primary_links_source). However, to make things more manageable from an operational point of view, and to buy flexibility, I'm considering using the PURL API (purl.module) to prefix the URLs for certain site sections, and having my theme and custom modules react according to the current PURL prefix.
Before I get started, I want to ensure I'm not discounting Spaces.module. Spaces uses PURL, Features, and Context (which I'm also currently using for my site). I don't entirely understand how exactly Spaces fits into the picture. Would it help me make different site sections, each with specific configuration & behavior? Or am I better off depending directly on the PURL API?
The Spaces-PURL-Context conundrum. Fun. I've been meaning to write this up long-style to finish wrapping my head around it.
What is Spaces?
Spaces is a module that creates containers of overridden configuration for your site. It's not specifically about features; it's about any number of configuration values that are able to work with Spaces, including whether a Feature is active or not. (Inactive does not mean the module is disabled, just that a number of Feature-oriented things are whisked away, such as content types and Spaces-aware Views.)
When using Spaces, you need to decide what type of "buckets" you want to use. Open Atrium uses OG- and User-shaped buckets; what you need is a new sort of bucket based on user role. For the sake of sanity, you might even need to create a separate module just to define user roles as a more concrete thing in Drupal, kind of like how Spaces OG needs to lean on Organic Groups for a number of concepts.
What is Context?
Context is ultimately a page-decorating mechanism. You tell it some stuff about the page, and it modifies the page accordingly. Context cannot modify the URL; it's the other way around. Features define Contexts to tell the site how to render a given page uniquely for that Feature. There is no direct connection between Context and Spaces or between Context and PURL.
What is PURL?
PURL is a method of sticking things in the URL and keeping them there until you are done with them.
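To make "sticking things in the URL and keeping them there" a bit more concrete, here is a rough Python sketch of the path-prefix idea. It is not the PURL API: KNOWN_PREFIXES, split_prefix, and persist_prefix are hypothetical names invented for this illustration.

```python
# Hypothetical prefixes that mark site sections.
KNOWN_PREFIXES = {"managers", "members"}


def split_prefix(path):
    """Detect a known prefix at the start of the path.

    Returns (prefix, remaining_path); prefix is None when nothing matches.
    """
    trimmed = path.strip("/")
    head, _, rest = trimmed.partition("/")
    if head in KNOWN_PREFIXES:
        return head, rest
    return None, trimmed


def persist_prefix(prefix, link_path):
    """Rewrite an outgoing link so the active prefix stays in the URL."""
    link_path = link_path.strip("/")
    return f"/{prefix}/{link_path}" if prefix else f"/{link_path}"


prefix, inner_path = split_prefix("/managers/node/12")
next_link = persist_prefix(prefix, "/node/7")
print(prefix, inner_path, next_link)
```

The rest of the site only ever sees the inner path, while every generated link gets the prefix put back on.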
How this Glues Together
Spaces with PURL integration are triggered based on one of two things: The URL or something about the content in the page. To explain this, I'll use Spaces OG as an example.
You click a link. The link was prebuilt with a PURL component that Spaces OG is watching for clues. If that piece of the URL makes sense to Spaces, the Space is triggered.
All links except those that opt-out of the PURL modification persist the PURL URL element, meaning the Space is happy, and re-triggers with each page load.
Spaces OG knows to check nodes for their group affiliations. If Spaces can crack open a node and find a group, it will trigger that node's Space, using PURL's modified version of drupal_goto() to redirect the whole page for URL consistency. This will trump any existing URL structure.
If there is no URL component, and the node has no group affiliation, no Space is triggered.
Once the Space is triggered, all of that Space's configuration values are pulled into play. This means the Space's preset defaults (you can have multiple default Space configurations for every Space type) overlay Drupal's defaults, and are in turn overridden by any configuration saved specifically for the Space. In the case of Open Atrium, this includes such nice things as group color, blocks on the dashboard, and enabled Features.
If the user goes to visit something provided by a Feature (a Node, a View, etc.), any Contexts that any module provides for that node, that view, or that URL might be triggered, and start doing things with blocks and theming to tailor the page for the Feature's content.
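The configuration layering described in the steps above (Drupal's defaults, overlaid by the Space's preset defaults, overridden by values saved for the specific Space) can be sketched in Python with a ChainMap. This only illustrates the lookup order; it is not how Spaces stores or loads its overrides, and every key and value below is made up for the example.

```python
from collections import ChainMap

# Illustrative values only; none of these come from a real site.
drupal_defaults = {
    "theme": "garland",
    "menu_primary_links_source": "primary-links",
    "group_color": None,
}
preset_defaults = {"theme": "manager_theme", "group_color": "#336699"}
space_overrides = {"group_color": "#cc0000"}


def effective_settings(space_overrides, preset_defaults, drupal_defaults):
    """Look a key up in the most specific layer first, then fall back."""
    return ChainMap(space_overrides, preset_defaults, drupal_defaults)


settings = effective_settings(space_overrides, preset_defaults, drupal_defaults)
print(settings["group_color"])                # from the Space's own overrides
print(settings["theme"])                      # from the preset
print(settings["menu_primary_links_source"])  # from Drupal's defaults
```

When no Space is triggered, only the bottom layer applies, which matches the "no Space is triggered" case in the steps above.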
Next Steps
As I mention above, it sounds to me as though your first step is to try looking at Spaces OG, and rewriting it to be centered around the User Role instead of Organic Groups. You shouldn't have to do much with PURL directly besides a little copy and paste from Spaces OG. You might want to post in the Spaces issue queue to float this idea where the maintainers might see it and give pointers.
The way I understand the spaces module is this:
It provides a way for the features module (and your "features" created from this) to integrate with and be available within defined areas of your site. Out of the box this includes: Organic Groups, Taxonomy, and Users. There is an API to define more "spaces" than this.
So for example you could create a "feature" (with the features module) of an image gallery. Using spaces with organic groups, you would be able to have each group have the ability to enable and disable this feature and it would only be available within that "space" (group in this case).
From the organic groups page:
Groups get their own theme, language, taxonomy, and so on. Integrates well and depends upon Views module
So in your situation, you could think of spaces as a way to make organic groups more flexible. As NoParrots said, OpenAtrium (http://openatrium.com/) relies on the features/spaces/context modules heavily, so that might be a good place to review how these modules work together.
EDIT:
I found a great video that might explain things more clearly: http://www.archive.org/details/TheHeartOfOpenAtriumContextPurlAndSpaces_782. Around 16:00 he starts talking about PURL.
On that page (below the video) there is also an explanation of PURL/Context/Spaces, which I think is pretty good:
Context is a module for triggering reactive behaviors within a page load.
Controlling block visibility, menu
trails, page classes, and page
template layouts are examples of
things that fall into its
jurisdiction.
PURL is a library for capturing and abstracting request handling that goes
beyond what the Drupal core menu
system provides ($_GET['q']).
Detection of request components, like
subdomain, path prefix, user agent, or
file extension, and sustaining their
presence is its primary role.
Spaces is a generalized configuration override framework. In
theory it allows you to "customize
everything, for anything." In practice
it allows things like custom group
colors and features, per-user
dashboards, and multisite-like usage
of a single Drupal install.
I would suggest using Spaces or Organic Groups. Spaces is used heavily in Open Atrium, Development Seed's out-of-the-box intranet package. Intranets really require access control and feature visibility that depend on which department or role you have, so I'm confident that Spaces will be very good for you.
Of course there is the venerable Organic Groups also. Spaces is a "higher" level concept than PURL. Spaces uses the Context and PURL modules, BTW. My gut instinct is for you to use Spaces or Organic Groups.
There are a couple of videos on the net that talk about Spaces. Check them out.
What is the usefulness of W3C's Semantic Data Extractor?
http://www.w3.org/2003/12/semantic-extractor.html
This tool, geared by an XSLT
stylesheet, tries to extract some
information from a HTML semantic rich
document. It only uses information
available through a good usage of the
semantics defined in HTML.
The aim is to show that providing a
semantically rich HTML gives much more
value to your code: using a
semantically rich HTML code allows a
better use of CSS, makes your HTML
intelligible to a wider range of user
agents (especially search engines
bots).
As an aside, it can give clues to user
agents developers on some hooks that
could be interesting to add in their
product.
After validating my CSS and HTML, should I also use the Semantic Data Extractor tool?
What does it do, and how can it improve our coding? Is anyone using it?
I also checked some sites at random with it, but for most of them it gives this error:
Using org.apache.xerces.parsers.SAXParser
Exception net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException: The element type "input" must be terminated by the matching end-tag "`</input>`".
org.xml.sax.SAXParseException: The element type "input" must be terminated by the matching end-tag "`</input>`".
Is it possible to validate every site with this tool?
After validating my CSS and HTML, should I also use the Semantic Data Extractor tool?
Probably not
What does it do?
Exactly what you quoted from its homepage.
And how can it improve our coding?
Other than hitting you over the head when you have problems counting heading levels, not a lot.
I also checked some sites at random with it, but for most of them it gives this error
It depends on well-formed and sane input.
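To see what "well-formed" means here, a quick check with Python's standard XML parser shows the same kind of failure as the error in the question: an unclosed <input> is fine in HTML but is rejected by an XML parser. This is only a sketch with a different parser than the one the W3C tool uses, and the well_formedness_error helper and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return the parser's error message, or None if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


html_style = '<form><input type="text"></form>'     # valid HTML, not well-formed XML
xhtml_style = '<form><input type="text" /></form>'  # self-closed, well-formed

print(well_formedness_error(html_style))
print(well_formedness_error(xhtml_style))
```

So the tool will only work on pages that are served as well-formed XHTML, which rules out most sites.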