Google sitemap HrefLang tag without the main site url - xml-sitemap

We have websites with multilingual content.
e.g.
http://www.example.com/about-us/
http://www.example.com/en-HK/about-us/
http://www.example.com/en-GB/about-us/
http://www.example.com/zn-CH/about-us/
We need to configure the hreflang tags in sitemap for Google to know that there are alternate links for the same pages in different languages.
I know for the above example that my sitemap url tag would look like this:
<url>
<loc>http://www.example.com/about-us</loc>
<xhtml:link rel="alternate" hreflang="en-GB" href="http://www.example.com/en-GB/about-us"/>
<xhtml:link rel="alternate" hreflang="en-HK" href="http://www.example.com/en-HK/about-us"/>
<xhtml:link rel="alternate" hreflang="zn-CH" href="http://www.example.com/zn-CH/about-us"/>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
However, if I don't have the main url but just the last three ones with en-HK, en-GB and zn-CH, then how should my url tag look? Should I just skip the loc tag and keep the three xhtml:link tags? Or can I specify any url in the loc tag and put the remaining two in xhtml:link tags?
I am new to Google sitemaps. Any help is greatly appreciated.
Thanks,
Rashmi
Edit:
From the answer posted on sitemap for domain with multilanguage site, for my example with sites in en-HK, en-GB and zn-CH, should there be three url tags, with each of them assigned to loc with the other two in xhtml:link?

Ok. Found the answer at:
Help Google serve the correct language to your visitors
We need a url tag for each of the URLs, with the others specified as alternate URLs.
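A minimal sketch of that layout in Python, assuming the URL pattern and locale codes from the question (copied verbatim; Chinese as used in mainland China is normally written zh-CN). The helper name url_entries is hypothetical. Each entry also lists itself as an alternate, which is my reading of Google's guidelines rather than something stated in the question.

```python
from xml.sax.saxutils import escape, quoteattr

# Base URL and locale codes are taken from the question; adjust to your site.
BASE = "http://www.example.com"
LOCALES = ["en-GB", "en-HK", "zn-CH"]


def url_entries(path, locales=LOCALES, base=BASE):
    """Return one <url> element per locale, each listing every variant as an alternate."""
    variants = {loc: f"{base}/{loc}/{path}" for loc in locales}
    entries = []
    for loc in locales:
        lines = ["<url>", f"  <loc>{escape(variants[loc])}</loc>"]
        # List all language versions, including the one named in <loc>.
        for alt, href in variants.items():
            lines.append(
                f"  <xhtml:link rel=\"alternate\" hreflang={quoteattr(alt)} href={quoteattr(href)}/>"
            )
        lines.append("</url>")
        entries.append("\n".join(lines))
    return entries


if __name__ == "__main__":
    print("\n".join(url_entries("about-us/")))
```

The entries still have to be wrapped in a urlset element that declares the xhtml namespace (xmlns:xhtml="http://www.w3.org/1999/xhtml") for the xhtml:link tags to be valid.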

Related

Weird Page Views in Google Analytics

In Google Analytics, I noticed a weird URL path under my domain that doesn't even exist. For example:
my domain is ABC.com
Under page views I see: ABC.com/products/L-apos.
I have the path "products", but there is no such thing as "L-apos", which is very weird.
When I did some checking, I found that what most people face is totally different URLs, such as porn links. In my case, however, the domain is correct but the path doesn't exist.
The html entity for a single quote is:
&apos;
Which is pretty close to your L-apos. Is it possible you have an errant ' at the end of an href to the /products/ page?

How to add an RSS feed link globally to every page?

First thought was to include the link tag in my theme. If a theme can be used: which content type do I have to use?
If a theme cannot be used: where do I put the link tag?
Used it before, sorry, didn't recall #facepalm
How to do it:
Place the code manually in the resource section of your XPage/Custom Control, as you are not able to compute the href value:
<xp:this.resources>
<xp:linkResource rel="alternate" type="application/rss+xml"
title="Oliver Busse - OSnippets"
href="/#{javascript:config.getConfig().getItemValueString('pathSnippets')}/rss.xsp">
</xp:linkResource>
</xp:this.resources>

Can I add rel nofollow to iframe tag?

I have a widget distributed to some sites via an iframe, but I don't want Googlebot to index the URL. Will adding nofollow to the iframe tag resolve the problem? Does the iframe tag support nofollow, and will Google understand it?
You can get the effect of rel="nofollow" on an iframe, but you have to be a little tricky about it. Here's how:
Build a blank HTML file and add only your iframe to it. In its head, include this meta tag:
<meta name="robots" content="noindex,nofollow">
Now, iframe that page into the one you're going to show.
The rel attribute is not allowed for the iframe element.
See allowed attributes for iframe: HTML5, HTML 4.01
According to Google, all you need is
<meta name="robots" content="none" />
“none - Equivalent to noindex, nofollow”
It is not possible, but there is a workaround. You can use a URL on your own domain that redirects to the iframe's URL, and use robots.txt to prevent bots from following that URL.
In your robots.txt add
User-agent: *
Disallow: /h/
Then create a 301 redirect in your .htaccess (or the equivalent if you use something else, like nginx) that redirects a local URL to the URL of the iframe:
Redirect 301 /h/fancy-url/ http://targetdomain.com/the-uri-of-iframe/
In your iframe, use the local URL:
<iframe src="https://yourdomain.com/h/fancy-url/?possibleparam=xx"></iframe>
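One way to sanity-check the robots.txt rule from this workaround is Python's standard urllib.robotparser. This is only a sketch: the function name is_crawlable is hypothetical, and Python's parser approximates how a crawler reads robots.txt; it is not Google's own implementation.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above, inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /h/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(is_crawlable("https://yourdomain.com/h/fancy-url/?possibleparam=xx"))
    print(is_crawlable("https://yourdomain.com/about/"))
```

The first URL falls under the disallowed /h/ prefix, while the second (a made-up path) does not.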
"rel" is not a recognised attribute for the "iframe" tag according to the W3C specs. You could use some JavaScript and document.write to place the iframe code on the page.
You might want to check out "Is IFrame crawled by Google?", but from my understanding "nofollow" is not allowed on an iframe.

How can I find feed or XML of a particular news source

I want to get the XML file of a particular news source, or to know if there is any project that converts HTML news to XML by parsing the page and extracting its various traits, such as date, author name, title, content, etc., into a single XML or similar type of file.
For example see this link:
http://daily.bhaskar.com/article/NAT-TOP-yeddyurappa-breaks-venkaiah-naidus-laptop-slaps-minister-reports-2318460.html
How can I extract the content, author, date, etc. from this webpage? If I could find this webpage's feed, I could do that easily, but how can I search for it?
Which technology are you using?
If it's a purely client-side / web solution, then you'll find JS options in a previous StackOverflow question. If you're on the server side, you can use WebClient/LINQ to hit the Atom feed and parse it.
To find out if a page has a feed, scan the HTML for a specific <link> tag with these rel and type attributes:
<link rel="alternate" type="application/rss+xml" title="Page as RSS"
href="http://example.com/page/feed">
The feed URL is stored in the href attribute. This mechanism is called RSS Autodiscovery.
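The scan described above can be sketched with Python's standard html.parser. The names FeedLinkFinder and find_feeds are hypothetical, and accepting the Atom MIME type alongside RSS is an assumption on my part; fetching the page itself is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# RSS is the type named in the answer; Atom is included as an assumption.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        if "alternate" in rels and a.get("type", "").lower() in FEED_TYPES and a.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, a["href"]))


def find_feeds(html, base_url=""):
    """Return the feed URLs advertised in the given HTML text."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

Calling find_feeds(page_html, page_url) on a downloaded page returns a list of feed URLs, or an empty list if the page advertises none.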

How to show WordPress categories in Google results?

how are you?
this is my website: http://rehlat-world.com
When I search Google for site:rehlat-world.com,
the results only include posts, pages, and tags.
I need categories to be included, but I can't get it to work.
this is example for category : http://rehlat-world.com/country/indonesia
=======================
The source of the category page is " ", and it is also included in sitemaps.xml: http://rehlat-world.com/sitemap.xml
Please help me find out how I can include it.
Note: I'm using these plugins: All in One SEO Pack, Google XML Sitemaps, and WP Super Cache.
I can help you with your issue. This is an easy error to make and thankfully just as easy to fix.
Take a look at the source code of your category pages (right-click anywhere on the page and select the option to view the source code).
On line 9 you will see:
<meta name="robots" content="index, follow" />
This is perfectly fine, but if you scroll down to lines 74 - 80, you will see that the All in One SEO plugin has also added its own meta tags:
<!-- All in One SEO Pack 1.6.13.2 by Michael Torbert of Semper Fi Web Design[418,446] -->
<meta name="robots" content="noindex,follow"/>
<link rel="canonical" href="http://rehlat-world.com/country/indonesia"/>
<!-- /all in one seo pack -->
So you can see the repeated "robots" meta tag specifying "noindex". Simply go into your All in One SEO plugin settings and disable the option to add robots meta tags to categories.
Obviously the first meta tag is all you need.
This will do the job, and the categories will be indexed in no time.
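A quick way to spot this kind of conflict on other pages is to collect every robots meta tag and check whether any of them blocks indexing. The sketch below uses Python's standard html.parser; the names RobotsMetaCollector, robots_directives, and blocks_indexing are hypothetical. It assumes that a single noindex (or none) is enough to keep the page out of the index, since search engines generally honor the most restrictive directive.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []  # one list of directives per robots meta tag

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag == "meta" and a.get("name", "").lower() == "robots":
            parts = [d.strip().lower() for d in a.get("content", "").split(",")]
            self.directives.append([d for d in parts if d])


def robots_directives(html):
    """Return the directive lists of all robots meta tags, in document order."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    return collector.directives


def blocks_indexing(html):
    """Assumption: one 'noindex' or 'none' among the tags blocks indexing."""
    return any("noindex" in tag or "none" in tag for tag in robots_directives(html))
```

Run against the category page's source, robots_directives would return two entries, one per meta tag, and blocks_indexing would report the noindex added by the plugin.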
I will also add a suggestion that will help your site in the future by making it more appealing to your visitors and the search engines. I looked at your sitemap and noticed your permalinks are extremely ugly due to the Arabic text being used, which in turn can't be recognized by WordPress or the browsers because you still have WordPress set to English. You should really change your WordPress language config to Arabic.
The very first line in your source file says <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Note that the EN in Strict//EN refers to the language of the DTD itself, not of your pages, so the page language has to be declared separately. You should be able to do this by adding in the header.php file of your theme, above the tag. I think this should work, but I'm not 100% sure and may be wrong.
You can also edit your wp-config.php file, find define ('WPLANG', '');, and change it to define ('WPLANG', 'ar');. I have very little experience with this, so it would be wise to read http://codex.wordpress.org/Translating_WordPress#WordPress_Localization_Repository
It could also save you time to do it with a plugin like http://wordpress.org/extend/plugins/gtranslate/
If you are already well aware of this and it's not causing any issues with your rankings, disregard what I said.
Good luck
Aaron

Resources