WordPress: how to prevent categories and archives from being crawled and indexed

I've noticed in Google's Webmaster Tools that I have two records flagged as unexpected duplicate content.
It's apparently happening because Google has crawled and indexed my categories and archives, although I have no visible links to either (that I'm aware of).
I'd like to prevent these items from being crawled and indexed, but how?
Here are the two records that Google's Webmaster Tools is showing:
/2009/10/
/category/test/

One way to control spider access is, of course, to manually create (or modify) a robots.txt file.
However, for WordPress, it might make more sense to use a plugin, such as Google Sitemap Generator or the more SEO-geared All in One SEO Pack.

You could add an if statement to the header.php file:
<?php
// is_archive() is a function, so it needs the parentheses.
if ( is_archive() ) {
?>
<meta name="robots" content="noindex, nofollow">
<?php } ?>
Google should respect that. The is_archive conditional covers categories as well - http://codex.wordpress.org/Conditional_Tags#Any_Archive_Page
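If you would rather not edit header.php, something similar can probably be done from functions.php. This is only a rough sketch, not a tested drop-in: it assumes WordPress 5.7 or later (where the wp_robots filter exists), and my_prefix_archive_robots is a made-up name used here for illustration. The decision logic lives in a plain function so it can be checked outside WordPress.

```php
<?php
/**
 * Hypothetical helper (the name is an assumption, not from the answers above).
 * Adds noindex/nofollow to the robots directives when the page is an archive.
 */
function my_prefix_archive_robots( array $robots, bool $is_archive ): array {
    if ( $is_archive ) {
        // Drop contradictory directives before adding the restrictive ones.
        unset( $robots['index'], $robots['follow'] );
        $robots['noindex']  = true;
        $robots['nofollow'] = true;
    }
    return $robots;
}

// Register only when running inside WordPress (assumes WP 5.7+ for wp_robots).
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'wp_robots', function ( array $robots ): array {
        return my_prefix_archive_robots( $robots, is_archive() );
    } );
}
```

If an SEO plugin already prints its own robots meta tag, check the page source afterwards so you don't end up with two conflicting tags.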

Related

Yoast plugin is not showing meta description and meta keyword

I am a newbie and learning new things every now and then. I recently set up Yoast on my WordPress website and manually entered a title, focus keyword, and meta description for every single page in the Yoast widget. These show up under the page in the editor, but unfortunately the meta description and keywords do not appear in the page source.
In fact, the same title and description are displayed throughout the website. Is any additional configuration required after installing Yoast, in header.php or in some other file?
You can add code to functions.php to output the focus keyword as a meta keywords tag in the <head> section of the page.
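function set_head_keywords() {
    $id = get_the_ID();
    if ( ! $id ) {
        return;
    }
    $meta = get_post_meta( $id, '_yoast_wpseo_focuskw', true );
    if ( ! $meta ) {
        return; // No focus keyword saved for this post.
    }
    // Escape the value before printing it into an attribute.
    echo '<meta name="keywords" content="' . esc_attr( $meta ) . '" />' . "\n";
}
add_action( 'wp_head', 'set_head_keywords' );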
The Yoast SEO team's own answer is quoted below:
We’ve removed the meta keywords feature in Yoast SEO from version 6.3. Meta keywords haven’t had a use for a long time, so their removal from our plugin has been long overdue.
Reference: https://kb.yoast.com/kb/meta-keywords-yoast-seo/
Google:
Google does not use the keywords meta tag in web ranking.
Reference: https://webmasters.googleblog.com/2009/09/google-does-not-use-keywords-meta-tag.html
Yahoo! has announced that it no longer uses the meta keywords tag either.
Bing:
In 2014: Today, it’s pretty clear the meta keyword tag is dead in terms of SEO value. Sure, it might have value for contextual ad systems or serve as a signal to ‘bots plying the web looking for topics to target, but as far as search goes, that tag flat lined years ago as a booster.
Reference: https://blogs.bing.com/webmaster/2014/10/03/blame-the-meta-keyword-tag

Preventing search engines from indexing all posts

I'm working on a WordPress site where I'm using posts to create a list of tour dates for an entertainer. With ACF I have fields set up in a table, and the client just enters a date, location, link to buy tickets, etc.
The table is all I need visitors to see. The actual post created by single.php is not going to be styled and should never be seen.
I want to prevent someone from searching for the artist and city and coming across the post.
Is there a plugin I can use, or a Disallow rule I can put in the robots.txt file?
Any help is appreciated. Kinda funny in a time where everyone is trying to get noticed by search engines and I want to hide something from them!
Add the code below to your theme's functions.php:
add_action( 'wp_head', 'your_prefix_noindex_nofollow' );
function your_prefix_noindex_nofollow() {
    // Only single post views get the tag; pages and archives are untouched.
    if ( is_single() ) {
        echo '<meta name="robots" content="noindex,nofollow"/>';
    }
}
You can also change "your_prefix" in the function name to whatever you like. It will work as is, but it's a good practice to use the same prefix in all your function names.

WordPress SEO by Yoast doesn't display meta tags on paginated pages

I have a WordPress site whose SEO meta tags are populated by the Yoast plugin.
This seems to work fine, except on paginated pages. On page 2 or higher there is no
<meta name="description" > tag. Is there a simple setting I need to change? I can't find anything about it online at all.
Check whether your theme's header.php has a line with:
<?php wp_head(); ?>
That call lets WordPress and plugins inject code into the header. It has to sit just before the closing </head> tag. If it is not present, add it.

How do I get WordPress categories to show in Google results?

Hello,
This is my website: http://rehlat-world.com
When I search Google for site:rehlat-world.com,
the results only include posts, pages, and tags.
I need categories to be included as well, but I can't get that to work.
Here is an example of a category: http://rehlat-world.com/country/indonesia
=======================
The source of the category page contains " ", and the category is also included in the sitemap at http://rehlat-world.com/sitemap.xml
Please help me get the categories included.
Note: I'm using these plugins: All in One SEO Pack, Google XML Sitemaps, and WP Super Cache.
I can help you with your issue. This is an easy error to make and thankfully just as easy to fix.
Take a look at the source code of your category pages (right-click anywhere on the page and choose the option to view the page source).
On line 9 you will see
<meta name="robots" content="index, follow" />
This is perfectly fine, but if you scroll down to lines 74 - 80, you will see that the All in One SEO plugin has also added its own meta tags:
<!-- All in One SEO Pack 1.6.13.2 by Michael Torbert of Semper Fi Web Design[418,446] --> <meta name="robots" content="noindex,follow"/> <link rel="canonical" href="http://rehlat-world.com/country/indonesia"/> <!-- /all in one seo pack -->
So you can see a second "robots" meta tag, this one specifying "noindex". Simply go into your All in One SEO plugin settings and disable the option that adds the robots meta tag to categories.
The first meta tag is all you need.
That will do the job, and your categories should be indexed once Google recrawls them.
I will also add a suggestion that should help your site in the future by making it more appealing to your visitors and the search engines. I looked in your sitemap and noticed your permalinks are extremely ugly due to the Arabic text being used, which in turn can't be recognized by WordPress or the browsers because you still have WordPress set to English. You should really change your WordPress language config to Arabic.
The very first line in your source file says <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Strict//EN tells your browser the website is set in the English language, and that's how browsers should read the website. You should be able to fix this by adding in the header.php file of your theme, above the tag. I think this should work, but I'm not 100% sure and may be wrong.
You can also edit your wp-config.php file, find define ('WPLANG', ''); and change it to define ('WPLANG', 'ar');. I have very little experience with this, so it would be wise to read http://codex.wordpress.org/Translating_WordPress#WordPress_Localization_Repository
It could also save you time to do it with a plugin like http://wordpress.org/extend/plugins/gtranslate/
If you are already well aware of this and its not causing any issues with your rankings, disregard what I said.
Good luck
Aaron

How to noindex one page of a website in Google

I would like to know how to prevent one page of a website from being indexed by Google or any other robots.
My script uses templates made of TPL files: Index.tpl, Header.tpl, and so on.
So how do I tell Google not to index the page login.tpl?
Thank you
If you want a specific URL (or a directory) not to be indexed by crawlers, a simple solution is to use a robots.txt file, which lets you specify what can and cannot be crawled.
For more information, see About /robots.txt
For example, if you want crawlers to stay away from the /my-page.php URL, you could use something like this in your robots.txt file:
User-agent: *
Disallow: /my-page.php
As a side note: files that should not be visible to end users (include files, libraries, non-interpreted templates, and so on) should not be served by your web server at all. No one should be able to access them.
If you are using Apache, you can put a .htaccess file in a given folder (provided this feature is enabled) to prevent Apache from serving any file from that folder:
Deny from All
Note: Apache will serve nothing from a directory that contains a .htaccess file with that content! (On Apache 2.4 the equivalent directive is Require all denied.)
This is not correct. The robots.txt does not tell crawlers what to index and what not to index. That's what you use the meta-robots tag for. Have it serve noindex and you're good.
See for example and further reading: http://yoast.com/x-robots-tag-play/
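For a PHP-driven site like the one in the question, the X-Robots-Tag header from that article could be sent roughly as follows. This is a minimal sketch under assumptions: build_robots_header is a hypothetical helper name, and the snippet is assumed to run in whichever PHP script renders login.tpl, before any output is sent.

```php
<?php
/**
 * Hypothetical helper: builds the X-Robots-Tag header line
 * from two yes/no choices.
 */
function build_robots_header( bool $index, bool $follow ): string {
    $directives = array(
        $index ? 'index' : 'noindex',
        $follow ? 'follow' : 'nofollow',
    );
    return 'X-Robots-Tag: ' . implode( ', ', $directives );
}

// Headers must go out before any page output, so guard the call.
if ( ! headers_sent() ) {
    header( build_robots_header( false, false ) );
}
```

Keep in mind that a crawler only sees this header (or a meta robots tag) if it is allowed to fetch the page, so the same URL should not also be disallowed in robots.txt.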
I know I am late with this answer, but it could help others too.
Below is a more precise answer.
I am assuming that you are using WordPress for your site.
You can use the WordPress "Custom Fields" option (you can find details here).
The first thing you need to do is add the following code to the head section of your theme’s header.php template.
Copy the code below:
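<?php
// Only check the custom field on single posts and pages.
if ( is_singular() ) {
    $noindex = get_post_meta( get_queried_object_id(), 'noindex-page', true );
    if ( $noindex ) {
        echo '<meta name="robots" content="noindex,follow" />';
    }
}
?>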
Now all you need to do is add a custom field named noindex-page to the page and assign a value to it. It doesn’t matter what you enter, as long as the field is not empty, so that the noindex-page check in the header code evaluates as true.
Keep in mind that this also works for posts.

Resources