Removing index.htm from URL - wordpress

I have a WordPress blog which is functioning just fine: the URLs are set to Month/Day/Year and everything on the front end looks and functions fine.
However, when checking my stats and Google Webmaster Tools, there are tons of 404s that look like this:
http://theURL.com/normal-wordpress-url/index.htm
Of course, index.htm does not exist at the end of the WordPress URL, so the search engine is given a 404.
I have no idea what's causing this, as everything works fine for humans.
So basically, I need a way to tell search engines to forget about the index.htm at the end of the URL.
I've tried this in the .htaccess with no luck:
RewriteCond %{REQUEST_URI} /index\.htm?$ [NC]
RewriteRule ^(.*)index\.htm?$ "/$1" [NC,R=301,NE,L]
Does anybody have any suggestions?

There may be a few different problems here, each needing its own solution:
Problem 1: If the crawler itself is the one requesting these URLs, there are two things you might need to do:
Try going to Webmaster Tools and removing the "index.htm" URLs.
Try adding a robots.txt rule that disallows "index.htm" so that Google's crawler does not fetch it.
Problem 2: If links you have published point to these URLs, Google Webmaster Tools can tell you exactly which page each one is linked from.
So, make sure that all links pointing to "index.htm" are removed from those pages.
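A redirect can complement the clean-up steps above, so that requests already in circulation stop returning 404. The following is a minimal sketch, assuming Apache with mod_rewrite enabled and that the rule sits above the standard WordPress block in the root .htaccess. The patterns are illustrative and not taken from the original posts.

```apache
RewriteEngine On
# Act only on what the client actually requested, so internal rewrites cannot loop
RewriteCond %{THE_REQUEST} /index\.html?[?\s] [NC]
# Strip a trailing index.htm or index.html and redirect permanently to the parent URL
RewriteRule ^(.*/)?index\.html?$ /$1 [R=301,L,NC,NE]
```

With this in place, a request for /normal-wordpress-url/index.htm should be answered with a 301 to /normal-wordpress-url/, which WordPress then serves as usual.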

Related

.htaccess - Combatting String Query Spam, Custom 404 redirect

I've been trying to clean up the aftermath of a decode64 content injection hack on multiple sites on my shared server.
It's clean now, but I'm getting incoming spam links with query strings, like abc.com/?some-stupid-porno-spam/, so even though the content no longer exists, they're still being redirected to the front page. I'm ranking for these spam words instead, or, for the luckier sites, Google is just reporting them as soft 404s.
Got a solution, but it's temporary. I was advised to add the following to the top of the .htaccess file:
RewriteEngine on
RewriteCond %{QUERY_STRING} .
RewriteRule ^/?$ - [L,R=404]
So, now all links with /? are redirected to a 404. Two problems:
It's temporary, in that ALL /? queries are thrown to the 404, including WordPress post/page previews. Is there a way to make it apply only to non-existent pages?
The 404 points to the web host's 404. How can I make it go to the theme's 404 instead?
Thanks for your time!
------ update
So, the above code works great. I can preview posts/pages, but I found there's a problem: it blocks WordPress's WYSIWYG text editor. The 'visual' tab remains blank, and none of the toolbars appear.
Help? lol
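One possible way to narrow the rule, offered as an untested sketch: add a second condition that skips query strings carrying a common WordPress parameter. The parameter list below (p, page_id, preview, s) is an assumption about which query variables this site relies on and would need extending to match the actual setup. It addresses only the preview problem, not the custom 404 page or the editor issue.

```apache
RewriteEngine On
# Only act when a query string is present
RewriteCond %{QUERY_STRING} .
# Skip query strings that carry a common WordPress parameter (assumed list; extend as needed)
RewriteCond %{QUERY_STRING} !(?:^|&)(?:p|page_id|preview|s)= [NC]
# Return 404 for the site root when it is requested with any other query string
RewriteRule ^/?$ - [R=404,L]
```

Under these assumptions, /?p=123&preview=true would pass through to WordPress, while /?some-spam-string/ would still receive a 404.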

/blog (subdomain) 404 errors in webmaster tools /default.htm

I am new to Stack Overflow, but a friend gave me a tip to ask my question here since he couldn't help me either. I have been googling for multiple days now, and I see that my rankings are dropping again in Google because of all the crawl errors. My main site is built in Serif WebPlus X5. I have added a WordPress blog to it, which can be found at www.sitename .com/blog
Google has found more than 150 crawl errors and this is growing on a daily basis. The point is that Google adds /default.htm to the end of all my blog URLs.
I was wondering if someone could write me an .htaccess 301 rule for all these URLs so they redirect instantly?
Today I started manually redirecting some URLs, but this will not solve my problem because every time I add another post and new tags, all these new pages will also have the same default.htm issue.
As you can imagine this is really frustrating grrr...
I have tried a lot of code that I found during my search, but none of it did what I would like to achieve. Other tips to get rid of the default pages are also very welcome.
Thanks to all of you who would like to help fix this problem with me.
Place this rule just below the RewriteEngine On line in the main WP .htaccess:
RewriteCond %{THE_REQUEST} /default\.htm [NC]
RewriteRule ^(.*?)default\.htm$ /blog/$1 [L,R=301,NC,NE]

Wordpress page numbers and .htaccess ModRewrite rule

We're getting quite a few errors popping up in Webmaster Tools. Our homepage paginated pages are showing up without the page prefix, i.e. 17/ instead of page/17/
Can anyone help me write a rule to redirect these pages so the errors don't keep popping up?
We need to turn…
http://www.example.com/17/
into…
http://www.example.com/page/17/
We only need this to work for the homepage though as the pagination is working fine for the rest of the site i.e.
www.example.com/category/snails/17/ (This is working properly)
I'm not hot on mod rewrite and would like to start learning more about it.
Thanks,
James
Add this rule before your WordPress rules in the .htaccess file in your document root:
RewriteRule ^([0-9]+)/$ /page/$1/ [R=301,L]

.htaccess - moving all posts in root to a new category

Could anyone please help me? I am at the last chance saloon and losing a lot of traffic. Any help would be gratefully received.
After a year based on my permalink structure, all posts were in the root so have been picked up by Google as:
snowmenu.com/postname
Since changing my categories and permalink structure, I need the year's worth of posts on Google to be redirected to:
snowmenu.com/ski-snowboard-winter-sports-news/postname
Is there a way to make this happen via .htaccess?
Thank you very much to anyone who's able to help me.
I just had a look at the website and I am afraid that, to my knowledge, there is no easy way to do this type of forwarding with .htaccess.
This is because there is no way to tell the difference in link structure between a "normal link" (e.g. http://www.snowmenu.com/ski-resorts/) and what you want to redirect to (e.g. http://www.snowmenu.com/ski-snowboard-winter-sports-news/latest-ski-news/). If you redirect all requests, you will end up with links like http://www.snowmenu.com/ski-snowboard-winter-sports-news/ski-resorts/ which, if I am right, is not desirable?
The long solution would be to create an .htaccess redirect for EVERY URL.
The only other solution that comes to mind is using PHP (or similar) to do a redirect within your 404 document.
EDIT
This will redirect ALL requests to the page you want. But as I said before, I don't think this is what you want?
RewriteRule ^(?!ski-snowboard-winter-sports-news)(.*)$ /ski-snowboard-winter-sports-news/$1 [L,R=301]
EDIT 2
Having given it some thought, I think I have come up with a viable option. This will check whether the requested file exists under the new directory and, if so, redirect to it (in theory :P).
RewriteCond %{DOCUMENT_ROOT}/ski-snowboard-winter-sports-news/$0 -f
RewriteRule ^(.*)$ /ski-snowboard-winter-sports-news/$1 [R=301,L]
You can use this plugin to avoid messing with .htaccess file directly:
http://wordpress.org/extend/plugins/redirection/
It has a nice interface for you to configure the redirection rules.
The plugin mentioned by @Wordpress Hardcore works best.

How to remove URL injection by parameter en masse?

My (WordPress) site was recently the victim of an attack and ended up with around 20,000 injected URLs. I've since cleaned up the site completely, plugged all the holes, and further hardened the files, but I'm still left with all these URLs in the Google index and a message on Google that says "This site may be hacked" because of all these spammy URLs. It's just not realistic to go through and add them all to the Webmasters URL Remove tool. I've heard the best way is to get them to return a 404 (or 403), and they'll naturally fall out of the index.
Here's what I'd like to do, but haven't figured out how to do it yet: I'd like to come up with a way to force any URL with a certain parameter to display a 404 or 403. For example, the below URL is a good representation of the URLs that are currently indexed:
http://mysiteurlhere.com/index.php?free-online-games-with-cash-prizes.html&items=2&pidnum=1568
Both "items" & "pidnum" are parameters that are used in every single indexed URL that I've seen. My question is: would it be possible to single out one of those parameters with some sort of .htaccess statement, and block or force the URL to 404?
(note: I did go through the robots.txt to disallow any further URLs with parameters like these from being indexed...I just don't know how to do the .htaccess method)
Try with .htaccess:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)items=.*pidnum= [NC]
RewriteRule ^ - [R=404,L]
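The rule above requires both parameters to be present, in that order. To single out just one of them, as the question asks, a variant along these lines might work. This is a sketch, assuming pidnum is never used by legitimate URLs on the site.

```apache
RewriteEngine On
# Match any query string that contains pidnum= as a parameter name, in any position
RewriteCond %{QUERY_STRING} (?:^|&)pidnum= [NC]
# Answer with 404 Not Found; replace [R=404,L] with [F] to send 403 Forbidden instead
RewriteRule ^ - [R=404,L]
```

Anchoring on (?:^|&) keeps the condition from firing on parameters that merely end in "pidnum", such as a hypothetical xpidnum=.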

Resources