Using htaccess to rewrite special characters for Wordpress url aliases

I recently migrated a fairly large site (~6,000 posts) from Drupal to Wordpress. As part of the process, I migrated the Drupal-created url aliases to Wordpress for SEO and link retention purposes.
An example of a url alias that Drupal created that worked great in Drupal:
/stories/will-this-be-another-la-niña-year
That url in Wordpress returns a 404. However, this works:
/stories/will-this-be-another-la-nina-year
It seems my best bet is to write a generic rewrite rule in htaccess that maps international characters to their English (ASCII) equivalents before the url is passed to Wordpress.
Any idea how I might do this?
Thanks a lot for whatever help you can give.
Matt.

It seems like there might be a better way to do this within wordpress; you may want to do a quick browse through the wordpress Trac tickets, as there may be some patch or temporary fix for the problem. But if you need to go with an htaccess/redirect method, you can either use a RewriteMap to sanitize and redirect if needed, or explicitly redirect on non-ascii characters.
A RewriteMap requires access to either the server or vhost config to set up the map. It could be as simple as a list of /stories/will-this-be-another-la-niña-year URIs mapped to http://yourdomain.com/stories/will-this-be-another-la-nina-year (the all-ascii URL; the http:// is significant because it tells mod_rewrite to redirect the browser). Or you can write a script to look for non-ascii characters and replace them with the appropriate ascii character.
Text mapping:
RewriteMap sanitize txt:/path/to/uri_mapping.txt
Script mapping:
RewriteMap sanitize prg:/path/to/sanitize_script.php
Then in your htaccess file, you can invoke this mapping like this (these rules need to go above the wordpress rules, since you want the URI sanitized before wordpress gets hold of it):
RewriteRule ^(.*)$ /${sanitize:$1|$1} [L]
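A sketch of one way to wire up the text map, with some assumptions that are not in the answer itself: the map keys are written without a leading slash (in htaccess context the path matched by RewriteRule has none), the map file is saved as UTF-8, and the values are full http:// URLs. The RewriteCond looks up the requested path, and the rule only fires when the lookup returns something, so unmapped URLs fall through to the wordpress rules.

```apache
# Server or vhost config (RewriteMap cannot be declared in .htaccess):
RewriteMap sanitize txt:/path/to/uri_mapping.txt

# Example line in /path/to/uri_mapping.txt (key, whitespace, value):
#   stories/will-this-be-another-la-niña-year http://yourdomain.com/stories/will-this-be-another-la-nina-year

# .htaccess, above the wordpress rules:
RewriteEngine On
# $1 is the path captured by the RewriteRule pattern below;
# an unmapped key yields an empty string, so the condition fails.
RewriteCond ${sanitize:$1} ^(.+)$
RewriteRule ^(.*)$ %1 [R=301,L]
```

Here %1 is the value returned by the map lookup, so the browser is redirected straight to the all-ascii URL.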
If you don't have access to server/vhost config, you'll have to enumerate the possibilities in your htaccess file, again putting these rules above the wordpress rules:
# replace ñ
RewriteRule ^(.*)ñ(.*)$ /$1n$2 [R=301,L]
# replace ú
RewriteRule ^(.*)ú(.*)$ /$1u$2 [R=301,L]
etc.

I just added the following lines at the beginning of my .htaccess file and it works:
RewriteRule ^(.*)é(.*)$ /$1e$2 [R=301,L]
RewriteRule ^(.*)è(.*)$ /$1e$2 [R=301,L]
RewriteRule ^(.*)ê(.*)$ /$1e$2 [R=301,L]
RewriteRule ^(.*)î(.*)$ /$1i$2 [R=301,L]
RewriteRule ^(.*)ô(.*)$ /$1o$2 [R=301,L]
RewriteRule ^(.*)û(.*)$ /$1u$2 [R=301,L]
RewriteRule ^(.*)â(.*)$ /$1a$2 [R=301,L]
RewriteRule ^(.*)à(.*)$ /$1a$2 [R=301,L]
RewriteRule ^(.*)ï(.*)$ /$1i$2 [R=301,L]

Related

Apache rewrite to map # based URL into proper URLs

Long version (you can skip to TL;DR if you want to):
I am working with a Wordpress site that was set up by someone else. The website has multiple pages where each page has tabbed content which is accessible through #. For example:
www.example.com/services/category1/#tab-service1
www.example.com/services/category1/#tab-service2
www.example.com/services/category2/#tab-service1
www.example.com/services/category2/#tab-service2
www.example.com/services/category2/#tab-service3
Now, when search engines index the site, they index only www.example.com/services/category1/ and www.example.com/services/category2/. This creates a problem: we cannot have search engines point directly to the content within a given tab. What we want is for search engines to show links that take users directly to (say) www.example.com/services/category2/#tab-service3.
Now, I don't think google can index such # based content on its own. So, I am thinking of using apache rewrites to try to resolve this issue. I have access to .htaccess file only (from a config perspective).
TL;DR
How to redirect www.example.com/services/category1/service3/ to www.example.com/services/category1/#tab-service3 using apache redirects (I have access to .htaccess file)?
This is what I am trying but it's not working:
Options +FollowSymlinks -MultiViews
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} ^/services/category1/([a-z0-9])/? [NC]
RewriteRule .* /services/category1#tab-%1 [R,NE,L]
Someone also advised me to look into pushState server config to fix this. I am not sure how to use pushState.
UPDATE:
I have updated the rewrites to the following, but it still doesn't work. It keeps showing Wordpress' 404 page:
<IfModule mod_rewrite.c>
Options +FollowSymlinks -MultiViews
RewriteEngine On
RewriteBase /domainfolder/
RewriteCond %{REQUEST_URI} ^/services/category1/([a-z0-9]+)/?$ [NC]
RewriteRule ^/services/category1/([a-z0-9]+)/?$ /services/category1/#$1 [NE,R,L]
</IfModule>
Your %{REQUEST_URI} regex is wrong. The pattern ^/services/category1/([a-z0-9])/? captures only a single character from a-z or 0-9 after /services/category1/, followed by an optional slash. So for your request /services/category1/service3 it captures just s rather than service3; it only behaves as intended for a URL such as /services/category1/a/ .
You should be using
^/services/category1/([a-z0-9]+)/?$
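A minimal sketch of how that pattern could be used directly in a RewriteRule, assuming the .htaccess file sits in the document root and these lines are placed above the Wordpress rules. In .htaccess context the leading slash is stripped before the RewriteRule pattern is tested, so the pattern starts with services/ rather than /services/. The tab- prefix in the target is taken from the URLs in the question.

```apache
RewriteEngine On
RewriteBase /

# /services/category1/service3/  ->  /services/category1/#tab-service3
# NE keeps the "#" from being escaped to %23 in the redirect.
RewriteRule ^services/category1/([a-z0-9]+)/?$ /services/category1/#tab-$1 [NC,NE,R=302,L]
```

R=302 is a reasonable choice while testing, since browsers cache 301 redirects aggressively.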

Apache HTAccess Remove Query String for redirect

It sounds like a relatively simple one, but here goes. I have a wordpress website called example.com. It has recently been redeveloped using wordpress; however, the old site's structure used URLs like www.example.com/?page_id=22. I want to redirect anything after the domain name that contains /?page_id=[any digit] to the homepage. I'm having great difficulty accomplishing this, and I think I may also have issues because wordpress itself uses the query string page_id before the permalink redirects kick in (I could be wrong, though). Any help on removing the query string directly after the domain name would be great, especially keeping in mind that it is a wordpress install.
I have tried various iterations that look like the following to no avail
RewriteCond %{QUERY_STRING} ^page_id=(.*)$
RewriteRule (.*) http://www.example.com/ [R=301,L]
RedirectMatch 301 ^/page_id=(.*)$ /
Try this rule right at top, just below RewriteEngine On line:
RewriteCond %{QUERY_STRING} ^page_id=\d+$ [NC]
RewriteRule ^/?$ /? [R=301,L]
/? will strip off any query string.
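On Apache 2.4 or later, the QSD (query string discard) flag is another way to drop the query string. A sketch of the same redirect using it, assuming Apache 2.4+ (the flag is not available in 2.2) and that the lines sit above the wordpress rules:

```apache
RewriteEngine On

# Redirect /?page_id=<digits> to the homepage and discard the query string.
RewriteCond %{QUERY_STRING} ^page_id=\d+$ [NC]
RewriteRule ^/?$ / [QSD,R=301,L]
```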

mod_rewrite rule (dynamic to static URI) not working

I am trying to rewrite the following dynamic URI:
http://domain.com/dictionary/?h=%E8%AF%91
To this:
http://domain.com/dictionary/%E8%AF%91
Note that %E8%AF%91 is the Chinese character 译.
I am using the following mod_rewrite in .htaccess:
RewriteEngine on
RewriteRule ^dictionary/([^/\.]+)/?$ index.php?h=$1
I have also tried:
RewriteEngine on
RewriteRule ^dictionary/([^/\.]+)/?$ dictionary/index.php?h=$1
I am not sure why it is not working. Some theories that I have:
Chinese characters need a particular workaround for mod_rewrite
WordPress's rewrite rules are messing with mine
I have checked and mod_rewrite is definitely activated.
I suggest capturing against %{THE_REQUEST} with a rewritecond which will give you the original, client-encoded form of the URL in e.g. %1.
You should then find it much easier to re-substitute that capture into the new URL.
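A rough sketch of this suggestion, assuming the script lives at /dictionary/index.php (as in the question's second attempt) and that these lines sit above any WordPress rules. %{THE_REQUEST} holds the raw request line, for example GET /dictionary/%E8%AF%91 HTTP/1.1, so %1 keeps the percent-encoded form the client sent.

```apache
RewriteEngine On

# Capture the still-encoded path segment from the raw request line.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/dictionary/([^/?\s.]+)/?[?\s]
# The rule pattern excludes dots, so the rewritten index.php URL
# does not match again. NE asks mod_rewrite not to escape the % signs.
RewriteRule ^dictionary/[^/.]+/?$ /dictionary/index.php?h=%1 [NE,L]
```

With this in place, a request for /dictionary/%E8%AF%91 should reach the script with h=%E8%AF%91, which PHP decodes to 译.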

Why are .htaccess RSS rules being ignored?

I'm in the process of migrating a blog to a new platform & server, and having trouble with mod_rewrite .htaccess rules. So far I'm able to redirect post URLs and the root domain to the new server, but the rule for the RSS URL is being ignored.
Here are my rules:
RewriteRule ^[0-9]+/[0-9]+/([^/]+)/?$ http://blog.example.com/$1 [R=302,L] #working
RewriteRule ^/rss$ http://blog.example.com/rss [R=302,L] #not working
RewriteRule ^$ http://blog.example.com/ [R=302,L] #working
The first and last rule are working as expected, but the second rule is not redirecting. If I type in http://example.com/rss it does not redirect to http://blog.example.com/rss
I feel like I'm missing something simple. This is my first time fiddling with mod_rewrite. Thanks.
Assuming that you're using apache 2.0+, you need to remove the leading slash from the pattern, because it gets stripped by apache when rules in an htaccess file are being applied.
RewriteRule ^rss$ http://blog.example.com/rss [R=302,L]

redirect a domain extension to a subdirectory

I am trying to create a permanent htaccess redirect (301) from all my domain extensions into the appropriate subdirectories. The "rules" are as follows:
Redirect belgian website to its subdirectory on the main website:
from: www.example.be
to: www.example.com/befr/
Of course I would like to preserve the url parameters (if any) of the "from". Globally, if someone enters the first url, it should redirect to the second url (the language subdirectory in the main website).
I'm using wordpress and I'm hosting on Plesk. I've read many things here but I'm stuck. Thank you very much in advance for your help.
PS: I've tried that but it doesn't work
RewriteCond %{HTTP_HOST} ^(www\.)?example.be$ [NC]
RewriteRule ^(.*) http://www.example.com/befr/$1 [L,R]
From what I read in your question, your code should work (with the information you gave).
If it's not, here are some points to check:
1. Make sure mod_rewrite is enabled and htaccess can be executed (Apache config).
2. Your htaccess has to be in root folder (where example.be is forwarded).
3. About your htaccess' code:
since you're using Wordpress, make sure your rule is on top of other rules
don't forget to escape the second . in RewriteCond (unescaped, it matches any character, so the meaning is not the same even if it still works that way)
replace R flag (302 by default) by R=301 if you want a 301 redirect
Your code now looks like this
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.be$ [NC]
RewriteRule ^(.*)$ http://www.example.com/befr/$1 [R=301,L]
# your other rules (and Wordpress' default rule) here
