Replace or mirror a url from another url using htacess - wordpress

i have a website that is built in wordpress and i have a url
www.mywebsite.com/product/my-new-product so what i need is i want to remove /product from url . i want www.mywebsite.com/my-new-product .
or is it possible that some one click on www.mywebsite.com/my-new-product then it will show the content of www.mywebsite.com/product/my-new-product
Please check
i tired the following but it also not working
RewriteEngine on
RewriteRule ^/product/my-new-product$ /my-new-product/

Rewriting works the other way round than what you apparently expect. It works on incoming requests, so you need to rewrite the URL you want to see in the browsers address bar to what internally is required on the server side:
RewriteEngine on
RewriteRule ^/?my-new-product/(.*)$ /product/my-new-product/$1 [END]
To keep the desired URL in the browsers address bar you also need to take care that the page uses relative references. So not absolute ones referring to /product/my-new-product/... but only to /my-new-product. That is outside the scope of rewriting, though.
In case you receive an internal server error (http status 500) using the rule above then chances are that you operate a very old version of the apache http server. You will see a definite hint to an unsupported [END] flag in your http servers error log file in that case. You can either try to upgrade or use the older [L] flag, it probably will work the same in this situation, though that depends a bit on your setup.
This rule will work likewise in the http servers host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the http server and enabled in the http host. In case you use a dynamic configuration file you need to take care that it's interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http servers host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http servers host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).

Related

redirect all sub url to primary url

I have a wordpress site running on apache. I need to redirect domain.com/examplepage/* to domain.com/examplepage
so as an example
domain.com/examplepage/randomstring/randomstring
world go to
domain.com/examplepage
I've tried to google how to do this but cant find a way. I'm sure it's because I just don't know how to search for the correct thing. I'm willing to use wordpress plugin, .htaccess, or apache config. Whatever works.
Well, you can see many, many answers to this here on SO. Did you check the "Related" section on the right hand side here?
Anyway, here is what you are probably looking for:
RewriteEngine on
RewriteRule ^/?examplepage/.+ /examplepage [END]
The above implements an internal rewrite. In case you really want an external redirection instead this would be the variant:
RewriteEngine on
RewriteRule ^/?examplepage/.+ /examplepage [R=301]
It is a good idea to start out with a 302 temporary redirection and only change that to a 301 permanent redirection later, once you are certain everything is correctly set up. That prevents caching issues while trying things out...
In case you receive an internal server error (http status 500) using the rule above then chances are that you operate a very old version of the apache http server. You will see a definite hint to an unsupported [END] flag in your http servers error log file in that case. You can either try to upgrade or use the older [L] flag, it probably will work the same in this situation, though that depends a bit on your setup, certainly you will need to add another rewriting condition to break an endless rewriting loop.
This implementation will work likewise in the http servers host configuration or inside a distributed configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the http server and enabled in the http host. In case you use a distributed configuration file you need to take care that it's interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http servers host configuration instead of using distributed configuration files (".htaccess"). Those distributed configuration files add complexity, are often a cause of unexpected behavior, hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http servers host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).

Apache redirect with Query parameter

I am new to wordpress, i have a situiation where i have to redirect an url with query params to a new url. can some one please help me out with it . i tried few things but none of it works
here is the url pattern
http://exampledomain.com/page/?p=job/123
i want it to redirect to
http://newdomain.com/jobs/123
This probably is what you are looking for:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^|&p=job/(\d+)&|$
RewriteRule ^/?page/?&p /jobs/%1 [R=301]
RewriteRule ^/?jobs/(\d+)$) /page/?p=job/$1 [END]
In case you receive an internal server error (http status 500) using that rule chances are that you operate an old version of the apache http server. You will see a definite hint to an unsupported [END] flag in that case in the http server's error log file. Either upgrade the http server or use the older [L] flag, it probably will work the same here, though that depends a but on your setup.
That rule should work likewise in the http server's host configuration or in a dynamic configuration file (".htaccess" style file). You should prefer the first option, but if you really have to use a dynamic configuation file then take care that the interpretation of such files is enabled at all in the http server configuration and that the file is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http servers host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http servers host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).

http redirects to https

What would cause a site to try an go to an https url?
We have sitecore set up to redirect non www URLs to www pre-pended URLs. Example: joesrx.com resolves to www.joesrx.com through the Sitecore URLResolver.
What we are seeing is that if you type joesrx.com, it tries to go to https://joesrx.com before it hits the Sitecore server. Since there are no certificates on this server and https is not utilized we get a 404.
Is there something in IIS that is misconfigured? Proxy teams says it is not in their setting and network team says all of the DNS entries are correct.
As a general rule for debugging these sorts of problems, try to imagine all the elements between you and the application and then use a simple divide and conquer approach. You can also test behavior on individual levels of the path between you and the actual application.
In this case for example (from you to application code):
User
Browser
browser may do caching of redirects. Try a different browser / try incognito mode / clear cache
Browser Extensions/Settings
any extensions which make it so the browser always tries to access website(s) via https? Try with extension disabled / another browser
Proxies/Firewalls
any Proxies/Firewalls on your end which may modify requests? Can you try to access the site bypassing any proxies/firewalls, maybe from a different network connection?
Network
Web Server
Web Server Configuration / Pipelines / Resolvers / Filters / Etc.
.htaccess / IIS config / nginx config / servlet filters / (lots of options depending on your framework). Check the server
Actual application code
well.. check the code.
Example of divide and conquer, choosing the Network mid-point: Try accessing the URL with wget/curl from command-line, curl -i will also show you the headers received from the server. If you find a "Location: .." header it's clear that the server is sending a redirect. So now you only have to check Web Server / framework configuration and actual application code.
There are a few things I would check first:
Do you have rewrite rules in your web.config? They may be pattern-matching on www. and redirecting in order to enforce SSL
Do you have code in your pipelines that is attempting to enforce SSL for specific paths? The code here may not be checking the URL correctly.
In IIS, did you bind the 'www' host name to your IIS site? Or is it falling through to another site that has SSL enforced?
In case the other answers don't help, check for HTST headers such as "Strict-Transport-Security: max-age=31536000".
This HTTP header tells browsers to use only SSL for future requests (among other things).
For more info check out:
https://www.owasp.org/index.php/HTTP_Strict_Transport_Security

strange 401 error appears for some urls when using .htaccess to redirect http to https

OK, here is the 7th day of unsuccessfull attempt to find an answer why 401 error appears...
Now,
.htaccess in the root folder contains the only 3 strings (was simplified) and there are NO more .htaccess files in the project:
RewriteEngine On
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
So, it redirects all requests to be https. It works fine for any urls, even for /administration directory.
So,
http://mydomain.com
becomes
https://mydomain.com
If https://mydomain.com was entered, there are no redirections.
http://mydomain.com/administration/index.php
becomes
https://mydomain.com/administration/index.php
If https://mydomain.com/administration/index.php was entered, there are no redirections.
That's clear, and the problem is below.
I want /administration directory to be password protected. My Shared Hosting Control Panel allows to protect directories without manual creating of .htaccess and .htpasswd (you choose a directory to protect, create username and password, and .htaccess and .htpasswd are created automatically). So, .htaccess appears in the /administration folder. .htpasswd appears somewhere else, the path to .htpasswd is correct, and everything looks correct (it works the same way as to create it manually). So, there are 2 .htaccess files in the project, one in the root directory and one in the /administration directory (with .htpasswd at the directory .htaccess knows where it is).
Once the password is created,
the results are:
You enter:
https://mydomain.com/administration/index.php
Then it asks to enter a password.
If you enter it correctly,
https://mydomain.com/administration/index.php is displayed.
The result: works perfect.
But, if you enter
http://mydomain.com/administration/index.php (yes, http, without S)
then instead of redirecting to the same,but https page,
it redirects to
https://mydomain.com/401.shtml (starts with httpS)
by unknown reason and even does NOT ask a password. Why?
I've contacted a customer support regarding this question and they are sure the problem is in .htaccess file, and they do not fix .htaccess files (that's clear, they do not, I don't mind).
Why does this happen?
Did I forget to put some flags, or some options to change default settings in the .htaccess file?
P.S.Creating .htaccess and .htpasswd manually (not from hosting Control Panel) for the folder /administration causes the same 401 error in case if not https, but http was entered.
And the problem appears with URLs to /administration directory only.
Thank you.
Try using this instead. Not the L and R flag.
RewriteEngine On
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
Also clear your browsers cache first, to remove the old incorrect redirect.
If that doesn't work try using this.
RewriteCond %{HTTPS} !on
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ ([^\ ]+)
RewriteRule ^ https://%{HTTP_HOST}%2 [L,R=301]
I feel a bit bad about writing it, as it seems kind of hackish in my view.
EDIT
Seems the 2nd option fixed the problem. So here is the explanation as to why it works.
The authentication module is executed before the rewrite module. Because the username and password is not send when first requesting the page, the authentication module internally 'rewrites' the request url to the 401 page's url. After this mod_rewrite comes and %{THE_REQUEST} now contains 401.shtml instead of the original url. So the resulting redirect contains the 401.shtml, and not the url you want.
The get to the original (not 'rewritten') url, you need to extract it from %{THE_REQUEST}. THE_REQUEST is in the form [requestmethod] [url] HTTP[versionnumber]. The RewriteCond extracts just the middle part ([url]).
For completeness I added the [L,R=301] flags to the second solution.
I think I found an even better solution to this!
Just add this to your .htaccess
ErrorDocument 401 "Unauthorized"
Solution found at:
http://forum.kohanaframework.org/discussion/8934/solved-for-reall-this-time-p-htaccess-folder-password-protection/
-- EDIT
I eventually found the root cause of the issue was ModSecurity flagging my POST data (script and iframe tags cause issues). It would try to return a 401/403 but couldn't find the default error document because ModSecurity had made my htaccess go haywire.
Using ErrorDocument 401 "Unauthorized" bypassed the missing error document problem but did nothing to address the root cause.
For this I ended up using javascript to add 'salt' to anything which was neither whitespace nor a word character...
$("form").submit(function(event) {
$("textarea,[type=text]").each(function() {
$(this).val($(this).val().replace(/([^\s\w])/g, "foobar$1salt"));
});
});
then PHP to strip the salt again...
function stripSalt($value) {
if (is_array($value)) $value = array_map('stripSalt', $value);
else $value = preg_replace("/(?:foobar)+(.)(?:salt)+/", "$1", $value);
return $value;
}
$_POST = stripSalt($_POST);
Very, Very, Very Important Note:
Do not use "foobar$1salt" otherwise this post has just shown hackers how to bypass your ModSecurity!
Regex Notes:
I thought it may be worth mentioning what's going on here...
(?:foobar)+ = match first half of salt one or more times but don't store this as a matched group;
(.) = match any character and store this as the first and only group (accessible via $1);
(?:salt)+ = match second half of salt one or more times but don't store this as a matched group.
It's important to match the salt multiple times per character because if you've hit submit and then you use the back button you will go back to the form with all the salt still in there. Hit submit again and more salt gets added. This can happen again and again until you end up with something like:
foobarfoobarfoobarfoobar>saltsaltsaltsalt
I was not satisfied with the solutions above so I came up with another one:
In a modern web server configuration we should redirect all traffic to HTTPS so the user can not reach any content without HTTPS. After the user's browsing our content with HTTPS we can use authentication. With this in mind we can wrap the authentication directive in an If directive:
<If "%{HTTPS} == 'on'">
AuthType Basic
...
</If>
You can leave and use Rewrite directives as you like.
With this solution:
you must not change ErrorDocument as suggested by Hoogs
you must not extract path from THE_REQUEST in a hackish way as suggested by Gerben
This is the type of thing is that is a bit tricky to troubleshoot on Apache without the box right in front of you, but I what I think is happening is that your rewrite directive is being processed after path resolution, and it's the path resolution that has the password requirement.
Backing up a bit, the way a URL is resolved in Apache is that the request comes in and gets handed from module to module, kind of like a bucket brigade. Each module does its own thing....some modules do content negotiation, some translate URLs to file paths, some check authentication, one of them is mod_rewrite ...
One place where you see this in the configuration is actually that there is both a Location directive and a Directory directive which seem the same in most respects, but they are different because Locations talk about URLs and Directories talk about filesystem paths.
Anyhow, my guess is that going down the bucket brigade, Apache figures out that it needs a password to access that content before it figures out that it needs to redirect to HTTPS. (mod_rewrite is kind of a crazy module and it can mess with all kinds of things in surprising ways..it can do path translation, bits and pieces of rewrite, make subrequests, and a bunch of other nutty things).
There are few ways you can fix this that I can think of.
Change your directory root in the vhosts container for the http site so that it can't find the passworded file (this would be my approach)
Change your module load order so that mod_rewrite happens earlier in the chain (may have unexpected consequences)
Use setenvif
That last one needs more explanation. Remember the bucket brigade I told you about? Apache modules can also set environment variables, which are completely outside of the module->module->module->chain. You could, perhaps, set an environment variable if the site is not HTTPS. Then however you set up your access control could use the SetEnvIf directive to always allow access to the resource if it's set, BUT you have to make sure for sure that you're going to hit that rewrite rule.
As I said, my choice would #1 but sometimes people need to do crazy things, and Apache will let you.
My real-world SOP for https:// sites these days is that I just shoot all of my port 80 content over to a single vhost that can't serve any content at all. Then i mod_rewrite everything over https://... badda bing, badda boom, no http and no convoluted security risks.

How does url rewrite works?

How does web server implements url rewrite mechanism and changes the address bar of browsers?
I'm not asking specific information to configure apache, nginx, lighthttpd or other!
I would like to know what kind of information is sent to clients when servers want rewrite url?
There are two types of behaviour.
One is rewrite, the other is redirect.
Rewrite
The server performs the substitution for itself, making a URL like http://example.org/my/beatuful/page be understood as http://example.org/index.php?page=my-beautiful-page
With rewrite, the client does not see anything and redirection is internal only. No URL changes in the browser, just the server understands it differently.
Redirect
The server detects that the address is not wanted by the server. http://example.org/page1 has moved to http://example.org/page2, so it tells the browser with an HTTP 3xx code what the new page is. The client then asks for this page instead. Therefore the address in the browser changes!
Process
The process remains the same and is well described by this diagram:
Remark Every rewrite/redirect triggers a new call to the rewrite rules (with exceptions IIRC)
RewriteCond %{REDIRECT_URL} !^$
RewriteRule .* - [L]
can become useful to stop loops. (Since it makes no rewrite when it has happened once already).
Are you talking about server-side rewrites (like Apache mod-rewrite)? For those, the address bar does not generally change (unless a redirection is performed).
Or are you talking about redirections? These are done by having the server respond with an HTTP code (301, 302 or 307) and the location in the HTTP header.
There are two forms of "URL rewrite": those done purely within the server and those that are redirections.
If it's purely within the server, it's an internal matter and only matters with respect to the dispatch mechanism implemented in the server. In Apache HTTPD, mod_rewrite can do this, for example.
If it's a redirection, a status code implying a redirection is sent in the response, along with a Location header indicating to which URL the browser should be redirected (this should be an absolute URL). mod_rewrite can also do this, with the [R] flag.
The status code is usually 302 (found), but it could be configured for other codes (e.g. 301 or 307).
Another quite common use (often unnoticed because it's usually on by default in Apache HTTPD) is the redirection to the the URL with a trailing slash on a directory. This is implemented by mod_dir:
A "trailing slash" redirect is issued
when the server receives a request for
a URL http://servername/foo/dirname
where dirname is a directory.
Directories require a trailing slash,
so mod_dir issues a redirect to
http://servername/foo/dirname/.
Jeff Atwood had a great post about this: http://www.codinghorror.com/blog/2007/02/url-rewriting-to-prevent-duplicate-urls.html
How web server implements url rewrite mechanism and changes the address bar of browsers?
URL rewriting and forwarding are two completely different things. A server has no control over your browser so it can't change the URL of your browser, but it can ask your browser to go to a different URL. When your browser gets a response from a server it's entirely up to your browser to determine what to do with that response: it can follow the redirect, ignore it or be really mean and spam the server until the server gives up. There is no "mechanism" that the server uses to change the address, it's simply a protocol (HTTP 1.1) that the server abides by when a particular resource has been moved to a different location, thus the 3xx responses.
URL rewriting can transform URLs purely on the server-side. This allows web application developers the ability to make web resources accessible from multiple URLs.
For example, the user might request http://www.example.com/product/123 but thanks to rewriting is actually served a resource from http://www.example.com/product?id=123. Note that, there is no need for the address displayed in the browser to change.
The address can be changed if so desired. For this, a similar mapping as above happens on the server, but rather than render the resource back to the client, the server sends a redirect (301 or 302 HTTP code) back to the client for the rewritten URL.
For the example above this might look like:
Client request
GET /product/123 HTTP/1.1
Host: www.example.com
Server response
HTTP/1.1 302 Found
Location: http://www.example.com/product?id=123
At this point, the browser will issue a new GET request for the URL in the Location header.

Resources