Is there such thing as a HTTP URL re-write without 301 or 302 redirect? - http

Is there such a thing?
A way it might be used:
Many locations have forms that post to http://www.example.com/wally/app/receiver.aspx
Managements decides they want a cleaner URL and there is no reason to pretend you are using aspx (you didn't really think I was using aspx for that did you?)
They say it should be http://example.com/receiver
Easy enough! Just put a 301 redirect. No need to update all those forms that exist all over..,, but wait.. You can't do that for POST.
Perhaps you can receive and handle the request and then re-write the URL without causing a subsequent request? Perhaps this will not strip the www (cross domain), but can it shorten the pathname like that without a separate request?
Even in GET requests, this would indeed be a performance boost if one could re-write the URL and send the response body at the same. Can this be done?

You cannot send content to user and do 301/302 etc redirect at the same time -- browser interprets the HTTP Response code and acts accordingly to the code received. If 301/302 -- it will do redirect, if 200 -- will display it to the customer.
Is there such thing as a HTTP URL re-write without 301 or 302 redirect?
Yes -- it's called rewrite (internal redirect). For example -- customer requests http://example.com/receiver. You rewrite URL to point to /wally/app/receiver.aspx (e.g. RewriteRule ^receiver$ /wally/app/receiver.aspx [L] -- that's if you have an Apache, which you most likely not (considering receiver.aspx)). This will do internal redirect when URL remains unchanged in browser address bar (works fine with POST and GET methods).

Well, I guess rewriting url suggested by LazyOne is not the answer to the question as he himself states that
This will do internal redirect when URL remains unchanged in browser
address bar
(http://www.example.com/wally/app/receiver.aspx). Still, the question asks for
(...) it should be http://example.com/receiver
I think the solution is to redirect old url to the new one using 307 status code introduced in RFC 2616. User agents which handle version 1.1 of HTTP protocol (I guess all popular browsers for some time now) should make the new request using the same http method (POST in this case) as in the original request.

Related

HTTP code 301: why and how to allows changing the request method from POST to GET [duplicate]

I have implemented SEO URLs using Apache 301 redirects to a 'redirect.cfm' in the root of the website which handles all URL building and content delivering.
Post data is lost during a 301 redirect.
Unable to find a solution so far, have tried excluding post method from rewrites - worst case scenario we could use the old type URLs for post methods.
Is there something that can be done?
Thanks
Using a 307 should be exactly what you want
307 Temporary Redirect (since HTTP/1.1)
In this case, the request should be repeated with another URI; however, future requests
should still use the original URI.[2] In contrast to how 302 was historically implemented,
the request method is not allowed to be changed when reissuing the original request. For
instance, a POST request should be repeated using another POST request
- Wikipedia
Update circa 2021
The original answer here was written before 307 status code redirect worked consistently across browsers. As per Hashbrown's answer below, the 307 status code should be used.
Old Answer
POST data is discarded on redirect as a client will perform a GET request to the URL specified by the 301. Period.
The only option is to convert the POST parameters to GET parameters and tack them onto the end of the URL you're redirecting to. This cannot be done in a .htaccess file rewrite.
One option is to catch POST requests to the url to be redirected and pass it off to a page to handle the redirect. You'd need to do the transposition of the parameters in code then issue the redirect header with the parameter appended new url that way.
Update: As pointed out in the comments to this answer, if you do redirect to another URL specifying POST parameters and that URL is also accessed without paramters (or the params are variable), you should specify a link to the canonical URL for the page.
Say the POST form redirects transposed to the following GET resource:
http://www.example.com/finalpage.php?form_data_1=123&form_data_2=666
You would add this link record to the head section of the page:
<link rel="canonical" href="http://www.example.com/finalpage.php" />
This would ensure all SEO value would be given to http://www.example.com/finalpage.php and avoid possible issues with duplicate content.
Using 301 redirects for general URL rewriting is not the way to go.
This is a performance issue (especially for mobile, but also in general), since it doubles the number of requests for your page.
Think about using a URL rewriting tool like Tuckey's URLrewriteFilter or apache mod_rewrite.
What Ray said is all true, this is just an additional comment on your general approach.

Web. Some nuances of difference between forward and redirect

I'm starting to learn web-programming. I've read about the difference between forward and redirect. But two questions not fully understood still:
In which case does the process access to a server-side and in which case without server-side?
When does URL change and when doesn't change? Does URL changes always when redirecting? Does URL changes never when forwarding?
I would be very grateful for the clear answers and explanations! Thanks in advance!
They are not hard and fast terms.
A redirect usually means an HTTP redirect, which is an HTTP response that instructs the client to make a new HTTP request to a different URI.
An internal redirect is a common description of a redirect that is handled internally by the webserver / web application / etc and doesn't send the browser to a different URI.
Forward is not a particularly common term, but when I've encountered it it usually means a form of internal redirect.
Forward happens on serverside, server forwards the same request to another resource. whereas redirect happens on the browser side, server sends http status code 302 to browser so browser makes new request.
Redirect requires one more round trip from browser to server.
One more difference is redirect reflects in browser address bar forward doesnt.

How does url rewrite works?

How does web server implements url rewrite mechanism and changes the address bar of browsers?
I'm not asking specific information to configure apache, nginx, lighthttpd or other!
I would like to know what kind of information is sent to clients when servers want rewrite url?
There are two types of behaviour.
One is rewrite, the other is redirect.
Rewrite
The server performs the substitution for itself, making a URL like http://example.org/my/beatuful/page be understood as http://example.org/index.php?page=my-beautiful-page
With rewrite, the client does not see anything and redirection is internal only. No URL changes in the browser, just the server understands it differently.
Redirect
The server detects that the address is not wanted by the server. http://example.org/page1 has moved to http://example.org/page2, so it tells the browser with an HTTP 3xx code what the new page is. The client then asks for this page instead. Therefore the address in the browser changes!
Process
The process remains the same and is well described by this diagram:
Remark Every rewrite/redirect triggers a new call to the rewrite rules (with exceptions IIRC)
RewriteCond %{REDIRECT_URL} !^$
RewriteRule .* - [L]
can become useful to stop loops. (Since it makes no rewrite when it has happened once already).
Are you talking about server-side rewrites (like Apache mod-rewrite)? For those, the address bar does not generally change (unless a redirection is performed).
Or are you talking about redirections? These are done by having the server respond with an HTTP code (301, 302 or 307) and the location in the HTTP header.
There are two forms of "URL rewrite": those done purely within the server and those that are redirections.
If it's purely within the server, it's an internal matter and only matters with respect to the dispatch mechanism implemented in the server. In Apache HTTPD, mod_rewrite can do this, for example.
If it's a redirection, a status code implying a redirection is sent in the response, along with a Location header indicating to which URL the browser should be redirected (this should be an absolute URL). mod_rewrite can also do this, with the [R] flag.
The status code is usually 302 (found), but it could be configured for other codes (e.g. 301 or 307).
Another quite common use (often unnoticed because it's usually on by default in Apache HTTPD) is the redirection to the the URL with a trailing slash on a directory. This is implemented by mod_dir:
A "trailing slash" redirect is issued
when the server receives a request for
a URL http://servername/foo/dirname
where dirname is a directory.
Directories require a trailing slash,
so mod_dir issues a redirect to
http://servername/foo/dirname/.
Jeff Atwood had a great post about this: http://www.codinghorror.com/blog/2007/02/url-rewriting-to-prevent-duplicate-urls.html
How web server implements url rewrite mechanism and changes the address bar of browsers?
URL rewriting and forwarding are two completely different things. A server has no control over your browser so it can't change the URL of your browser, but it can ask your browser to go to a different URL. When your browser gets a response from a server it's entirely up to your browser to determine what to do with that response: it can follow the redirect, ignore it or be really mean and spam the server until the server gives up. There is no "mechanism" that the server uses to change the address, it's simply a protocol (HTTP 1.1) that the server abides by when a particular resource has been moved to a different location, thus the 3xx responses.
URL rewriting can transform URLs purely on the server-side. This allows web application developers the ability to make web resources accessible from multiple URLs.
For example, the user might request http://www.example.com/product/123 but thanks to rewriting is actually served a resource from http://www.example.com/product?id=123. Note that, there is no need for the address displayed in the browser to change.
The address can be changed if so desired. For this, a similar mapping as above happens on the server, but rather than render the resource back to the client, the server sends a redirect (301 or 302 HTTP code) back to the client for the rewritten URL.
For the example above this might look like:
Client request
GET /product/123 HTTP/1.1
Host: www.example.com
Server response
HTTP/1.1 302 Found
Location: http://www.example.com/product?id=123
At this point, the browser will issue a new GET request for the URL in the Location header.

HTTP redirect: 301 (permanent) vs. 302 (temporary)

Is the client supposed to behave differently? How?
Status 301 means that the resource (page) is moved permanently to a new location. The client/browser should not attempt to request the original location but use the new location from now on.
Status 302 means that the resource is temporarily located somewhere else, and the client/browser should continue requesting the original url.
When a search engine spider finds 301 status code in the response header of a webpage, it understands that this webpage no longer exists, it searches for location header in response pick the new URL and replace the indexed URL with the new one and also transfer pagerank.
So search engine refreshes all indexed URL that no longer exist (301 found) with the new URL, this will retain your old webpage traffic, pagerank and divert it to the new one (you will not lose you traffic of old webpage).
Browser: if a browser finds 301 status code then it caches the mapping of the old URL with the new URL, the client/browser will not attempt to request the original location but use the new location from now on unless the cache is cleared.
When a search engine spider finds 302 status for a webpage, it will only redirect temporarily to the new location and crawl both of the pages. The old webpage URL still exists in the search engine database and it always attempts to request the old location and crawl it. The client/browser will still attempt to request the original location.
Read more about how to implement it in asp.net c# and what is the impact on search engines -
http://www.dotnetbull.com/2013/08/301-permanent-vs-302-temporary-status-code-aspnet-csharp-Implementation.html
Mostly 301 vs 302 is important for indexing in search engines, as their crawlers take this into account and transfer PageRank when using 301.
See Peter Lee's answer for more details.
301 redirects are cached indefinitely (at least by some browsers).
This means, if you set up a 301, visit that page, you not only get redirected, but that redirection gets cached.
When you visit that page again, your Browser* doesn't even bother to request that URL, it just goes to the cached redirection target.
The only way to undo a 301 for a visitor with that redirection in Cache, is re-redirecting back to the original URL**. In that case, the Browser will notice the loop, and finally really request the entered URL.
Obviously, that's not an option if you decided to 301 to facebook or any other resource you're not fully under control.
Unfortunately, many Hosting Providers offer a feature in their Admin Interface simply called "Redirection", which does a 301 redirect. If you're using this to temporarily redirect your domain to facebook as a coming soon page, you're basically screwed.
*at least Chrome and Firefox, according to How long do browsers cache HTTP 301s?. Just tried it with Chrome 45.
Edit: Safari 7.0.6 on Mac also caches, a browser restart didn't help (Link says that on Safari 5 on Windows it does help.)
**I tried javascript window.location = '', because it would be the solution which could be applied in most cases - it doesn't work. It results in an undetected infinite Loop. However, php header('Location: new.url') does break the loop
Bottom Line: only use 301s if you're absolutely sure you're never going to use that URL again. Usually never on the root dir (example.com/)
301 is that the requested resource has been assigned a new permanent URI and any future references to this resource should be done using one of the returned URIs.
302 is that the requested resource resides temporarily under a different URI.
Since the redirection may be altered on occasion, the client should continue to use the Request-URI for future requests.
This response is only cachable if indicated by a Cache-Control or Expires header field.
The main issue with 301 is browser will cache the redirection even if you disabled the redirection from the server level.
It's always better to use 302 if you are enabling the redirection for a short maintenance window.
There have already been plenty of good answers, but none tells pitfalls or when to use one over the other from a plain browsers perspective.
Use 302 over a 301 HTTP Status whenever you need to keep dynamic server side control about the final URL. Using a 301 http status will make your browser always load the final URL from its own cache, without fetching anything of any previous URL (totally skipping the first time request). That may have unpredictable results in case you need to keep server side control about the redirected URL.
As an example, in case you need to do URL redirection on behalf of a users ip-geo-position (geo-ip-switching) use 302. If you would use a 301 in such a scenario, the final redirected page will always come directly from the browsers cache, giving incorrect/false content to the user.
301 is a permanent redirect, and 302 is a temporary redirect.
The browser is allowed to cache the 301 but 302 means it has to hit our system every time. assuming that we want to minimize the load on our system, 301 is the right decision. Imagine creating URL shortening service for a big company, we try to get as less hit to our servers by the clients
But if the user wants to edit their short URLs, it might take more time than usual for the browser to pick up the change because the browser has the old one cached. Also, if you want to offer users metrics on how often their URL is getting hit, 301 would mean we would not necessarily see every hit from the client. So if you want analytics as a feature later on and a smooth user experience for editing URLs, 302 is a better choice.

Which Http redirects status code to use?

friendfeed.com uses 302.
bit.ly uses 301.
I had decided to use 303.
Do they behave differently in terms of support by browsers ?
It depends on your purpose.
301 says “this isn't the proper URL, look elsewhere and use remember that other URL is better; don't come back here!”.
302 says “this is the proper URL which you should carry on using, but to actually get the content look elsewhere”.
303 is like 302 but specifically for redirections after a form submission.
If your purpose is a URL shortener then 303 isn't really relevant. It'll still work, but offers nothing over the normal 302. For a URL shortener I'd say 301 would be most suitable, as the other URL is the ‘real’ one. Saying 302 is trying to keep the ownership of the address and any SEO momentum caused by its use for yourself: a bit rude, but maybe you want to be rude.
Different status codes have different meanings. The HTTP specification describes them: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
301 — moved permanently (and change an
302 — found here
303 — find your response here, but use GET even if you started out with POST
If we take, for example, an Atom feed that has the URL changed for some reason (perhaps it is being moved to Amazon S3 or something). Given a 301 result, the feed reader should note that the feed has moved and update it's subscription. Given a 302, it will get the feed from its new location, but hit the original server looking for the original URI every time it checks for an update. (And a 303 would be silly in this situation).
Read http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for the answer.
10.3.2 301 Moved Permanently
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise.
10.3.3 302 Found
The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.
Have a look at the HTTP 1.1 Status Code definitions. Different status codes imply different meanings and, therefore, encourage different behavior. Try to use the code which best matches your use case.
301 is for a permanent redirect and if this is what you want to do then this is recommended by all SEO experts.
"A 301 redirect is a permanent redirect that passes full link equity (ranking power) to the redirected page. 301 refers to the HTTP status code for this type of redirect. In most instances, the 301 redirect is the best method for implementing redirects on a website."
https://moz.com/learn/seo/redirection#:~:text=A%20301%20redirect%20is%20a,implementing%20redirects%20on%20a%20website.

Resources