While working on a web crawler, I came across this strange occurrence involving the following web page: http://abudhabitourism.ae/en/
When using wget to download this page, a status code 200 is returned and all is well.
However, when my crawler program requests this page (GET request), the server seems to return status code 302 with a strange-looking "moved-to" field in the location header:
http://sso.adta.ae/opensso/TacCDSSO?localServlet=http%3a%2f%2fabudhabitourism.ae%2f%2fcdsso.ashx&paramName=result&gotoURL=http%3a%2f%2fabudhabitourism.ae%2fen%2fdefault.aspx
Is this actually a URL or a script? Any ideas on how I can handle this case in my crawler program (i.e. automatically extract the correct moved-to URL from the Location header)?
Thanks,
Prof. Chiraz BenAbdelkader
I think wget follows the redirect from the 302. When I use curl to get the page, it returns the headers with 302 and the URL to follow up on.
curl -iI http://abudhabitourism.ae/en/
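To the "URL or script" part of the question: the Location value is an ordinary URL whose query string carries further percent-encoded URLs. Below is a minimal Python sketch of how a crawler could take it apart, assuming the destination really is carried in the gotoURL parameter (the helper names resolve_location and embedded_target are made up for illustration). In general the safer route is to request the Location URL itself, since a single sign-on endpoint may need to set cookies before the final page returns 200.

```python
from urllib.parse import parse_qs, urljoin, urlsplit


def resolve_location(request_url, location):
    """Resolve a Location header value against the URL that was requested.

    Location may be relative, so it is joined with the request URL.
    """
    return urljoin(request_url, location)


def embedded_target(location, param="gotoURL"):
    """Return the URL carried in a query parameter of the redirect, or None.

    The parameter name "gotoURL" is an assumption taken from the header
    quoted in the question; other sites will use other names.
    """
    values = parse_qs(urlsplit(location).query).get(param)
    return values[0] if values else None


location = (
    "http://sso.adta.ae/opensso/TacCDSSO"
    "?localServlet=http%3a%2f%2fabudhabitourism.ae%2f%2fcdsso.ashx"
    "&paramName=result"
    "&gotoURL=http%3a%2f%2fabudhabitourism.ae%2fen%2fdefault.aspx"
)

# The URL the crawler should request next if it follows the redirect as-is.
print(resolve_location("http://abudhabitourism.ae/en/", location))

# The decoded destination the single sign-on hop would eventually send us to.
print(embedded_target(location))  # -> http://abudhabitourism.ae/en/default.aspx
```

parse_qs already undoes the percent-encoding, so no extra unquoting is needed.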
I have a question.
Does setting a post/page status to "private" generate a true 404 page with a 404 header, or is it just a page saying that the content was not found?
I want Google to know that the page no longer exists and that it's a 404.
Setting it to private makes the site say it's not found, but how do I know it's a true 404 error?
Thanks! :)
What you are looking for is the HTTP response status of the URL.
Search for something like "get HTTP status of a URL" and you'll find many free tools that check a URL's HTTP status.
You can also check yourself with your browser console whether it's a real 404 or not.
In the network tab, you'll have the page request lists and HTTP status for each. From there, check the request having your page URL (usually the first one), and see the HTTP status of this request. If it's a 404: you're good.
You can also use Postman to get the HTTP response status of your URL, or curl on the command line, among many other ways.
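If you'd rather script the check, here is a minimal Python sketch using only the standard library. It sends one request without cookies and without following redirects, which is roughly how a logged-out visitor or a search engine would see the page. The function name http_status and the URL in the comment are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit


def http_status(url, method="HEAD", timeout=10.0):
    """Return the status code the server sends for url.

    Redirects are not followed, so a 301/302 is reported as such
    instead of the status of the page it points to.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request(method, path)
        return conn.getresponse().status
    finally:
        conn.close()


# Example call (hypothetical URL, needs network access):
# print(http_status("http://www.example.com/my-private-post/"))
```

Some servers answer HEAD differently from GET, so if the result looks odd, try method="GET" as well.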
Hi, I am trying to test my website with Google PageSpeed but I am getting an error:
"Attempting to load the page reached the limit of 3 client redirects. The last URL fetched was http://www.example.com/. This may indicate the page is redirecting to itself, or has a loop of redirects."
Could you please tell me exactly what the issue is?
It looks like your page is redirecting to itself. Execute the following command in your terminal.
curl -I http://www.example.com
Make sure the URL you are testing with returns HTTP/1.1 200 OK.
You can also use some online tools to find the final destination and use that one for testing the page speed. Check this one - http://redirectdetective.com/
Hi, thanks for the reply. I have tested the URL at http://redirectdetective.com/ and it found no redirects. I have also executed the curl -I http://www.example.com command and got an HTTP/1.1 200 OK response.
But I am still getting the error in Google PageSpeed.
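For checking a redirect chain by hand, here is a small Python sketch of the loop/limit logic that the error message describes. It is not how PageSpeed works internally. The fetch argument is a placeholder you would supply yourself: any function that takes a URL and returns (status, location). Note that this only sees HTTP-level redirects (3xx plus a Location header); redirects done by JavaScript or a meta refresh tag would not show up here, nor in curl -I.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace_redirects(url, fetch, max_hops=3):
    """Follow HTTP redirects starting at url.

    fetch(url) must return (status_code, location_or_None).
    Returns (chain, verdict): chain is the list of URLs visited and
    verdict is "ok", "loop" (a URL repeated) or "too_many"
    (more than max_hops redirects in a row).
    """
    chain = [url]
    seen = {url}
    for _ in range(max_hops + 1):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return chain, "ok"
        url = urljoin(url, location)  # Location may be relative
        chain.append(url)
        if url in seen:
            return chain, "loop"
        seen.add(url)
    return chain, "too_many"


# Offline demo with a canned response table instead of real requests.
canned = {
    "http://www.example.com/": (302, "http://www.example.com/"),
}
print(trace_redirects("http://www.example.com/", lambda u: canned[u]))
```

In the demo the page redirects to itself, so the verdict is "loop" after a single hop.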
I use LuaForWindows (latest version) and I have read this and this answer and everything I could find in the mailing list of lua-users.org. Whatever I try, most sites only respond with either 301 or 302. I have created an example batch script which downloads some of the OpenGL 2.1 Reference from their man pages.
@ECHO OFF
FOR /F "SKIP=5" %%# IN ( %~fs0 ) DO lua -l socket.http -e "print(socket.http.request('https://www.opengl.org/sdk/docs/man2/xhtml/%%#.xml'))"
GOTO:EOF
glAccum
glActiveTexture
glAlphaFunc
glAreTexturesResident
glArrayElement
glAttachShader
glBegin
glBeginQuery
glBindAttribLocation
glBindBuffer
the most important part is this:
print(require('socket.http').request('https://www.opengl.org/sdk/docs/man2/xhtml/glAccum.xml')) -- added glAccum so you can run it
This ALWAYS returns a 301. This also happens to me when downloading from other random pages. (I didn't note them, so I can't give a list, but I happened to find out that some of them use Cloudflare.)
If I write an equivalent downloader in Java using URL and openConnection(), it won't redirect.
I already tried following the redirect manually (setting the referrer and so on) and using the 'generic' way, as most of the tips in other answers suggest.
You are using socket.http, but you are trying to access an https URL. luasocket doesn't handle the HTTPS protocol, so it sends the request to the default port 80 instead and gets a redirect to the HTTPS link (the same link). This repeats several times (as the URL doesn't really change), and in the end luasocket gives up and produces the message you see.
The solution is to install luasec and use its ssl.https module to make the request.
Is there such a thing?
A way it might be used:
Many locations have forms that post to http://www.example.com/wally/app/receiver.aspx
Management decides they want a cleaner URL and there is no reason to pretend you are using aspx (you didn't really think I was using aspx for that, did you?)
They say it should be http://example.com/receiver
Easy enough! Just put in a 301 redirect. No need to update all those forms that exist all over... but wait. You can't do that for POST.
Perhaps you can receive and handle the request and then re-write the URL without causing a subsequent request? Perhaps this will not strip the www (cross domain), but can it shorten the pathname like that without a separate request?
Even for GET requests, this would indeed be a performance boost if one could re-write the URL and send the response body at the same time. Can this be done?
You cannot send content to the user and do a 301/302 (etc.) redirect at the same time -- the browser interprets the HTTP response code and acts according to the code received. If it is 301/302, it will follow the redirect; if it is 200, it will display the content to the customer.
Is there such thing as a HTTP URL re-write without 301 or 302 redirect?
Yes -- it's called a rewrite (internal redirect). For example, a customer requests http://example.com/receiver and you rewrite the URL to point to /wally/app/receiver.aspx (e.g. RewriteRule ^receiver$ /wally/app/receiver.aspx [L] -- that's if you have Apache, which you most likely don't, considering receiver.aspx). This will do internal redirect when URL remains unchanged in browser address bar (works fine with POST and GET methods).
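To illustrate the idea outside of Apache, here is a rough Python WSGI sketch of an internal rewrite. It is only an analogue of the RewriteRule above, not how Apache or IIS implement it, and the names InternalRewrite, legacy_app and REWRITES are made up for the example. The request path is swapped on the server before the real handler sees it, so the client gets a 200 for the URL it asked for and no second request is made.

```python
# Hypothetical mapping from the clean public path to the legacy handler path.
REWRITES = {"/receiver": "/wally/app/receiver.aspx"}


class InternalRewrite:
    """WSGI middleware that rewrites PATH_INFO without sending a redirect."""

    def __init__(self, app, rewrites):
        self.app = app
        self.rewrites = rewrites

    def __call__(self, environ, start_response):
        target = self.rewrites.get(environ.get("PATH_INFO", ""))
        if target is not None:
            # Method, headers and body are untouched, so POST data survives.
            environ["PATH_INFO"] = target
        return self.app(environ, start_response)


def legacy_app(environ, start_response):
    """Stand-in for the old handler; reports which path it was given."""
    text = "handled %s %s" % (environ["REQUEST_METHOD"], environ["PATH_INFO"])
    body = text.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


app = InternalRewrite(legacy_app, REWRITES)

# Offline demo: call the WSGI app directly with a minimal environ.
statuses = []
result = app({"REQUEST_METHOD": "POST", "PATH_INFO": "/receiver"},
             lambda status, headers: statuses.append(status))
print(statuses[0], b"".join(result).decode("utf-8"))
```

The address bar keeps showing /receiver because the browser never learns that another path handled the request.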
Well, I guess the URL rewriting suggested by LazyOne is not the answer to the question, as he himself states that
This will do internal redirect when URL remains unchanged in browser address bar
(http://www.example.com/wally/app/receiver.aspx). Still, the question asks for
(...) it should be http://example.com/receiver
I think the solution is to redirect the old URL to the new one using the 307 status code introduced in RFC 2616. User agents that implement version 1.1 of the HTTP protocol (I guess all popular browsers for some time now) should make the new request using the same HTTP method (POST in this case) as in the original request.
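To make the difference between the redirect codes concrete, here is a small Python sketch of which method a client typically uses for the follow-up request. It reflects common browser behaviour as I understand it, not a guarantee for every user agent, and the function name method_after_redirect is made up for illustration.

```python
def method_after_redirect(status, method):
    """Return the HTTP method a typical client uses after a redirect.

    307 and 308 keep the original method (and body).
    303 turns everything except HEAD into GET.
    301 and 302 are, in practice, followed with GET when the original
    request was a POST; other methods are kept.
    """
    method = method.upper()
    if status in (307, 308):
        return method
    if status == 303:
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        return "GET" if method == "POST" else method
    raise ValueError("not a redirect status handled here: %r" % status)


for code in (301, 302, 303, 307):
    print(code, "POST ->", method_after_redirect(code, "POST"))
```

Only the 307 line keeps POST, which is why it fits the form scenario in the question where a 301 does not.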
Well this one freaks me out.
I used a Http Header check tool to check the headers of my webpage and guess what.
In every request the response was 302 instead of 200.
domain.con
www.domain.con
http://www.domain.con
So, am I missing something here?
I have not set up any redirect in any way.
So where the f#$% is my website redirecting to? Is there a security hole?
UPDATE: While googling found this one
domain.com is not the same as www.domain.com - that's a redirect.
You are getting this because .net/IIS redirects your www.domain.com or domain.com to www.domain.com/default.aspx, so you get a header with 302 and then one for 200. I think this is by design but very confusing.
Maybe a case of this:
302 Found
This is the most popular redirect code, but also an example of industrial practice contradicting the standard. HTTP/1.0 specification (RFC 1945) required the client to perform a temporary redirect (the original describing phrase was "Moved Temporarily"), but popular browsers implemented 302 with the functionality of a 303 See Other. Therefore, HTTP/1.1 added status codes 303 and 307 to distinguish between the two behaviours. However, the majority of Web applications and frameworks still use the 302 status code as if it were the 303.
303 See Other (since HTTP/1.1)
The response to the request can be found under another URI using a GET method. When received in response to a PUT, it should be assumed that the server has received the data and the redirect should be issued with a separate GET message.
http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
It's possible that you forgot to add a final slash to the end of your URL. Most webservers will redirect you to the "canonical" location that includes the slash. If you include the slash, you may get the response you're looking for.
Are you using forms authentication, with a login page that is different from the default page (say auth.aspx)? If so, you will always get a 302 code and the request will be redirected to the login page.
In ASP.NET we can redirect by using Response.Redirect or Server.Transfer.
With Server.Transfer, no 302 is ever sent: the transfer happens on the server and the client gets the 200 directly.
With Response.Redirect, the client first receives a 302 and then requests the new URL, which returns 200. That is nothing but an extra round trip.