Why shouldn't wget fetch the response content of a 500 response?

This question is a follow-up to HTTP 500 error in wget
Upon server errors, wget will not fetch the body of the response; curl should be used for that purpose instead.
Indeed curl, like any web browser, does load the response body even when the status code is 5xx.
So I was curious to understand why wget will not load the content with any of the many options it has: servers usually send an error explanation along, so why ignore it?
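The difference is easy to see with curl, which writes the error body to the output by default (a minimal check; the URL is just a placeholder for any page that answers with a 5xx):
curl -s -o error.html -w "%{http_code}\n" http://example.com/some-failing-page
# error.html now contains whatever body the server sent along with the 5xx;
# adding -f/--fail would suppress it and make curl behave more like wget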

According to http://savannah.gnu.org/bugs/?27303#comment4, wget already has the feature (downloading error pages) in the development version.
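That feature has since shipped in regular releases: assuming wget 1.14 or newer, the body of error responses can be kept with the --content-on-error flag.
# keep the response body even when the server answers 4xx/5xx
wget --content-on-error http://example.com/some-failing-page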

Related

wget won't download files I can access through browser

I am an amateur historian trying to access newspaper archives. The server where the scans are located "works" through an outdated TIF viewer that no longer seems to work at all. I can access the files individually in Chrome without logging in, but when I try to use wget or curl, I'm told that viewing the file is unauthorized, even when I use my login info, and even when using my cookies from Chrome.
Here is an example of one of the files: https://ulib.aub.edu.lb/nahar/images2/7810W2/78101001.TIF
When I put this into Chrome, it automatically downloads the file even though I cannot access the directory itself, but when I use wget, I get the following response: "401 unauthorized Username/Password Authentication Failed."
This is the basic wget command I'm using (if I can get it to work at all, then I'll input a list of the other files):
wget --no-check-certificate https://ulib.aub.edu.lb/nahar/images2/7810W2/78101001.TIF
I've tried variations with and without cookies, with a blank user, and with and without login credentials. As I'm sure you can tell, I'm new to this sort of thing, but I'm eager to learn.
From what I can see, authentication on your website is done with HTTP Basic. This kind of authentication does not use HTTP cookies; it uses the HTTP Authorization header. You can pass HTTP Basic credentials to wget with the following arguments.
wget --http-user=YourUsername --http-password=YourPassword https://ulib.aub.edu.lb/nahar/images2/7810W2/78101001.TIF
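For comparison, the equivalent request with curl (assuming the site really does use HTTP Basic auth; -u sets the same Authorization header):
curl -u YourUsername:YourPassword -O https://ulib.aub.edu.lb/nahar/images2/7810W2/78101001.TIF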

Getting error while testing website in google page speed

Hi, I am trying to test my website on Google PageSpeed but I'm getting an error.
"Attempting to load the page reached the limit of 3 client redirects. The last URL fetched was http://www.example.com/. This may indicate the page is redirecting to itself, or has a loop of redirects."
Could you please tell me exactly what the issue is?
It looks like your page is redirecting to itself. Execute the following command in your terminal.
curl -I http://www.example.com
Make sure the URL you are testing with returns HTTP/1.1 200 OK.
You can also use an online tool to find the final destination and use that URL for testing the page speed. Check this one: http://redirectdetective.com/
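You can also trace the whole redirect chain from the command line (a quick sketch for a Unix-like shell; -L makes curl follow redirects and -I sends HEAD requests):
curl -sIL http://www.example.com | grep -iE '^(HTTP/|location:)'
# prints the status line and Location header of every hop until the final response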
Hi, thanks for the reply. I have tested the URL at http://redirectdetective.com/ and it found no redirects. I have also executed the curl -I http://www.example.com command and got an HTTP/1.1 200 OK response.
But I am still getting the error on Google PageSpeed.

luaSocket HTTP requests always respond with a redirect (301 or 302)

I use LuaForWindows (latest version) and I have read this and this answer and everything I could find in the mailing list of lua-users.org. Whatever I try, (most) sites respond only with either 301 or 302. I have created an example batch script which downloads (some) of the OpenGL 2.1 Reference from their man pages.
@ECHO OFF
FOR /F "SKIP=5" %%# IN ( %~fs0 ) DO lua -l socket.http -e "print(socket.http.request('https://www.opengl.org/sdk/docs/man2/xhtml/%%#.xml'))"
GOTO:EOF
glAccum
glActiveTexture
glAlphaFunc
glAreTexturesResident
glArrayElement
glAttachShader
glBegin
glBeginQuery
glBindAttribLocation
glBindBuffer
The most important part is this:
print(require('socket.http').request('https://www.opengl.org/sdk/docs/man2/xhtml/glAccum.xml')) -- added glAccum so you can run it
This ALWAYS returns a 301. This also happens when downloading from other random pages. (I didn't note them down, so I can't give a list, but I noticed that some of them use Cloudflare.)
If I write an equivalent downloader in Java using URL and openConnection(), it doesn't get redirected.
I have already tried following the redirect manually (setting the referrer and so on) and using the 'generic' way, as most of the tips in other answers suggest.
You are using socket.http but are trying to access an https URL. LuaSocket does not handle the HTTPS protocol, so it sends the request to the default port 80 instead and gets a redirect to the HTTPS link (the same link); this repeats several times (as the URL never really changes), and in the end LuaSocket gives up and returns the redirect response.
The solution is to install LuaSec and use its ssl.https module to make the request.
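In the same one-liner style as the batch script above, the fixed request would look something like this (a sketch, assuming LuaSec is installed; ssl.https.request has the same simple-form API as socket.http.request):
lua -e "local https = require('ssl.https'); print(https.request('https://www.opengl.org/sdk/docs/man2/xhtml/glAccum.xml'))"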

Wordpress site returns status 500 but still works

http://ststephens.edu/
This site returns status code 500 (Internal Server Error) when I do
wget http://ststephens.edu/, but it works fine in my browser. Also, as seen in this screenshot, the homepage clearly returns a 500 status, yet the site seems functional.
What could make this happen?
Web browsers are more flexible than wget. Even if they receive an error code, they still display whatever page content was returned along with the error response. Web servers only fall back on their default error view if no content is provided.
wget and search engine crawlers are stricter. They bail out as soon as they see the error code response.
I think there is a problem with your server configuration. Check your web server's log file; it may tell you why the server gave this error.
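The underlying PHP/WordPress error is usually visible near the end of the server's error log (the paths below are common defaults and will vary by distribution and web server):
tail -n 50 /var/log/apache2/error.log   # Apache on Debian/Ubuntu
tail -n 50 /var/log/nginx/error.log     # nginx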

Gradle failing to download dependency when HEAD request fails

I have set up a dependency in my Gradle build script, which is hosted on Bitbucket.
Gradle fails to download it, with error message
Could not HEAD 'https://bitbucket.org/....zip'. Received status code 403 from server: Forbidden
I looked into it, and it seems that this is because:
Bitbucket redirects to an Amazon URL
the Amazon URL doesn't accept HEAD requests, only GET requests
I was able to confirm this by testing that URL with curl: sending a HEAD request with curl also got me a 403 Forbidden.
Alternatively, it could be that Amazon rejects the signature on the HEAD request, since it should be different from the one on the GET request, as explained here.
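The difference is easy to reproduce from the command line (a sketch for a Unix-like shell; the S3 URL is a placeholder for the link Bitbucket redirects to):
URL='https://example-bucket.s3.amazonaws.com/some.zip'
curl -s -o /dev/null -w 'HEAD: %{http_code}\n' -I "$URL"
curl -s -o /dev/null -w 'GET:  %{http_code}\n' "$URL"
# a link pre-signed for GET typically answers 403 to the HEAD and 200 to the GET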
Is there a way around this? Could I tell Gradle to skip the HEAD request and go straight to the GET request?
I worked around the problem by using the gradle-download-task plugin and by writing the caching manually, as explained here.
