parse http response header from wget

I'm trying to extract a line from wget's output but I'm having trouble with it.
This is my wget call:
$ wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html
Output:
--18:24:12-- http://xxx.xxxx.xxxx:15000/myhtml.html
=> `-'
Resolving xxx.xxxx.xxxx... xxx.xxxx.xxxx
Connecting to xxx.xxxx.xxxx|xxx.xxxx.xxxx|:15000... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Found
Date: Tue, 18 Nov 2008 23:24:12 GMT
Server: IBM_HTTP_Server
Expires: Thu, 01 Dec 1994 16:00:00 GMT
Location: https://xxx.xxxx.xxxx/siteminderagent/...
Content-Length: 508
Keep-Alive: timeout=10, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
Location: https://xxx.xxxx.xxxx//siteminderagent/...
--18:24:13-- https://xxx.xxxx.xxxx/siteminderagent/...
=> `-'
Resolving xxx.xxxx.xxxx... failed: Name or service not known.
If I do this:
$ wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html | egrep -i "302"
it doesn't return the line containing the string. I just want to check whether the site or SiteMinder is up.

The output of wget you are looking for is written on stderr. You must redirect it:
$ wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 | egrep -i "302"

wget prints the headers to stderr, not to stdout. You can redirect stderr to stdout as follows:
wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 | egrep -i "302"
The "2>&1" part says to redirect ('>') file descriptor 2 (stderr) to file descriptor 1 (stdout).

A slightly enhanced version of the solution already provided:
wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 >/dev/null | grep -c 302
2>&1 >/dev/null trims off the unneeded output: grep then parses only wget's stderr, which eliminates the possibility of catching strings containing "302" in stdout (where the HTML file itself is written, along with the download progress bar, resulting byte counts, etc.).
grep -c counts the number of matching lines instead of printing them; knowing how many lines matched is enough.
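For example, a minimal check built on that count (a sketch reusing the OP's URL):
count=$(wget -SO- -T 1 -t 1 http://myurl.com:15000/myhtml.html 2>&1 >/dev/null | grep -c 302)
if [ "$count" -gt 0 ]; then
echo "siteminder redirect is up"
else
echo "no 302 seen: site may be down"
fi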

wget --server-response http://www.amazon.de/xyz 2>&1 | awk '/^ HTTP/{print $2}'

Just to explicate a bit. The -S switch in the original question is shorthand for --server-response.
Also, I know the OP specified wget, but curl is similar and defaults to stdout.
curl --head --silent $yourURL
or
curl -I -s $yourURL
The --silent switch is only needed for grep-ability: -s turns off the progress meter.
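Applied to the OP's check, a sketch (using -m 1 as the curl analog of wget's -T 1 timeout):
curl -Is -m 1 http://myurl.com:15000/myhtml.html | grep -i "302"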

I found this question while trying to scrape response codes for large lists of URLs, after finding curl very slow (5+ seconds per request).
Previously, I was using this:
curl -o /dev/null -I --silent --head --write-out %{http_code} https://example.com
Building off Piotr and Adam's answers, I came up with this:
wget -Sq -T 1 -t 1 --no-check-certificate --spider https://example.com 2>&1 | egrep 'HTTP/1.1 ' | cut -d ' ' -f 4
This has a few bugs (e.g. redirects return "302 200"), but overall it's greased lightning in comparison.
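If the duplicated codes from redirects bother you, one fix is to let awk keep only the last status line (a sketch under the same assumptions):
wget -Sq -T 1 -t 1 --no-check-certificate --spider https://example.com 2>&1 | awk '/HTTP\//{code=$2} END{print code}'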

Related

curl to get a 200 instead of 308

I have a shell script that curls a URL to test whether it's OK.
In a web browser I have no problem; I get the content I'm expecting.
But after running this shell script:
#!/bin/bash
URL="www.mypersonalURL.com/"
STATUS=$(curl -s -o /dev/null -w "%{http_code}\n" $URL)
if [ $STATUS == 200 ] ; then
echo "$URL is up, returned $STATUS"
else
echo "$URL is not up, returned $STATUS"
fi
I always get a 308 code. What can I do to make the redirection go forward and land on a 200?
You are not following the redirect in your curl statement. Add -L to your curl.
URL="www.mypersonalURL.com/"
STATUS=$(curl -L -s -o /dev/null -w "%{http_code}\n" $URL)
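Putting it together, the script becomes (with the variables quoted and a numeric comparison for good measure):
#!/bin/bash
URL="www.mypersonalURL.com/"
STATUS=$(curl -L -s -o /dev/null -w "%{http_code}\n" "$URL")
if [ "$STATUS" -eq 200 ] ; then
echo "$URL is up, returned $STATUS"
else
echo "$URL is not up, returned $STATUS"
fi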

Can I have curl print just the response code?

I read https://superuser.com/questions/272265/getting-curl-to-output-http-status-code. It mentioned that
curl -i
will print the HTTP response code. Is it possible to have curl print just the HTTP response code? Is there a generic way to get the HTTP status code for any type of request like GET/POST/etc?
I am using curl 7.54.0 on Mac OS High Sierra.
Thanks for reading.
This worked for me:
$ curl -s -w "%{http_code}\n" http://google.com/ -o /dev/null
curl -s -I http://example.org | grep HTTP/ | awk '{print $2}'
output: 200
Another solution:
curl -sI http://example.org | head -n 1 | cut -d ' ' -f 2
In this way you are:
getting the first line of the HTTP response (head -n 1), which must contain the response HTTP version, the response code and the response message (in this order), each separated by a whitespace (as defined in the HTTP standard);
getting the 2nd field of this line (cut -d ' ' -f 2), which is the status code.
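As for the generic part of the question: %{http_code} works the same for any method, so the pattern extends directly to POST and friends (a sketch; the URL and payload are placeholders):
curl -s -o /dev/null -w "%{http_code}\n" -X POST -d 'some payload' http://example.org/api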

Is it possible to send a source file as a URL with HylaFAX?

I need to send a fax where the source file comes from an HTTP URL. I have configured HylaFAX. With a local file it works fine, but with a URL it gives an error.
The command I am using is something like this:
sendfax -v -h faxhost -f kaur@xyz.com -D -d 1234567890 \
'http://kaur.dev.xyz.com:7771/app-name/proxy?bName=Test&oName=1.txt'
The error:
Error : 'Can not open file'
The file downloads fine when accessed through a browser.
sendfax will process stdin, so you can pipe documents in:
wget -O - 'http://kaur.dev.xyz.com:7771/app-name/proxy?bName=Test&oName=1.txt' | sendfax -v -h faxhost -f kaur@xyz.com -D -d 1234567890
or
curl 'http://kaur.dev.xyz.com:7771/app-name/proxy?bName=Test&oName=1.txt' | sendfax -v -h faxhost -f kaur@xyz.com -D -d 1234567890
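One caveat: when its output is piped, curl writes a progress meter to stderr, so you may want -s here as well:
curl -s 'http://kaur.dev.xyz.com:7771/app-name/proxy?bName=Test&oName=1.txt' | sendfax -v -h faxhost -f kaur@xyz.com -D -d 1234567890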

Grep Curl command's console output to a file

I need help grepping three variables from the curl command's console output.
Executing the curl command below prints its output in the console. I need to grep some variables (say status, name, url) from it and redirect them to a file.
curl -v -X POST -D tmp.txt -H "Content-Type:text/plain" --data "$SECRET" -H "Accept:application/xml" -H "Connection:close" http://google.com/api/search
Some of your responses must be going to stderr, try:
curl ... 2>&1 | grep 'pattern' > filename
I'd recommend using slightly different options to curl if you want to process the output. If I were looking for the Expires header:
curl -si -X POST -H "Content-Type:text/plain" --data "$SECRET" -H "Accept:application/xml" -H "Connection:close" http://google.com/api/search
If you want the HTTP status, you can just do this:
curl -si -X POST -H "Content-Type:text/plain" --data "$SECRET" -H "Accept:application/xml" -H "Connection:close" http://google.com/api/search | head -1
That'll print HTTP/1.1 301 Moved Permanently. Add | awk '{print $2}' to the end of that and you'll get only the numeric status.

How to evaluate http response codes from bash/shell script?

I have the feeling that I'm missing the obvious, but have not succeeded with man [curl|wget] or Google ("http" makes for a bad search term). I'm looking for a quick-and-dirty fix for one of our webservers that frequently fails, returning status code 500 with an error message. Once this happens, it needs to be restarted.
As the root cause seems to be hard to find, we're aiming for a quick fix, hoping that it will be enough to bridge the time until we can really fix it (the service doesn't need high availability)
The proposed solution is to create a cron job that runs every 5 minutes, checking http://localhost:8080/. If this returns with status code 500, the webserver will be restarted. The server will restart in under a minute, so there's no need to check for restarts already running.
The server in question is an Ubuntu 8.04 minimal installation with just enough packages installed to run what it currently needs. There is no hard requirement to do the task in bash, but I'd like it to run in such a minimal environment without installing any more interpreters.
(I'm sufficiently familiar with scripting that the command/options to assign the http status code to an environment variable would be enough - this is what I've looked for and could not find.)
I haven't tested this on a 500 code, but it works on others like 200, 302 and 404.
response=$(curl --write-out '%{http_code}' --silent --output /dev/null servername)
Note that the format provided for --write-out should be quoted.
As suggested by @ibai, add --head to make a HEAD-only request. This will save time when the retrieval succeeds, since the page contents won't be transmitted.
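That is (servername being a placeholder as above):
response=$(curl --head --write-out '%{http_code}' --silent --output /dev/null servername)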
I needed to demo something quickly today and came up with this. Thought I would place it here in case someone needs something similar to the OP's request.
#!/bin/bash
status_code=$(curl --write-out '%{http_code}' --silent --output /dev/null www.bbc.co.uk/news)
if [[ "$status_code" -ne 200 ]] ; then
echo "Site status changed to $status_code" | mail -s "SITE STATUS CHECKER" "my_email@email.com" -r "STATUS_CHECKER"
else
exit 0
fi
This will send an email alert on every state change from 200, so it's dumb and potentially noisy. To improve it, I would look at looping through several status codes and performing different actions depending on the result.
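A sketch of that improvement, branching on the status class (same placeholder addresses as above):
status_code=$(curl --write-out '%{http_code}' --silent --output /dev/null www.bbc.co.uk/news)
case "$status_code" in
200) exit 0 ;;
5??) echo "Server error: $status_code" | mail -s "SITE STATUS CHECKER" "my_email@email.com" -r "STATUS_CHECKER" ;;
*) echo "Site status changed to $status_code" | mail -s "SITE STATUS CHECKER" "my_email@email.com" -r "STATUS_CHECKER" ;;
esac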
curl --write-out "%{http_code}\n" --silent --output /dev/null "$URL"
works. Without the \n in the format string, you would have to hit return to see the code itself.
Although the accepted response is a good answer, it overlooks failure scenarios: curl returns 000 if there is an error in the request or a connection failure.
url='http://localhost:8080/'
status=$(curl --head --location --connect-timeout 5 --write-out %{http_code} --silent --output /dev/null ${url})
[[ $status == 500 ]] || [[ $status == 000 ]] && echo restarting ${url} # do start/restart logic
Note: this goes a little beyond the requested 500 status check to also confirm that curl can even connect to the server (i.e. returns 000).
Create a function from it:
failureCode() {
local url=${1:-http://localhost:8080}
local code=${2:-500}
local status=$(curl --head --location --connect-timeout 5 --write-out %{http_code} --silent --output /dev/null ${url})
[[ $status == ${code} ]] || [[ $status == 000 ]]
}
Test getting a 500:
failureCode http://httpbin.org/status/500 && echo need to restart
Test getting error/connection failure (i.e. 000):
failureCode http://localhost:77777 && echo need to start
Test not getting a 500:
failureCode http://httpbin.org/status/400 || echo not a failure
Here is my implementation, which is a bit more verbose than some of the previous answers:
curl https://somewhere.com/somepath \
--silent \
--insecure \
--request POST \
--header "your-curl-may-want-a-header" \
--data @my.input.file \
--output site.output \
--write-out %{http_code} \
> http.response.code 2> error.messages
errorLevel=$?
httpResponse=$(cat http.response.code)
jq --raw-output 'keys | @csv' site.output | sed 's/"//g' > return.keys
hasErrors=$(grep --quiet --invert-match errors return.keys; echo $?)
if [[ $errorLevel -gt 0 ]] || [[ $hasErrors -gt 0 ]] || [[ "$httpResponse" != "200" ]]; then
echo -e "Error POSTing https://somewhere.com/somepath with input my.input (errorLevel $errorLevel, http response code $httpResponse)" >> error.messages
send_exit_message # external function to send error.messages to whoever.
fi
With netcat and awk you can handle the server response manually (note the blank line that terminates the HTTP request headers):
if netcat 127.0.0.1 8080 <<EOF | awk 'NR==1{if ($2 == "500") exit 0; exit 1;}'; then
GET / HTTP/1.1
Host: www.example.com

EOF
apache2ctl restart;
fi
To follow 3XX redirects and print response codes for all requests:
HTTP_STATUS="$(curl -IL --silent example.com | grep HTTP )";
echo "${HTTP_STATUS}";
I didn't like the answers here that mix the data with the status. I found this: add the -f flag to make curl fail on HTTP errors, and pick up the error status from the standard exit-status variable $?:
https://unix.stackexchange.com/questions/204762/return-code-for-curl-used-in-a-command-substitution
I don't know if it's perfect for every scenario here, but it seems to fit my needs and I think it's much easier to work with.
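A minimal sketch of that approach, assuming the question's localhost URL (curl exits with code 22 on an HTTP error >= 400 when -f is set):
if body=$(curl -sf http://localhost:8080/); then
echo "up: got the page"
else
echo "curl -f failed with exit code $?"
fi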
This can help to evaluate the HTTP status:
var=$(curl -I http://www.example.org 2>/dev/null | head -n 1 | awk '{print $2}')
echo http:$var
Another variation:
status=$(curl -sS -I https://www.healthdata.gov/user/login 2> /dev/null | head -n 1 | cut -d' ' -f2)
status_w_desc=$(curl -sS -I https://www.healthdata.gov/user/login 2> /dev/null | head -n 1 | cut -d' ' -f2-)
Here comes the long-winded, yet easy to understand, script, inspired by the solution of nicerobot, that only requests the response headers and avoids using IFS as suggested here. It outputs a bounce message when it encounters a response >= 400. This echo can be replaced with a bounce script.
# set the url to probe
url='http://localhost:8080'
# use curl to request headers (returning a sensible default on timeout: "timeout 500"). Parse the result into an array (avoid setting IFS; use read instead)
read -ra result <<< "$(curl -Is --connect-timeout 5 "${url}" || echo "timeout 500")"
# status code is second element of array "result"
status=${result[1]}
# if status code is greater than or equal to 400, then output a bounce message (replace this with any bounce script you like)
[ $status -ge 400 ] && echo "bounce at $url with status $status"
To add to @DennisWilliamson's comment above:
@VaibhavBajpai: Try this: response=$(curl --write-out '\n%{http_code}' --silent --output - servername) - the last line in the result will be the response code
You can then parse the response code out of the response using something like the following, where X is a glob pattern marking the end of the response body (using a JSON example here):
X='*\}'
code=$(echo ${response##$X})
See Substring Removal: http://tldp.org/LDP/abs/html/string-manipulation.html
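A toy example of the idea, with a stand-in JSON body:
response=$(printf '{"ok":true}\n200')
X='*\}'
code=$(echo ${response##$X})
echo "$code"
The parameter expansion strips the longest prefix matching the glob *\} (the JSON body up to its closing brace), and the unquoted echo squashes the leftover newline, leaving just 200.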
Assuming you have already implemented stop and start scripts for your application, create a script as follows, which checks the HTTP status of your application URL and restarts it in case of a 502:
httpStatusCode=$(curl -s -o /dev/null -w "%{http_code}" https://{your_url}/)
if [ "$httpStatusCode" = 502 ]; then
sh /{path_to_folder}/stopscript.sh
sh /{path_to_folder}/startscript.sh
fi
Implement a cron job to invoke this script every 5 minutes. Assuming the script above is named checkBootAndRestart.sh, your crontab should look like:
*/5 * * * * /{path_to_folder}/checkBootAndRestart.sh
