Process raw HTTP request - http

I'd like to pass a raw HTTP request like:
GET /foo/bar HTTP/1.1
Host: example.org
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; fr; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8
Accept: */*
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
X-Requested-With: XMLHttpRequest
Referer: http://example.org/test
Cookie: foo=bar; lorem=ipsum;
to an HTTP client.
I tried cat raw.http | curl, but without success.
Any suggestions?
Thanks.

Raw data in, raw data out:
nc example.org 80 < raw.http
If you need to pipe the data through some program:
cat raw.http | someprogram | nc example.org 80
See the nc manual page for details.
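If you are building raw.http by hand, note that HTTP expects CRLF line endings and a blank line after the headers. A minimal sketch (the request itself is just an illustration) that generates both on the fly:
printf 'GET /foo/bar HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\n\r\n' | nc example.org 80
The Connection: close header asks the server to close the socket after responding, so nc exits cleanly.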

The question is tagged curl, so I thought it was about time there was a curl answer:
cat raw.http | curl "telnet://TARGETHOST:80"
For normal use, just set TARGETHOST to the same value as the Host header.
For my (not so normal) purposes, TARGETHOST was an IP address, and the server behind it routed requests based on the Host headers of specific virtual hosts.
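For illustration (the address is made up), this connects to an IP while the Host header inside raw.http names the virtual host you actually want:
cat raw.http | curl "telnet://203.0.113.10:80"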

Note that neither of these solutions works if you need HTTPS instead of plain HTTP. In that case you can send the request this way:
$ cat raw.http | openssl s_client -connect server:443
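By default s_client may shut the connection down as soon as it hits end of file on stdin, before the response has arrived; adding -quiet (which implies -ign_eof) keeps the connection open and also suppresses the certificate chatter:
$ cat raw.http | openssl s_client -connect server:443 -quiet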

Related

wget says 406 Not acceptable

I have a simple file on my web server, and when I request it in a browser, it loads without problems:
http://example.server/report.php
But when I request the file with wget from a Raspberry Pi, I get this:
$ wget -d --spider http://example.server/report.php
Setting --spider (spider) to 1
DEBUG output created by Wget 1.18 on linux-gnueabihf.
Reading HSTS entries from /home/pi/.wget-hsts
URI encoding = 'ANSI_X3.4-1968'
converted 'http://example.server/report.php' (ANSI_X3.4-1968) -> 'http://example.server/report.php' (UTF-8)
Converted file name 'report.php' (UTF-8) -> 'report.php' (ANSI_X3.4-1968)
Spider mode enabled. Check if remote file exists.
--2018-06-03 07:29:29-- http://example.server/report.php
Resolving example.server (example.server)... 49.132.206.71
Caching example.server => 49.132.206.71
Connecting to example.server (example.server)|49.132.206.71|:80... connected.
Created socket 3.
Releasing 0x00832548 (new refcount 1).
---request begin---
HEAD /report.php HTTP/1.1
User-Agent: Wget/1.18 (linux-gnueabihf)
Accept: */*
Accept-Encoding: identity
Host: example.server
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 406 Not Acceptable
Date: Fri, 15 Jun 2018 08:25:17 GMT
Server: Apache
Keep-Alive: timeout=3, max=200
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
---response end---
406 Not Acceptable
Registered socket 3 for persistent reuse.
URI content encoding = 'iso-8859-1'
Remote file does not exist -- broken link!!!
I read somewhere that it might be an encoding problem, so I tried
$ wget -d --spider --header="Accept-encoding: *" http://example.server/report.php
but that gives me the exact same error.
That's because the server you're connecting to only serves content to certain User-Agents.
Change the user agent and it works fine:
wget -d --user-agent="Mozilla/5.0 (Windows NT x.y; rv:10.0) Gecko/20100101 Firefox/10.0" http://example.server/report.php
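The curl equivalent, in case you prefer it (same idea, only the User-Agent string matters):
curl -A "Mozilla/5.0 (Windows NT x.y; rv:10.0) Gecko/20100101 Firefox/10.0" http://example.server/report.php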

Server utility: receive HTTPS POST requests, cat the data

To test something, I want to run a simple web server that:
Will listen for HTTPS POST requests
Print the POST data received to STDOUT (along with other stuff, potentially, so it's fine if it just cats the whole HTTP request)
Is there a quick way to set something like this up? I've tried using OpenSSL's s_server, but it only seems to want to respond to GET requests.
Since s_server does not support POST requests, you should use socat instead of openssl s_server:
# socat -v OPENSSL-LISTEN:443,cert=mycert.pem,key=key.pem,verify=0,fork 'SYSTEM:/bin/echo HTTP/1.1 200 OK;/bin/echo;/bin/echo this-is-the-content-of-the-http-answer'
Here are the essential parameters:
fork: to loop for many requests
-v: to display the POST data (and other stuff) to STDOUT
verify=0: do not ask for mutual authentication
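If you don't already have a certificate and key, a throwaway self-signed pair is enough for this kind of test (file names chosen to match the command above):
openssl req -x509 -newkey rsa:2048 -nodes -keyout key.pem -out mycert.pem -days 365 -subj '/CN=localhost'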
Now, here is an example:
We use the following POST request:
% wget -O - --post-data=abcdef --no-check-certificate https://localhost/
[...]
this-is-the-content-of-the-http-answer
We see the following socat output:
# socat -v OPENSSL-LISTEN:443,cert=mycert.crt,key=key.pem,verify=0,fork 'SYSTEM:/bin/echo HTTP/1.1 200 OK;/bin/echo;/bin/echo this-is-the-content-of-the-http-answer'
> 2017/08/05 03:13:04.346890 length=212 from=0 to=211
POST / HTTP/1.1\r
User-Agent: Wget/1.19.1 (freebsd10.3)\r
Accept: */*\r
Accept-Encoding: identity\r
Host: localhost:443\r
Connection: Keep-Alive\r
Content-Type: application/x-www-form-urlencoded\r
Content-Length: 6\r
\r
< 2017/08/05 03:13:04.350299 length=16 from=0 to=15
HTTP/1.1 200 OK
> 2017/08/05 03:13:04.350516 length=6 from=212 to=217
abcdef< 2017/08/05 03:13:04.351549 length=1 from=16 to=16
< 2017/08/05 03:13:04.353019 length=39 from=17 to=55
this-is-the-content-of-the-http-answer
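The same test can be done with curl instead of wget, as a sketch (-k skips certificate verification since the certificate is self-signed, -d supplies the POST body):
% curl -k -d abcdef https://localhost/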

Why does curl not work, but wget works?

I am using both curl and wget to get this url: http://opinionator.blogs.nytimes.com/2012/01/19/118675/
For curl, it returns no output at all, but with wget, it returns the entire HTML source:
Here are the two commands. I used the same user agent, both requests come from the same IP, and both follow redirects. The URL is exactly the same. curl returns after about one second, so I know it's not a timeout issue.
curl -L -s "http://opinionator.blogs.nytimes.com/2012/01/19/118675/" --max-redirs 10000 --location --connect-timeout 20 -m 20 -A "Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1" 2>&1
wget http://opinionator.blogs.nytimes.com/2012/01/19/118675/ --user-agent="Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
If the NY Times is cloaking and not returning the source to curl, what could be different in the headers curl is sending? I assumed that since the user agent is the same, the requests should look exactly the same from both tools. What other "footprints" should I check?
The way to solve this is to analyze your curl request with curl -v ... and your wget request with wget -d ..., which shows that curl is redirected to a login page:
> GET /2012/01/19/118675/ HTTP/1.1
> User-Agent: Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
> Host: opinionator.blogs.nytimes.com
> Accept: */*
>
< HTTP/1.1 303 See Other
< Date: Wed, 08 Jan 2014 03:23:06 GMT
* Server Apache is not blacklisted
< Server: Apache
< Location: http://www.nytimes.com/glogin?URI=http://opinionator.blogs.nytimes.com/2012/01/19/118675/&OQ=_rQ3D0&OP=1b5c69eQ2FCinbCQ5DzLCaaaCvLgqCPhKP
< Content-Length: 0
< Content-Type: text/plain; charset=UTF-8
followed by a loop of redirections (which you must have noticed, because you have already set the --max-redirs flag).
On the other hand, wget follows the same sequence, except that it returns the cookie set by nytimes.com with its subsequent request(s):
---request begin---
GET /2012/01/19/118675/?_r=0 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Accept: */*
Host: opinionator.blogs.nytimes.com
Connection: Keep-Alive
Cookie: NYT-S=0MhLY3awSMyxXDXrmvxADeHDiNOMaMEZFGdeFz9JchiAIUFL2BEX5FWcV.Ynx4rkFI
The request sent by curl never includes the cookie.
The easiest way I see to modify your curl command and obtain the desired resource is to add -c cookiefile to it. This stores the cookies in the otherwise unused temporary "cookie jar" file called "cookiefile", thereby enabling curl to send the needed cookie(s) with its subsequent requests.
For example, I added the flag -c x directly after "curl ", and I obtained the output just like from wget (except that wget writes it to a file while curl prints it to STDOUT).
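Putting it together, a sketch of the adjusted command (the cookie-jar file name x is arbitrary):
curl -L -s -c x "http://opinionator.blogs.nytimes.com/2012/01/19/118675/" --max-redirs 10000 --connect-timeout 20 -m 20 -A "Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"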
In my case it was because, for curl, the https_proxy environment variable needs to include the port in the proxy URL, for example:
Does not work with curl:
https_proxy=http://proxyapp.net.com/
Works with curl:
https_proxy=http://proxyapp.net.com:80/
wget works with or without the port in the proxy URL, but curl needs it; when the port is not set, curl returns the error "(56) Proxy CONNECT aborted".
Running the command with curl -v shows that curl uses port 1080 as the default when no port is given in the proxy URL.
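As a sketch of what that looks like in a shell session (proxy host taken from above, target URL is just an example):
export https_proxy=http://proxyapp.net.com:80/
curl -v https://example.com/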

HTTP Get not using Port

I am trying to call a page in PHP with http_get:
$url = "http://mysite.fr:9090/neolane-webservice/campagnesclient/Coclico=1135446";
http_get($url, $appelOptions, $appelInfos);
My problem is that it does not work every time.
I installed Wireshark to see what I'm really sending, and I found an odd thing: sometimes the port is not used for the HTTP request.
When it works, I see:
Hypertext Transfer Protocol
GET http://mysite.fr:9090/neolane-webservice/campagnesclient/Coclico=1135446 HTTP/1.1\r\n
Request Method: GET
Request URI: http://mysite.fr:9090/neolane-webservice/campagnesclient/Coclico=1135446
Request Version: HTTP/1.1
User-Agent: PECL::HTTP/1.6.5 (PHP/5.2.4-2ubuntu5.7)\r\n
Host: mysite.fr:9090\r\n
Pragma: no-cache\r\n
Accept: */*\r\n
Proxy-Connection: Keep-Alive\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
Date: Fri, 15 Jun 2012 16:40:46 +0200\r\n
Accept-Charset: utf-8\r\n
Accept-Encoding: gzip;q=1.0,deflate;q=0.5\r\n
\r\n
And when it does not work:
Hypertext Transfer Protocol
GET http://mysite.fr:9090/neolane-webservice/campagnesclient/Coclico=1135446 HTTP/1.1\r\n
Request Method: GET
Request URI: http://mysite.fr:9090/neolane-webservice/campagnesclient/Coclico=1135446
Request Version: HTTP/1.1
User-Agent: PECL::HTTP/1.6.5 (PHP/5.2.4-2ubuntu5.7)\r\n
Host: mysite.fr\r\n
Pragma: no-cache\r\n
Accept: */*\r\n
Proxy-Connection: Keep-Alive\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
Date: Fri, 15 Jun 2012 16:40:34 +0200\r\n
Accept-Charset: utf-8\r\n
Accept-Encoding: gzip;q=1.0,deflate;q=0.5\r\n
\r\n
I tried to call the page with wget, and it always works:
wget http://mysite.fr:9090/neolane-webservice/campagnesclient/Coclico=1135446
So I'm guessing that my problem is due to the Apache config, but I don't know where to look. Could you help me, please?
You will need to set the port in the $appelOptions array.
$appelOptions['port']=9090;
http_get($url, $appelOptions, $appelInfos);
Unfortunately, http_get does not seem to respect the :port syntax in the URL.

HTTP 500 error in wget

Take a look at this page:
http://www.ptmytrade.com/product.asp?id=61363
It's loading fine (at least here). Now I would like to grab it with wget.
$ wget http://www.ptmytrade.com/product.asp?id=61363 --debug
DEBUG output created by Wget 1.12 on linux-gnu.
--2011-05-21 18:24:51-- http://www.ptmytrade.com/product.asp?id=61363
Resolving www.ptmytrade.com... 205.209.150.134
Caching www.ptmytrade.com => 205.209.150.134
Connecting to www.ptmytrade.com|205.209.150.134|:80... connected.
Created socket 3.
Releasing 0x0890e260 (new refcount 1).
---request begin---
GET /product.asp?id=61363 HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: www.ptmytrade.com
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 500 Internal Server Error
Connection: keep-alive
Date: Sat, 21 May 2011 16:24:56 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Length: 471822
Content-Type: text/html
Set-Cookie: ASPSESSIONIDSCACCAQA=FOCCMJODFHHMOKNKPAIHJCIL; path=/
Cache-control: private
---response end---
500 Internal Server Error
Stored cookie www.ptmytrade.com -1 (ANY) / <session> <insecure> [expiry none] ASPSESSIONIDSCACCAQA FOCCMJODFHHMOKNKPAIHJCIL
Registered socket 3 for persistent reuse.
Disabling further reuse of socket 3.
Closed fd 3
2011-05-21 18:24:57 ERROR 500: Internal Server Error.
OK, so I check the headers when fetching the page using my browser (using Live HTTP Headers add-on):
http://www.ptmytrade.com/product.asp?id=61361
GET /product.asp?id=61361 HTTP/1.1
Host: www.ptmytrade.com
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:2.0) Gecko/20100101 Firefox/4.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Cookie: ASPSESSIONIDSCACBBRA=AMPBLLNDGMFLNPNCPEBPNNLB; ASPSESSIONIDSCACCAQA=FJNBMJODLHHJNDHPFBIEEPEM
HTTP/1.1 500 Internal Server Error
Date: Sat, 21 May 2011 16:20:46 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Length: 471822
Content-Type: text/html
Cache-Control: private
----------------------------------------------------------
http://www.ptmytrade.com/images/index_117.jpg
GET /images/index_117.jpg HTTP/1.1
Host: www.ptmytrade.com
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:2.0) Gecko/20100101 Firefox/4.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://www.ptmytrade.com/product.asp?id=61361
Cookie: ASPSESSIONIDSCACBBRA=AMPBLLNDGMFLNPNCPEBPNNLB; ASPSESSIONIDSCACCAQA=FJNBMJODLHHJNDHPFBIEEPEM
HTTP/1.1 404 Not Found
Content-Length: 1635
Content-Type: text/html
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Sat, 21 May 2011 16:20:48 GMT
I'm not sure what's going on here. The page displays just fine, but I'm getting the 500 error code in the header.
The problem was solved by using curl instead (it was also getting a 500, but it fetched the page just fine), but I'm curious what's going on here.
I had this problem too. I don't know why, but the solution was to set a browser-like user agent with -U:
wget -U "Opera 11.0" "http://your_link" -O out.csv
I found it in "Curl and wget return error 500 for helloworld.php on new install but browser is fine".
Using this option will fix the issue:
--content-on-error
If this is set to on, wget will not skip the content when the
server responds with a http status code that indicates error.
So the command looks like this:
wget --content-on-error "https://stackoverflow.com"
NOTE: It's important to put the URL inside double quotes; otherwise, wget can get stuck on "Redirecting output to 'wget-log'".
Or as stated in the comments and by OP, use curl instead.
But I should note that curl cannot mirror whole web pages (CSS, JS, images, etc.) the way wget can, because it does not parse HTML.
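For this particular page plain curl works because, unlike wget, curl does not discard the body on a 500 status unless you pass -f/--fail; a minimal sketch:
curl -o product.html "http://www.ptmytrade.com/product.asp?id=61363"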
It's a bug in the web page. The HTTP status is indeed, seemingly incorrectly, set to HTTP 500. Firefox/Firebug confirms this as well. Basically, you're facing an HTTP 500 error page with "normal" content.
Report it to the site admin.
Try enclosing the URL in quotes, since ? is special to the shell:
wget "http://www.ptmytrade.com/product.asp?id=61363"
instead of:
wget http://www.ptmytrade.com/product.asp?id=61363
