How does the HTTP protocol work exactly?

I thought I had this all figured out, but now that I'm writing a webserver, something is not quite working right.
The app listens on a port for incoming requests, and when it receives one, it reads everything up to the sequence "\r\n\r\n". (Because that signifies the end of the headers - yes, I am ignoring possible POST data.)
Now, after it reads that far, it writes to the socket the response:
HTTP/1.1 200 OK\r\n
Host: 127.0.0.1\r\n
Content-type: text/html\r\n
Content-length: 6\r\n
\r\n
Hello!
However, when Firefox or Chrome tries to view the page, it won't display. Chrome informs me:
Error 324 (net::ERR_EMPTY_RESPONSE): Unknown error.
What am I doing wrong?
Here is some of the code:
QTcpSocket *pSocket = m_server->nextPendingConnection();
// Loop thru the request until \r\n\r\n is found
while (pSocket->waitForReadyRead())
{
    QByteArray data = pSocket->readAll();
    if (data.contains("\r\n\r\n"))
        break;
}
pSocket->write("HTTP/1.0 200 OK\r\n");
QString error_str = "Hello world!";
pSocket->write("Host: localhost:8081\r\n");
pSocket->write("Content-Type: text/html\r\n");
pSocket->write(tr("Content-Length: %1\r\n").arg(error_str.length()).toUtf8());
pSocket->write("\r\n");
pSocket->write(error_str.toUtf8());
delete pSocket;

I figured it out!
After writing the data to the socket, I have to call:
pSocket->waitForBytesWritten();
...or the buffered data never actually gets written to the socket before it is deleted.
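For reference, the same framing can be sketched in plain Python rather than Qt (build_response is a made-up helper name): sendall() queues every byte, and close() flushes and signals EOF, playing the role that waitForBytesWritten() plays in the Qt code above.

```python
import socket

def build_response(body: bytes) -> bytes:
    # Content-Length must count the body bytes, not characters
    return (b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/html\r\n"
            b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
            b"\r\n" + body)

# A socketpair stands in for a real client/server connection.
server_end, client_end = socket.socketpair()
server_end.sendall(build_response(b"Hello!"))  # sendall() queues every byte
server_end.close()                             # close() flushes and signals EOF

received = b""
while chunk := client_end.recv(4096):
    received += chunk
client_end.close()
```

A browser reading this connection sees the status line, the headers, the blank line, and exactly six body bytes before EOF, which is a complete response.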

Could the problem be that you're not flushing and closing the socket before deleting it?
EDIT: George Edison answered his own question, but was kind enough to accept my answer. Here is the code that worked for him:
pSocket->waitForBytesWritten();

What you’ve shown looks okay, so it must be that you’re actually sending something different. (I presume that you're entering "http://127.0.0.1" in your browser.)
Download the free trial version of Charles (an HTTP proxy/monitor) and see what it reports:
http://www.charlesproxy.com/

Try it with telnet or nc (netcat) to debug it. Also, is it possible you're sending double newlines? I don't know what language you're using, so if your print statement appends a newline of its own, end each line with a bare carriage return instead:
HTTP/1.1 200 OK\r
It's also handy to have a small logging/forwarding program for debugging socket protocols. I don't know of any offhand, but I've had one I've used for many years, and it's a lifesaver when I need it.
[Edit]
Try using wget to fetch the page; use the -S -O file options to show the server headers and save the output to a file, and see what's wrong that way.
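In the same spirit, you can assemble the raw request yourself and inspect it byte-for-byte before it goes on the wire. A small Python sketch (build_request is a name made up for illustration) of what telnet or nc would be typing for you:

```python
def build_request(host: str, path: str = "/") -> bytes:
    # Each header line must end in CRLF, and a blank line ends the headers.
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

req = build_request("127.0.0.1")
# Inspect the repr for accidental double newlines or bare \n before sending.
print(req)
```

Printing the bytes object makes stray \n or doubled \r\n sequences stand out immediately.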
[#2]
pSocket->write(error_str.toUtf8());
delete pSocket;
I haven't used Qt for a long time, but do you need to flush or at least close the socket before deleting?

Related

Sending HTTP request

I am trying to upload data from an Arduino to data.sparkfun.com, but somehow it always fails. To make sure that the HTTP request I am sending is correct, I would like to send it from a computer to the server and see if it uploads the correct values.
According to some examples, the request should be formulated like this:
GET /input/publicKey?private_key=privateKey&dht1_t=24.23&dht1_h=42.4&dht2_t=24.48&dht2_h=41.5&bmp_t=23.3&bmp_p=984021 HTTP/1.1\n
Host: 54.86.132.254\n
Connection: close\n
\n
How do I send this request to the server from my computer? Do I just type it in the terminal? I'm not sure where to start.
Have a look at curl which should be able to handle your needs.
Even easier and more low level is netcat (here is an example on SO)
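If you'd rather script it than type it, a rough Python sketch of building that query string and request (the keys and sensor values are placeholders copied from the question, not real credentials). Note that a real request should end each line with \r\n, even though the example above shows bare \n:

```python
from urllib.parse import urlencode

# Placeholder keys and readings, mirroring the request in the question.
params = {
    "private_key": "privateKey",
    "dht1_t": "24.23",
    "dht1_h": "42.4",
}
path = "/input/publicKey?" + urlencode(params)
request = (f"GET {path} HTTP/1.1\r\n"
           f"Host: 54.86.132.254\r\n"
           f"Connection: close\r\n"
           f"\r\n")
```

urlencode() also percent-escapes any characters that are not safe in a query string, which hand-built strings tend to get wrong.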

HTTP 505 error when requesting Heroku apps w a cell module

I'm trying to use the Telit GE910 cell module to make HTTP requests over the cell network. I've connected it via a FTDI board to my computer's USB port and am sending it AT commands via the terminal. I'm using the AT commands to successfully open a socket in command mode and send the HTTP request.
AT#SD=1,0,80,"google.com",0,0,1
OK
�AT#SSEND=1
> HE�AD� /� HTT�P/1.1
OK
SRING: 1
I don't understand why these � are turning up. When making requests for google.com this is fine but anything hosted on Heroku gives me a 505 error.
HTTP/1.1 505 HTTP Version Not Supported
Connection: close
Server: Cowboy
Date: Tue, 26 Apr 2016 20:39:34 GMT
Content-Length: 0
I've read in one or two forums that this 505 response is specific to Heroku and has to do with incorrect spacing in the HTTP request. I suspect the unrecognized characters are creating the problem. What is going on? They consistently turn up before 'A', 'space', and 'P'; there may be other letters also but those are the ones that I've seen.
Ok, I have figured out (I think) why I was getting a 505 response. Then I started getting a 400, but I figured that out too!
In the application note for socket dials from Nimbelink (which is a vendor that uses the Telit cell modules--I have one of their modules which has the Telit GE910 on it) it says that after you enter your HTTP request (e.g. GET / HTTP/1.1) you're meant to hit ctrl+j twice to signal the end of the request.
Well, I started doing all of my serial communication in CoolTerm so that I could see the hex I was sending. (My hope was to catch the � characters; I didn't, and in fact they don't turn up in CoolTerm at all.) ctrl+j results in a single line feed (hex: 0A). According to the HTTP specification, each line must end with a carriage return plus line feed (hex: 0D 0A), and Heroku also says it has to be formatted like this. That is what I send when I hit enter. So if I end GET / HTTP/1.1 by pressing enter twice, the request gets through. Even then, a HEAD / HTTP/1.1 request to heroku.com came back as a 400. But that's up next:
According to the RFC (which I found out about here), HTTP/1.1 requires a Host header. So if I do the whole thing with the right line endings
GET / HTTP/1.1
Host: heroku.com
it works! It also works for posting to my server.
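The CRLF-versus-LF distinction described above is easy to see in code. A small Python sketch that prints the hex of both framings, much like CoolTerm does (the requests themselves are just illustrations):

```python
lf_request = "HEAD / HTTP/1.1\n\n"                       # ctrl+j twice: bare line feeds
crlf_request = "HEAD / HTTP/1.1\r\nHost: heroku.com\r\n\r\n"

# hex() makes the 0A versus 0D 0A difference visible, as in CoolTerm
print(lf_request.encode("ascii").hex(" "))
print(crlf_request.encode("ascii").hex(" "))
```

The first dump ends in "0a 0a", which strict servers reject; the second ends in "0d 0a 0d 0a", the proper end-of-headers sequence.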

HTTP: error during reply after 200 OK status code

As an HTTP 1.1 server, I reply to a GET request with 200 OK status code, then start sending the data to the client.
During this send, an error occurs, and I cannot finish.
I cannot send a new status code as a final status code has already been sent.
How should I behave to let the client know an error occurred and I cannot continue with this HTTP request?
I can think of only one solution: close the socket, but it's not perfect: it breaks the keep-alive feature, and no clear explanation of the error is given to the client.
The HTTP standard seems to suppose that the server already knows exactly what to reply before it starts replying.
But this is not always the case.
Examples:
I return a very large file (several GB) from disk, and I get an IO error at some point during the reading of the file.
Same example with a large DB dump.
I cannot construct my whole response in memory then send it.
The HTTP 1.1 standard helps for such usage with the chunked transfer encoding: I don't even need to know the final size before starting to send the reply.
So these usages are not excluded by HTTP 1.1.
I finally found a possible solution for this:
HTTP 1.1 Trailer headers.
In chunked encoded bodies, HTTP 1.1 allows the sender to add data after the last (empty) chunk, in the form of a block of headers.
The specification hints at use cases like computing an MD5 checksum of the body on the fly and sending it after the body so the client can check its integrity.
I think it could be used for error reporting, even if I haven't found anything about this kind of usage.
The issues I see with this are:
this requires using chunked encoding (but it's not much of an issue)
trailers support is probably very low:
server-side (it could be bypassed by manually creating the chunked encoding, but since it's applied after the content-encoding (gzip), it would require a lot of reimplementation)
client-side (bugs fixed only in 2010 in curl for example)
and in proxies (which could then lose the trailers if not properly implemented)
I asked a similar question, and the answer there confirms that there is no real solution:
How to tell there's something wrong with the server during response that started as 200 OK. Fail gracefully
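To make the trailer idea concrete, here is a rough Python sketch of what a chunked body carrying a trailer could look like on the wire (the X-Error trailer name and error text are invented for illustration; a real response should also announce the trailer in a Trailer header before the body):

```python
def chunked_with_trailer(chunks, trailers):
    # Each chunk is: hex length, CRLF, data, CRLF. A zero-length chunk
    # ends the body, and the trailer headers follow it before a final CRLF.
    out = b""
    for chunk in chunks:
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n"
    for name, value in trailers.items():
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"
    return out

body = chunked_with_trailer([b"Hello, ", b"world"],
                            {"X-Error": "disk read failed at byte 7"})
```

A client that supports trailers would see the error header arrive after the last chunk, which is exactly the reporting channel discussed above.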

gawk to read last bit of binary data over a pipe without timeout?

I have a program already written in gawk that downloads a lot of small bits of info from the internet. (A media scanner and indexer)
At present it launches wget to get the information. This is fine, but I'd like to simply reuse the connection between invocations. It's possible a run of the program might make between 200-2000 calls to the same API service.
I've just discovered that gawk can do networking and found geturl
However, the advice at the bottom of that page is well heeded: I can't find an easy way to read the last line of a response and still keep the connection open.
As I'm mostly reading JSON data, I can set RS="}" and exit when the body length reaches the expected Content-Length. This might break on trailing whitespace, though, and I'd like a more robust approach. Does anyone have a nicer way to implement sporadic HTTP requests in awk that keep the connection open? Currently I have the following structure...
con = "/inet/tcp/0/host/80"
send_http_request(con)
RS = "\r\n"
read_headers()
# now read the body - but do not close the connection...
RS = "}"  # for JSON
while ((con |& getline bytes) > 0) {
    body = body bytes RS
    if (length(body) >= content_length)
        break
    print length(body)
}
# Do not close con here - keep open
It's a shame this one little thing seems to be spoiling all the potential here. Also, in case anyone asks :) ..
awk was originally chosen for historical reasons - there were not many other language options on this embedded platform at the time.
Gathering up all of the URLs in advance and passing to wget will not be easy.
re-implementing in perl/python etc is not a quick solution.
I've looked at trying to pipe URLs into a named pipe and into wget -i -, but that doesn't work: the data gets buffered and unbuffer is not available. Also, I think wget gathers up all the URLs until EOF before processing.
The data is small so lack of compression is not an issue.
The problem with connection reuse comes from the HTTP 1.0 standard, not gawk. To reuse the connection, you must either use HTTP 1.1 or try some other non-standard solutions for HTTP 1.0. Don't forget to add the Host: header to your HTTP/1.1 request, as it is mandatory.
You're right about the lack of robustness when reading the response body. For line-oriented protocols this is not an issue. Moreover, even when using HTTP 1.1, if your script locks up waiting for more data when it shouldn't, the server will, again, close the connection due to inactivity.
As a last resort, you could write your own HTTP retriever in whatever language you like which reuses connections (all to the same remote host, I presume) and also inserts a special record separator for you. Then you could control it from the awk script.
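The robust approach alluded to above is to count bytes rather than trust a record separator. A sketch in Python rather than gawk (read_exact is a made-up helper), using a socketpair to stand in for the open HTTP connection:

```python
import socket

def read_exact(sock: socket.socket, n: int) -> bytes:
    # Read exactly n bytes so the connection can stay open for reuse;
    # stopping at Content-Length means we never block waiting for EOF.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError(f"peer closed after {len(buf)} of {n} bytes")
        buf += chunk
    return buf

# socketpair stands in for an open HTTP connection
a, b = socket.socketpair()
a.sendall(b'{"ok": true}   ')    # body plus trailing whitespace
body = read_exact(b, 12)         # stop at Content-Length, not at EOF
a.close(); b.close()
```

Because the reader stops after exactly Content-Length bytes, trailing whitespace is never consumed and the connection remains usable for the next request.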

Why does recv() return 0 bytes on every for-loop iteration except the first?

I'm writing a small networking program in C++. Among other things, it has to download Twitter profile pictures. I have a list (std::vector) of URLs, and my next step is to loop over them, send GET requests through the socket, and save the pictures to separate PNG files. When I send the very first message, receive the answer segments, and save the PNG data, everything seems fine. But on the very next iteration, the same message sent through the same socket produces 0 received bytes from the recv() function. I solved the problem by moving the socket creation code into the loop body, but I'm a bit confused by the socket concepts. It looks like after I send a message, the socket has to be closed and recreated to send the next message to the same server (in order to get the next image). Is this the right way to do socket programming, or is it possible to receive several HTTP responses through the same socket?
Thanks in advance.
UPD: Here is the code with the loop where I create a socket.
// Get links from xml.
...
// Load images in cycle.
for (size_t i = 0; i < imageLinks.size(); i++)
{
    // New socket is returned from serverConnect. Why do we need to create a new one at each iteration?
    string srvAddr = "207.123.60.126";
    int sockImg = serverConnect(srvAddr);
    // Create a message.
    ...
    string message = "GET " + relativePart;
    message += " HTTP/1.1\r\n";
    message += "Host: " + hostPart + "\r\n";
    message += "\r\n";
    // Send a message.
    BufferArray tempImgBuffer = sendMessage(sockImg, message, false);
    fstream pFile;
    string name;
    // Form the name.
    ...
    pFile.open(name.c_str(), ios::app | ios::out | ios::in | ios::binary);
    // Write the file contents.
    ...
    pFile.close();
    // Close the socket.
    close(sockImg);
}
The other side is closing the connection. That's how HTTP/1.0 works. You can:
Make a different connection for each HTTP GET
Use HTTP/1.0 with the unofficial Connection: Keep-Alive
Use HTTP/1.1. In HTTP 1.1 all connections are considered persistent unless declared otherwise.
Obligatory xkcd link Server Attention Span
From the Wikipedia article on HTTP:
"The original version of HTTP (HTTP/1.0) was revised in HTTP/1.1. HTTP/1.0 uses a separate connection to the same server for every request-response transaction, while HTTP/1.1 can reuse a connection multiple times."
HTTP in its original form (HTTP 1.0) is indeed a "one request per connection" protocol. Once you get the response back, the other side has probably closed the connection. There were unofficial mechanisms added to some implementations to support multiple requests per connection, but they were not standardized.
HTTP 1.1 turns this around. All connections are by default "persistent".
To use this, you need to add "HTTP/1.1" to the end of your request line. Instead of GET http://someurl/, do GET http://someurl/ HTTP/1.1. You'll also need to make sure you provide the "Host:" header when you do this.
Note well, however, that even some otherwise-compliant HTTP servers may not support persistent connections. Note also that the connection may in fact be dropped after very little delay, a certain number of requests, or just randomly. You must be prepared for this, and ready to re-connect and resume issuing your requests where you left off.
See also the HTTP 1.1 RFC.
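Since the question's helper functions (serverConnect, sendMessage) aren't shown, here is a self-contained Python sketch of the same idea: two GETs issued over one persistent HTTP/1.1 connection to a throwaway local server.

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"          # enables persistent connections
    def do_GET(self):
        # Echo the request path so each response is distinguishable.
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):          # keep the demo quiet
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One connection, two request/response exchanges.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/a")
first = conn.getresponse().read()
conn.request("GET", "/b")
second = conn.getresponse().read()
conn.close()
server.shutdown()
server.server_close()
```

Note that each response is fully read before the next request is sent; as the answer warns, a real client must also be prepared to reconnect if the server drops the connection between requests.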
