D std.zlib stream compression with HTTP

I'm trying to add gzip compression to an HTTP server I wrote in D. Here is the code that does the gzip encoding:
if ((Info.modGzip) & (indexOf(client.getRequestHeaderFieldValue("Accept-Encoding"),"gzip") != -1)){
writeln("gzip");
auto gzip = new Compress(HeaderFormat.gzip);
client.addToResponseHeader("Content-Encoding: gzip");
client.sendHeader("200 ok");
while (0 < (filestream.readBlock(readbuffer))){
client.client.send(gzip.compress(readbuffer));
}
client.sendData(gzip.flush(Z_FINISH));
delete gzip;
} else {
writeln("no gzip");
client.sendHeader("200 ok");
while (0 < (filestream.readBlock(readbuffer))){
client.client.send(readbuffer);
}
delete filestream;
}
But when I test it, Firefox, Internet Explorer, and Chrome all say that the encoding or compression is bad. Why? The data is compressed with gzip.

Your code isn't sending the appropriate headers. The compression portion is fine, but the stuff surrounding it has a few bugs that need to be fixed.
cross posting what I said on the D newsgroup: http://forum.dlang.org/post/rngupyejcsbzkzqwgojp#forum.dlang.org
I know zlib's gzip works for HTTP; I use it in my cgi.d:
if(gzipResponse && acceptsGzip && isAll) {
auto c = new Compress(HeaderFormat.gzip); // want gzip
auto data = c.compress(t);
data ~= c.flush();
t = data;
}
But your http server is buggy in a lot of ways. It doesn't reply to curl and doesn't keep the connection open to issue manual requests.
Among the bugs I see looking at it quickly:
In server.d's getRequestHeaderFieldValue, you don't check whether epos is -1. If it is, you should return null or something instead of trying to use it; otherwise the connection hangs, because an out-of-bounds array read kills the handler.
You also wrote:
if ((Info.modGzip) & (indexOf(client.getRequestHeaderFieldValue("Accept-Encoding"),"gzip") != -1)){
Notice the & instead of &&. That's in fspipedserver.d.
Finally, you write client.client.send... which never sent the headers back to the client, so it didn't know you were gzipping! Change that to client.sendData (and change sendData in server.d to take "in void[]" instead of "void[]") and then it sends the headers and seems to work by my eyeballing.
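For comparison, here is a minimal sketch of the same streaming pattern in Python rather than D: compress each block as it is read, then flush once at the end so the gzip trailer is written. The function name gzip_stream and the block size are illustrative assumptions, not part of the original server.

```python
import gzip
import io
import zlib


def gzip_stream(source, block_size=4096):
    """Yield gzip-framed chunks for the bytes read from a file-like object."""
    # wbits=31 (16 + 15) asks zlib to emit a gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    while True:
        # read() returns only the bytes actually read, so no stale buffer
        # contents are ever fed to the compressor.
        block = source.read(block_size)
        if not block:
            break
        chunk = compressor.compress(block)
        if chunk:  # compress() may buffer input and return nothing yet
            yield chunk
    # flush() defaults to Z_FINISH: remaining data plus the gzip trailer.
    yield compressor.flush()


payload = b"hello gzip " * 1000
body = b"".join(gzip_stream(io.BytesIO(payload), block_size=512))
```

Every chunk, including the final flush, has to go out after the response headers that announce Content-Encoding: gzip, which is the point made above about client.sendData.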

Related

How to get continuous HTTP data?

I'm trying to get live trading data from the Internet via HTTP, but it is updated continuously, so when I GET the data, the download keeps going as long as data is available. I can only access the data once I stop the download.
How to access the stream of data while the downloading is in progress?
I tried Indy's TIdHTTP so that I can use SSL. I also tried IdIOHandlerStream, but the IOHandler was already in use for IdSSLIOHandlerSocketOpenSSL. So I'm absolutely clueless here.
This is in response to a "multipart/form-data" request.
Please guide me...
Lrequest.Values['__RequestVerificationToken'] := RequestVerificationToken;
Lrequest.Values['acct'] := 'demo';
Lrequest.Values['pwd'] := 'demo';
try
Response.Text := Onhttp.Post('https://trading/data', Lrequest);
Form1.Memo1.Lines.Add(TimeToStr(Time) + ': ' + Response.Text);
except
on E: Exception do
Form1.Memo1.Lines.Add(TimeToStr(Time) + ': ' + E.ClassName +
' error raised, with message : ' + E.Message);
end;
UPDATE:
The data is an endless JSON string, like this:
{"id":"data","val":[{"rc":2,"tpc":"\\RealTime\\Global\\SGDIDR.FX","item":[{"val":{"F009":"10454.90","F011":"-33.1"}}]}]}
{"id":"data","val":[{"rc":2,"tpc":"\\RealTime\\Global\\SGDIDR.FX","item":[{"val":{"F009":"10458.80","F011":"-29.2"}}]}]}
and so on, and so on...
You can't use TIdIOHandlerStream to interface with a TCP connection, that is not what it is designed for. It is meant for performing I/O operations using user-provided TStream objects, ie for debugging previously captured sessions.
TIdHTTP is not really designed to handle endless HTTP responses in most cases, as you have described. What is the exact format that the server is delivering its live data as? What do the HTTP response headers look like? It is really difficult to answer your question without knowing the exact format being used.
However, that being said, there are some cases to consider, depending on what the server is actually sending:
If the server is using a MIME-based server-push format, like multipart/x-mixed-replace, you can enable the hoNoReadMultipartMIME flag in the TIdHTTP.HTTPOptions property, and then read the MIME data yourself from the TIdHTTP.IOHandler after TIdHTTP.Get() exits. For instance, you can use TIdMessageDecoderMIME to help you parse the MIME parts, see New TIdHTTP hoNoReadMultipartMIME flag in Indy's blog, or Delphi Indy TIdHttp and multipart/x-mixed-replace with Text and jpeg image.
Otherwise, if the server is using Transfer-Encoding: chunked, where each data update is sent as a new HTTP chunk, you can use the TIdHTTP.OnChunkReceived event. Or, you can enable the hoNoReadChunked flag in the TIdHTTP.HTTPOptions property, and then read the chunks yourself from the TIdHTTP.IOHandler after TIdHTTP.Get() exits. See New TIdHTTP flags and OnChunkReceived event in Indy's blog.
Otherwise, you could give TIdHTTP.Get() a TIdEventStream to write into, and then use that stream's OnWrite event to access the raw bytes. Or, you could write your own TStream-derived class that overrides the virtual Write() method. Either way, you would be responsible for manually parsing and buffering the raw body data as they are being written to the stream.
Otherwise, you may have to resort to using TIdTCPClient instead, implementing the HTTP protocol manually, then you would be solely responsible for reading in the HTTP response body however you want.
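Whichever of these routes delivers the raw bytes, the manual buffering step looks roughly the same. The following is a minimal sketch in Python rather than Delphi, assuming each JSON update ends with a newline (the sample above suggests this, but it is not confirmed); the class name JsonLineBuffer is hypothetical.

```python
import json


class JsonLineBuffer:
    """Collect raw body bytes and hand back each complete JSON line."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        """Add newly received bytes; return the objects completed so far."""
        self._pending += data
        # Everything before the last newline is complete; the rest waits
        # for the next call.
        *lines, self._pending = self._pending.split(b"\n")
        return [json.loads(line) for line in lines if line.strip()]


buf = JsonLineBuffer()
# The second update is split across two writes, as can happen on a socket.
first = buf.feed(b'{"id":"data","val":[{"rc":2}]}\n{"id":"da')
second = buf.feed(b'ta","val":[{"rc":3}]}\n')
```

A stream's write handler would call feed() with each block it receives and process whatever objects come back, so updates are handled while the download is still in progress.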

Get Hunchentoot to output no headers at all

I want to return a TSV file from a web call in Hunchentoot (SBCL), but want the user to just save the raw result blatted to the page, rather than use a separate file and download link (which is hard because of local firewall complexities).
I can't figure out how to output the page without any headers at all, i.e., to make it just plain raw text. (I know that the browser would make a mess w/o headers in the DOM, but don't care; the goal is just to have the user save the page, not read it.)
I've tried various combinations of
(setf (hunchentoot:content-type*) "text/plain")
and
(cl-who:with-html-output-to-string
(*standard-output* nil :prologue nil)
and setting the content-type* inside, outside, and around the with... but I always get header junk.
Writing a string directly
I tried defining a handler as follows:
(define-easy-handler (text :uri "/text") ()
(setf (content-type*) "text/csv")
"a,b,c")
When I visit the page locally, the browser automatically downloads a text file without even displaying (this is probably a setting we can change in Chrome, I don't know).
When I enable the browser developer mode, here are the response headers I receive as part of the HTTP protocol:
HTTP/1.1 200 OK
Server: ...
Date: ...
Content-Type: text/csv; charset=utf-8
Content-Length: 5
Connection: keep-alive
But the file itself is just the string a,b,c.
If I change the content-type to "text/plain", then the browser successfully displays the text, and nothing else (the HTTP headers are the same).
Remarks
You don't need to use the cl-who macros if you do not intend to build an HTML document; in fact, it's better not to. In any case, you can supply your own REPLY-CLASS when initializing the acceptor (see https://edicl.github.io/hunchentoot/#replies) and have very low-level control over what you emit as a reply, headers included. But I don't think this is necessary in your case. I don't clearly understand where your problem comes from, but sending back plain text is something the framework is supposed to be able to do out of the box. Please add more details if you can.
Is the correct answer not to use the Content-Disposition header?
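To illustrate the two points above (the HTTP headers are separate from the body the user saves, and Content-Disposition can ask the browser to save rather than display), here is a small sketch in Python rather than Lisp. The function build_tsv_response and the file name are hypothetical; a real Hunchentoot handler would set the headers through the framework instead of assembling bytes by hand.

```python
def build_tsv_response(rows, filename):
    """Return raw HTTP/1.1 response bytes that offer the rows as a TSV download."""
    body = "\n".join("\t".join(row) for row in rows).encode("utf-8")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/tab-separated-values; charset=utf-8\r\n"
        # 'attachment' asks the browser to save the body instead of rendering it.
        f'Content-Disposition: attachment; filename="{filename}"\r\n'
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


raw = build_tsv_response([["a", "b", "c"], ["1", "2", "3"]], "result.tsv")
# The blank line separates the headers from what ends up in the saved file.
head, _, body = raw.partition(b"\r\n\r\n")
```

Only body is written to disk when the user saves the result; the header lines never appear in the file.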

Can I Vary on a custom header?

I'm bucketing User-Agents by device using something like varnish-devicedetect and storing the result in X-UA-Device on the request and the response.
I've seen several recommendations to vary on User-Agent. Any reason not to vary instead on X-UA-Device? Seems like it'd be nicer to downstream caches.
Since X-UA-Device is not available on the client request or in any downstream proxies (it's generated inside Varnish), you have to vary on the raw User-Agent header.
Although varying on X-UA-Device is incorrect for downstream caches, Varnish itself can still benefit from that optimization if you rewrite the Vary header in vcl_deliver:
sub vcl_deliver {
if (resp.http.Vary) {
set resp.http.Vary = regsub(resp.http.Vary,
"(?i)X-UA-Device",
"User-Agent");
}
}
This way, Varnish varies its cache on X-UA-Device and downstream caches vary on User-Agent.
In your question, you mentioned you were adding X-UA-Device to the response header as well as the request header. In that case, the above suggestion will not work and you will instead need to send Vary: User-Agent unconditionally:
sub vcl_fetch {
set beresp.http.X-UA-Device = req.http.X-UA-Device;
if (!beresp.http.Vary) {
set beresp.http.Vary = "User-Agent";
} elsif (beresp.http.Vary !~ "(?i)User-Agent") {
set beresp.http.Vary = beresp.http.Vary + ", User-Agent";
}
}
(I was not sure whether you were setting the X-UA-Device response header for the benefit of client-side scripts, or in the hope that it would be recognized by downstream caches.)
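The header manipulation in the two VCL snippets can be checked outside Varnish. This is a sketch in Python that mirrors the same string logic; the function names are hypothetical, and it only models the Vary value, not Varnish's cache behaviour.

```python
import re


def rewrite_vary_for_clients(vary):
    """Mirror the vcl_deliver regsub: expose User-Agent instead of X-UA-Device."""
    return re.sub(r"(?i)X-UA-Device", "User-Agent", vary)


def add_vary_token(vary, token="User-Agent"):
    """Mirror the vcl_fetch branch: make sure the Vary value names the token."""
    if not vary:
        return token
    if re.search(re.escape(token), vary, re.IGNORECASE):
        return vary  # already present, leave the header untouched
    return f"{vary}, {token}"
```

For example, add_vary_token("Accept-Encoding") gives "Accept-Encoding, User-Agent", and rewrite_vary_for_clients("Accept-Encoding, X-UA-Device") gives "Accept-Encoding, User-Agent".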

What are these weird lines in HTTP protocol?

I'm reading source from a website by building a legit connection, like this in Java:
final Socket sock = new Socket(hostname, 80);
PrintWriter writer = new PrintWriter(sock.getOutputStream(), true);
writer.println("GET /path HTTP/1.1");
writer.println("Host: " + hostname);
writer.println();
//...
while (!sock.isClosed() && (line = reader.readLine()) != null) {
System.out.println(line);
}
It works well, except that there are some weird lines in the output which are not there when I browse the website with, say, Firefox.
The problem is that some lines of the source are interrupted by seemingly random extra data, and I don't know why it appears in the middle of my source:
<div clas
16d0
s="span5">
or
<td style="text-align:c
2000
enter; vertical-align:middle">information</td>
What is this and how do I fix it?
Looks like the server is sending you Chunked data. Can you send HTTP/1.0 instead of 1.1? That should ensure no chunking is performed on the response.
You are reading the HTTP stream raw off a socket, instead of using an existing HTTP reader.
If you really want to do this, you should read the HTTP specification, in your case especially section 3.6 of RFC 2616, which covers transfer codings, including chunked transfer.
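The stray lines such as 16d0 and 2000 are hexadecimal chunk sizes. A minimal decoder, sketched in Python rather than Java and assuming the whole chunked body has already been read into memory, shows how they are meant to be consumed. The function name decode_chunked is an illustrative choice, and trailers are ignored.

```python
def decode_chunked(raw):
    """Decode an HTTP/1.1 chunked message body into the original bytes."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ';' is a chunk extension.
        size = int(raw[pos:line_end].split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # last chunk; optional trailers follow and are ignored here
        start = line_end + 2
        decoded += raw[start:start + size]
        pos = start + size + 2  # skip the CRLF that terminates the chunk data
    return bytes(decoded)


# A tag split across chunks, like the '<div clas' / 's="span5">' case above.
wire = b'9\r\n<div clas\r\n9\r\ns="span5"\r\n1\r\n>\r\n0\r\n\r\n'
page = decode_chunked(wire)
```

Reading the response line by line cannot work for this, because chunk boundaries fall at arbitrary byte offsets rather than at line ends. An existing HTTP client library handles the decoding for you.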

Why don't HTTP headers get created when I use Server.Transfer()?

I'm using an .aspx page to serve an image file from the file system according to the given parameters.
Server.Transfer(imageFilePath);
When this code runs, the image is served, but no Last-Modified HTTP header is created, unlike when the same file is requested directly by URL on the same server.
Therefore the browser doesn't issue an If-Modified-Since and doesn't cache the response.
Is there a way to make the server create the HTTP headers as it normally does for a direct request of a file (an image in this case), or do I have to create the headers manually?
When you make a transfer to the file, the server will return the same headers as it does for an .aspx file, because it's basically executed by the .NET engine.
You basically have two options:
Make a redirect to the file instead, so that the browser makes the request for it.
Set the headers you want, and use Response.BinaryWrite (or similar) to send the file data back in the response.
I'll expand on @Guffa's answer and share my chosen solution.
When calling the Server.Transfer method, the .NET engine treats the target like an .aspx page, so it doesn't add the HTTP headers (e.g. for caching) that it would add when serving a static file.
There are three options:
Using Response.Redirect, so the browser makes the appropriate request
Setting the headers needed and using Response.BinaryWrite to serve the content
Setting the headers needed and calling Server.Transfer
I chose the third option. Here is my code:
try
{
DateTime fileLastModified = File.GetLastWriteTimeUtc(MapPath(fileVirtualPath));
fileLastModified = new DateTime(fileLastModified.Year, fileLastModified.Month, fileLastModified.Day, fileLastModified.Hour, fileLastModified.Minute, fileLastModified.Second);
if (Request.Headers["If-Modified-Since"] != null)
{
DateTime modifiedSince = DateTime.Parse(Request.Headers["If-Modified-Since"]);
if (modifiedSince.ToUniversalTime() >= fileLastModified)
{
Response.StatusCode = 304;
Response.StatusDescription = "Not Modified";
return;
}
}
Response.AddHeader("Last-Modified", fileLastModified.ToString("R"));
}
catch
{
Response.StatusCode = 404;
Response.StatusDescription = "Not found";
return;
}
Server.Transfer(fileVirtualPath);
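The conditional-request logic in that code can be isolated and tested on its own. Here is a sketch of the same decision in Python rather than C#; the function name conditional_get and its return shape are assumptions made for illustration, and the caller is expected to pass a timezone-aware UTC timestamp.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_write, if_modified_since=None):
    """Return (status, headers) for a file last written at `last_write` (aware UTC)."""
    # HTTP dates have one-second resolution, so drop the sub-second part.
    last_modified = last_write.replace(microsecond=0)
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # ignore a malformed header, as if it were absent
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if since >= last_modified:
                return 304, {}
    return 200, {"Last-Modified": format_datetime(last_modified, usegmt=True)}


written = datetime(2015, 10, 21, 7, 28, 0, 500000, tzinfo=timezone.utc)
status, headers = conditional_get(written)
```

Truncating to whole seconds matters for the same reason as in the C# version: the browser echoes back a date without sub-second precision, so an untruncated file time would always look newer than the cached copy.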

Resources