I'm reading the source of a website by making a direct socket connection, like this in Java:
final Socket sock = new Socket(hostname, 80);
PrintWriter writer = new PrintWriter(sock.getOutputStream(), true);
writer.println("GET /path HTTP/1.1");
writer.println("Host: " + hostname);
writer.println();
BufferedReader reader = new BufferedReader(new InputStreamReader(sock.getInputStream()));
String line;
while (!sock.isClosed() && (line = reader.readLine()) != null) {
    System.out.println(line);
}
and it works well, except that there are some weird lines in the output which are not there when I browse the website with, say, Firefox.
The problem is that some lines of the source get interrupted by short runs of seemingly random data, and I don't know why these fragments show up in the middle of my source. For example:
<div clas
16d0
s="span5">
or
<td style="text-align:c
2000
enter; vertical-align:middle">information</td>
What is this and how do I fix it?
Looks like the server is sending you chunked data. Can you send HTTP/1.0 instead of 1.1? Chunked transfer encoding was introduced in HTTP/1.1, so requesting with 1.0 should ensure no chunking is performed on the response.
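For example, a minimal sketch of that change (example.com and /path are placeholders; note that HTTP wants CRLF line endings, which println does not guarantee):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class Http10Get {
    public static void main(String[] args) throws Exception {
        String hostname = "example.com"; // placeholder
        try (Socket sock = new Socket(hostname, 80)) {
            PrintWriter writer = new PrintWriter(sock.getOutputStream(), true);
            // print with explicit \r\n: HTTP requires CRLF line endings,
            // while println emits the platform separator.
            writer.print("GET /path HTTP/1.0\r\n"); // 1.0: no chunked encoding
            writer.print("Host: " + hostname + "\r\n");
            writer.print("Connection: close\r\n");
            writer.print("\r\n");
            writer.flush();

            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(sock.getInputStream()));
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}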
You are reading the HTTP stream raw off a socket, instead of using an existing HTTP reader.
If you really want to do this, you should read the HTTP specification, in your case especially section 3.6, which defines chunked transfer coding.
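For what it's worth, those stray lines are the chunk sizes: each chunk is preceded by its length in hexadecimal (0x16d0 = 5840 bytes, 0x2000 = 8192 bytes), and if you stay on HTTP/1.1 and keep reading the raw stream, you have to strip them yourself. A rough sketch of manual decoding, assuming the status line and headers have already been consumed (error handling omitted):

import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkReader {
    // Decodes one chunked-encoded body. Assumes the status line and
    // headers (up to the blank line) have already been read.
    static byte[] readChunkedBody(DataInputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // The chunk size is hex; a ';' may introduce chunk extensions.
            int size = Integer.parseInt(sizeLine.split(";")[0].trim(), 16);
            if (size == 0) {
                break; // last chunk; optional trailers follow
            }
            byte[] chunk = new byte[size];
            in.readFully(chunk);  // read exactly 'size' bytes
            body.write(chunk);
            readLine(in);         // consume the CRLF that ends the chunk
        }
        return body.toByteArray();
    }

    // Minimal CRLF-terminated line reader working on raw bytes,
    // since a buffered Reader must not be mixed with raw byte reads.
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                sb.append((char) b);
            }
        }
        return sb.toString();
    }
}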
I'm trying to get live trading data from the Internet via HTTP, but it is updated continuously, so if I GET the data, it keeps downloading for as long as data is available; only when I stop the download can I access the data.
How to access the stream of data while the downloading is in progress?
I tried Indy's TIdHTTP so that I can use SSL, and I tried TIdIOHandlerStream, but the IOHandler slot was already used by TIdSSLIOHandlerSocketOpenSSL. So I'm absolutely clueless here.
The data comes in response to a "multipart/form-data" POST request.
Please guide me...
Lrequest.Values['__RequestVerificationToken'] := RequestVerificationToken;
Lrequest.Values['acct'] := 'demo';
Lrequest.Values['pwd'] := 'demo';
try
  Response.Text := Onhttp.Post('https://trading/data', Lrequest);
  Form1.Memo1.Lines.Add(TimeToStr(Time) + ': ' + Response.Text);
except
  on E: Exception do
    Form1.Memo1.Lines.Add(TimeToStr(Time) + ': ' + E.ClassName +
      ' error raised, with message : ' + E.Message);
end;
UPDATE:
The data is an endless JSON string, like this:
{"id":"data","val":[{"rc":2,"tpc":"\\RealTime\\Global\\SGDIDR.FX","item":[{"val":{"F009":"10454.90","F011":"-33.1"}}]}]}
{"id":"data","val":[{"rc":2,"tpc":"\\RealTime\\Global\\SGDIDR.FX","item":[{"val":{"F009":"10458.80","F011":"-29.2"}}]}]}
and so on, and so on...
You can't use TIdIOHandlerStream to interface with a TCP connection; that is not what it is designed for. It is meant for performing I/O operations using user-provided TStream objects, i.e., for debugging previously captured sessions.
TIdHTTP is not really designed to handle endless HTTP responses like the one you have described. What is the exact format the server is delivering its live data in? What do the HTTP response headers look like? It is really difficult to answer your question without knowing the exact format being used.
However, that being said, there are some cases to consider, depending on what the server is actually sending:
If the server is using a MIME-based server-push format, like multipart/x-mixed-replace, you can enable the hoNoReadMultipartMIME flag in the TIdHTTP.HTTPOptions property, and then read the MIME data yourself from the TIdHTTP.IOHandler after TIdHTTP.Get() exits. For instance, you can use TIdMessageDecoderMIME to help you parse the MIME parts; see New TIdHTTP hoNoReadMultipartMIME flag in Indy's blog, or Delphi Indy TIdHttp and multipart/x-mixed-replace with Text and jpeg image.
Otherwise, if the server is using Transfer-Encoding: chunked, where each data update is sent as a new HTTP chunk, you can use the TIdHTTP.OnChunkReceived event. Or, you can enable the hoNoReadChunked flag in the TIdHTTP.HTTPOptions property, and then read the chunks yourself from the TIdHTTP.IOHandler after TIdHTTP.Get() exits. See New TIdHTTP flags and OnChunkReceived event in Indy's blog.
Otherwise, you could give TIdHTTP.Get() a TIdEventStream to write into, and then use that stream's OnWrite event to access the raw bytes. Or, you could write your own TStream-derived class that overrides the virtual Write() method. Either way, you would be responsible for manually parsing and buffering the raw body data as they are being written to the stream.
Otherwise, you may have to resort to using TIdTCPClient instead and implement the HTTP protocol manually; then you would be solely responsible for reading in the HTTP response body however you want.
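Whichever of those approaches fits your server's format, the consuming side ends up the same: read the stream incrementally and hand each complete JSON line to your code as it arrives, rather than waiting for Post() to return. A language-neutral sketch of that loop (in Java for brevity; every name is made up for illustration):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class LiveFeedReader {
    // Reads an endless stream of newline-delimited JSON updates and
    // dispatches each one as soon as it is complete.
    static void consume(InputStream feed) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(feed));
        String line;
        while ((line = reader.readLine()) != null) { // blocks until the next update
            handleUpdate(line); // e.g. {"id":"data","val":[...]}
        }
    }

    static void handleUpdate(String json) {
        System.out.println("update: " + json); // placeholder: parse/display here
    }
}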
I'm trying to add gzip compression to an HTTP server I wrote in D. Here is the code that does the gzip encoding:
if ((Info.modGzip) & (indexOf(client.getRequestHeaderFieldValue("Accept-Encoding"),"gzip") != -1)){
    writeln("gzip");
    auto gzip = new Compress(HeaderFormat.gzip);
    client.addToResponseHeader("Content-Encoding: gzip");
    client.sendHeader("200 ok");
    while (0 < (filestream.readBlock(readbuffer))){
        client.client.send(gzip.compress(readbuffer));
    }
    client.sendData(gzip.flush(Z_FINISH));
    delete gzip;
} else {
    writeln("no gzip");
    client.sendHeader("200 ok");
    while (0 < (filestream.readBlock(readbuffer))){
        client.client.send(readbuffer);
    }
    delete filestream;
}
but when I test it, Firefox, Internet Explorer, and Chrome all say that the encoding or compression is bad. Why? The data is compressed with gzip.
Your code isn't sending the appropriate headers. The compression portion is fine, but the stuff surrounding it has a few bugs that need to be fixed.
Cross-posting what I said on the D newsgroup: http://forum.dlang.org/post/rngupyejcsbzkzqwgojp#forum.dlang.org
I know zlib gzip works for HTTP; I used it in my cgi.d:
if(gzipResponse && acceptsGzip && isAll) {
    auto c = new Compress(HeaderFormat.gzip); // want gzip
    auto data = c.compress(t);
    data ~= c.flush();
    t = data;
}
But your HTTP server is buggy in a lot of ways. It doesn't reply to curl and doesn't keep the connection open to issue manual requests.
Among the bugs I see looking at it quickly:
In server.d's getRequestHeaderFieldValue, you don't check if epos is -1. If it is, you should return null or something instead of trying to use it; as it stands, the connection hangs because an out-of-bounds array read kills the handler.
You also wrote:
if ((Info.modGzip) & (indexOf(client.getRequestHeaderFieldValue("Accept-Encoding"),"gzip") != -1)){
Notice the & instead of &&. That's in fspipedserver.d.
Finally, you write client.client.send..., which never sends the headers back to the client, so it doesn't know you are gzipping! Change that to client.sendData (and change sendData in server.d to take "in void[]" instead of "void[]"), and then it sends the headers and seems to work by my eyeballing.
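For reference, the ordering the browsers expect is: status line and headers (including Content-Encoding: gzip) written out uncompressed first, and only the body run through the compressor. A bare-bones sketch of that shape, shown in Java rather than D just so it is self-contained (the port and file name are made up):

import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

public class GzipResponseSketch {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(8080);
             Socket client = server.accept()) {
            // (request parsing skipped; a real server would read it here)
            byte[] body = Files.readAllBytes(Paths.get("file.txt")); // made-up file
            OutputStream out = client.getOutputStream();
            // 1. Status line and headers go out first, uncompressed, so the
            //    client knows the body that follows is gzipped.
            String headers = "HTTP/1.1 200 OK\r\n"
                    + "Content-Encoding: gzip\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            out.write(headers.getBytes(StandardCharsets.US_ASCII));
            // 2. Only the body goes through the compressor.
            GZIPOutputStream gz = new GZIPOutputStream(out);
            gz.write(body);
            gz.finish(); // writes the gzip trailer without closing the socket
            out.flush();
        }
    }
}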
I have an ASP.NET Web Forms page with a button. On postback the button sends some XML content back to the user as a file. It has been working; however, in one case, with a string 16759 characters long, the downloaded file was cut short by 10 bytes. Both Chrome and Firefox exhibited the same behaviour.
The solution was to change the content type from "text/xml" (I also tried "text/plain") to "application/octet-stream". However, I would like to understand why the other content types behave this way.
My code is as follows. (I've played around with a few different methods and they did not change anything)
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ContentType = "text/plain";
HttpContext.Current.Response.AddHeader("Content-Length", content.Length.ToString());
HttpContext.Current.Response.AddHeader("Content-Disposition", "attachment; filename=\"test.txt\"");
HttpContext.Current.Response.Write(content);
HttpContext.Current.Response.Flush();
HttpContext.Current.Response.Close();
All you have to do is not call Close on the stream. Don't ask me for an explanation; all I know is that it works.
Explanation per the MSDN link for HttpResponse.Close kindly provided by Ray Cheng:
This method terminates the connection to the client in an abrupt manner and is not intended for normal HTTP request processing. The method sends a reset packet to the client, which can cause response data that is buffered on the server, the client, or somewhere in between to be dropped.
I am working on an app where we have to pass specific web API parameters to a web app using HTTP POST.
e.g.:
apimethod name
parameter1 value
parameter2 value
So do I use a string or URLEncodedPostData to send that data?
It would be good if you could help me with a code example.
I am using something like this, but it doesn't post the data to the server.
The response code is OK/200 and I also get a parsed HTML response when I read the HTTP response input stream, but the code doesn't post anything, so I am unable to get the expected response.
_postData.append("method", "session.getToken");
_postData.append( "developerKey", "value");
_postData.append( "clientID", "value");
_httpConnection = (HttpConnection) Connector.open(URL, Connector.READ_WRITE);
String encodedData = _postData.toString();
_httpConnection.setRequestMethod(HttpConnection.POST);
_httpConnection.setRequestProperty("User-Agent", "BlackBerry/3.2.1");
_httpConnection.setRequestProperty("Content-Language", "en-US");
_httpConnection.setRequestProperty("Content-Type","application/x-www-form-urlencoded");
_httpConnection.setRequestProperty("Content-Length",(new Integer(encodedData.length())).toString());
os = _httpConnection.openOutputStream();
os.write(encodedData.getBytes());
The code you posted above looks correct - although you'll want to do a few more things (maybe you did this already but didn't include it in your code):
Close the output stream once you've written all the bytes to it
Call getResponseCode() on the connection so that it actually sends the request (see the sketch below)
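Putting those together, a minimal sketch of the whole round trip, reusing the names from your snippet and assuming the BlackBerry PostData API (where getBytes() returns the URL-encoded body):

import java.io.InputStream;
import java.io.OutputStream;
import javax.microedition.io.Connector;
import javax.microedition.io.HttpConnection;
import net.rim.blackberry.api.browser.URLEncodedPostData;

public class PostHelper {
    public static void post(String url, URLEncodedPostData postData) throws java.io.IOException {
        HttpConnection conn = (HttpConnection) Connector.open(url, Connector.READ_WRITE);
        try {
            conn.setRequestMethod(HttpConnection.POST);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            byte[] body = postData.getBytes(); // the URL-encoded form body
            conn.setRequestProperty("Content-Length", Integer.toString(body.length));

            OutputStream os = conn.openOutputStream();
            os.write(body);
            os.close(); // 1. close the output stream once everything is written

            int rc = conn.getResponseCode(); // 2. this actually sends the request
            if (rc == HttpConnection.HTTP_OK) {
                InputStream is = conn.openInputStream();
                // ... read the server's response here ...
                is.close();
            }
        } finally {
            conn.close();
        }
    }
}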
POSTed parameters are usually sent in the request BODY, which means URL-encoding them is inappropriate. Quote from the HTTP/1.1 protocol:
Note: The "multipart/form-data" type has been specifically defined
for carrying form data suitable for processing via the POST
request method, as described in RFC 1867 [15].
The POST method allows you to use pretty arbitrary message bodies, so it is whatever format the server wants.
What's the best way to stream files using ASP.NET?
There appear to be various methods for this, and I'm currently using the Response.TransmitFile() method inside an HTTP handler, which sends the file to the browser directly. This is used for various things, including sending FLVs from outside the webroot to an embedded Flash video player.
However, this doesn't seem like a reliable method. In particular, there's a strange problem with Internet Explorer (7), where the browser just hangs after a video or two are viewed. Clicking on links and so on has no effect, and the only way to get the site working again is to close the browser and re-open it.
This also occurs in other browsers, but much less frequently. Based on some basic testing, I suspect this is something to do with the way files are being streamed... perhaps the connection isn't being closed properly, or something along those lines.
After trying a few different things, I've found that the following method works for me:
Response.WriteFile(path);
Response.Flush();
Response.Close();
Response.End();
This gets around the problem mentioned above, and viewing videos no longer causes Internet Explorer to hang.
However, my understanding is that Response.WriteFile() loads the file into memory first, and given that some files being streamed could potentially be quite large, this doesn't seem like an ideal solution.
I'm interested in hearing how other developers are streaming large files in ASP.NET, and in particular, streaming FLV video files.
I would take things outside of the "aspx" pipeline. In particular, I would write a raw handler (ashx, or mapped via config) that does the minimum work and simply writes to the response in chunks. The handler would accept input from the query-string/form as normal, locate the object to stream, and stream the data (using a moderately sized local buffer in a loop). A simple (incomplete) example is shown below:
public void ProcessRequest(HttpContext context) {
    // read input etc.
    context.Response.Buffer = false;
    context.Response.ContentType = "text/plain";
    string path = @"c:\somefile.txt";
    FileInfo file = new FileInfo(path);
    int len = (int)file.Length, bytes;
    context.Response.AppendHeader("content-length", len.ToString());
    byte[] buffer = new byte[1024];
    Stream outStream = context.Response.OutputStream;
    using (Stream stream = File.OpenRead(path)) {
        while (len > 0 && (bytes = stream.Read(buffer, 0, buffer.Length)) > 0) {
            outStream.Write(buffer, 0, bytes);
            len -= bytes;
        }
    }
}
Take a look at the article Tracking and Resuming Large File Downloads in ASP.NET, which goes into more depth than just opening a stream and chucking out all the bits.
The HTTP protocol supports byte-range requests and resumable downloads, and many streaming clients (like video players or Adobe PDF readers) can and will use them to fetch the file in chunks, saving bandwidth and giving your users a better experience.
Not trivial, but it's time well spent.
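The header arithmetic itself is small enough to sketch, though. Roughly this (shown in Java purely for compactness, and every name here is invented; the same logic applies in an ASP.NET handler):

import java.util.Map;

public class RangeSketch {
    // Sketch: compute response headers for a single byte-range request.
    // 'rangeHeader' is the raw Range value, e.g. "bytes=1000-2047".
    // (Suffix ranges like "bytes=-500" and multi-range requests are
    // ignored here for brevity; a real handler must deal with both.)
    static void applyRange(String rangeHeader, long fileLength,
                           Map<String, String> headers) {
        long start = 0, end = fileLength - 1;
        if (rangeHeader != null && rangeHeader.startsWith("bytes=")) {
            String[] parts = rangeHeader.substring(6).split("-", 2);
            if (!parts[0].isEmpty()) start = Long.parseLong(parts[0]);
            if (!parts[1].isEmpty()) end = Long.parseLong(parts[1]);
        }
        // 206 tells the client it got a fragment; Content-Range says which
        // one, and Accept-Ranges advertises that resuming is supported.
        headers.put("Status", "206 Partial Content");
        headers.put("Accept-Ranges", "bytes");
        headers.put("Content-Range",
                "bytes " + start + "-" + end + "/" + fileLength);
        headers.put("Content-Length", Long.toString(end - start + 1));
        // ...then seek to 'start' in the file and copy (end - start + 1) bytes.
    }
}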
Try opening the file as a stream, then using Response.OutputStream.Write(). For example:
Edit: My bad, I forgot that Write takes a byte buffer. Fixed
byte[] buffer = new byte[1 << 16]; // 64 KB
int bytesRead = 0;
using (var file = File.OpenRead(path))
{
    while ((bytesRead = file.Read(buffer, 0, buffer.Length)) != 0)
    {
        Response.OutputStream.Write(buffer, 0, bytesRead);
    }
}
Response.Flush();
Response.Close();
Response.End();
Edit 2: Did you try this? It should work.
After trying lots of different combinations, including the code posted in the various answers, it seems that setting Response.Buffer = true before calling TransmitFile did the trick, and the web application is now a lot more responsive in Internet Explorer.
In this particular case, the SWF extension is also mapped to ASP.NET, and we're using a custom handler in our web application to read the files from disk and then send them to the browser using Response.TransmitFile(). We've got a Flash-based video player to play video files which are also SWFs, and I think having all of this activity go through the handler without buffering may have been what was causing strange things to happen in IE.