Manually close connection in request stream (HTTP.jl)

My problem is the following: I am making a GET request that returns a stream. After some time I get my desired data, but the webserver does not close the connection, so I want to close it on my side. My code is as follows:
using HTTP
HTTP.open(:GET, "https://someurl.com", query=Dict(
    "somekey" => "somevalue"
)) do io
    for i = 1:4
        println("DATA: ---")
        @show io
        println(String(readavailable(io)))
    end
    @info "Close read"
    closeread(io)
    @info "After close read"
    @show io
end
println("At the end!")
However, I never reach the last line. I have tried dozens of different approaches from the HTTP.jl docs, but none worked for me. I suspect that is because this webserver does not send a Connection: close header, but I have not been able to find an example that closes the connection manually/forcefully on the client side.
Interesting note: when I run this from the REPL, interrupt it by hitting Ctrl-C a couple of times, and then rerun the script, it hangs forever. I then have to wait anywhere from a few seconds to minutes before I can run it "successfully" again. I suspect this has to do with the stale connection not being closed properly.
As is evident, I am not very proficient in either network programming or Julia, so any help would be highly appreciated!
EDIT: I suspect I was not quite clear enough about the behaviour of the webserver and what I want to do, so I will try to break it down as simply as possible: I want to receive responses from the webserver until I detect a certain keyword. After that I want to close the connection - the webserver would keep sending me data, but I already have everything I am interested in, so I don't want to wait another few minutes for the webserver to close the connection for me!

Your code assumes that you will get all the data in exactly 4 calls to readavailable, which might not be true depending on the buffer state.
Rather than that, your loop should be:
while !eof(io)
    println("DATA: ---")
    println(String(readavailable(io)))
end
In your case the connection gets stuck because you try to read four chunks of data, but perhaps you are getting everything in the first chunk, and then the connection blocks.
On top of that, if you are using the do syntax you should not close the resource yourself - it will be done automatically at the end of the block.
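To stop early once you have seen the keyword you care about, you can simply break out of that loop and let the do-block return. A minimal sketch, assuming a hypothetical keyword "DONE" marks the end of the useful data (whether the teardown returns promptly still depends on how HTTP.jl drains the streamed response):

using HTTP

HTTP.open(:GET, "https://someurl.com", query=Dict("somekey" => "somevalue")) do io
    while !eof(io)
        chunk = String(readavailable(io))
        println("DATA: --- ", chunk)
        # Stop reading once the keyword shows up; leaving the do-block lets
        # HTTP.open clean up the request itself.
        occursin("DONE", chunk) && break
    end
end
println("At the end!")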

Related

How to efficiently decode gobs and wait for more to arrive via tcp connection

I'd like to have a TCP connection for a gaming application. It's important to be time efficient. I want to receive many objects efficiently. It's also important to be CPU efficient because of the load.
So far, I can make sure handleConnection is called every time a connection is dialed, using Go's net library. However, once the connection is created, I have to poll (check over and over again to see if new data is ready on the connection). This seems inefficient. I don't want to keep running that check for new data if it needlessly sucks up CPU.
I was looking for something such as the following two options but didn't find what I was looking for.
(1) Do a read operation that somehow blocks (without sucking CPU) and then unblocks when new stuff is ready on the connection stream. I could not find that.
(2) Do an async approach where a function is called when new data arrives on the connection stream (not just when a new connection is dialed). I could not find that.
I don't want to put any sleep calls in here because that will increase the latency of responding to single messages.
I also considered dialing out for every single message, but I'm not sure if that's efficient or not.
So I came up with code below, but it's still doing a whole lot of checking for new data with the Decode(p) call, which does not seem optimal.
How can I do this more efficiently?
func handleConnection(conn net.Conn) {
    dec := gob.NewDecoder(conn)
    p := &P{}
    for {
        result := dec.Decode(p)
        if result != nil {
            // do nothing
        } else {
            fmt.Printf("Received : %+v", p)
            fmt.Println("result", result, "\n")
        }
    }
    conn.Close()
}
You say:
So I came up with code below, but it's still doing a whole lot of checking for new data with the Decode(p) call.
Why do you think that? The gob decoder will issue a Read to the conn and wait for it to return data before figuring out what it is and decoding it. This is a blocking operation, and will be handled asynchronously by the runtime behind the scenes. The goroutine will sleep until the appropriate io signal comes in. You should not have to do anything fancy to make that more performant.
You can trace this yourself in the code for decoder.Decode.
I think your code will work just fine; the CPU will be idle until more data arrives.
Go is not Node. Every API is "blocking" for the most part, but that is not as much of a problem as it is on other platforms. The runtime manages goroutines very efficiently and delivers the appropriate signals to sleeping goroutines as needed.
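For completeness, here is a minimal sketch of that handler with explicit error handling; the P struct fields and the port are assumptions, since they are not shown in the question. Decode blocks until a full value arrives, so the goroutine sleeps instead of spinning:

package main

import (
    "encoding/gob"
    "fmt"
    "io"
    "log"
    "net"
)

// P stands in for the struct from the question; its fields are assumptions.
type P struct {
    M, N int64
}

func handleConnection(conn net.Conn) {
    defer conn.Close()
    dec := gob.NewDecoder(conn)
    for {
        p := &P{}
        // Decode blocks here until a full value arrives or the connection
        // fails; there is no polling and no busy CPU.
        if err := dec.Decode(p); err != nil {
            if err != io.EOF {
                log.Println("decode error:", err)
            }
            return
        }
        fmt.Printf("Received: %+v\n", p)
    }
}

func main() {
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        log.Fatal(err)
    }
    for {
        conn, err := ln.Accept()
        if err != nil {
            log.Fatal(err)
        }
        go handleConnection(conn)
    }
}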

Will the write() system call block further operation till read() is invoked, or vice versa?

Written as part of a TCP/IP client-server:
Server:
write(nfds,data1,sizeof(data1));
usleep(1000);
write(nfds,data2,sizeof(data2));
Client:
read(fds,s,sizeof(s));
printf("%s",s);
read(fds,s,sizeof(s));
printf("%s",s);
Without usleep(1000) between the two calls to write(), the client prints data1 twice. Why is this?
Background:
I am doing a Client-Server program where the server has to send two consecutive pieces of information after their acquisition, via the network (socket); nfds is the file descriptor we get from accept().
In the client side, we receive these information via read; here fds is the file descriptor obtained via socket().
My issue is that when I am NOT using the usleep(1000) between the write() functions, the client just prints the info represented by data1 twice, instead of printing data1 and then data2. When I put in the usleep() it's fine. Exactly WHY is this happening? Is write() blocking the operation till the buffer is read or is read() blocking the operation till info is written into the buffer? Or am I completely off the page?
You are making several false assumptions. Nothing in TCP guarantees that one send equals one receive. There is a lot of buffering at both ends, and there are deliberate delays in sending so as to coalesce packets (the Nagle algorithm). When you call read(), or recv() and friends, you need to store the result in a variable and examine it for each of the following cases (a sketch of such a read loop follows below):
-1: an error: examine/log/print errno, or strerror(), or call perror(), and in most cases close the socket and exit the reading loop.
0: end of stream; the owner has closed the connection; close the socket and exit the reading loop.
a positive value but less than you expected: keep reading and accumulate the data until you have everything you need.
a positive value that is more than you expected: process the data you expected, and save the rest for next time.
exactly what you expected: process the data, discard it all, and repeat. This is the easy case, and it is rare, but it is the only case you are currently programming for.
Don't add sleeps into networking code. It doesn't solve problems, it only delays them.
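As a concrete illustration, here is a sketch of a client-side read loop that handles those cases, assuming the number of bytes to expect is known in advance (for example because the sender transmits it first); the helper name and that assumption are mine, not part of the question:

#include <stdio.h>
#include <unistd.h>

/* Read exactly `expected` bytes from `fd` into `buf`.
 * Returns 0 on success, -1 on error or premature end of stream. */
static int read_exact(int fd, char *buf, size_t expected)
{
    size_t total = 0;
    while (total < expected) {
        ssize_t n = read(fd, buf + total, expected - total);
        if (n < 0) {                /* error: report it and give up */
            perror("read");
            return -1;
        }
        if (n == 0) {               /* peer closed the connection early */
            fprintf(stderr, "unexpected end of stream\n");
            return -1;
        }
        total += (size_t)n;         /* short read: keep accumulating */
    }
    return 0;
}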

How to properly determine that the end-of-input has been reached in a QTCPSocket?

I have the following code that reads from a QTCPSocket:
QString request;
while (pSocket->waitForReadyRead())
{
    request.append(pSocket->readAll());
}
The problem with this code is that it reads all of the input and then pauses at the end for 30 seconds (the default timeout).
What is the proper way to avoid the long timeout and detect that the end of the input has been reached? (An answer that avoids signals is preferred because this is supposed to be happening synchronously in a thread.)
The only way to be sure is when you have received the exact number of bytes you are expecting. This is commonly done by sending the size of the data at the beginning of the data packet. Read that first and then keep looping until you have it all. An alternative is to use a sentinel, a specific series of bytes that marks the end of the data, but this usually gets messy.
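A sketch of the size-prefix approach, assuming the peer sends a 32-bit big-endian length followed by exactly that many payload bytes (the framing and the helper name are assumptions):

#include <QtCore/QByteArray>
#include <QtCore/QtEndian>
#include <QtNetwork/QTcpSocket>

QByteArray readMessage(QTcpSocket *socket, int timeoutMs = 5000)
{
    // Wait until the 4-byte size header has been buffered.
    while (socket->bytesAvailable() < 4) {
        if (!socket->waitForReadyRead(timeoutMs))
            return QByteArray();                 // timeout or error
    }
    quint32 size = 0;
    socket->read(reinterpret_cast<char *>(&size), sizeof(size));
    size = qFromBigEndian(size);

    // Keep looping until the full payload has arrived.
    QByteArray payload;
    while (payload.size() < static_cast<int>(size)) {
        if (socket->bytesAvailable() == 0 && !socket->waitForReadyRead(timeoutMs))
            return QByteArray();                 // timeout or error
        payload.append(socket->read(size - payload.size()));
    }
    return payload;
}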
If you're dealing with a situation like an HTTP response that doesn't contain a Content-Length, and you know the other end will close the connection once the data is sent, there is an alternative solution.
Use socket.setReadBufferSize to make sure there's enough read buffer for all the data that may be sent.
Call socket.waitForDisconnected to wait for the remote end to close the connection
Use socket.bytesAvailable as the content length
This works because a close of the connection doesn't discard any buffered data in a QTcpSocket.
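Under those assumptions (the server closes the connection once it has sent everything), the same idea in code might look like this sketch:

#include <QtCore/QByteArray>
#include <QtNetwork/QTcpSocket>

QByteArray readUntilClosed(QTcpSocket *socket, int timeoutMs = 30000)
{
    socket->setReadBufferSize(0);          // 0 means the read buffer is unlimited
    if (!socket->waitForDisconnected(timeoutMs))
        return QByteArray();               // peer never closed: give up
    // Buffered data survives the disconnect, so it is all still readable here.
    return socket->readAll();
}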

tcp connect timeout (unix/windows portable)

I'm using Perl (which hopefully shouldn't affect anything), but I need to know how I can set a timeout for the connect operation. The problem is I can't wait forever for the connect operation to happen. If it doesn't happen within a few seconds, I'd rather give up and move on.
socket(my $sock, PF_INET, SOCK_STREAM, (getprotobyname('tcp'))[2]);
setsockopt($sock, SOL_SOCKET, SO_SNDTIMEO, 10); # send timeout
print "connecting...\n";
connect($sock, sockaddr_in(80,scalar gethostbyname('lossy.host.com')));
print "connected...\n";
The problem is, if the connection to "lossy.host.com" is "lossy" or slow or anything but fast, I'd rather give up than make the user wait. (Think of it as a side-effect to a program that does something else... the user probably doesn't expect this script to communicate with a server somewhere...).
Threading Case: How would you interrupt the connect()? Would you just detach the thread and forget about it?
You can use fcntl to set the socket to be non-blocking, then select with a timeout waiting for it to become readable. If it doesn't become readable before the timeout, you could close it at that point.
I know how to do this in C, but not Perl, otherwise I'd give you an example. The perlfunc manpage says that all of these functions exist, and a cursory read suggests they'll work the way you want.
Edit: sorry, I missed the part where perlfunc says they may not be available on non-Unix systems, and indeed, fcntl isn't available on Win32. There is an IO::Socket library you can use that will do the right thing on Windows, though.
Here's sample code that works for me (on linux anyway):
#!/usr/bin/perl
use IO::Socket::INET;
use IO::Select;
$sock = IO::Socket::INET->new('PeerAddr' => 'lossy.host.com',
                              'PeerPort' => 80,
                              'Blocking' => 0 );
$sel = IO::Select->new( $sock );
@writes = $sel->can_write(10);
if ( $sock->connected ) {
    print "socket is connected\n";
} else {
    print "socket not connected after however long\n";
    $sock->close;
}
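If a plain blocking connect with a deadline is enough, IO::Socket::INET's Timeout parameter is intended to cover the connect phase as well; a sketch under that assumption:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# The constructor gives up on the connect after Timeout seconds instead of
# blocking indefinitely; on failure it returns undef and sets $@.
my $sock = IO::Socket::INET->new(
    PeerAddr => 'lossy.host.com',
    PeerPort => 80,
    Proto    => 'tcp',
    Timeout  => 5,
);

if ($sock) {
    print "connected\n";
    $sock->close;
} else {
    print "could not connect within 5 seconds: $@\n";
}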
You could spawn a separate thread to do it, and then do a timed wait for a result. If you don't receive a result in an appropriate amount of time, give up waiting and just let the thread continue. It will eventually time out, or you might be able to kill the thread.
To answer the initial question, I don't think there's a way to change the connect() timeout, at least not through a sockets API. On Windows, I wouldn't be surprised if there's a registry key you could change that would affect it, but I don't know what it would be.
If you end up doing the threaded case wherein you detach the connecting thread without killing it, beware the following: Windows only lets you have a maximum of 10 pending outgoing TCP connections (the 11th will block until one of the pending ones times out).
This was the cause of much frustration for me. I think MS put this in to prevent botnets from spreading or something. I don't think there's any way to switch it off either.

Receiving image through winsocket

I have a proxy server running on my local machine, used to cache images while surfing. I set up my browser with a proxy to 127.0.0.1, receive the HTTP requests, take the data and send it back to the browser. It works fine for everything except large images. When I receive the image info, it only displays half the image (e.g. the top half of the Google logo). Here's my code:
char buffer[1024] = "";
string ret("");
while (true)
{
    valeurRetour = recv(socketClient_, buffer, sizeof(buffer), 0);
    if (valeurRetour <= 0) break;
    string t;
    t.assign(buffer, valeurRetour);
    ret += t;
    longueur += valeurRetour;
}
closesocket(socketClient_);
valeurRetour = send(socketServeur_, ret.c_str(), longueur, 0);
the socketClient_ is non-blocking. Any idea how to fix this problem?
You're not making fine enough distinctions among the possible return values of recv.
There are two levels here.
The first is, you're lumping 0 and -1 together. 0 means the remote peer closed its sending half of the connection, so your code does the right thing here, closing its socket down, too. -1 means something happened besides data being received. It could be a permanent error, a temporary error, or just a notification from the stack that something happened besides data being received. Your code lumps all such possibilities together, and on top of that treats them the same as when the remote peer closes the connection.
The second level is that not all reasons for getting -1 from recv are "errors" in the sense that the socket is no longer useful. I think if you start checking for -1 and then calling WSAGetLastError to find out why you got -1, you'll get WSAEWOULDBLOCK, which is normal since you have a non-blocking socket. It means the recv call cannot return data because it would have to block your program's execution thread to do so, and you told Winsock you wanted non-blocking calls.
A naive fix is to not break out of the loop on WSAEWOULDBLOCK but that just means you burn CPU time calling recv again and again until it returns data. That goes against the whole point of non-blocking sockets, which is that they let your program do other things while the network is busy. You're supposed to use functions like select, WSAAsyncSelect or WSAEventSelect to be notified when a call to the API function is likely to succeed again. Until then, you don't call it.
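A sketch of that structure (the function name and buffer size are mine): it distinguishes data, an orderly close, WSAEWOULDBLOCK, and real errors, and uses select() to wait for readability instead of spinning:

#include <winsock2.h>
#include <string>

// Assumes `s` is a connected, non-blocking SOCKET; accumulates everything the
// peer sends until it closes its side, sleeping in select() when no data is
// ready instead of spinning on recv().
bool receiveAll(SOCKET s, std::string &out)
{
    char buffer[1024];
    for (;;) {
        int n = recv(s, buffer, sizeof(buffer), 0);
        if (n > 0) {                        // got data: accumulate it
            out.append(buffer, n);
        } else if (n == 0) {                // peer closed its half: we are done
            return true;
        } else {                            // n == SOCKET_ERROR
            int err = WSAGetLastError();
            if (err != WSAEWOULDBLOCK)      // a real error: give up
                return false;
            // No data right now: wait until the socket is readable again.
            fd_set readSet;
            FD_ZERO(&readSet);
            FD_SET(s, &readSet);
            if (select(0, &readSet, NULL, NULL, NULL) == SOCKET_ERROR)
                return false;
        }
    }
}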
You might want to visit The Winsock Programmer's FAQ. (Disclaimer: I'm its maintainer.)
Have you analyzed the transaction at the HTTP level, i.e. checked the headers?
Are you accounting for things like chunked transfers?
I don't have a definite answer, in part because of the lack of details given here.
