I have next to no knowledge of or experience with sockets of any kind, so my question may just come down to my ignorance.
Guile doesn't have any built-in concept of the protocol, but I figure that, since a websocket connection starts out as an HTTP request, I should at least be able to make the initial request using Guile's web modules.
I'm using wss://node2.wsninja.io from https://wsninja.io/#demo to practice on.
I wrote up
(let ([s (socket PF_INET SOCK_STREAM 0)]
      [d (vector-ref (addrinfo:addr (car (getaddrinfo "echo.websocket.org" "https"))) 1)])
  (connect s AF_INET d 443))
to attempt connecting. getaddrinfo asks for a service name (and Guile doesn't recognize "wss" or "ws"), but all I really need from it is the IP address as an integer, and that's all the lookup is really doing here – extracting the integer representation. The result matches what I got when verifying against a few other services on the web, so it seems right.
I used PF_INET and AF_INET since, if I'm not mistaken, IPv4 makes sense for a websocket call. I went with SOCK_STREAM since a websocket'll be streaming data back and forth, once opened.
I based the above example off of the client example given by Guile here: https://www.gnu.org/software/guile/manual/html_node/Internet-Socket-Examples.html#Internet-Socket-Examples.
I suspect it's the next portion that I've got wrong somehow but I'm not sure how.
I assumed, after connecting, the next step was to initiate the handshake. Given the Guile example, it seemed this could be accomplished by displaying to the now (in theory) opened port so I did this:
(let ([s (socket PF_INET SOCK_STREAM 0)]
      [d (vector-ref (addrinfo:addr (car (getaddrinfo "echo.websocket.org" "https"))) 1)])
  (connect s AF_INET d 443)
  (display "GET / HTTP/1.1\r\n" s)
  (display "Host: localhost\r\n" s)
  (display "User-Agent: Mozilla/5.0\r\n" s)
  (display "Upgrade: websocket\r\n" s)
  (display "\r\n" s)
  (do ([line (read-line s) (read-line s)])
      ((eof-object? line))
    (display line)
    (newline)))
I also tried sending the handshake as a single string. The do loop is lifted pretty much directly from the Guile example.
While I've seen mention that the handshake should be sent as headers for websockets, I figured it wouldn't be terribly different from plain HTTP, and the Guile example sends its request via display, but maybe that's a source of my error.
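For reference, here's my rough understanding of what a complete wss:// opening handshake involves, sketched in Go rather than Guile just to have something concrete; the TLS layer and the extra Sec-WebSocket-* headers are the parts my attempt above doesn't have, assuming I've read the protocol description correctly:

// Rough sketch of a wss:// opening handshake, for comparison only.
// wss:// means TLS first, then an HTTP/1.1 Upgrade request inside it.
package main

import (
    "bufio"
    "crypto/rand"
    "crypto/tls"
    "encoding/base64"
    "fmt"
)

func main() {
    conn, err := tls.Dial("tcp", "echo.websocket.org:443", nil)
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    // Sec-WebSocket-Key is 16 random bytes, base64-encoded.
    key := make([]byte, 16)
    rand.Read(key)

    fmt.Fprintf(conn, "GET / HTTP/1.1\r\n"+
        "Host: echo.websocket.org\r\n"+
        "Upgrade: websocket\r\n"+
        "Connection: Upgrade\r\n"+
        "Sec-WebSocket-Key: %s\r\n"+
        "Sec-WebSocket-Version: 13\r\n"+
        "\r\n",
        base64.StdEncoding.EncodeToString(key))

    // The server should answer with "HTTP/1.1 101 Switching Protocols"
    // followed by its own headers, ending at a blank line.
    r := bufio.NewReader(conn)
    for {
        line, err := r.ReadString('\n')
        if err != nil || line == "\r\n" {
            break
        }
        fmt.Print(line)
    }
}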
Needless to say, my Guile attempt doesn't work. Nothing gets displayed and the loop exits immediately. But it does seem like I'm making a connection, because removing the displays to the socket results in it just hanging (I assume in an unending loop, since no eof-object ever arrives while the socket waits on the handshake).
But if I do send something, no acknowledgement or even error comes back, which I would have expected. So I'm confounded, a little out of my depth, and thought I'd ask.
I'm using Netty to do forward and reverse lookups and it's working well. One thing I've run into though when doing a PTR lookup of an hp.com IP address (15.73.104.147) is that a large result set is truncated.
When using dig -x 15.73.104.147, I can get a result but I noticed that it too has a truncated response and so successfully switches to TCP mode -
;; Truncated, retrying in TCP mode.
Is there any way of doing something similar with Netty, given that the DnsNameResolverBuilder needs a DatagramChannel instance?
thanks,
Matt
Netty itself does not support DNS over TCP at the moment. What you could try is increasing the expected datagram packet size via DnsNameResolverBuilder.maxPayloadSize(...).
https://github.com/netty/netty/blob/4.1/resolver-dns/src/main/java/io/netty/resolver/dns/DnsNameResolverBuilder.java#L249
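For comparison only, and not a Netty answer: with Go's standard resolver you can force lookups over TCP by overriding the resolver's Dial function, which sidesteps the UDP truncation the same way dig's TCP retry does. A rough sketch using the address from the question:

// Rough sketch, Go stdlib (not Netty): force DNS lookups over TCP so large
// PTR answers are not truncated. The address is the one from the question.
package main

import (
    "context"
    "fmt"
    "net"
)

func main() {
    r := &net.Resolver{
        PreferGo: true, // use Go's resolver so the Dial override is honored
        Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
            var d net.Dialer
            return d.DialContext(ctx, "tcp", address) // tcp instead of udp
        },
    }
    names, err := r.LookupAddr(context.Background(), "15.73.104.147")
    if err != nil {
        fmt.Println(err)
        return
    }
    for _, n := range names {
        fmt.Println(n)
    }
}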
So I have some code like
(let ((file (open cur-fifo :if-does-not-exist :create)))
  (format t "~A~%" (read-line file nil))
  (close file))
Which, as far as I can tell, works fine, except that it will block indefinitely if no data was written to cur-fifo. I want the read to time out and return NIL if no data could be read in ~0.1 sec or so.
Running on SBCL 1.1.18, on 64-bit Gentoo Linux
FIFO model
When you open a fifo special device (for reading), the system call blocks unless either
the fifo is already opened by (another) process for writing, OR
you pass O_NONBLOCK to open(2) - which you might not be able to do in your implementation unless you use a low-level package like sb-posix.
Once you have successfully opened the fifo, your read(2) call will block until your counterparty (whoever opened the fifo for writing, which could be the same Lisp process) writes something there.
Common Lisp
What you are looking for is listen (see also with-open-file):
(with-open-file (fifo "my-fifo" :if-does-not-exist :create)
  (when (or (listen fifo)
            (progn (sleep 0.1)
                   (listen fifo)))
    (format t "Read [~A]~%" (read-line fifo))))
Debugging
Please note that special device handling is not necessarily equally well supported by all CL vendors.
If the above does not work, please do some experiments with the REPL: open the fifo, see what listen returns, write something there, see what listen reports now, &c.
If listen still returns nil even though you already wrote something into the pipe, this might mean that your CL does not recognize the file as a special device. You might have to pass some implementation-specific arguments to open, e.g., :buffering nil or something (try (describe 'open)).
On the client side of a TCP connection, I am attempting to reuse established connections as much as possible to avoid the overhead of dialing every time I need a connection. Fundamentally, it's connection pooling, although technically, my pool size just happens to be one.
I'm running into a problem in that if a connection sits idle for long enough, the other end disconnects. I've tried using something like the following to keep connections alive:
err = conn.(*net.TCPConn).SetKeepAlive(true)
if err != nil {
    fmt.Println(err)
    return
}

err = conn.(*net.TCPConn).SetKeepAlivePeriod(30 * time.Second)
if err != nil {
    fmt.Println(err)
    return
}
But this isn't helping. In fact, it's causing my connections to close sooner. I'm pretty sure this is because (on a Mac) the connection's health starts being probed after 30 seconds of idleness and is then probed 8 times at 30-second intervals. The server side must not be supporting keepalive, so after 4 minutes and 30 seconds, the client is disconnecting.
There might be nothing I can do to keep an idle connection alive indefinitely, and that would be absolutely OK if there were some way for me to at least detect that a connection has been closed, so that I can seamlessly replace it with a new one. Alas, even after reading all the docs and scouring the blogosphere for help, I can't find any way at all in Go to query the state of a TCP connection.
There must be a way. Does anyone have any insight into how that can be accomplished? Many thanks in advance to anyone who does!
EDIT:
Ideally, I'd like to learn how to handle this at a low level with pure Go, without using third-party libraries. Of course, if there is some library that does this, I don't mind being pointed in its direction so I can see how they do it.
The socket API doesn't give you access to the state of the connection. You can query the current state in various ways from the kernel (/proc/net/tcp[6] on Linux, for example), but that doesn't make any guarantee that further sends will succeed.
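For illustration, here's roughly what such a kernel query looks like (a sketch, Linux only; and as said above, it only tells you what the kernel last knew, not whether the next send will succeed):

// Rough sketch: dump the kernel's view of TCP connection states from
// /proc/net/tcp (Linux only). State 01 is ESTABLISHED, 06 is TIME_WAIT, etc.
package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    f, err := os.Open("/proc/net/tcp")
    if err != nil {
        fmt.Println(err)
        return
    }
    defer f.Close()

    sc := bufio.NewScanner(f)
    sc.Scan() // skip the header line
    for sc.Scan() {
        fields := strings.Fields(sc.Text())
        if len(fields) < 4 {
            continue
        }
        // fields[1] = local addr:port (hex), fields[2] = remote addr:port (hex),
        // fields[3] = connection state (hex)
        fmt.Printf("local=%s remote=%s state=%s\n", fields[1], fields[2], fields[3])
    }
}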
I'm a little confused on one point here. My client is ONLY sending data. Apart from ACKing the packets, the server sends nothing back. Reading doesn't seem an appropriate way to determine connection status, as there's nothing TO read.
The socket API is defined such that you detect a closed connection by a read returning 0 bytes. That's the way it works. In Go, this is translated to a Read returning io.EOF. This will usually be the fastest way to detect a broken connection.
So am I supposed to just send and act on whatever errors occur? If so, that's a problem, because I'm observing that I typically do not get any errors at all when attempting to send over a broken pipe, which seems totally wrong.
If you look closely at how TCP works, this is the expected behavior. If the connection is closed on the remote side, then your first send will trigger an RST from the server, fully closing the local connection. You either need to read from the connection to detect the close, or if you try to send again you will get an error (assuming you've waited long enough for the packets to make a round trip), like "broken pipe" on linux.
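As a rough illustration of that sequence (a sketch; the address and the sleep are made up, and it assumes the server closes its side between the two writes):

// Rough sketch: after the remote side has closed, the first Write usually
// "succeeds" (the bytes just provoke an RST); a later Write then fails with
// something like "broken pipe".
package main

import (
    "fmt"
    "net"
    "time"
)

func main() {
    conn, err := net.Dial("tcp", "127.0.0.1:9000") // hypothetical server
    if err != nil {
        fmt.Println(err)
        return
    }
    defer conn.Close()

    // ... assume the server closes the connection here ...

    if _, err := conn.Write([]byte("first\n")); err != nil {
        fmt.Println("first write:", err) // usually no error yet
    }
    time.Sleep(500 * time.Millisecond) // give the RST time to make the round trip

    if _, err := conn.Write([]byte("second\n")); err != nil {
        fmt.Println("second write:", err) // e.g. "write: broken pipe"
    }
}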
To clarify... I can dial, unplug an ethernet cable, and STILL send without error. The messages don't get through, obviously, but I receive no error
If the connection is actually broken, or the server is totally unresponsive, then you're sending packets off to nowhere. The TCP stack can't tell the difference between packets that are really slow, packet loss, congestion, or a broken connection. The system needs to wait for the retransmission timeout, and retry the packet a number of times before failing. The standard configuration for retries alone can take between 13 and 30 minutes to trigger an error.
What you can do in your code is:
Turn on keepalive. This will notify you of a broken connection more quickly, because the idle connection is always being tested.
Read from the socket. Either have a concurrent Read in progress, or check for something to read first with select/poll/epoll (Go usually uses the first)
Set timeouts (deadlines in Go) for everything; a small sketch follows this list.
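A minimal sketch of the third point (the 10-second values are arbitrary):

// Rough sketch: put a deadline on every network operation so a dead peer
// surfaces as a timeout error instead of blocking forever.
// Assumes: import ("net"; "time")
func writeWithDeadline(conn net.Conn, msg []byte) error {
    if err := conn.SetWriteDeadline(time.Now().Add(10 * time.Second)); err != nil {
        return err
    }
    _, err := conn.Write(msg)
    return err
}

func readWithDeadline(conn net.Conn, buf []byte) (int, error) {
    if err := conn.SetReadDeadline(time.Now().Add(10 * time.Second)); err != nil {
        return 0, err
    }
    return conn.Read(buf)
}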
If you're not expecting any data from the connection, checking for a closed connection is very easy in Go; dispatch a goroutine to read from the connection until there's an error.
notify := make(chan error)
go func() {
    buf := make([]byte, 1024)
    for {
        n, err := conn.Read(buf)
        if err != nil {
            notify <- err
            return
        }
        if n > 0 {
            fmt.Printf("unexpected data: %s\n", buf[:n])
        }
    }
}()
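And a sketch of how the sending side might consume that notify channel (sendQueue and redial are hypothetical names, not part of the code above):

// Rough sketch: select on the notify channel alongside the send queue, and
// replace the connection when either path reports an error.
// Assumes: import ("log"; "net")
func sendLoop(conn net.Conn, notify <-chan error, sendQueue <-chan []byte, redial func() net.Conn) {
    for {
        select {
        case err := <-notify:
            log.Println("connection lost:", err)
            conn = redial() // hypothetical helper that dials a fresh connection
            // (a real version would also restart the read goroutine for the new conn)
        case msg := <-sendQueue:
            if _, err := conn.Write(msg); err != nil {
                log.Println("write failed:", err)
                conn = redial()
            }
        }
    }
}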
There is no such thing as 'TCP connection state', by design. There is only what happens when you send something. There is no TCP API, at any level down to the silicon, that will tell you the current state of a TCP connection. You have to try to use it.
If you're sending keepalive probes, the server doesn't have any choice but to respond appropriately. The server doesn't even know that they are keepalives. They aren't. They are just duplicate ACKs. Supporting keepalive just means supporting sending keepalives.
I have the following code:
(ns alephtest.core
  (:use lamina.core aleph.tcp aleph.formats))

(defn connection-established [socket]
  (println "Socket connection established")
  (on-closed socket #(println "closed"))
  (doseq [line (line-seq (java.io.BufferedReader. *in*))]
    (enqueue socket line)))

(defn -main []
  (on-realized (tcp-client {:host "localhost" :port 9000})
               connection-established
               #(println "error:" %)))
All it does right now is connects to a local server and then passes data from stdin through to the server. This works fine, except for the (on-closed socket #(println "closed")) portion. If I kill the server while the client is still going I should get a message saying "closed" to the console, but I never do.
My end-goal here is to create some kind of recovery code, so that if the server goes down the client will queue up messages while it waits for the server to come back, and when the server does come back it'll reconnect and send everything that was queued.
It looks like you're blocking the thread that would notify you of the socket closing with your (doseq ...) loop. Wrap that in a (future ...), and see if that doesn't fix it.
First, I'd like to apologize for how much information I'm giving; it's there to make the problem as clear as possible. Please let me know if anything still needs clarifying.
(Running Erlang R13B04, kernel 2.6.18-194, CentOS 5.5)
I have a very strange problem. I have the following code to listen and process sockets:
%Opts used to make listen socket
-define(TCP_OPTS, [binary, {packet, raw}, {nodelay, true}, {reuseaddr, true}, {active, false}, {keepalive, true}]).

%Acceptor loop which spawns off sock processors when connections come in
accept_loop(Listen) ->
    case gen_tcp:accept(Listen) of
        {ok, Socket} ->
            Pid = spawn(fun() -> ?MODULE:process_sock(Socket) end),
            gen_tcp:controlling_process(Socket, Pid);
        {error, _} -> do_nothing
    end,
    ?MODULE:accept_loop(Listen).

%Probably not relevant
process_sock(Sock) ->
    case inet:peername(Sock) of
        {ok, {Ip, _Port}} ->
            case Ip of
                {172, 16, _, _} -> Auth = true;
                _ -> Auth = lists:member(Ip, ?PUB_IPS)
            end,
            ?MODULE:process_sock_loop(Sock, Auth);
        _ -> gen_tcp:close(Sock)
    end.

process_sock_loop(Sock, Auth) ->
    try inet:setopts(Sock, [{active, once}]) of
        ok ->
            receive
                {tcp_closed, _} ->
                    ?MODULE:prepare_for_death(Sock, []);
                {tcp_error, _, etimedout} ->
                    ?MODULE:prepare_for_death(Sock, []);
                %Not getting here
                {tcp, Sock, Data} ->
                    ?MODULE:do_stuff(Sock, Data);
                _ ->
                    ?MODULE:process_sock_loop(Sock, Auth)
            after 60000 ->
                ?MODULE:process_sock_loop(Sock, Auth)
            end;
        {error, _} ->
            ?MODULE:prepare_for_death(Sock, [])
    catch _:_ ->
        ?MODULE:prepare_for_death(Sock, [])
    end.
This whole setup works wonderfully normally, and has been working for the past few months. The server operates as a message-passing server with long-held TCP connections, and it holds on average about 100k connections. However, now we're trying to use the server more heavily. We're making two long-held connections (in the future probably more) to the Erlang server and issuing a few hundred commands every second over each of those connections. Each of those commands, in the common case, spawns off a new thread which will probably make some kind of read from mnesia and send some messages based on that.
The strangeness comes when we try to test those two command connections. When we turn on the stream of commands, any new connection has about a 50% chance of hanging. For instance, using netcat, if I connect and send along the string "blahblahblah", the server should immediately return an error. In doing this it won't make any calls outside the thread (since all it's doing is trying to parse the command, which will fail because blahblahblah isn't a command). But about 50% of the time (when the two command connections are running), typing in blahblahblah results in the server just sitting there for 60 seconds before returning that error.
In trying to debug this I pulled up Wireshark. The TCP handshake always happens immediately, and when the first packet from the client (netcat) is sent it is ACKed immediately, telling me that the kernel's TCP stack isn't the bottleneck. My only guess is that the problem lies in the process_sock_loop function. It has a receive which will go back to the top of the function after 60 seconds and try again to get more from the socket. My best guess is that the following is happening:
Connection is made, thread moves on to process_sock_loop
{active,once} is set
Thread receives, but doesn't get data even though it's there
After 60 seconds thread goes back to the top of process_sock_loop
{active, once} is set again
This time the data comes through, things proceed as normal
Why this would be I have no idea, and when we turn those two command connections off everything goes back to normal and the problem goes away.
Any ideas?
It's likely that your first call to set {active,once} is failing due to a race condition between your call to spawn and your call to controlling_process. It will be intermittent, likely depending on host load.
When doing this, I'd normally spawn a function that blocks on something like
{take,Sock}
and then calls your loop on the socket, setting {active,once}. So you'd change the acceptor to spawn, set controlling_process, and then send Pid ! {take,Sock} - something to that effect.
Note: I don't know if the {active,once} call actually throws when you aren't the controlling process; if it doesn't, then what I just said makes sense.