Common Lisp: read from FIFO with timeout

So I have some code like:
(let ((file (open cur-fifo :if-does-not-exist :create)))
  (format t "~A~%" (read-line file nil))
  (close file))
As far as I can tell this works fine, except that it blocks indefinitely when no data has been written to cur-fifo. I want the read to time out and return NIL if no data could be read in ~0.1 sec or so.
Running on SBCL 1.1.18, on 64-bit Gentoo Linux

FIFO model
When you open a FIFO special file for reading, the open(2) call blocks unless either
the FIFO is already open in (another) process for writing, OR
you pass O_NONBLOCK to open(2) - which you might not be able to do in your implementation unless you use a low-level package like sb-posix.
Once you have successfully opened the FIFO, your read(2) call will block until your counterparty (whoever opened the FIFO for writing, which could be the same Lisp process) writes something there.
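If you do drop down to the raw system calls (via sb-posix, or in C), the recipe for the ~0.1 sec timeout looks roughly like the sketch below. This is a minimal C sketch, assuming the fifo already exists at ./cur-fifo; error checks are elided:

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* O_NONBLOCK makes open() return immediately even with no writer */
    int fd = open("cur-fifo", O_RDONLY | O_NONBLOCK);
    struct pollfd p = { .fd = fd, .events = POLLIN };
    char buf[256];
    /* wait at most 100 ms for data to arrive */
    if (poll(&p, 1, 100) > 0 && (p.revents & POLLIN)) {
        ssize_t n = read(fd, buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("%s", buf); }
    } else {
        puts("NIL");  /* timed out: no data could be read */
    }
    close(fd);
    return 0;
}

One caveat: on Linux, if no process has the fifo open for writing, poll() reports POLLHUP immediately instead of waiting out the timeout, which is why the sketch checks revents for POLLIN.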
Common Lisp
What you are looking for is listen (see also with-open-file):
(with-open-file (fifo "my-fifo" :if-does-not-exist :create)
  (when (or (listen fifo)
            (progn (sleep 0.1)
                   (listen fifo)))
    (format t "Read [~A]~%" (read-line fifo))))
Debugging
Please note that special device handling is not necessarily equally well supported by all CL vendors.
If the above does not work, please do some experiments at the REPL: open the fifo, see what listen returns, write something to it, see what listen reports now, etc.
If listen still returns NIL even though you have already written something into the pipe, this might mean that your CL does not recognize the file as a special device. You might have to pass some implementation-specific arguments to open, e.g., :buffering nil or similar (try (describe 'open)).

Related

fileevent in TCL is very slow

I tried to write some code in Tcl. The idea is to write code that acts as a proxy over a Cisco. My Cisco is a 3700 running version 12.4, and the Tcl version is 8.3. I work with GNS (Graphical Network Simulator), therefore all components are virtual (including the Cisco).
In my code I opened 2 sockets to two different computers: sock1 and sock2.
I configured these sockets in the following way:
fconfigure $sock1 -translation binary -buffering none -blocking 0
fconfigure $sock2 -translation binary -buffering none -blocking 0
Now I tried to transfer information between them (proxy).
As I read, the language is single-threaded and therefore I need to use events. So, I created two file event handlers that called a function:
fileevent $sock1 readable [list proxy $sock1 $sock2]
fileevent $sock2 readable [list proxy $sock2 $sock1]
The proxy function reads data from the first socket and sends it to the other socket.
The code works well: I transferred RDP and SMB over this proxy. The problem is that it is really slow: each hop takes something like 1000-1100 ms, so I can't use remote desktop through the proxy, and even smbclient is very slow. The proxy function itself is really fast (I checked it, and also tried printing at the start and at the end of the function). Therefore I assume that either the OS delivers the events very slowly or Tcl executes the script slowly. In addition, I opened Wireshark on both sides of the Cisco, and a full second passes between the incoming message and the outgoing message.
Additional information:
I want several clients to be able to communicate at the same time, therefore my Tcl code defines a server socket:
set server_socket [socket -server handle_conn $port]
vwait is_finish
and the function handle_conn opens a socket to the second side and creates the file event handlers:
proc handle_conn {sock1 addr port} {
    # ... create the second socket (sock2) ...
    fileevent $sock1 readable [list proxy $sock1 $sock2]
    fileevent $sock2 readable [list proxy $sock2 $sock1]
}
Therefore I need asynchronous code. (I tried to write a synchronous version: it works fast, but the problem is that I can't create more than one connection at the same time; for example, my proxy doesn't work with a program that needs two ports, or with two different programs at the same time.)
I can't understand if the problem is with fconfigure, with events in tcl, with gns or another problem.
Hope for help!
Edit:
proc proxy {s1 s2} {
    if {([eof $s1] || [eof $s2]) || ([catch {read $s1} data] || [catch {puts -nonewline $s2 $data}])} {
        catch {close $s1}
        catch {close $s2}
    }
}
I find it curious that the code is slow for you; Tcl's fast enough to be used to implement full web servers handling complex content. It makes me suspect that something else is going on. For example, the proxy command sounds like it is mainly just copying bytes from one channel to another, but there are slow ways to do this and there are fast ways. One of the best methods is to put both channels in binary mode (fconfigure $chan -translation binary) and then use fcopy in asynchronous mode to move the bytes over; it has been internally optimised to use efficient buffer sizes and limit the amount of copying between memory buffers. Here's how a proxy command might look:
proc proxy {sourceChannel destinationChannel} {
    fconfigure $sourceChannel -translation binary
    fconfigure $destinationChannel -translation binary
    fcopy $sourceChannel $destinationChannel -command [list \
            copydone $sourceChannel $destinationChannel]
}
The copydone procedure gets called when everything is moved. Here's a basic example, but you might need to be a bit more careful since you've got copies going in both directions:
proc copydone {src dst numBytes {errorMsg ""}} {
    # $numBytes bytes of data were moved
    close $src
    close $dst
    if {$errorMsg != ""} {
        puts stderr "error in fcopy: $errorMsg"
    }
}
If it wasn't for the fact that you are running on a cisco device, I'd also suggest upgrading the version of Tcl in use. Formally, 8.3 hasn't been supported for a very long time.

Connect to Websocket Via Guile

I have near no knowledge/experience with sockets of any kind so my question may just end up being due to my ignorance.
Guile doesn't have any concept of the protocol, but I figure, since a websocket connection starts out as an HTTP request, I should be able to at least make a request out using Guile's web modules.
I'm using wss://node2.wsninja.io from https://wsninja.io/#demo to practice on.
I wrote up
(let ([s (socket PF_INET SOCK_STREAM 0)]
      [d (vector-ref (addrinfo:addr (car (getaddrinfo "echo.websocket.org" "https"))) 1)])
  (connect s AF_INET d 443))
to attempt connecting. While getaddrinfo asks for a protocol (and Guile doesn't recognize "wss" or "ws"), I only really need the IP address as an integer and that's all it's really doing – extracting the integer representation; the result matches what I got verifying against a few other services on the web so it seems right.
I used PF_INET and AF_INET since, if I'm not mistaken, IPv4 makes sense for a websocket call. I went with SOCK_STREAM since a websocket'll be streaming data back and forth, once opened.
I based the above example off of the client example given by Guile here: https://www.gnu.org/software/guile/manual/html_node/Internet-Socket-Examples.html#Internet-Socket-Examples.
I suspect it's the next portion that I've got wrong somehow but I'm not sure how.
I assumed, after connecting, the next step was to initiate the handshake. Given the Guile example, it seemed this could be accomplished by displaying to the now (in theory) opened port so I did this:
(let ([s (socket PF_INET SOCK_STREAM 0)]
      [d (vector-ref (addrinfo:addr (car (getaddrinfo "echo.websocket.org" "https"))) 1)])
  (connect s AF_INET d 443)
  (display "GET / HTTP/1.1\r\n" s)
  (display "Host: localhost\r\n" s)
  (display "User-Agent: Mozilla/5.0\r\n" s)
  (display "Upgrade: websocket\r\n" s)
  (display "\r\n" s)
  (do ([line (read-line s) (read-line s)])
      ((eof-object? line))
    (display line)
    (newline)))
I also tried this with the handshake sent as one String, as well. The do loop is pretty much lifted directly from the Guile example.
While I've seen mention that the handshake should be sent as a header for Websockets, I figured it wouldn't be terribly different from HTTP and the Guile example has it sent via display but maybe that's a source of my error.
Needless to say, this doesn't work. Nothing gets displayed and the loop exits immediately. But it does seem like I'm making a connection, because removing the displays to the socket results in it just hanging (what I assume is an unending loop, since no eof-object arrives while the socket waits on the handshake).
But no affirmation or even error is sent if I do send something, which I would have expected. So I'm confounded and a little out of my depth and thought I'd ask.
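For reference, this is roughly what the opening handshake defined by RFC 6455 looks like on the wire (the key below is the base64 sample value from the RFC itself; a compliant server answers with a 101 Switching Protocols response):

GET / HTTP/1.1
Host: echo.websocket.org
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

Note also that wss:// on port 443 runs this whole exchange inside TLS, so writing plaintext to a bare TCP socket cannot complete it; a plain ws:// endpoint would be the easier first experiment.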

Clojure (aleph) detecting when server disconnects

I have the following code:
(ns alephtest.core
  (:use lamina.core aleph.tcp aleph.formats))

(defn connection-established [socket]
  (println "Socket connection established")
  (on-closed socket #(println "closed"))
  (doseq [line (line-seq (java.io.BufferedReader. *in*))]
    (enqueue socket line)))

(defn -main []
  (on-realized (tcp-client {:host "localhost" :port 9000})
               connection-established
               #(println "error:" %)))
All it does right now is connects to a local server and then passes data from stdin through to the server. This works fine, except for the (on-closed socket #(println "closed")) portion. If I kill the server while the client is still going I should get a message saying "closed" to the console, but I never do.
My end-goal here is to create some kind of recovery code, so that if the server goes down the client will queue up messages while it waits for the server to come back, and when the server does come back it'll reconnect and send everything that was queued.
It looks like your (doseq ...) loop is blocking the thread that would notify you of the socket closing. Wrap it in a (future ...), and see if that doesn't fix it.

unix network process

I was wondering how TCP/IP communication is implemented in Unix. When you do a send over the socket, does the TCP-level work (assembling packets, CRC, etc.) get executed in the same execution context as the calling code?
Or, what seems more likely, a message is sent to some other daemon process responsible for tcp communication? This process then takes the message and performs the requested work of copying memory buffers and assembling packets etc.? So, the calling code resumes execution right away and tcp work is done in parallel? Is this correct?
Details would be appreciated. Thanks!
The TCP/IP stack is part of your kernel. What happens is that you call a helper method which prepares a "kernel trap". This is a special kind of exception which puts the CPU into a mode with more privileges ("kernel mode"). Inside of the trap, the kernel examines the parameters of the exception. One of them is the number of the function to call.
When the function is called, it copies the data into a kernel buffer and prepares everything for the data to be processed. Then it returns from the trap, the CPU restores registers and its original mode and execution of your code resumes.
Some kernel thread will pick up the copy of the data and use the network driver to send it out, do all the error handling, etc.
So, yes, after copying the necessary data, your code resumes and the actual data transfer happens in parallel.
Note that this is for TCP packets. The TCP protocol does all the error handling and handshaking for you, so you can hand it all the data and it will know what to do. If there is a problem with the connection, you'll notice it only after a while, since TCP can ride out short network outages by itself. That means you'll have "sent" some data already before you get an error: you may see the error code for the first packet only after the Nth call to send(), or when you try to close the connection (the close() will hang until the receiver has acknowledged all packets).
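Here is a small demonstration of that deferred error reporting, as a hedged C sketch (loopback TCP, error checking trimmed; depending on OS timing the second send may report ECONNRESET rather than EPIPE):

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    signal(SIGPIPE, SIG_IGN);                   /* report EPIPE instead of dying */
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a = {0};
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* port 0: kernel picks one */
    socklen_t alen = sizeof a;
    bind(srv, (struct sockaddr *)&a, sizeof a);
    getsockname(srv, (struct sockaddr *)&a, &alen); /* learn the chosen port */
    listen(srv, 1);

    if (fork() == 0) {                          /* child: the doomed client */
        int c = socket(AF_INET, SOCK_STREAM, 0);
        connect(c, (struct sockaddr *)&a, sizeof a);
        sleep(1);                               /* give the server time to close */
        ssize_t n = send(c, "x", 1, 0);
        printf("first send:  %zd\n", n);        /* usually 1: queued in the kernel */
        sleep(1);                               /* let the peer's RST come back */
        n = send(c, "x", 1, 0);
        printf("second send: %zd (%s)\n", n, n < 0 ? strerror(errno) : "ok");
        return 0;
    }
    close(accept(srv, NULL, NULL));             /* server: accept, then hang up */
    wait(NULL);                                 /* reap the child */
    return 0;
}

The first send "succeeds" because the data only has to reach the kernel buffer; the failure surfaces on the next call.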
The UDP protocol doesn't buffer. When the call returns, the packet is on its way. But it's "fire and forget", so you only know that the driver has put it on the wire. If you want to know whether it has arrived somewhere, you must figure out a way to achieve that yourself. The usual approach is to have the receiver send an ack UDP packet back (which also might get lost).
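For the ack approach, a receive timeout keeps the sender from waiting forever on a lost datagram. A hedged C fragment rather than a full program; fd, msg, and msglen are assumed to be your connected UDP socket and payload:

/* send a datagram, then wait up to 500 ms for a one-byte ack */
struct timeval tv = { .tv_sec = 0, .tv_usec = 500000 };
setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
send(fd, msg, msglen, 0);       /* the packet is on its way */
char ack;
if (recv(fd, &ack, 1, 0) < 0) {
    /* EAGAIN/EWOULDBLOCK here means the ack (or the original
       datagram) was lost: retransmit or give up */
}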
No - there is no parallel execution. It is true that the execution context when you're making a system call is not the same as your usual execution context. When you make a system call, such as for sending a packet over the network, you must switch into the kernel's context - the kernel's own memory map and stack, instead of the virtual memory you get inside your process.
But there are no daemon processes magically dispatching your call. The rest of the execution of your program has to wait for the system call to finish and return whatever values it will return. This is why you can count on return values being available right away when you return from the system call - values like the number of bytes actually read from the socket or written to a file.
I tried to find a nice explanation for how the context switch to kernel space works. Here's a nice in-depth one that even focuses on architecture-specific implementation:
http://www.ibm.com/developerworks/linux/library/l-system-calls/

What can cause a spontaneous EPIPE error without either end calling close() or crashing?

I have an application that consists of two processes (let's call them A and B), connected to each other through Unix domain sockets. Most of the time it works fine, but some users report the following behavior:
A sends a request to B. This works. A now starts reading the reply from B.
B sends a reply to A. The corresponding write() call returns an EPIPE error, and as a result B closes the socket. However, A did not close() the socket, nor did it crash.
A's read() call returns 0, indicating end-of-file. A thinks that B prematurely closed the connection.
Users have also reported variations of this behavior, e.g.:
A sends a request to B. This works partially, but before the entire request is sent, A's write() call returns EPIPE, and as a result A closes the socket. However, B did not close() the socket, nor did it crash.
B reads a partial request and then suddenly gets an EOF.
The problem is I cannot reproduce this behavior locally at all. I've tried OS X and Linux. The users are on a variety of systems, mostly OS X and Linux.
Things that I've already tried and considered:
Double close() bugs (close() is called twice on the same file descriptor): probably not as that would result in EBADF errors, but I haven't seen them.
Increasing the maximum file descriptor limit. One user reported that this worked for him, the rest reported that it did not.
What else can possibly cause behavior like this? I know for certain that neither A nor B close() the socket prematurely, and I know for certain that neither of them have crashed because both A and B were able to report the error. It is as if the kernel suddenly decided to pull the plug from the socket for some reason.
Perhaps you could try strace as described in: http://modperlbook.org/html/6-9-1-Detecting-Aborted-Connections.html
I assume that your problem is related to the one described here: http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
Unfortunately I'm having a similar problem myself but couldn't manage to get it fixed with the given advices. However, perhaps that SO_LINGER thing works for you.
shutdown() may have been called on one of the socket endpoints.

If either side may fork and execute a child process, ensure that the FD_CLOEXEC (close-on-exec) flag is set on the socket file descriptor if you did not intend for it to be inherited by the child. Otherwise the child process could (accidentally or otherwise) be manipulating your socket connection.
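Setting the flag is a two-line fcntl(2) dance; a hedged fragment, with sock_fd standing in for your socket descriptor:

/* mark the socket close-on-exec so fork+exec'd children don't inherit it */
int flags = fcntl(sock_fd, F_GETFD);
fcntl(sock_fd, F_SETFD, flags | FD_CLOEXEC);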
I would also check that there's no sneaky firewall in the middle. It's possible an intermediate forwarding node on the route sends an RST. The best way to track that down is of course the packet sniffer (or its GUI cousin.)
