Difficulty reading input pipe in SBCL - common-lisp

I am slowly getting closer to being able to read from and write to the named pipes of a background process through SBCL. What I do is kick off the program I am trying to talk to:
todd@ubuntu:~/CoreNLP$ cat ./spin | /usr/bin/java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -outputFormat text > ./spout &
[1] 24616
That all works out fine, so I kick off SBCL and do this:
(defparameter from-corenlp (open "./spout"))
Which also works out fine, but declaring the stream causes SBCL to spill the stream onto the screen (which is all the startup information from the background process). It does not wait until I read from the stream. Is that how things are supposed to work?

The solution, as I posted it to the stanford parser mailing list (stack overflow reformatted a lot of it to something weird, but you get the idea):
It took quite a while, but I finally figured out embedding (for the most part) the CoreNLP program (while in interactive mode) in SBCL Lisp.
First of all, forget using (sb-ext:run-program ...). Spawning Java with a quoted argument (like the asterisk), no matter how well escaped, simply makes the spawned program crash.
Inferior shell seems to kick off the parser but it is only good for a one-off parse, even in the interactive mode. Perhaps I could have done better, but inferior shell needs to be installed and it is poorly documented.
The initial attempted solution of using Unix named pipes ends up being the final one, but it took a bit of work, first with buffering, then with the order of operations, and finally understanding some nuances about the parser program.
First, turning off buffering completely when running the program is important, so running it looks like this:
stdbuf --i=0 --o=0 --e=0 cat ./spin | /usr/bin/java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -outputFormat text > ./spout &
That is supposed to be running the parser in the background accepting input from spin and sending its output to spout. But if you look at the process table in Linux, you will not see it running. It is still waiting for something to pull from the output pipe before it can even run.
So, we run SBCL and start a stream pulling from the parser's pipe:
(defparameter *from-corenlp* (open "./spout"))
NOW the parser starts running. Here, oddly, it also starts dumping output to the screen, not to the pipe! That is because all of this banner stuff when the parser starts and stops (and apparently even the NLP> prompt) is sent to stderr, not stdout. This is actually a good thing.
So then we declare the stream from Lisp to the parser:
(defparameter *to-corenlp* (open "./spin" :direction :output :if-exists :append))
Then we send some text for the parser to parse:
(write-line "This is the first test." *to-corenlp*)
I ran into a problem here a few times, even. Remember that Lisp has its own buffer so you have to clear out the stream every time:
(finish-output *to-corenlp*)
You then can run this line below a whole bunch of times to verify you obtain the exact same behavior you would have gotten from an interactive session of the parser:
(format t "~a~%" (read-line *from-corenlp*))
Which, if you are a good boy scout, should not only be true, but you can carry on with your interactive slave parser session for as long as you like:
(write-line "This is the second test." *to-corenlp*)
(finish-output *to-corenlp*)
Isn't that great? And notice I pulled all of that off being terrible at Unix, terrible at Lisp and being a terrible boy scout!
Now so can you!
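For what it's worth, the write, flush and read steps above can be bundled into two small helpers. This is only a sketch: the names send-sentence and read-n-lines are made up for illustration, the streams are assumed to be the ones opened on ./spin and ./spout above, and how many lines the parser prints per sentence is something you would have to work out for your own annotator settings.

```lisp
;; Hypothetical helpers; not part of SBCL or CoreNLP.

(defun send-sentence (sentence to-stream)
  "Write SENTENCE as one line to TO-STREAM and flush Lisp's output buffer."
  (write-line sentence to-stream)
  ;; Without this the text can sit in Lisp's buffer and never reach the pipe.
  (finish-output to-stream))

(defun read-n-lines (from-stream n)
  "Read up to N lines from FROM-STREAM, stopping early at end of file."
  (loop for i from 0 below n
        for line = (read-line from-stream nil nil)
        while line
        collect line))
```

With the streams from above this would be used as (send-sentence "This is the first test." *to-corenlp*) followed by (read-n-lines *from-corenlp* 10). Note that read-line blocks on a pipe until a full line arrives, so asking for more lines than the parser prints will hang.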


SBCL Compiler Diagnostic Messages (missing with a "when" with no body)

By accident, I recently came across a latent coding error in one of my functions, dealing with a when statement. A reduced paraphrase might look like:
(defparameter a 0)
(when (or (= a 0)
          (= a 1)
  (* a a)))
The problem is a misplaced parenthesis, so it actually is
(when (or (= a 0)
          (= a 1)
          (* a a)))
In a situation like this, wouldn't it be useful for the compiler to generate either a style warning or note? It seems to me that the meaning of a when statement normally implies a condition and a body, even though the body is strictly optional. Of course, a print pretty would have caught this in the editor, but I had copied it from elsewhere. Is there a reason that SBCL does not check for these kinds of mistakes?
Regarding "a print pretty would have caught this in the editor":
To discuss the options, I know about:
trivial-formatter will format the source code.
(trivial-formatter:fmt :your-system :supersede)
cl-indentify indents the source code. Has a command line utility. I tried it once and it was not bad, but different than Emacs' indentation, thus annoying for me.
$ cl-indentify bar.lisp
It links to lispindent but I was less happy with its result.
However, the best would be to not only format the code and re-read it ourselves, but to
run checks against a set of rules to warn against code smells
This is what lisp-critic proposes. It can critique a function or a file. However:
(edit) it doesn't really have a Slime integration; we have to critique either a function or a whole file.
if you feel adventurous, see a utility of mine here. It could be an easier way to test snippets that you enter at the REPL.
it doesn't have a rule about when without a body (we can easily add one)
And it would be best if the run failed with an error status code when it found a code smell. Again, a little project of mine in beta tries to do that, see here. It doesn't have many rules now, but I just pushed a check for this. You can call the script:
$colisper.sh tests/playground.lisp
it shows an error (but doesn't write it in-place by default):
|;; when with no body
|(when (or (= a 0)
| (= a 1)
!| (* a a))
!| (error "colisper found a 'when' with a missing body. (we should error the script without this rewrite!)"))
and returns with an exit code, so we can use it as a git hook or in a CI pipeline.
The problem is that if a human writes (when x) (or whatever that expands into, perhaps (if x (progn) nil)) this is probably a mistake, but when a program writes it it may well not be: it may be just some edge case that the program hasn't been smart enough to optimize completely away. And a huge amount of code that the compiler processes is written by programs, not humans.
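To make the idea of such a check concrete, here is a minimal sketch of a purely syntactic test for a when with a condition but no body. It is not how SBCL, lisp-critic or colisper work; the function names are made up for illustration. It walks a form that has already been read (for example with read on a source file), so it only sees what a human wrote, not what macros expand into. It also treats quoted data as code, which a real tool would need to handle.

```lisp
;; Hypothetical names; a sketch, not an existing tool's implementation.

(defun when-without-body-p (form)
  "True if FORM looks like (when test) with nothing after the test."
  (and (consp form)
       (eq (first form) 'when)
       (consp (cdr form))        ; there is a test form...
       (null (cddr form))))      ; ...and nothing follows it

(defun find-bodyless-whens (form)
  "Return every subform of FORM that is a WHEN with no body."
  (let ((found '()))
    (labels ((walk (x)
               (when (consp x)
                 (when (when-without-body-p x)
                   (push x found))
                 ;; Visit each element; LOOP ... ON also tolerates dotted lists.
                 (loop for tail on x
                       do (walk (car tail))))))
      (walk form))
    (nreverse found)))
```

Calling (find-bodyless-whens '(when (or (= a 0) (= a 1) (* a a)))) flags the form from the question, while the intended version with (* a a) as the body is not flagged.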

How to open a Julia repl in a specific mode

I want to have a short script that opens a Julia REPL in a specific mode, for instance, the shell> mode or the C++ > (from Cxx.jl) mode. How can this be achieved?
Update:
After getting an answer I created a script to start Julia REPL in Cxx.jl C++ mode (and pre-run some C++ code). See it here: https://github.com/cdsousa/cxxrepl.jl.
Whatever this may be good for...
The easiest way (without having dug into the innards of Base.REPL) is to write the appropriate character to STDIN, e.g.:
write(STDIN.buffer,'?');
If you want to start the REPL and drop to shell mode immediately, call julia as
julia -i -e "write(STDIN.buffer,';')"

reading deeply nested tree causes stack overflow

I'm trying to read a massive sexp from file into memory, and it seems to be working out fine for smaller inputs, but on more deeply nested ones sbcl conks out with stack exhaustion. There seems to be a hard recursion limit (at 1000 functions deep) that sbcl simply cannot surpass (strangely, even when its stack size is increased). Example (code is here): make check-c works, but make check-cpp exhausts the stack as below:
INFO: Control stack guard page unprotected
Control stack guard page temporarily disabled: proceed with caution
Unhandled SB-KERNEL::CONTROL-STACK-EXHAUSTED in thread #<SB-THREAD:THREAD
"main thread" RUNNING
{10034E6DE3}>:
Control stack exhausted (no more space for function call frames).
This is probably due to heavily nested or infinitely recursive function
calls, or a tail call that SBCL cannot or has not optimized away.
PROCEED WITH CAUTION.
Backtrace for: #<SB-THREAD:THREAD "main thread" RUNNING {10034E6DE3}>
0: ((LAMBDA NIL :IN SB-DEBUG::FUNCALL-WITH-DEBUG-IO-SYNTAX))
1: (SB-IMPL::CALL-WITH-SANE-IO-SYNTAX #<CLOSURE (LAMBDA NIL :IN SB-DEBUG::FUNCALL-WITH-DEBUG-IO-SYNTAX) {100FC9006B}>)
2: (SB-IMPL::%WITH-STANDARD-IO-SYNTAX #<CLOSURE (LAMBDA NIL :IN SB-DEBUG::FUNCALL-WITH-DEBUG-IO-SYNTAX) {100FC9003B}>)
...
Why am I using recursion, then? Actually, I'm not, but unfortunately the builtin (read) uses recursion, and that's where the stack overflow is occurring. The other option (which I've started working on) is to write an iterative version of read which relies upon the more limited syntax that I'm feeding into it from a separate program to avoid the complexity of re-implementing read (my (currently broken) attempts at that are in the lisp branch of the above repository).
However, I'd prefer a more canonical solution. Are there alternatives to the builtin read that can parse deeply nested structures by avoiding recursion?
EDIT: This appears to be an insurmountable issue with sbcl itself, not the input data. For a quick example, try running:
(for i in $(seq 1 2000); do
echo -n "("
done; echo -n "2"; for i in $(seq 1 2000); do
echo -n ")"
done; echo) > file
And then in sbcl:
(with-open-file (file "file" :direction :input) (read file))
The same failure occurs.
EDIT: Asked around on #sbcl, and apparently the control stack size really applies only to new threads, and the stack size for the main thread is affected by a lot of other factors as well. So I tried putting the read in a separate thread. Still didn't work. Check out this repo and run make check if you're interested.
I don't know what you did (because you didn't show it exactly), but when I start sbcl as follows your example works fine for me:
sbcl --control-stack-size 100
Of course, I'd also recommend GNU CLISP and Embeddable Common Lisp (ECL), as they also work A-OK for your example.
I'll add a reference to this answer for future readers: https://stackoverflow.com/a/9002973/816536
I'll also mention that compiling the code with appropriate optimization options may be necessary in many CL implementations to benefit from tail-call optimization.
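If raising the stack size is not an option, the iterative reader mentioned in the question can be sketched as follows. This is not a replacement for read: it assumes a deliberately restricted syntax (parentheses and whitespace-separated tokens only, so no strings containing spaces, no quotes, no comments, no dotted pairs), and the function names are made up for illustration. Nesting is tracked on an explicit list instead of the control stack. Each token is still handed to read-from-string, but a single token is not nested, so that call stays shallow.

```lisp
;; Hypothetical names; a sketch for a restricted s-expression syntax.

(defun whitespace-char-p (ch)
  (member ch '(#\Space #\Tab #\Newline #\Return)))

(defun read-token (stream)
  "Collect characters up to the next whitespace, parenthesis or end of file."
  (with-output-to-string (out)
    (loop for ch = (peek-char nil stream nil nil)
          while (and ch
                     (not (whitespace-char-p ch))
                     (not (member ch '(#\( #\)))))
          do (write-char (read-char stream) out))))

(defun read-nested-iteratively (stream)
  "Read one form from STREAM without recursing on nesting depth."
  (let ((stack '()))   ; each entry is a partially built list, in reverse
    (loop
      ;; PEEK-CHAR with T skips whitespace and signals END-OF-FILE at EOF.
      (let ((ch (peek-char t stream)))
        (case ch
          (#\( (read-char stream)
               (push '() stack))
          (#\) (read-char stream)
               (when (null stack)
                 (error "Unbalanced close parenthesis"))
               (let ((done (nreverse (pop stack))))
                 (if (null stack)
                     (return done)
                     (push done (first stack)))))
          (t (let ((value (read-from-string (read-token stream))))
               (if (null stack)
                   (return value)
                   (push value (first stack))))))))))
```

It would be called the same way as in the question, e.g. (with-open-file (file "file") (read-nested-iteratively file)). Be aware that printing a very deeply nested result at the REPL may itself recurse, so it is safer to bind the result to a variable than to let the REPL print it.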

translate-pathname behaves strange

Following this question: Strange symbols in filespec when calling load I tried my luck with pathnames, but, as you see, failed. Below is an example of the error, which I cannot explain:
This code does not work:
(defun test-process-imgae-raw ()
  (cl-gd:with-image-from-file
      (test #P"digit-recognition:digit-7.png")
    (process-image-raw test)))
Neither does this:
(defun test-process-imgae-raw ()
  (cl-gd:with-image-from-file
      (test "digit-recognition:digit-7.png")
    (process-image-raw test)))
But this code does:
(defun test-process-imgae-raw ()
  (cl-gd:with-image-from-file
      (test (translate-logical-pathname "digit-recognition:digit-7.png"))
    (process-image-raw test)))
And so does this:
(defun test-process-imgae-raw ()
  (cl-gd:with-image-from-file
      (test (translate-logical-pathname #P"digit-recognition:digit-7.png"))
    (process-image-raw test)))
Here's the "translator":
(setf (logical-pathname-translations "DIGIT-RECOGNITION")
      `(("**;*.*" "/home/wvxvw/Projects/digit-recognition/**/*.*")))
And here's the error I'm getting:
Pathname components from SOURCE and FROM args to TRANSLATE-PATHNAME
did not match:
:NEWEST NIL
[Condition of type SIMPLE-ERROR]
Restarts:
0: [RETRY] Retry SLIME REPL evaluation request.
1: [*ABORT] Return to SLIME's top level.
2: [ABORT] Abort thread (#<THREAD "repl-thread" RUNNING {1003800113}>)
Backtrace:
0: (SB-IMPL::DIDNT-MATCH-ERROR :NEWEST NIL)
1: (SB-IMPL::TRANSLATE-COMPONENT :NEWEST NIL :NEWEST T)
2: (TRANSLATE-PATHNAME #P"DIGIT-RECOGNITION:DIGIT-7.PNG.NEWEST" #P"DIGIT-RECOGNITION:**;*.*" #P"/home/wvxvw/Projects/digit-recognition/**/*.*")
3: (TRANSLATE-LOGICAL-PATHNAME #P"DIGIT-RECOGNITION:DIGIT-7.PNG.NEWEST")
4: (SB-IMPL::QUERY-FILE-SYSTEM #P"DIGIT-RECOGNITION:DIGIT-7.PNG" :TRUENAME NIL)
5: (PROBE-FILE #P"DIGIT-RECOGNITION:DIGIT-7.PNG")
6: (CREATE-IMAGE-FROM-FILE #<unavailable argument> NIL)
7: (TEST-PROCESS-IMGAE-RAW)
I'm trying to read the Hyperspec section on translate-pathname, but I can make absolutely no sense of what it says, nor of the examples it shows. Beyond that, I can't even understand how there can possibly be an error when you transform a string by whatever rules you put in place; so far it's only a one-way transformation...
I'm trying to read SBCL sources for this function, but they are really lengthy, and trying to figure out the problem this way is taking huge amounts of time.
tl;dr How is it even possible that translate-logical-pathname called from user's code will produce something different to what is produced from that function if called from system code? This is not only non-portable, this is just outright broken.
EDIT:
Adding one more asterisk to the pattern on the left side, but not on the right, solved this. But the purpose or logic of why this is necessary is beyond me.
I.e.
(setf (logical-pathname-translations "DIGIT-RECOGNITION")
      `(("**;*.*.*" "/home/wvxvw/Projects/digit-recognition/**/*.*")))
This allows pathnames like digit-recognition:foo.bar.newest to succeed, just like digit-recognition:foo.bar, but why that asterisk is a requirement is beyond me. Also, why does the system function feel entitled to change the pathname to something other than what it was given? Just so you don't get confused: with-image-from-file will only work with a path already expanded by translate-logical-pathname; it won't work otherwise.
EDIT2:
OK, it seems like this is the problem with cl-gd, instead of trying to expand the file name, it takes it literally. This code taken from create-image-from-file probably best answers my question:
(when (pathnamep file-name)
(setq file-name
#+:cmu (ext:unix-namestring file-name)
#-:cmu (namestring file-name)))
(with-foreign-object (err :int)
(with-cstring (c-file-name file-name)
(let ((image (ecase %type
((:jpg :jpeg)
(gd-image-create-from-jpeg-file c-file-name err))
I.e. instead of doing (namestring file-name) it has to do (namestring (translate-logical-pathname file-name)). Duh...
Another way is to use TRUENAME, which returns the real file name. Normally this would not make a difference.
Imagine a file system with file versions (like the file systems of VMS, ...). If you have a logical pathname foo:bar;baz.png.newest, then it might translate to, say, /myfiles/images/baz.png~newest (again, just assume that it has version numbers). This still is not a real physical file. If such a Lisp system tries to open the file, it has to look into the file system to actually determine the newest file. That might be /myfiles/images/baz.png~42.
So, if you want to pass real physical filenames to external tools (like a C library), it might not be sufficient to expand the logical pathname, but it might be necessary to compute the truename - the real physical file.
The ability to deal with file versions comes from a time when file versions were quite common (see Versioning file system) with operating systems like ITS, VMS or the various Lisp Machine operating systems.
The main practical problem for this is that there is no common test suite for pathname operations for the various CL implementations and thus implementations differ in a lot of subtle details (especially when you need to deal with different file systems from different operating systems). Plus real file systems have complications - for example file names in Mac OS X use a special unicode encoding when dealing with Umlauts.
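As a rough illustration of the fix discussed above, the translation can be done once, before the name is handed to foreign code. This is only a sketch: physical-namestring is a made-up helper name, the DEMO host and the /tmp/demo/ directory are placeholders, and the exact namestring you get back (case, version handling) is implementation dependent, as noted above.

```lisp
;; Hypothetical host and directory, for illustration only.
(setf (logical-pathname-translations "DEMO")
      '(("**;*.*.*" "/tmp/demo/**/*.*")))

;; Hypothetical helper; not part of cl-gd.
(defun physical-namestring (pathspec)
  "Translate PATHSPEC (logical or physical) and return a physical namestring."
  ;; TRANSLATE-LOGICAL-PATHNAME returns physical pathnames unchanged.
  (namestring (translate-logical-pathname pathspec)))

;; Example call; the file does not have to exist for the translation.
(physical-namestring "demo:digit-7.png")
```

If the file is known to exist and the real on-disk name matters, (namestring (truename pathspec)) is the stricter variant described above; unlike the helper, truename signals an error for a missing file.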

How can I automate these emacs ESS (ess-remote) commands?

I'm using a local emacs instance (aquamacs) to run R processes on a remote server, and I'd like to automate the process of connecting to my server. The process is as follows:
[in emacs]
M-x shell
[in the resulting console]
TERM=xterm
ssh -Y -C <my remote server>
screen -rd [and/or] R
[in emacs]
M-x ess-remote
r
I discovered this general approach here: http://blog.nguyenvq.com/2010/07/11/using-r-ess-remote-with-screen-in-emacs/. The -Y -C options allow you to use xterm to view plots. I don't know Lisp, and though I've googled around a bit, I can't seem to piece together how to actually define a function to automate this (e.g., in .emacs.el). Has anyone implemented anything like this?
Let's assume you just want to call shell in code. In Lisp, everything is prefix notation surrounded by parentheses. So we enter this into a buffer (say, the scratch buffer):
(shell)
Move your pointer to the end of the line after the close-paren, and type <C-x C-e> to execute the Lisp code. You should see that the shell function is called.
Now, let's make it a function, so we can add other things to it. The command to create a function is defun, and it takes the name of the function, the argument list (in parentheses), and then the body of the function:
(defun automate-connection ()
  (shell))
Move your cursor to the end of the code, hit <C-x C-e>, and the function will be defined. You can call it from Lisp by executing
(automate-connection)
Ok, now we just need to put some text into the shell buffer.
(defun automate-connection ()
  (shell)
  (insert "TERM=xterm"))
Now, when we run that, we get "TERM=xterm" put into the shell buffer. But it doesn't actually send the command. Let's try putting a newline.
(defun automate-connection ()
  (shell)
  (insert "TERM=xterm\n"))
That puts in a newline, but doesn't actually make the command run. Why not? Let's see what the enter key does. Go to your *shell* buffer, and type <C-h c>, then hit the return key. (<C-h c> runs describe-key-briefly, which prints the name of the function invoked by hitting the given key). That says that when you hit RET, it's not putting a newline, but actually calling comint-send-input. So let's do that:
(defun automate-connection ()
  (shell)
  (insert "TERM=xterm")
  (comint-send-input))
Now, when you run (automate-connection) from any Lisp code, you should get the given thing sent. I leave it as an exercise to the reader to add your other commands.
But wait! We're not really done, are we? I assume you don't want to have to move to a Lisp scratch buffer, type in (automate-connection), then evaluate that code. You probably just want to type <M-x automate-connection>, and call it a day. You can't do that by default with the function we just created. Luckily, it's simple to allow that: just add a call to (interactive) in your function:
(defun automate-connection ()
  (interactive)
  (shell)
  (insert "TERM=xterm")
  (comint-send-input))
Now you can call it as you want, and it'll open the *shell* buffer, put in the text, and tell Emacs to tell the shell to run that text.
