"with-open-file" reads from the beginning of a file. If the file is very big, how can I read the last 20 lines efficiently?
Sincerely!
This opens a file, reads the final byte, and closes the file.
(defun read-final-byte (filename)
  (with-open-file (s filename
                     :direction :input
                     :if-does-not-exist :error)
    (let ((len (file-length s)))
      (file-position s (1- len)) ; 0-based position.
      (read-char s nil)))) ; don't error if reading the end of the file.
If you specifically want to read the last n lines, you will have to read back an indeterminate number of bytes until you have seen n+1 newlines. To do this, you will have to read backwards either in blocks (faster, but you will end up reading some unneeded bytes) or byte by byte (slower, but precise and with a slightly more obvious algorithm).
I suspect tail has a reasonable algorithm applied for this, so it would likely be worth reading tail's source for a guideline.
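The block-read approach described above can be sketched roughly as follows. This is only an illustration, not what tail actually does: the name tail-lines and the block-size parameter are made up here, and the code assumes LF line endings and a single-byte encoding such as ASCII or Latin-1 (a multi-byte encoding would need a real decoder in the last step).

```lisp
(defun tail-lines (filename n &key (block-size 4096))
  "Return the last N lines of FILENAME as one string.
Sketch only: assumes LF line endings and a single-byte encoding."
  (check-type n (integer 1))
  (with-open-file (s filename :element-type '(unsigned-byte 8))
    (let* ((len (file-length s))
           (buf (make-array block-size :element-type '(unsigned-byte 8)))
           (end len)      ; exclusive end of the region not yet scanned
           (newlines 0)   ; newlines seen so far, counted from the end
           (start 0))     ; offset where the wanted lines begin
      ;; A trailing newline only terminates the last line; skip it.
      (when (plusp len)
        (file-position s (1- len))
        (when (= (read-byte s) 10)
          (decf end)))
      ;; Walk backwards one block at a time until N newlines are seen.
      (loop while (plusp end)
            do (let* ((from (max 0 (- end block-size)))
                      (chunk (- end from)))
                 (file-position s from)
                 (read-sequence buf s :end chunk)
                 (let ((hit (loop for i from (1- chunk) downto 0
                                  when (and (= (aref buf i) 10)
                                            (= (incf newlines) n))
                                    return i)))
                   (when hit
                     (setf start (+ from hit 1))
                     (return))
                   (setf end from))))
      ;; Read everything from START to the end of the file.
      (file-position s start)
      (let ((bytes (make-array (- len start)
                               :element-type '(unsigned-byte 8))))
        (read-sequence bytes s)
        (map 'string #'code-char bytes)))))
```

Calling (tail-lines "big.log" 20) would then return the last 20 lines; if the file has fewer lines than requested, the whole file is returned.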
I don't understand why this code behaves differently in different implementations:
(format t "asdf")
(setq var (read))
In CLISP it behaves as would be expected, with the prompt printed followed by the read, but in SBCL it reads, then outputs. I read a bit on the internet and changed it:
(format t "asdf")
(force-output t)
(setq var (read))
This, again, works fine in CLISP, but in SBCL it still reads, then outputs. I even tried separating it into another function:
(defun output (string)
  (format t string)
  (force-output t))
(output "asdf")
(setq var (read))
And it still reads, then outputs. Am I not using force-output correctly or is this just an idiosyncrasy of SBCL?
You need to use FINISH-OUTPUT.
In systems with buffered output streams, some output remains in the output buffer until the buffer is full (at which point it is automatically written to the destination) or until the buffer is explicitly emptied.
Common Lisp has three functions for that:
FINISH-OUTPUT, attempts to ensure that all output is done and THEN returns.
FORCE-OUTPUT, starts the remaining output, but IMMEDIATELY returns and does NOT wait for all output to be done.
CLEAR-OUTPUT, tries to delete any pending output.
Also, the T argument unfortunately does not mean the same thing in FORCE-OUTPUT and in FORMAT.
force-output / finish-output: T is *terminal-io* and NIL is *standard-output*
FORMAT: T is *standard-output*
This should work:
(format t "asdf")
(finish-output nil) ; note the NIL
(setq var (read))
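One way to avoid the T/NIL confusion altogether is to name the streams explicitly. The following is a minimal sketch of that idea; the function name prompt-and-read is made up for this example.

```lisp
(defun prompt-and-read (prompt)
  "Print PROMPT, make sure it has actually been written, then read one form."
  (write-string prompt *standard-output*)
  ;; FINISH-OUTPUT waits until the buffered prompt has been written out.
  (finish-output *standard-output*)
  (read *standard-input*))
```

With this, (setq var (prompt-and-read "asdf")) should show the prompt before waiting for input.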
I am trying to write a BLOB into a database chunk by chunk, using a C function from the database API (say, db-write-chunk).
This function takes a pointer to the foreign memory where the chunk is placed as an argument.
So, I make a buffer for a chunk: foreign-buffer.
I take the chunk data from a file (or binary stream) with read-sequence into stream-buffer:
(let ((foreign-buffer (foreign-alloc :uchar :count 1024))
      (stream-buffer (make-array 1024 :element-type '(unsigned-byte 8))))
  (loop
    for cnt = (read-sequence stream-buffer MY-STREAM)
    while (> cnt 0)
    do
       ;; copy cnt bytes from stream-buffer into foreign-buffer
       (dotimes (i cnt)
         (setf (mem-aref foreign-buffer :uchar i) (aref stream-buffer i)))
       ;; call db-write-chunk with foreign-buffer
       (db-write-chunk foreign-buffer)))
The L in BLOB stands for Large, so the loop may iterate many times.
Besides that, all this code may be wrapped in an outer loop (a bulk insert, for example).
So, I want to minimize the number of steps in the loop body (or bodies).
To do this, I need either:
to be able to read sequence not into stream-buffer, but into foreign-buffer directly, like this:
(read-sequence (coerce foreign-buffer '(vector/array ...)) MY-STREAM)
or to be able to interpret stream-buffer as foreign memory, like this:
(db-write-chunk (mem-aptr stream-buffer :uchar 0))
Is it possible to solve my problem using single buffer only - native or foreign, without copying memory between them?
Like anything else involving FFI, this is implementation-dependent, but CFFI has cffi:make-shareable-byte-vector, which returns a CL (unsigned-byte 8) array that you can then use with cffi:with-pointer-to-vector-data:
(cffi:defcfun memset :pointer
  (ptr :pointer)
  (val :int)
  (size :int))
(let ((vec (cffi:make-shareable-byte-vector 256)))
  (cffi:with-pointer-to-vector-data (ptr vec)
    (memset ptr 0 (length vec))))
Depending on your use, this might be preferable to static-vectors, because you don't have to remember to free it manually. On SBCL this works by pinning the vector data during with-pointer-to-vector-data.
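Applied to the chunked BLOB loop from the question, the single-buffer idea might look like the sketch below. It assumes CFFI is loaded and that the stream has element type (unsigned-byte 8). The names write-blob-from-stream and write-chunk are hypothetical, and since the real signature of db-write-chunk is not known here, the foreign call is passed in as a function taking a pointer and a byte count.

```lisp
;; Assumes CFFI is already loaded, e.g. via (ql:quickload :cffi).
(defun write-blob-from-stream (stream write-chunk &key (chunk-size 1024))
  "Read STREAM in chunks into one shareable vector and call WRITE-CHUNK
with a foreign pointer to the vector data and the number of bytes read."
  (let ((buffer (cffi:make-shareable-byte-vector chunk-size)))
    (loop for count = (read-sequence buffer stream)
          while (plusp count)
          ;; The pointer is only valid inside WITH-POINTER-TO-VECTOR-DATA.
          do (cffi:with-pointer-to-vector-data (ptr buffer)
               (funcall write-chunk ptr count)))))
```

A call could then look like (write-blob-from-stream my-stream (lambda (ptr count) (db-write-chunk ptr))), adjusted to whatever arguments db-write-chunk really takes. The foreign function must not keep the pointer after it returns.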
I was hoping to experiment with cl-async to run a series of external programs with a large number of combinations of command-line arguments. However, I can't figure out how to read the stdout of the processes launched with as:spawn.
I would typically use uiop which makes it easy to capture the process output:
(let ((p (uiop:launch-program ... :output :stream)))
  (do-something-else-until-p-is-done)
  (format t "~a~%" (read-line (uiop:process-info-output p))))
I've tried both :output :pipe and :output :stream options to as:spawn and executing (as:process-output process-object) in my exit-callback shows the appropriate pipe or async-stream objects but I can't figure out how to read from them.
Can anyone with experience with this library tell how to accomplish this?
So you go to your repl and type:
CL-USER> (documentation 'as:spawn 'function)
And you read whatever comes out (or put your point on the symbol and hit C-c C-d f). If you read it, you’ll see that the format for the :input, etc. arguments is either :pipe, (:pipe args...), :stream, or (:stream args...) (or some other options), that :stream behaves similarly to :pipe but gives output of a different type, and that for details of args one should look at PIPE-CONNECT. So you go and look up the documentation for that. Well, it tells you what the options are, but it isn’t very useful. What’s the documentation/description of PIPE or STREAM? Well, it turns out that PIPE is a class and a subclass of STREAMISH. What about PROCESS? That’s a class too, and it has slots (and accessors) for things like PROCESS-OUTPUT. So what is a good plan for how to figure out what to do next? Here’s a suggestion:
Spawn a long running process (like cat foo.txt -) with :output :stream :input :pipe say
Inspect the result (C-c C-v TAB)
Hopefully it’s an instance of PROCESS. What is its output? Inspect that.
Hopefully the output is a Gray stream (ASYNC-STREAM). Get it into your repl and see what happens if you try to read from it.
And what about the input? See what type that has and what you can do with it.
The above is all speculation. I’ve not tried running any of this but you should. Alternatively go look at the source code for the library. It’s already on your computer and if you can’t find it it’s on GitHub. There are only about half a dozen source files and they’re all small. Just read them and see what you can learn. Or go to the symbol you want to know about and hit M-. to jump straight to its definition. Then read the code. Then see if you can figure out what to do.
I found the answer in the test suite. The output stream can only be processed asynchronously via a read callback. The following is a simple example for posterity:
(as:start-event-loop
 (lambda ()
   (let ((bytes (make-array 0 :element-type '(unsigned-byte 8))))
     (as:spawn "./test.sh" '()
               :exit-cb (lambda (proc exit-status term-signal)
                          (declare (ignore proc exit-status term-signal))
                          (format t "proc output:~%~a"
                                  (babel:octets-to-string bytes)))
               :output (list :stream
                             :read-cb (lambda (pipe stream)
                                        (declare (ignore pipe))
                                        (let ((buf (make-array 128 :element-type '(unsigned-byte 8))))
                                          (loop for n = (read-sequence buf stream)
                                                while (plusp n) do
                                                  (setf bytes
                                                        (concatenate '(vector (unsigned-byte 8))
                                                                     bytes
                                                                     (subseq buf 0 n)))))))))))
with
$ cat test.sh
#!/bin/bash
sleep_time=$((1+$RANDOM%10))
echo "Process $$ will sleep for $sleep_time"
sleep $sleep_time
echo "Process $$ exiting"
yields the expected output
It might seem simple, but I can't get it to work. I simply need to read a file where the contents are just one big list
(a b c d)
. . . as is . . . into a list in my program. I have
(let ((ardplst nil))
...
(with-open-file (in ardpfile :direction :input :if-does-not-exist nil)
(when in
(read-sequence ardplst in))
(format t "~a" ardplst))
But it's not working. I get NIL. What am I doing wrong?
What does read-sequence do? It reads some elements from the stream, typically characters (but it depends on the element-type of the stream), and destructively inserts them into the sequence you pass in. So, you would collect characters #\(, then #\a, then #\Space, then #\b, etc. However, reading stops as soon as you reach the end of your sequence: with your empty list, that means immediately (you are supposed to pass a buffer, e.g. a vector). In your case, read-sequence returns 0.
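A small illustration of that behavior, reading from a string stream instead of a file (the function name read-sequence-demo is made up for this example):

```lisp
(defun read-sequence-demo ()
  (with-input-from-string (in "(a b c d)")
    (let ((empty '())              ; nothing to fill, like ardplst
          (buffer (make-string 4)))
      (list (read-sequence empty in)    ; reads nothing
            (read-sequence buffer in)   ; fills all four characters
            buffer))))

(read-sequence-demo) ; => (0 4 "(a b")
```

The empty list receives nothing, while the four-character buffer receives the first four characters of the text, not a parsed list.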
The reason you get nil is because your last expression is format, which in the above code outputs to the standard output (because of t) and returns nil. You could have used print, which returns the printed object.
I don't understand why you are explicitly using :if-does-not-exist nil. Are you sure you want to silently skip the task if the file cannot be opened? What if the list you read is empty? You should probably let an error be signaled in case the file is not found.
I would use read while disabling read-time evaluation:
(with-open-file (in my-file)
  (let* ((*read-eval* nil)
         (list (read in)))
    (prog1 list
      (check-type list list))))
Note that the default :direction is :input. In my opinion it does not hurt to omit this argument here, though sometimes it can be more readable to write it explicitly.
Exactly as the question says. I want to use shared memory to communicate between two lisp processes. Any pointers on how to do that?
I can see some tutorials on doing this in Clozure at:
http://ccl.clozure.com/manual/chapter4.7.html
Can someone point me to a similar library to do this with sbcl?
For a portable implementation, you might want to use the osicat library, which provides a CFFI wrapper for many POSIX calls in the osicat-posix package.
There is a very nice and short article with code for using it at http://wandrian.net/2012-04-07-1352-mmap-files-in-lisp.html (by Nicolas Martyanoff).
To preserve that, I mostly cite from there:
Mapping a file is done by opening it with osicat-posix:open, reading its size with fstat, then calling mmap. Once the file has been mapped we can close the file descriptor, it’s not needed anymore.
(defun mmap-file (path)
  (let ((fd (osicat-posix:open path (logior osicat-posix:o-rdonly))))
    (unwind-protect
         (let* ((size (osicat-posix:stat-size (osicat-posix:fstat fd)))
                (addr (osicat-posix:mmap (cffi:null-pointer) size
                                         (logior osicat-posix:prot-read)
                                         (logior osicat-posix:map-private)
                                         fd 0)))
           (values addr size))
      (osicat-posix:close fd))))
The mmap-file function returns two values: the address of the memory mapping and its size.
Unmapping this chunk of memory is done with osicat-posix:munmap.
Let’s add a macro to safely map and unmap files:
(defmacro with-mmapped-file ((file addr size) &body body)
  (let ((original-addr (gensym "ADDR-"))
        (original-size (gensym "SIZE-")))
    `(multiple-value-bind (,addr ,size)
         (mmap-file ,file)
       (let ((,original-addr ,addr)
             (,original-size ,size))
         (unwind-protect
              (progn ,@body)
           (osicat-posix:munmap ,original-addr ,original-size))))))
This macro mmaps the given file and binds the two given variables to its address and size. You can then calculate address pointers with cffi:inc-pointer and access the file contents with cffi:mem-aref. You might want to build your own wrappers around this to represent the format of your file (e.g. plain text in UTF-8).
(In comparison to the posting linked above, I removed the wrapping of osicat-posix:munmap into another function of exactly the same signature and effect, because it seemed superfluous to me.)
There is a low-level mmap function bundled with SBCL:
CL-USER> (apropos "MMAP")
SB-POSIX:MMAP (fbound)
; No value
CL-USER> (describe 'sb-posix:mmap)
SB-POSIX:MMAP
[symbol]
MMAP names a compiled function:
Lambda-list: (ADDR LENGTH PROT FLAGS FD OFFSET)
Derived type: (FUNCTION (T T T T T T)
(VALUES SYSTEM-AREA-POINTER &OPTIONAL))
Inline proclamation: INLINE (inline expansion available)
Source file: SYS:CONTRIB;SB-POSIX;INTERFACE.LISP.NEWEST
; No value
You have to use explicit address arithmetic with it, as in C.
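As a rough, SBCL-only sketch of what that address arithmetic can look like, the following maps a file read-only and counts its newline bytes by offset from the mapped address. The function name count-newlines-mmap is made up for this example. It only reads a private mapping; actually sharing memory between two processes would need a shared mapping (sb-posix:map-shared) of a file both processes open for writing, plus some synchronization, none of which is shown here.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :sb-posix))

(defun count-newlines-mmap (path)
  "Count LF bytes in the file at PATH by scanning an mmap'ed view of it."
  (let ((fd (sb-posix:open path sb-posix:o-rdonly)))
    (unwind-protect
         (let ((size (sb-posix:stat-size (sb-posix:fstat fd))))
           (if (zerop size)
               0                      ; mmap rejects a zero-length mapping
               (let ((sap (sb-posix:mmap (sb-sys:int-sap 0) size
                                         sb-posix:prot-read
                                         sb-posix:map-private
                                         fd 0)))
                 (unwind-protect
                      ;; SAP-REF-8 reads the byte at SAP + OFFSET.
                      (loop for offset from 0 below size
                            count (= 10 (sb-sys:sap-ref-8 sap offset)))
                   (sb-posix:munmap sap size)))))
      (sb-posix:close fd))))
```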