Common LISP: convert (unknown) struct object to plist? - common-lisp

(defstruct (mydate (:constructor make-mydate (year month day)))
(year 1970)
(month 1)
(day 1))
(defvar *date1* (make-mydate 1992 1 1))
The problem is more general, but say I would like to convert an object like date1 to a "document" I can persist to a database (e.g. mongoDB, using package cl-mongo). So I write
(defun mydate->document (mydate)
(cl-mongo:$ (cl-mongo:$ "year" (mydate-year mydate))
(cl-mongo:$ "month" (mydate-month mydate))
(cl-mongo:$ "day" (mydate-day mydate))))
REPL--> (mydate->doc *date1*)
kv-container : #(#S(CL-MONGO::PAIR :KEY year :VALUE 1992)
#S(CL-MONGO::PAIR :KEY month :VALUE 1)
#S(CL-MONGO::PAIR :KEY day :VALUE 1))
But could I, instead of having to write down all fields of my struct, obtained their names and values programmatically? After all, my lisp runtime can do that:
REPL--> (describe *date1*)
#S(MYDATE :YEAR 1992 :MONTH 1 :DAY 1)
[structure-object]
Slots with :INSTANCE allocation:
YEAR = 1992
MONTH = 1
DAY = 1
On the other hand, I did not find anything relevant in any book, and I noticed that the library cl-json cannot convert structs to JSON format (even though it does convert lists and CLOS objects). I guess that if there was a function to convert a struct to a plist, the problem would be solved.

There is no portable way.
Implementations do it differently. Probably most have a way to access the names of the slots. It's not clear to me why such functionality is missing from the standard.
LispWorks for example has:
(structure:structure-class-slot-names (find-class 'some-structure-class))
Maybe there is already a compatibility library somewhere. It would make sense to use the Meta-Object Protocol functionality for CLOS also for structure classes.
SBCL:
* (sb-mop:class-slots (find-class 'foo))
(#<SB-PCL::STRUCTURE-EFFECTIVE-SLOT-DEFINITION A>
#<SB-PCL::STRUCTURE-EFFECTIVE-SLOT-DEFINITION B>)
* (mapcar 'sb-mop:slot-definition-name *)
(A B)

What you are looking for is called Meta-Object Protocol.
Common lisp pioneered MOP, but only for CLOS objects, not for defstruct objects.
Some CL implementations support defstruct MOP, but not all of them, you can check that using:
(defstruct s a)
(slot-definition-initargs (car (class-direct-slots (find-class 's))))

This is SBCL specific:
In REPL is can do this:
(struct-to-list '(NIL #S(N :V 78) #S(OP :V #\*)
#S(E :V1 #S(N :V 456) :OP #S(OP :V #\+) :V2 #S(N :V 123))))
which will give this result =>
(NIL
((N (V 78))
((OP (V #\*)) ((E (V1 (N (V 456))) (OP (OP (V #\+))) (V2 (N (V 123)))) NIL))))
The code used:
(defun struct-names (struct)
(loop for sl in (sb-mop::class-direct-slots (class-of struct))
collect (list
(sb-mop:slot-definition-name sl)
(slot-value sl 'sb-pcl::internal-reader-function))))
(defun struct-values (struct)
(cons (type-of struct)
(loop for np in (struct-names struct)
collect (cons (car np)
(funcall (cadr np) struct)))))
(defun structp (val)
(equalp 'STRUCTURE-OBJECT
(sb-mop:class-name (car (sb-mop:class-direct-superclasses (class-of val))))))
(defun struct-to-list (val)
(cond
((structp val)
(loop for v in (struct-values val)
collect (struct-to-list v)))
((consp val)
(cons (struct-to-list (car val))
(struct-to-list (cdr val))))
(T val)))

Related

Common lisp macro not calling function

I am working on a complicated macro and have run into a roadblock.
(defmacro for-each-hashtable-band (body vars on &optional counter name)
`(block o
(with-hash-table-iterator (next-entry ,on)
(destructuring-bind
,(apply #'append vars)
(let ((current-band (list ,#(mapcar #'not (apply #'append vars)))))
(for (i 1 ,(length (apply #'append vars)) 2)
(multiple-value-bind
(succ k v) (next-entry)
(if succ
(progn
(setf (nth i current-band) k)
(setf (nth (+ 1 i) current-band) v))
(return-from o nil))))
current-band)
,#body))))
im getting "Evaluation aborted on #<UNDEFINED-FUNCTION NEXT-ENTRY {100229C693}>"
i dont understand why next-entry appears to be invisible to the macro i have created.
I've tried stripping down this to a small replicable example but i couldnt find a minimal scenario without the macro i created where next-entry would be invisible besides this scenario no matter what I tried, i've always managed to find a way to call next-entry in my other examples so im stumped as to why i cannot get it working here
I've tested the for macro ive created and it seems to generally work in most cases but for some reason it cannot see this next-entry variable. How do i make it visible?
In your code there are multiple places where the macro generates bindings in a way that is subject to variable capture (pdf).
(defmacro for-each-hashtable-band (body vars on &optional counter name)
`(block o ;; VARIABLE CAPTURE
(with-hash-table-iterator (next-entry ,on) ;; VARIABLE CAPTURE
(destructuring-bind ,(apply #'append vars)
(let ((current-band ;;; VARIABLE CAPTURE
(list ,#(mapcar #'not (apply #'append vars)))))
(for
(i ;;; VARIABLE CAPTURE
1 ,(length (apply #'append vars)) 2)
(multiple-value-bind (succ k v) ;;; VARIABLE CAPTURE
,(next-entry) ;;; WRONG EVALUATION TIME
(if succ
(progn
(setf (nth i current-band) k)
(setf (nth (+ 1 i) current-band) v))
(return-from o nil))))
current-band)
,#body))))
A simplified example of such a capture is:
`(let ((x 0)) ,#body)
Here above, the x variable is introduced, but if the code is expanded in a context where xis already bound, then body will not be able to reference that former x binding and will always see x bound to zero (you generally don't want this behavior).
Write a function instead
Instead of writing a big macro for this, let's first try understanding what you want to achieve and write instead a higher-order function, ie. a function that calls user-provided functions.
If I understand correctly, your function iterates over a hash-table by bands of entries. I assume vars holds a list of (key value) pairs of symbols, for example ((k1 v1) (k2 v2)). Then, body works on all the key/value pairs in the band.
In the following code, the function map-each-hashtable-band accepts a function, a hash-table, and instead of vars it accepts a size, the width of the band (the number of pairs).
Notice how in your code, you only have one loop, which builds a band using the hash-table iterator. But then, since the macro is named for-each-hashtable-band, I assume you also want to loop over all the bands. The macro with-hash-table-iterator provides an iterator but does not loop itself. That's why here I have two loops.
(defun map-each-hashtable-band (function hash-table band-size)
(with-hash-table-iterator (next-entry hash-table)
(loop :named outer-loop :do
(loop
:with key and value and next-p
:repeat band-size
:do (multiple-value-setq (next-p key value) (next-entry))
:while next-p
:collect key into current-band
:collect value into current-band
:finally (progn
(when current-band
(apply function current-band))
(unless next-p
(return-from outer-loop)))))))
For example:
(map-each-hashtable-band (lambda (&rest band) (print `(:band ,band)))
(alexandria:plist-hash-table
'(:a 0 :b 1 :c 2 :d 3 :e 4 :f 5 :g 6))
2)
NB. Iterating over a hash-table happens in an arbitrary order, there is no guarantee that you'll see the entries in any particular kind of order, this is implementation-dependant.
With my current version of SBCL this prints the following:
(:BAND (:A 0 :B 1))
(:BAND (:C 2 :D 3))
(:BAND (:E 4 :F 5))
(:BAND (:G 6))
Wrap the function in a macro
The previous function might not be exactly the behavior you want, so you need to adapt to your needs, but once it does what you want, you can wrap a macro around it.
(defmacro for-each-hashtable-band (vars hash-table &body body)
`(map-each-hashtable-band (lambda ,(apply #'append vars) ,#body)
,hash-table
,(length vars)))
For example:
(let ((test (alexandria:plist-hash-table '(:a 0 :b 1 :c 2 :d 3 :e 4 :f 5))))
(for-each-hashtable-band ((k1 v1) (k2 v2)) test
(format t "~a -> ~a && ~a -> ~a ~%" k1 v1 k2 v2)))
This prints:
A -> 0 && B -> 1
C -> 2 && D -> 3
E -> 4 && F -> 5
Macro-only solution, for completeness
If you want to have only one, single macro, you can start by inlining the body of the above function in the macro, you don't need to use apply anymore, but instead you need to establish bindings around the body, using destructuring-bind as you did. A first draft would be to simply as follows, but notice that this is not a proper solution:
(defmacro for-each-hashtable-band (vars hash-table &body body)
(let ((band-size (length vars)))
`(with-hash-table-iterator (next-entry ,hash-table)
(loop :named outer-loop :do
(loop
:with key and value and next-p
:repeat ,band-size
:do (multiple-value-setq (next-p key value) (next-entry))
:while next-p
:collect key into current-band
:collect value into current-band
:finally (progn
(when current-band
(destructuring-bind ,(apply #'append vars) current-band
,#body))
(unless next-p
(return-from outer-loop))))))))
In order to be free of variable capture problems with macros, each temporary variable you introduce must be named after a symbol that cannot exist in any context you expand your code. So instead we first unquote all the variables, making the macro definition fail to compile:
(defmacro for-each-hashtable-band (vars hash-table &body body)
(let ((band-size (length vars)))
`(with-hash-table-iterator (,next-entry ,hash-table)
(loop :named ,outer-loop :do
(loop
:with ,key and ,value and ,next-p
:repeat ,band-size
:do (multiple-value-setq (,next-p ,key ,value) (,next-entry))
:while ,next-p
:collect ,key into ,current-band
:collect ,value into ,current-band
:finally (progn
(when ,current-band
(destructuring-bind ,(apply #'append vars) ,current-band
,#body))
(unless ,next-p
(return-from ,outer-loop))))))))
When compiling the macro, the macro is supposed to inject symbols into the code, but here we have a compilation error that says undefined variables:
;; undefined variables: CURRENT-BAND KEY NEXT-ENTRY NEXT-P OUTER-LOOP VALUE
So now, those variables should be fresh symbols:
(defmacro for-each-hashtable-band (vars hash-table &body body)
(let ((band-size (length vars)))
(let ((current-band (gensym))
(key (gensym))
(next-entry (gensym))
(next-p (gensym))
(outer-loop (gensym))
(value (gensym)))
`(with-hash-table-iterator (,next-entry ,hash-table)
(loop :named ,outer-loop :do
(loop
:with ,key and ,value and ,next-p
:repeat ,band-size
:do (multiple-value-setq (,next-p ,key ,value) (,next-entry))
:while ,next-p
:collect ,key into ,current-band
:collect ,value into ,current-band
:finally (progn
(when ,current-band
(destructuring-bind ,(apply #'append vars) ,current-band
,#body))
(unless ,next-p
(return-from ,outer-loop)))))))))
This above is a bit verbose, but you could simplify that.
Here is what the previous for-each-hashtable-band example expands into with this new macro:
(with-hash-table-iterator (#:g1576 test)
(loop :named #:g1578
:do (loop :with #:g1575
and #:g1579
and #:g1577
:repeat 2
:do (multiple-value-setq (#:g1577 #:g1575 #:g1579) (#:g1576))
:while #:g1577
:collect #:g1575 into #:g1574
:collect #:g1579 into #:g1574
:finally (progn
(when #:g1574
(destructuring-bind
(k1 v1 k2 v2)
#:g1574
(format t "~a -> ~a && ~a -> ~a ~%" k1 v1 k2
v2)))
(unless #:g1577 (return-from #:g1578))))))
Each time you expand it, the #:gXXXX variables are different, and cannot possibly shadow existing bindings, so for example, the body can use variables named like current-band or value without breaking the expanded code.

Programmatically generating symbol macros

I've got a data structure that consists of two parts:
A hash table mapping symbols to indices
A vector of vectors containing data
For example:
(defparameter *h* (make-hash-table))
(setf (gethash 'a *h*) 0)
(setf (gethash 'b *h*) 1)
(setf (gethash 'c *h*) 2)
(defparameter *v-of-v* #(#(1 2 3 4) ;vector a
#(5 6 7 8) ;vector b
#(9 10 11 12))) ;vector c
I'd like to define a symbol macro to get at vector a without going through the hashmap. At the REPL:
(define-symbol-macro a (aref *v-of-v* 0))
works fine:
* a
#(1 2 3 4)
but there could be potentially many named vectors, and I don't know what the mappings will be ahead of time, so I need to automate this process:
(defun do-all-names ()
(maphash #'(lambda (key index)
(define-symbol-macro key (aref *v-of-v* index)))
*h*))
But that does nothing. And neither does any of the combinations I have tried of making do-all-names a macro, back-quote/comma templates, etc. I am beginning to wonder if this doesn't have something to do with the define-symbol-macro itself. It seems a little used feature, and On Lisp only mentions it twice. Not too many mentions here nor elsewhere either. In this case I'm using SBCL 2.1
Anyone have any ideas?
You need something like above to do it at runtime:
(defun do-all-names ()
(maphash #'(lambda (key index)
(eval `(define-symbol-macro ,key (aref *v-of-v* ,index)))
*h*))
DEFINE-SYMBOL-MACRO is a macro and does not evaluate all its arguments. So you need to generate a new macro form for each argument pair and evaluate it.
The other way to do it, usually at compile time, is to write a macro which generates these forms on the toplevel:
(progn
(define-symbol-macro a (aref *v-of-v* 0))
(define-symbol-macro b (aref *v-of-v* 1))
; ....
)
I'm not too sure on what you mean by "I don't know what the mappings will be ahead of time".
You could do something like:
(macrolet ((define-accessors ()
`(progn
,#(loop for key being the hash-keys of *h*
collect
`(define-symbol-macro ,key (aref *v-of-v* ,(gethash key *h*)))))))
(define-accessors))
If you know you do not require global access, then, you could do:
(defmacro with-named-vector-accessors (&body body) ; is that the name you want?
`(symbol-macrolet (,#(loop for key being the hash-keys of *h*
collect `(,key (aref *v-of-v* ,(gethash key *h*)))))
,#body))
;;; Example Usage:
(with-named-vector-accessors
(list a b c)) ;=> (#(1 2 3 4) #(5 6 7 8) #(9 10 11 12))
Also,
If you know *h* and the indices each symbol maps to at macroexpansion time, the above works.
If you know *h* at macroexpansion but the indices each symbol maps to will change after macroexpansion, you will want to collect (,key (aref *v-of-v* (gethash ,key *h*))).
PS: If you find loop ugly for hash-tables, you could use the iterate library with the syntax:
(iter (for (key value) in-hashtable *h*)
(collect `(,key (aref *v-of-v* ,value))))

Looking for format string which allows custom formatting for list of pairs

So I have lists, looking like this:
((24 . 23) (9 . 6) ... )
and want to custom format the output to something looking like this:
"24/23 9/6 ..."
I tried:
(defun show-pair (ostream pair col-used atsign-used)
(declare (ignore col-used atsign-used))
(format ostream "~d/~d" (first pair) (second pair)))
(let ((x '( 1 . 2))) (format nil "~{~/show-pair/~^~}" (list x)))
as a simple warming up exercise to show a list with only 1 pair. But when trying this in the emacs slime repl, I get the error
The value
2
is not of type
LIST
[Condition of type TYPE-ERROR]
Which, of course is confusing as ~/show-pair/ was expected to handle one entry in the list, which is is the pair, passing the pair to show-pair. But it appears, something else is actually happening.
If you want to do it with format - the problem is to access first and second element of the alist using format directives. I didn't found how I could access them inside a format directive.
However, in such regularly formed structures like an alist, one could flatten the list first and then let format-looping directive consume two elements per looping - then one consumes the pair.
Since the famous :alexandria library doesn't count as dependency in Common Lisp world, one could directly use alexandria:flatten:
(defparameter *al* '((24 . 23) (9 . 6)))
(ql:quickload :alexandria) ;; import alexandria library
(format nil "~{~a/~a~^ ~}" (alexandria:flatten *al*))
;; => "24/23 9/6"
nil return as string
~{ ~} loop over the list
~a/~a the fraction
~^ empty space between the elements but not after last element
flatten by the way without :alexandria-"dependency" would be in this case:
(defun flatten (l)
(cond ((null l) nil)
((atom l) (list l))
(t (append (flatten (car l)) (flatten (cdr l))))))
While waiting for feedback, I found out, where the problem is coming from:
So far, I considered (second x) to behave exactly like (cdr x) but for tagged values, this assumption is wrong. If I change in show-pairs above (in the question) accordingly, it all works.
So, it is not a formatting problem at all, nor any surprises with how ~/foo~/ works.
(defun show-pair (ostream pair col-used atsign-used)
(declare (ignore col-used atsign-used))
(format ostream "~d/~d" (first pair) (cdr pair))) ;; second -> cdr fixes the problem
It is fairly seldom a good idea to use ~/.../ in format in my experience. It's probably much better to simply turn the list you have into the list you need and then process that directly. So, for instance:
> (format t "~&~:{~D/~D~:^, ~}~%"
(mapcar (lambda (p)
(list (car p) (cdr p)))
'((1 . 2) (3 . 4))))
1/2, 3/4
nil
Or if you want to use loop:
> (format t "~&~:{~D/~D~:^, ~}.~%"
(loop for (n . d) in '((1 . 2) (3 . 4))
collect (list n d)))
1/2, 3/4.
nil
The cost (and storage) associated with converting the list is likely to be absolutely tiny compared with the I/O cost of printing it.
Using hints from Redefinition of the print-object method for conses..., you could end up with something like this:
CL-USER> (let ((std-function (pprint-dispatch 'cons)))
(unwind-protect
(progn (set-pprint-dispatch
'cons
(lambda (s o) (format s "~d/~d" (car o) (cdr o))))
(format t "~{~a~%~}" '((23 . 24) (5 . 9))))
(set-pprint-dispatch 'cons std-function))
(format t "~{~a~%~}" '((23 . 24) (5 . 9))))
23/24
5/9
(23 . 24)
(5 . 9)
NIL
CL-USER>
Hiding the bookkeeping;
(defmacro with-fractional-conses (&body body)
(let ((std-function (gensym "std-function")))
`(let ((,std-function (pprint-dispatch 'cons)))
(unwind-protect
(progn (set-pprint-dispatch
'cons
(lambda (s o) (format s "~d/~d"
(car o)
(cdr o))))
,#body)
(set-pprint-dispatch 'cons ,std-function)))))
CL-USER> (with-fractional-conses
(format t "~{~a~%~}"
'((23 . 24) (5 . 9))))
23/24
5/9
NIL
CL-USER>

Pointers in Common Lisp

I want to save a reference (pointer) to a part of some Data I saved in another variable:
(let ((a (list 1 2 3)))
(let ((b (car (cdr a)))) ;here I want to set b to 2, but it is set to a copy of 2
(setf b 4))
a) ;evaluates to (1 2 3) instead of (1 4 2)
I could use macros, but then there would ever be much code to be executed if I want to change some Data in the middle of a list and I am not very flexible:
(defparameter *list* (create-some-list-of-arrays))
(macrolet ((a () '(nth 1000 *list*)))
(macrolet ((b () `(aref 100 ,(a))))
;; I would like to change the macro a here if it were possible
;; but then b would mean something different
(setf (b) "Hello")))
Is it possible, to create a variable as a reference and not as a copy?
cl-user> (let ((a '(1 2 3)))
(let ((b (car (cdr a))))
(setf b 4))
a)
;Compiler warnings :
; In an anonymous lambda form: Unused lexical variable B
(1 2 3)
A cons cell is a pair of pointers. car dereferences the first, and cdr dereferences the second. Your list is effectively
a -> [ | ] -> [ | ] -> [ | ] -> NIL
| | |
1 2 3
Up top where you're defining b, (cdr a) gets you that second arrow. Taking the car of that dereferences the first pointer of that second cell and hands you its value. In this case, 2. If you want to change the value of that pointer, you need to setf it rather than its value.
cl-user> (let ((a '(1 2 3)))
(let ((b (cdr a)))
(setf (car b) 4))
a)
(1 4 3)
If all you need is some syntactic sugar, try symbol-macrolet:
(let ((a (list 1 2 3 4)))
(symbol-macrolet ((b (car (cdr a))))
(format t "~&Old: ~S~%" b)
(setf b 'hello)
(format t "~&New: ~S~%" b)))
Note, that this is strictly a compile-time thing. Anywhere (in the scope of the symbol-macrolet), where b is used as variable, it is expanded into (car (cdr a)) at compile time. As Sylwester already stated, there are no "references" in Common Lisp.
I wouldn't recommend this practice for general use, though.
And by the way: never change quoted data. Using (setf (car ...) ...) (and similar) on a constant list literal like '(1 2 3) will have undefined consequences.
Building on what Baggers suggested. Not exactly what you are looking for but you can define setf-expanders to create 'accessors'. So lets say your list contains information about people in the for of (first-name last-name martial-status) and when someone marries you can update it as:
(defun marital-status (person)
(third person))
(defun (setf marital-status) (value person)
(setf (third person) value))
(let ((person (list "John" "Doe" "Single")))
(setf (marital-status person) "Married")
person)
;; => ("John" "Doe" "Married")

Possible to do this without using eval in Common Lisp?

In my little project I have two arrays, lets call them A and B. Their values are
#(1 2 3) and #(5 6 7). I also have two lists of symbols of identical length, lets call them C and D. They look like this: (num1 num2 num3) and (num2 num3 num4).
You could say that the symbols in lists C and D are textual labels for the values in the arrays A and B. So num1 in A is 1. num2 in A is 2. num2 in B is 5. There is no num1 in B, but there is a num3, which is 6.
My goal is to produce a function taking two arguments like so:
(defun row-join-function-factory (C D)
...body...)
I want it to return a function of two arguments:
(lambda (A B) ...body...)
such that this resulting function called with arguments A and B results in a kind of "join" that returns the new array: #(1 5 6 7)
The process taking place in this later function obtained values from the two arrays A and B such that it produces a new array whose members may be represented by (union C D). Note: I haven't actually run (union C D), as I don't actually care about the order of the symbols contained therein, but lets assume it returns (num1 num2 num3 num4). The important thing is that (num1 num2 num3 num4) corresponds as textual labels to the new array #(1 5 6 7). If num2, or any symbol, exists in both C and D, and subsequently represents values from A and B, then the value from B corresponding to that symbol is kept in the resulting array rather than the value from A.
I hope that gets the gist of the mechanical action here. Theoretically, I want row-join-function-factory to be able to do this with arrays and symbol-lists of any length/contents, but writing such a function is not beyond me, and not the question.
The thing is, I wish the returned function to be insanely efficient, which means that I'm not willing to have the function chase pointers down lists, or look up hash tables at run time. In this example, the function I require to be returned would be almost literally:
(lambda (A B)
(make-array 4
:initial-contents (list (aref A 0) (aref B 0) (aref B 1) (aref B 2))))
I do not want the array indexes calculated at run-time, or which array they are referencing. I want a compiled function that does this and this only, as fast as possible, which does as little work as possible. I do not care about the run-time work required to make such a function, only the run-time work required in applying it.
I have settled upon the use of (eval ) in row-join-function-factory to work on symbols representing the lisp code above to produce this function. I was wondering, however, if there is not some simpler method to pull off this trick that I am not thinking of, given one's general cautiousness about the use of eval...
By my reasoning, i cannot use macros by themselves, as they cannot know what all values and dimensions A, B, C, D could take at compile time, and while I can code up a function that returns a lambda which mechanically does what I want, I believe my versions will always be doing some kind of extra run-time work/close over variables/etc...compared to the hypothetical lambda function above
Thoughts, answers, recommendations and the like are welcome. Am I correct in my conclusion that this is one of those rare legitimate eval uses? Apologies ahead of time for my inability to express the problem as eloquently in english...
(or alternatively, if someone can explain where my reasoning is off, or how to dynamically produce the most efficient functions...)
From what I understand, you need to precompute the vector size and the aref args.
(defun row-join-function-factory (C D)
(flet ((add-indices (l n)
(loop for el in l and i from 0 collect (list el n i))))
(let* ((C-indices (add-indices C 0))
(D-indices (add-indices D 1))
(all-indices (append D-indices
(set-difference C-indices
D-indices
:key #'first)))
(ns (mapcar #'second all-indices))
(is (mapcar #'third all-indices))
(size (length all-indices)))
#'(lambda (A B)
(map-into (make-array size)
#'(lambda (n i)
(aref (if (zerop n) A B) i))
ns is)))))
Note that I used a number to know if either A or B should be used instead of capturing C and D, to allow them to be garbage collected.
EDIT: I advise you to profile against a generated function, and observe if the overhead of the runtime closure is higher than e.g. 5%, against a special-purpose function:
(defun row-join-function-factory (C D)
(flet ((add-indices (l n)
(loop for el in l and i from 0 collect (list el n i))))
(let* ((C-indices (add-indices C 0))
(D-indices (add-indices D 1))
(all-indices (append D-indices
(set-difference C-indices
D-indices
:key #'first)))
(ns (mapcar #'second all-indices))
(is (mapcar #'third all-indices))
(size (length all-indices))
(j 0))
(compile
nil
`(lambda (A B)
(let ((result (make-array ,size)))
,#(mapcar #'(lambda (n i)
`(setf (aref result ,(1- (incf j)))
(aref ,(if (zerop n) 'A 'B) ,i)))
ns is)
result))))))
And validate if the compilation overhead indeed pays off in your implementation.
I argue that if the runtime difference between the closure and the compiled lambda is really small, keep the closure, for:
A cleaner coding style
Depending on the implementation, it might be easier to debug
Depending on the implementation, the generated closures will share the function code (e.g. closure template function)
It won't require a runtime license that includes the compiler in some commercial implementations
I think the right approach is to have a macro which would compute the indexes at compile time:
(defmacro my-array-generator (syms-a syms-b)
(let ((table '((a 0) (b 0) (b 1) (b 2)))) ; compute this from syms-a and syms-b
`(lambda (a b)
(make-array ,(length table) :initial-contents
(list ,#(mapcar (lambda (ai) (cons 'aref ai)) table))))))
And it will produce what you want:
(macroexpand '(my-array-generator ...))
==>
#'(LAMBDA (A B)
(MAKE-ARRAY 4 :INITIAL-CONTENTS
(LIST (AREF A 0) (AREF B 0) (AREF B 1) (AREF B 2))))
So, all that is left is to write a function which will produce
((a 0) (b 0) (b 1) (b 2))
given
syms-a = (num1 num2 num3)
and
syms-b = (num2 num3 num4)
Depends on when you know the data. If all the data is known at compile time, you can use a macro (per sds's answer).
If the data is known at run-time, you should be looking at loading it into an 2D array from your existing arrays. This - using a properly optimizing compiler - should imply that a lookup is several muls, an add, and a dereference.
By the way, can you describe your project in a wee bit more detail? It sounds interesting. :-)
Given C and D you could create a closure like
(lambda (A B)
(do ((result (make-array n))
(i 0 (1+ i)))
((>= i n) result)
(setf (aref result i)
(aref (if (aref use-A i) A B)
(aref use-index i)))))
where n, use-A and use-index are precomputed values captured in the closure like
n --> 4
use-A --> #(T nil nil nil)
use-index --> #(0 0 1 2)
Checking with SBCL (speed 3) (safety 0) the execution time was basically identical to the make-array + initial-contents version, at least for this simple case.
Of course creating a closure with those precomputed data tables doesn't even require a macro.
Have you actually timed how much are you going to save (if anything) using an unrolled compiled version?
EDIT
Making an experiment with SBCL the closure generated by
(defun merger (clist1 clist2)
(let ((use1 (list))
(index (list))
(i1 0)
(i2 0))
(dolist (s1 clist1)
(if (find s1 clist2)
(progn
(push NIL use1)
(push (position s1 clist2) index))
(progn
(push T use1)
(push i1 index)))
(incf i1))
(dolist (s2 clist2)
(unless (find s2 clist1)
(push NIL use1)
(push i2 index))
(incf i2))
(let* ((n (length index))
(u1 (make-array n :initial-contents (nreverse use1)))
(ix (make-array n :initial-contents (nreverse index))))
(declare (type simple-vector ix)
(type simple-vector u1)
(type fixnum n))
(print (list u1 ix n))
(lambda (a b)
(declare (type simple-vector a)
(type simple-vector b))
(let ((result (make-array n)))
(dotimes (i n)
(setf (aref result i)
(aref (if (aref u1 i) a b)
(aref ix i))))
result)))))
runs about 13% slower than an hand-written version providing the same type declarations (2.878s instead of 2.529s for 100,000,000 calls for the (a b c d)(b d e f) case, a 6-elements output).
The inner loop for the data based closure version compiles to
; 470: L2: 4D8B540801 MOV R10, [R8+RCX+1] ; (aref u1 i)
; 475: 4C8BF7 MOV R14, RDI ; b
; 478: 4C8BEE MOV R13, RSI ; source to use (a for now)
; 47B: 4981FA17001020 CMP R10, 537919511 ; (null R10)?
; 482: 4D0F44EE CMOVEQ R13, R14 ; if true use b instead
; 486: 4D8B540901 MOV R10, [R9+RCX+1] ; (aref ix i)
; 48B: 4B8B441501 MOV RAX, [R13+R10+1] ; load (aref ?? i)
; 490: 4889440B01 MOV [RBX+RCX+1], RAX ; store (aref result i)
; 495: 4883C108 ADD RCX, 8 ; (incf i)
; 499: L3: 4839D1 CMP RCX, RDX ; done?
; 49C: 7CD2 JL L2 ; no, loop back
The conditional is not compiled to a jump but to a conditional assignment (CMOVEQ).
I see a little room for improvement (e.g. using CMOVEQ R13, RDI directly, saving an instruction and freeing a register) but I don't think this would shave off that 13%.

Resources