Matrix-multiplication using BLAS from Common Lisp

Let's say I have two matrices (in the form of a Common Lisp array) foo and bar such that:
(defvar foo #2A((2 1 6) (7 3 4)))
(defvar bar #2A((3 1) (6 5) (2 3)))
I would like to perform a matrix multiplication using BLAS without using wrappers such as Matlisp, GSLL, LLA, & co. so that I get an array with the result:
#2A((24 25) (47 34))
Which steps should I take to perform such an operation?
My understanding is that I should call the BLAS matrix multiplication function from the REPL and pass it my arguments foo and bar.
In R, I can easily do it like this:
foo %*% bar
How can I do it in Common Lisp?
Disclaimer:
1) I use SBCL
2) I am not a seasoned computer scientist

Here's the perfect answer I was looking for. Credits to Miroslav Urbanek from Charles University in Prague.
"Here's the basic idea. I find a function I want to use from
BLAS/LAPACK. In case of matrix multiplication, it's DGEMM. "D" stands
for double float, "GE" stands for general matrices (without a special
shape like symmetric, triangular, etc.), and "MM" stands for matrix
multiplication. The documentation is here:
http://www.netlib.org/lapack/explore-html/d7/d2b/dgemm_8f.html
Then I define an alien routine using SBCL FFI. I pass Lisp array
directly using some special SBCL functions. The Lisp arrays must be
created with an option :element-type 'double-float.
An important point is that SBCL stores array elements in row-major
order, similarly to C. Fortran uses column-major order. This
effectively corresponds to transposed matrices. The order of matrices
and their dimensions must be therefore changed when calling DGEMM from
Lisp."
;; Matrix multiplication in SBCL using BLAS
;; Miroslav Urbanek <mu@miroslavurbanek.com>
(load-shared-object "libblas.so.3")
(declaim (inline dgemm))
(define-alien-routine ("dgemm_" dgemm) void
  (transa c-string)
  (transb c-string)
  (m int :copy)
  (n int :copy)
  (k int :copy)
  (alpha double :copy)
  (a (* double))
  (lda int :copy)
  (b (* double))
  (ldb int :copy)
  (beta double :copy)
  (c (* double))
  (ldc int :copy))
(defun pointer (array)
  (sap-alien (sb-sys:vector-sap (array-storage-vector array)) (* double)))
(defun mm (a b)
  (unless (= (array-dimension a 1) (array-dimension b 0))
    (error "Matrix dimensions do not match."))
  (let* ((m (array-dimension a 0))
         (n (array-dimension b 1))
         (k (array-dimension a 1))
         (c (make-array (list m n) :element-type 'double-float)))
    (sb-sys:with-pinned-objects (a b c)
      (dgemm "n" "n" n m k 1d0 (pointer b) n (pointer a) k 0d0 (pointer c) n))
    c))
(defparameter a (make-array '(2 3) :element-type 'double-float :initial-contents '((2d0 1d0 6d0) (7d0 3d0 4d0))))
(defparameter b (make-array '(3 2) :element-type 'double-float :initial-contents '((3d0 1d0) (6d0 5d0) (2d0 3d0))))
(format t "a = ~A~%b = ~A~%" a b)
(defparameter c (mm a b))
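As a cross-check for the BLAS result, here is a minimal pure Common Lisp sketch that needs no FFI. The names storage-order and naive-mm are hypothetical helpers, not part of the answer above. storage-order lists the elements in the order a Lisp array stores them (row-major), which is the sequence DGEMM would read as a column-major, i.e. transposed, matrix; naive-mm is a plain triple loop whose result can be compared with what mm returns.

```lisp
;; Hypothetical helpers for illustration; plain Common Lisp, no BLAS needed.

(defun storage-order (array)
  "Return the elements of ARRAY as a list, in row-major storage order."
  (loop for i below (array-total-size array)
        collect (row-major-aref array i)))

(defun naive-mm (a b)
  "Multiply the 2D arrays A (m x k) and B (k x n) with a triple loop."
  (let ((m (array-dimension a 0))
        (k (array-dimension a 1))
        (n (array-dimension b 1)))
    (unless (= k (array-dimension b 0))
      (error "Matrix dimensions do not match."))
    (let ((c (make-array (list m n) :initial-element 0)))
      (dotimes (i m c)
        (dotimes (j n)
          (dotimes (l k)
            (incf (aref c i j) (* (aref a i l) (aref b l j)))))))))

(storage-order #2A((2 1 6) (7 3 4)))
;; => (2 1 6 7 3 4)
(naive-mm #2A((2 1 6) (7 3 4)) #2A((3 1) (6 5) (2 3)))
;; => #2A((24 25) (47 34))
```

With the double-float arrays a and b defined above, (mm a b) should hold the same values as double-floats.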

In R you are using the R wrapper. You cannot avoid using a "wrapper", so you should use the one that best suits you.
Sorry if this isn't very helpful, but that's how things are.
Marco

Related

swap elements between two lists in LISP

thanks for your support, I am a newbie...
I would like to swap elements BETWEEN two lists in Common-LISP given a certain index of the first and second list, for example:
(1 2 3 4) (A B C D) -> (D 2 3 4) when specified indexes are (0 3).
It might look randomish but it has a nice utility in musical sequences...
Thanks,
Alessandro

If you need to use an index, maybe a vector can be more sensible. Use for example ROTATEF, as explained by jkiiski:
CL-USER> (let ((a (vector 1 2 3 4))
               (b (vector 'a 'b 'c 'd)))
           (rotatef (aref a 0) (aref b 3))
           (values a b))
#(D 2 3 4)
#(A B C 1)
If you really want to use lists, then use NTH, or ELT, which works on both kinds of sequences.
Preemptive remark: you cannot modify constant data. Note how vectors a and b are allocated at runtime. Constant data is data that was computed at read-time or compile-time, and should not be modified at runtime. Quoted lists are constant, as shown by this example:
CL-USER> (let ((list '(a b))) (setf (first list) 0) list)
; in: LET ((LIST '(A B)))
; (SETF (FIRST LIST) 0)
; ==>
; (SB-KERNEL:%RPLACA LIST 0)
;
; caught WARNING:
; Destructive function SB-KERNEL:%RPLACA called on constant data: (A B).
; See also:
; The ANSI Standard, Special Operator QUOTE
; The ANSI Standard, Section 3.2.2.3
;
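For the list case, a minimal sketch using ROTATEF with NTH could look like the following. The function name swap-between is a hypothetical helper introduced only for illustration. Note that the lists are built with LIST at runtime, so they are not constant data.

```lisp
;; Hypothetical helper; the lists are modified in place, so they must be
;; freshly allocated (e.g. with LIST or COPY-LIST), not quoted constants.
(defun swap-between (list-a index-a list-b index-b)
  "Destructively swap element INDEX-A of LIST-A with element INDEX-B of LIST-B."
  (rotatef (nth index-a list-a) (nth index-b list-b))
  (values list-a list-b))

(swap-between (list 1 2 3 4) 0 (list 'a 'b 'c 'd) 3)
;; => (D 2 3 4)
;;    (A B C 1)
```

NTH walks the list from the front on every access, so for long sequences with frequent swaps a vector is likely the better fit.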

Computing linear combination of vectors in Common Lisp

I'm working on some numerical computations in Common Lisp and I need to compute a linear combination of several vectors with given numerical coefficients. I'm rewriting a piece of Fortran code, where this can be accomplished by res = a1*vec1 + a2*vec2 + ... + an*vecn. My initial take in CL was to simply write each time something like:
(map 'vector
     (lambda (x1 x2 ... xn)
       (+ (* x1 a1) (* x2 a2) ... (* xn an)))
     vec1 vec2 ... vecn)
But I soon noticed that this pattern would recur over and over again, and so started writing some code to abstract it away. Because the number of vectors and hence the number of lambda's arguments would vary from place to place, I figured a macro would be required. I came up with the following:
(defmacro vec-lin-com (coefficients vectors &key (type 'vector))
  (let ((args (loop for v in vectors collect (gensym))))
    `(map ',type
          (lambda ,args
            (+ ,@(mapcar #'(lambda (c a) (list '* c a)) coefficients args)))
          ,@vectors)))
Macroexpanding the expression:
(vec-lin-com (10 100 1000) (#(1 2 3) #(4 5 6) #(7 8 9)))
yields the seemingly correct expansion:
(MAP 'VECTOR
     (LAMBDA (#:G720 #:G721 #:G722)
       (+ (* 10 #:G720) (* 100 #:G721) (* 1000 #:G722)))
     #(1 2 3) #(4 5 6) #(7 8 9))
So far, so good...
Now, when I try to use it inside a function like this:
(defun vector-linear-combination (coefficients vectors &key (type 'vector))
  (vec-lin-com coefficients vectors :type type))
I get a compilation error stating essentially that The value VECTORS is not of type LIST. I'm not sure how to approach this. I feel I'm missing something obvious. Any help will be greatly appreciated.

You've fallen into the literal trap. Macros rewrite syntax, so when you pass three literal vectors in a syntax list, the macro can iterate over them at compile time; replacing that list with a binding to a list is not the same. The macro only sees the code, and when it does its work it doesn't know what vectors will eventually be bound to at runtime. You should perhaps make it a function instead:
(defun vec-lin-com (coefficients vectors &key (type 'vector))
  (apply #'map
         type
         (lambda (&rest values)
           (loop :for coefficient :in coefficients
                 :for value :in values
                 :sum (* coefficient value)))
         vectors))
Now your initial test won't work, since you passed syntax and not lists. You need to quote literals:
(vec-lin-com '(10 100 1000) '(#(1 2 3) #(4 5 6) #(7 8 9)))
; ==> #(7410 8520 9630)
(defparameter *coefficients* '(10 100 1000))
(defparameter *test* '(#(1 2 3) #(4 5 6) #(7 8 9)))
(vec-lin-com *coefficients* *test*)
; ==> #(7410 8520 9630)
Now you could make this a macro, but most of the work would be done by the expansion rather than by the macro itself, so your macro would basically expand to code similar to what my function does.

Remember that macros are expanded at compile-time, so the expression ,@(mapcar #'(lambda (c a) (list '* c a)) coefficients args) has to be meaningful at compile-time. In this case, all that the macro gets for coefficients and vectors are the symbols coefficients and vectors from the source code.
If you want to be able to call vec-lin-com with an unknown set of arguments (unknown at compile-time, that is), you'll want to define it as a function. It sounds like the main problem you're having is getting the arguments to + correctly ordered. There's a trick using apply and map to transpose a matrix that may help.
(defun vec-lin-com (coefficients vectors)
  (labels
      ((scale-vector (scalar vector)
         (map 'vector #'(lambda (elt) (* scalar elt)) vector))
       (add-vectors (vectors)
         (apply #'map 'vector #'+ vectors)))
    (let ((scaled-vectors (mapcar #'scale-vector coefficients vectors)))
      (add-vectors scaled-vectors))))
This isn't the most efficient code in the world; it does a lot of unnecessary consing. But it is effective, and if you find this to be a bottleneck you can write more efficient versions, including some that can take advantage of compile-time constants.
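One possible lower-consing variant, as a rough sketch: allocate the result once and accumulate into it, instead of building a scaled copy of every vector. The name vec-lin-com-accumulate is hypothetical, and the sketch assumes that all vectors have the same length as the first one and that coefficients and vectors are lists of equal length.

```lisp
;; Hypothetical variant; assumes every vector is as long as the first one.
(defun vec-lin-com-accumulate (coefficients vectors)
  "Return the sum of COEFFICIENTS[j] * VECTORS[j] as a fresh vector."
  (let* ((n (length (first vectors)))
         (result (make-array n :initial-element 0)))
    (loop for c in coefficients
          for v in vectors
          do (dotimes (i n)
               (incf (aref result i) (* c (aref v i)))))
    result))

(vec-lin-com-accumulate '(10 100 1000) '(#(1 2 3) #(4 5 6) #(7 8 9)))
;; => #(7410 8520 9630)
```

Only the result vector is allocated here; whether that matters in practice is something to measure rather than assume.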

Collecting to a vector instead of a list

I solved Project Euler's 8th problem using SBCL and the iterate package from quicklisp. In my code I defined a function that turns a number into a list of its digits. Here's the source code:
(defun number-to-list (n)
  (iter (for c in-string (write-to-string n)) (collect (digit-char-p c))))
The collect clause in both iter and loop makes a list out of the values. Is it possible to generate a vector (one-dimensional array) instead?
Would my only option be to convert the list generated by number-to-list to a vector? Because that seems inefficient (although probably not that inefficient).

Usually there is one big problem: how large will the result vector be? It would be best to know that upfront, then we can allocate the vector once with the correct size. Otherwise we would have to find ways to deal with that: use a resizable vector, allocate a list first and copy into a result vector later, allocate a larger vector with a fill pointer, ...
If you have a sequence, then one can use the Common Lisp function MAP: if the source object is a vector, here a string, its length is cheap to get.
CL-USER 1 > (map 'vector
                 #'digit-char-p
                 (write-to-string 5837457324534))
#(5 8 3 7 4 5 7 3 2 4 5 3 4)
You can use ITERATE and collect a vector:
FOO 32 > (defun number-to-vector (n)
           (iter (for c in-string (write-to-string n))
                 (collect (digit-char-p c) result-type vector)))
NUMBER-TO-VECTOR
FOO 33 > (number-to-vector 8573475934)
#(8 5 7 3 4 7 5 9 3 4)
If you look at the macro expansion, it actually collects into a list and then calls COERCE to create the vector. So: no win in efficiency.
Note that this is another example where ITERATE is more powerful than LOOP: the standard LOOP can't directly return vectors from collect.
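The resizable-vector option mentioned above can be sketched with standard Common Lisp only, using an adjustable vector with a fill pointer and VECTOR-PUSH-EXTEND. The function name number-to-vector-extend is hypothetical, and the result is an adjustable vector rather than a simple vector.

```lisp
;; Hypothetical helper: grow an adjustable vector instead of collecting a list.
(defun number-to-vector-extend (n)
  "Return the decimal digits of the non-negative integer N as an adjustable vector."
  (let ((result (make-array 0 :adjustable t :fill-pointer 0)))
    (loop for c across (write-to-string n :base 10 :radix nil)
          do (vector-push-extend (digit-char-p c) result))
    result))

(number-to-vector-extend 8573475934)
;; => #(8 5 7 3 4 7 5 9 3 4)
```

When the final length is known in advance, as it is for a string, MAP as shown above is simpler; the growing vector is mainly useful when the number of elements is not known upfront.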

The proposed solutions are correct and elegant, but they first create a list or transform the number into a string. I would like to propose a direct transformation from integers to arrays, without first converting the number to a list or a string:
(defun digits (n)
  "Transform a positive integer n into an array of digits"
  (let* ((logn (floor (log n 10)))
         (result (make-array (1+ logn) :element-type '(integer 0 9))))
    (loop for i downfrom logn to 0
          do (setf (values n (aref result i)) (floor n 10)))
    result))
The problem of allocating an array of the correct dimension is solved with the formula that gives the number of decimal digits of a positive integer n: ⌊log10 n⌋+1. Note that (log n 10) is computed in floating point, so depending on the implementation's rounding the count can come out one too small at exact powers of ten or for very large n.
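The floating-point logarithm can be avoided by counting digits with integer division. The following is a minimal sketch of that idea; digit-count and digits-exact are hypothetical names, and the approach trades the closed-form formula for one extra pass of divisions.

```lisp
;; Hypothetical helpers: same idea as above, but the digit count is computed
;; with integer arithmetic only, so it is exact for any non-negative integer.
(defun digit-count (n)
  "Number of decimal digits of the non-negative integer N."
  (loop for m = n then (floor m 10)
        count t
        while (>= m 10)))

(defun digits-exact (n)
  "Return the decimal digits of the non-negative integer N as an array."
  (let* ((len (digit-count n))
         (result (make-array len :element-type '(integer 0 9))))
    (loop for i downfrom (1- len) to 0
          do (setf (values n (aref result i)) (floor n 10)))
    result))

(digits-exact 1000)
;; => #(1 0 0 0)
```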

Maybe not a direct answer to your question but here are my num-to-list and list-to-num functions I frequently use.
(defun num-to-list-helper (n liste)
  (cond ((< n 1) liste)
        (t (num-to-list-helper (truncate n 10) (cons (rem n 10) liste)))))
(defun num-to-list (n)
  (num-to-list-helper n nil))
(defun list-to-num-helper (liste n)
  (if (null liste)
      n
      (list-to-num-helper (cdr liste)
                          (+ n (* (car liste) (expt 10 (1- (length liste))))))))
(defun list-to-num (liste)
  (list-to-num-helper liste 0))
You could try these and see if there's an improvement over converting the number to a string. Personally, I prefer not to use strings for numbers; I consider that an ugly trick I was forced into in my Java days.
You could also convert these functions to a version using vectors and see how they do.

code with incorrect result for big N. Common Lisp

The code below gives the wrong answer. It should give approximately 0.5, which is the average of an array with many random numbers between 0 and 1. I think the problem is that N is "too big", or perhaps the precision of the generated random numbers? The code works well for smaller values of N (10^7, 10^6, etc). Any advice would be helpful.
Thank you in advance.
(defun randvec(n)
(let ((arr (make-array n)))
(dotimes (i n)
(setf (aref arr i) (random 1.0))
)
arr
)
)
(defparameter N (expt 10 8))
(setf *random-state* (make-random-state t))
(defparameter vector1 (randvec N))
(format t "~a~%" (/ (reduce #'+ vector1) (length vector1)))

Precision of floating point numbers
You are computing with single precision floating point numbers. By adding up all random numbers you get a single-float number. The more numbers you add, the larger the running sum becomes, and the coarser the spacing between representable single-floats near it. Assuming IEEE 754 single precision, once the sum reaches 2^24 (about 1.7 × 10^7), adding a value below 1.0 no longer changes it, so the result does not have enough precision.
double-floats like 1.0d0 have a higher precision than single-floats like 1.0f0. By default 1.0 is read as a single-float. (RANDOM 1.0d0) will compute a double float.
(defun randvec (n)
  (let ((v (make-array n)))
    (dotimes (i n v)
      (setf (aref v i) (random 1.0d0))))) ; create a double float random number
(defun test (&optional (n 10))
  (setf *random-state* (make-random-state t))
  (let ((v (randvec n)))
    (/ (reduce #'+ v) (length v))))
Example:
CL-USER 58 > (test (expt 10 8))
0.4999874882753848D0
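The loss of precision can be seen in isolation with a small sketch. It assumes that single-float is an IEEE 754 32-bit float, as on SBCL; absorbed-p is a hypothetical helper name.

```lisp
;; Hypothetical helper: true when adding SMALL to BIG leaves BIG unchanged,
;; i.e. when SMALL is rounded away.
(defun absorbed-p (big small)
  (= (+ big small) big))

;; 2^24 = 16777216: from here on, adjacent single-floats are 2 apart.
(list (absorbed-p 16777216.0f0 0.5f0)   ; single-float: the 0.5 is lost
      (absorbed-p 16777216.0d0 0.5d0))  ; double-float: the 0.5 is kept
;; => (T NIL)
```

With 10^8 random numbers averaging 0.5, the single-float sum gets stuck long before all elements have been added, which is why the computed mean ends up far below 0.5.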
Style
Please use Common Lisp programming style when programming in Common Lisp:
- Don't use global variables unless necessary. Write functions with local variables instead.
- If you define a global variable with defparameter, don't name it n, but *n*.
- Format and indent your code correctly. Indentation should be done with editor help.
- Don't put parentheses on their own line.
See my example above.

Possible to do this without using eval in Common Lisp?

In my little project I have two arrays, let's call them A and B. Their values are #(1 2 3) and #(5 6 7). I also have two lists of symbols of identical length, let's call them C and D. They look like this: (num1 num2 num3) and (num2 num3 num4).
You could say that the symbols in lists C and D are textual labels for the values in the arrays A and B. So num1 in A is 1. num2 in A is 2. num2 in B is 5. There is no num1 in B, but there is a num3, which is 6.
My goal is to produce a function taking two arguments like so:
(defun row-join-function-factory (C D)
...body...)
I want it to return a function of two arguments:
(lambda (A B) ...body...)
such that this resulting function called with arguments A and B results in a kind of "join" that returns the new array: #(1 5 6 7)
The process taking place in this latter function obtains values from the two arrays A and B such that it produces a new array whose members may be represented by (union C D). Note: I haven't actually run (union C D), as I don't actually care about the order of the symbols contained therein, but let's assume it returns (num1 num2 num3 num4). The important thing is that (num1 num2 num3 num4) corresponds as textual labels to the new array #(1 5 6 7). If num2, or any symbol, exists in both C and D, and subsequently represents values from A and B, then the value from B corresponding to that symbol is kept in the resulting array rather than the value from A.
I hope that gets the gist of the mechanical action here. Theoretically, I want row-join-function-factory to be able to do this with arrays and symbol-lists of any length/contents, but writing such a function is not beyond me, and not the question.
The thing is, I wish the returned function to be insanely efficient, which means that I'm not willing to have the function chase pointers down lists, or look up hash tables at run time. In this example, the function I require to be returned would be almost literally:
(lambda (A B)
  (make-array 4
              :initial-contents (list (aref A 0) (aref B 0) (aref B 1) (aref B 2))))
I do not want the array indexes calculated at run-time, or which array they are referencing. I want a compiled function that does this and this only, as fast as possible, which does as little work as possible. I do not care about the run-time work required to make such a function, only the run-time work required in applying it.
I have settled upon the use of (eval ) in row-join-function-factory to work on symbols representing the lisp code above to produce this function. I was wondering, however, if there is not some simpler method to pull off this trick that I am not thinking of, given one's general cautiousness about the use of eval...
By my reasoning, I cannot use macros by themselves, as they cannot know what values and dimensions A, B, C, D could take at compile time. And while I can code up a function that returns a lambda which mechanically does what I want, I believe my versions will always be doing some kind of extra run-time work (closing over variables, etc.) compared to the hypothetical lambda function above.
Thoughts, answers, recommendations and the like are welcome. Am I correct in my conclusion that this is one of those rare legitimate eval uses? Apologies ahead of time for my inability to express the problem as eloquently in English...
(or alternatively, if someone can explain where my reasoning is off, or how to dynamically produce the most efficient functions...)

From what I understand, you need to precompute the vector size and the aref args.
(defun row-join-function-factory (C D)
  (flet ((add-indices (l n)
           (loop for el in l and i from 0 collect (list el n i))))
    (let* ((C-indices (add-indices C 0))
           (D-indices (add-indices D 1))
           (all-indices (append D-indices
                                (set-difference C-indices
                                                D-indices
                                                :key #'first)))
           (ns (mapcar #'second all-indices))
           (is (mapcar #'third all-indices))
           (size (length all-indices)))
      #'(lambda (A B)
          (map-into (make-array size)
                    #'(lambda (n i)
                        (aref (if (zerop n) A B) i))
                    ns is)))))
Note that I used a number to know if either A or B should be used instead of capturing C and D, to allow them to be garbage collected.
EDIT: I advise you to profile against a generated function, and observe if the overhead of the runtime closure is higher than e.g. 5%, against a special-purpose function:
(defun row-join-function-factory (C D)
  (flet ((add-indices (l n)
           (loop for el in l and i from 0 collect (list el n i))))
    (let* ((C-indices (add-indices C 0))
           (D-indices (add-indices D 1))
           (all-indices (append D-indices
                                (set-difference C-indices
                                                D-indices
                                                :key #'first)))
           (ns (mapcar #'second all-indices))
           (is (mapcar #'third all-indices))
           (size (length all-indices))
           (j 0))
      (compile
       nil
       `(lambda (A B)
          (let ((result (make-array ,size)))
            ,@(mapcar #'(lambda (n i)
                          `(setf (aref result ,(1- (incf j)))
                                 (aref ,(if (zerop n) 'A 'B) ,i)))
                      ns is)
            result))))))
And validate if the compilation overhead indeed pays off in your implementation.
I argue that if the runtime difference between the closure and the compiled lambda is really small, keep the closure, for:
- A cleaner coding style
- Depending on the implementation, it might be easier to debug
- Depending on the implementation, the generated closures will share the function code (e.g. closure template function)
- It won't require a runtime license that includes the compiler in some commercial implementations

I think the right approach is to have a macro which would compute the indexes at compile time:
(defmacro my-array-generator (syms-a syms-b)
  (let ((table '((a 0) (b 0) (b 1) (b 2)))) ; compute this from syms-a and syms-b
    `(lambda (a b)
       (make-array ,(length table) :initial-contents
                   (list ,@(mapcar (lambda (ai) (cons 'aref ai)) table))))))
And it will produce what you want:
(macroexpand '(my-array-generator ...))
==>
#'(LAMBDA (A B)
    (MAKE-ARRAY 4 :INITIAL-CONTENTS
                (LIST (AREF A 0) (AREF B 0) (AREF B 1) (AREF B 2))))
So, all that is left is to write a function which will produce
((a 0) (b 0) (b 1) (b 2))
given
syms-a = (num1 num2 num3)
and
syms-b = (num2 num3 num4)
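One way that remaining function might look is sketched below. The name join-table is hypothetical, and the ordering (symbols that occur only in syms-a first, then everything from syms-b) is an assumption chosen to reproduce the table above; the question states that the order does not matter.

```lisp
;; Hypothetical helper: build the (ARRAY-NAME INDEX) table for the macro.
;; Values come from B whenever a symbol occurs in both lists.
(defun join-table (syms-a syms-b)
  "Return a list of (A i) and (B i) entries describing the joined array."
  (append
   ;; Symbols that occur only in SYMS-A are read from array A.
   (loop for sym in syms-a
         for i from 0
         unless (member sym syms-b)
           collect (list 'a i))
   ;; Every symbol in SYMS-B is read from array B.
   (loop for i from 0 below (length syms-b)
         collect (list 'b i))))

(join-table '(num1 num2 num3) '(num2 num3 num4))
;; => ((A 0) (B 0) (B 1) (B 2))
```

To call it from the macro at expansion time, the function has to be available when the macro is expanded, for example by defining it inside an eval-when form or in a file that is loaded earlier.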

Depends on when you know the data. If all the data is known at compile time, you can use a macro (per sds's answer).
If the data is known at run-time, you should be looking at loading it into a 2D array from your existing arrays. This - using a properly optimizing compiler - should imply that a lookup is several muls, an add, and a dereference.
By the way, can you describe your project in a wee bit more detail? It sounds interesting. :-)

Given C and D you could create a closure like
(lambda (A B)
  (do ((result (make-array n))
       (i 0 (1+ i)))
      ((>= i n) result)
    (setf (aref result i)
          (aref (if (aref use-A i) A B)
                (aref use-index i)))))
where n, use-A and use-index are precomputed values captured in the closure like
n --> 4
use-A --> #(T nil nil nil)
use-index --> #(0 0 1 2)
Checking with SBCL (speed 3) (safety 0) the execution time was basically identical to the make-array + initial-contents version, at least for this simple case.
Of course creating a closure with those precomputed data tables doesn't even require a macro.
Have you actually timed how much are you going to save (if anything) using an unrolled compiled version?
EDIT
Making an experiment with SBCL the closure generated by
(defun merger (clist1 clist2)
  (let ((use1 (list))
        (index (list))
        (i1 0)
        (i2 0))
    (dolist (s1 clist1)
      (if (find s1 clist2)
          (progn
            (push NIL use1)
            (push (position s1 clist2) index))
          (progn
            (push T use1)
            (push i1 index)))
      (incf i1))
    (dolist (s2 clist2)
      (unless (find s2 clist1)
        (push NIL use1)
        (push i2 index))
      (incf i2))
    (let* ((n (length index))
           (u1 (make-array n :initial-contents (nreverse use1)))
           (ix (make-array n :initial-contents (nreverse index))))
      (declare (type simple-vector ix)
               (type simple-vector u1)
               (type fixnum n))
      (print (list u1 ix n))
      (lambda (a b)
        (declare (type simple-vector a)
                 (type simple-vector b))
        (let ((result (make-array n)))
          (dotimes (i n)
            (setf (aref result i)
                  (aref (if (aref u1 i) a b)
                        (aref ix i))))
          result)))))
runs about 13% slower than a hand-written version providing the same type declarations (2.878s instead of 2.529s for 100,000,000 calls for the (a b c d)(b d e f) case, a 6-element output).
The inner loop for the data based closure version compiles to
; 470: L2: 4D8B540801 MOV R10, [R8+RCX+1] ; (aref u1 i)
; 475: 4C8BF7 MOV R14, RDI ; b
; 478: 4C8BEE MOV R13, RSI ; source to use (a for now)
; 47B: 4981FA17001020 CMP R10, 537919511 ; (null R10)?
; 482: 4D0F44EE CMOVEQ R13, R14 ; if true use b instead
; 486: 4D8B540901 MOV R10, [R9+RCX+1] ; (aref ix i)
; 48B: 4B8B441501 MOV RAX, [R13+R10+1] ; load (aref ?? i)
; 490: 4889440B01 MOV [RBX+RCX+1], RAX ; store (aref result i)
; 495: 4883C108 ADD RCX, 8 ; (incf i)
; 499: L3: 4839D1 CMP RCX, RDX ; done?
; 49C: 7CD2 JL L2 ; no, loop back
The conditional is not compiled to a jump but to a conditional assignment (CMOVEQ).
I see a little room for improvement (e.g. using CMOVEQ R13, RDI directly, saving an instruction and freeing a register) but I don't think this would shave off that 13%.
