How do I best save/read data structures?

How do I best save/read data structures? - common-lisp

I want to write some data structures pointed to by FOO and BAR to a file, and to read the data structures back into the symbols FOO and BAR when I start a new session of Common Lisp.
It would appear *PRINT-READABLY* allows objects to be printed in a fashion that can be read back in by READ, and I can change how objects are printed using (defmethod print-object ...). Since the objects should be printed in a way acceptable to READ, I shouldn't have to define any further methods for reading the object back in.
But is there a way to tie each written data structure to its corresponding symbol, without having to rely on the order the data structures are written and read in?

If I understand correctly, you could store the values and associated symbols as pairs in the file, so something like this:
(x . (1 2 3 4))
(y . (6 7 8 1))
And when you parse it out, use something like this:
(let ((pair (read))
(set (car pair) (cdr pair)))

Related

Generating a bitvector in scheme

I am trying to implement a spellchecker that takes a hash function and a dictionary, and then map the hash values of the words to a bitvector. More specifically, I am trying to write a function called gen-checker that takes as input a list of hash functions and a dictionary of words and returns a spellchecker. The spellchecker must generate a bitvector representation for the input of the dictionary, which contains #t or #f indicating the correct or incorrect spelling of the word.
I have already defined the has functions and have a dictionary to use, but I can't seem to get the bit vector setup
I have tried implementing (make-bitvector 8 #f) found here:
http://www.gnu.org/software/guile/manual/html_node/Bit-Vectors.html
But for some reason drracket does not recognize it. What am I doing wrong? How to implement the bitvector representation?

It may seem like this answer is joking, but it is not:
(define make-bitvector make-vector)
(define bitvector-ref vector-ref)
;; ...
After everything is working, and only then, would one need to optimize storage by bit packing.

Copy a circular list to a fixed length list

I store graph coordinates in a circular list:
(defun make-circular-list (size &key initial-element)
(let ((list (make-list size :initial-element initial-element)))
(nconc list list)))
(defvar *coordinates* (make-circular-list 1024 :initial-element 0.0))
Now it's easy to update *coordinates* whenever new coordinates must be set.
However, I have a library function that takes a sequence of coordinates to draw lines on a graph. Of course this function does not work with a circular structure, so I would like to make a copy of fixed length. A list or an array is fine.
So far I have tried subseq and make-array with the :initial-contents keyword, but it fails. loop or dotimes do work but I was hoping to avoid iterating on each elements of the list.
Is it possible to efficiently copy this circular structure or make a fixed length array in the spirit of a displaced-array?

There is nothing wrong with using LOOP.
(loop for c in *coordinates* repeat 1024 collect c)
Btw., sometimes it might be useful to hide the circular list behind a CLOS object.
(defclass circular-list ()
((list)
(size)
(first-element)
(last-element)))
And so on...
That way you can provide some CLOS methods for accessing and changing it (create, add, copy, delete, , as-list, ...). You can also control printing of it using a method for PRINT-OBJECT.

Put an element to the tail of a collection

I find myself doing a lot of:
(concat coll [e]) where coll is a collection and e a single element.
Is there a function for doing this in Clojure? I know conj does the job best for vectors but I don't know up front which coll will be used. It could be a vector, list or sorted-set for example.

Some types of collections can add cheaply to the front (lists, seqs), while others can add cheaply to the back (vectors, queues, kinda-sorta lazy-seqs). Rather than using concat, if possible you should arrange to be working with one of those types (vector is most common) and just conj to it: (conj [1 2 3] 4) yields [1 2 3 4], while (conj '(1 2 3) 4) yields (4 1 2 3).

concat does not add an element to the tail of a collection, nor does it concatenate two collections.
concat returns a seq made of the concatenation of two other seqs. The original type of the collections from which seqs may be inferred are lost for the return type of concat.
Now, clojure collections have different properties one must know about in order to write efficient code, that's why there isn't a universal function available in core to concatenate collections of any kind together.
To the contrary, list and vectors do have "natural insertion positions" which conj knows, and does what is right for the kind of collection.

This is a very small addendum to #amalloy's answer in order to address OP's request for a function that always adds to the tail of whatever kind of collection. This is an alternative to (concat coll [x]). Just create a vector version of the original collection:
(defn conj*
[s x]
(conj (vec s) x))
Caveats:
If you started with a lazy sequence, you've now destroyed the laziness--i.e. the output is not lazy. This may be either a good thing or a bad thing, depending on your needs.
There's some cost to creating the vector. If you need to call this function a lot, and you find (e.g. by benchmarking with Criterium) that this cost is significant for your purposes, then follow the other answers' advice to try to use vectors in the first place.

To distill the best of what amalloy and Laurent Petit have already said: use the conj function.
One of the great abstractions that Clojure provides is the Sequence API, which includes the conj function. If at all possible, your code should be as collection-type agnostic as it can be, instead using the seq API to handle operations on collections and picking a particular collection type only when you need to be specific.
If vectors are a good match, then yes, conj will be adding items onto the end. If use lists instead, then conj will be adding things to the front of your collection. But if you then use the standard seq API functions for pulling items from the "top" of a collection (the back of a vector, the front of a list), it doesn't matter which implementation you use, because it will always use the one with best performance and thus adding and removing items will be consistent.

If you are working with lazy sequences, you can also use lazy-cat:
(take 5 (lazy-cat (range) [1])) ; (0 1 2 3 4)
Or you could make it a utility method:
(defn append [coll & items] (lazy-cat coll items))
Then use it like this:
(take 5 (append (range) 1)) ; (0 1 2 3 4)

Permuting output of a tree of closures

This a conceptual question on how one would implement the following in Lisp (assuming Common Lisp in my case, but any dialect would work). Assume you have a function that creates closures that sequentially iterate over an arbitrary collection (or otherwise return different values) of data and returns nil when exhausted, i.e.
(defun make-counter (up-to)
(let ((cnt 0))
(lambda ()
(if (< cnt up-to)
(incf cnt)
nil))))
CL-USER> (defvar gen (make-counter 3))
GEN
CL-USER> (funcall gen)
1
CL-USER> (funcall gen)
2
CL-USER> (funcall gen)
3
CL-USER> (funcall gen)
NIL
CL-USER> (funcall gen)
NIL
Now, assume you are trying to permute a combinations of one or more of these closures. How would you implement a function that returns a new closure that subsequently creates a permutation of all closures contained within it? i.e.:
(defun permute-closures (counters)
......)
such that the following holds true:
CL-USER> (defvar collection (permute-closures (list
(make-counter 3)
(make-counter 3))))
CL-USER> (funcall collection)
(1 1)
CL-USER> (funcall collection)
(1 2)
CL-USER> (funcall collection)
(1 3)
CL-USER> (funcall collection)
(2 1)
...
and so on.
The way I had it designed originally was to add a 'pause' parameter to the initial counting lambda such that when iterating you can still call it and receive the old cached value if passed ":pause t", in hopes of making the permutation slightly cleaner. Also, while the example above is a simple list of two identical closures, the list can be an arbitrarily-complicated tree (which can be permuted in depth-first order, and the resulting permutation set would have the shape of the tree.).
I had this implemented, but my solution wasn't very clean and am trying to poll how others would approach the problem.
Thanks in advance.
edit Thank you for all the answers. What I ended up doing was adding a 'continue' argument to the generator and flattening my structure by replacing any nested list with a closure that permuted that list. The generators did not advance and always returned the last cached value unless 'continue' was passed. Then I just recursively called each generator until I got to the either the last cdr or a nil. If i got to the last cdr, I just bumped it. If I got to a NIL, I bumped the one before it, and reset every closure following it.

You'll clearly need some way of using each value returned by a generator more than once.
In addition to Rainer Joswig's suggestions, three approaches come to mind.
Caching values
permute-closures could, of course, remember every value returned by each generator by storing it in a list, and reuse that over and over. This approach obviously implies some memory overhead, and it won't work very well if the generated sequences can be infinite.
Creating new generators on each iteration
In this approach, you would change the signature of permute-closures to take as arguments not ready-to-use generators but thunks that create them. Your example would then look like this:
(permute-closures (list (lambda () (make-counter 3))
(lambda () (make-counter 3))))
This way, permute-closures is able to reset a generator by simply recreating it.
Making generator states copyable
You could provide a way of making copies of generators along with their states. This is kind of like approach #2 in that permute-closures would reset the generators as needed except the resetting would be done by reverting to a copy of the original state. Also, you would be able to do partial resets (i.e., backtrack to an arbitrary point rather than just the beginning), which may or may not make the code of permute-closures significantly simpler.
Copying generator states might be a bit easier in a language with first-class continuations (like Scheme), but if all generators follow some predefined structure, abstracting it away with a define-generator macro or some such should be possible in Common Lisp as well.

I would add to the counter either one of these:
being able to reset the counter to the start
letting the counter return NIL when the count is done and then starting from the first value again on the next call

Functional Programming: what is an "improper list"?

Could somebody explain what an "improper list" is?
Note: Thanks to all ! All you guys rock!

I think #Vijay's answer is the best one so far and I just intend to Erlangify it.
Pairs (cons cells) in Erlang are written as [Head|Tail] and nil is written as []. There is no restriction as to what the head and tail are but if you use the tail to chain more cons cells you get a list. If the final tail is [] then you get a proper list. There is special syntactic support for lists in that the proper list
[1|[2|[3|[]]]]
is written as
[1,2,3]
and the improper list
[1|[2|[3|4]]]
is written as
[1,2,3|4]
so you can see the difference. Matching against proper/improper lists is correspondingly easy. So a length function len for proper lists:
len([_|T]) -> 1 + len(T);
len([]) -> 0.
where we explicitly match for the terminating []. If given an improper list this will generate an error. While the function last_tail which returns the last tail of a list can handle improper lists as well:
last_tail([_|T]) -> last_tail(T);
last_tail(Tail) -> Tail. %Will match any tail
Note that building a list, or matching against it, as you normally do with [Head|Tail] does not check if the tail is list so there is no problem in handling improper lists. There is seldom a need for improper lists, though you can do cool things with them.

I think it's easier to explain this using Scheme.
A list is a chain of pairs that end with an empty list. In other words, a list ends with a pair whose cdr is ()
(a . (b . (c . (d . (e . ())))))
;; same as
(a b c d e)
A chain of pairs that doesn't end in the empty list is called an improper list. Note that an improper list is not a list. The list and dotted notations can be combined to represent improper lists, as the following equivalent notations show:
(a b c . d)
(a . (b . (c . d)))
An example of a usual mistake that leads to the construction of an improper list is:
scheme> (cons 1 (cons 2 3))
(1 2 . 3)
Notice the dot in (1 2 . 3)---that's like the dot in (2 . 3), saying that the cdr of a pair points to 3, not another pair or '(). That is, it's an improper list, not just a list of pairs. It doesn't fit the recursive definition of a list, because when we get to the second pair, its cdr isn't a list--it's an integer.
Scheme printed out the first part of the list as though it were a normal cdr-linked list, but when it got to the end, it couldn't do that, so it used "dot notation."
You generally shouldn't need to worry about dot notation, because you should use normal lists, not improper list. But if you see an unexpected dot when Scheme prints out a data structure, it's a good guess that you used cons and gave it a non-list as its second argument--something besides another pair or ().
Scheme provides a handy procedure that creates proper lists, called list. list can take any number of arguments, and constructs a proper list with those elements in that order. You don't have to remember to supply the empty list---list automatically terminates the list that way.
Scheme>(list 1 2 3 4)
(1 2 3 4)
Courtesy: An Introduction to Scheme

The definition of a list in Erlang is given in the manual - specifically Section 2.10
In Erlang the only thing you really need to know about improper lists is how to avoid them, and the way to do that is very simple - it is all down to the first 'thing' that you are going to build your list on. The following all create proper lists:
A = [].
B = [term()].
C = [term(), term(), term()].
In all these cases the syntax ensures that there is a hidden 'empty' tail which matches to '[]' sort of at the end....
So from them the following operations all produce a proper list:
X = [term() | A].
Y = [term() | B].
Z = [term() | C].
They are all operations which add a new head to a proper list.
What makes is useful is that you can feed each of X, Y or Z into a function like:
func([], Acc) -> Acc;
func([H | T], Acc) -> NewAcc = do_something(H),
func(T, [NewAcc | Acc]).
And they will rip through the list and terminate on the top clause when the hidden empty list at the tail is all that is left.
The problem comes when your base list has been improperly made, like so:
D = [term1() | term2()]. % term2() is any term except a list
This list doesn't have the hidden empty list as the terminal tail, it has a term...
From here on downwards is mince as Robert Virding pointed out in the comments
So how do you write a terminal clause for it?
What makes it infuriating is that there is no way to see if a list is improper by inspecting it... print the damn thing out it looks good... So you end up creating an improper base list, doing some stuff on it, passing it around, and then suddenly kabloowie you have a crash miles from where the error is and you pull your hair and scream and shout...
But you should be using the dialyzer to sniff these little beasts out for you.
Apologies
Following Robert's comment I tried printing out an improper list and, lo and behold, it is obvious:
(arrian#localhost)5>A = [1, 2, 3, 4].
[1,2,3,4]
(arrian#localhost)5> B = [1, 2, 3 | 4].
[1,2,3|4]
(arrian#localhost)6> io:format("A is ~p~nB is ~p~n", [A, B]).
A is [1,2,3,4]
B is [1,2,3|4]
I had spent some time hunting an improper list once and had convinced myself it was invsible, well Ah ken noo!

To understand what an improper list is, you must first understand the definition of a proper list.
Specifically, the "neat discovery" of lists is that you can represent a list using only forms with a fixed number of elements, viz:
;; a list is either
;; - empty, or
;; - (cons v l), where v is a value and l is a list.
This "data definition" (using the terms of How To Design Programs) has all kinds of
nice properties. One of the nicest is that if we define the behavior or meaning of a function on each "branch" of the data definition, we're guaranteed not to miss a case. More significantly, structures like this generally lead to nice clean recursive solutions.
The classic "length" example:
(define (length l)
(cond [(empty? l) 0]
[else (+ 1 (length (rest l))]))
Of course, everything's prettier in Haskell:
length [] = 0
length (f:r) = 1 + length r
So, what does this have to do with improper lists?
Well, an improper list uses this data definition, instead:
;; an improper list is either
;; - a value, or
;; - (cons v l), where v is a value and l is an improper list
The problem is that this definition leads to ambiguity. In particular, the first and second cases overlap. Suppose I define "length" for an improper list thusly:
(define (length l)
(cond [(cons? l) (+ 1 (length (rest l)))]
[else 1]))
The problem is that I've destroyed the nice property that if I take two values and put them into an improper list with (cons a b), the result has length two. To see why, suppose I consider the values (cons 3 4) and (cons 4 5). The result is (cons (cons 3 4) (cons 4 5)), which may be interpreted either as the improper list containing (cons 3 4) and (cons 4 5), or as the improper list containing (cons 3 4), 4, and 5.
In a language with a more restrictive type system (e.g. Haskell), the notion of an "improper list" doesn't make quite as much sense; you could interpret it as a datatype whose base case has two things in it, which is probably not what you want, either.

I think possibly it refers to a "dotted pair" in LISP, e.g. a list whose final cons cell has an atom, rather than a reference to another cons cell or NIL, in the cdr.
EDIT
Wikipedia suggests that a circular list also counts as improper. See
http://en.wikipedia.org/wiki/Lisp_(programming_language)
and search for 'improper' and check the footnotes.

I would say the implication of an improper list is that a recursive treatment of the list will not match the typical termination condition.
For example, say you call the following sum, in Erlang, on an improper list:
sum([H|T]) -> H + sum(T);
sum([]) -> 0.
Then it will raise an exception since the last tail is not the empty list, but an atom.

In Common Lisp improper lists are defined as:
dotted lists that have a non-NIL terminating 'atom'.
Example
(a b c d . f)
or
circular lists
Example
#1=(1 2 3 . #1#)

A list is made up of cells, each cell consisting of two pointers. First one pointing to the data element, second one to the next cell, or nil at the end of the list.
If the second one does not point to a cell (or nil), the list is improper. Functional languages will most probably allow you to construct cells, so you should be able to generate improper lists.
In Erlang (and probably in other FP languages as well) you can save some memory by storing your 2-tuples as improper lists:
2> erts_debug:flat_size({1,2}).
3
3> erts_debug:flat_size([1|2]).
2

In Erlang a proper list is one where [H|T].
H is the head of the list and T is the rest of the list as another list.
An improper list does not conform to this definition.

In erlang, a proper list is a singly linked list. An improper list is a singly linked list with the last node not being a real list node.
A proper list
[1, 2, 3] is like
An improper list
[1, 2 | 3] is like

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex