Should I return Multiple Values with caution? - common-lisp

In Practical Common Lisp, Peter Seibel writes:
The mechanism by which multiple values are returned is implementation dependent just like the mechanism for passing arguments into functions is. Almost all language constructs that return the value of some subform will "pass through" multiple values, returning all the values returned by the subform. Thus, a function that returns the result of calling VALUES or VALUES-LIST will itself return multiple values--and so will another function whose result comes from calling the first function. And so on.
The "implementation dependent" part worries me.
My understanding is that the following code might return just the primary value:
> (defun f ()
(values 'a 'b))
> (defun g ()
(f))
> (g) ; ==> a ? or a b ?
If so, does it mean that I should use this feature sparingly?
Any help is appreciated.

It's implementation-dependent in the sense that how multiple values are returned at the CPU level may vary from implementation to implementation. However, the semantics are well-specified at the language level and you generally do not need to be concerned about the low-level implementation.
See section 2.5, "Function result protocol", of The Movitz development platform for an example of how one implementation handles multiple return values:
The CPU’s carry flag (i.e. the CF bit in the eflags register) is used to signal whether anything other than precisely one value is being returned. Whenever CF is set, ecx holds the number of values returned. When CF is cleared, a single value in eax is implied. A function’s primary value is always returned in eax. That is, even when zero values are returned, eax is loaded with nil.
It's this kind of low-level detail that may vary from implementation to implementation.

One thing to be aware of: there is a limit on the number of values that can be returned in a specific Common Lisp implementation.
The variable MULTIPLE-VALUES-LIMIT holds the implementation/machine-specific maximum number of values that can be returned. The standard says that it should not be smaller than 20. SBCL has a very large number on my computer, while LispWorks has only 51, ECL has 64 and CLISP has 128.
But I can't remember seeing Lisp code which wants to return more than 5 values.

Related

What is the difference between functions and data?

In functional programming, we tend to distinguish between data and functions, but what is the difference?
If I consider a constant, I could think of it as a function, which just returns the same value:
(def x 5)
So what is the distinction between data and a function? I fail to see the difference.
Data
Data is a value (with a specific type).
For example, 5 is a value of type Integer, and "abc" is a value of type String. A composite value such as [5 "abc"] has the type Vector.
Two data values of the same type can always be compared for equality.
Data is never executed. That is, the thread of control (aka program counter or PC) never enters the data structure.
Function (aka "code")
A function's only type is "code".
Two functions are never equal, even if they are duplicates of each other.
A function produces a value (with a specific type) when it is executed (possibly with arguments).
Execution means the thread of control enters the code data structure. The code and data values encountered there have complete control over any side-effects that occur, as well as the return value.
Both compiled and interpreted code produce the same results. The only differences between them are implementation details that trade off complexity vs speed.
Eval
The (eval ...) special form accepts data as input and returns a function as output. The returned function can be executed (i.e. invoked) so the thread of control enters the function.
For clarity, the above elides details such as the reader, etc.
Macros are best viewed as a compiler extension embedded within the code, and do not affect the data vs code distinction.
Postscript
It occurred to me that the original question has not been fully answered. Consider the following:
; A Clojure Var pointing to the value 5
(def five 5)
; A Clojure Var pointing to a function that always returns the value 5
(def ->five (fn [& args] 5))
and then use these 2 Vars:
five => 5
(->five) => 5
The parentheses make all the difference.
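The same distinction can be sketched in Python instead of Clojure. This is only an analogy, and the names five, make_five and make_five_copy are made up for the illustration: a value is just there, a function has to be called, and two textually identical functions still do not compare equal.

```python
# A name bound to the value 5
five = 5

# A function that always returns the value 5, whatever it is given
def make_five(*args):
    return 5

# A textual duplicate of make_five
def make_five_copy(*args):
    return 5

print(five)                          # the value itself
print(make_five())                   # the value produced by executing the function
print(make_five == make_five_copy)   # duplicates are still distinct function objects
print(callable(five), callable(make_five))
```

Without the call parentheses, make_five is just a reference to the function object, which is the Python counterpart of the remark about parentheses above.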
See also:
Brave Clojure
LispCast
In languages with the property of homoiconicity, code is data and data is code.
This code-data duality blurs the distinction between code and data.
(I think your question is about what is the difference between lambda and data - if lambda itself is actually also just a data structure which has to be executed ...)
In homoiconic languages, data can become lambda (if it contains the instructions for a lambda) and vice versa.
So perhaps, the distinction is only by their type (function vs. any other data structure or primitive data type).

How can I retrieve an object by id in Julia

In Julia, say I have an object_id for a variable but have forgotten its name, how can I retrieve the object using the id?
I.e. I want the inverse of some_id = object_id(some_object).
As @DanGetz says in the comments, object_id is a hash function and is designed not to be invertible. @phg is also correct that ObjectIdDict is intended precisely for this purpose (it is documented although not discussed much in the manual):
ObjectIdDict([itr])
ObjectIdDict() constructs a hash table where the keys are (always)
object identities. Unlike Dict it is not parameterized on its key and
value type and thus its eltype is always Pair{Any,Any}.
See Dict for further help.
In other words, it hashes objects by === using object_id as a hash function. If you have an ObjectIdDict and you use the objects you encounter as the keys into it, then you can keep them around and recover those objects later by taking them out of the ObjectIdDict.
However, it sounds like you want to do this without the explicit ObjectIdDict just by asking which object ever created has a given object_id. If so, consider this thought experiment: if every object were always recoverable from its object_id, then the system could never discard any object, since it would always be possible for a program to ask for that object by ID. So you would never be able to collect any garbage, and the memory usage of every program would rapidly expand to use all of your RAM and disk space. This is equivalent to having a single global ObjectIdDict which you put every object ever created into. So inverting the object_id function that way would require never deallocating any objects, which means you'd need unbounded memory.
Even if we had infinite memory, there are deeper problems. What does it mean for an object to exist? In the presence of an optimizing compiler, this question doesn't have a clear-cut answer. It is often the case that an object appears, from the programmer's perspective, to be created and operated on, but in reality – i.e. from the hardware's perspective – it is never created. Consider this function which constructs a complex number and then uses it for a simple computation:
julia> function foo(y::Real)
           z = Complex(0,y)
           w = 2z*im
           return real(w)
       end
foo (generic function with 1 method)
julia> foo(123)
-246
From the programmer's perspective, this constructs the complex number z and then constructs 2z, then 2z*im, and finally constructs real(2z*im) and returns that value. So all of those values should be inserted into the "Great ObjectIdDict in the Sky". But are they really constructed? Here's the LLVM code for this function applied to an Int:
julia> @code_llvm foo(123)
define i64 @julia_foo_60833(i64) #0 !dbg !5 {
top:
%1 = shl i64 %0, 1
%2 = sub i64 0, %1
ret i64 %2
}
No Complex values are constructed at all! Instead, all of the work is inlined and eliminated rather than actually being done. The whole computation boils down to just doubling the argument (by shifting it left one bit) and negating it (by subtracting it from zero). This optimization can be done first and foremost because the intermediate steps have no observable side effects. The compiler knows that there's no way to tell the difference between actually constructing complex values and operating on them and just doing a couple of integer ops – as long as the end result is always the same. Implicit in the idea of a "Great ObjectIdDict in the Sky" is the assumption that all objects that seem to be constructed actually are constructed and inserted into a large, permanent data structure – which is a massive side effect. So not only is recovering objects from their IDs incompatible with garbage collection, it's also incompatible with almost every conceivable program optimization.
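The claim that the complex-number version and the shift-and-subtract version always agree can be checked by hand: 2 * (0 + y*i) * i = 2y * i^2 = -2y. Here is a small Python sketch of the two routes (Python rather than Julia, and the function names via_complex and via_integer_ops are invented for the comparison):

```python
def via_complex(y):
    """Mirror the source-level view: build complex values, then take the real part."""
    z = complex(0, y)
    w = 2 * z * 1j
    return w.real

def via_integer_ops(y):
    """Mirror the optimized view: shift left by one bit, then subtract from zero."""
    doubled = y << 1
    return 0 - doubled

for y in (123, 0, -5):
    print(y, via_complex(y), via_integer_ops(y))
```

Both routes give the same number for every integer input tried here, which is why a compiler is free to pick the cheaper one.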
The only other way one could conceive of inverting object_id would be to compute its inverse image on demand instead of saving objects as they are created. That would solve both the memory and optimization problems. Of course, it isn't possible since there are infinitely many possible objects but only a finite number of object IDs. You are vanishingly unlikely to actually encounter two objects with the same ID in a program, but the finiteness of the ID space means that inverting the hash function is impossible in principle since the preimage of each ID value contains an infinite number of potential objects.
I've probably refuted the possibility of an inverse object_id function far more thoroughly than necessary, but it led to some interesting thought experiments, and I hope it's been helpful – or at least thought provoking. The practical answer is that there is no way to get around explicitly stashing every object you might want to get back later in an ObjectIdDict.
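As a rough illustration of "explicitly stashing" objects, here is the same idea sketched in Python, using the built-in id() as a stand-in for object_id and a plain dict as a stand-in for ObjectIdDict. The helper names remember and recall are hypothetical, and this is an analogy, not Julia's API:

```python
# Registry mapping identity -> object. It holds a strong reference to every
# stashed object, so none of them can be garbage collected while it exists.
registry = {}

def remember(obj):
    """Stash obj and return the key under which it can be recovered."""
    key = id(obj)
    registry[key] = obj
    return key

def recall(key):
    """Return the stashed object for key; raises KeyError if it was never stashed."""
    return registry[key]

data = [1, 2, 3]
key = remember(data)
print(recall(key) is data)
```

Only objects that were passed through remember can be recovered, which is exactly the limitation described above: the lookup works because the registry keeps the object alive, not because the ID can be inverted.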

How do you prove termination of a recursive list length?

Suppose we have a list:
List = nil | Cons(car cdr:List).
Note that I am talking about modifiable lists!
And a trivial recursive length function:
recursive Length(List l) = match l with
| nil => 0
| Cons(car cdr) => 1 + Length cdr
end.
Naturally, it terminates only when the list is non-circular:
inductive NonCircular(List l) = {
empty: NonCircular(nil) |
\forall head, tail: NonCircular(tail) => NonCircular (Cons(head tail))
}
Note that this predicate, being implemented as a recursive function, also does not terminate on a circular list.
Usually I see proofs of list traversal termination that use list length as a bounded decreasing factor. They suppose that Length is non-negative. But, as I see it, this fact (Length l >= 0) follows from the termination of Length in the first place.
How do you prove that Length terminates and is non-negative on NonCircular (or an equivalent, better defined predicate) lists?
Am I missing an important concept here?
Unless the length function has cycle detection there is no guarantee it will halt!
For a singly linked list one uses the Tortoise and hare algorithm to determine the length where there is a chance there might be circles in the cdr.
It's just two cursors, the tortoise starts at first element and the hare starts at the second. Tortoise moves one pointer at a time while the hare moves two (if it can). The hare will eventually either be the same as the tortoise, which indicates a cycle, or it will terminate knowing the length is 2*steps or 2*steps+1.
Compared to finding cycles in a tree this is very cheap and performs just as well on terminating lists as a function that does not have cycle detection.
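A minimal Python sketch of a length function with tortoise-and-hare cycle detection might look like this. It is a small variation on the description above (both cursors start at the first cell instead of the hare starting at the second), and the Cons class and the name safe_length are made up for the example:

```python
class Cons:
    """A mutable cons cell; cdr is another Cons or None (standing in for nil)."""
    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr

def safe_length(lst):
    """Return the number of cells in lst, or None if the cdr chain is circular."""
    tortoise = lst
    hare = lst
    count = 0
    while hare is not None:
        hare = hare.cdr            # hare: first step
        count += 1
        if hare is None:
            break
        hare = hare.cdr            # hare: second step
        count += 1
        tortoise = tortoise.cdr    # tortoise: one step
        if hare is tortoise:       # the hare caught up, so there is a cycle
            return None
    return count

proper = Cons(1, Cons(2, Cons(3)))
print(safe_length(proper))

looped = Cons(1, Cons(2, Cons(3)))
looped.cdr.cdr.cdr = looped.cdr    # last cell points back to the second cell
print(safe_length(looped))
```

The hare visits every cell of a terminating list exactly once, so the cost on non-circular lists is the same order as a plain recursive length.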
The definition of List that you have on top doesn't seem to permit circular lists. Each call to the "constructor" Cons will create a new pointer, and you aren't allowed to modify the pointer later to create the circularity.
You need a more sophisticated definition of List if you want to handle circularity. You probably need to define a Cell containing data value and an address, and a Node which contains a Cell and an address pointing to the previous node, and then you'll need to define the dereferencing operator to go back from addresses to Cells. You can also try to define non-circular on this object.
My gut feeling is that you will also need to define an injective function from the "simple" list definition you have above to the sophisticated one that I've outlined and then finally you'll be able to prove your result.
One other thing, the definition of NonCircular doesn't need to terminate. It isn't a program, it is a proof. If it holds, then you can examine the proof to see why it holds and use this in other proofs.
Edit: Thanks to Necto for pointing out I'm wrong.

Fortran Pointer arithmetic

That's my first question post ever ... don't be cruel, please.
My problem is the following. I'd like to point a Fortran pointer at an expression. I think that's not possible with simple Fortran techniques. But since newer Fortran versions seem to provide ways to handle things used in C and C++ (like c_ptr and c_f_pointer ...), maybe someone knows a way to solve my problem. (I don't really have any idea about C, but I read that pointer arithmetic is possible in C.)
To make things more clear, here is the code which came to my mind immediately but isn't working:
program pointer
real(8),target :: a
real(8),pointer :: b
b=>a*2.0d0 ! b=>a is of course working
do i=1,10
a=dble(i)*2.0d0
write(*,*)b
end do
end program
I know that there are ways around this issue, but in the actual program, all of those that came to my mind would lead to much longer computation time and/or quite weird code.
Thanks a lot in advance!
Best, Peter
From Michael Metcalf,
Pointers are variables with the POINTER attribute; they are not a distinct data type (and so no 'pointer arithmetic' is possible).
They are conceptually a descriptor listing the attributes of the objects (targets) that the pointer may point to, and the address, if any, of a target. They have no associated storage until it is allocated or otherwise associated (by pointer assignment, see below):
So your idea of b=>a*2 doesn't work: pointer assignment (=>) associates b with a target object, it does not give b the value of an expression.
Expressions, in general (there are two and a half very significant exceptions), are not valid pointer targets. Evaluation of an expression (in general) yields a value, not an object.
(The exceptions relate to the case where the overall expression results in a reference to a function with a data pointer result - in that case the expression can be used on the right hand side of a pointer assignment statement, or as the actual argument in a procedure reference that corresponds to a pointer dummy argument or [perhaps - and F2008 only] in any context where a variable might be required, such as the left hand side of an ordinary assignment statement. But your expressions do not result in such a function reference and I don't think the use cases are relevant to what you want to do.)
I think you want the value of b to change as the "underlying" value of a changes, as per the form of the initial expression. Beyond the valid pointer target issue, this requires behaviour contrary to one of the basic principles of the language (most languages really) - evaluation of an expression uses the value of its primaries at the time the expression is evaluated - subsequent changes in those primaries do not result in a change in the historically evaluated value.
Instead, consider writing a function that calculates b based on a.
program pointer
  IMPLICIT NONE
  real(8) :: a
  integer :: i   ! the loop index must be declared under IMPLICIT NONE
  do i=1,10
    a=dble(i)*2.0d0
    write(*,*) b(a)
  end do
contains
  function b(x)
    real(kind(a)), intent(in) :: x
    real(kind(a)) :: b
    b = 2.0d0 * x
  end function b
end program
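The underlying principle (an expression is evaluated once, a function is re-evaluated on every call) is not specific to Fortran. As a rough sketch in Python, with all names invented for the illustration:

```python
def demo():
    a = 1.0
    snapshot = 2.0 * a      # evaluated right now, using the current value of a

    def live():
        return 2.0 * a      # evaluated on each call, using a's value at that time

    a = 10.0                # changing a afterwards does not touch snapshot
    return snapshot, live()

print(demo())
```

The snapshot keeps the value computed from the old a, while the function picks up the new one, which is the behaviour the function b above provides.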
Update: I'm getting closer to what I wanted to have (for those who are interested):
module test
real,target :: a
real, pointer :: c
abstract interface
function func()
real :: func
end function func
end interface
procedure (func), pointer :: f => null ()
contains
function f1()
real,target :: f1
c=>a
f1 = 2.0*c
return
end function f1
end module
program test_func_ptrs
use test
implicit none
integer::i
f=>f1
do i=1,10
a=real(i)*2.0
write(*,*)f()
end do
end program test_func_ptrs
I would be completely satisfied if I could find a way to avoid the dummy arguments (at least when I'm calling f).
Additional information: The point is that I want to define different functions f1 and decide, before starting the loop, what f is going to be inside the loop (depending on whatever input).
Pointer arithmetic, in the sense of calculating address offsets from a pointer, is not allowed in Fortran. Pointer arithmetic can easily cause memory errors and the authors of Fortran considered it unnecessary. (One could do it via the back door of interoperability with C.)
Pointers in Fortran are useful for passing procedures as arguments, setting up data structures such as linked lists (e.g., How can I implement a linked list in fortran 2003-2008), etc.

Stackoverflow with specialized Hashtbl (via Hashtbl.make)

I am using this piece of code and it triggers a stack overflow; if I use Extlib's Hashtbl, the error does not occur. Any hints on how to use a specialized Hashtbl without the stack overflow?
module ColorIdxHash = Hashtbl.Make(
struct
type t = Img_types.rgb_t
let equal = (==)
let hash = Hashtbl.hash
end
)
(* .. *)
let (ctable: int ColorIdxHash.t) = ColorIdxHash.create 256 in
for x = 0 to width -1 do
for y = 0 to height -1 do
let c = Img.get img x y in
let rgb = Color.rgb_of_color c in
if not (ColorIdxHash.mem ctable rgb) then ColorIdxHash.add ctable rgb (ColorIdxHash.length ctable)
done
done;
(* .. *)
The backtrace points to hashtbl.ml:
Fatal error: exception Stack_overflow Raised at file "hashtbl.ml",
line 54, characters 16-40 Called from file "img/write_bmp.ml", line
150, characters 52-108 ...
Any hints?
Well, you're using physical equality (==) to compare the colors in your hash table. If the colors are structured values (I can't tell from this code), none of them will be physically equal to each other. If all the colors are distinct objects, they will all go into the table, which could really be quite a large number of objects. On the other hand, the hash function is going to be based on the actual color R,G,B values, so there may well be a large number of duplicates. This will mean that your hash buckets will have very long chains. Perhaps some internal function isn't tail recursive, and so is overflowing the stack.
Normally the length of the longest chain will be 2 or 3, so it wouldn't be surprising that this error doesn't come up often.
Looking at my copy of hashtbl.ml (OCaml 3.12.1), I don't see anything non-tail-recursive on line 54. So my guess might be wrong. On line 54 a new internal array is allocated for the hash table. So another idea is that your hashtable is just getting too big (perhaps due to the unwanted duplicates).
One thing to try is to use structural equality (=) and see if the problem goes away.
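To see why mixing identity-based equality with a value-based hash fills the table with duplicates, here is a rough Python analogy. It is not OCaml and says nothing about how Hashtbl is implemented internally; the class and function names are hypothetical. Both classes hash on the R,G,B values, but only one of them also compares by value.

```python
class PhysicalRgb:
    """Hash by value, compare by identity (like equal = (==) in the functor)."""
    def __init__(self, r, g, b):
        self.rgb = (r, g, b)
    def __hash__(self):
        return hash(self.rgb)
    def __eq__(self, other):
        return self is other

class StructuralRgb:
    """Hash by value, compare by value (like equal = (=))."""
    def __init__(self, r, g, b):
        self.rgb = (r, g, b)
    def __hash__(self):
        return hash(self.rgb)
    def __eq__(self, other):
        return isinstance(other, StructuralRgb) and self.rgb == other.rgb

def count_entries(color_class, pixels):
    """Insert one freshly built red color per pixel, as the loop in the question does."""
    table = {}
    for _ in range(pixels):
        color = color_class(255, 0, 0)
        if color not in table:
            table[color] = len(table)
    return len(table)

print(count_entries(PhysicalRgb, 200))    # one entry per pixel, all with the same hash
print(count_entries(StructuralRgb, 200))  # a single entry
```

With identity comparison every pixel adds a new entry that collides with all the previous ones, which matches the "very long chains" scenario described above.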
One reason you may have non-termination or stack overflows is if your type contains cyclic values. (==) will terminate on cyclic values (while (=) may not), but Hashtbl.hash is probably not cycle-safe. So if you manipulate cyclic values of type Img_types.rgb_t, you have to devise your own cycle-safe hash function -- typically, calling Hashtbl.hash on only one of the non-cyclic subfields/subcomponents of your values.
I've already been bitten by precisely this issue in the past. Not a fun bug to track down.
