How can I print polymorphic values in Standard ML? - functional-programming

Is there a way to print polymorphic values in Standard ML (SML/NJ specifically)? I have a polymorphic function that is not doing what I want and due to the abysmal state that is debugging in SML (see Any real world experience debugging a production functional program?), I would like to see what it is doing with some good-ol' print's. A simple example would be (at a prompt):
fun justThisOnce(x : 'a) : 'a = (print(x); x);
justThisOnce(42);
Other suggestions are appreciated. In the meantime I'll keep staring the offending code into submission.
Update
I was able to find the bug but the question still stands in the hopes of preventing future pain and suffering.

No, there is no way to print a polymorphic value.
You have two choices:
Specialize your function to integers or strings, which are readily printed. Then when the bug is slain, make it polymorphic again.
If the bug manifests only with some other instantiation, pass show as an additional argument to your function. So for example, if your polymorphic function has type
'a list -> 'a list
you extend the type to
('a -> string) -> 'a list -> 'a list
You use show internally to print, and then by partially applying the function to a suitable show, you can get a version you can use in the original context.
It's very tedious but it does help. (But be warned: it may drive you to try Haskell.)

Only in MOSML: Merely for debugging purposes, use the printVal function. Note that this function is only available in toplevel mode, it will cause an error when you try to compile your program.
Edit: In that case, I'm afraid there is no general solution, you need to translate your values explicitly to strings, and print those. See other answer for good suggestions.

Related

Knowing when what you're looking at must be a macro

I know there is macro-function, explained here, which allows you to check, but is it also possible in simply reading lisp source to sometimes infer of what you're looking at "that must be a macro"? (assuming of course you have never seen the function/macro before).
I'm fairly sure the answer is yes, but as this seems so fundamental, I thought worth asking, especially because any nuances on this may be valuable & interesting to know about.
In Paul Graham's ANSI Common Lisp, p70, he is describing how to use defstruct.
When I see (defstruct point x y), were I to know absolutely nothing about what defstruct was, this could just as well be a function.
But when I see
(defstruct polemic
(subject "foo")
(effect "bar"))
I know that must be a macro because (let's assume), I also know that subject and effect are undefined functions. (I know that because they error with undefined function when called 'at the top level'(?)) (if that's the right term).
If the two list arguments to defstruct above were quoted, it would not be so simple. Because they're not quoted, it must be a macro.
Is it as simple as that?
I've changed the field names slightly from those used on the book to make this question clearer.
Finally, Graham writes:
"We can specify default values for structure fields by enclosing the field name and a default expression in a list in the original definition"
What I'm noticing is that that's true but it is not a (quoted) list. Would any readers of this post have phrased the above sentence at all differently (given that macros haven't been introduced in the book yet (though I have a basic awareness of what they are)).
My feeling is it's not a "data list" those default expressions are enclosed in. (apologies for bad terminology) - seeking how rightly to conceptualise here.
In general, you're right: if there's some nesting inside the call and you are sure that the car's of the nested lists aren't functions - it's a macro.
Also, almost always, def-something and with-something are macros.
But there's no guarantee. The question is, what are you trying to accomplish? Some code walking/transformation or external processing (like in an editor). For the latter, you should keep in mind that full control is possible only if you perform code evaluation, although heuristics (like in Emacs) can take you pretty far. Or you just want to develop your intuition for faster code reading...
There is a set of conventions that identify quite cleary what forms are supposed to be macros, simply by mimicking the syntax of existing macros or special operators of CL.
For example, the following is a mix of various imaginary macros, but even without knowing their definition, the code shouldn't be too hard to figure out:
(defun/typed example ((id (integer 0 10)))
(with-connection (connection (connect id))
(do-events (event connection)
(event-case event
(:quit (&optional code) (return code))))))
The usual advice about macros is to avoid them if possible, so if you spot something that doesn't make sense as a lisp expression, it probably is, or is enclosed in, a macro.
(defstruct point x y)
[...] were I to know absolutely nothing about what defstruct was, this could just as well be a function.
There are various hints that this is not a function. First of all, the name starts with def. Then, if defstruct was a function, then point, x and y would all be evaluated before calling the function, and that means the code would be relying on global variables, even though they are not wearing earmuffs (e.g. *point*, *x*, *y*), and you probably won't find any definition for them in the preceding forms (or later in the same compilation unit). Also, if it was a function, the result would be discarded directly since it is not used (this is a toplevel form). That only indicates the probable presence of side-effects, but still, this would be unusual.
A top-level function with side-effects would look like this instead, with quoted data:
(register-struct 'point '(x y))
Finally, there are cases where you cannot easily guess if you are using a macro or a function:
(my-get object :slot)
This could be a function call, or you could have a macro that turns the above to (aref object 0) (assuming :slot is the zeroth slot in object, because all your objects are assumed to be of a certain custom type backed by a vector). You could also have compiler macros. In case of doubt, try to macroexpand it and look at the documentation.

Hidden remote function call of a local binding in ocaml

Given the following example for generating a lazy list number sequence:
type 'a lazy_list = Node of 'a * (unit -> 'a lazy_list);;
let make =
let rec gen i =
Node(i, fun() -> gen (i + 1))
in gen 0
;;
I asked myself the following questions when trying to understand how the example works (obviously I could not answer myself and therefore I am asking here)
When calling let Node(_, f) = make and then f(), why does the call of gen 1 inside f() succeed although gen is a local binding only existing in make?
Shouldn't the created Node be completely unaware of the existence of gen? (Obviously not since it works.)
How is a construction like this being handled by the compiler?
First of all, the questions that are asking have nothing to do with the concepts of lazy, so we can disregard this particular issue, to simplify the discussion.
As Jeffrey noted in the comment to your question, the answer is simple - it is a closure.
But let me extend it a little bit. Functional programming languages, as well as many other modern languages, including Python and C++, allows to define functions in a scope of another function and to refer to the variables available in the scope of the enclosing function. These variables are called captured variables, and the created functional object along with the captured values is called the closure.
From the compiler perspective, the implementation is rather simple (to understand). The closure is a normal value, that contains a code to be executed, as well as pointers to the extra values, that were captured from the outer scope. Since OCaml is a garbage collected language, the values are preserved, as they are referenced from a live object. In C++ the story is much more complicated, as C++ doesn't have the GC, but this is a completely different story.
Shouldn't the created Node be completely unaware of the existence of gen? (Obviously not since it works.)
The create Node is an object that has two pointers, a pointer to the initial object i, and a pointer to the anonymous function fun() -> gen (i + 1). The anonymous function has a pointer to the same initial object i. In our particular case, the i is an integer, so instead of being a pointer the i value is represented inline, but these are details that are irrelevant to the question.

How can I dispatch on traits relating two types, where the second type that co-satisfies the trait is uniquely determined by the first?

Say I have a Julia trait that relates to two types: one type is a sort of "base" type that may satisfy a sort of partial trait, the other is an associated type that is uniquely determined by the base type. (That is, the relation from BaseType -> AssociatedType is a function.) Together, these types satisfy a composite trait that is the one of interest to me.
For example:
using Traits
#traitdef IsProduct{X} begin
isnew(X) -> Bool
coolness(X) -> Float64
end
#traitdef IsProductWithMeasurement{X,M} begin
#constraints begin
istrait(IsProduct{X})
end
measurements(X) -> M
#Maybe some other stuff that dispatches on (X,M), e.g.
#fits_in(X,M) -> Bool
#how_many_fit_in(X,M) -> Int64
#But I don't want to implement these now
end
Now here are a couple of example types. Please ignore the particulars of the examples; they are just meant as MWEs and there is nothing relevant in the details:
type Rope
color::ASCIIString
age_in_years::Float64
strength::Float64
length::Float64
end
type Paper
color::ASCIIString
age_in_years::Int64
content::ASCIIString
width::Float64
height::Float64
end
function isnew(x::Rope)
(x.age_in_years < 10.0)::Bool
end
function coolness(x::Rope)
if x.color=="Orange"
return 2.0::Float64
elseif x.color!="Taupe"
return 1.0::Float64
else
return 0.0::Float64
end
end
function isnew(x::Paper)
(x.age_in_years < 1.0)::Bool
end
function coolness(x::Paper)
(x.content=="StackOverflow Answers" ? 1000.0 : 0.0)::Float64
end
Since I've defined these functions, I can do
#assert istrait(IsProduct{Rope})
#assert istrait(IsProduct{Paper})
And now if I define
function measurements(x::Rope)
(x.length)::Float64
end
function measurements(x::Paper)
(x.height,x.width)::Tuple{Float64,Float64}
end
Then I can do
#assert istrait(IsProductWithMeasurement{Rope,Float64})
#assert istrait(IsProductWithMeasurement{Paper,Tuple{Float64,Float64}})
So far so good; these run without error. Now, what I want to do is write a function like the following:
#traitfn function get_measurements{X,M;IsProductWithMeasurement{X,M}}(similar_items::Array{X,1})
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
Generically, this function is meant to be an example of "I want to use the fact that I, as the programmer, know that BaseType is always associated to AssociatedType to help the compiler with type inference. I know that whenever I do a certain task [in this case, get_measurements, but generically this could work in a bunch of cases] then I want the compiler to infer the output type of that function in a consistently patterned way."
That is, e.g.
do_something_that_makes_arrays_of_assoc_type(x::BaseType)
will always spit out Array{AssociatedType}, and
do_something_that_makes_tuples(x::BaseType)
will always spit out Tuple{Int64,BaseType,AssociatedType}.
AND, one such relationship holds for all pairs of <BaseType,AssociatedType>; e.g. if BatmanType is the base type to which RobinType is associated, and SupermanType is the base type to which LexLutherType is always associated, then
do_something_that_makes_tuple(x::BatManType)
will always output Tuple{Int64,BatmanType,RobinType}, and
do_something_that_makes_tuple(x::SuperManType)
will always output Tuple{Int64,SupermanType,LexLutherType}.
So, I understand this relationship, and I want the compiler to understand it for the sake of speed.
Now, back to the function example. If this makes sense, you will have realized that while the function definition I gave as an example is 'correct' in the sense that it satisfies this relationship and does compile, it is un-callable because the compiler doesn't understand the relationship between X and M, even though I do. In particular, since M doesn't appear in the method signature, there is no way for Julia to dispatch on the function.
So far, the only thing I have thought to do to solve this problem is to create a sort of workaround where I "compute" the associated type on the fly, and I can still use method dispatch to do this computation. Consider:
function get_measurement_type_of_product(x::Rope)
Float64
end
function get_measurement_type_of_product(x::Paper)
Tuple{Float64,Float64}
end
#traitfn function get_measurements{X;IsProduct{X}}(similar_items::Array{X,1})
M = get_measurement_type_of_product(similar_items[1]::X)
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
Then indeed this compiles and is callable:
julia> get_measurements(Array{Rope,1}([Rope("blue",1.0,1.0,1.0),Rope("red",2.0,2.0,2.0)]))
2-element Array{Float64,1}:
1.0
2.0
But this is not ideal, because (a) I have to redefine this map each time, even though I feel as though I already told the compiler about the relationship between X and M by making them satisfy the trait, and (b) as far as I can guess--maybe this is wrong; I don't have direct evidence for this--the compiler won't necessarily be able to optimize as well as I want, since the relationship between X and M is "hidden" inside the return value of the function call.
One last thought: if I had the ability, what I would ideally do is something like this:
#traitdef IsProduct{X} begin
isnew(X) -> Bool
coolness(X) -> Float64
∃ ! M s.t. measurements(X) -> M
end
and then have some way of referring to the type that uniquely witnesses the existence relationship, so e.g.
#traitfn function get_measurements{X;IsProduct{X},IsWitnessType{IsProduct{X},M}}(similar_items::Array{X,1})
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
because this would be somehow dispatchable.
So: what is my specific question? I am asking, given that you presumably by this point understand that my goals are
Have my code exhibit this sort of structure generically, so that
I can effectively repeat this design pattern across a lot of cases
and then program in the abstract at the high-level of X and M,
and
do (1) in such a way that the compiler can still optimize to the best of its ability / is as aware of the relationship among
types as I, the coder, am
then, how should I do this? I think the answer is
Use Traits.jl
Do something pretty similar to what you've done so far
Also do ____some clever thing____ that the answerer will indicate,
but I'm open to the idea that in fact the correct answer is
Abandon this approach, you're thinking about the problem the wrong way
Instead, think about it this way: ____MWE____
I'd also be perfectly satisfied by answers of the form
What you are asking for is a "sophisticated" feature of Julia that is still under development, and is expected to be included in v0.x.y, so just wait...
and I'm less enthusiastic about (but still curious to hear) an answer such as
Abandon Julia; instead use the language ________ that is designed for this type of thing
I also think this might be related to the question of typing Julia's function outputs, which as I take it is also under consideration, though I haven't been able to puzzle out the exact representation of this problem in terms of that one.

Reflecting on a Type parameter

I am trying to create a function
import Language.Reflection
foo : Type -> TT
I tried it by using the reflect tactic:
foo = proof
{
intro t
reflect t
}
but this reflects on the variable t itself:
*SOQuestion> foo
\t => P Bound (UN "t") (TType (UVar 41)) : Type -> TT
Reflection in Idris is a purely syntactic, compile-time only feature. To predict how it will work, you need to know about how Idris converts your program to its core language. Importantly, you won't be able to get ahold of reflected terms at runtime and reconstruct them like you would with Lisp. Here's how your program is compiled:
Internally, Idris creates a hole that will expect something of type Type -> TT.
It runs the proof script for foo in this state. We start with no assumptions and a goal of type Type -> TT. That is, there's a term being constructed which looks like ?rhs : Type => TT . rhs. The ?foo : ty => body syntax shows that there's a hole called foo whose eventual value will be available inside of body.
The step intro t creates a function whose argument is t : Type - this means that we now have a term like ?foo_body : TT . \t : Type => foo_body.
The reflect t step then fills the current hole by taking the term on its right-hand side and converting it to a TT. That term is in fact just a reference to the argument of the function, so you get the variable t. reflect, like all other proof script steps, only has access to the information that is available directly at compile time. Thus, the result of filling in foo_body with the reflection of the term t is P Bound (UN "t") (TType (UVar (-1))).
If you could do what you are wanting here, it would have major consequences both for understanding Idris code and for running it efficiently.
The loss in understanding would come from the inability to use parametricity to reason about the behavior of functions based on their types. All functions would effectively become potentially ad-hoc polymorphic, because they could (say) run differently on lists of strings than on lists of ints.
The loss in performance would come from representing enough type information to do the reflection. After Idris code is compiled, there is no type information left in it (unlike in a system such as the JVM or .NET or a dynamically typed system such as Python, where types have a runtime representation that code can access). In Idris, types can be very large, because they can contain arbitrary programs - this means that far more information would have to be maintained, and computation occurring at the type level would also have to be preserved and repeated at runtime.
If you're wanting to reflect on the structure of a type for further proof automation at compile time, take a look at the applyTactic tactic. Its argument should be a function that takes a reflected context and goal and gives back a new reflected tactic script. An example can be seen in the Data.Vect source.
So I suppose the summary is that Idris can't do what you want, and it probably never will be able to, but you might be able to make progress another way.

Haskell "collections" language design

Why is the Haskell implementation so focused on linked lists?
For example, I know Data.Sequence is more efficient
with most of the list operations (except for the cons operation), and is used a lot;
syntactically, though, it is "hardly supported". Haskell has put a lot of effort into functional abstractions, such as the Functor and the Foldable class, but their syntax is not compatible with that of the default list.
If, in a project I want to optimize and replace my lists with sequences - or if I suddenly want support for infinite collections, and replace my sequences with lists - the resulting code changes are abhorrent.
So I guess my wondering can be made concrete in questions such as:
Why isn't the type of map equal to (Functor f) => (a -> b) -> f a -> f b?
Why can't the [] and (:) functions be used for, for example, the type in Data.Sequence?
I am really hoping there is some explanation for this, that doesn't include the words "backwards compatibility" or "it just grew that way", though if you think there isn't, please let me know. Any relevant language extensions are welcome as well.
Before getting into why, here's a summary of the problem and what you can do about it. The constructors [] and (:) are reserved for lists and cannot be redefined. If you plan to use the same code with multiple data types, then define or choose a type class representing the interface you want to support, and use methods from that class.
Here are some generalized functions that work on both lists and sequences. I don't know of a generalization of (:), but you could write your own.
fmap instead of map
mempty instead of []
mappend instead of (++)
If you plan to do a one-off data type replacement, then you can define your own names for things, and redefine them later.
-- For now, use lists
type List a = [a]
nil = []
cons x xs = x : xs
{- Switch to Seq in the future
-- type List a = Seq a
-- nil = empty
-- cons x xs = x <| xs
-}
Note that [] and (:) are constructors: you can also use them for pattern matching. Pattern matching is specific to one type constructor, so you can't extend a pattern to work on a new data type without rewriting the pattern-matchign code.
Why there's so much list-specific stuff in Haskell
Lists are commonly used to represent sequential computations, rather than data. In an imperative language, you might build a Set with a loop that creates elements and inserts them into the set one by one. In Haskell, you do the same thing by creating a list and then passing the list to Set.fromList. Since lists so closely match this abstraction of computation, they have a place that's unlikely to ever be superseded by another data structure.
The fact remains that some functions are list-specific when they could have been generic. Some common functions like map were made list-specific so that new users would have less to learn. In particular, they provide simpler and (it was decided) more understandable error messages. Since it's possible to use generic functions instead, the problem is really just a syntactic inconvenience. It's worth noting that Haskell language implementations have very little list-speficic code, so new data structures and methods can be just as efficient as the "built-in" ones.
There are several classes that are useful generalizations of lists:
Functor supplies fmap, a generalization of map.
Monoid supplies methods useful for collections with list-like structure. The empty list [] is generalized to other containers by mempty, and list concatenation (++) is generalized to other containers by mappend.
Applicative and Monad supply methods that are useful for interpreting collections as computations.
Traversable and Foldable supply useful methods for running computations over collections.
Of these, only Functor and Monad were in the influential Haskell 98 spec, so the others have been overlooked to varying degrees by library writers, depending on when the library was written and how actively it was maintained. The core libraries have been good about supporting new interfaces.
I remember reading somewhere that map is for lists by default since newcomers to Haskell would be put off if they made a mistake and saw a complex error about "Functors", which they have no idea about. Therefore, they have both map and fmap instead of just map.
EDIT: That "somewhere" is the Monad Reader Issue 13, page 20, footnote 3:
3You might ask why we need a separate map function. Why not just do away with the current
list-only map function, and rename fmap to map instead? Well, that’s a good question. The
usual argument is that someone just learning Haskell, when using map incorrectly, would much
rather see an error about lists than about Functors.
For (:), the (<|) function seems to be a replacement. I have no idea about [].
A nitpick, Data.Sequence isn't more efficient for "list operations", it is more efficient for sequence operations. That said, a lot of the functions in Data.List are really sequence operations. The finger tree inside Data.Sequence has to do quite a bit more work for a cons (<|) equivalent to list (:), and its memory representation is also somewhat larger than a list as it is made from two data types a FingerTree and a Deep.
The extra syntax for lists is fine, it hits the sweet spot at what lists are good at - cons (:) and pattern-matching from the left. Whether or not sequences should have extra syntax is further debate, but as you can get a very long way with lists, and lists are inherently simple, having good syntax is a must.
List isn't an ideal representation for Strings - the memory layout is inefficient as each Char is wrapped with a constructor. This is why ByteStrings were introduced. Although they are laid out as an array ByteStrings have to do a bit of administrative work - [Char] can still be competitive if you are using short strings. In GHC there are language extensions to give ByteStrings more String-like syntax.
The other major lazy functional Clean has always represented strings as byte arrays, but its type system made this more practical - I believe the ByteString library uses unsafePerfomIO under the hood.
With version 7.8, ghc supports overloading list literals, compare the manual. For example, given appropriate IsList instances, you can write
['0' .. '9'] :: Set Char
[1 .. 10] :: Vector Int
[("default",0), (k1,v1)] :: Map String Int
['a' .. 'z'] :: Text
(quoted from the documentation).
I am pretty sure this won't be an answer to your question, but still.
I wish Haskell had more liberal function names(mixfix!) a la Agda. Then, the syntax for list constructors (:,[]) wouldn't have been magic; allowing us to at least hide the list type and use the same tokens for our own types.
The amount of code change while migrating between list and custom sequence types would be minimal then.
About map, you are a bit luckier. You can always hide map, and set it equal to fmap yourself.
import Prelude hiding(map)
map :: (Functor f) => (a -> b) -> f a -> f b
map = fmap
Prelude is great, but it isn't the best part of Haskell.

Resources