Idiomatic alternative to reflection

I am trying to select a digest algorithm (from rust-crypto) based on a configuration string. In Python or JavaScript, say, I'd probably use reflection to get at this:
    getattr(Digest, myAlgorithm)
...but from what I've been able to Google, this isn't best practice in a language such as Rust (plus I've found no details on how it could be done). My initial thought was to use a pattern match:
    let mut digest = match myAlgorithm {
        "sha256" => Sha256::new(),
        ...
    };
However, this doesn't work because, while all the branches of the match implement the same trait, they're ultimately different types. Moreover, presuming there were a way around this, it's a lot of hassle to manually enumerate all these options in the code.
What's the right way to do this in Rust?

Since all the algorithms implement the same trait Digest, which offers everything you need, you can box all the algorithms and convert them to a common Box<dyn Digest>:
    let mut digest: Box<dyn Digest> = match my_algorithm {
        "sha256" => Box::new(Sha256::new()),
        ...
    };
Now you don't know anymore what the type was, but you still know it's a Digest.
Python and JavaScript do the boxing (a dynamic heap allocation) for you in the background. Rust is very picky about such things and therefore requires you to state explicitly what you mean.
It would be interesting to have reflection in Rust to be able to enumerate all types in scope that implement a trait, but such a system would require quite some effort in the Rust compiler and in the brains of the Rust community members. Don't expect it any time soon.
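If the match lives behind a single factory function, you only enumerate the algorithms once, and an unknown configuration string is handled explicitly instead of blowing up at runtime. Here is a minimal, self-contained sketch of that pattern; the Digest trait and the algorithm types below are stand-ins, not rust-crypto's actual API:

    // Stand-in trait; the real rust-crypto trait hashes input instead.
    trait Digest {
        fn name(&self) -> &'static str;
    }

    // Stand-ins for the real algorithm types.
    struct Sha256;
    struct Sha512;

    impl Digest for Sha256 {
        fn name(&self) -> &'static str { "sha256" }
    }
    impl Digest for Sha512 {
        fn name(&self) -> &'static str { "sha512" }
    }

    // One factory function replaces the reflection lookup.
    fn digest_by_name(name: &str) -> Option<Box<dyn Digest>> {
        match name {
            "sha256" => Some(Box::new(Sha256)),
            "sha512" => Some(Box::new(Sha512)),
            _ => None, // unknown algorithm: the caller decides what to do
        }
    }

    fn main() {
        let digest = digest_by_name("sha256").expect("unknown digest algorithm");
        println!("selected: {}", digest.name());
    }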

Ada: how to send and receive objects between executables on separate machines

I need to send some fairly large data structures between instances of my running Ada program. Obviously JSON over HTTPS is an option; not one I want to use, as it's bigger than I'd like in terms of data overhead, but it will work for now.
Ideally I'd want to mash it into a binary blob and send it with a hash to confirm the message. Is there a decent way to do this in Ada?
I would look for a solution based on Streams, sent over TCP.
If you want to implement your own blocking and hashing, you’ll probably need to write the raw stream to memory first so that you can tell how big the blob is and work out the checksum. A fairly straightforward approach to this would be here, spec and body.
For a solution that’s had a lot more work put into it, look at Dmitry Kazakov’s Simple Components’ Block Streams.
Ideally I'd want to mash it into a binary blob and send it with a hash to confirm the message. Is there a decent way to do this in Ada?
As mentioned, the DSA (the Distributed Systems Annex, Annex E) is an excellent way to handle this, though with some caveats due to the implementations (rather than the language): the definitions for the DSA are broad enough that the transport could be almost anything, so long as the interface (RPC- and Stream-based) is respected.
Things will be simpler if you structure your program with proper categorizations from the outset (see Pure, Shared_Passive, Remote_Types, and Remote_Call_Interface in the ARM, and the Ada Rationale) rather than trying to shoehorn something extant into the DSA's required structuring. (That said, there are some cases where modifying an extant program to be DSA-capable is simply a matter of adding the categorization pragmas/aspects and configuring and compiling.)
Note that Ada's containers are designed so that they can be used in DSA programs, and are all [IIRC] Remote_Types categorized.
I need to send some fairly large data structures between instances of my running Ada program.
Also an option is ASN.1, which allows you to make a type definition for some data in a language- and machine-independent manner. There are several ASN.1 compilers, and a good chunk of them can generate Ada; here's one (written in F*, IIRC) used by the ESA and freely available as open source.
ASN.1 has an encoding scheme which is optimized for space, and so will give you the most compact on-the-wire representation.
Obviously JSON over HTTPS is an option; not one I want to use, as it's bigger than I'd like in terms of data overhead, but it will work for now.
Using HTTP and JSON directly is attractive to many people because it's "easy", though this ease is typically misleading: all the things that they don't do, such as range-checking values or validating the structure, are offloaded onto the programmer. That said, you can make things modular and use generics to allow you to "swap out" methods:
    Generic
       Type Data(<>) is private;
       Type Transport_Type(<>) is private;
       Target : However_You_Address_The_Target;
       with Function Encode( Input : Data ) return Transport_Type;
    Procedure Send( Value : Data );
and
    Generic
       Type Data(<>) is private;
       Type Transport_Type(<>) is private;
       with Function Decode( Input : Transport_Type ) return Data;
    Function Receive( Value : Transport_Type ) return Data;
Or something similar to this. I would rate this as less convenient than using the DSA, but also possibly a bit simpler, considering you [mostly] don't have to worry about categorization with this method.

Ignoring certain types with respect to = in OCaml

I'm in a situation where I'm modifying an existing compiler written in OCaml. I've added locations to the AST of the compiled language, but this has caused a bunch of bugs, because equality checks that previously succeeded now fail when identical ASTs have different locations attached.
In particular, I'm seeing List.mem return false when it should return true, since it relies on the polymorphic equality =.
I'm wondering: is there a way to specify that = should always return true for any two values of my location type?
It would be a ton of work to refactor the entire compiler to use a custom equality everywhere, particularly since many polymorphic functions rely on being able to use = on any type.
There's no existing OCaml mechanism to do what you want.
You can use ppx to write OCaml syntax extensions, and (as I understand it) the behavior can depend on types. So there's some chance you could get things working that way. But it wouldn't be as straightforward as what you're asking for. I suspect you would need to explicitly handle = and any standard functions (like List.mem) that use = implicitly. (Note that I have no experience with ppx.)
I found a description of PPX here: http://ocamllabs.io/doc/ppx.html
Many experienced OCaml programmers avoid the use of built-in polymorphic equality because its behavior is often surprising. So it might be worth converting to a custom comparison function after all.
What an annoying problem to have.
If you are desperate and willing to write a little C code you can change the representation of locations to Custom_tag blocks, which allow customising the behaviour of some of the polymorphic operations. It's a nasty solution, and I suggest you look hard for a better approach before resorting to this one.
One possibility is that most of the compiler does not use locations at all. If so, you might be able to get away with replacing every location in the AST with the same dummy location. That should allow equality to behave as if locations were not there at all. This is rather hacky, and may not be possible if passes later in the compiler make any use of location info.
The 'clean' solution is to define a sane equality operation for ASTs (or to derive one using ppx) and to change the code to use that. As you say, this would be a lot more work.

Golang: arithmetic operators on structs

Is there a way to define arithmetic operators between structs?
I'm using a decimal package to work with fixed decimal positions and avoid float rounding errors. It defines operations by calling functions like Mul, Add, Sub, etc.
I'd like to use that structure like I do with floats: 6 / 2, not decimal.NewFromFloat(6).Div(decimal.NewFromFloat(2)).
I was hoping to find some interface to implement that allows me to do that kind of operation, or maybe some kind of getter/setter to work with the underlying values... Any ideas?
No, you can't overload operators in Go. There is a FAQ entry about it:
Why does Go not support overloading of methods and operators?
Method dispatch is simplified if it doesn't need to do type matching as well. Experience with other languages told us that having a variety of methods with the same name but different signatures was occasionally useful but that it could also be confusing and fragile in practice. Matching only by name and requiring consistency in the types was a major simplifying decision in Go's type system.
Regarding operator overloading, it seems more a convenience than an absolute requirement. Again, things are simpler without it.
https://golang.org/doc/faq#overloading
If you need a working solution, look at how package math/big deals with arithmetic sans operator overloading.

How does Rust achieve compile-time-only pointer safety?

I have read somewhere that in a language that features pointers, it is not possible for the compiler to decide fully at compile time whether all pointers are used correctly and/or are valid (refer to an alive object) for various reasons, since that would essentially constitute solving the halting problem. That is not surprising, intuitively, because in this case, we would be able to infer the runtime behavior of a program during compile-time, similarly to what's stated in this related question.
However, from what I can tell, the Rust language requires that pointer checking be done entirely at compile time (there's no undefined behavior related to pointers, "safe" pointers at least, and there's no "invalid pointer" or "null pointer" runtime exception either).
Assuming that the Rust compiler doesn't solve the halting problem, where does the fallacy lie?
Is it the case that pointer checking isn't done entirely at compile-time, and Rust's smart pointers still introduce some runtime overhead compared to, say, raw pointers in C?
Or is it possible that the Rust compiler can't make fully correct decisions, and it sometimes needs to Just Trust The Programmer™, probably using one of the lifetime annotations (the ones with the <'lifetime_ident> syntax)? In this case, does this mean that the pointer/memory safety guarantee is not 100%, and still relies on the programmer writing correct code?
Another possibility is that Rust pointers are non-"universal" or restricted in some sense, so that the compiler can infer their properties entirely during compile-time, but they are not as useful as e.g. raw pointers in C or smart pointers in C++.
Or maybe it is something completely different and I'm misinterpreting one or more of { "pointer", "safety", "guaranteed", "compile-time" }.
The One Sneaky Trick That Language Designers Hate™ is basically this: Rust can only reason about the 'static lifetime (used for global variables and other whole-program lifetime things) and the lifetime of stack (i.e. local) variables: it cannot express or reason about the lifetime of heap allocations.
This means a few things. First of all, the library types that deal with heap allocations (i.e. Box<T>, Rc<T>, Arc<T>) all own the thing they point to. As a result, they don't actually need lifetimes in order to exist.
Where you do need lifetimes is when you're accessing the contents of a smart pointer. For example:
    let mut x: Box<i32> = Box::new(0);
    *x = 42;
What is happening behind the scenes on that second line is this:
    {
        let box_ref: &mut Box<i32> = &mut x;
        let heap_ref: &mut i32 = box_ref.deref_mut();
        *heap_ref = 42;
    }
In other words, because Box isn't magic, we have to tell the compiler how to turn it into a regular, run-of-the-mill borrowed pointer. This is what the Deref and DerefMut traits are for. This raises the question: what, exactly, is the lifetime of heap_ref?
The answer to this is in the definition of DerefMut (paraphrased, with the lifetime written out explicitly):
    trait DerefMut: Deref {
        // `Target` is the associated type inherited from `Deref`.
        fn deref_mut<'a>(&'a mut self) -> &'a mut Self::Target;
    }
Like I said before, Rust absolutely cannot talk about "heap lifetimes". Instead, it has to tie the lifetime of the heap-allocated i32 to the only other lifetime it has on hand: the lifetime of the Box.
What this means is that "complicated" things don't have an expressible lifetime, and thus have to own the thing they manage. When you convert a complicated smart pointer/handle into a simple borrowed pointer, that is the moment that you have to introduce a lifetime, and you usually just use the lifetime of the handle itself.
Actually, I should clarify: by "lifetime of the handle", I really mean "the lifetime of the variable in which the handle is currently being stored": lifetimes are really for storage, not for values. This is typically why newcomers to Rust get tripped up when they can't work out why they can't do something like:
    fn thingy<'a>() -> (Box<i32>, &'a i32) {
        let x = Box::new(1701);
        (x, &x)
    }
"But... I know that the box will continue to live on, why does the compiler say it doesn't?!" Because Rust can't reason about heap lifetimes and must resort to tying the lifetime of &x to the variable x, not the heap allocation it happens to point to.
Is it the case that pointer checking isn't done entirely at compile-time, and Rust's smart pointers still introduce some runtime overhead compared to, say, raw pointers in C?
There are special runtime checks for things that can't be checked at compile time; these live in the std::cell module (Cell, RefCell, and friends). But in general, Rust checks everything at compile time and should produce the same code as you would get from C (if your C code isn't doing anything undefined).
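For instance, RefCell keeps the same aliasing rules but enforces them at runtime with a borrow counter. A minimal sketch of such a check:

    use std::cell::RefCell;

    fn main() {
        let cell = RefCell::new(5);

        let reader = cell.borrow(); // shared borrow, counted at runtime
        // Taking a mutable borrow now would violate the aliasing rules,
        // so it fails at runtime instead of at compile time:
        assert!(cell.try_borrow_mut().is_err());
        drop(reader);

        *cell.borrow_mut() += 1;    // fine once the shared borrow is gone
        assert_eq!(*cell.borrow(), 6);
    }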
Or is it possible that the Rust compiler can't make fully correct decisions, and it sometimes needs to Just Trust The Programmer™, probably using one of the lifetime annotations (the ones with the <'lifetime_ident> syntax)? In this case, does this mean that the pointer/memory safety guarantee is not 100%, and still relies on the programmer writing correct code?
If the compiler cannot make the correct decision, you get a compile-time error telling you that it cannot verify what you are doing. This might also bar you from things you know are correct but the compiler doesn't. You can always drop to unsafe code in that case; but, as you correctly assumed, the compiler then relies partly on the programmer.
The compiler checks the function's implementation to see if it does exactly what the lifetimes say it does. Then, at the call site of the function, it checks whether the programmer uses the function correctly. This is similar to type-checking: a C++ compiler checks whether you are returning an object of the correct type, then checks at the call site whether the returned object is stored in a variable of the correct type. At no time can the programmer of a function break the promise (except if unsafe is used; and you can always let the compiler enforce that no unsafe is used in your project, e.g. with #![forbid(unsafe_code)]).
Rust is continuously improved; more things may become legal in Rust as the compiler gets smarter.
Another possibility is that Rust pointers are non-"universal" or restricted in some sense, so that the compiler can infer their properties entirely during compile-time, but they are not as useful as e. g. raw pointers in C or smart pointers in C++.
There are a few things that can go wrong in C:
dangling pointers
double free
null pointers
wild pointers
These don't happen in safe Rust:
Dangling pointers: you can never have a pointer to an object that is no longer on the stack or heap. That is proven at compile time through lifetimes.
Double free: there is no manual memory management in Rust. Use a Box to allocate your objects (similar, but not equal, to a unique_ptr in C++); Boxes free their memory automatically, exactly once.
Null pointers: safe references are never null; where "no value" is possible you use Option instead, so there is nothing to dereference by accident.
Wild pointers: in safe Rust you can create a raw pointer to any location, but you cannot dereference it. Any reference you create is always bound to a live object.
There are a few things that can go wrong in C++:
everything that can go wrong in C
smart pointers only help you not forget to call free; you can still create dangling references:

    auto x = make_unique<int>(42);
    auto& y = *x;   // y refers to the heap value owned by x
    x.reset();      // frees the allocation; y now dangles
    y = 99;         // undefined behavior: write through a dangling reference

Rust fixes those:
everything from the C list above
as long as y exists, you may not modify x. This is checked at compile time and cannot be circumvented by more levels of indirection or structs (sketched below).
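For comparison, here is a direct Rust translation of the C++ snippet above; the borrow checker rejects it, so this sketch intentionally does not compile:

    fn main() {
        let mut x = Box::new(42);
        let y = &mut *x;   // borrow the heap value through `x`
        x = Box::new(0);   // error[E0506]: cannot assign to `x` because it is borrowed
        *y = 99;           // `y` is still live here, which is why the line above is rejected
    }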
I have read somewhere that in a language that features pointers, it is not possible for the compiler to decide fully at compile time whether all pointers are used correctly and/or are valid (refer to an alive object) for various reasons, since that would essentially constitute solving the halting problem.
Rust doesn't prove that all your pointers are used correctly; you may still write bogus programs. Rust proves that you are not using invalid pointers, that you never have null pointers, and that you never have two pointers to the same object unless all of those pointers are immutable (const). Rust does not allow you to write arbitrary programs (since that set would include programs that violate memory safety). Right now Rust still prevents you from writing some useful programs, but there are plans to allow more (legal) programs to be written in safe Rust.
That is not surprising, intuitively, because in this case, we would be able to infer the runtime behavior of a program during compile-time, similarly to what's stated in this related question.
Revisiting the example in your referenced question about the halting problem:
    void foo() {
        if (bar() == 0) this->a = 1;
    }
The above C++ code would look one of two ways in Rust:
    fn foo(&mut self) {
        if self.bar() == 0 {
            self.a = 1;
        }
    }

    fn foo(&mut self) {
        if bar() == 0 {
            self.a = 1;
        }
    }
For an arbitrary bar you cannot prove this, because it might access global state. Rust is soon getting const functions, which can be used to compute things at compile time (similar to constexpr). If bar is const, it becomes trivial to prove whether self.a is set to 1 at compile time. Other than that, without pure functions or other restrictions on the function's content, you can never prove whether self.a is set to 1 or not.
Rust currently doesn't care whether your code is called or not. It cares whether the memory of self.a still exists during the assignment. self.bar() can never destroy self (except in unsafe code), therefore self.a will always be available inside the if branch.
Most of the safety of Rust references is guaranteed by strict rules:
If you possess a const reference (&), you can clone this reference and pass it around, but not create a mutable (&mut) reference out of it.
If a mutable (&mut) reference to an object exists, no other reference to this object can exist.
A reference is not allowed to outlive the object it refers to, and all functions manipulating references must declare how the references in their input and output are linked, using lifetime annotations (like 'a); a small example follows.
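As a minimal sketch of that third rule: the annotation below declares that the returned reference lives no longer than the input, and the compiler checks both the function body and every call site against that declaration:

    // `'a` links output to input: the returned slice may not outlive `input`.
    fn first_word<'a>(input: &'a str) -> &'a str {
        input.split_whitespace().next().unwrap_or("")
    }

    fn main() {
        let s = String::from("hello world");
        let w = first_word(&s);
        println!("{}", w); // fine: `s` is still alive here
    }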
So in terms of expressiveness, we are effectively more limited than when using plain raw pointers (for example, building a graph structure is not possible using only safe references), but these rules can effectively be completely checked at compile-time.
Yet it is still possible to use raw pointers, but you have to enclose the code dealing with them in an unsafe { /* ... */ } block, telling the compiler "trust me, I know what I am doing here". That is what some special smart pointers do internally, such as RefCell, which lets you have these rules checked at runtime rather than at compile time, to gain expressiveness.
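As a small illustration of that boundary, creating a raw pointer is safe, while dereferencing one is only allowed inside unsafe:

    fn main() {
        let x = 42i32;
        let p: *const i32 = &x; // making a raw pointer is safe...
        let v = unsafe { *p };  // ...dereferencing it requires `unsafe`
        println!("{}", v);
    }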

Program extraction using native integers/words (not bignums) from Isabelle theory

This question comes in a context where Isabelle is used with formal software development in mind, more than with pure mathematics in mind (and from a standalone developer's perspective).
It seems that, at best, SML programs generated from an Isabelle theory use SML's IntInf.int, not the native integer type Int.int, even if Code_Target_Int, Code_Binary_Nat or Code_Target_Nat is used. Investigation of these theories' sources seems to confirm that is all they can do. Native platform integers may be required for multiple reasons, including efficiency, and the case where the SML imperative program is to be optionally translated into an imperative language subset (e.g. C or Ada), which is relevant when the theory relies on the Imperative_HOL theory. The codegen.pdf document which comes with the Isabelle distribution did not help with this, except in suggesting the first of the options below.
Options may be:
1. Not using Isabelle's int and nat, and re‑creating a new numeric type from scratch, then using the code_printing command (with its type_constructor and constant clauses) to give it the native platform representation and operations (which implies including the range limitations in the theory in some way). This must be tedious, though hopefully not error‑prone, thanks to the formal environment. Note this does not seem feasible with Isabelle's own int and nat: it makes code generation fail, and nothing tells you which constants are missing from the code_printing command.
2. If the SML program is to be compiled directly (e.g. with MLton), tweaking the SML environment with a replacement IntInf structure: may be unsafe or not feasible, and still requires embedding the range limitations in the theory, so the previous option may be better than this one.
3. Editing the generated program to change IntInf into Int: easy, but is it safe? (At least IntInf implements the same signature as Int does, so it may be.) As above, this requires specifying bounds in the theory in some way, which is fine.
4. Diving into Isabelle's internals: surely unreasonable, even worse than the second option.
There exists a Word theory, but according to some readings, it seems not suited for this purpose.
Are there other known options not listed here? Any comments on the listed options?
If there is no ready‑made solution (I feel there is none at the time of writing), what hints or leads would be best to know of? (e.g. links to documents, mentions of concepts)
Update
Points #2 and #3 of the list may be OK (if they are at all) only if there is a single integer type. If the program uses more than one, they are not applicable.
Directly generating native words from Isabelle int would be unsound, because your formalisation would not take overflow into account where it exists in reality.
It looks like the AFP entry Native_Word does what you want, though:
http://afp.sourceforge.net/entries/Native_Word.shtml
