Specifying Referential transparency in ACSL - frama-c

I want to find some ACSL annotation that can be applied to a function or function pointer to indicate that it has the property of referential transparency. Some way to say "this function will always return the same value when given the same arguments". So far I haven't found any such way. Can anyone point me to a way to express that?
Maybe some way to refer to an arbitrary logic function? If I could name an unknown logic boolean unknown_function(void* a, void* b) = /* this is unknown */; then I could document a function as having a postcondition that its \result is equal to this arbitrary/unknown logic function?
The larger context is trying to do type-erased comparisons. I want to generally express the concept of "the user has given me void*s to work with and a bool (*)(void const*, void const*) to compare them with, and the user is guaranteeing to me that the function provided really is a strict partial order over whatever those pointers point to." If I had that, then I could start to describe properties of these type-erased objects being sorted, for example.

There is indeed no direct possibility to do that in ACSL: a function contract only specifies what happens during a single call of the function. You could indeed rely on a declared but left undefined logic function, with a reads clause that specifies the part of the C memory state that the function will need to compute its result, e.g.
/*@ logic boolean unknown_function{L}(int* a, int* b) reads a[0 .. 1], b[2 .. 3]; */
but if you work with void *, without knowing the size of the underlying objects, this might be tricky to specify: unless the result of unknown_function relies solely on the value of the pointer, and not the content of the pointed object, in which case you don't need that reads trick.
Note in addition that contracts over function pointers are not supported yet, which will probably be an issue for what you intend to do if I understand correctly your last paragraph.
Finally, you might be interested in an upcoming plug-in, RPP, that proposes a way to specify, prove, and use properties relating several calls of one or more C function(s). It is described here and here, and a public release should happen in a not-too-distant future.

Related

Is it possible to declare a tuple struct whose members are private, except for initialization?

Is it possible to declare a tuple struct where the members are hidden for all intents and purposes, except for declaring?
// usize isn't public since I don't want users to manipulate it directly
struct MyStruct(usize);
// But now I can't initialize the struct using an argument to it.
let my_var = MyStruct(0xff);
// ^^^^
// How to make this work?
Is there a way to keep the member private but still allow new structs to be initialized with an argument as shown above?
As an alternative, a method such as MyStruct::new can be implemented, but I'm still interested to know if it's possible to avoid having to use a method on the type, since the direct form is shorter and nice for types that wrap a single variable.
Background
Without going into too many details, the only purpose of this type is to wrap a single type (a helper which hides some details, adds some functionality and is optimized away completely when compiled). In this context, using the Struct(value) style of initialization is not exactly exposing hidden internals.
Further, since the wrapper is zero overhead, it's a little misleading to use the new method, which is often associated with allocation/creation rather than casting.
Just as it's convenient to type (int)v or int(v) instead of int::new(v), I'd like to do this for my own type.
It's used often, so the ability to use a short expression is very convenient. Currently I'm using a macro which calls a new method; it's OK but a little awkward/indirect, hence this question.
Strictly speaking this isn't possible in Rust.
However the desired outcome can be achieved using a normal struct with a like-named function (yes, this works!)
pub struct MyStruct {
    value: usize,
}
#[allow(non_snake_case)]
pub fn MyStruct(value: usize) -> MyStruct {
    MyStruct { value }
}
Now, you can write MyStruct(5) but not access the internals of MyStruct.
I'm afraid that such a concept is not possible, but for a good reason. Each member of a struct, unless marked with pub, is treated as an implementation detail that should not rise to the surface of the public API, regardless of when and how the object is being used. From this point of view, the question's goal is self-contradictory: wishing to keep members private while letting the API user set them arbitrarily is not only uncommon but also not very sensible.
As you mentioned, having a method named new is the recommended approach. It's not like you're compromising code readability with the extra characters you have to type. Alternatively, for the case where the struct is known to wrap around an item, making the member public can be a possible solution. That, on the other hand, would allow any kind of mutation through a mutable borrow (thus possibly breaking the struct's invariants, as mentioned by @MatthieuM). This decision depends on the intended API.

Why are the keys and values of a borrowed HashMap accessed by reference, not value?

I have a function that takes a borrowed HashMap and I need to access values by keys. Why are the keys and values taken by reference, and not by value?
My simplified code:
fn print_found_so(ids: &Vec<i32>, file_ids: &HashMap<u16, String>) {
    for pos in ids {
        let whatever: u16 = *pos as u16;
        let last_string: &String = file_ids.get(&whatever).unwrap();
        println!("found: {:?}", last_string);
    }
}
Why do I have to specify the key as a reference, i.e., file_ids.get(&whatever).unwrap() instead of file_ids.get(whatever).unwrap()?
As I understand it, the last_string has to be of type &String, meaning a borrowed string, because the owning collection is borrowed. Is that right?
Similar to the above point, am I correct in assuming pos is of type &u16 because it takes borrowed values from ids?
Think about the semantics of passing parameters as references or as values:
As reference: no ownership transfer. The called function merely borrows the parameter.
As value: the called function takes ownership of the parameter, which the caller may no longer use.
Since the function HashMap::get does not need ownership of the key to find an element, the less restrictive passing method was chosen: by reference.
Also, it does not return the value of the element, only a reference. If it returned the value, the value inside the HashMap would no longer be owned by the HashMap and thus be inaccessible in the future.
TL;DR: Rust is not Java.
Rust may have high-level constructs and data structures, but it is at heart a low-level language, as illustrated by one of its guiding principles: You don't pay for what you don't use.
As a result, the language and its libraries will as much as possible attempt to eliminate any cost that is superfluous, such as allocating memory needlessly.
Case 1: Taking the key by value.
If the key is a String, this means allocating (and deallocating) memory for each and every look-up, when you could use a local buffer that is only allocated once and for all.
Case 2: Returning by value.
Returning by value means that either:
you remove the entry from the container to give it to the user
you copy the entry in the container to give it to the user
The latter is obviously inefficient (copy means allocation); the former means that if the user wants the value back in the container, another insertion has to take place, which means another look-up and so on, and is also inefficient.
In short, returning by value is inefficient in this case.
Rust, therefore, takes the most logical choice as far as efficiency is concerned and passes and returns by reference whenever practical.
While it seems unhelpful when the key is a u16, think about how it would work with a more complex key such as a String.
In that case taking the key by value would often mean having to allocate and initialise a new String for each lookup, which would be expensive.

glVertexAttribPointer last attribute value or pointer

The last attribute of glVertexAttribPointer is of type const GLvoid*. But is it really a pointer? It is actually an offset. If I put 0, it means an offset of 0 and not a null pointer to an offset. In my engine, I use this function:
void AbstractVertexData::vertexAttribPtr(int layout) const
{
    glVertexAttribPointer(layout,
                          getShaderAttribs()[layout]->nbComponents,
                          static_cast<GLenum>(getShaderAttribs()[layout]->attribDataType),
                          getShaderAttribs()[layout]->shouldNormalize,
                          getVertexStride(layout),
                          reinterpret_cast<const void*>(getVertexAttribStart(layout)));
}
getVertexAttribStart returns an intptr_t. When I run drmemory, it says "uninitialized read" and I want to remove that warning. This warning comes from the reinterpret_cast. I can't static_cast to a const void* since my value isn't a pointer. What should I do to fix this warning?
Originally, back in OpenGL-1.1 when vertex arrays were introduced, functions like glVertexPointer, glTexCoordPointer and so on accepted pointers into client address space. When shaders were introduced, they came with arbitrary vertex attributes, and the function glVertexAttribPointer follows the same semantics (this was in OpenGL-2.0).
The buffer objects API was then reusing existing functions, where you'd pass an integer for a pointer parameter.
OpenGL-3.3 core eventually made the use of buffer objects mandatory, and ever since, the fact that the glVertexAttribPointer functions are defined with a void* in their signature has been a sore spot; I've written at length about it in https://stackoverflow.com/a/8284829/524368 (but make sure to read the rest of the answers as well).
Eventually new functions got introduced that allow for a more fine grained control over how vertex attributes are accessed, replacing glVertexAttribPointer, and those operate purely on offsets.
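To make the "offset passed through a pointer parameter" convention explicit, the conversion can be isolated in one small helper. This is only a sketch: the names bufferOffset and offsetFromPointer are hypothetical, no OpenGL call is made, and the integer-to-pointer mapping is implementation-defined (it behaves as expected on mainstream platforms).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs a byte offset into the pointer-typed value
// that glVertexAttribPointer expects while a buffer object is bound.
// The result is never dereferenced by the caller; it only carries the number.
const void* bufferOffset(std::intptr_t byteOffset) {
    return reinterpret_cast<const void*>(byteOffset);
}

// Hypothetical inverse, useful for checking that the value survives the trip.
std::intptr_t offsetFromPointer(const void* p) {
    return reinterpret_cast<std::intptr_t>(p);
}
```

A call site would then pass bufferOffset(getVertexAttribStart(layout)) as the last argument, which keeps the single reinterpret_cast in one documented place.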

What's the point of unique_ptr?

Isn't a unique_ptr essentially the same as a direct instance of the object? I mean, there are a few differences with dynamic inheritance, and performance, but is that all unique_ptr does?
Consider this code to see what I mean. Isn't this:
#include <iostream>
#include <memory>
using namespace std;
void print(int a) {
    cout << a << "\n";
}
int main()
{
    unique_ptr<int> a(new int);
    print(*a);
    return 0;
}
Almost exactly the same as this:
#include <iostream>
#include <memory>
using namespace std;
void print(int a) {
    cout << a << "\n";
}
int main()
{
    int a;
    print(a);
    return 0;
}
Or am I misunderstanding what unique_ptr should be used for?
In addition to the cases mentioned by Chris Pitman, one more case where you will want to use std::unique_ptr is when you instantiate sufficiently large objects: it then makes sense to allocate them on the heap rather than on the stack. The stack size is not unlimited, and sooner or later you might run into a stack overflow. That is where std::unique_ptr would be useful.
The purpose of std::unique_ptr is to provide automatic and exception-safe deallocation of dynamically allocated memory (unlike a raw pointer that must be explicitly deleted in order to be freed and that is easy to inadvertently not get freed in the case of interleaved exceptions).
Your question, though, is more about the value of pointers in general than about std::unique_ptr specifically. For simple builtin types like int, there generally is very little reason to use a pointer rather than simply passing or storing the object by value. However, there are three cases where pointers are necessary or useful:
Representing a separate "not set" or "invalid" value.
Allowing modification.
Allowing for different polymorphic runtime types.
Invalid or not set
A pointer supports an additional nullptr value indicating that the pointer has not been set. For example, if you want to support all values of a given type (e.g. the entire range of integers) but also represent the notion that the user never input a value in the interface, that would be a case for using a std::unique_ptr<int>: you can check whether the pointer is null to tell whether a value was set, without having to sacrifice a valid integer value as a "sentinel" meaning "not set".
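A minimal sketch of that idea, assuming a hypothetical AgeField type and describe function (neither comes from the answer). A null pointer stands for "never entered", so 0 remains an ordinary value:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical form field: a null pointer means the user never entered a value.
struct AgeField {
    std::unique_ptr<int> value;  // nullptr until set() is called

    void set(int v) { value = std::make_unique<int>(v); }
    bool isSet() const { return value != nullptr; }
};

// Reports the field without reserving any integer as a sentinel.
std::string describe(const AgeField& f) {
    return f.isSet() ? "age = " + std::to_string(*f.value) : "age not set";
}
```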
Allowing modification
This can also be accomplished with references rather than pointers, but pointers are one way of doing this. If you use a regular value, then you are dealing with a copy of the original, and any modifications only affect that copy. If you use a pointer or a reference, you can make your modifications visible to the owner of the original instance. With a unique pointer, you can additionally be assured that no one else has a copy, so it is safe to modify without locking.
Polymorphic types
This can likewise be done with references, not just with pointers, but there are cases where due to semantics of ownership or allocation, you would want to use a pointer to do this... When it comes to user-defined types, it is possible to create a hierarchical "inheritance" relationship. If you want your code to operate on all variations of a given type, then you would need to use a pointer or reference to the base type. A common reason to use std::unique_ptr<> for something like this would be if the object is constructed through a factory where the class you are defining maintains ownership of the constructed object. For example:
class Airline {
public:
    Airline(const AirplaneFactory& factory);
    // ...
private:
    // ...
    void AddAirplaneToInventory();
    // Can create many different type of airplanes, such as
    // a Boeing747 or an Airbus320
    const AirplaneFactory& airplane_factory_;
    std::vector<std::unique_ptr<Airplane>> airplanes_;
};
// ...
void Airline::AddAirplaneToInventory() {
    airplanes_.push_back(airplane_factory_.Create());
}
As you mentioned, virtual classes are one use case. Beyond that, here are two others:
Optional instances of objects. My class may delay instantiating an instance of the object. To do so, I need to use memory allocation but still want the benefits of RAII.
Integrating with C libraries or other libraries that love returning naked pointers. For example, OpenSSL returns pointers from many (poorly documented) methods, some of which you need to clean up. Having a non-copyable pointer container is perfect for this case, since I can protect it as soon as it is returned.
A unique_ptr functions the same as a normal pointer except that you do not have to remember to free it (in fact it is simply a wrapper around a pointer). After you allocate the memory, you do not have to afterwards call delete on the pointer since the destructor on unique_ptr takes care of this for you.
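The automatic cleanup can be observed with a small sketch. The Tracked type and the helper functions are hypothetical names used only for illustration; the point is that the destructor runs even when the scope is left through an exception:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Counts destructor calls so that the cleanup can be observed.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates, then throws before any manual delete could run.
// The unique_ptr destructor deletes the object during stack unwinding.
void allocateThenThrow() {
    std::unique_ptr<Tracked> p(new Tracked);
    throw std::runtime_error("simulated failure");
}

// Returns how many Tracked objects were destroyed by the failed call.
int destroyedAfterFailure() {
    Tracked::destroyed = 0;
    try {
        allocateThenThrow();
    } catch (const std::runtime_error&) {
        // The object has already been freed by the time we get here.
    }
    return Tracked::destroyed;
}
```

With a raw pointer in allocateThenThrow, the same code would leak the object.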
Two things come to my mind:
You can use it as a generic exception-safe RAII wrapper. Any resource that has a "close" function can be wrapped with unique_ptr easily by using a custom deleter.
There are also times you might have to move a pointer around without knowing its lifetime explicitly. If the only constraint you know is uniqueness, then unique_ptr is an easy solution. You could almost always do manual memory management also in that case, but it is not automatically exception safe and you could forget to delete. Or the position you have to delete in your code could change. The unique_ptr solution could easily be more maintainable.
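The custom-deleter case can be sketched as follows. The Conn type and the conn_open/conn_close functions are a made-up stand-in for a C-style library that hands out naked pointers; they are assumptions for illustration, not a real API:

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API: returns a naked pointer that must be closed.
struct Conn { int id; };
static int g_openConns = 0;
Conn* conn_open(int id) { ++g_openConns; return new Conn{id}; }
void conn_close(Conn* c) { --g_openConns; delete c; }

// unique_ptr that calls conn_close instead of delete when it goes out of scope.
using ConnPtr = std::unique_ptr<Conn, decltype(&conn_close)>;

// Wraps the naked pointer as soon as it is returned.
ConnPtr makeConn(int id) { return ConnPtr(conn_open(id), &conn_close); }

// Opens a connection in an inner scope and reports how many remain open after it.
int openConnsAfterScope() {
    {
        ConnPtr c = makeConn(7);
        assert(g_openConns == 1);  // the resource is held inside the scope
    }                              // conn_close runs here
    return g_openConns;
}
```

Because ConnPtr is movable but not copyable, ownership can be handed to another function without risking a double close.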

Assigning block pointers: differences between Objective-C vs C++ classes

I’ve found that assigning blocks behaves differently depending on whether their parameter and return types are Objective-C classes or C++ classes.
Imagine I have this simple Objective-C class hierarchy:
@interface Fruit : NSObject
@end
@interface Apple : Fruit
@end
Then I can write stuff like this:
Fruit *(^getFruit)();
Apple *(^getApple)();
getFruit = getApple;
This means that, with respect to Objective-C classes, blocks are covariant in their return type: a block which returns something more specific can be seen as a “subclass” of a block returning something more general. Here, the getApple block, which delivers an apple, can be safely assigned to the getFruit block. Indeed, if used later, it's always safe to receive an Apple * when you're expecting a Fruit *. And, logically, the converse does not work: getApple = getFruit; doesn't compile, because when we really want an apple, we're not happy getting just a fruit.
Similarly, I can write this:
void (^eatFruit)(Fruit *);
void (^eatApple)(Apple *);
eatApple = eatFruit;
This shows that blocks are contravariant in their argument types: a block that can process an argument that is more general can be used where a block that processes an argument that is more specific is needed. If a block knows how to eat a fruit, it will know how to eat an apple as well. Again, the converse is not true, and this will not compile: eatFruit = eatApple;.
This is all good and well — in Objective-C. Now let's try that in C++ or Objective-C++, supposing we have these similar C++ classes:
class FruitCpp {};
class AppleCpp : public FruitCpp {};
class OrangeCpp : public FruitCpp {};
Sadly, these block assignments don't compile any more:
FruitCpp *(^getFruitCpp)();
AppleCpp *(^getAppleCpp)();
getFruitCpp = getAppleCpp; // error!
void (^eatFruitCpp)(FruitCpp *);
void (^eatAppleCpp)(AppleCpp *);
eatAppleCpp = eatFruitCpp; // error!
Clang complains with an “assigning from incompatible type” error. So, with respect to C++ classes, blocks appear to be invariant in the return type and parameter types.
Why is that? Doesn't the same argument I made with Objective-C classes also hold for C++ classes? What am I missing?
This distinction is intentional, due to the differences between the Objective-C and C++ object models. In particular, given a pointer to an Objective-C object, one can convert/cast that pointer to point at a base class or a derived class without actually changing the value of the pointer: the address of the object is the same regardless.
Because C++ allows multiple and virtual inheritance, this is not the case for C++ objects: if I have a pointer to a C++ class and I cast/convert that pointer to point at a base class or a derived class, I may have to adjust the value of the pointer. For example, consider:
class A { int x; };
class B { int y; };
class C : public A, public B { };
B *getC() {
    C *c = new C;
    return c;
}
Let's say that the new C object in getC() gets allocated at address 0x10. The value of the pointer 'c' is 0x10. In the return statement, that pointer to C needs to be adjusted to point at the B subobject within C. Because B comes after A in C's inheritance list, it will (generally) be laid out in memory after A, so this means adding an offset of 4 bytes (== sizeof(A)) to the pointer, so the returned pointer will be 0x14. Similarly, casting a B* to a C* would subtract 4 bytes from the pointer, to account for B's offset within C. When dealing with virtual base classes, the idea is the same but the offsets are no longer known, compile-time constants: they're accessed through the vtable during execution.
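The adjustment can be measured directly. This is a sketch with hypothetical helper names; the exact offset depends on the platform ABI, so the code only computes it and makes no claim about a specific number:

```cpp
#include <cassert>
#include <cstddef>

struct A { int x; };
struct B { int y; };
struct C : A, B {};

// Byte distance between the start of a C object and its B subobject.
std::ptrdiff_t offsetOfBInC() {
    C c{};
    C* pc = &c;
    B* pb = pc;  // implicit derived-to-base conversion adjusts the address
    return reinterpret_cast<const char*>(pb) - reinterpret_cast<const char*>(pc);
}

// Converting the B* back to a C* undoes the same adjustment.
bool roundTrips() {
    C c{};
    B* pb = &c;
    return static_cast<C*>(pb) == &c;
}
```

On common ABIs offsetOfBInC() is expected to equal sizeof(A), matching the 4 bytes in the explanation above, but that layout is not mandated by the standard.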
Now, consider the effect this has on an assignment like:
C *(^getC)();
B *(^getB)();
getB = getC;
The getC block returns a pointer to a C. To turn it into a block that returns a pointer to a B, we would need to adjust the pointer returned from each invocation of the block by adding 4 bytes. This isn't an adjustment to the block; it's an adjustment to the pointer value returned by the block. One could implement this by synthesizing a new block that wraps the previous block and performs the adjustment, e.g.,
getB = ^B *() { return getC(); };
This is implementable in the compiler, which already introduces similar "thunks" when overriding a virtual function with one that has a covariant return type needing adjustment. However, with blocks it causes an additional problem: blocks allow equality comparison with ==, so to evaluate whether "getB == getC", we would have to be able to look through the thunk that would be generated by the assignment "getB = getC" to compare the underlying block pointers. Again, this is implementable, but would require a much more heavyweight blocks runtime that is able to create (uniqued) thunks able to perform these adjustments to the return value (as well as to any contravariant parameters). While all of this is technically possible, the cost (in runtime size, complexity, and execution time) outweighs the benefits.
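The same situation exists for plain C++ function pointers, which makes the thunk easy to sketch without blocks. The names makeC, makeCAsB and thunkAdjustsPointer are hypothetical; this is an analogy in C++, not what the blocks runtime does:

```cpp
#include <cassert>

struct A { int x = 1; };
struct B { int y = 2; };
struct C : A, B {};

// Function returning the most derived type.
C* makeC() {
    static C instance;
    return &instance;
}

// Hand-written thunk: calls makeC and lets the return statement apply
// the derived-to-base pointer adjustment.
B* makeCAsB() { return makeC(); }

// Checks that the thunk yields the adjusted pointer to the B subobject.
bool thunkAdjustsPointer() {
    C* (*getC)() = makeC;
    B* (*getB)() = makeCAsB;  // "getB = getC" would not compile
    return getB() == static_cast<B*>(getC()) && getB()->y == 2;
}
```

Note that getB and getC now point at different functions, which is the identity problem described above: comparing them no longer says anything about whether they wrap the same underlying callable.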
Getting back to Objective-C, the single-inheritance object model never needs any adjustments to the object pointer: there's only a single address to point at a given Objective-C object, regardless of the static type of the pointer, so covariance/contravariance never requires any thunks, and the block assignment is a simple pointer assignment (+ _Block_copy/_Block_release under ARC).
The feature was probably overlooked. There are commits that show Clang people caring about making covariance and contravariance work in Objective-C++ for Objective-C types, but I couldn't find anything for C++ itself. The language specification for blocks doesn't mention covariance or contravariance for either C++ or Objective-C.
