Frama-C generates confusing assertions about pointer comparison - frama-c

I'm getting assertions that don't make sense to me.
Code:
struct s {
int pred;
};
/*# assigns \result \from \nothing;
ensures \valid(\result);
*/
struct s *get(void);
int main(void)
{
return !get()->pred;
}
Frama-C output:
$ frama-c -val frama-ptr.c
[...]
frama-ptr.c:12:[kernel] warning: pointer comparison.
assert \pointer_comparable((void *)0, (void *)tmp->pred);
(tmp from get())
Am I doing something wrong or is it a bug in Frama-C?

This is not really a bug, although this behavior is not really satisfactory either. I recommend that you observe the analysis in the GUI (frama-c-gui -val frama-ptr.c), in particular the value of tmp->pred at line 12 in the Values tab.
Value before:
∈
{{ garbled mix of &{alloced_return_get}
(origin: Library function {c/ptrcmp.c:12}) }}
This is a very imprecise value, that has been generated because the return value of function get is a pointer. The Eva plugin generates a dummy variable (alloced_return_get), and fills it with dummy contents (garbled mix of &{alloced_return_get}). Even though the field pred is an integer, the analyzer assumes that (imprecise) pointers can also be present. This is why an alarm for pointer comparison is emitted.
There are at least two ways to avoid this:
use option -val-warn-undefined-pointer-comparison pointer, so that pointer comparison alarms are emitted only on comparisons involving lvalues with pointer type. The contents of the field pred will remain imprecise, though.
add a proper body to function get, possibly written using malloc, so that the field pred receives a more accurate value.
On a more general note, the Eva/Value plugin requires precise specifications for functions without a body; see section 7.2 of the manual. For functions that return pointers, you won't be able to write satisfactory specifications: you need to write allocates clauses, and those are currently not handled by Eva/Value. Thus, I suggest that you write a body for get.

Related

Frama-c : Trouble understanding WP memory models

I'm looking for WP options/model that could allow me to prove basic C memory manipulations like :
memcpy : I've tried to prove this simple code :
struct header_src{
char t1;
char t2;
char t3;
char t4;
};
struct header_dest{
short t1;
short t2;
};
/*# requires 0<=n<=UINT_MAX;
# requires \valid(dest);
# requires \valid_read(src);
# assigns (dest)[0..n-1] \from (src)[0..n-1];
# assigns \result \from dest;
# ensures dest[0..n] == src[0..n];
# ensures \result == dest;
*/
void* Frama_C_memcpy(char *dest, const char *src, uint32_t n);
int main(void)
{
struct header_src p_header_src;
struct header_dest p_header_dest;
p_header_src.t1 = 'e';
p_header_src.t2 = 'b';
p_header_src.t3 = 'c';
p_header_src.t4 = 'd';
p_header_dest.t1 = 0x0000;
p_header_dest.t2 = 0x0000;
//# assert \valid(&p_header_dest);
Frama_C_memcpy((char*)&p_header_dest, (char*)&p_header_src, sizeof(struct header_src));
//# assert p_header_dest.t1 == 0x6265;
//# assert p_header_dest.t2 == 0x6463;
}
but the two last assert weren't verified by WP (with default prover Alt-Ergo). It can be proved thanks to Value analysis, but I mostly want to be able to prove the code not using abstract interpretation.
Cast pointer to int : Since I'm programming embedded code, I want to be able to specify something like:
#define MEMORY_ADDR 0x08000000
#define SOME_SIZE 10
struct some_struct {
uint8_t field1[SOME_SIZE];
uint32_t field2[SOME_SIZE];
}
// [...]
// some function body {
struct some_struct *p = (some_struct*)MEMORY_ADDR;
if(p == NULL) {
// Handle error
} else {
// Do something
}
// } body end
I've looked a little bit at WP's documentation and it seems that the version of frama-c that I use (Magnesium-20151002) has several memory model (Hoare, Typed , +cast, +ref, ...) but none of the given example were proved with any of the model above. It is explicitly said in the documentation that Typed model does not handle pointer-to-int casts. I've a lot of trouble to understand what's really going on under the hood with each wp-model. It would really help me if I was able to verify at least post-conditions of the memcpy function. Plus, I have seen this issue about void pointer that apparently are not very well handled by WP at least in the Magnesium version. I didn't tried another version of frama-c yet, but I think that newer version handle void pointer in a better way.
Thank you very much in advance for your suggestions !
memcpy
Reasoning about the result of memcpy (or Frama_C_memcpy) is out of range of the current WP plugin. The only memory model that would work in your case is Bytes (page 13 of the manual for Chlorine), but it is not implemented.
Independently, please note that your postcondition from Frama_C_memcpy is not what you want here. You are asserting the equality of the sets dest[0..n] and src[0..n]. First, you should stop at n-1. Second, and more importantly, this is far too weak, and is in fact not sufficient to prove the two assertions in the caller. What you want is a quantification on all bytes. See e.g. the predicate memcmp in Frama-C's stdlib, or the variant \forall int i; 0 <= i < n -> dest[i] == src[i];
By the way, this postcondition holds only if dest and src are properly separated, which your function does not require. Otherwise, you should write dest[i] == \at (src[i], Pre). And your requires are also too weak for another reason, as you only require the first character to be valid, not the n first ones.
Cast pointer to int
Basically, all current models implemented in WP are unable to reason on codes in which the memory is accessed with multiple incompatible types (through the use of unions or pointer casts). In some cases, such as Typed, the cast is detected syntactically, and a warning is issued to warn that the code cannot be analyzed. The Typed+Cast model is a variant of Typed in which some casts are accepted. The analysis is correct only if the pointer is re-cast into its original type before being used. The idea is to allow using some libc functions that require a cast to void*, but not much more.
Your example is again a bit different, because it is likely that MEMORY_ADDR is always addressed with type some_stuct. Would it be acceptable to change the code slightly, and change your function as taking a pointer to this type? This way, you would "hide" the cast to MEMORY_ADDR inside a function that would remain unproven.
I tried this example in the latest version of Frama-C (of course the format is modified a little bit).
for the memcpy case
Assertion 2 fails but assertion 3 is successfully proved (basically because the failure of assertion 2 leads to a False assumption, which proves everything).
So in fact both assertion cannot be proved, same as your problem.
This conclusion is sound because the memory models used in the wp plugin (as far as I know) has no assumption on the relation between fields in a struct, i.e. in header_src the first two fields are 8 bit chars, but they may not be nestedly organized in the physical memory like char[2]. Instead, there may be paddings between them (refer to wiki for detailed description). So when you try to copy bits in such a struct to another struct, Frama-C becomes completely confused and has no idea what you are doing.
As far as I am concerned, Frama-C does not support any approach to precisely control the memory layout, e.g. gcc's PACKED which forces the compiler to remove paddings.
I am facing the same problem, and the (not elegant at all) solution is, use arrays instead. Arrays are always nested, so if you try to copy a char[4] to a short[2], I think the assertion can be proved.
for the Cast pointer to int case
With memory model Typed+cast, the current version I am using (Chlorine-20180501) supports casting between pointers and uint64_t. You may want to try this version.
Moreover, it is strongly suggested to call Z3 and CVC4 through why3, whose performance is certainly better than Alt-Ergo.

Is it undefined in Rust to temporarily swap uninitialized values into a vector?

Suppose I have a vector, v: Vec<T> with length 5 and capacity 10. Does the following invoke undefined behavior?
let p = v.as_mut_ptr();
unsafe {
std::mem::swap(p, p.offset(5));
std::mem::swap(p.offset(5), p);
}
Yes, this is undefined. From Section 6.1.3.2.3 of the Rust Reference:
The following is a list of behavior which is forbidden in all Rust code, including within unsafe blocks and unsafe functions. Type checking provides the guarantee that these issues are never caused by safe code.
...
Reads of undef (uninitialized) memory
...
p.offset(5) is undefined memory and you have to read it to be able to swap it.
Of course, I don't really see the point to your question, as even if it were defined, the operation would be a no-op. I suspect that this is an artifact of the XY Problem, and that you have an actual problem you are trying to solve.

Lua Alien - Pointer Arithmetic and Dereferencing

My goal is to call Windows' GetModuleInformation function to get a MODULEINFO struct back. This is all working fine. The problem comes as a result of me wanting to do pointer arithmetic and dereferences on the LPVOID lpBaseOfDll which is part of the MODULEINFO.
Here is my code to call the function in Lua:
require "luarocks.require"
require "alien"
sizeofMODULEINFO = 12 --Gotten from sizeof(MODULEINFO) from Visual Studio
MODULEINFO = alien.defstruct{
{"lpBaseOfDll", "pointer"}; --Does this need to be a buffer? If so, how?
{"SizeOfImage", "ulong"};
{"EntryPoint", "pointer"};
}
local GetModuleInformation = alien.Kernel32.K32GetModuleInformation
GetModuleInformation:types{ret = "int", abi = "stdcall", "long", "pointer", "pointer", "ulong"}
local GetModuleHandle = alien.Kernel32.GetModuleHandleA
GetModuleHandle:types{ret = "pointer", abi = "stdcall", "pointer"}
local GetCurrentProcess = alien.Kernel32.GetCurrentProcess
GetCurrentProcess:types{ret = "long", abi = "stdcall"}
local mod = MODULEINFO:new() --Create struct (needs buffer?)
local currentProcess = GetCurrentProcess()
local moduleHandle = GetModuleHandle("myModule.dll")
local success = GetModuleInformation(currentProcess, moduleHandle, mod(), sizeofMODULEINFO)
if success == 0 then --If there is an error, exit
return 0
end
local dataPtr = mod.lpBaseOfDll
--Now how do I do pointer arithmetic and/or dereference "dataPtr"?
At this point, mod.SizeOfImage seems to be giving me the correct values that I am expecting, so I know the functions are being called and the struct is being populated. However, I cannot seem to do pointer arithmetic on mod.lpBaseOfDll because it is a UserData.
The only information in the Alien Documentation that may address what I'm trying to do are these:
Pointer Unpacking
Alien also provides three convenience functions that let you
dereference a pointer and convert the value to a Lua type:
alien.tostring takes a userdata (usually returned from a function that has a pointer return value), casts it to char*, and returns a Lua
string. You can supply an optional size argument (if you don’t Alien
calls strlen on the buffer first).
alien.toint takes a userdata, casts it to int*, dereferences it and returns it as a number. If you pass it a number it assumes the
userdata is an array with this number of elements.
alien.toshort, alien.tolong, alien.tofloat, and alien.todouble are like alien.toint, but works with with the respective typecasts.
Unsigned versions are also available.
My issue with those, is I would need to go byte-by-byte, and there is no alien.tochar function. Also, and more importantly, this still doesn't solve the problem of me being able to get elements outside of the base address.
Buffers
After making a buffer you can pass it in place of any argument of
string or pointer type.
...
You can also pass a buffer or other userdata to the new method of your
struct type, and in this case this will be the backing store of the
struct instance you are creating. This is useful for unpacking a
foreign struct that a C function returned.
These seem to suggest I can use an alien.buffer as the argument of MODULEINFO's LPVOID lpBaseOfDll. And buffers are described as byte arrays, which can be indexed using this notation: buf[1], buf[2], etc. Additionally, buffers go by bytes, so this would ideally solve all problems. (If I am understanding this correctly).
Unfortunately, I can not find any examples of this anywhere (not in the docs, stackoverflow, Google, etc), so I am have no idea how to do this. I've tried a few variations of syntax, but nearly every one gives a runtime error (others simply does not work as expected).
Any insight on how I might be able to go byte-by-byte (C char-by-char) across the mod.lpBaseOfDll through dereferences and pointer arithmetic?
I need to go byte-by-byte, and there is no alien.tochar function.
Sounds like alien.tostring has you covered:
alien.tostring takes a userdata (usually returned from a function that has a pointer return value), casts it to char*, and returns a Lua string. You can supply an optional size argument (if you don’t Alien calls strlen on the buffer first).
Lua strings can contain arbitrary byte values, including 0 (i.e. they aren't null-terminated like C strings), so as long as you pass a size argument to alien.tostring you can get back data as a byte buffer, aka Lua string, and do whatever you please with the bytes.
It sounds like you can't tell it to start at an arbitrary offset from the given pointer address. The easiest way to tell for sure, if the documentation doesn't tell you, is to look at the source. It would probably be trivial to add an offset parameter.

LLVM converting a Constant to a Value

I am using custom LLVM pass where if I encounter a store to
where the compiler converts the value to a Constant; e.g. there is an explicit store:
X[gidx] = 10;
Then LLVM will generate this error:
aoc: ../../../Instructions.cpp:1056: void llvm::StoreInst::AssertOK(): Assertion `getOperand(0)->getType() == cast<PointerType>(getOperand(1)->getType())->getElementType() && "Ptr must be a pointer to Val type!"' failed.
The inheritance order goes as: Value<-User<-Constant, so this shouldn't be an issue, but it is. Using an a cast on the ConstantInt or ConstantFP has no effect on this error.
So I've tried this bloated solution:
Value *new_value;
if(isa<ConstantInt>(old_value) || isa<ConstantFP>(old_value)){
Instruction *allocInst = builder.CreateAlloca(old_value->getType());
builder.CreateStore(old_value, allocInst);
new_value = builder.CreateLoad(allocResultInst);
}
However this solution creates its own register errors when different type are involved, so I'd like to avoid it.
Does anyone know how to convert a Constant to a Value? It must be a simple issue that I'm not seeing. I'm developing on Ubuntu 12.04, LLVM 3, AMD gpu, OpenCL kernels.
Thanks ahead of time.
EDIT:
The original code that produces the first error listed is simply:
builder.CreateStore(old_value, store_addr);
EDIT2:
This old_value is declared as
Value *old_value = current_instruction->getOperand(0);
So I'm grabbing the value to be stored, in this case "10" from the first code line.
You didn't provide the code that caused this first assertion, but its wording is pretty clear: you are trying to create a store where the value operand and the pointer operand do not agree on their types. It would be useful for the question if you'd provide the code that generated that error.
Your second, so-called "bloated" solution, is the correct way to store old_value into the stack and then load it again. You write:
However this solution creates its own register errors when different type are involved
These "register errors" are the real issue you should be addressing.
In any case, the whole premise of "converting a constant to a value" is flawed - as you have correctly observed, all constants are values. There's no point storing a value into the stack with the sole purpose of loading it again, and indeed the standard LLVM pass "mem2reg" will completely remove such a sequence, replacing all uses of the load with the original value.

The use of IN OUT in Ada

Given below is some code in ada
with TYPE_VECT_B; use TYPE_VECT_B;
Package TEST01 is
procedure TEST01
( In_State : IN VECT_B ;
Out_State : IN OUT VECT_B );
function TEST02
( In_State : IN VECT_B ) return Boolean ;
end TEST01;
The TYPE_VECT_B package specification and body is also defined below
Package TYPE_VECT_B is
type VECT_B is array (INTEGER range <>) OF BOOLEAN ;
rounded_data : float ;
count : integer ;
trace : integer ;
end TYPE_VECT_B;
Package BODY TYPE_VECT_B is
begin
null;
end TYPE_VECT_B;
What does the variable In_State and Out_State actually mean? I think In_State means input variable. I just get confused to what actually Out_State means?
An in parameter can be read but not written by the subprogram. in is the default. Prior to Ada 2012, functions were only allowed to have in parameters. The actual parameter is an expression.
An out parameter implies that the previous value is of no interest. The subprogram is expected to write to the parameter. After writing to the parameter, the subprogram can read back what it has written. On exit the actual parameter receives the value written to it (there are complications in this area!). The actual parameter must be a variable.
An in out parameter is like an out parameter except that the previous value is of interest and can be read by the subprogram before assignment. For example,
procedure Add (V : Integer; To : in out Integer; Limited_To : Integer)
is
begin
-- Check that the result wont be too large. This involves reading
-- the initial value of the 'in out' parameter To, which would be
-- wrong if To was a mere 'out' parameter (it would be
-- uninitialized).
if To + V > Limited_To then
To := Limited_To;
else
To := To + V;
end if;
end Add;
Basically, every parameter to a function or procedure has a direction to it. The options are in, out, in out (both), or access. If you don't see one of those, then it defaults to in.
in means data can go into the subroutine from the caller (via the parameter). You are allowed to read from in parameters inside the routine. out means data can come out of the routine that way, and thus you are allowed to assign values to the parameter inside the routine. In general, how the compiler accomplishes the data passing is up to the compiler, which is in accord with Ada's general philosophy of allowing you to specify what you want done, not how you want it done.
access is a special case, and is roughly like putting a "*" in your parameter definition in Cish languages.
The next question folks usually have is "if I pass something large as an in parameter, is it going to push all that data on the stack or something?" The answer is "no", unless your compiler writers are unconsionably stupid. Every Ada compiler I know of under the hood passes objects larger than fit in a machine register by reference. It is the compiler, not the details of your parameter passing mechanisim, that enforces not writing data back out of the routine. Again, you tell Ada what you want done, it figures out the most efficient way to do it.

Resources