Frama-c : Trouble understanding WP memory models - frama-c

I'm looking for WP options/model that could allow me to prove basic C memory manipulations like :
memcpy : I've tried to prove this simple code :
struct header_src{
char t1;
char t2;
char t3;
char t4;
};
struct header_dest{
short t1;
short t2;
};
/*# requires 0<=n<=UINT_MAX;
# requires \valid(dest);
# requires \valid_read(src);
# assigns (dest)[0..n-1] \from (src)[0..n-1];
# assigns \result \from dest;
# ensures dest[0..n] == src[0..n];
# ensures \result == dest;
*/
void* Frama_C_memcpy(char *dest, const char *src, uint32_t n);
int main(void)
{
struct header_src p_header_src;
struct header_dest p_header_dest;
p_header_src.t1 = 'e';
p_header_src.t2 = 'b';
p_header_src.t3 = 'c';
p_header_src.t4 = 'd';
p_header_dest.t1 = 0x0000;
p_header_dest.t2 = 0x0000;
//# assert \valid(&p_header_dest);
Frama_C_memcpy((char*)&p_header_dest, (char*)&p_header_src, sizeof(struct header_src));
//# assert p_header_dest.t1 == 0x6265;
//# assert p_header_dest.t2 == 0x6463;
}
but the two last assert weren't verified by WP (with default prover Alt-Ergo). It can be proved thanks to Value analysis, but I mostly want to be able to prove the code not using abstract interpretation.
Cast pointer to int : Since I'm programming embedded code, I want to be able to specify something like:
#define MEMORY_ADDR 0x08000000
#define SOME_SIZE 10
struct some_struct {
uint8_t field1[SOME_SIZE];
uint32_t field2[SOME_SIZE];
}
// [...]
// some function body {
struct some_struct *p = (some_struct*)MEMORY_ADDR;
if(p == NULL) {
// Handle error
} else {
// Do something
}
// } body end
I've looked a little bit at WP's documentation and it seems that the version of frama-c that I use (Magnesium-20151002) has several memory model (Hoare, Typed , +cast, +ref, ...) but none of the given example were proved with any of the model above. It is explicitly said in the documentation that Typed model does not handle pointer-to-int casts. I've a lot of trouble to understand what's really going on under the hood with each wp-model. It would really help me if I was able to verify at least post-conditions of the memcpy function. Plus, I have seen this issue about void pointer that apparently are not very well handled by WP at least in the Magnesium version. I didn't tried another version of frama-c yet, but I think that newer version handle void pointer in a better way.
Thank you very much in advance for your suggestions !

memcpy
Reasoning about the result of memcpy (or Frama_C_memcpy) is out of range of the current WP plugin. The only memory model that would work in your case is Bytes (page 13 of the manual for Chlorine), but it is not implemented.
Independently, please note that your postcondition from Frama_C_memcpy is not what you want here. You are asserting the equality of the sets dest[0..n] and src[0..n]. First, you should stop at n-1. Second, and more importantly, this is far too weak, and is in fact not sufficient to prove the two assertions in the caller. What you want is a quantification on all bytes. See e.g. the predicate memcmp in Frama-C's stdlib, or the variant \forall int i; 0 <= i < n -> dest[i] == src[i];
By the way, this postcondition holds only if dest and src are properly separated, which your function does not require. Otherwise, you should write dest[i] == \at (src[i], Pre). And your requires are also too weak for another reason, as you only require the first character to be valid, not the n first ones.
Cast pointer to int
Basically, all current models implemented in WP are unable to reason on codes in which the memory is accessed with multiple incompatible types (through the use of unions or pointer casts). In some cases, such as Typed, the cast is detected syntactically, and a warning is issued to warn that the code cannot be analyzed. The Typed+Cast model is a variant of Typed in which some casts are accepted. The analysis is correct only if the pointer is re-cast into its original type before being used. The idea is to allow using some libc functions that require a cast to void*, but not much more.
Your example is again a bit different, because it is likely that MEMORY_ADDR is always addressed with type some_stuct. Would it be acceptable to change the code slightly, and change your function as taking a pointer to this type? This way, you would "hide" the cast to MEMORY_ADDR inside a function that would remain unproven.

I tried this example in the latest version of Frama-C (of course the format is modified a little bit).
for the memcpy case
Assertion 2 fails but assertion 3 is successfully proved (basically because the failure of assertion 2 leads to a False assumption, which proves everything).
So in fact both assertion cannot be proved, same as your problem.
This conclusion is sound because the memory models used in the wp plugin (as far as I know) has no assumption on the relation between fields in a struct, i.e. in header_src the first two fields are 8 bit chars, but they may not be nestedly organized in the physical memory like char[2]. Instead, there may be paddings between them (refer to wiki for detailed description). So when you try to copy bits in such a struct to another struct, Frama-C becomes completely confused and has no idea what you are doing.
As far as I am concerned, Frama-C does not support any approach to precisely control the memory layout, e.g. gcc's PACKED which forces the compiler to remove paddings.
I am facing the same problem, and the (not elegant at all) solution is, use arrays instead. Arrays are always nested, so if you try to copy a char[4] to a short[2], I think the assertion can be proved.
for the Cast pointer to int case
With memory model Typed+cast, the current version I am using (Chlorine-20180501) supports casting between pointers and uint64_t. You may want to try this version.
Moreover, it is strongly suggested to call Z3 and CVC4 through why3, whose performance is certainly better than Alt-Ergo.

Related

Frama-C generates confusing assertions about pointer comparison

I'm getting assertions that don't make sense to me.
Code:
struct s {
int pred;
};
/*# assigns \result \from \nothing;
ensures \valid(\result);
*/
struct s *get(void);
int main(void)
{
return !get()->pred;
}
Frama-C output:
$ frama-c -val frama-ptr.c
[...]
frama-ptr.c:12:[kernel] warning: pointer comparison.
assert \pointer_comparable((void *)0, (void *)tmp->pred);
(tmp from get())
Am I doing something wrong or is it a bug in Frama-C?
This is not really a bug, although this behavior is not really satisfactory either. I recommend that you observe the analysis in the GUI (frama-c-gui -val frama-ptr.c), in particular the value of tmp->pred at line 12 in the Values tab.
Value before:
∈
{{ garbled mix of &{alloced_return_get}
(origin: Library function {c/ptrcmp.c:12}) }}
This is a very imprecise value, that has been generated because the return value of function get is a pointer. The Eva plugin generates a dummy variable (alloced_return_get), and fills it with dummy contents (garbled mix of &{alloced_return_get}). Even though the field pred is an integer, the analyzer assumes that (imprecise) pointers can also be present. This is why an alarm for pointer comparison is emitted.
There are at least two ways to avoid this:
use option -val-warn-undefined-pointer-comparison pointer, so that pointer comparison alarms are emitted only on comparisons involving lvalues with pointer type. The contents of the field pred will remain imprecise, though.
add a proper body to function get, possibly written using malloc, so that the field pred receives a more accurate value.
On a more general note, the Eva/Value plugin requires precise specifications for functions without a body; see section 7.2 of the manual. For functions that return pointers, you won't be able to write satisfactory specifications: you need to write allocates clauses, and those are currently not handled by Eva/Value. Thus, I suggest that you write a body for get.

Lua Alien - Pointer Arithmetic and Dereferencing

My goal is to call Windows' GetModuleInformation function to get a MODULEINFO struct back. This is all working fine. The problem comes as a result of me wanting to do pointer arithmetic and dereferences on the LPVOID lpBaseOfDll which is part of the MODULEINFO.
Here is my code to call the function in Lua:
require "luarocks.require"
require "alien"
sizeofMODULEINFO = 12 --Gotten from sizeof(MODULEINFO) from Visual Studio
MODULEINFO = alien.defstruct{
{"lpBaseOfDll", "pointer"}; --Does this need to be a buffer? If so, how?
{"SizeOfImage", "ulong"};
{"EntryPoint", "pointer"};
}
local GetModuleInformation = alien.Kernel32.K32GetModuleInformation
GetModuleInformation:types{ret = "int", abi = "stdcall", "long", "pointer", "pointer", "ulong"}
local GetModuleHandle = alien.Kernel32.GetModuleHandleA
GetModuleHandle:types{ret = "pointer", abi = "stdcall", "pointer"}
local GetCurrentProcess = alien.Kernel32.GetCurrentProcess
GetCurrentProcess:types{ret = "long", abi = "stdcall"}
local mod = MODULEINFO:new() --Create struct (needs buffer?)
local currentProcess = GetCurrentProcess()
local moduleHandle = GetModuleHandle("myModule.dll")
local success = GetModuleInformation(currentProcess, moduleHandle, mod(), sizeofMODULEINFO)
if success == 0 then --If there is an error, exit
return 0
end
local dataPtr = mod.lpBaseOfDll
--Now how do I do pointer arithmetic and/or dereference "dataPtr"?
At this point, mod.SizeOfImage seems to be giving me the correct values that I am expecting, so I know the functions are being called and the struct is being populated. However, I cannot seem to do pointer arithmetic on mod.lpBaseOfDll because it is a UserData.
The only information in the Alien Documentation that may address what I'm trying to do are these:
Pointer Unpacking
Alien also provides three convenience functions that let you
dereference a pointer and convert the value to a Lua type:
alien.tostring takes a userdata (usually returned from a function that has a pointer return value), casts it to char*, and returns a Lua
string. You can supply an optional size argument (if you don’t Alien
calls strlen on the buffer first).
alien.toint takes a userdata, casts it to int*, dereferences it and returns it as a number. If you pass it a number it assumes the
userdata is an array with this number of elements.
alien.toshort, alien.tolong, alien.tofloat, and alien.todouble are like alien.toint, but works with with the respective typecasts.
Unsigned versions are also available.
My issue with those, is I would need to go byte-by-byte, and there is no alien.tochar function. Also, and more importantly, this still doesn't solve the problem of me being able to get elements outside of the base address.
Buffers
After making a buffer you can pass it in place of any argument of
string or pointer type.
...
You can also pass a buffer or other userdata to the new method of your
struct type, and in this case this will be the backing store of the
struct instance you are creating. This is useful for unpacking a
foreign struct that a C function returned.
These seem to suggest I can use an alien.buffer as the argument of MODULEINFO's LPVOID lpBaseOfDll. And buffers are described as byte arrays, which can be indexed using this notation: buf[1], buf[2], etc. Additionally, buffers go by bytes, so this would ideally solve all problems. (If I am understanding this correctly).
Unfortunately, I can not find any examples of this anywhere (not in the docs, stackoverflow, Google, etc), so I am have no idea how to do this. I've tried a few variations of syntax, but nearly every one gives a runtime error (others simply does not work as expected).
Any insight on how I might be able to go byte-by-byte (C char-by-char) across the mod.lpBaseOfDll through dereferences and pointer arithmetic?
I need to go byte-by-byte, and there is no alien.tochar function.
Sounds like alien.tostring has you covered:
alien.tostring takes a userdata (usually returned from a function that has a pointer return value), casts it to char*, and returns a Lua string. You can supply an optional size argument (if you don’t Alien calls strlen on the buffer first).
Lua strings can contain arbitrary byte values, including 0 (i.e. they aren't null-terminated like C strings), so as long as you pass a size argument to alien.tostring you can get back data as a byte buffer, aka Lua string, and do whatever you please with the bytes.
It sounds like you can't tell it to start at an arbitrary offset from the given pointer address. The easiest way to tell for sure, if the documentation doesn't tell you, is to look at the source. It would probably be trivial to add an offset parameter.

I don't understand the result of this little program

i've made this little program to test a little part of a bigger program.
int main()
{
char c[]="ddddddddddddd";
char *g= malloc(4*sizeof(char));
*g=NULL;
strcpy (g,c);
printf("Hello world %s!\n",g);
return 0;
}
I expected that the function would return "Hello World dddd" ,since the length of g is 4*sizeof(char), but it returns " Hello World ddddddddddddd ".Can you explain me Where I'm wrong ?
Don't do that, it's undefined behaviour.
The strcpy function will happily copy all those characters in c regardless of the size of g.
That's because it copies characters up to the first \0 in c. In this particular case it may corrupt your heap, or it may not, depending on the minimum size of things that get allocated in the heap (many have a "resolution" of sixteen bytes for example).
There are other functions you can use (though they're optional) if you want your code to be more robust, such as strncpy (provided you understand the limitations), or strcpy_s(), as detailed in Appendix K of the ISO C11 standard (and earlier iterations as well).
Or, if you can't use those for some reason, it's up to the developer to ensure they don't break the rules.

LLVM : recognize reference

How can i recognize in llvm taking the address of variable. For example:
int g;
int *v;
int *test() {
v = &g
func(&g)
return &g
}
In LLVM is getting address:
store i32* #g, i32** #v, align 4
call i32 #func(i32* #g)
ret i32* #g
I want to recognize if the address is taken or the value of variable is taken. Thanks for the answers.
There isn't going to be a way to tell this in general. Taking the address of a certain things might be optimized away in certain instances. The compiler might choose to pass some things by pointer even though it wasn't written that way.
You might have better luck using libclang an operating on the c++ syntax tree, I think this is possible.

Global variable touched by a passed-in parameter becomes unusable

folks!
I pass a struct full of data to my kernel, and I run into the following difficulty using it (very stripped down):
[edit: mac osx / xcode 3.2 on mac book pro; this compile is obviously for cpu]
typedef struct
{
float xoom;
int sizex;
} varholder;
float zX, xd;
__kernel void Harlan( __global varholder * vh )
{
int X = get_global_id(0), Y = get_global_id(1);
zX = ( ( X - vh->sizex/2 ) / vh->xoom + vh->sizex/2 ); // (a)
xd = zX; // (b) BOOM!!
}
after executing line (a), the line marked (b), a simple assignment, gives "LLVM compiler failed to compile a function".
if, however, we do not execute line (a), then line (b) is fine.
So, through my fiddling around a LOT with this, it seems as if it is the assignment statement (a), which uses a passed-in parameter, that messes up the future access of the variable zX. However, of course I need to be able to use the results of calculations further down the line.
I have zX and xd declared at the file level because my helper functions need them.
Any thoughts?
Thanks!
David
p.s. I'm now registered so will be able to upvote and accept answers, which I am sadly unable to do for the last person who helped me (used same username to register, but can't seem to vote on the old post; sorry!).
No, say it ain't so!
I am sincerely hoping that this is not a "correct" answer to my own question. I found on another forum (though not the same question asked!) the following, and I am afraid that it refers to what I'm trying to do:
(quote)
You're doing something the standard prohibits. Section 6.5 says:
'All program scope variables must be declared in the __constant address space.'
In other words, program scope variables cannot be mutable.
(end quote)
... well, tcha!!!! What an astoundingly inconvenient restriction! I'm sure there's reasoning behind it.
[edit: Not At All inconvenient! it was in fact astonishingly easy to work around, given a fresh start the next morning. (And no alcohol.)]
You guys & dolls all knew this, right, and didn't have the heart to tell me?...

Resources