I have a problem proving an assigns clause after calling another function whose own assigns clause ranges up to strlen of its argument. Below is a simple example with a call to a function from the standard library. The function contract is basically a copy of the strcpy requirements.
#include <string.h>
/*@
  requires valid_string_src: valid_read_string(src);
  requires room_string: \valid(dest + (0 .. strlen(src)));
  requires separation:
    \separated(dest + (0 .. strlen(src)), src + (0 .. strlen(src)));
  assigns dest[0 .. strlen(src)];
*/
void mycpy(char *dest, const char *src) {
  strcpy(dest, src);
}
Frama-C fails to prove the assigns clause for mycpy even though it matches the assigns clause of strcpy:
Goal Assigns ... (exit):
Let a_0 = « dest#L1 + 0 ».
Let x_0 = L_strlen(µ:Mchar#L1, src#L1).
Let x_1 = 1 + x_0.
Let x_2 = L_strlen(Mchar_0, src#L1).
Assume {
Have: 0 <= x_2.
Type: is_sint8_chunk(µ:Mchar#L1).
(* Heap *)
Type: (region(dest#L1.base) <= 0) /\ (region(src#L1.base) <= 0) /\
linked(µ:Malloc#L1) /\ sconst(µ:Mchar#L1).
(* Goal *)
When: !invalid(µ:Malloc#L1, a_0, 1 + x_2).
Stmt { L1: }
(* Pre-condition *)
Have: P_valid_read_string(µ:Malloc#L1, µ:Mchar#L1, src#L1) /\
valid_rw(µ:Malloc#L1, a_0, x_1) /\
separated(a_0, x_1, « src#L1 + 0 », x_1).
}
Prove: x_2 <= x_0.
--------------------------------------------------------------------------------
Prover Alt-Ergo 2.3.3: Timeout (Qed:6ms) (10s) (cached).
The full context of the goal shows that it tries to prove: L_strlen(Mchar_0, src#L1) <= L_strlen(µ:Mchar#L1, src#L1). However, there is no information about Mchar_0.
What are µ:Mchar#L1 and Mchar_0? How do I prove this assigns clause?
Frama-C version: 22.0 (Titanium).
I want to count the total number of digits of all the numbers in a given range. Assume the numbers are in base 10 and each subsequent number is 1 more than the previous.
A naïve solution would be:
fn range_digits(start: usize, end: usize) -> usize {
    (start..=end).fold(0, |a, b| a + b.to_string().len())
}
Which gives the output 88915 for the inputs 5 for start and 20005 for end.
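For reference, that total breaks down by digit length: 5 one-digit numbers (5 to 9), 90 two-digit, 900 three-digit, 9000 four-digit and 10006 five-digit numbers (10000 to 20005), i.e. 5*1 + 90*2 + 900*3 + 9000*4 + 10006*5 = 5 + 180 + 2700 + 36000 + 50030 = 88915.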
The best solution I could come up with was:
use std::convert::TryInto;

fn digits(a: usize) -> usize {
    ((a as f64).log10() as usize) + 1
}
// Present conversions and type casts could be problematic for certain inputs.
fn range_digits(start: usize, end: usize) -> usize {
    let (start_digits, end_digits) = (digits(start), digits(end));
    if start_digits == end_digits {
        (end - start + 1) * start_digits
    } else {
        let (a, b) = (
            10_usize.pow(start_digits.try_into().unwrap()) - 1,
            10_usize.pow((end_digits - 1).try_into().unwrap()) as usize,
        );
        (digits(a + 1)..=digits(b - 1)).fold(0, |acc, elem| {
            acc + 9 * elem * 10_usize.pow((elem - 1).try_into().unwrap())
        }) + ((a - start + 1) * start_digits)
            + ((end - b + 1) * end_digits)
    }
}
But I'm wondering if there's a yet more computationally efficient/optimal solution/formula.
The fastest approach is probably to do this entirely with integer arithmetic; switching between floats and integers is expensive. Here's a simple implementation. I didn't perform any benchmarks on it.
use std::ops::Range;

fn digits_in_range(base: usize, range: Range<usize>) -> usize {
    let mut result;
    let mut power = 1;
    let mut current_digits = 0;
    // Advance to the smallest power of `base` that exceeds `range.start`.
    while power <= range.start {
        power *= base;
        current_digits += 1;
    }
    // Digits contributed by the numbers from `range.start` up to that power.
    result = (power - range.start) * current_digits;
    // Add one full block per power of `base` until the end of the range is passed.
    while power <= range.end {
        let last_power = power;
        power *= base;
        current_digits += 1;
        result += (power - last_power) * current_digits;
    }
    // Remove the over-counted digits beyond the (exclusive) end of the range.
    result -= (power - range.end) * current_digits;
    result
}
This takes the number system base as the first argument, and a Range as the second argument. Note that a Range excludes its endpoint, so it's not included in the count. You can change this to RangeInclusive with a small correction to the code if you prefer.
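As a quick usage example of the Range convention (traced by hand rather than benchmarked): digits_in_range(10, 5..20006) counts the digits of 5 through 20005 and should return 88915, matching the naïve fold from the question.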
Here is a solution:
Hard-code the number of digits for intervals of the form [0, 10**k - 1];
Use extremely simple integer arithmetic to deduce the value for any interval [a, b].
Below is code assuming 0 <= a <= b < 10**10.
I don't know Rust, so this is Python; but the algorithm is so simple that I believe the logic of the code is easy to follow.
def get_next_power_of_10(n):
    v, p = 10, 1
    while v <= n:
        v *= 10
        p += 1
    return (v, p)

# d[k] = number of digits in interval [0, 10**k - 1]
d = [1, 10, 190, 2890, 38890, 488890, 5888890, 68888890, 788888890, 8888888890, 98888888890]

def get_number_of_digits_in_0_n(n):
    (v, p) = get_next_power_of_10(n)
    return d[p] - p * (v - 1 - n)

def get_number_of_digits_in_interval(a, b):
    return get_number_of_digits_in_0_n(b) - get_number_of_digits_in_0_n(a - 1)
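Since the question was not about Python, here is a rough C sketch of the same table-based approach (my own port, only hand-checked against the 5..20005 example from the question; the function names are mine):
#include <stdio.h>

/* d[k] = number of digits in the interval [0, 10^k - 1], same table as above. */
static const long long d[] = {
    1, 10, 190, 2890, 38890, 488890, 5888890,
    68888890, 788888890, 8888888890LL, 98888888890LL
};

/* Smallest power of 10 strictly greater than n, and its exponent p. */
static void next_power_of_10(long long n, long long *v, int *p) {
    *v = 10;
    *p = 1;
    while (*v <= n) {
        *v *= 10;
        *p += 1;
    }
}

/* Number of digits of all integers in [0, n], for 0 <= n < 10^10.
   For n == -1 this conveniently returns 0, which makes the interval
   formula below work when a == 0. */
static long long digits_in_0_n(long long n) {
    long long v;
    int p;
    next_power_of_10(n, &v, &p);
    return d[p] - (long long)p * (v - 1 - n);
}

static long long digits_in_interval(long long a, long long b) {
    return digits_in_0_n(b) - digits_in_0_n(a - 1);
}

int main(void) {
    printf("%lld\n", digits_in_interval(5, 20005));  /* expected: 88915 */
    return 0;
}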
I'd like to prove this loop implementation of Euclidean division in Frama-C:
/*@
  requires a >= 0 && 0 < b;
  ensures \result == a / b;
*/
int euclid_div(const int a, const int b)
{
    int q = 0;
    int r = a;
    /*@
      loop invariant a == b*q+r && r >= 0;
      loop assigns q, r;
      loop variant r;
    */
    while (b <= r)
    {
        q++;
        r -= b;
    }
    return q;
}
But the post-condition fails to prove automatically (the loop invariant is proved fine):
Goal Post-condition:
Let x = r + (b * euclid_div_0).
Assume {
(* Pre-condition *)
Have: (0 < b) /\ (0 <= x).
(* Invariant *)
Have: 0 <= r.
(* Else *)
Have: r < b.
}
Prove: (x / b) = euclid_div_0.
--------------------------------------------------------------------------------
Prover Alt-Ergo: Unknown (250ms).
It does have all the hypotheses of Euclidean division; does anyone know why it cannot conclude?
As indicated by Mohamed Iguernlala's answer, automated provers are not very comfortable with non-linear arithmetic. It is possible to do interactive proofs with WP, either directly within the GUI (see section 2.3 of the WP Manual for more information), or by using Coq (double-click on the appropriate cell of the WP Goals tab of the GUI to launch CoqIDE on the corresponding goal).
It is usually better to use Coq on ACSL lemmas, as you can focus on the exact formula you want to prove manually, without being bothered by the logical model of the code you're trying to prove. Using this approach, I've been able to prove your post-condition with the following intermediate lemma:
/*@
  // WP's interactive prover selects lemmas based on the predicate and
  // function names which appear in them, but does not take arithmetic
  // operators into account 😭. Hence the DIV definition.
  logic integer DIV(integer a, integer b) = a / b;

  lemma div_rem:
    \forall integer a,b,q,r; a >= 0 ==> 0 < b ==> 0 <= r < b ==>
      a == b*q+r ==> q == DIV(a, b);
*/
/*@
  requires a >= 0 && 0 < b;
  ensures \result == DIV(a, b);
*/
int euclid_div(const int a, const int b)
{
    int q = 0;
    int r = a;
    /*@
      loop invariant a == b*q+r;
      loop invariant r >= 0;
      loop assigns q, r;
      loop variant r;
    */
    while (b <= r)
    {
        q++;
        r -= b;
    }
    /*@ assert 0 <= r < b; */
    /*@ assert a == b*q+r; */
    return q;
}
More precisely, the lemma itself is proved with the following Coq script:
intros a b q prod Hb Ha Hle Hge.
unfold L_DIV.
generalize (Cdiv_cases a b).
intros Hcdiv; destruct Hcdiv.
clear H0.
rewrite H; auto with zarith.
clear H.
symmetry; apply (Zdiv_unique a b q (a-prod)); auto with zarith.
unfold prod; simpl.
assert (b*q = q*b); auto with zarith.
The post-condition itself then only requires instantiating the lemma with the appropriate arguments.
Because it's non-linear arithmetic, which is sometimes hard for automatic (SMT) solvers.
I rewrote the goal in SMT-LIB 2 format, and none of Alt-Ergo 2.2, CVC4 1.5 and Z3 4.6.0 is able to prove it:
(set-logic QF_NIA)

(declare-const i Int)
(declare-const i_1 Int)
(declare-const i_2 Int)

(assert (>= i_1 0))
(assert (> i_2 0))
(assert (>= i 0))
(assert (< i i_2))

; proved by alt-ergo 2.2 and z3 4.6.0 if these two asserts are uncommented
;(assert (<= i_1 10))
;(assert (<= i_2 10))

(assert
  (not
    (= i_1
       (div
         (+ i (* i_1 i_2))
         i_2))))

(check-sat)
If you change your post-condition like this, it is proved by Alt-Ergo:
ensures \exists int r;
  a == b * \result + r && 0 <= r && r < b;
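For reference, here is the full function from the question re-assembled with only the post-condition replaced by that existential form (nothing else changed; I have not re-run WP on this exact assembly):
/*@
  requires a >= 0 && 0 < b;
  ensures \exists int r;
    a == b * \result + r && 0 <= r && r < b;
*/
int euclid_div(const int a, const int b)
{
    int q = 0;
    int r = a;
    /*@
      loop invariant a == b*q+r && r >= 0;
      loop assigns q, r;
      loop variant r;
    */
    while (b <= r)
    {
        q++;
        r -= b;
    }
    return q;
}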
I have the following program:
procedure Main with SPARK_Mode is
   F : array (0 .. 10) of Integer := (0, 1, others => 0);
begin
   for I in 2 .. F'Last loop
      F (I) := F (I - 1) + F (I - 2);
   end loop;
end Main;
If I run gnatprove, I get the following result, pointing to the + sign:
medium: overflow check might fail
Does this mean that F (I - 1) could be equal to Integer'Last, and adding anything to that would overflow? If so, then is it not clear from the flow of the program that this is impossible? Or do I need to specify this with a contract? If not, then what does it mean?
A counterexample shows that gnatprove is indeed worried about the bounds of Integer in this case:
medium: overflow check might fail (e.g. when F = (1 => -1, others => -2147483648) and I = 2)
Consider adding a loop invariant to your code. The following is an example from the book "Building High Integrity Applications with Spark".
procedure Copy_Into (Buffer : out Buffer_Type;
                     Source : in  String) is
   Characters_To_Copy : Buffer.Count_Type := Maximum_Buffer_Size;
begin
   Buffer := (others => ' ');  -- Initialize to all blanks
   if Source'Length < Characters_To_Copy then
      Characters_To_Copy := Source'Length;
   end if;
   for Index in Buffer.Count_Type range 1 .. Characters_To_Copy loop
      pragma Loop_Invariant
        (Characters_To_Copy <= Source'Length and
         Characters_To_Copy = Characters_To_Copy'Loop_Entry);
      Buffer (Index) := Source (Source'First + (Index - 1));
   end loop;
end Copy_Into;
This is already an old question, but I would like to add an answer anyway (just for future reference).
With the advancement of provers, the example as stated in the question now proves out of the box in GNAT CE 2019 (i.e. no loop invariant needed). A somewhat more advanced example can also be proven:
main.adb
procedure Main with SPARK_Mode is

   --  NOTE: The theoretical upper bound for N is 46 as
   --
   --     Fib (46)       <  2**31 - 1      <  Fib (47)
   --     1_836_311_903  <  2_147_483_647  <  2_971_215_073

   --  NOTE: Proved with Z3 only. Z3 is pretty good in arithmetic. Additional
   --  options for gnatprove:
   --
   --     --prover=Z3 --steps=0 --timeout=10 --report=all

   type Seq is array (Natural range <>) of Natural;

   function Fibonacci (N : Natural) return Seq with
     Pre  => (N in 2 .. 46),
     Post => (Fibonacci'Result (0) = 0)
       and then (Fibonacci'Result (1) = 1)
       and then (for all I in 2 .. N =>
                   Fibonacci'Result (I) = Fibonacci'Result (I - 1) + Fibonacci'Result (I - 2));

   ---------------
   -- Fibonacci --
   ---------------

   function Fibonacci (N : Natural) return Seq is
      F : Seq (0 .. N) := (0, 1, others => 0);
   begin
      for I in 2 .. N loop
         F (I) := F (I - 1) + F (I - 2);

         pragma Loop_Invariant
           (for all J in 2 .. I =>
              F (J) = F (J - 1) + F (J - 2));

         --  NOTE: The loop invariant below helps the prover to prove the
         --  absence of overflow. It "reminds" the prover that all values
         --  from iteration 3 onwards are strictly monotonically increasing.
         --  Hence, if absence of overflow is proven in this iteration,
         --  then absence is proven for all previous iterations.

         pragma Loop_Invariant
           (for all J in 3 .. I =>
              F (J) > F (J - 1));

      end loop;
      return F;
   end Fibonacci;

begin
   null;
end Main;
This loop invariant should work, since 2^(n-1) + 2^(n-2) < 2^n, but I can't convince the provers:
procedure Fibonacci with SPARK_Mode is
   F : array (0 .. 10) of Natural := (0 => 0,
                                      1 => 1,
                                      others => 0);
begin
   for I in 2 .. F'Last loop
      pragma Loop_Invariant
        (for all J in F'Range => F (J) < 2 ** J);
      F (I) := F (I - 1) + F (I - 2);
   end loop;
end Fibonacci;
You can probably convince the provers with a bit of manual assistance (showing how 2^(n-1) + 2^(n-2) = 2^(n-2) * (2 + 1) = 3/4 * 2^n < 2^n).
I am practicing F#. I am trying to create a simple program that can find a pair of prime numbers that, summed together, equal a natural number given as input (the Goldbach conjecture). A single pair of primes will be enough. We will assume the input to be an even number.
I first created a function to check if a number is prime:
let rec isPrime (x: int) (i: int) :bool =
    match x % i with
    | _ when float i > sqrt (float x) -> true
    | 0 -> false
    | _ -> isPrime x (i + 1)
Then, I am trying to develop a function that (a) looks for prime numbers, (b) compares their sum with the input z, and (c) returns a tuple when it finds the two numbers. The function is probably not correct yet, but I would like to understand the reason behind this problem:
let rec sumPrime (z: int) (j: int) (k: int) :int * int =
    match isPrime j, isPrime k with
    | 0, 0 when j + k > z -> (0, 0)
    | 0, 0 -> sumPrime (j + 1) (k + 1)
    | _, 0 -> sumPrime j (k + 1)
    | 0, _ -> sumPrime (j + 1) k
    | _, _ -> if j + k < z then
                  sumPrime (j + 1) k
              elif j + k = z then
                  (j, k)
The problem: even though I specified that the output should be a tuple (:int * int), the compiler protests, claiming that the expected output should be of type bool. When in trouble, I usually refer to F# for fun and profit, which I love, but this time I cannot find the problem. Any suggestion is greatly appreciated.
Your code has three problems that I've spotted:
Your isPrime returns a bool (as you've specified), but your match expression in sumPrime is matching against integers (in F#, the Boolean value false is not the same as the integer value 0). Your match expression should look like:
match isPrime j, isPrime k with
| false, false when j + k > z -> (0, 0)
| false, false -> ...
| true, false -> ...
| false, true -> ...
| true, true -> ...
You have an if...elif expression in your true, true case, but there's no final else. By default, the final else of an if expression returns (), the unit type. So once you fix your first problem, you'll find that F# is complaining about a type mismatch between int * int and unit. You'll need to add an else condition to your final match case to say what to do if j + k > z.
You are repeatedly calling your sumPrime function, which takes three parameters, with just two parameters. That is perfectly legal in F#, since it's a curried language: calling sumPrime with two parameters produces the type int -> int * int: a function that takes a single int and returns a tuple of ints. But that's not what you're actually trying to do. Make sure you specify a value for z in all your recursive calls.
With those three changes, you should probably see your compiler errors go away.
I'm having difficulty determining the big O of simple recursive methods. How can I calculate big-O for these methods?
Case 1) Find big-O for method f:
int f(int x) {
    if (x < 1) return 1;
    return f(x-1) + g(x);
}

int g(int x) {
    if (x < 2) return 1;
    return f(x-1) + g(x/2);
}
Case 2)
int test(int n) {
    if (n <= 2) return 1;
    return test(n-2) * test(n-2);
}
Case 3)
int T(int n) {
    if (n <= 1) return 1;
    return T(n/2) + T(n/2);
}
Case 1
Setting the base cases aside (g(1) = g(0) = 1, etc.), you can rewrite g in terms of f:
f(n) = f(n-1) + g(n) <=> g(n) = f(n)-f(n-1)
We know that g is defined as:
g(n) = f(n-1) + g(n/2)
If we replace g(n/2) with the rewritten form above, we get:
g(n) = f(n-1) + f(n/2) - f(n/2-1)
Which means that we can rewrite f without any reference to g, by replacing g(n) in the original definition of f with the formula above:
f(n) = f(n-1) + f(n-1) + f(n/2) - f(n/2-1)
To double check that this is equivalent, you can run this program, which accepts an integer n as the first argument, and prints the result of the original f(n) followed by the rewritten form of f(n) (called f2 in the code):
#include <stdio.h>

int g(int x);

int f(int x) {
    if (x < 1)
        return 1;
    return f(x-1) + g(x);
}

int g(int x) {
    if (x < 2)
        return 1;
    return f(x-1) + g(x/2);
}

int f2(int x) {
    if (x < 1)
        return 1;
    return f2(x-1) + f2(x-1) + f2(x/2) - f2(x/2-1);
}

int main(int argc, char *argv[]) {
    int n;
    sscanf(argv[1], "%d", &n);
    printf("%d\n", f(n));
    printf("%d\n", f2(n));
    return 0;
}
Some examples:
$ ./a.out 10
1952
1952
$ ./a.out 11
3932
3932
$ ./a.out 12
7923
7923
$ ./a.out 13
15905
15905
$ ./a.out 14
31928
31928
$ ./a.out 15
63974
63974
Now, if you imagine the recursion tree, each node branches off into 4 sub-trees (one for each of f(n-1), f(n-1), f(n/2) and f(n/2-1)). The sizes of the sub-trees are not all the same: for example, if we descend into a sub-tree and always follow one of the 2 rightmost branches, we get a binary tree of depth log(N). But other paths (always following an f(n-1) branch) have depth n, and every node along them branches into two f(n-1) sub-trees. Because of this, we can say it's definitely exponential.
It's a bit hard to get the exact number, but an obvious upper bound is O(4^N), since each node has at most 4 children and the tree is at most N deep. This disregards the fact that some branches are only log(N) deep, so in reality it's a bit better than O(4^N).
Case 2
Think about the recursion tree again. At each point, we branch twice (test(n-2) and test(n-2)). Because we decrease n by 2 on each call, the tree will be O(n/2) deep, so we need O(2^(n/2)) time to traverse it; again, exponential growth. Not particularly interesting.
(Side note: if you were to use memoization here, this would be linear!).
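To illustrate that side note, here is a minimal sketch of the memoization idea in C (the names test_memo, MAX_N and the cache layout are my own choices, and it assumes 0 <= n <= MAX_N):
#define MAX_N 64

static int memo[MAX_N + 1];
static int have[MAX_N + 1];

/* Same recursion as test(), but each distinct n is evaluated only once,
   so the number of calls grows linearly in n instead of as 2^(n/2). */
int test_memo(int n) {
    if (n <= 2)
        return 1;
    if (!have[n]) {
        int sub = test_memo(n - 2);  /* the two identical recursive calls collapse into one */
        memo[n] = sub * sub;
        have[n] = 1;
    }
    return memo[n];
}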
Case 3
Similar logic to case 2, but this time the tree has depth log(N) (because that's how many times you need to divide N by 2 to reach the base case), so it has 2^log(N) = N leaves and roughly 2N nodes in total. So it's linear.