I have the following problem when analyzing if-conditions with my plugin. When I analyze code like
if ((a && b) || c)
Frama-C creates code like this:
if (a) {
  if (b) {
    goto _LOR;
  }
  else {
    goto _LAND;
  }
}
else {
  _LAND: ;
  if (c) {
    _LOR: //....
For my analysis, I want the control flow without the condition split up, so that a single if statement remains. The reason is that once the conditional is split, reaching _LOR appears to depend on the or-part only.
Example: building the CFG as in the ViewCfg plugin from the developer guide shows that a=(a+b)+c can be reached through both the then-path and the else-path of if (a). My plugin cancels this out at the block statement, so that after a "simple" if, the statement no longer depends on the condition.
Is there a possibility to block the splitting of the if, by a command-line argument or something like that?
An undocumented and unsupported solution exists. Before compiling Frama-C, in function initCIL of cil.ml, change
theMachine.useLogicalOperators <- false (* do not use lazy LAND and LOR *);
into
theMachine.useLogicalOperators <- true;
The normalization will use logical || and && operators instead of gotos.
Please note that this is unsupported for a good reason. The Frama-C plugins packaged with the kernel expect an AST in which those operators are not used, so they will probably crash or do something unsound on your program. Use at your own risk!
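To make the effect of the default normalization concrete, here is a minimal hand-written sketch in C. It is not actual Frama-C output: the labels label_land and label_lor stand in for _LAND and _LOR, and the branch bodies (setting a flag named taken) are assumptions added for illustration. It only shows that the goto-based form takes the same branch as the original condition for every input.

```c
#include <stdbool.h>

/* The condition as written in the source program. */
bool original(bool a, bool b, bool c)
{
    if ((a && b) || c) {
        return true;    /* then-branch */
    } else {
        return false;   /* else-branch */
    }
}

/* Hand-written sketch of the goto-based normal form from the question.
   label_land / label_lor stand in for _LAND / _LOR; the bodies that set
   `taken` are assumed for illustration. */
bool normalized(bool a, bool b, bool c)
{
    bool taken = false;
    if (a) {
        if (b) {
            goto label_lor;     /* a && b holds: skip the test of c */
        } else {
            goto label_land;    /* a && b fails: fall back to c */
        }
    } else {
      label_land: ;
        if (c) {
          label_lor:
            taken = true;       /* then-branch of the original if */
        } else {
            taken = false;      /* else-branch of the original if */
        }
    }
    return taken;
}
```

In the normalized form, the then-branch is entered either by the jump from the test of b or by the test of c, which is why an analysis working on this AST sees the branch as guarded by c alone.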
Using top, I manually measured the following memory usages at the specific points designated in the comments of the following code block:
x <- matrix(rnorm(1e9),nrow=1e4)
#~15gb
gc()
# ~7gb after gc()
y <- as.vector(x)
gc()
#~15gb after gc()
It's pretty clear that rnorm(1e9) is a ~7gb vector that's then copied to create the matrix. gc() removes the original vector since it's not assigned to anything. as.vector(x) then coerces and copies the data to vector.
My question is, why can't these three objects all point to the same memory block (at least until one is modified)? Isn't a matrix really just a vector with some additional metadata?
This is in R version 3.6.2
edit: also tested in 4.0.3, same results.
You're asking about the reasoning behind this behaviour. That seems better suited to R-devel, and I assume the answer would be "no one knows". The relevant function in the R source is do_asvector.
Going down the source code of a call to as.vector(matrix(...)), it is important to note that the default argument for mode is any. This translates to ANYSXP (see R internals). This lets us find the evil culprit (line 1524) of the copy-behaviour.
// source reference: do_asvector
...
if(type == ANYSXP || TYPEOF(x) == type) {
    switch(TYPEOF(x)) {
    case LGLSXP:
    case INTSXP:
    case REALSXP:
    case CPLXSXP:
    case STRSXP:
    case RAWSXP:
        if(ATTRIB(x) == R_NilValue) return x;
        ans = MAYBE_REFERENCED(x) ? duplicate(x) : x; // <== evil culprit
        CLEAR_ATTRIB(ans);
        return ans;
    case EXPRSXP:
    case VECSXP:
        return x;
    default:
        ;
    }
...
Going one step further, we can find the definition for MAYBE_REFERENCED in src/include/Rinternals.h, and by digging a bit we can find that it checks whether sxpinfo.named is equal to 0 (false) or not (true). What I am guessing here is that the assignment operator <- increments the sxpinfo.named counter and thus MAYBE_REFERENCED(x) returns TRUE and we get a duplicate (deep copy).
However, is this behaviour necessary?
That is a great question. If we had given an argument to mode other than "any" or class(x) (the same as our input class), we would skip the duplicate line and continue down the function until we hit ascommon. So I dug a bit further and looked at the source code for ascommon. There we can see that if we convert to a list manually (setting mode = "list"), ascommon only calls shallow_duplicate.
// Source reference: ascommon
---
    if ((type == LISTSXP) &&
        !(TYPEOF(u) == LANGSXP || TYPEOF(u) == LISTSXP ||
          TYPEOF(u) == EXPRSXP || TYPEOF(u) == VECSXP)) {
        if (MAYBE_REFERENCED(v)) v = shallow_duplicate(v); // <=== ascommon duplication behaviour
        CLEAR_ATTRIB(v);
    }
    return v;
}
---
So one could imagine that the call to duplicate in do_asvector could be replaced by a call to shallow_duplicate. Perhaps a "better safe than sorry" strategy was chosen when the code was originally implemented (prior to R-2.13.0 according to a comment in the source code), or perhaps there is a scenario in one of the types not handled by ascommon that requires a deep-copy.
For now I would test if the function does a deep-copy if we set mode='list' or pass the list without assignment. In either case it might not be a bad idea to send a follow-up question to the R-devel mailing list.
Edit: <- behaviour
I took the liberty of confirming my suspicion and looked at the source code for <-. I previously stated that I assumed <- incremented sxpinfo.named, and we can confirm this by looking at do_set (the C source code for <-). When assigning as x <- ..., x is a SYMSXP, and thus we can see that the source code calls INCREMENT_NAMED, which in turn calls SET_NAMED(x, NAMED(x) + 1). So, all else being equal, we should see the copy behaviour for x <- matrix(...); y <- as.vector(x), while we shouldn't for y <- as.vector(matrix(...)).
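The interaction described above can be sketched with a toy model in plain C. This is not R's real API: ToyVec, toy_assign and toy_as_vector are hypothetical names, the named field only imitates sxpinfo.named, and has_dim stands in for the dim attribute. The sketch assumes the logic of the quoted do_asvector branch and nothing more.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R numeric vector (hypothetical, not R's real layout):
   header fields and data live in one allocation. */
typedef struct {
    int named;       /* > 0 once bound to a variable (imitates sxpinfo.named) */
    bool has_dim;    /* imitates the presence of a dim attribute */
    size_t n;        /* number of elements */
    double data[];   /* flexible array member: the data is part of the object */
} ToyVec;

ToyVec *toy_alloc(size_t n, bool has_dim)
{
    ToyVec *v = malloc(sizeof *v + n * sizeof v->data[0]);
    if (v == NULL) return NULL;
    v->named = 0;
    v->has_dim = has_dim;
    v->n = n;
    memset(v->data, 0, n * sizeof v->data[0]);
    return v;
}

/* Deep copy: a new header and a new copy of the data. */
ToyVec *toy_duplicate(const ToyVec *x)
{
    ToyVec *copy = toy_alloc(x->n, x->has_dim);
    if (copy != NULL) memcpy(copy->data, x->data, x->n * sizeof x->data[0]);
    return copy;
}

/* Imitates x <- value: binding to a name bumps the counter. */
void toy_assign(ToyVec *x) { x->named++; }

/* Imitates the quoted branch of do_asvector. */
ToyVec *toy_as_vector(ToyVec *x)
{
    if (!x->has_dim) return x;                      /* no attributes: return as is */
    ToyVec *ans = (x->named > 0) ? toy_duplicate(x) : x;
    if (ans != NULL) ans->has_dim = false;          /* imitates CLEAR_ATTRIB */
    return ans;
}
```

In this model, calling toy_as_vector on an object that went through toy_assign returns a fresh copy, while calling it on an unbound temporary strips the attribute in place and returns the same pointer.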
At the final gc(), you have x pointing to a vector with a dim attribute, and y pointing to a vector without any dim attribute. The data is an intrinsic part of the object, it's not an attribute, so those two vectors have to be different.
If matrices had been implemented as lists, e.g.
x <- list(data = rnorm(1e9), dim = c(1e4, 1e5))
then a shallow copy would be possible, but that's not how it was done. You can read the details of the internal structure of objects in the R Internals manual. For the current release, that's here: https://cloud.r-project.org/doc/manuals/r-release/R-ints.html#SEXPs .
You may wonder why things were implemented this way. I suspect it's intended to be efficient for the common use cases. Converting a matrix to a vector isn't generally necessary (you can treat x as a vector already, e.g. x[100000] and y[100000] will give the same value), so there's no need for "convert to vector" to be efficient. On the other hand, extracting elements is very common, so you don't want to have an extra pointer dereference slowing that down.
Why do some people use while(true){} blocks in their code? How does it work?
It's an infinite loop. At each iteration, the condition will be evaluated. Since the condition is true, which is always... true... the loop will run forever. Exiting the loop is done by checking something inside the loop, and then breaking if necessary.
Placing the break check inside the loop, instead of using it as the condition, can make it clearer that you're expecting this to run until some event occurs.
A common scenario where this is used is in games; you want to keep processing the action and rendering frames until the game is quit.
It's just a loop that never ends on its own, known as an infinite loop. (Oftentimes, that's a bad thing.)
When it's empty, it serves to halt the program indefinitely*; otherwise there's typically some condition in the loop that, when true, breaks the loop:
while (true)
{
    // ...
    if (stopLoop)
        break;
    // ...
}
This is often cleaner than an auxiliary flag:
bool run = true;
while (run)
{
    // ...
    if (stopLoop)
    {
        run = false;
        continue; // jump to top
    }
    // ...
}
Also note some will recommend for (;;) instead, for various reasons. (Namely, it might get rid of a warning akin to "conditional expression is always true".)
*In most languages.
Rather than stuff all possible conditions in the while statement,
// Always tests all conditions in loop header:
while( (condition1 && condition2) || condition3 || conditionN_etc ) {
    // logic...
    if (notable_condition)
        continue; // skip remainder, go direct to evaluation portion of loop
    // more logic
    // maybe more notable conditions use keyword: continue
}
Some programmers might argue it's better to put the conditions throughout the logic (i.e. not just inside the loop header) and to employ break statements to get out at appropriate places. This approach usually negates the original conditions, so that they state when to leave the loop instead of when to keep looping.
// Always tests all conditions in body of loop logic:
while(true) {
    //logic...
    if (!condition1 || !condition2)
        break; // Break out for good.
    // more logic...
    if (!condition3)
        break;
    // even more logic ...
}
In real life it's often a more gray mixture, a combination of all these things, instead of a polarized decision to go one way or another.
Usage will depend on the complexity of the logic and the preferences of the programmer .. and maybe on the accepted answer of this thread :)
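As a concrete comparison of the two styles, here is a minimal C sketch. The task (summing values until the input ends, a negative sentinel appears, or a limit would be exceeded) and the function names sum_header and sum_breaks are made up for illustration; both functions are meant to compute the same result.

```c
#include <stddef.h>

/* Style 1: every exit condition sits in the loop header. */
int sum_header(const int *values, size_t n, int limit)
{
    int sum = 0;
    size_t i = 0;
    while (i < n && values[i] >= 0 && sum + values[i] <= limit) {
        sum += values[i];
        i++;
    }
    return sum;
}

/* Style 2: while (1) with the negated conditions as breaks in the body. */
int sum_breaks(const int *values, size_t n, int limit)
{
    int sum = 0;
    size_t i = 0;
    while (1) {
        if (i >= n)
            break;                      /* ran out of input */
        if (values[i] < 0)
            break;                      /* negative sentinel */
        if (sum + values[i] > limit)
            break;                      /* adding would exceed the limit */
        sum += values[i];
        i++;
    }
    return sum;
}
```

The second version is longer, but each exit has its own line and its own comment, which is the readability argument made for the break style.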
Also don't forget about do..while. Some solutions use that version of the while construct to shape the conditional logic to the programmer's liking.
do {
    //logic with possible conditional tests and break or continue
} while (true); /* or many conditional tests */
In summary it's just nice to have options as a programmer. So don't forget to thank your compiler authors.
When Edsger W. Dijkstra was young, this was equivalent to:
Do loop initialization
label a:
Do some code
If (Loop is stoppable and End condition is met) goto label b
/* nowadays replaced by some kind of break() */
Do some more code, probably incrementing counters
go to label a
label b:
Be happy and continue
After Dijkstra decided to become an Antigotoist, and convinced hordes of programmers to do so, a religious faith came upon earth and the truthiness of code was evident.
So the
Do loop initialization
While (true){
some code
If (Loop is stoppable and End condition is met) break();
Do some more code, probably incrementing counters
}
Be happy and continue
Replaced the abomination.
Not happy with that, fanatics went above and beyond. Once it was proved that recursion was better, clearer and more general than looping, and that variables are just a diabolic incarnation, Functional Programming, as a dream, came true:
Nest[f[.],x, forever[May God help you break]]
And so, recursive loops became really unstoppable, or at least not demonstrably stoppable.
while (the condition) {do the function}
While the condition is true, it will do the function.
So with while(true) the condition is always true and it will continue looping. The code after the loop will never be reached.
It's a loop that runs forever, unless there's a break statement somewhere inside the body.
The real point of while (true) {..} is when no single exit condition is clearly the primary one. It is a nice way to tell the reader: "well, there are actually break conditions A, B, C, ..., but computing them is too lengthy for the loop header, so they were put into inner blocks independently, in order of expected probability of occurrence".
This means that the code inside it will run indefinitely.
i = 0
while(true)
{
    i++;
}
echo i; //this code will never be reached
Unless there is something like this inside the curly brackets:
if (i > 100) {
    break; //this will break the while loop
}
Another way to stop the while loop is:
if (i > 100) {
    return i;
}
It is useful during testing, or during casual coding, or, as another answer points out, in video games.
But what I consider bad practice is using it in production code.
For example, during debugging I want to know immediately what needs to happen for the while loop to stop. I don't want to search the function for some hidden break or return.
Or the programmer can easily forget to add one, and data in a database can be affected before the code is stopped by other means.
So the ideal would be something like this:
i = 0
while(i < 100)
{
    i++;
}
echo i; //this code will be reached in this scenario
I am using frama-c Aluminium-20160502 version and I want to find out the dependencies in a large program. When using the option -deps in the command line I found some dependencies are missing. In particular when several conditions are joined in one if, the dependency analysis stops whenever one condition is false. In this example here:
#include<stdio.h>
#include<stdbool.h>
/*Global variable definitions*/
bool A = true;
bool B = false;
bool C = true;
bool X;
bool Y;
bool res;
void main(){
    if (A && B && C) {
        res = X;
    }else res = Y;
}
when I try: frama-c -deps program.c
frama shows the following dependencies:
[from] ====== DEPENDENCIES COMPUTED ======
These dependencies hold at termination for the executions that terminate:
[from] Function main:
res FROM A; B; Y
[from] ====== END OF DEPENDENCIES ======
so it does not reach the condition C because already B is false.
I wonder if there is a way to tell frama to compute all dependencies even if the condition is not fulfilled. I tried with the option -slevel but with no results. I know there is a way to use an interval Frama_C_interval(0,1) but when I use it the variable using this function is not shown in the dependencies. I would like to get X and Y dependent on A,B,C and res dependent on A,B,C,X,Y
Any ideas?
The From plugin uses the results of the Value analysis plugin. In your example, the values of A and B are sufficiently precise that Value is able to infer that the condition is entirely determined (since the && operator is lazily evaluated, from left to right) before reaching C, therefore C never affects the outcome and is thus not a dependency from the point of view of From.
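The lazy evaluation that this explanation relies on can be observed directly in C. In the sketch below, the read_a, read_b and read_c helpers and their counters are hypothetical instrumentation added for illustration; the globals have the same initial values as in the question.

```c
#include <stdbool.h>

/* Same initial values as in the question. */
bool A = true;
bool B = false;
bool C = true;

/* Hypothetical instrumentation: count how often each global is read. */
int reads_a = 0, reads_b = 0, reads_c = 0;

bool read_a(void) { reads_a++; return A; }
bool read_b(void) { reads_b++; return B; }
bool read_c(void) { reads_c++; return C; }

/* Same shape as the condition in main from the question. */
bool condition(void)
{
    /* && evaluates left to right and stops at the first false operand. */
    return read_a() && read_b() && read_c();
}
```

With these values, one call to condition() reads A and B once each and never reads C, which matches the absence of C from the dependencies reported for res.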
Unfortunately, Frama_C_interval cannot be used directly at global initializers:
user error: Call to Frama_C_interval in constant.
You can, however, use a "hack" (not always the best solution, but works here):
volatile bool nondet;
bool A = nondet;
bool B = nondet;
bool C = nondet;
...
Note that because nondet is volatile, each variable A, B and C is assigned a different non-deterministic value.
In this case, Value has to consider both branches of the conditionals, and therefore C becomes a dependency in your example, since it is possible that C will be read during the execution of main. You'll then have:
These dependencies hold at termination for the executions that terminate:
[from] Function main:
res FROM A; B; C; X; Y
Note that some plugins require special treatment when dealing with volatile variables, so this is not always the best solution.
This however only deals with some kinds of dependencies. As mentioned in Chapter 6 of the Value Analysis user manual, the From plugin computes functional, imperative and operational dependencies. These do not include indirect control dependencies, which are those such as X from A, B, C, as you mention. For those, you need the PDG (Program Dependence Graph) plugin, but it does not currently have a textual output of the dependencies. You can use -pdg to compute it, and then -pdg-dot <file> to export the dependency graph in dot (graphviz) format. Here's what I get for your main function (using the volatile hack as mentioned previously):
Finally, as a side note: -slevel is mostly used to improve precision, but in your example you already have too much precision (that is, Value is already able to infer that C is never read inside main).
A predicate on booleans seems a little silly to me (well, at least in the following scenario):
static Set<A> aSet = ...;
boolean checkCondition(B b) {
    return aSet.stream()
               .map(b::aMethodReturningBoolean)
               .filter((Boolean check) -> check)
               .limit(1).count() > 0;
}
Given the object b, I am checking whether there is at least one member of aSet that satisfies a certain condition with respect to b.
Everything is working fine, but the line filter((Boolean check) -> check) is like a tiny little pin pricking me! Is there a way I can avoid it? I mean, if I have a line in my code that is literally the identity function, then there must be something wrong with my approach.
All you need is
return aSet.stream().anyMatch(b::aMethodReturningBoolean);
which is much more readable.
I would like to print all the preconditions generated by Frama-C, which according to the calculus.ml code are stored in a table. I am mainly interested in the initial predicate that is converted to a logic formula and sent to the solvers. Can this be done? The code I am trying this with is given below:
int main()
{
    int x=42,y=40;
    if(x<50)
    {
        x=x+2; y=x-y;
    }
    else
    {
        x=x-2; y=x-y;
    }
    //@ assert P: y>0;
}
I think you can get what you want by using the -wp-out dir option and then looking at the generated .ergo file in the dir directory, but some simplifications might already have been applied. I don't think you can turn off these internal simplifications.