I am attempting to replicate some Python code in Fortran 90 to make it work within a larger Fortran project I am contributing to. Specifically, I am trying to convert some code that recursively identifies upstream paths in a binary tree, such as in the following example:
4 -- 5 -- 8
/
2 --- 6 - 9 -- 10
/ \
1 -- 11
\
3 ----7
This tree is represented and traversed by:
class Node(object):
def __init__(self):
self.name = None
self.parent = None
self.children = set()
self._upstream = set()
def __repr__(self):
return "Node({})".format(self.name)
# Recursively search upstream in the drainage network, returns a set of all paths
#property
def upstream_paths(self):
if not self._paths:
for child in self.children:
if child.upstream_paths:
self._paths.extend([child] + path for path in child.upstream_paths)
else:
self._paths.append([child])
return self._paths
from collections import defaultdict
edges = {(11, 9), (10, 9), (9, 6), (6, 2), (8, 5), (5, 4), (4, 2), (2, 1), (3, 1), (7, 3)}
nodes = collections.defaultdict(lambda: Node())
for node, parent in edges:
nodes[node].name = node
nodes[parent].name = parent
nodes[node].parent = nodes[parent]
nodes[parent].children.add(nodes[node])
Is it possible to implement anything like this in Fortran 90? I have a decent understanding of recursion in f90 but without the object-orientedness of Python, I can't imagine how this can be done.
EDIT:
For further description:
What I intend to do is identify upstream drainage paths in a dendritic stream network. For any given outlet (root) there may be hundreds or thousands upstream paths. There would be no modification of the network required once it is initialized, although there will be calls to many different nodes within the network (in the above example, a call will be made for all upstream paths from 1, from 6, from 5, etc.) I've been looking into using pointers but can't seem to find any examples out there of this kind of path-finding.
I find that it's much easier to convert Python towards another language while you're still in Python (and then you can test in Python every step of the way). Python is so flexible that it is much easier to make Python that looks like F90 (or almost anything else) than the other way around.
I used to do this with assembly language. I'd modify my python to make it look more like assembler, then try to code it in assembler, realize I'd missed something, then modify the Python again. By the time I finished, I had Python and assembly language that were easy to read and had one-to-one correspondence, and the Python had been tested every step of the way. The assembly language just worked(TM) when I ran it.
FWIW, if removing the recursion is something you are considering, here is an excellent guide on the best way to do exactly that.
Related
EDIT: The original question had unnecessary details
I have a source file which I do value analysis in Frama-C, some of the code is highlighted as dead code in the normalized window, no the original source code.
Can I obtain a slice of the original code that removes the dead code?
Short answer: there's nothing in the current Frama-C version that will let you do that directly. Moreover, if your original code contains macros, Frama-C will not even see the real original code, as it relies on an external preprocessor (e.g. cpp) to do macro expansion.
Longer answer: Each statement in the normalized (aka CIL) Abstract Syntax Tree (AST, the internal representation of C code within Frama-C) contains information about the location (start point and end point) of the original statement where it stems from, and this information is also available in the original AST (aka Cabs). It might thus be possible for someone with a good knowledge of Frama-C's inner workings (e.g. a reader of the developer's manual), to build a correspondance between both, and to use that to detect dead statement in Cabs. Going even further, one could bypass Cabs, and identify zones in the original text of the program which are dead code. Note however that it would be a tedious and quite error prone (notably because a single original statement can be expanded in several normalized ones) task.
Given your clarifications, I stand by #Virgile's answer; but for people interested in performing some simplistic dead code elimination within Frama-C, the script below, gifted by a colleague who has no SO account, could be helpful.
(* remove_dead_code.ml *)
let main () =
!Db.Value.compute ();
Slicing.Api.Project.reset_slicing ();
let selection = ref Slicing.Api.Select.empty_selects in
let o = object (self)
inherit Visitor.frama_c_inplace
method !vstmt_aux stmt =
if Db.Value.is_reachable_stmt stmt then
selection :=
Slicing.Api.Select.select_stmt ~spare:true
!selection
stmt
(Extlib.the self#current_kf);
Cil.DoChildren
end in
Visitor.visitFramacFileSameGlobals o (Ast.get ());
Slicing.Api.Request.add_persistent_selection !selection;
Slicing.Api.Request.apply_all_internal ();
Slicing.Api.Slice.remove_uncalled ();
ignore (Slicing.Api.Project.extract "no-dead")
let () = Db.Main.extend main
Usage:
frama-c -load-script remove_dead_code.ml file.c -then-last -print -ocode output.c
Note that this script does not work in all cases and could have further improvements (e.g. to handle initializers), but for some quick-and-dirty hacking, it can still be helpful.
I wanted to count instructions in simple recursive fibo function O(2^n). I succeded to do so with bubble sort and matrix multiplication, but in this case it seemed like instruction count ignored my fibo function. Here is the code used for instrumentation:
// Insert a call at the entry point of a routine to increment the call count
RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_rtnCount), IARG_END);
// For each instruction of the routine
for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
{
// Insert a call to docount to increment the instruction counter for this rtn
INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_icount), IARG_END);
}
I started to wonder what's the difference between this program and the previous ones and my first thought was: here I'm not using an array.
This is what I realised after some manual tests:
a = 5; // instruction ignored by PIN and
// pretty much everything not using array
fibo[1] = 1 // instruction counted properly
a = fibo[1] // instruction ignored by PIN
So it seems like only instructions counted are writes to the memory (that's what I assume). After I changed my fibo function to this it works:
long fibonacciNumber(int n, long *fiboNumbers)
{
if (n < 2) {
fiboNumbers[n] = n;
return n;
}
fiboNumbers[n] = fiboNumbers[n-1] + fiboNumbers[n-2];
return fibonacciNumber(n - 1, fiboNumbers) + fibonacciNumber(n - 2, fiboNumbers);
}
But I would like to count instructions also for programs that aren't written by me. Is there a way to count all type of instrunctions? Is there any particular reason why only this instructions are counted? Any help appreciated.
//Edit
I used disassembly option in Visual Studio to check how it looks and it still makes no sense for me. I can't find the reason why only assingment to array is interpreted by PIN as instruction.
instruction_comparison
This exceeded all my expectations, counted as 2 instructions:
even 2 instructions, not one
PIN, like other low-level profiling and analysis tools, measures individual instructions, low-level orders like "add these two registers" or "load a value from that memory address". The sequence of instructions which a program comprises are generally produced from a high-level language like C++ through a compiler. An individual line of C++ code might be transformed into exactly one instruction, but it's also common for a line to translate to several instructions or even to zero instructions; and the instructions for a line of code may be interleaved with those of other instructions.
Your compiler can output an assembly-language file for your source code, showing what instructions were produced for which lines of code. (For GCC and Clang, this is done with the -S flag.) Note that reading the assembly code output from a compiler is not the best way to learn assembly. Also, I would point you to godbolt.org, a very convenient tool for analyzing assembly output.
I have:
(ps:ps (ps:var vertices (ps:lisp (cons 'list *VERTICES*))))
which evaluates to:
"var vertices = [0.0, -200.0, 0, ... 0.4, 40];"
which is the correct expected result.
Where:
ps refers to parenscript (full documentation is here).
*VERTICES* is just a plain flat list of
numbers in my global Lisp environment
However, When *VERTICES* is large, the evaluation leads to the error:
Error: Too many arguments.
While executing: PARENSCRIPT::COMPILE-SPECIAL-FORM, in process listener(1).
How do I get around this error?
Not knowing how parenscript really works internally, this problem is hard to solve. So I tried changing the way the list is passed to ps.
Here are a few failed attempts:
(ps:ps (ps:var vertices (ps:lisp (list *VERTICES*))))
=> "var vertices = 1(2, 3, 4, 0, 9, 0.1)();"
(ps:ps (ps:var vertices (ps:lisp *VERTICES*)))
=> "var vertices = 1(2, 3, 4, 0, 9, 0.1);"
(ps:ps (ps:var vertices *VERTICES*))
=> "var vertices = VERTICES;"
None are the correct expected output.
What is the right way to pass a Lisp list variable's value to parenscript to form a correct javascript array-variable assignment-statement?
I was able to reproduce this with 1000000 items. It seems that the parenscript compiler (which is sitting behind that ps macro) translates your list to a form (array 1 2 3 4 5 …) first. This means that array gets a million arguments, which is rather unusual for source code.
I'd recommend thinking over your design. Why do you want to create such a large vector in your source code? Should this maybe sit in a separate file, or be loaded at runtime over the network?
By the way, you can inspect what is happening by using the debugger you get thrown into at error. For example, in SLIME, I pressed v on the frame of compile-special-form to see the corresponding source code of parenscript. This showed the following expression:
(apply expression-impl (cdr form))
which suggests that expressions will be subject to call-arguments-limit.
Is there a way to enforce a dictionary being constant?
I have a function which reads out a file for parameters (and ignores comments) and stores it in a dict:
function getparameters(filename::AbstractString)
f = open(filename,"r")
dict = Dict{AbstractString, AbstractString}()
for ln in eachline(f)
m = match(r"^\s*(?P<key>\w+)\s+(?P<value>[\w+-.]+)", ln)
if m != nothing
dict[m[:key]] = m[:value]
end
end
close(f)
return dict
end
This works just fine. Since i have a lot of parameters, which i will end up using on different places, my idea was to let this dict be global. And as we all know, global variables are not that great, so i wanted to ensure that the dict and its members are immutable.
Is this a good approach? How do i do it? Do i have to do it?
Bonus answerable stuff :)
Is my code even ok? (it is the first thing i did with julia, and coming from c/c++ and python i have the tendencies to do things differently.) Do i need to check whether the file is actually open? Is my reading of the file "julia"-like? I could also readall and then use eachmatch. I don't see the "right way to do it" (like in python).
Why not use an ImmutableDict? It's defined in base but not exported. You use one as follows:
julia> id = Base.ImmutableDict("key1"=>1)
Base.ImmutableDict{String,Int64} with 1 entry:
"key1" => 1
julia> id["key1"]
1
julia> id["key1"] = 2
ERROR: MethodError: no method matching setindex!(::Base.ImmutableDict{String,Int64}, ::Int64, ::String)
in eval(::Module, ::Any) at .\boot.jl:234
in macro expansion at .\REPL.jl:92 [inlined]
in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at .\event.jl:46
julia> id2 = Base.ImmutableDict(id,"key2"=>2)
Base.ImmutableDict{String,Int64} with 2 entries:
"key2" => 2
"key1" => 1
julia> id.value
1
You may want to define a constructor which takes in an array of pairs (or keys and values) and uses that algorithm to define the whole dict (that's the only way to do so, see the note at the bottom).
Just an added note, the actual internal representation is that each dictionary only contains one key-value pair, and a dictionary. The get method just walks through the dictionaries checking if it has the right value. The reason for this is because arrays are mutable: if you did a naive construction of an immutable type with a mutable field, the field is still mutable and thus while id["key1"]=2 wouldn't work, id.keys[1]=2 would. They go around this by not using a mutable type for holding the values (thus holding only single values) and then also holding an immutable dict. If you wanted to make this work directly on arrays, you could use something like ImmutableArrays.jl but I don't think that you'd get a performance advantage because you'd still have to loop through the array when checking for a key...
First off, I am new to Julia (I have been using/learning it since only two weeks). So do not put any confidence in what I am going to say unless it is validated by others.
The dictionary data structure Dict is defined here
julia/base/dict.jl
There is also a data structure called ImmutableDict in that file. However as const variables aren't actually const why would immutable dictionaries be immutable?
The comment states:
ImmutableDict is a Dictionary implemented as an immutable linked list,
which is optimal for small dictionaries that are constructed over many individual insertions
Note that it is not possible to remove a value, although it can be partially overridden and hidden
by inserting a new value with the same key
So let us call what you want to define as a dictionary UnmodifiableDict to avoid confusion. Such object would probably have
a similar data structure as Dict.
a constructor that takes a Dict as input to fill its data structure.
specialization (a new dispatch?) of the the method setindex! that is called by the operator [] =
in order to forbid modification of the data structure. This should be the case of all other functions that end with ! and hence modify the data.
As far as I understood, It is only possible to have subtypes of abstract types. Therefore you can't make UnmodifiableDict as a subtype of Dict and only redefine functions such as setindex!
Unfortunately this is a needed restriction for having run-time types and not compile-time types. You can't have such a good performance without a few restrictions.
Bottom line:
The only solution I see is to copy paste the code of the type Dict and its functions, replace Dict by UnmodifiableDict everywhere and modify the functions that end with ! to raise an exception if called.
you may also want to have a look at those threads.
https://groups.google.com/forum/#!topic/julia-users/n-lqjybIO_w
https://github.com/JuliaLang/julia/issues/1974
REVISION
Thanks to Chris Rackauckas for pointing out the error in my earlier response. I'll leave it below as an illustration of what doesn't work. But, Chris is right, the const declaration doesn't actually seem to improve performance when you feed the dictionary into the function. Thus, see Chris' answer for the best resolution to this issue:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test(D)
for jdx = 1:1000
# D[2] = 2
for idx = 0.0:5:3600
a = D[idx]
end
end
end
## Times given after an initial run to allow for compiling
#time test(D1); # 0.017789 seconds (4 allocations: 160 bytes)
#time test(D2); # 0.015075 seconds (4 allocations: 160 bytes)
Old Response
If you want your dictionary to be a constant, you can use:
const MyDict = getparameters( .. )
Update Keep in mind though that in base Julia, unlike some other languages, it's not that you cannot redefine constants, instead, it's just that you get a warning when doing so.
julia> const a = 2
2
julia> a = 3
WARNING: redefining constant a
3
julia> a
3
It is odd that you don't get the constant redefinition warning when adding a new key-val pair to the dictionary. But, you still see the performance boost from declaring it as a constant:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test1()
for jdx = 1:1000
for idx = 0.0:5:3600
a = D1[idx]
end
end
end
function test2()
for jdx = 1:1000
for idx = 0.0:5:3600
a = D2[idx]
end
end
end
## Times given after an initial run to allow for compiling
#time test1(); # 0.049204 seconds (1.44 M allocations: 22.003 MB, 5.64% gc time)
#time test2(); # 0.013657 seconds (4 allocations: 160 bytes)
To add to the existing answers, if you like immutability and would like to get performant (but still persistent) operations which change and extend the dictionary, check out FunctionalCollections.jl's PersistentHashMap type.
If you want to maximize performance and take maximal advantage of immutability, and you don't plan on doing any operations on the dictionary whatsoever, consider implementing a perfect hash function-based dictionary. In fact, if your dictionary is a compile-time constant, these can even be computed ahead of time (using metaprogramming) and precompiled.
I have been using the chr library along with the jpl interface. I have a general inquiry though. I send the constraints from SWI Prolog to an instance of a java class from within my CHR program. The thing is if the input constraint is leq(A,B) for example, the names of the variables are gone, and the variable names that appear start with _G. This happens even if I try to print leq(A,B) without using the interface at all. It appears that whenever the variable is processed the name is replaced with a fresh one. My question is whether there is a way to do the mapping back. For example whether there is a way to know that _G123 corresponds to A and so on.
Thank you very much.
(This question has nothing to do with CHR nor is it specific to SWI).
The variable names you use when writing a Prolog program are discarded completely by the Prolog system. The reason is that this information cannot be used to print variables accurately. There might be several independent instances of that variable. So one would need to add some unique identifier to the variable name. Also, maintaining that information at runtime would incur significant overheads.
To see this, consider a predicate mylist/1.
?- [user].
|: mylist([]).
|: mylist([_E|Es]) :- mylist(Es).
|: % user://2 compiled 0.00 sec, 4 clauses
true.
Here, we have used the variable _E for each element of the list. The toplevel now prints all those elements with a unique identifier:
?- mylist(Fs).
Fs = [] ;
Fs = [_G295] ;
Fs = [_G295, _G298] .
Fs = [_G295, _G298, _G301] .
The second answer might be printed as Fs = [_E] instead. But what about the third? It cannot be printed as Fs = [_E,_E] since the elements are different variables. So something like Fs = [_E_295,_E_298] is the best we could get. However, this would imply a lot of extra book keeping.
But there is also another reason, why associating source code variable names with runtime variables would lead to extreme complexities: In different places, that variable might have a different name. Here is an artificial example to illustrate this:
p1([_A,_B]).
p2([_B,_A]).
And the query:
?- p1(L), p2(L).
L = [_G337, _G340].
What names, would you like, these two elements should have? The first element might have the name _A or _B or maybe even better: _A_or_B. Or, even _Ap1_and_Bp2. For whom will this be a benefit?
Note that the variable names mentioned in the query at the toplevel are retained:
?- Fs = [_,F|_], mylist(Fs).
Fs = [_G231, F] ;
Fs = [_G231, F, _G375] ;
Fs = [_G231, F, _G375, _G378]
So there is a way to get that information. On how to obtain the names of variables in SWI and YAP while reading a term, please refer to this question.