In a typical application, code doesn't get replaced very often, so whether old code is deallocated before the process exits is inconsequential. However, the use case I have in mind is not typical. If I were to replace functions a lot, would Julia garbage-collect the old code (source, compiled, and all intermediate representations)?
From what I understand, generated code is never freed.
See for example the discussion in https://github.com/JuliaLang/julia/issues/14495.
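A minimal sketch of the "replace functions a lot" scenario (the repeated @eval redefinition is just a made-up way to force many recompilations): each redefinition compiles fresh code, and per that issue the previously generated code is not reclaimed, so memory grows with the number of replacements.
for i in 1:1000
    @eval f(x) = x + $i   # redefine f; the next call triggers a new compilation
    @eval f(1)            # call at top level so the latest definition is used
end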
I wonder what the fundamental difference is between binding and linking when working with Ada code. I couldn't find a good explanation on Google, which is why I'm asking the question.
For the binding process, what is the input and what is the output?
What is the relation between binding and linking? I assume binding needs to be done first.
Thanks,
Bogdan.
With GNAT, there are two jobs which the binder performs: first, checking that all the necessary compilations have been done, so that the program’s closure is consistent, and second, arranging for elaboration to happen (these jobs are needed for any Ada build system, but they may be implemented differently).
When using gnatmake, the first of these jobs is usually superfluous, because gnatmake has already organised all the necessary compilations. It is possible to get this wrong (by, for example, moving a unit to a different library and not deleting its compilation products from the original place) but quite hard!
Elaboration is a feature of Ada that isn’t present in many other languages. There’s an explanation at gcc.gnu.org and other places, but for a simple example, given
with Foo;
package Bar is
   Int : Integer := Foo.Value;
   [...]
end Bar;

package Foo is
   function Value return Integer;
   [...]
end Foo;
we don’t know what Foo.Value is going to return at compile time, and we may not know until run time (what if it reads a value from the command line?), so Foo.Value must be in a fit state to be called before Bar’s initialisation happens.
Bar’s initialisation happens when Bar is elaborated, and likewise for Foo, so it’s gnatbind’s job to recognise this and arrange that Foo is elaborated before Bar.
It does this by emitting calls to the packages’ elaboration code in a function (usually called adainit), plus a main(), which is called by the operating system and which calls adainit and then the Ada main program, say program.adb.
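Conceptually (the identifiers below are illustrative, not the actual compiler-internal names you’ll find in the generated binder file), adainit boils down to something like:
procedure Adainit is
begin
   Foo'Elab_Spec;   --  elaborate Foo's spec
   Foo'Elab_Body;   --  elaborate Foo's body: Value is now safe to call
   Bar'Elab_Spec;   --  only now is Bar.Int := Foo.Value evaluated
end Adainit;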
gnatmake then calls gnatlink, which takes the gnatbind-generated code (Ada source in files called b-program.ad[sb], b__program.ad[sb] or b~program.ad[sb], depending on the vintage of the compiler), compiles it, and links it with the program’s closure to produce the final executable.
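If you want to see the individual steps that gnatmake drives, you can run them by hand; roughly (assuming the main program is in program.adb):
gcc -c program.adb        # compile, producing program.o and program.ali
gnatbind program.ali      # check consistency, generate the binder file
gnatlink program.ali      # compile the binder file and link the executable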
See the four points listed here: https://docs.adacore.com/gnat_ugn-docs/html/gnat_ugn/gnat_ugn/building_executable_programs_with_gnat.html#binding-with-gnatbind
You could think of it as a built-in make but without the recompilation: it ensures objects are consistent, generates a correct initialization order, compiles it, and passes everything to the linker.
As pointed out, in Ada the program entry point is not your main procedure, but one that performs a safe initialization and then calls your main procedure.
I've got a procedure within a SPARK module that calls the standard Ada.Text_IO.Put_Line.
During proving I get the following warning: no Global contract available for "Put_Line".
I already know how to add the respective data dependency contract to procedures and functions written by myself, but how do I add them to procedures/functions written by others, where I can't edit the source files?
I looked through sections 5.2 and 7.4 of the AdaCore SPARK 2014 user's guide but didn't find an example with a solution to my problem.
This means that the analyzer cannot "see" whether global variables might be affected when this function is called. It therefore assumes this call is not modifying anything (otherwise all other proofs could be refuted immediately). This is likely a valid assumption for your specific example, but it might not be valid on an embedded system, where a custom implementation of Put_Line might do anything.
There are two ways to convey the missing information: either the verifier examines the source code of the subprogram and tries to generate a Global contract itself, or the Global contract is specified explicitly; see SPARK RM 6.1.4 (http://docs.adacore.com/spark2014-docs/html/lrm/subprograms.html#global-aspects). A minimal sketch of an explicit contract follows.
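For illustration only (the procedure and variable names are made up, not anything from the RTS), an explicit contract on a subprogram of your own that updates some global state looks like this:
Trace_Count : Natural := 0;

procedure Trace (Msg : String) with
  Global => (In_Out => Trace_Count);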
In this case, the procedure you are calling is part of the run-time system (RTS), and therefore the source is not visible, and you probably cannot/should not change it.
What to do in practice?
Suppressing warnings is almost never a good idea, especially not when you are working on something safety-critical. Usually the code has to be changed until the warning goes away, or some justification process has to start.
If you are serious about the analysis results, I recommend not using such subprograms. If you really need output there, either write your own procedure that replaces the RTS subprogram, or ensure that the subprogram really has no side effects. This is further backed up by what Frédéric has linked: even if the callee has no side effects, you don't know whether it raises an exception for specific inputs (e.g., very long strings).
If you are not so serious about the results, then you can consider this specific one as a warning that you could live with.
Wrapper packages for use in development of SPARK applications may be found here:
https://github.com/joakim-strandberg/aida_2012
I think you just can't add SPARK contracts to code you don't own, especially code from the Ada standard library.
Regarding Text_IO, I found something in the reference manual that may be valuable to you.
EDIT
Another solution, compared to what Martin said, according to the book "Building High Integrity Applications with SPARK", is to create a wrapper package.
As SPARK requires you to deal with SPARK packages but allows you to depend on a SPARK spec with an Ada body, the solution is to build a SPARK package wrapping your Ada.Text_IO calls.
It might be tedious, as you will have to wrap possible exceptions, possibly define specific types and so on, but this way you'll be able to discharge VCs on your full SPARK package.
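A minimal sketch of such a wrapper (the package and state names are made up, and error handling is omitted): the spec is SPARK and carries the Global contract, while the body hides the Ada.Text_IO call from analysis.
--  spark_logger.ads  (SPARK spec)
package Spark_Logger with
  SPARK_Mode,
  Abstract_State => Output,
  Initializes    => Output
is
   procedure Put_Line (Item : String) with
     Global => (In_Out => Output);
end Spark_Logger;

--  spark_logger.adb  (plain Ada body, excluded from analysis)
with Ada.Text_IO;

package body Spark_Logger with SPARK_Mode => Off is
   procedure Put_Line (Item : String) is
   begin
      Ada.Text_IO.Put_Line (Item);
   end Put_Line;
end Spark_Logger;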
I could not find detailed documentation about the @async macro. From the docs about parallelism I understand that there is only one system thread used inside a Julia process and that explicit task switching happens with the help of the yieldto function (correct me if I am wrong about this).
For me it is difficult to understand, just by looking at the code, when exactly these task switches happen, and knowing when they happen seems crucial.
As I understand it, a yieldto somewhere in the code (or in some function called by the code) needs to be there to ensure that the system doesn't get stuck in a single task.
For example, when there is a read operation, inside the read there is probably a wait call, and in the implementation of wait there is probably a yieldto call. I thought that without the yieldto call the code would get stuck in one task; however, running the following example seems to prove this hypothesis wrong.
@async begin # Task A
while true
println("A")
end
end
while true # Task B
println("B")
end
This code produces the following output
BA
BA
BA
...
It is very unclear to me where the task switching happens inside the task created by the @async macro in the code above.
How can I tell, by looking at some code, where the points are at which a task switch happens?
The task switch happens inside the call to println("A"), which at some point calls write(STDOUT, "A".data). Because isa(STDOUT, Base.AsyncStream) and there is no method that is more specialized, this resolves to:
write{T}(s::AsyncStream,a::Array{T}) at stream.jl:782
If you look at this method, you will notice that it calls stream_wait(ct) on the current task ct, which in turn calls wait().
(Also note that println is not atomic, because there is a potential wait between writing the arguments and the newline.)
You could of course determine when stuff like that happens by looking at all the code involved. But I don't see why you would need to know this exactly, because, when working with parallelism, you should not depend on processes not switching context anyway. If you depend on a certain execution order, synchronize explicitly.
(You already kind of noted this in your question, but let me restate it here: As a rule of thumb, when using green threads, you can expect potential context switches when doing IO, because blocking for IO is a textbook example of why green threads are useful in the first place.)
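As an illustration of "synchronize explicitly" (written in current Julia syntax, whereas the answer above refers to the older STDOUT/AsyncStream internals), here is a minimal sketch that forces a deterministic order with a Channel instead of relying on where the scheduler happens to switch:
ch = Channel{Int}(0)        # unbuffered: put! blocks until a matching take!

t = @async begin            # producer task
    for i in 1:3
        put!(ch, i)         # guaranteed switch point: blocks until consumed
    end
    close(ch)
end

for i in ch                 # consumer: receives 1, 2, 3 in order
    println("got ", i)
end
wait(t)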
I've got a legacy TP5 program. It compiles and runs OK using TP7. I'd like to capture and log some of the write/writeln statements. I can do a global search-and-replace for write and writeln, so I don't mind code changes like that. It does use some formatted output:
write(r:4:2)
so I'd like to keep that.
I don't know any way to capture write/writeln other than writing to a file, then reading each line back and writing it again :~( But it occurred to me that this is very much like writing to a stream (introduced in TP 5.5), then streaming copies to multiple outputs.
Has anyone done this before? Is it possible? Is there another way?
AFAIK it is possible, and it is commonly done in FPC and Delphi, TP's successors. The only trouble is that the TextRec type isn't exported by TP, so you have to copy its declaration from the sources into your own code.
The TextRec has a bunch of procedure variables (like function pointers in C) that you can set to your own functions to process I/O. Setting these variables is what Assign() does.
The problem is finding the room to store the state (e.g. the pointer to the stream), though. IIRC the TP TextRec is tighter than Delphi's.
Anyway, search for a unit StreamIO by Peter Below. This is an FPC/Delphi unit that you will have to adapt, but at least it demonstrates the principle. It would be better to simply migrate to something newer anyway. Maybe SWAG has a TP equivalent too.
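For what it's worth, a rough TP-style sketch of the idea (all names are illustrative, and the TextRec layout below is the one from the text-device-driver documentation; verify it against your compiler's RTL sources before use): a Text variable whose driver writes each character both to the screen and to a log file, so the search-and-replace becomes write(...) -> write(Tee, ...).
{ Sketch only: check the TextRec layout against your TP version. }
const
  fmClosed = $D7B0;
  fmOutput = $D7B2;

type
  TextBuf = array[0..127] of Char;
  TextRec = record                     { mirror of the RTL's private record }
    Handle, Mode, BufSize, Priv, BufPos, BufEnd: Word;
    BufPtr: ^TextBuf;
    OpenFunc, InOutFunc, FlushFunc, CloseFunc: Pointer;
    UserData: array[1..16] of Byte;
    Name: array[0..79] of Char;
    Buffer: TextBuf;
  end;

var
  Tee, LogFile: Text;

{$F+}
function TeeInOut(var F: TextRec): Integer;
var
  i: Word;
begin
  if F.BufPos > 0 then
    for i := 0 to F.BufPos - 1 do
    begin
      Write(Output, F.BufPtr^[i]);     { original destination }
      Write(LogFile, F.BufPtr^[i]);    { the capture log }
    end;
  F.BufPos := 0;
  TeeInOut := 0;
end;

function TeeClose(var F: TextRec): Integer;
begin
  TeeClose := 0;
end;

function TeeOpen(var F: TextRec): Integer;
begin
  F.Mode := fmOutput;                  { output-only device }
  F.InOutFunc := @TeeInOut;
  F.FlushFunc := @TeeInOut;
  F.CloseFunc := @TeeClose;
  TeeOpen := 0;
end;
{$F-}

procedure AssignTee(var T: Text);
begin
  with TextRec(T) do
  begin
    Mode := fmClosed;
    BufSize := SizeOf(Buffer);
    BufPtr := @Buffer;
    OpenFunc := @TeeOpen;
    Name[0] := #0;
  end;
end;

{ usage: Assign(LogFile, 'run.log'); Rewrite(LogFile);
         AssignTee(Tee); Rewrite(Tee); Writeln(Tee, r:4:2); }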
Greetings to all the compiler designers here on Stack Overflow.
I am currently working on a project which focuses on developing a new scripting language for use with high-performance computing. The source code is first compiled into a byte code representation. The byte code is then loaded by the runtime, which performs aggressive (and possibly time-consuming) optimizations on it (which go much further than what even most "ahead-of-time" compilers do; after all, that's the whole point of the project). Keep in mind the result of this process is still byte code.
The byte code is then run on a virtual machine. Currently, this virtual machine is implemented using a straight-forward jump table and a message pump. The virtual machine runs over the byte code with a pointer, loads the instruction under the pointer, looks up an instruction handler in the jump table and jumps into it. The instruction handler carries out the appropriate actions and finally returns control to the message loop. The virtual machine's instruction pointer is incremented and the whole process starts over again. The performance I am able to achieve with this approach is actually quite amazing. Of course, the code of the actual instruction handlers is again fine-tuned by hand.
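To make the dispatch cost concrete, here is a toy sketch (not my actual VM, just the kind of jump-table dispatch I mean) in C, using GCC/Clang's labels-as-values extension. Every instruction pays for a byte load, a table lookup and an indirect jump before its handler does any real work, and that per-instruction overhead is exactly what a JIT would remove.
#include <stdint.h>
#include <stdio.h>

enum { OP_INC, OP_ADD3, OP_HALT };

int main(void) {
    const uint8_t code[] = { OP_INC, OP_ADD3, OP_INC, OP_HALT };
    /* jump table: one label address per opcode */
    static void *table[] = { &&op_inc, &&op_add3, &&op_halt };
    const uint8_t *ip = code;
    long acc = 0;

    goto *table[*ip];                 /* dispatch the first instruction */

op_inc:
    acc += 1;
    goto *table[*++ip];               /* per-instruction dispatch overhead */
op_add3:
    acc += 3;
    goto *table[*++ip];
op_halt:
    printf("acc = %ld\n", acc);       /* prints acc = 5 */
    return 0;
}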
Now most "professional" run-time environments (like Java, .NET, etc.) use just-in-time compilation to translate the byte code into native code before execution. A VM using a JIT usually has much better performance than a byte code interpreter. Now the question is: since all an interpreter basically does is load an instruction and look up a jump target in a jump table (remember, the instruction handler itself is statically compiled into the interpreter, so it is already native code), will the use of just-in-time compilation result in a performance gain, or will it actually degrade performance? I cannot really imagine that the interpreter's jump table degrades performance so much that it makes up for the time spent compiling the code with a JITer. I understand that a JITer can perform additional optimization on the code, but in my case very aggressive optimization is already performed at the byte code level prior to execution. Do you think I could gain more speed by replacing the interpreter with a JIT compiler? If so, why?
I understand that implementing both approaches and benchmarking will provide the most accurate answer to this question, but it might not be worth the time if there is a clear-cut answer.
Thanks.
The answer lies in the ratio of single-byte-code-instruction complexity to jump table overheads. If you're modelling high level operations like large matrix multiplications, then a little overhead will be insignificant. If you're incrementing a single integer, then of course that's being dramatically impacted by the jump table. Overall, the balance will depend upon the nature of the more time-critical tasks the language is used for. If it's meant to be a general purpose language, then it's more useful for everything to have minimal overhead as you don't know what will be used in a tight loop. To quickly quantify the potential improvement, simply benchmark some nested loops doing some simple operations (but ones that can't be optimised away) versus an equivalent C or C++ program.
When you use an interpreter, the code cache in your processor caches the interpreter code, not the byte code (which may be cached in the data cache). Since code caches are 2 to 3 times faster than data caches, IIRC, you may see a performance boost if you JIT compile. Also, the native, real code you are executing is probably position-independent code (PIC), something which can be avoided for JITted code.
Everything else depends on how optimized the byte code is, IMHO.
JIT can theoretically optimize better, since it has information not available at compile time (especially about typical runtime behavior). So it can, for example, do better branch prediction, unroll loops as needed, etc.
I am sure your jump-table approach is OK, but I still think it would perform rather poorly compared to straight C code, don't you think?