In Thinking in PostScript (PDF), Chapter 5, exercise 3 (pp. 64-65) asks the reader to refactor this code so that it does not store any dictionary entries:
36 750 moveto /Times-Roman 24 selectfont
% works like “show”, leaving current point at proper location
/ushow
% linethickness lineposition (words) ushow -
{ %def
LOCAL begin
/text exch def
/linepos exch def
/linethick exch def
gsave
0 linepos rmoveto
text stringwidth rlineto
linethick setlinewidth stroke
grestore
text show
end
} dup 0 4 dict put def
0.5 -4 (test underlined text) ushow
My question is about LOCAL. Ghostscript runs this code without error, and yet LOCAL is not:
Defined in the exercise
Documented in the PostScript Language Reference, third edition
Documented in PostScript Language Tutorial and Cookbook
What, in PostScript, is LOCAL?
It is not defined as anything. The code is a bit sneaky: dup 0 4 dict put replaces element 0 of the procedure body (the slot holding LOCAL) with the result of 4 dict. The body (the stuff between { and }) is duplicated with dup mainly because put returns nothing, so a second reference is needed for def. Since both references point to the same array, you're left with
/ushow {-dict- begin ...rest of the executable...} def
This is all valid because LOCAL is never actually executed (it is replaced before the procedure is ever run). It would not matter what you used in place of LOCAL.
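A quick way to verify this in Ghostscript (a cut-down body is enough for the check): after the def, element 0 of the stored procedure is a dictionary, not the name LOCAL.
/ushow { LOCAL begin /text exch def text show end } dup 0 4 dict put def
/ushow load 0 get type ==   % prints dicttype (the dict that replaced LOCAL)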
As joojaa correctly explains, LOCAL is not defined as anything, which is OK since it is replaced before the procedure executes. It parses as an executable name during construction of the procedure body (an array), and its use here is just to allocate a slot in that array. It could actually be any type, and I've often seen (and written) { 0 begin ... } for the same purpose. Using a name lets you give more semantic information to a human reader of the code. I've also seen it written { DICT begin ... }. Here in my matrix functions, I called it STATICDICT, apparently.
There is a convention of using all upper-case for meta-syntactical tokens like this. It is a nametype token, but meta-syntactically it refers to a dicttype object to be filled in later. There's no need (nor even any mechanism) to declare what you're doing for the benefit of the interpreter, but there is much to be gained by preferring DICT over 0. Again, since it will be completely replaced, you could also use a literal name /LOCAL to try to, I dunno, spare the next newbie who reads your code the wild-goose chase of looking for where LOCAL is defined? To the same end, I've also written simply DUMMY for a token to be filled in later. I suppose the choice among these terms is a matter of style or audience or some other intangible quality. sigh... or just a matter of context.
There is another style that works well for making dynamic substitutions in procedure bodies. By placing the dictionary on the dict stack and naming it (within itself, so it's a closed namespace), we can refer to it with an //immediate name:
4 dict begin
/mydict currentdict def
/proc {
//mydict begin
...
end
}
and then remove the dictionary before defining.
end def
This defines the procedure normally (in the outer-level dictionary, unnamed here and presumably userdict), but with the private dictionary embedded by name, since it was available under that name while the procedure body was being scanned.
This can be extended to more procedures sharing the same private dictionary by juggling the dict off of the dict stack (and back on) for each definition:
/enddefbegin { currentdict 3 1 roll end def begin } def
4 dict begin
/mydict currentdict def
/proc1 {
//mydict begin
...
end
} enddefbegin
/proc2 {
//mydict begin
...
end
} enddefbegin
end
The final enddefbegin end can of course be simplified to end def.
One caveat: the dictionary created this way contains a reference to itself. Do not try to print it with Ghostscript's === operator!
Pg 133 of the "Blue Book" has a slightly easier example of the same technique:
/sampleproc
{ 0 begin
/localvariable 6 def
end
} def
/sampleproc load 0 1 dict put
Here the procedure is defined before it is modified, which is a little easier to wrap your mind around. In the original post, the trickiest part for me was the dup, because I didn't realize that what's on the stack is not the array itself but a reference to it (I was thinking copy-by-value; it behaves as copy-by-reference). So the put in the original code modifies the array through the first reference (which is consumed from the stack), and the second reference is what def uses to define the procedure. It was a newbie mistake, but maybe other newbies can learn from it:
Stack Progression (operand stack to the left of the |, tokens still to be scanned to the right):
1. /ushow --array-- --array-- 0 | 4 dict put def   % after /ushow { ... } dup 0
2. /ushow --array-- --array-- 0 4 | dict put def
3. /ushow --array-- --array-- 0 --dict-- | put def   % dict created an empty dictionary
4. /ushow --array-- | def   % put stored the dict at index 0 of the array, consuming one array reference, the index 0, and the dict
5. | % def consumed /ushow and the remaining array reference, defining the procedure
Sorry for what is probably incorrect notation, I just stared at this for a while, and figured I might be able to save someone a little stare-time. Please let me know if there are any errors or misleading statements that I should fix.
I was assigned this task as homework. I have a file which contains lines of text of varying lengths. The program is supposed to write the data onto the screen in exactly the same order in which it is written in the file, yet it fails to do so. To achieve this I tried reading only one character per iteration, so as to detect newline characters. What am I doing wrong?
WITH Ada.Text_IO;
WITH Ada.Characters.Latin_1;
USE Ada.Text_IO;

PROCEDURE ASCII_Artwork IS
   File : File_Type;
   C    : Character;
BEGIN
   Open(File, In_File, "Winnie_The_Pooh.txt");
   WHILE NOT End_Of_File(File) LOOP
      Get(File, C);
      IF (C = Ada.Characters.Latin_1.LF) THEN
         Put_Line(" ");
      ELSE
         Put(C);
      END IF;
   END LOOP;
   Close(File);
END ASCII_Artwork;
For each file, the Ada runtime maintains a fictitious "cursor". This is not the typical file-position cursor (an index), but one that indicates the position on a page, line, etc. (see also RM A.10 (7)). This is something of a holdover from the early versions of Ada.
Get stems from this same era and is expected to update the location of this cursor when particular control characters are read (e.g. an end-of-line mark). If Get reads such a control character, it will only use it to update the cursor (internally) and then continue to read the next character (see also RM A.10.7 (3)). You'll therefore never detect an end-of-line mark when using Get.
This behavior, however, has an uncomfortable consequence: if a file ends with a sequence of control characters, then Get will keep reading those characters, hit the end of the file, and raise an End_Error exception.
You can, of course, catch and handle this exception, but such a construct is dubious: a sequence of control characters at the end of a file is not really an abnormal case, and hence hardly worth an exception. As a programmer, however, you cannot change this behavior: it's defined by the language, and the language will not be changed, because it has been decided to keep Ada (highly) backwards compatible (which in itself is understandable given its field of application).
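For illustration, a minimal sketch of that (dubious) construct; the procedure name is made up, and the handler simply swallows the End_Error raised by trailing control characters:
with Ada.Text_IO; use Ada.Text_IO;
procedure Get_With_Handler is
   File : File_Type;
   C    : Character;
begin
   Open (File, In_File, "Winnie_The_Pooh.txt");
   begin
      while not End_Of_File (File) loop
         Get (File, C);   --  skips line terminators and may run into the file terminator
         Put (C);
      end loop;
   exception
      when End_Error =>
         null;   --  only line/page terminators were left before the end of the file
   end;
   Close (File);
end Get_With_Handler;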
Hence, in your case, if you want to stick to a character-by-character processing approach, I would suggest moving away from Get and instead using (for example) streams to perform I/O, as in the example below.
main.adb
with Ada.Text_IO;              use Ada.Text_IO;
with Ada.Text_IO.Text_Streams; use Ada.Text_IO.Text_Streams;

procedure ASCII_Artwork is
   File   : File_Type;
   Input  : Stream_Access;
   Output : Stream_Access;
   C      : Character;
begin
   Open (File, In_File, "Winnie_The_Pooh.txt");
   Input  := Stream (File);
   Output := Stream (Standard_Output);
   while not End_Of_File (File) loop
      Character'Read (Input, C);
      Character'Write (Output, C);
   end loop;
   Close (File);
end ASCII_Artwork;
Output is as expected (i.e. the content of the file at ascii-art.de).
NOTE: Check the source code of the GNAT runtime to actually see how Get works internally (focus on the loop at the end).
As explained by DeeDee, text inputs are buffered line-wise in Ada. The idea is to be able to read two integers on the same line. For consistency's sake (the designers of Ada are picky about that...), Get(File, C) does the same. That is not practical in your case. Fortunately, Ada 95 introduced Get_Immediate to solve precisely that issue.
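A minimal sketch of the Get_Immediate variant (assuming GNAT's behaviour, where line terminators are delivered as LF characters; the RM leaves some of this implementation-defined):
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Characters.Latin_1;
procedure ASCII_Artwork is
   File : File_Type;
   C    : Character;
begin
   Open (File, In_File, "Winnie_The_Pooh.txt");
   while not End_Of_File (File) loop
      Get_Immediate (File, C);   --  unlike Get, this also returns control characters
      if C = Ada.Characters.Latin_1.LF then
         New_Line;
      else
         Put (C);
      end if;
   end loop;
   Close (File);
end ASCII_Artwork;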
Otherwise, as suggested by Frédéric, you could use the Get_Line function to absorb Winnie_The_Pooh.txt line by line seamlessly. By the way, Get_Line handles the different end-of-line conventions automatically.
Line terminators in Ada.Text_IO are a concept, not a character or sequence of characters in the file. (Although most commonly used file systems implement them as characters or sequences of characters in the file, there exist file systems that do not.) Line terminators must therefore be manipulated using the operations in the package. For reading, End_Of_Line checks to see if the cursor is at a line terminator, Skip_Line skips the next line terminator, and Get_Line may skip a line terminator. For writing, New_Line and Put_Line write line terminators.
For your problem, the canonical solution is to use the Get_Line function to read lines, and Put_Line to output the lines read.
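For completeness, a minimal sketch of that approach (using the Get_Line function that returns a String, available since Ada 2005):
with Ada.Text_IO; use Ada.Text_IO;
procedure ASCII_Artwork is
   File : File_Type;
begin
   Open (File, In_File, "Winnie_The_Pooh.txt");
   while not End_Of_File (File) loop
      Put_Line (Get_Line (File));   --  read one line, write it back with a line terminator
   end loop;
   Close (File);
end ASCII_Artwork;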
Generally speaking, Functional Programming prides itself on being clearer and more concise. The fact that you don't have side effects or state management makes it easier for developers to reason about their code and guarantee its behaviour. How far does this claim reach?
I'm still learning Elixir, but consider this code from Coding Gnome:
def make_move(game = %{ game_state: state }, _guess)
    when state in [:won, :lost] do
  ...
end
def make_move(game = %{ game_state: state }, _guess)
    when state in [:pending] do
  ...
end
def make_move(game, guess) do
  ...
end
One could write it without any fanciness in JavaScript as:
const makeMove = (game, guess) => {
  switch (game.state) {
    case 'won':
      return makeMoveFinalState();
    case 'lost':
      return makeMoveFinalState();
    case 'pending':
      return makeMovePending();
  }
}
Disregarding all the type/struct safety provided by the Elixir compiler, an Elixir programmer would have to read the whole file to make sure there isn't a function with a different signature hijacking another, right? I feel that this increases the overhead for a programmer, because it's yet another thing you have to think about even before looking at the implementation of the function.
Besides that, it feels to me like misdirection, because you can't be 100% sure that a case ends up in that general make_move function unless you already know all the other clauses and their signature types, whereas with a conditional you have a clearer flow of control.
Could this be rewritten in a better way? At what point do these abstractions start to weigh on the programmer?
I think this boils down mostly to preference, and simple exercises with simple conditions usually don't show the range of "clarity" pattern matching can provide. But I'm biased because I prefer pattern matching. Anyway, I'm gonna bite.
In this case the switch could be said to be more readable and straightforward, but note that there's nothing preventing you from writing a very similar thing in Elixir (or Erlang):
def make_move(game = %{ game_state: state }, _guess) do
  case state do
    state when state in [:won, :lost] -> # do things
    :pending -> # do things
    _else -> # do other things
  end
end
Regarding the placement of different function clauses for the same function name, Elixir will emit a warning if they're not grouped together, so it ends up being your responsibility to write them together and in the correct order (it will also warn you if any of the clauses is by definition unreachable, like placing a catch-all clause before a more specific one).
But I think that if, for instance, you add a slight change to the matching requirements for the pending state, then in my view it starts to become clearer to write it the Erlang/Elixir way. Say that when the state is pending there are two different execution paths, depending on whether it's your turn or not.
Now you could write 2 specific branches for that with just function signatures:
def make_move(game = %{ game_state: :pending, your_turn: true }, _guess) do
  # do stuff
end
def make_move(game = %{ game_state: :pending }, _guess) do
  # do stuff
end
To do that in JS you would need either another switch or another if. If you have more complex matching patterns it easily becomes harder to follow, while in Elixir I think the paths remain quite clear.
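For comparison, a rough sketch of what that looks like in JS once the extra condition appears (makeMovePendingYourTurn is a made-up handler name):
const makeMove = (game, guess) => {
  switch (game.state) {
    case 'won':
    case 'lost':
      return makeMoveFinalState();
    case 'pending':
      // the extra condition has to nest inside the case
      if (game.yourTurn) {
        return makeMovePendingYourTurn();
      }
      return makeMovePending();
  }
}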
If the other conditions were thornier, say the state is :pending and the stack key, which holds a list, is empty, then again matching that becomes:
def make_move(game = %{ game_state: :pending, your_turn: true, stack: [] }, _guess) do
Or if there's another branch that depends on whether the first item in the stack is something specific:
def make_move(game = %{ game_state: :pending, your_turn: true, player_id: your_id, stack: [%AnAlmostTypedStruct{player: your_id} | _] }, _guess) do
Here Erlang/Elixir will only match this clause if your_id is the same in both places where it's used in the pattern.
Also, you say "without fanciness" in JS, but multiple function heads/arities/pattern matching is nothing fancy in Elixir/Erlang; it's as if the language supports switch/case-like dispatch at a much lower level (at the module compilation level?).
I for one would love to have effective pattern matching & different function clauses (not destructuring only) in JS.
Is there a way to enforce that a dictionary is constant?
I have a function which reads parameters from a file (ignoring comments) and stores them in a dict:
function getparameters(filename::AbstractString)
    f = open(filename, "r")
    dict = Dict{AbstractString, AbstractString}()
    for ln in eachline(f)
        m = match(r"^\s*(?P<key>\w+)\s+(?P<value>[\w+-.]+)", ln)
        if m != nothing
            dict[m[:key]] = m[:value]
        end
    end
    close(f)
    return dict
end
This works just fine. Since I have a lot of parameters, which I will end up using in different places, my idea was to make this dict global. And as we all know, global variables are not that great, so I wanted to ensure that the dict and its members are immutable.
Is this a good approach? How do I do it? Do I have to do it?
Bonus answerable stuff :)
Is my code even OK? (It is the first thing I did with Julia, and coming from C/C++ and Python I have a tendency to do things differently.) Do I need to check whether the file is actually open? Is my reading of the file "Julia"-like? I could also readall the file and then use eachmatch. I don't see the "right way to do it" (like in Python).
Why not use an ImmutableDict? It's defined in Base but not exported. You use one as follows:
julia> id = Base.ImmutableDict("key1"=>1)
Base.ImmutableDict{String,Int64} with 1 entry:
"key1" => 1
julia> id["key1"]
1
julia> id["key1"] = 2
ERROR: MethodError: no method matching setindex!(::Base.ImmutableDict{String,Int64}, ::Int64, ::String)
in eval(::Module, ::Any) at .\boot.jl:234
in macro expansion at .\REPL.jl:92 [inlined]
in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at .\event.jl:46
julia> id2 = Base.ImmutableDict(id,"key2"=>2)
Base.ImmutableDict{String,Int64} with 2 entries:
"key2" => 2
"key1" => 1
julia> id.value
1
You may want to define a constructor which takes in an array of pairs (or keys and values) and uses that algorithm to define the whole dict (that's the only way to do so, see the note at the bottom).
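For example, a small helper along those lines might look like this (the name immutable_dict is made up; it just chains one insertion per pair, as in the REPL session above):
# Hypothetical helper, not part of Base: fold a vector of pairs into an ImmutableDict.
function immutable_dict(kvs)
    d = Base.ImmutableDict(kvs[1])
    for kv in kvs[2:end]
        d = Base.ImmutableDict(d, kv)
    end
    return d
end

id = immutable_dict(["key1" => 1, "key2" => 2])
id["key2"]   # 2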
Just an added note: the actual internal representation is that each dictionary contains only one key-value pair plus another dictionary. The get method just walks through the chain of dictionaries checking for the right key. The reason for this is that arrays are mutable: if you did a naive construction of an immutable type with a mutable field, the field would still be mutable, and thus while id["key1"] = 2 wouldn't work, id.keys[1] = 2 would. They get around this by not using a mutable type for holding the values (thus holding only single values) and also holding an immutable dict. If you wanted to make this work directly on arrays, you could use something like ImmutableArrays.jl, but I don't think you'd get a performance advantage, because you'd still have to loop through the array when checking for a key...
First off, I am new to Julia (I have been using/learning it for only two weeks), so do not put any confidence in what I am going to say unless it is validated by others.
The dictionary data structure Dict is defined here:
julia/base/dict.jl
There is also a data structure called ImmutableDict in that file. However, as const variables aren't actually const, why would immutable dictionaries be immutable?
The comment states:
ImmutableDict is a Dictionary implemented as an immutable linked list,
which is optimal for small dictionaries that are constructed over many individual insertions
Note that it is not possible to remove a value, although it can be partially overridden and hidden
by inserting a new value with the same key
So let us call the dictionary you want to define UnmodifiableDict, to avoid confusion. Such an object would probably have
a data structure similar to Dict's,
a constructor that takes a Dict as input to fill its data structure,
a specialization (a new dispatch?) of the method setindex! (the one called by the operator [] =)
in order to forbid modification of the data structure. The same should go for all other functions that end with ! and hence modify the data.
As far as I understand, it is only possible to subtype abstract types. Therefore you can't make UnmodifiableDict a subtype of Dict and only redefine functions such as setindex!.
Unfortunately this is a necessary restriction for having run-time types rather than compile-time types. You can't have such good performance without a few restrictions.
Bottom line:
The only solution I see is to copy-paste the code of the type Dict and its functions, replace Dict with UnmodifiableDict everywhere, and modify the functions that end with ! to raise an exception when called.
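A lighter-weight sketch of the same idea uses wrapping (composition) instead of copy-pasting Dict's code; note that the wrapped Dict is still reachable, and mutable, through the data field:
# Sketch of an UnmodifiableDict that forwards reads and rejects writes.
# Uses post-0.6 `struct` syntax; on older Julia this would be `immutable`.
struct UnmodifiableDict{K,V}
    data::Dict{K,V}
end

Base.getindex(d::UnmodifiableDict, k)  = d.data[k]
Base.haskey(d::UnmodifiableDict, k)    = haskey(d.data, k)
Base.keys(d::UnmodifiableDict)         = keys(d.data)
Base.length(d::UnmodifiableDict)       = length(d.data)
Base.setindex!(d::UnmodifiableDict, v, k) = error("UnmodifiableDict is read-only")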
You may also want to have a look at these threads:
https://groups.google.com/forum/#!topic/julia-users/n-lqjybIO_w
https://github.com/JuliaLang/julia/issues/1974
REVISION
Thanks to Chris Rackauckas for pointing out the error in my earlier response. I'll leave it below as an illustration of what doesn't work. But, Chris is right, the const declaration doesn't actually seem to improve performance when you feed the dictionary into the function. Thus, see Chris' answer for the best resolution to this issue:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];

function test(D)
    for jdx = 1:1000
        # D[2] = 2
        for idx = 0.0:5:3600
            a = D[idx]
        end
    end
end

## Times given after an initial run to allow for compiling
@time test(D1); # 0.017789 seconds (4 allocations: 160 bytes)
@time test(D2); # 0.015075 seconds (4 allocations: 160 bytes)
Old Response
If you want your dictionary to be a constant, you can use:
const MyDict = getparameters( .. )
Update: Keep in mind, though, that in base Julia, unlike some other languages, it's not that you cannot redefine constants; it's just that you get a warning when doing so.
julia> const a = 2
2
julia> a = 3
WARNING: redefining constant a
3
julia> a
3
It is odd that you don't get the constant redefinition warning when adding a new key-val pair to the dictionary. But, you still see the performance boost from declaring it as a constant:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test1()
    for jdx = 1:1000
        for idx = 0.0:5:3600
            a = D1[idx]
        end
    end
end

function test2()
    for jdx = 1:1000
        for idx = 0.0:5:3600
            a = D2[idx]
        end
    end
end

## Times given after an initial run to allow for compiling
@time test1(); # 0.049204 seconds (1.44 M allocations: 22.003 MB, 5.64% gc time)
@time test2(); # 0.013657 seconds (4 allocations: 160 bytes)
To add to the existing answers, if you like immutability and would like to get performant (but still persistent) operations which change and extend the dictionary, check out FunctionalCollections.jl's PersistentHashMap type.
If you want to maximize performance and take maximal advantage of immutability, and you don't plan on doing any such modifying operations on the dictionary whatsoever, consider implementing a dictionary based on a perfect hash function. In fact, if your dictionary is a compile-time constant, these can even be computed ahead of time (using metaprogramming) and precompiled.
When generating a version of a function that was not explicitly generated, @ngenerate runs
eval(quote
    local _F_
    $localfunc # Definition of _F_ for the requested value of N
    _F_
end)
Since eval runs in the scope of the current module, not the function, I wonder what the effect of local is in this context. As far as I know, the language documentation only mentions the use of local inside function definitions.
To give some background on why this question arose: I frequently need to code loops of the form
function foo(n::Int)
    s::Int = 0
    for i in 1:1000000000
        for j in 1:n
            s += 1
        end
    end
    return s
end
where n <= 10 (of course, in my actual code the loops are such that they cannot just be reduced to O(1)). Because this code is very simple for the compiler but demanding at runtime, it turns out to be beneficial to simply recompile the loops with the required value of n each time foo is called.
function clever_foo(n::Int)
    eval(quote
        function clever_foo_impl()
            s::Int = 0
            for i in 1:1000000000
                s += $(Expr(:call, :+, [1 for j in 1:n]...))
            end
            return s
        end
    end)
    return clever_foo_impl()
end
However, I am not sure whether I am doing this the right way.
It's to prevent _F_ from being visible in the global method cache.
If you'll call clever_foo with the same n repeatedly, you can do even better by saving the compiled function in a Dict. That way you don't have to recompile it each time.
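A minimal sketch of that caching idea (the names are made up; on Julia 0.6 and later you would also need Base.invokelatest to call a function that was eval'ed in the same call):
# Cache the generated implementations keyed by n, so each loop count is
# eval'ed and compiled only once.
const CLEVER_CACHE = Dict{Int,Function}()

function cached_clever_foo(n::Int)
    f = get!(CLEVER_CACHE, n) do
        eval(quote
            function $(gensym(:clever_foo_impl))()
                s::Int = 0
                for i in 1:1000000000
                    s += $(Expr(:call, :+, [1 for j in 1:n]...))
                end
                return s
            end
        end)
    end
    return f()   # Base.invokelatest(f) on newer Julia versions
end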
According to the PLRM, the order in which forall enumerates the entries of a dict is arbitrary:
(p. 597) forall pushes a key and a value on the operand stack and executes proc for each key-value pair in the dictionary
...
(p. 597) The order in which forall enumerates the entries in the dictionary is arbitrary. New entries put in the dictionary during the execution of proc may or may not be included in the enumeration. Existing entries removed from the dictionary by proc will not be encountered later in the enumeration.
Now I was executing some code:
/d 5 dict def
d /abc 123 put
d { } forall
My output (operand stack) is:
--------top-
/abc
123
-----bottom-
The output of Ghostscript and the PLRM (operand stack) is:
--------top-
123
/abc
-----bottom-
Does it really not matter in what order you process the key-value pairs of the dict?
On the stack, do you first need to push the value and then the key, or do you need to push the key first? (The PLRM only talks about "a key and a value", but doesn't tell you anything about the order.)
Thanks in advance
It would probably help if you quoted the page number when you quote sections from the PLRM; it's hard to see where you are getting this from.
When executing forall, the order in which it enumerates the dictionary's pairs is arbitrary; you have no influence over it. However, forall always pushes the key and then the value. Even though this is only implied in the text you (didn't quite) quote, you can see from the example for the forall operator that this is the case.
When you say 'my output', do you mean you are writing your own PostScript interpreter? If so, then your output is incorrect: when pushing a key/value pair, the key is pushed first.
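A quick sanity check in Ghostscript: exch swaps the pair that forall pushed, so the key prints first; without the exch you would see 123 first, because the value ends up on top.
/d 1 dict def
d /abc 123 put
d { exch == == } forall
% prints:
% /abc
% 123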