To learn some Prolog (I'm using GNU Prolog) and grok its parsing abilities, I am starting by writing a Lisp (or S-expression, to be exact) tokenizer, which, given a list of characters like ['(', 'f', 'o', 'o', ')'], should produce ['(', 'foo', ')']. It's not working as expected, which is why I'm here! I thought my reasoning came through in my pseudocode:
tokenize([current | rest], buffer, tokens):
    if current is '(' or ')',
        Tokenize the rest,
        And the output will be the current token buffer,
        Plus the parenthesis and the rest.
    if current is ' ',
        Tokenize the rest with a clean buffer,
        And the output will be the buffer plus the rest.
    if the tail is empty,
        The output will be a one-element list containing the buffer.
    otherwise,
        Add the current character to the buffer,
        And the output will be the rest tokenized, with a bigger buffer.
I translated that to Prolog like this:
tokenize([Char | Chars], Buffer, Tokens) :-
    ((Char = '(' ; Char = ')') ->
        tokenize(Chars, '', Tail_Tokens),
        Tokens is [Buffer, Char | Tail_Tokens];
    Char = ' ' ->
        tokenize(Chars, '', Tail_Tokens),
        Tokens is [Buffer | Tail_Tokens];
    Chars = [] -> Tokens is [Buffer];
    atom_concat(Buffer, Char, New_Buffer),
    tokenize(Chars, New_Buffer, Tokens)).

print_tokens([]) :- write('.').
print_tokens([T | N]) :- write(T), write(', '), print_tokens(N).

main :-
    % tokenize(['(', 'f', 'o', 'o', '(', 'b', 'a', 'r', ')', 'b', 'a', 'z', ')'], '', Tokens),
    tokenize(['(', 'f', 'o', 'o', ')'], '', Tokens),
    print_tokens(Tokens).
When running the result, below, like this: gprolog --consult-file lisp_parser.pl it just tells me no. I traced main, and it gave me the stack trace below. I do not understand why tokenize fails for the empty case. I see that the buffer is empty since it was cleared with the previous ')', but even if Tokens is empty at that point in time, wouldn't Tokens accumulate a larger result recursively? Can someone who is good with Prolog give me a few tips here?
| ?- main.
no
| ?- trace.
The debugger will first creep -- showing everything (trace)
(1 ms) yes
{trace}
| ?- main.
1 1 Call: main ?
2 2 Call: tokenize(['(',f,o,o,')'],'',_353) ?
3 3 Call: tokenize([f,o,o,')'],'',_378) ?
4 4 Call: atom_concat('',f,_403) ?
4 4 Exit: atom_concat('',f,f) ?
5 4 Call: tokenize([o,o,')'],f,_429) ?
6 5 Call: atom_concat(f,o,_454) ?
6 5 Exit: atom_concat(f,o,fo) ?
7 5 Call: tokenize([o,')'],fo,_480) ?
8 6 Call: atom_concat(fo,o,_505) ?
8 6 Exit: atom_concat(fo,o,foo) ?
9 6 Call: tokenize([')'],foo,_531) ?
10 7 Call: tokenize([],'',_556) ?
10 7 Fail: tokenize([],'',_544) ?
9 6 Fail: tokenize([')'],foo,_519) ?
7 5 Fail: tokenize([o,')'],fo,_468) ?
5 4 Fail: tokenize([o,o,')'],f,_417) ?
3 3 Fail: tokenize([f,o,o,')'],'',_366) ?
2 2 Fail: tokenize(['(',f,o,o,')'],'',_341) ?
1 1 Fail: main ?
(1 ms) no
{trace}
| ?-
How about this? I think this is what you want to do, but let's use Definite Clause Grammars (DCGs), which are just Horn clauses with :- replaced by --> and two elided arguments holding the input character list and the remaining character list. An example DCG rule:
rule(X) --> [c], another_rule(X), {predicate(X)}.
The list-processing rule rule//1 says: when you find the character c at the front of the input list, continue list processing with another_rule//1, and once that has succeeded, call predicate(X) as a normal goal.
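As a minimal sketch of the same idea, separate from the tokenizer, here is a tiny DCG that consumes a run of a characters and counts them. The nonterminal as//1 is a hypothetical name made up for this illustration; phrase/2 is the standard way to run a DCG against a list.

```prolog
% Hypothetical example: as(N) matches one or more 'a' characters
% and unifies N with how many were consumed.
as(1) --> [a].
as(N) --> [a], as(N0), { N is N0 + 1 }.   % { ... } runs an ordinary goal

% phrase/2 succeeds only if the whole list is consumed:
% ?- phrase(as(N), [a, a, a]).
% N = 3
```

The goal inside the braces is not part of the grammar. It is called like any other Prolog goal once the preceding terminals and nonterminals have matched.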
Then:
% If we encounter a separator symbol '(' or ')', we commit to the
% clause using '!' (there is no point trying anything else, in particular
% not the clause for "other characters"), tokenize the rest of the list,
% and when we have done that, decide whether 'MaybeToken', which is
% "part of the leftmost token after '(' or ')'", should be retained.
% It is dropped if it is empty. The caller is then given an empty
% "part of the leftmost token" and the list of tokens, with '(' or ')'
% prepended: "tokenize('', [ '(' | MoreTokens] ) -->"
tokenize('', [ '(' | MoreTokens] ) -->
    ['('],
    !,
    tokenize(MaybeToken,Tokens),
    {drop_empty(MaybeToken,Tokens,MoreTokens)}.
tokenize('',[')'|MoreTokens]) -->
    [')'],
    !,
    tokenize(MaybeToken,Tokens),
    {drop_empty(MaybeToken,Tokens,MoreTokens)}.
% No more characters in the input list (that's what '--> []' says).
% We succeed, with an empty token list and an empty buffer for the
% leftmost token.
tokenize('',[]) --> [].
% If we find a 'Ch' that is not '(' or ')', then tokenize
% more of the list via 'tokenize(MaybeToken,Tokens)'. On
% return, 'MaybeToken' is a piece of the leftmost token found
% in that list, so we have to stick 'Ch' onto its start.
tokenize(LargerMaybeToken,Tokens) -->
    [Ch],
    tokenize(MaybeToken,Tokens),
    {atom_concat(Ch,MaybeToken,LargerMaybeToken)}.
% ---
% This drops an empty "MaybeToken". If "MaybeToken" is
% *not* empty, it is actually a token and prepended to the list "Tokens"
% ---
drop_empty('',Tokens,Tokens) :- !.
drop_empty(MaybeToken,Tokens,[MaybeToken|Tokens]).
% -----------------
% Call the DCG using phrase/2
% -----------------
tokenize(Text,Result) :-
    phrase( tokenize(MaybeToken,Tokens), Text ),
    drop_empty(MaybeToken,Tokens,Result),!.
And so:
?- tokenize([h,e,l,l,o],R).
R = [hello].
?- tokenize([h,e,l,'(',l,')',o],R).
R = [hel,(,l,),o].
?- tokenize([h,e,l,'(',l,l,')',o],R).
R = [hel,(,ll,),o].
I think in GNU Prolog, the notation `hello` generates [h,e,l,l,o] directly.
I do not understand why tokenize fails for the empty case.
The reason anything fails in Prolog is because there is no clause that makes it true. If your only clause for tokenize is of the form tokenize([Char | Chars], ...), then no call of the form tokenize([], ...) will ever be able to match this clause, and since there are no other clauses, the call will fail.
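A minimal sketch of that failure mode, using two hypothetical predicates (all_a_bad/1 and all_a/1 are made up for illustration and are not part of the question's code):

```prolog
% all_a_bad/1 has only a clause for non-empty lists, so the recursion
% eventually calls all_a_bad([]), which matches no clause and fails.
all_a_bad([a | Rest]) :- all_a_bad(Rest).

% all_a/1 adds a clause for the empty list, which ends the recursion.
all_a([]).
all_a([a | Rest]) :- all_a(Rest).

% ?- all_a_bad([a, a]).   % fails: no clause matches all_a_bad([])
% ?- all_a([a, a]).       % succeeds
```

This is the same shape as the trace in the question: every recursive call succeeds in matching until the list runs out, and the final call on [] makes the whole chain fail.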
So you need to add such a clause. But first:
:- set_prolog_flag(double_quotes, chars).
This allows you to write ['(', f, o, o, ')'] as "(foo)".
Also, you must plan for the case where the input is completely empty, or other cases where you must maybe emit a token for the buffer, but only if it is not '' (since there should be no '' tokens littering the result).
finish_buffer(Tokens, Buffer, TokensMaybeWithBuffer) :-
    (   Buffer = ''
    ->  TokensMaybeWithBuffer = Tokens
    ;   TokensMaybeWithBuffer = [Buffer | Tokens] ).
For example:
?- finish_buffer(MyTokens, '', TokensMaybeWithBuffer).
MyTokens = TokensMaybeWithBuffer.
?- finish_buffer(MyTokens, 'foo', TokensMaybeWithBuffer).
TokensMaybeWithBuffer = [foo|MyTokens].
Note that you can prepend the buffer to the list of tokens, even if you don't yet know what that list of tokens is! This is the power of logical variables. The rest of the code uses this technique as well.
So, the case for the empty input:
tokenize([], Buffer, Tokens) :-
finish_buffer([], Buffer, Tokens).
For example:
?- tokenize([], '', Tokens).
Tokens = [].
?- tokenize([], 'foo', Tokens).
Tokens = [foo].
And the remaining cases:
tokenize([Parenthesis | Chars], Buffer, TokensWithParenthesis) :-
    (   Parenthesis = '('
    ;   Parenthesis = ')' ),
    finish_buffer([Parenthesis | Tokens], Buffer, TokensWithParenthesis),
    tokenize(Chars, '', Tokens).
tokenize([' ' | Chars], Buffer, TokensWithBuffer) :-
    finish_buffer(Tokens, Buffer, TokensWithBuffer),
    tokenize(Chars, '', Tokens).
tokenize([Char | Chars], Buffer, Tokens) :-
    Char \= '(',
    Char \= ')',
    Char \= ' ',
    atom_concat(Buffer, Char, NewBuffer),
    tokenize(Chars, NewBuffer, Tokens).
Note how I used separate clauses for the separate cases. This makes the code more readable, but it does have the drawback compared to (... -> ... ; ...) that the last clause must exclude characters handled by previous clauses. Once you have your code in this shape, and you're happy that it works, you can transform it into a form using (... -> ... ; ...) if you really want to.
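For comparison, the (... -> ... ; ...) form might look like the sketch below. The name tokenize_ite/3 is hypothetical, chosen only so it does not clash with tokenize/3, and finish_buffer/3 is repeated so the block loads on its own (leave it out if it is already defined in your file).

```prolog
% Same helper as above: prepend Buffer to Tokens unless it is ''.
finish_buffer(Tokens, Buffer, TokensMaybeWithBuffer) :-
    (   Buffer = ''
    ->  TokensMaybeWithBuffer = Tokens
    ;   TokensMaybeWithBuffer = [Buffer | Tokens] ).

% Empty input: emit whatever is left in the buffer.
tokenize_ite([], Buffer, Tokens) :-
    finish_buffer([], Buffer, Tokens).
% Non-empty input: one clause, with the cases folded into if-then-else.
tokenize_ite([Char | Chars], Buffer, Tokens) :-
    (   ( Char = '(' ; Char = ')' )
    ->  finish_buffer([Char | Rest], Buffer, Tokens),
        tokenize_ite(Chars, '', Rest)
    ;   Char = ' '
    ->  finish_buffer(Rest, Buffer, Tokens),
        tokenize_ite(Chars, '', Rest)
    ;   atom_concat(Buffer, Char, NewBuffer),
        tokenize_ite(Chars, NewBuffer, Tokens)
    ).

% ?- tokenize_ite(['(', f, o, o, ')'], '', Tokens).
% Tokens = ['(', foo, ')']
```

Because -> commits to the first condition that succeeds, the final branch no longer needs the explicit Char \= ... tests.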
Examples:
?- tokenize("(foo)", '', Tokens).
Tokens = ['(', foo, ')'] ;
false.
?- tokenize(" (foo)", '', Tokens).
Tokens = ['(', foo, ')'] ;
false.
?- tokenize("(foo(bar)baz)", '', Tokens).
Tokens = ['(', foo, '(', bar, ')', baz, ')'] ;
false.
Finally, and very importantly, the is operator is meant only for evaluating arithmetic expressions. It throws an exception when you apply it to anything that is not arithmetic. Unification is different from arithmetic evaluation and is written as =. In your code, goals such as Tokens is [Buffer | Tail_Tokens] should therefore use = instead of is.
?- X is 2 + 2.
X = 4.
?- X = 2 + 2.
X = 2+2.
?- X is [a, b, c].
ERROR: Arithmetic: `[a,b,c]' is not a function
ERROR: In:
ERROR: [20] throw(error(type_error(evaluable,...),_3362))
ERROR: [17] arithmetic:expand_function([a,b|...],_3400,_3402) at /usr/lib/swi-prolog/library/arithmetic.pl:175
ERROR: [16] arithmetic:math_goal_expansion(_3450 is [a|...],_3446) at /usr/lib/swi-prolog/library/arithmetic.pl:147
ERROR: [14] '$expand':call_goal_expansion([system- ...],_3512 is [a|...],_3492,_3494,_3496) at /usr/lib/swi-prolog/boot/expand.pl:863
ERROR: [13] '$expand':expand_goal(_3566 is [a|...],_3552,_3554,_3556,user,[system- ...],_3562) at /usr/lib/swi-prolog/boot/expand.pl:524
ERROR: [12] setup_call_catcher_cleanup('$expand':'$set_source_module'(user,user),'$expand':expand_goal(...,_3640,_3642,_3644,user,...,_3650),_3614,'$expand':'$set_source_module'(user)) at /usr/lib/swi-prolog/boot/init.pl:443
ERROR: [8] '$expand':expand_goal(user:(_3706 is ...),_3692,user:_3714,_3696) at /usr/lib/swi-prolog/boot/expand.pl:458
ERROR: [6] setup_call_catcher_cleanup('$toplevel':'$set_source_module'(user,user),'$toplevel':expand_goal(...,...),_3742,'$toplevel':'$set_source_module'(user)) at /usr/lib/swi-prolog/boot/init.pl:443
ERROR:
ERROR: Note: some frames are missing due to last-call optimization.
ERROR: Re-run your program in debug mode (:- debug.) to get more detail.
^ Call: (14) call('$expand':'$set_source_module'(user)) ? abort
% Execution Aborted
?- X = [a, b, c].
X = [a, b, c].
I see this Stack Overflow code that uses =>, but when I search the Julia 1.0.0 online help for "=>", I get zero hits.
replace!(x, 0=>4) # The last expression is the focus of this question.
In the REPL help I get:
help?> =>
search: =>
Pair(x, y)
x => y
Construct a Pair object with type Pair{typeof(x), typeof(y)}. The elements are stored in the fields first and second.
They can also be accessed via iteration.
See also: Dict
Examples
≡≡≡≡≡≡≡≡≡≡
julia> p = "foo" => 7
"foo" => 7
julia> typeof(p)
Pair{String,Int64}
julia> p.first
"foo"
julia> for x in p
println(x)
end
foo
7
What does => do in replace!(x, 0=>4)? Does it create a pair, a replacement of all zeros by fours, or what? Why do I seem to not find it in the Julia 1.0.0 on-line docs?
EDIT
Code added to help me understand @Bill's helpful answer below:
julia> x = [1, 0, 3, 2, 0]
5-element Array{Int64,1}:
1
0
3
2
0
julia> replace!(x, 0=>4)
5-element Array{Int64,1}:
1
4
3
2
4
Edit 2
Besides @Bill's accepted answer, I found @Steven's answer helpful as well. Sorry I could not accept them both, but Bill's came in first and they both offered useful information.
"What does => do in replace!(x, 0=>4)? Does it create a pair, a replacement of all zeros by fours, or what?"
It creates a Pair. When replace receives a Pair as its second argument, multiple dispatch selects a method of replace that, given an array or string x, replaces every item in x matching the first element of the Pair with the second element of the Pair.
You can check the REPL docs for replace for details.
This small example should show how "=>" makes a pair
julia> replace("julia", Pair("u", "o"))
"jolia"
julia> replace("julia", "u" => "o")
"jolia"
"=>" operator means "Change into"
so
julia> replace("hello world",'l' => 'z')
"hezzo worzd"
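means: change the string "hello world" by changing each 'l' into 'z', producing the resulting string "hezzo worzd". Note that => itself only constructs a Pair; it is replace that interprets the pair as "change into".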
julia> replace( [1,2,3,4,5], 3 => 666 )
5-element Array{Int64,1}:
1
2
666
4
5
I'm currently facing a strange problem for which I can't find the reason. I'm trying to create a multidimensional array in AutoIt and initialize it with values. Keeping the variant datatype in mind, it should be possible to create an array, that
consists of multiple elements (of course)
each element is another array containing exactly 2 elements (with different types)
these two elements are a single integer and another array with 4 integers
So basically I want to have an array of key/value pairs where the value is an array of numbers. I know that the "value" in this case is treated as a simple variable and not as a directly accessible array; that is what I want.
However, if I try
Global Const $x[3][2] = [ _
[1, [11,12,13,14] ], _
[2, [21,22,23,24] ], _
[3, [31,32,33,34] ] _
]
I just get
error: syntax error
[1, [
~~~~^
error: syntax error
[2, [
~~~~^
error: syntax error
[3, [
~~~~^
Either I'm missing something or the initializer doesn't realize that I don't want to have 3 dimensions but just 2 and handle the 4 item array as a single variant.
Can anybody suggest the preferred solution for that? Or should I forget this combined solution and simply go with an array of arrays with 5 elements each, like
Global Const $x[3][5] = [ _
[1, 11,12,13,14 ], _
[2, 21,22,23,24 ], _
[3, 31,32,33,34 ] _
]
and handle the difference in the code?
It is possible but you need to initialise the inner arrays separately.
Global $arr[3][2]
Global $a1[4] = [11,12,13,14] ;these are the inner arrays
Global $a2[4] = [21,22,23,24]
Global $a3[4] = [31,32,33,34]
$arr[0][0] = 1
$arr[0][1] = $a1 ;put the inner arrays into the outer array
$arr[1][0] = 2
$arr[1][1] = $a2
$arr[2][0] = 3
$arr[2][1] = $a3
Global $arrTemp
For $i = 0 To UBound($arr) - 1
    $arrTemp = $arr[$i][1] ;copy the inner array out before indexing it
    For $j = 0 To UBound($arrTemp) - 1
        ConsoleWrite($arr[$i][0] & ': ' & $arrTemp[$j] & @CRLF)
    Next
Next
This will output:
1: 11
1: 12
1: 13
1: 14
2: 21
2: 22
2: 23
2: 24
3: 31
3: 32
3: 33
3: 34
I don't think that is possible. You can go for this:
#region ;************ Includes ************
#include <Array.au3>
#endregion ;************ Includes ************
Global $a[4] = [11, 12, 13, 14]
Global Const $x[3][2] = [ _
[1, $a], _
[2, "21,22,23,24"], _
[3, "31,32,33,34"] _
]
_ArrayDisplay($x)
; Getting the values of [0][1]
For $i = 0 To UBound($a) - 1
    ConsoleWrite($a[$i] & @CR)
Next
; Getting the values of [1][1] by splitting the comma-separated string
$re = StringSplit($x[1][1], ',', 2)
For $i = 0 To UBound($re) - 1
    ConsoleWrite($re[$i] & @CR)
Next