A nested parenthesis parsing case leading to multiple sequence results - pyparsing

I'd like to parse a string with nested parentheses under these conditions:
Elements are delimited by a comma (,) or a bar (|).
An element inside parentheses may be a single alphanumeric character or another parenthesized group.
Elements joined by a bar | are alternatives: each alternative produces a new sequence that combines the elements before the group with the comma-joined elements after it.
To clarify, here are some example input strings and the results they should return:
(a, b, c) should return: a, b, c
(a, (b | c)) should return: a, b and a, c
(a, b, (c | (d, e)), f) should return: a, b, c, f and a, b, d, e, f
(a, b, (c | (d, e) | f), g) should return: a, b, c, g and a, b, d, e, g and a, b, f, g
(a, b, c, ((d, (e | f)) | (g, h)), i) should return: a, b, c, d, e, i and a, b, c, d, f, i and a, b, c, g, h, i
((a | b), c) should return: a, c and b, c

(from the pyparsing wiki)
You can get the string parsed using infixNotation (formerly known as operatorPrecedence). Assuming that ',' has precedence over '|', this would look like:
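from pyparsing import alphas, infixNotation, oneOf, opAssoc
# a variable is a single lowercase letter
variable = oneOf(list(alphas.lower()))
# operators are listed from highest to lowest precedence
expr = infixNotation(variable,
    [
    (',', 2, opAssoc.LEFT),
    ('|', 2, opAssoc.LEFT),
    ])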
Converting your test cases to a little testing framework, we can at least test the parsing part:
tests = [
    ("(a, b, c)", ["abc"]),
    ("(a, b | c)", ["ab", "c"]),
    ("((a, b) | c)", ["ab", "c"]),
    ("(a, (b | c))", ["ab", "ac"]),
    ("(a, b, (c | (d, e)), f)", ["abcf","abdef"]),
    ("(a, b, (c | (d, e) | f), g)", ["abcg", "abdeg", "abfg"]),
    ("(a, b, c, ((d, (e | f)) | (g, h)), i)",
        ["abcdei", "abcdfi", "abcghi"]),
    ("((a | b), c)", ["ac", "bc"]),
    ]
for test, expected in tests:
    # if your expected values *must* be lists and not strings, then
    # add this line
    # expected = [list(ex) for ex in expected]
    result = expr.parseString(test)
    print(result[0].asList())
which will give you something like this:
['a', ',', 'b', ',', 'c']
[['a', ',', 'b'], '|', 'c']
[['a', ',', 'b'], '|', 'c']
['a', ',', ['b', '|', 'c']]
['a', ',', 'b', ',', ['c', '|', ['d', ',', 'e']], ',', 'f']
['a', ',', 'b', ',', ['c', '|', ['d', ',', 'e'], '|', 'f'], ',', 'g']
['a', ',', 'b', ',', 'c', ',', [['d', ',', ['e', '|', 'f']], '|', ['g', ',', 'h']], ',', 'i']
[['a', '|', 'b'], ',', 'c']
That is the easy part, parsing the string and getting the operator precedence reflected in the resulting structure. Now if you follow the example from the regex inverter, you will need to attach objects to each parsed bit, something like this:
class ParsedItem(object):
    def __init__(self, tokens):
        self.tokens = tokens[0]
class Var(ParsedItem):
    """ TBD """
class BinaryOpn(ParsedItem):
    def __init__(self, tokens):
        # keep the operands, skip the operator tokens between them
        self.tokens = tokens[0][::2]
class Sequence(BinaryOpn):
    """ TBD """
class Alternation(BinaryOpn):
    """ TBD """
variable = oneOf(list(alphas.lower())).setParseAction(Var)
expr = infixNotation(variable,
    [
    (',', 2, opAssoc.LEFT, Sequence),
    ('|', 2, opAssoc.LEFT, Alternation),
    ])
Now you will have to implement the bodies of Var, Sequence, and Alternation. You will not get a list of values directly back from pyparsing, instead you will get one of these object types back. Then, instead of calling asList() as I did in the sample above, you'll call something like generate or makeGenerator to get a generator from that object. Then you'll invoke that generator to have the objects generate all the different results for you.
I leave the rest as an exercise for you.
-- Paul
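One possible way to finish the exercise, sketched in plain Python under a few assumptions: it works directly on the nested lists printed by asList() above rather than on the Var, Sequence and Alternation classes, and the helper name expand is made up for this illustration. A ',' node combines one choice from each operand in order, and a '|' node collects the choices of all its operands.

```python
from itertools import product


def expand(node):
    """Return every sequence (a list of names) described by a parsed node.

    `node` is either a single name such as 'a', or a nested list such as
    ['a', ',', ['b', '|', 'c']] with operators at the odd positions.
    `expand` is a hypothetical helper, not part of pyparsing.
    """
    if isinstance(node, str):
        return [[node]]
    if len(node) == 1:
        # a group that wraps a single operand
        return expand(node[0])
    operator = node[1]
    choices = [expand(operand) for operand in node[::2]]
    if operator == ',':
        # sequence: pick one alternative from every operand, keep the order
        return [sum(combo, []) for combo in product(*choices)]
    if operator == '|':
        # alternation: each operand contributes its own alternatives
        return [seq for alternatives in choices for seq in alternatives]
    raise ValueError("unknown operator: %r" % (operator,))


parsed = ['a', ',', 'b', ',', ['c', '|', ['d', ',', 'e']], ',', 'f']
print([''.join(seq) for seq in expand(parsed)])  # ['abcf', 'abdef']
```

Returning plain lists keeps the sketch short; a generator-based version along the lines Paul describes would yield the same sequences lazily.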

Related

Is there a way to find if a sequence of two chars are found in a list only if they are consecutive?

I am currently working in Elm. An example would be like this:
(Sequence ('a') ('b')) ('c') ['a', 'b', 'c', 'd']. In this example, I only test whether the elements 'a', 'b', 'c' are members of the list. If they are, I partition it and obtain (['a','b','c'],['d']).
I encountered problems in the following case:
(Sequence ('a') ('b')) ('c') ['a', 'b', 'c', 'a']
obtaining the result:
(['a','b','c','a'],[])
My question is: what condition should I add so that the elements 'a' and 'b' must be consecutive, avoiding the case where they are matched on their own?
This answer assumes that if you have Sequence 'a' 'b' 'c' and test it against the list ['a', 'b', 'c', 'a'], you want to receive the result (['a', 'b', 'c'], ['a']) (as asked in this comment).
In pseudo-code:
Split the list into two, list1 and list2. list1 should have the same length as your sequence. Elm provides List.take and List.drop for that
Convert your sequence into a list list_sequence with a helper function
Test if list1 and list_sequence are equal
If they are, return the tuple (list1, list2)
And here is the actual Elm code:
https://ellie-app.com/bjBLns4dKkra1
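The pseudo-code steps above can also be sketched in Python (chosen here only for illustration; the linked solution is in Elm). The function name split_prefix is hypothetical, and returning None when the list does not start with the sequence is an assumption, since the steps do not say what should happen in that case.

```python
def split_prefix(sequence, items):
    """Split `items` into (prefix, rest) if it starts with `sequence`.

    Mirrors List.take / List.drop: the prefix has the same length as the
    sequence. Returns None when the prefix differs (an assumed convention).
    """
    n = len(sequence)
    prefix, rest = items[:n], items[n:]
    if prefix == list(sequence):
        return prefix, rest
    return None


print(split_prefix(['a', 'b', 'c'], ['a', 'b', 'c', 'a']))
# (['a', 'b', 'c'], ['a'])
print(split_prefix(['a', 'b', 'c'], ['a', 'c', 'b', 'a']))
# None
```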
Here is some code that tests if a sequence of elements occurs in a list:
module Main exposing (main)
import Html exposing (Html, text)
containsSeq : List a -> List a -> Bool
containsSeq seq list =
    let
        helper remainingSeq remainingList savedSeq savedList =
            case remainingSeq of
                [] ->
                    True
                x :: xs ->
                    case remainingList of
                        [] ->
                            False
                        y :: ys ->
                            if x == y then
                                helper xs ys (savedSeq ++ [ x ]) (savedList ++ [ y ])
                            else
                                case savedList of
                                    [] ->
                                        helper (savedSeq ++ remainingSeq) ys [] []
                                    y2 :: y2s ->
                                        helper (savedSeq ++ remainingSeq) (y2s ++ remainingList) [] []
    in
    helper seq list [] []
main =
    text <| Debug.toString <| containsSeq [ 'a', 'b', 'c' ] [ 'a', 'b', 'a', 'b', 'c', 'd' ]
This only checks whether the sequence appears, and the elements must be of a type that can be compared with ==.
Here is the above function altered to return a partitioning of the old list as a 3 elements Tuple with (elementsBefore, sequence, elementsAfter). The result is wrapped in a Maybe so that if the sequence is not found, it returns Nothing.
module Main exposing (main)
import Html exposing (Html, text)
partitionBySeq : List a -> List a -> Maybe ( List a, List a, List a )
partitionBySeq seq list =
    let
        helper remainingSeq remainingList savedSeq savedCurrentList savedOldList =
            case remainingSeq of
                [] ->
                    Just ( savedOldList, seq, remainingList )
                x :: xs ->
                    case remainingList of
                        [] ->
                            Nothing
                        y :: ys ->
                            if x == y then
                                helper xs ys (savedSeq ++ [ x ]) (savedCurrentList ++ [ y ]) savedOldList
                            else
                                case savedCurrentList of
                                    [] ->
                                        helper (savedSeq ++ remainingSeq) ys [] [] (savedOldList ++ [ y ])
                                    y2 :: y2s ->
                                        -- y2 is the element being skipped; y stays in remainingList and is examined again
                                        helper (savedSeq ++ remainingSeq) (y2s ++ remainingList) [] [] (savedOldList ++ [ y2 ])
    in
    helper seq list [] [] []
main =
    text <| Debug.toString <| partitionBySeq [ 'a', 'b', 'c' ] [ 'a', 'b', 'a', 'b', 'c', 'd' ]
Of course, if you are only dealing with characters, you might as well convert the list into a String using String.fromList and use String.contains "abc" "ababcd" for the first version and String.split "abc" "ababcd" to implement the second one.

How to solve the ransom note problem functionally

Write a function that, given a list of letters and a word, returns true if the word can be spelt using the letters from the list and false if not. For example, given a list like
['a';'b';'d';'e';'a']
and the word
bed
the function should return true.
If the word were bbed, it should return false because there is only one b in the list.
This is easy enough to do imperatively by mutating the state of the dictionary in a for loop but how can one do it in a more functional style without mutation?
Here's the imperative version I did:
open System.Collections.Generic
let letters = new Dictionary<char,int>()
[ ('a', 2); ('b', 1); ('c', 1); ('e', 1) ] |> Seq.iter letters.Add
let can_spell (word : string) =
    let mutable result = true
    for x in word do
        if letters.ContainsKey x && letters.[x] > 0 then
            let old = letters.[x]
            letters.[x] <- old - 1
        else
            result <- false
    result
You can use two dictionaries to keep track of the letter counts of the word and of the available letters, and then check that each available count is at least the word's count for that letter:
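open System.Collections.Generic
let contains (word : string) (letters : IDictionary<char,int>) =
    // count how often each letter occurs in the word
    let w =
        word
        |> Seq.countBy id
        |> dict
    // true when every letter of the word is available often enough (also for the empty word)
    w.Keys
    |> Seq.forall (fun k -> letters.ContainsKey k && letters.[k] >= w.[k])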
and you can use it like this
let letters =
    [ ('a', 2); ('b', 1); ('c', 1); ('e', 1); ('d', 1) ]
    |> dict
contains "bed" letters // true
I would do it like this:
let can_spell2 letters word =
    // group the letters of the word and count each unique letter
    let uniqueLettersCount =
        word
        |> Seq.groupBy id
        |> Seq.map (fun (l, s) -> l, Seq.length s)
    // keep taking pairs until a letter is missing or its available count is too low
    uniqueLettersCount
    |> Seq.takeWhile (fun (c, n) ->
        match Map.tryFind c letters with
        | Some n' -> n' >= n
        | None -> false)
    // if takeWhile never stopped, the word can be spelt
    |> fun s -> Seq.length s = Seq.length uniqueLettersCount
EDIT:
Examples of usage:
let letters = ['a',2;'b',1;'c',1;'e',1] |> Map.ofList
can_spell2 letters "aab" //true
can_spell2 letters "aaba" //false
can_spell2 letters "bf" //false
can_spell2 letters "ecaba" //true
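For comparison, the same counting idea can be sketched without mutation in Python rather than F#. It assumes the letters arrive as a plain list, as in the question, and the name can_spell is reused only for illustration. Subtracting one collections.Counter from another keeps only positive counts, so an empty result means no letter of the word is missing.

```python
from collections import Counter


def can_spell(letters, word):
    """Return True if `word` can be spelt with the available `letters`."""
    # Counter subtraction drops zero and negative counts, so whatever is
    # left over is exactly what the list of letters cannot supply.
    missing = Counter(word) - Counter(letters)
    return not missing


letters = ['a', 'b', 'd', 'e', 'a']
print(can_spell(letters, "bed"))   # True
print(can_spell(letters, "bbed"))  # False
```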

How to extract tuple from a datatype?

New to SML and trying to learn through a series of exercises. The function I am trying to write deals with flattening a tree with N children. My approach was to simply take the current NTreeNode and add its value to some list that I would return. Then take its second argument, the list of children, and tack that on to another list, which would be my queue. This queue would serve as all the items I still have left to process.
I tried to do this approach by passing the NTreeList and the list I would return with the initial value in flattenNTree, to a helper function.
However, when I try to process an NTreeNode from my queue it gives me back an NTree and I can't use my first/second functions on that, I need a tuple back from the queue. I just don't understand how to get back a tuple, I tried to use the NTreeNode constructor, but even that's giving me an NTree back.
My question is how can I extract a tuple from the NTree datatype I have defined.
datatype NTree =
    NTreeNode of int * NTree list
  | EmptyNTree
;
fun first (a, _) = a;
fun second (_, b) = b;
fun processTree queue finalList =
    if null queue
    then finalList
    else processTree ((tl queue)@(second(NTreeNode(hd queue)))) finalList@[first (NTreeNode (hd queue)) ]
;
fun flattenNTree EmptyNTree = []
  | flattenNTree (NTreeNode x) = processTree (second x) [(first x)]
;
An example input value:
val t =
    NTreeNode (1, [
        NTreeNode (2, [
            NTreeNode (3, [EmptyNTree]),
            NTreeNode (4, []),
            NTreeNode (5, [EmptyNTree]),
            EmptyNTree
        ]),
        NTreeNode (6, [
            NTreeNode (7, [EmptyNTree])
        ])
    ]);
It's much easier to take things apart with pattern matching than fiddling around with selectors like first or tl.
It's also more efficient to accumulate a list in reverse and fix that when you're finished than to repeatedly append to the end of it.
fun processTree [] final = rev final
  | processTree (EmptyNTree::ts) final = processTree ts final
  | processTree ((NTreeNode (v,t))::ts) final = processTree (ts @ t) (v :: final)
Your processTree function is missing the case for EmptyNTree and you seem to be trying to add NTree constructors before calling first and second, whereas you need rather to strip them away, as you do in flattenNTree.
Both problems can be fixed by applying pattern matching to the head of the queue:
fun processTree queue finalList =
    if null queue
    then finalList
    else case hd queue of
             EmptyNTree => processTree (tl queue) finalList
           | NTreeNode v => processTree (tl queue @ second v) (finalList @ [first v])
;
You might also consider an implementation based on list functionals (although the order of the result is not the same):
fun flattenNTree t = case t of
    EmptyNTree => []
  | NTreeNode (n, nts) => n :: (List.concat (List.map flattenNTree nts));
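To see how the two orders differ, here is a small Python sketch (not SML) that models NTreeNode (v, children) as a tuple (v, children) and EmptyNTree as None. Both the encoding and the function names are assumptions made for this illustration. The queue-based version visits the tree level by level, while the recursive version built on map and concat visits it depth-first.

```python
from collections import deque


def flatten_queue(tree):
    """Breadth-first flatten, like the queue-based processTree."""
    result, queue = [], deque([tree])
    while queue:
        node = queue.popleft()
        if node is None:          # EmptyNTree: nothing to record
            continue
        value, children = node
        result.append(value)
        queue.extend(children)    # children go to the back of the queue
    return result


def flatten_recursive(tree):
    """Depth-first flatten, like the List.concat / List.map version."""
    if tree is None:
        return []
    value, children = tree
    return [value] + [v for child in children for v in flatten_recursive(child)]


# the example tree from the question
t = (1, [
    (2, [(3, [None]), (4, []), (5, [None]), None]),
    (6, [(7, [None])]),
])
print(flatten_queue(t))      # [1, 2, 6, 3, 4, 5, 7]
print(flatten_recursive(t))  # [1, 2, 3, 4, 5, 6, 7]
```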
Given the tree type
datatype 'a tree = Node of 'a * 'a tree list
                 | Leaf
you can fold it:
fun fold f e0 Leaf = e0
  | fold f e0 (Node (x, ts)) =
    let val e1 = f (x, e0)
    in foldl (fn (t, e2) => fold f e2 t) e1 ts
    end
and flatten it:
fun flatten t =
    fold op:: [] t

Ocaml pattern matching for "square" tuple?

In attempting to learn Ocaml and functional languages in general, I have been looking into pattern matching. I was reading this documentation, and decided to try the following exercise for myself:
Make an expression that evaluates to true when an integer 4-tuple is input such that each element in the 4-tuple is equal.
(4, 4, 4, 4) -> true
(4, 2, 4, 4) -> false
It is not obvious to me how to pattern match on the specific values of the elements. This is the code I wrote.
let sqr x = match x with
    (a, a, a, a) -> true
  | (_, _, _, _) -> false ;;
Of course, this code throws the following error:
Error: Variable a is bound several times in this matching
How else can I not only enforce that x is a 4-tuple, but also of strictly integers that are equal?
(Also, of course a "square" tuple should not allow non-positive integers, but I'm more concerned with the aforementioned problem as of now).
As you found out, unlike some other languages' pattern-matching systems, you can't do this in OCaml. What you can do is match each element of the tuple separately while using guards to only succeed if some property (like equivalence) holds across them:
let sqr x =
  match x with
  | (a, b, c, d) when a = b && b = c && c = d -> `Equal
  | (a, b, c, d) when (a < b && b < c && c < d)
                   || (a > b && b > c && c > d) -> `Ordered
  | _ -> `Boring
There are many ways to pattern match; pattern matching is not limited to the match keyword:
let fourtuple_equals (a,b,c,d) = List.for_all ((=) a) [b;c;d]
val fourtuple_equals : 'a * 'a * 'a * 'a -> bool = <fun>
Here the pattern match happens directly in the parameter, which gives access to the four elements of the tuple.
In this example I use a list to keep the code concise, but it is not the most efficient approach.

Case Statements and Pattern Matching

I'm coding in SML for an assignment. I've done a few practice problems and I feel like I'm missing something: I feel like I'm using too many case statements. Here's what I'm doing, along with the problem statements for what I'm having trouble with:
Write a function all_except_option, which takes a string and a string list. Return NONE if the string is not in the list, else return SOME lst where lst is like the argument list except the string is not in it.
fun all_except_option(str : string, lst : string list) =
    case lst of
        [] => NONE
      | x::xs => case same_string(x, str) of
                     true => SOME xs
                   | false => case all_except_option(str, xs) of
                                  NONE => NONE
                                | SOME y=> SOME (x::y)
Write a function get_substitutions1, which takes a string list list (a list of list of strings, the substitutions) and a string s and returns a string list. The result has all the strings that are in some list in substitutions that also has s, but s itself should not be in the result.
fun get_substitutions1(lst : string list list, s : string) =
    case lst of
        [] => []
      | x::xs => case all_except_option(s, x) of
                     NONE => get_substitutions1(xs, s)
                   | SOME y => y @ get_substitutions1(xs, s)
same_string is a provided function,
fun same_string(s1 : string, s2 : string) = s1 = s2
First of all, I would use pattern matching in the function definition
instead of a "top-level" case statement. It basically boils down to the
same thing after desugaring. I would also get rid of the explicit type annotations unless they are strictly needed:
fun all_except_option (str, []) = NONE
  | all_except_option (str, x :: xs) =
    case same_string(x, str) of
        true => SOME xs
      | false => case all_except_option(str, xs) of
                     NONE => NONE
                   | SOME y => SOME (x::y)
fun get_substitutions1 ([], s) = []
  | get_substitutions1 (x :: xs, s) =
    case all_except_option(s, x) of
        NONE => get_substitutions1(xs, s)
      | SOME y => y @ get_substitutions1(xs, s)
If speed is not of importance, then you could merge the two cases in the first function:
fun all_except_option (str, []) = NONE
  | all_except_option (str, x :: xs) =
    case (same_string(x, str), all_except_option(str, xs)) of
        (true, _) => SOME xs
      | (false, NONE) => NONE
      | (false, SOME y) => SOME (x::y)
But since you are using append (@) in the second function, and since it is not
tail recursive, I don't believe that is your major concern. Keep in mind that
append is potentially "evil": you should almost always build the list with cons (::)
and reverse the result when returning it, and use tail recursion when possible (it
always is).
If you really like the explicit type annotations, then you could do it like this:
val rec all_except_option : string * string list -> string list option =
    fn (str, []) => NONE
     | (str, x :: xs) =>
       case (same_string(x, str), all_except_option(str, xs)) of
           (true, _) => SOME xs
         | (false, NONE) => NONE
         | (false, SOME y) => SOME (x::y)
val rec get_substitutions1 : string list list * string -> string list =
    fn ([], s) => []
     | (x :: xs, s) =>
       case all_except_option(s, x) of
           NONE => get_substitutions1(xs, s)
         | SOME y => y @ get_substitutions1(xs, s)
But that is just my preferred way, if I really have to add type annotations.
By the way, why on earth do you have the same_string function? You can just do the comparison directly instead. Using an auxiliary function is just weird, unless you plan to exchange it with some special logic at some point. However, your function names don't suggest that.
In addition to what Jesper.Reenberg mentioned, I just wanted to mention that a match on a bool for true and false can be replaced with an if-then-else. However, some people consider if-then-else uglier than a case statement:
fun same_string( s1: string, s2: string ) =
    if String.compare( s1, s2 ) = EQUAL then true else false
fun contains( [], s: string ) = false
  | contains( h::t, s: string ) =
    if same_string( s, h ) then true else contains( t, s )
fun all_except_option_successfully( s: string, [] ) = []
  | all_except_option_successfully( s: string, h::t ) =
    if same_string( s, h ) then t else ( h :: all_except_option_successfully( s, t ) )
fun all_except_option( s: string, [] ) = NONE
  | all_except_option( s: string, h::t ) =
    if same_string( s, h ) then SOME t
    else if contains( t, s ) then SOME ( h :: all_except_option_successfully( s, t ) )
    else NONE
