Simultaneous match in FParsec - bnf

I'm trying to parse the following into lines and fields. Lines are delimited by '\n' and fields are delimited by '|'.
abcd|efgh|ijkl
mnopq\|rst|uvwxy
za|bcd
efg|hijk|lmnop
I can define the following:
let displayCharacter = satisfy (fun c -> ' ' <= c && c <= '~')
let escapedDC = pchar '\\' >>. displayCharacter
let test1 =
    run (manyChars (escapedDC <|> displayCharacter)) "asdf\|efgh|ijkl"
// Success: "asdf|efgh|ijkl"
But let fields = sepBy (manyChars (escapedDC <|> displayCharacter)) (pchar '|') does not work, because displayCharacter also matches '|', so the delimiter is consumed as part of the field instead of ending it. These delimiters are context sensitive, so I want to avoid hard-coding them into displayCharacter: '|' is a display character that only needs escaping in certain contexts.
If I try to define a single field with manyCharsTill, then I need to account for the final element on a line with anyOf "|\n", but this reads in all of the lines into one line.
I may have further subdelimiters beyond '|' that are supported in certain contexts. For this reason, it seems messy to have to define versions of displayCharacter and escapedDC for every case. Rather, using lookahead features seems cleaner. Or perhaps a parser called both which somehow requires a match on two parsers simultaneously.
manyCharsSepBy (escapedDC <|> displayCharacter) (pchar '|')
or
let contextualDisplayCharacter1 = both displayCharacter (satisfy ((<>) '|'))
Is there an easier way to accomplish this? Perhaps it is just my implied BNF that is flawed - that if fixed, would translate easily?
============
This is the best I can come up with, but I would like to know from the experts if it is the most flexible way.
let displayCharacter (excludeDelimiters : string) = satisfy (fun c -> ' ' <= c && c <= '~' && not (Seq.exists ((=) c) excludeDelimiters))
let escapedDisplayCharacter = pchar '\\' >>. displayCharacter ""
let field =
    manyChars (escapedDisplayCharacter <|> displayCharacter "|")
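The idea of passing the excluded delimiter in as a parameter can be sketched without a parser library. The following is a hand-written OCaml illustration, not FParsec code; split_fields and its ~delim argument are hypothetical names introduced here. A backslash makes the next character literal, and any other occurrence of the delimiter ends the field.

```ocaml
(* Split [line] into fields at each unescaped [delim].
   A backslash makes the following character literal, so an escaped
   delimiter ends up inside the field instead of terminating it.
   A lone trailing backslash is kept as an ordinary character. *)
let split_fields ~delim line =
  let buf = Buffer.create 16 in
  let fields = ref [] in
  (* Close the current field and start a new one. *)
  let flush () =
    fields := Buffer.contents buf :: !fields;
    Buffer.clear buf
  in
  let n = String.length line in
  let rec go i =
    if i >= n then flush ()
    else if line.[i] = '\\' && i + 1 < n then begin
      Buffer.add_char buf line.[i + 1];
      go (i + 2)
    end
    else if line.[i] = delim then begin
      flush ();
      go (i + 1)
    end
    else begin
      Buffer.add_char buf line.[i];
      go (i + 1)
    end
  in
  go 0;
  List.rev !fields

(* Lines are split first, then each line is split with its own delimiter. *)
let parse_lines text =
  String.split_on_char '\n' text |> List.map (split_fields ~delim:'|')
```

Because the delimiter is an argument, a sub-delimiter for a nested context can reuse the same function with a different ~delim value.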

Related

currying multiple functions in parallel in F#

I'm trying to learn F# at the moment and have come up on a problem I can't solve and can't find any answers for on google.
Initially I wanted a log function that would work like the printf family of functions whereby I could provide a format string and a number of arguments (statically checked) but which would add a little metadata before printing it out. With googling, I found this was possible using a function like the following:
let LogToConsole level (format:Printf.TextWriterFormat<'T>) =
    let extendedFormat = (Printf.TextWriterFormat<string->string->'T> ("%s %s: " + format.Value))
    let date = DateTime.UtcNow.ToString "yyyy-MM-dd HH:mm:ss.fff"
    let lvl = string level
    printfn extendedFormat date lvl
Having the printfn call as the last line of this function enables the varargs-like magic of the printf syntax: the partially applied printfn function is returned, which lets the caller finish applying the arguments.
However, if I have multiple such functions with the same signature, say LogToConsole, LogToFile and others, how could I write a function that would call them all while keeping this partial-application magic?
Essentially, I'm looking for a way to implement a function MultiLog that would allow me to call multiple printf-like functions from a single function call, such as in the TheResultIWant function below:
type LogFunction<'T> = LogLevel -> Printf.TextWriterFormat<'T> -> 'T
let MultiLog<'T> (loggers:LogFunction<'T>[]) level (format:Printf.TextWriterFormat<'T>) :'T =
    loggers
    |> Seq.map (fun f -> f level format)
    |> ?????????
let TheResultIWant =
    let MyLog = MultiLog [LogToConsole; LogToFile]
    MyLog INFO "Text written to %i outputs" 2
Perhaps the essence of this question can be captured more succinctly: given a list of functions with the same signature, how can I partially apply them all with the same arguments?
type ThreeArg = string -> int -> bool -> unit
let funcs: ThreeArg seq = [func1; func2; func3]
let MagicFunction = ?????
// I'd like this to be valid
let partiallyApplied = MagicFunction funcs "string"
// I'd also like this to be valid
let partiallyApplied = MagicFunction funcs "string" 255
// and this (fullyApplied will be `unit`)
let fullyApplied = MagicFunction funcs "string" 255 true
To answer the specific part of the question regarding string formatting, there is a useful function Printf.kprintf which lets you do what you need in a very simple way - the first parameter of the function is a continuation that gets called with the formatted string as an argument. In this continuation, you can just take the formatted string and write it to all the loggers you want. Here is a basic example:
let Loggers = [printfn "%s"]
let LogEverywhere level format =
    Printf.kprintf (fun s ->
        let date = DateTime.UtcNow.ToString "yyyy-MM-dd HH:mm:ss.fff"
        let lvl = string level
        for logger in Loggers do logger (sprintf "%s %s %s" date lvl s)) format
LogEverywhere "BAD" "hi %d" 42
I don't think there is a nice and simple way to do what you wanted to do in the more general case - I suspect you might be able to use some reflection or static member constraints magic, but fortunately, you don't need to in this case!
There is almost nothing to add to Tomas Petricek's perfect answer, as he is basically a "semi-god" in F#. Another alternative that comes to mind is to use a computation expression (see, for example: https://fsharpforfunandprofit.com/series/computation-expressions.html). When used properly it does look like magic :) However, I have a feeling that it is a little too heavy for the problem you described.
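For comparison, the same continuation-based trick exists in OCaml as Printf.ksprintf, which formats the arguments into a string and then hands that string to a continuation. The sketch below is an OCaml analogue of the kprintf answer, not the F# code itself; log_everywhere and the list of string -> unit loggers are hypothetical names, and the timestamp is left out to keep it short.

```ocaml
(* Format once, then pass the finished string to every logger.
   Printf.ksprintf returns a function that still expects the arguments
   named in [fmt], so the caller keeps the printf-style call syntax. *)
let log_everywhere (loggers : (string -> unit) list) level fmt =
  Printf.ksprintf
    (fun msg ->
      let line = Printf.sprintf "%s: %s" level msg in
      List.iter (fun logger -> logger line) loggers)
    fmt

(* Example use: write the same line to stdout and stderr. *)
let () =
  log_everywhere [print_endline; prerr_endline] "INFO"
    "Text written to %i outputs" 2
```

The loggers only ever see a plain string, which is why they can all share one call without any reflection.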

Count number of occurrences of a character in an element using xquery

I have a variable which has | separated values like below.
I need to make sure it never has more than 30 sequences separated by '|', so I believe that counting the number of occurrences of '|' in the variable would suffice.
class=1111|2222|3333|4444
Can you please help in writing xquery for the same.
I am new to xquery.
If you remove all characters but the bar and then use string-length, as in let $s := '1111|2222|3333|4444' return string-length(translate($s, translate($s, '|', ''), '')), you get the number of | characters. Using string-length with a double translate to remove everything except a certain character is an old XPath 1 trick. Since XQuery also has replace, you could equally use let $s := '1111|2222|3333|4444' return string-length(replace($s, '[^|]+', '')).
You could use the tokenize() function to split the value by the | character, and then count how many items in the sequence with fn:count().
Just remember that the tokenize function uses a regex pattern, so you would need to escape the | as \|:
let $PSV := "1111|2222|3333|4444"
let $tokens := fn:tokenize($PSV, "\|")
let $token-count := fn:count($tokens)
return
if ($token-count > 30) then
fn:error((), "Too many pipe separated values")
else
(: thirty values or fewer, do stuff with the $tokens :)
()
Just for good measure, and in case you want to do any performance comparisons, you could try
let $sep := string-to-codepoints('|')
return count(string-to-codepoints($in)[.=$sep])
This has the theoretical advantage that (at least in Saxon) it doesn't construct any new strings or sequences in memory.
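The two counting strategies above differ by one: a string with n separators has n + 1 fields. This is shown here as a small OCaml sketch rather than XQuery; count_char, count_fields and within_limit are hypothetical helper names, and the default limit of 30 is taken from the question.

```ocaml
(* Count occurrences of [sep] without building any intermediate strings. *)
let count_char sep s =
  let n = ref 0 in
  String.iter (fun c -> if c = sep then incr n) s;
  !n

(* Count fields by splitting; this allocates a list of substrings. *)
let count_fields sep s = List.length (String.split_on_char sep s)

(* The limit applies to fields, so compare separators + 1 against it. *)
let within_limit ?(max_fields = 30) s = count_char '|' s + 1 <= max_fields
```

Checking the separator count against 30 directly would let a value with 31 fields through, so the comparison is made on the field count.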

Ocaml - global vs local variable

I wanted to create a global variable called result that uses 5 string concatenations to create a string containing 9 times the string start, separated by commas.
I have two pieces of code, only the second one declares a global variable.
For some reason it's not registering easily in my brain... Is it just that I used a let ... in, so result in the first piece of code is a local variable? Is there a more detailed explanation for this?
let start = "ab";;
let result = start ^ "," in
let result = result ^ result in
let result = result ^ result in
let result = result ^ result in
let result = result ^ start in
result;;
- : string = "ab,ab,ab,ab,ab,ab,ab,ab,ab"
let result =
let result = start ^ "," in
let result = result ^ result in
let result = result ^ result in
let result = result ^ result in
let result = result ^ start in
result;;
val result : string = "ab,ab,ab,ab,ab,ab,ab,ab,ab"
Let me be a little bit pedantic. There are no local and global variables in OCaml; that concept comes from languages with different scoping rules. The word "variable" itself should also be taken with care, since its meaning was distorted by C-like languages. In its original, mathematical sense, a variable is a name used inside a formula to stand for some mathematical object, or for a range of such objects. In C-like languages, a variable is conflated with a memory cell whose contents can change over time. So, to avoid confusion, let's use more accurate terminology and say name instead of variable.
Since variables (sorry, names) are not memory cells, there is nothing to create. When you use one of the let syntaxes, you are actually creating a binding, i.e., an association between a name and a value. The form let <name> = <expr-1> in <expr-2> binds the value of <expr-1> to <name> in the scope of the <expr-2> expression. The let <name> = <expr-1> in <expr-2> form is itself an expression, so <expr-2> can in turn contain let ... in ... constructs, e.g.,
let a = 1 in
  let b = a + 1 in
    let c = b + 1 in
      a + b + c
I deliberately indented the code in a non-idiomatic way to highlight the syntactic structure of the expression. OCaml also allows you to rebind a name that is already bound in the scope. The new binding hides the existing one (which is not allowed in C, for example), e.g.,
let a = 1 in
let a = a + 1 in
let a = a + 1 in
a + a + a
Finally, the top-level (aka module-level) let-binding (called a definition in OCaml parlance) has the syntax let <name> = <expr>; note that there is no in here. The definition binds <name> to the result of evaluating <expr> in the lexical scope that extends from the point of definition to the end of the enclosing module. When you're implementing a module, you must use let <name> = <expr> to bind your code to names (you may omit the name by using _). This is a little different from the interactive toplevel (the interactive ocaml program), which also accepts a bare expression and evaluates it. For example,
let result = start ^ "," in
let result = result ^ result in
let result = result ^ result in
let result = result ^ result in
let result = result ^ start in
result
is not a valid OCaml program (something that can be put into an .ml file and compiled), because it is an expression, not a module definition.
Is it just that i used a let in so result in the first piece of code is a local variable?
Pretty much. The syntax to define a global variable is let variable = expression without an in. The syntax to define a local variable is let variable = expression in expression which will define variable local to the expression after the in.
When you have let ... in, you're declaring a local variable. When you have just let by itself (at the top level of a module), you're declaring a global name of the module. (That is, a name that can be exported from the module.)
Your first example consists entirely of let ... in. So there is no top-level name declared.
Your second example has one let by itself, followed by several occurrences of let ... in. So it declares a top-level name result.
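The difference can be made concrete with a compilable file. This is a minimal sketch using shorter strings than the question; joined and piece are hypothetical names chosen so that the top-level name and the local ones are easy to tell apart.

```ocaml
let start = "ab"

(* Top-level definition: [joined] stays in scope for the rest of the file
   and is exported from the module. *)
let joined =
  (* [piece] is bound only inside this expression. *)
  let piece = start ^ "," in
  (* This rebinding shadows the previous [piece]. *)
  let piece = piece ^ piece in
  piece ^ start

(* [piece] is no longer in scope here, so the next line would not compile:
   let oops = piece *)

let () = print_endline joined
```

Only start and joined are visible to code after these definitions; every piece binding ended with the expression that introduced it.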

How to convert a string to integer list in ocaml?

I need to pass two lists as command line arguments in OCaml.
I used the following code to access it in the program.
let list1=Sys.argv.(1);;
let list2=Sys.argv.(2);;
I need to have the list1 and list2 as list of integers.
I am getting the error
This expression has type string but an expression was expected of type
int list
while processing.
How can I convert those arguments to lists of integers?
The arguments are passed in this format [1;2;3;4] [1;5;6;7]
Sys.argv.(n) will always be a string. You need to parse the string into a list of integers. You could try something like this:
$ ocaml
OCaml version 4.01.0
# #load "str.cma";;
# List.map int_of_string (Str.split (Str.regexp "[^0-9]+") "[1;5;6;7]");;
- : int list = [1; 5; 6; 7]
Of course this doesn't check the input for correct form. It just pulls out sequences of digits by brute force. To do better you need to do some real lexical analysis and simple parsing.
(Maybe this is obvious, but you could also test your function in the toplevel (the OCaml read-eval-print loop). The toplevel will handle the work of making a list from what you type in.)
As Sys.argv is a string array, you need to write your own transcription function.
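I guess the simplest way to do this is to use the Genlex module provided by the standard library. (Note that Genlex and Stream were removed from the standard library in OCaml 5.0; they are now available from the camlp-streams package.)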
let lexer = Genlex.make_lexer ["["; ";"; "]"; ]
let list_of_string s =
  let open Genlex in
  let open Stream in
  let stream = lexer (of_string s) in
  let fail () = failwith "Malformed string" in
  let rec aux acc =
    match next stream with
    | Int i ->
      ( match next stream with
        | Kwd ";" -> aux (i::acc)
        | Kwd "]" -> i::acc
        | _ -> fail () )
    | Kwd "]" -> acc
    | _ -> fail ()
  in
  try
    match next stream with
    | Kwd "[" -> List.rev (aux [])
    | _ -> fail ()
  with Stream.Failure -> fail ()
let list1 = list_of_string Sys.argv.(1)
let list2 = list_of_string Sys.argv.(2)
Depending on the OCaml flavor you want to use, some other library may look more interesting. If you like yacc, Menhir may solve your problem in a few lines of code.
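If neither Str nor Genlex is available, the same conversion can be sketched with plain String functions. This is one possible approach under stated assumptions, not a replacement for a real lexer: int_list_of_string is a hypothetical name, the input must be wrapped in square brackets with ';' between items, and a trailing ';' is rejected.

```ocaml
(* Parse text of the form [1;2;3] into an int list.
   Raises Invalid_argument when the brackets are missing or an item
   is not an integer. *)
let int_list_of_string s =
  let s = String.trim s in
  let n = String.length s in
  if n < 2 || s.[0] <> '[' || s.[n - 1] <> ']' then
    invalid_arg "int_list_of_string: expected [..]";
  (* Drop the surrounding brackets. *)
  let body = String.trim (String.sub s 1 (n - 2)) in
  if body = "" then []
  else
    String.split_on_char ';' body
    |> List.map (fun item ->
           match int_of_string_opt (String.trim item) with
           | Some i -> i
           | None -> invalid_arg "int_list_of_string: not an integer")
```

With this in place, the command-line arguments would be converted with int_list_of_string Sys.argv.(1) and int_list_of_string Sys.argv.(2). On most shells the arguments need quoting, since ';' is otherwise treated as a command separator.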

OCaml: how to use user defined types as key for Map.Make?

I have the following code, with which I intend to create a Map keyed on the self-defined types variable and location. I understand that the key type should be ordered (via some comparator function). How should I add these rules to make this work? Also, I find the code ugly. Do I really need the ;; at the end of a type and module?
type variable = string;;
type location = int;;
module LocationMap = Map.Make(variable);;
module EnvironmentMap = Map.Make(location);;
EDIT
This is the rest of my code:
type variable = Variable of string
type location = Location of int
module LocationMap = Map.Make(struct type t = variable let compare = compare end)
module EnvironmentMap = Map.Make(struct type t = variable let compare = compare end)
(*file read function*)
let read_file filename =
let lines = ref [] in
let chan = open_in filename in
try
while true do
lines := input_line chan :: !lines
done;
!lines
with End_of_file ->
close_in chan;
List.rev !lines
in
(*get the inputs*)
let inputs = read_file Sys.argv.(1) in
for i = 0 to List.length inputs - 1 do
Printf.printf "%s\n" (List.nth inputs i)
done;
This has a syntax error. I am not sure why.
EDIT2
I made this work with the following edit:
type variable = Variable of string
type location = Location of int
module LocationMap = Map.Make(struct type t = variable let compare = compare end)
module EnvironmentMap = Map.Make(struct type t = variable let compare = compare end)
(*file read function*)
let read_file filename =
let lines = ref [] in
let chan = open_in filename in
try
while true do
lines := input_line chan :: !lines
done;
!lines
with End_of_file ->
close_in chan;
List.rev !lines
(*get the inputs*)
let () =
let inputs = read_file Sys.argv.(1) in
for i = 0 to List.length inputs - 1 do
Printf.printf "%s\n" (List.nth inputs i)
done;
Sorry for the long list of questions, what does let () = do here? Is it true that when I define a function with let, I do not need in?
When applying the Map.Make functor, you need to supply a struct containing your type and a compare function:
module LocationMap =
Map.Make(struct type t = variable let compare = compare end)
module EnvironmentMap =
Map.Make(struct type t = location let compare = compare end)
You never need to use ;; in compiled code. It's only required when using the toplevel, to tell it when it should evaluate what you've typed in so far.
Some people do use ;; in compiled code, but you never need to do this and I personally never do. There is always a way to get the same effect without using ;;.
Update
The let compare = compare binds the pre-existing OCaml function compare (the infamous polymorphic comparison function) to the name compare inside the struct. So, it creates a Map that uses polymorphic compare to do its comparisons. This is often what you want.
I created a file containing your definitions (without ;;) and the above code, then compiled it with ocamlc -c. There were no syntax errors. I'm positive you don't need to use ;;, as I've written many many thousands of lines of code without it.
Note that I'm not saying that if you remove ;; from syntactically correct OCaml code, the result is always syntactically correct. There are a few idioms that only work when you use ;;. I personally just avoid those idioms.
Update 2
A let at top level of a module is special, and doesn't have an in. It defines a global value of the module. OCaml treats every source file as a module (for free, as I like to say), with a name that's the same as the source file name (capitalized).
You can actually have any pattern in let pattern = expression. So let () = ... is completely normal. It just says that the expression has unit type (hence the pattern matches).
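As a concrete illustration of the points above, here is a small self-contained sketch that gives Map.Make an explicit comparison function instead of the polymorphic compare, and uses let () = for the code that runs at program start. VariableKey and env are hypothetical names; the types are the ones from the question.

```ocaml
type variable = Variable of string
type location = Location of int

(* The functor argument needs a type [t] and a total order [compare]. *)
module VariableKey = struct
  type t = variable
  let compare (Variable a) (Variable b) = String.compare a b
end

module LocationMap = Map.Make (VariableKey)

(* Top-level definition, so no [in] is needed. *)
let env =
  LocationMap.empty
  |> LocationMap.add (Variable "x") (Location 0)
  |> LocationMap.add (Variable "y") (Location 1)

(* [let () = ...] runs the expression and checks that it has type unit. *)
let () =
  match LocationMap.find_opt (Variable "y") env with
  | Some (Location l) -> Printf.printf "y is at location %d\n" l
  | None -> print_endline "y is unbound"
```

Writing the comparison by hand is optional here, since polymorphic compare would also work for this key type. It becomes necessary when keys contain functions or need a custom order.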
