All substrings that are sequences of characters using functional programming

As a followup to my earlier question on finding runs of the same character in a string, I would also like to find a functional algorithm to find all substrings of length greater than 2 that are ascending or descending sequences of letters or digits (e.g., "defgh", "34567", "XYZ", "fedcba", "NMLK", "9876", etc.) in a character string ([Char]). The only sequences that I am considering are substrings of A..Z, a..z, 0..9, and their descending counterparts. The return value should be a list of (zero-based offset, length) pairs. I am translating the "zxcvbn" password strength algorithm from JavaScript (containing imperative code) to Scala. I would like to keep my code as purely functional as possible, for all the usual reasons given for writing in the functional programming style.
My code is written in Scala, but I can probably translate an algorithm in any of Clojure, F#, Haskell, or pseudocode.
Example: For the string "qweABCD13987", the function would return [(3,4),(9,3)].
I have written a rather monstrous function that I will post when I again have access to my work computer, but I am certain that a more elegant solution exists.
Once again, thanks.

I guess a nice solution for this problem is really more complicated than it seems at first.
I'm no Scala Pro, so my solution is surely not optimal and nice, but maybe it gives you some ideas.
The basic idea is to compute the difference between two consecutive characters; after that, it unfortunately gets a bit messy. Ask me if some of the code is unclear!
object Sequences {
  val s = "qweABCD13987"

  val pairs = (s zip s.tail).toList // if s might be empty, add a check here
  // = List((q,w), (w,e), (e,A), (A,B), (B,C), (C,D), (D,1), (1,3), (3,9), (9,8), (8,7))

  // assuming all characters are either letters or digits
  val diff = pairs map { case (t1, t2) =>
    if (t1.isLetter ^ t2.isLetter) 0 else t2 - t1 // xor could also be replaced by !=
  }
  // = List(6, -18, -36, 1, 1, 1, 0, 2, 6, -1, -1)

  /**
   * @param xs      A list indicating the differences between consecutive characters
   * @param current triple: (start index of the current sequence;
   *                number of current elements in the sequence;
   *                number indicating the direction i.e. -1 = downwards, 1 = upwards, 0 = doesn't matter)
   * @return A list of triples similar to the argument
   */
  def sequences(xs: List[Int], current: (Int, Int, Int) = (0, 1, 0)): List[(Int, Int, Int)] = xs match {
    case Nil => current :: Nil
    case (1 :: ys) =>
      if (current._3 != -1)
        sequences(ys, (current._1, current._2 + 1, 1))
      else
        current :: sequences(ys, (current._1 + current._2 - 1, 2, 1)) // "recompute" the current index
    case (-1 :: ys) =>
      if (current._3 != 1)
        sequences(ys, (current._1, current._2 + 1, -1))
      else
        current :: sequences(ys, (current._1 + current._2 - 1, 2, -1))
    case (_ :: ys) =>
      current :: sequences(ys, (current._1 + current._2, 1, 0))
  }

  // keep only sequences of length greater than 2, as (offset, length) pairs
  val result = sequences(diff) filter (_._2 > 2) map (t => (t._1, t._2))
  // = List((3,4), (9,3))
}
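For readers who want to try the difference idea outside Scala, here is a minimal Python sketch of the same approach. It is not a translation of the code above: the names char_class, step and find_sequences are made up for this illustration, and it assumes that only ASCII A..Z, a..z and 0..9 count, as stated in the question. Each neighbouring pair is classified as +1, -1 or 0, and equal neighbouring steps are then grouped.

```python
from itertools import groupby


def char_class(c):
    # Hypothetical helper: only ASCII A-Z, a-z and 0-9 are considered.
    if "a" <= c <= "z":
        return "lower"
    if "A" <= c <= "Z":
        return "upper"
    if "0" <= c <= "9":
        return "digit"
    return None


def step(a, b):
    """+1 for an ascending neighbour pair, -1 for a descending one, 0 otherwise."""
    cls = char_class(a)
    if cls is None or cls != char_class(b):
        return 0
    d = ord(b) - ord(a)
    return d if d in (1, -1) else 0


def find_sequences(s, min_len=3):
    """Return (zero-based offset, length) pairs of ascending/descending runs."""
    steps = [step(a, b) for a, b in zip(s, s[1:])]
    result = []
    pos = 0  # index of the left character of the first pair in the group
    for direction, group in groupby(steps):
        n = len(list(group))
        # n equal steps in a row cover n + 1 characters
        if direction != 0 and n + 1 >= min_len:
            result.append((pos, n + 1))
        pos += n
    return result


print(find_sequences("qweABCD13987"))
```

A run that turns around, as in "abcba", is reported as two overlapping sequences that share the turning character.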

It's always best to split a problem into several smaller subproblems. I wrote a solution in Haskell, which is easier for me. It uses lazy lists, but I suppose you can convert it to Scala either using streams or by making the main function tail recursive and passing the intermediate result as an argument.
-- Mark all subsequences whose adjacent elements satisfy
-- the given predicate. Includes subsequences of length 1.
sequences :: (Eq a) => (a -> a -> Bool) -> [a] -> [(Int,Int)]
sequences p [] = []
sequences p (x:xs) = seq x xs 0 1
  where
    -- arguments: previous char, current tail sequence,
    -- start offset of the current subsequence, offset of the next char
    seq _ [] lastOffs curOffs = [(lastOffs, curOffs - lastOffs)]
    seq x (x':xs) lastOffs curOffs
      | p x x'    -- predicate matches - we're extending current subsequence
      = seq x' xs lastOffs curOffs'
      | otherwise -- output the currently marked subsequence and start a new one
      = (lastOffs, curOffs - lastOffs) : seq x' xs curOffs curOffs'
      where
        curOffs' = curOffs + 1

-- Marks ascending subsequences.
asc :: (Enum a, Eq a) => [a] -> [(Int,Int)]
asc = sequences (\x y -> succ x == y)

-- Marks descending subsequences.
desc :: (Enum a, Eq a) => [a] -> [(Int,Int)]
desc = sequences (\x y -> pred x == y)

-- Returns True for subsequences of length greater than 2,
-- which is what the question asks for.
validRange :: (Int, Int) -> Bool
validRange (offs, len) = len > 2

-- Find all both ascending and descending subsequences of the
-- proper length.
combined :: (Enum a, Eq a) => [a] -> [(Int,Int)]
combined xs = filter validRange (asc xs) ++ filter validRange (desc xs)

-- test:
main = print $ combined "qweABCD13987"
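The same decomposition (one generic function that marks runs for any predicate, plus small predicates on top) can be sketched in Python with a generator, which plays the role of the lazy list. This is only an illustration under assumptions: the names runs, ascending, descending and combined are hypothetical, and like succ/pred above the predicates compare raw code points, so they do not restrict the input to letters and digits.

```python
def runs(pred, xs):
    """Lazily yield (offset, length) for each maximal run whose
    neighbouring elements satisfy pred. Runs of length 1 are included."""
    it = iter(xs)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input: no runs at all
    start = 0   # offset where the current run begins
    offset = 0  # offset of the element held in prev
    for cur in it:
        offset += 1
        if not pred(prev, cur):
            yield (start, offset - start)
            start = offset
        prev = cur
    yield (start, offset + 1 - start)


def ascending(a, b):
    return ord(b) - ord(a) == 1


def descending(a, b):
    return ord(a) - ord(b) == 1


def combined(s, min_len=3):
    """Ascending and descending runs of at least min_len, sorted by offset."""
    found = [r
             for pred in (ascending, descending)
             for r in runs(pred, s)
             if r[1] >= min_len]
    return sorted(found)


print(combined("qweABCD13987"))
```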

Here is my approximation in Clojure:
We can transform the input string so we can apply your previous algorithm to find a solution. The algorithm won't be the most performant, but I think you will end up with more abstract and readable code.
The example string can be transformed in the following way:
user => (find-serials "qweABCD13987")
(0 1 2 # # # # 7 8 # # #)
Reusing the previous function "find-runs":
user => (find-runs (find-serials "qweABCD13987"))
([3 4] [9 3])
The final code will look like this:
(defn find-runs [s]
  (let [ls (map count (partition-by identity s))]
    (filter #(>= (% 1) 3)
            (map vector (reductions + 0 ls) ls))))

(def pad "#")

(defn inc-or-dec? [a b]
  (= (Math/abs (- (int a) (int b))) 1))

(defn serial? [a b c]
  (or (inc-or-dec? a b) (inc-or-dec? b c)))

(defn find-serials [s]
  (map-indexed (fn [x [a b c]] (if (serial? a b c) pad x))
               (partition 3 1 (concat pad s pad))))
find-serials creates a 3 cell sliding window and applies serial? to detect the cells that are the beginning/middle/end of a sequence. The string is conveniently padded so the window is always centered over the original characters.

Related

Unexpected output type

I am practicing F#. I am trying to create a simple program that finds a pair of prime numbers that, summed together, equal a natural number input. This is the Goldbach conjecture. A single pair of primes will be enough. We will assume the input to be an even number.
I first created a function to check if a number is prime:
let rec isPrime (x: int) (i: int) : bool =
    match x % i with
    | _ when float i > sqrt (float x) -> true
    | 0 -> false
    | _ -> isPrime x (i + 1)
Then, I am trying to develop a function that (a) looks for prime numbers, (b) compares their sum with the input 'z' and (c) returns a tuple when it finds the two numbers. The function is probably not correct yet, but I would like to understand the reason behind this problem:
let rec sumPrime (z: int) (j: int) (k: int) : int * int =
    match isPrime j, isPrime k with
    | 0, 0 when j + k > z -> (0, 0)
    | 0, 0 -> sumPrime (j + 1) (k + 1)
    | _, 0 -> sumPrime j (k + 1)
    | 0, _ -> sumPrime (j + 1) k
    | _, _ -> if j + k < z then
                  sumPrime (j + 1) k
              elif j + k = z then
                  (j, k)
The problem: even though I specified that the output should be a tuple (: int * int), the compiler protests, claiming that the expected output should be of type bool. When in trouble, I usually refer to F# for fun and profit, which I love, but this time I cannot figure out the problem. Any suggestion is greatly appreciated.
Your code has three problems that I've spotted:
1. Your isPrime returns a bool (as you've specified), but your match expression in sumPrime is matching against integers (in F#, the Boolean value false is not the same as the integer value 0). Your match expression should look like:
    match isPrime j, isPrime k with
    | false, false when j + k > z -> (0, 0)
    | false, false -> ...
    | true, false -> ...
    | false, true -> ...
    | true, true -> ...
2. You have an if...elif expression in your true, true case, but there's no final else. By default, the final else of an if expression returns (), the unit type. So once you fix your first problem, you'll find that F# is complaining about a type mismatch between int * int and unit. You'll need to add an else condition to your final match case to say what to do if j + k > z.
3. You are repeatedly calling your sumPrime function, which takes three parameters, with just two parameters. That is perfectly legal in F#, since it's a curried language: calling sumPrime with two parameters produces the type int -> int * int: a function that takes a single int and returns a tuple of ints. But that's not what you're actually trying to do. Make sure you specify a value for z in all your recursive calls.
With those three changes, you should probably see your compiler errors go away.
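To make the intended end result concrete, here is a rough Python sketch of the search itself. It is not the F# fix described above but a simpler swapped-in strategy: instead of advancing two counters j and k, it walks j upwards and derives k as z - j. The names is_prime and goldbach_pair are invented for this example, and returning None when no pair exists is an assumption (the question uses (0, 0) for that case).

```python
def is_prime(n):
    """Trial division up to the square root of n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def goldbach_pair(z):
    """Return one pair of primes (j, k) with j + k == z and j <= k,
    or None if there is no such pair."""
    for j in range(2, z // 2 + 1):
        k = z - j
        if is_prime(j) and is_prime(k):
            return (j, k)
    return None


print(goldbach_pair(28))
```

Every branch of goldbach_pair returns either a pair or None, which is the same discipline the F# compiler enforces when it insists that all match cases and the if/elif/else have one common type.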

Average calculating of consecutive list elements in OCaml

I am trying to write a function in OCaml that will calculate the average of consecutive elements in a list. For example with [1; 2; 3; 4] it should output [1; 2; 3]. It should take (1 + 2) / 2 and give 1 then take (2 + 3) / 2 and give 2 and so on.
The code I wrote, however, only returns [1; 2]:
let rec average2 xs = match xs with
  | [] -> []
  | x :: [] -> [x]
  | x :: x' :: xs -> if xs = [] then [(x + x') / 2] else [(x + x') / 2] @ (average2 xs)
Can you please tell me how to fix this. Thank you.
When you're doing x :: y :: l in a match, you're effectively taking out the elements of the list permanently.
So if you want to do an operation on pairs of elements, you need to put one back in.
Example:
You have a list of [1;2;3;4]
You want to operate on 1 and 2, in your match it will interpret as:
1 :: 2 :: [3;4]
If you continue without adding an element in, the next statement would be:
3 :: 4 :: []
which is not what you want.
To correct this, in your recursive call you need to do average2 (x' :: xs) and not just average2 xs, because xs is the rest of the list after taking both elements out.
OCaml lets you bind a pattern p to a variable v using p as v (alias patterns):
let rec average2 = function
  | x :: (y :: _ as tail) -> (x + y) / 2 :: (average2 tail)
  | _ -> []
Above, y :: _ as tail destructures a list named tail as a non-empty list headed by y and having an arbitrary tail _, the value of which we don't care about.
Note that I also simplified your function so that you don't check whether _ is empty or not: recursion handles this for you here.
Also, when you have zero or one element in the list, you should return an empty list.
# average2 [ 10; 20; 30; 40];;
- : int list = [15; 25; 35]
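The "put one element back" idea is the same as pairing the list with its own tail. A minimal Python sketch of that view follows; the function name simply mirrors the OCaml one. One assumption to be aware of: Python's // rounds towards negative infinity, while OCaml's / on ints truncates towards zero, so the two only agree when the sums are non-negative.

```python
def average2(xs):
    """Integer average of each pair of consecutive elements."""
    # zip(xs, xs[1:]) yields (x0, x1), (x1, x2), ... so every element
    # except the first and last takes part in two pairs.
    return [(a + b) // 2 for a, b in zip(xs, xs[1:])]


print(average2([10, 20, 30, 40]))
```

Lists with zero or one element produce an empty result, matching the final OCaml version.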

How to calculate 5^262144 in Erlang

Based on THIS question, I realized that calculating such numbers seems not possible in regular ways.
Any suggestions?
It is possible, but you need an algorithm that is a bit more clever than the naive solution. If you write the naive power function, you do something along the lines of:
pow(_, 0) -> 1;
pow(A, 1) -> A;
pow(A, N) -> A * pow(A, N-1).
which just unrolls the power function. But the problem is that in your case, that will be 262144 multiplications, on increasingly larger numbers. The trick is a pretty simple insight: if you divide N by 2, and square A, you almost have the right answer, except if N is odd. So if we add a fixing term for the odd case, we obtain:
-module(z).
-compile(export_all).
pow(_, 0) -> 1;
pow(A, 1) -> A;
pow(A, N) ->
B = pow(A, N div 2),
B * B * (case N rem 2 of 0 -> 1; 1 -> A end).
This completes almost instantly on my machine:
2> element(1, timer:tc(fun() -> z:pow(5, 262144) end)).
85568
Of course, if you are doing many such operations, 85 ms each is hardly acceptable. But computing this once is actually rather fast.
(if you want more information, take a look at: https://en.wikipedia.org/wiki/Exponentiation_by_squaring )
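For comparison, the halving trick looks like this in Python, which also has arbitrary-precision integers. This is just a sketch with a made-up function name, assuming a non-negative integer exponent; the recursion depth is only about log2(n), so 262144 needs under 20 nested calls.

```python
def power(a, n):
    """Exponentiation by squaring for a non-negative integer exponent n."""
    if n == 0:
        return 1
    half = power(a, n // 2)
    # Squaring covers the even part; an odd n needs one extra factor of a.
    return half * half * (a if n % 2 else 1)


# Compare against the built-in operator instead of printing the huge number.
print(power(5, 262144) == 5 ** 262144)
```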
If you are interested in how to compute the power using the same algorithm as in I GIVE CRAP ANSWERS's solution, but in tail-recursive code, here it is:
power(X, 0) when is_integer(X) -> 1;
power(X, Y) when is_integer(X), is_integer(Y), Y > 0 ->
Bits = bits(Y, []),
power(X, Bits, X).
power(_, [], Acc) -> Acc;
power(X, [0|Bits], Acc) -> power(X, Bits, Acc*Acc);
power(X, [1|Bits], Acc) -> power(X, Bits, Acc*Acc*X).
bits(1, Acc) -> Acc;
bits(Y, Acc) ->
bits(Y div 2, [Y rem 2 | Acc]).
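A tail-recursive function with an accumulator is essentially a loop, so the bit-by-bit variant can be sketched in Python as a plain for loop. This is an approximation of the idea rather than a line-by-line port: the name power_iter is invented, it starts the accumulator at 1 and also consumes the leading bit, and it takes the bits from Python's bin() instead of building a list.

```python
def power_iter(x, y):
    """Left-to-right binary exponentiation for a non-negative integer y."""
    if y < 0:
        raise ValueError("exponent must be non-negative")
    acc = 1
    # bin(13) == '0b1101'; walk the bits from most to least significant.
    for bit in bin(y)[2:]:
        acc = acc * acc      # shifting the exponent left squares the result
        if bit == "1":
            acc = acc * x    # a set bit contributes one more factor of x
    return acc


print(power_iter(5, 13) == 5 ** 13)
```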
It is simple: since Erlang uses arbitrary-precision integers (big numbers), you can define your own pow function for integers, for example:
-module(test).
-export([int_pow/2]).
int_pow(N,M)->int_pow(N,M,1).
int_pow(_,0,R) -> R;
int_pow(N,M,R) -> int_pow(N,M-1,R*N).
Note, I did not check the arguments and showed the implementation for your example.
You can do this in Elixir:
defmodule Pow do
def powa(x, n), do: powa(x, n, 1)
def powa(_, 0, acc), do: acc
def powa(x, n, acc), do: powa(x, n-1, acc * x)
end
Apparently,
Pow.powa(5, 262144) |> to_string |> String.length
yields 183231, the number of digits in the number that you were curious about.

Where is nat base 10 converted to num base 2?

For term "15::nat", the value 15 is automatically converted to the binary value (num.Bit1 (num.Bit1 (num.Bit1 num.One))). I would like to know where that's done, so I can know how it's done.
(Small update: I know that 15 is a type class numeral constant, which gets converted to binary Num.num, which gets mapped to nat, so maybe the nat is decimal, or maybe it's binary. However, my basic question remains the same. Where is the decimal to binary conversion done?)
I show below how I know about the conversion.
I define notation to show me that Num.numeral :: (num => 'a) is coercing 15 to Num.num.
abbreviation nat_of_numeral :: "num => nat" where
"nat_of_numeral n == (numeral n)"
notation nat_of_numeral ("n#N|_" [1000] 1000)
Next, 15 gets coerced to binary in a term command:
term "15::nat"
(*The output:*)
term "n#N|(num.Bit1 (num.Bit1 (num.Bit1 num.One))) :: nat"
And next, 15 gets coerced before it gets used in a proof goal:
lemma "15 = n#N|(num.Bit1 (num.Bit1 (num.Bit1 num.One)))" (*
goal (1 subgoal):
1. n#N|(num.Bit1 (num.Bit1 (num.Bit1 num.One))) =
n#N|(num.Bit1 (num.Bit1 (num.Bit1 num.One))) *)
by(rule refl)
The conversion seems to be decently fast, as shown by this:
(*140 digits: 40ms*)
term "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
::nat"
I also want to convert base 2 to base 10, but if I see how the above is done, it might show me how to do that.
Here's the overview of how I think it's done.
It starts in the Num.thy parse_translation, at the ML function numeral_tr.
In that function, there is the use of Lexicon.read_xnum of lexicon.ML, which takes a string argument.
I don't know the details, but string "15" is extracted from an expression like
"15 = n#N|(num.Bit1 (num.Bit1 (num.Bit1 num.One)))",
and fed to read_xnum, where there is this equivalency:
Lexicon.read_xnum "15" = {leading_zeros = 0, radix = 10, value = 15}
In read_xnum the function Library.read_radix_int of library.ML is used, and it takes as arguments an integer radix and a list of digits, such as shown in this equivalency:
Library.read_radix_int 10 ["1","5"] = (15, []);
Next, the conversion from 15 to (num.Bit1 (num.Bit1 (num.Bit1 num.One))) is a result of IntInf.quotRem in the parse_translation.
This takes me out of convenient HTML linking for isabelle.in.tum.de/repos.
IntInf.quotRem is in Isabelle2013-2\contrib\polyml-5.5.1-1\src\basis\IntInf.sml, and is defined as
val quotRem: int*int->int*int = RunCall.run_call2C2 POLY_SYS_quotrem,
which leads to Isabelle2013-2\contrib\polyml-5.5.1-1\src\basis\RuntimeCalls.ML line 83:
val POLY_SYS_quotrem = 104 (* DCJM 05/03/10 *).
For Windows that takes me to Isabelle2013-2\contrib\polyml-5.5.1-1\src\libpolyml\x86asm.asm line 1660, though I could be leaving out some important details:
quotrem_really_long:
MOVL Reax,Redi
CALLMACRO CALL_IO POLY_SYS_quotrem
CALLMACRO RegMask quotrem,(M_Reax OR M_Redi OR M_Redx OR Mask_all).
I think that's enough of an overview to answer my question. A string "15" is converted to an ML integer, and then some assembly language level quotient/remainder division is used to convert 15 to binary Num.num.
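To check my reading of the quotient/remainder step, here is a small Python model of it. It is only a sketch under assumptions: the function name num_of_int is borrowed from the parse translation, the result is a plain string rather than an Isabelle term, and divmod stands in for IntInf.quotRem (they agree for positive arguments).

```python
def num_of_int(n):
    """Build the Num.num spelling of a positive integer as a string."""
    if n <= 0:
        raise ValueError("only positive integers have a num representation")
    q, r = divmod(n, 2)
    if q == 0:
        return "One"             # n == 1, the most significant bit
    inner = num_of_int(q)        # the remaining, more significant bits
    if inner != "One":
        inner = "(" + inner + ")"
    # The remainder is the least significant bit and picks the constructor.
    return ("Bit1 " if r else "Bit0 ") + inner


print(num_of_int(15))
```

The outermost constructor carries the least significant bit, which is why 6 (binary 110) comes out as Bit0 (Bit1 One).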
I'm actually only interested in how "15" is converted to 15 in read_radix_int, and whether the details of that function will help me. I explain the application more below.
From here, I put in more detail for myself, to put in a nice form much of the information I collected.
What's sort of the deal
I start with a binary number as a bool list, something like [True, True, True, True] for binary 15, though here, I simplify it in some ways.
That then gets converted to something like [[True, False, True, False], [True, False, True]], which is decimal 15.
The real problem can be finding the right search phrase
Searching on something like "decimal to binary conversion" returns links to a lot of basic math algorithms, which aren't what I'm looking for.
Normal programming conversions aren't what I need either, which is just making explicit the underlying fact that integers are already in binary form:
Convert from base 10 to base 2 using bitwise operations
C: Convert decimal to binary
Finally, other searches led me to the right word, "radix". Additionally, I resorted to doing searches on how things are done in C, where bit shifts are what I'm trying to tie into, though these may not be what I need:
Binary to decimal and decimal to binary base conversion using bitwise operators in C
C Program to Convert Decimal to Binary using Bitwise AND operator
Radix leading back to Num.thy
"Radix" took me back to Num.thy, which is where I thought the action might be, but hadn't seen anything that was obvious to me.
I include some source from Num.thy, lexicon.ML, and IntInf.sml:
(* THE TWO MAIN EXTERNAL FUNCTIONS IN THE TRANSLATIONS: read_xnum, quotRem *)
ML{*
Lexicon.read_xnum; (* string ->
{leading_zeros: int, radix: int, value: int}.*)
Lexicon.read_xnum "15"; (* {leading_zeros = 0, radix = 10, value = 15}.*)
Lexicon.read_xnum "15" = {leading_zeros = 0, radix = 10, value = 15};
IntInf.quotRem; (* int * int -> int * int.*)
IntInf.quotRem (5,3); (* (1, 2) *)
*}
parse_translation {* (* Num.thy(293) *)
let
  fun num_of_int n =
    if n > 0 then
      (case IntInf.quotRem (n, 2) of
        (0, 1) => Syntax.const @{const_name One}
      | (n, 0) => Syntax.const @{const_name Bit0} $ num_of_int n
      | (n, 1) => Syntax.const @{const_name Bit1} $ num_of_int n)
    else raise Match
  val pos = Syntax.const @{const_name numeral}
  val neg = Syntax.const @{const_name neg_numeral}
  val one = Syntax.const @{const_name Groups.one}
  val zero = Syntax.const @{const_name Groups.zero}
  fun numeral_tr [(c as Const (@{syntax_const "_constrain"}, _)) $ t $ u] =
        c $ numeral_tr [t] $ u
    | numeral_tr [Const (num, _)] =
        let
          val {value, ...} = Lexicon.read_xnum num;
        in
          if value = 0 then zero else
          if value > 0
          then pos $ num_of_int value
          else neg $ num_of_int (~value)
        end
    | numeral_tr ts = raise TERM ("numeral_tr", ts);
in [("_Numeral", K numeral_tr)] end
*}
ML{* (* lexicon.ML(367) *)
(* read_xnum: hex/bin/decimal *)
local
val ten = ord "0" + 10;
val a = ord "a";
val A = ord "A";
val _ = a > A orelse raise Fail "Bad ASCII";
fun remap_hex c =
let val x = ord c in
if x >= a then chr (x - a + ten)
else if x >= A then chr (x - A + ten)
else c
end;
fun leading_zeros ["0"] = 0
| leading_zeros ("0" :: cs) = 1 + leading_zeros cs
| leading_zeros _ = 0;
in
fun read_xnum str =
let
val (sign, radix, digs) =
(case Symbol.explode (perhaps (try (unprefix "#")) str) of
"0" :: "x" :: cs => (1, 16, map remap_hex cs)
| "0" :: "b" :: cs => (1, 2, cs)
| "-" :: cs => (~1, 10, cs)
| cs => (1, 10, cs));
in
{radix = radix,
leading_zeros = leading_zeros digs,
value = sign * #1 (Library.read_radix_int radix digs)}
end;
end;
*}
ML{* (* IntInf.sml(42) *)
val quotRem: int*int->int*int = RunCall.run_call2C2 POLY_SYS_quotrem
*}
A big part of what I was looking for; radix again
(* THE FUNCTION WHICH TRANSLATES A LIST OF DIGITS/STRINGS TO A ML INTEGER *)
ML{*
Library.read_radix_int; (* int -> string list -> int * string list *)
Library.read_radix_int 10 ["1","5"]; (* (15, []): int * string list.*)
Library.read_radix_int 10 ["1","5"] = (15, []);
*}
ML{* (* library.ML(670) *)
fun read_radix_int radix cs =
let
val zero = ord "0";
val limit = zero + radix;
fun scan (num, []) = (num, [])
| scan (num, c :: cs) =
if zero <= ord c andalso ord c < limit then
scan (radix * num + (ord c - zero), cs)
else (num, c :: cs);
in scan (0, cs) end;
*}
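Read as an algorithm, read_radix_int is a left fold over the digits: multiply the running value by the radix and add the next digit, stopping at the first character that is not a valid digit. A Python sketch of that scan, keeping the same name and argument order for easy comparison (the list-of-one-character-strings input is modelled on the ML version):

```python
def read_radix_int(radix, cs):
    """Consume leading digits of cs in the given radix.

    Returns (value, rest), where rest is the unconsumed tail of cs.
    """
    zero = ord("0")
    limit = zero + radix
    num = 0
    for i, c in enumerate(cs):
        if not (zero <= ord(c) < limit):
            return num, cs[i:]   # stop at the first non-digit
        num = radix * num + (ord(c) - zero)
    return num, []


print(read_radix_int(10, ["1", "5"]))
```

Nothing here is specific to base 10: the same scan with radix 2 turns the digits 1, 1, 1, 1 into 15, and no division is involved in this direction.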
The low-level division action
There's the high-level Integer.div_mod in integer.ML, which wasn't used for the translation above:
fun div_mod x y = IntInf.divMod (x, y);
In Isabelle2013-2\contrib\polyml-5.5.1-1\src\basis\IntInf.sml, the higher-level divMod can be compared to the lower-level quotRem:
ML{* (* IntInf.sml(42) *)
val quotRem: int*int->int*int = RunCall.run_call2C2 POLY_SYS_quotrem
(* This should really be defined in terms of quotRem. *)
fun divMod(i, j) = (i div j, i mod j)
*}
With the low-level action for quotRem apparently being done at the assembly language level:
ML{* (* RuntimeCalls.ML(83) *)
val POLY_SYS_quotrem = 104 (* DCJM 05/03/10 *)
*}
(* x86asm.asm(1660)
quotrem_really_long:
MOVL Reax,Redi
CALLMACRO CALL_IO POLY_SYS_quotrem
CALLMACRO RegMask quotrem,(M_Reax OR M_Redi OR M_Redx OR Mask_all)
*)
These various forms of div and mod are important to me.
I'm thinking that div and mod are to be avoided if possible, but tying into quotRem would be the way to go, I think, if division is unavoidable.

polynomial equation standard ml

I'm trying to make a function that will solve a univariate polynomial equation in Standard ML, but it keeps giving me an error.
The code is below
(* Eval Function *)
- fun eval (x::xs, a:real):real =
let
val v = x (* The first element, since its not multiplied by anything *)
val count = 1 (* We start counting from the second element *)
in
v + elms(xs, a, count)
end;
(* Helper Function*)
- fun pow (base:real, 0) = 1.0
| pow (base:real, exp:int):real = base * pow(base, exp - 1);
(* A function that solves the equation except the last element in the equation, the constant *)
- fun elms (l:real list, a:real, count:int):real =
if (length l) = count then 0.0
else ((hd l) * pow(a, count)) + elms((tl l), a, count + 1);
Now the input should be the coefficients of the polynomial's terms and a number to substitute for the variable, i.e. if we have the function 3x^2 + 5x + 1, and we want to substitute x by 2, then we would call eval as follows:
eval ([1.0, 5.0, 3.0], 2.0);
and the result should be 23.0, but sometimes on different input it gives me different answers, and on this input it gives me the following error:
uncaught exception Empty raised at:
smlnj/init/pervasive.sml:209.19-209.24
what could be my problem here?
Empty is raised when you run hd or tl on an empty list. hd and tl are almost never used in ML; lists are almost always deconstructed using pattern matching instead; it's much prettier and safer. You don't seem to have a case for empty lists, and I didn't go through your code to figure out what you did, but you should be able to work it out yourself.
After some recursive calls, the elms function gets an empty list as its argument. Since count is always greater than 0, (length l) = count is false for the empty list, so hd and tl are called on an empty list and fail right after that.
A good way to fix it is using pattern matching to handle empty lists on both eval and elms:
fun elms ([], _, _) = 0.0
| elms (x::xs, a, count) = (x * pow(a, count)) + elms(xs, a, count + 1)
fun eval ([], _) = 0.0
| eval (x::xs, a) = x + elms(xs, a, 1)
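As a cross-check of the expected value, the same evaluation can be sketched in Python with the coefficients given lowest power first, as in the question. Both function names are hypothetical. The second function uses Horner's rule, which is a different technique from the pattern-matching fix above: it avoids computing powers by working from the highest coefficient down.

```python
def eval_poly(coeffs, a):
    """Sum of c_i * a**i, with coeffs ordered from the constant term up."""
    return sum(c * a ** i for i, c in enumerate(coeffs))


def eval_horner(coeffs, a):
    """Same value via Horner's rule: ((c_n * a + c_(n-1)) * a + ...) + c_0."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * a + c
    return acc


# 3x^2 + 5x + 1 at x = 2
print(eval_poly([1.0, 5.0, 3.0], 2.0), eval_horner([1.0, 5.0, 3.0], 2.0))
```

Both versions return 0 for an empty coefficient list, which corresponds to the [] cases added in the SML fix.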

Resources