Functional opposite of flatmap? - functional-programming

You know how flatmap takes a sequence of items and converts each one into a new subsequence, aggregating all the subsequences:
[A, B, C] -> [A1, A2, B1, B2, B3, C1]
Is there a name for the transform which does the opposite? Something like:
[A1, A2, B1, B2, B3, C1] -> [A, B, C]
The specific example that got me thinking about this was doing evaluation of mathematical expressions:
1 * 2 + 3 * 4 + 5 + 6 * 7 * 8
-> 2 + 12 + 5 + 336
-> 355
Individually, the evaluation of the 6 * 7 * 8 seems like a classic reduce step, while deciding which blocks need to be reduced would need repeated takeWhile steps.
I know how to do this in the classic iterative way, keeping track of indices and all that. For most cases, I've found a nice functional replacement for most iterative patterns. Is there a name for a single operation that does this, or a simple set of operations which can be composed to create this effect?

I think the opposite of flatmap is groupby.
$ python3
>>> from itertools import groupby
>>> groupby(['A1', 'A2', 'B1', 'B2', 'B3', 'C1'], lambda x: x[0])
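To make the suggestion a bit more concrete, here is a minimal Python sketch of both uses: undoing the flatmap example, and evaluating the expression from the question. The helper names group_by_prefix and evaluate are made up for this illustration, and it assumes the expression is already tokenized and contains only integers, '*' and '+'. Note that groupby only merges adjacent items with equal keys, which is exactly what is wanted here.

```python
from functools import reduce
from itertools import groupby
from operator import mul


def group_by_prefix(items):
    """Collapse runs of adjacent items that share a first character."""
    return [(key, list(run)) for key, run in groupby(items, key=lambda x: x[0])]


def evaluate(tokens):
    """Evaluate a flat token list containing only integers, '*' and '+'."""
    # groupby splits the tokens into runs separated by '+' tokens.
    runs = (
        list(run)
        for is_plus, run in groupby(tokens, key=lambda t: t == "+")
        if not is_plus
    )
    # Each run looks like [6, '*', 7, '*', 8]: drop the '*' tokens and reduce.
    products = [reduce(mul, (t for t in run if t != "*")) for run in runs]
    return products, sum(products)


print(group_by_prefix(["A1", "A2", "B1", "B2", "B3", "C1"]))
print(evaluate([1, "*", 2, "+", 3, "*", 4, "+", 5, "+", 6, "*", 7, "*", 8]))
```

So the "takeWhile repeatedly" step becomes a single groupby, and each group is then handled by an ordinary reduce.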

Related

Calculation of quadratic form using broadcasting in Julia

I want to calculate a vector of quadratic forms, extracting the submatrices from a 3 by 3 by 5 array. However, I cannot compute the quadratic form using broadcasting (i.e., the "@." macro). With a “for” loop, I can calculate the vector of quadratic forms. I have no idea how to conduct matrix operations using “@.” (I am reluctant to expand the quadratic form by hand to calculate the vector.)
By contrast, the inner product is computable using “@.”.
The example code is as follows:
using LinearAlgebra
a1=[5 7 2; 2 1 5; 6 2 3]
a2=[2 7 1; 3 7 2; 1 2 3]
a3=[8 5 9; 1 1 3; 2 2 3]
a4=[2 5 6; 3 5 1; 1 1 1]
a5=[7 8 1; 5 1 3; 1 5 2]
z=cat(a1,a2,a3,a4,a5,dims=3)
##### case of inner product
x=zeros(5,3)
wz = reshape([],0)
for k in 1:5
w = hcat(z[[1],[1],k], z[2,2,k]) * hcat(z[[1],[1],k], z[[2],[2],k])'
#println(w)
wz=vcat(wz, w)
end
@. wz=convert(Float64,wz)
wz=Matrix{Float64}(wz)
x[:,3]=wz
# [inner product] same result, the 3rd column vector [26.0, 53.0, 65.0, 29.0, 50.0]
display(x)
x=zeros(5,3)
@. x[:,3] = dot(hcat(z[1,1,:],z[2,2,:]), hcat(z[1,1,:],z[2,2,:])) # ok, working
# [inner product] same result, the 3rd column vector [26.0, 53.0, 65.0, 29.0, 50.0]
display(x)
##### case of quadratic form
x=zeros(5,3)
wy = reshape([],0)
for k in 1:5
w = hcat(z[[1],[1],k], z[[2],[2],k]) * z[[1,3],[1,3],k] * hcat(z[[1],[1],k], z[[2],[2],k])'
#println(w)
wy=vcat(wy, w)
end
@. wy=convert(Float64,wy)
wy=Matrix{Float64}(wy)
x[:,3]=wy
# [quadratic form] distinct result, the 3rd column vector [168.0, 183.0, 603.0, 103.0, 359.0]
display(x)
# generating five 2 by 2 matrices, distinct result
@. dot(hcat(z[[1],[1],:],z[[2],[2],:]), z[[1,3],[1,3],:], hcat(z[[1],[1],:],z[[2],[2],:]))
# obtaining ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 2 and 5")
@. dot(hcat(z[1,1,:],z[2,2,:]), z[[1,3],[1,3],:], hcat(z[1,1,:],z[2,2,:]))
Could you suggest how to compute the 3rd column vector [168.0, 183.0, 603.0, 103.0, 359.0] (which comes from the quadratic form) in the above code using "@."?
EDIT:
Perhaps the question is about specifically how to make broadcasting work in this case. If so:
@views dot.(vcat.(z[1,1,:],z[2,2,:]),getindex.(Ref(z),Ref([1,3]),Ref([1,3]),axes(z,3)),vcat.(z[1,1,:],z[2,2,:]))
should be a possible clarification. Or with the @. macro (though it doesn't seem simpler):
@. dot(vcat(z[1,1,:],z[2,2,:]),getindex($Ref(z),$Ref([1,3]),$Ref([1,3]),$axes(z,3)),vcat(z[1,1,:],z[2,2,:]))
ORIGINAL:
One way to calculate this:
[
[z[1,1,k] z[2,2,k]]*z[[1,3],[1,3],k]*[z[1,1,k] z[2,2,k]]' |> first
for k ∈ axes(z,3)
]
giving:
5-element Vector{Int64}:
168
183
603
103
359
(the |> first turns 1x1 matrix into scalar)
Option 2:
[let t = z[[1,3],[1,3],k] ; sum(z[i,i,k]*t[i,j]*z[j,j,k] for i ∈ (1,2), j ∈ (1,2)) ; end for k ∈ 1:5]
or:
[let t = z[[1,3],[1,3],k], v = [z[1,1,k],z[2,2,k]] ; dot(v,t,v) ; end for k ∈ 1:5]
or (this is pretty cool):
map((z;t=z[[1,3],[1,3]],v=[z[1,1],z[2,2]])->dot(v,t,v), eachslice(z,dims=3))
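For anyone who wants to check the numbers independently of Julia, here is a small plain-Python sketch of the same computation. It is only a cross-check of the values, not a broadcasting solution; the names slices, inner_product and quadratic_form are made up for this illustration, and indices are 0-based, so Julia's z[[1,3],[1,3],k] corresponds to rows and columns 0 and 2 here.

```python
# The five 3x3 slices z[:, :, k] (a1 .. a5) from the question.
slices = [
    [[5, 7, 2], [2, 1, 5], [6, 2, 3]],
    [[2, 7, 1], [3, 7, 2], [1, 2, 3]],
    [[8, 5, 9], [1, 1, 3], [2, 2, 3]],
    [[2, 5, 6], [3, 5, 1], [1, 1, 1]],
    [[7, 8, 1], [5, 1, 3], [1, 5, 2]],
]


def inner_product(a):
    """v . v with v = [a[0][0], a[1][1]]."""
    v = [a[0][0], a[1][1]]
    return sum(x * x for x in v)


def quadratic_form(a):
    """v' * t * v with v as above and t the corner submatrix of a."""
    v = [a[0][0], a[1][1]]
    t = [[a[0][0], a[0][2]], [a[2][0], a[2][2]]]
    return sum(v[i] * t[i][j] * v[j] for i in range(2) for j in range(2))


print([inner_product(a) for a in slices])   # [26, 53, 65, 29, 50]
print([quadratic_form(a) for a in slices])  # [168, 183, 603, 103, 359]
```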

Concatenation of binary representation of first n positive integers in O(logn) time complexity

I came across this question in a coding competition. Given a number n, concatenate the binary representation of first n positive integers and return the decimal value of the resultant number formed. Since the answer can be large return answer modulo 10^9+7.
N can be as large as 10^9.
Eg:- n=4. Number formed=11011100(1=1,10=2,11=3,100=4). Decimal value of 11011100=220.
I found a stack overflow answer to this question but the problem is that it only contains a O(n) solution.
Link:- concatenate binary of first N integers and return decimal value
Since n can be up to 10^9 we need to come up with solution that is better than O(n).
Here's some Python code that provides a fast solution; it uses the same ideas as in Abhinav Mathur's post. It requires Python >= 3.8, but it doesn't use anything particularly fancy from Python, and could easily be translated into another language. You'd need to write algorithms for modular exponentiation and modular inverse if they're not already available in the target language.
First, for testing purposes, let's define the slow and obvious version:
# Modulus that results are reduced by.
M = 10 ** 9 + 7

def slow_binary_concat(n):
    """
    Concatenate binary representations of 1 through n (inclusive).
    Reinterpret the resulting binary string as an integer.
    """
    # Starting the range at 0 only adds a leading "0", which doesn't change the value.
    concatenation = "".join(format(k, "b") for k in range(n + 1))
    return int(concatenation, 2) % M
Checking that we get the expected result:
>>> slow_binary_concat(4)
220
>>> slow_binary_concat(10)
462911642
Now we'll write a faster version. First, we split the range [1, n) into subintervals such that within each subinterval, all numbers have the same length in binary. For example, the range [1, 10) would be split into four subintervals: [1, 2), [2, 4), [4, 8) and [8, 10). Here's a function to do that splitting:
def split_by_bit_length(n):
    """
    Split the numbers in [1, n) by bit-length.
    Produces triples (a, b, 2**k). Each triple represents a subinterval
    [a, b) of [1, n), with a < b, all of whose elements have bit-length k.
    """
    a = 1
    while n > a:
        b = 2 * a
        yield (a, min(n, b), b)
        a = b
Example output:
>>> list(split_by_bit_length(10))
[(1, 2, 2), (2, 4, 4), (4, 8, 8), (8, 10, 16)]
Now for each subinterval, the value of the concatenation of all numbers in that subinterval is represented by a fairly simple mathematical sum, which can be computed in exact form. Here's a function to compute that sum modulo M:
def subinterval_concat(a, b, l):
    """
    Concatenation of values in [a, b), all of which have the same bit-length k.
    l is 2**k.
    Equivalently, sum(i * l**(b - 1 - i) for i in range(a, b)) modulo M.
    """
    n = b - a
    inv = pow(l - 1, -1, M)
    q = (pow(l, n, M) - 1) * inv
    return (a * q + (q - n) * inv) % M
I won't go into the evaluation of the sum here: it's a bit off-topic for this site, and it's hard to express without a good way to render formulas. If you want the details, that's a topic for https://math.stackexchange.com, or a page of fairly simple algebra.
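If you'd rather convince yourself numerically than algebraically, here is a small self-contained sketch that compares the closed form against the direct sum. It restates the formula from the function above under the hypothetical name closed_form, and direct_sum is likewise just an illustrative helper. With n = b - a and q = (l**n - 1) / (l - 1), the identity being checked is: sum of i * l**(b - 1 - i) over i in [a, b) equals a*q + (q - n) / (l - 1).

```python
M = 10 ** 9 + 7


def closed_form(a, b, l):
    """Closed form of the sum, modulo M (same formula as in the answer)."""
    n = b - a
    inv = pow(l - 1, -1, M)          # modular inverse of l - 1 (Python >= 3.8)
    q = (pow(l, n, M) - 1) * inv     # geometric series 1 + l + ... + l**(n-1)
    return (a * q + (q - n) * inv) % M


def direct_sum(a, b, l):
    """The same quantity computed term by term; only feasible for small ranges."""
    return sum(i * l ** (b - 1 - i) for i in range(a, b)) % M


# The four subintervals of [1, 10) used as the running example.
for a, b, l in [(1, 2, 2), (2, 4, 4), (4, 8, 8), (8, 10, 16)]:
    print((a, b, l), closed_form(a, b, l), direct_sum(a, b, l))
```

For instance, the subinterval [2, 4) gives 11 both ways, which is the binary string 1011 (that is, "10" followed by "11").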
Finally, we want to put all the intervals together. Here's a function to do that.
def fast_binary_concat(n):
    """
    Fast version of slow_binary_concat.
    """
    acc = 0
    for a, b, l in split_by_bit_length(n + 1):
        acc = (acc * pow(l, b - a, M) + subinterval_concat(a, b, l)) % M
    return acc
A comparison with the slow version shows that we get the same results:
>>> fast_binary_concat(4)
220
>>> fast_binary_concat(10)
462911642
But the fast version can easily be evaluated for much larger inputs, where using the slow version would be infeasible:
>>> fast_binary_concat(10**9)
827129560
>>> fast_binary_concat(10**18)
945204784
You just have to note a simple pattern. Taking up your example for n=4, let's gradually build the solution starting from n=1.
1 -> 1 #1
2 -> 2^2(1) + 2 #6
3 -> 2^2[2^2(1)+2] + 3 #27
4 -> 2^3{2^2[2^2(1)+2]+3} + 4 #220
If you expand the coefficients of each term for n=4, you'll get the coefficients as:
1 -> (2^3)*(2^2)*(2^2)
2 -> (2^3)*(2^2)
3 -> (2^3)
4 -> (2^0)
Let N be the total number of bits in the string representation of our required number, and D(x) be the number of bits in x. The coefficients can then be written as
1 -> 2^(N-D(1))
2 -> 2^(N-D(1)-D(2))
3 -> 2^(N-D(1)-D(2)-D(3))
... and so on
Since D(x) is the same for every x in the range [2^t, 2^(t+1)-1] for a given t, you can break the problem into such ranges and solve each range using mathematics (not iteration). Since the number of such ranges is about log2(n) for the given input n, this should work in the given time limit.
As an example, the various ranges become:
1. 1 (D(x) = 1)
2. 2-3 (D(x) = 2)
3. 4-7 (D(x) = 3)
4. 8-15 (D(x) = 4)
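The step-by-step pattern at the start of this answer (1, 6, 27, 220) can be written as a short Python loop. This is only a sketch of the O(n) recurrence, useful for checking a faster range-based solution on small inputs; it is not the fast method itself, and the function name concat_by_recurrence is made up for this illustration.

```python
M = 10 ** 9 + 7


def concat_by_recurrence(n):
    """Append 1, 2, ..., n one at a time: acc -> acc * 2**D(k) + k (mod M)."""
    acc = 0
    for k in range(1, n + 1):
        # k.bit_length() is D(k), the number of bits in k.
        acc = ((acc << k.bit_length()) + k) % M
    return acc


print([concat_by_recurrence(n) for n in range(1, 5)])  # [1, 6, 27, 220]
```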

Prolog recursing through list of lists

I'm attempting a prolog question that states the following:
A magic square is a 3 × 3 matrix of distinct numbers (between 1 and 9) such that all rows and columns add up to the same total (but not necessarily the diagonals). For example:
2 7 6
9 5 1
4 3 8
is a magic square.
We will represent squares in Prolog as 3 × 3 matrices, i.e. lists of lists [R1, R2, R3] where each R_i is a list of three numbers. For example, the representation of the above magic square is
[[2,7,6],[9,5,1],[4,3,8]]
Define a predicate magic/1 that tests whether a ground 3 × 3 matrix (i.e. where all the entries are numbers) is a magic square.
I've done this the following way, and I'm pretty sure it's allowed if I also do it like this in an exam, however to me it seems like sort of a hack:
magic([[A,B,C], [D,E,F], [G,H,I]]) :-
Y is A + B + C,
Y is D + E + F,
Y is G + H + I,
Y is A + D + G,
Y is B + E + H,
Y is C + F + I.
My desired way would be to recurse through each list in the outer list and sum it up. Each of the lists in the outer list should sum to the same value (I think 15 is actually the only possible total for this "magic" matrix). Likewise, I'd do the same for the columns (take the first, second and third element of each list and add them up respectively). However, I'm not entirely sure how to do the latter, as I haven't worked with lists of lists much. I would appreciate it if anybody could give a neat solution showing how this sort of computation can be done in general.
Thanks
Note that your solution does not check that A, ..,I values are distinct and in the range 1..9. Here is a solution for NxN squares for N > 2:
magic(L) :-
magic_range(L),
magic_sum(S, L),
magic_line(S, L),
transpose(L, T),
magic_line(S, T).
% S value from https://oeis.org/A006003
magic_sum(S, L) :-
length(L, N),
S is N * (N*N + 1) // 2.
magic_range(L) :-
flatten(L, F),
sort(F, S),
length(L, N),
N2 is N * N,
numlist(1, N2, S).
magic_line(_, []).
magic_line(S, [A | As]) :-
sumlist(A, S),
magic_line(S, As).
% https://github.com/SWI-Prolog/swipl-devel/blob/9452af09962000ebb5157fe06169bbf51af5d5c9/library/clp/clpfd.pl#L6411
transpose(Ls, Ts) :-
must_be(list(list), Ls),
lists_transpose(Ls, Ts).
lists_transpose([], []).
lists_transpose([L|Ls], Ts) :-
foldl(transpose_, L, Ts, [L|Ls], _).
transpose_(_, Fs, Lists0, Lists) :-
maplist(list_first_rest, Lists0, Fs, Lists).
list_first_rest([L|Ls], L, Ls).
Some queries
?- magic([[1,1,1],[1,1,1],[1,1,1]]).
false.
?- magic([[2,7,6],[9,5,1],[4,3,8]]).
true ;
false.
?- magic([[16,3,2,13], [5,10,11,8], [9,6,7,12], [4,15,14,1]]).
true ;
false.
The transpose predicate is the most complicated part. See here for some alternatives.
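For comparison, the same overall structure (range check, target sum, rows, then transposed rows) can be sketched in a few lines of Python, where zip(*square) plays the role of transpose. This is an illustrative translation rather than a replacement for the Prolog above, and the name is_magic is made up here.

```python
def is_magic(square):
    """Check an n x n list of lists: entries are 1..n*n, all rows and columns sum alike."""
    n = len(square)
    flat = [x for row in square for x in row]
    # Entries must be exactly the distinct numbers 1 .. n*n.
    if sorted(flat) != list(range(1, n * n + 1)):
        return False
    target = n * (n * n + 1) // 2
    columns = zip(*square)  # transpose: the i-th tuple is the i-th column
    return all(sum(line) == target for line in [*square, *columns])


print(is_magic([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))  # False
print(is_magic([[2, 7, 6], [9, 5, 1], [4, 3, 8]]))  # True
```

As in the Prolog version, the diagonals are deliberately not checked.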

Implement Gauss-Jordan elimination in Haskell

We want to program Gaussian elimination to calculate a basis (linear algebra) as an exercise for ourselves. It is not homework.
At first I thought of [[Int]] as the structure for our matrix, and that we could then sort the lists lexicographically. But then we have to do calculations with the matrix, and that is where the problem lies. Can someone give us some hints?
Consider using matrices from the hmatrix package. Among its modules you can find both a fast implementation of a matrix and a lot of linear algebra algorithms. Browsing their sources might help you with your doubts.
Here's a simple example of adding one row to another by splitting the matrix into rows.
import Numeric.Container
import Data.Packed.Matrix
addRow :: Container Vector t => Int -> Int -> Matrix t -> Matrix t
addRow from to m = let rows = toRows m in
  fromRows $ take to rows ++
             [(rows !! from) `add` (rows !! to)] ++
             drop (to + 1) rows
Another example, this time by using matrix multiplication.
addRow :: (Product e, Container Vector e) =>
          Int -> Int -> Matrix e -> Matrix e
addRow from to m = m `add` (e <> m)
  where
    nrows = rows m
    e = buildMatrix nrows nrows
        (\(r,c) -> if (r,c) /= (to,from) then 0 else 1)
Cf. Container, Vector, Product.
It will be easier if you use [[Rational]] instead of [[Int]] since you get nice division.
You probably want to start by implementing the elementary row operations.
swap :: Int -> Int -> [[Rational]] -> [[Rational]]
swap r1 r2 m = --a matrix with r1 and r2 swapped
scale :: Int -> Rational -> [[Rational]] -> [[Rational]]
scale r c m = --a matrix with row r multiplied by c
addrow :: Int -> Int -> Rational -> [[Rational]] -> [[Rational]]
addrow r1 r2 c m = --a matrix with (c * r1) added to r2
In order to actually do Gaussian elimination, you need a way to decide what multiple of one row to add to another to get a zero. So given two rows:
5 4 3 2 1
7 6 5 4 3
We want to add c times row 1 to row 2 so that the 7 becomes a zero. So 7 + c * 5 = 0 and c = -7/5. So in order to solve for c all we need are the first elements of each row. Here's a function that finds c:
whatc :: Rational -> Rational -> Rational
whatc _ 0 = 0
whatc a b = - a / b
Also, as others have said, using lists to represent your matrix will give you worse performance. But if you're just trying to understand the algorithm, lists should be fine.
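To show how the three row operations fit together, here is a rough sketch of the whole elimination in Python, using fractions.Fraction for exact arithmetic in the same spirit as Rational. The function names mirror the stubs above, but the bodies and the gauss_jordan driver are my own illustration, not part of the original answer. Because each pivot is scaled to 1 first, the multiplier c is simply the negated entry being eliminated.

```python
from fractions import Fraction


def swap(r1, r2, m):
    """Return a copy of m with rows r1 and r2 swapped."""
    out = [row[:] for row in m]
    out[r1], out[r2] = out[r2], out[r1]
    return out


def scale(r, c, m):
    """Return a copy of m with row r multiplied by c."""
    return [[c * x for x in row] if i == r else row[:] for i, row in enumerate(m)]


def addrow(r1, r2, c, m):
    """Return a copy of m with c * (row r1) added to row r2."""
    return [[x + c * y for x, y in zip(row, m[r1])] if i == r2 else row[:]
            for i, row in enumerate(m)]


def gauss_jordan(matrix):
    """Reduce matrix to reduced row echelon form using exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    pivot_row = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if src is None:
            continue
        m = swap(pivot_row, src, m)
        m = scale(pivot_row, 1 / m[pivot_row][col], m)
        # Clear the column in every other row.
        for r in range(len(m)):
            if r != pivot_row:
                m = addrow(pivot_row, r, -m[r][col], m)
        pivot_row += 1
    return m


print(gauss_jordan([[5, 4, 3, 2, 1], [7, 6, 5, 4, 3]]))
```

For the two example rows this reduces to the rows 1 0 -1 -2 -3 and 0 1 2 3 4. The nonzero rows of the result form a basis of the row space.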

Little help with null space of a matrix

This requires a little knowledge of MATLAB and I have none. I was just wondering if someone could point me in the right direction and give me some pointers :)
I have to write MATLAB code for finding the null spaces of matrices
A and B, where B = A^T x A, and then find the general solutions to AX = b1
and BX = b2, where b1 = the column [1 2 3 4 5] and b2 = the column [1 2 3 4 5 6 7 8].
My concern is that I don't really know how to go about this code.
This is what I have so far, and I do not think I am on the right track. I have a specific matrix as below.
The rows are divided by semi-colon.
A = [ 1 2 3 4 5 6 7 8;
1 2^2 3^2 4^2 5^2 6^2 7^2 8^2;
1 2^3 3^3 4^3 5^3 6^3 7^3 8^3;
1 2^4 3^4 4^4 5^4 6^4 7^4 8^4;
6 8 1 1 7 9 0 7 ]
B = A’A (this is how transpose is written)
C = null(A)
D = null(B)
I feel like there should be a rref somewhere - I'm just not getting anywhere. Please point me in the right direction.
Ok so I updated it to the this now....My username changed from jona and I dont know why
A = [ 1 2 3 4 5 6 7 8;
1 2^2 3^2 4^2 5^2 6^2 7^2 8^2;
1 2^3 3^3 4^3 5^3 6^3 7^3 8^3;
1 2^4 3^4 4^4 5^4 6^4 7^4 8^4;
6 8 1 1 7 9 0 7 ]
B = A’*A (this is how transpose is written)
null(A)
null(B)
b1=[1; 2; 3; 4; 5 ];
b2=[1; 2; 3; 4; 5; 6; 7; 8 ];
end
rref(A,b1)
rref(B,b2)
end
However I still don't feel this is right :(
@Chris A. I know the null space is the solution to Ax=0. However, I'm confused about how to use it to find the general solution using the b1 and b2 I have. Could you explain the connection to me? I don't understand the book very well.
In MATLAB, arithmetic operations need to be explicit, i.e. a(b+c) should be written as a*(b+c)
Have you tried writing B as
B=A'*A;
Also, you seem to be using a different character for the transpose... You're using ’, the unicode character for single right quotation when you should be using ' or the unicode character for apostrophe.
Ok, so the bottom line is that the null space is the set of all vectors x such that A * x = 0. You got that right. And C is an orthonormal basis for the vectors in the null space. So that means if you have a particular solution (let's call it v) such that A * v = b1 then the space of solutions is the vector v plus any combination of vectors in the null space.
For the case of A, the size (second dimension) of your C will tell you the dimension of the null space. Each one of the vectors in C will be a vector in the null space.
To get v you can do v = A \ b1. You can write arbitrary combinations of vectors in C by C * c where little c is a column vector that is the size of the null space.
The general solution is thus v + C * c where c is any vector whose length is the dimension of the null space. To see that this solves the system, just plug it back in:
A * (v + C * c) =
A * v + A * C * c =
b1 + 0 * c =
b1
Edit: It's the exact same idea for finding the solution to A'*A * x = b2, just anywhere you see A in the above discussion, replace it with A'*A and anywhere you see b1 replace it with b2. The solutions to A * x = b1 and A'*A * x = b2 are separate problems.
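Here is a tiny worked example of "particular solution plus null space" in plain Python. It uses a made-up 2 by 3 system rather than the matrix from the question; the particular solution v and the null-space vector null_vec were picked by hand, and matvec and general_solution are just illustrative helpers.

```python
def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


# Hypothetical small system A * x = b with a one-dimensional null space.
A = [[1, 1, 0],
     [0, 1, 1]]
b = [2, 3]

v = [2, 0, 3]           # a particular solution: A * v == b
null_vec = [1, -1, 1]   # spans the null space: A * null_vec == [0, 0]


def general_solution(c):
    """v + c * null_vec, which solves A * x = b for every scalar c."""
    return [vi + c * ni for vi, ni in zip(v, null_vec)]


for c in (-2, 0, 5):
    x = general_solution(c)
    print(c, x, matvec(A, x))
```

Every choice of c gives a different x, but A * x always comes out as b, which is the plug-it-back-in argument above in numbers.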
