Sorting with parity in Julia

Suppose I have the following array:
[6,3,3,5,6],
Is there an already implemented way to sort the array that also returns the number of permutations the algorithm had to make to sort it?
For instance, I have to move the 6 three positions to the right to order the array, which gives parity -1.
The general problem is to order an arbitrary array (all integers, with repeated indices!) and to know the parity of the permutation the algorithm performed to order it.

a=[6,3,3,5,6]
sortperm(a) .- (1:length(a))
Results in
5-element Array{Int64,1}:
1
1
1
-3
0
sortperm returns, for each position of the sorted array, the index of the element of a that belongs there. Subtracting 1:length(a) compares each of those original indices with the position the element ends up in.

If your array is small, you can compute the determinant of the permutation matrix:
using LinearAlgebra # provides det on Julia 0.7 and later
function permutation_sign_1(p)
    n = length(p)
    A = zeros(n,n)
    for i in 1:n
        A[i,p[i]] = 1 # row i has its single 1 in column p[i]
    end
    det(A)
end
In general, you can decompose the permutation as a product of cycles,
count the number of even cycles, and return its parity.
function permutation_sign_2(p)
n = length(p)
not_seen = Set{Int}(1:n)
seen = Set{Int}()
cycles = Array{Int,1}[]
while ! isempty(not_seen)
cycle = Int[]
x = pop!( not_seen )
while ! in(x, seen)
push!( cycle, x )
push!( seen, x )
x = p[x]
pop!( not_seen, x, 0 )
end
push!( cycles, cycle )
end
cycle_lengths = map( length, cycles )
even_cycles = filter( i -> i % 2 == 0, cycle_lengths )
length( even_cycles ) % 2 == 0 ? 1 : -1
end
The parity of a permutation can also be obtained from the
number of inversions.
It can be computed by slightly modifying the merge sort algorithm.
Since it is also used to compute Kendall's tau (check less(corkendall)),
there is already an implementation.
using StatsBase
function permutation_sign_3(p)
x = copy(p)
number_of_inversions = StatsBase.swaps!(x)
number_of_inversions % 2 == 0 ? +1 : -1
end
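The "slightly modified merge sort" mentioned above can be sketched as follows. This is a minimal Python illustration, not the StatsBase implementation; the function names count_inversions and permutation_sign are made up for this example. Ties are not counted as inversions, which matches what a stable sort does with repeated entries.

```python
def count_inversions(seq):
    """Return (sorted_list, number_of_inversions) using a merge sort."""
    items = list(seq)
    if len(items) <= 1:
        return items, 0
    mid = len(items) // 2
    left, inv_left = count_inversions(items[:mid])
    right, inv_right = count_inversions(items[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            # Equal elements keep their order, so ties add no inversions.
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of every element still waiting in left.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def permutation_sign(p):
    """Sign of a permutation: +1 for an even number of inversions, else -1."""
    _, inversions = count_inversions(p)
    return 1 if inversions % 2 == 0 else -1


# [2, 3, 4, 1, 5] is what sortperm gives for [6, 3, 3, 5, 6] (1-based indices).
sign = permutation_sign([2, 3, 4, 1, 5])
```

Because ties are skipped, calling count_inversions directly on [6, 3, 3, 5, 6] yields the same inversion count as calling it on the permutation.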
On your example, those three functions give the same result:
x = [6,3,3,5,6]
p = sortperm(x)
permutation_sign_1( p )
permutation_sign_2( p )
permutation_sign_3( p ) # -1

Related

CVXPY violates constraints when it solves SDP

Let's say that I want to solve the following problem.
minimize Tr(CY)
s.t. Y = xxT
x is 0 or 1.
where xxT denotes the outer product of the (n-1)-dimensional vector x with itself, and C is an (n-1) by (n-1) square matrix. To convert this into a problem with a single matrix variable, I wrote the following code using cvxpy.
import cvxpy as cp
import numpy as np
n = 8
np.random.seed(1)
S = np.zeros(shape=(int(n), int(n)))
S[int(n-1), int(n-1)] = 1
C = np.zeros(shape=(n,n))
C[:n-1, :n-1] = np.random.randn(n-1, n-1)
X = cp.Variable((n,n), PSD=True)
constraints=[]
constraints.append(cp.trace(S @ X) == 1)
for i in range(n-1):
    Q = np.zeros(shape=(n,n))
    Q[i,i] = 1
    Q[-1,i] = -0.5
    Q[i,-1] = -0.5
    const = cp.trace(Q @ X) == 0
    constraints.append(const)
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)),constraints)
prob.solve(solver=cp.MOSEK)
print("X is")
print(X.value)
print("C is")
print(C)
To satisfy the binary constraint that the entries of the vector x should be one or zero, I added some constraints for the matrix variable X.
X = [Y x; xT 1]
Tr(QX) == 0
There are n-1 Q matrices which are forcing the vector x's entries to be 0 or 1.
However, when I ran this simple code, the constraints were severely violated.
I look forward to any suggestions or comments on this.
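A small check of what the Tr(QX) == 0 constraints encode may help here. The sketch below is plain Python without cvxpy, and the helper names lifted_matrix and trace_qx are made up for this illustration. For X = [xxT x; xT 1], each constraint reduces to x_i^2 - x_i = 0, which holds exactly when x_i is 0 or 1. Note that this relies on X having that rank-one form: a PSD variable on its own is not forced to be rank one, so the solver's X may satisfy every trace constraint while its last column is fractional.

```python
def lifted_matrix(x):
    """Build X = [[x x^T, x], [x^T, 1]] as a list of lists."""
    v = list(x) + [1.0]
    return [[a * b for b in v] for a in v]


def trace_qx(X, i):
    """Tr(Q_i X) for Q_i with Q[i,i] = 1 and Q[i,-1] = Q[-1,i] = -0.5."""
    last = len(X) - 1
    return X[i][i] - 0.5 * X[i][last] - 0.5 * X[last][i]


binary_x = [1.0, 0.0, 1.0]
fractional_x = [0.5, 0.0, 1.0]

# Every residual is zero for a binary x ...
residual_binary = [trace_qx(lifted_matrix(binary_x), i) for i in range(3)]
# ... while the fractional entry gives 0.5**2 - 0.5 = -0.25.
residual_fractional = [trace_qx(lifted_matrix(fractional_x), i) for i in range(3)]
```

When inspecting X.value, it may therefore be worth checking the trace constraints and the binary requirement separately.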

Dijkstra's algorithm with adjacency matrix

I'm trying to implement the following code from here but it won't work correctly.
What I want is the shortest path distances from a source to all nodes and also the predecessors. Also, I want the input of the graph to be an adjacency matrix which contains all of the edge weights.
I'm trying to make it work in just one function so I have to rewrite it. If I'm right the original code calls other functions (from graph.jl for example).
I don't quite understand how to rewrite the for loop which calls the adj() function.
Also, I'm not sure if the input is correct in the way the code is for now.
function dijkstra(graph, source)
node_size = size(graph, 1)
dist = ones(Float64, node_size) * Inf
dist[source] = 0.0
Q = Set{Int64}() # visited nodes
T = Set{Int64}(1:node_size) # unvisited nodes
pred = ones(Int64, node_size) * -1
while condition(T)
# node selection
untraversed_nodes = [(d, k) for (k, d) in enumerate(dist) if k in T]
if minimum(untraversed_nodes)[1] == Inf
break # Break if remaining nodes are disconnected
end
node_ind = untraversed_nodes[argmin(untraversed_nodes)][2]
push!(Q, node_ind)
delete!(T, node_ind)
# distance update
curr_node = graph.nodes[node_ind]
for (neigh, edge) in adj(graph, curr_node)
t_ind = neigh.index
weight = edge.cost
if dist[t_ind] > dist[node_ind] + weight
dist[t_ind] = dist[node_ind] + weight
pred[t_ind] = node_ind
end
end
end
return dist, pred
end
So if I'm trying it with the following matrix
A = [0 2 1 4 5 1; 1 0 4 2 3 4; 2 1 0 1 2 4; 3 5 2 0 3 3; 2 4 3 4 0 1; 3 4 7 3 1 0]
and source 2, I would like to get the distances in a vector dist and the predecessors in another vector pred.
Right now I'm getting
ERROR: type Array has no field nodes
Stacktrace: [1] getproperty(::Any, ::Symbol) at .\sysimg.jl:18
I guess I have to rewrite it a bit more.
I'm thankful for any help.
Assuming that graph[i,j] is the length of the edge from i to j (your graph is directed, judging by your data), and that graph is a Matrix with non-negative entries where 0 indicates no edge from i to j, a minimal rewrite of your code should be something like:
function dijkstra(graph, source)
    @assert size(graph, 1) == size(graph, 2)
    node_size = size(graph, 1)
    dist = fill(Inf, node_size)
    dist[source] = 0.0
    T = Set{Int}(1:node_size) # unvisited nodes
    pred = fill(-1, node_size)
    while !isempty(T)
        # node selection: closest unvisited node
        min_val, min_idx = minimum((dist[v], v) for v in T)
        if isinf(min_val)
            break # Break if remaining nodes are disconnected
        end
        delete!(T, min_idx)
        # distance update
        for nei in 1:node_size
            if graph[min_idx, nei] > 0 && nei in T
                possible_dist = dist[min_idx] + graph[min_idx, nei]
                if possible_dist < dist[nei]
                    dist[nei] = possible_dist
                    pred[nei] = min_idx
                end
            end
        end
    end
    return dist, pred
end
(I have not tested it extensively, so please report if you find any bugs)
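For comparison, here is the same adjacency-matrix approach as a Python sketch, run on the matrix A from the question. It keeps the assumption stated above that a 0 entry means "no edge". Python is 0-based, so source 2 becomes index 1 and -1 marks "no predecessor".

```python
import math


def dijkstra(graph, source):
    """Shortest distances and predecessors from source (0-based indices)."""
    n = len(graph)
    dist = [math.inf] * n
    pred = [-1] * n
    dist[source] = 0.0
    unvisited = set(range(n))
    while unvisited:
        # Node selection: closest unvisited node (ties broken by index).
        node = min(unvisited, key=lambda v: (dist[v], v))
        if math.isinf(dist[node]):
            break  # remaining nodes are unreachable
        unvisited.remove(node)
        # Distance update along every outgoing edge to an unvisited node.
        for nei in unvisited:
            weight = graph[node][nei]
            if weight > 0 and dist[node] + weight < dist[nei]:
                dist[nei] = dist[node] + weight
                pred[nei] = node
    return dist, pred


A = [
    [0, 2, 1, 4, 5, 1],
    [1, 0, 4, 2, 3, 4],
    [2, 1, 0, 1, 2, 4],
    [3, 5, 2, 0, 3, 3],
    [2, 4, 3, 4, 0, 1],
    [3, 4, 7, 3, 1, 0],
]
dist, pred = dijkstra(A, 1)
```

Scanning all unvisited nodes for the minimum makes this O(n^2), which is usually acceptable for a dense adjacency matrix; a priority queue would be the usual choice for sparse graphs.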

Julia: Searching for a column in a sorted matrix

I have a matrix that is sorted like the one shown below
1 1 2 2 3
1 2 3 4 1
2 1 2 1 1
It's a bit hard for me to describe the ordering, but hopefully it's clear from the example. The rough idea is that we first sort on the first row, then the second, etc.
I would like to find a specific column in the matrix, and that column may or may not exist in it.
I tried the following code:
index = searchsortedfirst(1:total_cols, col, lt=(index,x) -> (matrix[:, index] < x))
The above code works, but it is slow. I profiled the code, and it spends a lot of time in "_get_index". I then tried the following
@views index = searchsortedfirst(1:total_cols, col, lt=(index,x) -> (matrix[:, index] < x))
As expected this helped a lot, likely due to the slices I'm taking. However, is there a better way to go about this? There still seems to be a lot of overhead, and I feel like there might be a cleaner way to write this, which would be easier to optimize.
However, I absolutely value speed over clarity.
Here is some code I wrote to compare binary vs. linear search.
using Profile
function test_search()
max_val = 20
rows = 4
matrix = rand(1:max_val, rows, 10^5)
matrix = Array{Int64,2}(sortslices(matrix, dims=2))
indices = @time @profile lin_search(matrix, rows, max_val, 10^3)
indices = @time @profile bin_search(matrix, rows, max_val, 10^3)
end
function bin_search(matrix, rows, max_val, repeats)
indices = zeros(repeats)
x = zeros(Int64, rows)
cols = size(matrix)[2]
for i = 1:repeats
x = rand(1:max_val, rows)
@inbounds @views index = searchsortedfirst(1:cols, x, lt=(index,x)->(matrix[:,index] < x))
indices[i] = index
end
return indices
end
function array_eq(matrix, index, y, rows)
for i=1:rows
@inbounds if view(matrix, i, index) != y[i]
return false
end
end
return true
end
function lin_search(matrix, rows, max_val, repeats)
indices = zeros(repeats)
x = zeros(Int64, rows)
cols = size(matrix)[2]
for i = 1:repeats
index = cols + 1
x = rand(1:max_val, rows)
for j=1:cols
if array_eq(matrix, j, x, rows)
index = j;
break
end
end
indices[i] = index
end
return indices
end
Profile.clear()
test_search()
Here is some sample output
0.041356 seconds (68.90 k allocations: 3.431 MiB)
0.070224 seconds (110.45 k allocations: 5.418 MiB)
After adding some more @inbounds, it looks like a linear search is faster than binary. Seems strange when there are 10^5 columns.
If speed is most important, why not simply use the fact that Julia allows you to write fast loops?
julia> function findcol(M, col)
@inbounds @views for c in axes(M, 2)
M[:,c] == col && return c
end
return nothing
end
findcol (generic function with 1 method)
julia> col = [2,3,2];
julia> M = [1 1 2 2 3;
1 2 3 4 1;
2 1 2 1 1];
julia> @btime findcol($M, $col)
32.854 ns (3 allocations: 144 bytes)
3
This should probably be fast enough and does not even take into account any ordering.
I discovered two issues that, when fixed, make both the linear and the binary search much faster, and make the binary search faster than the linear one.
First, there was some type instability. I changed one of the lines to
matrix::Array{Int64,2} = Array{Int64,2}(sortslices(matrix, dims=2))
This resulted in an order-of-magnitude speedup. Second, it turns out that @views does not do anything in the following code
@inbounds @views index = searchsortedfirst(1:cols, x, lt=(index,x)->(matrix[:,index] < x))
I am new to Julia, but my hunch is that matrix[:,index] is copied no matter what inside the anonymous function. This would make sense, since it allows for closures.
If I write a separate non-anonymous function, then that copy goes away. Linear search didn't copy the slices, so this also really sped up the binary search.
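The idea of a binary search that compares columns element by element, without copying any slice, can be sketched in Python as follows. This is an illustration only, not a benchmark-ready Julia solution; the names compare_column and find_column are made up, the matrix is a list of rows, and the columns are assumed to be sorted lexicographically (first row, then second row, and so on).

```python
def compare_column(matrix, c, target):
    """Lexicographic comparison of column c with target: -1, 0 or 1."""
    for r, t in enumerate(target):
        v = matrix[r][c]
        if v != t:
            return -1 if v < t else 1
    return 0


def find_column(matrix, target):
    """Return the 0-based index of the column equal to target, or None."""
    lo, hi = 0, len(matrix[0])
    # Lower-bound binary search: first column that is not less than target.
    while lo < hi:
        mid = (lo + hi) // 2
        if compare_column(matrix, mid, target) < 0:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(matrix[0]) and compare_column(matrix, lo, target) == 0:
        return lo
    return None


M = [
    [1, 1, 2, 2, 3],
    [1, 2, 3, 4, 1],
    [2, 1, 2, 1, 1],
]
found = find_column(M, [2, 3, 2])    # third column, 0-based index 2
missing = find_column(M, [2, 3, 3])  # not present
```

The comparison stops at the first differing row, so most probes touch only one or two entries of a column.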

How to calculate elements needed from a loop?

I have the following data:
y-n-y-y-n-n-n
This repeats infinitely, such as:
y-n-y-y-n-n-n-y-n-y-y-n-n-n-y-n-y-y-n-n-n...
I have 5 "x".
"x" only sticks with "y".
Meaning, if I distribute x on the loop above, it will be:
y-n-y-y-n-n-n-y-n-y-y-n-n-n
x---x-x-------x---x
I want to count how many of the loop's elements I need to use to spread the 5 x across it; the answer is 10.
How do I calculate it with a formula?
I presume what you're saying is that you need to process the first 10 elements of the infinite list to get 5 Y's, which match/stick with the 5 X's you have.
y-n-y-y-n-n-n-y-n-y-y-n-n-n-y-n-y-y-n-n-n...
x-_-x-x-_-_-_-x-_-x
                  ^
                  L____ 10 elements read from the infinite list to place the 5 x's.
I also presume that your question is: given an input of 5 Xs, what is the number of elements you need to process in the infinite list to match those 5 Xs.
You could calculate it with a loop like the following pseudo-code:
iElementsMatchedCounter = 0
iXsMatchedCounter = 0
iXLimit = 5
strElement = ""
if (InfiniteList.IsEmpty() == false)
{
    do
    {
        strElement = InfiniteList.ReadNextElement()
        if (strElement == "y")
        {
            iXsMatchedCounter += 1
        }
        iElementsMatchedCounter += 1
    } while ( (InfiniteList.IsEndReached() == false) AND (iXsMatchedCounter < iXLimit) )
}
if (iXsMatchedCounter == iXLimit)
    then Print(iElementsMatchedCounter)
    else Print("End of list reached before all X's were matched!")
The drawback of the above approach is that you are actually reading the infinite list, which might not be preferable.
Instead, given you know your list is an infinitely repeating sequence of the same elements y-n-y-y-n-n-n, you don't even need to loop through the entire list, but just operate on the sub-list y-n-y-y-n-n-n. The following algorithm describes how:
Given your starting input:
iNumberOfXs = 5 (you have 5 Xs to match)
iNumberOfYsInSubList = 3 (you have 3 Ys in the sub-list, the total list repeats infinitely)
iLengthOfSubList = 7 (you have 7 elements in the sub-list y-n-y-y-n-n-n)
We then have intermediate results which are calculated:
iQuotient
iPartialLengthOfList
iPendingXs
iPendingLengthOfList
iResult
The following steps should give the result:
1. Divide iNumberOfXs by iNumberOfYsInSubList. Here, this gives us 5/3 = 1.666....
2. Discard the remainder of the result (the 0.666...), so you're left with 1 as iQuotient. This is the number of complete sub-lists you have to iterate.
3. Multiply this quotient 1 by iLengthOfSubList, giving you 1*7=7 as iPartialLengthOfList. This is the partial sum of the result, and is the number of elements in the complete sub-lists you iterate.
4. Also multiply the quotient by iNumberOfYsInSubList, and subtract this product from iNumberOfXs, i.e. iNumberOfXs - (iQuotient * iNumberOfYsInSubList) = 5 - (1 * 3) = 2. Save this value 2 as iPendingXs, which is the number of as-yet unmatched X's.
   Note that iPendingXs will always be less than iNumberOfYsInSubList (it is a remainder: iPendingXs = iNumberOfXs MODULO iNumberOfYsInSubList).
5. Now you have the trivial problem of matching 2 X's (i.e. the value of iPendingXs calculated above) in the sub-list y-n-y-y-n-n-n.
6. The number of pending elements to process (counted as iPendingLengthOfList) is:
   - equal to iPendingXs if iPendingXs is 0 or 1
   - equal to iPendingXs + 1 otherwise (i.e. if iPendingXs is greater than 1)
   In this case, iPendingLengthOfList = 3, because iPendingXs is greater than 1.
7. The sum of iPartialLengthOfList (7) and iPendingLengthOfList (3) is the answer, namely 10.
In general, if your sub-list y-n-y-y-n-n-n is not pre-defined, then you cannot hard-code the rule in step 6, but will instead have to loop through only the sub-list once to count the Ys and elements, similar to the pseudo-code given above.
When it comes to actual code, you can use integer division and modulo arithmetic to quickly do the operations in steps 2 and 4 respectively.
iQuotient = iNumberOfXs / iNumberOfYsInSubList // COMMENT: here integer division automatically drops the remainder
iPartialLengthOfList = iQuotient * iLengthOfSubList
iPendingXs = iNumberOfXs - (iQuotient * iNumberOfYsInSubList)
// COMMENT: can use modulo arithmetic like the following to calculate iPendingXs
// iPendingXs = iNumberOfXs % iNumberOfYsInSubList
// The following IF statement assumes the sub-list to be y-n-y-y-n-n-n
if (iPendingXs > 1)
then iPendingLengthOfList = iPendingXs + 1
else iPendingLengthOfList = iPendingXs
iResult = iPartialLengthOfList + iPendingLengthOfList
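The same calculation can be sketched in Python for an arbitrary repeating pattern, so the rule for the leftover Xs does not have to be hard-coded. The function name elements_needed and its parameters are made up for this illustration. One detail differs from the steps above: when the number of Xs is an exact multiple of the Ys per sub-list, the last sub-list only has to be read up to its final y, so the quotient is taken from num_x - 1 (for 3 Xs this gives 4 elements rather than 7).

```python
def elements_needed(pattern, num_x):
    """Elements of the infinitely repeated pattern read to place num_x Xs."""
    if num_x <= 0:
        return 0
    # 1-based positions of the Ys inside one copy of the pattern.
    y_positions = [i + 1 for i, item in enumerate(pattern) if item == "y"]
    if not y_positions:
        raise ValueError("pattern contains no 'y', so no X can be placed")
    # Complete sub-lists before the last X, and which y of the next one is used.
    full_sublists, y_index = divmod(num_x - 1, len(y_positions))
    return full_sublists * len(pattern) + y_positions[y_index]


pattern = "y-n-y-y-n-n-n".split("-")
result = elements_needed(pattern, 5)  # the example from the question
```

The pattern is scanned once to locate its Ys; after that each query is a constant-time integer division.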

Python Programming While Loop

Hi, I'm new to Python and programming in general. I am trying to write a program that uses a while loop to add the integers from 1 to the number entered. The program also has to give an error statement if the user enters 0 or a negative number. So far the integers add up and the error statement works, but the program is not looping; it only asks the user to input a number one time. Please help. This is my source code so far. Thanks
x = int(input("Enter a positive number not including zero:" ))
total = 0
n = 1
while n <= x:
    total = total + n
    n = n + 1
# prints the total of integers up to number entered
print("Sum of integers from 1 to number entered= ",total)
if x <= 0 or x == -x:
    print ("invalid entry")
Try this code...
op='y'
while op=='y':
    x = int(input("Enter a positive number not including zero:" ))
    total = 0
    n = 1
    if x > 0:
        while n <= x:
            total = total + n
            n = n + 1
        # prints the total of integers up to number entered
        print("Sum of integers from 1 to number entered= ",total)
    else:
        print ("invalid entry")
    # input() replaces Python 2's raw_input() to match the print() calls above
    op = input("Do you want to continue this operation (y/n):" )
Put your whole code this way
done = False
while not done:
    # your entire code here except the last 2 lines
    if x > 0:
        done = True
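Putting the pieces together, a complete version might look like the sketch below. It is one possible arrangement rather than the only one: the helper names read_positive_int and sum_to are made up, and the read parameter exists only so the prompt can be replaced when testing. It also treats non-numeric input as invalid, which the original code does not.

```python
def read_positive_int(prompt, read=input):
    """Keep asking until the user enters an integer greater than zero."""
    while True:
        text = read(prompt)
        try:
            value = int(text)
        except ValueError:
            print("invalid entry")
            continue
        if value > 0:
            return value
        print("invalid entry")


def sum_to(x):
    """Add the integers from 1 to x with a while loop."""
    total = 0
    n = 1
    while n <= x:
        total = total + n
        n = n + 1
    return total
```

To use it interactively, call x = read_positive_int("Enter a positive number not including zero:") and then print("Sum of integers from 1 to number entered= ", sum_to(x)).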
