Number of movements in a dynamic array

A dynamic array is an array that doubles its size when an element is added to an already full array, copying the existing elements to a new location. It is clear that there will be $\lceil \log_2 n \rceil$ bulk copy operations.
In a textbook I have seen the number of movements M computed as
$M = \sum_{i=1}^{\lceil \log_2 n \rceil} i \cdot \frac{n}{2^i}$
with the argument that "half the elements move once, a quarter of the elements twice"...
But I thought that for each bulk copy operation the number of copied/moved elements is actually $n/2^i$, as every bulk operation is triggered by exceeding the $2^i$-th element, so that the number of movements is
$M = \sum_{i=1}^{\lceil \log_2 n \rceil} \frac{n}{2^i}$
(for n = 8 this seems to be the correct formula).
Who is right, and what is wrong with the other argument?

Both versions are O(n), so there is no big difference.
The textbook version counts the initial write of each element as a move operation but doesn't consider the very first element, which moves $\lceil \log_2 n \rceil$ times. Other than that the two are equivalent, i.e.
$\left(\sum_{i=1}^{\lceil \log_2 n \rceil} i \cdot \frac{n}{2^i}\right) - (n - 1) + \lceil \log_2 n \rceil = \sum_{i=1}^{\lceil \log_2 n \rceil} \frac{n}{2^i}$
when n is a power of 2. Both are off by different amounts when n is not a power of 2.
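To see the second count concretely, here is a minimal simulation sketch (assuming a doubling array that starts at capacity 1, and counting one move per element copied during a grow):
let countMoves n =
    let mutable capacity = 1
    let mutable size = 0
    let mutable moves = 0
    for _ in 1 .. n do
        if size = capacity then
            moves <- moves + size        // bulk copy: every existing element moves once
            capacity <- capacity * 2
        size <- size + 1
    moves

printfn "%d" (countMoves 8)              // prints 7 = 4 + 2 + 1, matching the second formula
Subtracting the n - 1 initial writes and adding the first element's $\lceil \log_2 n \rceil$ moves turns the textbook count into this one, per the identity above (for n = 8: 11 - 7 + 3 = 7).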

Can SPARK be used to prove that Quicksort actually sorts?

I'm not a user of SPARK. I'm just trying to understand the capabilities of the language.
Can SPARK be used to prove, for example, that Quicksort actually sorts the array given to it?
(Would love to see an example, assuming this is simple)
Yes, it can, though I'm not particularly good at SPARK-proving (yet). Here's how quick-sort works:
We note that the idea behind quicksort is partitioning.
A 'pivot' is selected and this is used to partition the collection into three groups: equal-to, less-than, and greater-than. (This ordering impacts the procedure below; I'm using this because it's different than the in-order version to illustrate that it is primarily about grouping, not ordering.)
If the collection has 0 or 1 elements, it is already sorted; if it has 2, check and possibly correct the ordering and they are sorted; otherwise continue on.
Move the pivot to the first position.
Scan from the second position to the last position; depending on the value under consideration:
Less – Swap with the first item in the Greater partition.
Greater – Null-op.
Equal – Swap with the first item of Less, then swap with the first item of Greater.
Recursively call on the Less & Greater partitions.
If it is a function, return Less & Equal & Greater; if a procedure, rearrange the in out parameter to that ordering. (See the sketch below.)
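For concreteness, here is a minimal sketch of in-place three-way partitioning in ordinary F# (not SPARK, and using the conventional less/equal/greater layout rather than the equal/less/greater grouping above; the grouping idea is the same):
let partition3 (arr: int[]) lo hi =
    // After the call: arr.[lo .. lt-1] < pivot, arr.[lt .. gt] = pivot,
    // and arr.[gt+1 .. hi] > pivot.
    let swap i j =
        let t = arr.[i]
        arr.[i] <- arr.[j]
        arr.[j] <- t
    let pivot = arr.[lo]
    let mutable lt = lo
    let mutable gt = hi
    let mutable i = lo
    while i <= gt do
        if arr.[i] < pivot then
            swap i lt
            lt <- lt + 1
            i <- i + 1
        elif arr.[i] > pivot then
            swap i gt
            gt <- gt - 1
        else
            i <- i + 1
    (lt, gt)
Recursing on arr.[lo .. lt-1] and arr.[gt+1 .. hi] then completes the sort.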
Here's how you would go about doing things:
Prove/assert the 0 and 1 cases as true,
Prove your handling of 2 items,
Prove that given an input-collection and pivot there are a set of three values (L,E,G) which are the count of the elements less-than/equal-to/greater-than the pivot [this is probably a ghost-subprogram],
Prove that L+E+G equals the length of your collection,
Prove [in the post-condition] that given the pivot and (L,E,G) tuple, the output conforms to L items less-than the pivot followed by E items which are equal, and then G items that are greater.
And that should do it. [IIUC]
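Though it isn't a SPARK proof, the shape of that post-condition is easy to sanity-check in ordinary code; a sketch (the helper name conformsTo is hypothetical):
// True iff output is a permutation of input laid out as L elements < pivot,
// then E elements = pivot, then G elements > pivot.
let conformsTo (input: int[]) (output: int[]) (pivot: int) =
    let l = input |> Array.filter (fun x -> x < pivot) |> Array.length
    let e = input |> Array.filter (fun x -> x = pivot) |> Array.length
    output.Length = input.Length
    && Array.sort output = Array.sort input
    && Array.forall (fun x -> x < pivot) output.[.. l - 1]
    && Array.forall (fun x -> x = pivot) output.[l .. l + e - 1]
    && Array.forall (fun x -> x > pivot) output.[l + e ..]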

Parallel iteration over array with step size greater than 1

I'm working on a practice program for doing belief propagation stereo vision. The relevant aspect here is that I have a fairly long array representing every pixel in an image, and I want to carry out an operation on every second entry in the array at each iteration of a for loop - first one half of the entries, and then at the next iteration the other half (this comes from an optimisation described by Felzenszwalb & Huttenlocher in their 2006 paper 'Efficient belief propagation for early vision'). So you could see it as having an outer for loop which runs a number of times, and for each iteration of that loop I iterate over half of the entries in the array.
I would like to parallelise the operation of iterating over the array like this, since I believe it would be thread-safe to do so, and of course potentially faster. The operation involved updates values inside the data structures representing the neighbouring pixels, which are not themselves used in a given iteration of the outer loop. Originally I just iterated over the entire array in one go, which meant that it was fairly trivial to carry this out - all I needed to do was put .Parallel between Array and .iteri. Changing to operating on every second array entry is trickier, however.
To make the change from simply iterating over every entry, I went from Array.iteri (fun i p -> ...) to using for i in startIndex..2..(ArrayLength - 1) do, where startIndex is either 1 or 0 depending on which one I used last (controlled by toggling a boolean). This means, though, that I can't simply use the really nice .Parallel to make things run in parallel.
I haven't been able to find anything specific about how to implement a parallel for loop in .NET which has a step size greater than 1. The best I could find was a paragraph in an old MSDN document on parallel programming in .NET, but that paragraph only makes a vague statement about transforming an index inside a loop body. I do not understand what is meant there.
I looked at Parallel.For and Parallel.ForEach, as well as creating a custom partitioner, but none of those seemed to include options for changing the step size.
The other option that occurred to me was to use a sequence expression such as
let getOddOrEvenArrayEntries myarray oddOrEven =
    seq {
        let startingIndex = if oddOrEven then 1 else 0
        for i in startingIndex .. 2 .. (Array.length myarray - 1) do
            yield (i, myarray.[i])
    }
and then using PSeq.iteri from ParallelSeq, but I'm not sure whether it will work correctly with .NET Core 2.2. (Note that, currently at least, I need to know the index of the given element in the array, as it is used as the index into another array during the processing).
How can I go about iterating over every second element of an array in parallel? I.e. iterating over an array using a step size greater than 1?
You could try PSeq.mapi, which provides not only the sequence item as a parameter but also its index.
Here's a small example:
let res =
    nums
    |> PSeq.mapi (fun index item -> if index % 2 = 0 then item else item + 1)
You can also have a look at this sampling snippet. Just be sure to substitute Seq with PSeq.
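The index transformation that the old MSDN document hints at can also be used directly with Parallel.For: run the parallel loop over a dense range 0 .. count - 1 and compute the real, strided array index inside the body. A minimal sketch, assuming the per-element work is some action taking the index and the element (the function and parameter names here are illustrative):
open System.Threading.Tasks

// Apply action to every second element of myarray, starting at startIndex (0 or 1).
let parallelIterEverySecond startIndex (myarray: 'a[]) (action: int -> 'a -> unit) =
    let count = (myarray.Length - startIndex + 1) / 2
    Parallel.For(0, count, fun j ->
        // Transform the dense index j into the strided index i.
        let i = startIndex + 2 * j
        action i myarray.[i])
    |> ignore
Because each array index is touched by exactly one loop iteration, this keeps the thread-safety argument from the question intact.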

Existence of a random integer in a sorted circularly linked list

This is my task:
There exists a circular and sorted linked list with n integers. Every element has a successor pointer to the next bigger item. The biggest item in the list points back to the smallest item. Determine whether a passed in target item is a member of the list. You can access an item in the list in two ways: You can follow the next pointer from an item that has already been accessed, or you can use a function "RAND" that returns a pointer to a uniformly random item in the list. Create a randomized algorithm that finds the target item and makes at most O(√n) passes in expectation and always returns the right answer.
I'm unsure about how to construct an algorithm in the required time complexity. I think it has something to do with calculating and storing some set of sums in the list but can't figure that step out.
The solution is based on Dave's comment:
Use RAND sqrt(n) times, storing the largest element < target. Show that this is expected to be within sqrt(n) of the target.
Denote by i the index of the biggest element in the list that is less than the target element.
Now let's compute the expected number of hits of the range (i - sqrt(n), i] in sqrt(n) trials. On every trial, the probability of hitting the range is the range length divided by the list length, which is sqrt(n)/n = 1/sqrt(n), so
E = 1/sqrt(n) * sqrt(n) = 1.
So we expect that, by choosing the biggest sampled element which is smaller than the target over our sqrt(n) trials and then advancing linearly, we can check for the presence of the target within about sqrt(n) items.
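A minimal sketch of the whole algorithm; the Node type and the rand parameter are illustrative assumptions, since the task only guarantees a next pointer per element and a RAND function returning a uniformly random node:
type Node = { Value: int; mutable Next: Node }

// True iff target occurs in the n-node list; expected O(sqrt n) accesses.
let contains (rand: unit -> Node) (n: int) (target: int) =
    let trials = int (ceil (sqrt (float n)))
    // Phase 1: sample sqrt(n) nodes, keeping the one with the largest
    // value that does not exceed the target.
    let mutable start = rand ()
    for _ in 2 .. trials do
        let c = rand ()
        if c.Value <= target && (start.Value > target || c.Value > start.Value) then
            start <- c
    // Phase 2: walk forward. Expected O(sqrt n) steps; capped at n steps,
    // and checking every visited node, so the answer is always correct.
    let mutable cur = start
    let mutable steps = 0
    let mutable finished = false
    let mutable found = false
    while not finished && steps < n do
        if cur.Value = target then
            found <- true
            finished <- true
        elif cur.Value < target && (cur.Next.Value > target || cur.Next.Value < cur.Value) then
            finished <- true         // the next step would pass the target: absent
        else
            cur <- cur.Next
            steps <- steps + 1
    found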

Determine if there exists a number in the array occurring k times

I want to create a divide and conquer algorithm (O(n lg n) runtime) to determine if there exists a number in an array that occurs k times. A constraint on this problem is that only an equality/inequality comparison method is defined on the objects of the array (i.e. can't use <, >).
So I have tried a number of approaches, including splitting the array into k pieces of approximately equal size. The approach is similar to finding the majority item in an array; however, in the majority case, when you split the array you know that one half must contain a majority item if such an item exists. Any pointers or tips to put me in the right direction?
EDIT: To clear up a little, I am wondering whether the problem of finding the majority item by splitting the array in half and using a recursive solution can be extended to other situations where k may be n/4 or n/5 etc.
Maybe I should have phrased the question using n/k instead.
This is impossible. As a simple example of why, consider an input with a length-n array, all elements distinct, and k = 2. The only way to be sure no element appears twice is to compare every element against every other element, which takes O(n^2) time. Until you have performed all possible comparisons, you cannot be sure that some pair you didn't compare isn't actually equal.
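For reference, the equality-only model does admit the obvious quadratic algorithm that this lower bound is measured against; a minimal sketch, reading "occurs k times" as an exact count:
// O(n^2) equality comparisons; no ordering on 'a is required.
let existsKTimes (k: int) (arr: 'a[]) =
    arr |> Array.exists (fun x ->
        (arr |> Array.filter (fun y -> y = x) |> Array.length) = k)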

Generating a unique random number in constant time

Is there any way to generate a unique random number in constant time? I'm currently using an ArrayList which contains all my possible numbers. It implements Collection, and I'm calling Collections.shuffle on it. I'm then removing the 0th element from the ArrayList to get the unique random number, but I believe this cannot be constant time for two reasons:
the remove method is O(n)
Collections.shuffle is O(n)
Suggestions? Thanks!
Collections.shuffle is indeed O(n), but it only needs to be called once. If you're getting n random numbers, that one-time cost diffuses into the remaining operations. (This is the idea behind amortized analysis, which seems necessary for your problem: you want either "amortized deterministic constant time" or "expected constant time", as one might get with a dictionary.)
The remove method is O(n) in general because every element after the removed index has to shift down, but if you repeatedly remove the last element instead of the first, no shifting is needed, so each removal is actually O(1). It is only removal from an arbitrary position that is O(n) in the average and worst cases.
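Put together, the scheme is: shuffle once up front, then hand elements out from the end. A minimal sketch (in F# rather than Java, with illustrative names; the ArrayList/Collections.shuffle version is analogous):
let rng = System.Random()

// Fisher-Yates shuffle: O(n), done once.
let shuffleInPlace (a: 'a[]) =
    for i in a.Length - 1 .. -1 .. 1 do
        let j = rng.Next(i + 1)
        let t = a.[i]
        a.[i] <- a.[j]
        a.[j] <- t

type UniqueRandom(values: int[]) =
    let pool = Array.copy values
    do shuffleInPlace pool
    let mutable remaining = pool.Length
    // O(1) per draw: take from the end, so nothing ever shifts.
    member this.Next() =
        if remaining = 0 then invalidOp "all numbers have been handed out"
        remaining <- remaining - 1
        pool.[remaining]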
