The adjacency list representation consists of n lists, one for each vertex. This is usually implemented as a linked list of "edge" records. Sorting one of these n lists can be done in O(n) time using bucket sort (with n buckets), so it is trivial to sort all n lists in O(n^2) time. Assuming the adjacency list representation, design a linear-time (O(m+n)) algorithm to sort each of the n adjacency lists of a given simple undirected graph G(V,E). Hint: use radix sort with n buckets.
I tried ordinary comparison sorting, but that takes O(n log n) time.
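One possible reading of the hint, as a minimal sketch rather than a definitive answer: treat every edge record as a pair (u, v) and run a two-pass radix sort, first bucketing by v and then distributing back by u. The function name and the list-of-lists input format (vertices numbered 0..n-1) are assumptions made for this illustration.

```python
def sort_adjacency_lists(adj):
    """adj[u] is the unsorted neighbour list of vertex u (vertices are 0..n-1).

    Returns new lists with every adj[u] in increasing order, in O(n + m) total.
    """
    n = len(adj)

    # Pass 1: bucket every edge record (u, v) by its endpoint v (n buckets).
    buckets = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            buckets[v].append(u)

    # Pass 2: scan the buckets in increasing v and append v to the list of u.
    # Each list therefore receives its neighbours in increasing order.
    sorted_adj = [[] for _ in range(n)]
    for v in range(n):
        for u in buckets[v]:
            sorted_adj[u].append(v)
    return sorted_adj


example = [[2, 1], [2, 0], [1, 0]]      # a triangle with unsorted lists
result = sort_adjacency_lists(example)  # [[1, 2], [0, 2], [0, 1]]
```

Each edge record is touched a constant number of times and each of the n buckets is visited once, which is where the O(m+n) bound comes from.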
I will explain my problem in a general setting (as I am interested in a general algorithm), then narrow it down to my particular case.
Say we have two finite sets, A and B, both subsets of X and a distance function d that assigns a distance between any two points of X.
What is an algorithm to find two functions, f1 from A to B and f2 from B to A, such that f1(a) is the element of B that is closest to a, and vice versa for f2?
My special case is in the R language, where I have two sets of points on Earth (lat, lon) and I need to pair them up (from A to B and vice versa) according to their distance.
For reference, I am using the Haversine distance from geosphere package.
Thanks in advance.
Note that this is an algorithmic solution to an algorithmic problem.
Let's begin with a solution in O(n^2) time and memory complexity. For each element of A, record its distance to each element of B. Then iterate over this two-dimensional array and find the minimum of each row: these elements are the image of f1. f2 is obtained the same way from the minimum of each column; note that it is not in general the inverse of f1, because "closest to" is not a symmetric relation.
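The quadratic approach can be sketched as follows. This is an illustration in Python rather than R, with a hand-written haversine function standing in for the geosphere one; the function names, the Earth radius of 6371 km, and the sample coordinates are all assumptions. It recomputes distances on the fly instead of storing the full table, which keeps the same O(n^2) time.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * radius_km * asin(sqrt(h))


def nearest_index(points, targets, dist=haversine_km):
    """For each point, the index of the closest target (ties go to the first)."""
    return [min(range(len(targets)), key=lambda j: dist(p, targets[j]))
            for p in points]


A = [(48.85, 2.35), (40.71, -74.01)]                    # Paris, New York
B = [(51.51, -0.13), (43.65, -79.38), (41.90, 12.50)]   # London, Toronto, Rome

f1 = nearest_index(A, B)   # A -> B: [0, 1]
f2 = nearest_index(B, A)   # B -> A: [0, 1, 0]
```

Here Rome maps to Paris while Paris maps to London, a small reminder that the two maps have to be computed separately.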
Now we can create a similar solution in O(n log n) time complexity and O(n) memory complexity, using binary search.
Let's sort the elements of B in such a way that the item closest to any given element can be found in O(log n). With numbers this is done simply by sorting them. With lon and lat you can sort first by lon, then by lat, but such an ordering does not by itself guarantee the true nearest point in two dimensions, so there it only narrows down the candidates.
Now, for each element of A, find the closest item in B using binary search. Each query takes O(log n), so f1 is known for every element in O(n log n). f2 is found the same way with the roles of A and B swapped.
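For plain numbers, the binary-search step might look like the sketch below. It assumes one-dimensional values, absolute difference as the distance, and a non-empty B; the helper name is hypothetical.

```python
from bisect import bisect_left


def nearest_sorted(a, sorted_b):
    """Closest value to a in the ascending, non-empty list sorted_b."""
    i = bisect_left(sorted_b, a)                  # first position with value >= a
    candidates = sorted_b[max(i - 1, 0):i + 1]    # neighbours on either side of a
    return min(candidates, key=lambda b: abs(b - a))


A = [1.0, 4.2, 9.0]
B = sorted([8.0, 0.5, 5.0, 3.0])                  # O(n log n) preprocessing
f1 = {a: nearest_sorted(a, B) for a in A}         # O(log n) per query
# f1 == {1.0: 0.5, 4.2: 5.0, 9.0: 8.0}
```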
I have two tensors of rank 3 each, in other words two 3D matrices. I want to take the dot product of these two matrices, but I am not sure how to proceed. Please help me out with a formula for doing so.
A 3-way tensor (or equivalently a 3D array or 3rd-order array) need not be of rank 3. Here, "rank of a tensor" means the minimum number of rank-1 tensors (i.e. outer products of vectors; for an N-way tensor, the outer product of N vectors) needed to obtain your original tensor. This is illustrated in the figure below, which shows the so-called CP decomposition.
In the above figure, the original tensor X can be written as a sum of R rank-1 tensors, where R is a positive integer. In CP decomposition, we aim to find the minimum R that yields our original tensor X. This minimum R is called the rank of the original tensor.
For a 3-way tensor, it is the minimum number R of vector triples (a1,a2,a3...aR; b1,b2,b3...bR; c1,c2,c3...cR), where each of the vectors is n-dimensional, required to obtain the original tensor. The tensor can be written as a sum of outer products of these vectors as:
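In the usual CP notation this reads as follows (a reconstruction; the symbol \circ denotes the vector outer product, and the relation holds exactly when R equals the rank):

```latex
\mathcal{X} \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r
```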
Element-wise, we can write the 3-way tensor as:
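With a_{ir} denoting the i-th entry of the vector a_r (and likewise for b_r and c_r), the standard entry-wise form, given here as a reconstruction, is:

```latex
x_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
```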
Now, with that background, to answer your specific question: to take the dot product (also called the tensor inner product), both tensors must have the same shape (e.g. 3x2x5 and 3x2x5). The inner product is then defined as the sum of the element-wise products of their values.
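In the usual notation (a reconstruction, assuming both tensors have shape I x J x K with entries x_{ijk} and y_{ijk}), that definition is:

```latex
\langle \mathcal{X}, \mathcal{Y} \rangle
  = \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} x_{ijk}\, y_{ijk}
```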
where the script X and Y are the same-shape tensors.
P.S.: The tilde in the above formulae should not be interpreted as an approximation.
The vector inner product sums the element-wise products. The tensor inner product follows the same idea: match the elements, multiply them, and add them all up.
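As a small illustration of "match, multiply, add", here is a sketch that treats tensors as nested Python lists. The function name and the sample values are made up for this example.

```python
def tensor_inner(x, y):
    """Inner product of two same-shape nested lists (any number of dimensions)."""
    if isinstance(x, (int, float)):
        return x * y                       # innermost level: multiply matched entries
    if len(x) != len(y):
        raise ValueError("tensors must have the same shape")
    return sum(tensor_inner(xi, yi) for xi, yi in zip(x, y))


X = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # shape 2x2x2
Y = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]   # shape 2x2x2
inner = tensor_inner(X, Y)                 # 1 + 4 + 10 + 16 = 31
```

With NumPy arrays of equal shape, the same quantity can be obtained with `numpy.sum(X * Y)`.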
I'm working on a string similarity algorithm, and was thinking about how to give a score between 0 and 1 when comparing two strings. The two variables for this function are the Levenshtein distance D (the number of added, removed and changed characters) and the maximum length L of the two strings (but you could also take the average).
My initial algorithm was just 1-D/L, but this gave scores that were too high for short strings (e.g. 'tree' and 'bee' would get a score of 0.5) and too low for longer strings, which have more in common even if half of the characters are different.
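For reference, the baseline 1-D/L score can be reproduced with the sketch below. It uses a textbook dynamic-programming Levenshtein distance; the function names are chosen for this illustration only.

```python
def levenshtein(s, t):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))               # distances from "" to prefixes of t
    for i, cs in enumerate(s, 1):
        cur = [i]                                # distance from s[:i] to ""
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                  # delete cs
                           cur[j - 1] + 1,               # insert ct
                           prev[j - 1] + (cs != ct)))    # substitute, or keep a match
        prev = cur
    return prev[-1]


def naive_score(s, t):
    """The 1 - D/L score, with L the length of the longer string."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1 - levenshtein(s, t) / longest


d = levenshtein("tree", "bee")       # 2: change 't' to 'b', remove 'r'
score = naive_score("tree", "bee")   # 1 - 2/4 = 0.5
```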
Now I'm looking for a mathematical function that can output a better score. I wasn't able to come up with one, so I sketched this height map of a 3D plot (L is x and D is y).
Does anyone know how to convert such a graph to an equation, whether I would be better off just creating a lookup table, or whether there is an existing solution?
I have two questions:
I've made a vector from a document by counting how many times each word appears in it. Is this the right way of making the vector, or do I have to do something else as well?
Using the above method I've created vectors for 16 documents, which are of different sizes. Now I want to apply cosine similarity to find out how similar the documents are to each other. The problem I'm having is computing the dot product of two vectors, because they are of different sizes. How would I do this?
Sounds reasonable, as long as it means you have a list/map/dict/hash of (word, count) pairs as your vector representation.
You should pretend that you have zero values for the words that do not occur in some vector, without storing these zeros anywhere. Then, you can use the following algorithm to compute the dot product of these vectors (pseudocode):
algorithm dot_product(a : WordVector, b : WordVector):
    dot = 0
    for word, x in a do
        y = lookup(word, b)
        dot += x * y
    return dot
The lookup part can be anything, as long as it returns 0 for a word that does not occur in b. For speed, I'd use hashtables as the vector representation (e.g. Python's dict).
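A minimal Python sketch of the same idea, extended to cosine similarity, could look like this. It assumes whitespace tokenisation and uses `collections.Counter` as the (word, count) map; the sample sentences are made up.

```python
from collections import Counter
from math import sqrt


def dot_product(a, b):
    """Sparse dot product: words missing from b contribute 0."""
    return sum(x * b.get(word, 0) for word, x in a.items())


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    norm = sqrt(dot_product(a, a)) * sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


doc1 = Counter("the cat sat on the mat".split())
doc2 = Counter("the dog sat".split())

shared = dot_product(doc1, doc2)            # the: 2*1, sat: 1*1, total 3
similarity = cosine_similarity(doc1, doc2)  # 3 / sqrt(8 * 3)
```

Because only the words present in the first vector are visited, the two documents never need to be padded to a common length.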
I have a graph G(V,E) with 35000 edges and 3500 nodes.
Is there any way I can generate an origin-destination list within n (say 4) stops for each node?
I think the function neighborhood() does exactly what you want. Set the order argument to 4 and for each vertex you'll get a vector of vertex ids for the vertices that are at most 4 steps away from it.
I figured it out:
Use this property of the adjacency matrix A: the entry in row i and column j of A^n gives the number of (directed or undirected) walks of length n from vertex i to vertex j. So for n stops, construct the n matrices A^1, A^2, ..., A^n. The union of their nonzero entries (equivalently, the nonzero entries of A^1 + A^2 + ... + A^n) then marks, for each origin, the destinations reachable within n stops.
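The matrix-power idea can be sketched in plain Python as below. This is an illustration on a tiny graph, with dense list-of-lists matrices and function names that are assumptions; for a graph with thousands of nodes a sparse matrix library or a depth-limited breadth-first search would likely be more practical.

```python
def mat_mul(p, q):
    """Product of two square matrices stored as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def reachable_within(adj, stops):
    """reach[i][j] is True when some walk of length 1..stops leads from i to j."""
    n = len(adj)
    reach = [[False] * n for _ in range(n)]
    power = [row[:] for row in adj]            # starts as A^1
    for _ in range(stops):
        for i in range(n):
            for j in range(n):
                if power[i][j] > 0:            # at least one walk of this length
                    reach[i][j] = True
        power = mat_mul(power, adj)            # move on to the next power of A
    return reach


# Path graph 0 - 1 - 2 - 3
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

reach = reachable_within(A, 2)
destinations = [[j for j, ok in enumerate(row) if ok] for row in reach]
# destinations[0] == [0, 1, 2]
```

Note that an origin can appear in its own list, because a walk such as 0 -> 1 -> 0 counts as a walk of length 2; filter out j == i if that is not wanted.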