An undirected graph is given (as an adjacency list or incidence matrix). For multiple queries, check if a path of length exactly x exists between two nodes. Same nodes can be visited more than once.
I know that for a single query this is easy to check: raise the adjacency matrix to the power x (the number of steps) and check whether the value at [first node][second node] is greater than 0. This takes too long, and for bigger matrices it takes too much memory, even more so for multiple queries.
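For reference, the matrix-power check described above can be sketched as follows. This is only an illustration under assumptions: the function names and the 4-node path graph are made up (the graph from the example is not reproduced here), entries are kept as booleans so they cannot overflow, and repeated squaring brings the number of matrix products down to about log2(x).

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry [i][j] is True if some k links i to j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_exists(adj, src, dst, steps):
    """True if a walk of exactly `steps` edges leads from src to dst."""
    n = len(adj)
    # Identity matrix: the only walks of length 0 stay where they are.
    result = [[i == j for j in range(n)] for i in range(n)]
    base = [[bool(x) for x in row] for row in adj]
    # Exponentiation by squaring on the boolean adjacency matrix.
    while steps > 0:
        if steps & 1:
            result = bool_matmul(result, base)
        base = bool_matmul(base, base)
        steps >>= 1
    return result[src][dst]


# Hypothetical path graph 0 - 1 - 2 - 3 (not the graph from the question).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(walk_exists(adj, 0, 1, 1))   # True
print(walk_exists(adj, 0, 2, 1))   # False
print(walk_exists(adj, 0, 0, 8))   # True
print(walk_exists(adj, 0, 3, 10))  # False
```

Each query still costs O(n^3 log x) here, so this mainly shows the baseline that the question wants to improve on.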
How can I solve this problem using as little space and time as possible?
Example:
Graph
Queries:
Is it possible to reach 3 from 2 in 1 step? yes
Is it possible to reach 4 from 1 in 1 step? no
Is it possible to reach 5 from 5 in 8 steps? yes
Is it possible to reach 8 from 1 in 10 steps? no
Thank you in advance.
Hypothetical scenario to have a descriptive example: I've a model consisting of 10 parts (vertices) to be put together. Each part can be connected to others (edges) as defined by a connection table.
There's a shortest.paths function in igraph. However here the aim is to find a way to calculate the longest path in the adjacency matrix. Resulting in a path using as many parts as possible, ideally all, so no part of the model is left alone in the end. MWE as follows:
library(igraph)
connections <- read.table(text="A B
1 2
1 7
1 9
1 10
2 7
2 9
2 10
3 1
3 7
3 9
3 10
4 1
4 6
4 7
7 5
7 9
7 10
8 9
8 10
9 10", header=TRUE)
adj <- get.adjacency(graph.edgelist(as.matrix(connections), directed=FALSE))
g1 <- graph_from_adjacency_matrix(adj, weighted=TRUE, mode="undirected")
plot(g1)
Edit:
The result should be something like this: if, for instance, the first part of the model is 8, it could be combined with 9 or 10. Let's say 10 is selected; the next part can then be either 1, 2, 7 or 9. If 9 is selected next, the follow-up could be 1, 2, 3, 7 or 8. If 8 is then selected, the model would be finished, as part 10 is already in use. The question is how to find a way/path that puts together as many parts as possible, ideally all of them. The latter would be possible only by starting with 6 or 5.
There are cycles in your graph, and I don't think you have stated that we cannot use the same vertex (part) more than once. In that case the longest path might be infinitely long, as you can traverse a cycle arbitrarily many times and then proceed to your destination.
As per your edit, I think this is not allowed. You may be able to use dynamic programming for this. Start with a DFS-like algorithm and mark every vertex except the starting one as unvisited. Then recurse: the longest path from a given vertex is the maximum over the longest paths from each vertex reachable from it that has not been visited yet.
It is an NP-hard problem, so in the worst case you would have to check all the possible paths!
See https://en.wikipedia.org/wiki/Longest_path_problem. You will have to modify the algorithm to work on graphs with cycles by adding, as stated earlier, a flag that tells which vertices have already been visited.
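The DFS-with-visited-flags idea can be sketched like this. It is a brute-force illustration in Python rather than R, the function names are made up, and it only scales to small graphs such as the 10-part connection table from the question, since it may explore every simple path.

```python
# Connection table from the question, as undirected edges.
EDGES = [
    (1, 2), (1, 7), (1, 9), (1, 10), (2, 7), (2, 9), (2, 10),
    (3, 1), (3, 7), (3, 9), (3, 10), (4, 1), (4, 6), (4, 7),
    (7, 5), (7, 9), (7, 10), (8, 9), (8, 10), (9, 10),
]


def build_graph(edges):
    """Adjacency sets for an undirected graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def longest_simple_path(graph):
    """Exhaustive DFS with backtracking; never revisits a vertex."""
    best = []

    def extend(path, visited):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for nxt in sorted(graph[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()           # backtrack
                visited.remove(nxt)

    for start in sorted(graph):
        extend([start], {start})
    return best


graph = build_graph(EDGES)
path = longest_simple_path(graph)
print(len(path), path)
```

For this connection table the search finds a path through all 10 parts, and its two ends are parts 5 and 6, which matches the observation in the question.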
Tell me if I got it right: you are trying to find a path that touches the maximum number of nodes?
If so, this is basically an instance of the Hamiltonian path problem, or an easier version of it if you can pass through a node more than once.
You can try looking at that algorithm.
To respect your edit, you could look at graph search algorithms; you can find something here. Be advised, however, that this type of algorithm is quite heavy on memory.
I'd like to split a sequence into k parts, and optimize the homogeneity of these sub-parts.
Example : 0 0 0 0 0 1 1 2 3 3 3 2 2 3 2 1 0 0 0
Result : 0 0 0 0 0 | 1 1 2 | 3 3 3 2 2 3 2 | 1 0 0 0 when you ask for 4 parts (k = 4)
Here, the algorithm did not try to split in fixed-length parts, but instead tried to make sure elements in the same parts are as homogeneous as possible.
What algorithm should I use ? Is there an implementation of it in R ?
Maybe you can use the Expectation-maximization algorithm. Your points would be (value, position). In your example, this would be something like:
With the E-M algorithm, the result would be something like (by hand):
This is the desired output, so you can consider using this and checking whether it really works in all your scenarios. One note: you must fix the number of clusters in advance, but I think that is not a problem for you, given how you have set out your question.
Let me know if this worked ;)
Edit:
See this picture; it shows what you talked about. With k-means you should control the delta value, that is, how much the position increments, so that it is on the same scale as the value. But with E-M this doesn't matter.
Edit 2:
OK, I was not correct: you need to control the delta value. It is not the same whether you increment the position by 1 or by 3: (two clusters)
Thus, as you said, this algorithm could decide to cluster points that are not neighbours if their positions are far apart but their values are close. You need to guarantee that this does not happen by using a high delta increment. I think that with an increment of 2 * (max - min) of the values of your sequence this wouldn't happen.
Now, your points would have the form (value, delta * position).
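As a rough sketch of the (value, delta * position) idea, the code below builds the scaled points and clusters them. Note the assumptions: it uses plain k-means (Lloyd's algorithm) as a simpler stand-in for E-M, it is written in Python rather than R, the function names are made up, and the starting centres are simply picked at evenly spaced positions so that the run is deterministic.

```python
def scaled_points(seq):
    """Turn a sequence into (value, delta * position) points."""
    delta = 2 * (max(seq) - min(seq)) or 1  # fall back to 1 for a constant sequence
    return [(float(v), float(delta * i)) for i, v in enumerate(seq)]


def nearest(point, centers):
    """Index of the centre closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda c: (point[0] - centers[c][0]) ** 2 + (point[1] - centers[c][1]) ** 2,
    )


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm with deterministic, evenly spaced starting centres."""
    n = len(points)
    centers = [points[(2 * c + 1) * n // (2 * k)] for c in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster ends up empty
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels


seq = [0, 0, 0, 0, 0, 1, 1, 2, 3, 3, 3, 2, 2, 3, 2, 1, 0, 0, 0]
labels = kmeans(scaled_points(seq), k=4)
print(labels)
```

With a delta this large the position term tends to dominate the distance, so the resulting parts drift towards equal width; a smaller delta gives the values more influence, at the risk of clusters that are not contiguous.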
I have an n-partite (undirected) graph, given as an adjacency matrix, for instance this one here:
a b c d
a 0 1 1 0
b 0 0 0 1
c 0 0 0 1
d 0 0 0 0
I would like to know if there is a set of matrix operations that I can apply to this matrix, which will result in a matrix that "lists" all paths (of length n, i.e. through all the partitions) in this graph. For the above example, there are paths a->b->d and a->c->d. Hence, I would like to get the following matrix as a result:
a b c d
1 1 0 1
1 0 1 1
The first path contains nodes a,b,d and the second one nodes a,c,d. If necessary, the result matrix may have some all-0 lines, as here:
a b c d
1 1 0 1
0 0 0 0
1 0 1 1
0 0 0 0
Thanks!
P.S. I have looked at algorithms for computing the transitive closure, but these usually only tell if there is a path between two nodes, and not directly which nodes are on that path.
One thing you can do is to compute the nth power of your matrix A. The result will tell you how many paths there are of length n from any one vertex to any other.
Now if you're interested in knowing all of the vertices along the path, I don't think that using purely matrix operations is the way to go. Bearing in mind that you have an n-partite graph, I would set up a data structure as follows: (Bear in mind that space costs will be expensive for all but small values.)
Each column has one entry for each of the nodes in our graph. The entry in the n-th column contains 1 if the node is reachable on the n-th iteration from our designated start vertex or start set, and 0 otherwise. Each entry also contains a list of back pointers to the vertices in column n-1 that led to this vertex in column n. (This is like the Viterbi algorithm, except that we have to maintain a list of back pointers for each entry rather than just one.) The complexity of doing this is (m^2)*n, where m is the number of vertices in the graph and n is the length of the desired path.
I'm a little bit confused by your top matrix: with an undirected graph, I would expect the adjacency matrix to be symmetric.
No, there is no pure matrix way to generate all paths. Please use purely combinatorial algorithms.
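A minimal sketch of the column/back-pointer structure, assuming the graph is stored as a dictionary of successor lists (the names build_layers and enumerate_walks are made up for illustration). Each layer maps the vertices reachable in exactly that many steps to the list of predecessors that led there.

```python
def build_layers(adj, start, steps):
    """layers[t] maps each vertex reachable in exactly t steps to its back pointers."""
    layers = [{start: []}]
    for _ in range(steps):
        nxt = {}
        for u in layers[-1]:
            for v in adj.get(u, []):
                nxt.setdefault(v, []).append(u)  # remember that u led to v
        layers.append(nxt)
    return layers


def enumerate_walks(layers, end):
    """Follow the back pointers from `end` in the last layer to the start."""
    def back(t, v):
        if t == 0:
            return [[v]]
        return [prefix + [v] for u in layers[t][v] for prefix in back(t - 1, u)]

    last = len(layers) - 1
    return back(last, end) if end in layers[last] else []


# The example from the question, read as directed edges a->b, a->c, b->d, c->d.
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
layers = build_layers(adj, "a", 2)
print(enumerate_walks(layers, "d"))  # [['a', 'b', 'd'], ['a', 'c', 'd']]
```

Because an n-partite graph only has edges between consecutive partitions, every sequence listed this way visits each vertex at most once.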
'One thing you can do is to compute the nth power of your matrix A. The result will tell you how many paths there are of length n from any one vertex to any other.'
A power of the adjacency matrix counts walks, not paths.
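A tiny illustration of the difference, using a made-up graph with a single undirected edge 0 - 1: the square of its adjacency matrix reports one route of length 2 from vertex 0 back to itself, namely 0 -> 1 -> 0, which repeats a vertex and is therefore a walk but not a path.

```python
def matmul(a, b):
    """Plain integer matrix product for square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical graph: one undirected edge between vertices 0 and 1.
A = [[0, 1],
     [1, 0]]
A2 = matmul(A, A)
print(A2)  # [[1, 0], [0, 1]]
```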
I've never been much for math and I'm hoping that someone can help me out with the following.
I have 5 boxes:
1 2 3 4 5
[ ] [ ] [ ] [ ] [ ]
The boxes can either be white, gray, or black (or think of it as 0, 1, 2)
How many possible states can the box set be in?
What is the pseudocode (or in any language) to generate all the possible outcomes??
ie...
00000
00001
00011
00111
etc, etc...
I really appreciate any help anyone can give me with this.
The number of combinations is 3x3x3x3x3 (3^5), since each box can have 3 possible colors.
As for generating the outcomes, see if you can figure it out using this matrix with 0, 1, or 2 to represent the color of the box. On a smaller scale (let's assume 3 boxes) it would look like this:
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2
This is a classic generation problem for permutations with repetition. You have 3 possibilities for each position, and 5 positions. The total number of generated strings is 3^5 = 243.
Recursion gives a general solution for any number of boxes (simple nested loops only work for a single instance of the problem).
Here's a quick example:
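using System;

class Program
{
    public static void Main(string[] args)
    {
        Generate("", 5);
    }

    // Appends one digit per call until the string has `limit` characters.
    // Must be static so that the static Main can call it.
    private static void Generate(string s, int limit)
    {
        if (s.Length == limit)
            Console.WriteLine(s);
        else
        {
            Generate(s + "0", limit);
            Generate(s + "1", limit);
            Generate(s + "2", limit);
        }
    }
}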
To answer your first question, what would the answer be if the boxes could contain only one of two values? So, what's the answer if the boxes contain one of three values?
To answer your second question, what pseudocode generates all possible outcomes of one box? Now, what pseudocode generates all possible outcomes of two boxes?
I'd recommend solving the problem on paper first. Try to solve it with a smaller number of boxes (maybe three), and list all possibilities. Then, think of how your reasoning went, or how you'd explain what you did to a small child.
This seems like a homework problem. I'll just give you some help as to the solution then.
What you are saying is that each box has three states, which are all independent. One box would have 3 solutions, and two boxes would have 3 * 3 solutions - for each state of the first box the second box would have three states as well. Extend that to 5 boxes.
To generate each solution, you can just cycle through it. It is easy to make nested for loops for each box, and multiplying by powers of 10 can let you show the number at once.
You can generalize the code for multiple boxes in a similar way.
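One way to generalize the nested loops to any number of boxes, sketched in Python with the standard itertools.product (the function name all_states and its parameters are made up for illustration):

```python
from itertools import product


def all_states(boxes=5, colours=3):
    """Every assignment of one of `colours` digits to each of `boxes` boxes."""
    return ["".join(str(d) for d in state)
            for state in product(range(colours), repeat=boxes)]


states = all_states()
print(len(states))   # 243
print(states[:4])    # ['00000', '00001', '00002', '00010']
```

product(range(3), repeat=5) behaves like five nested for loops, with the last box changing fastest.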
Thank you all for your answers, at least those of you who actually gave me one.
While I can appreciate that the question sounded like it was pulled straight out of Computer Science 101, it wasn't. The irony of the matter is that it was for real life on a real deadline and I didn't have time to hearken back to when I was being taught this stuff and said to myself, "when am I ever going to need this crap"
If I wanted to be patronized and treated like a school boy I would go back to my elementary school and ask my 5th grade teacher if I can go to the bathroom
Thanks again
the number of states is 3^5.
pseudocode is
for value from 0 to 3^5-1
print base3(value)
where base3 is a function that repeatedly takes modulo 3 to get a digit, then removes that digit (by dividing by 3)
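The pseudocode above could look like this in Python. The fixed width of 5 digits is an assumption added so that leading zeros are kept.

```python
def base3(value, width=5):
    """Base-3 digits of `value`, left-padded with zeros to `width` digits."""
    digits = []
    for _ in range(width):
        digits.append(str(value % 3))  # take the lowest digit
        value //= 3                    # then remove it
    return "".join(reversed(digits))


states = [base3(value) for value in range(3 ** 5)]
print(len(states))  # 243
print(states[:3])   # ['00000', '00001', '00002']
```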
Hint: imagine that each box is a position in a number and each colour is a different digit. In the real world, how many combinations (including zero) do you get with 2 positions and 10 possible digits? What about 3 positions? What's the relationship between adding an extra position and the number of combinations, given the number of digits you have available?
Unique number of combinations: 3^5=243
Code:
for i = 0 to 3^5-1
{
    n = i
    s = ""
    for j = 1 to 5
    {
        d = n mod 3
        s = toascii(d) . s
        n = n / 3    (integer division)
    }
    println s
}
Here's how I first learned to do this: first think about how many choices you are making. You are making five choices, one for each box. So write down five blank lines with multiplication signs:
__ x __ x __ x __ x __ = ?
In each blank, write the number of objects you have to choose from for that box. Since you have 3 numbers to choose from for each box, you write a 3 in every blank:
3 x 3 x 3 x 3 x 3 = 243
This gives you the total number of permutations for those choices.
The number of possibilities is 3 to the power 5
If you loop from 0 to that number minus 1 and express it in base 3 you will have all the possibilities (remember to prepend 0s where necessary)
In Ruby:
number_of_possibilities = 3**5
(0...number_of_possibilities).each do |i|
  base_3_number = i.to_s(3)
  puts base_3_number.rjust(5, "0") # left-pad with 0s where necessary
end
Can I ask what about this you don't understand, or what's tripping you up? I see that everyone here has simply answered the question, but if you've copied their answers, you've learned nothing, and thus completely missed the point of the homework. Assuming your next lesson builds upon this one, you're just going to fall further behind.
If you either worked for me or were in my class I'd simply ask the following...
"How do you think the problem should be solved?" The answer to which might reveal where you're getting hung up. A wise professor of mine at CMU once said, "I can't help you understand this until you know what you don't understand." I never did figure out what I didn't understand and I dropped his class, but the lesson stuck with me.
I know it's probably too late, but for these homework questions I really think we should be helping the person learn as opposed to simply providing an answer and doing their homework for them.
Your problem needs nothing more than the rule of product in combinatorics.
You can choose the state of the first box in 3 ways, and the state of the second box in 3 ways, and ... and the state of the 5th box in 3 ways. The number of ways in which you can set the state of all the boxes is the product of all the five (equal) numbers of ways, i.e. 3x3x3x3x3 = 3^5 = 243.
Similar question: how many numbers can you form with 5 digits in the decimal system, counting the leading zeros? That is, how many numbers are there from 00000 to 99999? You can choose the first digit in 10 ways (0...9), and so on and so on, and the answer is 10x10x10x10x10 = 100000, as you already know.
Don't even try to write code to answer this! The reason is that you need some very large numbers (factorials) to calculate it. These create numbers much larger than any base type in the CLR. You can use this opensource library to do the calculation.
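#include <iostream>
using namespace std;

// p = boxes filled so far, n = total boxes,
// d = the digits chosen so far, packed into a decimal number.
void solve(int p=0,int n=5,int d=0)
{
    if (n==p)
    {
        // Print the n digits, last chosen digit first.
        int rev=d;
        int i=0;
        while (i<n) {
            cout << rev%10;
            rev /= 10;
            i++;
        }
        cout << endl;
        return;
    }
    // Try each of the three colours for the next box.
    for(int i=0; i<3 ; i++)
    {
        solve(p+1,n, d*10 + i);
    }
}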