This is another question from my past midterm, and I am supposed to give a formal formulation, describe the algorithm used, and justify its correctness. Here is the problem:
The University is trying to schedule n different classes. Each class has a start and finish time. All classes have to be taught on Friday. There are only two classrooms available.
Help the university decide whether it is possible to schedule these classes without causing any time conflict (i.e., no two classes with overlapping times are scheduled in the same classroom).
Sort the classes by starting time (O(n log n)), then go through them in order (O(n)), noting starting and ending times and looking for the case of more than two classes going on at the same time.
This isn't a problem with a bipartite graph solution. Who told you it was?
#Beta is nearly correct. Create a list of pairs <START, time> and <END, time>. Each class has two pairs in the list, one START, and one END.
Now sort the list by time. Or, if you like, put them in a min-heap, which amounts to heapsort. For equal times, put the END pairs before the START pairs. Then execute the following loop:
set N = 0
while sorted list not empty
    pop <tag, time> from the head of the list
    if tag == START
        N = N + 1
        if N > 2 return "can't schedule"
    else // tag == END
        N = N - 1
    end
end
return "can schedule"
You can easily enrich the algorithm a bit to report the time periods where more than 2 classes are in session at the same time, the classes involved, and other useful information.
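For reference, a minimal runnable sketch of this sweep in Python, assuming the classes are given as (start, finish) pairs; the function name and the sample data are just illustrative:

def can_schedule(classes, rooms=2):
    """classes: list of (start, finish) pairs; rooms: number of classrooms."""
    events = []
    for start, finish in classes:
        events.append((start, 1))     # a class begins
        events.append((finish, -1))   # a class ends
    # Sort by time; for equal times the END event (-1) sorts before START (+1)
    events.sort()
    open_classes = 0
    for _, delta in events:
        open_classes += delta
        if open_classes > rooms:
            return False              # more than `rooms` classes overlap
    return True

print(can_schedule([(9, 11), (10, 12), (11, 13)]))   # True: never more than 2 at once
print(can_schedule([(9, 12), (10, 12), (11, 13)]))   # False: three classes overlap at 11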
This indeed IS a bipartite/bicoloring problem.
Imagine each class as a node of a graph, and create an edge between two nodes if their class times overlap. If the resulting graph can be bicolored, then it is possible to schedule all the classes; otherwise it is not.
If the graph can be bicolored, each "black" node is assigned to room 1 and each "white" node to room 2.
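If you go this route, a BFS-based sketch of the bicoloring check could look like the following (Python); it assumes the conflict graph has already been built as an adjacency dict, and all names are illustrative:

from collections import deque

def two_colorable(adj):
    """adj: dict mapping each class to the classes whose times overlap with it."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # put it in the other room
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # conflict: no two-room schedule
    return color  # 0 -> room 1, 1 -> room 2

# Three mutually overlapping classes form a triangle, which is not bicolorable:
print(two_colorable({'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B']}))   # None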
While studying MDP via different sources, I came across two different formulas for the Value update in Value-Iteration algorithm.
The first one is (the one on Wikipedia and a couple of books):
$$V_{i+1}(s) = \max_a \sum_{s'} P_a(s, s')\,\bigl(R_a(s, s') + \gamma\, V_i(s')\bigr)$$

And the second one is (in some questions here on Stack, and my course slides):

$$V_{i+1}(s) = R(s) + \gamma\, \max_a \sum_{s'} P_a(s, s')\, V_i(s')$$
For a specific iteration, they don't seem to give the same answer. Does one of them converge faster to the solution?
Actually the difference is in the reward function: R_a(s, s') in the first formula versus R(s) in the second.
The first equation is the more general one.
In the first one, the reward is R_a(s, s') when transitioning from state s to state s' due to action a.
The reward can be different for different states and actions.
But if for every state s we have some pre-defined reward (regardless of the previous state and the action that leads to s), then we can simplify the formula to the second one.
The final values are not necessarily equal, but the policies are the same.
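To make the comparison concrete, here is a small sketch (Python/NumPy) of both updates on a made-up two-state, two-action MDP, written under the assumption described above that the reward depends only on the state being entered; the numbers are purely illustrative:

import numpy as np

# P[a, s, s2] = probability of landing in s2 when taking action a in state s
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([1.0, 0.5])   # reward attached to a state, independent of a and s
gamma = 0.9

# Form 1: V(s) = max_a sum_s' P_a(s,s') (R_a(s,s') + gamma V(s')),
# here with R_a(s,s') = R(s'), i.e. the reward is collected on arrival.
V1 = np.zeros(2)
for _ in range(500):
    V1 = np.max(np.sum(P * (R + gamma * V1), axis=2), axis=0)

# Form 2: V(s) = R(s) + gamma max_a sum_s' P_a(s,s') V(s')
V2 = np.zeros(2)
for _ in range(500):
    V2 = R + gamma * np.max(P @ V2, axis=0)

print(V1)
print(V2)
print(np.allclose(R + gamma * V1, V2))   # True: V2 = R + gamma*V1, so the greedy
                                         # (argmax) policies coincide even though
                                         # the value vectors themselves differ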
In a .NET project we have a group of 200 people of two types, let's say x and y, who need to be separated into groups of 7 or 8.
We have a web page where people write down the other members they want to be in a group with, so each person builds a list of wanted members.
After this, there should be an algorithm that builds the 7-8 member groups taking the people's ratings into account, subject to the following condition: each group has at least 2 people of each type (x/y).
I'm pretty sure there must be a well-known algorithm for something like this, but I didn't find one. Does anyone know how to do it?
This problem smells NP-hard, so I suggest using artificial-intelligence tools.
A possible approach is steepest-ascent hill climbing [SAHC].
First, we define our utility function (call it u) as mentioned in the comments to the question: for each user, the number of wanted members in their group, summed over all users. Let's define u(illegal) = -1 for an illegal solution.
Next, we define our 'world': S is the set of all possible solutions.
For each solution s in S we define:
next(s) = {all solutions obtained by moving one person to a different group}
All we have to do now is run SAHC with random restarts:
1. best <- -INFINITY
2. while there is more time:
3.     choose a random legal solution s
4.     NEXT <- next(s)
5.     if max{ u(NEXT) } < u(s): // s is the top of the hill
5.1.       if u(s) > best: best <- u(s) // if s is better than the previous result, store it
5.2.       go to 2 // restart the hill climbing from a different random point
6.     else:
6.1.       s <- max{ NEXT } // climb the steepest hill
6.2.       go to 4
7. return best // when out of time, return the best solution found so far
It is an anytime algorithm, meaning it will get a better result the more time you give it to run, and eventually [at time infinity] it will find the optimal result.
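One way this could look in code is sketched below (Python). It is only a sketch: random_solution is an assumed helper that returns a random legal partition into groups of 7-8, the time budget is modelled as an iteration count, and illegal neighbours are simply filtered out instead of being scored -1:

# people: {person_id: (person_type, set_of_wanted_ids)}
def utility(groups, people):
    total = 0
    for g in groups:
        members = set(g)
        for p in g:
            total += len(people[p][1] & members)   # wishes satisfied inside the group
    return total

def is_legal(groups, people):
    for g in groups:
        if not 7 <= len(g) <= 8:
            return False
        types = [people[p][0] for p in g]
        if types.count('x') < 2 or types.count('y') < 2:
            return False
    return True

def neighbors(groups):
    # all solutions reachable by moving one person to a different group
    for i, g in enumerate(groups):
        for p in g:
            for j in range(len(groups)):
                if j != i:
                    new = [list(h) for h in groups]
                    new[i].remove(p)
                    new[j].append(p)
                    yield new

def sahc(people, random_solution, restarts=1000):
    best, best_u = None, float('-inf')
    for _ in range(restarts):                      # "while there is more time"
        s = random_solution()
        u = utility(s, people)
        while True:
            legal = [n for n in neighbors(s) if is_legal(n, people)]
            if not legal:
                break
            n = max(legal, key=lambda n: utility(n, people))
            nu = utility(n, people)
            if nu <= u:                            # s is the top of the hill
                break
            s, u = n, nu                           # climb the steepest hill
        if u > best_u:
            best, best_u = s, u                    # remember the best hilltop so far
    return best, best_u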
I have an undirected graph with vertex set V and edge set E. I am looking for an algorithm to identify all the cycle bases in that graph.
I think Tarjan's algorithm is a good start, but the reference I have is about finding all of the cycles, not the cycle base (which, by definition, is a cycle that cannot be constructed as a union of other cycles).
For example, take a look at the below graph:
So, an algorithm would be helpful. If there is an existing implementation (preferably in C#), it's even better!
From what I can tell, not only is Brian's hunch spot on, but an even stronger proposition holds: each edge that's not in the minimum spanning tree adds exactly one new "base cycle".
To see this, let's see what happens when you add an edge E that's not in the MST. Let's do the favorite math way to complicate things and add some notation ;) Call the original graph G, the graph before adding E G', and the graph after adding E G''. So we need to find out how the "base cycle count" changes from G' to G''.
Adding E must close at least one cycle (otherwise E would be in the MST of G in the first place). So obviously it must add at least one "base cycle" to the already existing ones in G'. But does it add more than one?
It can't add more than two, since no edge can be a member of more than two base cycles. But if E is a member of two base cycles, then the "union" of these two base cycles must've been a base cycle in G', so again we get that the change in the number of cycles is still one.
Ergo, for each edge not in the MST you get a new base cycle. So the "count" part is simple. Finding all the edges for each base cycle is a little trickier, but following the reasoning above, I think this could do it (in pseudo-Python):
for v in vertices[G]:
    cycles[v] = []

for e in (edges[G] \ mst[G]):
    cycle_to_split = intersect(cycles[e.node1], cycles[e.node2])
    if cycle_to_split == None:
        # we're adding a completely new cycle
        path = find_path(e.node1, e.node2, mst[G])
        for vertex on path:
            cycles[vertex].append(path + [e])
    else:
        # we're splitting an existing base cycle
        cycle1, cycle2 = split(cycle_to_split, e)
        for vertex on cycle_to_split:
            cycles[vertex].remove(cycle_to_split)
            if vertex on cycle1:
                cycles[vertex].append(cycle1)
            if vertex on cycle2:
                cycles[vertex].append(cycle2)

base_cycles = set(cycles)
Edit: the code should find all the base cycles in a graph (the base_cycles set at the bottom). The assumptions are that you know how to:
find the minimum spanning tree of a graph (mst[G])
find the difference between two lists (edges \ mst[G])
find an intersection of two lists
find the path between two vertices on a MST
split a cycle into two by adding an extra edge to it (the split function)
And it mainly follows the discussion above. For each edge not in the MST, you have two cases: either it brings a completely new base cycle, or it splits an existing one in two. To track which of the two is the case, we track all the base cycles that a vertex is a part of (using the cycles dictionary).
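For completeness, a compact runnable version in plain Python is sketched below. It uses a BFS spanning tree instead of an MST (any spanning tree yields a fundamental cycle basis of the same size, although the splitting step above can produce a different basis), and it assumes a simple connected graph given as vertex and edge lists; all names are illustrative:

from collections import deque

def fundamental_cycles(vertices, edges):
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Build a BFS spanning tree, remembering each vertex's parent
    root = next(iter(vertices))
    parent = {root: None}
    queue = deque([root])
    tree = set()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                tree.add(frozenset((u, v)))
                queue.append(v)
    def path_to_root(v):
        path = []
        while v is not None:
            path.append(v)
            v = parent[v]
        return path
    cycles = []
    for u, v in edges:
        if frozenset((u, v)) in tree:
            continue                      # tree edges close no cycle
        pu, pv = path_to_root(u), path_to_root(v)
        # trim the common tail so only the tree path u .. lca .. v remains
        while len(pu) > 1 and len(pv) > 1 and pu[-2] == pv[-2]:
            pu.pop(); pv.pop()
        cycles.append(pu[:-1] + list(reversed(pv)))   # one base cycle per non-tree edge
    return cycles

print(fundamental_cycles(['a', 'b', 'c', 'd'],
                         [('a', 'b'), ('b', 'c'), ('c', 'a'), ('c', 'd'), ('d', 'a')]))
# [['b', 'a', 'c'], ['c', 'a', 'd']]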
Off the top of my head, I would start by looking at any minimum spanning tree algorithm (Prim, Kruskal, etc.). There can't be more base cycles (if I understand it correctly) than edges that are NOT in the MST...
The following is my actual untested C# code to find all these "base cycles":
public HashSet<List<EdgeT>> FindBaseCycles(ICollection<VertexT> connectedComponent)
{
Dictionary<VertexT, HashSet<List<EdgeT>>> cycles =
new Dictionary<VertexT, HashSet<List<EdgeT>>>();
// For each vertex, initialize the dictionary with empty sets of lists of
// edges
foreach (VertexT vertex in connectedComponent)
cycles.Add(vertex, new HashSet<List<EdgeT>>());
HashSet<EdgeT> spanningTree = FindSpanningTree(connectedComponent);
foreach (EdgeT edgeNotInMST in
GetIncidentEdges(connectedComponent).Except(spanningTree)) {
// Find one cycle to split; the HashSet resulting from the intersection
// operation will contain at most one cycle
HashSet<List<EdgeT>> cycleToSplitSet = new HashSet<List<EdgeT>>(
    cycles[(VertexT)edgeNotInMST.StartPoint]
        .Intersect(cycles[(VertexT)edgeNotInMST.EndPoint]));
if (cycleToSplitSet.Count == 0) {
// Find the path between the endpoints of the current non-tree edge using
// the spanning tree itself
List<EdgeT> path =
FindPath(
(VertexT)edgeNotInMST.StartPoint,
(VertexT)edgeNotInMST.EndPoint,
spanningTree);
// The found path plus the current edge becomes a cycle
path.Add(edgeNotInMST);
foreach (VertexT vertexInCycle in VerticesInPathSet(path))
cycles[vertexInCycle].Add(path);
} else {
// Get the cycle to split from the set produced before
List<EdgeT> cycleToSplit = cycleToSplitSet.First();
List<EdgeT> cycle1 = new List<EdgeT>();
List<EdgeT> cycle2 = new List<EdgeT>();
SplitCycle(cycleToSplit, edgeNotInMST, cycle1, cycle2);
// Remove the cycle that has been split from the vertices on that
// cycle and add the two results of the split operation to them
foreach (VertexT vertex in VerticesInPathSet(cycleToSplit)) {
cycles[vertex].Remove(cycleToSplit);
if (VerticesInPathSet(cycle1).Contains(vertex))
cycles[vertex].Add(cycle1);
if (VerticesInPathSet(cycle2).Contains(vertex))
cycles[vertex].Add(cycle2);
}
}
}
HashSet<List<EdgeT>> ret = new HashSet<List<EdgeT>>();
// Collect, over all vertices, the cycles that remain in their per-vertex sets
foreach (HashSet<List<EdgeT>> remainingCycles in cycles.Values)
    ret.UnionWith(remainingCycles);
return ret;
}
Oggy's code was very good and clear, but I'm pretty sure it contains an error, or maybe it's me not understanding your pseudo-Python code :)
cycles[v] = []
can't be a vertex-indexed dictionary of lists of edges. In my opinion, it has to be a vertex-indexed dictionary of sets of lists of edges.
And, to add a clarification:
for vertex on cycle_to_split:
cycle_to_split is probably an ordered list of edges, so to iterate over its vertices you have to convert it into a set of vertices. Order does not matter here, so it's a very simple algorithm.
I repeat, this is untested and incomplete code, but it is a step forward. It still requires a proper graph structure (I use an incidence list) and many graph algorithms you can find in textbooks like Cormen. I wasn't able to find FindPath() and SplitCycle() in textbooks, and they are quite hard to code in time linear in the number of edges + vertices of the graph. I will report them here once I have tested them.
Thanks a lot, Oggy!
The standard way to detect a cycle is to use two iterators: at each step, one moves forward by one node and the other by two. If there is a cycle, at some point they will point to the same node.
This approach could be extended to record the cycles found and move on.
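For reference, here is the two-iterator idea in code (Python), applied to a successor function such as a linked list's next pointer; note it detects a single cycle reachable from the start, which is a narrower setting than finding a cycle basis of a general undirected graph:

def has_cycle(start, successor):
    """Floyd's check: returns True if repeatedly applying `successor`
    from `start` eventually loops; `successor` returns None at the end."""
    slow = fast = start
    while fast is not None and successor(fast) is not None:
        slow = successor(slow)                 # one step
        fast = successor(successor(fast))      # two steps
        if slow == fast:
            return True                        # the two iterators met inside a cycle
    return False

# Example: nodes 0..4 where 4 points back to 2, so there is a cycle.
nxt = {0: 1, 1: 2, 2: 3, 3: 4, 4: 2}
print(has_cycle(0, lambda n: nxt.get(n)))   # True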
I'm looking for a general idea (and maybe some code example or at least pseudocode)
Now, this is from a problem that someone gave me, or rather showed me. I don't have to solve it, but I did most of the questions anyway. The problem I'm having is this:
Let's say you have a directed weighted graph with the following edges (each written as its two endpoints followed by its weight):
AB5, BC4, CD8, DC8, DE6, AD5, CE2, EB3, AE7
and the question is:
how many different routes are there from C to C with a distance of less than x (say, 10, 20, 30, 40)?
The different routes (for a limit of 30) are: CDC, CEBC, CEBCDC, CDCEBC, CDEBC, CEBCEBC, CEBCEBCEBC.
The main problem I'm having is this: when I do DFS or BFS, my implementation first chooses a node and marks it as visited, so I'm only able to find two paths, CDC and CEBC, and then my algorithm quits. If I don't mark nodes as visited, then on the next iteration (or recursive call) it chooses the same node again rather than the next available route, so I always have to mark them as visited. But by doing that, how can I get, for example, CEBCEBCEBC, which is pretty much bouncing between nodes?
I've looked at all the different algorithms books I have at home, and while every one describes how to do DFS, BFS, and shortest paths (all the good stuff), none shows how to iterate indefinitely and stop only when a certain total weight is reached or a certain vertex is hit a certain number of times.
So why not just keep branching and branching? At each node you evaluate two things: has this particular path exceeded the weight limit (if so, terminate the branch), and is this node where I started (in which case log the path history to an 'acceptable solutions' list)? Then make new branches, each taking a step in one of the possible directions.
You should not mark nodes as visited; as MikeB points out, CDCDC is a valid solution and yet it revisits D.
I'd do it like this:
Start with two lists of paths:
    Solutions (empty) and
    ActivePaths (containing one path, "C").
While ActivePaths is not empty:
    Take a path out of ActivePaths (suppose it's "CD"[8]).
    If its distance is not over the limit,
        see where you are by looking at the last node in the path ("D").
        If you're at "C", add a copy of this path to Solutions.
        Now for each possible next destination ("C", "E"):
            make a copy of this path ("CD"[8]),
            append the destination ("CDC"[8]),
            add the weight ("CDC"[16]),
            and put it in ActivePaths.
    Discard the path.
Whether this turns out to be a DFS, a BFS or something else depends on where in ActivePaths you insert and remove paths.
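A minimal version of this list-based search in Python, with the graph hard-coded from the question's edge list and the strict "less than the limit" cutoff the question uses; the names are illustrative:

from collections import deque

graph = {'A': {'B': 5, 'D': 5, 'E': 7},
         'B': {'C': 4},
         'C': {'D': 8, 'E': 2},
         'D': {'C': 8, 'E': 6},
         'E': {'B': 3}}

def routes_under(start, end, limit):
    solutions = []
    active = deque([(start, 0)])            # (path so far, distance so far)
    while active:
        path, dist = active.popleft()
        node = path[-1]
        if dist > 0 and node == end:
            solutions.append((path, dist))  # non-empty route back at the end node
        for nxt, w in graph[node].items():
            if dist + w < limit:            # only keep branches still under the limit
                active.append((path + nxt, dist + w))
    return solutions

for path, dist in routes_under('C', 'C', 30):
    print(path, dist)
# finds the seven routes from the question for a limit of 30:
# CDC, CEBC, CDEBC, CDCEBC, CEBCDC, CEBCEBC, CEBCEBCEBC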
No offense, but this is pretty simple and you're talking about consulting a lot of books for the answer. I'd suggest playing around with the simple examples until they become more obvious.
In fact you have two different problems:
Find all distinct cycles from C to C; call them C_1, C_2, ..., C_n (done with a DFS).
Each C_i has a weight w_i; you then want every combination of cycles with total weight less than N. This is a combinatorial problem (and seems easily solvable with dynamic programming).
What is the best way to find out whether two number ranges intersect?
My number range is 3023-7430. Now I want to test which of the following number ranges intersect with it: <3000, 3000-6000, 6000-8000, 8000-10000, >10000. The answer should be 3000-6000 and 6000-8000.
What's the nice, efficient mathematical way to do this in any programming language?
Just a pseudocode guess:
Set<Range> determineIntersectedRanges(Range range, Set<Range> setOfRangesToTest)
{
    Set<Range> results;
    foreach (rangeToTest in setOfRangesToTest)
    do
        if (rangeToTest.end < range.start) continue;   // skip this one, it's below our range
        if (rangeToTest.start > range.end) continue;   // skip this one, it's above our range
        results.add(rangeToTest);
    done
    return results;
}
I would make a Range class and give it a method boolean intersects(Range). Then you can do a
for (Range r : rangeset) { if (range.intersects(r)) res.add(r); }
or, if you use some Java 8 style functional programming for clarity:
rangeset.stream().filter(range::intersects).collect(Collectors.toSet())
The intersection itself is something like
this.start <= other.end && this.end >= other.start
This heavily depends on your ranges. A range can be big or small, and clustered or not clustered. If you have large, clustered ranges (think of "all positive 32-bit integers that can be divided by 2"), the simple approach with Range(lower, upper) will not succeed.
I guess I can say the following:
if you have little ranges (clustering or not clustering does not matter here), consider bitvectors. These little critters are blazing fast with respect to union, intersection and membership testing, even though iteration over all elements might take a while, depending on the size. Furthermore, because they just use a single bit for each element, they are pretty small, unless you throw huge ranges at them.
if you have fewer, larger ranges, then a Range class as described by others will suffice. This class has the attributes lower and upper; two ranges a and b are disjoint exactly when b.upper < a.lower or a.upper < b.lower. Union and intersection can be implemented in constant time for single ranges; for composite ranges, the time grows with the number of sub-ranges (thus you do not want too many little ranges)
If you have a huge space where your numbers can be, and the ranges are distributed in a nasty fashion, you should take a look at binary decision diagrams (BDDs). These nifty diagrams have two terminal nodes, True and False, and decision nodes for each bit of the input. A decision node has a bit it looks at and two outgoing graph nodes -- one for "bit is one" and one for "bit is zero". Given these conditions, you can encode large ranges in tiny space. The set of all even numbers, over arbitrarily large integers, can be encoded in 3 nodes of the graph -- basically a single decision node for the least significant bit which goes to False on 1 and to True on 0.
Intersection and union are pretty elegant recursive algorithms. For example, the intersection basically takes two corresponding nodes in each BDD, traverses the 1-edge until some result pops up, and checks: if one of the results is the False terminal, create a 1-branch to the False terminal in the result BDD. If both are the True terminal, create a 1-branch to the True terminal in the result BDD. If it is something else, create a 1-branch to this something-else in the result BDD. After that, some minimization kicks in (if the 0-branch and the 1-branch of a node go to the same following BDD / terminal, remove it and pull the incoming transitions to the target) and you are golden. We even went further than that: we worked on simulating addition of sets of integers on BDDs in order to enhance value prediction for optimizing conditions.
These considerations imply that your operations are bounded by the number of bits in your number range, that is, by log_2(MAX_NUMBER). Just think of it: you can intersect arbitrary sets of 64-bit integers in almost constant time.
More information can be found, for example, on Wikipedia and in the papers referenced there.
Further, if false positives are bearable and you only need an existence check, you can look at Bloom filters. Bloom filters use a bit vector and several hash functions to check whether an element is contained in the represented set. Intersection and union are constant time. The major problem here is that the false-positive rate increases as you fill up the Bloom filter too much.
Information, again, on Wikipedia, for example.
Ah, set representation is a fun field. :)
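As a tiny illustration of the bitvector idea for a small, dense universe, using Python integers as bit sets (purely illustrative; the open-ended "<3000" and ">10000" ranges from the question are approximated by finite ones):

def range_bits(lo, hi):
    """Integer whose bit i is set exactly when lo <= i <= hi (small non-negative universe assumed)."""
    return ((1 << (hi - lo + 1)) - 1) << lo

a = range_bits(3023, 7430)
for lo, hi in [(0, 2999), (3000, 6000), (6000, 8000), (8000, 10000), (10001, 20000)]:
    if a & range_bits(lo, hi):     # non-zero bitwise AND means the ranges intersect
        print(lo, hi)              # prints 3000 6000 and 6000 8000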
In Python:

class nrange(object):
    def __init__(self, lower=None, upper=None):
        self.lower = lower
        self.upper = upper

    def intersection(self, aRange):
        # Disjoint if one range ends before the other begins
        if self.upper < aRange.lower or aRange.upper < self.lower:
            return None
        else:
            # Otherwise the overlap runs from the larger lower bound
            # to the smaller upper bound
            return nrange(max(self.lower, aRange.lower),
                          min(self.upper, aRange.upper))
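For the question's concrete numbers, usage of the class above could look like this (again approximating the open-ended "<3000" and ">10000" ranges by finite ones):

target = nrange(3023, 7430)
for r in [nrange(0, 2999), nrange(3000, 6000), nrange(6000, 8000),
          nrange(8000, 10000), nrange(10001, 20000)]:
    if target.intersection(r):
        print(r.lower, r.upper)   # prints 3000 6000 and 6000 8000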
If you're using Java, Commons Lang Range has an overlapsRange(Range range) method.