I know there are some scheduling problems out there that are NP-hard/NP-complete ... however, none of them are stated in such a way as to show that this situation is also NP-hard.
If you have a set of tasks, each constrained by a startAfter, a startBy, and a duration, all competing for a single resource ... can you resolve a schedule, or identify that it cannot be resolved, without an exhaustive search?
If the answer is "sorry pal, but this is NP-complete", what would be the best heuristic(s) to use, and are there ways to decrease the time it takes to (a) resolve a schedule and (b) identify an unresolvable schedule?
I've implemented (in Prolog) a basic conflict-resolution goal through recursion that implements a "smallest window first" heuristic. This actually finds solutions rather quickly, but is exceptionally slow at identifying invalid schedules. Is there a way to overcome this?
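For reference, here is a rough Java restatement of that "smallest window first" idea; the Task fields and the greedy, no-backtracking placement are illustrative assumptions, not the original Prolog code:

import java.util.*;

// Greedy "smallest window first" sketch: place the tasks with the least slack
// (startBy - startAfter) first, each at the earliest free slot on the single
// resource. There is no backtracking, so returning false only means the
// heuristic failed, not that the schedule is provably unresolvable.
class Task {
    int startAfter, startBy, duration;   // hypothetical fields
    int start;                           // chosen start time, filled in by the scheduler
    Task(int startAfter, int startBy, int duration) {
        this.startAfter = startAfter; this.startBy = startBy; this.duration = duration;
    }
}

class SmallestWindowFirst {
    static boolean schedule(List<Task> tasks) {
        tasks.sort(Comparator.comparingInt((Task t) -> t.startBy - t.startAfter));
        List<int[]> busy = new ArrayList<>();            // occupied [start, end) intervals
        for (Task t : tasks) {
            int s = t.startAfter;
            boolean placed = false;
            while (s <= t.startBy && !placed) {
                int conflictEnd = -1;
                for (int[] iv : busy) {
                    if (s < iv[1] && s + t.duration > iv[0]) {
                        conflictEnd = Math.max(conflictEnd, iv[1]);
                    }
                }
                if (conflictEnd < 0) {                   // no overlap: place the task here
                    t.start = s;
                    busy.add(new int[]{s, s + t.duration});
                    placed = true;
                } else {
                    s = conflictEnd;                     // jump past the blocking interval
                }
            }
            if (!placed) return false;                   // heuristic failure
        }
        return true;
    }
}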
Yay for compound questions!
The hardest part of most scheduling problems in real life is getting hold of a reliable and complete set of constraints. If we take the example of creating a university timetable:
Professor A will not get up in the morning and he is on a lot of committees, but no one will tell the timetable office about this sort of constraint
Department 1 needs the timetable by the start of term; however, Department 2, which uses the same rooms, is unwilling to decide which courses will run until after all the students have arrived
Etc
Then you need a scheduling system that can cope with changes, so that when one constraint is changed at the last minute you don't have to redo the complete timetable.
All of the above is normally ignored in research papers about scheduling systems. As to the NP-completeness of a given scheduling problem: in real life you don't care, since even if it is not NP-complete you are unlikely to be able to define what the "best solution" is, so good enough is good enough.
See http://www.asap.cs.nott.ac.uk/watt/resources/university.html for a list of papers that may help get you started; there are still many PhDs to be had in scheduling software.
There are often good approximation algorithms for NP-hard/complete optimization problems like scheduling. You might skim the course notes by Ahmed Abu Safia on Approximation Algorithms for scheduling or various papers.
In a sense, public key cryptography is done with "less hard" problems like factoring partly because NP-hard problems offer up too many easy instances. The same NP-completeness that makes them "morally hard" also gives them too many easy instances, which often fall within some error bound of optimal.
There is a deeper theory of hardness of approximation that discusses the limitations of approximation algorithms though.
You can use dynamic programming to solve some of these things. Greedy algorithms also come to mind. Scheduling theory is both deep and beautiful but those two I find will solve most of the problems I've faced. Perhaps I've been lucky.
What do you mean by startBy?
With startAfter, and if there is only one resource, a fast solution could be topological sorting. The example algorithm runs in linear time, but it does not handle the error case where the graph contains cycles.
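If startAfter is read as a precedence constraint between tasks (task B may only start after task A), a sketch of Kahn's algorithm with the missing cycle check added could look like this (all names are illustrative):

import java.util.*;

// Kahn's algorithm: repeatedly emit a task with no unscheduled predecessors.
// If tasks remain when none is ready, the precedence graph has a cycle and
// no schedule exists on the single resource.
class TopoSchedule {
    // edges.get(i) = tasks that must run after task i
    static List<Integer> order(int n, List<List<Integer>> edges) {
        int[] indegree = new int[n];
        for (List<Integer> successors : edges)
            for (int v : successors) indegree[v]++;
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < n; i++) if (indegree[i] == 0) ready.add(i);
        List<Integer> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            int u = ready.poll();
            result.add(u);
            for (int v : edges.get(u)) if (--indegree[v] == 0) ready.add(v);
        }
        if (result.size() < n)
            throw new IllegalStateException("cycle detected: no valid schedule");
        return result;   // one valid ordering of the tasks on the single resource
    }
}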
Here's one that isn't.
Schedule a set of jobs i = 1, 2, ..., n on a single machine, where job i takes time t(i), so that the average waiting time is minimized.
Solution: sort in increasing order of t(i). O(n log n).
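A sketch of that rule (shortest processing time first), with invented names:

import java.util.Arrays;

// Shortest-processing-time-first: sorting by t(i) minimizes the average
// waiting time on a single machine.
class SptRule {
    static double averageWaitingTime(int[] t) {
        int[] sorted = t.clone();
        Arrays.sort(sorted);                 // O(n log n)
        long wait = 0, elapsed = 0;
        for (int duration : sorted) {
            wait += elapsed;                 // this job waits for every job scheduled before it
            elapsed += duration;
        }
        return (double) wait / t.length;
    }
}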
Good list here
Consider this scheduling problem, which is in the class P:
Input: a list of activities, each with a start time and a finish time.
Sort the activities by finish time.
Walk the sorted list and select each activity whose start time is no earlier than the finish time of the last activity you selected; this yields the maximum number of activities you can schedule.
You can add caveats like "all activities must end by 5pm"; in that case, as you work through the list, stop once you reach an activity that ends after this time.
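A compact sketch of that greedy (types and names invented for illustration):

import java.util.*;

// Classic activity selection: sort by finish time, then greedily take every
// activity that starts no earlier than the finish time of the last one taken.
class ActivitySelection {
    static List<int[]> select(List<int[]> activities) {     // each int[] = {start, finish}
        List<int[]> sorted = new ArrayList<>(activities);
        sorted.sort(Comparator.comparingInt(a -> a[1]));    // by finish time
        List<int[]> chosen = new ArrayList<>();
        int lastFinish = Integer.MIN_VALUE;
        for (int[] a : sorted) {
            if (a[0] >= lastFinish) {                       // compatible with everything chosen so far
                chosen.add(a);
                lastFinish = a[1];
            }
        }
        return chosen;   // a maximum-size set of non-overlapping activities
    }
}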
I am working on my bachelor's final project, which is a comparison between Apache Spark Streaming and Apache Flink (streaming only), and I have just arrived at "Physical partitioning" in Flink's documentation. The problem is that the documentation doesn't explain well how these two transformations work. Directly from the documentation:
shuffle(): Partitions elements randomly according to a uniform distribution.
rebalance(): Partitions elements round-robin, creating equal load per partition. Useful for performance optimisation in the presence of data skew.
Source: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/datastream_api.html#physical-partitioning
Both are done automatically, so what I understand is that they both redistribute the data evenly (shuffle() via a uniform distribution, rebalance() via round-robin). I then deduce that rebalance() distributes the data in a better way ("equal load per partition"), so the tasks have to process the same amount of data, while shuffle() may create bigger and smaller partitions. In which cases might you prefer shuffle() over rebalance()?
The only thing that comes to mind is that rebalance() probably requires some processing time, so in some cases the rebalancing might cost more time than it saves in the subsequent transformations.
I have been looking into this and nobody has talked about it, except for a Flink mailing list thread, but there they don't explain how shuffle() works.
Thanks to Sneftel, who helped me improve my question by asking things that made me rethink what I wanted to ask, and to Till, who answered my question quite well. :D
As the documentation states, shuffle will randomly distribute the data, whereas rebalance will distribute the data in a round-robin fashion. The latter is more efficient, since you don't have to compute a random number. Moreover, depending on the randomness, you might end up with a not-so-uniform distribution.
On the other hand, rebalance will always start sending the first element to the first channel. Thus, if you have only a few elements (fewer elements than subtasks), then only some of the subtasks will receive elements, because you always start by sending the first element to the first subtask. In the streaming case this should not matter in the end, because you usually have an unbounded input stream.
The actual reason why both methods exist is historical: shuffle was introduced first, and rebalance was later introduced to make the batch and streaming APIs more similar.
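For completeness, a minimal usage sketch against the DataStream API version referenced in the question (Flink 1.2-era); shuffle() and rebalance() are the real operators, everything else is illustrative:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Both calls only change how records are routed to the parallel subtasks of
// the next operator; nothing else in the pipeline changes.
public class PartitioningExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Long> source = env.generateSequence(0, 1000);

        source.rebalance().print();   // round-robin over downstream subtasks
        source.shuffle().print();     // uniformly random subtask per record

        env.execute("partitioning sketch");
    }
}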
This statement by Flink is misleading:
Useful for performance optimisation in the presence of data skew.
Since it's used to describe rebalance but not shuffle, it suggests this is the distinguishing factor. My understanding was that if some items are slow to process and some fast, the partitioner would send each item to the next free channel. But this is not the case; compare the code for rebalance and shuffle. rebalance just advances to the next channel regardless of how busy it is.
// rebalance
nextChannelToSendTo = (nextChannelToSendTo + 1) % numberOfChannels;
// shuffle
nextChannelToSendTo = random.nextInt(numberOfChannels);
The statement can also be understood differently: the "load" doesn't mean actual processing time, just the number of items. If your original partitioning has skew (vastly different numbers of items per partition), the operation will assign items to partitions uniformly. However, in that case it applies to both operations.
My conclusion: shuffle and rebalance do the same thing, but rebalance does it slightly more efficiently. The difference is so small that you're unlikely to notice it; java.util.Random can generate roughly 70m random numbers per second in a single thread on my machine.
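(For a sense of scale, a figure in that ballpark can be checked with a trivial single-threaded micro-benchmark along these lines; the exact number varies by machine, and a careful measurement would use a harness such as JMH.)

import java.util.Random;

// Crude throughput check for Random.nextInt; the sink variable keeps the JIT
// from optimizing the loop away.
public class RandomThroughput {
    public static void main(String[] args) {
        Random random = new Random();
        int channels = 8;
        long n = 100_000_000L;
        long sink = 0;
        long start = System.nanoTime();
        for (long i = 0; i < n; i++) {
            sink += random.nextInt(channels);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d calls in %.2fs (%.0f per second), sink=%d%n",
                n, seconds, n / seconds, sink);
    }
}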
I want to model an HTTP interaction, i.e. a sequence of HTTPRequest/HTTPResponse, and I am trying to model this as a transition system.
I defined an ordering on a signature State by using:
open util/ordering[State]
where a State is simply a set of Messages:
sig State {
msgSet: set Message
}
Each pair of (HTTPRequest->HTTPResponse) and (HTTPResponse->HTTPRequest) is represented as a rule in my transition system.
The rules are expressed in Alloy as predicates that let one move from one state to another.
E.g., this is a rule generating an HTTPResponse after a particular HTTPRequest is received:
pred rsp1 [s, s': State] {
one msg: Request, msg':Response | (
// Preconditions (previous Request)
msg.method=get &&
msg.address.url=sample_com &&
// Postconditions (next Response)
msg'.status=OK_200 &&
// previous Request has to be in previous state
msg in s.msgSet &&
// Response generated is added to next state
s'.msgSet = s.msgSet + msg'
)
}
Unfortunately, the model created seems to be too complex: we have a dozen rules (more complex than the one above but following the same pattern) and execution is very slow.
EDIT: In particular, the CNF generation is extremely slow, while the solving takes a reasonable amount of time.
Do you have any suggestion on how to model a similar transition system?
Thank you very much!
This is a model with an impressive level of detail; thank you for sharing it!
None of the various forms of honestAction by itself takes more than two or three minutes to find an instance (or in some cases to fail to find any instance), except for rsp8, which takes quite a while by itself (it ran for fifteen minutes or so before I stopped it).
So the long CNF preparation times you are observing are apparently caused by (a) predicate rsp8 alone, (b) the size of the disjunction in the honestAction predicate, or (c) both.
I suspect but have not proved that the time issue is caused by combinatorial explosion in the number of individuals required to populate a model and the number of constraints in the model.
My first instinct (it's not more than that) would be to cut back on the level of detail in the model, in particular the large number of singleton signatures which instantiate your abstract signatures. These seem (I could be wrong) to be present either for bookkeeping purposes (so you can identify which rule licenses the transition from one state to another), or because the modeler doesn't trust Alloy to generate concrete instances of signatures like UserName, Password, Code, etc.
As the model now is, it looks as if you're doing a lot of work to define all the individuals involved in a particular example, instead of defining constraints and letting Alloy do the work of finding examples. (Using Alloy to check the properties a particular concrete example can be useful, but there are other ways to do that.)
Since so many of the concrete signatures in the model are constrained to singleton cardinality, I don't actually know that defining them makes the task of finding models more complex; for all I know, it makes it simpler. But my instinct is to think that it would be more useful to know (as well as possibly easier for Alloy to establish) that state transitions have a particular property in general, no matter what hosts, users, and URIs are involved, than to know that property rsp1 applies in all the cases where the host is named examplecom and the address URI is example_url_https and whatnot.
I conjecture that reducing the number of individuals whose existence and properties are prescribed, and the constraints on which individuals can be involved in which state transitions, will reduce the CNF generation time.
If your long-term goal is to check long sequences of state transitions to see whether, from a given starting point, it's possible or impossible to arrive at a particular state (or kind of state), you may need to rethink the approach so that shorter sequences of state transitions can do the job.
A second conjecture would involve less restructuring of the model. For reasons I don't think I understand fully, sometimes quantification with one seems to hurt rather than help performance, as in this example, where explicitly quantifying some variables with some instead of one turned out to make a problem tractable instead of intractable.
That question involves quantification in a predicate, not in the model overall, and the quantification with one wasn't intended in the first place, so it may not be relevant here. But we can test the effect of the one keyword on this model in a simple way: I commented out everything in honestAction except rsp8 and ran the predicate first != last in a scope of 8, once with most of the occurrences of one commented out and once with those keywords intact. With the one keywords commented out, the Analyzer ran the problem in 24 seconds or so; with the one keywords in place, it had run for 500 seconds before I decided the point was made and terminated it.
So I'd try removing the keyword one from all of the signatures with instance-specific individuals, leaving it only on get, post, OK_200, etc., and appData. I would also try doing without the various subtypes of Key, SessionID, URL, Host, UserName, and Password, or at least constraining their cardinality in the run command.
N processes share M resource units that can be reserved and released only one at a time. The maximum need of each process does not exceed M, and the sum of all maximum needs is less than M+N. Can a deadlock occur in the system?
I hope you got the answer. Answering this question for other visitors.
The answer is that the deadlock will not occur in the system.
The proof is given in the solution manual at http://alumni.cs.ucr.edu/~choua/school/cs153/Solution%20Manual.pdf, page 31.
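In case that link goes stale, the standard argument in sketch form (writing \(m_i\) for the maximum need of process \(i\) and \(h_i\) for the number of units it currently holds): assume a deadlock, so every process is blocked waiting for at least one more unit, and all M units are allocated (a free unit would satisfy some pending single-unit request). Then

\[
m_i \ge h_i + 1 \quad \text{for every } i, \qquad \sum_{i=1}^{N} h_i = M,
\]
\[
\text{so} \qquad \sum_{i=1}^{N} m_i \;\ge\; \sum_{i=1}^{N} (h_i + 1) \;=\; M + N,
\]

which contradicts the assumption that the sum of all maximum needs is less than M + N. Hence no deadlock can occur.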
The system you are describing looks like semaphores.
About your last question: YES, you "could" always create a deadlock; if you don't see how, ask a young/shameful/motivated/deviant developer.
One good way to make one is to have strange rules for locking and releasing resources. For example, if a process needs M resources to perform a task, it could lock half of them right away and then wait for the other half to become available before doing anything.
I assume it never gives up until it has its M precious resources, and releases them all once the task is done.
A single process wouldn't cause many problems, but several will, as together they will tie up all M resources while each still needs more of them to get out of this frozen state.
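A minimal, intentionally broken sketch of that "grab half, then wait for the rest" pattern using java.util.concurrent.Semaphore (the value of M and the worker logic are made up for illustration):

import java.util.concurrent.Semaphore;

// With M = 4 units and two workers that each need all M, both grab half of the
// units one at a time and then block forever waiting for the other half.
public class HalfAndHalfDeadlock {
    static final int M = 4;
    static final Semaphore units = new Semaphore(M);

    static void worker(String name) {
        try {
            for (int i = 0; i < M / 2; i++) units.acquire();   // lock half right away
            System.out.println(name + " holds half, waiting for the rest...");
            for (int i = 0; i < M / 2; i++) units.acquire();   // deadlocks if the other worker did the same
            System.out.println(name + " got all " + M + " units");
            units.release(M);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        new Thread(() -> worker("A")).start();
        new Thread(() -> worker("B")).start();
    }
}

Note that here the sum of the maximum needs is 2M, which is not less than M + N (with N = 2), so this does not contradict the answer above.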
I'm trying to re-create this conquerclub (Risk-like) game:
http://conquerclub.barrycarter.info/ONEOFF/7460216.html
In other words, I want to know who owned each territory at each point in time, and how many troops they had on that territory. My primary source of information is the Game Log. Notes:
% It's not in the Game Log, but all territories start w/ 3 troops.
% Since we know the territory owners at the end of the game, and the Game Log mentions all owner changes, determining territory owners at any point in time is easy.
% The challenge is to find the number of troops on a territory at a given time.
% The Game Log gives information on troop deployment, reinforcement, and conquest.
% However, the Game Log is incomplete. Suppose territory X attacks territory Y unsuccessfully, but both territories lose troops in the process. The Game Log will not mention this.
% It's probably not possible (in general) to find the exact number of troops on a territory at a given time, so I'm looking for a range.
% I tried feeding the data to Mathematica as a series of inequalities, but as the manual warns, the computation time increases exponentially with the number of inequalities. Even with a fairly small number of inequalities, it hangs. Plus, I'm not convinced Mathematica is the right tool here.
% Any thoughts? Another example is: http://conquerclub.barrycarter.info/ONEOFF/7562013.html
% I know about http://userscripts.org/scripts/show/83035 but that only tracks owners, not number of troops.
You could make use of Prolog's constraint programming (specifically, CLP(FD)). It would require you to encode all the rules in Prolog, which might be a non-trivial task. However, Prolog would then be able to show you all possible valid (legal in terms of the encoded rules) ways of playing such a game, or just show the ranges of possible values.
Also, while CLP(FD) in Prolog is sometimes quite fast, it might be difficult to get it to solve your problem quickly. Most free solvers have many quirks.
Again, I think this is a nontrivial task, and even more so if you haven't programmed in Prolog before. But I am pretty sure this would give you the answers you seek.
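If a full CLP(FD) model turns out to be too heavy, a weaker but much simpler alternative is to propagate a [min, max] range of possible troop counts per territory through the events the Game Log does record; a rough Java sketch, with all class names, method names, and event types invented for illustration:

import java.util.HashMap;
import java.util.Map;

// Keep a [min, max] range of possible troop counts per territory and update it
// for each logged event. Events the log omits (e.g. losses in a failed attack)
// can only widen the range.
public class TroopRangeTracker {
    static class Range {
        int min, max;
        Range(int min, int max) { this.min = min; this.max = max; }
        @Override public String toString() { return "[" + min + ", " + max + "]"; }
    }

    private final Map<String, Range> ranges = new HashMap<>();

    // All territories start with exactly 3 troops.
    void addTerritory(String name) { ranges.put(name, new Range(3, 3)); }

    // A logged deployment/reinforcement of known size just shifts the range.
    void deploy(String territory, int troops) {
        Range r = ranges.get(territory);
        r.min += troops;
        r.max += troops;
    }

    // Unlogged combat losses (e.g. a failed attack): the territory may have lost
    // anywhere from 0 up to some bound, so only the lower end of the range drops.
    void unloggedCombatLosses(String territory, int maxPossibleLoss) {
        Range r = ranges.get(territory);
        r.min = Math.max(1, r.min - maxPossibleLoss);   // a held territory keeps at least 1 troop
    }

    Range rangeOf(String territory) { return ranges.get(territory); }

    public static void main(String[] args) {
        TroopRangeTracker tracker = new TroopRangeTracker();
        tracker.addTerritory("Alaska");
        tracker.deploy("Alaska", 5);                    // logged deployment
        tracker.unloggedCombatLosses("Alaska", 4);      // unknown losses in a failed defence
        System.out.println(tracker.rangeOf("Alaska"));  // prints [4, 8]
    }
}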
I am trying to use the Flex Profiler to improve the application performance (loading time, etc.). I have seen the profiler results for the current design. I want to compare these results with those of a new design for the same set of data. Is there some direct way to do this? I don't know of any way to save the current profiling results and compare them later with the results of a new design.
Otherwise I have to do it manually: write the two sets of results down in a notepad and then compare them.
Thanks in advance.
Your stated goal is to improve aspects of the application performance (loading time, etc.). I have similar issues in other languages (C#, C++, C, etc.). I suggest that you focus not so much on the timing measurements that the Flex profiler gives you, but rather use it to extract a small number of samples of the call stack while it is being slow. Don't deal in summaries, but rather examine those stack samples closely. This may bend your mind a little bit, because it will not give you particularly precise time measurements. What it will tell you is which lines of code you need to focus on to get your speedup, and it will give you a very rough idea of how much speedup you can expect. To get the exact amount of speedup, you can time it afterward. (I just use a stopwatch. If I'm getting the load time down from 2 minutes to 10 seconds, timing it is not a high-tech problem.)
(If you are wondering how/why this works: the program is slower than it is going to be because it is requesting work to be done, mostly via method calls, that you are going to avoid executing so much of. For the amount of time being spent in those method calls, they are sitting exposed on the stack, where you can easily see them. For example, if there is a line of code that is costing you 60% of the time, and you take 5 stack samples, it will appear on 3 samples, plus or minus 1, roughly, regardless of whether it is executed once or a million times. So any such line that shows up on multiple stack samples is a possible target for optimization, and targets for optimization will appear on multiple stack samples if you take enough.
The hard part about this is learning not to be distracted by all the profiling results that are irrelevant. Milliseconds, average or total, for methods, are irrelevant. Invocation counts are irrelevant. "Self time" is irrelevant. The call graph is irrelevant. Some packages worry about recursion - it's irrelevant. CPU-bound vs. I/O bound - irrelevant. What is relevant is the fraction of stack samples that individual lines of code appear on.)
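For the 60% / 5-sample example above, the numbers come straight from a binomial distribution (assuming the samples are taken independently):

\[
\mathbb{E}[k] = np = 5 \times 0.6 = 3, \qquad \sigma = \sqrt{np(1-p)} = \sqrt{5 \times 0.6 \times 0.4} \approx 1.1,
\]

which is the "3 samples, plus or minus 1, roughly" quoted above.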
ADDED: If you do this, you'll notice a "magnification effect". Suppose you have two independent performance problems, A and B, where A costs 50% and B costs 25%. If you fix A, total time drops by 50%, so now B takes 50% of the remaining time and is easier to find. On the other hand, if you happen to fix B first, time drops by 25%, so A is magnified to 67%. Any problem you fix makes the others appear bigger, so you can keep going until you just can't squeeze it any more.