How do I determine whether a deadlock will occur in this system? - deadlock

N processes share M resource units that can be reserved and release only one at a time. The maximum need of each process does not exceed M, and the sum of all maximum needs is less than M+N. Can a deadlock occur in the system ?

I hope you got the answer. Answering this question for other visitors.
The answer is that the deadlock will not occur in the system.
The proof is given in the image below.
The image was taken from http://alumni.cs.ucr.edu/~choua/school/cs153/Solution%20Manual.pdf on page 31

the system you are describing looks like semaphores
about your last question : YES. You "could" always do a deadlock ; if you don't see how, ask a young/shameful/motivated/deviant developer.
One good way to make a good one ; is to have strange locking/releasing resources rules. For example, if a process needs M resources to perform a task, he could locks half of them right away, and then waits for the other half to be available before doing anything.
I assume he never gives up until he have its M precious resources and releases them all once the task done.
A single process wouldn't cause much problems but several will as they will lock more than M total resources and will need more of them to get out this frozen state.

Related

Why the time-out values are so small?

I'm not sure if this is a pure stackoverflow relevant question. It is related to general design practice. Since I cannot think of another relevant stack exchange site, posting it here.
In the general design practice of converting an async call to sync one, we use a time-out and wait for the results. While, this may not exactly a good practice from the point of view of responsiveness, it definitely makes the implementation easier.
I have seen many such implementations and often noticed that the developers tend to give a very small time-out value. I can understand that the people may have the need of a responsive system in mind when they did this. But many of these applications I have seen are very data critical ones where the loss of data is very bad. So, it is always better to wait more and try to get as much data instead of timing out early and giving an error message to the user. Now, the situations where the server failing to give data or the client unable to reach server etc are rare. In those situations, I expect the a large time-out for such waits. After all, these time-outs don't mean that the wait will definitely last until the given time-out value; the timeout value is only an upper limit. So, I have always arguing for higher values here. But I see the use of low values in more and more places and now I'm getting confused if really there is something else in this practice that I don't understand.
So, my question is : Are there any arguments, other than the need for responsiveness to implement a very small time-out for waiting?
As always, the right decision depends on the real-life data.
The timeout should be proportional to the time it usually takes to complete an operation successfully.
Sending a UDP message for example could take between 1 - 50 milliseconds so a timeout of 100 milliseconds is more than reasonable however copying a file over the wire could take minutes or more so a 100 millisecond timeout is laughable.
There are pros and cons to both short and long timeouts so it's a tradeoff. Longer timeouts use more resources (tasks, threads, memory, etc.) for the same amount of work while short timeouts, as you mentioned, may result in loss of data.
In conclusion, you need to set a configurable timeout that sounds reasonable and then figure out whether you timeout too many operations in production or the other way around and calibrate accordingly.

MS Project single resource assigned to multiple concurrent activities

In Microsoft Project (MSP), is it possible to have two parallel/concurrent activities with the same resource assigned to both? For example:
The worker resource "matt" is assigned to both activities, hence the overallocation error showing in the info column (red human icon).
Why do I require such a structure you may ask? In reality, one task would continually be temporarily interrupted to do the other one, and vice versa. But in MSP, it would be counter-productive to do so. Hence, my simplified approach - parallel activities.
So how should I fix this issue?
Ignore it?
Allocate 50% of the resource to each (and fix up finish dates which are automatically extended by MSP?)
Or is there a better solution?
if they indeed overlap then the best way to put this in your project is to allocate 50% on each task. But why fix up finish date, the program will recalculate the new duration of the task based on the new work, while the allocation is lowered to 50% then the duration should double ... Or you can declare the task as duration driven if that is the case, for such tasks duration will stay as set though allocation is changed
I do not think there is a better solution. So solution number 2 that is adjusting assignment units is the best in your case.

Preventing Deadlocks

for a pseudo function like
void transaction(Account from, Account to, double amount){
Semaphore lock1, lock2;
lock1 = getLock(from);
lock2 = getLock(to)
wait(lock1);
wait(lock2);
withdraw(from, amount);
deposit(to, amount);
signal(lock2);
signal(lock1);
}
deadlock happens if you run transaction(A,B,50) transaction(B,A,10)
how can this be prevented?
would this work?
A simple deadlock prevention strategy when handling locks is to have strict order on the locks in the application and always grab the locks according to this order. Assuming all accounts have a number, you could change your logic to always grab the lock for the account with the lowest account number first. Then grab the lock for the one with the highest number.
Another strategy for preventing deadlocks is to reduce the number of locks. In this case it might be better to have one lock that locks all accounts. It would definitely make the lock structure far more simple. If the application shows performance problems under heavy load and profiling shows that lock congestion is the problem - then it is time to invent a more fine grained locking strategy.
By making the entire transaction a critical section? That's only one possible solution, at least.
I have a feeling this is homework of some sort, because it's very similar to the dining philosophers problem based on the example code you give. (Multiple solutions to the problem are available at the link provided, just so you know. Check them out if you want a better understanding of the concepts.)

Can you sacrifice performance to get concurrency in Sqlite on a NFS?

I need to write a client/server app stored on a network file system. I am quite aware that this is a no-no, but was wondering if I could sacrifice performance (Hermes: "And this time I mean really slash.") to prevent data corruption.
I'm thinking something along the lines of:
Create a separate file in the system everytime a write is called (I'm willing do it for every connection if necessary)
Store the file name as the current millisecond timestamp
Check to see if the file with that time or earlier exists
If the same one exists wait a random time between 0 to 10 ms, and try again.
While file is the earliest timestamp, do work, delete file lock, otherwise wait 10ms and try again.
If a file persists for more than a minute, log as an error, stop until it is determined that the data is not corrupted by a person.
The problem I see is trying to maintain the previous state if something locks up. Or choosing to ignore it, if the state change was actually successful.
Is there a better way of doing this, that doesn't involve not doing it this way? Or has anyone written one of these with a lot less problems than the Sqlite FAQ warns about? Will these mitigations even factor in to preventing data corruption?
A couple of notes:
This must exist on an NSF, the why is not important because it is not my decision to make (it doesn't look like I was clear enough on that point).
The number of readers/writers on the system will be between 5 and 10 all reading and writing at the same time, but rarely on the same record.
There will only be clients and a shared memory space, there is no way to put a server on there, or use a server based RDMS, if there was, obviously I would do it in a New York minute.
The amount of data will initially start off at about 70 MB (plain text, uncompressed), it will grown continuous from there at a reasonable, but not tremendous rate.
I will accept an answer of "No, you can't gain reasonably guaranteed concurrency on an NFS by sacrificing performance" if it contains a detailed and reasonable explanation of why.
Yes, there is a better way. Don't use NFS to do this.
If you are willing to create a new file every time something changes, I expect that you have a small amount of data and/or very infrequent changes. If the data is small, why use SQLite at all? Why not just have files with node names and timestamps?
I think it would help if you described the real problem you are trying to solve a bit more. For example if you have many readers and one writer, there are other approaches.
What do you mean by "concurrency"? Do you actually mean "multiple readers/multiple writers", or can you get by with "multiple readers/one writer with limited latency"?

Are all scheduling problems NP-Hard?

I know there are some scheduling problems out there that are NP-hard/NP-complete ... however, none of them are stated in such a way to show this situation is also NP.
If you have a set of tasks constrained to a startAfter, startBy, and duration all trying to use a single resource ... can you resolve a schedule or identify that it cannot be resolved without an exhaustive search?
If the answer is "sorry pal, but this is NP-complete" what would be the best heuristic(s?) to use and are there ways to decrease the time it takes to a) resolve a schedule and b) to identify an unresolvable schedule.
I've implemented (in prolog) a basic conflict resolution goal through recursion that implements a "smallest window first" heuristic. This actually finds solutions rather quickly, but is exceptionally slow at finding invalid schedules. Is there a way to overcome this?
Yay for compound questions!
The hardest part of most scheduling problems in real life is getting hold of a reliability and complete set of constraints. If we take the example of creating a university timetable:
Professor A will not get up in the morning, he is on a lot of committees, but no-one will tell the timetable office about this sort of constraint
Department 1 needs the timetable by the start of term, however, Department 2 that uses the same rooms is unwilling to decide on the courses that will be run until after all the students have arrived
Etc
Then you need a schedule system that can cope with changes, so when one constraint is changed at the last minute you don’t have to change the complete timetable.
All of the above is normally ignored in research papers about scheduling systems. As to NP completeness of a given scheduling problem, in real life you don’t care as even if it is not NP complete you are unlikely to even be able to define what the “best solution” is, so good enough is good enough.
See http://www.asap.cs.nott.ac.uk/watt/resources/university.html for a list of papers that may help get you started; there are still many PHDs to be had in scheduling software.
There are often good approximation algorithms for NP-hard/complete optimization problems like scheduling. You might skim the course notes by Ahmed Abu Safia on Approximation Algorithms for scheduling or various papers.
In a sense, all public key cryptography is done with "less hard" problems like factoring partially because NP-hard problems offer up too many easy cases. It's the same NP-completeness that makes them "morally hard" which also gives them too many easy problems, which often fall within some error bound of optimal.
There is a deeper theory of hardness of approximation that discusses the limitations of approximation algorithms though.
You can use dynamic programming to solve some of these things. Greedy algorithms also come to mind. Scheduling theory is both deep and beautiful but those two I find will solve most of the problems I've faced. Perhaps I've been lucky.
What do you mean with startBy?
With startAfter and if there is only one resource, then a fast solution could be to use topological sorting. The example algorithm runs in linear time, but does not include the error case if the graph contains cycles.
Here's one that isn't.
Schedule a set of jobs i= 1,2...n on a single machine which each take time t(i) so that the average waiting time is minimized.
Solution: Sort in increasing order of t(i). O(n log n)
Good list here
Consider the scheduling problem that is in the class P:
Input: list of activities which include the start time and finish time.
Sort by finish time.
Select the first N elements of this sorted list to find the maximum amount of activities you can schedule in a given time.
You can add caveats like: all activities must end at 5pm, well in this case as you work through the list, stop once you reach an activity which ends after this time.

Resources