how make a new scheduler in xv6? - xv6

i am new to XV6 so be patient with me :D .
i want to make a new scheduler and it is a mix of two scheduler the multi-level feedback queue (MLFQ) and another one the lottery scheduler.
The basic idea is simple: Build a two-level scheduler which first places jobs into the high-priority queue. When a job uses its time slice on the first queue, move it to the lower-priority queue; jobs on the lower-priority queue should run for two time slices before relinquishing the CPU.When there is more than one job on a queue, each should run in proportion to the number of tickets it has; the more tickets a process has, the more it runs. Each time slice, a randomized lottery determines the winner of the lottery; that winning process is the one that runs for that time slice.
i need a couple of new system calls to implement this scheduler.
The first is int settickets(int num) , which sets the number of tickets of the calling process. By default, each process should get one ticket; calling this routine makes it such that a process can raise the number of tickets it receives, and thus receive a higher proportion of CPU cycles. This routineshould return 0 if successful, and -1 otherwise (if, for example, the user passes in a number less than one).
The second is int getpinfo(struct pstat *) . This routine returns some basic information about each running process, including how many times it has been chosen to run and its process ID, and which queue it is on (high or low).
any help ? any lniks to help .

Related

MPI How to balance workload in an unknown length problem

I have an MPI program which traverses through a graph to solve a problem.
if some rank finds another branch of the graph it will send the task to another random rank. all ranks will wait and receive another task after they complete one.
I have 4 processors, when I see the CPU usage when running the program I will usually see 2-3 processors at max while 1-2 processors idling because the tasks are not split equally among the ranks.
to solve this issue I have to know which rank is not already busy solving some task. so when some rank finds another branch in the graph it will see which rank is free to work on this branch and send it the task.
Q: How can I balance the workload between the ranks.
note: I don't know the length or the size of the graph so I can't split the tasks into ranges for each rank when starting. I have to visit each node on the fly and check if it solves the graph problem. if not I send the next node branches to other ranks.

Intel MPI distributed memory: building a wall out of M*N blocks using q<M processors

Imagine I have M independent jobs, each job has N steps. Jobs are independent from each other but steps of each job should be serial. In other words J(i,j) should be started only after J(i,j-1) is finished (i indicates the job index and j indicates the step). This is isomorphic to building a wall with width of M and hight of N blocks.
Each block of job should be executed only once. The time that it takes to do one block of work using one CPU (also the same order) is different for different blocks and is not known in advance.
The simple way of doing this using MPI is to assign blocks of work to processors and wait until all of them finish their blocks before the next assignment. This way we can make ensure that priorities are enforced, but there will be a lot of waiting time.
Is there a more efficient way of doing this? I mean when a processor finishes its job, using some kind of environmental variables or shared memory, could decide which block of job it should do next, without waiting for other processors to finish their jobs and make a collective decision using communications.
You have M jobs with N steps each. You also have a set of worker processes of size W, somewhere between 2 and M.
If W is close to M, the best you can do is simply assign them 1:1. If one worker finishes early that's fine.
If W is much smaller than M, and N is also fairly large, here is an idea:
Estimate some average or typical time for one step to complete. Call this T. You can adjust this estimate as you go in case you have a very poor estimator at the outset.
Divide your M jobs evenly in number among the workers, and start them. Tell the workers to run as many steps of their assigned jobs as possible before a timeout, say T*N/K. Overrunning the timeout slightly to finish the current job is allowed to ensure forward progress.
Have the workers communicate to each other which steps they completed.
Repeat, dividing the jobs evenly again taking into account how complete each one is (e.g. two 50% complete jobs count the same as one 0% complete job).
The idea is to give all the workers enough time to complete roughly 1/K of the total work each time. If no job takes much more than K*T, this will be quite efficient.
It's up to you to find a reasonable K. Maybe try 10.
Here's an idea, IDK if it's good:
Maintain one shared variable: n = the progress of the farthest-behind task. i.e. the lowest step-number that any of the M tasks has completed. It starts out at 0, because all tasks start at the first step. It stays at 0 until all tasks have completed at least 1 step each.
When a processor finishes a step of a job, check the progress of the step it's currently working on against n. If n < current_job_step - 4, switch tasks because the one we're working on is too far ahead of the farthest-behind one.
I picked 4 to give a balance between too much switching vs. having too much serial work in only a couple tasks. Adjust as necessary, and maybe make it adaptive as you near the end.
Switching tasks without having two threads both grab the same work unit is non-trivial unless you have a scheduler thread that makes all the decisions. If this is on a single shared-memory machine, you could use locking to protect a priority queue.

Single-cycle vs a pipelined approach

I understand that single-cycle programs are not very efficient. One reason is because not all instructions are equal in length, but in a single-cycle program, all instructions are completed in the same length of time.
In pipeline, throughput is increased, which means the time between one output and the next will be shorter than in a single-cycle implementation after you reach a certain point. But then can you say that instructions in a pipelined approach take the same amount of time (going from IF/Instruction Fetch to WB/Writeback)? Or is this the wrong conclusion?
See all instructions in a single cycle non pipelined structure do not necessarily take same amount of time rather the next instruction to be executed after an instruction can not start until the next clock cycle ,current instruction may complete before the current cycle because cycle length is determined by the longest instruction.e.g add register completes before load in a RISC.
Now in a pipelined structure processor is
multistage with register to store and propogate the state of processor.Now basically on pipelined processor we save time by overlapping two instructionss' substages.hence even though individually the length of instruction is increased but overall time has reduced.Now see every instruction may not go through all the stages eg load and add again
So overall latency for each instruction will consist of all the stages but its execution may have had taken less number of cycles
So you can say that latency of each instruction is same but not the execution time or cycles consumed

Variable memory allocation in MPI Code

In a cluster running MPI code, is a copy of all the declared-variables, sent to all nodes , so that all nodes can access it locally, and not perform a remote memory access ?
No, MPI itself can't do this for you in single call.
There is an own state of memory in every MPI process, and every value may be different in any of MPI process.
The only way of sending/receiving data is to use explicit calls of MPI, like Send or Recv. You can pack most of your data into some memory space and send this area of memory to each MPI Process, but this area will not contain 'every declared variable', only variables placed manually into this area.
Update:
Each node runs a copy of the program. Each copy will initialize variables as it want (it can be the same initialization, or individual, based on MPI Process number, called Rank; got from MPI_Comm_Rank function). So every variable exist in N copyes; one set per MPI Process. Every process sees variables, but only the set it owns. Values of variables are unsyncronized automatically.
So, task of programmer is to syncronize values of variables between Nodes (mpi processes).
E.g. here is small MPI program to compute Pi:
http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples/simplempi/cpi_c.htm
It will send value of the 'n' variable from first process to all other (MPI_Bcast); and every process will send its own 'mypi' after calculation into 'pi' variable of first process (with addition of individual values via MPI_Reduce function).
Only first process will be able to read N from user (via scanf) and this code is conditionally executed based on rank of process; other processes must get the N from the first because they didn't read it from user directly.
Update2 (sorry for late answer):
This is syntax of MPI_Bcast. Programmer should give an address of variable into this function. Each of MPI processes will give the address of its own 'n' variable (it can be different). And the MPI_Bcast will
check the rank of current process and compare with other argument, the rank of "Broadcaster".
If the current process is broadcaster, MPI_Bcast will read value, placed in memory at given address (it will read value of the 'n' variable on "Broadcaster"); then the value will be send via network.
Else, if the current process is not a broadcaster, it is an "receiver". MPI_Bcast at receiver will get the value from "Broadcaster" (Using MPI Library internals, via network) and store the value in memory of current process at given address.
So, the address is given to this function because on some nodes the function will write to the variable. Only value is send via network.

Asynchronous vs synchronous execution. What is the difference? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 months ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
What is the difference between asynchronous and synchronous execution?
When you execute something synchronously, you wait for it to finish before moving on to another task. When you execute something asynchronously, you can move on to another task before it finishes.
In the context of operating systems, this corresponds to executing a process or task on a "thread." A thread is a series of commands (a block of code) that exist as a unit of work. The operating system runs a given thread on a processor core. However, a processor core can only execute a single thread at once. It has no concept of running multiple threads simultaneously. The operating system can provide the illusion of running multiple threads at once by running each thread for a small slice of time (such as 1ms), and continuously switching between threads.
Now, if you introduce multiple processor cores into the mix, then threads CAN execute at the same time. The operating system can allocate time to one thread on the first processor core, then allocate the same block of time to another thread on a different processor core. All of this is about allowing the operating system to manage the completion of your task while you can go on in your code and do other things.
Asynchronous programming is a complicated topic because of the semantics of how things tie together when you can do them at the same time. There are numerous articles and books on the subject; have a look!
Synchronous/Asynchronous HAS NOTHING TO DO WITH MULTI-THREADING.
Synchronous or Synchronized means "connected", or "dependent" in some way. In other words, two synchronous tasks must be aware of one another, and one task must execute in some way that is dependent on the other, such as wait to start until the other task has completed.
Asynchronous means they are totally independent and neither one must consider the other in any way, either in the initiation or in execution.
Synchronous (one thread):
1 thread -> |<---A---->||<----B---------->||<------C----->|
Synchronous (multi-threaded):
thread A -> |<---A---->|
\
thread B ------------> ->|<----B---------->|
\
thread C ----------------------------------> ->|<------C----->|
Asynchronous (one thread):
A-Start ------------------------------------------ A-End
| B-Start -----------------------------------------|--- B-End
| | C-Start ------------------- C-End | |
| | | | | |
V V V V V V
1 thread->|<-A-|<--B---|<-C-|-A-|-C-|--A--|-B-|--C-->|---A---->|--B-->|
Asynchronous (multi-Threaded):
thread A -> |<---A---->|
thread B -----> |<----B---------->|
thread C ---------> |<------C--------->|
Start and end points of tasks A, B, C represented by <, > characters.
CPU time slices represented by vertical bars |
Technically, the concept of synchronous/asynchronous really does not have anything to do with threads. Although, in general, it is unusual to find asynchronous tasks running on the same thread, it is possible, (see below for examples) and it is common to find two or more tasks executing synchronously on separate threads... No, the concept of synchronous/asynchronous has to do solely with whether or not a second or subsequent task can be initiated before the other (first) task has completed, or whether it must wait. That is all. What thread (or threads), or processes, or CPUs, or indeed, what hardware, the task[s] are executed on is not relevant. Indeed, to make this point I have edited the graphics to show this.
ASYNCHRONOUS EXAMPLE:
In solving many engineering problems, the software is designed to split up the overall problem into multiple individual tasks and then execute them asynchronously. Inverting a matrix, or a finite element analysis problem, are good examples. In computing, sorting a list is an example. The quicksort routine, for example, splits the list into two lists and performs a quicksort on each of them, calling itself (quicksort) recursively. In both of the above examples, the two tasks can (and often were) executed asynchronously. They do not need to be on separate threads. Even a machine with one CPU and only one thread of execution can be coded to initiate processing of a second task before the first one has completed. The only criterion is that the results of one task are not necessary as inputs to the other task. As long as the start and end times of the tasks overlap, (possible only if the output of neither is needed as inputs to the other), they are being executed asynchronously, no matter how many threads are in use.
SYNCHRONOUS EXAMPLE:
Any process consisting of multiple tasks where the tasks must be executed in sequence, but one must be executed on another machine (Fetch and/or update data, get a stock quote from financial service, etc.). If it's on a separate machine it is on a separate thread, whether synchronous or asynchronous.
In simpler terms:
SYNCHRONOUS
You are in a queue to get a movie ticket. You cannot get one until everybody in front of you gets one, and the same applies to the people queued behind you.
ASYNCHRONOUS
You are in a restaurant with many other people. You order your food. Other people can also order their food, they don't have to wait for your food to be cooked and served to you before they can order.
In the kitchen restaurant workers are continuously cooking, serving, and taking orders.
People will get their food served as soon as it is cooked.
Simple Explanation via analogy
(story & pics given to help you remember).
Synchronous Execution
My boss is a busy man. He tells me to write code. I tell him: Fine. I get started and he's watching me like a vulture, standing behind me, off my shoulder. I'm like "Dude, WTF: why don't you go and do something while I finish this?"
he's like: "No, I'm waiting right here until you finish." This is synchronous.
Asynchronous Execution
The boss tells me to do it, and rather than waiting right there for my work, the boss goes off and does other tasks. When I finish my job I simply report to my boss and say: "I'm DONE!" This is Asynchronous Execution.
(Take my advice: NEVER work with the boss behind you.)
Synchronous execution means the execution happens in a single series. A->B->C->D. If you are calling those routines, A will run, then finish, then B will start, then finish, then C will start, etc.
With Asynchronous execution, you begin a routine, and let it run in the background while you start your next, then at some point, say "wait for this to finish". It's more like:
Start A->B->C->D->Wait for A to finish
The advantage is that you can execute B, C, and or D while A is still running (in the background, on a separate thread), so you can take better advantage of your resources and have fewer "hangs" or "waits".
In a nutshell, synchronization refers to two or more processes' start and end points, NOT their executions. In this example, Process A's endpoint is synchronized with Process B's start point:
SYNCHRONOUS
|--------A--------|
|--------B--------|
Asynchronous processes, on the other hand, do not have their start and endpoints synchronized:
ASYNCHRONOUS
|--------A--------|
|--------B--------|
Where Process A overlaps Process B, they're running concurrently or synchronously (dictionary definition), hence the confusion.
UPDATE: Charles Bretana improved his answer, so this answer is now just a simple (potentially oversimplified) mnemonic.
Synchronous means that the caller waits for the response or completion, asynchronous that the caller continues and a response comes later (if applicable).
As an example:
static void Main(string[] args)
{
Console.WriteLine("Before call");
doSomething();
Console.WriteLine("After call");
}
private static void doSomething()
{
Console.WriteLine("In call");
}
This will always ouput:
Before call
In call
After call
But if we were to make doSomething() asynchronous (multiple ways to do it), then the output could become:
Before call
After call
In call
Because the method making the asynchronous call would immediately continue with the next line of code. I say "could", because order of execution can't be guaranteed with asynch operations. It could also execute as the original, depending on thread timings, etc.
Sync vs Async
Sync and async operations are about execution order a next task in relation to the current task.
Let's take a look at example where Task 2 is current task and Task 3 is a next task. Task is an atomic operation - method call in a stack (method frame).
Synchronous
Implies that tasks will be executed one by one. A next task is started only after current task is finished. Task 3 is not started until Task 2 is finished.
Single Thread + Sync - Sequential
Usual execution.
Pseudocode:
main() {
task1()
task2()
task3()
}
Multi Thread + Sync - Parallel
Blocked.
Blocked means that a thread is just waiting(although it could do something useful. e.g. Java ExecutorService[About] and Future[About]) Pseudocode:
main() {
task1()
Future future = ExecutorService.submit(task2())
future.get() //<- blocked operation
task3()
}
Asynchronous
Implies that task returns control immediately with a promise to execute a code and notify about result later(e.g. callback, feature). Task 3 is executed even if Task 2 is not finished. async callback, completion handler[About]
Single Thread + Async - Concurrent
Callback Queue (Message Queue) and Event Loop (Run Loop, Looper) are used. Event Loop checks if Thread Stack is empty and if it is true it pushes first item from the Callback Queue into Thread Stack and repeats these steps again. Simple examples are button click, post event...
Pseudocode:
main() {
task1()
ThreadMain.handler.post(task2());
task3()
}
Multi Thread + Async - Concurrent and Parallel
Non-blocking.
For example when you need to make some calculations on another thread without blocking. Pseudocode:
main() {
task1()
new Thread(task2()).start();
//or
Future future = ExecutorService.submit(task2())
task3()
}
You are able use result of Task 2 using a blocking method get() or using async callback through a loop.
For example in Mobile world where we have UI/main thread and we need to download something we have several options:
sync block - block UI thread and wait when downloading is done. UI is not responsive.
async callback - create a new tread with a async callback to update UI(is not possible to access UI from non UI thread). Callback hell.
async coroutine[About] - async task with sync syntax. It allows mix downloading task (suspend function) with UI task.
[iOS sync/async], [Android sync/async]
[Paralel vs Concurrent]
I think this is bit round-about explanation but still it clarifies using real life example.
Small Example:
Let's say playing an audio involves three steps:
Getting the compressed song from harddisk
Decompress the audio.
Play the uncompressed audio.
If your audio player does step 1,2,3 sequentially for every song then it is synchronous. You will have to wait for some time to hear the song till the song actually gets fetched and decompressed.
If your audio player does step 1,2,3 independent of each other, then it is asynchronous. ie.
While playing audio 1 ( step 3), if it fetches audio 3 from harddisk in parallel (step 1) and it decompresses the audio 2 in parallel. (step 2 )
You will end up in hearing the song without waiting much for fetch and decompress.
I created a gif for explain this, hope to be helpful:
look, line 3 is asynchronous and others are synchronous.
all lines before line 3 should wait until before line finish its work, but because of line 3 is asynchronous, next line (line 4), don't wait for line 3, but line 5 should wait for line 4 to finish its work, and line 6 should wait for line 5 and 7 for 6, because line 4,5,6,7 are not asynchronous.
Simply said asynchronous execution is doing stuff in the background.
For example if you want to download a file from the internet you might use a synchronous function to do that but it will block your thread until the file finished downloading. This can make your application unresponsive to any user input.
Instead you could download the file in the background using asynchronous method. In this case the download function returns immediately and program execution continues normally. All the download operations are done in the background and your program will be notified when it's finished.
As a really simple example,
SYNCHRONOUS
Imagine 3 school students instructed to run a relay race on a road.
1st student runs her given distance, stops and passes the baton to the 2nd. No one else has started to run.
1------>
2.
3.
When the 2nd student retrieves the baton, she starts to run her given distance.
1.
2------>
3.
The 2nd student got her shoelace untied. Now she has stopped and tying up again. Because of this, 2nd's end time has got extended and the 3rd's starting time has got delayed.
1.
--2.--->
3.
This pattern continues on till the 3rd retrieves the baton from 2nd and finishes the race.
ASYNCHRONOUS
Just Imagine 10 random people walking on the same road.
They're not on a queue of course, just randomly walking on different places on the road in different paces.
2nd person's shoelace got untied. She stopped to get it tied up again.
But nobody is waiting for her to get it tied up. Everyone else is still walking the same way they did before, in that same pace of theirs.
10--> 9-->
8--> 7--> 6-->
5--> 4-->
1--> 2. 3-->
Synchronous basically means that you can only execute one thing at a time. Asynchronous means that you can execute multiple things at a time and you don't have to finish executing the current thing in order to move on to next one.
When executing a sequence like: a>b>c>d>, if we get a failure in the middle of execution like:
a
b
c
fail
Then we re-start from the beginning:
a
b
c
d
this is synchronous
If, however, we have the same sequence to execute: a>b>c>d>, and we have a failure in the middle:
a
b
c
fail
...but instead of restarting from the beginning, we re-start from the point of failure:
c
d
...this is know as asynchronous.
An example of instructions for making a breakfast:
Pour a cup of coffee.
Heat a pan, then fry two eggs.
Fry three slices of bacon.
Toast two pieces of bread.
Add butter and jam to the toast.
Pour a glass of orange juice.
If you have experience with cooking, you'd execute those instructions asynchronously. You'd start warming the pan for eggs, then start the bacon. You'd put the bread in the toaster, then start the eggs. At each step of the process, you'd start a task, then turn your attention to tasks that are ready for your attention.
Cooking breakfast is a good example of asynchronous work that isn't parallel. One person (or thread) can handle all these tasks. Continuing the breakfast analogy, one person can make breakfast asynchronously by starting the next task before the first task completes. The cooking progresses whether or not someone is watching it. As soon as you start warming the pan for the eggs, you can begin frying the bacon. Once the bacon starts, you can put the bread into the toaster.
For a parallel algorithm, you'd need multiple cooks (or threads). One would make the eggs, one the bacon, and so on. Each one would be focused on just that one task. Each cook (or thread) would be blocked synchronously waiting for the bacon to be ready to flip, or the toast to pop.
(emphasis mine)
From Asynchronous programming concepts
A synchronous operation does its work before returning to the caller.
An asynchronous operation does (most or all of) its work after returning to the caller.
You are confusing Synchronous with Parallel vs Series. Synchronous mean all at the same time. Syncronized means related to each othere which can mean in series or at a fixed interval. While the program is doing all, it it running in series. Get a dictionary...this is why we have unsweet tea. You have tea or sweetened tea.
A different english definition of Synchronize is Here
Coordinate; combine.
I think that is a better definition than of "Happening at the same time". That one is also a definition, but I don't think it is the one that fits the way it is used in Computer Science.
So an asynchronous task is not co-coordinated with other tasks, whereas a synchronous task IS co-coordinated with other tasks, so one finishes before another starts.
How that is achieved is a different question.
I think a good way to think of it is a classic running Relay Race
Synchronous: Processes like members of the same team, they won't execute until they receive baton (end of the execution of previous process/runner) and yet they are all acting in sync with each other.
Asynchronous: Where processes like members of different teams on the same relay race track, they will run and stop, async with each other, but within same race (overall program execution).
Does it make sense?
Synchronous means queue way execution one by one task will be executed. Suppose there is only vehicle that need to be share among friend to reach their destination one by one vehicle will be share.
In asynchronous case each friend can get rented vehicle and reach its destination.
In regards to the "at the same time" definition of synchronous execution (which is sometimes confusing), here's a good way to understand it:
Synchronous Execution: All tasks within a block of code are all executed at the same time.
Asynchronous Execution: All tasks within a block of code are not all executed at the same time.
Yes synchronous means at the same time, literally, it means doing work all together. multiple human/objects in the world can do multiple things at the same time but if we look at computer, it says synchronous means where the processes work together that means the processes are dependent on the return of one another and that's why they get executed one after another in proper sequence. Whereas asynchronous means where processes don't work together, they may work at the same time(if are on multithread), but work independently.

Resources