Confusion about raft algorithm

Confusion about raft algorithm - raft

In the paper 《 In Search of an Understandable Consensus Algorithm 》, Figure 8 shows a problem in (d) and (e) that some old logs may be overwritten and never come back.
In section 5.4.2, it says “To eliminate problems like the one in Figure 8, Raft never commits log entries from previous terms by counting replicas. Only log entries from the leader’s current term are committed by counting replicas; once an entry from the current term has been committed in this way, then all prior entries are committed indirectly because of the Log Matching Property”.
I'm confused about that part, how does it works in Figure 8? What will happen and what will not?

By adding the rule to figure 8.
Raft never commits log entries from previous terms by counting replicas.
So now we never commits log entries from previous terms, let see what will happen again at figure 8. I modified figure 8 to show the situation after apply the rule.
(a) and (b) works the same.
Start from (c), log entry at index 2 is append at term 2 since step (a), where I draw a yellow circle. So it is from previous terms. Thus the leader will not replicate that entry (the yellow 2 with my black cross) according the rule. It must start replicate from entry at index 3.
This rule also mentioned at Figure 2 "Rules for Servers" leader's rule 3.1:
Send AppendEntries RPC with entries startting at nextIndex.
The nextIndex is initialized with last log index + 1, so it should start at log index 3 at (c), not index 2.
So for the hypothetical procedure at original (c), it is impossible to append log 2 to majority before log 3（the pink one appended at term 4） replicate at majority. and (d) will not happen.
UPDATE: 2020-12-04
#coderx and #OrlandoL have comments discussed about the (a), (b) and how S5 can't be a leader. Their discuss makes this answer more complete, So I put a reference here.
Basically, (a), (b) is not a must-happen condition, there are cases that S5 won't elected leader, such as S3,S4 have same chance to become leader. (please see the comments for detail)
These assumption is correct that S5 may not become a leader and the following procedure won't happen.
But let's go back to the paper Figure 8 and read the annotation of the figure:
A time sequence showing why a leader cannot determine commitment using log entries from older terms. In
(a) S1 is leader and partially replicates the log entry at index 2.
In (b) S1 crashes; S5 is elected leader for term 3 with votes
from S3, S4, and itself, and accepts a different entry at log
index 2.
IMO, the author is talking about the case that S5 is elected leader. Thus the whole procedure makes scene.
As #OrlandoL mentioned, In a MIT 6.824 Lab, you should consider all conditions to have a correct Raft implementation.
Hope this helps.

Raft doesn't commit entries from previous term because these entries of previous term could be overwritten by future leader just like leader S5 in (d).
Suppose leader S1 in (c) committed entry at index 2 of term 2, then that entry will be applied by S1, S2 and S3. Then S1 crashed, it's totally possible for S5 to become leader like in (d) 'cause its log is more up-to-date than S2, S3 and S4. S5 would overwrite entry of term 2 at index 2 with its own entry of term 3. This means leader S5 overwrites a committed entry! Some servers(S1, S2 and S3) have applied the entry of term 2, others(S4, S5) would apply entry of term 3 at index 2, which violates the State Machine Safety in figure 3.
So leader S1 of term 4 in (c) cannot commit entry of term 2 at index 2 unless it commit an entry of its own term like entry of term 4 at index 3 like in (d). Once entry at index 3 of term 4 is committed, entry at index2 of term 2 is auto-committed and they will never be overwritten by future leaders. (A candidate can become leader only if it has all committed entries from previous term.)

Related

Google code jam 2016: Round 1A, BFF

Question :
You are a teacher at the brand new Little Coders kindergarten. You have N kids in your class, and each one has a different student ID number from 1 through N. Every kid in your class has a single best friend forever (BFF), and you know who that BFF is for each kid. BFFs are not necessarily reciprocal -- that is, B being A's BFF does not imply that A is B's BFF.
Your lesson plan for tomorrow includes an activity in which the participants must sit in a circle. You want to make the activity as successful as possible by building the largest possible circle of kids such that each kid in the circle is sitting directly next to their BFF, either to the left or to the right. Any kids not in the circle will watch the activity without participating.
What is the greatest number of kids that can be in the circle?
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each test case consists of two lines. The first line of a test case contains a single integer N, the total number of kids in the class. The second line of a test case contains N integers F1, F2, ..., FN, where Fi is the student ID number of the BFF of the kid with student ID i.
Output
For each test case, output one line containing "Case #x: y", where x is the test case number (starting from 1) and y is the maximum number of kids in the group that can be arranged in a circle such that each kid in the circle is sitting next to his or her BFF.
My problem : There is the contest analysis on the code jam site, but I don't understand it. Where is the optimization happening? If someone can explain this problem and its solution in a detailed manner, it will be very helpful.
Edit : I am not adding any pseudo-code, because I want to better my understanding of the problem, and it's not a coding issue.

More info needed on number of nodes generated by Breadth First Search

I am new to AI and was going through Peter Norvig book. I've looked into this question already What is the number of nodes generated by breadth-first search?.
It says that if we apply goal test to each node when it is selected for expansion then we have nodes = 1 + b + b^2 + b^3 + ... + b^d + (b^(d+1) - b)
But what if my goal state is a leaf node at the final depth. So there is no depth at all after the goal. Then how can b^(d+1) evaluate?. eg: in a tree with max depth 3, if my goal lies at depth 3, then how would I evaluate b^(3+1) when there is no 4th level at all?. Please clear my doubt. Thanks in advance!

Note that the answer you linked mentioned that that is the amount of nodes that will be generated in the worst case.
Generated means that not all of those nodes are tested to see if they are the goal; they're simply generated and stored so that they can eventually be compared to the goal in case the goal is not found yet.
Worst case has two important implications. Try to visualize the Breadth-First Search going from left to right, then down one level, then left to right again, then down, etc. With worst case we assume that, on whatever depth level d the goal is located, the goal is the very last (rightmost) node. This means that all nodes to the left of it are compared to the goal node, and any successors/children of them are generated as well.
Now, I know that you said that in your case there are no nodes at a depth level below d, but the second implication of saying worst case is that we do assume there are basically infinitely many depth levels.
Indeed, for your case that equation is not entirely correct, but this is simply because you don't have the worst case. In your case, the search process would indeed not have to generate the last (b^(d+1) - b) nodes of the equation.
A final note on the terminology you used: you asked how b^(d+1) (for example, b^(3+1) can be evaluated if there is no depth level below d = 3. There is still no problem to mathematically evaluate that term. Even in your case there is no depth level 4, we can still mathematically evaluate the term b^(3+1). In your case it would not make sense to do so, because it is not correct, but we can still evaluate the term just fine.

Error in MPI broadcast

Sorry for the long post. I did read some other MPI broadcast related errors but I couldn't
find out why my program is failing.
I am new to MPI and I am facing this problem. First I will explain what I am trying to do:
My declarations :
ROWTAG 400
COLUMNTAG 800
Create a 2 X 2 Cartesian topology.
Rank 0 has the whole matrix. It wants to dissipate parts of matrix to all the processes in the 2 X 2 Cartesian topology. For now, instead
of matrix I am just dealing with integers. So for process P(i,j) in 2 X 2 Cartesian topology, (i - row , j - column), I want it to receive
(ROWTAG + i ) in one message and (COLUMNTAG + j) in another message.
My strategy to do so is:
Processes: P(0,0) , P(0,1), P(1,0), P(1,1)
P(0,0) has all the initial data.
P(0,0) sends (ROWTAG+1) (in this case 401) to P(1,0) - In essense P(1,0) is responsible for dissipating information related to row 1 for all the processes in Row 1 - I just used a blocking send
P(0,0) sends (COLUMNTAG+1) (in this case 801) to P(0,1) - In essense P(0,1) is responsible for dissipating information related to column 1 for all the processes in Column 1 - Used a blocking send
For each process, I made a row_group containing all the processes in that row and out of this created a row_comm (communicator object)
For each process, I made a col_group containing all the processes in that column and out of this created a col_comm (communicator object)
At this point, P(0,0) has given information related to row 'i' to Process P(i,0) and P(0,0) has given information related to column 'j' to
P(0,j). I call P(i,0) and P(0,j) as row_head and col_head respectively.
For Process P(i,j) , P(i,0) gives information related to row i, and P(0,j) gives information related to column j.
I used a broad cast call:
MPI_Bcast(&row_data,1,MPI_INT,row_head,row_comm)
MPI_Bcast(&col_data,1,MPI_INT,col_head,col_comm)
Please find my code here: http://pastebin.com/NpqRWaWN
Here is the error I see:
* An error occurred in MPI_Bcast
on communicator MPI COMMUNICATOR 5 CREATE FROM 3
MPI_ERR_ROOT: invalid root
* MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
Also please let me know if there is any better way to distribute the matrix data.

There are several errors in your program. First, row_Ranks is declared with one element less and when writing to it, you possibly overwrite other stack variables:
int col_Ranks[SIZE], row_Ranks[SIZE-1];
// ^^^^^^
On my test system the program just hangs because of that.
Second, you create new subcommunicators out of matrixComm but you use rank numbers from the latter to address processes in the former when performing the broadcast. That doesn't work. For example, in a 2x2 Cartesian communicator ranks range from 0 to 3. In any column- or row-wise subgroup there are only two processes with ranks 0 and 1 - there is neither rank 2 nor rank 3. If you take a look at the value of row_head across the ranks, it is 2 in two of them, hence the error.
For a much better way to distribute the data, you should refer to this extremely informative answer.

How does is_integer(X) procedure work?

I read in Mellish, Clocksin book about Prolog and got to this:
is_integer(0).
is_integer(X) :- is_integer(Y), X is Y + 1.
with the query ?- is_integer(X). the zero output is easy but how does it get 1, 2, 3, 4...
I know it is not easy to explain writing only but I will appreciate any attempt.
After the 1-st result X=0 I hit ; then the query becomes is_integer(0) or is still is_integer(X)?
It's long time I search for a good explanation to this issue. Thanks in advance.

This strikes to the heart of what makes Prolog so interesting and difficult. You're definitely not stupid, it's just extremely different.
There are two rules here. The existence of alternatives causes choice points to be created. You can think of the choice point as a moment when Prolog saw an alternate way of proceeding with the computation. Prolog always tries rules in the order they appear in the database, which will correspond to the order they appear in the source file. So when you issue the query is_integer(X), the first rule matches and unifies X with 0. This is your first result.
When you press ';' you are telling Prolog to fail, that this answer is not acceptable, which triggers backtracking. The only thing for Prolog to do is try entering the other rule, which begins is_integer(Y). Y is a new variable; it may or may not wind up instantiated to the same value as X, so far you haven't seen any reason why that wouldn't be the case.
This call, is_integer(Y) essentially duplicates the computation that's been attempted so far. It will enter the first rule, is_integer(0) and try that. This rule will succeed, Y will be unified with 0 and then X will be unified with Y+1, and the rule will exit having unified X with 1.
When you press ';' again, Prolog will back up to the nearest choice point. This time, the nearest choice point is the call is_integer(Y) within the second rule for is_integer/1. So the depth of the call stack is greater, but we haven't left the second rule. Indeed, each subsequent result will be had by backtracking from the first to the second rule at this location in the previous location's activation of the second rule. I doubt very seriously a verbal explanation like the preceeding is going to help, so please look at this trashy ASCII art of how the call tree is evolving like this:
1 2 2
/ \
1 2
/
1
^ ^ ^
| | |
0 | |
1+0 |
1+(1+0)
where the numbers are indicating which rule is activated and the level is indicating the depth of the call stack. The next several steps will evolve like this:
2 2
\ \
2 2
\ \
2 2
/ \
1 2
/
1
^ ^
| |
1+(1+(1+0)) |
= 3 1+(1+(1+(1+0)))
= 4
Notice that we always produce a value by increasing the stack depth by 1 and reaching the first rule.

The answer of Daniel is very good, I just want to offer another way to look at it.
Take this trivial Prolog definition of natural numbers based on TNT (so 0 is 0, 1 is s(0), 2 is s(s(0)) etc):
n(0). % (1)
n(s(N)) :- n(N). % (2)
The declarative meaning is very clear. (1) says that 0 is a number. (2) says that s(N) is a number if N is a number. When called with a free variable:
?- n(X).
it gives you the expected X = 0 (from (1)), then looks at (2), and goes into a "new" invocation of n/1. In this new invocation, (1) succeeds, the recursive call to n/1 succeeds, and (2) succeeds with X = s(0). Then it looks at (2) of the new invocation, and so on, and so on.
This works by unification in the head of the second clause. Nothing stops you, however, from saying:
n_(0).
n_(S) :- n_(N), S = s(N).
This simply delays the unification of S with s(N) until after n_(N) is evaluated. As nothing happens between evaluating n_(N) and the unification, the result, when called with a free variable, is identical.
Do you see how this is isomorphic to your is_integer/1 predicate?
A word of warning. As pointed out in the comments, this query:
?- n_(0).
as well as the corresponding
?- is_integer(0).
have the annoying property of not terminating (you can call them with any natural number, not only 0). This is because after the first clause has been reached recursively, and the call succeeds, the second clause still gets evaluated. At that point you are "past" the end-of-recursion of the first clause.
n/1 defined above does not suffer from this, as Prolog can recognize by looking at the two clause heads that only one of them can succeed (in other words, the two clauses are mutually exclusive).

I attempted to put into a graphic #daniel's great answer. I found his answer enlightening and could not have figured out what was going on here without his help. I hope that this image helps someone the way that #daniel's answer helped me!

Depth First Search

Perform Depth-first Search on the graph shown starting with vertex a. When you traverse the neighbours, process them in alphabetical order.
The question is to find the DFI, Level and the Parent of each vertex.
Here is a picture of it:
I'm unsure of how to get going with this, it is a practice question for an upcoming exam. I know for depth first search, it uses a stack and it will start at vertex a and go in alphabetical order in the stack but i'm not sure how I would get the values for each of the columns. Can someone explain further or help me with this?

So you start at 'a' and must traverse the nodes in alphabetical order so from a you either have the option of going to b or g so you choose b because it is first alphabetically. from b your only choice is g and so on....
now for your values. the parent of a is null since you have no previous nodes the parent of b is a and the parent of g is b and so on.
the dfs level is the level that it would end up on a tree. so imagine that you do your traversal then erase all lines that weren't part of the traversal. and then you take your root and 'shake it out' what i mean is you rearrange it so that it looks like a tree. (this particular graph is very uninteresting) and then you assign levels based on that tree.
And the dfs index is simply the order in which you touched the nodes.
The folowing are for your graph but using g as a starting point....I think it makes it slightly more intersting
the numbers are the order in which the edges were taken.
Here is what i was talking about when i said 'shake it out' this is what your tree looks like and in blue i show the level of each node(0 based). I hope the images make it a little more understandable.
the one i drew( the terrible free hand one) was formed by deleting all of the edges that weren't used and then rearranging them to look like a tree.
You can think of the depth as how many steps did i have to take from the root to get to the current node. so from g to b is 1 step so depth of 1 from g to i 3 because we go from g->c->d->i 3 steps. after you have made your traversal you ignore the fact that you can in fact get from g to i in two steps(g->h->i) because it wasnt part of the traversal

The index is simply the number in order that the node is visited. a is first, write 1 there. Knowing depth first search as you do, you should know what the second node is; so write 2 under that. Depth is how high a node is; every time you deepen the depth, it increases, and whenever you go shallower, it's less. So a is on depth 1; the next node and its sister will be on depth 2, etc. The parent is the letter identifying the node that you just came from; so a has no parent, and the node with index 2 will have a as parent.
If your class uses a zero-based numbering system, replace 2 in the above paragraph with 1, and 1 with 0. If you have no idea what "zero-based numbering system" is, ignore this paragraph.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex