Google Code Jam 2016: Round 1A, BFF - graph

Question :
You are a teacher at the brand new Little Coders kindergarten. You have N kids in your class, and each one has a different student ID number from 1 through N. Every kid in your class has a single best friend forever (BFF), and you know who that BFF is for each kid. BFFs are not necessarily reciprocal -- that is, B being A's BFF does not imply that A is B's BFF.
Your lesson plan for tomorrow includes an activity in which the participants must sit in a circle. You want to make the activity as successful as possible by building the largest possible circle of kids such that each kid in the circle is sitting directly next to their BFF, either to the left or to the right. Any kids not in the circle will watch the activity without participating.
What is the greatest number of kids that can be in the circle?
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each test case consists of two lines. The first line of a test case contains a single integer N, the total number of kids in the class. The second line of a test case contains N integers F1, F2, ..., FN, where Fi is the student ID number of the BFF of the kid with student ID i.
Output
For each test case, output one line containing "Case #x: y", where x is the test case number (starting from 1) and y is the maximum number of kids in the group that can be arranged in a circle such that each kid in the circle is sitting next to his or her BFF.
My problem : There is the contest analysis on the code jam site, but I don't understand it. Where is the optimization happening? If someone can explain this problem and its solution in a detailed manner, it will be very helpful.
Edit : I am not adding any pseudo-code, because I want to better my understanding of the problem, and it's not a coding issue.
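For orientation, here is a minimal Python sketch of one common way to attack this problem. It is not quoted from the contest analysis, and the function name `max_circle` is made up for illustration. The idea: a valid circle is either a single BFF cycle of length 3 or more, or it is built from mutual pairs (A and B are each other's BFF). Each mutual pair can have the longest chain of admirers leading into each of its two members attached, and all such pair-based segments can sit in the same circle side by side. That is where the optimization happens: take the larger of the longest cycle and the sum over all mutual pairs.

```python
def max_circle(bff):
    """bff[i] is the 1-based ID of the BFF of kid i+1; returns the largest circle size."""
    f = [x - 1 for x in bff]                      # switch to 0-based IDs
    n = len(f)
    in_pair = [f[f[i]] == i for i in range(n)]    # True if kid i is in a mutual pair

    longest_cycle = 0
    tail = [0] * n   # tail[p]: longest chain of kids leading into pair member p
    for start in range(n):
        seen = set()
        cur, steps = start, 0
        # Follow BFF links until we hit a mutual pair or revisit a kid.
        while cur not in seen and not in_pair[cur]:
            seen.add(cur)
            cur = f[cur]
            steps += 1
        if in_pair[cur]:
            tail[cur] = max(tail[cur], steps)
        elif cur == start:
            # We came back to the start, so start lies on a cycle of this length.
            longest_cycle = max(longest_cycle, steps)

    # Every pair member contributes itself plus its longest incoming chain.
    pairs_total = sum(1 + tail[i] for i in range(n) if in_pair[i])
    return max(longest_cycle, pairs_total)


print(max_circle([7, 8, 10, 10, 9, 2, 9, 6, 3, 3]))
```

In the last example kids 3 and 10 form a mutual pair, the chain 1, 7, 9 leads into kid 3 and kid 4 leads into kid 10, so the pair-based circle has 2 + 3 + 1 kids, which beats the 3-cycle 2, 8, 6.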

Related

Knapsack with non-linear constraints & step function including item dependencies

I am trying to solve an optimization problem which looks similar to a knapsack problem. The setting is the following:
I have a pool of ~80,000 players from which I want to build the cheapest squad of exactly 11 players. Each player has multiple attributes: the main position he plays in, nation, club, league and rating.
The players not only need to be selected but also assigned to a position in the formation:
Stating the following problem:
The first constraint is a minimum rating of the squad, which can simply be formulated as a linear constraint. The second and third constraints make sure that exactly one player is selected for each position and that each player can only be selected once.
There are several other linear constraints that can occur, like a minimum number of players from one nation or at most three players from a specific club, etc.
The chemistry of a squad is a non-linear constraint with a step function.
A player's individual chemistry is the product of his position bonus and his link bonus.
The position bonus is defined by the player's main position and where in the formation he is placed. A central defender placed in the corresponding position gets 3 points; used as a striker he gets 0 points. The bonuses can be seen in the next table.
This part of the constraint can still be formulated linearly. The link bonus is the non-linear component. Each position/node in the formation/graph has a weight between [0-3]. Two adjacent players have a weight of 1 if they are from the same nation, league or club. Sharing two of these attributes gives a weight of 2, and sharing all three gives a weight of 3. The bonus for a specific position is the average of all edges multiplied by a factor of 3.
This bonus is plugged into a step function, which can be seen in the next figure (mapping values between [0-1] to 0.9 etc.). The link bonus is multiplied by the position bonus and capped at 10. The team chemistry is defined as the sum of the individual player chemistries.
I implemented it as described in MiniZinc, solving it with the osicbc solver, but even for a player pool of ~100 players this is not really feasible to compute, depending on the additional constraints.
Now I am looking for an implementation that can approximate the solution. I was thinking about simulated annealing or a genetic algorithm. However, due to this chemistry constraint these approaches produce a lot of invalid solutions, wandering around in the dark.
Does anyone have an approach that might be applicable to my problem?

More info needed on number of nodes generated by Breadth First Search

I am new to AI and was going through Peter Norvig's book. I've already looked into the question "What is the number of nodes generated by breadth-first search?".
It says that if we apply the goal test to each node when it is selected for expansion, then we have nodes = 1 + b + b^2 + b^3 + ... + b^d + (b^(d+1) - b).
But what if my goal state is a leaf node at the final depth, so that there is no depth at all after the goal? Then how can b^(d+1) be evaluated? E.g. in a tree with max depth 3, if my goal lies at depth 3, how would I evaluate b^(3+1) when there is no 4th level at all? Please clear my doubt. Thanks in advance!
Note that the answer you linked mentioned that that is the amount of nodes that will be generated in the worst case.
Generated means that not all of those nodes are tested to see if they are the goal; they're simply generated and stored so that they can eventually be compared to the goal in case the goal is not found yet.
Worst case has two important implications. Try to visualize the Breadth-First Search going from left to right, then down one level, then left to right again, then down, etc. With worst case we assume that, on whatever depth level d the goal is located, the goal is the very last (rightmost) node. This means that all nodes to the left of it are compared to the goal node, and any successors/children of them are generated as well.
Now, I know that you said that in your case there are no nodes at a depth level below d, but the second implication of saying worst case is that we do assume there are basically infinitely many depth levels.
Indeed, for your case that equation is not entirely correct, but this is simply because you don't have the worst case. In your case, the search process would indeed not have to generate the last (b^(d+1) - b) nodes of the equation.
A final note on the terminology you used: you asked how b^(d+1) (for example, b^(3+1)) can be evaluated if there is no depth level below d = 3. There is no problem evaluating that term mathematically. Even though there is no depth level 4 in your case, we can still evaluate the term b^(3+1). It would not make sense to do so in your case, because the result would not be correct, but we can still evaluate the term just fine.
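To make the difference concrete, here is a small Python sketch that counts generated nodes on a complete b-ary tree, with the goal test applied when a node is selected for expansion. The function name `bfs_generated` and the `(depth, index)` node encoding are assumptions made for this illustration. The goal is the rightmost node at depth d, and `max_depth` controls whether the tree continues below the goal.

```python
from collections import deque

def bfs_generated(b, goal, max_depth):
    """Count the nodes BFS generates before the goal is selected for expansion.

    Nodes are (depth, index) pairs in a complete b-ary tree; nodes at
    max_depth are leaves and have no children to generate.
    """
    frontier = deque([(0, 0)])
    generated = 1                      # the root counts as generated
    while frontier:
        depth, index = frontier.popleft()
        if (depth, index) == goal:     # goal test on expansion
            return generated
        if depth < max_depth:
            for k in range(b):
                frontier.append((depth + 1, index * b + k))
            generated += b
    return generated

b, d = 2, 3
goal = (d, b**d - 1)                              # rightmost node at depth d
print(bfs_generated(b, goal, max_depth=d + 1))    # tree continues below the goal
print(bfs_generated(b, goal, max_depth=d))        # goal is a leaf at the final depth
```

With b = 2 and d = 3 the first call gives 29 = 1 + 2 + 4 + 8 + (2^4 - 2), matching the worst-case formula, while the second gives 15 = 1 + 2 + 4 + 8 because the (b^(d+1) - b) nodes are never generated.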

Counting to a million in Python - Theory

I'm learning Python and came across a question that went something like "How long would it take to count to 1,000,000 out loud?" The only parameter it gave was, "you count, on average, 1 digit per second." I did that problem, which wasn't very difficult. Then I started thinking about counting aloud, enunciating each numeral. That parameter seems off to me, and indeed the answer Google gives to the question "how long to count to a million" suggests it's off. Given that each number in the sequence takes progressively longer (an exponential increase??), there must be a better way.
Any ideas or general guidance would be of assistance. Would sampling various people's "counting rates" at various intervals work? Would programming the # of syllables work? I am really curious, and have looked all over SO and Google for solutions that don't revolve around that seemingly inaccurate "average time".
Thanks, and sorry if this isn't on topic or in the appropriate place. I'm a long time lurker, but new to posting, so let me know if you need more info or anything. Thanks!
Let us suppose for the sake of simplicity that you don't say 1502 as "fifteen hundred and two", but as "thousand five hundred and two". Then we can hierarchically break it down.
And let's ignore for now whether you say "and" or not (though apparently it is said more often than not). For how to pronounce numbers I will use this reference, in British English, because I like it more and it's more consistent: http://forum.wordreference.com/showthread.php?t=15&langid=6
In fact, to formally describe this, let t be a function of a set of numbers that tells you how much time it takes to pronounce every number in that set. Then your question is how to compute t([1..1000000]), and we will use M = t([1..999]).
Triplet time as a function of the previous one
To read a large number we start at the left and read the three-digit groups. The group at the left, of course, may have only one or two digits.
Thus for every number x of thousands you will say x thousand y where y will describe all the numbers from 1 to 999.
Thus the time you spend in the x thousand ... is 1000 t({1000x}) + M, as detailed hereafter:
Note that this formula is generalizable to numbers below 1000, by simply defining t({0}) = 0.
Now the time to say "x thousand" is, per our hypothesis, equal to the time to say "x" plus the time to say "thousand" (when x > 0). Thus your answer is :
Where tau("thousand") is the time it takes to say the word thousand. This supposes you say 1000 as "one thousand". You may want to remove 1000 tau("one") if you would only say "thousand".
However, I stick with the reference:
The numbers 100-199 begin with one hundred... or a hundred...
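As a sanity check on the breakdown so far, summing 1000 t({1000x}) + M over x = 0..999, with t({1000x}) = t({x}) + tau("thousand") for x > 0 and t({0}) = 0, gives t([1..999999]) = 2000 M + 999000 tau("thousand"). This is a reconstruction from the definitions above, still ignoring every "and". The Python sketch below compares it with a direct count. The word lists, the helper names and above all the timing model `tau` (one time unit per letter) are placeholder assumptions; any per-word timing, such as a syllable count, could be plugged in instead.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def group_words(n):
    """Words spoken for 0 <= n <= 999, without any "and"; 0 gives no words."""
    hundreds, rest = divmod(n, 100)
    words = [ONES[hundreds], "hundred"] if hundreds else []
    if rest >= 20:
        words.append(TENS[rest // 10])
        rest %= 10
    if rest:
        words.append(ONES[rest])
    return words

def tau(word):
    """Assumed time to say one word: one unit per letter (a crude placeholder)."""
    return len(word)

# Time to say each three-digit group, computed once.
GROUP_TIME = [sum(tau(w) for w in group_words(n)) for n in range(1000)]

def say_time(n):
    """Time to say 1 <= n <= 999999, read as "x thousand y"."""
    thousands, rest = divmod(n, 1000)
    prefix = GROUP_TIME[thousands] + tau("thousand") if thousands else 0
    return prefix + GROUP_TIME[rest]

M = sum(GROUP_TIME)                                   # t([1..999])
direct = sum(say_time(n) for n in range(1, 1_000_000))
formula = 2000 * M + 999_000 * tau("thousand")
print(direct == formula)
```

Adding tau("one") + tau("million") for the final number then gives the time to count all the way to 1,000,000 under this model.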
In exactly the same way, you can express the time it takes to count to a billion in terms of the result above, and so on for all the greater powers of 10^3, i.e.
Taking into account the "and"
There is a small correction to be made. Let us suppose that M is the time it takes to pronounce the numbers from 1 to 999 when they are preceded by at least one non-zero group of digits, including initial "and"s.
Our reference (well, the wordreference post I linked) says the following :
What do we say to join the groups?
Normally, we don’t use any joining word.
The exception is the last group.
If the last group after the thousands is 1-99 it is joined with and.
Thus our correction applies only to the numbers between 0 and 999 (where there is no non-zero group preceding) :
Getting M
Or rather, let's get t([1..999]) since it's more natural and we know how it is related to M.
Let C = t([1..99]), X = t([1..9]).
Between 1 and 999 we have all the numbers from [1..99] and all the 9 exact hundreds where you don't say "and", that is 108 occurrences. There are 900 numbers prefixed with a hundreds number.
Thus
C is probably hard to break down, so I'm not going to try.
Final result
The corrected formula is :
And as a function of C and X :
Note that your measures of tau(word), C, and X need to be very precise if you plan on doing this multiplication and having any kind of correct order of magnitude.
Conclusion : Brits end up saying "and" a whole lot. The nice thing about the last formulation is that you can remove all the "and"s if you decide you actually don't want to pronounce them.

Graph traversal

At a party with n people P1, . . . , Pn, certain pairs of individuals cannot stand each other.
Given a list of such pairs, determine if we can divide the n people into two groups such that all the people in each group are amicable, that is, they can stand each other.
Suppose we have a graph G in which two people who cannot be in the same group have an edge between them. Run DFS on G: assign Group1 to the start node s, then Group2 to its successors, then Group1 to their successors, and so on, alternating. If we can finish without a conflict, we have found a valid division. Otherwise there is a collision (two adjacent people would land in the same group), which means we can't divide them into two groups as the question asks.
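The alternating DFS described above amounts to checking whether the "cannot stand each other" graph is bipartite. Here is a minimal Python sketch; the function name `split_into_two_groups` and the 0-based numbering of people are assumptions made for the example. It uses an explicit stack instead of recursion and restarts from every unvisited person so that disconnected parts of the graph are handled too.

```python
def split_into_two_groups(n, dislikes):
    """Return a list mapping each person to group 1 or 2, or None if impossible.

    dislikes is a list of (a, b) pairs of people who cannot stand each other.
    """
    adj = [[] for _ in range(n)]
    for a, b in dislikes:
        adj[a].append(b)
        adj[b].append(a)

    group = [None] * n
    for s in range(n):
        if group[s] is not None:
            continue                    # already placed by an earlier DFS
        group[s] = 1
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if group[v] is None:
                    group[v] = 3 - group[u]   # put v in the opposite group
                    stack.append(v)
                elif group[v] == group[u]:
                    return None         # collision: enemies in the same group
    return group


print(split_into_two_groups(4, [(0, 1), (1, 2), (2, 3)]))
print(split_into_two_groups(3, [(0, 1), (1, 2), (0, 2)]))
```

The first call finds the split [1, 2, 1, 2]; the second returns None because three people who all dislike each other cannot be placed in two groups.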
One brute force solution would be to find all possible combinations of n choose n/2 people and then verify that everyone in the group is amicable, if so, then you must check everyone in the other half as well. If both sides are happy then you've found a solution. Otherwise, move on to the next combination. Obviously, this is not an ideal solution, but it does work deterministically. Typically in an interview, it is best to start with something that works and iterate on to better ideas.
A more sophisticated solution would compute the complement graph, then remove any edges that are not bi-directional, pick an arbitrary node to start from, use depth-first search, mark every node found in group 1. Then pick any unmarked node, and mark every node found in group 2. If there are any remaining unmarked nodes, then the individuals cannot be divided into two amicable groups.

Friends selection algorithm

In a .NET project we have a group of 200 people of two types, let's say x and y, who need to be separated into groups of 7 or 8.
We have a web page where people list the other members they want to be in a group with. Each person builds a list of wanted members.
After this, there should be an algorithm to build the 7-8 member groups considering the people's ratings, and the following condition: each group has at least 2 people of each type (x/y).
I'm pretty sure there must be a well-known algorithm similar to this, but I didn't find one. Does anyone know how to do it?
This problem smells NP-hard, so I suggest using Artificial Intelligence tools.
A possible approach is steepest ascent hill climbing [SAHC].
First, we define our utility function (let it be u) as mentioned in the comments to the question [sum of friends in group for each user]. Let's define u(illegal) = -1 for an illegal solution.
Next, we define our 'world': S is the set of all possible solutions.
For each solution s in S we define:
next(s) = {all possibilities of moving one person to a different group}
All we have to do now is run SAHC with random restarts:
1. best <- -INFINITY
2. while there is more time
3. s <- a random legal solution
4. NEXT <- next(s)
5. if max{ u(NEXT) } <= u(s): //s is the top of the hill
5.1. if u(s) > best: best <- u(s) //if s is better than the previous result, store it (and remember s).
5.2. go to 2. //restart the hill climbing from a different random point.
6. else:
6.1. s <- argmax{ u(NEXT) } //climb the steepest hill.
6.2. go to 4.
7. return best //when out of time, return the best solution found so far.
It is an anytime algorithm, meaning it will get a better result as you give it more time to run, and eventually [at time infinity] it will find the optimal result.
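The steps above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not a tuned implementation: a restart count replaces the time budget, and the small instance (8 people, 2 groups of 3 to 5 members, at least 1 person of each type per group) and all names such as `KINDS` and `WISHES` are made up so that the example runs quickly. The real problem would use groups of 7 or 8 with at least 2 people of each type.

```python
import random

def hill_climb_with_restarts(random_solution, neighbours, utility, restarts, rng):
    """Steepest ascent hill climbing, restarted from random legal solutions."""
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        s = random_solution(rng)
        while True:
            top = max(neighbours(s), key=utility, default=s)
            if utility(top) <= utility(s):   # s is the top of its hill
                break
            s = top                          # climb the steepest hill
        if utility(s) > best_score:
            best, best_score = s, utility(s)
    return best, best_score

# Toy instance (all values are assumptions for the example).
KINDS = "xyxyxyxy"                           # type of each of the 8 people
WISHES = {p: set(range(4)) - {p} if p < 4 else set(range(4, 8)) - {p}
          for p in range(8)}                 # who each person wants in their group
N_GROUPS, MIN_SIZE, MAX_SIZE, MIN_PER_KIND = 2, 3, 5, 1

def utility(assign):
    """Sum of wished-for friends in each person's group, or -1 if illegal."""
    groups = [[p for p, g in enumerate(assign) if g == k] for k in range(N_GROUPS)]
    for members in groups:
        kinds = [KINDS[p] for p in members]
        if not MIN_SIZE <= len(members) <= MAX_SIZE:
            return -1
        if kinds.count("x") < MIN_PER_KIND or kinds.count("y") < MIN_PER_KIND:
            return -1
    return sum(len(WISHES[p] & set(members)) for members in groups for p in members)

def neighbours(assign):
    """All assignments obtained by moving one person to a different group."""
    return [assign[:p] + (g,) + assign[p + 1:]
            for p in range(len(assign)) for g in range(N_GROUPS) if g != assign[p]]

def random_solution(rng):
    """Draw random assignments until a legal one turns up."""
    while True:
        assign = tuple(rng.randrange(N_GROUPS) for _ in KINDS)
        if utility(assign) >= 0:
            return assign

best, score = hill_climb_with_restarts(
    random_solution, neighbours, utility, restarts=20, rng=random.Random(0))
print(best, score)
```

Because illegal solutions score -1 and every starting point is legal, the climb never steps onto an illegal assignment. For 200 people, recomputing the utility from scratch for every neighbour is wasteful, and an incremental update of the score would be the natural next step.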
