I have a set of "manager, worker" pairs describing a hierarchy.
What is the total number of "manager, worker" relations if indirect relations are counted as well, such as "manager, worker of my worker", "manager, worker of the worker of my worker", and so on?
For example:
Alex, Pete
Pete, Kane
Jones, Alex
Clark, Allen
The total number of connections is 7 (the first line below shows the 3 direct links of one chain):
Jones -> Alex -> Pete -> Kane
Jones -> Pete
Jones -> Kane
Alex -> Kane
Clark -> Allen
I have to calculate the total number of connections for about 20k relations.
Are there any special methods for doing so?
I think there can't be a mathematical solution that avoids analyzing the structure.
Once you have analyzed it, counting the relations is straightforward.
Here is a solution in Delphi, but you can easily translate it into any other language. You only have to fill the dictionary from your source of the 20k pairs, adding each pair as (worker, manager).
program Project1;
{$APPTYPE CONSOLE}
uses
  Generics.Collections;
// The dictionary is keyed by worker and stores that worker's manager.
// A worker has only one manager, so keys stay unique even when one
// manager has several workers.
function relationcount(dict: TDictionary<string,string>): Integer;
  // Number of people above _key in the chain of command.
  function _relationcount(_key: string): Integer;
  begin
    if dict.ContainsKey(_key) then
      result := 1 + _relationcount(dict.Items[_key])
    else
      result := 0;
  end;
var
  k: string;
begin
  result := 0;
  for k in dict.Keys do
    result := result + _relationcount(k);
end;
var
  relations: TDictionary<string,string>;
begin
  relations := TDictionary<string,string>.Create;
  // replace this with your read-from-file algorithm //
  relations.Add('Pete', 'Alex');   // worker, manager //
  relations.Add('Kane', 'Pete');   //
  relations.Add('Alex', 'Jones');  //
  relations.Add('Allen', 'Clark'); //
  // ------------------------------------------------ //
  writeln(relationcount(relations)); // 7 for the example data
  relations.Free;
end.
Sure there's a solution. Essentially you're asking for the sum of the number of supervisors each person has. So you could do something as simple as
s = 0
for each worker w
    for each worker v
        if v is above w in the supervisory chain then s <- s+1
return s
Depending on how your data is stored and what language you are using there might be a more efficient way of doing this.
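For instance, here is a minimal Python sketch of the "sum of supervisors per person" idea. It assumes each worker has exactly one manager and that the hierarchy contains no cycles; the names `total_relations` and `manager_of` are made up for illustration. Instead of comparing every pair of workers, it walks up from each worker once and caches the chain lengths, so roughly linear time should be enough for 20k pairs:

```python
def total_relations(pairs):
    """Count direct and indirect (manager, worker) relations.

    pairs: iterable of (manager, worker) tuples.
    Assumes one manager per worker and no cycles.
    """
    manager_of = {worker: manager for manager, worker in pairs}
    depth = {}  # cache: person -> number of people above them

    def chain_length(person):
        # Walk upward until we reach the top or an already-cached person.
        path = []
        while person in manager_of and person not in depth:
            path.append(person)
            person = manager_of[person]
        above = depth.get(person, 0)
        # Fill in the cache on the way back down.
        for p in reversed(path):
            above += 1
            depth[p] = above
        return above

    return sum(chain_length(worker) for worker in manager_of)


pairs = [("Alex", "Pete"), ("Pete", "Kane"), ("Jones", "Alex"), ("Clark", "Allen")]
print(total_relations(pairs))  # 7
```

The loop is iterative on purpose, so a very long chain of command does not run into Python's recursion limit.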
I've been banging my head against this question for some time and I can't figure it out. I've read the definition of free variables in "Free variables and bound variables" on Wikipedia and in several books, but I can't get the answer right.
Consider the following code:
local A B C=1 D=2 in
   A = 1
   proc {Add E F G}
      E = A + D + F
   end
end
Which of these identifiers (A, B, C, D, E, F, G) are free identifiers?
The concept of a free identifier always comes with a context. If you consider only the statement E=A+D+F, all four of its identifiers (E, A, D and F) are free. But if you consider the procedure definition, E and F are bound because they are formal parameters, so the free identifiers are A and D. Finally, if you consider the entire code you give, there is no free identifier, since all identifiers are declared.
Reference: Concepts, Techniques, and Models of Computer Programming by Peter Van Roy and Seif Haridi.
The end of page 57 and page 58 are relevant to this question.
The first three chapters are available on the edX platform if you enroll in the course Paradigms of Computer Programming.
The free identifiers of any instruction are those identifier occurrences inside the instruction that correspond to declarations outside the instruction.
So it seems that A and D was the answer.
A very basic question here:
Example rule (suppose it's generated by WEKA):
bread=t 10 ==> milk=t 10 conf:(1)
Which means that "out of 10 instances, every time people buy bread, they also buy milk" (ignore the support).
Can this rule be read both ways? Like, "every time people buy milk, they also buy bread"?
Another example
Physics101=A ==> Superphysics401=A
Can it be read both ways like this:
"If people got A on Physics101, they also got A on Superphysics401"
"If people got A on Superphysics401, they also got A on Physics101" ?
If so, what makes WEKA generate the rule in that order (Physics ==> Superphysics) and not the other way around? Or is the order not relevant?
Can this rule be read both ways? Like, "every time people buy milk, they also buy bread"?
No, it can only be read one way.
This follows from the rules of implication. A -> B and B -> A are different things. Read the former as "A is a subset of B": whenever you are in A, you are also in B. B -> A, called the converse of A -> B, can be interpreted in the same way. When both hold, we write A <-> B, which means that A and B are essentially the same.
If the above looks like too much jargon, keep the following in mind:
Rain -> Clouds is true: whenever there is rain, there will be clouds. But Clouds -> Rain is not always true: there may be clouds but no rain.
If so, what makes WEKA generate the rule in that order (Physics ==> Superphysics) and not the other way around? Or is the order not relevant?
The dataset determines the rules. Here is an example, with one transaction per line:
Milk, Bread, Wafers
Milk, Toasts, Butter
Milk, Bread, Cookies
Milk, Cashewnuts
Convince yourself that Bread -> Milk holds, but Milk -> Bread does not.
Note that we are not always interested only in rules that strictly hold or do not hold. Thus, we add a notion of confidence to the rules. A natural way of defining the confidence of A -> B is P(B|A), i.e. how often we see B when we see A.
This can be calculated by dividing the number of transactions in which A and B appear together by the number of transactions in which A appears.
In our example,
P(Milk | Bread) = 2 / 2 = 1 and
P(Bread | Milk) = 2 / 4 = 0.5
You can now sort the list of rules by confidence and decide which ones you want to use.
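As a small illustration, the two confidence values above can be checked with a few lines of Python. This is only a sketch on the four example transactions (with "Wafers" spelled conventionally); the function name `confidence` is made up here and is not part of WEKA:

```python
def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from a list of item sets."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0  # the rule never fires, so there is nothing to measure
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)


baskets = [
    {"Milk", "Bread", "Wafers"},
    {"Milk", "Toasts", "Butter"},
    {"Milk", "Bread", "Cookies"},
    {"Milk", "Cashewnuts"},
]

print(confidence(baskets, "Bread", "Milk"))  # 1.0
print(confidence(baskets, "Milk", "Bread"))  # 0.5
```

The asymmetry between the two results is exactly why the rule cannot be read in both directions.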
I am trying to write a program to compute the total material costs of an item in production.
So far, I figured out a recursive algorithm might be the best way to do this. While I learned about them in school, we have always avoided them in any practical tasks.
As an example, this will have to do:
1 Kitchen requires 1 Table and 2 Chairs
1 Chair requires 4 Legs
1 Table requires 4 Legs and 1 Plate
Thus, 1 Kitchen requires 12 Legs and 1 Plate (the Table and Chairs should be omitted, as they are intermediate products)
However, I just can't seem to wrap my head around how to actually write the recursive function.
NOTE: If you believe this question is not useful to the community or otherwise of bad quality, please leave a comment on what could improve this question.
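I am not familiar with formal pseudo-code, and I am not sure which programming language you know, so I will try to answer in code-like form. Here amountNeeded(product, subProduct) is a hypothetical helper that returns how many units of subProduct one unit of product requires.
Recursive method: CollectRawProduct(Product, quantity)
CollectRawProduct(Product product, int quantity) {
    if (product has subProducts) {
        for ( subProduct : subProducts ) { // iterate over the sub-products
            // visit the sub-product, scaled by how many of it are needed
            CollectRawProduct( subProduct, quantity * amountNeeded(product, subProduct) );
        }
    } else {
        rawProductTotals.add(product, quantity) // add the quantity to the raw product totals
    }
}
The raw product totals will then contain only raw products; in your case, CollectRawProduct(Kitchen, 1) gives 12 Legs and 1 Plate.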
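A short Python sketch of this recursion for the Kitchen example, assuming the bill of materials is stored as a dictionary (the `RECIPES` layout and the name `raw_materials` are illustrative, not taken from the question). The key step is multiplying the quantity on the way down, so that 2 Chairs contribute 8 Legs:

```python
from collections import Counter

# product -> {component: amount needed for one unit of the product}
RECIPES = {
    "Kitchen": {"Table": 1, "Chair": 2},
    "Chair": {"Leg": 4},
    "Table": {"Leg": 4, "Plate": 1},
}


def raw_materials(product, quantity=1, recipes=RECIPES):
    """Return a Counter of raw materials needed for `quantity` units of `product`."""
    totals = Counter()
    components = recipes.get(product)
    if not components:
        # Base case: no recipe means this is a raw material.
        totals[product] += quantity
        return totals
    for component, per_unit in components.items():
        # Recursive case: scale the requirement and merge the sub-totals.
        totals.update(raw_materials(component, quantity * per_unit, recipes))
    return totals


print(dict(raw_materials("Kitchen")))  # {'Leg': 12, 'Plate': 1}
```

Intermediate products such as Table and Chair never appear in the result because only the base case adds anything to the totals.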
Does anyone know how to replicate the PostgreSQL (pg_trgm) trigram similarity score from the similarity(text, text) function in R? I am using the stringdist package and would rather use R to calculate these scores on a matrix of text strings in a .csv file than run a bunch of PostgreSQL queries.
Running similarity(string1, string2) in PostgreSQL gives me a score between 0 and 1.
I tried using the stringdist package to get a score, but I think I still need to divide the result of the code below by something.
stringdist(string1, string2, method="qgram",q = 3 )
Is there a way to replicate the pg_trgm score with the stringdist package or another way to do this in R?
An example would be getting the similarity score between the description of a book and the description of a genre like science fiction. For example, suppose I have the following two book descriptions and one genre description:
book 1 = "Area X has been cut off from the rest of the continent for decades. Nature has reclaimed the last vestiges of human civilization. The first expedition returned with reports of a pristine, Edenic landscape; the second expedition ended in mass suicide, the third expedition in a hail of gunfire as its members turned on one another. The members of the eleventh expedition returned as shadows of their former selves, and within weeks, all had died of cancer. In Annihilation, the first volume of Jeff VanderMeer's Southern Reach trilogy, we join the twelfth expedition.
The group is made up of four women: an anthropologist; a surveyor; a psychologist, the de facto leader; and our narrator, a biologist. Their mission is to map the terrain, record all observations of their surroundings and of one another, and, above all, avoid being contaminated by Area X itself.
They arrive expecting the unexpected, and Area X delivers—they discover a massive topographic anomaly and life forms that surpass understanding—but it’s the surprises that came across the border with them and the secrets the expedition members are keeping from one another that change everything."
book 2= "From Wall Street to Main Street, John Brooks, longtime contributor to the New Yorker, brings to life in vivid fashion twelve classic and timeless tales of corporate and financial life in America
What do the $350 million Ford Motor Company disaster known as the Edsel, the fast and incredible rise of Xerox, and the unbelievable scandals at GE and Texas Gulf Sulphur have in common? Each is an example of how an iconic company was defined by a particular moment of fame or notoriety; these notable and fascinating accounts are as relevant today to understanding the intricacies of corporate life as they were when the events happened.
Stories about Wall Street are infused with drama and adventure and reveal the machinations and volatile nature of the world of finance. John Brooks’s insightful reportage is so full of personality and critical detail that whether he is looking at the astounding market crash of 1962, the collapse of a well-known brokerage firm, or the bold attempt by American bankers to save the British pound, one gets the sense that history repeats itself.
Five additional stories on equally fascinating subjects round out this wonderful collection that will both entertain and inform readers . . . Business Adventures is truly financial journalism at its liveliest and best."
genre 1 = "Science fiction is a genre of fiction dealing with imaginative content such as futuristic settings, futuristic science and technology, space travel, time travel, faster than light travel, parallel universes, and extraterrestrial life. It often explores the potential consequences of scientific and other innovations, and has been called a "literature of ideas". Authors commonly use science fiction as a framework to explore politics, identity, desire, morality, social structure, and other literary themes."
How can I get a similarity score for the description of each book against the description of the science fiction genre like pg_trgm using an R script?
How about something like this?
library(textcat)
?textcat_xdist
# Compute cross-distances between collections of n-gram profiles.
round(textcat_xdist(
list(
text1="hello there",
text2="why hello there",
text3="totally different"
),
method="cosine"),
3)
# text1 text2 text3
#text1 0.000 0.078 0.731
#text2 0.078 0.000 0.739
#text3 0.731 0.739 0.000
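If the goal is to match the pg_trgm number itself rather than a cosine distance, the scoring can be reproduced by hand. Below is a rough sketch, written in Python rather than R, of the behaviour described in the pg_trgm documentation: text is lowercased, split into alphanumeric words, each word is padded with two leading spaces and one trailing space, and the score is the number of shared trigrams divided by the number of distinct trigrams in either string. The function names are made up, and treating only ASCII letters and digits as word characters is a simplifying assumption, so results on real text may differ slightly from PostgreSQL's:

```python
import re


def trigrams(text):
    """Set of pg_trgm-style trigrams for a string (ASCII-only approximation)."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two spaces in front, one behind
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def trigram_similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(round(trigram_similarity("word", "two words"), 4))  # 0.3636
```

The same set arithmetic can be written in R with `substring` and `intersect`/`union`; applying it to each book description against the genre description gives one score per book.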
To better understand recursion, I'm trying to count how many characters are between each pair of (),
not counting characters that are within other ()s. For example:
(abc(ab(abc)cd)(()ab))
would output:
Level 3: 3
Level 2: 4
Level 3: 0
Level 2: 2
Level 1: 3
Where "Level" refers to the level of () nesting. So level three would mean that the characters are within a pair(1) within a pair(2) within a pair(3).
To do this, my guess is that the easiest approach is some sort of recursive call to the function, as commented inside the function "recursiveParaCheck". How should I start thinking about the recurrence relation?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int recursiveParaCheck(char input[], int startPos, int level);
int main(void)
{
    char input[1024] = "";   // buffer for the input (up to 1023 characters)
    char notDone = 'Y';
    do
    {
        //Read in input
        printf("Please enter input: ");
        scanf(" %1023s", input);
        //Call Recursive Function to print out desired information
        recursiveParaCheck(input, 1, 1);
        printf("\n Would you like to try again? Y/N: ");
        scanf(" %c", &notDone);
        notDone = toupper((unsigned char)notDone);
    }while(notDone == 'Y');
    return 0;
}
int recursiveParaCheck(char input[], int startPos, int level)
{
int pos = startPos;
int total = 0;
do
{
if(input[pos] != '(' && input[pos] != ')')
{
++total;
}
//What is the base case?
if(BASE CASE)
{
//Do something?
}
//When do I need to make a recursive call?
if(SITUATION WHERE I MAKE RECURSIVE CALL)
{
//Do something?
}
++pos;
}while(pos < 1000000); // assuming my input will not be this long
}
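One possible way to think about the recurrence, sketched in Python rather than C: each call handles exactly one pair of parentheses. The base case is reaching the matching ")", at which point the call reports its count and returns the position just past it; the recursive case is meeting a "(", which is handed to a new call one level deeper. The names `scan_group` and `count_levels` are made up for this sketch, and results are collected in a list instead of being printed:

```python
def scan_group(text, pos, level, results):
    """Scan the group opening at text[pos] == '('.

    Appends (level, character_count) when the group closes and
    returns the index just past the matching ')'.
    """
    total = 0
    pos += 1  # skip the opening '('
    while pos < len(text):
        ch = text[pos]
        if ch == "(":
            # Recursive case: a nested group is someone else's job.
            pos = scan_group(text, pos, level + 1, results)
        elif ch == ")":
            # Base case: this group is finished.
            results.append((level, total))
            return pos + 1
        else:
            total += 1
            pos += 1
    raise ValueError("unbalanced parentheses")


def count_levels(text):
    results = []
    pos = 0
    while pos < len(text):
        if text[pos] == "(":
            pos = scan_group(text, pos, 1, results)
        else:
            pos += 1
    return results


for level, count in count_levels("(abc(ab(abc)cd)(()ab))"):
    print(f"Level {level}: {count}")
```

For the example input this reports the groups in the order they close, which is the same order as the expected output in the question.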
Recursion is a wonderful programming tool. It provides a simple, powerful way of approaching a variety of problems. It is often hard, however, to see how a problem can be approached recursively; it can be hard to "think" recursively. It is also easy to write a recursive program that either takes too long to run or doesn't properly terminate at all. In this article we'll go over the basics of recursion and hopefully help you develop, or refine, a very important programming skill.
What is Recursion?
In order to say exactly what recursion is, we first have to answer "What is recursion?" Basically, a function is said to be recursive if it calls itself.
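The function discussed next is not reproduced in this text. As a stand-in, here is a minimal Python sketch of what a function like "HelloWorld(count)" presumably looks like, assuming it simply prints a greeting count times; the exact body is a guess, not the original code:

```python
def hello_world(count):
    """Print a greeting `count` times using recursion instead of a loop."""
    if count < 1:
        return  # base case: nothing left to print, so no further calls
    print("Hello World!")
    hello_world(count - 1)  # each call handles a smaller part of the problem


hello_world(3)
```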
You may be thinking this is not terribly exciting, but this function demonstrates some key considerations in designing a recursive algorithm:
It handles a simple "base case" without using recursion.
In this example, the base case is "HelloWorld(0)"; if the function is asked to print zero times then it returns without spawning any more "HelloWorld"s.
It avoids cycles.
Why use Recursion?
The problem we illustrated above is simple, and the solution we wrote works, but we probably would have been better off just using a loop instead of bothering with recursion. Where recursion tends to shine is in situations where the problem is a little more complex. Recursion can be applied to pretty much any problem, but there are certain scenarios for which you'll find it's particularly helpful. In the remainder of this article we'll discuss a few of these scenarios and, along the way, we'll discuss a few more core ideas to keep in mind when using recursion.
Scenario #1: Hierarchies, Networks, or Graphs
In algorithm discussion, when we talk about a graph we're generally not talking about a chart showing the relationship between variables (like your TopCoder ratings graph, which shows the relationship between time and your rating). Rather, we're usually talking about a network of things, people, or concepts that are connected to each other in various ways. For example, a road map could be thought of as a graph that shows cities and how they're connected by roads. Graphs can be large, complex, and awkward to deal with programmatically. They're also very common in algorithm theory and algorithm competitions. Luckily, working with graphs can be made much simpler using recursion. One common type of graph is a hierarchy, an example of which is a business's organization chart:
Name      Manager
Betty     Sam
Bob       Sally
Dilbert   Nathan
Joseph    Sally
Nathan    Veronica
Sally     Veronica
Sam       Joseph
Susan     Bob
Veronica
In this graph, the objects are people, and the connections in the graph show who reports to whom in the company. An upward line on our graph says that the person lower on the graph reports to the person above them. The table above shows how this structure could be represented in a database. For each employee we record their name and the name of their manager (and from this information we could rebuild the whole hierarchy if required - do you see how?).
Now suppose we are given the task of writing a function that looks like "countEmployeesUnder(employeeName)". This function is intended to tell us how many employees report (directly or indirectly) to the person named by employeeName. For example, suppose we're calling "countEmployeesUnder('Sally')" to find out how many employees report to Sally.
To start off, it's simple enough to count how many people work directly under her. To do this, we loop through each database record, and for each employee whose manager is Sally we increment a counter variable. Implementing this approach, our function would return a count of 2: Bob and Joseph. This is a start, but we also want to count people like Susan or Betty who are lower in the hierarchy but report to Sally indirectly. This is awkward because when looking at the individual record for Susan, for example, it's not immediately clear how Sally is involved.
A good solution, as you might have guessed, is to use recursion. For example, when we encounter Bob's record in the database we don't just increment the counter by one. Instead, we increment by one (to count Bob) and then increment it by the number of people who report to Bob. How do we find out how many people report to Bob? We use a recursive call to the function we're writing: "countEmployeesUnder('Bob')". Here's pseudocode for this approach:
function countEmployeesUnder(employeeName)
{
    declare variable counter
    counter = 0
    for each person in employeeDatabase
    {
        if(person.manager == employeeName)
        {
            counter = counter + 1
            counter = counter + countEmployeesUnder(person.name)
        }
    }
    return counter
}
If that's not terribly clear, your best bet is to try following it through line-by-line a few times mentally. Remember that each time you make a recursive call, you get a new copy of all your local variables. This means that there will be a separate copy of counter for each call. If that wasn't the case, we'd really mess things up when we set counter to zero at the beginning of the function. As an exercise, consider how we could change the function to increment a global variable instead. Hint: if we were incrementing a global variable, our function wouldn't need to return a value.
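To make the trace easier to follow, here is the pseudocode translated into a small runnable Python sketch. The `EMPLOYEES` list stands in for the employee database and is simply the table from earlier; the Python names are illustrative:

```python
# (name, manager) records; Veronica is at the top and has no manager.
EMPLOYEES = [
    ("Betty", "Sam"), ("Bob", "Sally"), ("Dilbert", "Nathan"),
    ("Joseph", "Sally"), ("Nathan", "Veronica"), ("Sally", "Veronica"),
    ("Sam", "Joseph"), ("Susan", "Bob"), ("Veronica", None),
]


def count_employees_under(employee_name, employees=EMPLOYEES):
    """Count people who report, directly or indirectly, to employee_name."""
    counter = 0  # a fresh local counter for every call
    for name, manager in employees:
        if manager == employee_name:
            counter += 1                                      # the direct report
            counter += count_employees_under(name, employees)  # their reports
    return counter


print(count_employees_under("Sally"))  # 5: Bob, Susan, Joseph, Sam, Betty
print(count_employees_under("Betty"))  # 0
```

Each recursive call rescans the whole list, which is fine for a table this size; a larger database would probably want an index from manager to direct reports.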
Mission Statements
A very important thing to consider when writing a recursive algorithm is to have a clear idea of our function's "mission statement." For example, in this case I've assumed that a person shouldn't be counted as reporting to him or herself. This means "countEmployeesUnder('Betty')" will return zero. Our function's mission statement might thus be "Return the count of people who report, directly or indirectly, to the person named in employeeName - not including the person named employeeName."
Let's think through what would have to change in order to make it so a person did count as reporting to him or herself. First off, we'd need to make it so that if there are no people who report to someone we return one instead of zero. This is simple -- we just change the line "counter = 0" to "counter = 1" at the beginning of the function. This makes sense, as our function has to return a value 1 higher than it did before. A call to "countEmployeesUnder('Betty')" will now return 1.
However, we have to be very careful here. We've changed our function's mission statement, and when working with recursion that means taking a close look at how we're using the call recursively. For example, "countEmployeesUnder('Sam')" would now give an incorrect answer of 3. To see why, follow through the code: First, we'll count Sam as 1 by setting counter to 1. Then when we encounter Betty we'll count her as 1. Then we'll count the employees who report to Betty -- and that will return 1 now as well.
It's clear we're double counting Betty; our function's "mission statement" no longer matches how we're using it. We need to get rid of the line "counter = counter + 1", recognizing that the recursive call will now count Betty as "someone who reports to Betty" (and thus we don't need to count her before the recursive call).
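The double counting is easy to reproduce. The Python sketch below, using the same example hierarchy and illustrative names, puts the half-changed version next to the version whose code matches the new mission statement:

```python
EMPLOYEES = [
    ("Betty", "Sam"), ("Bob", "Sally"), ("Dilbert", "Nathan"),
    ("Joseph", "Sally"), ("Nathan", "Veronica"), ("Sally", "Veronica"),
    ("Sam", "Joseph"), ("Susan", "Bob"), ("Veronica", None),
]


def count_double_counting(employee_name):
    """Half-changed version: counter starts at 1 but still adds 1 per report."""
    counter = 1
    for name, manager in EMPLOYEES:
        if manager == employee_name:
            counter += 1                            # counts the report here...
            counter += count_double_counting(name)  # ...and again in the call
    return counter


def count_including_self(employee_name):
    """Mission: count employee_name plus everyone who reports to them."""
    counter = 1  # the named person
    for name, manager in EMPLOYEES:
        if manager == employee_name:
            counter += count_including_self(name)  # already includes the report
    return counter


print(count_double_counting("Sam"))  # 3 (Betty is counted twice)
print(count_including_self("Sam"))   # 2 (Sam and Betty)
```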
As our functions get more and more complex, problems with ambiguous "mission statements" become more and more apparent. In order to make recursion work, we must have a very clear specification of what each function call is doing or else we can end up with some very difficult to debug errors. Even if time is tight it's often worth starting out by writing a comment detailing exactly what the function is supposed to do. Having a clear "mission statement" means that we can be confident our recursive calls will behave as we expect and the whole picture will come together correctly.