How to write a SELECT query for a tree with unlimited levels, using a recursive method - recursion

I want to write an SQL query that selects all the descendants of a chosen parent. For example:
The following picture explains it.
If I choose the parent "hot", I must get {tea, green tea, lemon tea, reg tea, coffee, espresso, cappuccino, latte}.
If I choose "juice", I also get all the children that belong to it: {mango, orange, lemonade}.
I think it should be a recursive select that calls itself until all levels of sub-children have been reached.
(https://i.stack.imgur.com/h5FGc.png)
DRINK
|__hot
|   |__tea: green tea, regular tea, lemon tea
|   |__coffee: cappuccino, espresso, latte
|
|__cold
    |__shake: cocktail, strawberry, banana
    |__juice: mango, orange, ...
    |__water: still, sparkling, flavoured, ...
The table could look like this:
id name ref
1 drink 0
2 cold 1
3 hot 1
4 tea 3
5 coffe 3
6 shake 2
7 juice 2
8 water 2
9 espreso 5
10 capucino 5
11 late 5
12 mango 7
13 coktail 6
14 still 8
15 sparkling 8
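One common way to express this kind of "all descendants" lookup is a recursive common table expression. The following is a minimal sketch, assuming the table is called drinks (a hypothetical name; the question does not name it) and using SQLite through Python's sqlite3 module; other databases spell the recursive CTE slightly differently. Note that the sample rows above contain fewer leaves than the picture, so the result only includes what is in the table.

```python
import sqlite3

# Rows copied from the sample table in the question: (id, name, ref)
ROWS = [
    (1, "drink", 0), (2, "cold", 1), (3, "hot", 1), (4, "tea", 3),
    (5, "coffe", 3), (6, "shake", 2), (7, "juice", 2), (8, "water", 2),
    (9, "espreso", 5), (10, "capucino", 5), (11, "late", 5),
    (12, "mango", 7), (13, "coktail", 6), (14, "still", 8),
    (15, "sparkling", 8),
]


def build_db():
    """Create an in-memory database holding the sample tree."""
    conn = sqlite3.connect(":memory:")
    # "drinks" is an assumed table name, not taken from the question.
    conn.execute("CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT, ref INTEGER)")
    conn.executemany("INSERT INTO drinks VALUES (?, ?, ?)", ROWS)
    return conn


def descendants(conn, parent_name):
    """Return the names of all nodes below parent_name, at any depth."""
    sql = """
        WITH RECURSIVE subtree(id, name) AS (
            -- anchor: direct children of the chosen parent
            SELECT id, name FROM drinks
            WHERE ref = (SELECT id FROM drinks WHERE name = ?)
            UNION ALL
            -- recursive step: children of rows already found
            SELECT d.id, d.name FROM drinks d
            JOIN subtree s ON d.ref = s.id
        )
        SELECT name FROM subtree ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (parent_name,))]


if __name__ == "__main__":
    conn = build_db()
    print(descendants(conn, "hot"))
    print(descendants(conn, "juice"))
```

The database engine repeats the recursive step until no new rows appear, so the depth of the tree does not need to be known in advance.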

Related

Hierarchy chart of 250+ shops and 35,000 employees

Are there any tips for making org charts for a 35,000-member organization?
I've attached an org chart for a single shop.
Scenario: We have 250+ shops. Each shop is made up of multiple sections. Each section has a unique section name. Each section is made up of a different number of managers, technicians, and supervisors. Each shop can be considered a child that reports to a parent. Each parent not only has that particular child shop, but can also have multiple other shops under it. That parent can also be a child of a different shop, which is making group_by a challenge. A is a child of parent B, but B is also a child of parent C, which is also a child of parent D, for example.
The source is an Excel document with 35,000 rows and 50+ columns. Each shop is identified by a shop code, and each shop code reports to a parent with its own shop code.
group_by(parent id, child id) might not work, because a parent of one shop can be a child of a different parent.
Unit ID   Reports To   Unit name   Managers in unit   Supervisors in unit   Technicians in unit
10        11           i           2                  0                     4
9         11           h           2                  1                     0
8         9            g           4                  3                     2
6         7            f           2                  3                     4
5         7            e           1                  2                     3
4         5            d           2                  1                     0
3         4            c           4                  3                     2
2         4            b           2                  3                     4
1         2            a           1                  2                     3
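Because a parent can itself be a child, a flat group-by on (parent, child) cannot capture the whole chain; the hierarchy has to be walked. The sketch below illustrates that idea with only the "Unit ID" and "Reports To" columns from the sample table. The names REPORTS_TO, chain_to_root and all_descendants are made up for this illustration and do not come from any library.

```python
from collections import defaultdict

# unit id -> id of the unit it reports to, copied from the sample table
REPORTS_TO = {10: 11, 9: 11, 8: 9, 6: 7, 5: 7, 4: 5, 3: 4, 2: 4, 1: 2}


def chain_to_root(unit, reports_to=REPORTS_TO):
    """Follow 'Reports To' links upward until a unit has no recorded parent."""
    chain = [unit]
    seen = {unit}
    while chain[-1] in reports_to:
        parent = reports_to[chain[-1]]
        if parent in seen:
            # Guard against bad data that would otherwise loop forever.
            raise ValueError(f"cycle detected at unit {parent}")
        chain.append(parent)
        seen.add(parent)
    return chain


def all_descendants(unit, reports_to=REPORTS_TO):
    """Collect every unit that reports to `unit`, directly or indirectly."""
    children = defaultdict(list)
    for child, parent in reports_to.items():
        children[parent].append(child)
    found, stack = [], [unit]
    while stack:
        for child in children[stack.pop()]:
            found.append(child)
            stack.append(child)
    return sorted(found)


if __name__ == "__main__":
    print(chain_to_root(1))      # chain from unit 1 up to its top-level parent
    print(all_descendants(7))    # every unit underneath unit 7
```

With the full spreadsheet, the same child-to-parent mapping could be built from the shop code columns, and headcounts could then be summed over each unit's descendants.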
You are looking for BALKAN OrgChartJS; it has the functionality you are asking for:
Code demo with chart and 100k nodes (rows)
Code demo for Import from CSV file and other formats
You can also read directly from the Excel (CSV) file with an HTTP request and load the chart, without any server-side code.
Disclaimer: I'm a developer in BALKAN App

Is there an efficient algorithm to create this type of schedule? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am creating a schedule for a sports league with several dozen teams. I already have all of the games in a set order and now I just need to assign one team to be the "home" team and one to be "away" for each game.
The problem has two constraints:
1) Each pair of teams must play an equal number of home and away games against each other. For example, if team A and team B play 4 games, then 2 must be hosted by A and 2 by B. Assume that each pair of teams plays an even number of games against each other.
2) No team should have more than three consecutive home games or three consecutive away games at any point in the schedule.
I have been trying to use brute force in R to solve this problem but I can't get any of my code blocks to solve the issue in a timely fashion. Does anyone have any advice on how to deal with either (or both) of the above constraints algorithmically?
You need to do more research on simple scheduling.
There are a lot of references online for these things.
Here are the basics for your application. Let's assume a league of 6 teams; the process is the same for any number.
Match 1: Simply write down the team numbers in order, in pairs, in a ring. Flatten the ring into two lines. Matches are upper (home) and lower (away).
1 2 3
6 5 4
Matches 2-5: Team 1 stays in place; the others rotate around the ring.
1 6 2
5 4 3

1 5 6
4 3 2

1 4 5
3 2 6

1 3 4
2 6 5
That's one full cycle. To balance the home-away schedule, simply invert the fixtures every other match:
1 2 3    5 4 3    1 5 6    3 2 6    1 3 4
6 5 4    1 6 2    4 3 2    1 4 5    2 6 5
There's your first full round. Simply replicate this, again switching home-away fixtures in alternate rounds. Thus, the second round would be:
6 5 4    1 6 2    4 3 2    1 4 5    2 6 5
1 2 3    5 4 3    1 5 6    3 2 6    1 3 4
Repeat this pair of rounds as many times as needed to get the length of schedule you need.
If you have an odd quantity of teams, simply declare one of the numbers to be the "bye" in the schedule. I find it easiest to follow if I use the non-rotating team -- team 1 in this example.
Note that this home-switching process guarantees that no team has three consecutive matches either home or away: they get two in a row when rounding the end of the row. However, even the two-in-a-row doesn't suffer at the end of the round: both of those teams break the streak in the first match of the next round.
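The rotation described above can be sketched in a few lines of Python. This is only an illustration of the ring method for an even number of teams; the function names circle_round, schedule and longest_streak are invented here. The last helper measures the longest run of consecutive home or away matches, which makes it possible to check the streak claim for the 6-team example.

```python
def circle_round(n):
    """One full round for n teams (n even) as a list of matches.

    Each match is a list of (home, away) pairs. Team 1 stays in place,
    the others rotate, and every other match has home/away inverted.
    """
    teams = list(range(1, n + 1))
    half = n // 2
    matches = []
    for m in range(n - 1):
        # Upper line is the first half; lower line is the second half reversed.
        pairs = list(zip(teams[:half], reversed(teams[half:])))
        if m % 2 == 1:
            pairs = [(away, home) for home, away in pairs]
        matches.append(pairs)
        # Rotate everyone except the first team one step around the ring.
        teams = [teams[0], teams[-1]] + teams[1:-1]
    return matches


def schedule(n, rounds):
    """Repeat the round, swapping home and away in alternate rounds."""
    base = circle_round(n)
    flipped = [[(away, home) for home, away in match] for match in base]
    out = []
    for r in range(rounds):
        out.extend(base if r % 2 == 0 else flipped)
    return out


def longest_streak(matches):
    """Longest run of consecutive home (or away) matches for any team."""
    best, run = 0, {}
    for match in matches:
        for home, away in match:
            for team, side in ((home, "H"), (away, "A")):
                last, length = run.get(team, (None, 0))
                length = length + 1 if side == last else 1
                run[team] = (side, length)
                best = max(best, length)
    return best


if __name__ == "__main__":
    for match in circle_round(6):
        print(match)
    print(longest_streak(schedule(6, 4)))
```

For 6 teams the first round printed by this sketch matches the fixtures listed above, and over four rounds the longest streak it reports is two.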
Unfortunately, for an arbitrary existing schedule, you are stuck with a brute-force search with backtracking. You can employ some limits and heuristics, such as balancing partial home-away fixtures as the first option at each juncture. Still, the better approach is to make your original schedule correct by design.
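For a fixed list of games, a backtracking search along those lines might look like the sketch below. It is a plain depth-first search with two pruning checks, not an optimized solver; assign_home_away and its max_run parameter are names chosen for this example. It uses recursion, so a very long schedule would need an iterative version or a raised recursion limit.

```python
from collections import Counter


def assign_home_away(games, max_run=3):
    """Pick a host for each game, in the given order.

    games: list of (team_a, team_b) tuples in schedule order.
    Returns a list of (home, away) tuples, or None if no assignment
    satisfies both constraints.
    """
    meetings = Counter(frozenset(g) for g in games)
    hosted = Counter()   # (host, guest) -> games hosted so far
    history = {}         # team -> list of "H"/"A" played so far
    result = []

    def can_play(team, side):
        # Reject if the team already has max_run of this side in a row.
        played = history.setdefault(team, [])
        return played[-max_run:] != [side] * max_run

    def solve(i):
        if i == len(games):
            return True
        a, b = games[i]
        for home, away in ((a, b), (b, a)):
            # Each side may host at most half of the pair's meetings.
            if hosted[home, away] >= meetings[frozenset((a, b))] // 2:
                continue
            if not (can_play(home, "H") and can_play(away, "A")):
                continue
            hosted[home, away] += 1
            history[home].append("H")
            history[away].append("A")
            result.append((home, away))
            if solve(i + 1):
                return True
            # Undo the choice and try the other orientation.
            result.pop()
            history[home].pop()
            history[away].pop()
            hosted[home, away] -= 1
        return False

    return list(result) if solve(0) else None


if __name__ == "__main__":
    print(assign_home_away([(1, 2), (1, 2), (1, 2), (1, 2)], max_run=1))
```

The "at most half" check enforces the pairwise balance early, which prunes many branches before the streak check is even reached.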
There's also a slight problem that you cannot guarantee that your existing schedule will fulfill the given requirements. For instance, given the 8-team fixtures in this order:
1 2 3 4
5 6 7 8

1 2 5 6
3 4 7 8

1 3 5 7
2 4 6 8
It is not possible to avoid having at least two teams playing three consecutive home or away matches.

Identifying maximum number and longest set of time intervals

Say I have data that look like this:
level start end
1 1 133.631 825.141
2 2 133.631 155.953
3 3 146.844 155.953
4 2 293.754 302.196
5 3 293.754 302.196
6 4 293.754 301.428
7 2 326.253 343.436
8 3 326.253 343.436
9 4 333.827 343.436
10 2 578.066 611.766
11 3 578.066 611.766
12 4 578.066 587.876
13 4 598.052 611.766
14 2 811.228 825.141
15 3 811.228 825.141
or this:
level start end
1 1 3.60353 1112.62000
2 2 3.60353 20.35330
3 3 3.60353 8.77526
4 2 72.03720 143.60700
5 3 73.50530 101.13200
6 4 73.50530 81.64660
7 4 92.19030 101.13200
8 3 121.28500 143.60700
9 4 121.28500 128.25900
10 2 167.19700 185.04800
11 3 167.19700 183.44600
12 4 167.19700 182.84600
13 2 398.12300 418.64300
14 3 398.12300 418.64300
15 2 445.83600 454.54500
16 2 776.59400 798.34800
17 3 776.59400 796.64700
18 4 776.59400 795.91300
19 2 906.68800 915.89700
20 3 906.68800 915.89700
21 2 1099.44000 1112.62000
22 3 1099.44000 1112.62000
23 4 1100.14000 1112.62000
They produce the following graphs:
As you can see there are several time intervals at different levels. The level-1 interval always spans the entire duration of the time of interest. Levels 2+ have time intervals that are shorter.
What I would like to do is select the maximum number of non-overlapping time intervals covering each period, such that together they contain the maximum amount of total time. I have marked in pink which ones those would be.
For small dataframes it is possible to brute force this, but obviously there should be some more logical way of doing this. I'm interested in hearing some ideas about what I should try.
EDIT:
I think one thing that could help here is the column 'level'. The results come from Kleinberg's burst detection algorithm (package 'bursts'). You will note that the levels are hierarchically organized. Levels of the same number cannot overlap. However levels successively increasing e.g. 2,3,4 in successive rows can overlap.
In essence, I think the problem could be shortened to this. Take the levels produced, but remove level 1. This would be the vector for the 2nd example:
2 3 2 3 4 4 3 4 2 3 4 2 3 2 2 3 4 2 3 2 3 4
Then, look at the 2s: if there is at most one 3 between successive 2s, then that 2 is the longest interval. But if there are two or more 3s between successive 2s, then those 3s should be counted. Do this iteratively for each level. I think that should work...?
e.g.
vec<-df$level %>% as.vector() %>% .[-1]
vec
#[1] 2 3 2 3 4 4 3 4 2 3 4 2 3 2 2 3 4 2 3 2 3 4
max(vec) #4
vec3<-vec #need to find two or more 4's between 3s
vec3[vec3==3]<-NA
names(vec3)<-cumsum(is.na(vec3))
# 0 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 8 8
# 2 NA 2 NA 4 4 NA 4 2 NA 4 2 NA 2 2 NA 4 2 NA 2 NA 4
vec3.res<-which(table(vec3,names(vec3))["4",]>1)
which(names(vec3)==names(vec3.res) & vec3==4) #5 6
The above identifies rows 5 and 6 (which equate to rows 6 and 7 in original df) as having two fours that lie between 3's. Perhaps something using this sort of approach might work?
OK here is a stab using your second data set to test. This might not be correct in all cases!!
library(data.table)
dat <- fread("data.csv")
dat[,use:="maybe"]
make.pass <- function(dat,low,high,the.level,use) {
  check <- dat[(use!="no" & level > the.level)]
  check[,contained.by.above:=(low<=start & end<=high)]
  check[,consecutive.contained.by.above:=
          (contained.by.above &
             !is.na(shift(contained.by.above,1)) &
             shift(contained.by.above,1)),by=level]
  if(!any(check[,consecutive.contained.by.above])) {
    #Cause a side effect where we've learned we don't care:
    dat[check[(contained.by.above),rownum],use:="no"]
    print(check)
    return("yes")
  } else {
    return("no")
  }
}
dat[,rownum:=.I]
dat[level==1,use:=make.pass(dat,start,end,level,use),by=rownum]
dat
dat[use=="maybe" & level==2,use:=make.pass(dat,start,end,level,use),by=rownum]
dat
dat[use=="maybe" & level==3,use:=make.pass(dat,start,end,level,use),by=rownum]
dat
#Finally correct for last level
dat[use=="maybe" & level==4,use:="yes"]
I wrote these last steps out so you can trace them in your own interactive session and see what's happening (see the print to get an idea), but you can remove the print and also condense the last steps into something like lapply(1:dat[,max(level)-1], function(the.level) dat[use=="maybe" & level==the.level,use:=make.pass......]). In response to your comment: if there is an arbitrary number of levels, you will definitely want to use this formalism, and follow it with a final call to dat[use=="maybe" & level==max(level),use:="yes"].
Output:
> dat
level start end use rownum
1: 1 3.60353 1112.62000 no 1
2: 2 3.60353 20.35330 yes 2
3: 3 3.60353 8.77526 no 3
4: 2 72.03720 143.60700 no 4
5: 3 73.50530 101.13200 no 5
6: 4 73.50530 81.64660 yes 6
7: 4 92.19030 101.13200 yes 7
8: 3 121.28500 143.60700 yes 8
9: 4 121.28500 128.25900 no 9
10: 2 167.19700 185.04800 yes 10
11: 3 167.19700 183.44600 no 11
12: 4 167.19700 182.84600 no 12
13: 2 398.12300 418.64300 yes 13
14: 3 398.12300 418.64300 no 14
15: 2 445.83600 454.54500 yes 15
16: 2 776.59400 798.34800 yes 16
17: 3 776.59400 796.64700 no 17
18: 4 776.59400 795.91300 no 18
19: 2 906.68800 915.89700 yes 19
20: 3 906.68800 915.89700 no 20
21: 2 1099.44000 1112.62000 yes 21
22: 3 1099.44000 1112.62000 no 22
23: 4 1100.14000 1112.62000 no 23
level start end use rownum
On the off chance this is correct, the algorithm can roughly be described as follows:
Mark all the intervals as possible.
Start with a given level. Pick a particular interval (by=rownum) say called X. With X in mind, subset a copy of the data to all higher-level intervals.
Mark any of these that are contained in X as "contained in X".
If consecutive intervals at the same level are contained in X, X is no good b/c it wastes intervals. In this case label X's "use" variable as "no" so we'll never think about X again. [Note: if it's possible that non-consecutive intervals are contained in X, or that containing multiple intervals across levels could ruin X's viability, then this logic might need to be changed to count contained intervals instead of finding consecutive ones. I didn't think about this at all, but it's just occurring to me now, so use at your own risk.]
On the other hand, if X passed the test, then we've already established it's good. Mark it as a "yes." But importantly, we also have to mark any single interval contained in X as "no," or else when we iterate the step it will forget that it was contained inside a good interval and mark itself as "yes" as well. This is the side effect step.
Now, iterate, ignoring any results that we've already determined.
Finally any "maybe"s leftover at the highest level are automatically in.
Let me know what you think of this--this is a rough draft and some aspects might not be correct.
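A different way to look at the same selection is as a tree problem, which can be sketched in Python. This is not the data.table code above; it is an alternative formulation that assumes, as the question's EDIT suggests, that the nesting can be read off the level column alone (each row is a child of the nearest preceding row with a lower level). An interval is kept unless the intervals nested inside it would yield more than one pick. The function name select_rows is made up for this sketch.

```python
def select_rows(levels):
    """Return 1-based row numbers of the intervals to keep.

    levels: the 'level' column in row order. Assumes rows are sorted so
    that nested intervals directly follow the interval containing them.
    """
    children = {i: [] for i in range(len(levels))}
    roots = []
    stack = []  # indices of the currently open chain of ancestors
    for i, lvl in enumerate(levels):
        # Close every ancestor that is not strictly above this level.
        while stack and levels[stack[-1]] >= lvl:
            stack.pop()
        if stack:
            children[stack[-1]].append(i)
        else:
            roots.append(i)
        stack.append(i)

    def best(i):
        # Picks available inside interval i; prefer i itself on a tie,
        # because the outer interval covers at least as much time.
        inner = [row for child in children[i] for row in best(child)]
        return inner if len(inner) > 1 else [i]

    return [row + 1 for root in roots for row in best(root)]


if __name__ == "__main__":
    # 'level' column of the second data set in the question
    second = [1, 2, 3, 2, 3, 4, 4, 3, 4, 2, 3, 4,
              2, 3, 2, 2, 3, 4, 2, 3, 2, 3, 4]
    print(select_rows(second))
```

On the second data set this returns the same rows that are marked "yes" in the output above; whether the rule is right for every input depends on the same caveats mentioned in the steps.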

filter sqlite query based on counts of pairwise interactions

I am trying to filter a somewhat involved sqlite3 query using a pairwise association table. Say I have these tables (where pet_id_x references an id in table pets):
[pets]
id | name | animal_types_id | <additional_info>
1 Spike 2
2 Fluffy 1
3 Whiskers 1
4 Spot 2
5 Garth 2
6 Hamilton 3
7 Dingus 1
8 Scales 3
. . .
. . .
[animal_types]
id | type
1 cat
2 dog
3 lizard
[successful_pairings]
pet_id_1 | pet_id_2
1 4
2 4
2 8
3 2
3 4
4 5
4 6
4 7
5 6
5 7
6 7
. .
. .
A toy example for my query would be to get the names of all dogs which meet certain constraints (from columns within the pets table) and have > 2 successful pairings with other dogs, resulting in:
name | successful pairings
Spot 6
Garth 3
As per the above, the total counts for each id need to be combined from pet_id_1 and pet_id_2 in successful_pairings, as an id may be represented for a given pairing in either column.
I am new to SQL syntax, and am having trouble chaining queries together to filter based on conditions distributed across multiple tables.
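One possible shape for such a query is to stack both pairing columns into a single column with UNION ALL and then count per pet. The sketch below runs it against the visible sample rows using Python's sqlite3 module. It counts pairings with any partner, because that is what reproduces the toy result shown above (Spot 6, Garth 3); restricting the partner to dogs would need an extra join on the partner's type. The function names are hypothetical.

```python
import sqlite3

PETS = [(1, "Spike", 2), (2, "Fluffy", 1), (3, "Whiskers", 1), (4, "Spot", 2),
        (5, "Garth", 2), (6, "Hamilton", 3), (7, "Dingus", 1), (8, "Scales", 3)]
TYPES = [(1, "cat"), (2, "dog"), (3, "lizard")]
PAIRINGS = [(1, 4), (2, 4), (2, 8), (3, 2), (3, 4), (4, 5),
            (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]


def build_db():
    """Load the sample rows from the question into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT, animal_types_id INTEGER)")
    conn.execute("CREATE TABLE animal_types (id INTEGER PRIMARY KEY, type TEXT)")
    conn.execute("CREATE TABLE successful_pairings (pet_id_1 INTEGER, pet_id_2 INTEGER)")
    conn.executemany("INSERT INTO pets VALUES (?, ?, ?)", PETS)
    conn.executemany("INSERT INTO animal_types VALUES (?, ?)", TYPES)
    conn.executemany("INSERT INTO successful_pairings VALUES (?, ?)", PAIRINGS)
    return conn


def pets_with_pairings(conn, animal_type, min_pairings):
    """Names and pairing counts for pets of one type with more than min_pairings."""
    sql = """
        SELECT p.name, COUNT(*) AS n
        FROM pets p
        JOIN animal_types t ON t.id = p.animal_types_id
        JOIN (
            -- a pet may appear in either column, so stack both columns
            SELECT pet_id_1 AS pet_id FROM successful_pairings
            UNION ALL
            SELECT pet_id_2 FROM successful_pairings
        ) s ON s.pet_id = p.id
        WHERE t.type = ?          -- further constraints on pets columns go here
        GROUP BY p.id, p.name
        HAVING COUNT(*) > ?
        ORDER BY n DESC
    """
    return conn.execute(sql, (animal_type, min_pairings)).fetchall()


if __name__ == "__main__":
    print(pets_with_pairings(build_db(), "dog", 2))
```

UNION ALL rather than UNION matters here: UNION would remove duplicate ids and collapse every pet's count to at most one.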

Grouping my data in ASP.NET

I have this table in the database:
http://i.stack.imgur.com/r7ECj.jpg
I want to group the data in this form:
------------ Group 1 -------------
1.FoodGroupName : Milk
a.FoodSubGroup: Milk(type1)
1.food 1
2.food 2
3.food 3
4.food 4
b.FoodSubGroup: Milk(type2)
1.food 1
2.food 2
3.food 3
4.food 4
c.FoodSubGroup: Milk(type3)
1.food 1
2.food 2
3.food 3
4.food 4
--------- Group 2 ------------
2.FoodGroupName : Meat
a.FoodSubGroup: Meat(type1)
1.food 1
2.food 2
3.food 3
4.food 4
b.FoodSubGroup: Meat(type2)
1.food 1
2.food 2
3.food 3
4.food 4
c.FoodSubGroup: Meat(type3)
1.food 1
2.food 2
3.food 3
4.food 4
You could use a query like this:
SELECT fg.FoodGroupName, fsg.FoodSubGroupName, f.FoodName
FROM FoodGroups fg
INNER JOIN FoodSubGroups fsg
ON fg.FoodGroupId = fsg.FoodGroupId
INNER JOIN Foods f
ON fsg.FoodSubGroupId = f.FoodSubGroupId
ORDER BY fg.FoodGroupName, fsg.FoodSubGroupName, f.FoodName
to retrieve data and then output results as you please...
There are several ways to do this kind of thing:
1) You write a single query that combines the three tables (using joins) and returns a single dataset (like Marco suggested).
e.g.:
Group SubGroup Food
--------------------------------
Meat Meat1 Food 1
Meat Meat1 Food 2
Meat Meat1 Food 3
Meat Meat2 Food 1
Milk Milk1 Food 1
Milk Milk2 Food 1
Then you group them in C# code.
You can do this efficiently in very few lines of code by using the LINQ GroupBy() method with lambda expressions. Also have a look at the ToLookup() extension method.
If you are not familiar with them, check this page:
http://code.msdn.microsoft.com/LINQ-to-DataSets-Grouping-c62703ea
In the end you should get a collection of objects that allows you to do the following:
var groups = ...
foreach(var group in groups)
{
    //do something with group
    foreach(var subGroup in group.SubGroups)
    {
        //do something with subGroup
        foreach(var food in subGroup.Foods)
        {
            //do something with food
        }
    }
}
Then it is very easy to fill a treeview or present data using nested repeaters.
2) you write independent queries that will query each table separately. In the end, you get 3 datasets.
Group
-------
1 Meat
2 Milk
SubGroup
---------
1 1 Meat1
2 1 Meat2
3 2 Milk1
4 2 Milk2
Food
----------
1 1 Food 1
2 1 Food 2
3 1 Food 3
4 2 Food 1
5 3 Food 1
6 4 Food 1
Then you can use the LINQ GroupJoin operator to group them.
In that case, C# will do the joins for you, not SQL.
Check here: http://code.msdn.microsoft.com/LINQ-Join-Operators-dabef4e9/description#groupjoin
In the end you will get the same result as in 1). Both approaches have advantages and disadvantages.
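The grouping step in option 1) is not specific to LINQ. As a language-neutral illustration, here is the same idea sketched in Python rather than C#: the flat rows from the joined query are folded into a group, sub-group, foods structure that nested loops can walk. The function name nest is made up for this example, and the rows are the sample rows shown under 1).

```python
# Flat rows as returned by the joined query in option 1)
ROWS = [
    ("Meat", "Meat1", "Food 1"),
    ("Meat", "Meat1", "Food 2"),
    ("Meat", "Meat1", "Food 3"),
    ("Meat", "Meat2", "Food 1"),
    ("Milk", "Milk1", "Food 1"),
    ("Milk", "Milk2", "Food 1"),
]


def nest(rows):
    """Fold (group, sub_group, food) rows into {group: {sub_group: [foods]}}."""
    tree = {}
    for group, sub_group, food in rows:
        tree.setdefault(group, {}).setdefault(sub_group, []).append(food)
    return tree


if __name__ == "__main__":
    # Walk the nested structure the same way the C# foreach loops do.
    for group, sub_groups in nest(ROWS).items():
        print(group)
        for sub_group, foods in sub_groups.items():
            print("  " + sub_group)
            for food in foods:
                print("    " + food)
```

In C#, GroupBy() or ToLookup() plays the role of the setdefault calls here; the resulting shape is what a tree view or nested repeaters would bind to.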
