I'm working on a Crystal report which is grouped hierarchically. Sometimes the data goes down to a fifth-level grouping, sometimes only a 4th. I'd like to be able to tell if I'm at the deepest level of the tree, so that I can switch from using the data in the group header to using the data in the detail line, setting it to columns. I've tried to use the Maximum() function to see how deep the tree goes, but that requires a field, not just an expression.
I've also tried writing a hierarchical query in Oracle and using the MAX OVER (PARTITION BY parent) clause, but that's only bringing back the data I already have, and in addition, it seems to be forcing me to lose the level-1 lines.
ETA: I have checked the GroupingLevel and HierarchyLevel functions, but those don't seem to help -- they only tell me where I am, not where I'm going. When I said above that I used Maximum(), I should have clarified I meant Maximum(HierarchyLevel(blah)).
ETA2: Ok, say the data looks like this:
id parentid
1
2 1
3 2
4 3
5 3
6 1
7 6
8 7
9 8
10 8
11 8
I want something that will return True for 4, 5, 9, 10, and 11, because that's as far down the tree as I can go.
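One way to read the requirement is "return True for every id that never appears as a parentid", i.e. the leaves of the tree. Below is a minimal Python sketch of that test on the sample data; the function name `leaf_ids` and the use of `None` for the missing parent are assumptions made for illustration, not Crystal syntax. On the Oracle side, hierarchical queries expose a `CONNECT_BY_ISLEAF` pseudocolumn that expresses the same idea, which may be easier to bring into the report as a field.

```python
def leaf_ids(rows):
    """rows: list of (id, parent_id) pairs; parent_id is None for the root."""
    # Every id that is referenced as a parent has at least one child.
    parents = {parent for _, parent in rows if parent is not None}
    # A leaf is any id that nobody names as its parent.
    return [node for node, _ in rows if node not in parents]


rows = [(1, None), (2, 1), (3, 2), (4, 3), (5, 3), (6, 1),
        (7, 6), (8, 7), (9, 8), (10, 8), (11, 8)]
print(leaf_ids(rows))  # [4, 5, 9, 10, 11]
```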
Related
Disclaimer: I am not that advanced with RStudio, so my question might have an obvious answer.
Let's assume the following data set:
ID value1a value2a value1b value2b ...
1 2 3 ...
8 4 4
2 5 5
I want to create a fourth variable that is driven by an if statement, which logically should go as follows:
If ID = 1 is over 5 in "value1x" and below 3 in "value2x", then add 1 to this fourth variable. The fourth variable should thus act as a counter: its value indicates how often value1x is over 5 and value2x is below 3.
I hope my question makes sense and I'd appreciate answers!
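The counter described above could be sketched as follows, in Python rather than R. This assumes the columns come in (value1x, value2x) pairs with suffixes a, b, and so on; the sample row and the name `count_matches` are hypothetical, since the data in the question is incomplete.

```python
def count_matches(row, pairs, hi=5, lo=3):
    """Count the (value1x, value2x) pairs where value1x > hi and value2x < lo."""
    return sum(1 for c1, c2 in pairs if row[c1] > hi and row[c2] < lo)


# Hypothetical row; the real column values are not given in full.
row = {"ID": 1, "value1a": 6, "value2a": 2, "value1b": 7, "value2b": 4}
pairs = [("value1a", "value2a"), ("value1b", "value2b")]
print(count_matches(row, pairs))  # 1 (only the "a" pair satisfies both conditions)
```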
I'm hoping someone may be able to help with a problem I'm trying to solve using R.
Individuals can submit requests for items. The minimum number of requests per person is one. There is a recommended maximum of five, but people can submit more in exceptional circumstances. Each item can only be allocated to one individual.
Each item has a 'desirability'/quality score ranging from 10 (high quality) down to 0 (low quality). The idea is to allocate items, in line with requests, such that as many high quality items as possible are allocated. It is less important that individuals have an equitable spread of requests met.
Everyone has to have at least one request met. Next priority is to look at whether we can get anyone who is over the recommended limit within it by allocating requests to others. After that the priority is to look at where the item would rank in each individual's request list based on quality score, and allocate to the person where it would rank highest (eg, if it would be first in someone's list and third in another's, give it to the former).
Effectively I'd need a sorting algorithm of some kind that:
Identifies where an item has been requested more than once.
Checks all the requests of everyone making said request.
If that request is the only one a person has made, gives it to them (if this scenario applies to more than one person, it should be flagged in some way).
If all requesters have made more than one request, checks whether any have made more than five requests; if they have, it can be taken off them.
If all are within the recommended limit, sees where the request would rank (based on quality score) and gives it to the person in whose list it would rank highest.
The process needs to check that the above step isn't happening to people so many times that it leaves them without any requests, so it effectively has to check one item at a time.
Does anyone have any ideas about how to approach this? I can think of all kinds of ways I could arrange the data to make it easy to identify and see where this needs to happen, but not how to automate the process itself. Thanks in advance for any help.
The data (at least the bits needed for this process) looks like the below:
Item ID Person ID Item Score
1 AAG 9
1 AAK 8
2 AAAX 8
2 AN 8
2 AAAK 8
3 Z 8
3 K 8
4 AAC 7
4 AR 5
5 W 10
5 V 9
6 AAAM 7
6 AAAL 7
7 AAAAN 5
7 AAAAO 5
8 AB 9
8 D 9
9 AAAAK 6
9 AAAAC 6
10 A 3
10 AY 3
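The steps listed earlier can be sketched as a greedy, one-item-at-a-time pass. This is written in Python rather than R and is only a heuristic under stated assumptions: the function name `allocate`, the tie-breaking (first requester wins), and the small example data are all hypothetical, and a greedy pass is not guaranteed to find the best overall allocation.

```python
from collections import defaultdict


def allocate(requests, limit=5):
    """requests: list of (item_id, person_id, score) rows.

    Returns (allocation, flagged): allocation maps item -> person, and
    flagged lists items that were the sole request of several people.
    """
    # Requests still alive for each person (won or not yet decided).
    wishes = defaultdict(dict)
    by_item = defaultdict(list)
    for item, person, score in requests:
        wishes[person][item] = score
        by_item[item].append(person)

    allocation, flagged = {}, []
    for item, people in by_item.items():  # one item at a time
        # Rule 1: someone for whom this is the only remaining request.
        only = [p for p in people if len(wishes[p]) == 1]
        if only:
            if len(only) > 1:
                flagged.append(item)  # needs a manual decision
            winner = only[0]
        else:
            # Rule 2: prefer people within the recommended limit, if any.
            within = [p for p in people if len(wishes[p]) <= limit]
            candidates = within or people

            # Rule 3: position of this item in a person's own list (1 = top).
            def rank(p):
                return 1 + sum(s > wishes[p][item] for s in wishes[p].values())

            winner = min(candidates, key=rank)
        allocation[item] = winner
        # Losers drop this request, so later counts reflect the decision.
        for p in people:
            if p != winner:
                del wishes[p][item]
    return allocation, flagged


# Hypothetical example: A and B both want item 1.
example = [(1, "A", 5), (1, "B", 5), (2, "A", 9), (3, "B", 3)]
print(allocate(example))  # ({1: 'B', 2: 'A', 3: 'B'}, [])
```

Item 1 goes to B because it would rank first in B's list but only second in A's. Because counts are updated after every decision, a person who keeps losing eventually falls under rule 1 and is protected.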
To put it simply, I have three columns in Excel like the ones below:
Vehicle x y
1 10 10
1 15 12
1 12 9
2 8 7
2 11 6
3 7 12
x and y are the coordinates of customers assigned to the corresponding vehicle. This file is the output of a program I run in advance. The list will always be sorted by vehicle, but the number of customers assigned to vehicle "k" may change from one experiment to the next.
I would like to plot a graph containing 3 series, one for each vehicle, where the customers of each vehicle would appear (as dots in 2D based on their x- and y-values) in a different color.
In my real file, I have 12 vehicles and 3200 customers, and the ranges change from one experiment to the next, so I would like to automate the process, i.e. paste the list into my Excel sheet and see the graph appear automatically (if this is possible).
Thanks in advance for your time and effort.
EDIT: There is a similar post here: Use formulas to select chart data, but it requires the use of VB. Moreover, I am not sure whether it was actually answered.
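Outside Excel, the core step is splitting the sorted list into one (x, y) series per vehicle, however many rows each vehicle has. A minimal Python sketch of that step, assuming the rows have already been read as (vehicle, x, y) tuples; the name `series_by_vehicle` is hypothetical, and each resulting series could then be handed to whatever plotting tool is used.

```python
from collections import defaultdict


def series_by_vehicle(rows):
    """Group (vehicle, x, y) rows into {vehicle: (xs, ys)}."""
    series = defaultdict(lambda: ([], []))
    for vehicle, x, y in rows:
        xs, ys = series[vehicle]
        xs.append(x)
        ys.append(y)
    return dict(series)


rows = [(1, 10, 10), (1, 15, 12), (1, 12, 9), (2, 8, 7), (2, 11, 6), (3, 7, 12)]
print(series_by_vehicle(rows))
# {1: ([10, 15, 12], [10, 12, 9]), 2: ([8, 11], [7, 6]), 3: ([7], [12])}
```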
You should try this free online tool: www.cloudyexcel.com/excel-to-graph/
first off:
Suppose I have a dataset that has variables like to_location_id, from_location_id, gender, and age. Now, if I want to know the overall top 5 locations people like to visit, I do this:
# 5 most popular locations to go to
top <- as.data.frame(sort(table(mydata$to_location_id), decreasing = TRUE)[1:5])
> top
sort(table(mydata$to_location_id), decreasing = TRUE)[1:5]
3 18544
9 18395
76 15457
5 14342
1 13898
This gives the 5 most popular locations to go to overall in the dataset: locations 3, 9, 76, 5, and 1.
Similarly, I can also get the 5 most popular locations to come from overall.
Now suppose that there are 100 unique location IDs (in both the from and to location IDs). For each location, I want to know the top 5 most popular "to" locations and the top 5 most popular "from" locations. I know I need a loop, but I'm not sure how to do it. I have tried this (no luck):
for(i in unique(mydata$to_location_id)){
as.data.frame(sort(table(mydata$to_location_id),decreasing = TRUE)[1:5])
}
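The loop above never uses `i`, so every iteration computes the same overall table and discards it. The per-location idea is to count destinations separately for each origin (and the reverse for "from"). A minimal Python sketch of that grouping, not R; the function name `top_destinations` and the tiny trip list are hypothetical.

```python
from collections import Counter, defaultdict


def top_destinations(trips, n=5):
    """trips: iterable of (from_location_id, to_location_id) pairs.

    Returns {origin: [(destination, count), ...]} with up to n entries each.
    """
    counts = defaultdict(Counter)
    for origin, dest in trips:
        counts[origin][dest] += 1  # one counter per origin
    return {origin: c.most_common(n) for origin, c in counts.items()}


trips = [(1, 2), (1, 2), (1, 3), (2, 1)]
print(top_destinations(trips, n=2))
# {1: [(2, 2), (3, 1)], 2: [(1, 1)]}
```

Swapping the two elements of each pair before calling the function gives the top "from" locations for each destination.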
I'm looking for a mathematical ranking formula.
Sample data:
    2008  2009  2010
A   5     6     4
B   6     7     5
C   7     8     2
I want to add a rank column for each period code field
                        rank
    2008  2009  2010    2008  2009  2010
B   6     7     5       2     1     1
A   5     6     4       3     2     2
C   7     2     2       1     3     3
Please do not reply with methods that loop through the rows and columns, incrementing the rank value as they go; that's easy. I'm looking for a formula much like finding the percent of total (item / total). I know I've seen this before but am having a tough time locating it.
Thanks in advance!
sort ((letters_col, number_col) descending by number_col)
As efficient as your sort alg.
Then number the rows, of course
Edit
I was really upset by your comment "please don't up vote this answer, sorting and loop is not what I'm asking for. i specifically stated this in my original question.", and by the negative votes, because, as you may have noted from the various answers received, it's basically correct.
However, I kept pondering where and how you may "have seen this before".
Well, I think I got the answer: You saw this in Excel.
Look at this:
This is the result after entering the formulas and sorting by column H.
It's exactly what you want ...
What are you using? If you're using Excel, you're looking for RANK(num, ref).
=RANK(B2,B$2:B$9)
I don't know of any programming language that has that built in; it would always require a loop of some form.
If you want the rank of a single element, you can do it in O(n) by looping through the elements, counting how many have value above the given element, and adding 1.
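A minimal Python sketch of that single-element rank; the name `rank_of` is made up for illustration. Ties share a rank here, since only strictly larger values are counted.

```python
def rank_of(value, values):
    """Rank of value among values, 1 = largest; one O(n) pass."""
    return 1 + sum(1 for v in values if v > value)


print(rank_of(6, [5, 6, 7]))  # 2
```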
If you want the rank of all the elements, the best (and really only) way is to sort the elements. Anything else you do will be equivalent to sorting (there is no "formula").
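Ranking every element by sorting could look like the sketch below (Python, with a hypothetical function name `ranks`). Unlike a count-based rank, tied values get distinct consecutive ranks here, in their original order.

```python
def ranks(values):
    """Return the rank of each element (1 = largest), computed via one sort."""
    # Indices ordered from largest value to smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


print(ranks([5, 6, 7]))  # [3, 2, 1]
print(ranks([4, 5, 2]))  # [2, 1, 3]
```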
Are you using T-SQL? T-SQL RANK() may pull what you want.