Counting digit Frequencies with Python - dictionary

I need your help in Python 2.7.
I made a dictionary:
{'1': 1, '3': 1, '2': 5, '6': 5}
Question 1:
What can I do if I want to print the key that has the highest value?
Question 2 :
In this case, '2' and '6' have both the same high value,
so I want Python to print the digit (type: int) of the highest key only (6).
How can I program this?
I tried many times and failed...
Im looking for easier way..without using 'sort'

simple pythonic code
a = {'1': 1, '3': 1, '2': 5, '6': 5}
m = sorted(a.items(),key=lambda x :x[1])
d = [i[0] for i in m if i[1]==m[-1][1]]
print d
print m[-1][0]

Related

Next number in the sequence || General approach

how can I find the next number in this sequence ,
1, 4, 7, 8, 13, 12, 9 , ?
How to check , given any sequence of numbers , feasible or not. Any general theory or approach is very much welcomed .
One method is to go to the Online Encyclopedia of Integer Sequences and enter your list of at least six or eight numbers into the box and see if that happens to be a known sequence. For your example this doesn't find a known sequence.
If that doesn't work then you can try Mathematica's FindFormula
p=FindFormula[{1, 4, 7, 8, 13, 12, 9}];
and then
p[1] returns 1, p[2] returns 4, p[3] returns 7... and p[8] returns 106, etc.
You can read the documentation on FindFormula and you can look at the formula p by using InputForm[p] where #1 represents a variable in the function p.
In general I think this is rarely going to produce the result that you are looking for.
seq = FindSequenceFunction[{1, 4, 7, 8, 13, 12, 9}, n]
(48 - 74 n - 14 n^2 + 11 n^3 - n^4)/(3 (-13 + 3 n))
Checking the 7th number
n = 7;
seq
9
The next number is a fraction, apparently
n = 8;
seq
32/11
Show[Plot[seq, {n, 1, 10}],
ListPlot[Table[seq, {n, 1, 10}]],
PlotRange -> {{0, 10}, {-20, 30}}, AxesOrigin -> {0, 0}]

Multi-line text input to dictionary

Say I've got a multi-line input from a text file that looks smth like this:
(empty string here)
1 -30 2 0
3 0 5 10
etc.
I parsed the input and formed a list out of it:
lst = [[], [(1, -30), (2, 0)], [(3, 0), (5, 10)]]
Assume I want to make a dictionary where the key is the line number (starting with 1) and the value is a nested dictionary which contains the first element of a tuple as a key and the second one as a value:
**lst -> dict:**
dict = {'1': {}, '2': {1: -30, 2: 0}, '3': {3: 0, 5: 10}}
What's the best (pythonic) way to do this? Spent couple hours by now trying to do this.
Thanks!
lst = [[], [(1, -30), (2, 0)], [(3, 0), (5, 10)]]
print( {str(i): dict(v) for i, v in enumerate(lst, 1)} )
Prints:
{'1': {}, '2': {1: -30, 2: 0}, '3': {3: 0, 5: 10}}

Measure of Dispersion for Labelled Data

I have data of the form {'A': 5, 'B': 10, 'C': 3} and I'd like to compute a value that's between 0 and 1 for the level of dispersion between labeled data. For example, if all the data were of the form {'A':3}, then there would be no dispersion and the value computed would be 0. If the data were of the form {'A': 1, 'B': 1, 'C': 1}, then the value computed would be 1 because there is maximal dispersion between the labels. And finally if the data were of the form {'A': 1, 'B': 2} then the value computed would be somewhere between 0 and 1.
More specifically, I'm trying to compute a value that indicates the level of physical dispersion between offices. For example, if a team of 10 people are in a single office, then there's no dispersion. If they're in a 10 different offices, then there's maximal dispersion. {'A': 2, 'B': 2, 'C': 2} should have a value less than {'A': 1, 'B': 1, 'C': 1, 'D': 1, 'E': 1, 'F': 1} because the latter is more disperse.
What's the correct mathematical function for this?

Searching matrices in Mathematica 8 - Trying to find other elements on the same row as X

The text in italics describes my general goal, if anyone is interested. Question is underneath.
I am trying to graph the energy levels of simple molecules using Mathematica 8. My method is crude, and goes as this:
Find eigenvalues of simple Hückel matrix.
Delete duplicates and determine size of list.
Evaluate the number of degeneracies by comparing duplicate list with no-duplicate list.
Create a n x 2 zero matrix where n is the number of unique energy levels.
5. Fill first column with unique energy levels, second column with degeneracies.
The matrix generated in step 5 can look like this:
(1 2)
(3 1) == M
(-1 1)
I wish to evaluate the maximum of column 2, and then find the value of the element in the same row, but in column 1. In this case, the answer I am looking for is 1.
These commands both evaluate to -1:
Extract[M[[All, 1]], M[[Max[M[[All, 2]]], 1]]]
M[[Max[M[[All, 1]]], 1]]
which is not the answer I want.
Any tips?
EDIT: This
Part[Part[Position[M, Max[M[[All, 2]]]], 1], 1]
works, but I don't understand why I have to use Part[] twice.
m = {{1, 2}, {3, 1}, {-1, 1}}
max = Max[m[[All, 2]]]
So find the position of the max and replace the second column with the first:
pos=Position[m, max] /. {x_,_}:>{x,1}
{{1,1}}
Then take the first element from pos, i.e. {1,1} and sub use it in Part
m[[Sequence ## First[pos]]]
1
But having said that I prefer something like this:
Cases[m, {x_, max} :> x]
{1}
The result is a list. You could either use First#Cases[...] or you might want to keep a list of results to cover cases where the maximum value occurs more than once in a column.
The inner Part gives you the first occurance of the maximum. Position returns a list of positions, even if there is only one element that has the maximum value, like this:
M = {{2, 2}, {2, 3}, {2, 2}, {1, 1}}
{{2, 2}, {2, 3}, {2, 2}, {1, 1}}
Position[M, Max[M[[All, 2]]]]
{{2, 2}}
So you want the first element in the first element of this output. You could condense your code like this:
Position[M, Max[M[[All, 2]]]][[1, 1]]
However, one thing that I think your code needs to handle better is this case:
M = {{3, 2}, {2, 3}, {2, 2}, {1, 1}}
3, 2}, {2, 3}, {2, 2}, {1, 1}}
Position[M, Max[M[[All, 2]]]]
{{1, 1}, {2, 2}}
You will get the wrong answer with your code in this case.
Better would be:
M[[All, 1]][[Position[M[[All, 2]], Max[M[[All, 2]]]][[1, 1]] ]]
Or alternatively
M[[Position[M[[All, 2]], Max[M[[All, 2]]]][[1, 1]], 1]]
If you only want a single column one value in the case of duplicate maximum values in column two I suggest that you make use of Ordering:
m = {{1, 3}, {1, 8}, {5, 7}, {2, 2}, {1, 9}, {4, 9}, {5, 6}};
m[[ Ordering[m[[All, 2]], -1], 1 ]]
{4}

Nth Combination

Is there a direct way of getting the Nth combination of an ordered set of all combinations of nCr?
Example: I have four elements: [6, 4, 2, 1]. All the possible combinations by taking three at a time would be:
[[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]].
Is there an algorithm that would give me e.g. the 3rd answer, [6, 2, 1], in the ordered result set, without enumerating all the previous answers?
Note you can generate the sequence by recursively generating all combinations with the first element, then all combinations without. In both recursive cases, you drop the first element to get all combinations from n-1 elements. In Python:
def combination(l, r):
if r == 0:
yield []
elif len(l) == r:
yield l
else:
for c in (combination(l[1:], r-1)):
yield l[0:1]+c
for c in (combination(l[1:], r)):
yield c
Any time you're generating a sequence by making a choice like this, you can recursively generate the kth element by counting how many elements a choice generates and comparing the count to k. If k is less than the count, you make that choice. Otherwise, subtract the count and repeat for the other possible choices you could make at that point. If there are always b choices, you can view this as generating a number in base b. The technique still works if the number of choices varies. In pseudocode (when all choices are always available):
kth(k, choicePoints)
if choicePoints is empty
return empty list
for each choice in head of choicePoints:
if k < size of choice
return choice and kth(k, tail of choicePoints)
else
k -= size of choice
signal exception: k is out-of-bounds
This gives you a 0-based index. If you want 1-based, change the comparison to k <= size of choice.
The tricky part (and what is unspecified in the pseudocode) is that the size of a choice depends on previous choices. Note the pseudocode can be used to solve a more general case than the problem.
For this specific problem, there are two choices (b= 2) and the size of the 1st choice (i.e. including the 1st element) is given by n-1Cr-1. Here's one implementation (which requires a suitable nCr):
def kthCombination(k, l, r):
if r == 0:
return []
elif len(l) == r:
return l
else:
i=nCr(len(l)-1, r-1)
if k < i:
return l[0:1] + kthCombination(k, l[1:], r-1)
else:
return kthCombination(k-i, l[1:], r)
If you reverse the order of the choices, you reverse the order of the sequence.
def reverseKthCombination(k, l, r):
if r == 0:
return []
elif len(l) == r:
return l
else:
i=nCr(len(l)-1, r)
if k < i:
return reverseKthCombination(k, l[1:], r)
else:
return l[0:1] + reverseKthCombination(k-i, l[1:], r-1)
Putting it to use:
>>> l = [6, 4, 2, 1]
>>> [kthCombination(k, [6, 4, 2, 1], 3) for k in range(nCr(len(l), 3)) ]
[[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]]
>>> powOf2s=[2**i for i in range(4,-1,-1)]
>>> [sum(kthCombination(k, powOf2s, 3)) for k in range(nCr(len(powOf2s), 3))]
[28, 26, 25, 22, 21, 19, 14, 13, 11, 7]
>>> [sum(reverseKthCombination(k, powOf2s, 3)) for k in range(nCr(len(powOf2s), 3))]
[7, 11, 13, 14, 19, 21, 22, 25, 26, 28]
TLDR? Just scroll to the very bottom for my final solution.
I stumbled across this question while I was looking for methods to both get the index a specified combination would be located at if it were in a lexicographically sorted list and vice versa, for a choice of objects from some potentially very large set of objects and couldn't find much on the latter (the inverse of your problem is not so elusive).
Since I also solved (what I thought was) your exact problem before I thought I'd post my solutions to both here.
** EDIT: My requirement is what your requirement was too - I saw the answers and thought recursion was fine. Well now, after six long years you have it; just scroll down.**
For your requirement as (I thought it was) posed in the question this will do the job just fine:
def iterCombinations(n, k):
if k==1:
for i in range(n):
yield [i]
return
result = []
for a in range(k-1, n):
for e in iterCombinations(n, k-1):
if e[-1] == a:
break
yield e + [a]
You can then lookup the item in a collection ordered in the descending order (or use some equivalent compare methodology), so for the case in question:
>>> itemsDescending = [6,4,2,1]
>>> for c in iterCombinations(4, 3):
... [itemsDescending[i] for i in c]
...
[6, 4, 2]
[6, 4, 1]
[6, 2, 1]
[4, 2, 1]
This is also possible straight out of the box in Python, however:
>>> import itertools
>>> for c in itertools.combinations(itemsDescending, 3):
... c
...
(6, 4, 2)
(6, 4, 1)
(6, 2, 1)
(4, 2, 1)
Here is what I did for my requirement (and really for yours!) of a non-recursive algorithm that does not create or traverse the ordered list for either direction, but rather uses a simple but effective non-recursive implementation of nCr, choose(n, k):
def choose(n, k):
'''Returns the number of ways to choose k items from n items'''
reflect = n - k
if k > reflect:
if k > n:
return 0
k = reflect
if k == 0:
return 1
for nMinusIPlus1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
n = n * nMinusIPlus1 // i
return n
To get the combination at some (zero-based) index in a forward sorted list:
def iterCombination(index, n, k):
'''Yields the items of the single combination that would be at the provided
(0-based) index in a lexicographically sorted list of combinations of choices
of k items from n items [0,n), given the combinations were sorted in
descending order. Yields in descending order.
'''
if index < 0 or index >= choose(n, k):
return
n -= 1
for i in range(k):
while choose(n, k) > index:
n -= 1
yield n
index -= choose(n, k)
n -= 1
k -= 1
To get the (zero-based) index at which some combination would reside in a reverse ordered list:
def indexOfCombination(combination):
'''Returns the (0-based) index the given combination would have if it were in
a reverse-lexicographically sorted list of combinations of choices of
len(combination) items from any possible number of items (given the
combination's length and maximum value)
- combination must already be in descending order,
and it's items drawn from the set [0,n).
'''
result = 0
for i, a in enumerate(combination):
result += choose(a, i + 1)
return result
It's overkill for your example (but I realise now that that was just an example); this is how that would go for each index in turn:
def exampleUseCase(itemsDescending=[6,4,2,1], k=3):
n = len(itemsDescending)
print("index -> combination -> and back again:")
for i in range(choose(n, k)):
c = [itemsDescending[j] for j in iterCombination(i, n, k)][-1::-1]
index = indexOfCombination([itemsDescending.index(v) for v in c])
print("{0} -> {1} -> {2}".format(i, c, index))
>>> exampleUseCase()
index -> combination -> and back again:
0 -> [6, 4, 2] -> 0
1 -> [6, 4, 1] -> 1
2 -> [6, 2, 1] -> 2
3 -> [4, 2, 1] -> 3
This can find the index given some long list or return the combination at some astronomical index in the blink of an eye, for example:
>>> choose(2016, 37)
9617597205504126094112265433349923026485628526002095715212972063686138242753600
>>> list(iterCombination(_-1, 2016, 37))
[2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003,
2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989,
1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979]
or, since that was the very last one and could be fast due to the reflection in choose(n, k), here's one from right in the middle and it seems just as fast...
>>> choose(2016, 37)//2
4808798602752063047056132716674961513242814263001047857606486031843069121376800
>>> list(iterCombination(_, 2016, 37))
[1978, 1973, 1921, 1908, 1825, 1775, 1747, 1635, 1613, 1598, 1529, 1528, 1521,
1445, 1393, 1251, 1247, 1229, 1204, 1198, 922, 901, 794, 699, 685, 633, 619, 598,
469, 456, 374, 368, 357, 219, 149, 93, 71]
This last example pauses for thought for a split second, but wouldn't you?
>>> import random
>>> rSet = set(random.randint(0, 10000000) for i in range(900))
>>> len(rSet)
900
>>> rList = sorted(rSet, reverse=True)
>>> combinations.indexOfCombination(rList)
61536587905102303838316048492163850175478325236595592744487336325506086930974887
88085020093159925576117511028315621934208381981476407812702689774826510322023536
58905845549371069786639595263444239118366962232872361362581506476113967993096033
00541202874946853699568596881200225925266331936183173583581021914595163799417151
30442624813775945054888304722079206982972852037480516813527237183254850056012217
59834465303543702263588008387352235149083914737690225710105023486226582087736870
38383323140972279867697434315252036074490127510158752080225274972225311906715033
86851377357968649982293794242170046400174118714525559851836064661141086690326842
25236658978135989907667078625869419802333512020715700514133380517628637151215549
05922388534567108671308819960483147825031620798631811671493891643972220604919591
22785587505280326638477135315176731640100473359830821781905546117103137944239120
34912084544221250309244925308316352643060056100719194985568284049903555621750881
39419639825279398618630525081169688672242833238889454445237928356800414839702024
66807635358129606994342005075585962080795273287472139515994244684088406544976674
84183671032002497594936116837768233617073949894918741875863985858049825755901232
89317507965160689287607868119414903299382093412911433254998227245783454244894604
83654290108678890682359278892580855226717964180806265176337132759167920384512456
91624558534942279041452960272707049107641475225516294235268581475735143470692000
78400891862852130481822509803019636619427631175355448729708451565341764545325720
79277290914349746541071731127111532099038538549697091038496002102703737347343739
96398832832674081286904287066696046621691978697914823322322650123025472624927566
99891468668052668317066769517155581261265629289158798073055495539590686279250097
27295943276536772955923599217742543093669565147228386873469711200278811335649924
13587219640724942441913695193417732608127949738209466313175361161142601108707568
19470026889319648128790363676253707359290547393198350533094409863254710237344552
47692325209744353688541868412075798500629908908768438513508959321262250985142709
19794478379412756202638771417821781240327337108495689300616872374578607430951230
96908870723878513999404242546015617238957825116802801618973562178005776911079790
22026655573872019955677676783191505879571719659770550759779880002320421606755826
75809722478174545846409923210824885805972611279030267270741509747224602604003738
30411365119180944456819762167312738395140461035991994771968906979578667047734952
21981545694935313345331923300019842406900689401417602004228459137311983483386802
30352489602769346000257761959413965109940729263098747702427952104316612809425394
85037536245288888254374135695390839718978818689595231708490351927063849922772653
26064826999661128817511630298712833048667406916285156973335575847429111697259113
53969532522640227276562651123634766230804871160471143157687290382053412295542343
14022687833967461351170188107671919648640149202504369991478703293224727284508796
06843631262345918398240286430644564444566815901074110609701319038586170760771099
41252989796265436701638358088345892387619172572763571929093224171759199798290520
71975442996399826830220944004118266689537930602427572308646745061258472912222347
18088442198837834539211242627770833874751143136048704550494404981971932449150098
52555927020553995188323691320225317096340687798498057634440618188905647503384292
79493920419695886724506109053220167190536026635080266763647744881063220423654648
36855624855494077960732944499038847158715263413026604773216510801253044020991845
89652657529729792772055725210165026891724511953666038764273616212464901231675592
46950937136633665320781952510620087284589083139308516989522633786063418913473703
96532777760440118656525488729217328376766171004246127636983612583177565603918697
15557602015171235214344399010185766876727226408494760175957535995025356361689144
85181975631986409708533731043231896096597038345028523539733981468056497208027899
6245509252811753667386001506195
However going back from that index to the combination of 900-choose-10,000,000 that it represents with the previous implementation would be very slow (since it simply subtracts one from n at each iteration).
For such large lists of combinations we can instead do a binary search of the space, and the overhead we add means it will only be a little slower for small lists of combinations:
def iterCombination(index, n, k):
'''Yields the items of the single combination that would be at the provided
(0-based) index in a lexicographically sorted list of combinations of choices
of k items from n items [0,n), given the combinations were sorted in
descending order. Yields in descending order.
'''
if index < 0 or n < k or n < 1 or k < 1 or choose(n, k) <= index:
return
for i in range(k, 0, -1):
d = (n - i) // 2 or 1
n -= d
while 1:
nCi = choose(n, i)
while nCi > index:
d = d // 2 or 1
n -= d
nCi = choose(n, i)
if d == 1:
break
n += d
d //= 2
n -= d
yield n
index -= nCi
From this one may notice that all the calls to choose have terms that cancel, if we cancel everything out we end up with a much faster implementation and what is, I think...
The optimal function for this problem
def iterCombination(index, n, k):
'''Yields the items of the single combination that would be at the provided
(0-based) index in a lexicographically sorted list of combinations of choices
of k items from n items [0,n), given the combinations were sorted in
descending order. Yields in descending order.
'''
nCk = 1
for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
nCk *= nMinusI
nCk //= iPlus1
curIndex = nCk
for k in range(k, 0, -1):
nCk *= k
nCk //= n
while curIndex - nCk > index:
curIndex -= nCk
nCk *= (n - k)
nCk -= nCk % k
n -= 1
nCk //= n
n -= 1
yield n
A final reminder that for the use case of the question one would do something like this:
def combinationAt(index, itemsDescending, k):
return [itemsDescending[i] for i in
list(iterCombination(index, len(itemsDescending), k))[-1::-1]]
>>> itemsDescending = [6,4,2,1]
>>> numberOfItemsBeingChosen = 3
>>> zeroBasedIndexWanted = 1
>>> combinationAt(zeroBasedIndexWanted, itemsDescending, numberOfItemsBeingChosen)
[6, 4, 1]
One way to do it would be by using properties of bits. This still requires some enumeration, but you wouldn't have to enumerate every set.
For your example, you have 4 numbers in your set. So if you were generating all the possible combinations of 4 numbers, you could enumerate them as follows:
{6, 4, 2, 1}
0000 - {(no numbers in set)}
0001 - {1}
0010 - {2}
0011 - {2, 1}
...
1111 - {6, 4, 2, 1}
See how each "bit" corresponds to "whether that number is in your set"? We see here that there are 16 possibilities (2^4).
So now we can go through and find all of the possibilities that have only 3 bits turned on. This will tell us all of the combinations of "3" that exist:
0111 - {4, 2, 1}
1011 - {6, 2, 1}
1101 - {6, 4, 1}
1110 - {6, 4, 2}
And lets rewrite each of our binary values as decimal values:
0111 = 7
1011 = 11
1101 = 13
1110 = 14
Now that we've done that - well, you said you wanted the "3rd" enumeration. So lets look at the 3rd largest number: 11. Which has the bit pattern 1011. Which corresponds to... {6, 2, 1}
Cool!
Basically, you can use the same concept for any set. So now all we've done is translate the problem from "enumerating all the sets" to "enumerating all of the integers". This might be a lot easier for your problem.
From the Python 3.6 itertools recipes:
def nth_combination(iterable, r, index):
'Equivalent to list(combinations(iterable, r))[index]'
pool = tuple(iterable)
n = len(pool)
if r < 0 or r > n:
raise ValueError
c = 1
k = min(r, n-r)
for i in range(1, k+1):
c = c * (n - k + i) // i
if index < 0:
index += c
if index < 0 or index >= c:
raise IndexError
result = []
while r:
c, n, r = c*r//n, n-1, r-1
while index >= c:
index -= c
c, n = c*(n-r)//n, n-1
result.append(pool[-1-n])
return tuple(result)
In practice:
iterable, r, index = [6, 4, 2, 1], 3, 2
nth_combination(iterable, r, index)
# (6, 2, 1)
Alternatively, as mentioned in the docstring:
import itertools as it
list(it.combinations(iterable, r))[index]
# (6, 2, 1)
See also more_itertools - a third party library that implements this recipe for you. Install via:
> pip install more_itertools
just a rough sketch:
arrange your numbers into upper triangular matrix of tuples:
A(n-1,n-1)
Aij = [i+1, j-1]
if you traverse matrix row first, you will get combinations in increasing order for two elements. To generalize to three elements, think of your matrix rows as another triangular matrix, rather than a vector. It kind of creates a corner of a cube.
At least this is how I have would approach the problem
let me clarify this, you do not have to store the matrix, you will need to compute index.
Let me work out to dimensional example, which you in principle could expand to 20 dimensions(bookkeeping may be atrocious).
ij = (i*i + i)/2 + j // ij is also the combination number
(i,j) = decompose(ij) // from ij one can recover i,j components
I = i // actual first index
J = j + 1 // actual second index
this two-dimensional example works for any number n, and you dont have to tabulate permutations.
Yes there a direct way of getting the Nth combination of an ordered set of all combinations of nCr? Say you need to generate 0th, 3rd, 6th.. combinations of a given set. You can generate it directly without generating combinations in between using JNuberTools. You can even generate next billionth combination (if your set size is large)
Here is the code example:
JNumberTools.combinationsOf(list)
.uniqueNth(8,1000_000_000) //skip to billionth combination of size 8
.forEach(System.out::println);
The maven dependency for JNumberTools is :
<dependency>
<groupId>io.github.deepeshpatel</groupId>
<artifactId>jnumbertools</artifactId>
<version>1.0.0</version>
</dependency>

Resources