I'm hoping someone might be able to point me in the right direction.
Given a list of several items, I need to find the count of each item such that together they add up to the solution.
Let's say I need to find the solution {9, 2, 6}; I need to figure out which items would add up to it, and items can repeat as well.
item1 {2 , 5, 6}
item2 {9, -1, 2}
item3 {6, 19, 12}
I'm not 100% sure what you're asking, but having read your question through a couple of times, I'll give a couple of options based on different interpretations.
1.) If you're just looking for the same sum, then you would (in a loop) add up the elements of the array and then compare that sum to the sum of the solution.
2.) If you're looking for an array that has the same elements in the same order as the solution, then you would just compare each element and check if they are the same.
3.) If you want to find multiple arrays that add up to the solution, you could create another array to keep track of the running values, stopping when you reach the solution, and use some sort of check structure (maybe another array) to record which arrays add up to the solution.
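For example, interpretation 1 might be sketched like this in Python, using the arrays from the question (the dictionary layout and names are illustrative, not from the original post):

```python
# Interpretation 1: compare each item's sum to the sum of the solution.
solution = [9, 2, 6]
items = {
    "item1": [2, 5, 6],
    "item2": [9, -1, 2],
    "item3": [6, 19, 12],
}

target = sum(solution)  # 9 + 2 + 6 = 17
matches = [name for name, values in items.items() if sum(values) == target]
print(matches)  # none of these three items happens to sum to 17
```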
This is the algorithm for finding the intersection of two skip lists:
[Figure: Finding the intersection of two skip lists (copyright Stanford)]
We can see that the "jumping by skips" benefits a lot in terms of efficiency compared to moving one step at a time.
But here I'm curious: what if the case is extended to multiple skip lists, say 100? Currently, the only approach I can think of is divide and conquer, in which the skip lists are grouped in pairs, each pair's intersection is derived, and the results are later merged, which sounds time-consuming and inefficient.
What is the better way to determine the intersections of multiple skip lists with the least time spent?
Initialize a pointer to the beginning of each of your skip lists.
We will maintain two things:
The current max value pointed to
A min-heap of (value, pointer) pairs.
At each step:
Check if all pointers have the same value by comparing the top of the min-heap with the max value.
If those two are the same:
All current values must be the same (since min == max), so the value is in the intersection.
Add that value to the output.
Pop your min-heap, advance its pointer until it gets to a bigger value, and push the new value. Update max to the new value.
Else:
Pop your min-heap, advance its pointer towards the max value, skipping as needed.
If its new value exceeds the max value, update the max value.
Push the new value onto your min-heap.
Stop when any list runs out (that is, when you need to advance a pointer but can't).
This is a slight twist on a classic programming interview problem "Merge k sorted lists" -- the algorithm here is very similar. I'd suggest looking at that if anything in this answer is unclear.
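The steps above can be sketched in Python. Plain sorted lists (assumed strictly increasing, i.e. set semantics) stand in for skip lists here, so the "advance" step is a linear scan; with real skip lists it would jump via the skip pointers instead:

```python
import heapq

def intersect_sorted(lists):
    """Intersect several sorted sequences using a min-heap of
    (value, list index, position) entries, following the steps above."""
    if not lists or any(not lst for lst in lists):
        return []
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists)]
    heapq.heapify(heap)
    current_max = max(lst[0] for lst in lists)
    out = []
    while True:
        value, i, pos = heapq.heappop(heap)
        if value == current_max:
            # min == max, so every pointer sits on this value: a match.
            out.append(value)
            pos += 1  # advance past the matched value
        else:
            # Advance this pointer toward the current max.
            pos += 1
            while pos < len(lists[i]) and lists[i][pos] < current_max:
                pos += 1
        if pos == len(lists[i]):
            return out  # one list ran out: no further matches possible
        new_value = lists[i][pos]
        current_max = max(current_max, new_value)
        heapq.heappush(heap, (new_value, i, pos))
```

Each element is pushed and popped at most once, so with k lists of total length N this is O(N log k) in the worst case, just like "merge k sorted lists".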
I'm trying to understand how exactly the function Evaluate works. Here I have two examples, and the only difference between them is the use of Evaluate.
First plot with Evaluate.
ReliefPlot[
Table[Evaluate[Sum[Sin[RandomReal[9, 2].{x, y}], {20}]], {x, 1, 2, .02},
{y, 1, 2, .02}],
ColorFunction ->
(Blend[{Darker[Green, .8], Lighter[Brown, .2], White}, #] &),
Frame -> False, Background -> None, PlotLegends -> Automatic]
plot1: https://imgur.com/itBRYEv.png
Second plot without Evaluate.
ReliefPlot[
Table[Sum[Sin[RandomReal[9, 2].{x, y}], {20}], {x, 1, 2, .02},
{y, 1, 2, .02}],
ColorFunction ->
(Blend[{Darker[Green, .8], Lighter[Brown, .2], White}, #] &),
Frame -> False, Background -> None,
PlotLegends -> Automatic]
plot2: https://i.imgur.com/fvdiSCm.png
Please explain how Evaluate makes a difference here.
Compare this
count=0;
ReliefPlot[Table[Sum[Sin[count++;RandomReal[9,2].{x,y}],{20}],{x,1,2,.02},{y,1,2,.02}]]
count
which should display your plot followed by 52020 = 51*51*20, because you have a 51*51 Table and each entry needs to evaluate the 20 iterations of your Sum,
with this
count=0;
ReliefPlot[Table[Evaluate[Sum[Sin[count++;RandomReal[9,2].{x,y}],{20}]],{x,1,2,.02},{y,1,2,.02}]]
count
which should display your plot followed by 20 because the Evaluate needed to do the 20 iterations of your Sum only once, even though you do see 51*51 blocks of different colors on the screen.
You will get the same counts displayed, without the graphics, if you remove the ReliefPlot from each of these; that seems to show it isn't the ReliefPlot that is responsible for the number of times your RandomReal is calculated, it is the Table.
So Evaluate is translating the external text of your Table entry into an internal form and telling Table that this has already been done and does not need to be repeated for every iteration of the Table.
What you put and see on the screen is the front end of Mathematica. Hidden behind that is the back end where most of the actual calculations are done. The front and back ends communicate with each other during your input, calculations, output and display.
But this still doesn't answer why the two plots look so different. I am guessing that when you don't use Evaluate, and thus don't mark the result of the Table as complete and finished, ReliefPlot will repeatedly probe the expression in your Table. That expression is different every time because of the RandomReal, and this is what produces the smoother, higher-resolution graphic. But when you do use Evaluate, the Table is marked as done and needs no further evaluation, so ReliefPlot just uses the 51*51 values without recalculating or probing, and you get a lower-resolution ReliefPlot.
As with almost all of Mathematica, the details of the algorithms used for each of the thousands of different functions are not available. Sometimes the Options and Details tab in the help page for a given function can give you some additional information. Experimentation can sometimes help you guess what is going on behind the code. Sometimes other very bright people have figured out parts of the behavior and posted descriptions. But that is probably all there is.
Table has the HoldAll attribute
Attributes[Table]
(* {HoldAll, Protected} *)
Read this and this to learn more about evaluation in the WL.
I am attempting to represent dice rolls in Julia. I am generating all the rolls of n dice, each with sides faces, with
sort(collect(product(repeated(1:sides, n)...)), by=sum)
This produces something like:
[(1,1),(2,1),(1,2),(3,1),(2,2),(1,3),(4,1),(3,2),(2,3),(1,4) … (6,3),(5,4),(4,5),(3,6),(6,4),(5,5),(4,6),(6,5),(5,6),(6,6)]
I then want to be able to reasonably modify those tuples to represent things like dropping the lowest value in the roll or adding a constant number, etc., e.g., converting (2,5) into (10,2,5) or (5,).
Does Julia provide nice functions to easily modify (not necessarily in-place) n-tuples or will it be simpler to move to a different structure to represent the rolls?
Thanks.
Tuples are immutable, so you can't modify them in-place. There is very good support for other mutable data structures, so there aren't many methods that take a tuple and return a new, slightly modified copy. One way to do this is by splatting a section of the old tuple into a new tuple, so, for example, to create a new tuple like an existing tuple t but with the first element set to 5, you would write: tuple(5, t[2:end]...). But that's awkward, and there are much better solutions.
As spencerlyon2 suggests in his comment, a one-dimensional Array{Int,1} is a great place to start. You can take a look at the Data Structures manual page to get an idea of the kinds of operations you can use; one-dimensional Arrays are iterable, indexable, and support the dequeue interface.
Depending upon how important performance is and how much work you're doing, it may be worthwhile to create your own data structure. You'll be able to add your own, specific methods (e.g., reroll!) for that type. And by taking advantage of some of the domain restrictions (e.g., if you only ever want to have a limited number of dice rolls), you may be able to beat the performance of the general Array.
You can construct a new tuple based on spreading or slicing another:
julia> b = (2,5)
(2, 5)
julia> (10, b...)
(10, 2, 5)
julia> b[2:end]
(5,)
I am planning out a C++ program that takes 3 strings that represent a cryptarithmetic puzzle. For example, given TWO, TWO, and FOUR, the program would find digit substitutions for each letter such that the mathematical expression
TWO
+ TWO
------
FOUR
is true, with the inputs assumed to be right justified. One way to go about this would of course be to just brute force it, assigning every possible substitution for each letter with nested loops, trying the sum repeatedly, etc., until the answer is finally found.
My thought is that though this is terribly inefficient, the underlying loop-check thing may be a feasible (or even necessary) way to go--after a series of deductions are performed to limit the domains of each variable. I'm finding it kind of hard to visualize, but would it be reasonable to first assume a general/padded structure like this (each X represents a not-necessarily distinct digit, and each C is a carry digit, which in this case, will either be 0 or 1)? :
  CCC.....CCC
  XXX.....XXXX
+ XXX.....XXXX
----------------
 CXXX.....XXXX
With that in mind, some more planning thoughts:
-Though leading zeros will not be given in the problem, I probably ought to add enough of them where appropriate to even things out/match operands up.
-I'm thinking I should start with a set of possible values 0-9 for each letter, perhaps stored as vectors in a 'domains' table, and eliminate values from this as deductions are made. For example, if I see some letters lined up like this
A
C
--
A
I can tell that C is zero and can thus eliminate all other values from its domain. I can think of quite a few deductions, but generalizing them to all kinds of little situations and putting them into code seems kind of tricky at first glance.
-Assuming I have a good series of deductions that run through things and boot out lots of values from the domains table, I suppose I'd still just loop over everything and hope that the state space is small enough to generate a solution in a reasonable amount of time. But it feels like there has to be more to it than that! -- maybe some clever equations to set up or something along those lines.
Tips are appreciated!
You could iterate over this problem from right to left, i.e. the way you'd perform the actual operation. Start with the rightmost column. For every letter you encounter, check whether it already has a digit assigned. If it does, use that value and go on. If it doesn't, enter a loop over all possible digits (perhaps omitting already-used ones if you want a one-to-one map) and recursively continue with each possible assignment. When you reach the sum row, check whether the letter given there is already assigned. If it is not, assign the last digit of your current sum, and then continue to the next higher-valued column, taking the carry with you. If there already is an assignment and it agrees with the last digit of your result, proceed in the same way. If there is an assignment and it disagrees, abort the current branch and return to the closest loop where you had other digits to choose from.
The benefit of this approach should be that many variables are determined by a sum, instead of guessed up front. Particularly for letters which only occur in the sum row, this might be a huge win. Furthermore, you might be able to spot errors early on, thus avoiding choices for letters in some cases where the choices you made so far are already inconsistent. A drawback might be the slightly more complicated recursive structure of your program. But once you got that right, you'll also have learned a good deal about turning thoughts into code.
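This recursive, column-by-column approach might look as follows in Python. It's a sketch under the assumption that the result row is at least as long as each input; all helper names are mine, not from the answer:

```python
def solve_columns(a, b, total):
    """Right-to-left column solver: assign letters as columns demand,
    derive each result letter from the column sum, backtrack on
    contradiction. Assumes len(total) >= len(a), len(b)."""
    a, b, total = a[::-1], b[::-1], total[::-1]  # index 0 = rightmost column
    n = len(total)
    pa = a.ljust(n, "#")  # '#' marks a padding (zero) position
    pb = b.ljust(n, "#")
    lead_a, lead_b, lead_t = a[-1], b[-1], total[-1]  # original leading letters

    def column(col, env, used, carry):
        if col == n:
            # Success only if no carry remains and no leading zeros.
            if carry == 0 and all(env[c] != 0 for c in (lead_a, lead_b, lead_t)):
                return dict(env)
            return None
        inputs = [c for c in (pa[col], pb[col]) if c != "#"]

        def settle(used):
            # All input letters of this column are assigned: derive the result.
            s = sum(env[c] for c in inputs) + carry
            digit, new_carry = s % 10, s // 10
            r = total[col]
            if r in env:  # result letter already assigned: must agree
                return column(col + 1, env, used, new_carry) if env[r] == digit else None
            if digit in used:
                return None  # digit taken by another letter
            env[r] = digit
            found = column(col + 1, env, used | {digit}, new_carry)
            if found is None:
                del env[r]
            return found

        def assign(idx, used):
            if idx == len(inputs):
                return settle(used)
            c = inputs[idx]
            if c in env:  # already assigned (earlier column or duplicate)
                return assign(idx + 1, used)
            for d in range(10):
                if d in used:
                    continue
                env[c] = d
                found = assign(idx + 1, used | {d})
                if found is not None:
                    return found
                del env[c]
            return None

        return assign(0, used)

    return column(0, {}, set(), 0)
```

Because the result letters are derived from the column sums rather than guessed, the branching happens only on the input letters, which is exactly the win the answer describes.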
I solved this problem at my blog using a randomized hill-climbing algorithm. The basic idea is to choose a random assignment of digits to letters, "score" the assignment by computing the difference between the two sides of the equation, then altering the assignment (swap two digits) and recompute the score, keeping those changes that improve the score and discarding those changes that don't. That's hill-climbing, because you only accept changes in one direction. The problem with hill-climbing is that it sometimes gets stuck in a local maximum, so every so often you throw out the current attempt and start over; that's the randomization part of the algorithm. The algorithm is very fast: it solves every cryptarithm I have given it in fractions of a second.
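A minimal sketch of that randomized hill-climbing idea in Python (parameter defaults and helper names are my own choices, not from the blog):

```python
import random

def solve_hillclimb(a, b, total, restarts=500, steps=1000):
    """Randomized hill climbing: score an assignment by |A + B - TOTAL|,
    mutate by swapping two digits, keep swaps that don't worsen the
    score, and restart from a fresh random assignment when stuck."""
    letters = sorted(set(a + b + total))

    def score(perm):
        env = {c: perm[i] for i, c in enumerate(letters)}
        if any(env[w[0]] == 0 for w in (a, b, total)):
            return None  # leading zero: invalid assignment
        num = lambda w: int("".join(str(env[c]) for c in w))
        return abs(num(a) + num(b) - num(total))

    for _ in range(restarts):
        perm = list(range(10))  # slots beyond len(letters) hold unused digits
        random.shuffle(perm)
        best = score(perm)
        for _ in range(steps):
            if best == 0:
                return {c: perm[i] for i, c in enumerate(letters)}
            i, j = random.sample(range(10), 2)
            perm[i], perm[j] = perm[j], perm[i]
            s = score(perm)
            if s is not None and (best is None or s <= best):
                best = s  # keep the swap: not worse
            else:
                perm[i], perm[j] = perm[j], perm[i]  # undo the swap
        if best == 0:
            return {c: perm[i] for i, c in enumerate(letters)}
    return None
```

Accepting equal scores lets the search walk across plateaus, and the outer restart loop is the randomization that escapes local maxima.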
Cryptarithmetic problems are classic constraint satisfaction problems. Basically, what you need to do is have your program generate constraints based on the inputs such that you end up with something like the following, using your given example:
O + O = 2O = R + 10*Carry1
W + W + Carry1 = 2W + Carry1 = U + 10*Carry2
T + T + Carry2 = 2T + Carry2 = O + 10*Carry3 = O + 10*F
Generalized pseudocode:
for i in range of shorter input, or either input if they're the same length:
shorterInput[i] + longerInput[i] + Carry[i] = result[i] + 10*Carry[i+1] // Carry[0] == 0
for the rest of the longer input, if one is longer:
longerInput[i] + Carry[i] = result[i] + 10*Carry[i+1]
Additional constraints based on the definition of the problem:
Range(digits) == {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Range(auxiliary_carries) == {0, 1}
So for your example:
Range(O, W, T) == {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Range(Carry1, Carry2, F) == {0, 1}
Once you've generated the constraints to limit your search space, you can use CSP resolution techniques as described in the linked article to walk the search space and determine your solution (if one exists, of course). The concept of (local) consistency is very important here and taking advantage of it allows you to possibly greatly reduce the search space for CSPs.
As a simple example, note that cryptarithmetic generally does not use leading zeroes, meaning if the result is longer than both inputs the final digit, i.e. the last carry digit, must be 1 (so in your example, it means F == 1). This constraint can then be propagated backwards, as it means that 2T + Carry2 == O + 10; in other words, the minimum value for T must be 5, as Carry2 can be at most 1 and 2(4)+1==9. There are other methods of enhancing the search (min-conflicts algorithm, etc.), but I'd rather not turn this answer into a full-fledged CSP class so I'll leave further investigation up to you.
(Note that you can't make assumptions like A+C=A -> C == 0 except for in least significant column due to the possibility of C being 9 and the carry digit into the column being 1. That does mean that C in general will be limited to the domain {0, 9}, however, so you weren't completely off with that.)
Per the IBM documentation at http://publib.boulder.ibm.com/infocenter/bigins/v1r1/index.jsp?topic=%2Fcom.ibm.swg.im.infosphere.biginsights.doc%2Fdoc%2Fc0057749.html, the default order of Jaql's top operator is ascending. But when I run it, I see the default order as descending. I am using BigInsights version 1.4. I was wondering if anyone knows whether this is a documentation issue or whether there is some other reason behind this seeming discrepancy:
jaql> nums = [2,1,3];
jaql> nums -> top 2;
[
2,
1
]
top does not impose any ordering on the input array. It translates to a slice(array, 0, n) function call, which simply takes the first n elements without looking at the values (unless you run it in MapReduce mode, which you did not in this example). If you wanted to impose a deterministic order, you would have to attach a comparator.
In this case, because the example used [2,1,3], it appears as though it is in descending order, but Top just returned the first two values in the array. Had you asked for:
jaql> nums -> top 3;
it would have returned:
[
2,
1,
3
]
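If the answer above is right that top n translates to slice(array, 0, n), the behaviour is analogous to list slicing in Python (a hypothetical analogy, not Jaql itself):

```python
# `top n` as a plain prefix slice: no ordering is applied.
nums = [2, 1, 3]

first_two = nums[:2]            # like `nums -> top 2`: just the first two
smallest_two = sorted(nums)[:2]  # an explicit ascending sort gives a true "top"
print(first_two, smallest_two)
```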