Rust Ndarray Slice from vec/list of integers - multidimensional-array

I need to take a slice from an ArrayBase (ndarray crate) using particular indexes (i.e. not a range and not a single integer, but a collection of integers).
Example:
let ar = arr2(&[[1.,2.,3., 4.,5.,6.], [7., 8., 9., 10., 11., 12.]]);
From https://docs.rs/ndarray/latest/ndarray/doc/ndarray_for_numpy_users/index.html#indexing-and-slicing I see that the s! macro helps to build a slice from integers and ranges. However, what if I want to build a slice from an arbitrary vec (which I don't know upfront)?
Eg, I want to take first, fourth and fifth columns, so I want something like:
dbg!(ar.slice(s![.., &[0, 3, 4]]));
I'd like to get an array like this:
[[1.0, 4.0, 5.0],
[7.0, 10.0, 11.0]]

Related

Find which sum of any numbers in an array equals amount

I have a customer who sends electronic payments but doesn't bother to specify which invoices they cover. I'm left guessing which ones, and I would rather not try every single combination manually. I need some sort of pseudo-code to do it and then I can adapt it, but I'm not sure I can come up with a good algorithm myself. I'm familiar with PHP, Bash, and Python, but I can adapt.
I would need an array with the following numbers: [357.15, 223.73, 106.99, 89.96, 312.39, 120.00]. Those are the amounts of the invoices. Then I would need to find a sum of any combination of two or more of those numbers that adds up to 596.57. Once found the program would need to tell me exactly which numbers it used to reach the sum so I can then know which invoices got paid.
This is very similar to the Subset Sum problem and can be solved using a similar approach to the typical brute-force method used for that problem. I have to do this often enough that I keep a simple template of this algorithm handy for when I need it. What is posted below is a slightly modified version [1].
This has no restrictions on whether the values are integer or float. The basic idea is to iterate over the list of input values and keep a running list of every subset that sums to less than the target value (since there might be a later value in the inputs that will yield the target). It could be modified to handle negative values as well by removing the rule that only keeps candidate subsets if they sum to less than the target. In that case, you'd keep all subsets, and then search through them at the end.
import copy

def find_subsets(base_values, target):
    # Each entry is [known_attainable_value, [list, of, components]]
    possible_matches = [[0, []]]
    matches = []  # we'll return ALL subsets that sum to `target`
    for base_value in base_values:
        # Can't modify the list while looping over it, so extend a copy
        temp = copy.deepcopy(possible_matches)
        for possible_match in possible_matches:
            new_val = possible_match[0] + base_value
            if new_val <= target:
                # Build a new component list rather than appending in place,
                # so subsets already stored in `matches` are never mutated
                new_possible_match = [new_val, possible_match[1] + [base_value]]
                temp.append(new_possible_match)
                if new_val == target:
                    matches.append(new_possible_match[1])
        possible_matches = temp
    return matches

# Usage: find_subsets(list_of_input_values, target_sum)
This is a very inefficient algorithm and it will blow up quickly as the size of the input grows. The Subset Sum problem is NP-Complete, so you are not likely to find a generalized solution that will work in all cases and is efficient.
1: The way lists are being used here is kludgy. If the goal was to simply find any match, the nested lists could be replaced with a dictionary, and we could exit right away once a match is found. But doing that will cause intermediate subsets that sum to the same value to also map to the same dictionary slot, so only one subset with that sum is kept. Since we need to report all matching subsets (because the values represent checks and are presumably not fungible even if the dollar amounts are equal), a dictionary won't work.
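For comparison, here is a rough sketch of the dictionary variant that the footnote describes. The name find_any_subset is made up for illustration, and it assumes non-negative values that add up exactly (for example integer cents). It stops at the first match and, as noted above, keeps only one subset per attainable sum.

```python
def find_any_subset(base_values, target):
    """Return ONE subset of base_values summing to target, or None.

    Sketch only: assumes non-negative values with exact arithmetic
    (e.g. integer cents). Only one subset is kept per attainable sum.
    """
    reachable = {0: []}  # attainable sum -> one list of components producing it
    for base_value in base_values:
        # Iterate over a snapshot so a value is not reused within the same pass
        for known_sum, components in list(reachable.items()):
            new_sum = known_sum + base_value
            if new_sum > target or new_sum in reachable:
                continue  # too big, or this sum already has a subset
            reachable[new_sum] = components + [base_value]
            if new_sum == target:
                return reachable[new_sum]  # exit as soon as a match is found
    return None


print(find_any_subset([1, 2, 3, 4], 6))   # [1, 2, 3]
print(find_any_subset([1, 2, 3, 4], 11))  # None
```

Because two different subsets with the same sum collapse into one dictionary slot, this is only useful when any single match will do.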
You can use itertools.combinations(t,r) to list all combinations of r elements in array t.
So we loop on the possible values of r, then on the results of itertools.combinations:
import itertools

def find_sum(t, obj):
    t = [x for x in t if x < obj]  # filter out elements which are too big
    for r in range(1, len(t) + 1):  # loop on number of elements
        for subt in itertools.combinations(t, r):  # loop on combinations of r elements
            if sum(subt) == obj:
                return subt
    return None

find_sum([1,2,3,4], 6)
# (2, 4)
find_sum([1,2,3,4], 10)
# (1, 2, 3, 4)
find_sum([1,2,3,4], 11)
# None
find_sum([35715, 22373, 10699, 8996, 31239, 12000], 59657)
# None
Rounding errors:
The code above is meant to be used with integers, rather than floats.
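To use with floats, replace the test sum(subt) == obj with the more forgiving test abs(sum(subt) - obj) < 0.01. The abs() matters: without it, any combination whose sum is smaller than obj would also pass the test.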
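Another way around the rounding problem is to convert the amounts to integer cents before searching. The sketch below assumes the amounts are currency values with two decimals; to_cents and find_sum_cents are names made up for this example.

```python
import itertools


def to_cents(amount):
    # round() absorbs float noise such as 357.15 * 100 == 35714.99999...
    return round(amount * 100)


def find_sum_cents(amounts, target):
    """Return the first combination of amounts that sums to target, or None."""
    cents = [to_cents(a) for a in amounts]
    goal = to_cents(target)
    for r in range(1, len(cents) + 1):  # loop on number of elements
        # Combine indexes so the original amounts can be reported back
        for idx in itertools.combinations(range(len(cents)), r):
            if sum(cents[i] for i in idx) == goal:
                return tuple(amounts[i] for i in idx)
    return None


invoices = [357.15, 223.73, 106.99, 89.96, 312.39, 120.00]
print(find_sum_cents(invoices, 580.88))  # (357.15, 223.73)
print(find_sum_cents(invoices, 596.57))  # None
```

The second call returns None, which agrees with the integer run above: no combination of these invoices adds up to 596.57.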
Relevant documentation:
itertools.combinations

I have a list and I want to print a range of its contents with range and a for loop

I have the following list in Python:
items = [5,4,12,7,15,9]
and I want to print in this form:
9,15,7,12,4
How can I do that ?
numbers_list = [5,4,12,7,15,9]
for index in range(len(numbers_list)):
    print(numbers_list[-(index + 1)])
Not sure if it's very "Pythonic"
Because the list indices are negated, the elements are accessed in reverse order.
The last element of a Python list is at index -1, the one before it at -2, and so on, down to the first element at minus the list length. The index is incremented by one before negating so that the loop starts at -1 rather than 0.
Using reversed and str.join:
numbers = [5, 4, 12, 7, 15, 9]
print(",".join(str(n) for n in reversed(numbers))) # 9,15,7,12,4,5
str.join performs far better than building your own string with mystring += "something". The question "How slow is Python's string concatenation vs. str.join?" provides interesting insights about this.
I could also write a list comprehension to build an intermediate list like this:
reversed_string = [str(n) for n in reversed(numbers)]
print(",".join(reversed_string))
but writing a list comprehension means the list is stored in memory twice (the original one and the "stringified" one). A generator computes the elements lazily as str.join asks for them, much the same way a classic iterator would.
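Note that the output requested in the question, 9,15,7,12,4, leaves out the first element (5). A small sketch with range and a for loop that stops before index 0, assuming the goal really is to skip that first element:

```python
items = [5, 4, 12, 7, 15, 9]

picked = []
# Start at the last index and step backwards; the stop value 0 is excluded,
# so the first element (5) is never visited.
for index in range(len(items) - 1, 0, -1):
    picked.append(str(items[index]))

result = ",".join(picked)
print(result)  # 9,15,7,12,4
```

The three arguments to range are start, stop (exclusive) and step, so changing the stop to -1 would include the first element again.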

iterating through a multidimensional array using CartesianRange in julia

I want to retrieve all the elements along the last dimension of an N-dimensional array A. That is, if idx is an (N-1) dimensional tuple, I want A[idx...,:]. I've figured out how to use CartesianRange for this, and it works as shown below
A = rand(2,3,4)
for idx in CartesianRange(size(A)[1:end-1])
    i = zeros(Int, length(idx))
    [i[bdx] = idx[bdx] for bdx in 1:length(idx)]
    @show(A[i...,:])
end
However, there must be an easier way to create the index i shown above. Splatting idx does not work - what am I doing wrong?
You can just index directly with the CartesianIndex that gets generated from the CartesianRange!
julia> for idx in CartesianRange(size(A)[1:end-1])
           @show(A[idx,:])
       end
A[idx,:] = [0.0334735,0.216738,0.941401,0.973918]
A[idx,:] = [0.842384,0.236736,0.103348,0.729471]
A[idx,:] = [0.056548,0.283617,0.504253,0.718918]
A[idx,:] = [0.551649,0.55043,0.126092,0.259216]
A[idx,:] = [0.65623,0.738998,0.781989,0.160111]
A[idx,:] = [0.177955,0.971617,0.942002,0.210386]
The other recommendation I'd have here is to use the un-exported Base.front function to extract the leading dimensions from size(A) instead of indexing into it. Working with tuples in a type-stable way like this can be a little tricky, but they're really fast once you get the hang of it.
It's also worth noting that Julia's arrays are column-major, so accessing the trailing dimension like this is going to be much slower than grabbing the columns.

Pyspark Array Key,Value

I currently have an RDD with an array that stores a key-value pair where the key is the 2D indices of the array and the value is the number at that spot. For example [((0,0),1),((0,1),2),((1,0),3),((1,1),4)]
I want to add up the values of each key with the surrounding values. In relation to my earlier example, I want to add up 1,2,3 and place it in the (0,0) key value spot. How would I do this?
I would suggest you do the following:
Define a function that, given a pair (i,j), returns a list with the pairs corresponding to the positions surrounding (i,j), plus the input pair (i,j) itself. For instance, let's say the function is called surrounding_pairs(pair). Then:
surrounding_pairs((0,0)) = [ (0,0), (0,1), (1,0) ]
surrounding_pairs((2,3)) = [ (2,3), (2,2), (2,4), (1,3), (3,3) ]
Of course, you need to be careful and return only valid positions.
Use a flatMap on your RDD as follows:
MyRDD = MyRDD.flatMap(lambda kv: [(p, kv[1]) for p in surrounding_pairs(kv[0])])  # kv is a (pos, value) pair; Python 3 lambdas can't unpack tuple arguments
This will map your RDD from
[((0,0),1),((0,1),2),((1,0),3),((1,1),4)] to
[((0,0),1),((0,1),1),((1,0),1),
((0,1),2),((0,0),2),((1,1),2),
((1,0),3),((0,0),3),((1,1),3),
((1,1),4),((1,0),4),((0,1),4)]
This way, the value at each position will be "copied" to the neighbour positions.
Finally, just use a reduceByKey to add the corresponding values at each position:
from operator import add
MyRDD = MyRDD.reduceByKey(add)
I hope this makes sense.
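To make the steps above concrete, here is a plain-Python sketch (no Spark required) of one possible surrounding_pairs, together with a loop that mimics the flatMap and reduceByKey steps. The n_rows and n_cols parameters and the neighbour_sums helper are assumptions added for this illustration; the answer does not say how the grid size is known.

```python
from collections import defaultdict


def surrounding_pairs(pair, n_rows, n_cols):
    """Return (i, j) itself plus its left/right/up/down neighbours inside the grid."""
    i, j = pair
    candidates = [(i, j), (i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)]
    # Keep only valid positions
    return [(r, c) for r, c in candidates if 0 <= r < n_rows and 0 <= c < n_cols]


def neighbour_sums(cells, n_rows, n_cols):
    """Local stand-in for flatMap followed by reduceByKey(add)."""
    totals = defaultdict(int)
    for pos, value in cells:
        # "flatMap": copy the value to the position and each neighbour
        for p in surrounding_pairs(pos, n_rows, n_cols):
            totals[p] += value  # "reduceByKey": add up per position
    return dict(totals)


cells = [((0, 0), 1), ((0, 1), 2), ((1, 0), 3), ((1, 1), 4)]
sums = neighbour_sums(cells, 2, 2)
print(sums[(0, 0)])  # 6, i.e. 1 + 2 + 3
```

In the RDD version the same surrounding_pairs would be called inside the flatMap lambda, with the grid size passed in or captured from the enclosing scope.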

Comparing values of 2 dictionaries

Can anyone advise how I could compare the values of 2 dictionaries? For example:
A = {'John': [(300, 5000), (700, 750), (10, 300)], 'Mary': [(12, 300), (5678, 9000), (200, 657), (800, 7894)]}
B = {'Jim': [(500, 1000), (600, 1500), (900, 2000)], 'Mary': [(13, 250), (1000, 6000), (222, 600)]}
I would like to compare the two so that, if a key (in this case 'Mary') is present in both the A and B dictionaries, I get the values from B whose first and second numbers fall within the corresponding values in A (i.e. (13,250) and (222,600) fall within (12,300) and (200,657) respectively). The returned result would therefore be 'Mary': [(13,250), (222,600)].
Thanks
Okay, I did what I think you wanted and retrieved the results (13,250) and (222,600), so it looks like it is working. I made three classes: a main class, a Point class, and a class that fills the dictionaries and compares them. I didn't know how you built your dictionaries, but I made mine like so:
private Dictionary<String,Map<String,List<Point>>> first
= new Hashtable<String,Map<String,List<Point>>>();
It is a dictionary that takes a String and a Map, which takes a String and a List, which takes Point objects. When I looked at your snippet, it just screamed points, so I made a small class with x and y properties.
Next, make a method that compares the values of two points using less-than and greater-than.
Then, in another method, after you fill the dictionaries, loop over the list stored in the second dictionary:
int size = second.get("B").get("Mary").size();
for (int i = 0; i < size; i++) {
    // call the compare method that you just made
}
Then print out results
My Output: Mary:(13,250)(222,600)
If you need any help with code please reply.
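Since the dictionaries in the question are Python literals, the same idea can be sketched directly in Python. This is only one reading of the question: it assumes the tuples are compared position by position (first with first, second with second, and so on), which is what reproduces the expected result. The helper names within and compare are made up for this example.

```python
A = {'John': [(300, 5000), (700, 750), (10, 300)],
     'Mary': [(12, 300), (5678, 9000), (200, 657), (800, 7894)]}
B = {'Jim': [(500, 1000), (600, 1500), (900, 2000)],
     'Mary': [(13, 250), (1000, 6000), (222, 600)]}


def within(inner, outer):
    # True when the inner (low, high) pair lies inside the outer pair
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def compare(a, b):
    result = {}
    for key in a.keys() & b.keys():  # keys present in both dictionaries
        # zip pairs the tuples position by position (an assumption, see above)
        matches = [pb for pa, pb in zip(a[key], b[key]) if within(pb, pa)]
        if matches:
            result[key] = matches
    return result


print(compare(A, B))  # {'Mary': [(13, 250), (222, 600)]}
```

If instead a tuple from B should match when it fits inside any tuple of A, (1000, 6000) would also qualify because it lies within (800, 7894), so the pairing rule is worth confirming.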
