I'm implementing a function where I'll be repeatedly eliminating values from a large list, and passing a copy of this list as a vector into another function each iteration:
let mut v = vec![5, 4, 4, 2, 6, 5, 1, 8, 2, 1, 6, 5, 4, 2, 0, 1];
for i in 0..10 {
println!("{}", Vector::from(v).iter().sum());
v.retain(|&x| x > i);
}
If v is very large, this will be slow. Is there a better way? I tried:
let mut v = vec![5, 4, 4, 2, 6, 5, 1, 8, 2, 1, 6, 5, 4, 2, 0, 1];
let mut v = v.into_iter().map(|x| Some(x)).collect();
(and then replace the "deleted" values with None) but this just seemed unwieldy to convert to and from an ordinary Vec.
How should I be storing this list of values?
You can restructure your creation of the copied list to do the removal before the copy:
for i in 0..10 {
    // Copy the values out of the borrowed list, keeping only those above the threshold.
    let dup = your_list.iter().copied().filter(|&n| n > i).collect::<Vec<_>>();
    use_it(dup);
}
If it is important to your use case that you are left with a filtered Vec, and cannot change the collection type, then this is probably the most useful means. If the filters are cumulative, you can overwrite the original Vec with the filtered Vec each iteration to reduce the workload for each future iteration.
let mut list = your_list;
for i in 0..10 {
    // Consume the old Vec and keep only the surviving values.
    list = list.into_iter().filter(|&n| n > i).collect();
    use_it(list.clone());
}
The question you asked is directly answered by reshaping how you filter and duplicate the vector, but if you are able to change your types, then the answers below may be more worthwhile.
If your use_it function does not require a Vec or slice, then you may be better served by restructuring the consumer to take an iterator of numbers, and passing in your_list.iter().filter(...). This will result in no copying or rearranging in memory, and the consumer function will just skip the invalid values.
If you care more about counting how many times numbers appear in a collection, and do not specifically need a sequential list in memory, you can rearrange your list into a HashMap:
use std::collections::HashMap;
let mut dict: HashMap<i32, usize> = HashMap::new();
for num in your_list {
*dict.entry(num).or_insert(0) += 1;
}
and then you can filter numbers out of the map with constant-time access rather than linear-time in the size of the collection.
Since this is a question about performance, you will need to benchmark everything to test your assumptions. That being said, unless there's something smart to do inside the function you call (maybe lazily copying only the items you want to mutate), I think your retain+clone approach is close to the fastest you can do. Using Options is almost certainly a bad idea: it adds checks everywhere and it kills cache locality.
The only thing that may improve performance is to do the copy and filtering in a single pass:
let mut v = vec![5, 4, 4, 2, 6, 5, 1, 8, 2, 1, 6, 5, 4, 2, 0, 1];
let mut work = v.clone();
for i in 0..10 {
    println!("{}", work.iter().sum::<i32>());
    work.clear();
    // Filter v in place and rebuild the working copy in the same pass.
    v.retain(|&x| if x > i { work.push(x); true } else { false });
}
Note that this will probably not make any difference if your data fits in the cache. In any case, benchmark, benchmark, benchmark! Lots of assumptions get proven wrong in the face of compiler optimizations and modern CPU architecture.
If you're removing the elements in order, you should consider a queue. Removing an element from the front of a queue takes O(1) time, because it is simply a dequeue (a pop).
While reading about shallow copies, I learned that copy.copy(x) creates a shallow copy. But I don't see shallow-copy behavior in the case of a one-dimensional list.
Example:
>> new = [1,2,3,4,5,6]
>> original = copy.copy(new)
>> new
[1, 2, 3, 4, 5, 6]
>> original
[1, 2, 3, 4, 5, 6]
>> id(new)
65022912
>> id(original)
65022512
>> new[2]=13
>> new
[1, 2, 13, 4, 5, 6]
>> original
[1, 2, 3, 4, 5, 6]
So here I assume updating "new" list should update "original" list but it is not happening.
In the case of the multidimensional list, the shallow copy is working properly.
Example:
>> parent_list = [1, 2, [3,4], [5,6]]
>> child_list = copy.copy(parent_list)
>> parent_list[2][1] = "Python"
>> parent_list
[1, 2, [3, 'Python'], [5, 6]]
>> child_list
[1, 2, [3, 'Python'], [5, 6]]
Please guide me, why the shallow copy is not working in case of a one-dimensional list.
Thanks.
There are actually 3 "Levels" to what you want.
1) Create a new reference to the same list. This aliasing is a trivial operation and would look like "original = new" or something like that. original[1] =x will update new[1]; This is the equivalent of copying a pointer in a pointer based language.
original = new
original[1] = x
new[1] will now be updated to x
This should be obvious but for completeness, If you follow the above with:
new = other
original is NOT affected at all.
2) Create a "Copy" of the list. This allocates a new list and copies the elements of the old one into it. This is a "Shallow" copy: the new list is a separate object, but its elements are references to the same children as the original. original[1] = x will NOT update new[1], but original[1].childValue = x will update new[1].childValue.
original = shallow copy of new
original[1].value = x
new[1].value WILL change to x
original[1] = y
new[1] will NOT be affected
3) Create a deep copy of the list. This will allocate a new area and shallow copy the list, but then will recurse and copy each child referenced in the list. No updates to original will modify new, or vice versa.
original = deep copy of new
original[1].value = x
new[1].value will NOT be affected
original[1] = y
new[1] will NOT be affected
A shallow copy of a list is not usually what you want, because it leaves your list in a hybrid state, with some members referencing other lists and some not, which can lead to surprising behavior. It may still be necessary if your tree is fairly deep and/or you never modify the child nodes.
original[1].value = x gives an error: AttributeError: 'int' object has no attribute 'values'
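The three levels described above can be sketched with Python's standard copy module. This is only an illustration: the names alias, shallow and deep are made up for the example, and the nested list stands in for any mutable child object.

```python
import copy

new = [1, 2, [3, 4]]

alias = new                   # level 1: a second name for the same list
shallow = copy.copy(new)      # level 2: new outer list, shared children
deep = copy.deepcopy(new)     # level 3: new outer list and new children

new[0] = 99                   # rebinding a slot: only the alias sees it
new[2][1] = "Python"          # mutating a shared child: alias and shallow copy see it

print(alias)    # [99, 2, [3, 'Python']]
print(shallow)  # [1, 2, [3, 'Python']]
print(deep)     # [1, 2, [3, 4]]
```

A flat list of integers only ever exercises the first kind of change (rebinding a slot), which is why a shallow copy of it appears to behave like an independent copy.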
After writing the below code:
new = [1,2,3,4,5,6]
original = new.copy()
original[1] = 10
print("new:",new , "\n " ,"original:", original)
the output I am getting is :
new: [1, 2, 3, 4, 5, 6]
original: [1, 10, 3, 4, 5, 6]
The change made to original is not reflected in new at the same index, even though it is a shallow copy.
Is there a Qore operator or function that returns a sublist without modifying the source list, i.e. an equivalent of substr()? The extract operator removes the items from the original list.
list l = (1,2,3,4,5,6,7,8,9);
list l2 = extract l, 2, 4;
printf("l:%y\nl2:%y\n", l, l2);
l:[1, 2, 7, 8, 9]
l2:[3, 4, 5, 6]
The select operator supports the $# macro in its condition argument; it expands to the index of the current element.
list l = (1,2,3,4,5,6,7,8,9);
list l2 = select l, $# >= 2 && $# <2+4;
printf("l:%y\nl2:%y\n", l, l2);
l:[1, 2, 3, 4, 5, 6, 7, 8, 9]
l2:[3, 4, 5, 6]
The select operator is the best solution as you stated in your answer to your own question.
The splice and extract operators both will modify the list operand, which is not what you want.
Note that there is an outstanding feature issue for this in Qore (1781) - not yet targeted to a release, but it could go in the next major release (0.8.13) if there is any interest.
A simple function in Elixir that returns a list of the numbers between from and to:
defmodule MyList do
def span(_), do: raise "Should be 2 args"
def span(from, to) when from > to, do: [ to | span(to + 1, from) ]
def span(from, to) when from < to, do: [ from | span(from + 1, to) ]
def span(from, to) when from == to, do: [ from ]
end
I haven't the slightest clue why this works and returns a list of numbers.
MyList.span(1,5)
#=> [1,2,3,4,5]
I just can't get my head around this:
[ from | span(from + 1, to) ]
Ok, first loop, I assume, would return the following:
[ 1 | span(2, 5) ]
What is next? [ 1, 2 | span(3, 5) ] ? Why?
How does it know, when to stop? Why is it even working?
Please, do not chase the points - don't bother answering, if you are not going to make an effort to make things clear(er) for functional programmer noob (OO programmer).
As a bonus, could you give me some tips on how to start thinking recursively? Is there any panacea?
How does it keep track of the head? How does the function create a new list on each iteration while keeping the values produced in the previous one?
Thanks!
Ok, let's give this a shot.
Erlang evaluates function calls with a call-by-value strategy. From the linked wikipedia:
[call-by-value is a] family of evaluation strategies in which a function's argument is evaluated before being passed to the function.
What this means is that when Elixir (or rather Erlang) sees a function call with some arguments, it evaluates the arguments (which can obviously be expressions as well) before calling the function.
For example, let's take this function:
def add(a, b), do: a + b
If I call it with two expressions as arguments, those expressions will be evaluated before the results are added up:
add(10 * 2, 5 - 3)
# becomes:
add(20, 2)
# kind of becomes:
20 + 2
# which results in:
22
Now that we understand call-by-value, let's think of the | construct in lists as a function for a moment. Imagine it being used like this:
|(1, []) #=> [1]
|(29, [1, 2, 3]) #=> [29, 1, 2, 3]
As all functions, | evaluates its arguments before doing its work (which is creating a new list with the first argument as the first element and the second argument as the rest of the list).
When you call span(1, 5), it kind of expands (let's say it expands) to:
|(1, span(2, 5))
Now, since all arguments to | have to be evaluated before being able to actually prepend 1 to span(2, 5), we have to evaluate span(2, 5).
This goes on for a while:
|(1, |(2, span(3, 5)))
|(1, |(2, |(3, span(4, 5))))
|(1, |(2, |(3, |(4, span(5, 5)))))
|(1, |(2, |(3, |(4, [5]))))
# now, it starts to "unwind" back:
|(1, |(2, |(3, [4, 5])))
|(1, |(2, [3, 4, 5]))
|(1, [2, 3, 4, 5])
[1, 2, 3, 4, 5]
(sorry if I'm using this |() syntax, remember I'm just using | as a function instead of an operator).
Nothing keeps track of the head and no function "keeps the values produced in the previous [iteration]". The first call (span(1, 5)) just expands to [1|span(2, 5)]. Now, in order for the span(1, 5) call to return, it needs to evaluate [1|span(2, 5)]: there you have it, recursion! It will need to evaluate span(2, 5) first and so on.
Technically, the values are kept somewhere, and it's on the stack: each function call is placed on the stack and popped off only when it's able to return. So the stack will look something like the series of calls I showed above:
# First call is pushed on the stack
span(1, 5)
# Second call is pushed on top of that
span(1, 5), span(2, 5)
# ...
span(1, 5), span(2, 5), ..., span(5, 5)
# hey! span(5, 5) is not recursive, we can return [5]. Let's pop span(5, 5) from the stack then
span(1, 5), ..., span(4, 5)
# Now span(4, 5) can return because we know the value of span(5, 5) (remember, span(4, 5) is expanded to [4|span(5, 5)])
This goes on until it gets back to span(1, 5) (which is now [1|[2, 3, 4, 5]]) and finally to [1, 2, 3, 4, 5].
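To watch the stack grow and unwind, here is a rough Python transliteration of the ascending clause of span, with tracing added. It is a sketch, not the Elixir implementation: the depth parameter and the print calls are additions for illustration, and only the from <= to case is handled.

```python
def span(start, stop, depth=0):
    """Build [start, ..., stop] recursively, like [from | span(from + 1, to)]."""
    if start > stop:
        raise ValueError("this sketch only handles start <= stop")
    indent = "  " * depth
    print(f"{indent}span({start}, {stop}) called")
    if start == stop:
        result = [start]  # base case: nothing left to recurse on
    else:
        # The recursive call must finish before the list can be built,
        # so this frame waits on the stack until it returns.
        result = [start] + span(start + 1, stop, depth + 1)
    print(f"{indent}span({start}, {stop}) returns {result}")
    return result


print(span(1, 5))  # final line printed: [1, 2, 3, 4, 5]
```

The "called" lines appear in increasing depth and the "returns" lines in decreasing depth, which is the push and pop order described above.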
Ok I wrote a lot and I'm not sure I made anything clearer to you :). Please, ask anything that's not clear. There are surely a lot of resources to learn recursion out there; just to name the first bunch I found:
The "Recursion" chapter of Learn You Some Erlang for Great Good, a great book on Erlang
Obligatory Wikipedia page on recursion
A nice page I just found about recursion on the khan academy website
Why not, a couple of Elixir-specific resources: the "Getting started" guide on Elixir's website, this blog post, this other blog post
I need to make a nested loop with an arbitrary depth. Recursive loops seem the right way, but I don't know how to use the loop variables inside the loop. For example, if I set the depth to 3, it should work like
count = 1
for i=1, Nmax-2
for j=i+1, Nmax-1
for k=j+1,Nmax
function(i,j,k,0,0,0,0....) // a function having Nmax arguments
count += 1
end
end
end
I want to make a subroutine which takes the depth of the loops as an argument.
UPDATE:
I implemented the scheme proposed by Zoltan. I wrote it in python for simplicity.
count = 0

def f(CurrentDepth, ArgSoFar, MaxDepth, Nmax):
    global count
    if CurrentDepth > MaxDepth:
        # Every loop level has been assigned: this is the innermost body.
        count += 1
        print(count, ArgSoFar)
    else:
        if CurrentDepth == 1:
            for i in range(1, Nmax + 2 - MaxDepth):
                NewArgs = ArgSoFar  # same list object, not a copy
                NewArgs[1-1] = i
                f(2, NewArgs, MaxDepth, Nmax)
        else:
            for i in range(ArgSoFar[CurrentDepth-1-1] + 1, Nmax + CurrentDepth - MaxDepth + 1):
                NewArgs = ArgSoFar  # same list object, not a copy
                NewArgs[CurrentDepth-1] = i
                f(CurrentDepth + 1, NewArgs, MaxDepth, Nmax)

f(1, [0, 0, 0, 0, 0], 3, 5)
and the results are
1 [1, 2, 3, 0, 0]
2 [1, 2, 4, 0, 0]
3 [1, 2, 5, 0, 0]
4 [1, 3, 4, 0, 0]
5 [1, 3, 5, 0, 0]
6 [1, 4, 5, 0, 0]
7 [2, 3, 4, 0, 0]
8 [2, 3, 5, 0, 0]
9 [2, 4, 5, 0, 0]
10 [3, 4, 5, 0, 0]
There may be a better way to do this, but so far this one works fine. It seems easy to do this in fortran. Thank you so much for your help!!!
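For what it's worth, the index tuples produced by this loop nest (i < j < k, each between 1 and Nmax) are exactly the combinations of the numbers 1..Nmax taken depth at a time, so in Python the standard library can generate them directly. A minimal sketch, where index_tuples is a hypothetical helper name and the zero padding simply mimics the argument lists printed above:

```python
from itertools import combinations


def index_tuples(depth, n_max):
    """Return every strictly increasing tuple of `depth` indices from 1..n_max."""
    return combinations(range(1, n_max + 1), depth)


n_max = 5
for count, indices in enumerate(index_tuples(3, n_max), start=1):
    # Pad with zeros up to n_max entries, like the ArgSoFar list.
    args = list(indices) + [0] * (n_max - len(indices))
    print(count, args)
```

itertools.combinations yields its tuples in lexicographic order, the same order in which the nested loops visit them.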
Here's one way you could do what you want. This is pseudo-code, I haven't written enough to compile and test it but you should get the picture.
Define a function, let's call it fun1 which takes inter alia an integer array argument, perhaps like this
<type> function fun1(indices, other_arguments)
integer, dimension(:), intent(in) :: indices
...
which you might call like this
fun1([4,5,6],...)
and the interpretation of this is that the function is to use a loop-nest 3 levels deep like this:
do ix = 1,4
do jx = 1,5
do kx = 1,6
...
Of course, you can't write a loop nest whose depth is determined at run-time (not in Fortran anyway) so you would flatten this into a single loop along the lines of
do ix = 1, product(indices)
If you need the values of the individual indices inside the loop you'll need to unflatten the linearised index. Note that all you are doing is writing the code to transform array indices from N-D into 1-D and vice versa; this is what the compiler does for you when you can specify the rank of an array at compile time. If the inner loops aren't to run over the whole range of the indices you'll have to do something more complicated, careful coding required but not difficult.
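The unflattening step can be sketched as follows, in Python rather than Fortran for brevity. The helper name unflatten is made up for this example; it assumes a 0-based linear index, 1-based loop indices, and that the last index varies fastest, so the order matches the do ix / do jx / do kx nest shown earlier.

```python
from math import prod  # available from Python 3.8


def unflatten(linear, extents):
    """Map a 0-based linear index to 1-based indices, last index fastest."""
    indices = []
    for extent in reversed(extents):
        linear, remainder = divmod(linear, extent)
        indices.append(remainder + 1)
    indices.reverse()
    return indices


extents = [4, 5, 6]
# One flat loop that visits the same index triples as the 3-level nest.
visited = [unflatten(n, extents) for n in range(prod(extents))]
print(visited[0], visited[1], visited[-1])  # [1, 1, 1] [1, 1, 2] [4, 5, 6]
```

Restricting the inner loops to part of their range (for example jx > ix) would need either a filter on the unflattened indices or a different counting scheme.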
Depending on what you are actually trying to do, this may or may not be a good, or even a satisfactory, approach. If you are trying to write a function to compute a value at each element in an array whose rank is not known when you write the function, then the preceding suggestion is dead flat wrong; in that case you would want to write an elemental function. Update your question if you want further information.
you can define your function to have a List argument, which is initially empty
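// Needs java.util.List and java.util.ArrayList; Nmax is assumed to be defined elsewhere.
void f(int num, List<Integer> argumentsSoFar) {
    // call f() for num+1..Nmax
    for (int i = num + 1; i < Nmax; i++) {
        // List has no public clone(), so copy via the ArrayList constructor
        List<Integer> newArgs = new ArrayList<>(argumentsSoFar);
        newArgs.add(i);
        f(i, newArgs);
    }
    if (num + 1 == Nmax) {
        // do the work with your argument list...i think you wanted to arrive here ;)
    }
}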
caveat: the stack should be able to handle Nmax depth function calls
Yet another way to achieve what you desire is based on the answer by High Performance Mark, but can be made more general:
subroutine nestedLoop(indicesIn)
! Input indices, of arbitrary rank
integer,dimension(:),intent(in) :: indicesIn
! Internal indices, here set to length 5 for brevity, but set as many as you'd like
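integer,dimension(5) :: indices
integer :: i1,i2,i3,i4,i5
! Reset on every call: initialising in the declaration would imply SAVE,
! so values from a previous call would otherwise persist
indices = 0
indices(1:size(indicesIn)) = indicesIn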
do i1 = 0,indices(1)
do i2 = 0,indices(2)
do i3 = 0,indices(3)
do i4 = 0,indices(4)
do i5 = 0,indices(5)
! Do calculations here:
! myFunc(i1,i2,i3,i4,i5)
enddo
enddo
enddo
enddo
enddo
endsubroutine nestedLoop
You now have nested loops explicitly coded, but these are 1-trip loops unless otherwise desired. Note that if you intend to construct arrays of rank that depends on the nested loop depth, you can go up to rank of 7, or 15 if you have a compiler that supports it (Fortran 2008). You can now try:
call nestedLoop([1])
call nestedLoop([2,3])
call nestedLoop([1,2,3,2,1])
You can modify this routine to your liking and desired applicability, add exception handling etc.
From an OOP approach, each loop could be represented by a "Loop" object - this object would have the ability to be constructed while containing another instance of itself. You could then theoretically nest these as deep as you need to.
Loop1 would execute Loop2 would execute Loop3.. and onwards.
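A minimal sketch of that idea in Python, assuming inclusive integer bounds. The class name Loop, its run method and the body callback are all hypothetical, chosen only to illustrate the nesting.

```python
class Loop:
    """One level of a loop nest; may contain another Loop as its body."""

    def __init__(self, start, stop, inner=None):
        self.start = start
        self.stop = stop    # inclusive upper bound
        self.inner = inner  # next Loop down, or None at the innermost level

    def run(self, body, indices=()):
        for i in range(self.start, self.stop + 1):
            current = indices + (i,)
            if self.inner is None:
                body(*current)  # innermost level: call the work function
            else:
                self.inner.run(body, current)  # hand the indices to the next level


results = []
nest = Loop(1, 2, Loop(1, 3))  # two levels: 1..2 outside, 1..3 inside
nest.run(lambda i, j: results.append((i, j)))
print(results)  # [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
```

Because each Loop only knows about the one it contains, the depth is fixed when the objects are constructed rather than when the code is written.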