Idiomatic way to split a sequence into three lists using Kotlin - functional-programming

This is possibly more about functional programming than Kotlin. I am at that stage where a little knowledge is dangerous, and I wrote the app in Kotlin, so it seems fair to ask a Kotlin question, as it's Kotlin's structures that I am interested in.
I have a sequence of items, they are in batches of three, so the stream may look like
1,a,+,2,b,*,3,c,&.......
What I want to do is split this into three lists. Currently I do this by partitioning into two lists, one that contains the numbers and one that contains everything else, then taking the second list (the letters and symbols) and partitioning it again into letters and symbols. Thus I end up with three lists.
This strikes me as somewhat inefficient, maybe a functional approach isn't the best approach here.
Is there an efficient way of doing this, or are my only choices this or a for loop?
Thanks

You can use the groupBy method to group the elements of your sequence by element type:
val elementsByType = sequence.groupBy { getElementType(it) }
where getElementType is a function returning the type of the element: whether it is a letter, a number, or a symbol. This function may return either a number, such as 1, 2, 3, or a value of some enum with 3 different entries.
groupBy returns a map from element type to list of elements of that type.
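For readers who want to see the single-pass grouping idea outside Kotlin, here is a rough sketch in Python. It is only an illustration of the technique, not the Kotlin code itself: element_type is a hypothetical classifier standing in for getElementType, and it assumes the items arrive as strings.

```python
from collections import defaultdict


def element_type(item):
    # Hypothetical classifier (stands in for getElementType above);
    # assumes every item is a non-empty string.
    if item.isdigit():
        return "number"
    if item.isalpha():
        return "letter"
    return "symbol"


def group_by_type(items):
    # One pass over the input: each item is appended to the list
    # belonging to its type, so no intermediate partitions are needed.
    groups = defaultdict(list)
    for item in items:
        groups[element_type(item)].append(item)
    return dict(groups)


stream = ["1", "a", "+", "2", "b", "*", "3", "c", "&"]
groups = group_by_type(stream)
```

Each item is visited once, and the order within each group follows the order of the input.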

Related

Can a SQLite user-defined function take a row argument?

They are described as scalar, but I think that refers to the return type rather than the arguments.
I'm trying to define one in Rust that will provide a TEXT value derived from other columns in the row. For convenience and readability at the point of use, I'd like to call it as select myfunc(mytable) from mytable rather than explicitly listing the columns it derives from.
The rusqlite example simply gets an argument as f64, so it's not that clear to me how it might be possible to interpret it as a row and retrieve columnar values from within it. Nor have I been able to find examples in other languages.
Is this possible?
This doesn't seem possible.
The func(tablename) syntax that I'm familiar with seems to be PostgreSQL-specific. SQLite supports func(*), but when func is user-defined it receives zero arguments, not one (structured) or N (all columns separately) as I expected.

Replace for loop with vectorized call of a function returning multiple values

I have the following function: problema_firma_emprestimo(r,w,r_emprestimo,posicao,posicao_banco), where all inputs are scalars.
This function returns three different matrices, using
return demanda_k_emprestimo,demanda_l_emprestimo,lucro_emprestimo
I need to run this function for a series of values of posicao_banco that are stored in a vector.
I'm doing this using a for loop, because I need three separate arrays, each storing one of the three outputs of the function, where the first dimension of each array corresponds to the index of posicao_banco. My code for this part is:
demanda_k_emprestimo = zeros(num_bancos,na,ny);
demanda_l_emprestimo = similar(demanda_k_emprestimo);
lucro_emprestimo = similar(demanda_k_emprestimo);
for i in eachindex(posicao_bancos)
    demanda_k_emprestimo[i,:,:] , demanda_l_emprestimo[i,:,:] , lucro_emprestimo[i,:,:] = problema_firma_emprestimo(r,w,r_emprestimo[i],posicao,posicao_bancos[i]);
end
Is there a fast and clean way of doing this using vectorized functions? Something like problema_firma_emprestimo.(r,w,r_emprestimo[i],posicao,posicao_bancos) ? When I do this, I get a tuple with the result, but I can't find a good way of unpacking the answer.
Thanks!
Unfortunately, it's not easy to use broadcasting here, since then you will end up with output that is an array of tuples, instead of a tuple of arrays. I think a loop is a very good approach, and has no performance penalty compared to broadcasting.
I would suggest, however, that you organize your output array dimensions differently, so that i indexes into the last dimension instead of the first:
for i in eachindex(posicao_bancos)
    demanda_k_emprestimo[:, :, i] , ...
end
This is because Julia arrays are column major, and this way the output values are filled into the output arrays in the most efficient way. You could also consider making the output arrays into vectors of matrices, instead of 3D arrays.
On a side note: since you are (or should be) creating an MWE for the sake of the people answering, it would be better if you used shorter and less confusing variable names. In particular, for people who don't understand Portuguese (I'm guessing), your variable names are very long and make the code visually dense. Telling the difference between demanda_k_emprestimo and demanda_l_emprestimo at a glance is hard. The meaning of the variables is not important either, so it's better to just call them A and B or X and Y, and the functions foo or something.

Is an STL vector of pointers to vectors my best option?

I currently have a ton of vectors all set to have 1200 items, which is overkill, but could be used. So I don't have to recode a lot of stuff, what is the best way to create and iterate through a list of these vectors and resize them all as needed? (they all are equal in size)
One of my options is to create a pointer (after the fact) to each vector, and then create a vector of these pointers which can be iterated to resize. Another option is to create the vectors in the first place as pointers instead of objects. This seems like it would be a lot of work, and I currently have a lot of code where I use the vector objects.
Are there other options?
Fred E.
In C++, std::vector is essentially a wrapper around a pointer to its data, plus its size and capacity. With the default allocator, the vector stores its elements on the heap. So you can store vectors in a vector as objects, or you can store pointers to them.
The easiest way is to store vectors in a vector as objects, but you must carefully monitor how you pass this outer vector into functions. You should always pass it by reference or pass a pointer to it, so as not to trigger copying all nested vectors. On the other hand, such a vector will automatically and correctly remove nested vectors when its destructor fires.
The vector of pointers to vectors has its advantages and disadvantages. On the one hand, if a function is carelessly called with this vector passed by value, only the pointers are copied. On the other hand, accessing the vectors through a pointer adds a level of indirection, which can affect performance. In addition, if the pointers own the vectors they point to, you must take care to delete them yourself.
If your vectors are created and deleted elsewhere and you only need to iterate through them and modify them in your code, then obviously the vector of pointers to vectors is your best choice.

Merge sort, the recursion part

After studying the merge sort for a couple of days, I understand it conceptually, but there is one thing that I don't get.
What I get:
1.) It takes a list, for example an array of numbers, splits it in half, sorts the two halves, and in the end merges them together.
2.) Because it's a recursive algorithm, it uses recursion to do that.
So the split of the mentioned array looks like this:
It splits the array until there is only one item in each list, at which point each list is considered sorted. And at that point the merge steps in.
Which should look like this:
What I don't get is, how does the recursion "know" after it splits all the lists to only one item in a list, to get back up the recursion tree? How does something that has a left and right side become the left side after it merges?
The thing that bothers me is this. I've taken a snapshot of the code from interactivepython page
How does the code get from the point where lefthalf = 2 and righthalf = 1 to the code shown in the picture, where lefthalf = [1,2] and righthalf = [4,3], without going back to the recursion that would divide what we have merged?
Tnx,
Tom
Once the list only contains one element, each pair of leaves are sorted and joined. Then you can traverse through the list and find out where the next pair should be inserted. The recursion "knows" nothing about going back up the recursion tree, rather it is the act of sorting and joining that has this effect.
The "recursion" does of course know nothing of that sort. It is the code that uses the recursion, which looks like this (a bit simplified):
sort list = merge (sort left_half) (sort right_half)
  where
    (left_half, right_half) = split list
Here you see that the "recursion" (i.e. the recursive invocations of sort) doesn't need to "know" anything. The only job of each invocation is to deliver a sorted list, array or whatever.
To put it differently: If we have merge satisfying the following invariant:
1. `merge`, given two sorted lists, will return a sorted list.
then we can write mergesort easily like outlined above. What is left to do in sort is to handle the easy cases: empty list, singleton and list with two elements.
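As a rough illustration of the outline above, here is a minimal Python sketch. It is not the code from the interactivepython page: it returns new lists instead of sorting in place, and the depth and trace parameters are additions made here purely to record what each call sees. The point it tries to show is that every call has its own local left and right, and that when a recursive call returns, execution simply continues in the caller on the next line.

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(items, depth=0, trace=None):
    # Base case: zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    # Each recursive call returns a sorted list to *this* call,
    # which then carries on with the next statement.
    left = merge_sort(items[:mid], depth + 1, trace)
    right = merge_sort(items[mid:], depth + 1, trace)
    merged = merge(left, right)
    if trace is not None:
        # Record (depth, left, right, merged) for illustration only.
        trace.append((depth, left, right, merged))
    return merged


trace = []
result = merge_sort([2, 1, 4, 3], trace=trace)
```

For the input [2, 1, 4, 3], the trace records the merge of [2] and [1] first, then the merge of [4] and [3], and only then the top-level merge of [1, 2] and [3, 4]. Nothing is divided again, because the top-level call was merely waiting for its two recursive calls to return.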
If you are talking about sub lists with an odd number of elements, then it depends on the implementation.
It either puts the bigger sub list on the left every single time, or it puts it on the right every single time.

Matching specific items in several discrete collections

I have a problem whereby I have several discrete lists of IDs, e.g.
List (A) 1,2,3,4,5,7,8
List (B) 2,3,4,5
List (C) 4,2,8,9,1
etc...
I then have another collection of IDs...
For example: 1,2,4
I need to try to match one ID into each list. If I can perfectly match all IDs in my secondary collection (each collection ID matched with an ID from a different list) then I get a true result....
I have found that it becomes complicated because if you simply iterate over the lists matching the first collection/list pair that you encounter it may result in you precluding a possible combination further on down the line hence returning a false negative result.
For example:
List (A) 1,2,3,4
List (B) 1,2,3,4
List (C) 3,4
Collection is: 3,1,2
The first ID from the collection (3) matches an entry in list A, and the second ID in the collection (1) matches an item in list B. However, the final ID in the collection (2) doesn't match any entry in list C. If you rearrange the order of the collection to be 2,1,3, then a match is found.... Therefore I am looking for some form of logic for attempting a match on all possible combinations in an efficient manner(?)
To make it more complicated, the IDs are actually GUIDs, so they can't just be sorted in ascending order
I hope I have described this well enough to make it clear what I am attempting and with a bit of luck somebody will be able to tell me that what I need to do is very easy and I am missing something real simple!
I am forced to code this in VB6, but any methods or pseudo code would be great. The backend of this is SQL Server, so if a solution using T-SQL were possible this would be even better, as all of the IDs are held in tables already.
Many thanks in advance.
Jake, yep the lists and the collection both contain GUIDS. I used plain integers to simplify the problem a bit.
Once a list has been matched it can't be searched again, hence the ordering problem that I tried to explain. If you say that a list is 'matched' then no further attempts to match it will be performed. It is this very behaviour that can cause a false negative.
'Sending' the collection in every possible order would work but would be a massive job .....
I feel I must be missing a really straightforward concept or solution here??!!
Thanks for your assistance so far.
I don't see a way around checking each GUID contained in the lists against each GUID in the collection. You would have to keep a record of which lists each GUID in the collection occurs in.
To use your example of the Collection (3, 1, 2), 3 occurs in List A, B and C.
You will basically be left with this dataset.
3 (A, B, C)
1 (A, B)
2 (A, B)
Once you have distilled it down to this dataset you can determine whether there are any GUIDs with zero occurrences in the lists which would result in a negative.
I am not at all well versed in algorithms, but this is how I would proceed after that :
Start with the first set (A, B, C), and check how many times it occurs further on in the dataset. In this case no occurrences are found.
Moving on to the next set (A, B): if the number of occurrences of this set is greater than the length of the set, i.e. more than two occurrences, the result is a negative. If the number of occurrences matches the length exactly, as is the case here, the set (A, B) can be removed from any further consideration.
3 (C)
1 ()
2 ()
I guess you would continue to repeat the process until a negative is identified or all the occurrences have been excluded. There is probably a recognized algorithm for this sort of problem, but my knowledge is a bit lacking in that respect. :(
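The problem as described looks like an instance of bipartite matching: IDs on one side, lists on the other, with an edge wherever a list contains an ID. A minimal sketch of the standard augmenting-path approach follows, written in Python as pseudo code to port rather than as a finished solution. The function name can_match_all and its parameters are made up for illustration, and it assumes the goal is to give every ID in the collection its own distinct list. Only equality tests are needed, so GUIDs (as strings) work without any sorting.

```python
def can_match_all(ids, lists):
    """Return True if every id can be assigned to a distinct list containing it."""
    list_sets = [set(members) for members in lists]
    # owner[k] is the index (into ids) of the id currently assigned
    # to list k, or None while list k is still free.
    owner = [None] * len(lists)

    def try_assign(i, visited):
        # Try to place ids[i] in some list, re-homing earlier ids if needed.
        for k, members in enumerate(list_sets):
            if ids[i] in members and k not in visited:
                visited.add(k)
                # Take list k if it is free, or if its current owner
                # can be moved to another list.
                if owner[k] is None or try_assign(owner[k], visited):
                    owner[k] = i
                    return True
        return False

    return all(try_assign(i, set()) for i in range(len(ids)))


list_a = [1, 2, 3, 4]
list_b = [1, 2, 3, 4]
list_c = [3, 4]
found = can_match_all([3, 1, 2], [list_a, list_b, list_c])
```

With the example from the question, 3 first takes list A, but it is moved along to B and finally to C as 1 and 2 arrive, so the order of the collection no longer matters and there is no need to try every permutation.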
