Does Rust have Collection traits?

I'd like to write a library that's a thin wrapper around some of the functionality in BTreeMap. I'd prefer not to tightly couple it to that particular data structure though. Strictly speaking, I only need a subset of its functionality, something along the lines of the NavigableMap interface in Java. I was hoping to find an analogous trait I could use. I seem to recall that at some point there were traits like Map and MutableMap in the standard library, but they seem to be absent now.
Is there a crate that defines these? Or will they eventually be re-added to std?

No, right now there's only Iterator. MutableMap and Map were removed somewhere along the road to the stabilization of std for Rust 1.0.
There have been various discussions about re-adding traits to std. See these discussions on Rust internals:
Traits that should be in std, but aren’t
or (less recent but more specifically on collections):
Collection Traits, Take 2
Bottom line: everybody wants some form of those traits in std, but nobody wants to commit to adding and supporting the wrong ones in the standard library until a clearer picture of what is ergonomic emerges.
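In the meantime, one common workaround is to define your own narrow trait and implement it for BTreeMap, so the wrapper never names the concrete type. A minimal sketch (the trait and its method names are invented for illustration, not from std or any particular crate):

use std::collections::BTreeMap;

// A hypothetical, deliberately narrow map interface: only the
// operations the wrapper actually needs.
trait NavigableMap<K: Ord, V> {
    fn get(&self, key: &K) -> Option<&V>;
    fn insert(&mut self, key: K, value: V) -> Option<V>;
    // Smallest entry with key >= `key`, like Java's ceilingEntry.
    fn ceiling(&self, key: &K) -> Option<(&K, &V)>;
}

impl<K: Ord, V> NavigableMap<K, V> for BTreeMap<K, V> {
    fn get(&self, key: &K) -> Option<&V> {
        BTreeMap::get(self, key)
    }
    fn insert(&mut self, key: K, value: V) -> Option<V> {
        BTreeMap::insert(self, key, value)
    }
    fn ceiling(&self, key: &K) -> Option<(&K, &V)> {
        // `range` walks keys >= `key` in sorted order.
        self.range(key..).next()
    }
}

Your library can then be generic over any M: NavigableMap<K, V> rather than coupling to BTreeMap itself.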

What makes Julia Composable?

I have seen it mentioned in many places that Julia is "Composable". I know that the word itself means:
Composability is a system design principle that deals with the inter-relationships of components. A highly composable system provides components that can be selected and assembled in various combinations to satisfy specific user requirements.
But I am curious what the specific components of Julia are that make it composable. Is it the ability to override base functions with my own implementation?
I guess I'll hazard an answer, though my understanding may be no more complete than yours!
As far as I understand it (in no small part from Stefan's "Unreasonable Effectiveness of Multiple Dispatch" JuliaCon talk as linked by Oscar in the comments), I would say that it is in part:
1) As you say, the ability to override base functions with your own implementation [and, critically, to then have it "just work" (be dispatched to) whenever appropriate thanks to multiple dispatch]. This means that if you make a custom type and define all the fundamental / primitive operations on that type (as in https://docs.julialang.org/en/v1/manual/interfaces/ -- say +-*/ et al. for numeric types, or getindex, setindex! et al. for an array-like type, etc.), then any more complex program built on those fundamentals will also "just work" with your new custom type. And that in turn means your custom type will also work (AKA compose) with other people's packages without any need for (e.g.) explicit compatibility shims, as long as people haven't over-constrained their function argument types (which is, incidentally, why over-constraining function argument types is a Julia antipattern).
2) Following on 1), the fact that so many Base methods are themselves just plain Julia, so they will also work with your new custom type as long as the proper fundamental operations are defined.
3) The fact that Julia's base types and methods are generally performant and convenient enough that in many cases there's no need to do anything custom, so you can just put together blocks that all operate on, e.g., plain Julia arrays or tuples.
This last point is perhaps most notable in contrast to a language like Python where, for example, every sufficiently large subset of the ecosystem (numpy, tensorflow, etc.) has its own reimplementation of (e.g.) arrays, which for better performance are all ultimately implemented in some other language entirely (C++, for numpy and TF) and thus probably do not compose with each other.
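For a tiny concrete illustration of point 1), here is a sketch with an invented Approx type (the type and its error-propagation rules are made up for the example):

# A hypothetical number type carrying a crude error bound. Once the
# primitive operations are defined, generic code composes with it.
struct Approx <: Number
    value::Float64
    err::Float64
end

Base.:+(a::Approx, b::Approx) = Approx(a.value + b.value, a.err + b.err)
Base.:*(a::Approx, b::Approx) =
    Approx(a.value * b.value, abs(a.value) * b.err + abs(b.value) * a.err)
Base.zero(::Type{Approx}) = Approx(0.0, 0.0)

# Generic code written with no knowledge of Approx now "just works":
xs = [Approx(1.0, 0.1), Approx(2.0, 0.2), Approx(3.0, 0.1)]
total = sum(xs)    # sum only needs + (and zero for empty collections)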

Does Law of Demeter also account for standard classes?

Assuming the following code:
requiredIssue.get().isDone()
where requiredIssue is an Optional and it has been verified that requiredIssue.isPresent(). Does this code break the Law of Demeter? Technically there is a tight coupling here between my class and Optional now, because isDone() now relies on get() working properly.
But is it not reasonable to assume for the standard library to work consistently?
In practical terms, the Law of Demeter means that if you have a class or function that depends on the value of isDone(), you should not pass requiredIssue to that class or function. You should simply pass the value of isDone(). If you find that following this principle leads to a proliferation of fields or parameters, you're probably violating the Single Responsibility Principle.
First,
requiredIssue.get().isDone()
does violate the Law of Demeter, which clearly enumerates the objects you can call methods on. Whatever object get() returns is not in that list. For more detail on the Law itself, see this article.
Second, the concrete example with Optional. Calling get() on an Optional has many problems and should be avoided. The Law of Demeter (which is a sort of canary in a coal mine for bad code) correctly indicates that this construct is flawed. The problems range from temporal coupling with the isPresent() call, and the technical leak of having to call isPresent() in the first place, to copying the isPresent() logic all over the place. Most of the time, if you feel the need to use get(), you are already doing it wrong.
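As a small illustration of avoiding get() altogether (the Issue type is hypothetical, and this uses Java 16+ record syntax):

import java.util.Optional;

class Example {
    // Hypothetical domain type, purely for illustration.
    record Issue(boolean done) {
        boolean isDone() { return done; }
    }

    static boolean requiredIssueDone(Optional<Issue> requiredIssue) {
        // Instead of guarding requiredIssue.get().isDone() with isPresent(),
        // let Optional handle the absent case in a single expression.
        return requiredIssue.map(Issue::isDone).orElse(false);
    }
}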
Third, the more general question of whether standard classes should be basically exempt from the Law of Demeter. They should not be exempt, if you are trying to do object-oriented programming. You have to keep in mind that some of the JDK, and most of the JEE are not actually meant to be used in an object-oriented environment. Sure, they use Java, which is a semi-object-oriented language, but they are expected to be used in a "mixed-paradigm" context, in which procedural designs are allowed. Just think of "Service" classes, pure "Data" objects (like Beans), layering, etc. These come from our procedural past, which some argue is still relevant and useful today.
Regardless of where you stand on that debate, some of the standard classes/libraries/specifications are just not compatible with the Law of Demeter, or with object-orientation in general. If you are willing to compromise on object-oriented principles (like most projects), that is not a problem. If you are actually not willing to compromise on a good design, but are forced to do so by libraries or standard classes, the solution is not to discuss the Law of Demeter away, but to recognize (and accept) that you are making a conscious decision to violate it.
So, is it reasonable to assume the standard library works? Sure! That does not mean they are exempt from good design principles, like the Law of Demeter, if you really want to stick with object-orientation.
Law of Demeter only applies weakly, and at system boundaries (almost never the object level, sometimes at the module level).
It's only a problem when you are forced to make 'public' a class in your module/system that would not otherwise have to be public. IMHO making this mistake is, and/or should be, rare, and avoiding it is common sense, even for mid-level devs.
As you can see, the standard classes (or .NET Framework/Core) are NOT a violation, because oftentimes the objects you see in their double-dots are meant to be public anyway.
Think about the design. If you are following GRASP principles (and DDD) when choosing what kinds of objects to make, the LoD will just get in the way and lead to increased coupling.
It's nuanced: the LoD is not complete crap, it's just not helpful and better ignored, while the obscure problems it's meant to solve are better addressed with solutions (GRASP, DDD) that fix the root cause instead of just outlawing the symptoms.
More reading from people on the internet who also agree with me (which must mean I'm right, huh?):
https://www.tedinski.com/2018/12/18/the-law-of-demeter.html
https://naildrivin5.com/blog/2020/01/22/law-of-demeter-creates-more-problems-than-it-solves.html
https://wiki.c2.com/?LawOfDemeterIsInvalid

ECS with Go - circular imports

I'm exploring both Go and Entity-Component-Systems. I understand how ECS works, and I'm trying to replicate what seems to be the go-to document of ECS, namely http://cowboyprogramming.com/2007/01/05/evolve-your-heirachy/
For performance, the document recommends using static arrays of every component type. That is, not arrays of component interfaces (arrays of pointers). The problem with this in Go is circular imports.
I have one package, ecs, which contains the definitions for Entity, Component and System types/interfaces as well as an EntityManager. Another package, ecs/components, contains the various components. Obviously, the ecs/components package depends on ecs. But, to declare arrays of specific components in EntityManager, ecs would depend on ecs/components, therefore creating a circular import.
Is there any way of avoiding this? I am aware that normally a high-level system should not depend on lower systems. I also want to point out that using an array of pointers is probably fast enough for my purposes, but I'm interested in possible workarounds (for future reference).
Thank you for your help!
For performance, the document recommends using static arrays of every component type.
I'm just going to start off by saying that I may be blind, but I ctrl+F'd and read that document multiple times and didn't see anything close to that. (Certainly some optimizations could be made this way with regard to things like avoiding cache misses, but I'm dubious that it in any way outweighs the clerical overhead.)
There's a tempting-looking answer to the exact question you asked first: the . import. A package with an import statement like import . "some/other/package" treats that package's exported identifiers as its own. Note, however, that this does not actually lift Go's ban on import cycles -- the compiler still rejects them at the package level -- and it badly hurts readability besides. Don't do this.
Unfortunately, without merging the packages, you won't be able to do this (without using interfaces, I mean). Don't fear, though. The article you posted explicitly says this under "implementation details".
Giving each component a common interface means deriving from a base class with virtual functions. This introduces some additional overhead. Do not let this turn you against the idea, as the additional overhead is small, compared to the savings due to simplification of objects.
It's outright telling you to use interfaces (okay, C++ virtual inheritance, but close enough). It's okay, it's necessary. Especially if you want two slightly different AI components or something, it's a godsend then.
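To sketch what the interface route can look like in Go (the names here are invented, not from any real library), with the dependency pointing only one way -- ecs/components imports ecs, never the reverse:

package ecs

// Component is the common interface every component satisfies.
type Component interface {
    // Name identifies the component type, e.g. "position" or "ai".
    Name() string
}

// Entity is an ID plus its components, looked up by name.
type Entity struct {
    ID         uint64
    components map[string]Component
}

func NewEntity(id uint64) *Entity {
    return &Entity{ID: id, components: make(map[string]Component)}
}

func (e *Entity) Add(c Component)           { e.components[c.Name()] = c }
func (e *Entity) Get(name string) Component { return e.components[name] }

A Position component in ecs/components then just implements Name() and imports ecs; the ecs package never needs to know it exists.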

How to identify that code is over abstracted?

What measures can be used to identify that code is over-abstracted and very hard to understand, and what can be done to reduce the level of abstraction?
"Simplicity over complexity, complexity over complicatedness"
So there's a benefit to abstracting something only if you are "de-leveling" complicatedness to complexity. The reasons to do that can vary: better modularity, better encapsulation, etc.
Identifying over-abstraction is a chicken-and-egg problem. In order to reduce over-abstraction you need to understand the actual reason behind the code. That includes understanding the idea of the particular abstraction itself (as opposed to calling it over-abstracted because of a lack of understanding). And that's not enough: you need to know a better, simpler solution in order to prove that it's over-abstracted.
If you are looking for a tool that could do this in your place, look no more: only a mind can reliably judge that.
I will give an answer that will get a LOT of down votes!
If the code is written in an OO language, it is necessarily heavily over-abstracted. The purer the language, the worse the problem.
Abstraction should be used with great caution. If in doubt, always use concrete data structures. (You can always abstract later; that is easier than de-abstraction. :)
You must be very certain you have the right abstraction in your current context, and you must be very sure that concept will stand the test of change. Abstraction has a high price in performance of both the code and the coder.
Some weak tests for over-abstraction: if the data structure is a product type (a struct in C) and the programmer has written a get and set method for each field, they have utterly failed to provide any real abstraction, have disabled operators like C's increment for no purpose, and have simply not understood that the struct field names are already the abstract representation of a product. Duplicating and laming up the interface is not a good idea.
A good test for the product case is whether there exist any data invariants to maintain. For example, a pair of integers representing a rational number is almost sufficient; there's little need for any abstraction, because all pairs are valid except when the denominator is zero. However, for performance reasons one may choose to maintain an invariant: typically the denominator is required to be greater than zero, and the numerator and denominator to be relatively prime. To ensure the invariant, the product representation is encapsulated: the initial value is protected by a constructor, and the methods are constrained to maintain the invariant.
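A minimal sketch of that encapsulation (in Rust, purely for illustration; the type and method names are invented):

/// A rational kept in canonical form: den > 0 and gcd(num, den) == 1.
/// The fields are private so the invariant cannot be broken from outside.
pub struct Rational {
    num: i64,
    den: i64,
}

fn gcd(mut a: i64, mut b: i64) -> i64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a.abs()
}

impl Rational {
    /// The constructor establishes the invariant once...
    pub fn new(num: i64, den: i64) -> Option<Rational> {
        if den == 0 {
            return None; // the one invalid pair
        }
        let sign = if den < 0 { -1 } else { 1 };
        let g = gcd(num, den);
        Some(Rational { num: sign * num / g, den: sign * den / g })
    }

    /// ...and every method is written to preserve it.
    pub fn mul(&self, other: &Rational) -> Rational {
        Rational::new(self.num * other.num, self.den * other.den)
            .expect("nonzero denominators stay nonzero")
    }
}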
To fix code I recommend these steps:
Document the representation invariants the abstraction is maintaining
Remove the abstraction (methods) if you can't find strong invariants
Rewrite the code that used those methods to access the data directly.
This procedure only works for low level abstraction, i.e. abstraction of small values by classes.
Over abstraction at a higher level is much harder to deal with. Ideally you'd refactor the code repeatedly, checking to see after each step it continues to work. However this will be hard, and sometimes a major rewrite is required, rather than a refinement. It's probably not worth it unless the abstraction is so far off base it is not tenable to continue to maintain it.
Download Magento and have a look at the code, read some documents on it and have a look at their ERD: http://www.magentocommerce.com/wiki/_media/doc/magento---sample_database_diagram.png?cache=cache
I'm not joking: this is over-abstraction. Trying to please everyone and cover every base is a terrible idea and makes life extremely difficult for everyone.
Personally I would say that "What is the ideal level of abstraction?" is a subjective question.
I don't like code that uses a new line for every atomic operation, but I also don't like 10 nested operations within one line.
I like the use of recursive functions, but I don't appreciate recursion for the sole sake of recursion.
I like generics, but I don't like (nested) generic functions that e.g. use different code for each specific type that's expected...
It is a matter of personal opinion as well as common sense. Does this answer your question?
I completely agree with what @ArnisLapsa wrote:
"Simplicity over complexity, complexity over complicatedness"
And that
an abstraction is used to "de-level" those, from complicated to complex
(and from complex to simpler)
Also, as stated by @MartinHemmings, a good abstraction is quite subjective, because we don't all think the same way. And actually our way of thinking changes with time, so something that someone finds simple might look complex to others, and might even become simpler with more experience. E.g., a monadic operation is trivial for a functional programmer, but can be seriously confusing for others. Similarly, a design with mutable objects communicating with each other can feel natural for some and un-trackable for others.
That being said, I would like to add a couple of indicators. Note that this applies to abstractions used in code-base, not "paradigm abstraction" such as everything-is-a-function, or everything-is-designed-as-objects. So:
To the people it concerns, the abstraction should be conceptually simpler than the other alternatives, without looking at the implementation. If you find that thinking through all the possible cases is simpler than reasoning with the abstraction, then the abstraction is not suitable (for you).
Its implementation should reason only about the abstraction, not about the specific cases it will be used for. As soon as the abstraction's implementation has parts made for specific cases, that indicates an "unfit" abstraction. And increasing the generalization to cope with each new case is going the wrong way (and tends to fall into the next issue).
A very common indicator of over-abstraction I have found (and actually fell for) is abstractions that represent more than what is needed, now. As much as possible, they should allow you to do exactly what is required, but nothing more. For example, say you're thinking of, or already have, a "2d point" abstraction for which you can define all the operators you need. Then you have another need that could really be a "4d point" similar to the 2d one. Don't start to use an "N-dimensional point" abstraction, especially thinking that you might need it later. Maybe you'll never have anything other than 2d and 4d points (because the rest stays "a good idea" in the backlog forever), but instead some requirement pops up to convert 4d points into pairs of 2d points, and that is going to be hard to generalize to n dimensions. So each abstraction can be checked to cover, and only cover, the actual needs. In my point example, the "n-dimensional" complexity is actually only used to cope with the 2d and 4d cases (and the 4d might not even be used that much). (A sketch of this follows the list.)
Finally, from a more global point of view, a code-base that has many unrelated abstractions is an indicator that the dev team tends to abstract every little issue, so probably many of those abstractions are, or have become, over-abstractions.
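As promised above, here is the point example as a hypothetical sketch (in Rust, names invented): concrete 2d and 4d types plus the one conversion actually requested, instead of a speculative N-dimensional abstraction.

// Cover only the actual needs: two concrete point types...
#[derive(Clone, Copy)]
struct Point2 { x: f64, y: f64 }

#[derive(Clone, Copy)]
struct Point4 { x: f64, y: f64, z: f64, w: f64 }

impl Point4 {
    // ...and the one requirement that actually showed up: splitting
    // a 4d point into a pair of 2d points.
    fn split(self) -> (Point2, Point2) {
        (Point2 { x: self.x, y: self.y }, Point2 { x: self.z, y: self.w })
    }
}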

Modelling / documenting functional programs

I've found UML useful for documenting various aspects of OO systems, particularly class diagrams for overall architecture and sequence diagrams to illustrate particular routines. I'd like to do the same kind of thing for my clojure applications. I'm not currently interested in Model Driven Development, simply on communicating how applications work.
Is UML a common / reasonable approach to modelling functional programming? Is there a better alternative to UML for FP?
the "many functions on a single data structure" approach of idiomatic Clojure code waters down the typical "this uses that" UML diagram because many of the functions end up pointing at map/reduce/filter.
I get the impression that because Clojure is a somewhat more data centric language a way of visualizing the flow of data could help more than a way of visualizing control flow when you take lazy evaluation into account. It would be really useful to get a "pipe line" diagram of the functions that build sequences.
map and reduce etc would turn these into trees
Most functional programmers prefer types to diagrams. (I mean types very broadly speaking, to include such things as Caml "module types", SML "signatures", and PLT Scheme "units".) To communicate how a large application works, I suggest three things:
Give the type of each module. Since you are using Clojure you may want to check out the "Units" language invented by Matthew Flatt and Matthias Felleisen. The idea is to document the types and the operations that the module depends on and that the module provides.
Give the import dependencies of the interfaces. Here a diagram can be useful; in many cases you can create a diagram automatically using dot. This has the advantage that the diagram always accurately reflects the code.
For some systems you may want to talk about important dependencies of implementations. But usually not—the point of separating interfaces from implementations is that the implementations can be understood only in terms of the interfaces they depend on.
There was recently a related question on architectural thinking in functional languages.
It's an interesting question (I've upvoted it), I expect you'll get at least as many opinions as you do responses. Here's my contribution:
What do you want to represent on your diagrams? In OO, one answer to that question might be, considering class diagrams: state (or attributes, if you prefer) and methods. So, obviously, I would suggest that class diagrams are not the right thing to start from, since functions have no state and, generally, implement one operation (aka method). Do any of the other UML diagrams provide a better starting point for your thinking? The answer is probably yes, but you need to consider what you want to show and find that starting point yourself.
Once you've written a (sub-)system in a functional language, then you have a (UML) component to represent on the standard sorts of diagram, but perhaps that is too high-level, too abstract, for you.
When I write functional programs, which is not a lot, I admit, I tend to document functions as I would document mathematical functions (I work in scientific computing, with lots of maths knocking around, so this is quite natural for me). For each function I write:
an ID;
sometimes, a description;
a specification of the domain;
a specification of the co-domain;
a statement of the rule, i.e. the operation that the function performs;
sometimes I write post-conditions too though these are usually adequately specified by the co-domain and rule.
I use LaTeX for this; it's good for mathematical notation, but any other reasonably flexible text or word processor would do. As for diagrams, no, not so much. But that's probably a reflection of the primitive state of the design of the systems I program functionally. Most of my computing is done on arrays of floating-point numbers, so most of my functions are very easy to compose ad hoc and the structuring of a system is very loose. I can imagine a diagram which showed functions as nodes and inputs/outputs as edges between nodes -- in my case there would be edges between each pair of nodes in most cases. I'm not sure drawing such a diagram would help me at all.
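As a hypothetical example of one such entry, typeset in LaTeX (the function ID and its specification are invented; assumes the amsmath and amssymb packages):

% One function record following the scheme above.
\paragraph{F-12: \texttt{normalise}}
\begin{align*}
\text{domain:}    &\quad \mathbb{R}^n \setminus \{\mathbf{0}\} \\
\text{co-domain:} &\quad \{\,\mathbf{x} \in \mathbb{R}^n : \lVert \mathbf{x} \rVert_2 = 1\,\} \\
\text{rule:}      &\quad \mathbf{x} \mapsto \mathbf{x} / \lVert \mathbf{x} \rVert_2
\end{align*}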
I seem to be coming down on the side of telling you no, UML is not a reasonable way of modelling functional systems. Whether it's common SO will tell us.
This is something I've been trying to experiment with also, and after a few years of programming in Ruby I was used to class/object modeling. In the end I think the types of designs I create for Clojure libraries are actually pretty similar to what I would do for a large C program.
Start by doing an outline of the domain model. List the main pieces of data being moved around and the primary functions being performed on this data. I write these in my notebook, and a lot of the time it will be just a name with 3-5 bullet points underneath it. This outline will probably be a good approximation of your initial namespaces, and it should point out some of the key high-level interfaces.
If it seems pretty straightforward then I'll create empty functions for the high-level interface, and just start filling them in. Typically each high-level function will require a couple of support functions, and as you build up the whole interface you will find opportunities for sharing more code, so you refactor as you go.
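For instance, such a stubbed-out high-level interface might start as nothing more than this (a hypothetical inventory library; all names invented):

;; The domain outline turned into a namespace of empty high-level
;; functions, to be filled in as the design firms up.
(ns inventory.core)

(defn add-item
  "Register a new item, returning the updated inventory."
  [inventory item]
  ;; TODO: fill in once the shape of `inventory` settles.
  inventory)

(defn reserve
  "Reserve n units of item-id, returning the updated inventory."
  [inventory item-id n]
  inventory)

(defn- decrement-stock
  "Support function that reserve will eventually need."
  [inventory item-id n]
  inventory)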
If it seems like a more difficult problem then I'll start diagramming out the structure of the data and the flow of key functions. Oftentimes the diagram and conceptual model that make the most sense will depend on the type of abstractions you choose to use in a specific design. For example, if you use a dataflow library for a Swing GUI then a dependency graph would make sense, but if you are writing a server for processing relational database queries then you might want to diagram pools of agents and pipelines for processing tuples. I think these kinds of models and diagrams are also much more descriptive in terms of conveying to another developer how a program is architected. They show more of the functional connectivity between the parts of your system, rather than the pretty non-specific information conveyed by something like UML.
