What are side effects in functional programming?

I am new to Java 8 and came across this definition of functional programming: "A program created using only pure functions. No side effects allowed."
One of the side effects listed is "Modifying a data structure in place".
I don't understand this, because at some point we need to talk to a database to store, retrieve, or update data.
If modifying a database is not functional, how do we talk to a database in functional programming?

"Modifying a data structure in place" means you directly manipulate the input data structure (e.g. a List). "Pure functions" means:
the result is only a function of its input and not of some other hidden state
the function can be applied multiple times to the same input, producing the same result. It will not change the input.
In Object Oriented Programming, you define behaviour of objects. Behaviour could be to provide read access to the state of the object, write access to it, or both. When combining operations of different concerns, you could introduce side effects.
For example, consider a Stack and its pop() operation. It produces a different result on every call because it changes the state of the stack.
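A minimal sketch of that contrast, assuming a Deque is used as the stack. The class name PureVsImpure and the helper add are made up for illustration:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical example class; not part of any library.
final class PureVsImpure {
    // Pure: the result depends only on the arguments, and nothing is modified.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);

        // Impure: pop() returns the top element AND removes it,
        // so two identical calls give different results.
        int first = stack.pop();   // 2
        int second = stack.pop();  // 1
        System.out.println(first + " then " + second);

        // Pure: the same arguments always give the same result.
        System.out.println(add(1, 2) == add(1, 2)); // true
    }
}
```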
In functional programming, you apply functions to immutable values. Functions represent a flow of data, not a change in state. So the functions themselves are stateless. And the result of a function is either the original input or a different value than the input, but never a modified input.
OO also has functions, but they aren't pure in all cases. Take sorting: in non-functional programming you rearrange the elements of a list within the original data structure ("in place"). In Java, this is what Collections.sort() does.
In functional programming, you would apply the sort function on an input value (a List) and thereby produce a new value (a new List) with sorted values. The function itself has no state and the state of the input is not modified.
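The two sorting styles can be put side by side in Java 8. This is only a sketch: the class SortingStyles and the method sorted are illustrative names, and the stream pipeline is one of several ways to get a new sorted list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; not part of any library.
final class SortingStyles {
    // Functional style: returns a new sorted list and leaves the input untouched.
    static List<Integer> sorted(List<Integer> input) {
        return input.stream().sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(3, 1, 2));

        List<Integer> result = sorted(numbers);
        System.out.println(numbers); // [3, 1, 2]
        System.out.println(result);  // [1, 2, 3]

        // In-place style: the original list itself is rearranged.
        Collections.sort(numbers);
        System.out.println(numbers); // [1, 2, 3]
    }
}
```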
So to generalize: given the same input value, applying a function to this value produces the same result value.
Regarding database operations: the contents of the database represent a state, which is the combination of all its stored values, tables, etc. (a "snapshot"). Of course you could apply a function to this data, producing new data. Typically you store the results of operations back to the db, thus changing the state of the entire system, but that doesn't mean you change the state of the function or its input data. Reapplying the function doesn't violate the pure-function constraints, because you apply the function to new input data. But looking at the entire system as a "data structure" would violate the constraint, because the function application changes the state of the "input".
So the entire database system could hardly be considered functional, but of course you could operate on the data in a functional way.
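One way to picture this is to keep the computation pure and confine the reads and writes to the edges. In the sketch below a HashMap stands in for the database, which is purely an assumption for illustration; a real system would go through JDBC or a similar API. The names PureCoreExample and deposit are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; the map is only a stand-in for a database table.
final class PureCoreExample {
    // Pure: computes a new balance from its inputs and touches no storage.
    static long deposit(long balance, long amount) {
        return balance + amount;
    }

    public static void main(String[] args) {
        Map<String, Long> accounts = new HashMap<>();
        accounts.put("alice", 100L);

        long current = accounts.get("alice");  // read: effect at the edge
        long updated = deposit(current, 50L);  // pure computation
        accounts.put("alice", updated);        // write: effect at the edge

        System.out.println(accounts.get("alice")); // 150
    }
}
```

The function deposit can be tested and reasoned about without any database at all; only the two lines around it depend on the state of the system.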
But Java allows you to do both (OO and FP) and even mix both paradigms, so you could choose whatever approach fits your needs best.
Or, to quote from this answer:
If you have several needs intermixed, mix your paradigms. Do not
restrict yourself to only using the lower right corner of your
toolbox.

Related

Redux: is state normalization necessary for composition relationship?

As we know, when saving data in a redux store, it's supposed to be transformed into a normalized state. So embedded objects should be replaced by their ids and saved within a dedicated collection in the store.
I am wondering, if that also should be done if the relationship is a composition? That means, the embedded data isn't of any use outside of the parent object.
In my case the embedded objects are registrations, and the parent object is a (real life) event. Normalizing this data structure to me feels like a lot of boilerplate without any benefit.
State normalization is more than just how you access the data by traversing the object tree. It also has to do with how you observe the data.
Part of the reason for normalization is to avoid unnecessary change notifications. Objects are treated as immutable so when they change a new object is created so that a quick reference check can indicate if something in the object changed. If you nest objects and a child object changes then you should change the parent. If some code is observing the parent then it will get change notifications every time a child changes even though it might not care. So depending on your scenario you may end up with a bunch of unnecessary change notifications.
This is also partly why you see lists of entities broken out into an array of identifiers and a map of objects. In relation to change detection, this allows you to observe the list (whether items have been added or removed) without caring about changes to the entities themselves.
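The effect on change detection can be sketched outside of Redux as well. The Java class below is hypothetical (NormalizedState, ids, byId and withEntity are illustrative names, not a Redux API): it keeps a list of identifiers next to a map of entities, and replacing one entity produces a new map while the identifier list keeps its reference.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical normalized state: a list of ids plus a map of entities by id.
final class NormalizedState {
    final List<String> ids;          // which entities exist, in order
    final Map<String, String> byId;  // entity id -> entity data (just a name here)

    NormalizedState(List<String> ids, Map<String, String> byId) {
        this.ids = ids;
        this.byId = byId;
    }

    // Returns a new state with one entity replaced; the ids list is reused as-is.
    NormalizedState withEntity(String id, String data) {
        Map<String, String> copy = new HashMap<>(byId);
        copy.put(id, data);
        return new NormalizedState(ids, Collections.unmodifiableMap(copy));
    }

    public static void main(String[] args) {
        Map<String, String> initial = new HashMap<>();
        initial.put("r1", "Ann");
        initial.put("r2", "Bob");
        NormalizedState before = new NormalizedState(
                Collections.unmodifiableList(Arrays.asList("r1", "r2")),
                Collections.unmodifiableMap(initial));

        NormalizedState after = before.withEntity("r2", "Bobby");

        // Reference checks: an observer of the list sees no change,
        // an observer of the entities does.
        System.out.println(before.ids == after.ids);   // true
        System.out.println(before.byId == after.byId); // false
    }
}
```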
So it depends on your usage. Just be aware of the cost of observing and the impact your state shape has on that.
I don't agree that data is "supposed to be [normalized]". Normalizing is a useful structure for accessing the data, but you're the architect to make that decision.
In many cases, the data stored will be an application singleton and a descriptive key is more useful than forcing some kind of id.
In your case I wouldn't bother unless there is excessive data duplication, especially because you would then have to denormalize for the object to function properly.

How to "keep track" of user activity with functional programming?

tl;dr
In a program that calls a function onEnterFrame on each frame, how do you store and mutate state? For instance, in a level editor or a painting program, keeping track of state and making small incremental changes is tempting. What is the most performant way to handle this with minimal mutation of global state?
long version:
In an interactive program that accepts input from the user, like mouse clicks and key strokes, we may need to keep track of the state of the data model. For instance:
Are some elements selected?
Is the mouse cursor hovering over an element, which one?
How long is the mouse button held down? Is this a click or a drag?
We also sometimes need to make small changes to a large model:
In a level editor, we may need to add one wall to an existing large set of prefabs. You don't want to recreate the set, no?
From what I have read so far in Professor Frisby's Mostly Adequate Guide, there are many functional solutions to problems that involve extracting a piece of data from some input source, performing a computation on it, and passing the result to some output.
Sometimes an app lets the user interact and perform a sequence of mutations on data. For instance, what if a program lets the user draw on a canvas (like Paint), and we need to store the state of the painting as well as the actions that led to that state (for undo and logging/debugging purposes)?
What state is acceptable to store and what should we absolutely avoid?
Currently my conclusion is that we should never store state that we only need temporarily; we should pass it directly to the function that needs it.
But what if there are several functions that need a specific computation? Like the case in which we check if the mouse's cursor is hovering over a specific area, why would we want to recompute that?
Are there ways to further minimize mutations of global state?
Storing state isn't the problem. It is mutating global state that is the problem. There are solutions to handling this. One that comes to mind is the State Monad. However, I am not sure this is ideal for undoing operations. But it is a place to start.
If you just want to look at the problem as an initial state and a set of operations then you can think of the operations as a List that can be traversed (with the head being the latest operation). Undoing a set of n operations could be accomplished by traversing the first n elements of the list and cons-ing the inverse of these operations to the list.
That way you don't modify global state at all.
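A rough sketch of the list-of-operations idea, under simplifying assumptions: the state is a single int, each operation is an integer delta, and its inverse is the negated delta. UndoLog and its methods are hypothetical names.

```java
// Hypothetical immutable operation log; the head is the latest operation.
final class UndoLog {
    final int delta;      // the operation: add this amount to the state
    final UndoLog older;  // the rest of the history; null marks the empty log

    UndoLog(int delta, UndoLog older) {
        this.delta = delta;
        this.older = older;
    }

    // Replays every operation on top of the initial state.
    static int apply(int initial, UndoLog log) {
        int state = initial;
        for (UndoLog cell = log; cell != null; cell = cell.older) {
            // Addition commutes, so replaying newest-first is fine here;
            // order-sensitive operations would need oldest-first replay.
            state += cell.delta;
        }
        return state;
    }

    // Undoes the latest n operations by cons-ing their inverses onto the log.
    static UndoLog undo(UndoLog log, int n) {
        UndoLog result = log;
        UndoLog cell = log;
        for (int i = 0; i < n && cell != null; i++, cell = cell.older) {
            result = new UndoLog(-cell.delta, result);
        }
        return result;
    }

    public static void main(String[] args) {
        // History: +2, then +3, then +5 (latest at the head).
        UndoLog log = new UndoLog(5, new UndoLog(3, new UndoLog(2, null)));
        System.out.println(apply(0, log));    // 10

        UndoLog undone = undo(log, 2);        // cancels +5 and +3
        System.out.println(apply(0, undone)); // 2
        System.out.println(apply(0, log));    // 10, the old log is unchanged
    }
}
```

Because undo only adds cells in front of the existing log, the full history (including the undone steps) stays available for logging or redo.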

Efficiency of list operations in functional languages

In functional languages like Racket or SML, we usually perform list operations in recursive calls (pattern matching, list append, list concatenation...). However, I'm not sure how these operations are generally implemented in functional languages. Will operations like creating, updating, or deleting elements in a list return a whole new copy of the list? I once read in a book an example of a disadvantage of functional programming: every time a database is updated, a whole new copy of the database is returned.
I questioned this example, since data in FP is inherently immutable, so creating lists from existing lists should not create a whole new copy. Instead, a new list is simply a different collection of references to existing objects in other lists, based on filtering criteria.
For example, list A = [a,b,c], and list B=[1,2,3], and I created a new list that contains the first two elements from the existing lists, that is C=[a,b,1,2]. This new list simply contains references to a,b, from A and 1,2 from B. It should not be a new copy, because data is immutable.
So, to update an element in a list, it should only take linear time to find the element, create a new value, and create a new list with the same elements as the old list except for the updated one. To create the new list, the runtime merely updates the next pointer of the previous element. If a list holds non-atomic elements (i.e. lists, trees...), and only one atomic element inside one of them is updated, this process is applied recursively to the non-atomic element until the atomic element is updated as described above. Is this how it should be implemented?
If someone creates a whole deep copy of a list every time a list is created from existing lists, or an element is added, updated, or deleted, they are doing it wrong, aren't they?
Another thing: when the program environment is updated (i.e. a new key/value entry is added for a new variable, so we can refer to it later), that doesn't violate the immutability property of functional programming, does it?
You are absolutely correct! FP languages with immutable data will NEVER do a deep copy (unless they are really poorly implemented). As the data is immutable there are never any problems in reusing it. It works in exactly the same way with all other structures. So for example if you are working with a tree structure then at most only the actual tree will be copied and never the data contained in it.
So while the copying sounds very expensive, it is much less than you would first think if you are coming from an imperative/OO background (where you really do have to copy, as you have mutable data). And there are many benefits to having immutable data.
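To make the sharing concrete, here is a small sketch of an immutable singly linked list in Java. It is not how any particular FP runtime is implemented; Cons and set are illustrative names. Replacing one element rebuilds only the cells in front of it and reuses everything behind it, and the element objects themselves are never copied.

```java
// Hypothetical immutable cons list of strings.
final class Cons {
    final String head;
    final Cons tail; // null marks the end of the list

    Cons(String head, Cons tail) {
        this.head = head;
        this.tail = tail;
    }

    // Returns a new list with the element at index replaced.
    // Cells before index are rebuilt; the cells after it are shared.
    static Cons set(Cons list, int index, String value) {
        if (list == null) {
            throw new IndexOutOfBoundsException();
        }
        if (index == 0) {
            return new Cons(value, list.tail);
        }
        return new Cons(list.head, set(list.tail, index - 1, value));
    }

    public static void main(String[] args) {
        Cons original = new Cons("a", new Cons("b", new Cons("c", null)));
        Cons updated = set(original, 1, "B");

        System.out.println(original.tail.head); // b, the original is unchanged
        System.out.println(updated.tail.head);  // B
        // The cell holding "c" is the same object in both lists.
        System.out.println(original.tail.tail == updated.tail.tail); // true
        // The element "a" is referenced by both lists, not copied.
        System.out.println(original.head == updated.head);           // true
    }
}
```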

Prolog association list

I am writing a simple program safety checker in Prolog and I need a data structure to hold variable valuations. Since I want to detect when I am visiting the same state again, this structure must support some reasonable comparison semantics, so I can store visited states in a set.
library(avl) has convenient getter/setter interface.
The problem is, AVL holding the same mapping can take multiple forms.
Thus two identical states would be considered distinct if their AVL representation differs.
A structure holding the mapping in ordered lists would be free of this problem. However, I can't find anything like that in the SICStus docs. Is there any standard structure that does what I need, or do I have to implement it myself?
There are ordered sets, but with library(avl) you can always convert the AVLs to ordered lists of key-value pairs and then compare those.

What is the difference between a map and a dictionary?

I know a map is a data structure that maps keys to values. Isn't a dictionary the same? What is the difference between a map and a dictionary [1]?
[1] I am not asking for how they are defined in language X or Y (which seems to be what generally people are asking here on SO), I want to know what is their difference in theory.
Two terms for the same thing:
"Map" is used by Java, C++
"Dictionary" is used by .Net, Python
"Associative array" is used by PHP
"Map" is the correct mathematical term, but it is avoided because it has a separate meaning in functional programming.
Some languages use still other terms ("Object" in Javascript, "Hash" in Ruby, "Table" in Lua), but those all have separate meanings in programming too, so I'd avoid them.
See here for more info.
One is an older term for the other. Typically the term "dictionary" was used before the mathematical term "map" took hold. Also, dictionaries tend to have a key type of string, but that's not 100% true everywhere.
Summary of Computer Science terminology:
a dictionary is a data structure representing a set of elements, with insertion, deletion, and tests for membership; the elements may be, but are not necessarily, composed of distinct key and value parts
a map is an associative data structure able to store a set of keys, each associated with one (or sometimes more than one - e.g. C++ multimap) value, with the ability to access and erase existing entries given only the key.
Discussion
Answering this question is complicated by programmers having seen the terms given more specific meanings in particular languages or systems they've used, but the question asks for a language agnostic comparison "in theory", which I'm taking to mean in Computing Science terms.
The terminology explained
The Oxford University Dictionary of Computer Science lists:
dictionary any data structure representing a set of elements that can support the insertion and deletion of elements as well as test for membership
For example, we have a set of elements { A, B, C, D... } that we've been able to insert and could start deleting, and we're able to query "is C present?".
The Computing Science notion of map though is based on the mathematical linguistic term mapping, which the Oxford Dictionary defines as:
mapping An operation that associates each element of a given set (the domain) with one or more elements of a second set (the range).
As such, a map data structure provides a way to go from elements of a given set - known as "keys" in the map, to one or more elements in the second set - known as the associated "value(s)".
The "...or more elements in the second set" aspect can be supported by an implementation in two distinct ways:
Many map implementations enforce uniqueness of the keys and only allow each key to be associated with one value, but that value might be able to be a data structure itself containing many values of a simpler data type, e.g. { {1, {"one", "ichi"}}, {2, {"two", "ni"}} } illustrates values consisting of pairs/sets of strings.
Other map implementations allow duplicate keys each mapping to the same or different values - which functionally satisfies the "associates...each [key] element...with...more [than one] [value] elements" case. For example, { {1, "one"}, {1, "ichi"}, {2, "two"}, {2, "ni"} }.
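Both approaches can be sketched in Java. The standard library has no multimap type, so the second style is modelled here as a plain list of key/value pairs, which is an assumption made for illustration; OneToManyMaps and its methods are made-up names.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; not part of any library.
final class OneToManyMaps {
    // Style 1: unique keys, each mapped to a collection of values.
    static Map<Integer, List<String>> uniqueKeys() {
        Map<Integer, List<String>> m = new HashMap<>();
        m.put(1, Arrays.asList("one", "ichi"));
        m.put(2, Arrays.asList("two", "ni"));
        return m;
    }

    // Style 2: duplicate keys, modelled as a list of key/value pairs.
    static List<Map.Entry<Integer, String>> duplicateKeys() {
        List<Map.Entry<Integer, String>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>(1, "one"));
        pairs.add(new SimpleEntry<>(1, "ichi"));
        pairs.add(new SimpleEntry<>(2, "two"));
        pairs.add(new SimpleEntry<>(2, "ni"));
        return pairs;
    }

    // Lookup by key for style 2: collects every value stored under the key.
    static List<String> valuesFor(List<Map.Entry<Integer, String>> pairs, int key) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, String> p : pairs) {
            if (p.getKey() == key) {
                out.add(p.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(uniqueKeys().get(1));           // [one, ichi]
        System.out.println(valuesFor(duplicateKeys(), 1)); // [one, ichi]
    }
}
```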
Dictionary and map contrasted
So, using the strict Comp Sci terminology above, a dictionary is only a map if the interface happens to support additional operations not required of every dictionary:
the ability to store elements with distinct key and value components
the ability to retrieve and erase the value(s) given only the key
A trivial twist:
a map interface might not directly support a test of whether a {key,value} pair is in the container, which is pedantically a requirement of a dictionary where the elements happen to be {key,value} pairs; a map might not even have a function to test for a key, but at worst you can see if an attempted value-retrieval-by-key succeeds or fails, then if you care you can check if you retrieved an expected value.
Communicate unambiguously to your audience
⚠ Despite all the above, if you use dictionary in the strict Computing Science meaning explained above, don't expect your audience to follow you initially, or be impressed when you share and defend the terminology. The other answers to this question (and their upvotes) show how likely it is that "dictionary" will be synonymous with "map" in the experience of most programmers. Try to pick terminology that will be more widely and unambiguously understood: e.g.
associative container: any container storing key/value pairs with value-retrieval and erasure by key
hash map: a hash table implementation of an associative container
hash set enforcing unique keys: a hash table implementation of a dictionary storing element/values without treating them as containing distinct key/value components, wherein duplicates of the elements can not be inserted
balanced binary tree map supporting duplicate keys: ...
Crossreferencing Comp Sci terminology with specific implementations
C++ Standard Library
maps: map, multimap, unordered_map, unordered_multimap
other dictionaries: set, multiset, unordered_set, unordered_multiset
note: with iterators or std::find you can erase an element and test for membership in array, vector, list, deque etc, but the container interfaces don't directly support that because finding an element is spectacularly inefficient at O(N), in some cases insert/erase is inefficient, and supporting those operations undermines the deliberately limited API the container implies - e.g. deques should only support erase/pop at the front and back and not in terms of some key. Having to do more work in code to orchestrate the search gently encourages the programmer to switch to a container data structure with more efficient searching.
...may add other languages later / feel free to edit in...
My 2 cents.
Dictionary is an abstract class in Java, whereas Map is an interface. Since Java does not support multiple inheritance, if a class extends Dictionary, it cannot extend any other class.
Therefore, the Map interface was introduced.
Dictionary class is obsolete and use of Map is preferred.
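A short illustration of the relationship, using java.util.Hashtable, which both extends Dictionary and implements Map. The class name DictionaryVsMap is made up for this sketch.

```java
import java.util.Dictionary;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical example class; not part of any library.
final class DictionaryVsMap {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);

        // The same object can be viewed through either type.
        Dictionary<String, Integer> asDictionary = table; // legacy abstract class
        Map<String, Integer> asMap = table;               // interface

        System.out.println(asDictionary.get("a")); // 1
        System.out.println(asMap.get("a"));        // 1

        // New code would normally be written against Map.
        Map<String, Integer> preferred = new HashMap<>();
        preferred.put("a", 1);
        System.out.println(preferred.equals(asMap)); // true
    }
}
```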
Typically I assume that a map is backed by a hash table; it connotes an unordered store.
Dictionaries connote an ordered store.
There is a tree-based dictionary called a Trie.
In Lisp, it might look like this:
(a (n (d t)) n d )
Which encapsulates the words:
a
and
ant
an
ad
The traversal from the top to the leaf yields a word.
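A minimal trie sketch in Java for the same five words. The end-of-word flag is an assumption added here so that words ending on an inner node ("a", "an") can be told apart from mere prefixes; the class and method names are illustrative.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical minimal trie over characters.
final class Trie {
    private final Map<Character, Trie> children = new TreeMap<>();
    private boolean endOfWord; // true if a word ends at this node

    // Walks down the tree, creating nodes as needed, and marks the last one.
    void insert(String word) {
        Trie node = this;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Trie());
        }
        node.endOfWord = true;
    }

    // A word is present if its path exists and ends on a marked node.
    boolean contains(String word) {
        Trie node = this;
        for (char c : word.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return false;
            }
        }
        return node.endOfWord;
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        for (String w : new String[] {"a", "and", "ant", "an", "ad"}) {
            trie.insert(w);
        }
        System.out.println(trie.contains("ant")); // true
        System.out.println(trie.contains("an"));  // true
        System.out.println(trie.contains("at"));  // false
    }
}
```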
Not really the same thing. Maps are a subset of dictionary. Dictionary is defined here as having the insert, delete, and find functions. Map as used by Java (according to this) is a dictionary with the requirement that keys mapping to values are strictly mapped as a one-to-one function. A dictionary might have more than one key map to one value, or one key map to several values (like chaining in a hashtable), e.g. Twitter hashtag searches.
As a more "real world" example, looking up a word in a dictionary can give us a number of definitions for the same word, and when we find an entry that points us to another entry (see other word), a number of words for the same list of definitions. In the real world, maps are much broader, allowing us to have locations for names or names for coordinates, but also we can find a nearest neighbor or other attributes (populations, etc), so IMHO there could be argument for a greater expansion of the map type to possibly have graph based implementations, but it would be best to always assume just the key-value pair, especially since nearest neighbor and other attributes to the value could all just be data members of the value.
Java maps, despite the one-to-one requirement, can implement something more like a generalized dictionary if the value is generalized as a collection itself, or if the values are merely references to collections stored elsewhere.
Remember that Java maintainers are not the maintainers of ADT definitions, and that Java decisions are specifically for Java.
Other terms for this concept that are fairly common: associative array and hash.
Yes, they are the same, you may add "Associative Array" to the mix.
Using "Hashtable" or "Hash" often refers to the implementation.
These are two different terms for the same concept.
Hashtable and HashMap also refer to the same concept.
So, on a purely theoretical level:
A Dictionary is a value that can be used to locate a linked value.
A Map is a value that provides instructions on how to locate other values.
All collections that allow more than linear access (i.e. more than only get-first or get-last) are maps, as even a simple array has an index that maps to the correct value. So while a dictionary is a type of map, maps cover a much broader range of possible functions.
In practice it's usually the mapping function that defines the name, so a HashMap is a mapped data structure that uses a hashing algorithm to link the key to the value, whereas a Dictionary doesn't specify how the keys are linked to values, so it could be stored via a linked list, a tree, or any other algorithm. From the usage end you usually don't care what the algorithm is, only that it works, so you use a generic dictionary and shift to one of the other structures only when you need to enforce the type of algorithm.
The main difference is that a Map, requires that all entries(value & key pair) have a unique key. If collisions occur, i.e. when a new entry has the same key as an entry already in the collection, then collision handling is required.
Usually, we handle collisions using either Separate Chaining or Linear Probing.
A Dictionary allows for multiple entries to be linked to the same key.
When a Map has implemented Separate Chaining, then it tends to resemble a Dictionary.
I'm in a data structures class right now and my understanding is the dict() data type that can also be initialized as just dictionary = {} or with keys and values, is basically the same as how the list/array data type is used to implement stacks and queues. So, dict() is the type and maps are a resulting data structure you can choose to implement with the dictionary data type in the same way you can use the list type and choose to implement a stack or queue data structure with it.

Resources