I am a JavaScript developer on a journey to up my skills in functional programming. I recently ran into a wall when it comes to managing state. When searching for a solution I stumbled over the state monad in various articles and videos, but I have a really hard time understanding it. I am wondering if it is because I expect it to be something it is not.
The problem I am trying to solve
In a web client I am fetching resources from the back end. To avoid unnecessary traffic I am creating a simple cache on the client side which contains the already fetched data. The cache is my state. I want several of my modules to be able to hold a reference to the cache and query it for its current state, a state that may have been modified by another module.
This is of course not a problem in javascript since it is possible to mutate state but I would like to learn more about functional programming and I was hoping that the state monad would help me.
What I would expect
I had assumed that I could do something like this:
var state = State.of(1);
map(add(1), state);
state.evalState() // => 2
This obviously doesn't work. The state is always 1.
My question
Are my assumptions about the state monad wrong, or am I simply using it incorrectly?
I realize that I can do this:
var state = State.of(1);
var newState = map(add(1), state);
... and newState will be a state of 2. But here I don't really see the use of the state monad since I will have to create a new instance in order for the value to change. This to me seems to be what is always done in functional programming where values are immutable.
The purpose of the state monad is to hide the passing of state between functions.
Let's take an example:
The methods A and B need to use some state and mutate it, and B needs to use the state that A mutated. In a functional language with immutable data, this is impossible.
What is done instead is this: an initial state is passed to A, along with the arguments it needs, and A returns a result and a "modified" state -- really a new value, since the original wasn't changed. This "new" state (and possibly the result too) is passed into B with its required arguments, and B returns its result and a state that it (may have) modified.
Passing this state around explicitly is a PITA, so the State monad hides this under its monadic covers, allowing methods which need to access the state to get at it through get and set monadic methods.
To use the stateful computations A and B, we combine them together into a conglomerate stateful computation and give that conglomerate a beginning state (and arguments) to run with, and it returns a final "modified" state and result (after running things through A, B, and whatever else it was composed of).
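To make the explicit passing concrete, here is a minimal sketch in plain JavaScript. The names stepA, stepB and andThen, and the {value, state} result shape, are assumptions chosen for illustration; they do not come from any library.

```javascript
// Hypothetical stateful steps: each takes a state and returns
// a result plus a new state; nothing is mutated.
function stepA(state) {
  return { value: 'A saw ' + state.count, state: { count: state.count + 1 } };
}
function stepB(state) {
  return { value: 'B saw ' + state.count, state: { count: state.count * 2 } };
}

// Explicit threading: the caller passes each new state along by hand.
var initial = { count: 1 };
var afterA = stepA(initial);
var afterB = stepB(afterA.state);

// The same wiring hidden in a combinator, which is the job the
// state monad takes over. Here the first result value is simply dropped.
function andThen(first, second) {
  return function (state) {
    var r = first(state);
    return second(r.state);
  };
}
var combined = andThen(stepA, stepB)(initial);
```

Both routes end in the same state, and initial is left untouched; the combinator only removes the bookkeeping.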
From what you're describing it seems to me like you're looking for something more along the lines of the actor model of concurrency, where state is managed in an actor and the rest of the code interfaces with it through that, retrieving (a non-mutable version of) it or telling it to be modified via messages. In immutable languages (like Erlang), actors block waiting for a message, then process one when it comes in, then loop via (tail) recursion; they pass any modified state to the recursive call, and this is how the state gets "modified".
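The recursive loop described above can be sketched roughly as follows. This is a synchronous simulation with a fixed array standing in for the mailbox, so it only illustrates how state is handed to the next iteration; a real actor would block waiting for messages. The names handle and loop and the message shape are hypothetical.

```javascript
// Hypothetical message handler for a cache: returns the next state
// as a new object instead of mutating the current one.
function handle(cache, message) {
  if (message.type === 'put') {
    var next = {};
    Object.keys(cache).forEach(function (k) { next[k] = cache[k]; });
    next[message.key] = message.value;
    return next;
  }
  return cache; // unknown messages leave the state untouched
}

// The "actor loop": process one message, then recurse with the new state.
function loop(cache, mailbox) {
  if (mailbox.length === 0) { return cache; }
  return loop(handle(cache, mailbox[0]), mailbox.slice(1));
}

var empty = {};
var finalCache = loop(empty, [
  { type: 'put', key: 'user/1', value: 'Ada' },
  { type: 'put', key: 'user/2', value: 'Grace' }
]);
```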
As you say, though, since you're using JavaScript it's not much of an issue.
I'm trying to answer your question from the perspective of a Javascript developer, because I believe that this is the cause of your problem. Maybe you can specify the term Javascript in the headline and in the tags.
Transferring of concepts from Haskell to Javascript is basically a good thing, because Haskell is a very mature, purely functional language. It can, however, lead to confusion, as in the case of the state monad.
The maybe monad for instance can be easily understood, because it deals with a problem that both languages are facing: Computations that might go wrong by not returning a value (null/undefined in Javascript). Maybe saves developers from scattering null checks throughout their code.
In the case of the state monad the situation is a little different. In Haskell, the state monad is required in order to compose functions that share changeable state without having to pass this state around. State is one or more variables that are not among the arguments of the functions involved. In Javascript you can just do the following:
var stack = {
store: [],
push: function push(element) { this.store.push(element); return this; },
pop: function pop() { return this.store.pop(); }
}
console.log(stack.push(1).push(2).push(3).pop()); // 3 (return value of stateful computation)
console.log(stack.store); // [1, 2] (mutated, global state)
This is the desired stateful computation and store does not have to be passed around from method to method. At first sight there is no reason to use the state monad in Javascript. But since store is publicly accessible, push and pop mutate global state. Mutating global state is a bad idea. This problem can be solved in several ways, one of which is precisely the state monad.
The following simplified example implements a stack as state monad:
function chain(mv, mf) {
return function (state) {
var r = mv(state);
return mf(r.value)(r.state);
};
}
function of(x) {
return function (state) {
return {value: x, state: state};
};
}
function push(element) {
return function (stack) {
return of(null)(stack.concat([element]));
};
}
function pop() {
return function (stack) {
return of(stack[stack.length - 1])(stack.slice(0, -1));
};
}
function runStack(seq, stack) { return seq(stack); }
function evalStack(seq, stack) { return seq(stack).value; }
function execStack(seq, stack) { return seq(stack).state; }
function add(x, y) { return x + y; }
// stateful computation is not completely evaluated (lazy evaluation)
// no state variables are passed around
var computation = chain(pop(), function (x) {
  if (x < 4) {
    return chain(push(4), function () {
      return chain(push(5), function () {
        return chain(pop(), function (y) {
          return of(add(x, y));
        });
      });
    });
  } else {
    return chain(pop(), function (y) {
      return of(add(x, y));
    });
  }
});
var stack1 = [1, 2, 3],
stack2 = [1, 4, 5];
console.log(runStack(computation, stack1)); // Object {value: 8, state: Array[3]}
console.log(runStack(computation, stack2)); // Object {value: 9, state: Array[1]}
// the return values of the stateful computations
console.log(evalStack(computation, stack1)); // 8
console.log(evalStack(computation, stack2)); // 9
// the shared state within the computation has changed
console.log(execStack(computation, stack1)); // [1, 2, 4]
console.log(execStack(computation, stack2)); // [1]
// no global state has changed
console.log(stack1); // [1, 2, 3]
console.log(stack2); // [1, 4, 5]
The nested function calls could be avoided. I've omitted this feature for simplicity.
There is no issue in Javascript that can be solved solely with the state monad. It is also much harder to understand something as generalized as the state monad when it solves a problem that seemingly does not exist in the language you are using. Its use is merely a matter of personal preference.
It indeed works like your second description where a new immutable state is returned. It isn't particularly useful if you call it like this, however. Where it comes in handy is if you have a bunch of functions you want to call, each taking the state returned from the previous step and returning a new state and possibly another value.
Making it a monad basically allows you to specify a list of just the function names to be executed, rather than repeating the newState = f(initialState); newNewState = g(newState); finalState = h(newNewState); over and over. Haskell has a built-in notation called do-notation to do precisely this. How you accomplish it in JavaScript depends on what functional library you're using, but in its simplest form (without any binding of intermediate results) it might look something like finalState = do([f,g,h], initialState).
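As a rough sketch of that simplest form, the helper below feeds each function the state returned by the previous one. The name runAll is hypothetical (a function cannot literally be named do, since that is a reserved word in JavaScript), and f, g and h are placeholder state transformers.

```javascript
// Hypothetical helper: threads the state through a list of functions.
function runAll(steps, initialState) {
  return steps.reduce(function (state, step) {
    return step(state);
  }, initialState);
}

// Placeholder state transformers; each returns a new array.
function f(state) { return state.concat(['f']); }
function g(state) { return state.concat(['g']); }
function h(state) { return state.concat(['h']); }

var initialState = [];
var finalState = runAll([f, g, h], initialState);
console.log(finalState);   // [ 'f', 'g', 'h' ]
console.log(initialState); // []
```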
In other words, the state monad doesn't magically make immutability look like mutability, but it can simplify the tracking of intermediate states in certain circumstances.
State is present everywhere. In a class, it could be the values of its properties. In programs, it could be the values of variables. In languages like JavaScript and even Java, which allow mutability, we pass the state as arguments to the mutating function. However, in languages such as Haskell and Scala, which discourage mutation (treated as a side effect, or impure), the new state (with the updates) is explicitly returned and then passed to its consumers. To hide this explicit passing and returning of state, Haskell (and Scala) have the concept of the State monad. I have written an article on the same at https://lakshmirajagopalan.github.io/state-monad-in-scala/
Related
Is there any drawback if I design my reducers to have access to the full state tree, instead of reading only their own part of the state?
So instead of writing this:
function reducer(state = {}, action) {
return {
a: doSomethingWithA(state.a, action),
b: processB(state.b, action),
c: c(state.c, action)
}
}
I destructure state inside doSomethingWithA, c or processB reducers, separately:
function reducer(state = {}, action) {
return {
a: doSomethingWithA(state, action), // calc next state based on a
b: processB(state, action), // calc next state based on b
c: c(state, action) // calc next state based on a, b and c
}
}
Would I be using more RAM? Is there any performance drawback? I understand that in javascript a reference is always passed as parameter, which is why we should return a new object if we want to update the state, or use Immutable.JS to enforce immutability, so... again, would there be any drawback at all?
No, there's nothing wrong with that. Part of the reason for writing update logic as individual functions instead of separate Flux "stores" is that it gives you explicit control over chains of dependencies. If the logic for updating state.b depends on having state.a updated first, you can do that.
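A minimal sketch of such a dependency chain, assuming hypothetical slice reducers reduceA and reduceB and an illustrative INCREMENT action (none of these names come from Redux itself):

```javascript
// Hypothetical slice reducers; names and action types are illustrative.
function reduceA(a, action) {
  return action.type === 'INCREMENT' ? a + 1 : a;
}
// The update for b needs the already-updated value of a.
function reduceB(b, action, nextA) {
  return action.type === 'INCREMENT' ? nextA * 10 : b;
}

function reducer(state, action) {
  state = state || { a: 0, b: 0 };
  var nextA = reduceA(state.a, action);        // update a first
  var nextB = reduceB(state.b, action, nextA); // then b, using the new a
  return { a: nextA, b: nextB };
}

var next = reducer({ a: 1, b: 0 }, { type: 'INCREMENT' });
console.log(next); // { a: 2, b: 20 }
```

Passing the whole state object to a function hands over only a reference, so doing so does not by itself copy the tree.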
You may want to read through the Structuring Reducers section in the Redux docs, particularly the Beyond combineReducers topic. It discusses various other reducer structures besides the typical combineReducers approach. I also give some examples of this kind of structure in my blog post Practical Redux, Part 7: Form Change Handling, Data Editing, and Feature Reducers.
Looking at the real world example I see this setting up the api middleware:
export default store => next => action => {
const callAPI = action[CALL_API]
if (typeof callAPI === 'undefined') {
return next(action)
}
What exactly is happening here? I see that configureStore is importing whatever that is and passing it to applyMiddleware from redux, but what does this kind of statement mean in js?
I assume it's exporting an anonymous function that returns a function that returns a function? Just tried this:
var a = b => c => d => {
console.log('a:', a);
console.log('b:', b);
console.log('c:', c);
console.log('d:', d);
};
a(5)(6)(7);
// outputs b: 5, c: 6, and d: 7
Function Specialization
The arrow function notation simplifies currying in JavaScript.
Here it's just a way to do partial application: it lets you bind arguments to the function at different times, using closures instead of Function.prototype.bind.
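For comparison, here is a small sketch of the two styles side by side. The functions add3 and curriedAdd3 are made up for the example.

```javascript
// An ordinary three-argument function, partially applied with bind.
function add3(a, b, c) { return a + b + c; }
var viaBind = add3.bind(null, 1, 2);

// The same function in curried arrow form, partially applied via closures.
var curriedAdd3 = a => b => c => a + b + c;
var viaClosure = curriedAdd3(1)(2);

console.log(viaBind(3));    // 6
console.log(viaClosure(3)); // 6
```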
When you call applyMiddleware during Store creation, Redux will specialize your Middleware with the current Store it's been applied to.
Then it becomes a new specialized function, that only takes two arguments:
next => action
Where next is the next middleware that will be called on the Action. (Just like in Express, which popularized the concept, for request handling)
Timeline
The important thing here is that all these function specializations are done at different times.
store can be bound during Store creation.
next can be bound once it knows the Store it's been bound to, so also during Store creation, but could be updated later.
action is known only when you effectively dispatch an Action, which can happen any time.
The specialized middleware (the one which has been bound to the Store, and is already aware of the Next middleware function) will be reusable, and called for each new dispatched Action.
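The three binding stages can be walked through by hand. The sketch below does not use Redux; fakeStore and baseDispatch are hypothetical stand-ins, and logger is an example middleware written in the same curried shape.

```javascript
// A logging middleware in the curried store => next => action shape.
var calls = [];
var logger = store => next => action => {
  calls.push('saw ' + action.type + ' with state ' + store.getState());
  return next(action);
};

// Hypothetical stand-ins for a store and the final dispatch (not Redux itself).
var fakeStore = { getState: function () { return 0; } };
var baseDispatch = function (action) { return action; };

// Stage 1: bind the store (done during store creation).
var boundToStore = logger(fakeStore);
// Stage 2: bind the next function in the chain.
var dispatch = boundToStore(baseDispatch);
// Stage 3: called once per dispatched action, reusing the bindings above.
dispatch({ type: 'A' });
dispatch({ type: 'B' });

console.log(calls); // [ 'saw A with state 0', 'saw B with state 0' ]
```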
Functional Programming
These concepts (currying and partial application) come from the Functional Programming world.
Redux relies heavily on this paradigm, and the most important thing in Redux is the sidelining of Side-Effects (especially mutations).
Directly capturing the context of the function, or using a global Store via require, is a side-effect, because your function is bound to this Store as soon as it is declared.
Instead, Redux uses Currying to permit a sort of Dependency Injection, which results in a stateless function that can be reused and specialized at runtime.
This way your Middleware is Loosely Coupled to the Store.
To understand this clearly, you first need to know how middlewares work in Redux. So first go through the Redux middleware documentation.
If you are still confused after going through the documentation, don't worry; it's a bit complicated. Try reading it once again :). I understood this properly after 2-3 reads.
Now, the one you mentioned in your question is a curried function written in ES6 arrow syntax. If you convert it to vanilla javascript, it comes to something like this:
var middleware = function (store) { // "middleware" is a placeholder name so the expression is valid on its own
return function (next) {
return function (action) {
var callAPI = action[CALL_API];
if (typeof callAPI === 'undefined') {
return next(action);
}
};
};
};
So, as you can see, it's nothing but a chain of functions.
When I use FFI to wrap some API (for example DOM API) is there any rule of thumb that could help me to decide whether function should be effectful or not?
Here is an example:
foreign import querySelectorImpl """
function querySelectorImpl (Nothing) {
return function (Just) {
return function (selector) {
return function (src) {
return function () {
var result = src.querySelector(selector);
return result ? Just(result) : Nothing;
};
};
};
};
}
""" :: forall a e. Maybe a -> (a -> Maybe a) -> String -> Node -> Eff (dom :: DOM | e) (Maybe Node)
querySelector :: forall e. String -> Node -> Eff (dom :: DOM | e) (Maybe Node)
querySelector = querySelectorImpl Nothing Just
foreign import getTagName """
function getTagName (n) {
return function () {
return n.tagName;
};
}
""" :: forall e. Node -> Eff (dom :: DOM | e) String
It feels right for querySelector to be effectful, but I'm not quite sure about getTagName
Update:
I understand what a pure function is and that it should not change the state of the program and maybe DOM was a bad example.
I ask this question because in most libraries that wrap existing js libraries, pretty much every function is effectful even if it doesn't feel right. So maybe my actual question is: does this effect represent a real need in the wrapped js lib, or is it there just in case the lib is stateful inside?
If a function does not change state, and it always (past, present, and future) returns the same value when given the same arguments, then it does not need to return Eff, otherwise it does.
n.tagName is read-only, and as far as I know, it never changes. Therefore, getTagName is pure, and it's okay to not return Eff.
On the other hand, a getTextContent function must return Eff. It does not change state, but it does return different values at different times.
The vast vast vast majority of JS APIs (including the DOM) are effectful. getTagName is one of the very few exceptions. So when writing an FFI, PureScript authors just assume that all JS functions return Eff, even in the rare situations where they don't need to.
Thankfully the most recent version of purescript-dom uses non-Eff functions for nodeName, tagName, localName, etc.
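On the JavaScript side, the difference between the two kinds of accessor can be illustrated as follows. A plain object stands in for a DOM node here (an assumption, so that the sketch runs without a browser).

```javascript
// Plain object standing in for a DOM node (not a real Node).
var node = { tagName: 'DIV', textContent: 'hello' };

// Behaves like a pure function as long as tagName never changes.
function getTagName(n) { return n.tagName; }

// Not pure: the result depends on mutable state read at call time.
function getTextContent(n) { return n.textContent; }

var before = getTextContent(node);
node.textContent = 'goodbye'; // something else mutates the node
var after = getTextContent(node);

console.log(before, after);    // hello goodbye
console.log(getTagName(node)); // DIV
```

Same argument, different results at different times: that is why a wrapper for getTextContent has to be effectful while one for getTagName does not.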
Effectful functions are functions that are not pure, from Wikipedia:
In computer programming, a function may be described as a pure function if both these statements about the function hold:
The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change as program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices [...].
Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices [...].
Since the DOM stores state, functions wrapping calls to the DOM are almost always effectful.
For more details regarding PureScript, see Handling Native Effects with the Eff Monad.
Say I have two view models that each have an observable property that represents different, but similar data.
function site1Model(username) {
  this.username = ko.observable(username);
  // ...
}
function site2Model(username) {
  this.username = ko.observable(username);
  // ...
}
These view models are independent and not necessarily linked to each other, but in some cases, a third view model creates a link between them.
function site3Model(username) {
  this.site1 = new site1Model(username);
  this.site2 = new site2Model(username);
  // we now need to ensure that the usernames are kept the same between site1/2
  // ...
}
Here are some options that I've come up with.
Use a computed observable that reads one and writes to both:
site3Model.username = ko.computed({
  read: function() {
    return this.site1.username(); // assume they are always the same
  },
  write: function(value) {
    this.site1.username(value);
    this.site2.username(value);
  },
  owner: site3Model
});
This will keep the values in sync as long as changes always come through the computed. But if an underlying observable is changed directly, it won't do so.
Use the subscribe method to update each from the other:
site3Model.site1.username.subscribe(function(value) {
this.site2.username(value);
}, site3Model);
site3Model.site2.username.subscribe(function(value) {
this.site1.username(value);
}, site3Model);
This works as long as the observables suppress notifications when the values are the same; otherwise you'd end up with an infinite loop. You could also do the check earlier: if (this.site1.username() !== value) this.site1.username(value); This also has a problem that the observables have to be simple (it won't work right if site1 and site2 themselves are observables).
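The effect of the explicit equality check can be seen without Knockout. The observable function below is a deliberately minimal stand-in, not Knockout's implementation, and it always notifies on write so that the guard is what stops the loop.

```javascript
// Minimal stand-in for an observable (NOT Knockout's implementation):
// call with no arguments to read, with one argument to write.
function observable(initial) {
  var value = initial, subscribers = [];
  function obs(next) {
    if (arguments.length === 0) { return value; }
    value = next;
    obs.writes += 1;
    subscribers.forEach(function (fn) { fn(value); }); // always notifies
  }
  obs.writes = 0;
  obs.subscribe = function (fn) { subscribers.push(fn); };
  return obs;
}

var first = observable('ann');
var second = observable('ann');

// Guarded two-way sync: write only when the value actually differs.
first.subscribe(function (v) { if (second() !== v) { second(v); } });
second.subscribe(function (v) { if (first() !== v) { first(v); } });

first('bob');
console.log(second());                   // bob
console.log(first.writes, second.writes); // 1 1
```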
Use computed to do the subscribe and updates:
site3Model.username1Updater = ko.computed(function() {
this.site1.username(this.site2.username());
}, site3Model);
site3Model.username2Updater = ko.computed(function() {
this.site2.username(this.site1.username());
}, site3Model);
This format allows us to have other dependencies. For example, we could make site1 and site2 observables and then use this.site1().username(this.site2().username()); This method also requires a check for equality to avoid an infinite loop. If we can't depend on the observable to do it, we could check within the computed, but would add another dependency on the observable we're updating (until something like observable.peek is available).
This method also has the downside of running the update code once initially to set up the dependencies (since that's how computed works).
Since I feel that all of these methods have a downside, is there another way to do this that would be simple (less than 10 lines of code), efficient (not run unnecessary code or updates), and flexible (handle multiple levels of observables)?
It is not exactly 10 lines of code (although you could strip it down to your liking), but I use pub/sub messages between view models for this situation.
Here is a small library that I wrote for it: https://github.com/rniemeyer/knockout-postbox
The basic idea is just to create a ko.subscribable and use topic-based subscriptions. The library extends subscribables to add subscribeTo, publishOn and syncWith (both publish and subscribe on a topic). These methods will set up the proper subscriptions for an observable to automatically participate in this messaging and stay synchronized with the topic.
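The topic-based idea, stripped of Knockout entirely, looks roughly like this. The subscribe and publish functions here are hypothetical plain-JavaScript stand-ins, not the library's API, and SiteModel is a made-up view model.

```javascript
// Hypothetical topic bus: handlers are stored per topic name.
var handlersByTopic = {};
function subscribe(topic, handler) {
  (handlersByTopic[topic] = handlersByTopic[topic] || []).push(handler);
}
function publish(topic, value) {
  (handlersByTopic[topic] || []).forEach(function (handler) { handler(value); });
}

// View models that never reference each other, only the topic.
function SiteModel() {
  var self = this;
  self.username = '';
  subscribe('username', function (value) { self.username = value; });
}

var site1 = new SiteModel();
var site2 = new SiteModel();
publish('username', 'ann');
console.log(site1.username, site2.username); // ann ann
```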
Now your view models do not need to have direct references to each other and can communicate through the pubsub system. You can refactor your view models without breaking anything.
Like I said you could strip it down to less than 10 lines of code. The library just adds some extras like being able to unsubscribe, being able to have control over when publishing actually happens (equalityComparer), and you can specify a transform to run on incoming values.
Feel free to post any feedback.
Here is a basic sample: http://jsfiddle.net/rniemeyer/mg3hj/
Ryan and John, Thank you both for your answers. Unfortunately, I really don't want to introduce a global naming system that the pub/sub systems require.
Ryan, I agree that the subscribe method is probably the best. I've put together a set of functions to handle the subscription. I'm not using an extension because I also want to handle the case where the observables themselves might be dynamic. These functions accept either observables or functions that return observables. If the source observable is dynamic, I wrap the accessor function call in a computed observable to have a fixed observable to subscribe to.
function subscribeObservables(source, target, dontSetInitially) {
var sourceObservable = ko.isObservable(source)
? source
: ko.computed(function(){ return source()(); }),
isTargetObservable = ko.isObservable(target),
callback = function(value) {
var targetObservable = isTargetObservable ? target : target();
if (targetObservable() !== value)
targetObservable(value);
};
if (!dontSetInitially)
callback(sourceObservable());
return sourceObservable.subscribe(callback);
}
function syncObservables(primary, secondary) {
subscribeObservables(primary, secondary);
subscribeObservables(secondary, primary, true);
}
This is about 20 lines, so maybe my target of less than 10 lines was a bit unreasonable. :-)
I modified Ryan's postbox example to demonstrate the above functions: http://jsfiddle.net/mbest/vcLFt/
Another option is to create an isolated datacontext that maintains the models of observables. The viewmodels all look to the datacontext for their data and refer to the same objects, so when one updates, they all do. Each VM depends on the datacontext, but not on other VMs. I've been doing this lately and it has worked well, although it is much more complex than using pub/sub.
If you want simple pub/sub, you can use Ryan Niemeyer's library that he mentioned, or use amplify.js, which has pub/sub messaging (basically a messenger or event aggregator) built in. Both are lightweight and decoupled.
In case anyone needs it:
Another option is to create a shared reference object/observable.
This also handles an object that contains multiple observables.
(function(){
  var subscriptions = {}; // keyed by topic name
  ko.helper = {
    syncObject: function (topic, obj) {
      if (subscriptions[topic]) {
        return subscriptions[topic];
      } else {
        return subscriptions[topic] = obj;
      }
    }
  };
})();
In your view models:
function site1Model(username) {
  this.username = ko.helper.syncObject('username', ko.observable());
  this.username(username);
  // ...
}
function site2Model(username) {
  this.username = ko.helper.syncObject('username', ko.observable());
  this.username(username);
  // ...
}
Remember "const poisoning" in C++, when you would mark one method as const and then you realized you had to mark all the methods it called as const and then all the methods they called, and so on?
I'm having a problem with asynchrony poisoning. It's in Javascript, although I don't think that's relevant, and it propagates up rather than down. When a function might possibly call an asynchronous function, it must itself be rewritten as asynchronous, and then all the functions that call it must be too, and so on.
I don't have a well-formed question here (sorry, mods) but I was hoping someone had either (a) advice or (b) a reference that might have (a).
It's not a bad question. There are a few ways around having to totally trash your flow of control, however. Note: I didn't say it was pretty.
Suppose you have objects A, B, C and D. A.Amethod doesn't return anything and calls B's getBData method, which calls C's getCData method. The issue is that C was calling D something like so:
var data = D.getRawData();
... something to be done with data ...
... something else to be done with data...
and now it has to be written like
D.getRawData(function(data){
... something to be done with data ...
... something else to be done with data...
});
Well, you can always add a callback parameter to each of your methods, so that the code that used to look like:
var A = {
//I'm not recommending coding like this, just for demonstration purposes.
...
Amethod: function(x,y,z){
var somethingForA = this.B.getBData(1,2,3);
astatement1;
astatement2;
...
}
...
}
//end of A
...
var B = {
...
getBData: function(x,y,z){
var somethingForB = this.C.getCData(1,2,3);
bstatement1;
var somethingFromB = bstatement2;
return somethingFromB;
}
...
}
//end of B
...
var C = {
...
getCData: function(x,y,z){
var somethingForC = this.D.getRawData(1,2,3)
cstatement1;
var somethingFromC = cstatement2;
return somethingFromC;
}
...
}
//end of C
...
You'd now have:
var A = {
...
Amethod: function(x,y,z){
this.B.getBData(1,2,3,function(somethingForA){
astatement1;
astatement2;
});
...
}
...
}
//end of A
...
var B = {
...
getBData: function(x,y,z,callback){
this.C.getCData(1,2,3,function(somethingForB){
bstatement1;
var somethingFromB = bstatement2;
callback(somethingFromB);
});
...
}
...
}
//end of B
...
var C = {
...
getCData: function(x,y,z,callback){
this.D.getRawData(1,2,3,function(somethingForC) {
cstatement1;
var somethingFromC = cstatement2;
callback(somethingFromC);
});
}
...
}
//end of C
...
That's pretty much a straightforward refactoring using anonymous functions to implement all functionality keeping your flow of control. I don't know how literal a transform that would be; you might have to do some adjustment for variable scope. And obviously, it's not that pretty.
Are there other ways? Sure. As you can see, the above is messy, and we'd hope to keep from writing messy code. How much less messy it can get depends on context.
You don't have to pass a callback parameter: any necessary callback could be passed to the objects beforehand, or it may be passed as one of the items in a parameter. The callback may not need to be invoked directly, but indirectly using one of the various event handling methods available (you'd have to look at which libraries you might want to use for that), and the data may be passed in when the event is triggered. Or maybe there's a global "DataGetter" with which A can register a callback to be run whenever data is "gotten", completely avoiding the intermediaries B and C.
Finally, there's the consideration that if you are knee deep in invocations only to discover you need something that can only be obtained asynchronously, and that data has to be passed up the chain of command, you might be doing something a bit backwards in terms of which objects should control the flow of the program logic (I'm honestly stumped as to how to describe why I seem to think this scenario is problematic, though). I tend to think that, as instances of A have to contain instances of B, B contains instances of C, etc., the instances creating the sub-instances as part of their composition should have some degree of control over how the sub-instances populate themselves, instead of letting the sub-instances fully decide.... if that makes sense :-(
At this point I feel like I'm rambling a bit so... here's to hoping somebody would explain the issues better than me!
The best solution I've seen so far is promises. All that happens, of course, is that you trade asynchrony poisoning for promise poisoning (since any computation that depends on a promise must itself return a promise), but promises are much more flexible and powerful than callbacks.
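As a sketch of what that trade looks like, here is a three-layer chain written with standard JavaScript promises. The function names echo the callback example earlier in the thread, but getRawData is a hypothetical stand-in that resolves a fixed value after a timer.

```javascript
// Hypothetical asynchronous source, standing in for D.getRawData.
function getRawData() {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(21); }, 0);
  });
}

// Each layer returns a promise instead of accepting a callback.
function getCData() {
  return getRawData().then(function (raw) { return raw * 2; });
}
function getBData() {
  return getCData().then(function (c) { return 'B:' + c; });
}

// The top of the chain consumes the final value.
getBData().then(function (value) {
  console.log(value); // B:42
});
```

Every layer still has to return a promise, which is the "poisoning" mentioned above, but each function keeps an ordinary return value and no callback parameter has to be threaded through the signatures.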
If the problem is b() should block until complete but calls asynchronous a(), then perhaps a callback from a() could set a flag and b() watches for the flag. If a() doesn't offer a callback then perhaps there is a value somewhere that changes once a() is complete.