Functional-style Updates - functional-programming

This may be an oxymoron, but how would one update a data entity in the functional programming style? From all I've read, functional-style programming uses transformations to return an output on immutable entities. The only thing I can think of would be to completely replace the original entity, but that seems almost the same as a classic update approach.

Are you talking about on disk database entities or data-structures in memory.
For the latter, functional languages use persistent data-structures, which are implemented such that the new version and the old version are both available after the update, but they share common parts (so that it is efficient). So you appear to be returning a totally new datastructure, but in fact, it shares most of its implementation with the one it was modifying.
There are some really good implementations to look at in the clojure source (written in Java) -- I took two of them apart on my blog
http://www.loufranco.com/blog/files/20-Days-of-Clojure-Day-7.html
http://www.loufranco.com/blog/files/20-Days-of-Clojure-Day-8.html

Lou Franco has it. Data structures in functional languages are implemented such that to modifying them, you "completely replace" the original entity. Behind the scenes, they still use most of the old one: they just replace the changed bits. The old version still exists, too, but garbage collection will destroy it eventually as long as nobody references it.

The short answer is that in the functional style, each data entity would be immutable, so an update is really a new data entity with the updated value, kind of like how strings work in .NET.
The real interesting challenges come when dealing with IO, it becomes tough to model I/O in a purely functional manner which leads to workarounds like Monads.

Related

What is the need for immutable/persistent data structures in erlang

Each Erlang process maintains its own private address space. All communication happens via copying without sharing (except big binaries). If each process is processing one message at a time with no concurrent access over its objects, I don't see why do we need immutable/persistent data structures.
Erlang was initially implemented in Prolog, which doesn't really use mutable data structures either (though some dialects do). So it started off without them. This makes runtime implementation simpler and faster (garbage collection in particular).
So adding mutable data structures would require a lot of effort, could introduce bugs, and Erlang programmers are nearly by definition at least willing to live without them.
Many actually consider their absence to be a positive good: less concern about object identity, no need for defensive copying because you don't know whether some other piece of code is going to modify the data you passed (or might be changed later to modify it), etc.
This absence does mean that Erlang is pretty unusable in some domains (e.g. high performance scientific computing), at least as the main language. But again, this means that nobody in these domains is going to use Erlang in the first place and so there's no particular incentive to make it usable at the cost of making existing users unhappy.
I remember seeing a mailing list post by Joe Armstrong quite a long time ago (which I couldn't find with a quick search now) saying that he initially planned to add mutable variables when he'd need them... except he never quite did, and performance was good enough for everything he was using Erlang for.
It is indeed the case that in Erlang immutability does not solve any "shared state" problems, as immutable data are "process local".
From the functional programming language perspective, however, immutability offers a number of benefits, summarized adequately in this Quora answer:
The simplest definition of functional programming is that it’s a programming
paradigm where you are transforming immutable data with functions.
The definition uses functions in the mathematical sense, where it’s
something that takes an input, and produces an output.
OO + mutability tends to violate that definition because when you want
to change a piece of data it generally will not return the output, it
will likely return void or unit, and that when you call a method on
the object the object itself isn’t input for the function.
As far as what advantages the paradigm has, composability, thread
safety, being able to track what went wrong where better, the ability
to sort of separate the data from the actual computation on it being
done, etc.
how would this work?
factorial(1) -> 1;
factorial(X) ->
X*factorial(X-1).
if you run factorial(4), a single process will be running the same function. Each time the function will have it's own value of X, if the value of X was in the scope of the process and not the function recursive functions wouldn't work. So first we need to understand scope. If you want to say that you don't see why data needs to be immutable within the scope of a single function/block you would have a point, but it would be a headache to think about where data is immutable and where it isn't.

Shouldn't a Redux app with immutable data run out of memory after a while?

I come from a background in embedded systems where you're really careful about memory management. With Redux especially its concept of immutability.
So let's say I'm modifying a member of an array. I have to create a new array that links to all original members plus the modified item.
I understand why using Immutability improves the speed but my question is since we essentially never remove the old copies of the objects and create new ones, Redux still keeps a reference to the old objects because of time traveling features.
Most machines these days have quite a lot of memory, but shouldn't at least in theory the Redux app crash because the tab/process runs out of memory? After a long use maybe?
No. First, Redux itself doesn't keep around old data - that's something that the Redux DevTools addon does. Second, I believe the DevTools addon has limits on how many actions it will track. Third, Javascript is a garbage-collected language, so items that are no longer referenced will be cleaned up. (Hand-waving a bit there, but that generally covers things.)
Immutable.js leverages the idea of Structural Sharing while creating new copies of collections. It implements persistent data structures and internally uses concepts like tries to implement structural sharing. So, if you created a list with 10 items, adding a new item to it will not create a new list.
Persistent data structures provide the benefits of immutability while
maintaining high performance reads and writes and present a familiar
API.
Immutable.js data structures are highly efficient on modern JavaScript VMs by
using structural sharing via hash maps tries and vector tries as
popularized by Clojure and Scala, minimizing the need to copy or cache
data.
I suggest you watch this awesome talk by Lee Bryon, Immutable.js creator

Abstraction or not?

The other day i stumbled onto a rather old usenet post by Linus Torwalds. It is the infamous "You are full of bull****" post when he defends his choice of using plain C for Git over something more modern.
In particular this post made me think about the enormous amount of abstraction layers that accumulate one over the other where I work. Mine is a Windows .Net environment. I must say that I like C# and the .Net environment, it really makes most things easy.
Now, I come from a very different background made of Unix technologies like C and a plethora or scripting languages; to me, also, OOP is just one, and not always the best, programming paradigm.. I often struggle (in a working kind of way, of course!) with my colleagues (one in particular), because they appear to be of the "any problem can be solved with an additional level of abstraction" church, while I'm more of the "keeping it simple" school. I think that there is a very different mental approach to the problems that maybe comes from the exposure to different cultures.
As a very simple example, for the first project I did here I needed some configuration for an application. I made a 10 rows class to load and parse a txt file to be located in the program's root dir containing colon separated key / value pairs, one per row. It worked.
In the end, to standardize the approach to the configuration problem, we now have a library to be located on every machine running each configured program that calls a service that, at startup, loads up an xml that contains the references to other xmls, one per application, that contain the configurations themselves.
Now, it is extensible and made up of fancy reusable abstractions, providers and all, but I still think that, if we one day really happen to reuse part of it, with the time taken to make it up, we can make the needed code from start or copy / past the old code and modify it.
What are your thoughts about it? Can you point out some interesting reference dealing with the problem?
Thanks
Abstraction makes it easier to construct software and understand how it is put together, but it complicates fully understanding certain issues around performance and security, because the abstraction layers introduce certain kinds of complexity.
Torvalds' position is not absurd, but he is an extremist.
Simple answer: programming languages provide data structures and ways to combine them. Use these directly at first, do not abstract. If you find you have representation invariants to maintain that are at a high risk of being broken due to a large number of usage sites possibly outside your control, then consider abstraction.
To implement this, first provide functions and convert the call sites to use them without hiding the representation. Hide the data representation only when you're satisfied your functional representation is sufficient. Make sure at this time to document the invariant being protected.
An "extreme programming" version of this: do not abstract until you have test cases that break your program. If you think the invariant can be breached, write the case that breaks it first.
Here's a similar question: https://stackoverflow.com/questions/1992279/abstraction-in-todays-languages-excited-or-sad.
I agree with #Steve Emmerson - 'Coders at Work' would give you some excellent perspective on this issue.

Functional Programming better to manipulate lists of database data?

I'm watching some lecture's on Functional Programming and the main 'data structure' so to say, but there really isn't one in FP, is lists, so my question is: when one deals a lot with database's and 'lists' of data, then is Functional Programming not superior to OOP?
One of the biggest improvements in reading from databases in recent years is LINQ. LINQ is actually based a lot on functional programming principles. In fact SQL is also a very functional style language.
I see no problems with reading data from a database using a functional language.
Now modifying the database... that's a different story. I'll leave that for another day. :)
Well, Lisp deals with lists, but the lists are heterogenous, and can well represent a tree. Other languages, like Haskell, give you structured types, named and unnamed, and - in contrast to lisp - allow for static type checking.
One thing that pure functional languages do not have is the notion of stateful variables that can be assigned. Some Lisp implementations provide such state - you get a setq opeator -, while Haskell doesn't. Reading and writing databases, however, is all about having state - and lots of it, that's what databases are for - and about reading from and writing into it. So, operating on a database is quite the opposite of using a functional language.
It does, however, make sense to create a database query language which expresses the DB operations in a non-imperative, but in a declarative, and hence in a functional way. That's how SQL makes sense, and that's also how the way LINQ is defined makes sense.
So, it makes sense to have a database language which is functional, but it's not because of the lists.

Reflection: for frameworks only?

Somebody that I work with and respect once remarked to me that there shouldn't be any need for the use of reflection in application code and that it should only be used in frameworks. He was speaking from a J2EE background and my professional experience of that platform does generally bear that out; although I have written reflective application code using Java once or twice.
My experience of Ruby on Rails is radically different, because Ruby pretty much encourages you to write dynamic code. Much of what Rails gives you simply wouldn't be possible without reflection and metaprogramming and many of the same techniques are equally as applicable and useful to your application code.
Do you agree with the viewpoint that reflection is for frameworks only? I'd be interested to hear your opinions and experiences.
There's the old joke that any sufficiently sophisticated system written in a statically-typed language contains an incomplete, inferior implementation of Lisp.
Since your requirements tend to become more complicated as a project evolves, you often eventually find that the common idioms in statically-typed object systems eventually hit a wall. Sometimes reaching for reflection is the best solution.
I'm happy in dynamically-typed languages like Ruby, and statically-typed languages like C#, but the implicit reflection in Ruby often makes for simpler, easier-to-read code. (Depending on the metaprogramming magic required, sometimes harder to write).
In C#, I've found problems that couldn't be solved without reflection, because of information I didn't have until runtime. One example: When trying to manipulate some third-party code that generated proxies to Silverlight objects running in another process, I had to use reflection to invoke a specific strongly-typed "Generic" version of a method, because the marshalling required the caller to make an assumption about the type of the object in the other process was in order to extract the data we needed from it, and C# doesn't allow the "type" of the generic method invocation to be specified at run time (except with reflection techniques). I guess you could argue our tool was kind of a framework, but I could easily imagine a case in an ordinary application facing a similar problem.
Reflection makes DRY a lot easier. It's certainly possible to write DRY code without reflection, but it's often much more verbose.
If some piece of information is encoded in my program in one way, why wouldn't I use reflection to get at it, if that's the easiest way?
It sounds like he's talking about Java specifically. And in that case, he's just citing a special case of this: in Java, reflection is so wonky it's almost never the easiest way to do something. :-) In other languages like Ruby, as you've seen, it often is.
Reflection is definitely heavily used in frameworks, but when used correctly can help simplify code in applications.
One example I've seen before is using a JDK Proxy of a large interface (20+ methods) to wrap (i.e. delegate to) a specific implementation. Only a couple of methods were overridden using a InvocationHandler, the rest of the methods were invoked via reflection.
Reflection can be useful, but it is slower that doing a regular method call. See this reflection comparison.
Reflection in Java is generally not necessary. It may be the quickest way to solve a certain problem, but I would rather work out the underlying problem that causes you to think it's necessary in app code. I believe this because it frequently pushes errors from compile time to run time, which is always a Bad Thing for large enough software that testing is non-trivial.
I disagree, my application uses reflection to dynamically create providers. I might also use reflection to control logic flow, if the logic is simple and doesn't warrant a more complicated pattern.
In C# I use reflection to grab attributes off Enumeration which help me determine how to display an enumeration to an end user.
I disagree, reflection is very useful in application code and I find myself using it quite often. Most recently, I had to use reflection to load an assembly (in order to investigate its public types) from just the path of the assembly.
Several opinions on this subject are expressed here...
What is reflection and why is it useful?
Use reflection when there is no other way! This is a matter of performance!
If you have looked into .NET performance pitfalls before, it might not surprise you how slow the normal reflection is: a simple test with repeated access to an int property proved to be ~1000 times slower using reflection compared to the direct access to the property (comparing the average of the median 80% of the measured times).
See this: .NET reflection - performance
MSDN has a pretty nice article about When Should You Use Reflection?
If your problem is best solved by using reflection, you should use it.
(Note that the definition of 'best' is something learnt by experience :)
The definition of framework vs. application isn't all that black & white either. Sometimes your app needs a bit of framework to do its job well.
I think the observation that there shouldn't be any need for the use of reflection in application code and that it should only be used in frameworks is more or less true.
On the spectrum of how coupled some piece of code are, code joined by reflection are as loosely coupled as they come.
As such, the code which is doing it's job via reflection can quite happily fulfil it's role in life knowing not-a-thing about the code which is using it.

Resources