Sunday, January 18, 2009

Comparing Functional and Imperative, Part 1: Context determines meaning

Over the past year, I’ve taken a strong interest in functional programming. I’m not sure where it started, but it may have been Brian Beckman’s Don’t fear the Monads on Channel 9. It went pretty much completely over my head, but my interest was definitely piqued. It has been in my mind to do a series of posts on functional programming once I became an expert on the subject. Well, I am so not an expert at this point, but I’ve decided to consider that an advantage. Once something makes perfect sense to you, it’s often hard to remember why it didn’t make perfect sense before.

Functional programming is an alternate universe of programming where hard things (e.g. concurrency) become easier and easy things (e.g. writing to a file) become at least a bit harder. Besides the fact that functional programming is just plain different from imperative programming, another challenge for an imperative programmer learning functional programming is the fact that the same or similar syntax is used to represent quite different things. Consider:

	a = 3
	b = a + 2

In an imperative language context, this would probably mean “assign the value 3 to variable ‘a’, then evaluate a + 2 and assign the resulting value to variable ‘b’”. In a functional programming language such as Haskell, ‘a’ and ‘b’ can be thought of as being functions. This may not technically be true, particularly in the case of ‘a’, but I think it works to think of them that way. Just to drive the point home, here is an equivalent definition in C#:

	int a() { return 3; }
	int b() { return a() + 2; }

One concept of functional programming that is quite different from imperative programming is the idea that once you define ‘a’, you can’t change it. Of course, if ‘a’ is actually a function, then this is to be expected. Even in imperative programming languages, the definitions of functions generally don’t change in a compiled program. The fact that ‘a’ is a function also explains something else that is at first peculiar about functional programming, which is the order in which statements get evaluated. Consider:

	a = 3
	b = a + 2
	c = b + 1

In an imperative program, the first line would be evaluated first, followed by the second line and so on. But what if ‘a’, ‘b’ and ‘c’ are actually functions? The equivalent C# code might look something like this:

	int a() { return 3; }
	int b() { return a() + 2; }
	int c() { return b() + 1; }

If there were some code that referenced the ‘c’ function, program execution would jump to function ‘c’, then to function ‘c’ and so on. Furthermore, if no other code in the program referenced functions ‘b’ or ‘c’, then neither of those functions would ever execute. So it is in functional programming. The point is that if we think of syntax like “foo = ...” as defining a function, it’s easier to grasp as opposed to thinking of it as a statement with special rules.

When I learned that pure functional programming has no mutable variables and no “side-effects”, my first thought was “How can I actually do anything?”. Well, Haskell (a pure functional language if there ever was one) does have side-effects, depending on how you choose to define them. For example, there are functions that perform IO (reading and writing to file streams). On the other hand, these side-effects are definitely not incidental, as they are in imperative languages. In Haskell, you know when a function may have side-effects. Generally, Haskell treats side-effecting code rather like nuclear waste, encapsulating it in a special container called a “Monad” and posting lots of scary warning signs around it.

Can a pure functional programming language be useful without side-effects? I think the answer is yes. I work for a company that creates LOBs (Line of Business apps) first and foremost. When you take all the layered architecture and technology concepts out of the picture, an LOB looks something like this:

image

That yellow arrow in the middle of the diagram is a transformation. It transforms the shape of the data as it comes from the user interface into the shape it has in the database. You could imagine that it’s a pipeline, of sorts, constructed of all manner of logical loops and branches. There is no reason that yellow arrow needs to have any mutable state within itself. In fact, for reasons of scalability and consistency, it would be better if that yellow arrow didn’t have any mutable state inside it. As such functional programs are uniquely suited to constructing this sort of pipeline.

Continue to Part 2: Economies of Syntax.

1 comments:

david pasquinelli said...

'a' is a function. it's the constant function, without which the lambda calculus would not work.