Tuesday, June 24, 2008

First-Class Procedures, Part II: passing as an argument

SICP demands that for functions to qualify as First Class Citizens,
they must satisfy several requirements. I've written about this before in reference to the first requirement: being able to assign names to procedures. The second requirements is the ability to be passed as an argument to another function, which is the topic of this post.

I'll just dive in, again showing multiples languages: Ruby, SICP's original Scheme, Erlang, and Haskell. Again, my friend Aubrey Keus has provided some JavaScript, as well.

This post is fairly Ruby-heavy. A lot of them are likely to be, as it's the language I know best among those being discussed, as well as the the one I use to earn my Yankee Dollars.

In Ruby:


$ irb

irb(main):001:0> def call_proc_arg_on_2(first_class_proc)
irb(main):002:1> first_class_proc.call 2
irb(main):003:1> end
=> nil
irb(main):004:0> add1 = lambda { |x| 1 + x }
=> #
irb(main):005:0> call_proc_arg_on_2(add1)
=> 3


Here we've defined a method that expects an argument called first_class_proc, which is (as you might expect) a Proc. add1 is such a Proc which adds 1 to its argument - hence the name. When we pass that in as the argument for call_proc_arg_on_2, we get the expected value of 3.

Proc & block syntax


Ruby has a curious subtlety in its treatment of Procs. Notice that we need to use the call method in order to make the Proc do its thing. Not true for some other languages below. Why would anyone do this when designing a language?

Below we see a more typical way of passing a procedure as an argument to a higher-order method.


irb(main):006:0> def call_block_on_2()
irb(main):007:1> yield 2
irb(main):008:1> end
=> nil
irb(main):009:0> call_block_on_2 { |x| 1 + x }
=> 3


The main differences between this and the previous example are as follows:

  1. The name of the method reflects the differences in expected arguments: a Proc vs. a block

  2. call_block_on_2 does not have a parameter in its declaration, just the empty parentheses

  3. When we call call_block_on_2, we don't pass a variable with a name (like add1), instead we just give it a block, which is the business between the {} braces. Note that this block is what we give to the lambda method to create a full-fledged Proc in the earlier example.

  4. The yield method is roughly akin to the call method, except that it automatically knows that it should operate on whatever block was given, without having to refer to it by name.



So what's the deal with this? Standard Ruby style is to call methods with blocks for operations like iteration (with each), list transformation (map) and filtering (select, partition), and so on. This is so common that the syntactic sugar of being able to easily do this outweighed the consistency of being able to call function arguments without needing the call method.

In MIT Scheme:


$ mit-scheme

1 ]=> (define call-on-2 (lambda (x) (x 2)))

;Value: call-on-2

1 ]=> (define add1 (lambda (x) (+ x 1)))

;Value: add1

1 ]=> (call-on-2 add1)

;Value: 3


Check it out. Simple procedure declarations, and procedures that can be passed as args with no special syntax. Some would say no syntax at all, both among the pro- and anti-Lisp communities. Not wishing to add to an old flame war, I'll move on.

In Erlang:


$ erl

Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.5.5 (abort with ^G)
1> Add1 = fun(X) -> 1 + X end.
#Fun
2> CallOn2 = fun(F) -> F(2) end.
#Fun
3> CallOn2(Add1).
3


I'm becoming a pretty big Erlang fan. Note that the variable Add1 is in all caps - this serves a semantic purpose in Erlang. User-defined entities in Erlang that are in lower-case are generally either straightforward functions or atoms, which are essentially just values. They're very similar to Ruby's Symbols, for example.

This Erlang example lets the coder define a function as a variable which is then usable as an argument to another function without any special syntax tricks. CallOn2 just applies the function arg F to the literal number 2 and returns the result, which is just what we want in all these examples.

In Haskell:


(I haven't written about Haskell before now. It's purely functional, strongly typed with inferencing, and offers lazy (non-strict/non-eager) evaluation. Check it out. It has several implementations - I'll show it in hugs, a REPL shell.

Assuming this file CallOnTwo.hs:

add1 = (+) 1
callOnTwo f = f 2


Executing hugs as follows:

$ hugs CallOnTwo.hs
__ __ __ __ ____ ___ _________________________________________
|| || || || || || ||__ Hugs 98: Based on the Haskell 98 standard
||___|| ||__|| ||__|| __|| Copyright (c) 1994-2005
||---|| ___|| World Wide Web: http://haskell.org/hugs
|| || Bugs: http://hackage.haskell.org/trac/hugs
|| || Version: September 2006 _________________________________________

Haskell 98 mode: Restart with command line option -98 to enable extensions

Type :? for help
Main> callOnTwo add1
3


What's going on here? The actual use of callOnTwo in the REPL example should be readable enough - it's quite similar to all of the other examples.

The lib file CallOnTwo.hs is a bit more interesting. Note the definition of add1 as (+) 1 - it leaves off the argument to add1. Rather than saying the equivalent of add1 of x is x plus one, it says (in Haskell) add1 is the function that adds one to whatever it gets.

This is what Haskell calls pointfree style. In the pointfree style, a coder leaves off terms from a definition when able. The resulting code is generally thought to be cleaner and more compact. It also lends itself well to groking function composition.

One further complication is that the + is in parentheses. This is because + is most commonly used as an infix operator, such that its arguments appear on either side, as opposed to Scheme's (+ x 1), where the function appears first (as is usual in Scheme), and the arguments follow. The wrapping in parentheses converts the traditionally infix + into a prefix function.

One could argue that this slight syntax massaging is roughly similar to Ruby's need for call, thereby angering fans of Ruby, Haskell, and Scheme simultaneously. Probably others too. Maybe F# fans. Who knows?

More Ruby:


In Ruby again, following the Object-Oriented nature of the language, we can do similar operations in which the functions are explicitly understood to be methods attached to specific objects. In this case, the Integer 1.

Assuming the existence of this file func_as_arg.rb:

class Integer
def call_symbol_arg_on_2(first_class_proc_as_sym)
2.send(first_class_proc_as_sym, self)
end
end


We can then do operations in irb again.


$ irb -r func_as_arg.rb
irb(main):001:0> 1.call_symbol_arg_on_2(:+)
=> 3
irb(main):002:0> 1.call_symbol_arg_on_2(:-)
=> 1


With the results that you'd expect.

In JavaScript:



function call_proc_arg_on_2(first_class_proc) {
return first_class_proc(2);
};

Number.prototype.call_proc_arg_on_2 = function(first_class_proc) {
return first_class_proc(this);
};

call_proc_arg_on_2(add1);
Number(100).call_proc_arg_on_2(add1);


Aubrey's JavaScript code demonstrates both the initial example common across all the languages, as well as the equivalent of my second Ruby example, in which the desired behavior is attached to existing number objects within your language's existing workspace.




I split infinitives. If such things get you riled up, I suggest you read about scientific linguistics a bit more. A good starting point is Language Myths, edited by Laurie Bauer. In this particular case, the splitting was to clarify that easily modifies the use the use of the more-common block syntax in Ruby, rather than suggesting that easily modified the strength of the outweighing. Embedding the adverb within the verb is excellent for disamgiuation of this sort. Try it - you'll like it. Trust me.

No comments: