Splarnektity: Functional Pattern Matching for Python

Splarnektity provides a limited, but valuable, form of functional pattern matching to Python, a la OCaml, Erlang, Haskell, et al.. Another way of thinking of this is: pattern matching over Python lists.

Contents

1   History

2008-08-19 16:00:24 UTC David Baird <dhbaird@gmail.com>

Version 0.3 release. This release fixes some important bugs with .others (there was an off-by-one errors that prevented matching of zero-length tuples; and there was a missing check that could match tuples that shouldn't match). Regression tests have been written for these bugs, and a few more units tests have been added.

2008-08-09 23:59:10 UTC David Baird <dhbaird@gmail.com>

Version 0.2 release. This release is a documentation and example update. The vars() method was added to provide a more consistent interface (just in case it is ever needed) than var().

2008-07-24 00:51:13 UTC David Baird <dhbaird@gmail.com>

First release, including documentation (in the module docstring) and some test cases. I originally wanted to use the name FMatch, but that was already taken; So were PFMatch and FunMatch. I had to draw the line at PyMatch and PyFMatch. Those names just don't sound cool. In the end, I gave up and just decided to name the module Splarnektity (inspired by Ze Frank).

2   License

The license for Splarnektity is a standard BSD license. See the comments at the top of the source code for the license information.

3   Motivation

This library was originally born from a motivation to process sexps in Python. S-expressions (aka sexps) are a concise way to represent structured data that both humans and computers can interpret. A sexp might look like:

'(netlist
    (namespace connector (main-board J2))
    (namespace fpga (main-board U17))
    (net (connector 1) (fpga AE12))
    (net (connector 2) (fpga AA22))
    (net (connector 3) (connector 4) (fpga B32))
)

See, easy for humans to grasp, right? But what about computers?

3.1   Python (Version 2.5) versus Functional

Let's examine a family of [currently] related languages: Python, Ruby, C, and Java, but with emphasis on Python. We're going to assume that the sexp has been converted to a Python-style list or tuple already and thus looks like this:

s = ('quote',
  ('netlist',
    ('namespace', 'connector', ('main-board', 'J2')),
    ('namespace', 'fpga', ('main-board', 'U17')),
    ('net', ('connector', '1'), ('fpga', 'AE12')),
    ('net', ('connector', '2'), ('fpga', 'AA22')),
    ('net', ('connector', '3'), ('connector', '4'), ('fpga', 'B32'))
  )
)

Phew! That's a lot of commas and quotes! Here's how one would normally process this statement in Python:

assert(len(s) == 2 and s[0] == 'quote')
netlist = s[1]
assert(len(netlist) > 0 and netlist[0] == 'netlist')
for item in netlist:
    assert(len(item) > 0)
    if item[0] == 'namespace':
        assert(len(item) == 3)
        name, components = item[1], item[2]
        print 'namespace', name, components
    if item[0] == 'net':
        assert(len(item) > 1)
        connections = item[1:]
        print 'net',
        for namespace, port in connections:
            print '%s:%s' % (namespace, port),
        print

There are several annoying aspects about this code. The first annoying aspect is that the the programmer has to keep checking the size of lists before accessing items in the list. The second annoying thing is that before unpacking a list, the programmer has to be certain what type of list is about to be unpacked. Functional pattern matching solves these annoyances, freeing the programmer's mind to focus on more important things!

Here is an example of what functional pattern matching looks like when it is officially supported by a language:

match s:
  when ('quote', netlist):
    match netlist:
      when ('netlist', *items):
        for item in items:
          match item:
            when ('namespace', name, components):
              print 'namespace', name, components
            when ('net', *connections):
              print 'net',
              for connection in connections:
                match connection:
                  when (namespace, port):
                    print '%s:%s' % (namespace, port),
              print

I'm going to highlight a portion of each code to demonstrate where functional pattern matching really starts to pay off: when multiple cases must be handled. Look more closely at this bit of Python which has more than one case:

...
    if item[0] == 'namespace':
        assert(len(item) == 3)
        name, components = item[1], item[2]
        ...
    if item[0] == 'net':
        assert(len(item) > 1)
        connections = item[1:]
        ...

and also look at the pseudo code for these cases:

...
            match item:
              when ('namespace', name, components):
                ...
              when ('net', *connections):
                ...

Notice that the comparison, unpacking, and length checking all happen in one statement (the when statement) in the pseudo code. To be fair, the Python code could be compacted a little bit more, but it is still too verbose and the programmer still must worry about indexing and unpacking:

...
    if len(item) == 3 and item[0] == 'namespace':
        name, components = item[1], item[2]
    if len(item) > 1 and item[0] == 'net':
        connections = item[1:]
        ...

So far, the pseudo code looks kinda good.

What else could we ask for? How about nested patterns and also conditions. Okay, here they are:

match s:
  when ('quote', ('netlist', *items)), len(items) > 0: # <- look here!!
    for item in items:
      match item:
        when ('namespace', name, components):
          print 'namespace', name, components
        when ('net', *connections):
          print 'net',
          for connection in connections:
            match connection:
              when (namespace, port):
                print '%s:%s' % (namespace, port),
          print

How about less indentation?:

match s:
when ('quote', ('netlist', *items)), len(items) > 0:
    for item in items:
      match item:
      when ('namespace', name, components):
          print 'namespace', name, components
      when ('net', *connections):
          print 'net',
          for connection in connections:
            match connection:
            when (namespace, port):
                print '%s:%s' % (namespace, port),
          print

3.2   A Real FMatch Example

The following pseudo code:

match s:
  when ('quote', ('netlist', *items)), len(items) > 0:
    for item in items:
      match item:
        when ('namespace', name, components):
          print 'namespace', name, components
        when ('net', *connections):
          print 'net',
          for connection in connections:
            match connection:
            when (namespace, port):
                print '%s:%s' % (namespace, port),
          print

translates to the following real Python code with FMatch:

import splarnektity as fmatch

s = ('quote',
  ('netlist',
    ('namespace', 'connector', ('main-board', 'J2')),
    ('namespace', 'fpga', ('main-board', 'U17')),
    ('net', ('connector', '1'), ('fpga', 'AE12')),
    ('net', ('connector', '2'), ('fpga', 'AA22')),
    ('net', ('connector', '3'), ('connector', '4'), ('fpga', 'B32'))
  )
)

f = fmatch.FMatch(s)
items = f.var()

if f.when(('quote', ('netlist', items.others))) and len(items.v) > 0:
    for item in items.v:
        f = fmatch.FMatch(item)
        name, components, connections = f.var(3)
        if False: pass
        elif f.when(('namespace', name, components)):
            print 'namespace', name.v, components.v
        elif f.when(('net', connections.others)):
            print 'net',
            for connection in connections.v:
                g = fmatch.FMatch(connection)
                namespace, port = g.var(2)
                if g.when((namespace, port)):
                    print '%s:%s' % (namespace.v, port.v),
            print

The result of executing this program is:

namespace connector ('main-board', 'J2')
namespace fpga ('main-board', 'U17')
net connector:1 fpga:AE12
net connector:2 fpga:AA22
net connector:3 connector:4 fpga:B32

4   Caveats

4.1   Splarnektity (i.e. Reality) versus The Ideal

4.1.1   Aesthetics

Here's a bit of the ideal pseudo code:

...
      match item:
        when ('namespace', name, components):
          print 'namespace', name, components
        when ('net', *connections):
          ...
        else:
          ...

and here's the reality code:

...
        f = fmatch.FMatch(item)
        name, components, connections = f.var(3)
        if   f.when(('namespace', name, components)):
            print 'namespace', name.v, components.v
        elif f.when(('net', connections.others)):
            ...
        else:
          ...

The obvious difference is that Python has no match/when keywords (anyone interested in taking up an effort to fix this?? ^_^). Also, all of the variables must be declared in advance in the real code (i.e. name, ... = f.var(3)) which is kind of a pain. Also, accessing vars in the real code requires accessing the .v member of the var. To get the value of a var, the .v accessor (i.e. name.v) is used. These methods have pros and cons. The pro is that all sorts of cool methods (like .int to check if the type of the variable is an int or long) can be added to the pattern/match objects. The con is that name.v looks ugly.

To summarize, the aesthetic shortcomings of Splarnektity are:

  • Variables must be instantiated in advance
  • The value of a variable must be accessed via the .v accessor method

4.1.2   Performance

The performance of Splarnektity is surely not as great as achieved with a system that actual compiles patterns into a pattern matching automaton.

4.2   Python Tuple Syntax

This is mentioned in the Python docs, but in case someone doesn't already know, here is how to create Tuple of one item within Python:

(1,)

The comma is required. Otherwise, the parentheses becomes an order-of-precedence kind of operator (rather than a tuple syntax).

5   FMatch Examples and Applications

5.1   A Simplistic FMatch Example

Example usage:

from splarnektity import FMatch, X

value = ('do not care', (17, 'happy'), 12)
f = fmatch.FMatch(value)
a = f.var()

if f.when((X, (17, 'happy'), a)) and a.int and a.v <= 2:
    print a.v # >>> (NOT REACHED)

if f.when((X, (17, 'sad'), a)) and a.int and a.v > 2:
    print a.v # >>> (NOT REACHED)

if f.when((X, (17, 'happy'), a)) and a.int and a.v > 2:
    print a.v # >>> 12

if f.when((X, (17, 'happy'), a.I)) and a.v > 2:
    print a.v # >>> 12

5.2   Term Rewriting

Caveat: I'll be impressed if an efficient term rewriting system can be made with Splarnektity. Actually, using Splarnektity for large scale term rewriting is probably a bad idea. If a better term rewriting system is needed, these languages/tools are worth looking into:

Q (and its successor, Pure) is pretty fun though and I recommend checking it out.

With that said, here is an example of a very simple term rewriter based on Splarnektity:

examples/simple_term_rewriting.html

6   Download

Latest release:

Bazaar repository:

bzr clone http://www.aclevername.com/projects/splarnektity