PLT MzLib: Libraries Manual Pattern Matching

This library provides functions for pattern-matching Scheme values. (This chapter written by Andrew K. Wright, originally titled Pattern Matching for Scheme.) The following forms are provided:

(match expr clause ...) 
(match-lambda clause ...) 
(match-lambda* clause ...) 
(match-let ((pat expr) ...) expr ···1) 
(match-let* ((pat expr) ...) expr ···1) 
(match-letrec ((pat expr) ...) expr ···1) 
(match-let var ((pat expr) ...) expr ···1) 
(match-define pat expr) 
clause is one of 
 (pat expr ···1) 
 (pat (=> identifier) expr ···1

Figures 1 and 2 give the full syntax for pat patterns.

Figure 1:  Pattern Syntax

Figure 2:  Quasipattern Syntax

The next subsection describes the various patterns.

The match-lambda and match-lambda* forms are convenient combinations of match and lambda, and can be explained as follows:

(match-lambda (pat expr ···1) ...) =(lambda (x) (match x (pat expr ···1) ...))
(match-lambda* (pat expr ···1) ...) =(lambda x (match x (pat expr ···1) ...))
where x is a unique variable. The match-lambda form is convenient when defining a single argument function that immediately destructures its argument. The match-lambda* form constructs a function that accepts any number of arguments; the patterns of match-lambda* should be lists.

The match-let, match-let*, match-letrec, and schemematch-define forms generalize Scheme's let, let*, letrec, and define expressions to allow patterns in the binding position rather than just variables. For example, the following expression:

(match-let ([(x y z) (list 1 2 3)]) body

binds x to 1, y to 2, and z to 3 in the body. These forms are convenient for destructuring the result of a function that returns multiple values. As usual for letrec and define, pattern variables bound by match-letrec and match-define should not be used in computing the bound value.

The match, match-lambda, and match-lambda* forms allow the optional syntax (=> identifier) between the pattern and the body of a clause. When the pattern match for such a clause succeeds, the identifier is bound to a failure procedure of zero arguments within the body. If this procedure is invoked, it jumps back to the pattern matching expression, and resumes the matching process as if the pattern had failed to match. The body must not mutate the object being matched, otherwise unpredictable behavior may result.

19.1  Patterns

Figure 1 gives the full syntax for patterns. Explanations of these patterns follow.

  • identifier (excluding the reserved names ?, $, _, and, or, not, set!, get!, ..., and ..k for non-negative integers k) -- matches anything, and binds a variable of this name to the matching value in the body.

  • _ -- matches anything, without binding any variables.

  • (), #t, #f, string, number, character, 's-expression -- constant patterns that match themselves (i.e., the corresponding value must be equal? to the pattern).

  • (pat1 ··· patn) matches a proper list of n elements that match pat1 through patn.

  • (pat1 ··· patn . patn+1) -- matches a (possibly improper) list of at least n elements that ends in something matching patn + 1.

  • (pat1 ··· patn patn+1 ...) -- matches a proper list of n or more elements, where each element of the tail matches patn+1. Each pattern variable in patn+1 is bound to a list of the matching values. For example, the expression:

    (match '(let ([x 1][y 2]) z) 
      [('let ((binding vals) ...) exp) expr ···1]) 
    binds binding to the list '(x y), vals to the list '(1 2), and exp to 'z in the body of the match-expression. For the special case where patn+1 is a pattern variable, the list bound to that variable may share with the matched value.

  • (pat1 ··· patn patn+1 ..k) -- similar to the previous pattern, but the tail must be at least k elements long. The pattern keywords ..0 and ... are equivalent.

  • #(pat1 ··· patn) -- matches a vector of length n, whose elements match pat1 through patn.

  • #&pat -- matches a box containing something matching pat.

  • ($ struct-name pat1 ··· patn) -- matches an instance of a structure type struct-name, where the instance contains n fields.

    Usually, struct-name is defined with define-struct. More generally, struct-name must be bound to expansion-time information for a structure type (see section 12.6.3 in PLT MzScheme: Language Manual), where the information includes at least a predicate binding and some field accessor bindings (and pat1 through patn correspond to the provided accessors). In particular, a module import or a unit/sig import with a signature containing a struct declaration (see section 35.2) can provide the structure type information.

  • (and pat1 ··· patn) -- matches if all of the subpatterns match. This pattern is often used as (and x pat) to bind x to to the entire value that matches pat.

  • (or pat1 ··· patn) -- matches if any of the subpatterns match. At least one subpattern must be present. All subpatterns must bind the same set of pattern variables.

  • (not pat1 ··· patn) -- matches if none of the subpatterns match. The subpatterns may not bind any pattern variables.

  • (? predicate-expr pat1 ··· patn) -- In this pattern, predicate-expr must be an expression evaluating to a single argument function. This pattern matches if predicate-expr applied to the corresponding value is true, and the subpatterns pat1 through patn all match. The predicate-expr should not have side effects, as the code generated by the pattern matcher may invoke predicates repeatedly in any order. The predicate-expr expression is bound in the same scope as the match expression, so free variables in predicate-expr are not bound by pattern variables.

  • (set! identifier) -- matches anything, and binds identifier to a procedure of one argument that mutates the corresponding field of the matching value. This pattern must be nested within a pair, vector, box, or structure pattern. For example, the expression:

    (define x (list 1 (list 2 3))) 
    (match x [(_ (_ (set! setit)))  (setit 4)]) 
    mutates the cadadr of x to 4, so that x is '(1 (2 4)).

  • (get! identifier) -- matches anything, and binds identifier to a procedure of zero arguments that accesses the corresponding field of the matching value. This pattern is the complement to set!. As with scmkset!, this pattern must be nested within a pair, vector, box, or structure pattern.

  • `quasipattern -- introduces a quasipattern, in which identifiers are considered to be symbolic constants. Like Scheme's quasiquote for data, unquote (,) and unquote-splicing (,@) escape back to normal patterns.

If no clause matches the value, the reult is void.

19.2  Examples

This section illustrates the convenience of pattern matching with some examples. The following function recognizes some s-expressions that represent the standard Y operator:

(define Y? 
    [('lambda (f1) 
       ('lambda (y1) 
         ((('lambda (x1) (f2 ('lambda (z1) ((x2 x3) z2)))) 
           ('lambda (a1) (f3 ('lambda (b1) ((a2 a3) b2))))) 
     (and (symbol? f1) (symbol? y1) (symbol? x1) (symbol? z1) (symbol? a1) (symbol? b1) 
          (eq? f1 f2) (eq? f1 f3) (eq? y1 y2) 
          (eq? x1 x2) (eq? x1 x3) (eq? z1 z2) 
          (eq? a1 a2) (eq? a1 a3) (eq? b1 b2))] 
    [_ #f])) 

Writing an equivalent piece of code in raw Scheme is tedious.

The following code defines abstract syntax for a subset of Scheme, a parser into this abstract syntax, and an unparser.

(define-struct Lam (args body)) 
(define-struct Var (s)) 
(define-struct Const (n)) 
(define-struct App (fun args)) 
(define parse 
    [(and s (? symbol?) (not 'lambda)) 
     (make-Var s)] 
    [(? number? n) 
     (make-Const n)] 
    [('lambda (and args ((? symbol?) ...) (not (? repeats?))) body) 
     (make-Lam args (parse body))] 
    [(f args ...) 
       (parse f) 
       (map parse args))] 
    [x (error 'syntax "invalid expression")])) 
(define repeats? 
  (lambda (l) 
    (and (not (null? l)) 
         (or (memq (car l) (cdr l)) (repeats? (cdr l)))))) 
(define unparse 
    [($ Var s) s] 
    [($ Const n) n] 
    [($ Lam args body) `(lambda ,args ,(unparse body))] 
    [($ App f args) `(,(unparse f) ,@(map unparse args))])) 

With pattern matching, it is easy to ensure that the parser rejects all incorrectly formed inputs with an error message.

With match-define, it is easy to define several procedures that share a hidden variable. The following code defines three procedures, inc, value, and reset, that manipulate a hidden counter variable:

(match-define (inc value reset) 
  (let ([val 0]) 
      (lambda () (set! val (add1 val))) 
      (lambda () val) 
      (lambda () (set! val 0))))) 

Although this example is not recursive, the bodies could recursively refer to each other.