Matzo Tutorial

Matzo is a small language for building random strings. It is technically possible to do other computations in Matzo, but it is difficult and annoying and falls outside what Matzo was intended for.

Basic Expressions

Matzo makes a distinction between statements, which are either definitions or commands, and expressions. A Matzo program consists of several statements, each of which is terminated by a semicolon. A basic Matzo program might look like

puts "Hello, world!";

which would print the string "Hello, world!" to the screen.

But this is a language for randomness! Let's make our program do something more unexpected.

puts "Hello, world!" | "Hola, mundo!";

Whatever is given to puts must be an expression; in the first program, our expression was a simple string, but here, we're using the choice operator to let our program choose between the two alternatives. If you ran this program multiple times, you would sometimes find it printing Hello, world! and sometimes Hola, mundo! depending on which happened to be chosen.

In addition to choice, we can also catenate by merely putting two expressions next to each other. For example, this will print the same thing as our first program:

puts "Hello, " "world!";

In Matzo, catenation binds more tightly than choice, so the program

puts "A" "B" | "C" "D";

will produce either AB or CD, but we can always group with parentheses:

puts "A" ("B" | "C") "D";

which will produce either ABD or ACD. We can put this together to make a more varied program:

puts ("Hello" | "Yo" | "Greetings") ", world!";

There are a few more things we can do with basic expressions. With a choice, we can weight a particular choice to be worth more or less, e.g.

puts (5: "Hello" | "Goodbye") ", world.";

Without a weighting, Matzo tries to choose each path equally often, so if we want a particular path to be more likely, we preface it with a number. This is essentially the same as if we had written the following:

puts ("Hello" | "Hello" | "Hello" | "Hello" | "Hello" | "Goodbye") ", world.";
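The weighting rule can also be modeled outside Matzo. The following is a minimal Python sketch, not Matzo's actual implementation: the helper names weighted_choice and expand are hypothetical, and an alternative without an explicit weight is assumed to have weight 1.

```python
import random

def weighted_choice(alternatives, rng=random):
    """Pick one string from a list of (weight, string) pairs.

    Models `5: "Hello" | "Goodbye"` as [(5, "Hello"), (1, "Goodbye")].
    """
    weights = [w for w, _ in alternatives]
    values = [v for _, v in alternatives]
    return rng.choices(values, weights=weights, k=1)[0]

def expand(alternatives):
    """Unweighted form: repeat each alternative `weight` times."""
    return [v for w, v in alternatives for _ in range(w)]

greeting = [(5, "Hello"), (1, "Goodbye")]
print(weighted_choice(greeting) + ", world.")
print(expand(greeting))  # five copies of Hello, then one Goodbye
```

Picking uniformly from the expanded list gives the same distribution as picking by weight, which is the equivalence described above.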

There's another specialized operator which is the repetition operator, written as @. This is for the case in which we want to repeat a particular expression catenated with itself a random number of times, e.g.

puts 3@"na";

This could print either na, nana, or nanana, and the syntax 3 @ x can be thought of as shorthand for (x | x x | x x x). This is merely a convenience.
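As a rough model of the repetition operator, here is a small Python sketch. The function name repeat is hypothetical, and the sketch assumes two things that follow from the shorthand reading above: every count from 1 to n is equally likely, and each copy of the expression is evaluated independently.

```python
import random

def repeat(n, gen, rng=random):
    """Model of `n @ x`: catenate between 1 and n evaluations of gen."""
    count = rng.randint(1, n)          # 1..n inclusive
    return "".join(gen() for _ in range(count))

print(repeat(3, lambda: "na"))         # na, nana, or nanana
```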

Let's put this all together and write a generator for simple nonsense words. Each word should be one to four syllables, where each syllable is one of the consonants p, t, k, n, l followed by one of the vowels a, i, u.

(* A generator for nonsense words *)
puts 4 @ (("p" | "t" | "k" | "n" | "l") ("a" | "i" | "u"));

Definitions

We can also give names to expressions using two different kinds of definition statements. The first is a normal definition, where we assign a name to an expression with the := operator, e.g.

(* A generator for nonsense words using named subexpressions *)
vowel := "a" | "e" | "i" | "o" | "u";
consonant := "p" | "t" | "k" | "n" | "l";
puts 4 @ (consonant vowel);

The second is for the special case in which we want some expression to be a basic choice between a set of strings. Because this is a common situation, there is a special syntax in which you omit both the | operator and the quotes, and instead use the ::= operator for assignment:

(* A generator for nonsense words using literal assignments *)
vowel ::= a e i o u;
consonant ::= p t k n l;
puts 4 @ (consonant vowel);
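To make the behavior of named definitions concrete, here is a hedged Python sketch of the program above. It is a model rather than a translation: literal_def and word are hypothetical names, and each definition is represented as a function so that every use makes a fresh random choice.

```python
import random

def literal_def(words):
    """Model of `name ::= w1 w2 ...`: a fresh uniform choice on every use."""
    options = words.split()
    return lambda: random.choice(options)

vowel = literal_def("a e i o u")
consonant = literal_def("p t k n l")

def word(max_syllables=4):
    """Model of `4 @ (consonant vowel)`: 1 to 4 consonant-vowel syllables."""
    count = random.randint(1, max_syllables)
    return "".join(consonant() + vowel() for _ in range(count))

print(word())
```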

Note that definitions can be recursive and mutually recursive, and that the order of definitions does not matter so long as all definitions that might be used by a particular puts statement are placed before that statement. Recursive definitions are not always useful, but they can express open-ended repetition:

goesOn := 4: " and on" goesOn | "...";
puts "And it goes on" goesOn;
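The recursion above terminates because the non-recursive branch is eventually chosen. A minimal Python sketch of the same idea follows; goes_on is a hypothetical name, and the sketch assumes the weight 4 means the recursive branch is taken with probability 4/5.

```python
import random

def goes_on(rng=random):
    """Model of `goesOn := 4: " and on" goesOn | "..."`."""
    # Weight 4 against weight 1: recurse with probability 4/5.
    if rng.random() < 4 / 5:
        return " and on" + goes_on(rng)
    return "..."

print("And it goes on" + goes_on())
```

Under that assumption the number of repetitions is geometrically distributed, so the output contains four copies of " and on" on average, and very long outputs are possible but increasingly unlikely.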

The Fix Statement

One final kind of statement is the fix statement. This takes a name and 'fixes' it to be a particular value. For example, the following program could print either A and A, A and B, B and A, or B and B:

x ::= A B;
(* Each time x is used, it might have a different value *)
puts x " and " x;

On the other hand, the program below can only ever print A and A or B and B:

x ::= A B;
fix x;
(* x can still be either A or B, but it will consistently be
 * either A or B henceforth *)
puts x " and " x;

An example use for this is in generating a character description. Here is a naïve approach:

person ::= man woman;
puts "This person is a " ("tall" | "short") " " person
  " with " ("red" | "black") " hair.";
puts "This " person " holds a " ("green" | "blue") " " ("book" | "cup") ".";
puts "This " person " is " ("happy" | "angry") ".";

The problem is that every time we use person, we have a 50/50 chance of getting either man or woman selected, but we'd like to select it once and then leave it the same afterwards. This is what fix does for us, so this program has the correct behavior:

person ::= man woman;
fix person;
puts "This person is a " ("tall" | "short") " " person
  " with " ("red" | "black") " hair.";
puts "This " person " holds a " ("green" | "blue") " " ("book" | "cup") ".";
puts "This " person " is " ("happy" | "angry") ".";
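The difference between a plain name and a fixed name can be modeled in Python. This is only a sketch of the behavior described above, not how Matzo is implemented: person and fix are hypothetical helper names, and the sketch assumes the value is chosen at the moment the name is fixed.

```python
import random

def person():
    """Model of `person ::= man woman`: re-evaluated on every use."""
    return random.choice(["man", "woman"])

def fix(gen):
    """Model of `fix`: evaluate once, then always return that value."""
    value = gen()
    return lambda: value

# Unfixed: the two uses are independent and may disagree.
print("This " + person() + " holds a book. This " + person() + " is happy.")

# Fixed: both uses refer to the same choice.
fixed_person = fix(person)
print("This " + fixed_person() + " holds a book. This " + fixed_person() + " is happy.")
```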

Symbols, Tuples, and Functions

All identifiers in Matzo start with lowercase letters. Something that starts with an uppercase letter is considered a symbol, which is a kind of abstract datatype.

(* This is a syntax error, because symbols can't be used as variables *)
(* Foo := 1; *)
(* This can return either the symbol Foo or the symbol Bar *)
symbol := Foo | Bar;

A tuple is a comma-separated list of expressions in between angle brackets.

(* This is a tuple of three elements: a string, a number, and a symbol. *)
tup := <"hello", 2, This | That>;
puts "This tuple is " tup;
(* This can print either <"hello",2,This> or <"hello",2,That> *)

A function is written anonymously (i.e. without a name) and is assigned a name in the same way as other expressions. Calling a function is done with the . operator, which admittedly looks unusual.

greet := { name => "Hello, " name "!" };
puts greet."world";
(* prints "Hello, world!" *)

Functions can pattern-match over their arguments. Each case takes the form pattern => expression, and the cases are separated by semicolons.

nickname := { "John" => "Johnny"   (* If our argument is John,
                                    * return Johnny... *)
            ; "Nick" => "Hobo Sex" (* If our argument is Nick,
                                    * return Hobo Sex... *)
            ; name   => name };    (* otherwise, return the same
                                    * name we were given. *)
puts nickname."John"; (* Prints "Johnny" *)
puts nickname."Bob";  (* Prints "Bob" *)

Symbols, strings, and numbers only match equivalent symbols, strings, and numbers. Tuples match any tuple of the same length whose elements all match as well. Variables match any value and bind the value to the variable in the local scope. The special variable _ matches anything but does not bring it into scope. It is often used as a final wildcard case.
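The matching rules can be written down as a short Python sketch. It is an illustration under assumptions, not Matzo's matcher: Sym, Var, match, and apply_cases are hypothetical names, tuples are modeled as Python tuples, and cases are assumed to be tried from top to bottom.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sym:
    name: str      # a symbol such as Foo

@dataclass(frozen=True)
class Var:
    name: str      # a pattern variable; "_" is the wildcard

def match(pattern, value, env=None):
    """Return a dict of bindings if value matches pattern, else None."""
    env = {} if env is None else env
    if isinstance(pattern, Var):
        if pattern.name != "_":          # _ matches but binds nothing
            env[pattern.name] = value
        return env
    if isinstance(pattern, tuple):
        if not isinstance(value, tuple) or len(pattern) != len(value):
            return None
        for p, v in zip(pattern, value):
            if match(p, v, env) is None:
                return None
        return env
    # Symbols, strings, and numbers match only equal values.
    return env if pattern == value else None

def apply_cases(cases, arg):
    """Try each (pattern, body) case in order; body receives the bindings."""
    for pattern, body in cases:
        env = match(pattern, arg)
        if env is not None:
            return body(env)
    raise ValueError("no case matched")

nickname = [("John", lambda env: "Johnny"),
            (Var("name"), lambda env: env["name"])]
print(apply_cases(nickname, "John"))  # Johnny
print(apply_cases(nickname, "Bob"))   # Bob
```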

The primary use-case of symbols is to be used as an abstract data type to be processed by functions, e.g.

gender  := Male | Female;
pronoun := { Male => "he"; Female => "she" };
possPronoun := { Male => "his"; _ => "her" };
noun := { Male => "man"; _ => "woman" };
g := gender; fix g;
puts "You speak to a " noun.g " who tells you that "
  possPronoun.g " name is Pat and that " pronoun.g
  " has just arrived from Newark.";

Functions can only take one argument. There are two ways around this. One is to take a tuple and pattern-match over its fields:

noun := { <Adult,Male>   => "man"
        ; <Adult,Female> => "woman"
        ; <Child,Male>   => "boy"
        ; <Child,Female> => "girl"
        };
puts noun.<Adult | Child, Male | Female>;

The other way is to curry your function, i.e. to have a one-argument function that itself returns another function. To this end, Matzo understands x.y.z as (x.y).z:

age := Adult | Child; gender := Male | Female;
noun := { Adult => { Male => "man"; Female => "woman" }
        ; Child => { Male => "boy"; Female => "girl" }
        };
puts noun.age.gender;
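For readers more used to other languages, the two styles look roughly like this in Python. This is a sketch for comparison only: symbols are modeled as plain strings, and the names noun_by_pair and noun are illustrative.

```python
import random

# Tuple style: one argument that happens to be a pair.
noun_by_pair = {
    ("Adult", "Male"): "man", ("Adult", "Female"): "woman",
    ("Child", "Male"): "boy", ("Child", "Female"): "girl",
}

# Curried style: a one-argument function returning another function.
def noun(age):
    table = {"Adult": {"Male": "man", "Female": "woman"},
             "Child": {"Male": "boy", "Female": "girl"}}
    return lambda gender: table[age][gender]

age = random.choice(["Adult", "Child"])
gender = random.choice(["Male", "Female"])
print(noun_by_pair[(age, gender)])   # like noun.<age, gender>
print(noun(age)(gender))             # like noun.age.gender, i.e. (noun.age).gender
```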

Because functions are just values, they can be used in any way you would use values, including passing them to other functions:

map := { f => { <> => <>
               ; <fst,rst> => <f.fst, map.f.rst>
              } };
double := { x => x x };
puts map.double.<"one",<"two",<>>>;
(* Prints <oneone,<twotwo,<>>> *)
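The same recursion over nested pairs can be sketched in Python, which may help if the tuple encoding of lists is unfamiliar. The empty tuple stands in for <>, a two-element tuple stands in for <fst,rst>, and map_list is a hypothetical name.

```python
def map_list(f, xs):
    """Model of `map`: xs is () or a pair (first, rest)."""
    if xs == ():
        return ()
    first, rest = xs
    return (f(first), map_list(f, rest))

double = lambda x: x + x
print(map_list(double, ("one", ("two", ()))))
# ('oneone', ('twotwo', ()))
```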

Syntactic Sugar

With a single exception, this section consists entirely of features that could be written in terms of more primitive constructs, but exist to make life easier.

Let-binding involves proper shadowing, e.g.

x := "bar";
y := (let x := "foo" in x) x;
puts y; (* prints foobar *)

You can have any number of bindings, separated by semicolons, in a let-expression. Due to the call-by-name nature of the language, it functions effectively as a letrec in Scheme parlance, i.e. terms can refer to any other terms in the let binding, including later ones, so

y := (let a := 4: "1"b | "?"; b := 4: "2"a | "!" in a);
puts y;

could possibly print 12121!.
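The mutually referring bindings can be modeled with two local functions in Python. This is a sketch under the assumption that each weight of 4 means the longer branch is taken with probability 4/5; run is a hypothetical name.

```python
import random

def run(rng=random):
    """Model of `let a := 4: "1"b | "?"; b := 4: "2"a | "!" in a`."""
    def a():
        return "1" + b() if rng.random() < 4 / 5 else "?"
    def b():
        return "2" + a() if rng.random() < 4 / 5 else "!"
    return a()

print(run())
```

Because a only ever hands over to b and b only ever hands over to a, the digits in the output alternate until one of the two definitions picks its terminating branch.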

Case expressions are syntactic sugar for functions, i.e.

(* The following two definitions are equivalent *)
f := { x => case x of { Foo => 1; Bar => 2 } };
g := { x => { Foo => 1; Bar => 2 }.x };

The curly braces are obligatory in case expressions. Case expressions add no expressive power, but they can sometimes make definitions more readable.

Let bindings can do one more unusual thing; namely, you can lexically fix a value. This value will only be fixed within the scope created by the let, e.g.

y ::= a b;
z := (let fix y in y y y) y;
puts z;

could print any one of aaab, aaaa, bbba, or bbbb.
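Lexical fixing can be pictured as choosing a value once on entry to a scope. The Python sketch below models the example above; y and z mirror the Matzo names, and the local variable fixed is a stand-in for the fixed binding, which is an assumption about how to model it rather than a description of the implementation.

```python
import random

def y():
    """Model of `y ::= a b`."""
    return random.choice(["a", "b"])

def z():
    """Model of `z := (let fix y in y y y) y`."""
    fixed = y()                 # chosen once, visible only inside the let
    inner = fixed * 3           # y y y, all the same letter
    return inner + y()          # the outer y is still random

print(z())
```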

TENTATIVE EXTENSION: I may at some point choose to implement a form of syntactic sugar along the lines of let x := fix y in ..., which would be equivalent to let x := y; fix x in ..., i.e. fix a variable without shadowing the original definition. This may also be a useful top-level construct.

Builtin Functions

There are a few predefined functions. There are a handful for working with letter case:

puts capitalize."foo"; (* prints Foo *)
puts to-upper."foo";   (* prints FOO *)
puts to-lower."FOO";   (* prints foo *)

a function for mapping functions over strings:

puts string-map.{x => x | "*"}."ab";
  (* prints either ab, a*, *b, or ** *)

and several for working with tuples:

puts length.<1,2,3>; (* prints 3 *)
puts append.<1,2>.<3,4>; (* prints <1,2,3,4> *)
puts tuple-map.{ x => "(" x ")" }.<1,2,3>;
  (* prints <(1),(2),(3)> *)
puts tuple-fold
  . { x => { y => append.y.<x> } }
  . <> . <1,2,3>; (* prints <3,2,1> *)
puts concat.<"A","B","C">; (* prints ABC *)
puts choose.<"A","B","C">; (* prints A or B or C *)
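Two of these builtins are easier to follow with a model in hand. The Python sketch below is an inference from the examples above, not a description of Matzo's implementation: string_map and tuple_fold are hypothetical Python names, and tuple_fold is assumed to be a right fold that calls its function as f(element)(accumulator), since that is the reading under which the example yields <3,2,1>.

```python
import random
from functools import reduce

def string_map(f, s):
    """Apply f to each character and catenate the results."""
    return "".join(f(ch) for ch in s)

def tuple_fold(f, init, xs):
    """Right fold; f is curried and called as f(element)(accumulator)."""
    return reduce(lambda acc, x: f(x)(acc), reversed(xs), init)

maybe_star = lambda ch: random.choice([ch, "*"])   # { x => x | "*" }
snoc = lambda x: lambda y: y + (x,)                # { x => { y => append.y.<x> } }

print(string_map(maybe_star, "ab"))      # one of ab, a*, *b, **
print(tuple_fold(snoc, (), (1, 2, 3)))   # (3, 2, 1)
```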

There are at present no arithmetic functions, but it's likely that if they're added, they'll be added as add and sub and so forth, rather than as operators.

An Elaborate Example

Let's generate a random Roman. This uses the built-in capitalize function which takes a string and capitalizes the first letter of it.

gender := Male | Female;
byGender := { <m,f> => { Male => m; Female => f } };

ending  := byGender.<"us","a">;
pronoun := byGender.<"He","She">;
noun    := byGender.<"man","woman">;

cons  ::= p t c d g r l m n x;
vowel ::= a e i o u;

name := { g => (vowel | "") (3 @ (cons vowel)) cons (ending.g) };

hairColor ::= black brown blonde;
eyeColor  ::= brown green blue;

job := { g =>
    "stonemason"
  | "baker"
  | "accountant"
  | case g of 
      { Male   => "fisherman"
      ; Female => "fisherwoman"
      } };
tool := { "stonemason" => "chisel"
        ; "baker"      => "bowl"
        ; "accountant" => "tablet"
        ; _            => "fishing pole"
        };
person :=
  let my-gender := gender; fix my-gender in
  let my-job := job.my-gender; fix my-job in
    "You come across " (capitalize.(name.my-gender)) ", a Roman " (noun.my-gender)
    " from the city of " (capitalize.(name.Female)) ". "
    (pronoun.my-gender) " is a hardworking " my-job " with "
    hairColor " hair and " eyeColor " eyes. "
    (capitalize.(pronoun.my-gender)) " carries a " (tool.my-job) " and smiles often.";
puts person;

Peano Arithmetic

To showcase Matzo's theoretical capability, here is a basic definition of Peano arithmetic. Matzo is not particularly well-suited for this, but, hey, why not?

(* A Peano number is either zero, or the successor of some other
 * Peano number. We represent zero as the symbol Z, and the successor
 * of x as a tuple <S,x> *)
zero := Z;
succ := { x => <S,x> };

(* Matzo doesn't have a boolean type, so we'll fake it using symbols. *)
isZero := { Z => True; <S,_> => False };

(* The add function is curried. Adding Z to y is always the same as y.
 * Adding (S x) to y is the same as adding x to (S y). *)
add := { Z     => { y => y }
       ; <S,x> => { y => add.x.<S,y> } };
(* Adding the successor of zero, i.e. 1 *)
incr := add.(succ.zero);

two  := succ.(succ.zero);
four := succ.(succ.two);

(* Because booleans don't exist, we have to roll our own if. This is
 * also a curried if. *)
if := { True  => { x => { _ => x } }
      ; False => { _ => { y => y } }
      };

(* And some tests. *)

puts "Is 0 zero?";
puts isZero.Z;

puts "Is 4 zero?";
puts isZero.four;

puts "4 + 2 = ";
puts add.four.two;

puts "Is zero zero?";
puts if.(isZero.Z)."yes"."no";

puts "Is 0+1 zero?";
puts if.(isZero.(incr.Z))."yes"."no";
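For comparison, the same encoding can be sketched in Python, which makes it easy to check the arithmetic. Symbols are modeled as strings and <S,x> as a two-element tuple; to_int is a helper added here for readability and has no counterpart in the Matzo program.

```python
Z = "Z"

def succ(x):
    return ("S", x)

def is_zero(n):
    return n == Z

def add(x):
    """Curried add: add(x)(y) moves successors from x onto y."""
    def go(y):
        if x == Z:
            return y
        _, pred = x
        return add(pred)(succ(y))
    return go

def to_int(n):
    """Helper (not in the Matzo program): count the S wrappers."""
    count = 0
    while n != Z:
        _, n = n
        count += 1
    return count

two = succ(succ(Z))
four = succ(succ(two))
print(to_int(add(four)(two)))   # 6
print(is_zero(add(succ(Z))(Z))) # False
```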