Meta Language

Table of Contents

ML language stands for Meta Language, rooted in Lisp, commonly known as "Lisp with types". It is impure, eager evaluation language. Caml stands for Categorical Abstract Machine Language.

1 General Concepts

1.1 Algebraic data type

Algebraic data type, is a kind of composite type. There are two common classes: product types and sum types. Product type is like structure in C, with many fields. Sum type is like union in C, can hold either one kind of value, but not both. Thus it is also called union types or variant types.

2 Standard ML

2.1 Reference

2.2 Binding

ML is static lexical scoping. A type binding looks like this. Here, tycon means type constructor, typ means type expression. A type binding expression can establish multiple bindings at the same time.

type tycon1 = typ1
and ...
and tyconn = typn

A variable binding is very similar.

val var1 : typ1 = exp1
and ...
and varn : typn = expn

Multiple (2+) bindings writing sequentially, optionally separated by a semicolon, forms a declaration.

A let binding or local binding creates local scoping. The dec here is a declaration. The binding is only in scope in the evaluation of exp. The local expression just means the bindings of dec is visible in evaluating of dec'. But what's the use-case??

let dec in exp end
local dec in dec' end

2.3 Function

A function expression, or lambda expression, is of form:

fn var : typ => exp

It seems that ML also is curried, and function application is written as (function param). The above definition is anonymous, we can bind it to a variable:

var foo : real -> real =
  fn x : real => x

There's a syntax sugar for this

fun foo (x:real):real = x

2.4 Product Type

The type of a tuple is typ1*typ2*...*typn, while the value is (val1,val2,...,valn).

2.5 Pattern Matching

val pat = exp

Within the pattern

  • a variable pattern is of form var:typ
  • a tuple pattern is of form (pat1,...,patn)
  • a wildcard pattern is of form _

Instead of using position to refer to a field in tuple, a record type is introduced. The type is of form

{lab1:typ1, ..., labn:typn}

While the value is

{lab1=val1,...,labn=valn}

ML can take pattern matching to function parameters too. When declaring function like this, it will apply pattern matching on the parameters and establish bindings.

fn pat => exp

The corresponding syntax sugar (fun notation)

fun foo (x:read, y:real):real*real = x

Note that, thanks to using pat as the function signature, we can now accept multiple parameters, and provide multiple results!

There's also a syntax sugar, called sharp notation, to retrieve components from tuple based on either location (#i), or field (#foo). Note that these are functions that have signature of typ1*...*typn -> typi. The use of sharp notation is strongly discouraged though, due to lack of readability.

2.6 Case

The forms of all values of a tuple type must be the same, this is called Homogeneous Types. Otherwise, it is called Heterogeneous Type. This is useful to declare a Clausal Function expression.

fn pat1 => exp1
 | pat2 => exp2
 | ...
 | patn => expn

each pat=>exp is called a clause, or a rule. If any of the pattern match the supplied parameters, the function will have a specific type. Note that the type of the function can be different upon different parameters, thus it is called Heterogeneous Type. In a word, Heterogeneous Type is introduced by the alternation symbol. The function also has a corresponding fun notation:

fun foo 0 = 0
  | foo (x:int) = x

The case seems to be a syntax sugar for the clausal function expression (it is the general case in Haskell).

case exp
  of pat1 => exp1
   | ...
   | patn => expn

is short for the function application:

(fn pat1 => exp1
  | ...
  | patn => expn) exp

As some example, the negation function not is defined as

fun not true = false
  | not false = true

conditional expression

if exp then exp1 else exp2

if short hand for

case exp
  of true => exp1
   | false => exp2

The andalso and orelse is defined as

  exp1 andalso exp2
  if exp1 then exp2 else false
  exp1 orelse exp2
  if exp1 then true else exp2

2.7 Recursive Function

In ML, only functions can be recursively defined. This is because it is eagerly evaluated. Haskell can define even recursive values, thanks to its lazy evaluation.

val rec var:typ = fnexp

The keyword rec indicates this is a recursive binding, and the name var can be used in the right-hand side to refer to itself. The fun notation simply does not require the rec keyword.

fun foo 0 = 1
  | foo (n:int) = n * foo (n-1)

The type checking for recursive function is done inductively, i.e. assume nth-iteration is correct, check (n+1)-th iteration is also correct.

SML supports mutually recursive definition, i.e. foo calls bar, bar calls foo, by connecting them together. The and is must.

fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)

2.8 Type inference

Like in Haskell, although the language is strongly typed, you most likely will not write it yourself. The type system is so good that it can most likely inference for you. In addition, it does a better job in that, it will infer the most general, i.e. principal type, for you. The prototypical example is the identity function, fn x=>x. x here can be of any type. This function is said to be polymorphic, the pattern captured here is called a type scheme, expressed by type variable. In this case, the type scheme is a->a.

2.9 list

The type is written as typ list, and the values of this type is a list of values of type typ. It is defined inductively.

val nil : typ list
val (op ::) : typ * typ lsit -> typ list

The notation (op ::) tells that, here, :: is an operator function, rather than the one used in list notation.

So, a value of typ list can be

val1 :: val2 :: ... :: valn :: nil

The list notation is also available as you expect

[val1, val2, ..., valn]

Of course, list supports pattern matching, too.

fun length nil = 0
  | length (_::t) = 1 + length t

2.10 Concrete data type

Concrete data type corresponds to the data expression in Haskell. The suite is called type constructor, and each of Spades, Diamonds, Clubs are value constructor. In particular, the value constructors accepts no parameter, thus called nullary value constructor. They are constants.

datatype suite = Spades | Diamonds | Clubs

You can have type variables, and define other type of value constructors, using of keyword. SOME here is a unary value constructor.

datatype 'a option = NONE | SOME of 'a

The data constructor can also be recursive.

  datatype 'a tree =
           Empty |
           Node of 'a tree * 'a * 'a tree

Another canonical example is in pattern matching

  datatype expr =
           Numeral of int |
           Plus of expr * expr |
           Times of expr * expr
  fun eval (Numeral n) = Numeral n 
    | eval (Plus (e1, e2)) =
      let
          val Numeral n1 = eval e1
          val Numeral n2 = eval e2
      in
          Numeral (n1+n2)
      end
    | eval (Times (e1, e2)) =
      let
          val Numeral n1 = eval e1
          val Numeral n2 = eval e2
      in
          Numeral (n1*n2)
      end

2.11 Mutable Storage

A reference indicates a mutable storage. It is created by ref, read by !, assign by :=. The type typ ref is similar to typ list, and is the reference to the storage of type typ.

fun ref      : typ -> typ ref
fun (op !)   : typ ref -> typ
fun (op :=)  : typ ref * typ -> unit

To define a function accepting a typ ref, you can use ref pattern. The semantic is that, the function accept a reference, whose content matches the pattern. The ! function itself is defined as:

fun ! (ref a) = a

2.12 IO

The types instream, outstream. The common IO port: stdIn, stdOut, stdErr

fun inputLine : instream -> string
fun print     : string -> unit
(* blocking read *)
fun input : instream -> string
(* test whether input would block *)
fun canInput : instream * int -> int option
fun output    : outstream * string -> unit
fun flushOut  : outstream -> unit
fun endOfStream : instream -> bool

fun openIn : string -> instream
fun openout : string -> outstream
fun closeIn : instream -> unit
fun closeOut : outstream -> unit