\section{type system}
\label{sect.typesystem}

\paragraph{Source language}

We use an explicitly typed source language extended with \GADTs. This source language is a minor variation on \SystemFA~\cite{sulzmann07}, which we call \SystemFAPrime. The key difference is support for existential quantification and a slightly more uniform representation of pattern matches. The syntax of \SystemFAPrime is given in \Fig{fig.systemfa}.

A |v| is an identifier of which the denotation is a unique type constant. Although these identifiers can appear where a type or type variable is expected, they play an important role later. They are introduced at two places: when opening an existential using a pattern match (|TXTPAPPEXIST|) and with a universal abstraction (|TXTEUNIVABS|).

Each equality is encoded as a qualified type. The LHS is -- without loss of generality -- syntactically restricted to be a (bound) type variable. Existentials for a data constructor are introduced with an |exists| instead of the |forall| that is written in Haskell there.

The pattern language is similar to the expression language. Identifiers are bound to fields of a constructor using applications of variables. Likewise, universally quantified variables are instantiated by applying a type, and existentials are opened by applying a unique type constant.

\begin{PlainFigure}{t}{Syntax of \SystemFAPrime}{fig.systemfa}
\begin{code}
e  `elem` Expr                                               ^^^^                             ^^^^^  p   `elem` Pattern
   ::=  C                                                    ^^^^  ECON                       ^^^^^      ::=  C                              ^^^^  PCON
   |||  x                                                    ^^^^  EVAR                       ^^^^^      |||  x :: ty                        ^^^^  PVAR
   |||  e e                                                  ^^^^  EAPPEXPR                   ^^^^^      |||  p p                            ^^^^  PAPPPAT
   |||  e ty                                                 ^^^^  EAPPUNIV                   ^^^^^      |||  p ty                           ^^^^  PAPPUNIV
   |||  \p dot e                                             ^^^^  ELAMABS                    ^^^^^      |||  p v                            ^^^^  PAPPEXIST
   |||  BigLam v dot e                                       ^^^^  EUNIVABS                   ^^^^^
   |||  let overline(p = e) in e                             ^^^^  ELET                       ^^^^^  ty  `elem` Type
   |||  case e of overline(p -> e) :: ty                     ^^^^  ECASE                      ^^^^^      ::=  C                              ^^^^  TCON
                                                                                              ^^^^^      |||  x                              ^^^^  TVAR
d  `elem` Decl                                               ^^^^                             ^^^^^      |||  ty ty                          ^^^^  TAPP
   ::=  data D manyalpha = overline(| exists manybeta dot (overline(x =*= ty)) => C manyty)   ^^^^^      |||  forall x dot ty                ^^^^  TFORALL
                                                             ^^^^                             ^^^^^      |||  exists x dot ty                ^^^^  TEXISTS
t  `elem` Program                                            ^^^^                             ^^^^^      |||  (overline(ty =*= ty)) => ty    ^^^^  TEQS
   ::=  overline(d;) e                                       ^^^^  TOPLEVEL                   ^^^^^
                                                             ^^^^                             ^^^^^  env, envl `elem` Env
x, alpha, beta  `elem` Var      ^^^^ D `elem` TyCon                                           ^^^^^      ::=  x :: ty, ^^^ env
v               `elem` TyConst  ^^^^ C `elem` ValCon                                          ^^^^^      |||  x, ^^^ env
TyConst <== Var                                              ^^^^                             ^^^^^      |||  ty =*= ty, ^^^ env
\end{code}
\end{PlainFigure}

\paragraph{Notation}

First some notation before we give a type system for the source language. The semicolon in the rules for patterns acts as a separator to indicate that |env| and |envl| are separate environments. Juxtaposition of environments (i.e. |envl env|) represents environment concatenation. |env(x) = t| states that |x| is bound to |t| in |env|. Furthermore, we use an overbar to indicate a list. A single value at the position where a list is expected is implicitly assumed to be a singleton list. Juxtaposition of lists represents list concatenation. The components of the list are accessible with a subscript |i|. A type |ty[(overline(ty))]| is obtained by replacing some components of |ty| in some fixed way with types |overline(ty)|. With |tya = ty[(overline(tya))]| and |tyb = ty[(overline(tyb))]|, we express that |tya| and |tyb| have a common structure |ty|, with corresponding differences in |tyai| and |tybi| respectively.

\paragraph{Type System}

The type rules for expressions are given in \Fig{rules.G.expr.base}, with the meaning that in environment |env|, the expression |e| has type |ty|. The type rules for a pattern given in \Fig{rules.G.pat.base} denote that in environment |env|, the pattern |p| has type |ty|, with bindings for variables in |envl|.

When we ignore the special rules |TXTEAPPC|, |TXTECOERCE|, and |TXTPAPPEQS| for the moment, the rules are a minor variation on a type system for \SystemF. The differences are:
\begin{itemize}
\item Scoping is made explicit by collecting bindings for a pattern in a local environment |envl|.
\item An existential is opened with a pattern match using |TXTPAPPEXIST| by instantiation to a fresh fixed type variable |v| (a type constant). Such a |v| may not escape the scope of the pattern match, which is enforced by demanding that the only variables that may escape are those that are bound in the global environment |env| in rules |TXTELAMABS|, |TXTELET|, and |TXTECASE|.
\item There are two type signatures in the environment for a constructor: a constructor signature |env(identc)| and a deconstructor signature |env(Cdecon)|. The constructor signature defines which which equalities need to be proved and which values have to be supplied. The deconstructor signature gives the dual definition: which values can be extracted when pattern matching against this constructor and which equalities can be assumed to be proved. Finally, \Fig{rules.G.decl.base} defines how these signatures are derived from a data type declaration.
\end{itemize}

\rulerCmdUse{rules.G.decl.base}
\rulerCmdUse{rules.G.expr.base}
\rulerCmdUse{rules.G.pat.base}

\paragraph{Type Conversions}

Without the three special rules, typing the example of \Sect{sect.motivation} fails. We demand that the right-hand sides of the case alternatives are of type |t|, but the first case alternative is of type |Int|. The pattern match against the |Num| constructor gives us the proof that the |Int| is actually equal to the |t|. We exploit this knowledge using rule |TXTECOERCE|, by substituting the |t| for |Int| when typing the case alternative. Only those type constants (like |t|) may be substituted; we discuss this later. This rule uses the entailment relation |::-|, which states that in environment |env|, the two types are proved to be equal.

The rule |TXTEAPPC| is used to actually prove that |t| is equal to |Int| when we construct a value with the constructor |Num|, and rule |TXTPAPPEQS| allow us to extract this proof from the |Num| constructor and introduce it as an assumption in the environment.

\rulerCmdUse{rules.G.entails.base}

\paragraph{Entailment}

\Fig{rules.G.entails.base} gives the entailment rules. A derivation of these rules is a proof of equality between two types. The first two rules represent symmetry and transitivity of an equality relation. Rule |TXTEASSUME| uses an assumption from the environment. The subsumption rule decomposes an equality proof in proofs for components of type types. The congruence rules allows for proving an equality with sub components converted if there is an equality proof for it. Again, only type constants |v| need to be converted. The decomposition and subsumption rules are often needed when a type conversion has to be applied deep inside a type. Transitivity is not used often, but there are examples~\cite{leftcorner} with matches on more than one constructor where transitivity is needed to combine equality proofs.

\paragraph{Discussion}
The explicitly typed source language allows us to abstract from a particular choice of type checking and type inference strategy, which (although important) is a separate issue. There is, however, a concept we have to take into account. A successful \GADT pattern match results in additional assumptions between the equality of types, allowing types to be converted. Not all types are allowed to be converted for reasons of predictability and most general typing (when dealing with type inference). This is the reason why Peyton~Jones et al.\ distinguish rigid types~\cite{jones06}. In our explicitly typed system, the concept of rigid types relates to type constants |v| introduced by universal abstraction and existential pattern matching. By allowing conversions only to such a constant, we obtain a specification that takes into account which types are allowed to be converted, while leaving the choice of type inference or propagation strategy up to the implementation.

So, the type constants |v| determine the positions where types can be converted. In the example of \Sect{sect.motivation}, the requirement that the case alternatives all need to be of type |t| (coming from |Expr t|), dictates that there needs to be a conversion from the actual type of the case alternative to this type |t|.

Furthermore, note that the three special rules are not syntax directed. An implementation decides where to apply these rules. For example, assuming that the target language is extended with some additional syntax, and that the example of \Sect{sect.motivation} is lifted to some evaluation monad |m|:

\begin{code}
data Expr t
  =  Val (t ~ Int) t
  |  forall a b dot Tuple (t ~ (a,b)) (Expr a) (Expr b)
; let (eval :: forall t m dot Monad m => Expr t -> m t)
        = \\t m -> \(e :: Expr t) ->
            ( case e of
                Val     eqInt (x :: t)  -> return m (x :> sym eqInt)
                (Tuple  eqTup p q) a b  ->
                        do  ep <- eval m a p
                            eq <- eval m b q
                            return m ((,) ep eq :> sym eqTup) ) :: m t
\end{code}

In the above code, the type conversion is applied as deep as possible. Another possibility is to convert as shallow as possible. For example, by changing the first case alternative to |return m x :> (m (sym eqInt))|, where the coercion |m| (of type |m ~ m|) represents reflexivity, and the application of the coercions represent a coercion of type |m Int ~ m t|. Other possibilities are a mixture between shallow and deep. This choice should not have an effect on the outcome of the program, but since it affects the structure of the target expression, a particular choice may be more beneficial depending on a particular implementation.