%%% NOTE! %%% I'm now taking the content of my thesis chapter directly as body for this paper. %%% The variable thesis should be set to False %include ../ariem-thesis/dependent-ags/depend.lhs %if False %format rulerfront = "{\mbox{\scshape ruler-front}}" %format rulercore = "{\mbox{\scshape ruler-core}}" %format rulerfrontagda = "{\mbox{\scshape AG}_{\mbox{\scshape da}}}" %format rulercoreagda = "{\mbox{\scshape AG}^{\mbox{\scshape x}}_{\mbox{\scshape da}}}" %format Agda = "{\mbox{\scshape Agda}}" %format . = "." %format ^^^ = "\;\;\;" %format ^^^^ = "\hspace{1em}" %format epsilon = "\epsilon" %format (sub (x) (y)) = "{" x "}_{" y "}" %format (sup (x) (y)) = "{" x "}^{" y "}" %format (many (x)) = "\overline{" x "}" %format forall = "\forall" %format Gamma = "\Gamma" %format Sigma = "\Sigma" %format Phi = "\Phi" %format :- = "\vdash" %format emptyset = "\emptyset" %format epsilon = "\epsilon" %format sep = "\:;\:" %format bar = "|" %format ::: = "\::\:" %format `union` = "\cup" %format union = "\cup" %format `intersection` = "\cap" %format intersection = "\cap" %format `subset` = " \subseteq " %format .== = "\doteq" %format `diam` = "\diamond" %format neg a = "\neg" a %format bottom = undefined %format Gamma0 %format Gamma1 %format Gamma2 %format Gamma3 %format Gamma4 %format Gamma' %format Gamma0 %format Gamma'' %format inGamma = "in" Gamma %format inGamma' %format Set1 %format Set2 %format _elem = "\_\in\_" %format _subseq = "\_\sqsubseteq\_" %format `subseq` = "\sqsubseteq" %format `mbElem` = "\in_?" %format _mbElem = "\_\in_?\_" %format `mbEqual` = "\equiv_?" %format `Either` = "\uplus" %format alpha = "\alpha" %format beta = "\beta" %format inj1 %format inj2 %format prv = "\phi" %format prv1 %format prv2 %format prv' %format i1 %format i2 %format x1 %format x2 %format x3 %format x4 %format es1 %format es2 %format t1 %format t2 %format a1 %format a2 %format p1 %format p2 %format p3 %format p4 %format ty = "\tau" %format tail = "\kw{tail}" %format and = "\kw{and}" %format attr = "\kw{attr}" %format data = "\kw{data}" %format sem = "\kw{sem}" %format inh = "\kw{inh}" %format syn = "\kw{syn}" %format con = "\kw{con}" %format clause = "\kw{clause}" %format default = "\kw{default}" %format itf = "\kw{itf}" %format visit = "\kw{visit}" %format internal = "\kw{internal}" %format match = "\kw{match}" %format child = "\kw{child}" %format invoke = "\kw{invoke}" %format datasem = "\kw{datasem}" %format grammar = "\kw{grammar}" %format term = "\kw{term}" %format term0 = "\kw{term?}" %format nonterm = "\kw{nonterm}" %format prod = "\kw{prod}" %format context = "\kw{context}" %format contexts = "\kw{contexts}" %format attach = "\kw{attach}" %format detach = "\kw{detach}" %format tau = "\tau" %format tau1 %format tau2 %format tau3 %format tau4 %format taus = "\overline{\tau}" %format taus1 %format taus2 %format taus3 %format taus4 %format adata = "\uw{data}" %format awhere = "\uw{where}" %format with = "\uw{with}" %format ainh = "\Varid{inh}" %format asyn = "\Varid{syn}" %format asem = "\Varid{sem}" %format xitf = "\Varid{itf}" %format nm = "\iota" %format nm1 %format nm2 %format nm3 %format nm4 %format terminator = "\Box" %format << = "\llbracket" %format >> = "\rrbracket" %format ~> = "\leadsto" %format (semb (z)) = << z >> %format (semiv (z) (i) (n) (t)) = "\llbracket_{\mathsf{iv\;}}" z >> "_{ " i "," t "}" "^{ " n "}" %format (sema (z)) = "\llbracket_{\mathsf{a\;}}" z >> %format (semat (z)) = "\llbracket_{\mathsf{at\;}}" z >> %format (seman (z)) = "\llbracket_{\mathsf{an\;}}" z >> %format (semev 
(z) (i) (n)) = "\llbracket_{\mathsf{ev }}" z >> "_{ " i "}" "^{ " n " }" %format (semr (z) (k)) = "\llbracket_{\mathsf{r\;}}" z >> "_{ " k "}" %format (semep (z) (i) (n)) = "\llbracket_{\mathsf{ep\;}}" z >> "_{ " i "}" "^{ " n " }" %format +++ = "{\_+\!\_}" %format Nat = "\ensuremath{{\mathbb{N}}}" %format (joinfun (x) (y)) = x "_" y %format (semfun (x)) = "sem_" x %format (catafun (x)) = "cata_" x %format (refi (c) (a)) = "{" c "_{" i "}" a "}" %format (refs (c) (a)) = "{" c "_{" s "}" a "}" %format (refc (k) (a)) = "{" k "_{" c "}" a "}" %format (refL (c) (a)) = "{" c "_{" l "}" a "}" %format (refa (c) (x)) = "{" c "_{" a "}" x "}" %format (refv (c) (x)) = "{" c "_{" v "}" x "}" %format (Tp (t)) = "{" T "''" t "}" %format (Tpv (t) (x)) = "{" T "''" t "''" x "}" %format (Alg (t)) = "{" A "''" t "}" %format (ref (c) (a)) = "{" c "." a "}" %if not thesis %% \begin{abstract} %endif In a functional implementation of a compiler, catamorphisms are a design pattern that is often used for tree traversals. Attribute Grammars (AGs) are a domain-specific language for composable descriptions of algebras for catamorphisms. Given such a description, it is not immediately clear how to state and prove properties of AGs formally. In this paper, we apply dependent types to AGs. Dependent types allow us to specify invariants as types of attributes, and machine-checkable proofs for them as separate and composable attributes. Additionally, when an AG is cycle-free, the composition of the attributes is logically consistent. This makes it both attractive to have dependent types in AGs, and to use AGs to express catamorphisms in dependently typed programs. %if not thesis %% \end{abstract} %endif \section{Introduction} Functional programming languages, such as Haskell or ML, are known to be convenient languages for implementing a compiler. %% ~\cite{DBLP:books/cu/Appel1998ml}. As part of the compilation process, a compiler computes properties of Abstract Syntax Trees (ASTs), such as environments, types, error messages, and code. In functional programming, \emph{catamorphisms} (also known as folds) are used for the inductive definition of these properties. Catamorphisms can be seen as a design pattern for tree traversals, similar to visitors in imperative programming languages. An \emph{algebra} defines an inductive property in terms of each constructor of the AST, and a catamorphism applies the algebra to the AST. Catamorphisms and their algebras thus play an important role in a functional implementation of a compiler. In a large compiler implementation, writing algebras is a tedious job. Attribute Grammars (AGs)~\cite{DBLP:journals/mst/Knuth68} are a domain-specific language for the \emph{composable} description of algebras for catamorphisms. %% Underlying an AG is a context-free grammar that describes the structure of the AST. An AG extends a context-free grammar with attributes associated with nonterminals and functional rules that define such attributes for each production in terms of other attributes. As AGs are typically embedded in a host language, the rules are terms in the host language that may additionally refer to attributes. Attributes represent semantic properties of an AST, and can easily be composed to form more complex properties. An AG can be compiled to an efficient functional algorithm (in the form of a catamorphism) in the host language that, given an AST, computes values for its attributes. 
It is common to prove that a type inferencer is a sound and complete implementation of a type system, and that the meaning of a well-typed source program is preserved. It is not immediately clear how to formally specify and write such proofs when the compiler is implemented with AGs.

\emph{Dependent types}~\cite{bove09} provide a means to use \emph{types} to encode properties with the expressiveness of (higher-order) intuitionistic propositional logic, and \emph{terms} to encode proofs. Such programs are called correct by construction, because the program itself is a proof of its invariants. The goal of this paper is therefore to apply dependent types to AGs, in order to formally reason with AGs.

Vice versa, AGs also offer benefits to dependently typed programming. Because of the Curry-Howard correspondence, dependently typed AGs are a domain-specific language to write structurally inductive proofs in a \emph{composable}, \emph{aspect-oriented} fashion; each attribute represents a separate aspect of the proof. Additionally, AGs relieve the programmer of the tedious orchestration of multi-pass traversals over data structures, and ensure that the traversals are \emph{total}: totality is required for dependently typed programs for reasons of logical consistency and termination of type checking. Hence, the combination of dependent-type technology and AGs is mutually beneficial.

We contribute the following with this paper:
\begin{itemize}
\item We present the language |rulerfrontagda| (Section~\ref{depend.section.example}), a lightweight approach to facilitate dependent types in AGs, and vice versa, AGs in the dependently typed language Agda. |rulerfrontagda| is an embedding in Agda via a preprocessor. In contrast to conventional AGs, we can encode invariants in terms of dependently typed attributes, and proofs as values for attributes. This expressiveness comes at a price: to be able to compile to a total Agda program, we restrict ourselves to the class of ordered AGs, and require the definitions of attributes to be total.
\item We define a desugared version of |rulerfrontagda| programs (Section~\ref{depend.sect.rulerfront.agda}) and show how to translate them to plain Agda programs (Section~\ref{depend.sect.translation.agda}).
\item We show an extension to AGs that allows conditional attribution of nonterminals, such that we can give total definitions of what would otherwise be partially defined attributes (Section~\ref{depend.sect.partial}).
\end{itemize}
In Section~\ref{depend.section.preliminaries}, we introduce the notation used in this paper. However, we assume that the reader is familiar with both AGs (see e.g. \cite{uuagc}) and dependently typed programming in Agda (see e.g. \cite{DBLP:conf/tldi/Norell09}).
%if False
%% Might have some nice sentences to use in the thesis introduction
In recent years, we applied AGs in several large projects, including the Utrecht Haskell Compiler~\cite{uhc}, the Helium~\cite{Leijen:helium} compiler for learning Haskell, and the editor Proxima~\cite{Schrage04proxima}. The use of AGs in these projects is invaluable, for two reasons: rules for attributes can be written in isolation and combined automatically, and trivial rules can be inferred, saving the programmer from writing boilerplate code. Given a compiler such as the UHC, the question arises to what extent the compiler generates correct code (for some definition of correctness).
In UHC, similar to any project that many people contribute to, we regularly encounter subtle bugs related to broken hidden invariants. Through the use of regression testing, we get a fair idea of how well the implementation is, however we never get sure about the number of erroneous cases that lurk in a dark corner. The holy grail in compiler implementation is to formally verify that a compiler exhibits certain properties of interest. For example, that a type inferencer is a sound and complete implementation of a type system, and that the meaning of a well typed source program is preserved. In practice, this formal verification is (still) infeasible for large projects with limited contributors. Given the complexity of compilers involved, it is already a struggle to implement and maintain a compiler that gives output that seems correct, let alone to attempt to prove this. When we implement our compilers in languages with advanced type systems, we can ensure that compilers have certain properties by construction. In mainstream strongly typed languages, it is possible to guarantee that (even for non-trivial inputs) the compiler produces output that conforms to a context-free grammar. It is enforceable that a compiler produces, for example, syntactically-correct HTML or C code. These are properties we get for free when we define our datastructures properly in, for example, the strongly-typed programming language Haskell. Unfortunately, syntactic guarantees do not save us from hidden invariants. For that, we desire semantic guarantees. This is possible by using more complex types, but at a severe cost. It actually requires us to encode proofs by using the type system as a logic. In a dependently typed language such as Agda, we can impose arbitrary complex constraints on values via types, but as a consequence, proving that the program is well typed becomes considerably harder. To move from a compiler implementation in Haskell to Agda, for example, the least extra effort required---syntax and library issues aside---is to prove termination of the program (ensures that the program is a proof in a consistent logic). Compilers typically consist of primitive recursive functions, and those parts in a compiler that rely on iteration either compute fixpoints, or have iteration limits. Unification-based type inferencers may need to ensure that types in the substitution are finite, imposed via the `occurs check', which is a relatively easy property. The extra effort is substantial, but theoretically speaking, not a creative challenge. From there on, the programmer can make the types more specific, and prove stronger invariants. For a project the size of UHC, a verified compiler is still too much to ask for, as it makes implementation and maintenance more difficult and time consuming. Recent developments in dependently typed programming, however, make such approaches more and more feasible. Several success stories are known, which is promising for the future. Since we implement our compilers with Attribute Grammars, we present in this paper how these can be combined with dependently typed programming, using Agda in particular. AGs can be compiled to (a sequence of) functions that take a record of inherited attributes, have cases for each type of AST node, recursively visit the children of such a node, and produce a record of synthesized attributes. Such functions can be written in Agda. There are thus no theoretic limitations that prevent AGs to be embedded in a dependently typed programming language. 
However, dependently typed programming enforces structure to a program that is inconvenient when using plain Attribute Grammars. In short, to profit from programming with AGs, a fine granularity of attributes is beneficial, but may require partially defined attributes. In a dependently typed setting, all attributes must have a total definition, and to accomplish that, a coarser granularity of attributes may be required. The coarser the granularity of attributes, the less benefits an AG has over a conventional fold. We demonstrate this with an example in Section~\ref{depend.section.example}, and we show that we can keep the fine granularity of attributes when we extend attribute grammars with context-dependent attribution. In previous work, we implemented an AG preprocessor for Haskell, and some other languages (cite, cite). The language of this preprocessor is called |rulerfront|. In this paper, we introduce a variation called |rulerfrontagda|, which has additional syntax to support the more complex types for attributes, and to deal with dependently typed pattern matching in the left-hand sides of rules. In particular, dependent pattern matching against attributes may impose additional constraints on other attributes. For example, in the scope of a match against an equality proof, we get the additional information that two attributes are the same and Agda requires them to get a common name. The example in Section~\ref{depend.section.example} again demonstrates. As contributions, we extensively discuss an example in Section~\ref{depend.section.example}, then describe the AG-language |rulerfront| in Section~\ref{depend.sect.rulerfront.agda}, and its translation to Agda in Section~\ref{depend.sect.translation.agda}. %endif \section{Preliminaries} \label{depend.section.preliminaries} In this warm-up section, we briefly touch the Agda and AG notation used throughout this paper. As an example, we implement the sum of a list of numbers with a catamorphism. We first give Agda code, then |rulerfrontagda| code. This example does not yet use dependently typed attributes. These are introduced in the next section. The |rulerfrontagda| code consists of blocks of plain Agda code, and blocks of AG code. To ease the distinction, Agda's keywords are underlined, and keywords of |rulerfrontagda| are typeset in bold. %% and keywords of |rulerfrontagda| are bold. In the following code snippet, the data type |List| represents a cons-list of natural numbers. The type |Tp List| is the type of the value we compute (a number), and |Alg List| is the type of an algebra for |List|. Such an algebra contains a \emph{semantic function} for each constructor of |List|, which transforms a value of that constructor into the desired value (of type |Tp List|), assuming that the transformation has been recursively applied to the fields of the constructor. The catamorphism |catafun List| performs the recursive application. 
\begin{code} adata List : Set awhere -- represents a cons-list of natural numbers nil : List -- constructor has no fields cons : Nat -> List -> List -- constructor has a number and tail list as fields Tp List = Nat -- defines a type alias |Tp List : Set|, Alg List = (Tp List , Nat -> Tp List -> Tp List) -- and |Alg List : Set| catafun List : Alg List -> List -> Tp List -- applies algebra to list catafun List (n , _) nil = n -- in case of |nil|, replaces |nil| with |n| catafun List alg l with alg | l -- otherwise, matches on |alg| and |l| catafun List alg l | (_ , c) | cons x xs with (catafun List) alg xs -- recurses on |xs| catafun List alg l | (_ , c) | cons x xs | r = c x r -- replaces |cons| with |c| \end{code} The with-construct is Agda's alternative to a case expression. When a function alternative ends in |with e|, the next lines are alternatives to that alternative. These alternatives take the result of |e| as additional parameter and may pattern match against it. Vertical bars are used to keep the patterns apart. Multiple expressions can be scrutinized by the with-construct when these are also separated by vertical bars. In the above code snippet, |l| is scrutinized to match against a |cons|-value, and |alg| is scrutinized to get the semantic function |c| that is intended for |cons|-constructors. The with-construct can also be used to give names to intermediate values: the result of the expression |(catafun List) alg xs| is given the name |r|. The actual algebra itself simply takes |0| for the |nil| constructor, and |+++| for the |cons| constructor. The function |joinfun sum List| shows how the algebra and catamorphism can be used. \begin{code} semfun nil : Tp List -- semantic function for |nil| constructor semfun nil = 0 -- |Tp List = Nat| (defined above) semfun cons : Nat -> Tp List -> Tp List -- semantic function for |cons| constructor semfun cons = +++ -- |+++ : Nat -> Nat -> Nat| (defined in library) joinfun sum List : List -> Tp List -- transforms the |List| into the desired sum joinfun sum List = catafun List (semfun nil, semfun cons) -- algebra is semantic functions in a tuple \end{code} In the example, the sum is defined in a bottom-up fashion. By taking a function type for |Tp List|, values can also be passed top-down. Multiple types can be combined by using products. Such algebras quickly become tedious to write. Fortunately, we can use AGs as a domain-specific language for algebras. In the code below, we give an AG implementation: we specify a grammar that describes the structure of the AST, declare attributes on productions, and give rules that define attributes. A grammar specification is a restricted form of a data declaration (for an AST): data constructors are called \emph{productions} and their fields are explicitly marked as \emph{terminal} or \emph{nonterminal}. A nonterminal field represents a \emph{child} in the AST and has attributes, whereas a terminal field only has a value. A plain Agda data-type declaration can be derived from a grammar specification. In such a specification, nonterminal types must have a fully saturated, outermost type constructor that is explicitly introduced by a grammar declaration. Terminal types may be arbitrary Agda types\footnote{ In general, although not needed in this example, nonterminal types may be parametrized, production types may refer to its field names, and field types may refer to preceding field names. }. 
\begin{code}
grammar List : Set              -- declares nonterminal |List| of type |Set|
  prod nil   : List             -- production |nil| of type |List| (no fields)
  prod cons  : List             -- production |cons| of type |List| (two fields)
    term     ^^^  hd  : Nat     -- terminal field |hd| of type |Nat|
    nonterm  ^^^  tl  : List    -- nonterminal field |tl| of type |List|
\end{code}
With an interface specification, we define attributes for nonterminals. Attributes come in two fashions: \emph{inherited} attributes (used in a later example) must be defined by the parent, and \emph{synthesized} attributes may be used by the parent. Names of inherited attributes are distinct from names of synthesized attributes; an attribute of the same name and fashion may only be declared once per nonterminal. We also partition the attributes into one or more \emph{visits}. These visits impose a partial order on attributes. Inherited attributes may not be defined in terms of synthesized attributes of the same visit or a later one. We use this order in Section~\ref{depend.sect.rulerfront.agda} to derive semantic functions that are total.
\begin{code}
itf List             -- interface for nonterminal |List|,
  visit compute      -- with a single visit that is named |compute|,
    syn sum : Nat    -- and a synthesized attribute named |sum| of type |Nat|
\end{code}
%format namechild = "\Varid{child}"
%format nameattr = "\Varid{attr}"
Finally, we define the attributes of each production. We may refer to an attribute using |ref namechild nameattr| notation. For each production, we give rules that define the inherited attributes of the children and synthesized attributes of the production itself (with |lhs| as special name), using inherited attributes of the production and synthesized attributes of the children. The special name |loc| refers to the terminals, and to local attributes that we may associate with a production.
\begin{code}
datasem List                                       -- defines attributes of |List| for constructors of |List|
  prod nil   ^^^  lhs.sum = 0                      -- rule for |sum| of production |nil|
  prod cons  ^^^  lhs.sum = loc.hd + ref tl sum    -- refers to terminal |hd| and attr |ref tl sum|
\end{code}
The left-hand side of a rule is a plain Agda pattern, and the right-hand side is either a plain Agda expression or a with-construct (not shown in this example). Additionally, both the left and right-hand sides may contain attribute references. During attribute evaluation, visits are performed on children to obtain their associated synthesized attributes. We do not have to explicitly specify when to visit these children, nor is the order of appearance of rules relevant. However, an inherited attribute |ref c x| may not depend on a synthesized attribute |ref c y| of the same visit or later (in the interface). This guarantees that the attribute dependencies are acyclic, such that we can derive when children need to be visited and in what order.

AGs are a domain-specific language to write algebras in terms of attributes. From the grammar, we generate the data type and catamorphism. From the interface, we generate the |Tp List| type. From the rules, we generate the semantic functions |semfun nil| and |semfun cons|. AGs pay off when an algebra has many inherited and synthesized attributes. Also, there are many AG extensions that offer abstractions over common usage patterns (not covered in this paper). In the next section we present AGs with dependent types, such that we can formulate properties of attributes (and their proofs).
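Before we do so, we briefly return to the earlier remark that values can also be passed top-down by taking a function type for the result of the algebra. The following sketch uses fresh names (|TpListDown|, |cataListDown|, |sumFrom|) instead of redefining |Tp List|, and is not code generated by |rulerfrontagda|: an inherited offset is threaded downwards while the sum is computed bottom-up. Several such results can be combined with products.
\begin{code}
TpListDown : Set                     -- the computed value is now a function:
TpListDown = Nat -> Nat              -- from an inherited offset to a sum

cataListDown : TpListDown -> (Nat -> TpListDown -> TpListDown) -> List -> TpListDown
cataListDown n c nil          = n                            -- replace |nil| by |n|
cataListDown n c (cons x xs)  = c x (cataListDown n c xs)    -- replace |cons| by |c|

sumFrom : Nat -> List -> Nat         -- adds an offset to the sum of the list
sumFrom offset xs = cataListDown (\ o -> o) (\ x r o -> x + r o) xs offset
\end{code}
In AG terms, such a function argument corresponds to an inherited attribute.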
\section{Dependently Typed Example}
\label{depend.section.example}

In this section, we use |rulerfrontagda| to implement a mini-compiler that performs name checking of a simple language |Source|, and translates it to the target language |Target| if all used identifiers are declared, or produces errors otherwise. A term in |Source| is a sequence of identifier definitions or uses, for example: |def a `diam` use b `diam` use a|. In this case, |b| is not defined, so the mini-compiler reports an error. Otherwise, it generates a |Target| term, which is a clone of the |Source| term that additionally carries evidence that the term is free of naming errors. Section~\ref{depend.sect.example.grammar} shows the definition of both |Source| and |Target|.

We show how to prove that the mini-compiler produces only correctly named |Target| terms and error messages that only mention undeclared identifiers. The proofs are part of the implementation's code. Name checking is only a minor task in a compiler. However, the example shows many aspects of a more realistic compiler.

\subsection{Support Code Dealing With Environments}
\label{depend.sect.utilcode}

We need some Agda support code to deal with environments. We show the relevant data structures and type signatures for operations on them, but omit the actual implementation.
%if False
\footnote{
We show only the signatures. The implementation can be found in
%% TODO
Section~\ref{depend.sect.support}.
}
%endif
%% \footnote{
%% Section~\ref{depend.sect.support} shows the actual implementation of the support code.
%%}.
We represent the environment as a cons-list of identifiers.
\begin{code}
Ident  = String       -- |Ident : Set|
Env    = List Ident   -- |Env : Set|
\end{code}
In intuitionistic type theory, a data type represents a relation, its data constructors deduction rules for such a relation, and values built using these constructors are proofs for instances of the relation. We use some data types to reason about environments. A value of type |nm `elem` Gamma| is a proof that an identifier |nm| is a member of an environment |Gamma|. A value |here| indicates that the identifier is at the front of the environment. A value |next| means that the identifier can be found in the tail of the environment, as described by the remainder of the proof.
\begin{code}
adata _elem : Ident -> Env -> Set awhere
  here  : {nm : Ident} {Gamma : Env} -> nm `elem` (nm :: Gamma)
  next  : {nm1 : Ident} {nm2 : Ident} {Gamma : Env} -> nm1 `elem` Gamma -> nm1 `elem` (nm2 :: Gamma)
\end{code}
The type |Gamma1 `subseq` Gamma2| represents a proof that an environment |Gamma1| is contained as a subsequence of an environment |Gamma2|. A value |subLeft| means that the environment |Gamma1| is a prefix of |Gamma2|, and |subRight| means that |Gamma1| is a suffix. With |trans|, we transitively compose two proofs.
\begin{code}
adata _subseq : Env -> Env -> Set awhere
  subLeft   : {Gamma1 : Env} {Gamma2 : Env} -> Gamma1 `subseq` (Gamma1 ++ Gamma2)
  subRight  : {Gamma1 : Env} {Gamma2 : Env} -> Gamma2 `subseq` (Gamma1 ++ Gamma2)
  trans     : {Gamma1 : Env} {Gamma2 : Env} {Gamma3 : Env} -> Gamma1 `subseq` Gamma2 -> Gamma2 `subseq` Gamma3 -> Gamma1 `subseq` Gamma3
\end{code}
The following functions operate on proofs. When an identifier occurs in an environment, function |inSubset| produces a proof that the identifier also occurs in any environment that contains the former as a subsequence. Given an identifier and an environment, |nm `mbElem` Gamma| returns either a proof |nm `elem` Gamma| that the element is in the environment, or a proof that it is not.
\begin{code} inSubset : {nm : Ident} {Gamma1 : Env} {Gamma2 : Env} -> Gamma1 `subseq` Gamma2 -> nm `elem` Gamma1 -> nm `elem` Gamma2 _mbElem : (nm : Ident) -> (Gamma : Env) -> neg (nm `elem` Gamma) `Either` (nm `elem` Gamma) \end{code} A value of the sum-type |alpha `Either` beta| either consists of an |alpha| wrapped in a constructor |inj1| or of a |beta| wrapped in |inj2|. \subsection{Grammar of the Source and Target Language} \label{depend.sect.example.grammar} Below, we give a grammar for both the |Source| and |Target| language, such that we can analyze their ASTs with AGs\footnote{ In our example, we could have defined the type |Target| instead using conventional Agda notation. However, the grammar for |Target| serves as an example of a parameterized nonterminal. }. The |Target| language is a clone of the |Source| language, except that terms that have identifiers carry a field |proof| that is evidence that the identifiers are properly introduced. \begin{code} grammar Root : Set -- start symbol of grammar and root of AST prod root : Root ^^^ nonterm top : Source -- top of the |Source| tree grammar Source : Set -- grammar for nonterminal |Source| prod use : Source -- 'result type' of production term nm : Ident -- terminals may have arbitrary Agda types prod def : Source -- 'result type' may be parametrized term nm : Ident prod _`diam`_ : Source -- represents sequencing of two |Source| terms nonterm left : Source -- nonterminal fields must have a nonterm as nonterm right : Source -- outermost type constructor. grammar Target : Env -> Set -- grammar for nonterminal |Target| prod def : Target Gamma -- production type may refer to any field, term0 Gamma : Env -- e.g. |Gamma|. Agda feature: implicit terminal term nm : Ident -- \hspace{4pt} (inferred when building a |def|) term prv : nm `elem` Gamma -- field type may refer to preceding fields prod use : Target Gamma term0 Gamma : Env -- a |Target| term carries evidence: a term nm : Ident -- proof that the identifier is in the term prv : nm `elem` Gamma -- environment _`diam`_ : Target Gamma term0 Gamma : Env nonterm left : Target Gamma -- nonterm fields introduce children that nonterm right : Target Gamma -- have attributes adata Err : Env -> Set where -- data type for errors in Agda notation scope : {Gamma : Env} (nm : Ident) -> neg (nm `elem` Gamma) -> Err Gamma Errs Gamma = List (Err Gamma) -- |Errs : Env -> Set| \end{code} As shown in Section~\ref{depend.section.preliminaries}, we generate Agda data-type definitions and catamorphisms from this specification. The concrete syntax of the source language |Source| and target language |Target| of the mini-compiler is out of scope for this paper; the grammar defines only the abstract syntax. Similarly, we omit a formal operational semantics for |Source| and |Target|: it evaluates to unit if there is an equally named |def| for every |use|, otherwise evaluation diverges. \subsection{Dependent Attributes} In this section, we define \emph{dependently typed} attributes for |Source|. Such a type may contain references to preceding\footnote{ We may refer to an attribute that is declared earlier (in order of appearance) in the same interface. There is one exception due to the translation to Agda (Section~\ref{depend.sect.translation.agda}): in the type of an inherited attribute, we may not refer to synthesized attributes of the same visit. } attributes using |ainh.attrNm| or |asyn.attrNm| notation, which explicitly distinguishes between inherited and synthesized attributes. 
The type specifies a property of the attributes it references; an attribute with such a type represents a proof of this property. In our mini-compiler, we compute bottom-up a synthesized attribute |gathEnv| that contains the identifiers defined by the |Source| term. At the root, we pass top-down an inherited attribute |finEnv| that contains all the gathered identifiers. We also pass down an attribute |gathInFin| that represents a proof that the final environment is a superset of the gathered environment. When we know that an identifier is in the gathered environment, we can thus also find it in the final environment. We pass up attribute |outcome|, which either consists of errors, or of a correct |Target| term.
\begin{code}
itf Root             -- attributes for the root of the AST
  visit compile ^^^  syn finEnv   : Env
                     syn outcome  : (Errs asyn.finEnv) `Either` (Target asyn.finEnv)

itf Source           -- attributes for |Source|
  visit analyze    ^^^  syn gathEnv : Env     -- attribute of first visit
  visit translate  ^^^  inh finEnv : Env      -- attributes of second visit
                        inh gathInFin : asyn.gathEnv `subset` ainh.finEnv
                        syn outcome : (Errs ainh.finEnv) `Either` (Target ainh.finEnv)

itf Target Gamma     -- the interface for |Target| (parameterized) is not used in the example
\end{code}
As we show later, at the root, we need the value of |gathEnv| to define |finEnv|. This requires |gathEnv| to be placed in a strictly earlier visit. Hence we define two visits, ordered by appearance. Attribute |gathInFin| has a dependent type: it specifies that |gathEnv| is a subsequence of |finEnv|. A value of this attribute is a proof that essentially states that we did not forget any identifiers. Similarly, in order to construct |Target| terms, we need to prove that |finEnv| defines the identifiers that occur in the term.

In the next section, we construct such proofs by applying data constructors. We may use inherited attributes as \emph{assumptions} and pattern matches against values of attributes as \emph{case distinctions}. Thus, with a dependently typed AG we can formalize and prove correctness properties of our implementation. Agda's type checker validates such proofs using symbolic evaluation driven by unification.

\subsection{Semantics of Attributes}

For each production, we give definitions for the declared attributes via rules. At the root, we pass the gathered environment back down as the final environment. Thus, these two attributes are equal, and we can trivially prove the required subsequence property using either |subRight| or |subLeft|.
\begin{code}
datasem Root
  prod root                           -- rules for production |root| of nonterm |Root|
    top.finEnv     = top.gathEnv      -- pass gathered environment down
    top.gathInFin  = subRight {[]}    -- subsequence proof, using: |[] ++ Gamma4 == Gamma4|
    lhs.outcome    = top.outcome      -- pass |outcome| up
\end{code}
For the |use|-production of |Source|, we check if the identifier (terminal |loc.nm|) is in the environment. If it is, we produce a |Target| term as value for the outcome attribute; otherwise, we produce a |scope| error. For |def|, we introduce an identifier into the gathered environment. No errors can arise, hence we always produce a |Target| term. We prove (|loc.prv1|) that the identifier |loc.nm| is actually in the gathered environment, and prove (|loc.prv2|) using |inSubset| and attribute |lhs.gathInFin| that it must also be in the final environment. For |_`diam`_|, we pass |finEnv| down to both children, concatenate their |gathEnv|s, and combine their |outcome|s.
\begin{code}
datasem Source                                        -- rules for productions of |Source|
  prod use
    lhs.gathEnv  = []                                 -- no names introduced
    lhs.outcome  with loc.nm `mbElem` lhs.finEnv      -- tests presence of |nm|
      | inj1 notIn  = inj1 [ scope loc.nm notIn ]     -- when not in env
      | inj2 isIn   = inj2 (use loc.nm isIn)          -- when in env
  prod def
    lhs.gathEnv  = [ loc.nm ]                         -- one name introduced
    loc.prv1     = here {loc.nm} {asyn.lhs.gathEnv}   -- proof of |nm| in |gathEnv|
    loc.prv2     = inSubset lhs.gathInFin loc.prv1    -- proof of |nm| in |finEnv|
    lhs.outcome  = inj2 (def loc.nm loc.prv2)         -- never any errors
  prod _`diam`_
    lhs.gathEnv      = left.gathEnv ++ right.gathEnv  -- pass names up
    left.finEnv      = lhs.finEnv                     -- pass |finEnv| down
    right.finEnv     = lhs.finEnv                     -- pass |finEnv| down
    left.gathInFin   = trans subLeft lhs.gathInFin    -- proof for |left|
    right.gathInFin  = trans (subRight {asyn.lhs.gathEnv} {lhs.finEnv}) lhs.gathInFin   -- proof for |right|
    lhs.outcome with left.outcome                     -- four alts.
      | inj1 es with right.outcome
      | inj1 es1  | inj1 es2  = inj1 (es1 ++ es2)     -- 1: both in error
      | inj1 es1  | inj2 _    = inj1 es1              -- 2: only |left|
      | inj2 t1 with right.outcome
      | inj2 t1   | inj1 es2  = inj1 es2              -- 3: only |right|
      | inj2 t1   | inj2 t2   = inj2 (t1 `diam` t2)   -- 4: none in error
\end{code}
From the above code, we generate each production's semantic function (and some wrapper code), such that these, together with a catamorphism, form a function that translates |Source| terms. The advantage of using AGs here is that we can easily add more attributes (and thus more properties and proofs) and refer to them.

\section{AG Descriptions and their Core Representation}
\label{depend.sect.rulerfront.agda}

In the previous sections, we presented |rulerfrontagda| (by example). We describe the dependently typed extension to AGs in terms of the core language |rulercoreagda| (a subset of |rulerfrontagda|). Implicit information in AG descriptions (notational conveniences, the order of rules, visits to children) is made explicit in |rulercoreagda|. We sketch the translation from |rulerfrontagda| to |rulercoreagda|. In previous work~\cite{Middelkoop10wgt,Middelkoop10gpce}, we described the process in more detail (albeit in a non-dependently typed setting).

|rulercoreagda| contains interface declarations, but grammar declarations are absent and semantic blocks are encoded differently. Each production in |rulerfrontagda| is mapped to a \emph{semantic function} in |rulercoreagda|: |rulercoreagda| is a domain-specific language for the contents of semantic functions. A terminal |x : tau| of the production is mapped to a parameter |(refL loc x) : tau|. Implicit terminals are mapped to implicit parameters. A nonterminal |x : N taus| is mapped to a parameter |refc loc x : Tp N taus|. The body of the production consists of the rules for the production given in the original |rulerfrontagda| description, plus a number of additional rules that declare children and their visits explicitly.
%format semprod_seq = asem "\!\diamond{}"
\begin{code}
semprod_seq : Tp Source -> Tp Source -> Tp Source   -- derived from (non)terminal types
semprod_seq (refc loc left) (refc loc right) =      -- semantic function for |`diam`|
  sem : Source                                      -- |rulercoreagda| semantics block
    child left   : Source = (refc loc left)         -- defines a child |left|
    child right  : Source = (refc loc right)        -- defines a child |right|
    invoke analyze of left                          -- rule requires visiting |analyze| on |left|
    invoke analyze of right                         -- rule requires visiting |analyze| on |right|
    invoke translate of left
    invoke translate of right
    lhs.gathEnv = left.gathEnv ++ right.gathEnv     -- the |rulerfrontagda| rules
    ...                                             -- etc.
\end{code}
A child rule introduces a child with explicitly given semantics (a value of the type |Tp Source|). Other rules may declare visits and refer to the attributes of the child. An invoke rule declares a visit to a child, and brings the attributes of that visit in scope. Conventional rules define attributes, and may refer to attributes. The dependencies between attributes induce a def-use (partial) order.

Actually, there is one more step to take to end up with a |rulercoreagda| description. A semantics block consists of one or more visit-blocks (in the order specified by the interface), and the rules are partitioned over the blocks. In a block, the |lhs| attributes of that and earlier visits are in scope, as well as those brought in scope by preceding rules. Also, the synthesized attributes of the visit must be defined in the block or in an earlier block. We assign rules to the earliest block that satisfies the def-use order. We convert this partial order into a total order by giving conventional rules precedence over child/invoke rules, and using the order of appearance otherwise:
\begin{code}
semprod_seq : Tp Source -> Tp Source -> Tp Source     -- signature derived from itf
semprod_seq (refc loc left) (refc loc right) =        -- semantic function for |`diam`|
  sem : Source                                        -- |rulercoreagda| block
    visit analyze                                     -- first visit
      child left : Source = (refc loc left)           -- defines a child |left|
      invoke analyze of left                          -- requires child to be defined
      child right : Source = (refc loc right)         -- defines a child |right|
      invoke analyze of right                         -- requires child to be defined
      asyn.lhs.gathEnv = asyn.left.gathEnv ++ asyn.right.gathEnv
    visit translate                                   -- second visit
      ainh.left.finEnv      = ainh.lhs.finEnv         -- needs |lhs.finEnv|
      ainh.right.finEnv     = ainh.lhs.finEnv         -- needs |lhs.finEnv|
      ainh.left.gathInFin   = trans ...               -- also needs |lhs.gathEnv|
      ainh.right.gathInFin  = trans ...               -- also needs |lhs.gathEnv|
      invoke translate of left                        -- needs def of inh attrs of |left|
      invoke translate of right                       -- needs def of inh attrs of |right|
      asyn.lhs.outcome ^^^ with ...                   -- needs |translate| attrs of children
\end{code}
It is a static error when such an order cannot be satisfied. Another interesting example is the semantic function for the root: it has a child with an interface different from its own, and has two invoke rules in the same visit.
\begin{code}
sem_root : Tp Source -> Tp Root             -- semantic function for the root
sem_root (refc loc top) =                   -- |Source|'s semantics as parameter
  sem : Root
    visit compile                           -- only one visit
      child top : Source = (refc loc top)   -- defines a child |top|
      invoke analyze of top                 -- invokes first visit of |top|
      inh.top.finEnv = syn.top.gathEnv      -- passes gathered environment back
      invoke translate of top               -- invokes second visit of |top|
      syn.lhs.outcome = syn.top.outcome     -- passes up the result
\end{code}

\begin{figure}[htb]
\begin{code}
e    ::=  Agda [ many b ]                      -- embedded blocks |b| in |Agda|
b    ::=  i | s | o                            -- |rulercoreagda| blocks
o    ::=  ainh.c.x | asyn.c.x | loc.x          -- embedded attribute reference
i    ::=  itf I (many x) : tau v               -- with first visit |v|, params |x|, and signature |tau|
v    ::=  visit x inh (many a) syn (many a) v  -- visit declaration
     |    terminator                           -- terminator of visit decl. chain
a    ::=  x : e                                -- attribute decl, with Agda type |e|
s    ::=  sem : I (many e) t                   -- semantics expr, uses interface |I (many e)|
t    ::=  visit x (many r) t                   -- visit definition, with next visit |t|
     |    terminator                           -- terminator of visit def. chain
r    ::=  p e'                                 -- evaluation rule
     |    invoke x of c                        -- invoke-rule, invokes |x| on child |c|
     |    child c : I = e                      -- child-rule, defines a child |c|, with interface |I (many e)|
p    ::=  o                                    -- attribute def
     |    .{e}                                 -- Agda dot pattern
     |    x (many p)                           -- constructor match
e'   ::=  with e (many (p' e'))                -- Agda |with| expression
     |    = e                                  -- Agda |=| expression
p'                                             -- Agda LHS
x, I, c                                        -- identifiers, interface names, children respectively
tau                                            -- plain Agda type
\end{code}
\caption{Syntax of |rulercoreagda|}
\label{depend.fig.sem.agda.rulercore}
\end{figure}

Figure~\ref{depend.fig.sem.agda.rulercore} shows the syntax of |rulercoreagda| that we demonstrated by example above. In general, interfaces may be parametrized. The interface has a function type |tau| (equal to the type of the nonterminal declaration in |rulerfrontagda|) that specifies the type of each parameter, and the kind of the interface (an upper bound of the kinds of the parameters). For an evaluation rule, we either use a |with|-expression when the value of the attribute is conditionally defined, or use a simple equation as RHS. In the next section, we plug such an expression into a function defined via with-expressions; hence we need knowledge of the with-structure of the RHS.

\section{Translation to Agda}
\label{depend.sect.translation.agda}

To explain the preprocessing of |rulercoreagda| to Agda, we give a translation scheme in Figure~\ref{depend.fig.sem.denot.agda} (explained via examples below). This translation scheme is a denotational semantics for |rulercoreagda|. Also, if the translation is correct Agda, then the original is correct |rulercoreagda|.

A semantics block in a |rulercoreagda| program is actually an algorithm that makes precise how to compute the attributes as specified by the interface: for each visit, the rules prescribe when to compute an attribute and when to visit a child. The idea is that we map such a block to an Agda function that takes values for its inherited attributes and delivers a dependent product of synthesized attributes. However, such a function would be cyclic: in the presented example, the result |gathEnv| would be needed as input for |finEnv|.
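For an interface without this cycle the mapping is direct: |Root|, for instance, has a single visit with only synthesized attributes, so its semantics could simply be a dependent product of those attributes, as in the following sketch (|TpRootFlat| is a name of our own; the types actually generated, shown below for |Source|, additionally nest one function per visit and end in a terminator):
\begin{code}
TpRootFlat : Set    -- one component per synthesized attribute of |Root|
TpRootFlat = Sigma Env (\ finEnv -> (Errs finEnv) `Either` (Target finEnv))
\end{code}
For |Source|, however, the inherited |finEnv| and |gathInFin| would have to be arguments of the very function that delivers |gathEnv|, which is exactly the cycle described above.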
Fortunately, we can bypass this problem: we map to a |k|-visit \emph{coroutine} instead: a function that can be invoked |k| times, where each invocation takes a partition of the inputs, and produces a partition of the results (each invocation corresponds to a visit of the interface). In a pure functional language (like Agda), we can encode coroutines as one-shot continuations (or \emph{visit functions}~\cite{Saraiva99}). %format inhAx = "\Varid{inh}_\Varid{a}" x %format synAx = "\Varid{syn}_\Varid{a}" x %format cVx = "\Varid{c}_\Varid{v}" x %format contv = "\varsigma" \begin{figure}[htb] \begin{smallcode} \begin{code} (semb (itf I (many x) : (many (sub tau x))-> tau) v) ~> (semiv (v) (I) (many (x : (sub tau x) )) (tau)) ^^^ ; ^^^ (semb (sig I)) : (semb tau) ^^^ ; ^^^ (semb (sig I)) = (semb (sig I (name v))) (semiv (visit x inh (many a) syn (many b) v) (I) (many g) (tau)) ~> (semiv (v) (I) (many g ++ many a ++ many b) (tau)) -- interface type for later visits (semb (sig I x)) : (semat (sub g 1)) -> ... -> (semat (sub g n)) -> (semb (resultty tau)) (semb (sig I x)) (many (seman g)) = (sema (ainh.(sub a 1))) -> ... -> (sema (ainh.(sub a n))) -> (semb (typrod (many asyn.b) (sig I (name v)))) (semiv (terminator) (I) (many g) (tau)) ~> (semb (sig I terminator)) = terminator -- terminator (some unit-value) (sema (x : e)) ~> (semb (atname x)) : (semb e) -- extract attribute name and type (semat (x : e)) ~> (semb e) -- extract attribute type (seman (x : e)) ~> (semb (atname x)) -- extract attribute name (semb (sem x : I (many e) t)) ~> (semb (vis lhs (name t))) awhere (semev t I (many e, emptyset)) -- top of semfun (semev (visit x (many r) t) (I) (many e, many g)) ~> (semb (vis lhs x)) : (semb (sig I x)) (semb (many e)) (many (seman g)) -- type of visit fun (semb (vis lhs x)) (semb (inhs I x)) = (semr (many r) (semb contv)) -- chain of rules ^^^^ (semb contv) ~> = (semb (valprod (syns I x) (vis lhs (name t)))) awhere (semev t I (many g ++ many a ++ many b)) -- next visit (semev terminator (I) (many e, many g)) ~> (semb (vis lhs terminator)) : (semb (sig I terminator)) (semb (many e)) (many (seman g)) ^^^ ; ^^^ (semb (vis lhs terminator)) = terminator (semr (child c : I = e) k) ~> with (semb e) ^^^ ... | (semb (vis I (firstvisit I))) (semb k) -- |k|: remaining rules (semr (invoke x of c) k) ~> with (semb (vis (xitf c) x)) (semb (inhs (xitf c) x)) -- pass inh values ... | (valprod (syns (xitf c) x)) (semb k) -- match syn values (semr (p e') k) ~> (semep e' p k) -- translation for attr def rule (semep (with e (many (p' e'))) p k) ~> with e ^^^ (many (... | p' (semr (p e') k))) -- rule RHS is with-constr (semep (= e) p k) ~> with e ^^^ ... | (semb p) k -- rule RHS is expr atref ainh.c.x = refi c x ^^^^ atname ainh.x = inhAx -- naming conventions atref asyn.c.x = refs c x ^^^^ atname asyn.x = synAx -- |atref|: ref to attr value atref loc.x = refL loc x ^^^^ atname x = x -- |atname|: ref to attr in type vis I x = vis lhs x ^^^^ sig I = Tp I -- |vis|: name of visit function vis c x = refv c x ^^^^ sig I x = Tpv I x -- |sig|: itf types \end{code} \end{smallcode} \caption{Translation of |rulercoreagda| to Agda.} \label{depend.fig.sem.denot.agda} \end{figure} We generate types for coroutines and for the individual visit functions that make up such a coroutine. These types are derived from the interface. For each visit (e.g. 
|translate| of |Source|), we generate a type that represents a function type from the attribute types of the inherited attributes for that visit, to a dependent product (|Sigma|) of the types of the synthesized attributes and the type of the next visit function. These types are parameterized with the attributes of earlier visits (e.g. |Tpv Source translate (refa asyn gathEnv)|). The type of the coroutine itself is the type of the first visit.
%if fullversion
The type of the last visit is a terminator |terminator|.
%endif
\begin{smallcode}
\begin{code}
Tpv Source terminator (refa asyn gathEnv) (refa ainh finEnv) (refa ainh gathInFin) (refa asyn outcome) = terminator   -- terminator
Tpv Source translate (refa asyn gathEnv) =                                                                            -- type for last visit
  ((refa ainh finEnv) : Env) -> ((refa ainh gathInFin) : (refa asyn gathEnv) `subset` (refa ainh finEnv)) ->          -- inh attrs
  Sigma (Errs (refa ainh finEnv) `Either` Target (refa ainh finEnv)) ((\(refa asyn outcome) ->                        -- syns prod
    (Tpv Source terminator) (refa asyn gathEnv) (refa ainh finEnv) (refa ainh gathInFin) (refa asyn outcome)))        -- next vis
Tpv Source analyze = Sigma Env (\(refa asyn gathEnv) -> (Tpv Source translate) (refa asyn gathEnv))
Tp Source = Tpv Source analyze                                                                                        -- type for coroutine: type of the first visit
\end{code}
\end{smallcode}
The restrictions on attribute order in the interface ensure that referenced attributes are in scope.
%if fullversion
This representation can be optimized a bit by passing on only those attributes that are referenced in the remainder.
%endif
The scheme for |semiv v I (many g) tau| formalizes this translation, where |g| is the list of preceding attribute declarations, and |tau| the type for |I|. The |typrod| function mentioned in the scheme constructs a right-nested dependent product.

The coroutine itself consists of nested continuation functions (one for each visit). Each continuation takes the visit's inherited attributes as parameters, and consists of a tree of with-constructs that represent intermediate computations of attributes and invocations of visits to children. Each leaf ends in a dependent product of the visit's synthesized attributes and the continuation function for the next visit\footnote{
As a technical detail, a leaf of the with-tree may also be an \emph{absurd pattern}. These are used in Agda to indicate an unsatisfiable alternative. A body for such an alternative cannot be given.
}.
\begin{smallcode}
\begin{code}
semprod_seq : Tp Source -> Tp Source -> Tp Source                        -- example translation for |`diam`|
semprod_seq (refc loc left) (refc loc right) = refv lhs analyze awhere   -- delegates to first visit function
  (refv lhs analyze) : Tpv Source analyze                                -- signature of first visit function
  (refv lhs analyze) with ...                                            -- computations for |analyze| here
  ... = (refs lhs gathEnv, refv lhs translate) awhere                    -- result of first visit function
    (refv lhs translate) : Tpv Source translate (refs lhs gathEnv)       -- last visit function
    (refv lhs translate) (refi lhs finEnv) (refi lhs gathInFin) with ... -- computations for |translate| here
    ... = (refs lhs outcome, refv lhs terminator) awhere                 -- result of second visit function
      (refv lhs terminator) : Tpv Source terminator (refs lhs gathEnv) (refi lhs finEnv) (refi lhs gathInFin) (refs lhs outcome)
      (refv lhs terminator) = terminator                                 -- explicit terminator value
\end{code}
\end{smallcode}
The scheme |(semev v (I) (many e, many g))| formalizes this translation for a visit |v| of interface |I|, where |many e| are type arguments to the interface (empty in the example), and |many g| are the attributes of previous visits.

The with-tree for a visit-function consists of the translation of child-rules, invoke-rules and evaluation rules. Each rule plugs into this tree. For example, the translation for |(semb (child left : Source = (refs loc left)))| is:
\begin{code}
... with (refs loc left)                  -- evaluate RHS to get first visit fun
... | (refv left analyze) with ...        -- give it a name + proceed with remainder
\end{code}
For |(semb (invoke translate of left))| the translation is:
\begin{code}
... with (refv left translate) (refi left finEnv) (refi left gathInFin)   -- visit fun takes inh attrs
... | ((refs left outcome), (refv left terminator)) with ...              -- returns product of syn attrs
\end{code}
For |(semb (lhs.gathEnv = left.gathEnv ++ right.gathEnv))|:
\begin{code}
... with (refs left gathEnv) ++ (refs right gathEnv)   -- translation for RHS
... | (refs lhs gathEnv) with ...                      -- LHS + remainder
\end{code}
For |(semb (lhs.outcome with ...))| (where the RHS is a with-construct), we duplicate the remaining with-tree for each alternative of the RHS:
\begin{code}
... with (refs left outcome)                              -- translation for RHS
... | inj1 es with (refs right outcome)
... | inj1 es1 | inj1 es2 with inj1 (es1 ++ es2)          -- alternative one of four
... | inj1 es1 | inj1 es2 | (refs lhs outcome) with ...   -- LHS + remainder
... | inj1 es1 | inj2 _ with inj1 es1                     -- alternative two of four
... | inj1 es1 | inj2 _ | (refs lhs outcome) with ...     -- LHS + remainder
... | inj2 ...                                            -- remaining two alternatives
\end{code}
The scheme |(semr r k)| formalizes this translation, where |r| is a rule and |k| the translation of the rules that follow |r|.

The size of the translated code may be exponential in the number of rules with with-constructs as RHS.
%if False
This is not bad; it may just encourage users to break encapsulation.
%endif
It is not obvious how to treat such rules otherwise. Agda does not allow a with-construct as a subexpression. Neither can we easily factor out the RHS of a rule to a separate function, because the conclusions drawn from the evaluation of preceding rules are not in scope of such a function. Fortunately, for rules that would otherwise cause a lot of needless duplication, the programmer can perform this process manually. When dependent pattern matching brings assumptions in scope that are needed \emph{across} rules, the code duplication is a necessity. To facilitate that pattern-matching effects are visible across rules, we need to ensure that the rule that performs the match is ordered before a rule that needs the assumption. We showed in previous work how such non-attribute dependencies can be captured~\cite{Middelkoop10wgt}.

The translated code has attractive operational properties. Each attribute is only computed once, and each node is traversed at most |k| times.

\section{Improvement: Partially Defined Attributes}
\label{depend.sect.partial}

A fine granularity of attributes is important to use an AG effectively.
In the mini-compiler of Section~\ref{depend.section.example}, we could replace the attribute |outcome| with an attribute |code| and a separate attribute |errors|. This would be more convenient, since it would not require a pattern match against the |outcome| attribute to collect errors. However, we cannot produce a target term in the presence of errors, thus |code| would not have a total definition. Therefore, we were forced to combine these two aspects into a single attribute |outcome|. It is common to use partially defined attributes in an AG, especially for the idiom where an attribute's value (e.g. |errors|) determines if another attribute is defined (e.g. |code|).

We present a solution that uses the partitioning of attributes over visits. The idea is to make visits dependent on the value of a preceding attribute. We split up the |translate| visit into a visit |report| and a visit |generate|. The visit |report| has |errors| as synthesized attribute, and |generate| has |code|. Furthermore, we enforce that |generate| may only be invoked (by the parent in the AST) when the list of errors reported in the previous visit is empty. We accomplish this with an additional attribute |noErrors| on |generate| that gives evidence that the list of errors is empty. With this evidence, we can give a total definition for |code|.
\begin{smallcode}
\begin{code}
itf Source   -- |Root|'s visit needs to be split up in a similar way
  visit report    ^^^  syn errors : Errs ainh.finEnv       -- parent can inspect |errors|
  visit generate  ^^^  inh noErrors : asyn.errors == []    -- enforces invariant
                       syn code : Target ainh.finEnv       -- only when |errors| is empty

datasem Source
  prod use                                                 -- example for production |use|
    loc.testInEnv = loc.nm `mbElem` lhs.finEnv             -- scheduled in visit |report|
    lhs.code with loc.testInEnv | lhs.noErrors             -- scheduled in visit |generate|
      | inj1 _     | ()                                    -- cannot happen, hence an \emph{absurd pattern}
      | inj2 isIn  | refl = use loc.nm isIn                -- extract the evidence needed for the |code| term

datasem Source
  prod `diam`   -- |leftNil : (alpha : Env) -> (beta : Env) -> (alpha ++ beta == []) -> (alpha == [])|
    left.noErrors  = leftNil left.errors right.errors lhs.noErrors   -- |right.noErrors| similar
    lhs.code       = left.code `diam` right.code           -- scheduled in visit |generate|
\end{code}
\end{smallcode}
For this approach to work, it is essential that visits are scheduled as late as possible, and that only those visits that are needed are scheduled at all. We can improve this approach by defining a fixed number of alternative sets of attributes for a visit, and using the value of a preceding attribute to select one of these sets~\cite{middelkoop09wgt10}.
%if False
Prove completeness:
\begin{code}
itf Expr
  visit typecheck
    inh Gamma
    syn self
    syn outcome
  visit proof
    syn ty
    syn prf1 : ty in outcome
    inh ty'
    inh deriv : (inh.Gamma :- syn.self : inh.ty')
    syn prf2 : (syn.ty <= inh.ty')
\end{code}
visits not a total order but partial order.
%endif
%if False
We show that the granularity of attributes is important, and how we can circumvent merging the attributes. This solves several problems that become more urgent when the AG gets larger and has more attributes.

Firstly, two attributes that we could before give separate rules for, now have to be written in a large rule. This affects the ability to describe and arrange rules for various aspects separately. For example, we cannot reason about the computations for errors separately from the computations related to target terms.
A fine granularity of attributes is an advantage that an Attribute Grammar has over a conventional catamorphism. Secondly, when an attribute has a type with more than one inhabitant, we can only match against its contents in the right-hand side of a rule. When multiple rules refer to this attribute, they duplicate the match in their right-hand sides. When the match also opens an existential, it is opened twice, resulting in incomparable values that are actually the same. To remedy this situation, we would like to be able to write rules in the context of having matched against another attribute. Finally, the merged rules in the example above prevented us from using a default rule for the errors. If we had an error attribute available for all children that returned an |inj1|, then the default-rule would produce exactly the expressions that we now wrote manually.

\subsection{Dependent Attribute Grammars with Context}

To improve attribute granularity, we distinguish contexts in the interface. A visit may result in one of several different outcomes, and we can match against these outcomes.
\begin{code}
itf Root
  visit compile    inh initEnv : Env
                   syn outcome : Outcome
itf Source
  visit analyze    syn gathEnv : Env
  visit translate  inh finEnv : Env
                   inh gathInFin : asyn.gathEnv `subset` ainh.finEnv
                   context failure  syn errs : Errs ainh.finEnv
                   context success  syn target : Target ainh.finEnv
\end{code}
Furthermore, we provide a more fine-grained mechanism to specify rules. We organize a production as a sequence of visits, with a number of clauses for each visit. In a clause, the next visit is defined. This forms a tree whose structure is largely defined by the interface: rules may be specified at each level of the tree, thus in a production, a visit, or a clause. The lifetime of a rule is limited to the subtree it appears in, and we may schedule the evaluation of a rule anywhere in this lifetime (not earlier). The subtree also introduces a scope. The common attributes of clauses are also accessible from the parent scope. This way, we can define rules in as high a scope as possible and let the rules be ordered automatically based on their attribute dependencies. When a rule depends on a clause or on a particular visit, we can restrict its lifetime to that visit.

To prevent notational overhead, in the code for a production, we only have to introduce the visits that matter for the lifetime and contexts of rules and clauses. Visits that are not explicitly introduced are introduced implicitly. If a visit has only one clause, then that clause may be omitted, and we introduce an anonymous clause for it. During the invocation of a visit, its clauses are tried in the order of appearance. The execution of rules may fail under certain conditions (explained below) and cause a backtrack to the next clause. The result context of the visit can be specified per clause, or for all clauses at the same time per visit. Moreover, we introduce an internal visit, which is not visible in the interface, but allows a choice of clause to be made during the execution of the visit.
\begin{code}
datasem Root prod root
  root.finEnv     = root.gathEnv ++ lhs.initEnv
  root.gathInFin  = subLeft
  internal
    clause rootOk : success
      context translate of root : failure
      lhs.outcome = outcome (inj1 root.errs)
    clause rootFail : success
      context translate of root : success
      lhs.outcome = outcome (inj2 root.target)
\end{code}
For the semantics of |Source|, we distinguish separate attributes and rules in different contexts.
For example, in the |use| production, we match against the outcome of the membership test, and have two clauses: one for each outcome.
\begin{code}
datasem Source
  default finEnv   = last
  default gathEnv  = concat
  prod use
    loc.checkIn = loc.nm `mbElem` lhs.finEnv
    visit translate
      clause notInEnv : failure
        match inj1 notIn = loc.checkIn
        lhs.errs = [ scope loc.nm notIn ]
      clause isInEnv : success
        match inj2 isIn = loc.checkIn
        lhs.target = use loc.nm isIn
  prod def
    loc.inFin   = inSubset lhs.gathInFin here
    lhs.target  = def loc.nm loc.inFin
    visit translate : success
  prod _`diam`_
    left.gathInFin   = trans subLeft lhs.gathInFin
    right.gathInFin  = trans (subRight {asyn.lhs.gathEnv} {lhs.finEnv}) lhs.gathInFin
    visit translate
      clause both : success
        context translate of left : success
        context translate of right : success
        lhs.target = left.target `diam` right.target
      clause contexts : failure
        default errs = concat
\end{code}
If a visit can result in more than one context and a match against the context is omitted, then we are not allowed to use any of the synthesized attributes of that visit. As a shortcut, we may introduce a pseudo clause |contexts|, which stands for multiple clauses: one for each combination of remaining contexts. The default-rules may refer to attributes of children that are available due to matches against contexts.
%endif

\section{Related Work}

Dependent types originate in Martin-L\"of's Type Theory. A variety of dependently typed programming languages are gaining popularity, including Agda~\cite{DBLP:conf/tldi/Norell09}, Epigram, and Coq.
%% including Agda~\cite{DBLP:conf/tldi/Norell09}, Epigram~\cite{DBLP:conf/afp/McBride04},
%% and Coq~\cite{DBLP:conf/tphol/Bertot08}.
We present the ideas in this paper with Agda as the host language, because it has a notion of dependent pattern matching, onto which we straightforwardly map the left-hand sides of AG rules. Also, in Coq and Epigram, a program is written via interactive theorem proving with tactics or commands. The preprocessor-based approach of this paper, however, fits a declarative style better.

Attribute grammars~\cite{DBLP:journals/mst/Knuth68} are considered to be a promising implementation technique for compiler construction. Recently, many Attribute Grammar systems have emerged for mainstream languages, such as JastAdd
%% ~\cite{DBLP:conf/oopsla/EkmanH07}
and Silver
%% ~\cite{DBLP:journals/entcs/WykBGK08}
for Java, and UUAG~\cite{uuagc} for Haskell. These approaches may benefit from the stronger type discipline presented in this paper; however, this would require an encoding of dependent types in the host language.
%if fullversion
In certain languages it is possible to implement AGs via meta-programming facilities, which removes the need for a preprocessor. Viera et al.~\cite{DBLP:conf/icfp/VieraSS09} show how to embed AGs in Haskell through type-level programming. A combination of that approach with our work would fit well in Agda, if Agda had a mechanism similar to Haskell's class system.
%% Exploit preprocessor to generate a first-class Agda object?
%endif

AGs have a straightforward translation to cyclic functions in a lazy functional programming language~\cite{DBLP:conf/ifip2-4/SwierstraA98}. To prove that such a cyclic function is total and terminating is a non-trivial exercise. Kastens~\cite{DBLP:journals/acta/Kastens80} presented Ordered Attribute Grammars (OAGs). In OAGs, the evaluation order of attribute computations as well as attribute lifetimes can be determined statically.
Saraiva~\cite{Saraiva99} described how to generate (noncyclic) functional coroutines from OAGs. The coroutines we generate are based on these ideas.
%if False
Boyland~\cite{DBLP:journals/toplas/Boyland96} introduced conditional attribute grammars. In such a grammar, semantic rules may be guarded. A rule may only be evaluated if its guard holds. Evaluation of guards can influence the evaluation order, which makes the evaluation less predictable. In comparison, in our clauses-in-visits model, we have to indicate explicitly in which visits guards are evaluated (the match-statements of a clause), which makes the evaluation order clear. Our approach has the additional benefit that children may be conditionally introduced and visited.
%endif
%if False
The work for this paper was carried out as part of a research project to support the implementation of the type inference component of the Utrecht Haskell compiler. We presented an earlier version of |rulerfront|'s clauses-per-visit model to allow attribute grammars with case distinction on more than a single AST, and to express iteration and dynamic growing of trees~\cite{Middelkoop10vis,Middelkoop10wgt,Middelkoop10gpce}.
%endif

\section{Conclusion}
\label{depend.sect.conclusion}

We presented |rulerfrontagda|, a language for ordered AGs with dependently typed attributes: the type of an attribute may refer to the value of another attribute. This feature allows us to conveniently encode invariants in the types of attributes, and to pass proofs of these invariants around as attributes. With a dependently typed AG, we write algebras for catamorphisms in a dependently typed language in a composable way. Each attribute describes a separate aspect of the catamorphism. A particular advantage of composability is that attributes can easily be added and shared. Moreover, via \emph{local attributes} we can specify invariants and proofs at the places where the data is. We prove for the example in Section~\ref{depend.section.example} that the final environment must be equal to the gathered environment at the root of the tree:
\begin{code}
datasem Root prod root   -- more rules for the root production
  loc.eqEnvs : inh.root.finEnv == syn.top.gathEnv   -- signature for local attr
  loc.eqEnvs = refl                                 -- proof of the equality
\end{code}
The approach we presented is lightweight, which means that we encode AGs as an embedded language (via a preprocessor), such that type checking is deferred to the host language. To facilitate termination checking, we translate the AG to a coroutine (Section~\ref{depend.sect.translation.agda}) that encodes a terminating, multi-visit traversal, under the restriction that the AG is ordered and the definitions of attributes are total.

The preprocessor approach fits nicely with the interactive Emacs mode of Agda. Type errors in the generated program are traceable back to the source: in a statically checked |rulerfrontagda| program these can only occur in Agda blocks. These Agda blocks are literally preserved; due to Unicode, even attribute references can stay the same. Also, the Emacs mode implements interactive features via markers. These markers are also preserved by the translation. The AG preprocessor is merely an extra step in the preprocessing pipeline.

With some generalisations, the work we presented is a proposal for a more flexible termination checker for Agda that accepts |k|-orderable cyclic functions, if the function can be written as a non-cyclic |k|-visit coroutine.
%% (i.e. at call sites, an input may depend on an output)
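
To convey the shape of such a coroutine, the following sketch is illustrative only and not the code our translation generates; the names |VisitShape|, |Syn1|, |Inh2|, |Syn2|, |Visit1| and |Visit2| are placeholders for the attribute types and visit functions of a two-visit interface. The first visit returns its synthesized attributes together with the function implementing the second visit, whose inherited attributes may depend on those results; for a |k|-visit interface this shape nests |k| times.
\begin{code}
module VisitShape  (Syn1  : Set)                            -- syn attrs of visit 1 (placeholder)
                   (Inh2  : Syn1 -> Set)                    -- inh attrs of visit 2, may depend on visit 1
                   (Syn2  : (s1 : Syn1) -> Inh2 s1 -> Set)  -- syn attrs of visit 2
                   where
  Visit2 : Syn1 -> Set
  Visit2 s1 = (i2 : Inh2 s1) -> Syn2 s1 i2   -- second visit: a function from inh to syn attrs

  record Visit1 : Set where                  -- first visit: syn attrs paired with the next visit
    field
      syn1    : Syn1            -- synthesized attributes of the first visit
      visit2  : Visit2 syn1     -- the coroutine continuation for the second visit
\end{code}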
%if fullversion
As future work, it may be possible to exploit patterns in AG descriptions to generate boilerplate for proofs, for example, to deal with fixpoint iteration in AGs (and the corresponding termination and monotonicity properties), or to deal with standard lookup and extension patterns of variables in environments. Also, it may be possible to generate proof and AG templates from a type system specification~\cite{DBLP:conf/flops/DijkstraS06}.
%endif
%if False

\section{Implementation of the Support Code}
\label{depend.sect.support}

In this section, we give a definition of the support code mentioned briefly in earlier sections. As an example, below is a block of Agda code. We introduce types to represent identifiers, environments, and scoping errors. We simply take identifiers to be strings, and an environment is a cons-list of them.
\begin{code}
Ident : Set
Ident = String

Env : Set
Env = List Ident

data Err : Set where
  scope : (nm : Ident) -> Err

Errs : Set
Errs = List Err
\end{code}
We use one type of error, a |scope|-error, consisting of an identifier |nm|. The constructor |scope| takes a value of type |Ident| as a parameter. In a dependent type, we may give a name to such a value in the type signature. In this example, we give it the name |nm|. This name may be used in the remainder of the signature to specify predicates over |nm| (types that depend on |nm|). We do not require this functionality yet; later, we add an additional field to the |scope| constructor to represent a proof that |nm| is not in some environment. A type |nm `elem` Gamma| (in |Set|) is an example of such a predicate, where |nm| and |Gamma| are value-level identifiers. A value of this predicate is a proof that the predicate holds for the value-level identifiers.

In the example, we manipulate environments and reason about the identifiers in them. Two relations, |_subseq| and |_elem|, play an important role. A predicate |Gamma1 `subseq` Gamma2| specifies that |Gamma1| is a subsequence of |Gamma2|, and |nm `elem` Gamma| specifies that |nm| is an element of |Gamma|. Such relations are representable in Agda as data types, where the clauses of the relation are represented as constructors. A value of such a type is a proof: a derivation tree with constructor-values as nodes. We use the subsequence relation between environments in the example to reason about environments that we extend by prepending or appending identifiers of other environments. A subset relation would be another possibility, but the former suffices and is slightly easier to use. The curly braces in the code below introduce implicit parameters, which may be omitted if their contents can be fully determined by Agda through unification.
\begin{code}
adata _subseq : Env -> Env -> Set awhere
  refl      : {Gamma : Env} -> Gamma `subseq` Gamma
  subLeft   : {Gamma1 : Env} {Gamma2 : Env} -> Gamma1 `subseq` (Gamma1 ++ Gamma2)
  subRight  : {Gamma1 : Env} {Gamma2 : Env} -> Gamma2 `subseq` (Gamma1 ++ Gamma2)
  trans     : {Gamma1 : Env} {Gamma2 : Env} {Gamma3 : Env} -> Gamma1 `subseq` Gamma2 -> Gamma2 `subseq` Gamma3 -> Gamma1 `subseq` Gamma3
\end{code}
With |subLeft| and |subRight|, we can add arbitrary environments in front of or after the environment on the right-hand side. With |trans| we repeat this process. For membership in the environment, either the element has to be at the front of the list, or there is a proof that the element is in the tail.
\begin{code}
adata _elem : (nm : Ident) -> (Gamma : Env) -> Set awhere
  here  : {nm : Ident} {Gamma' : Env} -> nm `elem` (nm :: Gamma')
  next  : {nm : Ident} {nm' : Ident} {Gamma : Env} -> nm `elem` Gamma -> nm `elem` (nm' :: Gamma)
\end{code}
With the above definitions, we prove the lemma |inSubset|, which states that when an environment |Gamma'| is a subsequence of |Gamma| and |nm| is an element of |Gamma'|, then |nm| is also in |Gamma|.
\begin{code}
append : {nm : Ident} {Gamma : Env} -> (nm `elem` Gamma) -> (Gamma' : Env) -> (nm `elem` (Gamma ++ Gamma'))
append {nm}  {.nm :: Gamma}  (here)          Gamma'  = here {nm} {Gamma ++ Gamma'}
append {nm}  {nm' :: Gamma}  (next inGamma)  Gamma'  = next (append {nm} {Gamma} inGamma Gamma')
append {_}   {[]}            ()              _

prefix : {nm : Ident} {Gamma' : Env} -> (nm `elem` Gamma') -> (Gamma : Env) -> (nm `elem` (Gamma ++ Gamma'))
prefix inGamma' []            = inGamma'
prefix inGamma' (x :: Gamma)  = next (prefix inGamma' Gamma)

inSubset : {nm : Ident} {Gamma : Env} {Gamma' : Env} -> (Gamma' `subseq` Gamma) -> nm `elem` Gamma' -> nm `elem` Gamma
inSubset refl                    inGamma'  = inGamma'   -- the subsequence is the environment itself
inSubset (subLeft {_} {Gamma'})  inGamma'  = append inGamma' Gamma'
inSubset (subRight {Gamma})      inGamma'  = prefix inGamma' Gamma
inSubset (trans subL subR)       inGamma'  = inSubset subR (inSubset subL inGamma')
\end{code}
The proof of |inSubset| consists of straightforward structural induction. The lemmas |append| and |prefix| take care of the base cases. With |append|, the proof for |Gamma| and |Gamma ++ Gamma'| is the same until we find the element. We then produce a |here| with the appropriate environment. When we have a proof that an identifier is in the environment, we never run into an empty environment without finding the identifier first. Agda's totality checker requires us to add a clause for when the environment is empty. The match against the proof can then not succeed, hence the match is called absurd (indicated with the empty parentheses), and the right-hand side must be omitted.

Similarly, we prove that, given an identifier |nm| and an environment |Gamma|, we can either determine that |nm| is in |Gamma| or that it is not in |Gamma|. The sum-type |_`Either`_| (named |Either| in Haskell) provides the constructors |inj1| (|Left| in Haskell) and |inj2| (|Right| in Haskell) for this purpose. In this proof, we use |notFirst| to prove by contradiction that an identifier is not in the environment if it is not equal to the head of the environment and not in the tail of the environment either. If it were in the head, this would conflict with the former assumption; if it were in the tail, this would conflict with the latter assumption.
\begin{code}
notFirst : {nm : Ident} {nm' : Ident} {Gamma : Env} -> neg (nm == nm') -> neg (nm' `elem` Gamma) -> (nm' `elem` (nm :: Gamma)) -> bottom
notFirst prv1  _     here         = prv1 refl
notFirst _     prv1  (next prv2)  = prv1 prv2
\end{code}
To find an element |nm'| in an environment, we walk over the cons-list. If the environment is empty, the proof that the identifier is not in the list is trivial: if it were, neither |here| nor |next| could match, because they expect a non-empty environment; hence we can only match with an absurd pattern, which provides the contradiction. If the environment is not empty, then it has an identifier |nm| as its head. We use the string equality test |`mbEqual`| to give us, via the constructor |yes|, a proof of equality between |nm| and |nm'|, or a proof of its negation via |no|.
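The test |`mbEqual`| itself is not part of the support code shown in this section. A minimal sketch, assuming that |Ident| is |String|, that |==| denotes propositional equality, and that the |Dec| type (with constructors |yes| and |no|) and a decidable string equality are taken from Agda's standard library (module names differ between library versions), could simply delegate to the library:
\begin{code}
open import Relation.Nullary using (Dec; yes; no)   -- decisions carrying a proof or a refutation
open import Data.String using (_≟_)                 -- decidable equality on strings (location varies per version)

_mbEqual : (nm : Ident) -> (nm' : Ident) -> Dec (nm == nm')
nm `mbEqual` nm' = nm ≟ nm'
\end{code}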
We wish to express the pattern |nm' `mbElem` (nm' :: Gamma)|, with the intention of matching the case in which |nm'| equals the head of the environment. In Agda, however, a pattern must be strictly linear: an identifier may occur only once in a pattern. To express that a value is the same as an already introduced value, we use a dot-pattern. This requires a proof that the two are indeed the same, which is provided by a match against |refl| (the only constructor of an equality proof).
\begin{code}
_mbElem : (nm : Ident) -> (Gamma : Env) -> neg (nm `elem` Gamma) `Either` (nm `elem` Gamma)
nm' `mbElem` []                            = inj1 (\())
nm' `mbElem` (nm :: Gamma)    with nm `mbEqual` nm'
nm' `mbElem` (.nm' :: Gamma)  | yes refl   = inj2 here
nm' `mbElem` (nm :: Gamma)    | no prv'    with nm' `mbElem` Gamma
nm' `mbElem` (nm :: Gamma)    | no prv'    | inj2 prv  = inj2 (next {nm'} {nm} prv)
nm' `mbElem` (nm :: Gamma)    | no prv'    | inj1 prv  = inj1 (notFirst prv' prv)
\end{code}
A with-expression is a variation on Haskell's case-expressions. It allows case distinction on a computed value, but also a refinement of previous pattern matches. Semantically, it computes the value of the scrutinee, then calls the function again with this value as an additional parameter. Horizontal bars control which clauses belong to which with-expression.
%endif
%endif