\documentclass{llncs} %let fullversion = False %include lhs2TeX.fmt %include polycode.fmt \usepackage{stmaryrd} \usepackage{float} \usepackage{graphicx} \usepackage{pgf} \usepackage{tikz} \usepackage{makeidx} \usepackage{txfonts} \usetikzlibrary{arrows,positioning} \floatstyle{boxed} \restylefloat{figure} \setlength{\fboxsep}{1.4pt} %format rulerfront = "{\mbox{\scshape ruler-front}}" %format rulercore = "{\mbox{\scshape ruler-core}}" %format rulerfrontagda = "{\mbox{\scshape ruler-front}_{\mbox{\scshape agda}}}" %format Agda = "{\mbox{\scshape Agda}}" %format . = "." %format ^^^ = "\;\;\;" %format ^^^^ = "\hspace{1em}" %format epsilon = "\epsilon" %format (sub (x) (y)) = "{" x "}_{" y "}" %format (sup (x) (y)) = "{" x "}^{" y "}" %format (many (x)) = "\overline{" x "}" %format forall = "\forall" %format Gamma = "\Gamma" %format Sigma = "\Sigma" %format Phi = "\Phi" %format :- = "\vdash" %format emptyset = "\emptyset" %format sep = "\:;\:" %format bar = "|" %format ::: = "\::\:" %format `union` = "\cup" %format union = "\cup" %format `intersection` = "\cap" %format intersection = "\cap" %format `subset` = " \subseteq " %format .== = "\doteq" %format `diam` = "\diamond" %format neg a = "\neg" a %format bottom = undefined %format Gamma0 %format Gamma1 %format Gamma2 %format Gamma3 %format Gamma4 %format Gamma' %format Gamma0 %format Gamma'' %format inGamma = "in" Gamma %format inGamma' %format Set1 %format Set2 %format _elem = "\_\in\_" %format _subseq = "\_\sqsubseteq\_" %format `subseq` = "\sqsubseteq" %format `mbElem` = "\in_?" %format _mbElem = "\_\in_?\_" %format `mbEqual` = "\equiv_?" 
%format `Either` = "\uplus" %format inj1 %format inj2 %format prv1 %format prv2 %format prv' %format i1 %format i2 %format x1 %format x2 %format x3 %format x4 %format es1 %format es2 %format t1 %format t2 %format a1 %format a2 %format ty = "\tau" %format tail = "\kw{tail}" %format and = "\kw{and}" %format attr = "\kw{attr}" %format data = "\kw{data}" %format sem = "\kw{sem}" %format inh = "\kw{inh}" %format syn = "\kw{syn}" %format con = "\kw{con}" %format clause = "\kw{clause}" %format default = "\kw{default}" %format itf = "\kw{itf}" %format visit = "\kw{visit}" %format internal = "\kw{internal}" %format match = "\kw{match}" %format child = "\kw{child}" %format invoke = "\kw{invoke}" %format datasem = "\kw{datasem}" %format grammar = "\kw{grammar}" %format term = "\kw{term}" %format nonterm = "\kw{nonterm}" %format prod = "\kw{prod}" %format context = "\kw{context}" %format contexts = "\kw{contexts}" %format attach = "\kw{attach}" %format detach = "\kw{detach}" %format adata = "\uw{data}" %format awhere = "\uw{where}" %format with = "\uw{with}" %format ainh = "inh" %format asyn = "syn" %format << = "\llbracket" %format >> = "\rrbracket" %format ~> = "\leadsto" %format (semb (z)) = << z >> %format (semiv (z) (i) (n)) = "\llbracket_{\mbox{iv }}" z >> "_{ " i "}" "^{ " n " }" %format (sema (z)) = "\llbracket_{\mbox{a }}" z >> %format (semat (z)) = "\llbracket_{\mbox{at }}" z >> %format (seman (z)) = "\llbracket_{\mbox{an }}" z >> %format (semev (z) (i) (n)) = "\llbracket_{\mbox{ev }}" z >> "_{ " i "}" "^{ " n " }" %format (semr (z) (k)) = "\llbracket_{\mbox{r }}" z >> "_{ " k "}" %format (semep (z) (i) (n)) = "\llbracket_{\mbox{ep }}" z >> "_{ " i "}" "^{ " n " }" \newcommand\kw[1]{\ensuremath{\mathbf{#1}}} \newcommand\uw[1]{\ensuremath{\underline{\mathsf{#1}}}} \newenvironment{smallcode}[0]{\renewcommand{\hscodestyle}{\small}}{\renewcommand{\hscodestyle}{\normalsize}} \begin{document} %if fullversion \title{Dependently-Typed Attribute Grammars (Extended Version)} 
%else
\title{Dependently-Typed Attribute Grammars}
%endif
\author{Arie Middelkoop \and Atze Dijkstra \and S. Doaitse Swierstra}
\institute{Universiteit Utrecht,\\The Netherlands}
\maketitle
\begin{abstract}
Attribute Grammars (AGs) are a convenient formalism for programming complex computations over Abstract Syntax Trees (ASTs), which is particularly useful for the construction of compilers. Given such an AG, it is not obvious whether it is correct. Dependently-typed programming provides a means to address this challenge. In this paper we show how to extend AGs with attributes that may have a dependent type. This allows us to define such computations and prove their correctness by construction.
\end{abstract}
\section{Introduction}
The static semantics of context-free languages can be described with Attribute Grammars (AGs)~\cite{DBLP:journals/mst/Knuth68}. Attribute grammars have the distinct advantage that the descriptions are composable and easy to extend, which turns out to be convenient when working with complex semantics. AGs extend a context-free grammar with attributes associated with a nonterminal, and equations (rules) between these attributes associated with a production. An Abstract Syntax Tree (AST) represents a proof of the recognition of a sentence in the context-free language. Attributes are convenient: they represent individual aspects or properties of the AST, can be combined at each node of the AST to define more complex attributes, and it is easy to add new attributes to the grammar. From a practical point of view, via a preprocessor, an AG can be compiled to an efficient algorithm in some host language that, given an AST, computes values for these attributes. The translated code in the host language then serves as an implementation for a compiler.
In recent years, we applied AGs in several large projects, including the Utrecht Haskell Compiler (UHC)~\cite{1596650}, the Helium~\cite{Leijen:helium} compiler for learning Haskell, and the editor Proxima~\cite{Schrage04proxima}. The use of AGs in these projects is invaluable, for two reasons: rules for attributes can be written in isolation and combined automatically, and trivial rules can be inferred, saving the programmer from writing boilerplate code.
Given a compiler such as UHC, the question arises to what extent the compiler generates correct code (for some definition of correctness). In UHC, as in any project with many contributors, we regularly encounter subtle bugs related to broken hidden invariants that are not discovered by regression testing alone.
The holy grail in compiler implementation is to formally verify that a compiler exhibits certain properties of interest. For example, that a type inferencer is a sound and complete implementation of a type system, and that the meaning of a well-typed source program is preserved. Despite some successes~\cite{DBLP:journals/cacm/Leroy09}, formal verification is still infeasible for compilers that deal with complex language features. It is already a struggle to implement and maintain a compiler that gives output that seems correct, let alone to attempt to prove this without having a good intuition about the structure of such a proof.
When we implement our compilers in languages with advanced type systems, we can ensure that compilers have certain properties by construction. In mainstream strongly typed languages, it is possible to guarantee that (even for non-trivial inputs) the compiler produces output that conforms to a context-free grammar. It is enforceable that a compiler produces, for example, syntactically correct HTML or C code. These are properties we get for free when we define our data structures properly in, for example, the strongly typed programming language Haskell.
Unfortunately, syntactic guarantees do not save us from hidden invariants. For that, we desire semantic guarantees. This is possible by using more complex types, but at a severe cost: it requires us to encode proofs by using the type system as a logic. In a dependently-typed language such as Agda, we can impose arbitrarily complex constraints on values via types, but as a consequence, proving that the program is well typed becomes considerably harder.
To move from a compiler implementation in Haskell to Agda, for example, the least extra effort required---syntax and library issues aside---is to prove termination of the program (which ensures that the program is a proof in a consistent logic). Compilers mostly consist of primitive recursive functions. Those parts of a compiler that rely on iteration either compute fixpoints, or have iteration limits.
%% Unification-based type inferencers may need to ensure that types in the
%% substitution are finite, imposed via the `occurs check', which is a
%% relatively easy property.
The extra effort is substantial, but not a theoretical challenge. From there on, the programmer can make the types more specific, and prove stronger invariants.
Since we implement our compilers with AGs, we present in this paper how to combine these with dependently-typed programming, using Agda in particular. We compile AGs to (a sequence of) functions that take a record of inherited attributes, have cases for each type of AST node, recursively visit the children of such a node, and produce a record of synthesized attributes. Such functions can be written in Agda. There are thus no theoretical limitations that prevent AGs from being embedded in a dependently-typed programming language.
%if fullversion
However, we show that dependently-typed programming imposes structure on an AG that gives rise to coarser attributes and right-hand sides of rules that case-analyze values of attributes.
%else
However, in the extended version of this paper~\cite{Middelkoop10iflreport}, we show that dependently-typed programming imposes structure on an AG that gives rise to coarser attributes and right-hand sides of rules that case-analyze values of attributes. The coarser the granularity of attributes, the fewer benefits an AG has over a conventional fold.
%endif
In short, to profit from programming with AGs, a fine granularity of attributes is beneficial, but may require partially defined attributes. In a dependently-typed setting, all attributes must have a total definition, and to accomplish that, a coarser granularity of attributes may be required.
%if not fullversion
In the extended version, we show that it is possible to retain the fine granularity when we allow AST nodes to reside in inspectable states with different attributions (context-dependent attribution).
%endif
In previous work, we implemented an AG preprocessor |rulerfront| for Haskell, and some other languages~\cite{Middelkoop10gpce,Middelkoop10wgt}. In this paper, we introduce a variation called |rulerfrontagda|, which has additional syntax to support the more complex types for attributes, and to deal with dependently-typed pattern matching in the left-hand sides of rules. In particular, dependent pattern matching against attributes may impose additional constraints on other attributes. For example, in the scope of a match against an equality proof, we get the additional information that two attributes are the same, and Agda requires them to have a common name. The example in Section~\ref{section.example} demonstrates this.
As contributions, we extensively discuss an example in Section~\ref{section.example}, then describe the AG-language |rulerfront| in Section~\ref{sect.rulerfront.agda} and its translation to Agda.
\section{Example}
\label{section.example}
In this section, we demonstrate a mini-compiler implemented as a dependently-typed attribute grammar.
We take as example scope checking of a simple language. The actual concrete syntax of the source language that the compiler takes and the target language that the compiler produces is irrelevant for this paper. We represent the abstract syntax with the Agda data types given below. Scope checking of a simple language is only a minor task in a compiler. However, it still exhibits many aspects of a more realistic compiler.
The code of this compiler consists of blocks of Agda code, and blocks of |rulerfront|. To ease the distinction between the two, we underline keywords of Agda, and format keywords of |rulerfront| in bold.
\subsection{Dependently-typed program in Agda}
The main distinctions between Agda and a conventional programming language such as Haskell are dependent type signatures and dependent pattern matching. We briefly explain these distinctions in this section.
As an example, below is a block of Agda code. We introduce types to represent identifiers, environments, and scoping errors. We simply take identifiers as strings, and an environment is a cons-list of them.
\begin{code}
Ident : Set
Ident = String

Env : Set
Env = List Ident

adata Err : Set awhere
  scope : (nm : Ident) -> Err

Errs : Set
Errs = List Err
\end{code}
We use one type of error, a |scope|-error, consisting of an identifier |nm|. The constructor |scope| takes a value of type |Ident| as parameter. In a dependent type, we may give a name to such a value in the type signature. In this example, we give it the name |nm|. This name may be used in the remainder of the signature to specify predicates over |nm| (types that depend on |nm|). We do not require this functionality yet; later, we add an additional field to the |scope| constructor to represent a proof that |nm| is not in some environment. A type |nm `elem` Gamma| (in |Set|) is an example of a predicate as mentioned above, where |nm| and |Gamma| are value-level identifiers.
A value of the above predicate is a proof that the predicate holds over the value-level identifiers.
In the example, we manipulate environments. We reason about identifiers in the environment. Two relations, |_subseq| and |_elem|, play an important role. A predicate |Gamma1 `subseq` Gamma2| specifies that |Gamma1| is a subsequence of |Gamma2|, and |nm `elem` Gamma| specifies that |nm| is an element of |Gamma|. Such relations are representable in Agda as data types, where the clauses of the relation are represented as constructors. A value of such a type is a proof: a derivation tree with constructor-values as nodes.
We use the subsequence relation between environments in the example to reason about environments that we extend by prepending or appending identifiers of other environments. A subset relation would be another possibility, but the former suffices and is slightly easier to use. The curly braces in the code below introduce implicit parameters that may be omitted if their contents can be fully determined by Agda through unification.
\begin{code}
adata _subseq : (Gamma1 : Env) -> (Gamma2 : Env) -> Set awhere
  trans     :  {Gamma1 : Env} {Gamma2 : Env} {Gamma3 : Env}
               -> Gamma1 `subseq` Gamma2 -> Gamma2 `subseq` Gamma3
               -> Gamma1 `subseq` Gamma3
  subLeft   :  {Gamma1 : Env} {Gamma2 : Env} -> Gamma1 `subseq` (Gamma1 ++ Gamma2)
  subRight  :  {Gamma1 : Env} {Gamma2 : Env} -> Gamma2 `subseq` (Gamma1 ++ Gamma2)
\end{code}
With |subLeft| and |subRight|, we can add arbitrary environments in front of or after the environment on the right-hand side. With |trans| we repeat this process. For membership in the environment, either the element has to be at the front of the list, or there is a proof that the element is in the tail.
\begin{code}
adata _elem : (nm : Ident) -> (Gamma : Env) -> Set awhere
  here  :  {nm : Ident} {Gamma' : Env} -> nm `elem` (nm :: Gamma')
  next  :  {nm : Ident} {nm' : Ident} {Gamma : Env}
           -> nm `elem` Gamma -> nm `elem` (nm' :: Gamma)
\end{code}
With the above definitions, we prove the lemma |inSubset|, which states that when an environment |Gamma'| is a subsequence of |Gamma|, and |nm| is an element of |Gamma'|, then |nm| is also in |Gamma|.
\begin{code}
append : {nm : Ident} {Gamma : Env} -> (nm `elem` Gamma) -> (Gamma' : Env)
         -> (nm `elem` (Gamma ++ Gamma'))
append {nm} {.nm :: Gamma}  (here)          Gamma'  = here {nm} {Gamma ++ Gamma'}
append {nm} {nm' :: Gamma}  (next inGamma)  Gamma'  = next (append {nm} {Gamma} inGamma Gamma')
append {_}  {[]}            ()              _

prefix : {nm : Ident} {Gamma' : Env} -> (nm `elem` Gamma') -> (Gamma : Env)
         -> (nm `elem` (Gamma ++ Gamma'))
prefix inGamma' []            = inGamma'
prefix inGamma' (x :: Gamma)  = next (prefix inGamma' Gamma)

inSubset : {nm : Ident} {Gamma : Env} {Gamma' : Env}
           -> (Gamma' `subseq` Gamma) -> nm `elem` Gamma' -> nm `elem` Gamma
inSubset (subLeft {_} {Gamma'})  inGamma'  = append inGamma' Gamma'
inSubset (subRight {Gamma})      inGamma'  = prefix inGamma' Gamma
inSubset (trans subL subR)       inGamma'  = inSubset subR (inSubset subL inGamma')
\end{code}
The proof of |inSubset| consists of straightforward structural induction. The lemmas |append| and |prefix| take care of the base cases. With |append|, the proof for |Gamma| and |Gamma ++ Gamma'| is the same until we find the element. We then produce a |here| with the appropriate environment. When we have a proof that an identifier is in the environment, we never run into an empty environment without finding the identifier first. Agda's totality checker requires us to add a clause for when the environment is empty. The match against the proof can then not succeed, hence the match is called absurd (indicated by the empty parentheses), and the right-hand side must be omitted.
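For intuition, the computational content of |append| and |prefix| can be mirrored in plain Haskell: a membership proof for |nm `elem` Gamma| corresponds to the position of the identifier, i.e. the number of |next| constructors wrapped around |here|. The sketch below is our own illustration, not part of the development; it only shows how that position changes under appending and prefixing.

```haskell
type Ident = String
type Env   = [Ident]

-- A proof of |nm `elem` Gamma| is modeled by the number of 'next'
-- constructors, i.e. the position of the identifier in the list.

-- |append|: a proof for Gamma remains a proof for Gamma ++ Gamma';
-- the position of the element does not change.
elemAppend :: Int -> Env -> Int
elemAppend pos _gamma' = pos

-- |prefix|: a proof for Gamma' gains one 'next' per element of the
-- new prefix Gamma, mirroring the recursion in |prefix|.
elemPrefix :: Int -> Env -> Int
elemPrefix pos gamma = length gamma + pos
```

In this reading, |subLeft| leaves a position unchanged while |subRight| shifts it past the prepended environment, which is exactly what the two clauses of |inSubset| do.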
Similarly, we prove that given an identifier |nm| and an environment |Gamma|, we either determine that |nm| is in |Gamma| or that it is not in |Gamma|. The sum-type |_`Either`_| (named |Either| in Haskell) provides constructors |inj1| (|Left| in Haskell) and |inj2| (|Right| in Haskell) for this purpose.
In this proof, we use |notFirst| to prove by contradiction that an identifier is not in the environment if it is not equal to the head of the environment, nor in the tail of the environment. If it were in the head, it would conflict with the former assumption. If it were in the tail, it would conflict with the latter assumption.
\begin{code}
notFirst : {nm : Ident} {nm' : Ident} {Gamma : Env} -> neg (nm == nm')
           -> neg (nm' `elem` Gamma) -> (nm' `elem` (nm :: Gamma)) -> bottom
notFirst prv1  _     here         = prv1 refl
notFirst _     prv1  (next prv2)  = prv1 prv2
\end{code}
To find an element |nm'| in an environment, we walk over the cons-list. If the environment is empty, the proof that the identifier is not in the list is trivial: if it were, neither |here| nor |next| could match, because they expect a non-empty environment, hence we can only match with an absurd pattern, which provides the contradiction. If the environment is not empty, then it has an identifier |nm| as the head. We use the string equality test |`mbEqual`| to give us via constructor |yes| a proof of equality between |nm| and |nm'|, or a proof of its negation via |no|.
We wish to express the pattern |nm' `mbElem` (nm' :: Gamma)|, with the intention of matching the case that |nm'| matches the head of the environment. In Agda, however, a pattern match must be strictly linear: an identifier may only be matched against once. To express that a value is the same as an already introduced value, we use a dot-pattern. This requires a proof that these are indeed the same, which is provided by a match against |refl| (the only constructor of an equality proof).
\begin{code}
_mbElem : (nm : Ident) -> (Gamma : Env) -> neg (nm `elem` Gamma) `Either` (nm `elem` Gamma)
nm' `mbElem` []             = inj1 (\())
nm' `mbElem` (nm :: Gamma)  with nm `mbEqual` nm'
nm' `mbElem` (.nm' :: Gamma)  | yes refl  = inj2 here
nm' `mbElem` (nm :: Gamma)    | no prv'   with nm' `mbElem` Gamma
nm' `mbElem` (nm :: Gamma)    | no prv'   | inj2 prv  = inj2 (next {nm'} {nm} prv)
nm' `mbElem` (nm :: Gamma)    | no prv'   | inj1 prv  = inj1 (notFirst prv' prv)
\end{code}
A with-expression is a variation on case-expressions in Haskell. It allows case distinction on a computed value, but also a refinement of previous pattern matches. Semantically, it computes the value of the scrutinee, then calls the function again with this value as additional parameter. Horizontal bars control which clauses belong to which with-expression.
\subsection{Attribute Grammar in Agda}
In AG blocks, we define the grammar, attributes, and functions between attributes. The preprocessor takes these blocks and translates them to Agda functions.
The compiler is an Agda function implemented as an AG that takes a source-language term of the type |Source| and translates it to a target-language term of the type |Target|, provided that there are no scoping errors. Initially, we take |Source| and |Target| to be the same language, without using any dependent types. Later, in Section~\ref{sect.dep.ag}, we specify that a term of type |Target| is free of scoping errors, and have to prove that our compiler indeed produces such terms.
To use AGs to analyze |Source|-terms, we define a context-free grammar for it below. Later, we declare attributes on the nonterminals in the grammar, and give functions between these attributes. The |Source| language consists of a sequence (|_`diam`_|) of |use| and |def| terms. Its operational semantics is simple: it evaluates to |()| if for every |use| there is an equally named |def| in the sequence; otherwise evaluation diverges.
\begin{code}
grammar Root : Set  -- start symbol of the grammar and root of the AST
  prod root
    nonterm root : Source

grammar Source : Set
  prod use  term nm : Ident
  prod def  term nm : Ident
  prod _`diam`_
    nonterm left   : Source
    nonterm right  : Source
\end{code}
The grammar actually represents an Agda data type, where the productions are constructors, and terminals and nonterminals are the fields. We optionally generate this data type from such a grammar declaration. The following type for the |Target| language (structurally equal to |Source| for now) gives an example of this translation.
\begin{code}
adata Target : Set awhere
  use       : (nm : Ident) -> Target
  def       : (nm : Ident) -> Target
  _`diam`_  : (left : Target) -> (right : Target) -> Target
\end{code}
With an AG, we translate an AST to its semantics, which is conceptually an AST node decorated with attributes. We represent the semantics of an AST node as an Agda function that takes values for inherited attributes as inputs, and produces values for synthesized attributes as outputs. The conceptual decorated tree is the closure of this function.
For the example, we compute bottom-up a synthesized attribute |gathEnv| that contains the identifiers defined by the term. Top-down we pass an inherited attribute |finEnv| that contains all the gathered identifiers of the term combined with the initial environment. Furthermore, we compute bottom-up an attribute |errs|, containing errors for each undefined identifier, and an attribute |target| containing the compiled version of the source term.
From the above informal description of the attributes, we can already indicate a problem. The inherited attribute |finEnv| depends on the synthesized attribute |gathEnv|. This requires us to provide a result of the function as part of the argument of the same function. Such circular programs are not allowed by Agda's termination checker.
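In a lazy host language such as Haskell, this circular formulation is unproblematic: laziness resolves the dependency as long as computing |gathEnv| never forces |finEnv| (the classic repmin pattern). The following sketch is our own Haskell rendering (the datatype, the function names, and the simplification of errors to bare identifiers are ours); it shows the knot being tied at the root. Only in Agda, where the termination checker rejects such definitions, must the cycle be broken into explicit passes.

```haskell
type Ident = String
type Env   = [Ident]

data Source = Use Ident | Def Ident | Source :<>: Source

-- One traversal computing (gathEnv, errs); errs needs the inherited
-- finEnv, but gathEnv itself never forces it.
sem :: Env -> Source -> (Env, [Ident])
sem fin (Use nm)    = ([], [nm | nm `notElem` fin])
sem _   (Def nm)    = ([nm], [])
sem fin (l :<>: r)  =
  let (gl, el) = sem fin l
      (gr, er) = sem fin r
  in (gl ++ gr, el ++ er)

-- The root ties the knot: finEnv is fed back into the very traversal
-- that produces gathEnv.
scopeErrs :: Env -> Source -> [Ident]
scopeErrs initEnv tree = errs
  where
    (gathEnv, errs) = sem finEnv tree
    finEnv          = gathEnv ++ initEnv
```

Evaluating `scopeErrs [] (Def "a" :<>: (Use "a" :<>: Use "b"))` terminates under lazy evaluation because the first component of |sem|'s result is independent of its first argument.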
However, there is no circular dependency when we compute the attributes in two or more passes over the AST. In the first pass, we compute attribute |gathEnv|. In the second pass, we take |gathEnv| as resulting from the first pass and use it to compute the value of the inherited attribute |finEnv| for the second pass. For the class of Ordered Attribute Grammars~\cite{DBLP:journals/acta/Kastens80}, a multi-pass algorithm is derivable. Earlier work showed that this class is sufficiently strong for type checking Haskell. Several extensions may be needed when fixpoint iteration is required~\cite{Middelkoop10gpce}.
In Agda, we express a multi-pass algorithm as a coroutine: a function that can be invoked (visited) multiple times; each time it may take additional parameters and yield results. As the semantics of an AST node, we take a coroutine of which each invocation takes some of the remaining inherited attributes as a record, and computes a record with some of the remaining synthesized attributes. With an interface declaration, we express statically which attributes are computed in which visit.
\begin{code}
itf Root
  visit compile
    inh initEnv  : Env
    syn errs     : Errs
    syn target   : Target

itf Source
  visit analyze
    syn gathEnv : Env
  visit translate
    inh finEnv   : Env
    syn errs     : Errs
    syn target   : Target
\end{code}
An attribute may only be declared once. However, an inherited attribute with the name |x| is considered to be different from a synthesized attribute |x|. The order of appearance determines the visit-order.
For an AST node of nonterminal |Root|, there is only one visit, named |compile|, which takes the initial environment and produces the outcome. It does so by invoking the two visits on its |Source| subtree: first the visit |analyze| to gather the identifiers, and then the visit |translate| to produce the outcome. The implementation of the coroutine is derived from functions that define attributes.
These functions are called rules, and we specify them per production with a datasem-block. The left-hand side consists of a pattern that refers to one or more attributes, and the right-hand side is an Agda expression that may refer to attributes.
\begin{code}
datasem Root
  prod root
    root.finEnv  = root.gathEnv ++ lhs.initEnv
    lhs.errs     = root.errs
    lhs.target   = root.target
\end{code}
An attribute reference is denoted |childname.attrname|. The child names |loc| and |lhs| are special. With |lhs| we refer to the inherited and synthesized attributes of the current AST node. With |loc| we refer to local attributes of the node. A terminal field |x| of a production is in scope as attribute |loc.x|. A nonterminal field |c| of a production is in scope as a child named |c|, with attributes as defined by the interface of the corresponding nonterminal.
An attribute |c.x| on the left-hand side refers to a synthesized attribute if |c = lhs|, and an inherited attribute otherwise. On the right-hand side, it is exactly the other way around. To refer on the right-hand side to synthesized attributes of |lhs|, or to inherited attributes of a child\footnote{ This is rarely useful with a conventional AG and can be simulated with local attributes. However, in a dependently-typed AG, we may wish to refer to such values in a predicate, as we see in a later example. Then it is convenient to be able to directly refer to such a value. }, the attribute reference has to be additionally prefixed with |ainh.| or |asyn.|.
For the |use|-production of |Source|, we check whether the identifier is in the environment. If not, we produce a singleton error-list. The |Target| term is a copy of the |Source| term. For |def|, no errors arise, and again the |Target| term is a copy of the |Source| term. Finally, for |_`diam`_|, we pass the |finEnv| down to both children, synthesize the |gathEnv| and |errs| of the children, and synthesize the target term.
\begin{code}
datasem Source
  prod use
    lhs.errs with loc.nm `mbElem` lhs.finEnv
      | inj1 _  = [ scope loc.nm ]
      | inj2 _  = []
    lhs.target = use loc.nm
  prod def
    lhs.errs    = []
    lhs.target  = def loc.nm
  prod _`diam`_
    left.finEnv   = lhs.finEnv
    right.finEnv  = lhs.finEnv
    lhs.gathEnv   = left.gathEnv ++ right.gathEnv
    lhs.errs      = left.errs ++ right.errs
    lhs.target    = left.target `diam` right.target
\end{code}
Some advantages of Attribute Grammars show up in this example. Firstly, the order of appearance of rules is irrelevant. This allows us to write the rules that belong together in separate datasem-blocks of the same nonterminal, and use the preprocessor to merge these separate blocks into a single block. For large grammars with many attributes, such as UHC, this is an essential asset to keep the code maintainable.
Furthermore, the rules for many attributes exhibit a standard pattern, especially when the grammar has many productions. For example, values for many attributes are simply passed down (|finEnv|, for example), or are a combination of some of the attributes of the children (|errs| and |gathEnv|, for example). We provide default rules~\cite{Middelkoop10gpce} to abstract over these patterns.
Children are ordered per production (for example, in alphabetical order). When for a child |c| an attribute |x| is not explicitly defined, but there is a default-rule for |x|, then we take as value for |x| the right-hand side of the default rule applied to a list of all the |x| attributes of the children smaller than |c|. In particular, the first element of this list is the nearest smaller child. The last element in this list is |lhs| (if it has an attribute |x|).
\begin{code}
datasem Source
  default errs    = concat  -- i.e. |left.errs ++ right.errs|
  default finEnv  = last    -- i.e. |lhs.finEnv|
\end{code}
In the AG code for UHC, about a third of the attributes have an explicit definition. All others are provided by such default-rules.
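For comparison with a hand-written implementation, the interface and rules above correspond to two plain Haskell traversals, one per visit. The rendering below is our own sketch: the datatypes, the function names, and the simplification of errors to bare identifiers are assumptions for illustration, not the preprocessor's actual output.

```haskell
type Ident = String
type Env   = [Ident]

data Source = Use Ident | Def Ident | Source :<>: Source
data Target = TUse Ident | TDef Ident | TDiam Target Target
  deriving (Eq, Show)

-- Visit 'analyze': synthesize gathEnv, the defined identifiers.
analyze :: Source -> Env
analyze (Use _)     = []
analyze (Def nm)    = [nm]
analyze (l :<>: r)  = analyze l ++ analyze r

-- Visit 'translate': inherit finEnv, synthesize errs and target.
translate :: Env -> Source -> ([Ident], Target)
translate fin (Use nm)    = ([nm | nm `notElem` fin], TUse nm)
translate _   (Def nm)    = ([], TDef nm)
translate fin (l :<>: r)  =
  let (el, tl) = translate fin l
      (er, tr) = translate fin r
  in (el ++ er, TDiam tl tr)

-- Visit 'compile' on Root: invoke analyze, then translate, with
-- finEnv = gathEnv ++ initEnv as in the datasem for Root.
compile :: Env -> Source -> ([Ident], Target)
compile initEnv tree = translate (analyze tree ++ initEnv) tree
```

Even in this tiny example, the hand-written version already interleaves the computations of several attributes in one function body; the AG keeps them as separate rules.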
To be able to use Attribute Grammars effectively, the above two features are a necessity. In the next section, we construct a dependently-typed variation on the above Attribute Grammar, and need extensions to the AG formalism to retain these two features.
\subsection{Dependently-Typed Attribute Grammar in Agda}
\label{sect.dep.ag}
We change the definition of the |Target| language. When constructing a term in this language, we demand a proof that all identifiers exist in the environment. For that, we give |Target| a type depending on a value: the final environment. Furthermore, we demand for each scoping error a proof that the identifier that caused the error is not in the environment. In this section, we show how to construct a compiler that has these properties by construction.
These are just a small number of properties. We can define many more, such as demanding that no identifier is defined twice, or that error messages correspond to a used identifier in the source language. As we see in this section, an Attribute Grammar is suitable for exactly this purpose: the additional predicates and proofs just become additional attributes and rules.
\begin{code}
adata Target : (Gamma : Env) -> Set awhere
  def       : forall {Gamma} (nm : Ident) -> (nm `elem` Gamma) -> Target Gamma
  use       : forall {Gamma} (nm : Ident) -> (nm `elem` Gamma) -> Target Gamma
  _`diam`_  : forall {Gamma} -> (left : Target Gamma) -> (right : Target Gamma)
              -> Target Gamma

adata Err : Env -> Set awhere
  scope : {Gamma : Env} (nm : Ident) -> neg (nm `elem` Gamma) -> Err Gamma

Errs : Env -> Set
Errs env = List (Err env)
\end{code}
In the previous section, we returned both a list of errors and a target term as result, with the hidden invariant that the target term is invalid when there are errors. The advantage is that we could refer to both values as separate attributes. A fine granularity of attributes is important in order to use default rules, and to be able to refer to values without having to unpack them first.
This also has a disadvantage: the hidden invariant is not enforced. Patterns like the above occur often in the code of UHC. With the dependently-typed |Target| type, we are unable to construct such a term in the presence of errors. The hidden invariant is made explicit.
However, we cannot construct both error messages and a target term anymore. Instead of two attributes, we now return a single attribute of type |(Errs ainh.finEnv) `Either` (Target ainh.finEnv)| that contains either a list of error messages or a valid target term. Furthermore, both the error messages and the target term take the final environment as a parameter. To keep knowledge about this environment hidden inside the compiler, we wrap the result in a data type |Outcome|, which hides the environment via an existential. We produce this type at the root of the AST.
\begin{code}
adata Outcome : Set awhere
  outcome : forall {Gamma} -> (Errs Gamma) `Either` (Target Gamma) -> Outcome
\end{code}
The interfaces for the nonterminals now contain dependent types that may depend on values of attributes. For example, we require a proof that the gathered environment is a subsequence of the final environment. The type of this proof refers to the two respective attributes.
\begin{code}
itf Root
  visit compile
    inh initEnv  : Env
    syn outcome  : Outcome

itf Source
  visit analyze
    syn gathEnv : Env
  visit translate
    inh finEnv     : Env
    inh gathInFin  : asyn.gathEnv `subseq` ainh.finEnv
    syn outcome    : (Errs ainh.finEnv) `Either` (Target ainh.finEnv)
\end{code}
The dependencies between attributes in the interface must be acyclic. Furthermore, there may not be references to attributes in later visits. That ensures that we can generate a type signature for a coroutine with this interface. As additional work, we need to construct the proofs for the attribute |gathInFin|, and membership proofs for the constructors |use| and |def|. At the root, we prefix the gathered environment with the initial environment.
We get the required proof via |subRight|.
\begin{code}
datasem Root
  prod root
    root.finEnv     = lhs.initEnv ++ root.gathEnv
    root.gathInFin  = subRight
    lhs.outcome     = outcome root.outcome
\end{code}
For production |use|, we obtain the proof we need in either case from the membership test of the name in the environment. For |def|, we have to do a bit more work. The gathered environment for this production is a singleton environment, so we get the membership proof for this environment via |here|. Since we also receive, as |lhs.gathInFin|, a proof that the gathered environment is a subsequence of the final environment, we obtain the desired result via the lemma |inSubset|. Finally, for the |`diam`| production, we only have to consider the possible outcomes of the two children.
\begin{code}
datasem Source
  default finEnv   = last
  default gathEnv  = concat
  prod use
    lhs.outcome with loc.nm `mbElem` lhs.finEnv
      | inj1 notIn  = inj1 [ scope loc.nm notIn ]
      | inj2 isIn   = inj2 (use loc.nm isIn)
  prod def
    loc.inFin    = inSubset lhs.gathInFin here
    lhs.outcome  = inj2 (def loc.nm loc.inFin)
  prod _`diam`_
    left.gathInFin   = trans subLeft lhs.gathInFin
    right.gathInFin  = trans (subRight {asyn.lhs.gathEnv} {lhs.finEnv}) lhs.gathInFin
    lhs.outcome with left.outcome
      | inj1 es1 with right.outcome
      | inj1 es1  | inj1 es2  = inj1 (es1 ++ es2)
      | inj1 es1  | inj2 _    = inj1 es1
      | inj2 t1 with right.outcome
      | inj2 t1   | inj1 es1  = inj1 es1
      | inj2 t1   | inj2 t2   = inj2 (t1 `diam` t2)
\end{code}
The above code shows again an advantage of using Attribute Grammars: we can easily add additional predicates and write separate rules for them. However, this example also shows a problem: we merged what used to be two attributes into a single attribute |outcome|.
%if not fullversion
In the extended version of this paper~\cite{Middelkoop10iflreport}, we show that the granularity of attributes is important, and how we can circumvent merging the attributes.
%else
This causes several problems that become more urgent when the AG gets larger and has more attributes.
Firstly, two attributes for which we could previously give separate rules now have to be defined by one large rule. This affects the ability to describe and arrange rules for various aspects separately. For example, we cannot reason about the computations for errors separately from the computations related to target terms. A fine granularity of attributes is precisely the advantage that an Attribute Grammar has compared to a conventional catamorphism. Secondly, when an attribute has a type with more than one inhabitant, we can only match against its contents in the right-hand side of a rule. When multiple rules refer to this attribute, they duplicate the match in their right-hand sides. When the match also opens an existential, it is opened twice, resulting in incomparable values that are actually the same. To remedy this situation, we would like to be able to write rules in the context of having matched against another attribute. Finally, the merged rules in the example above prevent us from using a default rule for the errors. If an errors attribute were available for each child that returned an |inj1|, then the default rule would produce exactly the expressions that we now wrote manually.
\subsection{Dependent Attribute Grammars with Context}
To improve attribute granularity, we distinguish contexts in the interface. A visit may result in one of several different outcomes, and we can match against these outcomes.
\begin{code}
itf Root
  visit compile
    inh initEnv : Env
    syn outcome : Outcome

itf Source
  visit analyze
    syn gathEnv : Env
  visit translate
    inh finEnv     : Env
    inh gathInFin  : asyn.gathEnv `subset` ainh.finEnv
    context failure
      syn errs : Errs ainh.finEnv
    context success
      syn target : Target ainh.finEnv
\end{code}
Furthermore, we provide a more fine-grained mechanism to specify rules. We organize a production as a sequence of visits, with a number of clauses for each visit. In a clause, the next visit is defined.
This structure forms a tree, whose shape is largely determined by the interface: rules may be specified at each level of the tree, i.e. in a production, a visit, or a clause. The lifetime of a rule is limited to the subtree it appears in, and we may schedule the evaluation of a rule anywhere in this lifetime (not earlier). The subtree also introduces a scope. The common attributes of clauses are also accessible from the parent scope. This way, we can define rules in as high a scope as possible and let the rules be ordered automatically based on their attribute dependencies. When a rule depends on a clause or a particular visit, we can restrict its lifetime to that visit. To prevent notational overhead, in the code for a production we only have to introduce the visits that matter for the lifetime and contexts of rules and clauses. Those that are not explicitly introduced are introduced implicitly. If a visit has only one clause, then that clause may be omitted, and an anonymous clause is introduced for it. During the invocation of a visit, its clauses are tried in their order of appearance. The execution of rules may fail under certain conditions (explained below) and cause backtracking to the next clause. The result context of a visit can be specified per clause, or for all clauses at once per visit. Moreover, we introduce the internal visit, which is not visible in the interface, but allows a choice of clause to be made during the execution of a visit.
\begin{code}
datasem Root
  prod root
    root.finEnv     = root.gathEnv ++ lhs.initEnv
    root.gathInFin  = subLeft
    internal
      clause rootFail : success
        context translate of root : failure
        lhs.outcome = outcome (inj1 root.errs)
      clause rootOk : success
        context translate of root : success
        lhs.outcome = outcome (inj2 root.target)
\end{code}
For the semantics of |Source|, we distinguish separate attributes and rules in the different contexts.
For example, in the |use| production, we match against the outcome of the membership test, and have two clauses: one for each outcome.
\begin{code}
datasem Source
  default finEnv   = last
  default gathEnv  = concat
  prod use
    loc.checkIn = loc.nm `mbElem` lhs.finEnv
    visit translate
      clause notInEnv : failure
        match inj1 notIn = loc.checkIn
        lhs.errs = [ scope loc.nm notIn ]
      clause isInEnv : success
        match inj2 isIn = loc.checkIn
        lhs.target = use loc.nm isIn
  prod def
    loc.inFin   = inSubset lhs.gathInFin here
    lhs.target  = def loc.nm loc.inFin
    visit translate : success
  prod _`diam`_
    left.gathInFin   = trans subLeft lhs.gathInFin
    right.gathInFin  = trans (subRight {asyn.lhs.gathEnv} {lhs.finEnv}) lhs.gathInFin
    visit translate
      clause both : success
        context translate of left : success
        context translate of right : success
        lhs.target = left.target `diam` right.target
      clause contexts : failure
        default errs = concat
\end{code}
If a visit can result in more than one context and a match against this context is omitted, then we are not allowed to use any of the synthesized attributes of that visit. As a shortcut, we may introduce a pseudo clause |contexts|, which stands for multiple clauses: one for each combination of the remaining contexts. Default rules may refer to those attributes of children that are available due to matches against contexts.
%endif
\section{|rulerfront| and |rulercore|}
\label{sect.rulerfront.agda}
In previous work~\cite{Middelkoop10wgt,Middelkoop10gpce}, we showed that the syntax of |rulerfront| is syntactic sugar for a more general core language |rulercore|. In |rulercore|, the AG is prepared in such a way that it is suitable for code generation: default rules are applied, rules are ordered explicitly, and invocations of visits of children are explicit. This process remains the same in a dependently-typed setting, so in this section we focus on a dependently-typed |rulercore|. Productions in |rulerfront| are translated to an algebra in |rulercore| for the grammar involved.
For a production, the corresponding semantic function takes the semantic values of the children, and returns the semantic value of the production. For example, for |prod `diam`| the translation to |rulercore| is:
\begin{code}
sem_seq lSem rSem = sem_Source_nt    -- reference to the sem fun
  where
    sem nt : Source                  -- Agda smart constructor for |`diam`|
      child left   : Source = lSem   -- turn |lSem| into an AG child |left|
      child right  : Source = rSem   -- turn |rSem| into an AG child |right|
      invoke analyze of left         -- run visit |analyze| on |left|
      invoke analyze of right        -- run visit |analyze| on |right|
      invoke translate of left
      invoke translate of right
      ...                            -- the actual rules defined in |rulerfront|
\end{code}
Thus, in the translation to |rulercore|, rules are added that define the children of the production, and declare visits on these. In this example, the rules still have to be ordered such that each rule is lexically situated before the rule that uses its results~\cite{Middelkoop10wgt}. Operationally, these rules are executed in precisely that order. For example, an invoke-rule requires the inherited attributes for that child and visit to be defined, and defines the synthesized attributes for that child.
\begin{figure}[htp]
\begin{code}
e   ::=  Agda [ many b ]                       -- embedded |rulercore| blocks |b| in |Agda|
b   ::=  i | s | o                             -- |rulercore| blocks
o   ::=  inh.c.x | syn.c.x | loc.x             -- embedded attribute reference
i   ::=  itf I v                               -- interface decl, with first visit |v|
v   ::=  visit x inh (many a) syn (many a) v   -- visit declaration
     |   ()                                    -- terminator of visit decl. chain
a   ::=  x : e                                 -- attribute decl, with Agda type |e|
s   ::=  sem x :: I t                          -- semantics expr, defines nonterm |x|
t   ::=  visit x (many r) t                    -- visit definition, with next visit |t|
     |   ()                                    -- terminator of visit def. chain
r   ::=  p e'                                  -- evaluation rule
     |   invoke x of c                         -- invoke-rule, invokes |x| on child |c|
     |   child c : I = e                       -- child-rule, defines a child |c|
p   ::=  o                                     -- attribute def
     |   .{e}                                  -- Agda dot pattern
     |   x (many p)                            -- constructor match
e'  ::=  with e (many (p' e'))                 -- Agda |with| expression
     |   = e                                   -- Agda |=| expression
p'                                             -- Agda LHS
x, I, c                                        -- identifiers, interface names, children respectively
\end{code}
\caption{Syntax of |rulercore|}
\label{fig.sem.agda.rulercore}
\end{figure}
Figure~\ref{fig.sem.agda.rulercore} shows the syntax of the dependently-typed |rulercore| (based on Agda). Effectively, a |rulercore| program is Agda code with embedded |rulercore| blocks, consisting of interface declarations, sem-blocks, and attribute references. The interface declares the attributes of its visits. Note that the type of an attribute may refer to another attribute. The attributes must be ordered such that an attribute referenced in the type of another attribute occurs before the latter. This requires the dependencies between attributes to be acyclic. With a sem-block, we define the semantics of an AG-node given an interface |I|. The rules are specified per visit, in the order of evaluation. To define an attribute, we either use a |with|-expression when the value of the attribute is conditionally defined, or a simple |=| equation as RHS. In the translation to Agda, we plug such an expression into a function defined via with-expressions, hence we need knowledge about the with-structure of the RHS.
%% \section{Translation to Agda}
%% \label{sect.translation.agda}
As (denotational) semantics for |rulercore|, we describe a translation of |rulercore| programs to |Agda|. Each semantics-block is translated to a coroutine, implemented with one-shot continuations. Each call to the coroutine corresponds to a visit. The parameters of the coroutine are the inherited attributes of the visit.
The result of the call is an object containing values for the synthesized attributes, and the continuation to call for the next visit. Such a coroutine for a visit is a function that takes the inherited attributes as parameters, and produces a (nested) dependent product (|Sigma|) containing the synthesized attributes for that visit, and the function to be used as continuation. Attribute references are transcribed to Agda via the naming conventions listed at the bottom of Figure~\ref{fig.sem.denot.agda}. For example, the type of such a coroutine is:
\begin{smallcode}
\begin{code}
T_Source_analyze = Sigma (synAgathEnv : Env) T_Source_translate

T_Source_translate synAgathEnv =
  (inhAfinEnv : Env) -> (inhAgathInFin : synAgathEnv `subset` inhAfinEnv) ->
    Sigma  (Errs inhAfinEnv `Either` Target inhAfinEnv)
           (T_Source_sentinel synAgathEnv inhAfinEnv inhAgathInFin)

T_Source_sentinel synAgathEnv inhAfinEnv inhAgathInFin synAoutcome = ()
\end{code}
\end{smallcode}
We need dependent products in order to be able to refer to synthesized attributes in the type of an attribute. This is the reason why we require the attributes to be ordered. Also, this representation can be optimized a bit by only passing on those attributes that are needed in the types of attributes of later visits.
\begin{figure}[tb]
\begin{smallcode}
\begin{code}
(semb (itf I v)) ~> (semiv (v) (I) (emptyset)) ^^^ ; ^^^ (semb (sig I)) = (semb (sig I (name v)))

(semiv (visit x inh (many a) syn (many b) v) (I) (many g)) ~>
  (semiv (v) (I) (many g ++ many a ++ many b))
  (semb (sig I x)) : (semat (sub g 1)) -> ... -> (semat (sub g n)) -> Set
  (semb (sig I x)) (many (seman g)) = (sema (inh.(sub a 1))) -> ... -> (sema (inh.(sub a n))) -> (semb (typrod (many b) (sig I (name v))))

(semiv (()) (I)) ~> (semb (sig I "sentinel")) = ()

(sema (x : e)) ~> (semb (atname x)) : (semb e)
(semat (x : e)) ~> (semb e)
(seman (x : e)) ~> (semb (atname x))

(semb (sem x : I t)) ~>
  (semb (nt I x)) : (semb (sig I))
  (semb (nt I x)) = (semb (vis I (name t))) where (semev t I emptyset)

(semev (visit x (many r) t) (I) (many g)) ~>
  (semb (vis I x)) : (semb (sig I x)) (many (seman g))
  (semb (vis I x)) (semb (inhs I x)) = (semr (many r) (semb cont))
  cont ~> = (semb (valprod (syns I x) (vis I (name t)))) where (semev t I (many g ++ many a ++ many b))

(semr (child c : I = e) k) ~> with e ^^^ ... | (semb (vis I (firstvisit I))) (semb k)
(semr (invoke x of c) k) ~> with (semb (vis (itf c) x)) (semb (inhs (itf c) x)) ^^^ ... | (valprod (syns (itf c) x)) (semb k)
(semr (p e') k) ~> (semep e' p k)

(semep (with e (many (p' e'))) p k) ~> with e ^^^ (many (... | p' (semr (p e') k)))
(semep (= e) p k) ~> with e ^^^ ... | (semb p) k
\end{code}
\begin{code}
atref inh.c.x  = c "I" x   ^^^^  nt I x   = "sem_" I "_" x
atref syn.c.x  = c "S" x   ^^^^  vis I c  = "vis_" I "_" c
atref loc.x    = "locL" x  ^^^^  vis c x  = c "V" x
atname inh.x   = "inhAx"   ^^^^  sig I    = "T_" I
atname syn.x   = "synAx"   ^^^^  sig I x  = "T_" I "_" x
\end{code}
\end{smallcode}
\caption{Denotational semantics of |rulercore|}
\label{fig.sem.denot.agda}
\end{figure}
The coroutine itself has the following structure: it defines nested continuation-functions. Each of these continuation functions takes the inherited attributes as parameters, consists of with-expressions that compute attributes and evaluate children (discussed later), and ends in a product of the synthesized attributes plus the continuation function.
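To illustrate how such a coroutine is consumed, consider the following driver (a sketch of ours; the name |compile| is hypothetical and not part of the generated code). It unpacks the result of visit |analyze|, then calls visit |translate| with its inherited attributes, mirroring the rules of |prod root|:
\begin{code}
compile : T_Source_analyze -> Env -> Outcome
compile p initEnv with p
... | (gathEnv , translate) with translate (initEnv ++ gathEnv) subRight
... | (gathEnv , translate) | (res , sentinel) = outcome res
\end{code}
The |sentinel| component is the continuation for a built-in final visit and is never invoked.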
For example, for |`diam`|:
\begin{smallcode}
\begin{code}
prod_Source_seq : T_Source -> T_Source -> T_Source
prod_Source_seq lSem rSem = sem_Source_nt
  where
    sem_Source_nt : T_Source
    sem_Source_nt = vis_Source_analyze
      where
        vis_Source_analyze : T_Source_analyze
        vis_Source_analyze with ...
        ... = (lhsSgathEnv , vis_Source_translate)
          where
            vis_Source_translate : T_Source_translate lhsSgathEnv
            vis_Source_translate lhsIfinEnv lhsIgathInFin with ...
\end{code}
\end{smallcode}
The with-sequence for a visit-function consists of the translation of child-rules, invoke-rules and evaluation rules. For example, for the rule |child left : Source = lSem|, evaluation of the right-hand side gives us the function to invoke for the first visit, hence the translation is:
\begin{code}
... with lSem
... | leftVanalyze with ...
\end{code}
For the rule |invoke translate of left|, we run the visit-function of the child applied to its inherited attributes, and match against the synthesized attributes to bring them into scope:
\begin{code}
... with leftVtranslate leftIfinEnv leftIgathInFin
... | (leftSoutcome , leftVsentinel) with ...
\end{code}
The name |sentinel| represents a built-in final visit. It is never invoked, but having one simplifies the translation. Finally, for evaluation rules (e.g. |lhs.outcome with ...|), we take the with-expression of the RHS, but move each RHS of an alternative (e.g. |inj1 (es1 ++ es2)|) into a with-expression that matches against the LHS of the evaluation rule:
\begin{code}
... with leftSoutcome
... | inj1 es1 with rightSoutcome
... | inj1 es1 | inj1 es2 with inj1 (es1 ++ es2)
... | inj1 es1 | inj1 es2 | lhsSoutcome with ...
\end{code}
With the above procedure, a visit function becomes one large nested with-expression that encodes precisely the order of evaluation. Figure~\ref{fig.sem.denot.agda} formalizes the above translations. As usual, the semantic brackets indicate an escape from the syntactic to the semantic level.
The optional subscript on the left bracket disambiguates the intended translation. Both the |semiv| and |semev| translations are parameterized over |many g|, the attributes defined lexically before this visit. The |semr| translation of a rule takes a parameter |k|, representing the with-structure that follows up on that rule (which ends with the dependent product of synthesized attributes). We use the same technique for the translation of a with-RHS, via |semep|.
\section{Related Work}
Dependent types originate in Martin-L\"of's Type Theory. Various dependently-typed programming languages are gaining popularity, including Agda~\cite{DBLP:conf/tldi/Norell09}, Epigram~\cite{DBLP:conf/afp/McBride04}, and Coq~\cite{DBLP:conf/tphol/Bertot08}. We present the ideas in this paper with Agda as host language, because it has a concept of dependent pattern matching, which corresponds conveniently with pattern matching in AGs. Also, in Coq and Epigram, a program is typically written via interactive theorem proving with tactics or commands. The preprocessor-based approach of this paper, however, suits a declarative style more. Attribute grammars~\cite{DBLP:journals/mst/Knuth68,DBLP:conf/waga/Knuth90} have long been considered a promising formalism for compiler construction. Recently, many Attribute Grammar systems have arisen for mainstream languages, such as Silver~\cite{DBLP:journals/entcs/WykBGK08} and JastAdd~\cite{DBLP:conf/oopsla/EkmanH07} for Java, and UUAG~\cite{uuagc} for Haskell. To our knowledge, Attribute Grammars with richer type systems have not been investigated yet. In certain languages it is possible to implement AGs via meta-programming facilities, which obviates the need for a preprocessor. Viera et al.~\cite{DBLP:conf/icfp/VieraSS09} show how to embed AGs in Haskell through type-level programming. A combination of that approach with our work would fit well in Agda, if Agda had a mechanism similar to Haskell's class system.
Several attribute grammar techniques are important to our work. Kastens~\cite{DBLP:journals/acta/Kastens80} introduces ordered attribute grammars (OAGs). In an OAG, the evaluation order of attribute computations, as well as attribute lifetimes, can be determined statically, which allows us to generate coroutines such that Agda's termination checker accepts the program.
%% Boyland~\cite{DBLP:journals/toplas/Boyland96} introduces conditional attribute grammars.
%% In such a grammar, semantic rules may be guarded. A rule may be evaluated if its guard
%% holds. Evaluation of guards may influence the evaluation order, which makes the evaluation
%% less predictable.
%% In comparison, in our clauses-in-visits model, we have to explicitly indicate in what
%% visits guards are evaluated (the match-statements of a clause), which makes evaluation
%% clear. Our approach has the additional benefit that children may be conditionally
%% introduced and visited.
%% The work for this paper is carried out for a research project to support the
%% implementation of the type inference component of the Utrecht Haskell compiler.
%% We presented an earlier version of |rulerfront|'s clauses-per-visit model to
%% allow attribute grammars with case distinction on more than a single AST,
%% and express iteration and dynamic growing of trees~\cite{Middelkoop10vis,Middelkoop10wgt,Middelkoop10gpce}.
\section{Conclusion}
We investigated Attribute Grammars with dependently-typed attributes: the type of an attribute may refer to the value of another attribute. This feature allows us to conveniently encode invariants of attributes, and to pass proofs of these invariants around as attributes, as presented in the use case in Section~\ref{section.example}. We introduced a language for Ordered AGs, |rulerfront|, in Section~\ref{sect.rulerfront.agda} and showed a (preprocessor-based) translation to Agda. The ordering requirement is important.
It allows us to prove termination in Agda, and to generate continuation-based functions from AGs that require more than one visit to compute their attributes. Such functions are tedious to write manually; the preprocessor-based approach saves us from writing them ourselves.
%if fullversion
Finally, rules can be given not only per production, but also per visit. A visit can be split up into clauses to deal with context information resulting from visits to children, as well as with pattern matching in rules. As future work, the question is to what extent we can exploit patterns in Attribute Grammar descriptions to generate proof boilerplate: from a practical perspective, for example, to deal with fixpoint iteration in AGs and its termination, or with lookups and insertions in the environment; from a theoretical perspective, for example, to generate templates related to type soundness and subject reduction.
%endif
\subsubsection*{Acknowledgments.}
This work was supported by Microsoft Research through its European PhD Scholarship Programme.
\bibliographystyle{splncs03}
\bibliography{references}
\end{document}