%include intro.fmt \chapter{Outline of the RulerCore Concepts} \label{chapt.rulercore} Section~\tref{intro.sect.overview} argued the necessity of extensions to attribute grammars. In the following chapters of this thesis we describe individual extensions to attribute grammars. In this chapter, we present the language RulerCore and give a detailed summary of the extensions. Each section summarizes a chapter in this thesis. This chapter can be read before or after the other chapters. It shows how the individual chapters are connected. This chapter uses a uniform notation, whereas the individual chapters deviate slightly in notation where that better suits the material at hand. Consult the actual chapters for a more extensive explanation and technical material. \paragraph{Outline.} Chapter~\rref{chapt.side-effect} and Chapter~\rref{chapt.iterative} give a detailed description of RulerCore's syntax. In this chapter, we use the syntax as described in Section~\tref{intro.ag.syntax}. Prerequisites to this chapter are ordered attribute grammars (Section~\tref{sect.intro.ag.eval}) and higher-order attribute grammars (Section~\tref{intro.sect.higherorder}). The following dependency graph shows the dependencies between sections of this chapter (and the corresponding chapters). The solid arrows represent dependencies implied by the contents of the chapter, and the dashed arrows represent additional dependencies due to the presentation in this chapter.
The dotted arrows represent very weak dependencies, and lighter nodes are only present in the extended edition of this thesis: \begin{center} \begin{tikzpicture} [ nd/.style={circle, minimum size=1mm, node distance=4mm, top color=black, bottom color=black,font=\scriptsize} , ndopt/.style={nd, dotted} , soft/.style={->,thick,gray} , hard/.style={->,thick} , chapt/.style={->,thick,dashed} , lab/.style={font=\footnotesize} ] \node[nd](side){}; \node[nd, right of=side, xshift=40mm, yshift=8mm](iter){}; \node[nd, right of=side, xshift=18mm, yshift=12mm](warren){}; \node[ndopt, right of=side, xshift=40mm, yshift=-8mm](gadt){}; \node[nd, right of=iter, xshift=30mm, yshift=8mm](dep){}; \node[ndopt, right of=iter, xshift=30mm, yshift=-6mm](graph){}; \node[nd, right of=iter, xshift=30mm, yshift=-18mm](step){}; \node[lab, below of=side, yshift=5mm]{\ref{intro.outline.sideeffect}: effects}; \node[lab, above of=warren, yshift=-5mm]{\ref{summary.sect.commuting}: phases}; \node[lab, above of=iter, yshift=-5mm]{\ref{sect.intro.exploitit}: iter}; \node[lab, below of=gadt, yshift=5mm]{\ref{sect.outline.gadts}: GADTs}; \node[lab, below of=step, yshift=5mm]{\ref{sect.outline.stepwise}: stepwise}; \node[lab, above of=dep, yshift=-5mm]{\ref{outline.dependent.types}: dependent types}; \node[lab, below of=graph, yshift=5mm]{\citet{ariem-ext-graph}: graphs}; \draw[hard](iter) to (side); \draw[soft](warren) to (side); \draw[hard](dep) to (iter); \draw[hard](graph) to (iter); \draw[soft](iter) to (warren); \draw[soft](step) to (gadt); \draw[chapt](gadt) to (side); \draw[chapt](step) to (iter); \node[lab,left of=side, xshift=-12mm, yshift=-6mm](legend){req-chapt}; \node[lab,left of=side, xshift=-12mm, yshift=-14mm](legend2){advised}; \node[lab,left of=side, xshift=-12mm, yshift=-10mm](legend3){req-outline}; \node[right of=legend, xshift=3mm](m1){}; \node[right of=legend2, xshift=3mm](m2){}; \node[right of=legend3, xshift=3mm](m3){}; \draw[hard](legend) to (m1); \draw[soft](legend2) to (m2);
\draw[chapt](legend3) to (m3); \end{tikzpicture} \end{center} %if False \paragraph{Notation.} In the text of this chapter we refer to keywords of RulerCore. These keywords are typeset in bold in code blocks. To prevent typesetting them in bold in the running text, we use the notational convention that, for example, the annotation |onlylocal| is denoted in the running text as the onlylocal-annotation. We thus trade typesetting for additional hyphens. %endif \section{Attribute Grammars with Side Effects} \label{intro.outline.sideeffect} Ordered attribute grammars~\citep{DBLP:journals/acta/Kastens80} underlie the extensions that we introduce in this chapter, and work out in the subsequent chapters of this thesis. In an OAG, attributes are evaluated in a fixed number of \emph{visits} per node of the AST. Visits are a concept that plays a role only in the evaluation algorithm of OAGs. In RulerCore, however, visits are a programming model: RulerCore has notation to specify visits so that the visits can be used to specify evaluation strategies. In Chapter~\rref{chapt.side-effect} we introduce the concept of a visit and its notation. This concept plays a central role, because it provides a model of the evaluation of ordered attribute grammars. We show that this model is powerful enough to get the effect of visitor-pattern-based traversals, as known from the OO world, in terms of an attribute-grammar-based description, and we show how to deal with monadic or side-effectful operations. In this section we explain what a visit is, and introduce the notation for specifying these visits. \paragraph{Visits and configurations.} We first explain what a visit is. For that, we consider the evaluation of attribute grammars. An attribute grammar describes correctly decorated trees, but not how such a decoration is to be constructed. For OAGs, there exists an evaluation algorithm that starts with an initially undecorated tree and finishes with a correctly decorated tree.
In this process, we pass through a sequence of intermediate states, with each intermediate state corresponding to a partially decorated tree. \begin{defi}[State] The \emph{state} (or decorations) of a (partially) decorated tree consists of the local state of the root node of the tree and the states of its children. A local state of a node is a partial map from the attributes of the node to their values\footnote{ How the value of an attribute is represented depends on the implementation or the host language. Such a value can be an element in the domain of the attribute, but may also be a thunk. We usually assume that the values of attributes are at least in weak-head normal form (Section~\tref{intro.sect.lambdacalculus}). }. \end{defi} \begin{defi}[Defined attributes] An attribute is \emph{defined} when it is mentioned in the partial map. \end{defi} Note that \emph{defined} in this definition means that the attribute is part of the computed decorations of the tree. This definition is unrelated to rules defining attributes. \begin{defi}[Configuration] A \emph{configuration} is a set of inherited and synthesized attributes of the root of a (partially) decorated tree, and describes which attributes of the root have associated values in the (intermediate) state of the tree. There exists some total order |prec| among configurations (we come back to this later). The total order |prec| must be stronger than the subset relation among configurations. \end{defi} Thus, a configuration is an abstract description of an intermediate state. \begin{defi}[Minimally defined state] Given (static) dependencies between attributes, a tree is in a \emph{minimally defined} state for a given configuration when precisely the attributes mentioned in the configuration have a value in the local state of the root and their (indirectly) dependent attributes have an associated value in the state of the tree.
\end{defi} \begin{defi}[Visit] A \emph{visit} is a state and configuration transition, which takes a tree\footnote{ A tree does not have to be the full tree, but may also be a subtree. Indeed, a visit may require visits to subtrees: we usually describe visits per node of the tree, and specify what visits are performed to children of the node. } in a state as described by a configuration |A| to a tree in a state as described by a different configuration |B| with |A `prec` B|. \end{defi} Thus, during a visit, computations are performed which determine the values of attributes. Usually, |B| contains at least one synthesized attribute that is not yet present in |A|, since that is a motivation to perform a visit\footnote{ In a conventional OAG, the \emph{only} motivation for performing a visit is to get synthesized attributes computed, and this motivation is formalized as dependencies between attributes. Below, we show other motivations (e.g. to perform side effects) and show how these are formalized. }. Values for the inherited attributes in the set difference |B - A| are provided by the parent\footnote{ We assume that each node has a parent. In case of the root node, the parent is represented by the interface with the host language (Section~\tref{intro.ag.syntax}). } prior to the visit. The notation that we present below facilitates a statically finite and explicit description of state and configuration transitions induced by visits. If we take a partial order among configurations instead of a total order, the order represents a Directed Acyclic Graph (DAG) where the vertices of the graph represent configurations and the edges represent visits. The evaluation of attributes for a tree associated with this DAG entails walking a path in the DAG where the parent chooses which visits to invoke on the tree.
However, since we base our work on OAGs, we impose a total order that corresponds to a DAG with only a single path, which simplifies the implementation and the notation. In Section~\ref{outline.dependent.types} we generalize our work so that the DAG is actually a tree. \paragraph{Rationale.} In this thesis, we distinguish two important notions of evaluation: a visit and a step. Visits (this section) provide a static model of evaluation, whereas steps (Section~\ref{sect.outline.stepwise}) provide a runtime model of evaluation. We argued in Section~\tref{intro.sect.overview} that we wish to describe evaluation strategies, and we do so in terms of the models mentioned above. In this section, we focus on the static model. Every node in the tree is related to a production, and each production has an associated collection of rules. Attributes are computed by evaluating rules (Section~\tref{sect.intro.ag.eval}). OAG evaluation is compositional in that it separates the evaluation of collections of rules of the parent from the evaluation of the children. In the case of evaluation of the parent, we make \emph{internal visits} explicit, which statically describe the evaluation of the collection of rules of a node in the tree. In the case of the evaluation of children, we note that a visit to a child is the statically \emph{smallest} unit of evaluation for a child that can be specified as part of the evaluation of the parent, and thus provides a model with a fine granularity. \paragraph{Description of visits.} RulerCore provides notation to describe visits. In a conventional AG, we declare attributes per nonterminal, and specify rules per production. In RulerCore, we additionally declare a linearly ordered sequence of visits and specify which attribute belongs to which visit. Also, we specify for each rule in which visit it is evaluated. Below, we describe the notation that forms an essential prerequisite for the remaining sections of this chapter.
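To make the notions of configuration and the order |prec| concrete, the following Haskell sketch (our own illustration; representing a configuration as a plain set of attribute-name strings is an assumption, not RulerCore's internal representation) models a configuration as a set of attribute names and a visit as a transition that only grows this set:

```haskell
import qualified Data.Set as Set

type Attr   = String
type Config = Set.Set Attr

-- A visit extends the current configuration with the attributes it
-- defines; the resulting configuration is at least as large.
visitTo :: [Attr] -> Config -> Config
visitTo defines cfg = cfg `Set.union` Set.fromList defines

-- The order |prec| among configurations must be stronger than the
-- subset relation: A `prec` B implies A is a subset of B.
precCompatible :: Config -> Config -> Bool
precCompatible a b = a `Set.isSubsetOf` b

-- Configurations of a two-visit interface (attribute names are
-- those of the |Expr| example in this section):
cfgInitial, afterAnalyze, afterCompile :: Config
cfgInitial   = Set.empty
afterAnalyze = visitTo ["env", "errors"] cfgInitial
afterCompile = visitTo ["optimize", "code"] afterAnalyze
```

Each visit is a step on the single path of configurations, and the subset check makes explicit why the total order must refine set inclusion.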
Chapter~\rref{chapt.side-effect} provides extensive examples and technical background. \begin{defi}[Interface declaration] An \emph{interface} declaration of a nonterminal specifies a linear sequence of visits to the nodes with which the nonterminal is associated. \end{defi} \begin{defi}[Visit declaration] Each visit declaration specifies which attributes must be defined prior to that visit, and which synthesized attributes become defined as a result of the visit. \end{defi} Configurations are not explicitly declared. The configuration of a node before a given visit is the union of the attributes declared for preceding visits, starting with the empty set. Visit declarations thus form a partitioning of the attributes of a nonterminal. In the following example, for some |Expr| nonterminal, we declare a linear sequence of visits |analyze| and |compile|. We specify that in the first visit |analyze| a synthesized attribute |errors| is defined given the inherited attribute |env|. In the second visit |compile|, the synthesized attribute |code| gets defined, given the inherited attribute |optimize|, and the attributes defined earlier: \begin{code} itf Expr ^^^ visit analyze ^^^ inh env :: Env ^^^ syn errors :: Errs ^^^ visit compile ^^^ inh optimize :: Bool ^^^ syn code :: Code \end{code} The order of appearance of visits matters, whereas the order of appearance of attribute declarations in a visit-block does not. As we have seen, configurations are an abstract representation of the state of a node. A configuration records which attributes have been evaluated. Configurations are not explicitly named in the interface declaration. Instead, we associate with a visit the configuration that corresponds to the state at the beginning of the visit. So, during evaluation, the decoration of an |Expr| node is initially in the |analyze| state, then in the |compile| state, and finally in some final state.
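One way to read such an interface is as a pair of function types, one per visit: each visit maps its inherited attributes to its synthesized attributes together with the evaluator for the next visit. The following Haskell sketch makes this reading explicit (the type names, the concrete attribute types, and the |Var| rule bodies are our own illustrations, not generated output):

```haskell
type Env  = [(String, Int)]  -- illustrative attribute types
type Errs = [String]
type Code = String

-- Visit analyze: inh env, syn errors, then continue with compile.
type ExprAnalyze = Env -> (Errs, ExprCompile)

-- Visit compile: inh optimize, syn code; no further visits.
type ExprCompile = Bool -> Code

-- A variable node as an inhabitant of this two-visit reading:
varNode :: String -> ExprAnalyze
varNode x env =
  ( if x `elem` map fst env then [] else ["Undefined " ++ x]
  , \_optimize -> "Code_Var " ++ x )
```

The nesting of the function types mirrors the order of the visits: the |compile| evaluator only becomes available once |analyze| has been performed.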
The above example declares two visits for nodes associated with the |Expr| nonterminal. During evaluation, each of these visits corresponds to a state transition. Such a state transition is described by a collection of rules which are specified per production using a semantics-block in conventional AG notation (Section~\tref{intro.ag.syntax}): \begin{code} sem Expr prod Var -- rules for production |Var| loc.defined = loc.x `member` lhs.env -- tests whether ident |loc.x| is in the env lhs.code = Code_Var loc.x -- some translation to a target language lhs.errors = if loc.defined then [] else [Undefined loc.x] \end{code} The above semantics-block introduces three rules for the production |Var|. Each rule is implicitly associated with a visit to |Expr|. Later we introduce notation to declare such a correspondence explicitly. \paragraph{Default-notation.} Before we continue with explicit notation for visits, we take a slight detour to introduce some notational conveniences. When programming with AGs, we often use copy rules (Section~\tref{intro.sect.copyrules}). Note that RulerCore is a \emph{core language}, thus we prioritize implementation convenience over concise syntax. Copy rules are a typical front-end concept. However, copy rules may interact with RulerCore's evaluation algorithm, hence we model them explicitly. RulerCore provides default-notation to improve on \emph{copy rules}. For example, we can define the rules of production |App| with \emph{default-rules}: \begin{code} sem Expr prod App -- rules of production |App| default env optimize -- declares copy rules default errors = concat -- declares collection rule for |errors| \end{code} In the example, the default-rules introduce generic rules for the inherited attributes |env| and |optimize|, and a collection rule for the synthesized attribute |errors|. A default-rule specifies that the value of an attribute can be inferred from equally named attributes of the production.
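As a plain-Haskell approximation of what these default-rules introduce (the helper names are hypothetical, chosen only for this sketch): a copy rule passes an inherited attribute unchanged to each child, and a collection rule combines the equally named synthesized attributes of the children:

```haskell
type Env  = [(String, Int)]
type Errs = [String]

-- |default env|: copy the inherited attribute to both children
-- of a binary production such as |App|.
copyEnv :: Env -> (Env, Env)
copyEnv lhsEnv = (lhsEnv, lhsEnv)

-- |default errors = concat|: combine the children's synthesized
-- |errors| attributes, in lexical order of the children.
collectErrors :: [Errs] -> Errs
collectErrors = concat
```

The point of the notation is that these rules need not be written by hand: they are derived from the attribute names alone.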
We provide several flavors of default rules for different generic situations, which each differ in how attributes are combined. The above rules are syntactic sugar\footnote{ We give the abstract syntax of the \emph{desugared} rules, but use the sugared version in the code figures. } for rules of the following form: %format default0 = "\mathbf{default0}" %format default1 = "\mathbf{default1}" %format lexical = "\mathbf{lexical}" %format lexicalrev = "\mathbf{lexicalrev}" %format scheduled = "\mathbf{scheduled}" %format use = "\mathbf{use}" \begin{code} r ::= ... -- conventional rules (Section~\tref{intro.ag.syntax}) | m o (many x) h -- default rules for the attrs with name |many x| m ::= default0 -- applies even if no attrs matched | default1 -- at least one attr must match (default) o ::= lexical -- uses the lexical order of children (default) | lexicalrev -- uses the lexical order in reverse | scheduled -- determines order after scheduling h ::= epsilon -- threaded behavior | = e -- use expr |e| to combine a list of values of matching attrs | use e1 e2 -- applies the list algebra |(e1,e2)| x -- attribute name \end{code} The order annotation |o| determines the order in which children are considered in the resolution process. In case of the scheduled-order annotation, the actual definition is determined after scheduling. As a notational convenience, a default-rule may be specified as part of the interface of a nonterminal, which then applies to all semantics-blocks of that nonterminal. \paragraph{Explicit association to visits.} In RulerCore, rules are associated with a visit. The association is by default implicit, but notation is available to specify the association explicitly by organizing the rules inside a visit-block: \begin{code} sem Expr prod Var -- semantics-block for production |Var| ... -- rules without an explicit association visit analyze -- note the indentation (important in a later section) ... 
-- rules associated with analyze (or later) visit compile -- note the indentation (important in a later section) ... -- rules associated with |compile| (or later) \end{code} With this notation, we specify constraints on the scheduling of rules, in addition to the conventional constraints imposed by value dependencies between attributes and rules. Via a cycle analysis, the constraints can be verified to be satisfiable using standard algorithms (Section~\tref{sect.intro.ag.eval}). We come back to the rationale for the additional constraints later. \begin{defi}[Visit semantics] A visit-block |t| is explicitly associated to some visit |x| and may contain rules and a nested visit-block. The rules may be evaluated during visit |x| or a later visit. \end{defi} The following is the grammar of a semantics and visit-block: \begin{code} s ::= sem N prod P (many r) t -- common rules |r| and visit blocks |t| t ::= visit x (many r) t -- common rules |r| and subsequent visit |t| | terminator -- terminator of sequence of visit blocks (implicit) \end{code} A visit-block is associated with a similarly named visit declared on the interface of the nonterminal. However, not every visit is necessarily associated with a visit-block. The same name may not occur twice, and the total order must be preserved: If |x| is the name of a visit-block and |x'| the name of a nested visit-block then |x `prec` x'|. Figure~\ref{fig.outline.rulesorganizationalt} gives an alternative way to organize the rules of the earlier example. The rules inside a visit-block may appear in any order without affecting the semantics of the grammar. 
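The ordering condition on nested visit-blocks (|x `prec` x'|, with each name occurring at most once) can be phrased as a small check. The following Haskell sketch is our own formulation, under the assumption that an interface can be represented simply as the list of its visit names in declaration order:

```haskell
-- Interface order of visits, e.g. ["analyze", "compile"].
type Interface = [String]

-- Position of a visit name in the interface order.
position :: Interface -> String -> Maybe Int
position itf x = lookup x (zip itf [0 ..])

-- A chain of nested visit-block names is well-formed when every
-- name is declared and positions strictly increase (no duplicates,
-- order preserved).
wellFormed :: Interface -> [String] -> Bool
wellFormed itf names =
  case mapM (position itf) names of
    Nothing -> False
    Just ps -> and (zipWith (<) ps (tail ps))
```

Note that not every declared visit needs a block, so |wellFormed| accepts chains that skip visits, as long as the relative order is respected.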
\begin{figure}[tb] \begin{code} sem Expr prod Var -- rules of production |Var| visit analyze -- rules of the |analyze| visit or later loc.defined = loc.x `member` lhs.env lhs.errors = if loc.defined then [] else [Err_Undefined loc.x] visit compile -- rules of the |compile| visit lhs.code = Code_Var loc.x -- actually independent of any visit sem Expr prod App -- rules of production |App| default env optimize -- declares copy rules f.env = lhs.env -- explicitly written rule lhs.errors = concat [f.errors, a.errors] -- explicitly written rule visit compile -- rules of the |compile| visit a.optimize = lhs.optimize -- explicitly written rule \end{code} \caption{Examples of organizing rules in a visit-block.} \label{fig.outline.rulesorganizationalt} \end{figure} A visit-block introduces a scope for local attributes such as |loc.defined|. When defined in a visit, such an attribute is only visible inside the visit-block it is defined in, and its enclosed visit-blocks. The scoping plays a role in Chapter~\rref{chapt.iterative} where rules may additionally be organized in clauses. The inherited and synthesized attributes of the children and |lhs| are globally scoped per production. \paragraph{Exploiting visits: invoke rules.} The additional refinements on the scheduling of rules can be used to allow side-effects in our specification (motivated later). For this purpose, we introduce two additional forms of rules. \begin{defi}[Invoke rule] An invoke rule specifies properties of a visit to a child. These properties usually specify some evaluation strategy. \end{defi} The following is the grammar of invoke-rules of which we explain some properties below: \begin{code} r ::= ...
| invoke x of (many c) z -- specifies visit |x| of children |many c| with strategy |z| z ::= implicit -- determined only by attribute dependencies | explicit -- invoke-rule restricted to the visit it appears in | parent -- invoke-rule restricted to visit with the same name \end{code} The strategies are explained later. Invoke-rules are optional. When a visit |x| to a child |c| is not explicitly specified, it is implicitly specified as the rule: \begin{code} invoke x of c implicit \end{code} The invoke-rule is annotated with a strategy, which provides a means to specify properties of the evaluation of a child. The above strategies constrain when visits to children can be performed. We will later see more strategies. The implicit-strategy (above) specifies no additional constraints. The explicit-strategy (above) requires the invoke-rule to be nested in a visit-block, and constrains the visit to the child to the evaluation of that visit-block. The parent-strategy requires that the parent has an equally named visit and constrains the visit to the child to that visit of the parent. An empty list of children in the invoke-rule applies the strategy to all children that have a visit defined with the same name. \paragraph{Exploiting visits: rules with side effects.} We allow rules with side effects, but restrict these to the introduction of children only. \begin{defi}[Side-effectful child-rule] A \emph{side-effectful child-rule} is a child defined by some impure expression. \end{defi} A child-rule must appear in a visit-block, and its application is restricted to that visit. The syntax of child-rules is (similar to Section~\tref{intro.sect.higherorder}): \begin{code} r ::= ... | child c : N <- f [many a] -- definition of child with side effects \end{code} The scheduling guarantee is that the impure expression |f| is applied before the end of the visit.
The relative order of the side effects within a visit is however not specified\footnote{ We actually provide a notion of \emph{internal visits} to specify the relative order of side effects within a visit. }. This coarse-grained way of specifying the evaluation order allows us to safely integrate side-effectful operations in the attribute grammar, while not having to micro-manage the order of the side effects. In the case of Haskell as the target language, |f| is a monadic expression that yields the structure of child |c| given values of attributes |many a|. For example, along the lines of Section~\rref{intro.sect.copyrules}, we can define such children to encode operations that provide fresh type variables and perform unification. Instead of threading a substitution, we pass these children an inherited |IORef| to a substitution. This allows us to schedule the effects of unification in a less strict way than the explicit threading of substitutions would entail\footnote{ In Chapter~\rref{chapt.iterative} we show how to treat nodes as first-class values, and show in \citet{ariem-ext-graph} some complex traversal patterns that do not follow the tree structure. In such situations, we need more flexibility in the scheduling of side-effectful operations. }. \paragraph{Scheduling and coroutines.} In the context of this chapter, we shall refer to augmented production dependency graphs as PDGs. Section~\rref{sect.intro.ag.depgraphs} explains how dependency graphs are obtained from AG descriptions. In our setting, the I/O graphs are Nonterminal Dependency Graphs (NDGs), because of the explicitly declared single visit sequence per nonterminal. From the PDGs, an execution plan with an as-late-as-possible scheduling of the rules can be obtained if the PDGs are cycle-free. Since we introduced new rules, the question arises how these rules affect the evaluation of the grammar.
Also, RulerCore's NDGs leave less freedom for scheduling the rules in the PDGs, since we need to adhere to the explicitly defined interfaces. We introduce the language RulerBack as a desugared variant of RulerCore which represents execution plans\footnote{ RulerCore can be considered a programming language for execution plans of AGs. }. In comparison to RulerCore, in RulerBack all implicit syntax is made explicit, and the rules in a RulerBack description are totally ordered. In this thesis, we define mappings of RulerBack to algorithms in various host languages. \begin{figure}[tb] \begin{code} sem_App s_f s_a = lhs_analyze where -- body of the function lhs_analyze lhs_env = do -- body of the first visit f_analyze <- return s_f -- monadic child rule a_analyze <- return s_a -- monadic child rule let f_env = lhs_env a_env = lhs_env (f_errs, f_compile) <- f_analyze f_env -- invoke rule (a_errs, a_compile) <- a_analyze a_env -- invoke rule let lhs_errs = concat [f_errs, a_errs] lhs_compile lhs_optimize = do ... -- body of the second visit return (lhs_errs, lhs_compile) -- results of first visit \end{code} \caption{A sketch of the coroutine |lhs_analyze|.} \label{fig.outline.sketch.coroutine} \end{figure} Section~\tref{sect.ag.eval.ordered} shows that ordered AGs can be implemented with coroutines. The example in Figure~\ref{fig.outline.sketch.coroutine} serves as a sketch of a mapping from RulerBack to \emph{monadic} coroutines. For each production |P|, we introduce a semantic function |sem_P|, which takes the coroutines |s_f| and |s_a| as parameters, which serve as children |f| and |a|, and produces a coroutine |lhs_analyze| for the first visit |analyze|. The coroutine for some visit |x| is a function |lhs_x| that takes the values of the inherited attributes of visit |x|, and produces a monadic tuple with values of the synthesized attributes of visit |x| and a coroutine for the successor of |x|.
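The coroutine scheme of Figure~\ref{fig.outline.sketch.coroutine} can be instantiated as a small runnable program. The following Haskell sketch is our own rendering under simplifying assumptions (IO as the monad, lists as attribute values, hand-written rule bodies instead of generated code):

```haskell
type Env  = [String]
type Errs = [String]
type Code = String

-- Coroutine types for the two visits of |Expr|.
type Analyze = Env -> IO (Errs, Compile)
type Compile = Bool -> IO Code

-- Semantics of production |Var|: the |analyze| rules run first and
-- return the coroutine for |compile|.
semVar :: String -> Analyze
semVar x lhsEnv = do
  let locDefined = x `elem` lhsEnv                 -- rule of visit analyze
      lhsErrs    = if locDefined then [] else ["Undefined " ++ x]
      lhsCompile _optimize =
        return ("Code_Var " ++ x)                  -- rule of visit compile
  return (lhsErrs, lhsCompile)

-- Semantics of production |App|: invoke each child once per visit.
semApp :: Analyze -> Analyze -> Analyze
semApp sF sA lhsEnv = do
  (fErrs, fCompile) <- sF lhsEnv                   -- invoke analyze of f
  (aErrs, aCompile) <- sA lhsEnv                   -- invoke analyze of a
  let lhsCompile opt = do
        fCode <- fCompile opt                      -- invoke compile of f
        aCode <- aCompile opt                      -- invoke compile of a
        return (fCode ++ " " ++ aCode)
  return (fErrs ++ aErrs, lhsCompile)
```

Running the |analyze| coroutine of an |App| node yields the errors of both children plus the coroutine that performs the |compile| visit on demand.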
The coroutine |lhs_compile| (Figure~\ref{fig.outline.sketch.coroutine}) is thus constructed as part of the body of |lhs_analyze|. An attribute |k.x| is transcoded as the Haskell identifier |k_x|. The monad serves as an abstraction for evaluation algorithms. We exploit the monadic structure in later chapters. \paragraph{Foundation.} The fragment of RulerCore that we introduced so far does not add to the expressive power of attribute grammars: it can be expressed as a conventional attribute grammar, which we show below to be more precise about the semantics of the notation. We assume that the RulerCore description is desugared to RulerBack, which we mentioned earlier. Below, we describe how to map the RulerBack description to a conventional attribute grammar. In general, interface declarations are mapped to attribute declarations by erasing visits. Semantics-blocks in RulerCore are translated to semantics-blocks by erasing visit-blocks and invoke-rules. To represent the erased information in a conventional attribute grammar, we thread an additional attribute through the tree for each visit. These attributes serve as synchronization points for the beginning and end of the visit. Per visit |v| of a nonterminal |N|, we introduce two additional attributes |sub begin v| and |sub end v|: \begin{code} attr N ^^^ inh (sub begin v) ^^ :: S T ^^^ syn (sub end v) ^^ :: S T \end{code} The attributes model the side effects by encapsulating a state as some type |S T|, which we come back to later. %format rule0 Also, we thread these attributes \emph{through the rules} to enforce their evaluation order. Per production, we associate a unique consecutive number (starting from |0|) with each rule in a production. This is possible because there exists a total order among the rules. For each rule, we introduce local attributes |sub (loc.begin) i| and |sub (loc.end) i| where |i| is the number associated with the rule.
The purpose of the attributes is to mark respectively the beginning and end of the evaluation of the rule. We show later how to make a rule dependent on its |begin| attribute (in addition to its normal dependencies), and how to make the |end| attribute dependent on the evaluation of the rule. For a visit |v| and production |P| of nonterminal |N| there exists a collection |R| of rules that are associated with |v| after scheduling. If this collection is not empty, let |k| be the lowest number associated with the rules, and |l| the highest number. We then connect the begin and end of the visit with the beginning and end of the rules associated with the visit: \begin{code} sem N ^^^ prod P sub (lhs.end) v = sub (lhs.begin) v -- if |R| is empty sub (loc.begin) k = sub (lhs.begin) v -- otherwise |R| not empty sub (lhs.end) v = sub (loc.end) l -- otherwise |R| not empty \end{code} When a rule with associated number |i| and a rule with number |i+1| are in |R|, then we add also: \begin{code} sem N ^^^ prod P sub (loc.begin) (i+1) = sub (loc.end) i \end{code} At this point, we have chained the attributes together, except that each attribute |sub (loc.begin) i| still needs to be connected to |sub (loc.end) i|, and the begin and end of visits to the children need to be incorporated. \begin{figure}[tb] \begin{code} class ThrEff t where type M t :: * -> * -- type of a monadic computation type S t :: * -- type of the state impure :: M t alpha -> S t -> (S t, alpha) pure :: alpha -> S t -> (S t, alpha) pure x s = s `seq` x `seq` (s, x) \end{code} \caption{API of threaded effects.} \label{fig.outline.api.threaded} \end{figure} To thread the attribute through a rule, we introduce in Figure~\ref{fig.outline.api.threaded} the functions |pure| and |impure| which depend on the type |t|. The function |impure| takes an effectful computation |M t alpha| and an initial state |S t|, then produces an updated state paired with the result of the computation.
The function |pure| passes the state on unchanged. Figure~\ref{fig.outline.example.instances} gives some exemplary instances. An instance of |ThrEff| can be given for any monad that threads a state. \begin{figure}[tb] %format STATE = "\Conid{State}\#" \begin{code} data IsPure -- do not thread a state instance ThrEff IsPure where type M IsPure t = t type S IsPure = () impure = pure data IsIO -- thread state of the world instance ThrEff IsIO where type M IsIO = IO type S IsIO = STATE RealWorld impure (IO m) w = case m w of (# ^^ w', a ^^ #) -> a `seq` (w', a) \end{code} \caption{Example instances for threaded effects.} \label{fig.outline.example.instances} \end{figure} To complete the chain, we show the mapping of rules of RulerCore. The above functions are used in the translation of a rule |r| with associated number |i| to |semb1 r i|: \begin{code} semb1 (invoke v of c) i ^^^ ~> ^^^ sub (c.begin) v = sub (loc.begin) i ^^^ ^^^ sub (loc.end) i = sub (c.end) v semb1 (child c : N <- f [many a]) i ^^^ ~> ^^^ (sub (loc.end) i, loc.c) = impure (f [many a]) (sub (loc.begin) i) ^^^ ^^^ child c : N = loc.c semb1 (p [many a2] = f [many a1]) i ^^^ ~> ^^^ (sub (loc.end) i, p [many a2]) = pure (f [many a1]) (sub (loc.begin) i) \end{code} The concept of visits thus provides a means to reason about attribute grammars with side-effectful computations in their rules. Note that although the AG description may be thought of as having side effects, the underlying model is still purely functional. \paragraph{Remarks.} We purposefully allow side effects only in the creation of children. Conventional rules must be purely functional. This ensures that the attributes have a referentially transparent definition, even though the tree structure itself is not\footnote{ We can encode any rule as an attribute-defined child (Section~\tref{intro.sect.higherorder}), thus the restriction does not limit expressiveness.
The purpose of the restriction is to ensure that side effects are sufficiently contained. }. Chapter~\rref{chapt.breadthfirst} shows how to implement search algorithms, and introduces syntax to define children without inherited attributes. For advanced search algorithms, which make use of sharing and memoization, limited use of side effects plays an important role. More generally, by making visits explicit, we can integrate our approach with compilers that are built on top of monads, and use tree traversals in impure environments. Also, we can use the side effects to efficiently access results from nodes visited earlier in a traversal. This can be used to implement memoization strategies. %% Our approach requires that each attribute is explicitly associated with a visit. In %% conventional ordered attribute grammars, both visits and associations are inferred. %% We show in Chapter~\rref{chapt.warren} how to infer both visits and associations. The visitor pattern~\citep{DBLP:conf/ecoop/GammaHJV93} is often employed to implement recursive traversals over tree structures in imperative languages. Concretely, Chapter~\rref{chapt.side-effect} presents how our approach generalizes over the visitor design pattern. For this purpose, we use JavaScript as a host language, which in passing shows that our extensions are applicable to domains other than the implementation of type inference algorithms or functional programming languages. In this context, a \emph{visitor} is an object that contains an algorithm that describes which children to visit, and what changes to apply to the state of the node, or the visitor itself. Attributes provide a convenient way to access the state of nodes, and with our approach the changes to the state of the visitor can be encoded with side effects. With respect to these attributes, our approach offers the benefits of AGs, including the static enforcement that attributes are defined before they are used.
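The |IORef|-based style of side effect mentioned in this section can be sketched in Haskell as follows (the helper names |freshVar| and |unifyVar| are hypothetical illustrations, not part of RulerCore): rather than threading a fresh-variable counter and a substitution through chained attributes, nodes share them via references, and the effects are scheduled somewhere within the enclosing visit.

```haskell
import Data.IORef

type Var   = Int
type Subst = [(Var, String)]

-- A fresh-variable supply shared via an IORef instead of being
-- threaded through chained attributes.
freshVar :: IORef Var -> IO Var
freshVar supply = do
  v <- readIORef supply
  writeIORef supply (v + 1)
  return v

-- Record a binding in the shared substitution as a side effect;
-- the exact order of such effects within a visit is irrelevant
-- as long as all of them happen before the visit ends.
unifyVar :: IORef Subst -> Var -> String -> IO ()
unifyVar subst v ty = modifyIORef subst ((v, ty) :)
```

This mirrors the motivation in the text: the grammar only guarantees that these effects happen before the end of the visit, which is exactly the coarse granularity such operations need.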
\section{Attribute Grammars with Commuting Rules}
\label{summary.sect.commuting}
%% Last-as-possible scheduling: start marking from the sinks
%% choose those inh/syn attrs that
%% Note: add parallelism reasoning
In Chapter~\rref{chapt.warren}, we generalize visits to \emph{phases}. A visit is a more technical, internal concept which precisely controls the evaluation of the grammar. A phase is a more abstract concept which the programmer can use to specify properties of the evaluation of the grammar. To reason with side effects in this setting, we present \emph{commuting} rules, which are rules with relaxed dependencies.

\paragraph{Phases.}
We start with the notion of a phase:
\begin{defi}[Phase]
A \emph{phase} represents a sequence of state transitions, controlled and observable by the parent, which take the node's state to a state described by its next configuration.
\end{defi}
Such a state transition consists of a sequence of smaller state transitions, which correspond to the visits as described in Chapter~\rref{chapt.side-effect}. A nonterminal may be associated with a set of possible visit sequences for its phases, and a production specifies for each of its children which sequence to take.

We illustrate the above with an example. Suppose that we describe an analysis and transformation of a tree of labeled instruction blocks. The abstract syntax of blocks is described by the following grammar:
\begin{code}
grammar Block ^^^ prod Seq ^^^ nonterm l,r : Block ^^^ prod Leaf ^^^ term lab :: Lab
term instr :: Instr
\end{code}
The actual transformation of the instructions is out of the scope of this example. Let |transform| be a function that requires the label of the predecessor and successor to transform the instructions in the leaves.
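To fix intuitions, the following Haskell sketch gives one possible shape for the object language and for |transform|. The concrete |Lab| and |Instr| representations, and the behaviour of |transform|, are our assumptions, since the example leaves them abstract:

```haskell
-- assumed representations for labels and instructions
type Lab = Int
newtype Instr = Instr String deriving (Eq, Show)

-- the abstract syntax of blocks, mirroring the grammar above
data Block = Seq Block Block
           | Leaf Lab Instr
           deriving (Eq, Show)

-- hypothetical |transform|: rewrite an instruction given the labels of the
-- nearest preceding and succeeding leaves
transform :: Instr -> Lab -> Lab -> Instr
transform (Instr s) pre suc =
  Instr (s ++ " [pred=" ++ show pre ++ ",succ=" ++ show suc ++ "]")
```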
\begin{figure}[tb]
\begin{code}
itf Block -- phase interface (visit-interfaces are not given anymore)
  phase analyze -- analyze phase
    inh pred :: Lab -- label to be used as predecessor from the predecessor
    inh succ :: Lab -- label to be used as successor from the successor
    syn pred :: Lab -- label to be used as predecessor for the successor
    syn succ :: Lab -- label to be used as successor for the predecessor
  phase transform -- transformation phase
    inh debug :: Bool
    syn trans :: SELF -- self attribute (Section~\tref{intro.sect.incremental})
  default debug -- default rule on interface for inh attr
\end{code}
\caption{Phase interface of a |Block|.}
\label{fig.outline.block.phases}
\end{figure}
To apply the |transform| function, we associate the label of a preceding and succeeding block with each instruction. The chained attribute |pred| represents the label of the left-nearest instruction, and attribute |succ| the label of the right-nearest instruction. Effectively, we pass |pred| from left to right, and |succ| from right to left. The \emph{phase interface} for a nonterminal, which declares these attributes, is given in Figure~\ref{fig.outline.block.phases}. Attributes may be declared outside phases. The ordering of phases is deduced from the semantics blocks, not from the order of appearance in the phase-interface specification: phases represent non-overlapping units of evaluation.
%% Static checks in semantic functions can still be done without having knowledge of
%% phase ordering
\begin{figure}[tb]
\begin{code}
sem Block prod Seq -- rules related to |pred| and |succ|
  ainh.l.pred = ainh.lhs.pred -- left to right
  ainh.r.pred = asyn.l.pred
  asyn.lhs.pred = asyn.r.pred
  ainh.r.succ = ainh.lhs.succ -- right to left
  ainh.l.succ = asyn.r.succ
  asyn.lhs.succ = asyn.l.succ
sem Block prod Leaf
  asyn.lhs.pred = loc.lab -- is predecessor of next
  asyn.lhs.succ = loc.lab -- is successor of prev
  loc.newInstr = transform loc.instr ainh.lhs.pred ainh.lhs.succ lhs.debug
  asyn.lhs.trans = Leaf loc.lab loc.newInstr
\end{code}
\caption{The semantics of productions of |Block|.}
\label{fig.outline.semantics.block}
\end{figure}
As part of the semantics for productions of |Block|, we describe the flow of the |pred| and |succ| attributes in Figure~\ref{fig.outline.semantics.block}. Seq-productions act as crossbar switches, and Leaf-productions inject their labels in the attribute flows. Since we have inherited and synthesized attributes with the same name, we use the prefixes |ainh| and |asyn| to explicitly distinguish these attributes. Assume that |Block| is also the root symbol: at the root, we provide initial values for the inherited attributes, and request values for all synthesized attributes.

\paragraph{Implementation of phases.}
Since a phase effectively represents a unit of evaluation, we can choose an algorithm for the evaluation of a phase. We assume here that we choose a statically ordered evaluation algorithm, which reduces the choice to either a Kastens-style or a Kennedy-Warren-style algorithm (Section~\tref{sect.ag.eval.ordered}). A Kastens-style algorithm does not suit the example. In the example, the rules for |pred| and |succ| are independent. However, the attributes of the analyze-phase need to be computed in at least two visits. The rules for production |Seq| require either |succ| or |pred| to be computed first.
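The flow of the two chained attributes can be mimicked as a circular (lazy) Haskell program. This is our own sketch, not generated code; the triples collected per leaf stand in for the input that |transform| would receive:

```haskell
type Lab = Int
data Block = Seq Block Block | Leaf Lab

-- given (inh pred, inh succ), return (syn pred, syn succ) together with the
-- (label, pred, succ) triple observed at every leaf
analyze :: Block -> Lab -> Lab -> ((Lab, Lab), [(Lab, Lab, Lab)])
analyze (Leaf lab) inhPred inhSucc =
  ((lab, lab), [(lab, inhPred, inhSucc)])
analyze (Seq l r) inhPred inhSucc =
  let ((synPredL, synSuccL), xs) = analyze l inhPred synSuccR  -- |l.succ = r.succ|
      ((synPredR, synSuccR), ys) = analyze r synPredL inhSucc  -- |r.pred = l.pred|
  in ((synPredR, synSuccL), xs ++ ys)                          -- crossbar switch
```

Computing the |pred| results never demands a |succ| value and vice versa; this independence of the two chains is exactly what a scheduling algorithm has to preserve.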
This is a typical example where a Kastens-style scheduling~\citep{DBLP:journals/acta/Kastens80} fails to find an ordering, because that scheduling induces extra edges in the PDG, which for this example causes a cycle. \citet{DBLP:conf/popl/KennedyW76} describe an algorithm that does not induce extra edges in the PDG. A set of visit sequences is determined for each nonterminal, such that there is one visit sequence per context of an occurrence of the nonterminal symbol. We present a variation on this algorithm that schedules rules as late as possible, and only those that are needed in a given context. Moreover, we show how to represent such visit sequences in a strongly-typed functional language.

\paragraph{Visits-DAG.}
We associate a graph structure with each nonterminal which represents the visits of that nonterminal:
\begin{defi}[Visits-DAG]
A \emph{visit-interface DAG} describes the set of visit interfaces that are associated with a nonterminal. The graph has exactly one source vertex, and at least one sink vertex. Each vertex represents a configuration, each arrow a visit, and each path from the source to some vertex a visit-interface. To disambiguate, we may call a vertex in the visits-DAG a \emph{visits-vertex}.
\end{defi} \begin{figure}[tb] \mathhs \begin{center} \begin{tikzpicture} [ conf/.style={circle, minimum size=1mm, node distance=4mm, top color=black, bottom color=black,font=\scriptsize} , vis/.style={->,thick} , lab/.style={font=\footnotesize} ] \node[lab](nonterm){|itf Block|}; \node[conf, right of=nonterm, xshift=4mm](a){}; \node[conf, right of=a, yshift=-28mm, xshift=35mm](b){}; \node[conf, right of=a, yshift=28mm, xshift=35mm](c){}; \node[conf, right of=a, xshift=75mm](d){}; \node[conf, right of=d, xshift=27mm](e){}; \draw[vis](a) to node[auto,below,lab,yshift=19mm]{\ensuremath{ \begin{code} visit v1 inh pred, succ :: Lab inh debug :: Bool syn pred, succ :: Lab syn trans :: Block phase analyze prod Leaf ^^ emptyset prod Seq ^^ child l ^^ v2 v4 ^^ child r ^^ v3 v5 \end{code}}} (d); \draw[vis](a) to node[auto,above,lab,yshift=28mm,xshift=-10mm]{\ensuremath{ \begin{code} visit v2 inh pred :: Lab syn pred :: Lab prod Leaf ^^ emptyset prod Seq ^^ child l ^^ v2 ^^ child r ^^ v2 \end{code}}} (b); \draw[vis](a) to node[auto,below,lab,yshift=-22mm,xshift=-10mm]{\ensuremath{ \begin{code} visit v3 inh succ :: Lab syn succ :: Lab prod Leaf ^^ emptyset prod Seq ^^ child l ^^ v3 ^^ child r ^^ v3 \end{code}}} (c); \draw[vis](b) to node[auto,above,lab,yshift=26mm,xshift=12mm]{\ensuremath{ \begin{code} visit v4 inh succ :: Lab syn succ :: Lab phase analyze prod Leaf ^^ emptyset prod Seq ^^ child l ^^ v4 ^^ child r ^^ v4 \end{code}}} (d); \draw[vis](c) to node[auto,below,lab,yshift=-26mm, xshift=12mm]{\ensuremath{ \begin{code} visit v5 inh pred :: Lab syn pred :: Lab phase analyze prod Leaf ^^ emptyset prod Seq ^^ child l ^^ v5 ^^ child r ^^ v5 \end{code}}} (d); \draw[vis](d) to node[auto,below,lab,xshift=4mm]{\ensuremath{ \begin{code} visit v6 inh debug :: Bool syn trans :: Block phase transform prod Leaf ^^ emptyset prod Seq ^^ child l ^^ v5 ^^ child r ^^ v5 \end{code}}} (e); \end{tikzpicture} \end{center} \plainhs \caption{Visits-DAG of the example.} 
\label{fig.outline.visitsdag}
\end{figure}
For the above example, Figure~\ref{fig.outline.visitsdag} shows the visits-DAG. There are at least three paths in the visits-DAG. The middle path represents the root where all inherited attributes are available, and the two other paths represent contexts in which one of the inherited attributes becomes available first. Each edge has at least one \emph{output}, which is either a synthesized attribute or phase ending. Along each path, the number of attributes increases. Each edge corresponds to a visit; we gave each a unique label |sub v i|. Also, we associated with each edge some meta-data regarding productions: the visits performed on the children of the productions during the execution of the visit that is associated with the edge. Paths may be of different length, and end in different configurations. In the above example, all paths end in the same configuration because in each context all attributes are eventually needed. Section~\tref{sect.warren.visitsgraphs} describes the visits graph and its properties in more detail and shows how to incrementally compute it. Given this graph, for each edge and each production, a collection of RulerBack rules can be determined. The branching-factor of each node determines code duplication. In practice, this code duplication is not a reason for concern. The visits graph of the largest AG of UHC has about 10,000 edges and already leads to a tractable implementation. With some optimizations (Section~\tref{warren.sect.optimizations}), we reduced this number to about 3,000 edges.
%% NOTE: Take out?
%if False
\paragraph{Scheduling.}
The scheduling for phases is performed on dependency graphs that are in the traditional Knuth-1 form (Section~\rref{sect.intro.ag.eval}). We derive the visits graphs (as shown above) from the dependency graphs using an iterative algorithm that gradually constructs the visits graphs. In the visits graph, the edges correspond to visits.
We distinguish two types of edges:
\begin{defi}[Pending edge]
A \emph{pending} edge denotes that scheduling has not been performed for the visit yet, and describes which inherited attributes are defined prior to the visit, which synthesized attributes are to be computed, and which phases are complete.
\end{defi}
\begin{defi}[Final edge]
A \emph{final} edge denotes that scheduling has been performed, and describes additionally which visits to invoke for children of the productions of the nonterminal.
\end{defi}
\begin{defi}[Pending or final vertices]
Similarly, we distinguish pending and final vertices, where for a final vertex we associate for each production a copy of the PDG.
\end{defi}
We use the PDG to determine which PDG vertices are scheduled in a visit. As initial visits-graph, we start with a single visits-vertex for each nonterminal. This vertex represents the empty configuration for that nonterminal. For root-nonterminals, the initial graph also contains a pending visits-vertex for the final configuration, and a pending edge between them that represents the single visit at the root. We repeatedly take a pending visits-edge with a final predecessor vertex, and we perform scheduling for that edge to turn it into a final visits-edge with a final successor vertex, until all pending edges are final. Scheduling may result in new edges and vertices being added to the graph. The order in which we consider pending edges does not affect the resulting visits-graph.
\begin{defi}[Marker]
A \emph{marker} is an annotation of a vertex in a PDG. A pending marker for a PDG-vertex means that it should still be scheduled, and a final marker means that it is scheduled.
\end{defi}
Given such an edge, we perform scheduling of the visit for each production of the associated nonterminal individually. Given the corresponding PDG of the predecessor vertex, we add marks to construct the corresponding PDG for the successor vertex.
We mark the PDG-vertices final that correspond to the inherited attributes that are defined for this visit. Furthermore, we mark pending all unmarked vertices that have marked dependencies.
\begin{defi}[Ready vertex]
A \emph{ready} vertex is a vertex marked as pending with its dependencies marked as final.
\end{defi}
We then repeatedly apply the following algorithm on the PDG to determine the visits to children. We exhaustively mark as final the ready vertices that are not outputs of children. Then for each child that has ready outputs, we add a pending edge to the visits-graph of the child's nonterminal, and mark the ready outputs as final. When all pending dependency-vertices are marked as final, the scheduling for the production is complete. When all productions have been processed we mark the visit-edge as final. The entire scheduling process is complete when all visit-edges are marked as final. Each path represents a visit-interface, for which we generate the evaluation algorithm as described in Chapter~\rref{chapt.side-effect}. When instantiating the child, its parent specifies with a type index which of the visit-interfaces to use.

\paragraph{Remarks.}
Phases generalize over visits, and thus can be used to make assumptions about the evaluation order. If this functionality is not required, then all attributes can be defined in a single phase. If there exists a partitioning into visits, then there is at least one alternative in the visit-interface DAG. However, the explicit partitioning can also be used to restrict the number of paths. For example, if we define the |pred| and |succ| attributes in separate phases, only one of the paths remains. Moreover, at the end of each phase, the node's state is in a configuration that is independent of the chosen visit sequence, thus the configuration after a phase is a merge point in the DAG.
%endif

\paragraph{Commuting rules.}
Section~\rref{intro.outline.sideeffect} shows a translation of visits to conventional attribute grammars using a functional encoding of the state. The translation involved adding an additional chained attribute (the state attribute) per visit, which represents the state, and a transformation of the rules to thread the attribute through the rules scheduled for the visit. A property of this approach is that the side effects that arise from visiting a child of a parent can only be observed by the parent or its other children by inspecting the state attribute of the child after the visit. A similar translation is possible for phases. Analogously, a parent only observes the side effects arising from a child after completing the phase. However, during the evaluation of a phase of a child of a parent, side-effectful rules of the parent or the other children of the parent may be evaluated, since a phase possibly consists of multiple visits to a child. In this situation, with a single chained attribute per phase, side effects arising from a child may not be observed in a timely fashion in the parent or in siblings, and such a translation does not fully capture the semantics of side-effectful rules. A possible approach is to translate the phases to an explicit visit sequence, and then use the translation of Section~\rref{intro.outline.sideeffect}. However, visits are implicit in the phases model, and additionally there may be a visit sequence per context. Instead, we take the opportunity to present \emph{commuting rules}.
\begin{defi}[Commutative compositions and commutable rules]
Given an explicit ordering of rules, the composition of two rules is \emph{commutative} when the two rules are \emph{commutable}, which means that the rules may be swapped in the composition without affecting the intended result.
\end{defi}
Rule composition is a conditionally commutative operator. Commutable rules can be considered as commutable operations.
The swapping of rules models side effects, and commutativity permits reasoning about the safe use of side effects. A semantic tree is a composition of the rules of the tree (Section~\ref{intro.sect.ags} and Section~\ref{sect.intro.ag.eval}). Section~\tref{intro.sect.correspondences} shows that a composition of rules can be expressed with arrow notation, which is a convenient notation to define when two rules are commutable. Further, we wish to describe a composition of rules (e.g. the composition of rules of a node) in a larger context (e.g. the composition of rules in a tree). We define a \emph{rule context} |h| as a composition of rules with a hole in it so that |h c| represents the composition of rules with |c| the composition of rules at the location of the hole.
\begin{defi}[Commuting rule]
A \emph{commuting rule} is a rule of the form |(x',y') = f x y, ^^ x `comm` x'| where the letters |x|, |y|, etc. are (sets of) attribute occurrences and |f| is a semantic function.
\end{defi}
A commuting rule thus only differs in the notation |x `comm` x'|, which denotes that the rule may be swapped with rules that define |x| or use |x'| (with renaming of the attribute occurrences). Such a rule is said to commute over |x| and |x'|. Consider a rule |r1| of the form |(x1,y1) = f (x0,y0), ^^ x0 `comm` x1| and a rule |r2| of the form |(x2,y2) = g (x1,y1), ^^ x1 `comm` x2|. In a composition of rules containing |r1| and |r2|, the additional notation specifies that |r1| may appear ordered before |r2| and vice versa. Without the additional notation the rule |r1| must appear before |r2| because |r2| refers to an attribute defined by |r1|. With commuting rules as AG feature, side effects can be encoded as a single chained attribute per nonterminal that is threaded through each rule, with each rule commuting over this attribute. Such a translation is more straightforward than the translation in Section~\rref{intro.outline.sideeffect} and also works for phases.
\paragraph{Referential transparency.}
At the level of specification, the use of commutable rules may break referential transparency and thus complicate equality reasoning. A question that arises is how to reason with a safe use of commutable rules. We define below a law for this purpose. Suppose that |r1 `notprec` r2| denotes that |r1| is independent of |r2| with respect to the dependencies between attributes and rules, except for the dependencies between the attributes that the rules commute over. There are two compositions in arrow notation (|c1| and |c2|) to consider:
\begin{code}
c1 = proc (x0,y0,z0) -> do -- composition of |r1| and |r2|
  (x1,y1) <- f -< (x0,y0) -- rule |r1| in arrow notation
  (x2,z1) <- g -< (x1,z0) -- rule |r2| in arrow notation
  returnA -< (x2,y1,z1)
c2 = proc (x0,y0,z0) -> do -- composition of |r2| and |r1|
  (x1,z1) <- g -< (x0,z0) -- rule |r2| in arrow notation
  (x2,y1) <- f -< (x1,y0) -- rule |r1| in arrow notation
  returnA -< (x2,y1,z1)
\end{code}
The identifiers |y0| and |z0| represent the independent input attributes of respectively |r1| and |r2|, and identifiers |y1| and |z1| their respective independent output attributes. Identifiers |x0|, |x1| and |x2| represent the attributes over which the rules may commute.
\begin{defi}[Rule context]
A rule context |h| is a function that serves as an abstraction of a rule composition with a hole. It takes as parameter the composition to fill the hole with.
\end{defi}
\begin{defi}[Commutable over attributes]
We now say that |r1| and |r2| are \emph{commutable} over the attributes |x0,x1| and |x1,x2| if |r1 `notprec` r2| and their compositions |c1| and |c2| give equivalent results |h c1 = h c2| in some given rule context |h|.
\end{defi}
When the rules are commutable, the outcome of swapping the rules in rule context |h| is equivalent, thus the slice of the grammar that contains the rules |r1|, |r2| and those in |h| is referentially transparent.
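As a small executable instance of this definition (our own test harness; the rules and the abstraction are hypothetical), consider two rules that each append a message to a chained list of errors. Under a rule context |h| that only inspects the messages ordered by source location, the two compositions give equal results, so the rules are commutable:

```haskell
import Data.List (sort)

-- two hypothetical error-collecting rules, each threading the chained list
-- of messages and appending its own message
r1, r2 :: [String] -> [String]
r1 errs = errs ++ ["B.hs:3: type error"]
r2 errs = errs ++ ["A.hs:1: undefined variable"]

-- the two compositions: |c1| applies rule one first, |c2| rule two first
c1, c2 :: [String] -> [String]
c1 = r2 . r1
c2 = r1 . r2

-- rule context |h|: the surrounding grammar only looks at the messages
-- ordered by source location, here approximated by sorting the list
h :: ([String] -> [String]) -> [String] -> [String]
h c errs0 = sort (c errs0)
```

Here |h c1 [] == h c2 []| although |c1 [] /= c2 []|: the raw results differ, but not under the abstraction that the context actually applies.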
Rule context |h| should be chosen in such a way that it expresses how the context of the rules interprets the attributes computed by the rules. We finish this discussion of commutable rules below with an exemplary definition of |h| which states that for an attribute that provides fresh numbers only uniqueness is relevant. For some exemplary grammar of expressions, the following two rules thread a counter |k|, and extract two unique numbers in |loc.u1| and |loc.u2|:
\begin{code}
sem Expr prod Var
  (loc.k, loc.u1) = f (ainh.lhs.k, ainh.lhs.k), ^^ ainh.lhs.k `comm` loc.k -- rule one
  (asyn.lhs.k, loc.u2) = g (loc.k+1, loc.k), ^^ loc.k `comm` asyn.lhs.k -- rule two
\end{code}
With the following definitions for |f| and |g|:
\begin{code}
f (a,_) = (a+1, a)
g (b,_) = (b+1, b)
\end{code}
For the following definition of |h|, the above two rules are commutable. In both compositions of the rules, the two resulting numbers are different from each other:
\begin{code}
h r n = a /= b where -- abstraction: the unique numbers should be distinct
  (_,a,b) = r (n, (), ()) -- for any number |n| that represents the |ainh.lhs.k|
\end{code}
We may choose functions |h| that state stronger invariants and take more context into account. For example, when we consider the collection of a list of error messages, we may take the slice of the rules that depend on the error messages, and require that the lists are equal when ordered according to the source location of each message.
%if False
Despite the reordering, the specification is referentially transparent when the actual context of the rules does not make a stronger assumption than |h|. The commutable-property thus allows us to reason about which rules can be specified as being commuting. The commutable-property is actually a law that commuting rules should satisfy for a sufficiently strong definition of |h|. We do not mechanically verify this property.
%endif
Commutable rules can be applied when expressing collection attributes as a chained attribute. A collection attribute is a synthesized attribute with a commutative monoid or trace monoid as value. Often, such attributes can be encoded more efficiently as a chained attribute. However, threading a chained attribute through some children may induce tighter dependencies than combining synthesized attributes of these children, and thus reduces freedom in attribute scheduling and may even lead to cycles. With commutable rules a chained attribute can be used without the tighter dependencies.

\paragraph{Remarks.}
In Chapter~\rref{chapt.warren} we work out phases in more detail. There, we describe how to compute the dependency graphs and how to perform scheduling of phases. This work shows how to generalize visits to phases, and allows us to describe the extensions that we present in the next sections (and their corresponding chapters) using visits, so that we factor out the dependency graphs and scheduling in the next sections and chapters.

\section{AGs with Tree Construction}
\label{sect.intro.exploitit}
In Chapter~\rref{chapt.iterative}, we show several AG extensions based on the model of explicit visits that allow us to conditionally and iteratively define attributes and children. Additionally, we use annotations on visits and invocations of visits to fine-tune evaluation strategies. In this section, we give an overview of the extensions. In a conventional attribute grammar the rules to evaluate for a node are the rules associated to the production that is associated to that node. As extension, we wish to have more fine-grained control over which sequence of rules is evaluated. Therefore, we split up visit-blocks into a sequence of clause-blocks. Each clause-block may contain rules, and per visit some strategy chooses which clause-block to use to compute the attributes.
In this section, we take a fixed strategy based on the order of appearance of clause-blocks and backtracking. We show that with this approach we can implement a dispatch of rules based on values of inherited attributes. In Section~\ref{sect.outline.stepwise} we present a mechanism based on a stepwise evaluation to actually define custom strategies. Further, we show how to express iteration by annotating invoke-rules with a strategy that repeats the evaluation of the visit until a condition is met. Moreover, first-class children are an extension that allows children to be detached as a value, or attached from a value. The techniques combined provide a powerful mechanism to encode fixpoint computations. In the remainder of this section, we explain these extensions one-by-one. In Chapter~\rref{chapt.iterative} we show how these extensions are implemented.

\paragraph{Clauses.}
Instead of associating a collection of rules per production, we organize the rules in a different way. We associate a DAG with a production. Each vertex is associated with a configuration and each edge is associated with a visit and with a collection of rules. There may be multiple vertices associated to the same configuration, although there may only be one source vertex, which must be associated with the empty configuration. During evaluation, a path is traversed through the DAG: one edge per visit, and a strategy associated with the vertices dictates which edge to traverse.
\begin{defi}[Clause]
A \emph{clause} is an edge in the DAG as specified above.
\end{defi}
As notational simplification, we impose the restriction that the DAG\footnote{
We thus identify two important DAGs: a DAG per nonterminal which describes visits and attributes, and a DAG per production which describes different sets of rules for visits to compute the attributes. The restrictions that we impose on the DAGs simplify the implementation or the notation.
} must be a tree, and present notation below on how to describe this tree.
Essentially, the rules are organized in clause-blocks per visit-block.
\begin{figure}[tb]
\begin{code}
itf Expr ^^ visit check
  ^^ inh env :: Env -- environment containing declared types
  ^^ inh tp :: Ty -- expected type of the expr
  syn errors :: Errs -- result of type checking
sem Expr ^^ prod Var
  ^^ visit check -- cases for the check-visit of the var-prod
    clause defIdent -- case for when the variable is in the env
      match (Just loc.declTp) = lookup loc.nm lhs.env
      internal matchTp -- internal case distinction
        clause typeOk -- case for when the type matches
          lhs.errors = []
          match True = lhs.tp `isInstance` loc.declTp
        clause typeFail -- case in case of a type mismatch
          lhs.errors = [Mismatch lhs.tp loc.declTp]
    clause undefIdent -- case for when the variable is undefined
      lhs.errors = [UndefVar loc.nm] -- assumes a match of |defIdent| failed
\end{code}
\caption{Example of clauses.}
\label{fig.outline.clausesexample}
\end{figure}
We start with an example in Figure~\ref{fig.outline.clausesexample} before explaining the notation. The example consists of a type checker for some var-production of a lambda calculus. In the example, we use clauses to encode case distinction. We distinguish a clause |defIdent| for when the identifier is in the environment, and a clause |undefIdent| when this is not the case. Moreover, we split the |defIdent| clause into two more clauses depending on whether the expected type matches the declared type using an internal visit. The nesting of clauses forms a decision tree. A path in this tree is the sequence of clauses that are selected to compute the outputs of the visit. For now, we assume that clauses are selected with a fixed strategy based on the order of appearance (to which we come back later). Internal visits can be considered as |epsilon|-edges in the DAG as mentioned above.
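The fixed selection strategy for this example can be sketched in Haskell with |Maybe| modelling recoverable failure. Exceptional termination is omitted, and the |Env| representation and all names below are our assumptions:

```haskell
type Env = [(String, String)]   -- variable name to declared type

checkVar :: Env -> String -> String -> [String]
checkVar env nm expectedTp = pick [defIdent, undefIdent]
  where
    -- clauses are tried in order of appearance; the first success wins
    pick (Just errs : _) = errs
    pick (Nothing : cs)  = pick cs
    pick []              = error "no clause applies"  -- a total visit may not fail

    -- clause |defIdent|: the match-rule fails when |nm| is not in the env;
    -- the internal visit then distinguishes |typeOk| from |typeFail|
    defIdent = do declTp <- lookup nm env
                  Just (if expectedTp == declTp
                          then []                                       -- |typeOk|
                          else ["Mismatch " ++ expectedTp
                                  ++ " " ++ declTp])                    -- |typeFail|
    -- clause |undefIdent|: reached only when |defIdent| failed
    undefIdent = Just ["UndefVar " ++ nm]
```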
The following changes to notation allow visits to consist of a non-empty ordered sequence of clauses:
\begin{code}
t ::= visit x (many r) (many k) -- conventional visit block, common rules |r|, and alternatives |many k|
  | internal x (many r) (many k) -- internal visit block, common rules |r|, and alternatives |many k|
  | terminator -- terminator of visit/clause branch (optional)
k ::= clause x (many r) t -- clause-block with rules |many r| and visit |t|
x -- identifier of a visit or clause
\end{code}
Each clause contains a set of rules. This set of rules defines the synthesized attributes of the visit, and potentially subsequent visits. Thus, each clause provides a number of alternative definitions of the synthesized attributes of a visit. We distinguish conventional visits and \emph{internal visits}. A conventional visit is invoked by the parent and declared as part of the interface of the nonterminal. Internal visits and clauses are evaluated as part of the evaluation of their encapsulating visit or clause. To describe the clause selection strategy, we distinguish three types of outcome for rules, visits, and clauses. Evaluation either \emph{succeeds} with resulting attribute bindings, \emph{terminates exceptionally}, or \emph{fails} with a recoverable failure:
\begin{itemize}
\item Clauses are evaluated in the order of their appearance. The first clause that succeeds or terminates is chosen as the clause that provides the outcome of the visit. If a clause fails, the next clause is evaluated.
\item During the evaluation of a clause, the rules are evaluated in a scheduled order. When a rule succeeds, the next rule is evaluated. If all rules succeed, the clause succeeds. However, a clause fails if the evaluation of a rule terminates exceptionally or fails.
\item A visit terminates if any of its evaluated clauses terminate, and otherwise succeeds if a clause succeeds. If all of its clauses fail, then the visit either terminates exceptionally or fails recoverably.
The difference is determined by whether the visit is annotated as \emph{total} or with a \emph{partial} strategy. Visits are declared as total by default.
\end{itemize}
There are two types of rules that may fail:
\begin{itemize}
\item An invoke-rule may be annotated with a \emph{partial} strategy. If it is, the invoke-rule fails if the visit to the child fails. Otherwise the invoke-rule either succeeds or terminates exceptionally.
\item We present \emph{match-rules} to specify conditions. A rule |match p = e| requires that the value of |e| satisfies the pattern |p|, otherwise evaluation of the match-rule fails.
\end{itemize}
In the rule ordering, match-rules must be evaluated as part of the clause in which they are declared. Moreover, match-rules take priority in the rule scheduling. If two match-rules are independent of each other, then the order of appearance determines which rule is scheduled first. As notational convention, we usually write match-rules up front in code examples. Chapter~\rref{chapt.side-effect} describes the evaluation of clauses as a generalization of productions. Chapter~\rref{chapt.iterative} describes an implementation in Haskell. Also, Chapter~\rref{chapt.breadthfirst} shows how to evaluate clauses simultaneously. In comparison to Conditional Attribute Grammars~\citep{DBLP:journals/toplas/Boyland96}, or conditionally defined rules in general, clauses allow us to define a condition for multiple attributes and also children.

\paragraph{Multi-attribute dispatch.}
Clauses provide a convenient way to describe the structure of a derivation tree when the structure of the tree depends on the values of attributes. For example, to prove type equality, the structure of the derivation tree is determined by two attributes that represent the types in question.
\begin{figure}[tb]
\begin{code}
grammar Eq ^^ prod Check -- multiple clauses below
itf Eq visit check -- checks type equality
  inh tp1 :: Type
  inh tp2 :: Type
  syn errs :: Errs -- outcome of check
sem Eq prod Check visit check -- semantics of |Check| prod
  default errs = concat -- collect type errors
  clause twoInts -- when |tp1| and |tp2| are |Int|s
    match Ty_Int = lhs.tp1
    match Ty_Int = lhs.tp2
  clause twoArrs -- |tp1| and |tp2| both arrow-types
    match (Ty_Arr loc.a loc.b) = lhs.tp1 -- tests if |lhs.tp1| is an arr
    match (Ty_Arr loc.c loc.d) = lhs.tp2 -- tests if |lhs.tp2| is an arr
    child u1 : Eq = sem_Check -- recursion on both arg-types
    child u2 : Eq = sem_Check -- recursion on both res-types
    u1.tp1 = loc.a; ^^^ u1.tp2 = loc.c -- definitions of inh attrs of |u1|
    u2.tp1 = loc.b; ^^^ u2.tp2 = loc.d -- definitions of inh attrs of |u2|
  clause mismatch -- catch-all clause
    lhs.errs = [Err_Mismatch lhs.tp1 lhs.tp2] -- error for each mismatch
\end{code}
\caption{Matching example.}
\label{fig.outline.matchingexample}
\end{figure}
The example in Figure~\ref{fig.outline.matchingexample} shows a first-order matching algorithm for the construction of a derivation tree for an equality judgment. Given two attributes |tp1| and |tp2| whose values represent types (in some object language), we match \emph{pointwise} against the structure of these types. The value of such an attribute is either the integer type constructor or a function type constructor. During evaluation, the derivation tree is constructed up to the points where the types match. The result of evaluation is an attribute |errs| that contains an error message for each type mismatch. The production |Check| does not declare any terminal or nonterminal symbols. The example relies on higher-order children and clauses instead.

\paragraph{Iteration.}
A judgment |R (many p)| can be seen as a constraint |R| between parameters |many p| where |R| is a relation.
Fixpoint iteration is often employed to gradually construct a solution to a set of such constraints. In Figure~\ref{fig.outline.examplesubset} we show how to encode fixpoint iteration in AGs by iterating visits. We use some extensions of previous sections and Section~\rref{intro.sect.copyrules} to keep the description concise. We explain some aspects of the example below. We first give a grammar for a constraint language: a sentence in this language is a list of subset constraints (|a `subset` b|) on some symbols (|a|, |b|) that represent integer sets. Given the list of constraints and an initial mapping |env| from symbol to integer set, we describe an algorithm that refines the mapping until all constraints are satisfied. The nonterminal |Constr| represents a subset constraint and the nonterminal |Constrs| a list of such constraints. \begin{figure}[tb] \begin{code} grammar Top ^^ prod Top ^^ nonterm root : Constrs -- root symbol grammar Constrs = [Constr] -- shorthand for cons-list grammar Constr ^^ prod Subset ^^ term a,b :: Ident -- subset constraint itf Constrs Constr -- shorthand notation visit solve fixed -- a \emph{fixed} visit (explained below) chn env :: Map Ident IntSet -- chained attribute syn changed :: Bool -- |True| if |env| changed default env = head -- default rule for |env| default changed = or -- default rule for |changed| sem Constr prod Subset -- approximation of |loc.newVal| loc.bVal = lookupWithDefault emptyset loc.b lhs.env loc.aVal = lookupWithDefault emptyset loc.a lhs.env loc.newVal = loc.aVal `union` loc.bVal lhs.env = insert loc.b loc.newVal lhs.env lhs.changed = loc.newVal /= loc.bVal \end{code} \caption{AG for solving subset constraints.} \label{fig.outline.examplesubset} \end{figure} Secondly, we show how to refine the mapping for a single constraint, then show further below how to iterate over such a list of constraints.
The rules in Figure~\ref{fig.outline.examplesubset} describe how the new |env| is computed from the initial |env| in a single iteration. The attribute |changed| is |True| if and only if the mapping was changed. Finally, we work below towards a specification of iteration for a list of constraints. The semantics of nonterminal |Constrs| is fully determined by the default rules: we thus specify iteration as part of the semantics of the Top-production, for which we introduce some additional annotation. An invoke-rule may be annotated with strategies |z| as we saw earlier. We introduce two new strategies: \emph{oneshot} and \emph{iterate}. By default an invoke-rule is (implicitly) annotated as oneshot, which means that the visit is invoked at most once. However, when the annotation is \emph{iterate}, then the visit may be repeated multiple times: \begin{code} r ::= invoke x of (many c) z -- annotated invoke-rule z ::= oneshot -- by default (implicit) | iterate e -- repetitive invocation again = Just :: Inp N x -> Maybe (Inp N x) -- API function stop = Nothing :: Maybe (Inp N x) -- API function \end{code} The expression |e| is a function that takes two parameters. The first parameter is the set of values for the inherited attributes of the last iteration, and the second parameter is the set of values for the synthesized attributes resulting from that iteration. The result of the function describes whether the visit is repeated. If the result is produced using the constant |stop|, then the visit is not repeated. If, however, the result is produced using the function |again|, which takes as parameter the set of inherited attribute values that are used for the next iteration, then the visit is repeated with those values.
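The contract of |again| and |stop| can be mimicked in plain Haskell by a generic driver that re-invokes a visit function as long as the decision function asks for another round. The sketch below is ours, not part of RulerCore: |iterateVisit|, |onePass|, and |solve| are hypothetical names, and |Maybe| stands in for the |again|/|stop| result, applied here to the subset-constraint example of the previous figure.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- generic driver: |decide inp out| returns |Just inp'| (again) or |Nothing| (stop)
iterateVisit :: (inp -> out) -> (inp -> out -> Maybe inp) -> inp -> out
iterateVisit visit decide = go
  where go inp = let out = visit inp
                 in maybe out go (decide inp out)

-- one solve pass over subset constraints |(a, b)| meaning "a subset of b",
-- threading the environment like the chained |env| attribute
type Env = Map.Map String (Set.Set Int)

onePass :: [(String, String)] -> Env -> (Env, Bool)
onePass cs env0 = foldl refine (env0, False) cs
  where
    refine (env, changed) (a, b) =
      let aVal   = Map.findWithDefault Set.empty a env
          bVal   = Map.findWithDefault Set.empty b env
          newVal = aVal `Set.union` bVal
      in (Map.insert b newVal env, changed || newVal /= bVal)

-- iterate until a pass reports no change, as the Top-production does
solve :: [(String, String)] -> Env -> Env
solve cs = fst . iterateVisit (onePass cs) decide
  where decide _ (env, changed) = if changed then Just env else Nothing
```

For instance, solving the constraints |[("a","b"),("b","c")]| starting from the mapping |a = {1}, b = {2}| propagates until |b| and |c| both equal |{1,2}|.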
We apply this strategy to repeat the solve-visit on lists of constraints until a fixpoint is reached for the environment, which is the case when the attribute |changed| is |False| at the end of an iteration: \begin{code} sem Top prod Top -- semantics of the root ainh.root.env = lhs.initialEnv -- env for the first iteration invoke solve of root iterate ^^ \inp outp -> -- iterative invoke strategy if changed outp -- query attribute |changed| then again (inp { env = env outp }) -- repeat with updated |env| else stop -- stop iterations lhs.result = asyn.root.env -- takes result of last iteration \end{code} The values of the attributes are stored in a record for the inherited and synthesized attributes of a visit\footnote{ In case of the generalization to phases of the previous section, an association of attributes to phases may be determined automatically when such an association is not manually given. However, when this happens, it is unclear which attributes are present in the record. Therefore, to be able to iterate a phase, we require the phase declaration to be annotated as \emph{fixed}, which disallows the automatic scheduling of attributes to the phase. }. The labels are an encoding of the name of the attribute. As a constraint solving strategy we may be interested in results of previous iterations. To keep a local state per node we introduce visit-local chained attributes, so that the notation for visit-declarations becomes: \begin{code} t ::= visit x (many (chn y :: ty)) (many r) t -- the type |ty| is optional \end{code} Note that the name of a visit may not clash with the name of a child, and |y| must be unique with respect to all visit-local attributes of a production.
The name |y| in the attribute declaration denotes four attributes that are local to the production: \begin{center} \begin{tabular}{llll} attribute & meaning & scheduling & notation \\ \hline |ainh.x.y| & initial value of |ainh.vis.y| & outside visit |x| & |ainh| is a keyword \\ |asyn.x.y| & last value of |asyn.vis.y| & outside visit |x| & |asyn| is a keyword \\ |ainh.vis.y| & input to visit & inside visit |x| & |ainh| and |vis| are keywords \\ |asyn.vis.y| & result of visit & inside visit |x| & |asyn| and |vis| are keywords \\ \end{tabular} \end{center} In Figure~\ref{fig.outline.agweakening}, we express that if the number of iterations reaches a threshold of |5|, then the result is weakened by enlarging it to the top-value in our set-lattice. This approach enforces convergence. The top-value is provided as attribute |lhs.topVal|. Note that we specified the rules outside the |solve| visit-block. The rule scheduling moves these rules to the appropriate block. \begin{figure}[tb] \begin{code} itf Constr -- more visits visit initial ^^ inh topVal :: IntSet -- value of top element visit solve fixed ^^ chn ... -- as defined above visit generate ^^ syn output :: (IntSet,IntSet) -- outcome of solving sem Constr prod Subset lhs.output = (loc.aVal, loc.newVal) -- the computation for the result ainh.solve.reps = 0 -- initial val of vis chained attr asyn.vis.reps = ainh.vis.reps + 1 -- increment ... loc.newVal = loc.aVal `union` loc.bVal `union` loc.cVal loc.cVal = if ainh.vis.reps >= 5 then lhs.topVal else emptyset visit solve chn reps :: Int -- visit local chained attribute \end{code} \caption{Example of weakening.} \label{fig.outline.agweakening} \end{figure} In contrast to conventional fixpoint evaluators for AGs, we precisely specify the iteration points, may perform fixpoint iteration over multiple attributes, and keep (purely functional) state between iterations.
We may even construct children as part of a fixed visit, although to prevent constructing children over-and-over again, we may need to store the state of a child as part of the iteration state. We can accomplish this by detaching and attaching children. \paragraph{First-class children.} Constraints are used in inference algorithms to delegate a proof obligation to a different location in the tree. Constraints are typically used to delay a proof until all constraints in a given scope are collected. In an AG, proof obligations can be encoded as a visit on a child that still needs to be invoked. Invoking the visit corresponds to constructing the proof. In the model as presented so far, we statically know the configuration of a child's state at each point during evaluation of the node. We present rules to detach and attach children that are in a given configuration, which permit us to treat children as first-class values, and thus as constraints: \begin{code} p = detach x of c -- detaches |c| in the state that visit |x| is pending attach x of c = e -- attaches |e| as |c| in the state that visit |x| is pending \end{code} The attach-rule is a generalization of the child-rule that specifies that visit |x| and later are accessible on child |c|. The detach-rule specifies that visit |x| and later are not accessible on child |c| and provides a value that represents the child prior to the invocation of |x|. These two rules may be used in conjunction; however, to prevent conflicts, only one attach-rule or detach-rule is allowed per combination of child and visit. A detached child may be attached at a different location in the tree and visited as part of the evaluation for that location, or be visited through a wrapper function in the host language as part of an external solving algorithm.
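The essence of detaching can be sketched in Haskell by representing a child on which a visit is pending as its residual visit function: such a value can be gathered in a collection, transported elsewhere, and invoked there. All types and names below are hypothetical simplifications of ours, not RulerCore's actual encoding.

```haskell
import qualified Data.IntMap as IntMap

-- hypothetical: the state of a child whose |solve| visit is pending,
-- modelled as the residual visit function from inherited to synthesized attrs
type Env           = IntMap.IntMap Int
type SemCnstrSolve = Env -> (Env, Bool)

-- detach: each child becomes an ordinary value, gathered in a map by node id
gathered :: IntMap.IntMap SemCnstrSolve
gathered = IntMap.fromList
  [ (0, \env -> (IntMap.insert 7 1 env, True))   -- a child that refines |env|
  , (1, \env -> (env, False)) ]                  -- a child that is already stable

-- attach elsewhere: invoke the pending visit of every detached child
solveAll :: Env -> IntMap.IntMap (Env, Bool)
solveAll env = IntMap.map ($ env) gathered
```

The point of the sketch is only that a detached child is a plain value of the host language; the actual representation additionally carries the child's visit state.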
\begin{figure}[tb] \begin{code} itf Expr -- some example visit gather syn gathCnstrs :: IntMap Sem_Cnstrs_solve -- used to gather detached children visit distribute inh distCnstrs :: IntMap Sem_Cnstrs_generate -- used to distribute children syn transformed :: Expr sem Expr prod Var child c : Constr = sem_Subset loc.nm lhs.parentNm c.topVal = lhs.topVal -- for first visit to |c| lhs.gathCnstrs = singleton loc.nodeId (detach solve of c) attach generate of c = lookup loc.nodeId lhs.distCnstrs lhs.transformed = Expr_Const $ fst $ c.output -- based on the last visit of |c| \end{code} \caption{Example of child detachment.} \label{fig.outline.detachexample} \end{figure} Figure~\ref{fig.outline.detachexample} shows an example of how a child can be detached. We collect the detached children in an attribute |gathCnstrs| as constraints. These constraints are solved elsewhere by invoking the |solve| visit on them, then transferred back as attribute |distCnstrs| and attached again. The type |Sem_Cnstrs_solve| is the type of a detached child prior to the invocation of |solve|. At another location in the tree, we may attach the collected constraints and apply the iteration technique from above to solve them. This approach has the advantage that we can easily transport context information from the node that defines the constraint to the location where we solve the constraint, and vice versa. \citet{ariem-ext-graph} gives additional examples of this technique. Moreover, the dependency analysis provides define-before-use guarantees. Chapter~\rref{chapt.dependent-ags} describes how dependent types can be used to prove that a detached subtree is attached at precisely one other location in the tree. \paragraph{Remarks.} By making visits explicit we gained the ability to describe evaluation strategies by annotating the callee (visit declarations) or the caller (invoke-rules). Clauses offer a means to specify alternatives.
Children in a given state are first class and can be passed around to describe complex traversals. The extensions preserve the attractive properties of AGs, such as automatic rule scheduling and purely functional descriptions. Also, the implementation in the host language is purely functional. In Chapter~\rref{chapt.iterative} we present a large example, and give a translation of the notation to Haskell code. We show in \citet{ariem-ext-graph} that we can also describe complex traversals over trees and even graph structures. \section{Case Study with GADTs} \label{sect.outline.gadts} Constructors of an algebraic data type specify how a value of the data type is structured. A data type may be parameterized. Generalized Algebraic Data Types (GADTs)~\citep{Cheney03first-classphantom} associate per data constructor a set of type equivalences between the parameters of the data type. When building a value using a GADT constructor, and thus specifying how the parameters are instantiated, the type equivalences must be satisfied. In the scope of a successful pattern match against a GADT data constructor, the type equivalences may be assumed to hold and can be used to \emph{refine} or safely \emph{coerce} types. In the extended edition of this thesis, we present a type system for GADTs as a case study for several reasons~\citep{ariem-ext-gadt}. Firstly, a type system for GADTs poses additional challenges to a description of a type inference algorithm compared to a conventional DHM-style inference algorithm (Section~\tref{sect.intro.dhm}, Section~\tref{intro.sect.ag.dhm}), which gives insight into which features our meta-language for type systems needs to support. Secondly, we investigate the description of GADTs as a minimalistic type system \emph{extension}. Moreover, we make extensive use of GADTs in this thesis, so this chapter can also serve as an explanation of GADTs. In this section, we take a simplified subset of the actual type system: equality proofs.
We first give a specification, then look at properties of an inference algorithm, and consider a description of the algorithm with attribute grammars. \paragraph{Specification.}
%format ex1
%format ex2
%format ex3
GADTs are typically used as type indices. In the example of Figure~\ref{fig.outline.gadtindex}, the type |Index a| is a first-class description of the type |a|. By pattern matching on the description we reconstruct what the concrete type was to which |a| was instantiated. The expression |TBool| must be of type |Index Bool|, thus the |a| of |append TBool| must be |Bool|. Consequently, |ex2| and |ex3| are ill-typed. \begin{figure}[tb] \begin{code} data Index a where -- data type written using GADT notation TInt :: Index Int -- parameterless constructor with |a == Int| TBool :: Index Bool -- parameterless constructor with |a == Bool| append :: Index a -> a -> a -> a append TInt = (+) -- coercion of |Int| to |a| append TBool = (&&) -- coercion of |Bool| to |a| ex1 = (append TInt 1 2, append TBool True False) -- OK: (3, False) ex2 = append TInt 1 2 + append TBool True False -- type error (|Int != Bool|) ex3 = append TBool 1 2 -- type error (|Int != Bool|) \end{code} \caption{Example of a GADT as type index.} \label{fig.outline.gadtindex} \end{figure} In the above example, there exists a type equality assumption |a ~ Int| in the context of having matched against the |TInt| constructor. The assumption is used to prove that |Int -> Int| can be coerced into |a -> a|. The actual facilities that we need to reason with GADTs are the introduction of type equality assumptions in a scope, and equality reasoning on types. These facilities are orthogonal to the actual treatment of algebraic data types. To be able to describe GADTs as a separate aspect of a type system, it is thus desirable to separate these facilities. In the above example, we used simple types constructed by type arrows and type constants.
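Incidentally, equality reasoning itself has a direct Haskell counterpart: type equality can be reified as a GADT witness, with the closure properties of the relation (reflexivity, symmetry, transitivity, congruence, and arrow decomposition) as total functions on witnesses. The sketch below uses our own names and is not part of the specification:

```haskell
{-# LANGUAGE GADTs #-}

-- A witness of type equality: |Refl| is the only way to build one.
data Equal a b where
  Refl :: Equal a a

sym :: Equal a b -> Equal b a
sym Refl = Refl

trans :: Equal a b -> Equal b c -> Equal a c
trans Refl Refl = Refl

-- congruence on arrow types
congr :: Equal a b -> Equal c d -> Equal (a -> c) (b -> d)
congr Refl Refl = Refl

-- decomposition: GHC knows the arrow constructor is injective
subL :: Equal (a -> c) (b -> d) -> Equal a b
subL Refl = Refl

-- a proven equality safely coerces values, as |append| does above
cast :: Equal a b -> a -> b
cast Refl x = x
```

A proof by assumption has no analogue in this sketch: assumptions correspond to witnesses passed in at runtime, which is what an environment of equality proofs would provide.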
In the specification we use the following grammar for types and environments containing equality assumptions: \begin{code} tau ::= a | Int | Bool | tau1 -> tau2 -- types, also: |rho|, |omega|, and |sigma| gam ::= emptyset | gam, (tau1 ~ tau2) -- environment containing type equality assumptions \end{code} The type equality relation is used to reason with the equality between types. \begin{figure}[tb] \begin{center} \begin{mathpar} \fbox{|gam :- tau == rho|}\\ \inferrule*[right=refl] {} {|gam :- tau==tau|} \inferrule*[right=sym] {|gam :- tau == rho|} {|gam :- rho == tau|} \inferrule*[right=trans] {|gam :- tau == rho| \\\\ |gam :- rho == sigma| \\ } {|gam :- tau == sigma|} \inferrule*[right=assum] {|(tau ~ rho) `elem` gam| \\ } {|gam :- tau == rho|} \inferrule*[right=congr] {|gam :- tau == rho| \\ |gam :- sigma == omega| \\ } {|gam :- tau -> rho == sigma -> omega|} \inferrule*[right=sub.l] {|gam :- tau -> rho == sigma -> omega|} {|gam :- tau == sigma|} \inferrule*[right=sub.r] {|gam :- tau -> rho == sigma -> omega|} {|gam :- rho == omega|} \end{mathpar} \end{center} \caption{Rules of the type equality relation.} \label{fig.outline.equalityrules} \end{figure} Given an environment |gam| that consists of the type equality assumptions, the inference rules in Figure~\ref{fig.outline.equalityrules} describe the type equality relation. The first three rules are properties that any equality relation is supposed to exhibit. In addition, the rule \TirName{assum} expresses a proof by assumption, and the remaining three are related to congruence and subsumption properties derived from the structure of types. See \citet{ariem-ext-gadt} for some exemplary proofs for judgments of this relation. \paragraph{Forward and backward chaining.} The above rules are not straightforwardly mapped to an inference algorithm. The rule \TirName{sym} can always be applied, thus some condition is needed to determine when not to apply this rule.
In Section~\rref{sect.intro.dhm} we described properties of the type rules of the DHM type system that permit an attractive implementation in the form of algorithm W. Similarly, we apply domain knowledge here to impose conditions on the above rules so that problematic derivation trees are avoided or do not have to be considered by the algorithm. For example, we require that a derivation tree for a judgment |a == b| may not contain (indirectly) a child for the same judgment, as the proof would then be circular. This constraint ensures that the number of applications of the sym-rule is bounded. To explain why the other rules are not straightforwardly mapped to an inference algorithm, we mention first that there are two ways of reasoning with inference rules~\citep{Russell:1996:AIM:231005}. With \emph{forward chaining}, inference starts with assumptions and derives conclusions. With \emph{backward chaining}, inference starts at conclusions and tries to prove premisses until they can be discharged by assumptions. Inference algorithms as discussed so far use a limited form of backward chaining. Backward chaining is suitable when a conclusion can be decomposed into smaller premisses, which is indeed the case for the rules \TirName{refl}, \TirName{assum} and \TirName{congr}, and also for the rule \TirName{sym} with the aforementioned restriction. This is not the case for the rules \TirName{trans}, \TirName{sub.l}, and \TirName{sub.r}: these contain one or more meta variables in their premisses that are not fixed by their conclusion judgments. As a consequence, arbitrary (infinite) branches can be introduced in the derivation tree by applying these rules. Forward chaining is suitable when premisses can be decomposed into smaller conclusions. This is the case for all rules except \TirName{refl} and \TirName{congr}. To deal with all rules, we use a combination of backward and forward chaining by distinguishing proof obligations and proven facts.
Rule \TirName{congr} may only be applied on a proof obligation, whereas the rules \TirName{sub.l} and \TirName{sub.r} may only be applied on proven facts. The rule \TirName{trans} may only be applied if one of the premisses is a proven fact. As part of the case study in \citet{ariem-ext-gadt}, we implemented a solver for equality constraints in UHC using an implementation of constraint handling rules~\citep{chr98} that provides forward chaining and can emulate backward chaining~\citep{chrclass}. \paragraph{Lookahead.} Forward chaining can be implemented with backward chaining by defining a reduction relation on environments which keeps track of derived facts. We thus concern ourselves in the remainder of this section with the implementation of backward chaining using attribute grammars. To implement backward chaining, we use clauses to represent the various alternatives. However, the rule \TirName{trans} poses an additional challenge: a choice made for the left premise has consequences for the right premise. To express that a clause may only be selected if the remaining evaluation in a given context (the remainder-context) does not fail, we introduce two more strategy annotations: the lookahead-strategy and the onlylocal-strategy, which serve as annotations for visit-blocks and invoke-rules: \begin{code} t ::= visit x z (many k) -- visit-block as presented before r ::= invoke x of (many c) z -- invoke-rule as presented before z ::= onlylocal -- does not take the remainder-context into account | lookahead -- takes the remainder-context into account \end{code} The onlylocal-strategy is the default. The remainder-context is a runtime property that can be influenced by invoke-rules. When an invoke-rule is evaluated and it is annotated with the lookahead-annotation, the remaining evaluation in the parent's remainder-context contributes to the remainder-context of the child. Otherwise, the invoke-rule behaves as a cut-operator which fixes the choices of clauses made by the child.
The partial-strategy and total-strategy as mentioned before are orthogonal to the onlylocal-strategy and lookahead-strategy. \begin{figure}[tb] \begin{multicols}{2} \begin{code} grammar Eq ^^ prod Check itf Eq ^^ visit check partial inh env :: Set (Ty,Ty) inh tp1, tp2 :: Ty sem Eq prod ^^ Check ^^ visit check lookahead -- visit annotation default env -- rule in scope visit invoke check lookahead parent clause refl child u : Unify = sem_Unif u.tp1 = lhs.tp1 u.tp2 = lhs.tp2 clause sym child flipped : Eq = sem_Check flipped.tp1 = lhs.tp2 flipped.tp2 = lhs.tp1 clause trans child fr : Fresh = sem_Fresh child left : Eq = sem_Check child right : Eq = sem_Check left.tp1 = lhs.tp1 left.tp2 = fr.tp right.tp1 = fr.tp right.tp2 = lhs.tp2 clause assum match (u1.tp2, u2.tp2) <- do ahead $$ \k -> some $$ map k $$ elems lhs.env child u1,u2 : Unify = sem_Unif u1.tp1 = lhs.tp1 u2.tp1 = lhs.tp2 \end{code} \end{multicols} \caption{Example of an AG that represents an equality solver.} \label{fig.outline.equalitysolver} \end{figure} The example in Figure~\ref{fig.outline.equalitysolver} serves as illustration and we explain it below. For brevity, we left out details of the description that are related to the prevention of infinite derivations, the administration of substitutions, and the reuse of prior derivation trees. Moreover, we omitted attributes and rules to construct coercion terms from such a derivation tree. Such topics are discussed in Section~\rref{sect.intro.exploitit}. Only one production is declared for the Eq-nonterminal. The clauses in combination with higher-order children determine the structure of the equality proof. The example features a monadic match-rule. The right-hand side of this rule is a monadic expression that determines the value to match against. In clause |assum|, we take a type equality assumption from the environment. There may be multiple such assumptions in the environment.
We derive from these assumptions a monadic expression that explores the possibilities one after the other and selects the first one that succeeds. Via |ahead| (explained below), we get a continuation |k| that expects a value for the pair and performs the remaining computations for the current context. The function |some| is defined below. It selects the first computation that succeeds. We saw above how to express backward chaining with clauses in combination with lookahead. In Section~\ref{intro.outline.sideeffect} we mentioned that the AG can be expressed as a monad. First we show an implementation of lookahead by using a backtracking monad, then show how clauses can be mapped to this monad. \paragraph{Backtracking monad.} We wrap the actual underlying monad |m| into a monad transformer |BackT| that consists of a composition of the continuation transformer on top of the error transformer. The continuation monad transformer provides a continuation, and via the error monad transformer a failing computation can be observed~\citep{Jones:1995:FPO:647698.734150}. The result type of the continuation is the parameter |r|: \begin{code} data Back = Back -- backtrack message type BackT r m a = ContT r (ErrorT Back m) a -- transformer \end{code} Backtrack points are specified using the operator |(<||>)| which represents local choice. It selects its right argument if and only if the evaluation of the left argument fails. Alternatively, the operator |(<+>)| represents a global choice, which takes the continuation of the parent of the choice into account: \begin{code} (<|>) :: Monad m => BackT a m a -> BackT a m a -> BackT r m a p <|> q = ContT (\c -> catchError (cut p) (const (cut q)) >>= c) (<+>) :: Monad m => BackT r m a -> BackT r m a -> BackT r m a p <+> q = ContT (\c -> catchError (runContT p c) (const (runContT q c))) cut p = runContT p return msum = foldr (<|>) (fail "backtrack") resolve p = ContT (\c -> cut p >>= c) \end{code} The function |resolve| limits the continuation.
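The distinction between local and global choice, and the way clause |assum| uses a reified continuation, can be demonstrated on a minimal, self-contained continuation monad; since |ErrorT| and the exact transformer stack are inessential here, we substitute |Either String| for them. All names in this sketch (|Back|, |localOr|, |globalOr|, |pick|, and so on) are ours:

```haskell
import Control.Monad (ap, liftM)

-- minimal stand-in for |BackT|: a continuation monad over |Either String|
newtype Back r a = Back { runBack :: (a -> Either String r) -> Either String r }

instance Functor (Back r)     where fmap = liftM
instance Applicative (Back r) where pure x = Back ($ x); (<*>) = ap
instance Monad (Back r) where
  Back p >>= f = Back (\c -> p (\a -> runBack (f a) c))

failure :: String -> Back r a
failure msg = Back (const (Left msg))

cut :: Back a a -> Either String a
cut p = runBack p Right

-- local choice: decides between |p| and |q| without consulting the continuation
localOr :: Back a a -> Back a a -> Back r a
localOr p q = Back (\c -> either (const (cut q)) Right (cut p) >>= c)

-- global choice: retries |q| when |p| fails anywhere in the continuation
globalOr :: Back r a -> Back r a -> Back r a
globalOr p q = Back (\c -> either (const (runBack q c)) Right (runBack p c))

-- |ahead| reifies the pending computation as a function |k|
ahead :: ((a -> Back r r) -> Back r r) -> Back r a
ahead f = Back (\c -> runBack (f (\a -> Back (\k -> c a >>= k))) Right)

-- first alternative whose whole remaining computation succeeds
some :: [Back r r] -> Back r r
some = foldr globalOr (failure "no alternative")

-- clause |assum| style: keep the first candidate accepted by the continuation
pick :: [Int] -> Back r Int
pick xs = ahead (\k -> some (map k xs))

-- a continuation that only accepts even numbers
keepEven :: Int -> Back r Int
keepEven x = if even x then pure x else failure "odd"
```

Here |cut (localOr (pure 1) (pure 2) >>= keepEven)| fails: the local choice commits to |1| before the continuation rejects it. With |globalOr| or |pick [1,2,3]| the rejection backtracks into the choice, and |2| is selected.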
The function |ahead| exposes the continuation to the higher-order function |f|: \begin{code} ahead :: Monad m => ((a -> ContT r m r) -> ContT r m r) -> ContT r m a ahead f = ContT (\c -> runContT (f (\a -> ContT (\k -> c a >>= k))) return) p <+> q = ahead (\k -> (p >>= k) <|> (q >>= k)) -- alternative implementation \end{code} The function |ahead| provides the ability to explore different values for the continuation, and make choices based on the outcome of the continuation. We show in Chapter~\rref{chapt.breadthfirst} how to extend this mechanism to make choices based on intermediate results that are computed in the continuation. In a continuation monad, a computation |BackT r m a| represents a computation for a value of type |r| with a pending computation that takes |a| to |r|. The function |f| in |ahead| takes the pending computation as parameter, and replaces the computation for |r| with a computation that immediately goes to |r|. |ahead f| can thus be understood as replacing the pending computation with (the computation produced by) |f|. \paragraph{Mapping of clauses to monads.} The evaluation algorithm for a clause is a monadic expression that computes values for the synthesized attributes of the visit. We thus define the body of a visit function as a sequence of these monadic expressions that are either combined with the global choice operator when the visit is annotated with the lookahead-annotation, or with the local choice operator when the visit has the default onlylocal-annotation. If an invoke-rule is not annotated with a lookahead-annotation, it applies |resolve| to the monadic expression of the child after applying the values for the inherited attributes. \paragraph{Remarks.} As mentioned in the previous section, clauses represent a search tree, which encodes alternative ways to compute the decorations of the tree. The exploration of these alternatives using the |BackT| monad is depth-first.
Chapter~\rref{chapt.breadthfirst} describes how to explore clauses in a breadth-first way, which may give a more balanced exploration. The overhead of such an exploration is the work that is performed on alternatives that are ultimately not selected. In practice, we preferably solve problems using a single pass traversal, or a fixpoint iteration. A search for a solution, however, cannot always be avoided, as is demonstrated by the GADT use-case. Moreover, the naive exploration of alternatives may be convenient for prototyping purposes. \section{Attribute Grammars with Stepwise Evaluation} \label{sect.outline.stepwise} Some type inference algorithms require an exploration of a forest of potential derivation trees. We can encode such a forest as a search tree that contains additional nodes which represent choices between derivations. In Chapter~\rref{chapt.breadthfirst} we present a library to describe such explorations of the search tree. \paragraph{Stepwise evaluation.} In the evaluation algorithms of Chapter~\rref{chapt.iterative}, clauses are explored one after the other. This approach corresponds to a depth-first exploration of alternatives. In Chapter~\rref{chapt.breadthfirst} we show how to evaluate clauses simultaneously, which corresponds to a breadth-first exploration of alternatives. A breadth-first exploration provides a balanced exploration of alternatives, which may be more efficient.
%% Also, the memory related to alternatives that fail can be reclaimed, whereas breadth-first
%% explorations keep the memory of root nodes live until the exploration is complete.
With statically ordered AG evaluation (Section~\rref{sect.intro.ag.eval}), the evaluation of an AG is a sequence of rule evaluations. In this section, we group a number of these rule evaluations together and call that a step. We represent the evaluation of a tree as a computation which can be asked to execute one step, and afterwards pauses and returns control back to the caller.
To decorate the tree, we provide a computation (the root-computation) at the root which takes the computation of the tree and repeatedly asks it to perform a step until the decorations are computed. \paragraph{Simultaneous exploration.} To explore alternatives, we mentioned in the previous section that we combine the computations of alternatives, for example using the |<+>|-operator. We now consider different ways to combine the computations of alternatives. We provide a computation (the choice-computation) that asks the alternatives to perform steps in an interleaved fashion. When an alternative succeeds, we replace the choice-computation with that alternative. When an alternative fails, we replace the choice-computation with the other alternative. When each alternative has performed one step, the choice-computation exposes one step to its parent choice-computation or the root. With this approach we obtain a breadth-first traversal of alternatives. In this section, we first describe how to write such an algorithm as a monad that represents a \emph{coroutine}, and how to specify what constitutes a step in this monad. Then, we show how this monad is used in RulerCore descriptions, and show an implementation of the monad. \paragraph{Coroutines.} A coroutine is a function that during its execution performs zero or more |yield| operations which denote re-entry points. A |yield| operation pauses the execution of the function and returns control to the caller. The caller may resume the execution of the callee from the point where it was paused. The callee may expose intermediate results to the caller and the caller may provide additional parameters when resuming the function. As a simplification, we initially assume that no results are exchanged between caller and callee. Visit functions are examples of coroutines (Section~\rref{sect.ag.eval.ordered}) that are invoked a statically fixed number of times.
The evaluation of a child pauses at the end of the visit, and proceeds with the evaluation of the next visit when the parent invokes the subsequent invoke-rule. In this section, however, we consider coroutines that, in addition to the statically fixed yields between visits, may yield a statically unbounded number of times during the execution of a visit. The evaluation up to the next yield is what we call a step. \begin{figure}[tb] \begin{code} yield :: Stepwise m () step :: Monad m => Stepwise m a -> m (Report m a) lift :: Monad m => m a -> Stepwise m a ahead :: (forall r . (a -> Stepwise m r) -> Stepwise m r) -> Stepwise m a data Report m a -- represents a progress report = Done a -- finished and produced a value |a| | Failed String -- failed with a given error message | Paused (Stepwise m a) -- paused with the residual computation \end{code} \caption{API of the Stepwise monad.} \label{fig.outline.stepwiseapi} \end{figure} We design a coroutine monad |Stepwise|, which represents a stepwise computation specialized for the exploration of alternatives. It supports a number of operations in addition to those of the |BackT| monad. Figure~\ref{fig.outline.stepwiseapi} shows the API. The operation |yield| pauses the execution and resumes the caller. The operation |step| runs the coroutine until the coroutine either fails or succeeds, or reaches the next yield instruction. The outcome of the evaluation is presented as a progress report in the encapsulated monad |m|. A yield-operation thus specifies what constitutes a single step. With |lift|, we wrap the effects of |m| into a stepwise computation so that these effects can be merged with the effects embedded in other stepwise computations. For example, we typically use |lift| to describe how the effects of a step-operation on a child are merged with some parent computation.
\begin{figure}[tb] \begin{code} p <+> q = ahead (\k -> (p >>= k) <|> (q >>= k)) p <|> q = do a <- lift (step p) -- perform a step for |p| b <- lift (step q) -- perform a step for |q| act a b -- inspect the outcomes act :: Report m a -> Report m a -> Stepwise m a act (Done a) _ = return a -- commit to finished |p| act _ (Done a) = return a -- commit to finished |q| act (Failed s) (Failed _) = fail s -- both fail act (Failed _) (Paused r) = r -- |p| fails, commit to |q| act (Paused r) (Failed _) = r -- |q| fails, commit to |p| act (Paused p') (Paused q') = yield >> (p' <|> q') -- pause, later continue with choice \end{code} \caption{Breadth-first choice combinators.} \label{fig.outline.choicecombinators} \end{figure} With the choice combinators, we define a computation that represents a traversal over a search tree. Each subtree encodes an alternative. With the above API, Figure~\ref{fig.outline.choicecombinators} shows a breadth-first version of the choice combinators. The traversal is breadth-first because when |act| reports a step to the caller, each of the non-failed children has performed one step. Iteration is encoded by replacing the choice-computation with a computation that calls the choice function again. Thus, when we commit to a certain alternative, we replace the choice-computation with the selected alternative, and thereby eliminate the choice.
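The commit-and-pause case analysis of |act|, and its contrast with the depth-first variant obtained from the |fullred| function shown next, can be mimicked on a toy resumption type, independently of the full Stepwise monad. The following sketch uses our own names (|Alt|, |orElse|, |leftBiased|, |pauses|, |run|) and omits the monadic effects:

```haskell
-- A toy resumption type standing in for a stepwise alternative:
-- finished, failed, or paused with a residual computation.
data Alt a = Finished a | Failure String | Step (Alt a)
  deriving (Eq, Show)

-- Breadth-first choice in the style of |act|: commit to a finished
-- alternative, drop a failed one, and otherwise pause after both
-- alternatives have taken one step.
orElse :: Alt a -> Alt a -> Alt a
orElse (Finished a) _            = Finished a
orElse _            (Finished b) = Finished b
orElse (Failure s)  (Failure _)  = Failure s
orElse (Failure _)  (Step q)     = q
orElse (Step p)     (Failure _)  = p
orElse (Step p)     (Step q)     = Step (p `orElse` q)

-- In the style of |fullred|: strip all steps, forcing full evaluation.
reduceFully :: Alt a -> Alt a
reduceFully (Step k) = reduceFully k
reduceFully r        = r

-- Depth-first, left-biased choice: explore the left alternative
-- exhaustively before considering the right one at all.
leftBiased :: Alt a -> Alt a -> Alt a
leftBiased p q = case reduceFully p of
  Finished a -> Finished a
  _          -> q

-- An alternative that pauses |n| times before finishing.
pauses :: Int -> a -> Alt a
pauses 0 x = Finished x
pauses n x = Step (pauses (n - 1) x)

-- Count the steps taken before an alternative resolves.
run :: Alt a -> (Int, Either String a)
run = go 0 where
  go n (Finished a) = (n, Right a)
  go n (Failure s)  = (n, Left s)
  go n (Step k)     = go (n + 1) k
```

With |orElse|, the alternative that finishes in the fewest steps wins, whereas |leftBiased| falls back to the right alternative only after the left one has been evaluated to completion and failed.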
\begin{figure}[tb] \begin{code} fullred :: Stepwise m a -> m (Stepwise m a) fullred p = do rep <- step p -- perform a step case rep of -- inspect report Paused r -> fullred r -- repeat after yield _ -> return $$ comp rep -- either |Failed| or |Done| comp :: Report m a -> Stepwise m a -- report to residual computation comp (Paused m) = m comp (Failed s) = fail s comp (Done v) = return v \end{code} \caption{Depth-first choice combinators.} \label{fig.outline.choicecombinators.depth} \end{figure} A depth-first version of the choice combinators (Figure~\ref{fig.outline.choicecombinators.depth}) is obtained by applying the function |fullred| to the left alternative. This function returns the computation with all steps stripped, thus forcing it to evaluate fully. The control we have over stepwise computations allows us to express a whole range of strategies, such as taking two steps left for each step right. The underlying monad |m| can be used to exchange information between the computation of an alternative and the choice between alternatives. For example, when |m| is a writer monad, an alternative can provide an estimate of the amount of work that has been performed. When |m| supports IO, the system time can be used to balance the two computations. Stepwise computations thus offer a means to describe powerful and complex exploration strategies. \paragraph{Children as stepwise computations.} We represent the evaluation of an AG as a monadic computation, which thus fits the Stepwise monad straightforwardly. In the remainder of this section, we show how to specify yield operations and how to express alternatives.
\begin{figure}[tb] \begin{code} itf Yield -- interface with neither visits nor attributes grammar Yield prod Yield -- single production sem Yield prod Yield -- empty semantics sem Expr prod Var visit check child y : Yield <- do -- monadic child rule yield -- monadic operation return sem_Yield -- semantics of child (trivial) \end{code} \caption{Yielding of steps expressed as nonterminal.} \label{fig.outline.yield} \end{figure} We do not need to introduce additional syntax to express yield operations, because monadic child rules have monadic right-hand sides and can therefore be used to express the yield operations. For example, we introduce a dummy nonterminal |Yield| in Figure~\ref{fig.outline.yield} and use it to specify a yield operation using a child rule in an example production |Var|. A monadic child rule is guaranteed to be evaluated in the visit it is constrained to. Thus, the right-hand side of the rule is evaluated during visit |check|. \paragraph{Encoding of alternatives.} To express alternatives, we can either explicitly encode a search tree using higher-order children, or use clauses. We first consider the encoding of a search tree. A node in a search tree may express a choice between its children. For this purpose we refine the notation introduced in Section~\ref{sect.intro.exploitit} to attach and detach children with an additional keyword (explained below). \begin{figure}[tb]
%format i1
%format i2
%format s1
\begin{code} sem Tree prod Alt -- semantics for a choice node child left : Tree = ... -- define child |left| child right : Tree = ... -- define child |right| left.i1 = ... -- definition of some attributes right.i1 = ...
loc.p = detach upon v of left -- evaluates |left| up to visit |v| loc.q = detach upon v of right -- evaluates |right| up to visit |v| attach upon v of res : N <- do -- attaches as child |res| loc.p <+> loc.q -- choice between children lhs.s = res.s1 -- use results of the chosen child \end{code} \caption{A sketch of a parallel exploration.} \label{fig.outline.sketchpar} \end{figure} We detach alternatives and attach a computation that determines the chosen alternative. The abstract example in Figure~\ref{fig.outline.sketchpar} provides a sketch. Recall that the semantics of a visit of a child is a function that takes values for inherited attributes and returns a monadic computation for the synthesized attributes of that visit. We are thus interested in the state of the children after they have received the values for the inherited attributes. We use the upon-keyword for this purpose. The detach-rule thus provides the monadic computation for the specified visit of a child, and the attach-rule runs the computation to obtain the synthesized values for that visit. \paragraph{Strategy for clauses.} Instead of explicitly encoding a search tree as above, we can also use clauses. To specify a choice between clauses, we present additional notation. A visit-block may be annotated with a select-strategy: \begin{code} t ::= visit x z (many c) -- existing syntax for visit-blocks z ::= onlylocal -- combine clauses with the local choice operator | lookahead -- combine clauses with the global choice operator | select e -- custom, |e| is e.g. a function |\c1 ... (sub c k) -> ...| \end{code} The select-strategy specifies a function |e| that takes a computation for each clause of the visit-block as parameter and provides a computation for the results of the visit. The computation for the results of a visit is |e c1 ... (sub c k)| where |c1, ..., (sub c k)| are the computations corresponding to each clause\footnote{ Several notational variants are possible.
It may sometimes be more convenient to obtain the computations of the clauses as a list or as a record with a field for each clause. }. \paragraph{Implementation.} In the remainder of the section we describe how intermediate results can be exposed to the selection function. However, we first show how the Stepwise-monad is implemented. The algorithm\footnote{ Complete Haskell module of the simplified implementation:\\ @https://svn.science.uu.nl/repos/project.ruler.papers/archive/RefStepwise.hs@ } that we show here is slightly simplified with respect to Chapter~\rref{chapt.breadthfirst} and its full understanding is not required for the remainder of this chapter. \begin{figure}[tb] \begin{code} data Stepwise m a where Return :: a -> Stepwise m a Bind :: Stepwise m a -> Parents m a b -> Stepwise m b Fail :: String -> Stepwise m a Yield :: Stepwise m () Lift :: m a -> Stepwise m a Ahead :: (forall r . (a -> Stepwise m r) -> Stepwise m r) -> Stepwise m a data Parents m a b where Root :: Parents m a a Pending :: (a -> Stepwise m b) -> Parents m b c -> Parents m a c instance Monad (Stepwise m) where return = Return m >>= f = Bind m (Pending f Root) \end{code} \caption{The structure of |Stepwise|.} \label{fig.outline.stepwise.data} \end{figure} We represent the Stepwise-monad in Figure~\ref{fig.outline.stepwise.data} as a computation that can be inspected (Section~\tref{intro.sect.correspondences}). The function |step| interprets the computation in order to evaluate it one step. The right-hand side of a bind contains a stack |Parents| of all the continuations to the right of a monadic expression. The expression |(m >>= f) >>= g| is represented as: \begin{code} Bind m (Pending f (Pending g Root)) \end{code} Since monadic binds are right-associative, the stack only grows when an expression occurs as the left-hand side of a monadic bind that expands to one or more binds.
This is the case when calling the visit function of a child, hence the stack contains the continuations of all parents up to the location where |step| is performed. The child that is undergoing evaluation is on top of the stack. \begin{figure}[tb] \begin{code} step :: Monad m => Stepwise m a -> m (Report m a) step m = reduce m Root reduce :: Monad m => Stepwise m a -> Parents m a b -> m (Report m b) reduce Yield r = return $$ Paused (r `apply` ()) reduce (Fail s) _ = return $$ Failed s reduce (Lift m) r = m >>= step . apply r reduce (Ahead f) r = step $$ f (apply r) reduce (Return v) Root = return $$ Done v reduce (Return v) (Pending f r) = reduce (f v) r reduce (Bind m r) r' = reduce m (push r r') apply :: Parents m a b -> a -> Stepwise m b apply r v = Bind (Return v) r push :: Parents m a b -> Parents m b c -> Parents m a c push r Root = r push Root r = r push (Pending f r') r = Pending f (push r' r) \end{code} \caption{The implementation of |reduce|.} \label{fig.outline.reduceimpl} \end{figure} The function |reduce| in Figure~\ref{fig.outline.reduceimpl} takes a computation |m| and a parents-stack. It evaluates |m| one step. If |m| yields or fails, it returns a progress report. Otherwise, it continues evaluating |m| until it obtains a result that can be fed into the parents-stack. If the parents-stack is empty, evaluation is finished. Otherwise, the parents-stack contains the continuation to proceed with. The function |step| delegates to |reduce| with an empty stack. The function |apply| turns a pending parent into a monadic computation by passing it the result it was waiting for. The function |push| concatenates two parents-stacks. \paragraph{Coordination.} We presented above how to specify selection strategies. To encode powerful coordination strategies, we improve the above approach so that computations can yield results and take arguments when resumed.
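The two figures above can be condensed into a small self-contained module to experiment with. In the following sketch we omit the |Ahead| constructor and replace the |Monad| instance by an explicit |bind|, to keep it minimal; the driver |countSteps| is ours, and runs a computation over the |Identity| monad while counting its progress reports:

```haskell
{-# LANGUAGE GADTs #-}
import Data.Functor.Identity (Identity (..))

data Report m a = Done a | Failed String | Paused (Stepwise m a)

data Stepwise m a where
  Return :: a -> Stepwise m a
  Bind   :: Stepwise m a -> Parents m a b -> Stepwise m b
  Fail   :: String -> Stepwise m a
  Yield  :: Stepwise m ()
  Lift   :: m a -> Stepwise m a

-- The stack of pending continuations of all parents.
data Parents m a b where
  Root    :: Parents m a a
  Pending :: (a -> Stepwise m b) -> Parents m b c -> Parents m a c

-- Explicit monadic bind (in the figure this is the |Monad| instance).
bind :: Stepwise m a -> (a -> Stepwise m b) -> Stepwise m b
bind m f = Bind m (Pending f Root)

step :: Monad m => Stepwise m a -> m (Report m a)
step m = reduce m Root

reduce :: Monad m => Stepwise m a -> Parents m a b -> m (Report m b)
reduce Yield      r             = return (Paused (apply r ()))
reduce (Fail s)   _             = return (Failed s)
reduce (Lift m)   r             = m >>= step . apply r
reduce (Return v) Root          = return (Done v)
reduce (Return v) (Pending f r) = reduce (f v) r
reduce (Bind m r) r'            = reduce m (push r r')

-- Feed a pending parent the value it was waiting for.
apply :: Parents m a b -> a -> Stepwise m b
apply r v = Bind (Return v) r

-- Concatenate two parents-stacks.
push :: Parents m a b -> Parents m b c -> Parents m a c
push Root           r = r
push (Pending f r') r = Pending f (push r' r)

-- Drive a computation to completion, counting the paused reports.
countSteps :: Stepwise Identity a -> (Int, Either String a)
countSteps = go 0
  where
    go n p = case runIdentity (step p) of
      Done a    -> (n, Right a)
      Failed s  -> (n, Left s)
      Paused p' -> go (n + 1) p'
```

A computation with two yields, such as |bind Yield (\_ -> bind Yield (\_ -> Return 7))|, produces exactly two paused reports before its final value, which matches the intuition that each yield marks one step.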
\begin{defi}[Tag] A \emph{tag} of type |Op i o| specifies the interface between the callee (the computation) and the caller (execution of |step|), where |i| is a type index that fixes the data-type that is used for tags (|Op i o| is a type family), and |o| is the type of the results exchanged between callee and caller. The computation yields results of the type |Inp o| and takes arguments of type |Out o| as parameters for the next visit. \end{defi} \begin{figure}[tb] \begin{code} action :: Op i o -> Inp o -> Stepwise m (Out o) data Report i m a -- refinement of the |Report| data type | forall o . ^^^ Paused ^^^ (Op i o) ^^^ (Inp o) ^^^ ((Stepwise m (Out o) -> Stepwise m a) -> Stepwise m a) data family Op i :: * -> * -- a tag of an operation (of some set indexed by |i|) type family Inp o :: * -- inputs to the operation |Op i o| type family Out o :: * -- resulting outputs of operation |Op i o| \end{code} \caption{Tags and type families for exchanging results between caller and callee.} \label{fig.outline.families} \end{figure} We refine the operation |yield| so that it takes a tag of type |Op i o| and intermediate results of type |Inp o| for the caller and provides arguments of type |Out o| as given by the caller when the callee is resumed. Figure~\ref{fig.outline.families} shows the encoding in Haskell with type families. For example, |action| can be used by a computation to report the number of open goals, and receive a priority rating from the caller.
In that case, we specify the following instances for the above type families: \begin{code} data Meta -- type index for the set of tags data OInfo -- type index for a particular tag data instance Op Meta o where -- declares the tags OpInfo :: Op Meta OInfo -- one tag type instance Inp OInfo = Int -- specification of inputs of operation |o| type instance Out OInfo = Int -- specification of results of operation |o| \end{code} The report-handling code in selection strategies may match on Yield-reports to obtain the |Op i o| and |Inp o| values, plus the continuation, which may be used to resume the computation when a computation is provided that produces the |Out o| values. This approach is powerful: arbitrary \emph{traps} to operations can be expressed this way. For example, it is possible to create tags that represent operations such as unification, generation of unique numbers, lookups in memo tables, and output to the console. The caller can determine what semantics to give to these operations. The construction offers an \emph{inversion of control}: the callee \emph{declaratively} specifies operations, and the caller determines the semantics. \paragraph{Memoization.} As mentioned in the GADT example, it may be desirable to cache results of subtrees and reuse these results at other locations in the tree. However, if the continuation (accessed using |ahead|) is used to distinguish the result of the computation, the results should not be cached, unless the continuation is the same for each context the shared computation appears in. Moreover, in the case of a stepwise computation, these results may not be available yet, and it may be desirable to share computations instead. A computation can be shared by storing it in an updatable state, and updating this state after each call to |step|. The computation may then each time receive the result of an action from a different context. Also, each context may receive only a portion of the actions yielded by the computation.
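The tag instances shown earlier in this section can be exercised in a self-contained sketch. The function |handle| below is a hypothetical caller-side handler of our own devising (the priority policy is purely for illustration); matching on the tag refines the type index, so the type families reduce to concrete types:

```haskell
{-# LANGUAGE GADTs, TypeFamilies #-}

data Meta    -- type index for the set of tags
data OInfo   -- type index for a particular tag

data family Op i :: * -> *
data instance Op Meta o where
  OpInfo :: Op Meta OInfo   -- the single tag of the |Meta| set

type family Inp o :: *
type family Out o :: *
type instance Inp OInfo = Int   -- callee reports the number of open goals
type instance Out OInfo = Int   -- caller answers with a priority rating

-- A hypothetical caller-side handler: matching on the |OpInfo| tag
-- refines |o| to |OInfo|, so |Inp o| and |Out o| both become |Int|.
-- The policy (many open goals means low priority) is invented here.
handle :: Op Meta o -> Inp o -> Out o
handle OpInfo openGoals = if openGoals > 5 then 0 else 1
```

A report-handler in a selection strategy would pattern match on a yielded tag in the same way, and feed the produced |Out o| value back through the continuation carried by the Paused-report.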
\paragraph{Remarks.}
%% Also note: Stepwise does not have the continuation parameter |r|. The reason is that
%% it's then not clear what the type of the continuation for |Report| is.
Chapter~\rref{chapt.breadthfirst} focuses on \emph{generators}, which are coroutines that only yield information but do not take parameters. We show how to evaluate such coroutines strictly or lazily. In the latter case, results can already be produced when they depend on a choice for which only one alternative is left. \section{Attribute Grammars with Dependent Types} \label{outline.dependent.types} In Chapter~\rref{chapt.dependent-ags} we investigate AGs where attributes may have a dependent type, which can be used to state and prove properties of the AG. For this purpose, we describe an embedding of AGs in Agda\footnote{ In this section, we deviate slightly from the actual syntax to have a closer correspondence with Haskell. }~\citep{bove09}. \emph{Dependent types} provide a means to use types to encode properties with the expressiveness of (higher-order) intuitionistic propositional logic, and terms to encode proofs. In this setting, a parameterized type constructor specifies a relation between its type parameters, and its data constructors form the inference rules of the relation. In a dependently typed AG, the type of an attribute may refer to values of attributes. The type of an attribute is an invariant, and the value of the attribute a proof of that invariant. Moreover, because of the Curry-Howard correspondence, dependently typed AGs are a domain-specific language for writing structurally inductive proofs in a \emph{composable}, \emph{aspect-oriented} fashion; each attribute represents a separate aspect of the proof. Some knowledge of dependent types and Agda is a prerequisite for this section. We first give an example and some notation. We follow up with an extension that permits the visits of a nonterminal to be organized as a tree instead of a totally ordered sequence.
\paragraph{Dependent attribute type.} The following interface declaration for some nonterminal |Pat| demonstrates attributes with a dependent type. The gathered environment |asyn.gathEnv| is a subset of the final environment |ainh.finEnv|, which is expressed as the type |asyn.gathEnv `subset` ainh.finEnv|. We assume the existence of a type constructor |`subset`| and several utility functions. The attribute |ainh.gathInFin| has the above type and therefore represents a proof of this property:
%format prv1
%format prv2
%format inj1
%format inj2
\begin{code} itf Pat visit analyze ^^^ syn gathEnv :: Env visit translate ^^^ inh finEnv :: Env inh gathInFin :: asyn.gathEnv `subset` ainh.finEnv grammar Pat ^^^ prod Var ^^^ term nm :: Ident \end{code} The |gathInFin| attribute provides the guarantee that elements that are in |asyn.gathEnv| are also in |ainh.finEnv|. Rules of productions of |Pat| may exploit this guarantee. The |lookup| of an identifier in the final environment may return |Left notIn| where |notIn| is a proof that the identifier is not in the environment, or |Right v| where |v| is the value of the identifier in the environment. In the following example, we define production |Var|, which has a terminal |loc.nm|, with a proof\footnote{ The functions |here| and |inSubset| are conventional dependently typed functions that construct the appropriate proofs. Their implementations are beyond the scope of this section.
} |loc.prv2| that the identifier is in the environment, and we use |loc.prv2| to prove that the lookup cannot return a Left-value: \begin{code} sem Pat prod Var loc.prv1 ^^ = here loc.nm asyn.lhs.gathEnv -- proof of |nm| in |gathEnv| loc.prv2 ^^ = inSubset lhs.gathInFin loc.prv1 -- proof of |nm| in |finEnv| loc.val ^^ case loc.nm `lookup` lhs.finEnv of -- defined by case distinction ^^ | Left notIn ^^^ falsum notIn loc.prv2 -- impossible case ^^ | Right v ^^^ -> v -- case that |loc.nm| is in |finEnv| \end{code} The case that the element is not in |lhs.finEnv| is in contradiction with |loc.prv2|. The application |notIn loc.prv2| has an uninhabitable type, which we use in combination with the falsum-case to terminate the branch without giving a definition. This example shows three ways to define an attribute: with a plain RHS, with case distinction, and with |falsum e| where host-language expression |e| has an uninhabitable type: \begin{code} r ::= p = e -- with plain RHS | p m -- with complex RHS m ::= falsum e -- unreachable case (|e| has an uninhabitable type) | case e of (many b) -- with case distinction, and cases |many b| b ::= rho -> e -- nested plain RHS in a case distinction | rho m -- nested complex RHS in a case distinction rho -- pattern in the host language (no attributes) \end{code} The additional syntax provides us with a means to give provable \emph{total} definitions of attributes. \paragraph{Cycle analysis and consistency.} Functions are required to be total in dependently typed programs for reasons of logical consistency and termination of type checking, which in the case of AGs corresponds to total definitions of attributes and the requirement that dependencies between attributes are acyclic. \paragraph{Partitions.} In Chapter~\rref{chapt.warren}, the ordering algorithm implicitly distinguishes different contexts in which a nonterminal is used. However, to ensure that attribute definitions are total, it may be convenient to make such contexts explicit.
In the following example, we specify code generation depending on the absence of errors. We define two contexts for the |generate| visit. The context |errorfree| provides an attribute |code|, but it may only be invoked when errors are absent. The context |haserrors| alternatively provides an attribute |pretty|, which contains an annotated pretty print of the program: \begin{code} itf Pat visit report ^^^ syn errors :: Errs ainh.finEnv visit generate -- a visit may consist of one or more partitions context errorfree -- a partition has a name inh noErrors :: asyn.errors == [] syn code :: Target ainh.finEnv context haserrors -- a partition may also contain subsequent visits syn pretty :: Doc \end{code} The caller invokes a visit on the callee, and is responsible for selecting which context it wants to use. The callee is required to produce results for that choice. The callee can encode restrictions on the available choices for the parent as inherited attributes. The caller must provide values for the inherited attributes of the partition it chooses. \begin{figure}[tb] \begin{code} sem Pat prod App lhs.errors = left.errors ++ right.errors -- collect errors context errorfree -- rules exclusive for |errorfree| invoke generate of left context errorfree -- specifies context to invoke invoke generate of right context errorfree -- specifies context to invoke left.noErrors = leftNil left.errors right.errors lhs.noErrors right.noErrors = rightNil left.errors right.errors lhs.noErrors lhs.code = left.code `apply` right.code context haserrors -- rules exclusive for |haserrors| invoke generate of left context haserrors invoke generate of right context haserrors lhs.pretty = left.pretty <+> right.pretty -- collect pretty print \end{code} \caption{Example of rules specified for a specific context.} \label{fig.outline.rulescontext} \end{figure} We specify a context as an annotation of the invoke-rule.
Moreover, we may specify rules for particular contexts as is demonstrated in Figure~\ref{fig.outline.rulescontext}. A special falsum-rule may be used to denote that a visit, clause or context is unreachable. \paragraph{Remarks.} Type attributes correspond to quantification. An inherited type attribute corresponds to universal quantification, since the caller can choose its instantiation. A synthesized type attribute corresponds to existential quantification. The callee can choose its type, but the caller cannot make an assumption about it. This mechanism allows us to deal with polymorphism in interfaces. Indeed, the above ideas allow quantification in an AG for Haskell to be expressed. In a dependently typed AG, attributes can represent both values and types. In Haskell, there is a clear distinction between values and types. In an AG for Haskell, we can make an explicit distinction between attributes that represent types (and specify a \emph{kind} as type) and attributes that represent values. The type of a type attribute may not refer to other attributes. The type of a value attribute, however, may refer to a type attribute. \section{Attribute Grammars on DAGs} \label{outline.otherapplications} In the extended edition, we included a relatively short chapter~\citep{ariem-ext-gadt} that provides examples of other ways to apply the techniques presented. There are two common data structures in compilers: trees and directed graphs. Ordered attribute grammars are suitable to define a semantics on trees, but not on graphs. The reason is that nodes in a graph may occur in different contexts at execution time, which makes a static dependency analysis difficult. In that chapter, we also show how our approach relates to (cyclic) reference attribute grammars. \section{Conclusion} We gave a detailed summary of the following chapters, and described how the chapters are connected together.
Also, this chapter showed the features of RulerCore in relation to conventional attribute grammars. A prototype implementation of RulerCore is available as the compiler @ruler-core@. Its implementation is based on higher-order attribute grammars and Haskell, and can be obtained from: \begin{verbatim} https://svn.science.uu.nl/repos/project.ruler.papers /archive/ruler-core-1.0.tar.gz \end{verbatim} The @examples@ subdirectory contains some minimalistic examples. A large example based on the HML type system~\citep{DBLP:conf/popl/Leijen09} can be obtained from: \begin{verbatim} https://svn.science.uu.nl/repos/project.UHC.pub /branches/tnfchris-hml/ \end{verbatim}