%include format.lhs
In Chapter~\tref{chapt.side-effect} we described a programming model based on visits as an extension of attribute grammars. Using this model, attribute declarations are not as easily composed as with conventional attribute grammars. In this chapter, we introduce phases as a generalization of visits, for which attribute declarations are composable. Further, we introduce commutable rules, which are rules that thread a chained attribute (Section~\tref{intro.sect.ag.dhm} and Section~\tref{intro.sect.copyrules}), with the difference that the rules in the chain can be reordered. To preserve referential transparency, the commutable rules must satisfy a liberal commutativity law. This chapter is orthogonal to the subsequent chapters. It presents the technical material that can be used to generalize the contents of those chapters.
\section{Introduction}
In this chapter, we show three related extensions of the explicit-visits approach of Chapter~\tref{chapt.side-effect} and of ordered attribute grammars in general.
\begin{defi}[Phase]
A phase is a generalization of a visit. It gives a name to a unit of evaluation for attributes associated with a nonterminal symbol.
\end{defi}
Firstly, instead of associating visit declarations with nonterminals as in the previous chapter, we associate sets of phase declarations with nonterminals in this chapter. We allow attribute declarations and rules to depend on the beginning of a phase or to be constrained by the end of a phase, and we impose a partial order on phases. When the value dependencies between attributes and rules and the dependencies induced by phases are acyclic, the attributes that are associated with an earlier phase are defined before any attributes that are associated with a later phase, which provides a way to specify the order of evaluation of rules.
Secondly, we use a statically ordered evaluation to compute values of attributes, and we present a scheduling algorithm in the style of \citet{DBLP:conf/popl/KennedyW76} which infers multiple visit interfaces from the phase interface of a nonterminal. For each visit interface, a production has a potentially different execution plan. An execution plan specifies an explicit ordering of rules per visit (Section~\tref{sect.treewalkautomaton}). Such a plan also specifies for each child which visit interface is used for its evaluation. In the generated code, semantic functions are indexed by the choice of the visit interface for which they provide a semantics, and we show how to encode this in a strongly typed, purely functional language. There may be many possible visit interfaces given the phase interface of a nonterminal. The constraints on rule and attribute ordering mentioned above can be used to significantly limit the number of possible visit interfaces.
Finally, we present commutable rules, which allow us to functionally encode side effects. A functional encoding of side effects as mentioned in Chapter~\tref{chapt.side-effect} can be accomplished with a chained attribute that is threaded through each operation, where an operation is a rule or higher-order child. For example, a substitution attribute captures the side effects that result from substitutions during type inference, and an attribute with type |RealWorld| captures arbitrary I/O. The attribute's threading determines the order in which the side effects are observable in the values of the attribute.
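To illustrate what such a threaded encoding looks like in the host language, the following minimal sketch threads a counter through a tree to hand out unique numbers. The sketch is ours and not part of any AG description or generated code; the names |UTree| and |number| are hypothetical, and the counter plays the role of the chained attribute.
\begin{code}
-- A minimal sketch (not generated by the AG system): a chained attribute
-- encoded as a counter that is threaded through a tree to hand out unique
-- numbers. The names |UTree| and |number| are hypothetical.
data UTree = ULeaf | UBin UTree UTree

-- The incoming counter is the value of the chained attribute before this
-- subtree; the outgoing counter is its value afterwards. The left-to-right
-- order of the effects is fixed by the threading.
number :: UTree -> Int -> (Int, [Int])
number ULeaf       counter  =  (counter + 1, [counter])
number (UBin l r)  counter  =
  let  (counter1, xs)  =  number l counter
       (counter2, ys)  =  number r counter1
  in   (counter2, xs ++ ys)
\end{code}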
For some threaded attributes, the order imposed by the threading may be irrelevant, for example when the rules represent \emph{commuting} operations, such as when handing out unique numbers. However, the order imposed by threading the attribute may induce cycles in the dependencies of attributes. We present a mechanism to allow the order to be left unspecified and given the freedom to be determined by the implementation. \begin{defi}[Commutable rule] A \emph{commutable rule} is a rule that threads a chained attribute, but does not depend (indirectly) on previous rules in the chain. \end{defi} Commutable rules thus provide more freedom in their ordering. To preserve referential transparency, the composition of commutable rules must satisfy a liberal commutativity law, as we show later. %% commutative rules allow us to functionally express side effects as if we threaded %% the attribute through each visit. Commutable rules also allow the functional encoding of a side-effectful rule that is scheduled to different implicit visits. We thus offer a mechanism to explicitly enforce constraints on the evaluation order of attributes, and another mechanism to loosen the constraints imposed by rules. The combination offers convenient ways to declaratively specify side-effectful operations and their algorithmic evaluation order. Phases and commutable rules have in common that the associated scheduling algorithm takes information into account that is not visible in the attribute dependencies induced by the rules. Phases induce ordering constraints based on dependencies between attributes and the lexical scope of rules. Commutable rules require a barrier between rules that commute over an attribute and rules that do not. Therefore, we introduce barrier attributes and dependency rules, which allow the encoding of such additional dependencies. \begin{defi}[Designator] A \emph{designator} |d| (Figure~\tref{warren.fig.agcore}) is either an attribute occurrence or a symbolic representation of a rule. \end{defi} \begin{defi}[Dependency rule] A \emph{dependency rule} |d1 `pre` d2| is a rule that specifies that designator |d1| must be scheduled before designator |d2|. \end{defi} \begin{defi}[Barrier attribute] A \emph{barrier attribute} is an attribute that can be used as a \emph{designator} (dependee/depender) in combination with dependency rules. The value of a barrier attribute is implicitly defined. \end{defi} In this chapter, we first work out the concepts of barrier attributes and dependency rules (Section~\ref{sect.warren.example}), and then show how these concepts can be exploited to deal with phases (Section~\ref{sect.warren.phases}) and commutable rules (Section~\ref{warren.sect.commutting}). \section{Example with Barriers} \label{sect.warren.example} \begin{figure}[tb] \begin{code} grammar Tree -- grammar for nonterm |Tree| | Leaf x :: Int -- production |Leaf| with one terminal | Bin l,r : Tree -- production |Bin| with two nonterminals attr Tree ^^^ syn gath :: Int -- attributes for nonterm |Tree| inh mini :: Int syn repl :: Tree sem Tree -- semantics for nonterm |Tree| | Leaf lhs.gath = loc.x lhs.repl = Leaf lhs.mini -- replacement tree | Bin lhs.gath = min l.gath r.gath -- global minimum gathering l.mini = r.gath -- crossover between |r| and |l| r.mini = l.gath lhs.repl = Bin l.repl r.repl -- replacement tree \end{code} \caption{A variant of the Repmin example.} \label{fig.example.barrier} \end{figure} We use Haskell as host language. 
Figure~\ref{fig.example.barrier} shows a variant of the classical Repmin example~\citep{springerlink:10.1007/BF00264249}, which requires more than one visit to compute the attributes with a statically ordered evaluation strategy. The attribute |repl| contains a clone of the tree with each leaf replaced with the minimum of its sibling's subtree. In order to compute the |mini| attribute of |l|, the |gath| attribute of |r| is needed, and vice versa. With statically ordered evaluation, this can be accomplished by computing the |gath| attribute in a first visit, and then computing |mini| and |repl| in a later visit.
\paragraph{Barriers.}
The evaluations for |l.repl| and |r.repl| are independent. For debugging purposes, we may want to specify that the |l.repl| attribute is computed before the |r.repl| attribute. Since the |asyn.repl| attribute depends on the |ainh.mini| attribute, we get the desired behavior when |l.repl| is a dependency of |r.mini|. The following dependency rule expresses this dependency:
\begin{code}
sem Tree
  | Bin  ^^^  order l.repl `pre` r.mini
\end{code}
This rule has the desired effect because |ainh.mini| is a dependency of |asyn.repl|. Note that, as usual, the dependency relation is transitive.
The dependencies imposed by dependency rules may not be clear when multiple synthesized attributes are involved, nor may they remain stable after code changes. For this purpose, we introduce a barrier attribute:
\begin{code}
attr Tree ^^^ inh sync barrier
sem Tree
  | Leaf  ^^^  order lhs.sync `pre` lhs.repl
  | Bin        order lhs.sync `pre` l.sync
               order lhs.sync `pre` r.sync
               order l.repl `pre` r.sync  -- actually subsumes |lhs.sync `pre` r.sync|
\end{code}
Barrier attributes are not defined by a rule. Instead, these attributes are defined implicitly. However, we require for a synthesized barrier attribute |y| that there is at least one production with a dependency rule in which |lhs.y| occurs as the RHS, and at least one production with a child |k| in which |k.y| occurs as the LHS of a dependency rule. For an inherited barrier attribute |y|, we require the inverse. There is actually no technical reason for this requirement: if the requirement is not met, it should be taken as a warning that the barrier is unused.
\begin{defi}[Updatable attribute]
An updatable attribute is an attribute whose value is a reference into a shared state. Via attribute occurrences (Figure~\ref{warren.fig.agcore}), we specify operations on this shared state. An attribute occurrence |sup z trref| in an expression represents the value in the shared state referenced by |z|. An occurrence |sup z trref| in a pattern means that the shared state at the location pointed to by |z| is updated with the matched value. An occurrence |sup z trnew| in a pattern means that a new shared location is allocated with the matched value and a reference to it is stored in |z|.
\end{defi}
We incorporate side effects in BarrierAG (Chapter~\rref{chapt.side-effect}) in the form of updatable attributes. In the following example, an updatable attribute |unq| is created, read from, and updated:
\begin{code}
attr Tree ^^^ inh unq :: IORef Int  -- inherited unique number dispenser
sem Root
  | Root  sup (root.unq) trnew = 1  -- creates reference and assigns initial value
sem Tree
  | Leaf  -- reads and updates reference
          (loc.myId, sup (alhs.unq) trref) = (sup (alhs.unq) trref, sup (alhs.unq) trref + 1)
\end{code}
Normally, attribute occurrences in a pattern are at defining positions.
However, when |sup z trref| is used to specify a store to a shared state, the attribute occurrence |z| is at a usage position. Hence, |alhs.unq| refers to the inherited attribute |unq|, and is a dependency of the rule. Rules that write to a reference are guaranteed to be applied at most once. The order in which these writes take place depends on the scheduling of attributes. If the attribute |loc.myId| is not used, the write does not actually take place. The only guarantee that is given for such a write is that the reference is created and initialized. For stronger guarantees, barriers can be used to control and specify the order.
\paragraph{Remarks.}
Barriers and updatable references should be used sparingly. In particular, rules with updatable references are not functional, which breaks equational reasoning. We present them here as an implementation mechanism for phases and commutable rules. With the latter, we can recover equational reasoning. Note that the code generated from an AG may be functional even if the AG description itself is non-functional.
\begin{figure}[t]
\begin{code}
I      ::=  attr N (many a)                       -- attr decls for nonterminal |N|
a      ::=  k x t                                 -- attribute declaration with form |k|, attr name |x| and type |t|
k      ::=  inh | syn                             -- attribute forms (often also written as identifier)
t      ::=  :: tau                                -- attribute type (host language)
       |    barrier                               -- barrier type
s      ::=  (sub sem (^^ Gamma)) P : N (many r)   -- semantics of a production |P| of nonterm |N| in env |Gamma|
r      ::=  child x : N = f (many zup)            -- (higher order) child declaration
       |    x : ^^ p (many zdown) = f (many zup)  -- evaluation rule named |x| with LHS |p| and RHS |f|
       |    order d1 `pre` d2                     -- order declaration
d      ::=  child x                               -- designates child |x|
       |    rule x                                -- designates a rule named |x|
       |    z                                     -- designates an attribute occurrence
zdown  ::=  sup z trval                           -- store in occurrence |z|
       |    sup z trref                           -- write to location referenced by |z|
       |    sup z trnew                           -- store new handle in |z|
zup    ::=  sup z trval | sup z trref             -- respectively read (closed), and deref (open)
z      ::=  h.c.x                                 -- occurrence attr |x| of |c| with form |h|
h      ::=  k | loc                               -- attribute forms (extended with locals)
c      ::=  lhs | loc | x                         -- child designators
x, f, p, P                                        -- identifiers
\end{code}
\caption{AG core representation.}
\label{warren.fig.agcore}
\end{figure}
\section{Core Representation of AGs with Barriers}
\label{sect.warren.core}
In this section, we define the semantics of the dependency rules and barrier attributes using the core language BarrierAG, a desugared subset of higher-order AGs. BarrierAG also permits attributes whose values are references into a global state, which we need later.
\paragraph{Syntax.}
Figure~\ref{warren.fig.agcore} gives the syntax of BarrierAG. BarrierAG serves as a core language for AGs. Hence, we assume that static checks, desugaring, item grouping, and copy rule insertion have been performed (Section~\tref{intro.sect.incremental} and Section~\tref{intro.sect.copyrules}). This core representation serves two purposes: it allows formal reasoning about AGs, and it allows us to specify the construction of dependency graphs and execution plans (see also Section~\tref{sect.intro.ag.depgraphs}).
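Before turning to the desugared example, it may help to see what a rule with an updatable reference roughly corresponds to in the host language. The following hand-written Haskell sketch mimics the |unq| example above with an |IORef|; it is an illustration only and not the code that is actually generated from BarrierAG, and the names |dispense| and |initialDispenser| are hypothetical.
\begin{code}
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- hand-written analogue of the Leaf rule: read the current unique number
-- and update the shared reference, as the pattern |sup (alhs.unq) trref| does
dispense :: IORef Int -> IO Int
dispense unq = do
  myId <- readIORef unq        -- corresponds to reading |sup (alhs.unq) trref|
  writeIORef unq (myId + 1)    -- corresponds to the store via |sup (alhs.unq) trref|
  return myId

-- hand-written analogue of the Root rule: allocate the reference and assign
-- its initial value, as the pattern |sup (root.unq) trnew| does
initialDispenser :: IO (IORef Int)
initialDispenser = newIORef 1
\end{code}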
\begin{figure}[t] \begin{code} sem_Tree (Leaf x) = sem_Leaf x sem_Tree (Bin l r) = sem_Bin (sem_Tree l) (sem_Tree r) attr Tree ^^ syn gath :: Int inh mini :: Int inh sync barrier syn repl :: Tree sem_Leaf field_x = sem { f1 = field_x, f2 = Leaf } Leaf : Tree r0: ^^^ id ^^ occval (loc.loc.x) ^^ = ^^ f1 r1: ^^^ id ^^ occval (asyn.lhs.gath) ^^ = ^^ id ^^ occval (loc.loc.x) r2: ^^^ id ^^ occval (asyn.lhs.repl) ^^ = ^^ f2 ^^ occval (ainh.lhs.mini) order ainh.lhs.sync `pre` asyn.lhs.repl sem_Bin field_l field_r = sem { f1 = field_l, f2 = field_r, f3=min, f4=Bin } Bin : Tree child l : Tree = f1 child r : Tree = f2 r3: ^^^ id ^^ occval (asyn.lhs.gath) ^^ = ^^ f3 ^^ occval (asyn.l.gath) ^^ occval (asyn.r.gath) r4: ^^^ id ^^ occval (asyn.lhs.repl) ^^ = ^^ f4 ^^ occval (asyn.l.repl) ^^ occval (asyn.r.repl) r5: ^^^ id ^^ occval (ainh.l.mini) ^^ = ^^ id ^^ occval (asyn.r.gath) r6: ^^^ id ^^ occval (ainh.r.mini) ^^ = ^^ id ^^ occval (asyn.l.gath) order ainh.lhs.sync `pre` ainh.l.sync order ainh.lhs.sync `pre` ainh.r.sync order asyn.l.repl `pre` ainh.r.sync ast = sem_Bin (sem_Leaf 1) (sem_Leaf 2) \end{code} \caption{Desugared example.} \label{warren.fig.agexpl} \end{figure} The main (meta) nonterminals in the (meta) grammar of Figure~\ref{warren.fig.agcore} are |I| (attr-block) and |s| (semantics blocks). An attr-block consists of a set of attribute declarations. A semantics block provides the rules for a single production. The environment |Gamma| is often left implicit. Figure~\ref{warren.fig.agexpl} shows a desugared version of the earlier example in BarrierAG. The context free grammar is translated to terms in the host language, and is not part of the core language. The symbols of the production are represented as higher-order children and local attributes. Evaluation rules are explicitly named, and consist of a function symbol |p| and |f|. The evaluation rule represents the function |(many z1) = p (f (many z2))|. For example, the rule |r3| represents the function |asyn.lhs.gath = id (f3 asyn.l.gath asyn.r.gath)|. The definition of these functions are bound in the environment |Gamma|, which is an annotation of the sem-block. This representation has the advantage that the rules can be duplicated without duplicating the body of the function. %%% Operational semantics. \paragraph{Semantics.} Ultimately, we generate host-language code for the core language term. However, to facilitate reasoning with terms in the core language, we first define an operational semantics\footnote{ This operational semantics is also implemented as part of the @ruler-interpreter@: \url{https://svn.science.uu.nl/repos/project.ruler.papers/archive/ruler-interpreter-0.1.tar.gz}. } in Figure~\ref{warren.fig.nodeeval} which denotes a tree walking automaton (Section~\tref{sect.treewalkautomaton}). The semantics refers to rules in Figure~\ref{warren.fig.readyrules}, which can be interpreted as a dynamic version of the dependency analysis that we define in the next section or as a specification of a tree-walking automaton (Section~\tref{sect.treewalkautomaton}). Since we deal with higher-order AGs, the operational semantics describes both the construction and decoration of the AST. A decorated node of the AST is represented as a tuple |(N,P,Gamma,Heap)|, where |N| is the associated nonterminal, |P| is the associated production, |Gamma| is an environment that contains bindings for the functions that are mentioned in the rules of |P|, and the heap |Heap| contains values for attributes and nodes that form the children of |P|. 
More precisely, a heap |H| is a partial map between descriptors |d| and a value |v|. A descriptor is either an identifier for an attribute or a child: \begin{code} v ::= atom | unit | n | ell -- atomic value |atom|, or unit value |unit|, or node |n|, or reference |ell| n ::= (N,P,Gamma,H) -- node associated to |N| and |P|, with env |Gamma|, and heap |H| H ::= emptyset | H, d ^ `dmap` v -- heap of a node (partial map from descriptor to value) Psi ::= emptyset | Psi, ell ^ :-> v -- threaded heap (partial map from reference to value) Upsilon ::= emptyset | Upsilon, ell -- set of references |many ell| \end{code} A value of an attribute is either an atomic value |atom| in the domain of the attribute, or a tree in case of a higher-order attribute. \begin{figure}[t] \begin{small} \begin{mathpar} \fbox{|Gamma :- Psi0 `sep` v0 ---> v1 `sep` Psi1|}\\ \inferrule*[right=e.descent] { |hlook H (child x) = v0| \\ |Gamma0 :- Psi0 `sep` v0 ---> v1 `sep` Psi1| } {|Gamma0 :- Psi0 `sep` (N,P,Gamma1,H) ---> (N,P,Gamma1,{child x :-> v1} \/ H) `sep` Psi1|} \inferrule*[right=e.inh] {|hlook H (ainh.x.y) = v| \\ |hlook H (child x) = (N',P',Gamma2,H')|\\ |ready (N' `sep` P' `sep` H') (ainh.lhs.y)| } {|Gamma0 :- Psi `sep` (N,P,Gamma1,H) ---> (N,P,Gamma1,{child x :-> (N',P',Gamma2,{ainh.lhs.y :-> v} \+/ H')} \/ H) `sep` Psi|} \inferrule*[right=e.syn] {|hlook H (child x) = (N',P',Gamma2,H')|\\ |hlook H' (asyn.lhs.y) = v|\\ |ready (N `sep` P `sep` H) (asyn.x.y)| \\ } {|Gamma0 :- Psi `sep` (N,P,Gamma1,H) ---> (N,P,Gamma1,{asyn.x.y :-> v} \+/ H) `sep` Psi|} \inferrule*[right=e.child] { |( child x : N' = f((many (sup z trup))) ) `elem` rules(N,P)| \\ |ready (N `sep` P `sep` H) (child x)| \\ |many (load (Psi `sep` H) (sup z trup) (v))| \\ } {|Gamma0 :- Psi `sep` (N,P,Gamma1,H) ---> (N,P,Gamma1,{child x :-> (Gamma0 Gamma1 f)(many v)} \+/ H) `sep` Psi|} \inferrule*[right=e.eval] { |( x : ^^ p((many (sup z1 trdwn))) = f((many (sup z2 trup)))) `elem` rules(N,P)|\\ |ready (N `sep` P `sep` H0) (rule x)| \\\\ |many (load (Psi0 `sep` H0) (sup z2 trup) (v0))| \\ |(many v1) = (Gamma0 Gamma1 p ^^ ^^ . ^^ ^^ Gamma0 Gamma1 f) (many v0)| \\ |many (store (H0) (v1) (sup z1 trdwn) (H `sep` Psi `sep` Upsilon))| \\ |sub Upsilon 1 \+/ ... \+/ sub Upsilon n \+/ dom Psi = (many ell)| \\ } {|Gamma0 :- Psi0 `sep` (N,P,Gamma1,H0) ---> (N,P,Gamma1,(many H) \+/ {rule x :-> unit} \+/ H0) `sep` (many Psi) Psi0|} \inferrule*[right=e.bar.lhs] { |asyn.y `elem` barriers N| \\ |ready (N `sep` P `sep` H) (asyn.lhs.y)| \\ } {|Gamma0 :- Psi `sep` (N,P,Gamma1,H) ---> (N,P,Gamma1,{asyn.lhs.y :-> unit} \+/ H) `sep` Psi|} \inferrule*[right=e.bar.inh] { |( child x : N' = f((many (sup z trup))) ) `elem` rules(N,P)| \\ |ainh.y `elem` barriers N'| \\ |ready (N `sep` P `sep` H) (ainh.x.y)| \\ } {|Gamma0 :- Psi `sep` (N,P,Gamma1,H) ---> (N,P,Gamma1,{ainh.x.y :-> unit} \+/ H) `sep` Psi|} \end{mathpar} \end{small} \caption{Small-step evaluation rules for a node.} \label{warren.fig.nodeeval} \end{figure} \begin{defi}[Normal form] A tree |n| is in normal form when the synthesized attributes of the root have been computed, thus when the root node |n| has a heap with bindings for all |asyn.lhs.x| where |asyn.x| is a synthesized attribute declared for the nonterminal of |n|. \end{defi} The small-step relation |Gamma0 :- Psi `sep` v ---> v' `sep` Psi'| (Figure~\ref{warren.fig.nodeeval}) describes how to evaluate a tree |v| one step further to |v'| in a global environment |Gamma0|. The threaded heap |Psi| contains bindings for references, and is threaded through the evaluation. 
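To make the shape of these semantic objects concrete, the following Haskell sketch shows one possible representation of values, nodes, and heaps as finite maps, together with the normal-form check of the definition above and a naive driver that applies reduction steps exhaustively. All names in this sketch are our own assumptions and not part of BarrierAG or its implementation; the step function itself (the rules of Figure~\ref{warren.fig.nodeeval}) is left as a parameter.
\begin{code}
import qualified Data.Map as Map

type Name   =  String                                   -- identifiers (nonterminals, productions, attributes)
data Desc   =  AttrOcc Name | Child Name | Rule Name    -- descriptors |d|
               deriving (Eq, Ord, Show)
data Value  =  Atom Integer                             -- atomic value |atom|
            |  Unit                                     -- unit value |unit|
            |  NodeVal Node                             -- node |n|
            |  Ref Int                                  -- reference |ell|
data Node   =  Node Name Name Env Heap                  -- |(N,P,Gamma,H)|
type Env    =  Map.Map Name ([Value] -> [Value])        -- function bindings |Gamma|
type Heap   =  Map.Map Desc Value                       -- heap |H| of a node
type THeap  =  Map.Map Int Value                        -- threaded heap |Psi|

-- one reduction step; its definition (the rules of Figure~\ref{warren.fig.nodeeval}) is omitted here
type Step   =  Env -> THeap -> Node -> Maybe (Node, THeap)

-- naive driver: apply steps until no rule is applicable
evaluate :: Step -> Env -> THeap -> Node -> (Node, THeap)
evaluate step gamma0 psi node =
  case step gamma0 psi node of
    Nothing             -> (node, psi)
    Just (node', psi')  -> evaluate step gamma0 psi' node'

-- normal form: the root's heap binds all synthesized attributes of its nonterminal;
-- |synAttrsOf| is an assumed lookup from a nonterminal to its synthesized attributes
isNormalForm :: (Name -> [Name]) -> Node -> Bool
isNormalForm synAttrsOf (Node n _p _gamma h) =
  all (\y -> Map.member (AttrOcc ("syn.lhs." ++ y)) h) (synAttrsOf n)
\end{code}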
To reduce a tree, we start with an initial tree and gradually grow the tree by modifying the heaps. In the initial tree |v0 = (N,P,Gamma,emptyset)|, |N| is the nonterminal of the root, |P| is the production of the root, and |Gamma| contains bindings for functions as mentioned in the rules of |P|. In particular, for each child-rule in |P|, there is a binding for a function in |Gamma| that provides the semantics for that child. When the AG is well-formed, the resulting tree is in normal form when the rules are applied exhaustively.
We use the following notation in the rules. The operator |\/| represents the left-biased union of its two operand heaps, and the operator |\+/| takes the union of two heaps with disjoint keys. The domain |dom(H)| of a heap |H| is the set of the heap's keys. A heap |H| applied to a designator |d|, written |H(d)|, returns the binding for |d| in |H|. Similarly, |Gamma0 Gamma1 f| returns the function binding for |f| in the right-biased union of |Gamma0| and |Gamma1|. Juxtaposition of heaps represents the left-biased union.
In Figure~\ref{warren.fig.nodeeval}, rule \TirName{e.descent} incorporates the idea of nondeterministically selecting a node constructed so far, and reducing it a bit further. Each node has its own heap that explicitly tracks which attributes have been computed, which children have been constructed, and which rules have been applied. A test to check whether a rule can be applied then boils down to verifying that its dependencies are met (Figure~\ref{warren.fig.readyrules}).
With rule \TirName{e.inh}, bindings for the inherited attributes of a child can be copied to the heap of that child when the child is ready to receive these bindings. For example, inherited attributes for a child can be computed even before the child is introduced. Rule \TirName{e.syn} represents the inverse for synthesized attributes. Note the careful use of disjoint unions, which ensures that a rule is applicable only once per attribute.
With rules \TirName{e.child}, \TirName{e.eval}, \TirName{e.bar.inh}, and \TirName{e.bar.lhs}, bindings can be added to the heap. Evaluation of the RHS of the AG's child-rule gives us the initial tree to use for that child in the heap. Evaluation of the AG's evaluation rule results in bindings for one or more attributes. The rule \TirName{e.eval} ensures in addition that potentially newly introduced references |Upsilon| do not clash with existing references. The rule \TirName{e.bar.inh} defines an inherited barrier attribute of a child |x|. Rule \TirName{e.bar.lhs} defines a synthesized barrier attribute of |lhs|.
\begin{figure}[t]
\begin{small}
\begin{mathpar}
\fbox{|load (Psi `sep` H) (sup z trup) v|}\\
\inferrule*[right=l.val]
  { |hlook H z = v| \\ }
  {|load (Psi `sep` H) (sup z trval) v|}
\inferrule*[right=l.ref]
  {|hlook H z = ell| \\ |plook Psi ell = v| \\ }
  {|load (Psi `sep` H) (sup z trref) v|}
\end{mathpar}
\begin{mathpar}
\fbox{|store (H0) v (sup z trdwn) (H `sep` Psi `sep` Upsilon)|}\\
\inferrule*[right=s.new]
  {}
  {|store (H0) v (sup z trnew) ({ z :-> ell } `sep` { ell :-> v } `sep` {ell})|}
\inferrule*[right=s.save]
  {}
  {|store (H0) v (sup z trval) ({ z :-> v } `sep` emptyset `sep` emptyset)|}
\inferrule*[right=s.update]
  { |hlook H0 z = ell| }
  {|store (H0) v (sup z trref) (emptyset `sep` { ell :-> v } `sep` emptyset)|}
\end{mathpar}
\begin{mathpar}
\fbox{|ready (N `sep` P `sep` H) (d)|}\\
\inferrule*[right=d.order]
  { |readydep (N `sep` P `sep` H) (d)| \\ |{ d' ^^ `bar` ^^ (order d' `pre` d) `elem` rules(N,P) } `subseteq` dom H| \\ }
  {|ready (N `sep` P `sep` H) (d)|}
\inferrule*[right=d.r]
  { |( x : ^^ p(many (sup z1 trdwn)) = f(many (sup z2 trup)) ) `elem` rules(N,P)|\\\\ |many (ready (N `sep` P `sep` H) z1)| \\ }
  {|readydep (N `sep` P `sep` H) (rule x)|}
\inferrule*[right=d.c]
  { }
  {|readydep (N `sep` P `sep` H) (child x)|}
\inferrule*[right=d.a]
  { }
  {|readydep (N `sep` P `sep` H) z|}
\end{mathpar}
\end{small}
\caption{Load, store and designator dependency rules.}
\label{warren.fig.readyrules}
\end{figure}
Figure~\ref{warren.fig.readyrules} shows the load, store, and ready rules. The relation |load (Psi `sep` H) (sup z trup) v| represents a load of value |v| using designator |z| in heap |H|. The load rule \TirName{l.val} requires the presence of a designator in the heap. With this rule, the value of the designator can be extracted from the heap. The load rule \TirName{l.ref} represents an indirect load from the threaded heap |Psi|.
The relation |store (H0) v z (H `sep` Psi `sep` Upsilon)| represents a store of value |v|. The heap |H| and threaded heap |Psi| contain the new bindings that result from storing |v|. The initial heap |H0| stores the reference when |sup z trdwn| is an indirection. The set of references |Upsilon| contains the newly added references. The store rule \TirName{s.new} represents the creation and assignment of a new reference |ell|, and \TirName{s.update} an update via such a reference. With rule \TirName{s.save} a binding for |z| is added without an indirection.
The relation |ready (N `sep` P `sep` H) (d)| describes when |d| is ready to be defined while taking dependency-rules into account. These rules do not test for the existence of values in the heap, because the load and store rules already account for this. The rule \TirName{d.order} handles dependency-rules generically for the designators. The rule \TirName{d.r} states that an AG rule is ready when its LHS is ready.
\paragraph{Remarks.}
If we can determine for a nonterminal that no subtree applies rule \TirName{s.new} or \TirName{s.update}, only top-down behavior for |Psi| is needed. If in addition \TirName{l.ref} cannot occur, the two |Psi| parameters can be omitted from the relation. The rule \TirName{e.descent} facilitates a walk down the tree until the node to reduce is reached. An actual implementation performs multiple evaluation steps (if possible) on such a node in order to reduce traversal overhead. In fact, it is possible to statically analyze the AG and obtain a static scheduling of the rules.
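Continuing the representation sketched earlier, the following small Haskell fragment illustrates the dependency part of the ready check (rule \TirName{d.order}): a designator may only be scheduled once all designators that an order rule places before it are bound in the node's heap. Representing the order rules as a plain list of pairs is an assumption of this sketch.
\begin{code}
type OrderRule = (Desc, Desc)   -- |order d1 `pre` d2|

-- |readyOrder rs h d| checks, for every rule |order d' `pre` d| in |rs|,
-- that |d'| is already bound in the heap |h|
readyOrder :: [OrderRule] -> Heap -> Desc -> Bool
readyOrder orderRules h d =
  and [ Map.member d' h | (d', d2) <- orderRules, d2 == d ]
\end{code}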
\section{Static Dependency Graphs} \label{warren.sect.dependencies} In order to obtain a static scheduling of the rules, we first construct dependency graphs in the style of Knuth-1~\citep{DBLP:journals/mst/Knuth68}. If these graphs are cycle free, a static scheduling of the rules is possible. Unlike in Section~\tref{sect.intro.ag.depgraphs}, we mean an augmented PDG when we talk about a PDG. When we talk about the initial PDG, we mean the non-augmented PDG as in Section~\tref{sect.intro.ag.depgraphs}. Thus, we will be dealing with dependency graphs per production (augmented production dependency graph |pdg(N,P)|) and dependency graphs per nonterminal (a nonterminal dependency graph |ndg(N)|). We first define the initial graphs, then specify the actual graphs as the fixpoints of the functions: %format initial = "\mathsf{init}" \begin{code} pdg N P = sub(pdg)(initial) N P <+> { instantiate x (ndg (sub N x) ) | x `elem` children P } ndg N = sub(ndg)(initial) N <+> { abstract (pdg N P) | P `elem` prods N } \end{code} The function |instantiate| translates the edges between declarations of |sub N c| in the NDG as edges between the attributes of child |c| in the PDG. Similarly, when there is a path between two attributes of |lhs| in the PDG, then |abstract| translates these as edges between the same attributes in |N|'s NDG. \begin{figure}[tb] %format vert1 = "\mathsf{vert}" %format edge1 = "\mathsf{edge}" %format pre1 = "\mathsf{pre}" %format suc1 = "\mathsf{succ}" \begin{code} semb (sem P : N ^^ (many r)) (vert1) = {child lhs} <+> semb (sub (many a) N) (vert1(lhs)) <+> semb (many r) (vert1) semb (inh y t) (vert1(x)) = {ainh.x.y} semb (syn y t) (vert1(x)) = {asyn.x.y} semb (child x : N = f (many zup)) vert1 = {child x} <+> semb (sub (many a) N) (vert1(x)) semb (x : ^^ p (many (sup z1 trdwn)) = f (many (sup z2 trup))) vert1 = {rule x} semb (order d1 `pre` d2) vert1 = emptyset \end{code} \begin{code} semb (sem P : N ^^ (many r)) (edge1) = semb (many r) (edge1) semb (child x : N = f (many (sup z trup))) edge1 = many (z <- child x) <+> { child x <- syn.x.y | syn.y `elem` sub a N } semb (x : ^^ p(many (sup z1 trdwn)) = f(many (sup z2 trup))) edge1 = many (z2 <- rule x) <+> semb (sup z1 trdwn) (pre1(x)) semb (order d1 `pre` d2) = {d1 <- d2} semb (sup z trval) (pre1(x)) = { rule x <- z } semb (sup z trnew) (pre1(x)) = { rule x <- z } semb (sup z trref) (pre1(x)) = { z <- rule x } \end{code} \caption{Vertices and edges of the initial production dependency graph.} \label{fig.warren.initial} \end{figure} \paragraph{Initial production dependency graph.} Per production |P| of nonterminal |N|, we construct a production dependency graph |(sub pdg initial) N P|. Thus, each sem-block is translated to a PDG. Figure~\ref{fig.warren.initial} shows that the vertices of |(sub pdg initial) N P| consists of designators |semb (sem P : N ^^ (many r)) (vert1)|. In this notation, |sub (many a) N| is the set of attribute declarations |a| of nonterminal |N|. The directed edges |semb (sem P : N (many r)) (edge1)| of |pdg(N,P)| relate designators. An edge from |d2| to |d1| denotes that |d1| must be defined before |d2|. Order rules are thus a means to arbitrarily add edges to the PDG. The initial PDG does not contain the dependencies imposed by the semantics of each child (e.g. the actual tree). We obtain the actual production dependency graph by augmenting it with the edges taken from the nonterminal dependency graphs of the children which are an approximation of the dependencies of such trees. 
\paragraph{Initial nonterminal dependency graph.} The vertices of the nondeterminal dependency graph |ndg(N)| are the declarations |a| of |N|: \begin{code} semb (attr N (many a)) (vert1) = semb (many a) (vert1) semb (inh x t) (vert1) = { ainh.x } semb (syn x t) (vert1) = { asyn.x } \end{code} The initial NDG does not have any edges. These edges are inferred from the PDGs, as we see below. \paragraph{Graph construction.} The actual PDG is the least solution to the above equations with the following definitions for |abstract| and |instantiate|. The relation |d1 <-+ d2| represents a path from |d2| to |d1|, with |d1 /= d2|: \begin{code} abstract g = { k1.x <- k2.y | k1.lhs.x <-+ k2.lhs.y `elem` g } instantiate x g = { k1.x.y <- k2.x.z | k1.y <-+ k2.z `elem` g } \end{code} Conventionally, a synthesized attribute of a child only depends on an inherited attribute of a child. However, due to dependency-rules, any attribute of a child can potentially depend on any other attribute of the child, hence our definition of |abstract| and |instantiate| take these also into account. The construction of the graphs is a relatively straightforward fixpoint computation. As intermediate data structure, a transitively closed PDG allows for efficient tests for paths between vertices. \begin{figure}[tb] \begin{center} \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](lhsmini){|lhs.mini|}; \node[nd, right of=lhsmini, xshift=30mm](lhsgath){|lhs.gath|}; \node[nd, right of=lhsgath, xshift=10mm](lhsrepl){|lhs.repl|}; \node[nd, below of=lhsmini, yshift=-6mm](l){|child l|}; \node[nd, below of=lhsrepl, yshift=-6mm](r){|child r|}; \node[nd, below of=l, xshift=-20mm](lmini){|l.mini|}; \node[nd, right of=lmini, xshift=10mm](lgath){|l.gath|}; \node[nd, right of=lgath, xshift=10mm](lrepl){|l.repl|}; \node[nd, below of=r, xshift=-20mm](rmini){|r.mini|}; \node[nd, right of=rmini, xshift=10mm](rgath){|r.gath|}; \node[nd, right of=rgath, xshift=10mm](rrepl){|r.repl|}; \node[nd, below of=lhsgath, xshift=-10mm, yshift=3mm](r3){|r3|}; \node[nd, below of=lhsgath, xshift=25mm, yshift=3mm](r4){|r4|}; \node[nd, below of=lrepl, xshift=-6mm](r5){|r5|}; \node[nd, below of=lrepl, yshift=5mm](r6){|r6|}; \draw[dep](r3) to (lgath); \draw[dep](r3) to (rgath); \draw[dep](lhsgath) to (r3); \draw[dep](r4) to (lrepl); \draw[dep](r4) to (rrepl); \draw[dep](lhsrepl) to (r4); \draw[dep](r5) to [in=-155,out=0] (rgath); \draw[dep](lmini) to [in=180,out=-35] (r5); \draw[dep](r6) to (lgath); \draw[dep](rmini) to (r6); \draw[dep](rgath) to (r); \draw[dep](rrepl) to (r); \draw[dep](lgath) to (l); \draw[dep](lrepl) to (l); \draw[dep2](lrepl) to [out=-155,in=-35] (lmini); \draw[dep2](rrepl) to [out=-155,in=-35] (rmini); \end{tikzpicture} \end{center} \caption{Example of the (augmented) PDG of the Bin-production.} \label{fig.warren.binpdg} \end{figure} \paragraph{Example.} In the Leaf-production, there is dependency of |asyn.lhs.repl| on |ainh.lhs.mini| via rule |r2|. This thus induces a dependency of |asyn.repl| on |ainh.mini| in the NDG. Figure~\ref{fig.warren.binpdg} shows the PDG of the production |Bin|. The solid edges are initial edges, and the dashed edges are instantiated from the NDG. The Bin-production by itself does not induce any edges in the NDG. We do not show the PDG of the Leaf-production here; it is shown in Figure~\ref{warren.fig.vdpds.g1}. 
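To make the fixpoint construction above concrete, the following Haskell sketch gives one possible implementation of |instantiate|, |abstract|, and the underlying path computation, with graphs represented as lists of edges and vertices represented structurally. The vertex encoding and all names are assumptions of this sketch, not part of an actual implementation.
\begin{code}
import Data.List (nub)

type NdgV     =  (String, String)            -- NDG vertex: (form, attribute), e.g. ("inh","mini")
type PdgV     =  (String, String, String)    -- PDG vertex: (form, child, attribute); "lhs" denotes the production's own attributes
type Graph v  =  [(v, v)]                    -- an edge |(d1,d2)| stands for |d1 <- d2|

-- naive transitive closure, used as the path test |d1 <-+ d2|
closure :: Eq v => Graph v -> Graph v
closure g  | g' == g    = g
           | otherwise  = closure g'
  where g' = nub (g ++ [ (a, c) | (a, b) <- g, (b', c) <- g, b == b' ])

-- |instantiate x ndg|: paths between attribute declarations in the child's NDG
-- become edges between the corresponding attributes of child |x| in the PDG
instantiate :: String -> Graph NdgV -> Graph PdgV
instantiate x ndg =
  [ ((k1, x, a), (k2, x, b)) | ((k1, a), (k2, b)) <- closure ndg, (k1, a) /= (k2, b) ]

-- |abstract pdg|: paths between |lhs| attributes of the PDG become NDG edges
abstract :: Graph PdgV -> Graph NdgV
abstract pdg =
  nub [ ((k1, a), (k2, b))
      | ((k1, "lhs", a), (k2, "lhs", b)) <- closure pdg, (k1, a) /= (k2, b) ]
\end{code}
The naive closure suffices for exposition; as remarked above, an actual implementation would maintain transitively closed PDGs as an intermediate data structure to test for paths efficiently.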
\paragraph{Properties.}
An NDG can only be cyclic if at least one of the PDGs is cyclic. The reason is that the edges of the NDG are projected into the PDGs, so a PDG must have a subgraph with these cyclic edges. When the PDGs are acyclic, an evaluation to normal form is possible for any well-formed tree. In other words, the graph construction is sound. This property has a relatively straightforward structural induction proof on the shape of the tree. The NDGs serve as the induction hypothesis. When an evaluation algorithm respects the dependencies of the PDGs, and the constraints of each rule are preserved, the evaluation algorithm is sound. Moreover, the evaluation algorithm is complete if it finds a scheduling when the PDGs are acyclic. We show such an algorithm in Section~\ref{sect.warren.visitsgraphs}.
Rules of a production may make an inherited attribute of a child dependent on a synthesized attribute of a child. These dependencies are not part of the NDG. When we also add these edges to the NDG, each occurrence of a nonterminal has the same dependencies of its inherited attributes on its synthesized attributes. When the PDGs are cycle free without these edges, they are typically also cycle free with these additional edges. In the next section, it becomes clear that adding these edges may reduce the number of visit interfaces that a nonterminal symbol needs to support, which can be beneficial when code size is an issue.
\section{Visits Graphs}
\label{sect.warren.visitsgraphs}
An interface for a nonterminal is a partitioning of the attributes into a finite sequence of visits. Given such a sequence, we can produce an execution plan and generate code, as we showed in Chapter~\rref{chapt.side-effect}. We derive such interfaces from acyclic PDGs. Such a sequence induces a dependency of all attributes of a later visit on the attributes of earlier visits. When these dependencies are represented as scheduling-induced edges in the PDG, they may make the PDG cyclic. It is typically possible to define a single visit sequence per nonterminal that does not make the PDG cyclic. However, it is generally hard to automatically find such a sequence~\citep{DBLP:journals/acta/Kastens80} when the sequence is not specified manually, as we presented in Chapter~\rref{chapt.side-effect}. Instead, if we allow multiple interfaces for a nonterminal, we can define visit sequences so that they do not lead to additional dependencies. This is the approach taken by \citet{DBLP:conf/popl/KennedyW76}. In our experience, Kastens' approach requires severe manual intervention for large AGs, which is why we take the approach of \citet{DBLP:conf/popl/KennedyW76}. However, it is not immediately obvious how to deal with multiple interfaces in a strongly typed language, because the type of the semantic function depends on the interface. In the next section, we solve this problem using type indices in combination with GADTs. In this section, we show how to determine these interfaces.
\paragraph{Contexts.}
A \emph{context} |ctx| of a nonterminal |N|, of a nonterminal symbol labeled |N|, or of a production of |N| is a visit sequence |many v| that is consistent with |ndg N|:
\begin{code}
ctx  ::=  (many v)               -- a context consists of a sequence of visits |many v|
v    ::=  visit i (many (k.x))   -- a visit consists of a set of attributes |many (k.x)| and a globally unique identifier |i|
i                                -- identifier
\end{code}
A context contains a subset of the attributes declared for a nonterminal.
A nonterminal may have a context with fewer attributes when not all attributes of a nonterminal symbol labeled with this nonterminal are needed. A visit sequence is consistent with |ndg N| when the following conditions are met.
\begin{itemize}
\item For each attribute |ainh.x| in the sequence and each attribute |asyn.y| with |ainh.x <-+ asyn.y| in |ndg N|, either |asyn.y| is not part of the visit sequence, or |asyn.y| is in the same visit as |ainh.x| or in a later one.
\item For each attribute |asyn.x| in the sequence and each attribute |ainh.y| with |asyn.x <-+ ainh.y| in |ndg N|, either |ainh.y| is not part of the visit sequence, or |ainh.y| is in a later visit than |asyn.x|.
\end{itemize}
A small functional sketch of this consistency check is shown below. The number of consistent visit sequences is finite, but potentially large, as in the worst case it is exponential in the number of attributes.
\begin{figure}
\begin{code}
semb (visit i (many (k.y))) (vert1(x))  =  { visit x i }
semb (visit i (many (k.y))) (edge1(x))  =  semb (i) (edge1(x,pre i)) <+> semb (many (k.y)) (edge1(x,i))
semb (i) (edge1(x,epsilon))             =  { child x <- visit x i }
semb (i) (edge1(x,j))                   =  { visit x j <- visit x i }
semb (ainh.y) (edge1(x,i))              =  { ainh.x.y <- visit x i }
semb (asyn.y) (edge1(x,i))              =  { visit x i <- asyn.x.y }
\end{code}
\caption{The embedding of a context.}
\label{fig.warren.embedding}
\end{figure}
\paragraph{Embedding of a context.}
We make the context of a child visible in the PDG. For that purpose, we introduce a vertex form that represents a visit:
\begin{code}
d ::= ... | visit x i   -- a visit to |x| identified by |i|
\end{code}
For a child |x| in context |sub C x|, we add additional vertices and edges to the PDG, which we call the \emph{embedding} of the context |sub C x| of child |x| in the PDG. Informally, to embed a context, we add a visit-node per visit and add edges between this node and the associated inherited and synthesized attributes. Additionally, a visit node depends on the node of the preceding visit, if any, and on the child-node otherwise. Formally, let |pre i| be either the preceding visit |j| of |i|, or |epsilon| when |i| is the first visit. We add vertices |semb C (vert1(x))| and edges |semb C (edge1(x))| as described in Figure~\ref{fig.warren.embedding}.
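The following Haskell sketch makes the two consistency conditions above concrete. It assumes that the NDG is given as a path predicate |path a b|, which holds when |a <-+ b|, that is, when |b| (indirectly) depends on |a|; the names |Form|, |Attr|, |VisitSeq|, and |consistent| are our own, and visits are represented simply as lists of attributes.
\begin{code}
data Form      =  Inh | Syn deriving (Eq, Show)
type Attr      =  (Form, String)
type VisitSeq  =  [[Attr]]      -- one set of attributes per visit, in order

consistent :: (Attr -> Attr -> Bool) -> VisitSeq -> Bool
consistent path visits = and
  [ ok (i, a) (j, b)
  | (i, vi) <- indexed, a <- vi
  , (j, vj) <- indexed, b <- vj
  , path a b ]
  where
    indexed = zip [0 :: Int ..] visits
    -- a path from an inherited to a synthesized attribute: same or later visit
    ok (i, (Inh, _)) (j, (Syn, _))  =  j >= i
    -- a path from a synthesized to an inherited attribute: strictly later visit
    ok (i, (Syn, _)) (j, (Inh, _))  =  j > i
    ok _ _                          =  True
\end{code}
For instance, with a path predicate that relates |(Inh,"mini")| to |(Syn,"repl")|, the two-visit sequence used in the example below is accepted, whereas a single visit containing all three attributes of |Tree| is rejected only if the crossover dependencies of the |Bin| production are also reflected in the NDG.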
\begin{figure}[t] \centering \subfigure[Context |ctx1| with all attributes.]{ \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](v1){|visit v1|}; \node[nd, right of=v1, xshift=8mm](v2){|visit v2|}; \node[nd, above of=v2, xshift=5mm](lmini){|ainh.l.mini|}; \node[nd, below of=v1, xshift=-5mm](lgath){|asyn.l.gath|}; \node[nd, below of=v2, xshift=10mm](lrepl){|asyn.l.repl|}; \node[nd, below of=lgath, xshift=18mm, yshift=7mm](l){|child l|}; \draw[dep](v2) to (v1); \draw[dep](v2) to (lmini); \draw[dep](lgath) to (v1); \draw[dep](lrepl) to (v2); \draw[dep2](lrepl) to [out=55,in=-55] (lmini); \draw[dep2](v1) to (l); \draw[dep](lgath) to (l); \draw[dep](lrepl) to (l); \end{tikzpicture} \label{fig.warren.exampleinst1} } \hspace{4em} \subfigure[Context |ctx2| with only |asyn.gath|.]{ \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](v3){|visit v3|}; \node[nd, above of=v3, xshift=22mm](lmini){|ainh.l.mini|}; \node[nd, below of=v3, xshift=-5mm](lgath){|asyn.l.gath|}; \node[nd, below of=v3, xshift=27mm](lrepl){|asyn.l.repl|}; \node[nd, below of=lgath, xshift=18mm, yshift=7mm](l){|child l|}; \draw[dep](lgath) to (v3); \draw[dep2](lrepl) to [out=38,in=-38] (lmini); \draw[dep](lgath) to (l); \draw[dep](lrepl) to (l); \draw[dep2](v3) to (l); \end{tikzpicture} \label{fig.warren.exampleinst2} } \caption{Examples of visit-additions to the PDG.} \label{fig.warren.exampleinsts} \end{figure} \begin{defi}[Visit PDG] For some given contexts of children, a \emph{visit PDG} (VPDG) is the embedding of these contexts in the PDG. \end{defi} The following are exemplary contexts of child |l| of production |Bin|. The context |ctx1| describes the computation of all attributes in two phases. The context |ctx2| only the computation of |asyn.gath|: \begin{code} ctx1 = [ visit v1 {asyn.gath}, visit v2 {ainh.mini,asyn.repl} ] ctx2 = [ visit v3 {asyn.gath} ] ctx3 = [ visit v4 {asyn.gath, ainh.mini, asyn.repl} ] \end{code} Figure~\ref{fig.warren.exampleinsts} shows the relevant subgraphs after embedding the respective contexts |ctx1| and |ctx2|. The dashed edges are already existing edges. The embedding of |ctx3| would lead to a cycle in the PDG. A visit-node in a VPDG serves as bookkeeping that a visit to the child is needed to compute the attributes it is associated with. It represents dependencies induced from the AG-wide scheduling of attribute computations, and we will ensure later that each visit node has an associated algorithm that computes the attributes. \paragraph{Abstraction of visit-nodes.} We see later as well that the bookkeeping with visits is too verbose, so we establish a more compact representation. For that purpose, we change the syntax of child-vertices. These additionally define in what state |many a| the child is, which can be used in the graph to \emph{abstract} from a sequence of visits: \begin{code} d ::= ... | child x (many a) -- a child |x| in state |many a| (initially |emptyset|) \end{code} Let |g| be an acyclic VPDG. Figure~\ref{fig.warren.abstraction} shows how to abstract from the visit-vertices. We first remove all visit-vertices. When a visit-vertex |i| is removed, its associated attributes |sub (many a) i| are added to the child-vertex. 
There are no visit-vertices left in |g| after |abstract| has been applied.
\begin{figure}[tb]
\begin{code}
abstract g
  | <% invoke x i %> `elem` g && <% child x (many a) <- invoke x i %> `elem` g
      = let  v   =  <% child x (many a \+/ sub (many a) i) %>
             es  =  { v <- asyn.x.y | asyn.x.y `elem` g }
                    \/ { v <- invoke x j | <% invoke x j %> `elem` g, <% invoke x i <- invoke x j %> `elem` g }
        in abstract ((g \/ {v} \/ es) - { invoke x i, child x (many a) })
  | otherwise = g
\end{code}
\caption{Abstraction on a VPDG.}
\label{fig.warren.abstraction}
\end{figure}
With abstraction applied to a VPDG we obtain scheduled PDGs:
\begin{defi}[Scheduled PDG]
Given a production |P| in context |ctx|, and contexts |many ctx| for its children, its VPDG |g| is a \emph{scheduled PDG} (SPDG) when:
\begin{itemize}
\item The VPDG |g| is acyclic.
\item All reachable synthesized attributes of a child depend on a child node that includes the synthesized attribute in its state. Thus, for each |asyn.y `elem` ctx|, for each path in the VPDG from vertex |asyn.lhs.y| to a vertex |asyn.x.z|, there exists a vertex |child x (many a)| with |asyn.z `elem` (many a)|.
\end{itemize}
\end{defi}
These two conditions allow us to gradually determine contexts, as we show in the remainder of this section. We show in the next section how to convert an SPDG to an execution plan for a production.
\paragraph{Representation.}
We determine a set of contexts for each nonterminal, such that there exists an SPDG for each context and each production. Preferably, the set of contexts is small, because each context requires a different semantic function. Also, to reduce traversal overhead, each context preferably contains visits with many attributes. To determine this set, we use a more sophisticated representation, the visits graph, which we define below. Note again that a context is a sequence of visits that is associated with a nonterminal |N|, where each visit describes a transition of the configuration of a node associated with |N|. Given a set of contexts, sequences of visits may have common intermediate configurations. Thus, we can represent a set of contexts as a graph, which we call the visits graph:
\begin{defi}[Visits graph]
Given a set of contexts |many ctx|, a \emph{visits-graph} is a DAG where the vertices are the union of |semb ctx vert1| of each |ctx `elem` many ctx|, and the edges are the union of |semb ctx edge1| of each |ctx `elem` many ctx|, as described by Figure~\ref{fig.warren.visitgraph}. The vertices represent configurations, and the edges represent visits.
\end{defi} \begin{figure}[tb] \begin{code} semb (ctx) (vert1) = semb ctx (vert1(emptyset)) semb (emptyset) (vert1(s)) = {s} semb (visit i (many a) : vs) (vert1(s)) = {s} <+> semb vs (vert1(s <+> many a)) semb (ctx) (edge1) = semb ctx (edge1(emptyset)) semb (emptyset) (edge1(s)) = emptyset semb (visit i (many a) : vs) (edge1(s)) = {s (over(i)(->)) (s <+> many a) } <+> semb vs (edge1(s <+> many a)) \end{code} \caption{Vertices and edges of the visits-graph.} \label{fig.warren.visitgraph} \end{figure} \begin{figure}[tb] \begin{center} \begin{tikzpicture} [ conf/.style={circle, minimum size=1mm, node distance=4mm, top color=black, bottom color=black,font=\scriptsize} , vis/.style={->,thick} , lab/.style={font=\footnotesize} ] \node[conf](c1){}; \node[conf, right of=c1, xshift=30mm](c2){}; \node[conf, right of=c1, xshift=10mm, yshift=-10mm](c3){}; \node[lab,left of=c1, xshift=2mm]{|s0 = emptyset|}; \node[lab,right of=c2, xshift=14mm]{|s2 = { ainh.mini,asyn.gath,asyn.repl }|}; \node[lab, right of=c3, xshift=3mm]{|s1 = { asyn.gath }|}; \draw[vis](c1) to node[auto,above,lab]{|v3|} (c2); \draw[vis](c1) to node[auto,above,lab]{|v1|} (c3); \draw[vis](c3) to node[auto,above,lab]{|v2|} (c2); \end{tikzpicture} \end{center} \caption{Example of a visits-graph which represents contexts |ctx1|, |ctx2| and |ctx3|.} \label{fig.warren.visitsgraphexample} \end{figure} Figure~\ref{fig.warren.visitsgraphexample} shows the visits-graph that represents the contexts |ctx1|, |ctx2| and |ctx3|. Note that the context |ctx2| is fully implied by the other contexts. For a generated algorithm, a path in the visits-graph represents how a node is visited by its parent. However, a parent may stop invoking visits after any visit. From a code generation perspective, it may be beneficial to know at which vertex a parent stops. We can model this situation in the graph, but we refrain from this complication here. When two paths in the visits graph converge, this means that two different visit sequences ended up in the same configuration. Consequently, the children are in the same configuration, but were potentially subjected to different visit sequences. The VPDGs may thus differ in the visit-vertices, therefore we need |abstract| (as defined earlier) to eliminate this difference. \begin{figure}[tb] \begin{code} s ::= conf N (many a) (many p) -- configuration of a node with nonterminal |N| (key |many a|) p ::= prod P (many k) g -- production in a context of nonterminal |N| k ::= child x : N (many a) -- configuration of child |x| (key |many a|) e ::= s `sep` v `sep` s' -- edge between |s| and |s'| v ::= visit i : N (many a) (many c) -- visit |i| with production info |many c| (key |i|, non-empty |many a|) c ::= prod P (many r) -- sequence of invocation sets |many r| (possibly empty) r ::= sim (many m) -- simultaneous invocations |many m| (non-empty, unordered) m ::= invoke x i -- invocation of visit |i| to |x| (key |i|) g -- VPDG of a production in a given context N, P -- identifier (nonterminal), identifier (production) \end{code} \caption{Notation for visits-graphs.} \label{fig.warren.visitsgraphnotation} \end{figure} We keep a more complex administration for the visits graph, with vertices of the form |s| and edges of the form |v|, as described by Figure~\ref{fig.warren.visitsgraphnotation}. We explain some aspects of this notation below. Figure~\ref{warren.fig.visitsgraphexmpl} shows this administration for the example. A vertex of the visits graph represents a context. 
Per production, a vertex stores the VPDG in this context, and the states of the children. An edge represents a visit, which is a transition from one configuration to one of its next configurations. It stores the attributes |many a|, which are the inherited attributes provided by the parent for the visit, and the synthesized attributes that need to be computed for the parent. Per production, an edge stores an ordered sequence of invocations to children that are required for the visit. These visits are grouped in what we call \emph{simultaneous visits}. Visits of such a group visit children independently. Note that these visits may be invoked in any order, and may also be invoked concurrently. Figure~\ref{warren.fig.vdpds.g1} displays the graphs |g1|, |g2|, |g3| and |g4| that are mentioned in Figure~\rref{warren.fig.visitsgraphexmpl}. The operation |abstract| is not yet applied in the graphs. The dotted edges represent the edges that are inserted due to the embedding of visits. An important property of such a graph is that the simultaneous visits contain precisely those synthesized attributes that can be visited because the inherited attributes are available (described below with the relation |avail|), and also need to be visited in order to produce values for the synthesized attributes. The example is rather symmetric. The children of the production |Bin| are treated in the same way. This is in general not the case and will be handled correctly. For example, children may be of different nonterminals, or may be visited in a different order. This situation arises when we use |<% invoke l 3 %>| for the left child of the bin-production, or in the example of Section~\tref{summary.sect.commuting}. \begin{figure}[t] \begin{code} s0 = conf Tree emptyset { p1, p2 } s1 = conf Tree { asyn.gath } { p3, p4 } s2 = conf Tree { ainh.mini, asyn.gath, asyn.repl } { p5, p6 } p1 = prod Leaf emptyset g1 p3 = prod Leaf emptyset g1 p5 = prod Leaf emptyset g1 p2 = prod Bin { child l : Tree emptyset, child r : Tree emptyset } ^^ (abstract g2) p4 = prod Bin { child l : Tree { asyn.gath }, child r : Tree { asyn.gath } } ^^ (abstract g3) p6 = prod Bin { child l : Tree { ainh.mini, asyn.gath, asyn.repl } , child r : Tree { ainh.mini, asyn.gath, asyn.repl } } ^^ (abstract g4) v1 = visit 1 : Tree { asyn.gath } { c1, c2 } v2 = visit 2 : Tree { ainh.mini, asyn.repl } { c3, c4 } v3 = visit 3 : Tree { ainh.mini, asyn.gath, asyn.repl } { c5, c6 } c1 = prod Leaf emptyset c3 = prod Leaf emptyset c5 = prod Leaf emptyset c2 = prod Bin { sim { invoke l 1, invoke r 1 } } c4 = prod Bin { sim { invoke l 2, invoke r 2 } } c6 = prod Bin { sim { invoke l 1, invoke r 1 }, sim { invoke l 2, invoke r 2 } } \end{code} \caption{The contents of nodes and vertices of the visits graph of the example.} \label{warren.fig.visitsgraphexmpl} \end{figure} \begin{figure}[p] \centering \subfigure[VPDG |g1| of production |Leaf|, which has no children.]{ \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](lhsgath){|asyn.lhs.gath|}; \node[nd, right of=lhsgath, xshift=30mm](lhsmini){|ainh.lhs.mini|}; \node[nd, right of=lhsmini, xshift=10mm](lhsrepl){|asyn.lhs.repl|}; \node[nd, below of=lhsgath, yshift=4mm](r1){|r1|}; \node[nd, right of=r1, xshift=4mm](locx){|loc.loc.x|}; \node[nd, right of=locx, xshift=4mm](r0){|r0|}; \node[nd, right of=r0, xshift=7mm](r2){|r2|}; \draw[dep](lhsgath) to (r1); \draw[dep](r1) 
to (locx); \draw[dep](locx) to (r0); \draw[dep](r2) to (lhsmini); \draw[dep](lhsrepl) to (r2); \end{tikzpicture} \label{warren.fig.vdpds.g1} } \subfigure[VPDG |g2| of production |Bin|, which has children |l| and |r|.]{ \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](lhsmini){|ainh.lhs.mini|}; \node[nd, right of=lhsmini, xshift=30mm](lhsgath){|asyn.lhs.gath|}; \node[nd, right of=lhsgath, xshift=10mm](lhsrepl){|asyn.lhs.repl|}; \node[nd, below of=lhsmini, yshift=3mm](l){|child l emptyset|}; \node[nd, below of=lhsrepl, yshift=-3mm](r){|child r emptyset|}; \node[nd, below of=l, xshift=-20mm, yshift=-4mm](lgath){|asyn.l.gath|}; \node[nd, right of=lgath, xshift=10mm](lmini){|ainh.l.mini|}; \node[nd, right of=lmini, xshift=10mm](lrepl){|asyn.l.repl|}; \node[nd, below of=r, xshift=-20mm, yshift=2mm](rgath){|asyn.r.gath|}; \node[nd, right of=rgath, xshift=5mm, yshift=-8mm](rmini){|ainh.r.mini|}; \node[nd, right of=rmini, xshift=15mm, yshift=6mm](rrepl){|asyn.r.repl|}; \node[nd, below of=lhsgath, xshift=-10mm, yshift=4mm](r3){|r3|}; \node[nd, below of=lhsgath, xshift=25mm, yshift=4mm](r4){|r4|}; \node[nd, below of=lrepl, xshift=7mm](r6){|r6|}; \node[nd, below of=lrepl, yshift=5mm](r5){|r5|}; \draw[dep](r3) to (lgath); \draw[dep](r3) to (rgath); \draw[dep](lhsgath) to (r3); \draw[dep](r4) to (lrepl); \draw[dep](r4) to (rrepl); \draw[dep](lhsrepl) to (r4); \draw[dep](r5) to (rgath); \draw[dep](lmini) to (r5); \draw[dep](r6) to (lgath); \draw[dep](rmini) to (r6); \draw[dep](rgath) to (r); \draw[dep](rrepl) to (r); \draw[dep](lgath) to (l); \draw[dep](lrepl) to (l); \draw[dep](lrepl) to (lmini); \draw[dep](rrepl) to (rmini); \end{tikzpicture} \label{warren.fig.vdpds.g2} } \subfigure[VPDG |g3| after visit |v1| in which it visited |v1| of the children.]{ \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](lhsmini){|ainh.lhs.mini|}; \node[nd, right of=lhsmini, xshift=30mm](lhsgath){|asyn.lhs.gath|}; \node[nd, right of=lhsgath, xshift=10mm](lhsrepl){|ainh.lhs.repl|}; \node[nd, below of=lhsmini, yshift=3mm](l){|child l emptyset|}; \node[nd, below of=lhsrepl, yshift=-3mm](r){|child r emptyset|}; \node[nd, below of=l, xshift=-20mm, yshift=-4mm](lgath){|asyn.l.gath|}; \node[nd, right of=lgath, xshift=10mm](lmini){|ainh.l.mini|}; \node[nd, right of=lmini, xshift=10mm](lrepl){|asyn.l.repl|}; \node[nd, below of=r, xshift=-20mm, yshift=2mm](rgath){|asyn.r.gath|}; \node[nd, right of=rgath, xshift=5mm, yshift=-8mm](rmini){|ainh.r.mini|}; \node[nd, right of=rmini, xshift=15mm, yshift=6mm](rrepl){|asyn.r.repl|}; \node[nd, below of=lhsgath, xshift=-10mm, yshift=4mm](r3){|r3|}; \node[nd, below of=lhsgath, xshift=25mm, yshift=4mm](r4){|r4|}; \node[nd, below of=lrepl, xshift=7mm](r6){|r6|}; \node[nd, below of=lrepl, yshift=5mm](r5){|r5|}; \node[nd, above of=lgath](l1){|invoke l 1|}; \node[nd, right of=rgath, xshift=10mm](r1){|invoke r 1|}; \draw[dep2](lgath) to (l1); \draw[dep2](rgath) to (r1); \draw[dep2](l1) to (l); \draw[dep2](r1) to (r); \draw[dep](r3) to (lgath); \draw[dep](r3) to (rgath); \draw[dep](lhsgath) to (r3); \draw[dep](r4) to (lrepl); \draw[dep](r4) to (rrepl); \draw[dep](lhsrepl) to (r4); \draw[dep](r5) to (rgath); \draw[dep](lmini) to (r5); \draw[dep](r6) to (lgath); \draw[dep](rmini) to (r6); 
\draw[dep](rgath) to (r); \draw[dep](rrepl) to (r); \draw[dep](lgath) to (l); \draw[dep](lrepl) to (l); \draw[dep](lrepl) to (lmini); \draw[dep](rrepl) to (rmini); \end{tikzpicture} \label{warren.fig.vdpds.g3} } \subfigure[VPDG |g4| after visit |v2| or |v3| in which it visited |v2| and possibly |v1| of the children.]{ \begin{tikzpicture} [ nd/.style={ellipse, draw, minimum size=4mm, node distance=12mm,font=\scriptsize} , dep/.style={->,thick} , dep2/.style={->,dashed, thick} , lab/.style={font=\scriptsize} ] \node[nd](lhsmini){|ainh.lhs.mini|}; \node[nd, right of=lhsmini, xshift=30mm](lhsgath){|asyn.lhs.gath|}; \node[nd, right of=lhsgath, xshift=10mm](lhsrepl){|asyn.lhs.repl|}; \node[nd, below of=lhsmini, yshift=1mm](l){$\begin{array}{c}|child l|\\|{asyn.gath}|\end{array}$}; \node[nd, below of=lhsrepl, yshift=-7mm](r){$\begin{array}{c}|child r|\\|{asyn.gath}|\end{array}$}; \node[nd, below of=l, xshift=-20mm, yshift=0mm](lgath){|asyn.l.gath|}; \node[nd, right of=lgath, xshift=10mm](lmini){|ainh.l.mini|}; \node[nd, right of=lmini, xshift=10mm](lrepl){|asyn.l.repl|}; \node[nd, below of=r, xshift=-20mm, yshift=5mm](rgath){|asyn.r.gath|}; \node[nd, right of=rgath, xshift=5mm, yshift=-8mm](rmini){|ainh.r.mini|}; \node[nd, right of=rmini, xshift=15mm, yshift=6mm](rrepl){|asyn.r.repl|}; \node[nd, below of=lhsgath, xshift=-10mm, yshift=1mm](r3){|r3|}; \node[nd, below of=lhsgath, xshift=25mm, yshift=4mm](r4){|r4|}; \node[nd, below of=lrepl, xshift=7mm, yshift=-1mm](r6){|r6|}; \node[nd, below of=lrepl, yshift=5mm](r5){|r5|}; \node[nd, below of=lmini, xshift=-18mm](l2){|invoke l 2|}; \node[nd, right of=rmini, yshift=-4mm, xshift=8mm](r2){|invoke r 2|}; \draw[dep2](l2) to (lmini); \draw[dep2](lrepl) to (l2); \draw[dep2](r2) to (rmini); \draw[dep2](rrepl) to (r2); \draw[dep2](l2) to (l); \draw[dep2](r2) to (r); \draw[dep](r3) to (lgath); \draw[dep](r3) to (rgath); \draw[dep](lhsgath) to (r3); \draw[dep](r4) to (lrepl); \draw[dep](r4) to (rrepl); \draw[dep](lhsrepl) to (r4); \draw[dep](r5) to (rgath); \draw[dep](lmini) to (r5); \draw[dep](r6) to (lgath); \draw[dep](rmini) to (r6); \draw[dep](rgath) to (r); \draw[dep](rrepl) to (r); \draw[dep](lgath) to (l); \draw[dep](lrepl) to (l); \draw[dep](lrepl) to (lmini); \draw[dep](rrepl) to (rmini); \end{tikzpicture} \label{warren.fig.vdpds.g4} } \caption{VPDGs of the visits graph example.} \label{warren.fig.vdpds} \end{figure} \paragraph{Invariants.} The representation of visits graph leads to a recursive set of constraints (presented below) for which we can incrementally construct \emph{a} visits graph. For a given attribute grammar, there may be many possible visits graphs. The way we represent the graph gives rise to visit sequences that are relatively independent and only compute what is necessary, but which are not guaranteed to be as large as possible. Different visits graphs may exhibit small differences in performance, although in principle any visits graph suffices. We impose a number of constraints on the visits graph for which it is possible to find a solution when the PDGs are acyclic, and for which it is possible to fine a \emph{unique} solution. In the formalization below, we denote a relation as partial function without a right-hand side, or as (partial) boolean functions. Moreover, when used in a boolean expression, we consider such relations a total function from the arguments to a Boolean result value, which is |True| if and only if the arguments form an element of the relation. 
When the expression of a guard has the value |undefined|, then the guard itself has the value |False|.
Also, for unbound variables in pattern expressions we assume existential quantification, unless indicated explicitly otherwise.

\begin{figure}[tb]
\begin{code}
avail (many a) g d ^^ ^^ = ^^ all (avail (many a) g) (deps g d) && except (many a) g d

except (many a) g <% asyn.x.y %>
  ^^ | ^^ <% visit x i %> `elem` deps g <% asyn.x.y %>
  | ^^ <% child x (many a') %> `elem` deps g <% asyn.x.y %> && <% asyn.x.y %> `elem` (many a')
except (many a) g <% ainh.x.y %>    | ^^ True
except (many a) g <% asyn.lhs.y %>  | ^^ True
except (many a) g <% ainh.lhs.y %>  | ^^ ainh.y `elem` (many a)
except (many a) g <% rule x %>      | ^^ True
except (many a) g <% child x %>     | ^^ True
except (many a) g <% visit x i %>   | ^^ True
\end{code}
\caption{The definition of |avail|.}
\label{fig.warren.avail}
\end{figure}

\begin{defi}[Available]
A vertex |d| is \emph{available} in a VPDG |g| when |d| can be scheduled while respecting the order imposed by |g|.
Formally, the relation |avail (many a) g d| as defined in Figure~\ref{fig.warren.avail} (explained below) states whether a vertex |d| in a VPDG |g| is available when the node is in configuration |many a|.
\end{defi}

The relation |avail| states that a vertex |d| is available when all its dependencies are available, with the exception that the inherited attributes |ainh.lhs.y| are only available if they are part of the configuration |many a|, and that synthesized attributes of visits are only available if there was a visit that computed them.
Thus, |avail| gives us a notion of which vertices are scheduled in a VPDG in a given configuration.

\begin{defi}[Well-formed]
A visits-graph |g| is \emph{well-formed} when it satisfies the invariants below.
Additionally, the visits-graph is acyclic, and so are the VPDGs that are stored in the configurations.
Also, for each edge of |g| and each vertex of |g| with some nonterminal |N|, there is exactly one entry |<% prop P ... %>| for each production |P| of |N|.
We omit some straightforward structural invariants, such as that each visit-edge is annotated with the same nonterminal |N| as the nonterminal |N| of the two configurations it connects.
\end{defi}

\begin{figure}[tb]
\begin{code}
<% conf N (many a0) (many p0) `sep` visit i : N (many a) (many c) `sep` conf N (many a1) (many p1) %> `elem` edges (sub VG N)
  = all3 (transition (many a1) (many a)) (many p0) (many c) (many p1) ^^ && ^^ many a1 == many a0 \+/ many a

transition (many a1) (many a) (prod P (many k0) g0) (prod P (many r)) (prod P (many k1) g1)
  = abstract (embed (many a1) g0 (many r)) == g1 ^^ && ^^ complete (many a1) (many a) g1 && all2 (trchild (many r)) (many k0) (many k1)

trchild (many r) <% child x : N (many a) %> <% child x : N (many a') %> ^^ = ^^ (many a') == foldl (trsim x) (many a) (many r)

trsim x (many a) <% sim (many m) %>
  | x `notelem` children (many m) ^^ = ^^ emptyset
  | <% invoke x i %> `elem1` (many m) ^^ && ^^ <% s `sep` visit i : N (many a') (many c) `sep` s' %> `elem` edges (sub VG N) ^^ = ^^ (many a) \+/ (many a')

children (many m) = ^^ { x | <% invoke x i %> `elem` (many m) }

complete (many a1) (many a) g1 ^^ = ^^ all (avail (many a1) g1) { k.lhs.x | k.x `elem` (many a) }
\end{code}
\caption{Main invariant: are all visits represented.}
\label{fig.warren.correctlyrepresented}
\end{figure}

The main invariant is that all visits are correctly represented in the graph, which is captured by the property in Figure~\ref{fig.warren.correctlyrepresented}.
A visit-edge of nonterminal |N| requires a transformation of the VPDG for each production of |N|. The visit must precisely mention the new attributes |many a| that are added to |a0| to form |a1|. The sequence of simultaneous invocations |many r| describes these transitions for the children. When applied to the VPDG |g0| of the old configuration, |embed (many a1) g0 (many r)| gives the unabstracted VPDG |g1| of the new configuration. Moreover, |g1| must be complete, which means that all the synthesized attributes |asyn.lhs.y| in the new configuration must be available in |g1|. Finally, the invocations |many r| also describe the state transitions of the children. The relation |trchild (many r) k0 k1| relates the old configuration |k0| of a child to the new configuration |k1|. \begin{figure}[tb] \begin{code} req (many a) g d ^^ = ^^ d `elem` reqs (many a) g reqs (many a) g ^^ = ^^ { d | ainh.x `elem` (many a), asyn.y `elem` (many a), (ainh.lhs.x <-* d <-* asyn.lhs.y) `elem` g } \end{code} \begin{code} embed (many a) g <% many r %> ^^ = ^^ foldl (embed (many a)) g (many r) embed (many a) g <% sim (many m) %> ^^ = ^^ foldl (embed (many a) g) g (many r) embed (many a) g0 g <% invoke x i %> | all (avail g0 (many a)) inhs && all (req g0 (many a)) syns = semb (visit i (many a')) (vert1(x)) \+/ semb (visit i (many a')) (edge1(x)) \+/ g where <% s `sep` visit i : N (many a') (many c) `sep` s' %> `elem` edges (sub VG N) inhs = { ainh.x.y | ainh.y `elem` (many a') } syns = { asyn.x.y | asyn.y `elem` (many a') } \end{code} \caption{The relations |req| and |embed|.} \label{fig.warren.reqandembed} \end{figure} The relation |req (many a) g d| in Figure~\ref{fig.warren.reqandembed} specifies if |d| is required for the synthesized attributes of |many a| to be \emph{available} in |g|. Note that the relation |d1 <-* d2| represents a possibly empty path between |d1| and |d2|. The function |embed| in Figure~\ref{fig.warren.reqandembed} applies the invocations of visits |many r| to the VPDG |g|. A visit may only be invoked if its synthesized attributes are required, since our strategy schedules only the attributes that are needed. Moreover, the inherited attributes must be available. The simultaneous invokes represent invocations that can be applied in any order. Since we test which inherited attributes are available according to |g0|, the order in which these visits are applied does not affect the PDGs. This is because the synthesized attributes of the child are not available in |g0|, and can thus not be used to make more inherited attributes of another child available. This property allows us to determine visits graphs in a stable way. Some modifications are possible to this approach to schedule attributes less eagerly. Further, for each nonterminal |N|, there must be a vertex in the visits graph with an empty configuration. An empty configuration consists of an empty configuration for the children of each production. In addition, for the root symbols of the grammar with a non-empty set of attributes, the configuration with all attributes defined must be part of the visits graph. The configurations of the children do not have to be fully defined. Also, there must be an appropriate edge between these configurations. \paragraph{Properties.} The invariants imposed on the visits graph ensure a number of properties for which we sketch proofs. 
\begin{proof}[The visits graph is acyclic]
The destination configuration connected by an edge is larger than the edge's source configuration, because the set of attributes of an edge is not empty.
Edges thus connect distinct configurations.
The configurations connected by edges form an ascending chain, and can thus not be cyclic.
\end{proof}

\begin{proof}[The VPDGs are acyclic]
The VPDGs of the initial configurations are by definition acyclic.
The transformation induced by a visit adds visit-vertices, but these keep the VPDG acyclic.
A vertex that is available requires its (indirect) dependencies to be available.
Thus, an available vertex cannot depend on an unavailable vertex.
A visit-vertex connects available vertices to unavailable vertices of synthesized attributes of a child.
To form a cycle, such an available vertex must depend (indirectly) on one of these synthesized attributes.
Since such a vertex is unavailable, a cycle is not possible.
\end{proof}

\begin{proof}[Consistent visits to children]
The edges of the NDG of a child form the edges between the child's inherited and synthesized attributes in the VPDG of a production that contains the child.
Suppose that we visit child |x| with |many a| as the attributes of the visit.
As induction hypothesis, we assume that the available inherited and synthesized attributes of |x| are part of a consistent visit sequence.
To prove that the visit sequence with the visit added is consistent, the definition requires us to consider four cases.
In each case, one of the attributes is an element of |many a|.
As the first case, given an attribute |ainh.x.y| and an attribute |asyn.x.z|, suppose that |ainh.y <-+ asyn.z| in the NDG of |x|, and |asyn.x.z `elem` many a|.
Consequently, vertex |asyn.x.z| is unavailable.
Then |ainh.x.y <- asyn.x.z| in the VPDG, which means that |ainh.x.y| is available.
Hence, |asyn.x.z| must be in the same visit as |ainh.x.y| or in a later one.
Proofs for the remaining three cases are similar.
\end{proof}

\begin{proof}[Consistent states]
When two potential edges in the visits graph converge, they result in the same state.
Clearly, when two edges converge, the edges have an equivalent set of attributes |many a1| as destination state, otherwise the edges would not converge.
Then, for both a VPDG |g1| of one edge, and a VPDG |g2| for the same production of the other edge, the |reqs| sets for |many a1| are equivalent.
The attributes of a child's state are exactly those in the |reqs| set, hence the states of the children are equivalent.
\end{proof}

\begin{proof}[Dependencies between attributes of the same kind]
A synthesized attribute |asyn.x.y| can depend on a synthesized attribute |asyn.x.z| and still be computed as part of the same visit, when the inherited attributes that both |asyn.x.y| and |asyn.x.z| depend on are available.
Similarly, two inherited attributes can both be passed simultaneously to a child when they are both available.
An inherited attribute that depends on a synthesized attribute of the same child, however, cannot be scheduled to the same visit.
\end{proof}

\paragraph{Construction.}
The size of the visits graph is in both the worst and the average case exponential in the number of attributes.
However, we normally need only a small portion of this graph.
In the remainder of this chapter, the visits graph of a program refers to a slice of this graph that we infer from the program.
To incrementally construct this slice, we distinguish \emph{partial} and \emph{final} vertices and edges.
A partial vertex contains the configuration |many a|, but not the administration for the productions.
A final vertex does contain this information.
Similarly, a partial edge contains only the visit identifier |i| and the attributes |many a|.
The partial vertices and edges represent information that needs to be in the graph, but that we have not computed yet.
The final vertices and edges represent already computed parts of the graph.
For example, the initial vertices with an empty state are final vertices.
For the root symbols, we insert a partial edge and a partial destination vertex into the graph.

The algorithm repeatedly takes a partial edge with a final source vertex and performs scheduling to turn it into a final edge, and consequently its destination vertex into a final vertex (we give a sketch of this worklist loop below).
Scheduling may result in new edges and vertices being added to the graph.
The algorithm that performs scheduling for final edges computes the information for the edge as a function of the source state and the attributes |many a| of the edge.
This ensures that the order in which we consider pending edges does not affect the resulting visits graph.
Moreover, the visits graph is finite, thus if the computation for each pending edge is finite, then so is the whole computation.

The input to this scheduling step is the set of attributes |many a| of the edge, and the VPDGs of the productions of the previous configuration.
To compute the remaining administration of the edge, we infer the visits to the children.
The |embed| relation tells us how.
We first determine which vertices are in the |reqs| set.
Then we repeatedly determine which of the attributes of the children are required and ready to be scheduled.
Those form one group of simultaneous visits.
For each visit, we possibly add a pending edge and a pending vertex to the visits graph, if these are not yet in the graph.
This process terminates because the VPDG is acyclic.
With each iteration, there is either at least one vertex that can be scheduled, or we are done.

The compilation of the largest AG of UUAG takes less than a minute using a slightly optimized version of the algorithm as sketched above.
To improve the performance, the construction of the graph is relatively straightforward to parallelize.
The processing of each edge is independent and can be done in parallel.
Most of the shared state is read-only.
Only updates on the state need to be properly synchronized, but these updates happen relatively infrequently, and there is likely little contention.
The construction of the graph is thus likely to scale very well.
Also, the process may be done incrementally.
When a change in the AG does not affect the PDGs of a nonterminal, the graphs constructed for that nonterminal so far may be reused as the initial visits graph.
However, the implementation is performant enough to be used in practice, even without such optimizations.

\paragraph{Remarks.}
The construction of both the PDGs and the visits graph is a whole-program analysis of the sources related to an evaluation algorithm of a compiler for a particular tree (e.g.\ one stage in the compiler pipeline).
The requirement that the sources related to one algorithm must be analyzed as a whole is usually not a problem in large compiler implementations, because individual algorithms are usually monolithic.
When plugins are involved, they are typically a separate stage, and would thus be compiled independently.
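The following is a hedged Haskell sketch of the worklist loop described in the construction paragraph above, under strongly simplified assumptions: configurations are sets of attribute names, an edge records only its source configuration and the attributes it adds, and |schedule| is a stub standing in for the actual per-edge scheduling step. All names (|Config|, |Edge|, |buildGraph|) are illustrative and not part of the implementation.
\begin{code}
import qualified Data.Set as Set
import           Data.Set (Set)

type Attr   = String
type Config = Set Attr                         -- a configuration: the attributes computed so far
data Edge   = Edge { edgeFrom :: Config, edgeAdds :: Set Attr }
  deriving (Eq, Ord, Show)

data Graph  = Graph
  { finalConfigs :: Set Config                 -- configurations whose administration is computed
  , finalEdges   :: Set Edge                   -- edges whose execution plans are computed
  }

-- Stub for the actual scheduling step: it would compute the plan of the edge and
-- report the destination configuration plus any newly required pending edges.
schedule :: Edge -> (Config, [Edge])
schedule (Edge from adds) = (Set.union from adds, [])

-- Worklist loop: repeatedly pick a pending edge whose source is already final,
-- schedule it, and record the destination and any newly required edges.
buildGraph :: [Edge] -> Graph
buildGraph = go (Graph (Set.singleton Set.empty) Set.empty)
  where
    go g pending =
      case break (ready g) pending of
        (_, [])             -> g               -- no pending edge is ready: done
        (before, e : after) ->
          let (dest, new) = schedule e
              g'          = Graph (Set.insert dest (finalConfigs g))
                                  (Set.insert e    (finalEdges g))
          in  go g' (before ++ after ++ new)
    ready g e = edgeFrom e `Set.member` finalConfigs g
             && not (e `Set.member` finalEdges g)
\end{code}
Because |schedule| depends only on the source configuration and the attributes of the edge, the result of the loop does not depend on the order in which pending edges are picked, as argued above.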
However, it is in theory possible to defer the computation of the visits graph to load-time, although then the composition of rules must also be determined at load-time, using meta-programming facilities such as Template Haskell.

There are various customizations possible in the construction of the visits graph, which may have severe consequences for the structure of the graph.
The structure of the graph does not influence the outcome of the attribute evaluation, but may affect execution time.
At the moment of writing, we are still gathering empirical data regarding performance.
In earlier measurements on UHC, the difference between on-demand (lazy) and eager (strict) attribute evaluation turned out to be insignificant.
This may be due to the use of strictness annotations and DeepSeq in combination with the sequentialization of computations due to unifications on a chained substitution attribute.
However, in earlier measurements with the AG implementation of the editor Proxima~\citep{Schrage04proxima}, execution time almost halved when eager AG evaluation was used.
Note that with the rise of multi-core computing, the effects of visit sequences on parallel behavior may still be significant~\citep{DBLP:conf/waga/KuiperS90,629079,springerlink:10.1007/BF02948394}.

\section{Optimizations}
\label{warren.sect.optimizations}

%% Decision: this chapter should provide some relief after the formal parts
%% and for the formal parts to come.

The shape of the graph may have an impact on the performance of the generated code.
In our approach, visits are likely to be small and independent, which is beneficial for parallelism and incremental evaluation.
However, when the visits graph is huge, code size becomes a more pressing issue.
The visits graph of the largest AG in the UHC project features about 10,000 configurations.
Hence, we require measures to limit the size of this graph.

\paragraph{Subsumption.}
When the graph is huge, there are many contexts that are similar, yet differ in one or more attributes because such an attribute was not needed or only needed later.
Such a small difference can easily cause many additional visits to be needed in the graph.
An edge |i| from a configuration |many a| with attributes |many (ainh.y)| and |many (asyn.z)| \emph{subsumes} an edge |k| from |many a| with attributes |many (ainh.p)| and |many (asyn.q)| when |many (ainh.y) `subseteq` many (ainh.p)| and |many (asyn.q) `subseteq` many (asyn.z)|.
When we are about to declare a visit, but it is subsumed by an already declared visit, we may use that visit instead.
This approach can potentially save many contexts, at the expense of producing some results that are not needed yet.
That is usually not a problem, because non-trivial computations of some node normally have dedicated inherited attributes, thus if one of these inherited attributes is needed then so is the non-trivial computation that computes the synthesized attribute.
A downside of the subsumption approach is that the order in which edges in the visits graph are considered may influence the outcome.
The effects of such approaches require further investigation.

We experimented with a strategy that determines the inherited attributes of a visit based on the synthesized attributes that are required, and then takes the largest set of synthesized attributes that can be computed from the inherited attributes available so far.
This strategy reduced the graph of UHC's largest AG to 1,500 nodes, at the slight expense of returning one or two attributes more than is strictly necessary.
Under the assumption that complex synthesized attributes are always dependent on their own set of inherited attributes, the additional cost is negligible.

\paragraph{Partial Kastens.}
Two other decisive factors are the number of attributes and productions, and the sparseness of the dependencies between attributes.
The number of productions is typically fixed, but the number of attributes and the dependencies can be influenced.
We can produce an algorithm for any consistent visit sequence.
Consequently, the PDGs can be replaced with supergraphs, as long as these remain acyclic.
In our experience, orderable AGs are absolutely non-circular (Section~\rref{sect.intro.ag.eval}).
If for one production there is a child |x| with nonterminal |N| that has a dependency of an attribute |ainh.x.y| on an attribute |asyn.x.z|, then this dependency can also be imposed for all other children with nonterminal |N|.
For such an AG, the approach of \citet{DBLP:journals/acta/Kastens80} is applicable.
The approach of Kastens determines a total order on the attributes in the NDGs using a late-as-possible strategy.
As mentioned earlier, this approach likely causes the graphs to become cyclic.
By removing the edges between vertices of strongly connected components that are not in the original PDG, we obtain a non-cyclic supergraph of the original PDG.
This supergraph has Kastens' algorithm partially applied, and is likely to have a significantly lower number of possible contexts.
For large AGs, this appears to be the most effective step to limit the state explosion caused by many productions and attributes.
It does not require manual intervention.
On the other hand, it restricts the freedom in choosing smaller and independent visits.

\paragraph{Attribute elimination.}
In combination with copy rules, it is common practice to define attributes on nonterminals where they are actually unused, or only used in chains of copy rules.
During the development of an AG, such attributes also show up because the AG is not finished, thus not all attributes are in use.
Superfluous attributes seem innocent, but actually make the scheduling harder.
These attributes and their rules are largely independent of the other attributes of such nonterminals, and thus easily lead to many contexts in the visits graph, because many ways to interleave them are allowed.
Further, Kastens' algorithm typically schedules them too early, which causes cycles in the dependency graphs, and thus makes the above approach to reducing contexts less effective.
A combination of dead-code elimination and copy propagation~\citep{Nielson:1999:PPA:555142} can be used to eliminate superfluous attributes of a nonterminal.
As an additional benefit, their removal may improve the performance of the application, because it prevents the trivial copying of attributes around the tree.

As with the dependency analysis itself, many analyses for AGs can be specified as a recursive set of constraints.
The common pattern is that some property of a nonterminal is determined by combining the properties of each production, which we call abstraction.
This property of the nonterminal is then instantiated for each child of the nonterminal.
A fixpoint can then be computed, starting with a fixed value for the nonterminals that are roots, and a bottom value for the other nonterminals.
In a similar way as shown in Section~\rref{warren.sect.dependencies}, we can define that an attribute is \emph{live} if it is a dependency of a live attribute.
For the root nonterminal, all synthesized attributes are live.
Since this analysis is defined in terms of the attribute dependencies, it can straightforwardly be expressed in terms of the dependency graphs.
As with dead-code elimination, the attributes and rules that are not live are removed from the grammar.

Similarly, a collection attribute~\citep{10.1109/SCAM.2007.13} is \emph{empty} if it is composed with either a copy rule or a monoid's append from empty attributes.
In this case, we start with non-empty as bottom value, and do not treat start symbols of the grammar in a special way.
Instead, a use-rule that depends on no attributes eventually gets the value empty.
%% In Section~\rref{sect.exploit.cfg} we show how such fixpoint computations on DAGs can be described
%% with attribute grammars.

Finally, the output attribute of a copy rule is a \emph{copy} of an attribute |x|, where |x| either equals |z| when the input attribute |y| is a copy of another attribute |z|, or |x| equals |y| otherwise.
This also applies to rules for which we can syntactically determine that their body is essentially the identity function.
For other rules, the output attribute is not a copy of another attribute.
Initially, each attribute is not a copy of any attribute.
During abstraction, a synthesized attribute may be a copy of an inherited attribute if all productions agree on that attribute.

Given the results of the copy and empty analyses, a rule that depends on an attribute |x| that is a copy of |y| can substitute |y| for |x|.
A rule that depends on an empty attribute can substitute the attribute with the empty value, and thus remove the dependency on the attribute.
Finally, dead-code elimination cleans up the unused copies, unused empty attributes, and other unused attributes.

The updatable attributes provide some more options for extensions.
When an attribute is \emph{constant} or \emph{unique}~\citep{DBLP:conf/icfp/HageHM07}, it can be stored as a mutable structure in a global state, and be replaced by a single attribute that stores a reference to that structure.
These analyses are relatively straightforward to implement and exploit for BarrierAG, in contrast to general-purpose programming languages, where data types and higher-order functions complicate them.
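As an illustration of this style of analysis, the following is a hedged Haskell sketch of the liveness fixpoint, under strongly simplified assumptions: attributes are plain strings, the dependency relation is given as a list of pairs, and the name |liveAttrs| is illustrative rather than part of the implementation.
\begin{code}
import qualified Data.Set as Set
import           Data.Set (Set)

type Attr = String

-- A pair (a, b) states that attribute a directly depends on attribute b.
type Dep = (Attr, Attr)

-- An attribute is live when it is a synthesized attribute of the root, or a
-- dependency of a live attribute; we compute the least such set by iteration.
liveAttrs :: Set Attr -> [Dep] -> Set Attr
liveAttrs roots deps = go roots
  where
    go live =
      let live' = Set.union live (Set.fromList [ b | (a, b) <- deps, a `Set.member` live ])
      in  if live' == live then live else go live'
\end{code}
The empty and copy analyses sketched above follow the same pattern, with a different lattice of values and a different transfer function per rule.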
\begin{figure}[p] \begin{code} S ::= (sub sem (^^ Gamma)) P : N (many c) (many q) j -- execution plans |many q| of production |P|, and initial config |j| c ::= conf j (many a) -- description of a configuration |j| q ::= s `sep` s' `sep` v -- edge between configurations |s| and |s'| v ::= visit i : N (many a) (many r) -- visit combined with rules of the visit r ::= child x : N = f (many zup) -- (higher order) child declaration | x : ^^ p (many zdown) = f (many zup) -- evaluation rule | sim (many m) -- simultaneous invocations m ::= invoke x i : N -- child invocation s ::= conf j (many k) -- configuration of a node k ::= child x : N j -- configuration of child |x| j -- configuration identifier \end{code} \begin{code} sem { f1 = field_l, f2 = field_r, f3=min, f4=Bin } Bin : Tree 1 { conf 1 emptyset, conf 2 {asyn.gath}, conf 3 {ainh.mini, asyn.gath, asyn.repl} } [ conf 1 { child l : Tree 1, child r : Tree 1 } ; conf 2 { child l : Tree 2, child r : Tree 2 } ; visit 1 : Tree ^^^ child l : Tree = f1 child r : Tree = f2 sim { invoke l 1 : Tree, invoke r 1 : Tree} r3: ^^^ id ^^ occval (asyn.lhs.gath) ^^ = ^^ f3 ^^ occval (asyn.l.gath) ^^ occval (asyn.r.gath) , conf 2 { child l : Tree 2, child r : Tree 2} ; conf 3 { child l : Tree 3, child r : Tree 3} ; visit 2 : Tree ^^^ r5: ^^^ id ^^ occval (ainh.l.mini) ^^ = ^^ id ^^ occval (asyn.r.gath) r6: ^^^ id ^^ occval (ainh.r.mini) ^^ = ^^ id ^^ occval (asyn.l.gath) sim { invoke l 2 : Tree, invoke r 2 : Tree } r4: ^^^ id ^^ occval (asyn.lhs.repl) ^^ = ^^ f4 ^^ occval (asyn.l.repl) ^^ occval (asyn.r.repl) , conf 1 { child l : Tree 1, child r : Tree 1 } ; conf 3 { child l : Tree 3, child r : Tree 3 } ; visit 3 : Tree ^^^ child l : Tree = f1 child r : Tree = f2 sim { invoke l 1 : Tree, invoke r 1 : Tree} r3: ^^^ id ^^ occval (asyn.lhs.gath) ^^ = ^^ f3 ^^ occval (asyn.l.gath) ^^ occval (asyn.r.gath) r5: ^^^ id ^^ occval (ainh.l.mini) ^^ = ^^ id ^^ occval (asyn.r.gath) r6: ^^^ id ^^ occval (ainh.r.mini) ^^ = ^^ id ^^ occval (asyn.l.gath) sim { invoke l 2 : Tree, invoke r 2 : Tree } r4: ^^^ id ^^ occval (asyn.lhs.repl) ^^ = ^^ f4 ^^ occval (asyn.l.repl) ^^ occval (asyn.r.repl) ] \end{code} \caption{Syntax of execution plans, and the plans of production |Bin|.} \label{warren.fig.execplans} \end{figure} \section{Execution Plans and Generated Code} \label{warren.sect.executionplan} With each vertex in the visits graph, we uniquely associate an identifier |j|. For each edge of the visits graph, and for each of its productions, we construct an execution plan. The vertices of the VDPG that are required for the new state, but not required for the old state, are the edges that belong in the plan. In Chapter~\rref{chapt.side-effect}, we described how to derive a total order for the vertices. In this section, we simply assume that we take a topological sort of the graph. \paragraph{Plan representation.} Figure~\ref{warren.fig.execplans} shows the syntax of execution plans, and the execution plans for the production |Bin|. We organize the plans per production. For each production, it contains an execution plan for each edge in the visits graph. The rules are ordered, and visits to children are made explicit. \begin{defi}[Intra-visit dependency] With each vertex in the visits graph, we associate a set of \emph{intra-visit} dependencies. These are values (denoted as descriptors |d|) in the state of a node that are potentially needed in later visits. 
Given a visits graph |V|, the set of descriptors contained in |s| for a production |P| is |intra P s| in Figure~\ref{warren.fig.intra} (explained below). \end{defi} The set of descriptors is determined by taking the descriptors introduced by by a visit, and those needed by rules of later visits. Also note that the graph |g| is the graph of the new state for |defs|, and the graph of the old state for |uses|. The difference is in the state of the child-nodes. Finally, the set of intra-visit dependencies of the initial state and each final state is empty. \begin{figure}[t] \begin{code} intra P s = \/{ (uses P s s' \/ intra P s') - defs P s s' ^^ | ^^ (s `sep` s' `sep` v) `elem` V } uses P <% conf N (many a0) (many p0) %> <% conf N (many a1) (many p1) %> ^^ | ^^ <% prod P (many k) g %> `elem` (many p0) = \/(map (uses P) (reqs (many a1) g - reqs (many a0) g)) uses P <% k.x.y %> = emptyset uses P <% child x (many a) %> | abs (many a) == 0 = \/{ map uses (many z) | <% child x : N = f (many zup) %> `elem` rules P } | length (many a) > 0 = { child x (many a) } uses P <% rule x %> = \/{ (many z2) \/ map uses (many (sup z1 trdwn)) | <% x : ^^ p (many (sup z1 trdwn)) = f (many (sup z2 trup)) %> `elem` rules P } uses <% sup z trref %> = { z } uses <% sup z trval %> = emptyset uses <% sup z trnew %> = emptyset defs P <% conf N (many a0) (many p0) %> <% conf N (many a1) (many p1) %> ^^ | ^^ <% prod P (many k) g %> `elem` (many p1) = \/(map (defs P) (reqs (many a1) g - reqs (many a0) g)) defs P <% k.x.y %> = emptyset defs P <% child x (many a) %> = { child x (many a) } defs P <% rule x %> = \/{ map defs (many (sup z1 trdwn)) ^^ | ^^ <% x : ^^ p (many (sup z1 trdwn)) = f (many (sup z2 trup)) %> `elem` rules P } defs <% sup z trref %> = emptyset defs <% sup z trval %> = { z } defs <% sup z trnew %> = { z } \end{code} \caption{Intra-visit dependencies of nodes.} \label{warren.fig.intra} \end{figure} From execution plans, we generate visit functions (see also Section~\tref{sect.ag.eval.ordered}). The process for individual visit functions is largely conventional (Section~\tref{intro.outline.sideeffect}, and Section~\tref{iter.sect.semantics}). However, the weaving of the visit functions is more complex, because we encode the visits graph, which is in general not a linear sequence of visits. \begin{figure}[t] \begin{code} type T_Tree = T_Tree_s1 -- initial configuration of the tree -- type of tree in a given state |s| (function from key to visit) data T_Tree_s1 where C_Tree_s1 :: { inv_Tree_s1 :: forall t . K_Tree_s1 t -> t } -> T_Tree_s1 data T_Tree_s2 where C_Tree_s2 :: { inv_Tree_s2 :: forall t . K_Tree_s2 t -> t } -> T_Tree_s2 data T_Tree_s3 where C_Tree_s3 :: { inv_Tree_s3 :: forall t . K_Tree_s3 t -> t } -> T_Tree_s3 -- type of a \emph{key}, which identifies a visit |v| from state |s| data K_Tree_s1 t where K_Tree_v1 :: K_Tree_s1 T_Tree_v1 K_Tree_v3 :: K_Tree_s1 T_Tree_v3 data K_Tree_s2 t where K_Tree_v2 :: K_Tree_s2 T_Tree_v2 data K_Tree_s3 t where -- empty data declaration -- type of a visit |v|, with continuation as the new state |s| type T_Tree_v1 = IO (Int, T_Tree_s2) type T_Tree_v2 = Int -> IO (Tree, T_Tree_s3) type T_Tree_v3 = Int -> IO (Int, Tree, T_Tree_s3) \end{code} \caption{Types of keys and semantic functions.} \label{warren.fig.types} \end{figure} The visits graph specifies the possible states of an attributed tree, and models the visits that can be done on it. Moreover, it specifies which attributes are in which state, and which attributes are an input or output of a visit. 
In the generated code, an attributed tree is a value that represents a vertex in the visits graph. A visit function represents an edge. Figure~\ref{warren.fig.types} shows their types for the example. Below, we explain these types. \paragraph{Types.} For each vertex |j| of a nonterminal |N| in the visits graph, we generate a type |T_N_(sub s j)|. This is the type for an attributed tree with nonterminal |N| and in configuration |j|. For each edge |i| of a nonterminal |N| in the visits graph, we generate a type |T_N_(sub v i)|. This is the type for a visit function that takes the tree from its source configuration |T_N_(sub s (semb (i) (source1)))| to its target destination |T_N_(sub s (semb (i) (target1)))|. \begin{figure}[p] %format st1 %format st2 %format st3 %format k1 %format k2 %format k3 \begin{smallcode} \begin{code} sem_Bin :: T_Tree -> T_Tree -> T_Tree sem_Bin field_l field_r = st1 where st1 = let k1 :: K_Tree_s1 t -> t k1 K_Tree_v1 = v1 k1 K_Tree_v3 = v3 k1 _ = error "unreachable" v1 :: T_Tree_v1 v1 = do l1 <- return f1 r1 <- return f2 (l_gath, l2) <- inv_Tree_s1 l1 K_Tree_v1 (r_gath, r2) <- inv_Tree_s1 r1 K_Tree_v1 lhs_gath <- return . id $ f3 l_gath r_gath return (lhs_gath, st2 l2 r2 l_gath r_gath) v3 :: T_Tree_v3 v3 lhs_mini = do l1 <- return f1 r1 <- return f2 (l_gath, l2) <- inv_Tree_s1 l1 K_Tree_v1 (r_gath, r2) <- inv_Tree_s1 r1 K_Tree_v1 lhs_gath <- return . id $ f3 l_gath r_gath l_mini <- return . id $ id r_gath r_mini <- return . id $ id l_gath (l_repl, l3) <- inv_Tree_s2 l2 K_Tree_v2 l_mini (r_repl, r3) <- inv_Tree_s2 r2 K_Tree_v2 r_mini lhs_repl <- return . id $ f4 l_repl r_repl return (lhs_gath, lhs_repl, st3) in C_Tree_s1 k1 st2 l2 r2 l_gath r_gath = let k2 :: K_Tree_s2 t -> t k2 K_Tree_v2 = v2 k2 _ = error "unreachable" v2 :: T_Tree_v2 v2 lhs_mini = do l_mini <- return . id $ id r_gath r_mini <- return . id $ id l_gath (l_repl, l3) <- inv_Tree_s2 l2 K_Tree_v2 l_mini (r_repl, r3) <- inv_Tree_s2 r2 K_Tree_v2 r_mini lhs_repl <- return . id $ f4 l_repl r_repl return (lhs_repl, st3) in C_Tree_s2 k2 st3 = let k3 :: K_Tree_s3 t -> t k3 _ = error "unreachable" in C_Tree_s3 k3 f1 = field_l ^^^ ; ^^^ f2 = field_r ^^^ ; ^^^ f3 = min ^^^ ; ^^^ f4 = Bin \end{code} \end{smallcode} \caption{Generated code of the production |Bin|.} \label{warren.fig.code} \end{figure} We can apply one operation on an attributed tree. Given an \emph{typed key} |K_N_(sub s j) t| and a tree |T_N_(sub s j)|, the function |inv_N_(sub s j) :: T_N_(sub s j) -> K_N_(sub s j) t -> t| provides us with the visit function of type |t|: \begin{code} data T_N_(sub s j) where C_N_(sub s j) :: { inv_N_(sub s j) :: forall t . K_N_(sub s j) t -> t } -> T_N_(sub s j) \end{code} The constructor |C_N_(sub s j)| is essentially a wrapper around the |inv| function. %% When we construct the attributed tree, we thus specify for each state The type |t| can be chosen by a parent by providing a key with this type. The key |K_N_(sub s j) t| is a type index. The child can inspect the key to discover which type is actually represented by |t|: \begin{code} data K_N_(sub s j) t where K_N_(sub v i1) :: K_N_s ^^ T_N_(sub v i1) ... K_N_(sub v (sub i n)) :: K_N_s ^^ T_N_(sub v (sub i n)) \end{code} For each outgoing edge |i| of |j|, |K_N_(sub s j)| contains a key |K_N_(sub v i)| that serves as evidence that |t| equals |T_N_(sub v i)|. The type |K_N_(sub s j)| has no constructors when |j| has no outgoing edges in the visits graph. Indeed, a tree in such a configuration cannot be visited. 
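As an illustration of how these types are used by a caller, the following hedged sketch drives the two visits of the example on a semantics value of type |T_Tree| (for instance one built with |sem_Bin| from Figure~\ref{warren.fig.code}); it relies only on the types and keys of Figure~\ref{warren.fig.types}, and the wrapper name |runTree| is illustrative rather than part of the generated code.
\begin{code}
-- First invoke visit 1 to obtain the gathered minimum, then visit 2,
-- feeding that minimum back in to obtain the replacement tree.
runTree :: T_Tree -> IO (Int, Tree)
runTree t = do
  (gath, s2)  <- inv_Tree_s1 t  K_Tree_v1        -- visit 1: synthesizes |gath|
  (repl, _s3) <- inv_Tree_s2 s2 K_Tree_v2 gath   -- visit 2: takes |mini|, synthesizes |repl|
  return (gath, repl)
\end{code}
Alternatively, a caller that wants both attributes in a single visit can select the edge |K_Tree_v3| on the initial state, at the price of supplying |mini| up front.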
The type |T_N_(sub v i)| of a visit function is a function from the values of the visit's inherited attributes to a computation of a tuple of the values of the visit's synthesized attributes and the new state of the tree.
Since BarrierAG includes updatable attributes, the computation takes place in the IO monad:
\begin{code}
type T_N_(sub v i) = sub tau ainh1 -> ... -> sub tau (sub ainh n) -> IO (sub tau asyn1, ..., sub tau (sub asyn m), T_N_(sub s (semb i target1)))
\end{code}
The type |sub tau (sub ainh k)| is the type declared for the inherited attribute |sub ainh k| of nonterminal |N|, and |sub tau (sub asyn k)| the type for the synthesized attribute |sub asyn k|.

\paragraph{Translation of semantics-blocks.}
Figure~\ref{warren.fig.code} gives the generated code for a production\footnote{
The full code of the example can be downloaded from: \url{https://svn.science.uu.nl/repos/project.ruler.papers/archive/ExampleWarren.hs}.
It is compilable with GHC version 6.12.3.
}.
A sem-block is translated to a \emph{node constructor} function |sub st j| for each state |j|, which, given values for the intra-dependencies of |j|, returns a node with this state, thus of type |T_N_(sub s j)|.
The node constructor |st1| constructs the initial state.
Therefore, it has an empty set of intra-dependencies and thus represents the initial value of the node.
In contrast, the node constructor |st2| takes as parameters the states of the live children and the values of the live attributes.
The body of a node constructor is a wrapper around a function |sub k j|, which returns the visit function |sub v i| given the key |K_N_(sub v i)| that identifies one of the outgoing edges of |j|.

For each visit |i| that is a successor of configuration |j|, we generate a visit function |sub v i|.
The visit function |sub v i| takes as parameters the values of the inherited attributes that are declared on edge |i|.
It returns a computation that gives a tuple of the synthesized attributes that are declared on edge |i|.
In addition, it returns the result state of the node by applying the constructor for the next state to the values of that state's intra-dependencies.

\begin{figure}[t]
\begin{code}
semb (child x : N = f (many zup)) gen1 ^^^ = ^^^ semb x (gen1(initial N)) <- (sub lift (|many zup|)) f (semb (many zup) trup)
semb (x : ^^ p (many zdown) = f (many zup)) gen1 ^^^ = ^^^ (many y) <- (sub lift (|many zup|)) f (semb (many zup) trup) (many (semb zdown (trdwn y))) ^^^ ^^^ where (many y) fresh
semb (invoke x i : N) gen1 = let vis = sub ainvoke (sub s (semb i source1)) (semb x (source1(i))) (K_N_(sub v i))
                             (semb x (syn1(i)), (semb x (target1 i))) <- vis (semb x (inh1(i)))

semb x (inh1(i)) = { semb (ainh.x.y) gen1 ^^ | ^^ ainh.y <- (sub a (sub N i)) }
semb x (syn1(i)) = { semb (asyn.x.y) gen1 ^^ | ^^ asyn.y <- (sub a (sub N i)) }

semb (sup z trval) (trdwn y) = let semb z gen1 = y
semb (sup z trref) (trdwn y) = writeIORef (semb z gen1) y
semb (sup z trnew) (trdwn y) = (semb z gen1) <- newIORef y
semb (sup z trval) trup = return (semb z gen1)
semb (sup z trref) trup = readIORef (semb z gen1)

semb (h.c.x) gen1 = h_c_x
semb (x) (gen1(j)) = x_j
\end{code}
\caption{Translation scheme for rules in the execution plan.}
\label{warren.fig.rules}
\end{figure}

Figure~\ref{warren.fig.rules} shows a straightforward translation of the rules in the execution plan.
We assume that barrier attributes and dependency rules are stripped from the execution plan and the administration after the visits graph has been constructed.
For each visit to a child, we pass the appropriate key as an additional parameter to the invoke-function of the child's state, which results in the visit function |vis|.
The function |vis| subsequently takes the inherited attributes as parameters.
For an attribute occurrence |z| at an input position, the transcribed identifier |semb z gen1| either is |z|'s value, or is a reference that can be read to provide |z|'s value.
In a similar way, an attribute occurrence |z| at an output position is either stored as the transcribed identifier |semb z gen1|, or written to the reference under that name.

\paragraph{Remarks.}
A nice property of the translation is that we explicitly declare the types of the visit functions with type signatures.
Since the types of the functions |f| are monomorphic, type errors are usually reported in the functions |f|, which are defined in the actual source code, so that type errors can be related back to the original locations in the source file.
We do not need to know the types of local attributes.
Also, if we allow type variables in the types, these can be supported in our scheme using scoped type variables.

The translation can be optimized in various ways.
When a configuration does not have outgoing edges, a visit to a child that ends in this configuration does not need to construct nor return the new state of the child, as the child cannot be visited anymore, and the state is actually empty according to the definition of |intra|.
When a configuration has only one outgoing edge, the selection of a visit via a key is not needed.
Also, when the visits graph for a nonterminal is a tree, a single key that identifies the path in the tree can be given to the child when it is created, instead of one segment of the path for each visit.
Instead of relying on Haskell to construct closures, the node constructors can use an untyped, updatable array.
With conventional techniques from register scheduling, the state can be represented such that needless copying of states is avoided during visits, which makes visits cheap.
In this thesis, we do not venture down this path, and rely on Haskell to handle closures.
A compiler that employs uniqueness and usage analysis~\citep{DBLP:conf/ifl/VriesPA07,DBLP:conf/icfp/HageHM07} can apply this optimization transparently.
When a simultaneous invocation group contains more than one visit, we can use |forkIO| to allow the Haskell runtime to evaluate the visits in parallel, since these do not have common dependencies.

\section{Generalization to Phases}
\label{sect.warren.phases}

We use BarrierAG as a host language to describe phases.
A nonterminal declares a set of phases.
Attributes may be associated uniquely with a phase.
A phase corresponds to one or more implicit visits.
Since a visit describes the smallest unit of evaluation of a node in the tree, a phase describes a larger unit of evaluation.
It allows us to express properties of chunks of AG evaluation, without resorting to the low-level details of visits.
The following example is a possible declaration of phases for nonterminal |Tree|.
The indentation determines which attribute is declared in which phase.
The scope of a phase declaration ends before a keyword at the same indentation level:
\begin{code}
itf Tree              -- declaration of attributes and phases of |Tree|
  syn gath :: Int     -- attribute not assigned to a phase
  phase distribute    -- declaration of a phase |distribute|
    inh mini :: Int   -- attribute |mini| in phase |distribute|
  phase transform     -- declaration of a phase |transform|
    syn repl :: Tree  -- attribute |repl| in phase |transform|
\end{code}
These phase declarations are unordered: they may be given in any order.

\begin{figure}[tb]
\begin{code}
I ::= itf N (many q)                  -- declaration of a phase-interface
q ::= a                               -- toplevel attribute decl (not associated with a phase)
   |  phase ph (many q)               -- phase with attribute declarations
d ::= ...                             -- descriptors extended with phases
   |  begin x ph                      -- begin of a phase |ph| of child |x|
   |  end x ph                        -- end of a phase |ph| of child |x|
s ::= sem N prod P (many r) (many t)  -- common rules |r| and a set of phase blocks |many t|
t ::= phase ph (many r) (many t)      -- common rules |r| and subsequent phases |many t|
r ::= ...                             -- AG rules
   |  invoke ph of (many c) z         -- specifies invocation of phase |ph| of |many c| with strategy |z|
\end{code}
\caption{Notation for phase interfaces.}
\label{fig.warren.phaseitfs}
\end{figure}

Figure~\ref{fig.warren.phaseitfs} shows the notation for phase interfaces.
There is a high similarity with the notation for visit interfaces of Chapter~\rref{chapt.side-effect}.
The main difference is that we may specify multiple blocks of subsequent phases in a phase-block.

The lexical nesting of phase declarations (on a nonterminal) and phase blocks (in a sem-block) imposes a partial order on phases.
Also, order-rules in combination with |begin lhs ph| and |end lhs ph| impose a partial order.
A phase-block of phase |ph2| that is nested in a phase block |ph1| is evaluated either during the evaluation of |ph1|, when there is a constraint |begin lhs ph2 `pre` end lhs ph1|, or after the evaluation of |ph1| otherwise.
Invoke-rules may be given explicitly or left implicit.
An invoke-rule |r| with a phase |ph| corresponds to one or more actual visits to a child |x|.
The rule itself precedes the first of these visits, thus |r `pre` begin x ph|.
The lexical scope of a phase block, which is relevant for local attributes and default-rules, is determined only by the nesting of phase blocks in a production.
Additional constraints on the order, induced by other productions or by order-rules, are not taken into account for scoping.
This technical detail ensures that we can determine the vertices of the PDGs before the dependency analysis takes place.

We essentially have two main evaluation algorithms to choose from: demand-driven evaluation and statically ordered evaluation.
We may specify several properties of a phase, such as cyclic or acyclic, and pure or impure.
Not all combinations are possible.
For example, a cyclic phase must use on-demand evaluation, and may not be impure.
Also, the properties impose constraints on rules in such a phase, or on rules that invoke such a phase.
For example, an invoke-rule in a pure phase may not invoke an impure phase on a child.
The scheduling of rules takes such constraints into account (Section~\tref{visit.sect.order}).
In a host language with lazy evaluation, the on-demand algorithm is a simplification of the eager algorithm, hence in this thesis we focus only on the latter.
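These phase constraints must be mutually consistent, i.e.\ they may not force a phase to precede itself. The following is a hedged Haskell sketch of such a consistency check, on an illustrative representation of the constraints as pairs of phase names; the name |consistent| is not part of the implementation, which instead detects inconsistencies as cycles in the PDGs (see below).
\begin{code}
import qualified Data.Set as Set
import           Data.Set (Set)

type Phase      = String
type Constraint = (Phase, Phase)   -- (p, q) means p must come before q

-- The constraints are consistent when no phase is (transitively) required
-- to precede itself, i.e. the induced order contains no cycle.
consistent :: [Constraint] -> Bool
consistent cs = all (\p -> not (p `Set.member` reachableFrom p)) phases
  where
    phases = Set.toList (Set.fromList (concatMap (\(a, b) -> [a, b]) cs))
    reachableFrom p = go Set.empty [ q | (a, q) <- cs, a == p ]
    go seen []       = seen
    go seen (q : qs)
      | q `Set.member` seen = go seen qs
      | otherwise           = go (Set.insert q seen) (qs ++ [ r | (a, r) <- cs, a == q ])
\end{code}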
\begin{figure}[tb] \begin{code} attr N ^^^ inh (sub abegin N) barrier -- begin master phase for |N| ^^^ syn (sub aend N) barrier -- end master phase for |N| attr N ^^^ inh (sub abegin ph) barrier -- begin barrier for each phase |ph| of |N| ^^^ syn (sub aend ph) barrier -- end barrier for each phase |ph| of |N| attr N ^^^ ainh.(sub abegin ph) `pre` asyn.(sub aend ph) -- begin before end for each phase |ph| ^^^ ainh.(sub abegin N) `pre` ainh.(sub abegin ph) -- begin |ph| after begin master phase ^^^ asyn.(sub aend ph) `pre` asyn.(sub aend N) -- end |ph| before end master phase attr N ^^^ ainh.(sub abegin ph) `pre` k.y -- for each attribute |k.y| of phase |ph| ^^^ k.z `pre` asyn.(sub aend ph) -- for each attribute |k.y| of phase |ph| ^^^ ainh.(sub abegin ph) `pre` ainh.(sub abegin ph') -- for each nested phase |ph'| ^^^ asyn.(sub aend ph') `pre` asyn.(sub aend ph) -- for each nested phase |ph'| ^^^ ainh.(sub abegin N) `pre` k.y -- for each attribute |k.y| not in a phase ^^^ k.z `pre` asyn.(sub aend N) -- for each attribute |k.y| not in a phase \end{code} \caption{Sketch of a translation of phases to BarrierAG.} \label{fig.warren.phasessketch} \end{figure} \paragraph{Foundation.} We express phases in terms of BarrierAG, which we sketch in Figure~\ref{fig.warren.phasessketch}. A dependency rule |k.x `pre` k.y| on attr-blocks of |N| is syntactic sugar for rule |k.lhs.x `pre` k.lhs.y| in each production of |N|. For each phase |ph| of nonterminal |N|, we introduce two barrier-attributes |sub abegin ph| and |sub aend ph|. The attributes of the phase are enclosed by these barriers. There is one master phase |N| for each nonterminal |N| where all attributes and phases are enclosed by via dependencies on its barriers. The lexical nesting of phases and rules induces additional order constraints, as sketched by the following example: \begin{code} sem N (many r1) -- |ainh.lhs.(sub abegin N) `pre` (many r1)| phase ph1 -- |ainh.lhs.(sub abegin N) `pre` ainh.lhs.(sub abegin ph1)| (many r2) -- |ainh.lhs.(sub abegin ph1) `pre` (many r1)| phase ph2 -- |ainh.lhs.(sub abegin ph1) `pre` ainh.lhs.(sub abegin ph2)| (many r3) -- |ainh.lhs.(sub abegin ph2) `pre` (many r3)| \end{code} With order-dependencies, and with syntax as demonstrated in Chapter~\rref{chapt.side-effect}, more dependencies between phases may be specified. Further, we take the union of all constraints on phases of nonterminal |N| from each child with nonterminal |N|, and integrate these in the NDG of |N|. The PDGs must remain acyclic, otherwise the constraints on phases are inconsistent, which is considered a static error. \begin{figure}[tb] \begin{center} \begin{tikzpicture} [ phase/.style={circle, minimum size=1mm, node distance=4mm, top color=black, bottom color=black,font=\scriptsize} , lab/.style={font=\footnotesize} , level distance=6mm ] \node[phase](master){} child { node[phase](analyze){} child { node[phase](name){} } child { node[phase](type){} } } child { node[phase](translate){} }; \node[lab, right of=master]{master}; \node[lab, right of=translate]{translate}; \node[lab, left of=analyze]{analyze}; \node[lab, left of=name]{name}; \node[lab, right of=type]{tpcheck}; \end{tikzpicture} \end{center} \caption{Phase nesting visualized as a tree.} \label{fig.warren.phasenesting} \end{figure} \paragraph{Phase nesting.} A phase may be contained inside another phase. The parent of a node can invoke the contained phase of the node as part of the evaluation of the containing phase. 
This model introduces levels of granularity that allow for more concise specifications.
We can visualize this model as a tree, which we show in Figure~\ref{fig.warren.phasenesting}.

\begin{defi}[Nesting tree]
A \emph{nesting tree} describes the nesting of phase declarations.
\end{defi}

In a nesting tree, the nodes are phases.
When a node |ph2| is a child of a node |ph1|, |ph2| is a nested phase of |ph1|.
The example tree above corresponds to the following phase interface:
\begin{code}
itf Expr           -- defines the siblings for the root (master phase)
  phase analyze    -- nested phase in master phase
    phase name     -- nested phase in phase |analyze|
    phase tpcheck  -- nested phase in phase |analyze|
  phase translate  -- nested phase in master phase
\end{code}
The above phase interface describes that during the evaluation of the analyze phase of a child, the name phase of that child is invoked.
Different strategies may be specified for the name phase than for the analyze phase.

The nesting tree can be inferred from the NDG.
Given a nonterminal |N|, |N|'s phase |ph1| is a child of |N|'s phase |ph2| if either:
\begin{itemize}
\item |ainh.(sub abegin ph2) <-+ ainh.(sub abegin ph1)| and |asyn.(sub aend ph1) <-+ asyn.(sub aend ph2)|.
In this case, |ph1| is fully enclosed by |ph2|.
\item there exists a |ph3| such that |ph3| is a child of |ph1| and |ph3| is a child of |ph2|, and in the NDG |ainh.(sub abegin ph2) <-+ ainh.(sub abegin ph1)| or |asyn.(sub aend ph1) <-+ asyn.(sub aend ph2)|.
In this case, |ph3| is a child of both |ph1| and |ph2|, which we resolve by making |ph1| a child of |ph2|.
\end{itemize}
The constraints form a directed graph per nonterminal, and can be solved with a fixpoint computation.
If the resulting graph is not a tree, then either the constraints are inconsistent, or too few constraints were specified.
Such a nesting tree leads to additional edges between the barriers of the NDG.
If the PDGs become cyclic due to these additional edges, the constraints on phases were inconsistent.
This approach infers a single nesting tree per nonterminal.
The inference of the nesting tree has the advantage that it becomes easier to compose phases.
On the other hand, the nesting of phases is typically limited, and has a \emph{purpose}, so it is only a slight burden to specify the nesting fully.

\paragraph{Overlap prevention.}
Sibling phases may not overlap, otherwise it is unclear which evaluation of a node corresponds to which phase.
Two phases overlap if either |ainh.x.(sub abegin ph2) <-+ ainh.x.(sub abegin ph1)| and |asyn.x.(sub aend ph2) <-+ asyn.x.(sub aend ph1)|, or the other way around.
However, the dependencies on barriers do not guarantee this property.
Given two sibling phases |ph1| and |ph2|, we wish to express that either |end ph1 `pre` begin ph2| or |end ph2 `pre` begin ph1|.
Since siblings are not ordered, we do not know which of the two to take.
Attribute scheduling is actually a special case of ordering phases: if we define phases such that each attribute is declared in a unique phase, then determining the order of phases is attribute scheduling.
The order of phases may thus be dependent on context.
Therefore, we take a similar approach as in Section~\ref{sect.warren.visitsgraphs} and define the phases graph.

\paragraph{Phases graph.}
The phases graph of a nonterminal is the phases graph of the master phase of the nonterminal.
A phases graph is a DAG in which each vertex represents a phase, and an edge from |a| to |b| means that |a| comes before |b| in the sequence.
A vertex is labelled with the phases graph of its children: \begin{code} g ::= phase ph (many g) (many e) -- phases graph of phase |ph| (initially the master phase) e ::= g -> g -- edges connect phases subgraphs \end{code} Such a graph |g| may have several sources and sinks. However, each path from source to sink must contain precisely all sibling phases, because each path corresponds to an ordered sequence of siblings in the nesting tree of the nonterminal. Similar as the visits graph, the phases graph contains all consistent ways to interleave phases. An interleaving is consistent when it satisfies the partial order of the phases. In practice, there are only a few phases per nonterminal, with relatively dense constraints, thus the actual required portion of the graph contains little variety. Also, the problem is slightly easier compared to Section~\rref{sect.warren.visitsgraphs} because each path from source to sink is equally long, and the number of vertices on such a path is known beforehand. We take a slightly different representation of the phases graph to stress the similarity with the visits graph. The vertices |s| of the phases graph represent an ordered sequence of the available |inh.(sub abegin ph)| attributes. An edge is annotated with the attribute that has become available: \begin{code} s ::= (many a) -- state of a vertex in the phases graph e ::= s (over (->) a) s -- edge in the phases graph \end{code} Note that we can construct the nesting tree from such a sequence. The purpose of the phases graph is to determine a small number of totally ordered nesting trees for a nonterminal. In a similar way as with visits, we determine a total order on the phases of children of a production given the total order on the phases of the production. For this analysis, we do not add visit-vertices to the PDGs. Instead, we add dependencies between attributes |asyn.x.(sub aend ph1)| and |ainh.x.(sub abegin ph2)| for children |x| and phases |ph1| and |ph2|. \begin{figure} \begin{code} avail (many a) g d ^^ ^^ = ^^ all (avail (many a) g) (deps g d) && except (many a) g d except (many a) g <% ainh.lhs.y %> | ^^ not isBegin ainh.lhs.y || ainh.y `elem` (many a) except (many a) g <% ainh.x.y %> | ^^ not isBegin ainh.x.y || (d `elem` deps g <% ainh.x.y %> && isEnd d) except (many a) g <% asyn.c.y %> | ^^ True except (many a) g <% rule x %> | ^^ True except (many a) g <% child x %> | ^^ True except (many a) g <% visit x i %> | ^^ True isBegin <% k.x.(sub abegin ph) %> isEnd <% k.x.(sub aend ph) %> \end{code} \caption{The relation |avail| with begin and end barriers.} \label{fig.warren.availbeginend} \end{figure} Recall that a vertex in a PDG is available when all its dependencies are available. In this situation, for a begin barrier to be ready, it must have been connected to an end barrier. Figure~\ref{fig.warren.availbeginend} gives the definition. Further, an end-barrier is required when the begin-barriers of the phase and its children are available. Sufficient child-edges must be added so that the (indirect) dependencies of a required end-barrier are available. \begin{defi}[Selectable] A begin-barrier is \emph{selectable} when the following conditions are met. \begin{itemize} \item Its dependencies are available. This implies that the begin-barrier of its parent is available. \item If a begin-barrier of a sibling is available, then so is the end-barrier of that sibling. This ensures that the phases do not overlap. 
\end{itemize}
\end{defi}

Options for child-edges can thus be determined by taking the intersection between the selectable and the required begin-barriers.
If this intersection is empty, then the dependencies on phases are inconsistent; that is, they force the barriers to overlap.
If there are multiple options available, we prioritize based on some stable order, such as the order of appearance.
As initial solution, we start with only the begin-barrier of the master phase available for each nonterminal.
For the root nonterminals, we construct as initial solution a full path by determining a late-as-possible order given the dependencies of the NDG, and prioritizing based on the stable order mentioned earlier.

\paragraph{Attribute scheduling.}
The set of paths from source to sink in the phases graph is small in practice.
The number of paths is the price to pay for not specifying a total order on phases.
For each path in the phases graph, we determine the accompanying nesting tree.
The set of these trees forms the phase interfaces of a nonterminal.
For each child, we specify which phase interface to use, using a type index for the child-rule.
Each phase interface corresponds to a specialized version of the NDG and the PDGs.
We also construct a separate visits graph per phase interface.
Due to the additional dependencies imposed by the phase interface, variability in the visits graph is reduced.

Moreover, on each nesting tree, we can perform additional attribute scheduling.
The attributes are restricted by a partial order, and this order may be strengthened if it does not lead to cycles in the NDG of the nesting tree.
For example, we may schedule attributes to the latest possible phase, or avoid scheduling attributes to certain phases.
Such heuristics are similar to those discussed for rule scheduling in Section~\tref{visit.sect.order}.

\paragraph{Remarks.}
The generalization to phases ensures that our approach is a conservative extension of ordered attribute grammars.
A conventional attribute grammar can be expressed by associating all attributes with one phase.
The inference of the phases graph is similar to the inference of the visits graph.
It may be possible to combine the inference of both graphs.
However, it is not immediately clear how to express heuristics, such as the avoidance of certain phases, in a combined approach.

\section{Commuting Rules}
\label{warren.sect.commutting}

In this section, we present AGs with commuting rules, which are chained rules that can be reordered.
These commuting rules provide an abstraction for the use of references in Section~\ref{sect.warren.core}.
Given an explicit ordering of rules, the composition of two rules is \emph{commutative} when the two rules are \emph{commutable}, which means that the rules may be swapped in the composition.
Rule composition is thus a conditionally \emph{commutative operator}.
Commutable rules represent \emph{commutable operations}.
The swapping of rules models side effects, and commutativity facilitates reasoning about the safe use of side effects.

\paragraph{Syntax.}
We assume some conventional AG to start with, and show later how to encode the commutable rules in BarrierAG.
Firstly, we introduce \emph{threaded attributes}, which are a special form of chained attributes.
Secondly, we introduce commuting rules, which specify a set |many h| of commutable chains the rule participates in:
\begin{code}
k ::= ...    -- attribute forms (e.g. |inh| and |syn|)
   |  thr    -- threaded attribute
r ::= ...
r ::=  ...                                             -- rules
   |   x : ^^ p (many z1) = f (many z2) ^^^ (many h)   -- rule |x| that commutes over |many h|
h ::=  z1 `comm` z2                                    -- rule commutes with rules of |z1| and |z2|
\end{code}
The syntax |z1 `comm` z2| specifies that the rule connects the chains of |z1| and |z2|. Attribute |z1| must be in an input position, and |z2| in an output position. Also, both must be threaded attributes, and their types must be the same.
\begin{figure}[tb]
\begin{code}
attr Tree ^^^ thr unq :: Int                        -- declaration of threaded attribute

sem Root
  | Root  ^^^  root.unq   = 1                       -- initial value of threaded attribute
               loc.final  = root.unq                -- final value of threaded attribute

sem Tree                                            -- modifications to the threaded attribute
  | Leaf  ^^^  (loc.myId, lhs.unq) = (lhs.unq, lhs.unq+1)  ^^^  lhs.unq `comm` lhs.unq
  | Bin   ^^^  l.unq    = lhs.unq   ^^^  lhs.unq `comm` l.unq
               r.unq    = l.unq     ^^^  l.unq `comm` r.unq
               lhs.unq  = r.unq     ^^^  r.unq `comm` lhs.unq
\end{code}
\caption{Example of commuting rules.}
\label{fig.warren.commutingexample}
\end{figure}
The example in Figure~\ref{fig.warren.commutingexample} uses a threaded counter to demonstrate the commuting rules. Each value of |loc.myId| is unique, although it is not guaranteed that a right sibling of a node has a higher |loc.myId| value. This would be the case if the rules were not commutable. The first rule of |Root| shows how to provide the initial value of a threaded attribute: this rule defines the threaded attribute, but does not commute over it. The second rule of |Root| shows how to obtain the final value of a threaded attribute: this rule refers to the value of the threaded attribute, but does not commute over it. Finally, the rules of |Tree| are commutable rules.
As an additional requirement, a rule that defines a threaded attribute |lhs.y| must mention |lhs.y| in its set of commuting chains |many h|. This restriction ensures that the threaded attribute can be represented as an inherited attribute. The commuting chains |many h| specify in which chains a rule participates:
\begin{code}
uses (many h)  =  { z1 | <% z1 `comm` z2 %> `elem` (many h) }
defs (many h)  =  { z2 | <% z1 `comm` z2 %> `elem` (many h) }
\end{code}
A rule with |many h| connects |uses (many h)| to |defs (many h)|.

\paragraph{BarrierAG encoding.}
We use a BarrierAG encoding as a means to specify the implementation of commuting rules. We represent each threaded attribute |thr x :: tau| with three attributes in BarrierAG:
\begin{code}
inh (sub x ref1) :: IORef tau  ^^^   -- reference to the mutable state
syn (sub x wr1) barrier              -- all commutable updates happen before this barrier
inh (sub x rd1) barrier              -- all non-commutable reads happen after this barrier
\end{code}
Thus, a threaded attribute becomes a reference to mutable state. We ensure in the encoding that rules only depend on the reference, which permits the reordering. As an invariant, the write barrier of a child depends on all commuting rules of the child (and its subtree) that update the reference. All non-commuting rules that refer to the reference depend on the read barrier.
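To give an intuition of what this encoding amounts to at run time, the following Haskell sketch hands out unique numbers through an |IORef|, in the spirit of the |unq| attribute of Figure~\ref{fig.warren.commutingexample}. The sketch is only illustrative and is not the generated code; in particular, the names |freshId| and |example| are hypothetical.
\begin{code}
import Data.IORef

-- Runtime counterpart of a threaded attribute: every commuting rule only
-- receives the reference, so the updates may be reordered without changing
-- which numbers are handed out.
freshId :: IORef Int -> IO Int
freshId ref = atomicModifyIORef' ref (\n -> (n + 1, n))

example :: IO (Int, Int, Int)
example = do
  ref    <- newIORef 1       -- initializer: the non-commuting write of |Root|
  i1     <- freshId ref      -- commuting rule of one |Leaf|
  i2     <- freshId ref      -- commuting rule of another |Leaf|
  final  <- readIORef ref    -- non-commuting read of |Root|: scheduled after all commuting writes
  return (i1, i2, final)     -- |i1| and |i2| differ, whatever order the |freshId| calls ran in
\end{code}
The barriers of the encoding enforce exactly this discipline: commuting writes are scheduled before the write barrier, and non-commuting reads after the read barrier.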
\begin{figure}[tb]
\begin{code}
sem N | P                                -- |k| is a child of production |P| of nonterminal |N|
  loc.(ann k (x) (ref1)) :: IORef tau    -- local attribute declaration with type
  loc.(ann k (x) (wr1)) barrier          -- local barrier declaration
  loc.(ann k (x) (rd1)) barrier          -- local barrier declaration

  (sup (k.(sub x ref1)) trval) = (sup (loc.(ann k (x) (ref1))) trval)  ^^^  -- copy as reference for child
  k.(sub x wr1) `pre` loc.(ann k (x) (wr1))      -- write barrier of child before local write barrier
  loc.(ann k (x) (rd1)) `pre` k.(sub x rd1)      -- local read barrier before read barrier of child
\end{code}
\caption{Sketch of the encoding of commuting rules in BarrierAG.}
\label{fig.warren.commsketch}
\end{figure}
For each threaded attribute |thr y :: tau| of a child |k|, we introduce three local attributes, and rules to connect the local attributes with the attributes of the child. Figure~\ref{fig.warren.commsketch} gives a sketch. Note that |ann k x ref1| is the name of the attribute: the name of the child |k| is part of the name of the attribute. When a rule refers to the threaded attribute of |k|, we actually let it refer to the local attributes, as we show below. For notational convenience, we also introduce these local attributes for threaded attributes of |lhs|:
\begin{code}
(sup (loc.(ann lhs (x) (ref1))) trval) = (sup (lhs.(sub x ref1)) trval)  ^^^  -- copy as reference
loc.(ann lhs (x) (wr1)) `pre` lhs.(sub x wr1)    -- local write barrier before write barrier of |lhs|
lhs.(sub x rd1) `pre` loc.(ann lhs (x) (rd1))    -- read barrier of |lhs| before local read barrier
\end{code}
These rules for |lhs| are the contravariant version of the rules for the children.
\begin{figure}[t]
\centering
\begin{tikzpicture}
  [ nd/.style={rectangle, draw, minimum size=6mm, node distance=10mm,font=\scriptsize, sibling distance=16mm}
  , nd2/.style={ellipse, draw, minimum size=8mm, node distance=10mm,font=\scriptsize, sibling distance=16mm}
  , dep/.style={->,thick}
  , dep2/.style={->,dashed, thick}
  , dep3/.style={->,dotted, thick}
  , lab/.style={font=\scriptsize}
  ]
  %% Root
  \node[nd](lhsunqref){|lhs.(sub unq ref1)|};
  \node[nd,right of=lhsunqref, xshift=6mm](lhsunqrd){|lhs.(sub unq rd1)|};
  \node[nd,right of=lhsunqrd, xshift=6mm](lhsunqwr){|lhs.(sub unq wr1)|};
  \node[nd,below of=lhsunqref](loclhsunqref){|loc.(ann lhs unq ref1)|};
  \node[nd,right of=loclhsunqref, xshift=6mm](loclhsunqrd){|loc.(ann lhs unq rd1)|};
  \node[nd,right of=loclhsunqrd, xshift=6mm](loclhsunqwr){|loc.(ann lhs unq wr1)|};
  \draw[dep](loclhsunqref) to (lhsunqref);
  \draw[dep3](loclhsunqrd) to (lhsunqrd);
  \draw[dep2](lhsunqwr) to (loclhsunqwr);
  %% Left
  \node[nd,below of=lhsunqref, xshift=-36mm, yshift=-40mm](loclunqref){|loc.(ann l unq ref1)|};
  \node[nd,right of=loclunqref, xshift=6mm](loclunqrd){|loc.(ann l unq rd1)|};
  \node[nd,right of=loclunqrd, xshift=6mm](loclunqwr){|loc.(ann l unq wr1)|};
  \node[nd, below of=loclunqref](lunqref){|l.(sub unq ref1)|};
  \node[nd,right of=lunqref, xshift=6mm](lunqrd){|l.(sub unq rd1)|};
  \node[nd,right of=lunqrd, xshift=6mm](lunqwr){|l.(sub unq wr1)|};
  \draw[dep] (lunqref) to (loclunqref);
  \draw[dep3] (lunqrd) to (loclunqrd);
  \draw[dep2] (loclunqwr) to (lunqwr);
  %% Right
  \node[nd,below of=lhsunqref, xshift=36mm, yshift=-40mm](locrunqref){|loc.(ann r unq ref1)|};
  \node[nd,right of=locrunqref, xshift=6mm](locrunqrd){|loc.(ann r unq rd1)|};
  \node[nd,right of=locrunqrd, xshift=6mm](locrunqwr){|loc.(ann r unq wr1)|};
  \node[nd, below of=locrunqref](runqref){|r.(sub unq ref1)|};
  \node[nd,right of=runqref, xshift=6mm](runqrd){|r.(sub unq rd1)|};
  \node[nd,right of=runqrd, xshift=6mm](runqwr){|r.(sub unq wr1)|};
  \draw[dep] (runqref) to (locrunqref);
  \draw[dep3] (runqrd) to (locrunqrd);
  \draw[dep2] (locrunqwr) to (runqwr);
  %% Rules
  \node[nd2, below of=lhsunqref, xshift=-15mm, yshift=-20mm](r1){|r1|};
  \node[nd2, right of=r1, xshift=20mm](r2){|r2|};
  \node[nd2, right of=r2, xshift=20mm](r3){|r3|};
  \node[nd2, left of=r1](r1'){|ann r 1 trref|};
  \node[nd2, below of=r2, yshift=-5mm](r2'){|ann r 2 trref|};
  %% Dependencies of reference
  \draw[dep] (r1) to (loclhsunqref);
  \draw[dep] (r1) to (loclunqref);
  \draw[dep] (r1') to (loclhsunqref);
  \draw[dep] (loclunqref) to (r1');
  \draw[dep] (r2) to (loclunqref);
  \draw[dep] (r2) to (locrunqref);
  \draw[dep] (r2') to [in=30,out=170] (loclunqref);
  \draw[dep] (locrunqref) to (r2');
  \draw[dep] (r3) to (loclhsunqref);
  \draw[dep] (r3) to (locrunqref);
  %% Dependencies of write
  \draw[dep2] (loclhsunqwr) to (r3);
  \draw[dep2] (locrunqwr) to (r3);
  \draw[dep2] (locrunqwr) to (r2);
  \draw[dep2] (loclunqwr) to (r2);
  \draw[dep2] (loclhsunqwr) to (r1);
  \draw[dep2] (loclunqwr) to (r1);
  \draw[dep2] (loclunqwr) to [out=35,in=145] (locrunqwr);
  \draw[dep2] (loclhsunqwr) to [out=-145,in=80] (loclunqwr);
  %% Dependencies of read
  \draw[dep3] (loclunqrd) to (loclhsunqrd);
  \draw[dep3] (locrunqrd) to [out=155,in=25] (loclunqrd);
  %% Legend
  \node[lab, right of=loclhsunqwr, xshift=20mm](a0){};
  \node[lab, right of=a0](a1){};
  \node[lab, below of=a0, yshift=4mm](b0){};
  \node[lab, right of=b0](b1){};
  \node[lab, below of=b0, yshift=4mm](c0){};
  \node[lab, right of=c0](c1){};
  \draw[dep](a1) to node[lab,auto,above]{|ref1|} (a0);
  \draw[dep2](b1) to node[lab,auto,above]{|wr1|} (b0);
  \draw[dep3](c1) to node[lab,auto,above]{|rd1|} (c0);
\end{tikzpicture}
\caption{Attributes encoding the commuting rules of the production |Bin|.}
\label{fig.warren.commexmpl}
\end{figure}
\emph{Non-commuting write}. When a rule |x| with commuting chains |many h| defines a threaded attribute |athr.k.y|, but |athr.k.y `notelem` defs (many h)|, rule |x| serves as \emph{initializer} for the threaded attribute. In the encoding, the defining occurrence |athr.k.y| in the left-hand side of rule |x| is replaced with |sup (loc.(ann k y ref1)) trnew|. Moreover, we add the following dependency rule:
\begin{code}
loc.(ann k y wr1) `pre` loc.(ann k y rd1)   -- orders commutable writes before non-commutable reads
\end{code}
Since rule |x| starts the chain, and given the invariants on the read and write barriers, this dependency rule ensures that commutable writes take place before non-commutable reads.

\emph{Non-commuting read}. When a rule |x| with commuting chains |many h| refers to a threaded attribute |athr.c.y|, but |athr.c.y `notelem` uses (many h)|, rule |x| reads the final value of the threaded attribute. Thus, we replace the occurrence |athr.c.y| in the right-hand side of rule |x| with |sup (ann c y ref1) trref|, and add the following dependency:
\begin{code}
loc.(ann c y rd1) `pre` rule x   ^^   -- the read depends on the read barrier
\end{code}
The identifier |c| is either a child |k| or |lhs|.

\emph{Commuting read and write}. When a rule |x| with commuting chains |many h| defines a threaded attribute |athr.c.y| with |athr.c.y `elem` defs (many h)|, we replace the defining occurrence in the left-hand side of |x| with |sup (ann c y ref1) trref|. When |athr.c.y `elem` uses (many h)|, we replace the occurrence in the right-hand side of |x| with |sup (ann c y ref1) trref|. Moreover, we insert a rule |sup x trref| for each commuting chain, which copies the reference and thus links the chain.
Also, we connect the read and write barriers:
\begin{code}
sup x trref :  id (sup (k.y2) trval) = id (sup (c.y1) trval)  ^^^   -- for each |c.y1 `comm` k.y2 `elem` many h| (copies the reference)
rule x `pre` loc.(ann c y wr1)                 ^^^   -- for each attribute |y| in |many h| (rule before write barrier)
loc.(ann c y2 wr1) `pre` loc.(ann k y1 wr1)          -- for each |c.y1 `comm` k.y2 `elem` many h| (connect write barriers)
loc.(ann c y1 rd1) `pre` loc.(ann k y2 rd1)          -- for each |c.y1 `comm` k.y2 `elem` many h| (connect read barriers)
\end{code}
When |lhs.y `elem` defs (many h)|, the read and write barriers are not connected, as that would create a cycle. Also, note the contravariant behavior between the read and write barriers. The barrier attributes and dependency rules enforce the proper ordering of rules that use threaded attributes. Figure~\ref{fig.warren.commexmpl} demonstrates the encoding for the |Bin| production: the boxes represent attributes, and the circles represent the relevant rules. The barriers and their dependencies are not part of the generated code, and thus do not have a runtime overhead. As a consequence, we have actually transformed a chained attribute into an inherited attribute that holds a reference to mutable state.

\paragraph{Referential transparency.}
Referential transparency is important for equational reasoning. For AGs, it is also important to ensure that the order of evaluation does not affect the result. In BarrierAG, rules that use updatable attributes break referential transparency. However, with commutable rules, we can establish a weaker version of referential transparency. When two rules commute, the actual values of the attributes that these rules define may differ between orderings, but the final result in the context where these rules occur, which abstracts from the values of these attributes, is still equivalent under any ordering of the commutable rules.
The composition of rules and the context of rules can be made explicit with arrow notation. A commuting rule |r1 : ^^ (x1,y1) = f (x0,y0), ^^ x0 `comm` x1| and a commuting rule |r2 : ^^ (x2,z1) = g (x1,z0), ^^ x1 `comm` x2| correspond respectively to the arrows |(x1,y1) <- f -< (x0,y0)| and |(x2,z1) <- g -< (x1,z0)|. Section~\tref{intro.sect.correspondences} shows that the composition of these rules can be expressed as an arrow:
\begin{code}
proc (x0,y0,z0) -> do
  (x1,y1) <- f -< (x0,y0)
  (x2,z1) <- g -< (x1,z0)
  returnA -< (x2,y1,z1)
\end{code}
Alternatively, when we reorder |f| and |g|, and rename the attributes, we obtain the following arrow:
\begin{code}
proc (x0,y0,z0) -> do
  (x1,z1) <- g -< (x0,z0)
  (x2,y1) <- f -< (x1,y0)
  returnA -< (x2,y1,z1)
\end{code}
This notion can straightforwardly be generalized to rules that commute over many attributes, or that define and use many other attributes. These two rules are \emph{commutable} over the attributes |x0,x1| and |x1,x2| if their compositions are equivalent for a given rule context |h|, and |r1 `notpre` r2|:
\mathhs
\begin{displaymath}
|h| \left(
\begin{code}
proc (x0,y0,z0) -> do
  (x1,y1) <- f -< (x0,y0)
  (x2,z1) <- g -< (x1,z0)
  returnA -< (x2,y1,z1)
\end{code}
\right)
|^^^ == ^^^|
|h| \left(
\begin{code}
proc (x0,y0,z0) -> do
  (x1,z1) <- g -< (x0,z0)
  (x2,y1) <- f -< (x1,y0)
  returnA -< (x2,y1,z1)
\end{code}
\right)
\end{displaymath}
\plainhs
If there exists, directly or indirectly, a dependency between |r1| and |r2|, then the rules may not commute. This is for example the case when |r1| defines a (non-threaded) attribute that is used by |r2|, or when there is a dependency rule between them.
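As a concrete illustration of this law, the following Haskell sketch (purely illustrative; the names |comp1|, |comp2|, |h|, and |check| are not part of the formal development) instantiates |f| and |g| with two rules that each hand out a number from a counter, and chooses a context |h| that only observes the final counter value and the set of handed-out numbers:
\begin{code}
import qualified Data.Set as Set

-- Two rules that each take the counter and hand out an identifier.
f, g :: (Int, ()) -> (Int, Int)
f (x, ()) = (x + 1, x)
g (x, ()) = (x + 1, x)

-- The two compositions from the law, written with plain function application.
comp1, comp2 :: (Int, (), ()) -> (Int, Int, Int)
comp1 (x0, y0, z0) =
  let  (x1, y1) = f (x0, y0)
       (x2, z1) = g (x1, z0)
  in (x2, y1, z1)
comp2 (x0, y0, z0) =
  let  (x1, z1) = g (x0, z0)
       (x2, y1) = f (x1, y0)
  in (x2, y1, z1)

-- The context |h| abstracts from which rule received which identifier.
h :: (Int, Int, Int) -> (Int, Set.Set Int)
h (x2, y1, z1) = (x2, Set.fromList [y1, z1])

-- |h . comp1| and |h . comp2| agree on every input.
check :: Int -> Bool
check x0 = h (comp1 (x0, (), ())) == h (comp2 (x0, (), ()))
\end{code}
Although |comp1| and |comp2| disagree on which rule receives which number, |h| cannot distinguish the two orderings, which is exactly the sense in which the two rules commute.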
The rule context |h| is an abstraction of the larger composition in which the composition of |f| and |g| is embedded. For example, |h| can represent the composition of the rules of the entire tree, and the law then states that the end result of the computation is not affected. In practice, we take a more abstract notion of |h|, for example the property that all |loc.myId| attributes have a unique value. In Section~\tref{summary.sect.commuting} we give some examples of the function |h|.

\paragraph{Remarks.}
To reason with commutable rules, we may need to make assumptions about the order of evaluation. This is, for example, the case when the values of the attributes are trace monoids~\citep{Diekert:1997:PCT:267871.267879}. Phases can be used for this purpose.
The identification of commuting rules may be relevant for the parallel and incremental evaluation of attribute grammars. Chained attributes sequentialize code, whereas commuting rules allow more interleaving. Similarly, during incremental evaluation, changes in a subtree that appears earlier in the evaluation may be lifted over a later subtree if these changes are visible in a threaded attribute.
Commutable rules can also be used to collect statistics or other runtime properties of the evaluation process, in such a way that the attribute dependencies of the collecting rules have only a minor influence on that process. For example, a count of the nodes of the tree traversed so far may be an indication of how much work has been done.
In the case of type inference, substitutions may be represented as a threaded attribute, so that the threading of the substitution does not dictate the order of evaluation. Traditionally, unification is only a commutable operation when all unifications succeed. By encoding the substitution as a graph structure, it is possible to make unification a commutable operation also in the case of a type conflict~\citep{heeren05TopQuality}.

\section{Related Work}

The ease with which attribute grammars can be composed is an important benefit that they offer. In fact, AGs are so easily composed that the compositions may accidentally become inconsistent, i.e. have attributes with a cyclic definition.
\citet{DBLP:journals/mst/Knuth68} proves that an AG is well-defined if and only if the dependency graphs of productions are cycle-free according to his refined algorithm (Knuth-2). When the dependency graphs are cycle-free according to Knuth's original algorithm (Knuth-1), the AG is well-defined, but not necessarily the other way around. These are static properties that guarantee that the evaluation of an AG terminates.
Knuth-2 uses a dependency graph per production/child-production combination, in contrast to the single dependency graph per production used by Knuth-1. Knuth-2 thus leads to a number of dependency graphs per production in the order of $|p|^{|b|}$, where |p| is the number of productions of the child nonterminals, and |b| is the number of children. In practice, e.g. for the let-production of a lambda calculus, |p| is rather large (|p > 10|), but |b| is typically small (|b <= 2|). However, we usually define nonterminals with a fine granularity, so that distinct dependency graphs per production/child-production combination do not offer an advantage over a single graph per production. In our experience, AGs are either necessarily cyclic, or cycle-free in both Knuth-1 and Knuth-2.
If an AG is not cycle-free according to Knuth-1, this is an indication that the AST does not have sufficient structure, which is unlikely for strongly typed languages. In the first case, on-demand evaluation may still yield results for attributes, although it is the responsibility of the programmer to ensure this. In the second case, we know that an evaluation order exists, and we can use a statically ordered evaluation algorithm. A statically ordered evaluation algorithm is likely to exhibit better time and space behavior, and actually permits minor assumptions about the evaluation order to be made.
\citet{DBLP:journals/acta/Kastens80} presented an approach to infer a visit interface per nonterminal. Unfortunately, when using this approach, we often encounter cycles that are induced by Kastens' scheduling algorithm. These induced cycles hamper compositionality, because, as a remedy, we need to add artificial dependencies between attributes to the AG to control the scheduling. Also, the effect of scheduling is not visible in the original rules of the grammar, which makes such cycles very hard to understand and resolve.
The approach by \citet{DBLP:conf/popl/KennedyW76} is able to find a solution, but the solution may be exponentially large. In practice, the solution is not so large: in our experience, this approach works very well for small AGs. For large AGs, however, the exponential behavior may show up. To counter this behavior, we provide sufficient mechanisms to restrict the set of solutions.

\section{Conclusion}

This chapter demonstrates one of the great strengths of attribute grammars: the ability to statically analyse the grammar with abstract interpretations.
As one of the main contributions of this chapter, we reformulated the approach of \citet{DBLP:conf/popl/KennedyW76} so that it can be used to generate code for a strongly typed, purely functional host language. For absolutely non-circular AGs, this approach finds a way to order the rules statically. With such an approach, it is not necessary to explicitly schedule attributes and rules, unless these need to be given special properties; we introduced the notion of phases for this purpose. Since the ordering of attributes and rules can be inferred, we may omit such details from our specification. Consequently, the specification becomes more concise and easier to compose, which is beneficial from an engineering point of view.
The price that we pay is that such an analysis can only give an answer per context, such as a context-dependent phase or visit interface, and a nonterminal may occur in exponentially many contexts. A programmer typically has some evaluation order in mind, and is thus usually able to specify a single order that works in all contexts, which may reduce the size of the generated code, the compile time, and the execution time. On the other hand, if there is no single order that works in all contexts, then a programmer is not likely to be able to keep track of all the possibilities manually.
We presented phases as a means to specify knowledge about the order of evaluation, without going into the fine details, as in Chapter~\rref{chapt.side-effect}. Also, for some problems, the order imposed by attribute dependencies may be too restrictive. We presented commuting rules as a means to loosen some of these restrictions. Commuting rules can be used when a number of rules form a chain, and the individual ordering of the rules is not relevant for the result.
A typical example is a sequence of rules that provides unique numbers to the nodes in the tree. When the requirement is only that each number is unique, the actual order in which the numbers are handed out does not invalidate that requirement.
As future work, more quantitative insight is needed into the effects of dependency analysis on the performance of the generated code. In earlier experiments, the impact seemed negligible, although the results differed from one AG to another. Also, we need more insight into which heuristics have an impact on the size of the visits graph. For example, it appeared that our approach, in which we only compute what is needed, resulted in fewer paths in the visits graph than the approach where we compute as much as possible. However, more quantitative evidence is required to draw such conclusions.
Another direction for future work is an extension of the Kennedy-Warren approach with support for cyclic AGs. Given a collection of cyclic PDGs, we can determine which attributes of the NDGs are mutually dependent. By grouping these attributes in a separate phase that uses, for example, lazy evaluation as its evaluation algorithm, we can support a combination of cyclic and non-cyclic AGs.

%% Todo: Mention commutable rules