\documentclass{entcs}
\usepackage{entcsmacro}
%include lhs2TeX.fmt
%include polycode.fmt
%let fullVersion = False
\usepackage{stmaryrd}
\usepackage{float}
\usepackage{graphicx}
\usepackage{pgf}
\usepackage{mathpartir}
\usepackage{tikz}
\usetikzlibrary{arrows,positioning}
\floatstyle{boxed}
\restylefloat{figure}
\renewcommand{\hscodestyle}{\normalsize}
\newenvironment{smallcode}[0]{\renewcommand{\hscodestyle}{\small}}{\renewcommand{\hscodestyle}{\normalsize}}
\def\lastname{Middelkoop and Dijkstra and Swierstra}
%format . = "."
%format : = "\!:\!"
%format ++ = "+\!+"
%format ^^^ = "\;\;\;"
%format ^^^^ = "\hspace{1em}"
%format epsilon = "\epsilon"
%format (sub (x) (y)) = "{" x "}_{" y "}"
%format (sup (x) (y)) = "{" x "}^{" y "}"
%format (many (x)) = "\overline{" x "}"
%format (abs (x)) = "|" x "|"
%format c1
%format c2
%format x1
%format x2
%format x3
%format v1
%format v2
%format Gamma = "\Gamma"
%format Sigma = "\Sigma"
%format Phi = "\Phi"
%format :- = "\vdash"
%format emptyset = "\emptyset"
%format sep = "\:;\:"
%format bar = "|"
%format ::: = "\::\:"
%format `union` = "\cup"
%format union = "\cup"
%format `intersection` = "\cap"
%format intersection = "\cap"
%format `subset` = "\subseteq"
%format scope = "\circ"
%format << = "\llbracket"
%format >> = "\rrbracket"
%format ~> = "\leadsto"
%format (semc (i) (s) (z)) = << z >> "_{" i,s "}"
%format (semt (i) (z)) = << z >> "_{" i "}"
%format (semb (z)) = << z >>
%format (semp (e) (z)) = << z >> "_{" e "}"
%format Gamma0
%format Gamma1
%format Gamma2
%format Gamma3
%format Gamma4
%format jvisit = "visit"
%% javascript keywords
%format function = "\mathbf{function}"
%format var = "\mathbf{var}"
%format for = "\mathbf{for}"
%format if = "\mathbf{if}"
%format prototype = "\mathbf{prototype}"
%format return = "\mathbf{return}"
%format throw = "\mathbf{throw}"
%format catch = "\mathbf{catch}"
%format try = "\mathbf{try}"
%format == = "=="
%% AG keywords
%format attr = "\mathbf{attr}"
%format data = "\mathbf{data}"
%format sem = "\mathbf{sem}"
%format inh = "\mathbf{inh}"
%format syn = "\mathbf{syn}"
%format con = "\mathbf{con}"
%format clause = "\mathbf{clause}"
%format default = "\mathbf{default}"
%format itf = "\mathbf{itf}"
%format visit = "\mathbf{visit}"
%format internal = "\mathbf{internal}"
%format match = "\mathbf{match}"
%format child = "\mathbf{child}"
%format invoke = "\mathbf{invoke}"
%format datasem = "\mathbf{datasem}"
%format rulerfront = "{\mbox{\scshape ruler-front}}"
%format rulercore = "{\mbox{\scshape ruler-core}}"
%format Java = "{\mbox{\scshape Java}}"
%format Haskell = "{\mbox{\scshape Haskell}}"
%format Javascript = "{\mbox{\scshape JavaScript}}"
%format MiniJava = "{\mbox{\scshape MiniJava}}"
%format JS = "\mathcal{J}"
\begin{document}
\begin{frontmatter}
%if fullVersion
\title{Visitor-based Attribute Grammars with Side Effect (Extended Version)}
%else
\title{Visitor-based Attribute Grammars with Side Effect}
%endif
\author{Arie Middelkoop\thanksref{email}}
\address{Universiteit Utrecht, The Netherlands}
\author{Atze Dijkstra\thanksref{email}}
\address{Universiteit Utrecht, The Netherlands}
\author{S. Doaitse Swierstra\thanksref{email}}
\address{Universiteit Utrecht, The Netherlands}
\thanks[email]{Email:
\href{mailto:ariem@@cs.uu.nl} {\texttt{\normalshape
\{ariem,atze,doaitse\}@@cs.uu.nl}}}
\begin{abstract}
The visitor design pattern is often applied to program traversal
algorithms over Abstract Syntax Trees (ASTs).
It defines a visitor, an object with
a visit method that is executed for each node in the AST.
These visitors have the advantage that the order of traversal
is explicitly under control of the programmer, which is essential
to deal with side-effectful computations.
Unfortunately, the exchange of results between traversals is error-prone.
Attribute Grammars (AGs) are an alternative way to write multi-traversal
algorithms. An attribute evaluator decorates the AST with
attributes in one or more traversals. The attributes form a convenient
mechanism to exchange results between traversals.
Unfortunately, AGs discourage the use of side effect.
In this paper, we present |rulerfront|, a language capturing the
combination of the above approaches. A |rulerfront| grammar can be
translated to traversal algorithms in
multiple languages.
In this paper, we translate to the
imperative, dynamically-typed language |Javascript|.
\end{abstract}
\begin{keyword}
attribute grammar, visitor, design pattern
\end{keyword}
\end{frontmatter}
\section{Introduction}
Algorithms for traversing tree-shaped data structures appear in many
applications, especially in compilers.
A lot of effort has been invested in proper abstractions
for tree traversals, for example in the form of
Attribute Grammars (AGs)~\cite{DBLP:journals/mst/Knuth68}.
In the last years, we applied AGs in many small
projects (to teach compiler construction~\cite{CCO}, master projects, etc.), and
several large projects, including the Utrecht Haskell Compiler~\cite{1596650},
the Helium~\cite{Leijen:helium} compiler for learning Haskell,
and the editor Proxima~\cite{Schrage04proxima}.
The use of AGs in these projects is invaluable, for reasons that become clear
in Section~\ref{sect.example}.
Tree traversals play their role in many other fields, including end-user
applications. Web applications, for example, traverse and compute properties
of DOM trees. Sadly, the nice abstractions emerging from the compiler field
are not used to write such traversals. One of these reasons is that AGs
require an additional language to be learned.
Also, the AG formalism poses too severe restrictions to be used effectively
in these areas, such as prohibition of side effect, or tool support may simply
be absent for the programming language in question.
The purpose of this paper and associated work is to treat the above two
technical challenges.
Considering the first challenge, for imperative languages like |Javascript|, a
programmer
either writes recursive functions, or takes a more structured approach via the
visitor design pattern~\cite{DBLP:conf/ecoop/GammaHJV93,674267,DBLP:conf/oopsla/OliveiraWG08}.
Tool support for the visitor design pattern is available for many languages.
For example, the parser generator
SableCC~\cite{DBLP:conf/tools/GagnonH98} generates visitor skeleton code for
the ASTs that the parser
produces, and we used these once to write a type checker for |MiniJava|~\cite{DBLP:conf/sigcse/Roberts01}.
We also used ASM~\cite{ASM}, a library used in many big Java projects that provides
visitor skeleton code to traverse Java bytecode,
to transactify Java programs~\cite{jtransact}.
With visitors, we use side-effect to carry results computed in
one visit over to the next. In our experience, scheduling visits and
side effect is an error-prone process, due to absence of the
define-before-use guarantee. We elaborate on this in Section~\ref{sect.example.vis}.
Attribute grammars offer a programming model where each AST node has attributes
(named values per node). The programmer writes code that computes
attributes in terms of other attributes. The attribute grammar evaluator
automatically schedules this code over visits, and define-before-use can be
verified with the circularity test of AGs. The implicitness of scheduling is
a serious advantage, because it saves us from writing this scheduling manually,
and cannot do it wrong. Unfortunately, the implicitness of scheduling
comes with a severe restriction: side effect cannot be used reliably and
should not be used in attribute computations. In web applications, for example,
we typically would like to use a bit of side effect to influence the contents
of a webpage.
We elaborate on this in Section~\ref{sect.example.ag}.
The main contribution of this paper is an extension of attribute
grammars that has an explicit notion of visits, which offers a hybrid model
between visitors and attribute grammars, while maintaining the best of both
worlds. In fact, besides being more expressive, our extension
make attribute grammars more intuitive to use.
%if fullVersion
In the workshop version of this paper~\cite{Middelkoop10vis}, we presented an earlier
version of a visitor-extension of attribute grammars to model more complex
traversals. We improved this idea to deal with iterative traversals, and
trees that change during evaluation. In this paper, we incorporate the
notation of side effect, such that these ideas are not limited to
ASTs of compilers, but tree traversals in a much wider range of applications.
%endif
\begin{figure}[htb]
\makebox[\textwidth][r] {
\begin{tikzpicture}
[ nd/.style={rectangle, minimum size=5mm, thick, draw=black!50, top color=white, bottom color=black!20,font=\tiny} ]
\node[nd, minimum width=25mm](a) {item a};
\node[nd, below left=0mm of a.south east](b) {very big item b};
\node[nd, below left=0mm of b.south east](c) {not so big c};
\node[nd, minimum width=15.8mm, below left=0mm of c.south east](d) {tiny};
\end{tikzpicture}}
\vspace{-25mm}
\begin{smallcode}
\begin{code}
function Menu(name, children) { -- constructor of a |Menu| AST node
this.name = name; -- the name of the element to align
this.children = children; -- an array of children menus
}
Menu.prototype.accept = function(visitor) {
visitor.visitMenu(this); -- invokes the appropriate visit method
}
function Visitor() { -- constructor of a |Visitor| object
this.depth = 0; -- the depth so far in the menu tree
this.maximum = 0; -- the maximum width observed so far
this.count = 0; -- the number of menus laid out so far
}
var root = -- the menu tree and corresponding html nodes
new Menu("a", [ -- @
item a
@
new Menu("b", [ -- @very big item b
@
new Menu("c", []) -- @not so big c
@
, new Menu("d", []) -- @tiny
@
])
]); -- @@
function align(root, anchor) { -- aligns the html nodes according to the menu tree
var v = new Visitor(); -- creates visitor with empty state
v.visitMenu = function(menu) { -- first visit method (gets |menu| node as param)
menu.elem = document.getElementById(menu.name);
menu.depth = this.depth; -- remember |depth| for the second visit
this.maximum = Math.max(this.maximum, this.depth * 20 + menu.elem.clientWidth);
for(var i in menu.children) {
this.depth = menu.depth + 1; -- reset |this.depth| to one deeper than current
menu.children[i].accept(this); -- invokes visitor on children
}
}
root.accept(v); -- invokes the first visit (on the root)
v.visitMenu = function(menu) { -- second visit method (gets |menu| node as param)
var offset = menu.depth * 20;
menu.elem.style.left = (anchor.offsetLeft + offset) + "px";
menu.elem.style.top = (anchor.offsetTop + this.count * 30) + "px";
menu.elem.style.width = (this.maximum - offset) + "px";
menu.elem.style.height = 30 + "px";
this.count ++; -- inorder numbering of nodes
for(var i in menu.children) { -- invokes |visitor| on |menu|s children
menu.children[i].accept(this); -- |count| should not be reset in this case
}
}
root.accept(v); -- invokes the second visit (on the root)
}
\end{code}
\end{smallcode}
\caption{Pseudocode dualvisit menu alignment.}
\label{fig.intro.visitor}
\end{figure}
To accomplish this goal, we also address the second challenge, which is
to make our approach available for many target languages.
We present |rulerfront|, a small but powerful language for tree traversals.
We managed to isolate the language-dependent part into a small subset
called |rulercore|, and show the translation from |rulercore| to |Javascript|.
In a related paper~\cite{Middelkoop10gpce}, we showed a translation to |Haskell|.
With these two languages, we cover the implementation issues regarding the full
spectrum of mainstream general purpose programming languages available today.
Similar to Yacc, SableCC and UUAG~\cite{uuagc}, the idea is to embed code
fragments of the target language for the computations of attributes.
This keeps general-purpose programming constructs
out of |rulercore|, and allows the programmer to express
computations without having to learn a special language.
In particular, |rulercore| is suitable as a target language for
attribute grammars.
In summary, we present two languages |rulerfront| and |rulercore|.
We implemented both in a single tool written in
Haskell using UUAG\footnote{Downloadable from svn: \url{https://subversion.cs.uu.nl/repos/project.ruler.systems/ruler-core/}}.
In Section~\ref{sect.example} we investigate the above challenges in more detail.
In Section~\ref{sect.sem.rulercore} we present |rulercore|, with a translation to |Javascript|
in Section~\ref{sect.trans.rulercore}.
%if fullVersion
Finally, the translation from |rulerfront| to |rulercore| in Section~\ref{sect.trans.rulerfront}.
%else
In the extended version of this paper~\cite{wgt10journalext}, we give a translation from |rulerfront| to |rulercore|.
%endif
\section{Example}
\label{sect.example}
In this section, we motivate the claims of the introduction in more
detail, and introduce the background information relevant for the remainder
of the paper.
We take as usecase the alignment of an HTML menu in a web application using
|Javascript|, based on a multi-visit tree traversal over an abstract
description of the menu.
We first show a solution using the visitor-pattern, then a near-solution
using attribute grammars, finally followed by two solutions using
|rulerfront|.
\subsection{Visitor design pattern.}
\label{sect.example.vis}
In the visitor design pattern, each node of the Abstract Syntax Tree
(AST) is modelled as an
object, which stores references to the subtrees, and has an |accept| method.
The |accept| method takes a visitor as parameter. A visitor is an object with
a |jvisit|-method for each type of node. The |accept| method of the AST node
calls the appropriate |jvisit|-method on the visitor and passes the node as an
argument. This |jvisit| method consists of statements that manipulate the
state of the visitor or the AST node, and can visit a subtree by calling
the |accept| method on the root of a subtree, with the visitor-object
as parameter.
Figure~\ref{fig.intro.visitor} shows an example of a visitor that layouts
HTML items as a menu in a tree-like fashion, as visualized in the upper-right corner.
The menus are aligned to the right, and submenus are slightly indented.
Furthermore, we desire the smallest layout, based on the contents of
the HTML items.
The variable |root| contains an abstract description of the menu as
a tree of |Menu| objects (the AST). Associated with each |Menu| object is an
HTML item with the same name. We interpret the menu structure to
layout the HTML items.
In the first visit to the menu tree, we query the widths of the
corresponding HTML items.
In the second visit, we adjust the positions and sizes of these items.
Some information (such as indentation based on the |depth|) is computed
in the first visit, and also needed in the second visit. That information
we store as additional fields in the menu objects.
The order in which the tree is visited is clearly defined by the explicit
|accept|-calls in the |jvisit|-methods. This is important to deal with
side effect: we need to have queried all the sizes of the
HTML items before we start resizing them.
However, there are a number of issues with the above solution.
In the second visit, we require a number of values computed in the first visit.
We store these in the state of the AST nodes during the first visit. However,
there is no guarantee that we actually stored them there in the first visit.
Furthermore, we never remove any of these values from the state, and thus retain
all memory until the AST gets deallocated. This especially becomes a problem
when using large AST storing many results.
Furthermore, we have to take care of the order of the statements. For example,
the |this.depth| needs to be reset at the appropriate place, and requires that
the assignment to |menu.depth| is done before. Similarly, the increment
to |this.count| needs to be positioned carefully. These are actually separate
aspects which we would like to implement in isolation, without having to worry
about their composition.
Finally, we need to explicitly write visits to children using |accept|. Some tools
generate depth-first visitors, which alleviates the need to do so, but these come
with restrictions.
For example, all statements must be written before the invocations to children.
In Figure~\ref{fig.intro.visitor}
we reset |this.depth| in between visits to children. To use a depth-first visitor,
we would have to move this statement (which may not be easy).
Moreover, in the simple example that we showed, the two visits are invoked after
each other at the root. In practice, for example in type checking languages with
principal types, we actually invoke multiple visits on a subtree
before moving on to the next subtree. This rules out depth-first visitors, and
is also error-prone to write manually.
\begin{figure}[htb]
\begin{smallcode}
\begin{code}
data Root con Root root : Menu -- node with a child named |root|
data Menu con Menu name ^^^ cs : Menus -- node with a property |name|, and a child |cs|
type Menus : [Menu] -- conceptually a cons-list, physically an array
var root = new Root_Root( -- the |Menus| are physically represented
new Menu_Menu("a", [ -- as an array. However, conceptually
new Menu_Menu("b", [ -- we define its attributes using the
new Menu_Menu("c", []) -- above cons-list representation.
, new Menu_Menu("d", []) ]) ]));
attr Menu Menus inh depth finMax count -- |gathMax|: width of submenu
syn gathMax count -- two attributes with the name |count|
function align(root, anchor) { -- uses embedded attribute grammars
datasem Root clause Root -- equations of production |Root| of nont |Root|
root:depth = 0 -- initial |depth|
root:count = 0 -- initial |count|
root:finMax = root:gathMax -- choose gathered max as global max
datasem Menu clause Menu -- production |Menu| of nonterm |Menu|
cs:depth = 1 + lhs:depth -- increase |depth| for submenus
cs:count = 1 + lhs:count -- increase |count|
lhs:count = cs:count -- provide the updated count to the parent
loc:elem = document.getElementById(loc:name)
loc:offset = lhs:depth * 20 -- indentation
loc:width = loc:offset + loc:elem.clientWidth
lhs:gathMax = Math.max(cs:gathMax, loc:width)
cs:finMax = lhs:finMax -- pass down final maximum
loc:dummy = (function () { -- side-effectful statements
loc:elem.style.left = (anchor.offsetLeft + loc:offset) + "px";
loc:elem.style.top = (anchor.offsetTop + lhs:count * 30) + "px";
loc:elem.style.width = (lhs:finMax - loc:offset) + "px";
loc:elem.style.height = 30 + "px";
}) () -- directly call the anonymous function
datasem Menus -- equations of productions |Cons| and |Nil|
clause Cons
hd:depth = lhs:depth -- pass |depth| downwards through the menus
tl:depth = lhs:depth
hd:count = lhs:count -- thread the |count| through the menus, in an
tl:count = hd:count -- in-order fashion. First to the head, then to
lhs:count = tl:count -- the tail, then back up to the parent.
lhs:gathMax = Math.max(hd:gathMax, tl:gathMax)
hd:finMax = lhs:finMax -- pass global maximum downwards
tl:finMax = lhs:finMax
clause Nil
lhs:count = lhs:count -- thread |count| through without changing it
lhs:gathMax = 0 -- initial maximum
var inhs = new Inh_Root(); -- contains inh attrs of the root
eval_Root(sem_Root, root, inhs); -- run the attribute evaluator
}
\end{code}
\end{smallcode}
\caption{Attribute grammar-based near-solution to menu alignment.}
\label{fig.intro.ag}
\end{figure}
The example in Figure~\ref{fig.intro.visitor} can easily be made more complicated,
for example by having menus that share submenus, and form an acyclic graph instead
of a tree. With each of such complications, the above mentioned
problems grow worse.
As a sidenote, in this paper, we treat the AST as a fixed datastructure. For
example, we do not consider adding menu entries on the fly.
The ideas we propose can deal with the dynamic construction
of proof trees~\cite{Middelkoop10gpce}, and we think that this is sufficient to deal with dynamic changes
to the AST as well, but leave this topic as future work.
Below, we look for a way to generate code similar to the code above, but from a description
that does not have the aforementioned problems.
\subsection{Attribute grammars}
\label{sect.example.ag}
Attribute grammars take care of the problems mentioned above related to visitors,
but are not flexible enough to take side effect into account. We briefly consider
why attribute grammars appear a promising solution, and why side effect is a problem.
Before we show the example, we first give some background information on attribute
grammars, and their encoding in |Javascript|. As syntax, we take a mixture of
UUAG's syntax~\cite{uuagc}, and |rulerfront| (which are closely related).
An attribute grammar is an extension of a context-free grammar, where
nonterminals are annotated with attributes, and productions specify
equations between attributes. The context-free grammar specifies the
structure of the AST: each node of the AST is associated with a production.
A node is also associated to the nonterminal of the left-hand
side of the production, and each child of a node to the corresponding
nonterminal in the right-hand side of the production.
For example, we can denote a production as well as the structure of a
node in the AST using a data-type definition (explained below).
\begin{code}
data Menus -- nonterminal |Menus|
con Cons hd : Menu tl : Menus -- production |Cons|, with two nonts
con Nil -- production |Nil|, empty
\end{code}
This data-type declaration introduces a nonterminal |Menus| with two productions,
representing a cons-list.
The first production is named |Cons|, and corresponds in BNF to |Menus -> Menu Menus|.
The two nonterminals |Menu| and |Menus| in the right-hand side (RHS) have explicitly been
given the respective names |hd| and |tl|.
Terminals only have a name (shown later in Figure~\ref{fig.intro.ag}).
Furthermore, this data-type declaration introduces |Javascript| constructor functions
to construct ASTs. Each production is mapped to a constructor function that gets as
parameter an object corresponding to the symbols in the RHS of the production.
Each nonterminal is mapped to a constructor function that creates a base object that each
of the objects corresponding to the productions inherits.
Due to the inheritance, we can verify at the point of construction that the AST matches
the grammar.
\begin{code}
function Menus() {} -- nonterminal |Menus|: base class
function Menus_Cons(hd, tl) { -- production |Cons|: subclass
this.hd = hd; assert(hd instanceof Menu);
this.tl = tl; assert(tl instanceof Menus);
}
Menus_Cons.prototype = new Menus();
Menus_Cons.prototype.constructor = Menus_Cons;
function Menus_Nil() {} -- production |Nil|: subclass
Menus_Nil.prototype = new Menus();
Menus_Nil.prototype.constructor = Menus_Nil;
\end{code}
Cons-lists occur often. As a shortcut, we alternatively write the following shorthand
for the above instead.
\begin{code}
type Menus : [Menu]
\end{code}
As an additional bonus, we can represent a list of menus as a Javascript array.
Evaluation of an attribute grammar runs an evaluation algorithm
on each node, derived from the equations of its associated production,
that decorates each node with attributes.
We assume that attributes are physically represented as Javascript
properties of the AST objects.
Nodes are decorated with two types of attributes:
inherited attributes are computed during evaluation of the
parent of that node, and
synthesized attributes are computed during evaluation
of the node itself.
We declare the attributes of a nonterminal using an attribute declaration.
\begin{code}
attr Menu inh depth -- inherited attribute
syn gathMax -- synthesized attribute
\end{code}
These attribute names are mapped to object properties named |_inh_depth|
and |_syn_gathMax|. At some point during attribute evaluation, given a participating
|Menu| object |m|, the objects properties |m._inh_depth| and
|m._syn_gathMax| will be defined. An inherited attribute may have the same
name as a synthesized attribute: they are mapped to differently named properties.
As an aside, nodes may define a number of local attributes, which can
be seen as local variables.
To give a semantics to these attributes, we specify equations (rules) per production (explained below -
full details of the nonterminal and its semantics in Figure~\ref{fig.intro.ag}).
\begin{code}
datasem Menu -- nonterminal |Menu|
clause Menu -- production |Menu|
cs:depth = 1 + lhs:depth -- rule
loc:width = 20 * lhs:width -- rule
lhs:gathMax = Math.max(loc:width, cs:gathMax) -- rule
\end{code}
The left-hand side of an equation designates an inherited attribute,
using the notation |childname:attrname|, which allows us to distinguish
attribute names from properties.
The names |loc| and |lhs| are
special: |loc| indicates a local attribute, and |lhs| refers
to a synthesized attribute of the current node. Thus, the attributes
we need to define appear as left-hand side. For example, the
above attribute designations are refer to the |Javascript| properties
|this.cs._inh_depth|, |this._loc_width|, and |this._syn_gathMax|
respectively.
Similarly, the right-hand side consists of a |Javascript| expression,
with embedded attribute references. In this case, we may refer to the
synthesized attributes of children, or with |lhs| to the inherited
attributes of the current node. The terminals of a production are
available as local attributes. In production |Menu|, there is a terminal
called |name|, which is available as attribute |loc:name|.
The translation of attribute references
is similar as described above.
%%% FIGURE WAS HERE
The last rule expands to the |Javascript| statement:
\begin{smallcode}
\begin{code}
this._syn_gathMax = Math.max(this._loc_width, this.cs._syn_gathMax);
\end{code}
\end{smallcode}
Evaluation of an attribute grammar corresponds to traversing the
AST one or more times, and executing rules, according to an
evaluation strategy. In this paper, we restrict ourselves to the class
of well-defined attribute grammars, whose attribute dependencies can be
statically proved to be acyclic~\cite{DBLP:journals/mst/Knuth68}.
For those grammars, a traversal is possible that
visits each subtree a bounded number of times. This corresponds
precisely with typical uses of the visitor-design pattern.
%% With on-demand evaluation of attributes,
%% the application requests a value of a synthesized attribute of
%% the root, and evaluator proceeds by first transitively
%% computing all the attributes the rule depends on, potentially
%% visiting subtrees multiple times. In this paper, we consider
%% a greedy evaluation strategy: subtrees are visited a statically
%% known bounded number of times.
Out of the semantic definitions for e.g. |Menu|, a function
|sem_Menu| is generated containing the evaluation algorithm.
Furthermore, to interface with the decorated tree from
|Javascript| code, a function |eval_Menu| is generated that
takes the AST, the function |sem_Menu|, and an object
containing values for the inherited attributes. It applies
the semantic value, and returns an object with the synthesized
attributes.
\begin{code}
var inhs = new Inh_Menu();
inhs.depth = 0; -- provide inh attrs of root
syns = eval_Menu(sem_Menu, menu, inhs); -- initiate evaluation
window.alert(syns.gathMax); -- access syn attrs of root
\end{code}
In Figure~\ref{fig.intro.ag}, we show an attribute grammar version
of the example presented earlier.
It is a non-solution, for reasons
explained later, but exhibits various important properties.
The keywords written in bold indicate a switch from
|Javascript| code to AG code, and layout determines the
switch back.
The attribute grammar code starts with a number of data type
definitions that describe the structure of the menu tree.
We then define a number of attributes. In particular, the
idea is that we gather a maximum |gathMax| (synthesized), and
use its value at the root, to pass down the global maximum
|finMax| (inherited). Moreover, we count the menus. The inherited
attribute |count| specifies the count for the current menu,
and the synthesized |count| is the count incremented with the
total number of children.
We define the semantics for these attributes in the function
|align|. Because |root| and |anchor| are its parameters, we
also have access to these in the right-hand sides of rules.
To layout the HTML item, we need to execute a number of statements,
and encode this as an expression. In |Javascript|, this can be
accomplished in a variety of ways. In the example, we choose to use
a parameterless anonymous function.
In the semantic of |Menus|, rules are given to compute the attributes
for lists of menus. These rules follow standard patterns: a topdown
passing of |depth| and |finMax|, bottomup computation of gathMax, and
an inorder threading of |count|. In the visitor-example, the fields in
the visitor combined with side-effect took care of this behavior.
With attribute grammars, we have to describe it explicitly.
However, there are mechanisms to abstract from these patterns, in the form of
copy rules~\cite{uuagc},
collection rules~\cite{10.1109/SCAM.2007.13}, or a generalization called
default rules~\cite{Middelkoop10gpce}. With such abstractions, the semantics
of |Menus| can be written in a much conciser way (as we see later).
\begin{figure}[htbp]
\begin{smallcode}
\begin{code}
data Root con Root root : Menu -- node with a child named |root|
data Menu con Menu name cs : Menus -- node with a property |name|, and a child |cs|
type Menus : [Menu] -- conceptually a cons-list, physically an array
var root = new Root_Root( -- the |Menus| are physically represented
new Menu_Menu("a", [ -- as an array. However, conceptually
new Menu_Menu("b", [ -- we define its attributes using the
new Menu_Menu("c", []) -- above cons-list representation.
, new Menu_Menu("d", []) ]) ]));
itf Root -- itf for nonterminal |Root| (root node)
visit perform -- one visit, named |perform|
inh ast -- menu-AST is inherited attribute
itf Menu Menus -- itf for nonterminals Menus (menu nodes)
visit gather -- first visit: compute maximum
inh ast depth -- needs AST and depth
syn gathMax -- computes maximum width of the menu
visit layout -- second visit: layout the HTML items
inh finMax count -- needs global maximum width
syn count -- produces updated count
function align(root, anchor) { -- uses embedded attribute grammars
datasem Root clause Root -- equations of production |Root| of nonterm |Root|
root:depth = 0 -- initial |depth|
root:count = 0 -- initial |count|
root:finMax = root:gathMax -- global max is the gathered max here
invoke layout of root -- require that visit |layout| of |root| is invoked
datasem Menu clause Menu -- equations scheduled to visits of |Menu|
cs:depth = 1 + lhs:depth -- increase |depth| for submenus
cs:count = 1 + lhs:count -- increase |count|
lhs:count = cs:count -- provide the updated count to the parent
match loc:elem = document.getElementById(loc:name)
loc:offset = lhs:depth * 20 -- indentation
loc:width = loc:offset + loc:elem.clientWidth
lhs:gathMax = Math.max(cs:gathMax, loc:width)
cs:finMax = lhs:finMax -- pass down final maximum
visit layout -- equations for visit |layout| and later
match _ = (function () { -- side-effectful statements (wrapped as function)
loc:elem.style.left = (anchor.offsetLeft + loc:offset) + "px";
loc:elem.style.top = (anchor.offsetTop + lhs:count * 30) + "px";
loc:elem.style.width = (lhs:finMax - loc:offset) + "px";
loc:elem.style.height = 30 + "px";
}) () -- directly call the anonymous function
datasem Menus -- standard patterns for |Menus|
default depth = function(depths) { return depths[depths.length-1]; }
default finMax = function(maxs) { return maxs[maxs.length-1]; }
default gathMax = function(maxs) { return Math.max.apply(Math, maxs); }
default count = function(counts) { return counts[0]; }
clause Cons -- a clause must be given for each production,
clause Nil -- otherwise easy to forget one
var inhs = new Inh_Root_perform(); -- contains inh attrs for the root
inhs.ast = root -- AST as inherited attribute
eval_Root_perform(sem_Root, inhs); -- run the attribute evaluator
}
\end{code}
\end{smallcode}
\caption{|rulerfront| solution to menu alignment.}
\label{fig.intro.ruler1}
\end{figure}
The AG code has several nice properties. The order of
appearance of the rules is irrelevant. This allows the rules for
e.g. |depth| and |count| to be written separately and merged
automatically~\cite{uuagc}. In the example, we give all
the rules in one go to fit the page, however, for bigger projects
the ability to write such rules separately is important to write
coherent code.
Another nice property is the absence of invocations of visits
(the |accept| calls in the visitor-example). The number of visits is
totally implicit. From the dependencies between attributes in the
rules, the attribute evaluator determines automatically that
the attribute |root:gathMax| (in the semantics of |Root|) must be
computed first in a visit, before it can be passed as |root:finMax|.
Finally, we check statically if there is an evaluation order of
statements such that all attributes are defined before their value is
accessed. The attribute declarations describe the attributes that must
be defined, and those that are available. The rules describe what
attributes must be available before computing an attribute, and an
evaluation order is possible if the transitive closure of the
dependencies is non-cyclic~\cite{DBLP:journals/mst/Knuth68}.
However, the above code has a number of problems, because the order of
evaluation of rules is determined only by dependencies on attributes.
In particular, the side-effect that rearranges the HTML items is not a
dependency of any rule. Thus it is not clear when it is evaluated,
if it is evaluated at all. Similarly, it is neither clear at what
moment the widths of the HTML items are obtained.
When there are other rules in play that have side effect that
effects these widths, the interleaving of these side effects
becomes even harder to predict.
Finally, the root of the tree does not have any attributes
defined, so there is actually no reason to expect any of the rules to
be executed in the first place.
%%% FIGURE WAS HERE
\begin{figure}[htbp]
\begin{smallcode}
\begin{code}
function align(root, anchor) { -- uses embedded attribute grammars
var sem_Root = -- semantic function with itf |Root|
sem ntRoot : Root -- equations for itf |Root|
visit perform -- equations for the |perform|, the only visit
clause Root -- production named |Root|
child root : Menu = sem_Menu -- introduce a child |root| of nonterm |Menu|
root:ast = lhs:ast -- use |lhs:ast| as AST
root:depth = 0 -- initial |depth|
root:count = 0 -- initial |count|
root:finMax = root:gathMax -- global max is the gathered max of here
invoke layout of root -- demand invocation |layout| of |root|
var sem_Menu = -- semantic function with itf |Menu|
sem ntMenu : Menu -- equations for itf |Menu|
visit gather -- equations for first visit
clause Menu -- production named |Menu|
child cs : Menus = sem_Menus -- introduce a child |cs| of nonterm |Menus|
cs.ast = lhs:ast.cs -- pass submenus as AST for |cs|
cs:depth = 1 + lhs:depth -- increase |depth| for submenus
match loc:elem = document.getElementById(loc.name)
loc:offset = lhs:depth * 20 -- indentation
loc:width = loc:offset + loc:elem.clientWidth
lhs:gathMax = Math.max(cs:gathMax, loc:width)
cs:finMax = lhs:finMax -- pass down global maximum
visit layout -- equations for visit |layout|
clause Menu' -- subproduction named |Menu'|
cs:count = 1 + lhs:count -- increase |count|
lhs:count = cs:count -- provide the updated count to the parent
match _ = (function () { -- side-effectful statements
loc:elem.style.left = (anchor.offsetLeft + loc:offset) + "px";
loc:elem.style.top = (anchor.offsetTop + lhs:count * 30) + "px";
loc:elem.style.width = (lhs:finMax - loc:offset) + "px";
loc:elem.style.height = 30 + "px";
}) () -- directly call the anonymous function
var sem_Menus = -- semantic function, also itf |Menu|
sem ntMenus : Menu -- equations for itf |Menu|
visit gather -- equations for visit |gather|
default depth = function(depths) { return depths[depths.length-1]; }
default finMax = function(maxs) { return maxs[maxs.length-1]; }
default gathMax = function(maxs) { return Math.max.apply(Math, maxs); }
default count = function(counts) { return counts[0]; }
clause Cons -- production |Cons|
match true = lhs:ast.length >= 1 -- clause matches if array has an element
child hd : Menu = sem_Menu -- introduce child |hd| using |sem_Menu|
hd.ast = lhs:ast[0] -- head of the array
child tl : Menu = sem_Menus -- introduce child |tl| using |sem_Menus|
tl.ast = lhs:ast.slice(1) -- tail of the array
clause Nil -- production |Nil| (matches always)
var inhs = new Inh_Root_perform(); -- contains inh attrs for the root
inhs.ast = root -- AST as inherited attribute
eval_Root_perform(sem_Root, inhs); -- run the attribute evaluator
}
\end{code}
\end{smallcode}
\caption{Desugared |rulerfront| solution to menu alignment.}
\label{fig.intro.ruler2}
\end{figure}
\subsection{Ruler-front}
\label{sect.example.ruler1}
We now present a solution using |rulerfront|.
The syntax of |rulerfront| resembles the syntax
of the AG in Figure~\ref{fig.intro.ag}, but is
different. Before we jump into the example, we
first discuss some of the differences.
The central idea is to make visits to an AST
node during attribute evaluation explicit.
We then associate side effect with individual
visits.
\emph{Interfaces.}
Instead of declaring attributes for a nonterminal,
we declare an \emph{interface} for a nonterminal.
An interface declaration specifies the
visits of a nonterminal, and attributes per
visit.
In the following example, we specify that
the attributes of |Menu| are computed in two
visits.
\begin{code}
itf Menu -- interface for nonterminal |Menu|
visit gather -- declaration of first visit
inh ast -- inherited attr defined prior to visit
syn gathMax -- synthesized attr computed by visit
visit layout -- declaration second visit
inh finMax count -- two inherited attributes
syn count -- synthesized attr computed by visit
\end{code}
The order of appearance of visit declarations dictates the
order of visits to AST nodes with this interface. In order
to visit a node, all previous visits must have occurred:
the actual visits on a node must be a prefix of
the declared visits.
Values for inherited attributes must be provided prior to
the visit. Values for synthesized attributes are only
available after a visit has been performed.
In a conventional AG, the AST to traverse can be
seen as hidden inherited attribute. In |rulerfront|, the
AST must actually be provided explicitly as inherited
attribute |ast| in the first visit.
Section~\ref{sect.example.ruler2} motivates this choice.
\emph{Scheduling.}
The rules of a semantics-block are automatically
scheduled over visits using an as-late-as-possible
strategy. If the rules are cyclicly defined, the
scheduling is not possible, and a static error is
reported.
Visits to children are automatically
inferred based on the attribute requirements of
rules. However, since |Root| has no attributes,
there is no need to invoke any visits of |root|.
Therefore, we specify through an |invoke| rule
that visit |layout| must be invoked, which requires
through attribute dependencies that also visit
|gather| must be invoked, and kickstarts the
evaluation.
\emph{Scheduling constraints.}
Rules can be constrained to visits. With a visit-block,
we constrain rules to that visit, or a later visit.
The example below illustrates the various
possibilities.
An attribute definition prefixed with the keyword |match| is
an exception. It is constrained to the visit it appears in,
and is executed even if the attribute it defines is never needed.
We explain its precise meaning later.
\begin{code}
datasem Menu -- rules for nonterminal |Menu|
clause Menu -- rules for production |Menu|
cs:count = lhs:count + 1 -- scheduled in visit |gather| or later
match loc:elem = ... -- precisely in visit |gather|
visit layout -- rules for visit |layout| or later
match _ = ... -- precisely in visit |layout|
lhs:count = cs:count -- constrained to layout or later
\end{code}
With an underscore, we bind the value of the RHS of a rule to an
anonymous attribute that we cannot refer to.
A visit-block also introduces a subscope. A local attribute defined
in a visit-block is not available for a rule defined in a higher
scope, even if that rule is scheduled to a subscope.
After all these preparations, we can finally present the |rulercore|
solution in Figure~\ref{fig.intro.ruler1}. In this example, we express
that the side effect that queries the widths of the HTML items, is
constrained to the first visit, and the side effect that changes the
location and dimensions is constrained to the second.
For the |Menus|-nonterminal, we give default-rules for equality named
attributes in its productions. If such an attribute does not have an
explicit definition, these are implicitly defined by the default rule.
The idea is that the default-rule provides a function that gets an
area with all attribute values of the same name of previously
visited children (or |lhs|).
Formally, given a default-rule for attribute |a|,
suppose that a child |(sub k i) `elem` (sub k 1),...,(sub k n),lhs|
has an attribute |k.a| (synthesized if |k = lhs|, inherited otherwise),
but lacks an explicit definition for it.
The default-rule gives an implicit definition, by invoking the RHS of the
default-rule with an array defined as follows.
For each child |(sub c j) `elem` (sub k (i-1)),...,(sub k 1),lhs| that has
an attribute |a| (inherited if |k = lhs|, synthesized otherwise), in this
order, the array has a value |(sub c j).a|.
In particular, the first entry is the value of the closest child, and the
last entry is that of |lhs| (if such attributes exist).
In the above example, we combined both side effect and attribute evaluation.
We retain the advantages that AGs offer, such as the ease of adding
attributes.
Furthermore, the extension is orthogonal to various optimizations for attribute
grammars, including incremental evaluation and multi-core parallel evaluation.
However, we require the programmer to manually assign attributes to visits, and
constrain side-effectful rules to particular visits, which is not necessary
for conventional attribute grammars. In practice, this is only a
minimal amount of extra work that has as additional advantages that
it makes attribute evaluation more predictable and thus easier to understand.
%% as well as simplifies cycle checks, both in terms of computational complexity
%% and understandability of error messages.
%%% FIGURE WAS HERE
\subsection{Desugared Ruler-Front}
\label{sect.example.ruler2}
In Figure~\ref{fig.intro.ruler2} (explained below), we give another way to write
the same example in |rulerfront|. Both Figure~\ref{fig.intro.ruler2} and
Figure~\ref{fig.intro.ruler1} are valid |rulerfront| programs. The former is, however,
a desugared version of the latter. This desugared version only uses a subset of
|rulerfront| that we call |rulercore|.
It naturally generalizes over Higher-Order~\cite{DBLP:conf/pldi/VogtSK89} and
Conditional~\cite{DBLP:journals/toplas/Boyland96} Attribute Grammars.
We use this example as preparation for |rulercore| in the next section.
To save space, we omitted the data-type declarations, interface declaration, and
|root| variable, which are equal to those in the first half of Figure~\ref{fig.intro.ruler1}.
We present sem-blocks of the form |sem nonterm : Interface|,
which introduces a nonterminal |nonterm|, with visits and attributes
described by |Interface|. The productions are not defined by a data-type
definition, but through clauses and rules per visit, as we explain below.
Additionally, the code generated from a semantics-block is a constructor-function
that produces an AST node described by |Interface|, which we can store in a
variable, and may use in rules.
In Figure~\ref{fig.intro.ruler2}, we start with a definition of the semantics
for the root. The interface |Root| declares one visit.
We generalize over productions
for a nonterminal by having clauses for each visit. Each clause provides an
alternative way to compute the attribute values.
We thus give clauses for the visit |perform|, in this case only one clause.
Clauses and visits may contain rules. Rules given for a visit are in scope
of all clauses declared for that visit. Rules for a clause are only visible
in that clause. We also see another type of rule, called a child-rule,
which introduces a child. For example, we introduce a child |root|, with
interface |Menu|, and the semantics defined by the |Javascript| value
|sem_Menu|.
The left-hand sides of an evaluation-rule may be a pattern. This is
either an attribute reference, an underscore or a
constant. Evaluation of such a rule fails when its execution throws an
exception, or the left-hand side is a value that is not equal to
the value computed for the right-hand side.
During attribute evaluation, the clauses of a visit
are tried at runtime in the order of appearance. The next clause
is tried when either a match-rule fails, or when there is no
succeeding clause for a visit to a child.
Failure of any other
form of rule simply aborts the entire evaluation.
This way, the match-rules allow us to distinguish clauses |Cons| and
|Nil| of |ntMenus| by matching on the length of the list.
Missing visits are implicitly defined with a single empty clause.
A visit without clauses implicitly has a single clause.
Therefore, we neither have to specify the visit |layout| nor clauses for it
in the semantics of |ntMenus|. Also, due to the automatic ordering of rules,
many of the rules defined in visit |layout| of |ntMenu|, could also be
defined one level higher, in visit |gather|.
Note that this representation is more general than conventional
attribute grammars, and that an attribute grammar can easily be
mapped to this representation, as shown by the difference between
Figure~\ref{fig.intro.ruler1} and Figure~\ref{fig.intro.ruler2}.
\begin{figure}[htb]
\begin{smallcode}
\begin{code}
e ::= JS [ many b ] -- embedded |rulerfront| blocks |b| in |JS|
b ::= i | s | o -- |rulerfront| blocks
i ::= itf I (many v) -- interface decl, with visits |v|
v ::= visit x inh (many x1) syn (many x2) -- visit decl, with atributes |x1| and |x2|
s ::= sem x : I t -- semantics expr, defines nonterm |x|
t ::= visit x (many r) (many c) -- visit def, with common rules |r|
| epsilon -- no visit (serves as terminator)
c ::= clause x (many r) t -- clause definition, with next visit |t|
r ::= p = e -- assert-rule, evaluates |e|, bind to pattern |p|
| match p = e -- match-rule, backtracking variant
| invoke x of c -- invoke-rule, invokes visit |x| on |c|
| child c : I = e -- child-rule, introduces a child |c| of itf |I|
o ::= c:x -- attribute reference in some embedded code
p ::= c:x -- attribute reference in pattern
| _ -- underscore
| k -- constant
x, c I, p, e -- identifiers, child identifiers, patterns, expressions respectively
Gamma,Sigma ::= epsilon -- attr+child environment (used in semantics)
| Gamma, scope -- new scope
| Gamma, inh c:x -- inh attr |c:x|
| Gamma, syn c:x -- syn attr |c:x|
| Gamma, c:I (many v) -- child |c| with available visits |many v|
Phi ::= epsilon -- interface environment (used in semantics)
| Phi, I (many v) -- itf |I| with visit decls |v|
\end{code}
\end{smallcode}
\caption{Syntax of |rulercore|}
\label{fig.syntax.rulercore}
\end{figure}
\section{Static Semantics of |rulercore|}
\label{sect.sem.rulercore}
In this section, we introduce |rulercore|, a small subset of |rulerfront|, but sufficiently rich
to serve as intermediate language for |rulerfront|.
Figure~\ref{fig.syntax.rulercore} lists the syntax of |rulercore|. A |rulerfront| program |e| is a
|JavaScript| program |J|, with embedded |rulercore| blocks |b|. A block |b| is either an interface
declaration, semantics-block, or attribute reference. We explain the individual forms of syntax
in more detail below.
There are some essential differences in contrast to |rulerfront| that we gradually introduced
by example in the previous section.
The order of appearance of rules the evaluation order, and
each invocation of a visit must explicitly be stated through an invoke rule.
Special syntax for data-types is not part of |rulercore|. Through clauses and (match) rules, we have
a general mechanism to inspect and perform case distinction on arbitrary |Javascript| datastructures.
We make no assumptions about the syntax of |J|. The embedded blocks may occur anywhere in a
|Javascript| program. It is up to the programmer to ensure that semantic-blocks and
attribute references occur at expression-positions, and that interface-declarations occur at
statement positions. Neither do we make any assumptions about scopes of |J|; instead, we assume
that all embedded blocks are in the same scope.
%format SumNil2
%format SumCons2
\begin{figure}[htp]
\begin{smallcode}
\begin{code}
itf S visit v1 inh l syn emptyset -- decompose array |l| down
visit v2 inh emptyset syn s -- compute sum |s| up
var sumArr = sem sum : S
visit v1 emptyset -- first visit
clause sumNil -- when list is empty
match 0 = lhs:l.length -- match empty |l|
visit v2 emptyset -- second visit
clause sumNil2 -- single clause
lhs:s = 0 -- empty list, zero sum
() -- no next visit
clause sumCons -- when list non-empty
loc:x = lhs:l[0] -- head of the list
loc:xs = lhs:l.slice(1) -- tail of the list
child tl : S = sumArr -- recursive call
tl:l = loc:xs -- |l| param of call
invoke v1 of tl -- invoke on child
visit v2 emptyset -- second visit
clause sumCons2 -- single clause
invoke v2 of tl -- invoke on child
lhs:s = loc:x + tl:s -- sum of head and tail
() -- no next visit
\end{code}
\end{smallcode}
\caption{Example of |rulercore| syntax: summing an array of integers.}
\label{fig.syntax.example}
\end{figure}
Figure~\ref{fig.syntax.example} shows an example |rulercore| program to sum an
array of integers in two visits. The first visit has two clauses: a clause
|sumNil| when the array is empty, and |sumCons| when there is at least
one element. In the second visit, we compute the actual sum, depending on
the clause chosen in the first visit.
A semantics-block introduces a visitor-object with an interface |I|.
The interface dictates what visits can be made to the object, and what
the inputs (inherited attributes) and outputs are (synthesized attributes).
The outputs for a visit are produced by executing rules. We write these
rules down in a tree of clauses and visits, as illustrated by the
indentation in Figure~\ref{fig.syntax.example} and the state diagram:
\begin{tikzpicture}
[ grow=right
, visit/.style={circle, minimum size=1mm, node distance=2mm, very thick, draw=black, fill=black,font=\scriptsize}
, choice/.style={circle, minimum size=1mm, node distance=2mm, very thick, draw=black, fill=white,font=\scriptsize}
]
\node[visit,label=90:|v1|]{}
child {
node[choice, label={[xshift=-4mm]120:|emptyset|}, label={[xshift=-4mm]60:|SumNil|}, label={[xshift=-6mm]-60:|SumCons|}]{}
[level distance=25mm]
child {
node[visit,label=90:|v2|]{}
[node distance=20mm]
child {
node[choice, label={[xshift=-7mm]120:|emptyset|}, label={[xshift=1mm]60:|SumCons2|}]{}
child {
node[visit,label=90:|()|]{}
}
}
}
child {
node[visit,label=90:|v2|]{}
child {
node[choice, label={[xshift=-7mm]120:|emptyset|}, label={[xshift=1mm]60:|SumNil2|}]{}
child {
node[visit,label=90:|()|]{}
}
}
}
};
\end{tikzpicture}
The black nodes represent the state of the AST-node prior to a visit, and the white nodes indicate a branch
point. Upon creation, an AST node is in the state represented by the root node. With each edge are alternately
associated the rules of a visit or of a clause. With each visit, an AST node tries to switch state to a next
black node by executing the rules on the path to such a node. Execution of all of the rules must succeed.
At a branch-point, rules on edges of clauses are tried in order of appearance. Results produced by executing
rules are in scope of rules further along the path.
\begin{figure}[htp]
\begin{smallcode}
%if fullVersion
\begin{code}
Gamma :- s ^^^ ^^^ ^^^ ^^^ (many v) sep Gamma :- t ^^^ ^^^ ^^^ ^^^ Sigma sep Gamma0 :- r ::: Gamma1 ^^^ ^^^ ^^^ ^^^ :- i ^^^ :- v
Gamma :- o ^^^ ^^^ ^^^ ^^^ (many v) sep (many x) sep Gamma :- c ^^^ ^^^ ^^^ ^^^ Sigma sep Gamma0 :- p ::: Gamma1 ^^^ ^^^ ^^^ ^^^ Gamma :- e
\end{code}
%endif
\begin{mathpar}
%if fullVersion
\inferrule*[right=itf]
{ |I unique| \\\\
|I (many v) `elem` Phi| \\
}
{|:- itf I (many v)|}
\inferrule*[right=attr]
{ |inh (sub a i) unique| \\\\
|syn (sub b j) unique| \\
}
{|:- visit x inh (many a) syn (many b)|}
\inferrule*[right=sem]
{ |x unique| \\\\
|I (many v) `elem` Phi| \\
|(many v) sep Gamma, scope :- t| \\
}
{|Gamma :- sem x : I t|}
%endif
\inferrule*[right=end]
{}
{|[] sep Gamma :- epsilon|}
\inferrule*[right=visit]
{ |Gamma0 `union` avail (visit x (many r) (many c)) sep Gamma0 :- (many r) ::: Gamma1| \\
|(many v) sep (many s) sep Gamma1 `union` { inh lhs:a bar a `elem` (many i) } :- (sub c i)| \\
}
{|visit x inh (many i) syn (many s), (many v) sep Gamma0 :- visit x (many r) (many c)|}
\inferrule*[right=clause]
{ |x unique| \\
|Gamma0 `union` avail (clause x (many r) (many c)) sep Gamma0 :- (many r) ::: Gamma1| \\
|(many v) sep Gamma1 :- t| \\
|{ (syn lhs:a) bar a `elem` (many s)} `subset` Gamma1| \\
}
{|(many v) sep (many s) sep Gamma0 :- clause x (many r) t|}
\inferrule*[right=assert]
{ |Sigma sep Gamma0 :- p ::: Gamma1| \\
|Gamma0 :- e| \\
}
{ |Sigma sep Gamma0 :- p = e ::: Gamma1| }
\inferrule*[right=match]
{ |Sigma sep Gamma0 :- p ::: Gamma1| \\
|Gamma0 :- e| \\
}
{ |Sigma sep Gamma0 :- p = e ::: Gamma1| }
\inferrule*[right=invoke]
{ |Phi((sub I c)) = (many v)| \\
|visit x inh (many i) syn (many s) `elem` (many v)| \\
|c : (sub I c) (many w) `elem` Gamma0| \\ %% check child
|next (many w) (many v) = x| \\ %% check invoke
|{ inh c:a bar a `elem` (many i) } `subset` Gamma0| \\ %% check inhs
|Gamma1 = Gamma0 `union` { syn c:a bar a `elem` (many s) } `union` { c : (sub I c) ((many w), visit x inh (many i) syn (many s)) } | \\ %% add syn + visit
}
{ |Sigma sep Gamma0 :- invoke x of c ::: Gamma1| }
\inferrule*[right=child]
{ |Gamma0 :- e| \\
|Gamma1 = Gamma0 `union` { c : I emptyset }|
}
{ |Sigma sep Gamma0 :- child c : I = e ::: Gamma1| }
\inferrule*[right=occ.lhs]
{ |inh lhs:a `elem` Gamma| }
{ |Gamma :- lhs : a | }
\inferrule*[right=occ.child]
{ |syn c:a `elem` Gamma| }
{ |Gamma :- c : a| }
\inferrule*[right=pat.lhs]
{ |syn lhs:a `elem` Sigma| \\
}
{ |Sigma sep Gamma0 :- lhs : a ::: Gamma0, syn lhs:a| }
\inferrule*[right=pat.loc]
{}
{ |Sigma sep Gamma0 :- loc : a ::: Gamma0, syn loc:a| }
\inferrule*[right=pat.child]
{ |inh c:a `elem` Sigma| \\
}
{ |Sigma sep Gamma0 :- c : a ::: Gamma0, inh c:a| }
\inferrule*[right=const]
{}
{ |Sigma sep Gamma :- k ::: Gamma| }
\inferrule*[right=any]
{}
{ |Sigma sep Gamma :- _ ::: Gamma| }
\end{mathpar}
\begin{code}
avail (visit x (many r) (many c)) = (sub avail union) ((many r)) `union` (sub avail intersection) ((many c))
`union` { syn lhs:b | visit x inh (many a) syn (many b) `elem` Phi((sub I x)) }
avail (clause x (many r) t) = (sub avail union) ((many r)) `union` avail (t)
avail (p = e) = emptyset
avail (match p = e) = emptyset
avail (invoke x of c) = { inh c:a | a `elem` (many a), visit x inh (many a) syn (many b) `elem` Phi((sub I c)) }
avail (child c : I = e) = { c : I (Phi I) }
\end{code}
\end{smallcode}
\caption{Static semantics of |rulercore|}
\label{fig.sem.rulercore}
\end{figure}
There are four types of rules.
\begin{itemize}
\item
\begin{code}
match p = e -- match-rule
match loc:x = 3 -- example that succeeds
match true = false -- example that fails
\end{code}
The pattern |p| must match the value of the right hand side. If the evaluation of
|e| results in an exception, or the match fails, a backtrack is made to the next clause.
If |p| represents an attribute, the attribute gets defined.
\item
\begin{code}
p = e -- assert-rule (not prefixed with a keyword)
\end{code}
Similar to the above, except that the match is expected to succeed. If not,
the evaluation itself aborts with an exception.
\item
\begin{code}
child c : I = e -- child-rule
child root : Menu = ntMenu -- example that introduces a |Menu| child
\end{code}
Evaluation of the rule above creates a child |c|, visitable according
to the interface |I|, and created by executing the constructor function
|e|.
\item
\begin{code}
invoke x of c -- invoke rule
\end{code}
Executes visit |x| of child |c|. The inherited attributes of |x| must be defined, and all prior visits to |c|
must have been performed. The invocation may fail if no clause matches. In that case, it causes the current
AST node to backtrack to the next clause.
If successful, the synthesized attributes of |x| become available.
\end{itemize}
Figure~\ref{fig.sem.rulercore} shows a static semantics for |rulercore|.
A |rulercore| program that satisfies these conditions
never crashes due to an undefined attribute, invalid rule order, or forgotten invocation to a child.
Dynamic or static type checking we leave as responsibility of the host language.
We briefly consider some aspect of these rules.
Two environments play an important role: |Gamma| represents the children and attributes
defined so far (to test for missing and duplicated definitions), and |Sigma| the attributes that are allowed to
be defined (to test for definitions of unknown attributes). As additional constraint on environments, we consider
it a static error when there is a duplicate attribute in the environment within two scope markers.
Visits must be specified in the proper order, and none may be omitted.
The relation for visits |t| gets a sequence of pending visits |many v| as declared in the
interface. In rule \TirName{visit}, we verify that the name of the visit matches the
expected visit in the head of |many v|. The next visit must match the head of the tail of
this list, until in the end |many v| is empty.
We also add the inherited attributes of the visits to the environment.
The function |avail| defines which attributes may be defined. Higher-up in the visit-clauses-tree,
we may only define those attributes that are common to all lower clauses. In rules
\TirName{pat.lhs} and \TirName{pat.child}, we verify that we are indeed defining an attribute
belonging to a certain child.
In rule \TirName{invoke}, we verify that |x| is indeed the next visit in the expected sequence of
visits |many v|, given the previous invocations |many w|. We furthermore verify that the inherited
attributes for the visit of |c| are defined, and add the synthesized attributes to the environment.
%% Mention that the rules are a bit crafted to keep them consice, and that this is much nicer
%% with the AG version.
\begin{figure}[htp]
\begin{smallcode}
\begin{code}
var sumArr = function() { -- semantic function
function nt_sum(_inps) { -- visit |v1|
var lhsIl = _inps.l; -- extract |lhs:l|
try { -- try clause |sumNil|
if (lhsIl.length != 0) throw eEval; -- if |lhs:l| is empty
var _res = new Object(); -- produce results of |v1|
_res._next = function(_inps) { -- cont. for visit |v2|
var lhsSs = 0; -- |lhs:s| rule
var _res = new Object(); -- produce results of |v2|
_res._next = null; -- no next visit
_res.s = lhsSs; -- store |lhs:s|
return _res; -- return result of |v2|
};
return _res; -- return result of |v1|
} catch(err) { -- try clause |sumCons|
var locLx = lhsIl[0]; -- |loc:x| rule
var locLxs = lhsIl.slice(1); -- |loc:xs| rule
var vis_tl = sumArr(); -- creation of child |tl|
tlIl = locLxs; -- |tl:l| rule
var _args = new Object(); -- inputs for |v1| of |tl|
_args.l = tlIl; -- store |tl:l|
var _res = vis_tl(_args); -- invoke |v1| of |tl|
var vis_tl = _res._next; -- extract results
var _res = new Object(); -- produce results of |v1|
_res._next = function(_inps) { -- cont. for visit |v2|
var _args = new Object(); -- inputs for |v2| of |tl|
var _res = vis_tl(_args); -- invoke |v2| of |tl|
var tlSs = _res.s; -- extract |tl:s| result
var lhsSs = locLx + tlSs; -- compute |lhs:s|
var _res = new Object(); -- produce results of |v1|
_res._next = null; -- no next visit
_res.s = lhsSs; -- store |lhs:s|
return _res; -- return result of |v2|
};
return _res; -- return result of |v1|
} }; return nt_sum; }; -- return visitor function
\end{code}
\end{smallcode}
\caption{Example translation}
\label{fig.sem.example}
\end{figure}
\section{Translation of |rulercore| to |Javascript|}
\label{sect.trans.rulercore}
In this section, we describe how to translate |rulercore| programs to |Javascript|.
We translate each semantics-block to a coroutine, implemented as one-shot
continuations. Each call to the coroutine corresponds with a visit. The parameters
of the coroutine are the inherited attributes of the visit. The result of the
call is an object containing values for the synthesized attributes, and the
continuation to call for the visit.
\begin{figure}[htp]
\begin{smallcode}
\begin{code}
(semb (sem x : I t)) ~> function() { var (semb (nt x)) = (semt I t); return (semb (nt x)); }
(semb (c:x)) ~> (semb (inp c x))
(semt I (())) ~> null
(semt I (visit x (many r) (many c))) ~> function(_inps) {
(semb (inp lhs (inhs I x))) = _inps.(semb (inhs I x));
(semb (many r)); (semc I (syns I x) (many c)); }
(semc I (many s) []) ~> throw eEval;
(semc I (many s) (c : (many c))) ~> try { (semc I (many s) c); }
catch(err) {
if (err == eEval) { (semc I (many s) (many c)); }
else throw err; }
(semc I (many s) (clause x (many r) t)) ~> (semb (many r));
var _outs = new Object();
_outs._next = (semt I t);
_outs.(many s) = (semb (out lhs (many s)));
return _outs;
(semb (p = e)) ~> var _res; try { _res = (semb e); } catch (err) {
if (err == eEval) throw eAbort; else throw err; }
(semp eAbort p)
(semb (match p = e)) ~> var _res = (semb e); (semp eEval p);
(semb (child c : I = e)) ~> var (semb (vis c)) = ((semb e))();
(semb (invoke x of c)) ~> var _args = new Object();
_args.(semb (inhs (sub I c) x)) = (semb (out c (inhs (sub I c) x)));
var _res = (semb (vis c))(_args);
var (semb (inp c (syns (sub I c) x))) = _res.(semb (syns (sub I c) x));
var (semb (vis c)) = _res._next;
(semp e (c:a)) ~> var (semb (out c a)) = _res;
(semp e _) ~> ;
(semp e k) ~> if (_res != k) throw e;
\end{code}
\begin{code}
out "loc" x = "locL" x ^^^^ inp "loc" x = "locI" x
out "lhs" x = "lhsS" x ^^^^ inp "lhs" x = "lhsI" x
out c x = c "I" x ^^^^ inp c x = c "S" x
vis c = "vis_" c ^^^^ nt x = "nt_" x
syns I x ^^^^ ^^^^ inhs I x -- respectively, inh and syn attrs of |x| of |I|
\end{code}
\end{smallcode}
\caption{Denotational semantics of |rulercore|}
\label{fig.sem.denot}
\end{figure}
As an example, we show in Figure~\ref{fig.sem.example} the translation of the example in the
previous section. To deal with backtracking, we use the exception mechanism, and throw an
exception to switch to the next clause. Note that this does not rollback any side effect that
the partial execution of the rules may have caused. To be able to do so, we can run the rules
in a software transaction~\cite{Heidegger10}, for which many programming languages have
tool support nowadays. Alternatively, when the side effect matters,
the programmer can schedule it to an earlier or later visit, such that it is not
influenced by backtracking.
To deal with continuations, we use closures. The function to be used for the next visit, we
build in the previous visit. This function has access to all the results computed in the
previous visit. Furthermore, we store values for attributes in local variables. Those values
that are not needed anymore, are automatically cleaned up by the garbage collector.
Figure~\ref{fig.sem.denot} shows the general tranlation scheme, and naming scheme for
attributes. In particular, for each visit, we generate a closure that takes values for
inherited attributes as parameter. Clauses are dealt with through exception handling.
When a clause successfully executed all statements, it returns an object containing
values for synthesized attributes, as well as the continuation function for the next visit.
The above translation is relatively straightforward. In practice, the selection of a clause is
functionally dependent on the value of an inherited attribute, or a local attribute computed
in a previous visit. In those cases, the selection of clauses can be implemented more
efficiently using conventional branching mechanisms.
We verified that the above implementation
runs in time linear to the size of the tree, when we use version of the |slice| operation
that does not make a copy of the array.
With a throughput of about hundred array elements per microsecond, and about a thousand per
microsecond with the exception handling replaced by conventional branching, this is still about one or
two orders of magnitude slower than using a hand-written loop.
In our experience, however, performance is rarely an issue. In general, the
asymptotic complexity of the traversal is linear in the size of the tree, and the
actual time taken by traversing the trees is insignificant compared to the
work performed by the right-hand sides of the rules in a real application.
%if fullVersion
\section{Translation of |rulerfront| to |rulercore|}
\label{sect.trans.rulerfront}
In Section~\ref{sect.example.ruler2}, we showed by example how a |rulerfront| program can be encoded
using only syntax in |rulercore|. For space reasons, we omit the data-type driven translation from
a |datasem| into a |sem|, nor the translation of default-rules. Instead, in this section we assume
that |rulerfront| consists of those programs that after insertion of invoke-rules and reordering of
rules are a valid |rulercore| program.
\subsection{Implicit Invocations}
In |rulerfront|, invoke-rules may be omitted. From a |rulerfront| program, we derive a number of
implicit invocatations. We first determine the needed attributes. From these we determine the maximum
needed visit, and thus the sequence of visits needed. An invoke-rule needs to be inserted if there is
no invoke-rule for any of these visits yet. We start the insertion-process at the root of the tree,
and check at each level downwards which invokes need to be inserted. With this process, we insert
the invoke-rules at the lowest point, while still being in scope of all rules that need it.
Automatic rule ordering then positions the invokes at their appropriate places.
A synthesized attribute |a| of child |c| is needed if there exists
a rule which has the attribute reference |c:a| in its right-hand side. The needed attributes may
differ per clause and visit, which we define in a similar way as |avail| in Section~\ref{sect.sem.rulercore}:
\begin{code}
need (visit x (many r) (many c)) = (sub need union) (many r) `union` (sub need intersection) (many c)
need (clause x (many r) t) = (sub need union) (many r) `union` need t
need (p = e) = need e
\end{code}
The actual implementation is a bit more complicated. A default-rule may indirectly express a need on
an attribute (and corresponding visit). Furthermore, when a programmer provided an explicit invoke,
we disable the generation of implicit invokes for that visit. Apparently, the programmer had a reason
to explicitly invoke the visit, and we rather warn the programmer when he did not provide this invoke
in, for example, another branch where it is also needed.
\begin{figure}[htp]
\begin{minipage}{0.58 \linewidth}
\small{\begin{tabular}{ll}
source node & target node \\
\hline
begin visit & end prev visit \\
end visit & begin visit, clauses, syn attr def. rules, match rules \\
clause & begin visit \\
any rule & begin visit or clause \\
invoke rule & prev invoke or child, inh attr def. rules \\
any rule w. rhs & begin visit for inh attrs, loc attrs def., invoke of attr \\
\end{tabular}}
\end{minipage} \hfill \begin{minipage}{0.38 \linewidth}
\includegraphics[scale=0.2]{sum.pdf}
\end{minipage}
\caption{Dependencies between |rulerfront| entities.}
\label{fig.dependencies}
\end{figure}
\subsection{Rule Ordering}
To order the rules, we first create a dependency graph. Per semantics-block, in this graph, we have a begin
and end node for each visit in the interface. For each clause and each rule we also have a node.
Figure~\ref{fig.dependencies} lists what directed edges there are between these nodes, and gives a rough
impresses what the dependency graph for the sum example in Figure~\ref{fig.syntax.example} looks like.
The ovals represent attribute definitions, the square boxes represent begin and end of visits, clauses, and
invocations. If this dependency graph is cycle-free, the rules can be ordered.
The graph represents a partial order. We turns this order into a total order by giving attribute definitions
precedence over invoke rules, invoke rules precedence over clauses, and otherwise take the lexical order.
This is important, as it keeps the child-order stable, orders rules common to all clauses to be evaluated once (if possible) before selecting
clauses, and orders match-rules up front in clauses.
A topological sort over the graph gives a rank for each node. We move rules down the clauses-tree to the
highest-ranked visit or clause with a rank less than the rule. To move a rule down a visit, we duplicate it
to all branches.
Sometimes, rules can be scheduled to more than one visit. The exact visit a rule ends up influences the amount
of data that has to be transported between visits. The inputs for the rule need to be transported to the visit
of the rule, and the result of the rule to the visits where these results are used.
With the above algorithm, rules are scheduled to the earliest possible visit.
Scheduling to the last possible visit seems more intuitive: we only compute attributes at the moment it is
really going to be used.
With a small preprocessing step, we can implement the as-late-as-possible scheduling strategy.
Starting from the sinks of the graph, we preform a reversed depth-first search, marking each node with the
last end-visit-node encountered (without overwriting a mark once assigned). This gives us for each rule the
latest possible visit, and we add a dependency of this rule on the begin of that visit to the graph. If the
graph is acyclic, it stays acyclic.
The implicit invokes and automatic ordering allow a straightforward transformation from a datasem-block
to a sem-block. Essentially, a datasem-block is syntactic sugar for a number of clauses, each with a
match-rule, and a number of child-rules. We also allow in |rulerfront| omitted visits and clauses to
be implicitly defined. Combined with implicit invocations, this makes it easy to add additional visits to an
interface. Furthermore, the automatic rule ordering allows us to write independent rules separately from each other
(possibly in separate files) and use a preprocessing step to merge the rules together.
%% This particular
%% feature is supported by UUAG
\section{Discussion}
\label{sect.eval}
The expressiveness of |rulerfront| depends on the host language.
However, given a host language that supports only global storage, and the
two boolean constants, then we can still express general recursion and
branching. In principle, |rulerfront| is turing complete. In practice,
we use |rulerfront| as an alternative for attribute grammars, and
in particular for traversals over tree-like data structures.
To a limited extend it may be applicable to graphs traversals that are
technically tree traversals (such as a traversal over a depth-first forest).
Loops and iteration can be expressed in some extend via recursion. In related
work, we expressed these by iterating visits~\cite{Middelkoop10gpce}).
|rulerfront| is not suitable to express traversals over drastically changing
data structures.
We implemented several extensions to make |rulerfront| more expressive.
One extension is the notion of internal visits.
A conventional visit is invoked externally by the parent, and can choose
a clause. This means that we can only conditionally compute attributes
once per visit.
In contrast, an
internal visit is invoked at the end of the clause, and is not visible
externally. An internal visit may again have clauses, and these clauses
may again specify an internal visit as next visit, or a conventional
visit. With this relatively simple extension, we can arbitrarily often
branch inside a visit.
Furthermore, we integrated demand-driven attribute evaluation.
With demand-driven evaluation, certain circular grammars may still
produce values for attributes: those that have no runtime circular
dependencies. We either use statically ordered evaluation of a
visit, or demand-driven one. A demand-driven visit may not use side
effect, nor have multiple clauses. With visits, we can adequately model
this combination.
In the Haskell-version of |rulerfront|, we require type signatures for
attributes. In |Javascript|, instead of type signatures, the notion of
a type signature is a dynamic check in the form of
assertion-functions that validate the values for attributes.
%endif
\section{Related Work}
\label{sect.relatedwork}
Related to this paper are various visitor-like approaches and attribute grammar
techniques.
The purpose of the Visitor design pattern~\cite{DBLP:conf/ecoop/GammaHJV93}
is to decouple traversal operations from the specification of the tree
to be traversed, in order to make it easier to add new operations without
changing the existing specification of the tree.
This allows us to write a multi-visit traversal using a separate visitor
per traversal.
In Section~\ref{sect.example.vis}, we discussed advantages and
disadvantages of modeling traversals with this pattern. In particular,
side effect is permitted, and used to store results for use in later
visits. The side effect makes it hard to predict if results needed in a
next visit are actually stored by a first visit. This is a fundamental
problem of visitors. Oliveira, et al.~\cite{DBLP:conf/oopsla/OliveiraWG08},
for example, show many enhancements with respect to the type safety of
visitors, but do not address the transfer of results between visits.
%% Multi-methods~\cite{DBLP:conf/oopsla/ChambersL94} are supposed to replace
%% the visitor pattern. A multi-method allows overloading of methods on
%% multiple parameters, and makes |accept|-methods superfluous.
%% This, however, is orthogonal to the problems and solutions presented in this
%% paper.
Attribute grammars~\cite{DBLP:journals/mst/Knuth68,DBLP:conf/waga/Knuth90}
were considered to be a promising implementation for compiler construction,
but several success stories aside, did not meet these expectations~\cite{DBLP:conf/waga/Waite90}.
The bets may be turning again.
%if fullVersion
We are not worried about every byte of memory
consumption anymore. Instead, multi-core processor utilization becomes an issue, and
parallel evaluation of AGs was well studied in the past~\cite{629079,DBLP:conf/waga/KuiperS90},
but not yet applied in the presence of multi-core processors.
The work in this paper is orthogonal to results in those areas.
%endif
Recently, many Attribute Grammar systems arose for mainstream languages, such as
Silver~\cite{DBLP:journals/entcs/WykBGK08} and JastAdd~\cite{DBLP:conf/oopsla/EkmanH07}
for Java, and UUAG~\cite{uuagc} for Haskell. In contrast to the work in this paper, these
systems strictly discourage or even forbit the use of side effect.
The design of |rulercore| is inspired by the language of execution plans of
UUAG.
In certain languages it is possible to implement AGs via meta-programming facilities,
which obliviates the need of a preprocessor.
Viera, et al.~\cite{DBLP:conf/icfp/VieraSS09} show how to implement AGs into Haskell
through type level programming. The ideas presented in this paper are orthogonal to
such approaches, although the necessary dependency analysis may be difficult to express
depending on the expressiveness of the meta language.
Several attribute grammar techniques are important to our work.
Kastens~\cite{DBLP:journals/acta/Kastens80} introduces ordered attribute
grammars. In OAGs, the evaluation order of attribute
computations as well as attribute lifetime can be determined statically,
allowing severe optimizations.
%if fullVersion
Boyland~\cite{DBLP:journals/toplas/Boyland96} introduces conditional attribute grammars.
In such a grammar, semantic rules may be guarded. A rule may be evaluated if its guard
holds. Evaluation of guards may influence the evaluation order, which makes the evaluation
less predictable.
In comparison, in our clauses-in-visits model, we have to explicitly indicate in what
visits guards are evaluated (the match-statements of a clause), which makes evaluation
clear. Our approach has the additional benefit that children may be conditionally
introduced and visited.
The work for this paper is carried out for a research project to support the
implementation of the type inference component of the Utrecht Haskell compiler.
In the workshop version of this paper~\cite{Middelkoop10vis}, we presented an earlier
version of |rulerfront|'s clauses-per-visit model to allow attribute grammars to implement
functions that perform case distinction on more than a single AST.
In a later paper~\cite{Middelkoop10gpce}, we improved on this model to allow iteration
of visits, and dynamic growing of trees, to model fixpoint construction of proof trees.
That work was carried out using Haskell as target language.
In this paper, we made an explicit connection between AGs and |rulerfront|, and use this
connection to express side effect in AGs.
%endif
\section{Conclusion}
\label{sect.conclusion}
We introduced the language |rulerfront|, an extension of Attribute Grammars that makes
visits to nonterminals explicit. As a consequence, it is possible to use side effects
in rules. It combines the freedom of visitors as described by the Visitor Design
Pattern with the convenience of programming with attributes, as shown in
Section~\ref{sect.example}.
Moreover, we presented |rulercore|, a subset of |rulerfront|, which serves as a small
core language for visitor-based Attribute Grammars. In |rulercore|, the lifetime of
attributes is explicit, as well as the evaluation order of rules and visits to
children.
A |rulercore| program has a
straightforward translation to many languages. In Section~\ref{sect.trans.rulercore},
we showed a translation to |Javascript| .
%if fullVersion
Furthermore, we described how |rulerfront| programs are mapped to |rulercore| in
Section~\ref{sect.trans.rulerfront}.
%endif
There are many directions for future work. The parallel evaluation of Attribute Grammars
received a lot of interest in the past, but during a time that multi-core processors
were not commonly available. The small |rulercore| language is suitable for experimentation
with different evaluation strategies.
Another direction of research is to allow destructive updates on attributed trees.
For example, to support event-handling traversals over data structures that are
dynamically changed based on user input or external events.
In |rulerfront|, the visits performed on an attributed tree explicitly specify which
attributes are defined. When we apply a destructive update to the tree, we thus know
precisely what information is based upon the previous structure of the tree, which is
beneficial when reasoning about mutations to the tree.
Incremental evaluation of Attribute Grammars received attention in the past, and may be
used to efficiently recompute attributes after an AST change.
%% AVL trees.
More fundamentally, the idea of this paper is to deal with the scheduling of rules in
the presence of side effect. This is not possible with conventional attribute grammars,
because the effects are not visible in attribute dependencies.
In the Haskell version of |rulerfront|, the left-hand side of a rule can be a match
against a data constructor. If this data constructor is a GADT, the match brings type
assumptions in scope, to be used to coerce types in rules that follow.
Similarly to side effect, these type assumptions are implicit.
However, with |rulerfront|, we can explicitly
schedule rules to be after such a match.
This allows us to combine GADT features with Attribute Grammars. This may be sufficient
to target dependently-typed programming languages, and a direction towards verified
compilers using AGs.
\ack
This work was supported by Microsoft Research through its European PhD Scholarship Programme.
\bibliographystyle{entcs}
\bibliography{references}
\end{document}