\documentclass[preprint,authoryear]{sigplanconf}

%let appendices = False
%let acknowledgements = False

%include lhs2TeX.fmt
%include polycode.fmt

\usepackage{amsmath}
\usepackage{natbib}
\usepackage{listings}
\usepackage{float}
\usepackage{pgf}
\usepackage{tikz}
\usepackage{color}
\usepackage{xcolor}
\usepackage[english]{babel}
\usepackage{mathtools}
\usepackage{stmaryrd}

\floatstyle{ruled}
\restylefloat{figure}
\setlength{\fboxsep}{1.4pt}

\lstset{
  language=Java,
  basicstyle=\small\sf,
  showstringspaces=false,
  numbers=none,
  morekeywords={var,sealed,dynamic,function},
  tabsize=2
}

\usepackage[ breaklinks, colorlinks
           , pdftitle={Functional Instrumentation of ActionScript Programs}
           , pdfsubject={Instrumentation}
           , pdfkeywords={instrumentation,actionscript,flash,EDSL}
           , pdfcreator={LaTeX with Lhs2TeX}
           , pdfproducer={pdflatex}
           , pdfauthor={Arie Middelkoop}
           , linkcolor=black
           , citecolor=black
           , filecolor=black
           , urlcolor=black
           ]{hyperref}

\usetikzlibrary{shapes,arrows,fit,decorations,positioning}

%include instr-icfp.fmt

\renewcommand{\hscodestyle}{\normalsize}
\newenvironment{smallcode}[0]{\renewcommand{\hscodestyle}{\small}}{\renewcommand{\hscodestyle}{\normalsize}}

\begin{document}

\conferenceinfo{ICFP '11}{September 19-21, Tokyo.}
\copyrightyear{2011}
\copyrightdata{[to be supplied]}

\titlebanner{Submitted to ICFP 2011}        % These are ignored unless
\preprintfooter{A Functional Language for Action Script Instrumentation}   % 'preprint' option specified.

\title{Experience Report: Functional Instrumentation of ActionScript Programs}

\authorinfo{Arie Middelkoop \and Alexander B. Elyasov \and Jurriaan Hage \and Wishnu Prasetya}
           {Universiteit Utrecht}
           {\{ariem,elyasov,jur,wishnu\}@@cs.uu.nl}

\maketitle

\begin{abstract}
In log-based testing, the system under test contains code to log aspects of 
the system's execution. Such code can be injected automatically with program 
transformations. Still, manual intervention is needed to identify execution 
steps of interest, as well as projections of the system's state.
With aspect-oriented programming (AOP), we can describe the semi-automatic
transformation of the system. However, AOP languages typically lack abstraction
mechanisms. In this paper, we show that well-known techniques from
functional programming facilitate the design of an expressive
AOP EDSL for ActionScript, called Asil.
\end{abstract}

\category{D.2.5}{Testing and Debugging}{Tracing}

\terms
measurement

\keywords
instrumentation, execution traces

\section{Introduction}
\label{sect.introduction}

  The \emph{future internet}~\citep{DBLP:journals/fi/Hudson-Smith09}
  challenges traditional testing approaches,
  as future internet applications are a dynamic composition of
  ever-changing third party components and services.
  In addition to conventional unit and regression tests during
  development, continuous testing is required even after deployment,
  which can be accomplished through log-based testing.
  In log-based testing, the execution of the system under test
  (SUT) is logged and later
  analyzed to discover anomalies that require investigation by
  software testers.

  In this paper, we use functional programming techniques for the
  design and implementation of a tool suite that facilitates the injection
  of logging code into client-side web applications.
  Our use case is Habbo Hotel~\citep{habbo},
  which is a large Flash application deployed on the Internet. Therefore,
  we restrict ourselves to programs written in ActionScript 3, the
  programming language of Flash.
  From the execution of this Flash program, we wish to obtain a log,
  which is an execution trace that records a projection of the program's
  state for each execution step.
  
  To manually add logging to the source code of the SUT is undesirable, as
  it is a crosscutting concern that easily clutters up the original code.
  Moreover, modifications to the run-time environment are generally
  not possible due to security restrictions on the clients that run the application.
  %% Moreover, it is quite likely that at different times, we want to log different
  %% information from different locations for the same program.
  
  Fortunately, other techniques exist that help us deal with this issue
  more effectively. In particular, logging is a prime example 
  of a functionality that can be implemented non-intrusively through aspect-oriented
  programming (AOP)~\citep{DBLP:conf/ecoop/KiczalesLMMLLI97}. This 
  allows us to describe the instrumentation separately from the
  actual source code. In order to change the actual logging, we only
  need to change the instrumentation, and not the code to which instrumentation
  was applied.
  
  In AOP, a \emph{join point} is a location in the program's code to which we may
  transfer control. A join point has a static and dynamic context, which is
  defined as the reachable program state at that point. The \emph{advice} is
  code that can inspect and alter this state when executed at a join point.
  Typical AOP languages, such as AspectJ~\citep{Kiczales:2001:OA:646158.680006},
  offer facilities for specifying \emph{point cut}s: a particular set 
  of join points, where a given piece of advice should be executed. 
  An \emph{aspect weaver} integrates the advice into the actual
  program at such join points via static and/or dynamic program transformation.

  Unfortunately, typical AOP languages offer limited means of abstraction in the
  join point model. For example, in AspectJ, point cuts can only be parameterized
  over \emph{symbolic values}. A symbolic value is a statically unknown value that
  corresponds to a \emph{run-time value} in the state of the SUT during its execution.
  Due to our strongly functional background, where the use of higher-order functions
  is second nature, we would like point cuts to be first class in the AOP
  specification.
  
  From a functional perspective, an AOP description can be viewed as a partial 
  function that provides advice for some of the join points. In the case of logging, 
  advice is a state transformer that additionally yields a sequence of events. 
  Such an event typically contains some projection of the program's state, and 
  an execution trace is then the sequence of such events as they arose 
  during the SUT's execution.
  \begin{code}
  Spec             = Instrumentation
  Instrumentation  = JoinPoint -> Maybe Advice
  Advice           = World -> (World `times` [Event])
  \end{code}
  The advice functions are typically written in the programming
  language of the SUT. The instrumentation functions are point-cut
  descriptions in the AOP language, which is typically some pattern
  language.
  
  For coverage analyses (which exploit the information contained in logs) we 
  typically want to specify a more fine-grained instrumentation depending on, e.g., the
  nesting depth of branches, or annotations provided by the user.
  To log the behavior of external services, we often want to specify the
  instrumentation of the invocation of the service as a function of (part of) 
  the specification of the external service.
  Therefore, in this paper, we deviate from standard usage by
  parameterizing AOP specifications over values |alpha|. These may
  include symbolic values and even AOP specifications themselves.
  \begin{code}
  Spec alpha = alpha -> Instrumentation

  Coverage    = Spec ControlFlowGraph
  External    = Spec ServiceContract
  \end{code}
  Consequently, such abstractions allow us to write more modular
  specifications. 
  
  From the point of view of the programmer this leads to a specification
  of the following type:
  \begin{code}
  Spec alpha  = alpha `times` JoinPoint `times` World ->
                  Maybe (World `times` [Event])
  \end{code}
  In other words, in an instrumentation specification a programmer may employ
  values of an arbitrary type Haskell |alpha|, a point cut,
  and information that will be only be available at run-time.
  For an example of the former, consider a control flow graph that we 
  have previously computed for the SUT, and that we use to \emph{generate}
  specifications of point cuts, and of advice to be weaved in.
  In other words, we can generate the AOP specification based on values and 
  (higher-order) functions living in the Haskell world.

  Although the specification by the programmer may refer to information that will
  be available at run-time, this information is not yet available during the 
  early stages that transform Asil code (a DSL deeply embedded in 
  Haskell) into AsilCore (our core instrumentation language),
  or that weave in the advice.
  From this point of view, the specifications 
  are actually a bit more restrictive than the type suggests.
  The following type is a closer match.
  \begin{code}
  Spec alpha = alpha -> JoinPoint ->
    Maybe (World -> (World `times` [Event]))
  \end{code}
  This type makes explicit that run-time information, contained in values of type 
  |World|, cannot be used during the first two stages of translation. Indeed,
  run-time information may only be used in ways that can be correctly translated into 
  AsilCore or ActionScript.
  For example, in the specification given by the programmer, a value that will
  only be known at run-time may not be scrutinized in a case statement, because
  AsilCore does not have the concept of a case statement, and therefore it must
  be translated away.  Of course, this restriction
  does not hold the other way around: instead of weaving in advice at join points in
  the point cut, one may decide to weave in advice at every join point that 
  dynamically decides whether the joint point is an element of the point cut.
  For reasons of efficiency, however, this is not a very wise thing to do. 
  
  We point out  that the ``arrows'' in the above type correspond exactly 
  to code transformations: the first arrow transforms Asil into AsilCore, 
  the second arrow is the process of weaving the advice into the ActionScript
  source (the tool \emph{Asic}), and the final arrow (inside the |Maybe|) refers 
  to the run-time execution of the advice. 
 
%  As noted before, each function arrow in the above type corresponds to a
%  distinct language.
  
%  For simplicity, advice is a simple method call to conventional ActionScript code, where we 
%  assume the existence of a support library that contains methods to, e.g., serialize
%  events. 

  \begin{figure}[htb]
  \centering
  \begin{tikzpicture}
    [ item/.style={rectangle, draw=black, fill=none, thick, font=\normalsize, node distance=15mm}
    , inv/.style={draw=none,fill=none,font=\scriptsize}
    , annot/.style={inv, node distance=4mm}
    ]
  
  \node[item](sut){SUT};
  \node[item, right of=sut, xshift=15mm](sut2){SUT'};
  \node[item, right of=sut2, xshift=15mm](trace){Trace};
  \node[item, below of=sut, yshift=6mm](sym){Refl};
  \node[item, below of=sut2,yshift=6mm](core){Core};
  \node[item, below of=core, xshift=-15mm, yshift=7mm](spec){Spec};
  \node[item, right of=core, xshift=5mm](support){Support};
  
  \node[annot, above of=sut]{@.swf@};
  \node[annot, above of=sut2]{@.swf@};
  \node[annot, above of=trace]{@.log@};
  \node[annot, below of=sym]{@.hs@};
  \node[annot, below of=spec]{@.hs@};
  \node[annot, below of=support]{@.as@};
  
  \draw[->] (sut.east) to node[midway,above,inv]{weave} (sut2.west);
  \draw[->] (sut.south) to node[midway,right,inv]{generate} (sym.north);
  \draw[->] (sym.south) to node[near end,above,inv]{import} (spec.west);
  \draw[->] (core.north) to node[midway,left,inv]{weave} (sut2.south);
  \draw[->] (support.north) to node[near start,above,inv]{import} (sut2.south);
  \draw[->] (sut2.east) to node[midway,above,inv]{exec} (trace.west);
  \draw[->] (spec.east) to node[near start,above,inv]{compile} (core.south);
  \end{tikzpicture}
  \caption{Overview of the specification artifacts.}
  \label{fig.overview}
  \end{figure}

  Figure~\ref{fig.overview} provides a somewhat more detailed view of the artifacts 
  involved in the logging specification. From the SUT, we generate some reflection 
  information Refl (symbol information, control flow graphs),
  which is imported by our logging specification Spec. From Spec, we generate AsilCore,
  which is weaved into the SUT together with the support library that provides
  functions for, e.g., serializing events. Finally, execution of the instrumented 
  SUT generates the trace.
  
  Zooming in on the languages we employ, our instrumentation language is a 
  combinator language called \emph{AsilCore},
  that provides two primitive operations: pattern matching (on join points)
  and the invocation of advice. Such a language is not very easy to program in, which
  is why we have chosen to extend it to a deeply embedded Haskell DSL, called \emph{Asil}.
  Asil uses the well-known concepts of \emph{monadic}~\citep{DBLP:conf/afp/Wadler95} and
  \emph{alternative}~\citep{Mcbride:2008:APE:1348940.1348941} interfaces to compose instrumentations.

  The implementation of Asil is ongoing work within the
  Fittest project\footnote{\url{http://www.pros.upv.es/fittest/}}. The library
  @asil@\footnote{\url{http://hackage.haskell.org/package/asil}} 
  contains Asil as a library, including parsers, pretty printers, and
  other tools for the transformation of ActionScript byte code.
  The tool @asic@\footnote{\url{http://hackage.haskell.org/package/asic}}
  generates reflection information and performs aspect weaving.

  
  The remainder of this paper is organised as follows. After introducing the running
  example in Section~\ref{sect.example}, we consider the Asil language in more 
  detail in Section~\ref{sect.asil}. In Section~\ref{sect.asilcore} we show
  AsilCore and some aspects of its implementation; the aspect weaver asic
  is shortly discussed in Section~\ref{sect.asic}.
  Finally, in Section~\ref{sect.reflection} we reflect on the
  advantages and disadvantages offered by functional programming
  with respect to the implementation of Asil and Asic, and Section~\ref{sect.conclusion}
  concludes.

\section{Example}
\label{sect.example}

  In log-based testing, we log aspects of the program's execution, so that we can
  inspect the log to discover atypical execution patterns that may be worth an
  investigation by a software tester. This logging may take place in a special test
  environment, or after deployment, when the program is in active use.
  %% In the latter case, privacy is an important additional concern.
  
  In this paper, we consider only the instrumentation of Flash programs: the
  representation of the logs and the analysis of the logs are out of the scope of
  this paper. To set the scene, we describe a snippet of chess-like internet game as a 
  simple running example (see Figure~\ref{fig:snippet}). 
  The game proceeds in turns. In each turn, the user clicks 
  on a square to select a pawn, then clicks on an unoccupied square to move the 
  pawn there. Note that logging code has been explicitly inserted, logging
  each call to the @MyGame.clicked@ method (with position information)
  and also the actual moves that are performed. The calls to
  @Log.clicked@ and @Log.move@ take care of generating events that are
  logged and thereby become part of the trace. 
  
\begin{figure}[ht]
  \begin{small}
  \begin{verbatim}
package code {
public class MyGame extends Sprite {
  private var selSquare : Square;

  public function MyGame() : void {
    addEventListener("click", clicked); }

  function clicked(event:MouseEvent) : void {
    var x : int = event.localX;
    var y : int = event.localY;

    Log.clicked(x,y);
    var target : Square  = getSquare(x,y);
    var taken  : Boolean = occupied(target);

    if (!this.selSquare && taken) {
      this.selSquare = target;
    } else if (this.selSquare && !taken) {
      Log.move(this.selSquare, target);
      this.move(this.selSquare, target);
      this.selSquare = null; } } } }
  \end{verbatim}
  \end{small}
  \caption{ActionScript snippet with logging code.}
  \label{fig:snippet}
\end{figure}
  
  An \emph{execution trace} is a sequence of events |e1 ... (sub e n)|, where an event
  |e = (t,i)| is a tuple consisting of a discrete timestamp |t| and information |i| about the
  state of the program at |t|. In case of the example, this information consists
  either of a |clicked| record or a |move| record.
  \begin{code}
  i  ::=  clicked(int,int)               -- clicked record
       |  move(code.Square,code.Square)  -- move record
  \end{code}
  The utility methods @Log.clicked@ and @Log.move@ take care of raising the events
  and serializing a projection of the object-graph of their parameters to a log.  
  The actual implementation of these utility methods is out of the scope of this paper.

  The calls to the logging methods obscure the source code. This gets worse 
  when we make the logging code conditional for reasons of, e.g., privacy or performance:
  \begin{verbatim}
  if (Level.privacy <= 2 && Level.verbose >= 1) {
    Log.move(this.selSquare, target) }
  \end{verbatim}
  Logging and privacy are typical examples of crosscutting concerns, so that we can
  apply AOP techniques to specify these concerns separately from the code. In the next section
  we discuss the instrumentation language Asil, a functional AOP language.

\section{The Asil Instrumentation Language}
\label{sect.asil}

  The language |Asil| is a combinator language for writing down specifications |Spec alpha|
  as described in Section~\ref{sect.introduction}. It is a functional, strongly-typed,
  monadic Haskell EDSL for expressing 
  \emph{advice} that is conditionally applied at \emph{join points}.
  Join points are predefined locations in the code where control can be transferred to.
  Figure~\ref{fig.joinpoints} lists the join points in our
  \emph{join point model}~\citep{Cazzola06semanticjoin}. Our main join points are the
  conventional method entry and exit from the caller and callee side as in
  AspectJ~\citep{Kiczales:2001:OA:646158.680006}, as well as sequential blocks of
  instructions~\citep{Juarez-Martinez:2008:JPM:1562423.1562447}.
  Operationally, an Asil program is a value of the type |I ()| (explained below)
  that is applied to each encountered join point during execution.
  
  \begin{figure}[htb]
  \centering
  
  \begin{tabular}{l||l}
  Join point & Context captured \\
  \hline
  Method entry      & name, parameters, and param types  \\
  Method exit       & name, return value, and return type \\
  Method abort      & nane, and exception    \\
  \hline
  Method call       & name, parameters, and param types   \\
  Method returned   & name, return value, and return type \\
  Method failed     & name, and exception    \\
  \hline
  Block entry       & id, cycle root, and preceding join points     \\
  Block exit        & id, cycle root, and preceding join points     \\
  \hline
  Coercion call     & input, input type, and intended type  \\
  Coercion returned & input, output, input type, and output type \\
  Coercion failed   & input, exception, and intended type   \\
  \end{tabular}

  \caption{Join points and their context.}
  \label{fig.joinpoints}
  \end{figure}
  
  We use combinators to construct
  monadic \emph{instrumentations} of type |I a| and (symbolic) \emph{expressions} of type |E a|.
  A typed product |a| of expressions is represented by the type |many E a|.
  Figure~\ref{fig.operations} lists the Asil combinators that are used in the example.

  Instrumentations have a notion of success: a successfully applied
  instrumentation of type |I a| returns a value of type |a|,
  otherwise the instrumentation failed to apply.
  The operations |matchEnter|, |matchCall|, and |matchEnter'| express matches against join points.
  A successful match returns join-point information. For example, after |matchEnter|
  we can access the values of parameters to the call (symbolically). 
    The operation |call| invokes an ActionScript method, and returns the (symbolic) result value
  that is returned by that method.
  Method calls succeed by default, unless the method uses a special API routine to
  indicate an abort.

  It is important to realize
  that the values that will only be available at run-time will only be represented symbolically:
  we can manipulate them only to a limited extent. 
  Such values are the atomic elements of our \emph{expressions} |E a| that may also involve built-in 
  operations on these values, such asinteger arithmetic, comparisons, boolean arithmetic, 
  field dereference, and array indexing.
  Essentially, a value of type |E a| represents a (symbolic) ActionScript value of type |a|.
  We use Haskell to build such expressions. Not all Haskell expressions 
  (lambdas in particular) can masquerade as |E|-expressions, nor can |E|-expressions
  be scrutinized with Haskell's case expressions.  The reason is that AsilCore does not
  support these types and expressions, and we chose not to provide translations for them at this time.
  We may do so in the future, but in this first attempt we prefer to introduce only those
  elements that are essential and to decide later what else we might need.
  Only a few Haskell \emph{values}, such as integers and strings, are also |E|-expressions.
%%    We purposefully impose this limitation (Section~\ref{sect.obstacles}).
  
  \begin{figure}[htb]
  \begin{smallcode}
  \begin{code}
      -- instrumentations (I)
    matchEnter   :: Prop c (Method a b) -> I (Enter a)
    matchCall    :: Prop c (Method a b) -> I (Enter a)
    matchEnter'  :: I (Enter Any)
    call         :: E (Method a b) -> (many (E)) a -> I (E b)
    guard        :: E Bool -> I ()
    onPrevious   :: I a -> I a
    param        :: Enter Any -> Int   -> I (E Any)

      -- monadic combinators
    (>>=)        :: I a -> (a -> I b) -> I b
    return       :: a -> I a
    fail         :: String -> I a
    
      -- alternative combinators
    (<#>)        :: I a -> I a -> I a
    (<<#>)       :: I a -> I a -> I a
    (<##>)       :: I a -> I a -> I a
    
      -- expression combinators (E)
    (#)          :: E c -> Prop c t -> E t
    embed        :: I a -> E a
    null         :: E (Object t)
    static       :: E Static
    param1       :: Enter (a, r)       -> E a
    param2       :: Enter (a, (b, r))  -> E b
    
      -- match context
    data Enter a = ME { name :: E String, params :: (many (E)) a }
  \end{code}
  \end{smallcode}
  \caption{Excerpt from the Asil combinators.}
  \label{fig.operations}
  \end{figure}
  
  The primitive operations are composed with \emph{monadic} and \emph{alternative} combinators,
  shown at the bottom of Figure~\ref{fig.operations}.
  The monadic bind |m >>= f| composes instrumentations sequentially, and
  parameterizes the continuation |f| with the values returned by |m|,
  if |m| succeeds.
  The monadic return |return x| is an always succeeding instrumentation
  that is effectively a no-op, but returns |x| as value.
  We use do-notation to conveniently write monadic sequences.
  
  In the following example, the operation |matchEnter| matches against the method entry (join) point
  of the @clicked@ method of @MyGame@, and then calls the @clicked@ method @Log@
  with information that is extracted from the event-parameter (using |#|).
  %% using the |#| property dereference
  %% operator.
  \begin{code}
  instrLogClick = do
    m <- matchEnter kk~code~MyGame~clicked
    let  evt  = param1 m
         eX   = evt # kk~flash~events~MouseEvent~localX
         eY   = evt # kk~flash~events~MouseEvent~localY
    call (static # kk~code~Log~clicked) (eX, eY)
    return ()
  \end{code}
  If the match succeeds, then the variable |m| contains information
  about the joint point. Different types of join points typically provide
  different information.
  In this case, we can use |param1| to obtain a typed reference |evt| to the first parameter
  of the method from |m|. From |evt| we can obtain the coordinates of the mouse click
  and pass these to the |Log~clicked| method. For static properties, we use the
  special name |static| as object reference.
  
  On the other hand, |matchEnter'| can match against method-entry join points that do not have
  a statically fixed name nor a statically fixed signature. For example, with the
  following code, we can match on any method-entry join point with a name that
  contains |"clicked"| as substring, and has a first parameter of the
  @MouseEvent@ type.
  \begin{code}
  do  m    <- matchEnter'
      guard (count (params m) == 1)
      guard (substr "clicked" (name m))
      p1   <- param m 1
      evt  <- p1 `cast` tt~flash~events~MouseEvent
      ...
  \end{code}
  In this example, |params| obtains a reference to the parameters of the method from |m|.
  The |guard| instrumentation succeeds if and only if its Boolean parameter is |True|.
  Remember that these operators work on symbolic values |E t| instead of conventional
  Haskell values of type |t|. 
%  Also, anonymous functions do not have a name, and are represented by a @null@
%  reference.

  Similarly, we can instrument the call to @move@ in the method
  @clicked@. The operation |onPrevious| takes an instrumentation
  |p| and applies it to the joint points that (directly) precede in the control
  flow graph, the matched join point (ignoring back edges).
  The operation succeeds if |p| is successfully applied to such a preceding join point
  (at runtime).
  With this operation, we declaratively specify contextual pattern matches.
  \begin{code}
  instrLogMove = do
    onPrevious $ matchEnter
      kk~code~MyGame~clicked
    m <- matchCall kk~code~MyGame~move
    let  from  = param1 m
         to    = param2 m
    call (nmclass # kk~code~Log~move) (from, to)
    return ()
  \end{code}
  In the context of having matched against @clicked@, we match against a call to
  @MyGame.move@, and insert a call to @Log.move@.
  
  The instrumentations |instrLogMove| and |instrLogClick| are defined separately. 
  Instrumentations can be combined with various \emph{alternative} operators.
  Given some continuation |f|, the meaning of |(p <##> q) >>= f| is that branch |p >>= f| and
  branch |q >>= f| are applied to the join point (in that order).
  The meaning of |(p <#> q) >>= f| is that |p| and |q| (in that order) are both applied,
  and the continuation |f| is parametrized with the result of the last succeeding one of the two.
  The strict alternative |p <<#> q| applies |q| only if |p| fails.
  
  The ability to fail raises the question how we deal with the fact that
  our support library contains conventional ActionScript methods that
  may exhibit side effects. At this time, we do not attempt to revert 
  these effects.
  
  %% Note that this is possible when we would use software transactions.
  %% With parallel composition, we can combine a list of individual instrumentations
  %% into a single instrumentation.
  With conventional higher-order functions and parallel composition we combine the
  individual instrumentation.
  \begin{code}
  myInstr :: I ()
  myInstr = foldr (<##>) (fail "initial")
    [ instrLogClick, instrLogMove ]
  \end{code}
  The |fail| operation is a unit of parallel composition.
%%  The |fail| operation explicitly aborts the application of an instrumentation.
%%  It is a unit to parallel composition.
  
  Having specified the complete instrumentation |myInstr| we can then interpret 
  it for each join point in the Flash program according to the semantics that we 
  specified informally in 
  this section. However, we actually implemented an instrumentation monad such that
  its execution yields an AsilCore specification (Section~\ref{sect.asilcore})
  as an intermediate step. How and why we do this is the subject of the next section.
  
  %% Todo: insert reason why we would want to take this partial evaluation step

\section{The AsilCore Language}
\label{sect.asilcore}

%{
%% Format instructions restricted to this section only
%% Might otherwise conflict with variable names in the Haskell blocks

%format match  = "\mathbf{match}"
%format call   = "\mathbf{call}"
%format on     = "\mathbf{on}"
%format previous = "\mathbf{previous}"
%format abort  = "\mathbf{abort}"
%format entry  = "\mathbf{entry}"
%format exit   = "\mathbf{exit}"
%format call   = "\mathbf{call}"
%format prim   = "\mathbf{prim}"
%format static = "\mathbf{static}"
%format prop   = "\mathbf{prop}"
%format name   = "\mathbf{name}"
%format inputs = "\mathbf{inputs}"
%format res    = "\mathbf{res}"
%format args   = "\mathbf{args}"
%format dynamic = "\mathbf{dynamic}"
%format nop     = "\mathbf{nop}"
%format block   = "\mathbf{block}"
%format guid    = "\mathbf{guid}"
%format coerce  = "\mathbf{coerce}"


  The monadic bind is an obstacle in the translation of Asil to
  ActionScript. The right-hand side of a bind is a function, thus
  it would require a nontrivial translation of Haskell functions
  to ActionScript.
  Therefore, we partially evaluate the monadic code to AsilCore,
  which is sequential code. This step eliminates both functions and monads.
  %% For more details... next section?

  \begin{figure}[htb]
  \begin{smallcode}
  \begin{code}
    i  ::=  match m                          -- match against join point |m|
       |    call o (many v) (x :: ty)        -- call with args |many v|, binds result |x|
       |    on previous i                    -- lift to preceeding join points
       |    abort                            -- aborts instrumentation
       |    nop                              -- no-op
       |    coerce v1 (x2 :: ty)             -- coerce type of |v1| to |ty|
       
       |    i1 ; i2                          -- sequential composition
       |    i1 <#> i2                        -- parallel composition
       |    i1 <<#> i2                       -- alternative composition
       
       |    static i                         -- error if |i| not statically resolved
       |    dynamic i                        -- defers eval of prims in |i|
    
    m  ::=  entry  (x1 :: String)  (many (x2 :: ty))  -- name |x1|, args |many x2|
       |    exit   (x1 :: String)  (x2 :: ty)         -- name |x1|, result |x2|
       |    call   (x1 :: String)  (many (x2 :: ty))  -- name |x1|, args |many x2|
       |    ...                                       -- etc.
    
    o  ::=  prim x                           -- primitive with the name x
       |    static s                         -- static property named s
       |    dynamic v1 v2                    -- call closure |v1| with |v2| as @this@
    
    v  ::=  x | c                            -- identifiers and constants
  \end{code}
  \end{smallcode}
  
  \caption{AsilCore abstract syntax.}
  \label{fig.asilcore}
  \end{figure}
  
  The AsilCore language (Figure~\ref{fig.asilcore}) is an intermediate language 
  that can be mapped strightforwardly to ActionScript advice (to be precise, we map directly
  to AVM2 byte code).
  An AsilCore term has a statically fixed control flow graph, which
  simplifies the injection into compiled ActionScript, since it can be represented
  with conventional branching instructions of ActionScript.
  
  \begin{figure}[htb]
  \begin{smallcode}
  \begin{code}
  type I a = Uid ->                        -- unique id
    (Uid -> a -> Core -> (Core, Uid)) ->   -- continuation
      (Core, Uid))                         -- res. core  

  run f = fst $ f 1 (\n _ i -> (i, n))     -- |I a -> Core|
  
  instance Functor I where
    fmap f g  =  \n k -> g n (\m v -> k m (f v))

  instance Monad I where
    return x  =  \n k -> k n x nop

    m1 >>= f  =  \n1 k -> m1 n1 $ \n2 v1 i1 ->
                   (f v1) n2 $ \n3 v2 i2 -> k n3 v2 (i1 ; i2)

    fail s    =  \n k -> k n (error s) abort
  
  instance Alternative I where
    empty       =  \n1 k -> k n1 undefined abort
    m1 <#> m2   =  \n1 k ->  m1 n1 $ \n2 v1 i1 ->
                    m2 n2 $ \n3 v2 i2 -> k n3 v2 (i1 <#> i2)
    m1 <##> m2  =  \n1 k ->  let  (i1,n2) = m1 n1 k
                                  (i2,n3) = m2 n2 k
                             in (i1 <#> i2, n3)

  embed i  = \n k -> k n () i              -- |Core -> I ()|
  fresh    = \n k -> k (n+1) (ESym n) nop  -- |I (E a)|
  
  matchBlockEntry = do                     -- |I Block|
    vId <- fresh
    embed (match block guid (toCoreIdent vId))
    return $ Block { blockGuid = vId }
  \end{code}
  \end{smallcode}
  \caption{Asil to Core execution.}
  \label{fig.execution}
  \end{figure}
  
  In the translation to AsilCore (Figure~\ref{fig.execution}), we implement the 
  instrumentation monad in continuation passing style~\cite{Sussman75scheme} to deal with the |<##>| operator.
  The monadic binds are replaced with static sequencing.
  Even if Asil programs cannot use Haskell code to scrutinize E-values, we can 
  parametrize a function |f| in the right-hand side of a bind  
  with the (typically) symbolic value |v1| computed in the left-hand side, 
  and continue partial evaluation. As a result, such a value |v1| will be restricted
  in how it can be manipulated at run-time.

  We desugar Asil programs a bit further to simplify the AsilCore representation.
  We express all branching with the local combinators |<#>| and |<<#>|.
  Values in AsilCore are either identifiers that symbolically represent ActionScript
  objects, or constants that represent concrete ActionScript objects.
  Matches introduce identifiers for the contextual values of the match,
  and calls introduce identifiers for the result value.
  Calls take values as argument, which are either constants or identifiers.
  An identifier may only be introduced once, and must be introduced before being used.
  This property holds for AsilCore programs that are derived from well-typed
  Asil programs.

  The following fragment shows the result of the translation of the
  running (Asil) example to AsilCore. The abstractions offered by Asil are seen to be
  expanded away.
  \begin{code}
  {  match entry
       name   x1     :: String 
       args   x2     :: flash.events.MouseEvent
  ;  call prim equals
       args   x1, "code.MyGame:clicked"
       res    x3     :: Boolean
  ;  call prim guard
       args   x3
       res    x4     :: void
  ;  call prim deref
       args   x2, "flash.events.MouseEvent:localX"
       res    x5     :: int
  ;  call prim deref
       args   x2, "flash.events.MouseEvent:localY"
       res    x6     :: int
  ;  call static "code.Log:clicked"
       args   x5, x6
       res    x7     :: void
  } <#>  -- composes the two instrumentations in parallel
  {  on previous
       {  match entry
            name     x8     :: String
            inputs   x9     :: flash.events.MouseEvent
       ;  call prim equals
            args     x8, "code.MyGame:clicked"
            res      x10    :: Boolean
       ;  call prim guard
            args     x10
            res      x11    :: void }
  ;  match call
       name     x12    :: String
       inputs   x13    :: code.Square, x14 :: code.Square
  ;  call prim equals
       args     x12, "code.MyGame:move"
       res      x15    :: Boolean
  ;  call prim guard
       args     x15
       res      x16    :: void
  ;  call static "code.Log:move"
       args     x13, x14
       res      x17    :: void
  } <#> abort  -- may be optimized away
  \end{code}
  The primitive |deref| takes an object and a fully qualified name as string,
  and returns the associated property if it exists, otherwise it fails.
  %% \footnote{
  %%   Private and protected fields of such an object may not be accessible.
  %%   As part of the code transformation, we inject accessor functions that can
  %%   do so. However, we refrain from transforming the standard library.
  %% }.
  To call a property of an object, |deref| can be used to
  obtain a closure of type |Function|, followed by a |call dynamic|.
  The primitive |guard| succeeds if and only if its argument is true,
  and the |equals| primitive has the same semantics as the
  ActionScript @==@.
  
  The declarative |on previous| instruction lifts its block to all
  preceding join points in the static control flow graph of the
  method. Operationally, it also introduces an additional local
  boolean that keeps track whether the instrumentation was applied.
  If it was applied, we can access its values via the local variables that the
  lifted instrumentation introduced.
  
  %{
  %format acall = "\Varid{call}"
  \begin{figure}[htb]
  \begin{smallcode}
  \begin{code}
  <# abort #>                   ~>  thisJoinPoint.aborted = true
  <# nop #>                     ~>  ;
  <# call o (many v) (x :: ty)  ~>  <# x #> = (ty) (<# o #> (many (<# v #>)));
  <# match exit (x1 :: String) (x2 :: ty2) #>  ~>
    thisJoinPoint.aborted |= thisJoinPoint.isMethodExit;
    if (not thisJoinPoint.aborted) {
      <# x1 #> = thisJoinPoint.nm; <# x2 #> = thisJoinPoint.ret }
  
  <# i1 ; i2 #>     ^^^  ~>  <# i1 #>;  if (not thisJoinPoint.aborted) { <# i2 #> }
  <# i1 <#> i2  #>  ^^^  ~>  <# i1 #>;  thisJoinPoint.aborted = false; <# i2 #>
  <# i1 <<#> i2 #>  ^^^  ~>  <# i1 #>;  if (thisJoinPoint.aborted) {
                                     thisJoinPoint.aborted = false; <# i2 #> }
  
  <# prim x #>         as  ^^^  ~>  Primitives.<# x #> (as)
  <# static C:x #>     as  ^^^  ~>  <# C #> . <# x #> (as)
  <# dynamic v1 v2 #>  as  ^^^  ~>  v1.acall(v2, as)
  \end{code}
  \end{smallcode}
  \caption{AsilCore to ActionScript translation.}
  \label{fig.translation}
  \end{figure}
  %}
  
  The obtained AsilCore instrumentation can be mapped to ActionScript code as defined in 
  Figure~\ref{fig.translation}, and can then be weaved in at join points in the program.
  This requires an ActionScript method for each primitive operation, and we need to
  construct a reflexive @thisJoinPoint@ value~\citep{Hilsdale:2004:AWA:976270.976276}
  at each join point. For a typical join point, only a small part of the instrumentation 
  is applicable (or even none at all).
  Therefore, our instrumentation weaver Asic (Section~\ref{sect.asic}) partially 
  evaluates the AsilCore term, and only weaves in the residual term.
  
%%  We considered a variation
%%  that applies the block to all join points in order to encode AspectJ's
%%  dynamic |cflow| pattern. However, we can already achieve such inter-procedural
%%  effects by maintaining stacks with information in the methods of the support
%%  library.

%}
  
\section{The Instrumentation Weaver Asic}
\label{sect.asic}
  
%if False
  The Asic tool takes an AsilCore term and applies it to all join points of the
  SUT. This involves yet another partial evaluation step. The following pipeline
  summarizes the instrumentation process. Each box represents a partial evaluation phase,
  and corresponds to the function arrows as mention in Section~\ref{sect.introduction}.
  \begin{center}
  \begin{tikzpicture}
    [ tool/.style={rectangle, draw=black, fill=none, very thick, font=\normalsize, node distance=18mm}
    , inv/.style={draw=none,fill=none,font=\scriptsize, node distance=17mm}
    ]
  \node[inv](before){};
  \node[tool, right of=before](hs){|hs|};
  \node[tool, right of=hs](asic){|asic|};
  \node[tool, right of=asic](flash){|flash|};
  \node[inv, right of=flash](after){};
  \node[inv, below of=asic, node distance=10mm](under2){};
  \node[inv, below of=hs, node distance=10mm](under1){};
  
  \draw[->] (before.east)  to node[midway,above,inv]{|asil|}  (hs.west);
  \draw[->] (hs.east)      to node[midway,above,inv]{|core|}  (asic.west);
  \draw[->] (asic.east)    to node[midway,above,inv]{|SUT'|}   (flash.west);
  \draw[->] (flash.east)   to node[midway,above,inv]{|trace|} (after.west);
  \draw[->] (under1.north) to node[near start,inv,xshift=3mm]{|sym|}         (hs.south);
  \draw[->] (under2.east)  to node[near start,inv,xshift=5.5mm]{|support|}   (asic.south);
  \draw[->] (under2.west)  to node[near start,inv,xshift=-3mm]{|SUT|}        (asic.south);
  \end{tikzpicture}
  \end{center}
  We partially evaluate an Asil expression via conventional Haskell execution using GHC
  to a AsilCore term, the AsilCore code is partially evaluated by AsilCore, and final evaluation
  is performed by the Flash environment. The JIT compiler of Flash takes care of optimizing
  the additional bytecode that is introduced by Asic.
%endif
  
  The Asic tool takes an AsilCore term and applies it to all join points of the
  SUT. This involves yet another partial evaluation step.
  The success of a match is always resolved
  statically, and results in static bindings for identifiers related to the name of
  a method, unique id of a block, and possibly also the types and number of
  arguments. Calls to primitive methods may be performed statically, depending on
  what is known statically about their parameters. Reachable calls to non-primitive methods
  always result in a residual call in order to cater for possible side effect.
  
  AsilCore terms can be large when the instrumentation is \emph{generated} from, e.g., the
  control-flow graph. For example, there may be a dedicated AsilCore branch for each method in the SUT.
  Asic applies several optimizations to the AsilCore term before and after the evaluation
  process. For example, for each AsilCore branch, we collect the static constraints on the join points
  in a trie structure, such that we can quickly obtain the branches that potentially match.
%  This is a typical example of unification.
%%%% well... it's similar to typical matching problems

%%  The declarative |dynamic| instruction can be used to force evaluation to
%%  the actual execution time of the PUI. This is needed in combination with the
%%  |typeof| method that can always be done statically (because |core| values are
%%  typed), but yield more specific information during the PUI's execution.
%%  Non-primitive calls are never performed statically.

  Aspect-weaving is an instance of syntax-directed translation.
  Attribute Grammars (AGs)~\citep{uuagc} provide a declarative language to write
  composable implementations of such translations. Hence, we implemented Asic
  with AGs. An advantage of the approach, compared to a monadic one,
  is that aspects of the translation can be described seperately, e.g.,
  the threading of environments, location information and substitutions.
  %% Comparison to monads?
  

\section{Reflection}
\label{sect.reflection}

  \paragraph{Embedding.}
  Haskell provides Asil's abstraction facilities for free.
  Also, we piggy back on Haskell's type system to type check Asil specifications.
  For the syntactic sugar related to E-expressions and function calls
  we used GADTs and common type class extensions.
  The use of type classes was not always intuitive due to confusing conflicts between
  overlapped instances and functional dependencies.
  
  %% To deal with (rather confusing) conflicts between overlapped instances and functional dependencies
  %% we needed type-class hackery.
  %%% ~\citep{Kiselyov:2004:STH:1017472.1017488}.
  
  In our embedding, join-point structures and the SUT's runtime values are represented by
  E-expressions. The syntax of E-expressions is purposefully limited so that we can 
  use Haskell's abstraction facilities, but refrain from
  embedding arbitrary Haskell expressions in AsilCore and ActionScript.

  \paragraph{Executable specifications.}
  Specifications of Flash and the ActionScript bytecode format are publicly available.
  However, we discovered that these specifications are incomplete with respect to
  new Flash versions, and also contain errors that are hard to track down.
  Therefore, we initially implemented a monadic binary parser with
  @uuparsing-lib@~\citep{Swierstra:2009:CPS:1614478.1614484},
  which provides results lazily, in combination with error correction. This provided us with sufficient context
  to find bugs in our own implementation and the specification itself.
  Our parser can actually serve as a formal specification, in an executable, testable form.
  
  \paragraph{Performance.}
  Our use case, the Habbo application, is a large SWF file that
  consists of 3,000 classes with 25,000 methods and 800,000 instructions.
  Simply parsing the application takes several seconds, which is surprisingly
  long despite the size of the application.
  We reimplemented our parser as a binary deserializer using the
  @Data.Binary.Get@ to no avail. Profiling showed us that garbage collection takes
  about 60\% of the time. Our hypothesis is that this is an interplay between the large and
  relatively long lived AST that we construct, and the short-lived closure-garbage produced
  by the parsers. If it would be possible to allocate the AST (via an annotation on the
  data constructors) directly in
  an older memory generation, we expect it to improve our parser's performance.
  A modification of the parser that only verified the structure of the SWF files but does
  not construct the AST takes about half the time and seems to confirm our hypothesis.

  A more pressing problem is that GHC crashes on the huge symbol file that we
  generate from the Habbo SWF. We consider generating a separate Haskell module for each
  ActionScript class.

  \paragraph{Proofs.}
  The structure of Asil is based on the functor, monad, applicative and alternative
  interfaces~\citep{Mcbride:2008:APE:1348940.1348941}. This allows us to immediately use
  many of the convenient functions that are written in terms of these interfaces.
  Also, these interfaces have to satisfy satisfy a number of laws. Proving that these
  laws hold gives us more confidence and insight in our implementation. For example,
  our monad (Figure~\ref{fig.execution}) satisfies the monad laws semantically, but
  not structurally, since the monadic |return| introduces no-ops.
  In our actual implementation, we directly eliminate superfluous no-ops and 
  trivially-dead branches as optimization.
  

  %if False
  With typed abstract syntax, we can represent functions in Haskell as data type.
  These can be mapped on top of ActionScript objects. With an approach like
  \emph{Type Safe Observable Sharing in Haskell}, we can lift Haskell functions
  to ActionScript. However, general case distinction is still not possible.
  Alternatively, we could use UHC to generate ActionScript code, as Atze showed on
  the UHC blog. Or template haskell with reification.
  
  %%%%% Idea: Log Based Testing of Asynchronous Systems
  %%%%% http://www.replicator.org/content/idea-log-based-testing-of-asynchronous-systems
  %endif

%if False
\section{Related Work}
\label{sect.relatedwork}

  With respect to language
  design, a discussion with AspectJ would be nice. AspectJ lacks abstraction:
  you can write parametrized point cuts, but these parameters are symbolic
  program values where no case distinction can be performed on. Neither are
  "higher-order" pointcuts possible. This is where functional programming
  can help us.
%endif

\section{Conclusion}
\label{sect.conclusion}

  We presented Asil, an expressive AOP EDSL for the instrumentation of
  ActionScript programs. Its design and implementation is based on
  well-known functional programming techniques, such as monads. The
  implementation exploits Haskell's facilities for abstraction,
  syntactic sugar and static typing.
  
  Like AspectJ, Asil allows point cuts to be specified
  with constrained patterns.
  In addition, Asil allows these constraints to be specified as
  runtime ActionScript methods and possibly instrumentation-time
  primitive functions. Finally, (higher-order) functions can be
  used to abstract over instrumentations and to derive instrumentations
  from others.

  The possibility to write higher-order functions is an essential
  feature that we sorely miss in many DSLs.
  Fortunately, this comes for free for EDSLs in functional languages.
  

%if acknowledgements
%% Can possibly also be moved as footnote to the first page of the paper
\acks
This work was carried out in terms of the Fittest project.
%% Todo: insert more appropriate acknowledgements
%endif

\bibliographystyle{abbrvnat}
\bibliography{references}


%if appendices
\appendix

\section{Implementation}
\label{sect.implementation}

 A bit more insight in the structure of the tools.
  Some specifics here, such as the use of Haskell, Attribute Grammars.
  Perhaps mention that the parser written with parser combinators is a
  more complete specification of the ActionScript bytecode grammar than
  the actual document. 

  The actual partial evaluation done by |asic| proceeds in two steps. As a
  first step, we pre-evaluate and optimize the |core| term. We associate a
  symbolic value with each identifier and execute the primitive calls
  that can run already, which leads to concrete bindings for some of the
  identifiers. We substitute primitive calls that failed with |abort|.
  We reorder some instructions to the front, under the restriction that no
  instruction that can potentially fail may shift over a non-primitive call.
  This ensures that the observable order of side effect is retained.
  Also, an instruction can only be shifted over another instruction if it
  does not break def-use dependencies. Under these restrictions, we try to
  order match instructions to the front, followed by |on previous| instructions,
  and finally the calls. We also replace a sequences of the form |abort s; m| with
  |abort s|, and |m <#> abort s| and |m <<#> abort s| with |m| (and the symmetric cases as well).
  Finally, we left-factor the result. The |core| term in this form allows us to a trie
  structure to quickly determine
  what top-level branches of a |core| term match a join point. This is important as the
  PUI may have hundreds of thousands of join points, of which only a small portion have
  an applicable instrumentation, which is only a small
  
  Instead of manually adding the calls to the logging methods in the source code,
  we develop on a tool |asic| that injects the logging code in the compiled
  Flash program. We call this Flash program the Program Under Instrumentation
  (PUI).
  The code to inject is determined by a program of the domain specific,
  functional programming language |asil| (an EDSL in Haskell) that
  describes what code is to be inserted at which locations.
  %%
  %% Should not give too much argumentation here (save that for introduction)
  %%
  An |asil| expression describes an \emph{instrumentation} to be performed at
  \emph{join points} of the PUI.

  The |asic| tool generates a Haskell library from the PUI that contains symbol information
  of types, properties and methods that occur in the PUI.
  This symbol information allows the Haskell compiler to type check the instrumentation.
  For example, the @clicked@ method
  lives in class @MyGame@ of package @code@, hence the fully qualified name for this method
  is @code.MyGame.clicked@, and the transcribed Haskell identifier |k~code~MyGame~clicked|.
  This symbol actually stores the name of the property, and has a type that describes the
  method's signature.
  \begin{code}
  type TT~code~MyGame~clicked =
    TT~flash~events~MouseEvent -> TT~void

  kk~code~MyGame~clicked ::
    TT~code~MyGame `Key` TT~code~MyGame~clicked
  kk~code~MyGame~clicked = K "code.MyGame:clicked"
  \end{code}
%endif

\end{document}