\chapter{Annotations in MultiRec} \label{sec:generic} In this chapter we will reimplement the annotation framework of the previous chapter in the generic programming library \emph{MultiRec} \cite{rodriguez2008mutrec}. In a sense, MultiRec is more generic than the open recursion approach we took in the previous chapter, offering more flexibility but also more complexity. We will see why in a moment. Before we look at MultiRec in particular, let us have a short introduction to generic programming in general: what is it and why would you implement the same functionality in multiple generic libraries? \section{Introduction to generic programming} The report \emph{Comparing Libraries for Generic Programming in Haskell} \cite{rodriguez08comparing} gives a very nice explanation of the term \emph{generic programming}: \begin{quote} Software development often consists of designing a datatype to which functionality is added. Some functionality is datatype specific. Other functionality is defined on almost all datatypes, and only depends on the structure of the datatype; this is called datatype-generic functionality. Examples of such functionality are comparing two values for equality, searching a value of a datatype for occurrences of a particular string or other value, editing a value, pretty-printing a value, etc. \end{quote} In Haskell 98, datatype genericity is available for a limited number of functions as |deriving| clauses following datatypes.\footnote{See the Haskell 98 Report, section 10 ``Specification of Derived Instances''.} The clause |deriving Eq| following the definition of |data BareExpr|, for example, makes available an operator |(==) :: BareExpr -> BareExpr -> Bool| for comparing two values of type |BareExpr|. To implement such `deriving' behavior for a custom type class, one would have to extend the compiler and analyze the datatype in question. Because this is often a lot of tedious work, programmers have looked for suitable abstractions that make programming this generic behavior over datatypes easier, by providing different interfaces to this low-level view of datatypes: thus generic programming was born. Because these interfaces are often less flexible, they can be made substantially simpler and easier to use. Of all these interfaces, Template Haskell \cite{sheard02template} is one of the most powerful because it aims to offer a one-to-one mapping between the syntax of a Haskell datatype and runtime values describing such a datatype. It allows for generic programming because---among many other things---it lets the programmer write a function that takes a datatype as input and produces some function declarations. This function is evaluated at compile-time, and the produced declarations are then compiled into the final program. This is efficient: there is some overhead at compile-time, but the generated code is as fast as if it had been written by hand. Template Haskell has a few disadvantages, however. Because it is so powerful, it is also reasonably verbose and even simple generic functions easily span many lines. In fact, there is so much freedom that it is possible to build type-incorrect or otherwise nonsensical expressions and declarations. Errors in the resulting expressions are caught and reported at compile-time, but because they concern generated code, it is very difficult to give accurate position information about where those errors occurred.\footnote{Something we obviously value very much in this thesis.
:-)} That the errors often concern generated names complicates matters even further. So when code generated by Template Haskell is compiled, the compiler will not fail during the parsing phase, but it might during the type-checking phase. This did not sit well with some programmers: Haskell is known for its powerful type system; was it not possible to create an approach to generic programming directly in Haskell, as a library, so that it is necessarily type-safe? And so the search for good generic programming strategies continued. Some of the resulting generic libraries are Scrap Your Boilerplate (SYB) \cite{laemmel05syb}, Uniplate \cite{Mitchell:2007:uniplate}, Extensible and Modular Generics for the Masses (EMGM) \cite{Oliveira_Hinze_Loh:2006EMGM} and the various |compos| operators \cite{bringert06pattern}. These approaches express their generic functionality in terms of type classes; generic programs then depend on the existence of type class instances for the types they are expected to work on. The programmer can write instances for these types by hand, but most of these libraries offer a way to bootstrap the generic functionality. SYB, for example, allows instances to be created by a |deriving| clause through a compiler extension. Uniplate, in turn, has an implementation in terms of SYB. So which of these libraries should you choose for your own generic program? The differences between these libraries lie mainly in how they deal with families of datatypes, fields of different types, lists (or other collections) of children, type arguments (if at all) and information about constructors such as their name and---if they are operators---their fixity. None of these libraries, however, allows the user to generically change the \emph{shape} of a datatype. Their functionality is restricted to \emph{queries} that compute some value over a tree (like a fold) or \emph{transformations} that allow rewriting tree terms. In section \ref{sec:fixedpoint} we saw how we can write a datatype in such a way that we can later decide what type the children will have by introducing an extra type argument. This added flexibility turned out to be crucial in adapting a datatype to hold position information. Although the degree to which |ExprF| exposes its structure is very useful, it is still very limited, from a generic programmer's point of view. For example, we cannot generically count the number of constructors of a datatype, or count the number of fields in a specific constructor, or generate members of a datatype, or check whether two types are equal. In the context of type-indexed datatypes there is also more structure to be exposed. Zippers, for example, cannot be expressed in terms of a base functor. To divulge more of a type's structure, we can model it using a fixed set of building blocks. All datatypes follow the same pattern: they have a number of constructors and each constructor has a number of fields. This way of modeling datatypes is called \emph{sums of products}, because of the way the number of values of a datatype adds up: a single constructor has zero or more fields. Each of these fields individually has a number of values it can take on. The constructor can be initialized with any combination of these values, so the number of possible instantiations equals the product of each field's number of values. A datatype consists of several constructors. To compute the number of values in a datatype, we take the numbers of values of each constructor and add these up. For example, a datatype with one constructor holding two |Bool| fields and one nullary constructor has $2 \times 2 + 1 = 5$ distinct values.
In this sense, a datatype is a sum (the constructors) of products (the fields of the constructors). Using sums of products is very interesting for our application because it too, just like the approach in the previous chapter, allows the use of fixed points and thus the insertion of annotations at every level. ASTs often comprise several datatypes where one datatype refers to another. Sometimes they are even mutually recursive. Until recently it was unknown how to model such a \emph{family} of datatypes using sums and products without knowing in advance how many datatypes the family would consist of. MultiRec, however, offers a solution to this problem in a type-safe way. Here by `type-safe' we mean that it is impossible for the sums-of-products model of the datatypes to create trees which cannot be written using the original datatypes (i.e. they are completely isomorphic, up to $\bot$) and that no `unsafe' functions (such as |unsafeCoerce|) are used anywhere in the implementation. This is why MultiRec is an interesting generic programming library for our problem: it allows for families of possibly mutually recursive datatypes, exposes a lot of information about the datatypes using sums of products, and uses fixpoints to boot! Let us take a look at some concrete examples. \section{Pattern functors} So far we have used arithmetic expressions as our running example. To exploit MultiRec's capability of working with families of datatypes, we will modify the example to use an extra datatype: \begin{code} data Expr = EAdd Expr Expr | EMul Expr Expr | ETup Expr Expr | EIntLit Int | ETyped Expr Type deriving (Eq, Show) data Type = TyInt | TyTup Type Type deriving (Eq, Show) \end{code} We have gotten rid of the constructors for division and subtraction and in their place have two other constructors: one for creating tuples of expressions and one for expressions with type signatures. The latter uses a separate type |Type| to describe expression types. The tuple constructors of both |Expr| and |Type| allow arbitrary nesting, e.g.~|(1, (2, 3))| and |(Int, (Int, Int))| can be modeled as |ETup (EIntLit 1) (ETup (EIntLit 2) (EIntLit 3))| and |TyTup TyInt (TyTup TyInt TyInt)|, respectively. The type signatures introduce the possibility of type errors, so that we have reason to type-check expressions and possibly report errors later when we look at MultiRec's version of error catamorphisms. Let us translate these datatypes to sums of products. We will use \texttt{multirec-0.4}, available from Hackage\footnote{\url{http://hackage.haskell.org/package/multirec-0.4}}: \begin{code} type PF_Expr = I Expr :*: I Expr :*: U :+: I Expr :*: I Expr :*: U :+: I Expr :*: I Expr :*: U :+: K Int :*: U :+: I Expr :*: I Type :*: U type PF_Type = U :+: I Type :*: I Type :*: U type PF_Tuples = PF_Expr :>: Expr :+: PF_Type :>: Type \end{code} Here the |Tuples| in |PF_Tuples| is the name we have given to our family of datatypes, comprising |Expr| and |Type|. The sums-of-products version of a family of datatypes is called the family's \emph{pattern functor}; that is what |PF| stands for. The pattern functors of the two types are written separately as |PF_Expr| and |PF_Type|. Their structure mirrors the structure of the corresponding datatypes directly. The two pattern functors are then tagged and combined in |PF_Tuples|.
In all, this type is pretty large but consists only of six distinct building blocks: \begin{itemize} \item |:+:| is used to denote choice (sums), be it between datatypes of the family or constructors of a particular datatype. \item |:*:| can be seen as the cons in a constructor's list of fields, forming the products. Just like the cons operator |:| on lists, this operator associates to the right. \item |I xi| is used to indicate recursion into type |xi| of the family in question. It always appears on the left-hand side of a |:*:|. \item |K a| (for \emph{k}onstant) is used when a field has a type that falls outside the family. In our case, constants cannot be selected and will not be annotated with position information. They always appear on the left-hand side of a |:*:|. \item |U| is the |[]| of the list of fields. \footnote{\label{foot:typelist}In the MultiRec paper, |U| is only used to describe constructors with zero fields and does not appear in pattern functors of constructors that do have fields. However, treating constructors consistently as cons lists allows for more elegant type functions later on.} \item |pf :>: ix| tags the pattern functor |pf| as representing datatype |ix|. \end{itemize} Using these building blocks, |PF_Tuples| completely mirrors the structures of the |Expr| and |Type| datatypes. The building blocks' definitions are chosen such that the pattern functor and |Either Expr Type| are isomorphic, i.e.~values of one type can be translated to corresponding values of the other type. This fact is reflected by MultiRec's |Fam| type class: \begin{code} class Fam phi where from :: phi ix -> ix -> PF phi I0 ix to :: phi ix -> PF phi I0 ix -> ix \end{code} The first function, |from|, translates from values of the original two datatypes to the pattern functor versions, while the second function, |to|, takes care of the reverse. These functions' types introduce a few new things. The first of these is their first argument, |phi ix|. In our example |phi| is the family |Tuples|: \begin{code} data Tuples :: * -> * where Expr :: Tuples Expr Type :: Tuples Type \end{code} Type |Tuples| is a generalized algebraic datatype (GADT) that enumerates all the datatypes in the family, with one constructor for each datatype. Each constructor indexes |Tuples| with the type it represents. When |from| pattern matches on one of these constructors, the polymorphic type argument |ix| is refined to a specific, concrete datatype, after which case analysis can be done on the second argument to produce a value of the pattern functor. The dual |to| also receives a |Tuples| value, after which it can pattern match on a value of the pattern functor to produce a value of one of the original types. MultiRec abstracts over specific pattern functors through a type synonym family |PF|, which appears in the type signatures of |from| and |to|: \begin{code} type family PF phi :: (* -> *) -> * -> * \end{code} Again, in our specific example, |phi| stands for |Tuples|, and we have already seen our example's pattern functor, so writing an instance of this type family is easy: \begin{code} type instance PF Tuples = PF_Tuples \end{code} Since |PF Tuples| has kind |(* -> *) -> * -> *|, it must receive two type arguments before it becomes a concrete type. From |class Fam| we learn that they are called |r| and |ix|. These arguments are shared by each functor building block, all the way down to the leaves. The latter argument |ix| indicates the original type the pattern functor represents.
For example, |PF Tuples r Type|, for some |r|, is the type of a pattern functor value representing a |Type|. Functions |from| and |to| use this to constrain their output and input functors appropriately. The |r| serves a purpose similar to that in |ExprF r|: it is used to control the values at the recursive child positions. But where the |r| in |ExprF r| had kind |*|, this |r| has kind |* -> *|: |r|'s argument indicates which type is recursed into. This argument was unnecessary in the previous chapter, because families of datatypes were not supported yet and there was only one type to recurse into. Let us take a look at the definitions of the building blocks so that we can write an |instance Fam Tuples|. \begin{code} data (f :+: g) r ix = L (f r ix) | R (g r ix) infixr 5 :+: \end{code} The two constructors reflect the choice between the |L|eft functor and the |R|ight functor. The two type arguments |r| and |ix| are simply passed down to either functor. \begin{code} data (f :*: g) r ix = (f r ix) :*: (g r ix) infixr 7 :*: \end{code} Values of the product type constructor contain one value of each side. Again, the type arguments are passed on unchanged. \begin{code} data (f :>: xi) r ix where Tag :: f r ix -> (f :>: ix) r ix infix 6 :>: \end{code} The |Tag| constructor fixes the index |ix| of the enclosed functor to be the same as the right-hand side of the |:>:| type constructor. This ensures that when creating a pattern functor value, only values of the proper branch representing a particular datatype are allowed, and---conversely---that only values of the right branch need to be pattern matched on when converting the pattern functor values back to the original datatypes. \begin{code} data K a r ix = K a \end{code} Constants discard the |r| and |ix| and only hold a simple value of type |a|. \begin{code} data I xi r ix = I (r xi) \end{code} An |I xi| in a pattern functor marks a recursive occurrence of type |xi|. Recursive positions do not need to be the same as the enclosing type, so the |ix| argument is ignored. The |r| determines what is actually done with the |xi|. There are several interesting choices for |r|. Two of them are the following datatypes: \begin{code} newtype I0 a = I0 { unI0 :: a } newtype K0 a b = K0 { unK0 :: a } \end{code} The former results in values of the original datatypes at the recursive positions, while the latter ignores the index and uses a constant type. The former is used in the |Fam| type class, so |to| and |from| effectively transform the values only one level deep (a \emph{shallow} conversion). We will see another example of |r| later on when we look at fixed points of pattern functors. There is also a building block that describes constructor metadata, such as its name or fixity, but we will not need that information and it is safe to leave it out of our pattern functor, so we omit the description of its workings here. Now that we know how the building blocks are defined, we can write the conversion functions for our |Tuples| family. First, the conversion from the original datatypes to the pattern functors: \savecolumns \begin{code} instance Fam Tuples where from Expr ex = L . Tag $ case ex of EAdd x y -> L $ I (I0 x) :*: I (I0 y) :*: U EMul x y -> R . L $ I (I0 x) :*: I (I0 y) :*: U ETup x y -> R . R . L $ I (I0 x) :*: I (I0 y) :*: U EIntLit n -> R . R . R . L $ K n :*: U ETyped e t -> R . R . R . R $ I (I0 e) :*: I (I0 t) :*: U from Type ty = R .
Tag $ case ty of TyInt -> L $ U TyTup x y -> R $ I (I0 x) :*: I (I0 y) :*: U \end{code} The outermost case analysis shows the magic that is pattern matching on GADT values: both |Expr| and |Type| are constructors of the |Tuples| GADT, and depending on which value is given to |from|, the inner case analyses can pattern match on constructors of the relevant, more specific types. The two outer case constructs wrap all of their results in |L . Tag| and |R . Tag|, respectively. This matches the pattern functor's outer layers: |PF_Expr :>: Expr :+: PF_Type :>: Type|. Type |PF_Expr| uses nested binary sums. Because these associate to the right, the constructors progressively need more |R|'s in |from|'s definition to reach their position in the pattern functor. To the right of the dollar signs is the construction of each constructor's fields. The opposite |to| is very similar to |from|, except that the pattern functor constructors are on the left-hand side and the original datatype's constructors are on the right. Unfortunately, we cannot eliminate some of the parentheses using function composition as we could in |from|. \restorecolumns \begin{code} to Expr (L (Tag ex)) = case ex of L (I (I0 x) :*: I (I0 y) :*: U) -> EAdd x y R (L (I (I0 x) :*: I (I0 y) :*: U)) -> EMul x y R (R (L (I (I0 x) :*: I (I0 y) :*: U))) -> ETup x y R (R (R (L (K n :*: U)))) -> EIntLit n R (R (R (R (I (I0 x) :*: I (I0 y) :*: U)))) -> ETyped x y to Type (R (Tag ty)) = case ty of L U -> TyInt R (I (I0 x) :*: I (I0 y) :*: U) -> TyTup x y \end{code} Finally, we need to write instances of |El| for every member of our family. This type class is used in certain instances for the building blocks when proofs (elements of the GADT |Tuples|) are needed: \begin{code} class El phi ix where proof :: phi ix instance El Tuples Expr where proof = Expr instance El Tuples Type where proof = Type \end{code} That's it! Now we have everything we need to do some serious multi-datatype generic programming. \section{Fixed points in MultiRec} \label{sec:multirecfixedpoints} As we saw, the pattern functor embedding is shallow. Because the pattern functor abstracts over the recursive positions, we can use fixed points again to achieve a deep embedding: we just need to fix the type argument |r|---which determines the shape of the children---to the fixed point itself. To find out what the fixed point datatype should look like for a higher-order functor |f| (for example, |PF Tuples|), we write down three iterations of the recursion: \begin{code} type InfiniteFix f ix = f (f (f (...))) ix \end{code} As before, we observe the regularity of the recursion in this infinite datatype to be able to express the recursion in a Haskell datatype: \begin{code} newtype HFix f ix = HIn { hout :: f (HFix f) ix } \end{code} Now a deep embedding of a value of the |Tuples| family can be expressed as |HFix (PF Tuples)|. And because we are using fixed points again, we can insert our annotations at every level. We do not need to build a custom |Ann| datatype like before: we can reuse building blocks |K| and |:*:| to couple a constant with a functor. This gives us what we need to build the MultiRec version of |AnnFix| and |AnnFix1|. We reuse these names here: it will always be clear from the context which is meant, because the two versions of the fixed points are never mixed. \begin{code} type AnnFix x phi = HFix (K x :*: PF phi) type AnnFix1 x phi = (PF phi) (AnnFix x phi) \end{code} Using |K| means the index is discarded, making the annotation type |x| independent of the index.
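To make this representation concrete, here is what a single fully annotated leaf looks like: the literal |2| from the input |"2 + 3"|, paired with the kind of |Bounds| annotation the parser developed later in this chapter attaches to it. This is only a sketch; the name |annotatedTwo| is ours and the exact margins are illustrative. \begin{code}
-- An EIntLit leaf: the right-hand component follows EIntLit's path
-- through PF_Tuples, while the K component on the left carries the annotation.
annotatedTwo :: AnnFix Bounds Tuples Expr
annotatedTwo = HIn (K bounds :*: L (Tag (R (R (R (L (K 2 :*: U)))))))
  where bounds = Bounds { leftMargin = (0, 0), rightMargin = (1, 2) }
\end{code}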
If we wanted, we could use an associated datatype \cite{chakravarty05associated01} to make the annotation type depend on the specific datatype in the family. But we will be using source locations throughout the whole tree, so |K| suffices. \section{Pattern functors are traversable} If we can translate a value to its pattern functor, we can automatically |traverse| it\footnote{Starting with GHC 6.12, datatypes may derive |Traversable| instances!}: all we need to do is build traversal functions for the building blocks. What would such a traversal function look like for the generic building blocks? Recall the traversal function described by McBride and Paterson: \begin{code} traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b) \end{code} The first argument |a -> f b| is used on the recursive positions of a datatype constructor. The resulting actions are then sequenced using |(<*>)|, resulting in one bigger action with the new traversable datatype as result. In the higher-order version of |traverse|, the first function will also be called with the recursive positions as arguments. However, in the generic case, we do not know the type of the recursive position: it might be any type in the family |phi|. For this reason, the supplied function must be polymorphic in its index so that we can give it a child of any family type. To help the programmer who writes this function, we also pass it a witness |phi ix| so that the function can pattern match on this witness, refining |ix| to a known type. In |traverse| the first argument may also change the child type from |a| to |b|. We cannot allow this in the generic version: the index |ix| must remain the same, because a constructor cannot suddenly change the types of its children. We can, however, change the recursive type constructor |r|. All these considerations are reflected in MultiRec's version of |Traversable|, called |HFunctor|: \begin{code} class HFunctor phi f where hmapA :: Applicative a => (forall xi. phi xi -> r xi -> a (r' xi)) -> phi ix -> f r ix -> a (f r' ix) \end{code} The first argument is the action that is applied to every child position. The type guarantees that it works for all indices and that the action does not change the index of a specific child. The presence of |hmapA|'s second argument, a witness of type |phi ix|, makes it much easier to fix the type argument |phi| when calling |hmapA|. Just like in the |Data.Traversable| module, we can write monadic and pure versions of |hmapA|, making use of a (trivial) |instance Applicative I0|: \begin{code} hmapM :: (HFunctor phi f, Monad m) => (forall xi. phi xi -> r xi -> m (r' xi)) -> phi ix -> f r ix -> m (f r' ix) hmap :: HFunctor phi f => (forall xi. phi xi -> r xi -> r' xi) -> phi ix -> f r ix -> f r' ix hmapM f p x = unwrapMonad (hmapA (\p_i x -> WrapMonad (f p_i x)) p x) hmap f p x = unI0 (hmapA (\p_i x -> I0 (f p_i x)) p x) \end{code} All that remains is to show that these functions can actually be implemented on pattern functors. Again, we start with the building blocks of individual constructors: \savecolumns \begin{code} instance HFunctor phi U where hmapA _ _ U = pure U instance HFunctor phi (K x) where hmapA _ _ (K x) = pure (K x) instance El phi xi => HFunctor phi (I xi) where hmapA f _ (I x) = I <$> f proof x \end{code} The most interesting one is |I|, where |f| is used on the recursive position. That instance also explains why |phi| is part of the class header: a proof for the recursive position must be produced to be able to call |f| on the child.
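For reference, the trivial |Applicative| instance for |I0| alluded to above could look as follows. This is only a sketch: the MultiRec library may already provide an equivalent instance, in which case these declarations should be omitted. \begin{code}
-- I0 is an identity functor, so application is plain function application.
instance Functor I0 where
  fmap f (I0 x) = I0 (f x)

instance Applicative I0 where
  pure x         = I0 x
  I0 f <*> I0 x  = I0 (f x)
\end{code}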
The instances for the combinators use applicative style to traverse child functors: \restorecolumns \begin{code} instance (HFunctor phi f, HFunctor phi g) => HFunctor phi (f :+: g) where hmapA f p (L x) = L <$> hmapA f p x hmapA f p (R y) = R <$> hmapA f p y instance (HFunctor phi f, HFunctor phi g) => HFunctor phi (f :*: g) where hmapA f p (x :*: y) = (:*:) <$> hmapA f p x <*> hmapA f p y instance HFunctor phi f => HFunctor phi (f :>: ix) where hmapA f p (Tag x) = Tag <$> hmapA f p x \end{code} \section{Error catamorphisms in MultiRec} The next step is to translate the error catamorphisms we discussed in section \ref{sec:errorcata} to the pattern functors. It is reasonably easy to express an algebra in terms of pattern functors in the same way as was done with the base functors. We just need to take into account the higher-order recursion parameter |r| and the extra index |ix| of the pattern functor: \begin{code} type ErrorAlg_PF f e a = forall ix. f (K0 a) ix -> Either e a \end{code} Here we choose |K0 a| for |r|, saying that the children should always contain a value of type |a|, regardless of their index. The universal quantification says that it does not matter what the index of the ingoing value is: the result is always |Either e a|. With this algebra type we can write the pattern functor version of |errorCata|: \begin{code} errorCata :: (HFunctor phi f) => ErrorAlg_PF f e r -> phi ix -> HFix (K x :*: f) ix -> Except [(e, x)] r errorCata alg p_f (HIn (K k :*: f)) = case hmapA (\p_g g -> K0 <$> errorCata alg p_g g) p_f f of Failed xs -> Failed xs OK expr' -> case alg expr' of Left x' -> Failed [(x', k)] Right v -> OK v \end{code} Although the names of the constructors that are pattern matched on are different, the function's implementation is similar to the old |errorCata|. An extra proof must be carried around and passed to |hmapA|, but just as before the |Applicative| instance of |Except| is used to collect errors. However, writing such an algebra for this catamorphism is not as easy as previously, because the algebra cannot use the constructors of the original datatype. Instead, it has to pattern match on the chains of |L|'s and |R|'s of the pattern functor, which make the algebra a lot longer and less clear. The same problem occurs when writing normal algebras (as opposed to error algebras). The MultiRec paper \cite{rodriguez2008mutrec} solves the problem by automatically translating the pattern functor to a more convenient type in which the algebras can be expressed, using a type synonym family. This derived type cannot be applied directly to values of the pattern functor, but it is translated back to a function that is applicable to pattern functors using a type class with instances for the various functor building blocks. We will adapt this strategy to work for our error catamorphisms. We skip the discussion of normal catamorphisms; the reader is referred to the MultiRec paper for this. Instead, we jump directly to the encoding of the error catamorphisms in MultiRec. It all starts with the type synonym family that computes the more convenient type: \begin{code} type family ErrorAlg (f :: (* -> *) -> * -> *) -- pattern functor (e :: *) -- error type (a :: *) -- result type :: * -- resulting algebra type \end{code} Since this type function will be applied to pattern functor building blocks, we need to give an instance for every single building block. 
Let us start with the building blocks of individual constructors: \savecolumns \begin{code} type instance ErrorAlg U e a = Either e a type instance ErrorAlg (K b :*: f) e a = b -> ErrorAlg f e a type instance ErrorAlg (I xi :*: f) e a = a -> ErrorAlg f e a \end{code} The first case handles the base case of every constructor: for every constructor, the error algebra should end in |Either e a|. The second and third cases handle cons nodes of constant fields and recursive fields, introducing an extra argument to the algebra. For constant fields, the type of this extra argument is equal to the type |b| of the field. For recursive positions, however, we ignore the index that is recursed on and expect an argument whose type equals the result type of the algebra. So just like before, the algebra may assume that children have already been folded and that there were no errors. Again, if desirable, the algebra's result type can be made dependent on the index using an associated datatype. \restorecolumns \begin{code} type instance ErrorAlg (f :+: g) e a = (ErrorAlg f e a, ErrorAlg g e a) type instance ErrorAlg (f :>: xi) e a = ErrorAlg f e a \end{code} Constructors in a datatype and datatypes in a family are combined using |:+:|. In such cases, algebras have to be provided for both options, so the algebras are coupled together in a tuple. Tags are ignored; the recursive algebra type is used directly. With these instances, the computed types |ErrorAlg PF_Expr| and |ErrorAlg PF_Type| are equivalent to these type synonyms: \begin{code} type ExprErrorAlg e a = (a -> a -> Either e a) -- EAdd :&: (a -> a -> Either e a) -- EMul :&: (a -> a -> Either e a) -- ETup :&: (Int -> Either e a) -- EIntLit :&: (a -> a -> Either e a) -- ETyped type TypeErrorAlg e a = Either e a -- TyInt :&: (a -> a -> Either e a) -- TyTup \end{code} Here |(:&:) = (,)|, a right-associative infix notation for binary tuples, is used to make the types easier to read. As said before, these derived algebra types are very convenient to work with, but they cannot be directly applied to pattern functors. To still be able to use them, we describe how to convert such a convenient algebra to one that can work on pattern functors using a type class: \begin{code} class MkErrorAlg f where mkErrorAlg :: ErrorAlg f e a -> ErrorAlg_PF f e a \end{code} Instances of this class correspond directly to the instances of the type synonym family above: \begin{code} instance MkErrorAlg U where mkErrorAlg x U = x instance MkErrorAlg f => MkErrorAlg (K a :*: f) where mkErrorAlg alg (K x :*: f) = mkErrorAlg (alg x) f instance MkErrorAlg f => MkErrorAlg (I xi :*: f) where mkErrorAlg alg (I (K0 x) :*: f) = mkErrorAlg (alg x) f instance MkErrorAlg f => MkErrorAlg (f :>: xi) where mkErrorAlg alg (Tag f) = mkErrorAlg alg f instance (MkErrorAlg f, MkErrorAlg g) => MkErrorAlg (f :+: g) where mkErrorAlg (alg_f, _) (L x) = mkErrorAlg alg_f x mkErrorAlg (_, alg_g) (R y) = mkErrorAlg alg_g y \end{code} Here the benefit we alluded to in footnote \ref{foot:typelist} becomes apparent: because lists of field types always end in the nil type |U|, we don't have to write instances for lone occurrences of |K a| and |I xi| as they always appear on the left-hand side of a |(:*:)|. To demonstrate that this approach works in practice, we look at an algebra that infers the type of expressions of our |Tuples| language.
\begin{code} inferType :: ExprErrorAlg String Type :&: TypeErrorAlg String Type inferType = ( equal "+" & equal "*" & tup & const (Right TyInt) & equal "::" ) & ( Right TyInt & tup ) where equal op ty_1 ty_2 | ty_1 == ty_2 = Right ty_1 | otherwise = Left ("lhs and rhs of " ++ op ++ " must have equal types") tup ty_1 ty_2 = Right (TyTup ty_1 ty_2) \end{code} The algebra says that both operands of |+| and |*| must have equal types. The algebra also checks whether explicit type signatures using |::| match the types of the expressions on the left-hand sides. To test this algebra, we use a function |readExpr :: String -> AnnFix Bounds Tuples Expr| that we will define later when we discuss parsing annotated expressions with MultiRec: \begin{code} > let expr_1 = readExpr "(1, (2, 3))" > errorCata (mkErrorAlg inferType) Expr expr_1 OK (TyTup TyInt (TyTup TyInt TyInt)) > let expr_2 = readExpr "(1 :: (Int, Int), 2 + (3, 4))" > errorCata (mkErrorAlg inferType) Expr expr_2 Failed [ ("lhs and rhs of :: must have equal types", Bounds {leftMargin = (1,1), rightMargin = (16,16)}) , ("lhs and rhs of + must have equal types", Bounds {leftMargin = (17,18), rightMargin = (28,28)}) ] \end{code} \section{Constructing recursively annotated trees} In section \ref{sec:multirecfixedpoints} we translated the |AnnFix| and |AnnFix1| type synonyms from section \ref{sec:fixedpoint} to use MultiRec terms. Function |mkAnnFix| can be translated in a similar fashion: \begin{code} mkAnnFix :: x -> AnnFix1 x s ix -> AnnFix x s ix mkAnnFix x = HIn . (K x :*:) \end{code} When we developed the parser combinators, we produced |AnnFix1|s by taking one of |ExprF|'s constructors and giving it fully annotated |AnnFix| children. But producing an |AnnFix1| in the MultiRec version is not as easy: we cannot use the constructors of the original datatypes |Expr| and |Type| and give them annotated children, because the constructors' types only allow unannotated values as fields. Instead, to make an |AnnFix1| we have to use the pattern functor versions of the constructors. For example, given two annotated expressions of type |AnnFix Bounds Tuples Expr|, we can construct their sum using constructor |EAdd|'s pattern functor constructor |\x y -> L . Tag . L $ I x :*: I y :*: U|. These constructor functions are long and tedious to write, which makes it easy to make mistakes when writing them down. They have to be written for each possible constructor, and then they have to be used during parsing instead of the real constructors. It would be much nicer if we could find a way to avoid this. The way we will solve this problem in this thesis is by making use of the fact that the parsers we work with construct trees in post-order form: first the children are parsed and constructed (including position information), then the node itself. For example, when parsing the sentence |"2 + 3"|, first the |2| is parsed, then the |3| and only then their sum is constructed and returned. The parser maintains an explicit stack of fully annotated trees. The stack contains exactly those trees that will form the children of later nodes, until the very end when it will contain the final result of the parser. In the example above, the parser would first push |EIntLit 2|, then |EIntLit 3| and finally |EAdd (EIntLit 2) (EIntLit 3)|. Every time an unannotated node is pushed onto the stack, the parser supplies the annotation for the root of the node.
Because the node's children and their annotations will already be present on the stack, the parser has enough information at that moment to form a new, fully annotated tree. Let us call the action of pushing an element onto the stack |yield|. Then the construction of the annotated parse tree for |"2 + 3"| looks like this: \begin{code} do n2 <- yield Expr (Bounds { leftMargin = (0, 0), rightMargin = (1, 2) }) (EIntLit 2) n3 <- yield Expr (Bounds { leftMargin = (3, 4), rightMargin = (5, 5) }) (EIntLit 3) n5 <- yield Expr (Bounds { leftMargin = (0, 0), rightMargin = (5, 5) }) (EAdd n2 n3) return n5 \end{code} The first argument to |yield| is a witness of the |Tuples| GADT, indicating the type of the expression that is being pushed onto the stack. The second argument is the annotation that goes with the root of the tree, while the third and final argument is the tree itself. Note that there is some overlap in the information that is pushed onto the stack. First |2| and |3| are pushed individually, and then in the third |yield| they are given to the stack again as children of the |EAdd| node. If the integer literals are given twice, which instances does |yield| use? The answer is that |yield| only uses the root constructors of the elements given to it. The children of the |EAdd| node in the third call to |yield| are discarded, so yielding |EAdd undefined undefined| would have worked just as well. This once again shows that redundant information is bad, but it is the price we pay for being able to use the original datatype constructors rather than the pattern functor versions. Alternative solutions to the problem are discussed in chapter \ref{sec:future}. Let us take a closer look at the exact steps |yield| takes whenever it is called: \begin{enumerate} \item A new element is being pushed onto the stack. We have been given the accompanying witness of type |phi ix|, so we can do a shallow conversion to the pattern functor version of the constructor using |from| in type class |Fam|. The resulting value is of type |PF phi I0 ix|. \item Figure out how many children the constructor has. We can express this computation in terms of |hmapA|, traversing the constructor's pattern functor and counting the child positions. Let us call the number of children |n|. \item \label{step:pop} Pop |n| children from the stack. Every child is fully annotated (by induction; read on) and has type |AnnFix x phi xi|, for some index |xi|. \item \label{step:distribute} Distribute the |n| children over the pattern functor of the constructor being pushed. This results in a node which has fully annotated children but still lacks a top-level annotation, i.e.~a value of type |AnnFix1 x phi ix|. \item Using |mkAnnFix|, tie the annotation that was passed to |yield| to the |AnnFix1| value, producing a value of |AnnFix x phi ix|. \item Finally, push this new, fully annotated tree onto the stack. \end{enumerate} A consequence of pushing redundant information is that some of the steps outlined above might fail. Step \ref{step:pop} might find there are not enough (fewer than |n|) children on the stack if a previous parse operation neglected to |yield| its results. Step \ref{step:distribute} might find it is trying to replace unannotated children by annotated values of the wrong type. For example, |EAdd| expects two |Expr| children, but the previous yield statements might have pushed |Type|s onto the stack. Finally, at the end of the parsing, we expect exactly one tree in the stack: another thing that might not be the case. 
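As an aside, the child count needed in the second step above can be computed with the |hmapM| traversal defined earlier. The following is a minimal sketch; the name |countChildren| is ours, and the |children| helper used by |doYield| below can be implemented in the same style (it additionally collects the children themselves). \begin{code}
-- Count the recursive positions in a single layer of a pattern functor.
-- Uses execState and modify from Control.Monad.State.
countChildren :: HFunctor phi f => phi ix -> f r ix -> Int
countChildren p x = execState (hmapM (\_ r -> modify (+1) >> return r) p x) 0
\end{code}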
The good news is that the |yield|ing of values can be hidden in standard combinators such as |unit|, |chainl| and |chainr|. This means that correct use of |yield| only has to be verified for these combinators and that the final parser cannot do anything wrong if it is expressed in terms of these combinators. In fact, as we will see in a bit, the parser for the |Expr| part of the |Tuples| family is surprisingly similar to the parsers defined in section \ref{sec:annoparse}. Before we look at these parsing combinators, however, let us look at the code behind |yield|. We can separate the parsing from the |yield|ing by defining a new monad transformer for the |yield|ing process and then later stacking the parser transformer with the |yield| transformer. As is common when defining a new transformer, we capture the primitive operations in their own type class: \begin{code} class Monad m => MonadYield m where type YieldFam m :: * -> * type AnnType m :: * yield :: YieldFam m ix -> AnnType m -> ix -> m ix \end{code} Class |MonadYield| needs only one primitive operation, but it uses two associated types, expressed using a type synonym family. By putting these functions inside the class, we express the fact that instances of these type functions go hand in hand with instances of the class itself. What type do the stack elements have? The elements might be of different types within a family. In order to be able to put them in the same list, we can existentially quantify over the family indices. By storing them together with proof values of the family GADT, we can always recover the indices again later by pattern matching on the proof. We define a new datatype to this end and call it |AnyF|, because it can store an |f| indexed by any type in a family |phi|: \begin{code} data AnyF phi f where AnyF :: phi ix -> f ix -> AnyF phi f type AnyAnnFix x phi = AnyF phi (AnnFix x phi) \end{code} The first argument to the |AnyF| constructor is the proof; the second is an |f| indexed by the type indicated by the proof. In our case, we set |f| to be |AnnFix x phi| and call such an element |AnyAnnFix x phi|. Now we can write an instance of |MonadYield|: \begin{code} newtype YieldT x phi m a = YieldT (StateT [AnyAnnFix x phi] m a) deriving (Functor, Monad) instance MonadTrans (YieldT x phi) where lift = YieldT . lift instance (Monad m, HFunctor phi (PF phi), EqS phi, Fam phi) => MonadYield (YieldT x phi m) where type YieldFam (YieldT x phi m) = phi type AnnType (YieldT x phi m) = x yield = doYield \end{code} In |YieldT x phi m a|, |x| is the type of the annotation (such as |Bounds|), |phi| is the family GADT (such as |Tuples|), |m| is the underlying monad and |a| is the return value of the monad. We keep track of the stack by making |YieldT| a |newtype| wrapper around a |StateT| monad with a list of |AnyAnnFix|es as state. As any good monad transformer, a |MonadTrans| instance is supplied. The type functions |YieldFam| and |AnnType| point to the relevant type arguments. The definition of |doYield| is postponed; we will look at it in a moment. First, we need to define the |EqS| class constraint. When we distribute the annotated children over a pattern functor, we need to make sure that the children are of the right type. We will use hmapA to perform the distribution, and the transformation function passed to |hmapA|, which has type |forall xi. phi xi -> r xi -> a (r' xi)|, reflects the fact that the new children should have the right index. 
The transformation function is passed a witness of type |phi xi| and the old child of type |r xi|, and the resulting action should return a new child of type |r' xi|, i.e.~with the same index. How can we make sure that the element we removed from the stack has the right index? This is what |EqS| is for: %format ^^ = "\;" \begin{code} class EqS phi where eqS :: phi ix_1 -> phi ix_2 -> Maybe (ix_1 :=: ix_2) data (:=:) :: * -> * -> * ^^ where Refl :: ix :=: ix \end{code} Constructor |Refl| can only be constructed if the two type operands to |(:=:)| are equal. As a result, if two type variables |ix_1| and |ix_2| are in scope, pattern matching on the |Refl| constructor of type |ix_1 :=: ix_2| tells the Haskell compiler that these two type variables must be equal. This explains the return type of |eqS|: given two witnesses, if they turn out to be one and the same witness, we do not just want the result |True| (as is the case with the normal equality operator |(==)|), but we also want a proof that |ix_1 :=: ix_2|. We can use |eqS| when distributing the children: compare the witness argument to |hmapA|'s first argument to the witness that is stored in the |AnyAnnFix|. If they are equal, then we can convince the compiler that it is okay to return the annotated child. \begin{code} doYield :: (Monad m, HFunctor phi (PF phi), EqS phi, Fam phi) => phi ix -> x -> ix -> YieldT x phi m ix doYield p bounds x = YieldT $ do let pfx = from p x let n = length (children p pfx) stack <- get if length stack < n then error $ "structure mismatch: required " ++ show n ++ " accumulated children but found only " ++ show (length stack) else do let (cs, cs') = splitAt n stack let newChild = evalState (hmapM distribute p pfx) (reverse cs) put (AnyF p (HIn (K bounds :*: newChild)) : cs') return x distribute :: EqS phi => phi ix -> I0 ix -> State [AnyAnnFix x phi] (AnnFix x phi ix) distribute p_1 _ = do xs <- get case xs of [] -> error "structure mismatch: too few children" AnyF p_2 x : xs' -> case eqS p_1 p_2 of Nothing -> error "structure mismatch: incompatible child type" Just Refl -> do put xs'; return x \end{code} The code closely follows the steps outlined earlier. One thing to note is that because |doYield| uses |splitAt| to pop the children, the resulting list has to be reversed: the child that was pushed last appears at the front of the stack, while we want that child to be in the last position of the list we give to |distribute|. That |distribute| ignores the unannotated children and simply replaces them is made explicit by the underscore as its second argument. \section{Parsing expressions} In the previous section we alluded to hiding the calls to |yield| in parser combinators. Now that we have fully defined the |YieldT| monad transformer, we can build a monad stack that allows yielding while parsing and actually define these parser combinators. In section \ref{sec:annoparse} we defined a type synonym |P|; we will reuse |P| and extend it to define a type synonym |YP| that incorporates |YieldT|: \begin{code} type YP s phi m = P s (YieldT Bounds phi m) \end{code} Why should the monads be stacked in this way? Stacked like this, we can use the choice parser combinator |(<||>)| to define alternatives in the grammar. If |YieldT| were on the outside, we would only be able to use choice in the inner monad |P|, from where we cannot |yield| any values. On the other hand, now that |P| is on the outside we can express choice and still have access to |yield| from within the alternatives. 
We have to be careful that while evaluating the choices, no |yield| side effects take place. Parsec partially solves this by specifying that |p_1 <||> p_2| first tries |p_1| and does not consider |p_2| anymore if |p_1| has consumed any input. Since we only |yield| values after consuming some input, we are safe. However, Parsec also offers the |try| combinator which lifts this restriction and allows arbitrary backtracking. If |try| is used, we have no safety guarantees anymore. For that reason, we should be very careful when using |try|, or---better yet---not use it at all. Fortunately, we will not need it in the definition of our example |Tuples| parser. Let us take a look at the |unit| parser combinator we have seen in section \ref{sec:annoparse}, translated to the |YP| monad stack: \begin{code} unit :: (Fam phi, EqS phi, HFunctor phi (PF phi), Monad m) => phi a -> YP s phi m a -> YP s phi m a unit w p = do left <- getPos x <- p mkBounded w left x mkBounded :: (Fam phi, EqS phi, HFunctor phi (PF phi), Monad m) => phi a -> Range -> a -> YP s phi m a mkBounded w left x = do right <- getPos lift $ yield w (Bounds left right) x \end{code} Although the types are different and more involved, the definition of |unit| is identical to that of the original |unit|, with the exception of an extra witness parameter of type |phi a|. Only |mkBounded|'s definition has changed significantly: instead of returning the result of a call to |mkAnnFix|, it calls |yield|. The other two combinators, |chainl| and |chainr|, are also the same as before, again with the exception of the witness parameter: \begin{code} chainr :: (Fam phi, EqS phi, HFunctor phi (PF phi), Monad m, Show a) => phi a -> YP s phi m a -> YP s phi m (a -> a -> a) -> YP s phi m a chainr w px pf = fix $ \loop -> do left <- getPos x <- px option x $ do f <- pf y <- loop mkBounded w left (f x y) chainl :: (Fam phi, EqS phi, HFunctor phi (PF phi), Monad m, Show a) => phi a -> YP s phi m a -> YP s phi m (a -> a -> a) -> YP s phi m a chainl w px pf = do left <- getPos px >>= rest left where rest left = fix $ \loop x -> option x $ do f <- pf y <- px mkBounded w left (f x y) >>= loop \end{code} The actual parser for the |Tuples| family of datatypes is unsurprising, but we include it here for the sake of symmetry with the previous chapter. As before, we do not include the code for the lexer. The lexer token constructor names can be recognized by a prefix |T|.
\begin{code} type ExprParser = YP ExprToken Tuples Identity pExpr :: ExprParser Expr pExpr = do left <- getPos ex <- pAdd option ex $ do pToken TDoubleColon ty <- pType mkBounded Expr left (ETyped ex ty) pAdd :: ExprParser Expr pAdd = chainl Expr pMul (EAdd <$ pToken TPlus) pMul :: ExprParser Expr pMul = chainl Expr pFactor (EMul <$ pToken TStar) pFactor :: ExprParser Expr pFactor = pIntLit <|> pTupleVal pIntLit :: ExprParser Expr pIntLit = unit Expr $ (\(TIntLit n) -> EIntLit n) <$> satisfy isIntLit pTupleVal :: ExprParser Expr pTupleVal = pTuple Expr pExpr ETup pType :: ExprParser Type pType = pTyInt <|> pTyTuple pTyInt :: ExprParser Type pTyInt = unit Type $ TyInt <$ pToken TInt pTyTuple :: ExprParser Type pTyTuple = pTuple Type pType TyTup pTuple :: Tuples ix -> ExprParser ix -> (ix -> ix -> ix) -> ExprParser ix pTuple w pEl f = do left <- getPos pToken TLParen lhs <- pEl ty <- option lhs $ do pToken TComma rhs <- pEl mkBounded w left (f lhs rhs) pToken TRParen return ty \end{code} \section{Annotation-guided exploring} In contrast with the annotation-guided exploring in section \ref{sec:fixedpoint}, pattern functors give us enough information to define genuine zippers. An implementation of such a zipper is available in package \texttt{zipper} on Hackage\footnote{\url{http://hackage.haskell.org/package/zipper}}. We will discuss the API of the zipper, but not its implementation. For details on the implementation, the reader is referred to the MultiRec paper \cite{rodriguez2008mutrec}. The current version of the zipper (\texttt{0.3}) is intended to work on shallowly converted datatypes. In order to be able to use it for recursively annotated trees, a slight modification has been made to the library. The modified version is listed in Appendix \ref{app:zipperfix}. \begin{figure}[t] \begin{code} data FixZipper phi f ix enter :: Zipper phi f => phi ix -> HFix f ix -> FixZipper phi f ix leave :: Zipper phi f => FixZipper phi f ix -> HFix f ix type Nav = forall phi f ix. Zipper phi f => FixZipper phi f ix -> Maybe (FixZipper phi f ix) down :: Nav up :: Nav right :: Nav left :: Nav on :: (forall xi. phi xi -> HFix f xi -> a) -> FixZipper phi f ix -> a \end{code} \caption{The API of the MultiRec-based fixpoint zipper.} \label{fig:zipperapi} \end{figure} An overview of the API is listed in figure \ref{fig:zipperapi}. The names of the arguments of |FixZipper| are consistent with those in previous datatypes: |phi| is the family GADT, |f| is the pattern functor over which the zipper is computed and |ix| is the top-level index. Function |enter| takes a fixpoint and returns a zipper over that fixpoint, with the focus at the tree's root. Function |leave| leaves the zipper structure and returns the tree that was originally |enter|ed. Because a zipper value implicitly contains a witness |phi ix|, functions that \emph{accept} a zipper (such as |leave|) do not need an extra witness argument. The navigation functions |down|, |up|, |right| and |left| all have the same type, which---as before---has been called |Nav|. And just as before, navigation steps can be composed using |>=>|. Finally, |on| takes a function, applies it to the current focus and then returns the result.
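As a small example of composing navigation steps, the following |Nav| moves to a node's second child, returning |Nothing| if the focus does not have at least two children; the name |secondChild| is ours. \begin{code}
-- Move down to the first child, then one step to the right;
-- (>=>) comes from Control.Monad.
secondChild :: Nav
secondChild = down >=> right
\end{code}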
If we instantiate a |FixZipper| with |K x :*: PF phi| for |f|, it will hold selections of recursively annotated trees: \begin{code} type AnnZipper phi x = FixZipper phi (K x :*: PF phi) \end{code} Given a zipper over an annotated tree, we can extract the annotation of the current focus using |on|: \begin{code} focusAnn :: AnnZipper phi x ix -> x focusAnn = on (\_ (HIn (K x :*: _)) -> x) \end{code} The heart of the code in section \ref{sec:annoops} was the |explore| function. Most other functions were expressed in terms of |explore|. The function's generic sibling is very similar: \begin{code} explore :: Zipper phi (PF phi) => phi ix -> (x -> ExploreHints) -> (AnnFix x phi) ix -> [AnnZipper phi x ix] explore p hints = explore' hints . enter p explore' :: Zipper phi (PF phi) => (x -> ExploreHints) -> AnnZipper phi x ix -> [AnnZipper phi x ix] explore' hints root = [ z | (dirOk, zs) <- dirs, dirOk (hints x), z <- zs ] where x = focusAnn root dirs = [ (matchHere, [root]) , (exploreDown, exploreMore (down root)) , (exploreRight, exploreMore (right root)) ] exploreMore = maybe [] (explore' hints) \end{code} Its type is more complicated, and it receives a witness. The extraction of the annotation is done using |focusAnn| rather than direct pattern matching. The other functions are also very similar. Again, the type changes and some witnesses are passed, but their essences are the same. \begin{code} findLeftmostDeepest :: (Zipper phi (PF phi)) => phi ix -> (x -> Bool) -> AnnFix x phi ix -> Maybe (AnnZipper phi x ix) findLeftmostDeepest p down = listToMaybe . reverse . explore p hints where hints x | down x = ExploreHints True True False | otherwise = ExploreHints False False True selectByRange :: Zipper phi (PF phi) => phi ix -> Range -> AnnFix Bounds phi ix -> Maybe (AnnZipper phi Bounds ix) selectByRange p range@(left, _) = listToMaybe . reverse . explore p hints where hints bounds@(Bounds _ (ir, _)) = ExploreHints { matchHere = range `rangeInBounds` bounds , exploreDown = range `rangeInRange` outerRange bounds , exploreRight = left >= ir } selectByPos :: (Zipper phi (PF phi)) => phi ix -> Int -> AnnFix Bounds phi ix -> Maybe (AnnZipper phi Bounds ix) selectByPos p pos = findLeftmostDeepest p (posInRange pos . innerRange) repairBy :: (Ord dist, HFunctor phi (PF phi)) => phi ix -> (Range -> Range -> dist) -> AnnFix Bounds phi ix -> Range -> Bounds repairBy p cost tree range = head (sortOn (cost range . innerRange) (allAnnotations p tree)) repair :: HFunctor phi (PF phi) => phi ix -> AnnFix Bounds phi ix -> Range -> Bounds repair p = repairBy p distRange moveSelection :: Zipper phi (PF phi) => phi ix -> AnnFix Bounds phi ix -> Nav -> Range -> Maybe Bounds moveSelection p tree nav range = focusAnn <$> (selectByRange p range tree >>= nav) \end{code}
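Finally, here is a sketch of how these pieces fit together: parse a string into a recursively annotated tree and look up the annotation of the node selected by a cursor position. It assumes the |readExpr| parser used earlier and a |Zipper Tuples (PF Tuples)| instance; the name |selectDemo| is only illustrative. \begin{code}
-- Parse the input and return the Bounds of the leftmost, deepest node
-- whose inner range contains the given position, if any.
selectDemo :: String -> Int -> Maybe Bounds
selectDemo input pos = focusAnn <$> selectByPos Expr pos (readExpr input)
\end{code}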