\documentclass[]{beamer} \usetheme{uucs} \usepackage[english]{babel} \usepackage{color, autofe, ucs, mathtools, amsmath, textcomp} \usepackage[utf8x]{inputenc} %include polycode.fmt \newcommand{\question}[1]{\begin{itemize}\item[Question:]{#1}\end{itemize}} \setbeamercovered{transparent} \title{Source code annotation, 2010} \date{November 12th} \author{Thijs Alkemade (3285960) \& Alessandro Vermeulen (3242919)} \begin{document} \frontmatter \begin{frame} \titlepage \end{frame} \mainmatter \begin{frame}{Outline} \tableofcontents \end{frame} \section{Research question} \begin{frame}{Origin} It is convenient to store the origins of expressions: \begin{itemize} \item Pretty printing \item Error reporting \item Text editor: mapping edit field $\iff$ source tree \end{itemize} We do not want to clutter our expression datatype with this information. The solution is \emph{annotation} of our original expression datatype. \end{frame} \begin{frame}{Annotated AST} \begin{spec} data ExprF rT = Add rT rT | Sub rT rT | Mul rT rT | Div rT rT | Num Int deriving (Eq, Show) \end{spec} \end{frame} \begin{frame}{Example parse} \begin{spec} *ExprParser> parseExpr "1+ 5" Right (In {out = Ann (Bounds { leftMargin = (0,0) , rightMargin = (4,4)}) (Add (In {out = Ann (Bounds { leftMargin = (0,0) , rightMargin = (1,1)}) (Num 1)}) (In {out = Ann (Bounds { leftMargin = (2,3) , rightMargin = (4,4)}) (Num 5)}) )} ) \end{spec} \end{frame} \begin{frame}{Our problem, \emph{The Editor}} As you can imagine, we might want to transform our expression tree. There are several reasons why: \begin{itemize} \item Code insertion \item Code removal \item Refactoring \end{itemize} \end{frame} \begin{frame}{The problem} \begin{itemize} \item[Problem] Operations on the source lead to changes in the tree; \pause \item[Result] We now have \emph{invalid} locations; \item[Solution] Update the locations; \item[Problem] Applying a lot of transformations means having a lot of overhead. \end{itemize} \end{frame} \section{Research contribution} \begin{frame}{Our solution} \begin{itemize} \item Our |Bounds| now contain the piece of code that corresponds to that piece of syntax. \item This was quite easy to add by comparing how much more was parsed after every step (instead of asking parsec for the parser's position). \end{itemize} \end{frame} \begin{frame}{Example} \begin{spec} *ExprParser> parseExpr "1+ 5" Right (In {out = Ann (Bounds { source = [ TNum 1 , TPlus , TNum 5] , previous = []}) (Add (In {out = Ann (Bounds { source = [TNum 1] , previous = []}) (Num 1)}) (In {out = Ann (Bounds { source = [TNum 5] , previous = [ TPlus , TNum 1]}) (Num 5)}))}) \end{spec} \end{frame} \begin{frame}{Problems with our solution} \begin{itemize} \item Updating the helper functions is not always directly possible. \item We need to calculate the margins from the code we have in the new tree. \item Our idea: store a list of the previous tokens. The left margin is now the sum of the size of all previous tokens. \pause \item But this has exactly the same problem as before: an update will require updates in all other parts of the tree! \end{itemize} \end{frame} \begin{frame}{Future work} As you have seen, our solution is far from perfect. Therefore we propose four possible solutions. \end{frame} \section{Future work} \begin{frame}{1. Store only the parent} The easiest one: \begin{block}{Store the left sibling} Instead of storing all previous tokens, store the previous annotation. Walking back to the start is still possible, so the left margin can be calculated, but updates only need to be done locally. \end{block} \end{frame} \begin{frame}{2. Source only} \begin{block}{Keep only the source} Store only the tokens in the annotation. When needed calculate the ranges from the root down as you have enough information when you start at the root to calculate them. \end{block} \begin{block}{Problem} There are several helper functions that locally need the ranges information. How do you change them? And how can you do that without a too big a performance hit? \end{block} \end{frame} \begin{frame}{3. Two separate annotations} \begin{block}{Use two separate annotations} The idea is to have two annotations. One including the ranges that is convenient for the editor, as it is in the current situation. And an annotation where only the source is being maintained. One should then define the source transformations on the source annotated tree and first convert the positional tree, run the operations and convert back. \end{block} \end{frame} \begin{frame}{4. Continuations} \begin{block}{4. Continuations} Define transformations of the tree in terms of continuations. This might result in less overhead as only necessary values are updated. With the possibility of fusing several updates together. Resulting in less traversals over the tree. \end{block} \end{frame} \section{Conclusion} \begin{frame}{Conclusion} There is still a lot of work to be done but there are interesting ideas. \end{frame} \end{document}