Tree Manipulation and Code Refactoring

Contents

Introduction
Modifying Nodes
Creating Nodes
Moving Nodes
Modifying Hidden Tokens
Example: Adding Nodes
Example: Deleting Nodes
Example: Moving Nodes

Introduction

The following API functions allow you to modify source code automatically using Proparse: parserHiddenAddToFront, parserHiddenDelete, parserHiddenInsertAfter, parserHiddenSetText, parserHiddenSetType, parserNodeCreate, parserNodeCreateI, parserSetNodeFirstChild, parserSetNodeNextSibling, parserSetNodeText, parserSetNodeType, and parserSetNodeTypeI.

The general steps for modifying programs are:

  1. parse the program (compile unit)
  2. use the above API functions to manipulate your program as needed, inside the syntax tree
  3. write out the modified tree to a new program file
  4. review to confirm desired result

Modifying Nodes

The API functions parserSetNodeText, parserSetNodeType and parserSetNodeTypeI modify individual nodes only.

parserSetNodeText may be used, for example, to beautify your code by making all keywords upper case.

parserSetNodeType and parserSetNodeTypeI are useful for changing keywords.

To use the parserSetNodeType function, a valid node type must be supplied. Look up the node types in the tree specification and use these as the second argument to the function.

To use the parserSetNodeTypeI function, a valid node type does not need to be supplied. This is useful for creating synthetic nodes with user-defined node types during refactoring. (See also parserNodeCreateI). Integer node types less than 10,000 are reserved for use by Joanju, and integer node types greater than or equal to 10,000 may be used for user-defined node types.
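
The split between reserved and user-defined integer node types, and the idea of upper-casing keywords by rewriting node text, can be illustrated with a small stand-alone Python model. This is a sketch only and not the Proparse API: the Node class, the USER_TYPE_BASE constant, the helper functions, and the integer types used for DEFINE, VARIABLE, and ID are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Assumption: mirrors the rule stated above (types below 10,000 are reserved).
USER_TYPE_BASE = 10_000


@dataclass
class Node:
    """Toy stand-in for a syntax-tree node; not a Proparse handle."""
    node_type: int
    text: str


def is_user_defined(node_type: int) -> bool:
    """True when the integer type falls in the user-defined range."""
    return node_type >= USER_TYPE_BASE


def uppercase_keywords(nodes, keyword_types):
    """Upper-case the text of every node whose type is a keyword type."""
    for node in nodes:
        if node.node_type in keyword_types:
            node.text = node.text.upper()


# Hypothetical integer types, invented only for this example.
DEFINE, VARIABLE, ID = 1, 2, 3
nodes = [Node(DEFINE, "define"), Node(VARIABLE, "variable"), Node(ID, "cust")]

uppercase_keywords(nodes, {DEFINE, VARIABLE})
print(" ".join(n.text for n in nodes))
```

In the real API the equivalent text change would go through parserSetNodeText on each keyword node; the model only shows the shape of the pass.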

Creating Nodes

The API functions parserNodeCreate and parserNodeCreateI create new nodes, without attaching these new nodes to the tree.

There are several important points to keep in mind when creating new nodes for your tree:

  1. To use the parserNodeCreate function, a valid node type must be supplied. Look up the node types in the tree specification and use these as the second argument to the function.

    To use the parserNodeCreateI function, a valid node type does not need to be supplied. This is useful for creating synthetic nodes with user-defined node types during refactoring. Integer node types less than 10,000 are reserved for use by Joanju, and integer node types greater than or equal to 10,000 may be used for user-defined node types.

    The text supplied as the third argument to these functions should be the text as you wish it to appear in your finished program. (e.g. not abbreviated, all upper case, etc.)

  2. After using one of these functions to create a node, use parserSetNodeFirstChild or parserSetNodeNextSibling (see the Moving Nodes section for important details) to attach the new node in an appropriate tree location and reattach any displaced nodes.
  3. It is best to try to maintain the existing tree structure when adding new nodes. For example, in the statement DEFINE VARIABLE cust LIKE customer.name NO-UNDO., the NO-UNDO is in the syntax tree as a sibling of the LIKE and PERIOD nodes (customer.name is the first child of the LIKE node).

    The structure of the syntax tree is shown in the tree specification, and also the Tokenlister tool on the Proparse Launcher (proparse/launcher.w) will give a (possibly incomplete) indication of the tree structure.

    When adding a NO-UNDO to a DEFINE VARIABLE statement that doesn't have one (see Example: Adding Nodes), the tree spec shows that the new NO-UNDO node should be inserted into the tree as the next sibling of the LIKE node, i.e. using parserSetNodeNextSibling rather than parserSetNodeFirstChild.

    It is acceptable not to maintain the exact tree structure for added nodes if no later pass through the modified tree relies on a particular structure. If only one pass is made through the syntax tree and the code is written out immediately after that modification pass, a not-quite-perfect tree structure won't cause a problem. When the written-out file is parsed again later, it is loaded correctly into a new tree.

Moving Nodes

The API functions parserSetNodeFirstChild and parserSetNodeNextSibling move nodes around the tree.

There are several important points to keep in mind when moving nodes:

  1. When a node is moved, all of its siblings, its children, and the hidden nodes before these nodes follow it. (Hidden nodes are only ever attached to the front of a node, i.e. before the node.) A single move may therefore relocate hundreds or thousands of nodes. This has no performance cost, because the "move" is simply a change in where the first node is attached to the tree.
  2. When a node (along with its siblings, children, and hidden nodes) is displaced by a move, it is no longer linked into the tree. Keep a handle to the displaced node and attach it back into the tree at a new location; otherwise these nodes will be missing when the tree is written out. A node is displaced when a different node is placed into its position in the tree. For example, if node B is the next sibling of node A and node C is then made the next sibling of node A, node B is displaced.
  3. Because a node's siblings and children follow along during moves, it is very important to move nodes only "up" the tree, that is, to an earlier point in the program. If a node is moved "down" the tree (to a later point in the program), it is easy to cause an infinite loop: the node may end up among its own siblings, which are themselves being moved because they follow the node.

These points also apply when creating new nodes with the parserNodeCreate and parserNodeCreateI functions, because a newly created node must be placed into the existing tree.
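
The displacement behaviour described in these points can be sketched with a toy first-child/next-sibling tree in Python. This models the semantics only and is not the Proparse API: the Node class and the set_first_child, set_next_sibling, and flatten helpers are hypothetical, and returning the displaced node is a convenience of the model (the text above says to save the handle yourself before the move).

```python
class Node:
    """Toy node with the two links the text describes; not a Proparse handle."""

    def __init__(self, text):
        self.text = text
        self.first_child = None
        self.next_sibling = None


def set_first_child(parent, child):
    """Attach child as parent's first child and return the displaced node."""
    displaced = parent.first_child
    parent.first_child = child
    return displaced


def set_next_sibling(node, sibling):
    """Attach sibling after node and return the displaced node."""
    displaced = node.next_sibling
    node.next_sibling = sibling
    return displaced


def flatten(node):
    """Texts of node, its children, then its following siblings, in order."""
    out = []
    while node is not None:
        out.append(node.text)
        out.extend(flatten(node.first_child))
        node = node.next_sibling
    return out


root, a, b, b1, c, x = (Node(t) for t in ["root", "A", "B", "b1", "C", "X"])
set_first_child(root, a)
set_next_sibling(a, b)
set_first_child(b, b1)
set_next_sibling(b, c)

displaced = set_next_sibling(a, x)   # B, with its child b1 and sibling C, drops out
without_reattach = flatten(root)     # B, b1 and C are missing at this point
set_next_sibling(x, displaced)       # re-link the displaced chain after X
with_reattach = flatten(root)

print(without_reattach)
print(with_reattach)
```

Only one link changes per call, which is why a move that carries a long chain of siblings and children costs the same as moving a single node.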

Modifying Hidden Tokens

The API functions parserHiddenAddToFront, parserHiddenDelete, parserHiddenInsertAfter, parserHiddenSetText, and parserHiddenSetType modify the hidden tokens in the tree.

parserHiddenAddToFront creates a new hidden token and places it as the first hidden for the current real node. It doesn't require that any other hiddens already exist for this node. The parserHiddenInsertAfter function works similarly, but requires an existing hidden token to "insert after".

The parserHiddenDelete function deletes one hidden token at a time, and can be used to delete some or all of the hiddens for a node.

The parserHiddenSetType and parserHiddenSetText functions alter the current hidden token, and may be used in place of deleting a hidden token then creating a different hidden token to replace it.

These hidden token manipulation functions may be used in conjunction with the existing hidden functions (such as parserHiddenGetFirst or parserHiddenGetNext) to update comments, adjust program indenting and formatting, and set appropriate comments and whitespace when doing code transformations.

See also Hidden Tokens in the User's Guide.
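
As a rough illustration of how hidden tokens sit in front of a real node and affect the written output, here is a toy Python model. It is not the Proparse API: the Token class and helper functions are hypothetical, hiddens are addressed by list index rather than through a "current hidden" cursor, and the "COMMENT" type name is an assumption ("WS" is the type used in the examples in this document).

```python
class Token:
    """Toy real node carrying the hidden tokens that precede it."""

    def __init__(self, text):
        self.text = text
        self.hidden = []  # (type, text) pairs written out before self.text


def hidden_add_to_front(token, hidden_type, text):
    """Make a new hidden the first hidden of the token."""
    token.hidden.insert(0, (hidden_type, text))


def hidden_insert_after(token, index, hidden_type, text):
    """Insert a new hidden after the existing hidden at the given index."""
    token.hidden.insert(index + 1, (hidden_type, text))


def hidden_delete(token, index):
    """Delete one hidden at a time."""
    del token.hidden[index]


def write(tokens):
    """Write each token's hiddens, then its own text."""
    return "".join(
        "".join(text for _, text in tok.hidden) + tok.text for tok in tokens
    )


display, x, period = Token("DISPLAY"), Token("x"), Token(".")
x.hidden.append(("WS", " "))
statement = [display, x, period]
original = write(statement)

hidden_add_to_front(display, "COMMENT", "/* show x */")
hidden_insert_after(display, 0, "WS", "\n")
commented = write(statement)

hidden_delete(display, 0)  # removes the comment
hidden_delete(display, 0)  # removes the newline that followed it
restored = write(statement)

print(commented)
```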

Example: Adding Nodes

Given this statement:

    DEFINE VARIABLE cust LIKE customer.name.

you can automatically add a NO-UNDO.

First, walk through the children of the DEFINE node and store the handles of the appropriate nodes before and after the point where the new node will be inserted. In this case, store the handles of the two sibling nodes, LIKE and PERIOD (.). Also, create a new handle to be used for the new node.

Next, use parserNodeCreate to create the new NO-UNDO node.

Lastly, insert the new node in the appropriate location. In this case, use parserSetNodeNextSibling to attach the new node as the next sibling of the LIKE node and to reattach the displaced PERIOD node as the next sibling of the new NO-UNDO node.

After storing the node handles, the rest of the steps look like this:

    parserNodeCreate(newNode, "NOUNDO", " NO-UNDO").
    parserSetNodeNextSibling(likeNode, newNode).
    parserSetNodeNextSibling(newNode, periodNode).

When the tree is eventually written out, this line will now read:

    DEFINE VARIABLE cust LIKE customer.name NO-UNDO.

As an alternative to the leading blank space in " NO-UNDO" in the above parserNodeCreate statement (used to position appropriate white space before the new text), the hidden token function parserHiddenAddToFront could have been used as follows:

    parserNodeCreate(newNode, "NOUNDO", "NO-UNDO").  /* no space */
    parserSetNodeNextSibling(likeNode, newNode).
    parserSetNodeNextSibling(newNode, periodNode).
    parserHiddenAddToFront(newNode, "WS", " ").
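
The walk-and-insert sequence described in this example can be traced end to end in a toy Python model. This is a sketch of the logic, not the Proparse API: the Node class, the chain and write helpers, and the exact child layout under DEFINE are assumptions made for illustration (the "ID" and "Field_ref" type names in particular are hypothetical), and whitespace is kept inside each node's text instead of in hidden tokens.

```python
class Node:
    """Toy syntax-tree node; text includes any leading whitespace."""

    def __init__(self, node_type, text):
        self.node_type = node_type
        self.text = text
        self.first_child = None
        self.next_sibling = None


def chain(*nodes):
    """Link nodes left to right as siblings and return the first one."""
    for left, right in zip(nodes, nodes[1:]):
        left.next_sibling = right
    return nodes[0]


def write(node):
    """Write a node, then its children, then its following siblings."""
    out = ""
    while node is not None:
        out += node.text + write(node.first_child)
        node = node.next_sibling
    return out


# Assumed layout: LIKE and PERIOD are siblings, customer.name is LIKE's child.
define = Node("DEFINE", "DEFINE")
like = Node("LIKE", " LIKE")
like.first_child = Node("Field_ref", " customer.name")
define.first_child = chain(
    Node("VARIABLE", " VARIABLE"), Node("ID", " cust"), like, Node("PERIOD", ".")
)
before = write(define)

# Step 1: walk the children of DEFINE and store the two neighbouring nodes.
like_node = period_node = None
child = define.first_child
while child is not None:
    if child.node_type == "LIKE":
        like_node = child
    elif child.node_type == "PERIOD":
        period_node = child
    child = child.next_sibling

# Step 2: create the new node. Step 3: attach it and reattach PERIOD.
new_node = Node("NOUNDO", " NO-UNDO")
like_node.next_sibling = new_node
new_node.next_sibling = period_node
after = write(define)

print(before)
print(after)
```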

Example: Deleting Nodes

Given this statement:

    DEFINE SHARED VARIABLE sCustomer LIKE customer.name NO-UNDO.

it is simple to remove the SHARED. First, store the handle to the DEFINE node. Then, store the handle to the VARIABLE node. Lastly, use parserSetNodeFirstChild to make the VARIABLE the first child of the DEFINE.

This will bring along all of the siblings of VARIABLE, such as sCustomer and LIKE. It will also displace the SHARED node. If the SHARED is not reattached to the tree (and it won't be, since we didn't save its handle), it will be omitted when the tree is written out.

Depending on the tree structure for the statement you are modifying, it may be necessary to use parserSetNodeNextSibling instead of parserSetNodeFirstChild.
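
A toy Python model shows why re-pointing a single link is enough to drop the SHARED. This is a sketch only and not the Proparse API: the Node class and write helper are hypothetical, and the layout (SHARED as the first child of DEFINE, followed by its siblings) is an assumption consistent with the description above.

```python
class Node:
    """Toy syntax-tree node; text includes any leading whitespace."""

    def __init__(self, text):
        self.text = text
        self.first_child = None
        self.next_sibling = None


def write(node):
    """Write a node, then its children, then its following siblings."""
    out = ""
    while node is not None:
        out += node.text + write(node.first_child)
        node = node.next_sibling
    return out


define = Node("DEFINE")
shared = Node(" SHARED")
variable = Node(" VARIABLE")
name = Node(" sCustomer")
like = Node(" LIKE")
like.first_child = Node(" customer.name")
no_undo = Node(" NO-UNDO")
period = Node(".")

# Assumed original layout: DEFINE's children are linked as a sibling chain.
define.first_child = shared
siblings = [shared, variable, name, like, no_undo, period]
for left, right in zip(siblings, siblings[1:]):
    left.next_sibling = right
before = write(define)

# Make VARIABLE the first child of DEFINE. Its siblings follow it, and
# SHARED is displaced; nothing reattaches it, so it is not written out.
define.first_child = variable
after = write(define)

print(before)
print(after)
```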

Example: Moving Nodes

The program:

    def   var A as char init "A" no-undo. 
    defi  var B as char init "B" no-undo. 
    defin var C as char init "C" no-undo. 

    display A B C. 

Suppose you want to move the third define, defin var C, to be the first of the three define statements, pushing the other two defines down. The order in which the steps are performed is very important. The four steps are:

  1. Save the handles of the nodes to be moved, the nodes being displaced, and the nodes representing the new attachment points. In this case, save handles to the defin of defin var C (being moved), to the def of def var A (being displaced), and to the display of display A B C. (also going to be displaced). We also need the handle of the defi of defi var B, because the display will be attached after it. (In this example, we actually need handles to all of the statements.)

  2. Use parserSetNodeFirstChild to move the defin (and its children and siblings) up to be the first child of the Program_root node (which is found by using parserNodeTop). This will bring along the sibling node display and its children, and will detach/displace the def node (the former first child of the Program_root), def's children, its sibling defi and defi's children. At this point, the tree looks like:

        Program_root
            defin var C...
            display...
            Program_tail

    and def and its sibling defi are loose nodes.

  3. Next, use parserSetNodeNextSibling to move the def node to be the next sibling of the defin node, which reattaches the def node back into the tree. This will bring along its children, sibling node defi and defi's children, and will detach the display node and its children. At this point, the tree looks like:

        Program_root
            defin var C...
            def var A...
            defi var B...
            Program_tail

    and display is a loose node.

  4. Lastly, use parserSetNodeNextSibling to move the display node to be the next sibling of the defi node, which reattaches the display node back into the tree. This will bring along display's children. At this point, the tree looks like:

        Program_root
            defin var C...
            def var A...
            defi var B...
            display...
            Program_tail

    and there are no more loose nodes, so we're finished.

When written out, the revised program looks like:

    defin var C as char init "C" no-undo. 
    def   var A as char init "A" no-undo. 
    defi  var B as char init "B" no-undo. 

    display A B C. 

The above may sound complicated, but after the nodes are saved, steps 2-4 just look like this:

    parserSetNodeFirstChild(topNode,oldChild3).     /* defin C becomes first define */
    parserSetNodeNextSibling(oldChild3,oldChild1).  /* def A becomes second define  */
    parserSetNodeNextSibling(oldChild2,oldChild4).  /* display follows defi B       */
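
The same three moves can be replayed in a toy Python model to check the final statement order. This is a sketch and not the Proparse API: the Node class and helpers are hypothetical, each statement is collapsed into a single node whose text is the whole line, and links are plain pointers. In this plain-pointer model defi still points at defin after the second move, so the sibling chain is only safe to traverse once the third move has run; whether Proparse behaves identically in that intermediate state is not stated in the text above.

```python
class Node:
    """Toy statement node; text holds the whole source line."""

    def __init__(self, text):
        self.text = text
        self.first_child = None
        self.next_sibling = None


def set_first_child(parent, child):
    parent.first_child = child


def set_next_sibling(node, sibling):
    node.next_sibling = sibling


def write_children(parent):
    """Concatenate the text of each child of parent, in sibling order."""
    out = ""
    node = parent.first_child
    while node is not None:
        out += node.text
        node = node.next_sibling
    return out


top_node = Node("")  # stands in for Program_root
old_child1 = Node('def   var A as char init "A" no-undo.\n')
old_child2 = Node('defi  var B as char init "B" no-undo.\n')
old_child3 = Node('defin var C as char init "C" no-undo.\n')
old_child4 = Node("\ndisplay A B C.\n")
tail = Node("")      # stands in for Program_tail

# Original order: def, defi, defin, display, Program_tail.
set_first_child(top_node, old_child1)
statements = [old_child1, old_child2, old_child3, old_child4, tail]
for left, right in zip(statements, statements[1:]):
    set_next_sibling(left, right)
before = write_children(top_node)

set_first_child(top_node, old_child3)     # defin C becomes first define
set_next_sibling(old_child3, old_child1)  # def A becomes second define
set_next_sibling(old_child2, old_child4)  # display follows defi B
after = write_children(top_node)

print(after)
```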