Also have a look at examples/query_length.p which was described in Starting With Something Practical. It is a practical example of using Proparse's named queries to find occurrences of something specific in your source code.
However, if you leave the DLL in memory, you will find that no matter how many times you make function calls into it, its memory usage will remain constant.
The sample programs load, but never release the DLL.
Also, remember to APPLY "CLOSE" to the handle that you use for proparse.p, so that it has a chance to clean up MEMPTRs etc. Don't just DELETE it.
Run proparse/api/proparse.p persistently, store its handle in a named procedure handle, and then pass the name of that handle to proparse/api/proparse.i. We will refer to that procedure handle as the parser handle.
Use APPLY "CLOSE" to close the persistent parser procedure when you are
done with it.
Proparse.i provides the forward declarations necessary to use the function calls defined in proparse.p. We provide a 4GL function definition for each of the DLL functions, which would normally have to be called with a RUN statement. We do this because the function definitions can make for some slightly more elegant code.
The functions are fairly consistent in their names and parameters. All parser function names begin with "parser", to help prevent name clashes with other function libraries. Many have input parameters like:
ofHandle - which node are we parserDoingSomething to? For example, ofHandle is the parent node whose first child we might want to get with parserNodeFirstChild.
intoHandle - if the function returns a node, it will put a reference to it into this handle. For example, intoHandle is the handle where we want to put a reference to the child found by parserNodeFirstChild.
There are various configuration settings which are used for configuring Proparse's behavior. Some of these settings are simply configuration flags, like whether or not the parser has been initialized.
Some of these settings might be important in order for Proparse
to be able to properly parse your source code.
Specifically, some of these configuration settings determine
how Proparse interprets functions within &IF
preprocessor conditions, such as OPSYS, PROPATH, and KEYWORD-ALL.
Many of these settings are configured automatically by proparse.p,
which does its job by looking directly at your Progress environment settings
and then configuring Proparse accordingly.
Some of these settings, though, need to be configured manually.
The configuration for the preprocessor KEYWORD-ALL function is an example
of one that must be configured manually.
The functions parserConfigGet and parserConfigSet are used for getting and setting Proparse's configurations.
See Configuration Settings Reference for details about the various settings.
The following small program tells Proparse that "aliasname" is an alias for the database "dbname". Because the Proparse DLL is not unloaded after each use, the alias remains in effect (within Proparse) for the duration of your Progress session.
    DEFINE VARIABLE parserHandle AS HANDLE NO-UNDO.
    RUN proparse/api/proparse.p PERSISTENT SET parserHandle.
    {proparse/api/proparse.i parserHandle}
    parserSchemaAliasCreate("aliasname", "dbname").
    APPLY "CLOSE" TO parserHandle.

Also see: parserSchemaAliasCreate and parserSchemaAliasDelete.
Once you have created your node handle, you can use it as the INPUT parameter to other functions which require a node handle. One such function is parserNodeTop, which stores a handle to the topmost node in the node handle that you provide. The topmost node is always a special node of type Program_root.
There are functions for getting attribute values from your node. Those functions take your node handle as the single INPUT parameter, and then return the attributes. Examples of functions for getting node attributes are parserGetNodeType and parserGetNodeLine.
There are functions for getting other nodes, such as parserNodeFirstChild, parserNodeParent, parserNodeNextSibling, and parserNodePrevSibling. Those functions require two INPUT parameters. The first is the "where from" node handle. The second is the "store a pointer to the resulting node into this node handle" node handle. You can use the same node handle for both parameters, with the effect of changing your node handle to point from one node to the next. The function parserNodeStateHead is similar, but it finds the head node of the enclosing statement or block.
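The two-handle pattern described above can be sketched as a loop over a node's direct children. This is only a sketch under stated assumptions: it assumes proparse.i has already been included as shown earlier, that parentNode and childNode are INTEGER node handles you have already created, and that parserNodeFirstChild and parserNodeNextSibling return the node type as a character string, or an empty string when there is no such node. Check those return values against the function reference before relying on them. The procedure name listChildren is hypothetical.

```abl
/* Hypothetical helper: shows the type and line of each direct child of parentNode. */
PROCEDURE listChildren:
    DEFINE INPUT PARAMETER parentNode AS INTEGER NO-UNDO.
    DEFINE INPUT PARAMETER childNode  AS INTEGER NO-UNDO.

    DEFINE VARIABLE nodeType AS CHARACTER NO-UNDO.

    /* ofHandle = parentNode, intoHandle = childNode. */
    nodeType = parserNodeFirstChild(parentNode, childNode).

    /* Assumption: an empty string means that no node was found. */
    DO WHILE nodeType <> "":
        MESSAGE parserGetNodeType(childNode) parserGetNodeLine(childNode).
        /* Same handle for both parameters: childNode moves to its next sibling. */
        nodeType = parserNodeNextSibling(childNode, childNode).
    END.
END PROCEDURE.
```

Passing childNode as both parameters keeps the loop down to a single handle, at the cost of losing the reference to the previous sibling on each step.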
Use the parserReleaseHandle function to release your node handle.
By releasing a node handle, all you are doing is telling the DLL
that it can now re-use that node pointer. If, for example, you release
node number 12, it is possible that the next time you use the function
parserGetNodeHandle, you will be given the number 12 again.
There is no practical limit to the number of node handles that you can keep.
All nodes and node handles are cleared away each time you parse
a new program (i.e., each time you use the parserParse function).
Proparse.p, as provided, does not display error messages
or do any sort of error handling.
That is for the sake of keeping it small and efficient,
and also because different uses of the API
will demand different kinds of error handling.
For the most part, it should be sufficient to check for error conditions in two places in your parser-based programs:
- Check the return value of parserParse. It returns FALSE if there was an error during the parse.
- Check parserErrorGetStatus before the next call to parserParse, because parserParse clears out everything from the previous parse, including any old errors.
    IF NOT parserParse(filename) THEN DO:
        MESSAGE parserErrorGetText() VIEW-AS ALERT-BOX ERROR.
        RETURN.
    END.
    /* ... and ... */
    IF parserErrorGetStatus() < 0 THEN DO:
        MESSAGE parserErrorGetText() VIEW-AS ALERT-BOX ERROR.
        RETURN.
    END.

The function parserErrorGetStatus returns -1 if a warning exists, and -2 if an error exists. Use parserErrorGetText to retrieve the error or warning text. Note that the error status remains in effect, and the error (or warning) text is available, until parserErrorClear is called, or until parserParse is called again.
If you have looked at proparse.p, then you may have
noticed that, while the 4GL functions often return LOGICAL, many of the
DLL calls are actually returning an integer.
For the DLL functions, where the return value does not have a conflicting meaning:
However, the parser*() functions defined in proparse.p just return a LOGICAL to keep things simple.
The program examples/query_length.p is a straightforward example which uses queries. You work with Proparse queries a little like the way that you work with database queries. First you create a query, and then you work with the result set. In the case of Proparse, your query finds nodes of the node type that you specify. Normally, your query would start at the topmost node (i.e. the node found with parserNodeTop), but sometimes your queries may start at other nodes. Perhaps your new query will start at a node which was found via a previous query.
The return value of the function parserQueryCreate is an integer.
It is the number of nodes in the query result set. This is the key to working
with the result set. You store the result of parserQueryCreate
in an INTEGER variable; we'll use a variable named numResults
for discussion purposes here. Once you have the number of results,
you can simply loop for, say, yourCounter = 1 TO numResults.
You pass yourCounter to parserQueryGetResult to fetch
your results one at a time.
parserQueryGetResult also requires a valid node handle as a parameter,
and after parserQueryGetResult has been called, you use that node
handle to reference the node which was found as part of the query result set.
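The loop described above can be sketched as follows. Treat it as a sketch rather than a reference: the file name, the query name "definesQuery", and the choice of DEFINE as the node type are hypothetical, and the parameter orders shown for parserQueryCreate and parserQueryGetResult are assumptions. The handle-creation call is written as parserGetHandle, which is also an assumption; use whichever name your copy of proparse.i declares.

```abl
DEFINE VARIABLE parserHandle AS HANDLE  NO-UNDO.
DEFINE VARIABLE topNode      AS INTEGER NO-UNDO.
DEFINE VARIABLE resultNode   AS INTEGER NO-UNDO.
DEFINE VARIABLE numResults   AS INTEGER NO-UNDO.
DEFINE VARIABLE yourCounter  AS INTEGER NO-UNDO.

RUN proparse/api/proparse.p PERSISTENT SET parserHandle.
{proparse/api/proparse.i parserHandle}

IF NOT parserParse("myprogram.p") THEN DO:   /* hypothetical file name */
    MESSAGE parserErrorGetText() VIEW-AS ALERT-BOX ERROR.
    APPLY "CLOSE" TO parserHandle.
    RETURN.
END.

/* Assumption: parserGetHandle() creates a new node handle. */
ASSIGN
    topNode    = parserGetHandle()
    resultNode = parserGetHandle().
parserNodeTop(topNode).

/* Assumed parameter order: start node, query name, node type. */
numResults = parserQueryCreate(topNode, "definesQuery", "DEFINE").

DO yourCounter = 1 TO numResults:
    /* Assumed parameter order: query name, result number, node handle. */
    parserQueryGetResult("definesQuery", yourCounter, resultNode).
    MESSAGE "DEFINE at line" parserGetNodeLine(resultNode).
END.

APPLY "CLOSE" TO parserHandle.
```

A single resultNode handle is reused for every result, so the number of handles in use does not grow with the size of the result set.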
It is not normally necessary to use the parserQueryClear function
to clear out the result set when you are done with it. Each time you call the
parserParse function, any old queries get cleared out. However,
if you are creating many queries in a loop, for a single parse, then you
might want to clear out queries when you are done with them.
parserQueryCreate
to put all nodes into the results set.
This allows you to view part of the tree, or all of the tree, as a flat
set of nodes. Instead of writing a recursive program to walk through the tree
structure, you can use a simple loop to visit each node.
In the 4GL, simple loops can be much faster than recursive functions.
Also, when the tree is flattened, operator
nodes are placed in between their operands, so this feature is especially
useful for printing out code.
Note however that with this approach, you lose the benefit of the structure of the tree. Some parsing applications are much better served with a tree structure than with a flat vector of nodes.
Queries work the same for investigating a scan result set (token/symbol list)
as they do for investigating a parse result (syntax tree).
There is an additional query option first_where_line=
for working with scan results.
See the Scanner
subsection Using Queries with the Scanner
for a description.
Proparse works a little differently than most parsers, which typically discard whitespace and comments. Proparse is designed so that you can work with those tokens within the syntax tree. Proparse preserves whitespace (WS) tokens, COMMENT tokens, as well as the following: AMPMESSAGE, AMPANALYZESUSPEND, AMPANALYZERESUME, AMPGLOBALDEFINE, AMPSCOPEDDEFINE, AMPUNDEFINE.
Unlike with regular nodes, Proparse only allows you to work with one hidden token at a time. There are no handles to work with. Because of this, the functions have fewer parameters and are a little simpler to work with than the functions for regular nodes.
See the example program examples/codeprint1a.p for an example of using the hidden token functions for retrieving whitespace. For each node that it displays, it also displays the whitespace tokens (if any) which come immediately before it.
The function parserHiddenGetBefore finds the hidden token, if any, which comes immediately prior to the node referred to by the input node handle. It returns TRUE if a hidden token is found.
The function parserHiddenGetFirst finds the first hidden token, if any, which comes immediately prior to the node referred to by the input node handle. For example, if we have a node with three hidden tokens in front of it (say, whitespace, then a comment, then more whitespace), then parserHiddenGetFirst finds the hidden token containing the first whitespace, and makes that the "current hidden token".
The function parserHiddenGetNext would then find the comment, using parserHiddenGetNext again would find the second whitespace, and then a third call to parserHiddenGetNext would return FALSE: there would no longer be any hidden token available.
The function parserHiddenGetPrevious of course goes in the opposite direction.
If a hidden token is available, then you can use the function parserHiddenGetType to get the current hidden token's type. The function parserHiddenGetText returns the current hidden token's text. In the case of "WS" tokens, this will be any number of contiguous space, tab, newline, and carriage return characters. The function parserHiddenGetFilename returns the name of the source file where the current hidden token's text came from. The function parserHiddenGetLine returns the line number within the source file where the current hidden token's text came from.
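As a sketch of how these functions fit together, the following procedure lists every hidden token in front of a node, in source order. It assumes that proparse.i has already been included, that theNode is a valid INTEGER node handle, that parserHiddenGetFirst returns a LOGICAL in the same way as parserHiddenGetBefore, and that parserHiddenGetNext and the parserHiddenGet* accessors take no parameters because they act on the current hidden token. The procedure name showHiddenBefore is hypothetical.

```abl
/* Hypothetical helper: lists the hidden tokens in front of theNode. */
PROCEDURE showHiddenBefore:
    DEFINE INPUT PARAMETER theNode AS INTEGER NO-UNDO.

    DEFINE VARIABLE hasToken AS LOGICAL NO-UNDO.

    /* Position on the first hidden token in front of the node, if any. */
    hasToken = parserHiddenGetFirst(theNode).

    DO WHILE hasToken:
        /* Show the type, the line, and the length of the text. */
        MESSAGE
            parserHiddenGetType()
            parserHiddenGetLine()
            LENGTH(parserHiddenGetText()).
        /* Returns FALSE once there are no more hidden tokens. */
        hasToken = parserHiddenGetNext().
    END.
END PROCEDURE.
```

The text length is shown instead of the text itself because WS tokens consist only of spaces, tabs, and line breaks, which are hard to read in a message.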
Use the function parserAttrGet to get a node's attributes.
The node attributes are stored within the syntax tree (within Proparse) with unique integer keys and unique integer values. Integers are stored in the nodes in the syntax tree, instead of character strings, to minimize the amount of storage space required by those attributes. However, to make programming from the 4GL easier on the eyes, Proparse does an internal mapping of attribute integer keys and values to unique attribute strings.
However, if you want to mark up the syntax tree (sometimes called "decorating the tree") with node attributes of your own, then you must set and get the node attributes with integer values. See parserAttrSet and parserAttrGetI.
To prevent clashes between different uses of attribute integers, we have established the following guidelines:
See also: Node Attributes Reference
Proparse provides at least two ways to do that. One way would be to use specially formatted comments, and then to look for those by using the hidden token functions. Proparse provides another method, though, which is easier to use. It allows you to create real nodes (not just hidden tokens) in the syntax tree.
You can use the function parserConfigSet with parameter values "show-proparse-directives" and "true" to enable this feature. (By default, its value is "false".)
Now to describe the marking that you can put into your code. An undefined preprocessor name can be inserted into your Progress source code without impacting the behavior of the program. We take advantage of this for our marking method. Normally, undefined preprocessor names have no impact on Proparse or its resulting syntax tree. However, if you set the configuration flag "show-proparse-directives" to "true", then Proparse watches for {&_PROPARSE_} directives, and allows those to be inserted into your syntax tree anywhere a statement may be inserted. In other words, you may place {&_PROPARSE_} directives anywhere in your source code where you would be able to place a complete Progress statement.
If you set "show-proparse-directives" to "true" and insert a {&_PROPARSE_} directive in the middle of a statement, your source code will not parse. Unless you set "show-proparse-directives" to "true", {&_PROPARSE_} directives are ignored.
{&_PROPARSE_} directives create nodes of node type "PROPARSEDIRECTIVE".
To add meaning to the _PROPARSE_ directives, simply add text (you decide what) before the closing curly brace. For example, {&_PROPARSE_ your meaningful text here} will create a node with type "PROPARSEDIRECTIVE", and the node's "proparsedirective" attribute will be "your meaningful text here". Use the function parserAttrGet to retrieve the value of the "proparsedirective" attribute, for example: parserAttrGet(theNode, "proparsedirective":U).
Note that whitespace between the "_PROPARSE_" and the first non-whitespace character is discarded. However, any whitespace between your text and the closing curly brace is not discarded.
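Putting the pieces together, a program that collects every directive in a compile unit might look like the sketch below. The file name and the query name "directives" are hypothetical. The handle-creation call (written here as parserGetHandle) and the parameter orders of parserQueryCreate and parserQueryGetResult are assumptions to be checked against your copy of proparse.i.

```abl
DEFINE VARIABLE parserHandle  AS HANDLE  NO-UNDO.
DEFINE VARIABLE topNode       AS INTEGER NO-UNDO.
DEFINE VARIABLE directiveNode AS INTEGER NO-UNDO.
DEFINE VARIABLE numResults    AS INTEGER NO-UNDO.
DEFINE VARIABLE yourCounter   AS INTEGER NO-UNDO.

RUN proparse/api/proparse.p PERSISTENT SET parserHandle.
{proparse/api/proparse.i parserHandle}

/* Enable the feature before parsing, so that the directives become nodes. */
parserConfigSet("show-proparse-directives", "true").

IF parserParse("myprogram.p") THEN DO:   /* hypothetical file name */
    ASSIGN
        topNode       = parserGetHandle()   /* assumed handle-creation call */
        directiveNode = parserGetHandle().
    parserNodeTop(topNode).

    /* Assumed parameter order: start node, query name, node type. */
    numResults = parserQueryCreate(topNode, "directives", "PROPARSEDIRECTIVE").
    DO yourCounter = 1 TO numResults:
        /* Assumed parameter order: query name, result number, node handle. */
        parserQueryGetResult("directives", yourCounter, directiveNode).
        MESSAGE
            parserGetNodeLine(directiveNode)
            parserAttrGet(directiveNode, "proparsedirective":U).
    END.
END.
ELSE
    MESSAGE parserErrorGetText() VIEW-AS ALERT-BOX ERROR.

APPLY "CLOSE" TO parserHandle.
```

Because trailing whitespace inside the directive is preserved, you may want to apply TRIM to the attribute value before comparing it with the text you expect.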
The functions parserDictAdd and parserDictDelete as well as the new node attribute from-user-dict allow you to play with alternative names for token types. For example, you could use parserDictAdd("define_const", "DEFINE") in order to make "define_const" a valid synonym for DEFINE.
A tree walker can find nodes related to user dictionary entries, and then make transformations to the tree based on those specially named nodes. In the "define_const" example, you might want to replace variable references with string or numeric literals, raise a syntax error if assignment of the variable is attempted, remove the define statement, comment the code where substitutions were made, etc.
Once all transformations have been made to ensure that the user-defined language extensions are converted to valid 4GL syntax, the tree can be written out to a new .p file, ready for handing over to the compiler.
Fun ideas to play with might include extending the 4GL to allow user-defined datatypes (classes), more object-oriented syntaxes, aspect-oriented programming, etc.
In order to review how Proparse has evaluated the preprocessing within a compile unit, a "listing" file can be written out.
You enable this feature by telling Proparse which file name to write the listing out to:
    parserConfigSet("listing-file", "/my/listing/file.txt")
and disable it with:
    parserConfigSet("listing-file", "")
The file is written to (overwritten) each time a new compile unit is parsed.
The output file is designed for use by programs or scripts which read the file - it is not designed to be looked at without the aid of some sort of viewer. For example, it would be easy to write a script to generate an HTML view of the preprocessing done for a compile unit.
The file format is small and simple, but requires a little explanation. Rather than list entire file names, we list a file index number. At the end of the listing file, there's a cross reference to tell you which file number goes with which file name. We do it this way for efficiency's sake, especially if we consider that we will want the data from these listing files to be stored persistently.
Each line starts with three numbers: The file index number, the line number, and the column number. There are three zeros "0 0 0" if that information is not relevant or for some reason not available.
Here is the format. "9" represents a number, "0" means that "0" will be written...
    9 9 9 globdef name value
    9 9 9 scopdef name value
    9 9 9 macroref name
    0 0 0 macrorefend
    9 9 9 undef name
    9 9 9 include 9
    0 0 0 incarg {name|9} value
    0 0 0 incend
    9 9 9 ampif {true|false}
    9 9 9 ampelseif {true|false|?}
    9 9 9 ampelse {true|?}
    9 9 9 ampendif
    0 0 0 fileindex 9 filename

Whenever it is something that can be calculated, we do not show names or values. Again, this is for persistent storage efficiency.
For "ampif", we show if it evaluated to true or false. For "ampelseif" and "ampelse", the value would be "?" if it's not evaluated, because a "true" &if or &elseif has already been evaluated.
It is important to note that the value for include arguments requires extra processing. In order to keep the listing file such that there is one line per entry, any line breaks in an include argument value are replaced. Backslashes are replaced with double backslash, newlines are replaced with backslash-n, carriage returns are replaced with backslash-r. To convert the argument back to its original string, do a search and replace. (Double backslashes first, then backslash n and r to newline and carriage return characters.)
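One way to restore an include argument value is sketched below. It is an illustration, not part of Proparse, and the function name unescapeIncludeArg is hypothetical. It reads the value in a single left-to-right pass, so that an escaped backslash followed by a literal n (two backslashes and an n in the file) comes back as a backslash and an n rather than as a newline.

```abl
/* Hypothetical helper: restores an include argument value read from a listing file. */
FUNCTION unescapeIncludeArg RETURNS CHARACTER (INPUT escaped AS CHARACTER):
    DEFINE VARIABLE unescaped AS CHARACTER NO-UNDO.
    DEFINE VARIABLE i         AS INTEGER   NO-UNDO.
    /* CASE-SENSITIVE so that "N" or "R" is not mistaken for "n" or "r". */
    DEFINE VARIABLE nextChar  AS CHARACTER CASE-SENSITIVE NO-UNDO.

    i = 1.
    DO WHILE i <= LENGTH(escaped):
        /* CHR(92) is the backslash character. */
        IF SUBSTRING(escaped, i, 1) = CHR(92) AND i < LENGTH(escaped) THEN DO:
            nextChar = SUBSTRING(escaped, i + 1, 1).
            /* An unrecognized sequence falls through to the last branch
               and is kept unchanged. */
            IF nextChar = CHR(92) THEN unescaped = unescaped + CHR(92).
            ELSE IF nextChar = "n" THEN unescaped = unescaped + CHR(10).
            ELSE IF nextChar = "r" THEN unescaped = unescaped + CHR(13).
            ELSE unescaped = unescaped + CHR(92) + nextChar.
            i = i + 2.
        END.
        ELSE ASSIGN
            unescaped = unescaped + SUBSTRING(escaped, i, 1)
            i         = i + 1.
    END.
    RETURN unescaped.
END FUNCTION.
```

CHR(92), CHR(10), and CHR(13) are used instead of string literals so that the sketch does not depend on how backslashes are treated in 4GL source on a given platform.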
The integer node attribute for key 2100 is valid for the CLASS node only.
It returns an integer handle to the CLASS node of the super class,
or zero if the handle is not available.
This handle was already created internally; do not use the getHandle() function.
NOTE: The super tree might not be available if "multi-parse"
caching is turned on. If caching is turned on, then the trees for those supers
are available the first time they are needed.
After that, only their inheritance information is cached internally by Proparse
so that it does not need to be re-parsed.
In other words, if the super class's syntax tree needs to be examined by your
application at some point anyway, then examine it the first time it becomes available.
This can save you an extra call to parse(),
and save your application from the redundant processing overhead.