NaturalDocs::Languages::Advanced
The base class for all languages that have full support in Natural Docs. Each one will have a custom parser capable of documenting undocumented aspects of the code.
NaturalDocs::Languages::Advanced | The base class for all languages that have full support in Natural Docs. |
Implementation | |
Members | The class is implemented as a blessed arrayref. |
Functions | |
New | Creates and returns a new object. |
Tokens | Returns the tokens found by ParseForCommentsAndTokens(). |
SetTokens | Replaces the tokens. |
ClearTokens | Resets the token list. |
AutoTopics | Returns the arrayref of automatically generated topics, or undef if none. |
AddAutoTopic | Adds a NaturalDocs::Parser::ParsedTopic to AutoTopics(). |
ClearAutoTopics | Resets the automatic topic list. |
ScopeRecord | Returns an arrayref of NaturalDocs::Languages::Advanced::ScopeChange objects describing how and when the scope changed throughout the file. |
Parsing Functions | These functions are good general language building blocks. |
ParseForCommentsAndTokens | Loads the passed file, sends all appropriate comments to NaturalDocs::Parser->OnComment(), and breaks the rest into an arrayref of tokens. |
PreprocessFile | An overridable function if you’d like to preprocess the file before it goes into ParseForCommentsAndTokens(). |
TokenizeLine | Converts the passed line to tokens as described in ParseForCommentsAndTokens and adds them to Tokens(). |
TryToSkipString | If the position is on a string delimiter, moves the position to the token following the closing delimiter, or past the end of the tokens if there is none. |
SkipRestOfLine | Moves the position to the token following the next line break, or past the end of the tokens array if there is none. |
SkipUntilAfter | Moves the position to the token following the next occurrence of a particular token sequence, or past the end of the tokens array if it never occurs. |
IsFirstLineToken | Returns whether the position is at the first token of a line, not including whitespace. |
IsLastLineToken | Returns whether the position is at the last token of a line, not including whitespace. |
IsAtSequence | Returns whether the position is at a sequence of tokens. |
IsBackslashed | Returns whether the position is after a backslash. |
Scope Functions | These functions provide a nice scope stack implementation for language-specific parsers to use. |
ClearScopeStack | Clears the scope stack for a new file. |
StartScope | Records a new scope level. |
EndScope | Records the end of the current scope level. |
ClosingScopeSymbol | Returns the symbol that ends the current scope level, or undef if we are at the top level. |
CurrentScope | Returns the current calculated scope, or undef if global. |
CurrentPackage | Returns the current calculated package or class, or undef if none. |
SetPackage | Sets the package for the current scope level. |
CurrentUsing | Returns the current calculated arrayref of SymbolStrings from Using statements, or undef if none. |
AddUsing | Adds a Using SymbolString to the current scope. |
Support Functions | |
AddToScopeRecord | Adds a change to the scope record, condensing unnecessary entries. |
CreateString | Converts the specified tokens into a string and returns it. |
The class is implemented as a blessed arrayref. The following constants are used as indexes.
TOKENS | An arrayref of tokens used in all the Parsing Functions. |
SCOPE_STACK | An arrayref of NaturalDocs::Languages::Advanced::Scope objects serving as a scope stack for parsing. There will always be one available, with a symbol of undef, for the top level. |
SCOPE_RECORD | An arrayref of NaturalDocs::Languages::Advanced::ScopeChange objects, as generated by the scope stack. If there is more than one change per line, only the last is stored. |
AUTO_TOPICS | An arrayref of NaturalDocs::Parser::ParsedTopic objects generated automatically from the code. |
sub Tokens
Returns the tokens found by ParseForCommentsAndTokens().
sub AddAutoTopic #( topic )
Adds a NaturalDocs::Parser::ParsedTopic to AutoTopics().
sub ClearAutoTopics
Resets the automatic topic list. Not necessary if you call ParseForCommentsAndTokens().
sub ScopeRecord
Returns an arrayref of NaturalDocs::Languages::Advanced::ScopeChange objects describing how and when the scope changed throughout the file. There will always be at least one entry, which will be for line 1 with undef as the scope.
These functions are good general language building blocks. Use them to create your language-specific parser.
All functions work on Tokens() and assume it is set by ParseForCommentsAndTokens().
sub ParseForCommentsAndTokens #( FileName sourceFile, string[] lineCommentSymbols, string[] blockCommentSymbols, string[] javadocLineCommentSymbols, string[] javadocBlockCommentSymbols )
Loads the passed file, sends all appropriate comments to NaturalDocs::Parser->OnComment(), and breaks the rest into an arrayref of tokens. Tokens are defined as
- All consecutive alphanumeric and underscore characters.
- All consecutive whitespace.
- A single line break.
- A single character not covered by the above, which is usually a symbol.
The result will be placed in Tokens().
sourceFile | The source FileName to load and parse. |
lineCommentSymbols | An arrayref of symbols that designate line comments, or undef if none. |
blockCommentSymbols | An arrayref of symbol pairs that designate multiline comments, or undef if none. Symbol pairs are designated as two consecutive array entries, the opening symbol appearing first. |
javadocLineCommentSymbols | An arrayref of symbols that designate the start of a JavaDoc comment, or undef if none. |
javadocBlockCommentSymbols | An arrayref of symbol pairs that designate multiline JavaDoc comments, or undef if none. |
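As a rough illustration of the comment/code split this function performs, here is a toy Python sketch (not the module's Perl code) that handles only line comments; the real method also handles block and JavaDoc comments and reports each comment via NaturalDocs::Parser->OnComment():

```python
import re

def parse_lines(lines, line_comment_symbols):
    # Toy sketch: lines starting with a line-comment symbol become comment
    # text; everything else is broken into tokens (word runs, whitespace
    # runs, or single symbol characters) with a line break token appended.
    comments, tokens = [], []
    for line in lines:
        stripped = line.lstrip()
        symbol = next((s for s in line_comment_symbols
                       if stripped.startswith(s)), None)
        if symbol is not None:
            comments.append(stripped[len(symbol):].strip())
        else:
            tokens.extend(re.findall(r'[A-Za-z0-9_]+|[ \t]+|.', line))
            tokens.append('\n')
    return comments, tokens
```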
sub PreprocessFile #( lines )
An overridable function if you’d like to preprocess the file before it goes into ParseForCommentsAndTokens().
lines | An arrayref to the file’s lines. Each line has its line break stripped off, but is otherwise untouched. |
sub TokenizeLine #( line )
Converts the passed line to tokens as described in ParseForCommentsAndTokens and adds them to Tokens(). Also adds a line break token after it.
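The tokenization rule can be sketched with a single regular expression; this Python illustration (a sketch of the behavior, not the Perl implementation) splits a line into word-character runs, whitespace runs, and single symbol characters, then appends the line break token:

```python
import re

def tokenize_line(line):
    # Word runs, whitespace runs, or any other single character (symbols).
    tokens = re.findall(r'[A-Za-z0-9_]+|[ \t]+|.', line)
    tokens.append('\n')  # line break token added after each line
    return tokens

tokenize_line('my $x = 10;')
# ['my', ' ', '$', 'x', ' ', '=', ' ', '10', ';', '\n']
```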
sub TryToSkipString #( indexRef, lineNumberRef, openingDelimiter, closingDelimiter, startContentIndexRef, endContentIndexRef )
If the position is on a string delimiter, moves the position to the token following the closing delimiter, or past the end of the tokens if there is none. Assumes all other characters are allowed in the string, the delimiter itself is allowed if it’s preceded by a backslash, and line breaks are allowed in the string.
indexRef | A reference to the position’s index into Tokens(). |
lineNumberRef | A reference to the position’s line number. |
openingDelimiter | The opening string delimiter, such as a quote or an apostrophe. |
closingDelimiter | The closing string delimiter, if different. If not defined, assumes the same as openingDelimiter. |
startContentIndexRef | A reference to a variable in which to store the index of the first token of the string’s content. May be undef. |
endContentIndexRef | A reference to a variable in which to store the index of the end of the string’s content, which is one past the last index of content. May be undef. |
Whether the position was on the passed delimiter or not. The index, line number, and content index ref variables will be updated only if true.
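The skipping logic can be sketched as follows in Python (a simplified model of the behavior described above; the real Perl method also tracks the line number through lineNumberRef and reports the content indexes):

```python
def try_to_skip_string(tokens, index, opening, closing=None):
    # Returns (new_index, matched). If tokens[index] is not the opening
    # delimiter, nothing moves and matched is False.
    if closing is None:
        closing = opening  # same delimiter on both ends, e.g. a quote
    if index >= len(tokens) or tokens[index] != opening:
        return index, False
    i = index + 1
    while i < len(tokens):
        if tokens[i] == '\\':      # a backslashed delimiter stays in the string
            i += 2
        elif tokens[i] == closing:
            return i + 1, True     # token following the closing delimiter
        else:
            i += 1
    return i, True  # unterminated string: past the end of the tokens
```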
sub SkipRestOfLine #( indexRef, lineNumberRef )
Moves the position to the token following the next line break, or past the end of the tokens array if there is none. Useful for line comments.
Note that it skips blindly. It assumes there cannot be anything of interest, such as a string delimiter, between the position and the end of the line.
indexRef | A reference to the position’s index into Tokens(). |
lineNumberRef | A reference to the position’s line number. |
sub SkipUntilAfter #( indexRef, lineNumberRef, token, token, ... )
Moves the position to the token following the next occurrence of a particular token sequence, or past the end of the tokens array if it never occurs. Useful for multiline comments.
Note that it skips blindly. It assumes there cannot be anything of interest, such as a string delimiter, between the position and the token sequence.
indexRef | A reference to the position’s index. |
lineNumberRef | A reference to the position’s line number. |
token | A token that must be matched. Can be specified multiple times to match a sequence of tokens. |
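The scan can be sketched in Python as follows, ignoring the line-number bookkeeping the real method performs through lineNumberRef:

```python
def skip_until_after(tokens, index, *sequence):
    # Advance to the token following the next occurrence of `sequence`,
    # or past the end of the tokens if it never occurs.
    n = len(sequence)
    while index < len(tokens):
        if tuple(tokens[index:index + n]) == sequence:
            return index + n
        index += 1
    return index
```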
These functions provide a scope stack implementation for language-specific parsers to use.
sub ClearScopeStack
Clears the scope stack for a new file. Not necessary if you call ParseForCommentsAndTokens().
sub StartScope #( closingSymbol, lineNumber, package )
Records a new scope level.
closingSymbol | The closing symbol of the scope. |
lineNumber | The line number where the scope begins. |
package | The package SymbolString of the scope. Undef means no change. |
sub EndScope #( lineNumber )
Records the end of the current scope level. Note that this is done blindly; check ClosingScopeSymbol() first if you need to determine whether ending the scope is actually correct.
lineNumber | The line number where the scope ends. |
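The interplay between StartScope(), EndScope(), ClosingScopeSymbol(), and CurrentPackage() can be sketched as a small Python model (the names and internal representation here are illustrative, not the Perl implementation):

```python
class ScopeStack:
    def __init__(self):
        self.clear()

    def clear(self):
        # One level, with no closing symbol, is always present for the
        # top (global) level.
        self._stack = [{'closing_symbol': None, 'package': None}]

    def start_scope(self, closing_symbol, package=None):
        if package is None:  # undef/None means the package does not change
            package = self._stack[-1]['package']
        self._stack.append({'closing_symbol': closing_symbol,
                            'package': package})

    def end_scope(self):
        # Blind, like the real EndScope(): callers should consult
        # closing_scope_symbol() first if correctness matters.
        if len(self._stack) > 1:
            self._stack.pop()

    def closing_scope_symbol(self):
        return self._stack[-1]['closing_symbol']

    def current_package(self):
        return self._stack[-1]['package']
```

For example, after `start_scope('}', 'MyClass')` the current package is `'MyClass'` and the closing symbol is `'}'`; `end_scope()` restores the enclosing level.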
sub CurrentScope
Returns the current calculated scope, or undef if global. The default implementation just returns CurrentPackage(). This is a separate function because C++ may need to track namespaces and classes separately, and so the current scope would be a concatenation of them.
sub SetPackage #( package, lineNumber )
Sets the package for the current scope level.
package | The new package SymbolString. |
lineNumber | The line number the new package starts on. |
sub CurrentUsing
Returns the current calculated arrayref of SymbolStrings from Using statements, or undef if none.
sub AddUsing #( using )
Adds a Using SymbolString to the current scope.
sub AddToScopeRecord #( newScope, lineNumber )
Adds a change to the scope record, condensing unnecessary entries.
newScope | What the scope SymbolString changed to. |
lineNumber | Where the scope changed. |
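The condensing rules follow from SCOPE_RECORD's description: a change to the same scope is redundant, and only the last change on a given line is kept. A Python sketch, representing the record as (lineNumber, scope) tuples rather than ScopeChange objects:

```python
def add_to_scope_record(record, new_scope, line_number):
    if record and record[-1][1] == new_scope:
        return  # no actual scope change; nothing to record
    if record and record[-1][0] == line_number:
        record[-1] = (line_number, new_scope)  # keep only the last change per line
    else:
        record.append((line_number, new_scope))
```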
sub New #( name )
Creates and returns a new object.
sub SetTokens #( tokens )
Replaces the tokens.
sub ClearTokens
Resets the token list.
sub AutoTopics
Returns the arrayref of automatically generated topics, or undef if none.
sub OnComment #( string[] commentLines, int lineNumber, bool isJavaDoc )
The function called by NaturalDocs::Languages::Base-derived objects when their parsers encounter a comment suitable for documentation.
sub IsFirstLineToken #( index )
Returns whether the position is at the first token of a line, not including whitespace.
sub IsLastLineToken #( index )
Returns whether the position is at the last token of a line, not including whitespace.
sub IsAtSequence #( index, token, token, token ... )
Returns whether the position is at a sequence of tokens.
sub IsBackslashed #( index )
Returns whether the position is after a backslash.
sub ClosingScopeSymbol
Returns the symbol that ends the current scope level, or undef if we are at the top level.
sub CurrentPackage
Returns the current calculated package or class, or undef if none.
sub CreateString #( startIndex, endIndex )
Converts the specified tokens into a string and returns it.