State transitions are defined in state elements in the type’s XML file. They indicate when a different state should become active. A state can apply specific formatting to a section and control whether other transitions or keyword highlighting can occur inside it. For example, in most languages keywords aren’t highlighted within strings or comments, but those sections still need their own formatting.
cpp.xml
<state name="Block comment" id="2" type="includeStart" colour="darkGreen"
       allowNestedStates="false" useEscapeChar="false" start="/*" end="*/">
  <startType type="cpp" id="0" />
</state>
The state element has a set of attributes that define how the state transition works. The startType elements define the type name and state id combinations in which the state transition can start. For example, you can’t start a comment in the middle of a quoted sequence.
Attribute | Description | Default Value |
name | The descriptive name of the state | “” |
id | A number used to identify the state to other transitions | 0 |
type | Controls how the state is handled. The options are: includeStart (state colour is applied before the start sequence and lasts until the end); excludeStart (state colour is applied after the start sequence and lasts until the end); includeStartWord (state colour is applied only to the start and end sequences); NoColour (state colour is not applied). | pre |
colour | The colour to use for the state | “” |
allowNestedStates | Controls whether or not state transitions should be tested when in this state | false |
useEscapeChar | Indicates if the start or end sequence should be checked to see if it’s escaped using the type’s defined escape character | false |
start | The character sequence used to trigger the state transition. | “” |
end | The character sequence used to trigger the transition back to the previous state. The special “~line” sequence indicates the state ends at the end of the line and the “~word” sequence indicates it ends at the end of the word. | “” |
regEx | Regular expression used to ensure only whole words are detected | "[a-zA-Z0-9]" |
words | If set to sym, this is the list of words that can be included in the start sequence. The special “~word” sequence means all words that match the regEx are valid. | |
Code
State.h State.cpp
The State class stores information about state transitions. A state controls which formatting conditions are valid. The class contains methods to detect the start and end of a state and to print its start and end sequences.
The State() constructor creates a default state that’s used as the initial state for all languages.
The State(xmlNodePtr xmlNode) constructor first sets the State attributes to their default values and then uses the passed in XML object to decode the attributes from the type file along with start and end TypeIdPairs.
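The defaults-then-override order described above can be sketched as follows. This is only an illustration under stated assumptions: StateAttributes and ApplyAttributes are hypothetical names, and a std::map stands in for the attributes that the real constructor reads from the xmlNodePtr. The default values are the ones listed in the attribute table.

```cpp
#include <map>
#include <string>

// Hypothetical holder for the attributes in the table; not the real State class.
struct StateAttributes {
    std::string name;                   // default ""
    int id = 0;
    std::string type = "pre";
    std::string colour;                 // default ""
    bool allowNestedStates = false;
    bool useEscapeChar = false;
    std::string start;                  // default ""
    std::string end;                    // default ""
    std::string regEx = "[a-zA-Z0-9]";
    std::string words;                  // no default listed
};

// Begin with the defaults, then overwrite only the attributes that are present.
// The map is an assumed stand-in for the XML node's attribute list.
StateAttributes ApplyAttributes(const std::map<std::string, std::string>& attrs) {
    StateAttributes s;  // every member starts at its default value
    for (const auto& kv : attrs) {
        const std::string& key = kv.first;
        const std::string& value = kv.second;
        if (key == "name") s.name = value;
        else if (key == "id") s.id = std::stoi(value);  // throws on non-numeric input
        else if (key == "type") s.type = value;
        else if (key == "colour") s.colour = value;
        else if (key == "allowNestedStates") s.allowNestedStates = (value == "true");
        else if (key == "useEscapeChar") s.useEscapeChar = (value == "true");
        else if (key == "start") s.start = value;
        else if (key == "end") s.end = value;
        else if (key == "regEx") s.regEx = value;
        else if (key == "words") s.words = value;
        // Unknown attributes are ignored in this sketch.
    }
    return s;
}
```

Setting the defaults first means a state element only has to list the attributes that differ from them.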
The IsStart(std::string line, int pos, TypeIdPair testType, std::string escape) method compares the start sequence for the state transition against the characters in the current line at the specified position. If the start sequence matches, it loops through the list of allowed start type name and state id pairs to see if any of them match the passed in pair. If useEscapeChar is true, it then counts how many times the passed in escape character appears before the start sequence. The method returns true only if the start sequence matches, a matching type name and state id pair is found, and either useEscapeChar is false or there is an even number of escape characters before the start sequence. Otherwise it returns false.
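The three checks can be sketched as below. This is a simplified illustration, not the actual State.cpp code: StartRule and CountEscapesBefore are hypothetical names, TypeIdPair is reduced to a plain struct, positions use std::size_t rather than int, and the sketch assumes that only escape characters placed consecutively and directly before the start sequence are counted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the TypeIdPair mentioned in the text.
struct TypeIdPair {
    std::string type;
    int id;
};

// Hypothetical subset of a state's data needed for start detection.
struct StartRule {
    std::string start;
    bool useEscapeChar;
    std::vector<TypeIdPair> startTypes;
};

// Counts consecutive escape sequences that end directly before pos.
std::size_t CountEscapesBefore(const std::string& line, std::size_t pos,
                               const std::string& escape) {
    if (escape.empty()) return 0;
    std::size_t count = 0;
    while (pos >= escape.size() &&
           line.compare(pos - escape.size(), escape.size(), escape) == 0) {
        pos -= escape.size();
        ++count;
    }
    return count;
}

bool IsStart(const StartRule& rule, const std::string& line, std::size_t pos,
             const TypeIdPair& current, const std::string& escape) {
    // 1. The start sequence must match the line at pos.
    if (rule.start.empty() || pos > line.size() ||
        line.compare(pos, rule.start.size(), rule.start) != 0) {
        return false;
    }
    // 2. The current type/state pair must be one of the allowed start types.
    bool allowed = false;
    for (const TypeIdPair& t : rule.startTypes) {
        if (t.type == current.type && t.id == current.id) {
            allowed = true;
            break;
        }
    }
    if (!allowed) return false;
    // 3. An odd number of preceding escapes means the sequence itself is escaped.
    if (!rule.useEscapeChar) return true;
    return CountEscapesBefore(line, pos, escape) % 2 == 0;
}
```

The even/odd test matters because an escape character can itself be escaped: with a backslash as the escape character, one backslash before a quote escapes the quote, while two backslashes form an escaped backslash followed by a live quote.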
The IsEnd(std::string line, int pos, TypeIdPair testType, std::string escape) method compares the end sequence for the state transition against the characters in the current line at the specified position. If useEscapeChar is true, it then counts how many times the passed in escape character appears before the end sequence. The method returns true if the end sequence matches and either useEscapeChar is false or there is an even number of escape characters before the end sequence. Otherwise it returns false.
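To show how an end test of this kind is typically used, the sketch below scans a line for the first position where a state ends. It is an illustration only: this IsEnd is a free function with a simplified parameter list (the testType argument is left out), FindEnd is a hypothetical helper that does not appear in the source, and escape characters are assumed to count only when they sit directly before the end sequence.

```cpp
#include <cstddef>
#include <string>

// Simplified end test: the end sequence must match at pos and, when escapes
// are honoured, be preceded by an even number of escape sequences.
bool IsEnd(const std::string& end, bool useEscapeChar, const std::string& line,
           std::size_t pos, const std::string& escape) {
    if (end.empty() || pos > line.size() ||
        line.compare(pos, end.size(), end) != 0) {
        return false;
    }
    if (!useEscapeChar || escape.empty()) return true;
    std::size_t escapes = 0;
    std::size_t p = pos;
    while (p >= escape.size() &&
           line.compare(p - escape.size(), escape.size(), escape) == 0) {
        p -= escape.size();
        ++escapes;
    }
    return escapes % 2 == 0;
}

// Hypothetical helper: first position at or after from where the state ends,
// or std::string::npos if the state stays open past the end of the line.
std::size_t FindEnd(const std::string& end, bool useEscapeChar,
                    const std::string& line, std::size_t from,
                    const std::string& escape) {
    for (std::size_t pos = from; pos < line.size(); ++pos) {
        if (IsEnd(end, useEscapeChar, line, pos, escape)) return pos;
    }
    return std::string::npos;
}
```

For a quoted string state, the scan skips every escaped quote and stops at the first unescaped one. A result of npos would indicate that the state is still active when the next line begins.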
The PrintStart(std::stringstream& lineStream, std::string line, int pos) method prints the start sequence of a state. If the state is includeStartWord or includeStart, a span is started and then the start sequence is printed to lineStream. If the state is excludeStart and not marked to end after a word, the span is started after the start sequence is printed. If the state is includeStartWord, it then tries to find a valid state word and closes the span. If the state is marked to end after a word, it tries to find a word from line and prints it before closing the span. The method returns the length of the start sequence plus any words printed.
The PrintEnd(std::stringstream& lineStream) method prints the end sequence of a state. If the state is includeStartWord, a span is started before the end sequence is printed to lineStream. If the state is not NoColour, a closing span is then printed. The method returns the length of the end sequence.
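The ordering of span tags and sequences for each state type can be sketched as follows. This is a reduced illustration: the inline-style span markup, the SpanState struct, and the OpenSpan helper are assumptions, the functions are free functions rather than members, and the word handling ("~word" endings and state word lookup) is left out entirely.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

enum class StateType { IncludeStart, ExcludeStart, IncludeStartWord, NoColour };

// Hypothetical subset of a state's data needed for printing.
struct SpanState {
    StateType type;
    std::string colour;
    std::string start;
    std::string end;
};

// Assumed markup; the real output format is not shown in the source.
void OpenSpan(std::stringstream& out, const SpanState& s) {
    out << "<span style=\"color:" << s.colour << "\">";
}

std::size_t PrintStart(const SpanState& s, std::stringstream& out) {
    switch (s.type) {
    case StateType::IncludeStart:      // colour covers the start sequence
        OpenSpan(out, s);
        out << s.start;
        break;
    case StateType::ExcludeStart:      // colour begins after the start sequence
        out << s.start;
        OpenSpan(out, s);
        break;
    case StateType::IncludeStartWord:  // colour covers only the start sequence
        OpenSpan(out, s);
        out << s.start << "</span>";
        break;
    case StateType::NoColour:          // no span at all
        out << s.start;
        break;
    }
    return s.start.size();
}

std::size_t PrintEnd(const SpanState& s, std::stringstream& out) {
    if (s.type == StateType::IncludeStartWord) OpenSpan(out, s);
    out << s.end;
    if (s.type != StateType::NoColour) out << "</span>";
    return s.end.size();
}
```

With the block comment state from cpp.xml, calling PrintStart, writing the comment body, and then calling PrintEnd would wrap the whole comment, delimiters included, in one span. The sketch does not HTML-escape the sequences, which a real implementation would likely need to do.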
The PrintRestart(std::stringstream& lineStream) method is called when a state is re-opened after being temporarily halted to display something else. It prints the opening span to lineStream unless the state is includeStartWord or NoColour.
Its counterpart, which also takes a std::stringstream& lineStream, is called when a state needs to be temporarily halted to display something else. It prints a closing span to lineStream unless the state is includeStartWord or NoColour.
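The halt and restart pair might be used as in the sketch below, which closes the state's span, lets something else print, and then reopens the span. PrintHalt is a placeholder name for the halting method, and the span markup, the SpanState struct, and the KeepsSpanOpen helper are likewise assumptions made for the example.

```cpp
#include <sstream>
#include <string>

enum class StateType { IncludeStart, ExcludeStart, IncludeStartWord, NoColour };

// Hypothetical subset of a state's data needed here.
struct SpanState {
    StateType type;
    std::string colour;
};

// includeStartWord and NoColour states have no span open across their body,
// so there is nothing to close or reopen for them.
bool KeepsSpanOpen(const SpanState& s) {
    return s.type != StateType::IncludeStartWord && s.type != StateType::NoColour;
}

// Placeholder name: temporarily closes the state's span.
void PrintHalt(const SpanState& s, std::stringstream& out) {
    if (KeepsSpanOpen(s)) out << "</span>";
}

// Reopens the state's span after the interruption.
void PrintRestart(const SpanState& s, std::stringstream& out) {
    if (KeepsSpanOpen(s)) out << "<span style=\"color:" << s.colour << "\">";
}
```

Closing and reopening the span keeps the generated markup flat, so whatever is printed in between does not inherit the halted state's colour.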
The FindStateWord(std::string line, int pos) method loops through the list of words defined on the state. If a word in the list matches what is at the current line position and isn’t part of a larger word, it returns that word. If the special “~word” sequence is found in the list, it uses the regular expression to find the word starting at the current position in the line and returns it.
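A minimal sketch of this lookup is shown below. It rests on several assumptions: the regEx is treated as a single-character class (as the default "[a-zA-Z0-9]" suggests), the whole-word check only looks at the character after the match, an empty string signals that nothing was found, and the function is a free function taking the word list and regular expression as parameters instead of reading them from the State object.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// True if c matches the state's word-character expression.
bool IsWordChar(char c, const std::regex& wordChar) {
    return std::regex_match(std::string(1, c), wordChar);
}

// Returns the state word found at pos, or an empty string if there is none.
std::string FindStateWord(const std::vector<std::string>& words,
                          const std::regex& wordChar,
                          const std::string& line, std::size_t pos) {
    if (pos >= line.size()) return "";
    for (const std::string& word : words) {
        if (word == "~word") {
            // Any run of word characters starting at pos is accepted.
            std::size_t end = pos;
            while (end < line.size() && IsWordChar(line[end], wordChar)) ++end;
            if (end > pos) return line.substr(pos, end - pos);
            continue;
        }
        if (word.empty() || line.compare(pos, word.size(), word) != 0) continue;
        // Reject the match if it is only the prefix of a longer word.
        std::size_t after = pos + word.size();
        if (after >= line.size() || !IsWordChar(line[after], wordChar)) {
            return word;
        }
    }
    return "";
}
```

For example, with the words "include" and "define", a lookup at position 1 of "#include <x>" would return "include", while the same lookup on "#includes" would return nothing because the match is followed by another word character.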