Overview

The Pattern Parser provides a pattern-based way to turn a String into a Map of name-value pairs. Its primary use in the Connectors project is parsing command-line output and producing sets of Attributes.

The most important class is MapTransform. A MapTransform parses a String into a Map using an array of PatternNodes, which contain regular expressions that break up the input. Each PatternNode specifies a java.util.regex.Pattern to be matched. The output of a PatternNode is the first group in the Pattern. Thus,

    <PatternNode key='TSO*HOLDCLASS' pattern='HOLDCLASS=\s*(\S)'/>
will return a Map.Entry with key 'TSO*HOLDCLASS' and a single-character value: the first non-whitespace character after HOLDCLASS=. The values stored in the Map can be modified by an array of Transforms attached to the PatternNode. These Transforms are applied as a chain: each Transform takes as input the output of the previous Transform. See below for examples.
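
The two ideas above (take the first group, then pass it through a chain) can be sketched with plain java.util.regex. This is only an illustration, not the Pattern Parser's own code: the class FirstGroupSketch and the helpers firstGroup and applyChain are hypothetical names, and java.util.function.Function stands in for the library's Transform class.

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; it does not use PatternNode or Transform.
public class FirstGroupSketch {

    /** Returns the first capture group of the first match, or null if the regex does not match. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    /** Applies each step to the output of the previous step. */
    static Object applyChain(Object value, List<Function<Object, Object>> chain) {
        Object current = value;
        for (Function<Object, Object> step : chain) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        String input = "HOLDCLASS= X\nSIZE= 00006133   \n";

        // First group only: the single non-whitespace character after HOLDCLASS=
        System.out.println(firstGroup("HOLDCLASS=\\s*(\\S)", input)); // X

        // Capture the rest of the line, then strip trailing blanks and convert to Integer
        String raw = firstGroup("SIZE=\\s*([^\\n]*)\\n", input);
        List<Function<Object, Object>> chain = List.of(
            v -> ((String) v).replaceAll("\\s*$", ""),
            v -> Integer.valueOf((String) v));
        System.out.println(applyChain(raw, chain)); // 6133
    }
}
```

The order of the chain matters here: Integer.valueOf would reject the trailing blanks, so the trimming step has to run first.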

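The Pattern Parser contains a set of pre-defined Transforms, including the SubstituteTransform and ClassTransform used in the example parser later in this overview.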

Extending the Pattern Parser

It is also possible to extend the Pattern Parser with additional Transform subtypes. In order to do this, the new Transform class must do the following:
  1. Extend the Transform class
  2. Implement the transform(Object) method
  3. Have a Constructor taking an Element
  4. If the Transform has attributes, override the getAttributes() method (the Transform must have an attribute named class, which specifies the class name of the Transform)
  5. If the Transform has children, override the getChildren() method

Example User-Written Transform Class

    public static class SubstringTransform extends Transform<Object> {
        private static final String START = "start";
        private static final String END = "end";

        private int start;
        private int end;

        public SubstringTransform(Element element) {
            this(Integer.parseInt(element.getAttribute(START)), Integer.parseInt(element.getAttribute(END)));
        }

        // 'end' is an exclusive end index (as in String.substring), not a length
        public SubstringTransform(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        protected String getAttributes() {
            return super.getAttributes()
                + attributeToString(START, String.valueOf(start))
                + attributeToString(END, String.valueOf(end));
        }

        @Override
        public Object transform(Object input) throws Exception {
            return ((String)input).substring(start, end);
        }

    }
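
Because the transform above delegates to String.substring, its start and end attributes follow that method's convention: start is inclusive and end is exclusive. The following standalone sketch shows just that core operation; the class SubstringSketch is a hypothetical name and does not depend on the Transform class or on an Element.

```java
// Hypothetical stand-in showing only the substring step of the transform above.
public class SubstringSketch {

    /** Returns the characters from index start (inclusive) to index end (exclusive). */
    static Object substring(Object input, int start, int end) {
        return ((String) input).substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(substring("ISPFPROC", 0, 4)); // ISPF
        System.out.println(substring("ISPFPROC", 4, 8)); // PROC
    }
}
```

An end index past the end of the input makes String.substring throw a StringIndexOutOfBoundsException, so a transform like this is best attached to patterns whose captured value has a known minimum length.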

Debugging

A ParserDebugger class is provided to assist with debugging your MapTransforms.

It can be run with a command line such as:

    java -cp PatternParser-1.0.x.jar org.identityconnectors.patternparser.ParserDebugger SampleParser.xml SampleInput.txt
This will produce a trace of all matching and Transforms performed during a Pattern-based parse.

Example User-Written Parser

As an example, this Parser parses the output from the

    LISTUSER name NORACF TSO
RACF command, and creates a java.util.Map<String, Object>.

Sample Parser

<MapTransform>
  <PatternNode key='TSO*ACCTNUM'     pattern='ACCTNUM=\s*([^\n]*)\n'  optional='true' reset='false'>
    <SubstituteTransform pattern='\s*$' substitute=''/>
  </PatternNode>
  <PatternNode key='TSO*HOLDCLASS'   pattern='HOLDCLASS=\s*(\S)'      optional='true' reset='false'/>
  <PatternNode key='TSO*JOBCLASS'    pattern='JOBCLASS=\s*(\S)'       optional='true' reset='false'/>
  <PatternNode key='TSO*MSGCLASS'    pattern='MSGCLASS=\s*(\S)'       optional='true' reset='false'/>
  <PatternNode key='TSO*PROC'        pattern='PROC=\s*(\S{1,8})'      optional='true' reset='false'/>
  <PatternNode key='TSO*SIZE'        pattern='SIZE=\s*(\d+)'          optional='false' reset='false'>
      <ClassTransform transform='java.lang.Integer'/>
  </PatternNode>
  <PatternNode key='TSO*MAXSIZE'     pattern='MAXSIZE=\s*(\d+)'       optional='false' reset='false'>
      <ClassTransform transform='java.lang.Integer'/>
  </PatternNode>
  <PatternNode key='TSO*SYSOUTCLASS' pattern='SYSOUTCLASS=\s*(\S)'    optional='true' reset='false'/>
  <PatternNode key='TSO*UNIT'        pattern='UNIT=\s*(\S{1,8})'      optional='true' reset='false'/>
  <PatternNode key='TSO*USERDATA'    pattern='USERDATA=\s*(\S{1,4})'  optional='false' reset='false'/>
  <PatternNode key='TSO*SECLABEL'    pattern='SECLABEL=\s*([^\n]*)\n' optional='true' reset='false'/>
  <PatternNode key='TSO*COMMAND'     pattern='COMMAND=\s*([^\n]*)\n'  optional='true' reset='false'>
    <SubstituteTransform pattern='\s*$' substitute=''/>
  </PatternNode>
</MapTransform>

Sample Input

TSO INFORMATION
---------------
ACCTNUM= ACCT#
HOLDCLASS= X
JOBCLASS= A
MSGCLASS= X
PROC= ISPFPROC
SIZE= 00006133
MAXSIZE= 00000000
SYSOUTCLASS= X
USERDATA= 0000
COMMAND= ISPF PANEL(ISR@390)

Debugging the Example

Running the ParserDebugger with the preceding Parser and test data produces:

Matched regex 'ACCTNUM=\s*([^\n]*)\n' to 'ACCTNUM= ACCT#
' at character 162
    Match value:'ACCT#                                                                  '
    Transform org.identityconnectors.patternparser.SubstituteTransform:'ACCT#                                                                  '->'ACCT#'
Matched regex 'HOLDCLASS=\s*(\S)' to 'HOLDCLASS= X' at character 243
    Match value:'X'
Matched regex 'JOBCLASS=\s*(\S)' to 'JOBCLASS= A' at character 324
    Match value:'A'
Matched regex 'MSGCLASS=\s*(\S)' to 'MSGCLASS= X' at character 405
    Match value:'X'
Matched regex 'PROC=\s*(\S{1,8})' to 'PROC= ISPFPROC' at character 486
    Match value:'ISPFPROC'
Matched regex 'SIZE=\s*(\d+)' to 'SIZE= 00006133' at character 567
    Match value:'00006133'
    Transform org.identityconnectors.patternparser.ClassTransform:'00006133'->'6133'
Matched regex 'MAXSIZE=\s*(\d+)' to 'MAXSIZE= 00000000' at character 648
    Match value:'00000000'
    Transform org.identityconnectors.patternparser.ClassTransform:'00000000'->'0'
Matched regex 'SYSOUTCLASS=\s*(\S)' to 'SYSOUTCLASS= X' at character 729
    Match value:'X'
Matched regex 'USERDATA=\s*(\S{1,4})' to 'USERDATA= 0000' at character 810
    Match value:'0000'
Matched regex 'COMMAND=\s*([^\n]*)\n' to 'COMMAND= ISPF PANEL(ISR@390)
' at character 891
    Match value:'ISPF PANEL(ISR@390)                                                    '
    Transform org.identityconnectors.patternparser.SubstituteTransform:'ISPF PANEL(ISR@390)                                                    '->'ISPF PANEL(ISR@390)'

Results:
    TSO*PROC=ISPFPROC
    TSO*MSGCLASS=X
    TSO*ACCTNUM=ACCT#
    TSO*USERDATA=0000
    TSO*SYSOUTCLASS=X
    TSO*SIZE=6133
    TSO*MAXSIZE=0
    TSO*JOBCLASS=A
    TSO*HOLDCLASS=X
    TSO*COMMAND=ISPF PANEL(ISR@390)
This allows you to see what is captured by each pattern, and how the transforms affect the output at each stage.
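
For readers who want to experiment with the patterns outside the Pattern Parser, the sketch below applies a few of them with plain java.util.regex and collects the results in a Map. It is an approximation under stated assumptions, not the MapTransform implementation: the class ListUserSketch and its helpers are hypothetical, every pattern is searched from the start of the input, the optional and reset attributes are not modelled, and the two transforms are imitated with String.replaceAll and Integer.valueOf.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a MapTransform; uses only the JDK.
public class ListUserSketch {

    /** Returns group 1 of the first match of regex in input, or null when nothing matches. */
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    /** Builds a map from four of the sample patterns; keys that do not match are left out. */
    static Map<String, Object> parse(String input) {
        Map<String, Object> result = new LinkedHashMap<>();

        String acct = capture("ACCTNUM=\\s*([^\\n]*)\\n", input);
        if (acct != null) {
            // Imitates <SubstituteTransform pattern='\s*$' substitute=''/>
            result.put("TSO*ACCTNUM", acct.replaceAll("\\s*$", ""));
        }

        String hold = capture("HOLDCLASS=\\s*(\\S)", input);
        if (hold != null) {
            result.put("TSO*HOLDCLASS", hold);
        }

        String size = capture("SIZE=\\s*(\\d+)", input);
        if (size != null) {
            // Imitates <ClassTransform transform='java.lang.Integer'/>
            result.put("TSO*SIZE", Integer.valueOf(size));
        }

        String command = capture("COMMAND=\\s*([^\\n]*)\\n", input);
        if (command != null) {
            result.put("TSO*COMMAND", command.replaceAll("\\s*$", ""));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "ACCTNUM= ACCT#   \n"
                     + "HOLDCLASS= X\n"
                     + "SIZE= 00006133\n"
                     + "COMMAND= ISPF PANEL(ISR@390)   \n";
        // {TSO*ACCTNUM=ACCT#, TSO*HOLDCLASS=X, TSO*SIZE=6133, TSO*COMMAND=ISPF PANEL(ISR@390)}
        System.out.println(parse(input));
    }
}
```

Because the patterns are unanchored, 'SIZE=' would also match inside 'MAXSIZE=' if that line came first in the input. The sketch relies on SIZE preceding MAXSIZE, as it does in the sample input.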