The Dragon Parser Generator
6 Advanced scanner feed
A complete dragon parser requires implementing the virtual methods nextChar and backChar of the parser super class. nextChar provides the next character from the input stream to the scanner. backChar pushes the most recently provided character back onto the stream; dragon uses lookahead characters to resolve ambiguous parse states, so the scanner must be able to return a character it has read ahead. Furthermore, special token definitions such as string tokens must be handled within the nextChar method. The following sample implements a small parser for a comma-separated list of string values (string values are enclosed in single quotes). First, we give the dragon parser definition.
    HEADER
    IGNORETOKEN STRINGVAL
    SEPIGNORE '\t'
    SEPIGNORE ' '
    SEPSIGN ','
    END

    TOKENSET
    ',' : KOMMA
    END

    PRODUCTIONSET
    Start : StringList
    StringList : String KOMMA StringList
    StringList : String
    String : STRINGVAL ; printString
    END
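The nextChar/backChar contract described above can be sketched in a few lines of Go, the language used in section 6.2. This is a minimal sketch and not dragon's generated code: the charFeed type and the readWord helper are hypothetical names chosen for illustration, and only the method names NextChar and BackChar follow the text. It shows why a pushback is needed: the reader only knows that a word has ended once it has already read one character too many.

```go
package main

import "fmt"

// charFeed is a hypothetical stand-in for the scanner feed: it hands out
// one byte at a time and lets the caller push the last byte back.
type charFeed struct {
	input string
	pos   int
}

// NextChar returns the next byte of the input, or 0 once it is exhausted.
func (f *charFeed) NextChar() byte {
	if f.pos >= len(f.input) {
		return 0
	}
	c := f.input[f.pos]
	f.pos++
	return c
}

// BackChar pushes the most recently delivered byte back onto the feed.
func (f *charFeed) BackChar() {
	if f.pos > 0 {
		f.pos--
	}
}

// readWord collects lowercase letters until a non-letter shows up. The
// non-letter is lookahead only, so it is pushed back for the next reader.
func readWord(f *charFeed) string {
	word := []byte{}
	for {
		c := f.NextChar()
		if c == 0 {
			return string(word)
		}
		if c < 'a' || c > 'z' {
			f.BackChar()
			return string(word)
		}
		word = append(word, c)
	}
}

func main() {
	f := &charFeed{input: "ab,cd"}
	first := readWord(f)
	sep := f.NextChar()
	second := readWord(f)
	fmt.Printf("%s %c %s\n", first, sep, second) // prints "ab , cd"
}
```

Without the BackChar call in readWord, the separator would be swallowed by the word reader and never reach the code that expects it.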
6.1 Advanced C++ scanner
The parser definition above is stored in the file StringParser.def and is then processed by dragon.
    $ dragon -t c++ -p StringParser
As already discussed, the generation step should produce the files StringParser.h and StringParser.cc. Now we have to implement the main module for the string parser.
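    #include <cctype>
    #include <iostream>
    #include "StringParser.h"

    #define MAXSTRINGLEN 100

    class StringParserImp : public StringParser {
    public:
        StringParserImp(char* inputStream) {
            _pC = inputStream;
            _i = 0;
        }
        ~StringParserImp() {}

        // Provides the next input character to the scanner, 0 at end of input
        char nextChar() {
            if (_pC[_i]) {
                if (_pC[_i] == '\'') {
                    // opening quote: announce STRINGVAL and read the string body
                    setReserved(STRINGVAL);
                    _i++;
                    int j = 0;
                    while (_pC[_i] && _pC[_i] != '\'') {
                        // keep one slot free for the terminating zero
                        if (j == MAXSTRINGLEN - 1) {
                            throw Exception("StringParserImp", "stringbuf exceeded");
                        }
                        _stringBuf[j] = _pC[_i];
                        j++;
                        _i++;
                    }
                    _stringBufLen = j + 1;
                    _stringBuf[j] = 0;
                    if (_pC[_i]) _i++;   // skip the closing quote
                    return 0;
                }
                _i++;
                return (toupper(_pC[_i-1]));
            }
            return 0;
        }

        // Pushes the last character back to the input
        void backChar() {
            _i--;
        }

        // Semantic action of the production String : STRINGVAL
        void printString() {
            std::cout << "String is " << _stringBuf << std::endl;
        }

    private:
        char _stringBuf[MAXSTRINGLEN];
        int _stringBufLen;
        char* _pC;
        int _i;
    };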
If the nextChar method detects a quote character ('), it calls the super class method setReserved with the detected token value (in our case STRINGVAL). This call tells the super class to use the given token value (STRINGVAL) as the next token. Note: the token value passed to setReserved must correspond to the token named in the IGNORETOKEN statement (or IGNORE, for backward compatibility) of the parser definition file.
The value of the string is stored in a dedicated character array _stringBuf managed by the derived parser implementation class. Once the string value is completely scanned, nextChar returns a stream value of zero to the scanner. Finally, the main routine has to be written. It may look like the following:
    #include "StringParser.h"
    #include "StringParserImp.h"

    int main(int argc, char** argv) {
        StringParserImp stringParser(argv[1]);
        try {
            stringParser.parse();
        } catch ( Exception e ) {
            ListT
6.2 Advanced GoLang scanner
The Go implementation of a string-handling parser is quite similar. Since Go has no inheritance, the reserved token information is stored in the parser implementation (MyAction). The parser implementation type must also implement the methods IsReserved and GetAndResetReserved.
    // code omitted
    // ...

    type MyAction struct {
        feed          string
        i             int
        tl            []string
        isReserved    bool
        reservedToken G1.Token
    }

    // code omitted
    // ...

    // NextChar returns the next input byte, or 0 at the end of the feed.
    func (ma *MyAction) NextChar() byte {
        if ma.i >= len(ma.feed) {
            return 0
        }
        fmt.Printf("Next Char = %c\n", ma.feed[ma.i])
        c := ma.feed[ma.i]
        // for use with more sophisticated grammars, we detect quotes
        if c == '\'' {
            fmt.Println("Reading string")
            ma.i++
            // consume the string body up to the closing quote
            for ma.i < len(ma.feed) && ma.feed[ma.i] != '\'' {
                fmt.Printf("Next String Char = %c\n", ma.feed[ma.i])
                ma.isReserved = true
                ma.reservedToken = G1.Token_STRINGVAL
                ma.i++
            }
            ma.i++ // skip the closing quote
            if ma.i < len(ma.feed) {
                return ma.feed[ma.i]
            }
            return 0
        }
        ma.i++
        return c
    }

    // BackChar pushes the last character back to the feed.
    func (ma *MyAction) BackChar() {
        fmt.Println("BackChar called")
        ma.i--
    }

    // IsReserved reports whether NextChar has detected a reserved token.
    func (ma *MyAction) IsReserved() bool {
        return ma.isReserved
    }

    // GetAndResetReserved returns the reserved token and clears the flag.
    func (ma *MyAction) GetAndResetReserved() G1.Token {
        ma.isReserved = false
        return ma.reservedToken
    }

    // ...
If a string token is detected by NextChar, the reserved flag is set to true and the reserved token is set to STRINGVAL. These values are made available to the calling parser through the interface methods IsReserved and GetAndResetReserved. Please note that GetAndResetReserved is also responsible for resetting the reserved flag to false, so subsequent calls to IsReserved from the parser return false.