× News Cego SysMT Croom Web Statistics Impressum
SysMT Logo

The Dragon Parser Generator

4   Generating the parser
Back to TOC

The tokenset and the productionset must be written in a file called < parsername >.def. This file is used as input for the dragon parser generator. You just call the dragon from a command line with

$ dragon -p < parsername > ( without the .def postfix )

If the description is correct, this call results in the generation of two files ( < parsername >.h and < parsername >.cc ), containing appropriate C++-Source code. You can compile this classes using a C++-Compiler ( gcc ) and the base library, which is provided in a separate package.

4.1   Parser conflicts

As mentioned in the introduction, dragon is limited to LR(1) grammars. Sometimes, this leads to conflicts which cannot be resolved by dragon. In such cases, dragon fails with an exception message containing either an shift/reduce or reduce/reduce conflict.

Then the grammar description must be changed to make the parser generation work. In the following, we discuss to typical cases, where the grammar description is not appropriate and mus be changed.

4.1.1   Shift/reduce conflict

If dragon fails with a shift/reduce-conflict, the analyse algorihm has reached a state, where it cannot decide to shift ahead to read more input or to reduce at the current point to an appropriate production, because both decisions are valid. In this case, the grammar is ambiguous. Sample:

TOKENSET
'a'     :	a
'b'     : 	b
'c'     :   c
END

PRODUCTIONSET
Start   :       C a
C       :       c a
C       :       c a a
END

With the grammar defined above, we will follow the parsers algorithm to parse the word caaa. With the first sign c, the token c is detected. There is no valid production to reduce, so the parser goes ahead to read more input. Now the sign a is read and the token a is detected. With the next read token a the parser results in the conflict. It either could reduce the stream with the production C-> ca, because the second token a is a valid follow up token ( Start -> Ca ) production C. But the parser also could shift ahead to read more input, because there is another valid production for the stream ( C-> caa ). There is no way for the LR-1 parser to resolve this situation. As a consequence, generating the parser code for this grammar would result in the following error message

$ dragon -p G2
Reading Header ...
Reading Terminals ...
Reading Productions ...
Checking Productions ...
Analysing ...
Reached LR(1) elements: 
1...3...5...6...
Dragon : shift/reduce conflict with production C-> c a.[a]

To solve the problem, you should modify the grammar in way, that the parser can derive a valid word of the described language in a unique way.

TOKENSET
'a'     :	a
'b'     : 	b
'c'     :   c
END

PRODUCTIONSET
Start   :       C
C       :       c a a 
C       :       c a a a
END

4.1.2   Reduce/reduce conflict

If dragon fails with a reduce/reduce-conflict, the algorithm has reached a state where more than one production could be reduced, so dragon can not decide, which one to take. Sample:

TOKENSET
'a'     :       a
'b'     :       b
'c'     :       c
END

PRODUCTIONSET
Start   :       S a  ; A1
S       :       C b ; A2
S       :       D b
C       :       c a ; A3
D       :       c a
END