5.11 Debugging a parser
Debugging a parser implemented with the help of a combinator library has its special challenges. In particular, setting a breakpoint and stepping through the code is not as straightforward as in a regular recursive descent parser. Furthermore, stack traces can be difficult to decipher because of the ubiquitous use of anonymous functions.[1] However, with the help of the techniques we explain in this chapter, working around these issues should be easy.
5.11.1 Setting a breakpoint
Suppose you have a combined parser like
let buggyParser = pipe2 parserA parserB (fun a b -> ...)
and you would like to break into the debugger whenever buggyParser calls parserB. One thing you could try is to set a breakpoint at the beginning of parserB. However, that’s only possible if parserB is not itself a combined parser, and even then you still have the problem that your breakpoint is also triggered whenever parserB is called from any other place in your source. Similarly, a breakpoint you set in pipe2 will probably be triggered by many other parsers besides buggyParser.
Fortunately, there’s a simple workaround if you can modify and recompile the code. Just define a wrapper function like the following:
let BP (p: Parser<_,_>) stream = p stream // set a breakpoint here
Then redefine the buggy parser as
let buggyParser = pipe2 parserA (BP parserB) (fun a b -> ...)
If you now set a breakpoint at the body of the BP function, it will be triggered whenever parserB is called from buggyParser.
With such a wrapper it’s also easy to define a precise conditional breakpoint. For example, if you only want to break once the parser has reached line 100 of the input file, you could use the breakpoint condition stream.Line >= 100.
By the way, you don’t need to set the breakpoint in the debugger. You can also write it directly into the code:
let BP (p: Parser<_,_>) (stream: CharStream<_>) =
    // this will execute much faster than a
    // conditional breakpoint set in the debugger
    if stream.Line >= 100L then
        System.Diagnostics.Debugger.Break()
    p stream
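The hard-coded condition can also be turned into a parameter. The following is only a sketch: breakWhen is a hypothetical helper name, not part of FParsec, and parserA and parserB are stand-ins invented for the example. It additionally checks Debugger.IsAttached so that the parser behaves normally when it runs without a debugger.

```fsharp
#r "nuget: FParsec"
open FParsec
open System.Diagnostics

// Hypothetical helper (not part of FParsec): break into an attached debugger
// before running `p`, but only when `condition` holds for the current stream.
let breakWhen (condition: CharStream<'u> -> bool) (p: Parser<'a,'u>) : Parser<'a,'u> =
    fun stream ->
        if Debugger.IsAttached && condition stream then
            Debugger.Break()
        p stream

// Stand-in parsers, assumed only for this example.
let parserA : Parser<string,unit> = pstring "a" .>> spaces
let parserB : Parser<string,unit> = pstring "b"

// Break only when parserB is called from buggyParser at or after line 100.
let buggyParser =
    pipe2 parserA
          (parserB |> breakWhen (fun stream -> stream.Line >= 100L))
          (fun a b -> a + b)
```

Because the condition is an ordinary function of the stream, the same wrapper can be reused with different conditions at different places in a grammar.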
There are some issues with setting breakpoints in or stepping into anonymous or curried F# functions in Visual Studio 2008. In Visual Studio 2010 many of these issues have been fixed.
If you’re using Visual Studio, don’t forget to switch on the “Suppress JIT optimization on module load” option in the Tools – Options – Debugging – General dialog. And, when possible, use a debug build (of FParsec) for debugging.
5.11.2 Tracing a parser
Occasionally you have a parser that doesn’t work as expected, and playing around with the input or staring at the code isn’t enough to figure out what’s wrong. In such cases the best way to proceed is usually to trace the execution of the parser. Unfortunately, stepping through the parser under a debugger can be quite tedious, because it involves stepping through long sequences of nested invocations of parser combinators. A more convenient approach is often to output tracing information to the console or a logging service.
A simple helper function for printing trace information to the console could look like the following example:
let (<!>) (p: Parser<_,_>) label : Parser<_,_> =
    fun stream ->
        printfn "%A: Entering %s" stream.Position label
        let reply = p stream
        printfn "%A: Leaving %s (%A)" stream.Position label reply.Status
        reply
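If the trace should go to a logging service rather than the console, the output function can be passed in as an argument. The following is a minimal sketch under that assumption: traceWith is a hypothetical name, not an FParsec function, and the in-memory list merely stands in for whatever logger you actually use.

```fsharp
#r "nuget: FParsec"
open FParsec

// Hypothetical helper (not part of FParsec): the same idea as a console
// tracing operator, except that every trace line is handed to `log`.
let traceWith (log: string -> unit) (label: string) (p: Parser<'a,'u>) : Parser<'a,'u> =
    fun stream ->
        log (sprintf "%A: Entering %s" stream.Position label)
        let reply = p stream
        log (sprintf "%A: Leaving %s (%A)" stream.Position label reply.Status)
        reply

// Stand-in for a real logging service: collect the trace lines in memory.
let lines = System.Collections.Generic.List<string>()

let digits : Parser<string,unit> =
    many1Satisfy isDigit |> traceWith (fun s -> lines.Add s) "digits"
```

Each run of digits appends one "Entering" and one "Leaving" line to lines, which can then be inspected or forwarded after the parse has finished.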
To demonstrate how you could use such a tracing operator, let’s try to debug the following buggy (and completely silly) parser:
let number = many1Satisfy isDigit

let emptyElement  = pstring "[]" : Parser<_,unit>
let numberElement = pstring "[" >>. number .>> pstring "]"
let nanElement    = pstring "[NaN]"

let element = choice [emptyElement
                      numberElement
                      nanElement] .>> spaces

let elements : Parser<_,unit> = many element
The following test run shows that the above parser is indeed buggy:
> run elements "[] [123] [NaN]";;
val it : ParserResult<string list,unit> = Failure:
Error in Ln: 1 Col: 11
[] [123] [NaN]
          ^
Unknown Error(s)
You probably don’t need trace information to figure out why the "NaN" bit of the string doesn’t get parsed, but let’s pretend you do. Obviously, there’s something wrong with the element parser. To find out what’s wrong, let’s decorate the element parser and all subparsers with the <!> operator and an appropriate label:
let number = many1Satisfy isDigit <!> "number"

let emptyElement  = pstring "[]"                           <!> "emptyElement"
let numberElement = pstring "[" >>. number .>> pstring "]" <!> "numberElement"
let nanElement    = pstring "[NaN]"                        <!> "nanElement"

let element = choice [emptyElement
                      numberElement
                      nanElement] .>> spaces <!> "element"

let elements : Parser<_,unit> = many element
If you now run the parser on the same input as before, you get the following output:
> run elements "[] [123] [NaN]";;
(Ln: 1, Col: 1): Entering element
(Ln: 1, Col: 1): Entering emptyElement
(Ln: 1, Col: 3): Leaving emptyElement (Ok)
(Ln: 1, Col: 4): Leaving element (Ok)
(Ln: 1, Col: 4): Entering element
(Ln: 1, Col: 4): Entering emptyElement
(Ln: 1, Col: 4): Leaving emptyElement (Error)
(Ln: 1, Col: 4): Entering numberElement
(Ln: 1, Col: 5): Entering number
(Ln: 1, Col: 8): Leaving number (Ok)
(Ln: 1, Col: 9): Leaving numberElement (Ok)
(Ln: 1, Col: 10): Leaving element (Ok)
(Ln: 1, Col: 10): Entering element
(Ln: 1, Col: 10): Entering emptyElement
(Ln: 1, Col: 10): Leaving emptyElement (Error)
(Ln: 1, Col: 10): Entering numberElement
(Ln: 1, Col: 11): Entering number
(Ln: 1, Col: 11): Leaving number (Error)
(Ln: 1, Col: 11): Leaving numberElement (Error)
(Ln: 1, Col: 11): Leaving element (Error)
val it : ParserResult<string list,unit> = Failure:
Error in Ln: 1 Col: 11
[] [123] [NaN]
          ^
Unknown Error(s)
This trace log clearly reveals that the element parser failed because the numberElement parser failed after consuming the left bracket, and thus the choice parser never got to try the nanElement parser. Of course, this issue could be easily avoided by factoring out the bracket parsers from the emptyElement, numberElement and nanElement parsers. Also, if we had used many1SatisfyL instead of many1Satisfy for the number parser, we would have gotten an error message more descriptive than “Unknown Error(s)” (see the chapter on customizing error messages).
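One possible way to apply both suggestions is sketched below. This is an illustration rather than the only fix: the name content is introduced here for the example, and the result strings were chosen to mirror what the original three element parsers returned.

```fsharp
#r "nuget: FParsec"
open FParsec

// many1SatisfyL attaches the label "number" to the parser's error message.
let number : Parser<string,unit> = many1SatisfyL isDigit "number"

// What may appear between the brackets. The results mirror the original
// parsers: the digits, "[NaN]", or "[]" when the brackets are empty.
let content : Parser<string,unit> =
    choice [ number
             stringReturn "NaN" "[NaN]"
             preturn "[]" ]

// The brackets are parsed in exactly one place, so no alternative can fail
// after having consumed the opening bracket.
let element = between (pstring "[") (pstring "]") content .>> spaces

let elements : Parser<string list,unit> = many element
```

With this layout every alternative inside choice either succeeds or fails without consuming input, so choice can always move on to the next alternative. Running elements on the test input from above should now succeed and yield all three elements.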
[1] That said, debugging a parser written with a combinator library is often still easier than debugging one generated by an opaque parser generator tool.