5.8 Customizing error messages
Generating relevant and informative parser error messages is one of FParsec’s greatest strengths. The top‐down approach of recursive‐descent parsing guarantees that there is always enough context to describe the exact cause of a parser error and how it could be avoided. FParsec exploits this context to automatically generate descriptive error messages whenever possible. This chapter explains how you can ensure with minimal effort that your parser always produces understandable error messages.
As we already described in detail in section 5.4.2, error reporting in FParsec is based on the following two principles:
- Parsers that fail or could have consumed more input return as part of their Reply an ErrorMessageList describing the input they expected or the reason they failed.
- Parser combinators aggregate all error messages that apply to the same input position and then propagate these error messages as appropriate.
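The aggregation described in the second principle can be illustrated with a small sketch. The parser name aOrB and the `#r` directive are assumptions made for this example (the directive presumes F# Interactive with NuGet access); only `pstring`, `<|>` and `run` are FParsec functions.

```fsharp
#r "nuget: FParsec" // assumption: F# Interactive with NuGet access
open FParsec

// Both alternatives fail at column 1 without consuming input,
// so <|> merges their "expected" messages into a single error.
let aOrB: Parser<string, unit> = pstring "a" <|> pstring "b"

let mergedMessage =
    match run aOrB "c" with
    | Success (value, _, _) -> sprintf "parsed %s" value
    | Failure (message, _, _) -> message

printfn "%s" mergedMessage
```

The printed message should list both alternatives in one Expecting line, because both errors refer to the same input position.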
The various error messages in the previous chapters demonstrate that the built‐in error reporting usually works quite well even without any intervention by the parser author. However, sometimes FParsec lacks the information necessary to produce an informative error message by itself.
Consider for example the many1Satisfy f parser, which parses a string consisting of one or more chars satisfying the predicate function f. If this parser fails to parse at least one char, the generated error is not very helpful:
> run (many1Satisfy isLetter) "123";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 1
123
^
Unknown Error(s)
The problem here is that many1Satisfy can’t describe what chars the predicate function accepts. Hence, when you don’t use many1Satisfy as part of a combined parser that takes care of a potential error, you should replace it with many1SatisfyL, which allows you to describe the accepted input with a label (hence the “L”):
> run (many1SatisfyL isLetter "identifier") "123";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 1
123
^
Expecting: identifier
There are also labelled variants of other parsers and combinators, for example choiceL and notFollowedByL.
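As a brief illustration of a labelled combinator, the following sketch uses choiceL to report a single label instead of the individual alternatives. The parser name booleanLiteral, the label text and the `#r` directive are assumptions made for this example.

```fsharp
#r "nuget: FParsec" // assumption: F# Interactive with NuGet access
open FParsec

// choiceL tries the parsers in order; if all of them fail without
// consuming input, the error mentions only the given label.
let booleanLiteral: Parser<string, unit> =
    choiceL [ pstring "true"; pstring "false" ] "boolean literal"

let booleanError =
    match run booleanLiteral "maybe" with
    | Success (value, _, _) -> sprintf "parsed %s" value
    | Failure (message, _, _) -> message

printfn "%s" booleanError
```

With this definition the error for the input "maybe" should read "Expecting: boolean literal" rather than naming 'true' and 'false' separately.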
If there is no labelled parser variant or you want to replace a predefined error message, you can always use the labelling operator
val (<?>): Parser<'a,'u> -> string -> Parser<'a,'u>
The parser p <?> label behaves like p, except that the error messages are replaced with expected label if p does not change the parser state (usually because p failed).
For example, if FParsec didn’t provide many1SatisfyL, you could define it yourself as
let many1SatisfyL f label = many1Satisfy f <?> label
The labelling operator is particularly useful for producing error messages in terms of higher‐level grammar productions instead of error messages in terms of lower‐level component parsers. Suppose you want to parse a string literal with the following parser
let literal_ = between (pstring "\"") (pstring "\"") (manySatisfy ((<>) '"'))
If this parser encounters input that doesn’t start with a double quote it will fail with the error message produced by the parser for the opening quote:
> run literal_ "123";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 1
123
^
Expecting: '"'
In situations like these an error message that mentions the aggregate thing you’re trying to parse will often be more helpful:
let literal = literal_ <?> "string literal in double quotes"
> run literal "123";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 1
123
^
Expecting: string literal in double quotes
Note that <?> only replaces the error message if the parser doesn’t consume input. For example, our literal parser won’t mention that we’re trying to parse a string literal if it fails after the initial double quote:
> run literal "\"abc def";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 9
"abc def
        ^
Note: The error occurred at the end of the input stream.
Expecting: '"'
With the compound labelling operator <??> you can make sure that the compound gets mentioned even if the parser fails after consuming input:
let literal = literal_ <??> "string literal in double quotes"
> run literal "\"abc def";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 1
"abc def
^
Expecting: string literal in double quotes

string literal in double quotes could not be parsed because:
  Error in Ln: 1 Col: 9
  "abc def
          ^
  Note: The error occurred at the end of the input stream.
  Expecting: '"'
If you don’t like the formatting of these error messages, you can write a custom formatter for your application. The data structure in which error messages are stored is easy to query and process. See the reference for the Error module.
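A minimal sketch of such a custom formatter is shown below. The function name summarize and the one-line output format are assumptions made for this example, as is the `#r` directive; the sketch only reports Expected and ExpectedString messages and ignores all other message types.

```fsharp
#r "nuget: FParsec" // assumption: F# Interactive with NuGet access
open FParsec

// Hypothetical helper: condenses a failed ParserResult into one line.
let summarize (result: ParserResult<'a, 'u>) =
    match result with
    | Success _ -> "ok"
    | Failure (_, error, _) ->
        // error.Messages is an ErrorMessageList; convert it to an array
        // and keep only the messages that describe expected input.
        let expectedItems =
            ErrorMessageList.ToSortedArray(error.Messages)
            |> Array.choose (function
                | Expected label -> Some label
                | ExpectedString text -> Some (sprintf "'%s'" text)
                | _ -> None)
        sprintf "line %d, column %d: expected %s"
            error.Position.Line error.Position.Column
            (String.concat " or " expectedItems)

let summary = summarize (run (many1SatisfyL isLetter "identifier") "123")
printfn "%s" summary
```

A real formatter would probably also handle the Unexpected, Message, nested and compound message types described in the reference for the Error module.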
The parsers we discussed so far in this chapter only generated Expected error messages, but FParsec also supports other types of error messages. For example, the notFollowedByL parser generates Unexpected error messages:
> run (notFollowedByL spaces "whitespace") " ";;
val it : ParserResult<unit,unit> = Failure:
Error in Ln: 1 Col: 1
 
^
Unexpected: whitespace
Error messages that don’t fit into the Expected and Unexpected categories can be produced with the fail and failFatally primitives:
let theory =
    charsTillString "3) " true System.Int32.MaxValue
    >>. (pstring "profit" <|> fail "So much about that theory ... ;-)")

let practice = "1) Write open source library 2) ??? 3) lot's of unpaid work"
> run theory practice;;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 40
1) Write open source library 2) ??? 3) lot's of unpaid work
                                       ^
Expecting: 'profit'
Other error messages:
  So much about that theory ... ;-)
If you can’t get the built‐in operators and parsers to produce the error message you need, you can always drop down one API level and write a special‐purpose parser combinator.
The following example shows how you can define a custom between combinator that includes the position of the opening delimiter as part of the error message that gets generated when the closing delimiter cannot be parsed.
let betweenL (popen: Parser<_,_>) (pclose: Parser<_,_>) (p: Parser<_,_>) label =
    let expectedLabel = expected label

    let notClosedError (pos: Position) =
        messageError (sprintf "The %s opened at %s was not closed."
                              label (pos.ToString()))

    fun (stream: CharStream<_>) ->
        // The following code might look a bit complicated, but that's mainly
        // because we manually apply three parsers in sequence and have to merge
        // the errors when they refer to the same parser state.
        let state0 = stream.State
        let reply1 = popen stream
        if reply1.Status = Ok then
            let stateTag1 = stream.StateTag
            let reply2 = p stream
            let error2 = if stateTag1 <> stream.StateTag then reply2.Error
                         else mergeErrors reply1.Error reply2.Error
            if reply2.Status = Ok then
                let stateTag2 = stream.StateTag
                let reply3 = pclose stream
                let error3 = if stateTag2 <> stream.StateTag then reply3.Error
                             else mergeErrors error2 reply3.Error
                if reply3.Status = Ok then
                    Reply(Ok, reply2.Result, error3)
                else
                    Reply(reply3.Status,
                          mergeErrors error3 (notClosedError (state0.GetPosition(stream))))
            else
                Reply(reply2.Status, reply2.Error)
        else
            let error = if state0.Tag <> stream.StateTag then reply1.Error
                        else expectedLabel
            Reply(reply1.Status, error)
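The behaviour of the betweenL combinator differs from that of the standard between combinator in two ways:
- If popen fails without changing the parser state, betweenL popen pclose p label fails with expected label instead of the error messages generated by popen.
- If pclose fails, betweenL adds a message to the error that states the position at which the opening delimiter was parsed.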
The following tests demonstrate this behaviour:
let stringLiteral =
    betweenL (pstring "\"") (pstring "\"")
             (manySatisfy ((<>) '"'))
             "string literal in double quotes"
> run stringLiteral "\"test\"";;
val it : ParserResult<string,unit> = Success: "test"

> run stringLiteral "\"test";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 6
"test
     ^
Note: The error occurred at the end of the input stream.
Expecting: '"'
Other messages:
  The string literal in double quotes opened at (Ln: 1, Col: 1) was not closed.

> run stringLiteral "test";;
val it : ParserResult<string,unit> = Failure:
Error in Ln: 1 Col: 1
test
^
Expecting: string literal in double quotes