5.1 Parser functions
An FParsec parser is a function that reads input from a text stream. When it succeeds, it returns a result value (e.g. a parsed number or an AST node); when it fails, it returns error messages describing what went wrong.
The following type abbreviation from the Primitives module defines the basic type of parser function supported throughout the FParsec library:
type Parser<'Result,'UserState> = CharStream<'UserState> -> Reply<'Result>
As you can see from this definition, parser functions accept only a single argument: a CharStream<'UserState> instance. The CharStream class is FParsec’s specialized stream type for “text” streams, i.e. streams of Unicode chars. A CharStream can be created directly from a string, or from a file path or a System.IO.Stream. In the latter cases the CharStream takes care of decoding the binary input into UTF‐16 chars, similar to what a System.IO.StreamReader does. What separates CharStream from StreamReader and similar classes is that it comes with some advanced features that make it especially suitable for backtracking parser applications.
We will discuss the purpose of the 'UserState type in more detail in later chapters. For now it’s enough to note that the user state is a user‐definable component of the CharStream state. If you don’t need a user state, you will normally define 'UserState to be unit.
To save some keystrokes and screen real estate, we usually abbreviate 'UserState as 'u.
The Reply<'Result> value returned from a parser function is a simple value type container for the parser result and possible error messages. It contains a status field indicating whether the parser succeeded or not, a field for the result value (of type 'Result) and a field with a possibly empty list of error messages. We will explain these fields in more detail in section 5.3.
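To make the function view concrete, here is a minimal sketch of a hand‐written parser that is applied directly to a CharStream created from a string. The names letterX and succeedsOn are hypothetical and exist only for this illustration; the sketch assumes the CharStream string constructor, the Skip method, the Reply constructors and the expected helper from the FParsec.Error module.

```fsharp
open FParsec
open FParsec.Error

// Hypothetical hand-written parser: succeeds only if the next char is 'x'.
// It is an ordinary function from a CharStream to a Reply.
let letterX : Parser<char, unit> =
    fun stream ->
        if stream.Skip('x') then Reply('x')                // consumed 'x', status is Ok
        else Reply(ReplyStatus.Error, expected "'x'")      // nothing consumed, error message attached

// Hypothetical helper: builds a CharStream over the whole string,
// applies the parser function and inspects the status field of the reply.
let succeedsOn (input: string) : bool =
    use stream = new CharStream<unit>(input, 0, input.Length)
    let reply = letterX stream
    reply.Status = ReplyStatus.Ok
```

In application code you would rarely create the CharStream yourself; the sketch only shows that a parser is nothing more than a function you can call.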
A very basic example of a parser is the asciiLower parser from the CharParsers module:
val asciiLower: Parser<char,'u>
It parses any lower case ASCII char, i.e. any char in the range 'a' ‐ 'z', and, if successful, returns the parsed char as part of its reply.
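As a rough illustration, asciiLower can be tried on a string with the run function from the CharParsers module, which is not discussed in this section and is used here only as a convenient way to apply a parser with a unit user state. The helper name tryLower is hypothetical.

```fsharp
open FParsec

// Hypothetical helper: returns the parsed char, or None if the
// input does not start with a lower case ASCII letter.
let tryLower (input: string) : char option =
    match run asciiLower input with
    | Success (c, _, _) -> Some c
    | Failure _ -> None
```

With this sketch, tryLower "abc" yields Some 'a', while tryLower "Abc" yields None because 'A' lies outside the range 'a' ‐ 'z'.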
Many predefined parsers expect one or more parameter values as arguments. Take for instance the skipString function:
val skipString: string -> Parser<unit,'u>
It takes a string as an argument and returns a parser that skips over this (and only this) string in the input.
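The following sketch shows how such a parameterized parser might be used. The helper name skipsHello is hypothetical, and run is again assumed as the way to apply the parser to a string.

```fsharp
open FParsec

// skipString "hello" is itself a parser; applying skipString to a
// different string would yield a different parser.
let skipsHello (input: string) : bool =
    match run (skipString "hello") input with
    | Success ((), _, _) -> true     // the result value is unit
    | Failure _ -> false
```

Here skipsHello "hello world" is true, since the input starts with the expected string, and skipsHello "help" is false.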
Implementing parser grammars with FParsec usually means composing parsers for higher‐level grammar rules from parsers for lower‐level rules. You start with simple parsers for the leaf nodes of your grammar and then work your way up step‐by‐step until you eventually obtain a parser for the complete grammar. The simple representation of parsers as functions makes this composition particularly easy and allows for a straightforward and intuitive implementation of the library primitives.
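As a small sketch of this bottom‐up style, the parsers below are combined into a parser for a hypothetical rule of the form "let" followed by whitespace and a lower case name. The rule and the names keyword, identifier, letBinding and parseLet are invented for this illustration; the >>. operator, spaces1 and many1Chars are assumed from the Primitives and CharParsers modules and are not introduced in this section.

```fsharp
open FParsec

// Leaf parsers for the lowest-level rules.
let keyword : Parser<unit, unit> = skipString "let"
let identifier : Parser<string, unit> = many1Chars asciiLower

// Higher-level rule: keyword, at least one whitespace char, then a name.
// p1 >>. p2 applies both parsers in sequence and keeps the result of p2.
let letBinding : Parser<string, unit> = keyword >>. spaces1 >>. identifier

// Hypothetical helper for trying the composed parser on a string.
let parseLet (input: string) : string option =
    match run letBinding input with
    | Success (name, _, _) -> Some name
    | Failure _ -> None
```

With these definitions, parseLet "let foo" yields Some "foo", while parseLet "letfoo" yields None because spaces1 requires at least one whitespace char.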