# 5.4 Applying parsers in sequence

Now that we have discussed how `Parser` functions work, we can start explaining how FParsec’s parser combinators work.

In this chapter we will discuss combinators that allow you to apply multiple parsers in sequence, i.e. parse the beginning of the input with the first parser, then parse the following input with the second parser, and so on.

## 5.4.1 The definition of `>>.`

The simplest combinators for sequentially applying two parsers are

```val (>>.):   Parser<'a,'u> -> Parser<'b,'u> -> Parser<'b,'u>
val (.>>):   Parser<'a,'u> -> Parser<'b,'u> -> Parser<'a,'u>
```

Both operators take two parsers as arguments and return a combined parser that applies the two parsers in sequence. As you can infer from the type signatures, `p1 >>. p2` returns the result of `p2` and `p1 .>> p2` the result of `p1`. In each case the point points to the parser whose result is returned.

In order to explain exactly what it means to apply two parser in sequence, we give a full definition of the `>>.` operator:

```let (>>.) (p1: Parser<'a,'u>) (p2: Parser<'b,'u>) =
fun stream ->
let stateTag = stream.StateTag
let mutable reply2 = p2 stream
if stateTag = stream.StateTag then                        // (1)
else // reconstruct error reply with new result type
```

The implementation of `p1 >>. p2` should be quite self‐explanatory: First `p1` is applied to the input `stream`. If `p1` succeeds, i.e. if the status of `reply1` is `Ok`, `p2` is applied to stream and the reply of `p2` then becomes the reply of the combined parser. However, if `p1` fails, its reply is immediately propagated as the reply of the combined parser. Since `reply1` has type `Reply<'a,'u>` but `p1 >>. p2` needs to return a `Reply<'b,'u>`, the error reply needs to be reconstructed with a new result type before it can be returned.

## 5.4.2 Merging error messages

We mentioned earlier that the error messages returned in the `Reply.Error` field implicitly refer to the state of the `CharStream` at the time the parser returns. In particular the error messages refer to the then current stream position. Since the messages do not contain themselves a separate record of the error position they can only be interpreted together with the `CharStream` state.

When `p2` does not change the parser state, the error messages from both replies refer to the state of the `CharStream` as it is when `p1 >>. p2` returns. Thus, the combinator needs to merge the (immutable) `ErrorMessageList`s from both replies, so that the returned list contains all the relevant error messages (see the line marked with `(2)`).

In order to check whether the `CharStream` state has changed, the combinator does not compare the full states from before and after `p2` is invoked. Instead it only compares the `StateTag` values (see line `(1)`). This improves performance and — for most practical purpose — is almost equivalent to comparing the full state, as we will discuss below.

Note

The way `(>>.)` handles errors and merges error messages is a template for all combinators in FParsec that perform multiple sequential parser invocations.

You may wonder why the error messages get merged even though `p1` succeeded. The somewhat counterintuitive reason is that parsers can return nonempty error message lists even when they don’t fail. For example, a parser that skips over the optional string `"str"` will return `Reply(Ok, (), expectedString "str")` if it doesn’t find the string in the input. In this case the error message describes what further input the parser could have parsed at the current stream position. If subsequently a parser fails at the same position, all error messages for the same position can be aggregated to give the user as much information as possible about what went wrong and what alternative inputs could have been parsed at the given position.

The following sample demonstrates the helpful effect of this error handling behaviour:

```let str s = pstring s

let oneOrTwoInts =
str "(" >>. tuple2 pint32 (opt (str "," >>. spaces >>. pint32)) .>> str ")"
```
```> run oneOrTwoInts "(1 2)";;
val it : ParserResult<(int32 * int32 option),unit> = Failure:
Error in Ln: 1 Col: 3
(1 2)
^
Expecting: ')' or ','
```

This error message wouldn’t mention the possibility of a missing comma if the `.>>` combinator did not merge error messages for the same position when the left‐hand side parser succeeds.

## 5.4.3 The `StateTag`

Parser combinators often need to check whether a parser has changed the `CharStream` state. In a typical FParsec application these checks are performed so frequently that an efficient implementation is important for the overall parser performance. Since a straightforward comparison of the complete `CharStream` states can be quite expensive, the `CharStream` class provides a shortcut for this purpose: the `StateTag`.

The `StateTag` is a simple integer counter that is incremented every time a `CharStream` method changes the state. Thus, if the `StateTag` hasn’t changed, you can safely infer that the state hasn’t changed either. Except for some special cases, the opposite is also true: if the `StateTag` has changed, the state has changed too.

In the following special cases checking whether the `StateTag` has changed is not equivalent to checking whether the `CharStream` state has changed, because the tag may change even though the state doesn’t:

• A parser calls the basic `Skip` or `Read` methods with a 0 offset or an empty argument string.
• A parser seeks the `CharStream` to the current position or replaces the user state with the current value.
• A parser makes several calls to `CharStream` methods and in later calls undoes the changes it made in earlier calls.

The first and second cases only have practical relevance for generic or parameterized parsers and can be simply avoided by checking the arguments before calling the respective `CharStream` methods. The third case only arises in the context of backtracking and it too can be easily avoided, either by using the `BacktrackTo` method for backtracking or by manually restoring the `StateTag` after the backtracking.

In practice these special cases are extremely rare, usually without consequences for the parser behaviour and always easily avoidable. Hence, FParsec combinators make free use of the `StateTag` to check whether a parser has changed the `CharStream` state.

## 5.4.4 Generalizing `>>.`

The parsers `p1 .>> p2` and `p1 >>. p2` only return the results of `p1` and `p2` respectively. If you want to combine the results from both `p1` and `p2`, you could use the `pipe2` combinator instead:

`val pipe2: Parser<'a,'u> -> Parser<'b,'u> -> ('a -> 'b -> 'c) -> Parser<'c,'u>`

The parser `pipe2 p1 p2 f` will apply `p1` and `p2` in sequence, exactly like `>>.`, but instead of returning one of the result values of `p1` and `p2` it will return the result of the function application `f x1 x2`, where `x1` and `x2` are the results returned by `p1` and `p2`.

There are also `pipe3`, `pipe4` and `pipe5` combinators, in case you need more than two arguments. Often these combinators are used to pass arguments to object constructors, like in the following example of a parser for a comma‐separated list of XYZ coordinates:

```type Data = Point of float*float*float

let ws = spaces
let str s = pstring s .>> ws
let number = pfloat .>> ws
let point = pipe3 number (str "," >>. number) (str "," >>. number)
(fun x y z -> Point(x, y, z))
```
```> run point "1, 2, 3";;
val it : ParserResult<Data,unit> = Success: Point (1.0,2.0,3.0)
```

If you just want to return the parsed values as a tuple, you can use the predefined `tuple2-5` parsers. For example, `tuple2 p1 p2` is equivalent to `pipe2 p1 p2 (fun x1 x2 -> (x1, x2)`.

`tuple2` is also available under the operator name `.>>.`, so that you can write `p1 .>>. p2` instead of `tuple2 p1 p2`.

There is no `pipe1` combinator, but there is an operator for the same purpose:

```val (|>>): Parser<'a,'u> -> ('a -> 'b) -> Parser<'b, 'u>
```

This operator is used similarly to the F#’s ubiquitous pipeline operator `|>`:

```type Expression = Number of int
| Identifier of string

let number = pint32 |>> Number
```
```> run number "123";;
val it : ParserResult<Expression,unit> = Success: Number 123
```

## 5.4.5 The `>>=` combinator

All the sequencing and piping combinators we have discussed so far could be implemented with the help of the “bind” combinator:

`val (>>=): Parser<'a,'u> -> ('a -> Parser<'b,'u>) -> Parser<'b,'u>`

Instead of two parsers this combinator takes a parser and a function producing a parser as arguments. The combined parser `p >>= f` first applies the parser `p` to the input, then it applies the function `f` to the result returned by `p` and finally it applies the parser returned by `f` to the input. If we knew in advance that `p` returns `x` then `p >>= f` would be equivalent to `p >>. (f x)`.

The `>>=` combinator is quite versatile. For example, the following code implements five of the previously discussed combinators in terms of `>>=` and the trivial `preturn` primitive:

```let preturn x = fun stream -> Reply(x)

let (|>>) p  f    = p  >>= fun x -> preturn (f x)
let (.>>) p1 p2   = p1 >>= fun x -> p2 >>= fun _ -> preturn x
let (>>.) p1 p2   = p1 >>= fun _ -> p2 >>= fun y -> preturn y
let (.>>.) p1 p2  = p1 >>= fun x -> p2 >>= fun y -> preturn (x, y)
let pipe2 p1 p2 f = p1 >>= fun x -> p2 >>= fun y -> preturn (f x y)
```

In typical FParsec code `>>=` is only seldomly used, because in many situations where `>>=` could in principle be used one of the other specialized operators is more convenient to use and faster. However, on a conceptual level this combinator is important, because its generality allows us to define and test many combinators through their equivalence with a parser defined in terms of `>>=`. This combinator is also significant for the role it plays in the monadic parser construction syntax, see section 5.10.

Footnotes:
 Of course, this doesn’t apply if you manually set back the `StateTag` to the old value. There is also the purely theoretical possibility that the `StateTag` has overflown and was incremented exactly 264 times (or 232 if you define the `SMALL_STATETAG` conditional compiler symbol).