Module gutenberg_post_parser::parser

The Gutenberg post parser.

The Gutenberg post parser is a parser combinator: it provides multiple parsers, also known as combinators, based on the nom project. Each parser receives an input and produces an output of type `IResult`.
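
To make the "input in, result out" shape concrete, here is a minimal sketch of a single combinator. It does not use nom: `ParseResult`, `ParseError` and `tag` are hypothetical stand-ins introduced only for illustration, and the real `IResult` type is defined by nom and differs in its error type.

```rust
/// Hypothetical stand-in for nom's `IResult` (not nom's actual definition):
/// on success, the remaining input and the produced output; on failure, an error.
type ParseResult<'a, O> = Result<(&'a [u8], O), ParseError>;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ParseError {
    /// The input does not start with the expected token.
    Mismatch,
}

/// Recognize a fixed token at the start of the input.
fn tag<'a>(expected: &'static [u8], input: &'a [u8]) -> ParseResult<'a, &'a [u8]> {
    if input.starts_with(expected) {
        // Split the input into the matched token and what remains to be parsed.
        let (matched, rest) = input.split_at(expected.len());
        Ok((rest, matched))
    } else {
        Err(ParseError::Mismatch)
    }
}
```

Returning the remaining input alongside the output is what lets parsers be chained: the next parser simply starts where the previous one stopped.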

Writing the parsers relies heavily on Rust macros. Don't be surprised! To learn more, consult the nom documentation. Nonetheless, a grammar is maintained in EBNF notation below.

Grammar

This section describes the Gutenberg post grammar using the Extended Backus-Naur Form (EBNF) metasyntax notation.

`block_list`

This rule is the axiom of the grammar.

block_list =
    { block | phrase } ;

`block`

A balanced block has an opening and a closing tag. Their names must be identical, i.e. the respective namespaces and names must match. A void block is an “auto-closing” block.

A balanced block can have children, while a void block cannot.

block =
    block_balanced | block_void ;

block_balanced =
    "<!--", [ wss ], "wp:", block_name, wss, block_attributes, [ wss ], "-->",
    block_list,
    "<!--", [ wss ], "/wp:", block_name, [ wss ], "-->" ;

block_void =
    "<!--", [ wss ], "wp:", block_name, wss, block_attributes, [ wss ], "/-->" ;

`block_name`

A block name is a pair composed of a namespace and a name. The namespace is optional and defaults to `core`.

block_name =
    namespaced_block_name | core_block_name ;

namespaced_block_name =
    block_name_part, "/", block_name_part ;

core_block_name =
    block_name_part ;

block_name_part =
    letter, { letter | digit } ;
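
The `block_name` rules can be sketched by hand as follows. This is an illustration only, not the module's nom-based implementation: the function signatures and the `Option<(rest, output)>` return shape are assumptions made for this example, with `letter` and `digit` taken to be lowercase ASCII letters and ASCII digits as the grammar specifies.

```rust
/// Recognize a `block_name_part`: a lowercase ASCII letter followed by any
/// number of lowercase ASCII letters or digits. Returns `(rest, part)`.
fn block_name_part(input: &[u8]) -> Option<(&[u8], &[u8])> {
    // The first byte must be a letter (this also rejects empty input).
    if !input.first()?.is_ascii_lowercase() {
        return None;
    }
    // Extend the match over the following letters and digits.
    let end = input
        .iter()
        .position(|b| !(b.is_ascii_lowercase() || b.is_ascii_digit()))
        .unwrap_or(input.len());
    Some((&input[end..], &input[..end]))
}

/// Recognize a `block_name` and return `(rest, (namespace, name))`,
/// with the namespace defaulting to `core`.
fn block_name(input: &[u8]) -> Option<(&[u8], (&[u8], &[u8]))> {
    let (rest, first) = block_name_part(input)?;
    // Try the namespaced form first: part, "/", part.
    if let Some(after_slash) = rest.strip_prefix(b"/") {
        if let Some((rest, name)) = block_name_part(after_slash) {
            return Some((rest, (first, name)));
        }
    }
    // Otherwise the single part is a core block name.
    Some((rest, (&b"core"[..], first)))
}
```

Trying the namespaced alternative before the core one mirrors the order of the alternatives in the `block_name` rule.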

letter =
    "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" |
    "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" |
    "w" | "x" | "y" | "z" ;

digit =
    "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;

`block_attributes`

Block attributes must be a valid JSON object, as defined in RFC 7159. It must therefore start with `{` and end with `}`:

block_attributes =
    ? RFC 7159, JSON, Section 4. Objects ? ;

`phrase`

A phrase is anything that is not a block.

phrase =
    anything - "<!--" ;

anything =
    ? any bytes ? ;
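
One way to read the `phrase` rule is "take bytes until the next `<!--`". The sketch below follows that reading; it is an assumption about the intended behavior rather than the module's actual code, and the signature is hypothetical.

```rust
/// Recognize a phrase: the bytes before the next `<!--`, or the whole input
/// if no `<!--` occurs. Returns `(rest, phrase)`; an empty phrase is a failure.
fn phrase(input: &[u8]) -> Option<(&[u8], &[u8])> {
    // Find the offset of the first `<!--`, if any.
    let end = input
        .windows(4)
        .position(|w| w == b"<!--")
        .unwrap_or(input.len());
    if end == 0 {
        // The input is empty or starts with `<!--`: there is no phrase here.
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}
```

Failing on an empty match keeps a loop over `{ block | phrase }` from spinning without consuming input.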

`wss`

Whitespace is shortened to ws, and whitespaces is shortened to wss.

wss =
    ws, { ws } ;

ws =
    " "  (* U+0020 *)
  | "\n" (* U+000A *)
  | "\r" (* U+000D *)
  | "\t" (* U+0009 *) ;
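
The `wss` rule (one or more `ws`) can be sketched as below. The names `is_ws` and `wss` and the `Option<(rest, output)>` return shape are hypothetical, chosen for this example only.

```rust
/// Return `true` for the four bytes listed in the `ws` rule.
fn is_ws(byte: u8) -> bool {
    matches!(byte, b' ' | b'\n' | b'\r' | b'\t')
}

/// Recognize one or more whitespace bytes. Returns `(rest, whitespaces)`.
fn wss(input: &[u8]) -> Option<(&[u8], &[u8])> {
    // Offset of the first non-whitespace byte, or the end of the input.
    let end = input.iter().position(|&b| !is_ws(b)).unwrap_or(input.len());
    if end == 0 {
        // `wss` requires at least one `ws`.
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}
```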

Functions

block	Recognize a block.
block_attributes	Recognize block attributes.
block_list	Axiom of the grammar: Recognize a list of blocks.
block_name	Recognize a fully-qualified block name.
block_name_part	Recognize a block name part.
core_block_name	Recognize a globally-namespaced block name.
namespaced_block_name	Recognize a namespaced block name.
phrase	Recognize a phrase.
whitespaces	Recognize whitespaces.