# markdown-it design principles

## Data flow

Input data is piped via nested chains of rules. There are 3 nested chains -
`core`, `block` & `inline`:

```
core
    core.rule1
    ... (none yet, you can patch input string here)

    block
        block.rule1
        ...
        block.ruleX

    core.ruleXX
    ... (references, abbreviations, footnotes)

    inline (applied to each block token with "inline" type)
        inline.rule1
        ...
        inline.ruleX

    core.ruleYY
    ... (typographer, linkifier)
```
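
Custom rules can be plugged into any of these chains through the corresponding
ruler (`md.core.ruler`, `md.block.ruler`, `md.inline.ruler`). A minimal sketch -
the rule names and empty bodies below are placeholders, not real markdown-it rules:

```js
'use strict';

const md = require('markdown-it')();

// Hypothetical core rule: runs once per render, sees the whole token array.
md.core.ruler.push('my_core_rule', function (state) {
  // state.tokens, state.env and state.md are available here.
});

// Hypothetical block rule: tries to consume lines startLine..endLine.
md.block.ruler.before('paragraph', 'my_block_rule', function (state, startLine, endLine, silent) {
  return false; // return false when the rule does not match
});

// Hypothetical inline rule: runs on the content of each "inline" token.
md.inline.ruler.after('emphasis', 'my_inline_rule', function (state, silent) {
  return false; // return false when nothing matches at state.pos
});
```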

Mutable data are:

- array of tokens
- `env` sandbox

Tokens are the "main" data, but some rules can be "split" across several chains
and need the sandbox to exchange data. Also, `env` can be used to inject per-render
variables for your custom parse and render rules.

Each chain (core / block / inline) has an independent `state` object, to isolate
data and protect the code from clutter.

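
For example (a minimal sketch - the rule name and the collected field are
hypothetical), data passed as `env` travels through each chain's `state` and
comes back from the render call:

```js
'use strict';

const md = require('markdown-it')();

// Hypothetical core rule that uses the `env` sandbox to export data.
md.core.ruler.push('collect_headings', function (state) {
  state.env.headings = state.env.headings || [];
  state.tokens.forEach(function (token) {
    if (token.type === 'heading_open') { state.env.headings.push(token); }
  });
});

// Renderer rules receive `env` as well: function (tokens, idx, options, env, self).

const env = {};                              // per-render sandbox
const html = md.render('# Title\n\ntext', env);
// After rendering, env.headings holds the heading_open tokens collected above.
```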
## Token stream

Instead of a traditional AST we use a more low-level data representation - tokens.
The difference is simple:

- Tokens are a sequence (an Array).
- Opening and closing tags are separate tokens.
- There are special token objects, "inline containers", having nested token
  sequences with inline markup (bold, italic, text, ...).


Each token has common fields:

- __type__ - token name.
- __level__ - nesting level, useful to seek the closing pair.
- __lines__ - [begin, end], for block tokens only. Range of input lines
  compiled into this token.


An inline container (`type === "inline"`) has additional properties:

- __content__ - raw text, unparsed inline content.
- __children__ - token stream for the parsed content.


In total, the token stream is:

- On the top level - an array of paired or single "block" tokens:
  - open/close for headers, lists, blockquotes, paragraphs, ...
  - code blocks, fenced blocks, horizontal rules, html blocks, inline containers
- Each inline container has a `.children` property with the token stream for the inline content:
  - open/close for strong, em, link, code, ...
  - text, line breaks

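
To see this structure you can parse without rendering and walk the result.
A minimal sketch (the `dump` helper is not part of markdown-it):

```js
'use strict';

const md = require('markdown-it')();
const tokens = md.parse('# Hello\n\n*world*\n', {});

// Print type and nesting level of every token, descending into inline containers.
function dump (tokens, indent) {
  tokens.forEach(function (token) {
    console.log(indent + token.type + ' (level ' + token.level + ')');
    if (token.type === 'inline') { dump(token.children, indent + '  '); }
  });
}

dump(tokens, '');
// heading_open, inline (text child), heading_close,
// paragraph_open, inline (em_open, text, em_close), paragraph_close
```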

Why not AST? Because it's not needed for our tasks. We follow the KISS principle.
If you wish - you can call the parser without the renderer and convert the token
stream to an AST.

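
A rough sketch of such a conversion, using only the token fields described above
(this helper is not part of markdown-it), pairs the `*_open` / `*_close` tokens
into nested nodes:

```js
'use strict';

// Fold the flat token sequence into a tree by pairing *_open / *_close tokens.
function tokensToTree (tokens) {
  const root = { type: 'root', children: [] };
  const stack = [root];

  tokens.forEach(function (token) {
    const parent = stack[stack.length - 1];

    if (/_open$/.test(token.type)) {
      const node = { type: token.type.replace(/_open$/, ''), children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (/_close$/.test(token.type)) {
      stack.pop();
    } else if (token.type === 'inline') {
      // Recurse into the inline container's own token stream.
      parent.children.push({ type: 'inline', children: tokensToTree(token.children).children });
    } else {
      parent.children.push({ type: token.type, content: token.content });
    }
  });

  return root;
}

// const tree = tokensToTree(md.parse(src, {}));
```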

Where to search for more details about tokens:

- [Renderer source](https://github.com/markdown-it/markdown-it/blob/master/lib/renderer.js)
- [Live demo](https://markdown-it.github.io/) - type your text and click the `debug` tab.


## Parse process

This was mentioned in [Data flow](#data-flow), but let's repeat the sequence again:

1. Blocks are parsed, and the top level of the token stream is filled with block tokens.
2. The content of the inline containers is parsed, filling their `.children` properties.
3. Rendering happens.

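
A sketch of the same sequence driven by hand, via the public `parse` and
`renderer.render` calls (the sample input and `env` contents are arbitrary):

```js
'use strict';

const md = require('markdown-it')();

const env = {};
// Steps 1-2: block parse, then inline parse of the "inline" containers.
const tokens = md.parse('# Hello\n\n*world*\n', env);

// Your own transformations over `tokens` can go here.

// Step 3: rendering.
const html = md.renderer.render(tokens, md.options, env);
// `md.render(src, env)` does exactly this parse + render in one call.
```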

And somewhere in between you can apply additional transformations :). The full
content of each chain can be seen at the top of the
[parser_core.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_core.js),
[parser_block.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_block.js) and
[parser_inline.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_inline.js)
files.