# markdown-it design principles
## Data flow
Input data is piped via nestesd chains of rules. There are 3 nested chains -
`core` , `block` & `inline` :
```
core
core.rule1
... (normalize)
block
block.rule1
...
block.ruleX
core.ruleXX
... (nothing yet)
inline (applyed to each block token with "inline" type)
inline.rule1
...
inline.ruleX
core.ruleYY
... (abbreviation, footnote, typographer, linkifier)
```
Mutable data are:
- array of tokens
- `env` sandbox
Tokens are the "main" data, but some rules can be "splitted" to several chains,
and need sandbox for exchange. Also, `env` can be used to inject per-render
variables for your custom parse and render rules.
Each chain (core / block / inline) has independent `state` object, to isolate
data and protect code from clutter.
## Token stream
Instead of traditional AST we use more low-level data representation - tokens.
Difference is simple:
- Tokens are sequence (Array).
- Opening and closing tags are separate tokens.
- There are special token objects, "inline containers", having nested token
sequences with inline markup (bold, italic, text, ...).
Each token has common fields:
- __type__ - token name.
- __level__ - nesting level, useful to seek closeing pair.
- __lines__ - [begin, end], for block tokens only. Range of input lines,
compiled to this token.
Inline container (`type === "inline"`) has additional properties:
- __content__ - raw text, unparsed inline content.
- __children__ - token stream for parsed content.
In total, token stream is:
- On the top level - array of paired or single "block" tokens:
- open/close for headers, lists, blockquotes, paragraphs, ...
- codes, fenced blocks, horisontal rules, html blocks, inlines containers
- Each inline containers have `.children` property with token stream for inline content:
- open/close for strong, em, link, code, ...
- text, line breaks
Why not AST? Because it's not needed for our tasks. We follow KISS principle.
If you whish - you can call parser without renderer and convert token stream
to AST.
Where to search more details about tokens:
- [Renderer source ](https://github.com/markdown-it/markdown-it/blob/master/lib/renderer.js )
- [Live demo ](https://markdown-it.github.io/ ) - type your text ant click `debug` tab.
## Rules
Rules are functions, doing "magick" with parser `state` objects. Each rule is
registered in one of chain with unique name.
Rules are managed by names via [Ruler ](https://markdown-it.github.io/markdown-it/#Ruler ) instances and `enable` / `disable` methods in [MarkdownIt ](https://markdown-it.github.io/markdown-it/#MarkdownIt ).
You can note, that some rules have "validation mode" - in this mode rule does not
modify token stream, and only look ahead for the end of token. It's one of
important design principle - token stream is "write only" on block & inline parse stages.
Parser is designed to keep rules independent. You can safely disable any, or
add new one. There are no universal recipes how to create new rules - design of
distributed state machines with good data isolation is tricky business. But you
can investigate existing rules & plugins to see possible approaches.
Also, in complex cases you can try to ask for help in tracker. Condition is very
simple - it should be clear from your ticket, that you studied docs, sources,
and tried to do something yourself. We never reject with help to real developpers.
## Renderer
After token stream is generated, it's passed to [renderer ](https://github.com/markdown-it/markdown-it/blob/master/lib/renderer.js ).
It just plays all tokens, passing each to rule with the same name as token type.
Renderer rules are located in `md.renderer.rules[name]` and are simple functions
with the same signature:
```js
function (tokens, idx, options, env, renderer) {
//...
return htmlResult;
}
```
In many cases that allows easy output change even without parser intrusion.
For example, let's replace images with vimeo links to player's iframe:
```js
var md = require('markdown-it')();
var defaultRender = md.renderer.rules.image,
vimeoRE = /^https?:\/\/(www\.)?vimeo.com\/(\d+)($|\/)/;
md.renderer.rules.image = function (tokens, idx, options, env, self) {
var id;
if (vimeoRE.test(tokens[idx].href)) {
id = tokens[idx].href.match(vimeoRE)[2];
return '< div class = "embed-responsive embed-responsive-16by9" > \n' +
' < iframe class = "embed-responsive-item" src = "//player.vimeo.com/video/' + id + '" ></ iframe > \n' +
'</ div > \n';
}
return defaultRender(tokens, idx, options, env, self);
});
```
You also can write your own renderer to generate AST for example.
## Summary
This was mentioned in [Data flow ](#data-flow ), but let's repeat sequence again:
1. Blocks are parsed, and top level of token stream filled with block tokens.
2. Content on inline containers is parsed, filling `.children` properties.
3. Rendering happens.
And somewhere between you can apply addtional transformations :) . Full content
of each chain can be seen on the top of
[parser_core.js ](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_core.js ),
[parser_block.js ](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_block.js ) and
[parser_inline.js ](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_inline.js )
files.
Also you can change output directly in [renderer ](https://github.com/markdown-it/markdown-it/blob/master/lib/renderer.js ) for many simple cases.