Added preliminary development docs

11 years ago · f78f1c97fa
3 changed files with 151 additions and 0 deletions
--- a/docs/README.md
+++ b/docs/README.md
@ -0,0 +1,5 @@
 This folder contains information mostly for plugin developers.
 If you just use `markdown-it` in your app, see
 [README](https://github.com/markdown-it/markdown-it#markdown-it) and
 [API docs](https://markdown-it.github.io/markdown-it/).
--- a/docs/architecture.md
+++ b/docs/architecture.md
@ -0,0 +1,96 @@
 # markdown-it design principles
 ## Data flow
 The parse process is unified as much as possible. Input data is piped through
 nested chains of rules. There are 3 "main" chains (core / block / inline):
 ```
 core
    core.rule1
    ... (none yet, you can patch input string here)
    block
        block.rule1
        ...
        block.ruleX
    core.ruleXX
    ... (references, abbreviations, footnotes)
    inline (applied to each block token with "inline type")
        inline.rule1
        ...
        inline.ruleX
    core.ruleYY
    ... (typographer, linkifier)
 ```
 Mutable data are:
 - array of tokens
 - `env` sandbox
 Tokens are the "main" data, but some rules can be split across several chains
 and need a sandbox to exchange data. Also, `env` can be used to inject
 per-render variables for your custom parse and render rules.
 Each chain (core / block / inline) has its own independent `state` object, to
 isolate data and protect code from clutter.
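
The nesting described above can be sketched with a toy pipeline. This is not the markdown-it implementation: `runChain`, the rule names, and the one-paragraph-per-line "block" rule are hypothetical and exist only to show how chains get separate `state` objects while sharing the token array and `env`.

```javascript
'use strict';

// Toy rule chain (hypothetical): an ordered list of named functions
// applied to a single state object.
function runChain(rules, state) {
  for (const rule of rules) rule.fn(state);
  return state;
}

// Toy "block" chain: every non-empty line becomes open / inline / close.
const blockRules = [{
  name: 'paragraph',
  fn(state) {
    state.src.split('\n').forEach((line, i) => {
      if (line.trim() === '') return;
      state.tokens.push({ type: 'paragraph_open', level: 0, lines: [i, i + 1] });
      state.tokens.push({ type: 'inline', level: 1, content: line, children: [] });
      state.tokens.push({ type: 'paragraph_close', level: 0 });
    });
  }
}];

// Toy "inline" chain: the whole content becomes one text token.
const inlineRules = [{
  name: 'text',
  fn(state) {
    state.tokens.push({ type: 'text', level: 0, content: state.src });
  }
}];

// Core chain: patches input, runs the block chain, runs the inline chain
// on each inline container, then post-processes.
const coreRules = [
  { name: 'normalize', fn(state) { state.src = state.src.replace(/\r\n?/g, '\n'); } },
  { name: 'block', fn(state) {
    // The block chain gets its own state; tokens and env are shared.
    runChain(blockRules, { src: state.src, tokens: state.tokens, env: state.env });
  } },
  { name: 'inline', fn(state) {
    for (const tok of state.tokens) {
      if (tok.type !== 'inline') continue;
      runChain(inlineRules, { src: tok.content, tokens: tok.children, env: state.env });
    }
  } },
  { name: 'count', fn(state) {
    // Rules can leave data in env for later rules or for the renderer.
    state.env.paragraphs = state.tokens.filter(t => t.type === 'paragraph_open').length;
  } }
];

function parse(src, env) {
  return runChain(coreRules, { src, tokens: [], env }).tokens;
}
```

Calling `parse('a\r\nb', env)` with an empty `env` object fills the token array with two paragraph triples and leaves a paragraph count in `env`.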
 ## Token stream
 Instead of a traditional AST we use a more low-level data representation: tokens.
 The difference is simple:
 - tokens are a sequence (Array)
 - opening and closing tags are separate tokens
 - there are special token objects, "inline containers", that hold nested token
  sequences with inline markup (bold, italic, text)
 Each token has the following fields (the first 2 are mandatory):
 - __type__ - token name.
 - __level__ - nesting level, useful for finding the matching pair.
 - __lines__ - [begin, end], for block tokens only. The range of input lines
  compiled to this token.
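
As a minimal sketch of how `level` helps find a matching pair: the helper `findMatchingClose` below is hypothetical, and the sample stream is hand-written rather than parser output. It assumes the `_open` / `_close` type-name suffixes and that an opening token and its closing token share the same `level`.

```javascript
'use strict';

// Hand-written sample: a blockquote nested inside a blockquote.
const tokens = [
  { type: 'blockquote_open',  level: 0 },
  { type: 'paragraph_open',   level: 1 },
  { type: 'inline',           level: 2, content: 'outer', children: [] },
  { type: 'paragraph_close',  level: 1 },
  { type: 'blockquote_open',  level: 1 },
  { type: 'paragraph_open',   level: 2 },
  { type: 'inline',           level: 3, content: 'inner', children: [] },
  { type: 'paragraph_close',  level: 2 },
  { type: 'blockquote_close', level: 1 },
  { type: 'blockquote_close', level: 0 }
];

// Returns the index of the token that closes tokens[openIdx], or -1.
// Everything between a pair sits at a deeper level, so the first
// "*_close" token of the same kind at the same level is the match.
function findMatchingClose(tokens, openIdx) {
  const open = tokens[openIdx];
  const closeType = open.type.replace(/_open$/, '_close');
  for (let i = openIdx + 1; i < tokens.length; i++) {
    if (tokens[i].level === open.level && tokens[i].type === closeType) {
      return i;
    }
  }
  return -1;
}
```

Without the `level` check, the outer `blockquote_open` would be paired with the inner `blockquote_close`.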
 An inline container (`type === "inline"`) has additional properties:
 - __content__ - raw text, unparsed inline content.
 - __children__ - token stream for parsed content.
 See [renderer source](https://github.com/markdown-it/markdown-it/blob/master/lib/renderer.js)
 for the available tokens and their properties. Currently there are no special
 requirements on token naming or additional fields.
 In total, the token stream is:
 - Array of paired or single "block" tokens, on top level:
  - open/close for headers, lists, blockquotes, paragraphs
  - codes, fenced blocks, horizontal rules, html blocks, inline containers
 - Inline containers have a "substream" Array with inline tags:
  - open/close for strong, em, link, code, ...
  - text, line breaks
 Why not an AST? Because it's not needed for our tasks. We follow the KISS
 principle. If you wish, you can call the parser without the renderer and
 convert the token stream to an AST.
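
For illustration, a token stream can be folded into a tree with a stack. This is only a sketch: `tokensToTree`, the node shape (`type`, `children`, `content`), and the sample stream are assumptions made for the example, not something the library provides.

```javascript
'use strict';

// Folds a flat token stream into a tree (hypothetical node shape).
// "*_open" tokens start a node, "*_close" tokens finish it, and an
// inline container is replaced by the tree built from its children.
function tokensToTree(tokens) {
  const root = { type: 'root', children: [] };
  const stack = [root];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (/_open$/.test(token.type)) {
      const node = { type: token.type.slice(0, -'_open'.length), children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (/_close$/.test(token.type)) {
      stack.pop();
    } else if (token.type === 'inline') {
      parent.children.push(...tokensToTree(token.children || []).children);
    } else {
      parent.children.push({ type: token.type, content: token.content });
    }
  }
  return root;
}

// Hand-written stream for a paragraph with one bold word.
const sample = [
  { type: 'paragraph_open', level: 0 },
  { type: 'inline', level: 1, content: 'Hello **world**', children: [
    { type: 'text', level: 0, content: 'Hello ' },
    { type: 'strong_open', level: 0 },
    { type: 'text', level: 1, content: 'world' },
    { type: 'strong_close', level: 0 }
  ] },
  { type: 'paragraph_close', level: 0 }
];

const tree = tokensToTree(sample);
```

Here `tree` is a root with one `paragraph` node, which holds a `text` leaf followed by a `strong` node wrapping the second `text` leaf.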
 ## Parse process
 This was mentioned in [Data flow](#data-flow), but let's repeat the sequence:
 1. Blocks are parsed, and the top level of the token stream is filled with block tokens.
 2. Content of inline containers is parsed, filling the `.children` properties.
 3. Rendering happens.
 And somewhere in between you can apply additional transformations :) . The full
 content of each chain can be seen at the top of the
 [parser_core.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_core.js),
 [parser_block.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_block.js) and
 [parser_inline.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_inline.js)
 files.
--- a/docs/development.md
+++ b/docs/development.md
@ -0,0 +1,50 @@
 # Development recommendations
 Before continuing, make sure you've read:
 1. [README](https://github.com/markdown-it/markdown-it#markdown-it)
 2. [API documentation](https://markdown-it.github.io/markdown-it/)
 3. [Architecture description](architecture.md)
 ## General considerations for plugins
 1. Try to understand where your plugin rule should be located.
  - Will it conflict with existing markup (by priority)?
    - If yes - you need to write an inline or block rule.
    - If no - you can morph tokens in the core chain.
  - Remember that morphing tokens in core is always simpler than writing
    block / inline rules. However, block / inline rules are usually faster.
  - Sometimes it's enough to modify only the renderer (for example, to add
    header IDs or `target=_blank` for links).
 2. Search existing [plugins](https://www.npmjs.org/browse/keyword/markdown-it-plugin)
   or [rules](https://github.com/markdown-it/markdown-it/tree/master/lib)
   that do something similar. It can be simpler to modify existing code
   than to write from scratch.
 3. If you did all the steps above but still have questions - ask in the
   [tracker](https://github.com/markdown-it/markdown-it/issues). But, please:
   - Be specific. Generic questions like "how to do plugins" and
     "how to learn programming" are not accepted.
   - Don't ask us to break the [CommonMark](http://commonmark.org/) specification.
     Such things should be discussed first on the [CommonMark forum](http://talk.commonmark.org/).
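
To illustrate the "morph tokens" option from item 1, here is a minimal sketch of a function that adds `target=_blank` to links by walking the token stream. The helper name `addTargetBlank` is hypothetical, and the sketch assumes link tokens carry an `attrs` array of `[name, value]` pairs, as in current markdown-it releases; the token layout in the version these docs describe may differ.

```javascript
'use strict';

// Adds target="_blank" to every link_open token, descending into
// inline containers. ASSUMPTION: attributes live in token.attrs as
// [name, value] pairs; adjust for the token layout you actually have.
function addTargetBlank(tokens) {
  for (const token of tokens) {
    if (token.type === 'inline' && token.children) {
      addTargetBlank(token.children);
    } else if (token.type === 'link_open') {
      token.attrs = token.attrs || [];
      const existing = token.attrs.find(([name]) => name === 'target');
      if (existing) {
        existing[1] = '_blank';
      } else {
        token.attrs.push(['target', '_blank']);
      }
    }
  }
  return tokens;
}
```

In a plugin, a function like this would typically run as a core rule, for example `md.core.ruler.push('target_blank', state => { addTargetBlank(state.tokens); })` in current releases.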
 ## Notes for NPM packages
 To simplify search:
 - for plugins, add the keywords `markdown-it` and `markdown-it-plugin` to `package.json`
 - for any other related packages, add the keyword `markdown-it`.
 ## FAQ
 #### I need an async rule, how do I do it?
 Sorry. You can't do it directly. All complex parsers are sync by nature. But you
 can use a workaround:
 1. In the parse phase, replace the content with a random number and store it in `env`.
 2. Do async processing over the collected data.
 3. Render the content and replace those random numbers with text
   (or replace first, then render).
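
The three steps above can be sketched as follows. Everything here is an assumption made for illustration: the `{{key}}` syntax, the function names, and the `render` and `lookup` callbacks are hypothetical, and a plain string replacement stands in for a real parse rule.

```javascript
'use strict';
const crypto = require('crypto');

// Step 1 (sync): stand-in for a parse rule. Each {{key}} is replaced
// with a random placeholder, and the key is remembered in env.
function parsePhase(src, env) {
  env.pending = env.pending || {};
  return src.replace(/\{\{(\w+)\}\}/g, (match, key) => {
    const id = 'ASYNC' + crypto.randomBytes(8).toString('hex');
    env.pending[id] = key;
    return id;
  });
}

// Step 2 (async): resolve everything collected in env in one pass.
async function resolvePending(env, lookup) {
  const ids = Object.keys(env.pending || {});
  const values = await Promise.all(ids.map(id => lookup(env.pending[id])));
  const resolved = {};
  ids.forEach((id, i) => { resolved[id] = values[i]; });
  return resolved;
}

// Step 3 (sync): swap placeholders in the rendered output for the
// resolved text. Values are inserted as-is; escape them if needed.
function substitute(html, resolved) {
  let out = html;
  for (const id of Object.keys(resolved)) {
    out = out.split(id).join(String(resolved[id]));
  }
  return out;
}

// Glue: render is any sync function from source text to HTML.
async function renderAsync(src, lookup, render) {
  const env = {};
  const html = render(parsePhase(src, env), env);
  return substitute(html, await resolvePending(env, lookup));
}
```

The placeholder is plain alphanumeric text so that it passes through parsing and rendering unchanged; a real rule would more likely emit a dedicated token that carries the id.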