Added preliminary development docs

11 years ago · f78f1c97fa
3 changed files with 151 additions and 0 deletions
--- a/docs/README.md
+++ b/docs/README.md
@@ -0,0 +1,5 @@
+This folder contains information mostly for plugin developers.
+
+If you just use `markdown-it` in your app, see
+[README](https://github.com/markdown-it/markdown-it#markdown-it) and
+[API docs](https://markdown-it.github.io/markdown-it/).
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -0,0 +1,96 @@
+# markdown-it design principles
+
+## Data flow
+
+The parse process is unified as much as possible. Input data is piped through
+nested chains of rules. There are 3 "main" chains (core / block / inline):
+
+```
+core
+    core.rule1
+    ... (none yet, you can patch input string here)
+
+    block
+        block.rule1
+        ...
+        block.ruleX
+
+    core.ruleXX
+    ... (references, abbreviations, footnotes)
+
+    inline (applied to each block token with "inline type")
+        inline.rule1
+        ...
+        inline.ruleX
+
+    core.ruleYY
+    ... (typographer, linkifier)
+
+```
+
+Mutable data are:
+
+- array of tokens
+- `env` sandbox
+
+Tokens are the "main" data, but some rules can be split across several chains
+and need a sandbox to exchange data. Also, `env` can be used to inject per-render
+variables for your custom parse and render rules.
+
+Each chain (core / block / inline) has an independent `state` object, to isolate
+data and protect code from clutter.
+
+
+## Token stream
+
+Instead of a traditional AST we use a more low-level data representation: tokens.
+The difference is simple:
+
+- tokens are a sequence (Array)
+- opening and closing tags are separate tokens
+- there are special token objects, "inline containers", holding nested token
+  sequences with inline markup (bold, italic, text)
+
+Each token has 2 mandatory fields, `type` and `level`; block tokens also have `lines`:
+
+- __type__ - token name.
+- __level__ - nesting level, useful for finding the matching pair.
+- __lines__ - [begin, end], for block tokens only. Range of input lines
+  compiled to this token.
+
+An inline container (`type === "inline"`) has additional properties:
+
+- __content__ - raw text, unparsed inline content.
+- __children__ - token stream for parsed content.
+
+See [renderer source](https://github.com/markdown-it/markdown-it/blob/master/lib/renderer.js)
+for available tokens and their properties. Currently there are no special
+requirements on token naming and additional fields.
+
+In total, the token stream is:
+
+- An Array of paired or single "block" tokens on the top level:
+  - open/close for headers, lists, blockquotes, paragraphs
+  - code blocks, fenced blocks, horizontal rules, html blocks, inline containers
+- Inline containers have a "substream" Array with inline tags:
+  - open/close for strong, em, link, code, ...
+  - text, line breaks
+
+Why not an AST? Because it's not needed for our tasks. We follow the KISS principle.
+If you wish, you can call the parser without the renderer and convert the token
+stream to an AST.
+
+## Parse process
+
+This was mentioned in [Data flow](#data-flow), but let's repeat the sequence:
+
+1. Blocks are parsed, and the top level of the token stream is filled with block tokens.
+2. The content of inline containers is parsed, filling their `.children` properties.
+3. Rendering happens.
+
+And somewhere in between you can apply additional transformations :) . The full content
+of each chain can be seen at the top of the
+[parser_core.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_core.js),
+[parser_block.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_block.js) and
+[parser_inline.js](https://github.com/markdown-it/markdown-it/blob/master/lib/parser_inline.js)
+files.
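
A minimal sketch of the conversion suggested in the architecture notes above (calling the parser without the renderer and converting the token stream to an AST). The `tokens` array is hand-written to mimic the shapes described there and was not produced by markdown-it. `tokensToTree` is a hypothetical helper, and it assumes paired tokens follow a `<name>_open` / `<name>_close` naming pattern, which the notes do not require.

```javascript
// Hand-written sample shaped like the token stream described above.
// Illustrative only: these objects were not produced by markdown-it.
const tokens = [
  { type: 'blockquote_open', level: 0 },
  { type: 'paragraph_open', level: 1 },
  {
    type: 'inline',
    level: 2,
    content: 'hello *world*',
    children: [
      { type: 'text', level: 0, content: 'hello ' },
      { type: 'em_open', level: 0 },
      { type: 'text', level: 1, content: 'world' },
      { type: 'em_close', level: 0 }
    ]
  },
  { type: 'paragraph_close', level: 1 },
  { type: 'blockquote_close', level: 0 }
];

// Hypothetical helper. Assumption: paired tokens are named
// "<name>_open" and "<name>_close"; everything else is a single token.
function tokensToTree(stream) {
  const root = { type: 'root', children: [] };
  const stack = [root]; // the last element is the node currently being filled

  for (const token of stream) {
    const parent = stack[stack.length - 1];

    if (token.type.endsWith('_open')) {
      // Opening token: create a node and descend into it.
      const node = { type: token.type.slice(0, -'_open'.length), children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type.endsWith('_close')) {
      // Closing token: return to the enclosing node.
      stack.pop();
    } else if (token.type === 'inline') {
      // Inline container: convert its nested substream the same way.
      const sub = tokensToTree(token.children || []);
      parent.children.push({ type: 'inline', children: sub.children });
    } else {
      // Single token (text, line break, ...): becomes a leaf.
      const leaf = { type: token.type };
      if (token.content !== undefined) leaf.content = token.content;
      parent.children.push(leaf);
    }
  }

  return root;
}

const tree = tokensToTree(tokens);
console.log(JSON.stringify(tree, null, 2));
```

The node shape (`type`, `children`, optional `content`) is just one possible design. A real converter would probably also carry over `lines` and any other fields the renderer relies on.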
--- a/docs/development.md
+++ b/docs/development.md
@@ -0,0 +1,50 @@
+# Development recommendations
+
+Before continuing, make sure you've read:
+
+1. [README](https://github.com/markdown-it/markdown-it#markdown-it)
+2. [API documentation](https://markdown-it.github.io/markdown-it/)
+3. [Architecture description](architecture.md)
+
+
+## General considerations for plugins.
+
+1. Try to understand where your plugin rule should be located.
+  - Will it conflict with existing markup (by priority)?
+    - If yes - you need to write an inline or block rule.
+    - If no - you can morph tokens in the core chain.
+  - Remember that morphing tokens in core is always simpler than writing
+    block / inline rules. However, block / inline rules are usually faster.
+  - Sometimes it's enough to modify the renderer only (for example, to add
+    header IDs or target=_blank for the links).
+2. Search existing [plugins](https://www.npmjs.org/browse/keyword/markdown-it-plugin)
+   or [rules](https://github.com/markdown-it/markdown-it/tree/master/lib)
+   that do something similar. It can be simpler to modify existing code
+   than to write from scratch.
+3. If you did all the steps above but still have questions, ask in the
+   [tracker](https://github.com/markdown-it/markdown-it/issues). But, please:
+   - Be specific. Generic questions like "how to do plugins" and
+     "how to learn programming" are not accepted.
+   - Don't ask us to break the [CommonMark](http://commonmark.org/) specification.
+     Such things should be discussed first on the [CommonMark forum](http://talk.commonmark.org/).
+
+
+## Notes for NPM packages
+
+To simplify search:
+
+- add the keywords `markdown-it` and `markdown-it-plugin` to `package.json` for plugins
+- add the keyword `markdown-it` for any other related packages.
+
+
+## FAQ
+
+#### I need an async rule, how do I do it?
+
+Sorry, you can't do it directly. All complex parsers are sync by nature. But you
+can use a workaround:
+
+1. During the parse phase, replace the content with a random number and store it in `env`.
+2. Do async processing over the collected data.
+3. Render the content and replace those random numbers with text
+   (or replace first, then render).
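
The three workaround steps above can be sketched roughly as follows. This is a self-contained illustration and does not call markdown-it. `parseStandIn` is a hypothetical stand-in for a real parse rule, the `{{name}}` syntax and the `lookup` job are invented for the example, and `env.pending` is an assumed field name.

```javascript
// Step 1 (sync, parse phase): swap each {{name}} for a random placeholder
// and remember what it stands for in `env`.
// Assumption: parseStandIn() and the {{name}} syntax are hypothetical.
function parseStandIn(src, env) {
  env.pending = env.pending || {};
  return src.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    let id;
    do {
      // Delimiters on both sides keep one placeholder from being
      // a prefix of another.
      id = '@@ASYNC_' + Math.floor(Math.random() * 1e9) + '@@';
    } while (id in env.pending);
    env.pending[id] = name;
    return id;
  });
}

// Hypothetical async job, e.g. a network or database lookup.
function lookup(name) {
  return new Promise(resolve => {
    setTimeout(() => resolve(name.toUpperCase()), 10);
  });
}

// Step 2 (async): process everything collected in `env`.
async function resolvePending(env) {
  const ids = Object.keys(env.pending);
  const values = await Promise.all(ids.map(id => lookup(env.pending[id])));
  const resolved = {};
  ids.forEach((id, i) => { resolved[id] = values[i]; });
  return resolved;
}

// Step 3 (sync): replace the placeholders with the resolved text.
function fillPlaceholders(text, resolved) {
  let out = text;
  for (const [id, value] of Object.entries(resolved)) {
    out = out.split(id).join(value);
  }
  return out;
}

async function renderAsync(src) {
  const env = {};
  const withPlaceholders = parseStandIn(src, env);
  const resolved = await resolvePending(env);
  return fillPlaceholders(withPlaceholders, resolved);
}

renderAsync('Hello {{world}}!').then(result => console.log(result));
// prints "Hello WORLD!"
```

In a real plugin the placeholder would presumably be emitted by a custom rule, and the replacement would run either on the token stream before rendering or on the rendered string, matching the "replace first, then render" alternative above.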