
Commit Graph

Author	SHA1	Message	Date
Kyle J. McKay	1ecc6a0fe5	Markdown.pl: isolate the archaic tab default If running as a plug in for either of the two original systems that this was designed to "plug in" to, continue to use the archaic, non-standard default for expansion width of physical tabs. This setting does not affect the "indent level" width. Otherwise, force the physical tab width expansion to default to the expected and standard value. This has been the behavior for some time already, except that when "use"ing Markdown.pm and calling the API directly this was being bypassed in favor of the old, archaic default. With this change, the old, archaic default becomes isolated to those two originally supported systems. The setting can still, of course, be changed by using an option to whatever is desired. The default though will now be more sane for more clients. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	751b55b7c6	Markdown.pl: use some sanity Replace 'require' with 'use' in a few places where it should have been "used" in the first place. Make sure the essential package variables are initialized inside a BEGIN block. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	ff7fb525fc	Markdown.pl: avoid accidentally nested anchors Given something that looks like this: [1][] [1]: https://example.com/ Ever since commit `dfbf2b4e30` ("Markdown.pl: retain square brackets around footnotes", 2017-01-19, markdown_1.1.2), the link text has been rendered to include the surrounding '[' and ']' because it just looks better that way and produces a bigger link target. Unfortunately that can result in the linked text being processed again and producing a nested anchor which is not only invalid according to the XHTML specification but is also the wrong rendering for the input. Deal with this by hiding the '[' and ']' characters inside link text the same way other characters within the link text are already being hidden. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	6324b499f7	Markdown.pl: provide anchor API access The actual anchor id values produced while processing a page are not necessarily immediately obvious. These implicit anchor id values are created for all markdown- format H1-H6 headers by "processing" the text of the header. Provide a new external function, ResolveFragment that can hook up a fragment identifier to one of these automatically- generated anchor id values by transforming it as needed. The lookup table needed by ResolveFragment can be retrieved after calling Markdown by first setting the 'anchors' key in the passed in options HASH ref. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	191f62119c	Markdown.pl: provide urlfunc hook and helpers Provide a new urlfunc hook that can inspect/change all urls that are in "a" "href" attributes and "img" "src" attributes. Make the new SplitURL and unescapeXML routines exportable (@EXPORT_OK) and rename the old escape function to be escapeXML and make it exportable (@EXPORT_OK) too. Add some nice comments to each of the newly exportable functions. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	4ba2a0423a	Markdown.pl: document and catch more meaningless tags There are a few tags (e.g. `a`, `area`, `img`, `map`) that require at least one attribute to be present in order to be meaningful. When these tags occur without any attributes they are treated as non-tags and the leading `<` is escaped to `&lt;`. This can only happen when sanitize mode is active. Although already partially implemented, it was not documented in the help. Add discussion of this to the help and make the implementation more robust to catch more of these tags. This is not intended to be a perversely pedantic change, but rather to allow such meaningless tags to be used as plain text without the need for escaping. For example the text: The <a><c><e> process ... Can be used exactly as-is and all of the `<`s will automatically be escaped to `&lt;` since none of them specify meaningful tags. Of course, using the `--no-sanitize` option will disable this behavior. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	34b44054db	Markdown.pl: sanitize common "oops" entities Take a hint from w3m and quietly fix up the six common entities &lt; &gt; &amp; &quot; &apos; &nbsp; when they are missing their trailing ';' provided whatever trailing character is there is not alphanumeric, an equals sign or a semicolon. Without this change the leading ampersand would have ended up being escaped to &amp; in these cases, which seems almost certainly incorrect. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	bf4a09aeb2	Markdown.pl: sanitize all "&" issues When sanitize is active (--sanitize, the default), make sure all "&" issues are checked. This includes things like bare "&" that should be "&amp;" but aren't. And it includes single/double quote characters inside attribute values that should be encoded and are not. Since the internal validator requires the sanitize mode to be active, this now makes sure that the internal validation mode cannot pass through any invalid entity references to the output. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	964670e66b	Markdown.pl: run _EncodeAmpsAndAngles on top-level raw html blocks At the top level of the document, the _HashHTMLBlocks function gets called to sequester raw top-level html blocks from being processed. As a result, anything in these top-level blocks escapes general Markdown processing except that if XML validation has been enabled (the default), the final result of processing does always pass through a validation stage. On the one hand that's good as it allows raw HTML in Markdown docs, but on the other hand, some basic fix ups are not happening and that's bad. Rather than try and push all of the top-level raw HTML block content through either _RunBlockGamut or _RunSpanGamut (thereby somewhat defeating the point of allowing raw HTML top-level blocks in the first place), use a compromise between the two extremes and push all the text of raw HTML block content through just the _EncodeAmpsAndAngles function. This causes things like non-html-escaped ampersands (&) inside "href" and "src" attributes to magically be transformed into "&amp;" and at the same time any url adjustment options (i.e. -r, -i, -b, -a) to be applied. The result produces better and less surprising outcomes than before. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	cec2468782	various: update copyright year Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	83a2b69572	Markdown.pl: add missing ul to tagblk list The <ul> tag is just as much a block as the <ol> and <dl> tags. Correct the omission by adding it to the tagblk hash. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	26e8ebf4c1	Markdown.pl: treat center as a block because it is Although the <center>...</center> tag has been deprecated, it still occurs in the wild. Since it's equivalent to <div align="center">...</div> it needs to be treated as a block level tag. Add it to tagblk to make it so. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	ff92cf5457	Markdown.pl: dd, dt and li do not autoclose containing table While <dd>, <dt> and <li> all have "optional" closing tags, they can all be contained within a table. And as such must not close the tags that define the content of the table itself. Customize the tagacl list for these three to exclude the tags that may contain table content to prevent their premature closing. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	e8e0a21a7f	Markdown.pl: use passed in prefix for GenerateStyleSheet Although GenerateStyleSheet did, in fact, accept a prefix argument (properly defaulting if omitted), it was not using the passed in prefix. Correct that so the style sheet can be generated using any desired prefix, but most helpfully using the `style_prefix` as passed in to the Markdown function. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	97be3fbcf9	syntax.md: document YAML front matter Explain the syntax of the optional YAML front matter. Include a few examples that demonstrate the known keys. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	e3bc60b4e6	Markdown.pl: implement header --auto-number option This also adds support for the YAML front matter header_enum option which if enabled has the same effect as --auto-number. Only markdown format h1...h6 headers are numbered with --auto-number. Any raw <h1>...<h6> contents are left unchanged. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	ec2f13dd1f	Markdown.pl: mention source of <title> for --stub The <title> value comes from the first markdown markup "h1" encountered or, if YAML processing is enabled, a "title" setting if present which always takes precedence. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	101b394705	Markdown.pl: process YAML front matter Process any YAML front matter that may be present by default. Provide copious options to control how any YAML front matter that may be present will be handled including the ability to completely disable YAML front matter processing altogether. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	f207630963	basics.md: delete defunct dingus There is no dingus to play with; stop talking about it. Also make the "syntax page" link hook up properly. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	43ea96d596	Markdown.pl: provide --raw-xml and --raw-html options Make the --raw option an alias for --raw-xml and provide a new --raw-html option. Previously the --raw option always activated the auto-closing and optional-closing tag semantics as indicated in the HTML standard so that a valid XML document would be output. Unfortunately, these semantics can result in valid XML documents being rejected. For example, "<p><pre></pre></p>" would be turned into "<p></p><pre></pre></p>" because the standard specifies that the opening "pre" tag automatically closes the open "p" tag. Retain these auto-closing semantics under the new --raw-html option while disabling them under the --raw-xml (aka --raw) option. This produces a less surprising outcome when valid XML is provided as input while still providing access to the auto-closing semantics (via --raw-html) if explicitly desired when processing raw input. The auto-closing semantics remain enabled (as before) for the non-raw mode when using --validate-xml-internal (the default). Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	719758c570	Markdown: recognize wiki style image links When the --wiki option is active, recognize wiki-style image links in the format: [[link-to-image.png\|align=left,alt=text]] Where any "well-known" image suffix may be used in place of ".png" and the "\|..." part is optional but may specify any of the "width=", "height=", "align=" or "alt=" keywords (provided alt= is always last). Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	37a4f3d221	Markdown.pl: add --wiki "b" sub-option Allow spaces to be retained when generating wiki file names by using the new "b" wiki sub-option. Since spaces are always trimmed (leading and trailing removed and runs of multiple replaced with a single) before processing wiki links, multiple consecutive white space characters are always collapsed to a single space in the final URL. Since the retained spaces are subject to URL encoding, they become "%20" in the final URL. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	5711c23576	Markdown.pl: auto-open "<p>" when required Given input like this: hi<p>_</p>there avoid leaving a dangling text blob outside of any "p" section like this: <p>hi</p><p>_</p>there Instead, auto-open a new "p" section so the final text blob ends up properly wrapped like so: <p>hi</p><p>_</p><p>there</p> This reflects the actual rendering behavior of the client "user agent" (aka browser) which would end up supplying the missing <p>...</p> wrapper in any case. By doing this the output better reflects the way the markup actually renders. The heuristic used to auto-open the "p" section may not always auto-open a "p" when it should, but it should never auto-open a "p" when it shouldn't. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	e9d025d178	Markdown.pl: avoid superfluous leading <p></p> Since each "paragraph" is wrapped between a "<p>" and "</p>" this input: <p>hi <p>bye has been producing this output: <p></p><p>hi</p> <p></p><p>bye</p> Correct this so that if the leading "<p>" of the paragraph wrapper is immediately auto-closed then it's simply discarded rather than creating a bogus "<p></p>" section. With this change the previous input now produces this output: <p>hi</p> <p>bye</p> The bogus leading "<p></p>" sections have been omitted and the output looks much nicer. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	e028ec2909	Markdown.pl: avoid auto-closing <p> problems When forming paragraphs, a $string is wrapped to become <p>$string</p>. If the opening "<p>" ends up being auto-closed by markup within $string, then either another "<p>" must be auto-opened or the closing "</p>" of the wrapper must be silently dropped to avoid a validation failure. Figuring out exactly where to auto-open the "<p>" turns out to be somewhat more difficult than just dropping the wrapper's "</p>". For now just go ahead and drop the wrapper's closing "</p>" if the wrapper's opening "<p>" has been auto-closed by the time the validator encounters the wrapper's closing "</p>". At the same time, make sure that all "optional closing tag" tags that occur after the wrapper's opening "<p>" get closed immediately upon encountering the wrapper's closing "</p>" (whether or not it ultimately gets dropped). With these changes, this input: line<p>one line<p>three or this input: line<p>one</p> line<p>three</p> produces this output: <p>line</p><p>one</p> <p>line</p><p>three</p> While this input: line<p>one</p>x1 line<p>three</p>x3 produces this output: <p>line</p><p>one</p>x1 <p>line</p><p>three</p>x3 In this last example, the "x1" and "x3" text is left hanging outside of a "p" section. The client "user agent" (aka browser) will end up rendering these hanging "x1" and "x3" pieces of text in their own "p" sections. With these changes, simple markup that would previously have been rejected for no apparent reason by the default `--validate-xml-internal` parser while being accepted by the `--validate-xml` option becomes acceptable to the `--validate-xml-internal` parser as well. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	cd268c10c9	Markdown: support floating images With a minor enhancement to the support for specifying image dimensions, images can now be "float"ed to the left or right or even centered in their own block. Add the ability to generate a <br clear="all" /> with 3 or more spaces on the end of a line rather than a plain <br /> with only 2. Document these additions as well. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	cb07dccca3	Markdown.pl: add --wiki "%" sub-option Allow wiki names to be "flatten"ed by replacing runs of one (or more) "/" characters with "%2F" indicated by the new "%" sub-option. Ultimately these "%2F" replacements become "%252F" by the time the final URL is generated. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	c4f0c6f2f7	Markdown.pl: enhance direct module wiki use via code ref Provide a "wikifunc" 'CODE' ref hook capability to provide for custom wiki link handling when "use"ing the Markdown module. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	acb554cacb	Markdown.pl: new --keep-abs option With the `--keep-abs` option absolute path URLs will be preserved into the output despite any -r/-i options that may be present. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	3224279fb7	Markdown.pl: enhance direct module use via code refs Allow a 'CODE' ref to be supplied when "use"ing the Markdown module for any of the *_prefix options. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	c3cfcf92d6	Markdown.pl: improve standalone XML comment stripping When stripping XML comments, if any XML comments are recognized as a standalone block, strip that entire block when forming paragraphs the final time. This provides a much cleaner output as it results in many superfluous blank lines being suppressed that the XML parser would not otherwise remove when it strips out XML comments. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	c2f72b71a4	Markdown.pl: remove following \n with comment When `--strip-comments` is active, if an XML comment is immediately followed by optional spaces and/or tabs and a newline, remove those along with the comment itself. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	0eeaa48ae6	Markdown.pl: detail ability to "use" Markdown module While the default mode of Markdown.pl remains that of a command line utility, it's fairly simple to "use Markdown" and call the functions directly. Explain this usage in the help and make sure all of the auxiliary functions that might be used for this appear in @EXPORT_OK. Include an example that simulates `Markdown.pl --stub --wiki`. Add a symbolic link from Markdown.pm to Markdown.pl to go along with the new example. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	a04101db46	Markdown.pl: minor empty tag improvements Even though block tags such as "<p/>" should not appear in valid XHTML documents, the internal validator (which is enabled by default) will properly expand "<p/>" to "<p></p>". However, the block formatting code fails to notice such an empty tag block leading to it being wrapped in a spurious "<p>...</p>" pair before it's expanded by the validation code. Attempt to recognize some of these valid-for-xml-but-not-xhtml blocks earlier to produce better output. This is not a perfect fix, but it's an improvement. It's really an odd edge case anyway that's unlikely to be encountered very often. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	00def67f36	Markdown.pl: refactor and document Markdown function args Move sanity checking of arguments to the Markdown and ProcessRaw functions into a new _SanitizeOpts function. Call the new _SanitizeOpts function from both Markdown and ProcessRaw. Document all of the possible options in the _SanitizeOpts function. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	dfe62957f3	Markdown.pl: refactor wiki options parsing Create a new "SetWikiOpts" function that parses the `--wiki=` option value into the appropriate internal options settings. Use the new "SetWikiOpts" function to parse the command line `--wiki=...` option. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	5a5deeca72	Markdown.pl: refactor style sheet generation Create a new "GenerateStyleSheet" function that returns a copy of the internal fancy style sheet using the given prefix as a prefix of all the CSS style names. Use the new "GenerateStyleSheet" function to create the style sheet as needed. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	33fd043581	Markdown.pl: recognize U+00D7 for "x" When parsing a "checkbox" item or image dimensions, recognize a U+00D7 Multiplication Sign character as equivalent to an "x". The real "x" is preferred (and still recognized along with "X"), but in the case where a U+00D7 (×) ends up in there, just go with it and recognize it as the intent remains clear. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	e4b8a0f6c4	syntax.md: add XML comments section Add an explanation of XML comments for those who may not be familiar with them including a link to the relevant specification, examples, and exacting details about where they are and are not recognized. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	c1f488d9f7	Markdown.pl: recognize adjacent standalone comments together Combine adjacent (i.e. no separating blank line) standalone XML comments into the same "block". This is more efficient, better preserves the original comment formatting and avoids an unfortunate side-effect that could introduce unwanted extra paragraphs into the output. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	3b0825b9af	Markdown.pl: provide --strip-comments option With --strip-comments, remove any XML comments from the output. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	2a265e3923	Markdown.pl: simplify options parsing code Make use of more of the Getopt::Long::GetOptions API capabilities to avoid needing extra, awkward code checks. With this change, options that support negation (e.g. "stylesheet") or have variants (e.g. "validate-xml-internal") now work as intended such that the last option given wins. Additionally, help/version options are now handled immediately when encountered. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	003134a723	Markdown.pl: correct comment sanitation The XML standard section 2.5 is quite specific: the string "--" (double-hyphen) MUST NOT occur within comments In fact, xmllint will complain about any comments that incorrectly contain an internal "--" sequence as they are not valid XML. Adjust the sanitation code to only pass through valid XML comments using the same pattern that _HashHTMLBlocks uses to recognize them. With this change, invalid XML comments will be treated as literal text by the sanitizer and have the initial "<" escaped to "&lt;" thus rendering them as not a comment at all. Also take this opportunity to correct the comments in the _HashHTMLBlocks function from "HTML" to "XML" to reflect what it actually matches. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	6956d990d2	Markdown.pl: next version is 1.1.11 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	d67dd6f667	Markdown version 1.1.10 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	18d217abe3	various: update links to https where possible Additionally, update a few that now redirect elsewhere. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	4174281293	Markdown.pl: remove markup from <title> value When using `--stub` and picking up the value of the first "H1" tag to use as the title, remove markup (such as links, italic, bold, etc.) from the value before using it. Since <title>...</title> value cannot contain links or other markup this makes the displayed title look much better where such markup is present in the original document. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	ea408f7d29	Markdown.pl: hook up fragment only link definitions This works to hook up a fragment link to its section: # Section 1 Link to [Top](#Section_1). Make the same thing work when written like this: # Section 1 Link to [Top][id]. [id]: #Section_1 Or even like this: # Section 1 Link to [id]. [id]: #Section_1 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	de1c7f4f1a	syntax.md: mention ability to split link references A link reference may have the URL actually split onto the next line, not just the title attribute. Mention this in the syntax description for links. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
Kyle J. McKay	fef2d21f4c	Markdown.pl: support new --absroot=prefix option Any absolute path URLs (but not // ones) have the prefix prepended. If that makes the resulting URL a fully absolute URL it will not be processed by any --htmlroot and/or --imageroot options. With this option, site-relative absolute path URLs can be re-written so that the site is made explicit in order to support viewing on a different site. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	5 years ago
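
The `--absroot=prefix` rule described in commit `fef2d21f4c` can be sketched in a few lines. This is only an illustration of the behavior as the commit message states it, written in Python rather than taken from Markdown.pl; the function name `apply_absroot` is hypothetical, and trimming a trailing "/" from the prefix is an assumption made here to avoid a doubled slash, not something the commit message specifies.

```python
def apply_absroot(url: str, prefix: str) -> str:
    """Prepend prefix to absolute-path URLs only.

    Hypothetical helper illustrating the rule in the commit message:
    URLs starting with a single "/" get the prefix, while "//" URLs
    and everything else (relative paths, fragments, full URLs) are
    returned unchanged.
    """
    if url.startswith("/") and not url.startswith("//"):
        # Assumption: drop any trailing "/" on the prefix so the
        # result does not contain "//" at the join point.
        return prefix.rstrip("/") + url
    return url


if __name__ == "__main__":
    for candidate in ("/docs/a.html", "//cdn.example.com/x.png", "img/a.png"):
        print(candidate, "->", apply_absroot(candidate, "https://example.com"))
```

Per the commit message, a URL that becomes fully absolute this way would then be skipped by any `--htmlroot` and/or `--imageroot` processing; that later step is not modeled in the sketch.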

231 Commits (master)