markdown

Commit Graph

Author	SHA1	Message	Date
Kyle J. McKay	05fcf802f6	Markdown version 1.1.15 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	d02bb9ee9e	README: clean up old version history entries Be consistent with list marker and remove any irrelevant leading bit from all version history entries that do not match the original version history entry style. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	b603660fc5	Markdown.pl: clarify help text for --no-validate-xml option Replace the old text left over from before --validate-xml-internal existed with something more accurate that indicates that Markdown.pl can check for XML problems itself (via the default --validate-xml-internal option). Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	62c5916945	Markdown.pl: add new --us-ascii option for 7-bit output With the new --us-ascii (aka --ascii) option enabled, any characters with a code point value larger than 127 are output using their equivalent numerical character entity. This makes the output strictly US-ASCII (which is a subset of UTF-8) and should allow it to survive almost any transport mechanism at the expense of an increase in size that depends on how many non-US-ASCII characters are present. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	a3c4d34b75	Markdown.pl: allow '/' in auto-quoted attribute values When auto-quoting the attribute value of something like this: <a href=http://example.com/test/>test</a> Do not stop at the first "/", instead pick up the entire value. Of course, whitespace will still terminate the value being auto-quoted. Additionally, when checking the end of the tag to see if it's self-closing (e.g. "<br/>") do not include the final character of any value that may have been picked up by the auto-quoting process. For the example above, that prevents the opening "a" tag from mistakenly being considered self-closing just because the value being auto-quoted happens to end in a "/" and butts right up against the final ">" of the tag. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	eccdc9d08e	Markdown.pl: improve readability of a few regexs Instead of using internal "\" escapes to match "/", just replace the surrounding "/"..."/" delimiters with "m{"..."}" instead. Also remove unnecessary surrounding brackets from single character character classes. These changes make it easier to understand the patterns. This is a cosmetic change only, no functional differences. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	4840dad1c9	Markdown.pl: next version is 1.1.15 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	3b5f256216	Markdown version 1.1.14 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	509770c743	Markdown.pl: do not start sublist in list marker line Given a list like so: * 1. item * 2. items * 3. items Each line contains two list item "markers". The first is an unordered list marker "*" and the second is an ordered list marker ("1.", "2." or "3."). However, a sublist (the "1.", "2." or "3.") may not start inside the actual marker line of the parent list. Make sure that any text on the same line following the initial list marker does *not* get recognized as the start of a sublist. In other words, make sure the above does _NOT_ produce: <ul> <li><ol><li>item</li></ol></li> <li><ol start="2"><li value="2">items</li></ol></li> <li><ol start="3"><li value="3">items</li></ol></li> </ul> But instead more correctly produces this output: <ul> <li>1. item</li> <li>2. items</li> <li>3. items</li> </ul> Recognizing a sublist start within any of the parent list's marker lines does not make sense. With this change this longstanding problem is corrected once again. The problem was first fixed in version 1.1.0 and remained fixed in version 1.1.1 but that fix caused too many problems and was reverted causing versions 1.1.2 through 1.1.13 to be broken again. The technique used to fix the problem this time differs from the previous attempt and will hopefully survive scrutiny keeping the problem resolved from now on. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	a1aa3e31db	Markdown.pl: tidy up some regex options The %g_escape_table hash gets initialized in a BEGIN block and then remains constant for the life of Markdown.pl. Add the "o" (compile pattern once) option to all the patterns that it appears in to perhaps squeeze out just a tiny little bit of extra performance if possible. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	d0ac5b3362	Markdown.pl: next version is 1.1.14 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	a85151b383	Markdown version 1.1.13 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	7e34ed6a63	Markdown.pl: adjust strip-comments options and defaults Make --strip-comments be an alias for --strip-comments-lax. Make --strip-comments-lax-only be the default whenever --sanitize is active (which is the default). Whenever possible, running Markdown.pl without any options should provide the best output possible by default. Turning what appear to be (at first glance) XML comments into plain text in the output clearly violates the principle of least surprise and can make for some very ugly pages. Similarly using a `--strip-comments` option and discovering those same plain text XML comments in the output also invites bewilderment. Instead, make `--strip-comments` be a short form of `--strip-comments-lax` and make `--strip-comments-lax-only` be the normal default. By doing this there are no ugly page surprises by default. Those pesky double hyphen sequences (`--`) that have furtively slipped into what were supposed to be strictly valid XML comments thereby making them strictly invalid XML comments are now rendered impotent by default. The output remains strictly valid XML uncontaminated by the surprise appearance of strictly invalid XML comments suddenly rendered as plain text due to accidental inclusion of a double hyphen sequence. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	c154f45386	Markdown: allow backticks-delimited code blocks in lists Using backticks-delimited code blocks in lists has apparently become rather widely used even though the original specification didn't seem to allow them within lists. Make the needed changes to allow this. The changes are actually rather minor. And do update the syntax document to reflect this change. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	f2f8a1e2fe	Markdown: support `\` EOL to generate a `<br />` Improve compatibility with some other markdown renderers and translate a backslash (`\`) at the very end of a line (a line that is not inside a table or code block) into a `<br />` in addition to the two-or-more-spaces at the end of line translation that already takes place and does the same thing. As expected, the backslash can be escaped by doubling it to preserve it (or by enclosing it in backquotes `\` to make it a code span). Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	d9f6613164	syntax.md: mention headers supersede horizontal rules Add a note to the syntax document mentioning that when using a line of solid hyphens (`-`s) for a horizontal rule, it will actually be treated as an H2 Setext-style header if the preceding line is not blank. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	94c8e6e7dd	Markdown.pl: wordsmith a help comment Clean up a little bit of awkward English in the help. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	9c7c5a0c11	Markdown.pl: do not mistake table for code block When making a nice looking table such as this: Term | Detail -------------- | -------------- First term | number one Second term | number two There is a potential to misinterpret the header line as the beginning of a code block (the indented type) since it begins with 5 spaces. Of course this could be addressed either by moving the "Term" string to the left at least 2 spaces or by adding the optional leading "|" to the beginning of the column, but that's unnecessarily ugly. Instead, when parsing a code block, check to see if the code block consists of exactly one line and when combined with the next line represents a valid table start. A valid table start specifies a header row and a separator row with exactly the same (positive integer) number of columns. If a valid table start is found, avoid making it into a code block and instead allow the table code to grab it and make it into a table. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	aa05222a09	Markdown.pl: correct minor quibble in DoCodeBlocks regex Correct a longstanding issue with the regex used when matching code blocks. Specifically the 4-spaces-indented kind of code block. The code block ends either at the end of the document or when a non-indented line is encountered. The pattern looking for the non-indented line actually allowed a match with up to the full 4-space indentation. It hasn't been a problem because the greedy matcher before that part of the pattern grabs any lines with 4 or more spaces of indentation. However, leaving the pattern as-is leaves it more ambiguous than necessary and leaves open more backtracking possibilities (although in this case the greedy matcher should prevent them being used). Correct the pattern to reflect the actual syntax and make that part of the pattern non-capturing to make the compiled pattern just that little bit slightly more efficient. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	97d6ad49e2	README: backquote some text in changes list for 1.1.12 The README file is meant to be valid Markdown. The changes listed for version 1.1.12 included: do not choke on <br></br> etc. Of course, that caused a "<br />" tag to end up in the rendered output. Quote the tags with backquotes like so: do not choke on `<br></br>` etc. This makes it render as intended in the xhtml output. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	7fce1577a8	Markdown.pl: next version is 1.1.13 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	5ebdc50649	Markdown version 1.1.12 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	53b4a58143	Markdown.pl: add missing space to implied attributes When sanitizing an attribute with an implied value such as "compact" or "checked", add the required space at the end to avoid mashing up against any other attribute that might be present. For example, <ol compact start=10> now becomes the correct: <ol compact="compact" start="10"> rather than the previously incorrect: <ol compact="compact"start="10"> Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	39e875e4f5	Markdown.pl: allow target="_blank" rel="nofollow" While other targets could, potentially, represent legitimate issues for concern, opening a new window generally does not since that's typically a readily available option in the user agent anyway when choosing to follow any individual link. While using target="_blank" does not really represent any security issue, it may be an annoyance issue, but that's something for the author to address, not the sanitizer. Although rel="nofollow" is _not_ part of the HTML 4 standard, it may be very useful to avoid "endorsing" sites that are being linked to. Since it does not introduce any risk of scripting issues or other hidden issues, go ahead and allow it too. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	6efa98325a	Markdown.pl: add limited tilde-delimited code block support To avoid conflicting (too much) with setext-style H3 headers that are delimited with a line of tildes, require exactly three tildes to introduce a tilde-delimited code block. And, while in there, clean up the backticks-delimited code blocks pattern a tiny amount and allow either kind of code block to be closed by more than the number of opening delimiters in addition to exactly the same number of opening delimiters. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	35c983c9f0	Markdown.pl: do not choke on <br></br> etc. Adjust code to properly handle "empty" tags that are written as an open plus closing tag but do not contain any whitespace in the opening tag. The code already properly handles turning <hr noshade></hr> into just <hr noshade="noshade" />, but it was failing to handle that when the opening tag did not contain any whitespace such as <br></br>. Adjust the code to return the proper value for the opening tag under such a condition so that it's handled properly. Previously a sequence such as <br></br> would fail as it would end up being turned into <br /></br> which then fails XML validation. Now it works properly and turns <br></br> into <br /> as it should have been doing all along. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	658edb6abf	Markdown.pl: clean up closing tag whitespace While closing tags are matched okay if they contain whitespace, that whitespace was not being cleaned up in a comparable way to the manner in which whitespace in an opening tag is being handled. Make whitespace in closing tags be handled the same way as whitespace in opening tags. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	c578bbfcfa	Markdown.pl: add more comment stripping options With --strip-comments-lax even strictly invalid XML comments will be stripped. With --strip-comments-lax-only only strictly invalid XML comments will be stripped. Allowing strictly invalid XML comments to pass through to the output would produce invalid XML. By default such invalid comments end up having their leading '<' escaped so that they become plain text in the output thereby avoiding making it invalid XML. However, if comments are being stripped out, there's no reason the standard cannot be relaxed a little bit since the output will remain valid XML as the comments will not be passed through to the output in that case. The two new options, --strip-comments-lax and --strip-comments-lax-only provide a choice of behavior, strip all comments including the strictly invalid ones, or just strip the strictly invalid ones. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	50422d1e28	Markdown.pl: better sanitization of href and src attributes Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	19c0131f03	Markdown.pl: do not choke on \n inside attribute values A tag such as this: <span style=" lots: of; stuff: in; here: now; "></span> Is perfectly valid. Add the missing "s" pattern match qualifier to make sure such attribute values do not end up getting mangled. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	86606c5a52	Markdown.pl: allow some overlooked table attributes For %cellhalign allow the overlooked 'char' and 'charoff' attributes. For table allow the overlooked 'frame' and 'rules' attributes. For table, tr, th and td allow the 'bgcolor' attribute. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	fde2382058	Markdown.pl: next version is 1.1.12 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	5ffe21ab63	Markdown version 1.1.11 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	6a118b8c53	Markdown.pl: format --help output properly Make the full help output use the correct formatting if available so it looks as nice as possible. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	62382f4e1b	Markdown.pl: add a help comment about literal html tags Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	cc044905ae	Markdown.pl: include yaml table lines for error msg lineno The line number mentioned in any error message gets generated by counting from the beginning of the non-yaml output. Of course, the final output will include any yaml table if generated. Adjust the line number in any error messages by the number of lines of preceding yaml table that will be included in the output. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	d05afd2cfb	Markdown.pl: convert named character entities by default Unless the new, heavily discouraged, `--keep-named-character-entities` option has been given, always convert known named character entities to their equivalent numerical entity. All strict XML validators will complain about anything other than the required-by-XML five entities (&amp; &lt; &gt; &quot; &apos;) unless an entity dictionary has been provided. In addition, some older XHTML clients do not grok the &apos; entity. Now only the universally supported four entities (&amp; &lt; &gt; &quot;) will be preserved by default. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	4e9eba45fa	Markdown.pl: add a --div option and corresponding API It can be very convenient to be able to wrap the contents in its own output "<div>". Add an option to do that with an underlying corresponding API option to match. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	53494a4bdc	Markdown API: eliminate problematic xmlcheck == 1 There was absolutely no benefit to passing in an xmlcheck value of 1 to the Markdown/ProcessRaw API. It was ignored and did NOT result in any checking. Change this so that any value other than a numeric 0 results in XML checking when calling the API. This makes the most sense and avoids creating obscure API bugs. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	7ce25f1ec9	Markdown.pl: format -h output properly Make the usage output use the correct termcap codes to look nice if they're available. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	38a41b7a8c	Markdown.pl: process "br" indicators at end of paragraph Normally there's no point to a "<br />" tag at the end of a paragraph as the end of the paragraph will force a break anyway. Unless that "br" tag contains a "clear='...'" attribute. Make sure that 3 or more spaces at the end of a paragraph actually turns into a "<br clear='all' />" tag but at the same time make 2 spaces at the end of a paragraph just go away as it serves no purpose. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	a9781245b3	Markdown.pl: update help description Add missing conjunction. Update example of document that fails with --raw-html but not --raw-xml. With the recent changes, the old example no longer fails. Use a different example that still fails. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	94e07af1e7	Markdown.pl: strip markup out of implicit anchors Each H1, H2, ... H6 generated courtesy of markdown markup has an implicit anchor assigned based on the content of the element. For example: # This is an _H1_ header Strip any inline markup (in this case the '_'s) out before creating the implicit anchor. With this change, the text used to generate the anchor for the above is just "This is an H1 header". There are a couple of additional places where text that might have inline markup gets turned into an identifier (implicit reference links such as [thing][] or [thing] and wiki links without an explicit link destination such as [[thing]]). Perform the same tag stripping for them too before trying to find the destination. Many links that should have connected previously now do. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	857a411dc5	Markdown.pl: allow "stuff" on end of ``` line Some @#%^@! are doing something like this: ```shell script blah blah blah ``` That was not previously matching because only one optional "word" was allowed trailing the opening "```" characters. The single optional "word" is supposed to be a file extension type. Clearly ".shell script" is _not_ a file extension! Relax the rule somewhat. Multiple "words" are now allowed but only the first will ever participate in choosing the syntax highlighting (which currently never happens anyway). Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	89cae62dd1	Markdown.pl: do not sequester top-level unmatched p When running _HashHTMLBlocks, there's a step where we "match any empty block tags that should have been paired." Exclude "p" from that list. Given a document like this: <p> text That isolated "p" was getting sequestered away into its own blob resulting in an output document like this: <p> </p><p>text</p> By removing "p" from the list of "empty block tags that should have been paired," we get this output instead: <p> text</p> A nice improvement. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	f9a023d56a	Markdown.pl: remove td, th, tr from thead and tfoot closers Although "thead" and "tfoot" do, indeed, have an optional closing tag, neither "td", "th" nor "tr" will auto-close them. Therefore remove "thead" and "tfoot" from the list of tags that "td", "th" and "tr" will auto-close. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	c06b59644b	Markdown.pl: add bdo to taga1p The "bdo" (Bi-Directional Override) container element always requires at least one attribute to be present for it to be valid. Specifically, in this case, the "dir" attribute. Add "bdo" to the `%taga1p` (TAGs requiring Attributes count of 1 Plus) hash to reflect this. A bare "<bdo>" will now be passed through to the output as "&lt;bdo>" when using the default options. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	845104c13a	Markdown.pl: improve handling of auto-closed p tags Given an input document like this: <div> <p> <pre>hi</pre> </p> </div> It will validate just fine in `--raw-xml` mode. However, in normal "html/xhtml" mode, the "pre" opening tag automatically closes the currently open "p" tag leading to this: <div> <p> </p><pre>hi</pre> </p> </div> Without further intervention, the closing "p" tag that was already there (just before the closing "div" tag), now has no matching open "p" tag to close anymore -- the corresponding open tag is now the open "div" section. Obviously the document fails to validate at this point. The naive fix simply has the closing tag that corresponds to the opening tag that caused the "p" to be auto-closed to then automatically re-open a "p" at that point producing this: <div> <p> </p><pre>hi</pre><p> </p> </div> While such a solution does work, it frequently ends up introducing extra unwanted "p" sections. Instead of reopening the "p" immediately upon seeing the closing tag that matches the opening tag that auto-closed the "p", simply set a "reopen p" flag. When the "reopen p" flag is set and suitable conditions are met, then go ahead and "reopen" a new "p" tag. The exact conditions are a bit of an heuristic at the moment but amount to clearing the "reopen p" flag when the next start tag is seen and inserting a new "p" at that time only if the open tag is a text level element opening tag. Alternatively, if the "reopen p" flag is currently set and some non-whitespace text shows up before seeing another open tag, re-open a new "p" at that point (and clear the "reopen p" flag). Finally, if the flag is currently set and a closing "p" tag appears, just discard it and clear the "reopen p" flag. Essentially this case has the effect of just moving the closing "p" tag. With these changes, the troublesome document now produces this: <div> <p> </p><pre>hi</pre> </div> An improvement on what came before. Some might argue that the empty "p" section ought to simply be omitted entirely. Perhaps. But there was an explicit open "p" tag in the text -- auto closing it is one thing -- removing an explicit open tag entirely is something else. Additionally, since the validator validates in a "streamy" way, that's much more difficult to accomplish since at the time the initial opening "p" has been seen there's not yet any information available about the fact it's about to be auto-closed while still not containing any text and it therefore gets emitted to the output. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	e004a5275c	Markdown.pl: do not leave remnant state lying around When commit `c86fea4089` ("Markdown: enhance link handling", 2019-10-20, markdown_1.1.8) did its thing, a new global (%g_anchors_id) was introduced to keep track of all the link ids being used/generated in order to better connect them up to the links meant to target them. Unfortunately, that hash was not getting cleared before processing each new document. While this is mostly not a problem when running from the command line since typically only one document ever gets processed at once, if more than one document is processed at a time, prior documents could affect the link fragment targets for subsequent documents. Correct the problem by properly resetting the global (along with all the others that are also reset) before processing a new document. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
Kyle J. McKay	32862223ad	Markdown.pl: make the default yaml API mode match CLI The default YAML mode from the command line shows unknown YAML options in a table prefixed to the output and applies the ones it recognizes. Make the API have the same default mode rather than silently discarding unknown YAML options and ignoring known ones. Signed-off-by: Kyle J. McKay <mackyle@gmail.com>	4 years ago
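
The `--us-ascii` entry (62c5916945) describes emitting every character with a code point above 127 as its numerical character entity so the output is strictly US-ASCII. A minimal sketch of that idea in Python follows; it is not Markdown.pl's Perl code. The helper name `to_us_ascii` is hypothetical, and the decimal entity form is an assumption, since the commit message does not say whether decimal or hexadecimal entities are used.

```python
def to_us_ascii(text: str) -> str:
    """Return text with every character above code point 127 replaced
    by a decimal numeric character reference (assumed form)."""
    # The standard 'xmlcharrefreplace' error handler substitutes &#NNN;
    # for each character the ASCII codec cannot encode.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Pure ASCII input passes through unchanged; other characters grow in size.
    print(to_us_ascii("<p>caf\u00e9</p>"))
```

As the commit message notes, the trade-off is size: each non-US-ASCII character expands to several bytes, while pure ASCII input is unaffected.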


231 Commits (05fcf802f68756d8a749550fd76c9883cda18871)