Markdown.pl: correct minor quibble in DoCodeBlocks regex

Correct a longstanding issue with the regex used when matching
code blocks, specifically the 4-spaces-indented kind of code
block.

The code block ends either at the end of the document
or when a non-indented line is encountered.

The pattern looking for the non-indented line actually allowed
a match with up to the full 4-space indentation.

It hasn't been a problem because the greedy matcher before that
part of the pattern grabs any lines with 4 or more spaces of
indentation.

However, leaving the pattern as-is leaves it more ambiguous than
necessary and leaves open more backtracking possibilities (although
in this case the greedy matcher should prevent them being used).

Correct the pattern to reflect the actual syntax and make that
part of the pattern non-capturing to make the compiled pattern
just slightly more efficient.
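
The difference between the old and new lookahead can be checked in
isolation. The following is a minimal sketch, not the full
_DoCodeBlocks pattern: it assumes an indent width of 4 (the case
described above), and the ends_block helper and the sample lines are
hypothetical names introduced only for this illustration.

```perl
use strict;
use warnings;

# Assumption: indent width of 4, matching the case in the commit message.
my $indent_width     = 4;
my $less_than_indent = $indent_width - 1;

# Only the terminating lookaheads, lifted out of the larger substitution.
my $old = qr/(?=^[ ]{0,$indent_width}\S)/m;        # allowed 0..4 spaces
my $new = qr/(?:(?=^[ ]{0,$less_than_indent}\S))/m; # allows 0..3 spaces only

# Hypothetical helper: true if $line would be treated as the
# non-indented line that terminates a code block.
sub ends_block {
    my ($re, $line) = @_;
    return $line =~ /\A$re/ ? 1 : 0;
}

for my $spaces (0 .. $indent_width) {
    my $line = (' ' x $spaces) . "text\n";
    printf "%d space(s): old=%d new=%d\n",
        $spaces, ends_block($old, $line), ends_block($new, $line);
}
# With exactly 4 spaces only the old lookahead matches; the new one
# leaves fully indented lines to the code-block part of the pattern.
```

The two patterns agree for 0 to 3 leading spaces and differ only for
a fully indented line, which is the ambiguity this commit removes.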

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
master
Kyle J. McKay 4 years ago
parent
commit
aa05222a09
Markdown.pl

@@ -3012,6 +3012,7 @@ sub _DoCodeBlocks {
#
my $text = shift;
+my $less_than_indent = $opt{indent_width} - 1;
$text =~ s{
(?:\n\n|\A\n?)
@@ -3021,7 +3022,7 @@ sub _DoCodeBlocks {
.*\n+
)+
)
-((?=^[ ]{0,$opt{indent_width}}\S)|\Z) # Lookahead for non-space at line-start, or end of doc
+(?:(?=^[ ]{0,$less_than_indent}\S)|\Z) # Lookahead for non-space at line-start, or end of doc
}{
my $codeblock = $1;
