From aa05222a090deeadbe69aba79c035a209158e256 Mon Sep 17 00:00:00 2001
From: "Kyle J. McKay"
Date: Fri, 4 Jun 2021 19:40:32 -0700
Subject: [PATCH] Markdown.pl: correct minor quibble in DoCodeBlocks regex

Correct a longstanding issue with the regex used when matching code
blocks, specifically the 4-spaces-indented kind of code block.

The code block ends either at the end of the document or when a
non-indented line is encountered.

The pattern looking for the non-indented line actually allowed a
match with up to the full 4-space indentation.

It hasn't been a problem because the greedy matcher before that
part of the pattern grabs any lines with 4 or more spaces of
indentation.

However, leaving the pattern as-is leaves it more ambiguous than
necessary and leaves open more backtracking possibilities (although
in this case the greedy matcher should prevent them being used).

Correct the pattern to reflect the actual syntax and make that part
of the pattern non-capturing to make the compiled pattern just that
little bit more efficient.

Signed-off-by: Kyle J. McKay
---
 Markdown.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Markdown.pl b/Markdown.pl
index 641cef1..2d0de93 100755
--- a/Markdown.pl
+++ b/Markdown.pl
@@ -3012,6 +3012,7 @@ sub _DoCodeBlocks {
 #
 
     my $text = shift;
+    my $less_than_indent = $opt{indent_width} - 1;
 
     $text =~ s{
             (?:\n\n|\A\n?)
@@ -3021,7 +3022,7 @@ sub _DoCodeBlocks {
               .*\n+
             )+
           )
-          ((?=^[ ]{0,$opt{indent_width}}\S)|\Z)  # Lookahead for non-space at line-start, or end of doc
+          (?:(?=^[ ]{0,$less_than_indent}\S)|\Z) # Lookahead for non-space at line-start, or end of doc
         }{
             my $codeblock = $1;
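
The reasoning in the commit message can be checked with a small standalone script. This is only a sketch and not code from Markdown.pl: `$indent_width` stands in for `$opt{indent_width}`, `code_block` is a hypothetical helper, and the block pattern is a simplified assumption about the part of the regex that the diff does not show. It suggests that the old lookahead, taken in isolation, accepts a fully indented line, while inside the whole pattern both forms capture the same block because the greedy matcher has already consumed every indented line.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the indent_width option in Markdown.pl.
my $indent_width     = 4;
my $less_than_indent = $indent_width - 1;

# Old and new forms of the "non-indented line" lookahead, in isolation.
my $old_lookahead = qr/(?=^[ ]{0,$indent_width}\S)/m;
my $new_lookahead = qr/(?=^[ ]{0,$less_than_indent}\S)/m;

my $indented = "    still code\n";   # full indent: belongs to the block
my $plain    = "   not code\n";      # three spaces: ends the block

for my $line ($indented, $plain) {
    (my $shown = $line) =~ s/\n\z//;
    printf "%-16s old=%s new=%s\n", "'$shown'",
        ($line =~ $old_lookahead ? "match" : "no match"),
        ($line =~ $new_lookahead ? "match" : "no match");
}

# Simplified version of the whole pattern (an assumption, not the real one).
# The second argument is the upper bound used in the lookahead quantifier.
sub code_block {
    my ($text, $max_spaces) = @_;
    return $text =~ m{
        (?:\n\n|\A\n?)
        (                              # capture group 1: the code block
          (?:
            (?:[ ]{$indent_width})     # each line starts with a full indent
            .*\n+
          )+                           # greedy: takes every indented line
        )
        (?:(?=^[ ]{0,$max_spaces}\S)|\Z)
    }mx ? $1 : undef;
}

my $doc = "Intro paragraph.\n\n    line one\n    line two\nnext paragraph\n";

my $old_block = code_block($doc, $indent_width);
my $new_block = code_block($doc, $less_than_indent);

print "same block captured: ", ($old_block eq $new_block ? "yes" : "no"), "\n";
```

Only the isolated lookaheads disagree, and only on the fully indented line, which is consistent with the commit message describing the change as a tidy-up rather than a behavior fix.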