UNIX system pattern-matching operations typically work on a single line of input; grep(1), for example, cannot handle embedded newlines. sed, however, supplies three uppercase functions (N, D, and P) that deal specially with multiline pattern spaces.
Within a multiline pattern space, an embedded newline is matched by (\n). The usual end-of-line notation ($) matches only at the end of the pattern space; it does not match before the preceding embedded newlines. Likewise, the start-of-line notation (^) matches only at the beginning of the pattern space.
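The matching rules above can be seen in a small sketch. It assumes a POSIX-style sed and uses the uppercase N function to pull the next input line into the pattern space; the two-line input and the variable names are purely illustrative.

```shell
# Build a two-line pattern space with N, then replace the embedded
# newline, which the regular expression addresses as \n.
joined=$(printf 'alpha\nbeta\n' | sed -e N -e 's/\n/ + /')
printf '%s\n' "$joined"
# alpha + beta

# ^ and $ anchor to the pattern space as a whole, not to each line in it,
# so the brackets land only at the very start and the very end.
anchored=$(printf 'alpha\nbeta\n' | sed -e N -e 's/^/[/' -e 's/$/]/')
printf '%s\n' "$anchored"
# [alpha
# beta]
```

Note that no bracket appears around the embedded newline in the second result: neither ^ nor $ matched there.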
The multiline functions on their own are not sufficient to match patterns that cross a line boundary, because embedded newlines may fall anywhere within the text to be matched. There are two ways to deal with this: either allow an optional newline between each pair of adjacent characters in the search string, or strip the \n characters out of the pattern space while searching.
The second method is preferred, but to carry out such a search it is
necessary to discard scanned lines on a rolling basis: this requires
the ability to make a temporary copy of the pattern
space. Techniques for copying the pattern space are described in
``Hold and get functions''.
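Both approaches can be sketched with a rolling two-line window. This is only an illustration under stated assumptions: the input, the search string "abcdef", and the variable names are hypothetical; the window is limited to two lines, so a match spanning three or more lines would be missed; and the second command borrows the h (hold) and g (get) functions to keep a temporary copy of the pattern space.

```shell
# Hypothetical input: the string "abcdef" is split across lines 2 and 3.
input='xx
abc
def
yy'

# Method 1: allow an optional newline between adjacent characters.
# $!N appends the next line (except on the last line); D discards the
# first line of the window and restarts the cycle with what remains.
m1=$(printf '%s\n' "$input" |
    sed -n -e '$!N' -e '/a\n*b\n*c\n*d\n*e\n*f/p' -e D)

# Method 2: save a copy with h, strip the newlines, search the stripped
# text, then restore the copy with g before printing and before D.
m2=$(printf '%s\n' "$input" |
    sed -n -e '$!N' -e h -e 's/\n//g' \
        -e '/abcdef/{' -e g -e p -e '}' -e g -e D)

printf '%s\n' "$m1"
printf '%s\n' "$m2"
```

Each command prints the two lines that together contain the match (abc, then def). The second form keeps the search expression readable, at the cost of the save and restore steps around it.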