Manipulating text with sed

Multiple input-line functions

UNIX system pattern-matching operations typically work on a single line of input; grep(1), for example, cannot handle embedded newlines. sed, however, supplies three uppercase functions that deal specially with multiline pattern spaces.

Within a multiline pattern space, an embedded newline is matched by the sequence \n. The usual end-of-line notation ($) matches only the end of the pattern space (the terminal newline); embedded newlines before it are not matched by $. Likewise, the start-of-line notation (^) matches only the beginning of the pattern space, not the start of each embedded line.
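
As a minimal sketch of these matching rules, the following commands use the N function (described next) to build a two-line pattern space. The sample input and the shell variable names are illustrative assumptions. The inputs have an even number of lines so that N never runs out of input, a case that sed implementations handle differently.

```shell
# Build a two-line pattern space with N, then replace the embedded
# newline (matched by \n) with a space.
joined=$(printf 'one\ntwo\nthree\nfour\n' | sed -e 'N' -e 's/\n/ /')
printf '%s\n' "$joined"

# ^ and $ anchor to the pattern space as a whole, not to each line in it:
# the opening bracket goes before "a" and the closing bracket after "b".
anchored=$(printf 'a\nb\n' | sed -e 'N' -e 's/^/[/' -e 's/$/]/')
printf '%s\n' "$anchored"
```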

N: Appends the next input line to the current contents of the pattern space; the lines in the pattern space are separated by embedded newlines. A maximum of two addresses is permitted.
D: Deletes from the start of the pattern space all characters up to and including the first newline. If the pattern space becomes empty (the only newline being the terminal newline), another line is read from the input. After a D function, execution of the editing commands starts again from the top of sed's list of commands. A maximum of two addresses is permitted.
P: Prints from the start of the pattern space up to and including the first newline. A maximum of two addresses is permitted.

If there are no embedded newlines in the pattern space, the P and D functions are equivalent to their lowercase counterparts.
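
The three functions are often combined to slide a two-line window over the input. The following sketch uses a widely known idiom, not a method prescribed here, to drop consecutive duplicate lines. The sample data and variable names are assumptions made for illustration.

```shell
# P prints only the first line of a two-line pattern space.
first=$(printf 'alpha\nbeta\n' | sed -n -e 'N' -e 'P')
printf '%s\n' "$first"

# Sliding two-line window:
#   $!N  append the next line unless this is the last input line
#   P    print the first line unless both lines in the window are identical
#   D    delete the first line and restart the script with what remains
dedup=$(printf 'a\na\nb\nc\nc\n' | sed -e '$!N' -e '/^\(.*\)\n\1$/!P' -e 'D')
printf '%s\n' "$dedup"
```

On the last input line the window holds a single line with no embedded newline, so P and D behave like p and d there.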

The multiline functions on their own are not sufficient to match patterns that cross a line boundary. The problem is that an embedded newline may fall anywhere within the text being searched for. There are two ways to deal with this: either allow an optional newline between each pair of adjacent characters in the search string, or strip the \n characters out of the pattern space while searching.
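
Both approaches can be sketched on a fixed two-line window. The search string "cat" and the sample input, in which that string is split across the line boundary, are illustrative assumptions, and the sketch does not roll the window forward over longer input.

```shell
# Method 1: allow an optional newline between each pair of characters
# in the search string; the pattern space is printed unchanged.
m1=$(printf 'xxca\ntyy\n' | sed -n -e 'N' -e '/c\n*a\n*t/p')
printf '%s\n' "$m1"

# Method 2: strip the embedded newline first, then search for the
# plain string; the printed text has lost its line boundary.
m2=$(printf 'xxca\ntyy\n' | sed -n -e 'N' -e 's/\n//' -e '/cat/p')
printf '%s\n' "$m2"
```

The second command prints the altered text rather than the original lines, which shows why a copy of the original pattern space is needed in practice.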

The second method is preferred, but to carry out such a search it is necessary to discard scanned lines on a rolling basis; this requires the ability to make a temporary copy of the pattern space. Techniques for copying the pattern space are described in "Hold and get functions".