Perl is a language optimized for scanning arbitrary text
files, extracting information from those text files, and
printing reports based on that information. It is also a
good language for many system management tasks. The
language is intended to be practical (easy to use,
efficient, complete) rather than beautiful (tiny, elegant,
minimal).
Perl combines some of the best features of C, sed, awk, and sh,
so people familiar with those languages should have little
difficulty with it.
(Language historians will also note some vestiges of csh,
Pascal, and even BASIC-PLUS.) Expression syntax corresponds
quite closely to C expression syntax. Unlike most Unix
utilities, Perl does not arbitrarily limit the size of your
data -- if you have enough memory, Perl can read in your whole
file as a single string. Recursion depth is unlimited and
the tables used by hashes (previously called
"associative arrays") grow as necessary to prevent degraded
performance. Perl uses sophisticated pattern matching
techniques to scan large amounts of data very quickly.
Although optimized for scanning text, Perl can also deal
with binary data, and can make dbm files look like hashes.
Setuid Perl scripts are safer than C programs because they
use a dataflow tracing mechanism (taint checking; see
perlsec(1)) that prevents many stupid security holes.
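As a rough illustration of reading a whole file into one string and tallying results in a hash, here is a minimal sketch. The sample text and the variable names are invented for the example, and the in-memory filehandle (opening a reference to a scalar) assumes a Perl recent enough to support it; with a real file you would pass a pathname instead.

```perl
use strict;

# Hypothetical sample data standing in for a real text file.
my $log = "alice login\nbob login\nalice logout\n";

# Open the string as if it were a file (assumes in-memory filehandle support).
open(my $in, '<', \$log) or die "cannot open in-memory file: $!";

# With the input record separator undefined, one read returns the whole file.
my $text = do { local $/; <$in> };
close($in);

# Count how many lines start with each user name; the hash grows as needed.
my %events_by_user;
$events_by_user{$1}++ while $text =~ /^(\w+)\s/mg;

for my $user (sort keys %events_by_user) {
    printf "%-8s %d\n", $user, $events_by_user{$user};
}
```

The `local $/` idiom limits the change to the enclosing block, so line-at-a-time reads elsewhere in the program are unaffected.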
If you have a problem that would ordinarily use sed or awk
or sh, but it exceeds their capabilities or must run a
little faster, and you do not want to use C,
then Perl may suit your needs.
The a2p and s2p translators are also provided to
turn your existing sed and awk scripts into Perl scripts.
Perl version 5 provides the following benefits:
Many usability enhancements
It is now possible to write much more readable Perl
code (even within regular expressions). Formerly
cryptic variable names can be replaced by mnemonic
identifiers. Error messages are more informative, and
the optional warnings will catch many of the mistakes a
novice might make. The value of these warnings cannot be
stressed enough: whenever you encounter mysterious behavior,
use the -w switch. It is recommended that you always use -w
when developing Perl scripts.
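The kind of mistake that -w catches can be sketched as follows. To keep the example self-contained, it turns warnings on by setting the `$^W` variable in a BEGIN block, which is assumed here to behave like starting perl with -w; the hash and its misspelled key are made up for illustration.

```perl
BEGIN { $^W = 1 }    # same global warning flag that the -w switch sets

# Collect warnings in an array instead of letting them go to standard error.
my @caught;
$SIG{__WARN__} = sub { push @caught, $_[0] };

my %price = (apple => 3);

# Typo: the key "aple" does not exist, so an undefined value is added.
my $total = $price{apple} + $price{aple};

print "total = $total\n";
print "warning: $_" for @caught;
```

Without warnings enabled, the program would silently print a total of 3 and the misspelled key would go unnoticed.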
Simplified grammar
The new yacc grammar is half the size of the old
one. Many of the arbitrary grammar rules have been
regularized. The number of reserved words has been cut
by two-thirds. Despite this, nearly all old Perl scripts will
continue to work unchanged.
Lexical scoping
Perl variables may now be declared within a lexical
scope, much like auto variables in C. Not only is this
more efficient, but it contributes to better privacy
for "programming in the large". Anonymous subroutines
exhibit deep binding of lexical variables (closures).
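A small sketch of lexical scoping and closures follows. The subroutine name `make_counter` is hypothetical; the point is that each anonymous subroutine keeps its own private copy of the lexical variable it was created with.

```perl
use strict;

# Return an anonymous subroutine that remembers its own $count.
sub make_counter {
    my $count = shift;               # lexical: visible only inside this call
    return sub { return $count++ };  # closure over $count
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# The two counters advance independently of each other.
my @seen = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());
print "@seen\n";    # prints "1 2 10 3"
```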
Arbitrarily nested data structures
Any scalar value, including any array element, may now
contain a reference to any other variable or
subroutine. You can easily create anonymous variables
and subroutines. Perl manages your reference counts
for you.
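For example, references make it possible to build a hash of hashes of arrays and to store an anonymous subroutine in a scalar. The data below is invented for illustration, loosely based on manual page names from this document.

```perl
use strict;

# A hash whose values are references to anonymous hashes,
# which in turn hold references to anonymous arrays.
my %manual = (
    perlre  => { summary => 'Regular expressions', related => ['perlop'] },
    perlref => { summary => 'References', related => ['perldsc', 'perllol'] },
);

# A scalar holding a reference to an anonymous subroutine.
my $describe = sub {
    my $name = shift;
    return "$name: $manual{$name}{summary}";
};

# Dereference the inner array to extend it in place.
push @{ $manual{perlre}{related} }, 'perlfunc';

print $describe->('perlref'), "\n";                  # prints "perlref: References"
print scalar @{ $manual{perlre}{related} }, "\n";    # prints "2"
```

No explicit allocation or freeing appears anywhere; the anonymous structures are released when nothing refers to them any longer.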
Modularity and reusability
The Perl library is now defined in terms of modules.
A Perl module is a reusable software component
which can be easily shared among various packages,
whereas a Perl script is standalone.
A package may choose to import all or a portion of a
module's published interface. Pragmas (compiler directives)
are defined and used by the same mechanism.
See
http://www.perl.com/CPAN-local
for more information.
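Importing part of a module's interface, and loading a pragma by the same mechanism, might look like the sketch below. File::Basename is used only as a convenient example of a module distributed with Perl; the pathname is illustrative.

```perl
use strict;                         # a pragma, loaded with the same "use" mechanism
use File::Basename qw(basename);    # import only basename from the module

my $path = '/usr/gnu/lib/perl5/strict.pm';

my $file = basename($path);
print "$file\n";                    # prints "strict.pm"

# Names that were not imported remain reachable by their full package name.
my $dir = File::Basename::dirname($path);
print "$dir\n";                     # prints "/usr/gnu/lib/perl5"
```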
Object-oriented programming
A package can function as a class. Dynamic multiple
inheritance and virtual methods are supported in a
straightforward manner and with very little new syntax.
Filehandles may now be treated as objects.
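The following sketch shows a package acting as a class, with inheritance through `@ISA` and a method overridden in the subclass. The class names `Animal` and `Dog` are hypothetical and exist only for this example.

```perl
use strict;

package Animal;

# Constructor: bless an anonymous hash into the class it was called on.
sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}

sub name  { my $self = shift; return $self->{name} }
sub sound { return 'some sound' }

# sound() is resolved at run time, so subclasses can override it.
sub speak {
    my $self = shift;
    return $self->name . ' says ' . $self->sound;
}

package Dog;

our @ISA = ('Animal');    # Dog inherits from Animal
sub sound { return 'Woof' }

package main;

my $dog = Dog->new(name => 'Rex');
print $dog->speak, "\n";    # prints "Rex says Woof"
```

The only syntax beyond ordinary packages and subroutines is `bless`, `@ISA`, and the `->` method call.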
Embeddable and extensible
Perl may now be embedded easily in your C or C++
application, and can either call or be called by your
routines through a documented interface. The XS
preprocessor is provided to make it easy to glue your C
or C++ routines into Perl. Dynamic loading of modules
is supported, and Perl itself can be made into a
dynamic library.
POSIX compliant
A major new module is the POSIX module, which provides
access to all available POSIX routines and definitions,
via object classes where appropriate.
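As a brief sketch of using the POSIX module, the example below imports three of its math routines by name. The input values are arbitrary.

```perl
use strict;
use POSIX qw(floor ceil fmod);    # import only the routines used below

my $x = -2.5;

printf "floor(%s) = %d\n", $x, floor($x);    # prints "floor(-2.5) = -3"
printf "ceil(%s)  = %d\n", $x, ceil($x);     # prints "ceil(-2.5)  = -2"
printf "fmod(7, 3) = %d\n", fmod(7, 3);      # prints "fmod(7, 3) = 1"
```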
Package constructors and destructors
The new BEGIN and END blocks provide means to capture
control as a package is being compiled, and after the
program exits. At the simplest level, they work just
like BEGIN and END in awk
when you use the -p or -n switches.
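The ordering that BEGIN and END impose can be sketched as follows. The array `@order` is a hypothetical name used only to record when each piece of code runs.

```perl
use strict;

our @order;

# Ordinary statements run only after the whole file has been compiled.
push @order, 'main program';

# A BEGIN block runs as soon as it has been compiled, so it executes
# before the push above even though it appears later in the file.
BEGIN { push @order, 'BEGIN block' }

# An END block runs once, as the program is exiting.
END { print "END block: program is exiting\n" }

print join(' -> ', @order), "\n";    # prints "BEGIN block -> main program"
```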
Multiple simultaneous DBM implementations
A Perl program may now access DBM, NDBM,
SDBM, GDBM,
and Berkeley DB files from the same script
simultaneously. In fact, the old dbmopen interface has
been generalized to allow any variable to be tied to an
object class which defines its access methods.
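To illustrate tying a variable to a class, here is a minimal sketch of a hash whose keys are treated case-insensitively. The class name `CaseInsensitiveHash` is hypothetical, and only the access methods needed by this example are defined; a DBM-backed class would implement the same methods on top of a file.

```perl
use strict;

package CaseInsensitiveHash;    # hypothetical class, for illustration only

# Called by tie(); returns the object that sits behind the hash.
sub TIEHASH { my ($class) = @_; return bless {}, $class }

# Called for every store, fetch, and exists() on the tied hash.
sub STORE  { my ($self, $key, $value) = @_; $self->{lc $key} = $value }
sub FETCH  { my ($self, $key) = @_; return $self->{lc $key} }
sub EXISTS { my ($self, $key) = @_; return exists $self->{lc $key} }

package main;

tie my %config, 'CaseInsensitiveHash';

$config{ManPath} = '/usr/gnu/man';    # stored under "manpath"
print "$config{MANPATH}\n";           # prints "/usr/gnu/man"
```

Code that uses `%config` needs no knowledge of the class behind it; it reads and writes an ordinary-looking hash.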
Autoloaded subroutine definitions
The AUTOLOAD mechanism allows you to
autoload subroutine definitions and also to
define arbitrary semantics for undefined subroutine calls.
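A sketch of giving undefined subroutine calls a meaning of your own follows. The package `Report` and the subroutine names called on it are hypothetical; none of the called subroutines is defined, so every call is routed to AUTOLOAD.

```perl
use strict;

package Report;    # hypothetical package, for illustration only

our $AUTOLOAD;     # Perl sets this to the fully qualified name that was called

sub AUTOLOAD {
    my @args = @_;
    (my $name = $AUTOLOAD) =~ s/.*:://;    # strip the package prefix
    return if $name eq 'DESTROY';          # ignore implicit destructor calls
    return "$name(" . join(', ', @args) . ")";
}

package main;

# Neither subroutine exists; AUTOLOAD decides what each call means.
my $first  = Report::header('Perl', 5);
my $second = Report::footer();

print "$first\n";     # prints "header(Perl, 5)"
print "$second\n";    # prints "footer()"
```

A real autoloader would typically use the same hook to locate and compile the missing definition on first use instead of returning a string.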
Regular expression enhancements
Non-greedy quantifiers may now be specified and you can
group without creating a backreference. You can
now write regular expressions with embedded whitespace
and comments for readability. A consistent
extensibility mechanism has been added that is upwardly
compatible with all old regular expressions.
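The three enhancements named above can be sketched in a few lines. The sample strings are invented for the example.

```perl
use strict;

my $html = '<b>bold</b> and <i>italic</i>';

# Greedy: .+ runs to the last ">" in the string.
my ($greedy) = $html =~ /<(.+)>/;
# Non-greedy: .+? stops at the first ">".
my ($lazy)   = $html =~ /<(.+?)>/;

print "$greedy\n";    # prints "b>bold</b> and <i>italic</i"
print "$lazy\n";      # prints "b"

# The x modifier permits whitespace and comments inside the pattern;
# (?: ... ) groups alternatives without creating a backreference.
my $date = qr{
    (\d{4})        # year, first capture
    (?: - | / )    # separator, grouped but not captured
    (\d{2})        # month, second capture
}x;

my ($year, $month) = 'backup taken 1997/05' =~ $date;
print "$year $month\n";    # prints "1997 05"
```

Because the separator group does not capture, the month is the second capture and not the third.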
Innumerable unbundled modules
The Comprehensive Perl Archive Network (CPAN) described in the
perlmodlib(1) manual page contains hundreds of plug-and-play
modules full of reusable code.
See
http://www.perl.com/CPAN-local
for more information.
Compilability
While not yet in full production mode, a working perl-to-C
compiler does exist. It can generate portable
byte code, simple C, or optimized C code.
Documentation
For ease of access, the Perl reference documentation is split into a
number of separate manual pages:
perl(1)
Overview (the version of this manual page in the GNU distribution)
perldelta(1)
Changes since previous version
perlfaq(1)
Frequently asked questions
perldata(1)
Data structures
perlsyn(1)
Syntax
perlop(1)
Operators and precedence
perlre(1)
Regular expressions
perlrun(1)
Execution and options
perlfunc(1)
Built-in functions
perlvar(1)
Predefined variables
perlsub(1)
Subroutines
perlmod(1)
How modules work
perlmodlib(1)
How to write and use modules
perlform(1)
Formats
perllocale(1)
Locale support
perlref(1)
References
perldsc(1)
Introduction to data structures
perllol(1)
Data structures: lists of lists
perltoot(1)
Object-oriented (OO) tutorial
perlobj(1)
Objects in Perl
perltie(1)
Objects hidden behind simple variables
perlbot(1)
OO tricks and examples
perlipc(1)
Interprocess communication
perldebug(1)
Debugging Perl
perldiag(1)
Diagnostic messages
perlsec(1)
Security in Perl
perltrap(1)
Traps for the unwary
perlstyle(1)
Style guide
perlpod(1)
Plain old documentation
perlbook(1)
Books about Perl
perlembed(1)
Embedding perl in C or C++ applications
perlapio(1)
Internal I/O abstraction interface
perlxs(1)
Application programming interface to the XS preprocessor
perlxstut(1)
XS tutorial
perlguts(1)
Internal functions for extensions
perlcall(1)
Calling conventions from C
The manual pages are listed in this order, rather than alphabetically,
to reduce the number of forward references.
The following ancillary manual pages are also provided:
a2p(1)
Convert awk to Perl
c2ph(1)
Dump C structures generated by cc
h2ph(1)
Convert C header files to Perl header files
h2xs(1)
Convert C header files to Perl extensions
perlbug(1)
How to submit Perl bug reports
perldoc(1)
Look up Perl documentation in pod format
perltoc(1)
Table of contents for Perl documentation
pl2pm(1)
Convert Perl 4 files to Perl 5 modules
pod2html(1)
Convert pod to html
pod2man(1)
Convert pod to manual pages
pstruct(1)
Dump C structures generated by cc
splain(1)
Force verbose warning diagnostics
s2p(1)
Convert sed to Perl
xsubpp(1)
Convert Perl XS code to C
By default, all the above manual pages are installed in the
/usr/gnu/man directory hierarchy.
Additional documentation for Perl modules is located
in the /usr/gnu/lib/perl5/man directory hierarchy.
Some of this additional documentation is part of the Perl distribution,
but you will also find documentation for third-party modules there.
To view the Perl manual pages using
man(1),
include these directories in the MANPATH
variable defined in /etc/default/man or in the shell environment.
You can also view module information with the
perldoc(1) script, specifying its -t option.
Files
/tmp/perl-e$$
temporary files generated by -e commands
/usr/gnu/lib/perl5
location of the directory hierarchy for the Perl libraries
Usage
See the perlrun(1) manual page
for a description of the command-line
switches and environment variables understood by perl.
Diagnostics
Use the -w switch to produce diagnostics.
See the perldiag(1) manual page for an explanation
of Perl diagnostics.
Compilation errors tell you the line number of the
error, together with an indication of the next token or token type
that was to be examined. (In the case of a script passed to
Perl via -e switches, each -e is counted as one line.)
Setuid scripts have additional constraints that can produce
error messages such as "Insecure dependency". See the
perlsec(1) manual page for more information.
Some pathnames given in the Perl manual pages (/usr/local/)
do not match where Perl is installed on a UnixWare® 7 system
(/usr/gnu/).
Perl uses the system definitions of
various operations such as type casting, atof, and
floating-point output with sprintf.
Under UnixWare 7, Perl requires a seek or eof
between reads and
writes on a particular stream. (This does not
apply to sysread and syswrite.)
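The seek requirement can be satisfied as in the sketch below, which reads from and then writes to the same stream. It assumes the File::Temp and Fcntl modules are available to create a scratch file and supply the seek constants; the file contents are made up.

```perl
use strict;
use File::Temp qw(tempfile);    # assumed available; creates a scratch file
use Fcntl qw(:seek);            # SEEK_SET, SEEK_CUR, SEEK_END

# Scratch file opened for both reading and writing, removed at exit.
my ($fh, $path) = tempfile(UNLINK => 1);

print $fh "line one\nline two\n";
seek($fh, 0, SEEK_SET) or die "seek failed: $!";    # write -> read

my $first = <$fh>;                                  # reads "line one\n"

# Seek to the current position: moves nowhere, but it separates
# the read above from the write below, as the stream requires.
seek($fh, 0, SEEK_CUR) or die "seek failed: $!";    # read -> write
print $fh "LINE TWO\n";                             # overwrites the second line

seek($fh, 0, SEEK_SET) or die "seek failed: $!";    # write -> read
my @lines = <$fh>;
print @lines;
close($fh);
```

Programs that use sysread and syswrite bypass the buffered stream and so do not need the intervening seek.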
While none of the built-in data types have any arbitrary
size limits (apart from memory size), there are still a few
arbitrary limits:
a given variable name may not be longer
than 255 characters
no component of your PATH may be
longer than 255 characters if you use the -S switch
a regular expression may not
compile to more than 32767 bytes internally
Mail bug reports to perlbug@perl.com, including the full
configuration information output by the myconfig program
in the Perl source tree or by perl -V.
If you have succeeded in compiling perl,
the perlbug script in the utils
subdirectory can be used to help mail a bug report.
Perl actually stands for "Pathologically Eclectic Rubbish
Lister", but Larry asks that you don't tell anyone that he said that.
The Perl motto is "There's more than one way to do it".
Divining how many more is left as an exercise to the reader.
The three principal virtues of a programmer are "laziness,
impatience, and hubris". For an explanation, see the "Camel Book".
Author
Larry Wall
(larry@wall.org)
with the help of many others.
If your Perl success stories and testimonials may be of help
to others who wish to advocate the use of Perl in their
applications, or if you wish to simply express your
gratitude to Larry and the Perl developers, please mail
perl-thanks@perl.org.