|
|
#include <stdlib.h>void *calloc(size_t nelem, size_t elsize); void free(void *ptr); void *malloc(size_t size); void *realloc(void *ptr, size_t size);
void *memalign(size_t alignment, size_t size); void *valloc(size_t size);
struct mallinfo mallinfo(void);
free- checking memory allocator
realloc- checking memory allocator
memalign- checking memory allocator
valloc- checking memory allocator
mallinfo- checking memory allocator
These functions provide a simple general-purpose memory allocation package. An allocation request is made by a call to one of the functions that return void *; the pointer values returned are the start (lowest byte address) of disjoint objects, suitably aligned to serve as an array of one or more objects totalling the requested size. Except for calloc, the initial contents of these objects are indeterminate. The ptr argument (free and realloc) must be either a null pointer or a pointer value returned by a previous allocation request which has not been deallocated by an intervening call to free or realloc.
calloc allocates space for an array of nelem elements each of elsize bytes. All bytes of this space are initialized to zero.
free deallocates (makes available for further allocation) the space pointed to by ptr. If ptr is a null pointer, no action occurs.
malloc allocates space for an object of size bytes.
realloc changes the size of the object pointed to by ptr to size bytes. Note that the object may have been moved, in which case the old version of the object has been freed. The object's contents are unchanged up to the lesser of the old and new sizes. realloc(NULL,size) is a convenient equivalent to malloc(size).
memalign allocates space for an object of size bytes with the specified alignment, which must be a power of two. (The address returned is a multiple of alignment.)
valloc(size) is equivalent to memalign(sysconf(_SC_PAGESIZE),size).
mallinfo returns a structure that describes the current status of the allocation arena. The structure has at least the following members:
size_t arena; /* total arena size in bytes */ size_t ordblks; /* number of full-sized allocations */ size_t smblks; /* number of small-sized allocations */ size_t hblks; /* number of small-sized allocation containers */ size_t hblkhd; /* overhead for these containers, in bytes */ size_t usmblks; /* total bytes of in-use small-sized allocations */ size_t fsmblks; /* total bytes of available small-sized allocations */ size_t uordblks; /* total bytes of in-use full-sized allocations */ size_t fordblks; /* total bytes of available full-sized allocations */
Besides a lack of available space from the system, both realloc and memalign also can return null pointers due to invalid arguments.
If the environment variable MALLOC_FILENO exists with a decimal integer value, messages are written to that file descriptor instead of standard error (or 2). This allows these messages to be separated from any of the application's error messages, for example.
The usage statistics and checking code, when included, are enabled only through separate environment variables, respectively, MALLOC_STATS and MALLOC_CHECKS. Both variables are expected minimally to be decimal integers; a zero value disables, a positive value enables, and generally a larger value enables more detail. In the case of MALLOC_CHECKS, a negative value enables the ``memory fault'' feature. Both variables also accept, after the initial decimal value, a comma-separated sequence of keywords used to specify other features. See ``Keyword features''
Another cause of spare bytes is an enforced minimum block size. To implement a quick best-fit allocation policy, the initial portion of the data area in free blocks is used for internal data structures. The amount of space this requires determines the size of the smallest full-sized block. But, since a minimum block size can cause a disproportionally large amount of space to be wasted for small allocation requests, the implementation also provides for small-sized blocks with some restrictions, such as no coalescing with neighboring free blocks.
A limited view of the allocation arena's current status with respect to full- and small-sized blocks is available from mallinfo.
After an assertion failure is reported, the package usually continues processing as if the internal assertions had not been included (no corrective action takes place) but, if either (or both) of the statistics or checking code has been enabled, it will attempt to terminate the process through a call to abort.
As mentioned above, note that assertions are not enabled in the product.
Numeric values are printed in hexadecimal (base 16) for pointers,
while sizes are printed in decimal.
Also, size arguments are presented in a before and after form:
incoming=>normalized-header
where normalized is the incoming value after
including the block header overhead
and rounding up to the minimum internal alignment.
``@caller'' represents the ``return address'' that will be jumped to when the function returns. With help from a debugger, for example, this address can be associated with a calling function. It is present when MALLOC_STATS is at least two. The return address will instead be displayed in a potentially more usable form when the symbolic keyword is specified. See ``Keyword features''.
Except for free, a ``->return'' pointer value will be shown.
Each ``{fact}'' gives some implementation internal detail--they are not intended to be generally useful--none of which will be present unless MALLOC_STATS is at least three.
However,
one typically useful exception is the true block size:
{max=size-header}
which displays the block's actual size less its header.
The true block size is printed just after both
the return value and an incoming ptr argument.
There are three checking levels: basic-fill, safe-copy, and added-space, which correspond to MALLOC_CHECKS values of at least one, three, and five, respectively. Fundamental to this package's approach to checking is that it can be fully enabled without affecting the allocation behavior, at least until a diagnostic line is printed. This means that whatever presumed allocation arena corruption is occurring within an application with checking disabled should be reproduced after enabling checking. Only at the added-space checking level will the allocation behavior be changed, which is its purpose.
When checking is enabled, a call to mallinfo will also cause an examination of the whole allocation arena. For each of the checking levels, the next higher MALLOC_CHECKS value (2, 4, and 6, respectively) also cause a similar walk of the entire arena at the start of each of the main functions.
At the basic-fill checking level, shape checks on the ptr argument to free and realloc are performed -- it must be correctly aligned, fall somewhere within the allocation arena, and it should be an allocated (in-use) block. These checks produce the following problem descriptions for free and realloc:
free() of invalid block block multiply free()dThe distinction between the last two reflects the implementation's ``temporary holding slot'' behavior in which an attempt is made not to immediately modify a block's data area when deallocated. (This is primarily a hold-over from much older implementations which guaranteed this behavior!) An attempt to realloc a deallocated block will fail unless the block happens to be found in the holding slot.realloc() of invalid block realloc() of free()d block realloc() of just free()d block
Also at the basic-fill level, all available (deallocated) or freshly allocated data area bytes are filled with a ``noise'' value of ``0xCA''. When an available block is handled by the implementation (such as just before it is allocated), its data area is examined for bytes other than ``0xCA''. If one is found, a diagnostic is printed with the problem description
free()d block was modified.
At the safe-copy level, the implementation also creates and manages a separate and parallel section of memory (via mmap) into which safe copies of the allocation arena's block headers are maintained along with a record of at least the block's last-allocation owner (see ``@caller'' and ``/owner @addr'' above) and usually the number of spare bytes present in the block's data area. At this level, when a block is handled, its header is checked against the corresponding copy -- which can result in a problem description of
invalid block addressif the block is not aligned, or
not an allocated blockif there is no corresponding copy, or
block header was modifiedif the copy differs -- and any spare data area bytes (for blocks passed to free and realloc or other in-use blocks) are examined for bytes other than ``0xCA'' -- which can result in a problem description of
spare block space was modified
At the added-space level, the implementation also increases each allocation request by an additional minimum alignment to force the presence of spare bytes in every in-use block. These spare bytes provide a little more protection from typical one-too-many misuse as well as better tracking of the corruption owner.
Negative values for MALLOC_CHECKS cause these allocation routines to employ an entirely different implementation. The heart of this alternate approach is that every allocation gets at least one complete page of memory. This means that a lot of memory can be allocated very quickly, even for otherwise small programs. The benefit of this approach is that programs are much more likely to generate an early memory fault when an allocation is misused. Levels -1 through -8 are provided.
At the -1 level, the last byte of an allocation request is the last byte of the (last) page allocated and released allocations are kept as inaccessible as long as possible. The -3 level differs from -1 in that the first byte of an allocation request is the first byte of the (initial) page allocated. Levels -5 and -7 differ, respectively, from levels -1 and -3 in that released allocations are returned to the system for reuse immediately. Bytes are filled with the same ``0xCA'' noise value as is used with positive MALLOC_CHECKS values.
Also, just as with the positive MALLOC_CHECKS values, the corresponding even level (-4 for -3, for example) is just the same, except that a walk of the entire arena is performed each time one of these functions is called. This can add noticeably to the runtime of programs making heavy use of dynamic memory allocation functions.
When MALLOC_CHECKS is negative, MALLOC_STATS will not provide higher than level 2 information, and even that information is somewhat misleading given that there is no longer a header that comes just in front of each allocation.
It is common to enable these diagnostic checks while running an application under the control of a symbolic debugger. With this in mind, the implementation calls a local (internal linkage) function, modified, just before the three ``... was modified'' diagnostics so that the modified contents can be examined with the debugger before the corrupted header or data area is reset. modified is called with three arguments, a void * that points to the start of the detected corruption, a size_t that specifies the number of bytes (remaining) to be checked including the one pointed to, and a const char * that points to a short string that denotes which kind of modification was found: spare, header, or free()d.
Two other local functions are useful when debugging: allocassert, which is called with the assertion expr string and num, and checkmsg, which is called with a block's data area pointer and the problem description string. Finally, two local integers, announcing and checking hold, respectively, the value of the MALLOC_STATS and MALLOC_CHECKS variables, so modifying these two can permit finer-grained control over these facilities. Check with your preferred symbolic debugger about how one refers to local identifiers.
If you are using debug(1), the best way to run a program prog with checking enabled is first to enter debug without naming prog. Then before starting prog, set and export $MALLOC_CHECKS as desired. Then start prog. You can set a breakpoint on modified, for example, by specifying stop malloc.c@modified.
The pid and tid keywords cause insertion of, respectively, a process or thread ID (or both) at the end of any emitted diagnotic or statistics line. The process ID is a decimal number immediately preceded by a ``#''. The thread ID is a decimal number between angle brackets ``<>''.
The nullptr keyword causes most attempts at dereferencing a null pointer to cause the generation of a memory fault (a SIGSEGV will be sent to the process). This differs from the nullptr(1) command in that it is controlled through an environment variable (and thus can be enabled separately for different processes) and that it does not go into effect until the first call to one of these functions. Particularly note that because touching page zero is necessary before it can be made unreadable, if nullptr has been used to make page zero unreadable generally, use of the nullptr keyword will cause an immediate memory fault.
The zero keyword causes calls to malloc requesting zero bytes and calls to free with a null pointer to be reported as mistakes if MALLOC_CHECKS has been enabled. Nevertheless, malloc(0) will continue to return a pointer to a unique zero-sized block, so this keyword does not cause any change in the runtime behavior of these functions.
The symbolic keyword causes the display of ``text'' addresses (see ``@caller'' and ``@addr'' above) to instead be shown as
by module@function+offsetor as
by module@address
when no function can be found. An example of each would be
by libc.so.1@_strdup+18 by ls@0x8301b87
The offset is a decimal number giving the distance in bytes from the start of the named function. This symbolic information is determined by use of dladdr(3C) and thus is restricted to functions in runtime symbol tables. Since, by default, only a minimal set of functions are exported from executables, often addresses in the executable will not find a function, or (worse) will report a function that comes earlier in the address space, but isn't the actual function. One can cause all functions with external linkage to be exported by passing the -Bexport option to the linker when creating the executable. With the cc(1) or CC(1C++) commands, one does this by specifying -Wl,-Bexport.
Both the full and leak keywords cause one-per-line listing of the currently-allocated blocks, they differ in when they are triggered: The full keyword will cause the list to be printed just before an allocation request that will result in the return of a null pointer due to insufficient space. The leak keyword will cause the list to be printed just before the process goes away for a normal process exit, not when a process dies because it received an unhandled signal, for example. Note that the full keyword can generate many lines of output. Also, it isn't necessarily a bug for a block still to be allocated at process exit.
Each line output by these keywords is of the form
allocated block addr (n bytes + m) ownerwhere addr is the returned block address, n is the size of the block (or if + m is present, n is the number of bytes requested and m is the spare bytes -- this information is only available when MALLOC_CHECKS is negative or at least 3), and owner is the ``return address'' whose display depends on the whether the symbolic keyword is specified.
For example, to see what blocks are live after the end of a simple grep(1) invocation, try the following:
$ MALLOC_CHECKS=3,leak,sym grep x /dev/null allocated block 0x8303078 (8 bytes + 4) by libc.so.1@_strdup+32 allocated block 0x8303090 (8 bytes + 4) by grep@_rt_pre_init+160 allocated block 0x83030a0 (12 bytes) by libc.so.1@posnfoll+316 allocated block 0x83030b0 (12 bytes) by libc.so.1@_regdfacomp+244 allocated block 0x83030c0 (12 bytes) by libc.so.1@_regdfacomp+314 allocated block 0x8303358 (8524 bytes) by libc.so.1@_regdfacomp+98 allocated block 0x83054a8 (51 bytes + 1) by libc.so.1@_regdfacomp+169 allocated block 0x8305860 (9215 bytes + 5) by grep@_rt_pre_init+160
This shows that there are eight blocks still allocated at process exit, two of which were allocated by grep, the rest by the C library on behalf of grep. The function that actually allocated space in grep is not _rt_pre_init, that just happens to be the nearest function exported by grep. None of the above allocations actually represents a memory leak.
The C library internal header inc/mallint.h describes the tunable constants and data structures for gen/malloc.c. The main data structure Tree declares just enough information for blocks in the top-down splay tree that is used for the free blocks. This structure for IA32 is 20 bytes big, but the smallest full-sized block is 24 bytes because sizes are rounded up to multiples of eight. For small-blocks, only the block header and next pointers are assumed to be present.
Unlike earlier implementations, this approach does not force the same base alignment requirements on block headers as it does for the user pointers. A block header occurs in the HEADERSIZE bytes that come just before the aligned user pointer. Also, the basic alignment is eight, not four, so that three low-order bits are available for block status. Note that this means that the block size cannot be the number of data area bytes since the size must be a multiple of eight -- HEADERSIZE need not be a multiple of eight; this code uses a block size that is the distance from this block's header to the next.
The third status bit signifies whether a freed block is in the free tree or on a quick list. It is the quick lists that give this implementation its speed since, in practice, by far the greatest number of calls are for smallish allocations. These lists cover the small-sized block pools without needing (much) special case code. Generally, the implementation tries to keep blocks at or below the MAXQUICKSIZE line on these lists; only when the implementation would otherwise have to request more space from the system does it flush the quick lists into the free tree.
This implementation also differs from older ones in that it can return memory to the system. This turns out to be quite important for long-lived processes, such as daemons, that have allocation peaks, but do not need nearly as much memory while waiting for requests.