The UNIX memory management system can be thought of as a form of ``cache management'', in which a processor's primary memory is used as a cache for pages of the objects that make up the system's virtual memory. Accordingly, a number of operations control or interrogate the status of this ``cache''; they are described in this section.
int memcntl(caddr_t addr, size_t len, int cmd, caddr_t arg, int attr, int mask);
memcntl provides several control operations over mappings in the range [addr, addr + len), including locking pages into physical memory, unlocking them, and writing pages to secondary storage. The functions described in the rest of this section offer simplified interfaces to the memcntl operations.
int mlock(caddr_t addr, size_t len);
mlock causes the pages referenced by the mapping in the range [addr, addr + len) to be locked in physical memory. References to those pages (through other mappings in this or other processes) will not result in page faults that require an I/O operation to obtain the data needed to satisfy the reference. Because this operation ties up physical system resources and has the potential to disrupt normal system operation, use of this facility is restricted to the superuser. The system limits the number of pages that may be locked in memory simultaneously to a configuration-dependent value; a call to mlock fails if this limit would be exceeded.
int munlock(caddr_t addr, size_t len);
munlock releases the locks on physical pages. If multiple mlock calls are made through the same mapping, only a single munlock call is required to release the locks (in other words, locks on a given mapping do not nest). However, if different mappings to the same pages are locked with mlock, the pages stay locked until the locks on all the mappings are released.
Locks are also released when a mapping is removed, either through being replaced with an mmap operation or removed explicitly with munmap. A lock will be transferred between pages on the ``copy-on-write'' event associated with a MAP_PRIVATE mapping; thus locks on an address range that includes MAP_PRIVATE mappings will be retained transparently along with the copy-on-write redirection (see mmap above for a discussion of this redirection).
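The non-nesting rule for locks on a single mapping can be illustrated with a minimal sketch. It assumes a Linux-like system where mlock is declared in <sys/mman.h> with a void * address argument (rather than caddr_t) and where MAP_ANONYMOUS is available; lock_twice_unlock_once is a hypothetical helper name, not a system interface. On systems that restrict locking, the calls may fail with errors such as EPERM or ENOMEM, which the helper simply reports.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: lock the same private mapping twice, then release
 * it with a single munlock, since locks on one mapping do not nest.
 * Returns 0 on success, or the errno value of the first failing call.
 */
int lock_twice_unlock_once(size_t npages)
{
    assert(npages > 0);

    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return EINVAL;
    size_t len = npages * (size_t)pagesize;
    int err = 0;

    /* An anonymous private mapping gives a page-aligned range to lock. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return errno;
    memset(buf, 0, len);

    if (mlock(buf, len) != 0) {
        err = errno;                 /* e.g. EPERM or ENOMEM */
    } else {
        /* A second lock through the same mapping adds no extra lock. */
        if (mlock(buf, len) != 0)
            err = errno;
        /* One munlock is enough to release the range. */
        if (munlock(buf, len) != 0 && err == 0)
            err = errno;
    }

    /* Removing the mapping would also have released any remaining lock. */
    munmap(buf, len);
    return err;
}
```

A caller might invoke lock_twice_unlock_once(1) and treat a nonzero return as a sign that the process lacks the privilege or resource limit needed to lock memory.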
int mlockall(int flags);
int munlockall(void);
mlockall and munlockall are similar in purpose and restriction to mlock and munlock, except that they operate on entire address spaces. mlockall accepts a flags argument built as a bit-field of values from the set:
MCL_CURRENT | Current mappings |
MCL_FUTURE | Future mappings |
munlockall removes all locks on all pages in the address space, whether established by mlock or mlockall.
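As a rough sketch of how the two calls pair up, the hypothetical helper below locks the address space with the requested flags and then releases every lock again. It assumes the prototypes and MCL_ constants come from <sys/mman.h>; the helper name and its error convention are illustrative only.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: lock the whole address space according to flags
 * (MCL_CURRENT, MCL_FUTURE, or both), then remove all locks again.
 * Returns 0 on success, or the errno value of the failing call.
 */
int lock_address_space_briefly(int flags)
{
    /* Only the two documented flag bits are accepted, and at least one. */
    assert(flags != 0);
    assert((flags & ~(MCL_CURRENT | MCL_FUTURE)) == 0);

    if (mlockall(flags) != 0)
        return errno;            /* e.g. EPERM or ENOMEM */

    /* ... work that must not incur paging I/O would run here ... */

    /* Removes locks set by mlockall and by any earlier mlock calls. */
    if (munlockall() != 0)
        return errno;
    return 0;
}
```

Passing MCL_CURRENT | MCL_FUTURE would also cover mappings created after the call, at the cost of tying up more physical memory.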
int msync(caddr_t addr, size_t len, int flags);
msync supports applications which require assertions about the integrity of data in the storage backing their mapping, either for correctness or for coherent communications in a distributed environment. msync causes all modified copies of pages over the range [addr, addr + len) to be flushed to the objects mapped by those addresses. In the cache analogy discussed previously, msync is the cache ``write-back,'' or flush, operation. It is similar in purpose to the fsync operation for files.
msync optionally invalidates such cache entries so that further references to the pages cause the system to obtain them from their permanent storage locations.
The flags argument provides a bit-field of values that influences the behavior of msync. The bit names and their interpretations are:
MS_SYNC | synchronized write |
MS_ASYNC | return immediately |
MS_INVALIDATE | invalidate caches |
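The write-back role of msync can be sketched as follows: data is stored through a MAP_SHARED mapping of a temporary file, flushed with MS_SYNC, and then read back through the file descriptor. This assumes a POSIX-like system with a writable /tmp and void * prototypes; write_through_mapping is a hypothetical helper. On systems with a unified page cache the read-back would probably succeed even without the flush, so the sketch shows the calling pattern rather than proving that the flush is required.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: store msg in a file through a shared mapping,
 * flush it with msync(MS_SYNC), and verify it by reading the file.
 * Returns 0 if the bytes read back match, -1 on any error or mismatch.
 */
int write_through_mapping(const char *msg)
{
    assert(msg != NULL && msg[0] != '\0');

    char path[] = "/tmp/msync_demo_XXXXXX";  /* assumed writable location */
    char readback[256];
    size_t len = strlen(msg);
    char *map;
    int result = -1;

    if (len > sizeof readback)
        return -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                  /* file disappears once fd is closed */

    if (ftruncate(fd, (off_t)len) != 0)
        goto out;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    memcpy(map, msg, len);         /* modify the cached pages */

    /* MS_SYNC: do not return until the write-back has completed. */
    if (msync(map, len, MS_SYNC) == 0) {
        ssize_t n = pread(fd, readback, len, 0);
        if (n == (ssize_t)len && memcmp(readback, msg, len) == 0)
            result = 0;
    }
    munmap(map, len);
out:
    close(fd);
    return result;
}
```

Replacing MS_SYNC with MS_ASYNC would let the call return before the write completes, which suits callers that only need the flush to be scheduled.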
int mincore(caddr_t addr, size_t len, char *vec);
mincore determines the residency of the memory pages in the address space covered by mappings in the range [addr, addr + len). Using the ``cache concept'' described earlier, this function can be viewed as an operation that interrogates the status of the cache and returns an indication of what is currently resident in it. The status is returned as one character per page in the character array referenced by vec (which the system assumes to be large enough to encompass all the pages in the address range). The least significant bit of each character is ``1'' if the page is resident in the system's primary storage and ``0'' if it is not. The other bits in the character are reserved for possible future expansion; therefore, programs testing residency should test only the least significant bit of each character.
mincore returns residency information that is accurate at an instant in time. Because the system may frequently adjust the set of pages in memory, this information may quickly be outdated. Only locked pages are guaranteed to remain in memory.
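A minimal sketch of a residency query follows. It assumes a Linux/glibc-style prototype in which vec is an unsigned char * (other systems declare it as char *, so a cast may be needed) and that MAP_ANONYMOUS is available; count_resident_after_touch is a hypothetical helper. Only the least significant bit of each status byte is tested, and the count is a snapshot that may already be stale when the function returns.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map npages anonymous pages, touch each one, and
 * ask mincore how many are resident at that instant.
 * Returns the resident page count, or -1 on error.
 */
long count_resident_after_touch(size_t npages)
{
    assert(npages > 0);

    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;
    size_t len = npages * (size_t)pagesize;
    long resident = -1;

    /* One status byte per page; the caller must size this correctly. */
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) {
        free(vec);
        return -1;
    }

    memset(map, 1, len);           /* touch every page in the range */

    if (mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++) {
            if (vec[i] & 1)        /* test only the least significant bit */
                resident++;
        }
    }

    munmap(map, len);
    free(vec);
    return resident;
}
```

Because unlocked pages can be evicted at any time, a caller should treat the result as a hint; combining the query with mlock is the only way to rely on residency.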