DMA
DMA (Direct-Memory Access)
refers to the I/O strategy
where data is transferred between system memory and the device
without the aid of the CPU.
There are two main categories of DMA devices:
-
Some DMA devices (mostly ISA devices)
use the DMA controller on the system motherboard
to execute the I/O transfers.
Drivers for these devices must program
the DMA controller and the device
to complete the transfer;
this involves acquiring DMA channels,
programming the DMA controller with the
starting address and size of the data transfer,
and finally programming the device to commence transfer operations.
Both SVR5 and SCO OpenServer 5 provide functions for this,
listed later in this article.
-
Many modern devices include their own DMA controllers
on the device itself.
Drivers for these devices must obtain
the physical addresses to be used for data transfers
and for command/status access
and map them into the device registers.
Allocating DMA memory (DDI)
DDI drivers specify the physical requirements
for DMA memory in the
physreq(D4)
structure that is allocated with the
physreq_alloc(D3)
function, then populated, then prepared with the
physreq_prep(D3)
function.
The memory is then allocated
with one of the following functions
that utilizes the physreq specifications:
kmem_alloc_physreq(D3)-
Allocate memory that can be accessed by the device.
This is mostly used for memory that is used
for control and status information.
kmem_zalloc_physreq(D3)-
Same as
kmem_alloc_physreq( )
except that the allocated memory is zeroed on allocation.
kmem_alloc_phys(D3) (DDI 8 only)-
Allocate memory that can be accessed by the device.
kmem_free(D3)-
Free memory obtained with
kmem_alloc_physreq( ),
kmem_zalloc_physreq( ),
or
kmem_alloc_phys( ).
allocb_physreq(D3str)-
Allocate STREAMS message memory
using a physreq structure.
msgphysreq(D3str)-
Check that the transmitted message satisfies
the specified physical requirements.
msgpullup_physreq(D3str)-
Concatenate the bytes of the message if necessary
to create a new message that matches the physical requirements.
sdi_kmem_alloc_phys(D3sdi)-
Allocating DMA memory (ODDI)
kmem_alloc(D3oddi)-
Allocate kernel memory.
By default in ODDI,
the memory allocated is below the 16MB boundary
and so can be used for DMA for any device.
Use the KM_NO_DMA flag
when the allocated memory can be above the 16MB limit.
kmem_zalloc(D3oddi)-
Similar to
kmem_alloc( )
but the contents of the buffer are zeroed out
before returning.
sptalloc(D3oddi)-
Allocate either physical or virtual memory
depending on the flags provided.
Use the DMAABLE flag
if the memory allocated must be under 16MB.
allocb(D3str)-
Allocate STREAMS memory.
Considerations for allocating DMA memory
The term ``DMA-able memory'' is a bit ambiguous
since each device has a different definition
of what constitutes DMA-able memory.
Generally, devices must be concerned
with physical contiguity and physical addresses.
Devices that do not support
scatter/gather operations
can only address one chunk of physically-contiguous memory,
identified by the physical base address
and the size of the chunk.
Other devices may be able to address more than one chunk
of physically-contiguous memory,
but still have limits as to how many chunks there can be.
If a buffer contains more memory fragments
than the device can address,
the driver must copy the extra fragments
into the last chunk the device can address.
Some drivers that can address unlimited chunks of memory
may perform better if they copy data
rather than manage a large set of DMA descriptors.
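The fragment limit described above can be sketched in plain C.
This is only an illustration, not kernel code:
struct frag, count_chunks( ), and needs_copy( )
are hypothetical names that do not exist in DDI or ODDI,
and the sketch assumes the fragments are listed in transfer order.
It merges fragments that are physically adjacent
and reports whether the result exceeds the number of chunks
the device can address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one buffer fragment (not a DDI/ODDI type). */
struct frag {
    uint64_t paddr;   /* physical base address of the fragment */
    uint32_t len;     /* length of the fragment in bytes */
};

/*
 * Count the physically contiguous chunks in a fragment list.
 * A fragment that starts exactly where the previous one ends
 * is merged into the same chunk.
 */
size_t count_chunks(const struct frag *f, size_t n)
{
    size_t chunks = 0;
    size_t i;

    assert(f != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (i == 0 || f[i - 1].paddr + f[i - 1].len != f[i].paddr)
            chunks++;
    }
    return chunks;
}

/*
 * Return nonzero if the buffer has more chunks than the device can
 * address, so that the extra fragments would have to be copied.
 * A max_chunks value of 0 is treated as "no limit" in this sketch.
 */
int needs_copy(const struct frag *f, size_t n, size_t max_chunks)
{
    return max_chunks != 0 && count_chunks(f, n) > max_chunks;
}
```

A device without scatter/gather support corresponds to max_chunks set to 1 in this sketch.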
Devices are also limited by
the number of address lines they have.
Devices with 24 address lines
can address up to 16MB of memory,
devices with 32 address lines
can address any memory below 4 gigabytes,
and devices with 36 address lines
can address up to 64 gigabytes of memory.
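The relationship between address lines and reachable memory
is simple arithmetic: a device with n address lines
can reach physical addresses below 2 to the power n.
The following sketch shows the calculation;
dma_addr_limit( ) and dma_reachable( ) are hypothetical helper names
used for illustration only and are not part of DDI or ODDI.

```c
#include <assert.h>
#include <stdint.h>

/* First physical address that a device with this many address lines cannot reach. */
uint64_t dma_addr_limit(unsigned address_lines)
{
    assert(address_lines > 0 && address_lines < 64);
    return (uint64_t)1 << address_lines;
}

/*
 * Return nonzero if the whole range [paddr, paddr + len)
 * lies below the limit for the given number of address lines.
 */
int dma_reachable(uint64_t paddr, uint64_t len, unsigned address_lines)
{
    uint64_t limit = dma_addr_limit(address_lines);

    /* Written to avoid overflow in paddr + len. */
    return len > 0 && paddr < limit && len <= limit - paddr;
}
```

With 24, 32, and 36 address lines,
dma_addr_limit( ) yields 16MB, 4GB, and 64GB respectively,
matching the limits listed above.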
The following list summarizes these issues
and points to the functions and/or structures
that the driver writer can use to satisfy them.
These considerations apply only to memory
that is accessed by the device as well as the kernel.
DMA device drivers may also allocate memory
that is used only by the driver software;
this memory has fewer constraints and can be allocated
as discussed in the
``Memory allocation''
article.
Memory ranges-
Some DMA devices have severe restrictions
on the physical memory to and from which
I/O can be performed.
Other devices can access a large range of physical memory
and performance is greatly enhanced
when they are allowed to access the full range.
DDI 7 and ODDI drivers
can access memory up to 4GB;
DDI 8 drivers can access memory
up to 64GB.
DDI-
Specify the memory range for the device
with the
phys_dmasize
member of the
physreq(D4)
structure.
Set to 24 for ISA devices,
32 for devices that can access up to 4GB memory,
and 64 for devices that can access up to 64GB memory.
See
``DMA up to 64 bits (DDI only)''
for more information about DMA that uses 64-bit addressing.
ODDI-
Specify the DMAABLE flag to the
sptalloc(D3oddi)
or
getcpages(D3oddi)
function or call the
kmem_alloc(D3oddi)
function without the KM_NO_DMA flag
to allocate memory below the 16MB threshold.
Otherwise, the allocated memory can come
from anywhere in the first 4GB of system memory.
Contiguity-
DMA operations require
that the system memory being used is physically contiguous.
The memory that is used for command/status operations
often falls into this category.
Each scatter/gather region also needs to be physically contiguous,
although the different regions
do not all need to be physically contiguous,
nor does the list of scatter/gather regions.
DDI-
Drivers that use DDI versions
prior to version 8 can specify the
PREQ_PHYSCONTIG flag
for the
phys_flags
member of the
physreq(D4)
structure
to specify that the allocated memory must be physically contiguous.
ODDI-
Use the
getcpages(D3oddi)
function to allocate physically contiguous memory segments
that are larger than a page.
kmem_alloc(D3oddi)
and
sptalloc(D3oddi)
can be used if the physically contiguous segment
is required to be 1 page (4KB) or smaller.
Alignment and boundary-
Some devices require that an address be word-aligned
or that the data buffer used for the I/O data
not cross some boundary.
DDI-
Specify alignment requirements in the
phys_alignment
member
of the physreq structure.
Note that the memory used for command/status operations
may have different alignment requirements
than the memory used for the actual I/O data buffer.
For example, phys_alignment
may be set to 512
(the size of a physical disk block)
for data transfers on a scatter/gather device,
but set to 4 for the command/status operations.
Specify boundary requirements in the
phys_boundary
member
of the physreq structure.
If this member is set to a non-zero value,
buffers will not span addresses
that are multiples of this value.
For example,
if the data buffer used for the I/O data
must not cross a 64KB boundary,
set phys_boundary to 64.
ODDI-
The
physio(D3oddi)
and
dma_breakup(D3oddi)
functions can be used
to handle many alignment and boundary requirements.
ODDI is not as powerful as DDI in this case,
so, in some cases,
the driver writer must code these checks manually
in a driver-specific subroutine.
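A manually coded alignment and boundary check
might look something like the following sketch.
The function names are hypothetical and are not ODDI or DDI routines.
The sketch assumes that the alignment and the boundary
are both expressed in bytes and are powers of two,
and that a boundary of 0 means no restriction.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if v is a power of two. */
int is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Nonzero if paddr is a multiple of alignment (a power of two, in bytes). */
int is_aligned(uint64_t paddr, uint64_t alignment)
{
    assert(is_pow2(alignment));
    return (paddr & (alignment - 1)) == 0;
}

/*
 * Nonzero if the range [paddr, paddr + len) spans an address
 * that is a multiple of boundary (a power of two, in bytes).
 * A boundary of 0 means "no boundary restriction" in this sketch.
 */
int crosses_boundary(uint64_t paddr, uint64_t len, uint64_t boundary)
{
    if (boundary == 0 || len == 0)
        return 0;
    assert(is_pow2(boundary));
    /* Compare the boundary-sized block of the first and last byte. */
    return (paddr / boundary) != ((paddr + len - 1) / boundary);
}
```

A buffer that fails either check must be reallocated,
split at the boundary, or copied to a conforming buffer
before the transfer is programmed.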
Device-specific units-
When performing I/O, the driver must map
a user process's view of a device
into a set of device-specific units
(for example, sectors for disks).
For example, when a user process performs direct I/O
from a random access device, with reads and writes
interspersed with lseeks, mapping is needed to
translate the relative location of I/O as seen by the
process into a series of disk sectors
that can be used directly by the driver.
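As a rough illustration of this mapping,
the following sketch converts a byte offset and length,
as a process would see them after an lseek,
into the range of sectors that the transfer touches.
The struct sector_range type and the bytes_to_sectors( ) function
are hypothetical names invented for this example;
the sector size is passed in as a parameter
rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the sectors touched by one transfer. */
struct sector_range {
    uint64_t first;   /* number of the first sector touched */
    uint64_t count;   /* number of sectors touched */
};

/*
 * Translate a byte offset and length into a sector range.
 * A transfer that starts or ends in the middle of a sector
 * still touches that whole sector.
 */
struct sector_range bytes_to_sectors(uint64_t offset, uint64_t len,
                                     uint32_t sector_size)
{
    struct sector_range r;

    assert(sector_size > 0);
    r.first = offset / sector_size;
    if (len == 0)
        r.count = 0;
    else
        r.count = (offset + len - 1) / sector_size - r.first + 1;
    return r;
}
```

For example, with 512-byte sectors,
a 100-byte transfer at offset 1000
starts in sector 1 and touches two sectors.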
Locality (NUMA systems only)-
DMA support for drivers
running on NUMA systems
requires that the kernel knows the ``locality'' of memory,
which is expressed in terms of the
CPU group with which
the memory used for DMA buffers
is associated.
Drivers must also know the physical requirements of
the individual device's DMA engine.
Only SVR5 supports NUMA
so only DDI drivers are equipped
to control locality.
The kernel determines this information
from the
buf(D4)
structure (or other memory handle) and the
physreq(D4)
structure that is associated with the
driver's resource manager key
for DDI 8 drivers.
Programming the system DMA controller (DDI)
The following DDI functions are used
to program the system motherboard DMA controller
for a DMA device that does not have
a DMA controller on the device controller itself.
These functions acquire the command and data blocks
needed to describe the type of transfer
and program the system DMA controller
and device to complete the transfer.
dma_cascade(D3)-
Program an ISA-style DMA channel for cascade mode
dma_disable(D3)-
Disable recognition of hardware requests
on an ISA-style DMA channel
dma_enable(D3)-
Enable the ISA-style channel
to respond to DMA requests from the device
dma_free_buf(D3)-
Free a previously-allocated DMA buffer descriptor
dma_free_cb(D3)-
Free a previously-allocated DMA command block
dma_get_best_mode(D3)-
Determine best transfer mode
for an ISA-style DMA channel
dma_get_buf(D3)-
Allocate a DMA buffer descriptor
dma_get_cb(D3)-
Allocate a DMA command block
that can be used to identify
all the DMA requirements in terms of:
-
Direction of transfer (I/O to memory or memory to I/O)
-
Size of the data path (8-, 16-, or 32-bit)
-
Whether addresses should be incremented or decremented
after every transfer
-
Mode of DMA (single, cascade, block, and so on)
dma_physreq(D3)-
dma_prog(D3)-
Program a particular ISA-style DMA channel
based on driver input
expressed by the control and data blocks.
dma_stop(D3)-
Stop software-initiated DMA operation
on a channel and release it
dma_swsetup(D3)-
Program a DMA operation
for a subsequent software request.
dma_swstart(D3)-
Initiate a DMA operation via software request
Programming the system DMA controller (ODDI)
The following ODDI functions are used
to program the system motherboard DMA controller
for a DMA device that does not have
a DMA controller on the device controller itself.
These functions acquire the command and data blocks
needed to describe the type of transfer
and program the system DMA controller
and device to complete the transfer.
dma_alloc(D3oddi)-
Allocate a DMA channel.
dma_breakup(D3oddi)-
Size DMA request into 512-byte blocks.
dma_enable(D3oddi)-
Begin DMA transfer.
dma_param(D3oddi)-
Set up the system DMA controller
for a DMA transfer.
dma_relse(D3oddi)-
Release previously allocated DMA channel.
dma_resid(D3oddi)-
Return the number of bytes not transferred
during a DMA request.
dma_start(D3oddi)-
Queue the DMA request.
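The idea of sizing a request into 512-byte blocks
can be illustrated with the following sketch.
It is not the implementation of dma_breakup( )
and does not use its interface;
struct piece and split_into_blocks( ) are hypothetical names,
and the sketch only shows how a physically contiguous range
could be divided into pieces of at most 512 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512u   /* block size used in this sketch */

/* Hypothetical description of one piece of a larger request. */
struct piece {
    uint64_t paddr;   /* physical start address of the piece */
    uint32_t len;     /* length of the piece, at most BLOCK_SIZE */
};

/*
 * Split the range [paddr, paddr + len) into pieces of at most
 * BLOCK_SIZE bytes. At most max_out pieces are written to out.
 * Returns the number of pieces written.
 */
size_t split_into_blocks(uint64_t paddr, uint32_t len,
                         struct piece *out, size_t max_out)
{
    size_t n = 0;

    assert(out != NULL || max_out == 0);
    while (len > 0 && n < max_out) {
        uint32_t chunk = len < BLOCK_SIZE ? len : BLOCK_SIZE;

        out[n].paddr = paddr;
        out[n].len = chunk;
        paddr += chunk;
        len -= chunk;
        n++;
    }
    return n;
}
```

A 1300-byte request splits into two 512-byte pieces
followed by one 276-byte piece.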
© 2005 The SCO Group, Inc. All rights reserved.
OpenServer 6 and UnixWare (SVR5) HDK - June 2005