Sometimes a filesystem must perform an operation in several steps that it would ideally like to perform in one step. For instance, creating a file requires the following steps:
Because caching disk controllers can delay and/or change the order in which data is written to the disk, independent of the filesystem's intentions, a filesystem's data or structure can become corrupted if the writes are interrupted. Filesystem corruption by this means can occur with any filesystem on any operating system, whether it be DOS, OS/2, UnixWare, or some other operating system.
A filesystem (and the disk as well) can be severely damaged if one of the following events happens while data is being written to the disk:
To avoid filesystem damage, here are some suggestions (these suggestions apply whether or not you have a caching disk controller):
If you have a caching disk controller, carefully read the documentation that came with it. If the cache is only used for reading, or it is a write-through cache, you need not worry about consistency problems related to the caching disk controller. If the cache is a write-back cache, consider the issues raised in the rest of this section.
With caching controllers, there is a trade-off of increased performance versus risk of corruption. The risk is related to how often the disk is interrupted while flushing data from the cache to the disk. If you cannot accept this risk, you can take one of the following actions (all decrease the performance of the disk controller).
Normally, filesystem damage is repaired automatically when you bring up the computer. If a caching disk controller changes the order of the writes, and the writes are then interrupted before the entire cache has been written to disk, the filesystem might be fooled into thinking that all is well when damage is present.
The only way this can happen for s5, ufs, sfs, or bfs is if the superblock is stamped clean before other writes have completed. A superblock is stamped clean only when the filesystem has been unmounted or when fsck(1M) has been run on it. In this case, fsck is not run when the computer comes back up, because the filesystem appears clean. In all other cases, the filesystem does not appear to be clean, and fsck checks the entire filesystem.
With vxfs, however, things are more problematic. Remember that certain operations that appear to the user as a single step or a single write to disk actually involve several steps or writes. vxfs records these steps in the intent log on the disk before performing them. If the operation is interrupted, fsck for vxfs does not have to check the entire filesystem. Instead, it only has to check the intent log to see what steps need to be done to complete an operation, or what steps need to be undone to undo an operation. Thus, because the disk is not usually checked at all (or is checked only superficially), the fsck for vxfs is very fast.
This mode of fsck, however, depends on the order of writes to disk, which a caching controller might change.
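As a rough illustration of why a log-based check depends on write order, the sketch below scans a toy log and classifies each operation from the log alone. The log layout, the `classify_ops` function, and the `step`/`done` record names are all hypothetical; this is not the vxfs on-disk format, only a simplified model of the idea.

```shell
#!/bin/sh
# Toy model only; this is NOT the vxfs on-disk intent log format.
# Each line of the hypothetical log is "<operation-id> <record>", where
# <record> is "step" (one logged step) or "done" (operation logged as finished).

# classify_ops LOGFILE: print one line per operation, sorted by id, saying
# whether the log alone shows the operation as complete or incomplete.
classify_ops() {
    awk '
        { seen[$1] = 1 }
        $2 == "done" { finished[$1] = 1 }
        END {
            for (id in seen)
                print id, ((id in finished) ? "complete" : "incomplete")
        }
    ' "$1" | sort
}

# Build a small example log: operation 1 finished, operation 2 was interrupted.
log=$(mktemp)
printf '%s\n' '1 step' '1 step' '1 done' '2 step' > "$log"
result=$(classify_ops "$log")
printf '%s\n' "$result"
rm -f "$log"
```

In this model, a recovery pass would only revisit operation 2 and would trust that everything belonging to operation 1 is already on the disk. That trust holds only if the writes reached the disk in the order the filesystem issued them, which is the assumption a write-back cache can break.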
If you suspect filesystem damage that the computer is overlooking, do the following:
/sbin/fsck $fstypa -y ${rrootfs} > /dev/null 2>&1
to this:
/sbin/fsck $fstypa -o full -y ${rrootfs} > /dev/null 2>&1
Then shut down the computer. This might not work because you are changing a file on the affected filesystem. If you still have problems, you must restore your root filesystem from emergency recovery tapes (see emergency_rec(1M) for details).
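The one-line change described above can also be made with sed rather than by hand. The sketch below is a minimal example that works on a temporary stand-in file, because the name of the startup script that holds the fsck line is not given here; the `script` and `edited` variable names are illustrative only. Try it on a copy first and compare the result before touching the real file.

```shell
#!/bin/sh
# Hypothetical stand-in for the startup script that contains the fsck line.
# The real file name is not assumed here, so work on a temporary copy.
script=$(mktemp)
cat > "$script" <<'EOF'
/sbin/fsck $fstypa -y ${rrootfs} > /dev/null 2>&1
EOF

# On lines that start with /sbin/fsck, insert "-o full" before the first "-y".
edited=$(sed '/^\/sbin\/fsck /s/ -y / -o full -y /' "$script")
printf '%s\n' "$edited"
rm -f "$script"
```

The sed address restricts the substitution to lines that begin with `/sbin/fsck`, so other lines that happen to contain `-y` are left alone.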