JavaScript is required to for searching.
Skip Navigation Links
Exit Print View
Oracle Solaris 11.1 Administration: ZFS File Systems     Oracle Solaris 11.1 Information Library
search filter icon
search icon

Document Information

Preface

1.  Oracle Solaris ZFS File System (Introduction)

2.  Getting Started With Oracle Solaris ZFS

3.  Managing Oracle Solaris ZFS Storage Pools

4.  Managing ZFS Root Pool Components

5.  Managing Oracle Solaris ZFS File Systems

6.  Working With Oracle Solaris ZFS Snapshots and Clones

7.  Using ACLs and Attributes to Protect Oracle Solaris ZFS Files

8.  Oracle Solaris ZFS Delegated Administration

9.  Oracle Solaris ZFS Advanced Topics

10.  Oracle Solaris ZFS Troubleshooting and Pool Recovery

Resolving ZFS Space Issues

ZFS File System Space Reporting

ZFS Storage Pool Space Reporting

Identifying ZFS Failures

Missing Devices in a ZFS Storage Pool

Damaged Devices in a ZFS Storage Pool

Corrupted ZFS Data

Checking ZFS File System Integrity

File System Repair

File System Validation

Controlling ZFS Data Scrubbing

Explicit ZFS Data Scrubbing

ZFS Data Scrubbing and Resilvering

Resolving Problems With ZFS

Determining If Problems Exist in a ZFS Storage Pool

Reviewing zpool status Output

Overall Pool Status Information

Pool Configuration Information

Scrubbing Status

Data Corruption Errors

System Reporting of ZFS Error Messages

Repairing a Damaged ZFS Configuration

Resolving a Missing Device

Physically Reattaching a Device

Notifying ZFS of Device Availability

Replacing or Repairing a Damaged Device

Determining the Type of Device Failure

Clearing Transient Errors

Replacing a Device in a ZFS Storage Pool

Determining If a Device Can Be Replaced

Devices That Cannot be Replaced

Replacing a Device in a ZFS Storage Pool

Viewing Resilvering Status

Repairing Damaged Data

Identifying the Type of Data Corruption

Repairing a Corrupted File or Directory

Repairing Corrupted Data With Multiple Block References

Repairing ZFS Storage Pool-Wide Damage

Repairing an Unbootable System

11.  Archiving Snapshots and Root Pool Recovery

12.  Recommended Oracle Solaris ZFS Practices

A.  Oracle Solaris ZFS Version Descriptions

Index

Repairing Damaged Data

The following sections describe how to identify the type of data corruption and how to repair the data, if possible.

ZFS uses checksums, redundancy, and self-healing data to minimize the risk of data corruption. Nonetheless, data corruption can occur if a pool isn't redundant, if corruption occurred while a pool was degraded, or an unlikely series of events conspired to corrupt multiple copies of a piece of data. Regardless of the source, the result is the same: The data is corrupted and therefore no longer accessible. The action taken depends on the type of data being corrupted and its relative value. Two basic types of data can be corrupted:

Data is verified during normal operations as well as through a scrubbing. For information about how to verify the integrity of pool data, see Checking ZFS File System Integrity.

Identifying the Type of Data Corruption

By default, the zpool status command shows only that corruption has occurred, but not where this corruption occurred. For example:

# zpool status tank
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://support.oracle.com/msg/ZFS-8000-8A
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       ONLINE       4     0     0
          c0t5000C500335E106Bd0    ONLINE       0     0     0
          c0t5000C500335FC3E7d0    ONLINE       4     0     0


errors: 2 data errors, use '-v' for a list

Each error indicates only that an error occurred at a given point in time. Each error is not necessarily still present on the system. Under normal circumstances, this is the case. Certain temporary outages might result in data corruption that is automatically repaired after the outage ends. A complete scrub of the pool is guaranteed to examine every active block in the pool, so the error log is reset whenever a scrub finishes. If you determine that the errors are no longer present, and you don't want to wait for a scrub to complete, reset all errors in the pool by using the zpool online command.

If the data corruption is in pool-wide metadata, the output is slightly different. For example:

# zpool status -v morpheus
  pool: morpheus
    id: 13289416187275223932
 state: UNAVAIL
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
   see: http://support.oracle.com/msg/ZFS-8000-72
config:

        morpheus    FAULTED   corrupted data
          c1t10d0   ONLINE

In the case of pool-wide corruption, the pool is placed into the FAULTED state because the pool cannot provide the required redundancy level.

Repairing a Corrupted File or Directory

If a file or directory is corrupted, the system might still function, depending on the type of corruption. Any damage is effectively unrecoverable if no good copies of the data exist on the system. If the data is valuable, you must restore the affected data from backup. Even so, you might be able to recover from this corruption without restoring the entire pool.

If the damage is within a file data block, then the file can be safely removed, thereby clearing the error from the system. Use the zpool status -v command to display a list of file names with persistent errors. For example:

# zpool status tank -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://support.oracle.com/msg/ZFS-8000-8A
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       ONLINE       4     0     0
          c0t5000C500335E106Bd0    ONLINE       0     0     0
          c0t5000C500335FC3E7d0    ONLINE       4     0     0

errors: Permanent errors have been detected in the following files:
        /tank/file.1
        /tank/file.2

The list of file names with persistent errors might be described as follows:

If the corruption is within a directory or a file's metadata, the only choice is to move the file elsewhere. You can safely move any file or directory to a less convenient location, allowing the original object to be restored in its place.

Repairing Corrupted Data With Multiple Block References

If a damaged file system has corrupted data with multiple block references, such as from snapshots, the zpool status -v command will not display all corrupted data paths. The ZFS scrub algorithm traverses the pool and visits each block of data only once. It can only report the corruption the first time that it is encountered. Therefore, it only generates a single path to the impacted file. Note that this also applies to corrupted blocks that have been deduplicated.

If you have corrupted data and the zpool status -v command identifies that snapshot data is impacted, consider searching for additional corrupted paths.

# find mount-point -inum $inode -print
# find mount-point/.zfs/snapshot -inum $inode -print 

The first command searches for the inode number of the reported corrupted data in the specified file system and all its snapshots. The second command searches for snapshots with the same inode number.

Repairing ZFS Storage Pool-Wide Damage

If the damage is in pool metadata and that damage prevents the pool from being opened or imported, then the following options are available to you: