Skip Navigation Links | |
Exit Print View | |
Oracle Solaris 11.1 Tunable Parameters Reference Manual Oracle Solaris 11.1 Information Library |
1. Overview of Oracle Solaris System Tuning
2. Oracle Solaris Kernel Tunable Parameters
3. Oracle Solaris ZFS Tunable Parameters
Where to Find Tunable Parameter Information
Tuning ZFS When Using Flash Storage
Adding Flash Devices as ZFS Log or Cache Devices
Ensuring Proper Cache Flush Behavior for Flash and NVRAM Storage Devices
SCSI Unmap Considerations for Flash Devices
5. Internet Protocol Suite Tunable Parameters
A. Tunable Parameters Change History
Review the following considerations when using ZFS with a database product.
If the database uses a fixed disk block or record size for I/O, set the ZFS recordsize property to match it. You can do this on a per-file system basis, even though multiple file systems might share a single pool.
With ZFS's copy-on-write design, tuning down the recordsize is a way to improve OLTP performance at the expense of batch reporting queries.
ZFS checksums every block stored on disk. This alleviates the need for the database layer to checksum data an additional time. If checksums are computed by ZFS instead of at the database layer, any discrepancy can be caught and fixed before the data is returned to the application.
UFS direct I/O is used to overcome some of the design deficiencies of UFS and to eliminate double buffering of data. In ZFS, the UFS design deficiencies do not exist and ZFS uses the primarycache and secondarycache properties to manage buffering data in the ARC. Note that using the secondarycache (L2ARC) property to improve random reads also requires the primarycache property to be enabled.
Keep pool space under 90% utilization to maintain pool performance.
ZFS is recommended for any Oracle database version in single instance mode. ZFS can be used with an Oracle RAC database when it is available as a NFS-shared file system.
Review the following recommendations below for tuning ZFS for an Oracle database:
Verify that you are running the latest Solaris release
Start with the latest Solaris 10 or Solaris 11 release, with the Solaris 10 9/10 release as a minimum starting point.
Create LUNs for your ZFS storage pools, if needed
Use your storage array tools to create LUNs that will be presented to the ZFS storage pool. Or, consider using whole disks for your mirrored ZFS storage pools. For more information, see Chapter 3, Managing Oracle Solaris ZFS Storage Pools, in Oracle Solaris 11.1 Administration: ZFS File Systems.
Create a storage pool for data files for tables, index, undo and temp data
Consider creating a mirrored storage pool to provide a higher level of data redundancy. For example:
# zpool status dbpool pool: dbpool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM dbpool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 c0t5000C500335F95E3d0 ONLINE 0 0 0 c0t5000C500335F907Fd0 ONLINE 0 0 0 mirror-1 ONLINE 0 0 0 c0t5000C500335BD117d0 ONLINE 0 0 0 c0t5000C500335DC60Fd0 ONLINE 0 0 0 errors: No known data errors
For databases with high redo log activity, such as a typical OLTP database with many commits, use a separate LUN for a separate log device.
Create a storage pool for the archivelog
If available, a system's internal disk can handle this type of load. The archivelog file system can also be a file system in the dbpool.
# zpool create archivepool c0t5000C500335E106Bd0
Create the ZFS file systems and set the specific file system properties by using the following guidelines.
Create separate file systems for redo, archive, undo, and temp database components using the default record size of 128 KB. The general rule is to set the file system recordsize = db_block_size for the file systems that contains Oracle data files. For table data and index components, create a file system with an 8 KB record size. Also consider providing metadata caching hints for your database file systems by using the primarycache property. For more information about ZFS file system properties, see Introducing ZFS Properties in Oracle Solaris 11.1 Administration: ZFS File Systems.
Create file systems for the table data files and index data files with an 8 KB recordsize. Use the default value for primarycache.
# zfs create -o recordsize=8k -o mountpoint=/my_db_path/index dbpool/index # zfs set logbias=throughput dbpool/index # zfs get primarycache,recordsize,logbias dbpool/index NAME PROPERTY VALUE SOURCE dbpool/index primarycache all default dbpool/index recordsize 8K local dbpool/index logbias throughput local
Create file systems for temporary and undo table spaces, using the default recordsize and primarycache values.
# zfs create -o mountpoint=/my_db_path/temp dbpool/temp # zfs set logbias=throughput dbpool/temp # zfs create -o mountpoint=/my_db_path/undo dbpool/undo # zfs set logbias=throughput dbpool/undo
Create a storage pool for redo logs with a separate log device. For databases with high redo log activity, such as a typical OLTP database with many commits, use a separate log device LUN.
Partition the disk into two slices, a small slice, s0, in the 64 to 150 MB range, for the separate log device. The s1 slice contains the remaining disk space for the redo log.
# zpool create redopool c0t50015179594B6F11d0s1 log c0t50015179594B6F11d0s0 # zpool status redopool pool: redopool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM redopool ONLINE 0 0 0 c0t50015179594B6F11d0s1 ONLINE 0 0 0 logs c0t50015179594B6F11d0s0 ONLINE 0 0 0 errors: No known data errors
Create a file system for redo logs in the redo pool. Use default file system values for recordsize and primarycache.
# zfs create -o mountpoint=/my_db_path/redo redopool/redo # zfs set logbias=latency redopool/redo
Create a file system for archivelog files in the archive pool, enabling compression, use the default value for recordsize and set primarycache to metadata.
# zfs create -o compression=on -o primarycache=metadata -o mountpoint= /my_db_admin_path/archive archivepool/archive # zfs get primarycache,recordsize,compressratio,compression,available, used,quota archivepool/archive NAME PROPERTY VALUE SOURCE archivepool/archive primarycache metadata local archivepool/archive recordsize 128K default archivepool/archive compressratio 1.32x - archivepool/archive compression on local archivepool/archive available 40.0G - archivepool/archive used 10.0G - archivepool/archive quota 50G local
Consider tuning storage array I/O queues (for systems with HDS or EMC storage arrays)
ZFS aggregates read and write I/O and manages the priority of I/O before sending it to the driver level, which handles the device. The zfs_vdev_max_pending parameter defines the maximum number of I/Os that ZFS sends to any storage pool device.
In a legacy storage environment, the ssd_max_throttle and sd_max_throttle parameters define the maximum number of concurrent I/Os that the driver can send to the storage. By setting the zfs_vdev_max_pending default value equal to the value of the [s]sd_max_throttle parameter, we prevent ZFS from queuing I/O to yet another unnecessary SD layer.
If you have ssd:ssd_max_throttle or sd:sd_max_throttle in the /etc/system file in your existing environment, then set zfs:zfs_vdev_max_pending at the same value. For example, if the storage array administrator asked for the following setting:
set ssd:ssd_max_throttle=20
Then, also set this parameter as follows:
set ssd:ssd_max_throttle=20 set zfs:zfs_vdev_max_pending=20
Setting this parameter allows ZFS to control each LUN queue. This means that the total number of pending I/Os in the storage can grow as follows:
number of LUNs * ZFS_VDEV_MAX_PENDING
Allocate sufficient memory and swap resources
You can reduce ZFS memory consumption by tuning the zfs_arc_max parameter to a low value, but we still recommend provisioning enough memory to cache metadata for the actively used portion of the database, which is estimated at 1.5% with an 8 KB ZFS record size and proportionately less or more with larger or smaller records. The file system that hold index files is the one that has the largest benefit from file system caching because it is the last one to invalidate in case of lack of memory. The zfs_arc_max parameter is in bytes and accepts decimal or hexadecimal values. The following example sets this parameter to 2 GB:
set zfs:zfs_arc_max=2147483648 or set zfs:zfs_arc_max=0x80000000
To prevent applications from failing due to lack of memory, you must configure some amount of swap space. The amount of swap equivalent to all of system memory is always enough for this purpose. This swap space is not expected to be used, but is needed as a reservation area. For information about increasing swap space, see Managing Your ZFS Swap and Dump Devices in Oracle Solaris 11.1 Administration: ZFS File Systems.
Additional Oracle database configuration recommendations
Configuring Your Oracle Database on ZFS File Systems in the following white paper:
http://www.oracle.com/technetwork/server-storage/solaris/config-solaris-zfs-wp-167894.pdf
Dynamic SGA Tuning of Oracle Database on Oracle Solaris with DISM white paper:
Oracle 11g Installation Guides
Oracle Database Quick Installation Guide 11g Release 2 (11.2) for Oracle Solaris on SPARC (64-Bit)
http://docs.oracle.com/cd/E11882_01/install.112/e24349/toc.htm
Oracle Database Quick Installation Guide 11g Release 2 (11.2) for Oracle Solaris on x86-64 (64-Bit)
http://docs.oracle.com/cd/E11882_01/install.112/e24351/toc.htm
Review the following considerations when using ZFS with MySQL.
ZFS recordsize
Match the ZFS recordsize property to the storage engine block size for better OLTP performance.
InnoDB
With a known application memory footprint, such as for a database application, you might cap the ARC size so that the application will not need to reclaim its necessary memory from the ZFS cache.
Create a separate pool for the logs.
Set a different path for data and log in the my.cnf file.
Set the ZFS recordsize property to 16K for the InnoDB data files, and use the default recordsize value for InnoDB logs, prior to creating data files.