JavaScript is required to for searching.
Skip Navigation Links
Exit Print View
Oracle Solaris 11.1 Dynamic Tracing Guide     Oracle Solaris 11.1 Information Library
search filter icon
search icon

Document Information

Preface

1.  About DTrace

2.  D Programming Language

3.  Aggregations

4.  Actions and Subroutines

5.  Buffers and Buffering

6.  Output Formatting

7.  Speculative Tracing

8.  dtrace(1M) Utility

9.  Scripting

10.  Options and Tunables

11.  Providers

dtrace Provider

BEGIN Probe

END Probe

ERROR Probe

Stability

lockstat Provider

Overview

Adaptive Lock Probes

Spin Lock Probes

Thread Locks

Readers/Writer Lock Probes

Stability

profile Provider

profile- n probes

tick - n probes

Arguments

Timer Resolution

Probe Creation

Stability

cpc Provider

Probes

Arguments

Probe Availability

Probe Creation

Co-existence With Existing Tools

Examples

user-insts.d

kern-cycles.d

brendan-l2miss.d

brendan-generic-l2miss.d

off_core_event.d

l2miss.d

Stability

fbt Provider

Probes

Probe arguments

entry probes

return probes

Examples

Tail-call Optimization

Assembly Functions

Instruction Set Limitations

x86 Limitations

SPARC Limitations

Breakpoint Interaction

Module Loading

Stability

syscall Provider

Probes

System Call Anachronisms

Subcoded System Calls

New System Calls

Deleted System Calls

Large File System Calls

Private System Calls

Arguments

Stability

sdt Provider

Probes

Examples

Creating SDT Probes

Declaring Probes

Probe Arguments

Stability

mib Provider

Probes

Arguments

Stability

fpuinfo Provider

Probes

Arguments

Stability

pid Provider

Naming pid Probes

Function Boundary Probes

entry Probes

return Probes

Function Offset Probes

Stability

plockstat Provider

Overview

Mutex Probes

Reader/Writer Lock Probes

Stability

fasttrap Provider

Probes

Stability

sysinfo Provider

Probes

Arguments

Example

Stability

vminfo Provider

Probes

Arguments

Example

Stability

proc Provider

Probes

Arguments

lwpsinfo_t

psinfo_t

Examples

exec

start and exit

lwp-start and lwp-exit

signal-send

Stability

sched Provider

Probes

Arguments

cpuinfo_t

Examples

on-cpu and off-cpu

enqueue and dequeue

sleep and wakeup

preempt and remain-cpu

change-pri

tick

cpucaps-sleep and cpucaps-wakeup

Stability

io Provider

Probes

Arguments

bufinfo_t structure

devinfo_t

fileinfo_t

Examples

Stability

Protocols

ip Provider

Probes

Arguments

args[0] - pktinfo_t Structure

args[1] - csinfo_t Structure

args[2] - ipinfo_t Structure

args[3] - ifinfo_t Structure

args[4] - ipv4info_t Structure

args[5] - ipv6info_t Structure

Examples

Packets by host address

Sent size distribution

ipio.d

ipproto.d

Stability

iscsi Provider

Probes

Arguments

Types

Examples

One-liners

iscsiwho.d

iscsixfer.d

nfsv3 Provider

Arguments

Probes

Examples

nfsv3rwsnoop.d

nfsv3ops.d

nfsv3fileio.d

nfsv3rwtime.d

nfsv3io.d

nfsv4 Provider

Arguments

Probes

Examples

nfsv4rwsnoop.d

nfsv4ops.d

nfsv4fileio.d

nfsv4rwtime.d

nfsv4io.d

srp Provider

Probes

Probes Overview

Service up/down Event Probes

Remote Port Login/Logout Event Probes

SRP Command Event Probes

SCSI Command Event Probes

Data Transfer Probes

Types

scsicmd_t

conninfo_t

srp_portinfo_t

srp_logininfo_t

srp_taskinfo_t

xferinfo_t

Examples

service.d

srpwho.d

srpsnoop.d

tcp Provider

Probes

Arguments

pktinfo_t Structure

csinfo_t Structure

ipinfo_t Structure

tcpsinfo_t Structure

tcplsinfo_t Structure

tcpinfo_t Structure

Examples

Connections by Host Address

Connections by TCP Port

Who is Connecting to What

Who Isn't Connecting to What

Packets by Host Address

Packets by Local Port

Sent Size Distribution

tcpstate.d

tcpio.d

Stability

udp Provider

Probes

Arguments

pktinfo_t Structure

csinfo_t Structure

ipinfo_t Structure

udpsinfo_t Structure

udpsinfo_t Structure

Examples

Packets by Host Address

Packets by Local Port

Sent Size Distribution

Stability

12.  User Process Tracing

13.  Statically Defined Tracing for User Applications

14.  Security

15.  Anonymous Tracing

16.  Postmortem Tracing

17.  Performance Considerations

18.  Stability

19.  Translators

20.  Versioning

Index

fbt Provider

This chapter describes the Function Boundary Tracing (FBT) provider, which provides probes associated with the entry to and return from most functions in the Oracle Solaris kernel. The function is the fundamental unit of program text. In a well-designed system, each function performs a discrete and well-defined operation on a specified object or series of like objects. Therefore, even on the smallest Oracle Solaris systems, FBT will provide on the order of 20,000 probes.

Similar to other DTrace providers, FBT has no probe effect when it is not explicitly enabled. When enabled, FBT only induces a probe effect in probed functions. While the FBT implementation is highly specific to the instruction set architecture, FBT has been implemented on both SPARC and x86 platforms. For each instruction set, there are a small number of functions that do not call other functions and are highly optimized by the compiler (so-called _leaf functions_) that cannot be instrumented by FBT. Probes for these functions are not present in DTrace.

Effective use of FBT probes requires knowledge of the operating system implementation. Therefore, it is recommended that you use FBT only when developing kernel software or when other providers are not sufficient. Other DTrace providers, including syscall, sched, proc, and io, can be used to answer most system analysis questions without requiring operating system implementation knowledge.

Probes

FBT provides a probe at the boundary of most functions in the kernel. The boundary of a function is crossed by entering the function and by returning from the function. FBT thus provides two functions for every function in the kernel: one upon entry to the function, and one upon return from the function. These probes are named entry and return, respectively. The function name, and module name are specified as part of the probe. All FBT probes specify a function name and module name.

Probe arguments

entry probes

The arguments to entry probes are the same as the arguments to the corresponding operating system kernel function. These arguments may be accessed in a typed fashion by using the args[] array. These arguments may be accessed as int64_t's by using the arg0 .. arg n variables.

return probes

While a given function only has a single point of entry, it may have many different points where it returns to its caller. You are usually interested in either the value that a function returned or the fact that the function returned at all rather than the specific return path taken. FBT therefore collects a function's multiple return sites into a single return probe. If the exact return path is of interest, you can examine the return probe args[0] value, which indicates the offset (in bytes) of the returning instruction in the function text.

If the function has a return value, the return value is stored in args[1]. If a function does not have a return value, args[1] is not defined.

Examples

You can use FBT to easily explore the kernel's implementation. The following example script records the first ioctl(2) from any xclock process and then follows the subsequent code path through the kernel:

/*
 * To make the output more readable, we want to indent every function entry
 * (and unindent every function return).  This is done by setting the
 * "flowindent" option.
 */
#pragma D option flowindent

syscall::ioctl:entry
/execname == "xclock" && guard++ == 0/
{
        self->traceme = 1;
        printf("fd: %d", arg0);
}

fbt:::
/self->traceme/
{}

syscall::ioctl:return
/self->traceme/
{
        self->traceme = 0;
        exit(0);
}

Running this script results in output similar to the following example:

# dtrace -s ./xioctl.d
dtrace: script './xioctl.d' matched 26254 probes
CPU FUNCTION                                 
  0  => ioctl                                 fd: 3
  0    -> ioctl                               
  0      -> getf                              
  0        -> set_active_fd                   
  0        <- set_active_fd                   
  0      <- getf                              
  0      -> fop_ioctl                         
  0        -> sock_ioctl                      
  0          -> strioctl                      
  0            -> job_control_type            
  0            <- job_control_type            
  0            -> strcopyout                  
  0              -> copyout                   
  0              <- copyout                   
  0            <- strcopyout                  
  0          <- strioctl                      
  0        <- sock_ioctl                      
  0      <- fop_ioctl                         
  0      -> releasef                          
  0        -> clear_active_fd                 
  0        <- clear_active_fd                 
  0        -> cv_broadcast                    
  0        <- cv_broadcast                    
  0      <- releasef                          
  0    <- ioctl                               
  0  <= ioctl

The output shows that an xclock process called ioctl on a file descriptor that appears to be associated with a socket.

You can also use FBT when trying to understand kernel drivers. For example, the ssd(7D) driver has many code paths by which EIO may be returned. FBT can be easily used to determine the precise code path that resulted in an error condition, as shown in the following example:

fbt:ssd::return
/arg1 == EIO/
{
        printf("%s+%x returned EIO.", probefunc, arg0);
}

For more information on any one return of EIO, one may wish to speculatively trace all fbt probes, and then commit(or discard) based on the return value of a specific function. See Chapter 7, Speculative Tracing for details on speculative tracing.

Alternatively, you can use FBT to understand the functions called within a specified module. The following example lists all of the functions called in UFS:

# dtrace -n fbt:ufs::entry'{@a[probefunc] = count()}'
dtrace: description 'fbt:ufs::entry' matched 353 probes
^C
  ufs_ioctl                                                         1
  ufs_statvfs                                                       1
  ufs_readlink                                                      1
  ufs_trans_touch                                                   1
  wrip                                                              1
  ufs_dirlook                                                       1
  bmap_write                                                        1
  ufs_fsync                                                         1
  ufs_iget                                                          1
  ufs_trans_push_inode                                              1
  ufs_putpages                                                      1
  ufs_putpage                                                       1
  ufs_syncip                                                        1
  ufs_write                                                         1
  ufs_trans_write_resv                                              1
  ufs_log_amt                                                       1
  ufs_getpage_miss                                                  1
  ufs_trans_syncip                                                  1
  getinoquota                                                       1
  ufs_inode_cache_constructor                                       1
  ufs_alloc_inode                                                   1
  ufs_iget_alloced                                                  1
  ufs_iget_internal                                                 2
  ufs_reset_vnode                                                   2
  ufs_notclean                                                      2
  ufs_iupdat                                                        2
  blkatoff                                                          3
  ufs_close                                                         5
  ufs_open                                                          5
  ufs_access                                                        6
  ufs_map                                                           8
  ufs_seek                                                         11
  ufs_addmap                                                       15
  rdip                                                             15
  ufs_read                                                         15
  ufs_rwunlock                                                     16
  ufs_rwlock                                                       16
  ufs_delmap                                                       18
  ufs_getattr                                                      19
  ufs_getpage_ra                                                   24
  bmap_read                                                        25
  findextent                                                       25
  ufs_lockfs_begin                                                 27
  ufs_lookup                                                       46
  ufs_iaccess                                                      51
  ufs_imark                                                        92
  ufs_lockfs_begin_getpage                                        102
  bmap_has_holes                                                  102
  ufs_getpage                                                     102
  ufs_itimes_nolock                                               107
  ufs_lockfs_end                                                  125
  dirmangled                                                      498
  dirbadname                                                      498

If you know the purpose or arguments of a kernel function, you can use FBT to understand how or why the function is being called. For example, putnext(9F) takes a pointer to a queue(9S) structure as its first member. The q_qinfo member of the queue structure is a pointer to a qinit(9S) structure. The qi_minfo member of the qinit structure has a pointer to a module_info(9S) structure, which contains the module name in its mi_idname member. The following example puts this information together by using the FBT probe in putnext to track putnext(9F) calls by module name:

fbt::putnext:entry
{
        @calls[stringof(args[0]->q_qinfo->qi_minfo->mi_idname)] = count();
}

Running the above script results in output similar to the following example:

# dtrace -s ./putnext.d
^C
  iprb                         1
  rpcmod                       1
  pfmod                        1
  timod                        2
  vpnmod                       2
  pts                          40
  conskbd                      42
  kb8042                       42
  tl                           58
  arp                          108
  tcp                          126
  ptm                          249
  ip                           313
  ptem                         340
  vuid2ps2                     361
  ttcompat                     412
  ldterm                       413
  udp                          569
  strwhead                     624
  mouse8042                                 726

You can also use FBT to determine the time spent in a particular function. The following example shows how to determine the callers of the DDI delaying routines drv_usecwait(9F) and delay(9F).

fbt::delay:entry,
fbt::drv_usecwait:entry
{
        self->in = timestamp
}

fbt::delay:return,
fbt::drv_usecwait:return
/self->in/
{
        @snoozers[stack()] = quantize(timestamp - self->in);
        self->in = 0;
}

This example script is particularly interesting to run during boot. Chapter 15, Anonymous Tracing describes the procedure for performing anonymous tracing during system boot. Upon reboot, you might see output similar to the following example:

# dtrace -ae

              ata`ata_wait+0x34
              ata`ata_id_common+0xf5
              ata`ata_disk_id+0x20
              ata`ata_drive_type+0x9a
              ata`ata_init_drive+0xa2
              ata`ata_attach+0x50
              genunix`devi_attach+0x75
              genunix`attach_node+0xb2
              genunix`i_ndi_config_node+0x97
              genunix`i_ddi_attachchild+0x4b
              genunix`devi_attach_node+0x3d
              genunix`devi_config_one+0x1d0
              genunix`ndi_devi_config_one+0xb0
              devfs`dv_find+0x125
              devfs`devfs_lookup+0x40
              genunix`fop_lookup+0x21
              genunix`lookuppnvp+0x236
              genunix`lookuppnat+0xe7
              genunix`lookupnameat+0x87
              genunix`cstatat_getvp+0x134

           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@@@@@@@@@@@@@@@                    4105     
            8192 |@@@@                                     783      
           16384 |@@@@@@@@@@@@@@                           2793     
           32768 |                                         16       
           65536 |                                         0


              kb8042`kb8042_wait_poweron+0x29
              kb8042`kb8042_init+0x22
              kb8042`kb8042_attach+0xd6
              genunix`devi_attach+0x75
              genunix`attach_node+0xb2
              genunix`i_ndi_config_node+0x97
              genunix`i_ddi_attachchild+0x4b
              genunix`devi_attach_node+0x3d
              genunix`devi_config_one+0x1d0
              genunix`ndi_devi_config_one+0xb0
              genunix`resolve_pathname+0xa5
              genunix`ddi_pathname_to_dev_t+0x16
              consconfig_dacf`consconfig_load_drivers+0x14
              consconfig_dacf`dynamic_console_config+0x6c
              consconfig`consconfig+0x8
              unix`stubs_common_code+0x3b

           value  ------------- Distribution ------------- count    
          262144 |                                         0        
          524288 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@      221      
         1048576 |@@@@                                     29       
         2097152 |                                         0

        
              usba`hubd_enable_all_port_power+0xed
              usba`hubd_check_ports+0x8e
              usba`usba_hubdi_attach+0x275
              usba`usba_hubdi_bind_root_hub+0x168
              uhci`uhci_attach+0x191
              genunix`devi_attach+0x75
              genunix`attach_node+0xb2
              genunix`i_ndi_config_node+0x97
              genunix`i_ddi_attachchild+0x4b
              genunix`i_ddi_attach_node_hierarchy+0x49
              genunix`attach_driver_nodes+0x49
              genunix`ddi_hold_installed_driver+0xe3
              genunix`attach_drivers+0x28

           value  ------------- Distribution ------------- count    
        33554432 |                                         0        
        67108864 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 3        
       134217728 |                                         0

Tail-call Optimization

When one function ends by calling another function, the compiler can engage in tail-call optimization, in which the function being called reuses the caller's stack frame. This procedure is most commonly used in the SPARC architecture, where the compiler reuses the caller's register window in the function being called in order to minimize register window pressure.

The presence of this optimization causes the return probe of the calling function to fire before the entry probe of the called function. This ordering can lead to quite a bit of confusion. For example, if you wanted to record all functions called from a particular function and any functions that this function calls, you might use the following script:

fbt::foo:entry
{
        self->traceme = 1;
}

fbt:::entry
/self->traceme/
{
        printf("called %s", probefunc);
}

fbt::foo:return
/self->traceme/
{
        self->traceme = 0;
}

However, if foo ends in an optimized tail-call, the tail-called function, and therefore any functions that it calls, will not be captured. The kernel cannot be dynamically de-optimized on the fly, and DTrace does not wish to engage in a lie about how code is structured. Therefore, you should be aware of when tail-call optimization might be used.

Tail-call optimization is likely to be used in source code similar to the following example:

 return (bar());

Or in source code similar to the following example:

        (void) bar();
        return;

Conversely, function source code that ends like the following example cannot have its call to bar optimized, because the call to bar is not a tail-call:

        bar();
        return (rval);

You can determine whether a call has been tail-call optimized using the following technique:

Due to the instruction set architecture, tail-call optimization is far more common on SPARC systems than on x86 systems. The following example uses mdb to discover tail-call optimization in the kernel's dup function:

# dtrace -q -n fbt::dup:return'{printf("%s+0x%x", probefunc, arg0);}'

While this command is running, run a program that performs a dup(2), such as a bash process. The above command should provide output similar to the following example:

dup+0x10
^C

Now examine the function with mdb:

# echo "dup::dis" | mdb -k
dup:                            sra       %o0, 0, %o0
dup+4:                          mov       %o7, %g1
dup+8:                          clr       %o2
dup+0xc:                        clr       %o1
dup+0x10:                       call      -0x1278       <fcntl>
dup+0x14:                       mov       %g1, %o7

The output shows that dup+0x10 is a call to the fcntl function and not a ret instruction. Therefore, the call to fcntl is an example of tail-call optimization.

Assembly Functions

You might observe functions that seem to enter but never return or vice versa. Such rare functions are generally hand-coded assembly routines that branch to the middle of other hand-coded assembly functions. These functions should not impede analysis: the branched-to function must still return to the caller of the branched-from function. That is, if you enable all FBT probes, you should see the entry to one function and the return from another function at the same stack depth.

Instruction Set Limitations

Some functions cannot be instrumented by FBT. The exact nature of uninstrumentable functions is specific to the instruction set architecture.

x86 Limitations

Functions that do not create a stack frame on x86 systems cannot be instrumented by FBT. Because the register set for x86 is extraordinarily small, most functions must put data on the stack and therefore create a stack frame. However, some x86 functions do not create a stack frame and therefore cannot be instrumented. Actual numbers vary, but typically fewer than five percent of functions cannot be instrumented on the x86 platform.

SPARC Limitations

Leaf routines hand-coded in assembly language on SPARC systems cannot be instrumented by FBT. The majority of the kernel is written in C, and all functions written in C can be instrumented by FBT. Actual numbers vary, but typically fewer cannot be instrumented on the SPARC platform.

Breakpoint Interaction

FBT works by dynamically modifying kernel text. Because kernel breakpoints also work by modifying kernel text, if a kernel breakpoint is placed at an entry or return site before loading DTrace, FBT will refuse to provide a probe for the function, even if the kernel breakpoint is subsequently removed. If the kernel breakpoint is placed after loading DTrace, both the kernel breakpoint and the DTrace probe will correspond to the same point in text. In this situation, the breakpoint will trigger first, and then the probe will fire when the debugger resumes the kernel. It is recommended that kernel breakpoints not be used concurrently with DTrace. If breakpoints are required, use the DTrace breakpoint action instead.

Module Loading

The Oracle Solaris kernel can dynamic load and unload kernel modules. When FBT is loaded and a module is dynamically loaded, FBT automatically provides new probes associated with the new module. If a loaded module has no enabled FBT probes, the module may be unloaded; the corresponding probes will be destroyed as the module is unloaded. If a loaded module has enabled FBT probes, the module is considered busy, and cannot be unloaded.

Stability

The FBT provider uses DTrace's stability mechanism to describe its stabilities, as shown in the following table. For more information about the stability mechanism, see Chapter 18, Stability.

Element
Name stability
Data stability
Dependency class
Provider
Evolving
Evolving
ISA
Module
Private
Private
Unknown
Function
Private
Private
Unknown
Name
Evolving
Evolving
ISA
Arguments
Private
Private
ISA

As FBT exposes the kernel implementation, nothing about it is Stable — and the Module and Function name and data stability are explicitly Private. The data stability for Provider and Name are Evolving, but all other data stabilities are Private: they are artifacts of the current implementation. The dependency class for FBT is ISA: while FBT is available on all current instruction set architectures, there is no guarantee that FBT will be available on arbitrary future instruction set architectures.