aboutsummaryrefslogtreecommitdiff
path: root/cddl/contrib/opensolaris/cmd/lockstat/lockstat.1
diff options
context:
space:
mode:
Diffstat (limited to 'cddl/contrib/opensolaris/cmd/lockstat/lockstat.1')
-rw-r--r--cddl/contrib/opensolaris/cmd/lockstat/lockstat.1402
1 files changed, 402 insertions, 0 deletions
diff --git a/cddl/contrib/opensolaris/cmd/lockstat/lockstat.1 b/cddl/contrib/opensolaris/cmd/lockstat/lockstat.1
new file mode 100644
index 000000000000..552496840d3b
--- /dev/null
+++ b/cddl/contrib/opensolaris/cmd/lockstat/lockstat.1
@@ -0,0 +1,402 @@
+'\" te
+.\" CDDL HEADER START
+.\"
+.\" The contents of this file are subject to the terms of the
+.\" Common Development and Distribution License (the "License").
+.\" You may not use this file except in compliance with the License.
+.\"
+.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
+.\" or http://www.opensolaris.org/os/licensing.
+.\" See the License for the specific language governing permissions
+.\" and limitations under the License.
+.\"
+.\" When distributing Covered Code, include this CDDL HEADER in each
+.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
+.\" If applicable, add the following below this CDDL HEADER, with the
+.\" fields enclosed by brackets "[]" replaced with your own identifying
+.\" information: Portions Copyright [yyyy] [name of copyright owner]
+.\"
+.\" CDDL HEADER END
+.\" Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd September 29, 2015
+.Dt LOCKSTAT 1
+.Os
+.Sh NAME
+.Nm lockstat
+.Nd report kernel lock and profiling statistics
+.Sh SYNOPSIS
+.Nm
+.Op Fl ACEHIV
+.Op Fl e Ar event-list
+.Op Fl i Ar rate
+.Op Fl b | t | h | s Ar depth
+.Op Fl n Ar num-records
+.Op Fl l Ar lock Oo Ns , Ns Ar size Oc
+.Op Fl d Ar duration
+.Op Fl f Ar function Oo Ns , Ns Ar size Oc
+.Op Fl T
+.Op Fl kgwWRpP
+.Op Fl D Ar count
+.Op Fl o filename
+.Op Fl x Ar opt Oo Ns = Ns Ar val Oc
+.Ar command
+.Op Oo Ar args Oc
+.Sh DESCRIPTION
+The
+.Nm
+utility gathers and displays kernel locking and profiling statistics.
+.Nm
+allows you to specify which events to watch (for example, spin on adaptive
+mutex, block on read access to rwlock due to waiting writers, and so forth), how
+much data to gather for each event, and how to display the data.
+By default,
+.Nm
+monitors all lock contention events, gathers frequency and timing data about
+those events, and displays the data in decreasing frequency order, so that the
+most common events appear first.
+.Pp
+.Nm
+gathers data until the specified command completes.
+For example, to gather statistics for a fixed-time interval, use
+.Xr sleep 1
+as the command, as follows:
+.Pp
+.Dl # lockstat sleep 5
+.Pp
+When the
+.Fl I
+option is specified,
+.Nm lockstat
+establishes a per-processor high-level periodic interrupt source to gather
+profiling data.
+The interrupt handler simply generates a
+.Nm
+event whose caller is the interrupted PC (program counter).
+The profiling event is just like any other
+.Nm lockstat
+event, so all of the normal
+.Nm lockstat
+options are applicable.
+.Pp
+.Nm
+relies on DTrace to modify the running kernel's text to intercept events of
+interest.
+This imposes a small but measurable overhead on all system activity, so access
+to
+.Nm
+is restricted to super-user by default.
+.Sh OPTIONS
+The following options are supported:
+.Bl -tag -width indent
+.It Fl V
+Print the D program used to gather the requested data.
+.El
+.Ss Event Selection
+If no event selection options are specified, the default is
+.Fl C .
+.Bl -tag -width indent
+.It Fl A
+Watch all lock events.
+.Fl A
+is equivalent to
+.Fl CH .
+.It Fl C
+Watch contention events.
+.It Fl E
+Watch error events.
+.It Fl e Ar event-list
+Only watch the specified events.
+.Ar event-list
+is a comma-separated list of events or ranges of events such as 1,4-7,35.
+Run
+.Nm
+with no arguments to get a brief description of all events.
+.It Fl H
+Watch hold events.
+.It Fl I
+Watch profiling interrupt events.
+.It Fl i Ar rate
+Interrupt rate (per second) for
+.Fl I .
+The default is 97 Hz, so that profiling doesn't run in lockstep with the clock
+interrupt (which runs at 100 Hz).
+.El
+.Ss Data Gathering
+.Bl -tag -width indent
+.It Fl x Ar arg Oo Ns = Ns Ar val Oc
+Enable or modify a
+.Xr dtrace 1
+runtime option or D compiler option.
+Boolean options are enabled by specifying their name.
+Options with values are set by separating the option name and value with an
+equals sign.
+.El
+.Ss "Data Gathering (Mutually Exclusive)"
+.Bl -tag -width indent
+.It Fl b
+Basic statistics: lock, caller, number of events.
+.It Fl h
+Histogram: timing plus time-distribution histograms.
+.It Fl s Ar depth
+Stack trace: histogram plus stack traces up to
+.Ar depth
+frames deep.
+.It Fl t
+Timing: Basic plus timing for all events (default).
+.El
+.Ss "Data Filtering"
+.Bl -tag -width indent
+.It Fl d Ar duration
+Only watch events longer than
+.Ar duration .
+.It Fl f Ar func Ns Oo Ns , Ns Ar size Oc Ns
+Only watch events generated by
+.Ar func ,
+which can be specified as a symbolic name or hex address.
+.Ar size
+defaults to the ELF symbol size if available, or 1 if not.
+.It Fl l Ar lock Ns Oo Ns , Ns Ar size Oc Ns
+Only watch
+.Ar lock ,
+which can be specified as a symbolic name or hex address.
+.Ar size
+defaults to the ELF symbol size or 1 if the symbol size is not available.
+.It Fl n Ar num-records
+Maximum number of data records.
+.It Fl T
+Trace (rather than sample) events.
+This is off by default.
+.El
+.Ss Data Reporting
+.Bl -tag -width indent
+.It Fl D Ar count
+Only display the top
+.Ar count
+events of each type.
+.It Fl g
+Show total events generated by function.
+For example, if
+.Fn foo
+calls
+.Fn bar
+in a loop, the work done by
+.Fn bar
+counts as work generated by
+.Fn foo
+(along with any work done by
+.Fn foo
+itself).
+The
+.Fl g
+option works by counting the total number of stack frames in which each function
+appears.
+This implies two things: (1) the data reported by
+.Fl g
+can be misleading if the stack traces are not deep enough, and (2) functions
+that are called recursively might show greater than 100% activity.
+In light of issue (1), the default data gathering mode when using
+.Fl g
+is
+.Fl s 50 .
+.It Fl k
+Coalesce PCs within functions.
+.It Fl o Ar filename
+Direct output to
+.Ar filename .
+.It Fl P
+Sort data by (\fIcount * time\fR) product.
+.It Fl p
+Parsable output format.
+.It Fl R
+Display rates (events per second) rather than counts.
+.It Fl W
+Whichever: distinguish events only by caller, not by lock.
+.It Fl w
+Wherever: distinguish events only by lock, not by caller.
+.El
+.Sh DISPLAY FORMATS
+The following headers appear over various columns of data.
+.Bl -tag -width indent
+.It Count or ops/s
+Number of times this event occurred, or the rate (times per second) if
+.Fl R
+was specified.
+.It indv
+Percentage of all events represented by this individual event.
+.It genr
+Percentage of all events generated by this function.
+.It cuml
+Cumulative percentage; a running total of the individuals.
+.It rcnt
+Average reference count.
+This will always be 1 for exclusive locks (mutexes,
+spin locks, rwlocks held as writer) but can be greater than 1 for shared locks
+(rwlocks held as reader).
+.It nsec
+Average duration of the events in nanoseconds, as appropriate for the event.
+For the profiling event, duration means interrupt latency.
+.It Lock
+Address of the lock; displayed symbolically if possible.
+.It CPU+Pri_Class
+CPU plus the priority class of the interrupted thread.
+For example, if CPU 4 is interrupted while running a timeshare thread, this
+will be reported as
+.Ql cpu[4]+TShar .
+.It Caller
+Address of the caller; displayed symbolically if possible.
+.El
+.Sh EXAMPLES
+.Bl -tag -width 0n
+.It Example 1 Measuring Kernel Lock Contention
+.Pp
+.Li # lockstat sleep 5
+.Bd -literal
+Adaptive mutex spin: 41411 events in 5.011 seconds (8263 events/sec)
+
+Count indv cuml rcnt nsec Lock Caller
+-------------------------------------------------------------------------------
+13750 33% 33% 0.00 72 vm_page_queue_free_mtx vm_page_free_toq+0x12e
+13648 33% 66% 0.00 66 vm_page_queue_free_mtx vm_page_alloc+0x138
+ 4023 10% 76% 0.00 51 vm_dom+0x80 vm_page_dequeue+0x68
+ 2672 6% 82% 0.00 186 vm_dom+0x80 vm_page_enqueue+0x63
+ 618 1% 84% 0.00 31 0xfffff8000cd83a88 qsyncvp+0x37
+ 506 1% 85% 0.00 164 0xfffff8000cb3f098 vputx+0x5a
+ 477 1% 86% 0.00 69 0xfffff8000c7eb180 uma_dbg_getslab+0x5b
+ 288 1% 87% 0.00 77 0xfffff8000cd8b000 vn_finished_write+0x29
+ 263 1% 88% 0.00 103 0xfffff8000cbad448 vinactive+0xdc
+ 259 1% 88% 0.00 53 0xfffff8000cd8b000 vfs_ref+0x24
+ 237 1% 89% 0.00 20 0xfffff8000cbad448 vfs_hash_get+0xcc
+ 233 1% 89% 0.00 22 0xfffff8000bfd9480 uma_dbg_getslab+0x5b
+ 223 1% 90% 0.00 20 0xfffff8000cb3f098 cache_lookup+0x561
+ 193 0% 90% 0.00 16 0xfffff8000cb40ba8 vref+0x27
+ 175 0% 91% 0.00 34 0xfffff8000cbad448 vputx+0x5a
+ 169 0% 91% 0.00 51 0xfffff8000cd8b000 vfs_unbusy+0x27
+ 164 0% 92% 0.00 31 0xfffff8000cb40ba8 vputx+0x5a
+[...]
+
+Adaptive mutex block: 10 events in 5.011 seconds (2 events/sec)
+
+Count indv cuml rcnt nsec Lock Caller
+-------------------------------------------------------------------------------
+ 3 30% 30% 0.00 17592 vm_page_queue_free_mtx vm_page_alloc+0x138
+ 2 20% 50% 0.00 20528 vm_dom+0x80 vm_page_enqueue+0x63
+ 2 20% 70% 0.00 55502 0xfffff8000cb40ba8 vputx+0x5a
+ 1 10% 80% 0.00 12007 vm_page_queue_free_mtx vm_page_free_toq+0x12e
+ 1 10% 90% 0.00 9125 0xfffff8000cbad448 vfs_hash_get+0xcc
+ 1 10% 100% 0.00 7864 0xfffff8000cd83a88 qsyncvp+0x37
+-------------------------------------------------------------------------------
+[...]
+.Ed
+.It Example 2 Measuring Hold Times
+.Pp
+.Li # lockstat -H -D 10 sleep 1
+.Bd -literal
+Adaptive mutex hold: 109589 events in 1.039 seconds (105526 events/sec)
+
+Count indv cuml rcnt nsec Lock Caller
+-------------------------------------------------------------------------------
+ 8998 8% 8% 0.00 617 0xfffff8000c7eb180 uma_dbg_getslab+0xd4
+ 5901 5% 14% 0.00 917 vm_page_queue_free_mtx vm_object_terminate+0x16a
+ 5040 5% 18% 0.00 902 vm_dom+0x80 vm_page_free_toq+0x88
+ 4884 4% 23% 0.00 1056 vm_page_queue_free_mtx vm_page_alloc+0x44e
+ 4664 4% 27% 0.00 759 vm_dom+0x80 vm_fault_hold+0x1a13
+ 4011 4% 31% 0.00 888 vm_dom vm_page_advise+0x11b
+ 4010 4% 34% 0.00 957 vm_dom+0x80 _vm_page_deactivate+0x5c
+ 3743 3% 38% 0.00 582 0xfffff8000cf04838 pmap_is_prefaultable+0x158
+ 2254 2% 40% 0.00 952 vm_dom vm_page_free_toq+0x88
+ 1639 1% 41% 0.00 591 0xfffff800d60065b8 trap_pfault+0x1f7
+-------------------------------------------------------------------------------
+[...]
+
+R/W writer hold: 64314 events in 1.039 seconds (61929 events/sec)
+
+Count indv cuml rcnt nsec Lock Caller
+-------------------------------------------------------------------------------
+ 7421 12% 12% 0.00 2994 pvh_global_lock pmap_page_is_mapped+0xb6
+ 4668 7% 19% 0.00 3313 pvh_global_lock pmap_enter+0x9ae
+ 1639 3% 21% 0.00 733 0xfffff80168d10200 vm_object_deallocate+0x683
+ 1639 3% 24% 0.00 3061 0xfffff80168d10200 unlock_and_deallocate+0x2b
+ 1639 3% 26% 0.00 2966 0xfffff80168d10200 vm_fault_hold+0x16ee
+ 1567 2% 29% 0.00 733 0xfffff80168d10200 vm_fault_hold+0x19bc
+ 821 1% 30% 0.00 786 0xfffff801eb0cc000 vm_object_madvise+0x32d
+ 649 1% 31% 0.00 4918 0xfffff80191105300 vm_fault_hold+0x16ee
+ 648 1% 32% 0.00 8112 0xfffff80191105300 unlock_and_deallocate+0x2b
+ 647 1% 33% 0.00 1261 0xfffff80191105300 vm_object_deallocate+0x683
+-------------------------------------------------------------------------------
+.Ed
+.It Example 3 Measuring Hold Times for Stack Traces Containing a Specific Function
+.Pp
+.Li # lockstat -H -f tcp_input -s 50 -D 10 sleep 1
+.Bd -literal
+Adaptive mutex hold: 68 events in 1.026 seconds (66 events/sec)
+
+-------------------------------------------------------------------------------
+Count indv cuml rcnt nsec Lock Caller
+ 32 47% 47% 0.00 1631 0xfffff800686f50d8 tcp_do_segment+0x284b
+
+ nsec ------ Time Distribution ------ count Stack
+ 1024 |@@@@@@@@@@ 11 tcp_input+0xf54
+ 2048 |@@@@@@@@@@@@@ 14 ip_input+0xc8
+ 4096 |@@@@@ 6 swi_net+0x192
+ 8192 | 1 intr_event_execute_handlers+0x93
+ ithread_loop+0xa6
+ fork_exit+0x84
+ 0xffffffff808cf9ee
+-------------------------------------------------------------------------------
+Count indv cuml rcnt nsec Lock Caller
+ 29 43% 90% 0.00 4851 0xfffff800686f50d8 sowakeup+0xf8
+
+ nsec ------ Time Distribution ------ count Stack
+ 4096 |@@@@@@@@@@@@@@@ 15 tcp_do_segment+0x2423
+ 8192 |@@@@@@@@@@@@ 12 tcp_input+0xf54
+ 16384 |@@ 2 ip_input+0xc8
+ swi_net+0x192
+ intr_event_execute_handlers+0x93
+ ithread_loop+0xa6
+ fork_exit+0x84
+ 0xffffffff808cf9ee
+-------------------------------------------------------------------------------
+[...]
+.Ed
+.El
+.Sh SEE ALSO
+.Xr dtrace 1 ,
+.Xr ksyms 4 ,
+.Xr locking 9
+.Sh NOTES
+Tail-call elimination can affect call sites.
+For example, if
+.Fn foo Ns +0x50
+calls
+.Fn bar
+and the last thing
+.Fn bar
+does is call
+.Fn mtx_unlock ,
+the compiler can arrange for
+.Fn bar
+to branch to
+.Fn mtx_unlock
+with a return address of
+.Fn foo Ns +0x58.
+Thus, the
+.Fn mtx_unlock
+in
+.Fn bar
+will appear as though it occurred at
+.Fn foo Ns +0x58.
+.Pp
+The PC in the stack frame in which an interrupt occurs can be bogus because,
+between function calls, the compiler is free to use the return address register
+for local storage.
+.Pp
+When using the
+.Fl I
+and
+.Fl s
+options together, the interrupted PC will usually not appear anywhere in the
+stack since the interrupt handler is entered asynchronously, not by a function
+call from that PC.