lockinfo(8)

NAME

lockinfo - Displays kernel locking statistics for SMP and NUMA platforms

SYNOPSIS

/usr/sbin/lockinfo [-since_boot] [-class=lock_class] [-top=count] [-sort=statistic] [-rad=rad_id] [-percpu] [command [cmd_args]]

OPTIONS

-class=lock_class
    Specifies a particular lock class name (for example, thread.lock) for which trace information is to be displayed. In addition to the general information displayed about all lock classes, this option displays detailed information about where in the kernel code the specified lock class is asserted. For an alternative way to request trace information, see the entry for the -top option.

-percpu
    Displays statistics for each lock on a per-CPU basis.

-rad=rad_id
    Displays lock statistics for the specified RAD. This option is useful only on NUMA systems. The lockinfo command prints an error message if the specified RAD does not exist.

    The -rad option is available in the version of the lockinfo software included in the Tru64 UNIX product kit, starting with Version 5.1B. However, this version of lockinfo can also be used on Versions 5.1 and 5.1A.

-since_boot
    Displays locking statistics for the system since it was booted. To use this option, the system must have been booted with the lockmode attribute set to 4. See sys_attrs_generic(5) for more information about the lockmode attribute.

-sort=statistic
    Sorts output statistics from highest to lowest for the specified statistic. The value for statistic can be one of the following, which also correspond to the column headings in the lockinfo display:

    tries
        The number of tries for asserting the lock. (Default)

    reads
        The number of attempts on read.

    trmax
        The maximum number of readers in the critical path (for a C class lock) or the maximum number of cycles spent holding the lock (for an S, RWS, or MCS class lock).

    misses
        The number of lock misses.

    sleeps
        The number of blocks encountered while waiting for the lock.

    waitmax
        The maximum amount of time (in seconds) spent waiting for a lock.

    waitsum
        The total (in seconds) of all times spent waiting for the lock.

    misspct
        The lock miss percent (corresponds to the percent misses column in the display).

    You can also specify none for statistic. In this case, lockinfo sorts output in the order in which the lock classes are defined in the kernel source. (This is not a particularly useful option.)

-top=count
    Causes lockinfo to make an initial statistics-gathering pass and then perform lock class tracing, in turn, for each of the asserted locks that sorted in the top count during the first pass. When you specify the -top option, any command you pass to lockinfo executes count+1 times. To request trace information for one specific lock class, see the entry for the -class option.
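Because the -top option reruns the command once for the initial pass and once per traced lock class, the collection time grows with count. The following is a minimal sketch of that arithmetic, assuming the command is sleep with a fixed number of seconds and ignoring any overhead in lockinfo itself. The function name top_run_seconds is hypothetical and is not part of lockinfo.

```shell
# Hypothetical helper (not part of lockinfo): estimate how long
# "lockinfo -top=COUNT sleep SECS" keeps gathering statistics.
# The command executes COUNT+1 times: one statistics pass, then one
# trace pass for each lock class in the top COUNT.
top_run_seconds() {
    count=$1
    secs=$2
    echo $(( (count + 1) * secs ))
}

top_run_seconds 3 60    # prints 240
```

For example, -top=3 with sleep 60 keeps statistics gathering active for roughly four minutes, which matters because of the performance impact described in RESTRICTIONS.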

OPERANDS

command
    The command to be executed by lockinfo.

cmd_args
    Any arguments to the preceding command.

The command and cmd_args operands are used to limit the length of time that lockinfo runs. Typically, sleep is specified for command and some number of seconds is specified for cmd_args.

DESCRIPTION

The lockinfo utility collects and displays locking statistics for the kernel SMP locks. It uses the /dev/lockdev pseudo driver to collect data. Locking statistics can be gathered when the lockmode attribute for the generic subsystem is set to 2 (the default for SMP systems), 3, or 4. Be sure to review RESTRICTIONS for important information about support for this utility.

When you enter a lockinfo command, the utility first opens the lockdev pseudo driver and turns on lock statistics gathering. Then, the utility forks and executes the specified command. After command completes, the utility turns off lock statistics gathering (closes lockdev), collects the data, and sends it to stdout. The output data shows the SMP locking that was done by the operating system during the execution time for command. If you do not include command to set a time limit on the interval for gathering statistics, the statistics shown are for the length of time since the last system boot.

To gather statistics with lockinfo, you typically follow these steps:

1. Start up a system work load and wait for it to get to a steady state.

2. Start lockinfo with sleep as the specified command and some number of seconds as the specified cmd_args. This causes lockinfo to gather statistics for the length of time it takes the sleep command to execute.

3. Based on the first set of results, use lockinfo again to request more specific information about any lock class that shows results, such as a large percentage of misses, that you believe is very likely to cause a system performance problem.
For example, the following command causes lockinfo to collect locking statistics for 60 seconds:

    # lockinfo sleep 60

The output from this command might look like this:

    hostname:     sysname.node.corp.com
    lockmode:     2 (SMP default)
    processors:   4
    start time:   Wed Jun 9 14:38:05 1999
    end time:     Wed Jun 9 14:39:05 1999
    command:      sleep 60

                          tries    reads    trmax   misses  percent   sleeps  waitmax  waitsum
                                                             misses           seconds  seconds
    bsBuf.bufLock (S)   5718642        0    45745   194509      3.4        0  0.00007  0.63226
    lock.l_lock   (S)   5579643        0    40985    75656      1.4        0  0.00005  0.16531
    thread.lock   (S)   1989132        0    24817    21795      1.1        0  0.00003  0.03864
    vnode.v_lock  (S)   1578583        0    49207     1527      0.1        0  0.00002  0.00443
    vdT.vdIoLock  (S)   1412449        0   149797    81078      5.7        0  0.00017  0.51438
    BsBufHashLock (S)   1377903        0    42586    89312      6.5        0  0.00006  0.24563
    . . .
    inifaddr_lock (C)         1        1        1        0      0.0        0  0.00000  0.00000

    total simple_locks = 28545191  percent unknown = 0.0
    total rws_locks = 1429  percent reads = 100.0
    total complex_locks = 2764296  percent reads = 33.2  percent unknown = 0.0
    #

The first six lines of output specify the system, its lockmode attribute setting, how many processors it has, the start and end times of the statistics gathering, and the command that was run.

Following the initial six lines, a table with statistics data for each lock class is displayed. See the description of the -sort option for an explanation of the data in each column. In this example, the lock class entries in the table are sorted by the number of tries (the default sort order). Note that each lock class name is tagged (in parentheses) with its lock type. The following lock types are supported:

    ______________________________________
    Tag      Lock Type
    ______________________________________
    (C)      Complex lock
    (MCS)    Queued lock (NUMA system only)
    (RWS)    Reader/Writer spin lock
    (S)      Simple spin lock
    ______________________________________

Finally, the display ends with some summary information: the total number of different types of locks and the read percentages for each type.
When diagnosing a system problem, certain statistics are more important than others to look at. The following results are most likely to indicate a problem:

  · A large number of tries

  · A percent misses value that is too high. "Too high" varies somewhat, depending on the lock class. A kernel developer who is testing VM code under development might consider any percentage over 1 percent "too high" for certain kinds of locks. A support representative testing released product software should look for percent misses values that exceed the range of 5 to 7 percent.

  · A large waitsum value

If you see these results, there are no hard and fast rules to tell you whether the lock contention stems from poor code design in applications that are running on the system or from a problem in the operating system software. A locking problem is simply an indication that there is high contention for a certain type of resource.

If contention exists for a lock related to I/O and a particular application is spawning many processes that compete for the same files and directories, application or database storage design adjustments might be in order. Applications that use System V semaphores can sometimes encounter locking contention if they create a very large number of semaphores in a single semaphore set because the kernel uses locks on each set of semaphores. In this case, performance improvements might be realized by changing the application to use more semaphore sets, each with a smaller number of semaphores. Contention for locks on other kinds of resources, such as memory, is less likely to be remedied by changing application code; however, further investigation has to be done before you can rule out a problem in the application.

Keep in mind that even when lock contention can be reduced by changing applications, and this option might be the only short-term solution, it does not necessarily mean that the problem should not be reported to central engineering.
For example, if the applications were not causing locking problems when run on an SMP system but do on a NUMA system with comparable CPU and memory resources, support representatives should report the case to central engineering. In this case, kernel developers will want to investigate system algorithms that better handle the contention.

In the first example of lockinfo output, a few locks show high values in the tries, percent misses, and waitsum columns. The following example uses lockinfo to show the code paths where one of these, bsBuf.bufLock, is asserted with high frequency:

    # lockinfo -class=bsBuf.bufLock sleep 60
    hostname:     sysname.node.corp.com
    lockmode:     2 (SMP default)
    processors:   4
    start time:   Wed Jun 9 15:02:39 1999
    end time:     Wed Jun 9 15:03:39 1999
    command:      sleep 60

    Locks asserted by PC for lock class: bsBuf.bufLock

       count    miss  caller : line #          return : line #
    -------------------------------------------------------------------------------
      733418   41275  bs_pinpg_one_int: 4494   bs_pinpg_clone:      4125
      704579   49182  bs_pinpg_one_int: 4460   bs_pinpg_clone:      4125
      697209       0  find_page:        5986   bs_pinpg_one_int:    4368
      680910   30359  bs_unpinpg:       5461   log_donerec_nunpin:  3149
      544828   12869  bs_q_lazy:        2006   bs_q_list:           1142
      496294   15537  bs_unpinpg:       5670   log_donerec_nunpin:  3149
      357840       0  find_page:        5986   bs_refpg_int:        2800
      115502   21058  bs_derefpg:       3982   bmtr_update_rec_int: 2473
    . . .
           7       0  seq_ahead_cont:   6879   bs_pinpg_one_int:    4568
           2       0  bs_derefpg:       3982   frag_group_dealloc:  1637

                          tries    reads    trmax   misses  percent   sleeps  waitmax  waitsum
                                                             misses           seconds  seconds
    bsBuf.bufLock (S)   5322157        0    45745   366848      6.9        0  0.00009  1.77041
    . . .
    inifaddr_lock (C)         1        1        1        0      0.0        0  0.00000  0.00000

    total simple_locks = 26331214  percent unknown = 0.0
    total rws_locks = 1265  percent reads = 100.0
    total complex_locks = 2544907  percent reads = 33.2  percent unknown = 0.0

Based on the information returned about high frequency code path assertions for the bsBuf.bufLock lock, the kernel developer can then look for ways to reduce the amount of locking for this class. Strategies for reaching this goal might include one or more of the following:

  · Changing the code to use a read/write spin lock rather than a simple spin lock

  · Reducing lock hold times by moving some work outside the time that the lock is being held

  · Making more radical changes in kernel algorithms to reduce the frequency of lock assertions or the amount of time that locks are held

The support representative with training in operating system internals might be interested in tracing high frequency code path assertions to better determine if a system performance problem requires submission of a problem report on the operating system software or if changes need to be made in third-party or site-specific applications.

See EXAMPLES for examples of lockinfo commands that include the -percpu and -rad options.
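The percent misses guidance in this section (values above roughly 5 percent for released software) can be sketched as a filter over lockinfo output that was saved to a file. This is only an illustration: the function name flag_contended is hypothetical, it assumes each data row has the ten-field layout shown in the first example, and the output format is subject to change (see RESTRICTIONS). It processes saved text and does not invoke lockinfo.

```shell
# Hypothetical filter (not part of lockinfo): print lock classes whose
# "percent misses" value exceeds a threshold. Assumes rows of the form:
#   name (type) tries reads trmax misses percent sleeps waitmax waitsum
flag_contended() {
    threshold=$1
    awk -v t="$threshold" '
        # Keep only ten-field rows whose second field is a lock type tag.
        NF == 10 && $2 ~ /^\(.*\)$/ && $7 + 0 > t + 0 { print $1, $7 }
    '
}

# Sample rows copied from the first example in this section.
flag_contended 5 <<'EOF'
bsBuf.bufLock (S) 5718642 0 45745 194509 3.4 0 0.00007 0.63226
lock.l_lock (S) 5579643 0 40985 75656 1.4 0 0.00005 0.16531
vdT.vdIoLock (S) 1412449 0 149797 81078 5.7 0 0.00017 0.51438
BsBufHashLock (S) 1377903 0 42586 89312 6.5 0 0.00006 0.24563
EOF
```

With a threshold of 5, the sample rows yield vdT.vdIoLock (5.7) and BsBufHashLock (6.5), which are candidates for a follow-up run with the -class option.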

NOTES

Privileges for using the lockinfo command are based on permissions for the /dev/lockdev file. By default, permissions on this file mean that using the lockinfo command requires superuser privilege. System administrators can change the permissions on the /dev/lockdev special file to allow "others" to gather lockinfo statistics without granting a particular user superuser privileges.

The lockinfo command is indirectly invoked by the sys_check script.

The owner and group for the /dev/lockdev file should not be changed; in other words, the file must be owned by root and remain in group mem. Furthermore, user root and group mem must have read permission on the file.

For security reasons, do not run lockinfo in setuid root mode or add nonprivileged users to group mem. (The latter action gives nonprivileged users read permissions on physical memory.)

The version of lockinfo software included in Tru64 UNIX Version 5.1B and higher releases includes NUMA support (gathering of statistics on a per-RAD basis). Support representatives can copy this version of lockinfo to NUMA platforms running Tru64 UNIX Version 5.1 or Version 5.1A, but should be careful to protect the version of lockinfo already installed on the system by using one of the following strategies:

  · Copy the new version to and run it from a directory other than /usr/sbin. (For example, copy the new version of lockinfo to /usr/local/bin.)

  · Copy the new version of lockinfo to /usr/sbin but under a different utility name (such as lockinfo_2).

Protecting the version of lockinfo that originally shipped with the system is important because of the sys_check dependency.

RESTRICTIONS

Running the lockinfo command can impact system performance. However, the performance impact exists only for the length of time that statistics are being gathered (the length of time it takes for the specified command to execute).

Only one instance of lockinfo can be running on the system at a particular time. This is because a process cannot open the /dev/lockdev pseudo device when it is already opened by another process.

The lockinfo command is intended for use mainly by other facilities distributed with the operating system, kernel developers, and support representatives. It is not intended for use by customers who are not working with a support representative to diagnose a performance problem. Furthermore, lockinfo should not be invoked by customer scripts and applications for the following reasons:

  · The utility displays information that is difficult to interpret without training in operating system internals. Many lock class names are not intuitively obvious; that is, the name does not suggest an association with a particular kernel subsystem or the type of system resource being locked.

  · The utility interface and output are subject to change, without advance notice, from one release to another and by patches that might be applied to the utility.

EXIT STATUS

0 (Zero)
    Success.

>0
    An error occurred.

ERRORS

Lock class class_name not asserted
    Explanation: The name specified for the -class option is a supported lock class for the system. However, no trace information is available; during the time that locking statistics were being gathered, the specified lock was not asserted.

    User Action: Verify class names in output from lockinfo that is entered without the -class option. You also might need to better synchronize the statistics-gathering period with a time that the lock class is being asserted in order to generate trace information. This may take a few tries or an increase in the number of seconds that statistics are gathered.

lock statistics not supported in lockmode 0 or 1
    Explanation: The lockmode attribute must be set to 2, 3, or 4 for the lockinfo command to work. The default for SMP systems is 2, so the value must have been reset to 0 or 1 when the system was booted.

    User Action: The attribute value must be changed to 2, 3, or 4 and the system rebooted before lockinfo will work.

open: Device busy
    Explanation: Another process has the /dev/lockdev device open. The other process might be running lockinfo directly or be running the sys_check script (which indirectly invokes lockinfo).

    User Action: Wait until the first instance of lockinfo has stopped running and try the command again.

option since_boot only supported in lockmode 4 current lockmode: lockmode_value
    Explanation: The system must have been booted with lockmode set to 4 in order to start gathering locking statistics at system boot time. Because of performance implications for production systems, this is usually done only on a development system.

    User Action: Reset lockmode to 4 and reboot the system or, if that is not possible, use lockinfo to gather statistics for short time periods when performance problems are noticed.

top count too big count. Max value 1024
    Explanation: A number larger than 1024 was specified for the -top argument.

    User Action: Reduce the value. You typically gather trace information for only the top few lock classes that display the largest values for the specified sort criteria.

rad number rad_id is invalid
    Explanation: The specified RAD number is less than 0 or greater than the number of RADs on the system.

    User Action: Re-enter the lockinfo command with a corrected RAD identifier.

Rad rad_id has no active processors
    Explanation: The specified RAD exists on the system but the RAD has no active processors. Therefore, there are no locking statistics to display for that RAD. Note that the lockinfo command does not find processors if they are installed but not running. The CPUs in the specified RAD might have been in the process of being taken on or off line when lockinfo was run.

    User Action: Make sure that you specified the number for a RAD in which you expect CPUs to be active. If not, re-enter the command with a corrected RAD number. To check on processor status, use the psrinfo command.

top number of count larger then number of lock types number
    Explanation: The -top argument exceeded the supported number of lock class names.

    User Action: Reduce the value. You usually gather trace information for only the top few lock classes that display the largest values for the specified sort criteria.

Unknown class name: class_name
    Explanation: An invalid lock class name was specified for the -class option.

    User Action: Verify class names in output from lockinfo when entered without the -class option. Check the spelling and letter case for supported names and re-enter the command with a corrected -class argument.
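Two of the errors above depend on the lockmode attribute. The rules can be summarized in a small lookup, sketched here under the assumption that the current lockmode value has already been obtained by some other means. The function name lockmode_support and the labels it prints are hypothetical and are not produced by lockinfo.

```shell
# Hypothetical helper (not part of lockinfo): summarize what lockinfo
# supports for a given lockmode value, following the error descriptions
# in this section.
lockmode_support() {
    case "$1" in
        0|1) echo "unsupported" ;;           # lock statistics not supported
        2|3) echo "interval" ;;              # statistics, but no -since_boot
        4)   echo "interval+since_boot" ;;   # -since_boot also available
        *)   echo "unknown" ;;               # not a value described here
    esac
}

lockmode_support 2    # prints interval
```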

EXAMPLES

1. The following lockinfo command gathers locking statistics for each processor over a period of 60 seconds:

    # lockinfo -percpu sleep 60
    hostname:     sysname.node.corp.com
    lockmode:     4 (SMP DEBUG with kernel mode preemption enabled)
    processors:   4
    start time:   Wed Jun 9 14:45:08 1999
    end time:     Wed Jun 9 14:46:08 1999
    command:      sleep 60

                          tries    reads    trmax   misses  percent   sleeps  waitmax  waitsum
                                                             misses           seconds  seconds
    bsBuf.bufLock (S)
      0                 1400786        0    45745    47030      3.4        0  0.00007  0.15526
      1                 1415828        0    45367    47538      3.4        0  0.00006  0.15732
      2                 1399462        0    33076    48507      3.5        0  0.00005  0.15907
      3                 1398336        0    31753    48867      3.5        0  0.00005  0.15934
    -----------------------------------------------------------------------
      ALL               5614412        0    45745   191942      3.4        0  0.00007  0.63099

    lock.l_lock (S)
      0                 1360769        0    40985    18460      1.4        0  0.00005  0.04041
      1                 1375384        0    20720    18581      1.4        0  0.00005  0.04124
      2                 1375122        0    20657    18831      1.4        0  0.00009  0.04198
    -----------------------------------------------------------------------
      ALL               5483049        0    40985    74688      1.4        0  0.00009  0.16526
    . . .
    inifaddr_lock (C)
      0                       0        0        1        0      0.0        0  0.00000  0.00000
      1                       1        1        1        0      0.0        0  0.00000  0.00000
      2                       0        0        1        0      0.0        0  0.00000  0.00000
      3                       0        0        1        0      0.0        0  0.00000  0.00000
    -----------------------------------------------------------------------
      ALL                     1        1        1        0      0.0        0  0.00000  0.00000

    total simple_locks = 28100338  percent unknown = 0.0
    total rws_locks = 1466  percent reads = 100.0
    total complex_locks = 2716146  percent reads = 33.2  percent unknown = 0.0
    #

2. The following lockinfo command gathers locking statistics for a particular RAD on a NUMA platform. In this example, only cumulative statistics across all CPUs in the RAD are displayed.

    # /usr/sbin/lockinfo -rad=4 sleep 10
    hostname:     sysname.node.corp.com
    locktype:     MCS Locks
    lockmode:     2 (SMP default)
    Tracing:      RAD 4 only
    processors:   4
    start time:   Thu May 17 09:15:09 2001
    end time:     Thu May 17 09:15:19 2001
    command:      sleep 10

                                     tries   reads   trmax  misses  misses  sleeps waitmax waitsum
                                                                   percent         seconds seconds
    processor.callout_lock    (M)     1298       0       0       0     0.0       0 0.00000 0.00001
    thread.lock               (M)      336       0       0       0     0.0       0 0.00000 0.00000
    processor.lock            (M)      226       0       0       1     0.4       0 0.00000 0.00016
    pag.idle_lock             (M)      124       0       0       0     0.0       0 0.00000 0.00002
    wait_queue.lock           (M)      114       0       0       0     0.0       0 0.00000 0.00000
    unknown_simple_lock       (M)       76       0       0       0     0.0       0 0.00000 0.00000
    misc_tcp_lock             (M)       70       0       0       0     0.0       0 0.00000 0.00000
    pag.lock                  (M)       10       0       0       0     0.0       0 0.00000 0.00000
    cam_softc                 (M)       10       0       0       0     0.0       0 0.00000 0.00000
    vm_control.vm_free_lock   (M)       10       0       0       0     0.0       0 0.00000 0.00000
    cam_x_pqhead1             (M)        8       0       0       0     0.0       0 0.00000 0.00000
    pmap.lock                 (M)        6       0       0       0     0.0       0 0.00000 0.00000
    proc.p_lock               (M)        2       0       0       0     0.0       0 0.00000 0.00000
    nxm_vp_lock               (M)        2       0       0       0     0.0       0 0.00000 0.00000
    task.ipc_translation_lock (M)        1       0       0       0     0.0       0 0.00000 0.00000
    nxm_thread_lock           (M)        1       0       0       0     0.0       0 0.00000 0.00000
    kern_port.port_data_lock  (M)        1       0       0       0     0.0       0 0.00000 0.00000
    port_hash_bucket.lock     (M)        1       0       0       0     0.0       0 0.00000 0.00000

    total mcs_locks = 2296  percent unknown = 0.0
    total rws_locks = 0  percent reads = 0.0
    total complex_locks = 0  percent reads = 0.0  percent unknown = 0.0

3. The following lockinfo command gathers locking statistics for a particular RAD on a NUMA platform. In this example, per-CPU statistics are included as well as the total for all CPUs in the RAD.

    # /usr/sbin/lockinfo -rad=4 -percpu sleep 10
    hostname:     sysname.node.corp.com
    locktype:     MCS Locks
    lockmode:     2 (SMP default)
    Tracing:      RAD 4 only
    processors:   4
    start time:   Thu May 17 09:16:33 2001
    end time:     Thu May 17 09:16:43 2001
    command:      sleep 10

                                     tries   reads   trmax  misses  misses  sleeps waitmax waitsum
                                                                   percent         seconds seconds
    processor.callout_lock (M)
      16                               110       0       0       0     0.0       0 0.00000 0.00000
      17                                92       0       0       0     0.0       0 0.00000 0.00000
      18                               100       0       0       0     0.0       0 0.00000 0.00000
      19                               992       0       0       0     0.0       0 0.00000 0.00000
    -----------------------------------------------------------------------
      ALL                             1294       0       0       0     0.0       0 0.00000 0.00003

    thread.lock (M)
      16                                54       0       0       0     0.0       0 0.00000 0.00000
      17                                30       0       0       0     0.0       0 0.00000 0.00000
      18                               162       0       0       0     0.0       0 0.00000 0.00000
      19                                90       0       0       0     0.0       0 0.00000 0.00000
    -----------------------------------------------------------------------
      ALL                              336       0       0       0     0.0       0 0.00000 0.00000

    processor.lock (M)
      16                                37       0       0       0     0.0       0 0.00000 0.00000
      17                                18       0       0       0     0.0       0 0.00000 0.00000
      18                               108       0       0       0     0.0       0 0.00000 0.00000
      19                                63       0       0       0     0.0       0 0.00000 0.00000
    -----------------------------------------------------------------------
      ALL                              226       0       0       0     0.0       0 0.00001 0.00022

    pag.idle_lock (M)
      16                                21       0       0       0     0.0       0 0.00000 0.00000
      17                                12       0       0       0     0.0       0 0.00000 0.00000
      18                                54       0       0       0     0.0       0 0.00000 0.00000
      19                                37       0       0       1     2.7       0 0.00001 0.00001
    -----------------------------------------------------------------------
      ALL                              124       0       0       1     0.8       0 0.00001 0.00003

    wait_queue.lock (M)
      16                                21       0       0       0     0.0       0 0.00000 0.00000
      17                                 2       0       0       0     0.0       0 0.00000 0.00000
      18                                54       0       0       0     0.0       0 0.00000 0.00000
      19                                37       0       0       0     0.0       0 0.00000 0.00000
    -----------------------------------------------------------------------
      ALL                              114       0       0       0     0.0       0 0.00000 0.00000

    unknown_simple_lock (M)
      16                                56       0       0       0     0.0       0 0.00000 0.00000
      17                                 0       0       0       0     0.0       0 0.00000 0.00000
      18                                20       0       0       0     0.0       0 0.00000 0.00000
      19                                 0       0       0       0     0.0       0 0.00000 0.00000
    -----------------------------------------------------------------------
      ALL                               76       0       0       0     0.0       0 0.00000 0.00000
    . . .

    total mcs_locks = 2292  percent unknown = 0.0
    total rws_locks = 0  percent reads = 0.0
    total complex_locks = 0  percent reads = 0.0  percent unknown = 0.0
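In the -percpu displays, the ALL row for a lock class appears to be the sum of the per-CPU tries and misses, with the miss percentage recomputed from those totals. The following sketch checks this against the bsBuf.bufLock rows from example 1. The function name sum_percpu is hypothetical, and the nine-field row layout is assumed from the example output rather than from any documented format.

```shell
# Hypothetical check (not part of lockinfo): sum the per-CPU rows for one
# lock class and recompute the miss percentage. Assumes rows of the form:
#   cpu tries reads trmax misses percent sleeps waitmax waitsum
sum_percpu() {
    awk '
        { tries += $2; misses += $5 }
        END {
            if (tries > 0)
                printf "%d %d %.1f\n", tries, misses, 100 * misses / tries
        }
    '
}

# Per-CPU rows for bsBuf.bufLock, copied from example 1.
sum_percpu <<'EOF'
0 1400786 0 45745 47030 3.4 0 0.00007 0.15526
1 1415828 0 45367 47538 3.4 0 0.00006 0.15732
2 1399462 0 33076 48507 3.5 0 0.00005 0.15907
3 1398336 0 31753 48867 3.5 0 0.00005 0.15934
EOF
```

For these rows the sketch prints 5614412 191942 3.4, which matches the tries, misses, and percent misses values in the ALL row of example 1.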

FILES

/dev/lockdev
    Pseudo driver that is opened by the lockinfo utility for statistics gathering.

SEE ALSO

Commands: sched_stat(8), sys_check(8)

Others: sys_attrs_generic(5)
