1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
|
'\" te
.\" Copyright (c) 2005, Sun Microsystems, Inc.
.TH CPUSTAT 8 "April 9, 2016"
.SH NAME
cpustat \- monitor system behavior using CPU performance counters
.SH SYNOPSIS
.LP
.nf
\fBcpustat\fR \fB-c\fR \fIeventspec\fR [\fB-c\fR \fIeventspec\fR]... [\fB-p\fR \fIperiod\fR] [\fB-T\fR u | d ]
[\fB-sntD\fR] [\fIinterval\fR [\fIcount\fR]]
.fi
.LP
.nf
\fBcpustat\fR \fB-h\fR
.fi
.SH DESCRIPTION
.LP
The \fBcpustat\fR utility allows \fBCPU\fR performance counters to be used to
monitor the overall behavior of the \fBCPU\fRs in the system.
.sp
.LP
If \fIinterval\fR is specified, \fBcpustat\fR samples activity every
\fIinterval\fR seconds, repeating forever. If a \fIcount\fR is specified, the
statistics are repeated \fIcount\fR times. If neither are specified, an
interval of five seconds is used, and there is no limit to the number of
samples that are taken.
.SH OPTIONS
.LP
The following options are supported:
.sp
.ne 2
.na
\fB\fB-c\fR \fIeventspec\fR\fR
.ad
.sp .6
.RS 4n
Specifies a set of events for the \fBCPU\fR performance counters to monitor.
The syntax of these event specifications is:
.sp
.in +2
.nf
[picn=]\fIeventn\fR[,attr[\fIn\fR][=\fIval\fR]][,[picn=]\fIeventn\fR
[,attr[n][=\fIval\fR]],...,]
.fi
.in -2
.sp
You can use the \fB-h\fR option to obtain a list of available events and
attributes. This causes generation of the usage message. You can omit an
explicit counter assignment, in which case \fBcpustat\fR attempts to choose a
capable counter automatically.
.sp
Attribute values can be expressed in hexadecimal, octal, or decimal notation,
in a format suitable for \fBstrtoll\fR(3C). An attribute present in the event
specification without an explicit value receives a default value of \fB1\fR. An
attribute without a corresponding counter number is applied to all counters in
the specification.
.sp
The semantics of these event specifications can be determined by reading the
\fBCPU\fR manufacturer's documentation for the events.
.sp
Multiple \fB-c\fR options can be specified, in which case the command cycles
between the different event settings on each sample.
.RE
.sp
.ne 2
.na
\fB\fB-D\fR\fR
.ad
.sp .6
.RS 4n
Enables debug mode.
.RE
.sp
.ne 2
.na
\fB\fB-h\fR\fR
.ad
.sp .6
.RS 4n
Prints an extensive help message on how to use the utility and how to program
the processor-dependent counters.
.RE
.sp
.ne 2
.na
\fB\fB-n\fR\fR
.ad
.sp .6
.RS 4n
Omits all header output (useful if \fBcpustat\fR is the beginning of a
pipeline).
.RE
.sp
.ne 2
.na
\fB\fB-p\fR \fIperiod\fR\fR
.ad
.sp .6
.RS 4n
Causes \fBcpustat\fR to cycle through the list of \fIeventspec\fRs every
\fIperiod\fR seconds. The tool sleeps after each cycle until \fIperiod\fR
seconds have elapsed since the first \fIeventspec\fR was measured.
.sp
When this option is present, the optional \fIcount\fR parameter specifies the
number of total cycles to make (instead of the number of total samples to
take). If \fIperiod\fR is less than the number of \fIeventspec\fRs times
\fIinterval\fR, the tool acts as it period is \fB0\fR.
.RE
.sp
.ne 2
.na
\fB\fB-s\fR\fR
.ad
.sp .6
.RS 4n
Creates an idle soaker thread to spin while system-only \fIeventspec\fRs are
bound. One idle soaker thread is bound to each CPU in the current processor
set. System-only \fIeventspec\fRs contain both the \fBnouser\fR and the
\fBsys\fR tokens and measure events that occur while the CPU is operating in
privileged mode. This option prevents the kernel's idle loop from running and
triggering system-mode events.
.RE
.sp
.ne 2
.na
\fB\fB-T\fR \fBu\fR | \fBd\fR\fR
.ad
.sp .6
.RS 4n
Display a time stamp.
.sp
Specify \fBu\fR for a printed representation of the internal representation of
time. See \fBtime\fR(2). Specify \fBd\fR for standard date format. See
\fBdate\fR(1).
.RE
.sp
.ne 2
.na
\fB\fB-t\fR\fR
.ad
.sp .6
.RS 4n
Prints an additional column of processor cycle counts, if available on the
current architecture.
.RE
.SH USAGE
.LP
A closely related utility, \fBcputrack\fR(1), can be used to monitor the
behavior of individual applications with little or no interference from other
activities on the system.
.sp
.LP
The \fBcpustat\fR utility must be run by the super-user, as there is an
intrinsic conflict between the use of the \fBCPU\fR performance counters
system-wide by \fBcpustat\fR and the use of the \fBCPU\fR performance counters
to monitor an individual process (for example, by \fBcputrack\fR.)
.sp
.LP
Once any instance of this utility has started, no further per-process or
per-\fBLWP\fR use of the counters is allowed until the last instance of the
utility terminates.
.sp
.LP
The times printed by the command correspond to the wallclock time when the
hardware counters were actually sampled, instead of when the program told the
kernel to sample them. The time is derived from the same timebase as
\fBgethrtime\fR(3C).
.sp
.LP
The processor cycle counts enabled by the \fB-t\fR option always apply to both
user and system modes, regardless of the settings applied to the performance
counter registers.
.sp
.LP
On some hardware platforms running in system mode using the "sys" token, the
counters are implemented using 32-bit registers. While the kernel attempts to
catch all overflows to synthesize 64-bit counters, because of hardware
implementation restrictions, overflows can be lost unless the sampling interval
is kept short enough. The events most prone to wrap are those that count
processor clock cycles. If such an event is of interest, sampling should occur
frequently so that less than 4 billion clock cycles can occur between samples.
.sp
.LP
The output of cpustat is designed to be readily parsable by \fBnawk\fR(1) and
\fBperl\fR(1), thereby allowing performance tools to be composed by embedding
\fBcpustat\fR in scripts. Alternatively, tools can be constructed directly
using the same \fBAPI\fRs that \fBcpustat\fR is built upon using the facilities
of \fBlibcpc\fR(3LIB). See \fBcpc\fR(3CPC).
.sp
.LP
The \fBcpustat\fR utility only monitors the \fBCPU\fRs that are accessible to
it in the current processor set. Thus, several instances of the utility can be
running on the \fBCPU\fRs in different processor sets. See \fBpsrset\fR(8) for
more information about processor sets.
.sp
.LP
Because \fBcpustat\fR uses \fBLWP\fRs bound to \fBCPU\fRs, the utility might
have to be terminated before the configuration of the relevant processor can be
changed.
.SH EXAMPLES
.SS "SPARC"
.LP
\fBExample 1 \fRMeasuring External Cache References and Misses
.sp
.LP
The following example measures misses and references in the external cache.
These occur while the processor is operating in user mode on an UltraSPARC
machine.
.sp
.in +2
.nf
example% cpustat -c EC_ref,EC_misses 1 3
time cpu event pic0 pic1
1.008 0 tick 69284 1647
1.008 1 tick 43284 1175
2.008 0 tick 179576 1834
2.008 1 tick 202022 12046
3.008 0 tick 93262 384
3.008 1 tick 63649 1118
3.008 2 total 651077 18204
.fi
.in -2
.sp
.SS "x86"
.LP
\fBExample 2 \fRMeasuring Branch Prediction Success on Pentium 4
.sp
.LP
The following example measures branch mispredictions and total branch
instructions in user and system mode on a Pentium 4 machine.
.sp
.in +2
.nf
example% cpustat -c \e
pic12=branch_retired,emask12=0x4,pic14=branch_retired,\e
emask14=0xf,sys 1 3
time cpu event pic12 pic14
1.010 1 tick 458 684
1.010 0 tick 305 511
2.010 0 tick 181 269
2.010 1 tick 469 684
3.010 0 tick 182 269
3.010 1 tick 468 684
3.010 2 total 2063 3101
.fi
.in -2
.sp
.LP
\fBExample 3 \fRCounting Memory Accesses on Opteron
.sp
.LP
The following example determines the number of memory accesses made through
each memory controller on an Opteron, broken down by internal memory latency:
.sp
.in +2
.nf
cpustat -c \e
pic0=NB_mem_ctrlr_page_access,umask0=0x01, \e
pic1=NB_mem_ctrlr_page_access,umask1=0x02, \e
pic2=NB_mem_ctrlr_page_access,umask2=0x04,sys \e
1
time cpu event pic0 pic1 pic2
1.003 0 tick 41976 53519 7720
1.003 1 tick 5589 19402 731
2.003 1 tick 6011 17005 658
2.003 0 tick 43944 45473 7338
3.003 1 tick 7105 20177 762
3.003 0 tick 47045 48025 7119
4.003 0 tick 43224 46296 6694
4.003 1 tick 5366 19114 652
.fi
.in -2
.sp
.SH WARNINGS
.LP
By running the \fBcpustat\fR command, the super-user forcibly invalidates all
existing performance counter context. This can in turn cause all invocations of
the \fBcputrack\fR command, and other users of performance counter context, to
exit prematurely with unspecified errors.
.sp
.LP
If \fBcpustat\fR is invoked on a system that has \fBCPU\fR performance counters
which are not supported by Solaris, the following message appears:
.sp
.in +2
.nf
cpustat: cannot access performance counters - Operation not applicable
.fi
.in -2
.sp
.sp
.LP
This error message implies that \fBcpc_open()\fR has failed and is documented
in \fBcpc_open\fR(3CPC). Review this documentation for more information about
the problem and possible solutions.
.sp
.LP
If a short interval is requested, \fBcpustat\fR might not be able to keep up
with the desired sample rate. In this case, some samples might be dropped.
.SH ATTRIBUTES
.LP
See \fBattributes\fR(7) for descriptions of the following attributes:
.sp
.sp
.TS
box;
c | c
l | l .
ATTRIBUTE TYPE ATTRIBUTE VALUE
_
Interface Stability Evolving
.TE
.SH SEE ALSO
.LP
.BR cputrack (1),
.BR nawk (1),
.BR perl (1),
.BR gethrtime (3C),
.BR strtoll (3C),
.BR cpc (3CPC),
.BR cpc_bind_cpu (3CPC),
.BR cpc_open (3CPC),
.BR libcpc (3LIB),
.BR attributes (7),
.BR iostat (8),
.BR prstat (8),
.BR psrset (8),
.BR vmstat (8)
.SH NOTES
.LP
When \fBcpustat\fR is run on a Pentium 4 with HyperThreading enabled, a CPC set
is bound to only one logical CPU of each physical CPU. See
\fBcpc_bind_cpu\fR(3CPC).
|