Diffstat (limited to 'books/PCP_UAG/pcp-users-and-administrators-guide.xml')
-rw-r--r-- | books/PCP_UAG/pcp-users-and-administrators-guide.xml | 8020 |
1 files changed, 8020 insertions, 0 deletions
diff --git a/books/PCP_UAG/pcp-users-and-administrators-guide.xml b/books/PCP_UAG/pcp-users-and-administrators-guide.xml new file mode 100644 index 0000000..f869222 --- /dev/null +++ b/books/PCP_UAG/pcp-users-and-administrators-guide.xml @@ -0,0 +1,8020 @@ +<?xml version="1.0"?> +<!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"> +<book status="final" security="public"><title>Performance Co-Pilot™ User's and Administrator's Guide</title> +<bookinfo><edition>3</edition> + +<othercredit> +<contrib>Maintained by</contrib> +<affiliation> +<orgname>The Performance Co-Pilot Development Team</orgname> +<address> +<email>pcp@mail.performancecopilot.org</email> +<otheraddr> +<ulink url="http://www.performancecopilot.org"/> +<inlinemediaobject><imageobject><imagedata fileref="figures/pcp.svg"/></imageobject></inlinemediaobject> +</otheraddr> +</address> +</affiliation> +</othercredit> + +<copyright> +<year>2000</year> +<year>2013</year> +<holder>Silicon Graphics, Inc.</holder> +</copyright> + +<copyright> +<year>2013</year> +<year>2014</year> +<holder>Red Hat, Inc.</holder> +</copyright> + +<legalnotice> +<title>LICENSE</title> +<para>Permission is granted to copy, distribute, and/or modify this document under +the terms of the Creative Commons Attribution-Share Alike, Version 3.0 or any +later version published by the Creative Commons Corp. +A copy of the license is available at +<ulink url="http://creativecommons.org/licenses/by-sa/3.0/us/"/></para> +</legalnotice> + +<legalnotice> +<title>TRADEMARKS AND ATTRIBUTIONS</title> +<para>Silicon Graphics, SGI and the SGI logo are registered trademarks +and Performance Co-Pilot is a trademark of Silicon Graphics, Inc.</para> +<para>Red Hat and the Shadowman logo are trademarks of Red Hat, Inc., +registered in the United States and other countries.</para> +<para>Cisco is a trademark of Cisco Systems, Inc. 
+Linux is a registered trademark of Linus Torvalds, used with permission. +UNIX is a registered trademark of The Open Group.</para> +</legalnotice> + +<revhistory> +<revision><revnumber>002</revnumber><date>August 2013</date><revremark>Revised to support the Performance Co-Pilot 3.8 release.</revremark></revision> +<revision><revnumber>001</revnumber><date>December 2002</date><revremark>Original publication. Supports the Performance Co-Pilot 2.3 release.</revremark></revision> +</revhistory> + +</bookinfo> +<toc/> + + +<preface id="id5178699"> + +<title>About This Guide</title> +<para>This guide describes the Performance Co-Pilot (PCP) performance analysis toolkit. +PCP provides a systems-level suite of tools that cooperate to deliver +distributed performance monitoring and performance management services spanning +hardware platforms, operating systems, service layers, database internals, +user applications and distributed architectures.</para> +<para>PCP is a cross-platform, open source software package - customizations, extensions, +source code inspection, and tinkering in general is actively encouraged.</para> +<para>“About This Guide” includes short descriptions of the chapters +in this book, directs you to additional sources of information, and explains +typographical conventions.</para> +<section id="id5178738"> + +<title>What This Guide Contains</title> +<para>This guide contains the following chapters:</para> +<itemizedlist> +<listitem><para><xref linkend="LE91944-PARENT"/>, provides an introduction, +a brief overview of the software components, and conceptual foundations of +the PCP software.</para> +</listitem> +<listitem><para><xref linkend="LE17127-PARENT"/>, describes the basic installation +and configuration steps necessary to get PCP running on your systems.</para> +</listitem> +<listitem><para><xref linkend="LE94335-PARENT"/>, describes the user interface +components that are common to most of the text-based utilities that make up +the monitor 
portion of PCP.</para> +</listitem> +<listitem><para><xref linkend="LE38515-PARENT"/>, describes the performance +monitoring tools available in Performance Co-Pilot (PCP). </para> +</listitem> +<listitem><para><xref linkend="LE21414-PARENT"/>, describes the Performance +Metrics Inference Engine (<command>pmie</command>) tool that provides automated +monitoring of, and reasoning about, system performance within the PCP framework. +</para> +</listitem> +<listitem><para><xref linkend="LE93354-PARENT"/>, covers the PCP services and +utilities that support archive logging for capturing accurate historical performance +records.</para> +</listitem> +<listitem><para><xref linkend="LE83321-PARENT"/>, presents the various options +for deploying PCP functionality across cooperating systems.</para> +</listitem> +<listitem><para><xref linkend="LE62564-PARENT"/>, describes the procedures +necessary to ensure that the PCP configuration is customized in ways that +maximize the coverage and quality of performance monitoring and management +services.</para> +</listitem> +<listitem><para><xref linkend="LE65325-PARENT"/>, provides a comprehensive +list of the acronyms used in this guide and in the man pages for Performance +Co-Pilot.</para> +</listitem></itemizedlist> +</section> +<section id="id5178921"> + +<title>Audience for This Guide</title> +<para>This guide is written for the system administrator or performance analyst +who is directly using and administering PCP applications.</para> +</section> +<section id="id5178935"> + +<title>Related Resources</title> +<para>The <citetitle>Performance Co-Pilot Programmer's Guide</citetitle>, +a companion document to the +<citetitle>Performance Co-Pilot User's and Administrator's Guide</citetitle>, +is intended for developers who want to use the PCP framework and services for +exporting additional collections of performance metrics, or for delivering +new or customized applications to enhance performance management. 
+</para> +<para>The <citetitle>Performance Co-Pilot Tutorials and Case Studies</citetitle> +provides a series of real-world examples of using various PCP tools, and +lessons learned from deploying the toolkit in production environments. +It serves to provide reinforcement of the general concepts discussed in the +other two books with additional case studies, and in some cases very detailed +discussion of specifics of individual tools. +</para> +<para>Additional resources include man pages and the project web site.</para> +</section> +<section id="id5178967"> + +<title>Man Pages</title> +<para>The operating system man pages provide concise reference information on the use of commands, subroutines, and system resources. There is usually a man page for each PCP command or subroutine. To see a list of all the PCP man pages, start from the following command:</para> +<literallayout class="monospaced"><userinput>man PCPIntro</userinput></literallayout> +<para>Each man page usually has a "SEE ALSO" section, linking to other, related entries.</para> +<para>To see a particular man page, supply its name to the <literal>man</literal> command, for example:</para> +<literallayout class="monospaced"><userinput>man pcp</userinput></literallayout> +<para>The man pages are arranged in different sections - user commands, programming interfaces, and so on. +For a complete list of manual sections on a platform enter the command:</para> +<literallayout class="monospaced"><userinput>man man</userinput></literallayout> +<para>When referring to man pages, this guide follows a standard convention: the section number in parentheses follows the item. 
+For example, <command>pminfo(1)</command> refers to the man page in section 1 for the <command>pminfo</command> command.</para> +</section> +<section id="id5178968"> + +<title>Web Site</title> +<para>The following web site is accessible to everyone:</para> +<variablelist condition="sgi_termlength:wide"> +<varlistentry> +<term><emphasis role="bold">URL</emphasis></term><listitem><para><emphasis role="bold">Description</emphasis></para></listitem></varlistentry> +<varlistentry> +<term><ulink url="http://www.performancecopilot.org">http://www.performancecopilot.org</ulink></term> +<listitem><para>PCP is open source software released under +the GNU General Public License (GPL) and GNU Lesser General Public License (LGPL)</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5179060"> + +<title>Conventions</title> +<para>The following conventions are used throughout this document:<variablelist> +<varlistentry> +<term><emphasis role="bold">Convention</emphasis></term><listitem><para><emphasis role="bold">Meaning</emphasis></para></listitem></varlistentry> +<varlistentry> +<term><literal>${PCP_VARIABLE}</literal></term> +<listitem><para>A brace-enclosed all-capital-letters syntax indicates a variable +that has been sourced from the global <filename>${PCP_DIR}/etc/pcp.conf</filename> file. +These special variables indicate parameters that affect all PCP commands, +and are likely to be different between platforms.</para> +</listitem> +</varlistentry> + +<varlistentry> +<term><literal>command</literal></term> +<listitem><para>This fixed-space font denotes literal items such as commands, +files, routines, path names, signals, messages, and programming language +structures. 
</para> +</listitem> +</varlistentry> + +<varlistentry> +<term><replaceable>variable</replaceable></term> +<listitem><para>Italic typeface denotes variable entries and words or concepts being +defined.</para> +</listitem></varlistentry> + +<varlistentry> +<term><userinput>user input</userinput></term> +<listitem><para>This bold, fixed-space font denotes literal items that the user enters in interactive sessions. (Output is shown in nonbold, fixed-space font.)</para> +</listitem></varlistentry> + +<varlistentry> +<term>[ ]</term> +<listitem><para>Brackets enclose optional portions of a command or directive line.</para> +</listitem></varlistentry> + +<varlistentry> +<term>...</term> +<listitem><para>Ellipses indicate that a preceding element can be repeated.</para> +</listitem></varlistentry> +<varlistentry> +<term>ALL CAPS</term> +<listitem><para>All capital letters denote environment variables, operator +names, directives, defined constants, and macros in C programs.</para> +</listitem></varlistentry> +<varlistentry> +<term>()</term> +<listitem><para>Parentheses that follow function names surround function arguments +or are empty if the function has no arguments; parentheses that follow commands +surround man page section numbers.</para> +</listitem></varlistentry> +</variablelist></para> +</section> +<section id="z825546061melby"> + +<title>Reader Comments</title> +<para>If you have comments about the technical accuracy, content, or organization of this document, contact the PCP maintainers using either the email address or the web site listed earlier.</para> +<para>We value your comments and will respond to them promptly.</para> +</section> +</preface> + + +<chapter id="LE91944-PARENT"> + +<title>Introduction to PCP</title> +<para><indexterm id="ITch01-0"><primary>overview</primary></indexterm><indexterm id="IG313718880"> +<primary>PCP</primary><secondary>features</secondary></indexterm><indexterm id="IG313718881"> +<primary>Performance 
Co-Pilot</primary><see>PCP</see></indexterm>This +chapter provides an introduction to Performance Co-Pilot (PCP), an overview +of its individual components, and conceptual information to help you use +this software.</para> +<para>The following sections are included:</para> +<itemizedlist> +<listitem><para><xref linkend="LE92676-PARENT"/>, covers the intended purposes of PCP.</para></listitem> +<listitem><para><xref linkend="LE13618-PARENT"/>, describes PCP tools and agents.</para></listitem> +<listitem><para><xref linkend="LE79836-PARENT"/>, discusses the design theories behind PCP.</para></listitem> +</itemizedlist> +<section id="LE92676-PARENT"> + +<title>Objectives</title> +<para><indexterm id="IG313718882"><primary>objectives</primary></indexterm></para> +<para>Performance Co-Pilot (PCP) provides a range of services that may +be used to monitor and manage system performance. These services are distributed +and scalable to accommodate the most complex system configurations and +performance problems.</para> +<section id="LE67354-PARENT"> + +<title>PCP Target Usage</title> +<para><indexterm id="IG313718883"><primary>target usage</primary></indexterm>PCP is targeted +at the performance analyst, benchmarker, capacity planner, developer, +database administrator, or system administrator with an interest in overall +system performance and a need to quickly isolate and understand performance +behavior, resource utilization, activity levels, and bottlenecks in complex +systems. Platforms that can benefit from this level of performance analysis +include large servers, server clusters, or multiserver sites delivering +Database Management Systems (DBMS), compute, Web, file, or video services. 
+</para> +</section> +<section id="LE79006-PARENT"> + +<title>Empowering the PCP User</title> +<para><indexterm id="IG313718884"><primary>audience</primary></indexterm>To deal efficiently +with the dynamic behavior of complex systems, performance analysts need +to filter out noise from the overwhelming stream of performance data, +and focus on exceptional scenarios. Visualization of current and historical +performance data, and automated reasoning about performance data, effectively +provide this filtering.</para> +<para>From the PCP end user's perspective, PCP presents an integrated +suite of tools, user interfaces, and services that support real-time and +retrospective performance analysis, with a bias towards eliminating mundane +information and focusing attention on the exceptional and extraordinary +performance behaviors. When this is done, the user can concentrate on +in-depth analysis or target management procedures for those critical system +performance problems.</para> +</section> +<section id="LE35382-PARENT"> + +<title>Unification of Performance Metric Domains</title> +<para><indexterm id="IG313718885"><primary>domains</primary></indexterm><indexterm id="IG313718886"><primary> +metric domains</primary></indexterm>At the lowest level, performance metrics +are collected and managed in autonomous performance domains such as the +operating system kernel, a DBMS, a layered service, or an end-user application. +These domains feature a multitude of access control policies, access methods, +data semantics, and multiversion support. 
All this detail is irrelevant +to the developer or user of a performance monitoring tool, and is hidden +by the PCP infrastructure.</para> +<para><indexterm id="IG313718887"><primary>PMDA</primary><secondary>unification</secondary> +</indexterm><indexterm id="IG313718888"><primary>Performance Metrics Domain Agent</primary> +<see>PMDA</see></indexterm>Performance Metrics Domain Agents (PMDAs) within +PCP encapsulate the knowledge about, and export performance information +from, autonomous performance domains.</para> +</section> +<section id="LE83994-PARENT"> + +<title>Uniform Naming and Access to Performance Metrics +</title> +<para><indexterm id="IG313718889"><primary>uniform naming</primary></indexterm><indexterm id="IG3137188810"> +<primary>naming scheme</primary></indexterm><indexterm id="IG3137188811"><primary>PMNS</primary> +<secondary>defined names</secondary></indexterm><indexterm id="IG3137188812"><primary>Performance +Metrics Name Space</primary><see>PMNS</see></indexterm>Usability and extensibility +of performance management tools mandate a single scheme for naming performance +metrics. The set of defined names constitutes a Performance Metrics Name +Space (PMNS). Within PCP, the PMNS is adaptive so it can be extended, +reshaped, and pruned to meet the needs of particular applications and +users.</para> +<para>PCP provides a single interface to name and retrieve values for +all performance metrics, independently of their source or location.</para> +</section> +<section id="LE85063-PARENT"> + +<title>PCP Distributed Operation</title> +<para><indexterm id="ITch01-2"><primary>PCP</primary><secondary>distributed +operation</secondary></indexterm>From a purely pragmatic viewpoint, a +single workstation must be able to monitor the concurrent performance +of multiple remote hosts. 
At the same time, a single host may be subject +to monitoring from multiple remote workstations.</para> +<para><indexterm id="IG3137188813"><primary>client-server architecture</primary></indexterm>These +requirements suggest a classic client-server architecture, which is exactly +what PCP uses to provide concurrent and multiconnected access to performance +metrics, independent of their host location.</para> +</section> +<section id="LE87326-PARENT"> + +<title>Dynamic Adaptation to Change</title> +<para><indexterm id="IG3137188814"><primary>dynamic adaptation</primary></indexterm><indexterm id="IG3137188815"> +<primary>adaptation</primary></indexterm>Complex systems are subject to +continual changes as network connections fail and are reestablished; nodes +are taken out of service and rebooted; hardware is added and removed; +and software is upgraded, installed, or removed. Often these changes are +asynchronous and remote (perhaps in another geographic region or domain +of administrative control).</para> +<para>The distributed nature of PCP (and the modular fashion in which +performance metrics domains can be installed, upgraded, and configured +on different hosts) enables PCP to adapt concurrently to changes in the +monitored system(s). 
Variations in the available performance metrics as +a consequence of configuration changes are handled automatically and become +visible to all clients as soon as the reconfigured host is rebooted or +the responsible agent is restarted.</para> +<para>PCP also detects loss of client-server connections, and most clients +support subsequent automated reconnection.</para> +</section> +<section id="LE13859-PARENT"> + +<title>Logging and Retrospective Analysis</title> +<para><indexterm id="IG3137188816"><primary>archive logs</primary><secondary>analysis</secondary> +</indexterm><indexterm id="ITch01-3"><primary>logging</primary><see>archive +logs</see></indexterm>A range of tools is provided to support flexible, +adaptive logging of performance metrics for archive, playback, remote +diagnosis, and capacity planning. PCP archive logs may be accumulated +either at the host being monitored, at a monitoring workstation, or both. +</para> +<para>A universal replay mechanism, modeled on +<ulink url="http://en.wikipedia.org/wiki/Media_controls">media controls</ulink>, +supports play, step, rewind, fast forward and variable speed processing of archived +performance data. Replay for multiple archives, from multiple hosts, is facilitated +by an archive aggregation concept.</para> +<para>Most PCP applications are able to process archive logs and real-time +performance data with equal facility. 
Unification of real-time access +and access to the archive logs, in conjunction with the media controls, +provides powerful mechanisms for building performance tools and for reviewing +both current and historical performance data.</para> +</section> +<section id="LE36677-PARENT"> + +<title>Automated Operational Support</title> +<para><indexterm id="IG3137188817"><primary>automated operational support</primary></indexterm>For +operational and production environments, PCP provides a framework of +customizable scripts that automate the execution of ongoing tasks +such as these:</para> +<itemizedlist> +<listitem><para><indexterm id="IG3137188818"><primary>centralized archive logging</primary> +</indexterm>Centralized archive logging for multiple remote hosts</para> +</listitem> +<listitem><para><indexterm id="IG3137188819"><primary>archive logs</primary><secondary> +customization</secondary></indexterm>Archive log rotation, consolidation, +and culling</para> +</listitem> +<listitem><para>Web-based publishing of charts showing snapshots of performance +activity levels in the recent past</para> +</listitem> +<listitem><para>Flexible alarm monitoring: parameterized rules to address +common critical performance scenarios and facilities to customize and +refine this monitoring</para> +</listitem> +<listitem><para><indexterm id="IG3137188820"><primary>audits</primary></indexterm>Retrospective +performance audits covering the recent past; for example, daily or weekly +checks for performance regressions or quality of service problems</para> +</listitem></itemizedlist> +</section> +<section id="LE38522-PARENT"> + +<title>PCP Extensibility</title> +<para><indexterm id="IG3137188821"><primary>extensibility</primary></indexterm><indexterm id="IG3137188822"> +<primary>PCP</primary><secondary>extensibility</secondary></indexterm>PCP +permits the integration of new performance metrics into the PMNS, the +collection infrastructure, and the logging framework. 
The guiding principle +is, “if it is important for monitoring system performance, and you +can measure it, you can easily integrate it into the PCP framework.” +</para> +<para>For many PCP users, the most important performance metrics are +not those already supported, but new performance metrics that characterize +the essence of good or bad performance at their site, or within their +particular application environment.</para> +<para>One example is an application that measures the round-trip time +for a benign “probe” transaction against some mission-critical +application.</para> +<para><indexterm id="IG3137188823"><primary>PMDA</primary><secondary>libraries</secondary> +</indexterm>For application developers, a library is provided to support +easy-to-use insertion of trace and monitoring points within an application, +and the automatic export of resultant performance data into the PCP framework. +Other libraries and tools aid the development of customized and fully +featured Performance Metrics Domain Agents (PMDAs).</para> +<para>Extensive source code examples are provided in the distribution, +and by using the PCP toolkit and interfaces, these customized measures +of performance or quality of service can be easily and seamlessly integrated +into the PCP framework.</para> +</section> +<section id="LE40772-PARENT"> + +<title>Metric Coverage</title> +<para><indexterm id="IG3137188824"><primary>coverage</primary></indexterm>The core +PCP modules support export of performance metrics that include kernel +instrumentation, hardware instrumentation, process-level resource utilization, +database and other system services instrumentation, and activity in the PCP +collection infrastructure.</para> +<para>The supplied agents support thousands of distinct performance metrics, +many of which can have multiple values, for example, per disk, per CPU, +or per process.</para> +</section> +</section> +<section id="LE79836-PARENT"> + +<title>Conceptual Foundations</title> 
+<para><indexterm id="ITch01-129"><primary>conceptual foundations</primary> +</indexterm>The following sections provide a detailed overview of concepts +that underpin Performance Co-Pilot (PCP).</para> +<section id="id5188366"> + +<title>Performance Metrics</title> +<para><indexterm id="ITch01-130"><primary>PMAPI</primary><secondary>naming +metrics</secondary></indexterm><indexterm id="IG3137188863"><primary>performance metrics +</primary><secondary>concept</secondary></indexterm> Across all of the +supported performance metric domains, there are a large number of performance +metrics. Each metric has its own structure and semantics. PCP presents +a uniform interface to these metrics, independent of the underlying metric +data source.</para> +<para><indexterm id="IG3137188864"><primary>PMNS</primary><secondary>brief description</secondary> +</indexterm>The Performance Metrics Name Space (PMNS) provides a hierarchical +classification of human-readable metric names, and a mapping from these external +names to internal metric identifiers. See <xref linkend="LE94677-PARENT"/>, for +a description of the PMNS.</para> +</section> +<section id="id5188440"> + +<title>Performance Metric Instances</title> +<para>When performance metric values are returned to a requesting application, +there may be more than one value instance for a particular metric; for +example, independent counts for each CPU, process, disk, or local filesystem. +Internal instance identifiers correspond one to one with external (human-readable) +descriptions of the members of an instance domain.</para> +<para>Transient performance metrics (such as per-process information) +cause repeated requests for the same metric to return different numbers +of values, or changes in the particular instance identifiers returned. 
+These changes are expected and fully supported by the PCP infrastructure; +however, metric instantiation is guaranteed to be valid only at the time of collection.</para> +</section> +<section id="id5188469"> + +<title>Current Metric Context</title> +<para>When performance metrics are retrieved, they are delivered in the +context of a particular source of metrics, a point in time, and a profile +of desired instances. This means that the application making the request +has already negotiated to establish the context in which the request should +be executed.</para> +<para><indexterm id="IG3137188865"><primary>pmlogger tool</primary><secondary>current metric +context</secondary></indexterm>A metric source may be the current performance +data from a particular host (a live or real-time source), or an archive +log of performance data collected by <command>pmlogger</command> at some +distant host or at an earlier time (a retrospective or archive source). +</para> +<para><indexterm id="ITch01-134"><primary>collection time</primary></indexterm>By +default, the collection time for a performance metric is the current time +of day for real-time sources, or current point within an archive source. +For archives, the collection time may be reset to an arbitrary time within +the bounds of the archive log.<indexterm id="ITch01-135"><primary>archive logs</primary> +<secondary>collection time</secondary></indexterm></para> +</section> +<section id="id5188562"> + +<title>Sources of Performance Metrics and Their Domains</title> +<para><indexterm id="ITch01-137"><primary>performance metrics</primary> +<secondary>sources</secondary></indexterm><indexterm id="ITch01-138"> +<primary>functional domains</primary></indexterm> Instrumentation for +the purpose of performance monitoring typically consists of counts of +activity or events, attribution of resource consumption, and service-time +or response-time measures. 
This instrumentation may exist in one or more +of the functional domains as shown in <xref linkend="id5188602"/>. +</para> +<figure id="id5188602"><title>Performance Metric Domains as Autonomous Collections +of Data</title><mediaobject><imageobject><imagedata fileref="figures/metric-domains.svg"/></imageobject><textobject><phrase>Performance Metric Domains as Autonomous Collections +of Data</phrase></textobject></mediaobject></figure> +<para>Each domain has an associated access method:</para> +<itemizedlist> +<listitem><para><indexterm id="IG3137188866"><primary>Kernel data structures</primary></indexterm>The +operating system kernel, including sub-system data structures - per-process +resource consumption, network statistics, disk activity, or memory management +instrumentation.</para> +</listitem> +<listitem><para><indexterm id="IG3137188867"><primary>Mail servers</primary></indexterm><indexterm id="IG3137188868"> +<primary>layered software services</primary></indexterm>A layered software +service such as activity logs for a World Wide Web server or an email delivery +server.<indexterm id="IG3137188869"><primary>application programs</primary></indexterm></para> +</listitem> +<listitem><para>An application program such as measured response time +for a production application running a periodic and benign probe transaction +(as often required in service level agreements), or rate of computation +and throughput in jobs per minute for a batch stream.</para> +</listitem> +<listitem><para><indexterm id="IG3137188870"><primary>network routers and bridges</primary> +</indexterm><indexterm id="IG3137188871"><primary>external equipment</primary></indexterm>External +equipment such as network routers and bridges.</para> +</listitem></itemizedlist> +<para><indexterm id="ITch01-139"><primary>performance metrics</primary> +<secondary>methods</secondary></indexterm>For each domain, the set of +performance metrics may be viewed as an abstract data type, with an associated +set of 
methods that may be used to perform the following tasks:</para> +<itemizedlist> +<listitem><para>Interrogate the metadata that describes the syntax and +semantics of the performance metrics</para> +</listitem> +<listitem><para>Control (enable or disable) the collection of some or +all of the metrics</para> +</listitem> +<listitem><para>Extract instantiations (current values) for some or all +of the metrics</para> +</listitem></itemizedlist> +<para>We refer to each functional domain as a performance metrics domain +and assume that domains are functionally, architecturally, and administratively +independent and autonomous. Obviously the set of performance metrics domains +available on any host is variable, and changes with time as software and +hardware are installed and removed.</para> +<para>The number of performance metrics domains may be further enlarged +in cluster-based or network-based configurations, where there is potentially +an instance of each performance metrics domain on each node. Hence, the +management of performance metrics domains must be both extensible at a +particular host and distributed across a number of hosts.</para> +<para><indexterm id="IG3137188872"><primary>PMID</primary><secondary>description</secondary> +</indexterm><indexterm id="IG3137188873"><primary>Performance Metric Identifier</primary> +<see>PMID</see></indexterm>Each performance metrics domain on a particular +host must be assigned a unique Performance Metric Identifier (PMID). In +practice, this means unique identifiers are assigned globally for each +performance metrics domain type. 
For example, the same identifier would +be used for the Apache Web Server performance metrics domain on all hosts.</para> +</section> +<section id="id5188837"> + +<title>Distributed Collection</title> +<para><indexterm id="ITch01-142"><primary>distributed collection</primary> +</indexterm><indexterm id="ITch01-143"><primary>collector hosts</primary> +</indexterm><indexterm id="IG3137188874"><primary>PMCD</primary><secondary>distributed collection +</secondary></indexterm>The performance metrics collection architecture +is distributed, in the sense that any performance tool may be executing +remotely. However, a PMDA usually runs on the system for which it is collecting +performance measurements. In most cases, connecting these tools together +on the collector host is the responsibility of the PMCD process, as shown +in <xref linkend="id5188883"/>.</para> +<figure id="id5188883"><title>Process Structure for Distributed Operation</title> +<mediaobject><imageobject><imagedata fileref="figures/remote-collector.svg"/></imageobject><textobject><phrase>Process Structure for Distributed Operation</phrase></textobject></mediaobject></figure> +<para>The host running the monitoring tools does not require any collection +tools, including <command>pmcd</command>, because all requests for metrics +are sent to the <command>pmcd</command> process on the collector host. +These requests are then forwarded to the appropriate PMDAs, which respond +with metric descriptions, help text, and most importantly, metric values. +</para> +<para><indexterm id="IG3137188875"><primary>PMCD</primary><secondary>distributed collection +</secondary></indexterm>The connections between monitor clients and <literal>pmcd</literal> processes are managed in <filename>libpcp</filename>, below +the PMAPI level; see the <command>pmapi(3)</command> man page. +Connections between PMDAs and <command>pmcd</command> are managed by the +PMDA routines; see the <command>pmda(3)</command> man page. 
+There can be multiple monitor clients and multiple PMDAs on one host, +but normally there would be only one <literal>pmcd</literal> process.</para> +</section> +<section id="LE94677-PARENT"> + +<title>Performance Metrics Name Space</title> +<para><indexterm id="ITch01-144"><primary>PMNS</primary><secondary>description +</secondary></indexterm> <indexterm id="ITch01-146"><primary>PMID</primary> +<secondary>description</secondary></indexterm>Internally, each unique +performance metric is identified by a Performance Metric Identifier (PMID) +drawn from a universal set of identifiers, including some that are reserved +for site-specific, application-specific, and customer-specific use.</para> +<para><indexterm id="ITch01-148"><primary>performance metrics</primary> +<secondary>PMNS</secondary></indexterm>An external name space - the Performance +Metrics Name Space (PMNS) - maps from a hierarchy (or tree) of human-readable +names to PMIDs.</para> +<section id="id5189100"> + +<title>Performance Metrics Name Space Diagram</title> +<para>Each node in the PMNS tree is assigned a label that must begin with +an alphabetic character, and be followed by zero or more alphanumeric characters +or the underscore (_) character. The root node of the tree has the special +label of <literal>root</literal>.</para> +<para>A metric name is formed by traversing the tree from the root to +a leaf node with each node label on the path separated by a period. The +common prefix <literal>root</literal><emphasis role="bold">.</emphasis> is omitted +from all names. 
For example, <xref linkend="id5189137"/> shows the +nodes in a small subsection of a PMNS.</para> +<figure id="id5189137"><title>Small Performance Metrics Name Space (PMNS) +</title><mediaobject><imageobject><imagedata fileref="figures/pmns-small.svg"/></imageobject><textobject><phrase>Small Performance Metrics Name Space (PMNS) +</phrase></textobject></mediaobject></figure> +<para>In this subsection, the following are valid names for performance +metrics:</para> +<literallayout class="monospaced">kernel.percpu.syscall +network.tcp.rcvpack +hw.router.recv.total_util +</literallayout> +</section> +</section> +<section id="id5189172"> + +<title>Descriptions for Performance Metrics</title> +<para><indexterm id="ITch01-149"><primary>performance metrics</primary> +<secondary>descriptions</secondary></indexterm><indexterm id="ITch01-150"> +<primary>metadata</primary></indexterm> Through the various performance +metric domains, PCP must support a wide range of formats and semantics +for performance metrics. 
This <firstterm>metadata</firstterm> describing +the performance metrics includes the following:</para> +<itemizedlist> +<listitem><para>The internal identifier, Performance Metric Identifier +(PMID), for the metric</para> +</listitem> +<listitem><para><indexterm id="IG3137188876"><primary>64-bit IEEE format</primary></indexterm>The +format and encoding for the values of the metric, for example, an unsigned +32-bit integer or a string or a 64-bit IEEE format floating point number +</para> +</listitem> +<listitem><para>The semantics of the metric, particularly the interpretation +of the values as free-running counters or instantaneous values</para> +</listitem> +<listitem><para>The dimensionality of the values, in the dimensions of +events, space, and time</para> +</listitem> +<listitem><para>The scale of values; for example, bytes, kilobytes (KB), +or megabytes (MB) for the space dimension</para> +</listitem> +<listitem><para>An indication if the metric may have one or many associated +values</para> +</listitem> +<listitem><para>Short (and extended) help text describing the metric</para> +</listitem></itemizedlist> +<para>For each metric, this metadata is defined within the associated +PMDA, and PCP arranges for the information to be exported to performance +tools that use the metadata when interpreting the values for each metric.</para> +</section> +<section id="id5189302"> + +<title>Values for Performance Metrics</title> +<para>The following sections describe two types of performance metrics, +single-valued and set-valued.</para> +<section id="id5189314"> + +<title>Single-Valued Performance Metrics</title> +<para><indexterm id="IG3137188877"><primary>single-valued performance metrics</primary> +</indexterm>Some performance metrics have a singular value within their +performance metric domains. For example, available memory (or the total +number of context switches) has only one value per performance metric +domain, that is, one value per host. 
The metadata describing the metric +makes this fact known to applications that process values for these metrics. +</para> +</section> +<section id="id5189346"> + +<title>Set-Valued Performance Metrics</title> +<para><indexterm id="IG3137188878"><primary>set-valued performance metrics</primary></indexterm>Some +performance metrics have a set of values or instances in each implementing +performance metric domain. For example, one value for each disk, one value +for each process, one value for each CPU, or one value for each activation +of a given application.</para> +<para>When a metric has multiple instances, the PCP framework does not +pollute the Name Space with additional metric names; rather, a single +metric may have an associated set of values. These multiple values are +associated with the members of an <firstterm>instance domain</firstterm>, +such that each instance has a unique instance identifier within the associated +instance domain. For example, the “per CPU” instance domain +may use the instance identifiers 0, 1, 2, 3, and so on to identify the +configured processors in the system.</para> +<para>Internally, instance identifiers are encoded as binary values, but +each performance metric domain also supports corresponding strings as +external names for the instance identifiers, and these names are used +at the user interface to the PCP utilities.</para> +<para>For example, the performance metric <literal>disk.dev.total</literal> +counts I/O operations for each disk spindle, and the associated instance +domain contains one member for each disk spindle. On a system with five +specific disks, one value would be associated with each of the external +and internal instance identifier pairs shown in <xref linkend="id5189432"/>. 
+</para> +<table id="id5189432" frame="topbot"> + +<title>Sample Instance Identifiers for Disk Statistics +</title> +<tgroup cols="2" colsep="0" rowsep="0"> +<colspec colwidth="198*"/> +<colspec colwidth="198*"/> +<thead> +<row rowsep="1" valign="top"><entry align="left" valign="bottom"><para>External Instance +Identifier</para></entry><entry align="left" valign="bottom"><para>Internal +Instance Identifier</para></entry></row></thead> +<tbody> +<row valign="top"> +<entry align="left" valign="top"><para>disk0</para></entry> +<entry align="left" valign="top"><para>131329</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para>disk1</para></entry> +<entry align="left" valign="top"><para>131330</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para>disk2</para></entry> +<entry align="left" valign="top"><para>131331</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para>disk3</para></entry> +<entry align="left" valign="top"><para>131841</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para>disk4</para></entry> +<entry align="left" valign="top"><para>131842</para></entry></row></tbody> +</tgroup></table> +<para>Multiple performance metrics may be associated with a single instance +domain.</para> +<para>Each performance metric domain may dynamically establish the instances +within an instance domain. For example, there may be one instance for +the metric <literal>kernel.percpu.idle</literal> on a workstation, but +multiple instances on a multiprocessor server. Even more dynamic is <literal>filesys.free</literal>, where the values report the amount of free space +per file system, and the number of values tracks the mounting and unmounting +of local filesystems.</para> +<para>PCP arranges for information describing instance domains to be exported +from the performance metric domains to the applications that require this +information. 
Applications may also choose to retrieve values for all instances +of a performance metric, or some arbitrary subset of the available instances. +</para> +</section> +</section> +<section id="id5189653"> + +<title>Collector and Monitor Roles</title> +<para><indexterm id="ITch01-155"><primary>roles</primary><secondary>collector +</secondary></indexterm><indexterm id="IG3137188879"><primary>roles</primary><secondary> +monitor</secondary></indexterm>Hosts supporting PCP services are broadly +classified into two categories:</para> +<variablelist id="Z926617459sdc"> +<varlistentry> +<term>Collector</term> +<listitem><para><indexterm id="ITch01-156"><primary>collector hosts</primary> +</indexterm><indexterm id="IG3137188880"><primary>PMDA</primary><secondary>collectors</secondary> +</indexterm>Hosts that have <literal>pmcd</literal> and one or more performance +metric domain agents (PMDAs) running to collect and export performance +metrics</para> +</listitem></varlistentry> +<varlistentry> +<term>Monitor</term> +<listitem><para><indexterm id="IG3137188881"><primary>monitor hosts</primary></indexterm>Hosts +that import performance metrics from one or more collector hosts to be +consumed by tools to monitor, manage, or record the performance of the +collector hosts</para> +</listitem></varlistentry> +</variablelist> +<para>Each PCP enabled host can operate as a collector, a monitor, or +both.</para> +</section> +<section id="id5189901"> + +<title>Retrospective Sources of Performance Metrics</title> +<para><indexterm id="ITch01-159"><primary>performance metrics</primary> +<secondary>retrospective sources</secondary></indexterm>The PMAPI also +supports delivery of performance metrics from a historical +source in the form of a PCP archive log. 
Archive logs are created using +the <literal>pmlogger</literal> utility, and are replayed in an architecture +as shown in <xref linkend="id5189940"/>.</para> +<figure id="id5189940"><title>Architecture for Retrospective Analysis</title><mediaobject><imageobject><imagedata fileref="figures/retrospective-architecture.svg"/></imageobject><textobject><phrase>Architecture for Retrospective Analysis</phrase></textobject></mediaobject></figure> +<para>The PMAPI has been designed to minimize the differences required +for an application to process performance data from an archive or from +a real-time source. As a result, most PCP tools support live and retrospective +monitoring with equal facility.</para> +</section> +<section id="id5189964"> + +<title>Product Extensibility</title> +<para><indexterm id="ITch01-160"><primary>PCP</primary><secondary>extensibility +</secondary></indexterm>Much of the PCP software's potential for attacking +difficult performance problems in production environments comes from the +design philosophy that considers extensibility to be critically important. +</para> +<para>The performance analyst can take advantage of the PCP infrastructure +to deploy value-added performance monitoring tools and services. 
Here +are some examples:</para> +<itemizedlist> +<listitem><para>Easy extension of the PCP collector to accommodate new +performance metrics and new sources of performance metrics, in particular +using the interfaces of a special-purpose library to develop new PMDAs +(see the <command>pmda(3)</command> man page)</para> +</listitem> +<listitem><para><indexterm id="IG3137188885"><primary>libpcp_pmda library</primary></indexterm><indexterm id="IG3137188886"> +<primary>libpcp_mmv library</primary></indexterm>Use of libraries +(<filename>libpcp_pmda</filename> and <filename>libpcp_mmv</filename>) +to aid in the development of new capabilities to export performance metrics +from local applications</para> +</listitem> +<listitem><para>Operation on any performance metric using generalized +toolkits</para> +</listitem> +<listitem><para>Distribution of PCP components such as collectors across +the network, placing the service where it can do the most good</para> +</listitem> +<listitem><para>Dynamic adjustment to changes in system configuration +</para> +</listitem> +<listitem><para>Flexible customization built into the design of all PCP +tools</para> +</listitem> +<listitem><para>Creation of new monitor applications, using the routines +described in the <command>pmapi(3)</command> man page</para> +</listitem></itemizedlist> +</section> +</section> +<section id="LE13618-PARENT"> + +<title>Overview of Component Software</title> +<para><indexterm id="IG3137188825"><primary>software</primary></indexterm><indexterm id="IG3137188826"><primary> +component software</primary></indexterm>Performance Co-Pilot (PCP) is +composed of both text-based and graphical tools. Each tool is fully +documented by a man page. These man pages are named after the +tools or commands they describe, and are accessible through the <command>man</command> +command. 
For example, to see the <command>pminfo(1)</command> man page for the +<command>pminfo</command> command, enter this command:</para> +<literallayout class="monospaced"><userinput>man pminfo</userinput></literallayout> +<para>A representative list of PCP tools and commands, grouped by functionality, +is provided in the following four sections.</para> +<section id="id5177430"> + +<title>Performance Monitoring and Visualization</title> +<para><indexterm id="IG3137188827"><primary>PCP</primary><secondary>tool summaries</secondary> +</indexterm><indexterm id="IG3137188828"><primary>performance monitoring</primary></indexterm>The +following tools provide the principal services for the PCP end-user with +an interest in monitoring, visualizing, or processing performance information +collected either in real time or from PCP archive logs:</para> +<variablelist> +<varlistentry> +<term><command>pmatop</command></term> +<listitem><para><indexterm id="IG3137188800"><primary>pmatop tool</primary> +<secondary>brief description</secondary></indexterm>Full-screen monitor +of the load on a system from a kernel, hardware and processes point of view. +It is modeled on the Linux <command>atop(1)</command> tool +(<ulink url="http://www.atoptool.nl/">home page</ulink>) and provides a +showcase for the variety of data available using PCP services and the +Python scripting interfaces.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmchart</command></term> +<listitem><para><indexterm id="IG3137188800nat"><primary>pmchart tool</primary> +<secondary>brief description</secondary></indexterm>Strip chart tool for +arbitrary performance metrics. 
Interactive graphical utility that can display +multiple charts simultaneously, from multiple hosts or archives, aligned on a +unified time axis (X-axis), or on multiple tabs.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmcollectl</command></term> +<listitem><para><indexterm id="IG3137188801"><primary>pmcollectl tool</primary> +<secondary>brief description</secondary></indexterm>Statistics collection +tool with good coverage of a number of Linux kernel subsystems, with the +everything-in-one-tool approach pioneered by <command>sar(1)</command>. +It is modeled on the Linux <command>collectl(1)</command> utility +(<ulink url="http://collectl.sourceforge.net/">home page</ulink>) and +provides another example of use of the Python scripting interfaces to build +more complex functionality with relative ease, with PCP as a foundation.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdumptext</command></term> +<listitem><para><indexterm id="ITch01-22"><primary>pmdumptext tool</primary> +<secondary>brief description</secondary></indexterm>Outputs the values +of arbitrary performance metrics collected live or from a PCP archive, in +textual format.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmevent</command></term> +<listitem><para><indexterm id="IG3137188713"><primary>pmevent tool</primary> +<secondary>brief description</secondary></indexterm>Reports on event metrics, +decoding the timestamp and event parameters for text-based reporting.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmie</command></term> +<listitem><para><indexterm id="ITch01-32"><primary>pmie tool</primary> +<secondary>brief description</secondary></indexterm>Evaluates predicate-action +rules over performance metrics for alarms, automated system management tasks, +dynamic configuration tuning, and so on. 
It is an inference engine.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmieconf</command></term> +<listitem><para><indexterm id="IG3137188829"><primary>pmieconf tool</primary><secondary> +brief description</secondary></indexterm><indexterm id="IG3137188830"><primary>pmie tool +</primary><secondary>pmieconf rules</secondary></indexterm>Creates parameterized +rules to be used with the PCP inference engine (<command>pmie</command>). +It can be run either interactively or from scripts for automating the setup of inference +(the PCP start scripts do this, for example, to generate a default configuration).</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pminfo</command></term> +<listitem><para><indexterm id="ITch01-34"><primary>pminfo tool</primary> +<secondary>brief description</secondary></indexterm>Displays information +about arbitrary performance metrics available from PCP, including help +text with <literal>-T</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlogsummary</command></term> +<listitem><para><indexterm id="IG3137188831"><primary>pmlogsummary tool</primary></indexterm>Calculates +and reports various statistical summaries of the performance metric values +from a PCP archive.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmprobe</command></term> +<listitem><para><indexterm id="IG3137188832"><primary>pmprobe tool</primary></indexterm>Probes +for performance metric availability, values, and instances.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmstat</command></term> +<listitem><para><indexterm id="ITch01-37"><primary>pmstat tool</primary> +<secondary>brief description</secondary></indexterm>Provides a text-based +display of metrics that summarize the performance of one or more systems +at a high level.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmval</command></term> +<listitem><para><indexterm 
id="ITch01-42"><primary>pmval tool</primary> +<secondary>brief description</secondary></indexterm>Provides a text-based +display of the values for arbitrary instances of a selected performance +metric, suitable for ASCII logs or inquiry over a slow link.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5177776"> + +<title>Collecting, Transporting, and Archiving Performance Information +</title> +<para><indexterm id="IG3137188833"><primary>PCP</primary><secondary>tool summaries</secondary> +</indexterm><indexterm id="IG3137188834"><primary>data collection tools</primary></indexterm><indexterm id="IG3137188835"> +<primary>network transportation tools </primary></indexterm><indexterm id="IG3137188836"> +<primary>archive logs</primary><secondary>creation</secondary></indexterm>PCP +provides the following tools to support real-time data collection, network +transport, and archive log creation services for performance data:</para> +<variablelist> +<varlistentry> +<term><command>mkaf</command></term> +<listitem><para><indexterm id="ITch01-48"><primary>mkaf tool</primary> +</indexterm>Aggregates an arbitrary collection of PCP archive logs into +a <firstterm>folio</firstterm> to be used with <command>pmafm</command>. 
+</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmafm</command></term> +<listitem><para><indexterm id="ITch01-50"><primary>pmafm tool</primary> +<secondary>brief description</secondary></indexterm>Interrogates, manages, +and replays an archive folio as created by <command>mkaf</command>, or +the periodic archive log management scripts, or the record mode of other +PCP tools.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmcd</command></term> +<listitem><para><indexterm id="ITch01-52"><primary>PMCD</primary><secondary> +brief description</secondary></indexterm><indexterm id="IG3137188837"><primary>Performance +Metrics Collection Daemon</primary><see>PMCD</see></indexterm><indexterm id="IG3137188838"> +<primary>pmcd tool</primary><see>PMCD</see></indexterm>Is the Performance +Metrics Collection Daemon (PMCD). This daemon must run on each system +being monitored, to collect and export the performance information necessary +to monitor the system.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmcd_wait</command></term> +<listitem><para><indexterm id="IG3137188839"><primary>pmcd_wait tool</primary></indexterm>Waits +for <command>pmcd</command> to be ready to accept client connections. +</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdaapache</command></term> +<listitem><para><indexterm id="IG3137188803"><primary>pmdaapache tool</primary> +</indexterm>Exports performance metrics from the Apache Web Server. 
+It is a Performance Metrics Domain Agent (PMDA).</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdacisco</command></term> +<listitem><para><indexterm id="ITch01-54"><primary>pmdacisco tool</primary> +</indexterm>Extracts performance metrics from one or more Cisco routers.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdaelasticsearch</command></term> +<listitem><para><indexterm id="IG3137188714"><primary>pmdaelasticsearch tool</primary> +</indexterm>Extracts performance metrics from an Elasticsearch cluster.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdagfs2</command></term> +<listitem><para><indexterm id="IG3137188804"><primary>pmdagfs2 tool</primary> +</indexterm>Exports performance metrics from the GFS2 clustered filesystem.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdagluster</command></term> +<listitem><para><indexterm id="IG3137188712"><primary>pmdagluster tool</primary> +</indexterm>Extracts performance metrics from the Gluster filesystem.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdainfiniband</command></term> +<listitem><para><indexterm id="IG3137188805"><primary>pmdainfiniband tool</primary> +</indexterm>Exports performance metrics from the InfiniBand kernel driver.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdakvm</command></term> +<listitem><para><indexterm id="IG3137188806"><primary>pmdakvm tool</primary> +</indexterm>Extracts performance metrics from the Linux Kernel Virtual Machine (KVM) +infrastructure.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdalustrecomm</command></term> +<listitem><para><indexterm id="IG3137188807"><primary>pmdalustrecomm tool</primary> +</indexterm>Exports performance metrics from the Lustre clustered filesystem.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdamailq</command></term> +<listitem><para><indexterm 
id="ITch01-58"><primary>pmdamailq tool</primary> +</indexterm>Exports performance metrics describing the current state of +items in the <literal>sendmail</literal> queue.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdamemcache</command></term> +<listitem><para><indexterm id="IG3137188808"><primary>pmdamemcache tool</primary> +</indexterm>Extracts performance metrics from memcached, a distributed +memory caching daemon commonly used to improve web serving performance.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdammv</command></term> +<listitem><indexterm id="IG3137188841"><primary>pmdammv tool</primary></indexterm><para> +Exports metrics from instrumented applications linked with the <filename>pcp_mmv</filename> +shared library or the +<ulink url="http://code.google.com/p/parfait/">Parfait</ulink> +framework for Java instrumentation. +These metrics are custom developed per application, and in the case of Parfait, automatically +include numerous JVM, Tomcat and other server or container statistics.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdamysql</command></term> +<listitem><para><indexterm id="IG3137188809"><primary>pmdamysql tool</primary> +</indexterm>Extracts performance metrics from the MySQL relational database.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdanamed</command></term> +<listitem><para><indexterm id="IG3137188701"><primary>pmdanamed tool</primary> +</indexterm>Exports performance metrics from the Internet domain name server, named.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdanginx</command></term> +<listitem><para><indexterm id="IG3137188702"><primary>pmdanginx tool</primary> +</indexterm>Extracts performance metrics from the nginx HTTP and reverse proxy server.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdapostfix</command></term> +<listitem><para><indexterm id="IG3137188703"><primary>pmdapostfix 
tool</primary> +</indexterm>Exports performance metrics from the Postfix mail transfer agent.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdapostgres</command></term> +<listitem><para><indexterm id="IG3137188704"><primary>pmdapostgres tool</primary> +</indexterm>Extracts performance metrics from the PostgreSQL relational database.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdaproc</command></term> +<listitem><para><indexterm id="IG3137188705"><primary>pmdaproc tool</primary> +</indexterm>Exports performance metrics for running processes.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdarsyslog</command></term> +<listitem><para><indexterm id="IG3137188706"><primary>pmdarsyslog tool</primary> +</indexterm>Extracts performance metrics from the Reliable System Log daemon.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdasamba</command></term> +<listitem><para><indexterm id="IG3137188707"><primary>pmdasamba tool</primary> +</indexterm>Extracts performance metrics from Samba, a Windows SMB/CIFS server.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdasendmail</command></term> +<listitem><para><indexterm id="IG3137188842"><primary>pmdasendmail tool</primary></indexterm>Exports +mail activity statistics from <command>sendmail</command>. +</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmdashping</literal></term> +<listitem><para><indexterm id="ITch01-60"><primary>pmdashping tool</primary> +</indexterm>Exports performance metrics for the availability and quality +of service (response-time) for arbitrary shell commands. 
+</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdasnmp</command></term> +<listitem><para><indexterm id="IG3137188708"><primary>pmdasnmp tool</primary> +</indexterm>Extracts SNMP performance metrics from local or remote SNMP-enabled devices.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdasummary</command></term> +<listitem><para><indexterm id="ITch01-62"><primary>pmdasummary tool</primary> +</indexterm>Derives performance metric values from values made available +by other PMDAs. It is a PMDA itself.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdasystemd</command></term> +<listitem><para><indexterm id="IG3137188709"><primary>pmdasystemd tool</primary> +</indexterm>Extracts performance metrics from the systemd and journald services.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmdatrace</literal></term> +<listitem><para><indexterm id="ITch01-64"><primary>pmdatrace tool</primary> +</indexterm>Exports transaction performance metrics from application processes +that use the <filename>pcp_trace</filename> library.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdavmware</command></term> +<listitem><para><indexterm id="IG3137188710"><primary>pmdavmware tool</primary> +</indexterm>Extracts performance metrics from a VMware virtualization host.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdaweblog</command></term> +<listitem><indexterm id="IG3137188843"><primary>pmdaweblog tool</primary></indexterm><para> +Scans Web-server logs to extract metrics characterizing Web server activity.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdaxfs</command></term> +<listitem><para><indexterm id="IG3137188711"><primary>pmdaxfs tool</primary> +</indexterm>Extracts performance metrics from the Linux kernel XFS filesystem implementation.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdumplog</command></term> 
+<listitem><para><indexterm id="ITch01-66"><primary>pmdumplog tool</primary> +<secondary>brief description</secondary></indexterm>Displays selected +state information, control data, and metric values from a PCP archive +log created by <command>pmlogger</command>.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlc</command></term> +<listitem><para><indexterm id="ITch01-68"><primary>pmlc tool</primary> +<secondary>brief description</secondary></indexterm>Exercises control +over an instance of the PCP archive logger <command>pmlogger</command>, +to modify the profile of which metrics are logged and/or how frequently +their values are logged.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlogcheck</command></term> +<listitem><para><indexterm id="IG3137188844"><primary>pmlogcheck tool</primary></indexterm>Performs +an integrity check for PCP archives.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlogconf</command></term> +<listitem><para><indexterm id="IG3137188845"><primary>pmlogconf tool</primary></indexterm>Creates +or modifies <command>pmlogger</command> configuration files for many common +logging scenarios, optionally probing for available metrics and enabled functionality. 
+It can be run either interactively or from scripts for automating the setup of data logging +(the PCP start scripts do this, for example, to generate a default configuration).</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlogextract</command></term> +<listitem><para><indexterm id="ITch01-72"><primary>pmlogextract tool</primary> +</indexterm>Reads one or more PCP archive logs and creates a temporally +merged and reduced PCP archive log as output.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlogger</command></term> +<listitem><para><indexterm id="IG3137188846"><primary>pmlogger tool</primary><secondary> +brief description</secondary></indexterm>Creates PCP archive logs of performance +metrics over time. Many tools accept these PCP archive logs as alternative +sources of metrics for retrospective analysis.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmproxy</command></term> +<listitem><para><indexterm id="ITch01-38"><primary>pmproxy tool</primary> +<secondary>brief description</secondary></indexterm>Allows the execution +of PCP tools through a network firewall system.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmtrace</command></term> +<listitem><para><indexterm id="ITch01-74"><primary>pmtrace tool</primary> +</indexterm>Provides a simple command line interface to the trace PMDA +and its associated <filename>pcp_trace</filename> library.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmwebd</command></term> +<listitem><para><indexterm id="IG3137188802"><primary>web daemon</primary> +</indexterm>Is the Performance Metrics Web Daemon, a front-end to both +<command>pmcd</command> and PCP archives, +providing a JSON interface suitable for use by web-based tools wishing to +access performance data over HTTP.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5187554"> + +<title>Operational and Infrastructure Support</title> 
+<para><indexterm id="IG3137188847"><primary>PCP</primary><secondary>tool summaries</secondary> +</indexterm><indexterm id="IG3137188848"><primary>operational support tools</primary></indexterm><indexterm id="ITch01-76"><primary>infrastructure support tools</primary></indexterm>PCP +provides the following tools to support the PCP infrastructure and assist +operational procedures for PCP deployment in a production environment: +</para> +<variablelist> +<varlistentry> +<term><command>pcp</command></term> +<listitem><para><indexterm id="IG3137188850"><primary>pcp tool</primary></indexterm>Summarizes +the state of a PCP installation.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmdbg</command></term> +<listitem><para><indexterm id="ITch01-89"><primary>pmdbg facility</primary> +</indexterm><indexterm id="IG3137188851"><primary>diagnostic tools</primary></indexterm><indexterm id="ITch01-90"><primary>debugging tools</primary></indexterm>Describes +the available facilities and associated control flags. PCP tools include +internal diagnostic and debugging facilities that may be activated by +run-time flags.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmerr</command></term> +<listitem><para><indexterm id="ITch01-91"><primary>pmerr tool</primary> +</indexterm>Translates PCP error codes into human-readable error messages. +</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmhostname</command></term> +<listitem><para><indexterm id="IG3137188852"><primary>pmhostname tool</primary></indexterm>Reports +the hostname as returned by <command>gethostbyname</command>. 
Used in assorted +PCP management scripts.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmie_check</command></term> +<listitem><para><indexterm id="IG3137188853"><primary>pmie tool</primary><secondary>brief +description</secondary></indexterm>Administers the Performance Co-Pilot +inference engine (<command>pmie</command>).</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlock</command></term> +<listitem><para><indexterm id="ITch01-95"><primary>pmlock tool</primary> +</indexterm>Attempts to acquire an exclusive lock by creating a file with +a mode of 0.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmlogger_*</command></term> +<listitem><para><indexterm id="IG3137188854"><primary>pmlogger_check script</primary></indexterm><indexterm id="IG3137188855"> +<primary>pmlogger_daily script</primary></indexterm><indexterm id="IG3137188856"><primary> +pmlogger_merge script</primary></indexterm><indexterm id="IG3137188857"><primary>pmsnap +tool</primary><secondary>brief description</secondary></indexterm><indexterm id="IG3137188858"> +<primary>scripts</primary></indexterm>Allows you to create a customized +regime of administration and management for PCP archive log files. 
The <literal>pmlogger_check</literal>, <literal>pmlogger_daily</literal>, and <literal>pmlogger_merge</literal> scripts are intended for periodic execution via +the <command>cron</command> command.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmnewlog</command></term> +<listitem><para><indexterm id="ITch01-100"><primary>pmnewlog tool</primary> +</indexterm>Performs archive log rotation by stopping and restarting an +instance of <command>pmlogger</command>.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmnsadd</command></term> +<listitem><para><indexterm id="ITch01-102"><primary>pmnsadd tool</primary> +</indexterm>Adds a subtree of new names into a PMNS, as used by the components +of PCP.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmnsdel</command></term> +<listitem><para><indexterm id="ITch01-106"><primary>pmnsdel tool</primary> +</indexterm>Removes a subtree of names from a PMNS, as used by the components +of PCP.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmnsmerge</command></term> +<listitem><para><indexterm id="IG3137188859"><primary>pcp tool</primary></indexterm>Merges +multiple PMNS files together, as used by the components of PCP.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmstore</command></term> +<listitem><para><indexterm id="ITch01-112"><primary>pmstore tool</primary> +<secondary>brief description</secondary></indexterm>Reinitializes counters +or assigns new values to metrics that act as control variables. 
The command +changes the current values for the specified instances of a single performance +metric.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5188066"> + +<title>Application and Agent Development</title> +<para><indexterm id="ITch01-114"><primary>application programs</primary> +</indexterm><indexterm id="IG3137188860"><primary>PCP</primary><secondary>tool summaries +</secondary></indexterm>The following PCP tools aid the development of +new programs to consume performance data, and new agents to export performance +data within the PCP framework:</para> +<variablelist> +<varlistentry> +<term><command>chkhelp</command></term> +<listitem><para><indexterm id="ITch01-115"><primary>chkhelp tool</primary> +</indexterm>Checks the consistency of performance metrics help database +files.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>dbpmda</command></term> +<listitem><para><indexterm id="ITch01-117"><primary>dbpmda tool</primary> +</indexterm>Allows PMDA behavior to be exercised and tested. It is an +interactive debugger for PMDAs.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>newhelp</command></term> +<listitem><para><indexterm id="ITch01-119"><primary>newhelp tool</primary> +</indexterm>Generates the database files for one or more source files +of PCP help text.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmapi</command></term> +<listitem><para><indexterm id="ITch01-121"><primary>pmclient tool</primary> +</indexterm><indexterm id="IG3137188861"><primary>PMAPI</primary><secondary>brief description +</secondary></indexterm><indexterm id="IG3137188862"><primary>Performance Metrics Application +Programming Interface </primary><see>PMAPI</see></indexterm>Defines a +procedural interface for developing PCP client applications. 
It is the +Performance Metrics Application Programming Interface (PMAPI).</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmclient</literal></term> +<listitem><para><indexterm id="ITch01-123"><primary>pmclient tool</primary> +</indexterm>Is a simple client that uses the PMAPI to report some high-level +system performance metrics. </para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmda</command></term> +<listitem><para><indexterm id="ITch01-127"><primary>dbpmda tool</primary> +<see>PMDA</see></indexterm>Is a library used by many shipped PMDAs to +communicate with a <command>pmcd</command> process. It can expedite the +development of new and custom PMDAs.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmgenmap</command></term> +<listitem><para><indexterm id="ITch01-125"><primary>pmgenmap tool</primary> +</indexterm>Generates C declarations and <command>cpp(1)</command> macros +to aid the development of customized programs that use the facilities +of PCP. It is a PMDA development tool.</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +</chapter> + + + +<chapter id="LE17127-PARENT"> + +<title>Installing and Configuring Performance Co-Pilot</title> +<para><indexterm id="ITch02-0"><primary>installing PCP</primary></indexterm><indexterm id="ITch02-1"><primary>configuring PCP</primary></indexterm><indexterm id="ITch02-3"> +<primary>PCP</primary><secondary>configuring and installing</secondary></indexterm>The +sections in this chapter describe the basic installation and configuration +steps necessary to run Performance Co-Pilot (PCP) on your systems. 
The following +major sections are included:</para> +<itemizedlist> +<listitem><para><xref linkend="LE18649-PARENT"/> describes the main packages +of PCP software and how they must be installed on each system.</para> +</listitem> +<listitem><para><xref linkend="LE26146-PARENT"/>, describes the fundamentals +of maintaining the performance data collector.</para> +</listitem><listitem><para><xref linkend="LE43202-PARENT"/>, describes +the basics of installing a new Performance Metrics Domain Agent (PMDA) to +collect metric data and pass it to the PMCD.</para> +</listitem> +<listitem><para><xref linkend="LE70712-PARENT"/>, offers advice on problems +involving the PMCD.</para> +</listitem></itemizedlist> +<section id="LE18649-PARENT"> + +<title>Product Structure</title> +<para><indexterm id="IG3137188887"><primary>subsystems</primary></indexterm><indexterm id="ITch02-5"> +<primary>monitor configuration</primary></indexterm><indexterm id="IG3137188888"><primary>roles +</primary><secondary>collector</secondary></indexterm><indexterm id="IG3137188889"><primary> +roles</primary><secondary>monitor</secondary></indexterm>In a typical deployment, +Performance Co-Pilot (PCP) would be installed in a collector configuration +on one or more hosts, from which the performance information could then be +collected, and in a monitor configuration on one or more workstations, from +which the performance of the server systems could then be monitored.</para> +<para>On some platforms Performance Co-Pilot is presented as multiple packages; +typically separating the server components from graphical user interfaces and +documentation.</para> +<variablelist id="Z926620168sdc"> +<varlistentry> +<term>pcp-X.Y.Z-<replaceable>rev</replaceable></term> +<listitem><para>package for core PCP</para> +</listitem></varlistentry> +<varlistentry> +<term>pcp-gui-X.Y.Z-<replaceable>rev</replaceable></term> +<listitem><para>package for graphical PCP client tools</para> +</listitem></varlistentry> 
+<varlistentry> +<term>pcp-doc-X.Y.Z-<replaceable>rev</replaceable></term> +<listitem><para>package for online PCP documentation</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="LE26146-PARENT"> + +<title>Performance Metrics Collection Daemon (PMCD)</title> +<para><indexterm id="ITch02-20"><primary>PMCD</primary><secondary>maintenance +</secondary></indexterm><indexterm id="IG3137188890"><primary>Performance Metrics Collection +Daemon</primary><see>PMCD</see></indexterm> On each Performance Co-Pilot (PCP) +collection system, you must be certain that the <command>pmcd</command> daemon +is running. This daemon coordinates the gathering and exporting of performance +statistics in response to requests from the PCP monitoring tools.</para> +<section id="id5190406"> + +<title>Starting and Stopping the PMCD</title> +<para><indexterm id="IG3137188891"><primary>PMCD</primary><secondary>starting and stopping</secondary> +</indexterm>To start the daemon, enter the following commands as <literal>root</literal> on each PCP collection system:</para> +<literallayout class="monospaced"><userinput>chkconfig pmcd on</userinput>  +<userinput>${PCP_RC_DIR}/pmcd start</userinput> </literallayout> +<para>These commands instruct the system to start the daemon immediately, +and again whenever the system is booted. It is not necessary to start the +daemon on the monitoring system unless you wish to collect performance information +from it as well.</para> +<para>To stop <command>pmcd</command> immediately on a PCP collection system, +enter the following command:</para> +<literallayout class="monospaced"><userinput>${PCP_RC_DIR}/pmcd stop</userinput></literallayout> +</section> +<section id="LE58493-PARENT"> + +<title>Restarting an Unresponsive PMCD</title> +<para>Sometimes, if a daemon is not responding on a PCP collection system, the +problem can be resolved by stopping and then immediately restarting a fresh +instance of the daemon. 
If you need to stop and then immediately restart PMCD +on a PCP collection system, use the <literal>start</literal> argument provided +with the script in <filename>${PCP_RC_DIR}</filename>. The command syntax +is as follows:</para> +<literallayout class="monospaced"><userinput>${PCP_RC_DIR}/pmcd start</userinput> </literallayout> +<para>On startup, <command>pmcd</command> looks for a configuration file at +<filename>${PCP_PMCDCONF_PATH}</filename>. This file specifies which agents +cover which performance metrics domains and how PMCD should make contact with +the agents. A comprehensive description of the configuration file syntax and +semantics can be found in the <command>pmcd(1)</command> man page. +</para> +<para>If the configuration is changed, <command>pmcd</command> reconfigures +itself when it receives the <literal>SIGHUP</literal> signal. Use the following +command to send the <literal>SIGHUP</literal> signal to the daemon:</para> +<literallayout class="monospaced"><userinput>${PCP_BINADM_DIR}/pmsignal -a -s HUP pmcd</userinput></literallayout> +<para>This is also useful when one of the PMDAs managed by <command>pmcd</command> +has failed or has been terminated by <command>pmcd</command>. Upon receipt +of the <literal>SIGHUP</literal> signal, <command>pmcd</command> restarts +any PMDA that is configured but inactive. The exception to this rule is the +case of a PMDA which must run with superuser privileges (where possible, this +is avoided); for these PMDAs, a full <command>pmcd</command> restart must be +performed, using the process described earlier (not <literal>SIGHUP</literal>).</para> +</section> +<section id="id5190621"> + +<title>PMCD Diagnostics and Error Messages</title> +<para><indexterm id="ITch02-23"><primary>PMCD</primary><secondary>diagnostics +and error messages</secondary></indexterm>If there is a problem with <command> +pmcd</command>, the first place to investigate should be the <filename>pmcd.log</filename> file. 
+By default, this file is in the <filename>${PCP_LOG_DIR}/pmcd</filename> directory.</para> +</section> +<section id="id5190661"> + +<title>PMCD Options and Configuration Files</title> +<para><indexterm id="ITch02-26"><primary>PMCD</primary><secondary>configuration +files</secondary></indexterm>There are two files that control PMCD operation. +These are the <filename>${PCP_PMCDCONF_PATH}</filename> and <filename>${PCP_PMCDOPTIONS_PATH}</filename> files. +The <filename>pmcd.options</filename> +file contains the command line options used with PMCD; it is read +when the daemon is invoked by <literal>${PCP_RC_DIR}/pmcd</literal>. +The <filename>pmcd.conf</filename> file contains configuration information +regarding domain agents and the metrics that they monitor. +These configuration files are described in the following sections.</para> +<section id="id5190706"> + +<title>The <filename>pmcd.options</filename> File</title> +<para><indexterm id="ITch02-27"><primary>pmcd.options file</primary></indexterm>Command +line options for the PMCD are stored in the <filename>${PCP_PMCDOPTIONS_PATH}</filename> +file. The PMCD can be invoked directly from a shell prompt, or it can be invoked +by <literal>${PCP_RC_DIR}/pmcd</literal> as part of the boot process. +It is normally invoked using <literal>${PCP_RC_DIR}/pmcd</literal>, +reserving shell invocation for debugging purposes.</para> +<para>The PMCD accepts certain command line options to control its execution, +and these options are placed in the <filename>pmcd.options</filename> file +when <filename>${PCP_RC_DIR}/pmcd</filename> is being used to start the +daemon. The following options (amongst others) are available:</para> +<variablelist> +<varlistentry> +<term><literal>-i </literal> <replaceable>address</replaceable></term> +<listitem><para>For hosts with more than one network interface, this option +specifies the interface on which this instance of the PMCD accepts connections. 
+Multiple <literal>-i</literal> options may be specified. The default in the +absence of any <literal>-i</literal> option is for PMCD to accept connections +on all interfaces.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>-l </command> <replaceable>file</replaceable></term> +<listitem><para>Specifies a log file. If no <literal>-l</literal> option is +specified, the log file name is <filename>pmcd.log</filename> and it is created +in the directory <filename>${PCP_LOG_DIR}/pmcd/</filename>.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>-s </command> <replaceable>file</replaceable></term> +<listitem><para>Specifies the path to a local unix domain socket +(for platforms supporting this socket family only). +The default value is <filename>${PCP_RUN_DIR}/pmcd.socket</filename>.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>-t </command> <replaceable>seconds</replaceable></term> +<listitem><para><indexterm id="ITch02-28"><primary>PDU</primary></indexterm><indexterm id="IG3137188892"> +<primary>protocol data units</primary><see>PDU</see></indexterm>Specifies +the amount of time, in seconds, before PMCD times out on protocol data unit +(PDU) exchanges with PMDAs. If no time out is specified, the default is five +seconds. 
Setting time out to zero disables time outs (not recommended; PMDAs +should always respond quickly).</para> +<para>The time out may be dynamically modified by storing the number of seconds +into the metric <literal>pmcd.control.timeout</literal> using <command>pmstore</command>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-T </literal> <replaceable>mask</replaceable></term> +<listitem><para>Specifies whether connection and PDU tracing are turned on +for debugging purposes.</para> +</listitem></varlistentry> +</variablelist> +<para>See the <command>pmcd(1)</command> man page for complete +information on these options.</para> +<para>The default <filename>pmcd.options</filename> file shipped with PCP +is similar to the following:</para> +<literallayout class="monospaced"># command-line options to pmcd, uncomment/edit lines as required + +# longer timeout delay for slow agents +# -t 10 + +# suppress timeouts +# -t 0 + +# make log go someplace else +# -l /some/place/else + +# debugging knobs, see pmdbg(1) +# -D N +# -f + +# Restricting (further) incoming PDU size to prevent DOS attacks +# -L 16384 + +# enable event tracing bit fields +# 1 trace connections +# 2 trace PDUs +# 256 unbuffered tracing +# -T 3 + +# setting of environment variables for pmcd and +# the PCP rc scripts. See pmcd(1) and PMAPI(3). +# PMCD_WAIT_TIMEOUT=120</literallayout> +<para>The most commonly used options have been placed in this file for your +convenience. To uncomment and use an option, simply remove the pound sign +(#) at the beginning of the line with the option you wish to use. 
Restart +<command>pmcd</command> for the change to take effect; that is, as superuser, enter +the command:</para> +<literallayout class="monospaced"><userinput>${PCP_RC_DIR}/pmcd start</userinput></literallayout> +</section> +<section id="LE63226-PARENT"> + +<title>The <filename>pmcd.conf</filename> File</title> +<para><indexterm id="ITch02-29"><primary>pmcd.conf file</primary></indexterm>When +the PMCD is invoked, it reads its configuration file, which is <filename>${PCP_PMCDCONF_PATH}</filename>. +This file contains entries that specify the PMDAs used by this instance of +the PMCD and which metrics are covered by these PMDAs. Also, you may specify +access control rules in this file for the various hosts, users and groups on your network. +This file is described completely in the <command>pmcd(1)</command> man page.</para> +<para>With standard PCP operation (even if you have not created and added +your own PMDAs), you might need to edit this file in order to add any additional +access control you wish to impose. If you do not add access control rules, all access +for all operations is granted to the local host, and read-only access is granted to +remote hosts. 
The <filename>pmcd.conf</filename> file is automatically generated +during the software build process and on Linux, for example, is similar to the +following:</para> +<literallayout class="monospaced"># Performance Metrics Domain Specifications +# +# This file is automatically generated during the build +# Name Id IPC IPC Params File/Cmd +pmcd 2 dso pmcd_init ${PCP_PMDAS_DIR}/pmcd/pmda_pmcd.so +linux 60 dso linux_init ${PCP_PMDAS_DIR}/linux/pmda_linux.so +proc 3 pipe binary ${PCP_PMDAS_DIR}/proc/pmdaproc -d 3 +xfs 11 pipe binary ${PCP_PMDAS_DIR}/xfs/pmdaxfs -d 11 + +[access] +disallow * : store; +allow localhost : all;</literallayout> +<note><para>Even though PMCD does not run with <literal>root</literal> privileges, +you must be very careful not to configure PMDAs in this file if you are not +sure of their action. This is because all PMDAs are initially started as +<literal>root</literal> (allowing them to assume alternate identities, such as +<literal>postgres</literal> for example), after which <command>pmcd</command> +drops its privileges. +Pay close attention that permissions on this file are not +inadvertently downgraded to allow public write access.</para> +</note> +<para>Each entry in this configuration file contains rules that specify how +to connect the PMCD to a particular PMDA and which metrics the PMDA monitors. +A PMDA may be attached as a Dynamic Shared Object (DSO) or by using a socket +or a pair of pipes. The distinction between these attachment methods is described +below.</para> +<para>An entry in the <filename>pmcd.conf</filename> file looks like this: +</para> +<literallayout class="monospaced"><replaceable>label_name</replaceable> <replaceable>domain_number</replaceable> <replaceable>type</replaceable> <replaceable>path</replaceable></literallayout> +<para>The <replaceable>label_name</replaceable> field specifies a name for +the PMDA. 
The <replaceable>domain_number</replaceable> is an integer value +that specifies a domain of metrics for the PMDA. The <replaceable>type</replaceable> +field indicates the type of entry (DSO, socket, or pipe). The <replaceable> +path</replaceable> field is for additional information, and varies according +to the type of entry.</para> +<para>The following rules are common to DSO, socket, and pipe syntax:</para> +<variablelist> +<varlistentry> +<term><replaceable>label_name</replaceable></term> +<listitem><para>An alphanumeric string identifying the agent.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>domain_number</replaceable></term> +<listitem><para>An unsigned integer specifying the agent's domain.</para> +</listitem></varlistentry> +</variablelist> +<para>DSO entries follow this syntax:</para> +<synopsis><replaceable>label_name</replaceable> <replaceable>domain_number</replaceable> dso <replaceable>entry-point</replaceable> <replaceable>path</replaceable></synopsis> +<para>The following rules apply to the DSO syntax:</para> +<variablelist> +<varlistentry> +<term><literal>dso</literal></term> +<listitem><para>The entry type.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>entry-point</replaceable></term> +<listitem><para>The name of an initialization function called when the DSO +is loaded.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>path</replaceable></term> +<listitem><para>Designates the location of the DSO. An absolute path must be used. 
+On most platforms this will be a <literal>so</literal>-suffixed file, on Windows it +is a <literal>dll</literal>, and on Mac OS X it is a <literal>dylib</literal> file.</para> +</listitem></varlistentry> +</variablelist> +<para>Socket entries in the <filename>pmcd.conf</filename> file follow this syntax:</para> +<synopsis><replaceable>label_name</replaceable> <replaceable>domain_number</replaceable> socket <replaceable>addr_family</replaceable> <replaceable>address</replaceable> <replaceable>command</replaceable> <optional><replaceable>args</replaceable></optional> </synopsis> +<para>The following rules apply to the socket syntax:</para> +<variablelist> +<varlistentry> +<term><literal>socket</literal></term> +<listitem><para>The entry type.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>addr_family</replaceable></term> +<listitem><para>Specifies if the socket is <literal>AF_INET</literal>, <literal>AF_INET6</literal> +or <literal>AF_UNIX</literal>. If the socket is <literal>INET</literal>, the word <literal>inet</literal> +appears in this place. If the socket is <literal>IPV6</literal>, the word <literal>ipv6</literal> +appears in this place. If the socket is <literal>UNIX</literal>, +the word <literal>unix</literal> appears in this place.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>address</replaceable></term> +<listitem><para>Specifies the address of the socket. For INET or IPv6 sockets, this +is a port number or port name. For UNIX sockets, this is the name of the PMDA's +socket on the local host.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>command</replaceable></term> +<listitem><para>Specifies a command to start the PMDA when the PMCD is invoked +and reads the configuration file.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>args</replaceable></term> +<listitem><para>Optional arguments for <replaceable>command</replaceable>. 
+</para> +</listitem></varlistentry> +</variablelist> +<para>Pipe entries in the <filename>pmcd.conf</filename> file follow this syntax:</para> +<synopsis><replaceable>label_name</replaceable> <replaceable>domain_number</replaceable> pipe <replaceable>protocol</replaceable> <replaceable>command</replaceable> <optional><replaceable>args</replaceable></optional></synopsis> +<para>The following rules apply to the pipe syntax:</para> +<variablelist> +<varlistentry> +<term><literal>pipe</literal></term> +<listitem><para>The entry type.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>protocol</replaceable></term> +<listitem><para>Specifies whether a text-based or a binary PCP protocol should +be used over the pipes. Historically, this parameter could be “text” +or “binary.” The text-based protocol has long since been deprecated and +removed, however, so nowadays “binary” is the only valid value here. +</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>command</replaceable></term> +<listitem><para>Specifies a command to start the PMDA when the PMCD is invoked +and reads the configuration file.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>args</replaceable></term> +<listitem><para>Optional arguments for <replaceable>command</replaceable>. +</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5191707"> + +<title>Controlling Access to PMCD with <filename>pmcd.conf</filename></title> +<para><indexterm id="ITch02-30"><primary>pmcd.conf file</primary></indexterm>You +can add an optional section to the <filename>pmcd.conf</filename> file +to control access to performance metric data based on hosts, users and groups. 
+To add an access control section, begin by placing the following line at the end +of your <filename>pmcd.conf</filename> file:</para> +<literallayout class="monospaced">[access] </literallayout> +<para>Below this line, you can add entries of the following forms:</para> +<literallayout class="monospaced">allow hosts <replaceable>hostlist</replaceable> : <replaceable>operations</replaceable> ; disallow hosts <replaceable>hostlist</replaceable> : <replaceable>operations</replaceable> ; +allow users <replaceable>userlist</replaceable> : <replaceable>operations</replaceable> ; disallow users <replaceable>userlist</replaceable> : <replaceable>operations</replaceable> ; +allow groups <replaceable>grouplist</replaceable> : <replaceable>operations</replaceable> ; disallow groups <replaceable>grouplist</replaceable> : <replaceable>operations</replaceable> ; +</literallayout> +<para>The keywords <replaceable>users</replaceable>, <replaceable>groups</replaceable> +and <replaceable>hosts</replaceable> can be used in either plural or singular form.</para> +<para>The <replaceable>userlist</replaceable> and <replaceable>grouplist</replaceable> +fields are comma-separated lists of authenticated users and groups from the +local <filename>/etc/passwd</filename> and <filename>/etc/group</filename> files, +NIS (network information service) or LDAP (lightweight directory access protocol) service.</para> +<para>The <replaceable>hostlist</replaceable> is a comma-separated list of +host identifiers; the following rules apply:</para> +<itemizedlist> +<listitem><para>Host names must be in the local system's <filename>/etc/hosts</filename> +file or known to the local DNS (domain name service).</para> +</listitem> +<listitem><para>IP and IPv6 addresses may be given in the usual numeric notations.</para> +</listitem> +<listitem><para>A wildcarded IP or IPv6 address may be used to specify groups of hosts, +with the single wildcard character * as the last-given component of the address. 
+The wildcard .* refers to all IP (IPv4) addresses. +The wildcard :* refers to all IPv6 addresses. +If an IPv6 wildcard contains a :: component, then the final * refers to the final 16 bits of +the address only, otherwise it refers to the remaining unspecified bits of the address.</para> +</listitem> +</itemizedlist> +<para>The wildcard ``*'' refers to all users, groups or host addresses. +Names of users, groups or hosts may not be wildcarded.</para> +<para>For example, the following <replaceable>hostlist</replaceable> entries +are all valid:</para> +<literallayout class="monospaced">babylon +babylon.acme.com +123.101.27.44 +localhost +155.116.24.* +192.* +.* +fe80::223:14ff:feaf:b62c +fe80::223:14ff:feaf:* +fe80:* +:* +*</literallayout> +<para>The <replaceable>operations</replaceable> field can be any of the following: +</para> +<itemizedlist> +<listitem><para>A comma-separated list of the operation types described below. +</para> +</listitem> +<listitem><para>The word <firstterm>all</firstterm> to allow or disallow all +operations as specified in the first field.</para> +</listitem> +<listitem><para>The words <firstterm>all except</firstterm> and a list of +operations. This entry allows or disallows all operations as specified in +the first field except those listed.</para> +</listitem> +<listitem><para>The phrase <firstterm>maximum</firstterm> N <firstterm>connections</firstterm> +to set an upper bound (N) on the number of connections an individual host, user or +group of users may make. This can only be added to the <replaceable>operations</replaceable> +list of an allow statement.</para> +</listitem> +</itemizedlist> +<para>The operations that can be allowed or disallowed are as follows:</para> +<variablelist condition="sgi_termlength:narrow"> +<varlistentry> +<term><literal>fetch</literal></term> +<listitem><para>Allows retrieval of information from the PMCD. 
This may be +information about a metric (such as a description, instance domain, or help +text) or an actual value for a metric.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>store</literal></term> +<listitem><para>Allows the PMCD to store metric values in PMDAs that permit +store operations. Be cautious in allowing this operation, because it may be +a security opening in large networks, although the PMDAs shipped with the +PCP package typically reject store operations, except for selected performance +metrics where the effect is benign.</para> +</listitem></varlistentry> +</variablelist> +<para>For example, here is a sample access control portion of a +<filename>${PCP_PMCDCONF_PATH}</filename> file:</para> +<literallayout class="monospaced">allow hosts babylon, moomba : all ; +disallow user sam : all ; +allow group dev : fetch ; +allow hosts 192.127.4.* : fetch ; +disallow host gate-inet : store ; </literallayout> +<para>Complete information on access control syntax rules in the +<filename>pmcd.conf</filename> file can be found in the <command>pmcd(1)</command> +man page.</para> +</section> +</section> +</section> +<section id="LE43202-PARENT"> + +<title>Managing Optional PMDAs</title> +<para><indexterm id="ITch02-31"><primary>PMDA</primary><secondary>managing +optional agents</secondary></indexterm>Some Performance Metrics Domain Agents +(PMDAs) shipped with Performance Co-Pilot (PCP) are designed to be installed +and activated on every collector host, for example, <literal>linux</literal>, +<literal>windows</literal>, <literal>darwin</literal>, <literal>pmcd</literal>, +and <literal>proc</literal> PMDAs.</para> +<para>Other PMDAs are designed for optional activation and require some user +action to make them operational. In some cases these PMDAs expect local site +customization to reflect the operational environment, the system configuration, +or the production workload. 
This customization is typically supported by interactive +installation scripts for each PMDA.</para> +<para>Each PMDA has its own directory located below <filename>${PCP_PMDAS_DIR}</filename>. +Each directory contains a <filename>Remove</filename> script to unconfigure +the PMDA, remove the associated metrics from the PMNS, and restart the <command> +pmcd</command> daemon; and an <filename>Install</filename> script to install +the PMDA, update the PMNS, and restart the <command>pmcd</command> daemon.</para> +<para>As a shortcut mechanism to support automated PMDA installation, a file named +<filename>.NeedInstall</filename> can be created in a PMDA directory below +<filename>${PCP_PMDAS_DIR}</filename>. The next restart of PCP services will +invoke that PMDA's installation automatically, with default options taken. +</para> +<section id="LE31599-PARENT"> + +<title>PMDA Installation on a PCP Collector Host</title> +<para><indexterm id="ITch02-32"><primary>PMDA</primary><secondary>installation +</secondary></indexterm>To install a PMDA, you must perform a collector installation +for each host on which the PMDA is required to export performance metrics. +PCP provides a distributed metric namespace (PMNS) and metadata, so it is not necessary +to install PMDAs (with their associated PMNS) on PCP monitor hosts.</para> +<para><indexterm id="ITch02-33"><primary>collector hosts</primary></indexterm> +You need to update the PMNS, configure the PMDA, and notify PMCD. +The <literal>Install</literal> script for each PMDA automates these operations, as follows: +</para> +<orderedlist><listitem><para>Log in as <literal>root</literal> (the superuser). 
+</para> +</listitem><listitem><para>Change to the PMDA's directory as shown in the following +example:</para> +<literallayout class="monospaced"><userinput>cd ${PCP_PMDAS_DIR}/cisco</userinput></literallayout> +</listitem><listitem><para><indexterm id="IG3137188893"><primary>PMD</primary></indexterm><indexterm id="IG3137188894"> +<primary>Performance Metrics Domain</primary><see>PMD</see></indexterm>In +the unlikely event that you wish to use a non-default Performance Metrics +Domain (PMD) assignment, determine the current PMD assignment:</para> +<literallayout class="monospaced"><userinput>cat domain.h</userinput></literallayout> +<para><indexterm id="IG3137188895"><primary>${PCP_VAR_DIR}/pmns/stdpmid file</primary></indexterm><indexterm id="IG3137188896"> +<primary>${PCP_PMCDCONF_PATH} file</primary></indexterm>Check that +the new PMD does not conflict with the PMDs defined in <filename>${PCP_VAR_DIR}/pmns/stdpmid</filename> +or with those of the other PMDAs currently in use (listed in <filename>${PCP_PMCDCONF_PATH}</filename>). +Edit <filename>domain.h</filename> to assign the new domain number if there +is a conflict (this is highly unlikely to occur in a regular PCP installation).</para> +</listitem><listitem><para>Enter the following command:</para> +<literallayout class="monospaced"><userinput>./Install</userinput></literallayout> +<para>You may be prompted to enter some local parameters or configuration +options. The script applies all required changes to the control files and +to the PMNS, and then notifies PMCD. <xref linkend="Z929138022sdc"/> illustrates +the interaction:</para> +<example id="Z929138022sdc"> +<title>PMDA Installation Output +</title> +<literallayout class="monospaced">You will need to choose an appropriate configuration for +installation of the “cisco” Performance Metrics Domain Agent (PMDA). 
+ + collector collect performance statistics on this system + monitor allow this system to monitor local and/or remote systems + both collector and monitor configuration for this system + +Please enter c(ollector) or m(onitor) or b(oth) [b] <userinput>collector</userinput> + +Cisco hostname or IP address? [return to quit] <userinput>wanmelb</userinput> + +A user-level password may be required for Cisco “show int” command. + If you are unsure, try the command + $ telnet wanmelb + and if the prompt “Password:” appears, a user-level password is + required; otherwise answer the next question with an empty line. + +User-level Cisco password? <userinput>********</userinput> +Probing Cisco for list of interfaces ... + +Enter interfaces to monitor, one per line in the format +tX where “t” is a type and one of “e” (Ethernet), or “f” (Fddi), or +“s” (Serial), or “a” (ATM), and “X” is an interface identifier +which is either an integer (e.g. 4000 Series routers) or two +integers separated by a slash (e.g. 7000 Series routers). + +The currently unselected interfaces for the Cisco “wanmelb” are: + e0 s0 s1 +Enter “quit” to terminate the interface selection process. +Interface? [e0] <userinput>s0</userinput> + +The currently unselected interfaces for the Cisco “wanmelb” are: + e0 s1 +Enter “quit” to terminate the interface selection process. +Interface? [e0] <userinput>s1</userinput> + +The currently unselected interfaces for the Cisco “wanmelb” are: + e0 +Enter “quit” to terminate the interface selection process. +Interface? [e0] <userinput>quit</userinput> + +Cisco hostname or IP address? [return to quit] +Updating the Performance Metrics Name Space (PMNS) ... +Installing pmchart view(s) ... +Terminate PMDA if already installed ... +Installing files ... +Updating the PMCD control file, and notifying PMCD ... +Check cisco metrics have appeared ... 
5 metrics and 10 values</literallayout> +</example> +</listitem></orderedlist> +</section> +<section id="id5192380"> + +<title>PMDA Removal on a PCP Collector Host</title> +<para><indexterm id="ITch02-35"><primary>PMDA</primary><secondary>removal +</secondary></indexterm>To remove a PMDA, you must perform a collector removal +for each host on which the PMDA is currently installed.</para> +<para>The PMNS needs to be updated, the PMDA unconfigured, and PMCD notified. +The <literal>Remove</literal> script for each PMDA automates these operations, +as follows:</para> +<orderedlist><listitem><para>Log in as <literal>root</literal> (the superuser). +</para> +</listitem><listitem><para>Change to the PMDA's directory as shown in the following +example:</para> +<literallayout class="monospaced"><userinput>cd ${PCP_PMDAS_DIR}/elasticsearch</userinput></literallayout> +</listitem><listitem><para>Enter the following command:</para> +<literallayout class="monospaced"><userinput>./Remove</userinput></literallayout> +<para>The following output illustrates the result:</para> +<literallayout class="monospaced">Culling the Performance Metrics Name Space ... +elasticsearch ... done +Updating the PMCD control file, and notifying PMCD ... +Removing files ... +Check elasticsearch metrics have gone away ... 
OK</literallayout> +</listitem></orderedlist> +</section> +</section> +<section id="LE70712-PARENT"> + +<title>Troubleshooting</title> +<para><indexterm id="ITch02-38"><primary>troubleshooting</primary><secondary> +PMCD</secondary></indexterm>The following sections offer troubleshooting advice +on the Performance Metrics Name Space (PMNS), missing and incomplete values +for performance metrics, kernel metrics and the PMCD.</para> +<para>Advice for troubleshooting the archive logging system is provided in <xref linkend="LE93354-PARENT"/>.</para> +<section id="LE97133-PARENT"> + +<title>Performance Metrics Name Space</title> +<para><indexterm id="IG3137188897"><primary>pminfo tool</primary><secondary>displaying the PMNS +</secondary></indexterm><indexterm id="IG3137188898"><primary>PMNS</primary><secondary>troubleshooting +</secondary></indexterm>To display the active PMNS, use the <literal>pminfo</literal> +command; see the <command>pminfo(1)</command> man page.</para> +<para>The PMNS at the collector host is updated whenever a PMDA is installed +or removed, and may also be updated when new versions of PCP are +installed. During these operations, the PMNS is typically updated by merging the +(plaintext) namespace components from each installed PMDA. 
+These separate PMNS components reside in the <filename>${PCP_VAR_DIR}/pmns</filename> +directory and are merged into the <filename>root</filename> file there.</para> +</section> +<section id="LE90170-PARENT"> + +<title>Missing and Incomplete Values for Performance Metrics +</title> +<para><indexterm id="ITch02-39"><primary>performance metrics</primary><secondary> +missing and incomplete values</secondary></indexterm>Missing or incomplete +performance metric values are the result of their unavailability.</para> +<section id="LE89271-PARENT"> + +<title>Metric Values Not Available</title> +<para>The following symptom has a known cause and resolution:</para> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para>Values for some or all of the instances of a performance metric +are not available.</para> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>This can occur as a consequence of changes in the installation +of modules (for example, a DBMS or an application package) that provide the +performance instrumentation underpinning the PMDAs. Changes in the selection +of modules that are installed or operational, along with changes in the version +of these modules, may make metrics appear and disappear over time.</para> +<para>In simple terms, the PMNS contains a metric name, but when that metric +is requested, no PMDA at the collector host supports the metric.</para> +<para>For archive logs, the collection of metrics to be logged is a subset +of the metrics available, so utilities replaying from a PCP archive log may +not have access to all of the metrics available from a live (PMCD) source. +</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>Make sure the underlying instrumentation is available and +the module is active. Ensure that the PMDA is running on the host to be monitored. 
+If necessary, create a new archive log with a wider range of metrics to be +logged.</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +<section id="LE76751-PARENT"> + +<title>Kernel Metrics and the PMCD</title> +<para><indexterm id="IG3137188899"><primary>troubleshooting</primary><secondary>Kernel metrics +</secondary></indexterm><indexterm id="IG31371888100"><primary>troubleshooting</primary><secondary> +PMCD</secondary></indexterm>The following issues involve the kernel metrics and the PMCD:</para> +<itemizedlist> +<listitem><para>Cannot connect to remote PMCD</para> +</listitem> +<listitem><para>PMCD not reconfiguring after hang-up</para> +</listitem> +<listitem><para>PMCD does not start</para> +</listitem></itemizedlist> +<section id="id5192807"> + +<title>Cannot Connect to Remote PMCD</title> +<para><indexterm id="IG31371888101"><primary>PMCD</primary><secondary>remote connection</secondary> +</indexterm><indexterm id="ITch02-42"><primary>troubleshooting</primary><secondary> +general utilities</secondary></indexterm>The following symptom has a known +cause and resolution:</para> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para><indexterm id="IG31371888102"><primary>pmchart tool</primary><secondary>remote +PMCD</secondary></indexterm><indexterm id="IG31371888103"><primary>pmie tool</primary><secondary> +remote PMCD</secondary></indexterm><indexterm id="IG31371888104"><primary>pmlogger tool</primary> +<secondary>remote PMCD</secondary></indexterm>A PCP client tool (such as <literal>pmchart</literal>, <literal>pmie</literal>, or <literal>pmlogger</literal>) +complains that it is unable to connect to a remote PMCD (or establish a PMAPI +context), but you are sure that PMCD is active on the remote host.</para> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para><indexterm id="IG31371888105"><primary>TCP/IP</primary><secondary>remote PMCD +</secondary></indexterm>To avoid hanging 
applications for the duration of TCP/IP +timeouts, the PMAPI library implements its own timeout when trying to establish +a connection to a PMCD. If the connection to the host is over a slow network, +then successful establishment of the connection may not be possible before +the timeout, and the attempt is abandoned.</para> +<para>Alternatively, there may be a firewall between the client tool and PMCD that is blocking the connection attempt.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>Establish that the PMCD on <replaceable>far-away-host</replaceable> +is really alive by connecting to its control port (TCP port number 44321 by +default):<literallayout class="monospaced"><userinput>telnet far-away-host 44321</userinput></literallayout></para> +<para>This response indicates the PMCD is not running and needs restarting: +</para> +<literallayout class="monospaced">Unable to connect to remote host: Connection refused</literallayout> +<para>To restart the PMCD on that host, enter the following command:<literallayout class="monospaced"><userinput>${PCP_RC_DIR}/pmcd start</userinput></literallayout></para> +<para>This response indicates the PMCD is running:<literallayout class="monospaced">Connected to far-away-host </literallayout></para> +<para><indexterm id="IG31371888106"><primary>PMCD_CONNECT_TIMEOUT variable</primary></indexterm>Interrupt +the <literal>telnet</literal> session, increase the PMAPI timeout by setting +the <literal>PMCD_CONNECT_TIMEOUT</literal> environment variable to some number +of seconds (60 for instance), and try the PCP client tool again.</para> +<para>If these techniques are ineffective, it is likely that an intermediary firewall is blocking the client from accessing the PMCD port. Resolving such issues is specific to the firewall platform and cannot practically be covered here.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5193049"> + +<title>PMCD Not 
Reconfiguring after <literal>SIGHUP</literal></title> +<para><indexterm id="IG31371888107"><primary>SIGHUP signal</primary></indexterm>The following +symptom has a known cause and resolution:</para> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para>PMCD does not reconfigure itself after receiving the <literal>SIGHUP</literal> signal.</para> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para><indexterm id="IG31371888108"><primary>pmcd.conf file</primary> +</indexterm>If there is a syntax error in <filename>${PCP_PMCDCONF_PATH}</filename>, +PMCD does not use the contents of the file. This can lead to +situations in which the configuration file and PMCD's internal state do not +agree.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>Always monitor PMCD's log. For example, use the following +command in another window when reconfiguring PMCD, to watch for errors as they occur:<literallayout class="monospaced"><userinput>tail -f ${PCP_LOG_DIR}/pmcd/pmcd.log</userinput></literallayout></para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5193138"> + +<title>PMCD Does Not Start</title> +<para><indexterm id="IG31371888109"><primary>PMCD</primary><secondary>not starting</secondary> +</indexterm>The following symptom has a known cause and resolution:</para> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><indexterm id="IG31371888110"><primary>${PCP_LOG_DIR}/pmcd/pmcd.log file</primary></indexterm> +<para>If the following messages appear in the PMCD log +(<filename>${PCP_LOG_DIR}/pmcd/pmcd.log</filename>), consider the cause +and resolution:</para> +<literallayout class="monospaced">pcp[27020] Error: OpenRequestSocket(44321) bind: Address already in +use +pcp[27020] Error: pmcd is already running +pcp[27020] Error: pmcd not started due to errors!</literallayout> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>PMCD is 
already running or was terminated before it could +clean up properly. The error occurs because the socket it advertises for client +connections is already being used or has not been cleared by the kernel.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>Start PMCD as <literal>root</literal> (superuser) by typing:<literallayout class="monospaced"><userinput>${PCP_RC_DIR}/pmcd start</userinput></literallayout></para> +<para>Any existing PMCD is shut down, and a new one is started in such a way +that the symptomatic message should not appear.</para> +<para>If you are starting PMCD this way and the symptomatic message appears, +a problem has occurred with the connection to one of the deceased PMCD's clients. +</para> +<para>This could happen when the network connection to a remote client is +lost and PMCD is subsequently terminated. The system may attempt to keep the +socket open for a time to allow the remote client a chance to reestablish +the connection and read any outstanding data.</para> +<para><indexterm id="IG31371888111"><primary>netstat command</primary></indexterm>The only solution +in these circumstances is to wait until the socket times out and the kernel +deletes it. This <command>netstat</command> command displays the status of +the socket and any connections:<literallayout class="monospaced"><userinput>netstat -ant | grep 44321</userinput></literallayout></para> +<para>If the socket is in the <literal>FIN_WAIT</literal> or <literal>TIME_WAIT</literal> state, then you must wait for it to be deleted. Once the command +above produces no output, PMCD may be restarted. 
Less commonly, you may have +another program running on your system that uses the same Internet port number +(44321) that PMCD uses.</para> +<para><indexterm id="IG31371888112"><primary>PCPIntro command</primary></indexterm><indexterm id="IG31371888113"> +<primary>PMCD_PORT variable</primary></indexterm>Refer to the <command>PCPIntro(1)</command> +man page for a description of how to override the default +PMCD port assignment using the <literal>PMCD_PORT</literal> environment variable. +</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +</section> +</chapter> + + + +<chapter id="LE94335-PARENT"> + +<title>Common Conventions and Arguments</title> +<para><indexterm id="ITch03-0"><primary>PCP</primary><secondary>conventions +</secondary></indexterm> <indexterm id="ITch03-1"><primary>conventions +</primary></indexterm><indexterm id="Z1033415461tls"><primary>user interface +components</primary></indexterm> This chapter deals with the user interface +components that are common to most text-based utilities that make up the +monitor portion of Performance Co-Pilot (PCP). These are the major sections +in this chapter:</para> +<itemizedlist> +<listitem><para><xref linkend="LE85600-PARENT"/>, details some basic standards +used in the development of PCP tools.</para> +</listitem> +<listitem><para><xref linkend="LE68596-PARENT"/>, details other options +to use with PCP tools.</para> +</listitem> +<listitem><para><xref linkend="LE76997-PARENT"/>, describes the time control +dialog and time-related command line options available for use with PCP +tools.</para> +</listitem> +<listitem><para><xref linkend="LE61303-PARENT"/>, describes the environment +variables supported by PCP tools.</para> +</listitem> +<listitem><para><xref linkend="LE12082-PARENT"/>, describes how to execute +PCP tools that must retrieve performance data from the Performance Metrics +Collection Daemon (PMCD) on the other side of a TCP/IP security firewall. 
+</para> +</listitem> +<listitem><para><xref linkend="LE17322-PARENT"/>, covers some uncommon +scenarios that may compromise performance metric integrity over the short +term.</para> +</listitem></itemizedlist> +<para><indexterm id="ITch03-2"><primary>PCP</primary><secondary>naming +conventions</secondary></indexterm>Many of the utilities provided with +PCP conform to a common set of naming and syntactic conventions for command +line arguments and options. This section outlines these conventions and +their meaning. The options can generally be assumed to be honored by +all utilities supporting the corresponding functionality.</para> +<para>In all cases, the man page for each utility fully describes the +supported command arguments and options.</para> +<para><indexterm id="IG31371888114"><primary>pmrun tool</primary></indexterm>Command line +options are also relevant when starting PCP applications from the desktop +using the <keycap>Alt</keycap> double-click method. This technique launches +the <command>pmrun</command> program to collect additional arguments to +pass along when starting a PCP application.</para> +<section id="LE85600-PARENT"> + +<title>Alternate Metrics Source Options</title> +<para>The default source of performance metrics is PMCD on the local +host. This default <command>pmcd</command> connection is made using +a Unix domain socket if the platform supports one; otherwise, a localhost +Inet socket connection is made. 
+This section describes how to obtain metrics from sources other +than this default.</para> +<section id="id5193612"> + +<title>Fetching Metrics from Another Host</title> +<para><indexterm id="ITch03-3"><primary>fetching metrics</primary></indexterm><indexterm id="IG31371888115"> +<primary>pmchart tool</primary><secondary>fetching metrics</secondary> +</indexterm><indexterm id="IG31371888116"><primary>pmie tool</primary><secondary>fetching +metrics</secondary></indexterm>The option <literal>-h</literal> <replaceable>host</replaceable> +directs any PCP utility (such as <literal>pmchart</literal> +or <literal>pmie</literal>) to make a connection with the PMCD instance +running on <replaceable>host</replaceable>. Once established, this connection +serves as the principal real-time source of performance metrics and metadata. +The <replaceable>host</replaceable> specification may be more than a simple host +name or address - it can also contain decorations specifying protocol type (secure +or not), authentication information, and other connection attributes. 
+Refer to the <command>PCPIntro(1)</command> man page for full details of these, +and examples of use of these specifications can also be found in the +<citetitle>PCP Tutorials and Case Studies</citetitle> companion document.</para> +</section> +<section id="id5193712"> + +<title>Fetching Metrics from an Archive Log</title> +<para><indexterm id="ITch03-6"><primary>fetching metrics</primary></indexterm><indexterm id="ITch03-5"><primary>archive logs</primary><secondary>fetching metrics +</secondary></indexterm><indexterm id="ITch03-8"><primary>PCP</primary> +<secondary>log file option</secondary></indexterm>The option <literal>-a</literal> +<replaceable>archive</replaceable> directs the utility to +treat the PCP archive logs with base name <replaceable>archive</replaceable> +as the principal source of performance metrics and metadata.</para> +<para><indexterm id="IG31371888117"><primary>pmlogger tool</primary><secondary>archive logs +</secondary></indexterm>PCP archive logs are created with <command>pmlogger</command>. +Most PCP utilities operate with equal facility for performance +information coming from either a real-time feed via PMCD on some host, +or for historical data from a PCP archive log. 
For more information on +archive logs and their use, see <xref linkend="LE93354-PARENT"/>.</para> +<para><indexterm id="ITch03-9"><primary>archive logs</primary><secondary> +physical filenames</secondary></indexterm>The base name (<filename>archive</filename>) +of the PCP archive log used with the <literal>-a</literal> +option implies the existence of the files created automatically by <command>pmlogger</command>, +as listed in <xref linkend="id5193871"/>.</para> +<table id="id5193871" frame="topbot"> + +<title>Physical Filenames for Components of a PCP Archive Log</title> +<tgroup cols="2" colsep="0" rowsep="0"> +<colspec colwidth="100*"/> +<colspec colwidth="296*"/> +<thead> +<row rowsep="1" valign="top"><entry align="left" valign="bottom"><para>Filename</para></entry> +<entry align="left" valign="bottom"><para>Contents</para></entry></row> +</thead> +<tbody> +<row valign="top"> +<entry align="left" valign="top"><para><filename>archive.</filename><replaceable> +index</replaceable></para></entry> +<entry align="left" valign="top"><para>Temporal index for rapid access +to archive contents</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><filename>archive.</filename><replaceable> +meta</replaceable></para></entry> +<entry align="left" valign="top"><para>Metadata descriptions for performance +metrics and instance domains appearing in the archive</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><filename>archive.N</filename></para></entry> +<entry align="left" valign="top"><para>Volumes of performance metric +values, for <filename>N</filename> = 0,1,2,...</para></entry></row></tbody> +</tgroup></table> +<para>Some tools are able to concurrently process multiple PCP archive +logs (for example, for retrospective analysis of performance across multiple +hosts), and accept either multiple <command>-a</command> options or a +comma-separated list of archive names following the 
<command>-a</command> +option.</para> +<note><para>The <command>-h</command> and <command>-a</command> options +are almost always mutually exclusive. Currently, <command>pmchart</command> +is the exception to this rule but other tools may continue to blur this +line in the future.</para> +</note> +</section> +</section> +<section id="LE68596-PARENT"> + +<title>General PCP Tool Options</title> +<para><indexterm id="IG31371888118"><primary>tool options</primary></indexterm>The following +sections provide information relevant to most of the PCP tools. It is +presented here in a single place for convenience.</para> +<section id="id5194103"> + +<title>Common Directories and File Locations</title> +<para><indexterm id="IG31371888119"><primary>common directories</primary></indexterm><indexterm id="IG31371888120"> +<primary>file locations</primary></indexterm>The following files and directories +are used by the PCP tools as repositories for option and configuration +files and for binaries:</para> +<variablelist condition="sgi_termlength:wide"> +<varlistentry> +<term><filename>${PCP_DIR}/etc/pcp.env</filename></term> +<listitem><para><indexterm id="IG31371888121"><primary>${PCP_DIR}/etc/pcp.env file</primary></indexterm>Script +to set PCP run-time environment variables.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_DIR}/etc/pcp.conf</filename></term> +<listitem><para><indexterm id="IG31371888122"><primary>${PCP_DIR}/etc/pcp.conf file</primary></indexterm>PCP +configuration and environment file.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_PMCDCONF_PATH}</filename></term> +<listitem><para><indexterm id="IG31371888123"><primary>${PCP_PMCDCONF_PATH} file</primary></indexterm> +<indexterm id="IG31371888124"><primary>PMCD</primary><secondary>${PCP_PMCDCONF_PATH} file +</secondary></indexterm>Configuration file for Performance Metrics +Collection Daemon (PMCD). 
Sets environment variables, including <literal>PATH</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_BINADM_DIR}/pmcd</filename></term> +<listitem><para><indexterm id="IG31371888125"><primary>${PCP_BINADM_DIR}/pmcd file</primary> +</indexterm>The PMCD binary.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_PMCDOPTIONS_PATH}</filename></term> +<listitem><para><indexterm id="IG31371888126"><primary>${PCP_PMCDOPTIONS_PATH} file +</primary></indexterm>Command line options for PMCD.<?sgi-newline?> +</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_RC_DIR}/pmcd</filename></term> +<listitem><para><indexterm id="IG31371888127"><primary>${PCP_RC_DIR}/pmcd file</primary> +</indexterm>The PMCD startup script.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_BIN_DIR}/<replaceable>pcptool</replaceable></filename></term> +<listitem><para>Directory containing PCP tools such as <command>pmstat +</command>, <command>pminfo</command>, <command>pmlogger</command>, <command>pmlogsummary</command>, +<command>pmchart</command>, <command>pmie</command>, and so on.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_SHARE_DIR}</filename></term> +<listitem><para>Directory containing shareable PCP-specific files and +repository directories such as <command>bin</command>, <command>demos</command>, +<command>examples</command> and <command>lib</command>. 
</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}</filename></term> +<listitem><para>Directory containing non-shareable (that is, per-host) +PCP-specific files and repository directories.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_BINADM_DIR}/<replaceable>pcptool</replaceable></filename></term> +<listitem><para>PCP tools that are typically not executed directly by +the end user, such as <command>pmcd_wait</command>.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_SHARE_DIR}/lib/<replaceable>pcplib</replaceable></filename></term> +<listitem><para>Miscellaneous PCP libraries and executables.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_PMDAS_DIR}</filename></term> +<listitem><para>Performance Metrics Domain Agents (PMDAs), one directory +per PMDA.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}/config</filename></term> +<listitem><para>Configuration files for PCP tools, typically with one +directory per tool.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_DEMOS_DIR}</filename></term> +<listitem><para>Demonstration data files and example programs.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_LOG_DIR}</filename></term> +<listitem><para>By default, diagnostic and trace log files generated by +PMCD and PMDAs. 
Also, the PCP archive logs are managed in one directory +per logged host below here.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}/pmns</filename></term> +<listitem><para>Files and scripts for the Performance Metrics Name Space +(PMNS).</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5194616"> + +<title>Alternate Performance Metric Name Spaces</title> +<para><indexterm id="ITch03-10"><primary>PMNS</primary><secondary>PMNS +</secondary></indexterm>The Performance Metrics Name Space (PMNS) defines +a mapping from a collection of human-readable names for performance metrics +(convenient to the user) into corresponding internal identifiers (convenient +for the underlying implementation).</para> +<para>The distributed PMNS used in PCP avoids most requirements for an +alternate PMNS, because clients' PMNS operations are supported at the +Performance Metrics Collection Daemon (PMCD) or by means of PMNS data +in a PCP archive log. The distributed PMNS is the default, but alternates +may be specified using the <command>-n</command> <replaceable>namespace</replaceable> +argument to the PCP tools. When a PMNS is maintained on a host, it is likely +to reside in the <filename>${PCP_VAR_DIR}/pmns</filename> directory.</para> +</section> +</section> +<section id="LE76997-PARENT"> + +<title>Time Duration and Control</title> +<para><indexterm id="IG31371888129"><primary>time duration</primary></indexterm>The periodic +nature of sampling performance metrics and refreshing the displays of +the PCP tools makes specification and control of the temporal domain a +common operation. 
In the following sections, the services and conventions +for specifying time positions and intervals are described.</para> +<section id="LE96583-PARENT"> + +<title>Performance Monitor Reporting Frequency and +Duration</title> +<para><indexterm id="IG31371888130"><primary>reporting frequency</primary></indexterm><indexterm id="IG31371888131"> +<primary>duration</primary></indexterm>Many of the performance monitoring +utilities have periodic reporting patterns. The <literal>-t </literal> <replaceable>interval</replaceable> +and <literal>-s</literal> <replaceable>samples</replaceable> options +are used to control the sampling (reporting) interval, +usually expressed as a real number of seconds (<replaceable>interval</replaceable>), +and the number of <replaceable>samples</replaceable> to be reported, respectively. +In the absence of the <literal>-s</literal> flag, the default behavior +is for the performance monitoring utilities to run until they are explicitly +stopped.</para> +<para><indexterm id="IG31371888132"><primary>PCPIntro command</primary></indexterm>The <replaceable> +interval</replaceable> argument may also be expressed in terms of minutes, +hours, or days, as described in the <command>PCPIntro(1)</command> +man page.</para> +</section> +<section id="LE14729-PARENT"> + +<title>Time Window Options</title> +<para><indexterm id="IG31371888133"><primary>time window options</primary></indexterm><indexterm id="IG31371888134"> +<primary>window options</primary></indexterm>The following options may +be used with most PCP tools (typically when the source of the performance +metrics is a PCP archive log) to tailor the beginning and end points of +a display, the sample origin, and the sample time alignment to your convenience. 
+</para> +<para>The <literal>-S</literal>, <literal>-T</literal>, <literal>-O</literal> +and <literal>-A</literal> command line options are used by PCP applications +to define a time window of interest.</para> +<variablelist condition="sgi_termlength:standard"> +<varlistentry> +<term><literal>-S </literal> <replaceable>duration</replaceable></term> +<listitem><para>The start option may be used to request that the display +start at the nominated time. By default, the first sample of performance +data is retrieved immediately in real-time mode, or coincides with the +first sample of data in a PCP archive log in archive mode. For archive +mode, the <literal>-S</literal> option may be used to specify a later +time for the start of sampling. By default, if <replaceable>duration</replaceable> +is an integer, the units are assumed to be seconds.</para> +<para>To specify an offset from the beginning of a PCP archive (in archive +mode) simply specify the offset as the <replaceable>duration</replaceable>. +For example, the following entry retrieves the first sample of data at +exactly 30 minutes from the beginning of a PCP archive.</para> +<para><literallayout class="monospaced">-S 30min </literallayout></para> +<para>To specify an offset from the end of a PCP archive, prefix the +<replaceable>duration</replaceable> with a minus sign. In this case, the first sample +time precedes the end of archived data by the given <replaceable>duration</replaceable>. +For example, the following entry retrieves the first sample +exactly one hour preceding the last sample in a PCP archive.</para> +<para><literallayout class="monospaced">-S -1hour </literallayout></para> +<para>To specify the calendar date and time (local time in the reporting +timezone) for the first sample, use the <literal>ctime(3)</literal> syntax +preceded by an “at” sign (@). 
For example, the following entry +specifies the date and time to be used.</para> +<para><literallayout class="monospaced">-S '@ Mon Mar 4 13:07:47 2013' </literallayout></para> +<para>Note that this format corresponds to the output format of the +<command>date</command> command for easy “cut and paste.” However, +be sure to enclose the string in quotes so it is preserved as a single +argument for the PCP tool.</para> +<para>For more complete information on the date and time syntax, see the <command>PCPIntro(1)</command> man page.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-T </literal> <replaceable>duration</replaceable> </term> +<listitem><para>The terminate option may be used to request that the display +stop at the time designated by <replaceable>duration</replaceable>. By +default, the PCP tools keep sampling performance data indefinitely (in +real-time mode) or until the end of a PCP archive (in archive mode). The <literal>-T</literal> option may be used to specify an earlier time to terminate +sampling.</para> +<para>The interpretation for the <replaceable>duration</replaceable> argument +in a <literal>-T</literal> option is the same as for the <literal>-S</literal> +option, except for an unsigned time interval that is interpreted as being +an offset from the start of the time window as defined by the default +(now for real time, else start of archive) or by a <literal>-S</literal> +option. For example, these options define a time window that spans 45 +minutes, after an initial offset (or delay) of 1 hour:<literallayout class="monospaced">-S 1hour -T 45mins</literallayout></para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-O </literal> <replaceable>duration</replaceable></term> +<listitem><para>By default, samples are fetched from the start time (see +the description of the <literal>-S</literal> option) to the terminate +time (see the description of the <literal>-T</literal> option). 
The offset <literal>-O</literal> option allows the specification of a time between the start +time and the terminate time where the tool should position its initial +sample time. This option is useful when initial attention is focused at +some point within a larger time window of interest, or when one PCP tool +wishes to launch another PCP tool with a common current point of time +within a shared time window.</para> +<para>The <replaceable>duration</replaceable> argument accepted by <literal>-O</literal> conforms to the same syntax and semantics as the <replaceable> +duration</replaceable> argument for <literal>-T</literal>. For example, +these options specify that the initial position should be the end of the +time window:<literallayout class="monospaced">-O -0 </literallayout></para> +<para>This is most useful with the <command>pmchart</command> command +to display the tail-end of the history up to the end of the time window. +</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-A </literal> <replaceable>alignment</replaceable></term> +<listitem><para>By default, performance data samples do not necessarily +happen at any natural unit of measured time. The <literal>-A</literal> +switch may be used to force the initial sample to be on the specified +<replaceable>alignment</replaceable>. For example, these three options specify alignment +on seconds, half hours, and whole hours:<literallayout class="monospaced">-A 1sec +-A 30min +-A 1hour</literallayout></para> +<para>The <literal>-A</literal> option advances the time to achieve the +desired alignment as soon as possible after the start of the time window, +whether this is the default window, or one specified with some combination +of <literal>-A</literal> and <literal>-O</literal> command line options. 
+</para> +</listitem></varlistentry> +</variablelist> +<para>The time window may be overspecified by using multiple +options from the set <literal>-t</literal>, <literal>-s</literal>, <literal>-S</literal>, <literal>-T</literal>, <literal>-A</literal>, and <literal>-O</literal>. Similarly, the time window may shrink to nothing by injudicious +choice of options.</para> +<para>In all cases, the parsing of these options applies heuristics guided +by the principle of “least surprise”; the time window is always +well-defined (with the end never earlier than the start), but may shrink +to nothing in the extreme.</para> +</section> +<section id="id5195323"> + +<title>Timezone Options</title> +<para><indexterm id="ITch03-12"><primary>timezone options</primary></indexterm>All +utilities that report time of day use the local timezone by default. The +following timezone options are available:</para> +<variablelist> +<varlistentry> +<term><literal>-z</literal></term> +<listitem><para>Forces times to be reported in the timezone of the host +that provided the metric values (the PCP collector host). 
When used in +conjunction with <literal>-a</literal> and multiple archives, the convention +is to use the timezone from the first named archive.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-Z </literal> <replaceable>timezone</replaceable> </term> +<listitem><para><indexterm id="IG31371888135"><primary>environ man page</primary></indexterm>Sets +the TZ variable to a timezone string, as defined in <command>environ(7)</command>, +for example, <literal>-Z UTC</literal> for universal time.</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +<section id="LE61303-PARENT"> + +<title>PCP Environment Variables</title> +<para><indexterm id="ITch03-14"><primary>PCP</primary><secondary>environment +variables</secondary></indexterm><indexterm id="IG31371888136"><primary>environment variables +</primary></indexterm><indexterm id="IG31371888137"><primary>${PCP_DIR}/etc/pcp.conf file</primary> +</indexterm><indexterm id="IG31371888138"><primary>/etc/pcp.env file</primary></indexterm><indexterm id="IG31371888139"> +<primary>pmGetConfig function</primary></indexterm>When you are using +PCP tools and utilities and are calling PCP library functions, a standard +set of defined environment variables are available in the <filename>${PCP_DIR}/etc/pcp.conf</filename> +file. These variables are generally used to specify the location of various +PCP pieces in the file system and may be loaded into shell scripts by +sourcing the <filename>${PCP_DIR}/etc/pcp.env</filename> shell script. They may +also be queried by C, C++, perl and python programs using the +<command>pmGetConfig</command> library function. +If a variable is already defined in the +environment, the values in the <filename>pcp.conf</filename> file do not +override those values; that is, the values in <filename>pcp.conf</filename> +serve only as installation defaults. 
For additional information, see the <command>pcp.conf(5)</command>, <command>pcp.env(5)</command>, and <command>pmGetConfig(3)</command> +man pages.</para> +<para>The following environment variables are recognized by PCP (these +definitions are also available in the <command>PCPIntro(1)</command> man page):</para> +<variablelist condition="sgi_termlength:nextline"> +<varlistentry> +<term><literal>PCP_COUNTER_WRAP</literal></term> +<listitem><para><indexterm id="ITch03-15"><primary>PCP_COUNTER_WRAP variable +</primary></indexterm>Many of the performance metrics exported from PCP +agents expect that counters increase monotonically. Under some circumstances, +one value of a metric may be smaller than the previously fetched value. +This can happen when a counter of finite precision overflows, when the +PCP agent has been reset or restarted, or when the PCP agent exports values +from an underlying instrumentation that is subject to asynchronous discontinuity. +</para> +<para>If set, the <literal>PCP_COUNTER_WRAP</literal> environment variable +indicates that all such cases of a decreasing counter should be treated +as a counter overflow, and hence the values are assumed to have wrapped +once in the interval between consecutive samples. Counter wrapping was +the default in versions before PCP release 1.3.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PCP_STDERR</literal></term> +<listitem><para><indexterm id="ITch03-18"><primary>PCP_STDERR variable +</primary></indexterm><indexterm id="IG31371888140"><primary>pmprintf tool</primary></indexterm><indexterm id="IG31371888141"> +<primary>pmconfirm command</primary><secondary>error messages</secondary> +</indexterm>Specifies whether <command>pmprintf()</command> error messages +are sent to standard error, to a <literal>pmconfirm</literal> dialog box, +or to a named file; see the <command>pmprintf(3)</command> +man page. 
Messages go to standard error if <literal>PCP_STDERR</literal> +is unset or set without a value. If this variable is set to <literal>DISPLAY</literal>, then messages go to a <literal>pmconfirm</literal> +dialog box; see the <command>pmconfirm(1)</command> man page. +Otherwise, the value of <literal>PCP_STDERR</literal> is assumed to be +the name of an output file.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PMCD_CONNECT_TIMEOUT</literal></term> +<listitem><para><indexterm id="ITch03-22"><primary>PMCD_CONNECT_TIMEOUT +variable</primary></indexterm><indexterm id="IG31371888143"><primary>PMCD</primary><secondary> +PMCD_CONNECT_TIMEOUT variable</secondary></indexterm>When attempting to +connect to a remote PMCD on a system that is booting or at the other end +of a slow network link, some PMAPI routines could potentially block for +a long time until the remote system responds. These routines abort and +return an error if the connection has not been established after some +specified interval has elapsed. The default interval is 5 seconds. This +may be modified by setting this variable in the environment to a larger +number of seconds for the desired timeout. This is most useful in cases +where the remote host is at the end of a slow network, requiring longer +latencies to establish the connection correctly.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PMCD_PORT</literal></term> +<listitem><para><indexterm id="ITch03-23"><primary>PMCD_PORT variable +</primary></indexterm><indexterm id="IG31371888144"><primary>PMCD</primary><secondary>PMCD_PORT +variable</secondary></indexterm><indexterm id="IG31371888145"><primary>TCP/IP</primary><secondary> +sockets</secondary></indexterm>This TCP/IP port is used by PMCD to create +the socket for incoming connections and requests. The default is port +number 44321, which you may override by setting this variable to a different +port number. 
If a non-default port is in effect when PMCD is started, +then every monitoring application connecting to that PMCD must also have +this variable set in its environment before attempting a connection.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PMCD_RECONNECT_TIMEOUT</literal></term> +<listitem><para><indexterm id="ITch03-24"><primary>PMCD_RECONNECT_TIMEOUT +variable</primary></indexterm><indexterm id="IG31371888146"><primary>PMCD</primary><secondary> +PMCD_RECONNECT_TIMEOUT variable</secondary></indexterm>When a monitor +or client application loses its connection to a PMCD, the connection may +be reestablished by calling the <command>pmReconnectContext(3)</command> +PMAPI function. However, attempts to reconnect are controlled by a back-off +strategy to avoid flooding the network with reconnection requests. By +default, the back-off delays are 5, 10, 20, 40, and 80 seconds for consecutive +reconnection requests from a client (the last delay is repeated for any +further attempts after the last delay in the list). Setting this environment +variable to a comma-separated list of positive integers redefines the +back-off delays. For example, setting the delays to <userinput>1,2</userinput> +will back off for 1 second, then back off every 2 seconds thereafter. +</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PMCD_REQUEST_TIMEOUT</literal></term> +<listitem><para><indexterm id="ITch03-25"><primary>PMCD_REQUEST_TIMEOUT +variable</primary></indexterm><indexterm id="IG31371888147"><primary>PMCD</primary><secondary> +PMCD_REQUEST_TIMEOUT variable</secondary></indexterm>For monitor or client +applications connected to PMCD, there is a possibility of the application +hanging on a request for performance metrics or metadata or help text. +These delays may become severe if the system running PMCD crashes or the +network connection is lost or the network link is very slow. 
If this +environment variable is set to a real number of seconds, requests to PMCD +time out after that many seconds. The default behavior is +to wait 10 seconds for a response from every PMCD for all applications. +</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PMLOGGER_PORT</literal></term> +<listitem><para><indexterm id="IG31371888148"><primary>PMLOGGER_PORT variable</primary></indexterm><indexterm id="IG31371888149"> +<primary>pmlogger tool</primary></indexterm><indexterm id="IG31371888150"><primary>pmlc tool +</primary><secondary>PMLOGGER_PORT variable</secondary></indexterm>This +environment variable may be used to change the base TCP/IP port number +used by <command>pmlogger</command> to create the socket to which <command>pmlc</command> +instances try to connect. The default base port number is 4330. +If used, this variable should be set in the environment before <command>pmlogger</command> +is executed. If <command>pmlc</command> and <command>pmlogger</command> +are on different hosts, then <literal>PMLOGGER_PORT</literal> +must be set to the same value in both places.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>PMPROXY_PORT</literal></term> +<listitem><para><indexterm id="ITch03-30"><primary>PMPROXY_PORT variable +</primary></indexterm><indexterm id="IG31371888152"><primary>pmproxy port</primary></indexterm> +This environment variable may be used to change the base TCP/IP port number +used by <command>pmproxy</command> to create the socket to which proxied +clients connect, on their way to a distant <command>pmcd</command>.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="LE12082-PARENT"> + +<title>Running PCP Tools through a Firewall</title> +<para><indexterm id="IG31371888153"><primary>firewalls</primary></indexterm><indexterm id="IG31371888154"><primary> +TCP/IP</primary><secondary>collector and monitor hosts</secondary></indexterm>In +some production 
environments, the Performance Co-Pilot (PCP) monitoring +hosts are on one side of a TCP/IP firewall, and the PCP collector hosts +may be on the other side.</para> +<para><indexterm id="IG31371888155"><primary>PROXY protocol</primary></indexterm><indexterm id="IG31371888156"> +<primary>pmproxy tool</primary><secondary>TCP/IP firewall</secondary> +</indexterm><indexterm id="IG31371888157"><primary>PMCD</primary><secondary>TCP/IP firewall +</secondary></indexterm><indexterm id="IG31371888158"><primary>pmlogger tool</primary><secondary> +TCP/IP firewall</secondary></indexterm><indexterm id="IG31371888159"><primary>pmlc tool</primary> +<secondary>TCP/IP firewall</secondary></indexterm><indexterm id="IG31371888160"><primary> +PMCD_PORT variable</primary></indexterm><indexterm id="IG31371888161"><primary>PMLOGGER_PORT +variable</primary></indexterm>If the firewall service sits between the +monitor and collector tools, the <literal>pmproxy</literal> service may +be used to perform both packet forwarding and DNS proxying through the +firewall; see the <command>pmproxy(1)</command> +man page. Otherwise, it is necessary to arrange for packet +forwarding to be enabled for those TCP/IP ports used by PCP, namely 44321 +(or the value of the <literal>PMCD_PORT</literal> environment variable) +for connections to PMCD.</para> +<section id="id5196469"> + +<title>The <command>pmproxy</command> service</title> +<para>The <command>pmproxy</command> service allows PCP clients running on +hosts located on one side of a firewall to monitor remote +hosts on the other side. 
The basic connection syntax is as +follows, shown here with <command>pmprobe</command>; any PCP application, +typically a monitoring tool, may be used in its place:</para> +<literallayout class="monospaced">pmprobe -h remotehost@proxyhost</literallayout> +<para>This extended host specification syntax is part of a larger set of +available extensions to the basic host naming syntax; refer to the +<command>PCPIntro(1)</command> man page for further details.</para> +</section> +</section> +<section id="LE17322-PARENT"> + +<title>Transient Problems with Performance Metric Values +</title> +<para><indexterm id="IG31371888164"><primary>transient problems</primary></indexterm>Sometimes +the values for a performance metric as reported by a PCP tool appear to +be incorrect. This is typically caused by transient conditions such as +metric wraparound or time skew, described below. These conditions result +from design decisions that are biased in favor of lightweight protocols +and minimal resource demands for PCP components.</para> +<para>In all cases, these events are expected to occur infrequently, and +should not persist beyond a few samples.</para> +<section id="id5196702"> + +<title>Performance Metric Wraparound</title> +<para><indexterm id="ITch03-32"><primary>performance metric wraparound +</primary></indexterm><indexterm id="IG31371888165"><primary>PCP_COUNTER_WRAP variable</primary> +</indexterm>Performance metrics are usually expressed as numbers with +finite precision. For metrics that are cumulative counters of events or +resource consumption, the value of the metric may occasionally overflow +the specified range and wrap around to zero.</para> +<para>Because the value reported for these counter metrics is computed as the +rate of change with respect to the previous sample, this may result in +a transient condition where the rate of change is an unknown value. 
If +the <literal>PCP_COUNTER_WRAP</literal> environment variable is set, this +condition is treated as an overflow, and speculative rate calculations +are made. In either case, the correct rate calculation for the metric +returns with the next sample.</para> +</section> +<section id="id5196749"> + +<title>Time Dilation and Time Skew</title> +<para><indexterm id="ITch03-33"><primary>time dilation</primary></indexterm>If +a PMDA is tardy in returning results, or the PCP monitoring tool is connected +to PMCD via a slow or congested network, an error might be introduced +in rate calculations due to a difference between the time the metric was +sampled and the time PMCD sends the result to the monitoring tool.</para> +<para>In practice, these errors are usually so small as to be insignificant, +and the errors are self-correcting (not cumulative) over consecutive samples. +</para> +<para>A related problem may occur when the system time is not synchronized +between multiple hosts, and the time stamps for the results returned from +PMCD reflect the skew in the system times. In this case, it is recommended +that NTP (network time protocol) be used to keep the system clocks on the +collector systems synchronized; +for information on NTP refer to the <command>ntpd(1)</command> man page.</para> +</section> +</section> +</chapter> + + + +<chapter id="LE38515-PARENT"> + +<title>Monitoring System Performance</title> +<para><indexterm id="IG31371888166"><primary>monitoring system performance</primary></indexterm><indexterm id="IG31371888167"> +<primary>performance monitoring</primary></indexterm><indexterm id="IG31371888168"><primary> +man command</primary><secondary>usage</secondary></indexterm><indexterm id="Z1033415772tls"> +<primary>pmchart tool</primary><secondary>man example</secondary></indexterm>This +chapter describes the performance monitoring tools available in Performance +Co-Pilot (PCP). 
This product provides a group of commands and tools for measuring +system performance. Each tool is described completely by its own man page. +The man pages are accessible through the <command>man</command> command. For +example, the man page for the tool <command>pmdumptext</command> is viewed +by entering the following command:</para> +<literallayout class="monospaced"><userinput>man pmdumptext</userinput></literallayout> +<para>The following major sections are covered in this chapter:</para> +<itemizedlist> +<listitem id="Z930615099sdc"><para><xref linkend="LE91266-PARENT"/>, discusses <command>pmstat</command>, +a utility that provides a periodic one-line summary of system performance.</para> +</listitem> +<listitem><para><xref linkend="Z926977852sdc"/>, discusses <command>pmdumptext</command>, +a utility that shows the current values for named performance metrics.</para> +</listitem> +<listitem><para><xref linkend="LE35315-PARENT"/>, describes <command>pmval</command>, +a utility that displays performance metrics in a textual format.</para> +</listitem> +<listitem><para><xref linkend="LE60452-PARENT"/>, describes <command>pminfo</command>, +a utility that displays information about performance metrics. 
+</para> +</listitem> +<listitem><para><indexterm id="IG31371888169"><primary>pmstore tool</primary><secondary>setting +metric values</secondary></indexterm><xref linkend="LE10170-PARENT"/>, describes +the use of the <command>pmstore</command> utility to arbitrarily set or reset +selected performance metric values.</para> +</listitem></itemizedlist> +<para><indexterm id="IG31371888170"><primary>text-based tools</primary></indexterm><indexterm id="IG31371888171"> +<primary>2D tools</primary></indexterm>The following sections describe the +various graphical and text-based PCP tools used to monitor local or remote +system performance.</para> +<section id="LE91266-PARENT"> + +<title>The <command>pmstat</command> Command</title> +<para><indexterm id="ITch04-20"><primary>pmstat tool</primary><secondary> +description</secondary></indexterm> The <command>pmstat</command> command +provides a periodic, one-line summary of system performance. This command +is intended to monitor system performance at the highest level, after which +other tools may be used for examining subsystems to observe potential performance +problems in greater detail. After entering the <command>pmstat</command> +command, you see output similar to the following, with successive lines appearing +periodically:</para> +<literallayout class="monospaced"><userinput>pmstat</userinput> +@ Thu Aug 15 09:25:56 2013 + loadavg memory swap io system cpu + 1 min swpd free buff cache pi po bi bo in cs us sy id + 1.29 833960 5614m 144744 265824 0 0 0 1664 13K 23K 6 7 81 + 1.51 833956 5607m 144744 265712 0 0 0 1664 13K 24K 5 7 83 + 1.55 833956 5595m 145196 271908 0 0 14K 1056 13K 24K 7 7 74</literallayout> +<para>An additional line of output is added every five seconds. The +<literal>-t </literal><replaceable>interval</replaceable> option may be +used to vary the update interval (i.e. 
the sampling interval).</para> +<para>The output from <command>pmstat</command> is directed to standard output, +and the columns in the report are interpreted as follows:</para> +<variablelist> +<varlistentry> +<term><literal>loadavg</literal></term> +<listitem><para>The 1-minute load average (runnable processes).</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>memory</literal></term> +<listitem><para>The <literal>swpd</literal> column indicates average swap space used during the +interval (all columns reported in Kbytes unless otherwise indicated). +The <literal>free</literal> column indicates average free memory during the +interval. The <literal>buff</literal> column indicates average buffer memory +in use during the interval. The <literal>cache</literal> column indicates +average cached memory in use during the interval.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>swap</literal></term> +<listitem><para>Reports the average number of pages that are paged-in +(<literal>pi</literal>) and paged-out (<literal>po</literal>) per +second during the interval. +It is normal for the paged-in values to be non-zero, but the system is +suffering memory stress if the paged-out values are non-zero over an +extended period.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>io</literal></term> +<listitem><para>The <literal>bi</literal> and <literal>bo</literal> columns +indicate the average rate per second of block input and block output +operations respectively, during the interval. +These rates are independent of the I/O block size. +If the values become large, they are reported as thousands of operations per +second (K suffix) or millions of operations per second (M suffix).</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>system</literal></term> +<listitem><para> +Context switch rate (<literal>cs</literal>) and interrupt rate (<literal>in</literal>). 
+Rates are expressed as average operations per second during the interval. +Note that the interrupt rate is normally at least HZ (the clock interrupt rate, +and <literal>kernel.all.hz</literal> metric) interrupts per second.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>cpu</literal></term> +<listitem><para>Percentage of CPU time spent executing user code (<literal>us</literal>), +system and interrupt code (<literal>sy</literal>), and the idle loop (<literal>id</literal>).</para> +</listitem></varlistentry> +</variablelist> +<para>As with most PCP utilities, real-time metrics and archive logs are interchangeable. +</para> +<para>For example, the following command uses a local system PCP archive log +<replaceable>20130731</replaceable> and the timezone of the host (<literal>smash</literal>) +from which performance metrics in the archive were collected:</para> +<literallayout class="monospaced"><userinput>pmstat -a </userinput><replaceable>${PCP_LOG_DIR}/pmlogger/smash/20130731</replaceable> <userinput>-t 2hour -A 1hour -z</userinput> +Note: timezone set to local timezone of host "smash" +@ Wed Jul 31 10:00:00 2013 + loadavg memory swap io system cpu + 1 min swpd free buff cache pi po bi bo in cs us sy id + 3.90 24648 6234m 239176 2913m ? ? ? ? ? ? ? ? ? + 1.72 24648 5273m 239320 2921m 0 0 4 86 11K 19K 5 5 84 + 3.12 24648 5194m 241428 2969m 0 0 0 84 10K 19K 5 5 85 + 1.97 24644 4945m 244004 3146m 0 0 0 84 10K 19K 5 5 84 + 3.82 24640 4908m 244116 3147m 0 0 0 83 10K 18K 5 5 85 + 3.38 24620 4860m 244116 3148m 0 0 0 83 10K 18K 5 4 85 + 2.89 24600 4804m 244120 3149m 0 0 0 83 10K 18K 5 4 85 +pmFetch: End of PCP archive log</literallayout> +<para>For complete information on <literal>pmstat</literal> usage and command +line options, see the <command>pmstat(1)</command> man page. 
+</para> +</section> +<section id="Z926977852sdc"> + +<title>The <command>pmdumptext</command> Command</title> +<para><indexterm id="IG31371888172"><primary>pmdumptext tool</primary><secondary>description +</secondary></indexterm>The <command>pmdumptext</command> command displays +performance metrics in ASCII tables, suitable for export into databases or +report generators. It is a flexible command. For example, the following command +provides continuous XFS buffer cache statistics on a host named <literal>surf</literal>: +</para> +<literallayout class="monospaced"><userinput>pmdumptext -Ximu -h </userinput><userinput>surf</userinput><userinput> -f '%H:%M:%S' xfs.buffer</userinput> +[ 1] xfs.buffer.get +[ 2] xfs.buffer.create +[ 3] xfs.buffer.get_locked +[ 4] xfs.buffer.get_locked_waited +[ 5] xfs.buffer.busy_locked +[ 6] xfs.buffer.miss_locked +[ 7] xfs.buffer.page_retries +[ 8] xfs.buffer.page_found +[ 9] xfs.buffer.get_read + + Column 1 2 3 4 5 6 7 8 9 + Metric get create locked waited locked locked etries _found t_read + Units c/s c/s c/s c/s c/s c/s c/s c/s c/s +10:13:23 ? ? ? ? ? ? ? ? ? +10:13:24 0.16K 9.00 0.15K 5.00 0.00 0.00 0.00 12.00 9.00 +10:13:25 1.21K 38.00 1.17K 15.00 0.00 0.00 0.00 62.01 38.00 +10:13:26 5.80K 0.12K 5.69K 41.99 0.00 0.00 0.00 0.19K 0.12K</literallayout> +<para>See the <command>pmdumptext(1)</command> man page for more +information.</para> +</section> +<section id="LE35315-PARENT"> + +<title>The <command>pmval</command> Command</title> +<para><indexterm id="ITch04-23"><primary>pmval tool</primary><secondary>description +</secondary></indexterm> The <command>pmval</command> command dumps the current +values for the named performance metrics. 
For example, the following command +reports the value of performance metric <literal>proc.nprocs</literal> once +per second (by default), and produces output similar to this:</para> +<literallayout class="monospaced"><userinput>pmval proc.nprocs</userinput> +metric: proc.nprocs +host: localhost +semantics: discrete instantaneous value +units: none +samples: all +interval: 1.00 sec + 81 + 81 + 82 + 81</literallayout> +<para>In this example, the number of running processes was reported once per +second.</para> +<para>Where the semantics of the underlying performance metrics indicate that +it would be sensible, <command>pmval</command> reports the rate of change +or resource utilization.</para> +<para>For example, the following command reports idle processor utilization +for each of four CPUs on the remote host <literal>dove</literal>, each five +seconds apart, producing output of this form:</para> +<literallayout class="monospaced"><userinput>pmval -h dove -t 5sec -s 4 kernel.percpu.cpu.idle</userinput> +metric: kernel.percpu.cpu.idle +host: dove +semantics: cumulative counter (converting to rate) +units: millisec (converting to time utilization) +samples: 4 +interval: 5.00 sec + +cpu:1.1.0.a cpu:1.1.0.c cpu:1.1.1.a cpu:1.1.1.c + 1.000 0.9998 0.9998 1.000 + 1.000 0.9998 0.9998 1.000 + 0.8989 0.9987 0.9997 0.9995 + 0.9568 0.9998 0.9996 1.000 +</literallayout> +<para>Similarly, the following command reports disk I/O read rate every minute +for just the disk <literal>/dev/disk1</literal>, and produces output +similar to the following:</para> +<literallayout class="monospaced"><userinput>pmval -t 1min -i disk1 disk.dev.read</userinput> +metric: disk.dev.read +host: localhost +semantics: cumulative counter (converting to rate) +units: count (converting to count / sec) +samples: indefinite +interval: 60.00 sec + disk1 + 33.67 + 48.71 + 52.33 + 11.33 + 2.333 </literallayout> +<para>The <literal>-r</literal> flag may be used to suppress the rate calculation +(for metrics 
with counter semantics) and display the raw values of the metrics. +</para> +<para>In the example below, manipulation of the time within the archive is +achieved by the exchange of time control messages between <command>pmval</command> +and <command>pmtime</command>.</para> +<literallayout class="monospaced"><userinput>pmval -g -a ${PCP_LOG_DIR}/pmlogger/myserver/20130731 kernel.all.load</userinput></literallayout> +<para><indexterm id="IG31371888173"><primary>PCP Tutorials and Case Studies</primary><secondary>pmval command</secondary> +</indexterm>The <command>pmval</command> command is documented by the <command>pmval(1)</command> +man page, and annotated examples of the use of <command>pmval</command> can be found in +the <citetitle>PCP Tutorials and Case Studies</citetitle> companion document. +</para> +</section> +<section id="LE60452-PARENT"> + +<title>The <command>pminfo</command> Command</title> +<para><indexterm id="ITch04-25"><primary>pminfo tool</primary><secondary> +description</secondary></indexterm>The <command>pminfo</command> command displays +various types of information about performance metrics available through the +Performance Co-Pilot (PCP) facilities.</para> +<para>The <literal>-T</literal> option is extremely useful; it provides help +text about performance metrics:</para> +<literallayout class="monospaced"><userinput>pminfo -T mem.util.cached</userinput> +mem.util.cached +Help: +Memory used by the page cache, including buffered file data. +This is in-memory cache for files read from the disk (the pagecache) +but doesn't include SwapCached.</literallayout> +<para>The <literal>-t</literal> option displays the one-line help text associated +with the selected metrics. The <literal>-T</literal> option prints more verbose +help text.</para> +<para>Without any options, <command>pminfo</command> verifies that the specified +metrics exist in the namespace, and echoes those names. 
Metrics may be specified +as arguments to <command>pminfo</command> using their full metric names. For +example, this command returns the following response:</para> +<literallayout class="monospaced"><userinput>pminfo hinv.ncpu network.interface.total.bytes</userinput> +hinv.ncpu +network.interface.total.bytes </literallayout> +<para>A group of related metrics in the namespace may also be specified. +For example, to list all of the <literal>hinv</literal> metrics you would +use this command:</para> +<literallayout class="monospaced"><userinput>pminfo hinv</userinput> +hinv.physmem +hinv.pagesize +hinv.ncpu +hinv.ndisk +hinv.nfilesys +hinv.ninterface +hinv.nnode +hinv.machine +hinv.map.scsi +hinv.map.cpu_num +hinv.map.cpu_node +hinv.map.lvname +hinv.cpu.clock +hinv.cpu.vendor +hinv.cpu.model +hinv.cpu.stepping +hinv.cpu.cache +hinv.cpu.bogomips</literallayout> +<para>If no metrics are specified, <command>pminfo</command> displays the +entire collection of metrics. This can be useful for searching for metrics, +when only part of the full name is known. For example, this command returns +the following response:</para> +<literallayout class="monospaced"><userinput>pminfo | grep nfs</userinput> +nfs.client.calls +nfs.client.reqs +nfs.server.calls +nfs.server.reqs +nfs3.client.calls +nfs3.client.reqs +nfs3.server.calls +nfs3.server.reqs +nfs4.client.calls +nfs4.client.reqs +nfs4.server.calls +nfs4.server.reqs</literallayout> +<para>The <literal>-d</literal> option causes <command>pminfo</command> to +display descriptive information about metrics (refer to the <command>pmLookupDesc(3)</command> +man page for an explanation of this metadata information). 
+The following command and response show use of the <literal>-d</literal> option: +</para> +<literallayout class="monospaced"><userinput>pminfo -d proc.nprocs disk.dev.read filesys.free</userinput> +proc.nprocs + Data Type: 32-bit unsigned int InDom: PM_INDOM_NULL 0xffffffff + Semantics: discrete Units: none + +disk.dev.read + Data Type: 32-bit unsigned int InDom: 60.1 0xf000001 + Semantics: counter Units: count + +filesys.free + Data Type: 64-bit unsigned int InDom: 60.5 0xf000005 + Semantics: instant Units: Kbyte</literallayout> +<para>The <literal>-f</literal> option to <command>pminfo</command> forces +the current value of each named metric to be fetched and printed. In the example +below, all metrics in the group <literal>hinv</literal> are selected:</para> +<literallayout class="monospaced"><userinput>pminfo -f hinv</userinput> +hinv.physmem + value 15701 + +hinv.pagesize + value 16384 + +hinv.ncpu + value 4 + +hinv.ndisk + value 6 + +hinv.nfilesys + value 2 + +hinv.ninterface + value 8 + +hinv.nnode + value 2 + +hinv.machine + value "IP35" + +hinv.map.cpu_num + inst [0 or "cpu:1.1.0.a"] value 0 + inst [1 or "cpu:1.1.0.c"] value 1 + inst [2 or "cpu:1.1.1.a"] value 2 + inst [3 or "cpu:1.1.1.c"] value 3 + +hinv.map.cpu_node + inst [0 or "node:1.1.0"] value "/dev/hw/module/001c01/slab/0/node" + inst [1 or "node:1.1.1"] value "/dev/hw/module/001c01/slab/1/node" + +hinv.cpu.clock + inst [0 or "cpu:1.1.0.a"] value 800 + inst [1 or "cpu:1.1.0.c"] value 800 + inst [2 or "cpu:1.1.1.a"] value 800 + inst [3 or "cpu:1.1.1.c"] value 800 + +hinv.cpu.vendor + inst [0 or "cpu:1.1.0.a"] value "GenuineIntel" + inst [1 or "cpu:1.1.0.c"] value "GenuineIntel" + inst [2 or "cpu:1.1.1.a"] value "GenuineIntel" + inst [3 or "cpu:1.1.1.c"] value "GenuineIntel" + +hinv.cpu.model + inst [0 or "cpu:1.1.0.a"] value "0" + inst [1 or "cpu:1.1.0.c"] value "0" + inst [2 or "cpu:1.1.1.a"] value "0" + inst [3 or "cpu:1.1.1.c"] value "0" + +hinv.cpu.stepping + inst [0 or "cpu:1.1.0.a"] value "6" 
+ inst [1 or "cpu:1.1.0.c"] value "6" + inst [2 or "cpu:1.1.1.a"] value "6" + inst [3 or "cpu:1.1.1.c"] value "6" + +hinv.cpu.cache + inst [0 or "cpu:1.1.0.a"] value 0 + inst [1 or "cpu:1.1.0.c"] value 0 + inst [2 or "cpu:1.1.1.a"] value 0 + inst [3 or "cpu:1.1.1.c"] value 0 + +hinv.cpu.bogomips + inst [0 or "cpu:1.1.0.a"] value 1195.37 + inst [1 or "cpu:1.1.0.c"] value 1195.37 + inst [2 or "cpu:1.1.1.a"] value 1195.37 + inst [3 or "cpu:1.1.1.c"] value 1195.37</literallayout> +<para>The <literal>-h</literal> option directs <command>pminfo</command> to +retrieve information from the specified host. If the metric has an instance +domain, the value associated with each instance of the metric is printed: +</para> +<literallayout class="monospaced"><userinput>pminfo -h dove -f filesys.mountdir</userinput> +filesys.mountdir + inst [0 or "/dev/xscsi/pci00.01.0/target81/lun0/part3"] value "/" + inst [1 or "/dev/xscsi/pci00.01.0/target81/lun0/part1"] value "/boot/efi"</literallayout> +<para><indexterm id="IG31371888174"><primary>PMID</primary><secondary>printing</secondary></indexterm>The <literal>-m</literal> option prints the Performance Metric Identifiers (PMIDs) of the +selected metrics. This is useful for finding out which PMDA supplies the metric. +For example, the output below identifies the PMDA supporting domain 4 (the +leftmost part of the PMID) as the one supplying information for the metric <literal>environ.extrema.mintemp</literal>:</para> +<literallayout class="monospaced">pminfo -m environ.extrema.mintemp +environ.extrema.mintemp PMID: 4.0.3 </literallayout> +<para>The <literal>-v</literal> option verifies that metric definitions in +the PMNS correspond with supported metrics, and checks that a value is available +for the metric. Descriptions and values are fetched, but not printed. 
Only +errors are reported.</para> +<para><indexterm id="IG31371888175"><primary>pminfo tool</primary><secondary>PCP Tutorials and Case Studies</secondary> +</indexterm><indexterm id="IG31371888176"><primary>PCP Tutorials and Case Studies</primary><secondary>pminfo command +</secondary></indexterm>Complete information on the <command>pminfo</command> +command is found in the <command>pminfo(1)</command> man page. +There are further examples of the use of <command>pminfo</command> in the +<citetitle>PCP Tutorials and Case Studies</citetitle>.</para> +</section> +<section id="LE10170-PARENT"> + +<title>The <command>pmstore</command> Command</title> +<para><indexterm id="ITch04-26"><primary>pmstore tool</primary><secondary> +description</secondary></indexterm>From time to time you may wish to change +the value of a particular metric. Some metrics are counters that may need +to be reset, and some are simply control variables for agents that collect +performance metrics. When you need to change the value of a metric for any +reason, the command to use is <command>pmstore</command>.</para> +<note><para>For obvious reasons, the ability to arbitrarily change the value +of a performance metric is not supported. Rather, PCP collectors selectively +allow some metrics to be modified in a very controlled fashion.</para> +</note> +<para>The basic syntax of the command is as follows:</para> +<literallayout class="monospaced">pmstore <replaceable>metricname</replaceable> <replaceable>value</replaceable> </literallayout> +<para>There are also command line flags to further specify the action. 
For +example, the <literal>-i</literal> option restricts the change to one or more +instances of the performance metric.</para> +<para>The <replaceable>value</replaceable> may be in one of several forms, +according to the following rules:</para> +<orderedlist><listitem><para>If the metric has an integer type, then +<replaceable>value</replaceable> should consist of an optional leading hyphen, followed +either by decimal digits or “0x” and some hexadecimal digits; “0X” +is also acceptable instead of “0x.”</para> +</listitem><listitem><para>If the metric has a floating point type, then +<replaceable>value</replaceable> should be in the form of an integer (described above), +a fixed point number, or a number in scientific notation.</para> +</listitem><listitem><para>If the metric has a string type, then +<replaceable>value</replaceable> is interpreted as a literal string of ASCII characters. +</para> +</listitem><listitem><para>If the metric has an aggregate type, then an attempt +is made to interpret <replaceable>value</replaceable> as an integer, a floating +point number, or a string. In the first two cases, the minimal word length +encoding is used; for example, “123” would be interpreted as a +four-byte aggregate, and “0x100000000” would be interpreted as +an eight-byte aggregate.</para> +</listitem></orderedlist> +<para>The following example illustrates the use of <command>pmstore</command> +to enable performance metrics collection in the <literal>txmon</literal> PMDA +(see <literal>${PCP_PMDAS_DIR}/txmon</literal> for the source code of the txmon +PMDA). When the metric <literal>txmon.control.level</literal> has the value +0, no performance metrics are collected. Values greater than 0 enable progressively +more verbose instrumentation.</para> +<literallayout class="monospaced"><userinput>pminfo -f txmon.count</userinput> +txmon.count +No value(s) available! 
+<userinput>pmstore txmon.control.level 1</userinput> +txmon.control.level old value=0 new value=1 +<userinput>pminfo -f txmon.count</userinput> +txmon.count + inst [0 or "ord-entry"] value 23 + inst [1 or "ord-enq"] value 11 + inst [2 or "ord-ship"] value 10 + inst [3 or "part-recv"] value 3 + inst [4 or "part-enq"] value 2 + inst [5 or "part-used"] value 1 + inst [6 or "b-o-m"] value 0</literallayout> +<para>For complete information on <command>pmstore</command> usage and syntax, +see the <command>pmstore(1)</command> man page.</para> +</section> +</chapter> + + +<chapter id="LE21414-PARENT"> + +<title>Performance Metrics Inference Engine</title> +<para><indexterm id="ITch06-0"><primary>pmie tool</primary><secondary>performance +metrics inference engine</secondary></indexterm><indexterm id="ITch06-1"> +<primary>tool options</primary></indexterm><indexterm id="ITch06-2"><primary> +Performance Metrics Inference Engine</primary><see>pmie tool</see></indexterm>The +Performance Metrics Inference Engine (<command>pmie</command>) is a tool that +provides automated monitoring of, and reasoning about, system performance +within the Performance Co-Pilot (PCP) framework.</para> +<para>The major sections in this chapter are as follows:</para> +<itemizedlist> +<listitem><para><xref linkend="LE41170-PARENT"/>, provides an introduction +to the concepts and design of <command>pmie</command>.</para> +</listitem> +<listitem><para><xref linkend="LE15993-PARENT"/>, describes the basic syntax +and usage of <command>pmie</command>.</para> +</listitem> +<listitem><para><xref linkend="LE90227-PARENT"/>, discusses the complete +<command>pmie</command> rule specification language.</para> +</listitem> +<listitem><para><xref linkend="LE60280-PARENT"/>, provides an example, covering +several common performance scenarios.</para> +</listitem> +<listitem><para><xref linkend="LE31514-PARENT"/>, presents some tips and techniques +for <command>pmie</command> rule development.</para> +</listitem> 
+<listitem><para><xref linkend="LE91221-PARENT"/>, presents some important information +on using <command>pmie</command>.</para> +</listitem> +<listitem><para><xref linkend="Z927039566sdc"/>, describes how to use the +<command>pmieconf</command> command to generate <command>pmie</command> rules.</para> +</listitem> +<listitem><para><xref linkend="Z927039824sdc"/>, describes the support for running +<command>pmie</command> as a daemon.</para> +</listitem></itemizedlist> +<section id="LE41170-PARENT"> + +<title>Introduction to <command>pmie</command></title> +<para><indexterm id="ITch06-4"><primary>pmie tool</primary><secondary>automated +reasoning</secondary></indexterm>Automated reasoning within Performance Co-Pilot +(PCP) is provided by the Performance Metrics Inference Engine (<command>pmie</command>), +which is an applied artificial intelligence application. +</para> +<para>The <command>pmie</command> tool accepts expressions describing adverse +performance scenarios, and periodically evaluates these against streams of +performance metric values from one or more sources. When an expression is +found to be true, <command>pmie</command> is able to execute arbitrary actions +to alert or notify the system administrator of the occurrence of an adverse +performance scenario. These facilities are very general, and are designed +to accommodate the automated execution of a mixture of generic and site-specific +performance monitoring and control functions.</para> +<para>The stream of performance metrics to be evaluated may be from one or +more hosts, or from one or more PCP archive logs. 
In the latter case, <command>pmie</command> +may be used to retrospectively identify adverse performance conditions.</para> +<para><indexterm id="IG31371888177"><primary>PCP</primary><secondary>pmie capabilities</secondary> +</indexterm><indexterm id="IG31371888178"><primary>PMAPI</primary><secondary>pmie capabilities +</secondary></indexterm>Using <command>pmie</command>, you can filter, interpret, +and reason about the large volume of performance data made available from PCP +collector systems or PCP archives.</para> +<para>Typical <command>pmie</command> uses include the following:</para> +<itemizedlist> +<listitem><para>Automated real-time monitoring of a host, a set of hosts, +or client-server pairs of hosts to raise operational alarms when poor performance +is detected in a production environment</para> +</listitem> +<listitem><para>Nightly processing of archive logs to detect and report performance +regressions, or quantify quality of service for service level agreements or management +reports, or produce advance warning of pending performance problems</para> +</listitem> +<listitem><para>Strategic performance management, for example, detection of +slightly abnormal to chronic system behavior, trend analysis, and capacity planning +</para> +</listitem></itemizedlist> +<para><indexterm id="IG31371888179"><primary>pmie tool</primary><secondary>language</secondary> +</indexterm>The <command>pmie</command> expressions are described in a language +with expressive power and operational flexibility. It includes the following +operators and functions:</para> +<itemizedlist> +<listitem><para>Generalized predicate-action pairs, where a predicate is a +logical expression over the available performance metrics, and the action +is arbitrary. 
Predefined actions include the following:</para> +<itemizedlist><listitem><para><indexterm id="IG31371888180"><primary>pmconfirm command</primary> +<secondary>visible alarm</secondary></indexterm>Launch a visible alarm with <literal>pmconfirm</literal>; see the <command>pmconfirm(1)</command> man page.</para> +</listitem><listitem><para><indexterm id="IG31371888181"><primary>syslog function</primary></indexterm><indexterm id="IG31371888182"> +<primary>system log file</primary></indexterm>Post an entry to the system log file; +see the <command>syslog(3)</command> man page.</para> +</listitem><listitem><para><indexterm id="IG31371888183"><primary>${PCP_LOG_DIR}/NOTICES file</primary> +</indexterm>Post an entry to the PCP noticeboard file <filename>${PCP_LOG_DIR}/NOTICES</filename>; +see the <command>pmpost(1)</command> man page.</para> +</listitem><listitem><para>Execute a shell command or script, for example, +to send e-mail, initiate a pager call, warn the help desk, and so on.</para> +</listitem><listitem><para>Echo a message on standard output; useful for scripts +that generate reports from retrospective processing of PCP archive logs.</para> +</listitem></itemizedlist> +</listitem> +<listitem><para>Arithmetic and logical expressions in a C-like syntax.</para> +</listitem> +<listitem><para>Expression groups may have an independent evaluation frequency, +to support both short-term and long-term monitoring.</para> +</listitem> +<listitem><para>Canonical scale and rate conversion of performance metric +values to provide sensible expression evaluation.</para> +</listitem> +<listitem><para>Aggregation functions of <literal>sum</literal>, <literal>avg</literal>, <literal>min</literal>, and <literal>max</literal>, that may +be applied to collections of performance metrics values clustered over multiple +hosts, or multiple instances, or multiple consecutive samples in time.</para> +</listitem> +<listitem><para>Universal and existential quantification, to handle 
expressions +of the form “for every...” and “at least one...”. +</para> +</listitem> +<listitem><para>Percentile aggregation to handle statistical outliers, such +as “for at least 80% of the last 20 samples, ...”.</para> +</listitem> +<listitem><para>Macro processing to expedite repeated use of common subexpressions +or specification components.</para> +</listitem> +<listitem><para>Transparent operation against either live feeds of performance +metric values from PMCD on one or more hosts, or against PCP archive logs +of previously accumulated performance metric values.</para> +</listitem></itemizedlist> +<para>The power of <command>pmie</command> may be harnessed to automate the +most common of the deterministic system management functions that are responses +to changes in system performance. For example, disable a batch stream if the +DBMS transaction commit response time at the ninetieth percentile goes over +two seconds, or stop accepting uploads and send e-mail to the <replaceable>sysadmin</replaceable> +alias if free space in a storage system falls below five percent.</para> +<para>Moreover, the power of <command>pmie</command> can be directed towards +exceptional and sporadic performance problems. For example, if a network +packet storm is expected, enable IP header tracing for ten seconds, and send +e-mail to advise that data has been collected and is awaiting analysis. Or, +if production batch throughput falls below 50 jobs per minute, send a page +to the systems administrator on duty.</para> +<para><indexterm id="IG31371888184"><primary>pmie tool</primary><secondary>customization</secondary> +</indexterm><indexterm id="IG31371888185"><primary>pmieconf tool</primary><secondary>customization +</secondary></indexterm>Obviously, <command>pmie</command> customization is +required to produce meaningful filtering and actions in each production environment. 
+The <command>pmieconf</command> tool provides a convenient customization method, +allowing the user to generate parameterized <command>pmie</command> rules +for some of the more common performance scenarios.</para> +</section> +<section id="LE15993-PARENT"> + +<title>Basic <command>pmie</command> Usage</title> +<para><indexterm id="IG31371888186"><primary>pmie tool</primary><secondary>basic examples</secondary> +</indexterm>This section presents and explains some basic examples of <command>pmie</command> usage. +The <command>pmie</command> tool accepts the common +PCP command line arguments, as described in <xref linkend="LE94335-PARENT"/>. +In addition, <command>pmie</command> accepts the following command line arguments: +</para> +<variablelist condition="sgi_termlength:narrow"> +<varlistentry> +<term><literal>-d</literal></term> +<listitem><para>Enables interactive debug mode.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-v</literal></term> +<listitem><para>Verbose mode: expression values are displayed.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-V</literal></term> +<listitem><para>Verbose mode: annotated expression values are displayed.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>-W</literal></term> +<listitem><para>When-verbose mode: when a condition is true, the satisfying +expression bindings are displayed.</para> +</listitem></varlistentry> +</variablelist> +<para>One of the most basic invocations of this tool is this form:</para> +<literallayout class="monospaced"><userinput>pmie</userinput> <replaceable>filename</replaceable></literallayout> +<para>In this form, the expressions to be evaluated are read from <replaceable>filename</replaceable>. 
+If no <replaceable>filename</replaceable> is given, +expressions are read from standard input, which is typically the keyboard.</para> +<section id="LE23271-PARENT"> + +<title><command>pmie</command> Use of PCP Services</title> +<para><indexterm id="ITch06-7"><primary>PCP</primary><secondary>pmie tool +</secondary></indexterm>Before you use <command>pmie</command>, it +is strongly recommended that you familiarize yourself with the concepts in +<xref linkend="LE79836-PARENT"/>. The discussion in this section serves +as a very brief review of these concepts.</para> +<para><indexterm id="IG31371888187"><primary>pminfo tool</primary><secondary>pmie arguments</secondary> +</indexterm>PCP makes available thousands of performance metrics that +you can use when formulating expressions for <command>pmie</command> to evaluate. +If you want to find out which metrics are currently available on your system, +use this command:</para> +<literallayout class="monospaced"><userinput>pminfo</userinput> </literallayout> +<para>Use the <command>pminfo</command> command line arguments to find out more +about a particular metric. 
In <xref linkend="Z983825969sdc"/>, the <literal>-f</literal> flag is used to fetch +the current metric values from host <literal>dove</literal>:</para> +<example id="Z983825969sdc"> +<title><command>pminfo</command> +with the <literal>-f</literal> Option</title> +<literallayout class="monospaced"><userinput>pminfo -f -h dove disk.dev.total</userinput></literallayout> +<para>This produces the following response:</para> +<literallayout class="monospaced">disk.dev.total + inst [0 or "xscsi/pci00.01.0/target81/lun0/disc"] value 131233 + inst [4 or "xscsi/pci00.01.0/target82/lun0/disc"] value 4 + inst [8 or "xscsi/pci00.01.0/target83/lun0/disc"] value 4 + inst [12 or "xscsi/pci00.01.0/target84/lun0/disc"] value 4 + inst [16 or "xscsi/pci00.01.0/target85/lun0/disc"] value 4 + inst [18 or "xscsi/pci00.01.0/target86/lun0/disc"] value 4</literallayout> +</example> +<para>This reveals that on the host <literal>dove</literal>, the metric <literal>disk.dev.total</literal> has six instances, one for each disk on the system. +</para> +<para>Use the <literal>-T</literal> flag to request help text that provides more information about performance metrics: +</para> +<literallayout class="monospaced"><userinput>pminfo -T network.interface.in.packets</userinput></literallayout> +<para>The metadata associated with a performance metric is used by <command>pmie</command> +to determine how the value should be interpreted. 
You can examine +the descriptor that encodes the metadata by using the <literal>-d</literal> +flag for <command>pminfo</command>, as shown in <xref linkend="Z983826092sdc"/>: +</para> +<example id="Z983826092sdc"> +<title><command>pminfo</command> +with the <literal>-d</literal> and <literal>-h</literal> Options</title> +<literallayout class="monospaced"><userinput>pminfo -d -h </userinput><replaceable>somehost </replaceable><userinput>mem.util.cached kernel.percpu.cpu.user</userinput></literallayout> +<para>In response, you see output similar to this:</para> +<literallayout class="monospaced">mem.util.cached + Data Type: 64-bit unsigned int InDom: PM_INDOM_NULL 0xffffffff + Semantics: instant Units: Kbyte + +kernel.percpu.cpu.user + Data Type: 64-bit unsigned int InDom: 60.0 0xf000000 + Semantics: counter Units: millisec</literallayout> +</example> +<note><para><indexterm id="ITch06-8"><primary>PM_INDOM_NULL</primary></indexterm>A +cumulative counter such as <literal>kernel.percpu.cpu.user</literal> is automatically +converted by <command>pmie</command> into a rate (measured in events per second, +or count/second), while instantaneous values such as <literal>mem.util.cached</literal> +are not subjected to rate conversion. Metrics with an instance +domain (<literal>InDom</literal> in the <command>pminfo</command> output) +of <literal>PM_INDOM_NULL</literal> are singular and always produce one value +per source. 
However, a metric like <literal>kernel.percpu.cpu.user</literal> +has an instance domain, and may produce multiple values per source (in this +case, it is one value for each configured CPU).</para> +</note> +</section> +<section id="id5199666"> + +<title>Simple <command>pmie</command> Usage</title> +<para><indexterm id="ITch06-9"><primary>pmie tool</primary><secondary>examples +</secondary></indexterm><xref linkend="Z983823607sdc"/> directs the inference +engine to evaluate and print values (specified with the <literal>-v</literal> +flag) for a single performance metric (the simplest possible expression), +in this case <literal>disk.dev.total</literal>, collected from the local PMCD: +</para> +<example id="Z983823607sdc"> +<title><command>pmie</command> +with the <literal>-v</literal> Option</title> +<literallayout class="monospaced"><userinput>pmie -v</userinput> +<userinput>iops = disk.dev.total;</userinput> +<userinput>Ctrl+D</userinput> +iops: ? ? +iops: 14.4 0 +iops: 25.9 0.112 +iops: 12.2 0 +iops: 12.3 64.1 +iops: 8.594 52.17 +iops: 2.001 71.64</literallayout> +</example> +<para>On this system, there are two disk spindles, hence two values of the +expression <command>iops</command> per sample. Notice that the values for +the first sample are unknown (represented by the question marks [?] in the +first line of output), because rates can be computed only when at least two +samples are available. The subsequent samples are produced every ten seconds +by default. The second sample reports that during the preceding ten seconds +there was an average of 14.4 transfers per second on one disk and no transfers +on the other disk.</para> +<para>Rates are computed using time stamps delivered by PMCD. Due to unavoidable +inaccuracy in the actual sampling time (the sample interval is not exactly +10 seconds), you may see more decimal places in values than you expect. 
Notice, +however, that these errors do not accumulate but cancel each other out over +subsequent samples.</para> +<para>In <xref linkend="Z983823607sdc"/>, the expression to be evaluated was +entered using the keyboard, followed by the end-of-file character [<keycap>Ctrl</keycap>+<keycap>D</keycap>]. +Usually, it is more convenient to enter expressions +into a file (for example, <filename>myrules</filename>) and ask <command>pmie</command> +to read the file. Use this command syntax:</para> +<literallayout class="monospaced"><userinput>pmie -v myrules</userinput> </literallayout> +<para>Please refer to the <command>pmie(1)</command> man page +for a complete description of <command>pmie</command> command line options. +</para> +</section> +<section id="id5199841"> + +<title>Complex <command>pmie</command> Examples</title> +<para><indexterm id="ITch06-10"><primary>pmie tool</primary><secondary>examples +</secondary></indexterm>This section illustrates more complex expressions +in the <command>pmie</command> specification language. <xref linkend="LE90227-PARENT"/>, +provides a complete description of the <command>pmie</command> specification +language.</para> +<para>The following arithmetic expression computes write operations as a +percentage of the total number of disk transfers.</para> +<literallayout class="monospaced">(disk.all.write / disk.all.total) * 100; </literallayout> +<para>The <literal>disk.all</literal> metrics are singular, so this expression +produces exactly one value per sample, independent of the number of disk devices. +</para> +<note><para>If there is no disk activity, <literal>disk.all.total</literal> +will be zero and <command>pmie</command> evaluates this expression to be not +a number. 
When <literal>-v</literal> is used, any such values are displayed +as question marks.</para> +</note> +<para>The following logical expression has the value <literal>true</literal> +or <literal>false</literal> for each disk:</para> +<literallayout class="monospaced">disk.dev.total > 10 && +disk.dev.write > disk.dev.read; </literallayout> +<para>The value is true if the number of writes exceeds the number of reads, +and if there is significant disk activity (more than 10 transfers per second). +<xref linkend="Z983824038sdc"/> demonstrates a simple action:</para> +<example id="Z983824038sdc"> +<title>Printed <command>pmie</command> Output</title> +<literallayout class="monospaced">some_inst disk.dev.total > 60 + -> print "[%i] high disk i/o"; +</literallayout> +</example> +<para>This prints a message to the standard output whenever the total number +of transfers for some disk (<literal>some_inst</literal>) exceeds 60 transfers +per second. The <literal>%i</literal> (instance) in the message is replaced +with the name(s) of the disk(s) that caused the logical expression to be <literal>true</literal>.</para> +<para>Using <command>pmie</command> to evaluate the above expressions every +3 seconds, you see output similar to <xref linkend="Z983824037sdc"/>. +Notice the introduction of labels for each <command>pmie</command> expression.</para> +<example id="Z983824037sdc"> +<title>Labelled <command>pmie</command> Output</title> +<literallayout class="monospaced"><userinput>pmie -v -t 3sec</userinput> +<userinput>pct_wrt = (disk.all.write / disk.all.total) * 100;</userinput> +<userinput>busy_wrt = disk.dev.total > 10 &&</userinput> + <userinput>disk.dev.write > disk.dev.read;</userinput> +<userinput>busy = some_inst disk.dev.total > 60</userinput> + <userinput>-> print "[%i] high disk i/o ";</userinput> +<userinput>Ctrl+D</userinput> +pct_wrt: ? +busy_wrt: ? ? +busy: ? 
+ +pct_wrt: 18.43 +busy_wrt: false false +busy: false + +Mon Aug 5 14:56:08 2012: [disk2] high disk i/o +pct_wrt: 10.83 +busy_wrt: false false +busy: true + +pct_wrt: 19.85 +busy_wrt: true false +busy: false + +pct_wrt: ? +busy_wrt: false false +busy: false + +Mon Aug 5 14:56:17 2012: [disk1] high disk i/o [disk2] high disk i/o +pct_wrt: 14.8 +busy_wrt: false false +busy: true</literallayout> +</example> +<para>The first sample contains unknowns, since all expressions depend on +computing rates. Also notice that the expression <literal>pct_wrt</literal> +may have an undefined value whenever all disks are idle, as the denominator +of the expression is zero. If one or more disks is busy, the expression <literal>busy</literal> is true, and the message from the <literal>print</literal> +in the action part of the rule appears (before the <literal>-v</literal> values). +</para> +</section> +</section> +<section id="LE90227-PARENT"> + +<title>Specification Language for <command>pmie</command></title> +<para><indexterm id="IG31371888188"><primary>pmie tool</primary><secondary>language</secondary> +</indexterm>This section describes the complete syntax of the <command>pmie</command> +specification language, as well as macro facilities and the issue +of sampling and evaluation frequency. 
The reader with a preference for learning +by example may choose to skip this section and go straight to the examples +in <xref linkend="LE60280-PARENT"/>.</para> +<para>Complex expressions are built up recursively from simple elements:</para> +<orderedlist><listitem><para>Performance metric values are obtained from PMCD +for real-time sources, otherwise from PCP archive logs.</para> +</listitem><listitem><para>Metrics values may be combined using arithmetic +operators to produce arithmetic expressions.</para> +</listitem><listitem><para>Arithmetic expressions may be compared using relational +operators to produce logical expressions.</para> +</listitem><listitem><para>Logical expressions may be combined using Boolean +operators, including powerful quantifiers.</para> +</listitem><listitem><para>Aggregation operators may be used to compute summary +expressions, for either arithmetic or logical operands.</para> +</listitem><listitem><para>The final logical expression may be used to initiate +a sequence of actions.</para> +</listitem></orderedlist> +<section id="LE51927-PARENT"> + +<title>Basic <command>pmie</command> Syntax</title> +<para><indexterm id="ITch06-11"><primary>pmie tool</primary><secondary>syntax +</secondary></indexterm>The <command>pmie</command> rule specification language +supports a number of basic syntactic elements.</para> +<section id="id5200293"> + +<title>Lexical Elements</title> +<para><indexterm id="IG31371888189"><primary>lexical elements</primary></indexterm>All <command>pmie</command> +expressions are composed of the following lexical elements: +</para> +<variablelist> +<varlistentry> +<term>Identifier</term> +<listitem><para>Begins with an alphabetic character (either upper or lowercase), +followed by zero or more letters, the numeric digits, and the special characters +period (<literal>.</literal>) and underscore (<literal>_</literal>), as shown +in the following example:<literallayout class="monospaced"><literal>x</literal>, 
<literal>disk.dev.total</literal> and <literal>my_stuff</literal></literallayout></para> +<para>As a special case, an arbitrary sequence of characters enclosed by apostrophes +(<literal>'</literal>) is also interpreted as an <replaceable>identifier</replaceable>; +for example:<literallayout class="monospaced">'vms$slow_response'</literallayout></para> +</listitem></varlistentry> +<varlistentry> +<term>Keyword</term> +<listitem><para>The aggregate operators, units, and predefined actions are +represented by keywords; for example, <literal>some_inst</literal>, <literal>print</literal>, and <literal>hour</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term>Numeric constant</term> +<listitem><para>Any conventional representation of a decimal integer or floating +point number; for example, 124, 0.05, and -45.67.</para> +</listitem></varlistentry> +<varlistentry> +<term>String constant</term> +<listitem><para>An arbitrary sequence of characters, enclosed by double quotation +marks (<literal>"x"</literal>).</para> +</listitem></varlistentry> +</variablelist> +<para>Within quotes of any sort, the backslash (<literal>\</literal>) may +be used as an escape character as shown in the following example:</para> +<literallayout class="monospaced">"A \"gentle\" reminder"</literallayout> +</section> +<section id="id5200461"> + +<title>Comments </title> +<para><indexterm id="IG31371888190"><primary>comments</primary></indexterm>Comments may be embedded +anywhere in the source, in either of these forms:</para> +<variablelist> +<varlistentry> +<term><literal>/* text */</literal></term> +<listitem><para>Comment, optionally spanning multiple lines, with no nesting +of comments.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>// text</literal></term> +<listitem><para>Comment from here to the end of the line.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5200514"> + +<title>Macros</title> +<para><indexterm 
id="IG31371888191"><primary>macros</primary></indexterm>When they are fully +specified, expressions in <command>pmie</command> tend to be verbose and repetitive. +The use of macros can reduce repetition and improve readability and modularity. +Any statement of the following form associates the macro name <literal>identifier</literal> with the given string constant.</para> +<literallayout class="monospaced"><literal>identifier = "<replaceable>string</replaceable>";</literal></literallayout> +<para>Any subsequent occurrence of the macro name in the following form, +prefixed by a dollar sign, is replaced by the <replaceable>string</replaceable> most recently associated +with a macro definition for <literal>identifier</literal>.</para> +<literallayout class="monospaced"><userinput>$</userinput><literal>identifier</literal> </literallayout> +<para>For example, start with the following macro definition:</para> +<literallayout class="monospaced">disk = "disk.all"; </literallayout> +<para>You can then use the following syntax:</para> +<literallayout class="monospaced">pct_wrt = ($disk.write / $disk.total) * 100;</literallayout> +<note><para>Macro expansion is performed before syntactic parsing, so macros +may only be assigned constant string values.</para> +</note> +</section> +<section id="id5200653"> + +<title>Units</title> +<para><indexterm id="IG31371888192"><primary>units</primary></indexterm>The inference engine +converts all numeric values to canonical units (seconds for time, bytes for +space, and events for count). To avoid surprises, you are encouraged to specify +the units for numeric constants. 
If units are specified, they are checked +for dimension compatibility against the metadata for the associated performance +metrics.</para> +<para>The syntax for a <literal>units</literal> specification is a sequence +of one or more of the following keywords separated by either a space or a +slash (<literal>/</literal>, to denote <quote>per</quote>): <literal>byte</literal>, <literal>Kbyte</literal>, <literal>Mbyte</literal>, <literal>Gbyte</literal>, <literal>Tbyte</literal>, <literal>nsec</literal>, <literal>nanosecond</literal>, <literal>usec</literal>, <literal>microsecond</literal>, <literal>msec</literal>, <literal>millisecond</literal>, <literal>sec</literal>, <literal>second</literal>, <literal>min</literal>, <literal>minute</literal>, <literal>hour</literal>, <literal>count</literal>, <literal>Kcount</literal>, <literal>Mcount</literal>, <literal>Gcount</literal>, or <literal>Tcount</literal>. Plural forms are also accepted. +</para> +<para>The following are examples of units usage:</para> +<literallayout class="monospaced">disk.dev.blktotal > 1 Mbyte / second; +mem.util.cached < 500 Kbyte;</literallayout> +<note><para>If you do not specify the units for numeric constants, it is assumed +that the constant is in the canonical units of seconds for time, bytes for +space, and events for count, and the dimensionality of the constant is assumed +to be correct. 
Thus, in the following expression, the <literal>500</literal> +is interpreted as 500 bytes.</para> +<literallayout class="monospaced">mem.util.cached < 500</literallayout> +</note> +</section> +</section> +<section id="LE88708-PARENT"> + +<title>Setting Evaluation Frequency</title> +<para><indexterm id="ITch06-12"><primary>pmie tool</primary><secondary>setting +evaluation frequency</secondary></indexterm><indexterm id="IG31371888193"><primary>evaluation +frequency</primary></indexterm>The identifier name <literal>delta</literal> +is reserved to denote the interval of time between consecutive evaluations +of one or more expressions. Set <literal>delta</literal> as follows:</para> +<literallayout class="monospaced">delta = <replaceable>number</replaceable> [<replaceable>units</replaceable>];</literallayout> +<para>If present, <literal>units</literal> must be one of the time units described +in the preceding section. If absent, <literal>units</literal> are assumed +to be <literal>seconds</literal>. 
For example, the following expression has +the effect that any subsequent expressions (up to the next expression that +assigns a value to <literal>delta</literal>) are scheduled for evaluation +at a fixed frequency, once every five minutes.</para> +<literallayout class="monospaced">delta = 5 min; </literallayout> +<para>The default value for <literal>delta</literal> may be specified using +the <literal>-t</literal> command line option; otherwise <literal>delta</literal> +is initially set to be 10 seconds.</para> +</section> +<section id="LE73508-PARENT"> + +<title><command>pmie</command> Metric Expressions</title> +<para><indexterm id="ITch06-13"><primary>pmie tool</primary><secondary>metric +expressions</secondary></indexterm><indexterm id="IG31371888194"><primary>PMNS</primary><secondary> +metric expressions</secondary></indexterm><indexterm id="IG31371888196"><primary> +PMCD</primary><secondary>collector host</secondary></indexterm>The performance +metrics namespace (PMNS) provides a means of naming performance metrics, +for example, <literal>disk.dev.read</literal>. +PCP allows an application to retrieve one or more values for a performance +metric from a designated source (a collector host running PMCD, or a PCP archive +log). 
To specify a single value for some performance metric requires the metric +name to be associated with all three of the following:</para> +<itemizedlist> +<listitem><para>A particular host (or source of metrics values)</para> +</listitem> +<listitem><para>A particular instance (for metrics with multiple values)</para> +</listitem> +<listitem><para>A sample time</para> +</listitem></itemizedlist> +<para>The permissible values for hosts are the range of valid hostnames as +provided by Internet naming conventions.</para> +<para><indexterm id="IG31371888197"><primary>PMDA</primary><secondary>instance names</secondary> +</indexterm>The names for instances are provided by the Performance Metrics +Domain Agents (PMDA) for the instance domain associated with the chosen performance +metric.</para> +<para>The sample time specification is defined as the set of natural numbers +0, 1, 2, and so on. A number refers to one of a sequence of sampling events, +from the current sample 0 to its predecessor 1, whose predecessor was 2, and +so on. This scheme is illustrated by the time line shown in <xref linkend="id5201070"/>. +</para> +<figure id="id5201070"><title>Sampling Time Line</title><mediaobject><imageobject><imagedata fileref="figures/sampling-timeline.svg"/></imageobject><textobject><phrase>Sampling Time Line</phrase></textobject></mediaobject></figure> +<para>Each sample point is assumed to be separated from its predecessor by +a constant amount of real time, the <literal>delta</literal>. The most recent +sample point is always zero. The value of <literal>delta</literal> may vary +from one expression to the next, but is fixed for each expression; for more +information on the sampling interval, see <xref linkend="LE88708-PARENT"/>. +</para> +<para>For <command>pmie</command>, a metrics expression is the name of a metric, +optionally qualified by a host, instance and sample time specification. 
Special +characters introduce the qualifiers: colon (<literal>:</literal>) for hosts, +hash or pound sign (<literal>#</literal>) for instances, and at (<literal>@</literal>) for sample times. The following expression refers to the previous +value (<literal>@1</literal>) of the counter for the disk read operations +associated with the disk instance <literal>#disk1</literal> on the host <literal>moomba</literal>.</para> +<literallayout class="monospaced">disk.dev.read :moomba #disk1 @1 </literallayout> +<para>In fact, this expression defines a point in the three-dimensional (3D) +parameter space of {<literal>host</literal>} x {<literal>instance</literal>} +x {<literal>sample time</literal>} as shown in <xref linkend="id5201194"/>. +</para> +<figure id="id5201194"><title>Three-Dimensional Parameter Space</title><mediaobject><imageobject><imagedata fileref="figures/parameter-space.svg"/></imageobject><textobject><phrase>Three-Dimensional Parameter Space</phrase></textobject></mediaobject></figure> +<para>A metric expression may also identify sets of values corresponding to +one-, two-, or three-dimensional slices of this space, according to the following +rules:</para> +<orderedlist><listitem><para>A metric expression consists of a PCP metric +name, followed by optional host specifications, followed by optional instance +specifications, and finally, optional sample time specifications.</para> +</listitem><listitem><para>A host specification consists of one or more host +names, each prefixed by a colon (<literal>:</literal>). 
For example: <literal>:indy :far.away.domain.com :localhost</literal></para> +</listitem><listitem><para>A missing host specification implies the default <command>pmie</command> +source of metrics, as defined by a <literal>-h</literal> option +on the command line, or the first named archive in an <literal>-a</literal> +option on the command line, or PMCD on the local host.</para> +</listitem><listitem><para>An instance specification consists of one or more +instance names, each prefixed by a hash or pound (<literal>#</literal>) sign. +For example: <literal>#eth0 #eth2</literal></para> +<para>Recall that you can discover the instance names for a particular metric, +using the <command>pminfo</command> command. See <xref linkend="LE23271-PARENT"/>. +</para> +<para>Within the <command>pmie</command> grammar, an instance name is an identifier. +If the instance name contains characters other than alphanumeric characters, +enclose the instance name in single quotes; for example, <literal>#'/boot' #'/usr'</literal></para> +</listitem><listitem><para>A missing instance specification implies all instances +for the associated performance metric from each associated <command>pmie</command> +source of metrics.</para> +</listitem><listitem><para>A sample time specification consists of either +a single time or a range of times. A single time is represented as an at (<literal>@</literal>) followed by a natural number. A range of times is an at (<literal>@</literal>), followed by a natural number, followed by two periods (<literal>..</literal>) followed by a second natural number. The ordering of the end +points in a range is immaterial. 
For example, <literal>@0..9</literal> specifies +the last 10 sample times.</para> +</listitem><listitem><para>A missing sample time specification implies the +most recent sample time.</para> +</listitem></orderedlist> +<para>The following metric expression refers to a three-dimensional set of +values, with two hosts in one dimension, five sample times in another, and +the number of instances in the third dimension being determined by the number +of configured disk spindles on the two hosts.</para> +<literallayout class="monospaced">disk.dev.read :foo :bar @0..4</literallayout> +</section> +<section id="LE59099-PARENT"> + +<title><command>pmie</command> Rate Conversion</title> +<para><indexterm id="ITch06-14"><primary>pmie tool</primary><secondary>rate +conversion</secondary></indexterm><indexterm id="IG31371888198"><primary>rate conversion</primary> +</indexterm>Many of the metrics delivered by PCP are cumulative counters. +Consider the following metric:</para> +<literallayout class="monospaced">disk.all.total </literallayout> +<para>A single value for this metric tells you only that a certain number +of disk I/O operations have occurred since boot time, and that information +may be invalid if the counter has exceeded its 32-bit range and wrapped. You +need at least two values, sampled at known times, to compute the recent rate +at which the I/O operations are being executed. The required syntax would +be this:</para> +<literallayout class="monospaced">(disk.all.total @0 - disk.all.total @1) / delta </literallayout> +<para>The accuracy of <literal>delta</literal> as a measure of actual inter-sample +delay is an issue. <command>pmie</command> requests samples, at intervals +of approximately <literal>delta</literal>, while the results exported from +PMCD are time stamped with the high-resolution system clock time when the +samples were extracted. 
For these reasons, a built-in and implicit rate conversion +using accurate time stamps is provided by <command>pmie</command> for performance +metrics that have counter semantics. For example, the following expression +is unconditionally converted to a rate by <command>pmie</command>.</para> +<literallayout class="monospaced">disk.all.total </literallayout> +</section> +<section id="id5201567"> + +<title><command>pmie</command> Arithmetic Expressions</title> +<para><indexterm id="ITch06-15"><primary>pmie tool</primary><secondary>arithmetic +expressions</secondary></indexterm><indexterm id="IG31371888199"><primary>arithmetic expressions +</primary></indexterm>Within <command>pmie</command>, simple arithmetic expressions +are constructed from metrics expressions (see <xref linkend="LE73508-PARENT"/>) +and numeric constants, using all of the arithmetic operators and precedence +rules of the C programming language.</para> +<para>All <command>pmie</command> arithmetic is performed in double precision. 
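+As a minimal sketch (the label <literal>pct_rd</literal> is an illustrative name only, chosen here to mirror the earlier <literal>pct_wrt</literal> example), the following expression combines two metric expressions and a numeric constant to give the percentage of disk operations that are reads:<literallayout class="monospaced">pct_rd = (disk.all.read / disk.all.total) * 100;</literallayout>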
+</para> +<para><xref linkend="LE87294-PARENT"/>, describes additional operators that +may be used for aggregate operations to reduce the dimensionality of an arithmetic +expression.</para> +</section> +<section id="id5201630"> + +<title><command>pmie</command> Logical Expressions</title> +<para><indexterm id="ITch06-16"><primary>pmie tool</primary><secondary>logical +expressions</secondary></indexterm><indexterm id="IG31371888200"><primary>logical expressions +</primary></indexterm>A number of logical expression types are supported: +</para> +<itemizedlist> +<listitem><para>Logical constants</para> +</listitem> +<listitem><para>Relational expressions</para> +</listitem> +<listitem><para>Boolean expressions</para> +</listitem> +<listitem><para>Quantification operators</para> +</listitem></itemizedlist> +<section id="id5201707"> + +<title>Logical Constants</title> +<para><indexterm id="IG31371888201"><primary>logical constants</primary></indexterm>As in the +C programming language, <command>pmie</command> interprets an arithmetic value +of zero as false and all other arithmetic values as true. +</para> +</section> +<section id="id5201731"> + +<title>Relational Expressions</title> +<para><indexterm id="IG31371888202"><primary>relational expressions</primary></indexterm>Relational +expressions are the simplest form of logical expression, in which values may +be derived from arithmetic expressions using <command>pmie</command> relational +operators. 
For example, the following is a relational expression that is true +or false, depending on the aggregate total of disk read operations per second +being greater than 50.</para> +<literallayout class="monospaced">disk.all.read > 50 count/sec</literallayout> +<para>All of the relational logical operators and precedence rules of the +C programming language are supported in <command>pmie</command>.</para> +<para>As described in <xref linkend="LE73508-PARENT"/>, arithmetic expressions +in <command>pmie</command> may assume set values. The relational operators +are also required to take constant, singleton, and set-valued expressions +as arguments. The result has the same dimensionality as the operands. Suppose +the rule in <xref linkend="Z983832287sdc"/> is given:</para> +<example id="Z983832287sdc"> +<title>Relational Expressions +</title> +<literallayout class="monospaced"><userinput>hosts = ":gonzo";</userinput> +<userinput>intfs = "#eth0 #eth2";</userinput> +<userinput>all_intf = network.interface.in.packets</userinput> + <userinput>$hosts $intfs @0..2 > 300 count/sec;</userinput></literallayout> +<para>Then the execution of <command>pmie</command> may proceed as follows: +</para> +<literallayout class="monospaced"><userinput>pmie -V uag.11</userinput> +all_intf: + gonzo: [eth0] ? ? ? + gonzo: [eth2] ? ? ? +all_intf: + gonzo: [eth0] false ? ? + gonzo: [eth2] false ? ? +all_intf: + gonzo: [eth0] true false ? + gonzo: [eth2] false false ? +all_intf: + gonzo: [eth0] true true false + gonzo: [eth2] false false false</literallayout> +</example> +<para>At each sample, the relational operator greater than (>) produces six +truth values for the cross-product of the <literal>instance</literal> and <literal>sample time</literal> dimensions.</para> +<para><xref linkend="LE97708-PARENT"/>, describes additional logical operators +that may be used to reduce the dimensionality of a relational expression. 
+</para> +</section> +<section id="id5201927"> + +<title>Boolean Expressions</title> +<para><indexterm id="IG31371888203"><primary>Boolean expressions</primary></indexterm>The regular +Boolean operators from the C programming language are supported: conjunction +(<literal>&&</literal>), disjunction (<literal>||</literal>) and negation +(<literal>!</literal>).</para> +<para>As with the relational operators, the Boolean operators accommodate +set-valued operands, and set-valued results.</para> +</section> +<section id="LE97708-PARENT"> + +<title>Quantification Operators</title> +<para><indexterm id="IG31371888204"><primary>quantification operators</primary></indexterm><indexterm id="IG31371888205"> +<primary>operators</primary></indexterm>Boolean and relational operators may +accept set-valued operands and produce set-valued results. In many cases, +rules that are appropriate for performance management require a set of truth +values to be reduced along one or more of the dimensions of hosts, instances, +and sample times described in <xref linkend="LE73508-PARENT"/>. +The <command>pmie</command> quantification operators perform this function.</para> +<para>Each quantification operator takes a one-, two-, or three-dimension +set of truth values as an operand, and reduces it to a set of smaller dimension, +by quantification along a single dimension. 
For example, suppose the expression +in the previous example is simplified and prefixed by <literal>some_sample</literal>, to produce the following expression:</para> +<literallayout class="monospaced"><userinput>intfs = "#eth0 #eth2";</userinput>  +<userinput>all_intf = some_sample network.interface.in.packets</userinput> + <userinput>$intfs @0..2 > 300 count/sec;</userinput></literallayout> +<para>Then the expression result is reduced from six values to two (one per +interface instance), such that the result for a particular instance will be +false unless the relational expression for the same interface instance is +true for at least one of the preceding three sample times.</para> +<para>There are existential, universal, and percentile quantification operators +in each of the <replaceable>host</replaceable>, <replaceable>instance</replaceable>, +and <replaceable>sample time</replaceable> dimensions to produce the nine +operators as follows:</para> +<variablelist> +<varlistentry> +<term><literal>some_host</literal></term> +<listitem><para>True if the expression is true for at least one <replaceable>host</replaceable> +for the same <replaceable>instance</replaceable> and <literal>sample time</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>all_host</literal></term> +<listitem><para>True if the expression is true for every <replaceable>host</replaceable> +for the same <replaceable>instance</replaceable> and <replaceable>sample time</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>N</replaceable><literal>%</literal><literal>_host</literal></term> +<listitem><para>True if the expression is true for at least <replaceable>N</replaceable>% +of the <replaceable>hosts</replaceable> for the same <replaceable>instance</replaceable> +and <replaceable>sample time</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>some_inst</literal></term> +<listitem><para>True if the expression is 
true for at least one <replaceable>instance</replaceable> +for the same <replaceable>host</replaceable> and <replaceable>sample time</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>all_inst</literal></term> +<listitem><para>True if the expression is true for every <replaceable>instance</replaceable> +for the same <replaceable>host</replaceable> and <replaceable>sample time</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>N</replaceable><literal>%</literal><literal>_inst</literal></term> +<listitem><para>True if the expression is true for at least <replaceable>N</replaceable>% +of the <replaceable>instances</replaceable> for the same <replaceable>host</replaceable> +and <replaceable>sample time</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>some_sample</literal></term> +<listitem><para>True if the expression is true for at least one <replaceable>sample time</replaceable> +for the same <replaceable>host</replaceable> and <replaceable>instance</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>all_sample</literal></term> +<listitem><para>True if the expression is true for every <replaceable>sample time</replaceable> +for the same <replaceable>host</replaceable> and <replaceable>instance</replaceable>.</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>N</replaceable><literal>%</literal><literal>_sample</literal></term> +<listitem><para>True if the expression is true for at least <replaceable>N</replaceable>% +of the <replaceable>sample times</replaceable> for the same <replaceable>host</replaceable> +and <replaceable>instance</replaceable>.</para> +</listitem></varlistentry> +</variablelist> +<para>These operators may be nested. 
For example, the following expression +answers the question: “Are all hosts experiencing at least 20% of their +disks busy either reading or writing?”</para> +<literallayout class="monospaced">Servers = ":moomba :babylon"; +all_host ( + 20%_inst disk.dev.read $Servers > 40 || + 20%_inst disk.dev.write $Servers > 40 +); </literallayout> +<para>The following expression is similar, but applies the quantifier to the +combined condition, so that a disk counts as busy if it is either reading or writing heavily: +</para> +<literallayout class="monospaced">all_host ( + 20%_inst ( + disk.dev.read $Servers > 40 || + disk.dev.write $Servers > 40 + ) +);</literallayout> +<note><para>To avoid confusion over precedence and scope for the quantification +operators, use explicit parentheses.</para> +</note> +<para>Two additional quantification operators are available for the instance +dimension only, namely <literal>match_inst</literal> and <literal>nomatch_inst</literal>, which take a regular expression and a Boolean expression. The result +is the Boolean AND of the expression and the result of matching (or not matching) +the associated instance name against the regular expression.</para> +<para>For example, this rule evaluates error rates on various 10BaseT Ethernet +network interfaces (such as ecN, ethN, or efN):</para> +<literallayout class="monospaced">some_inst + match_inst "^(ec|eth|ef)" + network.interface.total.errors > 10 count/sec +-> syslog "Ethernet errors:" " %i";</literallayout> +</section> +</section> +<section id="id5202519"> + +<title><command>pmie</command> Rule Expressions</title> +<para><indexterm id="IG31371888206"><primary>rule expressions</primary></indexterm>Rule expressions +for <command>pmie</command> have the following syntax:</para> +<literallayout class="monospaced">lexpr -> <replaceable>actions</replaceable> ;</literallayout> +<para>The semantics are as follows:</para> +<itemizedlist> +<listitem><para>If the logical expression <literal>lexpr</literal> evaluates to <literal>true</literal>, then perform the <replaceable>actions</replaceable> 
that follow. +Otherwise, do not perform the <replaceable>actions</replaceable>.</para> +</listitem> +<listitem><para>It is required that <literal>lexpr</literal> has a singular +truth value. Aggregation and quantification operators must have been applied +to reduce multiple truth values to a single value.</para> +</listitem> +<listitem><para>When executed, an <replaceable>action</replaceable> completes +with a success/failure status.</para> +</listitem> +<listitem><para>One or more <replaceable>actions</replaceable> may appear; +consecutive <replaceable>actions</replaceable> are separated by operators +that control the execution of subsequent <replaceable>actions</replaceable>, +as follows:</para> +<variablelist id="Z926963018sdc"> +<varlistentry> +<term><replaceable>action-1 </replaceable><literal>&</literal></term> +<listitem><para>Always execute subsequent actions (serial execution).</para> +</listitem></varlistentry> +<varlistentry> +<term><replaceable>action-1 </replaceable><userinput>|</userinput> </term> +<listitem><para>If <replaceable>action-1</replaceable> fails, execute subsequent +actions, otherwise skip the subsequent actions (alternation).</para> +</listitem></varlistentry> +</variablelist> +</listitem></itemizedlist> +<para>An <replaceable>action</replaceable> is composed of a keyword to identify +the action method, an optional <replaceable>time</replaceable> specification, +and one or more arguments.</para> +<para>A <replaceable>time</replaceable> specification uses the same syntax +as a valid time interval that may be assigned to <literal>delta</literal>, +as described in <xref linkend="LE88708-PARENT"/>. 
If the <replaceable>action</replaceable> +is executed and the <replaceable>time</replaceable> specification +is present, <command>pmie</command> will suppress any subsequent execution +of this <replaceable>action</replaceable> until the wall clock time has advanced +by <replaceable>time</replaceable>.</para> +<para>The arguments are passed directly to the action method.</para> +<para>The following action methods are provided:</para> +<variablelist> +<varlistentry> +<term><literal>shell</literal></term> +<listitem><para>The single argument is passed to the shell for execution. +This <replaceable>action</replaceable> is implemented using <literal>system</literal> in the background. The <replaceable>action</replaceable> does not +wait for the system call to return, and succeeds unless the fork fails.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>alarm</literal></term> +<listitem><para><indexterm id="IG31371888207"><primary>DISPLAY variable</primary></indexterm>A +notifier containing a time stamp, a single <replaceable>argument</replaceable> +as a message, and a <guilabel>Cancel</guilabel> button is +posted on the current display screen (as identified by the <literal>DISPLAY</literal> environment variable). Each alarm <replaceable>action</replaceable> +first checks if its notifier is already active. If there is an identical active +notifier, a duplicate notifier is not posted. The action succeeds unless the +fork fails.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>syslog</literal></term> +<listitem><para><indexterm id="IG31371888208"><primary>pmlogger tool</primary></indexterm><indexterm id="IG31371888209"> +<primary>syslog function</primary></indexterm>A message is written into the +system log. If the first word of the first argument is <literal>-p</literal>, +the second word is interpreted as the priority (see the <command>syslog(3)</command> +man page); the message tag is <literal>pcp-pmie</literal>. 
+The remaining argument is the message to be written to the system log. +This action always succeeds.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>print</literal></term> +<listitem><para><indexterm id="IG31371888210"><primary>time-stamped message</primary></indexterm>A +message containing a time stamp in <command>ctime(3)</command> format and the +argument is written to standard output (<command>stdout</command>). +This action always succeeds.</para> +</listitem></varlistentry> +</variablelist> +<para>Within the argument passed to an action method, the following expansions +are supported to allow some of the context from the logical expression on +the left to be embedded in the argument:</para> +<variablelist condition="sgi_termlength:narrow"> +<varlistentry> +<term><literal>%h</literal></term> +<listitem><para>The value of a <replaceable>host</replaceable> that makes +the expression true.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>%i</literal></term> +<listitem><para>The value of an <replaceable>instance</replaceable> that makes +the expression true.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>%v</literal></term> +<listitem><para>The value of a performance metric from the logical expression. +</para> +</listitem></varlistentry> +</variablelist> +<para><indexterm id="IG31371888211"><primary>pmie tool</primary><secondary>%-token</secondary> +</indexterm>Some ambiguity may occur with respect to which host, instance, or +performance metric is bound to a %-token. In most cases, the leftmost binding +in the top-level subexpression is used. You may need to use <command>pmie</command> +in the interactive debugging mode (specify the <literal>-d</literal> +command line option) in conjunction with the <literal>-W</literal> command +line option to discover which subexpressions contribute to the %-token bindings. 
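+As an illustrative sketch (the host names are reused from earlier examples and the message text is arbitrary), a rule such as the following uses all three tokens in one action; which values are actually substituted depends on the bindings chosen by <command>pmie</command>:<literallayout class="monospaced">some_host ( + some_inst ( disk.dev.total :moomba :babylon > 60 ) +) -> print "%h: [%i] %v transfers/sec";</literallayout>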
+</para> +<para><xref linkend="Z983833504sdc"/> illustrates some of the options when +constructing rule expressions:</para> +<example id="Z983833504sdc"> +<title>Rule Expression Options +</title> +<literallayout class="monospaced">some_inst ( disk.dev.total > 60 ) + -> syslog 10 mins "[%i] busy, %v IOPS " & + shell 1 hour "echo \ + 'Disk %i is REALLY busy. Running at %v I/Os per second' \ + | Mail -s 'pmie alarm' sysadm"; </literallayout> +</example> +<para><indexterm id="IG31371888212"><primary>system log file</primary></indexterm>In this +case, <literal>%v</literal> and <literal>%i</literal> are both associated +with the instances for the metric <literal>disk.dev.total</literal> that make +the expression true. If more than one instance makes the expression true (more +than one disk is busy), then the argument is formed by concatenating the result +from each %-token binding. The text added to the system log file +might be as shown in <xref linkend="Z983833442sdc"/>:</para> +<example id="Z983833442sdc"> +<title>System Log Text</title> +<literallayout class="monospaced">Aug 6 08:12:44 5B:gonzo pcp-pmie[3371]: + [disk1] busy, 3.7 IOPS [disk2] busy, 0.3 IOPS</literallayout> +</example> +<note><para>When <command>pmie</command> is processing performance metrics +from a PCP archive log, the <replaceable>actions</replaceable> will be processed +in the expected manner; however, the action methods are modified to report +a textual facsimile of the <replaceable>action</replaceable> on the standard +output.</para> +</note> +<para>Consider the rule in <xref linkend="Z983833364sdc"/>:</para> +<example id="Z983833364sdc"> +<title>Standard Output</title> +<literallayout class="monospaced">delta = 2 sec; // more often for demonstration purposes +percpu = "kernel.percpu"; +// Unusual usr-sys split when some CPU is more than 20% in usr mode +// and sys mode is at least 1.5 times usr mode +// +cpu_usr_sys = some_inst ( + $percpu.cpu.sys > $percpu.cpu.user * 1.5 && + $percpu.cpu.user 
> 0.2 + ) -> alarm "Unusual sys time: " "%i "; </literallayout> +<para>When evaluated against an archive, the following output is generated +(the alarm action produces a message on standard output):</para> +<literallayout class="monospaced"><userinput>pmafm ${HOME}/f4 pmie cpu.head cpu.00</userinput> +alarm Wed Aug 7 14:54:48 2012: Unusual sys time: cpu0 +alarm Wed Aug 7 14:54:50 2012: Unusual sys time: cpu0 +alarm Wed Aug 7 14:54:52 2012: Unusual sys time: cpu0 +alarm Wed Aug 7 14:55:02 2012: Unusual sys time: cpu0 +alarm Wed Aug 7 14:55:06 2012: Unusual sys time: cpu0 </literallayout> +</example> +</section> +<section id="LE87294-PARENT"> + +<title><command>pmie</command> Intrinsic Operators</title> +<para><indexterm id="ITch06-17"><primary>pmie tool</primary><secondary>intrinsic +operators</secondary></indexterm><indexterm id="IG31371888213"><primary>intrinsic operators</primary> +</indexterm>The following sections describe some other useful intrinsic operators +for <command>pmie</command>. These operators are divided into three groups: +</para> +<itemizedlist> +<listitem><para>Arithmetic aggregation</para> +</listitem> +<listitem><para>The <literal>rate</literal> operator</para> +</listitem> +<listitem><para>Transitional operators</para> +</listitem></itemizedlist> +<section id="id5203384"> + +<title>Arithmetic Aggregation</title> +<para><indexterm id="IG31371888214"><primary>pmie tool</primary><secondary>arithmetic aggregation +</secondary></indexterm><indexterm id="IG31371888215"><primary>arithmetic aggregation</primary> +</indexterm>For set-valued arithmetic expressions, the following operators +reduce the dimensionality of the result by arithmetic aggregation along one +of the <replaceable>host</replaceable>, <replaceable>instance</replaceable>, +or <replaceable>sample time</replaceable> dimensions. 
For example, to aggregate +in the <replaceable>host</replaceable> dimension, the following operators +are provided:</para> +<variablelist> +<varlistentry> +<term><literal>avg_host</literal></term> +<listitem><para><indexterm id="IG31371888216"><primary>avg_host operator</primary></indexterm>Computes +the average value across all <replaceable>hosts</replaceable> for the +same <replaceable>instance</replaceable> and <replaceable>sample time</replaceable></para> +</listitem></varlistentry> +<varlistentry> +<term><literal>sum_host</literal></term> +<listitem><para><indexterm id="IG31371888217"><primary>sum_host operator</primary></indexterm>Computes +the total value across all <replaceable>hosts</replaceable> for the same +<replaceable>instance</replaceable> and <replaceable>sample time</replaceable></para> +</listitem></varlistentry> +<varlistentry> +<term><literal>count_host</literal></term> +<listitem><para><indexterm id="IG31371888218"><primary>count_host operator</primary></indexterm>Computes +the number of values across all <replaceable>hosts</replaceable> for the +same <replaceable>instance</replaceable> and <replaceable>sample time</replaceable></para> +</listitem></varlistentry> +<varlistentry> +<term><literal>min_host</literal></term> +<listitem><para><indexterm id="IG31371888219"><primary>min_host operator</primary></indexterm>Computes +the minimum value across all <replaceable>hosts</replaceable> for the +same <replaceable>instance</replaceable> and <replaceable>sample time</replaceable></para> +</listitem></varlistentry> +<varlistentry> +<term><literal>max_host</literal></term> +<listitem><para><indexterm id="IG31371888220"><primary>max_host operator</primary></indexterm>Computes +the maximum value across all <replaceable>hosts</replaceable> for the +same <replaceable>instance</replaceable> and <replaceable>sample time</replaceable></para> +</listitem></varlistentry> +</variablelist> +<para><indexterm id="IG31371888221"><primary>*_inst 
operator</primary></indexterm><indexterm id="IG31371888222"> +<primary>*_sample operator</primary></indexterm>Ten additional operators correspond +to the forms <literal>*_inst</literal> and <literal>*_sample</literal>.</para> +<para>The following example illustrates the use of an aggregate operator in +combination with an existential operator to answer the question “Does +some host currently have two or more busy processors?”</para> +<literallayout class="monospaced">// note '' to escape - in host name +poke = ":moomba :'mac-larry' :bitbucket"; +some_host ( + count_inst ( kernel.percpu.cpu.user $poke + + kernel.percpu.cpu.sys $poke > 0.7 ) >= 2 + ) + -> alarm "2 or more busy CPUs"; </literallayout> +</section> +<section id="id5203682"> + +<title>The <command>rate</command> Operator</title> +<para><indexterm id="IG31371888223"><primary>pmie tool</primary><secondary>rate operator</secondary> +</indexterm><indexterm id="IG31371888224"><primary>rate operator</primary></indexterm>The <literal>rate</literal> operator computes the rate of change of an arithmetic expression +as shown in the following example:</para> +<literallayout class="monospaced">rate mem.util.cached </literallayout> +<para>It returns the rate of change for the <literal>mem.util.cached</literal> +performance metric; that is, the rate at which page cache memory is being +allocated and released.</para> +<para>The <literal>rate</literal> intrinsic operator is most useful for metrics +with instantaneous value semantics. 
For metrics with counter semantics, <command>pmie</command> +already performs an implicit rate calculation (see <xref linkend="LE59099-PARENT"/>) and the <literal>rate</literal> operator would +produce the second derivative with respect to time, which is less likely to +be useful.</para> +</section> +<section id="id5203810"> + +<title>Transitional Operators</title> +<para><indexterm id="IG31371888225"><primary>pmie tool</primary><secondary>transitional operators +</secondary></indexterm><indexterm id="IG31371888226"><primary>transitional operators</primary> +</indexterm>In some cases, an action needs to be triggered when an expression +changes from true to false or vice versa. The following operators take a logical +expression as an operand, and return a logical expression:</para> +<variablelist> +<varlistentry> +<term><literal>rising</literal></term> +<listitem><para>Has the value <literal>true</literal> when the operand transitions +from <literal>false</literal> to <literal>true</literal> in consecutive samples. +</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>falling</literal></term> +<listitem><para>Has the value <literal>true</literal> when the operand transitions +from <literal>true</literal> to <literal>false</literal> in consecutive samples. 
+</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +</section> +<section id="LE60280-PARENT"> + +<title><command>pmie</command> Examples</title> +<para><indexterm id="ITch06-18"><primary>pmie tool</primary><secondary>real +examples</secondary></indexterm>The examples presented in this section are +task-oriented and use the full power of the <command>pmie</command> specification +language as described in <xref linkend="LE90227-PARENT"/>.</para> +<para><indexterm id="IG31371888228"><primary>${PCP_DEMOS_DIR}</primary></indexterm>Source code for the <command>pmie</command> +examples in this chapter, and many more examples, is provided within the +<citetitle>PCP Tutorials and Case Studies</citetitle>. <xref linkend="Z928441343sdc"/> and <xref linkend="Z928441176sdc"/> +illustrate monitoring CPU utilization and disk activity.</para> +<example id="Z928441343sdc"> +<title>Monitoring CPU Utilization</title> +<literallayout class="monospaced">// Some Common Performance Monitoring Scenarios +// +// The CPU Group +// +delta = 2 sec; // more often for demonstration purposes +// common prefixes +// +percpu = "kernel.percpu"; +all = "kernel.all"; +// Unusual usr-sys split when some CPU is more than 20% in usr mode +// and sys mode is at least 1.5 times usr mode +// +cpu_usr_sys = + some_inst ( + $percpu.cpu.sys > $percpu.cpu.user * 1.5 && + $percpu.cpu.user > 0.2 + ) + -> alarm "Unusual sys time: " "%i "; +// Over all CPUs, syscall_rate > 1000 * no_of_cpus +// +cpu_syscall = + $all.syscall > 1000 count/sec * hinv.ncpu + -> print "high aggregate syscalls: %v"; +// Sustained high syscall rate on a single CPU +// +delta = 30 sec; +percpu_syscall = + some_inst ( + $percpu.syscall > 2000 count/sec + ) + -> syslog "Sustained syscalls per second? " "[%i] %v "; +// the 1 minute load average exceeds 5 * number of CPUs on any host +hosts = ":gonzo :moomba"; // change as required +delta = 1 minute; // no need to evaluate more often than this +high_load = + some_host ( + $all.load $hosts #'1 minute' > 5 * hinv.ncpu + ) + -> alarm "High Load Average? " "%h: %v ";</literallayout> +</example> +<example id="Z928441176sdc"> +<title>Monitoring Disk Activity +</title> +<literallayout class="monospaced">// Some Common Performance Monitoring Scenarios +// +// The Disk Group +// +delta = 15 sec; // often enough for disks? +// common prefixes +// +disk = "disk"; +// Any disk performing more than 40 I/Os per second, sustained over +// at least 30 seconds is probably busy +// +delta = 30 seconds; +disk_busy = + some_inst ( + $disk.dev.total > 40 count/sec + ) + -> shell "Mail -s 'Heavy sustained disk traffic' sysadm"; +// Try and catch bursts of activity ... more than 60 I/Os per second +// for at least 25% of 8 consecutive 3 second samples +// +delta = 3 sec; +disk_burst = + some_inst ( + 25%_sample ( + $disk.dev.total @0..7 > 60 count/sec + ) + ) + -> alarm "Disk Burst? 
" "%i "; +// any SCSI disk controller performing more than 3 Mbytes per +// second is busy +// Note: the obscure 512 is to convert blocks/sec to byte/sec, +// and pmie handles the rest of the scale conversion +// +some_inst $disk.ctl.blktotal * 512 > 3 Mbyte/sec + -> alarm "Busy Disk Controller: " "%i ";</literallayout> +</example> +</section> +<section id="LE31514-PARENT"> + +<title>Developing and Debugging <command>pmie</command> +Rules</title> +<para> <indexterm id="ITch06-20"><primary>pmie tool</primary><secondary>developing +rules</secondary></indexterm>Given the <literal>-d</literal> command line +option, <command>pmie</command> executes in interactive mode, and the user +is presented with a menu of options:</para> +<literallayout class="monospaced">pmie debugger commands + f [file-name] - load expressions from given file or stdin + l [expr-name] - list named expression or all expressions + r [interval] - run for given or default interval + S time-spec - set start time for run + T time-spec - set default interval for run command + v [expr-name] - print subexpression for %h, %i and %v bindings + h or ? - print this menu of commands + q - quit +pmie> </literallayout> +<para>If both the <literal>-d</literal> option and a filename are present, +the expressions in the given file are loaded before entering interactive mode. 
+Interactive mode is useful for debugging new rules.</para> +</section> +<section id="LE91221-PARENT"> + +<title>Caveats and Notes on <command>pmie</command></title> +<para><indexterm id="IG31371888229"><primary>caveats</primary></indexterm>The following sections +provide important information for users of <command>pmie</command>.</para> +<section id="id5204255"> + +<title>Performance Metrics Wraparound</title> +<para><indexterm id="ITch06-21"><primary>performance metric wraparound</primary> +</indexterm><indexterm id="IG31371888230"><primary>PCP_COUNTER_WRAP variable</primary></indexterm><indexterm id="IG31371888231"> +<primary>metric wraparound</primary></indexterm>Performance metrics that are +cumulative counters may occasionally overflow their range and wrap around to +0. When this happens, an unknown value (printed as <literal>?</literal>) is +returned as the value of the metric for one sample (recall that the value +returned is normally a rate). You can have PCP interpolate a value based on +the expected rate of change by setting the <literal>PCP_COUNTER_WRAP</literal> +environment variable.</para> +</section> +<section id="id5204304"> + +<title><command>pmie</command> Sample Intervals</title> +<para><indexterm id="ITch06-22"><primary>pmie tool</primary><secondary>sample +intervals</secondary></indexterm><indexterm id="IG31371888232"><primary>sample intervals</primary> +</indexterm>The sample interval (<literal>delta</literal>) should always be +long enough, particularly in the case of rates, to ensure that a meaningful +value is computed. The interval may vary according to the metric and your needs. +A reasonable minimum is in the range of ten seconds to several minutes. 
Although +PCP supports sampling rates up to hundreds of times per second, using +small sample intervals creates unnecessary load on the monitored system.</para> +</section> +<section id="id5204396"> + +<title><command>pmie</command> Instance Names</title> +<para><indexterm id="ITch06-23"><primary>pmie tool</primary><secondary>instance +names</secondary></indexterm>When you specify a metric instance name (<literal>#</literal><replaceable>identifier</replaceable>) in a <command>pmie</command> +expression, it is compared against the instance name looked up from either +a live collector system or an archive as follows:</para> +<itemizedlist> +<listitem><para>If the given instance name and the looked up name are the same, +they are considered to match.</para> +</listitem> +<listitem><para>Otherwise, the first two space separated tokens are extracted +from the looked up name. If the given instance name is the same as either of these +tokens, they are considered a match.</para> +</listitem></itemizedlist> +<para>For some metrics, notably the per process (<literal>proc.xxx.xxx</literal>) +metrics, the first token in the looked up instance name is impossible to determine +at the time you are writing <command>pmie</command> expressions. The above +policy circumvents this problem.</para> +</section> +<section id="id5204468"> + +<title><command>pmie</command> Error Detection</title> +<para><indexterm id="ITch06-24"><primary>pmie tool</primary><secondary>error +detection</secondary></indexterm><indexterm id="IG31371888233"><primary>error detection</primary> +</indexterm>The parser used in <command>pmie</command> is not particularly robust +in handling syntax errors. 
It is suggested that you check any problematic +expressions individually in interactive mode:</para> +<literallayout class="monospaced"><userinput>pmie -v -d</userinput> +pmie> f +<replaceable>expression</replaceable> +<userinput>Ctrl+D</userinput></literallayout> +<para>If the expression was parsed, its internal representation is shown: +</para> +<literallayout class="monospaced">pmie> <userinput>l</userinput></literallayout> +<para>The expression is evaluated twice and its value printed:</para> +<literallayout class="monospaced">pmie> <userinput>r 10sec</userinput></literallayout> +<para>Then quit:</para> +<literallayout class="monospaced">pmie> <userinput>q</userinput></literallayout> +<para>It is not always possible to detect semantic errors at parse time. This +happens when a performance metric descriptor is not available from the named +host at this time. A warning is issued, and the expression is put on a wait +list. The wait list is checked periodically (about every five minutes) to +see if the metric descriptor has become available. If an error is detected +at this time, a message is printed to the standard error stream +(<command>stderr</command>) and the offending expression is set aside.</para> +</section> +</section> +<section id="Z927039566sdc"> + +<title>Creating <command>pmie</command> Rules with <command>pmieconf</command></title> +<para><indexterm id="IG31371888234"><primary>pmie tool</primary><secondary>pmieconf rules</secondary> +</indexterm><indexterm id="IG31371888235"><primary>pmieconf tool</primary><secondary>rules</secondary> +</indexterm>The <command>pmieconf</command> tool is a command line utility +that is designed to aid the specification of <command>pmie</command> rules +from parameterized versions of the rules. 
<command>pmieconf</command> is used +to display and modify variables or parameters controlling the details of the +generated <command>pmie</command> rules.</para> +<para><command>pmieconf</command> reads two different forms of supplied input +files and produces a localized <command>pmie</command> configuration file +as its output.</para> +<para>The first input form is a generalized <command>pmie</command> rule file +such as those found below <filename>${PCP_VAR_DIR}/config/pmieconf</filename>. +These files contain the generalized rules which <command>pmieconf</command> +is able to manipulate. Each of the rules can be enabled or disabled, or the +individual variables associated with each rule can be edited.</para> +<para>The second form is an actual <command>pmie</command> configuration file +(that is, a file which can be interpreted by <command>pmie</command>, conforming +to the <command>pmie</command> syntax described in <xref linkend="LE90227-PARENT"/>). +This file is both input to and output from <command>pmieconf</command>.</para> +<para>The input version of the file contains any changed variables or rule +states from previous invocations of <command>pmieconf</command>, and the output +version contains both the changes in state (for any subsequent <command>pmieconf +</command> sessions) and the generated <command>pmie</command> syntax. The +<command>pmieconf</command> state is embedded within a <command>pmie</command> comment +block at the head of the output file and is not interpreted by <command>pmie</command> +itself.</para> +<para><indexterm id="IG31371888236"><primary>pmie tool</primary><secondary>procedures</secondary> +</indexterm><command>pmieconf</command> is an integral part of the <command>pmie</command> +daemon management process described in <xref linkend="Z927039824sdc"/>. 
<xref linkend="Z930357839sdc"/> and <xref linkend="Z930357878sdc"/> introduce the <command>pmieconf</command> tool through a series of typical operations.</para> +<procedure id="Z930357839sdc"> +<title>Display <command>pmieconf</command> Rules</title> +<step><para>Start <command>pmieconf</command> interactively (as the superuser).<literallayout class="monospaced"><userinput>pmieconf -f ${PCP_SYSCONF_DIR}/pmie/config.demo</userinput> +Updates will be made to ${PCP_SYSCONF_DIR}/pmie/config.demo + +pmieconf></literallayout></para> +</step><step><para>List the set of available <command>pmieconf</command> +rules by using the <command>rules</command> command.</para> +</step><step><para>List the set of rule groups using the <command>groups</command> command.</para> +</step><step><para>List only the enabled rules, using the <command>rules enabled</command> command.</para> +</step><step><para>List a single rule:</para> +<literallayout class="monospaced">pmieconf> <userinput>list memory.swap_low</userinput> + rule: memory.swap_low [Low free swap space] + help: There is only threshold percent swap space remaining - the system + may soon run out of virtual memory. Reduce the number and size of + the running programs or add more swap(1) space before it +completely + runs out. 
+ predicate = + some_host ( + ( 100 * ( swap.free $hosts$ / swap.length $hosts$ ) ) + < $threshold$ + && swap.length $hosts$ > 0 // ensure swap in use + ) + vars: enabled = no + threshold = 10% + +pmieconf></literallayout> +</step><step><para>List one rule variable:<literallayout class="monospaced">pmieconf> <userinput>list memory.swap_low threshold</userinput> + rule: memory.swap_low [Low free swap space] + threshold = 10% + +pmieconf></literallayout></para> +</step> +</procedure> +<procedure id="Z930357878sdc"> +<title>Modify <command>pmieconf</command> Rules and Generate a <command>pmie</command> File</title> +<step><para>Lower the threshold for the <literal>memory.swap_low</literal> rule, and also change the <command>pmie</command> sample interval +affecting just this rule. The <literal>delta</literal> variable is special +in that it is not associated with any particular rule; it has been defined +as a global <command>pmieconf</command> variable. Global variables can be +displayed using the <command>list global</command> command to <command>pmieconf</command>, +and can be modified either globally or local to a specific rule. 
+<literallayout class="monospaced">pmieconf> <userinput>modify memory.swap_low threshold 5</userinput> + +pmieconf> <userinput>modify memory.swap_low delta "1 sec"</userinput> + +pmieconf></literallayout></para> +</step><step><para>Disable all of the rules except for the <literal>memory.swap_low</literal> rule so that you can see the effects of your change +in isolation.</para> +<para>This produces a relatively simple <command>pmie</command> configuration +file:<literallayout class="monospaced">pmieconf> <userinput>disable all</userinput> + +pmieconf> <userinput>enable memory.swap_low</userinput> + +pmieconf> <userinput>status</userinput> + verbose: off + enabled rules: 1 of 35 + pmie configuration file: ${PCP_SYSCONF_DIR}/pmie/config.demo + pmie processes (PIDs) using this file: (none found) + +pmieconf> quit</literallayout></para> +<para>You can also use the <command>status</command> command to verify that +only one rule is enabled at the end of this step.</para> +<para/> +</step><step id="Z930357553sdc"><para>Run <command>pmie</command> +with the new configuration file. 
Use a text editor to view the newly generated +<command>pmie</command> configuration file (<filename>${PCP_SYSCONF_DIR}/pmie/config.demo</filename>), and +then run the command:<literallayout class="monospaced"><userinput>pmie -T "1.5 sec" -v -l ${HOME}/demo.log ${PCP_SYSCONF_DIR}/pmie/config.demo</userinput> +memory.swap_low: false + +memory.swap_low: false + +<userinput>cat ${HOME}/demo.log</userinput> +Log for pmie on <replaceable>venus</replaceable> started Mon Jun 21 16:26:06 2012 + +pmie: PID = 21847, default host = <replaceable>venus</replaceable> + +[Mon Jun 21 16:26:07] pmie(21847) Info: evaluator exiting + +Log finished Mon Jun 21 16:26:07 2012</literallayout></para> +</step><step><para>Notice that both of the <command>pmieconf</command> +files used in the previous step are simple text files, as described in the <command>pmieconf(5)</command> man page:</para> +<literallayout class="monospaced"><userinput>file ${PCP_SYSCONF_DIR}/pmie/config.demo</userinput> +${PCP_SYSCONF_DIR}/pmie/config.demo: PCP pmie config (V.1) +<userinput>file ${PCP_VAR_DIR}/config/pmieconf/memory/swap_low</userinput> +${PCP_VAR_DIR}/config/pmieconf/memory/swap_low: PCP pmieconf rules (V.1)</literallayout> +</step> +</procedure> +</section> +<section id="Z927039824sdc"> + +<title>Management of <command>pmie</command> Processes</title> +<para><indexterm id="IG31371888237"><primary>pmie tool</primary><secondary>procedures</secondary> +</indexterm>The <command>pmie</command> process can be run as a daemon as +part of the system startup sequence, and can thus be used to perform automated, +live performance monitoring of a running system. 
To do this, run these commands +(as superuser):</para> +<literallayout class="monospaced"><userinput>chkconfig pmie on</userinput> +<userinput>${PCP_RC_DIR}/pmie start</userinput></literallayout> +<para>By default, these enable a single <command>pmie</command> process monitoring +the local host, with the default set of <command>pmieconf</command> rules +enabled (for more information about <command>pmieconf</command>, see <xref linkend="Z927039566sdc"/>). +<xref linkend="Z930363467sdc"/> illustrates how +you can use these commands to start any number of <command>pmie</command> +processes to monitor local or remote machines.</para> +<procedure id="Z930363467sdc"> +<title>Add a New <command>pmie</command> Instance to the <command>pmie</command> Daemon Management Framework</title> +<step><para>Use a text editor (as superuser) to edit the <command>pmie</command> +control file <filename>${PCP_PMIECONTROL_PATH}</filename>. +Notice the default entry toward the end of the file, which looks like this: +</para> +<literallayout class="monospaced">#Host S? Log File Arguments +LOCALHOSTNAME n PCP_LOG_DIR/pmie/LOCALHOSTNAME/pmie.log -c config.default</literallayout> +<para>This entry is used to enable a local <command>pmie</command> process. +Add a new entry for a remote host on your local network (for example, <literal>venus</literal>), by using your <command>pmie</command> configuration file +(see <xref linkend="Z927039566sdc"/>):</para> +<literallayout class="monospaced"> +#Host S? 
Log File Arguments +venus n PCP_LOG_DIR/pmie/venus/pmie.log -c config.demo +</literallayout> +<note><para>Without an absolute path, the configuration file (<literal>-c</literal> above) +will be resolved using <filename>${PCP_SYSCONF_DIR}/pmie</filename> - if <filename>config.demo</filename> +was created in <xref linkend="Z930357878sdc"/> it would be used here for host <literal>venus</literal>, +otherwise a new configuration file will be generated using the default rules +(at <filename>${PCP_SYSCONF_DIR}/pmie/config.demo</filename>).</para> +</note> +</step><step><para>Enable <command>pmie</command> daemon management: <literallayout class="monospaced"><userinput>chkconfig pmie on</userinput></literallayout></para> +<para>This simple step allows <command>pmie</command> to be started as part +of your machine's boot process.</para> +</step><step><para>Start the two <command>pmie</command> daemons. +At the end of this step, you should see two new <command>pmie</command> processes +monitoring the local and remote hosts:</para> +<literallayout class="monospaced">${PCP_RC_DIR}/pmie start +Performance Co-Pilot starting inference engine(s) ... +</literallayout> +<para>Wait a few moments while the startup scripts run. 
The <command>pmie</command> +start script uses the <command>pmie_check</command> script to do most of its work.</para> +<para>Verify that the <command>pmie</command> processes have started:</para> +<literallayout class="monospaced"><userinput>pcp</userinput> +Performance Co-Pilot configuration on pluto: + + platform: Linux pluto 3.10.0-0.rc7.64.el7.x86_64 #1 SMP + hardware: 8 cpus, 2 disks, 23960MB RAM + timezone: EST-10 + pmcd: Version 3.8.3-1, 8 agents + pmda: pmcd proc xfs linux mmv infiniband gluster elasticsearch<replaceable> + pmie: pluto: ${PCP_LOG_DIR}/pmie/pluto/pmie.log + venus: ${PCP_LOG_DIR}/pmie/venus/pmie.log</replaceable></literallayout> +</step> +</procedure> +<para>If a remote host is not up at the time when <command>pmie</command> +is started, the <command>pmie</command> process may exit. <command>pmie</command> +processes may also exit if the local machine is starved of memory resources. +To counter these adverse cases, it can be useful to have a <command>crontab</command> +entry running. Adding an entry as shown in <xref linkend="id5208584"/> +ensures that if one of the configured <command>pmie</command> processes exits, +it is automatically restarted.</para> +<note><para>Depending on your platform, the <command>crontab</command> entry discussed +here may already have been installed for you, as part of the package installation process. +In this case, the file <filename>/etc/cron.d/pcp-pmie</filename> will exist, and the rest +of this section can be skipped.</para> +</note> + +<section id="id5208584"> +<title>Add a <command>pmie</command> <command>crontab</command> Entry</title> +<para>To activate the maintenance and housekeeping scripts for a collection +of inference engines, execute the following tasks while logged into the local host +as the superuser (<literal>root</literal>):</para> +<orderedlist><listitem><para>Augment the <filename>crontab</filename> file +for the <literal>pcp</literal> user. 
For example:</para> +<literallayout class="monospaced"><userinput>crontab -l -u pcp > ${HOME}/crontab.txt</userinput></literallayout> +</listitem><listitem><para>Edit <filename>${HOME}/crontab.txt</filename>, adding lines +similar to those from the sample <filename>${PCP_VAR_DIR}/config/pmie/crontab</filename> +file for <literal>pmie_daily</literal> and <literal>pmie_check</literal>; +for example:</para> +<literallayout class="monospaced"># daily processing of pmie logs +10 0 * * * ${PCP_BINADM_DIR}/pmie_daily +# every 30 minutes, check pmie instances are running +25,55 * * * * ${PCP_BINADM_DIR}/pmie_check</literallayout> +</listitem><listitem><para>Make these changes permanent with this command: +</para> +<literallayout class="monospaced"><userinput>crontab -u pcp < ${HOME}/crontab.txt</userinput></literallayout> +</listitem></orderedlist> +</section> +<section id="id5205585"> + +<title>Global Files and Directories</title> +<para><indexterm id="IG31371888238"><primary>pmie tool</primary><secondary>global files and directories +</secondary></indexterm>The following global files and directories influence +the behavior of <command>pmie</command> and the <command>pmie</command> management +scripts:</para> +<variablelist condition="sgi_termlength:nextline" id="Z930361086sdc"> +<varlistentry> +<term><filename>${PCP_DEMOS_DIR}/pmie/*</filename></term> +<listitem><para>Contains sample <command>pmie</command> rules that may be +used as a basis for developing local rules.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_SYSCONF_DIR}/pmie/config.default</filename></term> +<listitem><para>Is the default <command>pmie</command> configuration file +that is used when the <command>pmie</command> daemon facility is enabled. +It is generated by <command>pmieconf</command> if it has not been set up manually beforehand. 
+</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}/config/pmieconf/*/*</filename></term> +<listitem><para>Contains the <command>pmieconf</command> rule definitions +(templates) in its subdirectories.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_PMIECONTROL_PATH}</filename></term> +<listitem><para>Defines which PCP collector hosts require a daemon <command>pmie</command> +to be launched on the local host, where the configuration file +comes from, where the <command>pmie</command> log file should be created, +and <command>pmie</command> startup options.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}/config/pmie/crontab</filename></term> +<listitem><para>Contains default <command>crontab</command> entries that +may be merged with the <command>crontab</command> entries for the <literal>pcp</literal> user to schedule +the periodic execution of the <command>pmie_check</command> script, for verifying +that <command>pmie</command> instances are running. +This file is needed only on platforms where a default <command>crontab</command> is not automatically +installed during the initial PCP package installation.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_LOG_DIR}/pmie/*</filename></term> +<listitem><para>Contains the <command>pmie</command> log files for the host. 
+These files are created by the default behavior of the <filename>${PCP_RC_DIR}/pmie</filename> +startup scripts.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5205812"> + +<title><command>pmie</command> Instances and Their Progress</title> +<para>The PMCD PMDA exports information about executing <command>pmie</command> +instances and their progress in terms of rule evaluations and action execution +rates.</para> +<variablelist condition="sgi_termlength:wide" id="Z929060000sdc"> +<varlistentry> +<term><command>pmie_check</command></term> +<listitem><para>This command is similar to the <command>pmlogger</command> +support script, <command>pmlogger_check</command>.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_RC_DIR}/pmie</filename></term> +<listitem><para>This start script supports the starting and stopping of multiple +<command>pmie</command> instances that are monitoring one or more hosts.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_TMP_DIR}/pmie</filename></term> +<listitem><para>The statistics that <command>pmie</command> gathers are maintained +in binary data structure files. These files are located in this directory.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmcd.pmie</literal> metrics</term> +<listitem><para>If <command>pmie</command> is running on a system with a PCP +collector deployment, the PMCD PMDA exports these metrics via the +<filename>pmcd.pmie</filename> group of metrics.</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +</chapter> + + +<chapter id="LE93354-PARENT"> + +<title>Archive Logging</title> +<para><indexterm id="IG31371888239"><primary>archive logs</primary><secondary>usage</secondary> +</indexterm>Performance monitoring and management in complex systems demands +the ability to accurately capture performance characteristics for subsequent +review, analysis, and comparison. 
Performance Co-Pilot (PCP) provides extensive +support for the creation and management of archive logs that capture a user-specified +profile of performance information to support retrospective performance analysis. +</para> +<para>The following major sections are included in this chapter:</para> +<itemizedlist> +<listitem><para><xref linkend="LE43411-PARENT"/>, presents the concepts and +issues involved with creating and using archive logs.</para> +</listitem> +<listitem><para><xref linkend="LE46764-PARENT"/>, describes the interaction +of the PCP tools with archive logs.</para> +</listitem> +<listitem><para><xref linkend="LE57424-PARENT"/>, shows some shortcuts for +setting up useful PCP archive logs.</para> +</listitem> +<listitem><para><xref linkend="Z930642977sdc"/>, provides information about +other archive logging features and services.</para> +</listitem> +<listitem><para><xref linkend="LE80113-PARENT"/>, presents helpful directions +if your archive logging implementation is not functioning correctly.</para> +</listitem></itemizedlist> +<section id="LE43411-PARENT"> + +<title>Introduction to Archive Logging</title> +<para><indexterm id="ITch07-2"><primary>pmlogger tool</primary><secondary> +archive logs</secondary></indexterm>Within PCP, the <command>pmlogger</command> +utility may be configured to collect archives of performance metrics. +The archive creation process is simple and very flexible, incorporating the +following features:</para> +<itemizedlist> +<listitem><para>Archive log creation at either a PCP collector (typically +a server) or a PCP monitor system (typically a workstation), or at some designated +PCP archive logger host.</para> +</listitem> +<listitem><para>Concurrent independent logging, both local and remote. 
The +performance analyst can activate a private <command>pmlogger</command> instance +to collect only the metrics of interest for the problem at hand, independent +of other logging on the workstation or remote host.</para> +</listitem> +<listitem><para>Independent determination of logging frequency for individual +metrics or metric instances. For example, you could log the “5 minute” +load average every half hour, the write I/O rate on the DBMS log spindle every +10 seconds, and aggregate I/O rates on the other disks every minute.</para> +</listitem> +<listitem><para><indexterm id="IG31371888240"><primary>pmlc tool</primary><secondary>dynamic +adjustment</secondary></indexterm><indexterm id="IG31371888241"><primary>pmlogger tool</primary> +<secondary>pmlc control</secondary></indexterm>Dynamic adjustment of what +is to be logged, and how frequently, via <command>pmlc</command>. This feature +may be used to disable logging or to increase the sample interval during periods +of low activity or chronic high activity. +A local <command>pmlc</command> may interrogate and control a +remote <command>pmlogger</command>, subject to the access control restrictions +implemented by <command>pmlogger</command>.</para> +</listitem> +<listitem><para>Self-contained logs that include all system configuration +and metadata required to interpret the values in the log. These logs can be +kept for analysis at a much later time, potentially after the hardware or +software has been reconfigured and the logs have been stored as discrete, +autonomous files for remote analysis. The logs are endian-neutral and platform +independent - there is no requirement that the monitor host machine used for +analysis be similar to the collector machine in any way, nor do they have to +have the same versions of PCP. 
PCP archives created over 15 years ago can +still be replayed with the current versions of PCP!</para> +</listitem> +<listitem><para><indexterm id="IG31371888242"><primary>cron scripts</primary></indexterm><literal>cron</literal>-based +scripts to expedite operational management tasks such as +log rotation, consolidation, and culling. Another helper tool, +<command>pmlogconf</command>, can be used to generate suitable +logging configurations for a variety of situations.</para> +</listitem> +<listitem><para><indexterm id="IG31371888243"><primary>mkaf tool</primary></indexterm><indexterm id="IG31371888244"> +<primary>pmafm tool</primary><secondary>archive folios</secondary></indexterm>Archive +folios as a convenient aggregation of multiple archive logs. Archive folios +may be created with the <command>mkaf</command> utility and processed with +the <command>pmafm</command> tool.</para> +</listitem></itemizedlist> +<section id="id5206288"> + +<title>Archive Logs and the PMAPI</title> +<para><indexterm id="IG31371888245"><primary>archive logs</primary><secondary>PMAPI</secondary> +</indexterm>Critical to the success of the PCP archive logging scheme is the +fact that the library routines providing access to real-time feeds of performance +metrics also provide access to the archive logs.</para> +<para><indexterm id="IG31371888246"><primary>PMAPI</primary><secondary>archive logs</secondary> +</indexterm>Live (or real-time) sources of performance metrics and archives +are interchangeable, with a single Performance Metrics Application +Programming Interface (PMAPI) that preserves the same semantics for both styles +of metric source. In this way, applications and tools developed against the +PMAPI can automatically process either live or historical performance data. 
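+For example, assuming the host <literal>pluto</literal> and an archive basename like the one created
+in the cookbook later in this chapter (both are illustrative only), the same tool can report a metric
+from either source simply by switching between its <literal>-h</literal> and <literal>-a</literal> options:
+<literallayout class="monospaced"><userinput>pmval -h pluto kernel.all.load</userinput>
+<userinput>pmval -a ${PCP_LOG_DIR}/pmlogger/pluto/20130815.10.00 kernel.all.load</userinput></literallayout>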
+</para> +</section> +<section id="id5206362"> + +<title>Retrospective Analysis Using Archive Logs</title> +<para><indexterm id="IG31371888247"><primary>archive logs</primary><secondary>retrospective analysis +</secondary></indexterm><indexterm id="IG31371888248"><primary>retrospective analysis</primary> +</indexterm>One of the most important applications of archive logging services +provided by PCP is in the area of retrospective analysis. In many cases, understanding +today's performance problems can be assisted by side-by-side comparisons with +yesterday's performance. With routine creation of performance archive logs, +you can concurrently replay pictures of system performance for two or more +periods in the past.</para> +<para>Archive logs are also an invaluable source of intelligence when trying +to diagnose what went wrong, as in a performance post-mortem. Because the PCP +archive logs are entirely self-contained, this analysis can be performed off-site +if necessary.</para> +<para>Each archive log contains metric values from only one host. However, +many PCP tools can simultaneously visualize values from multiple archives +collected from different hosts.</para> +<para>The archives can be replayed using the inference engine (<command>pmie</command> +is an application that uses the PMAPI). This allows you to +automate the regular, first-level analysis of system performance.</para> +<para>Such analysis can be performed by constructing suitable expressions +to capture the essence of common resource saturation problems, then periodically +creating an archive and playing it against the expressions. For example, you +may wish to create a daily performance audit (perhaps run by the <command>cron</command> +command) to detect performance regressions.</para> +<para>For more about <command>pmie</command>, see <xref linkend="LE21414-PARENT"/>. 
+</para> +</section> +<section id="id5206454"> + +<title>Using Archive Logs for Capacity Planning</title> +<para><indexterm id="IG31371888249"><primary>archive logs</primary><secondary>capacity planning +</secondary></indexterm><indexterm id="IG31371888250"><primary>capacity planning</primary></indexterm>By +collecting performance archives with relatively long sampling periods, or +by reducing the daily archives to produce summary logs, the capacity planner +can collect the base data required for forward projections, and can estimate +resource demands and explore “what if” scenarios by replaying +data using visualization tools and the inference engine.</para> +</section> +</section> +<section id="LE46764-PARENT"> + +<title>Using Archive Logs with Performance Tools</title> +<para><indexterm id="ITch07-4"><primary>performance visualization tools</primary> +</indexterm>Most PCP tools default to real-time display of current values +for performance metrics from PCP collector host(s). However, most PCP tools +also have the capability to display values for performance metrics retrieved +from PCP archive log(s). The following sections describe plans, steps, and +general issues involving archive logs and the PCP tools.</para> +<section id="id5206523"> + +<title>Coordination between <command>pmlogger</command> and PCP tools</title> +<para><indexterm id="IG31371888251"><primary>pmlogger tool</primary><secondary>PCP tool coordination +</secondary></indexterm>Most commonly, a PCP tool would be invoked with the <literal>-a</literal> option to process an archive log some time after <command>pmlogger</command> had finished creating the archive. 
However, a tool such as +<command>pmchart</command> that uses a Time Control dialog (see <xref linkend="LE76997-PARENT"/>) +stops when the end of archive is reached, but could resume if more data is +written to the PCP archive log.<note><para><indexterm id="IG31371888252"><primary>SIGUSR1 signal +</primary></indexterm><indexterm id="IG31371888253"><primary>flush command</primary></indexterm><indexterm id="IG31371888254"> +<primary>pmlc tool</primary><secondary>flush command</secondary></indexterm><command>pmlogger</command> +uses buffered I/O to write the archive log so that the +end of the archive may be aligned with an I/O buffer boundary, rather than +with a logical archive log record. If such an archive was read by a PCP tool, +it would appear truncated and might confuse the tool. These problems may be +avoided by sending <command>pmlogger</command> a <literal>SIGUSR1</literal> +signal, or by using the <command>flush</command> command of <command>pmlc</command> +to force <command>pmlogger</command> to flush its output buffers. +</para> +</note></para> +</section><section id="id5206675"> + +<title>Administering PCP Archive Logs Using <command>cron</command> Scripts +</title> +<para><indexterm id="ITch07-6"><primary>cron scripts</primary></indexterm><indexterm id="IG31371888255"> +<primary>scripts</primary></indexterm>Many operating systems support +the <literal>cron</literal> process scheduling system.</para> +<para><indexterm id="ITch07-7"><primary>pmlogger_check script</primary></indexterm><indexterm id="IG31371888256"> +<primary>pmlogger_daily script</primary></indexterm><indexterm id="IG31371888257"><primary>pmsnap +tool</primary><secondary>script usage</secondary></indexterm><indexterm id="IG31371888258"><primary> +pmlogger_merge script</primary></indexterm>PCP supplies shell scripts to use +the <literal>cron</literal> functionality to help manage your archive logs. 
+The following scripts are supplied:</para> +<variablelist> +<varlistentry> +<term><emphasis role="bold">Script</emphasis></term><listitem><para><emphasis role="bold">Description</emphasis></para></listitem></varlistentry> +<varlistentry> +<term><literal>pmlogger_daily(1)</literal></term> +<listitem><para>Performs a daily housecleaning of archive logs and notices. +</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmlogger_merge(1)</literal></term> +<listitem><para><literal/>Merges archive logs and is called by <literal>pmlogger_daily</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmlogger_check(1)</literal></term> +<listitem><para>Checks to see that all desired <command>pmlogger</command> +processes are running on your system, and invokes any that are missing for +any reason.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmlogconf(1)</literal></term> +<listitem><para>Generates suitable <command>pmlogger</command> configuration +files based on a pre-defined set of templates. It can probe the state of the +system under observation to make informed decisions about which metrics to +record. This is an extensible facility, allowing software upgrades and new +PMDA installations to add to the existing set of templates.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmsnap(1)</command></term> +<listitem><para>Generates graphic image snapshots of <command>pmchart</command> +performance charts at regular intervals.</para> +</listitem></varlistentry> +</variablelist> +<para><indexterm id="IG31371888259"><primary>${PCP_PMLOGGERCONTROL_PATH} file</primary> +</indexterm>The configuration files used by these scripts can be edited to +suit your particular needs, and are generally controlled by the +<filename>${PCP_PMLOGGERCONTROL_PATH}</filename> file (<literal>pmsnap</literal> +has an additional control file, <filename>${PCP_PMSNAPCONTROL_PATH}</filename>). 
+Complete information on these scripts is available in the <command>pmlogger_daily(1)</command> +and <command>pmsnap(1)</command> man pages.</para> +</section> +<section id="LE92914-PARENT"> + +<title>Archive Log File Management</title> +<para><indexterm id="IG31371888260"><primary>archive logs</primary><secondary>file management +</secondary></indexterm>PCP archive log files can occupy a great deal of disk +space, and management of archive logs can be a large task in itself. The following +sections provide information to assist you in PCP archive log file management. +</para> +<section id="id5206941"> + +<title>Basename Conventions</title> +<para><indexterm id="IG31371888261"><primary>basename conventions</primary></indexterm>When a +PCP archive is created by <command>pmlogger</command>, an archive basename +must be specified and several physical files are created, as shown in <xref linkend="id5206972"/>.</para> +<table id="id5206972" frame="topbot"> + +<title>Filenames for PCP Archive Log Components (<literal>archive</literal>.<replaceable>*</replaceable>)</title> +<tgroup cols="2" colsep="0" rowsep="0"> +<colspec colwidth="102*"/> +<colspec colwidth="294*"/> +<thead> +<row rowsep="1" valign="top"><entry align="left" valign="bottom"><para>Filename</para></entry> +<entry align="left" valign="bottom"><para>Contents</para></entry></row></thead> +<tbody> +<row valign="top"> +<entry align="left" valign="top"><para><filename>archive.</filename><emphasis role="bold"/><replaceable>index</replaceable></para></entry> +<entry align="left" valign="top"><para>Temporal index for rapid access to +archive contents.</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><filename>archive.</filename><replaceable>meta</replaceable></para></entry> +<entry align="left" valign="top"><para>Metadata descriptions for performance +metrics and instance domains appearing in the archive.</para></entry></row> +<row valign="top"> +<entry align="left" 
valign="top"><para><filename>archive.N</filename></para></entry> +<entry align="left" valign="top"><para>Volumes of performance metrics values, +for <filename>N</filename> = 0,1,2,...</para></entry></row></tbody></tgroup> +</table> +</section> +<section id="id5207181"> + +<title>Log Volumes</title> +<para><indexterm id="IG31371888262"><primary>log volumes</primary></indexterm>A single PCP archive +may be partitioned into a number of volumes. These volumes may expedite management +of the archive; however, the metadata file and at least one volume must be +present before a PCP tool can process the archive.</para> +<para>You can control the size of an archive log volume by using the <literal>-v</literal> command line option to <command>pmlogger</command>. This option +specifies how large a volume should become before <command>pmlogger</command> +starts a new volume. Archive log volumes retain the same base filename as +other files in the archive log, and are differentiated by a numeric suffix +that is incremented with each volume change. For example, you might have a +log volume sequence that looks like this:</para> +<literallayout class="monospaced">netserver-log.0 +netserver-log.1 +netserver-log.2</literallayout> +<para><indexterm id="IG31371888263"><primary>SIGHUP signal</primary></indexterm><indexterm id="IG31371888264"><primary> +pmlc tool</primary><secondary>SIGHUP signal</secondary></indexterm>You can +also cause an existing log to be closed and a new one to be opened by sending +a <literal>SIGHUP</literal> signal to <command>pmlogger</command>, or by using +the <command>pmlc</command> command to change the <command>pmlogger</command> +instructions dynamically, without interrupting <command>pmlogger</command> +operation. 
Complete information on log volumes is found in the <command>pmlogger(1)</command> +man page.</para> +</section><section id="id5207300"> + +<title>Basenames for Managed Archive Log Files</title> +<para>The PCP archive management tools support a consistent scheme for selecting +the basenames for the files in a collection of archives and for mapping these +files to a suitable directory hierarchy.</para> +<para>Once configured, the PCP tools that manage archive logs employ a consistent +scheme for selecting the basename for an archive each time <command>pmlogger</command> +is launched, namely the current date and time in the format YYYYMMDD.HH.MM. +Typically, at the end of each day, all archives for a particular host on that +day would be merged to produce a single archive with a basename constructed +from the date, namely YYYYMMDD. The <literal>pmlogger_daily</literal> script +performs this action and a number of other routine housekeeping chores.</para> +</section><section id="id5207335"> + +<title>Directory Organization for Archive Log Files</title> +<para>If you are using a deployment of PCP tools and daemons to collect metrics +from a variety of hosts and storing them all at a central location, you should +develop an organized strategy for storing and naming your log files.</para> +<note><para><indexterm id="IG31371888265"><primary>pmchart tool</primary><secondary>short-term +executions</secondary></indexterm>There are many possible configurations of <command>pmlogger</command>, +as described in <xref linkend="LE56598-PARENT"/>. 
The directory +organization described in this section is recommended for any system on which +<command>pmlogger</command> is configured for permanent execution (as opposed to short-term +executions, for example, as launched from <command>pmchart</command> to record +some performance data of current interest).</para> +</note> +<para>Typically, the filesystem structure can be used to reflect the +number of hosts for which a <command>pmlogger</command> instance is expected +to be running locally, obviating the need for lengthy and cumbersome filenames. +It makes considerable sense to place all logs for a particular host in a separate +directory named after that host. Because each instance of <command>pmlogger</command> +can only log metrics fetched from a single host, this also simplifies +some of the archive log management and administration tasks.</para> +<para>For example, consider the filesystem and naming structure shown in <xref linkend="id5207462"/>.</para> +<figure id="id5207462"><title>Archive Log Directory Structure</title><mediaobject><imageobject><imagedata fileref="figures/log-directory.svg"/></imageobject><textobject><phrase>Archive Log Directory Structure</phrase></textobject></mediaobject></figure> +<para><indexterm id="IG31371888266"><primary>${PCP_PMLOGGERCONTROL_PATH} file</primary> +</indexterm>The specification of where to place the archive log files for +particular <command>pmlogger</command> instances is encoded in the configuration +file <filename>${PCP_PMLOGGERCONTROL_PATH}</filename>, +and this file should be customized on each host running an instance of <command>pmlogger</command>.</para> +<para>If many archives are being created, and the associated PCP collector +systems form peer classes based upon service type (Web servers, +DBMS servers, NFS servers, and so on), then it may be appropriate to introduce +another layer into the directory structure, or use symbolic links to group +together hosts providing similar service types.</para> 
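+<para>As an illustrative sketch of such a grouped layout, the entries in
+<filename>${PCP_PMLOGGERCONTROL_PATH}</filename> might look like the following.
+The host names (<literal>web1</literal>, <literal>web2</literal>, <literal>db1</literal>),
+the service-type subdirectories, and the configuration file names are hypothetical;
+the field layout is assumed to match the control line used in the cookbook later in this chapter.</para>
+<literallayout class="monospaced"># hypothetical hosts, grouped by service type
+web1 n n PCP_LOG_DIR/pmlogger/web/web1 -r -T24h10m -c config.web
+web2 n n PCP_LOG_DIR/pmlogger/web/web2 -r -T24h10m -c config.web
+db1 n n PCP_LOG_DIR/pmlogger/dbms/db1 -r -T24h10m -c config.dbms</literallayout>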
+</section> +<section id="id5207540"> + +<title>Configuration of <command>pmlogger</command></title> +<para><indexterm id="IG31371888267"><primary>pmlogger tool</primary><secondary>configuration +</secondary></indexterm>The configuration files used by <command>pmlogger</command> +describe which metrics are to be logged. Different groups of metrics may +be logged at different intervals. Two states, mandatory +and advisory, also apply to each group of metrics, defining whether metrics +definitely should be logged or not logged, or whether a later advisory definition +may change that state.</para> +<para>The mandatory state takes precedence if it is <literal>on</literal> +or <literal>off</literal>, causing any subsequent request for a change in +advisory state to have no effect. If the mandatory state is <literal>maybe</literal>, +then the advisory state determines whether logging is enabled. +</para> +<para>The mandatory states are <literal>on</literal>, <literal>off</literal>, +and <literal>maybe</literal>. The advisory states, which only affect metrics +that are mandatory <literal>maybe</literal>, are <literal>on</literal> and +<literal>off</literal>. Therefore, a metric that is mandatory <literal>maybe</literal> +in one definition and advisory <literal>on</literal> in another definition +would be logged at the advisory interval. 
Metrics that are not specified in +the <command>pmlogger</command> configuration file are mandatory <literal>maybe</literal> +and advisory <literal>off</literal> by default and are not logged.</para> +<para>A complete description of the <command>pmlogger</command> configuration +format can be found on the <command>pmlogger(1)</command> man page.</para> +</section> +<section id="id5207708"> + +<title>PCP Archive Contents</title> +<para><indexterm id="IG31371888268"><primary>archive logs</primary><secondary>contents</secondary> +</indexterm><indexterm id="IG31371888269"><primary>pmdumplog tool</primary><secondary>archive +log contents</secondary></indexterm>Once a PCP archive log has been created, +the <command>pmdumplog</command> utility may be used to display various information +about the contents of the archive. For example, start with the following command: +</para> +<para><literal>pmdumplog -l ${PCP_LOG_DIR}/pmlogger/www.sgi.com/19960731</literal></para> +<para>It might produce the following output:</para> +<literallayout class="monospaced">Log Label (Log Format Version 1) +Performance metrics from host www.sgi.com + commencing Wed Jul 31 00:16:34.941 1996 + ending Thu Aug 1 00:18:01.468 1996</literallayout> +<para>The simplest way to discover what performance metrics are contained +within an archive is to use <literal>pminfo</literal> as shown in <xref linkend="Z984165598sdc"/>: +</para> +<example id="Z984165598sdc"> +<title>Using <literal>pminfo</literal> to Obtain Archive Information</title> +<literallayout class="monospaced"><literal>pminfo -a ${PCP_LOG_DIR}/pmlogger/www.sgi.com/19960731 network.mbuf</literal> +network.mbuf.alloc +network.mbuf.typealloc +network.mbuf.clustalloc +network.mbuf.clustfree +network.mbuf.failed +network.mbuf.waited +network.mbuf.drained</literallayout> +</example> +</section> +</section> +</section> +<section id="LE57424-PARENT"> + +<title>Cookbook for Archive Logging</title> +<para><indexterm 
id="IG31371888270"><primary>cookbook</primary></indexterm><indexterm id="IG31371888271"><primary> +pmlogger tool</primary><secondary>cookbook tasks</secondary></indexterm>The +following sections present a checklist of tasks that may be performed to enable +PCP archive logging with minimal effort. For a complete explanation, refer +to the other sections in this chapter and the man pages for <command>pmlogger +</command> and related tools.</para> +<section id="id5207850"> + +<title>Primary Logger</title> +<para><indexterm id="IG31371888272"><primary>primary archive</primary></indexterm>Assume you +wish to activate primary archive logging on the PCP collector host <literal>pluto</literal>. +Execute the following while logged into <literal>pluto</literal> as the superuser +(<literal>root</literal>).</para> +<orderedlist> +<listitem><para>Start <command>pmcd</command> and <command>pmlogger</command>:</para> +<literallayout class="monospaced"><userinput>chkconfig pmcd on</userinput> +<userinput>chkconfig pmlogger on</userinput> +<userinput>${PCP_RC_DIR}/pmcd start</userinput> +Starting pmcd ... 
+<userinput>${PCP_RC_DIR}/pmlogger start</userinput> +Starting pmlogger ...</literallayout> +</listitem><listitem><para>Verify that the primary <command>pmlogger</command> +instance is running:</para> +<literallayout class="monospaced"><userinput>pcp</userinput> +Performance Co-Pilot configuration on pluto: + + platform: Linux pluto 3.10.0-0.rc7.64.el7.x86_64 #1 SMP + hardware: 8 cpus, 2 disks, 23960MB RAM + timezone: EST-10 + pmcd: Version 3.8.3-1, 8 agents + pmda: pmcd proc xfs linux mmv infiniband gluster elasticsearch<replaceable> + pmlogger: primary logger: pluto/20130815.10.00</replaceable> + pmie: pluto: ${PCP_LOG_DIR}/pmie/pluto/pmie.log + venus: ${PCP_LOG_DIR}/pmie/venus/pmie.log</literallayout> +</listitem><listitem><para>Verify that the archive files are being created +in the expected place:</para> +<literallayout class="monospaced"><userinput>ls ${PCP_LOG_DIR}/pmlogger/pluto</userinput> +20130815.10.00.0 +20130815.10.00.index +20130815.10.00.meta +Latest +pmlogger.log</literallayout> +</listitem><listitem><para>Verify that no errors are being logged, and check the +expected rate of growth of the archives:</para> +<literallayout class="monospaced"><userinput>cat ${PCP_LOG_DIR}/pmlogger/pluto/pmlogger.log</userinput> +Log for pmlogger on pluto started Thu Aug 15 10:00:11 2013 + +Config parsed +Starting primary logger for host "pluto" +Archive basename: 20130815.10.00 + +Group [26 metrics] { + hinv.map.lvname + ... + hinv.ncpu +} logged once: 1912 bytes + +Group [11 metrics] { + kernel.all.cpu.user + ... + kernel.all.load +} logged every 60 sec: 372 bytes or 0.51 Mbytes/day + +...</literallayout> +</listitem> +</orderedlist> +</section> +<section id="id5208266"> + +<title>Other Logger Configurations</title> +<para>Assume you wish to create archive logs on the local host for performance +metrics collected from the remote host <literal>venus</literal>. 
Execute all +of the following tasks while logged into the local host as the superuser (<literal>root</literal>).</para> +<procedure id="id5208288"> +<title>Creating Archive Logs</title> +<step><para>Create a suitable <command>pmlogger</command> configuration +file. There are several options:</para> +<itemizedlist> +<listitem><para>Run the <command>pmlogconf(1)</command> utility to generate +a configuration file, and (optionally) interactively customize it further to +suit local needs. +</para> +<literallayout class="monospaced"><userinput>${PCP_BINADM_DIR}/pmlogconf ${PCP_SYSCONF_DIR}/pmlogger/config.venus</userinput> +Creating config file "${PCP_SYSCONF_DIR}/pmlogger/config.venus" using default settings + +<userinput>${PCP_BINADM_DIR}/pmlogconf ${PCP_SYSCONF_DIR}/pmlogger/config.venus</userinput> + +Group: utilization per CPU +Log this group? [n] y +Logging interval? [default] + +Group: utilization (usr, sys, idle, ...) over all CPUs +Log this group? [y] y +Logging interval? [default] + +Group: per spindle disk activity +Log this group? [n] y + +...</literallayout> +</listitem> +<listitem><para>Do nothing - a default configuration will be created in the +following step, using <command>pmlogconf(1)</command> probing and automatic file +generation based on the metrics available at the remote host. The +<filename>${PCP_RC_DIR}/pmlogger</filename> start script handles this.</para> +</listitem> +<listitem><para>Manually - create a configuration file with a text editor, +or arrange to have one put in place by configuration management tools like +<ulink url="https://puppetlabs.com/">Puppet</ulink> or +<ulink url="http://www.opscode.com/chef/">Chef</ulink>.</para> +</listitem> +</itemizedlist> +</step><step><para>Edit <filename>${PCP_PMLOGGERCONTROL_PATH}</filename>. 
+Using the line for <literal>remote</literal> as a template, add +the following line to the file:</para> +<literallayout class="monospaced"><userinput>venus n n PCP_LOG_DIR/pmlogger/venus -r -T24h10m -c config.venus</userinput></literallayout> +</step><step><para>Start <command>pmlogger</command>:</para> +<literallayout class="monospaced"><userinput>${PCP_BINADM_DIR}/pmlogger_check</userinput> +Restarting pmlogger for host "venus" ..... done</literallayout> +</step><step><para>Verify that the <command>pmlogger</command> instance +is running:</para> +<literallayout class="monospaced"><userinput>pcp</userinput> +Performance Co-Pilot configuration on pluto: + + platform: Linux pluto 3.10.0-0.rc7.64.el7.x86_64 #1 SMP + hardware: 8 cpus, 2 disks, 23960MB RAM + timezone: EST-10 + pmcd: Version 3.8.3-1, 8 agents + pmda: pmcd proc linux xfs mmv infiniband gluster elasticsearch<replaceable> + pmlogger: primary logger: pluto/20130815.10.00 + venus.redhat.com: venus/20130815.11.15</replaceable> +<userinput>pmlc</userinput> +pmlc> <userinput>show loggers</userinput> +The following pmloggers are running on pluto: + primary (19144) 5141 +pmlc> <userinput>connect 5141</userinput> +pmlc> <userinput>status</userinput><replaceable> +pmlogger [5141] on host pluto is logging metrics from host venus</replaceable> +log started Thu Aug 15 11:15:39 2013 (times in local time) +last log entry Thu Aug 15 11:47:39 2013 +current time Thu Aug 15 11:48:13 2013 +log volume 0 +log size 146160</literallayout> +</step> +</procedure> +<para>To create archive logs on the local host for performance metrics collected +from multiple remote hosts, repeat the steps in <xref linkend="id5208288"/> +for each remote host (each with a new <filename>control</filename> file entry).</para> +</section> +<section id="id5208588"> + +<title>Archive Log Administration</title> +<para><indexterm id="IG31371888290"><primary>archive logs</primary><secondary>administration +</secondary></indexterm>Assume the local host has 
been set up to create archive +logs of performance metrics collected from one or more hosts (which may be +either the local host or a remote host).</para> +<note><para>Depending on your platform, the <command>crontab</command> entry discussed +here may already have been installed for you, as part of the package installation process. +In this case, the file <filename>/etc/cron.d/pcp-pmlogger</filename> will exist, and the rest +of this section can be skipped.</para> +</note> +<para>To activate the maintenance and housekeeping scripts for a collection +of archive logs, execute the following tasks while logged into the local host +as the superuser (<literal>root</literal>):</para> +<orderedlist><listitem><para>Augment the <filename>crontab</filename> file +for the <literal>pcp</literal> user. For example:</para> +<literallayout class="monospaced"><userinput>crontab -l -u pcp > ${HOME}/crontab.txt</userinput></literallayout> +</listitem><listitem><para>Edit <filename>${HOME}/crontab.txt</filename>, adding lines +similar to those from the sample <filename>${PCP_VAR_DIR}/config/pmlogger/crontab</filename> +file for <literal>pmlogger_daily</literal> and <literal>pmlogger_check</literal>; +for example:</para> +<literallayout class="monospaced"># daily processing of archive logs +10 0 * * * ${PCP_BINADM_DIR}/pmlogger_daily +# every 30 minutes, check pmlogger instances are running +25,55 * * * * ${PCP_BINADM_DIR}/pmlogger_check</literallayout> +</listitem><listitem><para>Make these changes permanent with this command: +</para> +<literallayout class="monospaced"><userinput>crontab -u pcp < ${HOME}/crontab.txt</userinput></literallayout> +</listitem></orderedlist> +</section> +</section> +<section id="Z930642977sdc"> + +<title>Other Archive Logging Features and Services</title> +<para>Other archive logging features and services include PCP archive folios, +manipulating archive logs, primary logger, and using <command>pmlc</command>. 
+</para> +<section id="LE73509-PARENT"> + +<title>PCP Archive Folios</title> +<para><indexterm id="ITch07-9"><primary>folios</primary></indexterm><indexterm id="IG31371888291"> +<primary>archive logs</primary><secondary>folios</secondary></indexterm><indexterm id="IG31371888292"> +<primary>pmchart tool</primary><secondary>record mode</secondary></indexterm><indexterm id="IG31371888293"> +<primary>pmcollectl tool</primary><secondary>record mode</secondary></indexterm>A +collection of one or more PCP archive logs may be combined with a control +file to produce a PCP archive folio. Archive folios are created using either <literal>mkaf</literal> or the interactive record mode services of various PCP monitor tools (e.g. <command>pmchart</command> and <command>pmcollectl</command>).</para> +<para><indexterm id="IG31371888294"><primary>pmlogger tool</primary><secondary>folios</secondary> +</indexterm>The automated archive log management services also create an archive +folio named <filename>Latest</filename> for each managed <command>pmlogger +</command> instance, to provide a symbolic name to the most recent archive +log. 
With reference to <xref linkend="id5207462"/>, this would mean the +creation of the folios <filename>${PCP_LOG_DIR}/pmlogger/one/Latest</filename> +and <filename>${PCP_LOG_DIR}/pmlogger/two/Latest</filename>.</para> +<para><indexterm id="IG31371888295"><primary>pmafm tool</primary><secondary>interactive commands +</secondary></indexterm>The <command>pmafm</command> utility is completely +described in the <command>pmafm(1)</command> man page, and provides +the interactive commands (single commands may also be executed from the command +line) for the following services:</para> +<itemizedlist> +<listitem><para>Checking the integrity of the archives in the folio.</para> +</listitem> +<listitem><para>Displaying information about the component archives.</para> +</listitem> +<listitem><para>Executing PCP tools with their source of performance metrics +assigned concurrently to all of the component archives (where the tool supports +this), or serially executing the PCP tool once per component archive.</para> +</listitem> +<listitem><para>If the folio was created by a single PCP monitoring tool, +replaying all of the archives in the folio with that monitoring tool.</para> +</listitem> +<listitem><para>Restricting the processing to particular archives, or the +archives associated with particular hosts.</para> +</listitem></itemizedlist> +</section> +<section id="id5208950"> + +<title>Manipulating Archive Logs with <command>pmlogextract</command></title> +<para><indexterm id="ITch07-10"><primary>pmlogextract tool</primary></indexterm> +The <literal>pmlogextract</literal> tool takes a number of PCP archive logs +from a single host and performs the following tasks:</para> +<itemizedlist> +<listitem><para>Merges the archives into a single log, while maintaining the +correct time stamps for all values.</para> +</listitem> +<listitem><para>Extracts all metric values within a temporal window that could +encompass several archive logs.</para> +</listitem> +<listitem><para>Extracts 
only a configurable subset of metrics from the archive +logs.</para> +</listitem></itemizedlist> +<para>See the <command>pmlogextract(1)</command> man page for +full information on this command.</para> +</section> +<section id="id5208951"> + +<title>Summarizing Archive Logs with <command>pmlogsummary</command></title> +<para><indexterm id="ITch07-11"><primary>pmlogsummary tool</primary></indexterm> +The <literal>pmlogsummary</literal> tool provides statistical summaries of +archives, or specific metrics within archives, or specific time windows of +interest in an archive. These summaries include various averages, minima, +maxima, sample counts, histogram bins, and so on.</para> +<para>As an example, for Linux host <literal>pluto</literal>, report on its +use of anonymous huge pages - average use, maximum, time at which the maximum +occurred, total number of samples in the archive, and the units used for the +values - as shown in <xref linkend="Z984165598nat"/>: +</para> +<example id="Z984165598nat"> +<title>Using <literal>pmlogsummary</literal> to Summarize Archive Information</title> +<literallayout class="monospaced"><userinput>pmlogsummary -MIly ${PCP_LOG_DIR}/pmlogger/pluto/20130815 mem.util.anonhugepages</userinput> +Performance metrics from host pluto + commencing Thu Aug 15 00:10:12.318 2013 + ending Fri Aug 16 00:10:12.299 2013 + +mem.util.anonhugepages 7987742.326 8116224.000 15:02:12.300 1437 Kbyte + +<userinput>pminfo -t mem.util.anonhugepages</userinput> +mem.util.anonhugepages [amount of memory in anonymous huge pages]</literallayout> +</example> +<para>See the <command>pmlogsummary(1)</command> man page for +detailed information about this command's many options.</para> +</section> +<section id="id5209027"> + +<title>Primary Logger</title> +<para><indexterm id="ITch07-14"><primary>primary logger</primary></indexterm><indexterm id="ITch07-16"><primary>pmlogger tool</primary><secondary>primary instance +</secondary></indexterm>On each system for which PMCD is 
active (each PCP +collector system), there is an option to have a distinguished instance of +the archive logger <command>pmlogger</command> (the “primary” +logger) launched each time PMCD is started. This may be used to ensure the +creation of minimalist archive logs required for ongoing system management +and capacity planning in the event of failure of a system where a remote +<command>pmlogger</command> may be running, or because the preferred archive logger +deployment is to activate <command>pmlogger</command> on each PCP collector +system.</para> +<para>Run the following command as superuser on each PCP collector system +where you want to activate the primary <command>pmlogger</command>:</para> +<literallayout class="monospaced"><userinput>chkconfig pmlogger on</userinput></literallayout> +<para>The primary logger launches the next time the <command>${PCP_RC_DIR}/pmlogger start</command> script runs. +If you wish this to happen immediately, follow up with this command:</para> +<literallayout class="monospaced"><userinput>${PCP_BINADM_DIR}/pmlogger_check -V</userinput></literallayout> +<para><indexterm id="IG31371888296"><primary>${PCP_PMLOGGERCONTROL_PATH} file</primary></indexterm><indexterm id="IG31371888297"> +<primary>${PCP_SYSCONF_DIR}/pmlogger/config.default file</primary></indexterm>When +it is started in this fashion, the <filename>${PCP_PMLOGGERCONTROL_PATH}</filename> +must use the second field of one configuration line to designate the primary logger, +and usually will also use the <command>pmlogger</command> configuration file +<filename>${PCP_SYSCONF_DIR}/pmlogger/config.default</filename> (although the latter +is not mandatory).</para> +</section> +<section id="id5209191"> + +<title>Using <command>pmlc</command></title> +<para><indexterm id="IG31371888298"><primary>pmlogger tool</primary><secondary>configuration +</secondary></indexterm><indexterm id="ITch07-17"><primary>pmlc tool</primary> +<secondary>description</secondary></indexterm>You may 
tailor <command>pmlogger</command> +dynamically with the <command>pmlc</command> command (if it is configured to allow access +to this functionality). Normally, +the <command>pmlogger</command> configuration is read at startup. If you choose +to modify the <filename>config</filename> file to change the parameters under +which <command>pmlogger</command> operates, you must stop and restart the +program for your changes to have effect. Alternatively, you may change parameters +whenever required by using the <command>pmlc</command> interface.</para> +<para>To run the <command>pmlc</command> tool, enter:</para> +<literallayout class="monospaced"><userinput>pmlc</userinput></literallayout> +<para>By default, <command>pmlc</command> acts on the primary instance of <command>pmlogger</command> +on the current host. See the <command>pmlc(1)</command> +man page for a description of command line options. When it is +invoked, <command>pmlc</command> presents you with a prompt:</para> +<literallayout class="monospaced">pmlc> </literallayout> +<para>You may obtain a listing of the available commands by entering a question +mark (?) and pressing <keycap>Enter</keycap>. 
You see output similar to that +in <xref linkend="Z984165791sdc"/>:</para> +<example id="Z984165791sdc"> +<title>Listing Available Commands +</title> +<literallayout class="monospaced"> show loggers [@<host>] display <pid>s of running pmloggers + connect _logger_id [@<host>] connect to designated pmlogger + status information about connected pmlogger + query metric-list show logging state of metrics + new volume start a new log volume + flush flush the log buffers to disk + log { mandatory | advisory } on <interval> _metric-list + log { mandatory | advisory } off _metric-list + log mandatory maybe _metric-list + timezone local|logger|'<timezone>' change reporting timezone + help print this help message + quit exit from pmlc + _logger_id is primary | <pid> | port <n> + _metric-list is _metric-spec | { _metric-spec ... } + _metric-spec is <metric-name> | <metric-name> [ <instance> ... ]</literallayout> +<para>Here is an example:</para> +<literallayout class="monospaced"><userinput>pmlc</userinput> +pmlc> <userinput>show loggers @babylon</userinput> +The following pmloggers are running on babylon: + primary (1892) +pmlc> <userinput>connect 1892 @babylon</userinput> +pmlc> <userinput>log advisory on 2 secs disk.dev.read</userinput> +pmlc> <userinput>query disk.dev</userinput> +disk.dev.read + adv on nl 5 min [131073 or “disk1”] + adv on nl 5 min [131074 or “disk2”] +pmlc> <userinput>quit</userinput></literallayout> +</example> +<note><para>Any changes to the set of logged metrics made via <command>pmlc</command> +are not saved, and are lost the next time <command>pmlogger</command> +is started with the same configuration file. 
Permanent changes are made by +modifying the <command>pmlogger</command> configuration file(s).</para> +</note> +<para>Refer to the <command>pmlc(1)</command> and <command>pmlogger(1)</command> +man pages for complete details.</para> +</section> +</section> +<section id="LE80113-PARENT"> + +<title>Archive Logging Troubleshooting</title> +<para><indexterm id="ITch07-19"><primary>troubleshooting</primary><secondary> +archive logging</secondary></indexterm><indexterm id="IG31371888299"><primary>archive logs</primary> +<secondary>troubleshooting</secondary></indexterm><indexterm id="IG31371888300"><primary>pmlogger +tool</primary><secondary>troubleshooting</secondary></indexterm>The following +issues concern the creation and use of logs using <command>pmlogger</command>. +</para> +<section id="id5209540"> + +<title><command>pmlogger</command> Cannot Write Log</title> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para>The <command>pmlogger</command> utility does not start, and +you see this message:</para> +<literallayout class="monospaced"><literal>__pmLogNewFile: “foo.index” already exists, not over-written</literal></literallayout> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>Archive logs are considered sufficiently precious that +<command>pmlogger</command> does not empty or overwrite an existing set of archive +log files. 
The log named <filename>foo</filename> actually consists of the +physical files <filename>foo.index</filename>, <filename>foo.meta</filename>, +and at least one file <filename>foo.N</filename>, where <filename>N</filename> +is in the range 0, 1, 2, 3, and so on.</para> +<para>A message similar to the one above is produced when a new <command>pmlogger</command> +instance encounters one of these files already in existence.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>Move the existing archive aside or, if you are sure it is no longer needed, +remove all of the parts of the archive log. +For example, use the following command:<literallayout class="monospaced"><userinput>rm -f foo.*</userinput></literallayout></para> +<para>Then rerun <command>pmlogger</command>.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5209654"> + +<title>Cannot Find Log</title> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para><indexterm id="ITch07-23"><primary>pmdumplog tool</primary> +<secondary>troubleshooting</secondary></indexterm>The <literal>pmdumplog</literal> +utility, or any tool that can read an archive log, displays this message: +</para> +<literallayout class="monospaced"><literal>Cannot open archive mylog: No such file or directory</literal></literallayout> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>An archive consists of at least three physical files. If the +base name for the archive is <filename>mylog</filename>, then the archive +actually consists of the physical files <filename>mylog.index</filename>, +<filename>mylog.meta</filename>, and at least one file <filename>mylog.N</filename>, +where <filename>N</filename> is in the range 0, 1, 2, 3, and so on.</para> +<para>The above message is produced if one or more of the files are missing. 
+</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>Use this command to check which files the utility is trying +to open:</para> +<para><literallayout class="monospaced"><userinput>ls mylog.*</userinput></literallayout></para> +<para>Turn on the internal debug flag <literal>DBG_TRACE_LOG</literal> (<literal>-D</literal> 128) to see which files are being inspected by the <literal>__pmOpenLog</literal> routine as shown in the following example:<literallayout class="monospaced"><userinput>pmdumplog -D 128 -l mylog</userinput></literallayout></para> +<para>Locate the missing files and move them all to the same directory, or +remove all of the files that are part of the archive, and recreate the archive +log.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5209840"> + +<title>Primary <command>pmlogger</command> Cannot Start</title> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para>The primary <command>pmlogger</command> cannot be started. 
+A message like the following appears:</para> +<literallayout class="monospaced"><literal>pmlogger: there is already a primary pmlogger running</literal></literallayout> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>There is either a primary <command>pmlogger</command> already +running, or the previous primary <command>pmlogger</command> was terminated +unexpectedly before it could perform its cleanup operations.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para><indexterm id="IG31371888301"><primary>show command</primary></indexterm><indexterm id="IG31371888302"> +<primary>pmlc tool</primary><secondary>show command</secondary></indexterm><indexterm id="IG31371888303"> +<primary>SIGINT signal</primary></indexterm><indexterm id="IG31371888304"><primary>kill command +</primary></indexterm>If there is already a primary <command>pmlogger</command> +running and you wish to replace it with a new <command>pmlogger</command>, +use the <literal>show</literal> command in <command>pmlc</command> to determine +the process ID of the primary <command>pmlogger</command>. The process ID +of the primary <command>pmlogger</command> appears in parentheses after the +word “primary.” Send a <literal>SIGINT</literal> signal to the +process to shut it down (use either the <command>kill</command> command if +the platform supports it, or the <command>pmsignal</command> command). If the +process does not exist, proceed to the manual cleanup described in the paragraph +below. If the process did exist, it should now be possible to start the new <command>pmlogger</command>.</para> +<para>If <command>pmlc</command>'s <command>show</command> command displays +a process ID for a process that does not exist, a <command>pmlogger</command> +process was terminated before it could clean up. 
If it was the primary <command>pmlogger</command>, +the corresponding control files must be removed before +one can start a new primary <command>pmlogger</command>. It is a good idea +to clean up any spurious control files even if they are not for the primary <command>pmlogger</command>.</para> +<para><indexterm id="IG31371888305"><primary>${PCP_TMP_DIR}/pmlogger files</primary></indexterm>The +control files are kept in <filename>${PCP_TMP_DIR}/pmlogger</filename>. A control +file with the process ID of the <command>pmlogger</command> as its name is +created when the <command>pmlogger</command> is started. In addition, the +primary <command>pmlogger</command> creates a symbolic link named <literal>primary</literal> to its control file.</para> +<para>For the primary <command>pmlogger</command>, remove both the symbolic +link and the file (corresponding to its process ID) to which the link points. +For other <command>pmlogger</command>s, remove just the process ID file. Do +not remove any other files in the directory. 
If the control file for an active +<command>pmlogger</command> is removed, <command>pmlc</command> is not able to contact +it.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5210097"> + +<title>Identifying an Active <command>pmlogger</command> Process</title> +<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para><indexterm id="IG31371888306"><primary>active pmlogger process</primary></indexterm>You +have a PCP archive log that is demonstrably growing, but you do not know the identity +of the associated <command>pmlogger</command> process.</para> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>The PID is not obvious from the log, or the archive name may +not be obvious from the output of the <command>ps</command> command.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>If the archive basename is <filename>foo</filename>, run the +following commands:</para> +<literallayout class="monospaced"><userinput>pmdumplog -l foo</userinput> +<literal>Log Label (Log Format Version 1)</literal> +Performance metrics from host gonzo + commencing Wed Aug 7 00:10:09.214 1996 + ending Wed Aug 7 16:10:09.155 1996 + +<userinput>pminfo -a foo -f pmcd.pmlogger</userinput> +pmcd.pmlogger.host + inst [10728 or "10728"] value "gonzo" +pmcd.pmlogger.port + inst [10728 or "10728"] value 4331 +pmcd.pmlogger.archive + inst [10728 or "10728"] value "<replaceable>/usr/var/adm/pcplog/gonzo/foo</replaceable>"</literallayout> +<para>All of the information describing the creator of the archive is revealed +and, in particular, the instance identifier for the PMCD metrics (<literal>10728</literal> in the example above) is the PID of the <command>pmlogger</command> instance, which may be used to control the process via <command>pmlc</command>.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5210268"> + +<title>Illegal Label Record</title> 
+<variablelist> +<varlistentry> +<term>Symptom:</term> +<listitem><para><indexterm id="IG31371888307"><primary>illegal label record</primary></indexterm>PCP +tools report:</para> +<literallayout class="monospaced">Illegal label record at start of PCP archive log file.</literallayout> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para>The label record at the start of one of the physical archive +log files has become corrupted, or is out of sync with the label records in the other files. +</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para> +If you believe the log may have been corrupted, this can be verified using <command>pmlogcheck</command>. +If corruption is limited to just the label record at the start, the <command>pmloglabel</command> +command can be used to force the labels back in sync with each other, with known-good values that you supply.</para> +<para>Refer to the <command>pmlogcheck(1)</command> and <command>pmloglabel(1)</command> man pages.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5210389"> + +<title>Empty Archive Log Files or <command>pmlogger</command> Exits Immediately +</title> +<variablelist id="Z930351569sdc"> +<varlistentry> +<term>Symptom:</term> +<listitem><para>Archive log files are zero size, requested metrics are not +being logged, or <command>pmlogger</command> exits immediately with no error +messages.</para> +</listitem></varlistentry> +<varlistentry> +<term>Cause:</term> +<listitem><para><command>pmlogger</command> encountered errors in the +configuration file, has not yet flushed its output buffers, or some (or all) +metrics specified in the <command>pmlogger</command> configuration file have +had their state changed to advisory <literal>off</literal> or mandatory <literal>off</literal> via <command>pmlc</command>. 
It is also possible that the logging +interval specified in the <command>pmlogger</command> configuration file for +some or all of the metrics is longer than the period of time you have been +waiting since <command>pmlogger</command> started.</para> +</listitem></varlistentry> +<varlistentry> +<term>Resolution:</term> +<listitem><para>If <command>pmlogger</command> exits immediately with no error +messages, check the <filename>pmlogger.log</filename> file in the directory +<command>pmlogger</command> was started in for any error messages. +If <command>pmlogger</command> has not yet flushed its buffers, enter one of +the following commands (depending on platform support): +<literallayout class="monospaced">killall -SIGUSR1 pmlogger +${PCP_BINADM_DIR}/pmsignal -a -s USR1 pmlogger</literallayout></para> +<para>Otherwise, use the <literal>status</literal> command for <command>pmlc</command> +to interrogate the internal <command>pmlogger</command> state of +specific metrics.</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +</chapter> + + + +<chapter id="LE83321-PARENT"> + +<title>Performance Co-Pilot Deployment Strategies</title> +<para><indexterm id="IG31371888308"><primary>deployment strategies</primary></indexterm>Performance +Co-Pilot (PCP) is a coordinated suite of tools and utilities allowing +you to monitor performance and make automated judgments and initiate actions +based on those judgments. 
PCP is designed to be fully configurable for +custom implementation and deployed to meet specific needs in a variety +of operational environments.</para> +<para>Because each enterprise and site is different and PCP represents +a new way of managing performance information, some discussion of deployment +strategies is useful.</para> +<para><indexterm id="IG31371888309"><primary>PMCD</primary><secondary>monitoring utilities +</secondary></indexterm><indexterm id="IG31371888310"><primary>PMDA</primary><secondary> +monitoring utilities</secondary></indexterm>The most common use of performance +monitoring utilities is a scenario where the PCP tools are executed on +a workstation (the PCP monitoring system), while the interesting performance +data is collected on remote systems (PCP collector systems) by a number +of processes, specifically the Performance Metrics Collection Daemon (PMCD) +and the associated Performance Metrics Domain Agents (PMDAs). These processes +can execute on both the monitoring system and one or more collector systems, +or only on collector systems. However, collector systems are the real +objects of performance investigations.</para> +<para>The material in this chapter covers the following areas:</para> +<itemizedlist> +<listitem><para><xref linkend="LE85282-PARENT"/>, presents the spectrum +of deployment architectures at the highest level.</para> +</listitem> +<listitem><para><xref linkend="LE69500-PARENT"/>, describes alternative +deployments for PMCD and the PMDAs.</para> +</listitem> +<listitem><para><xref linkend="LE56598-PARENT"/>, covers alternative deployments +for the <command>pmlogger</command> tool.</para> +</listitem> +<listitem><para><xref linkend="LE62310-PARENT"/>, presents the options +that are available for deploying the <command>pmie</command> tool.</para> +</listitem></itemizedlist> +<para>The options shown in this chapter are merely suggestions. 
They are +not comprehensive, and are intended to demonstrate some possible ways +of deploying the PCP tools for specific network topologies and purposes. +You are encouraged to use them as the basis for planning your own deployment, +consistent with your needs.</para> +<section id="LE85282-PARENT"> + +<title>Basic Deployment</title> +<para>In the simplest PCP deployment, one system is configured as both +a collector and a monitor, as shown in <xref linkend="id5210776"/>. +Because some of the PCP monitor tools make extensive use of visualization, this +suggests the monitor system should be configured with a graphical display.</para> +<figure id="id5210776"><title>PCP Deployment for a Single System</title><mediaobject><imageobject><imagedata fileref="figures/local-deploy.svg"/></imageobject><textobject><phrase>PCP Deployment for a Single System</phrase></textobject></mediaobject></figure> +<para>However, most PCP deployments involve at least two systems. For +example, the setup shown in <xref linkend="id5210808"/> would be representative +of many common scenarios.</para> +<figure id="id5210808"><title>Basic PCP Deployment for Two Systems</title><mediaobject><imageobject><imagedata fileref="figures/remote-deploy.svg"/></imageobject><textobject><phrase>Basic PCP Deployment for Two Systems</phrase></textobject></mediaobject></figure> +<para>But the most common site configuration would include a mixture of +systems configured as PCP collectors, as PCP monitors, and as both PCP +monitors and collectors, as shown in <xref linkend="id5210866"/>.</para> +<para>With one or more PCP collector systems and one or more PCP monitor +systems, there are a number of decisions that need to be made regarding +the deployment of PCP services across multiple hosts. For example, in <xref linkend="id5210866"/> +there are several ways in which both the inference engine (<command>pmie</command>) +and the PCP archive logger (<command>pmlogger</command>) could be deployed. 
+These options are discussed in the following sections of this chapter.</para> +<figure id="id5210866"><title>General PCP Deployment for Multiple Systems +</title><mediaobject><imageobject><imagedata fileref="figures/multi-deploy.svg"/></imageobject><textobject><phrase>General PCP Deployment for Multiple Systems +</phrase></textobject></mediaobject></figure> +</section> +<section id="LE69500-PARENT"> + +<title>PCP Collector Deployment</title> +<para><indexterm id="ITch08-0"><primary>PCP</primary><secondary>collector +deployment</secondary></indexterm>Each PCP collector system must have +an active <literal>pmcd</literal> and, typically, a number of PMDAs installed. +</para> +<section id="id5210938"> + +<title>Principal Server Deployment</title> +<para>The first hosts selected as PCP collector systems are likely to +provide some class of service deemed to be critical to the information +processing activities of the enterprise. These hosts include: +</para> +<itemizedlist> +<listitem><para>Database servers</para> +</listitem> +<listitem><para>Web servers for an Internet or Intranet presence</para> +</listitem> +<listitem><para>NFS or other central storage server</para> +</listitem> +<listitem><para>A video server</para> +</listitem> +<listitem><para>A supercomputer</para> +</listitem> +<listitem><para>An infrastructure service provider, for example, print, +DNS, LDAP, gateway, firewall, router, or mail services</para> +</listitem> +<listitem><para>Any system running a mission-critical application</para> +</listitem></itemizedlist> +<para>Your objective may be to improve quality of service on a system +functioning as a server for many clients. You wish to identify and repair +critical performance bottlenecks and deficiencies in order to maintain +maximum performance for clients of the server.</para> +<para>For some of these services, the PCP base product or the PCP add-on +packages provide the necessary collector components. 
Others would require +customized PMDA development, as described in the companion +<citetitle>Performance Co-Pilot Programmer's Guide</citetitle>.</para> +</section> +<section id="id5211030"> + +<title>Quality of Service Measurement</title> +<para><indexterm id="IG31371888311"><primary>service management</primary></indexterm>Applications +and services with a client-server architecture need to monitor performance +at both the server side and the client side.</para> +<para>The arrangement in <xref linkend="id5211102"/> illustrates one +way of measuring quality of service for client-server applications.</para> +<figure id="id5211102"><title>PCP Deployment to Measure Client-Server Quality +of Service</title><mediaobject><imageobject><imagedata fileref="figures/qos-deploy.svg"/></imageobject><textobject><phrase>PCP Deployment to Measure Client-Server Quality +of Service</phrase></textobject></mediaobject></figure> +<para>The configuration of the PCP collector components on the Application +Server System is standard. The new facility is the deployment of some +PCP collector components on the Application Client System; this uses a +customized PMDA and a generalization of the ICMP “ping” tool +as follows:</para> +<itemizedlist> +<listitem><para>The <literal>Client App</literal> is specially developed +to periodically make typical requests of the <literal>App Server</literal>, +and to measure the response time for these requests (this is an application-specific “ping”). 
+</para> +</listitem> +<listitem><para>The PMDA on the Application Client System captures the +response time measurements from the <literal>Client App</literal> and +exports these into the PCP framework.</para> +</listitem></itemizedlist> +<para>At the PCP monitor system, the performance of the system running +the <literal>App Server</literal> and the end-user quality of service +measurements from the system where the <literal>Client App</literal> is +running can be monitored concurrently.</para> +<para>PCP contains a number of examples of this architecture, including +the <literal>shping</literal> PMDA for IP-based services (including +HTTP), and the <literal>dbping</literal> PMDA for database servers.</para> +<para>The source code for each of these PMDAs is readily available; +users and administrators are encouraged to adapt these agents to the +needs of the local application environment.</para> +<para>It is possible to exploit this arrangement even further, with these +methods:</para> +<itemizedlist> +<listitem><para>Creating new instances of the <literal>Client App</literal> +and PMDA to measure service quality for your own mission-critical services. +</para> +</listitem> +<listitem><para>Deploying the <literal>Client App</literal> and associated +PCP collector components in a number of strategic hosts allows the quality +of service over the enterprise's network to be monitored. For example, +service can be monitored on the Application Server System, on the same +LAN segment as the Application Server System, on the other side of a firewall +system, or out in the WAN.</para> +</listitem></itemizedlist> +</section> +</section> +<section id="LE56598-PARENT"> + +<title>PCP Archive Logger Deployment</title> +<para><indexterm id="ITch08-3"><primary>PCP</primary><secondary>archive +logger deployment</secondary></indexterm>PCP archive logs are created +by the <command>pmlogger</command> utility, as discussed in <xref linkend="LE93354-PARENT"/>. 
+They provide a critical capability to perform retrospective performance +analysis, for example, to detect performance regressions, for problem +analysis, or to support capacity planning. The following sections discuss +the options and trade-offs for <command>pmlogger</command> deployment. +</para> +<section id="id5211332"> + +<title>Deployment Options</title> +<para>The issue is relatively simple and reduces to “On which host(s) +should <command>pmlogger</command> be running?” The options are +these:</para> +<itemizedlist> +<listitem><para>Run <command>pmlogger</command> on each PCP collector +system to capture local performance data.</para> +</listitem> +<listitem><para>Run <command>pmlogger</command> on some of the PCP monitor +systems to capture performance data from remote PCP collector systems. +</para> +</listitem> +<listitem><para>As an extension of the previous option, designate one +system to act as the PCP archive site to run all <command>pmlogger</command> +instances. This arrangement is shown in <xref linkend="id5211413"/>. +</para> +<figure id="id5211413"><title>Designated PCP Archive Site</title><mediaobject><imageobject><imagedata fileref="figures/designated-logger.svg"/></imageobject><textobject><phrase>Designated PCP Archive Site</phrase></textobject></mediaobject></figure> +</listitem></itemizedlist> +</section> +<section id="id5211434"> + +<title>Resource Demands for the Deployment Options</title> +<para>The <command>pmlogger</command> process is very lightweight in terms +of computational demand; most of the (very small) CPU cost is associated with +extracting performance metrics at the PCP collector system (PMCD and the +PMDAs), which are independent of the host on which <command>pmlogger</command> +is running.</para> +<para>A local <command>pmlogger</command> consumes disk bandwidth and +disk space on the PCP collector system. 
 A remote <command>pmlogger</command> +consumes disk space on the host where it is running and network bandwidth +between that host and the PCP collector host.</para> +<para>The archive logs typically grow at a rate of anywhere from a few +kilobytes (KB) to tens of megabytes (MB) per day, depending on how many +performance metrics are logged and the choice of sampling frequencies. +There are some advantages in minimizing the number of hosts over which the +disk resources for PCP archive logs must be allocated; however, the aggregate +requirement is independent of where the <command>pmlogger</command> processes +are running.</para> +</section> +<section id="id5211489"> + +<title>Operational Management</title> +<para>There is an initial administrative cost associated with configuring +each <command>pmlogger</command> instance, and an ongoing administrative +investment to monitor these configurations, perform regular housekeeping +(such as rotation, compression, and culling of PCP archive log files), +and execute periodic tasks to process the archives (such as nightly performance +regression checking with <command>pmie</command>).</para> +<para>Many of these tasks are handled by the supplied <command>pmlogger</command> +administrative tools and scripts, as described in <xref linkend="LE92914-PARENT"/>. +However, the necessity and importance of these tasks favor a centralized <command> +pmlogger</command> deployment, as shown in <xref linkend="id5211413"/>. +</para> +</section> +<section id="id5211573"> + +<title>Exporting PCP Archive Logs</title> +<para><indexterm id="IG31371888312"><primary>archive logs</primary><secondary>export</secondary> +</indexterm>Collecting PCP archive logs is of little value unless the +logs are processed as part of the ongoing performance monitoring and management +functions. 
This processing typically involves the use of the tools on +a PCP monitor system, and hence the archive logs may need to be read on +a host different from the one they were created on.</para> +<para>NFS mounting is obviously an option, but the PCP tools support random +access and both forward and backward temporal motion within an archive +log. If an archive is to be subjected to intensive and interactive processing, +it may be more efficient to copy the files of the archive log to the PCP +monitor system first.</para> +<note><para>Each PCP archive log consists of at least three separate files +(see <xref linkend="LE92914-PARENT"/> for details). You must have concurrent +access to all of these files before a PCP tool is able to process an archive +log correctly.</para> +</note> +</section> +</section> +<section id="LE62310-PARENT"> + +<title>PCP Inference Engine Deployment</title> +<para>The <command>pmie</command> utility supports automated reasoning +about system performance, as discussed in <xref linkend="LE21414-PARENT"/>, +and plays a key role in monitoring system performance for both real-time +and retrospective analysis, with the performance data being retrieved +respectively from a PCP collector system and a PCP archive log.</para> +<para>The following sections discuss the options and trade-offs for +<command>pmie</command> deployment.</para> +<section id="id5211669"> + +<title>Deployment Options</title> +<para>The issue is relatively simple and reduces to “On which host(s) +should <command>pmie</command> be running?” You must consider both +real-time and retrospective uses, and the options are as follows:</para> +<itemizedlist> +<listitem><para>For real-time analysis, run <command>pmie</command> on +each PCP collector system to monitor local system performance.</para> +</listitem> +<listitem><para>For real-time analysis, run <command>pmie</command> on +some of the PCP monitor systems to monitor the performance of remote PCP +collector systems.</para> 
+</listitem> +<listitem><para>For retrospective analysis, run <command>pmie</command> +on the systems where the PCP archive logs reside. The problem then reduces +to <command>pmlogger</command> deployment as discussed in <xref linkend="LE56598-PARENT"/>. +</para> +</listitem> +<listitem><para>As an example of the “distributed management with +centralized control” philosophy, designate some system to act as +the PCP Management Site to run all <command>pmlogger</command> and <command>pmie</command> +instances. This arrangement is shown in <xref linkend="id5211805"/>. +</para> +</listitem></itemizedlist> +<para>One <command>pmie</command> instance is capable of monitoring multiple +PCP collector systems; for example, to evaluate some universal rules that +apply to all hosts. At the same time a single PCP collector system may +be monitored by multiple <command>pmie</command> instances; for example, +for site-specific and universal rule evaluation, or to support both tactical +performance management (operations) and strategic performance management +(capacity planning). Both situations are depicted in <xref linkend="id5211805"/>. +</para> +<figure id="id5211805"><title>PCP Management Site Deployment</title><mediaobject><imageobject><imagedata fileref="figures/designated-manager.svg"/></imageobject><textobject><phrase>PCP Management Site Deployment</phrase></textobject></mediaobject></figure> +</section> +<section id="id5211825"> + +<title>Resource Demands for the Deployment Options</title> +<para>Depending on the complexity of the rule sets, the number of hosts +being monitored, and the evaluation frequency, <command>pmie</command> +may consume CPU cycles significantly above the resources required to simply +fetch the values of the performance metrics. 
 If this becomes significant, +then real-time deployment of <command>pmie</command> away from the PCP +collector systems should be considered in order to avoid the “you're +part of the problem, not the solution” scenario in terms of CPU +utilization on a heavily loaded server.</para> +</section> +<section id="id5211850"> + +<title>Operational Management</title> +<para>An initial administrative cost is associated with configuring each +<command>pmie</command> instance, particularly in the development of the rule sets +that accurately capture and classify “good” versus “bad” +performance in your environment. These rule sets almost always involve +some site-specific knowledge, particularly with respect to the “normal” +levels of activity and resource consumption. The <command>pmieconf</command> +tool (see <xref linkend="Z927039566sdc"/>) may be used to help develop +localized rules based upon parameterized templates covering many common +performance scenarios. In complex environments, customizing these rules +may occur over an extended period and require considerable performance +analysis insight.</para> +<para>One of the functions of <command>pmie</command> provides for continual +detection of adverse performance and the automatic generation of alarms +(visible, audible, e-mail, pager, and so on). Uncontrolled deployment +of this alarm-initiating capability throughout the enterprise may cause +havoc.</para> +<para>These considerations favor a centralized <command>pmie</command> +deployment at a small number of PCP monitor sites, or in a PCP Management +Site as shown in <xref linkend="id5211805"/>.</para> +<para>However, knowledgeable users with specific +needs may well find a local deployment of <command>pmie</command> most useful +to track some particular class of service difficulty or resource utilization. 
+In these cases, the alarm propagation is unlikely to be required or is +confined to the system on which <command>pmie</command> is running.</para> +<para>Configuration and management of a number of <command>pmie</command> +instances is made much easier with the scripts and control files described +in <xref linkend="Z927039824sdc"/>.</para> +</section> +</section> +</chapter> + + + +<chapter id="LE62564-PARENT"> + +<title>Customizing and Extending PCP Services</title> +<para><indexterm id="IG31371888313"><primary>customization</primary><secondary>PCP services</secondary> +</indexterm><indexterm id="IG31371888314"><primary>extensibility</primary></indexterm> Performance +Co-Pilot (PCP) has been developed to be fully extensible. The following sections +summarize the various facilities provided to allow you to extend and customize +PCP for your site:</para> +<itemizedlist> +<listitem><para><xref linkend="LE61335-PARENT"/>, describes the procedure for +customizing the summary PMDA to export derived metrics formed by aggregation +of base PCP metrics from one or more collector hosts.</para> +</listitem> +<listitem><para><xref linkend="LE80685-PARENT"/>, describes the various options +available for customizing and extending the basic PCP tools.</para> +</listitem> +<listitem><para><xref linkend="LE50224-PARENT"/>, covers the concepts and tools +provided for updating the PMNS (Performance Metrics Name Space).</para> +</listitem> +<listitem><para><xref linkend="LE10270-PARENT"/>, details where to find further +information to assist in the development of new PMDAs to extend the range +of performance metrics available through the PCP infrastructure.</para> +</listitem> +<listitem><para><xref linkend="LE31248-PARENT"/>, outlines how new tools may +be developed to process performance data from the PCP infrastructure.</para> +</listitem></itemizedlist> +<section id="LE61335-PARENT"> + +<title>PMDA Customization</title> +<para>The generic procedures for installing and activating 
the optional PMDAs +have been described in <xref linkend="LE43202-PARENT"/>. In some cases, these +procedures prompt the user for information based upon the local system or +network configuration, application deployment, or processing profile to customize +the PMDA and hence the performance metrics it exports.</para> +<para>The summary PMDA is a special case that warrants further discussion. +</para> +<section id="LE43074-PARENT"> + +<title>Customizing the Summary PMDA</title> +<para><indexterm id="ITch09-1"><primary>PMDA</primary><secondary>customizing +</secondary></indexterm>The summary PMDA exports performance metrics derived +from performance metrics made available by other PMDAs. It is described completely +in the <command>pmdasummary(1)</command> man page.</para> +<para>The summary PMDA consists of two processes:</para> +<variablelist> +<varlistentry> +<term><command>pmie</command> process</term> +<listitem><para>Periodically samples the base metrics and computes values for +the derived metrics. This dedicated instance of the PCP <command>pmie</command> +inference engine is launched with special command line arguments by the main +process. 
See <xref linkend="LE41170-PARENT"/>, for a complete discussion of +the <command>pmie</command> feature set.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>main</literal> process</term> +<listitem><para>Reads and buffers the values computed by the <command>pmie</command> +process and makes them available to the Performance Metrics Collection Daemon +(PMCD).</para> +</listitem></varlistentry> +</variablelist> +<para>All of the metrics exported by the summary PMDA have a singular instance +and the values are instantaneous; the exported value is the correct value +as of the last time the corresponding expression was evaluated by the +<command>pmie</command> process.</para> +<para>The summary PMDA resides in the <literal>${PCP_PMDAS_DIR}/summary</literal> +directory and may be installed with a default configuration by following the +steps described in <xref linkend="LE31599-PARENT"/>.</para> +<para>Alternatively, you may customize the summary PMDA to export your own +derived performance metrics by following the steps in <xref linkend="id5212322"/>: +</para> +<procedure id="id5212322"> +<title>Customizing the Summary PMDA</title> +<step><para>Check that the symbolic constant <literal>SYSSUMMARY</literal> is defined in the <filename>${PCP_VAR_DIR}/pmns/stdpmid</filename> +file. If it is not, perform the postinstall update of this file, as superuser: +<literallayout class="monospaced"><userinput>cd ${PCP_VAR_DIR}/pmns +./Make.stdpmid</userinput></literallayout></para> +</step><step><para><indexterm id="IG31371888315"><primary>PMNS</primary><secondary>names +</secondary></indexterm>Choose Performance Metric Name Space (PMNS) names +for the new metrics. These must begin with <literal>summary</literal> and +follow the rules described in the <command>pmns(5)</command> man +page. 
For example, you might use <literal>summary.fs.cache_write</literal> +and <literal>summary.fs.cache_hit</literal>.</para> +</step><step><para><indexterm id="IG31371888316"><primary>PMID</primary><secondary>PMNS +names</secondary></indexterm>Edit the <filename>pmns</filename> file in the <literal>${PCP_PMDAS_DIR}/summary</literal> +directory to add the new metric names in +the format described in the <command>pmns(5)</command> man page. +You must choose a unique performance metric identifier (PMID) for each metric. +In the <filename>pmns</filename> file, these appear as <literal>SYSSUMMARY:0:x</literal>. +The value of <replaceable>x</replaceable> is arbitrary in the +range 0 to 1023 and unique in this file. Refer to <xref linkend="LE50224-PARENT"/>, +for a further explanation of the rules governing PMNS updates.</para> +<para>For example:</para> +<literallayout class="monospaced">summary { + cpu + disk + netif + fs /*new*/ +} +summary.fs { + cache_write SYSSUMMARY:0:10 + cache_hit SYSSUMMARY:0:11 +}</literallayout> +</step><step><para>Use the local test PMNS <literal>root</literal> +and validate that the PMNS changes are correct.</para> +<para>For example, enter this command:</para> +<literallayout class="monospaced"><userinput>pminfo -n root -m summary.fs</userinput></literallayout> +<para>You see output similar to the following:</para> +<literallayout class="monospaced">summary.fs.cache_write PMID: 27.0.10 +summary.fs.cache_hit PMID: 27.0.11</literallayout> +</step><step><para>Edit the <filename>${PCP_PMDAS_DIR}/summary/expr.pmie</filename> +file to add new <command>pmie</command> expressions. If the name +to the left of the assignment operator (=) is one of the PMNS names, then +the <command>pmie</command> expression to the right will be evaluated and +returned by the summary PMDA. The expression must return a numeric value. 
+Additional description of the <command>pmie</command> expression syntax may +be found in <xref linkend="LE90227-PARENT"/>.</para> +<para>For example, consider this expression:</para> +<literallayout class="monospaced">// filesystem buffer cache hit percentages +prefix = "kernel.all.io"; // macro, not exported +summary.fs.cache_write = + 100 - 100 * $prefix.bwrite / $prefix.lwrite; +summary.fs.cache_hit = + 100 - 100 * $prefix.bread / $prefix.lread;</literallayout> +</step><step><para>Run <command>pmie</command> in debug mode to verify +that the expressions are being evaluated correctly, and the values make sense. +</para> +<para>For example, enter this command:</para> +<literallayout class="monospaced"><userinput>pmie -t 2sec -v expr.pmie</userinput></literallayout> +<para>You see output similar to the following:</para> +<literallayout class="monospaced">summary.fs.cache_write: ? +summary.fs.cache_hit: ? +summary.fs.cache_write: 45.83 +summary.fs.cache_hit: 83.2 +summary.fs.cache_write: 39.22 +summary.fs.cache_hit: 84.51</literallayout> +</step><step><para>Install the new PMDA.</para> +<para>From the <literal>${PCP_PMDAS_DIR}/summary</literal> directory, +use this command:</para> +<literallayout class="monospaced"><userinput>./Install</userinput></literallayout> +<para>You see the following output:</para> +<literallayout class="monospaced">You need to choose an appropriate configuration for installation of +the “summary” Performance Metrics Domain Agent (PMDA). + +collector collect performance statistics on this system +monitor allow this system to monitor local and/or remote systems +both collector and monitor configuration for this system + +Please enter c(ollector) or m(onitor) or b(oth) [b] <userinput>both</userinput> +Interval between summary expression evaluation (seconds)? [10] <userinput>10</userinput> +Updating the Performance Metrics Name Space... +Installing pmchart view(s) ... +Terminate PMDA if already installed ... +Installing files .. 
+Updating the PMCD control file, and notifying PMCD ... +Wait 15 seconds for the agent to initialize ... +Check summary metrics have appeared ... 8 metrics and 8 values</literallayout> +</step> +<step><para>Check the metrics.</para> +<para>For example, enter this command:</para> +<literallayout class="monospaced"><userinput>pmval -t 5sec -s 8 summary.fs.cache_write</userinput></literallayout> +<para>You see a response similar to the following:</para> +<literallayout class="monospaced">metric: summary.fs.cache_write +host: localhost +semantics: instantaneous value +units: none +samples: 8 +interval: 5.00 sec +63.60132158590308 +62.71878646441073 +62.71878646441073 +58.73968492123031 +58.73968492123031 +65.33822758259046 +65.33822758259046 +72.6099706744868</literallayout> +<para>Note that the values are being sampled here by <literal>pmval</literal> +every 5 seconds, but <command>pmie</command> is passing new values to +the summary PMDA only every 10 seconds, so consecutive samples can repeat. Both rates could be changed to suit the +dynamics of your new metrics.</para> +</step><step><para>You may now create <command>pmchart</command> views, +<command>pmie</command> rules, and <command>pmlogger</command> +configurations to monitor and archive your new performance metrics.</para> +</step> +</procedure> +</section> +</section> +<section id="LE80685-PARENT"> + +<title>PCP Tool Customization</title> +<para><indexterm id="ITch09-2"><primary>PCP</primary><secondary>tool customization +</secondary></indexterm><indexterm id="IG31371888317"><primary>tool customization</primary></indexterm>Performance +Co-Pilot (PCP) has been designed and implemented with a philosophy that embraces +the notion of toolkits and encourages extensibility.</para> +<para>In most cases, the PCP tools provide orthogonal services, based on external +configuration files. 
It is the creation of new and modified configuration +files that enables PCP users to customize tools quickly and meet the needs +of the local environment, in many cases allowing personal preferences to be +established for individual users on the same PCP monitor system.</para> +<para>The material in this section is intended to act as a checklist of pointers +to detailed documentation found elsewhere in this guide, in the man pages, +and in the files that are made available as part of the PCP installation. +</para> +<section id="id5212794"> + +<title>Archive Logging Customization</title> +<para><indexterm id="IG31371888318"><primary>archive logs</primary><secondary>customization</secondary> +</indexterm><indexterm id="IG31371888319"><primary>customization</primary><secondary>archive +logs</secondary></indexterm>The PCP archive logger is presented in <xref linkend="LE93354-PARENT"/>, +and documented in the <command>pmlogger(1)</command> man page. +</para> +<para>The following global files and directories influence the behavior of +<command>pmlogger</command>:</para> +<variablelist condition="sgi_termlength:nextline" id="Z930367453sdc"> +<varlistentry> +<term><filename>${PCP_SYSCONF_DIR}/pmlogger</filename> </term> +<listitem><para>Enable/disable state for the primary logger facility using +this command:<literallayout class="monospaced"><userinput><literal>chkconfig pmlogger on</literal></userinput></literallayout></para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_SYSCONF_DIR}/pmlogger/config.default</filename> </term> +<listitem><para>The default <command>pmlogger</command> configuration file +that is used for the primary logger when this facility is enabled.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}/config/pmlogconf/tools</filename> </term> +<listitem><para>Every PCP tool with a fixed group of performance metrics contributes +a <command>pmlogconf</command> configuration file that includes each of the 
+performance metrics used in the tool, for example, <filename>${PCP_VAR_DIR}/config/pmlogconf/pmstat</filename> +for <literal>pmstat</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_PMLOGGERCONTROL_PATH}</filename> </term> +<listitem><para>Defines which PCP collector hosts require +<command>pmlogger</command> to be launched on the local host, where the configuration file +comes from, where the archive log files should be created, and <command>pmlogger</command> +startup options.</para> +<para>This <literal>control</literal> file supports the starting and stopping of multiple +<command>pmlogger</command> instances that monitor local or remote hosts.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>/etc/cron.d/pcp-pmlogger</filename> or <filename>${PCP_VAR_DIR}/config/pmlogger/crontab</filename> </term> +<listitem><para>Default <literal>crontab</literal> entries that may be merged +with the <literal>crontab</literal> entries for the <literal>pcp</literal> user +to schedule the periodic execution of the archive log management scripts, for +example, <literal>pmlogger_daily</literal>.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_LOG_DIR}/pmlogger/somehost</filename> </term> +<listitem><para>By default, the archive log management scripts +create archive log files for the host <replaceable>somehost</replaceable> +in this directory.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_LOG_DIR}/pmlogger/somehost/Latest</filename> </term> +<listitem><para>A PCP archive folio for the most recent archive for the host <replaceable>somehost</replaceable>. +This folio is created and maintained by the <literal>cron</literal>-driven periodic archive log management scripts, for example, <literal>pmlogger_check</literal>. 
Archive folios may be processed with the <literal>pmafm</literal> tool.</para> +</listitem></varlistentry> +</variablelist> +</section> +<section id="id5213167"> + +<title>Inference Engine Customization</title> +<para><indexterm id="IG31371888320"><primary>inference engine</primary></indexterm><indexterm id="IG31371888321"> +<primary>customization</primary><secondary>inference engine</secondary></indexterm>The +PCP inference engine is presented in <xref linkend="LE21414-PARENT"/>, and +documented in the <command>pmie(1)</command> man page.</para> +<para>The following global files and directories influence the behavior of <command>pmie</command>:</para> +<variablelist condition="sgi_termlength:nextline" id="Z930367570sdc"> +<varlistentry> +<term><filename>${PCP_SYSCONF_DIR}/pmie</filename></term> +<listitem><para>Controls the pmie daemon facility. Enable using this command:<literallayout class="monospaced">chkconfig pmie on</literallayout></para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_SYSCONF_DIR}/pmie/config.default</filename></term> +<listitem><para>The <command>pmie</command> configuration file that is used +for monitoring the local host when the <command>pmie</command> daemon facility +is enabled in the default configuration. 
 This file is created using <command>pmieconf</command> +the first time the daemon facility is activated.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_PMIECONTROL_PATH}</filename></term> +<listitem><para>Defines which PCP collector hosts require a <command>pmie</command> daemon +to monitor them from the local host, where the configuration file comes from, +where the <command>pmie</command> log file should be created, +and <command>pmie</command> startup options.</para> +<para>This <literal>control</literal> file supports the starting and stopping of multiple +<command>pmie</command> instances that are each monitoring one or more hosts.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_VAR_DIR}/config/pmieconf/*/*</filename></term> +<listitem><para>Each <command>pmieconf</command> rule definition can be found +below one of these subdirectories.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>/etc/cron.d/pcp-pmie</filename> or <filename>${PCP_VAR_DIR}/config/pmie/crontab</filename> </term> +<listitem><para>Default <literal>crontab</literal> entries that may be merged +with the <literal>crontab</literal> entries for the <literal>pcp</literal> user +to schedule the periodic execution of the <command>pmie_check</command> and +<command>pmie_daily</command> scripts, which verify that <command>pmie</command> +instances are running and that logs are rotated.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_LOG_DIR}/pmie/somehost</filename></term> +<listitem><para>By default, the <filename>${PCP_RC_DIR}/pmie</filename> +startup scripts create <command>pmie</command> log files for the host +<replaceable>somehost</replaceable> in this directory.</para> +</listitem></varlistentry> +<varlistentry> +<term><command>pmie_check</command> and <command>pmie_daily</command></term> +<listitem><para>These commands are similar to the <command>pmlogger</command> +support scripts, 
<command>pmlogger_check</command> and <command>pmlogger_daily</command>.</para> +</listitem></varlistentry> +<varlistentry> +<term><filename>${PCP_TMP_DIR}/pmie</filename></term> +<listitem><para>The statistics that <command>pmie</command> gathers are maintained +in binary data structure files. These files can be found in the +<filename>${PCP_TMP_DIR}/pmie</filename> directory.</para> +</listitem></varlistentry> +<varlistentry> +<term><literal>pmcd.pmie</literal> metrics</term> +<listitem><para>The PMCD PMDA exports information about executing <command>pmie</command> +processes and their progress in terms of rule evaluations and action execution +rates.</para> +<para>If <command>pmie</command> is running on a system with a PCP +collector deployment, the <command>pmcd</command> PMDA exports these metrics +via the <filename>pmcd.pmie</filename> group of metrics.</para> +</listitem></varlistentry> +</variablelist> +</section> +</section> +<section id="LE50224-PARENT"> + +<title>PMNS Management</title> +<para><indexterm id="IG31371888322"><primary>PMNS</primary><secondary>management</secondary> +</indexterm>This section describes the syntax, semantics, and processing framework +for the external specification of a Performance Metrics Name Space (PMNS) +as it might be loaded by the PMAPI routine <command>pmLoadNameSpace</command>; +see the <command>pmLoadNameSpace(3)</command> man page. This is usually done only +by <command>pmcd</command>, except in rare circumstances such as <xref linkend="LE43074-PARENT"/>. +</para> +<para>The PMNS specification is a simple text source file that can be edited +easily. For reasons of efficiency, a binary format is also supported; the +utility <literal>pmnscomp</literal> translates the ASCII source format into +binary format; see the <command>pmnscomp(1)</command> man page. +</para> +<section id="id5213650"> + +<title>PMNS Processing Framework</title> +<para>The PMNS specification is initially passed through <command>pmcpp(1)</command>. 
+This means the following facilities may be used in the specification:</para> +<itemizedlist> +<listitem><para>C-style comments</para> +</listitem> +<listitem><para><literal>#include</literal> directives</para> +</listitem> +<listitem><para><literal>#define</literal> directives and macro substitution +</para> +</listitem> +<listitem><para>Conditional processing with <literal>#ifdef</literal>, +<literal>#ifndef</literal>, <literal>#endif</literal>, and <literal>#undef</literal></para> +</listitem></itemizedlist> +<para>When <command>pmcpp(1)</command> is executed, the standard include directories +are the current directory and <literal>${PCP_VAR_DIR}/pmns</literal>, where some +standard macros and default specifications may be found.</para> +</section> +<section id="id5213751"> + +<title>PMNS Syntax</title> +<para><indexterm id="ITch09-4"><primary>PMNS</primary><secondary>syntax</secondary> +</indexterm><indexterm id="IG31371888323"><primary>syntax</primary></indexterm>Every PMNS is +tree structured. The paths to the leaf nodes are the performance metric names. +The general syntax for a non-leaf node in PMNS is as follows:</para> +<literallayout class="monospaced">pathname { + name [pmid] + ... +}</literallayout> +<para>Here <literal>pathname</literal> is the full pathname from the root +of the PMNS to this non-leaf node, with each component in the path separated +by a period. The root node for the PMNS has the special name <literal>root</literal>, but the prefix string <literal>root.</literal> must be omitted +from all other <literal>pathnames</literal>.</para> +<para>For example, refer to the PMNS shown in <xref linkend="id5213846"/>. 
+The correct pathname for the rightmost non-leaf node is <literal>cpu.utilization</literal>, not <literal>root.cpu.utilization</literal>.</para> +<figure id="id5213846"><title>Small Performance Metrics Name Space (PMNS)</title><mediaobject><imageobject><imagedata fileref="figures/pmns-small2.svg"/></imageobject><textobject><phrase>Small Performance Metrics Name Space (PMNS)</phrase></textobject></mediaobject></figure> +<para>Each component in the pathname must begin with an alphabetic character +and be followed by zero or more alphanumeric characters or the underscore +(_) character. For alphabetic characters in a component, uppercase and lowercase +are significant.</para> +<para>Non-leaf nodes in the PMNS may be defined in any order desired. The +descendent nodes are defined by the set of <literal>names</literal>, relative +to the pathname of their parent non-leaf node. For descendent nodes, leaf +nodes have a <literal>pmid</literal> specification, but non-leaf nodes do +not.</para> +<para>The syntax for the <literal>pmid</literal> specification was chosen +to help manage the allocation of Performance Metric IDs (PMIDs) across disjoint +and autonomous domains of administration and implementation. Each <literal>pmid</literal> consists of three integers separated by colons, for example, <literal>14:27:11</literal>. This is intended to mirror the implementation hierarchy +of performance metrics. The first integer identifies the domain in which the +performance metric lies. Within a domain, related metrics are often grouped +into clusters. 
The second integer identifies the cluster, and the third integer, +the metric within the cluster.</para> +<para>The PMNS specification for <xref linkend="id5213846"/> is shown in <xref linkend="Z928447759sdc"/>:</para> +<example id="Z928447759sdc"> +<title>PMNS Specification</title> +<literallayout class="monospaced">/* +* PMNS Specification +*/ +#define KERNEL 1 +root { + network + cpu +} +#define NETWORK 26 +network { + interrupts KERNEL:NETWORK:1 + packets +} +network.packets { + in KERNEL:NETWORK:35 + out KERNEL:NETWORK:36 +} +#define CPU 10 +cpu { + syscalls KERNEL:CPU:10 + utilization +} +#define USER 20 +#define SYSTEM 21 +#define IDLE 22 +cpu.utilization { + user KERNEL:CPU:USER + sys KERNEL:CPU:SYSTEM + idle KERNEL:CPU:IDLE +}</literallayout> +</example> +<para>For complete documentation of the PMNS and associated utilities, see +the <command>pmns(5)</command>, <command>pmnsadd(1)</command>, +<command>pmnsdel(1)</command> and <command>pmnsmerge(1)</command> man pages. +</para> +</section> +</section> +<section id="LE10270-PARENT"> + +<title>PMDA Development</title> +<para><indexterm id="ITch09-5"><primary>PMDA</primary><secondary>development +</secondary></indexterm>Performance Co-Pilot (PCP) is designed to be extensible +at the collector site.</para> +<para>Application developers are encouraged to create new PMDAs to export +performance metrics from the applications and service layers that are particularly +relevant to a specific site, application suite, or processing environment. 
+</para> +<para>These PMDAs use the routines of the <literal>libpcp_pmda</literal> library, +which is discussed in detail in the <citetitle>Performance Co-Pilot Programmer's Guide</citetitle>.</para> +</section> +<section id="LE31248-PARENT"> + +<title>PCP Tool Development</title> +<para><indexterm id="ITch09-6"><primary>PCP</primary><secondary>tool development +</secondary></indexterm><indexterm id="IG31371888324"><primary>tool development</primary></indexterm>Performance +Co-Pilot (PCP) is designed to be extensible at the monitor site.</para> +<para>Application developers are encouraged to create new PCP client applications +to monitor or display performance metrics in a manner that is particularly +relevant to a specific site, application suite, or processing environment. +</para> +<para>Client applications use the routines of the PMAPI (performance metrics +application programming interface) described in the <citetitle>Performance Co-Pilot Programmer's Guide</citetitle>. +At the time of writing, native PMAPI interfaces are available for the C, C++ and +Python languages.</para> +</section> +</chapter> + + +<appendix id="LE65325-PARENT"> + +<title>Acronyms</title> +<para><indexterm id="ITchA-0"><primary>acronyms</primary></indexterm><indexterm id="ITchA-1"><primary>glossary</primary></indexterm><xref linkend="id5214266"/> +provides a list of the acronyms used in the Performance Co-Pilot +(PCP) documentation, help cards, man pages, and user interface.</para> +<table id="id5214266" frame="topbot"> + +<title>Performance Co-Pilot Acronyms and Their Meanings +</title> +<tgroup cols="2" colsep="0" rowsep="0"> +<colspec colwidth="120*"/> +<colspec colwidth="276*"/> +<thead> +<row rowsep="1" valign="top"><entry align="left" valign="bottom"><para>Acronym</para></entry> +<entry align="left" valign="bottom"><para>Meaning</para></entry></row> +</thead> +<tbody> +<row valign="top"> +<entry align="left" valign="top"><para>API</para></entry> +<entry align="left" 
valign="top"><para>Application Programming Interface +</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para>DBMS</para></entry> +<entry align="left" valign="top"><para>Database Management System</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para>DNS</para></entry> +<entry align="left" valign="top"><para>Domain Name System</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="IG31371888325"><primary>DSO</primary> +</indexterm> DSO</para></entry> +<entry align="left" valign="top"><para>Dynamic Shared Object</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="IG31371888326"><primary>I/O</primary> +</indexterm> I/O</para></entry> +<entry align="left" valign="top"><para>Input/Output</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="IG31371888327"><primary>IPC</primary> +</indexterm> IPC</para></entry> +<entry align="left" valign="top"><para>Interprocess Communication</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-4"><primary> +PCP</primary><secondary>acronym</secondary></indexterm> PCP</para></entry> +<entry align="left" valign="top"><para>Performance Co-Pilot</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-5"><primary> +PDU</primary></indexterm> PDU</para></entry> +<entry align="left" valign="top"><para>Protocol Data Unit</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-6"><primary> +PMAPI</primary><secondary>acronym</secondary></indexterm> PMAPI</para></entry> +<entry align="left" valign="top"><para>Performance Metrics Application +Programming Interface</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-7"><primary> 
+PMCD</primary><secondary>acronym</secondary></indexterm> PMCD</para></entry> +<entry align="left" valign="top"><para>Performance Metrics Collection +Daemon</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-9"><primary> +PMD</primary></indexterm> PMD</para></entry> +<entry align="left" valign="top"><para>Performance Metrics Domain</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-10"><primary> +PMDA</primary><secondary>acronym</secondary></indexterm> PMDA</para></entry> +<entry align="left" valign="top"><para>Performance Metrics Domain Agent +</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-11"><primary> +PMID</primary><secondary>acronym</secondary></indexterm> PMID</para></entry> +<entry align="left" valign="top"><para>Performance Metric Identifier</para></entry> +</row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-12"><primary> +PMNS</primary><secondary>acronym</secondary></indexterm> PMNS</para></entry> +<entry align="left" valign="top"><para>Performance Metrics Name Space +</para></entry></row> +<row valign="top"> +<entry align="left" valign="top"><para><indexterm id="ITchA-13"><primary> +TCP/IP</primary><secondary>acronym</secondary></indexterm> TCP/IP</para></entry> +<entry align="left" valign="top"><para>Transmission Control Protocol/Internet +Protocol</para></entry></row></tbody></tgroup></table> +</appendix> +<index id="sgi-index"> + +<indexentry> + <primaryie>*_inst operator <link linkend="IG31371888221">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>*_sample operator <link linkend="IG31371888222">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>2D tools <link linkend="IG31371888171">Monitoring System Performance</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>64-bit IEEE 
format <link linkend="IG3137188876">Descriptions for Performance Metrics</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmGetConfig function <link linkend="IG31371888139">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>acronyms <link linkend="ITchA-0">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>active pmlogger process <link linkend="IG31371888306">Identifying an Active <command>pmlogger</command> Process</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>adaptation <link linkend="IG3137188815">Dynamic Adaptation to Change</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>application programs <link linkend="ITch01-114">Application and Agent Development</link> <link linkend="IG3137188869">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>archive logs</primaryie> + <secondaryie>administration <link linkend="IG31371888290">Archive Log Administration</link></secondaryie> + <secondaryie>analysis <link linkend="IG3137188816">Logging and Retrospective Analysis</link></secondaryie> + <secondaryie>capacity planning <link linkend="IG31371888249">Using Archive Logs for Capacity Planning</link></secondaryie> + <secondaryie>collection time <link linkend="ITch01-135">Current Metric Context</link></secondaryie> + <secondaryie>contents <link linkend="IG31371888268">PCP Archive Contents</link></secondaryie> + <secondaryie>creation <link linkend="IG3137188836">Collecting, Transporting, and Archiving Performance Information +</link></secondaryie> + <secondaryie>customization <link linkend="IG3137188819">Automated Operational Support</link> <link linkend="IG31371888318">Archive Logging Customization</link></secondaryie> + <secondaryie>export <link linkend="IG31371888312">Exporting PCP Archive Logs</link></secondaryie> + <secondaryie>fetching metrics <link linkend="ITch03-5">Fetching Metrics from an Archive 
Log</link></secondaryie> + <secondaryie>file management <link linkend="IG31371888260">Archive Log File Management</link></secondaryie> + <secondaryie>folios <link linkend="IG31371888291">PCP Archive Folios</link></secondaryie> + <secondaryie>physical filenames <link linkend="ITch03-9">Fetching Metrics from an Archive Log</link></secondaryie> + <secondaryie>PMAPI <link linkend="IG31371888245">Archive Logs and the PMAPI</link></secondaryie> + <secondaryie>retrospective analysis <link linkend="IG31371888247">Retrospective Analysis Using Archive Logs</link></secondaryie> + <secondaryie>troubleshooting <link linkend="IG31371888299">Archive Logging Troubleshooting</link></secondaryie> + <secondaryie>usage <link linkend="IG31371888239">Archive Logging</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>arithmetic aggregation <link linkend="IG31371888215">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>arithmetic expressions <link linkend="IG31371888199"><command>pmie</command> Arithmetic Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>audience <link linkend="IG313718884">Empowering the PCP User</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>audits <link linkend="IG3137188820">Automated Operational Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>automated operational support <link linkend="IG3137188817">Automated Operational Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>avg_host operator <link linkend="IG31371888216">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>basename conventions <link linkend="IG31371888261">Basename Conventions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>Boolean expressions <link linkend="IG31371888203">Boolean Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>capacity planning <link linkend="IG31371888250">Using Archive Logs for 
Capacity Planning</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>caveats <link linkend="IG31371888229">Caveats and Notes on <command>pmie</command></link></primaryie> +</indexentry> + +<indexentry> + <primaryie>centralized archive logging <link linkend="IG3137188818">Automated Operational Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>coverage <link linkend="IG3137188824">Metric Coverage</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>chkhelp tool <link linkend="ITch01-115">Application and Agent Development</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>client-server architecture <link linkend="IG3137188813">PCP Distributed Operation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>collection time <link linkend="ITch01-134">Current Metric Context</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>collector hosts <link linkend="ITch01-143">Distributed Collection</link> <link linkend="ITch01-156">Collector and Monitor Roles</link> <link linkend="ITch02-33">PMDA Installation on a PCP Collector Host</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>comments <link linkend="IG31371888190">Comments </link></primaryie> +</indexentry> + +<indexentry> + <primaryie>common directories <link linkend="IG31371888119">Common Directories and File Locations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>component software <link linkend="IG3137188826">Overview of Component Software</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>conceptual foundations <link linkend="ITch01-129">Conceptual Foundations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>configuring PCP <link linkend="ITch02-1">Installing and Configuring Performance Co-Pilot +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>conventions <link linkend="ITch03-1">Common Conventions and Arguments</link></primaryie> +</indexentry> + +<indexentry> + 
<primaryie>cookbook <link linkend="IG31371888270">Cookbook for Archive Logging</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>count_host operator <link linkend="IG31371888218">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>cron scripts <link linkend="IG31371888242">Introduction to Archive Logging</link> <link linkend="ITch07-6">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>customization</primaryie> + <secondaryie>archive logs <link linkend="IG31371888319">Archive Logging Customization</link></secondaryie> + <secondaryie>inference engine <link linkend="IG31371888321">Inference Engine Customization</link></secondaryie> + <secondaryie>PCP services <link linkend="IG31371888313">Customizing and Extending PCP Services</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>data collection tools <link linkend="IG3137188834">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>dbpmda tool <link linkend="ITch01-117">Application and Agent Development</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>debugging tools <link linkend="ITch01-90">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>deployment strategies <link linkend="IG31371888308">Performance Co-Pilot Deployment Strategies</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>diagnostic tools <link linkend="IG3137188851">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>DISPLAY variable <link linkend="IG31371888207"><command>pmie</command> Rule Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>distributed collection <link linkend="ITch01-142">Distributed Collection</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>domains <link 
linkend="IG313718885">Unification of Performance Metric Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>DSO <link linkend="IG31371888325">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>duration <link linkend="IG31371888131">Performance Monitor Reporting Frequency and +Duration</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>dynamic adaptation <link linkend="IG3137188814">Dynamic Adaptation to Change</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>environ man page <link linkend="IG31371888135">Timezone Options</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>environment variables <link linkend="IG31371888136">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>error detection <link linkend="IG31371888233"><command>pmie</command> Error Detection</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_PMLOGGERCONTROL_PATH} file <link linkend="IG31371888296">Primary Logger</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_DIR}/etc/pcp.conf file <link linkend="IG31371888122">Common Directories and File Locations</link> <link linkend="IG31371888137">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_DIR}/etc/pcp.env file <link linkend="IG31371888121">Common Directories and File Locations</link> <link linkend="IG31371888138">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_RC_DIR}/pmcd file <link linkend="IG31371888127">Common Directories and File Locations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>evaluation frequency <link linkend="IG31371888193">Setting Evaluation Frequency</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>extensibility <link linkend="IG3137188821">PCP Extensibility</link> <link linkend="IG31371888314">Customizing and Extending PCP Services</link></primaryie> 
+</indexentry> + +<indexentry> + <primaryie>external equipment <link linkend="IG3137188871">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>fetching metrics <link linkend="ITch03-3">Fetching Metrics from Another Host</link> <link linkend="ITch03-6">Fetching Metrics from an Archive Log</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>file locations <link linkend="IG31371888120">Common Directories and File Locations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>firewalls <link linkend="IG31371888153">Running PCP Tools through a Firewall</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>flush command <link linkend="IG31371888253">Coordination between <command>pmlogger</command> and PCP tools</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>folios <link linkend="ITch07-9">PCP Archive Folios</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>functional domains <link linkend="ITch01-138">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>glossary <link linkend="ITchA-1">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>illegal label record <link linkend="IG31371888307">Illegal Label Record</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>inference engine <link linkend="IG31371888320">Inference Engine Customization</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>infrastructure support tools <link linkend="ITch01-76">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>installing PCP <link linkend="ITch02-0">Installing and Configuring Performance Co-Pilot +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>intrinsic operators <link linkend="IG31371888213"><command>pmie</command> Intrinsic Operators</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>I/O <link 
linkend="IG31371888326">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>IPC <link linkend="IG31371888327">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>kill command <link linkend="IG31371888304">Primary <command>pmlogger</command> Cannot Start</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>layered software services <link linkend="IG3137188868">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>lexical elements <link linkend="IG31371888189">Lexical Elements</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>libpcp_mmv library <link linkend="IG3137188886">Product Extensibility</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>libpcp_pmda library <link linkend="IG3137188885">Product Extensibility</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>log volumes <link linkend="IG31371888262">Log Volumes</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>logging</primaryie> + <seeie>archive logs</seeie> +</indexentry> + +<indexentry> + <primaryie>logical constants <link linkend="IG31371888201">Logical Constants</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>logical expressions <link linkend="IG31371888200"><command>pmie</command> Logical Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>macros <link linkend="IG31371888191">Macros</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>man command</primaryie> + <secondaryie>usage <link linkend="IG31371888168">Monitoring System Performance</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>max_host operator <link linkend="IG31371888220">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>metadata <link linkend="ITch01-150">Descriptions for Performance Metrics</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>metric domains <link 
linkend="IG313718886">Unification of Performance Metric Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>metric wraparound <link linkend="IG31371888231">Performance Metrics Wraparound</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>min_host operator <link linkend="IG31371888219">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>mkaf tool <link linkend="ITch01-48">Collecting, Transporting, and Archiving Performance Information +</link> <link linkend="IG31371888243">Introduction to Archive Logging</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>monitor configuration <link linkend="ITch02-5">Product Structure</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>monitor hosts <link linkend="IG3137188881">Collector and Monitor Roles</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>monitoring system performance <link linkend="IG31371888166">Monitoring System Performance</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>naming scheme <link linkend="IG3137188810">Uniform Naming and Access to Performance Metrics +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>netstat command <link linkend="IG31371888111">PMCD Does Not Start</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>network routers and bridges <link linkend="IG3137188870">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>network transportation tools <link linkend="IG3137188835">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>newhelp tool <link linkend="ITch01-119">Application and Agent Development</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>Mail servers <link linkend="IG3137188867">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>objectives 
<link linkend="IG313718882">Objectives</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>operational support tools <link linkend="IG3137188848">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>operators <link linkend="IG31371888205">Quantification Operators</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>overview <link linkend="ITch01-0">Introduction to Performance Co-Pilot</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmatop tool</primaryie> + <secondaryie>brief description <link linkend="IG3137188800">Performance Monitoring and Visualization</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmcd.options file <link linkend="ITch02-27">The <filename>pmcd.options</filename> File</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PCP</primaryie> + <secondaryie>acronym <link linkend="ITchA-4">Acronyms</link></secondaryie> + <secondaryie>archive logger deployment <link linkend="ITch08-3">PCP Archive Logger Deployment</link></secondaryie> + <secondaryie>collector deployment <link linkend="ITch08-0">PCP Collector Deployment</link></secondaryie> + <secondaryie>configuring and installing <link linkend="ITch02-3">Installing and Configuring Performance Co-Pilot +</link></secondaryie> + <secondaryie>conventions <link linkend="ITch03-0">Common Conventions and Arguments</link></secondaryie> + <secondaryie>distributed operation <link linkend="ITch01-2">PCP Distributed Operation</link></secondaryie> + <secondaryie>environment variables <link linkend="ITch03-14">PCP Environment Variables</link></secondaryie> + <secondaryie>extensibility <link linkend="IG3137188822">PCP Extensibility</link> <link linkend="ITch01-160">Product Extensibility</link></secondaryie> + <secondaryie>features <link linkend="IG313718880">Introduction to Performance Co-Pilot</link></secondaryie> + <secondaryie>log file option <link linkend="ITch03-8">Fetching Metrics from an Archive 
Log</link></secondaryie> + <secondaryie>naming conventions <link linkend="ITch03-2">Common Conventions and Arguments</link></secondaryie> + <secondaryie>pmie capabilities <link linkend="IG31371888177">Introduction to <command>pmie</command></link></secondaryie> + <secondaryie>pmie tool <link linkend="ITch06-7"><command>pmie</command> use of PCP services</link></secondaryie> + <secondaryie>tool customization <link linkend="ITch09-2">PCP Tool Customization</link></secondaryie> + <secondaryie>tool development <link linkend="ITch09-6">PCP Tool Development</link></secondaryie> + <secondaryie>tool summaries <link linkend="IG3137188827">Performance Monitoring and Visualization</link> <link linkend="IG3137188833">Collecting, Transporting, and Archiving Performance Information +</link> <link linkend="IG3137188847">Operational and Infrastructure Support</link> <link linkend="IG3137188860">Application and Agent Development</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pcp tool <link linkend="IG3137188850">Operational and Infrastructure Support</link> <link linkend="IG3137188859">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PCP Tutorials and Case Studies</primaryie> + <secondaryie>pminfo command <link linkend="IG31371888176">The <command>pminfo</command> Command</link></secondaryie> + <secondaryie>pmval command <link linkend="IG31371888173">The <command>pmval</command> Command</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>PCP_COUNTER_WRAP variable <link linkend="ITch03-15">PCP Environment Variables</link> <link linkend="IG31371888165">Performance Metric Wraparound</link> <link linkend="IG31371888230">Performance Metrics Wraparound</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PCP_STDERR variable <link linkend="ITch03-18">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PCPIntro command <link linkend="IG31371888112">PMCD Does 
Not Start</link> <link linkend="IG31371888132">Performance Monitor Reporting Frequency and +Duration</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PDU <link linkend="ITch02-28">The <filename>pmcd.options</filename> File</link> <link linkend="ITchA-5">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>Performance Co-Pilot</primaryie> + <seeie>PCP</seeie> +</indexentry> + +<indexentry> + <primaryie>Performance Metric Identifier</primaryie> + <seeie>PMID</seeie> +</indexentry> + +<indexentry> + <primaryie>performance metric wraparound <link linkend="ITch03-32">Performance Metric Wraparound</link> <link linkend="ITch06-21">Performance Metrics Wraparound</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>performance metrics</primaryie> + <secondaryie>concept <link linkend="IG3137188863">Performance Metrics</link></secondaryie> + <secondaryie>descriptions <link linkend="ITch01-149">Descriptions for Performance Metrics</link></secondaryie> + <secondaryie>methods <link linkend="ITch01-139">Sources of Performance Metrics and Their Domains</link></secondaryie> + <secondaryie>missing and incomplete values <link linkend="ITch02-39">Missing and Incomplete Values for Performance Metrics +</link></secondaryie> + <secondaryie>PMNS <link linkend="ITch01-148">Performance Metrics Name Space</link></secondaryie> + <secondaryie>retrospective sources <link linkend="ITch01-159">Retrospective Sources of Performance Metrics</link></secondaryie> + <secondaryie>sources <link linkend="ITch01-137">Sources of Performance Metrics and Their Domains</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>Performance Metrics Application Programming Interface</primaryie> + <seeie>PMAPI</seeie> +</indexentry> + +<indexentry> + <primaryie>Performance Metrics Collection Daemon</primaryie> + <seeie>PMCD</seeie> +</indexentry> + +<indexentry> + <primaryie>Performance Metrics Domain</primaryie> + <seeie>PMD</seeie> +</indexentry> + +<indexentry> + 
<primaryie>Performance Metrics Domain Agent</primaryie> + <seeie>PMDA</seeie> +</indexentry> + +<indexentry> + <primaryie>Performance Metrics Inference Engine</primaryie> + <seeie>pmie tool</seeie> +</indexentry> + +<indexentry> + <primaryie>Performance Metrics Name Space</primaryie> + <seeie>PMNS</seeie> +</indexentry> + +<indexentry> + <primaryie>performance monitoring <link linkend="IG3137188828">Performance Monitoring and Visualization</link> <link linkend="IG31371888167">Monitoring System Performance</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>performance visualization tools <link linkend="ITch07-4">Using Archive Logs with Performance Tools</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PM_INDOM_NULL <link linkend="ITch06-8"><command>pmie</command> and the Performance Metrics +Collection System</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmafm tool</primaryie> + <secondaryie>archive folios <link linkend="IG31371888244">Introduction to Archive Logging</link></secondaryie> + <secondaryie>brief description <link linkend="ITch01-50">Collecting, Transporting, and Archiving Performance Information +</link></secondaryie> + <secondaryie>interactive commands <link linkend="IG31371888295">PCP Archive Folios</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>PMAPI</primaryie> + <secondaryie>acronym <link linkend="ITchA-6">Acronyms</link></secondaryie> + <secondaryie>archive logs <link linkend="IG31371888246">Archive Logs and the PMAPI</link></secondaryie> + <secondaryie>brief description <link linkend="IG3137188861">Application and Agent Development</link></secondaryie> + <secondaryie>naming metrics <link linkend="ITch01-130">Performance Metrics</link></secondaryie> + <secondaryie>pmie capabilities <link linkend="IG31371888178">Introduction to <command>pmie</command></link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>PMCD</primaryie> + <secondaryie>acronym <link 
linkend="ITchA-7">Acronyms</link></secondaryie> + <secondaryie>brief description <link linkend="ITch01-52">Collecting, Transporting, and Archiving Performance Information +</link></secondaryie> + <secondaryie>collector host <link linkend="IG31371888196"><command>pmie</command> Metric Expressions</link></secondaryie> + <secondaryie>configuration files <link linkend="ITch02-26">PMCD Options and Configuration Files</link></secondaryie> + <secondaryie>diagnostics and error messages <link linkend="ITch02-23">PMCD Diagnostics and Error Messages</link></secondaryie> + <secondaryie>distributed collection <link linkend="IG3137188874">Distributed Collection</link> <link linkend="IG3137188875">Distributed Collection</link></secondaryie> + <secondaryie>maintenance <link linkend="ITch02-20">Performance Metrics Collection Daemon (PMCD)</link></secondaryie> + <secondaryie>monitoring utilities <link linkend="IG31371888309">Performance Co-Pilot Deployment Strategies</link></secondaryie> + <secondaryie>not starting <link linkend="IG31371888109">PMCD Does Not Start</link></secondaryie> + <secondaryie>PMCD_CONNECT_TIMEOUT variable <link linkend="IG31371888143">PCP Environment Variables</link></secondaryie> + <secondaryie>PMCD_PORT variable <link linkend="IG31371888144">PCP Environment Variables</link></secondaryie> + <secondaryie>PMCD_RECONNECT_TIMEOUT variable <link linkend="IG31371888146">PCP Environment Variables</link></secondaryie> + <secondaryie>PMCD_REQUEST_TIMEOUT variable <link linkend="IG31371888147">PCP Environment Variables</link></secondaryie> + <secondaryie>remote connection <link linkend="IG31371888101">Cannot Connect to Remote PMCD</link></secondaryie> + <secondaryie>starting and stopping <link linkend="IG3137188891">Starting and Stopping the PMCD</link></secondaryie> + <secondaryie>TCP/IP firewall <link linkend="IG31371888157">Running PCP Tools through a Firewall</link></secondaryie> + <secondaryie>${PCP_PMCDCONF_PATH} file <link linkend="IG31371888124">Common 
Directories and File Locations</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmcd tool</primaryie> + <seeie>PMCD</seeie> +</indexentry> + +<indexentry> + <primaryie>PMCD_CONNECT_TIMEOUT variable <link linkend="IG31371888106">Cannot Connect to Remote PMCD</link> <link linkend="ITch03-22">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMCD_PORT variable <link linkend="IG31371888160">Running PCP Tools through a Firewall</link> <link linkend="IG31371888113">PMCD Does Not Start</link> <link linkend="ITch03-23">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMCD_RECONNECT_TIMEOUT variable <link linkend="ITch03-24">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMCD_REQUEST_TIMEOUT variable <link linkend="ITch03-25">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmcd_wait tool <link linkend="IG3137188839">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmcd.conf file <link linkend="ITch02-29">The <filename>pmcd.conf</filename> File</link> <link linkend="ITch02-30">Controlling Access to PMCD with <filename>pmcd.conf</filename></link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmchart tool</primaryie> + <secondaryie>brief description <link linkend="IG3137188800nat">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>fetching metrics <link linkend="IG31371888115">Fetching Metrics from Another Host</link></secondaryie> + <secondaryie>man example <link linkend="Z1033415772tls">Monitoring System Performance</link></secondaryie> + <secondaryie>record mode <link linkend="IG31371888292">PCP Archive Folios</link></secondaryie> + <secondaryie>remote PMCD <link linkend="IG31371888102">Cannot Connect to Remote PMCD</link></secondaryie> + <secondaryie>short-term executions <link 
linkend="IG31371888265">Directory Organization for Archive Log Files</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmclient tool <link linkend="ITch01-121">Application and Agent Development</link> <link linkend="ITch01-123">Application and Agent Development</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmcollectl tool</primaryie> + <secondaryie>brief description <link linkend="IG3137188801">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>record mode <link linkend="IG31371888293">PCP Archive Folios</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmconfirm command</primaryie> + <secondaryie>error messages <link linkend="IG31371888141">PCP Environment Variables</link></secondaryie> + <secondaryie>visible alarm <link linkend="IG31371888180">Introduction to <command>pmie</command></link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>PMD <link linkend="IG3137188893">PMDA Installation on a PCP Collector Host</link> <link linkend="ITchA-9">Acronyms</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMDA</primaryie> + <secondaryie>acronym <link linkend="ITchA-10">Acronyms</link></secondaryie> + <secondaryie>collectors <link linkend="IG3137188880">Collector and Monitor Roles</link></secondaryie> + <secondaryie>customizing <link linkend="ITch09-1">Customizing the Summary PMDA</link></secondaryie> + <secondaryie>development <link linkend="ITch09-5">PMDA Development</link></secondaryie> + <secondaryie>installation <link linkend="ITch02-32">PMDA Installation on a PCP Collector Host</link></secondaryie> + <secondaryie>instance names <link linkend="IG31371888197"><command>pmie</command> Metric Expressions</link></secondaryie> + <secondaryie>libraries <link linkend="IG3137188823">PCP Extensibility</link></secondaryie> + <secondaryie>managing optional agents <link linkend="ITch02-31">Managing Optional PMDAs</link></secondaryie> + <secondaryie>monitoring utilities <link 
linkend="IG31371888310">Performance Co-Pilot Deployment Strategies</link></secondaryie> + <secondaryie>removal <link linkend="ITch02-35">PMDA Removal on a PCP Collector Host</link></secondaryie> + <secondaryie>unification <link linkend="IG313718887">Unification of Performance Metric Domains</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmdaapache tool <link linkend="IG3137188803">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdacisco tool <link linkend="ITch01-54">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdaelasticsearch tool <link linkend="IG3137188714">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdagfs2 tool <link linkend="IG3137188804">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdagluster tool <link linkend="IG3137188712">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdainfiniband tool <link linkend="IG3137188805">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdakvm tool <link linkend="IG3137188806">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdalustrecomm tool <link linkend="IG3137188807">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdamailq tool <link linkend="ITch01-58">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdamemcache tool <link linkend="IG3137188808">Collecting, Transporting, 
and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdammv tool <link linkend="IG3137188841">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdamysql tool <link linkend="IG3137188809">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdanamed tool <link linkend="IG3137188701">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdanginx tool <link linkend="IG3137188702">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdapostfix tool <link linkend="IG3137188703">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdapostgresql tool <link linkend="IG3137188704">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdaproc tool <link linkend="IG3137188705">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdarsyslog tool <link linkend="IG3137188706">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdasamba tool <link linkend="IG3137188707">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdasendmail tool <link linkend="IG3137188842">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdasnmp tool <link linkend="IG3137188708">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + 
+<indexentry> + <primaryie>pmdasummary tool <link linkend="ITch01-62">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdasystemd tool <link linkend="IG3137188709">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdavmware tool <link linkend="IG3137188710">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdaweblog tool <link linkend="IG3137188843">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdaxfs tool <link linkend="IG3137188711">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdbg facility <link linkend="ITch01-89">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmdumplog tool</primaryie> + <secondaryie>archive log contents <link linkend="IG31371888269">PCP Archive Contents</link></secondaryie> + <secondaryie>brief description <link linkend="ITch01-66">Collecting, Transporting, and Archiving Performance Information +</link></secondaryie> + <secondaryie>troubleshooting <link linkend="ITch07-23">Cannot Find Log</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmdumptext tool</primaryie> + <secondaryie>brief description <link linkend="ITch01-22">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>description <link linkend="IG31371888172">The <command>pmdumptext</command> Command</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmerr tool <link linkend="ITch01-91">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmgenmap tool <link linkend="ITch01-125">Application and Agent 
Development</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmhostname tool <link linkend="IG3137188852">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMID</primaryie> + <secondaryie>acronym <link linkend="ITchA-11">Acronyms</link></secondaryie> + <secondaryie>description <link linkend="IG3137188872">Sources of Performance Metrics and Their Domains</link> <link linkend="ITch01-146">Performance Metrics Name Space</link></secondaryie> + <secondaryie>PMNS names <link linkend="IG31371888316">Customizing the Summary PMDA</link></secondaryie> + <secondaryie>printing <link linkend="IG31371888174">The <command>pminfo</command> Command</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmie tool</primaryie> + <secondaryie>%-token <link linkend="IG31371888211"><command>pmie</command> Rule Expressions</link></secondaryie> + <secondaryie>arithmetic aggregation <link linkend="IG31371888214">Arithmetic Aggregation</link></secondaryie> + <secondaryie>arithmetic expressions <link linkend="ITch06-15"><command>pmie</command> Arithmetic Expressions</link></secondaryie> + <secondaryie>automated reasoning <link linkend="ITch06-4">Introduction to <command>pmie</command></link></secondaryie> + <secondaryie>basic examples <link linkend="IG31371888186">Basic <command>pmie</command> Usage</link></secondaryie> + <secondaryie>brief description <link linkend="ITch01-32">Performance Monitoring and Visualization</link> <link linkend="IG3137188853">Operational and Infrastructure Support</link></secondaryie> + <secondaryie>customization <link linkend="IG31371888184">Introduction to <command>pmie</command></link></secondaryie> + <secondaryie>developing rules <link linkend="ITch06-20">Developing and Debugging <command>pmie</command> +Rules</link></secondaryie> + <secondaryie>error detection <link linkend="ITch06-24"><command>pmie</command> Error Detection</link></secondaryie> + <secondaryie>examples <link 
linkend="ITch06-9">Simple <command>pmie</command> Usage</link> <link linkend="ITch06-10">Complex <command>pmie</command> Examples</link></secondaryie> + <secondaryie>fetching metrics <link linkend="IG31371888116">Fetching Metrics from Another Host</link></secondaryie> + <secondaryie>global files and directories <link linkend="IG31371888238">Global Files and Directories</link></secondaryie> + <secondaryie>instance names <link linkend="ITch06-23"><command>pmie</command> Instance Names</link></secondaryie> + <secondaryie>intrinsic operators <link linkend="ITch06-17"><command>pmie</command> Intrinsic Operators</link></secondaryie> + <secondaryie>language <link linkend="IG31371888179">Introduction to <command>pmie</command></link> <link linkend="IG31371888188">Specification Language for <command>pmie</command></link></secondaryie> + <secondaryie>logical expressions <link linkend="ITch06-16"><command>pmie</command> Logical Expressions</link></secondaryie> + <secondaryie>metric expressions <link linkend="ITch06-13"><command>pmie</command> Metric Expressions</link></secondaryie> + <secondaryie>performance metrics inference engine <link linkend="ITch06-0">Performance Metrics Inference Engine</link></secondaryie> + <secondaryie>pmieconf rules <link linkend="IG3137188830">Performance Monitoring and Visualization</link> <link linkend="IG31371888234">Creating <command>pmie</command> Rules with <command> +pmieconf</command></link></secondaryie> + <secondaryie>procedures <link linkend="IG31371888236">Creating <command>pmie</command> Rules with <command> +pmieconf</command></link> <link linkend="IG31371888237">Management of <command>pmie</command> Processes</link></secondaryie> + <secondaryie>rate conversion <link linkend="ITch06-14"><command>pmie</command> Rate Conversion</link></secondaryie> + <secondaryie>rate operator <link linkend="IG31371888223">The <command>rate</command> Operator</link></secondaryie> + <secondaryie>real examples <link 
linkend="ITch06-18"><command>pmie</command> Examples</link></secondaryie> + <secondaryie>remote PMCD <link linkend="IG31371888103">Cannot Connect to Remote PMCD</link></secondaryie> + <secondaryie>sample intervals <link linkend="ITch06-22"><command>pmie</command> Sample Intervals</link></secondaryie> + <secondaryie>setting evaluation frequency <link linkend="ITch06-12">Setting Evaluation Frequency</link></secondaryie> + <secondaryie>syntax <link linkend="ITch06-11">Basic <command>pmie</command> Syntax</link></secondaryie> + <secondaryie>transitional operators <link linkend="IG31371888225">Transitional Operators</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmevent tool</primaryie> + <secondaryie>brief description <link linkend="IG3137188713">Performance Monitoring and Visualization</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmieconf tool</primaryie> + <secondaryie>brief description <link linkend="IG3137188829">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>customization <link linkend="IG31371888185">Introduction to <command>pmie</command></link></secondaryie> + <secondaryie>rules <link linkend="IG31371888235">Creating <command>pmie</command> Rules with <command> +pmieconf</command></link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pminfo tool</primaryie> + <secondaryie>brief description <link linkend="ITch01-34">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>description <link linkend="ITch04-25">The <command>pminfo</command> Command</link></secondaryie> + <secondaryie>displaying the PMNS <link linkend="IG3137188897">Performance Metrics Name Space</link></secondaryie> + <secondaryie>PCP Tutorials and Case Studies <link linkend="IG31371888175">The <command>pminfo</command> Command</link></secondaryie> + <secondaryie>pmie arguments <link linkend="IG31371888187"><command>pmie</command> and the Performance Metrics +Collection System</link></secondaryie> 
+</indexentry> + +<indexentry> + <primaryie>pmstat tool</primaryie> + <secondaryie>brief description <link linkend="ITch01-37">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>description <link linkend="ITch04-20">The <command>pmstat</command> Command</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmlc tool</primaryie> + <secondaryie>brief description <link linkend="ITch01-68">Collecting, Transporting, and Archiving Performance Information +</link></secondaryie> + <secondaryie>description <link linkend="ITch07-17">Using <command>pmlc</command></link></secondaryie> + <secondaryie>dynamic adjustment <link linkend="IG31371888240">Introduction to Archive Logging</link></secondaryie> + <secondaryie>flush command <link linkend="IG31371888254">Coordination between <command>pmlogger</command> and PCP tools</link></secondaryie> + <secondaryie>PMLOGGER_PORT variable <link linkend="IG31371888150">PCP Environment Variables</link></secondaryie> + <secondaryie>show command <link linkend="IG31371888302">Primary <command>pmlogger</command> Cannot Start</link></secondaryie> + <secondaryie>SIGHUP signal <link linkend="IG31371888264">Log Volumes</link></secondaryie> + <secondaryie>TCP/IP firewall <link linkend="IG31371888159">Running PCP Tools through a Firewall</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmlock tool <link linkend="ITch01-95">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogcheck tool <link linkend="IG3137188844">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogconf tool <link linkend="IG3137188845">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogextract tool <link linkend="ITch01-72">Collecting, Transporting, and Archiving Performance Information +</link> <link 
linkend="ITch07-10">Manipulating Archive Logs with <command>pmlogextract</command></link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogger tool <link linkend="IG31371888149">PCP Environment Variables</link> <link linkend="IG31371888208"><command>pmie</command> Rule Expressions</link></primaryie> + <secondaryie>archive logs <link linkend="IG31371888117">Fetching Metrics from an Archive Log</link> <link linkend="ITch07-2">Introduction to Archive Logging</link></secondaryie> + <secondaryie>brief description <link linkend="IG3137188846">Collecting, Transporting, and Archiving Performance Information +</link></secondaryie> + <secondaryie>configuration <link linkend="IG31371888267">Configuration of <command>pmlogger</command></link> <link linkend="IG31371888298">Using <command>pmlc</command></link></secondaryie> + <secondaryie>cookbook tasks <link linkend="IG31371888271">Cookbook for Archive Logging</link></secondaryie> + <secondaryie>current metric context <link linkend="IG3137188865">Current Metric Context</link></secondaryie> + <secondaryie>folios <link linkend="IG31371888294">PCP Archive Folios</link></secondaryie> + <secondaryie>PCP tool coordination <link linkend="IG31371888251">Coordination between <command>pmlogger</command> and PCP tools</link></secondaryie> + <secondaryie>pmlc control <link linkend="IG31371888241">Introduction to Archive Logging</link></secondaryie> + <secondaryie>primary instance <link linkend="ITch07-16">Primary Logger</link></secondaryie> + <secondaryie>remote PMCD <link linkend="IG31371888104">Cannot Connect to Remote PMCD</link></secondaryie> + <secondaryie>TCP/IP firewall <link linkend="IG31371888158">Running PCP Tools through a Firewall</link></secondaryie> + <secondaryie>troubleshooting <link linkend="IG31371888300">Archive Logging Troubleshooting</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogger_check script <link linkend="IG3137188854">Operational and Infrastructure Support</link> <link 
linkend="ITch07-7">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogger_daily script <link linkend="IG3137188855">Operational and Infrastructure Support</link> <link linkend="IG31371888256">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogger_merge script <link linkend="IG3137188856">Operational and Infrastructure Support</link> <link linkend="IG31371888258">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMLOGGER_PORT variable <link linkend="IG31371888161">Running PCP Tools through a Firewall</link> <link linkend="IG31371888148">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmlogsummary tool <link linkend="IG3137188831">Performance Monitoring and Visualization +</link> <link linkend="ITch07-11">Summarizing Archive Logs with <command>pmlogsummary</command></link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmnewlog tool <link linkend="ITch01-100">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PMNS</primaryie> + <secondaryie>acronym <link linkend="ITchA-12">Acronyms</link></secondaryie> + <secondaryie>brief description <link linkend="IG3137188864">Performance Metrics</link></secondaryie> + <secondaryie>defined names <link linkend="IG3137188811">Uniform Naming and Access to Performance Metrics +</link></secondaryie> + <secondaryie>description <link linkend="ITch01-144">Performance Metrics Name Space</link></secondaryie> + <secondaryie>management <link linkend="IG31371888322">PMNS Management</link></secondaryie> + <secondaryie>metric expressions <link linkend="IG31371888194"><command>pmie</command> Metric Expressions</link></secondaryie> + <secondaryie>names <link linkend="IG31371888315">Customizing 
the Summary PMDA</link></secondaryie> + <secondaryie>PMNS <link linkend="ITch03-10">Alternate Performance Metric Name Spaces</link></secondaryie> + <secondaryie>syntax <link linkend="ITch09-4">PMNS Syntax</link></secondaryie> + <secondaryie>troubleshooting <link linkend="IG3137188898">Performance Metrics Name Space</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>PMPROXY_PORT variable <link linkend="ITch03-30">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmnsadd tool <link linkend="ITch01-102">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmnsdel tool <link linkend="ITch01-106">Operational and Infrastructure Support</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmprintf tool <link linkend="IG31371888140">PCP Environment Variables</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmprobe tool <link linkend="IG3137188832">Performance Monitoring and Visualization</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmrun tool <link linkend="IG31371888114">Common Conventions and Arguments</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmsnap tool</primaryie> + <secondaryie>brief description <link linkend="IG3137188857">Operational and Infrastructure Support</link></secondaryie> + <secondaryie>script usage <link linkend="IG31371888257">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmproxy tool</primaryie> + <secondaryie>brief description <link linkend="ITch01-38">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>pmproxy port <link linkend="IG31371888152">PCP Environment Variables</link></secondaryie> + <secondaryie>TCP/IP firewall <link linkend="IG31371888156">Running PCP Tools through a Firewall</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmstore tool</primaryie> + 
<secondaryie>brief description <link linkend="ITch01-112">Operational and Infrastructure Support</link></secondaryie> + <secondaryie>description <link linkend="ITch04-26">The <command>pmstore</command> Command</link></secondaryie> + <secondaryie>setting metric values <link linkend="IG31371888169">Monitoring System Performance</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmtrace tool <link linkend="ITch01-74">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>pmval tool</primaryie> + <secondaryie>brief description <link linkend="ITch01-42">Performance Monitoring and Visualization</link></secondaryie> + <secondaryie>description <link linkend="ITch04-23">The <command>pmval</command> Command</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>pmwebd tool <link linkend="IG3137188802">Collecting, Transporting, and Archiving Performance Information +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>primary archive <link linkend="IG31371888272">Primary Logger</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>primary logger <link linkend="ITch07-14">Primary Logger</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>protocol data units</primaryie> + <seeie>PDU</seeie> +</indexentry> + +<indexentry> + <primaryie>quantification operators <link linkend="IG31371888204">Quantification Operators</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>rate conversion <link linkend="IG31371888198"><command>pmie</command> Rate Conversion</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>rate operator <link linkend="IG31371888224">The <command>rate</command> Operator</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>relational expressions <link linkend="IG31371888202">Relational Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>reporting frequency <link 
linkend="IG31371888130">Performance Monitor Reporting Frequency and +Duration</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>retrospective analysis <link linkend="IG31371888248">Retrospective Analysis Using Archive Logs</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>roles</primaryie> + <secondaryie>collector <link linkend="ITch01-155">Collector and Monitor Roles</link> <link linkend="IG3137188888">Product Structure</link></secondaryie> + <secondaryie>monitor <link linkend="IG3137188879">Collector and Monitor Roles</link> <link linkend="IG3137188889">Product Structure</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>rule expressions <link linkend="IG31371888206"><command>pmie</command> Rule Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>sample intervals <link linkend="IG31371888232"><command>pmie</command> Sample Intervals</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>kernel data structures <link linkend="IG3137188866">Sources of Performance Metrics and Their Domains</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>scripts <link linkend="IG3137188858">Operational and Infrastructure Support</link> <link linkend="IG31371888255">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>service management <link linkend="IG31371888311">Quality of Service Measurement</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>set-valued performance metrics <link linkend="IG3137188878">Set-Valued Performance Metrics</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>show command <link linkend="IG31371888301">Primary <command>pmlogger</command> Cannot Start</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>SIGHUP signal <link linkend="IG31371888107">PMCD Not Reconfiguring after <literal>SIGHUP</literal></link> <link linkend="IG31371888263">Log 
Volumes</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>SIGINT signal <link linkend="IG31371888303">Primary <command>pmlogger</command> Cannot Start</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>SIGUSR1 signal <link linkend="IG31371888252">Coordination between <command>pmlogger</command> and PCP tools</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>single-valued performance metrics <link linkend="IG3137188877">Single-Valued Performance Metrics</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>PROXY protocol <link linkend="IG31371888155">Running PCP Tools through a Firewall</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>software <link linkend="IG3137188825">Overview of Component Software</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>subsystems <link linkend="IG3137188887">Product Structure</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>sum_host operator <link linkend="IG31371888217">Arithmetic Aggregation</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>syntax <link linkend="IG31371888323">PMNS Syntax</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>syslog function <link linkend="IG31371888181">Introduction to <command>pmie</command></link> <link linkend="IG31371888209"><command>pmie</command> Rule Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>system log file <link linkend="IG31371888182">Introduction to <command>pmie</command></link> <link linkend="IG31371888212"><command>pmie</command> Rule Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>target usage <link linkend="IG313718883">PCP Target Usage</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>TCP/IP</primaryie> + <secondaryie>acronym <link linkend="ITchA-13">Acronyms</link></secondaryie> + <secondaryie>collector and monitor hosts <link linkend="IG31371888154">Running PCP Tools through a 
Firewall</link></secondaryie> + <secondaryie>remote PMCD <link linkend="IG31371888105">Cannot Connect to Remote PMCD</link></secondaryie> + <secondaryie>sockets <link linkend="IG31371888145">PCP Environment Variables</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>text-based tools <link linkend="IG31371888170">Monitoring System Performance</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>time dilation <link linkend="ITch03-33">Time Dilation and Time Skew</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>time duration <link linkend="IG31371888129">Time Duration and Control</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>time window options <link linkend="IG31371888133">Time Window Options</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>time-stamped message <link linkend="IG31371888210"><command>pmie</command> Rule Expressions</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>timezone options <link linkend="ITch03-12">Timezone Options</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>tool customization <link linkend="IG31371888317">PCP Tool Customization</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>tool development <link linkend="IG31371888324">PCP Tool Development</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>tool options <link linkend="IG31371888118">General PCP Tool Options</link> <link linkend="ITch06-1">Performance Metrics Inference Engine</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>transient problems <link linkend="IG31371888164">Transient Problems with Performance Metric Values +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>transitional operators <link linkend="IG31371888226">Transitional Operators</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>troubleshooting</primaryie> + <secondaryie>archive logging <link linkend="ITch07-19">Archive Logging 
Troubleshooting</link></secondaryie> + <secondaryie>general utilities <link linkend="ITch02-42">Cannot Connect to Remote PMCD</link></secondaryie> + <secondaryie>kernel metrics <link linkend="IG3137188899">Kernel Metrics and the PMCD</link></secondaryie> + <secondaryie>PMCD <link linkend="ITch02-38">Troubleshooting</link> <link linkend="IG31371888100">Kernel Metrics and the PMCD</link></secondaryie> +</indexentry> + +<indexentry> + <primaryie>uniform naming <link linkend="IG313718889">Uniform Naming and Access to Performance Metrics +</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>units <link linkend="IG31371888192">Units</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>user interface components <link linkend="Z1033415461tls">Common Conventions and Arguments</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_BINADM_DIR}/pmcd file <link linkend="IG31371888125">Common Directories and File Locations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_LOG_DIR}/NOTICES file <link linkend="IG31371888183">Introduction to <command>pmie</command></link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_LOG_DIR}/pmcd/pmcd.log file <link linkend="IG31371888110">PMCD Does Not Start</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_PMCDOPTIONS_PATH} file <link linkend="IG31371888126">Common Directories and File Locations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_PMCDCONF_PATH} file <link linkend="IG3137188896">PMDA Installation on a PCP Collector Host</link> <link linkend="IG31371888108">PMCD Not Reconfiguring after <literal>SIGHUP</literal></link> <link linkend="IG31371888123">Common Directories and File Locations</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_SYSCONF_DIR}/pmlogger/config.default file <link linkend="IG31371888297">Primary Logger</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_PMLOGGERCONTROL_PATH}
file <link linkend="IG31371888259">Administering PCP Archive Logs Using <command>cron</command> Scripts +</link> <link linkend="IG31371888266">Directory Organization for Archive Log Files</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_DEMOS_DIR} <link linkend="IG31371888228"><command>pmie</command> Examples</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_VAR_DIR}/pmns/stdpmid file <link linkend="IG3137188895">PMDA Installation on a PCP Collector Host</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>${PCP_TMP_DIR}/pmlogger files <link linkend="IG31371888305">Primary <command>pmlogger</command> Cannot Start</link></primaryie> +</indexentry> + +<indexentry> + <primaryie>window options <link linkend="IG31371888134">Time Window Options</link></primaryie> +</indexentry> + +</index> + +</book> |