Notes on using the PCP QA Suite =============================== Preliminaries ------------- PCP needs to be installed on the local host, with pmcd operational. The sample PMDA needs to be installed. Basic getting started --------------------- There is some local configuration needed ... check the file "common.config" ... this script uses heuristics to set a number of interesting variables, specifically: $PCPQA_CLOSE_X_SERVER The $DISPLAY setting for an X server that is willing to accept connections from X clients running on the local machine. $PCPQA_FAR_PMCD The hostname for a host running pmcd, but the host is preferably a long way away (over a WAN) for timing test. $PCPQA_HYPHEN_HOST The hostname for a host running pmcd, with a hyphen (-) in the hostname. Next, mk.qa_hosts is a script that includes heuristics for selecting and sorting the list of potential remote PCP QA hosts (qa_hosts.master). Refer to the comments in qa_hosts.master, and make appropriate changes. For each of the potential remote PCP QA hosts, the following must be set up: (a) PCP installed, (b) pmcd(1) running, (c) a login for the user "pcpqa" needs to be created, and then set up in such a way that ssh/scp will work without the need for any password, i.e. these sorts of commands $ ssh pcpqa@pcp-qa-host some-command $ scp some-file pcpqa@pcp-qa-host:some-dir must work correctly when run from the local host. The "pcpqa" user's environment must also be initialized so that their shell's path includes all of the PCP binary directories, so that all PCP commands are executable without full pathnames. Of most concern would be auxilliary directory (usually /usr/pcp/bin, /usr/share/pcp/bin or /usr/libexec/pcp/bin) where commands like pmlogger(1), pmhostname(1), mkaf(1), etc.) are installed. Once you've modified common.config and qa_hosts.master, then run "chk.setup" to validate the settings. For test 051 we need five local hostnames that are valid, although PCP does not need to be installed there, nor pmcd(1) running. The five hosts listed in 051.hosts (the comments at the start of this file explain what is required) should suffice for most installations. The PCP QA tests are designed to be run by a non-root user. Where root privileges are needed, e.g. to stop or start pmcd, install/remove PMDAs, etc. the "sudo" application is used. When using sudo for QA, your local or pcpqa user needs to be able to execute commands as root without being prompted for a password. This can be achieved by adding the following line to the /etc/sudoers file (or in more recent versions of sudo, a /etc/sudoers.d/pcpqa file): pcpqa ALL=(ALL) NOPASSWD: ALL Some tests are graphical, and wish to make use of your display. For authentication to success, you may find you need to perform some access list updates, e.g. "xhost +local:" for such tests to pass (e.g. test 325). You can now verify your QA setup, by running: ./check 000 The first time you run "check" (see below) it will descend into the src directory (see below) and make all of the QA test programs and dynamic PCP archives, so some patience may be required. If test 000 fails, it may be that you have locally developed PMDAs or optional PMDAs installed. Edit common.filter, and modify the _filter_top_pmns() procedure to strip the top-level name components for any new metric names (there are lots of examples already there) ... if these are distributed PMDAs, you should send patches back to pcp@mail.performancecopilot.org. Doing the Real Work ------------------- check ... This script runs tests and verifies the output. In general, test NNN is expected to terminate with an exit status of 0, no core file and produce output that matches that in the file NNN.out ... failures leave the current output in NNN.out.bad, and may leave a more verbose trace that is useful for diagnosing failures in NNN.full. The command line options to check are: NNN run test NNN (leading zeros will be added as necessary to the test sequence number, so 00N and N are equivalent) NNN- all tests >= NNN NNN-MMM all tests in the range NNN ... MMM -l diffs in line mode (the default is to use xdiff or similar) -n show me, do not run any tests -q quick mode, by-pass the initial setup integrity checks (recommended that you do not use this the first time, nor if the last run test failed) -g xxx include tests from a named group (xxx) ... refer to the "groups" file -x xxx exclude tests from a named group (xxx) ... refer to the "groups" file If none of the NNN variants or -g is specified, then the default is to run all tests. Each of the NNN scripts that may be run by check follows the same basic scheme: - include some optional shell procedures and set variables to define the local configuration options - optionally, check the run-time environment to see if it makes sense to run the test at all, and if not echo the reason to the file NNN.notrun and exit ... check will notice the NNN.notrun file and skip any testing of the exit status or comparison of output - define $tmp as a prefix to be used for all temporary files, and install a trap handler to remove temporary files when the scipt exits - optionally, check the run-time environment to choose one of a number of expected output formats, and link the selected file to NNN.out ... if the same output is expected in all environments, the NNN.out file will already exist as part of the PCP QA distribution - run the test - optionally save all the output in the file NNN.full ... this is only useful for debugging test failures - filter the output to produce deterministic output that will match NNN.out if the test has been successful remake NNN This script creates a new NNN.out file. Since the NNN.out files are precious, and reflect the state of the qualified and expected output, they should typically not be changed unless some change has been made to the NNN script or the filters it uses. new Make sure "group" is writeable, then run "new" to create the skeletal framework of a new test. It is strongly suggested that you base your test on an existing test ... pay particular attention to making the output deterministic so the test uses the "not run" protocols (see 009 and check for examples) to avoid running the test (and hence failing) if an optional application, feature or platform is not available, and uses appropriate filters (see common.filter for lots of useful filters already packaged as shell procedures). show-me ... Report differences between the NNN.out and NNN.out.bad files. By default, uses all of the NNN.out.bad files in the current directory, but can also specify test numbers or ranges of test numbers on the command line. Other options may be used to fetch good and bad output files from various exotic remote locations (refer to the script). Make in the src Directory ----------------------------- The src directory contains a number of test applications that are designed to exercise some of the more exotic corners of the PCP functionality. In making these applications, you may see this ... Error: trace_dev.h and ../../src/include/trace_dev.h are different! make: [trace_dev.h] Error 1 (ignored) this is caused by the source for the pcp_trace library being out of sync with the src applications. If this happens, please ... 1. cd src 2. diff -u trace_dev.h ../../src/include/trace_dev.h and mail the differences to pcp@mail.performancecopilot.org so we can refine the Makefiles to avoid cosmetic differences 3. mv trace_dev.h trace_dev.h.orig cp ../../src/include/trace_dev.h trace_dev.h 4. make 008 Issues ---------- Test 008 depends on the local disk configuration, so you need to make your own 008.out file (or rather a variant that 008 will link to 008.out when the test is run). Refer to the 008 script, but here is the basic recipe: $ touch touch 008.out.`hostname` $ remake 008 $ mv 008.out 008.out.`hostname` Be aware that it can be adversely influenced by temporary disks like USB sticks, mobile phones, or other transient storage that may come and go in your test systems. Fixes ----- If you find something that does not work, and fix it, or create additional QA tests, please send the details to pcp@mail.performancecopilot.org.