In the beginning all PCP archives were created by pmlogger.  So a
corrupt PCP archive meant pmlogger had a bug or was interrupted in
some manner.

We fixed the bugs (!) and hardened the checking of archives to ensure
we could process as much as possible of an interrupted archive.

But things changed and archives could be created in more ways ...
- pmlogmerge, so more checks to ensure the semantic consistency of
  the input archives, but again we could assume pmlogmerge would
  create correct archives
- pmlogreduce, same as pmlogmerge

With the introduction of libpcp_import and bindings for Perl and
Python, we now have the possibility of an infinite number of scripts
creating archives using low-level calls that can be combined to
produce an infinite variety of corrupted archives.  The first example
of this class is https://bugzilla.redhat.com/show_bug.cgi?id=958745
but we should expect more of these to appear in the future.

When these problems appear, the initial triage effort is directed
(rightly) towards the replay tool that is failing, and it takes
considerable time and effort to determine that the root cause is
a corrupted archive, not an application or libpcp failure.

Some corruption we can (and do) catch in libpcp.  We could probably
do more there, but the most common usage with "interp" mode replay
makes it almost impossible to check timestamps on the fly, so the
bug above would be most unlikely to be found there.

So, in the spirit of the original Unix filesystem, I'm proposing
an ncheck/icheck (none of your wimpy fsck in those days) tool,
pmlogcheck.

The objective would be to have one anally retentive tool that can
assert the "goodness" of a PCP archive, as the first step in any
triage, even before pmdumplog is used.

pmlogcheck would certainly be a multi-pass tool, initially using
no libpcp services to read blocks of the files, and then graduating
to the low-level libpcp routines once basic sanity has been
established.
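
As a rough illustration of the first, libpcp-free pass, the sketch
below walks a file of length-framed records and checks that the
header and trailer lengths agree.  It assumes each record is framed
as a 4-byte big-endian length, a payload, and the same 4-byte length
again, with the length counting both framing words.  That layout and
the names check_record_framing and frame are assumptions made for
this example, not part of any PCP API.

```python
import io
import struct


def frame(payload):
    """Hypothetical helper: wrap a payload in the assumed header/trailer framing."""
    length = len(payload) + 8          # payload plus two 4-byte length words
    word = struct.pack(">i", length)
    return word + payload + word


def check_record_framing(stream):
    """Walk length-framed records in a binary stream.

    Returns (number_of_good_records, problem), where problem is None if
    the whole stream was consumed cleanly, else a string describing the
    first framing error found.
    """
    good = 0
    while True:
        offset = stream.tell()
        head = stream.read(4)
        if not head:
            return good, None          # clean end of file
        if len(head) < 4:
            return good, f"offset {offset}: truncated header"
        (length,) = struct.unpack(">i", head)
        if length < 8:
            return good, f"offset {offset}: implausible record length {length}"
        body = stream.read(length - 8)
        tail = stream.read(4)
        if len(body) < length - 8 or len(tail) < 4:
            return good, f"offset {offset}: record truncated"
        (trailer,) = struct.unpack(">i", tail)
        if trailer != length:
            return good, (f"offset {offset}: header length {length} "
                          f"!= trailer length {trailer}")
        good += 1


if __name__ == "__main__":
    data = frame(b"label") + frame(b"record one")
    print(check_record_framing(io.BytesIO(data)))
    print(check_record_framing(io.BytesIO(data[:-2])))
```

Stopping at the first framing error is deliberate in this sketch: once
a length word is wrong there is no reliable way to find the start of
the next record, so later passes would only work on the records
counted as good.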

The sorts of checks it might try would include:

x. process temporal index if any
   [ ] check label
   [ ] if missing, warn
   [ ] else load temporal index
x. process meta file
   [ ] check label
   [ ] check header-trailer len for every record
   [ ] check timestamps for indoms are monotonically increasing (and >=
       label record start)
   [ ] check timestamp and offset against temporal index (if available)
   [ ] if OK load metadata
x. process each metric volume file
   [ ] check label
   [ ] check header-trailer len for every record
   [ ] check timestamps are monotonically increasing (and >= label record
       start)
   [ ] check timestamp and offset against temporal index (if available)
   [ ] check pmids are defined in metadata
   [ ] check instances are defined in metadata
   [ ] check value encoding matches metadata type
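
The timestamp checks in the list above could look something like the
following sketch.  It assumes the file offsets and timestamps have
already been extracted by an earlier pass, treats "monotonic
increasing" as "never goes backwards" (equal timestamps are accepted),
and compares each record only with its immediate predecessor.
check_timestamps and its arguments are hypothetical names, not an
existing interface.

```python
def check_timestamps(records, label_start):
    """Check record timestamps against the label start and each other.

    records is an iterable of (offset, timestamp) pairs in file order.
    Timestamps may be floats or (sec, usec) tuples, as long as they are
    comparable with label_start.  Returns a list of (offset, message)
    pairs, one per offending record; an empty list means no problems.
    """
    problems = []
    previous = label_start
    for offset, stamp in records:
        if stamp < label_start:
            problems.append((offset, "timestamp before label start"))
        elif stamp < previous:
            problems.append((offset, "timestamp goes backwards"))
        # Assumption: compare against the immediately preceding record,
        # so one bad record is reported once rather than cascading.
        previous = stamp
    return problems


if __name__ == "__main__":
    sample = [(132, 10.0), (200, 10.0), (260, 12.5), (320, 11.0), (380, 13.0)]
    for offset, message in check_timestamps(sample, label_start=10.0):
        print(f"offset {offset}: {message}")
```

Reporting the file offset with each problem is meant to make it easy
to cross-check the same record against the temporal index or a raw
dump of the file.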