author: rillig <rillig>, 2006-02-26 23:38:07 +0000
committer: rillig <rillig>, 2006-02-26 23:38:07 +0000
commit: d05052ab81f4bc048f08d8e638ab9227fe9e2df2
tree: cc1c0efd86e57b159e09b121c75bfda963898321 (/pkgtools/pkglint)
parent: e2de76bab24f94fd8e29a2d63ef2a3538afc80ec
Added the book ``Design and implementation of pkglint''.
Diffstat (limited to 'pkgtools/pkglint')
-rw-r--r-- pkgtools/pkglint/files/doc/Makefile | 26
-rw-r--r-- pkgtools/pkglint/files/doc/chap.code.xml | 218
-rw-r--r-- pkgtools/pkglint/files/doc/chap.defs.xml | 25
-rw-r--r-- pkgtools/pkglint/files/doc/chap.intro.xml | 15
-rw-r--r-- pkgtools/pkglint/files/doc/chap.statemachines.xml | 65
-rw-r--r-- pkgtools/pkglint/files/doc/chap.types.xml | 419
-rw-r--r-- pkgtools/pkglint/files/doc/pkglint.xml | 34
-rw-r--r-- pkgtools/pkglint/files/doc/statemachine.patch.dia | bin 0 -> 2385 bytes
-rw-r--r-- pkgtools/pkglint/files/doc/statemachine.shellcmd.dia | bin 0 -> 3531 bytes
-rw-r--r-- pkgtools/pkglint/files/doc/stylesheet.xsl | 4
10 files changed, 806 insertions, 0 deletions
diff --git a/pkgtools/pkglint/files/doc/Makefile b/pkgtools/pkglint/files/doc/Makefile
new file mode 100644
index 00000000000..fa8bde4a426
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/Makefile
@@ -0,0 +1,26 @@
+# $NetBSD: Makefile,v 1.1 2006/02/26 23:38:07 rillig Exp $
+#
+
+XMLDOCS+= pkglint.xml
+XMLDOCS+= chap.intro.xml
+XMLDOCS+= chap.defs.xml
+XMLDOCS+= chap.types.xml
+XMLDOCS+= chap.code.xml
+XMLDOCS+= chap.statemachines.xml
+
+IMAGES+= statemachine.patch.png
+IMAGES+= statemachine.shellcmd.png
+
+.PHONY: all
+all: pkglint.html
+
+pkglint.html: ${XMLDOCS} ${IMAGES} stylesheet.xsl
+ xmlto -m stylesheet.xsl html-nochunks pkglint.xml
+
+.PHONY: clean
+clean:
+ rm -f *.html *.png
+
+.SUFFIXES: .dia .png
+.dia.png:
+ dia -e ${.TARGET:Q} -t png ${.IMPSRC}
diff --git a/pkgtools/pkglint/files/doc/chap.code.xml b/pkgtools/pkglint/files/doc/chap.code.xml
new file mode 100644
index 00000000000..6eb19490caf
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/chap.code.xml
@@ -0,0 +1,218 @@
+<!-- $NetBSD: chap.code.xml,v 1.1 2006/02/26 23:38:07 rillig Exp $ -->
+
+<chapter id="code">
+<title>Code structure</title>
+
+ <para>In this chapter, I give an overview of how the &pkglint;
+ code is organized, starting with the <function>main</function>
+ function, passing the functions that check a single line and
+ finally arriving at the infrastructure that makes writing the
+ other functions easier.</para>
+
+<sect1 id="code.overview">
+<title>Overview</title>
+
+ <para>The &pkglint; code is structured in modular, easy to
+ understand procedures. These procedures can be further
+ classified with respect to what they do. There are procedures
+ that check a file, others that check the lines of a file, and
+ still others that check a single line. These classes of
+ procedures are described in the following sections in a
+ top-down fashion.</para>
+
+ <para>If nothing special is said about which procedures call
+ which others, you may assume that procedures of a certain rank
+ only call procedures that are of a strictly lower rank. For
+ example, no <function>checkline_*</function> will ever call
+ <function>checkfile_*</function>. Sometimes, functions of the
+ same rank are called, but these cases are documented
+ explicitly.</para>
+
+</sect1>
+
+<sect1 id="code.select">
+<title>Selecting the proper checking function</title>
+
+ <para>The <function>main</function> procedure of &pkglint; is a
+ simple loop around a TODO list containing pathnames of items (I
+ couldn't think of a better name here). The decision of which
+ checks to apply to a given item is made in
+ <function>checkitem</function>, which checks whether the item is
+ a file or a directory and dispatches the actual checking to
+ specialized procedures.</para>
+
+</sect1>
+
+<sect1 id="code.dir">
+<title>Checking a directory</title>
+
+ <para>The procedures that check a directory are
+ <function>checkdir_root</function> for the pkgsrc root
+ directory, <function>checkdir_category</function> for a category
+ of packages and <function>checkdir_package</function> for a
+ single package.</para>
+
+</sect1>
+
+<sect1 id="code.file">
+<title>Checking a file</title>
+
+ <para>Since the dispatching for files requires much code, it has
+ been put into a separate procedure called
+ <function>checkfile</function>, which further dispatches the
+ call to the other procedures.</para>
+
+ <para>The procedures that check a specific file are
+ <function>checkfile_ALTERNATIVES</function>,
+ <function>checkfile_DESCR</function>,
+ <function>checkfile_distinfo</function>,
+ <function>checkfile_extra</function>,
+ <function>checkfile_INSTALL</function>,
+ <function>checkfile_MESSAGE</function>,
+ <function>checkfile_mk</function>,
+ <function>checkfile_patch</function> and
+ <function>checkfile_PLIST</function>. For most of the
+ procedures, it should be obvious to which files they are
+ applied. A distinction is made between buildlink3 files and
+ other <filename>Makefiles</filename>, as some additional checks
+ apply to buildlink3 files. Of course, these procedures use
+ pretty much the same code for checking, and this is where the
+ <function>checklines_*</function> functions step in.</para>
+
+ <para>The <function>checkfile_package_Makefile</function>
+ function is somewhat special in that it expects four parameters
+ instead of only one. This is because loading the package data
+ has been separated from the actual checking.</para>
+
+</sect1>
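Annotation (not part of the commit): the dispatch chain described above, from the TODO loop in main through checkitem to checkfile, could be sketched roughly as follows. pkglint itself is written in Perl; Python is used here only for illustration. The mapping from file names to procedures and the helper names `choose_file_check`, `check_item` and `main` are assumptions, since the text only says the mapping "should be obvious". Directory checks are reduced to a single placeholder.

```python
import fnmatch
import os

# Assumed mapping from file name patterns to the checkfile_* procedures
# named in the text; the first matching pattern wins.
FILE_CHECKS = [
    ("ALTERNATIVES", "checkfile_ALTERNATIVES"),
    ("DESCR", "checkfile_DESCR"),
    ("distinfo", "checkfile_distinfo"),
    ("INSTALL", "checkfile_INSTALL"),
    ("MESSAGE", "checkfile_MESSAGE"),
    ("PLIST", "checkfile_PLIST"),
    ("patch-*", "checkfile_patch"),
    ("*.mk", "checkfile_mk"),
]


def choose_file_check(path):
    """Return the name of the procedure that would check this file."""
    base = os.path.basename(path)
    for pattern, procedure in FILE_CHECKS:
        if fnmatch.fnmatchcase(base, pattern):
            return procedure
    # Anything unrecognized goes to the catch-all procedure.
    return "checkfile_extra"


def check_item(path):
    """Decide whether an item is a directory or a file and dispatch."""
    if os.path.isdir(path):
        # Placeholder for checkdir_root, checkdir_category, checkdir_package.
        return "checkdir"
    if os.path.isfile(path):
        return choose_file_check(path)
    return "error: no such file or directory"


def main(todo):
    """Simple loop around a TODO list of pathnames."""
    results = []
    while todo:
        item = todo.pop(0)
        results.append((item, check_item(item)))
    return results
```

Keeping the pattern table separate from the loop means a new file type only needs one more table entry, which matches the document's rule that higher-rank procedures only call lower-rank ones.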
+
+<sect1 id="code.lines">
+<title>Checking the lines in a file</title>
+
+ <para>This class of procedures consists of
+ <function>checklines_trailing_empty_lines</function>,
+ <function>checklines_package_Makefile_varorder</function> and
+ <function>checklines_mk</function>. The middle one is too
+ complex to be included in
+ <function>checkfile_package_Makefile</function>, and the other
+ two are so generally useful that they deserve to be procedures
+ of their own.</para>
+
+ <para>The <function>checklines_mk</function> procedure makes
+ heavy use of the various <function>checkline_*</function>
+ functions that are explained in the next section.</para>
+
+</sect1>
+
+<sect1 id="code.line">
+<title>Checking a single line in a file</title>
+
+ <para>This class of procedures checks a single line of a file.
+ The number of parameters differs for most of these procedures,
+ as some need more context information and others don't.</para>
+
+ <para>The procedures that are applicable to any file type are
+ <function>checkline_length</function>,
+ <function>checkline_valid_characters</function>,
+ <function>checkline_valid_characters_in_variable</function>,
+ <function>checkline_trailing_whitespace</function>,
+ <function>checkline_rcsid_regex</function>,
+ <function>checkline_rcsid</function>,
+ <function>checkline_relative_path</function>,
+ <function>checkline_relative_pkgdir</function>,
+ <function>checkline_spellcheck</function> and
+ <function>checkline_cpp_macro_names</function>.</para>
+
+ <para>The rest of the procedures are specific to
+ <filename>Makefile</filename>s:
+ <function>checkline_mk_text</function>,
+ <function>checkline_mk_shellword</function>,
+ <function>checkline_mk_shelltext</function>,
+ <function>checkline_mk_shellcmd</function>,
+ <function>checkline_mk_vartype_basic</function>,
+ <function>checkline_mk_vartype_basic</function>,
+ <function>checkline_mk_vartype</function> and
+ <function>checkline_mk_varassign</function>.</para>
+
+ <para>This class of procedures contains the most code in
+ &pkglint;. The procedures that check shell commands and shell
+ words both have around 200 lines, and the largest procedure is
+ the check for predefined variable types, which has almost 500
+ lines. But the code is not complex at all, since this procedure
+ contains a large switch for all the predefined types. The checks
+ for a single type usually fit on a single screen.</para>
+
+</sect1>
+
+<sect1 id="code.infrastructure">
+<title>The &pkglint; infrastructure</title>
+
+ <para>To keep the code in the checking procedures small and
+ legible, an additional layer of procedures is needed that
+ provides basic operations and abstractions for handling files as
+ collections of lines and for printing all diagnostics in a
+ common format that is suitable for further processing by
+ software tools.</para>
+
+ <para>Since October 2004, this part of &pkglint; has made use of
+ some of the object-oriented features of the Perl programming
+ language. It has worked quite well up to now, but it has not been
+ fun to write object-oriented code in Perl. The most basic
+ feature I am missing is that the compiler checks whether an
+ object has a specific method or not, as I have often written
+ <code>$line->warning()</code> instead of
+ <code>$line->log_warning()</code>. This makes refactoring quite
+ difficult if you don't have a test suite with 100&nbsp;%
+ coverage, and I don't have that.</para>
+
+ <para>The classes are all defined in the
+ <varname>PkgLint</varname> namespace.</para>
+
+ <para>The traditional class is <classname>Line</classname>,
+ which represents a logical line of a file. In case of
+ <filename>Makefile</filename>s, line continuations are parsed
+ properly and combined into a single line. For all other files,
+ each logical line corresponds to a physical line. The
+ <classname>Line</classname> class has accessor methods to its
+ fields <methodname>fname</methodname>,
+ <methodname>lines</methodname> and
+ <methodname>text</methodname>. It also has the methods
+ <methodname>log_fatal</methodname>,
+ <methodname>log_error</methodname>,
+ <methodname>log_warning</methodname>,
+ <methodname>log_info</methodname> and
+ <methodname>log_debug</methodname> that all take one parameter,
+ the diagnostic message. The other methods are used less
+ often.</para>
+
+ <para>In January 2006, the logging functionality was improved.
+ Before that, a logical line could well consist of 300 physical
+ lines, so a diagnostic would say <quote>you have a bug somewhere
+ between line 100 and 400</quote>. This is not helpful.
+ Therefore, a new class has been invented that makes it possible
+ to map each character of a logical line to its corresponding
+ physical location in the file. The new representation of a
+ logical line is called a <classname>String</classname>. This
+ feature is still experimental, since the only method for logging
+ a string is <methodname>log_warning</methodname>. The others are
+ still missing. It is also completely unclear how lines that have
+ been fixed by &pkglint; should be represented, since fixing
+ moves characters around in the physical lines.</para>
+
+ <para>To make pattern matching with the new
+ <classname>String</classname> easy to use, the additional class
+ <classname>StringMatch</classname> has been created. It saves
+ the result of a <classname>String</classname> that is matched
+ against a regular expression. The canonical way to get such a
+ <classname>StringMatch</classname> is to call the
+ <methodname>String::match</methodname> method.</para>
+
+ <para>Since the <classname>StringMatch</classname> was
+ convenient to use, the <classname>SimpleMatch</classname> class
+ was added, which represents the result of matching a Perl string
+ against a regular expression. The class
+ <classname>Location</classname> is currently unused.</para>
+
+ </sect1>
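Annotation (not part of the commit): the idea behind the String class, mapping each character of a logical line back to its physical location, can be illustrated with the sketch below. It is written in Python rather than Perl, the function name `join_continuations` is hypothetical, and the whitespace handling (the trailing backslash becomes one space, leading whitespace of the continuation line is skipped) is a simplifying assumption, not necessarily what make or pkglint does.

```python
def join_continuations(physical_lines):
    """Join backslash-continued lines into logical lines.

    Returns a list of (text, origins) pairs, where origins[i] is the
    1-based (line_number, column) of text[i] in the file.
    """
    logical = []
    text, origins = [], []
    continued = False
    for lineno, raw in enumerate(physical_lines, start=1):
        body = raw.rstrip("\n")
        start = 0
        if continued:
            # Assumption: skip the indentation of a continuation line.
            start = len(body) - len(body.lstrip(" \t"))
        more = body.endswith("\\")
        end = len(body) - 1 if more else len(body)
        for col in range(start, end):
            text.append(body[col])
            origins.append((lineno, col + 1))
        if more:
            # The backslash is represented by one space in the logical
            # line and keeps the backslash's own position.
            text.append(" ")
            origins.append((lineno, end + 1))
            continued = True
        else:
            logical.append(("".join(text), origins))
            text, origins = [], []
            continued = False
    if text:
        # The file ended in the middle of a continued line.
        logical.append(("".join(text), origins))
    return logical
```

With such a mapping, a diagnostic for a match at index i of the logical line can report origins[i] instead of the whole range of physical lines.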
+
+</chapter>
diff --git a/pkgtools/pkglint/files/doc/chap.defs.xml b/pkgtools/pkglint/files/doc/chap.defs.xml
new file mode 100644
index 00000000000..86759b2a58f
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/chap.defs.xml
@@ -0,0 +1,25 @@
+<!-- $NetBSD: chap.defs.xml,v 1.1 2006/02/26 23:38:07 rillig Exp $ -->
+
+<chapter id="defs">
+<title>Definitions</title>
+
+ <para>In every non-toy program, the need arises to define new
+ words or redefine and clarify existing words. This is the list
+ of words that are used in pkglint.</para>
+
+ <variablelist>
+
+ <varlistentry><term>function</term><listitem><para>A subroutine
+ that is called to obtain a return value, rather than for its
+ side effects. Functions should restrict the user-visible side
+ effects to the necessary minimum.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term>procedure</term><listitem><para>A subroutine
+ that is not called to obtain a return value, but rather called
+ because of its side effects, like input/output.</para>
+ </listitem></varlistentry>
+
+ </variablelist>
+
+</chapter>
diff --git a/pkgtools/pkglint/files/doc/chap.intro.xml b/pkgtools/pkglint/files/doc/chap.intro.xml
new file mode 100644
index 00000000000..32a68f69e2e
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/chap.intro.xml
@@ -0,0 +1,15 @@
+<!-- $NetBSD: chap.intro.xml,v 1.1 2006/02/26 23:38:07 rillig Exp $ -->
+
+<chapter id="intro">
+<title>Introduction</title>
+
+ <para>&pkglint; is a static analysis tool for pkgsrc packages.
+ It finds many errors and problematic issues in those packages.
+ Starting in June 2004, &pkglint; has evolved into a powerful
+ tool that gives precise warnings wherever possible. With that
+ power comes much additional complexity, which cannot be
+ understood from reading the source code alone. This document
+ provides the necessary background information to understand what
+ the actual code does and why it is done this way.</para>
+
+</chapter>
diff --git a/pkgtools/pkglint/files/doc/chap.statemachines.xml b/pkgtools/pkglint/files/doc/chap.statemachines.xml
new file mode 100644
index 00000000000..4142cbc63af
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/chap.statemachines.xml
@@ -0,0 +1,65 @@
+<!-- $NetBSD: chap.statemachines.xml,v 1.1 2006/02/26 23:38:07 rillig Exp $ -->
+
+<chapter id="statemachines">
+<title>State machines</title>
+
+ <para>This chapter explains the various state machines that are
+ used in &pkglint;. It also provides graphical representations of
+ them that are much easier to read than the source code.</para>
+
+ <para>The opaque arrows in the figures represent transitions
+ that have a regular expression as condition. The hollow arrows
+ are the default transitions if nothing else matches. When
+ multiple regular expressions match in a state, the one that
+ appears first in the source code is chosen.</para>
+
+<sect1 id="statemachines.shellword">
+<title>The state machine for shell words</title>
+
+ <para>The state machine for single shell words is pretty simple,
+ and I think it can be understood from the source code alone. So
+ no graphical representation is provided.</para>
+
+</sect1>
+
+<sect1 id="statemachines.shellcommand">
+<title>The state machine for shell commands</title>
+
+ <figure id="statemachine.patch">
+ <title>The state transitions for shell commands</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="statemachine.shellcmd.png" format="PNG"/>
+ </imageobject>
+ <textobject>(Here should be a drawing of the state transitions.)</textobject>
+ </mediaobject>
+ </figure>
+
+ <para>The punch card symbols provide a means to go to a certain
+ state whenever the input matches the text on the punch
+ card.</para>
+
+</sect1>
+
+<sect1 id="statemachines.patch">
+<title>The state machine for patch files</title>
+
+ <para>The state machine for patch files is the newest of the
+ state machines. Here, the state transitions are separated from
+ the code, which makes the code itself pretty small. I don't know
+ yet if this programming style is elegant or not. Time will
+ tell.</para>
+
+ <figure id="statemachine.patch">
+ <title>The state transitions for patch files</title>
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="statemachine.patch.png" format="PNG"/>
+ </imageobject>
+ <textobject>(Here should be a drawing of the state transitions.)</textobject>
+ </mediaobject>
+ </figure>
+
+</sect1>
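Annotation (not part of the commit): a state machine whose transitions are kept apart from the code, with regular expressions as conditions, first-match-wins ordering and a default transition, might look like the sketch below. The states and patterns are hypothetical and much simpler than the ones in the figures; the names `TRANSITIONS`, `step` and `run` are made up for this illustration, which uses Python rather than Perl.

```python
import re

# Hypothetical transition table for a unified diff. Each state maps to an
# ordered list of (pattern, next_state); order decides between matches.
TRANSITIONS = {
    "text": [(r"^--- ", "old_header")],
    "old_header": [(r"^\+\+\+ ", "new_header")],
    "new_header": [(r"^@@ ", "hunk")],
    "hunk": [
        (r"^@@ ", "hunk"),
        # A line like "--- x" matches this pattern and the next one;
        # the entry that appears first is chosen.
        (r"^--- ", "old_header"),
        (r"^[-+ ]", "hunk"),
    ],
}

# Default transition, taken when no pattern of the current state matches.
DEFAULT_STATE = "text"


def step(state, line):
    """Return the state reached from `state` after reading `line`."""
    for pattern, next_state in TRANSITIONS[state]:
        if re.search(pattern, line):
            return next_state
    return DEFAULT_STATE


def run(lines, state="text"):
    """Feed all lines through the machine and record each new state."""
    trace = []
    for line in lines:
        state = step(state, line)
        trace.append(state)
    return trace
```

Because the table is plain data, the driving code stays a few lines long regardless of how many states are added, which is the property the text attributes to the patch file state machine.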
+
+</chapter>
diff --git a/pkgtools/pkglint/files/doc/chap.types.xml b/pkgtools/pkglint/files/doc/chap.types.xml
new file mode 100644
index 00000000000..baed9a81184
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/chap.types.xml
@@ -0,0 +1,419 @@
+<!-- $NetBSD: chap.types.xml,v 1.1 2006/02/26 23:38:07 rillig Exp $ -->
+
+<chapter id="types">
+<title>The &pkglint; type system</title>
+
+ <para>One of the most notable additions to &pkglint; is the
+ introduction of typed variables. Traditionally, in
+ <filename>Makefile</filename>s, all variables have the type
+ <type>String</type>. This prevents many useful checks from being
+ done before executing the code.</para>
+
+ <para>Before that, &pkglint; already did some checks based on
+ the values of the variables, but these checks had no common
+ structure that could be described easily.</para>
+
+<sect1 id="types.history">
+<title>History</title>
+
+ <para>In February 2005, initial support for the &pkglint; type
+ system was added. Some of the common variables were
+ assigned types such as <literal><type>Boolean</type></literal>
+ or <literal><type>Yes_Or_Undefined</type></literal>, which are
+ the two common ways to represent boolean variables in pkgsrc.
+ The list of typed variables was moved from the &pkglint;
+ code to an external file, <filename>makevars.map</filename>.
+ Many more basic types were added later.</para>
+
+ <para>In October 2005, the type system was extended to
+ allow <literal><type>List of
+ <replaceable>simple-type</replaceable></type></literal>, which
+ made it possible to handle variables like
+ <varname>DEPENDS</varname> and <varname>CFLAGS</varname>. One
+ month later, enumeration types were added, allowing the type of
+ <varname>PTHREAD_OPTS</varname> to be expressed as <literal>List
+ of { require native }</literal>.</para>
+
+</sect1>
+
+<sect1 id="types.syntax">
+<title>Syntax for defining types</title>
+
+<programlisting>
+ type ::= list-variant "of" simple-type
+ | simple-type
+ list-variant ::= "List" "!"? "+"?
+ simple-type ::= predefined-type
+ | enumeration
+ predefined-type ::= [A-Za-z][0-9A-Z_a-z]*
+ enumeration ::= "{" (enumeration-item)* "}"
+ enumeration-item ::= [-0-9A-Z_a-z]+
+</programlisting>
+
+</sect1>
+<sect1 id="types.semantics">
+<title>Semantics of the types</title>
+
+ <para>The <firstterm>simple types</firstterm> in &pkglint; are
+ either predefined types or enumeration types. A
+ <firstterm>predefined type</firstterm> is used by its name. See
+ <xref linkend="types.predefined"/> for the list of predefined
+ types.</para>
+
+ <para>Enumeration types are defined by writing a
+ <literal>{</literal>, followed by the enumeration items,
+ followed by a <literal>}</literal>. The enumeration items are
+ separated by space characters.</para>
+
+ <para>A list type can be constructed from a predefined type or
+ an enumeration. It is not possible to construct lists of lists,
+ since I have never needed that. When defining a list type, the
+ <literal>List</literal> keyword may be followed immediately
+ (that is, no white-space) by a <literal>!</literal> or a
+ <literal>+</literal>. A <literal>!</literal> means that the list
+ is an internal list, as opposed to an external list. Most lists
+ are external lists, so this has been chosen as the default
+ value. The differences between these two types are described in
+ the <ulink url="&pkgsrc-guide;/makefile.html">pkgsrc guide, the
+ chapter about <filename>Makefile</filename>s</ulink>. A
+ <literal>+</literal> restricts the valid operations on a
+ variable of that type. The only allowed operations are setting
+ the list to a commented empty value, for example
+ <literal>#&nbsp;none</literal>, or appending to the list, using
+ the <literal>+=</literal> operator.</para>
+
+</sect1>
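Annotation (not part of the commit): the grammar and the semantics of the `!` and `+` markers can be made concrete with a small parser. This is a sketch in Python, not the Perl code that reads makevars.map; the names `VarType` and `parse_type` are hypothetical, and allowing arbitrary white-space between tokens is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# One regular expression for the whole grammar: an optional list prefix,
# followed by either a predefined type name or an enumeration.
TYPE_RE = re.compile(
    r"(?P<list>List(?P<internal>!)?(?P<append>\+)?\s+of\s+)?"
    r"(?:(?P<name>[A-Za-z][0-9A-Z_a-z]*)"
    r"|\{(?P<items>[-0-9A-Z_a-z\s]*)\})"
)


@dataclass(frozen=True)
class VarType:
    is_list: bool        # "List ... of" prefix present
    internal: bool       # "!" given: internal instead of external list
    append_only: bool    # "+" given: only "+=" or a commented empty value
    basic: Optional[str]                   # predefined type name, if any
    enum_items: Optional[Tuple[str, ...]]  # enumeration items, if any


def parse_type(spec):
    """Parse a type definition; raise ValueError if it is malformed."""
    match = TYPE_RE.fullmatch(spec.strip())
    if match is None:
        raise ValueError("invalid type definition: %r" % spec)
    items = match.group("items")
    return VarType(
        is_list=match.group("list") is not None,
        internal=match.group("internal") is not None,
        append_only=match.group("append") is not None,
        basic=match.group("name"),
        enum_items=None if items is None else tuple(items.split()),
    )
```

Since the list prefix can occur at most once in the pattern, a definition such as `List of List of Pathname` is rejected, in line with the statement that lists of lists cannot be constructed.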
+<sect1 id="types.predefined">
+<title>Predefined types</title>
+
+ <para>There are many predefined types in &pkglint;, which are
+ described below.</para>
+
+ <!-- reference: pkglint.pl, revision 1.532 -->
+ <variablelist>
+
+ <varlistentry><term><literal><type>AwkCommand</type></literal></term>
+ <listitem><para>An awk command. Currently nothing is checked
+ here.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>BuildlinkDepmethod</type></literal></term>
+ <listitem><para>Must be either <literal>build</literal> or
+ <literal>full</literal>.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>BuildlinkDepth</type></literal></term>
+ <listitem><para>This type is only intended for one variable,
+ namely <varname>BUILDLINK_DEPTH</varname>, which is only
+ modified in <filename>buildlink3.mk</filename>
+ files.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>BuildlinkPackages</type></literal></term>
+ <listitem><para>The type of the variable
+ <varname>BUILDLINK_PACKAGES</varname>. Like
+ <literal><type>BuildlinkDepth</type></literal> above, this is
+ only used in <filename>buildlink3.mk</filename> files. This
+ variable has two different patterns to be modified. The first is
+ to remove the current package from itself, and the second is to
+ append the current package. This prevents a package from showing
+ up twice in the list.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Category</type></literal></term>
+ <listitem><para>One of the categories that a package may be
+ placed in. The list of categories has been assembled manually
+ when the type was introduced. There is no further agreement on
+ which categories are valid, besides the top-level
+ directory names in pkgsrc.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>CFlag</type></literal></term>
+ <listitem><para>One word in a <varname>CFLAGS</varname> or
+ <varname>CPPFLAGS</varname> variable. &pkglint; knows the flags
+ starting with <literal>-D</literal>, <literal>-U</literal> or
+ <literal>-I</literal>. Flags starting with
+ <literal>-O</literal>, <literal>-W</literal>,
+ <literal>-f</literal>, <literal>-g</literal> or
+ <literal>-m</literal> are silently accepted since they are
+ commonly used for the GNU compilers. As the pkgsrc framework
+ does not know how to handle most of these flags, care should be
+ taken.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Comment</type></literal></term>
+ <listitem><para>The comment of a package.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Dependency</type></literal></term>
+ <listitem><para>A simple dependency like
+ <literal>foopkg>=1.0</literal>, <literal>foopkg-[0-9]*</literal>
+ or <literal>foopkg-1.0</literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>DependencyWithPath</type></literal></term>
+ <listitem><para>A dependency (see above), followed by a colon
+ and a relative directory. For some packages, special variables
+ like <varname>USE_TOOLS</varname> should be used instead of an
+ explicit dependency.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>DistSuffix</type></literal></term>
+ <listitem><para>The value of the variable
+ <varname>EXTRACT_SUFX</varname>. The difference in the name is
+ intentional here, since <varname>EXTRACT_SUFX</varname> is a
+ misnomer. <varname>DIST_SUFX</varname> or
+ <varname>DIST_SUFFIX</varname> would be more appropriate.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Filename</type></literal></term>
+ <listitem><para>A filename, as defined in <ulink
+ url="http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_169">POSIX</ulink>.
+ This type further restricts the set of allowed characters.
+ See also <literal><type>Pathname</type></literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Filemask</type></literal></term>
+ <listitem><para>A shell globbing pattern that does not contain a
+ slash. See also <literal><type>Pathmask</type></literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Identifier</type></literal></term>
+ <listitem><para>In various places in pkgsrc, identifiers are
+ used. This type collects the most common naming conventions.
+ When you need a more specific check, you have to write your own
+ check.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>LdFlag</type></literal></term>
+ <listitem><para>A flag that is passed to the linker. Flags
+ starting with <literal>-L</literal> or <literal>-l</literal> are
+ accepted, as well as some others that are assumed to be handled
+ by the wrapper framework.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Mail_Address</type></literal></term>
+ <listitem><para>Checks for a very restricted subset of <ulink
+ url="http://www.ietf.org/rfc/rfc2822.txt">RFC
+ 2822</ulink>.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Message</type></literal></term>
+ <listitem><para>Messages are printed to the user as status
+ indicators. <ulink
+ url="http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/portlint/src/portlint.pl#rev1.77">As
+ opposed to FreeBSD</ulink>, they should not be quoted since they
+ may be used in contexts where quoting should be done
+ differently.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Option</type></literal></term>
+ <listitem><para>An option from the
+ <literal>PKG_OPTIONS</literal> framework. Options should not
+ contain underscores. They should be documented in
+ <filename>pkgsrc/mk/defaults/options.description</filename>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Pathlist</type></literal></term>
+ <listitem><para>A list of directories that are separated by
+ colons, like the popular environment variable
+ <varname>PATH</varname>. This type differs from the type
+ <literal><type>List of Pathname</type></literal> in the
+ character that is used as a separator.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Pathmask</type></literal></term>
+ <listitem><para>A shell globbing expression that may include
+ slashes.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Pathname</type></literal></term>
+ <listitem><para>A pathname, as defined in <ulink
+ url="http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_266">POSIX</ulink>.
+ See also <literal><type>Filename</type></literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Perl5Packlist</type></literal></term>
+ <listitem><para>A common error has been to refer to
+ <varname>INSTALLARCHLIB</varname> in the location of the packing
+ list. Therefore no references to other variables are
+ allowed.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>PkgName</type></literal></term>
+ <listitem><para>A package name should conform to some
+ restrictions, since the filename of the binary package is
+ created from it, which is then interpreted by pkg_add and the
+ like.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>PkgOptionsVar</type></literal></term>
+ <listitem><para>I had once made the mistake of referencing
+ <varname>PKGBASE</varname> in this variable, not knowing that
+ <varname>PKG_OPTIONS_VAR</varname> is used during preprocessing,
+ when <varname>PKGBASE</varname> is not yet defined. This type
+ prevents that mistake from being made again.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>PkgRevision</type></literal></term>
+ <listitem><para>The package revision must be a small integer.
+ The only place where this definition may occur is the package
+ <filename>Makefile</filename> itself, as this variable says
+ something about the individual package. There is no mechanism in
+ pkgsrc for something similar to <varname>PKGREVISION</varname>
+ that can be used in <filename>Makefile.common</filename>
+ files.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>PlatformTriple</type></literal></term>
+ <listitem><para>pkgsrc has been ported to many platforms, all of
+ which are identified using a triple of operating system,
+ operating system version and hardware
+ architecture.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Readonly</type></literal></term>
+ <listitem><para>This type is used to mark a variable as being
+ read-only to a package author. As this is not really a data type
+ but an access restriction, it will disappear in the next version
+ of the type system.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>RelativePkgDir</type></literal></term>
+ <listitem><para>A directory name that is relative to the package
+ directory. Mostly used for dependencies. See also
+ <literal><type>RelativePkgPath</type></literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>RelativePkgPath</type></literal></term>
+ <listitem><para>A pathname that is relative to the package
+ directory. It may point to either a regular file or a directory.
+ See also <literal><type>RelativePkgDir</type></literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>SVR4PkgName</type></literal></term>
+ <listitem><para>When converting pkgsrc packages to Solaris
+ packages, the package name is restricted to 9 characters, of
+ which five remain for the package
+ itself.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>ShellCommand</type></literal></term>
+ <listitem><para>A shell command is similar to a
+ <literal><type>List of ShellWord</type></literal>, except that
+ additional checks are performed on the direct use of tool names
+ or certain other deprecated shell commands.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>ShellWord</type></literal></term>
+ <listitem><para>A shell word is what the shell would regard as a
+ single word.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Stage</type></literal></term>
+ <listitem><para>In pkgsrc, there are phases, stages and steps.
+ Especially for the <varname>SUBST_STAGE</varname> variable, this
+ should always be one of the few predefined names, otherwise the
+ whole substitution group will be ignored.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Tool</type></literal></term>
+ <listitem><para>The pkgsrc tools framework contains very few
+ plausibility checks. To prevent spelling mistakes, the list of
+ valid tool names is loaded from the pkgsrc infrastructure files
+ and compared with the names that are used in the
+ <varname>USE_TOOLS</varname> variable.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>URL</type></literal></term>
+ <listitem><para>URLs appear in <varname>MASTER_SITES</varname>
+ and the <varname>HOMEPAGE</varname>. If a
+ <varname>MASTER_SITES</varname> group exists for a given URL, it
+ should be used instead of listing the URL directly.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>UserGroupName</type></literal></term>
+ <listitem><para>User and group names should consist only of
+ alphanumeric characters and the underscore. This restriction
+ ensures maximum portability of pkgsrc.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Userdefined</type></literal></term>
+ <listitem><para>Another instance of misuse of the type system.
+ But it helps to catch some errors in packages. This type will
+ disappear in the next version of the type system. See also
+ <literal><type>Readonly</type></literal>.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Varname</type></literal></term>
+ <listitem><para>Variable names are restricted to only uppercase
+ letters and the underscore in the basename, and arbitrary
+ characters in the parameterized part, following the dot.</para>
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>WrkdirSubdirectory</type></literal></term>
+ <listitem><para>The variable <varname>WRKSRC</varname> is
+ usually defined with reference to <varname>WRKDIR</varname>.
+ This check currently does nothing, and I don't know whether
+ anything here is worth checking.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>WrksrcSubdirectory</type></literal></term>
+ <listitem><para>Subdirectories of <varname>WRKSRC</varname> can
+ be used in <varname>CONFIGURE_DIRS</varname> and some other
+ variables. For convenience, they are interpreted relative to
+ <varname>WRKSRC</varname>, so package authors don't have to type
+ <literal>${WRKSRC}</literal> all the time.</para>
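+
+ <para>A short sketch of the effect, using a hypothetical
+ subdirectory named <filename>src</filename>:</para>
+
+<programlisting>
+# Interpreted relative to ${WRKSRC}, that is, as ${WRKSRC}/src.
+CONFIGURE_DIRS= src
+</programlisting>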
+ </listitem></varlistentry>
+
+ <varlistentry><term><literal><type>Yes</type></literal></term>
+ <listitem><para>This type is used for variables that are checked
+ using <literal>defined(VARNAME)</literal>. Their value is
+ interpreted as <quote>true</quote> whenever they are defined,
+ regardless of whether they are set to <literal>yes</literal> or
+ <literal>no</literal>.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>YesNo</type></literal></term>
+ <listitem><para>This type is used for variables that are checked
+ using <literal>defined(VARNAME) &amp;&amp;
+ !empty(VARNAME:M[Yy][Ee][Ss])</literal>. Unlike for <literal><type>Yes</type></literal>,
+ a value of <literal>no</literal> means <quote>no</quote> for
+ them.</para></listitem></varlistentry>
+
+ <varlistentry><term><literal><type>YesNoFromCommand</type></literal></term>
+ <listitem><para>Like <literal><type>YesNo</type></literal>, but
+ the value may be produced by a shell command using the
+ <literal>!=</literal> operator.</para></listitem></varlistentry>
+
+ </variablelist>
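+
+ <para>The difference between the types
+ <literal><type>Yes</type></literal>,
+ <literal><type>YesNo</type></literal> and
+ <literal><type>YesNoFromCommand</type></literal> may be easier to
+ see in a makefile fragment. This is only a sketch; the variable
+ names <varname>EXAMPLE_FLAG</varname>,
+ <varname>EXAMPLE_SWITCH</varname> and
+ <varname>EXAMPLE_PROBE</varname> are hypothetical and do not exist
+ in pkgsrc.</para>
+
+<programlisting>
+# Yes: only definedness counts, so even "no" enables the branch.
+EXAMPLE_FLAG= no
+.if defined(EXAMPLE_FLAG)
+# ... this branch is taken ...
+.endif
+
+# YesNo: the value itself is tested.
+EXAMPLE_SWITCH= no
+.if defined(EXAMPLE_SWITCH) &amp;&amp; !empty(EXAMPLE_SWITCH:M[Yy][Ee][Ss])
+# ... this branch is not taken ...
+.endif
+
+# YesNoFromCommand: the value is the output of a shell command.
+EXAMPLE_PROBE!= if test -d /nonexistent; then echo yes; else echo no; fi
+</programlisting>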
+
+</sect1>
+
+<sect1 id="types.future">
+<title>Future directions</title>
+
+ <para>The framework for defining data types in the
+ <filename>makevars.map</filename> file is insufficient. It does not
+ allow access control lists (ACLs) that specify which variables may
+ be read or written by the various actors in pkgsrc. At the moment,
+ the data type and the permissions are intermixed (see type
+ <literal><type>Readonly</type></literal>).</para>
+
+ <para>To overcome these design flaws, I will create a new type
+ system for &pkglint; that is based on the current one, but
+ provides ACLs to define the permitted operations on each
+ variable. Each ACL entry is then a combination of an
+ <firstterm>actor</firstterm> with an
+ <firstterm>operation</firstterm>.</para>
+
+ <table id="types.acl.actors">
+ <title>ACL Actors</title>
+ <tgroup cols="2">
+ <thead><row><entry>Actor</entry><entry>Description</entry></row></thead>
+ <tbody>
+ <row><entry>package</entry><entry>The package author</entry></row>
+ <row><entry>system</entry><entry>The pkgsrc infrastructure</entry></row>
+ <row><entry>bl3</entry><entry><filename>buildlink3.mk</filename> and <filename>builtin.mk</filename> files</entry></row>
+ <row><entry>user</entry><entry>The pkgsrc user via <filename>mk.conf</filename></entry></row>
+ <row><entry>cmdline</entry><entry>The pkgsrc user via the command line</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <table id="types.acl.actions">
+ <title>ACL Operations</title>
+ <tgroup cols="2">
+ <thead><row><entry>Operation</entry><entry>Description</entry></row></thead>
+ <tbody>
+ <row><entry>write</entry><entry>Create a variable or overwrite the value</entry></row>
+ <row><entry>append</entry><entry>Append to a list</entry></row>
+ <row><entry>default</entry><entry>Provide a default value for a variable</entry></row>
+ <row><entry>read</entry><entry>Use the value when executing the shell commands</entry></row>
+ <row><entry>readpp</entry><entry>Use the value during preprocessing</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
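+
+ <para>Read against ordinary makefile syntax, the operations
+ presumably correspond to the constructs sketched here. This
+ mapping is an interpretation of the operations table, not a
+ definition taken from &pkglint;, and
+ <varname>EXAMPLE_VAR</varname> and the target
+ <literal>show-example</literal> are hypothetical names.</para>
+
+<programlisting>
+# write: create the variable or overwrite its value
+EXAMPLE_VAR= value
+# append: add a word to the list
+EXAMPLE_VAR+= more
+# default: takes effect only if the variable is not yet defined
+EXAMPLE_VAR?= fallback
+
+# readpp: the value is needed while the makefile is being parsed
+.if !empty(EXAMPLE_VAR:Mvalue)
+.endif
+
+# read: the value is expanded when the shell command is executed
+show-example:
+	@echo ${EXAMPLE_VAR}
+</programlisting>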
+
+</sect1>
+
+</chapter>
diff --git a/pkgtools/pkglint/files/doc/pkglint.xml b/pkgtools/pkglint/files/doc/pkglint.xml
new file mode 100644
index 00000000000..133dbd48309
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/pkglint.xml
@@ -0,0 +1,34 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+ "http://www.oasis-open.org/docbook/xml/4.4/docbook.dtd"
+[
+ <!ENTITY pkglint "<literal>pkglint</literal>">
+ <!ENTITY pkgsrc-guide "http://www.NetBSD.org/Documentation/pkgsrc">
+
+ <!ENTITY chap.intro SYSTEM "chap.intro.xml">
+ <!ENTITY chap.defs SYSTEM "chap.defs.xml">
+ <!ENTITY chap.types SYSTEM "chap.types.xml">
+ <!ENTITY chap.code SYSTEM "chap.code.xml">
+ <!ENTITY chap.statemachines SYSTEM "chap.statemachines.xml">
+]>
+
+<!-- $NetBSD: pkglint.xml,v 1.1 2006/02/26 23:38:07 rillig Exp $ -->
+
+<book>
+<title>Design and implementation of &pkglint;</title>
+
+<bookinfo>
+<author>
+ <firstname>Roland</firstname>
+ <surname>Illig</surname>
+ <email>rillig@NetBSD.org</email>
+</author>
+</bookinfo>
+
+&chap.intro;
+&chap.defs;
+&chap.types;
+&chap.code;
+&chap.statemachines;
+
+</book>
diff --git a/pkgtools/pkglint/files/doc/statemachine.patch.dia b/pkgtools/pkglint/files/doc/statemachine.patch.dia
new file mode 100644
index 00000000000..bf1aac9919d
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/statemachine.patch.dia
Binary files differ
diff --git a/pkgtools/pkglint/files/doc/statemachine.shellcmd.dia b/pkgtools/pkglint/files/doc/statemachine.shellcmd.dia
new file mode 100644
index 00000000000..76b7442f210
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/statemachine.shellcmd.dia
Binary files differ
diff --git a/pkgtools/pkglint/files/doc/stylesheet.xsl b/pkgtools/pkglint/files/doc/stylesheet.xsl
new file mode 100644
index 00000000000..21cb5b848b8
--- /dev/null
+++ b/pkgtools/pkglint/files/doc/stylesheet.xsl
@@ -0,0 +1,4 @@
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+ version="1.0">
+ <xsl:param name="html.longdesc" select="0"/>
+</xsl:stylesheet>