-rw-r--r--  README.SMART-POINTERS |  75
-rw-r--r--  README.THREADS        | 164
2 files changed, 202 insertions(+), 37 deletions(-)
diff --git a/README.SMART-POINTERS b/README.SMART-POINTERS
index c01280a8..abd65d8c 100644
--- a/README.SMART-POINTERS
+++ b/README.SMART-POINTERS
@@ -8,9 +8,78 @@ in favor of boost::shared_ptr, which is simpler, more flexible, and
probably more efficient (although I haven't checked that). New code
should use shared_ptr and/or scoped_ptr.
-
-
-
+NEW STUFF:
+
+ The preferred convention for new code is to use boost's shared_ptr
+and scoped_ptr classes. If you just have a private pointer that
+should never be "owned" by other code, you might consider scoped_ptr.
+Otherwise, shared_ptr is a better bet. You should think twice before
+writing bare pointers.
+
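+ For instance, a class might hold its strictly private data in a
+scoped_ptr and its shareable data in a shared_ptr.  The following is
+an illustrative sketch with made-up class names, not actual aptitude
+code:
+
+      #include <boost/make_shared.hpp>
+      #include <boost/scoped_ptr.hpp>
+      #include <boost/shared_ptr.hpp>
+
+      class index_file { };          // hypothetical helper classes
+      class progress_display { };
+
+      class download_cache
+      {
+        // Owned exclusively by download_cache; never handed out.
+        boost::scoped_ptr<index_file> index;
+
+        // May be shared with, and outlive, other components.
+        boost::shared_ptr<progress_display> progress;
+
+      public:
+        download_cache()
+          : index(new index_file),
+            progress(boost::make_shared<progress_display>())
+        {
+        }
+
+        const boost::shared_ptr<progress_display> &get_progress() const
+        {
+          return progress;
+        }
+      };
+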
+ shared_ptr is NOT MAGIC. You can still get memory leaks and crashes
+if you misuse it; it is merely a tool to help you manage your object
+lifetimes. The following notes will help.
+
+ * Do not write this code:
+
+      boost::shared_ptr<T> t(new T(...));
+
+ It is (potentially) exception-unsafe. Instead of thinking hard
+ about whether it's safe or not, do this, which is both safe and
+ more efficient:
+
+ boost::shared_ptr<T> t = boost::make_shared<T>(...);
+
+ * Be very careful about creating circular chains of shared_ptrs.
+ These cannot be reclaimed by the shared_ptr mechanism and will
+    result in memory leaks (see the sketch at the end of this list).
+    Many of the notes below are just ways of avoiding this problem.
+
+ * Although shared_ptr technically allows "sharing" of pointers, most
+ shared_ptr objects in aptitude should have a single owner at any
+ given time, just like bare pointers. For instance, look at how
+ search patterns are normally handled. pattern::create() might be
+ invoked from some subroutine. It creates a pattern in a
+ shared_ptr and returns it, passing ownership to its caller. Its
+ caller in turn invokes the constructor of another object, passing
+ the pattern's shared_ptr in. The constructor takes ownership and
+ stores the pattern in a member variable for later use.
+
+ The difference between shared_ptr and bare pointers is that
+ violating the above pattern will at worst produce a memory leak
+ rather than a crash.
+
+ * For "leaf" objects that don't store smart pointers to anything
+ else, using shared_ptr to freely share references is reasonable.
+
+ * Objects that form an acyclic graph can be safely
+ reference-counted. You can force the objects in a module to form
+ an acyclic graph by making their constructor(s) take pointers to
+ other objects, and by not providing a way to modify links after
+ the fact. For instance, the match pattern objects and the (not
+ shared) fragment objects use this approach. Be sure that you
+ don't have strong references to objects outside the module,
+ though, or you can still be bitten by this problem.
+
+ * Be very careful about embedding shared_ptrs into callbacks. If
+ the type you're embedding can't contain strong references, this is
+ OK. If it can, you can end up creating a hidden reference cycle
+ that will be hard to debug. Consider using boost's weak_ptr
+ object, which automatically becomes invalid when its parent
+ shared_ptr is destroyed.
+
+ Callbacks passed to the main loop are a special case, since those
+ are normally not stored permanently anywhere (i.e., they have no
+ strong incoming references), and there may be lifetime issues that
+ make weak references unacceptable. If you rely on this, be sure
+ to document it in comments and ensure that you really don't take a
+   strong reference to the callback.  It's better not to do this, but
+   there were a few places where I found it handy.
+
+ * Because there is a cost to using shared_ptr, don't use it
+ willy-nilly, particularly for low-level code.
+
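+ As a concrete illustration of the reference-cycle problem and the
+weak_ptr remedy mentioned in the notes above, here is a minimal
+sketch (made-up classes, not actual aptitude code):
+
+      #include <boost/make_shared.hpp>
+      #include <boost/shared_ptr.hpp>
+      #include <boost/weak_ptr.hpp>
+
+      class child;
+
+      class parent
+      {
+      public:
+        // The parent holds a strong reference to its child ...
+        boost::shared_ptr<child> c;
+      };
+
+      class child
+      {
+      public:
+        // ... so the child holds only a weak reference back.  If this
+        // were a shared_ptr, the two objects would keep each other
+        // alive forever and neither would ever be destroyed.
+        boost::weak_ptr<parent> p;
+      };
+
+      void example()
+      {
+        boost::shared_ptr<parent> par = boost::make_shared<parent>();
+        par->c = boost::make_shared<child>();
+        par->c->p = par;
+
+        // To follow the back-link, promote it and check for expiry:
+        if(boost::shared_ptr<parent> strong = par->c->p.lock())
+        {
+          // strong keeps the parent alive inside this block.
+        }
+      }   // par goes out of scope; both objects are destroyed.
+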
+OLD DEPRECATED STUFF:
aptitude uses smart pointers in a number of ways internally. These
techniques generally simplify the code and make it easy to program
diff --git a/README.THREADS b/README.THREADS
index fd569194..61c8bbc8 100644
--- a/README.THREADS
+++ b/README.THREADS
@@ -1,39 +1,134 @@
+ Summary:
+
+ 1. Don't use threads.
+ 2. If you still want to use threads, only communicate with the UI
+ thread by injecting events into its main loop.
+ 3. Handle slots you receive from the main thread via safe_slot.
+ 4. Watch out for what happens when the cache is closed underneath
+ you; use the appropriate signals to avoid disaster.
+  5. Use job_queue_thread to implement your background thread if
+ at all possible.
+
The basic threading philosophy followed in aptitude can be
summarized thus: "if you aren't sure it's safe, do it in the main
-thread; if it's going to take a long time, try to make it safe to run
-in the background". The mechanism for doing so is
-interface-dependent: for the curses frontend, it's
-cwidget::toplevel::post_event (cwidget/toplevel.h), and for the GTK+
-frontend, it's gui::post_event. Both of these functions place a
-callback object into the global event queue and wake up the main
-thread (if necessary).
+thread; if it's going to take a long time, run it in the background
+and only communicate with the main thread via event injection".
+aptitude does *not* perform any concurrent computation in the normal
+sense; it uses threads only to keep the UI responsive while a
+background computation or I/O operation is going on.
- The actual threading constructs used are the pthread wrappers in
-cwidget/generic/threads.h (and also cwidget/generic/event_queue.h).
+ The general pattern is this:
+
++-- UI thread -------+-------------------------+--- handle event ---
+                      \                       / post_event
++---- background job ------------------------+----------------------
+
+ The UI thread is assumed to be running an event-handling loop where
+it waits for a new event to appear; when an event appears, it invokes
+a callback contained in the event. Even the command-line UI can do
+this, although when it needs to interact with a background thread, it
+usually just uses a "dummy" main loop.
+
+ Note that the background thread continues running. In general,
+background threads have their own input queue (containing "work to be
+done") and are automatically started and stopped as work appears or
+disappears. The class job_queue_thread provides this behavior and
+should be used for new code if possible. This architecture should
+eventually evolve into something like Java's system of thread pools
+and executors, but it's not there yet.
+
+ Background threads are given a safe mechanism for passing slot
+objects to the main thread, in the form of a callback function.
+
+void post_thunk(const sigc::slot<void> &);
+
+ You can assume that the slot object will be wrapped in a safe_slot.
+It's probably not necessary, but I've had enough bad experiences with
+slots and threads (see below) that I prefer handling them at arm's
+length.
+
+ When the main thread has time, the slot will be invoked, then
+destroyed. Normally, the slot will in turn invoke some callback slot
+provided by the main thread.
+
+ Using this idiom allows the main thread code to be insulated from
+the details of the threading model. As far as it's concerned, it
+invokes a frontend routine, passing a callback, and some time later
+receives an event indicating that the background process completed.
+
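+ For example, a background job might report completion like this (a
+rough sketch; do_long_download and download_finished are made-up
+names, and error handling is omitted):
+
+    #include <sigc++/sigc++.h>
+
+    bool do_long_download();                    // some long-running work
+    void post_thunk(const sigc::slot<void> &);  // as declared above
+
+    void download_finished(bool success)
+    {
+      // Runs later in the main thread, invoked from the event loop;
+      // update the UI here based on 'success'.
+      (void)success;
+    }
+
+    void run_background_job()
+    {
+      // Runs in the background thread.
+      bool ok = do_long_download();
+
+      // Hand a completion callback to the main thread; the main loop
+      // wakes up, invokes the slot, and then destroys it.
+      post_thunk(sigc::bind(sigc::ptr_fun(&download_finished), ok));
+    }
+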
+ *** THIS IS FAR SUPERIOR TO THE THREADING MODELS APTITUDE WAS
+PREVIOUSLY USING AND SHOULD BE USED FOR ALL NEW CODE UNLESS YOU HAVE A
+VERY, VERY, VERY GOOD REASON ***. You can probably find some vestiges
+of the old threading model in various places (especially the
+resolver). Don't imitate them.
+
+
+
+ Even with this mechanism, you have to watch out for a few gotchas.
+A particularly nasty one involves object lifetimes. If you want to
+invoke a method on an object that was created in a background thread,
+but you want to invoke it in the main thread, it's tempting to do
+this:
- Background threads are spawned to do long-running operations, but they
-generally only access data that is "owned" by the thread. More
-details on the currently existing background threads below.
+ post_thunk(sigc::mem_fun(*obj, &obj_type::method));
- These threads generally take some sort of "continuation" object
-that's invoked when the background process is finished; it's expected
-that this object will probably post some sort of event to the main
-thread.
+ The problem is that unless obj is a bare pointer and obj_type::method
+ends with "delete this", one of two things will happen:
+
+ 1) No-one is managing obj's lifetime, and it will leak.
+
+ 2) The background thread holds strong references to obj, and the
+ thunk might try to invoke a method on a destroyed obj (or,
+ worse, encounter sigc++ type-safety issues when the implicit
+ weak reference is being invalidated).
+
+ The safest way to pass objects between threads in this way is to
+manage the object's lifetime with shared_ptr, and use a *keepalive
+slot*:
+
+ post_thunk(keepalive(sigc::mem_fun(*obj, &obj_type::method), obj));
+
+ This will take a temporary shared reference to obj, ensuring that it
+is not deleted until the thunk fires. Note that this idiom cannot
+create cycles, since the only new reference to obj is stored in the
+posted thunk, which will eventually be destroyed by the main loop.
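+
+ Roughly speaking, the effect is as if the shared_ptr were bound into
+the slot as an extra, ignored argument.  The following is an
+illustrative equivalent, not necessarily keepalive's actual
+implementation:
+
+    post_thunk(sigc::bind(sigc::hide(sigc::mem_fun(*obj, &obj_type::method)),
+                          obj));
+
+ The bound copy of obj lives inside the posted thunk and is released
+when the main loop destroys it.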
+
+ An alternative is to use bare pointers and manage the object
+lifetime by hand. This is error-prone and I recommend against it.
+
+
+
+ Very little global or shared state should be accessed by background
+threads. The one large exception is the currently loaded apt cache;
+any background thread that needs to perform computations on packages
+will obviously need access to the cache (even if it was not a global
+variable, it would still be widely shared).
+
+ The apt module provides sigc signals that are invoked in the main
+thread prior to destroying the cache object and after a new one is
+created. If you create a background thread that accesses the cache,
+you should hook into these signals, stopping your thread on the first
+one and resuming it on the second one. The job_queue_thread class
+provides methods that you can hook into to achieve this effect.
+
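+ A minimal sketch of this hookup follows; every name in it is a
+placeholder, so check the apt module's headers and job_queue_thread
+for the real signal and method names:
+
+    #include <sigc++/sigc++.h>
+
+    // Assumed to exist, for illustration only:
+    extern sigc::signal0<void> cache_closed;    // before the cache dies
+    extern sigc::signal0<void> cache_reloaded;  // after a new cache exists
+
+    class my_background_thread
+    {
+    public:
+      static void stop();   // drain the job queue and park the worker
+      static void start();  // spin the worker back up
+    };
+
+    void hook_cache_signals()
+    {
+      cache_closed.connect(sigc::ptr_fun(&my_background_thread::stop));
+      cache_reloaded.connect(sigc::ptr_fun(&my_background_thread::start));
+    }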
+
+
+ The actual threading constructs used are the pthread wrappers in
+cwidget/generic/threads.h (and also cwidget/generic/event_queue.h).
- Several of aptitude's background threads exist to serialize work
-that's too time-consuming to perform in the main thread. The template
-job_event_queue (in src/generic/util/job_event_queue.h) provides a
-generic implementation of this behavior via the Curiously Recurring
-Template Pattern.
+WARNINGS ABOUT THINGS THAT ARE DANGEROUS:
Things that you might think are threadsafe but aren't include:
* sigc++ objects. Not only do you have to watch out for manual
additions and deletions to connection lists during invocation, you
also have to watch out for automatic invalidation of slots at any
- time. Best practice here is to confine sigc++ access to the main
- thread, or to use safe_slot (but read the caveats in its header).
+ time. In particular, COPYING SIGC++ SLOTS IS NOT THREADSAFE. If
+ you write a background thread that accepts a slot from the main
+ thread (for instance, as a callback), you should wrap that slot in
+ a safe_slot to ensure that it is only copied in the main thread.
+ Read safe_slot's header for more details.
* Smart pointers other than boost::shared_ptr. Most smart pointers
that aptitude uses are NOT threadsafe. This means that *EVEN
@@ -53,21 +148,22 @@ Template Pattern.
constructs, so I decided to see whether it was possible to just be
very careful about handling reference-counted objects.
-Existing background threads:
+Some existing background threads:
* The cwidget library creates threads to handle keyboard input,
certain asynchronous signals, and timed events. You generally
don't need to worry about these.
- * Downloads are performed by a background thread. In keeping with
- the overall philosophy, only the actual download is handled in
- this way -- the setup of the download and any actions taken once
- it completes are handled by the main thread. The gatekeeper for
- downloads is in download_thread.cc; it provides the basic thread
- harness, as well as a convenience class that forwards the various
- download messages to a foreground progress meter. (these are
- basically inter-thread RPC calls, and they block the download
- thread until the progress meter's method call returns a value)
+ * Downloads are performed by a background thread. This predates
+ job_queue_thread and doesn't use it. Only the actual download is
+ handled in the background thread -- the setup of the download and
+ any actions taken once it completes are handled by the main
+ thread. The gatekeeper for downloads is in download_thread.cc; it
+ provides the basic thread harness, as well as a convenience class
+ that forwards the various download messages to a foreground
+ progress meter. (these are basically inter-thread RPC calls, and
+ they block the download thread until the progress meter's method
+ call returns a value)
* The problem resolver runs in a background thread. This thread
always exists, even when there is no resolver (in which case it
@@ -78,4 +174,4 @@ Existing background threads:
* From the GTK+ interface, changelog parsing and checking for
changelogs in the download cache both happen in a background
- thread.
+ thread, using job_queue_thread.