Tla 1.3.1 Librification Experiment Progress Report
  
  2005-02-21
  
  This document refers to the tla development branch:

  Archive: lord@emf.net--librify-tla-2005
  Version: tla--factor-1--1.3.1

  (See links on the home page for help finding my archives.)

Context
    Last week I reported having just spent a week modifying the libawk
  part of tla to permit string-sharing between multiple relational
  and associative table entries.  (The bug report describes that
  work.)
  
I was surprised at how quickly that work went. I had only meant to spend some time estimating how long the task would take. Working on the estimate, I couldn't resist working on the modification itself, and when the deadline for the estimate arrived I had already finished the work I was supposed to estimate! (I provided an estimate of "-3 days" -- undertaking the already-complete task should count as adding three days to our schedule. :-)
    That got me thinking: the libawk cleanups were a small part of the
  long list of changes that would be necessary to incrementally
  transform libarch from its state in 1.3 into a more librified and
  more portable piece of code: friendly to scripting languages, GUIs,
  alternative front-ends, extension languages, and non-unix platforms.
  
I have long assumed (as described elsewhere) that an incremental librification starting from the 1.3 code base had poor chances for success.
    But if the fixes to libawk took "-3 days", perhaps the rest of
  incremental librification wouldn't be so impractical.
  
A Two Week Experiment
    Today marks the middle of what will be a two week experiment in 1.3
  librification.
  
    The form of the experiment is that I am spending these two weeks
  librifying as much of libarch as I can, subject to the constraints
  listed under "Librification Experiment Constraints" below.
  
Results After Week 1
Executive Summary
The experiment tries to answer the yes or no question:
The Executive Question
Is pursuing the librification effort for 1.3.1 a practical strategy for pursuing the high-level objectives for GNU Arch (such as Windows support, Unicode support, scripting and extension language support, a demonstrably/visibly robust implementation, etc.)?
One week into the experiment I will leave my betting money at 90/10: there is a 90% chance that the answer to the executive question will be a clear "yes" at the end of the two week experiment (which will be 28-Feb-2005).
One modest but interesting demonstration of the benefits of librification may be the improved error reporting it makes possible.
Technical Summary
See also ./tla-fn-anatomy.html.
Last Week
I spent most of the first week laying down a foundation for librification. That included:
        Factoring the source tree.  I set up a framework for splitting
  up files into multiple directories organized around modular
  and "modular cluster" boundaries.   The old contents of libarch
  now live in a directory called libarch-compat.   Those files will
  be incrementally deprecated: removed one by one and replaced by
  librified replacements in other libarch-* directories.
      
        Setting up a new front end. My plan is to librify "from the top
  down" as much as possible.  I'll work in a loop: pick an arch
  subcommand; rewrite it to use only librified code (no code from
  libarch-compat); repeat until there are no unlibrified commands.
  Therefore, I set up a new tla.c (the home of main).  The new
  front end first looks for a librified version of the subcommand.  If
  it doesn't find one, it runs the subcommand from libarch-compat.
      
        Designing and implementing the error signalling mechanism.
  libarch needs to consistently and robustly signal errors rather
  than (in the manner of 1.3 and prior) often simply exiting on
  discovery of an error condition.  Part of what I did this week was
  to install run-time system support (./src/tla/libach-errors) for
  error management.
      
        Rebuilding libawk.  The libawk cleanup modified all callers into
  libawk to be robust in the face of a libawk implementation that
  shared strings between multiple table entries.   It also modified 
  the existing libawk code to actually share strings
  opportunistically, resulting in at least a significant run-time
  space savings.   Many librified functions will need libawk-style
  functionality but the existing libawk implementation does not
  provide for error signalling and recovery and, in other ways, does
  not conform to the requirements for a fully librified libarch.
  Last week, collecting ideas and code-scraps from both the existing
  libawk and the code base for tla 2.0, I built a new
  implementation of the functionality in libawk.  The new libawk
  (now called libarch-values), in addition to being librified, adds
  support for table entries whose values are of types other than just
  string (e.g., integer-valued table entries).
      
        I also started on librifying the my-id command.  That involved
  working on revised support for option parsing, on the API for
  functions implementing tla sub-commands, and work on writing
  librified versions of the low-level functions for manipulating a
  user id.
      
        This work went well in a few senses.  I was able to cut-paste-edit a
  certain amount of code from both tla 1.3 and tla 2.0 to write
  what I needed in this context.  A great deal of the new code I
  simply rewrote, from scratch: this was code that is a minor
  variation on code that I've rewritten from scratch 3 or 4 times over
  the past few months.  The resulting code seems to work well,
  although testing has been scattershot.   I'm satisfied with
  the emerging calling conventions.
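
  The front-end fallback described in the list above (look for a
  librified subcommand first, otherwise run the one from
  libarch-compat) can be sketched roughly as follows.  This is only an
  illustration, not the code in tla.c: the type and function names
  (subcommand_fn, find_subcommand, the two tables, and the stand-in
  command functions) are all hypothetical, and the convention that a
  subcommand returns 0 on success instead of exiting is assumed here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical calling convention: a subcommand returns 0 on success and
   a nonzero code on error, rather than exiting the process itself. */
typedef int (*subcommand_fn) (int argc, char ** argv);

struct subcommand
{
  const char * name;
  subcommand_fn fn;
};

/* Stand-ins for real command implementations (illustrative only). */
int librified_my_id (int argc, char ** argv) { (void) argc; (void) argv; return 0; }
int compat_my_id (int argc, char ** argv)    { (void) argc; (void) argv; return 0; }
int compat_file_id (int argc, char ** argv)  { (void) argc; (void) argv; return 0; }

/* Commands already rewritten to use only librified code. */
static const struct subcommand librified_commands[] = {
  { "my-id", librified_my_id },
  { NULL, NULL }
};

/* Every command as implemented by the old code (libarch-compat). */
static const struct subcommand compat_commands[] = {
  { "my-id", compat_my_id },
  { "file-id", compat_file_id },
  { NULL, NULL }
};

/* Linear search of a NULL-terminated command table. */
static subcommand_fn
find_in_table (const struct subcommand * table, const char * name)
{
  for (; table->name; ++table)
    if (!strcmp (table->name, name))
      return table->fn;
  return NULL;
}

/* Prefer the librified implementation; fall back to the compat one.
   *is_librified reports which table (if any) supplied the result. */
subcommand_fn
find_subcommand (const char * name, int * is_librified)
{
  subcommand_fn fn = find_in_table (librified_commands, name);

  *is_librified = (fn != NULL);
  if (!fn)
    fn = find_in_table (compat_commands, name);
  return fn;
}
```

  With tables like these, librifying one more command amounts to
  adding an entry to the librified table; the compat entry stops being
  reached and can later be deleted along with its source file.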
    
    
  
  
  
    
Next Week and Possibly Beyond
In week two I have a little more work to do on the foundation: string primitive operations; better option parsing; the beginnings of a more portable file system protocol stack.
Beyond that I want to librify as much as I can in the remaining time.
      I'll consider the experiment to have produced a distinctly positive
  result (meaning that this approach to librification is worth
  pursuing) if I can get through librifying the file-id command and
  some commands that pertain to per-user (~/.arch-params)
  parameters.  Such an outcome implies an efficient framework for
  reimplementing CLI parsers, progress on librifying namespace
  management, project tree file system access, project tree arch
  control file access, and ~/.arch-params access.
    
      A positive outcome will warrant a follow-on series of three "wind
  sprints": one each to librify inventory, mkpatch, and dopatch.
  Past experience has shown that, once those commands are in place,
  implementing (in this case, librifying) the rest of tla is a
  relative cake walk.
  
  
  
  
Librification Experiment Constraints
    This experiment asks how long it will take to make a "clean up pass"
  over libarch such that, at the end of the process, the constraints
  described below are satisfied throughout the implementation of
  tla.
  
        Upward Compatibility -- For several roughly one-week intervals it is anticipated that only part of libarch will be librified. Nevertheless, tla must be fully operable at those intervals, passing both make test and changeset burn-in tests. The intent is that it should be possible (and ideally useful) to merge partially-complete librification work into the mainline early and often.

        Perfect Error Handling -- Librified parts of libarch must have perfected error handling. That means that they do not exit the process except under truly uncontinuable conditions -- most errors are signalled to callers. Resource allocation and deallocation must be robustly handled across all paths, including error-signaling paths through the code.

        Abstract String Handling -- No part of the libarch code should make presumptions about the internal representation of strings. Strings should be manipulated purely via procedural interfaces based on an ontology of code-point-index-addressable sequences of Unicode codepoints. Where specific codepoint values must be presumed, only graphical and space ASCII characters should be referred to.

        Reinforced On-disk Representation Abstractions -- libarch has long internally had a rough layering of its filesystem access. The vu layer, from hackerlab, provides a low-level indirection above POSIX system calls; for each of project-trees, ~/.arch-params directories, and file-system archives arch includes a roughly procedural interface. Within those three primary disk formats are ad-hoc formats for specific subcomponents (e.g., for files in ~/.arch-params or for patch logs in ./{arch}). Two of these subsystems (project tree and archive formats) have proven to need major restructuring for a clean port to Windows-based platforms. Throughout the code, abstraction barriers are unevenly preserved with leaks across them exposing details of path names, descriptors, and so forth. A librified libarch needs to clarify the layering in these components and ensure that the API to them is sufficiently abstract that changes to them (such as for a Windows port) can be made easily.

        Customizability, Extensibility, and Self-Documentation -- Third party developers have made very clear the demand for robust scripting language bindings to libarch. Work on arch GUIs, IDE bindings, and alternative front-ends suggests a similar demand. Some desirable capabilities in the core of arch, such as file-type-specific diff computation and patch application, suggest a demand for an arch which is not merely scriptable (callable as primitive routines from a scripting language) but extensible (can be configured to call out to extension language routines during core operations). The APIs, data types, error handling conventions, and available documentation used in a librified libarch must be scripting and extension language friendly.
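
  To make the Perfect Error Handling constraint concrete, here is a
  minimal sketch of a function that signals errors to its caller
  instead of exiting, and that releases its resources on every path,
  including the error paths.  It is not the libarch error mechanism:
  struct lib_error, the LIB_ERR_* codes, signal_error, and
  read_first_line are hypothetical names invented for this example,
  and the single-exit "goto cleanup" style is just one common C idiom
  for the job.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes and error record (not the libarch API). */
enum lib_error_code
{
  LIB_ERR_NONE = 0,
  LIB_ERR_OPEN,
  LIB_ERR_NOMEM,
  LIB_ERR_READ
};

struct lib_error
{
  enum lib_error_code code;
  char message[128];
};

/* Record an error for the caller, if the caller asked for details. */
static void
signal_error (struct lib_error * err, enum lib_error_code code, const char * what)
{
  if (err)
    {
      err->code = code;
      snprintf (err->message, sizeof err->message, "%s", what);
    }
}

/* Read the first line of PATH into a freshly allocated string stored in
   *OUT (the caller frees it).  Returns 0 on success and -1 on error, with
   *ERR filled in when ERR is non-NULL.  Never exits the process. */
int
read_first_line (const char * path, char ** out, struct lib_error * err)
{
  enum { LINE_MAX_LEN = 256 };
  FILE * f = NULL;
  char * buf = NULL;
  int status = -1;

  *out = NULL;

  f = fopen (path, "r");
  if (!f)
    {
      signal_error (err, LIB_ERR_OPEN, "unable to open file");
      goto cleanup;
    }

  buf = malloc (LINE_MAX_LEN);
  if (!buf)
    {
      signal_error (err, LIB_ERR_NOMEM, "out of memory");
      goto cleanup;
    }

  if (!fgets (buf, LINE_MAX_LEN, f))
    {
      signal_error (err, LIB_ERR_READ, "file is empty or unreadable");
      goto cleanup;
    }

  buf[strcspn (buf, "\n")] = '\0';   /* strip the trailing newline */
  *out = buf;
  buf = NULL;                        /* ownership passes to the caller */
  status = 0;

 cleanup:
  /* Reached on success and on every error path alike. */
  free (buf);
  if (f)
    fclose (f);
  return status;
}
```

  Because every exit goes through the same cleanup label, adding a new
  failure case cannot leak the file handle or the buffer, and a caller
  such as a GUI or a scripting language binding decides for itself how
  to report the error.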
Copyright
Copyright (C) 2004 Tom Lord
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
    See the file COPYING for further information about
 the copyright and warranty status of this work.