Proposal for new system: Build, Package, Deploy, Maintain

Currently, the entire Evergreen package runs almost unaltered from the CVS layout. This was an initial design decision to avoid the overhead of moving things around into a different structure for a running system, and it has served well in a development environment. It has allowed us to build and test changes in a running system with almost no down time using simple symlinks. Now that we're stamping versions and receiving a good bit of outside interest, though, it makes sense to create a more structured packaging system. Doing so will also make a degree of automated testing during the build process easier, and ease the creation of dependency-checking hooks.

This is, however, no simple matter. There are several subsets that can work more-or-less independently, and can be upgraded (or downgraded) on their own. The divisions lie mostly on language and top-level repository directory boundaries, but not entirely: some configuration and metadata files can and should be version dependent, while others should persist across installations.

There is also the case of DB schema changes. While a simple matter in the case of a single database, Evergreen is designed to make use of a replication cluster. How and when a clustered schema can be modified varies with the specific replication solution, and we should attempt to support all tested configurations. At this point, that means Postgres + Slony-I, but other solutions exist and have different requirements for schema modification.
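
To make the schema-upgrade problem concrete, here is a minimal sketch of how version-to-version scripts might be chained, and how the way they are applied could depend on the replication setup. The version numbers, script names and the `replication` labels are all hypothetical; the only outside fact assumed is that Slony-I propagates DDL through slonik's EXECUTE SCRIPT command. Treat it as an illustration of the shape of the problem, not a design.

```python
from typing import Dict, List, Tuple

# Hypothetical upgrade scripts, keyed by (from_version, to_version).
# Neither the version numbers nor the file names come from Evergreen.
UPGRADE_SCRIPTS: Dict[Tuple[str, str], str] = {
    ("1.0", "1.1"): "1.0-1.1-upgrade.sql",
    ("1.1", "1.2"): "1.1-1.2-upgrade.sql",
    ("1.2", "2.0"): "1.2-2.0-upgrade.sql",
}


def upgrade_chain(current: str, target: str) -> List[str]:
    """Return the scripts to apply, in order, to move from current to target."""
    next_step = {src: (dst, script) for (src, dst), script in UPGRADE_SCRIPTS.items()}
    chain: List[str] = []
    version = current
    while version != target:
        if version not in next_step:
            raise ValueError(f"no upgrade path from {current} to {target}")
        version, script = next_step[version]
        chain.append(script)
    return chain


def plan_upgrade(current: str, target: str, replication: str) -> List[str]:
    """Describe how each script would be applied for a given replication setup."""
    if replication == "none":
        how = "apply directly to the database"
    elif replication == "slony":
        # Assumption: DDL on a Slony-I cluster goes through slonik EXECUTE SCRIPT.
        how = "submit through slonik EXECUTE SCRIPT"
    else:
        raise ValueError(f"unsupported replication solution: {replication}")
    return [f"{script}: {how}" for script in upgrade_chain(current, target)]
```

Calling `plan_upgrade("1.0", "1.2", "slony")` would list two scripts, each marked for submission through slonik. Supporting another replication solution would mean adding a branch (or a pluggable strategy) rather than touching the scripts themselves.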

First, I'll attempt to divvy up the parts into overall categories, each of which should be able to use a common build/package/deploy mechanism appropriate for each step in the process. Each step may include different parts of the system, and may be managed by a completely different framework.

After this I'll attempt to sketch the basic implementation overview for each category, and for each step.

Categories
  1. OpenSRF System dependencies
    • memcached
    • perl
    • libxml2
    • libxslt
    • apache
    • syslog
    • … other things to be investigated …
  2. OpenSRF Compiled C code
    • OpenSRF Support Libraries
    • OpenSRF Client/Server Libraries
    • OpenSRF Router
    • OpenSRF ChopChop jabber server
    • OpenSRF Apache gateway module
    • docgen.xsl
  3. OpenSRF Perl dependencies
    • POSIX
    • Error
    • Cache::Memcached
    • DBI
    • DBD::SQLite
    • Digest::MD5
    • Fcntl
    • Net::Server
    • Net::Server::Prefork
    • Time::HiRes
    • Unix::Syslog
    • XML::LibXML
  4. OpenSRF Perl modules
    • OpenSRF specific JSON (others don't seem to handle utf8 well…)
    • OpenSRF::*
  5. OpenSRF Javascript libraries
    • contents of ILS/OpenSRF/src/javascript/
  6. OpenSRF versioned data files
    • opensrf_core.xml.example
    • opensrf.xml.example
    • srfsh.xml.example
    • bootstrap.conf.example
  7. OpenSRF commands
    • osrf_ctl.sh
    • jabber_users_create (jabber2d + mysql configuration only, deprecated)
  8. OpenSRF versioned upgrade scripts
    • none as of yet
  9. Evergreen System dependencies
    • expat
    • libdbi (CVS version currently required)
    • libjs (CVS version currently required)
    • … other things to be investigated …
  10. Evergreen Compiled C code
    • OpenSRF Authentication/Authorization application
    • OpenSRF CStore (CRUD storage) application
    • OpenSRF RStore (Reporter CRUD storage) application
    • OpenSRF Apache I18N module
  11. Evergreen Perl dependencies
    • Event
    • Class::DBI (0.96)
    • CGI
    • DateTime
    • DateTime::Format::ISO8601
    • GD::Graph
    • LWP::UserAgent
    • MARC::Record (CVS version…)
    • MARC::Charset
    • MARC::File::XML
    • Net::Z3950 (requires the yaz toolkit)
    • Text::CSV_XS
    • Spreadsheet::WriteExcel::Big
    • Getopt::Long
    • Tie::IxHash
    • Email::Send
    • FileHandle
    • RPC::XML
    • … and probably some more – these are the top level deps …
  12. Evergreen Perl modules
    • OpenILS::Utils::*, OpenILS::Application::AppUtils
    • OpenILS::WWW::AddedContent (and descendants)
    • OpenILS::WWW::SuperCat (and descendants)
    • OpenILS::WWW::XMLRPCGateway (and descendants)
    • OpenILS::Reporter::*
    • OpenILS::SIP (and descendants)
    • OpenILS::Application::Actor (and descendants)
    • OpenILS::Application::Cat (and descendants)
    • OpenILS::Application::Circ (and descendants)
    • OpenILS::Application::Collections
    • OpenILS::Application::Ingest
    • OpenILS::Application::Penalty
    • OpenILS::Application::Reporter
    • OpenILS::Application::Search (and descendants)
    • OpenILS::Application::Storage (and descendants, without the Driver hierarchy)
    • OpenILS::Application::Storage::Driver::Pg (and descendants)
    • OpenILS::Application::SuperCat
  13. Evergreen Backend JS libraries
    • contents of ILS/Open-ILS/src/javascript/backend/catalog/
    • contents of ILS/Open-ILS/src/javascript/backend/circ/
    • contents of ILS/Open-ILS/src/javascript/backend/libs/
    • contents of ILS/Open-ILS/src/javascript/backend/penalty/
    • … these require inspection, as some are user customizable and should be moved to group 16 …
  14. Evergreen versioned data files
    • contents of ILS/Open-ILS/src/sql (currently only the Pg directory and descendants)
    • fm_IDL.xml
    • openils.xml.example
    • oils_sip.xml.example
    • hold_notification_template.example
    • OPAC javascript (Bill??)
    • Staff Client javascript (Jason??)
  15. Evergreen commands and scripts
    • autogen.sh
    • clark-kent.pl
    • online upgrade scripts
    • … others that Bill can help fill in …
  16. Evergreen Staff Client
    • All the base Staff Client stuff … Jason?
  17. Evergreen versioned upgrade scripts
    • none as of yet, but will include schema upgrades etc.
  18. Evergreen version independent files
    • OPAC templates, images
    • Staff Client interfaces
    • local I18N modifications
    • … these survive upgrades …

Now, that's a /lot/ of parts to split out into individual packages. It's worth it, IMO, because the speed of development is such, today, that it may be appropriate in different situations to use slightly different sets of packages. This is made possible by the SOA based design of the overall system, and keeps things nice and flexible on the user end. All of the above categories could probably stand some adjustment and coalescing, but as a first step I think it behooves us to be explicit and precise. We can always refactor…

Building should be mostly simple. There are only about 3 or 4 techniques to consider; they just need to be applied repeatedly.

Build/Package/Deploy Breakdown
  1. OpenSRF Build
    • Categories 1 and 2 (OpenSRF C stuff) should be handled by autoconf/automake. Basing the actual build on the existing makefiles should make things manageable, perhaps even trivial, for someone with autoconf/automake experience. AFAICT, this should amount to some derivative of the current build concepts in the ILS/OpenSRF tree (though the makefiles will likely have to go entirely), and some scaffolding for ac/am. docgen.xsl is included here to make sure it is in sync with the installed version of the introspection API, which is served through the XML gateway.
    • Category 3 can be handled by a bundle file. HOWEVER(!), this bundle should not be published on CPAN! The reason for this is that the dependencies can and will change, and any published bundle will only contain the MOST RECENT set of dependencies. Bundles can be installed from a local tarball using the standard CPAN module. This will allow us to keep the exact requirements of the version to be installed right next to the installation itself. Bundles are good; using CPAN to distribute multiple active useful versions of a bundle is bad. This bundle can probably be created at "build" time from scripts that look at the perl modules.
    • Category 4 should probably be a package for each of OpenSRF JSON and the OpenSRF::* stuff collectively. The reason for keeping the JSON stuff separate is that we'd like to replace it in the future, and will in fact have to upgrade it soon-ish with the addition of an outstanding patch. Nothing to actually "build" here.
    • Category 5 is essentially a set of data files. These are not heavily used today, but provide native Jabber client/server support in Javascript environments, and should be kept basically in sync with Categories 2 and 4. Nothing to actually "build" here.
    • Category 6 is another set of data files which should be installed with each upgrade. They provide the baseline configuration examples for the Category 2 and 4 package sets. Nothing to actually "build" here.
    • Category 7 contains the current startup and helper scripts for opensrf. These should probably just be folded into the package coming out of Category 2. Nothing to actually "build" here.
  2. OpenSRF Packaging
    • Basically, packaging scripts should follow http://www.pathname.com/fhs/ for building base, /-relative binary tarballs from the above build output, one for each Category. Specifics of the packaging hierarchy will firm up as implementation proceeds. These tarballs will then be used to build distribution specific packages such as RPMs and DEBs. For Gentoo ebuilds (which are directly compiled), a stamped `cvs extract` should be used as the basis of the build step.
    • Distro specific packages are built from the tarballs at this stage.
  3. OpenSRF Deployment – OpenSRF, being a dependency of Evergreen, is relatively simple to build and install. Because of this one would use the distro specific packages to install the code. Other than system library requirements, OpenSRF can be run using just what's built and packaged here if one uses the chop-chop jabber server. It seems like logical packages would be something along the lines of:
    • OpenSRF perl – the OpenSRF Perl modules, other than the JSON module
    • OpenSRF perl-JSON – the OpenSRF Perl JSON module
    • OpenSRF common – consisting of compiled support libraries
    • OpenSRF server – consisting of the OpenSRF Router and the c-server OpenSRF Application Infrastructure
    • OpenSRF chop-chop – the chop-chop jabber server
    • OpenSRF www-gw – apache gateway module
    • OpenSRF srfsh – the srfsh client
    • OpenSRF py-client – python client libs
    • OpenSRF js-client – javascript client libs
  4. Evergreen Build
    • Categories 9 and 10 should be managed by ac/am, following the same guidelines as Categories 1 and 2 above, though based on the ILS/Open-ILS branch of the directory hierarchy. Obviously, some of the build here is dependent on the libraries created for OpenSRF.
    • Category 11, like 3 above, can be managed by a LOCAL bundle. It is even more important that this bundle be local and not from CPAN than the OpenSRF one, however, because these dependencies will change fairly rapidly over the next several versions, and it's just a pain to hunt for the right version of the module when we could simply package the bundle with the install.
    • Categories 14 and 17 should be installed anew with each backend update. The IDL and DB schema work together to inform the rest of the system about objects and relationships. Any significant DB change will require an IDL update, potentially a Storage application update, and maybe other application changes to deal with object shaping (though only in the case of new, required fields or entirely removed fields). DB schema upgrades, in particular, will be a little difficult to automate, as the use (or not) of Slony-I or other replication techniques changes the procedure of the upgrade.
    • Category 16 will require some build-time work, but Jason will need to spec this out.
    • the rest don't require building, but will all need some intelligence wrapped around their packaging and deployment steps
  5. Evergreen Packaging (basic notes; much to be discussed)
    • Should follow the same guidelines as the OpenSRF packaging, using the FHS, etc. Tarballs are built for each Category (except where noted below), and distro specific packages are built from those.
    • I imagine separate packages for the Apache modules would be useful
    • Category 12 consists of all the perl-based OpenILS/Evergreen OpenSRF Applications. Each of these defines a (mostly) stable API, and can be upgraded (for the most part) independently. For this reason, every single application, and the supporting Utils stuff, should be in their own package. It would be perfectly appropriate to use version X of the Actor application with a newer version Y of both the Search and Storage apps, if one wished to avoid, say, requiring a staff client upgrade and still wanted to get improved search ranking flexibility.
    • Staff Client … hmm … it has a good deal of local logic used for offline mode, templates, etc, but the majority of the system is remote. Because of this, it has a versioning system built in that allows it to detect supported remote install sets, and one server can support multiple Staff Client versions. Jason will need to go into more detail on the particulars of when and why the packaging and deployment of a new Staff Client would be appropriate, but its local component is probably the most stable and static portion of Evergreen.
  6. Evergreen Deployment – Compared to OpenSRF, Evergreen is quite a beast, and a shifty one at that. As such, I think a management environment (mused about on the -dev list under the possible names FIT or KIT) that can track current installed parts and versions and help with tracking local changes would, in the long term, be very beneficial. FIT/KIT should eventually manage the following:
    • Evergreen Backend
      • IDL XML and Storage Class::DBI hierarchy
      • Database Package
        • Storage server DB Drivers (PostgreSQL only today)
        • Base Schema files (again, Pg only today)
        • DB initialization scripts – currently embedded in the installation
        • version-to-version schema upgrade scripts and utilities
        • Slony-I replication examples
      • Application packages – All separate packages
        • All Perl OpenSRF Applications individually
        • Authentication/Authorization OpenSRF Application
        • CStore (CRUD storage) OpenSRF Application
        • RStore (Reporter CRUD storage) OpenSRF Application
      • Upgrade/redeploy/restart scripts
    • Web interfaces – Separate packages
      • SuperCat mod_perl application
      • SlimPAC/SuperCat XSLT set
      • I18N Apache module
      • Reporter authentication proxy
      • OPAC default theme
      • default I18N string packs
    • Staff Client (Jason??)
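
As an illustration of the "create the bundle at build time" idea for Categories 3 and 11, the sketch below scans Perl source text for `use` statements and renders a local bundle file. It is only a sketch under assumptions: the bundle name `OpenSRFDeps` is made up, the scan is a simple regular expression that will miss `require` and dynamically loaded modules, and modules under the `OpenSRF` and `OpenILS` namespaces are treated as our own code rather than as dependencies. The output follows the usual CPAN bundle convention of listing modules, one per line, under a `=head1 CONTENTS` POD section.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Matches "use Some::Module;" and "use Some::Module 1.23 ...;".
# Requiring a leading capital skips pragmas such as strict and warnings.
USE_RE = re.compile(r"^\s*use\s+([A-Z][\w:]*)(?:\s+(\d[\d._]*))?", re.MULTILINE)


def scan_dependencies(
    sources: Iterable[str],
    internal_prefixes: Tuple[str, ...] = ("OpenSRF", "OpenILS"),
) -> Dict[str, Optional[str]]:
    """Collect external modules (and any stated minimum version) from Perl source."""
    deps: Dict[str, Optional[str]] = {}
    for text in sources:
        for module, version in USE_RE.findall(text):
            if module.split("::")[0] in internal_prefixes:
                continue  # our own modules are packaged separately
            deps[module] = version or deps.get(module)
    return deps


def render_bundle(name: str, deps: Dict[str, Optional[str]]) -> str:
    """Render a Bundle::<name> file; 'name' is a hypothetical, local-only bundle."""
    lines = [
        f"package Bundle::{name};",
        "",
        "our $VERSION = '0.01';",
        "",
        "1;",
        "",
        "__END__",
        "",
        "=head1 NAME",
        "",
        f"Bundle::{name} - local dependency bundle (not for CPAN)",
        "",
        "=head1 CONTENTS",
        "",
    ]
    for module in sorted(deps):
        version = deps[module]
        lines.append(f"{module} {version}" if version else module)
        lines.append("")
    lines.append("=cut")
    return "\n".join(lines) + "\n"
```

A build step could feed every `.pm` file in the tree to `scan_dependencies` and write the result of `render_bundle` into the tarball, so the exact dependency list travels with the version it belongs to.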

Overarching thoughts for discussion

By building support for distro packages in the short to medium term we provide a stopgap solution for easing installation until FIT/KIT exists. However, I think there is a great advantage to be had in creating and maintaining a specialized installation, deployment and configuration environment.

Evergreen is far more than the sum of its dependencies and OpenSRF Applications – there is a large amount of local configuration and customization that must be taken into account, and a generic package management system cannot be expected to cover all required cases. As mentioned before, just with the database setup alone there are varying degrees of availability and replication requirements that depend on the amount of time a site wishes to spend on such issues.
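
One way a tool like FIT/KIT might keep track of local customization is to record a checksum manifest when a package is installed and compare the tree against it before an upgrade. This is purely my assumption about how such tracking could work, not something that exists today; the function names are made up for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative path) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def local_changes(root: Path, manifest: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare the current tree against the manifest recorded at install time."""
    current = build_manifest(root)
    return {
        "modified": sorted(n for n in manifest if n in current and current[n] != manifest[n]),
        "removed": sorted(n for n in manifest if n not in current),
        "added": sorted(n for n in current if n not in manifest),
    }
```

Files reported as modified or added would be candidates for the "survives upgrades" treatment of Category 18, while untouched files could simply be replaced.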

Evergreen isn't just one tool, it is an amalgamation of many distinct tools all focused on solving a large, complex problem (automation of a library, for whatever definition of library we evolve it to cover), and as such deserves a layer of administrative services to manage them all in as cohesive an interface as possible. Installation is only one stage in the product lifecycle, and it leads directly to the deployment and maintenance stages. Over time I believe we can build an integrated build, packaging, installation, deployment and configuration system that will be the envy of all other ILSs on the market. It will require a lot of work to build the infrastructure for this environment, but most of the work will lie outside the main body of code, hopefully allowing (currently) non-core developers to jump in.

So, what exists today in the way of administrative interfaces? Not a ton. There are bootstrapping CGIs for adding Organizational Units, circulation and fine rules and such to the system, which we are hoping to deprecate with a django-based interface. The advantage of such an interface is that it can be extended to manage much more than simple in-database objects. It could be made to track and upgrade installed versions of different components, and install new ones. I'll leave it at that for now, but I want to plant the seed in light of this proposal…
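
To give a feel for what "track and upgrade installed versions" could mean in practice, here is a small sketch of a component registry that compares what is installed against what is available. The class, the function names and the version numbers are all hypothetical, and it assumes simple dotted numeric version strings; a real tool would also need to know which versions of the applications are compatible with one another.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10' into (1, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Component:
    name: str      # e.g. an individually packaged OpenSRF application
    version: str   # installed version, dotted numeric (an assumption)


def pending_upgrades(
    installed: List[Component], available: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {name: (installed_version, available_version)} for newer releases."""
    upgrades: Dict[str, Tuple[str, str]] = {}
    for component in installed:
        newest = available.get(component.name)
        if newest and parse_version(newest) > parse_version(component.version):
            upgrades[component.name] = (component.version, newest)
    return upgrades


def not_installed(installed: List[Component], available: Dict[str, str]) -> List[str]:
    """Names of available components that are not installed at all."""
    have = {component.name for component in installed}
    return sorted(name for name in available if name not in have)
```

Because each application is tracked separately, a site could choose to take only some of the pending upgrades, which fits the idea of running mixed versions of the individual applications.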

This is, again, just a rough sketch, but it's a sketch of where I would like to see the overall administrative scaffolding head.