Proposal for new system: Build, Package, Deploy, Maintain
Currently, the entire Evergreen package runs almost unaltered from the CVS layout. This was an initial design decision to avoid the overhead of moving things around into a different structure for a running system, and has served well in a development environment. It has allowed us to build and test changes in a running system with almost no downtime using simple symlinks. Now that we're stamping versions and receiving a good bit of outside interest, though, it makes sense to create a more structured packaging system. It will also make a degree of automated testing during the build process easier, and ease the creation of dependency-checking hooks.
This is, however, no simple matter. There are several subsets that can work more-or-less independently, and can be upgraded (or downgraded) on their own. The divisions lie mostly on language and top-level repository directory boundaries, but not entirely, as in the case of configuration and metadata files, some of which can and should be version dependent, and some of which should persist across installations.
There is also the case of DB schema changes. While a simple matter in the case of a single database, Evergreen is designed to make use of a replication cluster. How and when a clustered schema can be modified varies with the specific replication solution, and we should attempt to support all tested configurations. At this point, that means Postgres + Slony-I, but other solutions exist and have different requirements for schema modification.
First, I'll attempt to divvy up the parts into overall categories, each of which should be able to use a common build/package/deploy mechanism appropriate for each step in the process. Each step may include different parts of the system, and may be managed by a completely different framework.
After this I'll attempt to sketch the basic implementation overview for each category, and for each step.
Categories
1. OpenSRF System dependencies
2. OpenSRF Compiled C code
    OpenSRF Support Libraries
    OpenSRF Client/Server Libraries
    OpenSRF Router
    OpenSRF ChopChop jabber server
    OpenSRF Apache gateway module
    docgen.xsl
3. OpenSRF Perl dependencies
    POSIX
    Error
    Cache::Memcached
    DBI
    DBD::SQLite
    Digest::MD5
    Fcntl
    Net::Server
    Net::Server::Prefork
    Time::HiRes
    Unix::Syslog
    XML::LibXML
4. OpenSRF Perl modules
5. OpenSRF Javascript libraries
6. OpenSRF versioned data files
7. OpenSRF commands
8. OpenSRF versioned upgrade scripts
9. Evergreen System dependencies
    expat
    libdbi (CVS version currently required)
    libjs (CVS version currently required)
    … other things to be investigated …
10. Evergreen Compiled C code
    OpenSRF Authentication/Authorization application
    OpenSRF CStore (CRUD storage) application
    OpenSRF RStore (Reporter CRUD storage) application
    OpenSRF Apache I18N module
11. Evergreen Perl dependencies
    Event
    Class::DBI (0.96)
    CGI
    DateTime
    DateTime::Format::ISO8601
    GD::Graph
    LWP::UserAgent
    MARC::Record (CVS version…)
    MARC::Charset
    MARC::File::XML
    Net::Z3950 (requires the yaz toolkit)
    Text::CSV_XS
    Spreadsheet::WriteExcel::Big
    Getopt::Long
    Tie::IxHash
    Email::Send
    FileHandle
    RPC::XML
    … and probably some more – these are the top level deps …
12. Evergreen Perl modules
    OpenILS::Utils::*, OpenILS::Application::AppUtils
    OpenILS::WWW::AddedContent (and descendants)
    OpenILS::WWW::SuperCat (and descendants)
    OpenILS::WWW::XMLRPCGateway (and descendants)
    OpenILS::Reporter::*
    OpenILS::SIP (and descendants)
    OpenILS::Application::Actor (and descendants)
    OpenILS::Application::Cat (and descendants)
    OpenILS::Application::Circ (and descendants)
    OpenILS::Application::Collections
    OpenILS::Application::Ingest
    OpenILS::Application::Penalty
    OpenILS::Application::Reporter
    OpenILS::Application::Search (and descendants)
    OpenILS::Application::Storage (and descendants, without the Driver hierarchy)
    OpenILS::Application::Storage::Driver::Pg (and descendants)
    OpenILS::Application::SuperCat
13. Evergreen Backend JS libraries
    contents of ILS/Open-ILS/src/javascript/backend/catalog/
    contents of ILS/Open-ILS/src/javascript/backend/circ/
    contents of ILS/Open-ILS/src/javascript/backend/libs/
    contents of ILS/Open-ILS/src/javascript/backend/penalty/
    … these require inspection, as some are user customizable and should be moved to group 16 …
14. Evergreen versioned data files
    contents of ILS/Open-ILS/src/sql (currently only the Pg directory and descendants)
    fm_IDL.xml
    openils.xml.example
    oils_sip.xml.example
    hold_notification_template.example
    OPAC javascript (Bill??)
    Staff Client javascript (Jason??)
15. Evergreen commands and scripts
16. Evergreen Staff Client
17. Evergreen versioned upgrade scripts
18. Evergreen version independent files
Now, that's a /lot/ of parts to split out into individual packages. It's worth it, IMO, because the speed of development is such, today, that it may be appropriate in different situations to use slightly different sets of packages. This is made possible by the SOA based design of the overall system, and keeps things nice and flexible on the user end. All of the above categories could probably stand some adjustment and coalescing, but as a first step I think it behooves us to be explicit and precise. We can always refactor…
Building should be mostly simple. There are only about 3 or 4 techniques to consider; they just need to be applied repeatedly.
Build/Package/Deploy Breakdown
OpenSRF Build
Categories 1 and 2 (OpenSRF C stuff) should be handled by autoconf/automake. Basing the actual build on the existing makefiles should make things manageable, perhaps even trivial, for someone with autoconf/automake experience. AFAICT, this should amount to some derivative of the current build concepts in the ILS/OpenSRF tree (though the makefiles will likely have to go entirely), and some scaffolding for ac/am. docgen.xsl is included here to make sure it is in sync with the installed version of the introspection API, which is served through the XML gateway.
Category 3 can be handled by a bundle file. HOWEVER(!), this bundle should not be published on CPAN! The reason for this is that the dependencies can and will change, and any published bundle will only contain the MOST RECENT set of dependencies. Bundles can be installed from a local tarball using the standard CPAN module. This will allow us to keep the exact requirements of the version to be installed right next to the installation itself. Bundles are good, using CPAN to distribute multiple active useful versions of a bundle is bad. This bundle can probably be created at "build" time from scripts that look at the perl modules.
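As a rough illustration of the "scripts that look at the perl modules" idea, here is a minimal sketch in Python that scans Perl source text for use/require statements and renders the external modules as a bundle CONTENTS section. The function names, the pragma list and the sample source are all hypothetical, and the one-module-per-entry CONTENTS layout reflects my understanding of the CPAN bundle format rather than anything specified in this proposal.

```python
import re

# Matches lines such as "use Foo::Bar;", "use Foo::Bar qw(x);" or "require Foo::Bar;".
# Version requirements like "use 5.008;" do not match, since a name must start with a letter.
USE_RE = re.compile(r"^\s*(?:use|require)\s+([A-Za-z_][\w:]*)")

# Pragmas are not installable modules, so they are skipped (illustrative, incomplete list).
PRAGMAS = {"strict", "warnings", "vars", "base", "lib", "constant", "utf8", "overload"}


def find_dependencies(perl_source, internal_prefixes=("OpenSRF",)):
    """Return the sorted external modules named in use/require statements."""
    found = set()
    for line in perl_source.splitlines():
        match = USE_RE.match(line)
        if not match:
            continue
        name = match.group(1)
        if name in PRAGMAS:
            continue
        # Modules shipped in our own tree are not external dependencies.
        if any(name == p or name.startswith(p + "::") for p in internal_prefixes):
            continue
        found.add(name)
    return sorted(found)


def bundle_contents(modules):
    """Render a module list as the CONTENTS section of a bundle file."""
    return "=head1 CONTENTS\n\n" + "\n\n".join(modules) + "\n"


# Hypothetical module source, used only to exercise the functions above.
sample = """\
package OpenSRF::Example;
use strict;
use warnings;
use base qw/OpenSRF::Application/;
use OpenSRF::Utils::Logger;
use XML::LibXML;
use Cache::Memcached;
require DBI;
"""

print(bundle_contents(find_dependencies(sample)))
```

A real script would walk the whole module tree and would probably want to record minimum versions as well; this sketch only shows that the list can be derived mechanically at build time.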
Category 4 should probably be a package for each of OpenSRF JSON and the OpenSRF::* stuff collectively. The reason for keeping the JSON stuff separate is that we'd like to replace it in the future, and will in fact have to upgrade it soon-ish with the addition of an outstanding patch. Nothing to actually "build" here.
Category 5 is essentially a set of data files. These are not heavily used today, but provide native Jabber client/server support in Javascript environments, and should be kept basically in sync with Categories 2 and 4. Nothing to actually "build" here.
Category 6 is another set of data files which should be installed with each upgrade. They provide the baseline configuration examples for the Category 2 and 4 package sets. Nothing to actually "build" here.
Category 7 contains the current startup and helper scripts for opensrf. These should probably just be folded into the package coming out of Category 2. Nothing to actually "build" here.
OpenSRF Packaging
OpenSRF Deployment – OpenSRF, being a dependency of Evergreen, is relatively simple to build and install. Because of this, one would use the distro-specific packages to install the code. Other than system library requirements, OpenSRF can be run using just what's built and packaged here if one uses the chop-chop jabber server. Logical packages would be something along the lines of:
OpenSRF perl – the OpenSRF Perl modules, other than the JSON module
OpenSRF perl-JSON – the OpenSRF Perl JSON module
OpenSRF common – consisting of compiled support libraries
OpenSRF server – consisting of the OpenSRF Router and the c-server OpenSRF Application Infrastructure
OpenSRF chop-chop – the chop-chop jabber server
OpenSRF www-gw – apache gateway module
OpenSRF srfsh – the srfsh client
OpenSRF py-client – python client libs
OpenSRF js-client – javascript client libs
Evergreen Build
Categories 9 and 10 should be managed by ac/am, following the same guidelines as Categories 1 and 2 above, though based on the ILS/Open-ILS branch of the directory hierarchy. Obviously, some of the build here is dependent on the libraries created for OpenSRF.
Category 11, like 3 above, can be managed by a LOCAL bundle. It is even more important that this bundle be local and not from CPAN than the OpenSRF one, however, because these dependencies will change fairly rapidly over the next several versions, and it's just a pain to hunt for the right version of the module when we could simply package the bundle with the install.
Categories 14 and 17 should be installed anew with each backend update. The IDL and DB schema work together to inform the rest of the system about objects and relationships. Any significant DB change will require an IDL update, potentially a Storage application update, and maybe other application changes to deal with object shaping (though only in the case of new, required fields or entirely removed fields). DB schema upgrades, in particular, will be a little difficult to automate, as the use (or not) of Slony-I or other replication techniques changes the procedure of the upgrade.
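To make the "procedure depends on the replication setup" point concrete, here is a minimal sketch in Python of a deployment helper that picks the upgrade steps from the site's replication type. The function name, the replication labels, the default database name, and the set and node IDs are assumptions for illustration. It relies on my understanding that Slony-I expects schema changes to be submitted through slonik's EXECUTE SCRIPT command rather than applied directly to a node.

```python
def schema_upgrade_plan(replication, sql_file, database="evergreen"):
    """Return the ordered steps for applying one schema upgrade script.

    replication -- "none" for a single database, "slony" for a Slony-I cluster
    sql_file    -- path to the versioned upgrade script
    database    -- target database name (hypothetical default)
    """
    if replication == "none":
        # Single database: the script can be applied directly with psql.
        return [f"psql -d {database} -f {sql_file}"]
    if replication == "slony":
        # Slony-I cluster: the change is propagated via slonik so that every
        # node applies it at the same point in the replication stream.
        # SET ID and EVENT NODE are placeholders for the site's real values.
        return [
            "write a slonik script containing: "
            f"EXECUTE SCRIPT (SET ID = 1, FILENAME = '{sql_file}', EVENT NODE = 1);",
            "run that script with slonik",
        ]
    # Refuse to guess for configurations that have not been tested.
    raise ValueError(f"no tested upgrade procedure for replication type: {replication!r}")


for step in schema_upgrade_plan("slony", "upgrade.sql"):
    print(step)
```

Raising an error for unknown replication types, rather than falling back to the single-database path, matches the goal of supporting only tested configurations.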
Category 16 will require some build-time work, but Jason will need to spec this out.
The rest don't require building, but will all need some intelligence wrapped around their packaging and deployment steps.
Evergreen Packaging (basic notes; much to be discussed)
Should follow the same guidelines as the OpenSRF packaging, using the FHS, etc. Tarballs are built for each Category (except where noted below), and distro specific packages are built from those.
I imagine separate packages for the Apache modules would be useful.
Category 12 consists of all the perl-based OpenILS/Evergreen OpenSRF Applications. Each of these defines a (mostly) stable API, and can be upgraded (for the most part) independently. For this reason, every single application, and the supporting Utils stuff, should be in its own package. It would be perfectly appropriate to use version X of the Actor application with a newer version Y of both the Search and Storage apps, if one wished to avoid, say, requiring a staff client upgrade and still wanted to get improved search ranking flexibility.
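Mixing application versions like this is only safe if something checks that each package's expectations of the others are met. Below is a minimal sketch in Python of such a dependency-checking hook. The manifest layout, the short package names and the version numbers are all hypothetical, and it assumes plain dotted numeric versions.

```python
def parse_version(text):
    """Turn a dotted version string such as "1.2" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def unmet_requirements(installed, requirements):
    """List every requirement that the installed package set fails to meet.

    installed    -- {package: installed version}
    requirements -- {package: {required package: minimum version}}
    """
    problems = []
    for package in sorted(installed):
        for dep, minimum in sorted(requirements.get(package, {}).items()):
            have = installed.get(dep)
            if have is None:
                problems.append(f"{package} needs {dep} >= {minimum}, which is not installed")
            elif parse_version(have) < parse_version(minimum):
                problems.append(f"{package} needs {dep} >= {minimum}, found {have}")
    return problems


# Hypothetical requirements: the newer Search app needs the newer Storage app,
# while the older Actor app is happy with either.
requirements = {
    "Actor": {"Storage": "1.0"},
    "Search": {"Storage": "1.2"},
}

# Older Actor alongside newer Search and Storage: acceptable.
mixed = {"Actor": "1.0", "Search": "1.2", "Storage": "1.2"}
# Newer Search without the matching Storage upgrade: flagged.
stale = {"Actor": "1.0", "Search": "1.2", "Storage": "1.0"}

print(unmet_requirements(mixed, requirements))
print(unmet_requirements(stale, requirements))
```

In practice the distro package formats can carry this information themselves; the sketch only shows the kind of check that per-application packages make possible.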
Staff Client … hmm … it has a good deal of local logic used for offline mode, templates, etc, but the majority of the system is remote. Because of this, it has a versioning system built in that allows it to detect supported remote install sets, and one server can support multiple Staff Client versions. Jason will need to go into more detail on the particulars of when and why the packaging and deployment of a new Staff Client would be appropriate, but its local component is probably the most stable and static portion of Evergreen.
Evergreen Deployment – Compared to OpenSRF, Evergreen is quite a beast, and a shifty one at that. As such, I think a management environment (mused about on the -dev list under the possible names FIT or KIT) that can track currently installed parts and versions and help with tracking local changes would, in the long term, be very beneficial. FIT/KIT should eventually manage the following
Overarching thoughts for discussion
By building support for distro packages in the short to medium term we provide a stopgap solution for easing installation until FIT/KIT exists. However, I think there is a great advantage to be had in creating and maintaining a specialized installation, deployment and configuration environment.
Evergreen is far more than the sum of its dependencies and OpenSRF Applications – there is a large amount of local configuration and customization that must be taken into account, and a generic package management system cannot be expected to cover all required cases. As mentioned before, just with the database setup alone there are varying degrees of availability and replication requirements that depend on the amount of time a site wishes to spend on such issues.
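One small piece of such an environment, tracking local changes to installed files, can be sketched fairly simply. The Python below records a checksum for each file at install time and later reports which files have been edited or removed. The function names and the manifest format are hypothetical, and openils.xml is used only as a stand-in for a locally customized file.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            sha.update(chunk)
    return sha.hexdigest()


def record_install(root, relative_paths):
    """Build a manifest {relative path: digest} for freshly installed files."""
    return {rel: file_digest(os.path.join(root, rel)) for rel in relative_paths}


def locally_modified(root, manifest):
    """Report files that are missing or differ from what was installed."""
    changed = []
    for rel, digest in sorted(manifest.items()):
        path = os.path.join(root, rel)
        if not os.path.exists(path):
            changed.append((rel, "missing"))
        elif file_digest(path) != digest:
            changed.append((rel, "modified"))
    return changed


# Demonstration in a throwaway directory: install a file, then edit it.
with tempfile.TemporaryDirectory() as root:
    config = os.path.join(root, "openils.xml")
    with open(config, "w") as handle:
        handle.write("<config>shipped defaults</config>\n")
    manifest = record_install(root, ["openils.xml"])
    before = locally_modified(root, manifest)

    with open(config, "w") as handle:
        handle.write("<config>site-specific settings</config>\n")
    after = locally_modified(root, manifest)

print(before)
print(after)
```

An upgrade tool could consult such a report before overwriting anything, which is the sort of site-aware behaviour a generic package manager does not provide on its own.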
Evergreen isn't just one tool, it is an amalgamation of many distinct tools all focused on solving a large, complex problem (automation of a library, for whatever definition of library we evolve it to cover), and as such deserves a layer of administrative services to manage them all in as cohesive an interface as possible. Installation is only one stage in the product lifecycle, and it leads directly to the deployment and maintenance stages. Over time I believe we can build an integrated build, packaging, installation, deployment and configuration system that will be the envy of all other ILSs on the market. It will require a lot of work to build the infrastructure for this environment, but most of the work will lie outside the main body of code, hopefully allowing (currently) non-core developers to jump in.
So, what exists today in the way of administrative interfaces? Not a ton. There are bootstrapping CGIs for adding Organizational Units, circulation and fine rules and such to the system, which we are hoping to deprecate with a django-based interface. The advantage of such an interface is that it can be extended to manage much more than simple in-database objects. It could be made to track and upgrade installed versions of different components, and install new ones. I'll leave it at that for now, but I want to plant the seed in light of this proposal…
This is, again, just a rough sketch, but it's a sketch of where I would like to see the overall administrative scaffolding head.