Notes, thoughts, musings and progress made on the "Replace CVS with SVN on www.inf" project.
If you want the gory details in chronological order, start reading below.
If you want the most recent parts, start reading here
This project has been abandoned; see FinalProjectReport226WwwSvn.
11 Aug 2014
New server
There's a new server, skelp.inf, that will eventually replace wafer as www.inf, and it would be a good idea to rsync CVS data to skelp and check the entire conversion process. We also need to decide if the DB distribution should remain part of the new SVN mechanism, or be separated out (discussions continue).
October 2014
Re-visited the authorisation process within SVN, and porting the CVS configurations (via the access and avail files) threw up a couple of points to ponder. I had assumed that LCFG was the way to go, but it does make for a very inelegant behind-the-scenes mechanism... but perhaps this is OK for something the users won't really see. It might be a bit of a pain for Support to update though. May raise this as a discussion point with other COs.
Although the currently-used conversion and management of access and authorisation within SVN (via the /disk/data/infweb/svn/webdav_authz file) is messy, it is a stand-alone process, and could be replaced if it is thought too cumbersome (all we need is the webdav_authz file; SVN doesn't really care how it gets there).
Note that the options provided via SVN's webdav_authz file are not as numerous as those in CVS (with the Access Control List Extension Patch), and so the more-flexible set of access permissions has been constrained into just RO (checkout), RW (commit), or None. It has yet to be confirmed how much of the CVS functionality can be successfully ported to SVN.
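As an illustration of that narrowing, here is a minimal sketch of how a richer set of CVS permissions might collapse into the three SVN levels. It is not the actual conversion script: the permission names ("checkout", "commit", "tag") and the function names are hypothetical, and the only thing taken from SVN itself is the authz file layout, where a path section lists "user = rw", "user = r", or "user =" (empty, meaning no access).

```python
def svn_access(cvs_perms):
    """Collapse a set of (hypothetical) CVS permission names to rw, r or ''."""
    perms = set(cvs_perms)
    if "commit" in perms:       # anything that can commit becomes RW
        return "rw"
    if "checkout" in perms:     # checkout-only becomes RO
        return "r"
    return ""                   # everything else becomes None (no access)


def render_authz(rules):
    """Render {path: {user: permission names}} as SVN authz sections."""
    lines = []
    for path in sorted(rules):
        lines.append(f"[{path}]")
        for user in sorted(rules[path]):
            lines.append(f"{user} = {svn_access(rules[path][user])}")
        lines.append("")        # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "/web/people": {
            "alice": {"checkout", "commit", "tag"},
            "bob": {"checkout"},
            "carol": set(),
        }
    }
    print(render_authz(example))
```

Anything finer-grained than commit or checkout (the "tag" permission in the example) is simply dropped, which is the loss of flexibility described above.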
January 2015
Some time has been spent tweaking the webdav_authz file generation via customised scripts as referred to above. This was messy and frustrating, but now almost complete.
Some consideration has also been given to the management of the non-SVN portions of the web structure (DB-generated pages and such-like). There has been some discussion with the RaT Unit (who maintain teaching DB-generated pages and others), but no firm commitment to restructure as yet.
February 2015
Finished the webdav_authz file generation tweaks, but there are a few loose ends that are a bit puzzling - need to work through an svn commit process to make sure correct access permissions are being applied. Still need to sort out the RaT Unit's DB-generated pages, and get the teaching pages pruned (those that have moved to central servers, anyway).
April 2015
Thought we were on the home straight, but have discovered some issues with web editing via SeaMonkey. Also awaiting decision on generation and restructuring of teaching DB-generated pages.
May 2015
Investigated Kerberos/SVN integration - seemed to work.
June 2015
Upgraded server to SL6.6 and checked status of auto-generated pages; also investigated user editing and checked documentation.
July 2015
Realised that separate Institute page URLs were not being correctly converted to SVN paths for editing, so spent time looking at scripts which generated SVN paths and associated publishing mechanism.
August 2015
Institute publishing needs a little more investigation and tweaking. Ongoing.
September 2015
The issues with Institute publishing seem to have been fixed (ipab.inf as test example). The remaining Institutes need to be similarly configured, new/interim certificates generated, and other minor configurations made. The main outstanding issue now is automatically/DB-generated pages and extricating them from the whole CVS/SVN mechanism.
October 2015
Some slight progress with remaining Institute publishing sites, and re-started discussions with RaT about extricating automatically/DB-generated pages from the whole CVS/SVN mechanism.
November 2015
Continued testing & checking individual Institute sites/pages, making sure publishing mechanism worked.
February 2016
Finished testing & checking individual Institute sites/pages, and that part of the process seems to work. Now need to look at the automatically-generated pages, and to check with RaT to see whether their position has changed.
RaT have tidied and updated generated pages, so that fewer pages need attention. Using a separate auto-generated location, such as web-auto, we can add an Alias for each of the locations:
| old location | Alias-ed to |
| web/people/ | web-auto/people |
| web/research/ | web-auto/research |
| dice/doc/database/dm | web-auto/dice/doc/database/dm |
| lfcs/people | web-auto/lfcs/people |
| web/polop | web-auto/polop |
(Assuming that web/people/*.{html,txt,csv,xml} and web/research/*.{html,gif,inc} are also auto-generated.)
For example, web/research/hcrc/ pages would move to web-auto/research/hcrc/, and an Alias directive would direct the browser to the correct location.
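A rough sketch of how such Alias lines could be generated rather than typed by hand. The URL prefix each location is served under is not recorded in these notes, so the "/research/" URL below is a guess, the function name is hypothetical, and /liveroot is assumed as the filesystem root only because www.inf:/liveroot/web-auto/people/ is mentioned elsewhere on this page.

```python
def alias_directive(url_path, new_location, file_root="/liveroot"):
    """Build one Apache Alias line mapping a URL path to a web-auto directory.

    url_path is an assumed URL prefix; new_location is the right-hand
    column of the table above. Trailing slashes are kept on both sides
    so that the URL path and the target directory match up.
    """
    url = "/" + url_path.strip("/") + "/"
    target = file_root.rstrip("/") + "/" + new_location.strip("/") + "/"
    return f'Alias "{url}" "{target}"'


if __name__ == "__main__":
    # The URL paths here are illustrative guesses, not the real site layout.
    for url_path, new_location in [
        ("/research/", "web-auto/research"),
        ("/people/", "web-auto/people"),
    ]:
        print(alias_directive(url_path, new_location))
```

With those guesses, a request for /research/hcrc/ would be served from /liveroot/web-auto/research/hcrc/, which is the behaviour described for the hcrc example.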
April 2016
Continued discussions with Neil and Graham about the best way to handle automatically-generated pages. We seem to be agreed that retaining CVS submission for DB pages is an acceptable compromise, but submit them to cvs.inf, rather than www.inf (and so we can remove CVS server elements from www.inf). This should involve only minor changes to the relevant DB conduits.
The possibility of retaining wafer.inf as a CVS server, and not moving the DB CVS submissions to cvs.inf, was considered, but the attraction of removing another CVS server was enough to overcome the (slight) additional effort of synchronising the pages between www.inf and cvs.inf (probably with something like rsync).
Experimented by creating new repository, /disk/cvs/web-auto, on cvs.inf (for receipt of generated pages from DB and elsewhere), and then checked out people pages from www.inf and committed to new repository (thus we have, in knox:/disk/cvs/web-auto/people, what would be there if submitted from the DB). This needs to be synced with www.inf:/liveroot/web-auto/people/.
May 2016
Checking the similar mechanism on mail.inf, we can see that this is done via "cvs checkout" using the ID wwwrun, but there's no saved RSA key for wwwrun on knox, and ~wwwrun is set to /tmp. We could use the postgres ID, which does have a shell and .ssh directory. So generated public key on skelp, added it to authorized_keys on knox, and tweaked permissions on knox:/cvs/web-auto/. Running "/usr/bin/cvs checkout <dir>" on skelp.inf as postgres then works, but using "/usr/bin/cvs export <dir>" may be preferable, since it ignores CVS house-keeping directories.
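For reference, a small sketch of how the two command lines differ when built from a script. The helper name is hypothetical, and the :ext: CVSROOT string (which assumes CVS_RSH=ssh and the postgres key set up as described above) is an assumption rather than the tested configuration. One real difference worth noting: cvs export will not run without a tag (-r) or a date (-D), so the sketch passes "-r HEAD" for export.

```python
def cvs_command(action, module, cvsroot, dest=None, tag="HEAD"):
    """Return the argument list for a cvs checkout or export of one module."""
    if action not in ("checkout", "export"):
        raise ValueError(f"unsupported cvs action: {action}")
    cmd = ["/usr/bin/cvs", "-d", cvsroot, action]
    if action == "export":
        # export requires -r or -D; HEAD means the latest revision
        cmd += ["-r", tag]
    if dest:
        # -d after the action names the directory to create locally
        cmd += ["-d", dest]
    cmd.append(module)
    return cmd


if __name__ == "__main__":
    # Assumed CVSROOT; adjust to whatever the repository path really is.
    root = ":ext:postgres@knox:/disk/cvs/web-auto"
    print(" ".join(cvs_command("checkout", "people", root)))
    print(" ".join(cvs_command("export", "people", root)))
```

The list form can be handed straight to subprocess.run without shell quoting. Export leaves no CVS/ directories in the output tree, which is why it suits a directory that Apache serves directly.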
So we have a method of getting DB-generated pages from DB via cvs.inf to www.inf, and now need to get updated files from /cvsroot/DBco/ to /liveroot/web-auto/ (the latter being the location apache should use whenever it wants an auto-generated page, such locations being configured via Alias directives in the appropriate Apache configuration files).
It appears that the CVS roots for DB-generated pages are controlled by the file legacy/outgoing/legacy_conduit_roots.txt, and that changing this (and any associated CVSPASS files) would suffice to make the necessary changes on the DB side.
July 2016
Worked on generation of extra-CVS .meta files, and these can now be automatically generated or updated.
Not much further progress, as work on SL7 has taken precedence.
August 2016
Minimal progress (SL7 got in the way).
To Do
Complete the implementation of the auto-generated update mechanism, including:
- transfer mechanism for CVS-committed files from DB conduits (on knox) to new www.inf
- DB conduit commits a checked-in version to somewhere like cvs.inf:/disk/cvs/DBcin
- CVS file gets checked-out to somewhere like cvs.inf:/disk/cvs/DBcout
- rsync initiated from www.inf to www.inf:/cvsroot/web-auto/
- www.inf should now be up to date.
- co-ordinating the run of "cvs export" so that it's not run until the DB-dump has finished (either time-constrained via cron, or triggered by a flag file).
- preventing DB-generated files from being over-written by web-published files (possibly by removing relevant write-access via /svn/webdav_authz)
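The flag-file option for holding back "cvs export" could look something like the sketch below. Nothing here exists yet: the function name, the flag file path and the six-hour freshness window are all made up for illustration. The idea is that the DB dump touches a flag file when it finishes, and the export job only proceeds if that file exists and is recent.

```python
import os
import time


def dump_finished(flag_path, max_age_seconds=6 * 3600, now=None):
    """Return True if the flag file exists and is no older than max_age_seconds."""
    try:
        mtime = os.path.getmtime(flag_path)
    except OSError:
        return False                      # no flag file: dump not finished
    now = time.time() if now is None else now
    return now - mtime <= max_age_seconds  # stale flag counts as not finished


if __name__ == "__main__":
    flag = "/var/run/db-dump.done"        # hypothetical location
    if dump_finished(flag):
        print("DB dump complete: safe to run cvs export")
        # After a successful export the job would remove the flag so that
        # the next run waits for a fresh dump.
    else:
        print("DB dump not finished (or flag is stale): skipping export")
```

Run from cron every few minutes, this would give the "triggered by a flag file" behaviour while still bounding how old a dump can be before it is ignored.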
Review the various elements already completed - making sure that they are still consistent with each other and provide the necessary functionality. A dry-run installation from scratch would be desirable.
Once the viability of the final configuration is established, arrange a suitable time for the switch (providing all necessary warnings and caveats).
March 2017
As above, very little time & effort has been devoted to this project since the start of the SL7 project. The To Do list above is still substantially correct.
March 2018
After a period of quiescence, resumed work on this, and re-constructed test service on belter.inf (it having been completely re-installed as an SL7 machine, and running Apache 2.4).
Completely re-instated web SVN publishing test service, and checked for Apache 2.2/2.4 changes and glitches.
Need to complete the checking, for top-level and for each site/VHost.
Go-live will need a resync with www.inf and a test by users.
March 2019
After Roger's departure, this project was reviewed. Given how much time has passed, and the changes to the way www.inf is used, it was felt best to just draw a line under this project and abandon it. See FinalProjectReport226WwwSvn.
--
RogerBurroughes - 26 Mar 2018