Managing the Main LCFG Slave Servers

Informatics has two main LCFG slave servers: lcfg1 (vega) and lcfg2 (altair).

lcfg1 is also accessible via the standard lcfg and lcfghost aliases. If the server holding these aliases is not available, installs will not work. To avoid upsetting COs, update the DNS well in advance of any planned downtime for that server (or do the work out of hours).

We have two instead of one because:

  • if one breaks, we'll hopefully still have a working one
  • back when we had one LCFG server, we found that a lot of the load on it came from Apache. With two servers, each one still builds every LCFG profile, but the Apache load is split between them. The clients know that both servers are offering the same new profile, and tend to pick one server at random to get it from.

Configuration is done via these files:

  • dice/options/lcfg-slave-server.h
  • live/lcfg-slave-server.h

All the configuration should go into the release-controlled dice/options/lcfg-slave-server.h (or the relevant header included from within that header).

The live/lcfg-slave-server.h header is intended purely for emergency overrides of normal settings, and for introducing new features and configuration settings when it is necessary to avoid the standard wait of a week or so for changes to get through the release cycle.

When a new slave server is added, it MUST be added to the relevant macro in live/lcfg-slave-servers-list.h so that it has rsync access to the LCFG master.

As well as the two main slave servers there are usually several test LCFG servers.

The bits of config not held in the header file are:

  • Every server MUST have the LCFG_SERVICE_ALIAS macro defined. This is the short name (e.g. lcfg2 or lcfg4), not the FQDN. The macro has to be defined before the lcfg-slave-server.h header file is included.

  • One server (and only one) should be the LCFG cron server. This is set by defining the LCFG_SLAVESERVER_CRON macro before including dice/options/lcfg-slave-server.h in the LCFG profile.
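
As a rough sketch of the ordering described above, the relevant part of a slave server's source profile might look something like the fragment below. This is an illustration only: the alias value lcfg2 is just the example name used above, the angle-bracket include form is assumed from the usual LCFG profile convention, and a real profile will contain other headers and resources as well.

```c
/* Hypothetical fragment of the source profile for one slave server. */

/* Short service alias, not the FQDN. Must come before the include. */
#define LCFG_SERVICE_ALIAS lcfg2

/* Define this on exactly one slave server to make it the LCFG cron
   server. Leave it out of the profiles of all the other servers. */
#define LCFG_SLAVESERVER_CRON

/* Both macros above have to be defined before this header is read. */
#include <dice/options/lcfg-slave-server.h>
```

The order matters because the header is processed by cpp at the point of inclusion, so a macro defined after the #include line would not be visible to it.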

Disaster Recovery

If an LCFG slave server dies, don't bother trying to resurrect it; just do a new install. There is no local data which needs to be preserved. Once the replacement server is active and has finished the full profile build, you will need to clear the cache on the other server and restart its LCFG server, so that the two do not end up too far out of sync.

Topic revision: r7 - 08 May 2019 - 15:10:19 - ChrisCooke