Removing AMD from servers
To improve resilience we are looking at removing AMD from servers. First a reminder:
The amd component configures /var/lcfg/conf/amd/amd.conf which has the
following maps:
| map | symlink | type | notes |
| /amd/partition | /partition | ldap | access to NFS partitions |
| /amd/nethome | /nethome | ldap | mostly points to user AFS home dirs, some NFS |
| /amd/group | /group | ldap | various group space |
| /amd/yesterday | /yesterday | ldap | links to yesterday NFS home dirs and partitions |
| /amd/legacy | /legacy | ldap | various legacy paths |
| /amd/public | /public | ldap | used to access homepages and southbridge files |
| /amd/platspec | /platspec | file amd.platspec.map | can we drop this generally? |
| /amd/platform | /platform | file amd.platform.map | empty, remove ? |
| /amd/localhome | /home (sometimes) | file via localhome component | only on localhome.h machines |
Notes
- /home usually points to /amd/nethome except on localhome.h machines
- /pkgs/master/ contains symlinks to /amd/partition/ maps that no longer exist.
Simply not running the amd binary on machines will break all these
links/locations. This shouldn't be a problem for servers where only
sysmans log in, as we all have AFS homedirs (ie don't use /home) and
will just know not to expect the others to work.
For multi-user servers (eg ssh.inf) the service manager would have to decide whether just turning
off amd and explaining the consequences to its users is acceptable, or whether to continue running
amd on that machine.
For servers that use localhome.h you can't just remove amd, as localhome
currently uses amd to map a small subset of users to a local home area
on the machine; everyone else is mapped via an amd rule to
/amd/nethome/...
If you were happy that localhome machines only allow logins from those
listed as having a localhome on that machine, then we could modify the
localhome header and component to work without amd. Remember
localhome.h works by:
- twiddling a bit in openldap so that the home directory returned for a user is /home/UUN, rather than their AFS home path.
- pointing /home at /amd/localhome which is a map that points some users to a bit of local disk space, and everyone else at /amd/nethome/
So we could just point /home at the bit of local disk space, eg
/disk/home/solti/; the localhome component would have created the
necessary user subdirs.
First thoughts
I think at a meeting we decided that the default for machines
including server.h would be that AMD is disabled. People would have to
take steps on their servers if they wanted the AMD functionality.
My proposal is to:
- create a "no-amd.h" header file that will try to undo all the amd setup in filesystem.h. This will allow me to add that header to machines to test it out.
- it would check for a #define to see if it should actually undo the amd setup.
- then identify those services that need amd (ssh, homepages, localhome.h) and add that #define (or another header that does the #define)
- then if everyone is happy move "no-amd.h" to server.h, need to check ordering issues.
This would cater for the bulk of machines. We can then look at
modifying localhome.h and the component to create a non-amd version if
we think it is worthwhile.
It would probably be clearer to eventually have all the "no-amd" stuff
in filesystem.h
26/11/2009
Following an email from George, I'll probably skip the "no-amd.h" idea and go straight to stripping out the amd resources from filesystem.h and moving that to a separate header, and then include that from filesystem.h. It would be #ifdef'd to allow simple turning it off and on. Also, localhome without amd is definitely worthwhile.
23/2/2010
Almost all mentions of amd and related resources were in dice/options/filesystem.h. They have now been moved into the new dice/options/amd.h header, which filesystem.h now includes.
The amd.h header checks to see if DICE_OPTIONS_AMD_DISABLE is defined. If it is not, then it does as it did before and configures amd; it also defines DICE_OPTIONS_AMD_ENABLED so that the localhome.h header can check for it.
localhome.h has been tweaked to work around the fact that amd may not be available: if it isn't, it just symlinks /home to the location of the local home dirs. This means that local home users of that machine don't notice any change, but those who could log in, but would otherwise have had a network home dir, now find they have no home dir at all.
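A minimal sketch of how the two checks described above might fit together. The real headers set LCFG resources, which are not reproduced here; HOME_SYMLINK_TARGET and home_symlink_target() are hypothetical names used only for illustration, and /disk/home/solti is just the example path from earlier on this page.

```c
/* Stand-in for the guard in dice/options/amd.h. */
#ifndef DICE_OPTIONS_AMD_DISABLE
/* ... amd would be configured here exactly as before ... */
#define DICE_OPTIONS_AMD_ENABLED
#endif

/* Stand-in for the check in localhome.h. */
#ifdef DICE_OPTIONS_AMD_ENABLED
/* amd is available: /home points at the amd map */
#define HOME_SYMLINK_TARGET "/amd/localhome"
#else
/* no amd: /home points straight at the local home area */
#define HOME_SYMLINK_TARGET "/disk/home/solti"
#endif

/* Hypothetical helper so the outcome can be inspected. */
const char *home_symlink_target(void)
{
    return HOME_SYMLINK_TARGET;
}
```

With nothing defined beforehand the amd branch is taken. Putting `#define DICE_OPTIONS_AMD_DISABLE` above the first check switches /home to the local disk path instead.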
22/6/2010
After a seemingly successful trial of the headers by the MPU, the plan will be:
1. identify those machines including server.h, but that require AMD.
2. publicise that list to COs to see if people agree
3. For the agreed list, leave AMD enabled, for all other machines including server.h disable AMD
For step 3, I propose creating a new define, eg DICE_OPTIONS_AMD_ENABLE, which those servers requiring AMD would define before any of the other includes. server.h would then be modified to set DICE_OPTIONS_AMD_DISABLE appropriately.
Need to check if ordering and the CPP is an issue.
We need to remember that we (currently) still want AMD to be enabled by default on all other machines.
10/10/2010
The plan from back in June isn't going to work due to ordering. Most (all) profiles include things in this order:
os/sl5.h
options/server.h
Unfortunately it is sl5.h that includes defaults.h -> filesystem.h -> amd.h, so any #define set in server.h happens too late for the amd.h checks to see. Options:
- rejig the ordering of header files to suit me -Danger Will Robinson
- Stick with the current approach, ie those wanting to disable AMD just have to stick a #define at the top of their machine profile - hum, possible but not great
- come up with a "no-amd.h" header that could be included later on, and undoes the resources set in amd.h. By some fluke this (well, the name of the file part) was my original proposal! - Would work, but extra work keeping amd.h and no-amd.h in sync.
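The ordering problem described above can be reproduced with plain cpp. This is only a sketch: the three sections stand in for the header files named in the comments, and AMD_STILL_CONFIGURED and amd_still_configured() are hypothetical names added for illustration.

```c
/* --- os/sl5.h -> defaults.h -> filesystem.h -> amd.h (processed first) --- */
#ifndef DICE_OPTIONS_AMD_DISABLE
/* The check passes here, so amd gets configured. */
#define DICE_OPTIONS_AMD_ENABLED
#endif

/* --- options/server.h (processed second) --- */
/* Too late: the amd.h check above has already been evaluated. */
#define DICE_OPTIONS_AMD_DISABLE

/* --- what the rest of the profile sees --- */
#ifdef DICE_OPTIONS_AMD_ENABLED
#define AMD_STILL_CONFIGURED 1
#else
#define AMD_STILL_CONFIGURED 0
#endif

/* Hypothetical helper so the outcome can be inspected. */
int amd_still_configured(void)
{
    return AMD_STILL_CONFIGURED;
}
```

cpp evaluates each conditional at the point where it appears, so a later #define cannot reach back and change a check that has already run. That is why only a #define at the very top of the profile, or a later header that actively undoes the amd setup, can work.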
From dumpdeps and some grepping I find some 95 server.h machines which have an auth.users resource that isn't just " @sysmans @techs". That's excluding the beowulf machines bw14* and hcrc14*; with those it is 157 machines.
12/10/2010
Created a dice/options/disable-amd.h that basically takes amd out of boot.services and removes some amd file resources that create various /amd symlinks. Also added a warning about including localhome.h before disable-amd.h.
13/10/2010
I think the headers are done now. All we need to do is identify the machines including server.h that might
not want amd disabled. This will be left to the units, but I've grubbed about and come up with this list of hostnames:
ServersWithExtraUsers
So to disable amd on a machine you can either
#define DICE_OPTIONS_AMD_DISABLE
before dice/options/amd.h gets included and/or:
#include <dice/options/disable-amd.h>
after amd.h is included. If you want to enable amd (overriding it being disabled via
the #define or #include above), then you need to:
#define DICE_OPTIONS_AMD_ENABLE
before either amd.h or disable-amd.h is included.
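A sketch of how the override might behave for a machine that needs amd even though disable-amd.h gets pulled in later. The guard conditions shown are assumptions about how amd.h and disable-amd.h are written, not copies of the real headers, and amd_enabled() is a hypothetical helper. In particular, whether disable-amd.h really undefines DICE_OPTIONS_AMD_ENABLED is a guess.

```c
/* --- top of the machine profile --- */
#define DICE_OPTIONS_AMD_ENABLE      /* this machine needs amd */

/* --- stand-in for dice/options/amd.h (assumed guard) --- */
#if defined(DICE_OPTIONS_AMD_ENABLE) || !defined(DICE_OPTIONS_AMD_DISABLE)
/* ... amd would be configured here ... */
#define DICE_OPTIONS_AMD_ENABLED
#endif

/* --- stand-in for dice/options/disable-amd.h, included later (assumed guard) --- */
#ifndef DICE_OPTIONS_AMD_ENABLE
/* ... amd would be removed from boot.services etc. here ... */
#undef DICE_OPTIONS_AMD_ENABLED
#endif

/* Hypothetical helper so the outcome can be inspected. */
int amd_enabled(void)
{
#ifdef DICE_OPTIONS_AMD_ENABLED
    return 1;
#else
    return 0;
#endif
}
```

Because DICE_OPTIONS_AMD_ENABLE is defined before either stand-in is processed, the disable section is skipped and amd stays on. Removing that first #define lets the later section switch amd off.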
The plan is to add the disable-amd.h to server.h.
Add your further comments below - or edit directly.
--
NeilBrown - 25 Nov 2009