Fixing a Machine with a Broken LDAP db

Generally, replication failures in the openldap log file will indicate some kind of ldap breakage.

If you can ssh in, then ldap is working but replication may be flagging errors. In that case, you should be able to rebuild the database with om openldap stop; om openldap start -- -f.
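The rebuild sequence above can be wrapped in a small function so it can be previewed before it is run. This is only a sketch: the run helper, the rebuild_ldap name and the DRY_RUN variable are illustrative additions and not part of LCFG; only the two om invocations come from the text.

```shell
#!/bin/sh
# Rebuild the local LDAP database on a machine you can still ssh into.
# run(), rebuild_ldap() and DRY_RUN are hypothetical helpers for this sketch:
# with DRY_RUN=1 each command is printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_ldap() {
    # Same order as "om openldap stop; om openldap start -- -f":
    # the start runs even if the stop reports an error.
    run om openldap stop
    run om openldap start -- -f
}
```

Running DRY_RUN=1 rebuild_ldap prints the two commands without touching the component; calling rebuild_ldap on its own executes them.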

More often, you won't be able to ssh in - you'll be prompted for a password. If this happens, edit the machine's profile and set openldap.server, something like this...

!openldap.server mSET(bpdir.inf.ed.ac.uk)
This should allow you to log in. Quite often you'll need to ctrl-c the login process to get to a shell (the same applies when you nsu).

Check to see if slapd is running - it will quite often be in a spin, eating up as much CPU as it can, and you'll have to kill -9 it. To restart slapd, first remove the run file (rm -f /var/lcfg/tmp/openldap.run) and then do om openldap start. Quite often the replication stage of openldap start will fail: the database was not shut down cleanly and needs to be recovered before replication can run, and I don't think the component waits for the recovery to finish. Do a subsequent om openldap kick to check things are OK. If they're not, then you'll probably need to rebuild the database (om openldap start -- -f).
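The restart steps above could be scripted roughly as follows. Treat it as a sketch under assumptions rather than a tested procedure: the run helper, the restart_slapd name and the DRY_RUN variable are made up for illustration, pkill -9 stands in for the manual kill -9, and it assumes om openldap kick exits non-zero when things are not OK (you may instead need to read the openldap log to decide).

```shell
#!/bin/sh
# Sketch of the slapd restart procedure. run(), restart_slapd() and DRY_RUN
# are hypothetical helpers: with DRY_RUN=1 each command is printed rather
# than executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_slapd() {
    # Kill slapd if it is running (always shown in dry-run mode).
    if [ "${DRY_RUN:-0}" = "1" ] || pgrep -x slapd >/dev/null 2>&1; then
        run pkill -9 -x slapd
    fi

    # Remove the stale run file so the component can be started again.
    run rm -f /var/lcfg/tmp/openldap.run

    # The replication stage of this start may fail while the database recovers.
    run om openldap start

    # ASSUMPTION: a failed kick returns a non-zero exit status.
    # If it does, fall back to a full rebuild of the database.
    if ! run om openldap kick; then
        run om openldap start -- -f
    fi
}
```

DRY_RUN=1 restart_slapd lists the commands in order without running anything, which is a reasonable first check on a machine that is already misbehaving.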

If you had to ctrl-c the login process then this can be fixed by restarting amd (om amd stop; om amd start).

Finally, remove the openldap.server line from the profile.

-- TobyBlake - 16 Nov 2007

Topic revision: r3 - 24 Apr 2008 - 12:14:27 - TobyBlake
 