This document attempts to give a comprehensive overview of the LDAP setup in Informatics.

Software

We use OpenLDAP on all our DICE machines and build our own openldap RPMs. On standard DICE client machines we make no effort to keep up with current openldap releases. On servers, our upgrade policy is:

  • We aim to keep up to date with the latest OpenLDAP release (give or take a version or two).
  • We deploy a new release to one slave first, for testing, then gradually to the other slaves. Once the release has proven to be reliable, we deploy it on the master.

The openldap daemon is called slapd.

We produce the following RPMs:

  • openldap
  • openldap-libs
  • openldap-server
  • openldap-debuginfo

And for local configuration:

  • openldap-schema

For openldap's underlying database, we use the provided mdb format. In the past we used bdb, which we built and distributed ourselves.

Server setup and configuration

Our account management system Prometheus is responsible for populating the LDAP tree with user, group and netgroup information. See PrometheusOverview for more details. The prometheus flow diagram shows which parts of LDAP are synchronised from Prometheus.

The following gives a brief summary of the main branches of the LDAP tree, what they represent and where the data comes from:

  • ou=AutofsMaps - autofs automount maps. Updated on the rfe server (currently danio) by /usr/bin/ldapBuildAutofsMap.

  • ou=Capabilities - for authorisation - groupOfNames objects with lists of users who possess that capability. Managed by Prometheus.

  • ou=Group - posixGroup objects to provide group name and gid mapping. Managed by Prometheus.

  • ou=Identities - unused, but might be in the future.

  • ou=Maps - (historic) amd map information. Kept up to date by manual runs of ldapBuildAmdMaps

  • ou=Netgroup - for authorisation - nisNetgroup objects with lists of users/hostnames (there are a handful of host-specific netgroups). Managed by Prometheus.

  • ou=Partitions - (historic) NFS partition information. Used by ldapBuildAmdMaps. Kept up to date by ldappartsync on (currently) danio

  • ou=People - user account information (rfc2307). Managed by Prometheus.

  • ou=rfeMaps - rfe map data. Kept up to date by ldaprfemapsync on (currently) danio
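
The branch list above can be restated as a small lookup table, which may be handy when scripting checks against the tree. This is only a sketch: the Python names (BRANCHES, managed_by) are made up for illustration, and the table holds nothing beyond what the list above says.

```python
# Illustrative summary of the top-level branches listed above.
# Keys are the ou= names; values name whatever keeps that branch up to date.
BRANCHES = {
    "AutofsMaps": "ldapBuildAutofsMap",
    "Capabilities": "Prometheus",
    "Group": "Prometheus",
    "Identities": None,  # currently unused
    "Maps": "ldapBuildAmdMaps",  # historic, manual runs
    "Netgroup": "Prometheus",
    "Partitions": "ldappartsync",  # historic
    "People": "Prometheus",
    "rfeMaps": "ldaprfemapsync",
}


def managed_by(source):
    """Return the sorted branch names maintained by the given source."""
    return sorted(ou for ou, src in BRANCHES.items() if src == source)
```

For example, managed_by("Prometheus") lists the branches that should only ever change through Prometheus.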

Master server

There is one master server (currently polly) sited in the Forum server room. All updates have to be made to the master.

Disk setup

We configure separate disk partitions for the following:

  • /var/openldap-data - the openldap data directory
  • /var/openldap-snapshot - snapshots of the openldap database

Configuration

The master server is configured by LCFG resources via the <dice/options/openldap-server-common.h> header. Specific configuration is controlled with appropriate #define statements, which can bring in other header files - consult the header for more information.

The LDAP schema is installed on all machines by RPM (openldap-schema). In order to make a schema change you must ensure that all machines have updated to a new version of the RPM before making a change to LDAP data that uses any aspect of the new schema.
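
The ordering constraint on schema changes can be sketched as a simple pre-flight check. Everything here is hypothetical: the function names, the idea that installed versions are available as a hostname-to-version mapping, and the assumption that openldap-schema versions are plain dotted numbers.

```python
def parse_version(text):
    """Turn a dotted version such as '1.10.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def hosts_blocking_schema_change(installed, required):
    """Return the hosts whose openldap-schema RPM is older than `required`.

    `installed` maps hostname -> installed version string (how this is
    collected is not covered here). An empty result means every machine
    has the new schema, so data relying on it can be added to LDAP.
    """
    needed = parse_version(required)
    return sorted(
        host for host, version in installed.items()
        if parse_version(version) < needed
    )
```

Comparing tuples of integers rather than strings avoids treating 1.9.3 as newer than 1.10.0.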

Access control

There are no filter holes for the master - so there is no visibility from outside the Informatics network.

Reads are permitted for all (both authenticated and anonymous).

Writes are permitted from:

  • users who possess the ldap/write entitlement (essentially sysmans)
  • prometheus master server principal (prometheus/fqdn.of.server@INF.ED.AC.UK)
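
Taken together with the read rule, the write rule above amounts to the following decision. This is a sketch of the policy as described, not of the slapd ACLs that actually enforce it; the function name and its arguments are illustrative assumptions.

```python
def may_write(principal, entitlements, prometheus_principal):
    """Return True if `principal` would be allowed to write to the master.

    `entitlements` is the set of entitlements held by the principal.
    `prometheus_principal` is the Kerberos principal of the Prometheus
    master server, of the form 'prometheus/fqdn.of.server@INF.ED.AC.UK'.
    Reads need no such check, since they are permitted for everyone.
    """
    return "ldap/write" in entitlements or principal == prometheus_principal
```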

Backups

An hourly cron job runs om openldap save. This uses slapcat to dump an LDIF file of the full openldap database to /var/openldap-snapshot. The master keeps three months of backups. This partition is rsynced nightly to a mirror server (currently to maunsell). Also note that the slaves should always have a full copy of the ldap directory.

Slave servers

There are currently three site slaves - nelson (IF), campbell (AT) and klein (KB). These slaves are kept in sync with the master server via openldap syncrepl - changes are pushed to the slaves as soon as they happen on the master. There are also four "lightweight" slaves (damflask, hutter, redmires and schneider), which are hosted on virtual machines. Other than site slaves being physical and lightweight slaves being virtual, the only functional differences between them are:

  • site slaves keep more backups (see below).
  • site slaves have the same openldap disk partitions as the master; lightweight slaves use the system default.

All slaves are configured using the <dice/options/openldap-server-common.h> header with appropriate #define statements.
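
One common way to check that a syncrepl slave is keeping up is to compare the contextCSN operational attribute of the suffix entry on the master and on the slave. The sketch below only compares two CSN strings that have already been fetched (for instance with a base-scope ldapsearch asking for contextCSN). It assumes a single master, so only one server ID is in play; the function names are illustrative.

```python
from datetime import datetime


def parse_csn(csn):
    """Split an OpenLDAP CSN into (timestamp, count, server_id, mod).

    A CSN looks like '20190312113701.000000Z#000000#000#000000';
    the three fields after the timestamp are hexadecimal.
    """
    stamp, count, sid, mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    return when, int(count, 16), int(sid, 16), int(mod, 16)


def slave_is_current(master_csn, slave_csn):
    """True if the slave's CSN is at least as new as the master's."""
    m_when, m_count, _m_sid, m_mod = parse_csn(master_csn)
    s_when, s_count, _s_sid, s_mod = parse_csn(slave_csn)
    return (s_when, s_count, s_mod) >= (m_when, m_count, m_mod)
```

A slave that reports an older CSN shortly after a change is not necessarily broken; it is only a problem if it stays behind.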

Backups

Backups are made to /var/openldap-snapshot, as on the master, but are not rsynced anywhere. Lightweight slaves keep one day of backups, site slaves keep one month.
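
The retention periods for the three kinds of server (three months on the master, as noted in its Backups section, and the two slave periods above) can be sketched as a pruning rule. The day counts are assumptions, since the text does not define "month" exactly, and the function and role names are illustrative rather than taken from the real component.

```python
from datetime import datetime, timedelta

# Retention per server role. The exact day counts are assumptions.
RETENTION = {
    "master": timedelta(days=90),           # "three months"
    "site-slave": timedelta(days=30),       # "one month"
    "lightweight-slave": timedelta(days=1), # "one day"
}


def expired_snapshots(snapshots, role, now):
    """Return the names of snapshots older than the role's retention period.

    `snapshots` maps snapshot name -> datetime at which it was taken.
    """
    cutoff = now - RETENTION[role]
    return sorted(name for name, taken in snapshots.items() if taken < cutoff)
```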

TLS

All slaves are configured with TLS. This is done via the inclusion (via the common header) of <dice/options/openldap-tls-server.h>. This uses the lcfg-x509 component to acquire a locally-signed certificate.

Access control

We restrict access to slapd via our firewall to 'edlan', 'edlan172' and 'tardis', as defined in <live/ipfilter.h>.

We use tcpwrappers to restrict access to:

  • EdLAN:
    • 129.215.0.0/255.255.0.0
    • 192.168.0.0/255.255.0.0
    • 172.16.0.0/255.240.0.0
    • [2001:630:3c1::]/48
  • TARDIS:
    • 193.62.81.0/255.255.255.0
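
To check by hand whether a given client address falls inside these ranges, something like the following sketch can be used. The networks are copied from the list above (without the square brackets that tcpwrappers wants around the IPv6 prefix); the names ALLOWED and allowed_group are illustrative.

```python
import ipaddress

# Networks copied from the tcpwrappers list above.
ALLOWED = {
    "EdLAN": [
        "129.215.0.0/255.255.0.0",
        "192.168.0.0/255.255.0.0",
        "172.16.0.0/255.240.0.0",
        "2001:630:3c1::/48",
    ],
    "TARDIS": ["193.62.81.0/255.255.255.0"],
}


def allowed_group(address):
    """Return the group whose networks contain `address`, or None."""
    addr = ipaddress.ip_address(address)
    for group, networks in ALLOWED.items():
        for text in networks:
            net = ipaddress.ip_network(text)
            # Only compare like with like (IPv4 with IPv4, IPv6 with IPv6).
            if addr.version == net.version and addr in net:
                return group
    return None
```

Note that the 255.240.0.0 mask is a /12, so it covers 172.16.0.0 up to 172.31.255.255.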

In openldap ACLs:

We allow access to ou=People for everyone.

We allow access to the rest of the tree for

  • authenticated users
  • localhost
  • those from 'inf.ed.ac.uk' (via a DNS reverse lookup)

This is configured from <dice/options/openldap-edlan-acls.h>, included via the common server header.
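
The effect of these ACLs can be approximated as below. This is a simplified model of the rules as described, not a translation of the real slapd ACLs in the header; the function name, its arguments and the naive comma split of the DN are all assumptions made for illustration.

```python
def may_read(entry_dn, authenticated=False, from_localhost=False, reverse_dns=None):
    """Approximate the slave read rules described above.

    `reverse_dns` is the hostname obtained by a reverse lookup of the
    client address, or None if there is none.
    """
    # Naive DN split: good enough for a sketch, ignores escaped commas.
    rdns = [part.strip().lower() for part in entry_dn.split(",")]
    if "ou=people" in rdns:
        return True  # ou=People is readable by everyone
    if authenticated or from_localhost:
        return True
    name = (reverse_dns or "").lower().rstrip(".")
    return name == "inf.ed.ac.uk" or name.endswith(".inf.ed.ac.uk")
```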

We make data visible to EdLAN for Virtual DICE.

Monitoring

The LDAP services on the master and site slave servers are monitored via Nagios - the configuration for this is in openldap-server-common.h.

Agents

We no longer have agents which update LDAP data. This description is included for historical reasons, and in case we decide to use agents again ...

One thing that is worthy of note is our use of agents, for jobs which update LDAP data. A good example is the syncdbldap script which currently runs every day on greenford to sync changes from the school database into LDAP. This job is given the appropriate permissions for making changes by an agent in LDAP called ldapsyncagent, which has the krbName attribute of ldapsync/greenford.inf.ed.ac.uk - a cron job on greenford authenticates (using a keytab) as this identity before running syncdbldap. The agent has a role of ldapuseradmin, which is what gives it the appropriate capabilities to make changes to those areas of LDAP (as defined by the ACLs set on the master). There are a small number of these agents in use.

IPv6

All of the LDAP servers have IPv6 addresses.

DNS

There is a round-robin DNS alias - dir.inf.ed.ac.uk - which consists of all the slaves, both IPv4 and IPv6.

There is a separate, IPv4 only, alias: dirv4.inf.ed.ac.uk, which has IPv4 addresses for all the slaves.

The _ldap._tcp.inf.ed.ac.uk SRV record is also available - this is used by rfe and sssd (see below).

There is also a separate SRV record used by autofs: _ldap._tcp.mapdir.inf.ed.ac.uk. The use of a separate record dates back to when we were running separate LDAP servers for autofs maps (for historical reasons).
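
A quick way to see what sits behind the dir and dirv4 aliases is to resolve them and group the answers by address family. The sketch below uses only the Python standard library, which can resolve the aliases but has no SRV lookup, so the SRV records are not covered. The function names are illustrative, and port 389 is simply the standard LDAP port.

```python
import socket


def addresses_by_family(addrinfo):
    """Group getaddrinfo() results into sorted IPv4 and IPv6 address lists."""
    found = {"IPv4": set(), "IPv6": set()}
    for family, _type, _proto, _canon, sockaddr in addrinfo:
        if family == socket.AF_INET:
            found["IPv4"].add(sockaddr[0])
        elif family == socket.AF_INET6:
            found["IPv6"].add(sockaddr[0])
    return {label: sorted(addrs) for label, addrs in found.items()}


def lookup(alias, port=389):
    """Resolve an alias such as dir.inf.ed.ac.uk (needs working DNS)."""
    info = socket.getaddrinfo(alias, port, proto=socket.IPPROTO_TCP)
    return addresses_by_family(info)
```

Given the description above, lookup("dirv4.inf.ed.ac.uk") would be expected to return an empty IPv6 list, while lookup("dir.inf.ed.ac.uk") should show both families.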

Other slave servers

There are other machines which are configured to be openldap syncrepl slaves, but which are not part of the LDAP service. These are typically infrastructure machines, configured this way so that they do not depend on the network for LDAP if it becomes unavailable. They run (and use) their own LDAP server and are configured via the header <dice/options/openldap-run-and-use-local-server.h>.

slaprepl

DICE machines used to run their own LDAP servers, replicating hourly from the master using a locally written slaprepl tool. This is now deprecated in favour of sssd, as detailed below. It is still possible to configure a machine in this way, however (see openldap-server-common.h for details).

DICE clients

sssd

All DICE clients use sssd. This provides a secure (TLS) interface with the seven LDAP slave servers for user, group and netgroup information. It is configured through the lcfg-sssd component (which is built on top of the lcfg-inifile component) and the <dice/options/sssd.h> header. It provides failover (should the LDAP server to which it is connected become unavailable) and caching.

Note that sssd currently defaults to establishing connections via IPv4 only.
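
For orientation, the sketch below renders the general shape of an sssd.conf that uses LDAP over TLS and discovers servers from the _ldap._tcp SRV record. It is not the file that lcfg-sssd writes: the domain name, search base and choice of StartTLS are assumptions, and only the option names themselves are standard sssd settings.

```python
import configparser
import io


def render_sssd_conf(domain, search_base, discovery_domain):
    """Render a minimal, illustrative sssd.conf as a string."""
    conf = configparser.ConfigParser()
    conf["sssd"] = {"services": "nss", "domains": domain}
    conf["domain/" + domain] = {
        "id_provider": "ldap",
        # "_srv_" asks sssd to find servers via DNS SRV records.
        "ldap_uri": "_srv_",
        "dns_discovery_domain": discovery_domain,
        "ldap_search_base": search_base,
        # Assumption: StartTLS with strict certificate checking.
        "ldap_id_use_start_tls": "true",
        "ldap_tls_reqcert": "demand",
    }
    out = io.StringIO()
    conf.write(out)
    return out.getvalue()
```

Calling render_sssd_conf with a placeholder domain and search base prints an INI file with an [sssd] section and one [domain/...] section, which is the layout lcfg-inifile style tools work with.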

sssd client troubleshooting

sssd is managed by systemd, so the standard systemd tools can be used, e.g.:

  • systemctl status sssd to check the status of sssd

The lcfg-sssd component writes the /etc/sssd/sssd.conf file and starts/restarts sssd as necessary.

To completely clear the sssd cache (as root):

  • systemctl stop sssd
  • rm -f /var/lib/sss/db/*
  • systemctl start sssd

autofs

DICE clients also communicate with LDAP slave servers to obtain map information for autofs. This does not currently use TLS, but really should.

Disaster Recovery

For this purpose, we assume that a disaster involves the unavailability of the master server (unavailability of slaves can be solved by moving DNS aliases and SRV records around). This section addresses the steps required to move the LDAP master to a different machine. In situations where the master has become unavailable, all slaves will continue to work normally, but will not receive updates.

Note that klein, at KB, is the designated disaster recovery machine for LDAP.

The following steps should be undertaken to move the LDAP master to a different machine:

  • Get a copy of the last (good) LDIF backup from either the snapshot directory on the current master, or from the mirror
  • On the new master machine, use om openldap load to load the LDIF file. Note that this may not be necessary if you're promoting a slave to master, as it may already have an up to date replicated copy of the master data. In some respects this may depend on the nature of the disaster (e.g. if data corruption is suspected).
  • Give the new server the appropriate LCFG headers. This is likely to include at least <dice/options/openldap-server-common.h> with appropriate #define statements. Check the profile in case there's anything which should be in a header, but isn't.
  • Check (and transfer) any DNS aliases (e.g. ldap)
  • Check that slave servers receive any updates.
  • Check that prometheus updates are working
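
The first step, finding the last good LDIF backup, can be sketched as follows. The "*.ldif" file pattern is an assumption about how snapshots are named, and the check is deliberately cheap: it only confirms that the file starts like an LDIF dump, not that the data is complete or uncorrupted.

```python
from pathlib import Path


def looks_like_ldif(path):
    """Cheap sanity check: the first real line of the file starts with 'dn:'."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Skip blank lines, comments and an optional 'version:' header.
            if not line.strip() or line.startswith(("#", "version:")):
                continue
            return line.startswith("dn:")
    return False  # empty file, or nothing but comments


def latest_good_snapshot(directory, pattern="*.ldif"):
    """Return the newest file matching `pattern` that passes looks_like_ldif."""
    candidates = sorted(
        Path(directory).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    for path in candidates:
        if looks_like_ldif(path):
            return path
    return None
```

Falling back to the next-newest file means a truncated or empty final snapshot, for example one written while the disk was failing, is skipped rather than loaded.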

Useful Documentation

-- TobyBlake - 12 Mar 2019

Topic revision: r12 - 12 Mar 2019 - 11:37:01 - TobyBlake
 