Serial Consoles

1. Overview

  1. We use three types of technology to support serial consoles, all of which are front-ended by our conserver setup:
    1. Lantronix serial consoles - where the physical serial connector of a machine is cabled to a Lantronix box, which in turn is controlled over the network by a console server.
    2. IPMI Serial-over-LAN (SOL) consoles - where the Baseboard Management Controller of a machine redirects the serial I/O of that machine over the network so that it can be managed by an ipmitool process running on a console server.
    3. KVM serial consoles - where the virtual serial console of a KVM guest is managed by a virsh console process running on a console server.
  2. A machine using a console provided by a Lantronix box needs to have its serial port cabled to the appropriate Lantronix box. A machine using a console implemented by IPMI needs to have its existing network connection configured appropriately. A machine using a console implemented by KVM requires no special configuration. In all three cases, though, an appropriate entry needs to be added to the live/console_server.h header file so that our conserver system acquires the necessary configuration.
  3. In principle, any of our console servers could manage any of our Lantronix boxes, or any of our servers which use an IPMI console - that's because both of those things are done over the network. In practice, owing to considerations of localisation as well as subnetting, things are generally arranged to be localised per site.
  4. A console server talks over the network to a Lantronix box by making one ssh connection per configured console. The ssh connection is under the auspices of the (otherwise unprivileged) user 'conserver', and is arranged using ssh private/public keys. See 'How to set up a Lantronix SLC Console Server' in 6. Existing documentation below.
  5. For IPMI SOL consoles, a console server talks over the network to the Baseboard Management Controller (BMC) of the target machine through an ipmitool process which is exec'ed on the console server by the /usr/sbin/_conserver-ipmiconsole helper script. The password used is a shared secret (common throughout the site) which is set up on the target machine at BMC configuration time. See 'How to configure a machine to provide a console managed by IPMI Serial-over-Lan' in 6. Existing documentation below.
  6. For KVM consoles, a console server talks over the network to the KVM host of the target machine through a virsh process which is ultimately exec'ed on the console server via the /usr/sbin/_conserver-kvmconsole helper script. The virsh process runs as user 'conserver', and authentication is done using a keytab file (common throughout the site) which is installed on each console server machine at installation.
  7. Our console server machines are all peers: there is no 'master/slave' configuration. The contents of the /etc/conserver.cf file are identical on every one of these servers, and are generated from the conserver component's resources, which are themselves set in the two files live/console_server.h and dice/options/console_server.h.
  8. All of our Lantronix boxes have been configured with a static IP configuration; there are no DHCP dependencies. (This change was made in order to reduce dependencies. Previously, the Lantronix boxes were configured to acquire their IP configurations via DHCP, but it was found that they could go permanently off-the-air if they lost contact with their DHCP servers for more than an hour or so.)
  9. Once a Lantronix box has been set up, it should never be necessary to connect to it directly or to change its configuration. The single exception is if a serial port on the Lantronix needs to be set to something other than the default configuration of 9600,8,n,1 - see 4. Setting serial communications parameters below.
  10. Console logs for any machine can be found in the directory /var/consoles of its corresponding console server machine. These logs are also mirrored nightly to the School's rsyslog server.
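
As a rough illustration of items 5 and 6 above, the following sketch prints the general shape of command that an IPMI SOL or KVM console ultimately comes down to. It is not the content of the _conserver-ipmiconsole or _conserver-kvmconsole helper scripts, which is not shown on this page. The BMC_USER and BMC_PASSFILE variables, their defaults, and the qemu+tcp libvirt URI are assumptions made for the example; only 'ipmitool ... sol activate' and 'virsh ... console' are standard usage of those tools.

```shell
#!/bin/sh
# Sketch only: nothing is executed, the commands are just printed.

# IPMI SOL: ipmitool's lanplus interface with "sol activate".
# BMC_USER and BMC_PASSFILE are hypothetical names for this example.
ipmi_sol_command() {
    bmc_host=$1
    printf 'ipmitool -I lanplus -H %s -U %s -f %s sol activate\n' \
        "$bmc_host" "${BMC_USER:-admin}" "${BMC_PASSFILE:-/path/to/passfile}"
}

# KVM: virsh attaches to the guest's serial console through a libvirt URI.
# The qemu+tcp transport is an assumption; the real URI may differ.
kvm_console_command() {
    kvm_host=$1
    guest=$2
    printf 'virsh -c qemu+tcp://%s/system console %s\n' "$kvm_host" "$guest"
}
```

Calling, say, kvm_console_command with a KVM host name and a guest name prints the virsh command line for that guest without running it.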

2. Console server machines and their locations

| Console server | Technology | Machines served | Physical location | Serial console? |
| consoles.inf (currently charmoz) | Lantronix, IPMI & KVM | Forum servers | IF-B.02 comms rack | Yes - IPMI, managed by atconsoles.inf |
| atconsoles.inf (currently courtes) | Lantronix, IPMI & KVM | AT servers | AT server room | Yes - IPMI, managed by consoles.inf |
| kbconsoles.inf (currently sinopoli) | Lantronix, IPMI & KVM | KB servers | JCMB server room | Yes - IPMI, managed by consoles.inf |

The 'Serial console?' column indicates whether or not the console server machine itself has a serial console. Obviously, where that is the case, the console has to be managed by a different console server.

The Forum console server consoles.inf, the AT console server atconsoles.inf, and the KB console server kbconsoles.inf all have serial consoles implemented via IPMI, so can all be remotely rebooted when necessary. In addition, they can be remotely power-cycled via IPMI: see the 'How to configure a machine to provide a console managed by IPMI Serial-over-Lan' document in 6. Existing documentation below.
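
For orientation, a remote power operation over IPMI generally uses ipmitool's 'chassis power' subcommand. The sketch below only prints such a command. BMC_USER and BMC_PASSFILE are hypothetical names, and the procedure in the 'How to configure a machine to provide a console managed by IPMI Serial-over-Lan' document takes precedence over anything shown here.

```shell
#!/bin/sh
# Sketch only: print an ipmitool power command rather than running it.
# BMC_USER and BMC_PASSFILE are hypothetical names for this example.
ipmi_power_command() {
    bmc_host=$1
    action=$2
    # Restrict to a few of the standard "chassis power" actions.
    case $action in
        status|on|off|cycle|reset) ;;
        *) echo "unsupported power action: $action" >&2; return 1 ;;
    esac
    printf 'ipmitool -I lanplus -H %s -U %s -f %s chassis power %s\n' \
        "$bmc_host" "${BMC_USER:-admin}" "${BMC_PASSFILE:-/path/to/passfile}" "$action"
}
```

Checking 'status' first, before any 'cycle', is a sensible habit.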

3. Lantronix boxes and their locations

| Lantronix box | Controlled by | Network served | Physical location |
| srslc00.f.net.inf.ed.ac.uk | consoles.inf | Forum network | IF-B.02 rack 0 |
| srslc02.f.net.inf.ed.ac.uk | consoles.inf | Forum network | IF-B.02 rack 3 |
| srslc04.f.net.inf.ed.ac.uk | consoles.inf | Forum network | IF-B.02 rack 6 |
| srslc06.f.net.inf.ed.ac.uk | consoles.inf | Forum network | IF-B.02 rack 9 |
| srslccomms.f.net.inf.ed.ac.uk | consoles.inf | Forum network | IF-B.02 comms rack 1 |
| atslc00.at.net.inf.ed.ac.uk | atconsoles.inf | AT network | AT server room racks |
| atslc01.at.net.inf.ed.ac.uk | atconsoles.inf | AT network | AT server room racks |
| kbslc00.kb.net.inf.ed.ac.uk | kbconsoles.inf | KB network | JCMB server room racks |

Note that all Lantronix boxes are on non-routed subnets, so if you want to connect to any of the Lantronix boxes directly (either via ssh, or via their web interfaces), you need to do so from a machine with an interface on that subnet. The appropriate machine to use is the corresponding console server, i.e. either consoles.inf, atconsoles.inf or kbconsoles.inf.

The username you should use to authenticate to the Lantronix box is 'sysadmin'; it gives full administrative privileges. For its corresponding password, ask the Infrastructure Unit.
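
The mapping from Lantronix box to console server in the table above can be captured in a small helper. This is a sketch: the function names are invented for the example, the mapping relies on the naming prefixes of the boxes listed above, and the two-hop 'ssh -t' form is only one way of getting from the console server to the box.

```shell
#!/bin/sh
# Sketch only: pick the console server that shares a subnet with a
# given Lantronix box, based on the box-name prefixes in the table.
lantronix_console_server() {
    case $1 in
        srslc*) echo consoles.inf ;;
        atslc*) echo atconsoles.inf ;;
        kbslc*) echo kbconsoles.inf ;;
        *) echo "unknown Lantronix box: $1" >&2; return 1 ;;
    esac
}

# Print (not run) a two-hop ssh command: first to the console server,
# then on to the Lantronix box as user 'sysadmin'.
lantronix_ssh_command() {
    box=$1
    server=$(lantronix_console_server "$box") || return 1
    printf 'ssh -t %s ssh -l sysadmin %s\n' "$server" "$box"
}
```

For example, lantronix_ssh_command atslc00.at.net.inf.ed.ac.uk prints a command that goes via atconsoles.inf.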

4. Setting serial communications parameters

For consoles which use Lantronix boxes, the default serial configuration is 9600,8,n,1. This should be suitable for all servers, but it may not be suitable for things like disc arrays where, in particular, the baud rate might need to be changed. To do this, ssh to the relevant Lantronix box as user 'sysadmin' (you will need to know the corresponding password), and issue the command set deviceport port n baud m (where n is the port number in question, a number in the range 1-32, and m is the baud rate). Please also add a relevant comment to the relevant entry in the live/console_server.h header, so that a record is maintained of the change.

Example:

  ssh srslc00.f.net.inf.ed.ac.uk -l sysadmin
  ...[snip]...
  set deviceport port 9 baud 115200
  logout

Corresponding comment in live/console_server.h:

  conserver.consolename_srslc00p09        ifevo1     /* 115200 baud */
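
To avoid typos when changing a port, the Lantronix command can be generated with a basic check on the port range given above (1-32). This is a sketch with an invented function name; it only prints the command, which then has to be entered in the Lantronix session as in the example above.

```shell
#!/bin/sh
# Sketch only: build the "set deviceport" command for a port and baud
# rate, rejecting non-numeric values and ports outside 1-32.
slc_baud_command() {
    port=$1
    baud=$2
    case $port in ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;; esac
    case $baud in ''|*[!0-9]*) echo "baud must be a number" >&2; return 1 ;; esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 32 ]; then
        echo "port must be in the range 1-32" >&2
        return 1
    fi
    printf 'set deviceport port %s baud %s\n' "$port" "$baud"
}
```

The function does not check that the baud rate is one the Lantronix box accepts.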

5. What can go wrong

  1. Occasionally, the conserver system has 'gone wrong' on a console server machine in a way that has left orphaned conserver processes running. A symptom is that there will be more than one conserver process on the console server machine with a PPID of 1. (There should only be one conserver process with this PPID, namely the master daemon.) The consequence is that some consoles then become unresponsive. The exact cause isn't known, but the fix is to stop conserver on the affected console server machine (om conserver stop), kill any conserver processes which remain, and then restart conserver (om conserver start). Keep an eye out for 'duplicate console definition' errors on the restart, and fix these as required by editing live/console_server.h.
  2. Occasionally, individual IPMI consoles can become inactive: ipmitool appears to become unresponsive. The usual fix is to 'down' and then 'up' the affected console via the console command on a console server. A more drastic fix is to stop and then restart conserver, as above.
  3. Duplicate definitions for consoles in live/console_server.h will stop conserver from starting (e.g. after a reboot, or after an om conserver stop; om conserver start). Such duplicate definitions don't always cause obvious symptoms, since they don't cause conserver to stop working after a HUP (which is the signal it gets on a reconfiguration). The fix is to remove any such duplicates, and restart conserver as necessary. Recent cases of duplicates have occurred when people have been experimenting with IPMI SOL consoles for a machine, and then changed over to the use of normal serial consoles without removing the IPMI SOL console definition. Duplicates, and the misconfiguration they create, might be a cause of the 'orphaned' processes problem mentioned above.
  4. The Forum console server consoles.inf, the AT console server atconsoles.inf, and the KB console server kbconsoles.inf, all have serial consoles implemented by IPMI, and can therefore be remotely rebooted and/or power-cycled as necessary to fix problems - see section 2 above. Each can also be remotely power-cycled by rfe control of its corresponding power bars.
  5. The Forum Lantronix boxes can be remotely rebooted (via their command line interfaces) if necessary. They can also be remotely power-cycled by rfe control of their corresponding power bars - but note that each Lantronix box has two distinct power supplies, each of which has been connected to a different power bar, and that both power supplies therefore need to be cycled. (Note that neither rebooting nor power-cycling of the Lantronix boxes should ever be necessary: if there are communications problems between the console server machines and any of the Lantronix boxes, the problem probably lies elsewhere.)
  6. There are several spare Lantronix boxes. These are boxes which were once in use, but have since been retired. There is one on the Dexion shelving in the Forum server room IF-B.02, and an additional four which can be found 'parked' in Forum racks 2, 5, 7 and 11. All have been set to factory defaults. Should an existing Lantronix box fail, replace it with one of the spares. The spare will first need to be installed and configured as described below in 6. Existing documentation.
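
The checks described in items 1 and 3 above can be sketched as two small filters. Both function names are invented for the example. The duplicate check assumes that every console is defined by a line of the form 'conserver.consolename_<tag> <name>', as in the srslc00p09 example in section 4; entries for IPMI or KVM consoles may look different, in which case the pattern would need adjusting.

```shell
#!/bin/sh
# Sketch only.

# Count conserver processes whose parent is PID 1.
# Reads "PID PPID COMMAND" lines on stdin, e.g. from:
#   ps -eo pid=,ppid=,comm=
count_top_level_conserver() {
    awk '$2 == 1 && $3 == "conserver" { n++ } END { print n + 0 }'
}

# Print console names that are defined more than once.
# Reads the named files, or stdin if none are given.
# Assumes lines of the form: conserver.consolename_<tag> <name>
duplicate_console_names() {
    awk '$1 ~ /^conserver\.consolename_/ { seen[$2]++ }
         END { for (name in seen) if (seen[name] > 1) print name }' "$@" | sort
}
```

A count greater than 1 from 'ps -eo pid=,ppid=,comm= | count_top_level_conserver' matches the orphaned-process symptom in item 1. Any output from 'duplicate_console_names live/console_server.h' is worth checking before conserver is restarted.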

6. Existing documentation

-- IanDurkacz - 17 Apr 2019

Topic revision: r26 - 17 Apr 2019 - 14:06:50 - IanDurkacz