IBM DS3524 storage system

There is currently one IBM DS3524 storage system (known as ds35000) with two controllers, each with one Ethernet interface and two fibre connections. The controllers are known as ds35000a.inf.ed.ac.uk and ds35000b.inf.ed.ac.uk (controllers A and B respectively). Each controller is connected to both fibre fabrics. ds35000 has 24 300GB SAS disks, configured as a RAID 10 array of 22 disks with 2 hot spares.

The DS3524 supports active/active failover: the logical drives are spread over the two controllers to share the load, and if one controller fails, the remaining controller seamlessly takes over servicing the failed controller's logical drives. This works correctly from SL6 onwards.
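
From a host, a quick way to confirm that paths through both controllers are visible is dm-multipath. A minimal sketch, assuming the host runs dm-multipath (as our SL6 hosts do):

    # Each logical drive should show two path groups, one per
    # controller: the preferred controller's group active, the
    # other enabled as standby.
    multipath -ll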

The IBM DS3524 has no built-in user interface. It speaks a proprietary protocol to one or more management stations. Management stations can be normal DICE boxes with the IBM Storage Manager software installed (add the header "dice/options/ibm_sm.h"). This software provides both a GUI (SMclient) and a CLI (SMcli). The following instructions pertain to the GUI.
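
For a quick sanity check from a management station, SMcli can query the array directly. A minimal sketch, assuming the standard DS3500-series SMcli grammar (add -p with the management password if one is set, and verify against the Storage Manager CLI reference):

    # Talk to both controllers out-of-band and report overall health.
    SMcli ds35000a.inf.ed.ac.uk ds35000b.inf.ed.ac.uk \
        -c "show storageSubsystem healthStatus;"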

Warranty

ds35000 has an onsite next-business-day warranty which expires in August 2015. Details of the warranty are stored in /afs/inf.ed.ac.uk/group/mp-unit/warranty_info/ds35000. Report faults using IBM's ESC service (you will have to register first).

Monitoring

The DS3524 is monitored by a locally authored script called 'smclimonitor', designed to be called periodically from cron on a management station. The script will be modified to support Nagios at some point in the future. It currently runs on a server at IF (atom) and a VM at KB (giz).
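
A hypothetical /etc/crontab-style entry on a management station; the install path and interval below are illustrative, not the actual deployment:

    # Poll the array every 15 minutes with smclimonitor.
    */15 * * * * root /usr/local/sbin/smclimonitor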

Configuration

Add a new storage system

  • Select addition method: Manual
  • Add New Storage Subsystem: Out-of-band management, entering ds35000a and ds35000b (a CLI equivalent is sketched below)
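
The CLI equivalent, assuming the standard SMcli flags (verify locally; -A adds a subsystem to this station's configuration, -d lists what is configured):

    # Register ds35000 out-of-band by naming both controllers.
    SMcli -A ds35000a.inf.ed.ac.uk ds35000b.inf.ed.ac.uk

    # Confirm the subsystem now appears in this station's configuration.
    SMcli -d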

Add new logical drive

  • Run SMclient (as root) on a machine with the IBM Storage Manager software installed.
  • Select the storage system you want to configure, e.g. "Storage Subsystem ds35000". This will pop up a Subsystem Management window, after prompting for the management password.
  • Select Logical tab
  • Click the + beside MAIN to open up the logical drives on array MAIN
  • Select Free Capacity (at the bottom)
  • Right click and select Create Logical Drive
  • Specify the drive capacity, give an appropriate Logical Drive name, and click Customise Settings. Try to use a slightly different size from the existing volumes; that makes it easier to identify the correct volume on a host.
  • Choose "Map later using Mappings View"
  • You should now see the new logical drive in the list of logical drive volumes (probably with a clock symbol while the drive initialises). A CLI sketch of this operation follows the list.
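
A CLI sketch of the same operation, assuming the DS3500-series script grammar; the label and capacity below are illustrative, so check "create logicalDrive" in the CLI reference before use:

    # Carve a new logical drive from free capacity on array MAIN.
    # As noted above, pick a capacity slightly different from the
    # existing volumes so it is easy to identify on a host.
    SMcli ds35000a.inf.ed.ac.uk ds35000b.inf.ed.ac.uk \
        -c 'create logicalDrive array="MAIN" userLabel="newvol01" capacity=101 GB;'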

Add new host

  • As above, connect to the storage system you want to configure.
  • Select Mappings tab
  • Select the storage subsystem, right click, and select Define Host
  • Enter the host name. We don't use (explicit) storage partitions.
  • Select FC interface
  • Add the host's HBAs via "Add by selecting known unassociated host port identifier". Assign an appropriate alias to each HBA port (e.g. circle-p1). If you do not know the port identifiers, you can look them up on the Services Unit port info page. A CLI sketch follows this list.
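
A CLI sketch, again assuming the DS3500-series grammar; the host name, port alias and WWPN below are illustrative placeholders, and the exact parameter names should be verified against the CLI reference:

    # Define the host, then associate one of its HBA ports with it;
    # repeat the hostPort command for each port on the host.
    SMcli ds35000a.inf.ed.ac.uk ds35000b.inf.ed.ac.uk \
        -c 'create host userLabel="circle";'
    SMcli ds35000a.inf.ed.ac.uk ds35000b.inf.ed.ac.uk \
        -c 'create hostPort host="circle" userLabel="circle-p1" identifier="210000e08b000000" interfaceType=FC;'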

Map a logical drive to a host

  • As above, connect to the storage system you want to configure
  • Click the + beside "Undefined Mappings" to display unassigned logical drives
  • Select the logical drive you wish to map, right click and select "Define additional mappings"
  • Choose the appropriate host from the pull-down menu and select a LUN (or use the default value). Note that LUNs start from 0 and work upwards for each host; they are not shared across all hosts as they are on Nexsan systems. A CLI sketch follows this list.
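
The CLI equivalent, under the same grammar assumptions (drive name, host and LUN illustrative):

    # Map the logical drive to the host at LUN 2. LUNs are per-host
    # on this system, starting from 0, unlike on the Nexsan systems.
    SMcli ds35000a.inf.ed.ac.uk ds35000b.inf.ed.ac.uk \
        -c 'set logicalDrive ["newvol01"] logicalUnitNumber=2 host="circle";'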

Problem log

  • 03/06/14 - both controllers went off the network (SMclimonitor, ping etc). FC carried on quite happily. Removing and replacing the network cables brought the interfaces back online.
  • January 2015 - a disk went bad, which locked up the whole array. Power-cycling brought the array back, minus the duff disk, which was replaced by IBM. See blog article. IBM suggested updating the firmware.
  • 13/12/15 - both controllers lost routing, though were still reachable on wire S33. FC carried on happily. While attempting to fix the problem remotely (it was a Sunday), we managed to misconfigure both controllers' primary network interfaces such that re-seating the cables on the Monday didn't bring back networking. The solution was to configure DHCP (via LCFG profiles) to use the secondary network interfaces instead.
  • 10/03/16 - both controllers lost routing (access from on wire (S33) still fine). Disabling and re-enabling the controllers' network ports, at the network switch, brought routing back.
  • 06/04/17 - Logical Drive not on preferred path due to ADT/RDAC failover error. Resetting both controllers individually didn't fix the problem. Powering off and reseating the modules worked for 10 minutes, but the problem came back. Reassigning controller A as the preferred controller for the MAIN array made the error disappear. On further investigation it looked like FC port 3 on controller A wasn't working. We tried swapping the GBIC but that didn't help. We are now using port 5 instead of port 3 and things look happy.
  • 25/10/17 - controller A lost routing (access from on wire (S33) still fine). Disabling (for a few minutes) and re-enabling the controller's network ports, at the network switch, brought routing back. It appears necessary to leave the port disabled for a few minutes before re-enabling it.

-- AlastairScobie - 20 Sep 2010
