Server/Rack Power Data

This page is a scratchpad for information about servers and PDUs.

Re: Operational Meeting, Wednesday 17th June 2020

As of 15.6.2020, we have the following servers installed in IF-B.02 Rack 16:

| Rack slots | Name | Type | PSU config | Current draw at 15.6.2020 (A) | Historic total peak current draw (A) | Rack power bar connections |
| 31-34 | arnold | ASUS ESC 8000 G3 | 3 PSU (redundant) | 2.7, 2.5, 2.5 | 7.7 (?) | s32.pdu outlets 22+23, s33.pdu outlet 23 |
| 26-30 | mcclintock | Dell T630 | 2 PSU (redundant) | off, off | 9.4 | s32.pdu outlet 15, s33.pdu outlet 4 |
| 21-25 | ostrom | Dell T630 | 2 PSU (non redundant) | 1.2, 1.0 | 9.4 | s32.pdu outlet 13, s33.pdu outlet 13 |
| 16-20 | barre | Dell T630 | 2 PSU (non redundant) | 1.0, 1.0 | 8.1 | s32.pdu outlet 7, s33.pdu outlet 7 |
| 11-15 | levi | Dell T630 | 2 PSU (redundant) | 1.4, 0.2 | 8.5 | s32.pdu outlet 3, s33.pdu outlet 3 |
| 6-10 | nuesslein | Dell T630 | 2 PSU (redundant) | 3.2, 0.2 | 8.5 | s32.pdu outlet 2, s33.pdu outlet 2 |
| 1-5 | greider | Dell T630 | 2 PSU (non redundant) | 1.6, 1.6 | 9.2 | s32.pdu outlet 1, s33.pdu outlet 1 |

Notes:

  1. All of the PSUs noted above are rated at 1600W. At 230V input voltage, that implies a maximum possible current draw of about 7A per PSU.
  2. Dell T630 PSUs are named 'PSU1' and 'PSU2'. Without physical inspection, we cannot be sure which PSU is plugged into which rack power bar. (This matters: in 'normal' redundant mode, PSU1 is active and PSU2 is nominally sleeping, unless power or efficiency requirements cause it to be brought into service.)
  3. Our rack PDUs trigger a 'bank near overload' warning at 13A. Recently reported 'bank near overload' warnings:
    • s32.pdu, bank #2 (25.4.2020 & 26.4.2020 - mcclintock was subsequently switched off to try to deal with this)
    • s33.pdu, bank #1 (25.5.2020, 2.6.2020, 3.6.2020, 4.6.2020 & 8.6.2020)
  4. ASUS ESC 8000 User Guide
  5. Dell PowerEdge T630 Owner's Manual
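
The arithmetic behind note 1 can be sketched as follows. This is a rough upper bound only: it assumes the 1600W rating can be divided directly by the 230V input voltage, ignoring PSU efficiency and power factor. The constant and function names are made up for illustration.

```python
def max_current_a(rated_watts: float, volts: float) -> float:
    """Rough upper bound on current (A): rated power divided by input voltage."""
    return rated_watts / volts


PSU_RATING_W = 1600     # PSU rating from note 1
INPUT_VOLTAGE_V = 230   # input voltage from note 1
BANK_WARNING_A = 13     # 'bank near overload' threshold from note 3

per_psu = max_current_a(PSU_RATING_W, INPUT_VOLTAGE_V)
print(f"Max current per PSU: {per_psu:.2f} A")

# How many PSUs running flat out would fit under the bank warning threshold.
print(f"Fully loaded PSUs per bank before warning: {BANK_WARNING_A / per_psu:.2f}")
```

On these assumptions, fewer than two PSUs running at their full rating would be enough to reach the 13A warning level on a single bank.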


Server power draw.

This page documents GPU and other power data. Figures given in brackets are readings taken with one redundant power supply removed; (-) marks the removed supply.

| Server | Power supply | Powered off | Peak load during boot | Idle | Full (GPU) load | Full load plus bonnie++ on all disks |
| Hannah (Asus 8000 g4 w 8 Geforce 2080-ti) | PS1 | 0.26 (-)(-) | 1.19 | 0.74 (-)(-) | 2.74 (-)(-) | 2.4 (-)(-) |
| NO BIOS CONFIG FOR PSU | PS2 | 0.26 (0.39)(-) | 1.02 | 0.74 (1.12) | 2.53 (?)(-) | 2.22 (?)(-) |
| | PS3 | 0.26 (0.39)(N/A) | 1.02 | 0.74 (0.95)(N/A) | 2.53 (3.89)(N/A) | 2.19 (?)(N/A) |
| lennoxtown (gigabyte w 8 Geforce 2080-ti) | PS1 | 0.83 (-)(-) | 1.5 | 1.36 (-)(-) | 3.18 (-)(-) | 3.15 (-)(-) |
| NO BIOS CONFIG FOR PSU | PS2 | 0.88 (1.27)(n/a) | 1.7 | 1.56 (1.68)(n/a) | 3.39 (4.69)(-) | 3.4 (?)(-) |
| | PS3 | 0.87 (1.15)(n/a) | 1.5 | 1.40 (1.53)(n/a) | 3.17 (5.02)(N/A) | 3.18 (?)(n/a) |
| tomorden (tyan f77d w 8 Geforce 2080-ti) | PS1 | 0.28 (-)(-) | 1.17 | 0.68 (-)(-) | 2.56 (-)(-) | 2.45 (-)(-) |
| NO BIOS CONFIG FOR PSU | PS2 | 0.9 (1.25)(n/a) | 1.72 | 1.53 (1.58)(n/a) | 3.41 (4.60)(-) | 3.38 (?)(-) |
| | PS3 | 0.89 (1.14)(n/a) | 1.51 | 1.38 (1.49)(n/a) | 3.17 (4.90)(N/A) | 3.10 (?)(n/a) |
| invincible (asus esc 4000 g3) | PS1 | 0.18 (-) | 1.17 | 0.68 (-) | 2.28 (-) | 2.2 (-) |
| NO BIOS CONFIG FOR PSU | PS2 | 0.18 (0.38) | 1.5 | 0.69 (1.58) | 2.3 (4.60) | 2.2 (4.5) |

Basic procedure (work in progress)

  1. Power off the machine.
  2. Replug the machine through the power meters.
  3. Take readings with the machine powered off.
  4. Remove each PSU in sequence, taking readings (machine still powered off).
  5. Power on the machine.
  6. Take readings from the meters, noting the max reading on each meter until the login prompt appears.
  7. Power off the server.
    1. Remove the next PSU in sequence.
    2. Boot the server.
    3. Take readings from the meters, noting the max reading on each meter until the login prompt appears.
    4. Go back to step 7 and repeat with the next PSU.
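
If the meter readings taken during boot are jotted down as (meter, amps) pairs, the "max reading on each meter" can be picked out with something like the sketch below. The meter labels and sample values are purely illustrative and are not measurements from this page.

```python
def peak_per_meter(samples):
    """Return the highest reading seen on each meter.

    samples: iterable of (meter_label, amps) pairs in the order they were read.
    """
    peaks = {}
    for meter, amps in samples:
        # Keep the reading only if it beats the best seen so far for this meter.
        if amps > peaks.get(meter, float("-inf")):
            peaks[meter] = amps
    return peaks


# Illustrative values only, standing in for readings noted during one boot.
boot_samples = [("meter1", 0.3), ("meter2", 0.2),
                ("meter1", 1.1), ("meter2", 0.9),
                ("meter1", 0.7)]
print(peak_per_meter(boot_samples))
```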

Dells

Dells configured with redundant power supplies will draw ~90% of their power from one PSU; Dells with non-redundant power supplies will balance the power draw across both.
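
As a quick check of this claim, the per-PSU share can be worked out from the 15.6.2020 current readings in the Rack 16 table at the top of this page. This is a minimal sketch; the function name is made up, and mcclintock is left out because it was switched off.

```python
def psu_shares(readings):
    """Return each PSU's fraction of the total current draw."""
    total = sum(readings)
    if total == 0:
        return [0.0 for _ in readings]  # machine off: nothing to share out
    return [r / total for r in readings]


# (PSU config, per-PSU current in A) copied from the Rack 16 table.
rack16 = {
    "levi": ("redundant", [1.4, 0.2]),
    "nuesslein": ("redundant", [3.2, 0.2]),
    "ostrom": ("non redundant", [1.2, 1.0]),
    "barre": ("non redundant", [1.0, 1.0]),
    "greider": ("non redundant", [1.6, 1.6]),
}

for name, (mode, amps) in rack16.items():
    shares = ", ".join(f"{s:.0%}" for s in psu_shares(amps))
    print(f"{name:10s} {mode:14s} {shares}")
```

The redundant machines come out close to the ~90% figure and the non-redundant ones close to an even split, which is consistent with the statement above.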

It turns out that the Dell BMC stores some power data, which is accessible over IPMI through an OEM extension:

[glorious]root:  /usr/bin/ipmitool delloem powermonitor
Power Tracking Statistics
Statistic      : Cumulative Energy Consumption
Start Time     : Mon Mar  2 18:05:00 2015
Finish Time    : Mon Dec  9 07:51:16 2019
Reading        : 3143.6 kWh

Statistic      : System Peak Power
Start Time     : Mon Mar  2 18:05:00 2015
Peak Time      : Fri Sep 27 09:59:56 2019
Peak Reading   : 250 W

Statistic      : System Peak Amperage
Start Time     : Mon Mar  2 18:05:00 2015
Peak Time      : Fri Sep 27 09:59:56 2019
Peak Reading   : 1.3 A
[glorious]root: 
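
To collect these figures from many machines, the output above can be turned into a dictionary. The sketch below assumes only the "Name : value" layout shown in the sample output (other firmware versions may print it differently), and the function names are made up for illustration.

```python
SAMPLE = """\
Power Tracking Statistics
Statistic      : Cumulative Energy Consumption
Start Time     : Mon Mar  2 18:05:00 2015
Finish Time    : Mon Dec  9 07:51:16 2019
Reading        : 3143.6 kWh

Statistic      : System Peak Power
Start Time     : Mon Mar  2 18:05:00 2015
Peak Time      : Fri Sep 27 09:59:56 2019
Peak Reading   : 250 W

Statistic      : System Peak Amperage
Start Time     : Mon Mar  2 18:05:00 2015
Peak Time      : Fri Sep 27 09:59:56 2019
Peak Reading   : 1.3 A
"""


def parse_powermonitor(text):
    """Return {statistic name: {field: value}} for the layout shown above."""
    stats = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # title line or blank separator
        key, value = key.strip(), value.strip()
        if key == "Statistic":
            current = stats.setdefault(value, {})  # start a new statistic block
        elif current is not None:
            current[key] = value
    return stats


def reading_value(field):
    """Split a reading such as '1.3 A' into (1.3, 'A')."""
    number, unit = field.split(maxsplit=1)
    return float(number), unit


stats = parse_powermonitor(SAMPLE)
peak_amps, unit = reading_value(stats["System Peak Amperage"]["Peak Reading"])
print(peak_amps, unit)
```

In practice the text would come from running the ipmitool command on each host rather than from a pasted sample.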

Running this on all the Dells gives us this interesting graph.

dells.png

Clearly a current draw of 6.5 kA is wrong, so we exclude data points where the current draw is over 60A (PDUs in the forum are rated at 32A):

fig_1_all_sensible_dells.png
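
The filtering step can be sketched as below, assuming the peak readings have been gathered into a host-to-amps mapping. Only the glorious value (1.3 A) and the 6.5 kA outlier come from this page; the host name "example-bad-host" is hypothetical.

```python
PLAUSIBLE_LIMIT_A = 60.0  # cut-off used above; the PDUs themselves are rated at 32 A


def split_plausible(peaks, limit=PLAUSIBLE_LIMIT_A):
    """Separate peak readings (A) into plausible and implausible sets."""
    kept = {host: amps for host, amps in peaks.items() if amps <= limit}
    dropped = {host: amps for host, amps in peaks.items() if amps > limit}
    return kept, dropped


peaks = {
    "glorious": 1.3,             # from the ipmitool output above
    "example-bad-host": 6500.0,  # hypothetical host carrying the 6.5 kA outlier
}
kept, dropped = split_plausible(peaks)
print("kept:", kept)
print("dropped:", dropped)
```

Returning the dropped readings as well makes it easy to see which BMCs are reporting nonsense.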

So what do the GPU numbers look like if we concentrate on the T630s?

fig_2_T630s.png

In this case glorious is not actually a GPU server, but the other nodes show a fair spread of peak current draw. If we concentrate on one specific GPU (the 1080, say):

fig_3_t630s_1080-ti.png

Again, trying to work out what factors might affect the power consumption: if we split the graph based on power supplies, we get these graphs.

t630_1080-ti_1600W.png t630_1080-ti_1100W.png

-- IainRae - 05 Nov 2019

Topic revision: r14 - 17 Jun 2020 - 07:57:01 - IanDurkacz
 