Note: I'm mastering this in emacs in ~neilb/work/dice/fileservices/afs/dec.2012/forum-upgrades and will only update it here occasionally. Wiki changes will be lost the next time I sync.

---+ December 2012 Forum AFS Server Upgrades/Replacements

By the end of the year we need to have replaced three of the four existing Forum AFS file servers (squonk, bunyip, cameleopard and crocotta) with the new machines (nessie, yeti and kraken), and have them all running SL6 x64.

The initial state is that the original machines serve all their data from the SAN (ifevo1 and ifevo2). The new machines have 1.5TB of local RAID 10 disk and no FC cards. The plan is to get one FC card and install it in kraken, then migrate all volumes from one of the old servers to kraken so that the old machine can be decommissioned and its FC card installed in yeti; and so on, until the new machines all have FC cards from the three decommissioned old machines. This would leave one old machine with nowhere to migrate to (if we want to SL6 it without disrupting users).

Another goal is to free ifevo1 of all data, so we can update its firmware and probably recreate its vdisks.

Some rough facts and figures, sizes in TB (U = user volumes, G = group volumes):

| *machine* | *Total disk* | *Used disk* | *Total disk for groups* | *Total disk for users* |
| squonk | 8 | 5 | 5 | 2 |
| bunyip | 7 | 5 | 6 | 1 |
| cameleopard | 5 | 4 | 5 | 1 |
| crocotta | 9 | 4 | 6 | 2.5 |

ifevo1 has 10TB RAID5 + 6TB RAID10.

bunyip is showing a DIMM parity error, so it's best that it's not the one that survives.

---+ Vague Plan

ifevo2 has 16x2TB unallocated disks, so initially I've grabbed 5 of them to make an 8TB RAID5 vdisk, with the intention of moving volumes from the next fileserver to be replaced/upgraded onto it, and then back off again once the replacement is done - i.e. just as some shuffle space. Possibly not moving back the stuff that comes off ifevo1, so it gradually empties.

---++ Repartition R720 RAID 10

parted commands:
<pre>
mkpart sdb1 ext3 0% 33%
mkpart sdb2 ext3 33% 66%
mkpart sdb3 ext3 66% 99%

simpler but a bit wasteful, or

mkpart sdb1 ext3 0% 476416MiB
mkpart sdb2 ext3 476416MiB 952831MiB
mkpart sdb3 ext3 952831MiB 100%

more efficient use of space but harder to read!
</pre>

---++ Migrate cameleopard to nessie

| *cameleopard -> nessie* |||||||
| *type* | *Source partitions* | *Sizes* | | *Dest sizes* | *Dest partitions* | *Notes* |
| group | a,b,c,d | 4x250G | -> | 2x500G | d, e | |
| group | e,f,g,h | 4x500G | -> | 2x1000G | f, g | |
| user | i,j | 2x250G | -> | 1x450G | b | shuffle some to c |
| group | k,l | 2x250G | -> | 1x500G | h | |
| user | m | 1x250G | -> | 1x450G | c | plus some overspill from b |
| group | o | 1x760G | -> | 1x1000G | i | |

Leaving vicepa empty for extra users from crocotta. A total of 4.5TB SAN space.

Time taken to do the moves:
   * user.moves real 850m29.464s = 14.25hrs - the user partitions
   * group.moves1 real 1559m1.203s = 26hrs - group partitions a to g
   * group.moves2 real 1117m29.952s = 18.5hrs - group partitions h to o

---++ Migrate bunyip to kraken

| *bunyip -> kraken* |||||||
| *type* | *Source partitions* | *Sizes* | | *Dest sizes* | *Dest partitions* | *Notes* |
| user | a,b,c,d | 4x250G | -> | 3x500G | a,b,c | a and c to a, b to b, d to c |
| group | e,f,g,h | 4x540G | -> | 4x540G | e,f,g,h | |
| group | i | 1x950G | -> | 1x1000G | i | |
| group | j,k | 2x500G | -> | 2x500G | j,k | |
| group | l | 1x1000G | -> | 1x1000G | l | |
| group | m | 1x1100G | -> | 1x1000G | m | |

Mostly a 1-to-1 mapping, with some sizes tweaked a wee bit.
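Each of these partition-to-partition migrations boils down to a vos move per volume. A minimal sketch of that loop using stock OpenAFS vos commands is below (this is not the local migration script - the RW-only filter and the parsing of the listvol output are my assumptions), taking bunyip /vicepe -> kraken /vicepe from the table above as the example:
<pre>
# Sketch only: move every RW volume on bunyip /vicepe to kraken /vicepe.
# Needs AFS admin tokens; "vos listvol -quiet" lines look like: name id type size K status
for vol in $(vos listvol -server bunyip -partition e -quiet | awk '$3 == "RW" {print $1}'); do
    vos move -id "$vol" \
             -fromserver bunyip -frompartition e \
             -toserver kraken -topartition e -verbose
done
</pre>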
Times for the bunyip moves:
   * group.ami8, the single 990GB volume on vicepm, took 14h to move.
   * user vol moves real 985m29.680s = 16h
   * group2moves real 1311m36.477s
   * group1moves real 1863m50.228s

---++ Migrate crocotta to yeti

| *crocotta -> yeti* |||||||
| *type* | *Source partitions* | *Sizes* | | *Dest sizes* | *Dest partitions* | *Notes* |
| user | a,b,c,d,e,f | 6x250G | -> | 3x500G | a,b,c | ab to a, cd to b, ef to c |
| group | g | 1x250G | -> | 1x500G | d | mixed |
| user | h | 1x250G | -> | 1x500G | d | mixed |
| group | i | 1.2TB | -> | 1.2TB | e | |
| group | j,k | 2x600GB | -> | 2x600GB | f,g | |
| group | n | 1.2TB | -> | 1.2TB | h | |
| group | o | 1TB | -> | 1TB | i | |
| group | p | 1TB | -> | 1TB | j | |
| mpu | squonk | 580GB | -> | 580GB | k | |
| user | l,m | 800GB | -> | 800GB | l | |

<pre>
# crocotta
./afsmigrate-partition crocotta a yeti a
./afsmigrate-partition crocotta b yeti a     a & b 711mins
./afsmigrate-partition crocotta c yeti b
./afsmigrate-partition crocotta d yeti b
./afsmigrate-partition crocotta e yeti c     3h:14m
./afsmigrate-partition crocotta f yeti c     2h:55m.28s
./afsmigrate-partition crocotta g yeti d
./afsmigrate-partition crocotta h yeti d
./afsmigrate-partition crocotta i yeti e
./afsmigrate-partition crocotta j yeti f
./afsmigrate-partition crocotta k yeti g     all 1324mins
./afsmigrate-partition crocotta l yeti l
./afsmigrate-partition crocotta m yeti l
./afsmigrate-partition crocotta n yeti h     667m (11.1hr) for l, m, n
./afsmigrate-partition crocotta o yeti i     4h:53m.21s
./afsmigrate-partition crocotta p yeti j     2h:44m.21s

d  00c0ffd863b400007a95cf5001000000   3600c0ff000d863b47a95cf5001000000
e  00c0ffd863b40000ab95cf5001000000   3600c0ff000d863b4ab95cf5001000000
fg 00c0ffd863b40000d595cf5001000000   3600c0ff000d863b4d595cf5001000000
h  00c0ffd863b40000fe95cf5001000000   3600c0ff000d863b4fe95cf5001000000
i  00c0ffd863b400001f96cf5001000000   3600c0ff000d863b41f96cf5001000000
j  00c0ffd863b400004396cf5001000000   3600c0ff000d863b44396cf5001000000
k  00c0ffd8545a00007a4fcf5001000000   3600c0ff000d8545a7a4fcf5001000000
l  00c0ffd863b40000a69bcf5001000000   3600c0ff000d863b4a69bcf5001000000

nessie k  00c0ffd863b40000589fcf5001000000   3600c0ff000d863b4589fcf5001000000   - small squonk group vols

afsmigrate-partition squonk g nessie k
afsmigrate-partition squonk i nessie k
afsmigrate-partition squonk n nessie k
afsmigrate-partition squonk o nessie k
The above took 368mins = 6.1hrs

afsmigrate-partition squonk p yeti k         took 4h:14m
</pre>

---++ Note on moving partitions between servers

After some Googling and some chats in the OpenAFS chat room, these are the steps to move vicep's from server A to server B (and have the volumes they contain move too); a command-level sketch follows the to-do list below.
   * Shut down A.
   * Disconnect the partitions from A and attach them to B (they don't have to use the same vicep name as they did on A).
   * Restart the file server on B (if you need to).
   * vos syncvldb -server B
   * vos syncserv -server B
   * Restart A (without the moved vicep's) - this is to keep clients happy for the next 2 hours, after which A can be turned off (if it has no more volumes).
   * Double-check that the VLDB records things where they should be.
   * That's it. Any problems with clients not finding the new location of the volumes should be solved by "fs checkvolumes" on the affected client.

---++ Still to do

   * update AFSPartitions wiki page with the bunyip/kraken move
   * reclaim free SAN space
   * crocotta move to yeti - needs bunyip's FC card
   * SL6 squonk
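As referenced above, a rough command-level version of the partition-move procedure (assuming AFS admin rights, and that the fileserver bnode is called "fs" - on a DAFS server it would be "dafs"):
<pre>
# Sketch of the vicep move procedure (A = old server, B = new server).
bos shutdown A -wait -localauth           # stop A's server processes
# ... detach the vicep LUNs from A on the SAN, attach and mount them on B ...
bos restart B -instance fs -localauth     # B's fileserver only scans /vicep* at startup
vos syncvldb -server B -verbose           # point the VLDB at B for the moved volumes
vos syncserv -server B -verbose           # check B's volume headers agree with the VLDB
bos startup A -localauth                  # bring A back (minus the moved viceps)
# On any client still looking for volumes in the old place:
fs checkvolumes
</pre>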
---++ squonk moves

<pre>
900GB vicepn on kraken, move squonk a,b,c,d to it   (LUN27)   674mins
1100GB vicepj on nessie, move squonk h,j,k,l        (LUN 6)   726m
squonk f to nessie a                                (250GB)
</pre>
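For the new SAN-backed viceps above, a rough sketch of bringing one into service before draining the squonk partitions onto it - the device path/multipath alias is made up, ext3 is assumed from the parted notes earlier, and the fileserver bnode name may be fs or dafs:
<pre>
# Sketch: turn the 900GB LUN (LUN27) into /vicepn on kraken, then drain squonk onto it.
mkfs -t ext3 /dev/mapper/lun27                # hypothetical multipath alias for LUN27
mkdir /vicepn
mount /dev/mapper/lun27 /vicepn               # plus a matching /etc/fstab entry
bos restart kraken -instance fs -localauth    # fileserver only scans /vicep* at startup
# Then the per-partition drains, as elsewhere in these notes:
./afsmigrate-partition squonk a kraken n
./afsmigrate-partition squonk b kraken n
./afsmigrate-partition squonk c kraken n
./afsmigrate-partition squonk d kraken n
</pre>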