RATUnitMeeting
(27 Jan 2020, Main.TimColles)
<!--
   * Set NOAUTOLINK = On
-->
<style type="text/css">
h2 { border-bottom: 1px solid black; } /* this is to help differentiate projects on the print-friendly version */
div.twikiToc { display: none; } /* hide the toc on the print-friendly version */
#main-copy div.twikiToc { display: block; }
#main-copy h2 { border-bottom: none; } /* make things look slightly sensible on the screen version */
#main-copy h3 { /* allows projects to look like projects */
   background-color: transparent;
   color: black;
   font-size: 130%;
   font-weight: bold;
   margin: 0 0;
   padding: 0.5ex 0 0.5ex 1ex;
}
</style>

---+ RAT Unit Meeting -- 27-01-2020

%TOC%

---++ Projects

---+++ CompProj:379 Review of DICE desktop platform
   * gdutton: [[DiceDesktopReviewFinalReport][completion report]] in progress
   * gdutton: create a new project once the report is done: the purpose is now wider than the original remit

---+++ CompProj:392 Live Chat Service
   * discussed with CSOs, many opinions, another non-queued interruption would be a problem
   * looked at some packages
   * can we use Teams? It has a mobile app and the web client is fine on Linux, but automation may be needed to make it work like live chat, including returning to RT
   * gdutton: [[LiveSupportChatService][interim report]] on status and options pending
   * gdutton: revisited some work on Mattermost for other purposes
   * low priority

---+++ CompProj:417 Roles Management
   * gdutton: met Toby for remaining deliverables, report written. Notes at [[http://homepages.inf.ed.ac.uk/cgi/gdutton/awki/][gdutton:EntitlementTools]]
   * finish project when roles supremo page is sufficient

---+++ CompProj:420 Hadoop
   * timc: checked whether a DPIA is needed: yes
   * we need to decide whether we can auto-delete and/or set a retention policy
      * could have a two-stage process, i.e. archive first then delete, with an email to say it is still accessible but jobs cannot be run, and will be deleted after a set period of time otherwise

---+++ CompProj:463 Merged MLP/MSC Teaching Clusters
   * script done to generate auto.homes file driven by caps - important so we can separate users onto the correct filesystem, teaching or research
   * add ability to get config from file
   * now shifting people about; the amount of data involved means this will take a while to do
   * done using rsync
   * script managing home directories now running in reporting mode for safety
   * working on component, hopefully LCFG manageable
   * working on DPIA

---+++ CompProj:464 User access to last/ps/w etc
   * gdutton: concrete proposals report continuing THIS WEEK

---+++ CompProj:470 Personalised Portal Page for Academic Staff
   * ongoing, "select" academics testing; need to demo for admin; core development is done; many reports to add or transfer from Portal but that is not part of this project
   * continue on documentation
   * status: very soft launch!

---+++ CompProj:506 Teaching Software 2019/20
   * Node.js added (RT:99400)
   * S2: nbgrader remains

---+++ HTTP -> HTTPS check
   * rwb to investigate, piggy-backing on neilb efforts
   * Nothing major anticipated.

---+++ CompProj:539 User Facing MHR Data Asset Register
   * gdutton: met with cms to capture requirements and consider development for users
   * will stall for the time being

---+++ GPU Approved Supplier Procurement
   * ITT went out on 24th Jan, deadline is 14th Feb, evaluation 15th to 21st
   * make sure procurement give us all tender-related documentation from suppliers
   * some items will have to be purchased as "call-off"
   * delivery scheduled end March.

---++ Misc Development
   * continuing on migrating some TSP processes
      * still todo form 2 handling, some tweaks requested following review meeting
   * gdutton: other TSP enhancement
      * timc: go over changes with gdutton
   * RT lifespan/retention plus other things - discussed at ops
      * DPIA should precede these changes.
      * add accounts by ldap
      * look at merging emails - needs QA process
      * look at purging and retention rules
      * run pre move to postgresql scripts

---++ Operational
   * 7.6 upgrades
      * CDT VM host is done
   * h/w security decommission
      * wasserboxer - move bridgeport off - flapjack due to be re-installed, due this week
      * fondant - off, needs to be physically removed
      * arcsim - ssh access only (iptables), also wants a VM server for arcsimvm1 (commonrail)
      * bocian/blanik/karenin
         * timc: all ssh access only while migration completed by Rob, manual iptables configuration
      * henwen + wilbur - due for removal now, rack space is precious
   * GPU networking
      * iainr: discuss infrastructure with gdmr/idurkacz
      * add to spending plan if appropriate
      * 10gb cards now pulled
      * 10gb switch now populated with 10gb modules
      * iainr has got power figures from "hannah", max power draw about 8.5A with all GPUs running but if running CPUs power draw goes down
      * data can be collected live from Dells so can get historical maximum, need to check if higher than aircon maximum
   * doing DPIAs:
      * we need to do *all* our services, in approx priority order:
         * Webmark
         * Theon
         * TheonPortal
         * RT4 (can use some of Unidesk replacement one)
         * ProjSubs, Projects-Archive
         * DPMT
         * Slurm
         * Hadoop
         * License server logs
         * Lab exam?
      * aburford: need to do one for ProctorU
      * timc: doing WhosOff (probably not)
   * privacy statement for:
      * Codegrade - draft with Rena, legal dept have written DPIA
      * Webmark
   * tophat attendance
      * privacy question, may not be specifically tophat, sitting with IS for comment
   * looking at live capture (transcript) options
      * live broadcast audio much better, latency in transcription may be significant
      * doing more testing with disability office
      * issues with videoconferencing hampering efforts; loan hardware should help
      * likely to be a compromise approach
      * Nick helping with testing, gone back to disability office to suggest purchasing iPad and microphone
      * we have an iPad we could use for testing - ask Alastair
   * lennoxtown still with Novatech, some response since 1st Jan
      * re-prod
   * ubatuba also still broken (2 GPU failures)
      * will wait on lennoxtown and then do a GPU swap to test
   * laphroaig still falling over weekly(?) in use
      * also partially knocks over BMC, further investigation required
      * fell over again - using an ordinary drive for OS and see what happens
      * check for newer firmware?
   * PG v12 - no OIDs anymore, affects TheonUI
      * synthetic oid column is a minimal safe fix
   * Theon (meringue) downtime
      * timc: need to deal with Learn embeds - specific 403 for learn subdirectory would be useful
      * would be nice to monitor postgresql configuration failures at component level -- gdutton to bugzilla
   * postgresql access
      * local trust access allows any authorised user to log in as any user account
      * this does not affect our general usage but in the context of a further authorisation failure we would have been at risk
      * remove localhost IP access (we don't need to use it)
      * move all other trust types to peer, which adds a further level of security
      * change permissions on socket to limit access by unix uid
      * should probably move nagios to gssapi - needs a little bit of work
   * shuffling data from teaching cluster to research cluster
      * iain will do a few and write procedure so richard can then assist
   * pgteach move to teacake / v11
      * done
   * mattermost trial
      * vs. Slack, vs. MS Teams
      * going to use a team created on the EPCC service for trial
      * Inf VM seems to be next best option if we must host it.
      * SSO/Authentication can be added to Open Source edition.
   * removed final FLEXlm from two machines, including legacy matlab and Simics
   * report Rob's VM issues following edge switch reboots
      * bonding protection caused hosts to not see network failure
      * George has proposed solutions written up in infrastructure report for next ops meeting
      * gdutton: bouncing guest media state should avoid reboot, liaising with Rob to test
   * ILCC cluster
      * all these machines should be under Slurm by Wednesday
      * likely to be purchasing more

---++ AOCB
   * [[https://nagios.inf.ed.ac.uk/nagios/cgi-bin/status.cgi?hostgroup=DICEServersResearchAndTeachingServers&style=detail&servicestatustypes=28&hoststatustypes=15][Nagios RAT Issues]]
   * [[https://rt4.inf.ed.ac.uk/Search/Results.html?SavedChartSearchId=new&RowsPerPage=50&Page=&Format=%27+++%3Cb%3E%3Ca+href%3D%22__WebPath__%2FTicket%2FDisplay.html%3Fid%3D__id__%22%3E__id__%3C%2Fa%3E%3C%2Fb%3E%2FTITLE%3A%23%27%2C%0A%27%3Cb%3E%3Ca+href%3D%22__WebPath__%2FTicket%2FDisplay.html%3Fid%3D__id__%22%3E__Subject__%3C%2Fa%3E%3C%2Fb%3E%2FTITLE%3ASubject%27%2C%0A%27__Status__%27%2C%0A%27__QueueName__%27%2C%0A%27__OwnerName__%27%2C%0A%27__Priority__%27%2C%0A%27__NEWLINE__%27%2C%0A%27%27%2C%0A%27%3Csmall%3E__Requestors__%3C%2Fsmall%3E%27%2C%0A%27%3Csmall%3E__CreatedRelative__%3C%2Fsmall%3E%27%2C%0A%27%3Csmall%3E__ToldRelative__%3C%2Fsmall%3E%27%2C%0A%27%3Csmall%3E__LastUpdatedRelative__%3C%2Fsmall%3E%27%2C%0A%27%3Csmall%3E__TimeLeft__%3C%2Fsmall%3E%27&Order=ASC%7CASC%7CASC%7CASC&SavedSearchId=RT%3A%3AGroup-104900-SavedSearch-352&Query=CF.%7BCategory%7D+LIKE+%27BAD+RAT%27+AND+Status+!%3D+%27resolved%27&OrderBy=id%7C%7C%7C][BAD-RAT tickets]]

<!--
// These lines are here so that vim behaves properly during meetings.
// vim: set syntax=twiki expandtab tabstop=3 shiftwidth=3 autoindent smartindent:
-->
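
A minimal sketch for the postgresql access item under Operational; it is not the configuration that was actually applied. It reads pg_hba.conf-style text, switches local (Unix socket) entries from trust to peer, comments out host entries for loopback addresses, and only flags any remaining trust entry for manual review. The function name audit_pg_hba, the sample records and the list of loopback spellings are illustrative assumptions. Only the CIDR address form is handled, and quoted fields containing spaces are not.

```python
from typing import List, Tuple

# Assumed loopback spellings; a real pg_hba.conf may use other forms.
LOOPBACK = {"127.0.0.1/32", "::1/128", "localhost"}


def audit_pg_hba(text: str) -> Tuple[List[str], List[str]]:
    """Return (rewritten_lines, notes) for pg_hba.conf-style text.

    Hypothetical helper, not part of PostgreSQL or LCFG.
    """
    rewritten: List[str] = []
    notes: List[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        # Blank lines and comments pass through untouched.
        if not fields or fields[0].startswith("#"):
            rewritten.append(line)
            continue
        kind = fields[0]
        if kind == "local" and len(fields) >= 4 and fields[3] == "trust":
            # local records are: TYPE DATABASE USER METHOD [OPTIONS]
            fields[3] = "peer"
            rewritten.append(" ".join(fields))
            notes.append(f"line {lineno}: local trust -> peer")
        elif kind.startswith("host") and len(fields) >= 5 and fields[3] in LOOPBACK:
            # host records are: TYPE DATABASE USER ADDRESS METHOD [OPTIONS]
            rewritten.append("# " + line)
            notes.append(f"line {lineno}: loopback host entry commented out")
        elif "trust" in fields[3:]:
            # peer only applies to socket connections, so just flag these.
            rewritten.append(line)
            notes.append(f"line {lineno}: trust entry needs manual review")
        else:
            rewritten.append(line)
    return rewritten, notes


if __name__ == "__main__":
    sample = "\n".join([
        "# TYPE DATABASE USER ADDRESS METHOD",
        "local all all trust",
        "host all all 127.0.0.1/32 md5",
        "host all all 10.0.0.0/8 trust",
    ])
    new_lines, report = audit_pg_hba(sample)
    print("\n".join(new_lines))
    print("\n".join(report))
```

The socket permission change and the nagios move to gssapi are outside what this sketch covers.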
Topic revision: r296 - 27 Jan 2020 - 15:15:01 - Main.TimColles