Project page - https://devproj.inf.ed.ac.uk/project/show/137

MirrorImprovementsProjectLog

Update 1/11/2011: End-user docs are now at ServicesUnitMirrorService

Proposed changes/improvements to the existing mirror service

The aim here is to add some useful improvements without expending too much effort; 2 weeks have been allocated to the project.

The initial plan is to drive more of the mirror config from the client rather than the server, so you can say "I'd like to mirror this, and this" and the rest happens automatically. It should also be easier to report on whether the mirror has actually happened, and where it is.

Due to the complexity of making it all truly automatic, I decided it was easier (and quicker) to leave the choice of where to mirror to as a human decision. This could be revisited at a later date if required.

Mirror Client

So the plan will be to create a new "mirror_client" component, along with header files, so that typical usage would be along the lines of:

#include <dice/options/mirror-client-forum.h>

MIRROR_THIS( tag1, /disk/data )
MIRROR_THIS_NOTAPE( tag2, /disk/data2 )

The site-specific include file makes it simple to work out where the client is.

The MIRROR_THIS macro would set mirror_client and rsync resources, and the "tags" would allow further, more detailed, tweaking of resources if the user needed it. It also allows the client owner to make other uses of rsync if required.

The NOTAPE option would allow you to specify if you didn't want the mirror going to tape for some reason.

I've thought about adding a "frequency" resource/parameter, so you could request something other than the usual "nightly", but for now I've decided to skip this.

The mirror_client component will have a method(s) to report back on the state of the mirrors, eg

  • when the last successful mirror was
  • where it was mirrored to

One of the things that the macro and mirror_client will do is add the details of what's to be mirrored to a spanning map that the mirror servers will see.

I'll possibly look at some nagios monitoring to alert if the last mirror is older than it should be.
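As a rough illustration of the "older than it should be" check, here is a minimal Python sketch. The function name and the 26/50 hour thresholds are assumptions made for the example (a nightly mirror plus some slack), not anything decided in this project. Only the return codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_mirror_age(last_success, now,
                     warn_after=timedelta(hours=26),
                     crit_after=timedelta(hours=50)):
    """Return (exit_code, message) for the age of the last good mirror.

    last_success is a datetime, or None if no successful mirror is
    recorded. The default thresholds are hypothetical.
    """
    if last_success is None:
        return UNKNOWN, "UNKNOWN: no successful mirror recorded"
    hours = (now - last_success).total_seconds() / 3600
    if now - last_success >= crit_after:
        return CRITICAL, f"CRITICAL: last good mirror {hours:.1f}h ago"
    if now - last_success >= warn_after:
        return WARNING, f"WARNING: last good mirror {hours:.1f}h ago"
    return OK, f"OK: last good mirror {hours:.1f}h ago"
```

A real plugin would print the message and exit with the code. Where the timestamp of the last successful mirror comes from is left open here.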

Mirror Server

Remember that there are multiple mirror servers, so the spanning map that is constructed by the mirror clients will be visible to multiple machines. This is why truly automating which mirror server should do a particular mirror would be tricky. Easier to leave that to a human.

Mirror servers would use new headers and an improved rmirror component to configure and do the mirroring. eg

#include <dice/options/mirror-server-jcmb.h>

MIRROR_CLIENT( clientname, tag1, /disk/rmirror01 )
MIRROR_CLIENT( clientname, tag2, /disk/rmirror02 ) /* a non-tibsed partition*/

Again the site specific mirror server header makes it simple to take appropriate action depending on where the server is. eg it could just build a list of all mirror servers at a particular site.

By default the macro would configure the existing rmirror resources to mirror the specified client and tag into an appropriately named sub-dir of /disk/rmirror01/, but again using the tags a power user could override that.
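The default-plus-override behaviour can be sketched in Python as below. The <base>/<client>/<tag> layout is purely an assumption for illustration; the actual naming of the sub-dir has not been decided, and the function name is hypothetical.

```python
import posixpath


def default_destination(base, client, tag, override=None):
    """Return the directory a mirror of (client, tag) would be written to.

    Assumed layout: <base>/<client>/<tag>. An explicit override (the
    "power user" case) wins over the derived default.
    """
    if override is not None:
        return override
    return posixpath.join(base, client, tag)
```

For example, default_destination("/disk/rmirror01", "clientname", "tag1") gives "/disk/rmirror01/clientname/tag1" under this assumed layout.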

When the rmirror component runs at night to do the mirrors, it currently just logs the output to the log file. I propose to store information about the mirrors (somewhere, probably just in a flat file in /var/lcfg/conf/rmirror/), recording when the mirror of a particular source started and finished, where it stored the data and, if there was an error, when it occurred and what it was.
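One possible shape for such a flat file is one JSON record per line, appended after each run. The sketch below is only an assumption about the format. The field names and both helper functions are hypothetical, and the helpers work on lines of text rather than on a particular file, so the storage location stays open.

```python
import json


def format_record(client, tag, destination, started, finished, error=None):
    """Serialise one mirror run as a single JSON line (no trailing newline)."""
    return json.dumps({
        "client": client,
        "tag": tag,
        "destination": destination,
        "started": started,      # timestamps kept as strings, e.g. ISO 8601
        "finished": finished,
        "error": error,          # None means the mirror succeeded
    }, sort_keys=True)


def last_success(lines, client, tag):
    """Return the newest error-free record for (client, tag), or None."""
    latest = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if (entry["client"], entry["tag"]) == (client, tag) and entry["error"] is None:
            latest = entry  # later lines are newer in an append-only file
    return latest
```

A report method could then answer "when was the last successful mirror, and where did it go" by passing the open file to last_success and reading the "finished" and "destination" fields.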

Again, new method(s) on the component would report back on the clients it is configured to mirror. They would include the recorded information, like when the mirror started and finished, success/failure, location, etc.

Issues

How am I going to make the mirror_client method of reporting back on the details of the last mirror work?

  • I don't want the user to have to know that it was server X that did the mirror. If they did, we could simply call server X's rmirror.report method for the information.
  • I could assume that we'll set up a central mirror overseer server that knows about all the mirror servers and can query them all to find out which one actually mirrored a particular host and tag.

An overseer server would also be able to spot clients in the spanning map for which there was no mirror server defined, and alert (via nagios/email) someone to trigger the manual configuration of a server to mirror that missing client.
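The "spot clients with no mirror server" check is essentially a set difference. A minimal Python sketch follows, assuming the spanning map can be read as a list of (client, tag) pairs and each server's configuration as the pairs it mirrors. Both shapes and the function name are hypothetical.

```python
def unmirrored(requested, configured):
    """Return the (client, tag) pairs that no mirror server covers.

    requested:  iterable of (client, tag) pairs from the spanning map
    configured: mapping of server name -> iterable of (client, tag)
                pairs that server is set up to mirror
    """
    covered = set()
    for pairs in configured.values():
        covered.update(pairs)
    # Sorted so that a report or alert lists the gaps in a stable order.
    return sorted(set(requested) - covered)
```

Anything returned here is a candidate for the nagios/email alert asking a human to configure a server for that client.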

Then I wondered whether, rather than having a single overseer server, all the mirror servers could act like one. I could use spanning maps so that all the mirror servers know about each other; then, when one is asked for details of a particular client that isn't one of its own, it would ask the others (just using the 'om host.rmirror ...' syntax).
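The "answer locally, otherwise ask the others" idea could look roughly like the Python sketch below. The remote call is deliberately abstracted: each peer is represented by a plain callable standing in for whatever the 'om host.rmirror ...' invocation turns out to be. Those callables are assumed to answer only from the peer's own records, so that peers do not keep forwarding the question to each other. All names here are hypothetical.

```python
def find_report(client, tag, local_reports, peers):
    """Look up the mirror report for (client, tag).

    local_reports: mapping of (client, tag) -> report for mirrors this
                   server does itself
    peers:         mapping of peer server name -> callable(client, tag)
                   returning that peer's own report, or None
    """
    key = (client, tag)
    if key in local_reports:
        return local_reports[key]
    for name in sorted(peers):           # fixed order, for predictability
        report = peers[name](client, tag)
        if report is not None:
            return report
    return None                          # nobody mirrors this client/tag
```

A None result here is the same condition an overseer would alert on: a client and tag that no server is mirroring.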

Other minor improvements

The rmirror component could (configurably) create destination mirror directories. Currently it fails if you say "mirror to /disk/rmirror01/jings" and that dir doesn't exist. Creating the missing dir doesn't seem too dangerous!
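The intended behaviour, sketched in Python under the assumption of a single on/off setting (called create_missing here, a hypothetical name), with the current fail-if-missing behaviour kept as the default:

```python
import os


def ensure_destination(path, create_missing=False):
    """Return True if the mirror destination directory is usable.

    With create_missing off (the current behaviour) a missing directory
    is reported as a failure. With it on, the directory and any missing
    parents are created first.
    """
    if os.path.isdir(path):
        return True
    if not create_missing:
        return False
    os.makedirs(path, exist_ok=True)
    return True
```

The caller would skip or fail that particular mirror when this returns False, rather than aborting the whole run.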

Add a method to force a fresh mirror for the specified host/tag; this would allow you to say "mirror this now". Currently you have to do an rsync by hand, wait for the nightly mirror, or call "run" and mirror everything on that server.

Observations

I found myself wishing I could programmatically store values back into an LCFG spanning map, so that other hosts could interrogate those resources. It would be nice if LCFG came with a shared memory space that all clients could access and manipulate (securely I suppose). One for the LCFG wish list!

Requests and Suggestions

Ideas from George:

  • Have a "profile.group" resource so that any reports could be clustered together, or just easier to spot.
  • "to tape" shouldn't be the default, each "mirror this" should be explicit as to tape or no tape.
  • any changes to rsync can't interrupt other uses of it on that machine.
  • The servers could publish to a spanning map, saying they are a mirror server, and the clients could subscribe to that map.
  • would be happy with a daily report emailed, or a web page showing the state of all mirrors (with the group info suggested above added)

Ideas from Alastair:

  • would be good to allow mirroring of a subset of an rsync module tree, or even listed subsets. eg if the rsync client has a module called "data" which equates to, say, /disk/data on that machine, then you want to be able to have the server mirror only subdirs of that module, ie do the equivalent of:

rsync client::data/subdir1 /disk/mirror/client/subdir1
rsync client::data/subdir2 /disk/mirror/client/subdir2

I think that should be simple enough via direct manipulation of rmirror resources, but perhaps not so simple to provide via a "MIRROR_THIS" macro that you'd only use on the client.

  • (optional) email notifications of the status of a mirror per target and per server run.
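For the subset mirroring suggested above, the per-subdir rsync invocations could be generated from a list of subdirs, roughly as in this Python sketch. The function name and the <dest_root>/<client>/<subdir> destination layout are assumptions taken from the shape of the example commands. No rsync options are added because the example gives none; a real run would need whatever options rmirror already uses.

```python
def subset_commands(client, module, subdirs, dest_root):
    """Build one rsync argument list per subdir of an rsync module.

    Mirrors <client>::<module>/<subdir> into
    <dest_root>/<client>/<subdir>, as in the example above.
    """
    commands = []
    for sub in subdirs:
        source = f"{client}::{module}/{sub}"
        dest = f"{dest_root}/{client}/{sub}"
        commands.append(["rsync", source, dest])
    return commands
```

Calling subset_commands("client", "data", ["subdir1", "subdir2"], "/disk/mirror") produces the two commands shown in the example, as argument lists suitable for passing to subprocess.run.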

Ideas from Stephen:

  • Nagios monitoring

A snag with this might be the annoying mails it would generate. If all was good up until a mirror run at 11pm, and the mirror of that client then fails at 11:30pm, you'd have 9 hours of nagios alerts until someone got in the following day. There was some suggestion that this is just a matter of tuning the polling time for nagios.

As success and failure recording was going to be something I'd be looking at anyway, I'm going to ignore specific nagios work for now. Once the main work is done, we can see what can be done to expose that info via nagios.

-- NeilBrown - 20 Sep 2010

What about self-managed machine mirrors?

Topic revision: r7 - 01 Nov 2011 - 23:37:27 - NeilBrown
 