A Place to Dump random useful bits of Information

How to give one user exclusive use of a printer

This came up because Alex Judd needed to print to special paper to make badges. She didn't want anyone else to print to her paper!

CUPS: now that we are using CUPS, on the CUPS server (astrotype at the time) run lpadmin -p if513m0 -u allow:neilb to allow only neilb to print to if513m0. To allow access to everyone, run lpadmin -p if513m0 -u allow:all. See http://www.cups.org/doc-1.1/sam.html#4_3_5
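A minimal sketch of the lpadmin usage described above, written as a dry-run helper so the command can be checked before it is run as root on the CUPS server. The function name printer_access_cmd is hypothetical; the printer and user names are the ones from the example. It relies only on lpadmin's -p (printer) and -u (user access) options.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the lpadmin command that limits a
# printer to one user, or opens it to everyone when the user is "all".
printer_access_cmd() {
    printer=$1
    user=$2
    echo "lpadmin -p $printer -u allow:$user"
}

# Example, on the CUPS server:
#   printer_access_cmd if513m0 neilb        # show the restricting command
#   printer_access_cmd if513m0 all          # show the command that reopens it
#   printer_access_cmd if513m0 neilb | sh   # actually apply it
```

Printing the command first makes it easy to paste into a ticket or check the printer name before anything changes.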

LPRng: this is for the old LPRng setup, and is probably redundant now.
After a bit of head scratching, the hacky answer seems to be to edit /etc/lpd/lpd.perms, go to the bottom of the file, and add a REJECT line just above the automatically generated section:


REJECT SERVICE=R,P PRINTER=at10c NOT REMOTEUSER=ajudd      <- additional restriction goes here!!!!

## lines below here are generated automatically from caps data for .inf ##
ACCEPT SERVICE=C,M REMOTEGROUP=@printing/all/manage
ACCEPT SERVICE=R,P REMOTEGROUP=@printing/all/print
ACCEPT SERVICE=R,P PRINTER=at9c,at10c,at12c,at13c REMOTEGROUP=@printing/colour/print
ACCEPT SERVICE=R,P PRINTER=at1,at2,at3,at4,at5,at6,at6b,at7,at8,at11,at14,at15 REMOTEGROUP=@printing/public/print
ACCEPT SERVICE=R,P PRINTER=at9c,at10c,at12c,at13c,at16 REMOTEGROUP=@printing/restricted/print

This stops everyone except Alex Judd from printing to printer at10c. Note it needs to go before the very last section to have any effect, i.e. if any earlier ACCEPT line allows people to print, then the REJECT will not be effective.

kill -HUP the parent lpd process to get it to re-read its configuration. The next time the LPRng component is stopped, started, configured or run, the lpd.perms file will be regenerated and your hand edit undone.

Printer Problems

See ServicesUnitPrintingProblems for some notes on solving some common printer problems.

Restarting AMD

See the RestartingAMD topic.

How To Make a New ATAbeast Volume Available on Pegasus and Hippocampus (re-scan LUNs)

Pegasus and Hippocampus have a Qlogic 2200 series card rather than the more standard Qlogic 2300 series. If you create a new volume on a fibre connected disk (ATAbeast) and want it to become visible without a reboot, you need to do:-

  • cfgadm -al to list the possible targets
  • cfgadm -c configure ap-id
  • devfsadm

In our case, (on hippocampus) the ap-id was c4::5000402001e8090c, which appears in the output of cfgadm -al. This can be confirmed by looking at the disk list in format, for example :-

18. c4t5000402001E8090Cd17 <NEXSAN-ATAbeastF-8r41 cyl 26832 alt 2 hd 128 sec 128>
          /pci@8,600000/SUNW,qlc@1/fp@0,0/ssd@w5000402001e8090c,11

The ap-id is made up of the controller number and the target address. We can see that this confirms it is mounted from NEXSAN-ATAbeastF. The output of cfgadm -al does not mention the device :-

bash-2.05# /usr/sbin/cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t6d0                 CD-ROM       connected    configured   unknown
c1                             fc-private   connected    configured   unknown
c1::500000e0104109a1           disk         connected    configured   unknown
c1::500000e010419b61           disk         connected    configured   unknown
c2                             scsi-bus     connected    unconfigured unknown
c4                             fc-fabric    connected    configured   unknown
c4::210000e08b10f211           unknown      connected    unconfigured unknown
c4::210000e08b131d6a           unknown      connected    unconfigured unknown
c4::210000e08b132cb4           unknown      connected    unconfigured unknown
c4::5000402001e8090c           disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb0/4                         unknown      empty        unconfigured ok
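The relationship between the device name shown by format and the ap-id used by cfgadm can be sketched as a small helper. This is an illustration only: the function name apid_from_device is hypothetical, and it assumes device names of the form c<controller>t<16-hex-digit target>d<lun>, as in the format listing above.

```shell
#!/bin/sh
# Hypothetical helper: derive the cfgadm ap-id (controller::target) from a
# format-style device name such as c4t5000402001E8090Cd17.
# Prints nothing if the name does not have a 16-hex-digit target address.
apid_from_device() {
    echo "$1" |
        sed -n 's/^\(c[0-9][0-9]*\)t\([0-9A-Fa-f]\{16\}\)d[0-9][0-9]*$/\1::\2/p' |
        tr 'A-F' 'a-f'
}

# Sketch of the rescan, to be run as root on the Solaris host:
#   cfgadm -al
#   cfgadm -c configure "$(apid_from_device c4t5000402001E8090Cd17)"
#   devfsadm
```

The lower-casing matches the way cfgadm -al prints the target address in the output above.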

Note that if you increase the size of a volume at the ATAbeast end while the Solaris end already has a disk label for its old size, it appears that the Sun will not pick up the new size until you delete this label. To get this to work, we deleted the existing (empty in our case) volume on the ATAbeast and created a new one. After the cfgadm and devfsadm, the Sun (format) could see the new size. It would be nice if you could 'grow' a volume on the ATAbeast and use the new size without losing the data!

How to install missing Solaris packages

Hopefully this won't be needed once the Solaris package management has been sorted, but occasionally a Sun will uninstall packages that it needs! Usually this happens at boot time, but could happen if updaterpms is run at the wrong time.

We had an incident where har lost its AFS packages (i.e. AFS wasn't working). The first thing to do was to discover the packages that should be put back. In this case we logged into a Sun that was still running AFS and looked through all its installed packages for AFS:

bash-2.05# pkginfo -l | grep -i afs
   PKGINST:  INFafs
      NAME:  AFS server binaries and associated files for Solaris
   PKGINST:  afsutils
      NAME:  Some utility scripts to help maintaining the AFS file system

Then find the packages in /repository/ on the Sun (pezenas:/disk/rpms/master/packages at the time of writing):

bash-2.05# cd /repository
bash-2.05# ls
LCFG   MACOS  SFW    SUNW
bash-2.05# ls */INFafs*
SFW/INFafs-1.3.85-1.pkg.gz  SFW/INFafs-1.4.0.pkg.gz
SFW/INFafs-1.3.85-2.pkg.gz  SFW/INFafs-1.4.1-0.pkg.gz
SFW/INFafs-1.3.85-3.pkg.gz  SFW/INFafs-1.4.1-1.pkg.gz
SFW/INFafs-1.4.0-1.pkg.gz   SFW/INFafs-1.4.4-1.pkg.gz

In this case we have a few to choose from, so look at what the still working Sun is using:

bash-2.05# pkginfo -l INFafs
   PKGINST:  INFafs
      NAME:  AFS server binaries and associated files for Solaris
  CATEGORY:  system
      ARCH:  sun4u
   VERSION:  1.4.4-1

You now need to gunzip the package file to somewhere with enough space, so that you can then run pkgadd on it, e.g.

gunzip < SFW/INFafs-1.4.4-1.pkg.gz > /opt/INFafs-1.4.4-1.pkg
pkgadd -d /opt/INFafs-1.4.4-1.pkg
The following packages are available:
  1  INFafs     AFS server binaries and associated files for Solaris
                (sun4u) 1.4.4-1

Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]: 

Proceed to install the package, taking care to note what files (or ownerships) it might warn about changing.
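The lookup steps above (find the version on a working Sun, then pick the matching file in /repository) can be sketched with two small helpers. The function names pkg_version and pkg_file are hypothetical, and the <name>-<version>.pkg.gz pattern is inferred from the directory listing above rather than from any documented convention.

```shell
#!/bin/sh
# Hypothetical helper: read `pkginfo -l <pkg>` output on stdin and print
# the value of its VERSION field.
pkg_version() {
    awk '$1 == "VERSION:" { print $2; exit }'
}

# Hypothetical helper: build the repository file name for a package name
# and version, following the pattern seen in the ls output above.
pkg_file() {
    echo "$1-$2.pkg.gz"
}

# Example:
#   on the still-working Sun:  v=$(pkginfo -l INFafs | pkg_version)
#   on the broken Sun:         ls /repository/*/"$(pkg_file INFafs "$v")"
```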

Once you've installed the packages and got what was broken going again, you'll probably want to find out what caused the packages to be uninstalled in the first place so it doesn't happen again!

Some Notes on Roombooking

See ServicesUnitRBS for the new RBS, and ServicesUnitRoomBooking for the previous (no longer in use) Shezhu.

Server UPS Status Page

Is here :- http://netmon.inf.ed.ac.uk/cgi-bin/upsstats.cgi

Conference Kiosk

It's debatable if this is our problem, but as I touched it last - see here ConferenceKiosk

ISDD/OSDB Informatics Download Database

Neil wrote this, but I think officially RAT look after it. Bare minimum stuff. The software is all on the main informatics web server (currently wafer) in CVSROOT/web/research/isdd/ and the data is stored in a postgres DB also running on that machine. Sub dirs of that isdd contain useful admin scripts and tools. The code is the documentation!

I found some historical info in ~neilb/work/dice/web/osdb/. I probably also have an email or web page that I sent to John B. when he took over looking after it years ago.

jabber service

Not much to say here, but one little nugget: if the jabber service dies, or you reboot it, you need to tell the nagios notification bot to rejoin. Do this by running om jnotify stop/start on the nagios server (and backup nagios server).

You don't need any specific roles/entitlements to access jabber, but some rooms have individual config "/config" in an existing chat. eg the "cos" room is invite only. You do need the jabber/muc/cos entitlement to see the logs on the web though.

Backup Related

Mirror Service

This is currently undergoing a revamp, see ServicesUnitMirrorService

Cleaning up if incremental Networker backups get stuck

I probably should have emphasised that if either the afs backups or the inf backups ever report that they cannot start because there is already a backup running then there is a real problem that needs sorting.

Normally the problem is that afs on one (or more) of the Sun(s) has got stuck and a backup of the / partition has wedged. In this case, I suspect it had to do with the way the bpbeast failed, leaving the save process talking to a disk that wasn't responding.

To clear these problems

  • fix underlying problem (usually amd on Sun)
  • on ouroboros go to nwadmin and group control and stop the backup that is wedged
  • if no other backups are running, stop the networker daemon (/etc/init.d/networker stop) on backup client
  • check for save processes on the backup client that haven't died and kill them (-9 if needs be)
  • restart the networker daemon

It's also worth repeating steps 3 to 5 on ouroboros if it looks as if there are stuck processes there too. Again, make sure that there isn't a backup (legitimately) running before stopping the networker daemon.

For full backups (BP1, ATFH) it is possible that the backup will not start because there is already a backup running, and this is not an error: it merely means that I have only labeled enough tapes to fill during the overnight window and am planning to label another tape at a suitable point. Check with me before killing these full backups!

At the moment, backup reports are going to services unit and Lindsay, so support will not have noticed. The clue to the failure was both in the "backup failed to start because another backup is already running" messages and in the dwindling entries in the reports from the servers.

How to do Networker restores if an out of sequence error is reported

This can happen if the tape you are restoring from is faulty or if for some reason you are starting the read after the start of the data. Networker will not try to put this data back into the filesystem, but will instead put it into a stream file with a name something like nsrscan.4149307220.000001. What it's doing here is reading the data streams and stitching them back together, but not converting into unix filesystem format. Make sure you have plenty of space and save the stream to disk. You can then 'unpack' the stream using something like


/usr/lib/nsr/uasm -r -m /disk=/mnt/TEMPRESTORE  < /partition/ptn157/RESTORE/nsrscan.4149307220.000001

Where the -m tells uasm to replace /disk at the start of the filesystem name with the desired location, in this case a big chunk of disk mounted under /mnt/TEMPRESTORE, i.e. /disk/a/b/c would go to /mnt/TEMPRESTORE/a/b/c
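The prefix replacement that -m asks for can be illustrated with a short function. This only mimics the mapping described above; it does not call uasm. The function name remap_path is hypothetical, and leaving non-matching paths unchanged is an assumption of this sketch, not documented uasm behaviour.

```shell
#!/bin/sh
# Hypothetical illustration of "-m <src>=<dst>": swap a leading <src>
# directory for <dst>. Paths outside <src> are echoed unchanged here.
remap_path() {
    src=$1
    dst=$2
    path=$3
    case $path in
        "$src"/*)
            rest=${path#"$src"}
            echo "$dst$rest"
            ;;
        *)
            echo "$path"
            ;;
    esac
}

# Example:
#   remap_path /disk /mnt/TEMPRESTORE /disk/a/b/c
```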

How To Start a New Month's Retrospect Backup Set

This should happen for free at the beginning of each month, BUT, if for some reason it doesn't, you can fix this by the following steps :-

  • Go to the configure/Backup sets menu (stop the backup server to get to the main menu)
  • Select the most appropriate backup set from the drop down list. This is the current backup set, and should be listed under Monthly (last 6 months) folder.
  • Click on configure
  • A new menu will pop up, select the options tab and then click on actions
  • Select New Media
  • Edit the name of the backup set if needs be. For some reason when I tried, it came up with the next but one in the sequence and not the next!
  • That should be it. Close the Windows and restart the server from the Run menu.

The reason that a new tape set is started each month is that the script overnight (mac default) has a New Media backup scheduled for the first Friday of each month (look at its schedule to see this). The reason this script didn't run was a combination of forgetting to change the tapes and a power cut. When I powered the machine back on, it would not allow me to stop the backup server without canceling the pending scripts that were waiting for new tapes to appear.

For information on the retrospect backups, see the old backups docs and in particular, here is the tape recycling bit:

The Retrospect backup server is connected to an HP autoloader, which houses a Quantum DLT 8000 tape drive. Typically the monthly backups use two type IV tapes and one type III tape. New tapes are required on the first Friday of each month. Tapes are kept for six months before being recycled and the annual backups (done in September) are kept forever. Recycled tapes should be erased prior to their use.

Note tapes are labeled <sequence-number>-<year>[<month-code>] eg 2-2007[003] meaning the second tape in March 2007 backup set.
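A small sketch of the labeling scheme above. The function name tape_label is hypothetical, and the month code is assumed to be the month number zero-padded to three digits, which is inferred from the single 2-2007[003] example.

```shell
#!/bin/sh
# Hypothetical helper: build a tape label of the form
# <sequence-number>-<year>[<month-code>].
# Pass the month as a plain number (3, not 03), since printf treats a
# leading zero as octal.
tape_label() {
    printf '%d-%d[%03d]\n' "$1" "$2" "$3"
}

# Example:
#   tape_label 2 2007 3
```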

How To Start a New Year's Retrospect Backup Set

This seemed to work, not sure how many of the steps were needed.

From Retrospect Control panel (Retrospect Directory) on the Configure tab click, Backup Sets:

Set Type: Tape
Skip Security
Data storage: Tick hardware compression
Name: 2011 [001] (or whatever year it is)
Click New, then in Choose a Folder, put it in G4TA-01/Library/Preferences/Retrospect/ (which should be the default) then Save.
New Set will appear at top of the backup list, drag it down to the Monthly (last 6 months) section.

Then from Automate tab do Scripts -> Common Catalog go through the dozen or so scripts, eg Daytime (laptops etc), Overnight (PC Default) etc, and do Edit -> Change destination to the new backupset eg 2011 [001]. Note you'll have to remove the previous destination after you've added the new one.

That's it?

Notes on installing a new Superloader

The front panel passwords come as 000000 by default. If you want to use the web interface on a brand new unit, the default user and password are both set to guest.

Before you can use the web interface, you need to configure the networking to pick up its address by DHCP. This can be done from the configure menu on the front panel.

If you need to configure the device from networker using /usr/sbin/nsr/jbconfig, first make sure you have configured the superloader with only one cartridge (in our case left) and restart. If you don't do this, it will report that it is a 16 slot device and networker licensing will fail! You can edit the usable slots to be 1 to 8 from the GUI, but this won't help. As long as the superloader is configured with only one cartridge before you run jbconfig, it will report itself as an 8 slot device.

Neo 8000 Tape Library

The Neo 8000 replaced the old SunStore L180 at JCMB in July 2010. Some information about it can be found on the ServicesUnitNeo8000 topic.

Web Related

Switching over to the www.inf.ed.ac.uk DR machine

If www.inf is off the air for some reason, you can switch to a (readonly) offsite copy by following the instructions at ServiceUnitWwwOffsiteDrPlan

Switching over to the web.inf.ed.ac.uk DR machine

If you need to switch to the offsite version of web.inf.ed.ac.uk, then see WebInfEdAcUKDisasterRecovery

Notes on redirecting an ex-student's homepages URL to their new staff homepages

RT 31479 asks :-


Could you please set up a forward for Sasha's old student homepages URL
http://homepages.inf.ed.ac.uk/s0199920 to now point to her staff pages
http://homepages.inf.ed.ac.uk/scalhoun/ 

Redirects like this are done in the apache config for homepages, in this case the file:

/public/homepages/homepages-data/conf/homepages.conf

which is under RCS control on laney.inf. There is a comment right at the beginning of the file to say what to look for later on in the file. You'll find a section with all the other sMATRIC to UUN redirects, e.g.:

RewriteRule   ^/s9903543(/?|/.*)$    /pcrook$1 [R=permanent,L]

Just add a new rule for Sasha.
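Following the pattern of the existing rule, a new line can be generated rather than typed by hand, which avoids slips in the regular expression. This is a sketch only: the function name homepage_redirect is hypothetical, and it simply reproduces the layout of the s9903543 rule above with the two names from RT 31479.

```shell
#!/bin/sh
# Hypothetical helper: print a RewriteRule in the same form as the
# existing sMATRIC to UUN redirects.
# Arguments: old student directory name, new staff directory name.
homepage_redirect() {
    printf 'RewriteRule   ^/%s(/?|/.*)$    /%s$1 [R=permanent,L]\n' "$1" "$2"
}

# Example (paste the output into the redirects section of homepages.conf):
#   homepage_redirect s0199920 scalhoun
```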

See also :- http://www.dice.inf.ed.ac.uk/units/services/info/homepages.html

Do a "/usr/sbin/apachectl configtest" to check you didn't make a typo, and if all is well, then restart apache with "om apache restart"

Some notes on managing the main website

See the ServicesUnitWebNotes topic. Note there is now a Zope/Plone instance on www.inf. Look in the infweb.conf apache config to see which bits are served by Plone. www.inf/ZOPE will get you to the main instance, which uses the same admin username and password as the wcms.inf service.

WCMS aka Plone site creation

See the ServicesUnitPloneWCMSManagement topic.

groups.inf and related sites

groups.inf (and conferences.inf, workshops.inf, events.inf) See http://www.dice.inf.ed.ac.uk/units/services/docs/groups.html

Plone Tips

See the ServicesUnitPloneTips topic.

WordPress Notes

Capability Access Restriction on Wiki

Simply use TheFooBarCapability in place of a WikiName or NameOfGroup to match the cap the/foo/bar. See TWikiCapabilities.

Web server being hammered by someone

We really need to look at installing the equivalent of mod_throttle or something to stop this from happening, but in the meantime we are dealing with the occasional occurrence by hand. The basic principle is to identify the IP address the requests are coming from and alter the apache config to deny it access. More details on ServicesUnitStopWebDOS.

Setup Informatics EASE web site

See SetupEASEWebSite

ISDD (or OSDB of old)

Back in the days of "teams" rather than "unit" this was a web team thing. Then it moved to RAT, where it may still be, but the only little bit of documentation's become orphaned, so I'm linking to it here http://www.dice.inf.ed.ac.uk/doc/osdb.html

dunkvm - aka informatics ventures web sites

See InformaticsVenturesWeb

Mail Related

Converting MBX format mailfolder

As we no longer host user inboxes, then this is unlikely to be needed anymore:

To convert from MBX to plain UNIX format mail boxes, use the mailutil command. Use full paths to avoid disappointment. eg:
mailutil copy -v /home/uun/Mail/INBOX \#driver.unix/home/uun/Mail/unix/INBOX

Note the backslash before the # (or you could enclose the argument in quotes).

How To Create a Shared Mailbox

We no longer host shared mailboxes, but IS do. See SharedMailbox topic.

Legacy and "guest domain" email forwarding, eg @dcs, @sicsa or @lcfg.org addresses

The legacy mail (@dcs, @dai, etc) and the hosting of mail for research type domains, eg lcfg.org, or informatics-ventures.com are done on the "virtual mail relay" (VMR) machine (virtualrelay.inf.ed.ac.uk). Currently (June 2013) this is the machine beeknow.inf.

All the VMR does is accept mail for a given address and forward it on to a current live address, e.g. neilb@dcs -> neilb@infREMOVE_THIS. The forwarding is done in the file /opt/mail/virtusertable, which is a bit like a regular aliases file, but the format is slightly different as you have to include the domain, and it is space separated rather than : (colon) separated.

The forwards are grouped together by domain. If adding a new forward, please add it next to the existing forwards for that domain.

The /opt/mail/virtusertable is under RCS control, and you probably want to read the comments at the start of it. After making changes, run "make".

To test your change has worked, run /usr/sbin/sendmail -bv address@domain.com to check it is going to forward as expected.

If changing entries for someone, you could offer to remove one or all of their entries completely if they are just a source of spam.

If you want to add a new domain to forward for, ask services unit, as DNS, IS and other files need to be informed and updated.

SMS email forwarding

The sMATRIC@inf to sMATRIC@sms is a sendmail rule, but overrides can be created by editing:

flakey:/opt/sendmail/aliases-sms

again under RCS control, and run "make" after any changes.

Adding and deleting mailman lists

See http://www.dice.inf.ed.ac.uk/doc/mail/mailaliases.html

Unblocking Mailman

Sometimes messages to mailman lists may be blocked because of a rogue mailman process. To fix this, do the following:

  • on flakey.inf (the current mailman host), check contents of /var/spool/mailman/out - any more than a few transient files here confirms that outgoing mailman messages are blocked.
  • stop mailman with "/etc/init.d/mailman stop", which should kill all but the rogue process (probably "qrunner --runner=OutgoingRunner"), and the controlling process, mailmanctl.
  • kill the qrunner process - the controlling mailmanctl process will spawn a new instance, which should then start delivering the blocked mail
  • check contents of /var/spool/mailman/out again, and, when all contents have disappeared...
  • kill the controlling process, mailmanctl.
  • restart daemon with "/etc/init.d/mailman start".
  • check all mailman processes look OK (there should be nine processes in all)
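The queue check in the first and fourth steps above can be sketched as a pair of helpers. The function names and the default threshold of 5 are hypothetical; the notes only say "more than a few" files, so pick a number that suits the host.

```shell
#!/bin/sh
# Hypothetical helper: count the regular files waiting in a queue directory.
queue_count() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Hypothetical helper: succeed if the queue holds more files than the
# threshold (second argument, default 5, an arbitrary choice).
queue_blocked() {
    [ "$(queue_count "$1")" -gt "${2:-5}" ]
}

# Example, on the mailman host:
#   queue_blocked /var/spool/mailman/out && echo "outgoing queue looks blocked"
#   while [ "$(queue_count /var/spool/mailman/out)" -gt 0 ]; do sleep 10; done
```

The loop in the example waits for the queue to drain before the controlling mailmanctl process is killed.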

Dealing with Mailman bounces

Information on what we should do when mailman notifies us that a list member has generated excessive bounces. The details are at ServicesUnitMailmanBounceNotifications.

Mailman sibling lists

Mailman sibling lists are probably the solution when someone asks for a list whose membership is made up of other lists, but there are gotchas. See MailmanSiblingLists

mailman cron checkdbs errors

These happen when the cron job that mails list admins their list of pending approvals runs. If one of those pending emails contains odd characters (usually spam) that it can't deal with, the script checkdbs bombs out. The bare minimum to do is run /disk/mailraid/scripts/checkdbs-test as mailman, deduce the problem message, and remove it from the pending list for the affected list. More details on http://www.dice.inf.ed.ac.uk/units/services/info/mailman.html (though they were written a while ago).

countmail script for ITO publicity tracking

In RT 43098 Diana asks if we could keep track of how many mails have been sent to a particular address. They could do this themselves by just keeping the messages and counting them via a filter, but I've knocked up a script /disk/mailraid/scripts/countmail which expects to be called as a pipe from sendmail and takes one argument, a tag saying which counter to increment. This updates a file /opt/apache/html/default/count.txt, which appears on the web so that those that know can view it.

To use it do something like this in the aliases file, before:

phd-admissions+pp: infphds+pp@ed
after
phd-admissions+pp: infphds+pp@ed, "|/disk/mailraid/scripts/countmail phd-pp"
So the mail continues to go where it did before, plus it increments the phd-pp counter in the data file.

Note that the file component has been updated to create the necessary /etc/smrsh/countmail link into /disk/mailraid/scripts/, and that the data file needs to be writeable by daemon:mail, or it won't work.
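For anyone who has not seen this kind of pipe script, here is a hypothetical sketch of how a per-tag counter could be kept. It is not the real countmail script, which is not reproduced on this page. The function name bump_counter and the data file format (one "tag count" pair per line) are assumptions made purely for illustration.

```shell
#!/bin/sh
# Hypothetical sketch only, NOT the actual countmail script.
# Assumed data file format: one "tag count" pair per line.
# Arguments: data file, tag whose counter should be incremented.
bump_counter() {
    file=$1
    tag=$2
    tmp="$file.tmp.$$"
    [ -f "$file" ] || : > "$file"
    awk -v tag="$tag" '
        $1 == tag { print $1, $2 + 1; seen = 1; next }
        { print }
        END { if (!seen) print tag, 1 }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# A pipe script built on this would discard the incoming message and then
# bump the counter named by its argument, along the lines of:
#   cat > /dev/null
#   bump_counter /path/to/count.txt "$1"
```

This sketch does no locking, so two messages arriving at the same moment could lose an increment; that may or may not matter for rough publicity tracking.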

callremctl script

See CallRemctlScript

smtp fail2ban

fail2ban is running on the smtp server. This will throttle attempts by the bad guys to hack passwords, but won't stop them. You can list currently banned IPs with:

root: fail2ban-client status authsmtp
Status for the jail: authsmtp
|- Filter
|  |- Currently failed: 13
|  |- Total failed:     5104
|  `- File list:        /var/lcfg/log/mail.log
`- Actions
   |- Currently banned: 10
   |- Total banned:     512
   `- Banned IP list:   110.138...
and manually add an IP with
fail2ban-client -vvv set authsmtp banip A.B.C.D
The -vvv just turns up verbosity - see https://www.fail2ban.org/wiki/index.php/Commands
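If you only want the number of currently banned addresses, for example to watch it over time, the status output above can be filtered. The function name banned_count is hypothetical, and the sketch assumes the "Currently banned:" line keeps the layout shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: read `fail2ban-client status <jail>` output on stdin
# and print the number from the "Currently banned:" line.
banned_count() {
    sed -n 's/.*Currently banned:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example, as root on the smtp server:
#   fail2ban-client status authsmtp | banned_count
```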

@ed email addresses

There is some confusion over @ed email addresses. See IS page https://www.ed.ac.uk/information-services/computing/comms-and-collab/email/directories/directorypolicy However I believe that everyone has a UUN@edREMOVE_THIS.ac.uk address that forwards to their primary Uni email service. Kenny MacDonald confirmed that "uun@ed thing is aimed at making systems work better,". Unidesk I181106-0739 also confirms UUN@ed is a thing now.

File server related

How to set up a new AFS file server

  • Add the line afs.server    true to the new server's profile
  • Copy the contents of the /usr/afs/etc directory on an existing server to the new server in the most secure manner you can envisage.
  • On the new server run the afs-server init script
/etc/init.d/afs-server start
  • Run bos status <newservername> on any DICE machine. If it returns without any error messages, bosserver has successfully started on the new server

  • On any dice machine using your AFS admin credentials, run the following command to start up the file server:
 
bos create -server <newservername> -instance fs -type fs -cmd \
"/usr/afs/bin/fileserver -L -p 23 -rxpck 400 -busyat 600 -s 1200 -l 1200 -cb 65535 -b 240 -vc 1200 -allow-dotted-principals" \
"/usr/afs/bin/volserver -p 16 -allow-dotted-principals" \
"/usr/afs/bin/salvager -parallel all5"

  • run bos status <newservername> once more. it should return:
Instance fs, currently running normally.
    Auxiliary status is: file server running.

Using local disk on a server

Some notes on preparing and using local disk on a server ServicesLocalDiskServer

Topic revision: r62 - 09 Aug 2019 - 14:10:15 - NeilBrown
 