Meeting:2014/11/20


Meeting (to be) held on 2014/11/20 at 18:00 in Zepler CLS Lecture Room

Previous meeting: 16 October 2014 18:00:00
Next meeting: 7 January 2015 19:00:00


Agenda

Icinga Alerts

  • EAPOL-* checks
    • Should we change them so the checks are direct?
    • We probably need one check via JANET, plus direct checks against each server for ECS and SOTON
  • Node NFSEN checks
    • Temporarily disabled whilst we sort out problems with nfcapd
  • Node SYSLOG checks
    • The package upgrade should have fixed this issue. Other nodes will need upgrading as they come online. We will be warned if more than one process is running on a node, which tells us either that there is a problem or that the node needs upgrading. The max connections limit on the syslog server has been removed.
  • Node SSH-NODE-PASSWORD checks
    • Do we need the SSH check on nodes as well?
      • No, we should make the SSH check passive and send a critical check result via NSCA when the SSH-NODE-PASSWORD check fails because it cannot connect to the node.
    • How frequent should these checks be?
    • Why does Dropbear (SSH) keep dying on the Carnation Road node? Can we produce an upgrade that checks the status of Dropbear and (re)starts it if necessary?
      • morse will patch SSH on #263 to see if we can figure out what is wrong.
      • If the problem turns out to be something we cannot fix we will add a hook to check and restart SSH when necessary.
  • BACKUP3/BACKUPTRANSFER check
    • Any ideas on how to make this fail less often?
    • BACKUP2/BACKUPTRANSFER also went critical yesterday
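
As a rough illustration of the passive SSH check proposed above, the sketch below builds the tab-separated line that send_nsca expects on standard input (host, service, return code, plugin output). It is only a sketch under stated assumptions: the service name "SSH", the host name "node263", the output strings and the function name ssh_passive_line are all hypothetical, and submitting an OK result when the connection succeeds is an assumption added so that the service can recover.

```python
# Standard Nagios/Icinga plugin return codes used here.
OK, CRITICAL = 0, 2


def ssh_passive_line(host, connected, service="SSH"):
    """Build one send_nsca input line for the passive SSH service.

    send_nsca reads tab-separated lines of the form:
    host<TAB>service<TAB>return code<TAB>plugin output<NEWLINE>

    `connected` says whether SSH-NODE-PASSWORD managed to reach the node.
    """
    if connected:
        # Assumption: an OK result is also submitted so the service recovers.
        code, output = OK, "SSH-NODE-PASSWORD connected to the node"
    else:
        code, output = CRITICAL, "SSH-NODE-PASSWORD could not connect to the node"
    return f"{host}\t{service}\t{code}\t{output}\n"


if __name__ == "__main__":
    # "node263" is a made-up host name used only for illustration.
    print(ssh_passive_line("node263", connected=False), end="")
```

The resulting line would then be piped to send_nsca by whatever wrapper runs the SSH-NODE-PASSWORD check; how that wrapper detects the connection failure is left open here.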

Building Zepler / SOWN@coordinates nodes


Todo List

AOB

  • None at present.

Facts about "2014/11/20"
Has date: 20 November 2014 18:00:00
Has end date: 20 November 2014 20:00:00
Has location: 59/1257