Final report for the User accessible login reports project (#254)

Background

In the report which investigated the root compromise of the Informatics SSH Service in 2011 one of the proposals for possible improvements (#18) was to "Consider profiling user behaviour". The idea was that it might be possible to develop a system that automatically identifies the profiles of "normal" login behaviour for users. It would then be able to flag up any potentially abnormal activity. As noted in the report this would probably be rather difficult and require considerable work, particularly in producing a sufficiently capable system which doesn't overload the Computing Team with false positives. It is also likely that it would have been difficult to gain acceptance from the members of the School for running such a system.

As an alternative to this proposal a project plan was formed around the idea of providing all the essential information to the users so that they could do the necessary checking of login activity for their DICE account. They know what their "normal" activity is like and have a much better chance of spotting any unusual logins at peculiar times or from different places.

Although the necessary login information is stored in syslog files for all DICE machines, until recently it was not in an easily accessible format. This situation changed in 2012 with the introduction of the BuzzSaw database as part of the System Security Enhancements project (#224). We now have the information from the syslog files regularly imported into a PostgreSQL database, which makes querying and report generation very straightforward.

The Project

The main aim of this project was to develop a web interface which would provide users access to all the login information we have stored for their DICE account. The intention was that the interface would highlight any events which needed particular attention (e.g. logins from outside of the Informatics network) to guide users. It had to be a fairly simple and intuitive layout which could be easily accessed from any web browser.
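As an illustration only, the highlighting rule described above (flag logins which come from outside the local network) could be sketched in Python along the following lines. The network ranges, the record layout and the function names are all assumptions made for the example; the ranges are reserved documentation prefixes, not the real Informatics address space, and this is not the code used in the project.

```python
import ipaddress

# Hypothetical "internal" ranges. These are reserved documentation
# prefixes, NOT the real Informatics network.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_external(address):
    """Return True if the address is outside every internal range."""
    ip = ipaddress.ip_address(address)
    return not any(ip in net for net in INTERNAL_NETWORKS)


def annotate(logins):
    """Return copies of the login records with a 'highlight' flag added.

    Each record is assumed to be a dict with a 'source' key holding
    an IP address string.
    """
    return [dict(login, highlight=is_external(login["source"]))
            for login in logins]
```

A template could then use the `highlight` flag to colour the rows which need particular attention.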

To get the project going quickly a prototype was rapidly developed with the explicit intention that its primary purpose would be to assist in the design of the interface. The code in the prototype was allowed to be hacky (in fact almost encouraged to be hacky...) and lack robustness as long as it delivered the necessary data from the database into the templating system so that ideas could be quickly tested and thrown away. This was all done safe in the knowledge that once we had a finalised design for the interface the code could be rewritten. This strategy worked very well and allowed a large number of iterations on the design to be done in a short space of time.

The first stage of testing was done with COs who we knew would be fairly demanding in their requirements and would have a large amount of data related to SSH and Cosign logins. Being able to show the real data made the test designs much more realistic and helped us find all sorts of data handling issues well before the interface was exposed to any real users. For any system with a user interface I would definitely recommend this as a very useful and productive approach as long as the prototype really is thrown away at the end and the code rewritten in a better, more robust style...

After launching the web interface we had some very encouraging feedback from users:

"Thanks for this - it's a useful security feature."

"Very nice and informative presentation of day-by-day authentication activity. Was easy to see that it was all OK."

We also had some suggestions for enhancements, e.g.

"A nice enhancement, which I expect many folk would appreciate, would be to make it possible to register 'known' addresses. At the moment connections from my home IP address are highlighted, at least twice a day, and that's . . . unhelpful -- false positives reduce people's willingness to scan for bogies. . ."

We also had several people suggest that we provide an alternate view of the data which grouped the login activity by source location (rather than by day of the month). This is very similar to the way the email reports are presented. This would definitely be a very useful enhancement and help expose any oddities amongst the data. We did not have sufficient time in this project to add that view but I think it would be highly beneficial and would probably not require more than a few days' work.

"Ideally, it would be nice if this page also listed the sources of log-ons, in one place, separate from the record of individual log-ons iyswim. My record shows a zillion logons from (address allocated by ISP) - naturally, since this is "home" - and it's cognitively burdensome to scroll down looking for things that are yellow but not that. E.g. I'd find it much easier to check my May data if at the top of the page it said something like (below) as then I'd immediately know there was no need to look at the detail."

This month there were logins from:
  - host1.example.org
  - foo.example.com
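The per-source summary suggested in that feedback could be produced with something like the sketch below. It is purely illustrative: the record layout (a dict with a `source` key) and the function names are assumptions, and the login count shown next to each host is an addition which the user's example did not include.

```python
from collections import Counter


def summarise_sources(logins):
    """Return (source, count) pairs, most frequent source first."""
    counts = Counter(login["source"] for login in logins)
    # Sort by descending count, then by name so the order is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def format_summary(logins):
    """Build a plain-text list of sources seen in the given logins."""
    lines = ["This month there were logins from:"]
    for source, count in summarise_sources(logins):
        lines.append(f"  - {source} ({count})")
    return "\n".join(lines)
```

The same grouping would also be the starting point for the alternate web view organised by source location.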

It was clear that most users would need regular reminders to check the login activity for their accounts but a simple "please check your logins" email was likely to be seen as nothing other than us nagging them and would get ignored after the first couple of months. To solve this problem we decided to create a tool for sending monthly emails which summarised the most important activity into a short high-density report that hopefully would be thoroughly checked by users even if they did not then go on to visit the web interface.
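A minimal sketch of how such a monthly summary message might be assembled with the Python standard library is shown below. The addresses, the subject line, the wording of the body and the function name are all hypothetical; the real tool and the layout of its reports are not reproduced here.

```python
from email.message import EmailMessage


def build_monthly_report(username, month, external_sources, total_logins):
    """Build (but do not send) a short summary email for one user.

    external_sources is assumed to map a source host to the number of
    logins seen from it during the month.
    """
    msg = EmailMessage()
    msg["From"] = "login-reports@example.org"   # hypothetical sender
    msg["To"] = f"{username}@example.org"       # hypothetical recipient
    msg["Subject"] = f"Login summary for {username}: {month}"

    body = [f"Total logins recorded in {month}: {total_logins}", ""]
    if external_sources:
        body.append("Logins from outside the local network:")
        for source in sorted(external_sources):
            body.append(f"  - {source} ({external_sources[source]})")
    else:
        body.append("No logins from outside the local network.")
    msg.set_content("\n".join(body))
    return msg
```

Keeping the body to a few dense lines follows the intention stated above: the report should be quick enough to read that users actually check it.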

Alongside the aim of creating the web interface and monthly email reports the intention was that this would be a useful project for improving my Python programming skills. There is no better way to really get to grips with a language than to use it for a real purpose. This added a lot to the time required (it would have been quicker if I had done it in Perl) but definitely helped to hugely improve my Python knowledge and skills. I feel that I am now in a position to seriously use Python for any future projects.

I also chose to use the Django web framework for this project. In terms of the design of the code I feel this was exactly the right thing to do, but learning the system probably added an extra week to the time taken for the project.

Django provides an exceedingly helpful object-relational mapping (ORM) framework to assist with querying databases. I would happily use this again if I need to create more web interfaces to database backends; it compares very favourably to the Perl DBIx::Class module.

Django also has a fairly nice templating system. It is similar to the Perl Template Toolkit (TT) but I think I slightly prefer the greater flexibility provided by TT. There is a philosophical difference of opinion here in the way templates are expected to work. Django puts the emphasis on doing all of the data mangling in the controller code and only using the template view for displaying data, which sometimes leads to a need to structure the data in slightly odd ways to make it easier to display.

I liked the way Django maps URLs onto view functions. This is much nicer than the Catalyst approach which uses subroutine attributes (a bit like the way some Python frameworks use decorators). I like having a single mapping file which shows all the routes from the user interface through to the code. The alternative is having to inspect each separate module (and each subroutine within it) to see what they handle.

I feel the biggest benefit of using a system like Django is that it is self-contained. It provides the ORM, the controller framework and the templating system, which means there is a high level of consistency and all the parts are well designed to work together. The Perl Catalyst system does a reasonable job of pulling together DBIx::Class and Template Toolkit but it lacks a certain amount of consistency because it aims to support the use of many different models and templates. Django also has excellent documentation and there are many good tutorials and examples available which demonstrate whole systems. This makes learning how to solve particular problems relatively straightforward.
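The appeal of a single mapping from URLs to view functions can be illustrated with a small standard-library sketch. This is deliberately not Django's own API: the `ROUTES` table, the `dispatch` function, the view names and the URL patterns are all hypothetical, and only show the general idea of keeping every route in one place.

```python
import re


def index_view(username):
    return f"overview for {username}"


def month_view(username, year, month):
    return f"logins for {username} in {year}-{month}"


# Every route lives in this one table: pattern on the left, view on the right.
ROUTES = [
    (re.compile(r"^$"), index_view),
    (re.compile(r"^(?P<year>\d{4})/(?P<month>\d{2})/$"), month_view),
]


def dispatch(path, username):
    """Call the first view whose pattern matches the path."""
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            # Named groups in the pattern become keyword arguments.
            return view(username, **match.groupdict())
    raise LookupError(f"no route for {path!r}")
```

With the decorator or attribute style the same information is spread across the modules which define the views, so finding out what handles a given URL means searching through each of them.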

Summary

This project has clearly demonstrated the benefits of having easy access to the important information stored in our system logs. The project has delivered a relatively simple system which improves the security of our entire network by helping our users play an active part in spotting a compromise of their account. The web interface has had some good feedback from the users which suggests that the system has been well received and that it will be used as intended. Similarly the responses to the first monthly email summary are encouraging so hopefully we have hit on the right approach to getting our users to check their login activity once in a while.

The project took 3 weeks (15 days) rather than the originally planned 2 weeks. The extra time is accounted for by the decision to use the Django framework, which meant there was more learning required. This project involved a large amount of personal development time and I now have a much better knowledge of Python and Django.

The benefits of using a rapid prototyping approach were clearly seen in this project. Giving our test users early access to a functional system meant we could try out many different ideas in quick succession and use the feedback immediately. We did not have to wait until the end of the project to discover if things worked well (or not). For future projects with user interfaces I will definitely use a similar approach again.

Future Work

  • Support hiding of internal sources in the web view
  • Alternate web view which groups login activity by source location rather than by day of month
  • Allow users to specify certain external hosts as being "trusted"

Documentation

Time Spent

Period     Hours
2012 T3    32
2013 T1    53
2013 T2    20

Total hours: 105 hours (15 days / 3 weeks)

-- StephenQuinney - 19 Jun 2013

Topic revision: r3 - 21 Jun 2013 - 11:19:27 - StephenQuinney
 