Notes on running the various web services

This is a page for nuggets of wisdom, extracted directly from N. Brown's brain cells, on how to cope with various web issues affecting the main web server (though much of it is probably also applicable to the other servers).

Web server is not responding but the server appears to be up

Run ps -ef | grep httpd | wc -l. If the value you get back is close to or over 200, then the server is hitting the limit on the number of httpd daemons which can run simultaneously. This shouldn't normally happen, but can occur when someone is trying to download a PS or PDF file via Internet Explorer 5 running on Windows 98. This may be legitimate, but there is also an application for downloading entire web sites which reports itself as IE5 on Win98. Either way, we need to block whoever is doing this so that service can be restored.
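Incidentally, the grep in that pipeline can match its own process and inflate the count by one; a small variant avoids this:

    # The [h] stops grep from matching its own command line
    ps -ef | grep '[h]ttpd' | wc -l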

The above should be less likely now, as we're using the mod_limitipconn module, which limits the number of simultaneous connections from a single IP address.
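For reference, a minimal sketch of what that configuration looks like (the limit and the image exemption below are illustrative, not necessarily what we actually run):

    <IfModule mod_limitipconn.c>
        <Location />
            # At most 10 simultaneous connections per client IP
            MaxConnPerIP 10
            # Don't count image requests against the limit
            NoIPLimit image/*
        </Location>
    </IfModule>

Note that mod_limitipconn relies on the scoreboard, so mod_status must be loaded and ExtendedStatus On set for it to work.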

  1. Identify the site.
It may be possible to identify the offending site by looking at the apache.access logs for the relevant site, as in the sketch below. A better way is to go to http://www.inf.ed.ac.uk/server-status (for the main web server, of course), which should give a clearer view of what is connected and what they are downloading. Access to this page is limited to a few machines. If you wish to give a machine access to these pages, you need to edit the main .conf file in the apache conf directory.
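A quick way to pick the heaviest clients out of an access log (the log path here is a placeholder):

    # List the ten busiest client IPs in a site's access log
    awk '{print $1}' /path/to/apache.access | sort | uniq -c | sort -rn | head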

  2. Block the offending machine
There are a couple of ways to do this. The best way is probably to put a deny directive in the main server configuration. To locate this, do a qxprof apache.serverroot and a qxprof apache.config. If you want to be a bit more selective, you can use the conf directory for the individual sites instead. Do something like:
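A minimal sketch of the sort of directive that's meant, using Apache 2.2-style access control (the document root and client address below are placeholders):

    <Directory "/path/to/site/docroot">
        Order allow,deny
        Allow from all
        # Block the offending machine (placeholder address)
        Deny from 192.0.2.17
    </Directory>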


Don't use the <Location> tag, as it can lead to inadvertently opening up access that was previously restricted.

That should do the trick. Note that you will still see connection attempts in the access log, but they won't succeed.

Google

Neil and Gordon can access Google stats about www.inf (and can request Google to remove pages from its caches) via Google's webmaster tools. This is only because Neil has verified that he's a site admin for www.inf (which anyone with edit access to the www.inf homepage could do). Neil then invited Gordon to also be a site admin.

Verifying new site admins could be tricky now that the front page is redirected to Polopoly. We could perhaps configure the server not to redirect if the browser is detected as Google, so that we can add verification cookies if necessary. However, we'd have to watch that the regular Google crawler still gets redirected.
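If we went that route, a hedged mod_rewrite sketch (the user-agent pattern and the Polopoly URL below are assumptions, not something we've tested):

    RewriteEngine On
    # Hypothetical: let Google's site-verification fetcher see the old
    # front page, so verification material placed there is visible to it.
    # The ordinary Googlebot user agent doesn't match this pattern, so the
    # regular crawler is still redirected as before.
    RewriteCond %{HTTP_USER_AGENT} !Google-Site-Verification [NC]
    RewriteRule ^/?$ http://www.inf.ed.ac.uk/polopoly/ [R=302,L]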

-- CraigStrachan - 18 Mar 2008
