[Nagios-users] Old Checkfiles make nagios unusable
Andre Timmermann
2009-09-18 18:03:13 UTC
Hello list,

We are seeing very strange behaviour on our Nagios installations.

On one system, /var/lib/nagios3/spool/checkresults contained 29355 old
check result files, which made Nagios stop checking hosts. After a
restart the Nagios web interface was unusable ("Whoops...").

On another system I have 685 stale files:

***@nagios:/var/lib/nagios3/spool/checkresults # ls -la | head -n 10
total 2776
drwxr-x--- 2 nagios nagios 36864 Sep 18 19:25 .
drwxr-x--- 3 nagios nagios 4096 Mar 23 15:50 ..
-rw------- 1 nagios nagios 280 May 14 19:46 check0Bp5pw
-rw------- 1 nagios nagios 279 May 3 14:09 check0LTWle
-rw------- 1 nagios nagios 287 May 14 19:46 check0So2dK
-rw------- 1 nagios nagios 251 May 3 14:09 check0V8UXl
-rw------- 1 nagios nagios 252 May 3 14:09 check0Y4kYK
-rw------- 1 nagios nagios 252 May 3 14:09 check0cF56S
-rw------- 1 nagios nagios 295 May 3 14:02 check0csdub

***@nagios:/var/lib/nagios3/spool/checkresults # cat check0Bp5pw
### Active Check Result File ###
file_time=1242323204

### Nagios Service Check Result ###
# Time: Thu May 14 19:46:44 2009
host_name=testhost
service_description=http
check_type=0
check_options=0
scheduled_check=1
reschedule_check=1
latency=571.055000
start_time=1242323204.55059
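
The file_time field in the dump above is a Unix timestamp, so the age of
a check result file can be read from its contents rather than from the
mtime. A minimal sketch, assuming a POSIX shell and a `date` that
supports `+%s`; the function name checkresult_age_seconds is
hypothetical and not part of Nagios:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print the age in seconds of a
# check result file, based on its file_time= line.
# Usage: checkresult_age_seconds FILE [NOW_EPOCH]
checkresult_age_seconds() {
    file=$1
    # Optional second argument overrides "now"; `date +%s` is assumed to be
    # available (it is on GNU and BSD systems, but it is not strictly POSIX).
    now=${2:-$(date +%s)}
    # Take the first line of the form file_time=<digits>.
    file_time=$(sed -n 's/^file_time=\([0-9][0-9]*\)$/\1/p' "$file" | head -n 1)
    # Fail if the file carries no file_time line.
    [ -n "$file_time" ] || return 1
    echo $((now - file_time))
}
```

Calling `checkresult_age_seconds check0Bp5pw` in the spool directory
would print how many seconds have passed since that result was written.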

Obviously this should not happen. This Nagios instance has 291
active hosts and 1549 service checks.

We use version 3.0.6-4~lenny2 from Debian lenny.

One (hacky) solution could be to use tmpreaper to clean
up /var/lib/nagios3/spool/checkresults once a day, but first of all I
want to know whether we can improve our setup or whether this is a known problem.
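
For illustration, the same kind of cleanup can also be sketched with
find instead of tmpreaper. This is only a sketch under assumptions: the
function names are hypothetical, and `-maxdepth`, `-mmin` and `-delete`
are extensions found in GNU and BSD find, not in strict POSIX find. The
age threshold is also an assumption and should be well above the longest
check interval, so that results still waiting to be processed are not
removed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Nagios) for stale check result files.
# Both take the spool directory and a minimum age in minutes.

# List regular files named check* that are older than MAX_AGE_MIN minutes.
list_stale_checkresults() {
    spool_dir=$1
    max_age_min=$2
    find "$spool_dir" -maxdepth 1 -type f -name 'check*' \
        -mmin "+$max_age_min"
}

# Delete the same set of files, printing each name as it is removed.
purge_stale_checkresults() {
    spool_dir=$1
    max_age_min=$2
    find "$spool_dir" -maxdepth 1 -type f -name 'check*' \
        -mmin "+$max_age_min" -print -delete
}
```

Running `list_stale_checkresults /var/lib/nagios3/spool/checkresults 1440`
first shows what a daily purge with the same arguments would remove.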

Has anybody seen this before? What would be the best way to fix
it?

Best regards,
Andre
--
Kind regards

Andre Timmermann
Nine Internet Solutions AG, Albisriederstr. 243c, CH-8047 Zuerich
Tel +41 44 637 40 00 | Direkt +41 44 637 40 06 | Fax +41 44 637 40 01