Discussion:
[Nagios-users] Max concurrent service checks
Gian Paolo Buono
2009-03-25 09:39:20 UTC
Hi,

I am getting this message in nagios.log:

[1237973726] Max concurrent service checks (400) has been reached. Delaying further checks until previous checks are complete...
[1237973726] Max concurrent service checks (400) has been reached. Delaying further checks until previous checks are complete...
[1237973726] Max concurrent service checks (400) has been reached. Delaying further checks until previous checks are complete...

Any idea? Is this a problem?

Thanks.
S***@gfkl.com
2009-03-25 11:06:06 UTC
Post by Gian Paolo Buono
I am getting this message in nagios.log:
[1237973726] Max concurrent service checks (400) has been reached.
Delaying further checks until previous checks are complete...
[1237973726] Max concurrent service checks (400) has been reached.
Delaying further checks until previous checks are complete...
[1237973726] Max concurrent service checks (400) has been reached.
Delaying further checks until previous checks are complete...
Any idea? Is this a problem?
Well, that's up to you to decide.
You evidently told Nagios to run no more than 400 checks at a time.
Nagios has now reached that limit and is delaying further checks, so that no
more than 400 processes are forked.

I don't know why you set that limit, do you? ;)

Take a look at the nagios.cfg and the "max_concurrent_checks" setting.
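
As a rough sketch (not part of Nagios itself), the configured limit and the number of times the warning was logged can be pulled out with a few lines of Python. The function names and the sample text are illustrative assumptions; on a real system you would read nagios.cfg and nagios.log from wherever your installation keeps them.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Matches the start of the warning quoted in this thread. Only the beginning
# of the message is matched, so it works whether or not the line is wrapped.
WARNING_RE = re.compile(
    r"^\[(\d+)\] Max concurrent service checks \((\d+)\) has been reached"
)


def read_max_concurrent_checks(cfg_text):
    """Return the first max_concurrent_checks value found in the text, or None."""
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "max_concurrent_checks":
            return int(val.strip())
    return None


def count_limit_warnings(log_text):
    """Count the warnings per epoch timestamp."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WARNING_RE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def epoch_to_utc(ts):
    """Convert the bracketed epoch timestamp to a readable UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# Illustrative input: a two-line config fragment and the log excerpt from this thread.
SAMPLE_CFG = "# illustrative fragment\nmax_concurrent_checks=400\n"
SAMPLE_LOG = (
    "[1237973726] Max concurrent service checks (400) has been reached. Delaying\n"
    "further checks until previous checks are complete...\n"
) * 3

print("configured limit:", read_max_concurrent_checks(SAMPLE_CFG))
for ts, n in sorted(count_limit_warnings(SAMPLE_LOG).items()):
    print(epoch_to_utc(ts), "UTC:", n, "warning(s)")
```

If the warning shows up only occasionally, the limit is probably doing its job; if it appears constantly, checks are piling up faster than they finish.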

Regards
Sascha
--
Sascha Runschke
IT-Infrastruktur

Ricardo Maraschini
2009-03-25 13:09:18 UTC
Post by Gian Paolo Buono
I am getting this message in nagios.log:
[1237973726] Max concurrent service checks (400) has been reached.
Delaying further checks until previous checks are complete...
[...]
Post by Gian Paolo Buono
Any idea? Is this a problem?
Search for max_concurrent_checks at
http://nagios.sourceforge.net/docs/3_0/configmain.html

-rm
Gian Paolo Buono
2009-03-26 09:32:54 UTC
Hi,

I set this limit because my server hangs, and I don't see any errors
logged in /var/log/messages.
I thought the server was running too many processes.
When I set max_concurrent_checks=0, the server hung even more often.

My server runs FreeBSD 7.1-RELEASE-p2 and Nagios 3.0.3, with 950 hosts and
4900 services.

Another problem is that Nagios sometimes stops updating the status, and when I
try to stop it, it doesn't die. Even kill -9 doesn't kill the process,
so I have to reboot the server.

Any ideas? Thank you for the support.

Andreas Ericsson
2009-03-26 09:57:49 UTC
Post by Gian Paolo Buono
Hi,
I set this limit because my server hangs, and I don't see any errors
logged in /var/log/messages.
I thought the server was running too many processes.
When I set max_concurrent_checks=0, the server hung even more often.
My server runs FreeBSD 7.1-RELEASE-p2 and Nagios 3.0.3, with 950 hosts and
4900 services.
Another problem is that Nagios sometimes stops updating the status, and when I
try to stop it, it doesn't die. Even kill -9 doesn't kill the process,
so I have to reboot the server.
Any ideas? Thank you for the support.
The only way I'm aware of that a process can become unkillable is when it's
in uninterruptible I/O (i.e., the kernel is waiting for a response from a piece
of hardware in such a way that everything else is more or less locked down).
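
A minimal sketch for spotting such processes, assuming a FreeBSD-style ps that accepts `-axo pid,stat,comm` and marks uninterruptible wait with a leading `D` in the state column. The function names and the sample output are hypothetical, for illustration only.

```python
import subprocess


def parse_uninterruptible(ps_output):
    """Return (pid, state, command) for rows whose state starts with 'D'."""
    stuck = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, state, command = parts
        if pid.isdigit() and state.startswith("D"):
            stuck.append((int(pid), state, command))
    return stuck


def list_uninterruptible():
    """Run ps and report processes currently in uninterruptible wait."""
    result = subprocess.run(
        ["ps", "-axo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_uninterruptible(result.stdout)


# Hypothetical ps output, used here so the parsing can be tried without ps.
SAMPLE_PS = """  PID STAT COMMAND
    1 ILs  init
  812 D    nagios
  913 Ss   sshd
 1001 DL   check_ping
"""

for pid, state, command in parse_uninterruptible(SAMPLE_PS):
    print(pid, state, command)
```

A process that stays in the `D` state across several runs, and ignores kill -9, is most likely waiting on the kernel or a device rather than misbehaving on its own.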

Are you using network-mounted drives to store any of Nagios' output files?
If so, stop doing that immediately. Network filesystems perform extremely
poorly with files that are being frequently updated.

Apart from that, it seems as if your system isn't quite up to scratch for
handling the workload you want to put on it.
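
To put a rough number on that workload, here is a back-of-the-envelope sketch based on Little's law (average number in flight = arrival rate x time each one takes). Only the 4900 services and the limit of 400 come from this thread; the 300-second check interval and the 5-second average check duration are made-up assumptions that should be replaced with measured values.

```python
def average_concurrent_checks(services, interval_s, avg_duration_s):
    """Little's law: checks started per second times seconds each check runs."""
    return services / interval_s * avg_duration_s


def duration_that_hits_limit(services, interval_s, limit):
    """Average check duration at which the steady-state load equals the limit."""
    return limit * interval_s / services


SERVICES = 4900          # from the thread
LIMIT = 400              # from the log message in the thread
INTERVAL_S = 300         # assumption: every service is checked every 5 minutes
AVG_DURATION_S = 5.0     # assumption: an average check takes 5 seconds

load = average_concurrent_checks(SERVICES, INTERVAL_S, AVG_DURATION_S)
threshold = duration_that_hits_limit(SERVICES, INTERVAL_S, LIMIT)
print(f"expected checks in flight: {load:.1f}")
print(f"average duration that reaches the limit: {threshold:.1f} s")
```

Under these assumed numbers the steady-state load stays well below 400, so hitting the limit would point to checks that take far longer than expected, for example because they are blocked on I/O.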
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
Gian Paolo Buono
2009-03-26 10:35:39 UTC
Hi,
I don't have any NFS mounts on this server, and I cannot find the problem.
I think the problem is the RAID controller:

[***@server /usr/local/etc/nagios]# dmesg | grep -i raid
aac0: <IBM ServeRAID-8k> port 0x5000-0x50ff mem
0xc9e00000-0xc9ffffff,0xc7fe0000-0xc7ffffff irq 17 at device 0.0 on pci4
aac0: ServeRAID 8k-l , aac driver 2.0.0-1
aacd0: <RAID 1 (Mirror)> on aac0

But I don't have any log entries for it. Any suggestions?
Andreas Ericsson
2009-03-30 11:00:50 UTC
Post by Gian Paolo Buono
Hi,
I don't have any NFS mounts on this server, and I cannot find the problem.
I think the problem is the RAID controller...
That would indeed put processes in uninterruptible I/O, since the kernel
will refuse to let processes run while it's waiting for a response from
the hardware.
Post by Gian Paolo Buono
aac0: <IBM ServeRAID-8k> port 0x5000-0x50ff mem
0xc9e00000-0xc9ffffff,0xc7fe0000-0xc7ffffff irq 17 at device 0.0 on pci4
aac0: ServeRAID 8k-l , aac driver 2.0.0-1
aacd0: <RAID 1 (Mirror)> on aac0
But I don't have any log entries for it. Any suggestions?
Try using the same hardware but with a different kernel (Windows or Linux)
that has another driver. If the driver *or* the controller is broken,
you'll get unkillable processes.

If the RAID hardware is broken, you need to replace the hardware.
If the BSD RAID driver is buggy, you need to either get new (and better
supported) hardware, or change the OS.
Giorgio Zarrelli
2009-03-26 10:34:19 UTC
Ciao,

Run top and look at the wa (I/O wait) value. Your symptoms suggest I/O
problems, and therefore too many wait cycles on the CPU.

Giorgio