[Nagios-users] Plugin timed out after 10 seconds
Rudy Montemayor
2004-10-25 20:25:02 UTC
Nagios Users,

I have been getting a few notification messages of this type, and I'm
sure they're due to our network. That said, how and where do I increase
the timeout period to, say, 20 seconds?

Notification Type: PROBLEM

Service: Check /ORACLE/BP1/ORAARCH
Host: HCIBP1 BW Prod
State: CRITICAL

Date/Time: Sun Oct 24 02:39:25 CDT 2004

Additional Info:

CHECK_NRPE: Socket timeout after 10 seconds.


Also like this:

Notification Type: PROBLEM
Host: PRT204-Calgary
State: DOWN
Info: CRITICAL - Plugin timed out after 10 seconds

Date/Time: Mon Oct 25 11:29:56 CDT 2004

The next time the service is checked it recovers. Where is this 10
second limit set at?

Thanks for your help in advance.

Rudy Montemayor
n***@mm.quex.org
2004-10-25 22:44:06 UTC
Post by Rudy Montemayor
I have been getting a few notification messages of this type and I'm
sure it's due to our network; however having said that how and where
do I increase the timeout period to say 20 seconds.
Notification Type: PROBLEM
Service: Check /ORACLE/BP1/ORAARCH
Host: HCIBP1 BW Prod
State: CRITICAL
CHECK_NRPE: Socket timeout after 10 seconds.
This one looks like the plugin itself is enforcing the timeout,
so look for a command-line switch to adjust the timeout.
Post by Rudy Montemayor
Notification Type: PROBLEM
Host: PRT204-Calgary
State: DOWN
Info: CRITICAL - Plugin timed out after 10 seconds
This one looks to me like Nagios is timing it out itself, so for
that one check the main config.
Post by Rudy Montemayor
The next time the service is checked it recovers. Where is this 10
second limit set at?
Check the values of service_check_timeout and host_check_timeout
in the main configuration file. Note that host checks, when they
need to be done, kind of slow everything else down so it's better
if they don't take ages and ages to run...

Also, since host checks aren't repeated X number of times before
the host is declared down, you may want to write a wrapper around
it to repeat the test 2 or 3 times if you have hosts that are a
little bit flaky, but you don't care if they stop responding for
a short period of time.
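
A minimal sketch of the wrapper idea, assuming Python 3 is available on the Nagios host. The function names and the default values (3 attempts, 1 second between tries, 10 seconds per try) are hypothetical; only the exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) follow the usual plugin convention. It re-runs the wrapped check and reports the first OK result, or the last failure if no attempt succeeds.

```python
import subprocess
import sys
import time

OK = 0        # conventional plugin exit code for OK
CRITICAL = 2  # conventional plugin exit code for CRITICAL
UNKNOWN = 3   # conventional plugin exit code for UNKNOWN


def run_with_retries(command, attempts=3, delay=1.0, per_try_timeout=10.0):
    """Run `command` (an argv list) up to `attempts` times.

    Returns (exit_code, output) of the first OK run, or of the last
    failed run if every attempt fails.
    """
    code, output = UNKNOWN, "UNKNOWN - check was never run"
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=per_try_timeout,
            )
            code, output = result.returncode, result.stdout.strip()
        except subprocess.TimeoutExpired:
            code = CRITICAL
            output = "CRITICAL - wrapper gave up after %s seconds" % per_try_timeout
        except OSError as exc:
            # The wrapped plugin could not be started at all; retrying won't help.
            return UNKNOWN, "UNKNOWN - could not run check: %s" % exc
        if code == OK:
            break
        if attempt < attempts:
            time.sleep(delay)  # brief pause before the next try
    return code, output


def main(argv):
    """Hypothetical entry point: argv is the wrapped check command and its arguments."""
    if not argv:
        print("UNKNOWN - usage: retry_check.py <check command> [args...]")
        return UNKNOWN
    code, output = run_with_retries(argv)
    print(output)
    return code
```

To use it as a script, one would end the file with `sys.exit(main(sys.argv[1:]))` and point the host check command at it. Keep in mind that the worst case takes roughly attempts * (per_try_timeout + delay) seconds, which has to stay below host_check_timeout, or Nagios will kill the wrapper first.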
Andreas Ericsson
2004-10-26 06:07:01 UTC
Post by n***@mm.quex.org
Post by Rudy Montemayor
I have been getting a few notification messages of this type and I'm
sure it's due to our network; however having said that how and where
do I increase the timeout period to say 20 seconds.
Notification Type: PROBLEM
Service: Check /ORACLE/BP1/ORAARCH
Host: HCIBP1 BW Prod
State: CRITICAL
CHECK_NRPE: Socket timeout after 10 seconds.
This one looks like the plugin itself is enforcing the timeout,
so look for a command-line switch to adjust the timeout.
Post by Rudy Montemayor
Notification Type: PROBLEM
Host: PRT204-Calgary
State: DOWN
Info: CRITICAL - Plugin timed out after 10 seconds
This one looks to me like Nagios is timing it out itself, so for
that one check the main config.
Umm, no. If that were Nagios timing it out, the one above would be Nagios
too. This is the 'default' plugin output for networking plugins that time
out. check_nrpe words it a bit differently so you can tell the check_nrpe
plugin timing out apart from the remote plugin it's executing doing the
same.
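
As a rough illustration of that distinction, here is a small sketch that sorts the two messages quoted in this thread by which component reported the timeout. The function name and the labels it returns are made up for the example; the patterns are taken only from the two notification texts above, so other plugin versions may word things differently.

```python
import re

# Patterns taken from the two notification messages quoted in this thread.
NRPE_SOCKET_TIMEOUT = re.compile(r"CHECK_NRPE: Socket timeout after (\d+) seconds")
PLUGIN_TIMEOUT = re.compile(r"Plugin timed out after (\d+) seconds")


def classify_timeout(plugin_output):
    """Return (source, seconds) for a timeout message, or None if it is not one.

    source is "check_nrpe" when check_nrpe itself timed out on the socket,
    and "plugin" for the generic timeout output of a networking plugin.
    """
    match = NRPE_SOCKET_TIMEOUT.search(plugin_output)
    if match:
        return ("check_nrpe", int(match.group(1)))
    match = PLUGIN_TIMEOUT.search(plugin_output)
    if match:
        return ("plugin", int(match.group(1)))
    return None
```

In both cases the number comes from the plugin's own timeout setting, not from the Nagios main config.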
Post by n***@mm.quex.org
Post by Rudy Montemayor
The next time the service is checked it recovers. Where is this 10
second limit set at?
Check the values of service_check_timeout and host_check_timeout
in the main configuration file. Note that host checks, when they
need to be done, kind of slow everything else down so it's better
if they don't take ages and ages to run...
Ignore that, and check for the -t option in your checkcommands file.
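
For illustration, a sketch of what the command line looks like once -t is raised to 20 seconds for check_nrpe. The plugin path, host name and remote command name are placeholders, not values from this thread; -H, -c and -t are check_nrpe's host, command and timeout options. In practice the same arguments would go on the command_line of the relevant command definition.

```python
def nrpe_argv(plugin_path, host, remote_command, timeout_seconds=20):
    """Build the argument list for check_nrpe with an explicit -t timeout."""
    return [
        plugin_path,
        "-H", host,                  # host running the NRPE daemon
        "-c", remote_command,        # command name defined on the remote side
        "-t", str(timeout_seconds),  # seconds before check_nrpe gives up
    ]


# Placeholder values, purely for the example.
example = nrpe_argv("/path/to/check_nrpe", "remote-host", "check_disk_example")
print(" ".join(example))
```

The plugin timeout should stay below service_check_timeout in the main config, otherwise Nagios will kill the check before the plugin's own timeout is reached.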
Post by n***@mm.quex.org
Also, since host checks aren't repeated X number of times before
the host is declared down,
That depends on max_check_attempts, but host checks are done in a
serialized manner, so a temporary spike lasting longer than
max_check_attempts * plugin timeout will render a critical.
Post by n***@mm.quex.org
you may want to write a wrapper around
it to repeat the test 2 or 3 times if you have hosts that are a
little bit flaky, but you don't care if they stop responding for
a short period of time.
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Lead Developer
n***@mm.quex.org
2004-10-26 06:37:06 UTC
Post by Andreas Ericsson
Post by n***@mm.quex.org
Also, since host checks aren't repeated X number of times before
the host is declared down,
That depends on max_check_attempts, but host checks are done in a
serialized manner, so a temporary spike lasting longer than
max_check_attempts * plugin timeout will render a critical.
My mistake. I thought there was an earlier discussion which
concluded that the max_check_attempts setting is ignored for host
checks, at least in 1.x. Having just tested it on our 1.2 installation,
I see that it does actually work as it's supposed to.
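
To put numbers on the max_check_attempts * plugin timeout rule quoted above, here is a small sketch. It is a simplification that assumes every attempt runs back to back and uses its full timeout; the function names and the sample values are only for illustration.

```python
def tolerated_outage_seconds(max_check_attempts, plugin_timeout_seconds):
    """Longest spike that can be ridden out before the host is declared down."""
    return max_check_attempts * plugin_timeout_seconds


def spike_triggers_alert(spike_seconds, max_check_attempts, plugin_timeout_seconds):
    """True if a spike outlasts every retry of the host check."""
    return spike_seconds > tolerated_outage_seconds(
        max_check_attempts, plugin_timeout_seconds
    )


# Example: 3 attempts with a 10 second plugin timeout tolerate about 30 seconds.
window = tolerated_outage_seconds(3, 10)
```

With those sample values a 25 second spike would be absorbed, while a 45 second one would still produce a notification.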
