Post by Marc Powell
Post by Simone Felici
Please help, does no one have any idea?
Oh, and my Nagios version is 3.0.3.
Also, why are all my hosts checked roughly every 4 seconds? :(
Thanks!
The information provided so far indicates that hosts will only be
checked on demand, as you expect. That means, probably, either the
information is incorrect or they're being checked on demand. Since the
normal interval for actions in Nagios is measured in minutes, I'd lean
toward the on-demand side of things.
Can you post the host definition from status.dat?
Can you post the relevant log entries for the host, and for any services
on that host, from around the time of the host checks? You may need to
increase your logging options in nagios.cfg.
Is the 4 second number in any way significant to your installation? Is
your time_interval less than 60?
Debug mode is available to you to figure out what's going on. This is
almost certainly going to be your best source for resolution.
Good morning.
By time_interval, do you mean interval_length?
I've set it to "1". That way all check intervals are expressed in seconds, because certain services need a retry
interval of 30 seconds.
Here is some additional information:
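For anyone following along, here is a minimal sketch of the arithmetic involved. Nagios multiplies each check_interval and retry_interval value by interval_length to get seconds, so with interval_length set to 1 a check_interval of 5 means 5 seconds rather than 5 minutes. The function name is made up for illustration; it is not part of Nagios.

```python
def effective_seconds(interval_units, interval_length=60):
    """Convert a Nagios interval given in time units into seconds.

    interval_length is the number of seconds per time unit
    (60 is the Nagios default).
    """
    return interval_units * interval_length


# Default interval_length: check_interval=5 means 5 minutes.
print(effective_seconds(5))
# interval_length=1, as in this setup: check_interval=5 means 5 seconds.
print(effective_seconds(5, 1))
# A 30-second retry interval then needs retry_interval=30.
print(effective_seconds(30, 1))
```

If that reading is right, host checks every few seconds would be the configured behaviour rather than a scheduler fault.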
########################################
# NAGIOS STATUS FILE
#
# THIS FILE IS AUTOMATICALLY GENERATED
# BY NAGIOS. DO NOT MODIFY THIS FILE!
########################################
<<cut>>
hoststatus {
host_name=<<MY-WINDOWS-EXAMPLE-SERVER>>
modified_attributes=3
check_command=check-host-alive
check_period=24hx7
notification_period=24hx7
check_interval=5.000000
retry_interval=1.000000
event_handler=
has_been_checked=1
should_be_scheduled=1
check_execution_time=0.016
check_latency=12.595
check_type=0
current_state=0
last_hard_state=0
last_event_id=116739
current_event_id=116740
current_problem_id=0
last_problem_id=51304
plugin_output=PING OK - Packet loss = 0%, RTA = 0.65 ms
long_plugin_output=
performance_data=
last_check=1227773285
next_check=1227773291
check_options=0
current_attempt=1
max_attempts=2
current_event_id=116740
last_event_id=116739
state_type=1
last_state_change=1227714561
last_hard_state_change=1227714561
last_time_up=1227773286
last_time_down=1227714537
last_time_unreachable=1215613491
last_notification=0
next_notification=0
no_more_notifications=0
current_notification_number=0
current_notification_id=46804
notifications_enabled=1
problem_has_been_acknowledged=0
acknowledgement_type=0
active_checks_enabled=1
passive_checks_enabled=0
event_handler_enabled=0
flap_detection_enabled=0
failure_prediction_enabled=1
process_performance_data=0
obsess_over_host=0
last_update=1227773300
is_flapping=0
percent_state_change=0.00
scheduled_downtime_depth=0
}
<<cut>>
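To read the scheduling numbers out of a block like the one above without eyeballing it, something along these lines might help. It is only a sketch: the sample text is a shortened stand-in with a placeholder host name, and parse_hoststatus is a hypothetical helper, not a Nagios tool.

```python
import re

# Shortened stand-in for a status.dat excerpt (placeholder host name).
SAMPLE = """\
hoststatus {
host_name=example-host
check_interval=5.000000
retry_interval=1.000000
last_check=1227773285
next_check=1227773291
}
"""


def parse_hoststatus(text):
    """Return one dict per 'hoststatus { ... }' block found in the text."""
    blocks = []
    for body in re.findall(r"hoststatus\s*\{(.*?)\n\s*\}", text, flags=re.DOTALL):
        entry = {}
        for line in body.splitlines():
            # Split on the first '=' only, so values may contain '='.
            key, sep, value = line.strip().partition("=")
            if sep:
                entry[key] = value
        blocks.append(entry)
    return blocks


hosts = parse_hoststatus(SAMPLE)
for host in hosts:
    gap = int(host["next_check"]) - int(host["last_check"])
    print(host["host_name"], "check_interval:", host["check_interval"], "gap (s):", gap)
```

For the values posted above, next_check minus last_check is 6 seconds, which is close to the check_interval of 5 units when one unit is one second.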
Then I enabled debugging (level 24). Here is the result; I've pasted only the parts where the example host appears.
I've written "(..CUT..)" to skip megabytes of unimportant lines that refer to other services/hosts (which of course
have the same problem).
(..CUT..)
[1227774166.211254] [008.0] [pid=25062] ** Timed Event ** Type: 12, Run Time: Thu Nov 27 09:22:38 2008
[1227774166.211266] [008.0] [pid=25062] ** Host Check Event ==> Host: '<<MY-WINDOWS-EXAMPLE-SERVER>>', Options: 0, Latency: 8.211000 sec
[1227774166.211283] [016.0] [pid=25062] Attempting to run scheduled check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>': check options=0, latency=8.211000
[1227774166.211296] [016.0] [pid=25062] ** Running async check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774166.211325] [016.0] [pid=25062] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774166.211434] [016.1] [pid=25062] Check result output will be written to '/tmp/checkNmYAPl' (fd=7)
[1227774166.229697] [008.1] [pid=25062] ** Event Check Loop
[1227774166.229755] [008.1] [pid=25062] Next High Priority Event Time: Thu Nov 27 09:22:47 2008
[1227774166.229771] [008.1] [pid=25062] Next Low Priority Event Time: Thu Nov 27 09:22:38 2008
[1227774166.229781] [008.1] [pid=25062] Current/Max Service Checks: 0/80
[1227774166.229794] [008.1] [pid=25062] Running event...
[1227774166.229808] [008.0] [pid=25062] ** Timed Event ** Type: 12, Run Time: Thu Nov 27 09:22:38 2008
(..CUT..)
[1227774186.288210] [016.1] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
(..CUT..)
[1227774186.433737] [016.1] [pid=6021] Checking service 'DISK-SPACE' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.433749] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.433894] [016.1] [pid=6021] Checking service 'IMAP' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.433905] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434036] [016.1] [pid=6021] Checking service 'MEMORY' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434047] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434179] [016.1] [pid=6021] Checking service 'PING' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434190] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434323] [016.1] [pid=6021] Checking service 'POP3' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434334] [016.1] [pid=6021] Service is not flapping (0.00% state change).
[1227774186.434464] [016.1] [pid=6021] Checking service 'SMTP' on host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774186.434475] [016.1] [pid=6021] Service is not flapping (0.00% state change).
(..CUT..)
[1227774190.773319] [016.1] [pid=6021] Handling check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774190.773330] [016.1] [pid=6021] ** Handling async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774190.773345] [016.1] [pid=6021] HOST: <<MY-WINDOWS-EXAMPLE-SERVER>>, ATTEMPT=1/2, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=0
[1227774190.773372] [016.1] [pid=6021] Host was UP.
[1227774190.773382] [016.1] [pid=6021] Host is still UP.
[1227774190.773392] [016.1] [pid=6021] Pre-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774190.773414] [016.1] [pid=6021] Post-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774190.773430] [016.1] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774190.773452] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:15 2008
[1227774190.773481] [016.0] [pid=6021] Scheduling a non-forced, active check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>' @ Thu Nov 27 09:23:15 2008
[1227774190.773500] [016.1] [pid=6021] ** Async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>' handled: new state=0
[1227774190.773524] [016.1] [pid=6021] Deleted check result file '/usr/local/nagios/var/spool/checkresults/ctdW9xA'
(..CUT..)
[1227774197.619431] [008.0] [pid=6021] ** Timed Event ** Type: 12, Run Time: Thu Nov 27 09:23:08 2008
[1227774197.619443] [008.0] [pid=6021] ** Host Check Event ==> Host: '<<MY-WINDOWS-EXAMPLE-SERVER>>', Options: 0, Latency: 9.619000 sec
[1227774197.619460] [016.0] [pid=6021] Attempting to run scheduled check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>': check options=0, latency=9.619000
[1227774197.619473] [016.0] [pid=6021] ** Running async check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774197.619504] [016.0] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774197.619601] [016.1] [pid=6021] Check result output will be written to '/tmp/checkICrgKX' (fd=7)
[1227774197.636440] [008.1] [pid=6021] ** Event Check Loop
[1227774197.636524] [008.1] [pid=6021] Next High Priority Event Time: Thu Nov 27 09:23:18 2008
[1227774197.636544] [008.1] [pid=6021] Next Low Priority Event Time: Thu Nov 27 09:23:08 2008
[1227774197.636558] [008.1] [pid=6021] Current/Max Service Checks: 0/80
[1227774197.636573] [008.1] [pid=6021] Running event...
(..CUT..)
[1227774198.041822] [016.1] [pid=6021] Handling check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774198.041833] [016.1] [pid=6021] ** Handling async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>'...
[1227774198.041851] [016.1] [pid=6021] HOST: <<MY-WINDOWS-EXAMPLE-SERVER>>, ATTEMPT=1/2, CHECK TYPE=ACTIVE, STATE TYPE=HARD, OLD STATE=0, NEW STATE=0
[1227774198.041863] [016.1] [pid=6021] Host was UP.
[1227774198.041872] [016.1] [pid=6021] Host is still UP.
[1227774198.041881] [016.1] [pid=6021] Pre-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774198.041893] [016.1] [pid=6021] Post-handle_host_state() Host: <<MY-WINDOWS-EXAMPLE-SERVER>>, Attempt=1/2, Type=HARD, Final State=0
[1227774198.041904] [016.1] [pid=6021] Checking host '<<MY-WINDOWS-EXAMPLE-SERVER>>' for flapping...
[1227774198.041933] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:23 2008
[1227774198.041955] [016.0] [pid=6021] Scheduling a non-forced, active check of host '<<MY-WINDOWS-EXAMPLE-SERVER>>' @ Thu Nov 27 09:23:23 2008
[1227774198.042001] [016.1] [pid=6021] ** Async check result for host '<<MY-WINDOWS-EXAMPLE-SERVER>>' handled: new state=0
[1227774198.042025] [016.1] [pid=6021] Deleted check result file '/usr/local/nagios/var/spool/checkresults/cb
(..CUT..)
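In case it helps with reading a debug log like this one, here is a rough sketch that pulls the "Rescheduling next check" lines out and measures how far apart they are. It makes several assumptions: the log has already been filtered down to a single host (the reschedule line itself does not name the host), each entry sits on one line, and the day and month names are in English so that strptime can parse them. The two sample lines reuse the timestamps quoted above; reschedule_events is a hypothetical helper.

```python
import re
from datetime import datetime

# Debug log line layout: [epoch.usec] [level.verbosity] [pid=N] message
LINE_RE = re.compile(r"^\[(\d+\.\d+)\] \[(\d+)\.(\d+)\] \[pid=(\d+)\] (.*)$")
RESCHED_RE = re.compile(r"Rescheduling next check of host at (.+)$")

# Two reschedule lines, assumed to belong to the same host.
SAMPLE = """\
[1227774190.773452] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:15 2008
[1227774198.041933] [016.1] [pid=6021] Rescheduling next check of host at Thu Nov 27 09:23:23 2008
"""


def reschedule_events(log_text):
    """Yield (logged_at_epoch, scheduled_for) for each reschedule line."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        resched = RESCHED_RE.search(match.group(5))
        if resched:
            when = datetime.strptime(resched.group(1), "%a %b %d %H:%M:%S %Y")
            yield float(match.group(1)), when


events = list(reschedule_events(SAMPLE))
for (t0, s0), (t1, s1) in zip(events, events[1:]):
    print(f"results handled {t1 - t0:.1f}s apart; "
          f"next checks scheduled {(s1 - s0).total_seconds():.0f}s apart")
```

On the two lines above, the results are handled about 7 seconds apart and the next checks are scheduled 8 seconds apart, which is in the same range as the few-second interval being asked about.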
Is that enough?
Any help?
Thanks!!!
Simon