Discussion:
[Nagios-users] Event Handlers, start a java program, nohup issues.
Marco Tirado
2009-07-02 08:09:46 UTC
Hello Users:

I have a problem with an event handler of mine. The handler starts a java
daemon-like program which loops forever waiting for connections and performs
JMX queries against our java applications.

The problem is that the handler times out when it is run by nagios. This is
what I see in the logs:

[01-07-2009 18:45:36] SERVICE EVENT HANDLER:
bj-mon-01;JMX_Server_Running;(null);(null);(null);start_jmx_server
[01-07-2009 18:46:07] Warning: Service event handler command
'/usr/local/nagios/libexec/eventhandlers/start_jmx_server CRITICAL SOFT 1'
timed out after 30 seconds

The event handler should start my JMXServer both in hard and soft states. I
have run the command from the console as the "nagios" user and it works, so
the problem has nothing to do with user rights for nagios.

The problem is that the handler hangs when I run "nohup" followed by my
command for starting the server (see the nohup lines in the script below).

My event handler looks like this:

###########################
# PROPERTIES
###########################

PORT="4444"
ECHO_CMD="/bin/echo"
JAVA_CMD="/usr/bin/java"
CLASSPATH="MyClasspath"
JVM_OPTIONS="MyOptions"

###########################

# What state is the JMXServer in?
case "$1" in

OK)
;;

WARNING)
;;

UNKNOWN)
;;

CRITICAL)

case "$2" in

SOFT)

`$ECHO_CMD "TRYING restart" >> /tmp/test`
nohup $JAVA_CMD -cp $CLASSPATH $JVM_OPTIONS JMXServer $PORT </dev/null 2>&1 >> $LOG_FILE&
`$ECHO_CMD "TRYING restart" >> /tmp/test`

;;

HARD)

`$ECHO_CMD "TRYING restart" >> /tmp/test`
nohup $JAVA_CMD -cp $CLASSPATH $JVM_OPTIONS JMXServer $PORT </dev/null 2>&1 >> $LOG_FILE&
`$ECHO_CMD "FINISHED trying" >> /tmp/test`

;;

esac

;;

esac

exit 0



Any help, hint or recommendation is deeply appreciated.

//Marco
Marco Tirado
2009-07-02 14:16:39 UTC
That is exactly what I am doing (or trying to do) with the "&" character at
the end of my command. But it does not appear to be working. The command
looks like this:

nohup $JAVA_CMD -cp $CLASSPATH $JVM_OPTIONS JMXServer $PORT </dev/null 2>&1 >> $LOG_FILE&

Any suggestions? Am I missing something else?

//Marco
Post by Andreas Ericsson
You need to make the java daemon run in the background. That will make
Nagios ignore it after it has moved from the foreground.
Kevin Keane
2009-07-02 14:31:22 UTC
First, put a space in front of that &. Otherwise, it may be treated as
part of the variable name.

Second, I believe that the 2>&1 needs to come AFTER the redirection to
$LOG_FILE. Otherwise, what you are doing is redirecting stderr to stdout
- which is still the console - and THEN redirecting stdout to the log
file. I always get this wrong, though - it might be exactly opposite of
what I'm thinking.

But if I'm right, because stderr is still attached to the console,
nagios would still consider it to be in the foreground.
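
The ordering question raised here can be checked directly, since the shell applies redirections from left to right. The snippet below is a minimal sketch rather than anything taken from the original handler: `emit` is a hypothetical helper that writes one line to each stream, and the temporary files stand in for $LOG_FILE. It suggests that with `2>&1 >> file` stderr stays attached to whatever stdout was before the redirection, while `>> file 2>&1` sends both streams to the file.

```shell
#!/bin/sh
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

log_a=$(mktemp)   # stand-ins for $LOG_FILE
log_b=$(mktemp)

# Order used in the original handler: stderr is duplicated onto the
# *current* stdout first, and only then is stdout pointed at the log.
# stderr therefore escapes the log; here it is caught by $( ).
escaped_a=$(emit 2>&1 >>"$log_a")

# Reversed order: stdout goes to the log first, then stderr follows it.
escaped_b=$(emit >>"$log_b" 2>&1)

logged_a=$(cat "$log_a")
logged_b=$(cat "$log_b")
rm -f "$log_a" "$log_b"

echo "2>&1 >>log : escaped='$escaped_a'"
echo ">>log 2>&1 : escaped='$escaped_b'"
```

Whitespace before the trailing `&` does not affect any of this: the shell treats `&` as a control operator whether or not a space precedes it.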
_______________________________________________
Nagios-users mailing list
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
--
Kevin Keane
Owner
The NetTech
Find the Uncommon: Expert Solutions for a Network You Never Have to Think About

Office: 866-642-7116
http://www.4nettech.com

Marc Powell
2009-07-02 14:43:55 UTC
Post by Marco Tirado
That is exactly what I am doing (or trying to do) with the "&"
character at the end of my command. But it does not appear to be
working. The command looks like this:
nohup $JAVA_CMD -cp $CLASSPATH $JVM_OPTIONS JMXServer $PORT </dev/null 2>&1 >> $LOG_FILE&
Do you really intend to set the output of /dev/null as the STDIN for
nohup? I expect you meant to use '>/dev/null'.
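
For readers unsure about the difference between the two forms, here is a small sketch (not from the thread) of what each one does. `<` opens /dev/null for reading on stdin, which is a common way to make sure a background program never waits for input, while `>` discards output. The variable names are illustrative only.

```shell
#!/bin/sh
# "<" opens a file for reading on stdin; ">" opens it for writing on stdout.

# Reading from /dev/null yields immediate end-of-file, so a program that
# reads stdin gets zero bytes instead of blocking on a terminal or pipe.
bytes_read=$(wc -c </dev/null | tr -d ' ')

# Writing to /dev/null discards the output; nothing reaches the caller.
visible=$(echo "discarded" >/dev/null)

echo "bytes read from /dev/null: $bytes_read"
echo "output left after >/dev/null: '$visible'"
```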

--
Marc
Marco Tirado
2009-07-03 08:45:40 UTC
Hello users:

Thank you for all the answers. I have made some changes to my command: I
added a space before the "&" character and I explicitly redirected "stdout"
and "stderr" to my log file to avoid misunderstandings. The command looks
like this now:

nohup $JAVA_CMD -cp $CLASSPATH $JVM_OPTIONS JMXServer.JMXServerDispatcher $PORT </dev/null >>$LOG_FILE 2>>$LOG_FILE &

The result, however, is much worse in this case: instead of seeing a timeout
in the logs, the whole nagios process hangs (that is, the logs freeze and no
new tests are performed). I believe these issues have to do with how nagios
handles the "&" character, or background processes in general.

Any thoughts? Other suggestions?


//Marco


By the way David:

We have used the check_jmx plugin in the past with our own modifications.
However, we have a very large system with around 1000 JMX queries running
every 5 minutes, and the check_jmx solution is not very scalable. Using
check_jmx means starting 1000 JVMs (and all the resources that come with
them) every five minutes, which generates a lot of load on the monitoring
server. Our system is continuously growing and new JMX queries are added
quite frequently.

To solve that issue we have developed a JMX server that takes care of all
the JMX queries and a lightweight client that is run by nagios. This is a
much more scalable solution that decouples the whole JMX processing from
nagios (the JMX server can run on any machine, not only the nagios
server).

Cheers
Edgar Matzinger
2009-07-03 10:56:57 UTC
LS,
Post by Marco Tirado
nohup $JAVA_CMD -cp $CLASSPATH $JVM_OPTIONS JMXServer.JMXServerDispatcher $PORT </dev/null >>$LOG_FILE 2>>$LOG_FILE &
Change this to: ... $PORT <&- >>$LOG_FILE 2>&1 &
<&- closes STDIN.
Post by Marco Tirado
Any thoughts? other suggestions?
Can't you use the init.d script? E.g. "/sbin/service jmx start",
or something similar.
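
The redirections suggested here can be wrapped in a small helper so that the SOFT and HARD branches share one launch line. What follows is a sketch under assumptions, not a confirmed fix (the thread does not record whether it resolved the hang): `start_detached` is a hypothetical function name, the default log path is a placeholder, and JAVA_CMD, CLASSPATH, JVM_OPTIONS and PORT are expected to be set as in the original script.

```shell
#!/bin/sh
# Placeholder default; the original script never shows where LOG_FILE is set.
LOG_FILE="${LOG_FILE:-/tmp/jmx_server.log}"

# start_detached CMD [ARGS...]  (hypothetical helper)
# Runs CMD in the background with no descriptor inherited from the caller:
# stdin is closed, stdout and stderr are both appended to $LOG_FILE.
start_detached() {
    if command -v setsid >/dev/null 2>&1; then
        # setsid, where available, also starts the command in a new session.
        setsid "$@" <&- >>"$LOG_FILE" 2>&1 &
    else
        nohup "$@" <&- >>"$LOG_FILE" 2>&1 &
    fi
}

# Same decision as the original nested case, collapsed into one pattern.
case "$1:$2" in
    CRITICAL:SOFT|CRITICAL:HARD)
        # $JVM_OPTIONS is left unquoted on purpose so it splits into words.
        start_detached "$JAVA_CMD" -cp "$CLASSPATH" $JVM_OPTIONS JMXServer "$PORT"
        ;;
esac
```

Using `</dev/null` in place of `<&-` is the more conservative choice for programs that expect descriptor 0 to be open.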

HTH, cu l8r, Edgar.
--
Edgar R. Matzinger
Valid Eindhoven B.V.
Flight Forum 565
5657DR Eindhoven
Disclaimer: Any comments, opinions made are mine, etc ...
David Rosenstrauch
2009-07-02 14:00:38 UTC
Perhaps instead of using a java daemon to do JMX queries, you could use
the check_jmx nagios plugin. (Available at the monitoring exchange
site.) I've been using it in our Nagios system, and it's been working
nicely (after some enhancements).

I'm in the process of adopting and enhancing the code (neither of the
previous 2 authors wanted to maintain it) and setting up a proper home
for it on sourceforge (http://sourceforge.net/projects/nagioscheckjmx/).
(Don't download it yet until I get the 1.0 release uploaded.)

HTH,

DR
Andreas Ericsson
2009-07-02 08:50:20 UTC
You need to make the java daemon run in the background. That will make
Nagios ignore it after it has moved from the foreground.
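
Backgrounding alone may not be enough if the caller collects the handler's output through a pipe, because a pipe reader only sees end-of-file once every process holding the write end has closed it. That Nagios reads event handler output this way is an assumption here, though it would be consistent with the 30-second timeout reported at the top of the thread. In the sketch below `fake_daemon` is a hypothetical stand-in for the Java server and `cat` plays the part of the caller.

```shell
#!/bin/sh
# Hypothetical stand-in for a long-running daemon.
fake_daemon() { sleep 3; }

start=$(date +%s)
# The background job inherits the pipe as its stdout, so the reader waits
# about 3 seconds even though the launching subshell exits at once.
( fake_daemon & ) | cat
held=$(( $(date +%s) - start ))

start=$(date +%s)
# With stdin, stdout and stderr all pointed elsewhere, nothing keeps the
# pipe open and the reader returns as soon as the subshell exits.
( fake_daemon </dev/null >/dev/null 2>&1 & ) | cat
released=$(( $(date +%s) - start ))

echo "inherited pipe: ${held}s, detached: ${released}s"
```

If the assumption holds, the original `2>&1 >> $LOG_FILE` ordering would leave the daemon's stderr attached to the inherited stdout, which is the first case above.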
--
Andreas Ericsson ***@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.