Thursday, December 2, 2010

OPMN Ping timeouts

I did this exercise quite a while ago and thought I would post it now that I have run into the issue again. Basically, OPMN manages all containers and performs health checks on them. It can restart a container if the container doesn't respond in a timely fashion. In the OPMN log file, this shows up as: [libopmnoc4j] Forcefully Terminating Process. During heavy load or a temporary network failure this is not the desired behavior, so it is better to control and fine-tune these parameters.

The documentation on the different parameters is pretty scattered, so I thought of putting it together here. OPMN algorithm documentation can be found at :

By default, OPMN only comes with start, stop, and restart timeouts, which, as the names suggest, are used only during start, stop, and restart. The runtime health-check algorithm uses separate ping parameters:

Different ping options in OPMN
1. OS Process Check: Every 2 seconds OPMN queries the OS with the managed process id to see if it has terminated.
2. Forward Ping: Periodically (every 20 seconds by default), OPMN sends a ping message to the managed process and expects a response within 20 seconds.
3. Reverse Ping: Every 20 seconds the managed process sends OPMN a ping notification.

Forward ping can be fine-tuned with the ping element (its timeout and interval) in opmn.xml.
Reverse ping can be set using the "reverseping" parameters in opmn.xml.
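
To make this more concrete, here is a rough sketch of where these settings might sit in opmn.xml for an OC4J process type. Treat it as an assumption-laden illustration rather than a verified configuration: the process-type id, the placement of the elements, the "restart-parameters" category, and the exact "reverseping-*" data ids are written from memory of OC4J-style configurations and should be checked against the OPMN documentation for your release. All values are placeholders, not recommendations.

```xml
<!-- Illustrative fragment only: ids, placement, and values are assumptions -->
<process-type id="home" module-id="OC4J" status="enabled">
  <module-data>
    <category id="restart-parameters">
      <!-- Assumed reverse ping settings: seconds OPMN waits for a reverse
           ping, and how many missed pings it tolerates before restarting -->
      <data id="reverseping-timeout" value="345"/>
      <data id="reverseping-failed-ping-limit" value="3"/>
    </category>
  </module-data>
  <!-- These timeouts apply only while starting, stopping, or restarting -->
  <start timeout="600" retry="2"/>
  <stop timeout="120"/>
  <restart timeout="720" retry="2"/>
  <!-- Forward ping: seconds between pings (interval) and seconds allowed
       for the managed process to answer (timeout) -->
  <ping timeout="60" interval="30"/>
  <process-set id="default_group" numprocs="1"/>
</process-type>
```

The idea is simply that a larger ping timeout and a more tolerant reverse ping limit give a busy container more room before OPMN decides it is hung and forcefully terminates it.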
