Sun Grid Engine Monitor
Important Note:
The current release of GEMonitor supports Sun Grid Engine 5.3 ONLY. Version
6.0 has different output (so I am told), so GEMonitor will need to be
reworked to support it. That work is in progress. No release date yet.
Introduction
We are currently in the process of migrating from LSF to SGE, so we are
trying to make people's lives easier by giving them a nicer way to
monitor jobs than the standard qstat/qhost commands. I downloaded Grid
Engine Portal, but the installation was too involved and, besides, it
was overkill.
Since I like PHP, I decided to write a web front-end to qstat. The code
currently totals about 500 lines, a third of which is a date conversion
library I downloaded off the net. Below is a sample page of output.
Please note that the time in queue or runtime is actually calculated,
so you will know if a job has been running for too long. Also, some of
the rows are color-coded depending on certain metrics, i.e. light blue
if the job is a sleep job (used for grabbing a token for use of a
particular lab setup), gray for suspended or pending jobs, and red for
jobs where the machine load is close to 0, indicating a problem with
the job.
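
The runtime calculation and the row coloring described above can be
sketched roughly as follows. This is only an illustration in Python, not
GEMonitor's actual PHP code. The timestamp format, the state codes, the
"sleep" name check, the load threshold, the order in which the rules are
applied, and all function names are assumptions made for the example.

```python
from datetime import datetime

# Assumed values for illustration; GEMonitor's real settings are not documented here.
QSTAT_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"      # assumed qstat timestamp layout
LOW_LOAD_THRESHOLD = 0.05                    # assumed meaning of "close to 0"
PENDING_OR_SUSPENDED = {"qw", "s", "S", "T"}  # assumed state codes


def elapsed_seconds(start_text, now):
    """Return whole seconds between a qstat-style timestamp and `now`."""
    started = datetime.strptime(start_text, QSTAT_TIME_FORMAT)
    return int((now - started).total_seconds())


def format_elapsed(seconds):
    """Render a duration in seconds as 'Nd HH:MM:SS'."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours:02d}:{minutes:02d}:{secs:02d}"


def row_color(job_name, state, host_load):
    """Pick a row color; an empty string means the row is not highlighted."""
    # Sleep jobs (token grabbers) are checked first so they are never flagged red.
    if job_name.startswith("sleep"):
        return "lightblue"
    if state in PENDING_OR_SUSPENDED:
        return "gray"
    # A running job on a nearly idle machine probably has a problem.
    if state == "r" and host_load is not None and host_load < LOW_LOAD_THRESHOLD:
        return "red"
    return ""


if __name__ == "__main__":
    now = datetime.now()
    runtime = elapsed_seconds("01/01/2005 09:56:56", now)
    print(format_elapsed(runtime), row_color("simulation", "r", 0.01))
```

Checking the sleep job rule first is a design choice for the sketch: a
sleep job is expected to leave its machine idle, so it should not be
reported as a problem.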
ChangeLog
Version 0.9
- Rework of how information is fetched from qstat
Version 0.8
- New admin page that shows queues that are in trouble. Access it by
going to http://webserver/gemonitor/admin.php
- If a job goes into the error state (i.e. Eqw), show the reason for it
- Minor bugfixes
Download
If you would like to download the source, you can get it
by clicking on the following link: gemonitor-0.9.tar.gz. Old releases are here.
Documentation
The installation document is here.

Please contact me at
vuksan-php@veus.hr