Thursday 13 February 2020

Setting _JAVA_OPTIONS for Trinity in Galaxy configuration

We recently encountered an issue with the Trinity tool running on the compute cluster back-end of our local production Galaxy instance. A cluster admin noticed that Trinity jobs were creating more processes than had been allocated when the jobs were submitted, which overloaded the nodes they'd been dispatched to.

Our Galaxy instance is configured to send Trinity jobs to a special destination defined in the job_conf.xml file:

        ...
        <destination id="jse_drop_trinity" runner="jse_drop">
           <param id="qsub_options">-V -j n -l mem256 -pe smp.pe 12</param>
           <param id="galaxy_slots">12</param>
           <env id="GALAXY_MEMORY_MB">194560</env>
        </destination>
        ...
        <tool id="trinity" destination="jse_drop_trinity" />
        ...

The qsub_options are options for our Grid Engine-based submission system, which dispatches Trinity to a 12-core parallel environment on one of the higher memory nodes on the cluster. The galaxy_slots option tells the job that 12 slots are available, and is passed to Trinity at startup so that it knows how many processes it can start.

These options appeared to be working correctly, so the question was then: where were the extra processes coming from? The admin identified that Trinity runs Java-based components, and that the Java runtime appeared to be starting multiple additional processes for its garbage collection (a process within the Java runtime for managing memory usage and other internal book-keeping operations).

Looking at the output from a Trinity job showed the default command line:

Thursday, February 13, 2020: 10:09:18   CMD: java -Xmx64m -XX:ParallelGCThreads=2  -jar /mnt/rvmi/centaurus/galaxy/production/tool_dependencies/_conda/envs/__trinity@2.8.4/opt/trinity-2.8.4/util/support_scripts/ExitTester.jar 0

which includes -XX:ParallelGCThreads=2 and indicates that each Java process should use 2 threads for garbage collection (GC).

It's possible to override the defaults by setting the desired option in the _JAVA_OPTIONS environment variable when the job is run, and this can be done by adding a new element in the job destination for Trinity:

        <env id="_JAVA_OPTIONS">-XX:ParallelGCThreads=1</env>

(See the section on Environment modifications in the Galaxy documentation for more details.)
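
Putting the pieces together, the complete Trinity destination would look something like the following. This is a sketch assembled from the fragments shown in this post rather than a copy of our actual file; in particular, placing the new env element after the existing GALAXY_MEMORY_MB one is an assumption made for illustration.

        <destination id="jse_drop_trinity" runner="jse_drop">
           <param id="qsub_options">-V -j n -l mem256 -pe smp.pe 12</param>
           <param id="galaxy_slots">12</param>
           <env id="GALAXY_MEMORY_MB">194560</env>
           <!-- Assumed position: limit each Java process to one GC thread -->
           <env id="_JAVA_OPTIONS">-XX:ParallelGCThreads=1</env>
        </destination>

Because the variable is set in the destination rather than globally, only jobs sent to jse_drop_trinity are affected.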

With this in place, subsequent Trinity jobs behaved correctly when submitted to the compute cluster.