Having seen various references to GALAXY_SLOTS on the developers' mailing list, I'd assumed this was some esoteric feature that I would need to set up before using it, but in fact it's almost embarrassingly simple in most cases. Essentially it can be thought of as an internal variable that Galaxy sets when it starts a job to indicate the number of threads available to that job. A tool can then read it and run with that number of threads.
The official documentation can be found here: https://wiki.galaxyproject.org/Admin/Config/GALAXY_SLOTS
and this covers the essential details, but the executive summary is:
- Tool developers should use GALAXY_SLOTS when specifying the number of threads a tool should run with;
- Galaxy admins shouldn't need to configure anything unless they're using the local runner, or (possibly) a novel cluster submission system.
For tool developers
All that is required for tool developers is to specify GALAXY_SLOTS in the <command> tag in the tool XML wrapper when setting the number of threads the tool uses.
The syntax for specifying the variable is:
\${GALAXY_SLOTS:-N}
where N is the default value to use if GALAXY_SLOTS is not set. (See the "Tool XML File syntax" documentation for the
For example, here's a code fragment from the XML wrapper from a tool to run the Trimmomatic program:
The number of threads defaults to 6 unless GALAXY_SLOTS is explicitly set.
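The fallback is ordinary shell parameter expansion, so it can be tried outside Galaxy. What follows is a minimal sketch rather than anything taken from the Trimmomatic wrapper: the thread_count helper is hypothetical, and the default of 6 is chosen only to match the example above. (As far as I understand it, the leading backslash in the wrapper syntax is there so that Galaxy's templating passes the $ through to the shell; it isn't needed in a plain shell session.)

```shell
# Hypothetical helper, not part of Galaxy: print the thread count a tool
# would end up with, falling back to 6 when GALAXY_SLOTS is unset or empty.
thread_count() {
    echo "${GALAXY_SLOTS:-6}"
}

# Case 1: nothing has set GALAXY_SLOTS, so the default is used.
unset GALAXY_SLOTS
echo "GALAXY_SLOTS unset: $(thread_count) threads"

# Case 2: a value is present (as Galaxy would provide for a job),
# so the default is ignored.
GALAXY_SLOTS=4
echo "GALAXY_SLOTS=4:     $(thread_count) threads"

# Leave the shell as we found it.
unset GALAXY_SLOTS
```

Note that the :- form also falls back to the default when the variable is set but empty.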
(Aside: the Trimmomatic tool itself can be obtained from the toolshed at https://toolshed.g2.bx.psu.edu/view/pjbriggs/trimmomatic)
For Galaxy admins
It turns out that generally there is nothing special to do for most cluster systems, although this is not immediately clear from the documentation: in most cases GALAXY_SLOTS is handled automagically and so doesn't require any explicit configuration.
For example for DRMAA (which is what we're using locally), we have job runners defined in our job_conf.xml file like:
In our setup, -pe smp.pe 4 above requests 4 cores for the job. When using this runner, Galaxy will automagically determine the number of cores from DRMAA (i.e. 4) and set GALAXY_SLOTS to the appropriate value - nothing more to do.
The most obvious exception is the "local" job runner, where you need to explicitly set the number of available slots using the <param id="local_slots"> tag in job_conf.xml; see https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster#Local for more details.
Finally, for other job submission systems see the documentation on how to verify that the environment is being set correctly.
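As a rough illustration of what such a check could look like, here is a small diagnostic that reports whether GALAXY_SLOTS is visible in the current environment. This is only a sketch and is not taken from the Galaxy documentation: the report_slots function is hypothetical, and it assumes you have some way of running a throwaway script through the job destination you want to test.

```shell
# Hypothetical diagnostic, not part of Galaxy: report whether GALAXY_SLOTS
# is present in the environment the job actually runs in.
report_slots() {
    # ${VAR+word} expands to "word" only if VAR is set (even if empty),
    # which lets us tell "unset" apart from "set to an empty string".
    if [ -n "${GALAXY_SLOTS+set}" ]; then
        echo "GALAXY_SLOTS=${GALAXY_SLOTS}"
    else
        echo "GALAXY_SLOTS is not set"
    fi
}

report_slots
```

If the output shows the variable is not set for jobs sent through a given runner, that suggests tools there will fall back to whatever default their wrappers specify.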
Thanks for explaining GALAXY_SLOTS. Suppose I have defined 4 threads in the job_conf.xml file while the tool wrapper sets it to 1 (GALAXY_SLOTS:-1). Which one would it pick?
The definition in the tool wrapper sets the default number of threads in the absence of any additional configuration; however, I believe that this is overridden by the definition in the job_conf.xml file.
So in your example, GALAXY_SLOTS would be set to 4 threads when the tool was run.
HTH