PyLauncher at TACC

Last update: February 17, 2025

What is PyLauncher

PyLauncher (Python + Launcher) is a Python-based parametric job launcher: a utility for distributing and executing many small jobs in parallel, using fewer resources than would be necessary to execute all jobs simultaneously. On many batch-based cluster computers this is a better strategy than submitting many small individual jobs.

While TACC's deprecated Launcher utility worked on serial codes, PyLauncher works with multi-threaded and MPI executables.

Example: You need to run a program with 1000 different input values, and you want to use 100 cores for that; PyLauncher will cycle through your list of commands using cores as they become available.

The PyLauncher source code is written in Python, but this need not concern you: in the simplest scenario you use a two-line Python script. For more sophisticated scenarios, the code can be extended or integrated into a Python application.

Installations

PyLauncher is available on all TACC systems via the Lmod modules system. Use the following in your batch script or idev session.

$ module load pylauncher

Important

PyLauncher requires at least Python version 3.9:

$ module load python3/3.9 # or newer

On some systems the Python installation is missing the required paramiko module, and PyLauncher will abort with an error to that effect. In that case, do a one-time setup:

$ pip install paramiko

Basic setup

PyLauncher, like any compute-intensive application, must be invoked from a Slurm job script, or interactively within an idev session. PyLauncher interrogates Slurm's environment variables to query the available computational resources, but the only parameter you have to set is Slurm's -N directive. Other parameters, such as the -n command-line option or Slurm's --ntasks-per-node parameter, are ignored; instead PyLauncher queries the number of cores that are available per node. The number of nodes depends on how many tasks you want to execute; typically you would have more tasks than cores.

#SBATCH -N 5 # number of nodes you want to use

PyLauncher will then use all the cores of these nodes, running by default one commandline per core. See below for exceptions.

The pylauncher module sets the TACC_PYLAUNCHER_DIR and PYTHONPATH environment variables.

Your batch script can then invoke Python3 on the launcher code:

## file: mylauncher.py
import pylauncher 
pylauncher.ClassicLauncher("commandlines")

PyLauncher will now execute the lines in the file commandlines:

# this is the commandlines file
./yourprogram value1
./yourprogram value2

The commands can be as complicated as you wish, e.g.:

mkdir output1 && cd output1 && ../yourprogram value1
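
For a large parameter sweep it is usually easier to generate the commandlines file than to write it by hand. The following is a minimal sketch of one way to do that; the function name `make_commandlines`, the program path, and the `outputN` directory naming are illustrative assumptions and not part of PyLauncher.

```python
from pathlib import Path


def make_commandlines(values, program="../yourprogram"):
    """Return one command per input value, each running in its own output directory.

    Hypothetical helper: the names and directory layout are illustrative only.
    """
    lines = []
    for i, value in enumerate(values, start=1):
        outdir = f"output{i}"
        # mkdir -p tolerates an existing directory
        lines.append(f"mkdir -p {outdir} && cd {outdir} && {program} {value}")
    return lines


def write_commandlines(values, filename="commandlines", program="../yourprogram"):
    """Write the generated commands to a file, one command per line."""
    Path(filename).write_text("\n".join(make_commandlines(values, program)) + "\n")


# Show the generated lines for three sample values
for line in make_commandlines(["value1", "value2", "value3"]):
    print(line)
```

Calling `write_commandlines(...)` before the launcher call produces a file in the format shown above.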

Tip

If the commands use a consecutive input parameter, you can use the string PYLTID which expands to the number of the command.

./yourprogram -n PYLTID #1
./yourprogram -n PYLTID #2
./yourprogram -n PYLTID #3
./yourprogram -n PYLTID #4

At the end of the run, PyLauncher will produce final statistics:

Launcherjob run completed.

total running time: 222.22

# tasks completed: 160
tasks aborted: 0
max runningtime:  97.95
avg runningtime:  36.59
aggregate   : 5854.60
speedup     :  26.35

Host pool of size 40.

Number of tasks executed per node:
max: 11
avg: 4

This reports that 160 commands were executed, using 40 cores. Ideally we would expect a 40 times speedup, but because of variations in run time the aggregate running time of all commands was reduced by only a factor of 26.
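
The sample numbers are consistent with the speedup being the aggregate running time divided by the total running time. The sketch below recomputes it under that assumption and also derives a parallel efficiency; the function name `launcher_speedup` is hypothetical and not part of PyLauncher.

```python
def launcher_speedup(aggregate_seconds, total_seconds, pool_size):
    """Return (speedup, efficiency), assuming speedup = aggregate / total."""
    speedup = aggregate_seconds / total_seconds
    # Efficiency: fraction of the ideal speedup (one per core in the host pool)
    efficiency = speedup / pool_size
    return speedup, efficiency


# Numbers taken from the sample statistics above
speedup, efficiency = launcher_speedup(5854.60, 222.22, 40)
print(f"speedup: {speedup:.2f}, efficiency: {efficiency:.0%}")
```

An efficiency well below 100% usually means that some cores were idle while the longest tasks finished.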

If you want more detailed trace output during the run, add an option:

pylauncher.ClassicLauncher("commandlines",debug="host+job")

Output files

PyLauncher will create a directory "pylauncher_tmp123456" where "123456" maps to the job number. You can set the name of this directory explicitly:

pylauncher.ClassicLauncher("commandlines",workdir="pylauncher_out")

However, note that PyLauncher will not allow you to re-use that directory, so you need to delete it in between runs.

You need to take care of the output of your commandlines explicitly. For instance, your commandlines file could say

mkdir -p myoutput && cd myoutput && ${HOME}/myprogram input1
mkdir -p myoutput && cd myoutput && ${HOME}/myprogram input2
mkdir -p myoutput && cd myoutput && ${HOME}/myprogram input3
...

A file "queuestate" is generated with a listing of which of your commands were successfully executed, and, in case your job times out, which ones were pending or not scheduled. This can be used to restart your job. See below.

Launcher types

The ClassicLauncher uses by default a single core per commandline. The following options / launcher types are available if you want to run multi-threaded, MPI, or GPU-accelerated tasks.

Multi-Threaded

If your program is multi-threaded, you can give each command line more than one core with:

pylauncher.ClassicLauncher("commandlines",cores=4)

This can also be used if your program takes more memory than would normally be assigned to a single core. Say you have nodes with 64 cores each and 128 GB of memory. By default, each of the 64 tasks on the node would then have access to 2 GB. By specifying cores=4, only 16 tasks would be allocated to a node, but each task now has access to 8 GB.
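
The arithmetic above can be sketched as follows. This is only a back-of-the-envelope calculation: the function names are hypothetical, and it ignores memory used by the operating system.

```python
import math


def tasks_and_memory(cores_per_node, mem_per_node_gb, cores_per_task):
    """Return (tasks per node, memory per task in GB) for a given cores= setting."""
    tasks_per_node = cores_per_node // cores_per_task
    return tasks_per_node, mem_per_node_gb / tasks_per_node


def cores_for_memory(cores_per_node, mem_per_node_gb, mem_needed_gb):
    """Smallest cores= value that gives each task at least mem_needed_gb."""
    return math.ceil(mem_needed_gb * cores_per_node / mem_per_node_gb)


print(tasks_and_memory(64, 128, 1))  # default: one core per task
print(tasks_and_memory(64, 128, 4))  # cores=4
print(cores_for_memory(64, 128, 5))  # cores needed for a 5 GB task
```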

If you want each command line to use all the cores of a node, specify:

pylauncher.ClassicLauncher("commandlines",cores="node")

The number of simultaneously running commands is then equal to the number of nodes you requested.

If you have a multi-threaded program and you want to set the number of cores individually for each commandline, use the option cores="file" (literally, the word "file" in quotes) and prefix each commandline with the core count:

5,myprogram value1
2,myprogram value2
7,myprogram value3
# et cetera
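
A commandlines file in this core-prefixed format can also be generated. The sketch below assumes you already have a list of (core count, command) pairs; the helper name `make_core_prefixed_lines` is hypothetical.

```python
def make_core_prefixed_lines(tasks):
    """Format (core_count, command) pairs as 'count,command' lines."""
    lines = []
    for cores, command in tasks:
        if cores < 1:
            raise ValueError(f"core count must be positive, got {cores}")
        lines.append(f"{cores},{command}")
    return lines


# The three sample tasks from the listing above
tasks = [(5, "myprogram value1"), (2, "myprogram value2"), (7, "myprogram value3")]
print("\n".join(make_core_prefixed_lines(tasks)))
```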

MPI

If your program is MPI parallel, replace the ClassicLauncher call:

pylauncher.IbrunLauncher("parallellines",cores=3)

(This also holds for the case of running a Python code that, perhaps indirectly, relies on mpi4py.)

The "parallellines" file consists of command-lines without the MPI job starter, which is supplied by PyLauncher:

./parallelprogram 0 10
./parallelprogram 1 10
./parallelprogram 2 10

By default, these lines are prefixed with the ibrun specification. If your lines need the ibrun in a different location, you can use the placeholder PYL_MPIEXEC to indicate this:

mkdir out1 && cd out1 && PYL_MPIEXEC ./parallelprogram 0 10
mkdir out2 && cd out2 && PYL_MPIEXEC ./parallelprogram 2 10
mkdir out3 && cd out3 && PYL_MPIEXEC ./parallelprogram 3 10

GPU launcher

For GPU jobs, use the GPULauncher. This needs an extra parameter gpuspernode that depends on the cluster where you run. If you omit this parameter or set it too high, the launcher may start your tasks when no GPUs are available. See the user guide for your cluster to find the correct number.

pylauncher.GPULauncher\
    ("gpucommandlines",
     gpuspernode=3 # adjust for the desired cluster
     )

Submit launcher

The SubmitLauncher is the only launcher that should be invoked outside a SLURM job, since it generates SLURM jobs and submits them. This makes sense in rare cases where you have tasks of widely varying runtime, and you want to avoid a regular launcher run in which multiple nodes fall idle towards the end of the job while still racking up SUs.

This launcher has a second compulsory argument after the commandlines file: a specification of the SLURM parameters, the way you would specify them to idev or srun. These indicate how each of your commandlines is run as a separate SLURM job.

Here is a sample call:

import pylauncher

TACCproject = "YourProject" # placeholder: your allocation identifier
queue = "small" # adjust for cluster
maxjobs = 3   # queue limit
workdir = "sublauncher_out"
pylauncher.SubmitLauncher\
    ("submitlines",
     f"-A {TACCproject} -N 1 -n 1 -p {queue} -t 0:5:0", # slurm arguments
     nactive=maxjobs,      # at most this many jobs simultaneously
     maxruntime=900,       # this test should not take too long
     workdir=workdir,
     debug="host+queue+exec+job+task", # lots of debug output
     )

Sample Job Setup

Slurm Job Script File on Frontera

#!/bin/bash
#SBATCH   -p development
#SBATCH   -J pylaunchertest
#SBATCH   -o pylaunchertest.o%j
#SBATCH   -e pylaunchertest.o%j
#SBATCH   --ntasks-per-node 1 # this parameter is ignored
#SBATCH   -N 2
#SBATCH   -t 0:40:00
#SBATCH   -A YourProject

module load python3
module load pylauncher
python3 example_classic_launcher.py

PyLauncher File

where "example_classic_launcher.py" contains:

import pylauncher
pylauncher.ClassicLauncher("commandlines",debug="host+job")

Command Lines File

and "commandlines" contains your parameter sweep.

./myparallelprogram arg1 argA
./myparallelprogram arg1 argB
...

Debugging and tracing

If you want more detailed trace output during the run, add an option:

pylauncher.ClassicLauncher("commandlines",debug="job")

In the launcher invocation, the debug parameter causes trace output to be printed during the run. For example, the debug="job" setting produces output:

tick 104
Queue:
completed  60 jobs: 0-44 47-48 50-53 56 58 60-61 64 66 68 70 75
aborted     0 jobs:
queued      5 jobs: 99-103
running 39 jobs: 45-46 49 54-55 57 59 62-63 65 67 69 71-74 76-98

This states that in the 104th stage 60 jobs were completed, 5 were queued, and 39 were running.
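
If you want to post-process such trace output, the job lists can be expanded with a few lines of Python. This is a sketch based only on the sample output above; the function names are hypothetical, and the trace format may differ between PyLauncher versions.

```python
def count_jobs(range_list):
    """Count the job ids in a space-separated list such as '0-44 47-48 56'."""
    total = 0
    for item in range_list.split():
        if "-" in item:
            first, last = item.split("-")
            total += int(last) - int(first) + 1  # ranges are inclusive
        else:
            total += 1
    return total


def parse_queue_line(line):
    """Split a trace line into (state, reported count, count derived from the ranges)."""
    head, _, ranges = line.partition(":")
    state, reported = head.split()[:2]
    return state, int(reported), count_jobs(ranges)


sample = "completed  60 jobs: 0-44 47-48 50-53 56 58 60-61 64 66 68 70 75"
print(parse_queue_line(sample))
```

Comparing the reported count with the count derived from the ranges is a simple sanity check on the parsing.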

The tick message is output every half second. This can be changed, for instance to 1/10th of a second, by specifying delay=.1 in the launcher command. In some cases, for instance if each command is a Python invocation that does many imports, you may want to increase the delay parameter instead.

For even more trace output, use debug="host+exec+task+job+ssh". (If you need to submit a problem ticket, it helps diagnosis if you run with this full trace output.)

Advanced PyLauncher usage

PyLauncher in an `idev` Session

PyLauncher creates a working directory with a name based on the SLURM job number. PyLauncher will also refuse to reuse a working directory. Together this has implications for running PyLauncher twice in an idev session: after the first run, the second run will complain that the working directory already exists. You have to delete it yourself, or explicitly designate a different working directory name in the launcher command:

pylauncher.ClassicLauncher("mycommandlines",workdir=<unique name>)
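
One way to obtain a unique name is to combine the Slurm job number with a time stamp. The sketch below is an illustration, not PyLauncher functionality: the helper `unique_workdir` and its naming scheme are assumptions, and it relies on Slurm setting the `SLURM_JOB_ID` environment variable inside a job.

```python
import os
import time


def unique_workdir(prefix="pylauncher_out"):
    """Build a directory name that differs between runs within the same Slurm job."""
    jobid = os.environ.get("SLURM_JOB_ID", "nojob")  # falls back outside a job
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = f"{prefix}_{jobid}_{stamp}"
    # If a directory with this name already exists, append a counter
    candidate, counter = name, 1
    while os.path.exists(candidate):
        candidate = f"{name}_{counter}"
        counter += 1
    return candidate


print(unique_workdir())
```

The result can then be passed as `workdir=unique_workdir()` in the launcher call.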

Restart File

PyLauncher leaves behind a restart file titled "queuestate" that lists which commandlines were finished, and which ones were under way, or to be scheduled when the launcher job finished. You can use this in case your launcher job is killed for exceeding the time limit. You can then resume:

pylauncher.ResumeClassicLauncher("queuestate",debug="job")

The default name "queuestate" can be overridden by giving an explicit name

pylauncher.ClassicLauncher("commandlines",queuestate="queuestate5")
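
A launcher script can choose between a fresh start and a resume depending on whether a restart file exists. The following is a sketch under stated assumptions: the helper names are hypothetical, only the two launcher calls shown above are used, and a restart file left over from a completed run should be deleted before starting a new sweep.

```python
import os


def choose_launch_mode(queuestate="queuestate"):
    """Return 'resume' if a restart file exists, otherwise 'fresh'."""
    return "resume" if os.path.isfile(queuestate) else "fresh"


def run(commandlines="commandlines", queuestate="queuestate"):
    """Start or resume a launcher run; call this inside a Slurm job."""
    import pylauncher  # available after `module load pylauncher`

    if choose_launch_mode(queuestate) == "resume":
        pylauncher.ResumeClassicLauncher(queuestate, debug="job")
    else:
        pylauncher.ClassicLauncher(commandlines, queuestate=queuestate, debug="job")


print(choose_launch_mode())
```

In a job script you would call `run()` instead of invoking a launcher directly.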

Submit Launcher

Suppose you allocate 10 nodes to a launcher job, and one commandline takes 10 hours longer than the others. This leads to 9 nodes being idle for several hours. For this sort of use case, consider the SubmitLauncher, which runs outside of Slurm and submits Slurm jobs. For instance, the following command submits jobs to Frontera's [`small` queue](../../hpc/frontera/#table6) and makes sure that the queue limit of 2 nodes is not exceeded:

pylauncher.SubmitLauncher\
    ("commandlines",
    "-A YourProject -N 1 -n 1 -p small -t 0:15:0", # slurm arguments
    nactive=2, # queue limit
    )

Debugging PyLauncher Output

Each PyLauncher run stores output to a unique automatically generated subdirectory based on the job ID.

This directory contains three types of files:

Files with your command lines as they are executed by the launcher. Names: exec0, exec1, etc.
Time stamp files that PyLauncher uses to determine whether commandlines have finished. Names: expire0, expire1, etc.
Standard out/error files. These can be useful if you observe that some commandlines don't finish or don't give the right result. Names: out0, out1, etc.

Parameters

Here are some parameters that may sometimes come in handy.

| parameter | description |
|---|---|
| `delay=fraction` default: `delay=.5` | The fraction of a second that PyLauncher waits to start up new jobs, or test for finished ones. If you fire up complicated Python jobs, you may want to increase this from the default. |
| `workdir=directory` default: generated from the SLURM jobid | This is the location of the internal execute/out/test files that PyLauncher generates. |
| `queuestate=filename` default filename: `queuestate` | This is a file that PyLauncher can use to restart if your job aborts, or is killed for exceeding the time limit. If you run multiple simultaneous jobs, you may want to specify this explicitly. |
| `maxruntime=seconds` default: infinite | Maximum runtime for the launcher job. |