Schedulers
A scheduler's role is to allocate compute infrastructure suitable for running a tier or wrapper layer, as requested by the parent wrapper's orchestrator. This may require:
- Selecting a suitable machine based on the CPU and RAM requests made in a !Job's specification;
- Reserving licenses for commercial tools that are needed to execute a !Job;
- Interfacing with a grid engine such as SGE or Slurm.
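As an illustration of the first point, a scheduler could pick the tightest-fitting machine that satisfies a job's CPU and RAM requests. This is only a sketch: the Machine and Request types and the pick_machine helper are hypothetical names invented for this example and are not part of Gator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    # Hypothetical description of a host's free capacity
    name: str
    free_cores: int
    free_ram_gb: float


@dataclass
class Request:
    # Hypothetical CPU and RAM request taken from a job specification
    cores: int
    ram_gb: float


def pick_machine(machines: List[Machine], req: Request) -> Optional[Machine]:
    """Return the tightest-fitting machine that satisfies the request, or None."""
    fits = [
        m for m in machines
        if m.free_cores >= req.cores and m.free_ram_gb >= req.ram_gb
    ]
    # Prefer the machine that leaves the least spare capacity behind
    return min(
        fits,
        key=lambda m: (m.free_cores - req.cores, m.free_ram_gb - req.ram_gb),
        default=None,
    )


pool = [Machine("small", 2, 4.0), Machine("medium", 8, 32.0), Machine("large", 64, 512.0)]
chosen = pick_machine(pool, Request(cores=4, ram_gb=16.0))
print(chosen.name if chosen else "no machine available")
```

A production scheduler would normally delegate this decision to the grid engine rather than track machine capacity itself.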
Built-in Schedulers
LocalScheduler
LocalScheduler is a simple implementation of the abstract scheduler class that launches all tiers and wrappers on the local machine. This scheduler will not scale to large workloads, but is a good starting point for testing out work specifications.
This is the default scheduler used by Gator, and can be explicitly selected on the command line by using:
Helper Methods
The BaseScheduler class offers some attributes and functions that can be useful to support all schedulers:
- inst.scheduler_id - determines the name of the scheduler, which can be passed to lower-level jobs to re-use the same scheduler. The default behaviour is to convert the class name to lowercase and remove the word scheduler;
- inst.base_command - constructs a standard set of arguments which are needed when launching any job, including the parent websocket URL (--parent), the polling interval for heartbeats and statistic collection (--interval), the scheduler ID (--scheduler), and whether or not all messages should be forwarded up the tree (--all-msg);
- inst.create_command(child) - accepts an instance of the Child dataclass and creates a command string specific to launching that job, using the list of standard arguments generated by inst.base_command and extending them with the child ID and tracking directory specific to the wrapper/tier being launched.
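The way these three helpers build on one another can be sketched as follows. This is a stand-in written for illustration, not Gator's BaseScheduler: the constructor arguments, the fields of Child, the "python3 -m gator" launcher prefix, the --id and --tracking flag names, and the treatment of --all-msg as a boolean switch are all assumptions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class Child:
    # Stand-in for Gator's Child dataclass; both fields are assumptions
    ident: str
    tracking: Path


class SketchBaseScheduler:
    """Illustrative stand-in, not Gator's BaseScheduler."""

    def __init__(self, parent: str, interval: int = 5, all_msg: bool = False) -> None:
        self.parent = parent
        self.interval = interval
        self.all_msg = all_msg

    @property
    def scheduler_id(self) -> str:
        # Lowercase the class name and drop the word "scheduler"
        return type(self).__name__.lower().replace("scheduler", "")

    @property
    def base_command(self) -> List[str]:
        # Arguments common to every launched job
        cmd = [
            "python3", "-m", "gator",  # launcher prefix is an assumption
            "--parent", self.parent,
            "--interval", str(self.interval),
            "--scheduler", self.scheduler_id,
        ]
        if self.all_msg:
            cmd.append("--all-msg")  # assumed to be a boolean switch
        return cmd

    def create_command(self, child: Child) -> str:
        # Extend the common arguments with the child-specific ones;
        # "--id" and "--tracking" are assumed flag names
        extra = ["--id", child.ident, "--tracking", str(child.tracking)]
        return " ".join(self.base_command + extra)


class LocalScheduler(SketchBaseScheduler):
    pass


sched = LocalScheduler(parent="localhost:1234")
print(sched.scheduler_id)
print(sched.create_command(Child("job_a", Path("tracking/job_a"))))
```

Because scheduler_id is derived from the class name, a subclass named LocalScheduler yields the ID "local" under this default rule without any extra code.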
Custom Schedulers
Custom schedulers must extend the gator.scheduler.common.BaseScheduler base class and implement at least the launch and wait_for_all methods, as described below:
gator.scheduler.common.BaseScheduler.launch(tasks) (abstractmethod, async)
Launch all given tasks onto the compute infrastructure. This function is asynchronous, but it should return as soon as all tasks have been launched (i.e. it should not block until the tasks complete).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tasks | List[Child] | List of Child objects to schedule | required |
gator.scheduler.common.BaseScheduler.wait_for_all() (abstractmethod, async)
Wait for all tasks previously launched to complete by polling the compute infrastructure.
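A minimal sketch of the split between launch and wait_for_all is shown below. To keep it runnable without Gator installed, BaseScheduler and Child are redefined here as stand-ins with assumed fields; SubprocessScheduler and command_for are hypothetical names, and the placeholder command simply prints a message. Waiting on local processes takes the place of polling real compute infrastructure.

```python
import asyncio
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Child:
    # Stand-in for Gator's Child dataclass; "ident" is an assumed field
    ident: str


class BaseScheduler(ABC):
    """Stand-in exposing the two abstract methods described above."""

    @abstractmethod
    async def launch(self, tasks: List[Child]) -> None: ...

    @abstractmethod
    async def wait_for_all(self) -> None: ...


class SubprocessScheduler(BaseScheduler):
    """Hypothetical scheduler that runs each task as a local subprocess."""

    def __init__(self) -> None:
        self.procs: List[asyncio.subprocess.Process] = []

    def command_for(self, child: Child) -> List[str]:
        # Placeholder command; a real scheduler would build the job's launch command
        return [sys.executable, "-c", f"print('running {child.ident}')"]

    async def launch(self, tasks: List[Child]) -> None:
        # Start every process, then return without waiting for completion
        for child in tasks:
            proc = await asyncio.create_subprocess_exec(*self.command_for(child))
            self.procs.append(proc)

    async def wait_for_all(self) -> None:
        # Block until every previously launched process has exited
        await asyncio.gather(*(proc.wait() for proc in self.procs))


async def main() -> List[int]:
    sched = SubprocessScheduler()
    await sched.launch([Child("a"), Child("b")])
    await sched.wait_for_all()
    return [proc.returncode for proc in sched.procs]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping the process handles on the instance is what lets wait_for_all be called later and independently of launch; a grid-engine scheduler would store job IDs in the same way and poll the engine for their status.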