mstk.scheduler.Scheduler
- class mstk.scheduler.Scheduler
Base class for job schedulers.
Scheduler should not be constructed directly. Use its subclasses instead.
- username
The current user
- Type:
str
- sh
The default name of the job script
- Type:
str
- job_parameter
The default parameters for submitting a job
- Type:
- cached_jobs_expire
The lifetime of cached jobs in seconds.
- Type:
int
Methods
__init__()
download(**kwargs) – Download the simulation files to target folder.
generate_sh(commands, name, params[, ...]) – Generate a shell script for commands to be executed by the job scheduler on compute nodes.
get_job_from_name(name) – Get the job with specified name.
get_jobs([use_cache]) – Retrieve all the jobs that are currently managed by job scheduler.
is_running(name) – Check whether a job is pending or running (not killed or finished or failed).
is_working() – Whether this job scheduler is running normally.
kill_job(name) – Kill a job which has the specified name.
submit([sh]) – Submit a job script to scheduler.
upload(**kwargs) – Upload the simulation files to target folder.
Attributes
is_remote – Whether this is a remote job scheduler
- is_remote = False
Whether this is a remote job scheduler
- is_working()
Whether this job scheduler is running normally.
- Returns:
is
- Return type:
bool
- generate_sh(commands, name, params, workdir=None, sh=None)
Generate a shell script for commands to be executed by the job scheduler on compute nodes.
- Parameters:
commands (str) – List of commands to be executed by the job scheduler on the compute node, step by step.
name (str) – The name of the job to be submitted.
params (JobParameter) – The parameters for the job.
workdir (str, optional) – The working directory.
sh (str, optional) – The name (path) of the shell script being written. If not set, the default sh is used.
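As a rough illustration of what generating a job script involves, the sketch below writes a Slurm-style script from a list of commands. The function write_job_script is a hypothetical helper and not part of mstk; the actual generate_sh is implemented by the Scheduler subclasses and also takes a JobParameter, which is omitted here. The two #SBATCH options used are standard sbatch options.

```python
import os
import tempfile


def write_job_script(commands, name, workdir=None, sh="_job.sh"):
    """Hypothetical helper (not mstk code): write a Slurm-style job script."""
    lines = ["#!/bin/bash", f"#SBATCH --job-name={name}"]
    if workdir is not None:
        # Ask the scheduler to run the job from this directory.
        lines.append(f"#SBATCH --chdir={workdir}")
    lines.append("")
    # Commands run on the compute node in the order given.
    lines.extend(commands)
    with open(sh, "w") as f:
        f.write("\n".join(lines) + "\n")
    return sh


# Usage: write the script into a temporary directory and read it back.
script_path = os.path.join(tempfile.mkdtemp(), "_job.sh")
write_job_script(["echo start", "echo done"], "demo", workdir="/tmp", sh=script_path)
with open(script_path) as f:
    script_text = f.read()
```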
- upload(**kwargs) → bool
Upload the simulation files to target folder.
This method should be implemented by subclasses that are remote schedulers, as indicated by the attribute is_remote. If the scheduler is not remote, it simply returns True.
- Returns:
successful – Whether the upload is successful
- Return type:
bool
- download(**kwargs) → bool
Download the simulation files to target folder.
This method should be implemented by subclasses that are remote schedulers, as indicated by the attribute is_remote. If the scheduler is not remote, it simply returns True.
- Returns:
successful – Whether the download is successful
- Return type:
bool
- submit(sh=None, **kwargs)
Submit a job script to scheduler.
- Parameters:
sh (str, optional) – The file name of the job script. If not set, the default sh is used.
- Returns:
id – Job ID, or -1 if the submission failed
- Return type:
int
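Because submit signals failure with -1 and upload with False, callers may want to check both results before continuing. The sketch below shows one way to do that. StubScheduler and upload_and_submit are hypothetical names introduced only for illustration; the stub imitates the documented return values and is not an mstk class.

```python
class StubScheduler:
    """Stand-in exposing the documented method names; not an mstk class."""

    is_remote = False

    def __init__(self, next_id):
        self._next_id = next_id

    def upload(self, **kwargs):
        # A non-remote scheduler is documented to simply return True.
        return True

    def submit(self, sh=None, **kwargs):
        # Returns a job ID, or -1 to imitate a failed submission.
        return self._next_id


def upload_and_submit(scheduler, sh=None):
    """Upload files, submit the script, and raise on either failure."""
    if not scheduler.upload():
        raise RuntimeError("upload failed")
    job_id = scheduler.submit(sh)
    if job_id == -1:
        raise RuntimeError("job submission failed")
    return job_id


job_id = upload_and_submit(StubScheduler(next_id=42))
```

With a real subclass instance in place of the stub, the same helper would work unchanged as long as the subclass follows the return conventions described above.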
- get_job_from_name(name)
Get the job with specified name.
If no such job can be found, None is returned. If several jobs have the same name, the most recently submitted one is returned.
- Parameters:
name (str) –
- Returns:
job
- Return type:
PbsJob or None
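The selection rule described above (None when nothing matches, otherwise the most recently submitted job) can be sketched as follows. The Job dataclass and its submit_time field are assumptions made for this example; the attributes of the real PbsJob are not described in this section.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions, not PbsJob's."""

    id: int
    name: str
    submit_time: float  # seconds since some reference point


def latest_job_with_name(jobs, name):
    """Return the most recently submitted job called `name`, or None."""
    matches = [job for job in jobs if job.name == name]
    if not matches:
        return None
    return max(matches, key=lambda job: job.submit_time)


jobs = [
    Job(id=1, name="md", submit_time=100.0),
    Job(id=2, name="npt", submit_time=150.0),
    Job(id=3, name="md", submit_time=200.0),
]
latest = latest_job_with_name(jobs, "md")
missing = latest_job_with_name(jobs, "not-submitted")
```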
- is_running(name)
Check whether a job is pending or running (not killed or finished or failed).
- Parameters:
name (str) –
- Returns:
is
- Return type:
bool
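A common use of is_running is to poll until a job has left the queue. The sketch below is one possible polling loop; wait_until_finished, its poll_interval and timeout parameters, and StubScheduler are all hypothetical and only rely on an object that provides is_running(name) as documented.

```python
import time


def wait_until_finished(scheduler, name, poll_interval=30.0, timeout=None):
    """Poll is_running until it returns False; return False on timeout."""
    start = time.monotonic()
    while scheduler.is_running(name):
        if timeout is not None and time.monotonic() - start >= timeout:
            return False
        time.sleep(poll_interval)
    return True


class StubScheduler:
    """Stand-in that reports 'running' for a fixed number of polls."""

    def __init__(self, running_polls):
        self._remaining = running_polls

    def is_running(self, name):
        if self._remaining > 0:
            self._remaining -= 1
            return True
        return False


finished = wait_until_finished(StubScheduler(2), "demo", poll_interval=0.0)
```

A fairly long poll_interval is the sensible default in practice, since each call may query the job scheduler.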
- kill_job(name)
Kill a job which has the specified name.
- Parameters:
name (str) –
- Returns:
killed
- Return type:
bool
- get_jobs(use_cache=True)
Retrieve all the jobs that are currently managed by job scheduler.
It calls scontrol show job for Slurm or qstat -f -u for Torque to get the list of jobs. Jobs that finished long ago (how long depends on the scheduler configuration on the machine) may no longer appear in the scheduler's output.
To avoid putting too much load on the job scheduler, a cache can be used to store the jobs. If use_cache is True and the cache has not expired, this method returns the cached results without calling scontrol or qstat. The lifetime of the cache, in seconds, is determined by the attribute cached_jobs_expire.
- Parameters:
use_cache (bool) – Whether to use the cached job list
- Returns:
jobs
- Return type:
list of PbsJob
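The caching behaviour described for get_jobs can be sketched as a small time-based cache. This is an illustration of the idea only, not mstk's implementation: CachedJobList, its fetch callback, and the injectable clock are hypothetical, and the expire argument plays the role of cached_jobs_expire.

```python
import time


class CachedJobList:
    """Hypothetical time-based cache around an expensive job query."""

    def __init__(self, fetch, expire=60, clock=time.monotonic):
        self._fetch = fetch      # callable that queries the scheduler
        self._expire = expire    # cache lifetime in seconds
        self._clock = clock      # injectable so the demo can fake time
        self._jobs = []
        self._fetched_at = None

    def get_jobs(self, use_cache=True):
        now = self._clock()
        fresh = (
            self._fetched_at is not None
            and now - self._fetched_at < self._expire
        )
        if use_cache and fresh:
            return self._jobs
        # Cache disabled, empty, or expired: query again and remember when.
        self._jobs = self._fetch()
        self._fetched_at = now
        return self._jobs


# Demo with a fake clock and a fetch function that records each real query.
calls = []
fake_now = [0.0]


def fake_fetch():
    calls.append(fake_now[0])
    return ["job-a", "job-b"]


cache = CachedJobList(fake_fetch, expire=60, clock=lambda: fake_now[0])
cache.get_jobs()                 # first call queries the scheduler
cache.get_jobs()                 # served from the cache
fake_now[0] = 61.0
cache.get_jobs()                 # cache expired, queries again
cache.get_jobs(use_cache=False)  # cache bypassed, queries again
```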