mstk.scheduler.RemoteSlurm¶
- class mstk.scheduler.RemoteSlurm(host, username, remote_dir, port=22)¶
Slurm job scheduler running on a remote machine.
- Parameters:
host (str) – The IP address of the remote host that is running Slurm
port (int) – The SSH port for logging in to the remote host
username (str) – The username for logging in to the remote host
remote_dir (str) – The default directory to use on the remote host for running calculations
- host¶
The IP address of the remote host that is running Slurm
- Type:
str
- port¶
The SSH port for logging in to the remote host
- Type:
int
- username¶
The username for logging in to the remote host
- Type:
str
- remote_dir¶
The default directory to use on the remote host for running calculations
- Type:
str
- sh¶
The default name of the job script
- Type:
str
- job_parameter¶
The default parameters for submitting a job
- Type:
- submit_cmd¶
The command for submitting the job script. It is sbatch by default, but extra arguments can be provided, e.g. sbatch --qos=debug.
- Type:
str
- cached_jobs_expire¶
The lifetime of cached jobs in seconds.
- Type:
int
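The submit_cmd attribute above is a single string that may carry extra arguments. As a minimal sketch of how such a string could be turned into an argument list, the snippet below uses the standard library's shlex module. The helper name build_submit_argv and the script name job.sh are hypothetical and chosen for illustration only; this is not necessarily how mstk assembles the command internally.

```python
import shlex


def build_submit_argv(submit_cmd: str, script: str) -> list[str]:
    """Split a submit command string into arguments and append the script name.

    Hypothetical helper for illustration; not part of mstk.
    """
    # shlex.split follows shell-style quoting rules, so a quoted
    # argument containing spaces stays together as one item.
    return shlex.split(submit_cmd) + [script]


# The default command and the variant with an extra argument.
print(build_submit_argv("sbatch", "job.sh"))
print(build_submit_argv("sbatch --qos=debug", "job.sh"))
```

Keeping the command as one string makes it easy to configure, while splitting it only at submission time avoids quoting problems when the arguments are passed on.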
Methods
__init__(host, username, remote_dir[, port])
download([remote_dir, local_dir]) – Download all the files in the remote directory to the current local directory.
generate_sh(commands, name[, parameter, ...]) – Generate a shell script for commands to be executed by the job scheduler on compute nodes.
get_job_from_name(name) – Get the job with the specified name.
get_jobs([use_cache]) – Retrieve all the jobs that are currently managed by the job scheduler.
is_running(name) – Check whether a job is pending or running (not killed, finished, or failed).
is_working() – Check whether Slurm is working normally on the remote machine.
kill_job(name) – Kill the job with the specified name.
submit([sh, remote_dir]) – Submit a job script to the Slurm scheduler on the remote machine.
upload([local_dir, remote_dir]) – Upload all the files in the current local directory to the remote directory.
Attributes
is_remote – Whether this is a remote job scheduler
- is_remote = True¶
Whether this is a remote job scheduler
- is_working() bool¶
Check whether Slurm is working normally on the remote machine.
It calls sinfo --version and checks the output.
- Returns:
is
- Return type:
bool
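To illustrate the kind of check described for is_working(), here is a minimal sketch that inspects text captured from sinfo --version. It assumes the command prints a line such as "slurm 23.02.1" (some distributions print a variant such as "slurm-wlm ..."). The function name looks_like_slurm_version is hypothetical, and the actual test performed by mstk may differ.

```python
import re

# Assumption: the version line starts with "slurm" (optionally followed by a
# suffix such as "-wlm"), then whitespace, then a dotted version number.
_VERSION_RE = re.compile(r"^slurm\S*\s+\d+(\.\d+)*", re.IGNORECASE)


def looks_like_slurm_version(output: str) -> bool:
    """Return True if the text looks like the output of `sinfo --version`.

    Hypothetical helper for illustration; not part of mstk.
    """
    return bool(_VERSION_RE.match(output.strip()))


print(looks_like_slurm_version("slurm 23.02.1\n"))
print(looks_like_slurm_version("bash: sinfo: command not found"))
```

Matching on the version line rather than only on the exit status also catches the case where the SSH login succeeds but Slurm is not installed or not on the PATH.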
- upload(local_dir=None, remote_dir=None)¶
Upload all the files in the current local directory to the remote directory.
- Parameters:
local_dir (dir, optional) – If not set, will use the current dir.
remote_dir (dir, optional) – If not set, will use the default remote_dir.
- Returns:
successful – Whether the upload is successful
- Return type:
bool
- download(remote_dir=None, local_dir=None) bool¶
Download all the files in the remote directory to the current local directory.
- Parameters:
remote_dir (dir, optional) – If not set, will use the default remote_dir.
local_dir (dir, optional) – If not set, will use the current dir.
- Returns:
successful – Whether the download is successful
- Return type:
bool
- submit(sh=None, remote_dir=None)¶
Submit a job script to the Slurm scheduler on the remote machine.
- Parameters:
sh (str) – The job script to be submitted.
remote_dir (str) – The directory on the remote machine from which the script is submitted.
- Returns:
id
- Return type:
int
- kill_job(name) bool¶
Kill a job which has the specified name.
- Parameters:
name (str) – The name of the job to kill.
- Returns:
killed
- Return type:
bool
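The methods documented above can be combined into a simple upload, submit, wait, download cycle. The following is a minimal sketch that works with any object exposing is_working(), upload(), submit(), is_running(name) and download() with the signatures listed on this page. The function run_remote_job and its poll_interval and max_polls parameters are hypothetical, and the sketch assumes the job script already exists in the local directory under the scheduler's default name.

```python
import time


def run_remote_job(scheduler, name, poll_interval=30.0, max_polls=1000):
    """Upload inputs, submit the job, wait until it stops running, download results.

    Hypothetical helper for illustration; not part of mstk.
    Returns True only if every step reported success within max_polls checks.
    """
    # Stop early if Slurm is not reachable on the remote machine.
    if not scheduler.is_working():
        return False
    # Copy the current local directory to the default remote directory.
    if not scheduler.upload():
        return False
    # Submit the default job script from the default remote directory.
    scheduler.submit()
    # Poll until the job is neither pending nor running.
    for _ in range(max_polls):
        if not scheduler.is_running(name):
            return bool(scheduler.download())
        time.sleep(poll_interval)
    # The job was still pending or running after max_polls checks.
    return False
```

With a real scheduler this might be called as run_remote_job(scheduler, 'my_job'), where 'my_job' is the name given to the job when its script was generated. Bounding the loop with max_polls keeps the caller from waiting forever if the job never leaves the queue.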