dpgen.remote package

Submodules

dpgen.remote.RemoteJob module

class dpgen.remote.RemoteJob.CloudMachineJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

Methods

block_call

block_checkcall

check_status

clean

download

get_job_root

submit

upload

check_status()
submit(job_dirs, cmd, args=None, resources=None)
class dpgen.remote.RemoteJob.JobStatus(value)

Bases: Enum

An enumeration.

finished = 5
running = 3
terminated = 4
unknown = 100
unsubmitted = 1
waiting = 2
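
The member values listed above can be looked up by value or by name like any Python Enum. The following is a minimal sketch that uses a local stand-in class so that it runs without dpgen installed; the helper `is_done` is hypothetical and is not part of the documented API.

```python
from enum import Enum


class JobStatus(Enum):
    """Local stand-in mirroring the member values listed above."""
    unsubmitted = 1
    waiting = 2
    running = 3
    terminated = 4
    finished = 5
    unknown = 100


def is_done(status):
    # Hypothetical helper: treat only finished and terminated as end states.
    return status in (JobStatus.finished, JobStatus.terminated)


print(JobStatus(3).name)             # look up a member by its value
print(is_done(JobStatus.running))
print(is_done(JobStatus.finished))
```

A polling loop built on check_status() could use a predicate of this kind to decide when to stop waiting, although how the real classes are polled is not described on this page.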
class dpgen.remote.RemoteJob.LSFJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

Methods

block_call

block_checkcall

check_limit

check_status

clean

download

get_job_root

submit

upload

check_limit(task_max)
check_status()
submit(job_dirs, cmd, args=None, resources=None, restart=False)
class dpgen.remote.RemoteJob.PBSJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

Methods

block_call

block_checkcall

check_status

clean

download

get_job_root

submit

upload

check_status()
submit(job_dirs, cmd, args=None, resources=None)
class dpgen.remote.RemoteJob.RemoteJob(ssh_session, local_root, job_uuid=None)

Bases: object

Methods

block_call

block_checkcall

clean

download

get_job_root

upload

block_call(cmd)
block_checkcall(cmd)
clean()
download(job_dirs, remote_down_files, back_error=False)
get_job_root()
upload(job_dirs, local_up_files, dereference=True)
class dpgen.remote.RemoteJob.SSHSession(jdata)

Bases: object

Methods

close

get_session_root

get_ssh_client

close()
get_session_root()
get_ssh_client()
class dpgen.remote.RemoteJob.SlurmJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

Methods

block_call

block_checkcall

check_status

clean

download

get_job_root

submit

upload

check_status()
submit(job_dirs, cmd, args=None, resources=None, restart=False)
class dpgen.remote.RemoteJob.awsMachineJob(remote_root, work_path, job_uuid=None)

Bases: object

Methods

download

upload

download(job_dir, remote_down_files, dereference=True)
upload(job_dir, local_up_files, dereference=True)

dpgen.remote.decide_machine module

dpgen.remote.decide_machine.convert_mdata(mdata, task_types=['train', 'model_devi', 'fp'])

Convert mdata for the DP-GEN main process. The new convention is a nested layout such as mdata["fp"]["machine"], whereas DP-GEN needs the flat form mdata["fp_machine"].

Note that the feature that automatically selected the most available machine has been deprecated, since it was used only by Angus and supported only Slurm. It may be implemented again in the future.

Parameters
mdata : dict

Machine parameters to be converted.

task_types : list of str

Types of tasks; the default is ["train", "model_devi", "fp"].

Returns
dict

The converted mdata.
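
The key renaming described above can be illustrated with a small, self-contained sketch. It is an assumption-laden illustration and not the dpgen implementation: the function name `flatten_mdata_sketch` and the sample keys `batch_type` and `node_cpu` are hypothetical, and the real convert_mdata may handle additional layouts or keys.

```python
def flatten_mdata_sketch(mdata, task_types=("train", "model_devi", "fp")):
    """Hypothetical illustration: expose mdata[task][key] as mdata[task + "_" + key]."""
    converted = dict(mdata)  # shallow copy so the input dict is left untouched
    for task in task_types:
        section = mdata.get(task)
        if not isinstance(section, dict):
            continue  # task missing, or not stored in the nested layout
        for key, value in section.items():
            converted[f"{task}_{key}"] = value
    return converted


# Sample input; the inner keys are made up for the example.
nested = {
    "fp": {"machine": {"batch_type": "slurm"}, "resources": {"node_cpu": 4}},
}
flat = flatten_mdata_sketch(nested)
print(flat["fp_machine"])
print(flat["fp_resources"])
```

The sketch keeps the nested entries alongside the flat ones; whether the real function keeps or drops them is not stated here.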

dpgen.remote.group_jobs module

class dpgen.remote.group_jobs.PMap(path, fname='pmap.json')

Bases: object

Path map class that operates on (reads, writes, deletes) the pmap.json file.

Methods

delete

dump

load

delete()
dump(pmap, indent=4)
load()
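
To make the read, write, and delete cycle concrete, here is a minimal sketch of a JSON-backed path map. It is a stand-in written for illustration: the class name `PathMapSketch`, the attribute `f_path_map`, the empty-dict fallback in `load`, and the sample mapping are all assumptions, and the real PMap may behave differently.

```python
import json
import os
import tempfile


class PathMapSketch:
    """Hypothetical stand-in for a path-map helper backed by a JSON file."""

    def __init__(self, path, fname="pmap.json"):
        self.f_path_map = os.path.join(path, fname)

    def load(self):
        # Return the stored mapping, or an empty dict if no file exists yet.
        if not os.path.isfile(self.f_path_map):
            return {}
        with open(self.f_path_map) as fp:
            return json.load(fp)

    def dump(self, pmap, indent=4):
        # Overwrite the file with the given mapping.
        with open(self.f_path_map, "w") as fp:
            json.dump(pmap, fp, indent=indent)

    def delete(self):
        # Remove the file if it is present; do nothing otherwise.
        if os.path.isfile(self.f_path_map):
            os.remove(self.f_path_map)


with tempfile.TemporaryDirectory() as workdir:
    pm = PathMapSketch(workdir)
    pm.dump({"task.000": "remote/abc"})
    loaded = pm.load()
    pm.delete()
    after_delete = pm.load()

print(loaded, after_delete)
```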
dpgen.remote.group_jobs.aws_submit_jobs(machine, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, forward_task_deference=True)
dpgen.remote.group_jobs.group_local_jobs(ssh_sess, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, forward_task_deference=True)
dpgen.remote.group_jobs.group_slurm_jobs(ssh_sess, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, remote_job=<class 'dpgen.remote.RemoteJob.SlurmJob'>, forward_task_deference=True)
dpgen.remote.group_jobs.ucloud_submit_jobs(machine, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, forward_task_deference=True)
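
All four functions take a tasks list and a group_size. One plausible reading, which is an assumption and is not stated on this page, is that tasks are split into consecutive groups of at most group_size items and that each group is submitted as one job. The helper `chunk_tasks` below is hypothetical and only sketches that splitting step.

```python
def chunk_tasks(tasks, group_size):
    """Split tasks into consecutive groups of at most group_size items."""
    if group_size < 1:
        raise ValueError("group_size must be a positive integer")
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]


# Three illustrative task directory names, grouped two at a time.
groups = chunk_tasks(["task.000", "task.001", "task.002"], 2)
print(groups)
```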