3.5. Job manager

The job manager is the "director" of the PLAMS environment. It keeps track of all the jobs you run, manages the main working folder, assigns job folders to jobs, and prevents the same job from being run multiple times.

Every JobManager instance is associated with a working folder. This folder is created when the JobManager instance is initialized, and all jobs managed by this instance have their job folders inside it. You should not change a job manager's working folder after it has been created.

When the PLAMS environment is initialized with the init() function, a JobManager instance is created and stored in config.jm. This instance is tied to the PLAMS main working folder (see Master script for details) and is used by default whenever some interaction with a job manager is required. Under normal circumstances you never explicitly touch any JobManager instance (create it manually, call any of its methods, explore its data, etc.). All interaction is handled automatically by run() or other methods.

Note

There is usually no need to use any job manager other than the default one. Splitting your work between multiple JobManager instances can lead to problems (different instances don't communicate with each other, so rerun prevention cannot work properly).

However, it is possible to manually create another JobManager instance (with a different working folder) and use it for part of your jobs (by passing it as the jobmanager keyword argument to run()). If you decide to do so, make sure to pass all the JobManager instances you created manually to finish() (as a list).

An example application could be a script that runs jobs on many different machines (for example via SSH), with a separate JobManager on each of them.
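As an illustration, here is a minimal hedged sketch of such a setup. EchoJob is a hypothetical trivial job class defined only for this example (any SingleJob subclass would do), and the settings keys mirror those listed in the API section below:

from scm.plams import init, finish, JobManager, Settings, SingleJob

class EchoJob(SingleJob):
    """A trivial job used for illustration only."""
    def get_input(self):
        return 'hello\n'
    def get_runscript(self):
        return 'echo hello\n'

init()   # creates the default job manager and stores it in config.jm

jm_settings = Settings()
jm_settings.hashing = 'input'
jm_settings.counter_len = 3
jm_settings.remove_empty_directories = False
extra_jm = JobManager(jm_settings, folder='side_jobs')

EchoJob(name='default_managed').run()                  # handled by config.jm
EchoJob(name='side_managed').run(jobmanager=extra_jm)  # handled by extra_jm

finish([extra_jm])   # manually created job managers go to finish() as a list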

3.5.1. Rerun prevention

In some applications that run large numbers of automatically generated jobs (especially subsystem-based approaches), it can happen that two or more jobs are identical. PLAMS has a built-in mechanism to detect such situations and avoid unnecessary work.

Inside run(), just before the actual job execution, a unique job identifier (called a hash) is calculated. The job manager stores the hashes of all previously run jobs and checks whether the hash of the job you are trying to execute is already present. If such a situation is detected, no execution takes place and the results of the previous job are used. The files in the previous job's folder are either copied or linked to the current job's folder, based on the link_files key in the previous job's settings.

Note

Linking is done with hard links. Hard links are not supported on Windows machines, so if you run PLAMS under Windows, the results are always copied.
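To make the mechanism concrete, here is a hedged sketch reusing the EchoJob class from the earlier example:

job1 = EchoJob(name='echo')
job1.settings.link_files = False   # ask for copying instead of hard-linking
job1.run()

job2 = EchoJob(name='echo')   # identical input, gets a numbered name on registration
job2.run()   # hash match detected: nothing is executed, job1's results are reused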

The key to making the whole rerun prevention mechanism work properly is the hash() function. It needs to produce different hashes for different jobs and exactly the same hashes for jobs that do exactly the same work. It is difficult to come up with a scheme that works well for all kinds of external binaries, since the technical details of job preparation can differ a lot. The currently implemented method calculates the SHA256 hash of the input and/or runscript contents. The value of the hashing key in the job manager's settings can be one of the following: 'input', 'runscript', 'input+runscript' (or None to disable rerun prevention).

If you decide to implement your own hashing method, you can do so by overriding hash_input() and/or hash_runscript().
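For example, if a job depends on an external data file referenced by a relative path (see the warning below), that file's contents can be folded into the hash. A hedged sketch, where FileAwareJob and its extra_file attribute are illustrative assumptions rather than part of the PLAMS API:

import hashlib
from scm.plams import SingleJob

class FileAwareJob(SingleJob):
    def get_input(self):
        return 'process extra_data.txt\n'
    def get_runscript(self):
        return 'echo done\n'
    def hash_input(self):
        # Combine the input text with the contents of the external file, so
        # that two jobs using different data files get different hashes.
        h = hashlib.sha256(self.get_input().encode('utf-8'))
        with open(self.extra_file, 'rb') as f:   # hypothetical attribute
            h.update(f.read())
        return h.hexdigest()

job = FileAwareJob(name='uses_data')
job.extra_file = 'extra_data.txt'   # set the hypothetical attribute by hand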

Warning

It may happen that two jobs with exactly the same input and runscript files correspond to different work (for example, if they rely on some external file that is supplied using a relative path). Pay special attention to such cases. If you are experiencing problems (PLAMS sees two different jobs as the same one), disable the rerun prevention (config.jm.settings.hashing = None).

In the current implementation, hashing is disabled for MultiJob instances, since they don't have an input or a runscript. Of course, single jobs that are children of a multijob are hashed in the normal way, so trying to run exactly the same MultiJob as one run before will not trigger rerun prevention at the MultiJob level, but rather for each child job separately.
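A hedged sketch of this behavior, again reusing the EchoJob class from the first example:

from scm.plams import MultiJob

mj1 = MultiJob(name='batch', children=[EchoJob(name='a'), EchoJob(name='b')])
mj1.run()

# The MultiJob itself is never hashed, so the copy below is not skipped as a
# whole, but each of its children matches a previous hash and is skipped:
mj2 = MultiJob(name='batch', children=[EchoJob(name='a'), EchoJob(name='b')])
mj2.run()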

3.5.2. Pickling

The lifespan of all elements that are part of the PLAMS environment is limited to a single script. That means every script you run uses its own independent job manager, working folder and config settings. These objects are initialized at the beginning of the script with the init() command and they cease to exist when the script ends. Likewise, all settings adjustments (apart from those done by editing plams_defaults) are local to one script.

Hence the job manager used in the current script is not aware of any jobs run in past scripts. However, in some cases it is very useful to be able to import a previously run job into the current script and use its results or build new jobs based on it. For that purpose PLAMS offers a data preserving mechanism for job objects. Every time a job execution finishes successfully (see Running a job), the whole job object is saved to a .dill file using the Python mechanism called pickling.

Note

The default Python pickling package, pickle, is not sufficient to handle some commonly occurring PLAMS objects. Fortunately, the dill package provides an excellent replacement for pickle, following the same interface and being able to save and load almost everything. It is strongly recommended to use dill to ensure the proper operation of the PLAMS data preserving mechanism. However, if dill is not installed in the Python interpreter you're using to run PLAMS, the regular pickle package will be used instead (which can work if your Job objects are not too fancy, but in most cases it will most likely fail). Please use dill, it's free, easy to get, and awesome.

Such a .dill file can be loaded in future scripts using the load() function:

>>> oldjob = load('/home/user/science/plams.12345/myjob/myjob.dill')

This operation brings back the old Job instance in (almost) exactly the same state it was in just after its execution finished.

Technical

The Python pickling mechanism follows references in pickled objects. That means if the object you are trying to pickle contains a reference to another object (like a Job instance having a reference to a Results instance), that other object is saved too. Thanks to that, there are no "empty" references in the object after unpickling.

However, every Job instance in PLAMS has a reference to the job manager, which in turn has references to all other jobs, so pickling one job would effectively mean pickling almost the whole environment. To avoid that, a Job instance is prepared for pickling by removing references to "global" objects, as well as some purely local attributes (for example, the path to the job folder). During loading, all the removed data is replaced with "proper" values (the current job manager, the current path to the job folder, and so on).

Note

There is a way to extend the mechanism explained in the box above. If your Job object has an attribute containing a reference to another object that you don't want to be saved together with the job, you can add this object's name to the job's _dont_pickle list:

myjob.something = some_big_and_clumsy_object_you_dont_want_to_pickle
myjob._dont_pickle = ['something'] #or myjob._dont_pickle.append('something')

That way the big clumsy object will not be stored in the .dill file. After loading such a .dill file, the value of myjob.something will simply be None.

_dont_pickle is an attribute of every Job instance, initialized by the constructor to an empty list. It does not contain the names of attributes that are always removed (like jobmanager, for example), only additional ones defined by the user (see Job.__getstate__).

As mentioned above, saving the job happens at the end of run(). The decision whether a job should be pickled is based on the pickle key in the job's settings, so it can be adjusted for each job separately. If you wish not to pickle a particular job, just set myjob.settings.pickle = False. Of course, the global default setting in config.job.pickle can also be used.

If you modify a job or its corresponding Results instance afterwards, those changes are not going to be reflected in the .dill file, since it was created before your changes happened. To store such changes you need to repickle the job manually by calling myjob.pickle() after making your changes.

Note

Not all Python objects can be pickled properly, so you need to pay attention to references to external objects stored in your jobs or their results.

A Results instance associated with a job is saved together with it. However, results do not contain all the files produced by the job execution, only relative paths to them. For that reason the .dill file alone is not enough to fully restore the state if you want to process the results. All the other files present in the job's folder are needed for the Results instance to work with them. Hence, if you want to copy a previously run job to another location, make sure to copy the whole job folder (including subdirectories).

A loaded job is not registered in the current job manager. That means it does not get its own subfolder in the main working folder, it is never renamed, and no job folder cleaning is done for it by finish(). However, it is added to the hash registry, so it is visible to rerun prevention.

In the case of a MultiJob, all the information about child jobs is stored in the parent's .dill file, so loading a MultiJob results in loading all its child jobs. Each child job can have its own .dill file containing information about that particular job only. The parent attribute of a child job is erased, so loading a child job does not result in loading its parent (and all the other children).

3.5.3. Restarting crashed scripts

Pickling and rerun prevention nicely combine to produce a convenient restart mechanism. When a script tries to do something "illegal", it is stopped by the Python interpreter. Usually this is caused by a mistake in the script (using a wrong variable, accessing a wrong element of a list, etc.). In such a case one would like to correct the script and run it again. But some jobs may have already run and finished successfully before the crash happened. If they are meant to produce exactly the same results as before, running them again in the corrected script would be a waste of time. The solution is to load all the successfully finished jobs from the crashed script at the beginning of the corrected one and let rerun prevention take care of the rest. However, having to go to the previous script's working folder and manually collect the paths to all the .dill files there would be cumbersome. Fortunately, one can use the load_all() function which, given the path to the main working folder of some finished PLAMS run, loads all the .dill files stored there. So when you edit your crashed script to remove mistakes, you can just add one load_all() call at the beginning, and no unnecessary work will be done when you run the corrected script.
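For scripts executed without the master script, the restart idiom looks roughly like this (a hedged sketch; the folder path is just an example):

from scm.plams import init, finish, load_all

init()

# Import every .dill file from the crashed run's main working folder; the
# loaded jobs populate the hash registry of the current job manager.
load_all('/home/user/science/plams.12345')

# ... corrected script body goes here: any job identical to one that finished
# successfully in the crashed run is skipped by rerun prevention ...

finish()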

If you're executing your PLAMS scripts with the master script, restarting is even easier. It can be done in two ways:

  1. If you wish to perform the restart run in a fresh, empty working folder, all you need to do is import the contents of the previous working folder (from the crashed run) using the -l flag:

    plams myscript.plms
    [17:28:40] PLAMS working folder: /home/user/plams.12345
    #[crashed]
    #[correct myscript.plms]
    plams -l plams.12345 myscript.plms
    [17:35:44] PLAMS working folder: /home/user/plams.23456
    
  2. If you would rather do an in-place restart and use the same working folder, you can use the -r flag:

    plams myscript.plms
    [17:28:40] PLAMS working folder: /home/user/plams.12345
    #[crashed]
    #[correct myscript.plms]
    plams -r -f plams.12345 myscript.plms
    [17:35:44] PLAMS working folder: /home/user/plams.12345
    

    In this case the master script will temporarily move all the contents of plams.12345 to plams.12345.res, import all the jobs from there and start a regular run in the now-empty plams.12345. At the end of the script plams.12345.res is deleted.

Note

Keep in mind that rerun prevention checks a job's hash after its prerun() method is executed. So when you attempt to run a job identical to a previously run one (either in the same script or imported from a previous run), its prerun() method is executed anyway, even though the rest of running a job is skipped.

3.5.4. API

class JobManager(settings, path=None, folder=None)[source]

A class responsible for job and file management.

Every instance has the following attributes:

  • folder – the working folder name.
  • path – the absolute path to the directory containing the working folder.
  • workdir – the absolute path to the working folder (path/folder).
  • settings – a Settings instance for this job manager (see below).
  • jobs – a list of all jobs managed with this instance (in order of run() calls).
  • names – a dictionary with job names. For each name, the stored integer value indicates how many jobs with that name have already been run.
  • hashes – a dictionary working as a hash table for jobs.

path and folder can be adjusted with the constructor arguments path and folder. If they are not supplied, the current Python working directory and the string plams. appended with the PID of the current process are used.

The settings attribute is directly set to the value of the settings argument (unlike in other classes, where it is copied) and it should be a Settings instance with the following keys:

  • hashing – the chosen hashing method (see Rerun prevention).
  • counter_len – the length of the number appended to a job name in case of a name conflict.
  • remove_empty_directories – if True, all empty subdirectories of the working folder are removed on finish().
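For illustration, a short hedged sketch of adjusting these keys on the default job manager:

from scm.plams import init, config

init()
config.jm.settings.hashing = 'input+runscript'     # hash input and runscript
config.jm.settings.counter_len = 3                 # e.g. myjob, myjob.002, ...
config.jm.settings.remove_empty_directories = True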
_register_name(job)[source]

Register the name of job.

If a job with the same name was already registered, job is renamed by appending consecutive integers. The number of digits in the appended number is defined by the counter_len value in the job manager's settings.

_register(job)[source]

Register job. Register the job's name (rename it if needed) and create the job folder.

_check_hash(job)[source]

Calculate the hash of job and, if it is not None, search previously run jobs for the same hash. If such a job is found, return it. Otherwise, return None.

load_job(filename)[source]

Load a previously saved job from filename.

filename should be a path to a .dill file in some job folder. The Job instance stored there is loaded and returned. All attributes of this instance that were removed before pickling are restored. That includes jobmanager, path (the absolute path to filename is used), default_settings (a list containing only config.job) and also parent in the case of children of some MultiJob.

See Pickling for details.
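A hedged usage sketch (the module-level load() function shown earlier is a shortcut for this method of the default job manager):

from scm.plams import init, config

init()
oldjob = config.jm.load_job('/home/user/science/plams.12345/myjob/myjob.dill')
print(oldjob.status)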

remove_job(job)[source]

Remove job from the job manager. Forget its hash.

_clean()[source]

Clean all registered jobs according to the save parameter in their settings. If remove_empty_directories is True, traverse the working directory and delete all empty subdirectories.