Common Errors

(Errors are sorted alphabetically)

dargs.dargs.ArgumentTypeError: [at root location] key xxx gets wrong value type, requires but gets

Please check your parameters against the DP-GEN documentation. You may have superfluous parentheses in your parameter file.

Dargs: xxx is not allowed in strict mode.

A strict format check has been applied since version 0.10.7. To avoid misleading users, keys from older versions that are already ignored or absorbed into default settings are no longer allowed. The expected structure of the dictionary in param.json also differs from the one used before version 0.10.7. This error occurs when the format check finds old-style keys in the JSON file. Try deleting or annotating these keys, or adjust the JSON file accordingly. Example files in the newest format can be found in examples.

FileNotFoundError: [Errno 2] No such file or directory: ‘…/01.model_devi/graph.xxx.pb’

If this error occurs, check your initial data. The model will not be generated if the initial data is incorrect.

json.decoder.JSONDecodeError

Your .json file is malformed. The cause is usually a syntax mistake, such as a missing comma.
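To locate the mistake, Python's standard json module reports the line and column where parsing failed. The following is a minimal sketch; the helper name find_json_error is hypothetical and not part of DP-GEN.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else a (line, column, message) tuple.

    Hypothetical helper for illustration; it is not a DP-GEN function.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions of the first syntax error.
        return (err.lineno, err.colno, err.msg)
    return None


# Example input: the comma after "a": 1 is missing.
broken = '{\n  "a": 1\n  "b": 2\n}'
print(find_json_error(broken))
```

To check a real file, read it first, for example find_json_error(open("param.json").read()). The reported position is where the parser gave up, so the actual mistake is often on that line or the one just before it.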

RuntimeError: job:xxxxxxx failed 3 times

RuntimeError: job:xxxxxxx failed 3 times

......

RuntimeError: Meet errors will handle unexpected submission state.
Debug information: remote_root==xxxxxx
Debug information: submission_hash==xxxxxx
Please check the dirs and scripts in remote_root. The job information mentioned above may help.

If you see an error like this, check the files on the remote server. The message shows that your job has failed 3 times but does not give the reason.

To find the reason, check the logs in the remote root. For example, train.log, which is generated by DeePMD-kit, can give more details. If that doesn’t help, you can manually run the .sub script, whose path is shown in Debug information: remote_root==xxxxxx.

Some common reasons are as follows:

  1. Two or more jobs are submitted manually or automatically at the same time, and their hash values collide. This bug will be fixed in dpdispatcher.

  2. You may have something wrong in your input files, which causes the process to fail.

RuntimeError: find too many unsuccessfully terminated jobs.

The ratio of failed jobs is larger than ratio_failure. You can set a higher value for ratio_failure or check whether something is wrong with your input files.
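The threshold logic described above can be sketched as follows. This is only an illustration of the comparison, not DP-GEN's actual implementation; the function name and its arguments are assumptions.

```python
def too_many_failures(n_failed, n_total, ratio_failure):
    """Illustrative check: True when the failed fraction exceeds ratio_failure.

    Hypothetical helper; DP-GEN's internal check may differ in detail.
    """
    if n_total == 0:
        # No jobs were run, so there is nothing to report.
        return False
    return n_failed / n_total > ratio_failure


print(too_many_failures(3, 20, 0.05))  # 3 of 20 failed, above a 5% threshold
print(too_many_failures(1, 20, 0.05))  # 1 of 20 failed, not above a 5% threshold
```

Raising ratio_failure only hides the symptom. If many jobs fail for the same reason, inspecting the input files is usually the better first step.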

ValueError: Cannot load file containing pickled data when allow_pickle=False

Please make sure that the dataset path is correct and that it contains no extra files.
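numpy.load falls back to unpickling when a file does not begin with the .npy or .npz signature, so a stray or corrupted file under the dataset path can trigger this error. The sketch below uses only the standard library to list such files; the helper name find_suspect_files and its suffixes parameter are hypothetical, and the default suffix filter is an assumption about which files are expected to be NumPy arrays.

```python
import os

NPY_MAGIC = b"\x93NUMPY"   # prefix of files written by numpy.save
ZIP_MAGIC = b"PK\x03\x04"  # prefix of non-empty .npz archives


def find_suspect_files(root, suffixes=(".npy", ".npz")):
    """Return sorted paths under root that lack a NumPy file signature.

    Only files whose names end with one of `suffixes` are inspected;
    pass suffixes=None to inspect every file. Hypothetical helper,
    not part of DP-GEN or NumPy.
    """
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if suffixes is not None and not name.endswith(tuple(suffixes)):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                head = handle.read(len(NPY_MAGIC))
            # A valid file starts with one of the two known signatures.
            if not (head.startswith(NPY_MAGIC) or head.startswith(ZIP_MAGIC)):
                suspects.append(path)
    return sorted(suspects)
```

With suffixes=None, legitimate plain-text files in the dataset will also be listed, so treat the result as a list to review rather than a list to delete.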