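PaddleDetection can be trained just by editing a config file. How is that implemented in the code, and why can YAML instantiate objects automatically?
1. Code walkthrough
train.py, line 132, is where the config file starts loading: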
cfg = load_config(FLAGS.config)
ppdet/core/workspace.py
def load_config(file_path):
    """
    Load config from file.

    Args:
        file_path (str): Path of the config file to be loaded.

    Returns: global config
    """
    _, ext = os.path.splitext(file_path)
    assert ext in ['.yml', '.yaml'], "only support yaml files for now"

    # load config from file and merge into global config
    cfg = _load_config_with_base(file_path)
    cfg['filename'] = os.path.splitext(os.path.split(file_path)[-1])[0]
    merge_config(cfg)

    return global_config
This code is simple: it checks the config file's extension and then loads it. The actual loading is done by the function below:
# parse and load _BASE_ recursively
def _load_config_with_base(file_path):
    with open(file_path) as f:
        file_cfg = yaml.load(f, Loader=yaml.Loader)

    # NOTE: cfgs outside have higher priority than cfgs in _BASE_
    if BASE_KEY in file_cfg:
        all_base_cfg = AttrDict()
        base_ymls = list(file_cfg[BASE_KEY])
        for base_yml in base_ymls:
            if base_yml.startswith("~"):
                base_yml = os.path.expanduser(base_yml)
            if not base_yml.startswith('/'):
                base_yml = os.path.join(os.path.dirname(file_path), base_yml)

            with open(base_yml) as f:
                base_cfg = _load_config_with_base(base_yml)
                all_base_cfg = merge_config(base_cfg, all_base_cfg)

        del file_cfg[BASE_KEY]
        return merge_config(file_cfg, all_base_cfg)

    return file_cfg
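This code is also simple: it loads config files recursively, because a Paddle config file can pull in several other config files through its _BASE_ key. Values in the including file take priority over values from the bases.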
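The override order is easier to see on a small example. The following is a minimal sketch, not PaddleDetection's code: it replaces the YAML files with an in-memory dict named FAKE_FILES, and uses a hypothetical deep_merge helper in place of dict_merge (whose body is not shown in this post). The file names and keys are made up for illustration.

```python
import copy

BASE_KEY = "_BASE_"

# Hypothetical stand-in for files on disk: file name -> already parsed content.
FAKE_FILES = {
    "runtime.yml": {"use_gpu": True, "log_iter": 20},
    "optimizer.yml": {
        "LearningRate": {"base_lr": 0.01, "schedulers": ["PiecewiseDecay"]},
    },
    "model.yml": {
        BASE_KEY: ["runtime.yml", "optimizer.yml"],
        "LearningRate": {"base_lr": 0.001},  # overrides the value from the base
        "log_iter": 50,
    },
}


def deep_merge(target, source):
    """Recursively merge source into target; source wins on conflicts."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = copy.deepcopy(value)
    return target


def load_with_base(name):
    """Load a config and resolve its _BASE_ entries recursively."""
    cfg = copy.deepcopy(FAKE_FILES[name])
    if BASE_KEY not in cfg:
        return cfg
    merged_bases = {}
    for base_name in cfg.pop(BASE_KEY):
        deep_merge(merged_bases, load_with_base(base_name))
    # The including file is merged last, so it has the highest priority.
    return deep_merge(merged_bases, cfg)


cfg = load_with_base("model.yml")
print(cfg["log_iter"])
print(cfg["LearningRate"])
```

Here log_iter and base_lr come from model.yml, while schedulers and use_gpu survive from the base files.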
The key line in _load_config_with_base is this one:
yaml.load(f, Loader=yaml.Loader)
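The loaded config, which in effect already holds the generated instances, is then merged into a dictionary (the global one when no other target is passed):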
all_base_cfg = merge_config(base_cfg, all_base_cfg)
This function is simple:
def merge_config(config, another_cfg=None):
    """
    Merge config into global config or another_cfg.

    Args:
        config (dict): Config to be merged.

    Returns: global config
    """
    global global_config
    dct = another_cfg or global_config
    return dict_merge(dct, config)
It merges config into another_cfg when one is given, and into the global dictionary otherwise.
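One detail in `dct = another_cfg or global_config` is easy to miss: an empty dict is falsy in Python, so passing an empty another_cfg also falls back to the global config. The sketch below reproduces only that line with plain dicts. The flat update is a simplification standing in for dict_merge, and it assumes AttrDict behaves like an ordinary dict when tested for truthiness.

```python
global_config = {}


def merge_config(config, another_cfg=None):
    """Simplified stand-in: merge into another_cfg, else into global_config."""
    dct = another_cfg or global_config  # None and an empty dict both pick the global
    dct.update(config)  # assumption: flat update instead of dict_merge
    return dct


local = {"a": 1}
result_local = merge_config({"b": 2}, local)
print(result_local is local, global_config)

empty = {}
result_empty = merge_config({"c": 3}, empty)
print(result_empty is global_config, empty)
```

In the second call the empty dict is left untouched and the values land in global_config instead.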
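2. The focus of this post: why yaml.load can generate instances
First, note that YAML works in both directions: it can generate a config file from an instance, and it can generate an instance from a config file. This article explains it well: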
https://www.jb51.net/article/242838.htm#_lab2_1_3
With that understood, let's write a similar example that generates an instance from a config file:
import yaml


class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return '%s(name=%s, age=%d)' % (self.__class__.__name__, self.name, self.age)


def person_cons(loader, node):
    value = loader.construct_mapping(node)  # mapping constructor, used for dicts
    name = value['name']
    age = value['age']
    return Person(name, age)


# Use add_constructor to attach a constructor to the given YAML tag
yaml.add_constructor(u'!person', person_cons)

# Generate an instance from a string; PyYAML 6 requires an explicit Loader
lily = yaml.load('!person {name: Lily, age: 19}', Loader=yaml.Loader)
print(lily)

# Generate an instance from a config file
file_path = "1.yml"
with open(file_path) as f:
    file_cfg = yaml.load(f, Loader=yaml.Loader)
print(file_cfg)
1.yml
TrainDataset:
  !person
    name: train
    age: 11
Now back to Paddle: where does it do this registration? In ppdet/data/source/coco.py, the COCODataSet class carries this decorator:
@serializable
What does this mean? It is a decorator. If decorators are unfamiliar, see:
http://c.biancheng.net/view/2270.html
So what does this decorator function do? It is defined in ppdet/core/config/yaml_helper.py:
def serializable(cls):
    """
    Add loader and dumper for given class, which must be
    "trivially serializable"

    Args:
        cls: class to be serialized

    Returns: cls
    """
    yaml.add_constructor(u'!{}'.format(cls.__name__),
                         _make_python_constructor(cls))
    yaml.add_representer(cls, _make_python_representer(cls))
    return cls
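It registers the class with YAML under the tag !ClassName, just like the add_constructor call in our example. Now let's look at the function that builds the constructor: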
def _make_python_constructor(cls):
    def python_constructor(loader, node):
        if isinstance(node, yaml.SequenceNode):
            args = loader.construct_sequence(node, deep=True)
            return cls(*args)
        else:
            kwargs = loader.construct_mapping(node, deep=True)
            try:
                return cls(**kwargs)
            except Exception as ex:
                print("Error when construct {} instance from yaml config".
                      format(cls.__name__))
                raise ex

    return python_constructor
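What does it do? It takes the arguments parsed from the YAML node, creates an instance of the class with them, and returns that instance: cls(*args) for a sequence node, cls(**kwargs) for a mapping node.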
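To see both branches in action, here is a small sketch modeled on the code above. It needs PyYAML installed. The names make_constructor, serializable_sketch and Box are made up for this illustration and are not part of PaddleDetection; the error-handling branch is left out for brevity.

```python
import yaml


def make_constructor(cls):
    """Build a YAML constructor that instantiates cls from a sequence or mapping node."""
    def constructor(loader, node):
        if isinstance(node, yaml.SequenceNode):
            args = loader.construct_sequence(node, deep=True)
            return cls(*args)  # positional arguments from a YAML list
        kwargs = loader.construct_mapping(node, deep=True)
        return cls(**kwargs)  # keyword arguments from a YAML mapping
    return constructor


def serializable_sketch(cls):
    """Hypothetical decorator: register cls under the tag !ClassName."""
    yaml.add_constructor("!" + cls.__name__, make_constructor(cls),
                         Loader=yaml.Loader)
    return cls


@serializable_sketch
class Box:
    def __init__(self, width, height=1):
        self.width = width
        self.height = height


from_mapping = yaml.load("!Box {width: 3, height: 4}", Loader=yaml.Loader)
from_sequence = yaml.load("!Box [5, 6]", Loader=yaml.Loader)
print(from_mapping.width, from_mapping.height)
print(from_sequence.width, from_sequence.height)
```

Both calls return a Box instance directly, so the dictionary produced by yaml.load already contains real objects wherever a registered tag appears.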
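3. Summary
At this point the whole flow is clear. When train.py is loaded, its imports run the package's __init__.py files, which in turn load the classes they reference. If a class has a decorator, the decorator runs as the class is defined. That is how a constructor for the COCODataSet class gets registered with YAML, so at load time yaml.load only has to look up the constructor for each tag. The other classes are registered the same way.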
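The timing described above can be checked without PaddleDetection or YAML. This is a minimal sketch in which CONSTRUCTORS is a hypothetical stand-in for YAML's tag table and FakeDataSet is a made-up class; it only records the order in which things happen.

```python
events = []
CONSTRUCTORS = {}  # hypothetical stand-in for YAML's tag -> constructor table


def serializable_sketch(cls):
    """Runs once, at the moment the class statement is executed."""
    events.append("decorated " + cls.__name__)
    CONSTRUCTORS["!" + cls.__name__] = lambda **kwargs: cls(**kwargs)
    return cls


events.append("before class definition")


@serializable_sketch
class FakeDataSet:
    def __init__(self, image_dir):
        events.append("instance created")
        self.image_dir = image_dir


events.append("after class definition")

# Later, "loading" a config only looks up the tag and calls the constructor.
dataset = CONSTRUCTORS["!FakeDataSet"](image_dir="train2017")
print(events)
```

The decorator entry appears before any instance exists, which is why simply importing the module is enough to make the tag available to the loader.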
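4. register
The code also contains a register decorator. What is it for? Put simply, it also runs during __init__.py initialization and registers the class in a global dictionary; whenever an instance is needed, it is looked up there. The difference from serializable is that serializable relies on YAML's built-in ability to generate class instances from a config file. If register is unfamiliar, this write-up on the registry in mmdetection explains the idea well: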
https://zhuanlan.zhihu.com/p/355271993
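The registry idea itself fits in a few lines. The sketch below is a generic illustration under simplifying assumptions, not PaddleDetection's or mmdetection's implementation: REGISTRY, the create helper, the ResNet class and the config layout are all hypothetical.

```python
REGISTRY = {}  # hypothetical global dictionary: class name -> class


def register(cls):
    """Store the class under its name so it can be created from a config later."""
    if cls.__name__ in REGISTRY:
        raise ValueError("{} is already registered".format(cls.__name__))
    REGISTRY[cls.__name__] = cls
    return cls


def create(name, **kwargs):
    """Hypothetical factory: look the class up by name and instantiate it."""
    return REGISTRY[name](**kwargs)


@register
class ResNet:
    def __init__(self, depth=50):
        self.depth = depth


# A config only needs to mention the class by name plus its arguments.
config = {"architecture": "ResNet", "ResNet": {"depth": 101}}
name = config["architecture"]
backbone = create(name, **config[name])
print(type(backbone).__name__, backbone.depth)
```

Unlike the serializable route, nothing is instantiated while the config is parsed; the instance is built only when create is called.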