
[pytorch][zero_grad] PyTorch error: 'builtin_function_or_method' object has no attribute 'named_modules'

While building Grad-CAM with PyTorch, calling zero_grad() to zero the model's gradients raised the following error:

'builtin_function_or_method' object has no attribute 'named_modules'
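The failure is easy to reproduce with a minimal sketch. This is an assumption that mimics the broken state rather than the original Grad-CAM code: plant a builtin function where a child module is expected, then call zero_grad().

```python
import torch
import torch.nn as nn

# hypothetical minimal reproduction (not the original Grad-CAM model):
# put a builtin function into the _modules table in place of a child module
model = nn.Sequential(nn.Linear(4, 2), nn.ReLU())
model._modules['1'] = torch.relu  # torch.relu is a builtin_function_or_method

msg = ''
try:
    model.zero_grad()
except AttributeError as e:
    msg = str(e)
print(msg)  # 'builtin_function_or_method' object has no attribute 'named_modules'
```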

The cause turned out to be that the torch.nn.ReLU attribute had become a builtin_function_or_method in eval mode. The source of zero_grad is:

def zero_grad(self, set_to_none: bool = False) -> None:
    if getattr(self, '_is_replica', False):
        warnings.warn(
            "Calling .zero_grad() from a module created with nn.DataParallel() has no effect. "
            "The parameters are copied (in a differentiable manner) from the original module. "
            "This means they are not leaf nodes in autograd and so don't accumulate gradients. "
            "If you need gradients in your forward method, consider using autograd.grad instead.")

    for p in self.parameters():
        if p.grad is not None:
            if set_to_none:
                p.grad = None
            else:
                if p.grad.grad_fn is not None:
                    p.grad.detach_()
                else:
                    p.grad.requires_grad_(False)
                p.grad.zero_()
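The two branches (zero the tensor in place vs. drop it to None) can be observed directly. A small sketch; since the default of set_to_none has changed between PyTorch versions, it is passed explicitly here:

```python
import torch
import torch.nn as nn

lin = nn.Linear(3, 1)
lin(torch.randn(2, 3)).sum().backward()

lin.zero_grad(set_to_none=False)           # keep the gradient tensor, fill with zeros
grad_after_zero = lin.weight.grad.clone()  # snapshot before the next backward
print(grad_after_zero)                     # tensor of zeros, same shape as the weight

lin(torch.randn(2, 3)).sum().backward()
lin.zero_grad(set_to_none=True)            # drop the gradient tensor entirely
print(lin.weight.grad)                     # None
```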

As the code shows, it mainly calls module.parameters() to get an iterator over all the weights and zeroes them one by one. Let's look at the code of parameters():

def parameters(self, recurse: bool = True) -> Iterator[Parameter]:
    for name, param in self.named_parameters(recurse=recurse):
        yield param

This in turn yields from an iterator produced by self.named_parameters(), so let's look at that function next:

def named_parameters(self, prefix: str = '', recurse: bool = True) -> Iterator[Tuple[str, Tensor]]:
    gen = self._named_members(
        lambda module: module._parameters.items(),
        prefix=prefix, recurse=recurse)
    for elem in gen:
        yield elem

Next, _named_members:

def _named_members(self, get_members_fn, prefix='', recurse=True):
    memo = set()
    modules = self.named_modules(prefix=prefix) if recurse else [(prefix, self)]
    for module_prefix, module in modules:
        members = get_members_fn(module)
        for k, v in members:
            if v is None or v in memo:
                continue
            memo.add(v)
            name = module_prefix + ('.' if module_prefix else '') + k
            yield name, v
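One detail worth noting here: the memo set deduplicates, so a module registered twice (tied weights) contributes its parameters only once. A quick check with a hypothetical tied model:

```python
import torch.nn as nn

shared = nn.Linear(2, 2)
tied = nn.Sequential(shared, shared)  # the same module registered twice

# the memo skips the second visit, so each parameter is listed once,
# under the first prefix that reached it
names = [name for name, _ in tied.named_parameters()]
print(names)  # ['0.weight', '0.bias']
```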

And finally, named_modules:

def named_modules(self, memo: Optional[Set['Module']] = None, prefix: str = ''):
    if memo is None:
        memo = set()
    if self not in memo:
        memo.add(self)
        yield prefix, self
        for name, module in self._modules.items():
            if module is None:
                continue
            submodule_prefix = prefix + ('.' if prefix else '') + name
            for m in module.named_modules(memo, submodule_prefix):
                yield m
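As a quick illustration of what this recursion yields, here is a small nested model (the structure is just for the demo), printing each dotted prefix with its module type:

```python
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 2), nn.Sequential(nn.ReLU()))
for name, m in net.named_modules():
    print(repr(name), type(m).__name__)
# '' Sequential
# '0' Linear
# '1' Sequential
# '1.0' ReLU
```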

So there it is: named_modules is a recursive generator that yields the module itself and then recurses into every child stored in _modules. When the recursion reaches a builtin_function_or_method, that object has no named_modules (or _modules) attribute, so the recursive call module.named_modules(memo, submodule_prefix) raises the AttributeError. Exactly why ReLU ends up in this state in eval mode I still don't understand, but by imitating this function we can write our own gradient zeroing:

def my_grad_zero(my_model):
    """@Qian2333"""
    try:
        order_dict = my_model._modules
    except AttributeError:
        # reached something that is not an nn.Module
        # (e.g. a builtin_function_or_method): nothing to do
        return
    if len(order_dict) == 0:
        # leaf module: drop its gradients (clearing .bias as well as .weight)
        for attr in ('weight', 'bias'):
            param = getattr(my_model, attr, None)
            if param is not None:
                param.grad = None
    for name in order_dict:
        my_grad_zero(order_dict[name])

Calling this function instead of zero_grad() resolves the problem.
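As a self-contained sanity check of the approach, the recursion is restated below so the snippet runs on its own; it clears bias gradients on leaf modules as well as weights:

```python
import torch
import torch.nn as nn

def my_grad_zero(my_model):
    # walk _modules ourselves, skipping anything that is not a real nn.Module
    try:
        order_dict = my_model._modules
    except AttributeError:
        return
    if len(order_dict) == 0:
        # leaf module: drop weight and bias gradients if present
        for attr in ('weight', 'bias'):
            param = getattr(my_model, attr, None)
            if param is not None:
                param.grad = None
    for name in order_dict:
        my_grad_zero(order_dict[name])

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
net(torch.randn(2, 4)).sum().backward()
my_grad_zero(net)
print(net[0].weight.grad, net[2].bias.grad)  # None None
```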