Deleting a volume hit yet another error. With logging set to debug, the log shows the following.
Clear capabilities
volume volume-4e1817be-9b8c-4834-ad90-baf24ef61775: removing export delete_volume /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/manager.py:192
Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show execute /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/utils.py:167
Removing volume: 4e1817be-9b8c-4834-ad90-baf24ef61775
Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --force --delete iqn.2010-10.org.openstack:volume-4e1817be-9b8c-4834-ad90-baf24ef61775 execute /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/utils.py:167
Result was 22 execute /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/utils.py:184
[-] Exception during message handling
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/openstack/common/rpc/amqp.py", line 276, in _process_data
rval = self.proxy.dispatch(ctxt, version, method, **args)
File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/openstack/common/rpc/dispatcher.py", line 145, in dispatch
return getattr(proxyobj, method)(ctxt, **kwargs)
File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/manager.py", line 206, in delete_volume
{'status': 'error_deleting'})
File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
self.gen.next()
File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/manager.py", line 193, in delete_volume
self.driver.remove_export(context, volume_ref)
File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/driver.py", line 474, in remove_export
self.tgtadm.remove_iscsi_target(iscsi_target, 0, volume['id'])
File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/iscsi.py", line 180, in remove_iscsi_target
"id:%(volume_id)s.") % locals())
KeyError: u'volume_id'
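The KeyError in the last frame comes from formatting a message with `% locals()`. That pattern fails whenever the template names a key that is not a local variable in that function. The following is a minimal sketch of the effect, not the Cinder source: the function names, the parameter name `vol_id`, and the message prefix are assumptions made up for illustration. Only the `id:%(volume_id)s.` fragment and the `% locals()` call appear in the traceback.

```python
# Illustrative only: hypothetical functions, not code from cinder/volume/iscsi.py.

def format_failure(tid, lun, vol_id):
    # The template asks for 'volume_id', but the local variable here is
    # named 'vol_id' (an assumed name), so the lookup in locals() fails.
    try:
        return "could not remove target for volume id:%(volume_id)s." % locals()
    except KeyError as exc:
        return "KeyError: %r" % exc.args[0]


def format_failure_fixed(tid, lun, vol_id):
    # Passing an explicit mapping avoids depending on local variable names.
    return "could not remove target for volume id:%(volume_id)s." % {
        "volume_id": vol_id
    }


if __name__ == "__main__":
    print(format_failure(28, 0, "4e1817be"))
    print(format_failure_fixed(28, 0, "4e1817be"))
```

In other words, the KeyError is raised while the error message is being built, so it hides whatever failure the code was trying to report.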
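Although the last error reported is a KeyError, the real failure is the call to tgt-admin --force --delete <value>, which returned 22 in the log above. Running tgt-admin -s on the storage node shows that the targets that cannot be deleted have stale connections, similar to the listing below. The client actually has no such connections, so they cannot be released from the client side as described earlier.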
Target 28: iqn.2010-10.org.openstack:volume-4b7ee394-0028-4d87-baeb-c0ef4ec134e5
System information:
Driver: iscsi
State: ready
I_T nexus information:
I_T nexus: 121
Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
Connection: 0
IP Address: 10.61.2.9
I_T nexus: 138
Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
Connection: 0
IP Address: 10.61.2.9
I_T nexus: 140
Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
Connection: 0
IP Address: 10.61.2.9
I_T nexus: 143
Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
Connection: 0
IP Address: 10.61.2.9
I_T nexus: 147
Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
Connection: 0
IP Address: 10.61.2.9
I_T nexus: 150
Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
Connection: 0
IP Address: 10.61.2.9
LUN information:
......
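When there are many targets, checking the listing by eye gets tedious. A minimal sketch, assuming the output of tgt-admin -s has been captured as text in the layout shown above: the helper name `targets_with_nexus` is hypothetical, and it only reads text, it does not call tgt-admin itself.

```python
import re


def targets_with_nexus(show_output):
    """Map each target IQN to the I_T nexus ids listed under it.

    Targets with no I_T nexus entries are left out of the result.
    """
    found = {}
    current = None
    for raw in show_output.splitlines():
        line = raw.strip()
        target = re.match(r"Target \d+: (\S+)", line)
        if target:
            current = target.group(1)
            found[current] = []
            continue
        # Matches "I_T nexus: 121" but not the "I_T nexus information:" header.
        nexus = re.match(r"I_T nexus: (\d+)", line)
        if nexus and current is not None:
            found[current].append(int(nexus.group(1)))
    return {iqn: ids for iqn, ids in found.items() if ids}


if __name__ == "__main__":
    import sys
    # Usage (assumed): tgt-admin -s | python this_script.py
    for iqn, ids in targets_with_nexus(sys.stdin.read()).items():
        print(iqn, ids)
```

Any target this reports while the client shows no matching session is a candidate for the stale connections described above.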
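My eventual fix was to restart the tgtd service. A normal service tgtd restart does not work here because the connections are never released, so I had to look up the tgtd PIDs, kill the processes, delete the lock file, and then start tgtd again.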
[[email protected] ~]# ps aux | grep tgtd
root 2652 0.1 0.2 888048 40820 ? Ssl Apr27 104:54 tgtd
root 2653 0.0 0.0 12760 484 ? S Apr27 0:40 tgtd
root 9643 0.0 0.0 103244 872 pts/1 S+ 15:41 0:00 grep tgtd
[[email protected] ~]# kill -9 2652
[[email protected] ~]# kill -9 2653
-bash: kill: (2653) - No such process
[[email protected] ~]# service tgtd status
tgtd dead but subsys locked
[[email protected] ~]# rm -f /var/lock/subsys/tgtd
[[email protected] ~]# service tgtd start
Starting SCSI target daemon: [  OK  ]
[[email protected] ~]# service tgtd status
tgtd (pid 9675 9674) is running...
[[email protected] ~]# service cinder-volume restart
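The PID lookup in the session above can be sketched as a small parser over ps aux output. This is only an illustration under assumptions: the helper name `tgtd_pids` is hypothetical, it expects the standard 11-column ps aux layout, and it deliberately does not kill anything.

```python
import os


def tgtd_pids(ps_output):
    """Return the PIDs of tgtd processes found in `ps aux` text."""
    pids = []
    for line in ps_output.splitlines():
        # ps aux has 11 columns; the last one is the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        program = fields[10].split()[0]
        # Compare the executable name only, so "grep tgtd" is skipped.
        if os.path.basename(program) == "tgtd":
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    import sys
    # Usage (assumed): ps aux | python this_script.py
    print(tgtd_pids(sys.stdin.read()))
```

In the session above the second kill reported that the process no longer existed, so the second tgtd process had already exited by the time it ran.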
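After the restart, the connection state of all targets returned to normal. After resetting the volume status in the database as described earlier, the volume could be deleted normally.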
North China University of Technology | Cloud Computing Research Center | Jiang Yong (姜永)