Virtual machine migration makes resource allocation more flexible, and live migration in particular improves the availability and reliability of virtual machines. OpenStack Liberty provides two kinds of migration: cold migration and live migration. In the next few articles I will analyze both implementations in detail, starting with cold migration.
For reasons of length, the cold-migration source analysis is split into two articles:
- Part 1: covers the work done by nova-api and nova-conductor during the migration
- Part 2: focuses on the processing done by nova-compute
Here is Part 1.
Initiating a migration
A user can trigger a server migration manually with the nova CLI:

nova --debug migrate 52e4d485-6ccf-47f3-a754-b62649e7b256

The command above migrates the server with id=52e4d485-6ccf-47f3-a754-b62649e7b256 to another best-fit nova-compute node; the --debug option prints the execution log:
......
curl -g -i -X POST http://controller:8774/v2/eab72784b36040a186a6b88dac9ac0b2/servers/5a7d302f-f388-4ffb-af37-f1e6964b3a51/action -H "User-Agent: python-novaclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}8e294a111a5deaa45f6cb0f3c58a600d2b1b0493" -d '{"migrate": null}'
......
The log excerpt above shows that novaclient sends the migration request to nova-api over HTTP and invokes the migrate action. From the route mappings that nova-api builds at startup, it is easy to see that the entry point for this action is nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate, analyzed in detail below.
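The HTTP call that novaclient makes can be reproduced with a short standard-library script. The sketch below is illustrative only (the helper name and arguments are mine, not novaclient's); it builds the same POST request shown in the curl line of the log excerpt:

```python
import json
import urllib.request


def build_migrate_request(endpoint, project_id, server_id, token):
    # POST .../servers/<server_id>/action with body {"migrate": null},
    # the same shape as the curl line in the log excerpt
    url = "%s/%s/servers/%s/action" % (endpoint, project_id, server_id)
    body = json.dumps({"migrate": None}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json",
                 "Accept": "application/json",
                 "X-Auth-Token": token})
```

Passing the returned Request to urllib.request.urlopen would perform the actual call against a live endpoint.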
Source code analysis
The nova-api part
As noted above, the migration entry point is:
#nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate, decorators omitted
def _migrate(self, req, id, body):
    """Permit admins to migrate a server to a new host.
    req is the Request object carrying this request's information
    id is the uuid of the server to migrate,
        e.g. 52e4d485-6ccf-47f3-a754-b62649e7b256
    body holds the request parameters: {"migrate": null}
    """
    # Extract the request context from the Request object
    context = req.environ['nova.context']
    """Perform the policy check. By default the rules are read from
    /etc/nova/policy.json on the host; if no matching rule is defined,
    authorization fails and an exception is raised.
    The rule that applies here is:
    "os_compute_api:os_migrate_server:migrate": "rule:admin_api"
    """
    authorize(context, action='migrate')
    # Load the server identified by id from the nova database;
    # returns an InstanceV2 object
    instance = common.get_instance(self.compute_api, context, id)
    """Exception handling omitted:
    an exception is raised if the server does not exist, no suitable
    target host can be found, the server is locked, resources are
    insufficient, or the server is in the wrong state (it must be
    either running or stopped).
    As with the resize operation, the actual work is done by
    `/nova/compute/api.py/API.resize`; resize decides between
    'resize' and 'migrate' based on whether the flavor_id parameter
    is given, as analyzed below.
    """
    self.compute_api.resize(req.environ['nova.context'], instance)
---------------------------------------------------------------
#Continuing: /nova/compute/api.py/API.resize, decorators omitted
def resize(self, context, instance, flavor_id=None,
           clean_shutdown=True,
           **extra_instance_updates):
    """Resize (ie, migrate) a running instance.
    If flavor_id is None, the process is considered a
    migration, keeping the original flavor_id. If flavor_id is
    not None, the instance should be migrated to a new host and
    resized to the new flavor_id.

    context: the request context
    instance: an InstanceV2 object holding the server's full configuration
    flavor_id: the flavor id; None here, since this is a migration
    clean_shutdown=True: cold migration retries a clean shutdown and
        raises an exception if the guest cannot be shut down cleanly
    """
    # Check whether 'auto_disk_config' is permitted for the system
    # disk, otherwise raise an exception: after the migration the
    # server must be able to auto-configure its system disk
    self._check_auto_disk_config(instance, **extra_instance_updates)
    # Get the server's current flavor
    current_instance_type = instance.get_flavor()
    # If flavor_id is not provided, only migrate the instance.
    # flavor_id = None means migration: log it and keep the current
    # flavor as the flavor after the migration
    if not flavor_id:
        LOG.debug("flavor_id is None. Assuming migration.",
                  instance=instance)
        new_instance_type = current_instance_type
    else:
        # Load the flavor identified by flavor_id from the
        # nova.instance_types table; read_deleted="no" filters out
        # flavors that have already been deleted
        new_instance_type = flavors.get_flavor_by_flavor_id(
            flavor_id, read_deleted="no")
        # If the server boots from an image, the current flavor has
        # a non-zero root_gb (root disk size) and the target flavor
        # has root_gb=0, resize is not supported: there is no way to
        # tell how large the system disk should be, so raise
        if (new_instance_type.get('root_gb') == 0 and
                current_instance_type.get('root_gb') != 0 and
                not self.is_volume_backed_instance(context, instance)):
            reason = _('Resize to zero disk flavor is not '
                       'allowed.')
            raise exception.CannotResizeDisk(reason=reason)

    # If the requested flavor does not exist, raise an exception
    if not new_instance_type:
        raise exception.FlavorNotFound(flavor_id=flavor_id)

    # Debug logging
    current_instance_type_name = current_instance_type['name']
    new_instance_type_name = new_instance_type['name']
    LOG.debug("Old instance type %(current_instance_type_name)s, "
              "new instance type %(new_instance_type_name)s",
              {'current_instance_type_name':
                   current_instance_type_name,
               'new_instance_type_name': new_instance_type_name},
              instance=instance)

    # Check whether the two flavors are the same; for a migration
    # they always are
    same_instance_type = (current_instance_type['id'] ==
                          new_instance_type['id'])

    """NOTE(sirp): We don't want to force a customer to change
    their flavor when Ops is migrating off of a failed host.
    """
    # For a resize, raise an exception if the new flavor is disabled
    if not same_instance_type and new_instance_type.get('disabled'):
        raise exception.FlavorNotFound(flavor_id=flavor_id)

    # Cells are disabled by default, so cell_type = None.
    # For a resize the old and new flavors must differ, since
    # resizing to the same flavor makes no sense
    if same_instance_type and flavor_id and (self.cell_type !=
                                             'compute'):
        raise exception.CannotResizeToSameFlavor()

    # ensure there is sufficient headroom for upsizes
    # For a resize, quota has to be reserved first
    if flavor_id:
        # Compute the vcpu and memory quota deltas (if any) between
        # the old and new flavors
        deltas = compute_utils.upsize_quota_delta(context,
                                                  new_instance_type,
                                                  current_instance_type)
        try:
            # Reserve the (delta) quota for the current user and
            # project, and update the database
            quotas = compute_utils.reserve_quota_delta(context,
                                                       deltas,
                                                       instance)
        except exception.OverQuota as exc:
            # Collect the quota-shortage details and log a warning
            quotas = exc.kwargs['quotas']
            overs = exc.kwargs['overs']
            usages = exc.kwargs['usages']
            headroom = self._get_headroom(quotas, usages, deltas)
            (overs, reqs, total_alloweds,
             useds) = self._get_over_quota_detail(headroom, overs,
                                                  quotas, deltas)
            LOG.warning(_LW("%(overs)s quota exceeded for "
                            "%(pid)s, tried to resize instance."),
                        {'overs': overs, 'pid': context.project_id})
            raise exception.TooManyInstances(overs=overs,
                                             req=reqs,
                                             used=useds,
                                             allowed=total_alloweds)
    # Migration: no extra resources need to be reserved
    else:
        quotas = objects.Quotas(context=context)

    # Update the server's task state: preparing to resize/migrate
    instance.task_state = task_states.RESIZE_PREP
    instance.progress = 0
    instance.update(extra_instance_updates)
    instance.save(expected_task_state=[None])

    """Build the filter options for nova-scheduler:
    CONF.allow_resize_to_same_host = true allows the target host to
    be the same as the source host; otherwise the source host is
    filtered out
    """
    filter_properties = {'ignore_hosts': []}
    if not CONF.allow_resize_to_same_host:
        filter_properties['ignore_hosts'].append(instance.host)

    # cell_type = None by default
    if self.cell_type == 'api':
        # Commit reservations early and create migration record.
        self._resize_cells_support(context, quotas, instance,
                                   current_instance_type,
                                   new_instance_type)

    # flavor_id = None means migration, otherwise resize.
    # Record the instance action in the nova.instance_actions table;
    # the record is updated when the migration finishes to reflect
    # the result
    if not flavor_id:
        self._record_action_start(context, instance,
                                  instance_actions.MIGRATE)
    else:
        self._record_action_start(context, instance,
                                  instance_actions.RESIZE)

    """Forward the migration request to
    `/nova/conductor/api.py/ComputeTaskAPI.resize_instance`, which
    directly calls
    `nova/conductor/rpcapi.py/ComputeTaskAPI.migrate_server` to
    handle it; see the analysis below
    """
    scheduler_hint = {'filter_properties': filter_properties}
    self.compute_task_api.resize_instance(context, instance,
            extra_instance_updates,
            scheduler_hint=scheduler_hint,
            flavor=new_instance_type,
            reservations=quotas.reservations or [],
            clean_shutdown=clean_shutdown)
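The central decision in resize() is how flavor_id selects between a migration and a resize. The stripped-down sketch below is only an illustration of that branching (the function and its arguments are mine, not Nova's); it keeps the current flavor when flavor_id is absent and rejects a resize to the same flavor, as CannotResizeToSameFlavor does:

```python
def resolve_target_flavor(current_flavor, flavor_id, get_flavor_by_id):
    # flavor_id is None -> migration: keep the current flavor.
    # flavor_id given   -> resize: look up the new flavor, refusing
    # a resize to the very same flavor.
    if not flavor_id:
        return current_flavor, 'migration'
    new_flavor = get_flavor_by_id(flavor_id)
    if new_flavor['id'] == current_flavor['id']:
        raise ValueError('cannot resize to the same flavor')
    return new_flavor, 'resize'
```

For example, calling it with flavor_id=None returns the current flavor tagged 'migration', mirroring the "flavor_id is None. Assuming migration." branch above.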
------------------------------------------------------------
#Continuing: `nova/conductor/rpcapi.py/ComputeTaskAPI.migrate_server`
def migrate_server(self, context, instance, scheduler_hint,
                   live, rebuild,
                   flavor, block_migration, disk_over_commit,
                   reservations=None, clean_shutdown=True):
    """Input parameters:
    live = False, cold migration
    rebuild = False, migration rather than resize
    block_migration = None, not a block migration
    disk_over_commit = None
    reservations = [], a migration reserves no extra quota
    """
    # Build the request argument dict
    kw = {'instance': instance, 'scheduler_hint': scheduler_hint,
          'live': live, 'rebuild': rebuild, 'flavor': flavor,
          'block_migration': block_migration,
          'disk_over_commit': disk_over_commit,
          'reservations': reservations,
          'clean_shutdown': clean_shutdown}
    # Pick the client version based on what the RPC peer can accept;
    # version compatibility is established when rpc is initialized
    version = '1.11'
    if not self.client.can_send_version(version):
        del kw['clean_shutdown']
        version = '1.10'
    if not self.client.can_send_version(version):
        kw['flavor'] = objects_base.obj_to_primitive(flavor)
        version = '1.6'
    if not self.client.can_send_version(version):
        kw['instance'] = jsonutils.to_primitive(
            objects_base.obj_to_primitive(instance))
        version = '1.4'
    # Send the `migrate_server` message to rabbitmq via a
    # synchronous rpc call; the consumer, `nova-conductor`,
    # will receive it
    cctxt = self.client.prepare(version=version)
    return cctxt.call(context, 'migrate_server', **kw)
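The downgrade cascade above is a common oslo.messaging pattern: probe from the newest version downwards, stripping or converting the arguments an older server cannot understand. A minimal self-contained sketch, where the client class is a stand-in for the real RPCClient rather than Nova code:

```python
class FakeClient:
    """Stand-in for an RPC client with a capped negotiable version."""
    def __init__(self, max_version):
        self._max = tuple(int(p) for p in max_version.split('.'))

    def can_send_version(self, version):
        return tuple(int(p) for p in version.split('.')) <= self._max


def pick_version(client, kw):
    # Same idea as migrate_server: try the newest version first and
    # drop arguments that only newer servers know about.
    version = '1.11'
    if not client.can_send_version(version):
        kw.pop('clean_shutdown', None)  # 1.10 has no clean_shutdown
        version = '1.10'
    return version, kw
```

With a peer capped at 1.10, the clean_shutdown argument is silently dropped and the call goes out as version 1.10.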
Summary: nova-api performs the instance state and precondition checks, then updates the server state and adds a nova.instance_actions database record, and finally forwards the request to nova-conductor through a synchronous rpc call.
The nova-conductor part
From the analysis above, the entry point where nova-conductor handles the migration request is easy to find:
#/nova/conductor/manager.py/ComputeTaskManager.migrate_server
def migrate_server(self, context, instance, scheduler_hint,
                   live, rebuild,
                   flavor, block_migration, disk_over_commit,
                   reservations=None,
                   clean_shutdown=True):
    """The input parameters come from `nova-api`:
    scheduler_hint: scheduling options, {u'filter_properties':
        {u'ignore_hosts': []}}
    live = False, cold migration
    rebuild = False, migration rather than resize
    block_migration = None, not a block migration
    disk_over_commit = None
    reservations = [], a migration reserves no extra quota
    """
    # If the instance parameter is not a valid NovaObject, load the
    # server from the database first and build an InstanceV2 object
    if instance and not isinstance(instance, nova_object.NovaObject):
        # NOTE(danms): Until v2 of the RPC API, we need to tolerate
        # old-world instance objects here
        attrs = ['metadata', 'system_metadata', 'info_cache',
                 'security_groups']
        instance = objects.Instance._from_db_object(
            context, objects.Instance(), instance,
            expected_attrs=attrs)
    # NOTE: Remove this when we drop support for v1 of the RPC API
    # If the flavor parameter is not a valid Flavor object, load the
    # flavor with the given id from the database and build one
    if flavor and not isinstance(flavor, objects.Flavor):
        # Code downstream may expect extra_specs to be
        # populated since it is receiving an object, so lookup
        # the flavor to ensure this.
        flavor = objects.Flavor.get_by_id(context, flavor['id'])
    # Live migration, covered in detail in a separate article
    if live and not rebuild and not flavor:
        self._live_migrate(context, instance, scheduler_hint,
                           block_migration, disk_over_commit)
    # Call _cold_migrate for a cold migration, analyzed below
    elif not live and not rebuild and flavor:
        instance_uuid = instance.uuid
        # The with statement records a migration event in the
        # database (nova.instance_actions_events) before the
        # migration and updates the record afterwards
        with compute_utils.EventReporter(context, 'cold_migrate',
                                         instance_uuid):
            self._cold_migrate(context, instance, flavor,
                               scheduler_hint['filter_properties'],
                               reservations, clean_shutdown)
    # Unknown request type
    else:
        raise NotImplementedError()
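The three-way branch above can be condensed into a tiny dispatch function; this is just an illustration of the logic, not Nova code:

```python
def dispatch(live, rebuild, flavor):
    # live migration: live=True, no rebuild, no flavor object
    if live and not rebuild and not flavor:
        return 'live_migrate'
    # cold migration / resize: not live, no rebuild, flavor present
    if not live and not rebuild and flavor:
        return 'cold_migrate'
    # anything else is an unknown request type
    raise NotImplementedError()
```

For the cold-migration request analyzed here, live=False, rebuild=False and flavor is the (unchanged) Flavor object, so the cold branch is taken.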
-------------------------------------------------------------
#Continuing:
def _cold_migrate(self, context, instance, flavor,
                  filter_properties,
                  reservations, clean_shutdown):
    # Get the image information from the instance object, e.g.:
    """
    {u'min_disk': u'20', u'container_format': u'bare',
     u'min_ram': u'0', u'disk_format': u'raw', 'properties':
     {u'base_image_ref': u'e0cc468f-6501-4a85-9b19-
     70e782861387'}}
    """
    image = utils.get_image_from_system_metadata(
        instance.system_metadata)
    # Build the request dict from the image properties, the instance
    # properties and the flavor, in this format:
    """
    request_spec = {
        'image': image,
        'instance_properties': instance,
        'instance_type': flavor,
        'num_instances': 1}
    """
    request_spec = scheduler_utils.build_request_spec(
        context, image, [instance], instance_type=flavor)
    # Build the migration task object
    # `/nova/conductor/tasks/migrate.py/MigrationTask`
    task = self._build_cold_migrate_task(context, instance,
                                         flavor,
                                         filter_properties,
                                         request_spec,
                                         reservations,
                                         clean_shutdown)
    """Exception handling omitted:
    the task exits on errors such as no suitable target host found
    or an invalid policy; before exiting it updates the database to
    set the server state, logs the error and sends a
    `compute_task.migrate_server` notification
    """
    # Run the migration, analyzed below
    task.execute()
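The request_spec built here is just a plain dict. A simplified sketch of what scheduler_utils.build_request_spec produces (field conversion and image handling omitted; the signature is trimmed for illustration):

```python
def build_request_spec(image, instances, instance_type):
    # Bundle everything nova-scheduler needs to pick a host:
    # the image metadata, the instance properties and the flavor.
    return {
        'image': image or {},
        'instance_properties': instances[0],
        'instance_type': instance_type,
        'num_instances': len(instances),
    }
```

For a migration this is called with a single-element instance list, so num_instances is 1, matching the example format above.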
---------------------------------------------------------------
#Continuing: `nova/conductor/tasks/migrate.py/MigrationTask._execute`
def _execute(self):
    # Get the image information from the request parameters
    image = self.request_spec.get('image')
    # Build a quota object from the reserved quota in
    # self.reservations; a migration reserves no quota, so
    # self.reservations = []
    self.quotas = objects.Quotas.from_reservations(self.context,
                                                   self.reservations,
                                                   instance=self.instance)
    # Add the group (group_hosts) and group policy (group_policies)
    # information, if any, to the filter properties
    scheduler_utils.setup_instance_group(self.context,
                                         self.request_spec,
                                         self.filter_properties)
    """Add the retry parameters to the filter properties (when the
    configured retry count CONF.scheduler_max_attempts > 1); the
    updated filter properties look like:
    {'retry': {'num_attempts': 1, 'hosts': []},
     u'ignore_hosts': []}
    For a retry request sent by `nova-compute`, the retry dict in
    the incoming filter_properties contains the exception raised on
    the previous attempt, and the hosts listed in `hosts` are
    excluded when a target host is chosen again; populate_retry
    logs that exception, and raises if the maximum retry count is
    exceeded
    """
    scheduler_utils.populate_retry(
        self.filter_properties,
        self.instance.uuid)
    # Ask `nova-scheduler` to pick suitable target hosts according
    # to the filter rules; on timeout the request is retried as per
    # the retry parameters above. On success a list of suitable
    # target hosts is returned; if none is found, an exception is
    # raised
    hosts = self.scheduler_client.select_destinations(
        self.context, self.request_spec, self.filter_properties)
    # Take the first one
    host_state = hosts[0]
    # Add the target host to the retry list in the filter properties
    # (hosts in 'hosts' are ignored on retry), e.g.:
    """
    {'retry': {'num_attempts': 1, 'hosts': [[u'devstack',
     u'devstack']]}, 'limits': {u'memory_mb': 11733.0,
     u'disk_gb': 1182.0}, u'ignore_hosts': []}
    """
    scheduler_utils.populate_filter_properties(
        self.filter_properties,
        host_state)
    # context is not serializable
    self.filter_properties.pop('context', None)
    # Send a `prep_resize` message to the message queue via an
    # asynchronous rpc call; `nova-compute` will handle it
    # (`nova/compute/rpcapi.py/ComputeAPI`)
    (host, node) = (host_state['host'], host_state['nodename'])
    self.compute_rpcapi.prep_resize(
        self.context, image, self.instance, self.flavor, host,
        self.reservations, request_spec=self.request_spec,
        filter_properties=self.filter_properties, node=node,
        clean_shutdown=self.clean_shutdown)
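The retry bookkeeping done by populate_retry can be sketched as follows. This is a simplified stand-in (the real helper reads CONF.scheduler_max_attempts, logs the previous exception and raises MaxRetriesExceeded):

```python
def populate_retry(filter_properties, instance_uuid, max_attempts=3):
    # max_attempts == 1 means scheduling retries are disabled
    if max_attempts == 1:
        return
    # Create the retry dict on the first attempt, then bump the
    # counter; 'hosts' accumulates hosts to exclude on later tries.
    retry = filter_properties.setdefault(
        'retry', {'num_attempts': 0, 'hosts': []})
    retry['num_attempts'] += 1
    if retry['num_attempts'] > max_attempts:
        raise RuntimeError('Exceeded max scheduling attempts %d for '
                           'instance %s' % (max_attempts, instance_uuid))
```

After the first call, filter_properties carries {'retry': {'num_attempts': 1, 'hosts': []}}, the same shape shown in the comment above.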
Summary: nova-conductor mainly relies on nova-scheduler to choose a suitable target host; it also updates the nova.instance_actions_events table, and finally issues an asynchronous rpc call to hand the migration request over to nova-compute.
This concludes the first part of the cold-migration analysis. The flow is fairly simple: perform some precondition checks, update the database records, pick a target host through nova-scheduler, and hand the request over to nova-compute. Stay tuned for: OpenStack Liberty VM migration source analysis, cold migration part 2.