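Last night I struggled for two hours and still did not manage to write a single blog post.
There was too much to cover and many of the details were very fine-grained. I wanted the coverage to be broad, yet the descriptions kept coming out fairly thin. That tells me I need to improve quickly, so here goes.
Enough chatter, on to the topic.
I am 蝦悠悠 (QQ: 617600535, email: [email protected]); feel free to get in touch.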
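(I) A few things about periodic_task
1. As everyone knows, periodic_task is a periodic task. So when exactly does it get started?
Answer: anyone familiar with the nova service startup process covered earlier will know that periodic_task starts along with a service (for example the compute service). The source is in /nova/service.py: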
# Inside start(): the code that starts periodic_task
if self.periodic_enable:
    if self.periodic_fuzzy_delay:
        initial_delay = random.randint(0, self.periodic_fuzzy_delay)
    else:
        initial_delay = None
    # The call below is what actually starts periodic_task
    self.tg.add_dynamic_timer(self.periodic_tasks,
                              initial_delay=initial_delay,
                              periodic_interval_max=
                                  self.periodic_interval_max)
Notice that self.tg.add_dynamic_timer() is passed self.periodic_tasks. That method, defined in the same file and shown below, calls manager.periodic_tasks(), which runs every function in the manager that is decorated with @periodic_task.
def periodic_tasks(self, raise_on_error=False):
    """Tasks to be run at a periodic interval."""
    ctxt = context.get_admin_context()
    return self.manager.periodic_tasks(ctxt, raise_on_error=raise_on_error)
With those few steps, the periodic work is up and running.
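To make the flow above a bit more concrete, here is a minimal, self-contained sketch of the idea. It is not nova's implementation: the `periodic_task` decorator, the `Manager` class and `pick_initial_delay` below are simplified stand-ins I made up for illustration (the real code collects tasks differently and runs them on a timer). It only shows the two points discussed: a fuzzy initial delay, and a manager running every method that carries the decorator.

```python
import random


def periodic_task(func):
    """Simplified stand-in decorator: just tag the function as periodic."""
    func._periodic_task = True
    return func


class Manager:
    """Hypothetical manager, only for illustration."""

    def __init__(self):
        self.calls = []

    @periodic_task
    def update_available_resource(self, context):
        self.calls.append(("update_available_resource", context))

    def not_periodic(self, context):
        # Not decorated, so periodic_tasks() must skip it.
        self.calls.append(("not_periodic", context))

    def periodic_tasks(self, context, raise_on_error=False):
        """Run every method that was tagged with @periodic_task."""
        for name in sorted(dir(self)):
            method = getattr(self, name)
            if callable(method) and getattr(method, "_periodic_task", False):
                try:
                    method(context)
                except Exception:
                    if raise_on_error:
                        raise


def pick_initial_delay(periodic_fuzzy_delay):
    """Same branch as in start(): a random delay, or None when disabled."""
    if periodic_fuzzy_delay:
        return random.randint(0, periodic_fuzzy_delay)
    return None


manager = Manager()
manager.periodic_tasks("admin-context")
print(manager.calls)
print(pick_initial_delay(60), pick_initial_delay(0))
```

The random initial delay matters when many services start at the same moment: each one begins its periodic work at a slightly different time instead of all hitting the database together.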
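Below is a screenshot of the resource refresh code in the compute manager.
(II) The host resource refresh mechanism
From (I) we now know quite clearly how periodic_task gets started. Next, continuing from the screenshot above, let's briefly go over how a compute node host refreshes its resources.
1. First, here is the code from the screenshot above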
# In /nova/compute/manager.py
# Periodic host resource refresh function
@periodic_task.periodic_task
def update_available_resource(self, context):
    """See driver.get_available_resource()
    Periodic process that keeps that the compute host's understanding of
    resource availability and usage in sync with the underlying hypervisor.
    :param context: security context
    """
    new_resource_tracker_dict = {}
    nodenames = set(self.driver.get_available_nodes())
    for nodename in nodenames:
        rt = self._get_resource_tracker(nodename)
        rt.update_available_resource(context)
        new_resource_tracker_dict[nodename] = rt
    # Delete orphan compute node not reported by driver but still in db
    compute_nodes_in_db = self._get_compute_nodes_in_db(context)
    for cn in compute_nodes_in_db:
        if cn.get('hypervisor_hostname') not in nodenames:
            LOG.audit(_("Deleting orphan compute node %s") % cn['id'])
            self.conductor_api.compute_node_delete(context, cn)
    self._resource_tracker_dict = new_resource_tracker_dict
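A brief analysis of the update_available_resource code in manager.py:
(1) nodenames = set(self.driver.get_available_nodes()) first fetches the compute nodes that the driver can report.
(2) rt = self._get_resource_tracker(nodename) gives each node something like a resource tracker handle, which can be used to work with the underlying resources.
rt is in fact an instance of the class ResourceTracker in </nova/compute/resource_tracker.py>.
(3) rt.update_available_resource(context) then calls the update_available_resource() function in </nova/compute/resource_tracker.py>.
The code is as follows: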
@utils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
def update_available_resource(self, context):
    """Override in-memory calculations of compute node resource usage based
    on data audited from the hypervisor layer.
    Add in resource claims in progress to account for operations that have
    declared a need for resources, but not necessarily retrieved them from
    the hypervisor layer yet.
    """
    LOG.audit(_("Auditing locally available compute resources"))
    resources = self.driver.get_available_resource(self.nodename)
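(4) The key resource update call is the line resources = self.driver.get_available_resource(self.nodename). The default driver in nova is KVM virtualization, which in turn is built on libvirt, so we go to /nova/virt/libvirt/driver.py (in the default KVM case, all communication with virtual machines goes through this driver adapter).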
def get_available_resource(self, nodename):
    """Retrieve resource information.
    This method is called when nova-compute launches, and
    as part of a periodic task that records the results in the DB.
    :param nodename: will be put in PCI device
    :returns: dictionary containing resource info
    """
    # Temporary: convert supported_instances into a string, while keeping
    # the RPC version as JSON. Can be changed when RPC broadcast is removed
    stats = self.host_state.get_host_stats(refresh=True)
    stats['supported_instances'] = jsonutils.dumps(
        stats['supported_instances'])
    return stats
(5) We can see that the driver in turn calls stats = self.host_state.get_host_stats(refresh=True) to fetch the host's state information.
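The one transformation the driver applies before returning is turning supported_instances into a JSON string. Here is a small sketch of that step, assuming the standard library's `json` module as a stand-in for nova's `jsonutils`; the helper name `stringify_supported_instances` and the sample values are made up for illustration. Unlike the driver code, which modifies the dict in place, this sketch works on a copy so the effect is easier to see.

```python
import json


def stringify_supported_instances(stats):
    """Return a copy of stats with 'supported_instances' as a JSON string."""
    result = dict(stats)
    result["supported_instances"] = json.dumps(result["supported_instances"])
    return result


# Invented sample data, not real output from a host.
stats = {"vcpus": 8, "supported_instances": [["x86_64", "kvm", "hvm"]]}
converted = stringify_supported_instances(stats)

print(type(converted["supported_instances"]).__name__)
print(json.loads(converted["supported_instances"]))
```

Whoever consumes the string can get the original list back with `json.loads`, as the last line shows.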
(6) So let's look at a few related functions of HostState in the same file.
a) The get_host_stats method: as we saw in (5), refresh=True, so the function will, as its docstring says, first "run update".
def get_host_stats(self, refresh=False):
    """Return the current state of the host.
    If 'refresh' is True, run update the stats first.
    """
    if refresh or not self._stats:
        self.update_status()
    return self._stats
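The "refresh or use the cache" behaviour of get_host_stats can be tried out in isolation. The following is only a sketch under my own assumptions: this `HostState` takes a hypothetical `collector` callable in place of the real libvirt queries, `update_count` exists purely so we can observe when a refresh happens, and the sample values are invented.

```python
class HostState:
    """Simplified stand-in for the HostState class discussed above."""

    def __init__(self, collector):
        self._collector = collector  # hypothetical source of host data
        self._stats = {}
        self.update_count = 0  # only here to make refreshes observable

    def get_host_stats(self, refresh=False):
        # Refresh when asked to, or when nothing has been cached yet.
        if refresh or not self._stats:
            self.update_status()
        return self._stats

    def update_status(self):
        self.update_count += 1
        data = dict(self._collector())
        self._stats = data
        return data


def fake_collector():
    # Invented numbers standing in for what the hypervisor would report.
    return {"vcpus": 8, "memory_mb": 16384, "local_gb": 500}


host_state = HostState(fake_collector)
host_state.get_host_stats()              # empty cache, so this refreshes
host_state.get_host_stats()              # served from the cache
host_state.get_host_stats(refresh=True)  # forced refresh
print(host_state.update_count)
```

Because the periodic task always passes refresh=True, every run goes back to the hypervisor rather than reusing the cached values.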
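b) Now, what does the update_status function actually do?
Its docstring, """Retrieve status info from libvirt.""", states the function's real purpose: it pulls all of the host's resource information (CPU, memory and so on) from libvirt. That is the host information available to us. See the screenshot for exactly what information compute collects.
(7) The previous step assigns the collected information to a dict, data = {}, and then sets _stats = data, so _stats holds the host's information. Now go back to /nova/compute/resource_tracker.py. In update_available_resource, the line resources = self.driver.get_available_resource(self.nodename) performs the refresh of resource data from the host. At the end of the function there are two key calls: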
(A)self._report_final_resource_view(resources)
(B)self._sync_compute_node(context, resources)
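Look at (A) first. It reports the remaining free amounts of each resource, which prepares for the next step, where the Scheduler consults the host's state when scheduling.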
def _report_final_resource_view(self, resources):
    """Report final calculate of free memory, disk, CPUs, and PCI devices,
    including instance calculations and in-progress resource claims. These
    values will be exposed via the compute node table to the scheduler.
    """
    LOG.audit(_("Free ram (MB): %s") % resources['free_ram_mb'])
    LOG.audit(_("Free disk (GB): %s") % resources['free_disk_gb'])
    vcpus = resources['vcpus']
    if vcpus:
        free_vcpus = vcpus - resources['vcpus_used']
        LOG.audit(_("Free VCPUS: %s") % free_vcpus)
    else:
        LOG.audit(_("Free VCPU information unavailable"))
    if 'pci_devices' in resources:
        LOG.audit(_("Free PCI devices: %s") % resources['pci_devices'])
(B) This call, evidently, updates each compute node's resource state in the database:
self._sync_compute_node(context, resources)
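To see what step (A) produces, here is a standalone sketch of the same arithmetic that returns the messages instead of writing them to the log. The function name `final_resource_view` and the sample resources dict are my own inventions for illustration; only the keys and message formats are taken from the code above.

```python
def final_resource_view(resources):
    """Build the same messages that step (A) logs, as a list of strings."""
    lines = [
        "Free ram (MB): %s" % resources["free_ram_mb"],
        "Free disk (GB): %s" % resources["free_disk_gb"],
    ]
    vcpus = resources["vcpus"]
    if vcpus:
        free_vcpus = vcpus - resources["vcpus_used"]
        lines.append("Free VCPUS: %s" % free_vcpus)
    else:
        lines.append("Free VCPU information unavailable")
    if "pci_devices" in resources:
        lines.append("Free PCI devices: %s" % resources["pci_devices"])
    return lines


# Invented sample values, not taken from a real host.
sample = {"free_ram_mb": 2048, "free_disk_gb": 100, "vcpus": 8, "vcpus_used": 3}
for line in final_resource_view(sample):
    print(line)
```

With 8 VCPUs in total and 3 in use, the VCPU line reports 5 free.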
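That finally wraps up the two topics above. I have mostly just described the flow, so it is a little rough.
I had meant to go into more detail, but found that the more I explained, the longer and messier it got.
So I hope you will dig into it a bit more on your own and work through the details yourselves.
If anything above is wrong, or if any experts have pointers for me, please contact me right away. I am always online.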