k8s garbage collector source code analysis (1): startup analysis

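Introduction to the garbage collector

The Kubernetes garbage collector lives in kube-controller-manager. It is responsible for reclaiming resource objects in Kubernetes: it watches resource object events, updates the dependency relationships between objects, and decides, based on an object's deletion policy, whether to delete its associated objects.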

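As for deleting associated objects: more precisely, when an owner is deleted with the cascading deletion policy, the dependent objects of that owner are deleted along with it.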

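As for the dependency relationships between objects: the garbage collector watches resource object events and, based on the ownerReferences value in each resource object, builds the dependency relationships between objects, that is, the relationships between owner and dependent.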

About owner and dependent

Creating a deployment object serves as the example.

After a deployment object is created, kube-controller-manager creates a replicaset object for it and automatically records the deployment's information in the replicaset's ownerReferences value. In the example below, the owner of replicaset test-1-59d7f45ffb is deployment test-1, and the dependent of deployment test-1 is replicaset test-1-59d7f45ffb:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-1
  namespace: test
  uid: 4973d370-3221-46a7-8d86-e145bf9ad0ce
...
           
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: test-1-59d7f45ffb
  namespace: test
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: test-1
    uid: 4973d370-3221-46a7-8d86-e145bf9ad0ce
  uid: 386c380b-490e-470b-a33f-7d5b0bf945fb
...
           

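Likewise, after a replicaset object is created, kube-controller-manager creates pod objects for it, and these pods record the replicaset's information in their ownerReferences value: the replicaset is the pods' owner, and the pods are the replicaset's dependents.

The ownerReferences value in an object is what specifies the relationship between owner and dependent.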

Garbage collector architecture diagram

[Figure: garbage collector architecture diagram]

The most important code in the garbage collector lives in two files: garbagecollector.go and graph_builder.go.

The garbage collector is made up of 1 graph (the object dependency graph), 2 processors (GraphBuilder and GarbageCollector), and 3 event queues (graphChanges, attemptToDelete, and attemptToOrphan):

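1 graph

(1) uidToNode: the object dependency graph, maintained by GraphBuilder, which holds the dependency relationships between all objects. Every k8s object corresponds to one node in the graph, and every node keeps a list of owners and a list of dependents.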

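Example: given deployment A, replicaset B (owner: deployment A), and pod C (owner: replicaset B), the object dependency relationships are as follows:

3 nodes: A, B, and C

A has one node: no owner; its dependent list contains B;
B has one node: its owner list contains A; its dependent list contains C;
C has one node: its owner list contains B; no dependents.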
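
The A, B, C example can be sketched in Go. This is a minimal illustration, not the real uidToNode implementation: the object and node types and the buildGraph helper are hypothetical, and only owner UIDs are kept from each object's ownerReferences.

```go
package main

import "fmt"

// object is a hypothetical, simplified stand-in for a Kubernetes object:
// only its UID and the owner UIDs from ownerReferences are kept.
type object struct {
	uid    string
	owners []string
}

// node mirrors the idea of a graph node: its owners plus its dependents.
type node struct {
	owners     []string
	dependents map[string]bool
}

// buildGraph creates one node per object and registers every object
// as a dependent of each of its owners.
func buildGraph(objs []object) map[string]*node {
	graph := make(map[string]*node)
	get := func(uid string) *node {
		n, ok := graph[uid]
		if !ok {
			n = &node{dependents: make(map[string]bool)}
			graph[uid] = n
		}
		return n
	}
	for _, o := range objs {
		n := get(o.uid)
		n.owners = o.owners
		for _, owner := range o.owners {
			get(owner).dependents[o.uid] = true
		}
	}
	return graph
}

func main() {
	graph := buildGraph([]object{
		{uid: "A"},                        // deployment
		{uid: "B", owners: []string{"A"}}, // replicaset
		{uid: "C", owners: []string{"B"}}, // pod
	})
	for _, uid := range []string{"A", "B", "C"} {
		n := graph[uid]
		fmt.Printf("%s owners=%v dependents=%d\n", uid, n.owners, len(n.dependents))
	}
}
```

Keeping both directions on each node means a lookup from owner to dependents, or from dependent to owners, does not require scanning every object.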
           

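2 processors

(1) GraphBuilder: maintains the dependency graph of all objects and produces events that trigger GarbageCollector to reclaim and delete objects. GraphBuilder consumes events from the graphChanges queue and, based on the ownerReferences value in each resource object, builds, updates, and deletes entries in the dependency graph, that is, the graph of owner and dependent relationships. It then acts as a producer, putting events on the attemptToDelete and attemptToOrphan queues to trigger GarbageCollector, which checks whether associated objects need to be reclaimed. When GarbageCollector reclaims objects, it relies on the uidToNode graph.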

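(2) GarbageCollector: reclaims and deletes objects. As a consumer, GarbageCollector takes events from the attemptToDelete and attemptToOrphan queues and processes them; if an object has been deleted and its deletion policy is cascading deletion, its associated objects are reclaimed too. More precisely, when an owner is deleted with the cascading deletion policy, the dependent objects of that owner are deleted along with it.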
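
The effect of cascading deletion can be illustrated with a small graph walk. This is only a sketch under simplifying assumptions: the dependents type and the cascade function are hypothetical, the pod name pod-x is made up, and details such as objects with several owners are ignored. The real GarbageCollector works through its queues and the apiserver rather than a recursive walk.

```go
package main

import (
	"fmt"
	"sort"
)

// dependents maps an owner's name to the names of its direct dependents.
type dependents map[string][]string

// cascade returns the owner plus every object reachable from it through
// dependent links, sorted by name.
func cascade(deps dependents, owner string) []string {
	seen := map[string]bool{}
	var walk func(name string)
	walk = func(name string) {
		if seen[name] {
			return
		}
		seen[name] = true
		for _, d := range deps[name] {
			walk(d)
		}
	}
	walk(owner)

	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	deps := dependents{
		"test-1":            {"test-1-59d7f45ffb"}, // deployment -> replicaset
		"test-1-59d7f45ffb": {"pod-x"},             // replicaset -> pod (hypothetical name)
	}
	fmt.Println(cascade(deps, "test-1"))
}
```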

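3 event queues

(1) graphChanges: events obtained by list/watch against the apiserver; produced by the informers, consumed by GraphBuilder;

(2) attemptToDelete: the cascading-deletion event queue; produced by GraphBuilder, consumed by GarbageCollector;

(3) attemptToOrphan: the orphan-deletion event queue; produced by GraphBuilder, consumed by GarbageCollector.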
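
The producer and consumer roles of the three queues can be sketched with Go channels. This is an illustration only: the real queues are rate-limiting work queues, not channels, and the event type with its orphan flag is a hypothetical simplification of how GraphBuilder decides which queue to use.

```go
package main

import "fmt"

// event is a hypothetical, simplified graph change.
type event struct {
	name   string
	orphan bool // assumption: true means dependents should be orphaned
}

// graphBuilder consumes graphChanges and routes each event to one of the
// two downstream queues, closing them once the input is exhausted.
func graphBuilder(graphChanges <-chan event, attemptToDelete, attemptToOrphan chan<- string) {
	for e := range graphChanges {
		if e.orphan {
			attemptToOrphan <- e.name
		} else {
			attemptToDelete <- e.name
		}
	}
	close(attemptToDelete)
	close(attemptToOrphan)
}

// run pushes events through the pipeline and returns what the consumer
// of each queue received.
func run(events []event) (deleted, orphaned []string) {
	// Buffers are sized so that no stage can block in this sketch.
	graphChanges := make(chan event, len(events))
	attemptToDelete := make(chan string, len(events))
	attemptToOrphan := make(chan string, len(events))

	// Informer side: produce graph changes.
	for _, e := range events {
		graphChanges <- e
	}
	close(graphChanges)

	go graphBuilder(graphChanges, attemptToDelete, attemptToOrphan)

	// GarbageCollector side: consume both queues.
	for name := range attemptToDelete {
		deleted = append(deleted, name)
	}
	for name := range attemptToOrphan {
		orphaned = append(orphaned, name)
	}
	return deleted, orphaned
}

func main() {
	deleted, orphaned := run([]event{
		{name: "rs-a"},
		{name: "rs-b", orphan: true},
		{name: "pod-a"},
	})
	fmt.Println("attemptToDelete:", deleted)
	fmt.Println("attemptToOrphan:", orphaned)
}
```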

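Startup parameters related to the garbage collector

Among the startup parameters of the kcm component, the code for the ones related to the garbage collector is as follows: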

// cmd/kube-controller-manager/app/options/garbagecollectorcontroller.go
// AddFlags adds flags related to GarbageCollectorController for controller manager to the specified FlagSet.
func (o *GarbageCollectorControllerOptions) AddFlags(fs *pflag.FlagSet) {
	if o == nil {
		return
	}

	fs.Int32Var(&o.ConcurrentGCSyncs, "concurrent-gc-syncs", o.ConcurrentGCSyncs, "The number of garbage collector workers that are allowed to sync concurrently.")
	fs.BoolVar(&o.EnableGarbageCollector, "enable-garbage-collector", o.EnableGarbageCollector, "Enables the generic garbage collector. MUST be synced with the corresponding flag of the kube-apiserver.")
}
           

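As the code shows, two kcm startup parameters relate to the garbage collector:

(1) enable-garbage-collector: whether to enable the garbage collector; defaults to true;

(2) concurrent-gc-syncs: the number of workers the garbage collector runs for sync operations; defaults to 20.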
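
To see how the two parameters and their defaults behave, here is a minimal sketch that uses the standard library flag package instead of pflag, which the real code uses. The gcOptions type and parseGCFlags function are hypothetical; only the flag names, help strings, and defaults come from the text above.

```go
package main

import (
	"flag"
	"fmt"
)

// gcOptions is a hypothetical holder for the two garbage collector settings.
type gcOptions struct {
	EnableGarbageCollector bool
	ConcurrentGCSyncs      int
}

// parseGCFlags parses args, starting from the defaults described in the
// article (enabled, 20 workers).
func parseGCFlags(args []string) (gcOptions, error) {
	opts := gcOptions{EnableGarbageCollector: true, ConcurrentGCSyncs: 20}
	fs := flag.NewFlagSet("kcm-sketch", flag.ContinueOnError)
	fs.BoolVar(&opts.EnableGarbageCollector, "enable-garbage-collector",
		opts.EnableGarbageCollector, "Enables the generic garbage collector.")
	fs.IntVar(&opts.ConcurrentGCSyncs, "concurrent-gc-syncs",
		opts.ConcurrentGCSyncs, "The number of garbage collector workers that are allowed to sync concurrently.")
	err := fs.Parse(args)
	return opts, err
}

func main() {
	opts, err := parseGCFlags([]string{"--concurrent-gc-syncs=40"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("enabled=%v workers=%d\n", opts.EnableGarbageCollector, opts.ConcurrentGCSyncs)
}
```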

The source code analysis of the garbage collector is split into two parts:

(1) startup analysis;

(2) analysis of the core processing logic.

This post covers the startup analysis.

garbage collector source code analysis: startup analysis

Based on tag v1.17.4

https://github.com/kubernetes/kubernetes/releases/tag/v1.17.4

The startGarbageCollectorController function is the entry point for this analysis of the garbage collector source code.

startGarbageCollectorController

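The main logic of the startGarbageCollectorController function is as follows:

(1) Decide whether to enable the garbage collector based on the value of EnableGarbageCollector, which comes from the kcm startup parameter --enable-garbage-collector and defaults to true; if it is disabled, the function returns immediately and goes no further;

(2) Initialize discoveryClient, which is mainly used to discover all resources in the cluster;

(3) Call garbagecollector.GetDeletableResources to get all resources in the cluster that the garbage collector has to handle; a resource that supports the three operations delete, list, and watch is called a deletableResource;

(4) Call garbagecollector.NewGarbageCollector to initialize the garbage collector;

(5) Call garbageCollector.Run to start the garbage collector;

(6) Call garbageCollector.Sync to watch the deletableResources in the cluster; when new deletableResources appear, they are synced into monitors so that all resources in the cluster are monitored;

(7) Expose an HTTP handler and register a debug endpoint that serves the relationship graph, built by GraphBuilder, of all objects in the cluster.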

// cmd/kube-controller-manager/app/core.go
func startGarbageCollectorController(ctx ControllerContext) (http.Handler, bool, error) {
	if !ctx.ComponentConfig.GarbageCollectorController.EnableGarbageCollector {
		return nil, false, nil
	}

	gcClientset := ctx.ClientBuilder.ClientOrDie("generic-garbage-collector")
	discoveryClient := cacheddiscovery.NewMemCacheClient(gcClientset.Discovery())

	config := ctx.ClientBuilder.ConfigOrDie("generic-garbage-collector")
	metadataClient, err := metadata.NewForConfig(config)
	if err != nil {
		return nil, true, err
	}

	// Get an initial set of deletable resources to prime the garbage collector.
	deletableResources := garbagecollector.GetDeletableResources(discoveryClient)
	ignoredResources := make(map[schema.GroupResource]struct{})
	for _, r := range ctx.ComponentConfig.GarbageCollectorController.GCIgnoredResources {
		ignoredResources[schema.GroupResource{Group: r.Group, Resource: r.Resource}] = struct{}{}
	}
	garbageCollector, err := garbagecollector.NewGarbageCollector(
		metadataClient,
		ctx.RESTMapper,
		deletableResources,
		ignoredResources,
		ctx.ObjectOrMetadataInformerFactory,
		ctx.InformersStarted,
	)
	if err != nil {
		return nil, true, fmt.Errorf("failed to start the generic garbage collector: %v", err)
	}

	// Start the garbage collector.
	workers := int(ctx.ComponentConfig.GarbageCollectorController.ConcurrentGCSyncs)
	go garbageCollector.Run(workers, ctx.Stop)

	// Periodically refresh the RESTMapper with new discovery information and sync
	// the garbage collector.
	go garbageCollector.Sync(gcClientset.Discovery(), 30*time.Second, ctx.Stop)

	return garbagecollector.NewDebugHandler(garbageCollector), true, nil
}
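
Step (3) above calls a resource deletable when it supports delete, list, and watch. A minimal sketch of that filter follows; it uses a hypothetical resource type and sample data instead of the discovery client, so it shows only the verb check, not how GetDeletableResources is implemented.

```go
package main

import "fmt"

// resource is a hypothetical, simplified view of a discovered API resource.
type resource struct {
	name  string
	verbs []string
}

// supportsAll reports whether verbs contains every verb in required.
func supportsAll(verbs []string, required ...string) bool {
	have := make(map[string]bool, len(verbs))
	for _, v := range verbs {
		have[v] = true
	}
	for _, r := range required {
		if !have[r] {
			return false
		}
	}
	return true
}

// deletable keeps only the resources that support delete, list, and watch.
func deletable(all []resource) []string {
	var out []string
	for _, r := range all {
		if supportsAll(r.verbs, "delete", "list", "watch") {
			out = append(out, r.name)
		}
	}
	return out
}

func main() {
	// Sample data for illustration only.
	fmt.Println(deletable([]resource{
		{name: "pods", verbs: []string{"create", "delete", "get", "list", "watch"}},
		{name: "bindings", verbs: []string{"create"}},
	}))
}
```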
           

The following sections look more closely at parts of the logic in startGarbageCollectorController.

1.garbagecollector.NewGarbageCollector

The NewGarbageCollector function initializes the garbage collector. Its main logic is:

(1) Initialize the GarbageCollector struct;

(2) Initialize the GraphBuilder struct and assign it to the dependencyGraphBuilder field of the GarbageCollector struct.

// pkg/controller/garbagecollector/garbagecollector.go
func NewGarbageCollector(
	metadataClient metadata.Interface,
	mapper resettableRESTMapper,
	deletableResources map[schema.GroupVersionResource]struct{},
	ignoredResources map[schema.GroupResource]struct{},
	sharedInformers controller.InformerFactory,
	informersStarted <-chan struct{},
) (*GarbageCollector, error) {
	attemptToDelete := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "garbage_collector_attempt_to_delete")
	attemptToOrphan := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "garbage_collector_attempt_to_orphan")
	absentOwnerCache := NewUIDCache(500)
	gc := &GarbageCollector{
		metadataClient:   metadataClient,
		restMapper:       mapper,
		attemptToDelete:  attemptToDelete,
		attemptToOrphan:  attemptToOrphan,
		absentOwnerCache: absentOwnerCache,
	}
	gb := &GraphBuilder{
		metadataClient:   metadataClient,
		informersStarted: informersStarted,
		restMapper:       mapper,
		graphChanges:     workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "garbage_collector_graph_changes"),
		uidToNode: &concurrentUIDToNode{
			uidToNode: make(map[types.UID]*node),
		},
		attemptToDelete:  attemptToDelete,
		attemptToOrphan:  attemptToOrphan,
		absentOwnerCache: absentOwnerCache,
		sharedInformers:  sharedInformers,
		ignoredResources: ignoredResources,
	}
	if err := gb.syncMonitors(deletableResources); err != nil {
		utilruntime.HandleError(fmt.Errorf("failed to sync all monitors: %v", err))
	}
	gc.dependencyGraphBuilder = gb

	return gc, nil
}
           

1.1 gb.syncMonitors

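The main job of gb.syncMonitors is to call gb.controllerFor to initialize the informer for each of the deletableResources (resources that support the three operations "delete", "list", and "watch") and to register an eventHandler (AddFunc, UpdateFunc, and DeleteFunc) for resource change events. Add, update, and delete events for these resources are all pushed onto the graphChanges queue, and gb.processGraphChanges then takes events off the graphChanges queue and processes them (this is analyzed in detail later, together with the garbage collector's processing logic).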

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) syncMonitors(resources map[schema.GroupVersionResource]struct{}) error {
	gb.monitorLock.Lock()
	defer gb.monitorLock.Unlock()

	toRemove := gb.monitors
	if toRemove == nil {
		toRemove = monitors{}
	}
	current := monitors{}
	errs := []error{}
	kept := 0
	added := 0
	for resource := range resources {
		if _, ok := gb.ignoredResources[resource.GroupResource()]; ok {
			continue
		}
		if m, ok := toRemove[resource]; ok {
			current[resource] = m
			delete(toRemove, resource)
			kept++
			continue
		}
		kind, err := gb.restMapper.KindFor(resource)
		if err != nil {
			errs = append(errs, fmt.Errorf("couldn't look up resource %q: %v", resource, err))
			continue
		}
		c, s, err := gb.controllerFor(resource, kind)
		if err != nil {
			errs = append(errs, fmt.Errorf("couldn't start monitor for resource %q: %v", resource, err))
			continue
		}
		current[resource] = &monitor{store: s, controller: c}
		added++
	}
	gb.monitors = current

	for _, monitor := range toRemove {
		if monitor.stopCh != nil {
			close(monitor.stopCh)
		}
	}

	klog.V(4).Infof("synced monitors; added %d, kept %d, removed %d", added, kept, len(toRemove))
	// NewAggregate returns nil if errs is 0-length
	return utilerrors.NewAggregate(errs)
}
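
The bookkeeping in syncMonitors (keep monitors that are still wanted, add new ones, stop the rest) can be reduced to a small sketch. The reconcile function, the syncResult type, and the use of plain strings and stop functions in place of GroupVersionResource and monitor are all simplifications made for illustration.

```go
package main

import "fmt"

// syncResult counts what one reconciliation pass did.
type syncResult struct{ added, kept, removed int }

// reconcile compares the existing monitors (resource name -> stop function)
// with the wanted resources. Like the original, it consumes the existing map:
// entries that are kept are moved out, and whatever is left gets stopped.
func reconcile(existing map[string]func(), wanted, ignored map[string]bool) (map[string]func(), syncResult) {
	current := make(map[string]func())
	var res syncResult
	for r := range wanted {
		if ignored[r] {
			continue
		}
		if stop, ok := existing[r]; ok {
			current[r] = stop
			delete(existing, r)
			res.kept++
			continue
		}
		current[r] = func() {} // placeholder for creating a new monitor
		res.added++
	}
	for _, stop := range existing {
		stop()
		res.removed++
	}
	return current, res
}

func main() {
	existing := map[string]func(){
		"pods": func() {},
		"jobs": func() { fmt.Println("stopping monitor for jobs") },
	}
	wanted := map[string]bool{"pods": true, "deployments": true, "events": true}
	ignored := map[string]bool{"events": true}

	current, res := reconcile(existing, wanted, ignored)
	fmt.Printf("monitors=%d added=%d kept=%d removed=%d\n",
		len(current), res.added, res.kept, res.removed)
}
```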
           

gb.controllerFor

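gb.controllerFor mainly initializes the informer for a resource and registers an eventHandler (AddFunc, UpdateFunc, and DeleteFunc) for its change events; add, update, and delete events for the resource are all pushed onto the graphChanges queue.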

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) controllerFor(resource schema.GroupVersionResource, kind schema.GroupVersionKind) (cache.Controller, cache.Store, error) {
	handlers := cache.ResourceEventHandlerFuncs{
		// add the event to the dependencyGraphBuilder's graphChanges.
		AddFunc: func(obj interface{}) {
			event := &event{
				eventType: addEvent,
				obj:       obj,
				gvk:       kind,
			}
			gb.graphChanges.Add(event)
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			// TODO: check if there are differences in the ownerRefs,
			// finalizers, and DeletionTimestamp; if not, ignore the update.
			event := &event{
				eventType: updateEvent,
				obj:       newObj,
				oldObj:    oldObj,
				gvk:       kind,
			}
			gb.graphChanges.Add(event)
		},
		DeleteFunc: func(obj interface{}) {
			// delta fifo may wrap the object in a cache.DeletedFinalStateUnknown, unwrap it
			if deletedFinalStateUnknown, ok := obj.(cache.DeletedFinalStateUnknown); ok {
				obj = deletedFinalStateUnknown.Obj
			}
			event := &event{
				eventType: deleteEvent,
				obj:       obj,
				gvk:       kind,
			}
			gb.graphChanges.Add(event)
		},
	}
	shared, err := gb.sharedInformers.ForResource(resource)
	if err != nil {
		klog.V(4).Infof("unable to use a shared informer for resource %q, kind %q: %v", resource.String(), kind.String(), err)
		return nil, nil, err
	}
	klog.V(4).Infof("using a shared informer for resource %q, kind %q", resource.String(), kind.String())
	// need to clone because it's from a shared cache
	shared.Informer().AddEventHandlerWithResyncPeriod(handlers, ResourceResyncTime)
	return shared.Informer().GetController(), shared.Informer().GetStore(), nil
}
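
The handler pattern above, including the unwrapping of a deletion tombstone, can be tried out without client-go. In this sketch the tombstone type stands in for cache.DeletedFinalStateUnknown, the handlers struct stands in for cache.ResourceEventHandlerFuncs, and a slice stands in for the graphChanges work queue; all three are hypothetical simplifications.

```go
package main

import "fmt"

type eventType int

const (
	addEvent eventType = iota
	updateEvent
	deleteEvent
)

// event is a simplified graph change carrying the object and its old state.
type event struct {
	eventType eventType
	obj       interface{}
	oldObj    interface{}
}

// tombstone is a hypothetical stand-in for cache.DeletedFinalStateUnknown.
type tombstone struct{ Obj interface{} }

// handlers is a hypothetical stand-in for cache.ResourceEventHandlerFuncs.
type handlers struct {
	onAdd    func(obj interface{})
	onUpdate func(oldObj, newObj interface{})
	onDelete func(obj interface{})
}

// newHandlers returns handlers that append every change to queue.
func newHandlers(queue *[]event) handlers {
	return handlers{
		onAdd: func(obj interface{}) {
			*queue = append(*queue, event{eventType: addEvent, obj: obj})
		},
		onUpdate: func(oldObj, newObj interface{}) {
			*queue = append(*queue, event{eventType: updateEvent, obj: newObj, oldObj: oldObj})
		},
		onDelete: func(obj interface{}) {
			// Unwrap the tombstone so consumers always see the object itself.
			if t, ok := obj.(tombstone); ok {
				obj = t.Obj
			}
			*queue = append(*queue, event{eventType: deleteEvent, obj: obj})
		},
	}
}

func main() {
	var queue []event
	h := newHandlers(&queue)
	h.onAdd("pod-a")
	h.onUpdate("pod-a", "pod-a-v2")
	h.onDelete(tombstone{Obj: "pod-a-v2"})
	for _, e := range queue {
		fmt.Println(e.eventType, e.obj)
	}
}
```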
           

2.garbageCollector.Run

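garbageCollector.Run starts the garbage collector. Its main logic is:

(1) Call gc.dependencyGraphBuilder.Run to start GraphBuilder, then wait until its resource monitors have synced;

(2) Start goroutines according to the worker count configured by the startup parameter, one running gc.runAttemptToDeleteWorker and one running gc.runAttemptToOrphanWorker per worker. Both belong to the core processing logic of GarbageCollector and both work on objects that need to be reclaimed; they are analyzed in the next post.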

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) Run(workers int, stopCh <-chan struct{}) {
	defer utilruntime.HandleCrash()
	defer gc.attemptToDelete.ShutDown()
	defer gc.attemptToOrphan.ShutDown()
	defer gc.dependencyGraphBuilder.graphChanges.ShutDown()

	klog.Infof("Starting garbage collector controller")
	defer klog.Infof("Shutting down garbage collector controller")

	go gc.dependencyGraphBuilder.Run(stopCh)

	if !cache.WaitForNamedCacheSync("garbage collector", stopCh, gc.dependencyGraphBuilder.IsSynced) {
		return
	}

	klog.Infof("Garbage collector: all resource monitors have synced. Proceeding to collect garbage")

	// gc workers
	for i := 0; i < workers; i++ {
		go wait.Until(gc.runAttemptToDeleteWorker, 1*time.Second, stopCh)
		go wait.Until(gc.runAttemptToOrphanWorker, 1*time.Second, stopCh)
	}

	<-stopCh
}
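
The worker start-up in Run, one goroutine per queue for each configured worker, can be sketched as follows. This is a simplified illustration: the runWorkers and fill helpers are hypothetical, the queues are closed channels rather than work queues, and the workers stop when their queue is drained instead of when a stop channel closes.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts `workers` goroutines for each of the two queues and
// returns how many items were handled from each queue.
func runWorkers(workers int, attemptToDelete, attemptToOrphan <-chan string) (deleted, orphaned int) {
	var wg sync.WaitGroup
	var mu sync.Mutex

	// drain counts items from one queue until the queue is closed.
	drain := func(queue <-chan string, counter *int) {
		defer wg.Done()
		for range queue {
			mu.Lock()
			*counter++
			mu.Unlock()
		}
	}

	for i := 0; i < workers; i++ {
		wg.Add(2)
		go drain(attemptToDelete, &deleted)
		go drain(attemptToOrphan, &orphaned)
	}
	wg.Wait()
	return deleted, orphaned
}

// fill returns a closed, pre-filled queue for the sketch.
func fill(items ...string) <-chan string {
	ch := make(chan string, len(items))
	for _, it := range items {
		ch <- it
	}
	close(ch)
	return ch
}

func main() {
	d, o := runWorkers(3, fill("rs-a", "rs-b", "pod-a"), fill("pod-b"))
	fmt.Println("deleted:", d, "orphaned:", o)
}
```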
           

2.1 gc.dependencyGraphBuilder.Run

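gc.dependencyGraphBuilder.Run starts GraphBuilder. Its main logic is:

(1) Call gb.startMonitors to start the informers mentioned in 1.1 gb.syncMonitors above;

(2) Call gb.runProcessGraphChanges in a loop with a 1s period, which runs the core logic of GraphBuilder; that logic is analyzed in the next post.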

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) Run(stopCh <-chan struct{}) {
	klog.Infof("GraphBuilder running")
	defer klog.Infof("GraphBuilder stopping")

	// Set up the stop channel.
	gb.monitorLock.Lock()
	gb.stopCh = stopCh
	gb.running = true
	gb.monitorLock.Unlock()

	// Start monitors and begin change processing until the stop channel is
	// closed.
	gb.startMonitors()
	wait.Until(gb.runProcessGraphChanges, 1*time.Second, stopCh)

	// Stop any running monitors.
	gb.monitorLock.Lock()
	defer gb.monitorLock.Unlock()
	monitors := gb.monitors
	stopped := 0
	for _, monitor := range monitors {
		if monitor.stopCh != nil {
			stopped++
			close(monitor.stopCh)
		}
	}

	// reset monitors so that the graph builder can be safely re-run/synced.
	gb.monitors = nil
	klog.Infof("stopped %d of %d monitors", stopped, len(monitors))
}
           

3.garbageCollector.Sync

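The main job of garbageCollector.Sync is to periodically query all deletableResources in the cluster and call gc.resyncMonitors to update the monitors of GraphBuilder: it initializes an informer and registers an eventHandler for each newly appeared resource, then starts the informer, and it tears down the monitors of resources that have been removed.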

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) Sync(discoveryClient discovery.ServerResourcesInterface, period time.Duration, stopCh <-chan struct{}) {
	oldResources := make(map[schema.GroupVersionResource]struct{})
	wait.Until(func() {
	// Get the current resource list from discovery.
	newResources := GetDeletableResources(discoveryClient)
	...
	if err := gc.resyncMonitors(newResources); err != nil {
		utilruntime.HandleError(fmt.Errorf("failed to sync resource monitors (attempt %d): %v", attempt, err))
		return false, nil
	}
	klog.V(4).Infof("resynced monitors")
	...
           

3.1 gc.resyncMonitors

Its main logic is:

(1) Call gc.dependencyGraphBuilder.syncMonitors to initialize the informers and register the eventHandler;

(2) Call gc.dependencyGraphBuilder.startMonitors to start the informers.

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) resyncMonitors(deletableResources map[schema.GroupVersionResource]struct{}) error {
	if err := gc.dependencyGraphBuilder.syncMonitors(deletableResources); err != nil {
		return err
	}
	gc.dependencyGraphBuilder.startMonitors()
	return nil
}
           

4.garbagecollector.NewDebugHandler

garbagecollector.NewDebugHandler exposes an HTTP handler and registers a debug endpoint that serves the relationship graph, built by GraphBuilder, of all objects in the cluster.

// pkg/controller/garbagecollector/dump.go
func NewDebugHandler(controller *GarbageCollector) http.Handler {
	return &debugHTTPHandler{controller: controller}
}

type debugHTTPHandler struct {
	controller *GarbageCollector
}

func (h *debugHTTPHandler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	if req.URL.Path != "/graph" {
		http.Error(w, "", http.StatusNotFound)
		return
	}

	var graph graph.Directed
	if uidStrings := req.URL.Query()["uid"]; len(uidStrings) > 0 {
		uids := []types.UID{}
		for _, uidString := range uidStrings {
			uids = append(uids, types.UID(uidString))
		}
		graph = h.controller.dependencyGraphBuilder.uidToNode.ToGonumGraphForObj(uids...)

	} else {
		graph = h.controller.dependencyGraphBuilder.uidToNode.ToGonumGraph()
	}

	data, err := dot.Marshal(graph, "full", "", "  ")
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/vnd.graphviz")
	w.Header().Set("X-Content-Type-Options", "nosniff")
	w.Write(data)
	w.WriteHeader(http.StatusOK)
}
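
The routing behaviour of the handler (a graph document on /graph, 404 elsewhere) can be exercised in memory with net/http/httptest. In this sketch graphHandler and get are hypothetical helpers, and a fixed DOT string stands in for the marshalled dependency graph; it is not the handler from dump.go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// graphHandler serves a fixed DOT document on /graph and 404 elsewhere.
// dot is a stand-in for the marshalled dependency graph.
func graphHandler(dot string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		if req.URL.Path != "/graph" {
			http.Error(w, "", http.StatusNotFound)
			return
		}
		// Headers and status are written before the body.
		w.Header().Set("Content-Type", "text/vnd.graphviz")
		w.Header().Set("X-Content-Type-Options", "nosniff")
		w.WriteHeader(http.StatusOK)
		fmt.Fprint(w, dot)
	})
}

// get issues an in-memory GET request and returns the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := graphHandler("strict digraph full {\n}\n")

	code, body := get(h, "/graph")
	fmt.Println(code)
	fmt.Print(body)

	code, _ = get(h, "/other")
	fmt.Println(code)
}
```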
           

Fetching the object dependency graph

Fetch the full object dependency graph:

curl http://{master_ip}:{kcm_port}/debug/controllers/garbagecollector/graph -o {output_file}
           

Fetch the object dependency graph for a specific uid:

curl "http://{master_ip}:{kcm_port}/debug/controllers/garbagecollector/graph?uid={project_uid}" -o {output_file}
           

Example:

curl "http://192.168.1.10:10252/debug/controllers/garbagecollector/graph?uid=8727f640-112e-21eb-11dd-626400510df6" -o /home/test
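
When the request is issued from a program rather than curl, the URL can be assembled with net/url so that uid values are encoded correctly. The graphURL helper below is hypothetical; the path and the repeatable uid query parameter are taken from the curl commands and the handler code above.

```go
package main

import (
	"fmt"
	"net/url"
)

// graphURL builds the debug endpoint URL for host (ip:port). With no uids it
// addresses the full graph; each uid is added as a separate "uid" parameter.
func graphURL(host string, uids ...string) string {
	u := url.URL{
		Scheme: "http",
		Host:   host,
		Path:   "/debug/controllers/garbagecollector/graph",
	}
	q := url.Values{}
	for _, uid := range uids {
		q.Add("uid", uid)
	}
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	fmt.Println(graphURL("192.168.1.10:10252"))
	fmt.Println(graphURL("192.168.1.10:10252", "8727f640-112e-21eb-11dd-626400510df6"))
}
```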
           

Summary

The garbage collector is made up of 1 graph (the object dependency graph), 2 processors (GraphBuilder and GarbageCollector), and 3 event queues (graphChanges, attemptToDelete, and attemptToOrphan).

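Garbage collector startup analysis

Starting the garbage collector mainly means starting the 2 processors (GraphBuilder and GarbageCollector) and defining the object dependency graph and the 3 event queues (graphChanges, attemptToDelete, and attemptToOrphan).

Events from list/watch against the apiserver are put on the graphChanges queue. GraphBuilder takes events off the graphChanges queue, builds the object dependency graph, and, according to each object's deletion policy, puts associated objects on the attemptToDelete or attemptToOrphan queue. GarbageCollector then takes events off the attemptToDelete and attemptToOrphan queues, looks up information in the object dependency graph, and finally reclaims and deletes the objects.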
