
k8s garbage collector source code analysis (2): processing logic


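Introduction to the garbage collector

The Kubernetes garbage collector runs inside kube-controller-manager. It reclaims resource objects in Kubernetes: it watches resource object events, keeps the dependency relationships between objects up to date, and decides from an object's deletion policy whether the objects associated with it should be deleted too.

More precisely, when an owner is deleted with a cascading deletion policy, the owner's dependent objects are deleted along with it.

To track these relationships, the garbage collector watches resource object events and uses the ownerReference values in each object to build the dependency relationships between objects, that is, the relationships between owners and dependents.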

Owners and dependents

Creating a deployment object serves as an example.

After a deployment is created, kube-controller-manager creates a replicaset for it and automatically records the deployment in the replicaset's ownerReference. In the example below, the owner of replicaset test-1-59d7f45ffb is deployment test-1, and the dependent of deployment test-1 is replicaset test-1-59d7f45ffb.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-1
  namespace: test
  uid: 4973d370-3221-46a7-8d86-e145bf9ad0ce
...
           
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: test-1-59d7f45ffb
  namespace: test
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: test-1
    uid: 4973d370-3221-46a7-8d86-e145bf9ad0ce
  uid: 386c380b-490e-470b-a33f-7d5b0bf945fb
...
           

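In the same way, after the replicaset is created, kube-controller-manager creates pods for it and records the replicaset in each pod's ownerReference: the replicaset is the pods' owner, and the pods are the replicaset's dependents.

The ownerReference values in an object therefore define the relationship between owner and dependent.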

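Garbage collector architecture

The detailed architecture and core processing logic of the garbage collector are shown in the figure below.

[Figure: garbage collector architecture and core processing logic]

The most important code in the garbage collector lives in two files, garbagecollector.go and graph_builder.go.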

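The garbage collector consists of 1 graph (the object dependency graph), 2 processors (GraphBuilder and GarbageCollector), and 3 event queues (graphChanges, attemptToDelete, and attemptToOrphan):

1 graph

(1) uidToNode: the object dependency graph, maintained by GraphBuilder, which records the dependency relationships between all objects. Every k8s object corresponds to one node in the graph, and every node maintains an owner list and a dependent list.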

Example: given deployment A, replicaset B (owner: deployment A), and pod C (owner: replicaset B), the dependency graph has 3 nodes, A, B, and C:

A: no owner; dependent list contains B.
B: owner list contains A; dependent list contains C.
C: owner list contains B; no dependents.
           

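2 processors

(1) GraphBuilder: maintains the dependency graph of all objects and produces the events that trigger GarbageCollector to reclaim and delete objects. GraphBuilder consumes events from the graphChanges queue and uses the ownerReference values in each object to build, update, and delete entries in the dependency graph, that is, the graph of relationships between owners and dependents. It then acts as a producer and puts events into the attemptToDelete and attemptToOrphan queues, which triggers GarbageCollector to check whether associated objects need to be reclaimed. GarbageCollector relies on the uidToNode graph when it does so.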

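(2) GarbageCollector: reclaims and deletes objects. GarbageCollector is the consumer: it takes events from the attemptToDelete and attemptToOrphan queues and processes them. If an object has been deleted with a cascading deletion policy, it reclaims the associated objects; that is, deleting an owner with a cascading policy also deletes the owner's dependents.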

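3 event queues

(1) graphChanges: holds events obtained by list/watch against the apiserver; produced by the informers and consumed by GraphBuilder;

(2) attemptToDelete: the cascading-deletion event queue; produced by GraphBuilder and consumed by GarbageCollector;

(3) attemptToOrphan: the orphan-deletion event queue; produced by GraphBuilder and consumed by GarbageCollector.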

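Object deletion policies

Kubernetes has three object deletion policies: Orphan, Foreground, and Background. A policy can be specified when an object is deleted. The three are described below.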

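Foreground deletion

Foreground is a cascading deletion policy: the garbage collector deletes all of the object's dependents.

When an object is deleted with the Foreground policy, its deletionTimestamp field is set and its metadata.finalizers field contains the value foregroundDeletion, which blocks the deletion of the object. Only after the garbage collector has deleted all of the object's blocking dependents (those with ownerReference.blockOwnerDeletion=true) does it remove foregroundDeletion from the object's metadata.finalizers, after which the owner object is deleted.

Taking a deployment as an example, foreground deletion removes objects in the order Pod->ReplicaSet->Deployment.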

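Background deletion

Background is also a cascading deletion policy: Kubernetes deletes the owner object immediately, and the garbage collector then deletes all of its dependents in the background.

An object deleted with the Background policy carries no garbage-collection finalizer (one is set only for the Foreground and Orphan policies), so it is deleted right away. GraphBuilder then observes the object's delete event and puts its dependents into the attemptToDelete queue, which triggers GarbageCollector to reclaim those dependents.

Taking a deployment as an example, background deletion removes objects in the order Deployment->ReplicaSet->Pod.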

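Orphan deletion

Orphan is a non-cascading deletion policy: deleting an object does not automatically delete its dependents, which are then called orphaned objects.

When an object is deleted with the Orphan policy, its metadata.finalizers field contains the value orphan, which blocks the deletion of the object until GarbageCollector has removed the owner's entry from the OwnerReferences of all its dependents and then removed orphan from the owner's metadata.finalizers. Only then can the owner be deleted.

Taking a deployment as an example, orphan deletion removes only the Deployment; the corresponding ReplicaSet and Pods are kept.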
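
The link between the three policies and the finalizers described above can be summed up in a few lines. This is a minimal sketch: the function name finalizerFor is made up for illustration, while the two finalizer strings are the values named in the text (foregroundDeletion and orphan).

```go
package main

import "fmt"

// Finalizer values named in the text; in apimachinery they are exposed as
// metav1.FinalizerDeleteDependents and metav1.FinalizerOrphanDependents.
const (
	finalizerDeleteDependents = "foregroundDeletion"
	finalizerOrphanDependents = "orphan"
)

// finalizerFor returns the finalizer that blocks the owner's deletion under
// the given propagation policy. ok is false when no finalizer is set, which
// is the Background case (this sketch treats unknown values the same way).
func finalizerFor(policy string) (finalizer string, ok bool) {
	switch policy {
	case "Foreground":
		return finalizerDeleteDependents, true
	case "Orphan":
		return finalizerOrphanDependents, true
	default:
		return "", false
	}
}

func main() {
	for _, p := range []string{"Foreground", "Background", "Orphan"} {
		f, ok := finalizerFor(p)
		fmt.Printf("%-10s blocked=%v finalizer=%q\n", p, ok, f)
	}
}
```

The absence of a finalizer in the Background case is what lets the owner disappear first, which explains the Deployment->ReplicaSet->Pod order.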

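Specifying the deletion policy when deleting an object

If no deletion policy is specified when an object is deleted, the default policy is used. For most resources, including the apps/v1 ReplicaSet used in the examples below, the default is Background.

(1) Background deletion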

curl -X DELETE localhost:8080/apis/apps/v1/namespaces/default/replicasets/my-repset \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Background"}' \
  -H "Content-Type: application/json"
           

(2) Foreground deletion

curl -X DELETE localhost:8080/apis/apps/v1/namespaces/default/replicasets/my-repset \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
  -H "Content-Type: application/json"
           

(3) Orphan deletion

curl -X DELETE localhost:8080/apis/apps/v1/namespaces/default/replicasets/my-repset \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Orphan"}' \
  -H "Content-Type: application/json"
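
The request body used in the curl calls can also be produced from Go. The sketch below uses only the standard library; the deleteOptions struct is a hand-written stand-in that carries just the three fields shown above (the real type is metav1.DeleteOptions), and deleteBody is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deleteOptions carries only the fields used in the curl examples.
type deleteOptions struct {
	Kind              string `json:"kind"`
	APIVersion        string `json:"apiVersion"`
	PropagationPolicy string `json:"propagationPolicy"`
}

// deleteBody renders the JSON body for a DELETE request with the given
// propagation policy ("Background", "Foreground", or "Orphan").
func deleteBody(policy string) (string, error) {
	b, err := json.Marshal(deleteOptions{
		Kind:              "DeleteOptions",
		APIVersion:        "v1",
		PropagationPolicy: policy,
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := deleteBody("Foreground")
	if err != nil {
		panic(err)
	}
	// prints {"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}
	fmt.Println(body)
}
```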
           

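The source code analysis of the garbage collector is split into two parts:

(1) startup;

(2) core processing logic.

The previous post analyzed how the garbage collector starts; this post analyzes its core processing logic.

Garbage collector source code analysis: processing logic

Based on tag v1.17.4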

https://github.com/kubernetes/kubernetes/releases/tag/v1.17.4

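As noted earlier, the most important code in the garbage collector lives in garbagecollector.go and graph_builder.go, that is, in the GarbageCollector struct and the GraphBuilder struct, so the analysis below is split into two parts accordingly.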

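1. GraphBuilder

We look at GraphBuilder first.

GraphBuilder has 2 main functions:

(1) based on resource events from the informers, it maintains the dependency relationships of all objects in its uidToNode field;

(2) it processes the events in graphChanges and, as a producer, puts events into the attemptToDelete and attemptToOrphan queues, which triggers the consumer, GarbageCollector, to reclaim and delete objects.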

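1.1 GraphBuilder struct

A brief look at the GraphBuilder struct first. Its key fields and their roles are:

(1) graphChanges: events observed by the informers are put into graphChanges, and GraphBuilder, as the consumer, processes the events in this queue;

(2) uidToNode (the object dependency graph): keyed by object uid, it records the dependency relationships of all objects, that is, the owner/dependent relationships described earlier. In other words, GraphBuilder maintains a dependency graph of all objects, and GarbageCollector relies on that graph when it reclaims and deletes objects;

(3) attemptToDelete and attemptToOrphan: GraphBuilder, as the producer, puts events into these two queues, and GarbageCollector, as the consumer, processes them.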

// pkg/controller/garbagecollector/graph_builder.go
type GraphBuilder struct {
	...
	
	// monitors are the producer of the graphChanges queue, graphBuilder alters
	// the in-memory graph according to the changes.
	graphChanges workqueue.RateLimitingInterface
	// uidToNode doesn't require a lock to protect, because only the
	// single-threaded GraphBuilder.processGraphChanges() reads/writes it.
	uidToNode *concurrentUIDToNode
	// GraphBuilder is the producer of attemptToDelete and attemptToOrphan, GC is the consumer.
	attemptToDelete workqueue.RateLimitingInterface
	attemptToOrphan workqueue.RateLimitingInterface
	
	...
}
           
// pkg/controller/garbagecollector/graph.go
type concurrentUIDToNode struct {
	uidToNodeLock sync.RWMutex
	uidToNode     map[types.UID]*node
}
           
// pkg/controller/garbagecollector/graph.go
type node struct {
	...
	dependents map[*node]struct{}
	...
	owners []metav1.OwnerReference
}
           

The struct definitions show that every k8s object corresponds to one node in the dependency graph, and every node maintains a list of owners and a set of dependents.

1.2 GraphBuilder-gb.processGraphChanges

Next comes GraphBuilder's processing logic, with gb.processGraphChanges as the entry point.

As described earlier, events observed by the informers are put into the graphChanges queue and GraphBuilder processes them; processGraphChanges is where GraphBuilder, as the consumer, handles the events in graphChanges.

In this method GraphBuilder is therefore both consumer and producer: it consumes and classifies the events in graphChanges, then produces events into the attemptToDelete and attemptToOrphan queues for GarbageCollector to consume.

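Main logic:

(1) take an event from the graphChanges queue;

(2) read uidToNode to determine whether the object already exists in the dependency graph built so far; the handling that follows depends on whether it does and on the event type;

(3) if the node is not in uidToNode and the event is an addEvent or updateEvent, create a node for the object and call gb.insertNode, which adds the node to uidToNode and to the dependents of each of its owners;

then call gb.processTransitions. This method checks whether the object is being deleted and, if so, whether it is being deleted in orphan mode or in foreground mode. (The mode is read from the object's finalizers: taking a deployment as an example, the delete request carries a deletion policy, and kube-apiserver adds the corresponding finalizer to the deployment.) In orphan mode the node is added to the attemptToOrphan queue; in foreground mode the object and all of its dependents are added to the attemptToDelete queue;

(4) if the node is in uidToNode and the event is an addEvent or updateEvent, call referencesDiffs to check whether the object's OwnerReferences have changed; if they have, handle the change and update the dependency graph; finally call gb.processTransitions;

(5) if the event is a delete event, call gb.removeNode, which removes the object from uidToNode and from the dependents of all of the node's owners; then put the object's dependents into the attemptToDelete queue to trigger GarbageCollector; finally check all of the node's owners: an owner that is deleting its dependents may be blocked waiting for this node to be deleted, so put that owner into the attemptToDelete queue to trigger GarbageCollector.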

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) runProcessGraphChanges() {
	for gb.processGraphChanges() {
	}
}

// Dequeueing an event from graphChanges, updating graph, populating dirty_queue.
func (gb *GraphBuilder) processGraphChanges() bool {
	item, quit := gb.graphChanges.Get()
	if quit {
		return false
	}
	defer gb.graphChanges.Done(item)
	event, ok := item.(*event)
	if !ok {
		utilruntime.HandleError(fmt.Errorf("expect a *event, got %v", item))
		return true
	}
	obj := event.obj
	accessor, err := meta.Accessor(obj)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("cannot access obj: %v", err))
		return true
	}
	klog.V(5).Infof("GraphBuilder process object: %s/%s, namespace %s, name %s, uid %s, event type %v", event.gvk.GroupVersion().String(), event.gvk.Kind, accessor.GetNamespace(), accessor.GetName(), string(accessor.GetUID()), event.eventType)
	// Check if the node already exists
	existingNode, found := gb.uidToNode.Read(accessor.GetUID())
	if found {
		// this marks the node as having been observed via an informer event
		// 1. this depends on graphChanges only containing add/update events from the actual informer
		// 2. this allows things tracking virtual nodes' existence to stop polling and rely on informer events
		existingNode.markObserved()
	}
	switch {
	case (event.eventType == addEvent || event.eventType == updateEvent) && !found:
		newNode := &node{
			identity: objectReference{
				OwnerReference: metav1.OwnerReference{
					APIVersion: event.gvk.GroupVersion().String(),
					Kind:       event.gvk.Kind,
					UID:        accessor.GetUID(),
					Name:       accessor.GetName(),
				},
				Namespace: accessor.GetNamespace(),
			},
			dependents:         make(map[*node]struct{}),
			owners:             accessor.GetOwnerReferences(),
			deletingDependents: beingDeleted(accessor) && hasDeleteDependentsFinalizer(accessor),
			beingDeleted:       beingDeleted(accessor),
		}
		gb.insertNode(newNode)
		// the underlying delta_fifo may combine a creation and a deletion into
		// one event, so we need to further process the event.
		gb.processTransitions(event.oldObj, accessor, newNode)
	case (event.eventType == addEvent || event.eventType == updateEvent) && found:
		// handle changes in ownerReferences
		added, removed, changed := referencesDiffs(existingNode.owners, accessor.GetOwnerReferences())
		if len(added) != 0 || len(removed) != 0 || len(changed) != 0 {
			// check if the changed dependency graph unblock owners that are
			// waiting for the deletion of their dependents.
			gb.addUnblockedOwnersToDeleteQueue(removed, changed)
			// update the node itself
			existingNode.owners = accessor.GetOwnerReferences()
			// Add the node to its new owners' dependent lists.
			gb.addDependentToOwners(existingNode, added)
			// remove the node from the dependent list of node that are no longer in
			// the node's owners list.
			gb.removeDependentFromOwners(existingNode, removed)
		}

		if beingDeleted(accessor) {
			existingNode.markBeingDeleted()
		}
		gb.processTransitions(event.oldObj, accessor, existingNode)
	case event.eventType == deleteEvent:
		if !found {
			klog.V(5).Infof("%v doesn't exist in the graph, this shouldn't happen", accessor.GetUID())
			return true
		}
		// removeNode updates the graph
		gb.removeNode(existingNode)
		existingNode.dependentsLock.RLock()
		defer existingNode.dependentsLock.RUnlock()
		if len(existingNode.dependents) > 0 {
			gb.absentOwnerCache.Add(accessor.GetUID())
		}
		for dep := range existingNode.dependents {
			gb.attemptToDelete.Add(dep)
		}
		for _, owner := range existingNode.owners {
			ownerNode, found := gb.uidToNode.Read(owner.UID)
			if !found || !ownerNode.isDeletingDependents() {
				continue
			}
			// this is to let attempToDeleteItem check if all the owner's
			// dependents are deleted, if so, the owner will be deleted.
			gb.attemptToDelete.Add(ownerNode)
		}
	}
	return true
}
           

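The code confirms the earlier description of Background deletion: an object deleted with the Background policy carries no garbage-collection finalizer (one is set only for the Foreground and Orphan policies), so it is deleted directly. GraphBuilder then observes the object's delete event and puts its dependents into the attemptToDelete queue, which triggers GarbageCollector to reclaim those dependents.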

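1.2.1 gb.insertNode

gb.insertNode adds the node to uidToNode and then adds the node to the dependents of each of its owners. If an owner is not in the graph yet, a virtual node is created for it and put into attemptToDelete, so that GarbageCollector can verify against the apiserver whether that owner really exists.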

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) insertNode(n *node) {
	gb.uidToNode.Write(n)
	gb.addDependentToOwners(n, n.owners)
}

func (gb *GraphBuilder) addDependentToOwners(n *node, owners []metav1.OwnerReference) {
	for _, owner := range owners {
		ownerNode, ok := gb.uidToNode.Read(owner.UID)
		if !ok {
			// Create a "virtual" node in the graph for the owner if it doesn't
			// exist in the graph yet.
			ownerNode = &node{
				identity: objectReference{
					OwnerReference: owner,
					Namespace:      n.identity.Namespace,
				},
				dependents: make(map[*node]struct{}),
				virtual:    true,
			}
			klog.V(5).Infof("add virtual node.identity: %s\n\n", ownerNode.identity)
			gb.uidToNode.Write(ownerNode)
		}
		ownerNode.addDependent(n)
		if !ok {
			// Enqueue the virtual node into attemptToDelete.
			// The garbage processor will enqueue a virtual delete
			// event to delete it from the graph if API server confirms this
			// owner doesn't exist.
			gb.attemptToDelete.Add(ownerNode)
		}
	}
}

           

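1.2.2 gb.processTransitions

gb.processTransitions checks whether the k8s object is being deleted (its deletionTimestamp is non-nil) and carries the finalizer that corresponds to a deletion policy, and then acts accordingly. Both checks fire only on the transition into that state, that is, when the old object is absent, was not being deleted, or did not have the finalizer.

Because the garbage-collection finalizers are set only for the Foreground and Orphan policies, this method only handles objects deleted with Foreground or Orphan; it does nothing for objects deleted with Background.

If the object's deletionTimestamp is non-nil and it has the finalizer for the Orphan policy, the node is put into the attemptToOrphan queue for GarbageCollector to consume;

if the object's deletionTimestamp is non-nil and it has the finalizer for the Foreground policy, n.markDeletingDependents is called to set the node's deletingDependents field to true, meaning that the node's dependents are being deleted, and the node and its dependents are put into the attemptToDelete queue for GarbageCollector to consume.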

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) processTransitions(oldObj interface{}, newAccessor metav1.Object, n *node) {
	if startsWaitingForDependentsOrphaned(oldObj, newAccessor) {
		klog.V(5).Infof("add %s to the attemptToOrphan", n.identity)
		gb.attemptToOrphan.Add(n)
		return
	}
	if startsWaitingForDependentsDeleted(oldObj, newAccessor) {
		klog.V(2).Infof("add %s to the attemptToDelete, because it's waiting for its dependents to be deleted", n.identity)
		// if the n is added as a "virtual" node, its deletingDependents field is not properly set, so always set it here.
		n.markDeletingDependents()
		for dep := range n.dependents {
			gb.attemptToDelete.Add(dep)
		}
		gb.attemptToDelete.Add(n)
	}
}

func startsWaitingForDependentsOrphaned(oldObj interface{}, newAccessor metav1.Object) bool {
	return deletionStartsWithFinalizer(oldObj, newAccessor, metav1.FinalizerOrphanDependents)
}

func startsWaitingForDependentsDeleted(oldObj interface{}, newAccessor metav1.Object) bool {
	return deletionStartsWithFinalizer(oldObj, newAccessor, metav1.FinalizerDeleteDependents)
}

func deletionStartsWithFinalizer(oldObj interface{}, newAccessor metav1.Object, matchingFinalizer string) bool {
	// if the new object isn't being deleted, or doesn't have the finalizer we're interested in, return false
	if !beingDeleted(newAccessor) || !hasFinalizer(newAccessor, matchingFinalizer) {
		return false
	}

	// if the old object is nil, or wasn't being deleted, or didn't have the finalizer, return true
	if oldObj == nil {
		return true
	}
	oldAccessor, err := meta.Accessor(oldObj)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("cannot access oldObj: %v", err))
		return false
	}
	return !beingDeleted(oldAccessor) || !hasFinalizer(oldAccessor, matchingFinalizer)
}

func beingDeleted(accessor metav1.Object) bool {
	return accessor.GetDeletionTimestamp() != nil
}

func hasFinalizer(accessor metav1.Object, matchingFinalizer string) bool {
	finalizers := accessor.GetFinalizers()
	for _, finalizer := range finalizers {
		if finalizer == matchingFinalizer {
			return true
		}
	}
	return false
}
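
The transition check is easier to see with concrete inputs. The sketch below re-implements the same condition on a simplified type; objMeta and deletionStarts are names invented for illustration, with the deleting field standing in for a non-nil deletionTimestamp.

```go
package main

import "fmt"

// objMeta is a simplified stand-in for the metadata the check looks at.
type objMeta struct {
	deleting   bool // true when deletionTimestamp is set
	finalizers []string
}

// hasFinalizer reports whether m carries the given finalizer.
func hasFinalizer(m objMeta, want string) bool {
	for _, f := range m.finalizers {
		if f == want {
			return true
		}
	}
	return false
}

// deletionStarts reports whether cur has just entered the state "being
// deleted with the given finalizer"; old is nil when no previous version
// of the object is available.
func deletionStarts(old *objMeta, cur objMeta, finalizer string) bool {
	if !cur.deleting || !hasFinalizer(cur, finalizer) {
		return false
	}
	if old == nil {
		return true
	}
	return !old.deleting || !hasFinalizer(*old, finalizer)
}

func main() {
	live := objMeta{}
	deleting := objMeta{deleting: true, finalizers: []string{"foregroundDeletion"}}

	fmt.Println(deletionStarts(&live, deleting, "foregroundDeletion"))     // true: the state was just entered
	fmt.Println(deletionStarts(&deleting, deleting, "foregroundDeletion")) // false: nothing changed since the last event
	fmt.Println(deletionStarts(nil, deleting, "orphan"))                   // false: the orphan finalizer is absent
}
```

Because the check is edge-triggered, later update events for an object that is already in foreground deletion do not enqueue the node and its dependents again.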
           

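1.2.3 gb.removeNode

gb.removeNode removes the object from uidToNode and then removes it from the dependents of all of the node's owners. The rest of the delete handling happens in the caller, processGraphChanges: the object's dependents are put into the attemptToDelete queue to trigger GarbageCollector, and then every owner of the node is checked; an owner that is deleting its dependents may be blocked waiting for this node to be deleted, so it is put into the attemptToDelete queue as well.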

// pkg/controller/garbagecollector/graph_builder.go
func (gb *GraphBuilder) removeNode(n *node) {
	gb.uidToNode.Delete(n.identity.UID)
	gb.removeDependentFromOwners(n, n.owners)
}

func (gb *GraphBuilder) removeDependentFromOwners(n *node, owners []metav1.OwnerReference) {
	for _, owner := range owners {
		ownerNode, ok := gb.uidToNode.Read(owner.UID)
		if !ok {
			continue
		}
		ownerNode.deleteDependent(n)
	}
}
           

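2. GarbageCollector

Now for GarbageCollector.

GarbageCollector has 2 main functions:

(1) it processes the events in the attemptToDelete queue and, according to the deletion policy (foreground or background), runs the corresponding reclaim logic and deletes the associated objects;

(2) it processes the events in the attemptToOrphan queue and, for the Orphan deletion policy, updates the owner's dependents by removing the owner's entry from their OwnerReferences, and then updates the owner object by removing the orphan finalizer from its finalizers.

GarbageCollector has 2 key processing methods:

(1) gc.runAttemptToDeleteWorker: processes the events in the attemptToDelete queue and handles the reclaiming of objects whose deletion policy is foreground or background;

(2) gc.runAttemptToOrphanWorker: processes the events in the attemptToOrphan queue and handles the reclaiming of objects whose deletion policy is Orphan.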

2.1 GarbageCollector struct

A brief look at the GarbageCollector struct. Its key fields are attemptToDelete and attemptToOrphan: GraphBuilder, as the producer, puts events into these two queues, and GarbageCollector, as the consumer, processes them.

// pkg/controller/garbagecollector/garbagecollector.go
type GarbageCollector struct {
	...
	attemptToDelete workqueue.RateLimitingInterface
	attemptToOrphan workqueue.RateLimitingInterface
	...
}
           

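2.2 GarbageCollector-gc.runAttemptToDeleteWorker

Next comes GarbageCollector's processing logic, with gc.runAttemptToDeleteWorker as the entry point.

runAttemptToDeleteWorker simply calls attemptToDeleteWorker in a loop.

Main logic of attemptToDeleteWorker:

(1) take an object from the attemptToDelete queue;

(2) call gc.attemptToDeleteItem to try to delete the node;

(3) if that fails, put the item back into the attemptToDelete queue to be retried.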

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) runAttemptToDeleteWorker() {
	for gc.attemptToDeleteWorker() {
	}
}

func (gc *GarbageCollector) attemptToDeleteWorker() bool {
	item, quit := gc.attemptToDelete.Get()
	gc.workerLock.RLock()
	defer gc.workerLock.RUnlock()
	if quit {
		return false
	}
	defer gc.attemptToDelete.Done(item)
	n, ok := item.(*node)
	if !ok {
		utilruntime.HandleError(fmt.Errorf("expect *node, got %#v", item))
		return true
	}
	err := gc.attemptToDeleteItem(n)
	if err != nil {
		if _, ok := err.(*restMappingError); ok {
			// There are at least two ways this can happen:
			// 1. The reference is to an object of a custom type that has not yet been
			//    recognized by gc.restMapper (this is a transient error).
			// 2. The reference is to an invalid group/version. We don't currently
			//    have a way to distinguish this from a valid type we will recognize
			//    after the next discovery sync.
			// For now, record the error and retry.
			klog.V(5).Infof("error syncing item %s: %v", n, err)
		} else {
			utilruntime.HandleError(fmt.Errorf("error syncing item %s: %v", n, err))
		}
		// retry if garbage collection of an object failed.
		gc.attemptToDelete.AddRateLimited(item)
	} else if !n.isObserved() {
		// requeue if item hasn't been observed via an informer event yet.
		// otherwise a virtual node for an item added AND removed during watch reestablishment can get stuck in the graph and never removed.
		// see https://issue.k8s.io/56121
		klog.V(5).Infof("item %s hasn't been observed via informer yet", n.identity)
		gc.attemptToDelete.AddRateLimited(item)
	}
	return true
}
           

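2.2.1 gc.attemptToDeleteItem

Main logic:

(1) check whether the node is being deleted; if it is and it is not deleting its dependents, return immediately and wait for the final deletion;

(2) fetch the object that corresponds to the node from the apiserver; if it no longer exists, or its uid differs, enqueue a virtual delete event so that GraphBuilder removes the node from the graph;

(3) call item.isDeletingDependents, which uses the node's deletingDependents field to tell whether the node is currently deleting its dependents. If it is, call gc.processDeletingDependentsItem to process the dependents further: check whether the node's blockingDependents have all been deleted; if so, remove the corresponding finalizer from the node's object; if not, put the blockingDependents that are not yet deleted into the attemptToDelete queue;

As noted in the GraphBuilder analysis, when GraphBuilder processes the events in graphChanges, processTransitions calls n.markDeletingDependents, which sets the node's deletingDependents field to true;

(4) call gc.classifyReferences to split the node's owners into 3 groups: solid (the owner exists and is not waiting for its dependents to be deleted), dangling (the owner no longer exists), and waitingForDependentsDeletion (the owner exists, is being deleted, and is waiting for its dependents to be deleted);

(5) the handling that follows depends on how many owners fall into solid, dangling, and waitingForDependentsDeletion;

(6) first case: solid is not empty, that is, the node has at least one owner that exists and is not waiting for its dependents to be deleted, so the object must not be reclaimed yet. The owners in the dangling and waitingForDependentsDeletion lists are removed from the node's ownerReferences;

(7) second case: solid is empty, at least one of the node's owners is in the waitingForDependentsDeletion state, and the node still has dependents; the node is deleted with the foreground deletion policy;

(8) when neither case applies (the node has no solid owner, and either none of its owners is waiting for dependents to be deleted or the node itself has no dependents), the default branch runs: it calls the apiserver to delete the object, choosing the propagation policy from the finalizers already on the object and falling back to Background.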
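
Steps (4) to (8) amount to a small decision table. The sketch below captures only that table; the ownerState type, the action strings, and the decide function are invented for illustration, and the earlier checks from steps (1) to (3) are left out.

```go
package main

import "fmt"

// ownerState is the classification of a single owner reference.
type ownerState int

const (
	ownerSolid    ownerState = iota // exists, not waiting for dependents
	ownerDangling                   // no longer exists
	ownerWaiting                    // being deleted, waiting for dependents
)

// Possible outcomes for the item being processed.
const (
	actionSkip          = "skip: no owners"
	actionKeep          = "keep"
	actionKeepAndPatch  = "keep, remove stale ownerReferences"
	actionForeground    = "delete with Foreground"
	actionFromFinalizer = "delete with policy taken from finalizers"
)

// decide applies the three-way switch to the classified owners of an item
// that currently has the given number of dependents.
func decide(owners []ownerState, dependents int) string {
	var solid, dangling, waiting int
	for _, s := range owners {
		switch s {
		case ownerSolid:
			solid++
		case ownerDangling:
			dangling++
		case ownerWaiting:
			waiting++
		}
	}
	switch {
	case len(owners) == 0:
		return actionSkip
	case solid != 0 && dangling == 0 && waiting == 0:
		return actionKeep
	case solid != 0:
		return actionKeepAndPatch
	case waiting != 0 && dependents != 0:
		return actionForeground
	default:
		return actionFromFinalizer
	}
}

func main() {
	fmt.Println(decide([]ownerState{ownerSolid, ownerDangling}, 0))
	fmt.Println(decide([]ownerState{ownerWaiting}, 2))
	fmt.Println(decide([]ownerState{ownerWaiting}, 0))
	fmt.Println(decide([]ownerState{ownerDangling}, 3))
}
```

The second and third calls show why the dependent count matters: a replicaset whose deployment is in foreground deletion is itself deleted in the foreground while it still has pods, but once it has no dependents the default branch applies.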

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) attemptToDeleteItem(item *node) error {
	klog.V(2).Infof("processing item %s", item.identity)
	// "being deleted" is an one-way trip to the final deletion. We'll just wait for the final deletion, and then process the object's dependents.
	if item.isBeingDeleted() && !item.isDeletingDependents() {
		klog.V(5).Infof("processing item %s returned at once, because its DeletionTimestamp is non-nil", item.identity)
		return nil
	}
	// TODO: It's only necessary to talk to the API server if this is a
	// "virtual" node. The local graph could lag behind the real status, but in
	// practice, the difference is small.
	latest, err := gc.getObject(item.identity)
	switch {
	case errors.IsNotFound(err):
		// the GraphBuilder can add "virtual" node for an owner that doesn't
		// exist yet, so we need to enqueue a virtual Delete event to remove
		// the virtual node from GraphBuilder.uidToNode.
		klog.V(5).Infof("item %v not found, generating a virtual delete event", item.identity)
		gc.dependencyGraphBuilder.enqueueVirtualDeleteEvent(item.identity)
		// since we're manually inserting a delete event to remove this node,
		// we don't need to keep tracking it as a virtual node and requeueing in attemptToDelete
		item.markObserved()
		return nil
	case err != nil:
		return err
	}

	if latest.GetUID() != item.identity.UID {
		klog.V(5).Infof("UID doesn't match, item %v not found, generating a virtual delete event", item.identity)
		gc.dependencyGraphBuilder.enqueueVirtualDeleteEvent(item.identity)
		// since we're manually inserting a delete event to remove this node,
		// we don't need to keep tracking it as a virtual node and requeueing in attemptToDelete
		item.markObserved()
		return nil
	}

	// TODO: attemptToOrphanWorker() routine is similar. Consider merging
	// attemptToOrphanWorker() into attemptToDeleteItem() as well.
	if item.isDeletingDependents() {
		return gc.processDeletingDependentsItem(item)
	}

	// compute if we should delete the item
	ownerReferences := latest.GetOwnerReferences()
	if len(ownerReferences) == 0 {
		klog.V(2).Infof("object %s's doesn't have an owner, continue on next item", item.identity)
		return nil
	}

	solid, dangling, waitingForDependentsDeletion, err := gc.classifyReferences(item, ownerReferences)
	if err != nil {
		return err
	}
	klog.V(5).Infof("classify references of %s.\nsolid: %#v\ndangling: %#v\nwaitingForDependentsDeletion: %#v\n", item.identity, solid, dangling, waitingForDependentsDeletion)

	switch {
	case len(solid) != 0:
		klog.V(2).Infof("object %#v has at least one existing owner: %#v, will not garbage collect", item.identity, solid)
		if len(dangling) == 0 && len(waitingForDependentsDeletion) == 0 {
			return nil
		}
		klog.V(2).Infof("remove dangling references %#v and waiting references %#v for object %s", dangling, waitingForDependentsDeletion, item.identity)
		// waitingForDependentsDeletion needs to be deleted from the
		// ownerReferences, otherwise the referenced objects will be stuck with
		// the FinalizerDeletingDependents and never get deleted.
		ownerUIDs := append(ownerRefsToUIDs(dangling), ownerRefsToUIDs(waitingForDependentsDeletion)...)
		patch := deleteOwnerRefStrategicMergePatch(item.identity.UID, ownerUIDs...)
		_, err = gc.patch(item, patch, func(n *node) ([]byte, error) {
			return gc.deleteOwnerRefJSONMergePatch(n, ownerUIDs...)
		})
		return err
	case len(waitingForDependentsDeletion) != 0 && item.dependentsLength() != 0:
		deps := item.getDependents()
		for _, dep := range deps {
			if dep.isDeletingDependents() {
				// this circle detection has false positives, we need to
				// apply a more rigorous detection if this turns out to be a
				// problem.
				// there are multiple workers run attemptToDeleteItem in
				// parallel, the circle detection can fail in a race condition.
				klog.V(2).Infof("processing object %s, some of its owners and its dependent [%s] have FinalizerDeletingDependents, to prevent potential cycle, its ownerReferences are going to be modified to be non-blocking, then the object is going to be deleted with Foreground", item.identity, dep.identity)
				patch, err := item.unblockOwnerReferencesStrategicMergePatch()
				if err != nil {
					return err
				}
				if _, err := gc.patch(item, patch, gc.unblockOwnerReferencesJSONMergePatch); err != nil {
					return err
				}
				break
			}
		}
		klog.V(2).Infof("at least one owner of object %s has FinalizerDeletingDependents, and the object itself has dependents, so it is going to be deleted in Foreground", item.identity)
		// the deletion event will be observed by the graphBuilder, so the item
		// will be processed again in processDeletingDependentsItem. If it
		// doesn't have dependents, the function will remove the
		// FinalizerDeletingDependents from the item, resulting in the final
		// deletion of the item.
		policy := metav1.DeletePropagationForeground
		return gc.deleteObject(item.identity, &policy)
	default:
		// item doesn't have any solid owner, so it needs to be garbage
		// collected. Also, none of item's owners is waiting for the deletion of
		// the dependents, so set propagationPolicy based on existing finalizers.
		var policy metav1.DeletionPropagation
		switch {
		case hasOrphanFinalizer(latest):
			// if an existing orphan finalizer is already on the object, honor it.
			policy = metav1.DeletePropagationOrphan
		case hasDeleteDependentsFinalizer(latest):
			// if an existing foreground finalizer is already on the object, honor it.
			policy = metav1.DeletePropagationForeground
		default:
			// otherwise, default to background.
			policy = metav1.DeletePropagationBackground
		}
		klog.V(2).Infof("delete object %s with propagation policy %s", item.identity, policy)
		return gc.deleteObject(item.identity, &policy)
	}
}

           

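gc.processDeletingDependentsItem

Main logic: check whether the node's blockingDependents (the dependents that block the deletion of their owner) have all been deleted. If so, remove the corresponding finalizer from the node's object (once the finalizer is removed, kube-apiserver deletes the object); if not, put the blockingDependents that are not yet deleted into the attemptToDelete queue.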

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) processDeletingDependentsItem(item *node) error {
	blockingDependents := item.blockingDependents()
	if len(blockingDependents) == 0 {
		klog.V(2).Infof("remove DeleteDependents finalizer for item %s", item.identity)
		return gc.removeFinalizer(item, metav1.FinalizerDeleteDependents)
	}
	for _, dep := range blockingDependents {
		if !dep.isDeletingDependents() {
			klog.V(2).Infof("adding %s to attemptToDelete, because its owner %s is deletingDependents", dep.identity, item.identity)
			gc.attemptToDelete.Add(dep)
		}
	}
	return nil
}
           

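item.blockingDependents

item.blockingDependents returns the dependents that block the deletion of the node. Whether a dependent blocks the deletion of its owner depends on the blockOwnerDeletion field of the matching entry in the dependent's ownerReferences: if it is true, the dependent blocks the owner's deletion.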

// pkg/controller/garbagecollector/graph.go
func (n *node) blockingDependents() []*node {
	dependents := n.getDependents()
	var ret []*node
	for _, dep := range dependents {
		for _, owner := range dep.owners {
			if owner.UID == n.identity.UID && owner.BlockOwnerDeletion != nil && *owner.BlockOwnerDeletion {
				ret = append(ret, dep)
			}
		}
	}
	return ret
}
           

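2.3 GarbageCollector-gc.runAttemptToOrphanWorker

gc.runAttemptToOrphanWorker handles nodes that are deleted with the orphan deletion policy.

gc.runAttemptToOrphanWorker simply calls gc.attemptToOrphanWorker in a loop.

Main logic of gc.attemptToOrphanWorker:

(1) take an object from the attemptToOrphan queue;

(2) call gc.orphanDependents to update the owner's dependents, removing the owner's entry from their OwnerReferences; on failure, put the owner back into the attemptToOrphan queue;

(3) call gc.removeFinalizer to update the owner object, removing the orphan finalizer from its finalizers.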

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) runAttemptToOrphanWorker() {
	for gc.attemptToOrphanWorker() {
	}
}

func (gc *GarbageCollector) attemptToOrphanWorker() bool {
	item, quit := gc.attemptToOrphan.Get()
	gc.workerLock.RLock()
	defer gc.workerLock.RUnlock()
	if quit {
		return false
	}
	defer gc.attemptToOrphan.Done(item)
	owner, ok := item.(*node)
	if !ok {
		utilruntime.HandleError(fmt.Errorf("expect *node, got %#v", item))
		return true
	}
	// we don't need to lock each element, because they never get updated
	owner.dependentsLock.RLock()
	dependents := make([]*node, 0, len(owner.dependents))
	for dependent := range owner.dependents {
		dependents = append(dependents, dependent)
	}
	owner.dependentsLock.RUnlock()

	err := gc.orphanDependents(owner.identity, dependents)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("orphanDependents for %s failed with %v", owner.identity, err))
		gc.attemptToOrphan.AddRateLimited(item)
		return true
	}
	// update the owner, remove "orphaningFinalizer" from its finalizers list
	err = gc.removeFinalizer(owner, metav1.FinalizerOrphanDependents)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("removeOrphanFinalizer for %s failed with %v", owner.identity, err))
		gc.attemptToOrphan.AddRateLimited(item)
	}
	return true
}
           

2.3.1 gc.orphanDependents

Main logic: update the dependents of the given owner, removing the owner's entry from their OwnerReferences. Each dependent is handled in its own goroutine to speed up processing.

// pkg/controller/garbagecollector/garbagecollector.go
func (gc *GarbageCollector) orphanDependents(owner objectReference, dependents []*node) error {
	errCh := make(chan error, len(dependents))
	wg := sync.WaitGroup{}
	wg.Add(len(dependents))
	for i := range dependents {
		go func(dependent *node) {
			defer wg.Done()
			// the dependent.identity.UID is used as precondition
			patch := deleteOwnerRefStrategicMergePatch(dependent.identity.UID, owner.UID)
			_, err := gc.patch(dependent, patch, func(n *node) ([]byte, error) {
				return gc.deleteOwnerRefJSONMergePatch(n, owner.UID)
			})
			// note that if the target ownerReference doesn't exist in the
			// dependent, strategic merge patch will NOT return an error.
			if err != nil && !errors.IsNotFound(err) {
				errCh <- fmt.Errorf("orphaning %s failed, %v", dependent.identity, err)
			}
		}(dependents[i])
	}
	wg.Wait()
	close(errCh)

	var errorsSlice []error
	for e := range errCh {
		errorsSlice = append(errorsSlice, e)
	}

	if len(errorsSlice) != 0 {
		return fmt.Errorf("failed to orphan dependents of owner %s, got errors: %s", owner, utilerrors.NewAggregate(errorsSlice).Error())
	}
	klog.V(5).Infof("successfully updated all dependents of owner %s", owner)
	return nil
}
           

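Summary

First, a recap of the garbage collector's architecture and core processing logic.

The garbage collector consists of 1 graph (the object dependency graph), 2 processors (GraphBuilder and GarbageCollector), and 3 event queues (graphChanges, attemptToDelete, and attemptToOrphan).

Events from list/watch against the apiserver are put into the graphChanges queue. GraphBuilder takes events from graphChanges, builds the object dependency graph, and, depending on the deletion policy, puts the associated objects into the attemptToDelete or attemptToOrphan queue. GarbageCollector then takes events from attemptToDelete and attemptToOrphan, looks up the information it needs in the dependency graph, and finally reclaims and deletes the objects.

To summarize how a node and its object are deleted under the 3 deletion policies:

Foreground deletion

Foreground is a cascading deletion policy: the garbage collector deletes all of the object's dependents. The object's deletionTimestamp is set and its metadata.finalizers contains foregroundDeletion, which blocks its deletion. After the garbage collector has deleted all blocking dependents (those with ownerReference.blockOwnerDeletion=true), it removes foregroundDeletion from metadata.finalizers and the owner object is deleted.

Background deletion

Background is a cascading deletion policy: Kubernetes deletes the owner object immediately and the garbage collector deletes its dependents in the background. The object carries no garbage-collection finalizer (one is set only for Foreground and Orphan), so it is deleted directly; GraphBuilder observes the delete event and puts the object's dependents into the attemptToDelete queue, which triggers GarbageCollector to reclaim them.

Orphan deletion

Orphan is a non-cascading deletion policy: the object's dependents are not deleted and become orphaned objects. The object's metadata.finalizers contains orphan, which blocks its deletion until GarbageCollector has removed the owner's entry from the OwnerReferences of all its dependents and then removed orphan from the owner's metadata.finalizers. Only then is the owner deleted.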
