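Data Augmentation Methods for Semantic Segmentation
- Why use data augmentation?
- Random flip
- Random rotation by n*90°
- Rotation about the center
- Shift
- Cutout
- Random rescale
- How to use them together?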
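Why use data augmentation?
In real production projects we rarely have enough data for the task. To make the most of a limited dataset we apply data augmentation, so that a small amount of data delivers value comparable to a larger one. This post briefly introduces several augmentation methods for semantic segmentation and includes the source code.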
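Random flip
There are three options: a horizontal flip, a vertical flip, and a combined horizontal-and-vertical flip.
The effect is shown in the figure below: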
import random

import cv2
import numpy as np


class RandomFlip:
    def __init__(self, prob=1.0):
        self.prob = prob

    def __call__(self, img, mask=None):
        if random.random() < self.prob:
            # cv2.flip code: 0 = vertical, 1 = horizontal, -1 = both
            d = random.randint(-1, 1)
            img = cv2.flip(img, d)
            if mask is not None:
                mask = cv2.flip(mask, d)
        return img, mask
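As a cross-check of the three flip modes, the sketch below reproduces them with plain NumPy slicing instead of `cv2.flip`. The helper name `flip_pair` is hypothetical and not part of the original code; the mapping of `d` to axes follows OpenCV's `flipCode` convention (0 flips around the x-axis, a positive value around the y-axis, a negative value around both).

```python
import numpy as np

def flip_pair(img, mask, d):
    """Flip image and mask together; d follows cv2.flip's flipCode convention."""
    if d == 0:        # vertical flip: rows reversed
        sl = (slice(None, None, -1), slice(None))
    elif d > 0:       # horizontal flip: columns reversed
        sl = (slice(None), slice(None, None, -1))
    else:             # both axes reversed
        sl = (slice(None, None, -1), slice(None, None, -1))
    # The same slice is applied to both arrays so they stay aligned.
    return img[sl].copy(), mask[sl].copy()

img = np.arange(6, dtype=np.uint8).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
mask = (img > 2).astype(np.uint8)                  # bottom row is foreground
```

The point of the example is that one flip code drives both arrays; drawing a separate random code for the mask would silently break the pixel-to-label correspondence.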
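Random rotation by n*90°
n is a random integer from 0 to 3 (four quarter turns would reproduce the original image).
The effect is shown below:
class RandomRotate90:
    def __init__(self, prob=1.0):
        self.prob = prob

    def __call__(self, img, mask=None):
        if random.random() < self.prob:
            # Number of counter-clockwise quarter turns
            factor = random.randint(0, 3)
            # np.rot90 returns a view; copy to get a contiguous array
            img = np.rot90(img, factor).copy()
            if mask is not None:
                mask = np.rot90(mask, factor).copy()
        return img, mask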
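Three properties of `np.rot90` are worth keeping in mind when using it for augmentation, and the short sketch below checks them on a toy array: odd factors swap height and width on non-square inputs, a factor of 4 is the same as 0, and the result shares memory with the input until it is copied. The variable names are illustrative only.

```python
import numpy as np

img = np.arange(6).reshape(2, 3)   # non-square: height 2, width 3

# Shapes after 0..4 quarter turns; odd factors swap height and width.
shapes = [np.rot90(img, k).shape for k in range(5)]

# Four quarter turns give back the original array.
same_as_original = np.array_equal(np.rot90(img, 4), img)

# The rotated result is a view on the input's memory until it is copied.
shares_before_copy = np.shares_memory(np.rot90(img, 1), img)
shares_after_copy = np.shares_memory(np.rot90(img, 1).copy(), img)
```

For non-square training images this means a batch can mix two shapes unless a later step crops or resizes to a fixed size.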
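Rotation about the center
This method rotates the image by a random angle, using the image center as the pivot.
The effect is shown in the figure below: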
class Rotate:
    def __init__(self, limit=90, prob=1.0):
        self.prob = prob
        self.limit = limit

    def __call__(self, img, mask=None):
        if random.random() < self.prob:
            angle = random.uniform(-self.limit, self.limit)
            height, width = img.shape[0:2]
            mat = cv2.getRotationMatrix2D((width / 2, height / 2), angle, 1.0)
            # warpAffine expects the output size as (width, height)
            img = cv2.warpAffine(img, mat, (width, height),
                                 flags=cv2.INTER_LINEAR,
                                 borderMode=cv2.BORDER_REFLECT_101)
            if mask is not None:
                # Nearest-neighbour keeps the label values intact
                mask = cv2.warpAffine(mask, mat, (width, height),
                                      flags=cv2.INTER_NEAREST,
                                      borderMode=cv2.BORDER_REFLECT_101)
        return img, mask
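One reason to prefer nearest-neighbour interpolation (`cv2.INTER_NEAREST`) for the mask when rotating: linear interpolation blends neighbouring class IDs and can produce a label that is not present at that location at all. The toy 1-D sketch below illustrates this with plain NumPy; it is a simplified stand-in for what happens at class boundaries during a warp, not a reproduction of OpenCV's resampling.

```python
import numpy as np

# A 1-D slice of a label map: class 0 next to class 2 (class 1 is absent).
labels = np.array([0, 0, 2, 2], dtype=np.float32)

# Sample positions that fall between pixel centres, as a rotation would produce.
positions = np.array([0.5, 1.5, 2.5])

# Linear interpolation averages the two neighbouring labels.
linear = np.interp(positions, np.arange(labels.size), labels)

# Nearest-neighbour picks one existing label (ties rounded up here).
nearest = labels[np.floor(positions + 0.5).astype(int)]
```

At position 1.5 the linear result is 1.0, a class ID that does not occur in the input, while the nearest-neighbour result only ever contains 0 or 2.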
Shift
Translate the image by a random distance in the vertical and horizontal directions.
The effect is shown below:
class Shift:
    def __init__(self, limit=50, prob=1.0):
        self.limit = limit
        self.prob = prob

    def __call__(self, img, mask=None):
        if random.random() < self.prob:
            limit = self.limit
            dx = round(random.uniform(-limit, limit))
            dy = round(random.uniform(-limit, limit))
            height, width = img.shape[:2]
            # Pad by limit + 1 on every side, then crop a shifted window
            pad = limit + 1
            y1 = pad + dy
            y2 = y1 + height
            x1 = pad + dx
            x2 = x1 + width
            img1 = cv2.copyMakeBorder(img, pad, pad, pad, pad,
                                      borderType=cv2.BORDER_REFLECT_101)
            img = img1[y1:y2, x1:x2]
            if mask is not None:
                mask1 = cv2.copyMakeBorder(mask, pad, pad, pad, pad,
                                           borderType=cv2.BORDER_REFLECT_101)
                # No trailing channel index, so 2-D masks work too
                mask = mask1[y1:y2, x1:x2]
        return img, mask
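The pad-then-crop idea behind the shift can be traced by hand on a small array. The sketch below is a NumPy-only approximation, assuming that `np.pad` with `mode="reflect"` matches the border behaviour of `cv2.BORDER_REFLECT_101` (both mirror without repeating the edge pixel). The helper name `shift_reflect` is hypothetical.

```python
import numpy as np

def shift_reflect(arr, dx, dy, limit):
    """Shift a 2-D or 3-D array by (dx, dy) pixels, filling by reflection."""
    assert abs(dx) <= limit and abs(dy) <= limit
    pad = limit + 1
    # Pad only the two spatial axes; leave any channel axis untouched.
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (arr.ndim - 2)
    padded = np.pad(arr, pad_width, mode="reflect")
    h, w = arr.shape[:2]
    # Move the crop window by (dx, dy) inside the padded array.
    y1, x1 = pad + dy, pad + dx
    return padded[y1:y1 + h, x1:x1 + w]

mask = np.arange(16).reshape(4, 4)
shifted = shift_reflect(mask, dx=1, dy=0, limit=2)
unchanged = shift_reflect(mask, dx=0, dy=0, limit=2)
```

With `dx=1` the crop window moves one pixel to the right, so the content appears shifted one pixel to the left and the freed last column is filled with the mirrored second-to-last column.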
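Cutout
Besides addressing occlusion, this method is inspired by dropout (hence the name Cutout). As is well known, dropout randomly hides some neurons, so the final network behaves like an ensemble of several models. Following the same idea, the Cutout paper applies the drop to the input image and drops contiguous regions, namely rectangles.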
class Cutout:
    def __init__(self, num_holes=8, max_h_size=8, max_w_size=8, fill_value=0, prob=1.):
        self.num_holes = num_holes
        self.max_h_size = max_h_size
        self.max_w_size = max_w_size
        self.fill_value = fill_value
        self.prob = prob

    def __call__(self, img, mask=None):
        # Note: img and mask are modified in place
        if random.random() < self.prob:
            h = img.shape[0]
            w = img.shape[1]
            for _ in range(self.num_holes):
                # Random hole center, with the hole clipped to the image bounds
                y = np.random.randint(h)
                x = np.random.randint(w)
                y1 = max(0, y - self.max_h_size // 2)
                y2 = min(h, y + self.max_h_size // 2)
                x1 = max(0, x - self.max_w_size // 2)
                x2 = min(w, x + self.max_w_size // 2)
                # No trailing channel index, so 2-D arrays work too
                img[y1:y2, x1:x2] = self.fill_value
                if mask is not None:
                    mask[y1:y2, x1:x2] = self.fill_value
        return img, mask
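A possible variation, shown here only as a sketch: work on copies so the caller's arrays are untouched, and mark the erased mask pixels with a separate value instead of the image fill value. The `ignore_index=255` default and the helper names `cutout_pair` and `random_box` are assumptions for illustration; whether an ignore label suits a given loss function or dataset has to be checked case by case.

```python
import numpy as np

def random_box(h, w, max_h, max_w, rng):
    """Sample a box centred on a random pixel and clipped to the image."""
    y, x = rng.integers(h), rng.integers(w)
    y1, y2 = max(0, y - max_h // 2), min(h, y + max_h // 2)
    x1, x2 = max(0, x - max_w // 2), min(w, x + max_w // 2)
    return y1, y2, x1, x2

def cutout_pair(img, mask, boxes, fill_value=0, ignore_index=255):
    """Erase the given (y1, y2, x1, x2) boxes on copies of img and mask."""
    img, mask = img.copy(), mask.copy()
    for y1, y2, x1, x2 in boxes:
        img[y1:y2, x1:x2] = fill_value       # hide the pixels
        mask[y1:y2, x1:x2] = ignore_index    # hypothetical "ignore" label
    return img, mask

rng = np.random.default_rng(0)
img = np.full((16, 16, 3), 200, dtype=np.uint8)
mask = np.ones((16, 16), dtype=np.uint8)
boxes = [random_box(16, 16, 8, 8, rng) for _ in range(2)]
out_img, out_mask = cutout_pair(img, mask, boxes)
```

Sampling the boxes separately from applying them also makes it easy to reuse the same holes for the image and the mask.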
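Random rescale
class Rescale:
    def __init__(self, output_size, prob=0.75):
        self.prob = prob
        assert isinstance(output_size, (int, tuple))
        # An int means a square target; a tuple is read as (height, width)
        if isinstance(output_size, int):
            output_size = (output_size, output_size)
        self.output_size = output_size

    def __call__(self, image, label):
        if random.random() >= self.prob:
            return image, label
        raw_h, raw_w = image.shape[:2]
        new_h, new_w = self.output_size
        # cv2.resize takes the target size as (width, height)
        img = cv2.resize(image, (new_w, new_h))
        # Nearest-neighbour keeps the label values intact
        lbl = cv2.resize(label, (new_w, new_h), interpolation=cv2.INTER_NEAREST)
        # If the result is smaller than the original, pad top/left by reflection
        pad_h = max(raw_h - new_h, 0)
        pad_w = max(raw_w - new_w, 0)
        if pad_h or pad_w:
            img = cv2.copyMakeBorder(img, pad_h, 0, pad_w, 0, borderType=cv2.BORDER_REFLECT)
            lbl = cv2.copyMakeBorder(lbl, pad_h, 0, pad_w, 0, borderType=cv2.BORDER_REFLECT)
        # If it is larger, take a random crop of the original size
        h, w = img.shape[:2]
        i = random.randint(0, h - raw_h)
        j = random.randint(0, w - raw_w)
        return img[i:i + raw_h, j:j + raw_w], lbl[i:i + raw_h, j:j + raw_w]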
How to use them together?
class DualCompose:
    """Apply every transform in order to the image and its mask."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x, mask=None):
        for t in self.transforms:
            x, mask = t(x, mask)
        return x, mask


class OneOf:
    """With probability prob, apply one randomly chosen transform."""
    def __init__(self, transforms, prob=0.5):
        self.transforms = transforms
        self.prob = prob

    def __call__(self, x, mask=None):
        if random.random() < self.prob:
            t = random.choice(self.transforms)
            # Force the chosen transform to fire (this overwrites its prob)
            t.prob = 1.
            x, mask = t(x, mask)
        return x, mask


class OneOrOther:
    """Apply first with probability prob, otherwise second."""
    def __init__(self, first, second, prob=0.5):
        self.first = first
        first.prob = 1.
        self.second = second
        second.prob = 1.
        self.prob = prob

    def __call__(self, x, mask=None):
        if random.random() < self.prob:
            x, mask = self.first(x, mask)
        else:
            x, mask = self.second(x, mask)
        return x, mask


transform = DualCompose([
    RandomFlip(),
    RandomRotate90(),
    Rotate(),
    Shift(),
])
# img and msk are an image and its mask, loaded elsewhere
image, mask = transform(img, msk)
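When chaining several transforms it is easy to break the correspondence between image and mask without noticing. The sketch below is one way to sanity-check a pipeline: it builds an image whose pixel values encode the mask, runs the pipeline, and verifies the encoding still holds. `check_alignment`, `NumpyFlip` and `NumpyRotate90` are hypothetical stand-ins written with NumPy only so the example runs without OpenCV; the check applies to geometric transforms that do not interpolate image values.

```python
import random
import numpy as np

class DualCompose:
    """Same composition pattern as in the post: apply transforms in order."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x, mask=None):
        for t in self.transforms:
            x, mask = t(x, mask)
        return x, mask

class NumpyFlip:
    """Stand-in flip: vertical, horizontal, or both, chosen at random."""
    def __init__(self, prob=1.0):
        self.prob = prob

    def __call__(self, img, mask=None):
        if random.random() < self.prob:
            axis = random.choice([0, 1, (0, 1)])
            img = np.flip(img, axis)
            if mask is not None:
                mask = np.flip(mask, axis)
        return img, mask

class NumpyRotate90:
    """Stand-in rotation by a random number of quarter turns."""
    def __init__(self, prob=1.0):
        self.prob = prob

    def __call__(self, img, mask=None):
        if random.random() < self.prob:
            k = random.randint(0, 3)
            img = np.rot90(img, k)
            if mask is not None:
                mask = np.rot90(mask, k)
        return img, mask

def check_alignment(transform, trials=20, size=8, seed=0):
    """Return True if image and mask stay pixel-aligned over all trials."""
    random.seed(seed)
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        mask = rng.integers(0, 3, size=(size, size), dtype=np.uint8)
        # Encode the label in the image: every channel equals mask * 50.
        img = np.repeat(mask[:, :, None] * 50, 3, axis=2)
        out_img, out_mask = transform(img, mask)
        if not np.array_equal(out_img[:, :, 0], out_mask * 50):
            return False
    return True

pipeline = DualCompose([NumpyFlip(), NumpyRotate90(prob=0.5)])
aligned = check_alignment(pipeline)
```

A transform that touches only the image, such as one that flips `img` but returns `mask` unchanged, makes the check fail, which is the kind of bug it is meant to catch.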