
Cooking Up Your First U-Net for Image Segmentation

Today we will learn how to prepare one of the most important networks in computer vision: the U-Net. No problem if you do not have the data yet; you can access it through the link below.

Kaggle MRI segmentation dataset:

https://www.kaggle.com/datasets/mateuszbuda/lgg-mri-segmentation?source=post_page-----e812e37e9cd0--------------------------------

The main steps:

1. Exploring the dataset

2. Creating the Dataset and DataLoader classes

3. Building the architecture

4. The losses (DICE and binary cross-entropy)

5. Results

Exploring the dataset

We are given a set of 2D MRI scans (255 x 255), together with their corresponding masks, in which every pixel must be classified as 0 (healthy) or 1 (tumor).

Here are a few examples:


First row: tumor; second row: healthy subjects

The Dataset and DataLoader classes

This is a step you will find in every project involving neural networks.

The Dataset class

import cv2
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader

class BrainMriDataset(Dataset):
    def __init__(self, df, transforms):
        # df contains the paths to all files
        self.df = df
        # transforms is the set of data augmentation operations we use
        self.transforms = transforms

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        image = cv2.imread(self.df.iloc[idx, 1])
        mask = cv2.imread(self.df.iloc[idx, 2], 0)
        augmented = self.transforms(image=image, mask=mask)
        image = augmented['image']  # Dimension (3, 255, 255)
        mask = augmented['mask']    # Dimension (255, 255)
        # The image has one more dimension (3 color channels), so we add
        # one "artificial" dimension to the mask to match it
        mask = np.expand_dims(mask, axis=0)  # Dimension (1, 255, 255)
        return image, mask
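The `transforms` argument is expected to follow the albumentations calling convention: a callable that accepts `image=` and `mask=` keyword arguments and returns a dict with the same keys. The minimal numpy stand-in below (`simple_transforms` is a hypothetical helper for illustration, not part of the original code) shows the shape contract `__getitem__` relies on:

```python
import numpy as np

def simple_transforms(image, mask):
    # Stand-in for an albumentations-style pipeline: accepts image= and
    # mask= keywords, returns a dict, and moves the image to CHW layout
    return {"image": np.transpose(image, (2, 0, 1)), "mask": mask}

# Fake data with the article's (255 x 255) spatial size
image = np.zeros((255, 255, 3), dtype=np.uint8)  # what cv2.imread returns (HWC)
mask = np.zeros((255, 255), dtype=np.uint8)      # grayscale mask

augmented = simple_transforms(image=image, mask=mask)
out_image = augmented["image"]                   # (3, 255, 255)
out_mask = np.expand_dims(augmented["mask"], 0)  # (1, 255, 255)
print(out_image.shape, out_mask.shape)
```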

The DataLoader

Now that we have created the Dataset class to reshape our tensors, we need to define a training set (to train the model), a validation set (to monitor training and avoid overfitting), and a test set (to evaluate the final performance of the model on unseen data).

from sklearn.model_selection import train_test_split

# df (the dataframe of file paths) and transforms are defined earlier

# Split df into train_df and val_df
train_df, val_df = train_test_split(df, stratify=df.diagnosis, test_size=0.1)
train_df = train_df.reset_index(drop=True)
val_df = val_df.reset_index(drop=True)

# Split train_df into train_df and test_df
train_df, test_df = train_test_split(train_df, stratify=train_df.diagnosis, test_size=0.15)
train_df = train_df.reset_index(drop=True)
test_df = test_df.reset_index(drop=True)

train_dataset = BrainMriDataset(train_df, transforms=transforms)
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)

val_dataset = BrainMriDataset(val_df, transforms=transforms)
val_dataloader = DataLoader(val_dataset, batch_size=32, shuffle=False)

test_dataset = BrainMriDataset(test_df, transforms=transforms)
test_dataloader = DataLoader(test_dataset, batch_size=32, shuffle=False)
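A quick sanity check on what these two successive splits imply for the final proportions: 10% of the data goes to validation, then 15% of the remaining 90% goes to test, leaving the rest for training:

```python
val_frac = 0.10           # first split: 10% of the full dataframe
test_frac = 0.90 * 0.15   # second split: 15% of the remaining 90%
train_frac = 0.90 * 0.85  # what is left for training
print(round(val_frac, 3), round(test_frac, 3), round(train_frac, 3))  # 0.1 0.135 0.765
```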

The U-Net architecture

[Figure: the U-Net architecture]

The U-Net architecture is a powerful model for image segmentation tasks. It is a type of convolutional neural network (CNN), and its name comes from its U-shaped structure. It was first introduced by Olaf Ronneberger et al. in 2015, in the paper titled "U-Net: Convolutional Networks for Biomedical Image Segmentation".

Its structure consists of an encoding (downsampling) path and a decoding (upsampling) path. The U-Net remains a very successful model to this day, and its success comes from two main factors:

1. Its symmetric structure (the U shape)

2. Its skip connections (the grey arrows in the figure)

The main idea behind the skip connections is that as we go deeper into the network, we lose some information about the original image. Our task, however, is to segment the image: we need precise spatial information to classify every pixel. That is why we re-inject each encoder feature map into the symmetric decoder layer. Here is the implementation in PyTorch:

def double_conv(in_channels, out_channels):
    # Two 3x3 convolutions, each followed by a ReLU:
    # the basic building block of each U-Net stage
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional layers of the "down" path,
        # where the image is successively downsampled
        self.conv_down1 = double_conv(3, 64)
        self.conv_down2 = double_conv(64, 128)
        self.conv_down3 = double_conv(128, 256)
        self.conv_down4 = double_conv(256, 512)
        # Max pooling layer for downsampling
        self.maxpool = nn.MaxPool2d(2)
        # Upsampling layer
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
        # Convolutional layers of the "up" path,
        # where the image is successively upsampled
        self.conv_up3 = double_conv(256 + 512, 256)
        self.conv_up2 = double_conv(128 + 256, 128)
        self.conv_up1 = double_conv(128 + 64, 64)
        # Final convolution to output the correct number of classes:
        # 1, because there are only two classes (tumor or not tumor)
        self.last_conv = nn.Conv2d(64, 1, kernel_size=1)

    def forward(self, x):
        # Down path
        conv1 = self.conv_down1(x)
        x = self.maxpool(conv1)
        conv2 = self.conv_down2(x)
        x = self.maxpool(conv2)
        conv3 = self.conv_down3(x)
        x = self.maxpool(conv3)
        x = self.conv_down4(x)
        # Up path: upsample, then concatenate with the matching
        # encoder feature map (the skip connection)
        x = self.upsample(x)
        x = torch.cat([x, conv3], dim=1)
        x = self.conv_up3(x)
        x = self.upsample(x)
        x = torch.cat([x, conv2], dim=1)
        x = self.conv_up2(x)
        x = self.upsample(x)
        x = torch.cat([x, conv1], dim=1)
        x = self.conv_up1(x)
        # Final output
        out = self.last_conv(x)
        out = torch.sigmoid(out)
        return out
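One detail worth checking: because of the skip connections, each `conv_up` block sees the concatenated channels of the upsampled decoder feature and the matching encoder feature, which is why `conv_up3` takes 256 + 512 input channels. A numpy sketch of the shapes (the spatial size 32 is just an illustrative example):

```python
import numpy as np

# (batch, channels, height, width) after upsampling the bottleneck
upsampled = np.zeros((1, 512, 32, 32))
# matching encoder feature map carried over by the skip connection
skip = np.zeros((1, 256, 32, 32))

# torch.cat([x, conv3], dim=1) performs exactly this channel-wise concatenation
merged = np.concatenate([upsampled, skip], axis=1)
print(merged.shape)  # (1, 768, 32, 32)
```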

The loss and the evaluation metric

As with every neural network, there is an objective function, a loss, that we minimize by gradient descent. We also introduce an evaluation metric, which helps us steer the training (if it does not improve for 3 consecutive epochs, we stop training, because the model is starting to overfit). There are two main takeaways from this section:

1. The loss function is a combination of two losses (the DICE loss and binary cross-entropy)

2. The evaluation function is the DICE score, not to be confused with the DICE loss
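The stopping rule described above (halt when the validation DICE has not improved for 3 consecutive epochs) can be sketched as follows; `should_stop` is a hypothetical helper for illustration, not part of the original code:

```python
def should_stop(val_history, patience=3):
    """Return True when the last `patience` validation DICE scores have
    all failed to beat the best score seen before them."""
    if len(val_history) <= patience:
        return False
    best_before = max(val_history[:-patience])
    return all(score <= best_before for score in val_history[-patience:])

print(should_stop([0.5, 0.6, 0.7, 0.69, 0.68, 0.67]))  # True
print(should_stop([0.5, 0.6, 0.7, 0.69, 0.71, 0.70]))  # False
```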

The DICE loss:

DICE loss = 1 − (2·|X ∩ Y| + ε) / (|X| + |Y| + ε)

where X is the predicted mask and Y the ground-truth mask.

Note: we add a smoothing parameter (epsilon) to avoid division by zero.

The binary cross-entropy loss:

BCE = −(1/N) · Σᵢ [ yᵢ · log(pᵢ) + (1 − yᵢ) · log(1 − pᵢ) ]

where pᵢ is the predicted probability for pixel i and yᵢ its label.

Our total loss is therefore:

Total loss = BCE loss + DICE loss

Let's implement it together:

def dice_coef_loss(inputs, target):
    smooth = 1.0
    intersection = 2.0 * ((target * inputs).sum()) + smooth
    union = target.sum() + inputs.sum() + smooth
    return 1 - (intersection / union)

def bce_dice_loss(inputs, target):
    inputs = inputs.float()
    target = target.float()
    dicescore = dice_coef_loss(inputs, target)
    bcescore = nn.BCELoss()
    bceloss = bcescore(inputs, target)
    return bceloss + dicescore
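To see the smoothing term at work, here is the same DICE loss applied to plain numpy arrays with hand-checkable numbers (`dice_coef_loss_np` is just a numpy transcription of the function above, for illustration):

```python
import numpy as np

def dice_coef_loss_np(inputs, target, smooth=1.0):
    # numpy version of dice_coef_loss above, for a quick numeric check
    intersection = 2.0 * (target * inputs).sum() + smooth
    union = target.sum() + inputs.sum() + smooth
    return 1 - (intersection / union)

perfect = np.ones(4)
# intersection = 2*4 + 1 = 9, union = 4 + 4 + 1 = 9 -> loss = 0
print(dice_coef_loss_np(perfect, perfect))        # 0.0
disjoint = np.array([1.0, 1.0, 0.0, 0.0])
# intersection = 0 + 1 = 1, union = 2 + 2 + 1 = 5 -> loss = 0.8
print(dice_coef_loss_np(disjoint, 1 - disjoint))  # 0.8
empty = np.zeros(4)
# without smooth this would be 0/0; with it, loss = 1 - 1/1 = 0
print(dice_coef_loss_np(empty, empty))            # 0.0
```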

The evaluation metric (the DICE coefficient):

The evaluation function we use is the DICE score. It lies between 0 and 1, with 1 being the best.

[Figure: illustration of the DICE score]

Mathematically, it is implemented as follows:

DICE = 2·|X ∩ Y| / (|X| + |Y|)

def dice_coef_metric(inputs, target):
    intersection = 2.0 * (target * inputs).sum()
    union = target.sum() + inputs.sum()
    if target.sum() == 0 and inputs.sum() == 0:
        return 1.0
    return intersection / union
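A quick numeric check with hand-verifiable values (`dice_coef_metric_np` repeats the formula above on numpy arrays, for illustration):

```python
import numpy as np

def dice_coef_metric_np(inputs, target):
    # same formula as dice_coef_metric above, on numpy arrays
    intersection = 2.0 * (target * inputs).sum()
    union = target.sum() + inputs.sum()
    if target.sum() == 0 and inputs.sum() == 0:
        return 1.0
    return intersection / union

pred = np.array([1.0, 1.0, 0.0, 0.0])
target = np.array([1.0, 0.0, 0.0, 0.0])
# intersection = 2*1 = 2, union = 2 + 1 = 3 -> DICE = 2/3
print(dice_coef_metric_np(pred, target))              # 0.666...
print(dice_coef_metric_np(pred, pred))                # 1.0 (perfect overlap)
print(dice_coef_metric_np(np.zeros(4), np.zeros(4)))  # 1.0 (special case)
```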

The training loop

import numpy as np
from tqdm import tqdm

# Assumes `device`, `warmup_lr_scheduler` and `compute_iou` are defined elsewhere

def train_model(model_name, model, train_loader, val_loader, train_loss, optimizer, lr_scheduler, num_epochs):
    print(model_name)
    loss_history = []
    train_history = []
    val_history = []

    for epoch in range(num_epochs):
        model.train()  # Enter train mode
        # We store the training loss and dice scores
        losses = []
        train_iou = []

        if lr_scheduler:
            warmup_factor = 1.0 / 100
            warmup_iters = min(100, len(train_loader) - 1)
            lr_scheduler = warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor)

        # Add tqdm to the loop (to visualize progress)
        for i_step, (data, target) in enumerate(tqdm(train_loader, desc=f"Training epoch {epoch+1}/{num_epochs}")):
            data = data.to(device)
            target = target.to(device)
            outputs = model(data)
            out_cut = np.copy(outputs.data.cpu().numpy())
            # If the score is less than a threshold (0.5), the prediction is 0, otherwise it's 1
            out_cut[np.nonzero(out_cut < 0.5)] = 0.0
            out_cut[np.nonzero(out_cut >= 0.5)] = 1.0
            train_dice = dice_coef_metric(out_cut, target.data.cpu().numpy())
            loss = train_loss(outputs, target)
            losses.append(loss.item())
            train_iou.append(train_dice)
            # Reset the gradients
            optimizer.zero_grad()
            # Perform backpropagation to compute gradients
            loss.backward()
            # Update the parameters with the computed gradients
            optimizer.step()
            if lr_scheduler:
                lr_scheduler.step()

        val_mean_iou = compute_iou(model, val_loader)
        loss_history.append(np.array(losses).mean())
        train_history.append(np.array(train_iou).mean())
        val_history.append(val_mean_iou)
        print("Epoch [%d]" % (epoch))
        print("Mean loss on train:", np.array(losses).mean(),
              "\nMean DICE on train:", np.array(train_iou).mean(),
              "\nMean DICE on validation:", val_mean_iou)

    return loss_history, train_history, val_history
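The thresholding step inside the loop deserves a note: `np.nonzero(out_cut < 0.5)` is just index-based boolean masking, turning the sigmoid probabilities into a hard 0/1 mask before the DICE score is computed. In isolation:

```python
import numpy as np

outputs = np.array([0.1, 0.49, 0.5, 0.9])  # sigmoid outputs from the model
out_cut = np.copy(outputs)
# below the 0.5 threshold -> class 0, at or above -> class 1
out_cut[np.nonzero(out_cut < 0.5)] = 0.0
out_cut[np.nonzero(out_cut >= 0.5)] = 1.0
print(out_cut)  # [0. 0. 1. 1.]
```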

Results

Let's evaluate our model on a subject with a tumor:

[Figure: segmentation result on a tumor subject]

The results look quite good! The model has clearly learned something useful about the structure of these images. It could probably refine the segmentation further, though, which can be achieved with more advanced techniques that we will discuss soon. The U-Net is still widely used today, but there is a well-known model that achieves state-of-the-art performance, called nnU-Net.


Source: https://www.wpsshop.cn/w/weixin_40725706/article/detail/72179